Pattern? List of IDs/SKUs on STDIN

I’ve been working on a tool to publish products to WooCommerce and eBay, as a follow-up to “ebay-tools” (which is now defunct). That old tool was centered around managing directories named after SKUs.

The new tool keeps product info in a table, and references everything by SKU. The same SKU is used in WooCommerce (WC), eBay, and the table.

One pattern I used in ebay-tools was a “SKU stream”, where I’d accept a list of SKUs on STDIN, one SKU per line, and perform the command for each SKU.

So you have a list of SKUs like this:

b100
b101
b102
i1000

You run the command like this:

./ebay mvdirs <dest>

Then pass (or paste) the list into STDIN.

Each SKU’s directory would be found and then moved to the destination.

Software Tools and Pipes

Way back in the 1970s, a book called Software Tools, by Kernighan and Plauger, described how to write “software tools” that could be combined into automated scripts. The book popularized the ideas of STDIN, STDOUT, and pipes.

For more about this, read: https://www.princeton.edu/~hos/frs122/unixhist/tools.htm

The traditional Unix way to use pipes is to read input one line at a time, process or transform the line, and print it to standard output.

So the line was the record, and the data was in the line.
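As a rough illustration of that line-as-record style, here is a minimal filter sketch in PHP. The function name filterLines and the uppercase transform are made up for this example; a real filter would pass STDIN and STDOUT instead of the in-memory streams used in the demo.

```php
<?php
/**
 * Copy lines from $in to $out, applying $transform to each one.
 * Both arguments are stream resources, e.g. STDIN and STDOUT.
 */
function filterLines($in, $out, callable $transform): void
{
    while (($line = fgets($in)) !== false) {
        // Strip the line ending, transform the record, write it back out.
        fwrite($out, $transform(rtrim($line, "\r\n")) . "\n");
    }
}

// Demo on in-memory streams so the sketch runs without any piped input.
$in = fopen('php://memory', 'w+');
fwrite($in, "b100\nb101\n");
rewind($in);
$out = fopen('php://memory', 'w+');
filterLines($in, $out, 'strtoupper');
rewind($out);
echo stream_get_contents($out); // prints B100 and B101, one per line
```

Each line goes in, gets transformed, and comes out as one line, which is what lets filters like this be chained with pipes.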

The Object Stream

Microsoft PowerShell came up with a twist: the object stream. It’s exactly what it sounds like – a stream of objects. Each object is like a line, but instead of characters, it’s an object with properties.

A Stream of SKUs/References

The stream I used was a list of SKUs, each one a reference to an object in the database, WC, or eBay.

It’s akin to a stream of filenames: these are references to the actual data, which lives in a database somewhere.

The code for this is pretty basic. This is a sample taken from mvdirs:

<?php
//...
    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $targetState = $input->getArgument('targetState');

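        // Read one SKU per line until EOF. Compare against false explicitly,
        // so a line like "0" is not mistaken for the end of input.
        while (($line = fgets(STDIN)) !== false) {
            $line = trim($line);
            if ($line === '') continue; // blank line
            $sku = $this->getCanonicalSKU($line);

            $targetPath = $this->cleanSKUAndMoveInto($sku, $targetState);
        }

        return 0; // exit code: success
    }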

Nothing special. Still, it opens up the possibility of building pipelines.

Maybe I need a function that runs the loop, so the resultant code can look more like this:

$this->applyToSKUStream(function($sku) use ($targetState) {
    $targetPath = $this->cleanSKUAndMoveInto($sku, $targetState);
});

That reduces the line count by 4.
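Here is one way such a helper might look, written as a standalone function so it runs on its own. This is only a sketch: the $stream and $canonicalize parameters are assumptions added for illustration, and $canonicalize stands in for getCanonicalSKU(), whose body isn’t shown here.

```php
<?php
/**
 * Hypothetical helper: read one SKU per line from $stream and hand each
 * one to $callback. Returns the number of SKUs processed.
 */
function applyToSKUStream(callable $callback, $stream = null, ?callable $canonicalize = null): int
{
    $stream = $stream ?? STDIN; // default to standard input
    $count = 0;
    while (($line = fgets($stream)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        // Stand-in for getCanonicalSKU(): normalize the SKU if asked to.
        $callback($canonicalize ? $canonicalize($line) : $line);
        $count++;
    }
    return $count;
}

// Demo on an in-memory stream instead of STDIN.
$in = fopen('php://memory', 'w+');
fwrite($in, "B100\n\n b101 \n");
rewind($in);
$seen = [];
$n = applyToSKUStream(function (string $sku) use (&$seen) {
    $seen[] = $sku;
}, $in, 'strtolower');
echo $n . ': ' . implode(',', $seen) . "\n"; // prints "2: b100,b101"
```

Inside a Command class, the same loop could live in a trait method so that `$this->applyToSKUStream(...)` works as written above.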

A condensed command-calling function could also make the steps easier to see:

$this->applyToSKUStream(function($sku) {
    $sku = $this->callCommand('app:wc:upload-all', ['sku'=>$sku]);
    $sku = $this->callCommand('app:wc:some-command', ['sku'=>$sku]);
    $sku = $this->callCommand('app:wc:other-command', ['sku'=>$sku]);
});

That’s pretty much self-documenting code. If it’s too slow, I could call the methods directly, to avoid creating new instances of a Command. (But retain the original code as documentation.)

Here’s a slightly different version. Maybe it’s easier to understand:

$this->pipeSKUStream()
     ->cmd('app:wc:upload-all')
     ->cmd('app:wc:some-command')
     ->cmd('app:wc:other-command');

// Each command's output is sent to the STDIN of the next.

This wouldn’t be a real Unix pipeline, where every stage runs at the same time as its own process, because I care about memory more than speed. Symfony is a fat framework, and image processing is memory hungry. Uploading data is very slow, too.
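To make the sequential idea concrete, here is a small sketch of a fluent pipeline that runs one stage at a time over the whole SKU list. The class name SkuPipeline is hypothetical, and the stages are plain callables standing in for console commands; it assumes PHP 7.4 or later for the typed property.

```php
<?php
/** Hypothetical sketch: each stage finishes the whole list before the next starts. */
final class SkuPipeline
{
    /** @var callable[] */
    private array $stages = [];

    public function cmd(callable $stage): self
    {
        $this->stages[] = $stage;
        return $this; // allow chaining
    }

    /** Run every stage in order; each stage's output feeds the next. */
    public function run(array $skus): array
    {
        foreach ($this->stages as $stage) {
            $next = [];
            foreach ($skus as $sku) {
                $result = $stage($sku);
                if ($result !== null) {
                    $next[] = $result; // returning null drops the SKU
                }
            }
            $skus = $next;
        }
        return $skus;
    }
}

// Demo: two stand-in stages that log what they did and pass the SKU along.
$log = [];
$out = (new SkuPipeline())
    ->cmd(function (string $sku) use (&$log) { $log[] = "upload $sku"; return $sku; })
    ->cmd(function (string $sku) use (&$log) { $log[] = "button $sku"; return $sku; })
    ->run(['b100', 'b101']);
echo implode("\n", $log) . "\n";
// prints: upload b100, upload b101, button b100, button b101 (one per line)
```

Only one stage is active at any moment, which trades the concurrency of a shell pipeline for a smaller memory footprint.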

Brainstorming Some Standards

Each command that takes an SKU argument should also take a -- argument to process a SKU stream on STDIN.

Each command should print the SKU on STDOUT after it’s processed.

# single SKU, as an argument
bin/console app:do-thing b100
# single SKU, on STDIN
echo b100 | bin/console app:do-thing --
# multi SKU, from a here-document
cat <<EOF | bin/console app:do-thing --
b100
b101
EOF
# multi SKU, from a file
cat list | xargs -n 1 bin/console app:do-thing
cat list | bin/console app:do-thing --
# multi SKU, pipeline
cat list | bin/console app:fix-pictures -- | bin/console app:set-button --

That last line would apply fix-pictures and set-button to each SKU.
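The two rules above could be sketched in plain PHP like this. Both function names are hypothetical, and the sketch works on a raw argument string rather than a Symfony input object. One caveat worth checking: as far as I know, Symfony’s argument parser treats a bare -- as the end-of-options separator rather than as an argument value, so a real command might need to detect the stream some other way (for example, “no SKU argument given”).

```php
<?php
/**
 * Hypothetical helper: turn a command's SKU argument into a list of SKUs.
 * The marker "--" means "read one SKU per line from $stdin".
 */
function resolveSkus(string $skuArg, $stdin): array
{
    if ($skuArg !== '--') {
        return [$skuArg]; // plain single-SKU call
    }
    $skus = [];
    while (($line = fgets($stdin)) !== false) {
        $line = trim($line);
        if ($line !== '') {
            $skus[] = $line;
        }
    }
    return $skus;
}

/** Process each SKU, then echo it so the next command can read it. */
function runOnSkus(array $skus, callable $process, $stdout): void
{
    foreach ($skus as $sku) {
        $process($sku);
        fwrite($stdout, $sku . "\n");
    }
}

// Demo with in-memory streams in place of STDIN and STDOUT.
$in = fopen('php://memory', 'w+');
fwrite($in, "b100\nb101\n");
rewind($in);
$out = fopen('php://memory', 'w+');
runOnSkus(resolveSkus('--', $in), function (string $sku) { /* do the thing */ }, $out);
rewind($out);
echo stream_get_contents($out); // prints b100 and b101, one per line
```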

The original fix-pictures command did a lot of checking to make sure that resources were renamed, products existed, and files were attached to the product. It’s a bit like a pipeline, and should have been implemented using standalone commands, instead of a mega-command that called other commands.

Right now the code that does things to SKUs is in Traits. It should be broken out into commands, like:

app:wc:delete-image <imageid>
app:wc:get-images <sku>
app:wc:assure-product-exists <sku>
app:wc:delete-product <sku>
app:image:get-all <sku>
app:image:rename-with-prefix <sku> <filename>
app:image:upload-all <sku>

Some commands output SKUs, and some output other things. I think if they don’t output a bare SKU, they should output a one-line JSON object, like:

{ "sku":"b100", "filename":"path/to/file" }

And a bare SKU should be considered a special case of the more generic:

{ "sku":"b100" }
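A reader for this mixed format might look like the sketch below. The function names parseRecord and formatRecord are made up for illustration, and the rule “a line starting with { is JSON” is an assumption that only holds if no SKU ever starts with a brace.

```php
<?php
/**
 * Hypothetical parser: turn one stream line into an associative array.
 * A line starting with "{" is decoded as JSON; anything else is a bare SKU.
 */
function parseRecord(string $line): array
{
    $line = trim($line);
    if ($line === '') {
        throw new InvalidArgumentException('Empty line is not a record');
    }
    if ($line[0] === '{') {
        $record = json_decode($line, true);
        if (!is_array($record) || !isset($record['sku'])) {
            throw new InvalidArgumentException("Not a valid SKU record: $line");
        }
        return $record;
    }
    return ['sku' => $line]; // bare SKU is shorthand for {"sku": "..."}
}

/** Emit the shortest form: a bare SKU when that is the only field. */
function formatRecord(array $record): string
{
    return count($record) === 1 && isset($record['sku'])
        ? $record['sku']
        : json_encode($record, JSON_UNESCAPED_SLASHES);
}

echo formatRecord(parseRecord('{ "sku":"b100" }')) . "\n"; // prints b100
echo formatRecord(parseRecord('{ "sku":"b100", "filename":"path/to/file" }')) . "\n";
// prints {"sku":"b100","filename":"path/to/file"}
```

With this, every command can treat its input as records with a sku field, whether the upstream command printed bare SKUs or JSON.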

And, while we’re at it, the SKU line format should be:

b100 # comment

That way, we can have a program that outputs human-friendly lines like:

b100 # Title of a book here
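Splitting such a line is simple, as the sketch below shows. The function names are hypothetical, and it assumes a SKU never contains a # character; it is meant for bare SKU lines only, since a JSON line could legitimately contain a # inside a string.

```php
<?php
/** Hypothetical: split "b100 # Title" into the SKU and its optional comment. */
function parseSkuLine(string $line): array
{
    $parts = explode('#', $line, 2); // split at the first "#" only
    return [
        'sku' => trim($parts[0]),
        'comment' => isset($parts[1]) ? trim($parts[1]) : null,
    ];
}

/** Build the human-friendly form, leaving off the "#" when there is no comment. */
function formatSkuLine(string $sku, ?string $comment = null): string
{
    return $comment === null || $comment === '' ? $sku : "$sku # $comment";
}

$parsed = parseSkuLine('b100 # Title of a book here');
echo $parsed['sku'] . "\n";     // prints b100
echo $parsed['comment'] . "\n"; // prints Title of a book here
echo formatSkuLine('b101') . "\n"; // prints b101
```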