Positioning essay

Why TAFFISH Is Not Another Workflow Engine

TAFFISH starts with the line that actually runs a tool: the shell command people copy, edit, test, and pass along.

1. The confusion is reasonable

TAFFISH has tools. TAFFISH has flows. TAFFISH can organize bioinformatics analyses. From the outside, it is natural to ask whether TAFFISH is simply another workflow engine.

The short answer is no. TAFFISH can help build workflows, but it begins with a smaller object: the command someone would otherwise type, paste, or place inside a larger system.

A workflow engine asks how steps should be connected. TAFFISH asks what kind of command each step is made from.

2. What workflow engines are good at

Workflow engines such as Nextflow and Snakemake solve important problems. They say what should run after what, decide how steps are grouped, and help an analysis move from a few commands to a larger plan that can run on real computing infrastructure.

Galaxy solves a different but equally important problem: it provides a web interface to analysis tools. CWL, WDL, Boutiques, and other descriptor systems give tool interfaces and tool runs a more explicit, written form.

We are not arguing against those layers. TAFFISH lives in the same world they do: command-line tools, containers, package records, scripts, and users who need a command to mean the same thing tomorrow.

3. TAFFISH works at the command layer

A workflow task usually ends with a command. The command may look simple:

samtools sort input.bam -o sorted.bam

But the visible line does not tell the whole story. Which samtools is this? Where does it come from? What must be present on the machine? What can another person check before trusting the line?
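
As a rough illustration of those questions, the sketch below asks the host what it knows about a bare command name. The helper describe_tool is hypothetical and is not part of TAFFISH or samtools; it only assumes that the tool prints a version banner for --version, as samtools does.

```shell
#!/bin/sh
# describe_tool is a hypothetical helper for illustration only.
# It reports where a bare command name resolves on this host and,
# if the tool supports --version, the first line of its version banner.
describe_tool() {
    tool="$1"
    if path=$(command -v "$tool" 2>/dev/null); then
        echo "path: $path"
        # Assumption: the tool accepts --version (samtools does).
        "$tool" --version 2>/dev/null | head -n 1
    else
        echo "path: not found on PATH"
    fi
}

describe_tool samtools
```

On two machines this can report different paths and different versions for the same visible line, and it says nothing about how the tool was installed or which helper programs it expects.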

TAFFISH treats that line as something worth giving a sturdier form. A TAFFISH app lets the same kind of tool call become installable while still looking like a shell command:

taf install samtools
taf-samtools samtools sort input.bam -o sorted.bam

The command stays familiar, but it now has a name, a release trail, backend hints, platform notes, and Hub records behind it.

4. A TAFFISH command is still a command

The practical change can be very small. If an old shell, Perl, or Python script calls samtools, the first TAFFISH step is not to rewrite the whole script in a new language. It can be as simple as replacing one bare tool call with a versioned TAFFISH command:

Old script line:

samtools sort input.bam -o sorted.bam

Updated script line:

taf-samtools-v1.23.1-r1 samtools sort input.bam -o sorted.bam

The updated line is still an ordinary command from the viewpoint of shell, Perl, Python, Make, an HPC job script, or a workflow task. The surrounding script can stay the same: the command can still be called with system(...), subprocess.run(...), a pipe, a loop, or a larger script in the same way other command-line tools are called.
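
A minimal sketch of that point, assuming a script that already loops over BAM files: only the command name changes, and the loop around it stays as it was. The helper sort_bam, the TAF_SAMTOOLS variable, the sample file names, and the dry-run fallback are all illustrative assumptions, not TAFFISH features.

```shell
#!/bin/sh
# Versioned command name taken from the text above.
# TAF_SAMTOOLS is a hypothetical variable used here so the name is set in one place.
TAF_SAMTOOLS="${TAF_SAMTOOLS:-taf-samtools-v1.23.1-r1}"

# sort_bam is a hypothetical helper, not part of TAFFISH.
sort_bam() {
    input="$1"
    output="${input%.bam}.sorted.bam"
    if command -v "$TAF_SAMTOOLS" >/dev/null 2>&1; then
        "$TAF_SAMTOOLS" samtools sort "$input" -o "$output"
    else
        # Dry run: print the command instead of failing when the app is absent.
        echo "would run: $TAF_SAMTOOLS samtools sort $input -o $output"
    fi
}

# The same loop that would wrap a bare samtools call.
for bam in sample1.bam sample2.bam; do
    sort_bam "$bam"
done
```

Keeping the versioned name in one variable means a later release can be tried by changing a single line, while the rest of the script is untouched.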

What changes is not how users work, but how much execution context the command carries. The app name pins the TAFFISH package, the version and release identify the wrapper state, and the container backend provides a more controlled runtime than an accidental host installation.

This does not make biological inputs, hardware, or upstream tools disappear as sources of variation. Inputs, reference files, CPU architecture, backend behavior, external databases, time, and randomness can still matter. But when the input is the same, the TAFFISH app version is fixed, the architecture and backend are comparable, and the upstream tool is deterministic, the command is much more likely to mean the same thing on another machine.
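
One way to keep those remaining variables visible is to record them next to the command. The sketch below is an assumption-laden illustration: run_context is a hypothetical helper, and the key=value layout is invented for this example rather than taken from any TAFFISH format. It notes the app name, the CPU architecture, and a checksum of the input.

```shell
#!/bin/sh
# run_context is a hypothetical helper; the key=value layout is an
# assumption made for this example, not a TAFFISH record format.
run_context() {
    app="$1"
    input="$2"
    echo "app=$app"
    echo "arch=$(uname -m)"
    # cksum is POSIX; a stronger hash can be swapped in where available.
    if [ -f "$input" ]; then
        echo "input_cksum=$(cksum < "$input")"
    else
        echo "input_cksum=missing"
    fi
}

run_context taf-samtools-v1.23.1-r1 input.bam
```

If two runs disagree, comparing these records shows whether the app version, the architecture, or the input differed before anyone has to suspect the tool itself.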

That is the kind of portability we want TAFFISH to provide: not a demand that users learn a new workflow DSL or Docker command line first, but a small shell-level change that makes an existing script easier to carry forward.

5. Stable workflows need stable parts

A workflow can be beautifully written and still be fragile. A tool may be installed differently. An image may disappear. A helper program may be missing. A command that looked obvious on one server may mean something else on another.

TAFFISH starts one level earlier. Before asking how to arrange many steps, it asks how one step can become something reusable enough to be carried into the next project.

Once the small pieces are easier to install, inspect, pass around, and compose, higher-level workflows have less instability under them. The workflow file still matters; the commands inside it are simply less mysterious.

6. TAFFISH can work with workflow systems

TAFFISH commands are meant to be usable wherever ordinary commands are already accepted:

  • directly in an interactive terminal;
  • inside ordinary shell scripts;
  • inside lightweight taf flows;
  • inside HPC job scripts;
  • inside Nextflow, Snakemake, Galaxy, or other workflow tasks when TAFFISH is available in that task context.

In that sense, TAFFISH is not a layer above workflow engines. It is a way of making the commands they call less exposed to differences between host environments.

7. We work at a smaller layer

This boundary is not a rejection of larger systems. TAFFISH works at a different and smaller layer. It is a small tool focused on a narrow part of the stack: the command-level boundary where a tool is packaged, versioned, installed, invoked, and shared.

For now, our work is to make that small layer solid. If a bioinformatics command can be easier to package, install, inspect, run, share, and compose without pulling researchers away from shell, then TAFFISH is fulfilling its role.

Let workflow engines arrange the steps; we make each step less fragile.

That is why we do not describe TAFFISH as another workflow engine. We are working on the smaller command layer underneath workflows, and trying to make that layer easier to carry forward.