The full name of TAFFISH is "Tools And Flows Framework Intensify SHell". The term "FISH" also embodies the philosophy "Teach a man to fish".
From a technical perspective, TAFFISH is essentially a cross-device and cross-platform "App Store" tailored both for users (who simply use the tools) and developers (who create their own tools and workflows). It mainly consists of three parts:
So, how do we use TAFFISH? It’s actually quite simple:
# Choose the installation method according to your operating system
sudo apt update
# Select the appropriate container management software according to your needs
sudo apt install apptainer
More information: [2. Install]
taf install xxx
taf-xxx -h
taf-xxx ...
More information about installing and using: [2. Install] & [3. Quick Start(an example)]
vim xxx.taf
taffish xxx.taf ...
taf ...
More Information: TAFFISH-DEVELOPMENT-MANUAL
Yes, our "taffish" can be seen as a "software package management system" just like "apt"/"npm"/"homebrew"/"conda"/... But "taffish" does not change or rely on your system environment, and it is reproducible and portable.
At present, the main field of "TAFFISH" is bioinformatics, because this field requires frequent use of command-line tools, frequent building and use of command-line workflows, and often involves cross-device, multi-device computation and collaboration, which is what "TAFFISH" excels at:
"TAFFISH" now is just made for users:
If you are interested in joining us, developing "TAFFISH" with us, or have any constructive comments, please feel free to contact us by email! (contact@taffish.com)
Much of the software in our "taf-hub/taf-app-store" is implemented on top of container management software, so it is highly recommended to install the relevant container management software first! Recommended options are Apptainer (formerly known as Singularity; suitable for users with high-performance computing needs, but only supports Linux), Podman (a rootless alternative to Docker, for multi-user/non-root users), and Docker (for individual Windows/Mac users with root privileges).
We recommend installing the container management software yourself from its official website:
Also, make sure "curl" is installed on your computer, so that you can use it on the command line for the automatic installation in the next step! ("taffish" also needs curl to work)
Depending on your system, you can choose to install curl yourself using package management software such as apt or brew.
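A quick way to see which of these prerequisites are already present is to ask the shell where each command lives. This is only a sketch: the helper name locate_tool is made up for illustration, and it checks nothing beyond whether the command is on your PATH.

```shell
# Print where a command is installed, or "not found" if it is not on PATH.
# locate_tool is a hypothetical helper, not part of TAFFISH.
locate_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$(command -v "$1")"
    else
        printf '%s: not found\n' "$1"
    fi
}

# curl is needed by the installer; the other three are the container
# managers named above.
for tool in curl apptainer podman docker; do
    locate_tool "$tool"
done
```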
At present, TAFFISH is only compatible with the following operating systems. If your device is not included, we are very sorry; you can email us your device details (operating system and hardware architecture), and we will consider adding an installation package for your device at our discretion!
Note: If you need to install "TAFFISH" for all users on your computer, then use the root account or add the "sudo" command to install it!
sudo sh -c "..." -n
Note: During the installation, you may see errors asking you to update or install some library dependencies; you can install them yourself with a package manager such as apt or brew, according to your system.
sh -c "$(curl -fsSL https://github.com/taffish-org/taffish-install/releases/download/latest/install-taffish-debian12-amd64-beta.sh)" -n
sh -c "$(curl -fsSL https://github.com/taffish-org/taffish-install/releases/download/latest/install-taffish-ubuntu-amd64-beta.sh)" -n
sh -c "$(curl -fsSL https://github.com/taffish-org/taffish-install/releases/download/latest/install-taffish-darwin-arm64-beta.sh)" -n
sh -c "$(curl -fsSL https://github.com/taffish-org/taffish-install/releases/download/latest/install-taffish-darwin-amd64-beta.sh)" -n
You can add parameters at the end to make some different automatic installation settings:
- -n, --no-ask :: Default installation, installing/updating software but not overwriting config files, etc (all issues are skipped automatically, using default options)
- -y, --yes :: (Use Caution) Force installation, install/update software and force overwrite of configuration files, etc. (select "yes" option for all issues)
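If you are unsure which of the four installer scripts applies to your machine, the choice can be sketched as a lookup on the values reported by uname and /etc/os-release. The script names are the ones in the commands above; the function name pick_installer and the mapping from uname values are assumptions for illustration, and the sketch does not check the Debian release number.

```shell
# Map kernel name, machine type and distro ID to an installer script name.
# pick_installer is a hypothetical helper; it only covers the four
# installers listed above and prints "unsupported" for anything else.
pick_installer() {
    case "$1:$2:$3" in
        Linux:x86_64:debian) echo "install-taffish-debian12-amd64-beta.sh" ;;
        Linux:x86_64:ubuntu) echo "install-taffish-ubuntu-amd64-beta.sh" ;;
        Darwin:arm64:*)      echo "install-taffish-darwin-arm64-beta.sh" ;;
        Darwin:x86_64:*)     echo "install-taffish-darwin-amd64-beta.sh" ;;
        *)                   echo "unsupported" ;;
    esac
}

# Read the distro ID on Linux; leave it empty elsewhere (e.g. macOS).
distro=""
if [ -r /etc/os-release ]; then
    distro=$(. /etc/os-release && echo "$ID")
fi

pick_installer "$(uname -s)" "$(uname -m)" "$distro"
```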
After installation, you will have two commands/executables in your system, namely "taffish" (interpreter) and "taf" (package management system), you can enter the following code to check whether the installation is successful:
taf -v; taffish -v
If the installation is successful, you should see two lines with version information, similar to the following (the version number and date may differ):
taf 1.0.0-beta KaiyuanHan(HermitHan) 2025-03-16
taffish 1.0.0-beta KaiyuanHan(HermitHan) 2025-03-15
If the display matches the above format, then congratulations, the installation is successful!
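If you want to check the format in a script rather than by eye, a pattern match is enough. The sketch below tests the two sample lines shown above plus a typical failure message; the function name is_version_line and the exact regular expression are assumptions based only on those two samples.

```shell
# Succeed if a line looks like "<name> <x.y.z...> <author> <YYYY-MM-DD>".
# is_version_line is a hypothetical helper derived from the sample output.
is_version_line() {
    printf '%s\n' "$1" |
        grep -Eq '^(taf|taffish) [0-9]+\.[0-9]+\.[0-9]+[^ ]* .+ [0-9]{4}-[0-9]{2}-[0-9]{2}$'
}

for line in \
    'taf 1.0.0-beta KaiyuanHan(HermitHan) 2025-03-16' \
    'taffish 1.0.0-beta KaiyuanHan(HermitHan) 2025-03-15' \
    'sh: taf: command not found'
do
    if is_version_line "$line"; then
        echo "ok:  $line"
    else
        echo "bad: $line"
    fi
done
```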
You can use "taf -h" or "taffish -h" to see more details on how to use taf and taffish!
By the way, you can change the help language (English/Chinese) in the config file (root: /usr/local/etc/taffish/config.taffish.taf & local: ~/.config/taffish/config.taffish.taf)
$ taf -h
taf 1.0.0-beta KaiyuanHan(HermitHan) 2025-03-16
-----------------------------------------------
Usage:
taf [options] [commands] ...
Options:
-h, --help show this help
-v, --version show taf's version
Commands:
history <show all history of taf and taffish>
update-taf <update taf and taffish> # Show how to install (command line order) on your computer
search [options] [app] <Search for the app from the official website> # Regular expressions are supported
-a, --all ... Additional detailed descriptions of each app package are displayed
install [options] [app] <Install the app> # The non-root users are only installed locally
-f, --file [file] ... Install the app from the local app-taf file or app-tar-gz file
-y, --yes ... Use "yes" to all selections, no need any select
-n, --no ... Use "yes" to all selections, no need any select
taf-xxx <Use the app> # You can use installed apps just by taf-[app]
-h, --help ... show help of the xxx app
uninstall [options] [app] <Uninstall the app> # The non-root user only uninstalls the local app
-y, --yes ... Use "yes" to all selections, no need any select
upgrade [options] [app] <Upgrade the app> # Upgrade all apps if you don't give any app
-y, --yes ... Use "yes" to all selections, no need any select
clean [options] [app] <Clean apps' local things (Container and Image)>
[NULL] ... If give nothing, it will clean something which are created during downloading
all ... Delete all apps's things(the things are still controled by -a/--al)
-a, --all ... If no -a/--all, it will only remove Container, and will remove Image too with -a/--all
apps [options] <Displays all apps that are currently installed>
-g, --global ... Displays the global public repositories
-l, --local ... Showcase your personal local repository
help/info [options] [app] <View information about an app> # Regular expressions are supported
-g, --global ... Match apps from a global public repository
-l, --local ... Match apps from a local personal repository
pull [options] [app] <Get the app from the installed app repository>
-g, --global ... Get it from a global public repository
-l, --local ... Get it from your local personal repository
* More Information: https://taffish.com
More Information: TAFFISH-DEVELOPMENT-MANUAL
$ taffish -h
taffish 1.0.0-beta KaiyuanHan(HermitHan) 2025-03-15
-----------------------------------------------
Usage:
taffish [options] [taf-file] [--args-name args-value] ...
Options:
-h, --help show this help
-v, --version show taffish's version
-t, --template show a .taf template to help users to write their own .taf file
-n, --dry-run just show shell orders which are translated by .taf file
-f, --force [Carefully] ignore errors and still translate and run shells
-s, --silent-run silent run, silent all output which was automatically run by taffish
* More Information: https://taffish.com
More Information: TAFFISH-DEVELOPMENT-MANUAL
During our installation, we mainly did the following things:
A local installation may require you to add the corresponding path to the end of your shell rc file (such as "~/.bashrc" or "~/.zshrc"). The installer prints the specific steps, and you can follow its prompts; generally the code you need to add is:
export PATH=~/.taffish/bin/:$PATH
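If your rc file is sourced more than once, the line above prepends the same directory each time. A guarded variant avoids that; this is a sketch of a common shell idiom, not something the installer itself emits, and add_to_path is a made-up helper name.

```shell
# Prepend a directory to PATH only if it is not already listed.
# add_to_path is a hypothetical helper, not part of TAFFISH.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present: leave PATH alone
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

add_to_path "$HOME/.taffish/bin"
```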
As with the path in the previous step, during a local installation you may need to manually add a line to your shell rc file that sources the autocomplete script; the exact line is given during the installation. After adding it for the first time, you may need to source the rc file manually or restart the terminal for local autocompletion to take effect.
More information about config: TAFFISH-DEVELOPMENT-MANUAL
This step may not be successful, depending on the suitability of the operating system.
Now that you have successfully installed "TAFFISH", we will go through a simple example to help you understand and use "TAFFISH" even further!
Hello, I'm fishka, a researcher in bioinformatics, and today a botanist friend of mine brought me some data:
This is just a demonstration (a simplified version); a real case might involve a medicinal plant and protein or gene families that have not been studied or sequenced much, or genetic sequences and disease-related sequence data from a patient.
He wanted me to analyze, from a bioinformatics perspective, which proteins in Arabidopsis thaliana might belong to the p450 family. And it would be better to give him the corresponding protein IDs directly, rather than complex file information! I'm going to show you how I can use Taffish to accomplish such a task!
First, we should clarify how to solve this problem and what bioinformatics tools we will use in this problem:
The second step can actually use local tools such as cut and uniq, but this may reduce portability and reproducibility, so here we will use taf-app to complete all the steps.
So let's get hands-on with the problem.
First of all, before starting the detailed process, I will give all the code needed to solve the problem directly, and then expand the process step by step, to give you an intuitive overview:
taf update
taf install blast debian
taf-blast --cmd blastp --dbin ./p450.fasta --in ./at.fasta
taf-debian cut -f 1 ./blast-out/out.blastp_matches_my-blast-db.txt \| sort \| uniq
That's right, it only takes a few lines of code to install the software, use the tools, and solve the problem. As long as you have Taffish and any one container management software installed correctly (and the device has a working network connection), the above process can be easily reproduced on any supported device without any conflicts with the software installation environment!
Installing software with Taffish is different from other software installations: it only downloads the TAF script (a plain text file) of the corresponding tool from the Internet and does not touch the system environment at all, so the installation of any taf-app is usually completed quickly.
So let's start showing the logic behind the above process in detail!
The software under the "TAFFISH" system is based on container management software, so as long as you have at least one container management system installed correctly [2.1 (optional but recommended) Install container management software], we don't have to worry about any dependencies or environments during software installation; we just need to find the corresponding software in "Taf-Hub/App-Store" and install it with a single command!
taf update
taf search blast debian
In fact, taf-blast includes tools such as cut and uniq, but not all of these taf-app environments will include those general tools.
You might get an output similar to the following:
[All apps searched]:
blast debian
As you can see, we have the corresponding tools in the "App Store", so let's quickly install these two tools:
taf install blast debian
If these tools are already installed, you may be asked whether you want to overwrite and reinstall them.
If the installation is successful, you may see something like this:
[√] blast ..................................... [Installed]
[√] debian .................................... [Installed]
Congratulations, the installation is successful! Then we can start using these two tools:
Now that we have successfully installed taf-blast, we can use the following command to see how the tool works:
taf-blast -h
You might get something like this:
# <blast:latest | KaiyuanHan | 2025-01-06>
### Optional ##############################################################################
<main>
if ( echo '::dbin::' | grep "\.fasta$" > /dev/null 2>&1 ); then makeblastdb ::db-opts::; fi;
::cmd:: ::opts::
<outdir>
::*WORKDIR*::blast-out
<dbout>
"::outdir::/my-blast-db/::dbtitle::"
<db-opts>
-in ::dbin:: # (auto) for building blast-database
-dbtype ::dbtype:: # (default: auto) database type ,prot=>protain,nucl=nucleic-acid
-title ::dbtitle:: # (default: my-blast-db) database title
-out ::dbout:: # (default: "./blast-out/my-blast-db/::dbtitle::") database output
::db-opts-add::
<opts>
-db ::db:: # (need) database for blast
-query ::in:: # (need) blast seqs' file
-out ::out:: # (default: "./blast-out/out.cmd_matches_::dbtitle::.txt") output
-evalue ::evalue:: # (default: 1e-5) e-value
# -num_aligntments ::num-align:: # (default: 10) seqs' number for blast
# -max_target_seqs ::blast-maxnum:: # (default: 4) most target seqs' mapped number
# -perc_identity ::identity:: # (default: 90) perc identity
-num_threads ::threads:: # (default: 4) cpu threads
-outfmt=::outfmt:: # (default: 6) output format: 0~18, usually 0,5,6,7
# [0: same to online] [5: XML] [6: table] [7: table with anno]...
::opts-add::
<out>
"::outdir::/out.::cmd::_matches_::dbtitle::.txt"
### NEED ##############################################################################
<cmd>
# [blastn:DNA=>DNA-db] [blastp:Protain=>Protain-db] [blastx:DNA=>Protain-db] ...
<dbin>
# fasta file for building blast-database
<in>
# input fasta file
### RUN ##############################################################################
<container:taf-blast:docker.io/ncbi/blast:latest>
::*MAIN*::
In fact, the core content is the NEED parameter and the RUN command, we only need to provide:
So let's use taf-blast like any other command-line tool:
taf-blast --cmd blastp --dbin ./p450.fasta --in ./at.fasta
From a runtime perspective, the taf script will run all the code in the RUN, and any '::xxx::' parameter involved in it can be assigned from the command line using 'taf-cmd --xxx xxx-value' (regardless of whether the default value is given in ARGS or not) (except for the built-in parameters)
The first run may need to pull the image from the official registry; later runs skip this step.
In this run, we showed the "developer-suggested usage" of a taf-app, that is, the built-in usage that the taf-app developer has optimized, rather than the original usage of the tool, which would require users to learn the tool separately. Developers can also provide such an "optimized" taf-app for tools they have developed themselves. In the next step, we'll show another, more general usage.
When we finish running it and use ls, we should see an additional folder ./blast-out/ under the current working path, and the comparison result we want for the two datasets is in this folder: ./blast-out/out.blastp_matches_my-blast-db.txt. We can use the less -SN command, or your preferred command, to see the results of our comparison.
sp|Q9ASR3|C7091_ARATH tr|Q0X087|Q0X087_SOLLC 38.178 516 307 5 6 517 8 515 7.07e-135 396
sp|Q9ASR3|C7091_ARATH sp|O48786|C734A_ARATH 37.739 522 297 8 9 519 13 517 3.25e-133 392
sp|Q9ASR3|C7091_ARATH sp|Q05047|SLS1_CATRO 38.760 516 296 8 15 518 16 523 1.95e-118 354
……
It can be seen that BLAST aligns each AT sequence against the sequences in the p450 library, and one AT sequence may be highly similar to multiple p450 sequences. In the blast run of the previous step, the default parameters have already filtered for sequences with relatively high similarity, so now we only need to use cut to get the AT sequence IDs in the first column and sort + uniq to remove the duplicates, which gives us the AT protein sequences that may be P450!
In a normal environment, our code would look like this:
cut -f 1 ./blast-out/out.blastp_matches_my-blast-db.txt | sort | uniq
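To see what this pipeline does without running BLAST, here is a small sketch that feeds it the three sample rows shown earlier. The rows are rebuilt as tab-separated text (which is what cut -f expects by default); the helper name blast_rows is made up for this illustration.

```shell
# Emit the three sample result rows as tab-separated columns.
# blast_rows is a hypothetical helper used only for this demonstration.
blast_rows() {
    tr ' ' '\t' <<'EOF'
sp|Q9ASR3|C7091_ARATH tr|Q0X087|Q0X087_SOLLC 38.178 516 307 5 6 517 8 515 7.07e-135 396
sp|Q9ASR3|C7091_ARATH sp|O48786|C734A_ARATH 37.739 522 297 8 9 519 13 517 3.25e-133 392
sp|Q9ASR3|C7091_ARATH sp|Q05047|SLS1_CATRO 38.760 516 296 8 15 518 16 523 1.95e-118 354
EOF
}

# Keep column 1 (the query ID), then drop duplicates.
# All three rows share one query, so a single ID is printed.
blast_rows | cut -f 1 | sort | uniq
```

sort has to come before uniq because uniq only removes adjacent duplicate lines.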
However, in order to ensure portability and reproducibility as much as possible, we use the cut, sort and uniq from the official Debian image to achieve this, namely:
taf-debian cut -f 1 ./blast-out/out.blastp_matches_my-blast-db.txt \| sort \| uniq
Comparing the two, we have made two changes:
- taf-debian at the front: this is equivalent to running the code that follows inside that environment, so containerization software such as docker ensures that this step is portable and reproducible;
- | changed to \|, so that the pipe is also passed to taf-debian as part of the code, rather than being interpreted by the local shell as a pipe to the local sort and uniq.
So if you still use |, it will probably work, but then it will use your local sort and uniq.
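The difference between | and \| can be tried out locally with a stand-in wrapper. In the sketch below, run_inside is a hypothetical function that joins its arguments and runs them in a separate shell; it only imitates the idea of handing code to another environment and is not how taf-debian is implemented.

```shell
# Hypothetical stand-in for a wrapper such as taf-debian: it joins its
# arguments into one command line and runs that line in a separate shell.
run_inside() {
    sh -c "$*"
}

demo_dir=$(mktemp -d)
printf 'b\na\nb\n' > "$demo_dir/letters.txt"

(
    cd "$demo_dir" || exit 1

    # Escaped pipes arrive as plain arguments, so the whole pipeline,
    # including sort and uniq, runs in the inner shell.
    run_inside cat letters.txt \| sort \| uniq

    # Unescaped pipes are consumed by the outer shell: only "cat" is
    # wrapped, and sort and uniq are the outer (local) ones.
    run_inside cat letters.txt | sort | uniq
)
```

Both commands print the same two lines here because the inner and outer shells share the same tools; with a container in between, only the first form would use the tools inside the container.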
After running it, we might get something like this:
sp|O65785|C71B3_ARATH
sp|P92994|TCMO_ARATH
sp|Q9ASR3|C7091_ARATH
tr|Q9LEX2|Q9LEX2_ARATH
tr|Q9STI1|Q9STI1_ARATH
And so we have completed the task of our botanist friend!
But what if he comes to us again? It would be better to package the above process directly, so that he can run the above steps himself! Usually we rarely share our work directly with collaborators and let them run it by themselves, because our computer environments and installed software often differ a lot, and getting our scripts to run smoothly on their devices often takes a lot of effort (checking the environment, installing software, and so on). BUT TAFFISH WILL CHANGE THAT: WE ONLY NEED "CONTAINER MANAGEMENT SOFTWARE" + "TAFFISH" TO REPRODUCE OUR WORK ON MOST DEVICES!
If the container image of your tool is not available for both x86 and arm64 architectures, there may be minor compatibility issues.
So again, let's code first, and then explain!
+FLOW:blast-get-IDs
ARGS
<cmd>
blastp
<dbin>
./db.fasta
<in>
./in.fasta
RUN
<auto-flow>
taf-blast --cmd ::cmd:: --dbin ::dbin:: --in ::in:: > /dev/null 2>&1
taf-debian cut -f 1 ./blast-out/out.::cmd::_matches_my-blast-db.txt \| sort \| uniq
In fact, it is easy to see that compared to the original work, we have only made the following changes:
- --cmd ...: the values passed to these options now come from the parameters declared under ARGS (::cmd::, ::dbin::, ::in::), so they can be changed from the command line;
- The RUN-<auto-flow> tag is set: there is no need to manually install the software, and taf-blast and taf-debian will be installed automatically;
- In the taf-blast step we added > /dev/null 2>&1 at the end, which suppresses the output of this step on the screen, ensuring that only the final ID result we want is printed;
- In the taf-debian step, the input filename now contains ::cmd::, as the output filename of taf-blast changes depending on cmd (see taf-blast -h for details).
Then all we need to do is copy and paste this script file/code, send it to our botanist friend, and ask him to run it directly next time! There is no need to install any additional software, do any environment configuration, or modify any code. In this way, we took a few simple command lines from our own work and turned them, almost directly, into portable, reproducible, easily shareable and installable software!
I wonder if this case has given you a deeper understanding of Taffish? If you've read this far, then try using Taffish to easily use, build, and share your tools/processes and work!
Now I'm going to give you a brief introduction to taffish as a language, and if there are some parts of the above that you don't understand, then I think it will be easier to understand the above after reading the rest of the content.
The Taffish language is similar to a markup language. The syntax is simple: elements are separated by line breaks, one per line, and specific tags (single-line elements) separate the functional structure of the code, which can be divided into four levels from top to bottom:
- *WORKDIR*: The current user's working path
- *USER*: The username of the current user
- *CPUS*: The number of threads on the current device
- *CMD-ARGS*: All command-line parameters of this run (including taffish configuration parameters, such as --force, --silent-run, etc.)
- *APP-ARGS*: The parameters accepted by the app in this run (excluding the taffish configuration parameters, only the parameters accepted by the taf-app)
- *LOAD-DIR*: The path from which the taffish file is loaded in this run (for a taf-app, this is the path where the app's taf script is located; run-related code sometimes sits at a fixed relative path to the script, and this parameter gives you a stable way to refer to it)
- *MAIN*: (1) If the user passes *APP-ARGS* (that is, subsequent parameters) after calling the taf-app, and the first item starts with --, then this parameter is replaced with ::main::, and the taf file needs to define the ARGS-<main> parameter to customize the usage; (2) if the first item does not start with --, then this parameter is replaced with ::*APP-ARGS*::, that is, the user's input is used directly as the code to run (the use of taf-debian in 3.4 relies on this); (3) if the user does not give any app parameters, for example by calling just taf-blast, then this parameter is replaced with ::else::, and the code for this case can be set through ARGS-<else>.
These are the main built-in parameters, and it is recommended to use *MAIN* in every taf-app to write code!
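The three cases described for *MAIN* amount to a small decision rule on the app's arguments. The sketch below restates that rule as a shell function; resolve_main is a hypothetical name, and the function only mirrors the description above (it does not cover options such as -h that taffish may treat separately).

```shell
# Print which placeholder *MAIN* would stand for, given the app arguments.
# resolve_main is a hypothetical helper that restates the three rules above.
resolve_main() {
    if [ "$#" -eq 0 ]; then
        echo '::else::'              # case 3: no app arguments at all
    else
        case "$1" in
            --*) echo '::main::' ;;        # case 1: first item starts with --
            *)   echo '::*APP-ARGS*::' ;;  # case 2: input is used as the code
        esac
    fi
}

resolve_main --cmd blastp --dbin ./p450.fasta   # like the taf-blast call
resolve_main cut -f 1 some-file.txt             # like the taf-debian call
resolve_main                                    # no arguments
```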
Secondary tags are written inside <>; the specific function and meaning of the content under them differ depending on the first-level tag. Here we briefly explain their meaning under the two first-level tags "ARGS" and "RUN":
ARGS: The name inside a secondary tag under ARGS is a variable name, e.g. if there is taf code like this:
+TOOL:test
ARGS
<xxx>
-a 1
--name 23
……
Then you can use ::xxx:: (the variable name xxx wrapped in a pair of double colons) anywhere else in the taf file, and it will be replaced with the corresponding content. Note that when the variable is substituted, line breaks in the content under the secondary tag are automatically replaced with spaces; that is, if you have code like echo "::xxx::", it will be replaced with echo "-a 1 --name 23". Therefore, "ARGS" actually constitutes a parameter substitution system;
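The substitution rule can be imitated with standard tools to make the line-joining behaviour concrete. This is only a sketch of the idea, not how taffish performs the substitution; expand_param is a hypothetical helper, and it assumes the value contains no |, & or backslash characters (which sed would treat specially).

```shell
# Replace ::name:: in a line of code with a multi-line value whose
# lines are joined by single spaces. Hypothetical helper for illustration.
expand_param() {
    # $1 = line of code, $2 = parameter name, $3 = (multi-line) value
    joined=$(printf '%s\n' "$3" | paste -s -d ' ' -)
    printf '%s\n' "$1" | sed "s|::$2::|$joined|g"
}

# The two lines declared under <xxx> in the example above.
xxx_value=$(printf '%s\n' '-a 1' '--name 23')

expand_param 'echo "::xxx::"' xxx "$xxx_value"
```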
RUN: As the name suggests, RUN is the code snippet we are going to run. The taf language is therefore somewhat similar to some static languages: you declare variables first (ARGS) and then use them to write code (RUN). The difference is that our variables are stored under the "ARGS" tag, and at run time the secondary tags of "RUN" apply different processing to the code; finally, all the processed code is collected into one "shell script" (there is not really such a file; the generated shell code is passed directly to the bash command to run):
- <local>/<sh>/<shell>: The code under this tag is copied into the final shell script as shell code;
- <container/apptainer/podman/docker($cmd):taf-container-name:docker-image>: For the code under this tag, shell code that automatically creates/runs the container is added according to the container name and docker image name you provide, and the shell code under the tag is passed through a heredoc to the corresponding cmd in the container (bash by default);
- <flow>: The shell code under this tag can call taf-apps; each call is converted into the corresponding shell code and embedded into the current script, instead of being left as a taf-app call for the shell to run;
- <auto-flow>: Like <flow>, but this tag automatically detects whether the corresponding taf-app is installed and, if not, installs it automatically.
"RUN" usually holds code, and usually shell code, but tags like <python> hold code in other languages, and there are even some custom secondary RUN tags under which there can be special code structures. Users can also programmatically define their own secondary RUN tags; you can learn how at: TAFFISH-DEVELOPMENT-MANUAL
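Passing code to another interpreter through a heredoc, as described for the container tag, looks roughly like the sketch below. A plain sh process stands in for the container command so that the example runs anywhere; the function name run_in_stand_in is made up, and the shell code taffish actually generates is not shown in this document.

```shell
# Feed a block of code to a separate interpreter through a heredoc.
# "sh" is a stand-in here; with a container, the command reading the
# heredoc would be the container runtime running bash in the image.
run_in_stand_in() {
    sh <<'EOF'
greeting="hello from the inner shell"
echo "$greeting"
EOF
}

run_in_stand_in
```

Quoting the delimiter ('EOF') keeps the outer shell from expanding $greeting, so the code reaches the inner interpreter unchanged.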
More Information about TAFFISH: TAFFISH-DEVELOPMENT-MANUAL