Welcome to Viash!
Viash is your go-to script wrapper for building data pipelines from modular software components. All you need is your trusty script and a metadata file to embark on this journey.
Check out some of Viash’s key features:
- Code in your favorite scripting language. Mix and match scripting between multiple components to suit your needs. Viash supports a wide range of languages, including Bash, Python, R, Scala, JS, and C#.
- A custom Docker container is auto-generated based on the dependencies you’ve outlined in your metadata, meaning you don’t need to be a Docker expert.
- Viash also generates a Nextflow module from your script, so no need to be a Nextflow guru either.
- Effortlessly combine Nextflow modules to design and run scalable, reproducible data pipelines.
- Test every component on your local workstation using the convenient built-in development kit.
Requirements
This guide assumes you’ve already installed Viash, Docker and Nextflow.
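If you're not sure whether all three tools are available, a quick check along these lines can help. This is a minimal sketch: missing_tools is a hypothetical helper name introduced here, not part of Viash, and it only tests whether each command is on your PATH (it does not check versions).

```shell
# Hypothetical helper (not part of Viash): print every given command
# that cannot be found on PATH, and return non-zero if any is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Check the three prerequisites this guide assumes.
if missing="$(missing_tools viash docker nextflow)"; then
  echo "All requirements found on PATH."
else
  echo "Not found on PATH:"
  echo "$missing"
fi
```

Any tool listed as not found needs to be installed before you continue with the steps below.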
Quickstart example project
To get up and running fast, we provide a template project for you to use. It contains three components from the same package as well as two custom operators from vsh-pipeline-operators, which are combined into a Nextflow pipeline.
This pipeline takes one or more TSV files as input and stores its output in an output folder.
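To get a feel for the data flow, the three component names (remove_comments, take_column, combine_columns) can be mimicked with standard command-line tools. The sketch below is only a conceptual stand-in, not how the Viash components are implemented. It assumes that comment lines start with '#' and that the second column is the one selected; both are illustrative guesses, and sketch_pipeline is a hypothetical name.

```shell
# Conceptual stand-in for the pipeline using standard tools only.
# This is NOT the implementation of the Viash components.
# Assumptions: comment lines start with '#', and column 2 is selected.
sketch_pipeline() {
  workdir="$(mktemp -d)"
  i=0
  for tsv in "$@"; do
    i=$((i + 1))
    # Zero-padded names keep the columns in input order when globbed.
    col_file="$workdir/$(printf 'col_%04d.txt' "$i")"
    # remove_comments: drop comment lines; take_column: keep one column.
    grep -v '^#' "$tsv" | cut -f 2 > "$col_file"
  done
  # combine_columns: put the selected columns side by side.
  paste "$workdir"/col_*.txt
  rm -rf "$workdir"
}
```

Once you have cloned the template in Step 1, you could try sketch_pipeline resources_test/file*.tsv from the repository root. The real pipeline may select a different column or format its output differently.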
Step 1: Get the template
First, create a new repository by clicking the “Use this template” button. If you can’t see the “Use this template” button, log in to GitHub first.
Next, clone the repository using the following command.
git clone https://github.com/youruser/my_first_pipeline.git && cd my_first_pipeline
Your new repository should contain the following files:
tree my_first_pipeline
.
├── CHANGELOG.md
├── LICENSE.md
├── README.md
├── README.qmd
├── _viash.yaml
├── main.nf
├── resources_test
│   ├── file1.tsv
│   └── file2.tsv
└── src
    └── template
        ├── combine_columns
        │   ├── config.vsh.yaml
        │   └── script.R
        ├── remove_comments
        │   ├── config.vsh.yaml
        │   ├── script.sh
        │   └── test.sh
        ├── take_column
        │   ├── config.vsh.yaml
        │   └── script.py
        └── workflow
            ├── config.vsh.yaml
            ├── main.nf
            └── nextflow.config
Step 2: Build the Viash components
With Viash you can turn the components in src/ into Dockerized Nextflow modules by running:
viash ns build --setup cachedbuild --parallel
Output
Exporting remove_comments (template) =nextflow=> target/nextflow/template/remove_comments
Exporting combine_columns (template) =executable=> target/executable/template/combine_columns
Exporting remove_comments (template) =executable=> target/executable/template/remove_comments
Exporting workflow (template) =nextflow=> target/nextflow/template/workflow
[notice] Building container 'ghcr.io/viash-io/project_template/template/combine_columns:0.3.0' with Dockerfile
[notice] Building container 'ghcr.io/viash-io/project_template/template/remove_comments:0.3.0' with Dockerfile
Exporting take_column (template) =executable=> target/executable/template/take_column
Exporting take_column (template) =nextflow=> target/nextflow/template/take_column
[notice] Building container 'ghcr.io/viash-io/project_template/template/take_column:0.3.0' with Dockerfile
Exporting combine_columns (template) =nextflow=> target/nextflow/template/combine_columns
All 7 configs built successfully
This command not only transforms the Viash components in src/ into Nextflow modules but also builds the containers when appropriate (starting from the Docker cache when available, thanks to the --setup cachedbuild argument). Once everything is built, a new target directory is created containing the executables and modules grouped per platform:
ls -l
Output
total 80
-rw-r--r-- 1 runner docker 1482 Sep 6 22:24 CHANGELOG.md
-rw-r--r-- 1 runner docker 32219 Sep 6 22:24 LICENSE.md
-rw-r--r-- 1 runner docker 12142 Sep 6 22:24 README.md
-rw-r--r-- 1 runner docker 7840 Sep 6 22:24 README.qmd
-rw-r--r-- 1 runner docker 512 Sep 6 22:26 _viash.yaml
-rw-r--r-- 1 runner docker 245 Sep 6 22:24 main.nf
-rw-r--r-- 1 runner docker 222 Sep 6 22:24 nextflow.config
drwxr-xr-x 2 runner docker 4096 Sep 6 22:24 resources_test
drwxr-xr-x 3 runner docker 4096 Sep 6 22:24 src
drwxr-xr-x 4 runner docker 4096 Sep 6 22:24 target
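To confirm that all seven build outputs listed in the log above actually landed in target, a small check like the following can be handy. It is a sketch: check_targets is a hypothetical helper, and the paths are copied from the build log shown earlier, so adjust them if your component or package names differ.

```shell
# Hypothetical helper: report which of the seven export paths from the
# build log are missing under the given target directory (default: target).
check_targets() {
  root="${1:-target}"
  rc=0
  for rel in \
    executable/template/combine_columns \
    executable/template/remove_comments \
    executable/template/take_column \
    nextflow/template/combine_columns \
    nextflow/template/remove_comments \
    nextflow/template/take_column \
    nextflow/template/workflow
  do
    if [ ! -e "$root/$rel" ]; then
      echo "missing: $root/$rel"
      rc=1
    fi
  done
  return "$rc"
}
```

From the repository root, check_targets target && echo "build looks complete" prints the confirmation only when every path exists; otherwise it lists what is missing.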
Step 3: Run the pipeline
Now run the pipeline with Nextflow:
nextflow run . \
-main-script target/nextflow/template/workflow/main.nf \
-with-docker \
--input resources_test/file*.tsv \
--publish_dir output
Output
N E X T F L O W ~ version 24.04.4
Launching `target/nextflow/template/workflow/main.nf` [condescending_koch] DSL2 - revision: 45d6e681b4
[fe/671114] Submitted process > workflow:run_wf:remove_comments:processWf:remove_comments_process (run)
[18/8d73b7] Submitted process > workflow:run_wf:take_column:processWf:take_column_process (run)
[0f/93f775] Submitted process > workflow:run_wf:combine_columns:processWf:combine_columns_process (combined)
[93/2fa613] Submitted process > workflow:publishStatesSimpleWf:publishStatesProc (combined)
This runs the different stages of the workflow, with the final result stored in a file named combined.workflow.output.tsv in the output directory:
cat output/combined.workflow.output.tsv
Output
"1" 0.11
"2" 0.23
"3" 0.35
"4" 0.47
What’s next?
Congratulations, you’ve reached the end of this quickstart tutorial, and we’re excited for you to delve deeper into the world of Viash! Our comprehensive guide and reference documentation are here to help you explore various topics, such as:
- Creating a Viash component and converting it into a standalone executable
- Ensuring reproducibility and designing customised Docker images
- Ensuring code reliability with unit testing for Viash
- Streamlining your workflow by performing batch operations on Viash projects
- Building Nextflow pipelines using Viash components
So, get ready to enhance your skills and create outstanding solutions with Viash!