This tutorial will guide you through using our Viash template project to run a data pipeline.
This guide assumes you’ve already installed Viash, Docker and Nextflow.
Viash wraps your scripts into modular software components that serve as building blocks for (Nextflow) data pipelines. All you need to get started is your script and a metadata file.
Here are a few of Viash’s key features:
This Quickstart will take you from nothing to a scalable and reproducible Nextflow data pipeline. Here’s the flow of the pipeline you’ll be using:
graph LR
  A(file?.tsv) --> B[/remove_comments/]
  B --> C[/take_column/]
  C --> D[/combine_columns/]
  D --> E(output)
One or more TSV files are taken as input and processed through a series of modules. At the end, the output is written to a folder.
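To make the data flow concrete, here is a rough shell sketch of what the three steps amount to, using standard Unix tools in place of the actual Viash components. The sample data, the `#` comment prefix, and the choice of column 2 are all assumptions made for illustration; the real components in src/demo may be implemented differently.

```shell
# Illustrative stand-in for the pipeline, NOT the real Viash components.
# Assumptions: comment lines start with '#', and column 2 is the one to keep.
workdir=$(mktemp -d)

# Two tiny, made-up TSV inputs (hypothetical data).
printf '# comment\na\t1\nb\t2\n' > "$workdir/file1.tsv"
printf '# comment\nc\t3\nd\t4\n' > "$workdir/file2.tsv"

for f in "$workdir"/file?.tsv; do
  # remove_comments: drop lines starting with '#'
  # take_column: keep only the second tab-separated field
  grep -v '^#' "$f" | cut -f2 > "$f.col"
done

# combine_columns: place the extracted columns side by side
combined=$(paste "$workdir"/file?.tsv.col)
printf '%s\n' "$combined"

rm -rf "$workdir"
```

Each input file contributes one column to the combined result, which is why the pipeline accepts several files matching `file?.tsv`.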
To get up and running fast, we provide a template project for you to use.
First create a new repository by clicking the “Use this template” button in the viash_project_template repository or clicking the button below.
Then clone the repository using the following command.
Click the button below to download a zip file containing the template project. Once downloaded, unzip the file and rename the root directory to my_first_pipeline.
The template repo contains the following files:
.
├── LICENSE.md              License information
├── README.qmd              The source qmd file for this readme
├── README.md               This readme
├── _viash.yaml             Global Viash settings
├── resources_test/*.tsv    Sample files to showcase pipeline and
│   ├── file1.tsv           run component unit tests.
│   └── file2.tsv
├── src/demo                Source directory with Viash components
│   ├── combine_columns
│   ├── remove_comments
│   └── take_column
└── workflows
    └── demo_pipeline       Demo Nextflow pipeline
        ├── main.nf
        └── nextflow.config
With Viash you can turn the components in src/ into Dockerized Nextflow modules by running:
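The exact command depends on your setup. A plausible form, assuming the `viash ns build` subcommand with its `--setup` and `--parallel` options, is sketched here; treat the specific flags as an assumption and check `viash ns build --help` for your version. The `command -v` guard is only there so the snippet fails gracefully when Viash is not on your PATH.

```shell
# Assumed invocation: build every component found under src/.
# --setup cachedbuild: build the Docker images, reusing cached layers (assumed strategy).
# --parallel: build the components concurrently.
build_cmd="viash ns build --setup cachedbuild --parallel"

if command -v viash >/dev/null 2>&1; then
  $build_cmd
else
  echo "viash not found on PATH; would run: $build_cmd"
fi
```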
The build prints output similar to the following:
Exporting take_column (demo) =docker=> target/docker/demo/take_column
[notice] Building container 'ghcr.io/viash-io/viash_project_template/demo_take_column:dev' with Dockerfile
Exporting take_column (demo) =nextflow=> target/nextflow/demo/take_column
Exporting remove_comments (demo) =docker=> target/docker/demo/remove_comments
[notice] Building container 'ghcr.io/viash-io/viash_project_template/demo_remove_comments:dev' with Dockerfile
Exporting remove_comments (demo) =nextflow=> target/nextflow/demo/remove_comments
Exporting combine_columns (demo) =docker=> target/docker/demo/combine_columns
[notice] Building container 'ghcr.io/viash-io/viash_project_template/demo_combine_columns:dev' with Dockerfile
Exporting combine_columns (demo) =nextflow=> target/nextflow/demo/combine_columns
All 6 configs built successfully
Once everything is built, a new target directory contains the executables and modules, grouped per platform:
target/
├── docker
│   └── demo
│       ├── combine_columns
│       │   ├── combine_columns
│       │   └── viash.yaml
│       ├── remove_comments
│       │   ├── remove_comments
│       │   └── viash.yaml
│       └── take_column
│           ├── take_column
│           └── viash.yaml
└── nextflow
    └── demo
        ├── combine_columns
        │   ├── main.nf
        │   ├── nextflow.config
        │   └── viash.yaml
        ├── remove_comments
        │   ├── main.nf
        │   ├── nextflow.config
        │   └── viash.yaml
        └── take_column
            ├── main.nf
            ├── nextflow.config
            └── viash.yaml
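Each file under target/docker is a standalone executable that runs its component inside the container. As a quick check, you can ask one for its help text. The sketch below assumes you are in the project root and that the components have been built; `--help` is a standard flag on Viash-generated executables, and the guard simply reports when the file is not there.

```shell
# Path taken from the target/ listing; adjust if your layout differs.
exe="target/docker/demo/remove_comments/remove_comments"

if [ -x "$exe" ]; then
  # Print the component's usage information and argument list.
  "$exe" --help
  status="ran"
else
  echo "Executable not found: $exe (have the components been built?)"
  status="missing"
fi
```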
Now run the pipeline with Nextflow:
nextflow run . \
-main-script workflows/demo_pipeline/main.nf \
-with-docker \
--input resources_test/file*.tsv \
--publishDir temp
This will run the three modules in sequence, with the final result being stored in a file named combined.combine_columns.output.tsv in a new temp directory:
"1" 0.11
"2" 0.23
"3" 0.35
"4" 0.47
Now that you’ve had a taste of what Viash can do for you, take a look at our Guide and Reference pages to learn more about how to use Viash. If you want to start simple, we suggest taking a look at the Native component creation guide.