Create a pipeline

This guide explains how to create an example pipeline that is closer to a typical use case of a Nextflow bioinformatics pipeline.


This page assumes knowledge of how to create and manipulate Nextflow channels using DSL2. For more information, check out the Nextflow reference docs or contact Data Intuitive for a complete Nextflow+Viash course.

Get the template project

To get started with building a pipeline, we provide a template project that already contains a few components. First, create a new repository by clicking the “Use this template” button in the viash_project_template repository.

Then clone the repository using the following command.

git clone

The template project already contains three components, which we will use to build the following pipeline:

graph LR
   A(file?.tsv) --> B[/remove_comments/]
   B --> C[/take_column/]
   C --> D[/combine_columns/]
   D --> E(output)

  • remove_comments is a Bash script which removes all lines starting with a # from a file.
  • take_column is a Python script which extracts one of the columns in a TSV file.
  • combine_columns is an R script which combines multiple files into a TSV.
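To make the data flow concrete, here is a minimal Python sketch of what the three steps do to the contents of a TSV file. It is not the code of the template's components (those are written in Bash, Python and R). The function names mirror the component names, while `run_pipeline`, the `column` parameter with its default, and the in-memory example data are assumptions made for illustration.

```python
def remove_comments(lines):
    """Drop every line that starts with '#'."""
    return [line for line in lines if not line.startswith("#")]


def take_column(lines, column=2):
    """Extract one column from TSV lines (1-based index; the default is hypothetical)."""
    return [line.split("\t")[column - 1] for line in lines]


def combine_columns(columns):
    """Join several equally long columns side by side into TSV lines."""
    return ["\t".join(values) for values in zip(*columns)]


def run_pipeline(files, column=2):
    """Apply remove_comments and take_column per file, then combine the results."""
    columns = [take_column(remove_comments(lines), column) for lines in files]
    return combine_columns(columns)


# Two small in-memory "files" (made-up example data)
file1 = ["# a comment", "one\t0.11\t123", "two\t0.23\t456"]
file2 = ["eins\t0.111\t234", "# another comment", "zwei\t0.222\t234"]

print(run_pipeline([file1, file2]))
```

The first two steps run once per input file, whereas the last step needs the results of all files at once. This is why the pipeline diagram ends in a single combine_columns step.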

Build the VDSL3 modules

First, we need to build the components into VDSL3 modules.

viash ns build --setup cachedbuild --parallel
Exporting combine_columns (demo) =docker=> /home/runner/work/website/website/guide/_viash_project_template/target/docker/demo/combine_columns
Exporting take_column (demo) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/demo/take_column
[notice] Building container '' with Dockerfile
Exporting remove_comments (demo) =docker=> /home/runner/work/website/website/guide/_viash_project_template/target/docker/demo/remove_comments
[notice] Building container '' with Dockerfile
Exporting remove_comments (demo) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/demo/remove_comments
Exporting combine_columns (demo) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/demo/combine_columns
Exporting take_column (demo) =docker=> /home/runner/work/website/website/guide/_viash_project_template/target/docker/demo/take_column
[notice] Building container '' with Dockerfile
All 6 configs built successfully

Once everything is built, a new target directory contains the executables and modules, grouped per platform:

tree target
├── docker
│   └── demo
│       ├── combine_columns
│       │   └── combine_columns
│       ├── remove_comments
│       │   └── remove_comments
│       └── take_column
│           └── take_column
└── nextflow
    └── demo
        ├── combine_columns
        │   ├──
        │   └── nextflow.config
        ├── remove_comments
        │   ├──
        │   └── nextflow.config
        └── take_column
            └── nextflow.config

10 directories, 9 files

Create a pipeline

Below is a first Nextflow pipeline, which uses just one VDSL3 module and has hard-coded input parameters (file1 and file2).


include { remove_comments } from "./target/nextflow/demo/remove_comments/"

workflow {
  // Create a channel with two events
  // Each event contains a string (an identifier) and a file (input)
  Channel.fromList([
    ["file1", file("resources_test/file1.tsv")],
    ["file2", file("resources_test/file2.tsv")]
  ])

    // View channel contents
    | view { tup -> "Input: $tup" }

    // Process the input file using the 'remove_comments' module.
    // This removes comment lines from the input TSV.
    | remove_comments.run(
      directives: [
        publishDir: "output/"
      ]
    )

    // View channel contents
    | view { tup -> "Output: $tup" }
}

VDSL3 module interface

Every VDSL3 module has the same interface. A VDSL3 module expects its input to be a tuple with the following elements:

  • id (String): A unique identifier used for tracking data objects and for ensuring output filenames are unique.
  • data (Map[String, Any] or File): A named map (or dictionary) used to pass the module’s input arguments. If the module only has a single input file, the file itself can simply be passed.
  • ... (Any*): Any other elements in the tuple simply pass through the module without being altered in any way. For this reason, they are often referred to as “passthrough” objects.

In turn, a VDSL3 module will return a tuple with the same interface, except that the input data object has been replaced with the output data:

  • id (String): The identifier from the input tuple.
  • data (Map[String, Any] or File): A named map (or dictionary) containing the module’s output files. Important: If the module only has a single output file, the file itself will be returned.
  • ... (Any*): The passthrough objects from the input tuple (if any).
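The tuple contract described above can be modelled in a few lines of Python. This is only an illustration of the (id, data, passthrough) shape, not VDSL3 code: `apply_module` and `fake_remove_comments` are hypothetical names, and file names stand in for real files.

```python
def apply_module(module_fn, event):
    """Apply module_fn to the data slot of an (id, data, *passthrough) tuple."""
    event_id, data, *passthrough = event
    # The id and the passthrough objects are returned unchanged;
    # only the data slot is replaced by the module's output.
    return (event_id, module_fn(data), *passthrough)


def fake_remove_comments(path):
    """Stand-in for a module: maps an input file name to an output file name."""
    return path.replace(".tsv", ".remove_comments.output.tsv")


# An event with an id, a single input "file", and one passthrough object
event = ("file1", "file1.tsv", {"note": "untouched"})
result = apply_module(fake_remove_comments, event)
print(result)
```

Because the id and the passthrough objects survive every step, modules can be chained without losing track of which output belongs to which input.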

What is .run()?

Usually, Nextflow processes are quite static objects. For example, changing their directives can be quite tricky.

The run() function is available on every VDSL3 module and allows the behaviour of a module to be altered dynamically from within the pipeline. In this case, we use it to set the publishDir directive to "output/", so the output of that step in the pipeline is published to the output/ directory.
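As a loose analogy in Python, the idea of overriding directives at the call site could look like the sketch below. It does not mirror how VDSL3 implements run(). The `Module` class, its copy-on-override behaviour, and the `cpus` directive value are assumptions chosen only to show the pattern.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Module:
    """Toy stand-in for a module with a name and a set of directives."""
    name: str
    directives: dict = field(default_factory=dict)

    def run(self, directives=None):
        """Return a copy of this module with the given directives overridden."""
        merged = {**self.directives, **(directives or {})}
        return replace(self, directives=merged)


# Hypothetical default directive for the toy module
remove_comments = Module("remove_comments", {"cpus": 1})

# Override a directive for this one use, leaving the original untouched
published = remove_comments.run(directives={"publishDir": "output/"})

print(remove_comments.directives)
print(published.directives)
```

In this sketch the override applies only to the copy returned by run(), so the same module could be reused elsewhere with different settings.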

Run the pipeline

Now run the pipeline with Nextflow:

nextflow run . \
N E X T F L O W  ~  version 22.10.6
Launching `` [agitated_kirch] DSL2 - revision: 111508427e
executor >  local (2)
[ad/ae6e3c] process > remove_comments:remove_comm... [100%] 2 of 2 ✔
Input: [file1, /home/runner/work/website/website/guide/_viash_project_template/resources_test/file1.tsv]
Input: [file2, /home/runner/work/website/website/guide/_viash_project_template/resources_test/file2.tsv]
Output: [file1, /home/runner/work/website/website/guide/_viash_project_template/work/0d/a28c575fc3b352f763bec2d3fa70ad/file1.remove_comments.output.tsv]
Output: [file2, /home/runner/work/website/website/guide/_viash_project_template/work/ad/ae6e3c557217bc8292abc79ddddc20/file2.remove_comments.output.tsv]
tree output
├── file1.remove_comments.output.tsv -> /home/runner/work/website/website/guide/_viash_project_template/work/0d/a28c575fc3b352f763bec2d3fa70ad/file1.remove_comments.output.tsv
└── file2.remove_comments.output.tsv -> /home/runner/work/website/website/guide/_viash_project_template/work/ad/ae6e3c557217bc8292abc79ddddc20/file2.remove_comments.output.tsv

0 directories, 2 files
cat output/*
one 0.11    123
two 0.23    456
three   0.35    789
four    0.47    123
eins    0.111   234
zwei    0.222   234
drei    0.333   123
vier    0.444   123


The above example pipeline serves as the backbone for creating more advanced pipelines. However, for the sake of simplicity, it contains several hard-coded elements:

  • Input parameters
  • Output directory
  • VDSL3 module directory