Basic Pipeline

This guide will explain how to create a basic Nextflow pipeline using a Viash component.

Creating the module

Nextflow works with modules to run scripts and handle their input and output, so the first step is generating a Nextflow module from a Viash component. This guide will use a small component named remove_comments that removes comments (lines starting with a hashtag) from a TSV file.

Note

This guide won’t go in-depth about how to generate a Nextflow module from a Viash component. For more information on that topic, read the Nextflow component creation and the Nextflow Build & Run guides.

Download the source files

Download the zip below and extract it to a directory of your choosing:

Download basic_pipeline.zip

Once extracted, the directory contains the remove_comments component in the src directory and a small TSV file in the data directory:

basic_pipeline
├── data
│   └── sample.tsv
└── src
    └── remove_comments
        ├── config.vsh.yaml
        └── script.sh

Building the Nextflow module

With the component ready, execute the viash build command below to generate the Nextflow module:

viash build src/remove_comments/config.vsh.yaml -p nextflow -o target/remove_comments

You should now have a target directory that contains the generated Nextflow module.

basic_pipeline
├── data
│   └── sample.tsv
├── src
│   └── remove_comments
│       ├── config.vsh.yaml
│       └── script.sh
└── target
    └── remove_comments
        ├── main.nf
        └── nextflow.config

Creating the pipeline

To use the module in a pipeline, create a new file in the root of the directory and name it main.nf, this will be the Nextflow pipeline script. Add this as its contents:

targetDir = "./target" // 1

include { remove_comments } from "$targetDir/remove_comments/main.nf" // 2

workflow {
  Channel.fromPath(params.input) // 3
    | map{ file -> [ file.baseName, file ] } // 4
    | view{ file -> "Input: $file" } // 5
    | remove_comments.run( // 6
      auto: [ publish: true ]
      )
    | view{ file -> "Output: $file" } // 7
}

Here’s an overview of this Nextflow script:

  1. Store the location of the target directory where the modules are located
  2. Include the remove_comments module from the remove_comments/main.nf script
  3. Create a channel based on the input parameter’s path
  4. Take the tuple list and map it to the [ file.baseName, file ] format
  5. Print the input tuple to the console
  6. Run the remove_comments module with auto publishing enabled using the auto directive, this makes sure that the output of this module is written to a directory based on its id
  7. Print the output tuple to the console

Running the pipeline

You can now run the pipeline script above with Nextflow using the following command:

nextflow run main.nf --input "data/sample.tsv" --publishDir output

This results in a console output similar to this:

N E X T F L O W  ~  version 22.04.3
Launching `main.nf` [curious_gates] DSL2 - revision: 3e22e3038c
executor >  local (1)
[2a/5df658] process > remove_comments:remove_comments_process (1) [100%] 1 of 1 ✔
Input: [sample, basic_pipeline/data/sample.tsv]
Output: [sample, basic_pipeline/work/2a/5df6584524e26995953a4eaec97136/sample.remove_comments.output.tsv]

After the pipeline has finished working, a new file named sample.remove_comments.output.tsv has been created in the output directory:

basic_pipeline
└── output
    └── sample.remove_comments.output.tsv

For comparison, here’s what the input and output files look like to show what changed:

sample.tsv

# this is a header      
# this is also a header     
one     0.11    123
two     0.23    456
three   0.35    789
four    0.47    123

sample.remove_comments.output.tsv

one     0.11    123
two     0.23    456
three   0.35    789
four    0.47    123

What’s next?

The pipeline in this guide was a bare minimum example, to learn more about creating Nextflow pipelines, take a look at the Advanced Pipeline guide.