Basic Pipeline
This guide will explain how to create a basic Nextflow pipeline using a Viash component.
Creating the module
Nextflow works with modules to run scripts and handle their input and output, so the first step is generating a Nextflow module from a Viash component. This guide will use a small component named remove_comments that removes comments (lines starting with a hashtag) from a TSV file.
This guide won’t go in-depth about how to generate a Nextflow module from a Viash component. For more information on that topic, read the Nextflow component creation and the Nextflow Build & Run guides.
Download the source files
Download the zip below and extract it to a directory of your choosing:
Once extracted, the directory contains the remove_comments component in the src directory and a small TSV file in the data directory:
basic_pipeline
├── data
│ └── sample.tsv
└── src
└── remove_comments
├── config.vsh.yaml
└── script.sh
Building the Nextflow module
With the component ready, execute the viash build command below to generate the Nextflow module:
viash build src/remove_comments/config.vsh.yaml -p nextflow -o target/remove_commentsYou should now have a target directory that contains the generated Nextflow module.
basic_pipeline
├── data
│ └── sample.tsv
├── src
│ └── remove_comments
│ ├── config.vsh.yaml
│ └── script.sh
└── target
└── remove_comments
├── main.nf
└── nextflow.config
Creating the pipeline
To use the module in a pipeline, create a new file in the root of the directory and name it main.nf, this will be the Nextflow pipeline script. Add this as its contents:
targetDir = "./target" // 1
include { remove_comments } from "$targetDir/remove_comments/main.nf" // 2
workflow {
Channel.fromPath(params.input) // 3
| map{ file -> [ file.baseName, file ] } // 4
| view{ file -> "Input: $file" } // 5
| remove_comments.run( // 6
auto: [ publish: true ]
)
| view{ file -> "Output: $file" } // 7
}Here’s an overview of this Nextflow script:
- Store the location of the
targetdirectory where the modules are located - Include the
remove_commentsmodule from theremove_comments/main.nfscript - Create a channel based on the
inputparameter’s path - Take the tuple list and map it to the
[ file.baseName, file ]format - Print the
inputtuple to the console - Run the
remove_commentsmodule with auto publishing enabled using theautodirective, this makes sure that the output of this module is written to a directory based on itsid - Print the
outputtuple to the console
Running the pipeline
You can now run the pipeline script above with Nextflow using the following command:
nextflow run main.nf --input "data/sample.tsv" --publishDir outputThis results in a console output similar to this:
N E X T F L O W ~ version 22.04.3
Launching `main.nf` [curious_gates] DSL2 - revision: 3e22e3038c
executor > local (1)
[2a/5df658] process > remove_comments:remove_comments_process (1) [100%] 1 of 1 ✔
Input: [sample, basic_pipeline/data/sample.tsv]
Output: [sample, basic_pipeline/work/2a/5df6584524e26995953a4eaec97136/sample.remove_comments.output.tsv]
After the pipeline has finished working, a new file named sample.remove_comments.output.tsv has been created in the output directory:
basic_pipeline
└── output
└── sample.remove_comments.output.tsv
For comparison, here’s what the input and output files look like to show what changed:
sample.tsv
# this is a header
# this is also a header
one 0.11 123
two 0.23 456
three 0.35 789
four 0.47 123sample.remove_comments.output.tsv
one 0.11 123
two 0.23 456
three 0.35 789
four 0.47 123What’s next?
The pipeline in this guide was a bare minimum example, to learn more about creating Nextflow pipelines, take a look at the Advanced Pipeline guide.