Nextflow Directives

Directives are optional settings that affect the execution of the process.

Example:

directives:
    container: rocker/r-ver:4.1
    label: highcpu
    cpus: 4
    memory: 16 GB

accelerator

Type: Map of String to String

Default: Empty

The accelerator directive allows you to specify the hardware accelerator required for the task execution, e.g. a GPU.

Viash implements this directive as a map with accepted keywords: type, limit, request, and runtime.

See accelerator.

Example:

[ limit: 4, type: "nvidia-tesla-k80" ]

afterScript

Type: String

Default: Empty

The afterScript directive allows you to execute a custom (Bash) snippet immediately after the main process has run. This may be useful to clean up your staging area.

See afterScript.

Example:

source /cluster/bin/cleanup

beforeScript

Type: String

Default: Empty

The beforeScript directive allows you to execute a custom (Bash) snippet before the main process script is run. This may be useful to initialise the underlying cluster environment or for other custom initialisation.

See beforeScript.

Example:

source /cluster/bin/setup

cache

Type: Either Boolean or String

Default: Empty

The cache directive allows you to store the process results in a local cache. When the cache is enabled and the pipeline is launched with the resume option, any subsequent attempt to execute the process with the same inputs causes the execution to be skipped, and the stored data is produced as the actual results.

The caching feature generates a unique key by indexing the process script and inputs. This key is used to uniquely identify the outputs produced by the process execution.

The cache is enabled by default; you can disable it for a specific process by setting the cache directive to false.

Accepted values are: true, false, "deep", and "lenient".

See cache.

Examples:

true
false
"deep"
"lenient"

conda

Type: String / List of String

Default: Empty

The conda directive allows for the definition of the process dependencies using the Conda package manager.

Nextflow automatically sets up an environment for the package names listed in the conda directive.

See conda.

Examples:

"bwa=0.7.15"
"bwa=0.7.15 fastqc=0.11.5"
["bwa=0.7.15", "fastqc=0.11.5"]
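The list form could be written in a Viash config roughly as follows. This is a minimal sketch that assumes the same directives block as the example at the top of this page; the surrounding config keys are omitted.

```yaml
directives:
    # one entry per Conda package spec, mirroring the list example above
    conda:
        - bwa=0.7.15
        - fastqc=0.11.5
```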

container

Type: Either Map of String to String or String

Default: Empty

The container directive allows you to execute the process script in a Docker container.

It requires the Docker daemon to be running on the machine where the pipeline is executed, i.e. the local machine when using the local executor or the cluster nodes when the pipeline is deployed through a grid executor.

Viash allows either a string value or a map. If a map is used, the allowed keys are: registry, image, and tag. The image value must be specified.

See container.

Examples:

"foo/bar:tag"

This is transformed to "reg/im:ta":

[ registry: "reg", image: "im", tag: "ta" ]

This is transformed to "im:latest":

[ image: "im" ]
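The map form described above might look like this in YAML. It is a sketch only, assuming the same directives block as the example at the top of this page; the values reg, im, and ta are the placeholders used in the examples above.

```yaml
directives:
    # map form; per the rule above this should resolve to "reg/im:ta"
    container:
        registry: reg
        image: im   # required key
        tag: ta
```

Leaving out registry and tag would correspond to the second example, which resolves to "im:latest".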

containerOptions

Type: String / List of String

Default: Empty

The containerOptions directive allows you to specify any container execution option supported by the underlying container engine (i.e. Docker, Singularity, etc.). This can be useful to provide container settings only for a specific process, e.g. to mount a custom path.

See containerOptions.

Examples:

"--foo bar"
["--foo bar", "-f b"]
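To illustrate the "mount a custom path" use case, the following sketch combines container and containerOptions. It assumes Docker as the container engine and the directives block from the example at the top of this page; the host path /data/ref is hypothetical.

```yaml
directives:
    container: rocker/r-ver:4.1
    # hypothetical host directory made visible inside the container
    # using Docker's --volume flag
    containerOptions: "--volume /data/ref:/data/ref"
```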

cpus

Type: Either Int or String

Default: Empty

The cpus directive allows you to define the number of (logical) CPUs required by the process task.

See cpus.

Examples:

1
10
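The cpus directive is usually set together with the other resource directives on this page (memory, disk, and time). A minimal sketch, assuming the directives block from the example at the top of this page; the amounts are illustrative only:

```yaml
directives:
    cpus: 4         # logical CPUs per task
    memory: 16 GB   # maximum memory
    disk: 50 GB     # local disk storage
    time: 1h        # maximum run time
```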

disk

Type: String

Default: Empty

The disk directive allows you to define how much local disk storage the process is allowed to use.

See disk.

Examples:

"1 GB"
"2TB"
"3.2KB"
"10.B"

echo

Type: Either Boolean or String

Default: Empty

By default, the stdout produced by the commands executed in all processes is ignored. Setting the echo directive to true forwards the process stdout to the stdout of the top-level running process, so that it is shown in the shell terminal.

See echo.

Examples:

true
false

errorStrategy

Type: String

Default: Empty

The errorStrategy directive allows you to define how an error condition is managed by the process. By default, when the executed script returns an error status, the process stops immediately, which in turn forces the entire pipeline to terminate.

Table of available error strategies:

| Name | Description |
|------|-------------|
| terminate | Terminates the execution as soon as an error condition is reported. Pending jobs are killed (default). |
| finish | Initiates an orderly pipeline shutdown when an error condition is raised, waiting for the completion of any submitted job. |
| ignore | Ignores process execution errors. |
| retry | Re-submits for execution a process returning an error condition. |

See errorStrategy.

Examples:

"terminate"
"finish"

executor

Type: String

Default: Empty

The executor defines the underlying system where processes are executed. By default a process uses the executor defined globally in the nextflow.config file.

The executor directive allows you to configure which executor is used by the process, overriding the default configuration. The following values can be used:

| Name | Description |
|------|-------------|
| awsbatch | The process is executed using the AWS Batch service. |
| azurebatch | The process is executed using the Azure Batch service. |
| condor | The process is executed using the HTCondor job scheduler. |
| google-lifesciences | The process is executed using the Google Genomics Pipelines service. |
| ignite | The process is executed using the Apache Ignite cluster. |
| k8s | The process is executed using the Kubernetes cluster. |
| local | The process is executed in the computer where Nextflow is launched. |
| lsf | The process is executed using the Platform LSF job scheduler. |
| moab | The process is executed using the Moab job scheduler. |
| nqsii | The process is executed using the NQSII job scheduler. |
| oge | Alias for the sge executor. |
| pbs | The process is executed using the PBS/Torque job scheduler. |
| pbspro | The process is executed using the PBS Pro job scheduler. |
| sge | The process is executed using the Sun Grid Engine / Open Grid Engine. |
| slurm | The process is executed using the SLURM job scheduler. |
| tes | The process is executed using the GA4GH TES service. |
| uge | Alias for the sge executor. |

See executor.

Examples:

"local"
"sge"

label

Type: String / List of String

Default: Empty

The label directive allows you to annotate processes with a mnemonic identifier of your choice.

See label.

Examples:

"big_mem"
"big_cpu"
["big_mem", "big_cpu"]

machineType

Type: String

Default: Empty

The machineType directive can be used to specify a predefined Google Cloud Platform machine type when running with the Google Life Sciences executor.

See machineType.

Example:

"n1-highmem-8"

maxErrors

Type: Either String or Int

Default: Empty

The maxErrors directive allows you to specify the maximum number of times a process can fail when using the retry error strategy. By default this directive is disabled.

See maxErrors.

Examples:

1
3

maxForks

Type: Either String or Int

Default: Empty

The maxForks directive allows you to define the maximum number of process instances that can be executed in parallel. By default this value is equal to the number of CPU cores available minus 1.

If you want to execute a process in a sequential manner, set this directive to one.

See maxForks.

Examples:

1
3

maxRetries

Type: Either String or Int

Default: Empty

The maxRetries directive allows you to define the maximum number of times a process instance can be re-submitted in case of failure. This value is applied only when using the retry error strategy. By default only one retry is allowed.

See maxRetries.

Examples:

1
3
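Because maxRetries only takes effect with the retry error strategy, the two directives are normally set together. A minimal sketch, assuming the directives block from the example at the top of this page:

```yaml
directives:
    # re-submit a failed task instead of terminating the pipeline
    errorStrategy: retry
    # allow up to three re-submissions per process instance
    maxRetries: 3
```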

memory

Type: String

Default: Empty

The memory directive allows you to define how much memory the process is allowed to use.

See memory.

Examples:

"1 GB"
"2TB"
"3.2KB"
"10.B"

module

Type: String / List of String

Default: Empty

Environment Modules is a package manager that allows you to dynamically configure your execution environment and easily switch between multiple versions of the same software tool.

If it is available on your system, you can use it with Nextflow to configure the process execution environment in your pipeline.

In a process definition you can use the module directive to load a specific module version to be used in the process execution environment.

See module.

Examples:

"ncbi-blast/2.2.27"
"ncbi-blast/2.2.27:t_coffee/10.0"
["ncbi-blast/2.2.27", "t_coffee/10.0"]

penv

Type: String

Default: Empty

The penv directive allows you to define the parallel environment to be used when submitting a parallel task to the SGE resource manager.

See penv.

Example:

"smp"

pod

Type: Map of String to String / List of Map of String to String

Default: Empty

The pod directive allows you to define pod-specific settings, such as environment variables, secrets, and config maps, when using the Kubernetes executor.

See pod.

Examples:

[ label: "key", value: "val" ]
[ annotation: "key", value: "val" ]
[ env: "key", value: "val" ]
[ [label: "l", value: "v"], [env: "e", value: "v"]]
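The list-of-maps form might be written in YAML as follows. This is a sketch that assumes the directives block from the example at the top of this page; the keys and values are the placeholders from the last example above.

```yaml
directives:
    pod:
        # each list item is one pod setting
        - label: l
          value: v
        - env: e
          value: v
```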

publishDir

Type: Either String or Map of String to String / List of Either String or Map of String to String

Default: Empty

The publishDir directive allows you to publish the process output files to a specified folder.

Viash implements this directive as a plain string or a map. The allowed keywords for the map are: path, mode, overwrite, pattern, saveAs, enabled. The path key and value are required. The allowed values for mode are: symlink, rellink, link, copy, copyNoFollow, move.

See publishDir.

Examples:

[]
[ [ path: "foo", enabled: true ], [ path: "bar", enabled: false ] ]

This is transformed to [[ path: "/path/to/dir" ]]:

"/path/to/dir"

This is transformed to [[ path: "/path/to/dir", mode: "copy" ]]:

[ path: "/path/to/dir", mode: "copy" ]
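As a sketch of the map form in YAML, assuming the directives block from the example at the top of this page, the following publishes outputs to two folders. The folder names are hypothetical, and each mode is one of the allowed values listed above.

```yaml
directives:
    publishDir:
        # path is required in every entry
        - path: /path/to/dir
          mode: copy
        - path: /path/to/other
          mode: symlink
```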

queue

Type: String / List of String

Default: Empty

The queue directive allows you to set the queue where jobs are scheduled when using a grid-based executor in your pipeline.

See queue.

Examples:

"long"
"short,long"
["short", "long"]
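The queue directive is typically combined with executor and, on SGE, with penv. A minimal sketch, assuming the directives block from the example at the top of this page; the queue name long and the parallel environment smp are taken from the examples on this page and must exist on your cluster.

```yaml
directives:
    executor: sge
    queue: long   # cluster-specific queue name
    penv: smp     # cluster-specific parallel environment
    cpus: 4
```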

scratch

Type: Either Boolean or String

Default: Empty

The scratch directive allows you to execute the process in a temporary folder that is local to the execution node.

See scratch.

Examples:

true
"/path/to/scratch"
'$MY_PATH_TO_SCRATCH'
"ram-disk"

stageInMode

Type: String

Default: Empty

The stageInMode directive defines how input files are staged-in to the process work directory. The following values are allowed:

| Value | Description |
|-------|-------------|
| copy | Input files are staged in the process work directory by creating a copy. |
| link | Input files are staged in the process work directory by creating a (hard) link for each of them. |
| symlink | Input files are staged in the process work directory by creating a symbolic link with an absolute path for each of them (default). |
| rellink | Input files are staged in the process work directory by creating a symbolic link with a relative path for each of them. |

See stageInMode.

Examples:

"copy"
"link"

stageOutMode

Type: String

Default: Empty

The stageOutMode directive defines how output files are staged-out from the scratch directory to the process work directory. The following values are allowed:

| Value | Description |
|-------|-------------|
| copy | Output files are copied from the scratch directory to the work directory. |
| move | Output files are moved from the scratch directory to the work directory. |
| rsync | Output files are copied from the scratch directory to the work directory by using the rsync utility. |
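Since stageOutMode describes how files leave the scratch directory, it is most relevant when scratch is enabled. A minimal sketch, assuming the directives block from the example at the top of this page:

```yaml
directives:
    # run the task in a temporary folder local to the execution node
    scratch: true
    # move (rather than copy) outputs back to the work directory
    stageOutMode: move
```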

See stageOutMode.

Examples:

"copy"
"move"

storeDir

Type: String

Default: Empty

The storeDir directive allows you to define a directory that is used as a permanent cache for your process results.

See storeDir.

Example:

"/path/to/storeDir"

tag

Type: String

Default: '$id'

The tag directive allows you to associate each process execution with a custom label, so that it will be easier to identify them in the log file or in the trace execution report.

For ease of use, the default tag is set to "$id", which allows tracking the progression of the channel events through the workflow more easily.

See tag.

Example:

"foo"

time

Type: String

Default: Empty

The time directive allows you to define how long a process is allowed to run.

See time.

Examples:

"1h"
"2days"
"1day 6hours 3minutes 30seconds"