Nextflow Legacy Platform

Run a Viash component as a Nextflow module.

id

Type: String

Every platform can be given a specific id that can later be referred to explicitly when running or building the Viash component.

image

Type: String

If no image attributes are configured, Viash will use the auto-generated image name from the Docker platform:

[<namespace>/]<name>:<version>

The container image used to run the module can also be specified explicitly, in one of several ways:

image: dataintuitive/viash:0.4.0

The same result can be obtained with:

image: dataintuitive/viash
registry: index.docker.io/v1/
tag: 0.4.0

Either form uses the container dataintuitive/viash:0.4.0 from the Docker Hub registry.

If no tag is specified, Viash will use functionality.version as the tag.

If no registry is specified, Viash (and Nextflow) will assume the image is available locally or on Docker Hub. In other words, the registry: ... attribute above is superfluous. No other registry is checked automatically because of a limitation in Docker itself.
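As a minimal sketch of the point above, and assuming the image is hosted on Docker Hub, the earlier example can drop the registry attribute and still refer to dataintuitive/viash:0.4.0. Only attributes described in this section are used.

```yaml
# Registry omitted: per the text above, the image is then assumed to be
# available locally or on Docker Hub.
image: dataintuitive/viash
tag: 0.4.0
```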

label

Type: String

When the module runs on a cluster, Nextflow allows labels to be attached to the process (depending on the cluster type). These labels can later be used as selectors for assigning resources to the process.

To attach a single label to a process/component, use the label: ... attribute. Multiple labels can be added using labels: [ ..., ... ], and the two attributes can be combined.

The label can then be referenced in the main nextflow.config:

process {
    …
    withLabel: bigmem {
        maxForks = 5
        …
    }
}

Example

label: highmem
labels: [ highmem, highcpu ]

labels

Type: String / List of String

When the module runs on a cluster, Nextflow allows labels to be attached to the process (depending on the cluster type). These labels can later be used as selectors for assigning resources to the process.

To attach a single label to a process/component, use the label: ... attribute. Multiple labels can be added using labels: [ ..., ... ], and the two attributes can be combined.

The labels can then be referenced in the main nextflow.config:

process {
    …
    withLabel: bigmem {
        maxForks = 5
        …
    }
}

Example

label: highmem
labels: [ highmem, highcpu ]

namespace_separator

Type: String

The default namespace separator is "_".

Example

namespace_separator: "+"

organization

Type: String

Name of a container’s organization.

Example

organization: viash-io

path

Type: String

When publish is set to true, this attribute defines where the output is written, relative to the params.publishDir setting. For example, path: processed in combination with --publishDir s3://some_bucket/ will store the output of this component under:

s3://some_bucket/processed/

This attribute gives control over the directory structure of the output. For example:

path: raw_data

Or even:

path: raw_data/bcl

Please note that per_id and path can be combined.
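A sketch of how these attributes might be combined in one platform configuration. The values are illustrative, and since this page does not say how the per-ID subdirectory nests relative to path, no resulting directory is shown.

```yaml
# Illustrative values only; all attributes are documented on this page.
publish: true        # publish the output of this component
path: raw_data/bcl   # location relative to params.publishDir
per_id: true         # keep the per-ID subdirectories (the default behaviour)
```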

per_id

Type: Boolean

By default, a subdirectory is created corresponding to the unique ID that is passed in the triplet. Let us illustrate this with an example. The following code snippet uses the value of --input as an input of a workflow. The input can include a wildcard so that multiple samples can run in parallel. We use the parent directory name (.getParent().baseName) as an identifier for the sample. We pass this as the first entry of the triplet:

// Build [ id, file, params ] triplets; the parent directory name serves as the ID
Channel.fromPath(params.input) \
    | map{ it -> [ it.getParent().baseName , it ] } \
    | map{ it -> [ it[0] , it[1], params ] } \
    | ...

Say the resulting sample names are SAMPLE1 and SAMPLE2. The output of the next step in the pipeline will then be published, by default, under:

<publishDir>/SAMPLE1/
<publishDir>/SAMPLE2/

These per-ID subdirectories can be avoided by setting:

per_id: false

publish

Type: Boolean

Nextflow uses autogenerated work directories to manage process I/O under the hood. To make results available outside those directories, the output of a module or pipeline step can be published. To do so, add publish: true to the config:

  • publish is optional
  • Default value is false

This attribute simply defines whether the output of a component is published. The output location has to be provided at pipeline launch, either with the option --publishDir ... or as params.publishDir in nextflow.config:

params.publishDir = "..."

registry

Type: String

The URL of a custom Docker registry.

Example

registry: https://my-docker-registry.org

separate_multiple_outputs

Type: Boolean

When a Nextflow component has multiple outputs, emit each output as a separate event on the channel. Default value: true.

Example

separate_multiple_outputs: false

stageInMode

Type: String

By default, Nextflow creates symbolic links to the inputs of a process/module and runs the tool at hand using those links. Some applications do not cope well with this strategy. In that case, the files should be copied rather than linked, which can be achieved with stageInMode: copy. This attribute is optional; the default is symlink.

Example

stageInMode: copy

tag

Type: Version

Specify a Docker image based on its tag.

Example

tag: 4.0
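To tie the attributes on this page together, here is a sketch of what a complete platform entry might look like. The platforms list and the type: nextflow and variant: legacy keys are assumptions not stated on this page (the way the legacy platform is selected may differ between Viash versions), and all values are illustrative.

```yaml
platforms:
  - type: nextflow                # assumption: platform type key
    variant: legacy               # assumption: selects the legacy platform
    id: nextflow_legacy           # hypothetical id
    image: dataintuitive/viash    # container image to run the module in
    tag: 0.4.0                    # image tag
    label: highmem                # single label
    labels: [ highmem, highcpu ]  # multiple labels; can be mixed with label
    publish: true                 # publish the output of this component
    path: processed               # relative to params.publishDir
    per_id: false                 # no per-ID subdirectories
    stageInMode: copy             # copy inputs instead of symlinking them
```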