Quickstart

Important

This guide assumes you’ve installed Viash, Docker and Nextflow on your system.

This tutorial will guide you through using our Viash template project to run a data pipeline.

What is Viash?

Viash is a script code wrapper for building modular software components that serve as building blocks to develop (Nextflow) data pipelines. All you need is your script and a metadata file to get started.
Here are a few of Viash’s key features:

  • You can use your preferred scripting language per component, and mix and match scripts between multiple components as you please. Supported languages include: Bash, Python, R, Scala, JS and C#.
  • A custom Docker container is automatically generated based on your dependencies described in your metadata. No expert Docker knowledge is required.
  • Viash generates a Nextflow module from your script. No expert Nextflow knowledge is required.
  • You can combine the generated Nextflow modules in a script to create and run a scalable, reproducible data pipeline.
  • You can test every single module on your local workstation through the built-in development kit.

Quickstart example project

This Quickstart will take you from nothing to a scalable and reproducible Nextflow data pipeline. Here’s the flow of the pipeline you’ll be using:

graph LR
   A(file?.tsv) --> B[/remove_comments/]
   B --> C[/take_column/]
   C --> D[/combine_columns/]
   D --> E(output)

One or more TSV files are taken as input and processed through a series of modules. At the end, the output is written to a folder.
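To make the flow concrete, the three stages can be approximated with standard Unix tools. This is only a sketch: the real components live in the template's src directory, and their behaviour is assumed here from their names. The function bodies, the `#` comment marker, the choice of column 2 and the `run_sketch` helper are illustrative assumptions, not the template's actual code.

```shell
# Rough stand-ins for the three pipeline stages (assumed behaviour, not the
# template's real scripts).

# Drop every line that starts with '#'.
remove_comments() {
  grep -v '^#' "$1"
}

# Keep one tab-separated column; reads from standard input.
take_column() {
  cut -f "$1"
}

# Paste single-column files side by side, separated by tabs.
combine_columns() {
  paste "$@"
}

# Chain the stages for every input file, then combine the extracted columns.
# Column 2 is an arbitrary choice made for this sketch.
run_sketch() {
  sketch_dir=$(mktemp -d)
  sketch_n=0
  for sketch_file in "$@"; do
    sketch_n=$((sketch_n + 1))
    # Zero-padded names keep the columns in input order when globbed.
    sketch_out=$(printf '%s/col%03d' "$sketch_dir" "$sketch_n")
    remove_comments "$sketch_file" | take_column 2 > "$sketch_out"
  done
  combine_columns "$sketch_dir"/col*
  rm -rf "$sketch_dir"
}
```

Calling `run_sketch resources_test/file*.tsv` from the project root would print one extracted column per input file. The real pipeline's output may differ, since the actual components define their own arguments and formatting.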

Step 1: Get the template

To get up and running fast, we provide a template project for you to use. You can get it in one of two ways: clone it with Git or download it as a zip file.

Run the command below to clone the template repository to a new folder named advanced_pipeline:

git clone https://github.com/viash-io/viash_project_template advanced_pipeline

Alternatively, click the button below to download a zip file containing the template project. Once downloaded, unzip the file and rename the root directory to advanced_pipeline.

Download template

Step 2: Project overview

Open the advanced_pipeline project directory if you haven’t done so already. Here’s a quick overview of its contents:

  • The bin directory contains a single script to initialize the project; more on that in the next step
  • main.nf and nextflow.config are dummy files required for Nextflow to work correctly
  • README.md contains extra information on how to use the template
  • The resources_test directory contains two TSV files which will be used to test the example pipeline
  • src contains three Viash components, each with a config file and a script
  • The workflows directory contains a Nextflow script and config to be able to run the pipeline

Step 3: Use the init script

The project's bin directory contains the init script. Running it automatically downloads Viash and Nextflow to that same directory.

From the project's root directory, use the following command to run the init script located in the bin directory:

bin/init

After a short while, the script will finish and the following files will be in the bin directory:

bin
├── init
├── nextflow
├── viash
├── viash_build
├── viash_clean_nxf
├── viash_gendoc
├── viash_genrep
├── viash_install
├── viash_push
├── viash_skeleton
├── viash_tag
├── viash_test
└── viash_trafo

These are executables needed to build, test and run the pipeline.
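A quick way to confirm that the init script did its job is to check that the expected files exist and are executable. The sketch below does this for whichever tool names you pass in; the function name `check_tools` is made up for this example and is not part of the template.

```shell
# Report any expected tool that is missing or not executable in a directory.
# Usage: check_tools DIR TOOL...
# Returns 0 when all tools are present, 1 otherwise.
check_tools() {
  tools_dir=$1
  shift
  tools_missing=0
  for tool in "$@"; do
    if [ ! -x "$tools_dir/$tool" ]; then
      echo "missing: $tools_dir/$tool"
      tools_missing=1
    fi
  done
  return "$tools_missing"
}
```

For example, `check_tools bin nextflow viash viash_build viash_test` prints nothing and returns success when those four files from the listing above are in place.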

Step 4: Build the Viash components

With the tools present, you can build the Docker executables and Nextflow modules using Viash. To do so, make sure Docker is running and use the following command:

bin/viash_build

During the build, output similar to the following is printed:

In development mode with 'dev'.
Exporting remove_comments (demo) =nextflow=> target/nextflow/demo/remove_comments
Exporting combine_columns (demo) =docker=> target/docker/demo/combine_columns
Exporting take_column (demo) =nextflow=> target/nextflow/demo/take_column
Exporting take_column (demo) =docker=> target/docker/demo/take_column
Exporting remove_comments (demo) =docker=> target/docker/demo/remove_comments
Exporting combine_columns (demo) =nextflow=> target/nextflow/demo/combine_columns
[notice] Building container 'demo_combine_columns:dev' with Dockerfile
[notice] Building container 'demo_take_column:dev' with Dockerfile
[notice] Building container 'demo_remove_comments:dev' with Dockerfile

Once everything is built, you’ll find a new target directory containing the executables and modules, grouped per platform:

target/
├── docker
│   └── demo
│       ├── combine_columns
│       │   ├── combine_columns
│       │   └── viash.yaml
│       ├── remove_comments
│       │   ├── remove_comments
│       │   └── viash.yaml
│       └── take_column
│           ├── take_column
│           └── viash.yaml
└── nextflow
    └── demo
        ├── combine_columns
        │   ├── main.nf
        │   ├── nextflow.config
        │   └── viash.yaml
        ├── remove_comments
        │   ├── main.nf
        │   ├── nextflow.config
        │   └── viash.yaml
        └── take_column
            ├── main.nf
            ├── nextflow.config
            └── viash.yaml
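To verify the build from the command line, you can list every generated Nextflow module. The sketch below assumes the target/nextflow layout shown above, with one main.nf per module; `list_modules` is a hypothetical helper name.

```shell
# Print the directory of every generated Nextflow module (one per main.nf),
# relative to DIR/nextflow, in sorted order.
# Usage: list_modules DIR
list_modules() {
  (
    cd "$1/nextflow" &&
      find . -name main.nf |
      sed -e 's|^\./||' -e 's|/main\.nf$||' |
      sort
  )
}
```

With the directory tree shown above, `list_modules target` lists demo/combine_columns, demo/remove_comments and demo/take_column, one per line.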

Step 5: Run the pipeline

Now run the pipeline with Nextflow:

bin/nextflow run . \
  -main-script workflows/demo_pipeline/main.nf \
  -with-docker \
  --input resources_test/file*.tsv \
  --publishDir temp

This will run the three modules in sequence, with the final result being stored in a file named combined.combine_columns.output in a new temp directory:

"1"     0.11
"2"     0.23
"3"     0.35
"4"     0.47
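If you want to check the result in a script rather than by eye, one simple sanity check is to confirm that every line has the same number of fields. The sketch below assumes the columns are separated by tabs, which is a guess based on the file being derived from TSV input; `count_fields` is an illustrative helper, not part of the template.

```shell
# Print the distinct tab-separated field counts found in a file, one per line.
# A consistently shaped table yields exactly one number.
# Usage: count_fields FILE
count_fields() {
  awk -F '\t' '{ print NF }' "$1" | sort -un
}
```

You could run it as `count_fields temp/combined.combine_columns.output`. More than one number in the output means that some lines have a different number of columns than others.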

What’s next?

Now that you’ve had a taste of what Viash can do for you, take a look at our Guide and Reference pages to learn more about how to use Viash. If you want to start simple, we suggest taking a look at the Native component creation guide.