Add dependencies

Adding custom dependencies to a component.

In the previous section, the reproducibility of our Viash component was ensured by a predefined Docker image such as bash:4.0 or python:3.10. However, your script might require other software dependencies, such as command-line tools or Python and R packages.

By default, Viash will build component-specific Docker images. This means that every Viash component can have its own set of dependencies.

Extended example

Below are examples, one per scripting language, where additional software is added to a base Docker image using the setup section of a Docker engine.

Bash:

name: example_bash_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: bash_script
    path: script.sh
engines:
  - type: docker
    image: bash:4.0
    setup:
      - type: apk
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow

C#:

name: example_csharp_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: csharp_script
    path: script.csx
engines:
  - type: docker
    image: ghcr.io/data-intuitive/dotnet-script:1.3.1
    setup:
      - type: apk
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow

JavaScript:

name: example_js_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: javascript_script
    path: script.js
engines:
  - type: docker
    image: node:19-bullseye-slim
    setup:
      - type: apt
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow

Python:

name: example_python_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: python_script
    path: script.py
engines:
  - type: docker
    image: python:3.10-slim
    setup:
      - type: apt
        packages:
          - curl
          - wget
      - type: python
        packages: anndata
  - type: native
runners:
  - type: executable
  - type: nextflow

R:

name: example_r_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: r_script
    path: script.R
engines:
  - type: docker
    image: eddelbuettel/r2u:22.04
    setup:
      - type: apt
        packages:
          - curl
          - wget
      - type: r
        packages: tidyverse
  - type: native
runners:
  - type: executable
  - type: nextflow

Scala:

name: example_scala_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: scala_script
    path: script.scala
engines:
  - type: docker
    image: sbtscala/scala-sbt:eclipse-temurin-19_36_1.7.2_2.13.10
    setup:
      - type: apt
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow

You can (re)build a component’s Docker image by passing the ---setup flag to the executable:

Build the executable:

viash build config.vsh.yaml --engine docker --output target

Build the Docker image by running the executable of your component with ---setup cachedbuild:

target/example_bash_with_setup ---setup cachedbuild
[notice] Building container 'example_bash_with_setup:latest' with Dockerfile

target/example_csharp_with_setup ---setup cachedbuild
[notice] Building container 'example_csharp_with_setup:latest' with Dockerfile

target/example_js_with_setup ---setup cachedbuild
[notice] Building container 'example_js_with_setup:latest' with Dockerfile

target/example_python_with_setup ---setup cachedbuild
[notice] Building container 'example_python_with_setup:latest' with Dockerfile

target/example_r_with_setup ---setup cachedbuild
[notice] Building container 'example_r_with_setup:latest' with Dockerfile

target/example_scala_with_setup ---setup cachedbuild
[notice] Building container 'example_scala_with_setup:latest' with Dockerfile

Alternatively, you can build the executable and its corresponding Docker image in one go:

viash build config.vsh.yaml --engine docker --output target --setup cachedbuild

Steps for creating a custom Docker engine

Here is a series of steps you can follow to add a Docker engine to your Viash component from scratch.

Step 1: Choose a base image

First, choose a base Docker image. When deciding which base image to use, consider the size of the image and how trustworthy its source is.

Tip

If the container does not have Bash installed, don’t forget to install this in Step 2.

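The examples on this page use the following base images:

  • Bash: bash:4.0
  • C#: ghcr.io/data-intuitive/dotnet-script:1.3.1
  • JavaScript: node:19-bullseye-slim
  • Python: python:3.10-slim
  • R: eddelbuettel/r2u:22.04
  • Scala: sbtscala/scala-sbt:eclipse-temurin-19_36_1.7.2_2.13.10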

See the section on ‘minimum requirements’ when building a custom base image.

Step 2: Install additional dependencies

You can use the setup section to add many different types of layers. Here are some examples:

  • Apk requirements:

    setup:
      - type: apk
        packages: [ curl ]
  • Apt requirements:

    setup:
      - type: apt
        packages: [ curl ]
  • Docker requirements:

    setup:
      - type: docker
        build_args: "R_VERSION=hello_world"
        run: |
          echo 'Run a custom command'
          echo 'Foo' > /path/to/file.txt
  • JavaScript requirements:

    setup:
      - type: javascript
        packages: [ express ]
        github: [ "expressjs/express" ]
  • Python requirements:

    setup:
      - type: python
        packages: [ anndata ]
        github: [ jkbr/httpie ]
  • R requirements:

    setup:
      - type: r
        packages: [ anndata ]
        bioc: [ AnnotationDbi, SingleCellExperiment ]
        github: rcannood/SCORPIUS
  • Ruby requirements:

    setup:
      - type: ruby
        packages: [ pry ]
  • Yum requirements:

    setup:
      - type: yum
        packages: [ curl ]

For more information on the possible setup entries, check out the reference documentation.

Important

Don’t forget to rebuild the Docker image after making changes to the setup section of your Docker engine (see next step).

Step 3: Rebuild Docker image

After adding additional setup entries, it’s important to rerun ---setup cachedbuild to rebuild the Docker image, as Viash will not rebuild the Docker image when it already exists.

viash build config.vsh.yaml \
  --engine docker \
  --output target \
  --setup cachedbuild

When using a Docker engine, you can choose how the image is built or obtained by passing the --setup option followed by one of the strategies below.

Building an image:

  • alwaysbuild / build / b: Always build the image from the Dockerfile. This is the default setup strategy.
  • alwayscachedbuild / cachedbuild / cb: Always build the image from the Dockerfile, with caching enabled.
  • ifneedbebuild: Build the image if it does not exist locally.
  • ifneedbecachedbuild: Build the image with caching enabled if it does not exist locally.

Pulling an image:

  • alwayspull / pull / p: Try to pull the image from Docker Hub or the specified Docker registry.
  • alwayspullelsebuild / pullelsebuild: Try to pull the image from a registry and build it if it doesn’t exist.
  • alwayspullelsecachedbuild / pullelsecachedbuild: Try to pull the image from a registry and build it with caching if it doesn’t exist.
  • ifneedbepull: If the image does not exist locally, pull the image.
  • ifneedbepullelsebuild: If the image does not exist locally, pull the image. If the image cannot be pulled, build it.
  • ifneedbepullelsecachedbuild: If the image does not exist locally, pull the image. If the image cannot be pulled, build it with caching enabled.

Pushing an image:

Doing nothing:

  • donothing / meh: Do not build or pull anything.
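
The strategy names follow a regular pattern, which the sketch below spells out as a small lookup. It is only an illustration of the descriptions on this page, not Viash's actual implementation: plan_setup and unless_present are hypothetical helper names, and "else build" is read here as "build if pulling fails", which is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: encodes the strategy descriptions from this page,
# not Viash's actual implementation.
# Usage: plan_setup STRATEGY IMAGE_EXISTS_LOCALLY   (second argument: yes|no)

# Print "nothing" when the image is already present locally,
# otherwise print the given action.
unless_present() {
  if [ "$1" = "yes" ]; then echo "nothing"; else echo "$2"; fi
}

plan_setup() {
  local strategy="$1" exists="${2:-no}"
  case "$strategy" in
    alwaysbuild|build|b)                            echo "build" ;;
    alwayscachedbuild|cachedbuild|cb)               echo "cached build" ;;
    ifneedbebuild)                                  unless_present "$exists" "build" ;;
    ifneedbecachedbuild)                            unless_present "$exists" "cached build" ;;
    alwayspull|pull|p)                              echo "pull" ;;
    alwayspullelsebuild|pullelsebuild)              echo "pull, else build" ;;
    alwayspullelsecachedbuild|pullelsecachedbuild)  echo "pull, else cached build" ;;
    ifneedbepull)                                   unless_present "$exists" "pull" ;;
    ifneedbepullelsebuild)                          unless_present "$exists" "pull, else build" ;;
    ifneedbepullelsecachedbuild)                    unless_present "$exists" "pull, else cached build" ;;
    donothing|meh)                                  echo "nothing" ;;
    *) echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
}

plan_setup cachedbuild yes                  # prints: cached build
plan_setup ifneedbebuild yes                # prints: nothing
plan_setup ifneedbepullelsecachedbuild no   # prints: pull, else cached build
```

Read this way, the "always" strategies ignore whether the image exists locally, while the "ifneedbe" strategies only act when it is missing.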

Troubleshooting

Below are several steps that might help you troubleshoot the image when the setup fails.

View Dockerfile

You can view the actual Dockerfile used by Viash by passing the ---dockerfile flag:

target/example_bash_with_setup ---dockerfile
FROM bash:4.0
ENTRYPOINT []
RUN apk add --no-cache curl wget

LABEL org.opencontainers.image.description="Companion container for running component example_bash_with_setup"
LABEL org.opencontainers.image.created="2024-12-04T09:07:16Z"
target/example_csharp_with_setup ---dockerfile
FROM ghcr.io/data-intuitive/dotnet-script:1.3.1
ENTRYPOINT []
RUN apk add --no-cache curl wget

LABEL org.opencontainers.image.description="Companion container for running component example_csharp_with_setup"
LABEL org.opencontainers.image.created="2024-12-04T09:07:24Z"
target/example_js_with_setup ---dockerfile
FROM node:19-bullseye-slim
ENTRYPOINT []
RUN apt-get update && \
  DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
  rm -rf /var/lib/apt/lists/*

LABEL org.opencontainers.image.description="Companion container for running component example_js_with_setup"
LABEL org.opencontainers.image.created="2024-12-04T09:07:37Z"
target/example_python_with_setup ---dockerfile
FROM python:3.10-slim
ENTRYPOINT []
RUN apt-get update && \
  DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
  rm -rf /var/lib/apt/lists/*

RUN pip install --upgrade pip && \
  pip install --upgrade --no-cache-dir "anndata"

LABEL org.opencontainers.image.description="Companion container for running component example_python_with_setup"
LABEL org.opencontainers.image.created="2024-12-04T09:07:52Z"
target/example_r_with_setup ---dockerfile
FROM eddelbuettel/r2u:22.04
ENTRYPOINT []
RUN apt-get update && \
  DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
  rm -rf /var/lib/apt/lists/*

RUN Rscript -e 'if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")' && \
  Rscript -e 'remotes::install_cran(c("tidyverse"), repos = "https://cran.rstudio.com")'

LABEL org.opencontainers.image.description="Companion container for running component example_r_with_setup"
LABEL org.opencontainers.image.created="2024-12-04T09:08:18Z"
target/example_scala_with_setup ---dockerfile
FROM sbtscala/scala-sbt:eclipse-temurin-19_36_1.7.2_2.13.10
ENTRYPOINT []
RUN apt-get update && \
  DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
  rm -rf /var/lib/apt/lists/*

LABEL org.opencontainers.image.description="Companion container for running component example_scala_with_setup"
LABEL org.opencontainers.image.created="2024-12-04T09:09:00Z"

Enter debugging session

You can also hop in a Bash session inside the Docker image using the ---debug flag:

target/example_bash_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_bash_with_setup:latest'
root@93c38006a124:/pwd#
target/example_csharp_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_csharp_with_setup:latest'
root@93c38006a124:/pwd#
target/example_js_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_js_with_setup:latest'
root@93c38006a124:/pwd#
target/example_python_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_python_with_setup:latest'
root@93c38006a124:/pwd#
target/example_r_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_r_with_setup:latest'
root@93c38006a124:/pwd#
target/example_scala_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_scala_with_setup:latest'
root@93c38006a124:/pwd#

This is useful for interactively debugging issues inside the container. For example, you can figure out whether you need to use apk, apt or yum to install software, or search for the exact name of a package such as libcurl4-openssl-dev.
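
As a rough sketch of that workflow, the snippet below could be pasted into a debugging session to find out which package manager the image provides. The function name detect_pkg_manager is a hypothetical helper, and the check simply looks for the apk, apt-get and yum commands on the PATH.

```shell
# Run inside the container, for example in a ---debug session.
# Prints the first package manager found on the PATH, or "unknown".
detect_pkg_manager() {
  local pm
  for pm in apk apt-get yum; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  echo "unknown"
  return 1
}

detect_pkg_manager || true

# Once the package manager is known, search for the exact package name, e.g.:
#   apk search curl
#   apt-get update && apt-cache search libcurl
#   yum search libcurl
```

The name that the search returns is what goes into the packages list of the matching setup entry.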

Alternative solutions

There are multiple ways you might try to find a Docker image which contains the right set of dependencies for your component:

  • Browse Docker Hub: Look for a Docker image on Docker Hub or another Docker registry which has the right set of dependencies.
    • This is generally not recommended because it might take a long time to find a pre-existing image with the right set of dependencies.
    • Images from sources you cannot verify pose a security risk.
  • Write a custom Dockerfile: You can write a custom Dockerfile to build your own Docker image and store it in a Docker registry, effectively creating a new ‘trusted’ base image.
    • Requires manual bookkeeping of which Docker images are used in which components.
    • Not difficult but requires more know-how on how to build custom Docker images.
  • Use Viash setup to build component-specific images: The methodology described above.
    • Easier to add / change dependencies to one component without breaking another
    • Store images in a centralized container registry

Behind the scenes

Auto-mount

Any executable built by Viash with a Docker engine will automatically mount the directories of files passed to the executable as arguments. For example, when running:

./my_executable --input /foo/bar/file.txt --output /dest/path

The executable will automatically mount the /foo/bar and /dest folders to /viash_automount/foo/bar and /viash_automount/dest inside the Docker container.
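
The mapping in this example can be sketched with a few lines of shell. This is an illustration only, assuming absolute paths and that the parent directory of each file argument is what gets mounted, as in the example above; automount_for is a hypothetical helper, not part of Viash.

```shell
# Sketch only: prints the host:container mount implied by the example above
# for a single file argument. Assumes an absolute path.
automount_for() {
  local path="$1" dir
  dir="$(dirname "$path")"
  echo "${dir}:/viash_automount${dir}"
}

automount_for /foo/bar/file.txt   # prints: /foo/bar:/viash_automount/foo/bar
automount_for /dest/path          # prints: /dest:/viash_automount/dest
```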

Auto-chown

By default, files created or modified by a Docker container are owned by root. To avoid this, Viash automatically changes the owner of any files defined in the config file to the user running the executable. This behaviour can be disabled by setting chown to false in your config file.

Example with standard Docker:

docker run -v `pwd`:/pwd bash:4.0 touch /pwd/file.txt
ls -l
-rw-r--r--. 1 root     root         0 Jan 26 16:03 file.txt

Example with a Viash executable:

./my_executable --output file.txt
ls -l
-rw-r--r--. 1 myuser   myuser        Jan 26 16:03 file.txt
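
If you want to check ownership in a script rather than by reading the ls output, a minimal sketch could look like this. The function name owned_by_current_user is a hypothetical helper; it relies on the shell's -O file test, which is true when the file exists and is owned by the current user.

```shell
# Succeeds if FILE exists and is owned by the user running this shell.
owned_by_current_user() {
  [ -O "$1" ]
}

# Hypothetical usage after running a component that wrote file.txt:
if owned_by_current_user file.txt; then
  echo "file.txt is owned by $(id -un)"
else
  echo "file.txt is missing or owned by another user"
fi
```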

Minimum requirements for custom Docker images

Viash components only require a minimal set of dependencies which need to be available inside the Docker image:

  • Bash: bash.
  • C#: bash and dotnet-script.
  • JavaScript: bash and node (Node.js).
  • Python: bash, python and pip.
  • R: bash and R.
  • Scala: bash, openjdk-devel and sbt.
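
One way to check a custom image against this list is to look for the corresponding commands from inside the container, for example in a ---debug session. The sketch below is an assumption-laden illustration: required_tools and missing_tools are hypothetical helpers, the language keys are made up for this example, and java is used as a stand-in for openjdk-devel, which is a package name rather than a command.

```shell
# Tool names taken from the list above. "java" stands in for openjdk-devel,
# which is a package name rather than a command (assumption).
required_tools() {
  case "$1" in
    bash)       echo "bash" ;;
    csharp)     echo "bash dotnet-script" ;;
    javascript) echo "bash node" ;;
    python)     echo "bash python pip" ;;
    r)          echo "bash R" ;;
    scala)      echo "bash java sbt" ;;
    *) echo "unknown language: $1" >&2; return 1 ;;
  esac
}

# Print every required tool that is not found on the PATH.
missing_tools() {
  local tool tools
  tools="$(required_tools "$1")" || return 1
  for tool in $tools; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Prints nothing when all tools for a Python component are available.
missing_tools python
```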