Batch processing

Within this project you can do everything described in the “Component” guide, such as building a target executable with viash build or testing a component with viash test. However, doing this for every component in the repository can be quite tedious.

Luckily, Viash provides a set of commands for building, testing or inspecting all Viash components in the current namespace (ns).

Build all components

You can generate your first full development build using the viash ns build command.

viash ns build --setup cachedbuild
Output
Exporting remove_comments (template) =executable=> target/executable/template/remove_comments
[notice] Building container 'ghcr.io/viash-io/project_template/template/remove_comments:0.3.0' with Dockerfile
Exporting remove_comments (template) =nextflow=> target/nextflow/template/remove_comments
Exporting workflow (template) =nextflow=> target/nextflow/template/workflow
Exporting combine_columns (template) =executable=> target/executable/template/combine_columns
[notice] Building container 'ghcr.io/viash-io/project_template/template/combine_columns:0.3.0' with Dockerfile
Exporting combine_columns (template) =nextflow=> target/nextflow/template/combine_columns
Exporting take_column (template) =executable=> target/executable/template/take_column
[notice] Building container 'ghcr.io/viash-io/project_template/template/take_column:0.3.0' with Dockerfile
Exporting take_column (template) =nextflow=> target/nextflow/template/take_column
All 7 configs built successfully

Here are some useful optional arguments:

  • --parallel: Run multiple builds in parallel.
  • --setup cachedbuild: Build Docker images using the cachedbuild strategy.
  • --query demo: Only select components that have ‘demo’ in the namespace or name.

Test all components

You can run all of the component tests using the viash ns test command.

viash ns test
Output
           namespace                 name               runner               engine            test_name exit_code duration               result
            template      remove_comments           executable               docker                start                                        
The working directory for the namespace tests is /tmp/viash_ns_test13190949569644680698
            template      remove_comments           executable               docker     build_executable         0        0              SUCCESS
            template      remove_comments           executable               docker              test.sh         0        1              SUCCESS
            template      combine_columns           executable               docker                start                                        
            template      combine_columns           executable               docker     build_executable         0       10              SUCCESS
            template      combine_columns           executable               docker               test.R         0        2              SUCCESS
            template          take_column           executable               docker                start                                        
            template          take_column           executable               docker     build_executable         0        0              SUCCESS
            template          take_column           executable               docker              test.py         0        2              SUCCESS
All 6 configs built and tested successfully

List all components

You can list all components using the viash ns list command.

viash ns list
Output
All 4 configs parsed successfully
- name: "remove_comments"
  namespace: "template"
  version: "0.3.0"
  argument_groups:
  - name: "Arguments"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      example:
      - "resources_test/file1.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: false
      multiple_sep: ";"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      example:
      - "file.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ";"
  resources:
  - type: "bash_script"
    path: "script.sh"
    is_executable: true
  test_resources:
  - type: "bash_script"
    path: "test.sh"
    is_executable: true
  status: "enabled"
  license: "GPL-3.0"
  links:
    repository: "https://github.com/viash-io/viash_project_template"
    docker_registry: "ghcr.io"
  runners:
  - type: "executable"
    id: "executable"
    docker_setup_strategy: "ifneedbepullelsecachedbuild"
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1000000000.B"
        mem2gb: "memory = 2000000000.B"
        mem5gb: "memory = 5000000000.B"
        mem10gb: "memory = 10000000000.B"
        mem20gb: "memory = 20000000000.B"
        mem50gb: "memory = 50000000000.B"
        mem100gb: "memory = 100000000000.B"
        mem200gb: "memory = 200000000000.B"
        mem500gb: "memory = 500000000000.B"
        mem1tb: "memory = 1000000000000.B"
        mem2tb: "memory = 2000000000000.B"
        mem5tb: "memory = 5000000000000.B"
        mem10tb: "memory = 10000000000000.B"
        mem20tb: "memory = 20000000000000.B"
        mem50tb: "memory = 50000000000000.B"
        mem100tb: "memory = 100000000000000.B"
        mem200tb: "memory = 200000000000000.B"
        mem500tb: "memory = 500000000000000.B"
        mem1gib: "memory = 1073741824.B"
        mem2gib: "memory = 2147483648.B"
        mem4gib: "memory = 4294967296.B"
        mem8gib: "memory = 8589934592.B"
        mem16gib: "memory = 17179869184.B"
        mem32gib: "memory = 34359738368.B"
        mem64gib: "memory = 68719476736.B"
        mem128gib: "memory = 137438953472.B"
        mem256gib: "memory = 274877906944.B"
        mem512gib: "memory = 549755813888.B"
        mem1tib: "memory = 1099511627776.B"
        mem2tib: "memory = 2199023255552.B"
        mem4tib: "memory = 4398046511104.B"
        mem8tib: "memory = 8796093022208.B"
        mem16tib: "memory = 17592186044416.B"
        mem32tib: "memory = 35184372088832.B"
        mem64tib: "memory = 70368744177664.B"
        mem128tib: "memory = 140737488355328.B"
        mem256tib: "memory = 281474976710656.B"
        mem512tib: "memory = 562949953421312.B"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  engines:
  - type: "docker"
    id: "docker"
    image: "ubuntu:20.04"
    namespace_separator: "/"
    entrypoint: []
  build_info:
    config: "src/template/remove_comments/config.vsh.yaml"
    viash_version: "0.9.0-RC6"
    git_commit: "74d131952d15d22058d664a3eec7c8dc754bc57e"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.3.0"
  package_config:
    name: "project_template"
    version: "0.3.0"
    description: "This is a project template for viash projects.\n"
    viash_version: "0.9.0-RC6"
    source: "src"
    target: "target"
    keywords:
    - "viash"
    - "template"
    license: "GPL-3.0"
    organization: "viash-io"
    links:
      repository: "https://github.com/viash-io/viash_project_template"
      docker_registry: "ghcr.io"
      issue_tracker: "https://github.com/viash-io/viash_project_template/issues"
- name: "workflow"
  namespace: "template"
  version: "0.3.0"
  argument_groups:
  - name: "Arguments"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      description: "Input TSV file"
      example:
      - "resources_test/file1.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: false
      multiple_sep: ";"
    - type: "integer"
      name: "--column"
      description: "The column index to extract from the TSV input file"
      default:
      - 2
      required: false
      direction: "input"
      multiple: false
      multiple_sep: ";"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      description: "Output TSV file"
      example:
      - "output.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ";"
  resources:
  - type: "nextflow_script"
    path: "main.nf"
    is_executable: true
    entrypoint: "run_wf"
  description: "An example pipeline and project template.\n\nMultiple TSV files can\
    \ be input, each with a 'column' identifier that\nshould be extracted. All extracted\
    \ columns are then collated again.\n"
  status: "enabled"
  dependencies:
  - name: "template/combine_columns"
    repository:
      type: "local"
  - name: "template/remove_comments"
    repository:
      type: "local"
  - name: "template/take_column"
    repository:
      type: "local"
  license: "GPL-3.0"
  links:
    repository: "https://github.com/viash-io/viash_project_template"
    docker_registry: "ghcr.io"
  runners:
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1000000000.B"
        mem2gb: "memory = 2000000000.B"
        mem5gb: "memory = 5000000000.B"
        mem10gb: "memory = 10000000000.B"
        mem20gb: "memory = 20000000000.B"
        mem50gb: "memory = 50000000000.B"
        mem100gb: "memory = 100000000000.B"
        mem200gb: "memory = 200000000000.B"
        mem500gb: "memory = 500000000000.B"
        mem1tb: "memory = 1000000000000.B"
        mem2tb: "memory = 2000000000000.B"
        mem5tb: "memory = 5000000000000.B"
        mem10tb: "memory = 10000000000000.B"
        mem20tb: "memory = 20000000000000.B"
        mem50tb: "memory = 50000000000000.B"
        mem100tb: "memory = 100000000000000.B"
        mem200tb: "memory = 200000000000000.B"
        mem500tb: "memory = 500000000000000.B"
        mem1gib: "memory = 1073741824.B"
        mem2gib: "memory = 2147483648.B"
        mem4gib: "memory = 4294967296.B"
        mem8gib: "memory = 8589934592.B"
        mem16gib: "memory = 17179869184.B"
        mem32gib: "memory = 34359738368.B"
        mem64gib: "memory = 68719476736.B"
        mem128gib: "memory = 137438953472.B"
        mem256gib: "memory = 274877906944.B"
        mem512gib: "memory = 549755813888.B"
        mem1tib: "memory = 1099511627776.B"
        mem2tib: "memory = 2199023255552.B"
        mem4tib: "memory = 4398046511104.B"
        mem8tib: "memory = 8796093022208.B"
        mem16tib: "memory = 17592186044416.B"
        mem32tib: "memory = 35184372088832.B"
        mem64tib: "memory = 70368744177664.B"
        mem128tib: "memory = 140737488355328.B"
        mem256tib: "memory = 281474976710656.B"
        mem512tib: "memory = 562949953421312.B"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  build_info:
    config: "src/template/workflow/config.vsh.yaml"
    viash_version: "0.9.0-RC6"
    git_commit: "74d131952d15d22058d664a3eec7c8dc754bc57e"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.3.0"
  package_config:
    name: "project_template"
    version: "0.3.0"
    description: "This is a project template for viash projects.\n"
    viash_version: "0.9.0-RC6"
    source: "src"
    target: "target"
    keywords:
    - "viash"
    - "template"
    license: "GPL-3.0"
    organization: "viash-io"
    links:
      repository: "https://github.com/viash-io/viash_project_template"
      docker_registry: "ghcr.io"
      issue_tracker: "https://github.com/viash-io/viash_project_template/issues"
- name: "combine_columns"
  namespace: "template"
  version: "0.3.0"
  argument_groups:
  - name: "Arguments"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      example:
      - "resources_test/file1.tsv"
      - "resources_test/file2.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: true
      multiple_sep: ";"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      example:
      - "output.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ";"
  resources:
  - type: "r_script"
    path: "script.R"
    is_executable: true
  test_resources:
  - type: "r_script"
    path: "test.R"
    is_executable: true
  status: "enabled"
  license: "GPL-3.0"
  links:
    repository: "https://github.com/viash-io/viash_project_template"
    docker_registry: "ghcr.io"
  runners:
  - type: "executable"
    id: "executable"
    docker_setup_strategy: "ifneedbepullelsecachedbuild"
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1000000000.B"
        mem2gb: "memory = 2000000000.B"
        mem5gb: "memory = 5000000000.B"
        mem10gb: "memory = 10000000000.B"
        mem20gb: "memory = 20000000000.B"
        mem50gb: "memory = 50000000000.B"
        mem100gb: "memory = 100000000000.B"
        mem200gb: "memory = 200000000000.B"
        mem500gb: "memory = 500000000000.B"
        mem1tb: "memory = 1000000000000.B"
        mem2tb: "memory = 2000000000000.B"
        mem5tb: "memory = 5000000000000.B"
        mem10tb: "memory = 10000000000000.B"
        mem20tb: "memory = 20000000000000.B"
        mem50tb: "memory = 50000000000000.B"
        mem100tb: "memory = 100000000000000.B"
        mem200tb: "memory = 200000000000000.B"
        mem500tb: "memory = 500000000000000.B"
        mem1gib: "memory = 1073741824.B"
        mem2gib: "memory = 2147483648.B"
        mem4gib: "memory = 4294967296.B"
        mem8gib: "memory = 8589934592.B"
        mem16gib: "memory = 17179869184.B"
        mem32gib: "memory = 34359738368.B"
        mem64gib: "memory = 68719476736.B"
        mem128gib: "memory = 137438953472.B"
        mem256gib: "memory = 274877906944.B"
        mem512gib: "memory = 549755813888.B"
        mem1tib: "memory = 1099511627776.B"
        mem2tib: "memory = 2199023255552.B"
        mem4tib: "memory = 4398046511104.B"
        mem8tib: "memory = 8796093022208.B"
        mem16tib: "memory = 17592186044416.B"
        mem32tib: "memory = 35184372088832.B"
        mem64tib: "memory = 70368744177664.B"
        mem128tib: "memory = 140737488355328.B"
        mem256tib: "memory = 281474976710656.B"
        mem512tib: "memory = 562949953421312.B"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  engines:
  - type: "docker"
    id: "docker"
    image: "rocker/r2u:22.04"
    namespace_separator: "/"
    setup:
    - type: "r"
      packages:
      - "bit64"
      bioc_force_install: false
    test_setup:
    - type: "r"
      packages:
      - "processx"
      - "testthat"
      bioc_force_install: false
    entrypoint: []
  build_info:
    config: "src/template/combine_columns/config.vsh.yaml"
    viash_version: "0.9.0-RC6"
    git_commit: "74d131952d15d22058d664a3eec7c8dc754bc57e"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.3.0"
  package_config:
    name: "project_template"
    version: "0.3.0"
    description: "This is a project template for viash projects.\n"
    viash_version: "0.9.0-RC6"
    source: "src"
    target: "target"
    keywords:
    - "viash"
    - "template"
    license: "GPL-3.0"
    organization: "viash-io"
    links:
      repository: "https://github.com/viash-io/viash_project_template"
      docker_registry: "ghcr.io"
      issue_tracker: "https://github.com/viash-io/viash_project_template/issues"
- name: "take_column"
  namespace: "template"
  version: "0.3.0"
  argument_groups:
  - name: "Arguments"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      example:
      - "resources_test/file1.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: false
      multiple_sep: ";"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      example:
      - "path/to/output.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ";"
    - type: "integer"
      name: "--column"
      default:
      - 2
      required: false
      direction: "input"
      multiple: false
      multiple_sep: ";"
  resources:
  - type: "python_script"
    path: "script.py"
    is_executable: true
  test_resources:
  - type: "python_script"
    path: "test.py"
    is_executable: true
  status: "enabled"
  license: "GPL-3.0"
  links:
    repository: "https://github.com/viash-io/viash_project_template"
    docker_registry: "ghcr.io"
  runners:
  - type: "executable"
    id: "executable"
    docker_setup_strategy: "ifneedbepullelsecachedbuild"
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1000000000.B"
        mem2gb: "memory = 2000000000.B"
        mem5gb: "memory = 5000000000.B"
        mem10gb: "memory = 10000000000.B"
        mem20gb: "memory = 20000000000.B"
        mem50gb: "memory = 50000000000.B"
        mem100gb: "memory = 100000000000.B"
        mem200gb: "memory = 200000000000.B"
        mem500gb: "memory = 500000000000.B"
        mem1tb: "memory = 1000000000000.B"
        mem2tb: "memory = 2000000000000.B"
        mem5tb: "memory = 5000000000000.B"
        mem10tb: "memory = 10000000000000.B"
        mem20tb: "memory = 20000000000000.B"
        mem50tb: "memory = 50000000000000.B"
        mem100tb: "memory = 100000000000000.B"
        mem200tb: "memory = 200000000000000.B"
        mem500tb: "memory = 500000000000000.B"
        mem1gib: "memory = 1073741824.B"
        mem2gib: "memory = 2147483648.B"
        mem4gib: "memory = 4294967296.B"
        mem8gib: "memory = 8589934592.B"
        mem16gib: "memory = 17179869184.B"
        mem32gib: "memory = 34359738368.B"
        mem64gib: "memory = 68719476736.B"
        mem128gib: "memory = 137438953472.B"
        mem256gib: "memory = 274877906944.B"
        mem512gib: "memory = 549755813888.B"
        mem1tib: "memory = 1099511627776.B"
        mem2tib: "memory = 2199023255552.B"
        mem4tib: "memory = 4398046511104.B"
        mem8tib: "memory = 8796093022208.B"
        mem16tib: "memory = 17592186044416.B"
        mem32tib: "memory = 35184372088832.B"
        mem64tib: "memory = 70368744177664.B"
        mem128tib: "memory = 140737488355328.B"
        mem256tib: "memory = 281474976710656.B"
        mem512tib: "memory = 562949953421312.B"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  engines:
  - type: "docker"
    id: "docker"
    image: "python:3.10-slim"
    namespace_separator: "/"
    setup:
    - type: "python"
      user: false
      packages:
      - "pandas"
      upgrade: true
    - type: "apt"
      packages:
      - "procps"
      interactive: false
    entrypoint: []
  build_info:
    config: "src/template/take_column/config.vsh.yaml"
    viash_version: "0.9.0-RC6"
    git_commit: "74d131952d15d22058d664a3eec7c8dc754bc57e"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.3.0"
  package_config:
    name: "project_template"
    version: "0.3.0"
    description: "This is a project template for viash projects.\n"
    viash_version: "0.9.0-RC6"
    source: "src"
    target: "target"
    keywords:
    - "viash"
    - "template"
    license: "GPL-3.0"
    organization: "viash-io"
    links:
      repository: "https://github.com/viash-io/viash_project_template"
      docker_registry: "ghcr.io"
      issue_tracker: "https://github.com/viash-io/viash_project_template/issues"
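
Because the full listing is long, it can help to reduce it to the component names. The following is a minimal sketch that relies only on the layout shown above, where each component starts at column 0 with `- name:` and nested `- name:` keys are indented. The helper `component_names` and the abbreviated sample in `listing` are hypothetical and used for illustration only; the layout may differ between Viash versions.

```shell
# Abbreviated sample of the listing above (only a few keys kept).
listing='- name: "remove_comments"
  namespace: "template"
  argument_groups:
  - name: "Arguments"
- name: "workflow"
  namespace: "template"
  argument_groups:
  - name: "Arguments"'

# Hypothetical helper: keep only the top-level "- name:" lines (column 0)
# and strip the key and the surrounding quotes.
component_names() {
  grep '^- name:' | sed 's/^- name: "\(.*\)"$/\1/'
}

# Prints one component name per line.
printf '%s\n' "$listing" | component_names
```

Under the same assumption about the layout, the output of viash ns list could be piped into `component_names` instead of the sample.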

Custom batch processing

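The viash ns exec command can be used to run a command on every component. In the example below, {} is replaced by the path to each component's config file, as the output shows.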

viash ns exec "echo Hello {}"
Output
+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/config.vsh.yaml
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/config.vsh.yaml

  Exit code: 0

  Output:
+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/workflow/config.vsh.yaml
  Exit code: 0

  Output:
+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/config.vsh.yaml
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/workflow/config.vsh.yaml

  Exit code: 0

  Output:
+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/config.vsh.yaml
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/config.vsh.yaml

  Exit code: 0
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/config.vsh.yaml

  Output:
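
To make the substitution in the command above concrete, here is a rough emulation in plain shell. It is not how Viash implements viash ns exec; the helper `run_per_config` and the temporary demo directory are hypothetical. The sketch only mirrors what the output shows: the command runs once per config.vsh.yaml file, with {} replaced by that file's path.

```shell
# Hypothetical helper: run a command template once per config file,
# replacing every "{}" with the path of that file.
# Note: this simple sed substitution assumes paths without "|" or "&".
run_per_config() {
  template="$1"
  root="$2"
  find "$root" -name 'config.vsh.yaml' | sort | while IFS= read -r cfg; do
    cmd=$(printf '%s' "$template" | sed "s|{}|$cfg|g")
    sh -c "$cmd"
  done
}

# Demo project with two empty config files in a temporary directory.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/src/template/remove_comments" "$demo_root/src/template/take_column"
: > "$demo_root/src/template/remove_comments/config.vsh.yaml"
: > "$demo_root/src/template/take_column/config.vsh.yaml"

# Prints one "Hello <path>" line per config file.
run_per_config 'echo Hello {}' "$demo_root"
```

Unlike the captured output above, this emulation runs the commands one after another, so its lines are never interleaved.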

Tips

Parallel builds

Some of the commands shown above can be sped up by adding the --parallel option:

  • viash ns build --parallel will build in parallel
  • viash ns test --parallel will test in parallel

For example:

viash ns test --parallel
Output
           namespace                 name               runner               engine            test_name exit_code duration               result
The working directory for the namespace tests is /tmp/viash_ns_test15412689880712260478
            template          take_column           executable               docker                start                                        
            template      combine_columns           executable               docker                start                                        
            template      remove_comments           executable               docker                start                                        
            template      remove_comments           executable               docker     build_executable         0        0              SUCCESS
            template      remove_comments           executable               docker              test.sh         0        2              SUCCESS
            template          take_column           executable               docker     build_executable         0        1              SUCCESS
            template          take_column           executable               docker              test.py         0        2              SUCCESS
            template      combine_columns           executable               docker     build_executable         0        1              SUCCESS
            template      combine_columns           executable               docker               test.R         1        2                ERROR
====================================================================
+/tmp/viash_ns_test15412689880712260478/template_combine_columns/test_test/test_executable
Generating test data
[1] "Running combine_columns\n"
Checking results
Error: `table` not equal to `expected_table`.
Length mismatch: comparison on first 4 components
Component 2: Mean relative difference: 1.5
Component 3: Modes: numeric, character
Component 3: target is numeric, current is character
Component 4: Modes: character, logical
Component 4: target is character, current is logical
Execution halted

====================================================================
Not all configs built and tested successfully
  1/6 tests failed
  5/6 configs built and tested successfully
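
With parallel runs the rows arrive in arbitrary order, so failures can be easy to miss in a long table. The sketch below filters the table for rows whose result column is ERROR. It assumes the eight-column layout shown above, which may differ between Viash versions; the helper `failed_tests` is hypothetical and the sample rows are copied from the output above.

```shell
# Sample rows copied from the output above (header and a "start" row included).
results='           namespace                 name               runner               engine            test_name exit_code duration               result
            template          take_column           executable               docker                start
            template      remove_comments           executable               docker              test.sh         0        2              SUCCESS
            template      combine_columns           executable               docker     build_executable         0        1              SUCCESS
            template      combine_columns           executable               docker               test.R         1        2                ERROR'

# Hypothetical helper: print "namespace/name: test_name" for every complete
# row (8 fields) whose last column is ERROR.
failed_tests() {
  awk 'NF == 8 && $8 == "ERROR" { print $1 "/" $2 ": " $5 }'
}

# Prints: template/combine_columns: test.R
printf '%s\n' "$results" | failed_tests
```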

Subset to components or namespaces

In a development context, one often needs to rebuild one or a few components rather than the full repository. For this situation, viash ns accepts query arguments: --query, --query_name and --query_namespace. See the reference documentation for details; the following example illustrates their use:

viash ns build --query "^.*columns$"
Output
Exporting combine_columns (template) =executable=> target/executable/template/combine_columns
Exporting combine_columns (template) =nextflow=> target/nextflow/template/combine_columns
Not all configs built successfully
  3 configs were disabled
  2/2 configs built successfully

As shown here, the query arguments accept regular expressions.
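
A quick way to preview which components a pattern selects is to try it against the component identifiers with grep. This is a sketch under two assumptions: that the query is matched against an identifier of the form namespace/name, and that the pattern behaves the same in grep's extended regular expressions as in Viash (true for simple patterns like the one above, not necessarily for every regex feature). The helper `select_components` is hypothetical; the identifiers come from the listing earlier in this guide.

```shell
# Component identifiers as "namespace/name", taken from the listing above.
components="template/remove_comments
template/workflow
template/combine_columns
template/take_column"

# Hypothetical helper: print the identifiers matched by an extended regex.
select_components() {
  printf '%s\n' "$components" | grep -E "$1"
}

# Prints: template/combine_columns
select_components '^.*columns$'
```

Only combine_columns ends in "columns", which agrees with the build output above: one component is exported for its two runners, and the other three configs are reported as disabled.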