Batch processing

In this project you can do everything described in the “Component” guide, such as building a target executable with viash build or testing a component with viash test. However, doing this one component at a time for the whole repository can get quite tedious.

Luckily, Viash provides a set of commands for building, testing or inspecting all Viash components in the current namespace (ns).

Build all components

You can generate your first full development build using the viash ns build command.

viash ns build --setup cachedbuild
Output
temporaryFolder: /tmp/viash_hub_repo15147278532119400426 uri: https://viash-hub.com/data-intuitive/vsh-pipeline-operators.git
Cloning into '.'...
checkout out: List(git, checkout, tags/v0.2.0, --, .) 0 
Exporting combine_columns (template) =docker=> /home/runner/work/website/website/guide/_viash_project_template/target/docker/template/combine_columns
[notice] Building container 'ghcr.io/viash-io/viash_project_template/template/combine_columns:0.2.3' with Dockerfile
Exporting combine_columns (template) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/template/combine_columns
Exporting remove_comments (template) =docker=> /home/runner/work/website/website/guide/_viash_project_template/target/docker/template/remove_comments
[notice] Building container 'ghcr.io/viash-io/viash_project_template/template/remove_comments:0.2.3' with Dockerfile
Exporting remove_comments (template) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/template/remove_comments
Exporting workflow (template) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/template/workflow
Exporting take_column (template) =docker=> /home/runner/work/website/website/guide/_viash_project_template/target/docker/template/take_column
[notice] Building container 'ghcr.io/viash-io/viash_project_template/template/take_column:0.2.3' with Dockerfile
Exporting take_column (template) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/template/take_column
All 7 configs built successfully

Here are some useful optional arguments:

  • --parallel: Run multiple builds in parallel.
  • --setup cachedbuild: Build Docker images using the cachedbuild strategy.
  • --query demo: Only select components that have ‘demo’ in the namespace or name.
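
These options can be combined in a single call. The sketch below assembles the three flags from the list into one command, prints it, and only runs it when viash is found on the PATH. The value demo is just the placeholder from the list above; substitute a pattern that matches your own components.

```shell
#!/usr/bin/env bash
# Combine the optional arguments listed above into one build command.
cmd=(viash ns build --parallel --setup cachedbuild --query demo)

# Print the command that would be run.
echo "${cmd[*]}"

# Run it only when viash is installed
# (assumes the current directory is a Viash project).
if command -v viash >/dev/null 2>&1; then
  "${cmd[@]}"
fi
```

Keeping the command in an array makes it easy to add or drop a flag without worrying about quoting.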

Test all components

You can run all of the component tests using the viash ns test command.

viash ns test
Output
temporaryFolder: /tmp/viash_hub_repo10975427760112615476 uri: https://viash-hub.com/data-intuitive/vsh-pipeline-operators.git
Cloning into '.'...
checkout out: List(git, checkout, tags/v0.2.0, --, .) 0 
The working directory for the namespace tests is /tmp/viash_ns_test1587598001783550163
           namespace        functionality             platform            test_name exit_code duration               result
            template      combine_columns               docker                start                                        
            template      combine_columns               docker     build_executable         0        0              SUCCESS
            template      combine_columns               docker                tests        -1        0              MISSING
no tests found
====================================================================
            template      remove_comments               docker                start                                        
            template      remove_comments               docker     build_executable         0        0              SUCCESS
            template      remove_comments               docker              test.sh         0        1              SUCCESS
            template          take_column               docker                start                                        
            template          take_column               docker     build_executable         0        0              SUCCESS
            template          take_column               docker                tests        -1        0              MISSING
no tests found
====================================================================
Not all configs built and tested successfully
  2/6 tests missing
  4/6 configs built and tested successfully
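
The result column of this table can be tallied with standard tools. The following is a minimal sketch that counts rows per result value; it uses a whitespace-normalized copy of the rows shown above as sample input, and it assumes that in real use the table arrives on standard output so that it can be piped into the same awk command.

```shell
#!/usr/bin/env bash
# Sample rows copied from the test output above; in practice, pipe the
# real output of `viash ns test` into the awk command instead.
sample='namespace functionality platform test_name exit_code duration result
template combine_columns docker start
template combine_columns docker build_executable 0 0 SUCCESS
template combine_columns docker tests -1 0 MISSING
no tests found
template remove_comments docker start
template remove_comments docker build_executable 0 0 SUCCESS
template remove_comments docker test.sh 0 1 SUCCESS
template take_column docker start
template take_column docker build_executable 0 0 SUCCESS
template take_column docker tests -1 0 MISSING
no tests found'

# Count rows per result, skipping the header, the "start" rows and log lines.
summary=$(printf '%s\n' "$sample" |
  awk 'NF == 7 && $1 != "namespace" { n[$7]++ } END { for (r in n) print r, n[r] }' |
  sort)
echo "$summary"
```

For the sample rows this prints MISSING 2 and SUCCESS 4, which agrees with the summary lines at the end of the output above.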

List all components

You can list all components using the viash ns list command.

viash ns list
Output
temporaryFolder: /tmp/viash_hub_repo8156408546155443408 uri: https://viash-hub.com/data-intuitive/vsh-pipeline-operators.git
Cloning into '.'...
checkout out: List(git, checkout, tags/v0.2.0, --, .) 0 
- functionality:
    name: "combine_columns"
    namespace: "template"
    version: "0.2.3"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: true
      multiple_sep: ":"
      dest: "par"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    resources:
    - type: "r_script"
      path: "script.R"
      is_executable: true
      parent: "file:/home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/"
    status: "enabled"
    set_wd_to_resources_dir: false
  platforms:
  - type: "docker"
    id: "docker"
    image: "eddelbuettel/r2u:22.04"
    target_organization: "viash-io/viash_project_template"
    target_registry: "ghcr.io"
    namespace_separator: "/"
    resolve_volume: "Automatic"
    chown: true
    setup_strategy: "ifneedbepullelsecachedbuild"
    target_image_source: "https://github.com/viash-io/viash_project_template"
    setup:
    - type: "r"
      packages:
      - "bit64"
      bioc_force_install: false
    entrypoint: []
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1.GB"
        mem2gb: "memory = 2.GB"
        mem4gb: "memory = 4.GB"
        mem8gb: "memory = 8.GB"
        mem16gb: "memory = 16.GB"
        mem32gb: "memory = 32.GB"
        mem64gb: "memory = 64.GB"
        mem128gb: "memory = 128.GB"
        mem256gb: "memory = 256.GB"
        mem512gb: "memory = 512.GB"
        mem1tb: "memory = 1.TB"
        mem2tb: "memory = 2.TB"
        mem4tb: "memory = 4.TB"
        mem8tb: "memory = 8.TB"
        mem16tb: "memory = 16.TB"
        mem32tb: "memory = 32.TB"
        mem64tb: "memory = 64.TB"
        mem128tb: "memory = 128.TB"
        mem256tb: "memory = 256.TB"
        mem512tb: "memory = 512.TB"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  info:
    config: "/home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/config.vsh.yaml"
    viash_version: "0.8.4"
    git_commit: "e51a8b31beca2e3c1c043951c3e679d8f14fc8c7"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.2.3"
All 4 configs parsed successfully
- functionality:
    name: "remove_comments"
    namespace: "template"
    version: "0.2.3"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      example:
      - "file.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      example:
      - "file.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    resources:
    - type: "bash_script"
      path: "script.sh"
      is_executable: true
      parent: "file:/home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/"
    test_resources:
    - type: "bash_script"
      path: "test.sh"
      is_executable: true
      parent: "file:/home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/"
    status: "enabled"
    set_wd_to_resources_dir: false
  platforms:
  - type: "docker"
    id: "docker"
    image: "ubuntu:20.04"
    target_organization: "viash-io/viash_project_template"
    target_registry: "ghcr.io"
    namespace_separator: "/"
    resolve_volume: "Automatic"
    chown: true
    setup_strategy: "ifneedbepullelsecachedbuild"
    target_image_source: "https://github.com/viash-io/viash_project_template"
    entrypoint: []
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1.GB"
        mem2gb: "memory = 2.GB"
        mem4gb: "memory = 4.GB"
        mem8gb: "memory = 8.GB"
        mem16gb: "memory = 16.GB"
        mem32gb: "memory = 32.GB"
        mem64gb: "memory = 64.GB"
        mem128gb: "memory = 128.GB"
        mem256gb: "memory = 256.GB"
        mem512gb: "memory = 512.GB"
        mem1tb: "memory = 1.TB"
        mem2tb: "memory = 2.TB"
        mem4tb: "memory = 4.TB"
        mem8tb: "memory = 8.TB"
        mem16tb: "memory = 16.TB"
        mem32tb: "memory = 32.TB"
        mem64tb: "memory = 64.TB"
        mem128tb: "memory = 128.TB"
        mem256tb: "memory = 256.TB"
        mem512tb: "memory = 512.TB"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  info:
    config: "/home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/config.vsh.yaml"
    viash_version: "0.8.4"
    git_commit: "e51a8b31beca2e3c1c043951c3e679d8f14fc8c7"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.2.3"
- functionality:
    name: "workflow"
    namespace: "template"
    version: "0.2.3"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      description: "Input TSV file"
      example:
      - "file1.tar.gz"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    - type: "integer"
      name: "--column"
      description: "The column index to extract from the TSV input file"
      default:
      - 2
      required: false
      direction: "input"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      description: "Output TSV file"
      example:
      - "output.tsv"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    resources:
    - type: "nextflow_script"
      path: "main.nf"
      is_executable: true
      parent: "file:/home/runner/work/website/website/guide/_viash_project_template/src/template/workflow/"
      entrypoint: "run_wf"
    description: "An example pipeline and project template.\n\nMultiple TSV files\
      \ can be input, each with a 'column' identifier that\nshould be extracted. All\
      \ extracted columns are then collated again.\n"
    status: "enabled"
    dependencies:
    - name: "template/combine_columns"
      repository:
        type: "local"
        localPath: ""
      foundConfigPath: "/home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/config.vsh.yaml"
      configInfo:
        functionalityName: "combine_columns"
        git_tag: "v0.2.3"
        git_remote: "https://github.com/viash-io/viash_project_template.git"
        viash_version: "0.8.4"
        config: "/home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/config.vsh.yaml"
        functionalityNamespace: "template"
        output: ""
        platform: ""
        git_commit: "e51a8b31beca2e3c1c043951c3e679d8f14fc8c7"
        executable: ""
    - name: "template/remove_comments"
      repository:
        type: "local"
        localPath: ""
      foundConfigPath: "/home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/config.vsh.yaml"
      configInfo:
        functionalityName: "remove_comments"
        git_tag: "v0.2.3"
        git_remote: "https://github.com/viash-io/viash_project_template.git"
        viash_version: "0.8.4"
        config: "/home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/config.vsh.yaml"
        functionalityNamespace: "template"
        output: ""
        platform: ""
        git_commit: "e51a8b31beca2e3c1c043951c3e679d8f14fc8c7"
        executable: ""
    - name: "template/take_column"
      repository:
        type: "local"
        localPath: ""
      foundConfigPath: "/home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/config.vsh.yaml"
      configInfo:
        functionalityName: "take_column"
        git_tag: "v0.2.3"
        git_remote: "https://github.com/viash-io/viash_project_template.git"
        viash_version: "0.8.4"
        config: "/home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/config.vsh.yaml"
        functionalityNamespace: "template"
        output: ""
        platform: ""
        git_commit: "e51a8b31beca2e3c1c043951c3e679d8f14fc8c7"
        executable: ""
    - name: "join/vsh_toList"
      repository:
        type: "vsh"
        name: ""
        repo: "data-intuitive/vsh-pipeline-operators"
        tag: "v0.2.0"
        localPath: "/tmp/viash_hub_repo8156408546155443408"
      foundConfigPath: "/tmp/viash_hub_repo8156408546155443408/target/nextflow/join/vsh_toList/.config.vsh.yaml"
      configInfo:
        functionalityName: "vsh_toList"
        git_remote: "git@viash-hub.com:data-intuitive/vsh-pipeline-operators.git"
        viash_version: "0.8.3"
        config: "/Users/toni/code/projects/viash-hub/vsh-pipeline-operators/src/join/vsh_toList/config.vsh.yaml"
        functionalityNamespace: "join"
        output: "/Users/toni/code/projects/viash-hub/vsh-pipeline-operators/target/nextflow/join/vsh_toList"
        platform: "nextflow"
        git_commit: "05a5bfa4eaa2c04ba473671e4d30c6c18aceec6e"
        executable: "/Users/toni/code/projects/viash-hub/vsh-pipeline-operators/target/nextflow/join/vsh_toList/main.nf"
    repositories:
    - type: "local"
      name: "local"
      localPath: ""
    - type: "vsh"
      name: "vsh-pipeline-operators"
      repo: "data-intuitive/vsh-pipeline-operators"
      tag: "v0.2.0"
      localPath: ""
    set_wd_to_resources_dir: false
  platforms:
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1.GB"
        mem2gb: "memory = 2.GB"
        mem4gb: "memory = 4.GB"
        mem8gb: "memory = 8.GB"
        mem16gb: "memory = 16.GB"
        mem32gb: "memory = 32.GB"
        mem64gb: "memory = 64.GB"
        mem128gb: "memory = 128.GB"
        mem256gb: "memory = 256.GB"
        mem512gb: "memory = 512.GB"
        mem1tb: "memory = 1.TB"
        mem2tb: "memory = 2.TB"
        mem4tb: "memory = 4.TB"
        mem8tb: "memory = 8.TB"
        mem16tb: "memory = 16.TB"
        mem32tb: "memory = 32.TB"
        mem64tb: "memory = 64.TB"
        mem128tb: "memory = 128.TB"
        mem256tb: "memory = 256.TB"
        mem512tb: "memory = 512.TB"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  info:
    config: "/home/runner/work/website/website/guide/_viash_project_template/src/template/workflow/config.vsh.yaml"
    viash_version: "0.8.4"
    git_commit: "e51a8b31beca2e3c1c043951c3e679d8f14fc8c7"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.2.3"
- functionality:
    name: "take_column"
    namespace: "template"
    version: "0.2.3"
    arguments:
    - type: "file"
      name: "--input"
      alternatives:
      - "-i"
      must_exist: true
      create_parent: true
      required: true
      direction: "input"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    - type: "file"
      name: "--output"
      alternatives:
      - "-o"
      must_exist: true
      create_parent: true
      required: true
      direction: "output"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    - type: "integer"
      name: "--column"
      default:
      - 2
      required: false
      direction: "input"
      multiple: false
      multiple_sep: ":"
      dest: "par"
    resources:
    - type: "python_script"
      path: "script.py"
      is_executable: true
      parent: "file:/home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/"
    status: "enabled"
    set_wd_to_resources_dir: false
  platforms:
  - type: "docker"
    id: "docker"
    image: "python:3.10-slim"
    target_organization: "viash-io/viash_project_template"
    target_registry: "ghcr.io"
    namespace_separator: "/"
    resolve_volume: "Automatic"
    chown: true
    setup_strategy: "ifneedbepullelsecachedbuild"
    target_image_source: "https://github.com/viash-io/viash_project_template"
    setup:
    - type: "python"
      user: false
      packages:
      - "pandas"
      upgrade: true
    - type: "apt"
      packages:
      - "procps"
      interactive: false
    entrypoint: []
  - type: "nextflow"
    id: "nextflow"
    directives:
      tag: "$id"
    auto:
      simplifyInput: true
      simplifyOutput: false
      transcript: false
      publish: false
    config:
      labels:
        mem1gb: "memory = 1.GB"
        mem2gb: "memory = 2.GB"
        mem4gb: "memory = 4.GB"
        mem8gb: "memory = 8.GB"
        mem16gb: "memory = 16.GB"
        mem32gb: "memory = 32.GB"
        mem64gb: "memory = 64.GB"
        mem128gb: "memory = 128.GB"
        mem256gb: "memory = 256.GB"
        mem512gb: "memory = 512.GB"
        mem1tb: "memory = 1.TB"
        mem2tb: "memory = 2.TB"
        mem4tb: "memory = 4.TB"
        mem8tb: "memory = 8.TB"
        mem16tb: "memory = 16.TB"
        mem32tb: "memory = 32.TB"
        mem64tb: "memory = 64.TB"
        mem128tb: "memory = 128.TB"
        mem256tb: "memory = 256.TB"
        mem512tb: "memory = 512.TB"
        cpu1: "cpus = 1"
        cpu2: "cpus = 2"
        cpu5: "cpus = 5"
        cpu10: "cpus = 10"
        cpu20: "cpus = 20"
        cpu50: "cpus = 50"
        cpu100: "cpus = 100"
        cpu200: "cpus = 200"
        cpu500: "cpus = 500"
        cpu1000: "cpus = 1000"
    debug: false
    container: "docker"
  info:
    config: "/home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/config.vsh.yaml"
    viash_version: "0.8.4"
    git_commit: "e51a8b31beca2e3c1c043951c3e679d8f14fc8c7"
    git_remote: "https://github.com/viash-io/viash_project_template.git"
    git_tag: "v0.2.3"
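
Because the listing is plain YAML, simple text tools are often enough to pull out a single field. The sketch below extracts only the component names, relying on the indentation visible in the output above: component names are indented by exactly four spaces, while argument, dependency and repository names are indented differently. The sample is an abbreviated copy of the listing; this indentation-based approach is an assumption that holds for the output shown here, and a YAML-aware tool would be more robust.

```shell
#!/usr/bin/env bash
# Abbreviated sample of the listing above; in practice, pipe the output of
# `viash ns list` into the sed command instead.
sample='- functionality:
    name: "combine_columns"
    namespace: "template"
    arguments:
    - type: "file"
      name: "--input"
- functionality:
    name: "workflow"
    namespace: "template"
    dependencies:
    - name: "template/combine_columns"
    repositories:
    - type: "local"
      name: "local"'

# Keep only lines of the form `    name: "..."` (exactly four leading spaces)
# and print the quoted value.
names=$(printf '%s\n' "$sample" | sed -n 's/^    name: "\(.*\)"$/\1/p')
echo "$names"
```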

Custom batch processing

The viash ns exec command runs a command once for every component. As the output below shows, the {} placeholder is replaced with the path to each component's config file.

viash ns exec "echo Hello {}"
Output
+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/config.vsh.yaml
  Exit code: 0

  Output:
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/combine_columns/config.vsh.yaml

+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/config.vsh.yaml
  Exit code: 0

  Output:
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/remove_comments/config.vsh.yaml

+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/workflow/config.vsh.yaml
  Exit code: 0

  Output:
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/workflow/config.vsh.yaml

+ echo Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/config.vsh.yaml
  Exit code: 0

  Output:
Hello /home/runner/work/website/website/guide/_viash_project_template/src/template/take_column/config.vsh.yaml
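
For intuition, the {} substitution works much like the -exec option of find. The sketch below is only an analogy, not how Viash is implemented: it creates a throwaway directory with two hypothetical components (comp_a and comp_b) and runs echo once per config file. Unlike viash ns exec, find does not parse the configs, so it knows nothing about namespaces, queries or disabled components.

```shell
#!/usr/bin/env bash
# Build a throwaway project layout with two hypothetical components.
project=$(mktemp -d)
mkdir -p "$project/src/template/comp_a" "$project/src/template/comp_b"
touch "$project/src/template/comp_a/config.vsh.yaml" \
      "$project/src/template/comp_b/config.vsh.yaml"

# Run a command once per config file; find replaces {} with the file path,
# comparable to what `viash ns exec "echo Hello {}"` prints above.
result=$(find "$project/src" -name 'config.vsh.yaml' -exec echo Hello {} \; | sort)
echo "$result"

# Clean up the temporary directory.
rm -rf "$project"
```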

Tips

Parallel builds

Some of the commands shown above can be sped up by adding the --parallel option:

  • viash ns build --parallel will build in parallel
  • viash ns test --parallel will test in parallel

For example:

viash ns test --parallel
Output
temporaryFolder: /tmp/viash_hub_repo11676019065492068620 uri: https://viash-hub.com/data-intuitive/vsh-pipeline-operators.git
Cloning into '.'...
checkout out: List(git, checkout, tags/v0.2.0, --, .) 0 
The working directory for the namespace tests is /tmp/viash_ns_test8388312295931904174
           namespace        functionality             platform            test_name exit_code duration               result
            template      combine_columns               docker                start                                        
            template      remove_comments               docker                start                                        
            template          take_column               docker                start                                        
            template          take_column               docker     build_executable         0        1              SUCCESS
            template          take_column               docker                tests        -1        0              MISSING
no tests found
====================================================================
            template      combine_columns               docker     build_executable         0        1              SUCCESS
            template      combine_columns               docker                tests        -1        0              MISSING
no tests found
====================================================================
            template      remove_comments               docker     build_executable         0        1              SUCCESS
            template      remove_comments               docker              test.sh         0        1              SUCCESS
Not all configs built and tested successfully
  2/6 tests missing
  4/6 configs built and tested successfully

Subset to components or namespaces

In a development context, one often needs to rebuild one or a few components rather than the full repository. For this situation, viash ns accepts query arguments: --query, --query_name and --query_namespace. We refer to the reference documentation for details and illustrate their use with an example:

viash ns build --query "^.*columns$"
Output
Exporting combine_columns (template) =docker=> /home/runner/work/website/website/guide/_viash_project_template/target/docker/template/combine_columns
Exporting combine_columns (template) =nextflow=> /home/runner/work/website/website/guide/_viash_project_template/target/nextflow/template/combine_columns
Not all configs built successfully
  3 configs were disabled
  2/2 configs built successfully

As shown here, the query arguments accept regular expressions.
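
A pattern can be previewed before handing it to viash. The sketch below runs the pattern from the example against the four components of this project with grep -E. It assumes that the query is matched against the component name, optionally prefixed with its namespace, and that Viash's regular expression dialect agrees with grep -E for a simple pattern like this one.

```shell
#!/usr/bin/env bash
# The four components of this project, written as namespace/name.
components='template/combine_columns
template/remove_comments
template/workflow
template/take_column'

# Preview which entries the pattern from the example selects.
matched=$(printf '%s\n' "$components" | grep -E '^.*columns$')
echo "$matched"
```

Only template/combine_columns is printed: take_column ends in "column" rather than "columns", which is consistent with the build output above, where just combine_columns was exported.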