Creating a C# Component

Developing a new Viash component using C#.

In this tutorial, you’ll create a component that does the following:

  • Extract all hyperlinks from a markdown file
  • Check if every URL is reachable
  • Create a text report with the results

The component will be able to run locally and as a docker container. In order to create a component you need two files: a script for the functionality and a config file that describes the component.

The files used in this tutorial can be found here:

https://github.com/viash-io/viash_web/tree/main/static/examples/md_url_checker_csharp

Prerequisites

To follow along with this tutorial, you need to have this software installed on your machine:

We recommend you take a look at the hello world example first to understand how components work.
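
A quick way to confirm that the command-line tools invoked later in this tutorial are on your PATH is sketched below. The tool names (viash, dotnet, docker) are taken from the commands used in this tutorial. The helper name missing_tools is made up for illustration, and the sketch does not verify that the dotnet-script tool itself is installed.

```shell
# Print every tool from the argument list that is not found on PATH.
# missing_tools is a hypothetical helper name, not part of Viash.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Tools invoked later in this tutorial.
missing=$(missing_tools viash dotnet docker)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All tools found."
fi
```

docker is only needed for the docker platform steps, so a missing docker entry does not block the native parts of the tutorial.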

Write a script in C#

The first step in developing this component is writing its core functionality, in this case a C# script.
Create a new folder named my_viash_component and open it. Now create a new file named script.csx in there and add this code as its content:

#r "nuget: Markdig, 0.26.0"
#r "nuget: HtmlAgilityPack, 1.11.36"

using Markdig;
using HtmlAgilityPack;
using System.Net;

// 1

// VIASH START
var par = new {
  inputfile = "Testfile.md",
  domain = "https://viash.io",
  output = "output.txt",
};
// VIASH END

int amountOfErrors = 0;
string md =  File.ReadAllText(par.inputfile);


List<string> titles = new List<string>();
List<string> links = new List<string>();

// 2

string html = Markdown.ToHtml(md);
HtmlDocument htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);
HtmlNodeCollection htmlNodeCollection = htmlDocument.DocumentNode.SelectNodes("//a");


// 3

foreach (HtmlNode htmlNode in htmlNodeCollection)
{
    titles.Add(htmlNode.InnerText);
    string link = htmlNode.GetAttributeValue("href", null);

    // 4
    // If a URL doesn't start with 'http', add the domain before it
    if (!link.StartsWith("http"))
    {
        link = par.domain + link;
    }

    links.Add(link);
}

// Clear output file
File.WriteAllText(par.output, "");
StreamWriter sw = new StreamWriter(par.output);

// Iterate over all hyperlinks and check each URL
for (int i = 0; i < links.Count; i++)
{
    sw.WriteLine("Link name: " + titles[i]);
    sw.WriteLine("URL: " + links[i]);

    Console.WriteLine((i+1) + ": " + links[i]);
    WebRequest request = WebRequest.Create(links[i]);
    request.Method = "HEAD";
    
    // 5

    try
    {
        WebResponse response = request.GetResponse();
        Console.WriteLine(((HttpWebResponse)response).StatusCode);
        sw.WriteLine("Status: OK, can be reached.");
    }
    catch (System.Exception)
    {
        Console.WriteLine("404");
        sw.WriteLine("Status: ERROR! URL cannot be reached. Status code: 404");
        amountOfErrors++;
    }

    sw.WriteLine("---");
}

sw.Close();

Console.WriteLine();
Console.WriteLine($"{par.inputfile} has been checked and a report named {par.output} has been generated.");
Console.WriteLine($"{amountOfErrors} of {links.Count} URLs could not be resolved.");

Note the numbered comments of the form // X scattered through the code. Here’s a breakdown of what each one marks:

  1. The variables are placed between // VIASH START and // VIASH END for debugging purposes. Their final values will be dynamically generated by Viash once the script is turned into a component. If you want to skip testing your script, you can leave these out and Viash will create variables based on the configuration file. There are three variables:
    • inputfile: The markdown file that needs to be parsed.
    • domain: The domain URL that gets inserted before any relative URLs. For example, “/documentation/intro” could be replaced with “https://my-website/documentation/intro” to create a valid URL.
    • output: The path of the output text file that will contain the report.
  2. The script converts the markdown file to HTML and extracts the hyperlinks into an HtmlNodeCollection for later use.
  3. A foreach loop iterates over the hyperlinks.
  4. Any relative URLs (or at least those that don’t start with “http”) get the domain added before them.
  5. A web request is used to check for a response from the URL. The results get written to the terminal and the report. Note that the catch block reports any failure as a 404, whatever the actual cause.

Test the script

Before turning the script into a component, it’s a good idea to test if it actually works as expected.
As the script expects a markdown file with hyperlinks, create a new file in the script folder named Testfile.md and paste in the following:

# Test File

This is a simple markdown file with some hyperlinks to test if the check_if_URLS_reachable component works correctly.
Some links to websites:

- [Google](https://www.google.com)
- [Reddit](https://www.reddit.com)
- [A broken link](http://microsoft.com/random-link)

Links that are relative to [viash.io](http://www.viash.io):

- You can [install viash here](/guides/getting_started/installation).
- It all starts with a script and a [config file](/api/config/config) for your components.

Now open a terminal in the folder and execute the following command to run the C# script:

dotnet script script.csx

The script will now show the following output:

1: https://www.google.com
OK
2: https://www.reddit.com
OK
3: http://microsoft.com/random-link
404
4: http://www.viash.io
OK
5: https://viash.io/guides/getting_started/installation
404
6: https://viash.io/api/config/config
404

Testfile.md has been checked and a report named output.txt has been generated.
3 of 6 URLs could not be resolved.

If you get this same output, the script is working as intended! Feel free to take a peek at the generated output.txt file as well. You might have noticed you didn’t have to provide any arguments. That’s because the values are hard-coded into the script for debugging purposes.
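
If you want a quick summary of a generated report without reading it line by line, a small shell sketch like the following may help. It relies only on the two status lines the script writes ("Status: OK, ..." and "Status: ERROR! ..."). The function name summarize_report is hypothetical.

```shell
# Summarize a report written by script.csx.
# summarize_report is a hypothetical helper; it only counts status lines.
summarize_report() {
  report=$1
  # grep -c prints the count but exits non-zero when it is 0, hence "|| true".
  ok=$(grep -c '^Status: OK' "$report" || true)
  errors=$(grep -c '^Status: ERROR' "$report" || true)
  echo "reachable=$ok unreachable=$errors"
}

# Usage: summarize the report from the test run, if it exists.
if [ -f output.txt ]; then
  summarize_report output.txt
fi
```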

Now that the script has been tested, it’s time to create a config file to describe the component based on it.

Describe the component using YAML

A Viash config file is a YAML file that describes the behavior and supported platforms of a Viash component. Create a new file named config.vsh.yaml and paste the following template inside of it:

functionality:
  name: NAME
  description: DESCRIPTION
  arguments:                     
  - type: string
    name: --input
    description: INPUT DESCRIPTION
  resources:
  - type: LANGUAGE_script
    path: SCRIPT
platforms:
  - type: native

Every config file requires these two dictionaries: functionality and platforms. This bare-bones config file makes it easy to “fill in the blanks” for this example. For more information about config files, take a look at the Config section of the API.

Let’s start off by defining the functionality of our component.

Defining the functionality

The functionality dictionary describes what the component does and the resources it needs to do so. The first key is name, which will be the name of the component once it’s built. Replace the NAME value with md_url_checker_csharp or any other name of your choosing.

Next up is the description key. Its value will be printed at the top when the --help option is called. Replace DESCRIPTION with “Check if URLs in a markdown are reachable and create a text report with the results.” You can use multiple lines for a description by starting its value with a pipe (|) and a new line, like so:

functionality:
  name: md_url_checker_csharp
  description: |
    This is the first line of my description.
    Here's a second line!

The arguments dictionary contains all of the arguments that are accepted by the component. These arguments will be injected as variables in the script. In the case of the example script, these are the variables we’re working with:

  • inputfile
  • domain
  • output

To create good arguments, you need to ask yourself a few essential questions about each variable:

  • What is the most fitting data type?
  • Is it an input or an output?
  • Is it required?

Let’s take a closer look at inputfile for starters:

We know it’s a file, as the script needs the path to a markdown file as its input. It’s also definitely a required variable, as the script would be pointless without it.
With this in mind, modify the first argument as follows:

  • Change type’s value to file.
  • Set name’s value to --inputfile. The name of an argument has to match the variable name as the argument will be injected into the final script. In the case of C# scripts, the variables are added to an anonymous class named par.
  • Use “The input markdown file.” for the description value. This description will be included when the --help option is called.
  • Add a new key named required and set its value to true. This ensures that the component will not be run without a value for this argument.
  • Add another key, name it must_exist and set its value to true. This key is unique to file type arguments. It adds extra logic to the component to check if a file exists before running the component, which saves you from having to do this check yourself in the script.

That’s it for the first argument! The result should look like this:

  - type: file
    name: --inputfile
    description: The input markdown file.
    required: true
    must_exist: true

Now for domain, this is a simple optional string that gets added before relative URLs. Make room for a new argument by creating a new line below must_exist: true and press Shift + Tab to back up one tab so the cursor is aligned with the start of the first argument. Add the --domain argument here:

  - type: string                           
    name: --domain
    description: The domain URL that gets inserted before any relative URLs. For example, "/documentation/intro" could be replaced with "https://my-website/documentation/intro" to create a valid URL.

If an argument isn’t required, you can simply omit the required key. Here’s what the arguments dictionary looks like so far:

  arguments:                     
  - type: file
    name: --inputfile
    description: The input markdown file.
    required: true
    must_exist: true
  - type: string                           
    name: --domain
    description: The domain URL that gets inserted before any relative URLs. For example, "/documentation/intro" could be replaced with "https://my-website/documentation/intro" to create a valid URL.

The final variable to create an argument for is output. This is another file and clearly an output. Its value isn’t required as we can use a default path if no explicit value is given.
Add yet another new argument with the following keys and values:

  • Add a type key and set file as its value.
  • The next key is name, use --output as its value.
  • For the description, use “The path of the output text file that will contain the report.”.
  • Add a new key and name it default. This will act as the default value when not specified by the user of the component. Set its value to “output.txt”, including the quotation marks.
  • Finally, add the direction key and set its value to output. This specifies the direction of an argument as either input or output, with input being the default. Specifying that an argument is an output is important so the component can correctly handle the writing of files and the passing of values in a pipeline.

The finished argument should look like this:

  - type: file                           
    name: --output
    description: The path of the output text file that will contain the report.
    default: "output.txt"
    direction: output

With that, there’s just one more part of the functionality to fill in: the script itself!
Every Viash component has one or more resources, the most important of which is often the script. The template already contains a resources dictionary, so replace the following values to point to the script:

  • Set the value of type to csharp_script. The script used in this case was written in C#, so the resource type is set accordingly so Viash knows what flavour of code to generate to create the final component. You can find a full overview of the different resource types on the Functionality page.
  • Change the value of path to script.csx. This points to the resource and can be a relative path, an absolute path or even a URL. In this case we keep the script in the same directory as the config file to keep things simple.

That finishes up the functionality side of the component! All that’s left is defining the platforms with their dependencies and then running and building the component.

Defining the platforms

The platforms dictionary specifies the requirements to execute the component on zero or more platforms. The currently supported platforms are Native, Docker, and Nextflow. If no platforms are specified, a native platform is assumed. Here’s a quick overview of the platforms:

  • native: The platform for developers that know what they’re doing or for simple components without any dependencies. All dependencies need to be installed on the system the component is run on.
  • docker: This platform is recommended for most components. The dependencies are resolved by using docker containers, either from scratch or by pulling one from a docker repository. This has huge benefits as the end user doesn’t need to have any of the dependencies installed locally.
  • nextflow: This converts the component into a Nextflow module that can be imported into a pipeline.

In this tutorial, we’ll take a look at both the native and docker platforms. The platforms are also defined in the config.vsh.yaml file, at the very bottom. The native platform is actually already defined in the template: that one type key with a value of native is enough! To add the docker platform, add a new line below the last one and add the following:

  - type: docker
    image: "dataintuitive/dotnet-script:1.2.1"

This tells Viash that this component can be built to a docker container with an image containing dotnet-script (pinned here to tag 1.2.1) as its base. In order for Viash components to work, bash needs to be available inside the container.
Luckily, this isn’t a problem since Viash supports defining dependencies, which then get installed inside the docker container before running the script. To add the dependencies that need to be installed, add these lines below image: "dataintuitive/dotnet-script:1.2.1":

    setup:
      - type: apk
        packages: [ bash ]

This will prompt the apk package manager to download and install bash inside of the container. That’s it for the config! Be sure to save it and let’s move on to actually running the component you’ve created. For reference, you can take a look at the completed config.vsh.yaml file in our GitHub repository.
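
To compare your file against the steps above, the following sketch writes a config assembled purely from the fragments in this tutorial. The file name assembled_config.yaml is an arbitrary choice so that your own config.vsh.yaml is not overwritten, and the result may differ in details from the version in the repository.

```shell
# Write the config assembled from the tutorial steps to a separate file.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > assembled_config.yaml <<'EOF'
functionality:
  name: md_url_checker_csharp
  description: Check if URLs in a markdown are reachable and create a text report with the results.
  arguments:
  - type: file
    name: --inputfile
    description: The input markdown file.
    required: true
    must_exist: true
  - type: string
    name: --domain
    description: The domain URL that gets inserted before any relative URLs. For example, "/documentation/intro" could be replaced with "https://my-website/documentation/intro" to create a valid URL.
  - type: file
    name: --output
    description: The path of the output text file that will contain the report.
    default: "output.txt"
    direction: output
  resources:
  - type: csharp_script
    path: script.csx
platforms:
  - type: native
  - type: docker
    image: "dataintuitive/dotnet-script:1.2.1"
    setup:
      - type: apk
        packages: [ bash ]
EOF

# Show differences with your own config, if it exists in this folder.
if [ -f config.vsh.yaml ]; then
  diff config.vsh.yaml assembled_config.yaml || true
fi
```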

Run the component

Time to run the component! First off, let’s see what the output of --help is. To do that, open a terminal in the my_viash_component folder and execute the following command:

viash run config.vsh.yaml -- --help

This will show the following:

md_url_checker_csharp <not versioned>
Check if URLs in a markdown are reachable and create a text report with the results.

Options:
   --inputfile
        type: file, required parameter, file must exist
        The input markdown file.

   --domain
        type: string
        The domain URL that gets inserted before any relative URLs. For example, "/documentation/intro" could be replaced with "https://my-website/documentation/intro" to create a valid URL.

   --output
        type: file, output
        default: output.txt
        The path of the output text file that will contain the report.

As you can see, the values you entered into the config file are all here.
Next, let’s run the component natively with some arguments. You can use one of your own markdown files as the input if you desire. In that case, replace Testfile.md in the command with the path to your file.
Execute the following command to run the component with the default platform, in this case native as it’s the first in the platforms dictionary:

viash run config.vsh.yaml -- --inputfile=Testfile.md --domain=https://viash.io/ --output=my_report.txt

If all goes well, you’ll see something like this output in the terminal and a file named my_report.txt will have appeared:

/tmp/viash-run-md_url_checker_csharp-UzGs5k(68,26): warning SYSLIB0014: 'WebRequest.Create(string)' is obsolete: 'WebRequest, HttpWebRequest, ServicePoint, and WebClient are obsolete. Use HttpClient instead.'
1: https://www.google.com
OK
2: https://www.reddit.com
OK
3: http://microsoft.com/random-link
404
4: http://www.viash.io
OK
5: https://viash.io//guides/getting_started/installation
404
6: https://viash.io//api/config/config
404

Testfile.md has been checked and a report named my_report.txt has been generated.
3 of 6 URLs could not be resolved.
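
The doubled slash in URLs 5 and 6 above comes from passing --domain with a trailing slash while the relative links in Testfile.md also start with a slash; the script simply concatenates the two. Passing --domain=https://viash.io avoids it. Alternatively, the join could be normalized. The idea is sketched here in shell for illustration only: join_url is a hypothetical helper, and the tutorial's script does not do this.

```shell
# Join a domain and a relative path with exactly one slash between them.
# join_url is a hypothetical helper used only to illustrate the idea.
join_url() {
  domain=${1%/}   # drop one trailing slash from the domain, if present
  path=${2#/}     # drop one leading slash from the path, if present
  echo "$domain/$path"
}

join_url "https://viash.io/" "/guides/getting_started/installation"
join_url "https://viash.io" "api/config/config"
```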

Great! With that working, the next step is building an executable. For more information on the run command, take a look at the Viash run command page.

Building an executable

You can generate an executable using either the native or the docker platform. The former will generate a file that can be run locally, but depends on your locally installed software packages to work. A docker executable on the other hand can build and start up a docker container that handles the dependencies for you.
To create a native build, execute the following command:

viash build config.vsh.yaml

A new folder named output will have been created with an executable inside named md_url_checker_csharp. To test it out, execute the following command:

output/md_url_checker_csharp --inputfile=Testfile.md --domain=https://viash.io/ --output=my_report.txt

The output is the same as when running the component with viash run, but the executable can be easily shared, accepts arguments directly, and includes a --help option. Not bad!
Next up is the docker executable. You can specify the platform with the -p argument and choose an output folder using -o, apart from that it’s the same as the previous build command:

viash build -p docker -o docker_output config.vsh.yaml 

You’ll now have a docker_output folder alongside the output one. This folder also contains a file named md_url_checker_csharp, but its inner workings are slightly different than before. Run md_url_checker_csharp with the full arguments list to test what happens:

docker_output/md_url_checker_csharp --inputfile=Testfile.md --domain=https://viash.io/ --output=my_report.txt

Here’s what just happened:

  • If the docker image wasn’t found, Viash will download it.
  • A check is made to see if a container named “md_url_checker_csharp” exists. If not, one will be built with the image defined in the config as its base.
  • All dependencies defined in the config are taken care of.
  • The script is run with the passed arguments and the output is passed to your shell. The my_report.txt file is written to your working directory.

For more information about the viash build command, take a look at its command page. That concludes the building of executables based on components using Viash!

Writing and running a unit test

To finish off this tutorial, it’s important to talk about unit tests. To ensure that your component works as expected during its development cycle, writing one or more tests is essential. Luckily, writing a unit test for a Viash component is straightforward.

You just need to add test parameters in the config file and write a script which runs the executable and verifies the output. When running tests, Viash will automatically build an executable and place it alongside the other defined resources in a temporary working directory. To get started, open up the config.vsh.yaml file again and add this at the end of the functionality dictionary, between the path: script.csx and platforms: lines:

  tests:
  - type: bash_script
    path: test.sh
  - path: Testfile.md

This tests dictionary contains a reference to the test script and all of the files that need to be copied over in order to complete a test. In the case of our example, test.sh will be the test script and Testfile.md is necessary as an input markdown file is required for the script to function. Now create a new file named test.sh in the my_viash_component folder and add this as its content:

set -ex # exit the script when one of the checks fails and output all commands.

# check 1
echo ">>> Checking whether output is correct"

# run component with required input(s)
./md_url_checker_csharp --inputfile Testfile.md > test-output.txt

[[ ! -f test-output.txt ]] && echo "Test output file could not be found!" && exit 1
grep -q '1: https://www.google.com' test-output.txt # Did the script find the URL?
grep -q '404' test-output.txt  # Did the web request return a 404 for the page that doesn't exist?

# check 2
echo ">>> Checking whether an output file was created correctly"

[[ ! -f output.txt ]] && echo "Output file could not be found!" && exit 1
grep -q 'URL: https://www.google.com' output.txt # Was the URL written correctly in the report?
grep -q 'Status: ERROR! URL cannot be reached. Status code: 404' output.txt # Was the error written correctly in the report?
grep -q 'Link name: install viash here' output.txt # Was link name written correctly in the report?

echo ">>> Test finished successfully!"
exit 0 # don't forget to put this at the end

This bash script will run the component and perform several checks on its output. A successful test runs all the way down and exits with a 0 exit code; any other code means a failure:

  • set -ex will stop the script once any of the lines fails and will print each command to the shell with a ‘+’ before it.
  • ./md_url_checker_csharp --inputfile Testfile.md > test-output.txt runs the component and writes its output to a file.
  • [[ ! -f test-output.txt ]] && echo "Test output file could not be found!" && exit 1 checks if the output file exists and, if it doesn’t, exits with a 1 code.
  • All of the grep calls check if a certain piece of text could be found. Each of these calls can exit the script if the text wasn’t found.
  • If everything succeeded, exit with a 0 code. Make sure not to forget this final line in your own tests.
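
Because grep -q prints nothing, a failing check only shows up as the last traced command before the script stops. A small helper that prints a message before failing can make test output easier to read. The sketch below is one possible shape; assert_contains is a hypothetical function name, not something Viash provides.

```shell
# Fail with a readable message when a pattern is missing from a file.
# assert_contains is a hypothetical helper for use inside test.sh.
assert_contains() {
  pattern=$1
  file=$2
  if grep -q -- "$pattern" "$file"; then
    return 0
  fi
  echo "Expected to find '$pattern' in $file" >&2
  return 1
}

# Usage inside test.sh (with set -e, a non-zero return stops the script):
# assert_contains '1: https://www.google.com' test-output.txt
```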

Make sure both the config and test files are saved, then run a test by running this command:

viash test config.vsh.yaml 

The output will look like this:

Running tests in temporary directory: '/tmp/viash_test_md_url_checker_csharp8103297969290591143'
====================================================================
+/tmp/viash_test_md_url_checker_csharp8103297969290591143/test_test.sh/test.sh
>>> Checking whether output is correct
+ echo '>>> Checking whether output is correct'
+ ./md_url_checker_csharp --inputfile Testfile.md
+ [[ ! -f test-output.txt ]]
+ grep -q '1: https://www.google.com' test-output.txt
+ grep -q 404 test-output.txt
+ echo '>>> Checking whether an output file was created correctly'
>>> Checking whether an output file was created correctly
+ [[ ! -f output.txt ]]
+ grep -q 'URL: https://www.google.com' output.txt
+ grep -q 'Status: ERROR! URL cannot be reached. Status code: 404' output.txt
+ grep -q 'Link name: install viash here' output.txt
>>> Test finished successfully!
+ echo '>>> Test finished successfully!'
+ exit 0
====================================================================
SUCCESS! All 1 out of 1 test scripts succeeded!
Cleaning up temporary directory

If the test succeeds it simply writes the full output to the shell. If there are any issues, the script stops and an error message will appear in red. For more information on tests take a look at the viash test command page.

What’s next?

Now you’re ready to use Viash to create components from your own scripts. Check out the rest of our guides and the API section. Here are some good starting points: