<div style="border: 2px solid #8A9AD0; margin: 1em 0.2em; padding: 0.5em;">

# Best practices for workflows in GitHub repositories

by [Simone Leo](https://training.galaxyproject.org/hall-of-fame/simleo/), [Eli Chadwick](https://training.galaxyproject.org/hall-of-fame/elichad/)

Apache-2.0 licensed content from the [Galaxy Training Network](https://training.galaxyproject.org/)

**Questions**

- What are workflow best practices?
- How does RO-Crate help?

**Objectives**

- Generate a workflow test using Planemo
- Understand how testing can be automated with GitHub Actions

**Time Estimation: 30M**
</div>


<p>A workflow, just like any other piece of software, can be formally correct and runnable but still lack a number of additional features that might help its reusability, interoperability, understandability, etc.</p>
<p>One of the most useful additions to a workflow is a suite of tests, which help check that the workflow is operating as intended. A test case consists of a set of inputs and corresponding expected outputs, together with a procedure for comparing the workflow’s actual outputs with the expected ones. A test may be considered successful even if the actual outputs do not match the expected ones exactly, for instance because the computation involves a certain degree of randomness, or the output includes timestamps or randomly generated identifiers.</p>
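<p>To illustrate what a tolerant comparison procedure can look like, here is a minimal Python sketch that masks timestamps and UUID-like identifiers before comparing two outputs. The patterns and function names are assumptions made for this example; this is not how Planemo or Galaxy compare outputs internally.</p>

```python
import re

# Hypothetical patterns for volatile content; adapt them to your own outputs.
VOLATILE_PATTERNS = [
    # ISO-like timestamps, e.g. 2024-01-01 10:00:00 or 2024-01-01T10:00:00
    re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}"),
    # Lowercase UUIDs, e.g. 123e4567-e89b-12d3-a456-426614174000
    re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"),
]


def normalize(text: str) -> str:
    """Replace every volatile substring with a fixed placeholder."""
    for pattern in VOLATILE_PATTERNS:
        text = pattern.sub("<VOLATILE>", text)
    return text


def outputs_match(expected: str, actual: str) -> bool:
    """Compare two outputs while ignoring timestamps and UUID-like identifiers."""
    return normalize(expected) == normalize(actual)


expected = "run 2024-01-01 10:00:00 id=123e4567-e89b-12d3-a456-426614174000 total=42\n"
actual = "run 2025-06-30 23:59:59 id=00000000-0000-4000-8000-000000000000 total=42\n"

print(outputs_match(expected, actual))                      # True
print(outputs_match(expected, actual.replace("42", "43")))  # False
```

<p>The two outputs differ only in their timestamp and identifier, so the first comparison succeeds, while a change in the actual result (the total) is still detected.</p>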
<p>Providing documentation is also important to help understand the workflow’s purpose and mode of operation, its requirements, the effect of its parameters, etc. Even a single, well-structured README file can go a long way towards getting users started with your workflow, especially if complemented by examples that include sample inputs and running instructions.</p>
<blockquote class="agenda" style="border: 2px solid #86D486;display: none; margin: 1em 0.2em">
<div class="box-title agenda-title" id="agenda">Agenda</div>
<p>In this tutorial, you will learn about the best practices that the Galaxy community
has created for workflows.</p>
<ol id="markdown-toc">
<li><a href="#community-best-practices" id="markdown-toc-community-best-practices">Community best practices</a></li>
<li><a href="#best-practice-repositories-and-ro-crate" id="markdown-toc-best-practice-repositories-and-ro-crate">Best practice repositories and RO-Crate</a></li>
</ol>
</blockquote>
<blockquote class="tip" style="border: 2px solid #FFE19E; margin: 1em 0.2em">
<div class="box-title tip-title" id="tip-using-your-own-workflow"><button class="gtn-boxify-button tip" type="button" aria-controls="tip-using-your-own-workflow" aria-expanded="true"><i class="far fa-lightbulb" aria-hidden="true" ></i> <span>Tip: Using your own workflow</span><span class="fold-unfold fa fa-minus-square"></span></button></div>
<p>This tutorial assumes that you already have a Galaxy workflow that you want to apply best practices to. You can follow along using any workflow you have created or imported during a previous tutorial (such as <a href="{% link topics/introduction/tutorials/galaxy-intro-short/workflows/index.md %}">A short introduction to Galaxy</a>).</p>
</blockquote>
<h2 id="community-best-practices">Community best practices</h2>
<p>Though the practices listed in the introduction can be considered general enough to be applicable to any kind of software, individual communities usually add their own specific sets of rules and conventions that help users quickly find their way around software projects, understand them more easily and reuse them more effectively. The Galaxy community, for instance, has a <a href="https://planemo.readthedocs.io/en/latest/best_practices_workflows.html">guide on best practices for maintaining workflows</a> and a built-in Best Practices panel in the workflow editor.</p>
<blockquote class="hands_on" style="border: 2px solid #dfe5f9; margin: 1em 0.2em">
<div class="box-title hands-on-title" id="hands-on-apply-best-practices-for-workflow-structure"><i class="fas fa-pencil-alt" aria-hidden="true" ></i> Hands-on: Apply best practices for workflow structure</div>
<ol>
<li>Open your workflow for editing and find the <strong>Best Practices</strong> panel.</li>
<li>Resolve the warnings that appear until every item has a green tick.</li>
</ol>
</blockquote>
<p>The <a href="https://github.com/galaxyproject/iwc">Intergalactic Workflow Commission (IWC)</a> is a collection of highly curated Galaxy workflows that follow best practices and conform to a specific GitHub directory layout, as specified in the <a href="https://github.com/galaxyproject/iwc/blob/main/workflows/README.md#adding-workflows">guide on adding workflows</a>. In particular, the workflow file must be accompanied by a <a href="https://planemo.readthedocs.io/en/latest/test_format.html">Planemo test file</a> with the same base name and a <code style="color: inherit">-tests.yml</code> suffix, and a <code style="color: inherit">test-data</code> directory that contains the datasets used by the tests described in the test file. The guide also specifies how to fulfill other requirements such as setting a license, a creator and a version tag. A new workflow can be proposed for inclusion in the collection by opening a pull request to the <a href="https://github.com/galaxyproject/iwc">IWC repository</a>: if it passes the review and is merged, it will be published to <a href="https://github.com/iwc-workflows">iwc-workflows</a>. The publication process also generates a metadata file that turns the repository into a <a href="https://crs4.github.io/life_monitor/workflow_testing_ro_crate">Workflow Testing RO-Crate</a>, which can be registered to <a href="https://workflowhub.eu/">WorkflowHub</a> and <a href="https://www.lifemonitor.eu/">LifeMonitor</a>.</p>
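<p>As a rough illustration of the layout described above, the following Python sketch checks a directory for a <code style="color: inherit">.ga</code> workflow file, a matching test file and a <code style="color: inherit">test-data</code> directory. The function name and the default <code style="color: inherit">-tests.yml</code> suffix (the name used for the generated test file later in this tutorial) are assumptions made for this example; the check is far less thorough than an IWC review and ignores licenses, creators and version tags.</p>

```python
import tempfile
from pathlib import Path
from typing import List


def check_layout(repo_dir, test_suffix: str = "-tests.yml") -> List[str]:
    """Return a list of layout problems; an empty list means none were found."""
    repo = Path(repo_dir)
    problems = []
    workflows = sorted(repo.glob("*.ga"))
    if not workflows:
        problems.append("no .ga workflow file found")
    for workflow in workflows:
        # The test file shares the workflow's base name, plus the test suffix.
        test_file = workflow.with_name(workflow.stem + test_suffix)
        if not test_file.is_file():
            problems.append(f"missing test file: {test_file.name}")
    if not (repo / "test-data").is_dir():
        problems.append("missing test-data directory")
    return problems


# Demonstration on a throwaway directory with placeholder file contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sort-and-change-case.ga").write_text("{}")
    print(check_layout(root))  # reports the missing test file and directory
    (root / "sort-and-change-case-tests.yml").write_text("[]")
    (root / "test-data").mkdir()
    print(check_layout(root))  # []
```

<p>To check a real repository, call <code style="color: inherit">check_layout</code> with the path of the directory that contains the workflow file.</p>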
<h2 id="best-practice-repositories-and-ro-crate">Best practice repositories and RO-Crate</h2>
<p>The <a href="https://github.com/crs4/repo2rocrate">repo2rocrate</a> software package generates a <a href="https://crs4.github.io/life_monitor/workflow_testing_ro_crate">Workflow Testing RO-Crate</a> for a workflow repository that follows community best practices. It currently supports Galaxy (based on IWC guidelines), Nextflow and Snakemake. The tool assumes that the workflow repository is structured according to the community guidelines and generates the appropriate <a href="https://w3id.org/ro/crate/">RO-Crate</a> metadata for the various entities. Several command line options let you specify additional information that cannot be automatically detected or needs to be overridden.</p>
<p>To try the software, we’ll clone one of the iwc-workflows repositories, whose layout is known to respect the IWC guidelines. Since it already contains an RO-Crate metadata file, we’ll delete it before running the tool.</p>


In [None]:
pip install repo2rocrate
git clone https://github.com/iwc-workflows/parallel-accession-download
cd parallel-accession-download/
rm -fv ro-crate-metadata.json
repo2rocrate --repo-url https://github.com/iwc-workflows/parallel-accession-download

<p>This adds an <code style="color: inherit">ro-crate-metadata.json</code> file at the top level with metadata generated based on the tool’s knowledge of the expected repository layout. By specifying a zip file as an output with the <code style="color: inherit">-o</code> option, we can directly generate an RO-Crate in the format accepted by WorkflowHub and LifeMonitor:</p>


In [None]:
repo2rocrate --repo-url https://github.com/iwc-workflows/parallel-accession-download -o ../parallel-accession-download.crate.zip
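
<p>To get a quick overview of what the tool produced, you can inspect the generated metadata file. RO-Crate metadata is a JSON-LD document whose entities are listed under the <code style="color: inherit">@graph</code> key, so a short Python sketch can count the entities by type. The function names are assumptions made for this example, and the output depends on the repository, so none is shown here.</p>

```python
import json
from collections import Counter
from pathlib import Path


def count_types(metadata: dict) -> Counter:
    """Count the entities of an RO-Crate metadata document by their @type."""
    counts = Counter()
    for entity in metadata.get("@graph", []):
        types = entity.get("@type", [])
        # @type can be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        counts.update(types)
    return counts


def load_metadata(path) -> dict:
    """Read an RO-Crate metadata file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Run from the top level of the repository, after repo2rocrate.
    metadata_path = Path("ro-crate-metadata.json")
    if metadata_path.is_file():
        for type_name, n in count_types(load_metadata(metadata_path)).most_common():
            print(f"{type_name}: {n}")
    else:
        print("ro-crate-metadata.json not found in the current directory")
```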

<h2 id="generating-tests-for-your-workflow">Generating tests for your workflow</h2>
<p>What if you only have a workflow, but you don’t have the test layout yet? You can use Planemo to generate it.</p>


In [None]:
pip install planemo

<p>As an example we will use this <a href="https://github.com/crs4/life_monitor/blob/50cdb790ff125613aa07e70cb439e3a36b82d0bf/interaction_experiments/workflow_examples/galaxy/sort-and-change-case/sort-and-change-case.ga">simple workflow</a>, which has only two steps: it sorts the input lines and changes them to upper case. Follow these steps to generate a test layout for it:</p>
<blockquote class="hands_on" style="border: 2px solid #dfe5f9; margin: 1em 0.2em">
<div class="box-title hands-on-title" id="hands-on-generate-workflow-tests-with-planemo"><i class="fas fa-pencil-alt" aria-hidden="true" ></i> Hands-on: Generate workflow tests with Planemo</div>
<ol>
<li>Download <a href="https://raw.githubusercontent.com/crs4/life_monitor/50cdb790ff125613aa07e70cb439e3a36b82d0bf/interaction_experiments/workflow_examples/galaxy/sort-and-change-case/sort-and-change-case.ga">the workflow</a> to a <code style="color: inherit">sort-and-change-case.ga</code> file.</li>
<li>Download <a href="https://raw.githubusercontent.com/crs4/life_monitor/50cdb790ff125613aa07e70cb439e3a36b82d0bf/interaction_experiments/workflow_examples/galaxy/sort-and-change-case/input.bed">this input dataset</a> to an <code style="color: inherit">input.bed</code> file.</li>
<li>Upload the workflow to Galaxy (e.g., <a href="https://usegalaxy.eu/">Galaxy Europe</a>): from the upper menu, click on “Workflow” &gt; “Import” &gt; “Browse”, choose <code style="color: inherit">sort-and-change-case.ga</code> and then click “Import workflow”.</li>
<li>Rename the uploaded workflow from <code style="color: inherit">sort-and-change-case (imported from uploaded file)</code> to <code style="color: inherit">sort-and-change-case</code> by clicking the pencil icon next to the workflow name.</li>
<li>Start a new history: click on the “+” button on the History panel to the right.</li>
<li>Upload the input dataset to the new history: on the left panel, go to “Upload Data” &gt; “Choose local files” and select <code style="color: inherit">input.bed</code>, then click “Start” &gt; “Close”.</li>
<li>Wait for the file to finish uploading (i.e., for the loading circle on the dataset’s line in the history to disappear).</li>
<li>Run the workflow on the input dataset: click on “Workflow” in the upper menu, locate <code style="color: inherit">sort-and-change-case</code>, and click on the play button to the right.</li>
</ol>
<p><a href="img/workflow-entry.png" rel="noopener noreferrer"><img src="img/workflow-entry.png" alt="Workflow Entry. " width="502" height="200" loading="lazy" /></a></p>
<ol start="9">
<li>
<p>This should take you to the workflow running page. The input slot should be already filled with <code style="color: inherit">input.bed</code> since there is nothing else in the history. Click on “Run Workflow” on the upper right of the center panel.</p>
<p><a href="img/workflow-run-page.png" rel="noopener noreferrer"><img src="img/workflow-run-page.png" alt="Workflow Run Page. " width="1007" height="197" loading="lazy" /></a></p>
</li>
<li>Wait for the workflow execution to finish.</li>
<li>
<p>On the upper menu, go to “Data” &gt; “Workflow Invocations”, expand the invocation corresponding to the workflow just run and copy the invocation’s ID. In this example it says “Invocation ID: 86ecc02a9dd77649” on the right, where <code style="color: inherit">86ecc02a9dd77649</code> is the ID.</p>
<p><a href="img/workflow-invocation.png" rel="noopener noreferrer"><img src="img/workflow-invocation.png" alt="Workflow Invocation. " width="1322" height="222" loading="lazy" /></a></p>
</li>
<li>
<p>On the upper menu, go to “User” &gt; “Preferences” &gt; “Manage API Key”. If you don’t have an API key yet, click the button to create a new one. Under “Current API key”, click the button to copy the API Key on the right.</p>
<p><a href="img/api-key.png" rel="noopener noreferrer"><img src="img/api-key.png" alt="API key. " width="951" height="280" loading="lazy" /></a></p>
</li>
<li>Run <code style="color: inherit">planemo workflow_test_init --galaxy_url https://usegalaxy.eu --from_invocation INVOCATION_ID --galaxy_user_key API_KEY</code>, replacing <code style="color: inherit">INVOCATION_ID</code> with the actual invocation ID and <code style="color: inherit">API_KEY</code> with the actual API key. If you’re not using the Galaxy Europe instance, also replace <code style="color: inherit">https://usegalaxy.eu</code> with the URL of the instance you’re using.</li>
<li>Browse the files that have been created: <code style="color: inherit">sort-and-change-case-tests.yml</code> and <code style="color: inherit">test-data/</code>.</li>
</ol>
</blockquote>
<blockquote class="question" style="border: 2px solid #8A9AD0; margin: 1em 0.2em">
<div class="box-title question-title" id="question"><i class="far fa-question-circle" aria-hidden="true" ></i> Question</div>
<ol>
<li>How do the files in <code style="color: inherit">test-data/</code> relate to your Galaxy history?</li>
<li>Look at the contents of <code style="color: inherit">sort-and-change-case-tests.yml</code>. What are the expected outputs of the test?</li>
</ol>
<br/><details style="border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;"><summary>👁 View solution</summary>
<div class="box-title solution-title" id="solution"><button class="gtn-boxify-button solution" type="button" aria-controls="solution" aria-expanded="true"><i class="far fa-eye" aria-hidden="true" ></i> <span>Solution</span><span class="fold-unfold fa fa-minus-square"></span></button></div>
<ol>
<li>The files in <code style="color: inherit">test-data/</code> correspond to the datasets in the history, though some of the names are different:
<ul>
<li><code style="color: inherit">bed_input.bed</code> is the input file we uploaded (<code style="color: inherit">input.bed</code> in the history)</li>
<li><code style="color: inherit">sorted_bed.bed</code> corresponds to the <code style="color: inherit">Sort on data 1</code> step (you can confirm this by viewing the file contents)</li>
<li><code style="color: inherit">uppercase_bed.tabular</code> corresponds to the <code style="color: inherit">Change case on data 2</code> step (you can confirm this by viewing the file contents)</li>
</ul>
</li>
<li>The expected outputs are <code style="color: inherit">test-data/sorted_bed.bed</code> and <code style="color: inherit">test-data/uppercase_bed.tabular</code>. This means that when the workflow is run on the input (<code class="language-plaintext highlighter-rouge">test-data/bed_input.bed</code>), it is expected to produce two files that look exactly like those outputs.</li>
</ol>
</details>
</blockquote>
<p>To build up the test suite further, you can invoke the workflow multiple times with different inputs, and use each invocation to generate a test, using the same command as before:</p>


In [None]:
planemo workflow_test_init --galaxy_url https://usegalaxy.eu --from_invocation INVOCATION_ID --galaxy_user_key API_KEY

<p>Each invocation should test a different behavior of the workflow. This could mean using different datatypes for inputs, or changing the workflow settings to produce different results.</p>
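<p>If you have several invocations, the command can be assembled once per invocation ID. The Python sketch below uses only the Planemo options shown above. The helper names, the placeholder invocation IDs and the <code style="color: inherit">GALAXY_API_KEY</code> environment variable are assumptions made for this example. By default it only prints the commands, with the API key masked.</p>

```python
import os
import subprocess
from typing import Iterable, List


def build_command(invocation_id: str, api_key: str,
                  galaxy_url: str = "https://usegalaxy.eu") -> List[str]:
    """Assemble the planemo workflow_test_init command for one invocation."""
    return [
        "planemo", "workflow_test_init",
        "--galaxy_url", galaxy_url,
        "--from_invocation", invocation_id,
        "--galaxy_user_key", api_key,
    ]


def generate_tests(invocation_ids: Iterable[str], api_key: str,
                   dry_run: bool = True) -> None:
    """Print (dry run) or execute the command for each invocation ID."""
    for invocation_id in invocation_ids:
        command = build_command(invocation_id, api_key)
        if dry_run:
            # Mask the last argument so the key never appears on screen.
            print(" ".join(command[:-1] + ["***"]))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # GALAXY_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("GALAXY_API_KEY", "API_KEY")
    generate_tests(["INVOCATION_ID_1", "INVOCATION_ID_2"], key)
```

<p>Reading the key from the environment keeps it out of scripts you might commit to the repository. Set <code style="color: inherit">dry_run=False</code> to actually run Planemo, and inspect the generated files after each run.</p>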
<blockquote class="hands_on" style="border: 2px solid #dfe5f9; margin: 1em 0.2em">
<div class="box-title hands-on-title" id="hands-on-generate-tests-for-your-own-workflow"><i class="fas fa-pencil-alt" aria-hidden="true" ></i> Hands-on: Generate tests for your own workflow</div>
<ol>
<li>Create a new folder on your computer to store the workflow.</li>
<li>Download the Galaxy workflow you updated to follow best practices earlier in this tutorial. You can do this by going to the Workflow page and clicking <strong>Download workflow in .ga format</strong>.</li>
<li>Create a new Galaxy history, and run the workflow on some appropriate input data.</li>
<li>Use <code style="color: inherit">planemo</code> to turn that workflow invocation into a test case.</li>
</ol>
</blockquote>
<h2 id="adding-a-github-workflow-for-running-tests-automatically">Adding a GitHub workflow for running tests automatically</h2>
<p>In the previous section, you learned how to generate a test layout for an example Galaxy workflow. This procedure also gives you the file structure you need to populate the GitHub repository in line with community best practices. One thing is still missing though: a GitHub workflow to test the Galaxy workflow automatically. Let’s create this now.</p>
<p>At the top level of the repository, create a <code style="color: inherit">.github/workflows</code> directory and place a <code style="color: inherit">wftest.yml</code> file inside it with the following content:</p>
<div class="language-yaml highlighter-rouge"><div><pre style="color: inherit; background: transparent"><code style="color: inherit"><span class="na">name</span><span class="pi">:</span> <span class="s">Periodic workflow test</span>
<span class="na">on</span><span class="pi">:</span>
  <span class="na">schedule</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">cron</span><span class="pi">:</span> <span class="s1">'</span><span class="s">0</span><span class="nv"> </span><span class="s">3</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*'</span>
  <span class="na">workflow_dispatch</span><span class="pi">:</span>
<span class="na">jobs</span><span class="pi">:</span>
  <span class="na">test</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">Test workflow</span>
    <span class="na">runs-on</span><span class="pi">:</span> <span class="s">ubuntu-latest</span>
    <span class="na">steps</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v2</span>
        <span class="na">with</span><span class="pi">:</span>
          <span class="na">fetch-depth</span><span class="pi">:</span> <span class="m">1</span>
      <span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/setup-python@v1</span>
        <span class="na">with</span><span class="pi">:</span>
          <span class="na">python-version</span><span class="pi">:</span> <span class="s1">'</span><span class="s">3.7'</span>
      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">install Planemo</span>
        <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
          <span class="s">pip install --upgrade pip</span>
          <span class="s">pip install planemo</span>
      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">run planemo test</span>
        <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
          <span class="s">planemo test --biocontainers sort-and-change-case.ga</span>
</code></pre></div></div>
<p>Replace <code style="color: inherit">sort-and-change-case.ga</code> with the name of your actual Galaxy workflow. You can find extensive <a href="https://docs.github.com/en/actions/using-workflows">documentation on GitHub workflows</a> on the GitHub website. Here we’ll give some highlights:</p>
<ul>
<li>the <code style="color: inherit">on</code> field sets the GitHub workflow to run:
<ul>
<li>automatically every day at 3 AM (UTC)</li>
<li>when manually dispatched</li>
</ul>
</li>
<li>the steps do the following:
<ul>
<li>check out the GitHub repository</li>
<li>set up a Python environment</li>
<li>install Planemo</li>
<li>run <code style="color: inherit">planemo test</code> on the Galaxy workflow</li>
</ul>
</li>
</ul>
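<p>GitHub evaluates <code style="color: inherit">schedule</code> expressions in UTC, so the cron line <code style="color: inherit">0 3 * * *</code> fires at 03:00 UTC, which may be a different hour in your local time zone. The following Python sketch computes the next nominal run for such a daily schedule. The function name is an assumption made for this example, it handles only the "minute hour * * *" form rather than general cron syntax, and actual start times on GitHub can be delayed.</p>

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now: datetime, hour: int = 3, minute: int = 0) -> datetime:
    """Next firing time (UTC) of a daily 'minute hour * * *' schedule.

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has already been reached, the next run is tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


before = datetime(2024, 5, 1, 2, 30, tzinfo=timezone.utc)
after = datetime(2024, 5, 1, 3, 0, tzinfo=timezone.utc)

print(next_daily_run(before))  # 2024-05-01 03:00:00+00:00
print(next_daily_run(after))   # 2024-05-02 03:00:00+00:00

# Show the next run relative to the current time, in your local time zone.
print(next_daily_run(datetime.now(timezone.utc)).astimezone())
```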
<p>An example of a repository built according to the guidelines given here is <a href="https://github.com/simleo/ccs-bam-to-fastq-qc-crate">simleo/ccs-bam-to-fastq-qc-crate</a>, which realizes the Workflow Testing RO-Crate setup for <a href="https://workflowhub.eu/workflows/220">BAM-to-FASTQ-QC</a>.</p>
<p>Your workflow is now ready to be added to GitHub! If you’re not familiar with GitHub, follow these instructions to create a repository and upload your workflow files: <a href="https://docs.github.com/en/get-started/start-your-journey/uploading-a-project-to-github">Uploading a project to GitHub</a>.</p>


# Key Points

- repo2rocrate generates RO-Crate metadata for a workflow repository that follows community best practices.
- Planemo can generate a workflow test from an existing invocation, and GitHub Actions can run the tests automatically.

# Congratulations on successfully completing this tutorial!

Please [fill out the feedback on the GTN website](https://training.galaxyproject.org/training-material/topics/fair/tutorials/ro-crate-galaxy-best-practices/tutorial.html#feedback) and check there for further resources!
