Introduction to the ToolFactory tutorial.
Contributors
Objectives
This session introduces the ToolFactory and explains how it might be useful for programmers new to Galaxy.
What is the ToolFactory?
- The ToolFactory is a Galaxy tool.
- It generates new tools from working command line scripts.
- Generated tools work exactly the same way as manually prepared Galaxy tools.
- They can be installed from a toolshed and used in workflows.
- The script is “wrapped” so it runs whenever the tool is executed.
Speaker Notes
The ToolFactory is a Galaxy tool for developers and scientists who routinely write their own analysis code. It generates fully functional, toolshed-ready tools. Supplied with a working command line script, it can generate a new Galaxy tool that “wraps” that script. Scripts useful to other scientists can be widely shared through any toolshed.
BYO programming skills
- If you write useful scripts and want to make them into new tools, the ToolFactory can help.
- Programming and scripting skills are needed to make new analysis scripts.
- Useful new analysis scripts are required inputs for the ToolFactory.
- Scripting skills are needed for this training material to be useful.
Speaker Notes
Programming skills are needed to make any new Galaxy tool. The ToolFactory automatically generates wrapper code, but does not write the script. If you routinely create generalisable, working analysis code, the ToolFactory can help share your work as real Galaxy tools.
The ToolFactory needs a working command line script to be useful
- Bash, Rscript, Python, …, Lisp: any interpreter available in Conda.
- Parameter settings, data input file paths and output file paths are passed to the script on the command line.
- Argparse (named) or positional parameter passing can be used.
- For example:
python mangiare.py --food zitti.pasta --cooked "al dente" --sauce "tomato+basil"
- mangiare.py is executed as a Python script.
- Parameters are passed in argparse format.
- Code must deal correctly with the parameters:
  - food set to “zitti.pasta”
  - cooked set to “al dente”
  - and so on.
- It works if you can run a command like the one above in a Linux shell and get useful, correct outputs.
Speaker Notes
The ToolFactory can help turn working scripts into new Galaxy tools. Any scripting language available in Conda will work. Positional or “Argparse” style command line parameter passing can be used. The generated tool will not work if the script was broken to start with. Without a working script, it is about as useful as a chocolate teapot.
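As a minimal sketch of what a script like mangiare.py might look like, the following Python code accepts the named parameters from the example command. The original slides do not show the script itself, so the function names, help texts and default values here are assumptions made purely for illustration.

```python
import argparse


def build_parser():
    """Build the command line parser for the hypothetical mangiare.py script."""
    parser = argparse.ArgumentParser(
        description="Toy example of argparse (named) parameter passing."
    )
    # Parameter names follow the example command; the defaults are assumptions.
    parser.add_argument("--food", required=True, help="path to the input file")
    parser.add_argument("--cooked", default="al dente", help="cooking setting")
    parser.add_argument("--sauce", default="tomato+basil", help="sauce setting")
    return parser


def main(argv=None):
    """Parse the parameters and report them; argv=None reads sys.argv."""
    args = build_parser().parse_args(argv)
    # A real analysis script would read args.food and write its outputs here.
    print(f"food={args.food} cooked={args.cooked} sauce={args.sauce}")
    return args


# Demonstration with the parameters from the example command line.
# In a real script this call would be main(), so the shell supplies the values.
main(["--food", "zitti.pasta", "--cooked", "al dente", "--sauce", "tomato+basil"])
```

A script written this way can be checked in a Linux shell before it is pasted into the ToolFactory form, which is the test of "working" described above.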
What does the developer need to do?
- Debug the script so it works correctly on the command line.
- Upload the test data samples into a new Galaxy history.
- Start the ToolFactory tool.
- Paste the script.
- Describe the inputs, outputs and parameters.
- A completed ToolFactory form specifies a new tool.
- Each input is defined with a small sample selected from the history.
- This sample is used in the generated tool test.
Speaker Notes
The developer prepares a working script and input samples. The sample input files are uploaded to a Galaxy history. The completed ToolFactory form collects all the information needed to generate a new tool. The samples become the inputs for the built-in test.
What happens when the ToolFactory is executed?
- The information from the form drives a code generator.
- The galaxyxml library generates the wrapper XML.
- A new XML wrapper is created in the history.
- The new tool is installed in the local Galaxy.
Speaker Notes
An XML tool wrapper is generated in the history. The new tool is installed in the local server. It is ready to run locally. It looks and acts just like any manually prepared Galaxy tool.
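To give a rough idea of what "form information drives a code generator" means, here is a small Python sketch that turns a dictionary of form-like settings into a Galaxy-style tool XML document. It deliberately uses the standard library xml.etree.ElementTree rather than galaxyxml, and it is not the ToolFactory's actual code or output. The dictionary keys, the tool id and the command string are all hypothetical.

```python
import xml.etree.ElementTree as ET


def make_wrapper(spec):
    """Turn a form-like dictionary into a minimal Galaxy-style tool XML string."""
    tool = ET.Element("tool", id=spec["id"], name=spec["name"], version=spec["version"])
    ET.SubElement(tool, "description").text = spec["description"]
    # The command section is the part that runs the wrapped script.
    ET.SubElement(tool, "command").text = spec["command"]
    inputs = ET.SubElement(tool, "inputs")
    for param in spec["inputs"]:
        # Each input dictionary becomes the attributes of one <param> element.
        ET.SubElement(inputs, "param", param)
    outputs = ET.SubElement(tool, "outputs")
    for data in spec["outputs"]:
        ET.SubElement(outputs, "data", data)
    return ET.tostring(tool, encoding="unicode")


# Hypothetical form contents for the mangiare.py example.
form = {
    "id": "mangiare",
    "name": "mangiare",
    "version": "0.1",
    "description": "Toy wrapper sketch",
    "command": "python mangiare.py --food '$food' --cooked '$cooked' > '$report'",
    "inputs": [
        {"name": "food", "type": "data", "format": "txt", "label": "Food file"},
        {"name": "cooked", "type": "text", "value": "al dente", "label": "Cooked"},
    ],
    "outputs": [{"name": "report", "format": "txt"}],
}

print(make_wrapper(form))
```

A real generated wrapper also carries requirements, tests and help text, which this sketch leaves out.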
What happens when a new tool is updated with the planemo_test tool?
- Planemo generates the tool test outputs and then tests the finalised tool.
- Test reports and the updated toolshed archive are written to the history.
- If the tool is useful to others, it can be shared through any toolshed.
Speaker Notes
The Planemo test and lint reports, a copy of the generated XML wrapper and a log file are returned. The tested toolshed archive is ready to share if the tests and linting passed.
Easy and quick to learn, but limited compared to manual tool wrapping.
- Automated code generators are limited in scope in comparison to a skilled programmer.
- Galaxy developers maintain a separate, comprehensive manual tool development infrastructure.
- The ToolFactory is limited to simple scripts but it takes far less time for a developer to become productive.
- Scripts involving conditional parameter complexities must deal with them internally, otherwise a manually prepared tool wrapper must be written.
- The job that generates a tool can be re-run like any other persistent Galaxy job.
- This allows the tool form to be adjusted and the generated tool updated at any time if changes are needed.
Speaker Notes
Although a code generator is easy to learn to use, it is limited to relatively simple scripts. Many Conda packages involve complexities that no code generator can handle, so a hand-written wrapper is required for them. Scripts can sometimes be adapted to work around the many limitations of the code generator.
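One way a script can deal with conditional parameters internally is to validate parameter combinations itself after parsing. The Python sketch below assumes a hypothetical rule that a `--cheese` setting only makes sense when a `--sauce` is chosen. Both the rule and the `--cheese` parameter are invented for illustration and do not come from the ToolFactory.

```python
import argparse


def parse_args(argv=None):
    """Parse parameters and enforce a conditional rule inside the script."""
    parser = argparse.ArgumentParser(description="Conditional parameter sketch.")
    parser.add_argument("--food", required=True)
    parser.add_argument("--sauce", default="none")
    parser.add_argument("--cheese", default="none")  # hypothetical parameter
    args = parser.parse_args(argv)
    # The wrapper always passes every parameter, so the dependency between
    # them is checked here rather than in the tool form.
    if args.sauce == "none" and args.cheese != "none":
        parser.error("--cheese requires a --sauce other than 'none'")
    return args


# A valid combination parses normally.
ok = parse_args(["--food", "zitti.pasta", "--sauce", "tomato+basil", "--cheese", "parmesan"])
print(ok.sauce, ok.cheese)
```

With this approach the generated tool form stays flat, and an invalid combination makes the script exit with an error message instead of producing misleading output.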
The big picture
Speaker Notes
Galaxy can serve as a persistent integrated tool development environment. The developer supplies all the details for the new tool on the form, including a known good script with test data. Clicking “Run Tool” on the form runs a Galaxy job. It generates the new tool wrapper, writing it to the history and installing it in the local Galaxy. ToolFactory jobs are like any other jobs - clicking the redo button will recreate the form used. The developer can return to adjust the form and generate an updated tool at any time. In this way, tools can easily be maintained as long as the job that generates them is saved.
If you got this far
- There is a tutorial to follow.
- It introduces the ToolFactory in more detail, shows how to run your own and how to explore the samples to learn how to use it.
Speaker Notes
If you would like to learn more about the ToolFactory, there is a far more detailed introduction and hands-on tutorial available. We hope you will enjoy learning about and using the ToolFactory in your work.
Key Points
- The ToolFactory is a specialised Galaxy tool for users who routinely write their own analysis code.
- It turns useful, working command line scripts into shareable, toolshed-ready tools.
- A code generator is easy to learn, but only simple requirements can be fully automated.