Computational pipelines often become quite simple if we break the process down into simple stages.
Note
Ruffus refers to each stage of your pipeline as a task.
Let us start with the usual “Hello World”. We have two Python functions, first_task and second_task, which print “Hello” and “World” respectively, and which we would like to turn into an automatic pipeline.
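As a minimal sketch, the two functions might look like the following. The function names come from the rest of this section; the exact strings printed are an assumption based on the “Hello” and “World” output mentioned further down, and Python 3 `print()` is assumed.

```python
def first_task():
    # First stage: prints the first half of the greeting.
    print("Hello")


def second_task():
    # Second stage: prints the second half of the greeting.
    print("World")


# Without any pipeline machinery, we have to call them in the right order ourselves.
first_task()
second_task()
```

At this point nothing ties the two functions together; the calling order is entirely up to us.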
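The simplest Ruffus pipeline for these functions imports ruffus, decorates second_task with @follows(first_task), and then calls pipeline_run on the last task.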
The functions which do the actual work of each stage of the pipeline remain unchanged. The role of Ruffus is to make sure these functions are called in the right order, with the right parameters, running in parallel using multiprocessing if desired.
There are three simple parts to building a Ruffus pipeline:
- Importing ruffus
- “Decorating” the functions which are part of the pipeline
- Running the pipeline!
You need to tag, or decorate, existing functions to tell Ruffus that they are part of the pipeline.
Note
Decorators are a way to tag or mark out functions.
They start with an @ prefix and can take a number of parameters in parentheses.
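For readers new to decorators, here is a small plain-Python illustration of the idea. The `tag` decorator and its `labels` attribute are hypothetical names invented for this sketch; they are not part of Ruffus, and this is not how Ruffus itself stores its information.

```python
def tag(*labels):
    """Hypothetical decorator factory: records labels on the decorated function."""
    def decorator(func):
        # Mark the function by attaching an attribute; its behaviour is unchanged.
        func.labels = labels
        return func
    return decorator


@tag("pipeline", "stage-1")
def first_task():
    print("Hello")


# The function still works as before, but now carries the extra information.
first_task()
print(first_task.labels)
```

The point is only that a decorator can attach information to a function without changing what the function does, which is what lets a library discover the functions later.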
The Ruffus decorator @follows makes sure that second_task runs after first_task.
We run the pipeline by specifying its last stage (task function). Ruffus works out which other functions this stage depends on and follows the chain of dependencies automatically, making sure that the entire pipeline is up to date.
Because second_task depends on first_task, both functions are executed in order.
>>> pipeline_run([second_task], verbose = 1)
By default, Ruffus prints out verbose progress through your pipeline, interleaved with our Hello and World.
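To make the idea of following a dependency chain concrete, here is a toy sketch in plain Python. It is not Ruffus and does not reflect Ruffus's implementation: `toy_follows` and `toy_pipeline_run` are hypothetical stand-ins written for this illustration, with no up-to-date checking, no parallelism, and no cycle detection.

```python
def toy_follows(*prerequisites):
    """Toy stand-in for a dependency decorator (hypothetical, NOT Ruffus)."""
    def decorator(func):
        # Remember which functions must run before this one.
        func.prerequisites = prerequisites
        return func
    return decorator


def toy_pipeline_run(targets, done=None):
    """Run each target after its prerequisites, each function at most once."""
    done = [] if done is None else done
    for task in targets:
        if task in done:
            continue
        # Recurse into the prerequisites first, then run the task itself.
        toy_pipeline_run(getattr(task, "prerequisites", ()), done)
        task()
        done.append(task)
    # Return the names in the order the tasks were executed.
    return [t.__name__ for t in done]


def first_task():
    print("Hello")


@toy_follows(first_task)
def second_task():
    print("World")


# Asking only for the last stage is enough: first_task is pulled in automatically.
order = toy_pipeline_run([second_task])
print(order)
```

Only the final task is named in the call, yet both functions run, with the prerequisite first. This mirrors the behaviour described above for specifying just the last stage.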