We find that much of our Ruffus pipeline code is built on the same template and this is generally a good place to start developing a new pipeline.
All you need to do is:
- Change the file name(!)
- Add any command line arguments
- Place your pipeline code where it says
# Put pipeline code here
- Run the script
The standard code uses the ubiquitous optparse module (but should be easily adaptable for argparse) and provides command line options for:
- The script provides for logging both to the command line:
myscript -v
myscript --verbose
- and to an optional log file:
myscript --log_file PIPELINE.LOG_FILE
Logging is ignored if neither --verbose nor --log_file is specified on the command line.
To write to the same log file from multiple processes, you need to pass logger_proxy rather than logger in your Ruffus decorators, as well as logging_mutex. The latter allows writing to the log files to be synchronised across different jobs:
with logging_mutex:
    logger_proxy.info("Look Ma. No hands")
Logging is set up so that you can write:
logger.info("A message")
logger.debug("A message")
logger.log(MESSAGE, "A message")
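The logging behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the template's actual code: the helper names build_parser and setup_logger, the numeric value of MESSAGE, and the mapping from the number of -v flags to log levels are all assumptions made for the example.

```python
import logging
import sys
from optparse import OptionParser

# Assumption: a custom level between DEBUG (10) and INFO (20); 15 is illustrative.
MESSAGE = 15
logging.addLevelName(MESSAGE, "MESSAGE")


def build_parser():
    """Hypothetical parser holding only the two logging options."""
    parser = OptionParser(usage="%prog [options]")
    # -v may be repeated (-vvv); each occurrence increments the count.
    parser.add_option("-v", "--verbose", dest="verbose", action="count",
                      default=0, help="Log to the command line.")
    parser.add_option("--log_file", dest="log_file", metavar="FILE",
                      help="Also write log messages to FILE.")
    return parser


def setup_logger(name, options):
    """Hypothetical helper: attach handlers according to the options."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)   # let the handlers do the filtering
    logger.propagate = False         # keep messages away from the root logger
    # With neither option given, only this handler exists and logging is ignored.
    logger.handlers = [logging.NullHandler()]
    if options.verbose:
        stderr_handler = logging.StreamHandler(sys.stderr)
        # Assumption: -v shows INFO and above, -vv or more shows everything.
        stderr_handler.setLevel(
            logging.INFO if options.verbose == 1 else logging.DEBUG)
        logger.addHandler(stderr_handler)
    if options.log_file:
        file_handler = logging.FileHandler(options.log_file)
        file_handler.setLevel(logging.DEBUG)
        file_handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
        logger.addHandler(file_handler)
    return logger


# Equivalent to running: myscript --verbose
options, args = build_parser().parse_args(["--verbose"])
logger = setup_logger("myscript", options)
logger.info("A message")
```

Leaving the logger itself at DEBUG and filtering in the handlers means the console and the log file can show different amounts of detail.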
Tracing is extremely useful for understanding what is happening with your pipeline: which tasks and which jobs are up to date, and so on.
See the manual for a full discussion.
- To trace the pipeline, call the script with one of the following options:
myscript -n
or
myscript --just_print
Increasing levels of verbosity (-v to -vvvvvvvvv) provide more detailed output.
A flowchart of the pipeline can be requested using the following option:
myscript --flowchart xxxchart.svg
The extension of the flowchart file indicates what format the flowchart should take, for example, svg, jpg etc.
Optionally specify the number of parallel strands of execution and which task is the final one:
myscript --jobs 15 --target_tasks "final_task"
myscript -j 15
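As a rough sketch of how these options might be parsed and dispatched with optparse, the example below uses only the flags shown above. The function names build_parser and choose_action, the returned dictionaries, and the order in which the options take precedence are hypothetical. In a real script the three branches are where the corresponding Ruffus calls (printing, flowchart drawing, running) would go; check their exact arguments against the manual.

```python
import os
from optparse import OptionParser


def build_parser():
    """Hypothetical parser for the tracing, flowchart and run options."""
    parser = OptionParser(usage="%prog [options]")
    parser.add_option("-n", "--just_print", dest="just_print",
                      action="store_true", default=False,
                      help="Only print what the pipeline would do.")
    parser.add_option("-v", "--verbose", dest="verbose", action="count",
                      default=0, help="Repeat for more detailed output.")
    parser.add_option("--flowchart", dest="flowchart", metavar="FILE",
                      help="Write a flowchart; the format follows the extension.")
    parser.add_option("-j", "--jobs", dest="jobs", type="int", default=1,
                      help="Number of parallel strands of execution.")
    # No default list here: optparse would share one mutable list between parses.
    parser.add_option("--target_tasks", dest="target_tasks", action="append",
                      metavar="TASK", help="Final task(s) of the pipeline.")
    return parser


def choose_action(options):
    """Return (action, settings) describing what the script should do."""
    targets = options.target_tasks or []
    if options.just_print:
        return ("print", {"target_tasks": targets, "verbose": options.verbose})
    if options.flowchart:
        # "xxxchart.svg" -> "svg"
        fmt = os.path.splitext(options.flowchart)[1].lstrip(".").lower()
        return ("flowchart", {"file": options.flowchart, "format": fmt,
                              "target_tasks": targets})
    return ("run", {"target_tasks": targets, "jobs": options.jobs})


# Equivalent to: myscript --jobs 15 --target_tasks "final_task"
options, args = build_parser().parse_args(
    ["--jobs", "15", "--target_tasks", "final_task"])
action, settings = choose_action(options)
```

Returning a plain description of the action, rather than acting directly, keeps the option handling easy to test without running a pipeline.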