Pipelined tasks are created by “decorating” a function with the following syntax:
    def func_a():
        pass

    @follows(func_a)
    def func_b():
        pass

Each task is a single function which is applied one or more times to a list of parameters (typically a list of input files, used to produce a list of output files).
Each of these is a separate, independent job (sharing the same code) which can be run in parallel.
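The idea of one task function applied to a list of parameters can be sketched in plain Python. This sketch does not use Ruffus and does not show how Ruffus schedules jobs: `count_lines`, `run_jobs`, `demo` and the file names are hypothetical, and a thread pool stands in for the parallel execution described above.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def count_lines(input_file, output_file):
    """Hypothetical task: write the number of lines in input_file to output_file."""
    with open(input_file) as src:
        n_lines = sum(1 for _ in src)
    with open(output_file, "w") as dst:
        dst.write(str(n_lines))
    return output_file


def run_jobs(task, parameters, max_workers=2):
    """Apply one task function to each (input, output) pair as an independent job."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task, i, o) for i, o in parameters]
        # Results are collected in submission order, not completion order.
        return [f.result() for f in futures]


def demo():
    """Create two small input files and run one job per file."""
    work_dir = tempfile.mkdtemp()
    parameters = []
    for name, n_lines in (("a", 2), ("b", 3)):
        input_file = os.path.join(work_dir, name + ".txt")
        with open(input_file, "w") as handle:
            handle.write("line\n" * n_lines)
        parameters.append((input_file, os.path.join(work_dir, name + ".count")))
    outputs = run_jobs(count_lines, parameters)
    results = []
    for path in outputs:
        with open(path) as handle:
            results.append(handle.read())
    return results
```

Each `(input, output)` pair is one job; the jobs share the code of `count_lines` but do not depend on each other, which is what allows them to run side by side.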
To run the pipeline:
    pipeline_run(target_tasks,
                 forcedtorun_tasks = [],
                 multiprocess = 1,
                 logger = stderr_logger,
                 gnu_make_maximal_rebuild_mode = True,
                 cleanup_log = "../cleanup.log")
    pipeline_cleanup(cleanup_log = "../cleanup.log")
Basic Task decorators are:
and
Task decorators include:
More advanced users may require:
Run pipelines.
Prints out the parts of the pipeline which will be run.
Because the parameters of some jobs depend on the results of previous tasks, this function produces only the current snapshot of task jobs. In particular, tasks which generate a variable number of inputs for the tasks that follow them will not produce the full range of jobs.
verbose = 0 : nothing
verbose = 1 : print task name
verbose = 2 : print task description if exists
verbose = 3 : print job names for jobs to be run
verbose = 4 : print list of up-to-date tasks and job names for jobs to be run
verbose = 5 : print job names for all jobs whether up-to-date or not
Prints out pipeline dependencies in various formats.
Usage:

    for i, o in param_func():
        print(" input file name = ", i)
        print("output file name = ", o)
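For context, a parameter-generating function of the kind iterated over in the usage loop might look like the sketch below. The body of `param_func`, the sample names and the helper `describe_jobs` are assumptions for illustration; only the convention of yielding one input/output pair per job is taken from the usage shown.

```python
def param_func():
    """Hypothetical generator: yield one (input, output) pair per job."""
    for sample in ("a", "b", "c"):
        yield sample + ".input", sample + ".output"


def describe_jobs(generator):
    """Collect the lines the usage loop would print, two per job."""
    lines = []
    for i, o in generator():
        lines.append(" input file name = " + i)
        lines.append("output file name = " + o)
    return lines


if __name__ == "__main__":
    print("\n".join(describe_jobs(param_func)))
```

Writing the parameters as a generator means the pairs are produced only when the loop asks for them, so they can depend on files that exist at that moment.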
Note:
1. Each job requires input/output file names.
2. Input/output file names can be a string or an arbitrarily nested sequence.
3. Non-string types are ignored.
4. Either the input or the output file name must contain at least one string.
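The rules in this note can be sketched as a small helper that pulls the strings out of a nested parameter. This is an illustration only, not the library's own code: both function names are hypothetical, and treating only lists and tuples as sequences is an assumption.

```python
def strings_in_nested_sequence(params):
    """Return every string found in params, descending into nested sequences.

    Non-string leaves (numbers, None, ...) are ignored.
    """
    if isinstance(params, str):
        return [params]
    # Assumption: only lists and tuples count as sequences here.
    if isinstance(params, (list, tuple)):
        found = []
        for item in params:
            found.extend(strings_in_nested_sequence(item))
        return found
    return []


def has_file_names(input_params, output_params):
    """Last rule of the note: input or output must contain at least one string."""
    return bool(strings_in_nested_sequence(input_params)
                or strings_in_nested_sequence(output_params))
```

For example, `strings_in_nested_sequence(["a.txt", 3, ("b.txt", [None, "c.txt"])])` keeps the three file names and drops `3` and `None`.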
Given input and output files, check whether all of them exist and whether the output files are later than the input files. Each can be:
1. a string: assumed to be a filename, e.g. "file1"
2. any other type
3. an arbitrarily nested sequence of (1) and (2)
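A modification-time check of this kind can be sketched with the standard library. The names `file_names`, `needs_update` and `demo` are hypothetical, and two details are assumptions rather than documented behaviour: an output with exactly the same timestamp as an input is treated as out of date, and a job with no input or no output file names is treated as up to date.

```python
import os
import tempfile


def file_names(params):
    """Flatten params into a list of strings; other types are skipped."""
    if isinstance(params, str):
        return [params]
    if isinstance(params, (list, tuple)):
        return [name for item in params for name in file_names(item)]
    return []


def needs_update(input_params, output_params):
    """Return True if the job should be rerun.

    A job is out of date when any file is missing, or when the oldest
    output is not later than the newest input.
    """
    inputs = file_names(input_params)
    outputs = file_names(output_params)
    if not all(os.path.exists(name) for name in inputs + outputs):
        return True
    if not inputs or not outputs:
        return False  # assumption: nothing to compare means up to date
    newest_input = max(os.path.getmtime(name) for name in inputs)
    oldest_output = min(os.path.getmtime(name) for name in outputs)
    return oldest_output <= newest_input  # assumption: ties count as stale


def demo():
    """Exercise needs_update with explicit modification times."""
    work_dir = tempfile.mkdtemp()
    src = os.path.join(work_dir, "in.txt")
    dst = os.path.join(work_dir, "out.txt")
    open(src, "w").close()
    missing = needs_update(src, dst)        # output does not exist yet
    open(dst, "w").close()
    os.utime(src, (1000, 1000))
    os.utime(dst, (2000, 2000))
    fresh = needs_update([src, 7], (dst,))  # output later than input
    os.utime(src, (3000, 3000))
    stale = needs_update(src, dst)          # input modified afterwards
    return missing, fresh, stale
```

Comparing the oldest output against the newest input means that a single stale output, or a single freshly edited input, is enough to mark the whole job for rerunning.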