For many parallel jobs, the program Parallel
is all that is needed.
Parallel is meant for parallel jobs where
each input and output file is either distributed, SEP.pf_splist.parfile,
or shared, SEP.pf_copy.parfile, and the data in each distributed file is
partitioned along a single axis.
It also requires that you run a single program on each node
(rather than some more complex operation) and
that you do not want to add the output to another file.
The required arguments to Parallel are the program name,
command; the number of blocks to break the job into, nblock;
and a series of comma-separated lists with one entry per file:
- files: The name of the parallel file(s).
- tags: The tags associated with each file.
- axis: The axis that each file is split along. If the file
  is shared, the axis is ignored for this file.
- usage: The usage for each file ("INPUT" or "OUTPUT").
- file_type: The parallel file type for each file ("DISTRIBUTE" or "COPY").
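The way the five lists line up, one entry per file, can be sketched in Python. This is only an illustration of the convention described above, not Parallel's actual parsing code: the names ParallelFile and parse_parallel_lists are hypothetical, and the validation rules are assumptions drawn from the list descriptions.

```python
from dataclasses import dataclass


@dataclass
class ParallelFile:
    """One file described by the i-th entry of each list (hypothetical type)."""
    name: str
    tag: str
    axis: int
    usage: str       # "INPUT" or "OUTPUT"
    file_type: str   # "DISTRIBUTE" or "COPY"


def parse_parallel_lists(files, tags, axis, usage, file_type):
    """Turn the five comma-separated lists into one record per file.

    Hypothetical helper: it assumes every list must have the same number
    of entries and that usage/file_type take only the values named above.
    """
    columns = [s.split(",") for s in (files, tags, axis, usage, file_type)]
    if len({len(c) for c in columns}) != 1:
        raise ValueError("all five lists need one entry per file")

    records = []
    for name, tag, ax, use, ftype in zip(*columns):
        if use not in ("INPUT", "OUTPUT"):
            raise ValueError(f"bad usage for {name}: {use}")
        if ftype not in ("DISTRIBUTE", "COPY"):
            raise ValueError(f"bad file_type for {name}: {ftype}")
        records.append(ParallelFile(name, tag, int(ax), use, ftype))
    return records


records = parse_parallel_lists(
    files="in.H,out.H",
    tags="stdin,stdout",
    axis="2,1",
    usage="INPUT,OUTPUT",
    file_type="DISTRIBUTE,DISTRIBUTE",
)
for rec in records:
    print(rec)
```

Reading the lists column by column makes it easy to see that, for a shared ("COPY") file, the axis entry must still be present to keep the lists aligned even though its value is ignored.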
All arguments that are not Parallel's own
are passed as command-line arguments to the serial code being parallelized.
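The pass-through rule can be sketched as follows. This is a minimal illustration under stated assumptions: arguments are taken to have the key=value form used elsewhere in this section, the set of keys is just the argument names listed above, and split_arguments, my_prog, and eps are hypothetical names rather than part of Parallel.

```python
# Argument names that this section lists as belonging to Parallel itself.
PARALLEL_KEYS = {"command", "nblock", "files", "tags", "axis", "usage", "file_type"}


def split_arguments(argv):
    """Separate Parallel's own arguments from those meant for the serial code.

    Hypothetical helper: anything that is not key=value with a key in
    PARALLEL_KEYS is left untouched and forwarded in its original order.
    """
    own = {}
    passthrough = []
    for arg in argv:
        key, sep, value = arg.partition("=")
        if sep and key in PARALLEL_KEYS:
            own[key] = value
        else:
            passthrough.append(arg)
    return own, passthrough


# "my_prog" and "eps" are made-up names used only for illustration.
own, rest = split_arguments(["command=my_prog", "nblock=4", "eps=0.1", "verbose"])
print(own)
print(rest)
```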
Parallel is effective for parallelizing code whose computational cost
is significantly greater than the cost of transferring
the data (migration and modeling, for example). It is also
effective for problems that benefit from
being held in memory (operations such as transposes).
For example, a multi-gigabyte 2-D file could be transposed
at marginally more than the cost of distributing and collecting the dataset
with
Parallel files="in.H,out.H" tags="stdin,stdout" axis="2,1" usage="INPUT,OUTPUT"\
file_type="DISTRIBUTE,DISTRIBUTE"
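Why the input is split along axis 2 while the output is collected along axis 1 can be seen in a small in-memory sketch. It is only a model of the data movement, not how Parallel is implemented: the helper names are hypothetical, the dataset is a nested list indexed as data[i2][i1] (axis 1 fastest), and the split into nearly equal contiguous blocks is an assumption.

```python
def block_ranges(n, nblock):
    """Split range(n) into nblock contiguous, nearly equal (start, stop) pairs."""
    base, extra = divmod(n, nblock)
    ranges, start = [], 0
    for i in range(nblock):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def transpose(block):
    """Serial transpose of one block, standing in for the work done on a node."""
    return [list(row) for row in zip(*block)]


def parallel_transpose(data, nblock):
    """Transpose data[i2][i1] block by block.

    Each input block is a slice along axis 2. After transposing, that slice
    covers a range of the output's axis 1, so the pieces are joined along
    axis 1 when they are collected.
    """
    n1 = len(data[0])
    out = [[] for _ in range(n1)]
    for start, stop in block_ranges(len(data), nblock):
        piece = transpose(data[start:stop])   # distribute + serial work
        for j2, row in enumerate(piece):      # collect along output axis 1
            out[j2].extend(row)
    return out


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]  # n2 = 4, n1 = 3
print(parallel_transpose(data, nblock=3))
```

Each node only ever holds its own block, which is why the total cost stays close to that of distributing and collecting the data.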
Stanford Exploration Project
5/3/2005