Simple parallel jobs

For many parallel jobs, the program Parallel is all that is needed. Parallel is meant for jobs whose input and output files are either distributed (SEP.pf_splist.parfile) or shared (SEP.pf_copy.parfile), and where the data is partitioned along a single axis for each file. It also requires that you run a single program on each node (rather than some more complex operation) and that you do not want to add the output to another file. The required arguments to Parallel are the program name (command), the number of blocks to break the job into (nblock), and a series of comma-separated lists:

files: The name of the parallel file(s).
tags: The tags associated with each file.
axis: The axis that each file is split along. If the file is shared, the axis is ignored for this file.
usage: The usage for each file ("INPUT" or "OUTPUT").
file_type: The parallel file type for each file ("DISTRIBUTE" or "COPY").
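Judging by the example later in this section, the lists are positional: the first entry of each list describes the first file, the second entry the second file, and so on. The Python sketch below illustrates that pairing only. It is not how Parallel itself parses its arguments, and the function name per_file_settings is hypothetical.

```python
def per_file_settings(files, tags, axis, usage, file_type):
    """Pair up comma-separated lists into one record per file.

    Illustration only: per_file_settings is a hypothetical helper,
    not part of Parallel or SEPlib.
    """
    columns = {
        "files": files,
        "tags": tags,
        "axis": axis,
        "usage": usage,
        "file_type": file_type,
    }
    # Split every list on commas, keeping entries in order.
    split = {name: value.split(",") for name, value in columns.items()}

    # Every list must describe the same number of files.
    lengths = {len(entries) for entries in split.values()}
    if len(lengths) != 1:
        raise ValueError("all lists must have the same number of entries")

    count = len(split["files"])
    return [{name: split[name][i] for name in split} for i in range(count)]


if __name__ == "__main__":
    records = per_file_settings(
        files="in.H,out.H",
        tags="stdin,stdout",
        axis="2,1",
        usage="INPUT,OUTPUT",
        file_type="DISTRIBUTE,DISTRIBUTE",
    )
    for record in records:
        print(record)
```

Read this way, a list that is shorter or longer than the others is an error, because some file would be left without a tag, axis, usage, or file type.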

All arguments that are not Parallel's own are passed as command-line arguments to the serial program being parallelized.

Parallel is effective for parallelizing code where the computational cost is significantly greater than the cost of transferring the data (migration and modeling, for example). It is also effective for problems that benefit from being held in memory (operations such as transposes). For example, a multi-gigabyte 2-D file could be transposed at marginally more than the cost of distributing and collecting the dataset through:

Parallel files="in.H,out.H" tags="stdin,stdout" axis="2,1" usage="INPUT,OUTPUT"\
  file_type="DISTRIBUTE,DISTRIBUTE"
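The axis="2,1" setting makes sense once the transpose is viewed block by block: a piece of the input cut along axis 2 becomes, after transposition, a piece of the output cut along axis 1. The Python sketch below checks this on a small in-memory array. It rests on several assumptions: data[i2][i1] stores axis 1 as the inner (fast) index, blocks are contiguous and as equal in size as possible, and the helper names split_ranges, transpose and blocked_transpose are hypothetical. How Parallel actually sizes its blocks is not stated here.

```python
def split_ranges(n, nblock):
    """Divide range(n) into nblock contiguous (start, stop) pairs.

    Assumption: near-equal contiguous blocks; Parallel may divide differently.
    """
    base, extra = divmod(n, nblock)
    ranges = []
    start = 0
    for i in range(nblock):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def transpose(block):
    """Swap the two axes of a non-empty list of equal-length rows."""
    return [list(column) for column in zip(*block)]


def blocked_transpose(data, nblock):
    """Transpose data[i2][i1] piece by piece.

    The input is split along axis 2 (the outer index). Each transposed
    piece is a slab of the output along its axis 1 (the inner index),
    so collecting the pieces means joining them row by row.
    """
    pieces = [
        transpose(data[lo:hi])
        for lo, hi in split_ranges(len(data), nblock)
        if hi > lo  # skip empty blocks when nblock exceeds the axis length
    ]
    n1 = len(data[0])
    return [[value for piece in pieces for value in piece[i1]]
            for i1 in range(n1)]


if __name__ == "__main__":
    data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
    # The blocked result should match a transpose of the whole array.
    print(blocked_transpose(data, 3) == transpose(data))
```

Each piece is transposed independently and entirely in memory, which is why the cost is dominated by distributing the input and collecting the output.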

Stanford Exploration Project
5/3/2005