Next: Inversion objects
Up: R. Clapp: Python and
Previous: Parallel files
The controlling process for running a parallel job comes
from the SEP.pj_base.par_job class or its children.
It is derived from the SEP.opt_base.options
class for parameter handling.
There are also numerous optional parameters that can tune
the performance on a cluster.
There are two required parameters to initialize
a parallel job. The first is a dictionary, files, whose
values are the parallel files needed for the job.
The second is a dictionary, sect_pars, linking
tasks to their parameters.
In addition, most parallel jobs supply program,
the executable that will be run on each node, and global_pars,
a list of parameters that every job needs in addition to those
described in sect_pars.
There are a number of other options, such
as device (which specifies the Ethernet device the
cluster is connected to), that can be useful for tuning
performance on a given cluster.
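As a minimal sketch of the initialization interface described above (the class name and attribute handling here are hypothetical; the real SEP.pj_base.par_job signature may differ):

```python
# Hypothetical sketch of a par_job-style constructor; not the real SEP API.
class ParJobSketch:
    def __init__(self, files, sect_pars, program=None, global_pars=None,
                 device=None):
        # files: dict whose values are the parallel files the job needs
        # sect_pars: dict linking tasks to their parameters
        self.files = files
        self.sect_pars = sect_pars
        self.program = program                # executable run on each node
        self.global_pars = global_pars or []  # parameters shared by all tasks
        self.device = device                  # e.g. Ethernet device to use

job = ParJobSketch(
    files={"input": "in.H", "output": "out.H"},
    sect_pars={"task0": {"min": 0, "max": 100}},
    program="migrate.x",
    global_pars=["verbose=1"],
)
```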
At the start of a parallel job, several communication threads
are forked. Each thread handles
communication between a set of slave processes (the jobs on
remote machines) and the master machine.
The master thread then requests a list
of all of the machines that are available.
It checks to make sure each of these machines
is functional.
It then begins a loop that runs until
each job has run to completion.
The loop begins by requesting
a list of available machine labels
from the machine object.
This list must be filtered if any of the parallel
files are of type SEP.pf_copy.parfile and
are being used as output: only a single process can
be started on a given node until the file has been
created.
It then matches available jobs to the remaining machine labels and
requests from each parallel file object a local
version of that file.
It combines the parameters in global_pars
with the task parameters in sect_pars, and
adds parameters telling each job how
to communicate with the socket it has been
assigned.
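The matching step above can be sketched as follows (the function and argument names are hypothetical, not the SEP API; the comment notes the SEP.pf_copy.parfile constraint described earlier):

```python
# Sketch of matching pending tasks to available machine labels.
# With an output copy-style parallel file, only one task may start on a
# node until that file exists there; this sketch assumes the machine list
# has already been filtered accordingly.
def match_jobs(pending_tasks, available_machines):
    """Pair each pending task with a free machine label, in order."""
    # zip stops at the shorter list, so only as many tasks start
    # as there are free machines.
    return list(zip(pending_tasks, available_machines))

pairs = match_jobs(["task0", "task1", "task2"], ["node1", "node2"])
```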
The command line for a given job is then constructed
by the command_func routine. By default,
this routine builds the command line from the
program defined at initialization;
the function can be overridden for more complex tasks.
A new thread is forked for each job, and the fact that
the job has been sent is recorded.
Each forked thread exits when its job
has completed. If the exit status of the
job is not 0, the job is listed as failed.
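A minimal sketch of the thread-per-job pattern described above, using only the standard library (the bookkeeping names are hypothetical; the real framework also handles sockets and restarts):

```python
import subprocess
import threading

# Run one job's command line and record its outcome: a nonzero exit
# status marks the job as failed, as described in the text.
def run_job(cmd, results, label):
    status = subprocess.call(cmd, shell=True)
    results[label] = "finished" if status == 0 else "failed"

results = {}
threads = [
    threading.Thread(target=run_job, args=("true", results, "task0")),
    threading.Thread(target=run_job, args=("false", results, "task1")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# results == {"task0": "finished", "task1": "failed"}
```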
Once a series of jobs has been started, the master
thread reads the series of files written to by
the SEP.par_msg.server_msg_obj objects,
and updates the status of each job.
The status messages come in several forms:
- running: The task has started successfully.
This notification is important in
the case of an output SEP.pf_copy.parfile:
the signal is sent when the output file has been successfully
created, telling the server that it is safe to start other
jobs on the node.
- finished: The task has completed successfully. When a job
finishes, its machine is marked available. If all jobs are finished,
the loop is exited.
- progress: The task has completed a certain portion of its work.
If the job is restarted, this information is included in its command-line
options.
- failed: The task failed. The machine's status is checked.
If it is no longer working, all jobs that have
completed on that node are marked as
needing to be rerun. If the node is still working,
the task is reassigned to another node. If a task fails
more than twice (this limit is configurable), the entire job exits.
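The status-update step above can be sketched as a dispatch on the message kind (the message fields and data structures here are hypothetical, not the SEP.par_msg format):

```python
# Sketch of updating job and machine state from one status message.
def update_status(jobs, machines, msg, max_failures=2):
    task, kind = msg["task"], msg["status"]
    if kind == "running":
        jobs[task]["state"] = "running"
    elif kind == "finished":
        jobs[task]["state"] = "finished"
        machines[msg["machine"]] = "available"  # node is free again
    elif kind == "progress":
        # Saved so a restarted job can resume from this point.
        jobs[task]["progress"] = msg["portion"]
    elif kind == "failed":
        jobs[task]["failures"] = jobs[task].get("failures", 0) + 1
        if jobs[task]["failures"] > max_failures:
            raise RuntimeError("task %s failed too many times" % task)
        jobs[task]["state"] = "pending"  # reassign to another node

jobs = {"task0": {"state": "running"}}
machines = {"node1": "busy"}
update_status(jobs, machines,
              {"task": "task0", "status": "finished", "machine": "node1"})
```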
The process then sleeps and restarts the loop.
Every few minutes it checks whether any nodes have failed
or whether any previously failed nodes are working again.
If the job loop exits successfully,
the sockets are shut down and the parallel files
are collected as necessary.
There are two extensions to the SEP.pj_base.par_job object.
The SEP.pj_simple.par_job class is all that is needed
for most parallel jobs. It takes the following additional
command-line arguments:
- command: The name of the program to run.
- files: The list of files the job needs.
- tags: The list of tags associated with the files described above.
- usage: The usage for each of the files.
- nblock: The number of parts to break each file into.
- axis: The axis along which each file is split.
- file_type: The file type for each distributed file (DISTRIBUTE
or COPY).
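These per-file argument lists line up by position, which can be sketched as follows (the function and field names are hypothetical, not the SEP.pj_simple implementation):

```python
# Sketch: combine the parallel per-file argument lists into one spec per tag.
def build_file_specs(tags, usage, nblock, axis, file_type):
    specs = {}
    for i, tag in enumerate(tags):
        specs[tag] = {"usage": usage[i], "nblock": nblock[i],
                      "axis": axis[i], "type": file_type[i]}
    return specs

specs = build_file_specs(tags=["in", "out"],
                         usage=["input", "output"],
                         nblock=[10, 10],
                         axis=[3, 3],
                         file_type=["DISTRIBUTE", "COPY"])
```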
The object then builds all of the appropriate parallel file objects.
The final parallel job class, SEP.pj_split.par_job, is useful
for many inversion problems. It is initialized with a dictionary, assign_map,
linking each job to a machine (more precisely, a machine label)
that specifies where the job should be run. By always running a specific
portion of the dataset on a given node, you can avoid collecting the
dataset at each step of the inversion process. This can also be useful
for tasks such as wave-equation migration velocity analysis, where a large file
(in the velocity analysis case, the wavefield) is needed for the calculations.
The downside of this approach is that if a node goes down, the job cannot
run to completion; it must terminate once it has finished the
work on all the remaining nodes.
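The fixed assignment described above can be sketched as a simple lookup (the structure is hypothetical; the real assign_map contents depend on the problem):

```python
# Sketch of a fixed job-to-machine-label assignment: each task always runs
# on the same node, so its slice of the dataset can stay resident there.
assign_map = {"task0": "node1", "task1": "node2", "task2": "node1"}

def machine_for(task):
    # No fallback: if a node goes down, its tasks cannot be rerun elsewhere,
    # which is the limitation noted in the text.
    return assign_map[task]
```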
Stanford Exploration Project
5/3/2005