A parallel job is a Python class that inherits from three other classes. The first is SEP.status.sep_status. At the most basic level, this class simply reads and writes an ASCII text file, the status file, which tracks the progress of the job. Each line of the status file is a `:'-separated list. The first item is the text descriptor, the jobid, for each task. The second item is the status of the job (todo, sent, running, finished, or collected). The third is the machine, if any, that the job is running on or ran on. The final two are a progress indicator for the job (to enable restarting) and the number of times the job has failed. In the course of a parallel job, several different processes need to read and/or write the status file. To avoid clobbering the file contents, each process can obtain an exclusive lock on the status file.
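The status-file handling described above can be illustrated with a minimal sketch. It is not the SEP.status.sep_status implementation: the function names, the field names in FIELDS, the five-field layout, and the use of fcntl.flock (Unix only) for the exclusive lock are all assumptions made for illustration.

```python
import fcntl
from contextlib import contextmanager

# Hypothetical field names; the source only describes what each column holds.
FIELDS = ("jobid", "status", "machine", "progress", "nfail")


@contextmanager
def locked(path, mode):
    """Open path and hold an exclusive advisory lock until the block exits."""
    with open(path, mode) as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until no other process holds the lock
        try:
            yield fh
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


def read_status(path):
    """Return {jobid: record} parsed from the colon-separated status file."""
    with locked(path, "r") as fh:
        rows = [line.rstrip("\n").split(":") for line in fh if line.strip()]
    return {row[0]: dict(zip(FIELDS, row)) for row in rows}


def update_status(path, jobid, **changes):
    """Rewrite the line for jobid with the given field changes, under the lock."""
    with locked(path, "r+") as fh:
        rows = [line.rstrip("\n").split(":") for line in fh if line.strip()]
        for row in rows:
            if row[0] == jobid:
                for key, value in changes.items():
                    row[FIELDS.index(key)] = str(value)
        # Rewrite the whole file while the lock is still held.
        fh.seek(0)
        fh.truncate()
        fh.writelines(":".join(row) + "\n" for row in rows)
```

Under these assumptions, a master process might call update_status(path, "job1", status="sent", machine="node1") when it dispatches a task. Because the read and the rewrite happen under one lock, a concurrent writer cannot interleave its own update between them.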
The parallel job class also inherits from a class that handles socket communication, sep_socket.sep_server. This class knows how to find a free socket number and how to run a socket server that takes actions based on simple string messages. Finally, it inherits from a class, SEP.par_log.jobs, that stores the stdout and stderr of the various jobs.
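The two socket capabilities mentioned above, finding a free socket number and acting on simple string messages, can be sketched as follows. This is not the sep_socket.sep_server code: the function names, the "command:argument" message format, and the "stop" message are hypothetical and chosen only for illustration.

```python
import socket
import threading  # used by the caller to run serve() in the background


def find_free_port():
    """Ask the OS for an unused port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        # Another process could claim the port after we close; a real
        # server would need to handle that race.
        return s.getsockname()[1]


def serve(port, handlers, ready):
    """Accept one short string message per connection and dispatch on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen()
        ready.set()  # signal that clients may now connect
        while True:
            conn, _ = srv.accept()
            with conn:
                msg = conn.recv(1024).decode().strip()
                if msg == "stop":  # hypothetical shutdown message
                    conn.sendall(b"bye")
                    return
                # Assumed message format: "command:argument"
                command, _, arg = msg.partition(":")
                action = handlers.get(command, lambda a: "unknown")
                conn.sendall(action(arg).encode())


def send(port, msg):
    """Send one message to the server and return its short reply."""
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(msg.encode())
        return c.recv(1024).decode()
```

In this sketch the server would run in a background thread, for example threading.Thread(target=serve, args=(port, handlers, ready), daemon=True), while worker processes report in with calls such as send(port, "finished:job1"). The handlers dictionary maps each command string to a function that takes the argument and returns a reply string.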