The *batch task executor* Python script: Easy construction of parallel geophysical operators on CEES clusters
Many of the geophysical problems we encounter are (as Jon pointed out) 'embarrassingly parallel', so we can run them as batch tasks using multiple nodes simultaneously on the CEES clusters. For each batch task (like wave-equation migration or tomography), we can divide the task into subjobs by shots and assign a different subset of shots to every available node.
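The shot-division idea above can be sketched in a few lines of Python. The function name and the round-robin strategy here are my own illustrative assumptions, not the framework's actual implementation:

```python
# Hypothetical sketch: divide a batch task's shots into per-node subsets.
# Round-robin assignment keeps the subsets balanced in size.

def split_shots(shot_ids, n_nodes):
    """Assign each shot to one of n_nodes subsets (round-robin)."""
    subsets = [[] for _ in range(n_nodes)]
    for i, shot in enumerate(shot_ids):
        subsets[i % n_nodes].append(shot)
    return subsets

# Example: 10 shots spread over 3 nodes.
print(split_shots(list(range(10)), 3))
# -> [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Each subset would then become the input of one subjob running on its own node.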
However, running batch tasks on the CEES clusters requires significant overhead from users (in terms of both programming and human interaction). In addition to implementing the task-division logic, the user also has to deal with:
- 1) PBS script generation. To ensure that each subjob's execution can be monitored, each subjob should have its own PBS script, with its own jobname identification and its own dedicated log file. Implementing this for every task he/she runs is a burden on the user.
- 2) PBS job submission. A greedy (but not exhaustive) job submission strategy is desired to maximize the user's utilization of both the SEP and default queues without crashing the PBS scheduler.
- 3) Automatic fault recovery. With hundreds of jobs running in parallel, some of them will fail. If the failed jobs are not restarted automatically, human intervention is required to continue the computation flow. Having no automatic fault recovery greatly reduces cluster utilization, since there are at least 8 hours a day when no one is awake to attend to the computers.
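The fault-recovery idea in item 3) amounts to a resubmission loop: collect the failed subjobs and resubmit only those, until everything succeeds or a retry limit is hit. The sketch below is an illustrative assumption, not the framework's code; `run_subjob` stands in for a real PBS submission and randomly "fails" to mimic a real cluster:

```python
# Hypothetical sketch of automatic fault recovery via resubmission.
import random

def run_subjob(job_id):
    """Stand-in for submitting and waiting on one PBS subjob.
    Returns True on success; fails randomly to mimic real clusters."""
    return random.random() > 0.3

def run_with_recovery(job_ids, max_retries=5):
    """Run all subjobs, resubmitting only the failures each round."""
    pending = list(job_ids)
    for attempt in range(max_retries):
        failed = [j for j in pending if not run_subjob(j)]
        if not failed:
            return True       # all subjobs finished
        pending = failed      # resubmit only the failures
    return False              # gave up after max_retries rounds

print(run_with_recovery(range(8)))
```

In the real framework a "failure" would be detected from the PBS job status and log files rather than a return value.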
To relieve this burden for future cluster users, I (Yang) have written a batch task executor framework that handles the (non-trivial) chores mentioned above. The user is only responsible for describing the batch task he/she wants to run, including how the task is divided into subjobs and what executables to run for each subjob. My framework handles all the PBS details and performs automatic fault recovery. It also provides combining functionality to merge the outputs from all subjobs into one single final output, which comes in handy in tasks like migration and tomography (combining all the partial images/velocity perturbations into one final image). My code can handle the case where inputs are of different dimensions (e.g., partial images migrated from a towed-streamer data set).
The framework is written in Python, so basic Python programming is needed to leverage my work. (Trust me, Python is a no-brainer choice compared to its alternatives like CShell/Bash/Perl, let alone Fortran+MPI.) To hide the complex internal implementation from users and provide them with a simple programming interface, the framework employs some popular paradigms from object-oriented software design. Specifically, it uses polymorphism: users plug their batch task description into the framework by implementing a class derived from a given base class (BatchTaskComposer) with carefully designed function interfaces.
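The polymorphism pattern looks roughly like the sketch below. The base-class name BatchTaskComposer comes from the text, but the method names shown here (`n_subjobs`, `get_subjob_commands`) and the `Migrate.x` executable are my own assumptions for illustration; consult batch_task_executor.py for the real interface:

```python
# Minimal sketch of the derived-class pattern (assumed method names).

class BatchTaskComposer:
    """Stand-in base class: the framework calls these hooks per subjob."""
    def n_subjobs(self):
        raise NotImplementedError
    def get_subjob_commands(self, i):
        raise NotImplementedError

class MigrationComposer(BatchTaskComposer):
    """User-defined task: one subjob per subset of shots."""
    def __init__(self, shot_subsets):
        self.shot_subsets = shot_subsets
    def n_subjobs(self):
        return len(self.shot_subsets)
    def get_subjob_commands(self, i):
        # Build the command line(s) the framework will put in subjob i's PBS script.
        shots = ",".join(str(s) for s in self.shot_subsets[i])
        return ["Migrate.x shots=%s out=partial%d.H" % (shots, i)]

composer = MigrationComposer([[0, 1], [2, 3]])
print(composer.n_subjobs())             # -> 2
print(composer.get_subjob_commands(0))  # -> ['Migrate.x shots=0,1 out=partial0.H']
```

The framework only ever talks to the base-class interface, so it can generate PBS scripts, submit, and retry subjobs without knowing anything about the user's specific task.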
To help you get started, a sample code example is provided for your reference. As you can see from the example, by writing just 50 lines of Python code to define his/her own BatchTaskComposer, the user gets all the features mentioned above from the framework effortlessly.
Make sure the test sample code runs successfully before you start building your own operator. To run the sample, first log into cees-cluster or cees-rcf (for testing purposes, I suggest you use cees-cluster, as the rcf is usually much more crowded):
- First, add /home/zyang03/script to your PYTHONPATH env variable.
- Then, copy the sample folder to your own data path, e.g. /data/sep/jdoe:
cp -rL /data/sep/zyang03/for/sep/batch_task /data/sep/jdoe/batch_task_sample
- Last, 'cd' into /data/sep/jdoe/batch_task_sample,
make clear
make test
At the end, you should see a message saying 'The test is successful'; otherwise, follow the error message to address the issue. Feel free to talk to me (Yang) for help if needed.
Explanation on the test sample code
The code that is supposed to be user-created is in batch_task_executor_test.py, which consists of two parts: 1) a class derived from the BatchTaskComposer class that provides the batch task specification used in this example; 2) a piece of driver code (like main() in C) that launches the user-defined batch task and collects all its partial outputs into one final output file.
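The two-part structure can be sketched as follows. Everything here other than the BatchTaskComposer concept is an assumption on my part (the combiner, the file names, and the driver shape); the real interface is in batch_task_executor.py:

```python
# Hypothetical sketch of the driver structure: launch the task, then combine.

class DummyComposer:
    """Stands in for the user's BatchTaskComposer subclass."""
    def subjob_outputs(self):
        # In the real framework these paths would be produced by the subjobs.
        return ["partial0.H", "partial1.H"]

def combine(partials, final):
    """Stands in for the framework's combining functionality: merge all
    partial outputs into one final output file."""
    with open(final, "w") as out:
        for p in partials:
            out.write("merged %s\n" % p)

# Driver: build the composer, (run the batch task), then merge its outputs.
composer = DummyComposer()
combine(composer.subjob_outputs(), "final.H")
print(open("final.H").read())
```

In the real sample, the "run the batch task" step is where the framework generates the PBS scripts, submits them, and performs fault recovery before the combine step runs.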
The batch task description in the sample class DummyBatchTaskComposer is very simple. It assumes the task consists of m subjobs, and each subjob can generate up to 2*n-1 outputs, where m and n are parameters specified by the user. The actual work logic for each subjob is dummy: it simply makes copies of an original data file (test2.H) as the outputs. The sample code also simulates subjob failures by randomly generating corrupted PBS scripts.
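The dummy subjob logic described above amounts to something like the following. The file naming and the copy mechanism are my own illustrative assumptions (a temporary file stands in for test2.H so the sketch is self-contained), not the sample's exact code:

```python
# Illustrative sketch of the dummy subjob: copy one input file to a random
# number (between 1 and 2*n-1) of outputs.
import random
import shutil
import tempfile

def dummy_subjob(i, n, src):
    """Subjob i copies src to up to 2*n-1 output files and returns their paths."""
    n_out = random.randint(1, 2 * n - 1)
    outs = []
    for k in range(n_out):
        dst = "%s.job%d.copy%d" % (src, i, k)
        shutil.copyfile(src, dst)
        outs.append(dst)
    return outs

# Stand-in for test2.H: a small temporary file.
src = tempfile.NamedTemporaryFile(delete=False, suffix=".H").name
open(src, "w").write("dummy seismic header\n")

outputs = dummy_subjob(0, n=3, src=src)
print(len(outputs))   # somewhere between 1 and 5
```

The randomness in the number of outputs is what exercises the framework's ability to combine partial outputs of different sizes.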
Read the comments/notations in the source code (Makefile, batch_task_executor_test.py and /home/zyang03/script/batch_task_executor.py) for more detailed explanations.
Useful tips on job monitoring
- Aliases for displaying all your own jobs (Notice: replace jdoe with your own user name).
alias myjob "qstat -a | grep jdoe | grep -v ' C ' "
alias myjob2 "qstat | grep jdoe | grep -v ' C ' "
- Utility for deleting all your current jobs (in R or Q status).
alias kjoball /home/zyang03/script/alias/kjoball
When using the kjoball utility, you can specify an optional 'prefix' argument. If 'prefix' is not present, the utility simply prints out the commands for deleting all your jobs without actually deleting them, in order to prevent user mistakes. If 'prefix' is provided, the utility actually deletes all your jobs whose job names start with the given prefix.
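The prefix logic behaves roughly as sketched below. This is a Python illustration of the behavior described above, not the actual /home/zyang03/script/alias/kjoball; the job list is faked here so the example runs without a PBS scheduler:

```python
# Hypothetical sketch of kjoball's dry-run / prefix-delete behavior.

def kjoball(jobs, prefix=None):
    """jobs: list of (jobid, jobname) pairs, as parsed from qstat output.
    With no prefix: dry run, only print the delete commands and return [].
    With a prefix: return the qdel commands for matching jobs (the real
    utility would execute them)."""
    if prefix is None:
        for jobid, name in jobs:
            print("would run: qdel %s  # %s" % (jobid, name))
        return []
    return ["qdel %s" % jobid for jobid, name in jobs if name.startswith(prefix)]

jobs = [("101", "mig_shot1"), ("102", "mig_shot2"), ("103", "tomo_run")]
kjoball(jobs)                  # dry run: prints one "would run" line per job
print(kjoball(jobs, "mig"))    # -> ['qdel 101', 'qdel 102']
```

Requiring an explicit prefix before anything is deleted is what protects you from wiping out all your jobs by accident.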