INTRODUCTION

In the past few years, several papers have described SEP's efforts to run on Beowulf-style clusters. The initial work used the OpenMP (Open Multi-Processing) library (Biondi et al., 1999). As we increased the number of nodes at SEP, we switched to, or added support for, MPI in many of our programs (Sava and Clapp, 2002). This proved to be a somewhat successful strategy with a few notable drawbacks. First, it required MPI-specific code in each program. Clapp (2003b) described an effort to minimize the repetitive portions by introducing a library that performed MPI-like operations on SEPlib files.

Another problem was that we relied on mounting, or in our case automounting, disks. Each process would open the history file (and in most cases the binary file) of each input SEPlib file. This posed two problems. First, it doesn't scale well to a large number of nodes (100 processes opening the same file will often fail). Second, automount isn't particularly stable on Linux. Clapp (2004a) was an attempt to address this problem. It worked in a master-slave manner: the master process would read a sepfile and pass its description and contents along to worker nodes. In addition, it introduced the concept of a distributed SEPlib dataset. A SEPlib file could be broken into sections along one of its axes, with the various parts sitting on different nodes, and the library provided routines to partition and collect the distributed dataset. Unfortunately, this left one glaring problem: node instability. When running an MPI program, if a node becomes inoperable the entire job dies, either immediately or at the first communication attempt. The solution, adding to each application the ability to `restart' itself, can be extremely challenging, especially in large, global inversion applications (Clapp, 2003a; Sava and Biondi, 2003).
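
To make the distributed-dataset idea concrete, the following is a minimal sketch of splitting one axis of a file into near-equal contiguous sections, one per node. The function name and the (start, count) return convention are illustrative assumptions, not the actual partitioning routines of Clapp (2004a).

    def partition_axis(n_total, n_nodes):
        """Split axis indices [0, n_total) into contiguous sections,
        one per node; returns a list of (start, count) pairs."""
        base, extra = divmod(n_total, n_nodes)
        sections = []
        start = 0
        for node in range(n_nodes):
            count = base + (1 if node < extra else 0)
            sections.append((start, count))
            start += count
        return sections

    # Example: a 1000-trace axis over 7 nodes gives sections of 143 or
    # 142 traces: (0, 143), (143, 143), ..., (858, 142).
    print(partition_axis(1000, 7))

Collection is the inverse operation: each node's section is read back and written into its (start, count) window of the output file.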

In this paper I describe a library, written in Python, that provides auto-parallelization with a high level of fault tolerance for almost any SEPlib program. Instead of handling parallelization within compiled code at the library level, the parallelization is done at the script level, which sits on top of the executables. The Python library distributes and collects the datasets, keeps track of which portions of the parallel job are done, and monitors the state of the nodes. The distribution and collection are done through MPI, but the individual jobs are all serial codes. The code is written using Python's object-oriented capabilities, so it is easily extensible. A parallel job is described by a series of files and a series of tasks.
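
As an illustration of this script-level bookkeeping, here is a minimal sketch of a task list with done flags, repeatedly handed out to nodes until everything completes. The Task and run_job names and the command strings are hypothetical, ssh stands in for the library's MPI-based dispatch, and the sequential loop replaces what is actually concurrent execution; it is a sketch of the idea, not the library's interface.

    import subprocess

    class Task:
        """One serial piece of the parallel job, e.g. a SEPlib
        executable run over one section of a distributed dataset."""
        def __init__(self, command):
            self.command = command
            self.done = False

    def run_job(tasks, nodes):
        """Hand unfinished tasks to nodes until all are done.  A failed
        run (nonzero exit, unreachable node) leaves the task marked
        not-done, so it is simply reassigned on a later pass."""
        while any(not t.done for t in tasks):
            progress = False
            pending = [t for t in tasks if not t.done]
            for task, node in zip(pending, nodes):
                if subprocess.call(["ssh", node, task.command]) == 0:
                    task.done = True
                    progress = True
            if not progress:
                raise RuntimeError("no node completed any task")

    # Hypothetical usage: four serial jobs spread over two nodes.
    tasks = [Task("some_seplib_prog section=%d < in.H > part%d.H" % (i, i))
             for i in range(4)]
    run_job(tasks, ["node1", "node2"])

Because the completion bookkeeping lives in the script rather than in the executables, a crashed node costs only the re-execution of its unfinished tasks, and the serial programs themselves need no MPI-specific code.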

