In our implementation, the objects used for imaging (typically the slowness, the data, and the image) are distributed to the compute nodes before any computation takes place. The main reasons for choosing this design are speed and robustness, since no network traffic or remote file access occurs during computation. The design is especially advantageous for inversion, where the same operation is performed repeatedly on the same data, which would otherwise have to be transferred over the network again and again.
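The upfront distribution step can be sketched as follows; the function name and the shapes of the objects are hypothetical stand-ins for the actual slowness, image, and data arrays, which in practice would be large:

```python
# Sketch of the upfront distribution step (all names hypothetical).
# Each node receives its own copy of the shared objects (slowness,
# image) plus its shard of the data, so once computation starts no
# network traffic or remote file access is needed.

def distribute(slowness, image, data, n_nodes):
    """Return per-node work packages for an n_nodes cluster."""
    packages = []
    for node in range(n_nodes):
        packages.append({
            "slowness": list(slowness),   # duplicated on every node
            "image": list(image),         # duplicated on every node
            "data": data[node::n_nodes],  # data sharded round-robin
        })
    return packages

packages = distribute([0.5, 0.4], [0.0, 0.0], list(range(10)), n_nodes=4)
print(len(packages))        # 4
print(packages[1]["data"])  # [1, 5, 9]
```

Each package is then sent to its node once, before the computation loop begins.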
The drawback of this design is that certain objects, for example the slowness and the image, are duplicated on every node. This may be a problem for extremely large datasets, although large local storage is relatively inexpensive and its price is likely to decrease further.
Once the data has been distributed, each node operates independently of the others until its work is done. Typically, the local data is further broken into blocks (over depth and frequency) that fit in memory.
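The blocking over depth and frequency can be sketched as a simple index partition; the grid and block sizes below are hypothetical, and the real code would load one block's worth of data into memory at a time:

```python
# A minimal sketch of breaking the node-local data into blocks over
# depth and frequency so that each block fits in memory.

def make_blocks(n_depth, n_freq, depth_block, freq_block):
    """Return (depth_range, freq_range) index pairs covering the grid."""
    blocks = []
    for z0 in range(0, n_depth, depth_block):
        for f0 in range(0, n_freq, freq_block):
            blocks.append(((z0, min(z0 + depth_block, n_depth)),
                           (f0, min(f0 + freq_block, n_freq))))
    return blocks

blocks = make_blocks(n_depth=100, n_freq=64, depth_block=40, freq_block=32)
print(len(blocks))  # 3 depth blocks x 2 frequency blocks = 6
```

Each node then loops over its blocks sequentially, loading one block at a time.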
There are at least two ways to distribute the wavefield (Figure 1): in strategy 1, the node that distributes the wavefield also acts as a compute node; in strategy 2, a dedicated host distributes the wavefield to the compute nodes and does not participate in the computation itself. Strategy 2 is slightly more efficient (it completes the entire computation faster) than strategy 1, but it makes poorer use of the hardware, since the master node is idle most of the time. Our implementation uses strategy 1.
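The trade-off between the two strategies can be illustrated with a toy timing model (all numbers hypothetical): in strategy 1 the distributing node's own compute share is delayed by its distribution duties, whereas in strategy 2 one node is lost to computation but the remaining workers are never held up by the master:

```python
# Toy wall-time model for the two distribution strategies.
# Assumptions (hypothetical): every block costs the same time, blocks
# are split evenly, and in strategy 1 the master's distribution work
# is added on top of its compute share.

def wall_time(n_blocks, block_time, n_nodes, distribute_time, master_computes):
    if master_computes:                      # strategy 1
        per_worker = -(-n_blocks // n_nodes)       # ceiling division
        return per_worker * block_time + distribute_time
    per_worker = -(-n_blocks // (n_nodes - 1))     # strategy 2
    return per_worker * block_time

t1 = wall_time(n_blocks=120, block_time=1.0, n_nodes=8,
               distribute_time=5.0, master_computes=True)
t2 = wall_time(n_blocks=120, block_time=1.0, n_nodes=8,
               distribute_time=5.0, master_computes=False)
print(t1, t2)  # 20.0 18.0: strategy 2 finishes sooner in this model
```

Under these (invented) numbers strategy 2 finishes sooner, consistent with the observation above, while leaving one node idle for compute purposes.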
Figure 1 Two strategies for data distribution on cluster computers.