Large problem results


Next, we analyze the results for the large problem. Figure 4 shows the relative speeds for all the computations performed to solve the large problem, excluding one-time initializations. In relative terms, the SGIs perform better than the Intel-based computers on this large problem, both in single-CPU speed and in parallel speed-up. The performance of the MIPS processors is less affected by out-of-cache computations than that of the Pentium IIIs because the memory bandwidth of the SGI systems is better balanced with the CPU speed. In fact, for the large problem the parallel speed-up improves on the SGIs because the large problem is more computationally intensive than the small one, and thus the serial I/O is less of a handicap. In contrast, the parallel speed-up of the Pentiums is a little worse than for the small problem. This possibly indicates contention between threads in accessing memory.
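The effect of the serial I/O on the parallel speed-up can be illustrated with Amdahl's law, which bounds the speed-up on N CPUs by 1 / (s + (1 - s) / N), where s is the fraction of the single-CPU run time that remains serial. The following is a minimal sketch, not part of the benchmark code; the function name and the two serial fractions are hypothetical values chosen only to show the trend, and they are not measurements from these experiments.

```python
def amdahl_speedup(serial_fraction, n_cpus):
    """Upper bound on the speed-up when serial_fraction of the
    single-CPU run time (e.g. serial I/O) cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cpus)


# Hypothetical serial fractions, for illustration only (not measured):
# a larger, more compute-bound problem spends a smaller share of its
# run time in serial I/O.
cases = (
    ("I/O-heavy run (s = 0.10)", 0.10),
    ("compute-heavy run (s = 0.02)", 0.02),
)

for label, s in cases:
    speedups = [round(amdahl_speedup(s, n), 2) for n in (1, 2, 4)]
    print(label, speedups)
```

Under this model, a smaller serial fraction gives a speed-up closer to the ideal line at every CPU count, which is consistent with the improvement observed on the SGIs. The model ignores memory contention, so it cannot account for the behavior of the Pentiums.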

Figure 5 shows the relative speeds for all the computations performed in parallel. The SGIs show almost perfect parallel speed-up, indicating no contention in accessing memory. In contrast, the parallel speed-up on the Xeon is similar to the one shown in Figure 4, indicating that the loss in parallel efficiency is likely caused by memory contention, and not by the serial I/O. Somewhat surprisingly, the FFTs seem to run in parallel without loss of performance on all machines (Figure 6), with the exception of the dual-processor Pentium. It is possible that the memory access pattern of the FFTs benefits from the larger and faster caches more than the rest of the computations do.
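The quantities discussed here can be computed from wall-clock timings as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the timings are made-up placeholders, not the measured values behind the figures. Relative speed is normalized so that the reference run (the one-processor Power Challenge in the figures) has a speed of 1, and parallel efficiency is the speed-up divided by the number of CPUs.

```python
def relative_speed(reference_time, run_time):
    """Speed of a run relative to a reference run, which has speed 1."""
    return reference_time / run_time


def parallel_efficiency(single_cpu_time, run_time, n_cpus):
    """Speed-up over the same machine's one-CPU run, divided by the
    number of CPUs; 1.0 corresponds to the ideal (dashed) line."""
    return single_cpu_time / (run_time * n_cpus)


# Made-up timings in seconds, for illustration only (not measured).
reference_time = 100.0                 # one-CPU run on the reference machine
timings = {1: 80.0, 2: 44.0, 4: 25.0}  # n_cpus -> run time on another machine

for n_cpus, run_time in sorted(timings.items()):
    speed = relative_speed(reference_time, run_time)
    efficiency = parallel_efficiency(timings[1], run_time, n_cpus)
    print(f"{n_cpus} CPU(s): relative speed {speed:.2f}, "
          f"efficiency {efficiency:.2f}")
```

When only the parallel portion of the run is timed, an efficiency well below 1 cannot be blamed on serial I/O, which is why such a loss points toward contention for shared resources such as memory.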

Figure 4. Relative speeds as a function of the number of CPUs for all the computations in the large problem: 1) Power Challenge (FFTW), 2) Power Challenge (SGILIB), 3) O200 (FFTW), 4) O200 (SGILIB), 5) Dual-processor Pentium III (FFTW), 6) Four-processor Xeon (FFTW). The one-processor Power Challenge run has a speed of 1. The dashed lines correspond to the ideal parallel speed-up.

Figure 5. Relative speeds as a function of the number of CPUs for all the parallel computations in the large problem: 1) Power Challenge (FFTW), 2) Power Challenge (SGILIB), 3) O200 (FFTW), 4) O200 (SGILIB), 5) Dual-processor Pentium III (FFTW), 6) Four-processor Xeon (FFTW). The one-processor Power Challenge run has a speed of 1. The dashed lines correspond to the ideal parallel speed-up.

Figure 6. Relative speeds as a function of the number of CPUs for the computations of the FFTs in the large problem: 1) Power Challenge (FFTW), 2) Power Challenge (SGILIB), 3) O200 (FFTW), 4) O200 (SGILIB), 5) Dual-processor Pentium III (FFTW), 6) Four-processor Xeon (FFTW). The one-processor Power Challenge run has a speed of 1. The dashed lines correspond to the ideal parallel speed-up.


Stanford Exploration Project
10/25/1999