Introduction

SEP's present computer servers (an 18-processor Power Challenge and a 4-processor Origin 200) fall short of delivering the computing power needed for research on advanced algorithms for 3-D wave-equation migration, 3-D velocity estimation, and 3-D wavefield interpolation. We are therefore evaluating the choices for our next computer server (or servers). Computers based on commodity processors (Intel) have attractive prices and seem to have finally caught up in floating-point performance with computers based on processors specialized for floating-point computations (SGI-MIPS, SUN-Ultra, etc.). Further, memory and disk storage are much better priced for Intel-based computers than for any other. We are thus evaluating Intel-based Linux multiprocessor systems. At the moment the choice is between systems based on dual-processor Pentium III and systems based on four-processor Pentium III Xeon. In the near future (end of 1999?), it should also be possible to purchase eight-processor Pentium III Xeon systems. We evaluated a dual-processor Pentium III marketed by VA Linux Systems (StartX MP Workstation) and a four-processor Pentium III Xeon kindly loaned to us for evaluation by SGI (SGI 1400L).

The main goal of the tests that we report here is to determine whether we can efficiently run our "production" codes on multiprocessor Linux systems. For several years now SEP has operated Linux computers as desktops. We are satisfied with the experience, to the point that all our desktops are Intel PCs running Linux. However, we perform the heavy-duty parallel computations on our SGIs. One of the authors (JR) performed some tests running parallel programs across our network of Linux desktop PCs, using PVM message passing as the parallel-programming tool. However, no parallel "production" code runs across the Linux network.

Our question has a software component as well as a hardware component. First, Linux kernels before 2.2 had notoriously poor performance when running multi-threaded applications, and even for 2.2 the overhead of starting new threads is higher than on Irix (Bee Bednar, private communication). Second, most of our production codes are parallelized with a shared-memory model, and the parallelism is achieved using SGI or OpenMP compiler directives. The F77/F90 compiler that we presently use on Linux (from NAGWare) does not support this parallel-programming style. Therefore, we need to look at alternative compilers; we evaluated the Portland Group F90 compiler.
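As an illustration of the directive-based, shared-memory style mentioned above, here is a minimal sketch. It is written in C rather than in the Fortran of our production codes (where the corresponding directive is `!$OMP PARALLEL DO`), and the function names `scale_samples` and `energy` are hypothetical, chosen only for this example. The point is that the loops stay ordinary serial loops: a compiler without OpenMP support ignores the pragmas and still produces a correct serial program.

```c
#include <stddef.h>

/* Hypothetical example: scale every sample of a trace by a constant gain.
 * Each iteration touches a different element, so the iterations are
 * independent and can be divided among threads sharing the same memory.
 * If OpenMP is not enabled, the pragma is ignored and the loop runs serially. */
void scale_samples(float *data, size_t n, float gain)
{
    long i;
    #pragma omp parallel for
    for (i = 0; i < (long)n; i++) {
        data[i] *= gain;
    }
}

/* Hypothetical example: sum of squared samples.
 * All threads contribute to one result, so the directive declares a
 * reduction; each thread accumulates a private partial sum, and the
 * partial sums are combined when the loop ends. */
double energy(const float *data, size_t n)
{
    double sum = 0.0;
    long i;
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < (long)n; i++) {
        sum += (double)data[i] * (double)data[i];
    }
    return sum;
}
```

Enabling the directives usually requires a compiler option (for example `-fopenmp` with GCC); a compiler that lacks directive support altogether can only build the serial version, which is why the choice of compiler matters here.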

We ran our benchmarks on four computers:

1) 18-CPU - SGI Power Challenge

CPU: 75 MHz - MIPS R8000
L2-Cache: 75 MHz - 4MB
Memory: 2 GB
Operating System: Irix 6.5
Compiler: MIPSpro f90 version 7.2.1

2) 4-CPU - SGI Origin 200

CPU: 180 MHz - MIPS R10000
L2-Cache: 120 MHz - 1MB
Memory: 512 MB
Operating System: Irix 6.5
Compiler: MIPSpro f90 version 7.2.1

3) 4-CPU - SGI 1400L

CPU: 500 MHz - Intel Pentium III Xeon
L2-Cache: 500 MHz - 2MB
Memory: 1 GB
Operating System: Linux 2.2.5 (Red Hat 6.0)
Compiler: PGI pgf90 version 3.1-1

4) 2-CPU - VA Linux StartX MP Workstation

CPU: 500 MHz - Intel Pentium III
L2-Cache: 250 MHz - 512KB
Memory: 256 MB
Operating System: Linux 2.2.7 (Red Hat 6.0)
Compiler: PGI pgf90 version 3.1-1

On the hardware side, we were interested both in the performance of Intel-based computers relative to MIPS-based computers and in the absolute parallel efficiency of Intel-based computers when running multi-threaded applications. In particular, we tried to analyze the following issues regarding Intel-based multiprocessor performance:

Floating-point performance.
Effects of different secondary cache sizes (2MB and 512KB) and speeds (250 MHz and 500 MHz) on the performance of production runs.
Problems caused by memory-access conflicts when running multi-threaded applications.
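The term "parallel efficiency" is not defined in this section. Assuming the conventional definition, and introducing the symbols $T_1$ (run time on one processor) and $T_p$ (run time on $p$ processors) purely for illustration, speedup and efficiency can be written as:

```latex
S(p) = \frac{T_1}{T_p}, \qquad
E(p) = \frac{S(p)}{p} = \frac{T_1}{p \, T_p}
```

Under this definition $E(p) = 1$ corresponds to ideal scaling, while overheads such as thread start-up and memory-access conflicts push $E(p)$ below 1.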

To answer these questions in a context as relevant as possible to our environment, we ran three different programs that cover most of the types of computations that we routinely perform: an FFT-intensive split-step migration, a Kirchhoff migration, and an implicit finite-difference migration.

Stanford Exploration Project
10/25/1999