Next: FIRST ATTEMPT Up: Cole: Porting a simple Previous: Cole: Porting a simple


Figure [*] shows a 2-D poststack implicit finite-difference depth migration program in the $(\omega,x,z)$ domain, taken from Imaging the Earth's Interior (Claerbout, 1985).
#%  Migration in the (omega,x,z)-domain
real p(48,64), pi2, alpha, dt, dtau, dw
complex cp(48,64), cd(48), ce(48), cf(48), aa, a, b, c, cshift
integer ix, nx, iz, nz, iw, nw, it, nt, esize

nt= 64; nz= nt; nx= 48; pi2= 2.*3.141592
dt= 1.; dtau= 1.; dw= pi2/(dt*nt); nw= nt/2
alpha = .25                             # alpha = v*v*dtau/(4*dx*dx)
do iz= 1, nz {
	do ix= 1, nx { p(ix,iz) = 0.; cp(ix,iz) = 0. }
	}
do it= nt/3, nt, nt/4
	do ix= 1, 4 {                   # Broadened impulse source
		cp(ix,it) = (5.-ix); cp(ix,it+1) = (5.-ix)
		}
call rowcc( nx, nt, cp, +1., +1.)       # F.T. over time.
do iz= 1, nz {                          # iz and iw loops interchangeable
do iw= 2, nw {
	aa = - alpha /( (0.,-1.)*(iw-1)*dw )
	a = -aa; b = 1.+2.*aa; c = -aa
	# Compute right-hand side
	do ix= 2, nx-1
		cd(ix) = aa*cp(ix+1,iw) + (1.-2.*aa)*cp(ix,iw) + aa*cp(ix-1,iw)
	cd(1) = 0.; cd(nx) = 0.
	# Tridiagonal solver
	call ctris( nx, -a, a, b, c, -c, cd, cp(1,iw))
	# Lens term
	cshift = cexp( cmplx( 0.,-(iw-1)*dw*dtau))
	do ix= 1, nx
		cp(ix,iw) = cp(ix,iw) * cshift
	# Imaging
	do ix= 1, nx
		p(ix,iz) = p(ix,iz) + cp(ix,iw) # p(t=0) = Sum P(omega)
	}}
esize=4
to history: integer n1:nx, n2:nz, esize
call hclose()
call srite( 'out', p, nx*nz*4 )
return; end

A Ratfor program for migration in the $(\omega,x,z)$-domain taken from Claerbout (1985).
The program is short, the underlying algorithm is well understood by geophysicists, and programs derived from this simple one are still widely used. For these reasons, I thought it would be a good first program to bring to the Connection Machine. In addition to helping me learn about the machine, I hope it will serve as an example of interest to others.
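To make the flow of the Ratfor program concrete, here is a rough Python/NumPy sketch of the same kernel. The `thomas()` routine is a generic complex tridiagonal solver standing in for `ctris()`; the end conditions are simplified to zero-valued boundary rows (the `-a` and `-c` end arguments of `ctris` are not reproduced), so this is an illustrative sketch under those assumptions, not a verified port. Variable names otherwise follow the original code.

```python
import numpy as np

def thomas(lower, diag, upper, rhs):
    # Complex tridiagonal (Thomas) solve; a simple stand-in for the
    # ctris() routine called by the Ratfor program.
    n = len(rhs)
    d = np.array(diag, dtype=complex)
    r = np.array(rhs, dtype=complex)
    for i in range(1, n):                 # forward elimination
        m = lower[i] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = np.empty(n, dtype=complex)
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x

def migrate(cp, dw, dtau=1.0, alpha=0.25, nz=None):
    # cp: complex (nx, nw) wavefield, already Fourier-transformed over time.
    # Returns the depth image p of shape (nx, nz).
    # alpha = v*v*dtau/(4*dx*dx), as in the Ratfor comment.
    nx, nw = cp.shape
    if nz is None:
        nz = 2 * nw                       # nz = nt in the original
    p = np.zeros((nx, nz))
    for iz in range(nz):                  # depth stepping
        for iw in range(1, nw):           # skip omega = 0, as the original does
            aa = -alpha / (-1j * iw * dw)
            a, b, c = -aa, 1.0 + 2.0 * aa, -aa
            # Right-hand side (explicit half of the Crank-Nicolson step),
            # zeroed at the edges as cd(1) = cd(nx) = 0 in the original.
            cd = np.zeros(nx, dtype=complex)
            cd[1:-1] = (aa * cp[2:, iw] + (1.0 - 2.0 * aa) * cp[1:-1, iw]
                        + aa * cp[:-2, iw])
            # Implicit half: solve the tridiagonal system for the new cp.
            cp[:, iw] = thomas(np.full(nx, a), np.full(nx, b),
                               np.full(nx, c), cd)
            cp[:, iw] *= np.exp(-1j * iw * dw * dtau)   # lens term
            p[:, iz] += cp[:, iw].real    # imaging: p(t=0) = sum over omega
    return p
```

The structure mirrors the figure: for each depth step and each temporal frequency, a diffraction term is applied by one tridiagonal solve, a thin-lens phase shift advances the wavefield in depth, and the image is accumulated by summing the wavefield over frequency at each depth.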

I had no experience with parallel machines before this project, and I still don't consider myself an expert, but I have gotten the program up and running. For problems large enough to take full advantage of the machine's parallelism, it runs roughly 25 times faster than a well-vectorized version of the program on our Convex C-1, reaching a speed greater than 300 Mflops when using 4K CM processors (half of the 8K-processor machine installed at SEP).

In this paper, I compare the code as I have it running on the two machines. The Connection Machine version offers a brief introduction to Fortran 90, a new version of Fortran with parallel constructs, which we have been using on the CM. I also compare execution times and discuss how I gradually improved performance on the CM.

Stanford Exploration Project