The parallel migration tends to be bound in the inner loop by indirect
memory addressing for the trace sample values at the migration operator
times *t*_{k}. Simple linear interpolation of a trace value for the time *t*_{k}
requires two indirect memory accesses, a single nearest-neighbor
triangle interpolation requires three indirect memory accesses (the triangle
peak is placed at the nearest neighbor location of *t*_{k}), and a general
triangle interpolation for filter coefficients falling on non-sampled
time locations requires six indirect memory accesses. I use the
nearest-neighbor triangle anti-aliasing scheme since it is only 1.5x slower
than simple linear interpolation, whereas the general triangle interpolation
method is an unacceptable 3x slower.
The drawback to the nearest-neighbor triangle filters is that more
smoothing is done than required in unaliased regimes where linear interpolation
would have sufficed. Because the traces have been doubly integrated, and I can
neither afford to keep a copy of the non-integrated input traces nor easily
switch to the 6-point general triangle filters within a SIMD algorithm flow, I
have elected to always smooth over a minimum of three adjacent points on the
time axis.
This is fine for data that has energy mostly below half-Nyquist, but may
represent over-smoothing and unnecessary loss of some bandwidth otherwise.
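
The access counts above can be illustrated with a minimal serial sketch. It assumes the usual identity in which a triangle filter of half-width N samples, applied to a trace that has been integrated once causally and once anti-causally, reduces to a three-point operator with taps at k-(N+1), k, and k+(N+1). The function names, the use of fractional sample indices for *t*_{k}, and the plain-Python (non-SIMD) form are illustrative assumptions, not the production code described in the text.

```python
def double_integrate(trace):
    """Causal then anti-causal running sum, done once per input trace."""
    n = len(trace)
    causal = [0.0] * n
    acc = 0.0
    for i in range(n):
        acc += trace[i]
        causal[i] = acc
    both = [0.0] * n
    acc = 0.0
    for i in range(n - 1, -1, -1):
        acc += causal[i]
        both[i] = acc
    return both


def linear_interp(trace, t):
    """Two indirect accesses: the samples bracketing fractional index t >= 0."""
    i = int(t)
    frac = t - i
    return (1.0 - frac) * trace[i] + frac * trace[i + 1]


def nn_triangle(integrated, t, half_width):
    """Three indirect accesses into the doubly integrated trace.

    The triangle peak is snapped to the sample nearest to t (assumed t >= 0).
    """
    k = int(t + 0.5)
    m = half_width + 1
    return (2.0 * integrated[k] - integrated[k - m] - integrated[k + m]) / (m * m)


def direct_triangle(trace, k, half_width):
    """Reference: explicit (2*half_width + 1)-point triangle centred on k."""
    m = half_width + 1
    total = 0.0
    for j in range(-half_width, half_width + 1):
        total += (m - abs(j)) * trace[k + j]
    return total / (m * m)
```

Away from the trace ends (that is, for N+1 <= k <= n-1-(N+1)), `nn_triangle` reproduces `direct_triangle` while touching only three samples regardless of N. This is why its cost stays at 1.5x that of the two-access linear interpolation. The price is that the peak is rounded to the nearest sample rather than placed exactly at *t*_{k}.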

