Further potential speedups

All of the speedups reported in this paper include the transfer time to and from the processor, and in the cases shown that transfer time is the limiting factor. If multiple portions of the algorithm are performed on the FPGA without returning to the CPU, the additional speedup can be considerable. For example, if the FFT and FK steps can reside simultaneously on the FPGA, the cost of the FK step disappears, because the data is already on the device and no further transfer is needed. In the case of acoustic modeling, multiple time steps could be applied simultaneously.
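The effect of keeping several steps on the FPGA can be illustrated with a rough cost model. The sketch below is not taken from the measurements in this paper: the function names and all timings are hypothetical, and it assumes that every round trip to the FPGA costs the same amount of time regardless of which step it serves.

```python
def separate_offload_time(transfer_s, compute_s):
    """Total time when each step pays its own round trip to and from the FPGA."""
    return sum(transfer_s + c for c in compute_s)


def fused_offload_time(transfer_s, compute_s):
    """Total time when all steps stay on the FPGA and one round trip is paid."""
    return transfer_s + sum(compute_s)


# Hypothetical timings in seconds, chosen only for illustration.
transfer_s = 1.0          # one round trip CPU -> FPGA -> CPU (assumed constant)
compute_s = [0.2, 0.1]    # e.g. an FFT step followed by an FK step

separate = separate_offload_time(transfer_s, compute_s)
fused = fused_offload_time(transfer_s, compute_s)

print(f"separate offloads: {separate:.2f} s")
print(f"fused on FPGA:     {fused:.2f} s")
print(f"additional speedup from fusing: {separate / fused:.2f}x")
```

Under these assumed numbers the second step adds only its compute time once the transfer is shared, which is the sense in which its cost largely disappears when transfer time dominates.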