

Conclusions

Our exploration of FPGA convolution designs shows that the `cube' stencil fits the FPGA streaming architecture much better than the `star' stencil. In particular, we investigate an architecture that processes multiple time steps in one pass. This approach removes the memory bandwidth constraint and improves performance at the cost of extra data buffering and streaming overhead. Experimental results show that the FPGA streaming architecture offers great potential for accelerating 3D convolution and can achieve speedups of up to two orders of magnitude.
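
The trade-off summarized above can be illustrated with a rough back-of-the-envelope model. The sketch below is not the design evaluated in this paper: the function names, the stencil radius parameter, and the cost model (one off-chip read and one write per grid point per pass, and roughly 2 x radius buffered planes per time-step stage) are simplifying assumptions chosen only to show why fusing time steps trades off-chip traffic for on-chip buffering.

```python
# Idealized model only; all names and formulas here are illustrative
# assumptions, not measurements or the architecture from the paper.

def star_points(radius: int) -> int:
    """Points in a 3D star stencil: the centre plus `radius`
    neighbours in each direction along each of the three axes."""
    return 6 * radius + 1


def cube_points(radius: int) -> int:
    """Points in a 3D cube stencil: every point in a
    (2*radius + 1)^3 neighbourhood."""
    return (2 * radius + 1) ** 3


def offchip_words_per_point_per_step(steps_per_pass: int) -> float:
    """Assumed off-chip traffic: each grid point is read once and
    written once per pass, amortized over the time steps in that pass."""
    return 2.0 / steps_per_pass


def buffer_words(nx: int, ny: int, radius: int, steps_per_pass: int) -> int:
    """Assumed on-chip buffering: each time-step stage holds about
    2*radius planes of nx*ny points while data streams along z."""
    return steps_per_pass * 2 * radius * nx * ny


# Compare a few pass depths for a hypothetical 100 x 100 plane, radius 1.
for steps in (1, 2, 4):
    traffic = offchip_words_per_point_per_step(steps)
    buffered = buffer_words(100, 100, 1, steps)
    print(steps, traffic, buffered)
```

Under these assumptions, off-chip traffic per time step falls in proportion to the number of steps fused into one pass, while the buffering requirement grows linearly with it, which is the cost noted above.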




2009-05-05