  A Parallel 3-D FFT Algorithm on Clusters of Vector SMPs

by Daisuke Takahashi
http://www.ii.uib.no/para2000/program/daisuke.ps

Abstract:

In this paper, we propose a high-performance parallel three-dimensional fast Fourier transform (FFT) algorithm for clusters of vector symmetric multiprocessor (SMP) nodes. The three-dimensional FFT algorithm can be recast as a multirow FFT algorithm, which lengthens the innermost loop. We use the multirow FFT algorithm to implement the parallel three-dimensional FFT algorithm. We report performance results for three-dimensional power-of-two FFTs on a cluster of (pseudo) vector SMP nodes, the Hitachi SR8000. We obtained performance of about 40 GFLOPS on a 16-node Hitachi SR8000.
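
As a rough illustration of the multirow idea described in the abstract, the sketch below computes a three-dimensional power-of-two FFT as three passes of a multirow one-dimensional FFT, with a transpose between passes. It is not the paper's implementation and says nothing about how the Hitachi SR8000 code is organized or parallelized. The function names (`multirow_fft`, `transpose`, `fft3d`), the flat row-major storage, and the radix-2 kernel are all assumptions chosen for brevity. The point it tries to show is that each butterfly is applied to all `m` sequences at once, so the innermost loop has length `m` (the product of the other two dimensions) rather than the length of a single transform.

```python
import cmath


def multirow_fft(x, n, m):
    """In-place radix-2 FFT of m interleaved sequences of length n.

    Element j of sequence r is stored at x[j * m + r], so the loop over r
    is the long, unit-stride innermost loop.
    """
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    # Bit-reversal permutation on the transform index j (whole rows move).
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            for r in range(m):
                x[i * m + r], x[j * m + r] = x[j * m + r], x[i * m + r]
    # Butterfly stages; each twiddle factor is shared by all m sequences.
    size = 2
    while size <= n:
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)
                a = (start + k) * m
                b = a + half * m
                for r in range(m):  # innermost loop of length m
                    t = w * x[b + r]
                    x[b + r] = x[a + r] - t
                    x[a + r] = x[a + r] + t
        size *= 2


def transpose(x, n, m):
    """Return the m-by-n transpose of the row-major n-by-m matrix x."""
    return [x[j * m + r] for r in range(m) for j in range(n)]


def fft3d(x, n1, n2, n3):
    """3-D FFT of a row-major n1 x n2 x n3 array stored as a flat list."""
    x = list(x)
    for n in (n1, n2, n3):
        m = len(x) // n          # number of simultaneous 1-D transforms
        multirow_fft(x, n, m)    # transform along the leading dimension
        x = transpose(x, n, m)   # rotate the next dimension to the front
    return x                     # three rotations restore the original order


if __name__ == "__main__":
    n1, n2, n3 = 4, 4, 4
    data = [complex(i % 3, 0) for i in range(n1 * n2 * n3)]
    result = fft3d(data, n1, n2, n3)
    print(len(result))
```

In this sketch each pass treats the array as an `n` by `m` matrix, so a 4 x 4 x 4 transform runs its butterflies over rows of length 16 instead of 4. A production code would replace the explicit transposes and the Python loops with vectorized or library kernels.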

Citations

223 Computational Frameworks for the Fast Fourier Transform – Van Loan - 1992
108 FFTs in external or hierarchical memory – Bailey - 1990
63 Multiprocessor FFTs – Swarztrauber - 1987
21 FFT algorithms for vector computers – Swarztrauber - 1984
13 An Algorithm for the Machine Calculation of Complex Fourier Series – Cooley, Tukey - 1965
10 An efficient parallel algorithm for the 3-D FFT NAS parallel benchmark – Agarwal, Gustavson, et al. - 1994
9 Fast Radix 2, 3, 4, and 5 Kernels for Fast Fourier Transformations on Computers with Overlapping Multiply-Add Instructions – Goedecker - 1997
7 An implementation of multiple and multi-variate Fourier transforms on vector processors – Hegland - 1995
5 Two and Three Dimensional FFTs on Highly Parallel Computers – Brass, Pawley - 1986
4 Real and Complex Fast Fourier Transforms on the Fujitsu VPP 500 – Hegland - 1996
4 Implementation of parallel FFT algorithms on distributed memory machines with a minimum overhead of communication – Calvin - 1996
4 A generalized prime factor FFT algorithm for any N = 2^p 3^q 5^r – Temperton - 1992
3 Pseudo Vector Processor based on Register-Windowed Superscalar Pipeline – Nakazawa, Nakamura, et al. - 1992
1 High-Performance FFT Algorithms for the Convex C4/XA Supercomputer – Wadleigh, Gostin, et al. - 1995