A. Dubey, M. Zubair, and C. E. Grosch. A general purpose subroutine for Fast Fourier Transform on a distributed memory parallel machine. Parallel Computing, 20:1697--1710, 1994.

This paper is cited in the following contexts:
A simple and efficient parallel FFT algorithm using the BSP.. - Inda, Bisseling (2000) (1 citation)

....variants of the algorithm have appeared. For an extensive discussion of the family of FFT algorithms, see Van Loan [27]. In recent years, after the dawn of parallel computing, the originally sequential FFT algorithms have been modified and adapted to the needs of parallel computation (see e.g. [1, 2, 3, 8, 11, 12, 14, 15, 21, 22, 25, 27]). The lack of a unified parallel computing model and the existence of many different parallel architectures have made it rather difficult to develop efficient and portable parallel FFTs. Recently, however, as the parallel programming environments have become less machine dependent, examples of ....

.... have been proposed [23]. The earliest methods produced parallel algorithms that, using p processors, carry out an FFT of size N in O(log p) computation supersteps, which are interleaved by O(log p) communication supersteps that need to communicate O(N/p) data elements per processor (see e.g. [8, 11, 25]) and hence have cost O((N/p) g + l). Such methods appeared as a direct consequence of the divide-and-conquer structure of the radix-2 FFT algorithm. The paper [8] by Chu and George discusses several existing parallel algorithms of this type and three variants of their own. Restricting the ....
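
The communication cost described in the excerpt above can be sketched numerically. This is a minimal illustration, not code from any of the cited papers: the function name `butterfly_comm_cost` is hypothetical, and it assumes exactly log2(p) communication supersteps (the excerpt only states O(log p)), each moving N/p elements per processor at a BSP cost of (N/p)*g + l, where g is the per-element communication cost and l is the synchronization latency.

```python
def butterfly_comm_cost(N, p, g, l):
    """Estimated BSP communication cost of a butterfly-style parallel FFT.

    Assumptions (illustrative, not taken from the cited papers):
      - p is a power of two and divides N
      - exactly log2(p) communication supersteps
      - each superstep moves N/p elements per processor and costs (N/p)*g + l
    """
    if p < 1 or p & (p - 1):
        raise ValueError("p must be a power of two")
    if N % p:
        raise ValueError("p must divide N")
    supersteps = p.bit_length() - 1      # log2(p) for a power of two
    per_superstep = (N // p) * g + l     # BSP cost of one communication superstep
    return supersteps * per_superstep


if __name__ == "__main__":
    # Doubling p halves the volume per superstep but adds one more superstep.
    for p in (2, 4, 8, 16):
        print(p, butterfly_comm_cost(1024, p, g=2.0, l=100.0))
```

Under this model the latency term l is paid once per superstep, so for large p and small N/p the log2(p) synchronizations tend to dominate the data volume term.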

[Article contains additional citation context not shown here]


Project Report of CPSC 667 - Yang (1997)

.... of CPSC 667. Wuji Yang. Winter, 1997. April 18, 1997. 1 Introduction. The FFT was originally developed by Cooley and Tukey [3] for sequences with length N equal to a power of two (PO2). Many papers address the implementation of this FFT on interconnection networks, for example [4, 5, 7]. This is also the topic of my master's thesis (refer to [8]). The FFT of arbitrary length N is receiving more attention. Based on the proposal in [3], Bergland [1] provided the mixed-radix algorithm. However, Bergland's mixed-radix FFT does not map well onto the hypercube network. Bluestein [2] ....

.... Figure 2: Computational Flow Graph of an 8-point Decimation in Time FFT (Cooley-Tukey Framework) [4]. Figure 3: Rearrangement of Figure 2 with ....
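
For an 8-point radix-2 decimation-in-time FFT, the index order 0 4 2 6 1 5 3 7 is the bit-reversal of 0 through 7. A minimal sketch of that permutation follows; the function name `bit_reverse_order` is hypothetical and the code is an illustration of the standard technique, not taken from Yang's report.

```python
def bit_reverse_order(n):
    """Return the bit-reversed index order for a length-n radix-2 FFT.

    n must be a power of two. Entry i of the result is i with its
    log2(n)-bit binary representation reversed.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1            # number of index bits, log2(n)
    order = []
    for i in range(n):
        r = 0
        for b in range(bits):
            if i & (1 << b):             # bit b of i is set ...
                r |= 1 << (bits - 1 - b) # ... so set the mirrored bit of r
        order.append(r)
    return order


if __name__ == "__main__":
    print(bit_reverse_order(8))
```

For n = 8 this yields the ordering 0 4 2 6 1 5 3 7. Applying the permutation twice returns the natural order, which is why the same routine can reorder either the input or the output side of the flow graph.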

[Article contains additional citation context not shown here]


Parallel Algorithms For The Spectral Transform Method - Foster, Worley (1994) (11 citations)

....then switching to the Theta(Q) algorithm, analogously to the butterfly sum algorithm. The distributed FFT can also be modified to use log 4 Q stages and can then exploit factors of 4 to reduce computation costs. And it is possible to apply a transpose like algorithm within the FFT itself [9]. These hybrid algorithms can improve performance somewhat in regimes where message startup costs and data volume costs are comparable. However, they place additional requirements on problem size and processor counts. Mesh based algorithms. We have restricted ourselves to algorithms designed for ....

A. Dubey, M. Zubair, and C. E. Grosch, A general purpose subroutine for fast Fourier transform on a distributed memory parallel machine, Parallel Computing, (to appear).



CiteSeer.IST - Copyright Penn State and NEC