25 citations found.
P. Duhamel and M. Vetterli. "Fast Fourier Transforms: A Tutorial Review and a State of the Art", Signal Processing, Vol. 19, 1990, pp. 259--299.

This paper is cited in the following contexts:
The Advanced FFT Program Generator GENFFT - Frigo, Kral (2001)

....genfft produces a directed acyclic graph (dag) that encodes a DFT algorithm for a transform of size $n$. genfft implements several algorithms discovered in the past 35 years, and it employs the most appropriate for the given size. Specifically, the algorithm used is: The split radix algorithm [DV90] if $n$ is a multiple of 4. A prime factor algorithm (as described in [OS89, page 619]) if $n$ factors into $n_1 n_2$, where $n_i \neq 1$ and $\gcd(n_1, n_2) = 1$. The Cooley-Tukey FFT algorithm [CT65] if $n$ factors into $n_1 n_2$, where $n_i \neq 1$. ($n$ is a prime number) Rader's algorithm for ....
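
The selection order listed in this excerpt can be sketched as a small dispatch function. This is only an illustration of the stated conditions, not genfft's actual code: the names `choose_algorithm`, `coprime_factorization` and `any_factorization` are hypothetical, and because the excerpt is truncated at the Rader case, any extra conditions genfft applies to prime sizes are not modelled.

```python
from math import gcd, isqrt


def coprime_factorization(n):
    """Return (n1, n2) with n1 * n2 == n, n1, n2 != 1 and gcd(n1, n2) == 1, or None."""
    for n1 in range(2, isqrt(n) + 1):
        if n % n1 == 0 and gcd(n1, n // n1) == 1:
            return n1, n // n1
    return None


def any_factorization(n):
    """Return (n1, n2) with n1 * n2 == n and n1, n2 != 1, or None if n is prime."""
    for n1 in range(2, isqrt(n) + 1):
        if n % n1 == 0:
            return n1, n // n1
    return None


def choose_algorithm(n):
    """Pick an algorithm name by testing the excerpt's conditions in the order listed."""
    if n < 2:
        raise ValueError("this sketch only covers sizes n >= 2")
    if n % 4 == 0:
        return "split-radix"
    if coprime_factorization(n) is not None:
        return "prime-factor"
    if any_factorization(n) is not None:
        return "cooley-tukey"
    return "rader"  # n is prime
```

For example, `choose_algorithm(15)` selects the prime factor algorithm because 15 = 3 x 5 with coprime factors, while `choose_algorithm(9)` falls through to Cooley-Tukey because 9 = 3 x 3 has no coprime split.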

P. Duhamel and M. Vetterli. Fast Fourier transforms: a tutorial review and a state of the art. Signal Processing, 19:259-299, Apr 1990.


Empirical Performance Prediction For Ifft/fft Cores For Ofdm .. - Pagiamtzis, Gulak (2002)

....that the DFT can be recursively decomposed into smaller DFTs until trivial one-point DFTs are reached. The signal flow graph of an 8-point FFT is displayed in Figure 1. The open circles represent complex-valued addition, and a number beside a directed edge indicates complex-valued multiplication. Reference [4] provides a comprehensive introduction to the FFT. [Figure 2: Data flow in two complex multiplier implementations. (a) 2 adder, 4 multiplier implementation. (b) 5 adder, 3 multiplier implementation.] ....

.... whether the recursive decomposition (or decimation) is performed in time or in frequency, the radix (the number of sub-DFTs per decomposition stage). Radix 2, radix 4, radix 8, split radix (a combination of radix 2 and radix 4 approaches) and even mixed radix implementations are possible [4]. After algorithmic choices have been made, there are a variety of architectural choices. An example of the choices at the architectural level is the implementation of the multiplication operation. One possible decomposition of the complex multiplication ($c_R + j c_I = (a_R + j a_I)(b_R + j b_I)$) is the ....
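
The two complex-multiplier variants named in the Figure 2 caption (2 adders with 4 multipliers, and 5 adders with 3 multipliers) can be illustrated in software. The function names `cmul_4mult` and `cmul_3mult` are hypothetical, and the three-multiplier arrangement below is one common textbook variant; the excerpt does not show which arrangement the paper's figure uses.

```python
def cmul_4mult(a_r, a_i, b_r, b_i):
    """Complex product with 4 real multiplications and 2 real additions."""
    c_r = a_r * b_r - a_i * b_i
    c_i = a_r * b_i + a_i * b_r
    return c_r, c_i


def cmul_3mult(a_r, a_i, b_r, b_i):
    """Complex product with 3 real multiplications and 5 real additions."""
    k1 = b_r * (a_r + a_i)   # addition 1, multiplication 1
    k2 = a_r * (b_i - b_r)   # addition 2, multiplication 2
    k3 = a_i * (b_r + b_i)   # addition 3, multiplication 3
    c_r = k1 - k3            # addition 4: a_r*b_r - a_i*b_i
    c_i = k1 + k2            # addition 5: a_r*b_i + a_i*b_r
    return c_r, c_i
```

Both functions compute the same product; the second trades one multiplier for three extra adders, which matters in hardware where multipliers are usually the more expensive unit.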

P. Duhamel and M. Vetterli. Fast Fourier transforms: a tutorial review and a state of the art. Signal Processing, 19:259--299, April 1990.


Portable High-Performance Programs - Frigo (1992)   (1 citation)

....matches the lower bound by Hong and Kung [82]. Recall that the discrete Fourier transform (DFT) of an array $X$ of $n$ complex numbers is the array $Y$ given by $Y[i] = \sum_{j=0}^{n-1} X[j]\,\omega_n^{-ij}$ (3.3), where $\omega_n = e^{2\pi\sqrt{-1}/n}$ is a primitive $n$th root of unity, and $0 \le i < n$. Many known algorithms evaluate Equation (3.3) in time $O(n \lg n)$ for all integers $n$ [48]. In this section, however, we assume that $n$ is an exact power of 2, and compute Equation (3.3) according to the Cooley-Tukey algorithm, which works recursively as follows. In the base case where $n = O(1)$, we compute Equation (3.3) directly. Otherwise, for any factorization $n = n_1 n_2$ of ....

....(3.4) Observe that both the inner and the outer summation in Equation (3.4) is a DFT. Operationally, the computation specified by Equation (3.4) can be performed by computing $n_2$ transforms of size $n_1$ (the inner sum), multiplying the result by the factors $\omega_n^{-i_1 j_2}$ (called the twiddle factors [48]), and finally computing $n_1$ transforms of size $n_2$ (the outer sum). We choose $n_1$ to be $2^{\lceil (\lg n)/2 \rceil}$ and $n_2$ to be $2^{\lfloor (\lg n)/2 \rfloor}$. The recursive step then operates as follows. 1. Pretend that input is a row-major $n_1 \times n_2$ matrix $A$. Transpose $A$ in place, i.e. use the cache oblivious algorithm to ....

[Article contains additional citation context not shown here]

P. DUHAMEL AND M. VETTERLI, Fast Fourier transforms: a tutorial review and a state of the art, Signal Processing, 19 (1990), pp. 259--299.


Design Parameters for Distributed PIM Memory Systems - Murphy (2000)

....fields, including image processing and digital signal filtering. In fact, a Fast Fourier Transform (FFT) (from the same code base) occurs in both the Ray Tracing and Method of Moments benchmarks to be described later. The benchmark implements a 3-dimensional Fourier Transform using an FFT [13] algorithm. It is believed that the results of a three-dimensional system will demonstrate architectural properties indicative of higher dimensional operations. The input matrices for this benchmark are approximately 45 MB in total size. A Discrete Fourier Transform (DFT) in 3 dimensions is ....

Duhamel and Vetterli. Fast Fourier Transforms: a Tutorial Review and State of the Art. Signal Processing, 19, April 1990.


Multidigit Multiplication For Mathematicians - Bernstein   (3 citations)

....r of a ring is cancellable if multiplication by r is injective. Notes. There are several textbooks and survey articles covering portions of the material presented here: [3, chapter 7], [13], [16, chapter 4], [17, section 6.2], [18, section 4.7 and chapter 9], [20, chapter 8], [21, chapter 2], [34], [41, chapter 4], [46], [57, section 4.3], [59], [64], [71], [73, section 7.3], [91, chapters 36 and 41], [95, sections 2.14 and 4.17], [99], [108, section 2.5], and [109]. For lower bounds on arithmetic time in restricted models of computation, see, e.g., [16], [21], [39], [49], [110], [112] ....

Pierre Duhamel, Martin Vetterli, Fast Fourier transforms: a tutorial review and a state of the art, Signal Processing 19 (1990), 259-299. MR 91a:94004.


Application Development using Compositional Performance Analysis - Rifkin (1999)

.... implemented using the mesh spectral archetype; in this paper, we use this algorithm primarily to illustrate the performance model, rather than attempt to achieve the fastest possible FFT (as is the focus of other work, such as [Win78]). Duhamel and Vetterli provide an excellent survey of FFTs [DV90]. A good comparison of the FFT algorithm we use with more efficient ones (such as a split radix algorithm) on a vanilla workstation is given in [Arn96], although, with a multicomputer, a simpler butterfly structure might be better for more actual computation [Har96]. Benchmarking. The ....

P. Duhamel and M. Vetterli. Fast Fourier Transforms: A Tutorial Review and a State of the Art. Signal Processing, 19:259--299, April 1990.


Cache-Oblivious Algorithms (Extended Abstract) - Frigo, al.

....discrete Fourier transform (DFT) of an array $X$ of $n$ complex numbers is the array $Y$ given by $Y[i] = \sum_{j=0}^{n-1} X[j]\,\omega_n^{-ij}$ (9), where $\omega_n = e^{2\pi\sqrt{-1}/n}$ is a primitive $n$th root of unity, and $0 \le i < n$. Many algorithms evaluate Equation (9) in $O(n \lg n)$ time for all integers $n$ [15]. In this paper, however, we assume that $n$ is an exact power of 2, and we compute Equation (9) according to the Cooley-Tukey algorithm, which works recursively as follows. In the base case where $n = O(1)$, we compute Equation (9) directly. Otherwise, for any factorization $n = n_1 n_2$ of $n$, we have ....

....: Observe that both the inner and outer summations in Equation (10) are DFTs. Operationally, the computation specified by Equation (10) can be performed by computing $n_2$ transforms of size $n_1$ (the inner sum), multiplying the result by the factors $\omega_n^{-i_1 j_2}$ (called the twiddle factors [15]), and finally computing $n_1$ transforms of size $n_2$ (the outer sum). We choose $n_1$ to be $2^{\lceil (\lg n)/2 \rceil}$ and $n_2$ to be $2^{\lfloor (\lg n)/2 \rfloor}$. The recursive step then operates as follows: 1. Pretend that input is a row-major $n_1 \times n_2$ matrix $A$. Transpose $A$ in place, i.e. use the cache oblivious ....
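
The recursion described in this excerpt (inner transforms of size n1, multiplication by twiddle factors, then outer transforms of size n2) can be sketched as follows. This is a minimal illustration under stated assumptions: the equation elided from the excerpt is taken to use the index split i = i1 + i2*n1 and j = j1*n2 + j2, the function names `dft_direct` and `cooley_tukey_dft` are hypothetical, and the sketch copies sub-arrays freely instead of performing the in-place transposes that make the cited algorithm cache oblivious.

```python
import cmath


def dft_direct(x):
    """Evaluate Y[i] = sum_j X[j] * w_n^(-i*j) directly, with w_n = exp(2*pi*sqrt(-1)/n)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * i * j / n) for j in range(n))
            for i in range(n)]


def cooley_tukey_dft(x):
    """Recursive DFT for power-of-two lengths using the n = n1 * n2 split."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    if n <= 2:                       # base case, n = O(1)
        return dft_direct(x)
    lg = n.bit_length() - 1
    n1 = 1 << ((lg + 1) // 2)        # 2^ceil(lg n / 2)
    n2 = 1 << (lg // 2)              # 2^floor(lg n / 2)
    # Inner sum: n2 transforms of size n1 over the elements X[j1*n2 + j2].
    inner = [cooley_tukey_dft(x[j2::n2]) for j2 in range(n2)]   # inner[j2][i1]
    # Twiddle factors w_n^(-i1*j2).
    for j2 in range(n2):
        for i1 in range(n1):
            inner[j2][i1] *= cmath.exp(-2j * cmath.pi * i1 * j2 / n)
    # Outer sum: n1 transforms of size n2; the result lands at Y[i1 + i2*n1].
    y = [0j] * n
    for i1 in range(n1):
        column = cooley_tukey_dft([inner[j2][i1] for j2 in range(n2)])
        for i2 in range(n2):
            y[i1 + i2 * n1] = column[i2]
    return y
```

Comparing `cooley_tukey_dft(x)` against `dft_direct(x)` on a short input is a quick way to check the index bookkeeping.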

P. Duhamel and M. Vetterli. Fast Fourier transforms: a tutorial review and a state of the art. Signal Processing, 19:259--299, Apr. 1990.


Portable High-Performance Programs - Frigo (1999)   (1 citation)

....the discrete Fourier transform (DFT) of an array $X$ of $n$ complex numbers is the array $Y$ given by $Y[i] = \sum_{j=0}^{n-1} X[j]\,\omega_n^{-ij}$ (3.3), where $\omega_n = e^{2\pi\sqrt{-1}/n}$ is a primitive $n$th root of unity, and $0 \le i < n$. Many known algorithms evaluate Equation (3.3) in time $O(n \lg n)$ for all integers $n$ [48]. In this section, however, we assume that $n$ is an exact power of 2, and compute Equation (3.3) according to the Cooley-Tukey algorithm, which works recursively as follows. In the base case where $n = O(1)$, we compute Equation (3.3) directly. Otherwise, for any factorization $n = n_1 n_2$ of $n$, we ....

....Observe that both the inner and the outer summation in Equation (3.4) is a DFT. Operationally, the computation specified by Equation (3.4) can be performed by computing $n_2$ transforms of size $n_1$ (the inner sum), multiplying the result by the factors $\omega_n^{-i_1 j_2}$ (called the twiddle factors [48]), and finally computing $n_1$ transforms of size $n_2$ (the outer sum). We choose $n_1$ to be $2^{\lceil (\lg n)/2 \rceil}$ and $n_2$ to be $2^{\lfloor (\lg n)/2 \rfloor}$. The recursive step then operates as follows. 1. Pretend that input is a row-major $n_1 \times n_2$ matrix $A$. Transpose $A$ in place, i.e. use the cache oblivious algorithm to ....

[Article contains additional citation context not shown here]

P. DUHAMEL AND M. VETTERLI, Fast Fourier transforms: a tutorial review and a state of the art, Signal Processing, 19 (1990), pp. 259--299.


Cache-Oblivious Algorithms - Prokop (1999)   (3 citations)

....discrete Fourier transform (DFT) of an array $X$ of $n$ complex numbers is the array $Y$ given by $Y[i] = \sum_{j=0}^{n-1} X[j]\,\omega_n^{-ij}$ (3.1), where $\omega_n = e^{2\pi\sqrt{-1}/n}$ is a primitive $n$th root of unity, and $0 \le i < n$. Many known algorithms evaluate Equation (3.1) in time $O(n \lg n)$ for all integers $n$ [17]. In this thesis, however, we assume that $n$ is an exact power of 2, and compute Equation (3.1) according to the Cooley-Tukey algorithm, which works recursively as follows. In the base case where $n = O(1)$, we compute Equation (3.1) directly. Otherwise, for any factorization $n = n_1 n_2$ of $n$, we ....

....(3.2) Observe that both the inner and outer summations in Equation (3.2) are DFTs. Operationally, the computation specified by Equation (3.2) can be performed by computing $n_2$ transforms of size $n_1$ (the inner sum), multiplying the result by the factors $\omega_n^{-i_1 j_2}$ (called the twiddle factors [17]), and finally computing $n_1$ transforms of size $n_2$ (the outer sum). We choose $n_1$ to be $2^{\lceil (\lg n)/2 \rceil}$ and $n_2$ to be $2^{\lfloor (\lg n)/2 \rfloor}$. The recursive step then operates as follows: 1. Pretend that the input is a row-major $n_1 \times n_2$ matrix $A$. Transpose $A$ in place, i.e. use the cache oblivious ....

DUHAMEL, P., AND VETTERLI, M. Fast Fourier transforms: a tutorial review and a state of the art. Signal Processing 19 (Apr. 1990), 259--299.


Cache-Oblivious Algorithms (Extended Abstract) - Frigo, Leiserson, Prokop..

....transform (DFT) of an array $X$ of $n$ complex numbers is the array $Y$ given by $Y[i] = \sum_{j=0}^{n-1} X[j]\,\omega_n^{-ij}$ (3), where $\omega_n = e^{2\pi\sqrt{-1}/n}$ is a primitive $n$th root of unity, and $0 \le i < n$. Many known algorithms evaluate Equation (3) in time $O(n \lg n)$ for all integers $n$ [13]. In this paper, however, we assume that $n$ is an exact power of 2, and compute Equation (3) according to the Cooley-Tukey algorithm, which works recursively as follows. In the base case where $n = O(1)$, we compute Equation (3) directly. Otherwise, for any factorization $n = n_1 n_2$ of $n$, we have ....

....Observe that both the inner and the outer summation in Equation (4) is a DFT. Operationally, the computation specified by Equation (4) can be performed by computing $n_2$ transforms of size $n_1$ (the inner sum), multiplying the result by the factors $\omega_n^{-i_1 j_2}$ (called the twiddle factors [13]), and finally computing $n_1$ transforms of size $n_2$ (the outer sum). We choose $n_1$ to be $2^{\lceil (\lg n)/2 \rceil}$ and $n_2$ to be $2^{\lfloor (\lg n)/2 \rfloor}$. The recursive step then operates as follows. 1. Pretend that input is a row-major $n_1 \times n_2$ matrix $A$. Transpose $A$ in place, i.e. use the cache oblivious ....

P. Duhamel and M. Vetterli. Fast Fourier transforms: a tutorial review and a state of the art. Signal Processing, 19:259--299, Apr. 1990.


Comparing Model Checking And Term Rewriting For The.. - Schneider, Huhn (1999)

.... The most efficient one, a true two-dimensional approach, requires only 54 multiplications, 464 additions and 6 arithmetic shifts [FL90]. Some of the fast DCT algorithms go back to the discrete Fourier transform [CT65]; others consider the DCT as a separate field [CSF77]. A survey is given in [DV90]. Figure 1 lists two efficient DCT algorithms, namely the version of Loeffler, Ligtenberg and Moschytz (LLM DCT) [LLM89] and Arai, Agui and Nakajima's version (AAN DCT) [PM93]. The additional optimization obtained by these implementations is based on the following trigonometric addition ....

P. Duhamel and M. Vetterli. Fast fourier transforms: A tutorial review and a state of the art. Signal Processing, 19:259--299, 1990.


FFTW: An Adaptive Software Architecture For The FFT - Frigo, Johnson (1998)   (110 citations)

....performance than all other publicly available software. FFTW also compares favorably with machine-specific, vendor-optimized libraries. 1. INTRODUCTION The discrete Fourier transform (DFT) is an important tool in many branches of science and engineering [1] and has been studied extensively [2]. For many practical applications, it is important to have an implementation of the DFT that is as fast as possible. In the past, speed was the direct consequence of clever algorithms [2] that minimized the number of arithmetic operations. On present-day general purpose microprocessors, however, ....

....(DFT) is an important tool in many branches of science and engineering [1] and has been studied extensively [2]. For many practical applications, it is important to have an implementation of the DFT that is as fast as possible. In the past, speed was the direct consequence of clever algorithms [2] that minimized the number of arithmetic operations. On present-day general purpose microprocessors, however, the performance of a program is mostly determined by complicated interactions of the code with the processor pipeline, and by the structure of the memory. Designing for performance under ....

[Article contains additional citation context not shown here]

P. Duhamel and M. Vetterli, "Fast Fourier transforms: a tutorial review and a state of the art," Signal Processing, vol. 19, pp. 259--299, Apr. 1990.


A self-sorting in-place fast Fourier transform algorithm suitable .. - Hegland (1994)   (1 citation)

....is their long vector lengths and stride one data access after the first step. Our new algorithm can be interpreted as a combination of these algorithms with ideas from the Johnson-Burrus algorithm. A review of the historical development of fast Fourier transforms since Gauss can be found in [DV90]. Several of the 117 references in this paper point to software. Furthermore a simple tutorial on FFTs is presented there without explicit usage of Kronecker products. Both complexity results and implementation ideas are presented. A broad introduction to FFTs featuring matrix language can be ....

Duhamel, P., Vetterli, M.: Fast Fourier transforms: a tutorial review and a state of the art. Signal Process. 19(4), 259--299 (1990)


Stable Computation of the Complex Roots of Unity - Stephen Tate (1992)

....trigonometric function evaluation can be avoided when computing the transform. In this paper, we consider the problem of precomputing these values. For a good survey of issues arising in computing the Fast Fourier Transform, the reader may refer to the excellent paper by Duhamel and Vetterli [2]. In particular, we show that the method given in a standard reference text [3] is numerically unstable, and can produce very inaccurate values for moderately sized sequences. More importantly, we present an alternative way of calculating the roots of unity, present analysis that shows its ....
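
The excerpt does not show which method the reference text uses or which alternative the paper proposes, so the following is only a hedged illustration of the general precomputation problem. It contrasts two textbook ways of building a table of the n complex roots of unity: evaluating each root independently, and generating each root from the previous one by repeated multiplication, where rounding error can accumulate as the index grows. The function names are hypothetical and the sign of the exponent is an arbitrary choice.

```python
import cmath


def roots_direct(n):
    """Evaluate each root exp(2*pi*sqrt(-1)*k/n) independently."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def roots_by_repeated_multiplication(n):
    """Build the table as w_k = w_(k-1) * w_1; each step inherits earlier rounding error."""
    w1 = cmath.exp(2j * cmath.pi / n)
    roots = [1 + 0j]
    for _ in range(n - 1):
        roots.append(roots[-1] * w1)
    return roots


def max_abs_difference(a, b):
    """Largest absolute difference between corresponding entries of two tables."""
    return max(abs(p - q) for p, q in zip(a, b))
```

Calling `max_abs_difference(roots_direct(n), roots_by_repeated_multiplication(n))` for increasing n gives a rough feel for how far the two tables drift apart in double precision.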

P. Duhamel and M. Vetterli. "Fast Fourier Transforms: A Tutorial Review and a State of the Art", Signal Processing, Vol. 19, 1990, pp. 259--299.


The Fastest Fourier Transform in the West - Frigo, Johnson (1997)   (17 citations)

....(see [8] and [6, page 619]) within the codelets. A huge body of previous work on the Fourier transform exists, which we do not have space to reference properly. We limit ourselves to mention some references that are important to the present paper. A good tutorial on the FFT can be found in [9] or in classical textbooks such as [6]. Previous work exists on automatic generation of FFT programs: [10] describes the generation of FFT programs for prime sizes, and [11] presents a generator of Pascal programs implementing a Prime Factor algorithm. Johnson and Burrus [12] first applied dynamic ....

P. Duhamel and M. Vetterli, "Fast Fourier transforms: a tutorial review and a state of the art," Signal Processing, vol. 19, pp. 259--299, Apr. 1990.


A Fast Fourier Transform Compiler - Frigo (1999)   (40 citations)

....1 Introduction Recently, Steven G. Johnson and I released Version 2.0 of the FFTW library [FJ98, FJ], a comprehensive collection of fast C routines for computing the discrete Fourier transform (DFT) in one or more dimensions, of both real and complex data, and of arbitrary input size. The DFT [DV90] is one of the most important computational problems, and many real world applications require that the transform be computed as quickly as possible. FFTW is one of the fastest DFT programs available (see Figures 1 and 2) because of two unique features. First, FFTW automatically adapts the ....

....it does not contain optimizations that are not required for the DFT programs it generates (for example loop unrolling). genfft operates in four phases. 1. In the creation phase, genfft produces a directed acyclic graph (dag) of the codelet, according to some well known algorithm for the DFT [DV90]. The generator contains many such algorithms and it applies the most appropriate. 2. In the simplifier, genfft applies local rewriting rules to each node of the dag, in order to simplify it. In traditional compiler terminology, this phase performs algebraic transformations and ....

[Article contains additional citation context not shown here]

P. Duhamel and M. Vetterli. Fast Fourier transforms: a tutorial review and a state of the art. Signal Processing, 19:259--299, April 1990.


Performance Analysis for Mesh and Mesh-Spectral Archetype.. - Rifkin, Massingill (1998)

.... implemented using the mesh spectral archetype; in this paper, we use this algorithm primarily to illustrate the performance model, rather than attempt to achieve the fastest possible FFT (as is the focus of other work, such as [Win78]). Duhamel and Vetterli provide an excellent survey of FFTs [DV90]. A good comparison of the FFT algorithm we use with more efficient ones (such as a split radix algorithm) on a vanilla workstation is given in [Arn96], although, with a multicomputer, a simpler butterfly structure might be better for more actual computation [Har96]. Benchmarking. The ....

P. Duhamel and M. Vetterli. Fast Fourier Transforms: A Tutorial Review and a State of the Art. Signal Processing, 19:259--299, April 1990.


A Self-Sorting In-Place Fast Fourier Transform Algorithm Suitable .. - Hegland (1994)   (1 citation)

....is their long vector lengths and stride one data access after the first step. Our new algorithm can be interpreted as a combination of these algorithms with ideas from the Johnson-Burrus algorithm. A review of the historical development of fast Fourier transforms since Gauss can be found in [DV90]. Several of the 117 references in this paper point to software. Furthermore a simple tutorial on FFTs is presented there without explicit usage of Kronecker products. Both complexity results and implementation ideas are presented. A broad introduction to FFTs featuring matrix language can be ....

P. Duhamel and M. Vetterli, Fast Fourier transforms: a tutorial review and a state of the art, Signal Process. 19 (1990), no. 4, 259--299.


Cyclic Prefixing or Zero Padding for Wireless.. - Muquet, Wang.. (2002)   (2 citations)  Self-citation (Duhamel)

....HL2) is a power of two and , and hence, can be decomposed as the product of two coprime numbers: five and a power of two. Hence, the size FFT can be implemented easily using five FFTs of size , and FFTs of size five without any additional operations such as multiplications by twiddle factors [8]. A table comparing the arithmetic complexity of the equalizers considered in this paper is provided in Table I in the case of HL2 (that is, for and ) Since there is a plethora of implementations for complex divisions, we have deliberately chosen to not decompose them in terms of real additions ....
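
The twiddle-free decomposition mentioned in this excerpt relies on the two factors being coprime. A minimal sketch of such a prime factor (Good-Thomas) mapping is given below. The transform sizes are elided in the excerpt, so the size 20 = 5 x 4 in the usage note is purely an assumption for illustration, and the names `dft_direct` and `prime_factor_dft` are hypothetical. The modular inverse via three-argument `pow` needs Python 3.8 or later.

```python
import cmath
from math import gcd


def dft_direct(x):
    """Evaluate Y[i] = sum_j X[j] * exp(-2*pi*sqrt(-1)*i*j/n) directly."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * i * j / n) for j in range(n))
            for i in range(n)]


def prime_factor_dft(x, n1, n2):
    """DFT of length n1 * n2 with gcd(n1, n2) == 1, using no twiddle factors."""
    n = len(x)
    if n != n1 * n2 or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with coprime n1 and n2")
    # Input index map j = (j1*n2 + j2*n1) mod n; one size-n1 transform per j2.
    rows = [dft_direct([x[(j1 * n2 + j2 * n1) % n] for j1 in range(n1)])
            for j2 in range(n2)]                      # rows[j2][i1]
    # Output index map from the Chinese remainder theorem:
    # i is congruent to i1 modulo n1 and to i2 modulo n2.
    q1 = pow(n2, -1, n1)
    q2 = pow(n1, -1, n2)
    y = [0j] * n
    for i1 in range(n1):
        column = dft_direct([rows[j2][i1] for j2 in range(n2)])   # column[i2]
        for i2 in range(n2):
            y[(i1 * n2 * q1 + i2 * n1 * q2) % n] = column[i2]
    return y
```

With the assumed size, `prime_factor_dft(x, 5, 4)` on a length-20 input runs four transforms of size five and five transforms of size four, and no multiplication by twiddle factors appears between the two stages.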

P. Duhamel and M. Vetterli, "Fast Fourier transforms: a tutorial review and a state of the art," Signal Processing, no. 19, pp. 259--299, 1990.


Stable Computation of the Complex Roots of Unity - Stephen Tate Department

No context found.

P. Duhamel and M. Vetterli. "Fast Fourier Transforms: A Tutorial Review and a State of the Art", Signal Processing, Vol. 19, 1990, pp. 259--299.


Fast Multi-Resolution Image Operations - In The Wavelet

No context found.

P. Duhamel and M. Vetterli. Fast Fourier transforms: A tutorial review and a state of the art. Signal Proc., 19(4):259--299, April 1990.


A New, Fast and Low-Cost FFT Estimation Scheme of Signals - Using Bit Non-Subtractive (2004)

No context found.

P. Duhamel and M. Vetterli, "Fast Fourier transforms: A tutorial review and a state of the art ", Signal Processing, Vol. 19, pp 259-299, 1990.


Fast Multiresolution Image Operations - In The Wavelet

No context found.

P. Duhamel and M. Vetterli, "Fast Fourier Transforms: A Tutorial Review and a State of the Art," Signal Processing, vol. 19, no. 4, pp. 259-299, Apr. 1990.


Fast Multiplication And Its Applications - Bernstein

No context found.

Pierre Duhamel, Martin Vetterli, Fast Fourier transforms: a tutorial review and a state of the art, Signal Processing 19 (

CiteSeer.IST - Copyright Penn State and NEC