D. Sepiashvili. Performance models and search methods for optimal FFT implementations. Master's thesis, ECE Dept., Carnegie Mellon University, 2000.
....need to be timed can be greatly reduced, but still becomes intractable at larger sizes. A common approach for searching the very large space of possible implementations of signal transforms has been to use dynamic programming [Johnson and Burrus, 1983; Frigo and Johnson, 1998; Haentjens, 2000; Sepiashvili, 2000]. This approach maintains a list of the fastest formulas it has found for each transform and size. When trying to find the fastest formula for a particular transform and size, it considers all possible splits of the root node. For each child of the root node, dynamic programming substitutes the ....
....programming considers split trees with at most k children at any node. Unfortunately, increasing k can significantly increase the number of formulas to be timed. As another generalization, k-best dynamic programming keeps track of the k best formulas for each transform and size [Haentjens, 2000; Sepiashvili, 2000]. This softens the dynamic programming assumption, allowing for the fact that a suboptimal formula for a given transform and size might be the optimal way to split such a node in a larger tree. Unfortunately, moving from standard 1-best to just 2-best more than doubles the number of formulas to ....
D. Sepiashvili. Performance models and search methods for optimal FFT implementations. Master's thesis, ECE Dept., Carnegie Mellon University, 2000.
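
The k-best dynamic programming search described in the excerpts above can be sketched roughly as follows. This is only an illustration under stated assumptions: it is restricted to binary splits of sizes 2^n, the names `k_best_dp`, `toy_cost`, and `size` are hypothetical, and `toy_cost` is a made-up integer cost model standing in for actually timing a formula. It is not the procedure from the cited thesis.

```python
def size(tree):
    """Exponent n of a tree for size 2**n; a leaf is the int n, a split is (left, right)."""
    return tree if isinstance(tree, int) else size(tree[0]) + size(tree[1])


def toy_cost(tree):
    """Hypothetical cost model (an assumption); a real search would time the code."""
    if isinstance(tree, int):
        n = tree
        # Made-up leaf cost: butterfly work plus a penalty that grows for large leaves.
        return n * 2**n + 4**n // 4
    left, right = tree
    sl, sr = size(left), size(right)
    # Left child runs 2**sr times, right child 2**sl times, plus a twiddle pass.
    return 2**sr * toy_cost(left) + 2**sl * toy_cost(right) + 2**(sl + sr)


def k_best_dp(n_max, k, leaf_max, cost):
    """Return best[n]: up to k (cost, tree) pairs for size 2**n, cheapest first."""
    best = {}
    for n in range(1, n_max + 1):
        candidates = []
        if n <= leaf_max:
            # Small sizes may be computed directly by a base-case module.
            candidates.append((cost(n), n))
        # Consider every split of the root node into two children.
        for left_n in range(1, n):
            right_n = n - left_n
            # Substitute each stored best formula for each child.
            for _, left_tree in best[left_n]:
                for _, right_tree in best[right_n]:
                    tree = (left_tree, right_tree)
                    candidates.append((cost(tree), tree))
        candidates.sort(key=lambda pair: pair[0])
        best[n] = candidates[:k]  # k = 1 is standard dynamic programming
    return best


if __name__ == "__main__":
    for k in (1, 2):
        result = k_best_dp(n_max=8, k=k, leaf_max=4, cost=toy_cost)
        print(k, result[8][0])
```

Because this toy cost is built additively from the children's costs, 1-best search already finds the cheapest tree here and a larger k only adds candidates to evaluate. The excerpt's motivation for k > 1 applies when measured runtimes do not compose this cleanly.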
....need to be timed can be greatly reduced (for example, from 51,819 to 101 trees for size 2) but still becomes intractable at larger sizes. A common approach for searching the very large space of possible implementations of signal transforms has been to use dynamic programming [19] [10] [20] [21]. This approach maintains a list of the fastest formulas it has found for each transform and size. When trying to find the fastest formula for a particular transform and size, it considers all possible splits of the root node. For each child of the root node, dynamic programming substitutes the ....
....programming considers split trees with at most k children at any node. Unfortunately, increasing k can significantly increase the number of formulas that must be timed. As another generalization, k-best dynamic programming keeps track of the k best formulas for each transform and size [20] [21]. This softens the dynamic programming assumption, allowing for the fact that a suboptimal formula for a given transform and size might be the optimal way to split such a node in a larger tree. Unfortunately, moving from standard 1-best to just 2-best more than doubles the number of formulas that ....
David Sepiashvili, "Performance models and search methods for optimal FFT implementations," M.S. thesis, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, May 2000.
....explicit permutations. The Small FFT Code Modules: Highly optimized code modules were written to compute FFTs of sizes 2^1, 2^2, 2^3, and 2^4. The code modules were used for the base cases of the recursion in the Cooley-Tukey algorithm. The code modules are discussed in detail in [8] and are based on algorithms given in [7]. The Plan Structure: The FFT program uses a plan structure that allows each decomposition to be represented by a unique plan. A plan is simply a linked list of nodes that contains information needed for different steps in an FFT computation. A diagram of ....
....also shows that there is no accumulation of strides for the output of the right child computation, since the output of this computation is stored in the temporary storage array that is dedicated for node n. 2.2.6 Measuring Runtimes: All runtimes are measured using a software timer explained in [8]. The timer uses the clock() function in the C library to make the timings. The timer repeats each routine it measures until at least one second has expired before making a single timing. The reason for making the minimum experiment duration one second is that the precision of the clock() ....
David Sepiashvili. Performance models and search methods for optimal FFT implementations. Master's thesis, Carnegie Mellon University, Pittsburgh, May 2000.
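
The timing scheme in the excerpt above (repeat a routine until at least one second has elapsed, then take a single timing) might look roughly like this. It is a sketch under assumptions: the original timer is described as using the C library's clock(), and here Python's `time.process_time()` is swapped in as a CPU-time analogue. The names `time_routine` and `min_duration` are hypothetical.

```python
import time


def time_routine(routine, min_duration=1.0):
    """Average seconds of CPU time per call of `routine`.

    The routine is repeated until at least `min_duration` seconds of CPU
    time have accumulated, so a coarse clock resolution is amortized over
    many repetitions before the single timing is taken.
    """
    reps = 0
    start = time.process_time()  # assumption: stands in for C's clock()
    while True:
        routine()
        reps += 1
        elapsed = time.process_time() - start
        if elapsed >= min_duration:
            return elapsed / reps


if __name__ == "__main__":
    per_call = time_routine(lambda: sum(range(10_000)))
    print(f"{per_call:.3e} s per call")
```

Checking the clock after every call keeps the sketch short but adds overhead for very cheap routines. Running the routine in growing batches between clock reads is one way to reduce that.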