Abstract:
1 Introduction
  1. Divide-and-Conquer and the Memory Hierarchy
  2. Overview of Architecture-Cognizant Divide-and-Conquer
  3. Overview of Napoleon
  4. What You Can Expect
  5. Contributions
...
Cited by:
Rescheduling for Locality in Sparse Matrix Computations - Strout, Carter, Ferrante
Active bibliography (related documents):
1.1: Architecture-Cognizant Divide and Conquer Algorithms - Gatlin, Carter (1999)
0.8: Memory Hierarchy Considerations for Fast Transpose and.. - Gatlin, Carter (1999)
0.7: Faster FFTs via Architecture-Cognizance - Gatlin, Carter (2000)
Similar documents based on text:
0.5: Adaptive Scheduling of Master/Worker Applications on Distributed.. - Shao (2001)
0.4: Asymptotic Expansions of the Mergesort Recurrences - Hwang (1998)
0.4: Cv - Strout
BibTeX entry:
Kang Su Gatlin. Portable High Performance Programming via Architecture-Cognizant Divide-and-Conquer Algorithms. Ph.D. thesis, University of California, San Diego, September 2000. http://citeseer.ist.psu.edu/gatlin00portable.html
@phdthesis{ gatlin00portable,
author = "Kang Su Gatlin",
title = "Portable High Performance Programming via Architecture-Cognizant Divide-and-Conquer Algorithms",
school = {University of California, San Diego},
year = "2000",
month = sep,
url = "http://citeseer.ist.psu.edu/gatlin00portable.html" }
Citations (may not include all citations):
3972: Introduction to Algorithms - Cormen, Leiserson et al. - 1994
1535: Cambridge University Press - Press, Teukolsky et al. - 1993
1399: Compilers: Principles - Aho, Sethi et al. - 1988
476: Programming Language - Kernighan, Ritchie - 1988
294: A Loop Transformation Theory and an Algorithm to Maximize Pa.. - Wolf, Lam - 1991
292: Advanced Compiler Design and Implementation - Muchnick - 1997
216: Strategies for Cache and Local Memory Management by Global Pr.. - Gannon, Jalby et al. - 1988
206: An Algorithm for the Machine Calculation of Complex Fourier .. - Cooley, Tukey - 1965
193: Superscalar Microprocessor Design - Johnson - 1991
157: Automatically Tuned Linear Algebra Software - Whaley, Dongarra - 1998
123: Optimizing Matrix Multiply using PHiPAC: a Portable High-Per.. - Bilmes, Asanovic et al. - 1997
99: Hitting the Memory Wall: Implications of the Obvious - Wulf, McKee - 1995
98: High Performance Compilers for Parallel Computing - Wolfe - 1996
83: Programming with POSIX Threads - Butenhof - 1997
82: To Copy or Not to Copy: A Compile-Time Technique for Assessi.. - Temam, Granston et al. - 1993
79: Intel Architecture Software Developer's Manual - Corporation - 1999
64: Cache-Oblivious Algorithms - Frigo, Leiserson et al. - 1999
62: Computational Frameworks for the Fast Fourier Transform - Van Loan - 1992
62: An analysis of DAG-consistent distributed shared-memory algo.. - Blumofe, Frigo et al. - 1996
60: The Uniform Memory Hierarchy Model of Computation - Alpern, Carter et al. - 1994
60: Recursion Leads to Automatic Variable Blocking for Dense Lin.. - Gustavson - 1997
49: Design and Implementation of Code Optimizations for a TypeDi.. - Tarditi - 1997
42: The Cache Performance of Blocked Algorithms - Lam, Rothberg et al. - 1991
40: Algorithmic Skeletons: A Structured Approach to the Manageme.. - Cole - 1988
29: Modeling Parallel Computers as Memory Hierarchies - Alpern, Carter et al. - 1993
28: Tuning Strassen's Matrix Multiplication for Memory Efficiency - Thottethodi, Chatterjee et al. - 1998
26: Computer Architecture: A Quantitative Approach - Hennessy, Patterson - 1996
24: POWER2: The Next Generation of the RISC System/6000 Family - White, Dhawan - 1994
23: Optimizing for Reduced Code Space Using Genetic Algorithms - Cooper, Schielke et al. - 1999
21: Hierarchical Tiling: A Methodology for High Performance - Carter, Ferrante et al. - 1996
18: The Fastest Fourier Transform in the West - Frigo, Johnson - 1997
15: Architecture-Cognizant Divide and Conquer Algorithms - Gatlin, Carter - 1999
14: Towards a Model for Portable Parallel Performance: Exposing .. - Alpern, Carter - 1994
13: Mastering Regular Expressions - Friedl, Oram - 1997
12: Portable High-Performance Supercomputing: High-Level Platform.. - Brewer - 1994
11: Influence of Caches on the Performance of Sorting - LaMarca, Ladner - 1997
9: Space-Limited Procedures: A Methodology for Portable High-Pe.. - Alpern, Carter et al. - 1995
8: Towards an Optimal Bit-Reversal Permutation Program - Carter, Gatlin - 1998
8: Parallelization of Divide-and-Conquer in the Bird-Meertens F.. - Gorlatch, Lengauer - 1995
6: Challenges of Computing the Fast Fourier Transform - Johnson, Johnson
6: Microprocessor Report - Linley, Sports et al. - 1997
5: Memory Hierarchy Considerations for Fast Transpose and Bit-Re.. - Gatlin, Carter - 1999
5: Software Methods for Improvement of Cache Performance on Sup.. - Porterfield - 1989
4: Reference Manual - Group - 1998
4: On Computing the Fast Fourier Transform - Singleton - 1967
4: Eliminating Conflict Misses for High Performance Architecture.. - Rivera, Tseng - 1998
3: IBM's POWER to replace PSC - POWER, SC et al. - 1997
2: An Overview of the Alpha AXP 21164 Microprocessor - Benschneider - 1995
2: Programming with Divide-and-Conquer Skeletons: A Case Study of.. - Gorlatch - 1997
1: Technical Report SC - Scientific, Version et al. - 1992
1: The Hewlett-Packard PA 8000 RISC CPU: A High Performance Out.. - Kumar - 1997
1: Overcoming the Challenges to Feedback-Directed Optimizations - Smith - 2000
1: Personal Correspondence - Mou - 1998
Documents on the same site (http://www.cs.ucsd.edu/~kgatlin/papers.html):
Towards an Optimal Bit-Reversal Permutation Program - Carter, Gatlin (1998)
Architecture-Cognizant Divide and Conquer Algorithms - Gatlin, Carter (1999)
Performance Optimisations of the NPB FT Kernel by.. - Getov, Wei, Carter.. (1999)
CiteSeer.IST - Copyright Penn State and NEC