Abstract: The goal of this dissertation is to give programmers the ability to achieve high performance
by focusing on developing parallel algorithms, rather than on architecture-specific
details. The advantages of this approach also include program portability and
legibility. To achieve high performance, we provide automatic compilation techniques
that tailor parallel algorithms to shared-memory multiprocessors with local caches
and a common bus. In particular, the compiler maps complete applications onto ...
Context of citations to this paper:
...sectioning to the Fortran D back end. Loop fusion is deferred because even hand-written Fortran 77 programs can benefit significantly [24, 28]. Sectioning is needed in the back end because forall loops may also be present in Fortran 77D. We assign to the Fortran 90D front end...
...loops. Interprocedural analysis can identify candidate loops for fusion enabling transformations such as loop extraction and loop embedding [9]. Loop distribution [3] may be used to produce parallel loop nests that adhere to the model. Since the loops will be eventually fused...
Cited by:
Estimating Cache Misses and Locality Using Stack Distances - Cascaval, Padua (2003)
Runtime Code Parallelization for On-Chip Multiprocessors - Kandemir, Zhang (2003)
Improving Data Locality with Loop Transformations - McKinley, Carr, Tseng (1996)
Similar documents (at the sentence level):
13.7%: Optimizing for Parallelism and Data Locality - Kennedy, McKinley (1992)
8.9%: Analysis and Transformation in the ParaScope Editor - Kennedy, McKinley, Tseng (1991)
8.4%: Analysis and Transformation in an Interactive Parallel.. - Kennedy, McKinley, Tseng (1993)
Active bibliography (related documents):
2.1: Interactive Parallel Programming Using the ParaScope Editor - Kennedy, McKinley, Tseng (1991)
1.5: The ParaScope Parallel Programming Environment - Cooper (1993)
1.4: A Compiler Optimization Algorithm for Shared-Memory Multiprocessors - McKinley (1998)
Similar documents based on text:
0.3: Very Large-Scale Linear Programming: A Case Study Exploiting Both .. - Kilgore (1993)
0.2: Quantifying Loop Nest Locality Using SPEC'95 and the Perfect.. - McKinley, Temam (1999)
0.2: Automatic Data Layout for Distributed Memory Machines - Kremer (1995)
Related documents from co-citation:
9: The ParaScope parallel programming environment - Cooper - 1993
8: A Data Locality Optimizing Algorithm - Wolf, Lam - 1991
8: Automatic translation of Fortran programs to vector form - Allen, Kennedy - 1987
BibTeX entry:
K. S. McKinley. Automatic and Interactive Parallelization. PhD thesis, Dept. of Computer Science, Rice University, April 1992. http://citeseer.ist.psu.edu/mckinley94automatic.html
@techreport{ kathryn92utomatic,
author = "McKinley, Kathryn S.",
title = "{A}utomatic and {I}nteractive {P}arallelization",
number = "CRPC-TR92214-S",
month = "April",
year = "1992",
url = "citeseer.ist.psu.edu/mckinley94automatic.html" }
Citations (may not include all citations):
1450
The Design and Analysis of Computer Algorithms (context) - Aho, Hopcroft et al. - 1974
480
The program dependence graph and its use in optimization (context) - Ferrante, Ottenstein et al. - 1987
474
A data locality optimizing algorithm (context) - Wolf, Lam - 1991
376
The cache performance and optimizations of blocked algorithm.. (context) - Lam, Rothberg et al. - 1991
299
Dependence Analysis for Supercomputing (context) - Banerjee - 1988
283
Optimizing Supercompilers for Supercomputers (context) - Wolfe - 1982
283
Optimizing Supercompilers for Supercomputers (context) - Wolfe - 1989
258
Automatic translation of Fortran programs to vector form
- Allen, Kennedy - 1987
245
An extended set of Fortran basic linear algebra subprograms
- Dongarra, Croz et al. - 1988
216
Strategies for cache and local memory management by global p.. (context) - Gannon, Jalby et al. - 1987
216
Strategies for cache and local memory management by global p.. (context) - Gannon, Jalby et al. - 1988
180
LINPACK User's Guide (context) - Dongarra, Bunch et al. - 1979
178
Supernode partitioning (context) - Irigoin, Triolet - 1988
171
Dependence graphs and compiler optimizations (context) - Kuck, Kuhn et al. - 1981
168
The parallel execution of DO loops (context) - Lamport - 1974
158
Improving register allocation for subscripted variables
- Callahan, Carr et al. - 1990
157
Conversion of control dependence to data dependence (context) - Allen, Kennedy et al. - 1983
152
An efficient method of computing static single assignment fo.. (context) - Cytron, Ferrante et al. - 1989
149
An implementation of interprocedural bounded regular section..
- Havlak, Kennedy - 1991
146
Unimodular transformations of double loops (context) - Banerjee - 1990
122
SUPERB: A tool for semiautomatic MIMD/SIMD parallelization (context) - Zima, Bast et al. - 1988
111
More iteration space tiling (context) - Wolfe - 1989
110
The Livermore Fortran Kernels: A computer test of the numeri.. (context) - McMahon - 1986
110
Practical dependence testing
- Goff, Kennedy et al. - 1991
107
Software Methods for Improvement of Cache Performance (context) - Porterfield - 1989
106
Compiler optimizations for Fortran D on MIMD distributed-mem..
- Hiranandani, Kennedy et al. - 1991
104
The Structure of Computers and Computations (context) - Kuck - 1978
94
Optimizing for parallelism and data locality
- Kennedy, McKinley - 1992
87
Analysis of interprocedural side effects in a parallel progr.. (context) - Callahan, Kennedy - 1987
82
On estimating and enhancing cache effectiveness (context) - Ferrante, Sarkar et al. - 1991
80
Direct parallelization of CALL statements (context) - Triolet, Irigoin et al. - 1986
79
Direct search methods on parallel machines
- Dennis, Torczon - 1991
79
Interprocedural dependence analysis and parallelization (context) - Burke, Cytron - 1986
78
An overview of the PTRAN analysis system for multiprocessing (context) - Allen, Burke et al. - 1987
78
An empirical study of FORTRAN programs, Software: Practice and Experience (context) - Knuth - 1971
76
Doacross: Beyond vectorization for multiprocessors (context) - Cytron - 1986
72
A catalogue of optimizing transformations (context) - Allen, Cocke - 1972
71
Supercomputer performance evaluation and the Perfect benchma..
- Cybenko, Kipp et al. - 1990
70
An interval-based approach to exhaustive and incremental int.. (context) - Burke - 1990
69
Estimating interlock and improving balance for pipelined mac..
- Callahan, Cocke et al. - 1988
67
Evaluation of compiler optimizations for Fortran D on MIMD d..
- Hiranandani, Kennedy et al. - 1992
66
A technique for summarizing data access and its use in paral.. (context) - Balasundaram, Kennedy - 1989
66
ParaScope: A parallel programming environment (context) - Callahan, Cooper et al. - 1988
66
Interprocedural constant propagation (context) - Callahan, Cooper et al. - 1986
59
On the number of operations simultaneously executable in Fortra.. (context) - Kuck, Muraoka et al. - 1972
57
An overview of the Fortran D programming system
- Hiranandani, Kennedy et al. - 1991
55
Interactive parallel programming using the ParaScope Editor
- Kennedy, McKinley et al. - 1991
54
Automatic decomposition of scientific programs for parallel .. (context) - Allen, Callahan et al. - 1987
51
Managing Interprocedural Optimization
- Hall - 1991
48
Dependence Analysis for Subscripted Variables and Its Applic.. (context) - Allen - 1983
45
Symbolic dependence analysis for high-performance paralleliz.. (context) - Haghighat, Polychronopoulos - 1990
44
Fortran at ten gigaflops: The Connection Machine convolution.. (context) - Bromley, Heller et al. - 1991
44
A Global Approach to Detection of Parallelism (context) - Callahan - 1987
43
The impact of interprocedural analysis and optimization in t.. (context) - Cooper, Kennedy et al. - 1986
43
Automatic loop interchange (context) - Allen, Kennedy - 1984
42
Program improvement by source-to-source transformations (context) - Loveman - 1977
42
Loop skewing: The wavefront method revisited (context) - Wolfe - 1986
40
An experiment with inline substitution
- Cooper, Hall et al. - 1991
37
Procedure cloning
- Cooper, Hall et al. - 1992
35
The impact of synchronization and granularity on parallel sy..
- Chen, Su et al. - 1990
35
Incremental data flow analysis algorithms (context) - Ryder, Paull - 1988
34
A theory of loop permutations (context) - Banerjee - 1990
34
An empirical investigation of the effectiveness of and limit.. (context) - Singh, Hennessy - 1991
33
Blocking linear algebra codes for memory hierarchies
- Carr, Kennedy - 1989
33
The structure of Parafrase-2: An advanced parallelizing comp.. (context) - Polychronopoulos, Girkar et al. - 1990
33
A comparison of programming models for shared memory multipr.. (context) - Lin, Snyder - 1990
31
Incremental Dependence Analysis (context) - Rosene - 1990
31
and inline expansion (context) - Allen, Johnson et al. - 1990
30
Improving the Performance of Virtual Memory Computers (context) - Abu-Sufah - 1979
30
Analysis of programs for parallel processing, IEEE Transactions on Electronic Computers (context) - Bernstein - 1966
28
What's in a name? or the value of renaming for parallelism detection and stora.. (context) - Cytron, Ferrante et al. - 1987
27
Incremental data flow analysis in a structured program edito.. (context) - Zadeck - 1984
26
The ParaScope Editor: An interactive parallel programming to.. (context) - Balasundaram, Kennedy et al. - 1989
25
Experiences using the ParaScope Editor: an interactive paral..
- Hall, Harvey et al. - 1993
25
Interprocedural optimization: Eliminating unnecessary recomp.. (context) - Burke, Cooper et al. - 1990
24
Guide to Parallel Programming on Sequent Computer Systems (context) - Osterhaug - 1989
24
An effectiveness study of parallelizing compiler techniques (context) - Eigenmann, Blume - 1991
23
Loop distribution with arbitrary control flow
- Kennedy, McKinley - 1990
23
Parallelism Exposure and Exploitation in Programs (context) - Muraoka - 1971
22
Experiences using control dependence in PTRAN (context) - Cytron, Ferrante et al. - 1990
22
Analysis and transformation in the ParaScope Editor
- Kennedy, McKinley et al. - 1991
22
Parallel algorithms for banded linear systems (context) - Wright - 1991
22
Vectorizing compilers: A test suite and results (context) - Callahan, Dongarra et al. - 1988
21
Efficient interprocedural analysis for program restructuring.. (context) - Li, Yew - 1988
21
Control and Data Dependence for Program Transformation (context) - Towle - 1976
19
Stable parallel algorithms for two-point boundary value prob..
- Wright - 1992
19
Goal-directed interprocedural optimization (context) - Briggs, Cooper et al. - 1990
19
Relaxing SIMD control flow constraints using loop transforma..
- Hanxleden, Kennedy - 1992
18
Automatic software cache coherence through vectorization
- Darnell, Kennedy et al. - 1992
17
Interprocedural optimization: Eliminating unnecessary recomp.. (context) - Cooper, Kennedy et al. - 1986
17
Experience with interprocedural analysis of array side effec.. (context) - Havlak, Kennedy - 1990
15
The structure of an advanced retargetable vectorizer (context) - Kuck, Kuhn et al. - 1984
15
The structure of an advanced retargetable vectorizer (context) - Kuck, Kuhn et al. - 1980
14
Stream processing (context) - Goldberg, Paige - 1984
14
The program dependence graph and vectorization (context) - Baxter, Bauer - 1989
12
SIGMACS: A programmable programming environment (context) - Shei, Gannon - 1990
12
Advanced tools and techniques for automatic parallelization (context) - Kremer, Zima et al. - 1988
11
Dependence analysis of arrays subscripted by index arrays (context) - McKinley - 1990
11
Maximizing parallelism via loop transformations (context) - Wolf, Lam - 1990
11
PAT - an interactive Fortran parallelizing assistant tool (context) - Smith, Appelbe - 1988
10
Static performance estimation in a parallelizing compiler (context) - Kennedy, McIntosh et al. - 1991
10
Interprocedural optimization: Experimental results (context) - Richardson, Ganapathi - 1989
10
Flow diagrams, Turing machines and languages with only two formation rules (context) - Bohm, Jacopini - 1966
10
Semi-automatic domain decomposition (context) - Wolfe - 1989
10
PCF Fortran: Language Definition (context) - Leasure - 1990
9
On linearizing parallel code (context) - Ferrante, Mace - 1985
9
A static performance estimator in the Fortran D programming .. (context) - Balasundaram, Fox et al. - 1992
8
Incremental dependence analysis for interactive parallelizat.. (context) - Smith, Appelbe et al. - 1990
8
Analysis of Synchronization in a Parallel Programming Enviro.. (context) - Subhlok - 1990
8
Interprocedural analysis and program restructuring for paral.. (context) - Li, Yew - 1988
7
A vectorizing Fortran compiler (context) - Scarborough, Kolsky - 1986
7
Faust: An environment for programming parallel scientific ap.. (context) - Guarna, Gannon et al. - 1988
6
An implementation of a parallel primal-dual interior point m.. (context) - Lustig, Li - 1992
6
A general-purpose parallel algorithm for unconstrained optim.. (context) - Nash, Sofer - 1991
6
Generating sequential code from parallel code (context) - Ferrante, Mace et al. - 1988
6
Analysis and transformation of programs for parallel computa.. (context) - Kuck, Kuhn et al. - 1980
6
An incremental algorithm for software analysis (context) - Ryder, Carroll - 1986
6
A dynamic study of vectorization in PFC (context) - Callahan, Kennedy et al. - 1989
5
The ParaScope Editor: User interface goals (context) - Fletcher, Kennedy et al. - 1990
5
Partitioned dynamic programming for optimal control (context) - Wright - 1991
4
An inline subroutine expander for Parafrase (context) - Huson - 1982
4
A framework for detecting useful parallelism (context) - Allen, Burke et al. - 1988
4
Exact dependence analysis using data access descriptors (context) - Huelsbergen, Hahn et al. - 1990
4
BTN: software for parallel unconstrained optimization (context) - Nash, Sofer - 1992
4
An interactive conversion of sequential to multitasking Fort.. (context) - Smith, Appelbe - 1989
4
Private communication (context) - Allen - 1990
2
Supercomputer Software Newsletter (context) - Callahan, Kalem - 1987
2
Improving data locality (context) - Kennedy, McKinley et al. - 1992
2
Finding large-grain parallelism in loops with serial control.. (context) - Dietz - 1988
Documents on the same site (http://softlib.rice.edu/CRPC/softlib/TRs_online.html):
Experiences on Data-Parallel Programming - Clark, von Hanxleden, Kennedy (1994)
A Priori Estimates for Mixed Finite Element.. - Cowsar, Dupont, Wheeler
An Empirical Evaluation of Dependence Analysis in Parallel Program .. - Monk (1995)