A major component of the success of scientific computing is the rapid increase in computing capability. Parallel computing can provide the next great leap in the computational power scientists and engineers need to solve many important problems. The proliferation of parallel architectures, however, discourages users from writing parallel applications. Recent advances in automatic parallelization and parallel languages provide a means for users to write portable programs that protect software investment. Unfortunately, current systems frequently perform poorly because they fail to fully exploit the features of the underlying parallel architecture. For uniprocessors, compilers have been quite successful in mapping machine-independent programs (e.g., Fortran, C) down to the complex features of today's microprocessors. I believe that compilers are also well suited for customizing portable parallel programs. The objective of my research is to develop compilation techniques to support efficient machine-independent programming of high-performance multiprocessors. I plan to implement and evaluate the effectiveness of these techniques in COSMIC, a Communication-Optimizing, Shared-Memory Integrated Compiler. COSMIC will target a shared-memory programming model because it provides the best combination of flexibility and performance. Experiments show that simply exploiting parallelism is no longer sufficient for achieving the best performance, mainly because the cost of interprocessor communication is too great compared to computation and local memory