  Privatization and Distribution of Arrays: A Preliminary Proposal

by Peng Tu
http://polaris.cs.uiuc.edu/reports/1410.ps.gz

Abstract:

In today's high-performance NUMA (Non-Uniform Memory Architecture) multiprocessors with memory hierarchies or distributed memory, the partitioning and distribution of data associated with parallel computations affect both the amount of parallelism that can be exploited and the amount of data movement in the system. The objective of this research is to study and evaluate compile-time data management techniques that enhance parallelism and improve the locality of memory references in large scientific programs written in Fortran. Our first step is to reduce the amount of shared data through privatization. Privatization is a technique that allocates a separate copy of a shared variable in the private storage of each processor, so that each processor accesses a distinct instance of the variable. Privatization can enhance the inherent parallelism of a program by eliminating memory-related anti- and output dependences. It can also improve the locality of references, since accessing a private variable is inherently local and communication-free. We present our algorithm for array privatization and the results of our experiments on the effectiveness of the algorithm. For the remaining shared data, we introduce a new concept, the placement matrix, and show its application in deriving data alignment and data decomposition to reduce communication. We also incorporate the ratio of communication to computation in our evaluation of different data partitions. Work is continuing on heuristics for data distribution and on the implementation of the tools.
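
The effect of privatization described above can be illustrated with a small sketch. This is not the paper's compile-time algorithm, which targets Fortran; it is a hypothetical Python example in which the names N, M, body, sequential_shared and parallel_privatized are all assumptions made for illustration. A temporary array is fully written before it is read in every iteration, so a single shared copy creates only anti- and output dependences between iterations. Giving each task its own copy (standing in here for a per-processor copy) removes those dependences and lets the iterations run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

N = 8  # number of outer-loop iterations (hypothetical size)
M = 4  # length of the temporary work array (hypothetical size)


def body(i, work):
    """One outer iteration: fill `work`, then reduce it."""
    for j in range(M):
        work[j] = i * j  # every element is written before it is read
    return sum(work)     # reads only values written in this iteration


def sequential_shared():
    # One shared copy of `work` is reused by every iteration. Run in order
    # this is correct, but the reuse creates anti- and output dependences
    # that prevent running iterations concurrently on the same array.
    work = [0] * M
    return [body(i, work) for i in range(N)]


def parallel_privatized(workers=4):
    def task(i):
        # Private copy: each task owns its `work`, so no iteration can
        # overwrite values another iteration is still reading.
        work = [0] * M
        return body(i, work)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, range(N)))
```

Both functions should produce the same list of per-iteration results. The privatized version only works because no iteration reads a value of the array written by an earlier iteration, which is the property a privatizing compiler has to establish before making the transformation.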
