  Locality Analysis for Parallel C Programs (1997) [8 citations — 2 self]

by Yingchun Zhu, Laurie J. Hendren
http://www.sable.mcgill.ca/~hendren/ftp/ying/locality.ps.gz

Abstract:

Many parallel architectures support a memory model where some memory accesses are local, and thus inexpensive, while other memory accesses are remote, and potentially quite expensive. To achieve good parallel performance, it is often necessary to reduce the number of remote memory accesses. This can be done by the programmer, the compiler, or a combination of both. The overall goal is to minimize the work required of the programmer and have the compiler automate the process as much as possible. This paper reports on compiler techniques for decreasing the number of remote memory accesses using locality analysis for a parallel dialect of C called EARTH-C. The locality analysis uses an algorithm inspired by type inference algorithms for fast points-to analysis. The algorithm estimates when an indirect reference via a pointer can be safely assumed to be a local access. The locality inference algorithm is also used to guide the automatic specialization of functions in order to take advantage of locality specific to particular calling contexts. The locality analysis and automatic specialization have been implemented in the EARTH-C compiler, which produces low-level threaded code for the EARTH multithreaded architecture. Experimental results are presented for a set of benchmarks that operate on irregular, dynamically-allocated data structures. The techniques give moderate to significant speedups, and they lessen the burden on the programmer.
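To make the idea of type-inference-style locality inference more concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details the abstract does not give. It only assumes a unification-based scheme in the spirit of almost-linear-time points-to analysis: pointer variables are grouped into equivalence classes with union-find, and a class counts as local only if every member is known to be local. The class name `LocalityClasses` and its methods are hypothetical names chosen for illustration.

```python
class LocalityClasses:
    """Hypothetical sketch: union-find over pointer names.

    Each equivalence class carries one flag. The flag is True only when
    every pointer merged into the class is known to refer to local memory.
    """

    def __init__(self):
        self.parent = {}  # pointer name -> parent name in the union-find forest
        self.local = {}   # root name -> True if the whole class is known local

    def add(self, name, is_local):
        # Register a pointer with its initial (assumed) locality fact.
        self.parent[name] = name
        self.local[name] = is_local

    def find(self, name):
        # Find the class representative, halving paths along the way.
        while self.parent[name] != name:
            self.parent[name] = self.parent[self.parent[name]]
            name = self.parent[name]
        return name

    def unify(self, a, b):
        # Model an assignment such as `a = b` by merging the two classes.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        self.parent[rb] = ra
        # Conservative merge: one possibly-remote member taints the class.
        self.local[ra] = self.local[ra] and self.local[rb]

    def deref_is_local(self, name):
        # An indirect reference via `name` may be treated as a local access
        # only if its whole class is known local.
        return self.local[self.find(name)]


classes = LocalityClasses()
classes.add("p", True)    # assumed: p points to locally allocated data
classes.add("q", True)
classes.add("r", False)   # assumed: r may point to remote memory
classes.unify("p", "q")   # p = q
p_local_before = classes.deref_is_local("p")
classes.unify("q", "r")   # q = r
p_local_after = classes.deref_is_local("p")
```

In this sketch the merge is flow-insensitive and conservative: once a possibly-remote pointer joins a class, every dereference through that class is treated as potentially remote. This trades precision for speed, which is the usual trade-off in unification-based analyses.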
