Abstract: Using Compile-Time Analysis and Transformations to Reduce False Sharing on Shared Memory Multiprocessors
by Tor E. Jeremiassen
Chairperson of Supervisory Committee: Professor Susan J. Eggers
Department of Computer Science and Engineering
On bus-based, shared-memory multiprocessors, much of the "unnecessary" bus traffic,
i.e., that which could be eliminated with better processor locality, is coherency overhead
caused by false sharing. False sharing occurs when multiple processors access
(both...
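The abstract's definition is cut off in this record, so the following is only a generic illustration of false sharing, not code from the thesis. It is a minimal sketch assuming a 64-byte cache line and a line-aligned array; `CACHE_LINE_SIZE`, `packed_counters`, `padded_counter`, `cache_line_of`, and `same_line` are hypothetical names introduced here. It shows that adjacent per-processor counters in a packed array fall in the same cache line, while padded counters each occupy their own line.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. Real hardware varies. */
#define CACHE_LINE_SIZE 64

/* Packed layout: counters owned by different processors sit side by
 * side, so several of them share one cache line. A write to any one
 * of them invalidates the line in every other processor's cache. */
struct packed_counters {
    long count[4];
};

/* Padded layout: each counter is padded out to a full cache line, so
 * (given a line-aligned array) no two counters share a line. */
struct padded_counter {
    long count;
    char pad[CACHE_LINE_SIZE - sizeof(long)];
};

/* Index of the cache line that contains the given byte offset,
 * measured from a line-aligned base address. */
static size_t cache_line_of(size_t byte_offset)
{
    return byte_offset / CACHE_LINE_SIZE;
}

/* Returns 1 if elements i and j of a line-aligned array whose elements
 * are elem_size bytes start in the same cache line, 0 otherwise. */
static int same_line(size_t elem_size, size_t i, size_t j)
{
    return cache_line_of(i * elem_size) == cache_line_of(j * elem_size);
}

/* Sanity check on the padded layout under the stated assumption. */
static void check_padded_layout(void)
{
    assert(sizeof(struct padded_counter) == CACHE_LINE_SIZE);
}
```

Under these assumptions, `same_line(sizeof(long), 0, 1)` reports that two neighboring packed counters share a line, whereas `same_line(sizeof(struct padded_counter), 0, 1)` reports that the padded ones do not. Padding trades memory for reduced coherence traffic, so it is usually applied only to data that is written by different processors.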
Context of citations to this paper:
...programming model provides. The most important constraints involve pointers and separate compilation (a full description will appear in [Jer95]). While our model allows for pointers, the full generality of pointers in C is restricted to reduce pointer aliasing of statically...
...has no practical effect. 8. Indirection tables have been successfully used to reduce false sharing on shared memory multiprocessors [Jeremiassen 1995; Jeremiassen and Eggers 1995]. They can be effective in this context because misses in the original array, which is written, are...
Cited by:
Trade-offs Between False Sharing and Aggregation in.. - Amza, Cox, Rajamani, .. (1997)
Improving Performance of Bus-Based Multiprocessors - Anderson (1995)
Maximizing Speedup Through Self-Tuning of Processor.. - Nguyen, Vaswani, Zahorjan (1996)
Active bibliography (related documents):
1.4: Static Analysis of Barrier Synchronization in Explicitly.. - Tor Jeremiassen (1994)
1.2: Reducing False Sharing on Shared Memory Multiprocessors.. - Tor Jeremiassen (1994)
1.1: Reducing False Sharing on Shared Memory Multiprocessors through .. - Jeremiassen (1994)
Similar documents based on text:
1.4: Lossy Compression of Scientific Data via Wavelets and Vector.. - Goldschneider (1997)
0.2: Analysis of Mechanical Properties of Fiber-reinforced Composites.. - Zeman (1999)
0.0: Computer-Generated Pen-and-Ink Illustration - Winkenbach (1996)
Related documents from co-citation:
3: Performance of Multiprogrammed Multiprocessor Scheduling Algorithms - Leutenegger, Vernon - 1990
3: Scheduling in Multiprogrammed Parallel Systems - Majumdar, Eager et al. - 1988
3: Global Optimization - Torn, Antanas - 1989
BibTeX entry:
T. E. Jeremiassen. Using Compile-Time Analysis and Transformation to Reduce False Sharing on Shared-Memory Multiprocessors. PhD thesis, University of Washington, 1995. http://citeseer.ist.psu.edu/jeremiassen95using.html
@phdthesis{ jeremiassen95using,
author = "T.E. Jeremiassen",
title = "Using Compile Time Analysis and Transformations to Reduce False Sharing on Shared Memory Multiprocessors",
school = "University of Washington",
month = "June",
year = "1995",
url = "http://citeseer.ist.psu.edu/jeremiassen95using.html" }
Citations (may not include all citations):
1399: Compilers Principles - Aho, Sethi et al. - 1988
496: SPLASH: Stanford Parallel Applications for Shared-Memory - Singh, Weber et al. - 1991
478: The Stanford DASH Multiprocessor - Lenoski, Laudon et al. - 1992
468: Memory consistency and event ordering in scalable sharedmemo.. - Gharachorloo, Lenoski et al. - 1990
252: Analysis of pointers and structures - Chase, Wegman et al. - 1990
225: Flow Analysis of Computer Programs - Hecht - 1977
212: APRIL: A processor architecture for multiprocessing - Agarwal, Lim et al. - 1990
149: An implementation of interprocedural bounded regular section.. - Havlak, Kennedy - 1991
146: Parallelizing programs with recursive data structures - Hendren, Nicolau - 1990
141: Asynchronous distributed simulation via a sequence of parall.. - Chandy, Misra - 1981
99: Dependence analysis for pointer variables - Horwitz, Pfeiffer et al. - 1989
94: Optimizing for parallelism and data locality - Kennedy, McKinley - 1992
93: Global data flow analysis and iterative algorithms - Kam, Ullman - 1976
91: An efficient way to find the side effects of procedure calls.. - Banning - 1979
86: A precise inter-procedural data flow algorithm - Myers - 1981
86: A general-purpose algorithm for analyzing concurrent program.. - Taylor - 1983
70: Simple but effective techniques for NUMA memory management - Bolosky, Fitzgerald et al. - 1989
69: Interprocedural modification side effect analysis with point.. - Landi, Ryder et al. - 1993
68: Interprocedural data flow analysis in the presence of pointe.. - Weihl - 1980
66: Implementing a cache consistency protocol - Katz, Eggers et al. - 1985
66: Interprocedural side-effect analysis in linear time - Cooper, Kennedy - 1988
65: Eliminating false sharing - Eggers, Jeremiassen - 1991
62: The symmetry multiprocessor system - Lovett, Thakkar - 1988
61: The effect of sharing on the cache and bus performance of pa.. - Eggers, Katz - 1989
60: KSR-1 Principles of Operation - Research - 1992
60: Techniques for efficient inline tracing on a shared-memory m.. - Eggers, Keppel et al. - 1990
59: Analysis of cache invalidation patterns in multiprocessors - Weber, Gupta - 1989
57: The detection and elimination of useless misses in multiproc.. - Dubois, Skeppstedt et al. - 1993
51: Fast interprocedural alias analysis - Cooper, Kennedy - 1989
49: False sharing and spatial locality in multiprocessor caches - Torrellas, Lam et al. - 1994
48: Non-concurrency analysis - Masticola, Ryder - 1993
48: Evaluating the performance of four snooping cache coherency .. - Eggers, Katz - 1989
46: Analysis of Event Synchronization in a Parallel Programming .. - Callahan, Kennedy et al. - 1990
44: A practical interprocedural data flow analysis algorithm - Barth - 1978
41: Lifetime analysis of dynamically allocated objects - Ruggieri, Murtagh - 1988
40: Limitations of cache prefetching on a bus-based multiprocess.. - Tullsen, Eggers - 1993
40: Adjustable block size coherent caches - Dubnicki, LeBlanc - 1992
39: Minimum distance: A method for partitioning recurrences for .. - Peir, Cytron - 1989
36: Memory-reference characteristics of multiprocessor applicati.. - Agarwal, Gupta - 1988
32: and scheduling programs on multiprocessors - Polychronopoulos, Girkar et al. - 1989
30: Reduction of cache coherence overhead by compiler data layou.. - Ju, Dietz - 1991
29: Shared data placement optimizations to reduce multiprocessor.. - Torrellas, Lam et al. - 1990
27: The vmp multiprocessor: Initial experience - Cheriton, Gupta et al. - 1988
26: A Practical Algorithm for Static Analysis of Parallel Progra.. - McDowell - 1989
22: Simulation analysis of data sharing in shared memory multipr.. - Eggers - 1989
20: Impact of sharing-based thread placement on multithreaded ar.. - Thekkath, Eggers - 1994
20: Multiprocessor cache design considerations - Lee, Yew et al. - 1987
19: A parallel adaptive fast multipole method - Singh, Holt et al. - 1993
19: Logic verification algorithms and their parallel implementat.. - Ma, Devadas et al. - 1987
16: Software caching on cache-coherent multiprocessors - Bianchini, LeBlanc - 1992
15: Topological optimization of multiple level array logic - Devadas, Newton - 1987
14: and verification of the SGI Challenge multiprocessor - Galles, Williams et al. - 1994
13: Restructuring lisp programs for concurrent execution - Larus, Hilfinger - 1988
11: Static analysis of barrier synchronization in explicitly par.. - Jeremiassen, Eggers - 1994
10: The impact of parallel loop scheduling strategies on prefetc.. - Lilja - 1994
10: Computing per-process summary side-effect information - Jeremiassen, Eggers - 1992
10: Cache protocols with partial block invalidations - Chen, Dubois - 1993
9: An evaluation of cache coherence solutions in shared-bus mul.. - Archibald, Baer - 1986
9: Effects of program parallelization and stripmining transform.. - Gupta, Padua - 1991
9: Simplicity versus accuracy in a model of cache coherency ove.. - Eggers - 1991
8: Experiments with shared virtual memory on an iPSC/2 hypercub.. - Priol, Lahjomri - 1992
7: Computing reachable states of parallel programs - Helmbold, McDowell - 1991
6: Toward a compile-time methodology for reducing false sharing.. - Granston - 1993
5: A parallel maxflow implementation - Carrasco - 1988
4: The next-generation SPARC multiprocessing system architectur.. - Frailong, Cekleov et al. - 1993
3: Predicting program behavior using real or estimated profiles - Wall - 1991
2: The effect of barrier synchronization and scheduling overhea.. - Beckmann, Polychronopoulos - 1989
1: University of California - Gibson, Technical - 1988
1: Technical Report UCB/Computer Science Dpt - Fujimoto, of et al. - 1983
1: Visualization algorithms: Performance and architectural impl.. - Singh, Gupta et al. - 1994
Documents on the same site (http://cm.bell-labs.com/cm/cs/papers.html):
Local Control over Filtered WWW Access - Baker, Grosse (1995)
Parameterized Pattern Matching by Boyer-Moore-type Algorithms - Baker (1995)
Determining Subspace Information from the Hessian of a Barrier.. - Wright (1992)
CiteSeer.IST - Copyright Penn State and NEC