Abstract: This paper proposes a hardware mechanism for reducing
the coherency overhead incurred by scientific computations
on DSM systems. A first phase aims at detecting regular
patterns (called streams) of coherency events (such as
requests for exclusive access, shared access, or invalidation)
in the address space.
Once a stream is detected at the loop level, the regularity
of data accesses can be exploited within the loop (spatial
locality) and also between loops (temporal locality). We
present a hardware mechanism...
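
To make the notion of a stream concrete, here is a minimal software sketch of constant-stride detection over the addresses of coherency events. It is an illustration only, not the hardware mechanism of the paper: the names `Stream`, `detect_streams` and `predict_next`, the `min_length` threshold, and the idea that a stream is a run of events with a constant address stride are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stream:
    start: int   # address of the first event in the run
    stride: int  # constant address difference between consecutive events
    length: int  # number of events in the run


def detect_streams(addresses: List[int], min_length: int = 3) -> List[Stream]:
    """Split a sequence of event addresses into maximal constant-stride runs.

    Hypothetical helper: runs shorter than min_length, or with a zero
    stride, are not reported as streams.
    """
    streams: List[Stream] = []
    i = 0
    n = len(addresses)
    while i + 1 < n:
        stride = addresses[i + 1] - addresses[i]
        j = i + 1
        # Extend the run while the address difference stays the same.
        while j + 1 < n and addresses[j + 1] - addresses[j] == stride:
            j += 1
        length = j - i + 1
        if stride != 0 and length >= min_length:
            streams.append(Stream(addresses[i], stride, length))
            i = j + 1  # continue after the detected run
        else:
            i += 1     # no stream starts here; try the next event
    return streams


def predict_next(stream: Stream) -> int:
    """Address the stream would touch next if the regularity continues."""
    return stream.start + stream.stride * stream.length


# Example trace (assumed): two regular runs separated by an isolated event.
trace = [0x1000, 0x1040, 0x1080, 0x10C0, 0x5000, 0x9000, 0x9080, 0x9100]
found = detect_streams(trace)
```

On this assumed trace the sketch reports two streams, one starting at `0x1000` with stride `0x40` and one starting at `0x9000` with stride `0x80`; `predict_next` then gives the address where the next coherency event of each stream would be expected.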
Active bibliography (related documents):
0.3: The Coherence Predictor Cache: A Resource-Efficient and .. - Nilsson, Landin..
0.2: Mechanisms for Efficient Shared-Memory, Lock-Based Synchronization - Kagi (1999)
0.2: Memory Dependence Prediction - Andreas Ioannis Moshovos
Similar documents based on text:
0.3: Branch Strategies to Optimize Decision Trees for.. - Carribault.. (2004)
0.2: Influence of Cross-Interferences on Blocked Loops: A Case.. - Fricker, Temam, Jalby (1995)
0.2: Proxy Cache Coherency and Replacement-Towards a More.. - Krishnamurthy, Wills (1999)
BibTeX entry:
@inproceedings{ acquaviva00hardware,
author = "Jean-Thomas Acquaviva and William Jalby",
title = "Hardware Prediction for Data Coherency of Scientific Codes on {DSM}",
booktitle = "Supercomputing",
year = "2000",
url = "citeseer.ist.psu.edu/393610.html" }
Citations (may not include all citations):
222: The SGI Origin: A CC-NUMA highly scalable server - Laudon, Lenoski - 1997
204: Munin: Distributed Shared Memory Based on Type-Specific Memo.. - Carter, Bennett et al. - 1990
159: The NYU Ultracomputer Designing a MIMD Shared-Memory Paralle.. - Gottlieb, Grishman et al. - 1983
147: Alternative Implementation of Two-Level Adaptive Branch Pred.. - Yeh, Patt - 1992
98: Evaluating stream buffers as secondary cache replacement - Palacharla, Kessler - 1994
66: Parallel Computer Architecture: A Hardware/Software Approach - Culler, Singh et al. - 1998
57: An Argument for Simple COMA - Saulsbury, Wilkinson et al. - 1995
52: An Adaptive Cache Coherence Protocol Optimized for Migratory.. - Stenstrom, Brorsson et al. - 1993
39: Toward scalable Cache Only Memory Architectures - Hagersten - 1992
29: The Augmint Multiprocessor Simulation Toolkit for Intelx86 A.. - Nguyen, Michael et al. - 1996
26: Using Prediction to Accelerate Coherence Protocols - Mukherjee, Hill - 1998
25: Improving ccnuma performance using instruction-based predict.. - Kaxiras, Goodman - 1999
20: Memory Sharing Predictor: The Key to a Speculative Coherent .. - Lai, Falsafi - 1999
12: Effective hardware-based data prefetching for highperformanc.. - Chen, Baer - 1995
10: PRISM: An integrated Architecture for Scalable Shared Memory - Ekanadham, Lim et al. - 1998
8: Benchmarker's Guide to singleProcessor Optimization CRAY TE .. - Anderson, Brooks et al. - 1997
4: Identification and Optimization of Sharing Patterns for Scal.. - Kaxiras - 1998
1: Compiler-based approaches to reduce memory access penalties .. - Skeppstedt - 1997
1: Simple COMA SMP - Saulsbury, Simple et al. - 1995
Documents on the same site (http://www.sc2000.org/techpapr/):
The MicroGrid: a Scientific Tool for Modeling.. - Song, Liu.. (2000)
Scalable Molecular Dynamics for Large Biomolecular Systems - Brunner, Phillips.. (2000)
Tiling Imperfectly-nested Loop Nests - Ahmed, Mateev, Pingali (2000)