Abstract: Memory latency is an important bottleneck in system performance
that cannot be adequately solved by hardware alone. Several promising
software techniques have been shown to address this problem
successfully in specific situations. However, the generality of these
software approaches has been limited because current architectures
do not provide a fine-grained, low-overhead mechanism for
observing and reacting to memory behavior directly. To fill this
need, we propose a new class of memory...
Context of citations to this paper:
.... such as informing memory operations, are also based on hardware support and are presently not supported in contemporary architectures [15, 20]. Recent work by Mellor-Crummey et al. uses a modified compiler to insert instrumentation code that extracts a data trace of array...
.... might incorporate mechanisms for this, perhaps similar to those proposed by Horowitz et al. in their paper on Informing Memory Operations [Horowitz96]. We found that evaluating performance on the WildFire system requires new and different methodologies. Benchmarkers...
Cited by:
Balanced Multithreading: Increasing Throughput via a.. - Tune, Kumar, Tullsen, .. (2004)
Compiler Orchestrated Prefetching via Speculation.. - Rabbah.. (2004)
METRIC: Tracking Down Inefficiencies in the Memory .. - Marathe, Mueller, .. (2003)
Similar documents (at the sentence level):
59.9%: Informing Memory Operations: Memory Performance.. - Horowitz.. (1998)
Active bibliography (related documents):
0.4: Informing Loads: Enabling Software To Observe And.. - Horowitz.. (1995)
0.4: Integrating Performance Monitoring and Communication in .. - Martonosi, Ofelt.. (1996)
0.3: The SHRIMP Performance Monitor: Design and Applications - Martonosi, Clark, Mesarina (1996)
Similar documents based on text:
0.2: An Efficient Static Analysis Algorithm to Detect Redundant.. - Cooper, Xu (2002)
0.2: Tuning Memory Performance in Sequential and Parallel Programs - Martonosi, Gupta, Anderson (1995)
0.2: Protocols and Strategies for Optimizing Performance .. - Nieplocha.. (2002)
Related documents from co-citation:
9: The SPLASH-2 programs: Characterization and methodological considerations
- Woo, Ohara et al. - 1995
8: Design and evaluation of a compiler algorithm for prefetching
- Mowry, Lam et al. - 1992
8: Tempest and Typhoon: User-Level Shared Memory
- Reinhardt, Larus et al. - 1994
BibTeX entry:
M. Horowitz, M. Martonosi, T. C. Mowry, and M. D. Smith. Informing memory operations: Providing memory performance feedback in modern processors. In ISCA'96, pages 260--270, May 1996. http://citeseer.ist.psu.edu/horowitz96informing.html
@inproceedings{ horowitz96informing,
author = "Mark Horowitz and Margaret Martonosi and Todd C. Mowry and Michael D. Smith",
title = "Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors",
booktitle = "{ISCA}",
pages = "260-270",
year = "1996",
url = "citeseer.ist.psu.edu/horowitz96informing.html" }
Citations (may not include all citations):
595: Active Messages: A Mechanism for Integrated Communication an.. - von Eicken, Culler et al. - 1992
443: Improving direct-mapped cache performance by the addition of.. - Jouppi - 1990
358: The Tera Computer System - Alverson, Callahan et al. - 1990
344: Design and evaluation of a compiler algorithm for prefetchin.. - Mowry, Lam et al. - 1992
275: Virtual Memory Mapped Network Interface for the SHRIMP Multi.. - Blumrich, Li et al. - 1994
268: Tempest and Typhoon: User-Level Shared Memory - Reinhardt, Larus et al. - 1994
212: The MIT Alewife Machine: Architecture and Performance - Agarwal, Bianchini et al. - 1995
157: Architecture and Applications of the HEP Multiprocessor Comp.. - Smith - 1981
131: Fine-Grain Access Control for Distributed Shared Memory - Schoinas, Falsafi et al. - 1994
109: Cache Profiling and the SPEC Benchmarks: A Case Study - Lebeck, Wood - 1994
107: Software Methods for Improvement of Cache Performance on Sup.. - Porterfield - 1989
80: Avoiding conflict misses dynamically in large direct-mapped .. - Bershad, Lee et al. - 1994
60: Scheduling and page migration for multiprocessor compute ser.. - Chandra, Devine et al. - 1994
50: Mtool: An Integrated System for Performance Debugging Shared.. - Goldberg, Hennessy - 1993
50: Data access microarchitectures for superscalar processors wi.. - Chen, Mahlke et al. - 1991
45: mp Scalable Shared Memory Multiprocessor - Nowatzyk, Aybay et al. - 1994
41: The Impact of Hierarchical Memory Systems on Linear Algebra .. - Gallivan, Jalby et al. - 1987
40: Interleaving: A Multithreading Technique Targeting Multiproc.. - Laudon, Gupta et al. - 1994
40: Sparcle: An Evolutionary Processor Design for Large-Scale.. - Agarwal, Kubiatowicz et al. - 1993
37: Hardware and Software Support for Efficient Exception Handli.. - Thekkath, Levy - 1994
36: Performance Tradeoffs with Non-Blocking Loads - Farkas, Jouppi - 1994
30: Microprocessor User's Manual - Heinrich - 1995
27: New CPU Benchmark Suites from SPEC - Dixit - 1992
26: Performance-Measurement Tools in a Multiprocessor Environmen.. - Burkhart, Millen - 1989
18: on Programming Language Design and Implementation - Wolf, Lam et al. - 1991
17: Tuning Memory Performance of Sequential and Parallel Program.. - Martonosi, Gupta et al. - 1995
14: and Understanding of Matrix Algorithms for Parallel Processo.. - Dongarra, Brewer et al. - 1990
10: Assembly Language Programming - Paul - 1994
9: Automatic Program Transformations for Virtual Memory Compute.. - Abu-Sufah, Kuck et al. - 1979
5: Pentium Secrets - Mathison - 1994
4: Symposium on Computer Architecture - Kuskin, Ofelt et al. - 1994
4: Informing Loads: Enabling Software to Observe and React to M.. - Horowitz, Martonosi et al. - 1995
3: DECChip 21064 RISC Microprocessor Preliminary Data Sheet - Corp - 1992
1: MHz 64-bit Quad-issue CMOS RISC Microprocessor - Edmonson, Rubenfeld et al. - 1995
1: ACM Sigmetrics Conf - Covington, Madala et al. - 1988
1: on Architectural Support for Programming Languages and Opera.. - Thekkath, Eggers et al. - 1994
Documents on the same site (http://www.eecg.toronto.edu/~tcm/Papers.html):
Informing Loads: Enabling Software To Observe And.. - Horowitz.. (1995)
Compiler-Based Prefetching for Recursive Data Structures - Luk (1996)
Tolerating Latency Through Software-Controlled Prefetching in.. - Mowry, Gupta (1991)