Informing Loads: Enabling Software To Observe And React To Memory Behavior (1995)  (4 citations)
Mark Horowitz, Margaret Martonosi, Todd C. Mowry, Michael D. Smith

Abstract: Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem successfully in specific situations. However, the generality of these software approaches has been limited because current architectures do not provide a fine-grained, low-overhead mechanism to observe memory behavior directly. To fill this need, we propose a new set of memory operations called informing ...

Context of citations to this paper:

...mechanism offers quicker control transfers than current cache miss counters. Our second method (evaluated more fully in an earlier study [HMMS95]) removes the explicit user state for the hit/miss information, but retains the explicit dispatch instruction. In this case, the...

...which can be collected using informing memory operations is the precise miss rate of all memory references. A previous study [10] has demonstrated that per reference miss rates can be captured with low runtime overheads (less than a 25 ) and tolerable data cache...

Cited by:
Compiler Orchestrated Prefetching via Speculation.. - Rabbah.. (2004)
Hybrid Compiler/Hardware Prefetching for Multiprocessors.. - Skeppstedt, Dubois (1997)
Predicting Data Cache Misses in Non-Numeric Applications.. - Chi-Keung Luk (1997)

Similar documents (at the sentence level):
15.2%: Informing Memory Operations: Memory Performance.. - Horowitz.. (1998)

Active bibliography (related documents):
0.7: Integrating Performance Monitoring and Communication in .. - Martonosi, Ofelt.. (1996)
0.4: Informing Memory Operations: Providing Memory Performance.. - Horowitz (1996)
0.2: Improving Balanced Scheduling with Compiler Optimizations that.. - Lo, Eggers (1995)

Similar documents based on text:
0.3: Tuning Memory Performance in Sequential and Parallel Programs - Martonosi, Gupta, Anderson (1995)
0.3: Resume - Ghosh
0.3: Memory Referencing Behavior in Compiler-Parallelized.. - Torrie, Martonosi..

Related documents from co-citation:
4: Design and evaluation of a compiler algorithm for prefetching - Mowry, Lam et al. - 1992
3: Informing memory operations: Providing memory performance feedback in modern pro.. - Horowitz, Martonosi et al. - 1996
2: Performance Tradeoffs with Non-Blocking Loads - Farkas, Jouppi - 1994

BibTeX entry:

M. Horowitz, M. Martonosi, T. C. Mowry, and M. D. Smith. Informing Loads: Enabling Software to Observe and React to Memory Behavior. Stanford CSL Technical Report CSL-TR-95-673. Stanford University. July 1995. http://citeseer.ist.psu.edu/article/horowitz95informing.html

@techreport{horowitz95informing,
    author      = "Mark Horowitz and Margaret Martonosi and Todd C. Mowry and Michael D. Smith",
    title       = "Informing Loads: Enabling Software to Observe and React to Memory Behavior",
    institution = "Stanford University",
    number      = "CSL-TR-95-673",
    month       = jul,
    pages       = "23",
    year        = "1995",
    url         = "http://citeseer.ist.psu.edu/article/horowitz95informing.html"
}

Documents on the same site (http://www.cs.cmu.edu/~tcm/Papers.html):
Predicting Data Cache Misses in Non-Numeric Applications.. - Mowry, Luk (1997)
Automatic Compiler-Inserted I/O Prefetching for.. - Mowry, Demke, Krieger (1996)
Cooperative Prefetching: Compiler and Hardware Support for.. - Luk, Mowry (1998)