Using Interaction Costs for Microarchitectural Bottleneck Analysis
Brian A. Fields, Rastislav Bodik, Mark D. Hill, Chris J. Newburn




From: microarch.org/micro36/h...program
Abstract: Attacking bottlenecks in modern processors is difficult because many microarchitectural events overlap with each other. This parallelism makes it difficult to both (a) assign a cost to an event (e.g., to one of two overlapping cache misses) and (b) assign blame for each cycle (e.g., for a cycle where many, overlapping resources are active). This paper introduces a new model for understanding event costs to facilitate processor design and optimization.
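
The difficulty in point (a) can be illustrated with a toy sketch. It is not taken from the paper's text: the abstract does not define its cost model, so the sketch assumes that the cost of a set of events is the number of cycles saved when those events are idealized, and that the interaction cost of two events is the cost of the pair minus the two individual costs. The function names, the fully-overlapping timing model, and the latencies are all hypothetical.

```python
def exec_time(base_cycles, miss_latencies, idealized=frozenset()):
    """Toy timing model (an assumption): every cache miss overlaps fully
    with the others, so the slowest remaining miss sets the stall time."""
    remaining = [lat for name, lat in miss_latencies.items()
                 if name not in idealized]
    return base_cycles + max(remaining, default=0)


def cost(base_cycles, miss_latencies, events):
    """Cycles saved when the given events are idealized (latency removed)."""
    baseline = exec_time(base_cycles, miss_latencies)
    improved = exec_time(base_cycles, miss_latencies, frozenset(events))
    return baseline - improved


def interaction_cost(base_cycles, miss_latencies, a, b):
    """Assumed definition: cost of the pair minus the individual costs."""
    return (cost(base_cycles, miss_latencies, {a, b})
            - cost(base_cycles, miss_latencies, {a})
            - cost(base_cycles, miss_latencies, {b}))


# Two overlapping 100-cycle misses on top of 10 cycles of other work.
misses = {"miss_a": 100, "miss_b": 100}
print(cost(10, misses, {"miss_a"}))                       # 0
print(cost(10, misses, {"miss_b"}))                       # 0
print(cost(10, misses, {"miss_a", "miss_b"}))             # 100
print(interaction_cost(10, misses, "miss_a", "miss_b"))   # 100
```

Under these assumptions, removing either miss alone saves nothing because the other still stalls the processor, yet removing both saves 100 cycles. Neither miss can be assigned a meaningful cost in isolation, which is the problem the abstract describes.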

Cited by:
Microarchitecture Evaluation With Floorplanning And Interconnect Pipelining

Active bibliography (related documents):
0.6:   Slack: Maximizing Performance Under Technological Constraints - Fields, Bodik, Hill (2002)
0.6:   Permission to Make Digital Or Hard Copies of All Or Part.. - Personal Or Classroom
0.5:   Quantifying Instruction Criticality - Tune, Tullsen, Calder (2002)

Similar documents based on text:
0.2:   Path-Sensitive, Value-Flow Optimizations of Programs - Bodik (1999)
0.2:   Data Flow Terminology and Representations - Newburn
0.2:   Node Labeling - Newburn (1997)

BibTeX entry:

@misc{ fields-using,
  author = "Brian A. Fields and Rastislav Bodik and Mark D. Hill and Chris J. Newburn",
  title = "Using Interaction Costs for Microarchitectural Bottleneck Analysis",
  url = "citeseer.ist.psu.edu/fields03using.html" }
Citations (may not include all citations):
1575   Computer Architecture: A Quantitative Approach - Hennessy, Patterson - 2002
145   Exceeding the dataflow limit via value prediction - Lipasti, Shen - 1996
121   Continuous profiling: Where have all the cycles gone - Anderson, Berc et al. - 1997
107   Technical Report CS-TR - Burger, Austin et al. - 1997
100   Dynamic instruction reuse - Sodani, Sohi - 1997
91   The impact of architectural trends on operating system perfo.. - Rosenblum, Bugnion et al. - 1995
70   Selective value prediction - Calder, Reinman et al. - 1999
67   ProfileMe: Hardware support for instruction-level profiling .. - Dean, Hicks et al. - 1997
59   Performance analysis using the MIPS R10000 performance count.. - Zagha, Larson et al. - 1996
49   The impact of instruction-level parallelism on multiprocesso.. - Pai, Ranganathan et al. - 1997
38   A scalable approach to thread-level speculation - Steffan, Colohan et al. - 2000
33   Load latency tolerance in dynamically scheduled processors - Srinivasan, Lebeck - 1998
33   Whole-genome random sequencing and assembly of haemophilus-i.. - Fleischmann - 1995
32   Increasing processor performance by implementing deeper pipe.. - Sprangle, Carmean - 2002
30   Performance of database workloads on shared-memory systems w.. - Ranganathan, Gharachorloo et al. - 1998
30   Focusing processor policies via critical-path prediction - Fields, Rubin et al. - 2001
24   Dynamic prediction of critical path instructions - Tune, Liang et al. - 2001
22   Improving trace cache effectiveness with branch promotion an.. - Patel, Evers et al. - 1998
19   Speculative lock elision: Enabling highly concurrent multith.. - Rajwar, Goodman - 2001
17   The optimal logic depth per pipeline stage is 6 to 8 FO4 inv.. - Hrishikesh, Jouppi et al. - 2002
16   Energy-efficient processor design using multiple clock domai.. - Semeraro, Magklis et al. - 2002
16   Intel Itanium 2 processor reference manual for software deve.. - Corporation - 2003
13   Dynamic instruction scheduling slack - Casmira, Grunwald - 2000
12   Slack: Maximizing performance under technological constraint.. - Fields, Bodik et al. - 2002
11   The non-critical buffer: Using load latency tolerance to imp.. - Fisk, Bahar - 1999
10   Loose loops sink chips - Borch, Tune et al. - 2002
10   The optimum pipeline depth for a microprocessor - Hartstein, Puzak - 2002
9   Joint local and global hardware adaptations for energy - Sasanka, Hughes et al. - 2002
8   Performance characterization of a hardware mechanism for dyn.. - Fahs, Bose et al. - 2001
8   Hierarchical performance modeling with MACS: A case study of.. - Boyd, Davidson - 1993
7   Pentium 4 performance-monitoring features - Sprunt - 2002
5   Non-vital loads - Rakvic, Black et al. - 2002
3   A statistically rigorous approach for improving simulation m.. - Yi, Lilja et al. - 2003
3   Dz ching Ju - Srinivasan - 2001
2   Quantifying instruction criticality - Tune, Tullsen et al. - 2002
2   Reducing power Symposium on Microarchitecture - Seng, Tune et al. - 2001
1   The Art of Computer Systems Performance Analysis - Jain - 1991
1   Intel Pentium 4 processor manual - Corporation

Documents on the same site (http://www.microarch.org/micro36/html/program.html):
Fast Secure Processor for Inhibiting Software Piracy and.. - Jun Yang Youtao (2003)
Fast Path-Based Neural Branch Prediction - Daniel Jimenez (2003)
The Performance of Runtime Data Cache Prefetching in a.. - Optimization System Jiwei


CiteSeer.IST - Copyright Penn State and NEC