Abstract: With the conventional superscalar approach delivering... effective but quite modest hardware that supports communication and synchronization of registers between on-chip processors. Furthermore, we propose hardware support that handles true memory dependence violations when the application is run in a speculative execution mode. We also present the compiler support that enables automatic identification of threads from sequential binaries. We show how the software-hardware approach enables...
Similar documents (at the sentence level):
11.0%: A Clustered Approach to Multithreaded Processors - Krishnan, Torrellas (1998)
10.7%: Hardware and Software Support for Speculative Execution of.. - Krishnan, Torrellas (1998)
9.5%: A Direct-Execution Framework for Fast and Accurate.. - Krishnan, Torrellas (1998)
Active bibliography (related documents):
0.8: Running Parallel Applications on an MP With.. - Krishnan, Zhang..
0.4: Exploiting Thread-Level Parallelism On . . . - Lo (1998)
0.3: Efficient Use of Processing Transistors for Larger On-Chip.. - Krishnan, Torrellas (1997)
Similar documents based on text:
0.1: The Impact Of Workload On The Dependability Of Microprocessors.. - Krishnan (1996)
0.1: Executing Sequential Binaries on a Clustered Multithreaded.. - Krishnan, Torrellas (1998)
0.1: A Speculation-Based Approach For Performance And.. - Huang, Kalbarczyk et al. (1998)
BibTeX entry:
@techreport{ krishnan98speculative,
author = "Venkata S. Krishnan",
title = "Speculative Multithreading Architectures",
number = "UIUCDCS-R-98-2048",
year = "1998",
url = "citeseer.ist.psu.edu/krishnan98speculative.html" }
Citations (may not include all citations):
866
Techniques and Tools (context) - Aho, Sethi et al. - 1986
386
ATOM: A System for Building Customized Program Analysis Tool.. (context) - Srivastava, Eustace - 1994
357
The Directory-based Cache Coherence Protocol for the DASH Mul.. (context) - Lenoski, Laudon et al. - 1990
353
The SPLASH-2 Programs: Characterization and Methodological C..
- Woo, Ohara et al. - 1995
320
MediaBench: A Tool for Evaluating and Synthesizing Multimedi..
- Lee, Potkonjak et al. - 1997
299
Dependence Analysis for Supercomputing (context) - Banerjee - 1988
283
Optimizing Supercompilers for Supercomputers (context) - Wolfe - 1989
275
Shade: A Fast Instruction-Set Simulator for Execution Prolin..
- Cmelik, Keppel - 1994
269
Multiscalar Processors
- Sohi, Breach et al. - 1995
251
Simultaneous Multithreading: Maximizing on-chip Parallelism
- Tullsen, Eggers et al. - 1995
230
Limits of Instruction Level Parallelism
- Wall - 1991
197
Maximizing Multiprocessor Performance with the SUIF Compiler
- Hall, Anderson et al. - 1996
193
Superscalar Microprocessor Design (context) - Johnson - 1990
190
Value locality and load value prediction
- Lipasti, Wilkerson et al. - 1996
186
Exploiting Choice: Instruction Fetch and Issue on an Impleme..
- Tullsen, Eggers et al. - 1996
183
Trace Cache: A Low Latency Approach to High Bandwidth Instru..
- Rotenberg, Bennett et al. - 1996
160
IMPACT: An Architectural Framework for Multiple-Instruction-..
- Chang, Mahlke et al. - 1991
159
The LRPD Test: Speculative Run-Time Parallelization of Loops..
- Rauchwerger, Padua - 1995
157
Limits of Control Flow on Parallelism
- Lam, Wilson - 1992
157
Architecture and applications of the HEP multiprocessor comp.. (context) - Smith - 1984
150
PROTEUS: A High-Performance Parallel-Architecture Simulator
- Brewer, Dellarocas et al. - 1991
147
Alternative Implementations of Two-Level Adaptive Branch Pre..
- Yeh, Patt - 1992
136
Parallel Programming with Polaris (context) - Blume, Doallo et al. - 1996
125
Trace Processors
- Rotenberg, Jacobson et al. - 1997
110
Improving the Accuracy of Dynamic Branch Prediction Using Br.. (context) - Pan, So et al. - 1992
109
Multiprocessor Simulation and Tracing using Tango (context) - Davis, Goldschmidt et al. - 1991
109
Advanced Compiler Design Implementation (context) - Muchnick - 1997
100
Dynamic Instruction Reuse
- Sodani, Sohi - 1997
97
The Case for a Single-Chip Multiprocessor (context) - Olukotun, Nayfeh et al. - 1996
95
Pipeline Gating: Speculation Control For Energy Reduction
- Manne, Klauser et al. - 1998
93
Optimization of Instruction Fetch Mechanisms for High Issue ..
- Conte, Menezes et al. - 1995
87
Complete Computer System Simulation: The SimOS Approach
- Rosenblum, Herrod et al. - 1995
76
Will Physical Scalability Sabotage Performance Gains
- Matzke - 1997
74
Speculative Versioning Cache
- Gopal, Vijaykumar et al. - 1998
70
The Superthreaded Architecture: Thread Pipelining with Run-T..
- Tsai, Yew - 1996
65
Interconnect Scaling - The Real Limiter to high-performance .. (context) - Bohr - 1995
64
Trace-Driven Memory Simulation: A Survey
- Uhlig, Mudge - 1997
64
Missing the Memory Wall: The Case for Processor/Memory Integ.. (context) - Saulsbury, Pong et al. - 1996
60
Software and Hardware for Exploiting Speculative Parallelism..
- Oplinger, Heine et al. - 1997
59
The Microarchitecture of Superscalar Processors
- Smith, Sohi - 1995
56
Machine Multicomputer (context) - Fillo, Keckler et al. - 1995
54
Digital 21264 Sets New Standard (context) - Gwennap - 1996
54
Scalable Processors in the Billion-Transistor Era: IRAM
- Kozyrakis, Perissakis et al. - 1997
53
Processor Coupling: Integrating Compile time and Runtime Sch..
- Keckler, Dally - 1992
53
Improving Superscalar Instruction Dispatch and Issue by Expl..
- Vajapeyam, Mitra - 1997
49
The Impact of Instruction-Level Parallelism on Multiprocesso..
- Pai, Ranganathan et al. - 1997
48
Portable Programs for Parallel Processors (context) - Lusk - 1987
44
Hardware for Speculative Run-Time Parallelization in Distrib..
- Zhang, Rauchwerger et al. - 1998
43
A Single-Chip Multiprocessor
- Hammond, Nayfeh et al. - 1997
43
A Comparison of Full and Partial Predicated Execution Suppor..
- Mahlke, Hank et al. - 1995
38
Evaluation of Design Alternatives for a Multiprocessor Micro.. (context) - Nayfeh, Hammond et al. - 1996
36
Architecture: Compiler-assisted Fine-grained Multithreading (context) - Dubey, O'Brien et al. - 1995
36
Characterizing the Caching and Synchronization Performance o.. (context) - Torrellas, Gupta et al. - 1992
35
A Comparison of Trace-Sampling Techniques for Multi-Megabyte..
- Kessler, Hill et al. - 1994
33
Wisconsin Wind Tunnel II: A Fast and Portable Parallel Archi..
- Mukherjee, Reinhardt et al. - 1997
32
RSIM: An Execution-Driven Simulator for ILP-Based Shared-Mem..
- Pai, Ranganathan et al. - 1997
32
Trace Processors: Moving to Fourth Generation Microarchitect.. (context) - Smith, Vajapeyam - 1997
31
A Model for Estimating Trace-Sample Miss Ratios
- Wood, Hill et al. - 1991
31
Execution-Driven Tools for Parallel Simulation of Parallel A..
- Poulsen, Yew - 1993
30
Alternative implementations of hybrid branch predictors
- Chang, Hao et al. - 1995
29
The Augmint Multiprocessor Simulation Toolkit for Intel x86 ..
- Nguyen, Michael et al. - 1996
28
The Anatomy of the Register File in a Multiscalar Processor
- Breach, Vijaykumar et al. - 1994
24
Computer Science Department (context) - Veenstra, Fowler et al. - 1993
23
ARB: A Hardware Mechanism for Dynamic Memory Disambiguation (context) - Franklin, Sohi - 1996
21
Performance Study of a Multithreaded Superscalar Microproces..
- Gulati, Emer - 1996
21
Billion-Transistor Architectures
- Burger, Goodman - 1997
20
A benchmark evaluation of a Multithreaded RISC Processor Arc.. (context) - Prasadh, Wu - 1991
20
Converting Thread-Level Parallelism Into Instruction-Level P.. (context) - Lo, Eggers et al. - 1997
19
HP Make EPIC Disclosure (context) - Gwennap - 1997
18
The Concurrent Execution of Multiple Instruction Streams on .. (context) - Jr, Torng - 1991
16
PA-8500: The Continuing Evolution of the PA-8000 Family (context) - Lesartre, Hunt - 1997
16
Complexity-Effective Superscalar Processors (context) - Palacharla, Jouppi et al. - 1997
16
Superscalar Instruction Execution in the 21164 Alpha Micropr.. (context) - Edmondson, Rubinfeld et al. - 1995
15
A Critique of Trace-Driven Simulation for Shared-Memory Mult.. (context) - Bitar - 1990
13
Execution-Driven Simulation of Multiprocessors: Address and ..
- Dwarkadas, Jump et al. - 1994
13
The Potential for Using Thread-Level Data Speculation to Fac.. (context) - Steffan, Mowry - 1998
12
Silicon Trends and Limits for Advanced Microprocessors (context) - Bohr - 1998
12
Microprocessor Chipset (context) - Technologies - 1994
11
Center for Integrated Systems (context) - Smith, Pixie et al. - 1991
10
The Design of the Microarchitecture of UltraSPARC (context) - Tremblay, Greenley et al. - 1995
10
a High Performance Restricted Data Flow Architecture Having .. (context) - Hwu, Patt - 1986
8
The Multiflow Trace Scheduling Compiler (context) - Lowney, Freudenberger et al. - 1993
8
A Clustered Approach to Multithreaded Processors
- Krishnan, Torrellas - 1998
7
BiCMOS Processor with Dynamic Execution (context) - Colwell, Steck - 1995
6
Executing Sequential Binaries on a Clustered Multithreaded A..
- Krishnan, Torrellas - 1998
4
Microprocessor Forum (context) - Alpha, Out-of-Order - 1996
4
Analysis for Streamlining Inter-Operation Communication in F.. (context) - Franklin, Sohi - 1992
3
The PowerPC 620 Microprocessor (context) - Levitan, Thomas et al. - 1995
3
Intel's Merced and IA-64: Technology and Market Forecast (context) - Gwennap - 1998
3
Systematic Prototyping of Superscalar Computer Architectures
- Conte, Hwu - 1992
2
The Perfect Club Benchmarks (context) - Berry - 1989
1
ACM Transactions on Computer Systems (context) - Uhlig, Nagle et al. - 1995
1
MINT: A Front End for Efficient Simulation of Shared-Memory Mul.. (context) - Veenstra, Fowler - 1994
1
Effective Scheduling Technique for VLIW Processors (context) - Lam, An - 1988
Documents on the same site (http://iacoma.cs.uiuc.edu/multithreading/index.html):
Efficient Use of Processing Transistors for Larger On-Chip.. - Krishnan, Torrellas (1997)
Executing Sequential Binaries on a Clustered Multithreaded.. - Krishnan, Torrellas (1998)
Quantifying the Benefits of SPECint Distant.. - Ortega, Martel..
CiteSeer.IST - Copyright Penn State and NEC