Speculative Multithreading Architectures (1998)
Venkata S. Krishnan

Abstract: With the conventional superscalar approach delivering... effective but quite modest hardware that supports communication and synchronization of registers between on-chip processors. Furthermore, we propose hardware support that handles true memory dependence violations when the application is run in a speculative execution mode. We also present the compiler support that enables automatic identification of threads from sequential binaries. We show how the software-hardware approach enables...

Similar documents (at the sentence level):
11.0%:   A Clustered Approach to Multithreaded Processors - Krishnan, Torrellas (1998)
10.7%:   Hardware and Software Support for Speculative Execution of.. - Krishnan, Torrellas (1998)
9.5%:   A Direct-Execution Framework for Fast and Accurate.. - Krishnan, Torrellas (1998)

Active bibliography (related documents):
0.8:   Running Parallel Applications on an MP With.. - Krishnan, Zhang..
0.4:   Exploiting Thread-Level Parallelism On . . . - Lo (1998)
0.3:   Efficient Use of Processing Transistors for Larger On-Chip.. - Krishnan, Torrellas (1997)

Similar documents based on text:
0.1:   The Impact Of Workload On The Dependability Of Microprocessors.. - Krishnan (1996)
0.1:   Executing Sequential Binaries on a Clustered Multithreaded.. - Krishnan, Torrellas (1998)
0.1:   A Speculation-Based Approach For Performance And.. - Huang, Kalbarczyk, al. (1998)

BibTeX entry:

@techreport{ krishnan98speculative,
    author = "Venkata S. Krishnan",
    title = "Speculative Multithreading Architectures",
    number = "UIUCDCS-R-98-2048",
    year = "1998",
    url = "citeseer.ist.psu.edu/krishnan98speculative.html" }
Citations (may not include all citations):
866   Techniques and Tools (context) - Aho, Sethi et al. - 1986
386   ATOM: A System for Building Customized Program Analysis Tool.. (context) - Srivastava, Eustace - 1994
357   The Directory-Based Cache Coherence Protocol for the DASH Mul.. (context) - Lenoski, Laudon et al. - 1990
353   The SPLASH-2 Programs: Characterization and Methodological C.. - Woo, Ohara et al. - 1995
320   MediaBench: A Tool for Evaluating and Synthesizing Multimedi.. - Lee, Potkonjak et al. - 1997
299   Dependence Analysis for Supercomputing (context) - Banerjee - 1988
283   Optimizing Supercompilers for Supercomputers (context) - Wolfe - 1989
275   Shade: A Fast Instruction-Set Simulator for Execution Profilin.. - Cmelik, Keppel - 1994
269   Multiscalar Processors - Sohi, Breach et al. - 1995
251   Simultaneous Multithreading: Maximizing on-chip Parallelism - Tullsen, Eggers et al. - 1995
230   Limits of Instruction Level Parallelism - Wall - 1991
197   Maximizing Multiprocessor Performance with the SUIF Compiler - Hall, Anderson et al. - 1996
193   Superscalar Microprocessor Design (context) - Johnson - 1990
190   Value locality and load value prediction - Lipasti, Wilkerson et al. - 1996
186   Exploiting Choice: Instruction Fetch and Issue on an Impleme.. - Tullsen, Eggers et al. - 1996
183   Trace Cache: A Low Latency Approach to High Bandwidth Instru.. - Rotenberg, Bennett et al. - 1996
160   IMPACT: An Architectural Framework for Multiple-Instruction-.. - Chang, Mahlke et al. - 1991
159   The LRPD Test: Speculative Run-Time Parallelization of Loops.. - Rauchwerger, Padua - 1995
157   Limits of Control Flow on Parallelism - Lam, Wilson - 1992
157   Architecture and applications of the HEP multiprocessor comp.. (context) - Smith - 1984
150   PROTEUS: A High-Performance Parallel-Architecture Simulator - Brewer, Dellarocas et al. - 1991
147   Alternative Implementations of Two-Level Adaptive Branch Pre.. - Yeh, Patt - 1992
136   Parallel Programming with Polaris (context) - Blume, Doallo et al. - 1996
125   Trace Processors - Rotenberg, Jacobson et al. - 1997
110   Improving the Accuracy of Dynamic Branch Prediction Using Br.. (context) - Pan, So et al. - 1992
109   Multiprocessor Simulation and Tracing using Tango (context) - Davis, Goldschmidt et al. - 1991
109   Advanced Compiler Design Implementation (context) - Muchnick - 1997
100   Dynamic Instruction Reuse - Sodani, Sohi - 1997
97   The Case for a Single-Chip Multiprocessor (context) - Olukotun, Nayfeh et al. - 1996
95   Pipeline Gating: Speculation Control For Energy Reduction - Manne, Klauser et al. - 1998
93   Optimization of Instruction Fetch Mechanisms for High Issue .. - Conte, Menezes et al. - 1995
87   Complete Computer System Simulation: The SimOS Approach - Rosenblum, Herrod et al. - 1995
76   Will Physical Scalability Sabotage Performance Gains - Matzke - 1997
74   Speculative Versioning Cache - Gopal, Vijaykumar et al. - 1998
70   The Superthreaded Architecture: Thread Pipelining with Run-T.. - Tsai, Yew - 1996
65   Interconnect Scaling - The Real Limiter to high-performance .. (context) - Bohr - 1995
64   Trace-Driven Memory Simulation: A Survey - Uhlig, Mudge - 1997
64   Missing the Memory Wall: The Case for Processor/Memory Integ.. (context) - Saulsbury, Pong et al. - 1996
60   Software and Hardware for Exploiting Speculative Parallelism.. - Oplinger, Heine et al. - 1997
59   The Microarchitecture of Superscalar Processors - Smith, Sohi - 1995
56   Machine Multicomputer (context) - Fillo, Keckler et al. - 1995
54   Digital 21264 Sets New Standard (context) - Gwennap - 1996
54   Scalable Processors in the Billion-Transistor Era: IRAM - Kozyrakis, Perissakis et al. - 1997
53   Processor Coupling: Integrating Compile time and Runtime Sch.. - Keckler, Dally - 1992
53   Improving Superscalar Instruction Dispatch and Issue by Expl.. - Vajapeyam, Mitra - 1997
49   The Impact of Instruction-Level Parallelism on Multiprocesso.. - Pai, Ranganathan et al. - 1997
48   Portable Programs for Parallel Processors (context) - Lusk - 1987
44   Hardware for Speculative Run-Time Parallelization in Distrib.. - Zhang, Rauchwerger et al. - 1998
43   A Single-Chip Multiprocessor - Hammond, Nayfeh et al. - 1997
43   A Comparison of Full and Partial Predicated Execution Suppor.. - Mahlke, Hank et al. - 1995
38   Evaluation of Design Alternatives for a Multiprocessor Micro.. (context) - Nayfeh, Hammond et al. - 1996
36   Architecture: Compiler-assisted Fine-grained Multithreading (context) - Dubey, O'Brien et al. - 1995
36   Characterizing the Caching and Synchronization Performance o.. (context) - Torrellas, Gupta et al. - 1992
35   A Comparison of Trace-Sampling Techniques for Multi-Megabyte.. - Kessler, Hill et al. - 1994
33   Wisconsin Wind Tunnel II: A Fast and Portable Parallel Archi.. - Mukherjee, Reinhardt et al. - 1997
32   RSIM: An Execution-Driven Simulator for ILP-Based Shared-Mem.. - Pai, Ranganathan et al. - 1997
32   Trace Processors: Moving to Fourth Generation Microarchitect.. (context) - Smith, Vajapeyam - 1997
31   A Model for Estimating Trace-Sample Miss Ratios - Wood, Hill et al. - 1991
31   Execution-Driven Tools for Parallel Simulation of Parallel A.. - Poulsen, Yew - 1993
30   Alternative implementations of hybrid branch predictors - Chang, Hao et al. - 1995
29   The Augmint Multiprocessor Simulation Toolkit for Intel x86 .. - Nguyen, Michael et al. - 1996
28   The Anatomy of the Register File in a Multiscalar Processor - Breach, Vijaykumar et al. - 1994
24   Computer Science Department (context) - Veenstra, Fowler et al. - 1993
23   ARB: A Hardware Mechanism for Dynamic Memory Disambiguation (context) - Franklin, Sohi - 1996
21   Performance Study of a Multithreaded Superscalar Microproces.. - Gulati, Emer - 1996
21   Billion-Transistor Architectures - Burger, Goodman - 1997
20   A benchmark evaluation of a Multithreaded RISC Processor Arc.. (context) - Prasadh, Wu - 1991
20   Converting Thread-Level Parallelism Into Instruction-Level P.. (context) - Lo, Eggers et al. - 1997
19   HP Make EPIC Disclosure (context) - Gwennap - 1997
18   The Concurrent Execution of Multiple Instruction Streams on .. (context) - Jr, Torng - 1991
16   PA-8500: The Continuing Evolution of the PA-8000 Family (context) - Lesartre, Hunt - 1997
16   Complexity-Effective Superscalar Processors (context) - Palacharla, Jouppi et al. - 1997
16   Superscalar Instruction Execution in the 21164 Alpha Micropr.. (context) - Edmondson, Rubinfeld et al. - 1995
15   A Critique of Trace-Driven Simulation for Shared-Memory Mult.. (context) - Bitar - 1990
13   Execution-Driven Simulation of Multiprocessors: Address and .. - Dwarkadas, Jump et al. - 1994
13   The Potential for Using Thread-Level Data Speculation to Fac.. (context) - Steffan, Mowry - 1998
12   Silicon Trends and Limits for Advanced Microprocessors (context) - Bohr - 1998
12   Microprocessor Chipset (context) - Technologies - 1994
11   Center for Integrated Systems (context) - Smith, Pixie et al. - 1991
10   The Design of the Microarchitecture of UltraSPARC (context) - Tremblay, Greenley et al. - 1995
10   a High Performance Restricted Data Flow Architecture Having .. (context) - Hwu, Patt - 1986
8   The Multiflow Trace Scheduling Compiler (context) - Lowney, Freudenberger et al. - 1993
8   A Clustered Approach to Multithreaded Processors - Krishnan, Torrellas - 1998
7   BiCMOS Processor with Dynamic Execution (context) - Colwell, Steck - 1995
6   Executing Sequential Binaries on a Clustered Multithreaded A.. - Krishnan, Torrellas - 1998
4   Microprocessor Forum (context) - Alpha, Out-of-Order - 1996
4   Analysis for Streamlining Inter-Operation Communication in F.. (context) - Franklin, Sohi - 1992
3   The PowerPC 620 Microprocessor (context) - Levitan, Thomas et al. - 1995
3   Intel's Merced and IA-64: Technology and Market Forecast (context) - Gwennap - 1998
3   Systematic Prototyping of Superscalar Computer Architectures - Conte, Hwu - 1992
2   The Perfect Club Benchmarks (context) - Berry - 1989
1   ACM Transactions on Computer Systems (context) - Uhlig, Nagle et al. - 1995
1   MINT: A Front End for Efficient Simulation of Shared-Memory Mul.. (context) - Veenstra, Fowler - 1994
1   An Effective Scheduling Technique for VLIW Processors (context) - Lam - 1988

Documents on the same site (http://iacoma.cs.uiuc.edu/multithreading/index.html):
Efficient Use of Processing Transistors for Larger On-Chip.. - Krishnan, Torrellas (1997)
Executing Sequential Binaries on a Clustered Multithreaded.. - Krishnan, Torrellas (1998)
Quantifying the Benefits of SPECint Distant.. - Ortega, Martel..

CiteSeer.IST - Copyright Penn State and NEC