Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy (2004)
Eric Tune, Rakesh Kumar, Dean M. Tullsen, Brad Calder

View or download:
microarch.org/micr...Multithreading.pdf
From: microarch.org/micro37/program

Abstract: A simultaneous multithreading (SMT) processor can issue instructions from several threads every cycle, allowing it to effectively hide various instruction latencies; this effect increases with the number of simultaneous contexts supported. However, each added context on an SMT processor incurs a cost in complexity, which may lead to an increase in pipeline length or a decrease in the maximum clock rate. This paper presents new designs for multithreaded processors which combine a conservative...

Active bibliography (related documents):
0.5:   Speculative Software Management of Datapath-width.. - Pokam, Rochecouste, .. (2004)
0.5:   Design and Applications of a Virtual Context Architecture - Oehmke, Binkert.. (2004)
0.4:   Characterizing a New Class of Threads in Scientific .. - Rodrigues, Murphy, ..

Similar documents based on text:
0.3:   Instruction Recycling on a Multiple-Path Processor - Wallace, Tullsen, Calder (1999)
0.3:   Dynamic Prediction of Critical Path Instructions - Tune, Liang, Tullsen, Calder (2001)
0.3:   Threaded Multiple Path Execution - Wallace, Calder, Tullsen (1998)

BibTeX entry:

@misc{ tune-balanced,
  author = "Eric Tune and Rakesh Kumar and Dean M. Tullsen and Brad Calder",
  title = "Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy",
  year = "2004",
  url = "citeseer.ist.psu.edu/tune04balanced.html" }

Citations (may not include all citations):
358   The Tera computer system - Alverson, Callahan et al. - 1990
251   Simultaneous multithreading: Maximizing on-chip parallelism - Tullsen, Eggers et al. - 1995
186   Exploiting choice: Instruction fetch and issue on an impleme.. - Tullsen, Eggers et al. - 1996
136   superscalar microprocessor - Yeager - 1996
79   Automatically characterizing large scale program behavior - Sherwood, Perelman et al. - 2002
72   MASA: a multithreaded processor architecture for parallel sy.. - Halstead, Fujita - 1998
67   An elementary processor architecture with simultaneous instr.. - Hirata, Kimura et al. - 1992
64   The microarchitecture of the Pentium 4 processor - Hinton, Sager et al. - 2001
40   Interleaving: A multithreading technique targeting multiproc.. - Laudon, Gupta et al. - 1994
35   Simulation and modeling of a simultaneous multithreading pro.. - Tullsen - 1996
34   Multiple-banked register file architectures - Cruz, Gonzalez et al. - 2000
33   Register relocation: Flexible contexts for multithreading - Waldspurger, Weihl - 1993
32   Informing memory operations: Providing memory performance fe.. - Horowitz, Martonosi et al. - 1996
30   Increasing superscalar performance through multistreaming - Yamamoto, Nemirovsky - 1995
28   Closing the window of vulnerability in multiphase memory tra.. - Kubiatowicz - 1993
26   Alpha 21264 Microprocessor Hardware Reference Manual - Corp, MA - 2000
25   Symbiotic jobscheduling for a simultaneous multithreading ar.. - Snavely, Tullsen - 2000
23   The effectiveness of multiple hardware contexts - Thekkath, Eggers - 1994
22   The architecture of HEP - Smith - 1985
20   An integrated cache timing - Shivakumar, Jouppi - 2001
17   Reducing the complexity of the register file in dynamic supe.. - Balasubramonian, Dwarkadas et al. - 2001
15   Handling long-latency loads in a simultaneous multithreading.. - Tullsen, Brown - 2001
10   Software-directed register deallocation for simultaneous mult.. - Lo, Parekh et al. - 1999
10   Loose loops sink chips - Borch, Tune et al. - 2002
9   A multithreaded PowerPC processor for commercial servers - Borkenhagen, Eickemeyer et al. - 2000
8   Software-controlled multithreading using informing memory op.. - Mowry, Ramkissoon - 2000
5   Reducing register ports for higher speed and lower energy - Park, Powell et al. - 2002
4   Analysis of multithreaded architectures for parallel computi.. - Saavedra-Barrera, Culler et al. - 1990
3   Mini-threads: Increasing TLP on small-scale SMT processors - Redstone, Eggers et al. - 2003
3   Evaluation of multithreaded processors and thread-switch pol.. - Eickemeyer, Johnson et al. - 1997
2   Banked multiported register files for high-frequency supersc.. - Tseng, Asanovic - 2003
1   Improving memory latency aware fetch policies for SMT proces.. - Cazorla, Fernandez et al. - 2003
1   Reducing register ports using delayed write-back queues and .. - Kim, Mudge - 2003
1   Sparcle: An evolutionary proces.. - Agarwal, Kubiatowicz et al. - 2004

Documents on the same site (http://www.microarch.org/micro37/program.html):
Cache Refill/Access Decoupling for Vector Machines - Batten, Krashinsky.. (2004)
Managing Wire Delay in Large Chip-Multiprocessor Caches - Beckmann, Wood (2004)
Adaptive History-Based Memory Schedulers - Hur, Lin

CiteSeer.IST - Copyright Penn State and NEC