Software-Directed Register Deallocation for Simultaneous Multithreaded Processors (1999)  (10 citations)
Jack L. Lo, Sujay S. Parekh, Susan J. Eggers, Henry M. Levy, Dean M. Tullsen
IEEE Transactions on Parallel and Distributed Systems



From: 128.95.4.112/homes/jlo/

Abstract: This paper proposes and evaluates software techniques that increase register file utilization for simultaneous multithreading (SMT) processors. SMT processors require large register files to hold multiple thread contexts that can issue instructions, out of order, every cycle. By supporting better inter-thread sharing and management of physical registers, an SMT processor can reduce the number of registers required and can improve performance for a given register file size. Our techniques...

Context of citations to this paper:

.... level simulator is a detailed, stand alone, execution based simulator used extensively in previous SMT studies [22, 43, 44, 45, 46, 47, 59, 77, 81, 82, 83]. It models the processor pipeline and memory system in great detail. While the simulator excels at modelling user...

...operands are allocated. They find a 25% reduction in the number of renaming registers with little loss in performance. Lo et al. [15] investigate deallocating registers on SMT after their last use via compiler-inserted annotations. They observed up to an average speedup...

Cited by:
Balanced Multithreading: Increasing Throughput via a.. - Tune, Kumar, Tullsen, .. (2004)
Design and Applications of a Virtual Context Architecture - Oehmke, Binkert.. (2004)
Physical Register Inlining - Lipasti, Mestan, Gunadi (2004)

Similar documents (at the sentence level):
67.7%:   Software-Directed Register Deallocation for.. - Lo, Parekh, Eggers, ..
36.9%:   Exploiting Thread-Level Parallelism On . . . - Lo (1998)

Active bibliography (related documents):
0.2:   Smart Register Files for High-Performance Microprocessors - Postiff, Mudge (1999)
0.1:   Compiler and Microarchitecture Mechanisms for Exploiting.. - Postiff (2001)
0.1:   PUMP: A New Architecture for Multithreaded Processors - Bradford

Similar documents based on text:
0.8:   Improving Server Software Support for Simultaneous.. - McDowell, Eggers..
0.7:   Mini-threads: Increasing TLP on Small-Scale SMT Processors - Redstone, Eggers, Levy (2003)
0.4:   Tuning Compiler Optimizations for Simultaneous.. - Lo, Eggers, Levy.. (1997)

Related documents from co-citation:
8:   Multiple-banked register file architectures - Cruz, González et al. - 2000
7:   Delaying Physical Register Allocation Through Virtual-Physical Registers - Monreal, González et al. - 1999
6:   superscalar microprocessor - Yeager - 1996

BibTeX entry:

J. Lo, S. Parekh, S. Eggers, H. Levy, and D. Tullsen. Software-directed register deallocation for simultaneous multithreaded processors. IEEE Transactions on Parallel and Distributed Systems, to appear. http://citeseer.ist.psu.edu/lo99softwaredirected.html

@article{ lo99softwaredirected,
    author = "J. L. Lo and S. S. Parekh and S. J. Eggers and H. M. Levy and D. M. Tullsen",
    title = "Software-Directed Register Deallocation for Simultaneous Multithreaded Processors",
    journal = "IEEE Transactions on Parallel and Distributed Systems",
    volume = "10",
    number = "9",
    pages = "922--??",
    year = "1999",
    url = "citeseer.ist.psu.edu/lo99softwaredirected.html" }
Citations (may not include all citations):
358   The Tera computer system - Alverson, Callahan et al. - 1990
353   The SPLASH-2 programs: Characterization and methodological c.. - Woo, Ohara et al. - 1995
251   Simultaneous multithreading: Maximizing on-chip parallelism - Tullsen, Eggers et al. - 1995
214   Combining branch predictors - McFarling - 1993
197   Maximizing multiprocessor performance with the SUIF compiler - Hall, Anderson et al. - 1996
186   Exploiting choice: Instruction fetch and issue on an impleme.. - Tullsen, Eggers et al. - 1996
156   The Multiflow trace scheduling compiler - Lowney, Freudenberger et al. - 1993
136   superscalar microprocessor - Yeager - 1996
130   A VLIW architecture for a trace scheduling compiler - Colwell, Nix et al. - 1988
110   Portable Programs for Parallel Processors - Boyle, Butler et al. - 1987
57   Simultaneous multithreading: A platform for next-generation .. - Eggers, Emer et al. - 1997
54   Partitioned register files for VLIWs: A preliminary analysis.. - Capitanio, Dutt et al. - 1992
54   Digital 21264 sets new standard - Gwennap - 1996
52   Converting thread-level parallelism to instruction-level par.. - Lo, Eggers et al. - 1997
37   Scaling parallel programs for multiprocessors: Methodology a.. - Singh, Hennessy et al. - 1993
35   Register file design considerations in dynamically scheduled.. - Farkas, Jouppi et al. - 1996
33   Register relocation: Flexible contexts for multithreading - Waldspurger, Weihl - 1993
28   Register traffic analysis for streamlining inter-operation c.. - Franklin, Sohi - 1992
25   Partitioned register files for TTAs - Janssen, Corporaal - 1995
20   Facilitating superscalar processing via a combined static/dy.. - Sprangle, Patt - 1994
16   Exploiting short-lived variables in superscalar processors - Lozano, Gao - 1995
16   Non-consistent dual register files to reduce register pressu.. - Llosa, Valero et al. - 1995
13   A three dimensional register file for superscalar processors - Tremblay, Joy et al. - 1995
12   The performance potential of multiple functional unit proces.. - Pleszkun, Sohi - 1988
12   Register connection: A new approach to adding registers into.. - Kiyohara, Mahlke et al. - 1993
7   The named-state register file: Implementation and performanc.. - Nuth, Dally - 1995
5   SPEC CPU '95 Technical Manual - Evaluation - 1995





Documents on the same site (http://128.95.4.112/homes/jlo/):
Tuning Compiler Optimizations for Simultaneous.. - Lo, Eggers, Levy.. (1997)
Converting Thread-Level Parallelism to.. - Lo, Eggers, Emer, .. (1997)
Supporting Fine-Grained Synchronization on a.. - Tullsen, Lo, Eggers.. (1999)
