Exploiting Fine-Grain Thread Level Parallelism on the MIT Multi-ALU Processor (1998)  (16 citations)
Stephen W. Keckler, William J. Dally, Daniel Maskit, Nicholas P. Carter, Andrew Chang, Whay S. Lee
ISCA




From:  utexas.edu/users/dburger/teach...
From:  stanford.edu/cva_publications

Abstract: Much of the improvement in computer performance over the last twenty years has come from faster transistors and architectural advances that increase parallelism. Historically, parallelism has been exploited either at the instruction level with a grain-size of a single instruction or by partitioning applications into coarse threads with grain-sizes of thousands of instructions. Fine-grain threads fill the parallelism gap between these extremes by enabling tasks with run lengths as small as 20...

Context of citations to this paper:

.... HEP [Smith81], Tera [Alverson90], and Alewife [Kranz92]. They are also similar to the pre- and post-condition mechanism of the M-Machine [Keckler98], which differs from the others in that instead of causing a trap, a failure sets a predicate register which must be explicitly...

...d2) In the patterns, the arrows represent detected data races. ...of the new epoch very quickly and without software intervention (e.g. [5, 10, 13]). One difference in ReEnact is in the way epochs are terminated and started. An epoch terminates by default when it reaches a...

Cited by:
ACRES Architecture and Compilation - Ang, Schlansker (2004)
Exploiting Thread-Level Parallelism On . . . - Lo (1998)
Synchronization Support Using Full/Empty Tagged Shared Memory .. - Vlassov, Moritz

Similar documents (at the sentence level):
37.7%:   Fast Thread Communication and Synchronization Mechanisms for a.. - Keckler (1998)

Active bibliography (related documents):
1.1:   Exploiting Fine-Grain Thread Level Parallelism on.. - Keckler, Dally.. (1998)
0.3:   The Importance of Locality in Scheduling and Load Balancing for.. - Keckler (1994)
0.1:   The Effects of Explicitly Parallel Mechanisms on.. - Chang, Dally.. (1998)

Similar documents based on text:
0.8:   The M-Machine Multicomputer - Fillo, Keckler, Dally, Carter.. (1995)
0.5:   The MMachine Multicomputer - Fillo, Keckler, Dally, al. (1995)
0.3:   Appears in the Proceedings of the 6th International Conference on.. - And

Related documents from co-citation:
9:   The Tera computer system - Alverson, Callahan et al. - 1990
6:   machine multicomputer - Fillo, Keckler et al. - 1995
5:   Simultaneous multithreading: Maximizing on-chip parallelism - Tullsen, Eggers et al. - 1995

BibTeX entry:

S.W. Keckler, W.J. Dally, D. Maskit, N.P. Carter, A. Chang, and W.S. Lee. Exploiting fine-grain thread level parallelism on the MIT Multi-ALU processor. In 25th Annual International Symposium on Computer Architecture, June 1998. http://citeseer.ist.psu.edu/article/keckler98exploiting.html

@inproceedings{ keckler98exploiting,
    author = "Stephen W. Keckler and William J. Dally and Daniel Maskit and Nicholas P. Carter and Andrew Chang and Whay Sing Lee",
    title = "Exploiting Fine-grain Thread Level Parallelism on the {MIT} Multi-{ALU} Processor",
    booktitle = "{ISCA}",
    pages = "306-317",
    year = "1998",
    url = "citeseer.ist.psu.edu/article/keckler98exploiting.html" }

Citations (may not include all citations):
358   The Tera computer system - ALVERSON, CALLAHAN et al. - 1990
341   Parallel programming in Split-C - CULLER, DUSSEAU et al. - 1993
269   Multiscalar processors - SOHI, BREACH et al. - 1995
251   Simultaneous multithreading: Maximizing on-chip parallelism - TULLSEN, EGGERS et al. - 1995
230   Limits of instruction-level parallelism - WALL - 1991
156   The multiflow trace scheduling compiler - LOWNEY, FREUDENBERGER et al. - 1993
127   A multithreaded massively parallel architecture - NIKHIL, PAPADOPOULOS et al. - 1992
104   for semiconductors. Semiconductor Industry Association - technology - 1997
56   Machine Multicomputer - FILLO, KECKLER et al. - 1995
38   Evaluation of design alternatives for a multiprocessor micro.. - NAYFEH, HAMMOND et al. - 1996
35   The impact of synchronization and granularity on parallel sy.. - CHEN, SU et al. - 1990
25   Department of Electrical Engineering and Computer Science - GUREVICH - 1995
16   Experience with fine-grain synchronization in MIMD machines .. - YEUNG, AGARWAL - 1993
6   Thread prioritization: A thread scheduling mechanism for mul.. - FISKE, DALLY - 1995
3   A high-performance Lisp - KRANZ, HALSTEAD et al. - 1989
2   Springer-Verlag - ROBBINS, ROBBINS - 1987
1   Application performance on the MIT Alewife machine - CHONG, LIM et al. - 1996





Documents on the same site (http://www.cs.utexas.edu/users/dburger/teaching/spring99/cs395/):
The Multicluster Architecture: Reducing Cycle Time.. - Farkas, Chow.. (1997)
A Scalable Front-End Architecture for Fast Instruction.. - Reinman, Austin, Calder (1999)
Analytic Evaluation of Shared-Memory Systems with ILP.. - Sorin, Pai, Adve.. (1998)

