
Exploiting Fine-Grain Thread Level Parallelism on the MIT Multi-ALU Processor (1998)  (16 citations)
Stephen W. Keckler, William J. Dally, Daniel Maskit, Nicholas P. Carter, Andrew Chang, Whay S. Lee
ISCA





 
View or download:
utexas.edu/users/skeckler/...isca98.pdf

From:  utexas.edu/users/cart/pubs

Abstract: Much of the improvement in computer performance over the last twenty years has come from faster transistors and architectural advances that increase parallelism. Historically, parallelism has been exploited either at the instruction level with a grain-size of a single instruction or by partitioning applications into coarse threads with grain-sizes of thousands of instructions. Fine-grain threads fill the parallelism gap between these extremes by enabling tasks with run lengths as small as 20...

Context of citations to this paper:

.... HEP [Smith81], Tera [Alverson90], and Alewife [Kranz92]. They are also similar to the pre- and post-condition mechanism of the M-Machine [Keckler98], which differs from the others in that instead of causing a trap, a failure sets a predicate register which must be explicitly...

...d2) In the patterns, the arrows represent detected data races. of the new epoch very quickly and without software intervention (e.g. [5, 10, 13]). One difference in ReEnact is in the way epochs are terminated and started. An epoch terminates by default when it reaches a...

Cited by:
ACRES Architecture and Compilation - Ang, Schlansker (2004)
Exploiting Thread-Level Parallelism On . . . - Lo (1998)
Synchronization Support Using Full/Empty Tagged Shared Memory .. - Vlassov, Moritz

Similar documents (at the sentence level):
32.4%:   Fast Thread Communication and Synchronization Mechanisms for a.. - Keckler (1998)

Active bibliography (related documents):
1.1:   Exploiting Fine-Grain Thread Level Parallelism on.. - Keckler, Dally.. (1998)
0.3:   The Importance of Locality in Scheduling and Load Balancing for.. - Keckler (1994)
0.2:   The M-Machine Multicomputer - Fillo, Keckler, Dally, Carter.. (1995)

Similar documents based on text:
0.6:   The Effects of Explicitly Parallel Mechanisms on.. - Chang, Dally.. (1998)
0.6:   Efficient, Protected Message Interface in the MIT.. - Lee, Dally, Keckler..
0.4:   Instruction Scheduling on the M-Machine Multicomputer - Fillo (1995)

Related documents from co-citation:
9:   The Tera computer system - Alverson, Callahan et al. - 1990
6:   machine multicomputer - Fillo, Keckler et al. - 1995
5:   Simultaneous multithreading: Maximizing on-chip parallelism - Tullsen, Eggers et al. - 1995

BibTeX entry:

S.W. Keckler, W.J. Dally, D. Maskit, N.P. Carter, A. Chang, and W.S. Lee. Exploiting fine-grain thread level parallelism on the MIT Multi-ALU processor. In 25th Annual International Symposium on Computer Architecture, June 1998. http://citeseer.ist.psu.edu/keckler98exploiting.html

@inproceedings{ keckler98exploiting,
    author = "Stephen W. Keckler and William J. Dally and Daniel Maskit and Nicholas P. Carter and Andrew Chang and Whay Sing Lee",
    title = "Exploiting Fine-grain Thread Level Parallelism on the {MIT} Multi-{ALU} Processor",
    booktitle = "{ISCA}",
    pages = "306--317",
    year = "1998",
    url = "citeseer.ist.psu.edu/keckler98exploiting.html" }
Citations (may not include all citations):
358   The Tera computer system - ALVERSON, CALLAHAN et al. - 1990
341   Parallel programming in Split-C - CULLER, DUSSEAU et al. - 1993
269   Multiscalar processors - SOHI, BREACH et al. - 1995
251   Simultaneous multithreading: Maximizing on-chip parallelism - TULLSEN, EGGERS et al. - 1995
230   Limits of instruction-level parallelism - WALL - 1991
156   The multiflow trace scheduling compiler - LOWNEY, FREUDENBERGER et al. - 1993
127   A multithreaded massively parallel architecture - NIKHIL, PAPADOPOULOS et al. - 1992
104   for semiconductors. Semiconductor Industry Association - technology - 1997
56   Machine Multicomputer - FILLO, KECKLER et al. - 1995
38   Evaluation of design alternatives for a multiprocessor micro.. - NAYFEH, HAMMOND et al. - 1996
35   The impact of synchronization and granularity on parallel sy.. - CHEN, Su et al. - 1990
16   Experience with fine-grain synchronization in MIMD machines .. - YEUNG, AGARWAL - 1993
6   Thread prioritization: A thread scheduling mechanism for mul.. - FISKE, DALLY - 1995
5   Application performance on the MIT alewife machine - CHONG, LIM et al. - 1996
4   Department of Electrical Engineering and Computer Science - GUREVICH - 1995
3   A highperformance Lisp - KRANZ, HALSTEAD et al. - 1989
2   Springer-Verlag - ROBBINS, ROBBINS - 1987





Documents on the same site (http://www.cs.utexas.edu/users/cart/pubs.html):
Characterizing the SPHINX Speech Recognition System - Agaram, Keckler, Burger
A Technology-Scalable Architecture for Fast Clocks.. - Sankaralingam.. (2001)
SimpleScalar Simulation of the PowerPC Instruction.. - Sankaralingam.. (2000)


CiteSeer.IST - Copyright Penn State and NEC