Lazy Threads: Implementing a Fast Parallel Call (1996)  (14 citations)
Seth Copen Goldstein, Klaus Erik Schauser, David E. Culler
Journal of Parallel and Distributed Computing



View or download:
cmu.edu/~seth/papers/jpdc.ps.Z

Abstract: In this paper we describe lazy threads, a new approach for implementing multi-threaded execution models on conventional machines. We show how they can implement a parallel call at nearly the efficiency of a sequential call. The central idea is to specialize the representation of a parallel call so that it can execute as a parallel-ready sequential call. This allows excess parallelism to degrade into sequential calls with the attendant efficient stack management and direct transfer of control...

Cited by:
Compiler Optimization of Value Communication for Thread-Level.. - Zhai (2005)
Hardware Support for Thread-Level Speculation - Steffan (2003)
Engineering Parallel Symbolic Programs in GpH - Loidl, Trinder, Hammond.. (1999)

BibTeX entry:

S. C. Goldstein, K. E. Schauser, and D. E. Culler. Lazy threads: Implementing a fast parallel call. Journal of Parallel and Distributed Computing, 37(1):5--20, August 1996. http://citeseer.ist.psu.edu/goldstein96lazy.html

@article{ goldstein96lazy,
    author = "Seth Copen Goldstein and Klaus Erik Schauser and David E. Culler",
    title = "Lazy Threads: Implementing a Fast Parallel Call",
    journal = "Journal of Parallel and Distributed Computing",
    volume = "37",
    number = "1",
    pages = "5--20",
    year = "1996",
    url = "citeseer.ist.psu.edu/goldstein96lazy.html" }