Abstract
Chapter
1 INTRODUCTION
1.1 Latency Tolerant Architectures
1.2 Exploiting Thread Level Parallelism
1.3 Multithreaded Models
1.4 Motivation
1.5 Claim of...
Context of citations to this paper:
...paradigms [49, 98, 99]. A more in-depth discussion on such multithreaded compilation technology can be found in X. Tang's Ph.D. thesis [94]. Such compilers may also require the use of a cost model for the underlying multithreaded machines. This is a non-trivial task due to the...
Cited by:
Advances in the Dataflow Computational Model - Najjar, Lee, Gao (1999)
Similar documents (at the sentence level):
8.5%: Heap Analysis and Optimizations for Threaded Programs - Tang, Ghiya, Hendren, Gao (1997)
Active bibliography (related documents):
2.6: How "hard" Is Thread Partitioning and How "bad" Is a List.. - Tang, Gao
1.7: Thread Partitioning and Scheduling Based on Cost Model - Tang et al. (1997)
1.2: Design and Implementation of an Efficient Thread.. - Amaral, Gao.. (1999)
Similar documents based on text:
2.3: Parsing And Incrementality - Schneider
1.4: Strategic Interaction Online: A Comparison of Instructional.. - Colburn (2002)
0.2: Thread Partitioning and Scheduling Based On Cost Model - Tang, Wang, Theobald, Gao (1997)
Related documents from co-citation:
2: Memory Consistency and Event Ordering in Scalable Shared-memory Multiprocessors - Gharachorloo, Lenoski - 1990
2: On Memory Models and Cache Management for Shared-Memory Multiprocessors - Dennis, Gao - 1995
2: Scheduling dynamic dataflow graphs with bounded memory using the Token Flow Mode.. - Buck - 1993
BibTeX entry:
X. Tang. Compiling for Multithreaded Architectures. PhD thesis, University of Delaware, Newark, DE, Apr. 1999. http://citeseer.ist.psu.edu/tang99compiling.html
@phdthesis{ tang99compiling,
author = "X. Tang",
title = "Compiling for Multithreaded Architectures",
school = "University of Delaware",
address = "Newark, DE",
month = apr,
year = "1999",
url = "http://citeseer.ist.psu.edu/tang99compiling.html" }
Citations (may not include all citations):
3972: Introduction to Algorithms - Cormen, Leiserson et al. - 1990
1399: Compilers --- Principles - Aho, Sethi et al. - 1986
480: The program dependence graph and its use in optimization - Ferrante, Ottenstein et al. - 1987
444: Mach: A new kernel foundation for UNIX development - Accetta, Baron et al. - 1986
358: The Tera computer system - Alverson, Callahan et al. - 1990
356: Computers and Intractability: A Guide to the Theory of NP-Co.. - Garey, Johnson - 1979
341: Parallel programming in Split-C - Culler, Dusseau et al. - 1993
269: Multiscalar processors - Sohi, Breach et al. - 1995
246: Context-sensitive interprocedural points-to analysis in the .. - Emami, Ghiya et al. - 1994
241: A study of branch prediction strategies - Smith - 1981
230: Cilk: An efficient multithreaded runtime system - Blumofe, Joerg et al. - 1995
219: Bounds on multiprocessing timing anomalies - Graham - 1969
212: Optimization and approximation in deterministic sequencing a.. - Graham, Lawler et al. - 1979
185: Branch prediction strategies and branch target buffer design - Lee, Smith - 1984
158: Improving register allocation for subscripted variables - Callahan, Carr et al. - 1990
139: The predictability of data values - Sazeides, Smith - 1997
127: A multithreaded massively parallel architecture - Nikhil, Papadopoulos - 1992
121: Monsoon: An explicit token-store architecture - Papadopoulos, Culler - 1990
112: Concurrent Programming in Java: Design Principles and Patter.. - Lea - 1996
112: Supporting dynamic data structures on distributed-memory mac.. - Rogers, Carlisle et al. - 1995
100: Dynamic instruction reuse - Sodani, Sohi - 1998
92: Amoeba: A distributed operating system - Mullender, van Rossum et al. - 1990
89: TAM -- a compiler controlled threaded abstract machine - Culler, Goldstein et al. - 1993
89: SISAL: Streams and iteration in a single assignment language.. - McGraw - 1985
77: The potential for using thread-level data speculation to fac.. - Steffan, Mowry - 1998
77: Parallel sorting by regular sampling - Shi, Schaeffer - 1992
76: A comparison of clustering heuristics for scheduling directe.. - Gerasoulis, Yang - 1992
74: Speculative versioning cache - Gopal, Vijaykumar et al. - 1998
71: First version of a data-flow procedure language - Dennis - 1974
68: the granularity and clustering of directed acyclic task grap.. - Gerasoulis, Yang - 1993
67: ARB: A hardware mechanism for dynamic reordering of memory r.. - Franklin, Sohi - 1996
65: Toward a dataflow/von Neumann hybrid architecture - Iannucci - 1988
62: The transitive reduction of a directed graph - Aho, Garey et al. - 1972
59: Very long instruction word architectures and the ELI-512 - Fisher - 1983
57: Simultaneous multithreading: a platform for next-generation p.. - Eggers, Emer et al. - 1997
56: The M-Machine multicomputer - Fillo, Keckler et al. - 1995
55: Distributed Operating Systems - Tanenbaum - 1995
55: Software caching and computation migration in Olden - Carlisle, Rogers - 1995
52: Converting thread-level parallelism to instruction-level par.. - Lo, Eggers et al. - 1997
51: A comparison of list schedules for parallel processing syste.. - Adam, Chandy et al. - 1974
51: Two fundamental issues in multiprocessing - Arvind, Iannucci - 1987
48: Design of a Computer: The Control Data 6600 - Thornton - 1970
47: Probabilistic analysis of partitioning algorithms for the tr.. - Karp - 1977
47: Sparcle: An evolutionary processor design for multiprocessor.. - Agarwal, Kubiatowicz et al. - 1993
47: Programming with Threads - Kleiman, Shah et al. - 1996
47: A study of the EARTH-MANNA multithreaded system - Hum, Maquelin et al. - 1996
46: MIT Laboratory for Computer Science - Nikhil, Version et al. - 1988
43: Control flow speculation in multiscalar processors - Jacobson, Bennett et al. - 1997
41: Divergence preserving discrete surface integral methods for .. - Madsen - 1992
39: Exploiting heterogeneous parallelism on a multithreaded mult.. - Alverson, Alverson et al. - 1991
38: Building multithreaded architectures with off-the-shelf micr.. - Hum, Theobald et al. - 1994
36: The program structure tree: Computing control regions in lin.. - Johnson, Pearson et al. - 1994
35: List scheduling with and without communication delay - Yang, Gerasoulis - 1993
34: Latency hiding in message-passing architectures - Bruening, Giloi et al. - 1994
34: Id: a language with implicit parallelism - Nikhil - 1990
33: Parallel MIMD Computation: HEP Supercomputer and its Applica.. - Kowalik - 1985
31: Olden: Parallelizing Programs with Dynamic Data Structures o.. - Carlisle - 1996
30: Global analysis for partitioning non-strict programs into se.. - Traub, Culler et al. - 1992
28: Sequential implementation of lenient programming languages - Traub - 1988
28: The anatomy of the register file in a multiscalar processor - Breach, Vijaykumar et al. - 1994
27: Parallel operation in the Control Data 6600 - Thornton - 1964
25: Separation constraint partitioning --- A new algorithm for p.. - Schauser, Culler et al. - 1995
25: A generalized scheme for mapping parallel algorithms - Chaudhary, Aggarwal - 1993
24: An analysis framework for the McCAT compiler - Sridharan - 1992
24: Exceeding the dataflow limit via value prediction - Lipasti, Shen - 1996
24: Alternative implementations of two-level adaptive branch pre.. - Yeh, Patt - 1992
23: Super-threading: Architectural and software mechanisms for o.. - Sakai, Okamoto et al. - 1993
23: the EARTH multithreaded architecture - Hendren, Tang et al. - 1997
23: Compile-time partitioning of a non-strict language into sequ.. - Hoch, Davenport et al. - 1993
21: Performance study of a multithreaded superscalar microproces.. - Gulati, Bagherzadeh - 1996
21: An efficient pipelined dataflow processor architecture - Dennis, Gao - 1988
21: Task selection for a multiscalar processor - Vijaykumar, Sohi - 1998
19: Optimizing parallel programs with explicit synchronization - Krishnamurthy, Yelick - 1995
19: Architectural support for thread-level data speculation - Steffan, Colohan et al. - 1999
17: A compiler for the MIT tagged-token dataflow architecture - Traub - 1986
17: EARTH: An Efficient Architecture for Running Threads - Theobald - 1999
16: Exploiting fine-grain thread level parallelism on the MIT mu.. - Keckler, Dally et al. - 1998
15: StarT the Next Generation: Integrating global caches and dat.. - Ang, Derek - 1994
14: Compiler-controlled multithreading for lenient parallel langu.. - Schauser, Culler et al. - 1991
14: Multi-processor performance on the Tera MTA - Snavely, Carter et al. - 1998
14: Communication optimizations for parallel C programs - Zhu, Hendren - 1998
14: An efficient hybrid dataflow architecture model - Gao - 1993
13: Decentralized optimal power pricing: The development of a pa.. - Lumetta, Murphy et al. - 1993
13: The HTMT program execution model - Gao, Theobald et al. - 1998
13: Department of Electrical and Computer Engineering - Theobald, Amaral et al. - 1998
13: Performance and programming experience on the Tera MTA - Carter, Feo et al. - 1999
13: The superthreaded architecture: Thread pipelining with run-t.. - Tsai, Yew - 1996
12: Heap analysis and optimizations for threaded programs - Tang, Ghiya et al. - 1997
12: A parallel algorithm for constructing minimum spanning trees - Bentley - 1980
10: Computing perimeters of regions in images represented by qua.. - Samet - 1981
10: Concurrency by inheritance in C++ - Arjomandi, O'Farrell et al. - 1995
10: MIMD-style parallel programming based on continuation-passin.. - Halbherr, Zhou et al. - 1994
10: Thread partitioning and scheduling based on cost model - Tang, Wang et al. - 1997
10: SIGPLAN Notices - May - 1983
9: A performance study of Time Warp - Lomow, Cleary et al. - 1988
9: Two-level adaptive training branch prediction - Yeh, Patt - 1991
9: Computer Science Department - Roh, Evaluation et al. - 1995
8: Instruction reordering for fork-join parallelism - Sarkar - 1990
8: Parallel function invocation in a dynamic argument-fetching .. - Gao, Hum et al. - 1990
7: A comparison of list schedulers for parallel processing syst.. - Adam, Chandy et al. - 1974
7: A multi-threaded 64-bit PowerPC commercial RISC processor de.. - Storino, Borkenhagen - 1999
7: Compiler techniques for the superthreaded architectures - Tsai, Jiang et al. - 1998
6: A high-speed memory organization for hybrid dataflow/von Neu.. - Hum, Gao - 1992
6: An algorithm for transitive reduction of an acyclic graph - Gries, Martin et al. - 1989
6: Compiling Lenient Languages for Parallel Asynchronous Execut.. - Schauser - 1995
5: An evaluation of bottom-up and top-down thread generation te.. - Bohm, Najjar et al. - 1993
5: Identifying the capability of overlapping computation with c.. - Sohn, Ku et al. - 1996
5: Connection Analysis: A practical interprocedural heap analy.. - Ghiya, Hendren - 1996
5: Statement reordering for doacross loops - Chen, Yew - 1994
5: IEEE Transactions on Computers - Rumbaugh, flow - 1977
4: Pthreads Primer: A Guide to Multithreaded Programming - Lewis, Berg - 1997
4: XIL and YIL: The intermediate languages of TOBEY - O'Brien, O'Brien et al. - 1995
3: Dropping anchors: Putting heap analysis to work - Ghiya, Hendren - 1997
3: Performance study of a concurrent multithreaded processor - Tsai, Jiang et al. - 1998
2: Performance Modeling and Analysis of Multithreaded Architectu.. - Nemawarkar - 1996
2: Approximation algorithms for scheduling arithmetic expressi.. - Bernstein, Rodeh et al. - 1989
2: Pgcc user's guide - Group - 1995
2: Micro-kernel architecture: Key to modern operating system de.. - Gien - 1990
1: A superstrand architecture and its compilation - Márquez, Theobald et al. - 1999
1: Tera computer company the next wave in supercomputing - Company - 1999
1: Digital Equipment Corporation - Nikhil, parallel et al. - 1994
1: MIT Laboratory for Computer Science - Group, beta et al. - 1997
Documents on the same site (http://olsen.capsl.udel.edu/~tang/):
Heap Analysis and Optimizations for Threaded Programs - Tang, Ghiya, Hendren, Gao (1997)
Thread Partitioning and Scheduling Based On Cost Model - Tang, Wang, Theobald, Gao (1997)
The HTMT Program Execution Model (Extended Abstract) - Gao, Theobald, Marquez.. (1998)