Abstract: High-performance, general-purpose microprocessors serve as compute engines for computers ranging from personal computers to supercomputers. Sequential programs constitute a major portion of the real-world software that runs on these computers. State-of-the-art microprocessors exploit instruction-level parallelism (ILP) to achieve high performance on such applications by searching for independent instructions in a dynamic window of instructions and executing them on a wide-issue pipeline. Increasing...
Cited by:
Compiler Optimization of Value Communication for Thread-Level.. - Zhai (2005)
Tolerating Dependences Between Large Speculative.. - Christopher Colohan..
Improving Cache Locality for Thread-Level Speculation Systems - Fung (2005)
Active bibliography (related documents):
0.8: Design And Evaluation Of A Multiscalar Processor - Breach (1998)
0.6: Complexity-Effective Superscalar Processors - Palacharla (1998)
0.6: The Microarchitecture of Superscalar Processors - Smith, Sohi (1995)
Similar documents based on text:
0.2: Multiscalar Processors - Sohi (1995)
0.2: Aspects Of The Molecular Phylogeny Of Three Species Of The.. - Zizania Based On
0.1: Data Memory Alternatives for Multiscalar Processors - Breach, Vijaykumar, Gopal, .. (1997)
Related documents from co-citation:
22: Multiscalar processors - Sohi, Breach et al. - 1995
18: The Potential for Using ThreadLevel Data Speculation to Facilitate Automatic Par.. - Steffan, Mowry - 1998
16: Speculative Versioning Cache - Gopal, Vijaykumar et al. - 1998
BibTeX entry:
T. N. Vijaykumar. Compiling for the Multiscalar Architecture. Ph.D. thesis, University of Wisconsin-Madison, Madison, WI 53706, Jan. 1998. http://citeseer.ist.psu.edu/vijaykumar98compiling.html
@techreport{ vijaykumar98compiling,
author = "T. N. Vijaykumar",
title = "Compiling for the Multiscalar Architecture",
number = "CS-TR-1998-1370",
year = "1998",
url = "citeseer.ist.psu.edu/vijaykumar98compiling.html" }
Citations (may not include all citations):
1399: Compilers: Principles - Aho, Sethi et al. - 1986
474: A data locality optimizing algorithm - Wolf, Lam - 1991
407: Trace scheduling: A technique for global microcode compactio.. - Fisher - 1981
390: Interprocedural slicing using dependence graphs - Horwitz, Reps et al. - 1990
352: Supercompilers for Parallel and Vector Computers - Zima, Chapman - 1991
318: IEEE Transactions on Software Engineering - Weiser - 1984
283: Optimizing Supercompilers for Supercomputers - Wolfe - 1990
269: Multiscalar processors - Sohi, Breach et al. - 1995
260: Validity of the single processor approach to achieving large.. - Amdahl - 1967
258: Automatic translation of fortran programs to vector form - Allen, Kennedy - 1987
241: A study of branch prediction strategies - Smith - 1981
237: Global optimizations for parallelism and locality on scalabl.. - Anderson, Lam - 1993
214: Combining branch predictors - McFarling - 1993
193: Superscalar Microprocessor Design - Johnson - 1991
186: Exploiting choice: Instruction fetch and issue on an impleme.. - Tullsen, Eggers et al. - 1996
175: Complexity-effective superscalar processors - Palacharla, Jouppi et al. - 1997
173: Bulldog: A Compiler for VLIW Architectures - Ellis - 1985
171: Dependence graphs and compiler optimizations - Kuck, Kuhn et al. - 1981
160: Impact: An architectural framework for multiple-instruction-.. - Chang, Mahlke et al. - 1991
158: Effective compiler support for predicated execution using th.. - Mahlke, Lin et al. - 1992
151: Baring it all to software: Raw machines - Waingold - 1997
150: An efficient algorithm for exploiting multiple arithmetic un.. - Tomasulo - 1967
147: Alternative implementations of two-level adaptive training b.. - Yeh, Patt - 1992
137: Lockup-free instruction fetch/prefetch cache organization - Kroft - 1981
136: superscalar microprocessor - Yeager - 1996
128: Global optimization by suppression of partial redundancies - Morel, Renvoise - 1979
125: Trace processors - Rotenberg, Jacobson et al. - 1997
116: Monotone data flow analysis frameworks - Kam, Ullman - 1977
113: Data and computation transformations for multiprocessors - Anderson, Amarasinghe et al. - 1995
112: Highly concurrent scalar processing - Hsu, Davidson - 1986
107: Global instruction scheduling for superscalar machines - Bernstein, Rodeh - 1991
102: Dynamic speculation and synchronization of data dependences - Moshovos, Breach et al. - 1997
97: The case for a single-chip multiprocessor - Olukotun, Nayfeh et al. - 1996
93: High-bandwidth data memory systems for superscalar processor.. - Sohi, Franklin - 1991
92: A flexible approach to interprocedural data flow analysis an.. - Jones, Muchnick - 1982
91: Two-level adaptive branch prediction - Yeh, Patt - 1991
86: A precise inter-procedural data flow algorithm - Myers - 1981
85: Code scheduling and register allocation in large basic block.. - Goodman, Hsu - 1988
77: Efficient instruction scheduling for a pipelined architectur.. - Gibbons, Muchnick - 1986
76: The program summary graph and flow-sensitive interprocedural.. - Callahan - 1988
74: Instruction issue logic for high performance - Sohi - 1990
70: An interval-based approach to exhaustive and incremental int.. - Burke - 1990
70: The expandable split window paradigm for exploiting fine-gra.. - Franklin, Sohi - 1992
70: Integrating register allocation and instruction scheduling f.. - Bradlee, Eggers et al. - 1991
67: ARB: A hardware mechanism for dynamic reordering of memory r.. - Franklin, Sohi - 1996
67: Evaluation of compiler optimizations for fortran d on mimd d.. - Hiranandani, Kennedy et al. - 1992
66: Boosting beyond static scheduling in a superscalar processor - Smith, Lam et al. - 1990
65: Interconnect scaling - the real limiter to high performance .. - Bohr - 1996
62: An efficient resource-constrained global scheduling techniqu.. - Moon, Ebcioglu - 1992
60: Software and hardware for exploiting speculative parallelism.. - Oplinger - 1997
55: A program data flow analysis procedure - Allen, Cocke - 1976
53: Improving superscalar instruction dispatch and issue by expl.. - Vajapeyam, Mitra - 1997
52: Efficient superscalar performance through boosting - Smith, Horowitz et al. - 1992
52: A compilation technique for software pipelining of loops wit.. - Ebcioglu - 1987
50: Region scheduling: An approach for detecting and redistribut.. - Gupta, Soffa - 1990
49: The cray-1 computer system - Russell - 1978
48: Design of a Computer---The Control Data - Thornton - 1970
47: Code generation schema for modulo scheduled loops - Rau, Schlansker et al. - 1992
47: Sentinel scheduling for VLIW and superscalar processors - Mahlke, Chen et al. - 1992
44: A practical interprocedural data flow analysis algorithm - Barth - 1978
43: Control flow speculation in multiscalar processors - Jacobson, Bennett et al. - 1997
39: Path-based next trace prediction - Jacobson, Rotenberg et al. - 1997
39: Balanced scheduling: Instruction scheduling when memory late.. - Kerns, Eggers - 1993
36: Practical adaptation of the global optimization algorithm of.. - Dhamdhere - 1991
35: Percolation scheduling: A parallel compilation technique - Nicolau - 1985
33: take - a balanced code placement framework - Hanxleden, Kennedy - 1994
33: Superblock formation using static program analysis - Hank, Mahlke et al. - 1993
32: An efficient hybrid algorithm for incremental data flow anal.. - Marlowe, Ryder - 1990
32: Efficient computation of flow insensitive interprocedural su.. - Cooper, Kennedy - 1984
31: and inline expansion - Allen, Johnson et al. - 1988
30: Dynamic instruction scheduling and the astronautics zs - Smith - 1989
28: Register traffic analysis for streamlining inter-operation c.. - Franklin, Sohi - 1992
28: The anatomy of the register file in a multiscalar processor - Breach, Vijaykumar et al. - 1994
27: Parallel operation in the control data - Thornton - 1961
27: Guarded execution and branch prediction in dynamic ilp proce.. - Pnevmatikatos, Sohi - 1994
26: University of WisconsinMadison - Franklin, Architecture - 1993
26: University of WisconsinMadison - Franklin, Architecture et al. - 1993
25: Performance features of the pa7100 microprocessor - Asprey - 1993
25: Instruction-level parallel processing - Fisher, Rau - 1991
24: The zs-1 central processor - Smith - 1987
24: Multiprocessors from a software perspective - Amarasinghe, Anderson et al. - 1996
23: A simple algorithm for global data flow analysis problems - Hecht, Ullman - 1975
21: Data flow analysis for procedural languages - Rosen - 1979
20: mhz superscalar risc microprocessor with out-of-order execut.. - Gieseke - 1997
16: Partitioning parallel programs for macro-dataflow - Sarkar, Hennessy - 1986
16: The cache performance and optimizations of block algorithms - Lam, Rothberg et al. - 1991
15: The potential for thread-level data speculation in tightly-c.. - Steffan, Mowry - 1998
15: Enhanced region scheduling on a program dependence graph - Allan, Janardhan et al. - 1992
14: The program dependence graph and vectorization - Baxter - 1989
14: Machine organization of the ibm risc system/6000 processor - Grohoski - 1990
14: locality management in shared-memory multiprocessors - Markatos, LeBlanc et al. - 1991
14: Organization of the motorola 88110 superscalar RISC micropro.. - Diefendorff, Allen - 1992
13: An interprocedural data flow analysis algorithm - Barth - 1977
11: Incremental data flow analysis - Ryder - 1983
11: IBM RISC system/6000 processor architecture - Oehler, Groves - 1990
11: Digital leads the pack with - Gwennap - 1994
10: The design of a data flow analyzer - Chow, Rudmik - 1982
8: High-level data flow analysis - Rosen - 1977
8: Control flow prediction for dynamic ILP processors - Pnevmatikatos, Franklin et al. - 1993
6: Register communication strategies for the multiscalar archit.. - Vijaykumar, Sohi - 1996
5: Architecture of the IBM system - Amdahl - 1964
5: Power and PowerPC: Principles - Weiss, Smith - 1994
4: IEEE Micro - Hsu, the et al. - 1994
3: Intel reveals Pentium Implementation Details - Case - 1993
2: A solution to a problem with morel and renvoise's 'global op.. - Drechsler, Stadel - 1988
2: Englewood Cliffs - Muchnick, Jones - 1981
2: An efficient approach to data flow analysis in a multiple pa.. - Jain, Thompson - 1988
1: model 91: Machine philosophy and instruction-handling - Anderson, Sparacio et al. - 1967
1: Compiling register communication in the multiscalar architec.. - Vijaykumar, Sohi - 1996
1: Annual International Conference on Parallel Processing - Cytron, Beyond et al. - 1986
1: Annual International Conference on Parallel Processing - Chen, Yew et al. - 1994
1: PowerPC 604 powers past Pentium - Gwennap - 1994
The graph only includes citing articles where the year of publication is known.
Documents on the same site (http://www.cs.wisc.edu/~mscalar/publications.html):
Incorporating Guarded Execution into Existing Instruction Sets - Pnevmatikatos (1996)
Streamlining Data Cache Access with Fast Address Calculation - Austin, Pnevmatikatos, Sohi (1995)
Multiscalar Processors - Sohi (1995)