Compiling for the Multiscalar Architecture (1998)  (25 citations)
T. N. Vijaykumar




 
View or download:
wisc.edu/sohi/theses/vijay.ps.gz

From:  wisc.edu/~mscalar/publications

Abstract: High-performance, general-purpose microprocessors serve as compute engines for computers ranging from personal computers to supercomputers. Sequential programs constitute a major portion of the real-world software that runs on these computers. State-of-the-art microprocessors exploit instruction-level parallelism (ILP) to achieve high performance on such applications by searching for independent instructions in a dynamic window of instructions and executing them on a wide-issue pipeline. Increasing...

Cited by:
Compiler Optimization of Value Communication for Thread-Level.. - Zhai (2005)
Tolerating Dependences Between Large Speculative.. - Christopher Colohan..
Improving Cache Locality for Thread-Level Speculation Systems - Fung (2005)

Active bibliography (related documents):
0.8:   Design And Evaluation Of A Multiscalar Processor - Breach (1998)
0.6:   Complexity-Effective Superscalar Processors - Palacharla (1998)
0.6:   The Microarchitecture of Superscalar Processors - Smith, Sohi (1995)

Similar documents based on text:
0.2:   Multiscalar Processors - Sohi (1995)
0.2:   Aspects Of The Molecular Phylogeny Of Three Species Of The.. - Zizania Based On
0.1:   Data Memory Alternatives for Multiscalar Processors - Breach, Vijaykumar, Gopal, .. (1997)

Related documents from co-citation:
22:   Multiscalar processors - Sohi, Breach et al. - 1995
18:   The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Par.. - Steffan, Mowry - 1998
16:   Speculative Versioning Cache - Gopal, Vijaykumar et al. - 1998

BibTeX entry:

T. N. Vijaykumar. Compiling for the Multiscalar Architecture. Ph.D. thesis, University of Wisconsin-Madison, Madison, WI 53706, Jan. 1998. http://citeseer.ist.psu.edu/vijaykumar98compiling.html

@techreport{vijaykumar98compiling,
    author = "T. N. Vijaykumar",
    title = "Compiling for the Multiscalar Architecture",
    number = "CS-TR-1998-1370",
    year = "1998",
    url = "citeseer.ist.psu.edu/vijaykumar98compiling.html"
}

Citations (may not include all citations):
1399   Compilers: Principles - Aho, Sethi et al. - 1986
474   A data locality optimizing algorithm - Wolf, Lam - 1991
407   Trace scheduling: A technique for global microcode compactio.. - Fisher - 1981
390   Interprocedural slicing using dependence graphs - Horwitz, Reps et al. - 1990
352   Supercompilers for Parallel and Vector Computers - Zima, Chapman - 1991
318   IEEE Transactions on Software Engineering - Weiser - 1984
283   Optimizing Supercompilers for Supercomputers - Wolfe - 1990
269   Multiscalar processors - Sohi, Breach et al. - 1995
260   Validity of the single processor approach to achieving large.. - Amdahl - 1967
258   Automatic translation of fortran programs to vector form - Allen, Kennedy - 1987
241   A study of branch prediction strategies - Smith - 1981
237   Global optimizations for parallelism and locality on scalabl.. - Anderson, Lam - 1993
214   Combining branch predictors - McFarling - 1993
193   Superscalar Microprocessor Design - Johnson - 1991
186   Exploiting choice: Instruction fetch and issue on an impleme.. - Tullsen, Eggers et al. - 1996
175   Complexity-effective superscalar processors - Palacharla, Jouppi et al. - 1997
173   Bulldog: A Compiler for VLIW Architectures - Ellis - 1985
171   Dependence graphs and compiler optimizations - Kuck, Kuhn et al. - 1981
160   Impact: An architectural framework for multiple-instruction-.. - Chang, Mahlke et al. - 1991
158   Effective compiler support for predicated execution using th.. - Mahlke, Lin et al. - 1992
151   Baring it all to software: Raw machines - Waingold - 1997
150   An efficient algorithm for exploiting multiple arithmetic un.. - Tomasulo - 1967
147   Alternative implementations of two-level adaptive training b.. - Yeh, Patt - 1992
137   Lockup-free instruction fetch/prefetch cache organization - Kroft - 1981
136   superscalar microprocessor - Yeager - 1996
128   Global optimization by suppression of partial redundancies - Morel, Renvoise - 1979
125   Trace processors - Rotenberg, Jacobson et al. - 1997
116   Monotone data flow analysis frameworks - Kam, Ullman - 1977
113   Data and computation transformations for multiprocessors - Anderson, Amarasinghe et al. - 1995
112   Highly concurrent scalar processing - Hsu, Davidson - 1986
107   Global instruction scheduling for superscalar machines - Bernstein, Rodeh - 1991
102   Dynamic speculation and synchronization of data dependences - Moshovos, Breach et al. - 1997
97   The case for a single-chip multiprocessor - Olukotun, Nayfeh et al. - 1996
93   High-bandwidth data memory systems for superscalar processor.. - Sohi, Franklin - 1991
92   A flexible approach to interprocedural data flow analysis an.. - Jones, Muchnick - 1982
91   Two-level adaptive branch prediction - Yeh, Patt - 1991
86   A precise inter-procedural data flow algorithm - Myers - 1981
85   Code scheduling and register allocation in large basic block.. - Goodman, Hsu - 1988
77   Efficient instruction scheduling for a pipelined architectur.. - Gibbons, Muchnick - 1986
76   The program summary graph and flow-sensitive interprocedural.. - Callahan - 1988
74   Instruction issue logic for high performance - Sohi - 1990
70   An interval-based approach to exhaustive and incremental int.. - Burke - 1990
70   The expandable split window paradigm for exploiting fine-gra.. - Franklin, Sohi - 1992
70   Integrating register allocation and instruction scheduling f.. - Bradlee, Eggers et al. - 1991
67   ARB: A hardware mechanism for dynamic reordering of memory r.. - Franklin, Sohi - 1996
67   Evaluation of compiler optimizations for fortran d on mimd d.. - Hiranandani, Kennedy et al. - 1992
66   Boosting beyond static scheduling in a superscalar processor - Smith, Lam et al. - 1990
65   Interconnect scaling - the real limiter to high performance .. - Bohr - 1996
62   An efficient resource-constrained global scheduling techniqu.. - Moon, Ebcioglu - 1992
60   Software and hardware for exploiting speculative parallelism.. - Oplinger - 1997
55   A program data flow analysis procedure - Allen, Cocke - 1976
53   Improving superscalar instruction dispatch and issue by expl.. - Vajapeyam, Mitra - 1997
52   Efficient superscalar performance through boosting - Smith, Horowitz et al. - 1992
52   A compilation technique for software pipelining of loops wit.. - Ebcioglu - 1987
50   Region scheduling: An approach for detecting and redistribut.. - Gupta, Soffa - 1990
49   The cray-1 computer system - Russell - 1978
48   Design of a Computer---The Control Data - Thornton - 1970
47   Code generation schema for modulo scheduled loops - Rau, Schlansker et al. - 1992
47   Sentinel scheduling for VLIW and superscalar processors - Mahlke, Chen et al. - 1992
44   A practical interprocedural data flow analysis algorithm - Barth - 1978
43   Control flow speculation in multiscalar processors - Jacobson, Bennett et al. - 1997
39   Path-based next trace prediction - Jacobson, Rotenberg et al. - 1997
39   Balanced scheduling: Instruction scheduling when memory late.. - Kerns, Eggers - 1993
36   Practical adaptation of the global optimization algorithm of.. - Dhamdhere - 1991
35   Percolation scheduling: A parallel compilation technique - Nicolau - 1985
33   take - a balanced code placement framework - Hanxleden, Kennedy - 1994
33   Superblock formation using static program analysis - Hank, Mahlke et al. - 1993
32   An efficient hybrid algorithm for incremental data flow anal.. - Marlowe, Ryder - 1990
32   Efficient computation of flow insensitive interprocedural su.. - Cooper, Kennedy - 1984
31   and inline expansion - Allen, Johnson et al. - 1988
30   Dynamic instruction scheduling and the astronautics zs - Smith - 1989
28   Register traffic analysis for streamlining inter-operation c.. - Franklin, Sohi - 1992
28   The anatomy of the register file in a multiscalar processor - Breach, Vijaykumar et al. - 1994
27   Parallel operation in the control data - Thornton - 1961
27   Guarded execution and branch prediction in dynamic ilp proce.. - Pnevmatikatos, Sohi - 1994
26   University of Wisconsin-Madison - Franklin, Architecture - 1993
26   University of Wisconsin-Madison - Franklin, Architecture et al. - 1993
25   Performance features of the pa7100 microprocessor - Asprey - 1993
25   Instruction-level parallel processing - Fisher, Rau - 1991
24   The zs-1 central processor - Smith - 1987
24   Multiprocessors from a software perspective - Amarasinghe, Anderson et al. - 1996
23   A simple algorithm for global data flow analysis problems - Hecht, Ullman - 1975
21   Data flow analysis for procedural languages - Rosen - 1979
20   mhz superscalar risc microprocessor with out-of-order execut.. - Gieseke - 1997
16   Partitioning parallel programs for macro-dataflow - Sarkar, Hennessy - 1986
16   The cache performance and optimizations of block algorithms - Lam, Rothberg et al. - 1991
15   The potential for thread-level data speculation in tightly-c.. - Steffan, Mowry - 1998
15   Enhanced region scheduling on a program dependence graph - Allan, Janardhan et al. - 1992
14   The program dependence graph and vectorization - Baxter - 1989
14   Machine organization of the ibm risc system/6000 processor - Grohoski - 1990
14   locality management in shared-memory multiprocessors - Markatos, LeBlanc et al. - 1991
14   Organization of the motorola 88110 superscalar RISC micropro.. - Diefendorff, Allen - 1992
13   An interprocedural data flow analysis algorithm - Barth - 1977
11   Incremental data flow analysis - Ryder - 1983
11   IBM RISC system/6000 processor architecture - Oehler, Groves - 1990
11   Digital leads the pack with - Gwennap - 1994
10   The design of a data flow analyzer - Chow, Rudmik - 1982
8   High-level data flow analysis - Rosen - 1977
8   Control flow prediction for dynamic ILP processors - Pnevmatikatos, Franklin et al. - 1993
6   Register communication strategies for the multiscalar archit.. - Vijaykumar, Sohi - 1996
5   Architecture of the IBM system - Amdahl - 1964
5   Power and PowerPC: Principles - Weiss, Smith - 1994
4   IEEE Micro - Hsu, the et al. - 1994
3   Intel reveals Pentium Implementation Details - Case - 1993
2   A solution to a problem with morel and renvoise's 'global op.. - Drechsler, Stadel - 1988
2   Englewood Cliffs - Muchnick, Jones - 1981
2   An efficient approach to data flow analysis in a multiple pa.. - Jain, Thompson - 1988
1   model 91: Machine philosophy and instruction-handling - Anderson, Sparacio et al. - 1967
1   Compiling register communication in the multiscalar architec.. - Vijaykumar, Sohi - 1996
1   Annual International Conference on Parallel Processing - Cytron, Beyond et al. - 1986
1   Annual International Conference on Parallel Processing - Chen, Yew et al. - 1994
1   PowerPC 604 powers past Pentium - Gwennap - 1994



Documents on the same site (http://www.cs.wisc.edu/~mscalar/publications.html):
Incorporating Guarded Execution into Existing Instruction Sets - Pnevmatikatos (1996)
Streamlining Data Cache Access with Fast Address Calculation - Austin, Pnevmatikatos, Sohi (1995)
Multiscalar Processors - Sohi (1995)

CiteSeer.IST - Copyright Penn State and NEC