Abstract: A major challenge facing computer architects today is designing cost-effective hardware
that executes multiple operations simultaneously. The goal of such designs is to improve
performance by taking advantage of fine-grain parallelism. In this dissertation, I
study vector architectures, the oldest of several processor designs that support fine-grain
parallelism. Because implementing a cost-effective processor that performs well requires
studying not only the design of processors but also the...
Context of citations to this paper:
.... components that are needed to access the register file typically represent less than 5 of the area required by the register cells [10]. To access the register cell of a multiported RF, each port requires one transistor, a select line and a data line. In addition, a write port...
.... fact that the IC area of a register file increases linearly with the number and size of registers and quadratically with the number of ports [2]. The cycle time of the register file increases as a logarithmic function of the number of registers and read ports [3] Since...
Cited by:
Resource Widening Versus Replication Limits and . . . - Lopez et al. (1998)
Reducing The Impact Of Register Pressure On Software Pipelined Loops - Llosa (1996)
Maps: A Compiler-Managed Memory System for Software-Exposed.. - Barua (2000)
Active bibliography (related documents):
1.3: Advanced Vector Architectures - Espasa (1997)
1.0: Vector Microprocessors - Asanovic (1998)
0.8: Loop Optimization Techniques On Multi-Issue Architectures - Kaiser
Similar documents based on text:
0.2: Linguistic Issues in Grace (Evaluation of Part-of-Speech.. - Lecomte, Lucas, RAJMAN
0.2: Making Database Optimizers More Extensible - Das (1995)
0.2: Imitators and Optimizers in Symmetric n-Firm Cournot Oligopoly - Schipper (2001)
Related documents from co-citation:
8: The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers - Berry - 1989
7: Software pipelining: An effective scheduling technique for VLIW machines - Lam - 1988
6: Partitioned Register Files for VLIWs: A Preliminary Analysis of Tradeoffs - Capitanio, Dutt et al. - 1992
BibTeX entry:
Corinna G. Lee. Code Optimizers and Register Organizations for Vector Architectures. PhD thesis, University of California at Berkeley, 1992. http://citeseer.ist.psu.edu/lee92code.html
@techreport{ grace92code,
author = "Lee, Corinna Grace",
title = "{C}ode {O}ptimizers and {R}egister {O}rganizations for {V}ector {A}rchitectures",
number = "UCB//CSD-92-686",
month = "August",
year = "1992",
url = "citeseer.ist.psu.edu/lee92code.html" }
Citations (may not include all citations):
4212: Computers and Intractability: A Guide to the Theory of NP-Co.. - Garey, Johnson - 1979
1575: Computer Architecture: A Quantitative Approach - Patterson, Hennessy - 1990
1399: Compilers: Principles - Aho, Sethi et al. - 1986
480: The program dependence graph and its use in optimization - Ferrante, Ottenstein et al. - 1987
376: The cache performance and optimizations of blocked algorithm.. - Lam, Rothberg et al. - 1991
303: Princeton University Press - Ford, Fulkerson et al. - 1962
258: Automatic translation of FORTRAN programs to vector form - Allen, Kennedy - 1987
230: Limits of instruction-level parallelism - Wall - 1991
217: The Perfect Club benchmarks: Effective performance evaluatio.. - Berry, Chen et al. - 1988
201: Register allocation via coloring - Chaitin, Auslander et al. - 1981
178: The Connection Machine CM-5 Technical Summary - Corporation, Massachusetts - 1991
121: An architecture for software-controlled data prefetching - Klaiber, Levy - 1991
115: Reevaluating Amdahl's law - Gustafson - 1988
110: The Livermore FORTRAN Kernels: A computer test of the numeri.. - McMahon - 1986
110: Available instruction-level parallelism for superscalar and .. - Jouppi, Wall - 1989
104: The Structure of Computers and Computations - Kuck - 1978
94: Graphs and Algorithms - Gondran, Minoux - 1984
93: High-bandwidth data memory systems for superscalar processor.. - Sohi, Franklin - 1991
85: Code scheduling and register allocation in large basic block.. - Goodman, Hsu - 1988
84: Efficient and exact data dependence analysis - Maydan, Hennessy et al. - 1991
77: Efficient instruction scheduling for a pipelined architectur.. - Gibbons, Muchnick - 1986
73: Parallel algorithms for dense linear algebra computations - Gallivan, Plemmons et al. - 1990
66: The generation of optimal code for arithmetic expressions - Sethi, Ullman - 1970
59: Very long instruction word architectures and the ELI - Fisher - 1983
54: Complete register allocation problems - Sethi - 1975
53: Optimal code generation for expression trees - Aho, Johnson - 1976
47: Detection and parallel execution of independent instructions - Tjaden, Flynn - 1970
46: A study of scalar compilation techniques for pipelined super.. - Weiss, Smith - 1987
41: and Allan Porterfield - Callahan, Kennedy - 1991
38: The Cydra 5 departmental supercomputer: Design philosophies - Rau, Yen et al.
34: Adam Hilger Ltd - Hockney, Jesshope et al. - 1981
32: Private communication - Wawrzynek - 1991
31: and inline expansion - Allen, Johnson et al. - 1988
22: The Warp computer: Architecture - Annaratone, Arnould et al. - 1987
21: The nonuniform distribution of instruction-level and machine.. - Jouppi - 1989
20: Architecture and implementation of a VLIW supercomputer - Colwell, Hall et al. - 1990
20: Register allocation in the SPUR Lisp compiler - Larus, Hilfinger - 1986
19: Register allocation via usage counts - Freiburghouse - 1974
16: Compiler techniques for optimizing memory and register usage.. - Eisenbeis, Jalby et al. - 1990
15: Strategies for achieving improved processor throughput - Farrens, Pleszkun - 1991
15: Restructuring Symbolic Programs for Concurrent Execution on .. - Larus - 1982
14: Index register allocation - Horwitz, Karp et al. - 1966
12: Private communication - Demmel - 1992
12: On compiling algorithms for arithmetic expressions - Nakata - 1967
12: Vector register design for polycyclic vector scheduling - Mangione-Smith, Abraham et al. - 1991
11: The effect on RISC performance of register set size and stru.. - Bradlee, Eggers et al. - 1991
10: The Titan graphics supercomputer architecture - Diede, Hagenmaier et al. - 1988
10: IEEE Transactions on Software Engineering - Hsu, Fischer et al. - 1989
10: Tradeoffs in instruction format design for horizontal archit.. - Sohi, Vajapeyam - 1989
9: ported CMOS register file - Jolly, -ns - 1991
8: Compiler Construction for Digital Computers - Gries - 1971
8: Microprocessor technology trends - Myers, Yu et al. - 1986
7: Multi-threaded vectorization - Chiueh - 1991
7: The IBM System/370 vector architecture - Buchholz - 1986
7: An optimal instruction-scheduling model for a class of vecto.. - Arya - 1985
7: A comparison of list schedulers for parallel processing syst.. - Adam, Chandy et al. - 1974
6: The IBM 3090 system: An overview - Tucker - 1986
6: Code Optimization of Pipeline Constraints - Gross - 1983
6: On arithmetic expressions and trees - Redziejowski - 1969
6: Register assignment algorithm for generation of highly optim.. - Beatty - 1974
6: Private communication - Fisher - 1990
6: Applied Graph Theory - Marshall - 1971
5: Intel's secret is out - Perry - 1989
5: High-speed processing schemes for summation type and iterati.. - Wada, Ishii et al. - 1988
4: MP: The birth of a supercomputer - August, Brost et al.
4: transistor microprocessor - Kohn, Fu - 1989
4: An efficient algorithm for exploiting multiple arithmetic uni.. - Tomasulo - 1967
4: Squeezing more CPU performance out of a Cray-2 by vector blo.. - Eisenbeis, Jalby et al. - 1988
3: An introduction to vector processing - Johnson - 1978
3: Efficient computation of expressions with common subexpressi.. - Prabhala, Sethi - 1980
3: Optimal chaining in expression trees - Bernstein, Boral et al. - 1988
3: and Steve Wallach - Chastain, Gostin et al. - 1988
2: chaining on 1-port vector supercomputers - Tang, Davidson et al. - 1988
2: BiCMOS vector-pipelined processor - Okamoto, Hagihara et al. - 1991
2: Engineering design of the Convex C - Jones
2: SuperSPARC: A fully integrated superscalar processor - Blanck, Krueger - 1991
2: Vector system performance of the IBM - Clark, Wilson - 1986
2: Introduction of NEC Supercomputer SX system - Watanabe, Katayama et al.
2: Economy and allocation of memory in the ALPHA-translator - Ershov, Zmiyevskaya et al. - 1971
2: A survey of algorithms for register allocation in straight-l.. - Rajlich, Moshier - 1984
1: The CRAY-1, the CRAY X-MP, the CRAY-2 and beyond: The superc.. - Thompson
1: MP Computer Systems Functional Description Manual - Research, Y- - 1988
1: Designing for Speed: An Introduction to the Cray World of Co.. - Research - 1989
1: Cft77 Online Manual Page - Research - 1990
1: Optimal assignment of computer storage by chain decompositio.. - Dantzig, Reynolds - 1966
1: Architectural support for overlapped loops on the Cydra - Dehnert, Hsu et al. - 1989
1: Real machines: Design choices/engineering trade-offs - Patt, editor et al. - 1989
1: Part of distribution tape for SPEC benchmark suite - for - 1989
1: MFLOPS single-chip supercomputer - Iino, Takahashi et al. - 1992
1: Peak versus sustained performance in highly concurrent vecto.. - Hack - 1986
1: Also published as Technical Report No - Johnson, Design et al. - 1989
1: Fujitsu's supercomputer: FACOM vector processor system - Miura
1: Demolition of reasonable principles by pretty pathetic pract.. - Lincoln - 1990
1: technical overview: 64 bits/100 MHz or bust - Killian - 1991
1: An analysis of the Cray-1 computer - Sites - 1978
1: Vector processing on the VAXvector 6000 Model - Slater, Fenwick et al. - 1990
1: Notes on evaluation of linear recurrences on a vector proces.. - Smith, Taylor - 1990
1: Alphad architecture and first implementation - Sites, Witek - 1992
1: array processor system - Odaka, Nagashima et al.
1: Spill code minimization techniques for optimizing compiler.. - Bernstein, Goldin et al. - 1989
1: mail to netlib@tantalus - Gustafson, Rover et al. - 1991
1: Reducing the problem of memory allocation when compiling pro.. - Ershov - 1962
1: An evaluation of Cray-1 and Cray X-MP performance on vectori.. - Tang, Davidson - 1988
1: Notes for course at Illinois Summer Institute on Parallelizi.. - Harrison, of - 1990
1: Supercomputers: Class VI Systems - Fernbach - 1986
1: MIMIC: A custom VLSI parallel for musical sound synthesis - Wawrzynek, von Eicken - 1990
1: The Promise of the Next Decade - Hennessy, Jouppi et al. - 1991
Documents on the same site (http://hypatia.dcs.qmw.ac.uk/SEL-HPC/Articles/GeneratedHtml/comp.opt.html):
Static Branch Frequency and Program Profile Analysis - Wu, Larus (1994)
Experimental Evaluation of Some Data Dependence Tests.. - Petersen et al. (1991)
Annotation-Directed Run-Time Specialization in C - Grant
CiteSeer.IST - Copyright Penn State and NEC