This directory is created automatically, and some papers may be mislabeled. Only documents within the CiteSeer database are listed. The directory is intended to provide entry points for browsing the database and is not intended to be authoritative. Papers may not appear in all relevant categories. For example, papers in a sub-category may not appear in higher-level categories.
258 Parallel Programming in Split-C - Culler (1993)
We introduce the Split-C language, a parallel extension
of C intended for high performance programming
on distributed memory multiprocessors, and demonstrate
the use of the language in optimizing para... / the programmer specifies the compiler takes care of addressing and ... or locality is not limited by the compiler's recognition capability nor
212 LEDA - A Platform for Combinatorial and Geometric Computing - Mehlhorn, Näher (1995)
LEDA is a library of efficient data types and algorithms in combinatorial and
geometric computing. The main features of the library are its wide collection
of data types and algorithms, the precise an... / can be used with any C++ compiler supporting templates e.g. cfront ... areas such as discrete optimization scheduling traffic control
200 Uniprocessor Garbage Collection Techniques - Wilson (1992)
We survey basic garbage collection algorithms, and variations such as incremental and generational collection; we then discuss low-level implementation considerations and the relationships between sto... / systems languages and compilers. Throughout we attempt to ... and Smart Pointers . Compiler Cooperation and Optimizations
194 Shade: A Fast Instruction-Set Simulator for Execution Profiling - Bob Cmelik (1993)
Tracing tools are used widely to help analyze, design, and tune
both hardware and software systems. This paper describes a tool
called Shade which combines efficient instruction-set simulation
with a ... / everything from architectures to compilers to applications. Analyzers can ... on particular languages and compilers. Ideally it should also avoid
184 Compiler Transformations for High-Performance Computing - Bacon (1993)
In the last three decades a large number of compiler transformations for optimizing programs have been implemented. Most optimizations for uniprocessors reduce the number of instructions executed by t... / Compiler Transformations for ... three decades a large number of compiler transformations for optimizing
170 KIDS: A Semi-Automatic Program Development System - Smith (1990)
The Kestrel Interactive Development System (KIDS) provides automated support for the development of correct and efficient programs from formal specifications. The system has components for performing ... / executable form by a conventional compiler. The initial algorithm that KIDS ... language also called REFINE and compiler. The language supports
162 Performance of Various Computers Using Standard Linear Equations.. - Dongarra (1995)
This report compares the performance of different computer systems in solving dense systems of linear equations. The comparison involves approximately a hundred computers, ranging from a Cray Y-MP to ... / or multiple processors. The compilers on some machines may of ... gives the operating system and compiler used. The run was based on two
138 A Survey of Program Slicing Techniques - Tip (1995)
A program slice consists of the parts of a program that (potentially) affect the
values computed at some point of interest, referred to as a slicing criterion. The task
of computing program slices is ... / are investigated. We discuss how compiler-optimization techniques can be ... Section . Section suggests how compiler-optimization techniques may be
137 LimitLESS Directories: A Scalable Cache Coherence Scheme - Chaiken, Kubiatowicz, Agarwal (1991)
Caches enhance the performance of multiprocessors by reducing
network traffic and average memory access latency.
However, cache-based systems must address the problem of
cache coherence. We propose th... / closely with a multiprocessor's compiler and run-time system. The ... by function as static compiler-dependent or dynamic using
135 Simultaneous Multithreading: Maximizing On-Chip Parallelism - Tullsen, Eggers, Levy (1995)
This paper examines simultaneous multithreading, a technique permitting
several independent threads to issue instructions to a superscalar's
multiple functional units in a single cycle. We present
se... / Multiflow trace scheduling compiler Our results show the ... the workload and the wide-issue compiler optimization and scheduling
126 Value Locality and Load Value Prediction - Lipasti et al. (1996)
Since the introduction of virtual memory demand-paging
and cache memories, computer systems have been exploiting
spatial and temporal locality to reduce the average latency of a
memory reference. In t... / by modern state-of-the-art compilers exhibits these tendencies. We ... for e.g. a switch statement the compiler must generate code to load a
118 Programming In Vienna Fortran - Chapman, Mehrotra, Zima (1992)
Exploiting the full performance potential of distributed memory machines requires
a careful distribution of data across the processors. Vienna Fortran is a language
extension of Fortran which provides... / not only for Fortran and current compiler research is aimed at ... implementing them. Research in compiler technology has so far resulted in
118 A Metaobject Protocol for C++ - Chiba (1995)
This paper presents a metaobject protocol (MOP)
for C++. This MOP was designed to bring the
power of meta-programming to C++ programmers.
It avoids penalties on runtime performance
by adopting a new m... / objects or customized compiler optimizations such Appeared in ... extensions but also ones for compiler optimizations. From the
116 TIL: A Type-Directed Optimizing Compiler for ML - Tarditi, Morrisett, Cheng (1995)
We describe a new compiler for Standard ML called TIL, that is based on four technologies: intensional
polymorphism, tag-free garbage collection, conventional functional language optimization,
and loo... / TIL A Type-Directed Optimizing Compiler for ML David Tarditi Greg ... Abstract We describe a new compiler for Standard ML called TIL that
113 MediaBench: A Tool for Evaluating and Synthesizing Multimedia and.. - Lee (1997)
Over the last decade, significant advances have been made in compilation technology for capitalizing on instruction-level parallelism (ILP). The vast majority of ILP compilation research has been cond... / matched to the needs of the ILP compilers. Most of these processors are ... currently exists a gap between the compiler community and embedded
108 Compiling Polymorphism Using Intensional Type Analysis - Harper, Morrisett (1995)
Traditional techniques for implementing polymorphism use a universal representation for objects of unknown type. Often, this forces a compiler to use universal representations even if the types of obj... / type. Often this forces a compiler to use universal representations ... Introduction Many compilers assume a universal or boxed
100 An Implementation of Interprocedural Bounded Regular Section Analysis - Havlak, Kennedy (1991)
Optimizing compilers should produce efficient code even in the presence of high-level language
constructs. However, current programming support systems are significantly lacking in their ability
to an... / Abstract Optimizing compilers should produce efficient code ... Introduction A major goal of compiler optimization research is to
100 Improving Register Allocation for Subscripted Variables - Callahan, Carr, Kennedy (1990)
Most conventional compilers fail to allocate array elements to registers because standard data-flow analysis treats arrays like scalars, making it impossible to analyze the definitions and uses of ind... / Abstract Most conventional compilers fail to allocate array elements ... register allocators found in most compilers. In addition we present
99 Lifetime-Sensitive Modulo Scheduling - Huff (1993)
This paper shows how to software pipeline a loop for minimal
register pressure without sacrificing the loop's minimum
execution time. This novel bidirectional slack-scheduling
method has been impleme... / been implemented in a FORTRAN compiler and tested on many scientific ... To find an overlapped schedule a compiler must represent the complex
97 Optimization of Object-Oriented Programs Using Static Class Hierarchy .. - Dean, Grove, Chambers (1995)
Optimizing compilers for object-oriented languages apply static
class analysis and other techniques to try to deduce precise information about
the possible classes of the receivers of messages; if s... / Abstract. Optimizing compilers for object-oriented languages ... class hierarchy analysis the compiler can improve the quality of static
94 Compiler Optimizations for Improving Data Locality - Carr, McKinley, Tseng (1994)
In the past decade, processor speed has become significantly faster than memory speed. Small, fast cache memories are designed to overcome this discrepancy, but they are only effective when programs e... / Compiler Optimizations for Improving Data ... In this paper we present compiler optimizations to improve data
91 Optimizing ML with Run-Time Code Generation - Leone, Lee (1995)
We describe the design and implementation of a compiler that automatically translates ordinary
programs written in a subset of ML into code that generates native code at run time. Run-time
code genera... / the design and implementation of a compiler that automatically translates ... Our system called Fabius is a compiler that takes ordinary programs
89 Dealing With Disaster: Surviving Misbehaved Kernel Extensions - Seltzer (1996)
Today's extensible operating systems allow applications
to modify kernel behavior by providing mechanisms for
application code to run in the kernel address space. The
advantage of this approach is tha... / can take advantage of advanced compiler optimization techniques ... e.g. compiled with the correct compiler Finally we must limit the
88 JavaParty - Transparent Remote Objects in Java - Philippsen, Zenger (1997)
Java's threads offer appropriate means either for parallel programming of SMPs or as target constructs when compiling add-on features (e.g. forall constructs, automatic parallelization, etc.) Unfortun... / to specific nodes of the network compiler and runtime system deal with ... selected and changed at runtime. Compiler analysis or a well-informed
87 Implementation of a Portable Nested Data-Parallel Language - Blelloch, Chatterjee, Hardwick.. (1994)
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language.
This language and its implementation are the first to fully support nested data structures as well... / nested parallelism allows a compiler to convert them into a form that ... Fortran and CM Fortran compilers generate near-optimal code. The
80 Improving Data Locality with Loop Transformations - McKinley (1996)
In this article, we present compiler optimizations to improve data locality based on a simple yet accurate cost model. The model computes both temporal and spatial unknown Improving Data Locality with Lo... / In this article we present compiler optimizations to improve data ... Languages Processors-compilers optimization General
79 Scout: A Communications-Oriented Operating System - Montz, Mosberger, O'Malley.. (1994)
This white paper describes Scout, a new operating system being designed for systems connected to the National Information Infrastructure (NII). Scout provides a communication-oriented software archite... / with the application of advanced compiler techniques result in a system ... to the overall system. . Compiler Support A key design principle
79 Automatically Tuned Linear Algebra Software - Whaley, Dongarra (1997)
This paper describes an approach for the automatic generation and optimization of numerical
software for processors with deep memory hierarchies and pipelined functional units.
The production of such ... / Why Can't the Compiler Do This ... Effects of poor compilers
78 Type-Directed Partial Evaluation - Danvy (1996)
We present a strikingly simple partial evaluator, that is type-directed
and reifies a compiled program into the text of a residual,
specialized program. Our partial evaluator is concise
(a few lines) a... / subtyping and coercions compiler optimization and run-time code ... semantics-based compilation and compiler generation. Background and
77 Reducing Indirect Function Call Overhead In C++ Programs - Calder, Grunwald (1994)
Modern computer architectures increasingly depend on mechanisms that estimate future control flow decisions to increase performance. Mechanisms such as speculative execution and prefetching are becomi... / techniques and demonstrate how compilers can use existing branch ... in control and there are few compiler or hardware tricks that could
76 Practical Dependence Testing - Goff, Kennedy, Tseng (1991)
Precise and efficient dependence tests are essential to
the effectiveness of a parallelizing compiler. This paper
proposes a dependence testing scheme based on classifying
pairs of subscripted variabl... / effectiveness of a parallelizing compiler. This paper proposes a ... in both PFC a parallelizing compiler and ParaScope a parallel
75 Data and Computation Transformations for Multiprocessors - Anderson (1995)
Effective memory hierarchy utilization is critical to the performance of modern multiprocessor architectures. We have developed the first compiler system that fully automatically parallelizes sequentia... / We have developed the first compiler system that fully automatically ... framework. We ran our compiler on a set of application programs
74 Optimizing Matrix Multiply using PHiPAC: a Portable.. - Bilmes, Asanovic, Demmel, Lam, Chin (1996)
BLAS3 operations have great potential for aggressive optimization. Unfortunately, they usually need to be hand-coded for a specific machine and compiler to achieve near-peak performance. We have develop... / for a specific machine and compiler to achieve near-peak performance. ... analyzing current machines and C compilers we've developed guidelines for
72 Optimizing for Parallelism and Data Locality - Kennedy, McKinley (1992)
Previous research has used program transformation to
introduce parallelism and to exploit data locality. Unfortunately,
these two objectives have usually been considered
independently. This work explo... / are two of the most valuable compiler techniques in use today. ... of the cache is required the compiler must know the cache line size
72 Branch Prediction For Free - Ball, Larus (1993)
Many compilers rely on branch prediction to improve program performance by identifying frequently
executed regions and by aiding in scheduling instructions. Profile-based predictors
require a time-con... / Abstract Many compilers rely on branch prediction to ... information available to a compiler would enhance our heuristics.
71 Unifying Data and Control Transformations for Distributed Shared.. - Cierniak (1994)
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Control transformations inv... / have developed new techniques for compiler optimizations for distributed ... with a memory hierarchy. Our compiler optimizations are based on an
71 Optimization of Instruction Fetch Mechanisms for High Issue Rates - Conte, Menezes, Mills, Patel (1995)
Recent superscalar processors issue four instructions
per cycle. These processors are also powered by
highly-parallel superscalar cores. The potential performance
can only be exploited when fed by hig... / The performance boost provided by compiler optimization techniques is also ... investigated. Results show that compiler optimization can significantly
71 Unifying Data and Control Transformations for Distributed.. - Cierniak, Li (1994)
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Control transformations inv... / have developed new techniques for compiler optimizations for distributed ... with a memory hierarchy. Our compiler optimizations are based on an
71 SPNP: Stochastic Petri Net Package - Ciardo, Muppala, Trivedi (1989)
We present SPNP, a powerful GSPN package developed at Duke University. SPNP allows the modeling of complex system behaviors. Advanced constructs are available, such as marking-dependent arc multiplicit... / it is compiled using the C compiler and then linked with the ... sensitivity analysis are system optimization and bottleneck analysis. A
70 Increasing Network Throughput by Integrating Protocol Layers - Abbott, Peterson (1993)
Integrating protocol data manipulations is a strategy for increasing the throughput of network protocols. The idea is to combine a series of protocol layers into a pipeline so as to access message dat... / Integration generalizes the compiler optimization known as loop ... into one with good locality. The compiler takes advantage of this increased
68 Reducing Memory Latency via Non-blocking and Prefetching Caches - Chen (1992)
Non-blocking caches and prefetching caches are two techniques for hiding memory latency by exploiting the overlap of processor computations with data accesses. A non-blocking cache allows execution to... / these approaches. We also consider compiler-based optimizations to enhance ... can be improved substantially by compiler optimizations such as instruction
68 Compiler-directed Data Prefetching in Multiprocessors with Memory.. - Edward Gornish (1990)
Memory hierarchies are used by multiprocessor systems
to reduce large memory access times. It is necessary to
automatically manage such a hierarchy, to obtain effective
memory utilization. In this pap... / Compiler-directed Data Prefetching in ... We take the approach that the compiler should perform the program
68 A Retargetable Technique for Predicting Execution Time of Code.. - Harmon, Baker, Whalley (1992)
Predicting the execution times of straight-line code sequences is a fundamental problem in the design and evaluation of hard-real-time systems. The reliability of system-level timings and schedulabi... / into account. This technique is compiler and language-independent and ... is integrated with an existing C compiler. This system predicts the bounded
65 A Practical System for Intermodule Code Optimization at Link-Time - Srivastava, Wall (1992)
We have developed a system called OM to explore the problem of code optimization at link-time. OM takes a collection of object modules constituting the entire program, and converts the object code int... / to perform optimizations that a compiler looking at a single module cannot ... the particular source language or compiler this also gives us the chance
64 To Copy or Not to Copy: A Compile-Time Technique for Assessing When.. - Temam (1993)
In this paper, we present a compile-time technique for making this determination,
and present a selective copying strategy based on this methodology. Preliminary experimental
results demonstrate that, be... / data reuse cache conflicts compiler-directed cache management ... incorporated into production compilers. Without copying the behavior
63 Compiler Blockability of Numerical Algorithms - Carr (1992)
Over the past decade, microprocessor design strategies have focused on increasing the computational power on a single chip. Unfortunately, memory speeds have not kept pace. The result is an imbalance ... / Compiler Blockability of Numerical ... CRPC -MS Houston TX Compiler Blockability of Numerical
61 Making Pure Object-Oriented Languages Practical - Chambers, Ungar (1991)
In the past, object-oriented language designers and programmers have been forced to choose between pure message passing and performance. Last year, our SELF system achieved close to half the speed of ... / about as fast as an optimizing C compiler and runs at over half the speed ... single target method and so the compiler cannot simply expand its
60 Profile-Guided Automatic Inline Expansion for C Programs - Chang, Mahlke, Chen, Hwu (1992)
This paper describes critical implementation issues that must be addressed to develop a fully automatic inliner. These issues are: integration into a compiler, program representation, hazard preventio... / issues are integration into a compiler program representation hazard ... integrated into an optimizing C compiler. The experimental results show
58 An Overview of the Pablo Performance Analysis Environment - Reed, Aydt, Madhyastha, Noe.. (1992)
As massively parallel, distributed memory systems replace traditional vector supercomputers, effective application program optimization and system resource management become more than research curiosi... / Performance Tool Compiler Integration ... Pablo with data parallel Fortran compilers based on the emerging High
57 A Static Parameter based Performance Prediction Tool for Parallel.. - Fahringer, Zima (1993)
This paper presents a Parameter based Performance Prediction
Tool (P³T) which is part of the Vienna Fortran Compilation
System (VFCS), a compiler that automatically translates
Fortran programs in... / Compilation System VFCS a compiler that automatically translates ... programs. In contrast to earlier compilers such as SUPERB and the
57 Register Allocation with Instruction Scheduling: a New Approach - Pinter (1993)
We present a new framework in which considerations of both register allocation and instruction scheduling can be applied uniformly and simultaneously. In this framework an optimal coloring of a graph,... / is an important task of every compiler. The problem of efficiently ... instruction scheduling in some compilers like those for the MIPS
56 A Linear Algebra Framework for Static HPF Code Distribution - Corinne Ancourt (1995)
High Performance Fortran (HPF) was developed
to support data parallel programming for SIMD
and MIMD machines with distributed memory. The
programmer is provided a familiar uniform logic address
space... / distribution by directives. The compiler then exploits these directives ... to shift part of the burden onto compilers by providing the programmer a
55 Compiler-Based Prefetching for Recursive Data Structures - Luk (1996)
Software-controlled data prefetching offers the potential for bridging the ever-increasing speed gap between the memory subsystem and today's high-performance processors. While prefetching has enjoyed... / Compiler-Based Prefetching for Recursive ... This paper investigates compiler-based prefetching for
55 Implementing Multiple Protection Domains in Java - Hawblitzel, Chang, Czajkowski, Hu.. (1998)
Safe language technology can be used for protection within a single address space. This protection
is enforced by the language's type system, which ensures that references to objects cannot be
forged... / due to current Java just-in-time compilers optimizing for fast compile ... in the case of Java just-in-time compilers have the opportunity to perform
54 Generating Communication for Array Statements: Design.. - Stichnoth (1994)
Array statements as included in Fortran 90 or High Performance Fortran (HPF) are a well-accepted way to specify data parallelism in programs. When generating code for such a data parallel program for a... / memory parallel system the compiler must determine when array ... in an experimental Fortran compiler and this paper reports an
54 A Code Generation Interface for ANSI C - Fraser, Hanson (1991)
lcc is a retargetable, production compiler for ANSI C; it has been
ported to the VAX, Motorola 68020, SPARC, and MIPS R3000, and
some versions have been in use for over two years. It is smaller
and fa... / is a retargetable production compiler for ANSI C it has been ported ... it results in efficient compact compilers. The interface is illustrated
53 Using Profile Information to Assist Classic Code Optimizations - Chang (1991)
This paper describes the design and implementation of an optimizing compiler that automatically generates profile information to assist classic code optimizations. This compiler contains two new compo... / implementation of an optimizing compiler that automatically generates ... classic code optimizations. This compiler contains two new components an
53 Determining Average Program Execution Times and their Variance - Sarkar (1989)
This paper presents a general framework for determining average program execution times and their variance, based on the program's interval structure and control dependence graph. Average execution ti... / It is important for a compiler to obtain estimates of execution ... an IBM using the VS Fortran compiler Version Release . The
52 KIDS - A Knowledge-Based Software Development System - Smith (1990)
The Kestrel Interactive Development System (KIDS) provides knowledge-based support for the derivation of correct and efficient programs from formal specifications. We trace the use of KIDS in deriving... / language also called Refine and compiler. The language supports ... the creation of rules. The compiler generates CommonLisp code. The
51 Automatic Data Layout Using 0-1 Integer Programming - Bixby, Kennedy, Kremer (1994)
The goal of languages like Fortran D or HPF is to provide a simple yet efficient machine-independent
parallel programming model. By shifting much of the burden of machine-dependent
optimization to th... / optimization to the compiler the programmer is able to write ... Even the most sophisticated compiler will not be able to compensate
51 Interprocedural Modification Side Effect Analysis With Pointer.. - Landi, Ryder, Zhang (1993)
We present a new interprocedural modification side effects algorithm for C programs, that can
discern side effects through general-purpose pointer usage. Ours is the first complete design and
implemen... / effects is crucial for aggressive compiler optimization ASU practical ... analyzed in LR plus compiler a compiler for a subset of
50 Dependent Types in Practical Programming - Xi (1998)
Programming is a notoriously error-prone process, and a great deal of evidence in practice has demonstrated that the use of a type system in a programming language can effectively detect program error... / program error detection and compiler optimization. A major ... ones. The use of types for compiler optimization such as passing
48 Neural Network Synthesis Using Cellular Encoding And The Genetic.. - Frédéric Gruau (1994)
Artificial neural networks used to be considered only as a machine that learns using small
modifications of internal parameters. Now this is changing. Such learning methods do not allow
to generate big... / a number of properties and a compiler of high level language. The ... A neural Compiler . Introduction
48 Beyond Induction Variables - Wolfe (1992)
Induction variable detection is usually closely tied to the strength reduction optimization. This paper studies induction variable analysis from a different perspective, that of finding induction vari... / others are not analyzed by current compilers. Giving a unified approach ... approach improves the speed of compilers and allows a more general
47 Cache Performance of the SPEC92 Benchmark Suite - Gee (1993)
The SPEC92 benchmark suite consists of twenty public-domain, non-trivial programs that are
widely used to measure the performance of computer systems, particularly those in the Unix
workstation market... / of any source code modifications compiler and operating system release ... realistic workloads. Similarly compiler writers have been concentrating
47 Titanium: A High-Performance Java Dialect - Yelick, Semenzato, Pike, Miyamoto.. (1998)
Titanium is a language and system for high-performance parallel scientific computing. Titanium
uses Java as its base, thereby leveraging the advantages of that language and allowing us to focus
attent... / on heroic parallelizing compiler technology and the consequent ... and the consequent absence of compilers and tools and the
47 Compiling Fortran D for MIMD Distributed-Memory Machines - Hiranandani (1992)
Fortran D, a version of Fortran extended with
data decomposition specifications, is designed to provide
a machine-independent data-parallel programming
model. This paper describes analysis, optimizati... / employed in the Fortran D compiler. The compiler first partitions ... in the Fortran D compiler. The compiler first partitions programs using
47 Cache Performance of the SPEC Benchmark Suite - Gee, Hill, Pnevmatikatos, Smith (1993)
The SPEC benchmark suite consists of ten public-domain, non-trivial programs that are widely used to
measure the performance of computer systems, particularly those in the Unix workstation market.
The... / of any source code modifications compiler and operating system release ... realistic workloads. Similarly compiler writers have been concentrating
46 A Standard ML Compiler - Appel, MacQueen (1987)
Standard ML is a major revision of earlier dialects of the functional language
ML. We describe the first compiler written for Standard ML in Standard ML.
The compiler incorporates a number of novel fe... / A Standard ML Compiler Andrew W. Appel Dept. of ... ML. We describe the first compiler written for Standard ML in
46 Optimal Code Motion: Theory and Practice - Knoop, Rüthing, Steffen (1994)
In this paper, we emphasize the practicality of lazy code motion by giving explicit
directions for its implementation in standard compiler environments. In particular,
we present a version of the algorit... / format is standard in optimizing compilers. The theoretical foundations of ... for its implementation in standard compiler environments. Categories and
46 Automatic Partitioning of Parallel Loops and Data Arrays for.. - Agarwal (1995)
This paper presents a theoretical framework for automatically partitioning parallel loops to minimize cache coherency traffic on shared-memory multiprocessors. While several previous papers have looke... / implemented this framework in a compiler for Alewife a distributed shared ... by the run time system or by the compiler. Relegating the partitioning task
46 Profile-Guided Receiver Class Prediction - Grove, Dean, Garrett, Chambers (1995)
The use of dynamically-dispatched procedure calls is a key
mechanism for writing extensible and flexible code in
object-oriented languages. Unfortunately, dynamic
dispatching imposes a runtime perform... / faster than previous Self compilers on the same applications. Thus ... it internally within a compiler. In sections and we report
46 Cache Miss Equations: An Analytical Representation of Cache Misses - Ghosh, Martonosi, Malik (1997)
With the widening performance gap between processors and main memory, efficient memory accessing behavior is necessary for good program performance. Both hand-tuning and compiler optimization techniqu... / performance. Both hand-tuning and compiler optimization techniques are often ... code. Implemented within the SUIF compiler framework our approach extends
45 A Retargetable Compiler for ANSI C - Fraser, Hanson (1991)
lcc is a new retargetable compiler for ANSI C. Versions for the VAX, Motorola 68020, SPARC, and MIPS are in production use at Princeton University and at AT&T Bell Laboratories. With a few exceptions,... / A Retargetable Compiler for ANSI C Christopher W. ... lcc is a new retargetable compiler for ANSI C. Versions for the VAX
45 Can Logic Programming Execute as Fast as Imperative Programming?.. - Van Roy (1990)(Correct)
... / on a MIPS Results from a Prolog Compiler for a RISC th International br January . . A. K. Turk Compiler Optimizations for the WAM rd
44 A High-Performance Microarchitecture with Hardware-Programmable.. - Razdan, Smith (1994)(Correct)
This paper explores a novel way to incorporate hardware-programmable
resources into a processor microarchitecture to improve the
performance of general-purpose applications. Through a coupling
of comp... / Using this information the compiler interacts with sophisticated br processor and it relies on the compiler and run-time system to
44 Automatic Blocking of Nested Loops - Schreiber, Dongarra (1990)(Correct)
Blocked algorithms have much better properties of data locality and therefore
can be much more efficient than ordinary algorithms when a memory hierarchy is
Supported by the NAS Systems Division an... / algorithm parallel computing compiler optimization matrix br reasons Kennedy has stated that compiler management of memory hierarchies
44 Evaluating Compiler Optimizations For Fortran D - Hiranandani, Kennedy, Tseng (1994)(Correct)
The Fortran D compiler uses data decomposition specifications to automatically translate Fortran
programs for execution on MIMD distributed-memory machines. This paper introduces and
classifies a numb... / Evaluating Compiler Optimizations For Fortran D br Foundation. Evaluating Compiler Optimizations For Fortran D
43 Simple and Effective Link-Time Optimization of Modula-3 Programs - Fernandez (1994)(Correct)
In this paper, we describe the opportunities for link-time optimization of Modula-3 and present two link-time optimization techniques. Data-driven simplification is a new technique. It uses a program's t... / to implement them the Modula- compiler must generate code for various br which saves a load and permits the compiler to inline methods in their
43 Cache-Conscious Data Placement - Calder, Krintz, John, Austin (1998)(Correct)
As the gap between memory and processor speeds continues to widen, cache efficiency is an increasingly important component of processor performance. Compiler techniques have been used to improve instr... / of processor performance. Compiler techniques have been used to br Data Placement. This is a compiler directed approach that creates
43 Quantifying Behavioral Differences Between C and C++ Programs - Calder (1994)(Correct)
Improving the performance of C programs has been a topic of great interest for many years. Both hardware technology and compiler optimization research has been applied in an effort to make C programs ... / Both hardware technology and compiler optimization research has been br results should be of interest to compiler writers and architecture
43 Putting Pointer Analysis To Work - Ghiya (1998)(Correct)
Pointer analysis has recently been a subject of active research. The focus of most
techniques is on: (1) estimating the targets for stack-directed pointers, (2) computing
relationships between heap-di... / results of pointer analysis for compiler optimizations. This thesis br information to a wide variety of compiler applications. That is once the
43 Emerald: A General-Purpose Programming Language - Raj, Tempero, Levy, Black, et al. (1991)(Correct)
Emerald is a strongly-typed programming language that supports an atypical variant of the object-oriented
paradigm. Although o... / implementation of the Emerald compiler. As part of its execution the br As part of its execution the compiler creates a parse tree whose nodes
42 Dependence-Based Program Analysis - Johnson, Pingali (1993)(Correct)
Program analysis and optimization can be speeded up through the use of the dependence flow graph (DFG), a representation of program dependences which generalizes def-use chains and static single assig... / into a traditional optimizing compiler framework. We accomplish this as br For example the Multiflow compiler performed predicate analysis to
42 Soft Typing - Cartwright, Fagan (1991)(Correct)
This paper presents a soft type system that retains the expressiveness of dynamic typing, but offers the early error detection and improved optimization capabilities of static typing. The key idea un... / they extract information that a compiler can exploit to produce more br execution. Optimization The compiler produce better code by
42 PYRROS: Static Task Scheduling and Code Generation for Message.. - Yang, Gerasoulis (1992)(Correct)
We describe a parallel programming tool for scheduling static task graphs and generating the appropriate target code for message passing MIMD architectures. The computational complexity of the system ... / has been in the development of compilers or software tools that will br and FORTRAN D Most of those compiler systems have not addressed the
42 Evaluation of Compiler Optimizations for Fortran D on MIMD.. - Hiranandani, Kennedy, Tseng (1992)(Correct)
The Fortran D compiler uses data decomposition specifications
to automatically translate Fortran programs for execution
on MIMD distributed-memory machines. This paper
introduces and classifies a numb... / Evaluation of Compiler Optimizations for Fortran D on br DC July . Evaluation of Compiler Optimizations for Fortran D on
40 A Language-Based Approach to Protocol Implementation - Abbott, Peterson (1993)(Correct)
Morpheus is a special-purpose programming language that facilitates the efficient implementation of communication protocols. Protocols are divided into three categories, called shapes, so that they ca... / languages because a Morpheus compiler has more domain knowledge br available to the Morpheus compiler. While no Morpheus compiler has
40 Synchronization and Communication in the T3E Multiprocessor - Scott (1996)(Correct)
This paper describes the synchronization and communication primitives of the Cray T3E multiprocessor, a shared memory system scalable to 2048 processors. We discuss what we have learned from the T3D p... / significantly easier for the compiler. For either programming model br queue is used by both the CRAFT compiler to fetch remote data in loops
40 Code Compression - Ernst, Evans, Fraser, Lucco.. (1997)(Correct)
Current research in compiler optimization counts mainly
CPU time and perhaps the first cache level or two. This
view has been important but is becoming myopic, at least
from a system-wide viewpoint, a... / Abstract Current research in compiler optimization counts mainly CPU br for example all commercial JIT compilers known to us. This high
40 Minimizing Register Requirements under Resource-Constrained.. - Govindarajan, Altman, Gao (1994)(Correct)
In this paper we address the following software pipelining problem: given a loop and a machine architecture with a fixed number of processor resources (e.g. function units), how can one construct a so... / with such an opportunity via a compiler option. . The techniques br schedules inside a production compiler might be debatable the
40 Run-time Adaptive Cache Hierarchy Management via Reference Analysis - Johnson, Hwu (1997)(Correct)
Improvements in main memory speeds have not kept pace with increasing processor clock frequency and improved exploitation of instruction-level parallelism. Consequently, the gap between processor and ... / they can also disrupt the compiler-generated ILP schedule. br programs there are several known compiler techniques for optimizing data
40 Space-Efficient Closure Representations - Shao, Appel (1994)(Correct)
Many modern compilers implement function calls (or returns)
in two steps: first, a closure environment is properly
installed to provide access for free variables in the target
program fragment; second... / Abstract Many modern compilers implement function calls or br by the Standard ML of New Jersey compiler by about on a DECstation
39 Interactive Parallel Programming Using the ParaScope Editor - Kennedy, McKinley, Tseng (1991)(Correct)
The ParaScope project is developing an integrated collection of tools to help scientific programmers
implement correct and efficient parallel programs. The centerpiece of this collection
is the ParaSc... / the abilities of programmers and compiler writers alike. Programmers eager br a directive that instructs the compiler to ignore all dependences. The
38 Flick: A Flexible, Optimizing IDL Compiler - Eide, Frei, Ford, Lepreau, Lindstrom (1997)(Correct)
An interface definition language (IDL) is a nontraditional
language for describing interfaces between software components.
IDL compilers generate "stubs" that provide separate
communicating processes... / Flick A Flexible Optimizing IDL Compiler Eric Eide Kevin Frei Bryan br software components. IDL compilers generate stubs that provide
38 Beyond Induction Variables: Detecting and Classifying Sequences Using .. - Gerlek, Stoltz, Wolfe (1995)(Correct)
In this article we present a practical technique for detecting a broader class of linear induction
variables than is usually recognized, as well as several other sequence forms, including periodic,
polyn... / optimization. For restructuring compilers effective data dependence br analysis requires that the compiler detect and accurately describe
37 A Methodology For Query Reformulation In Cis Using Semantic Knowledge - Florescu, Raschid (1996)(Correct)
We consider Cooperative Information Systems (CIS) that are multidatabase systems
(MDBMS), with a common object-oriented model, based on the ODMG standard, together
with local databases that may be rel... / technique in our Flora compiler prototype which we used for br within the Flora compiler optimizer for the ODMG data model
37 Data Access Microarchitectures for Superscalar Processors with.. - Chen (1991)(Correct)
The performance of superscalar processors is more sensitive to the memory system delay than their single-issue predecessors. This paper examines alternative data access microarchitectures that effecti... / for Superscalar Processors with Compiler-Assisted Data Prefetching br that effectively support compiler-assisted data prefetching in
37 Global Tagging Optimization by Type Inference - Henglein (1992)(Correct)
Tag handling accounts for a substantial amount of execution
cost in latently typed languages such as Common LISP and
Scheme, especially on architectures that provide no special
hardware support.
We pr... / and can be interfaced with compiler backends of statically typed br respects. In the LISP compiler for S BGS in Orbit
36 Typed Memory Management in a Calculus of Capabilities - Crary, Walker, Morrisett (1999)(Correct)
An increasing number of systems rely on programming language
technology to ensure safety and security of low-level
code. Unfortunately, these systems typically rely on a complex,
trusted garbage colle... / type-safe code. We present a compiler intermediate language called the br heavily optimized by hand or by compiler and yet be automatically
36 Two Classes of Boolean Functions for Dependency Analysis - Armstrong, Marriott, Schachte.. (1994)(Correct)
Many static analyses for declarative programming/database languages use Boolean functions
to express dependencies among variables or argument positions. Examples include
groundness analysis, arguably ... / Languages Processors-compilers optimization F. . Logics br only important for an optimizing compiler attempting to speed up
36 Rewriting Executable Files to Measure Program Behavior - Larus, Ball (1992)(Correct)
Inserting instrumentation code in a program is an effective technique for measuring many aspects of program performance. The instrumentation code can be added at any stage of the compilation process b... / by system tools such as the compiler or linker or by external tools br been avoided with minor changes to compilers and executable files' symbol
36 Flow-directed Inlining - Jagannathan, Wright (1996)(Correct)
A flow-directed inlining strategy uses information derived from control-flow analysis to specialize and inline procedures for functional and object-oriented languages. Since it uses control-flow analy... / makes it simple to upgrade a compiler that uses our strategy to include br is easy to implement in a compiler that uses flat closures A
36 Storage Assignment to Decrease Code Size - Liao (1995)(Correct)
DSP architectures typically provide indirect addressing modes with auto-increment and decrement. In addition, indexing mode is not available, and there are usually few, if any, general-purpose registe... / time-to-market. However current compilers for microcontrollers and br size penalties. While optimizing compilers have proved effective for
35 Value Profiling - Calder (1997)(Correct)
Identifying variables as invariant or constant at compile-time allows the compiler to perform optimizations including constant folding, code specialization, and partial evaluation. Some variables, whi... / at compile-time allows the compiler to perform optimizations br then benefit from invariant-based compiler optimizations. In this paper we
35 Analysis of Techniques to Improve Protocol Processing Latency - Mosberger (1996)(Correct)
This paper describes several techniques designed to improve
protocol latency, and reports on their effectiveness when
measured on a modern RISC machine employing the DEC
Alpha processor. We found that... / they can all be characterized as compiler-based techniques. As such one br limited context available to the compiler's optimizer. A technique called
35 Design and Implementation of the Glue-Nail Database System - Derr, Morishita, Phipps (1993)(Correct)
We describe the design and implementation of the Glue-Nail
database system. The Nail language is a purely declarative
query language; Glue is a procedural language used for nonquery
activities. The tw... / target language IGlue. The Nail compiler uses variants of the magic sets br is performed by the Glue compiler using techniques that include
34 Lightweight Run-Time Code Generation - Leone, Lee (1994)(Correct)
Run-time code generation is an alternative and complement
to compile-time program analysis and optimization. Static
analyses are inherently imprecise because most interesting
aspects of run-time behav... / developed for a prototype compiler are discussed and the results br Introduction Many compiler optimizations depend on
34 Automatic Data Layout for Distributed Memory Machines - Kremer (1993)(Correct)
An approach to programming distributed memory-parallel machines that has recently become popular is one where the programmer explicitly specifies the layout of data in a global name space, relying on ... / a global name space relying on a compiler to generate a parallel program br operations generated by the compiler. This will enable the user to
34 Points-to Analysis by Type Inference of Programs with Structures and.. - Bjarne Steensgaard (1996)(Correct)
We present an interprocedural flow-insensitive points-to analysis algorithm
based on monomorphic type inference. The source language models the important
features of C including pointers, pointer arith... / Introduction Modern optimizing compilers and program understanding and br variables Most current compilers and programming tools use only
34 Type-Based Alias Analysis - Diwan, McKinley, Moss (1998)(Correct)
This paper evaluates three alias analyses based on programming language types. The first analysis uses type compatibility to determine aliases. The second extends the first by using additional high-le... / of modern uniprocessors compilers must reorder instructions. For br programs that use pointers the compiler's alias analysis dramatically
34 Automatic Data Layout for Distributed-Memory Machines in the D.. - Kremer, Mellor-Crummey, Kennedy.. (1993)(Correct)
Although distributed-memory message-passing parallel computers are among the most cost effective high performance machines available, scientists find them extremely difficult to program. Most programm... / these annotations a sophisticated compiler can automatically transform a br Given a Fortran D program the compiler uses data layout directives to
34 Elimination of Redundant Array Subscript Range Checks - Kolte, Wolfe (1995)(Correct)
This paper presents a compiler optimization algorithm to reduce
the run time overhead of array subscript range checks
in programs without compromising safety. The algorithm
is based on partial redunda... / Abstract This paper presents a compiler optimization algorithm to reduce br the algorithm in our research compiler Nascent and conducted
33 Demand-driven Computation of Interprocedural Data Flow - Duesterwald, Gupta, Soffa (1995)(Correct)
This paper presents a general framework for deriving demand-driven
algorithms for interprocedural data flow analysis of
imperative programs. The goal of demand-driven analysis
is to reduce the time and... / Optimizing and parallelizing compilers that exhaustively analyze a br automatically e.g. by the compiler or manually by the user e.g.
32 Reducing Branch Costs via Branch Alignment - Calder, Grunwald (1994)(Correct)
Several researchers have proposed algorithms for basic block reordering. We call these branch alignment algorithms. The primary emphasis of these algorithms has been on improving instruction cache loc... / are compiled by any existing compiler and then transformed via binary br time analysis in the IMPACT-I compiler system. Using profile-based
32 Distributed Memory Compiler Design for Sparse Problems - Wu (1995)(Correct)
This paper addresses the issue of compiling concurrent loop nests in the presence of complicated
array references and irregularly distributed arrays. Arrays accessed within loops
may contain accesses ... / Distributed Memory Compiler Design for Sparse Problems br that is used effectively by a compiler to generate efficient code in
32 The ParaScope Parallel Programming Environment - Cooper (1993)(Correct)
The ParaScope parallel programming environment, developed to support scientific programming of shared-memory
multiprocessors, includes a collection of tools that use global program analysis to help use... / the traditional single-procedure compiler by providing a mechanism for br The ParaScope editor brings both compiler analysis and user expertise to
32 The Semantics of Program Dependence - Cartwright, Felleisen (1989)(Correct)
Optimizing and parallelizing compilers for procedural languages rely on various
forms of program dependence graphs (pdgs) to express the essential control and
data dependences among atomic program ope... / Optimizing and parallelizing compilers for procedural languages rely on br that is suitable for scalar optimization vectorization
31 An Overview of the Fortran D Programming System - Hiranandani, Kennedy, Koelbel.. (1991)(Correct)
The success of large-scale parallel architectures is
limited by the difficulty of developing machine-independent
parallel programs. We have developed
Fortran D, a version of Fortran extended
with data ... / Fourth Workshop on Languages and Compilers for Parallel Computing Santa br D programming system a prototype compiler and an environment to assist
31 Extending SUIF for Machine-dependent Optimizations - Smith (1996)(Correct)
This paper describes a set of modifications and extensions to the base SUIF library that provide the abstractions necessary for machine-dependent optimizations such as global instruction scheduling. W... / . . Introduction The SUIF compiler provides an excellent set of br are useful for machine-dependent compiler research and we want to exploit
31 Procedure Placement Using Temporal Ordering Information - Gloy, Blackwell, Smith, Calder (1997)(Correct)
Instruction cache performance is very important to
instruction fetch efficiency and overall processor performance.
The layout of an executable has a substantial effect
on the cache miss rate during ex... / direct-mapped caches where the compiler achieves an optimized cache line br code layout produced by most compilers places procedures in the order
31 A General Framework for Iteration-Reordering Loop Transformations.. - Sarkar (1992)(Correct)
This paper describes a general framework for representing iteration-reordering transformations. These transformations can be both matrix-based and non-matrix-based. Transformations are defined by rule... / used extensively by restructuring compilers for optimizing vector execution br to suffice for a practical compiler system e.g. none of
31 Automatic Program Transformation with JOIE - Cohen, Chase, Kaminsky (1998)(Correct)
While the availability of platform-independent code on
the Internet is increasing, third-party code rarely exhibits
all of the features desired by end users. Unfortunately,
developers cannot foresee a... / translated into an executable by a compiler. Authors or users can employ br instructions by a Just-In-Time compiler JIT JITs only reimplement the
31 An HPF Compiler for the IBM SP2 - Gupta, Midkiff, Schonberg, Shields, .. (1995)(Correct)
We describe pHPF, a research prototype HPF compiler for the IBM SP series parallel machines. The compiler accepts as input Fortran 90 and Fortran 77 programs, augmented with HPF directives; sequentia... / An Hpf Compiler For The Ibm Sp Manish Gupta br Phpf An Research Prototype Hpf Compiler For The Ibm Sp Series Parallel
31 The Effects of the Precision of Pointer Analysis - Shapiro (1997)(Correct)
In order to analyze programs that manipulate pointers, it is necessary to have safe information about what each pointer might point to. There are many algorithms that can be used to determine this i... / to run faster. Introduction Compilers often perform a variety of br assignment x a compiler can safely ignore the first
31 Simple and Effective Analysis of Statically-Typed Object-Oriented.. - Diwan (1996)(Correct)
To use modern hardware effectively, compilers need
extensive control-flow information. Unfortunately,
the frequent method invocations in object-oriented
languages obscure control flow. In this paper, ... / use modern hardware effectively compilers need extensive control-flow br are thus practical for use in a compiler. When they fail we introduce
31 Programming for Different Memory Consistency Models - Gharachorloo (1992)(Correct)
The memory consistency model, or memory model, supported by a shared-memory multiprocessor directly
affects its performance. The most commonly assumed memory model is sequential consistency (SC). Whil... / the use of common hardware and compiler optimizations. To remedy the br common uniprocessor hardware and compiler optimizations that reorder or
31 Precise and Efficient Groundness Analysis for Logic Programs - Marriott, Søndergaard (1993)(Correct)
We show how precise groundness information can be extracted from logic programs. The idea is to use abstract interpretation with Boolean functions as "approximations" to groundness dependencies betwee... / Languages Processors-compilers optimization F. . Logics br only important for an optimizing compiler attempting to speed up
31 Cache Miss Equations: A Compiler Framework for Analyzing and Tuning.. - Ghosh, Martonosi, Malik (1998)(Correct)
This paper describes methods for generating and solving Cache Miss Equations (CMEs) that
give a detailed representation of cache behavior, including conflict misses, in loop-oriented scientific
code. ... / Cache Miss Equations A Compiler Framework for Analyzing and br performance. Both hand-tuning and compiler optimization techniques are often
31 Control-Flow Analysis and Type Systems - Heintze (1995)(Correct)
We establish a series of equivalences between type systems
and control-flow analyses. Specifically, we take four type systems from the
literature (involving simple types, subtypes and recursion) and... / A central concept in compiler optimization and code generation br can significantly limit compiler performance. To addresses this
29 Sharlit - A Tool for Building Optimizers - Tjiang, et al. (1992)(Correct)
This paper presents Sharlit, a tool to support the construction
of modular and extensible global optimizers. We
will show how Sharlit helps in constructing data-flow analyzers
and the transformations ... / function of a modern compiler is global optimization. Unlike br Unlike other functions of a compiler such as parsing and code
29 DyC: An Expressive Annotation-Directed Dynamic Compiler for C - Brian Grant (1997)(Correct)
We present the design of DyC, a dynamic-compilation system for C
based on run-time specialization. Directed by a few declarative user
annotations that specify the variables and code on which dynamic
c... / Annotation-Directed Dynamic Compiler for C Brian Grant Markus br in the context of an optimizing compiler and initial results have been
29 The execution algorithm of Mercury, an efficient purely declarative.. - Somogyi, Henderson, Conway (1996)(Correct)
Machine or WAM. Section 5 describes some optimizations and shows how
Mercury handles I/O. Section 6 gives the current state of the Mercury system while
section 7 presents performance results.
2. The... / very efficient code. The Mercury compiler uses this execution model to br as much help as possible from the compiler in locating errors in their
28 Improving Programs which Recurse over Multiple Inductive Structures - Fegaras (1994)(Correct)
This paper considers generic recursion schemes for programs
which recurse over multiple inductive structures simultaneously,
such as equality, zip and the nth element of a list
function. Such schemes ... / an automatic optimization phase in compilers of algebraic programs. Second br for use in the back-end of a compiler for an algebraic programming
28 A More Efficient RMI for Java - Nester, Philippsen, Haumacher (1999)(Correct)
In current Java implementations, Remote Method Invocation (RMI) is too slow, especially for high performance computing. RMI is designed for wide-area and high-latency networks, it is based on a slow o... / is currently working on the compiler project Manta Manta has br platforms or particular native compilers. There are other approaches to
28 A Manual for the CHAOS Runtime Library - Saltz, Ponnusamy, Sharma, Moon.. (1995)(Correct)
Procedures are presented that are designed to help users efficiently program irregular
problems (e.g. unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial
differential equations solver... / are also designed for use in compilers for distributed memory br forming a portion of a portable compiler independent runtime support
27 A Quantitative Analysis of Loop Nest Locality - McKinley, Temam (1996)(Correct)
This paper analyzes and quantifies the locality characteristics of numerical loop nests in order to suggest future directions for architecture and software cache optimizations. Since most programs spe... / and provide new insights for the compiler writer and the architect. br cache memories Smi Smi and compiler techniques that exploit cache
27 Optimizing Instruction Cache Performance for Operating System.. - Josep Torrellas (1995)(Correct)
High instruction cache hit rates are key to high performance.
One known technique to improve the hit rate of caches is to
use an optimizing compiler to minimize cache interference
via an improved layo... / of caches is to use an optimizing compiler to minimize cache interference br runs of the second phase of the C compiler which generates assembly code
27 Effective Flow Analysis for Avoiding Run-Time Checks - Jagannathan, Wright (1995)(Correct)
This paper describes a general purpose program analysis that
computes global control-flow and data-flow information for higher-order,
call-by-value programs. This information can be used to drive gl... / overheads hence sophisticated compiler optimizations are essential if br tend to be small these compiler optimizations must be
27 Interprocedural Symbolic Analysis - Havlak (1994)(Correct)
Compiling for efficient execution on advanced computer architectures requires
extensive program analysis and transformation. Most compilers limit thei... / analysis and transformation. Most compilers limit their analysis to simple br techniques in a production compiler is justified by their
26 Nonlinear Array Layouts for Hierarchical Memory Systems - Chatterjee, Jain, Lebeck, Mundhra.. (1999)(Correct)
Programming languages that provide multidimensional arrays and
a flat linear model of memory must implement a mapping between
these two domains to order array elements in memory. This layout
function ... / in several high-performance compilers. Tiling techniques are also br by the programmer or by the compiler and examine the additional
26 A Hardware Implementation of Pure Esterel - Berry (1991)(Correct)
Esterel is a synchronous concurrent programming language dedicated to reactive systems (controllers, protocols, man-machine interfaces, etc.). Esterel has an efficient standard software implementation... / Software The standard Esterel compiler is directly based on one of the br applications. In addition to the compiler the Esterel environment
26 Generation of efficient interprocedural analyzers with PAG - Alt, Martin (1995)(Correct)
To produce high quality code, modern compilers use global
optimization algorithms based on abstract interpretation. These algorithms
are rather complex; their implementation is therefore a non--triv... / produce high quality code modern compilers use global optimization br be easily integrated in existing compilers. The analyzers are
26 DTRE - A Semi-Automatic Transformation System - Blaine, Goldberg (1991)(Correct)
This paper describes the theoretical framework and an implemented system (Dtre) for
the specification and verified refinement of specifications using operations on abstract
data types. The system is s... / using these terms to direct the compiler to implement particular objects
25 Learning Approximate Control Rules Of High Utility - Cohen (1990)(Correct)
One of the difficult problems in the area of explanation based learning is the utility problem; learning too many rules of low utility can lead to swamping, or degradation of performance. This paper i... / second routine is a control rule compiler which given a set of control br rules is a multiple-resource optimization problem in which the two
25 Debugging Standard ML Without Reverse Engineering - Tolmach (1990)(Correct)
We have built a novel and efficient replay debugger for our
Standard ML compiler. Debugging facilities are provided
by instrumenting the user's source code; this approach,
made feasible by ML's safety... / debugger for our Standard ML compiler. Debugging facilities are br used functionally and our compiler uses continuation-passing style
25 Constraint-Based Type Inference and Parametric Polymorphism - Agesen (1994)(Correct)
Constraint-based analysis is a technique for inferring implementation types. Traditionally it has been described using mathematical formalisms. We explain it in a different and more intuitive way as... / any optimizing or parallelizing compiler Other programming tools br that is typically needed by compilers and other tools. For example
25 Annotation-Directed Run-Time Specialization in C - Grant(Correct)
We present the design of a dynamic compilation system for C.
Directed by a few declarative user annotations specifying where
and on what dynamic compilation is to take place, a binding time
analysis c... / context of an existing optimizing compiler. Introduction Dynamic br dynamic compilation. Ideally the compiler would make these decisions
25 Smartest Recompilation - Shao, Appel (1993)(Correct)
To separately compile a program module in traditional
statically-typed languages, one has to manually write down
an import interface which explicitly specifies all the external
symbols referenced in t... / Using the proper contexts the compiler can check that each module uses br elaborate module system but SML compilers have not supported separate
25 From ML to Ada(!?!): Strongly-typed Language Interoperability via.. - Tolmach, Oliva (1997)(Correct)
We describe a system that supports source-level integration of ML-like functional language
code with ANSI C or Ada83 code. The system works by translating the functional code into
type-correct, "vanil... / output of current optimizing ML compilers even though handicapped by a br details of FL and GL compilers which may be unacceptable in
25 Memory-Hierarchy Management - Carr (1992)(Correct)
The trend in high-performance microprocessor design is toward increasing computational power on the
chip. Microprocessors can now process dramatically more data per machine cycle than previous models.... / is a step in the wrong direction. Compilers not programmers should handle br develops and experiments with compiler algorithms that manage the memory
25 From ML to Ada: Strongly-typed Language Interoperability via Source.. - Tolmach, Oliva (1993)(Correct)
We describe a system that supports source-level integration of ML-like functional language
code with ANSI C or Ada83 code. The system works by translating the functional
code into type-correct, "vanil... / output of current optimizing ML compilers even though handicapped by a br details of FL and GL compilers which may be unacceptable in
25 Improving the Ratio of Memory Operations to Floating-Point Operations .. - Carr, Kennedy (1994)(Correct)
In this paper we attempt to answer that question. To do so, we develop and
evaluate techniques that automatically restructure program loops to achieve high performance
on specific target architectures. T... / machine should be left to the compiler. But is our view practical Can br Can a sophisticated optimizing compiler obviate the need for the myriad
25 Precise Miss Analysis for Program Transformations with Caches of.. - Ghosh, Martonosi, Malik (1998)(Correct)
Analyzing and optimizing program memory performance is
a pressing problem in high-performance computer architectures.
Currently, software solutions addressing the processor-memory
performance gap inclu... / performance gap include compiler- or programmer-applied br and other program transformations. Compiler optimization can be effective
25 Linear-time Subtransitive Control Flow Analysis - Heintze, McAllester (1997)(Correct)
We present a linear-time algorithm for bounded-type programs that builds a directed graph whose transitive closure gives exactly the results of the standard (cubic-time) Control-Flow Analysis (CFA) alg... / in the program. This limits compiler optimization. One way to address br a barrier to the use of CFA in compilers. In fact although a number of
24 The Jalapeño Dynamic Optimizing Compiler for Java - Burke, Choi, Fink, Grove, Hind.. (1999)(Correct)
The Jalapeño Dynamic Optimizing Compiler is a key component of the Jalapeño Virtual Machine, a new Java Virtual Machine (JVM) designed to support efficient and scalable execution of Java applicati... / The Jalapeño Dynamic Optimizing Compiler for Java TM Michael G. br The Jalapeño Dynamic Optimizing Compiler is a key component of the
24 Compilation Techniques for Low Energy: An Overview - Tiwari, Malik, Wolfe (1994)(Correct)
Recent years have witnessed a rapid growth in research activity
targeted at reducing energy consumption in microprocessor-based
systems. However, this research has by and
large not recognized the pote... / for lcc a retargetable ANSI C compiler. These programs accept a br The cost function used in most compilers including lcc is the