A Technique for Compilation to Exposed Memory Hierarchy (1999)

by B Greenwald
Results 1 - 8 of 8

Strength Reduction of Integer Division and Modulo Operations

by Jeffrey Sheldon, Walter Lee, Ben Greenwald, Saman Amarasinghe , 2001
Abstract - Cited by 5 (0 self)
Integer division, modulo, and remainder operations are expressive and useful operations. They are logical candidates to express complex data accesses such as the wrap-around behavior in queues using ring buffers. In addition, they appear frequently in address computations as a result of compiler optimizations that improve data locality, perform data distribution, or enable parallelization. Experienced application programmers, however, avoid them because they are slow. Furthermore, while advances in both hardware and software have improved the performance of many parts of a program, few are applicable to division and modulo operations. This trend makes these operations increasingly detrimental to program performance.
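The wrap-around indexing the abstract mentions is the canonical target for this optimization. A minimal sketch (illustrative Python, not the paper's algorithm): when the loop induction variable advances by one each iteration, `i % n` can be replaced by a counter with a compare-and-reset, removing the division entirely.

```python
def fill_ring_naive(buf, values):
    """Wrap-around indexing with an explicit modulo on every store."""
    n = len(buf)
    for i, v in enumerate(values):
        buf[i % n] = v          # a div/mod operation each iteration
    return buf

def fill_ring_reduced(buf, values):
    """Strength-reduced form: the modulo of an induction variable that
    advances by 1 becomes a compare-and-reset of a running counter."""
    n = len(buf)
    j = 0                       # running value of i % n
    for v in values:
        buf[j] = v
        j += 1
        if j == n:              # wrap instead of dividing
            j = 0
    return buf
```

Both functions leave the same contents in the buffer; the second avoids the hardware divide on every iteration, which is the point of the transformation.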

Citation Context

...is research is funded in part by Darpa contract # DABT63-96-C-0036 and in part by an IBM Research Fellowship. ory system [6], the Hot Pages software caching system [15], and the C-CHARM memory system [13] all introduce these operations to express the array indexes after transformations. However, the cost of using division and modulo operations is often prohibitive. Despite their suitability for repres...

High Level Compilation for Gate Reconfigurable Architectures

by Jonathan William Babb, Anant Agarwal , 2001
Abstract
A continuing exponential increase in the number of programmable elements is turning management of gate-reconfigurable architectures as "glue logic" into an intractable problem; it is past time to raise this abstraction level. The physical hardware in gate-reconfigurable architectures is all low level: individual wires, bit-level functions, and single-bit registers. Hence one should look to the fetch-decode-execute machinery of traditional computers for higher-level abstractions. Ordinary computers have machine-level architectural mechanisms that interpret instructions, instructions that are generated by a high-level compiler. Efficiently moving up to the next abstraction level requires leveraging these mechanisms without introducing the overhead of machine-level interpretation. In this dissertation, I solve this fundamental problem by specializing architectural mechanisms with respect to input programs. This solution is the key to efficient compilation of high-level programs to gate-reconfigurable architectures. My approach to specialization includes several novel techniques. I develop, with others, extensive bitwidth analyses that apply to registers, pointers, and arrays. I use pointer analysis and memory disambiguation to target devices with blocks of embedded memory.
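The bitwidth analyses mentioned above can be illustrated with a toy forward range propagation; the op format and helper names here are invented for this sketch and are not the dissertation's actual representation.

```python
def width(hi):
    """Bits needed to represent every value in an unsigned range [0, hi]."""
    return max(hi.bit_length(), 1)

def infer_widths(ops):
    """Forward range propagation over a straight-line program.
    Each op is (dest, kind, args); ranges are inclusive (lo, hi) pairs."""
    ranges = {}
    for dest, kind, args in ops:
        if kind == "const":
            ranges[dest] = (args[0], args[0])
        elif kind == "add":
            (a_lo, a_hi), (b_lo, b_hi) = ranges[args[0]], ranges[args[1]]
            ranges[dest] = (a_lo + b_lo, a_hi + b_hi)
        elif kind == "and":
            # masking clamps the upper bound, narrowing the result
            (_, a_hi), (_, b_hi) = ranges[args[0]], ranges[args[1]]
            ranges[dest] = (0, min(a_hi, b_hi))
    return {v: width(hi) for v, (lo, hi) in ranges.items()}
```

On reconfigurable fabric, a variable proven to need 6 bits consumes 6 single-bit registers instead of a full machine word, which is why this analysis pays off there.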

Citation Context

... expose parallelism in the inner loop. Successive Over Relaxation (sor) The source code for the sor benchmark, a well-known five point stencil relaxation, is borrowed from Greenwald's Master's Thesis [65]. This benchmark is similar to jacobi. 6.3 Basic Results Recall the evaluation modes introduced in Chapter 2 and summarized in Figures 2-6 and 2-7. The modes studied in basic results include: Mode 0 (f...

Strength Reduction of Integer Division and Modulo Operations

by unknown authors
Abstract
Integer division, modulo, and remainder operations are expressive and useful operations. They are logical candidates to express complex data accesses such as the wrap-around behavior in queues using ring buffers. In addition, they appear frequently in address computations as a result of compiler optimizations that improve data locality, perform data distribution, or enable parallelization. Experienced application programmers, however, avoid them because they are slow. Furthermore, while advances in both hardware and software have improved the performance of many parts of a program, few are applicable to division and modulo operations. This trend makes these operations increasingly detrimental to program performance. This paper describes a suite of optimizations for eliminating division, modulo, and remainder operations from programs. These techniques are analogous to strength reduction techniques used for multiplications. In addition to some algebraic simplifications, we present a set of optimization techniques that eliminates division and modulo operations that are functions of loop induction variables and loop constants. The optimizations rely on algebra, integer programming, and loop transformations.
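The class of eliminations described here (division and modulo of a loop induction variable by a loop constant) can be sketched by tiling the loop so that the quotient and remainder become loop counters. This is an illustrative Python reduction, not the paper's full transformation, which relies on algebra and integer programming to cover more general access patterns.

```python
def copy_blocked_naive(src, n):
    """Address computation with a div and a mod of the induction variable."""
    out = [[0] * n for _ in range(len(src) // n)]
    for i in range(len(src)):
        out[i // n][i % n] = src[i]   # one div and one mod per element
    return out

def copy_blocked_transformed(src, n):
    """After tiling, the div/mod pair disappears: the outer counter
    supplies the quotient and the inner counter the remainder."""
    out = [[0] * n for _ in range(len(src) // n)]
    for q in range(len(src) // n):
        base = q * n                  # strength-reduced multiply per row
        for r in range(n):
            out[q][r] = src[base + r]
    return out
```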

Citation Context

...is research is funded in part by Darpa contract # DABT63-96-C-0036 and in part by an IBM Research Fellowship. ory system [6], the Hot Pages software caching system [15], and the C-CHARM memory system [13] all introduce these operations to express the array indexes after transformations. However, the cost of using division and modulo operations is often prohibitive. Despite their suitability for repres...

Strength Reduction of Integer Division and Modulo Operations

by Saman Amarasinghe, Walter Lee, Ben Greenwald
Abstract
Integer division, modulo, and remainder operations are expressive and useful operations. They are logical candidates to express complex data accesses such as the wrap-around behavior in queues using ring buffers, array address calculations in data distribution, and cache-locality compiler optimizations. Experienced application programmers, however, avoid them because they are slow. Furthermore, while advances in both hardware and software have improved the performance of many parts of a program, few are applicable to division and modulo operations. This trend makes these operations increasingly detrimental to program performance.

Efficient Execution of Declarative Programs

by Matthew Frank , 2001
Abstract
Memoization is an optimization that provides asymptotic speedups, automatically achieving many of the benefits of dynamic programming. Memoization, however, trades off reduced execution time for additional required storage. This additional storage requirement can be reduced somewhat by using several techniques from incrementalization. The first technique, called pruning, statically identifies the set of cached results which are still required at any particular program point, allowing the remainder to be discarded. The second technique dynamically tracks dependencies, allowing a cached result to be reused in a broader set of contexts. This paper suggests that it would be possible to provide a generic graph library with a declarative interface in a traditional imperative programming language such as Java or C++. This library could be implemented using memoization and a dynamic version of pruning. This would enable users of traditional programming languages to program at a higher level of abstraction while still achieving the efficiency they require.

Efficient Execution of Declarative Programs (Area Exam Report)

by Matthew Frank , 2001
Abstract
Memoization is an optimization that provides asymptotic speedups, automatically achieving many of the benefits of dynamic programming. Memoization, however, trades off reduced execution time for additional required storage. This additional storage requirement can be reduced somewhat by using several techniques from incrementalization. The first technique, called pruning, statically identifies the set of cached results which are still required at any particular program point, allowing the remainder to be discarded. The second technique dynamically tracks dependencies, allowing a cached result to be reused in a broader set of contexts. This paper suggests that it would be possible to provide a generic graph library with a declarative interface in a traditional imperative programming language such as Java or C++. This library could be implemented using memoization and a dynamic version of pruning. This would enable users of traditional programming languages to program at a higher level of abstraction while still achieving the efficiency they require.
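Memoization with pruning can be sketched in a few lines (hypothetical Python, not drawn from the report): once the analysis proves a cached result can never be read again, it is discarded immediately, so the cache stays small while the asymptotic speedup is preserved.

```python
def fib_memo_pruned(n):
    """Memoized Fibonacci with pruning: each step reads only cache[k-1]
    and cache[k-2], so cache[k-2] is provably dead afterward and can be
    dropped, keeping at most two live entries."""
    cache = {0: 0, 1: 1}
    for k in range(2, n + 1):
        cache[k] = cache[k - 1] + cache[k - 2]
        cache.pop(k - 2)        # pruning: no later lookup can reach this
    return cache[n]
```

For Fibonacci the dead set is obvious; the pruning analyses the paper surveys aim to derive the same kind of liveness information automatically for general memoized programs.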

Citation Context

...a library. 3.1 Analysis at Compile Time Three roughly equivalent methods of automatically determining these storage requirements have been independently investigated in the literature [18, 24, 13, 12, 26, 8, 27]. All three methods work at compile time and attempt to recover enough dependence information to conservatively prove that certain memoized facts will never again be accessed. ...

Hot Pages: Software Caching for Raw Microprocessors

by Csaba Andras Moritz, Matthew Frank, Walter Lee, Saman Amarasinghe
Abstract
This paper describes Hot Pages, a software solution for managing on-chip data on the Raw Machine, a scalable, parallel, microprocessor architecture. This software system transparently manages the mapping between the program address space and on-chip memory. Hot Pages combines compile-time information to selectively virtualize memory references and to eliminate many cache-tag lookups. For many of the memory accesses that cannot be fully predicted, Hot Pages replaces the cache-tag lookups with simple register comparisons by reusing translated virtual page descriptions from earlier nearby memory references. Hot Pages implements a multi-bank memory structure, allowing multiple references in parallel, to provide memory bandwidth matched to the computational resources on the Raw microprocessor. Because virtualization is handled in software rather than hardware, the system is easier to test, it is more predictable, and provides the flexibility of application specific customized caching solutions. For the applications studied the Hot Pages system eliminates on average more than 90% of the cache-tag lookups and could be applied to reduce the power required for data caching. The performance of Hot Pages scales with added processors and for many applications is comparable with that of hardware solutions. Hot Pages is a credible new foundation for caching, opening up a new dimension for research in additional application specific software caching optimizations.
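The fast path the abstract describes, replacing a cache-tag lookup with a single comparison against the last translated page, can be sketched as follows. This is a toy model with invented names; the real system manages Raw's on-chip memory in compiled code, not Python dictionaries.

```python
PAGE_BITS = 6
PAGE_SIZE = 1 << PAGE_BITS

class SoftCache:
    """Toy software cache: a full table lookup per access, except that an
    access on the same page as the previous one reuses the prior
    translation, a single compare standing in for the 'register
    comparison' fast path."""
    def __init__(self, backing):
        self.backing = backing
        self.pages = {}             # virtual page number -> on-chip copy
        self.last_vpn = None
        self.last_page = None
        self.slow_lookups = 0       # how often the fast path missed

    def load(self, addr):
        vpn = addr >> PAGE_BITS
        if vpn != self.last_vpn:    # fast-path check: same page as before?
            self.slow_lookups += 1
            if vpn not in self.pages:
                base = vpn << PAGE_BITS
                self.pages[vpn] = self.backing[base:base + PAGE_SIZE]
            self.last_vpn, self.last_page = vpn, self.pages[vpn]
        return self.last_page[addr & (PAGE_SIZE - 1)]
```

A sequential scan touches each page once on the slow path and resolves every other access with the single comparison, which mirrors the 90%-plus tag-lookup elimination the abstract reports for predictable access streams.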

Citation Context

...ation is to completely eliminate the virtualization software overhead for certain classes of memory accesses analyzable in the compiler. There are several possible approaches to this (see for example [8]). Our current system targets inner loops with affine array accesses. The idea is to strip-mine each inner loop of the program into double nested loops, in such a way that we can guarantee that the me...

Strength Reduction of Integer Division and . . .

by Saman Amarasinghe , 2001
Abstract not found

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University