Shade: A fast instruction-set simulator for execution profiling. (1993)

by R. F. Cmelik, D. Keppel

Results 1 - 10 of 383

Valgrind: A program supervision framework

by Nicholas Nethercote, Julian Seward - In Third Workshop on Runtime Verification (RV’03), 2003
"... a;1 ..."
Abstract - Cited by 250 (7 self) - Add to MetaCart
Abstract not found
(Show Context)

Citation Context

...8], statically instrument program binaries; others instrument parse trees. However, we will concentrate on instrumenters that use dynamic binary translation, which are more similar to Valgrind. Shade [6] was an early dynamic translator. It supported insertion of basic trace instrumentation, and could run programs written for some architectures on some others, e.g. SPARC V8 programs on MIPS or SPARC V...

Dynamic storage allocation: A survey and critical review

by Paul R. Wilson, Mark S. Johnstone, Michael Neely, David Boles , 1995
"... Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their de ..."
Abstract - Cited by 241 (6 self) - Add to MetaCart
Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their design and evaluation. We then chronologically survey most of the literature on allocators between 1961 and 1995. (Scores of papers are discussed, in varying detail, and over 150 references are given.) We argue that allocator designs have been unduly restricted by an emphasis on mechanism, rather than policy, while the latter is more important; higher-level strategic issues are still more important, but have not been given much attention. Most theoretical analyses and empirical allocator evaluations to date have relied on very strong assumptions of randomness and independence, but real program behavior exhibits important regularities that must be exploited if allocators are to perform well in practice.

Citation Context

...simulators are available for processing these traces. Larus' QPT tool (a successor to the earlier AE system [BL92]) modifies an executable program to make it self-tracing. The Shade tool from SunLabs [CK93] is essentially a CPU emulator, which runs a program in emulation and records various kinds of events in an extremely flexible way. For good performance, it uses dynamic compilation techniques to incr...
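
The context above describes Shade's mechanism: a CPU emulator that runs the program and records per-instruction events at a user-controlled level of detail, using dynamic compilation for speed. As a minimal sketch of just the event-recording half of that idea, the C program below interprets a tiny made-up ISA and appends one record per executed instruction to a caller-supplied trace buffer. The ISA, the TraceRecord layout, and every name here are invented for illustration; they are not Shade's interfaces or trace format.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical four-instruction ISA, invented for this sketch. */
enum { OP_LOADI, OP_ADD, OP_STORE, OP_HALT };

typedef struct { uint8_t op, rd, rs1, rs2; uint32_t imm; } Insn;

/* One trace record per executed instruction, in the spirit of the
 * per-instruction events the citation describes (not Shade's format). */
typedef struct { uint32_t pc; uint8_t op; uint32_t mem_addr; } TraceRecord;

#define NREGS 8
#define MEMSZ 64

static uint32_t regs[NREGS];
static uint32_t mem[MEMSZ];

/* Interpret `prog`, appending one TraceRecord per executed instruction
 * to `trace`.  Returns the number of records written. */
static size_t run_traced(const Insn *prog, TraceRecord *trace, size_t max) {
    size_t n = 0;
    for (uint32_t pc = 0; n < max; pc++) {
        const Insn *i = &prog[pc];
        TraceRecord r = { pc, i->op, 0 };
        switch (i->op) {
        case OP_LOADI: regs[i->rd] = i->imm; break;
        case OP_ADD:   regs[i->rd] = regs[i->rs1] + regs[i->rs2]; break;
        case OP_STORE: r.mem_addr = i->imm % MEMSZ;
                       mem[r.mem_addr] = regs[i->rd]; break;
        case OP_HALT:  trace[n++] = r; return n;
        }
        trace[n++] = r;
    }
    return n;
}

int main(void) {
    const Insn prog[] = {
        { OP_LOADI, 0, 0, 0, 2 },   /* r0 = 2          */
        { OP_LOADI, 1, 0, 0, 40 },  /* r1 = 40         */
        { OP_ADD,   2, 0, 1, 0 },   /* r2 = r0 + r1    */
        { OP_STORE, 2, 0, 0, 5 },   /* mem[5] = r2     */
        { OP_HALT,  0, 0, 0, 0 },
    };
    TraceRecord trace[16];
    size_t n = run_traced(prog, trace, 16);
    for (size_t k = 0; k < n; k++)
        printf("pc=%u op=%u mem=%u\n", (unsigned)trace[k].pc,
               (unsigned)trace[k].op, (unsigned)trace[k].mem_addr);
    printf("mem[5]=%u\n", (unsigned)mem[5]);
    return 0;
}
```

A real tracing emulator would, as the citation notes, translate hot code dynamically rather than interpret it; the sketch only shows the shape of the per-instruction event stream.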

DAISY: Dynamic Compilation for 100% Architectural Compatibility

by Kemal Ebcioglu, Erik R. Altman , 1997
"... Although VLIW architectures offer the advantages of simplicity of design and high issue rates, a major impediment to their use is that they are not compatible with the existing software base. We describe new simple hardware features for a VLIW machine we call DAISY (Dynamically Architected Instructi ..."
Abstract - Cited by 206 (13 self) - Add to MetaCart
Although VLIW architectures offer the advantages of simplicity of design and high issue rates, a major impediment to their use is that they are not compatible with the existing software base. We describe new simple hardware features for a VLIW machine we call DAISY (Dynamically Architected Instruction Set from Yorktown). DAISY is specifically intended to emulate existing architectures, so that all existing software for an old architecture (including operating system kernel code) runs without changes on the VLIW. Each time a new fragment of code is executed for the first time, the code is translated to VLIW primitives, parallelized and saved in a portion of main memory not visible to the old architecture, by a Virtual Machine Monitor (software) residing in read only memory. Subsequent executions of the same fragment do not require a translation (unless cast out). We discuss the architectural requirements for such a VLIW, to deal with issues including self-modifying code, precise exceptions, and aggressive reordering of memory references in the presence of strong MP consistency and memory mapped I/O. We have implemented the dynamic parallelization algorithms for the PowerPC architecture. The initial results show high degrees of instruction level parallelism with reasonable translation overhead and memory usage.
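
The abstract's central mechanism is translate-on-first-execution with a persistent cache: a fragment is translated to VLIW primitives the first time it runs, stored in memory the old architecture cannot see, and reused until it is cast out. The C sketch below shows only that caching discipline, not any VLIW translation: a tiny direct-mapped table keyed by the fragment's entry address, a stand-in translate() step, and cast-out on slot conflicts. All names (Fragment, lookup_or_translate, the slot count) are invented for this illustration.

```c
#include <stdio.h>
#include <stdint.h>

/* A "translated fragment": in DAISY this would be VLIW code stored in
 * memory the old architecture cannot see; here it is just a stub record. */
typedef struct {
    uint32_t entry_pc;   /* entry address in the old architecture */
    int      valid;      /* cleared when the fragment is cast out */
    int      n_insns;    /* stand-in for the generated VLIW code  */
} Fragment;

#define CACHE_SLOTS 8     /* tiny direct-mapped cache, for illustration */
static Fragment cache[CACHE_SLOTS];
static int translations = 0;

/* Pretend translation: derive a fragment size from the entry address. */
static Fragment translate(uint32_t entry_pc) {
    Fragment f = { entry_pc, 1, (int)(entry_pc % 5) + 1 };
    translations++;
    return f;
}

/* Translate on first execution, reuse afterwards; a conflicting entry
 * in the same slot is cast out (overwritten) and retranslated later. */
static const Fragment *lookup_or_translate(uint32_t entry_pc) {
    Fragment *slot = &cache[entry_pc % CACHE_SLOTS];
    if (!slot->valid || slot->entry_pc != entry_pc)
        *slot = translate(entry_pc);          /* miss or cast-out */
    return slot;                              /* hit: no translation cost */
}

int main(void) {
    uint32_t trace[] = { 0x100, 0x104, 0x100, 0x100, 0x104, 0x108 };
    size_t n = sizeof trace / sizeof trace[0];
    for (size_t i = 0; i < n; i++) {
        const Fragment *f = lookup_or_translate(trace[i]);
        printf("pc=0x%x -> fragment with %d insns\n",
               (unsigned)trace[i], f->n_insns);
    }
    printf("%d translations for %zu executions\n", translations, n);
    return 0;
}
```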

Understanding data lifetime via whole system simulation

by Jim Chow, Ben Pfaff, Tal Garfinkel, Kevin Christopher, Mendel Rosenblum - In USENIX Security Symposium , 2004
"... Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. ..."
Abstract - Cited by 197 (5 self) - Add to MetaCart
Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.

Citation Context

...plored in SimOS [22]. Dynamic binary translators which operate at the single process level instead of the whole system level have demonstrated significant power for doing dynamic analysis of software [8]. These systems work as assembly-to-assembly translators, dynamically instrumenting binaries as they are executed, rather than as complete simulators. For example, Valgrind [19] has been widely deploye...

An Infrastructure for Adaptive Dynamic Optimization

by Derek Bruening, Timothy Garnett, Saman Amarasinghe , 2003
"... Dynamic optimization is emerging as a promising approach to overcome many of the obstacles of traditional static compilation. But while there are a number of compiler infrastructures for developing static optimizations, there are very few for developing dynamic optimizations. We present a framework ..."
Abstract - Cited by 189 (6 self) - Add to MetaCart
Dynamic optimization is emerging as a promising approach to overcome many of the obstacles of traditional static compilation. But while there are a number of compiler infrastructures for developing static optimizations, there are very few for developing dynamic optimizations. We present a framework for implementing dynamic analyses and optimizations. We provide an interface for building external modules, or clients, for the DynamoRIO dynamic code modification system. This interface abstracts away many low-level details of the DynamoRIO runtime system while exposing a simple and powerful, yet efficient and lightweight, API. This is achieved by restricting optimization units to linear streams of code and using adaptive levels of detail for representing instructions. The interface is not restricted to optimization and can be used for instrumentation, profiling, dynamic translation, etc. To demonstrate ...
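
The abstract describes DynamoRIO's client model: tool writers register callbacks that receive linear streams of code and may insert instrumentation before the code runs. The C sketch below imitates only that shape with a deliberately invented interface; none of these names (register_bb_event, BasicBlock, and so on) are the real DynamoRIO API, and real clients operate on machine instructions rather than mnemonic strings. The example client prepends a block counter to every block it is handed.

```c
#include <stdio.h>

/* Hypothetical client interface, invented for illustration only: the
 * runtime hands the client one linear block of instructions at a time,
 * and the client may prepend instrumentation before the block runs. */

typedef struct { const char *mnemonic; } Insn;   /* coarse instruction */

typedef struct {
    Insn insns[16];
    int  n;
} BasicBlock;

/* Client callback: invoked once per block before it is (re)emitted. */
typedef void (*bb_callback_t)(BasicBlock *bb, void *user_data);

static bb_callback_t g_cb;
static void *g_cb_data;

static void register_bb_event(bb_callback_t cb, void *user_data) {
    g_cb = cb;
    g_cb_data = user_data;
}

/* "Runtime" side: before executing a block, let the client edit it. */
static void run_block(BasicBlock *bb) {
    if (g_cb)
        g_cb(bb, g_cb_data);
    for (int i = 0; i < bb->n; i++)
        printf("  exec %s\n", bb->insns[i].mnemonic);
}

/* Client: count blocks by prepending a pseudo-instruction. */
static void count_blocks(BasicBlock *bb, void *user_data) {
    long *counter = user_data;
    (*counter)++;                               /* profiling state */
    for (int i = bb->n; i > 0; i--)             /* shift block down and  */
        bb->insns[i] = bb->insns[i - 1];        /* prepend a marker that */
    bb->insns[0].mnemonic = "inc <bb_counter>"; /* stands in for real    */
    bb->n++;                                    /* inserted code         */
}

int main(void) {
    long bb_count = 0;
    register_bb_event(count_blocks, &bb_count);

    BasicBlock b1 = { { {"mov"}, {"add"}, {"jmp"} }, 3 };
    BasicBlock b2 = { { {"cmp"}, {"jne"} }, 2 };
    run_block(&b1);
    run_block(&b2);

    printf("blocks instrumented: %ld\n", bb_count);
    return 0;
}
```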

Optimizing ML with Run-Time Code Generation

by Peter Lee, Mark Leone - In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation
"... We describe the design and implementation of a compiler that automatically translates ordinary programs written in a subset of ML into code that generates native code at run time. Run-time code generation can make use of values and invariants that cannot be exploited at compile time, yielding code t ..."
Abstract - Cited by 174 (10 self) - Add to MetaCart
We describe the design and implementation of a compiler that automatically translates ordinary programs written in a subset of ML into code that generates native code at run time. Run-time code generation can make use of values and invariants that cannot be exploited at compile time, yielding code that is often superior to statically optimal code. But the cost of optimizing and generating code at run time can be prohibitive. We demonstrate how compile-time specialization can reduce the cost of run-time code generation by an order of magnitude without greatly affecting code quality. Several benchmark programs are examined, which exhibit an average cost of only six cycles per instruction generated at run time.

1 Introduction

In this paper, we describe our experience with a prototype system for run-time code generation. Our system, called Fabius, is a compiler that takes ordinary programs written in a subset of ML and automatically compiles them into native code that generates native c...

Trace-Driven Memory Simulation: A Survey

by Richard A. Uhlig, Trevor N. Mudge - ACM Computing Surveys , 2004
"... This article surveys and analyzes these developments by establishing criteria for evaluating trace-driven methods, and then applies these criteria to describe, categorize, and compare over 50 trace-driven simulation tools. We discuss the strengths and weaknesses of different approaches and show t ..."
Abstract - Cited by 163 (0 self) - Add to MetaCart
This article surveys and analyzes these developments by establishing criteria for evaluating trace-driven methods, and then applies these criteria to describe, categorize, and compare over 50 trace-driven simulation tools. We discuss the strengths and weaknesses of different approaches and show that no single method is best when all criteria, including accuracy, speed, memory, flexibility, portability, expense, and ease of use are considered. In a concluding section, we examine fundamental limitations to trace-driven simulation, and survey some recent developments in memory simulation that may overcome these bottlenecks

Optimizing Dynamically-Dispatched Calls with Run-Time Type Feedback

by Urs Hölzle, David Ungar , 1994
"... Object-oriented programs are difficult to optimize because they execute many dynamically-dispatched calls. These calls cannot easily be eliminated because the compiler does not know which callee will be invoked at runtime. We have developed a simple technique that feeds back type information from t ..."
Abstract - Cited by 159 (8 self) - Add to MetaCart
Object-oriented programs are difficult to optimize because they execute many dynamically-dispatched calls. These calls cannot easily be eliminated because the compiler does not know which callee will be invoked at runtime. We have developed a simple technique that feeds back type information from the runtime system to the compiler. With this type feedback, the compiler can inline any dynamically-dispatched call. Our compiler drastically reduces the call frequency of a suite of large SELF applications (by a factor of 3.6) and improves performance by a factor of 1.7. We believe that type feedback could significantly reduce call frequencies and improve performance for most other object-oriented languages (statically-typed or not) as well as for languages with type-dependent operations such as generic arithmetic.
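
The transformation this abstract describes is easy to picture at a single call site: the runtime records which receiver class actually shows up, and the recompiled code tests for that class and runs an inlined copy of the common method, falling back to normal dynamic dispatch otherwise. The C sketch below writes out that transformed call site by hand; the object layout and the PREDICTED_CLASS constant stand in for information type feedback would supply and are not taken from the SELF implementation.

```c
#include <stdio.h>

/* Minimal "object model": each object carries a class, and each class
 * supplies one virtual method, area(). */
typedef struct Obj Obj;
typedef double (*area_fn)(const Obj *);

typedef struct { int class_id; area_fn area; } Class;
struct Obj { const Class *cls; double a, b; };

static double circle_area(const Obj *o) { return 3.14159265 * o->a * o->a; }
static double rect_area(const Obj *o)   { return o->a * o->b; }

static const Class CIRCLE = { 1, circle_area };
static const Class RECT   = { 2, rect_area };

/* Untransformed call site: always dispatch through the class. */
static double area_dispatched(const Obj *o) {
    return o->cls->area(o);
}

/* Call site after type feedback: the runtime observed that most
 * receivers here were circles, so the compiler inlines circle_area
 * behind a cheap class test and keeps dispatch as the slow path.
 * PREDICTED_CLASS stands in for the recorded feedback. */
#define PREDICTED_CLASS 1
static double area_with_feedback(const Obj *o) {
    if (o->cls->class_id == PREDICTED_CLASS)    /* cheap guard       */
        return 3.14159265 * o->a * o->a;        /* inlined fast path */
    return o->cls->area(o);                     /* fallback dispatch */
}

int main(void) {
    Obj c = { &CIRCLE, 2.0, 0.0 };
    Obj r = { &RECT,   3.0, 4.0 };
    printf("dispatched:    %.2f %.2f\n", area_dispatched(&c), area_dispatched(&r));
    printf("with feedback: %.2f %.2f\n", area_with_feedback(&c), area_with_feedback(&r));
    return 0;
}
```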

VCODE: A Retargetable, Extensible, Very Fast Dynamic Code Generation System

by Dawson R. Engler
"... Dynamic code generation is the creation of executable code at runtime. Such “on-the-fly ” code generation is a powerful technique, enabling applications to use runtime information to improve performance by up to an order of magnitude [4, 8, 20, 22, 23]. Unfortunately, previous general-purpose dynami ..."
Abstract - Cited by 133 (8 self) - Add to MetaCart
Dynamic code generation is the creation of executable code at runtime. Such “on-the-fly ” code generation is a powerful technique, enabling applications to use runtime information to improve performance by up to an order of magnitude [4, 8, 20, 22, 23]. Unfortunately, previous general-purpose dynamic code generation systems have been either inefficient or non-portable. We present VCODE, a retargetable, extensible, very fast dynamic code generation system. An important feature of VCODE is that it generates machine code “in-place” without the use of intermediate data structures. Eliminating the need to construct and consume an intermediate representation at runtime makes VCODE both efficient and extensible. VCODE dynamically generates code at an approximate cost of six to ten instructions per generated instruction, making it over an order of magnitude faster than the most efficient general-purpose code generation system in the literature [10]. Dynamic code generation is relatively well known within the compiler community. However, due in large part to the lack of a publicly available dynamic code generation system, it has remained a curiosity rather than a widely used technique. A practical contribution of this work is the free, unrestricted distribution of the VCODE system, which currently runs on the MIPS, SPARC, and Alpha architectures.

Citation Context

...eation of executable code at runtime. Such "on-the-fly" code generation is a powerful technique, enabling applications to use runtime information to improve performance by up to an order of magnitude [4, 8, 20, 22, 23]. Unfortunately, previous general-purpose dynamic code generation systems have been either inefficient or non-portable. We present VCODE, a retargetable, extensible, very fast dynamic code generation ...
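
VCODE's distinguishing claim above is in-place generation: code-emitting calls write machine instructions directly into a buffer, with no intermediate representation built or consumed at runtime. The sketch below shows that style at its crudest, assuming an x86-64 POSIX system (VCODE itself targeted MIPS, SPARC, and Alpha): it writes the encoding of mov eax, imm32; ret into an executable page and calls the result. The helper names are invented for the sketch and are unrelated to VCODE's real primitives.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Append raw bytes to the code buffer, returning the new write cursor. */
static uint8_t *emit(uint8_t *p, const uint8_t *bytes, size_t n) {
    memcpy(p, bytes, n);
    return p + n;
}

/* Generate, in place, a function equivalent to:  int f(void) { return k; }
 * x86-64 encoding:  B8 imm32   mov eax, imm32
 *                   C3         ret                                        */
static uint8_t *gen_return_const(uint8_t *p, int32_t k) {
    uint8_t mov_eax = 0xB8;
    uint8_t ret     = 0xC3;
    p = emit(p, &mov_eax, 1);
    p = emit(p, (const uint8_t *)&k, 4);   /* little-endian immediate */
    return emit(p, &ret, 1);
}

int main(void) {
    /* One read/write/execute page to hold the generated code. */
    size_t len = 4096;
    uint8_t *buf = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    uint8_t *end = gen_return_const(buf, 42);
    printf("generated %ld bytes of code\n", (long)(end - buf));

    int (*f)(void) = (int (*)(void))buf;   /* call the fresh code */
    printf("f() = %d\n", f());

    munmap(buf, len);
    return 0;
}
```

Mapping the page read/write/execute keeps the example short; a production generator would map writable memory, flip it to executable before running it, and flush the instruction cache on architectures that require it.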

CommBench - A Telecommunications Benchmark for Network Processors

by Tilman Wolf, Mark Franklin - In Proc. of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2000
"... This paper presents a benchmark, CommBench, for use in evaluating and designing telecommunications network processors. The benchmark applications focus on small, computationally intense program kernels typical of the network processor environment. The benchmark ..."
Abstract - Cited by 125 (20 self) - Add to MetaCart
This paper presents a benchmark, CommBench, for use in evaluating and designing telecommunications network processors. The benchmark applications focus on small, computationally intense program kernels typical of the network processor environment. The benchmark