Results 1 - 10
of
102
Bounding Pipeline and Instruction Cache Performance
- IEEE Transactions on Computers
, 1999
"... Predicting the execution time of code segments in real-time systems is challenging. Most recently designed machines contain pipelines and caches. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require sever ..."
Abstract
-
Cited by 104 (22 self)
- Add to MetaCart
Predicting the execution time of code segments in real-time systems is challenging. Most recently designed machines contain pipelines and caches. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to apipeline hazard oracache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst and best-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program’s control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst and best-case execution performance of each loop and function in the program. Finally, agraphical user interface is invoked that allows a user to request timing predictions on portions of the program. The results indicate that the timing analyzer efficiently produces tight predictions of worst and best-case performance for pipelining and instruction caching. Index terms: real-time systems, worst-case execution time, best-case execution time, timing analysis, instruction cache, pipelining
Integrating the timing analysis of pipelining and instruction caching
- In IEEE Real-Time Systems Symposium
, 1995
"... Recently designed machines contain pipelines and caches. While both features provide significant performance advantages, they also pose problems for predicting execution time of code segments in real-time systems. Pipeline hazards may result in multicycle delays. Instruction or data memory reference ..."
Abstract
-
Cited by 92 (16 self)
- Add to MetaCart
Recently designed machines contain pipelines and caches. While both features provide significant performance advantages, they also pose problems for predicting execution time of code segments in real-time systems. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program’s control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst-case execution performance of each loop and function in the program. Finally, agraphical user interface is invoked that allows a user to request timing predictions on portions of the program. 1.
A Retargetable Technique for Predicting Execution Time
- In IEEE Real-Time Systems Symposium
, 1992
"... Predicting the execution times of straight-line code sequences is a fundamental problem in the design and evaluation of hard-real-time systems. The reliability of system-level timings and schedulability analysis rests on the accuracy of execution time predictions for the basic schedulable units of ..."
Abstract
-
Cited by 82 (11 self)
- Add to MetaCart
Predicting the execution times of straight-line code sequences is a fundamental problem in the design and evaluation of hard-real-time systems. The reliability of system-level timings and schedulability analysis rests on the accuracy of execution time predictions for the basic schedulable units of work. Obtaining such predictions for contemporary microprocessors is difficult. First a summary of some of the hardware and software factors that make predicting execution time difficult is presented, along with the results of experiments that evaluate the degree of variation in execution time that may be caused by these factors. Traditional methods of measuring and predicting execution time are examined, and their strengths and weaknesses discussed. Second, we present a new technique for predicting point-to-point execution times on contemporary microprocessors. This technique is called micro-analysis. It uses machine-description rules, similar to those that have proven useful for c...
A Comparison of Static Analysis and Evolutionary Testing for the Verification of Timing Constraints
- Real-Time Systems
, 1998
"... This paper contrasts two methods to verify timing constraints of real-time applications. The method of static analysis predicts the worst-case and best-case execution times of a task's code by analyzing execution paths and simulating processor characteristics without ever executing the program or re ..."
Abstract
-
Cited by 75 (30 self)
- Add to MetaCart
This paper contrasts two methods to verify timing constraints of real-time applications. The method of static analysis predicts the worst-case and best-case execution times of a task's code by analyzing execution paths and simulating processor characteristics without ever executing the program or requiring the program's input. Evolutionary testing is an iterative testing procedure, which approximates the extreme execution times within several generations. By executing the test object dynamically and measuring the execution times the inputs are guided yielding gradually tighter predictions of the extreme execution times. We examined both approaches on a number of real world examples. The results show that static analysis and evolutionary testing are complementary methods, which together provide upper and lower bounds for both worst-case and best-case execution times. 1. Introduction For real-time systems the correct system functionality depends on their logical correctness as well as o...
Timing Analysis for Data Caches and Set-Associative Caches
, 1997
"... The contributions of this paper are twofold. First, an automatic tool-based approach is described to bound worst-case data cache performance. The given approach works on fully optimized code, performs the analysis over the entire control flow of a program, detects and exploits both spatial and tempo ..."
Abstract
-
Cited by 63 (11 self)
- Add to MetaCart
The contributions of this paper are twofold. First, an automatic tool-based approach is described to bound worst-case data cache performance. The given approach works on fully optimized code, performs the analysis over the entire control flow of a program, detects and exploits both spatial and temporal locality within data references, produces results typically within a few seconds, and estimates, on average, 30% tighter WCET bounds than can be predicted without analyzing data cache behavior. Results obtained by running the system on representative programs are presented and indicate that timing analysis of data cache behavior can result in significantly tighter worst-case performance predictions. Second, a framework to bound worst-case instruction cache performance for set-associative caches is formally introduced and operationally described. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain...
C--: A Portable Assembly Language That Supports Garbage Collection
- IN INTERNATIONAL CONFERENCE ON PRINCIPLES AND PRACTICE OF DECLARATIVE PROGRAMMING
, 1999
"... For a compiler writer, generating good machine code for a variety of platforms is hard work. One might try to reuse a retargetable code generator, but code generators are complex and difficult to use, and they limit one's choice of implementation language. One might try to use C as a portable ass ..."
Abstract
-
Cited by 62 (19 self)
- Add to MetaCart
For a compiler writer, generating good machine code for a variety of platforms is hard work. One might try to reuse a retargetable code generator, but code generators are complex and difficult to use, and they limit one's choice of implementation language. One might try to use C as a portable assembly language, but C limits the compiler writer's flexibility and the performance of the resulting code. The wide use of C, despite these drawbacks, argues for a portable assembly language. C-- is a new language designed expressly for this purpose. The use
Timing Analysis for Instruction Caches
- REAL-TIME SYSTEMS
, 2000
"... This paper contributes a comprehensive study of a framework to bound worst-case instruction cache performance for caches with arbitrary levels of associativity. The framework is formally introduced, operationally described and its correctness is shown. Results of incorporating instruction cache pred ..."
Abstract
-
Cited by 55 (22 self)
- Add to MetaCart
This paper contributes a comprehensive study of a framework to bound worst-case instruction cache performance for caches with arbitrary levels of associativity. The framework is formally introduced, operationally described and its correctness is shown. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches. The low cache simulation overhead allows interactive use of the analysis tool and scales well with increasing associativity. The approach taken is based on a data-ow specication of the problem and provides another step toward worst-case execution time prediction of contemporary architectures and its use in schedulability analysis for hard real-time systems.
Bounding Loop Iterations for Timing Analysis
- In Proceedings of the IEEE Real-Time Applications Symposium. IEEE CS Press, Los Alamitos, Calif
, 1998
"... Static timing analyzers need to know the minimum and maximum number of iterations associated with each loop in a real-time program so accurate timing predictions can be obtained. This paper describes three complementary methods to support timing analysis by bounding the number of loop iterations. Fi ..."
Abstract
-
Cited by 51 (8 self)
- Add to MetaCart
Static timing analyzers need to know the minimum and maximum number of iterations associated with each loop in a real-time program so accurate timing predictions can be obtained. This paper describes three complementary methods to support timing analysis by bounding the number of loop iterations. First, an algorithm is presented that determines the minimum and maximum number of iterations of loops with multiple exits. Second, the loopinvariant variables on which the number of loop iterations depends are identified for which the user can provide minimum and maximum values. Finally, a method is given to tightly predict the execution time of loops whose number of iterations is dependent on counter variables of outer level loops. These methods have been successfully integrated in an existing timing analyzer that predicts the performance for optimized code on a machine that exploits caching and pipelining. The result is tighter timing analysis predictions and less work for the user.
Finding effective optimization phase sequences
- In Proceedings of the 2003 ACM SIGPLAN conference on Language, Compiler, and Tool for Embedded Systems
, 2003
"... It has long been known that a single ordering of optimization phases will not produce the best code for every application. This phase ordering problem can be more severe when generating code for embedded systems due to the need to meet conflicting constraints on time, code size, and power consumptio ..."
Abstract
-
Cited by 45 (11 self)
- Add to MetaCart
It has long been known that a single ordering of optimization phases will not produce the best code for every application. This phase ordering problem can be more severe when generating code for embedded systems due to the need to meet conflicting constraints on time, code size, and power consumption. Given that many embedded application developers are willing to spend time tuning an application, we believe a viable approach is to allow the developer to steer the process of optimizing a function. In this paper, we describe support in VISTA, an interactive compilation system, for finding effective sequences of optimization phases. VISTA provides the user with dynamic and static performance information that can be used during an interactive compilation session to gauge the progress of improving the code. In addition, VISTA provides support for automatically using performance information to select the best optimization sequence among several attempted. One such feature is the use of a genetic algorithm to search for the most efficient sequence based on specified fitness criteria. We hav e included a number of experimental results that evaluate the effectiveness of using a genetic algorithm in VISTA to find effective optimization phase sequences.

