| Michael D. Smith, "Tracing with Pixie," Technical Report CSL-TR-91-497, Stanford University, Stanford CA, November 1991. |
....the monitoring workstation. Upon transfer, the shadowed copy is updated with the latest changes. The debug instructions can be added either before or after compilation. While the precompilation choice requires in-source-code embedding of cut I/O instructions (for example, used in the MIPS Pixie [26]), the postprocessing step encompasses object code instrumentation similar to that implemented in Purify (Purify uses this technique to locate memory access errors) [11]. The precompilation approach has a significant advantage due to independence with respect to the hardware platform. Debugging ....
M. D. Smith, "Tracing with pixie," Stanford Univ., Stanford, CA, Tech. Rep. CSL-TR-91-497, Nov. 1991.
....performance. However, on modern machines the performance of most real applications is dominated by the time to service memory accesses. In figure 17 we show the cost of the various types of instructions for a 3D electrostatic PIC code run on a DEC Alpha workstation. A profiling tool called PIXIE [12] was used to make these measurements. Execution of floating point operations accounts for only 25% of the run time, and higher level control constructs such as if statements account for only 2%. The bulk of the execution time is spent performing fetches from or stores to memory, and in integer ....
M.D. Smith, "Tracing with PIXIE", (1991).
.... 3 4 6 No No i386 File Post Process [Larus90] AE 20 65 2 5 No No MIPS, SPARC File Post Process Object [Borg89] Epoxie 8 12 8 12 5 Yes No 1 Titan Buffer Processor [Chen93a] Epoxie2 15 15 2 Yes Yes R3000 Buffer Processor Binary [Smith91] Pixie 10 10 4 6 No No MIPS File Pipe [Stephens91] Goblin 20 20 10 No No RS 6000 Linked Into Task [Pierce94a] IDtrace 12 12 12 No No i486 File Pipe [Larus93] Qpt 10 60 2 5 3 No No MIPS, SPARC File Post Process Table 2.4 ....
Smith, M. D. Tracing with pixie. Technical Report, Stanford University, Stanford, CA. 1991.
....d caches. OLUKOTUN et al. MULTILEVEL OPTIMIZATION OF PIPELINED CACHES 1095 The architectural simulations were performed using a trace driven simulator, cacheUM, originally developed to study two level cache organizations [11]. The traces were created using the MIPS program analysis tool pixie [12] from load modules of the benchmark programs listed in Table 1. A record of system calls was also made during the normal execution of the set of benchmarks. Each benchmark was used to represent a single process. To model the effect of multiprogramming, a system call file and a process ....
M.D. Smith, "Tracing with Pixie," Technical Report CSL-TR-91-497, Computer Systems Laboratory, Nov. 1991.
....in section 7. The remaining two sections discuss related work and conclusions. 2. PROGRAM PROFILE COLLECTION The primary profile information used by Spike is a set of execution counts for the basic blocks of a program. The counts can be exact counts collected by instrumenting the program [Smi91] or estimated counts collected with the DCPI statistical profiler [And97]. Instrumentation has the advantage of producing exact counts, but the instrumentation increases execution time significantly. Also, not every basic block can be instrumented in large complex programs like the Unix kernel ....
M. Smith. Tracing with Pixie. Tech. Rpt. CSL-TR-91-497, Stanford University, Nov. 1991
....instruction, to an elaborate simulation of the internal state of the processor, including updating instruction queues and cache lines. The analysis tools used in these experiments are derivatives of two programs: pixstats and xsim. Pixstats is the MIPS utility supplied for program analysis [163]. Pixstats gives detailed processor execution statistics, but assumes a perfect memory system and does not give information on cache effects. We use pixstats in our runs on the R3000 comparing compilers and a modified version of pixstats for the experiments involving floating point latencies ....
.... loop unrolling to match standard gcc unroll counts (unroll) and loop unrolling 4 iterations (referred to as unroll 4 ) The cycles executed, and other performance characteristics, are found by running pixie and analyzing the trace using pixstats, a MIPS utility supplied for program analysis [163]. Pixstats assumes a perfect memory system and does not give information on cache effects. Initially, we will ignore cache effects to simplify the number of parameters in our machine model. Later, we will explore cache effects in some of the architectural experiments. TABLE 2. Compiler Technique ....
M. D. Smith, Tracing with pixie, Stanford University, Technical April 4 1991.
....is a common software solution to simulation. Instruction set emulation and static code annotation are common approaches used to collect execution traces [12]. Binary translation is a technique used to construct fast simulators, but it is also useful for other types of software problems [9]. Pixie [10] is a program analyzer that reads the entire executable for a program and writes a new version of the program that saves profiling information as it runs. Pixie allows the user to collect only basic block execution counts and text and data traces. There is no provision for user defined routines to ....
M. D. Smith. Tracing with Pixie. Technical report, Stanford University, 1991.
....be recompiled, with the profiling data used to guide policy decisions in the compiler. A fundamental issue in profiling is the level of granularity and the specific mechanism used to collect data. Some profilers modify the binary source code to insert instructions that increment counter variables [Smith 1991]. Other profilers periodically interrupt the execution of the binary to sample the program counter [Anderson et al. 1997; Zhang et al. 1997; Graham et al. 1982]. Profiling has been used to guide decisions to inline procedures in C programs [Chang et al. 1992], to drive instruction scheduling ....
SMITH, M. 1991. Tracing with Pixie. Tech. Rep. CSL-TR-91-497, Stanford University. November.
....CINT92 (integer) and CFP92 (floating point) All programs from both sets were used to make our evaluations. The benchmarks were compiled on a R4600 based SGI workstation using cc and the standard makefiles provided with the 7 suite (with all optimizations turned on) We used the pixie profiler [24] to collect instruction traces from a real processing of the SPEC benchmarks, including library calls. These traces fed our simulator which performs a cycle by cycle simulation and gathers the mean number of instructions retired per cycle (IPC) All the benchmarks were run to completion except ....
M. D. Smith, "Tracing with Pixie," Technical report, Stanford University, April 1991.
....using the pixie dynamic tracing facility, based on the MIPS architecture [2]. In order to gather the data, pixie was run on the executable benchmark files, ignoring library code. The dynamic tracing information was sent to a branch target buffer simulator xsim, based on the program described in [16]. The results were then directed to a file for post processing. For further details on tracing programs with pixie, the reader is referred to [16]. 4 METRICS We may classify BTB accesses on five orthogonal Boolean dimensions: 1) Whether the access is a hit or a miss. 2) Whether the prediction ....
....files, ignoring library code. The dynamic tracing information was sent to a branch target buffer simulator xsim, based on the program described in [16]. The results were then directed to a file for post processing. For further details on tracing programs with pixie, the reader is referred to [16]. 4 METRICS We may classify BTB accesses on five orthogonal Boolean dimensions: 1) Whether the access is a hit or a miss. 2) Whether the prediction is taken or not taken. 3) Whether the address sent to the BTB is the address of a branch. 4) Whether the actual direction of the instruction ....
M. Smith, "Tracing with Pixie," Stanford Univ. Center for Integrated Systems, Apr. 1991.
....simulation of detailed traces collected for X Window programs. 6.1 Instruction-level Trace Collection Instruction level profiling techniques are an important aid in the design and analysis of computer architecture [67]. Some examples of instruction level profiling environments include pixie [75] for the MIPS architecture, shade [76] for SPARC, and goblin [77] for the IBM RS 6000. The data presented in this thesis are based on detailed instruction level traces collected for an X Window server running on a DECstation 3100 workstation. The traces were generated by an instrumented version ....
M. Smith, "Tracing with pixie," Tech. Rep. CSL-TR-91-497, Computer Systems Laboratory, Stanford University, Stanford, CA 94305, November 1991.
....hand, the misses optimization problem based on the placement paradigm alone is an NP complete problem. Most instruction cache optimization techniques use dynamic information; i.e. profiling information, which is gathered from executing the code on a selected set of input data. The Pixie 1 tools [11] and [10] use profiling information to find better placements for code segments. Information gathered dynamically is also used in [3, 8] for avoiding fetching into the cache instructions either when they are used only once before being purged from the cache, or because they might conflict with ....
M. D. Smith. Tracing with pixie. CSL-TR-91-497, Stanford University, Stanford, CA 94305-4055, November 1991.
....for floating point operations [14] Thus, the results obtained on a given architecture are applicable to a wide range of architectures. The results presented were obtained on the MIPS architecture, primarily due to the availability of the flexible program analysis tools pixie and pixstats [15]. Pixie reads an executable file and partitions the program into its basic blocks. It then writes a new version of the executable containing extra instructions to dynamically count the number of times each basic block is executed. The benchmarks use the standard input data sets, and each executes ....
M. D. Smith, "Tracing with pixie," Technical Report No. CSL-TR-91-497, Computer Systems Laboratory, Stanford University, November 1991.
....dependent). The tracing philosophy is to modify an NT executable or DLL (dynamic link library), patching all branches. The patches would then generate a trace record which could later be used to reconstruct the execution flow. A similar approach was used in the Pixie toolset for the MIPS platform [27]. To patch an executable or DLL, branches are overwritten with an unconditional jump to a patch section of code (appended to the end of the image). The patch section issues the appropriate PALcall (based on the type of branch that was overwritten). The PALcode captures the trace information and ....
M. Smith, "Tracing with Pixie," Technical Report, Stanford University, Stanford, CA, 1991.
....specified compiler optimization levels for benchmark programs, we generated the benchmark executables using the makefile script (i.e. M.dec risc) from the original SPEC92 suite package. Version 2.0 of C compiler was used to compile all benchmark programs written in C. The MIPS pixie tool [25] was used to instrument the executable of a benchmark program. The resulting annotated executable file runs exactly the same as the original executable, except that it also writes a stream of traces for both instruction and data to a special system file descriptor (i.e. 19) on which our cache ....
M. Smith, "Tracing with Pixie," Technical Report CSL-TR-91-497, Nov. 1991.
....: arithmetic and logic operations, shift : shifts and bit field manipulations, integer multiply, integer divide , load store: memory loads and stores . floating point arithmetic , floating point convert , floating point multiply, floating point divide . We used the pixie profiler [Smit91] in order to produce instruction traces from a real processing of the SPEC benchmarks. From all the data reported by this software, we picked out only the opcode and the memory address (in the case of a memory access) for each instruction. These traces are read by our simulator which performs a ....
M.D. Smith , "Tracing with Pixie", Stanford University , April 1991
....1 load store 2 3 shift 1 fp add conv 3 int. multiply 3 fp multiply 3 int. divide 20 fp divide 18s 31d s stands for single precision and d for double precision Table 1: Latencies table 1. All functional units but divide units, are fully pipelined and mutually independent. We used the pixie profiler [13] in order to produce instruction traces from a real processing of the SPEC benchmarks. From all the data reported by this software, we picked out only the opcode, the operands, and the dynamic memory address (in the case of a memory access) for each instruction. These traces are read by our ....
M.D. Smith, "Tracing with Pixie," Technical report, Stanford University, April 1991.
....frequencies to weight the edges, or 2) estimate the call frequencies by inspecting the CFG of the program and perform static branch prediction during graph construction. In this work we will report on using both of these methods. Dynamic profiles are available using a variety of programming tools [11, 26, 27]. A program is run with the appropriate form of instrumentation turned on, and call path frequencies are generated. For a given input, this can provide very detailed measurements to feed into a reordering algorithm, though the performance of the optimization is dependent upon how accurately the ....
M.D. Smith. Tracing with pixie. Stanford University Research Report CSL-TR-91-497, November 1991.
....2.2 Similar Experimental Methods To evaluate hardware performance under multi tasking environments, we need tools that are capable of monitoring system activities with minimal disturbance to the system under analysis. The most common monitoring tools are code annotation systems such as pixie [Smith91]. These are purely software based because they work by inserting monitoring code directly into executable images of programs. This process of inserting code is called annotation. When the annotated program is executed, the inserted code can record program activities into a predetermined file for ....
Smith, M.D. Tracing with pixie. Stanford University, Stanford, CA. 1991
....when executed on serial or parallel host machines and critical path simulation produces optimistic parallel traces from serial codes. Architecture dependent traces are acquired using hardware or software monitoring, for example on the Cedar multiprocessor [10] or using other tools such as Pixie [13]. EPG sim provides execution driven simulation capabilities in Chief. Serial or parallel application codes are instrumented to form execution driven event generators. The resulting event generators are coupled with parallel system simulators, using a runtime interface library and a lightweight ....
M. Smith, "Tracing with Pixie," technical report, Center for Integrated Systems, Stanford University, April 1991.
....but we do model the rich details of the processor including the pipeline, register renaming, the reorder buffer, branch prediction, instruction fetching, branching penalties, the memory hierarchy (including contention), etc. Table II shows the parameters of our model. We use pixie [13] to instrument the optimized MIPS object files produced by the compiler, and pipe the resulting trace into our simulator. To avoid misses during the initialization of dynamically allocated objects, we used a modified version of the IRIX mallopt routine [14] whereby we prefetch allocated objects ....
M. D. Smith, "Tracing with pixie," Tech. Rep. CSL-TR-91-497, Stanford University, November 1991.
....program was modified using the reconfigurable coprocessor instructions. The bitstream mobile was chosen as a benchmark. Figure 9: Execution Time Breakdown of MPEG 2 Decoding To determine the execution time breakdown we profiled the MPEG 2 decoding program using the profiling tool pixie [18]. Figure 9 shows the percentage of execution time spent in each stage while decoding 30 frames. The figure indicates that the execution time is distributed among MC, Add Block, IDCT, and VLD IQ. This suggests that reconfigurable coprocessors need to support multiple functions to accelerate the ....
Michael D. Smith, "Tracing with pixie", Technical Report No. CSL-TR-91-497, Computer Systems Laboratory, Stanford University, 1991.
....requires the use of large and varied input data sets during profiling. If efficient profiling techniques are not used, then collecting this information could take a very long time. The profiling technique used to obtain the average software execution time is the object code annotation tool pixie [13]. This tool captures the dynamic execution frequencies for each basic block in the object code. By analyzing the execution time of each basic block on the target processor architecture and multiplying this time by the execution frequency of the block, an exact count of the number of cycles executed ....
M. D. Smith, "Tracing with Pixie," Technical Report CSL-TR-91-497, Stanford University, Computer Systems Laboratory, Nov. 1991.
No context found.
Michael D. Smith, "Tracing with Pixie," Technical Report CSL-TR-91-497, Stanford University, Stanford CA, November 1991.
....then feeds that trace to a trace driven simulation program. The usefulness of instrumentation tools is obvious from a quick glance at current research publications in the area, where a significant number of authors use traces generated by two of the most popular instrumentation tools: pixie [23] and spixtools [6]. These tools are popular because of their applicability to many architectures and programs, their relatively low overhead, and their simplicity of use. This chapter's focus is the design of instrumentation tools. Section 1 describes how instrumentation tools fit into the broad ....
....the execution of handwritten or other non compiled assembly code. 4.2 pixie and nixie Pixie was the first binary instrumentation tool which received widespread use. Pixie is a full execution trace generation tool which runs on MIPS R2000, R3000 and R4000 based systems [23]. The tool is included in the performance debugging software package of most systems based upon the MIPS architecture. Versions are available which instrument ECOFF and ELF file formats. With newer versions of pixie, if pixified dynamic libraries exist, they can be linked into the instrumented ....
[Article contains additional citation context not shown here]
M. Smith, "Tracing with Pixie," Technical Report CSL-TR-91-497, Center for Integrated Systems, Stanford University, Nov. 1991.