19 citations found.
Uhlig, R., Nagle, D., Mudge, T. and Sechrest, S. Trap-driven simulation with Tapeworm II. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, ACM Press (SIGARCH), 132-144, 1994.

This paper is cited in the following contexts:
Code Cloning Tracing: A New Approach To Trace Collection - Lafage, Seznec, Rohou, Bodin (1998)   (Correct)

....such an ideal trace collection. Most current trace collection platforms are limited to collecting traces from a single application, generally only user instructions in this application [15], while it was shown that the usage of such incomplete traces leads to significantly misleading conclusions [1, 16, 17]. Another concern for microarchitecture studies is the performance of trace collection. Among the numerous trace collection platforms available [15], none of them allows a slowdown lower than 10 for simply collecting addresses for memory hierarchy simulations. For studies like value prediction ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest. Trap-driven simulation with Tapeworm II. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), pages 132--144, San Jose, California, October 1994.


Low Perturbation Address Trace Collection with Simple.. - Russell Daigle Chun   (Correct)

....been applied to a uniprocessor operating system. However, these are hardly the right tools to capture complete traces of a multiprogrammed parallel workload that includes applications, operating system and several daemons. A second type of software-based systems are trap-driven simulation systems [20]. In this approach, simulations are driven by kernel traps. These traps allow the simulation of a cache as the kernel executes. Indeed, memory traps are set on addresses that are currently not in the simulated cache. When that address is accessed, the kernel traps. After trapping, the kernel ....

....TLBs can be simulated in a similar way. With this approach, both operating system and application effects are considered. However, a major disadvantage of this scheme is that it cannot generate as much information as systems based on traces. Examples of this approach are Uhlig et al.'s Tapeworm II [20] and Talluri's TLB simulator [17]. Finally, the last type of software-based system is exemplified by SimOS [15]. In SimOS, the hardware of a machine is simulated with enough detail to run an entire operating system. On top of this operating system, we can run applications. With this system, we ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest. Trap-Driven Simulation with Tapeworm II. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 37--47, October 1994.
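
The trap-driven scheme described in the contexts above (traps armed on every address not in the simulated cache, with the handler running only on a miss) can be sketched as follows. This is a toy model under stated assumptions, not Tapeworm II's implementation: the class name, the line size, and the FIFO replacement policy are all illustrative choices. FIFO is used because a handler that runs only on misses never observes hits, so it has no information with which to maintain LRU order.

```python
from collections import OrderedDict


class TrapDrivenCacheSim:
    """Toy model of trap-driven cache simulation (hypothetical, for illustration).

    A line that is NOT in the simulated cache is treated as having a trap
    armed on it. Touching such a line invokes the handler; touching a
    resident line costs nothing, mirroring a reference that runs at full
    hardware speed.
    """

    def __init__(self, num_lines, line_size=16):
        self.num_lines = num_lines      # capacity of the simulated cache, in lines
        self.line_size = line_size      # bytes per line (assumed value)
        self.resident = OrderedDict()   # lines without a trap, in arrival order
        self.traps_taken = 0

    def access(self, address):
        """Return True if this reference trapped (a simulated miss)."""
        line = address // self.line_size
        if line in self.resident:
            return False                # no trap armed: simulator never runs
        self._trap_handler(line)
        return True

    def _trap_handler(self, line):
        """Runs only on a miss: count it, evict a victim, admit the new line."""
        self.traps_taken += 1
        if len(self.resident) >= self.num_lines:
            # Evicting the oldest line is equivalent to re-arming its trap.
            self.resident.popitem(last=False)
        # Admitting the line is equivalent to clearing its trap.
        self.resident[line] = True


sim = TrapDrivenCacheSim(num_lines=2)
trapped = [sim.access(a) for a in (0, 4, 16, 0, 32, 0)]
print(trapped, sim.traps_taken)
```

Only the references that return True would invoke simulation software on a real host; the rest proceed untouched, which is the source of the speed advantage the citing papers describe.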


AVM: Application-Level Virtual Memory - Dawson Engler Sandeep (1995)   (24 citations)  (Correct)

....that AVM is required if applications are to achieve any consistent degree of flexibility. 2.1 Examples Fine-grain monitoring. AVM (by necessity) gives very precise information about and control over the TLB. This information can be used to derive working sets [18] or to trace address streams [22]. Accurate in-core information. AVM systems have total control over virtual memory mappings. Their accurate knowledge of which pages are resident in memory can be useful to many types of applications. For example, a scientific program manipulating large matrices could work on those pieces that ....

Richard Uhlig, David Nagle, Trevor Mudge, and Stuart Sechrest. Trap-driven simulation with Tapeworm II. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), pages 132--144, October 1994.


Augmint - A Multiprocessor Simulation Environment for.. - Sharma, Nguyen.. (1996)   (25 citations)  (Correct)

....the latter approach leads to more efficient simulation. Finally, we examine trap-driven simulation. In this approach, traps are set on memory locations by making use of error correction codes and the simulator takes control only during the traps. This approach has been used in WWT [8] and Tapeworm [9]. The advantage of trap-driven simulation is that memory reference events that do not miss in the simulated memory hierarchy do not cause a trap. As a result, the simulation is faster. However, since this approach is heavily dependent on the underlying architecture, it cannot be used ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest. "Trap-driven simulation with Tapeworm II," 6th Int. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), Oct. 1994, pp. 132-144.


Improving the Address Translation Performance of Widely Shared .. - Yousef Khalidi (1995)   (3 citations)  (Correct)

....SPEC programs were compiled to use dynamic linking. SPEC benchmarks are normally linked statically, but most real-world programs on Solaris use dynamic linking to libraries. 9.2 Methodology We evaluate the performance of a common mask TLB using trap-driven simulation [17] implemented in foxtrot [15], a Solaris 2.1-based operating system that counts the number of user TLB misses for a workload. Our simulation environment does not include kernel TLB misses but includes the effect of context switches in the multi-programmed workloads. Kernel TLB misses will ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest, "Trap-driven Simulation with Tapeworm II," 6th Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), October 1994, pp. 132-144.


Efficient Memory Simulation in SimICS - Magnusson, Werner (1995)   (22 citations)  (Correct)

.... and OS workloads involved microcode or different forms of hardware monitoring or modifications [1, 13, 30, 41]. Recently, more flexible techniques have been developed that rely only on the manipulation of ECC bits in the host memory and clever modifications to the host operating system [32, 42]. In general, these techniques are unwieldy and inflexible. Instrumenting the program binary is a common software solution to simulation. However, this approach tends to sharply constrain the class of programs that can be studied, in particular the programming language and/or compiler being used. ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest. Trap--driven Simulation with Tapeworm II. In Proceedings of ASPLOS--VI, pages 132--144, October 1994.


Global Memory Management for Workstation Networks - Feeley (1996)   (2 citations)  (Correct)

....state bits for SVM on the CM5; they used ECC bits to cause faults, however, this would still require emulating writes, and the Alpha 250 has imprecise exceptions on data parity errors, making use of parity difficult or impossible. Similar techniques have been used for trace production as well [78]. Footnote 2: The IBM 801 used a similar scheme to manage transactions on units of less than a page; in their case, for each 128-byte line [17]. Table 8.2: Page fault latencies for Eager Fullpage Fetch from remote memory. Latencies are arrival times of subpage and rest of page. Improvement ....

Richard Uhlig, David Nagle, Trevor Mudge, and Stuart Sechrest. Trap-driven simulation with Tapeworm II. In Proc. of the 6th Int. Conf. on Arch. Support for Prog. Languages and Operating Systems, October 1994.


Active Memory: A New Abstraction for Memory-System Simulation - Lebeck, Wood (1995)   (18 citations)  (Correct)

....a similar optimization more cleanly, using the OM liveness analysis to detect, and save, caller-save registers used in the simulator routines [21]. However, ATOM still incurs unnecessary procedure linkage overhead in the no-action cases. A recent alternative technique, trap-driven simulation [17, 25], optimizes no-action cases to their logical extreme. Trap-driven simulators exploit the characteristics of the simulation platform to implement effective address calculation and lookup (steps 1 and 2) in hardware. References requiring no action run at full hardware speed; other references cause ....

....and hardware support that is not readily available on most machines. Generality is lacking because current trap-driven simulators do not simulate arbitrary memory systems: the Wisconsin Wind Tunnel does not simulate stack references [17], while Tapeworm II does not simulate any data references [25]. Furthermore, the overhead of memory exceptions can overwhelm the benefits of free lookups for simulations with non-negligible miss ratios. The active memory abstraction described in detail in the next section combines the efficiency of trap-driven simulation with the generality and ....

[Article contains additional citation context not shown here]

Richard Uhlig, David Nagle, Trevor Mudge, and Stuart Sechrest. Trap-Driven Simulation with TapewormII. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pages 132--144, October 1994.


Tools and Techniques for Memory System Design and Analysis - Lebeck (1995)   (2 citations)  (Correct)

....a similar optimization more cleanly, using the OM liveness analysis to detect, and save, caller-save registers used in the simulator routines [71]. However, ATOM still incurs unnecessary procedure linkage overhead in the no-action cases. A recent alternative technique, trap-driven simulation [60,78], optimizes no-action cases to their logical extreme. [Figure 2: On-The-Fly Simulator] Trap-driven simulators exploit the characteristics of the simulation platform to implement effective address calculation and lookup (steps 1 and 2) in ....

.... exploit the characteristics of the simulation platform to implement effective address calculation and lookup (steps 1 and 2) in hardware using error correcting code (ECC) bits [60] or valid bits in the TLB [78]. [Figure 2: On-The-Fly Simulator] References requiring no action run at full hardware speed; other references cause memory system exceptions that invoke simulation software. By executing most references without software intervention, these simulators potentially perform much better than other simulation systems. Unfortunately, ....

[Article contains additional citation context not shown here]

Richard Uhlig, David Nagle, Trevor Mudge, and Stuart Sechrest. Trap-Driven Simulation with TapewormII. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pages 132--144, October 1994.


Low Perturbation Address Trace Collection for Operating.. - Russell Daigle (1996)   (Correct)

....some perturbation. Indeed, 8 out of 64 general purpose registers are reserved for tracing in order to minimize memory references. Furthermore, tracing slows down execution by about 10 times. A related software scheme that has been implemented in uniprocessors is called trap-driven simulation [21, 28]. In this approach, simulations are not driven by traces; they are driven by kernel traps. These traps allow the simulation of a cache as the kernel executes. Indeed, memory traps are set on addresses that are currently not in the simulated cache. When that address is accessed, the kernel traps. ....

....In addition, both operating system and application effects are considered. However, there is overhead involved in the extra instructions executed. Furthermore, this approach cannot generate as much information as trace-driven simulations. Examples of this approach are Uhlig et al.'s Tapeworm II [21, 28] and Talluri's [25] TLB simulator. Finally, there are many software-based systems that only consider application traces and ignore the operating system. Many of them run on uniprocessors and may either trace a multiprocessor program (for example [13]) or a uniprocessor program (for example [9] ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest. Trap-Driven Simulation with Tapeworm II. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 37--47, October 1994.


Improving the Address Translation Performance of Widely.. - Khalidi, Talluri (1995)   (3 citations)  (Correct)

....with 384MB of main memory. We are unable to report TLB simulation numbers for databases as we could not get access to a commercial database server that would run our modified operating system. 9.2 Methodology We evaluate the performance of a common mask TLB using trap-driven simulation [17] implemented in foxtrot [15], a Solaris 2.1-based operating system that counts the number of user TLB misses for a workload. Our simulation environment does not include kernel TLB misses, but includes the effect of context switches in the multi-programmed workloads. Kernel TLB misses will ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest, "Trap-driven Simulation with Tapeworm II", 6th Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), October 1994, pp. 132-144.


Active Memory: A New Abstraction for Memory-System Simulation - Lebeck, Wood (1995)   (18 citations)  (Correct)

....a similar optimization more cleanly, using the OM liveness analysis to detect, and save, caller-save registers used in the simulator routines [29]. However, ATOM still incurs unnecessary procedure linkage overhead in the no-action cases. A recent alternative technique, trap-driven simulation [23,34], optimizes no-action cases to their logical extreme. Trap-driven simulators exploit the characteristics of the simulation platform to implement effective address calculation and lookup (steps 1 and 2) in hardware using error correcting code (ECC) bits [23] or valid bits in the TLB [19] ....

....system and hardware support that is not readily available on most machines. Generality is lacking because current trap-driven simulators do not simulate arbitrary memory systems: the Wisconsin Wind Tunnel [23] does not simulate stack references because of SPARC register windows, while Tapeworm II [34] does not simulate any data references because of write buffers on the DECstation. [Figure 2: On-The-Fly Simulator] Furthermore, as we show in Section 5, the overhead of memory exceptions (roughly 250 cycles [34,33,22] on well-tuned ....

[Article contains additional citation context not shown here]

Richard Uhlig, David Nagle, Trevor Mudge, and Stuart Sechrest. Trap-Driven Simulation with TapewormII. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pages 132--144, October 1994.


Generating Dynamic Program Analysis Tools - Sloane (1997)   (Correct)

....gprof [4] and mprof [20] profilers provide information about processor usage and dynamic memory allocation, respectively. Memory hierarchy simulators can allow hardware designers to evaluate designs on real programs or permit programmers to take advantage of specific architectural features (e.g. [15, 18, 19]). Since there is no reason to expect hardware developments to cease or programs to become significantly simpler than they currently are, it is reasonable to expect a continuing need for new kinds of program analysis tools. Some dynamic analysis systems are limited in scope. For example, gprof ....

UHLIG, R., NAGLE, D., MUDGE, T., AND SECHREST, S. Trap-driven simulation with Tapeworm II. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (San Jose, California, 1994), pp. 132--144.


Reducing Network Latency Using Subpages in a Global Memory.. - Herve Jamrozik (1996)   (14 citations)  (Correct)

....state bits for SVM on the CM5; they used ECC bits to cause faults, however, this would still require emulating writes, and the Alpha 250 has imprecise exceptions on data parity errors, making use of parity difficult or impossible. Similar techniques have been used for trace production as well [22]. Footnote 2: The IBM 801 used a similar scheme to manage transactions on units of less than a page; in their case, for each 128-byte line [3]. Footnote 3: We have optimized the performance of global memory operations along the lines described in [21]; hence our latencies are slightly better than those reported ....

Richard Uhlig, David Nagle, Trevor Mudge, and Stuart Sechrest. Trap-driven simulation with Tapeworm II. In Proc. of the 6th Int. Conf. on Arch. Support for Prog. Languages and Operating Systems, October 1994.


Trap-driven Memory Simulation - Uhlig (1995)   (2 citations)  Self-citation (Uhlig Mudge Sechrest)   (Correct)

No context found.

Uhlig, R., Nagle, D., Mudge, T. and Sechrest, S. Trap-driven simulation with Tapeworm II. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, ACM Press (SIGARCH), 132-144, 1994.


A Case Study of a Hardware-Managed TLB in a Multi-Tasking.. - Chih-Chieh Lee (1994)   (2 citations)  Self-citation (Uhlig Mudge)   (Correct)

....the difficulty and expense of the experimental methodology restrain designers from doing it. To overcome this problem, an efficient method of evaluating the system performance under a multi-tasking environment has been proposed by Uhlig et al., which is termed trap-driven simulation [Uhlig94a, Uhlig94b]. In this study, we extended this new method to a different but even more popular hardware architecture and collected some interesting results. In order to emphasize the multi-tasking environment, we incorporated the operating system (OS) because the OS is primarily responsible for managing ....

....we must be able to both monitor OS activities and keep the system functioning undisturbed (not stalled) as much as possible. A limited-size buffer and, therefore, the necessity of frequent system stalls inevitably changes the system behavior. To overcome these shortcomings, Uhlig et al. [Uhlig94b] developed a trap-driven simulator, called Tapeworm, that can capture events during operating system activity efficiently and correctly. Furthermore, these events can be processed on the fly, thereby avoiding the need for buffering and stalling. Tapeworm, moreover, is purely software-based. It ....

[Article contains additional citation context not shown here]

Uhlig, R., Nagle, D., Mudge, T., Sechrest, S. Trap-driven Simulation with Tapeworm II, In the proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, ACM, 132-144,


Instrumentation Tools - Pierce, Smith, Mudge (1995)   (2 citations)  Self-citation (Mudge)   (Correct)

....A disadvantage of using OS traps is that, if many events must be recorded, the cumulative OS overhead of handling all the traps is significant. However, there are a number of exception mechanisms in operating systems that can be utilized to improve the efficiency of this method. Tapeworm II [28] is an example of an efficient software-based tool that drives cache and TLB simulations using information from kernel traps. It utilizes low-overhead exceptions and traps of relatively few events. The applicability and efficiency of the OS trap approach depends upon the accessibility of certain ....

R. Uhlig, D. Nagle, T. Mudge, and S. Sechrest, "Trap-driven simulation with Tapeworm II," Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose, CA), Oct. 1994.


Trace-driven Memory Simulation: A Survey - Uhlig, Mudge   (64 citations)  Self-citation (Uhlig Mudge)   (Correct)

....4 55 2 7 Annotation D cache No No
[Rosenblum95] [Witchel96] SimOS Embra 10 7 21 Emulation D cache, I cache, TLB Yes Yes
Hardware-based Miss Detection
[Nagle93] Tapeworm 1 2 100 650 0.5 4.5 TLB Miss TLB Yes Yes
[Reinhardt93] WWT 1 2 2,500 1 1. 4 46 1 ECC D cache No No
[Uhlig94] Tapeworm II 1 2 300 0 10 ECC I cache, TLB Yes Yes
[Lee94] Tapeworm486 1 2 3,600 4,000 0 14 Page Fault TLB Yes Yes
[Talluri94] Foxtrot 1 2 1,500 4,000 TLB Miss TLB No No
Table 8. Beyond Traces: Some Recent Fast Memory Simulators. Each of the simulators in this table improve ....

.... Tapeworm simulator, which also uses ECC bit modification to simulate caches, improves on the speed of WWT by showing that trap handling times can be reduced by nearly an order of magnitude to about 300 cycles, bringing overall simulation slowdowns for instruction caches into the range of 0 to 10 [Uhlig94]. Tapeworm II, like the original Tapeworm, also demonstrates that trap-driven cache simulation is capable of completely monitoring multi-process and operating system workloads. Experiments performed with Tapeworm II show that trap-driven simulation slowdowns are highly dependent on the memory ....

[Article contains additional citation context not shown here]

Uhlig, R., Nagle, D., Mudge, T. and Sechrest, S. Trap-driven simulation with Tapeworm II. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, ACM Press (SIGARCH), 132-144, 1994.
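
The dependence of slowdown on trap cost and miss count noted in the contexts above can be illustrated with a rough first-order model. This is a back-of-the-envelope sketch, not a formula from the cited papers: it assumes slowdown is measured as added time relative to the uninstrumented run (so 0 means no overhead), that every simulated miss costs one trap, and that the host averages `base_cpi` cycles per instruction. The function name, `base_cpi`, and the workload numbers are hypothetical; only the 300-cycle trap cost comes from the quoted text.

```python
def estimated_slowdown(instructions, misses, trap_cycles=300, base_cpi=1.0):
    """Added simulation time as a multiple of the uninstrumented run time.

    instructions: instructions executed by the workload (hypothetical input)
    misses:       simulated misses, each assumed to cost exactly one trap
    trap_cycles:  cycles per trap; 300 is the figure quoted for Tapeworm II
    base_cpi:     assumed average cycles per instruction on the host
    """
    base_cycles = instructions * base_cpi
    return (misses * trap_cycles) / base_cycles


# A hypothetical workload: one million instructions, 1% of which miss.
print(estimated_slowdown(1_000_000, 10_000))
```

Under these assumptions the overhead scales linearly with the miss ratio, which is consistent with the observation that trap-driven slowdowns depend heavily on the simulated memory configuration.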


Embra: Fast and Flexible Machine Simulation - Witchel, Rosenblum (1996)   (65 citations)  (Correct)

No context found.

Richard Uhlig, David Nagle, Trevor Mudge and Stuart Sechrest. Trap-driven Simulation with Tapeworm II, ASPLOS, San Jose, 1994.
