35 citations found.
D. M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In Proceedings of the 22nd Annual Computer Measurement Group Conference, pages 384--393, San Diego, California, December 10--13, 1996.


This paper is cited in the following contexts:


EMSim: An Extensible Simulation Environment for Studying.. - Ortiz-Arroyo, Lee, Yu (2002)   (Correct)

....that are variations of SS, such as SIMCA [13], which has multithreading capabilities. This special purpose simulator requires support from the compiler to generate threads. In addition, some simulators run only on specific platforms or require special compilers such as MIPS [8] or SMTSim [26]. All these simulators are execution driven. In contrast, there are simulators that are both event driven and execution driven, e.g. RSIM [20]. RSIM simulates an out-of-order processor similar to MIPS R10000 and is partially written in C and C++. RSIM is also capable of simulating a ....

.... values from registers or memory [23]. In other approaches, speculation is used to dynamically generate threads from a sequential flow of control [15]. Furthermore, some recent architectures support the overlapped execution of multiple, independent threads using Simultaneous Multithreading (SMT) [26]. Therefore, it is obvious that to simulate these complex architectures, flexible simulation tools are required. Figure 1. ....

D. M. Tullsen, "Simulation and modeling of a simultaneous multithreading processor," Computer Measurement Group Conference, December 1996.


Front-End Policies for Improved Issue Efficiency in SMT.. - El-Moursy, Albonesi (2003)   (Correct)

....combination of these two policies achieves the best results in terms of both performance and issue queue occupancy reduction for a mixed integer and floating point workload of all techniques that we evaluated. 4 Simulation Methodology We modified the SMT simulator (SMTSIM) developed by Tullsen [19] to implement the new fetch schemes and to gather detailed statistics on the issue queues. The major simulator parameters are given in Table 1. The issue width is equal to the total number of functional units, and issue priority is by instruction age, with older instructions having priority over ....

....to evaluate adapting the thresholds dynamically to fit the workload, and to explore the interaction between fetch, dispatch, and scheduling policies on complexity reduction in other areas of SMT processors. 8 Acknowledgements The authors wish to thank Dean Tullsen for the use of his simulator [19] and his help with our many questions, and the reviewers for their useful comments, especially as related to simulation methodology. ....

D.M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. 22nd Annual Computer Measurement Group Conference, pp. 819-828, December 1996.


Control Flow Optimization Via Dynamic Reconvergence Prediction - Jamison Collins Dean (2004)   (1 citation)  Self-citation (Tullsen)   (Correct)

No context found.

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Conjoined-core Chip Multiprocessing - Rakesh Kumar Norman   Self-citation (Tullsen)   (Correct)

No context found.

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Balanced Multithreading: Increasing Throughput via a.. - Tune, Kumar, Tullsen, .. (2004)   Self-citation (Tullsen)   (Correct)

No context found.

D. M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Quantifying Instruction Criticality - Eric Tune Dean (2002)   (2 citations)  Self-citation (Tullsen)   (Correct)

No context found.

D. M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Clustered Multithreaded Architectures - Pursuing Both IPC.. - Collins, Tullsen (2004)   Self-citation (Tullsen)   (Correct)

No context found.

D. Tullsen. Simulation and modeling of a simultaneous multithreaded processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Single-ISA Heterogeneous Multi-Core Architectures: .. - Kumar, Farkas.. (2003)   (1 citation)  Self-citation (Tullsen)   (Correct)

No context found.

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Proceedings of 12th Intl Conference on Parallel.. - Initial Observations Of   Self-citation (Tullsen)   (Correct)

No context found.

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Single-ISA Heterogeneous Multi-Core Architectures.. - Kumar, Tullsen.. (2004)   (1 citation)  Self-citation (Tullsen)   (Correct)

No context found.

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Predictor-Directed Data Prefetching for Pointer-based Applications - Sair (2003)   Self-citation (Tullsen)   (Correct)

No context found.

D.M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, December 1996.


Compiling for Instruction Cache Performance on a.. - Kumar, Tullsen (2002)   Self-citation (Tullsen)   (Correct)

No context found.

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Processor Power Reduction Via Single-ISA.. - Kumar, Farkas.. (2003)   Self-citation (Tullsen)   (Correct)

....tables to account for the single threaded EV8. The area data is then scaled for the 0.10 micron process. D. Modeling Performance Benchmark execution is simulated using SMTSIM, a cycle-accurate, execution driven simulator that simulates an out-of-order, simultaneous multithreading processor [10]. SMTSIM executes unmodified, statically linked Alpha binaries. The simulator was modified to simulate a multi core processor comprising five heterogeneous cores sharing an on chip L2 cache and the memory subsystem. Because the R4700 does not execute Alpha binaries, what we are modeling is an ....

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


A Multi-Core Approach to Addressing the.. - Kumar, Farkas.. (2003)   Self-citation (Tullsen)   (Correct)

....Table 3 summarizes the benchmarks used. All 14 are chosen from the SPEC2000 benchmark suite, including 7 from SPECint and 7 from SPECfp. Benchmarks are simulated using SMTSIM, a cycle-accurate, execution driven simulator that simulates an out-of-order, simultaneous multithreading processor [26, 27]. SMTSIM executes unmodified, statically linked Alpha binaries. The simulator was modified to simulate a multi core processor comprising five heterogeneous cores sharing an on chip L2 cache and the memory subsystem. Because the R4700 does not execute Alpha binaries, what we are modeling is an ....

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Quantifying Instruction Criticality - Tune, Tullsen, Calder (2002)   (2 citations)  Self-citation (Tullsen)   (Correct)

....the next section, we describe the rescheduler and the constraint graph model. Simulations are performed using a detailed architectural simulation of an out of order processor executing the Alpha instruction set architecture. Simulations for this research were performed with the SMTSIM simulator [17], used in single-thread mode. The simulated processor has a reorder buffer of 255 instructions. Our simulated processor does not have a limited instruction queue; it is only limited by the size of the reorder buffer. The processor can fetch, execute, and commit up to 8 instructions per cycle. It ....

D. M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Pointer Cache Assisted - Collins, Sair, Calder, Tullsen (2002)   Self-citation (Tullsen)   (Correct)

....prediction with outcomes computed in a speculative thread. However, because this study focused only on the use of speculative threads for prefetching, computed branch outcomes are not used for this purpose. This is a topic of future work. 6 Methodology Benchmarks are simulated using SMTSIM [25], a cycle accurate, execution driven simulator that simulates an out-of-order, simultaneous multithreading processor. SMTSIM executes unmodified, statically linked Alpha binaries. Table 1 shows the configuration of the processor modeled in this research. Programs are simulated for 300 million ....

D.M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, December 1996.


Computing Along the Critical Path - Tullsen, Calder (1998)   (1 citation)  Self-citation (Tullsen)   (Correct)

....significantly (e.g. see Figure 9) from using the training set generated profiles. We verify the critical path profiler by comparing the critical path found during profiling to instructions that cause stalls during a detailed cycle by cycle instruction level simulation of an Alpha processor [23]. If our profile is accurate, we expect it to identify a high percentage of the stall producing instructions in a program running on a machine with similar instruction window size and cache parameters. While not all stall producing instructions are on the critical path, just about all critical ....

D.M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, December 1996.


ILP versus TLP on SMT - Mitchell, Carter, Ferrante, Tullsen (1999)   (3 citations)  Self-citation (Tullsen)   (Correct)

....when implementing parallelism. For example, increasing thread concurrency potentially decreases per thread resources. Thus, fine grained resource sharing introduces new interactions, particular to SMT. We explore these interactions in Section 4. 2. 1 Simulation issues We use the SMT simulator [16] for our experiments. This simulator performs an execution driven simulation of an SMT processor, including register renaming, branch prediction, three levels of cache, and TLB. In our experiments, the simulator has the parameters shown in Table 1. As no compiler currently targets SMT, we use a ....

Dean M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In Computer Measurement Group Conference, December 1996.


Symbiotic Jobscheduling for a Simultaneous Multithreading.. - Snavely, Tullsen (2000)   (16 citations)  Self-citation (Tullsen)   (Correct)

....level. The jobscheduler selects, from the pool of jobs ready to run, a number of jobs to coschedule less than or equal to the multithreading level. Every so often, for fairness, this running set is swapped out and replaced with a new set of jobs from the ready pool. The simulator (based on SMTSIM [33]) models an out-of-order processor based on the Compaq Alpha 21264 with modest hardware additions to support multithreading. The 21264 comes equipped with performance counters which can be used to capture dynamic execution information. We model 21264 instruction latencies, functional units (fully ....

D. M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Dynamic Speculative Precomputation - Collins, Tullsen, Wang, Shen (2001)   (18 citations)  Self-citation (Tullsen)   (Correct)

....proposed automatic techniques, and are comparable to the most aggressive manually applied optimizations. Additionally, SP in the context of multiple non speculative threads is explored. 3. Simulation Methodology Benchmarks are simulated using SMTSIM, a cycle accurate, execution driven simulator [18] that simulates an out of order, simultaneous multithreading processor. SMTSIM executes unmodified alpha binaries. Benchmarks from Pipeline Structure 8 stage pipeline, 1 cycle misfetch penalty, 6 cycle mispredict penalty Fetch 8 instructions total from up to two threads Branch Predictor 16k ....

D. Tullsen. Simulation and modeling of a simultaneous multithreaded processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Handling Long-latency Loads in a Simultaneous Multithreading.. - Tullsen, Brown (2001)   (9 citations)  Self-citation (Tullsen)   (Correct)

....64 byte lines Latency from previous level L2 10 cycles, L3 20 cycles (with no contention) Memory 100 cycles Table 3. Processor configuration. Execution is simulated on an out of order superscalar processor model which runs unaltered Alpha executables. The simulator is derived from SMTSIM [15], and models all typical sources of latency, including caches, branch mispredictions, TLB misses, and various resource conflicts, including renaming registers, queue entries, etc. It models both cache latencies and the effect of contention for caches and memory buses. It carefully models execution ....

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


Reducing Power with Dynamic Critical Path Information - Seng, Tune, Tullsen (2001)   (10 citations)  Self-citation (Tullsen)   (Correct)

....have exceeded allowable thermal limit, and slow down the entire processor until power consumption is reduced. Our techniques are intended to target (at design time) particular units with high power density. 4. Methodology Simulations for this research were performed with the SMTSIM simulator [19], used exclusively in single thread mode. In that mode it provides an accurate model of an out-of-order processor executing the Compaq Alpha instruction set architecture. Most of the SPEC 2000 integer benchmarks were used to evaluate the designs. All of the techniques modeled in this paper could ....

D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.


FAST: A Functionally Accurate Simulation Toolset for the.. - Cuvillo, Zhu, Hu, Gao (2005)   (Correct)

No context found.

D. M. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In Proceedings of the 22nd Annual Computer Measurement Group Conference, pages 384--393, San Diego, California, December 10--13, 1996.


Mesocode: Optimizations for Improving Fetch.. - Eng, Wang, Wang..   (Correct)

No context found.

D. M. Tullsen. Simulation and modeling of a simultaneous multithreaded processor. In 22nd Annual Computer Measurement Group Conference, December 1996.


CiteSeer.IST - Copyright Penn State and NEC