Dubois, M., Song, Y.: Assisted Execution. Technical Report CENG 98-25, Department of EE-Systems, University of Southern California (1998)
....stalls nor initiates recovery when a cache replacement is needed. It simply loses data, implicitly predicting the R-stream will re-create the data before it is needed again. This may avoid many unnecessary stalls and recovery actions. Speculative Data-Driven Multithreading [23] and related work [2,3,5,6,7,9,15,25,32], which spawn specialized threads to prefetch cache misses and resolve branch mispredictions in advance, are closer in spirit to slipstreaming. A fundamental difference is the use of multiple, short-lived, specialized threads versus a single, persistent, functionally complete program (A-stream) ....
Y. H. Song and M. Dubois. Assisted Execution. Technical Report CENG-98-25, Department of EE-Systems, University of Southern California, October 1998.
....to our knowledge, there has been little if any published work that is focused specifically on improving I-cache performance by exploiting helper threads. Song and Dubois proposed assisted execution as a generic way to use multithreading resources to improve single-threaded application performance [18]. Chappell et al. proposed Simultaneous Subordinate Multithreading (SSMT), a general framework for leveraging otherwise spare execution resources to benefit a single-threaded application. They first evaluated using SSMT to provide a very large local pattern-history-based branch predictor [3] and ....
Y. Song and M. Dubois. Assisted execution. Technical Report CENG-98-25, Department of EE-Systems, University of Southern California, Oct. 1998.
....way to boost throughput performance with limited impact on processor die area [10], the performance of many single-threaded applications does not benefit from SMT. Recently, a number of proposals have been put forth to exploit SMT resources to improve the latency of single-threaded applications [7, 3, 28, 19, 16, 27, 1, 14, 6, 5, 4, 13]. In particular, several studies have investigated using helper threads in the form of program-slice-based precomputation for reducing latency due to load instructions tending to miss in the cache, and branches that mispredict. To our knowledge, there has been little if any published work that is ....
....by exploiting helper threads. This paper describes a novel framework for modeling such helper threads so as to optimize their impact on performance. Dubois and Song proposed assisted execution as a generic way to use multithreading resources to improve single-threaded application performance [7]. Chappell et al. proposed Simultaneous Subordinate Multithreading (SSMT), a general framework for leveraging otherwise spare execution resources to benefit a single-threaded application. They first evaluated the use of SSMT as a mechanism to provide a very large local pattern-history-based branch ....
M. Dubois and Y. Song. Assisted execution. Technical Report CENG 98-25, Department of EE-Systems, University of Southern California, Oct. 1998.
....of all of the factors that influence performance. Several researchers have investigated adding special-purpose PCs with reduced register requirements to a superscalar, mostly for the purpose of improving performance of a primary thread by prefetching or warming up the branch prediction hardware [11, 4, 35, 27, 7]. These special-purpose PCs usually lack an independent set of registers, and instead share registers with the primary register set and/or have private registers written by hardware. The special PCs begin execution on a cache miss or, alternatively, by request from the primary thread [35]. Mowry ....
SONG, Y., AND DUBOIS, M. Assisted execution. Technical Report CENG 98-25, Department of EE-Systems, University of Southern California (October 1998).
....low-latency communication and synchronization mechanism, or decrease the startup cost for new threads by reducing the register initialization step. Fast thread initialization and communication are necessary for the SMT optimizations researchers have suggested that require super-lightweight threads [92, 76, 18, 3, 72, 67]. Finally, all mini-threads in an application could share values in the entire architectural register set. mtSMT permits all of these variations, because the application controls what register allocation is used and when. In all cases a compiler would have to compile a mini-thread for a specific ....
....that influence performance. Several researchers have investigated adding special-purpose lightweight contexts with reduced register requirements to a superscalar, mostly for the purpose of improving performance of a primary thread by prefetching or warming up the branch prediction hardware [31, 8, 92, 72, 15]. These contexts usually lack an independent set of registers, and instead share registers with the primary register set and/or have private registers written by hardware. Threads begin execution in the contexts on a cache miss or, alternatively, by request from the primary thread [92]. ....
SONG, Y., AND DUBOIS, M. Assisted execution. Technical Report CENG 98-25, Department of EE-Systems, University of Southern California (October 1998).
....to predict future misses. Software prefetchers [15] insert prefetch directives into the code with enough lead time to allow the cache to acquire the data before the actual access is executed. Recently, the expected emergence of multithreaded processors [27] has led to thread-based prefetchers [1, 5, 6, 13, 14, 18, 19, 24, 28], which execute code in another thread context, attempting to bring data into the shared cache before the primary thread accesses it. However, traditional prefetching techniques have difficulty with sequences of irregular accesses. A common example of this type of access is pointer chains, where ....
Y. Song and M. Dubois. Assisted execution. Technical Report CENG 98-25, University of Southern California, October 1998.
....prediction shares similarities with a number of recently proposed multithreaded models where a number of potentially speculative helper threads are used to enhance an otherwise sequential main thread. Simultaneous subordinate microthreading and assisted execution are two such proposals [2,13]. In the example application of SSMT given in [2], the helper threads are implemented in microcode and are used to enhance branch prediction. Zilles and Sohi suggested extracting slices at compile time and using them to pre-execute performance-critical instructions [15,16]. Assuming compile time ....
Y. Song and M. Dubois. Assisted execution. Technical Report CENG-98-25, Department of EE-Systems, University of Southern California, Oct. 1998.
....where the slice determining the branch is duplicated and made to run in a separate window. Farcy et al. [11] notice regularity in the branch condition computations and use value prediction to accelerate the second thread. Simultaneous Subordinate Microthreading (SSMT) [5] and Assisted Execution [9] are schemes where custom-generated threads are invoked within the hardware by certain events. These threads perform very simple specific tasks and cannot be automatically generated. A related concept is AR-SMT [24] and SRT [22] that run two copies of the same program on an SMT processor and ....
M. Dubois and Y. H. Song. Assisted Execution. Technical Report CENG 98-25, EE-Systems, University of Southern California, Oct 1998.
....have to be displaced: one is to make room in the virtual cache and the other is the one pointed to by the displaced R-tag cache entry. This type of phenomenon is called paired eviction [14]. The paired eviction degrades the cache storage utilization and results in a higher cache miss ratio. In [15], Cekleov et al. assessed the performance impact of the associativity of the R-tag cache quantitatively and proposed a scheme to avoid the paired eviction. In the proposed scheme, the R-tag cache is virtually indexed although it contains physical tags. Since both the virtual cache and the R-tag ....
M. Cekleov, M. Dubois, J.-C. Wang, and F. A. Briggs. Virtual-address caches. Technical Report CENG 90-18, University of Southern California, 1990.
M. Dubois and Y. Song. Assisted execution. Technical Report CENG 98-25, Department of EE-Systems, University of Southern California, October 1998.
M. Dubois and Y. Song. Assisted execution. Technical Report CENG #98-25, Department of EE-Systems, University of Southern California, Oct. 1998.
Y.H. Song and M. Dubois. Assisted execution. Technical Report CENG 98-25, University of Southern California, Department of EE-Systems, October 1998.