| T. C. Mowry, "Tolerating latency through software-controlled data prefetching," Ph.D. dissertation, Department of Electrical Engineering, Stanford University, March 1994. |
....loop induction variables. Other researchers investigate optimizing high performance Java applications using traditional loop optimizations [4, 9]. Their work is complementary to our work. Mowry, Lam, and Gupta describe and evaluate compiler techniques for data prefetching in array based codes [19, 18]. Their paper is one of the first to report execution times for compiler inserted prefetching. The algorithm works on affine array accesses, and involves several steps. First, the compiler performs locality analysis to determine array accesses that are likely to be cache misses. The compiler uses ....
T. C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Department of Electrical Engineering, Mar. 1994.
....locations. The compiler also uses standard cache improvement techniques such as loop unrolling and tiling. Simulation results show improvements in cache utilization and execution speed. Mowry, Lam, and Gupta describe and evaluate compiler techniques for adding prefetching to array based codes [79, 78]. This paper is one of the first that reports execution times for compiler inserted prefetching. The algorithm works on affine array accesses within 25 scientific codes. The algorithm significantly improves performance by as much as a factor of 2. They also show that their algorithm is better ....
....several researchers have investigated prefetching of array based codes on multiprocessors. Fu and Patel evaluate two hardware prefetching schemes on a vector multiprocessor system [39]. Mowry and Gupta evaluate software prefetching for array based programs on shared memory multiprocessors [77, 78]. Gornish, Granston, and Veidenbaum implement prefetching for shared memory multiprocessors [42]. Dahlgren, Dubois, and Stenstrom evaluate sequential hardware prefetching and stride prefetching on a shared memory multiprocessor [32, 33]. In his thesis, Gornish compares software and hardware ....
Todd C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Department of Electrical Engineering, March 1994.
....several researchers have investigated prefetching of array based codes on multiprocessors. Fu and Patel evaluate two hardware prefetching schemes on a vector multiprocessor system [39]. Mowry and Gupta evaluate software prefetching for array based programs on shared memory multiprocessors [77, 78]. Gornish, Granston, and Veidenbaum implement prefetching for shared memory multiprocessors [42]. Dahlgren, Dubois, and Stenstrom evaluate sequential hardware prefetching and stride prefetching on a shared memory multiprocessor [32, 33]. In his thesis, Gornish compares software and hardware ....
Todd Mowry and Anoop Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, 12(2):87--106, June 1991.
.... B 1; n) do instruction is often handled as a hint for the processor to load a certain data item, but the fulfillment of the prefetch is not guaranteed by the CPU. Prefetch instructions can be inserted into the code manually by the programmer or automatically by a compiler [Por89, KL91, CKP91, Mow94]. In both cases prefetching involves overhead. The prefetch instructions themselves have to be executed, i.e. pipeline slots will be filled with prefetch instructions instead of other instructions ready to be executed. Furthermore, the memory address of the prefetched data must be calculated and ....
T.C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Computer Systems Laboratory, Stanford University, March 1994.
....overcomes these shortcomings by means of prefetching. VSCAP does not solely rely on overlapping communication with computation; it also overlaps communication operations with other communication operations to hide even more network latency. Research in prefetching can be divided into software [13], hardware [4,9,5] and hybrid prefetching [16,10,12]. VSCAP's prefetching approach is quite different, as it does not address parallel architectures with cache coherent memory; rather, it targets machines with distributed memory and explicit communication operations where data distribution is the ....
T. Mowry. Tolerating Latency Through Software Controlled Data Prefetching. PhD thesis, Department of Computer Science, Stanford University, March 1994.
....tain array data in registers instead of cache, reducing latency for accesses to this data. We hope to use static performance estimation along with our out-of-core transformations to identify regions of code where the balance between I/O and computation can be improved. Cache Management. Mowry [15] describes a method for software controlled data prefetching which focuses on issues pertinent to the data cache. The algorithm contains three stages: identify reuse; isolate predicted misses; and schedule prefetches. Identification of reuse is done using a matrix representation of dependence ....
Todd C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Department of Electrical Engineering, Stanford University, March 1994.
....by the effectiveness of its compiler support. The CCDP scheme relies on the compiler to identify potentially stale and nonstale data references, and to generate and schedule the appropriate prefetch operations. Several compiler techniques have been developed for software initiated data prefetching [2, 12, 13, 23, 24]. However, as these data prefetching schemes are used solely for memory latency hiding, the data prefetching operations are determined based on data locality considerations alone. Since these techniques do not distinguish between potentially stale and nonstale references, they cannot be applied ....
T. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Dept. of Electrical Engineering, March 1994.
....practice the use of linked data structures, which requires dynamic memory allocation. The proximity of storage layout of such applications does not imply the same degree of spatial locality that array based applications do. More recent approaches such as Multithreading [2,22,30], Prefetching [6,18,21], Jump Pointers [28], and Memory Forwarding [19] have been explored to address memory latency in pointer based applications. Multithreading tends to combat latency by passing the control of execution to other threads when a long latency operation is encountered. Prefetching tries to predict the ....
T. C. Mowry. "Tolerating Latency Through Software-Controlled Data Prefetching", PhD thesis, Stanford University, March 1994.
No context found.
T. C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Mar. 1994.
No context found.
T. C. Mowry, "Tolerating latency through software-controlled data prefetching," Ph.D. dissertation, Department of Electrical Engineering, Stanford University, March 1994.
No context found.
T. C. Mowry and A. Gupta, "Tolerating latency through software-controlled prefetching in shared-memory multiprocessors," Journal of Parallel and Distributed Computing, vol. 12, no. 2, pp. 87-106, June 1991.
No context found.
Todd C. Mowry. Tolerating Latency through Software Controlled Data Prefetching. PhD Thesis Stanford University, March 1994.
No context found.
T. Mowry. Tolerating Latency Through Software Controlled Data Prefetching. PhD thesis, Dept. of Computer Science, Stanford University, Mar. 1994.
No context found.
Todd Mowry and Anoop Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. J. Parallel Distrib. Comput., 12(2):87--106, 1991.
No context found.
T. Mowry and A. Gupta. "Tolerating Latency through Software-Controlled Prefetching in Scalable Shared-Memory Multiprocessors". In Jour. of Parallel and Distributed Computing (12) 2, 1991: 87-106.
No context found.
Mowry, T. "Tolerating Latency Through Software Controlled Data Prefetching," Ph.D. dissertation, Stanford University, March 1994.
No context found.
T. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, 1994.
No context found.
T. Mowry and A. Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, 12(2):87--106, June 1991.
No context found.
T. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, 1994.
No context found.
T. Mowry. Tolerating Latency Through Software Controlled Data Prefetching. PhD thesis, Dept. of Computer Science, Stanford University, Mar. 1994.
No context found.
T. C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, March 1994.
No context found.
T. Mowry. Tolerating Latency Through Software Controlled Data Prefetching. PhD thesis, Dept. of Computer Science, Stanford University, March 1994.
No context found.
T. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, 1994.
No context found.
T.C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Computer Systems Laboratory, Stanford University, 1994.
No context found.
T. Mowry and A. Gupta, "Tolerating Latency through Software-controlled Prefetching in Shared-memory Multiprocessors," Journal of Parallel and Distributed Computing, Vol. 12, No. 2, June 1991, pp. 87-106.