249 citations found.
T. C. Mowry, "Tolerating latency through software-controlled data prefetching," Ph.D. dissertation, Department of Electrical Engineering, Stanford University, March 1994.

This paper is cited in the following contexts:

Simple and Effective Array Prefetching in Java - Cahoon, McKinley (2002)   (Correct)

....loop induction variables. Other researchers investigate optimizing high-performance Java applications using traditional loop optimizations [4, 9]. Their work is complementary to our work. Mowry, Lam, and Gupta describe and evaluate compiler techniques for data prefetching in array-based codes [19, 18]. Their paper is one of the first to report execution times for compiler-inserted prefetching. The algorithm works on affine array accesses and involves several steps. First, the compiler performs locality analysis to determine array accesses that are likely to be cache misses. The compiler uses ....

T. C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Department of Electrical Engineering, Mar. 1994.


Effective Compile-Time Analysis for Data Prefetching in Java - Cahoon (2002)   (Correct)

....locations. The compiler also uses standard cache improvement techniques such as loop unrolling and tiling. Simulation results show improvements in cache utilization and execution speed. Mowry, Lam, and Gupta describe and evaluate compiler techniques for adding prefetching to array-based codes [79, 78]. This paper is one of the first that reports execution times for compiler-inserted prefetching. The algorithm works on affine array accesses within 25 scientific codes. The algorithm significantly improves performance, by as much as a factor of 2. They also show that their algorithm is better ....

....several researchers have investigated prefetching of array-based codes on multiprocessors. Fu and Patel evaluate two hardware prefetching schemes on a vector multiprocessor system [39]. Mowry and Gupta evaluate software prefetching for array-based programs on shared-memory multiprocessors [77, 78]. Gornish, Granston, and Veidenbaum implement prefetching for shared-memory multiprocessors [42]. Dahlgren, Dubois, and Stenstrom evaluate sequential hardware prefetching and stride prefetching on a shared-memory multiprocessor [32, 33]. In his thesis, Gornish compares software and hardware ....

Todd C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Department of Electrical Engineering, March 1994.


Effective Compile-Time Analysis for Data Prefetching in Java - Cahoon (2002)   (Correct)

....several researchers have investigated prefetching of array-based codes on multiprocessors. Fu and Patel evaluate two hardware prefetching schemes on a vector multiprocessor system [39]. Mowry and Gupta evaluate software prefetching for array-based programs on shared-memory multiprocessors [77, 78]. Gornish, Granston, and Veidenbaum implement prefetching for shared-memory multiprocessors [42]. Dahlgren, Dubois, and Stenstrom evaluate sequential hardware prefetching and stride prefetching on a shared-memory multiprocessor [32, 33]. In his thesis, Gornish compares software and hardware ....

Todd Mowry and Anoop Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, 12(2):87--106, June 1991.


Data Locality Optimizations for Multigrid Methods on Structured.. - Weiß   (Correct)

.... B 1; n) do instruction is often handled as a hint for the processor to load a certain data item, but the fulfillment of the prefetch is not guaranteed by the CPU. Prefetch instructions can be inserted into the code manually by the programmer or automatically by a compiler [Por89, KL91, CKP91, Mow94]. In both cases prefetching involves overhead. The prefetch instructions themselves have to be executed, i.e., pipeline slots will be filled with prefetch instructions instead of other instructions ready to be executed. Furthermore, the memory address of the prefetched data must be calculated and ....

T.C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Computer Systems Laboratory, Stanford University, March 1994.


Compiler-Generated Vector-based Prefetching on Architectures with .. - Müller   (Correct)

....overcomes these shortcomings by means of prefetching. VSCAP does not solely rely on overlapping communication with computation; it also overlaps communication operations with other communication operations to hide even more network latency. Research in prefetching can be divided into software [13], hardware [4,9,5], and hybrid prefetching [16,10,12]. VSCAP's prefetching approach is quite different, as it does not address parallel architectures with cache-coherent memory; rather, it targets machines with distributed memory and explicit communication operations where data distribution is the ....

T. Mowry. Tolerating Latency Through Software Controlled Data Prefetching. PhD thesis, Department of Computer Science, Stanford University, March 1994.


Scalable I/O for Out-of-Core Structures - Paleczny, al. (1993)   (Correct)

....tain array data in registers instead of cache, reducing latency for accesses to this data. We hope to use static performance estimation along with our out-of-core transformations to identify regions of code where the balance between I/O and computation can be improved. Cache Management. Mowry [15] describes a method for software-controlled data prefetching which focuses on issues pertinent to the data cache. The algorithm contains three stages: identify reuse; isolate predicted misses; and schedule prefetches. Identification of reuse is done using a matrix representation of dependence ....

Todd C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Department of Electrical Engineering, Stanford University, March 1994.
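
The three stages named in the Paleczny excerpt above (identify reuse, isolate predicted misses, schedule prefetches) can be illustrated with a deliberately simplified sketch. It is not the algorithm from the thesis: it assumes a single one-dimensional access with a constant byte stride, a line-aligned array, and a fixed cost per loop iteration, and it replaces the matrix-based reuse analysis with a simple line-size division. The function name `plan_prefetches` and all of its parameters are hypothetical.

```python
from math import ceil


def plan_prefetches(n_iters, stride_bytes, line_bytes, miss_latency, iter_cycles):
    """Plan prefetches for a 1-D access a[i] with a constant byte stride.

    Returns a list of (issue_iteration, target_iteration) pairs.
    All parameters are illustrative assumptions, not values from the thesis.
    """
    # Stage 1, identify reuse: with a stride smaller than a cache line,
    # consecutive iterations share a line (spatial reuse).
    iters_per_line = max(1, line_bytes // stride_bytes)

    # Stage 2, isolate predicted misses: assuming a line-aligned array,
    # only the first access to each line is expected to miss.
    predicted_misses = [i for i in range(n_iters) if i % iters_per_line == 0]

    # Stage 3, schedule prefetches: issue each one enough iterations ahead
    # to cover the miss latency; early targets are issued at iteration 0,
    # which stands in for a loop prologue.
    distance = ceil(miss_latency / iter_cycles)
    return [(max(0, miss - distance), miss) for miss in predicted_misses]


plan = plan_prefetches(n_iters=16, stride_bytes=8, line_bytes=32,
                       miss_latency=100, iter_cycles=25)
print(plan)  # [(0, 0), (0, 4), (4, 8), (8, 12)]
```

With 8-byte elements and 32-byte lines, only every fourth iteration is predicted to miss, so the sketch issues one prefetch per line rather than one per access, which is the kind of overhead reduction the excerpts attribute to locality analysis.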


Efficient Integration of Compiler-directed Cache Coherence And.. - Lim, Yew (2000)   (1 citation)  (Correct)

....by the effectiveness of its compiler support. The CCDP scheme relies on the compiler to identify potentially stale and nonstale data references, and to generate and schedule the appropriate prefetch operations. Several compiler techniques have been developed for software-initiated data prefetching [2, 12, 13, 23, 24]. However, as these data prefetching schemes are used solely for memory latency hiding, the data prefetching operations are determined based on data locality considerations alone. Since these techniques do not distinguish between potentially stale and nonstale references, they cannot be applied ....

T. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Dept. of Electrical Engineering, March 1994.


Intelligent Memory Manager Eliminates Cache Pollution Due to.. - Rezaei, Kavi (2003)   (Correct)

....practice the use of linked data structures, which requires dynamic memory allocation. The proximity of storage layout of such applications does not imply the same degree of spatial locality that array-based applications do. More recent approaches such as Multithreading [2,22,30], Prefetching [6,18,21], Jump Pointers [28], and Memory Forwarding [19] have been explored to address memory latency in pointer-based applications. Multithreading tends to combat latency by passing the control of execution to other threads when a long-latency operation is encountered. Prefetching tries to predict the ....

T. C. Mowry. "Tolerating Latency Through Software-Controlled Data Prefetching", PhD thesis, Stanford University, March 1994.


Improving Hash Join Performance through Prefetching - Shimin Chen Anastassia (2003)   (5 citations)  Self-citation (Mowry)   (Correct)

No context found.

T. C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, Mar. 1994.


Memory Latency Reduction via Data Prefetching and Data Forwarding .. - Poulsen (1994)   (Correct)

No context found.

T. C. Mowry, "Tolerating latency through software-controlled data prefetching," Ph.D. dissertation, Department of Electrical Engineering, Stanford University, March 1994.


Memory Latency Reduction via Data Prefetching and Data Forwarding .. - Poulsen (1994)   (Correct)

No context found.

T. C. Mowry and A. Gupta, "Tolerating latency through software-controlled prefetching in shared-memory multiprocessors," Journal of Parallel and Distributed Computing, vol. 12, no. 2, pp. 87-106, June 1991.


Optimizing Compiler for a CELL Processor - Eichenberger, O'Brien, O'Brien.. (2005)   (1 citation)  (Correct)

No context found.

Todd C. Mowry. Tolerating Latency through Software Controlled Data Prefetching. PhD Thesis Stanford University, March 1994.


Software Methods to Improve Data Locality and Cache Behavior - Beyls (2004)   (Correct)

No context found.

T. Mowry. Tolerating Latency Through Software Controlled Data Prefetching. PhD thesis, Dept. of Computer Science, Stanford University, Mar. 1994.


Improving Cache Locality for Thread-Level Speculation Systems - Fung (2005)   (Correct)

No context found.

Todd Mowry and Anoop Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. J. Parallel Distrib. Comput., 12(2):87--106, 1991.


Hardware Prefetching in Bus-Based Multiprocessors.. - Garzaran, Briz..   (Correct)

No context found.

T. Mowry and A. Gupta. "Tolerating Latency through Software-Controlled Prefetching in Scalable Shared-Memory Multiprocessors". In Jour. of Parallel and Distributed Computing (12) 2, 1991: 87-106.


Bus-Based COMA --- Reducing Traffic in Shared-Bus.. - Anders Landin And (1996)   (9 citations)  (Correct)

No context found.

Mowry, T. "Tolerating Latency Through Software Controlled Data Prefetching," Ph.D. dissertation, Stanford University, March 1994.


Power-Aware Compilation Techniques for High Performance Processors - Yang (2004)   (Correct)

No context found.

T. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, 1994.


Permission to Make Digital Or Hard Copies of All Or Part.. - Personal Or Classroom   (Correct)

No context found.

T. Mowry and A. Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, 12(2):87--106, June 1992.


Compiler-Assisted Cache Replacement: Problem.. - Yang, Govindarajan.. (2003)   (3 citations)  (Correct)

No context found.

T. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, 1994.


Estimating Cache Misses and Locality Using Stack Distances - Cascaval, Padua (2003)   (1 citation)  (Correct)

No context found.

T. C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Stanford University, March 1994.


Improving Effective Bandwidth through Compiler Enhancement of.. - Ding (2000)   (10 citations)  (Correct)

No context found.

T. Mowry. Tolerating Latency Through Software Controlled Data Prefetching. PhD thesis, Dept. of Computer Science, Stanford University, March 1994.


An Overview of Cache Optimization Techniques and Cache-Aware .. - Kowarschik, Weiß (2003)   (Correct)

No context found.

T.C. Mowry. Tolerating Latency Through Software-Controlled Data Prefetching. PhD thesis, Computer Systems Laboratory, Stanford University, 1994.


Exploiting the Prefetching Effect Provided by Executing.. - Lilja, Kunkel (2002)   (1 citation)  (Correct)

No context found.

T. Mowry and A. Gupta, "Tolerating Latency through Software-controlled Prefetching in Shared-memory Multiprocessors," Journal of Parallel and Distributed Computing, Vol. 12, No. 2, June 1991, pp. 87-106.
