13 citations found.
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.

This paper is cited in the following contexts:
Compile-time Composition of Run-time Data and Iteration.. - Strout, Carter, Ferrante (2003)   (4 citations)

....cache, L1 cache, 1/2 of L2 cache, and L2 cache. its position in a composed inspector should result in less overhead than an inspector implemented in a run-time library, since the latter must be generally applicable. The need for specialized inspectors has been described in work for data locality [20] and parallelism [11]. Automatically generating code for the inspector and executor can leverage the work in [7], which describes compiler support for dynamic data packing, and the work in [16], which generates optimized code for compile-time transformations. Specifically, the techniques ....

....Many run-time data reordering transformations [4, 2, 21, 7, 12] fit within our framework. Space-filling curves and register tiling for sparse matrix-vector multiply are two types of data reordering transformations that are more specialized. Data reorderings generated from space-filling curves [28, 20] traverse data mappings and mappings of data to spatial coordinates. The programmer must specify how data maps to spatial coordinates; therefore, such data reorderings cannot be fully automated. Im and Yelick [13] have developed the SPARSITY code generator that improves the locality for the #x ....

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proceedings of the 1999.
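
The space-filling-curve reorderings mentioned in the second context above can be illustrated with a small sketch. It assumes a Z-order (Morton) curve over 2D integer coordinates, which is only one possible curve and is not confirmed by the excerpt; the names `morton_key` and `morton_order` are hypothetical. It also shows why the programmer must supply the coordinates: the ordering is computed entirely from them.

```python
def morton_key(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return key


def morton_order(coords, bits=16):
    """Return the permutation of indices that sorts points along the curve."""
    return sorted(range(len(coords)),
                  key=lambda i: morton_key(coords[i][0], coords[i][1], bits))


# Programmer-supplied mapping from data elements to spatial coordinates
# (illustrative values only).
coords = [(3, 3), (0, 0), (1, 0), (0, 1), (2, 2)]
order = morton_order(coords)
reordered = [coords[i] for i in order]
```

Elements that are close in space tend to receive nearby keys, so storing data in `order` places spatial neighbors near each other in memory.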


Exploiting Locality in the Run-Time Parallelization.. - Martín, Singh.. (2002)

....two transformations: one reorders data accesses to improve temporal locality (locality grouping) and the other reorders data layout to enhance spatial reuse (dynamic data packing). They also assess the performance improvement of applying a combination of both techniques. Mellor-Crummey et al. [14] use space-filling curves to reorder data and/or computation. There is much less research on the improvement of locality on multiprocessors, an important issue on NUMA systems. Han and Tseng evaluate in [7] the effect on the parallel execution of codes of uniprocessor techniques that improve ....

J. M. Mellor-Crummey, D. B. Whalley, and K. Kennedy. Improving Memory Hierarchy Performance for Irregular Applications. In ACM Int'l Conference on Supercomputing, pages 425--433, Rhodes, Greece, 1999.
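
As a rough illustration of the data-layout reordering named in the context above (dynamic data packing), the sketch below assumes a first-touch policy: elements are placed in memory in the order in which an irregular access sequence first reaches them. This is one common reading of data packing and is an assumption here, not something the excerpt specifies; the function names are hypothetical.

```python
def first_touch_packing(accesses, n):
    """Return new_pos, where new_pos[old] is the packed slot of element old."""
    new_pos = [-1] * n
    next_slot = 0
    for old in accesses:
        if new_pos[old] == -1:          # first time this element is touched
            new_pos[old] = next_slot
            next_slot += 1
    for old in range(n):                # untouched elements go at the end
        if new_pos[old] == -1:
            new_pos[old] = next_slot
            next_slot += 1
    return new_pos


def apply_packing(data, accesses, new_pos):
    """Move data into packed order and rewrite the index array to match."""
    packed = [None] * len(data)
    for old, value in enumerate(data):
        packed[new_pos[old]] = value
    remapped = [new_pos[old] for old in accesses]
    return packed, remapped


data = ["a", "b", "c", "d", "e"]
accesses = [3, 0, 3, 4, 0, 1]           # irregular index array (illustrative)
new_pos = first_touch_packing(accesses, len(data))
packed, remapped = apply_packing(data, accesses, new_pos)
```

After packing, the remapped access sequence walks the front of the array in nearly increasing order, which is the spatial-reuse effect the context describes.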


A Selective Hardware/Compiler Approach for Improving.. - Memik, Kandemir..

....are most useful for numerical codes whose access patterns can be detected statically at compile time. Hardware locality optimization mechanisms can be used to overcome this limitation. Very recently, compiler techniques that target numerical codes with irregular access patterns have been proposed [25, 11]. Whether these techniques will be competitive with hardware-based approaches remains to be seen. Locality optimizations by hardware use either characteristics of the load/store instructions making the memory reference or access patterns to the memory regions accessed to make caching decisions. ....

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proc. the ACM International Conference on Supercomputing (ICS'99), Rhodes, Greece, June 1999.


Improving Memory Hierarchy Performance for Irregular .. - Mellor-Crummey.. (2001)   (6 citations)  Self-citation (Mellor-Crummey Whalley Kennedy)

....or hierarchical orderings for multi-level memory hierarchies. To extend blocking strategies to multi-level memory hierarchies, it is necessary to block for each level in the hierarchy. In an earlier version of this work, we described a k-level blocking strategy for k-level memory hierarchies [30]. Unfortunately, choosing the best blocking factor for each level is difficult and experimentation is necessary. The best blocking factors for an application depend not only upon the architectural characteristics of the target machine's memory hierarchy but also upon characteristics of the ....

J. Mellor-Crummey, D. Whalley, and K. Kennedy, "Improving Memory Hierarchy Performance for Irregular Applications," Proceedings of the 1999.


Predicting Hierarchical Phases in Program Data Behavior - Xipeng Shen Yutao

No context found.

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.


Predicting Whole-Program Locality Through Reuse Distance Analysis - Ding, Zhong (2003)   (6 citations)

No context found.

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.


Scientific Computing Research Environments for the.. - Heinkenschloss.. (2001)

No context found.

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications using data and computation reorderings. International Journal of Parallel Programming, 29(3), June 2001.


Scientific Computing Research Environments for the.. - Heinkenschloss.. (2001)

No context found.

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proceedings of the 13th ACM International Conference on Supercomputing, pages 425--433, Rhodes, Greece, June 1999.


Adaptive Data Partition Using Probability Distribution - Xipeng Shen And

No context found.

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.


Increasing the Parallelism of Irregular Loops With Dependences - David Singh Mar (2003)

No context found.

J. M. Mellor-Crummey, D. B. Whalley, and K. Kennedy. Improving Memory Hierarchy Performance for Irregular Applications. In ACM Int'l Conference on Supercomputing, pages 425-433, Rhodes, Greece, (1999).


Efficient Remapping Mechanisms for an Adaptable Memory System - Zhang (2002)

No context found.

J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proceedings of the 1999.


Efficient and Accurate Analytical Modeling of Whole-Program Data .. - Xue, Vera (2003)

No context found.

J. M. Mellor-Crummey, D. B. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications using data and computation reorderings. International Journal of Parallel Programming, 29(3):217--247, 2001.


Optimization Techniques for Parallel Codes of Irregular.. - Guo, Chang, Pan (2003)

No context found.

Mellor-Crummey, J., Whalley, D. and Kennedy, K.: Improving memory hierarchy performance for irregular applications, Proc. 1999 ACM International Conference on Supercomputing, Rhodes, Greece (June 1999).

CiteSeer.IST - Copyright Penn State and NEC