J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.
....cache, L1 cache, 1/2 of L2 cache, and L2 cache. its position in a composed inspector should result in less overhead than an inspector implemented in a run-time library, since the latter must be generally applicable. The need for specialized inspectors has been described in work for data locality [20] and parallelism [11]. Automatically generating code for the inspector and executor can leverage the work in [7], which describes compiler support for dynamic data packing, and the work in [16], which generates optimized code for compile-time transformations. Specifically, the techniques ....
....Many run-time data reordering transformations [4, 2, 21, 7, 12] fit within our framework. Space-filling curves and register tiling for sparse matrix-vector multiply are two types of data reordering transformations that are more specialized. Data reorderings generated from space-filling curves [28, 20] traverse data mappings and mappings of data to spatial coordinates. The programmer must specify how data maps to spatial coordinates; therefore, such data reorderings cannot be fully automated. Im and Yelick [13] have developed the SPARSITY code generator that improves the locality for the #x ....
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proceedings of the 1999.
....two transformations: one reorders data accesses to improve temporal locality (locality grouping) and the other reorders data layout to enhance spatial reuse (dynamic data packing). They also assess the performance improvement of applying a combination of both techniques. Mellor-Crummey et al. [14] use space-filling curves to reorder data and/or computation. There is much less research on the improvement of locality on multiprocessors, an important issue on NUMA systems. Han and Tseng evaluate in [7] the effect on the parallel execution of codes of uniprocessor techniques that improve ....
J. M. Mellor-Crummey, D. B. Whalley, and K. Kennedy. Improving Memory Hierarchy Performance for Irregular Applications. In ACM Int'l Conference on Supercomputing, pages 425--433, Rhodes, Greece, 1999.
....are most useful for numerical codes whose access patterns can be detected statically at compile time. Hardware locality optimization mechanisms can be used to overcome this limitation. Very recently, compiler techniques that target numerical codes with irregular access patterns have been proposed [25, 11]. Whether these techniques will be competitive with hardware-based approaches remains to be seen. Locality optimizations by hardware use either characteristics of the load/store instructions making the memory reference or access patterns to the memory regions accessed, to make caching decisions. ....
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proc. the ACM International Conference on Supercomputing (ICS'99), Rhodes, Greece, June 1999.
....or hierarchical orderings for multi-level memory hierarchies. To extend blocking strategies to multi-level memory hierarchies, it is necessary to block for each level in the hierarchy. In an earlier version of this work, we described a k-level blocking strategy for k-level memory hierarchies [30]. Unfortunately, choosing the best blocking factor for each level is difficult and experimentation is necessary. The best blocking factors for an application depend not only upon the architectural characteristics of the target machine's memory hierarchy but also upon characteristics of the ....
J. Mellor-Crummey, D. Whalley, and K. Kennedy, "Improving Memory Hierarchy Performance for Irregular Applications," Proceedings of the 1999.
No context found.
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.
No context found.
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.
No context found.
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications using data and computation reorderings. International Journal of Parallel Programming, 29(3), June 2001.
No context found.
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proceedings of the 13th ACM International Conference on Supercomputing, pages 425--433, Rhodes, Greece, June 1999.
No context found.
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. International Journal of Parallel Programming, 29(3), June 2001.
No context found.
J. M. Mellor-Crummey, D. B. Whalley, and K. Kennedy. Improving Memory Hierarchy Performance for Irregular Applications. In ACM Int'l Conference on Supercomputing, pages 425-433, Rhodes, Greece, (1999).
No context found.
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proceedings of the 1999.
No context found.
J. M. Mellor-Crummey, D. B. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications using data and computation reorderings. International Journal of Parallel Programming, 29(3):217--247, 2001.
No context found.
Mellor-Crummey, J., Whalley, D. and Kennedy, K.: Improving memory hierarchy performance for irregular applications, Proc. 1999 ACM International Conference on Supercomputing, Rhodes, Greece (June 1999).