S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. S. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proc. 11th ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 222--231, 1999.
.... [19] and [45]. Hilbert curves have been shown to preserve several measures of locality [35, 23]. An alternative with better performance in two dimensions is given in [37]. Generalizations of Hilbert curves to higher dimensions are given in [1]. Specific applications include matrix multiplication [11, 20], domain decomposition [3, 25, 39], and image processing [2, 34, 4, 51, 31, 30]. They are also a standard tool in the creation of cache-oblivious algorithms [21, 40, 5, 41, 6, 10], which have asymptotically optimal memory performance on multilevel memory hierarchies while avoiding memory-specific ....
S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. S. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proc. 11th ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 222--231, 1999.
.... compilation to a challenging problem: the automatic parallelization of divide-and-conquer programs [4]. The inherent parallelism and good cache locality of divide-and-conquer algorithms make them a good match for modern parallel machines, with excellent performance on a range of problems [2, 10, 6, 3]. The tasks in divide-and-conquer programs often access disjoint regions of the same array. To parallelize such a program, the compiler must precisely characterize the regions of memory that the complete computation of each procedure accesses. But it can be quite difficult to extract this ....
S. Chatterjee, A. Lebeck, P. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proceedings of the 11th Annual ACM Symposium on Parallel Algorithms and Architectures, Saint Malo, France, June 1999.
.... a large variety of problems, including sorting, matrix manipulation, and many dynamic programming problems [4]. The inherent parallelism and good cache locality of divide-and-conquer algorithms make them a good match for modern parallel machines, with excellent performance on a range of problems [2, 10, 6, 3]. 1.3 Automatic Parallelization. The tasks in divide-and-conquer programs often access disjoint regions of the same array. To parallelize such a program, the compiler must precisely characterize the regions of memory that the complete computation of each procedure accesses. But it can be quite ....
S. Chatterjee, A. Lebeck, P. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proceedings of the 11th Annual ACM Symposium on Parallel Algorithms and Architectures, Saint Malo, France, June 1999.
.... the sizes of the basic blocks. Recursion re-rolling rolls back the recursive part of the procedure to ensure that a large unrolled base case is always executed, regardless of the problem size. 1.1 Divide and Conquer Programs. We have applied recursion unrolling to divide-and-conquer programs [10, 8, 5]. Divide-and-conquer algorithms solve problems by breaking them into smaller subproblems, then combining the results to generate a solution to the original problem. They use recursion as their primary control structure to generate and solve the smaller subproblems. When the division has reduced ....
S. Chatterjee, A. Lebeck, P. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proceedings of the 11th Annual ACM Symposium on Parallel Algorithms and Architectures, Saint Malo, France, June 1999.
.... PLDI 2000, Vancouver, British Columbia, Canada. .... cated arrays [20, 15, 9]. These programs present a challenging set of program analysis problems: they use recursion as their primary control structure, they use dynamic memory allocation to match the sizes of the data structures to the problem size, and they access data structures using pointers and pointer arithmetic, ....
S. Chatterjee, A. Lebeck, P. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proceedings of the 11th Annual ACM Symposium on Parallel Algorithms and Architectures, Saint Malo, France, June 1999.
.... PLDI 2000, Vancouver, British Columbia, Canada. .... cated arrays [16, 13, 8]. These programs present a challenging set of program analysis problems: they use recursion as their primary control structure, they use dynamic memory allocation to match the sizes of the data structures to the problem size, and they access data structures using pointers and pointer arithmetic, ....
S. Chatterjee, A. Lebeck, P. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proceedings of the 11th Annual ACM Symposium on Parallel Algorithms and Architectures, Saint Malo, France, June 1999.
No context found.
S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. S. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proc. 11th ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 222--231, 1999.
No context found.
S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. IEEE Transactions on Parallel and Distributed Systems, 13:1105--1123, 2002.
No context found.
S. Chatterjee, A. Lebeck, P. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. In Proceedings of the 11th Annual ACM Symposium on Parallel Algorithms and Architectures, Saint Malo, France, June 1999.
No context found.
S. Chatterjee, A. R. Lebeck, and P. K. Patnala. Recursive array layouts and fast matrix multiplication. IEEE TPDS.