21 citations found.
M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of the 1998.


This paper is cited in the following contexts:
Precise Data Locality Optimization of Nested Loops - Loechner, Meister, Clauss (2002)   (7 citations)

....investigations of the effect of data transformations combined with loop transformations. However, the only data layouts that have been considered are row major or column major storage [2, 6, 11] and data transformations have been restricted to be unimodular [14]. More recently, Kandemir et al. [12] and O'Boyle and Knijnenburg [17] have proposed a unifying framework for loop and more general data transformations. In [17] the authors propose an extension to nonsingular data transformations. Unfortunately, these approaches do not use any symbolic analysis in order to evaluate the array sizes, ....

....$C(L) = \sum_{i=1}^{r} \#D_i$, where $r$ denotes the number of different references occurring in $L$ and $D_i$ denotes the set of iterations of the loop nest enclosing the $i$th reference. All loop nests of a program are ordered according to this value. This cost function is similar to the one defined by Kandemir et al. in [12], except that our evaluation tools allow to compute exact symbolic values. Loop nests are optimized from the costliest to the less costly nest. When considering one loop nest, the best combination of loop and data transformations depends on previous optimizations made on costlier nests and on the ....
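
The cost-based ordering quoted above can be illustrated with a minimal sketch. It reads the cost of a loop nest as the sum, over its references, of the number of iterations enclosing each reference. It also assumes rectangular loop bounds so that each iteration count is a plain product of trip counts; the citing paper computes exact symbolic values instead. The names `nest_cost`, `order_by_cost`, and the example nests are hypothetical and do not come from any of the cited papers.

```python
from math import prod


def nest_cost(references):
    """Cost of one loop nest: sum over references of enclosing iterations.

    `references` holds one entry per array reference in the nest; each entry
    is the tuple of trip counts of the loops enclosing that reference.
    Assumption: rectangular bounds, so the iteration count is their product.
    """
    return sum(prod(trip_counts) for trip_counts in references)


def order_by_cost(nests):
    """Return nest names ordered from the costliest to the least costly."""
    return sorted(nests, key=lambda name: nest_cost(nests[name]), reverse=True)


# Hypothetical program with three loop nests.
nests = {
    "init": [(100,)],                    # one reference in a 1-deep loop
    "matmul": [(100, 100, 100)] * 3,     # three references in a 3-deep nest
    "scale": [(100, 100), (100, 100)],   # two references in a 2-deep nest
}

print(order_by_cost(nests))
```

Under these assumptions the nests would be optimized in the order the function returns, with earlier (costlier) nests constraining the choices left for later ones.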

[Article contains additional citation context not shown here]

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to global locality optimization. Journal of Parallel and Distributed Computing, 58: 190-235, 1999.


Data Sequence Locality: a Generalization of Temporal Locality - Loechner, Meister, Clauss

.... several loop transformations have been unified into a single framework using a matrix representation of these transformations [22]. These techniques consist either in unimodular [1] or non unimodular [11] iteration space transformations as well as tiling [10, 20, 21]. More recently, Kandemir et al. [9] and O'Boyle and Knijnenburg [14] have proposed a unifying framework for loop and more general data transformations. In [14] the authors propose an extension to nonsingular data transformations. Unfortunately, these approaches do not pay special attention to TLBs. Hence, a program that exhibits ....

....on massively parallel architectures: when a processor brings a data page into its local memory, it will reuse it as much as possible due to our temporal optimization. This yields significant reductions in page faults and hence in network traffic. We can also say, as mentioned by Kandemir et al. in [9], that optimized programs do not need explicit data placement techniques on shared memory NUMA architectures: when a processor uses a data page frequently, the page is either replicated onto that processor's memory or migrated into it. In either cases, all the remaining accesses will be local. ....

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to global locality optimization. Journal of Parallel and Distributed Computing, 58:190-235, 1999.


High Performance Numerical Computing in Java.. - Artigas, Gupta.. (1999)   (8 citations)

.... of data layout: the actual data layout for the Arrays is not exposed to the programmer (as it is not specified). While this may prevent the programmer from doing certain optimizations, we believe this is beneficial in the longer term because it facilitates data layout optimizations for Arrays [10, 18, 21, 22, 14]. The compiler has fewer constraints on ensuring the correctness of the program in the presence of data layout transformations, and can avoid copy operations otherwise needed to restore the data layout to a fixed format. The class hierarchy of the Array package is shown in Figure 2. It has been ....

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT'98), Paris, France, October 1998.


Minimizing Strides in Loops with Affine Array References - Clauss, Loechner, Meister (2001)

....[18, 17]. Although there has been less attention paid to data transformations, some works investigate the effect of data transformations combined with loop transformations. However, the data transformations they consider are restricted to be unimodular [8, 1, 10]. More recently, Kandemir et al. [9] and O'Boyle and Knijnenburg [12] have proposed an unifying framework for loop and more general data transformations. In [12] the authors propose an extension to nonsingular data transformations. Unfortunately, all these approaches do not use any symbolic analysis in order to evaluate the array ....

....on processor locality: when a processor brings a data page into its local memory, it will reuse it as much as possible due to our temporal and spatial optimizations. This yields significant reductions in page faults and hence in network traffic. We can also say, as mentioned by Kandemir et al. in [9], that optimized programs do not need explicit data placement techniques on shared memory NUMA architectures: when a processor uses a data page frequently, the page is either replicated onto that processor's memory or migrated into it. In either cases, most of the remaining accesses will be ....

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to global locality optimization. Journal of Parallel and Distributed Computing, 58:190-235, 1999.


Guiding Program Transformations with Modal Performance Models - Mitchell (2000)   (2 citations)

....spatial locality of affine memory references [7, 72, 66, 61]. For example, they might remap from row to column major order, or to storage aligned with diagonals. Leung's work [72] also addresses issues particular to non affine index expressions. Other researchers combine iteration and data reordering [24, 25, 62]. All of these works use linear remappings. Just as iteration permutations via tiling cannot optimize non affine references, linear data remappings also do not suffice. [Figure labels: original computation, bucket tiled computation, permutation generation, FOREACH loops, DO loop] ....

Mahmut T. Kandemir, Alok Choudhary, Jagannathan Ramanujam, and Prithviraj Banerjee. A matrix-based approach to the global locality optimization problem. In Parallel Architectures and Compilation Techniques, October 1998.


High Performance Computing in Java: Language and.. - Artigas, Gupta.. (1999)

.... of data layout: the actual data layout for the Arrays is not exposed to the programmer (as it is not specified). While this may prevent the programmer from doing certain optimizations, we believe this is beneficial in the longer term because it facilitates data layout optimizations for Arrays [10, 17, 20, 21, 14]: the compiler has fewer constraints on ensuring the correctness of the program in the presence of data layout transformations, and can avoid copy operations otherwise needed to restore the data layout to a fixed format. The class hierarchy of the Array package is shown in Figure 2. It has been ....

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT'98), Paris, France, October 1998.


Java Programming for High Performance Numerical Computing - Moreira, Midkiff.. (2000)   (21 citations)

....fact that data elements are contiguous in memory, and that the next logical element of the array is also adjacent in storage to the previous element. Quite frequently, Fortran programs rely on storage association to execute correctly. In addition, optimizations that alter the layout of an array [16, 20, 21, 29] are globally visible. As a consequence, data typically has to be copied to and from the original layout. This makes these optimizations more fragile and less useful, since the overhead of copying the arrays twice must be factored into the cost model for determining if the optimization should be ....

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT'98), Paris, France, October 1998.


Inter-array Data Regrouping - Ding, Kennedy (1999)   (9 citations)

.... optimization of data shapes on an operation dag [25]. Leung studied a general class of data transformations, unimodular transformations, for improving cache performance [23]. The combination of both computation and data restructuring is explored by Cierniak and Li [10] and then by Kandemir et al. [19]. For irregular and dynamic programs, run time data packing is used to improve spatial reuse by Ding and Kennedy [12]. All these previous work are anchored on restructuring data within one array. As a result, none of them address the selective partition of data arrays, neither can they exploit ....

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques, 1998.


Localizing Non-affine Array References - Mitchell, Carter, Ferrante (1999)   (15 citations)

....locality of affine memory references [2, 20, 18, 15]. For example, they might remap from row to column major order, or to storage aligned with diagonals. Leung's work [20] also addresses issues particular to non affine index expressions. Other researchers combine iteration and data reordering [6, 7, 14]. All of these works use linear remappings. Just as iteration permutations .... [Footnote 1: In Section 2, we will see why remapping A is undesirable.] [Figure labels: original computation, bucket tiled computation, permutation generation, data remapping, FOREACH loops, DO loop regeneration]

M. T. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Parallel Architectures and Compilation Techniques, Oct. 1998.


An Integer Linear Programming Approach for Optimizing.. - Kandemir Banerjee.. (1999)   (3 citations)  Self-citation (Kandemir Choudhary Ramanujam Banerjee)

No context found.

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proc. 1998.


A Matrix-Based Approach to Global Locality Optimization - Kandemir, Choudhary.. (1999)   (16 citations)  Self-citation (Kandemir Choudhary Ramanujam Banerjee)

No context found.

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proc. International Conference on Parallel Architectures and Compilation Techniques (PACT'98), October 14--17, 1998, Paris, France.


Data Relation Vectors: A New Abstraction for Data Optimizations - Kandemir, Ramanujam   Self-citation (Kandemir Ramanujam)

....in loop nests. These include (i) loop optimization strategies such as linear loop transformations, tiling, loop fusion, loop fission, and loop unrolling [31], (ii) data transformations, which change the memory layout of data [20, 14], and (iii) combined loop and data transformation techniques [8, 12, 13, 24]. These transformation techniques need to use some kind of abstraction (a) for expressing the inherent locality exhibited by the code, and (b) for deriving optimal (or enhanced) loop orders and/or memory layouts. So far, a majority of the approaches to optimizing data locality in structured ....

....data access pattern, thereby promoting intra-tile reuse. Careful selection of tile traversals also leads to better inter-tile locality. Several recent studies have focussed on data space transformations [20, 14, 23, 28] and on combining loop and data transformations in a unified framework [24, 8, 15, 12, 15]. Cierniak and Li [8] in particular, use an abstraction that exploits the offset difference between successively accessed elements by a given reference. They focus on self reuses and the abstraction that they use allows them to handle only column major and row major layouts and higher dimensional ....

[Article contains additional citation context not shown here]

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proc. 1998 Intl. Conf. Parallel Arch. & Comp. Tech., 1998.


A Graph Based Framework to Detect Optimal Memory.. - Kandemir.. (1999)   Self-citation (Kandemir Choudhary Ramanujam Banerjee)

....[8] Such a scheme in general favors one of the references over the other, resulting sometimes in a sub optimal memory layout for the latter. An alternative way of handling this global array layout optimization problem is based on propagating memory layouts across loop nests. In an earlier paper [7], we proposed a technique that starts with the most costly nest and optimizes it using data transformations. After this step, the optimal memory layouts for the arrays referenced in this loop nest are determined. These layouts are then propagated to the next most costly nest. The potential ....

....After this step, the optimal memory layouts for the arrays referenced in this loop nest are determined. These layouts are then propagated to the next most costly nest. The potential negative impact of these new layouts in this second loop nest is decreased using iteration space transformations [18, 7]. This approach has three main problems, though. First, it uses iteration space transformations for all but the first nest, thereby bringing the issue of legality into the picture again. Second, it may not always be possible to find a loop transformation to lessen the negative impact of data ....
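
The propagation scheme described in the two excerpts above can be sketched roughly as follows. This is a simplified illustration, not the authors' algorithm. It assumes each nest simply names one preferred layout per array, visits nests from costliest to cheapest, fixes a layout the first time an array is seen, and flags any later nest whose preference disagrees as needing an iteration space transformation. The function `propagate_layouts` and the example data are hypothetical.

```python
def propagate_layouts(nests):
    """Propagate array layouts from the costliest nest to cheaper ones.

    `nests` is a list of (name, cost, preferences) tuples, where
    `preferences` maps an array name to the layout that nest would like.
    Returns the layout fixed for each array and the names of the nests
    whose preferences conflict with an already fixed layout (these would
    need a loop transformation in this simplified model).
    """
    fixed = {}
    needs_loop_transformation = []
    for name, _cost, preferences in sorted(
        nests, key=lambda nest: nest[1], reverse=True
    ):
        conflict = False
        for array, layout in preferences.items():
            if array not in fixed:
                fixed[array] = layout      # first (costliest) user decides
            elif fixed[array] != layout:
                conflict = True            # inherited layout does not suit
        if conflict:
            needs_loop_transformation.append(name)
    return fixed, needs_loop_transformation


# Hypothetical program: three nests with estimated costs.
nests = [
    ("n1", 1000, {"A": "row-major", "B": "column-major"}),
    ("n2", 500, {"A": "column-major", "C": "row-major"}),
    ("n3", 2000, {"B": "row-major"}),
]

layouts, flagged = propagate_layouts(nests)
print(layouts, flagged)
```

The sketch makes the first problem named in the excerpt visible: every flagged nest has to fall back on a loop transformation, which brings legality checks back into the picture.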

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. To appear in Proc. 1998 Intl. Conf. Parallel Architectures & Compilation Techniques (PACT'98), Paris, France, October 1998.


An ILP Approach for Optimizing Cache Locality - Kandemir, Banerjee.. (1998)   Self-citation (Kandemir Choudhary Ramanujam Banerjee)

....[6] for a study of techniques for ensuring the legality of memory layout transformations. It seems natural to try and combine the benefits of loop and data transformations in improving the memory performance of programs. There have been some efforts aimed at unifying loop and data transformations [2, 21, 23, 24, 26, 40]; all these efforts have used some form of heuristics. For example, these heuristics are used to decide such things as the order of processing the nests in deciding layouts and the order in which loop or data transformations are applied in each nest. In earlier work, we have presented a ....

....For example, these heuristics are used to decide such things as the order of processing the nests in deciding layouts and the order in which loop or data transformations are applied in each nest. In earlier work, we have presented a heuristic for deciding the order of processing loop nests [23] and have shown results on using, for each loop nest, loop transformations followed by data transformations [24]. The use of heuristics leads to no guarantee of optimality. In this paper, we present a new approach that uses integer linear programming (ILP) [38]. We use a structure called the ....

[Article contains additional citation context not shown here]

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proc. 1998 Intl. Conf. Parallel Architectures & Compilation Techniques (PACT'98), Paris, France, October 1998.


Generating Cache Hints for Improved Program Efficiency - Beyls, D'Hollander (2004)

No context found.

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of the 1998.


Localizing Non-affine Array References - Mitchell, Carter, Ferrante (1999)   (15 citations)

No context found.

M. T. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Parallel Architectures and Compilation Techniques, Oct. 1998.


Next-Generation Memory Systems - Wang (2004)

No context found.

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In The 1998.


Inter-array Data Regrouping - Ding, Kennedy (1999)   (9 citations)

No context found.

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques, 1998.


Improving Data Locality by Array - Yonghong Song Rong

No context found.

Mahmut Kandemir, Alok Choudhary, J. Ramanujam, and Prithviraj Banerjee. A matrix-based approach to global locality optimization. Journal of Parallel and Distributed Computing, 58(2):190--235, 1999.


Improving Effective Bandwidth through Compiler Enhancement of.. - Ding (2000)   (10 citations)

No context found.

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques, 1998.


Using the Compiler to Improve Cache Replacement Decisions - Wang, McKinley, Rosenberg, .. (2002)   (11 citations)

No context found.

M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In The 1998.


CiteSeer.IST - Copyright Penn State and NEC