Yonghong Song, Rong Xu, Cheng Wang, and Zhiyuan Li. Data locality enhancement by memory reduction. In Proceedings of the 15th ACM International Conference on Supercomputing, Naples, Italy, June 2001.
....algorithm. Then, several authors (see the Related Work section) contributed to loop fusion optimizations, but with slightly different objectives, focusing on loop fusion for locality [17], weighted loop fusion [19], maximal fusion (number of loops) [4], loop fusion for memory reduction [25, 15], etc. All these approaches keep in mind array contraction but they do not optimize directly for it. They target variants of data locality (for example, number of fused dependences) and, in favorable cases (but not always), they can achieve array contraction as a secondary effect. Since the work of ....
....[Figure 4: Different versions for locality.] 5 Related Work. Loop fusion for optimization of locality and memory reduction has a long history. All experimental studies (see for example the experimental results in [17, 19, 24, 14, 15, 25, 16]) lead to the same conclusions: loop transformations (especially loop fusion) are important for data locality optimization in general and array contraction in particular, and array contraction has an impact both on performance and on memory size. Furthermore, there are benefits to performing these ....
Yonghong Song, Rong Xu, Cheng Wang, and Zhiyuan Li. Data locality enhancement by memory reduction. In ACM International Conference on Supercomputing, pages 50--64, Sorrento, Italy, 2001. ACM Press.
....for sequential execution. Fraboulet et al. [5] use loop alignment to reduce memory requirement between adjacent loops by formulating the one-dimensional version of the problem as a network flow problem; they did look at the effect of their solution on cache behavior or communication. Song et al. [17, 18] present a different network flow formulation of the memory reduction problem and they include a simple model of cache misses as well. They do not consider trading off memory for recomputation or the impact of data distribution on communication costs while meeting per-processor memory constraints ....
Y. Song, R. Xu, C. Wang and Z. Li. Data locality enhancement by memory reduction. In Proc. of ACM 15th International Conference on Supercomputing, pages 50--64, June 2001.
....has been some recent work on using loop fusion for memory reduction for sequential execution. Fraboulet et al. [7] use loop alignment to reduce memory requirement between adjacent loops by formulating the one-dimensional version of the problem as a network flow problem. Song [23] and Song et al. [25, 24] present a different network flow formulation of the memory reduction problem and they include a simple model of cache misses as well. However, they do not consider the issue of trading off memory for recomputation. 9. CONCLUSION. This paper addressed a space-time trade-off problem that arises ....
Y. Song, R. Xu, C. Wang, and Z. Li. Data locality enhancement by memory reduction. In 15th ACM International Conference on Supercomputing, pages 50--64, Sorrento, Italy, June 2001.
....a max-flow min-cut algorithm, taking into account the data dependencies. However, their study is motivated by data locality enhancement and not memory reduction. Also, they only considered fusions of conformable loop nests, i.e., loop nests that contain exactly the same set of loops. Song et al. [25] have explored the use of loop fusion for memory reduction for sequential execution. They do not consider trading off memory for recomputation or the impact of data distribution on communication costs while meeting per-processor memory constraints in a distributed-memory machine. Loop fusion in ....
Y. Song, R. Xu, C. Wang and Z. Li. Data locality enhancement by memory reduction. In Proc. of ACM 15th International Conference on Supercomputing, pages 50--64, June 2001.
....contracting one dimension of an array is prima facie optimal, our system can often contract most of the array to a fixed number of scalars. This is explained in section 4, and illustrated in Figure 3. Song et al. describe a compiler that combines loop shifting, loop fusion, and array contraction [25]. Tiling is not done: a given loop's iterations are performed in the same order as specified in the FORTRAN source code. Furthermore, loop shifting is a blunt tool for exposing loop fusion opportunities. The advantage they gain by limiting the set of possible transformations is that selecting a ....
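The three transformations named in the excerpt above can be illustrated with a small hand-written example. This is only a sketch of the general idea, not the algorithm of Song et al. or of the citing paper; the function names `unfused` and `shifted_fused_contracted` and the loop bodies are hypothetical. The second loop reads `t[i + 1]`, so the two loops cannot be fused directly; shifting the second loop by one iteration makes fusion legal, after which the temporary array `t` contracts to two scalars.

```python
def unfused(a):
    """Original form: two loops and a full-size temporary array t."""
    n = len(a)
    t = [0] * n
    for i in range(n):              # loop 1: produce t
        t[i] = 2 * a[i]
    out = [0] * max(n - 1, 0)
    for i in range(n - 1):          # loop 2: reads t[i] and t[i + 1]
        out[i] = t[i] + t[i + 1]
    return out


def shifted_fused_contracted(a):
    """Loop 2 is shifted by one iteration, fused with loop 1,
    and t is contracted to the scalars prev and cur."""
    n = len(a)
    out = [0] * max(n - 1, 0)
    prev = 0                        # plays the role of t[i - 1]
    for i in range(n):
        cur = 2 * a[i]              # body of loop 1 (t[i])
        if i >= 1:                  # shifted body of loop 2, iteration i - 1
            out[i - 1] = prev + cur
        prev = cur
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(unfused(data))
    print(shifted_fused_contracted(data))
```

Both functions compute the same result; the second needs no storage for `t` beyond two scalars, and each produced value is consumed while it is still live.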
Y. Song, R. Xu, C. Wang, and Z. Li. Data Locality Enhancement by Memory Reduction. 15th ACM International Conference on Supercomputing, June 2001.
No context found.
Yonghong Song, Rong Xu, Cheng Wang, and Zhiyuan Li. Data locality enhancement by memory reduction. In Proceedings of the 15th ACM International Conference on Supercomputing, Naples, Italy, June 2001.
No context found.
Y. Song, R. Xu, C. Wang, and Z. Li. Data locality enhancement by memory reduction. In Proceedings of the 15th ACM International Conference on Supercomputing, Sorrento, Italy, June 2001.
No context found.
Y. Song, R. Xu, C. Wang, and Z. Li. Data locality enhancement by memory reduction. In 15th ACM International Conference on Supercomputing, pages 50--64, Sorrento, Italy, June 2001.
No context found.
Yonghong Song, Rong Xu, Cheng Wang, and Zhiyuan Li. Data locality enhancement by memory reduction. In ACM International Conference on Supercomputing, pages 50--64, Sorrento, Italy, 2001. ACM Press.
No context found.
Y. Song, R. Xu, C. Wang, and Z. Li. Data locality enhancement by memory reduction. In International Conference on Supercomputing, pages 50--64, 2001.