M. Wolfe. Advanced loop interchanging. In Proceedings of the International Conference on Parallel Processing (ICPP'86), 1986.
....due to software pipelining also depends on high level loop transformations performed earlier in the compilation process. 2.3 Loop permutation. Loop permutation (also called loop interchange for two dimensional loops) is a useful high level loop transformation for performance optimization [19]. See the following C program fragment: Since the array a is placed by row major mode, the above program fragment doesn't have good cache locality because two successive references on array a have a large span in memory space. By switching the inner and outer loop, the original loop is ....
Michael Wolfe. Advanced loop interchanging. In Proc. of the
....of the dependence occur in different iterations of the loop. Even if the outermost loop cannot be parallelized, it may be possible to move an inner parallel loop to the outermost position. This transformation is called loop permutation [Ban90], a generalization of loop interchange [AK84, AK87, Wol86, Wol89]. To perform loop permutation, the dependence analysis phase looks for loops that do not carry any dependences. These loops will be candidates for moving to the outermost position. This section presents two versions of loop permutation. First, we address the problem of perfectly nested ....
....not be modified along the chain of calls from their new position to their original one. Possible exceptions include loop bounds based on induction variables of an outer loop. Permutation of these triangular and trapezoidal loops requires a slight extension to the transformation described here [Wol86]. Mechanics. During dependence analysis, loops that do not carry any dependences are marked. To determine whether a loop containing a procedure call carries a dependence, rsds represent the called procedure. When selecting loops for loop permutation, the compiler examines a chain of procedures ....
M. Wolfe. Advanced loop interchanging. In Proceedings of the
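The marking step mentioned in the context above ("loops that do not carry any dependences are marked") might be sketched as follows. This is a minimal illustration, not the cited compiler's implementation: it assumes dependences are available as integer distance vectors, and the function name `mark_carrying_loops` and the row-major layout of `dist` are hypothetical. A dependence is treated as carried by the outermost loop whose distance entry is nonzero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: dist is an ndep x depth row-major matrix of
   distance vectors. On return, carries[k] is true if loop k (0 is the
   outermost) carries at least one dependence. */
void mark_carrying_loops(const int *dist, size_t ndep, size_t depth,
                         bool carries[]) {
    for (size_t k = 0; k < depth; k++)
        carries[k] = false;

    for (size_t d = 0; d < ndep; d++) {
        for (size_t k = 0; k < depth; k++) {
            if (dist[d * depth + k] != 0) {
                /* The outermost nonzero entry identifies the carrier. */
                carries[k] = true;
                break;
            }
        }
        /* An all-zero vector is loop independent: no loop carries it. */
    }
}
```

Loops left unmarked are only candidates for moving outward. Whether a particular permutation is legal still has to be checked separately against the dependence vectors.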
....due to software pipelining also depends on high level loop transformations performed earlier in the compilation process. 2.3 Loop permutation. Loop permutation (also called loop interchange for two dimensional loops) is a useful high level loop transformation for performance optimization [19]. See the following C program fragment: for (i = 0; i < M; i++) { for (j = 0; j < N; j++) { a[j][i] = 1; } } Since the array a is placed by row major mode, the above program fragment doesn't have good cache locality because two successive references on array a have a large span in memory ....
Michael Wolfe. Advanced loop interchanging. In Proc. of the
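The interchange described in the context above can be sketched in C as follows. This is an illustration under assumptions: the function names `fill_column_order` and `fill_row_order` and the sizes `M` and `N` are hypothetical and do not come from the citing paper.

```c
#include <stddef.h>

enum { M = 64, N = 48 };  /* hypothetical sizes chosen for illustration */

/* Original order: i outer, j inner. Successive inner iterations touch
   a[j][i] and a[j+1][i], which lie M elements apart in row-major storage. */
void fill_column_order(int a[N][M]) {
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < N; j++)
            a[j][i] = 1;
}

/* Interchanged order: j outer, i inner. Successive inner iterations now
   touch the adjacent elements a[j][i] and a[j][i+1]. */
void fill_row_order(int a[N][M]) {
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < M; i++)
            a[j][i] = 1;
}
```

Both functions write the same values. The loop body has no dependences between iterations, so the interchange is legal here, and only the order in which memory is traversed changes.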
....several parts and then further transforming the resulting subqueries. In this thesis, we combine the execution of a subquery with the partitioning phase of a Hybrid hash join which is similar to using pipelining. The idea of interchanging loops appears frequently in work on vectorizing FORTRAN [PADU86, WOLF86, WOLF89]. For instance, do I = 1, N do J = 1, N S = S B(I,J) A(I,J 1) A(I,J) B(I,J) C(I,J) enddo enddo cannot be directly vectorized. However, if we interchange the I and J loops, the definition of A(I,J 1) can be vectorized. The definition of S involves a reduction operation. A reduction ....
....We discuss the related work for each in turn. 2.2.1. Related Transformation Work. An enormous amount of work has been done on parallelizing relational queries (e.g. [GERB86, SCHN89, GRAE90, KITS90, WOLF90, HUA91, WALT91, DEWI92a]). Other work has been on parallelizing loops in FORTRAN (e.g. [PADU86, WOLF86, WOLF89]) and in LISP [LARU89]. All this work makes extensive use of program transformations. [HART88, HART89] discuss their parallelizing compiler for FAD, a functional DBPL. They use analysis to determine if a program can correctly be executed in parallel bringing all the data to a central site and ....
Michael Wolfe. Advanced Loop Interchanging. Proc. 1986 Int. Conf. Parallel Processing, August 1986.
....entries in the distance direction vector. If the result is lexicographically positive, the permutation is legal, and we transform the nest. By definition, .... (Footnote 2: In Section 3.5, we perform imperfect interchanges with distribution. The evaluation method can also drive imperfect loop interchange [Wolfe 1986], but we did not implement it.) Input: $O = \{i_1, i_2, \ldots, i_n\}$, the original loop ordering; $DV$ = set of original legal direction vectors for $l_n$; $L = \{i_{\sigma_1}, i_{\sigma_2}, \ldots, i_{\sigma_n}\}$, a permutation of $O$ with the best estimated ....
Wolfe, M. J. 1986. Advanced loop interchanging. In Proceedings of the 1986 International Conference on Parallel Processing. CRC Press, Boca Raton, Fla.
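The legality test described in the context above (permute the entries of each direction vector and check that the result is lexicographically positive) might look like the following sketch. It is not the citing paper's code: the name `permutation_is_legal` and the encoding of direction vectors as strings of '<', '=' and '>' are assumptions made for illustration, and the unknown direction '*' is not handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper. Each dvs[d] is a string of length depth holding
   one direction per loop: '<' (positive), '=' (zero), '>' (negative).
   perm[k] is the index of the original loop placed at depth k. */
bool permutation_is_legal(const char *dvs[], size_t ndv,
                          const size_t perm[], size_t depth) {
    for (size_t d = 0; d < ndv; d++) {
        for (size_t k = 0; k < depth; k++) {
            char c = dvs[d][perm[k]];
            if (c == '<')
                break;          /* leading entry is positive: still legal */
            if (c == '>')
                return false;   /* leading entry is negative: illegal */
            /* '=' : keep scanning for the first non-zero direction */
        }
        /* A vector of all '=' is loop independent and stays legal. */
    }
    return true;
}
```

For a two-deep nest with the single direction vector `"<>"`, the identity permutation `{0, 1}` is accepted and the interchange `{1, 0}` is rejected, because the permuted vector begins with '>'.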
....by Kim's technique. We, too, break a complicated subquery into several parts. We combine the execution of a subquery with the partitioning phase of a hybrid hash join which is similar to using pipelining. The idea of interchanging loops appears frequently in work on vectorizing FORTRAN [PADU86, WOLF86, WOLF89]. For instance, do I = 1, N do J = 1, N S = S B(I,J) A(I,J 1) A(I,J) B(I,J) C(I,J) enddo enddo cannot be directly vectorized. However, if we interchange the I and J loops, the definition of A(I,J 1) can be vectorized. The definition of S involves a reduction operation. A reduction ....
Michael Wolfe. Advanced Loop Interchanging. Proc. 1986 Int. Conf. Parallel Processing, August 1986.
....for the NearbyPermutation algorithm. Theorem: If there exists a legal permutation where $\sigma_n$ is the innermost loop, then NearbyPermutation will find a permutation where $\sigma_n$ is innermost. (Footnote 2: Determining memory order does not depend on a perfect nest. Methods exist for permuting imperfect nests [49], but we only permute perfect nests or nests that fusion or distribution make perfect (see Section 3.3.1).) Figure 6: NearbyPermutation Algorithm. INPUT: $O = \{i_1, i_2, \ldots, i_n\}$, the original loop ordering; $L = \{i_{\sigma_1}, i_{\sigma_2}, \ldots, i_{\sigma_n}\}$, a permutation of $O$ ....
M. J. Wolfe. Advanced loop interchanging. In Proceedings of the 1986 International Conference on Parallel Processing, pages 536--543, St. Charles, IL, August 1986.
....communication placement [49, 84], message pipelining [136], vector message pipelining [95], and iteration reordering [113]. Improving Parallelism: Recognizing reductions and parallel prefix scan operations [53, 114] can help to improve the available parallelism. Loop interchange and strip mining [13, 160, 162] can be used to adjust the granularity of pipelined computations to balance parallelism and communication [93]. Storage Management: The use of overlap areas [73] and hash tables can ease the details of buffer management for certain types of computations. Message blocking [95] can be used in ....
M. J. Wolfe. Advanced loop interchanging. In Proceedings of the 1986 International Conference on Parallel Processing, St. Charles, IL, August 1986.
....of arrays, especially in the context of loops. Arrays, often called subscripted variables or vectors, are used extensively in loops. Loops offer a great potential source for speedup in parallel systems [47, 12, 49]. However, vectors used in the context of loops complicate the analysis considerably [71, 72, 75]. Most of the restructuring transformations discussed in the following chapters are applied to loops and thus rely extensively on the information obtained from the dependence analysis of arrays in loops. In this section we introduce some additional dependence notation along with a few new ....
....examine two transformation operators, loop interchange and scalar expansion. Loop interchange can be used to move a recurrence to an outer loop, while scalar expansion can be used to break a recurrence. In either case, the change can allow some statements to be vectorized. Loop Interchange. Wolfe [72], Allen and Kennedy [6], and Banerjee [11] have studied the loop interchange problem extensively. Loop interchange is widely used in restructuring compilers primarily because it represents a powerful method with which one can exploit statement level parallelism, reduce memory bank conflicts, or ....
Wolfe, M. Advanced Loop Interchanging. In Proc. of the 1986 International Conference on Parallel Processing (1986), IEEE Computer Science Press, pp. 536--543.
....the permutation is legal, and we transform the nest. By definition, the original distance direction vector is legal, i.e. lexicographically positive [Allen .... (Footnote 2: In Section 3.5, we perform imperfect interchanges with distribution. The evaluation method can also drive imperfect loop interchange [Wolfe 1986], but we did not implement it.) Input: $O = \{i_1, i_2, \ldots, i_n\}$, the original loop ordering; $DV$ = set of original legal direction vectors for $l_n$; $L = \{i_{\sigma_1}, i_{\sigma_2}, \ldots, i_{\sigma_n}\}$, a permutation of $O$ with the best estimated ....
Wolfe, M. J. 1986. Advanced loop interchanging. In Proceedings of the 1986 International Conference on Parallel Processing. CRC Press, Boca Raton, Flor.
....[Figure 15: Dependence graphs of loops in Example 21.] The correctness of loop interchanging can be determined using only direction vectors. This method was developed by Steve Chen for the Burroughs Scientific Processor. Loop interchanging is described in detail by Wolfe [14, 78], who also studied how it can be applied to triangular loops. Further discussions on interchanging can be found in the work of Allen and Kennedy [79]. The second technique discussed here is skewing, which is very similar to the technique of the same name presented above for single loops, except ....
M. Wolfe. Advanced loop interchanging. In K. Hwang, S. Jacobs, and E. Swartzlander, editors, Proceedings of the 1986 Int'l. Conf. on Parallel Processing, pages 536--543, St. Charles, Ill., August 1986. IEEE Computer Society Press, Washington, DC.
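Interchanging a triangular nest, as mentioned in the context above, requires rewriting the loop bounds as well as swapping the loops. The sketch below shows one standard case and is not taken from the cited paper: the names `visit_original` and `visit_interchanged` and the size `TRI_N` are hypothetical, and the loop body is chosen to have no dependences so that the reordering is trivially legal.

```c
#include <stddef.h>

enum { TRI_N = 6 };  /* hypothetical size chosen for illustration */

/* Original triangular nest: the inner bound depends on the outer index,
   so the iteration space is the set of pairs with 0 <= j <= i < TRI_N. */
void visit_original(int count[TRI_N][TRI_N]) {
    for (size_t i = 0; i < TRI_N; i++)
        for (size_t j = 0; j <= i; j++)
            count[i][j] += 1;
}

/* Interchanged nest: j becomes the outer loop, and the bounds of i are
   rewritten to j <= i < TRI_N so the same pairs (i, j) are visited. */
void visit_interchanged(int count[TRI_N][TRI_N]) {
    for (size_t j = 0; j < TRI_N; j++)
        for (size_t i = j; i < TRI_N; i++)
            count[i][j] += 1;
}
```

Swapping the two loop headers without adjusting the bounds would not compile into a meaningful nest, because the bound on `j` refers to `i`, which would no longer be in scope.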
No context found.
M. Wolfe. Advanced loop interchanging. In Proceedings of International Conference on Parallel Processing (ICPP'86), 1986.
No context found.
M. Wolfe. Advanced loop interchanging. In Proceedings of International Conference on Parallel Processing (ICPP'86), 1986.
No context found.
Michael Wolfe. Advanced loop interchanging. Proceedings of the 1986 International Conference on Parallel Processing, pages 536--543, Aug. 19--22 1986.
No context found.
M. J. Wolfe: "Advanced Loop Interchanging ", Proc. of 1986 Int'l Conf. on Parallel Processing, pp. 536--543 (1986).
No context found.
M. J. Wolfe: "Advanced Loop Interchanging ", Proc. of 1986 Int'l Conf. on Parallel Processing, pp. 536--543 (1986).