M. W. Hall, B. R. Murphy, and S. P. Amarasinghe. Interprocedural Analysis for Parallelization. In Proc. of the 8th Work. on Langs. and Compilers for Parallel Computing, volume 1033 of Lecture Notes in Computer Science, pages 61--80, Columbus, OH, Aug. 1995. Springer-Verlag. 1996.
....focused on the question of what techniques are important to implement in a compiler. One of the first such studies [17] was done at Illinois as part of the Cedar project, by a group of people of which the author was a member. Similar studies have been done at Stanford [21]. Researchers at Stanford [22], and Minnesota [20] plus the PIPS group at Ecole des mines de Paris [13, 14] and the Parafrase 2 group at Illinois [31] have implemented compilers including the same basic transformations found to be important in the Cedar Project. The results have been similar enough to form a general ....
M. Hall, B. Murphy, S. Amarasinghe, S. Liao, and M. Lam. Interprocedural Analysis for Parallelization. Proceedings of 8th Workshop on Language and Compilers for Parallel Computing, August 1995.
.... taking into account the parallelization information [2]. Individual nests can either be parallelized explicitly by programmers using compiler directives [1, 4] or can be parallelized automatically (without user intervention) as a result of intra-procedural and inter-procedural compiler analyses [2, 10, 11]. In either case, after the parallelization step, our approach determines the data regions (for a given dataset) accessed by each processor involved. Three typical access patterns for a given reference to a disk resident array U are depicted in Figure 3 for abstract form of a loop nest taken from ....
M. W. Hall, B. Murphy, S. Amarasinghe, S. Liao, and M. Lam. Inter-procedural analysis for parallelization. In Proc. 8th International Workshop on Lang. and Comp. for Parallel Computers, pages 61--80, Columbus, Ohio, August 1995.
.... is taking into account the parallelization information [2]. Individual nests can either be parallelized explicitly by programmers using compiler directives [1, 7] or can be parallelized automatically (without user intervention) as a result of intraprocedural and interprocedural compiler analyses [2, 20, 21]. In either case, after the parallelization step, our approach determines the data regions (for a given dataset) accessed by each processor involved. Three typical access patterns for a given reference to a disk resident array U are depicted in Figure 6.3 for abstract form of a loop nest taken ....
M. W. Hall, B. Murphy, S. Amarasinghe, S. Liao, and M. Lam. Inter-procedural analysis for parallelization. In Proc. 8th International Workshop on Lang. and Comp. for Parallel Computers, pages 61--80, Columbus, Ohio, August 1995.
....protocol specification are essential for scalability of the input problem. 5. Related Work The computation power of recent machines enables the application of interprocedural analysis to practical problems (e.g. interprocedural points-to analysis [14, 31], interprocedural array dataflow analysis [17], and interprocedural partial redundancy elimination [2]). So far, these advanced analyses have not been used for explicit parallel shared memory programs. Existing research about cooperation between optimizing compilers and software DSM can be divided in three kinds. The first is that a ....
M. W. Hall, B. R. Murphy, S. P. Amarasinghe, S. Liao, and M. S. Lam. Interprocedural Analysis for Parallelization. In Proc. of the 8th Int. Workshop on LCPC. Springer-Verlag, Aug. 1995.
....offers a best way to implement this tool. One of the most important questions for which this tool should provide an answer is to be able to select the best way of partitioning the data among processors. The kind of code analysis that this tool has to perform includes interprocedural analysis [16] and array region analysis. An optimizing compiler usually performs only intraprocedural analysis, i.e. it does not attempt to understand the data dependencies between procedures. In order to detect coarse grain parallelism in an application one has to do some kind of interprocedural analysis. The ....
M. W. Hall, B. R. Murphy, S. P. Amarasinghe, S. Liao, and M. S. Lam. Interprocedural analysis for parallelization. Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing (LCPC95) August, 1995.
....a separate pass in Parafrase 2. In order to obtain the loops that enumerate the elements in the ownership, producer, consumer, and communication sets, we use the Omega library [34]. Currently, our implementation works on a single procedure at a time and does not use any inter-procedural analysis [22, 23]. In our experiments, we measured that on the average approximately 41% of the total compilation time is spent in our global communication and synchronization approach. However, it should be mentioned that nearly 24% of the 23 Appears in IEEE Trans. on Parallel and Distributed Systems, December ....
M. W. Hall, B. Murphy, S. Amarasinghe, S. Liao, and M. Lam. Inter-procedural analysis for parallelization. In Proc. 8th International Workshop on Languages and Compilers for Parallel Computers, pages 61--80, Columbus, Ohio, August 1995.
....the call graph in such a way that a node is visited only after all the nodes that call it have been visited. During the visit of a node, we compute the RECV sets for each node of it. It should be noted that there are several inter-procedural communication optimization algorithms (e.g. [25], [26], [15]) with different degrees of sophistication, and the detailed analysis of communication optimization across procedure boundaries is beyond the scope of this paper. However, we believe that for most of the algorithms found in the literature, the summarized communication information represented ....
M. W. HALL, B. MURPHY, S. AMARASINGHE, S. LIAO, and M. LAM. Inter-procedural analysis for parallelization. In Proc. 8th International Workshop on Languages and Compilers for Parallel Computers, pages 61--80, Columbus, Ohio, August 1995.
....Hence, there is a severe limit on the feasible scope of inlining. It is widely recognized that, for large scale applications, often a better alternative is to perform interprocedural summary analysis instead of inlining. Interprocedural data dependence analysis has been discussed extensively [21, 24, 31, 40]. In recent years we have seen increased efforts on array data flow analysis [10, 17, 20, 32, 33, 37, 38, 42]. However, few tools are capable of interprocedural array data flow analysis without inlining [10, 20, 23]. 2.4 Complications of Array Data Flow Analysis In reality, a parallelizing ....
....access descriptors [2], etc. to summarize MOD/USE sets of array accesses. They are not array data flow analyses. Recently, array data flow analyses based on these sets were proposed (Gross and Steenkiste [19], Rosene [38], Li [29], Tu and Padua [43], Creusillet and Irigoin [10], and M. Hall et al. [21]). Of these, ours is the only one using conditional regions (GARs) even though some do handle IF conditions using other approaches. Although the second group does not provide as many details about reaching definitions as the first group, it handles complex program constructs better and can be ....
[Article contains additional citation context not shown here]
M.W. Hall, B.R. Murphy, S.P. Amarasinghe, S.-W. Liao, and M.S. Lam. Interprocedural analysis for parallelization. In Proceedings of the 8th Workshop on Languages and Compilers for Parallel Computing, No. 1033, In Lecture Notes in Computer Science, Springer-Verlag, Berlin, pages 61--80, August 1995.
....and formal procedures are not often used in scientific programs. 9 such as partial redundancy elimination [63] many other interprocedural scalar analyses have also been introduced. They range from constant propagation [18, 36, 10, 40, 23] to subexpression availability and variable values [47], ranges [13] or preconditions [53, 52] propagation. To handle arrays more accurately than sdfi, flow insensitive array region analysis was introduced by Triolet [77] followed by many others [21, 7, 50, 57] Today, many commercial products include some interprocedural flow insensitive analyses, ....
....and polyhedra are an interesting tradeoff in the accuracy speed space and that their potential has not yet been measured. 2. 5 PIPS and Other Research Tools Many research Fortran optimizing tools include some kind of interprocedural analyses, but the closest ones to pips certainly are fiat suif [46, 47], ParaScope [26] and the D system [44] Polaris [11] Parafrase 2 [68] and Panorama [65, 37, 38] FIAT SUIF suif is an intraprocedural compiler for parallel machines developed at Stanford. To enable the parallelization of loops containing procedure calls, fiat [46] an interprocedural engine, ....
M. Hall, B. Murphy, S. Amarasinghe, S.-W. Liao, and M. Lam. Interprocedural analysis for parallelization. In Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, pages 61--80. Springer-Verlag, August 1995.
....[26, 33]. Solving such problems requires a precise intra- and inter-procedural analysis of array data flow. Several algorithms for array privatization or array expansion (a similar technique for shared memory machines) have already been proposed [16, 27, 26, 33, 22, 1, 14], based on different types of array data flow analyses. The first approach [16, 27] performs an exact analysis of array data flow, but for a restricted source language. Most of the other methods use conservative approximations of array element sets, such as MayBeDefined and MustBeDefined ....
Mary Hall, Brian Murphy, Saman Amarasinghe, Shih-Wei Liao, and Monica Lam. Interprocedural analysis for parallelization. In Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, pages 61--80. Springer-Verlag, August 1995.
....access descriptors [2], etc. to summarize MOD/USE sets of array accesses. They are not array data flow analyses. Recently, array data flow analyses based on these sets were proposed (Gross and Steenkiste [10], Rosene [20], Li [14], Tu and Padua [25], Creusillet and Irigoin [6], and M. Hall et al. [12]). Of these, ours is the only one using conditional regions (GARs) even though some do handle IF conditions using other approaches. Although the second group does not provide as many details about reaching definitions as the first group, it handles complex program constructs better and can be ....
....or a single convex region to summarize one array. Obviously, a single set can potentially lose information, and it may be not useful in some cases. Tu and Padua [25] and Creusillet and Irigoin [6] seem to use a single regular section and a single convex region, respectively. M. Hall et al. [12] use a list of convex regions to summarize all the references of an array. It is unclear if this representation is more precise than a list of regular sections, upon which our approach is based. Regarding path sensitivity, the commonality of these previous methods is that they do not distinguish ....
M.W. Hall, B.R. Murphy, S.P. Amarasinghe, S.-W. Liao, and M.S. Lam. Interprocedural analysis for parallelization. In Proceedings of the 8th Workshop on Languages and Compilers for Parallel Computing, No. 1033, In Lecture Notes in Computer Science, Springer-Verlag, Berlin, pages 61--80, August 1995.
.... 7, the resulting regions are all exact: C(OE1,OE2 ,OE3 ) W EXACT f1 =OE1 =N,1 =OE2 =10,1 =OE3 =20,OE2 10OE3 =110g D(OE1,OE2 ) W EXACT f1 =OE1 =5, 2 =OE2 =10g D(OE1,OE2 ) W EXACT f1 =OE1 =5, OE 2= 1g 7 Related Work The previous work closest to ours are those of Triolet[33] Tang[31] Hall[18], Li[27, 17] and Leservot[23] and the works by Burke and Cytron[8] and Maslov[26] for the interprocedural translation. Many other less recent studies[9, 19, 4] have addressed the problem of the interprocedural propagation of array element sets. But they did not provide sufficient symbolic ....
....had to be similarly declared. Tang[31] Tang summarizes multiple array references in the form of an integer programming problem. It provides exact solutions, but the source language is very restricted, and array reshaping is only handled in very simple cases (subarrays, as Triolet[32] Hall et al.[18] Fiat Suif includes an intra and inter procedural framework for the analysis of array variables. Under and over approximations of array elements sets are represented by lists of polyhedra. The problem of exactness is not considered. However the list representation is more precise than ours, and ....
Mary Hall, Brian Murphy, Saman Amarasinghe, Shih-Wei Liao, and Monica Lam. Interprocedural analysis for parallelization. In Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, pages 61--80. Springer-Verlag, August 1995.
....mixed and matched. 1.2.1 Exposing parallelism Traditional Fortran loop parallelizing compilers expose parallelism using a combination of ad hoc nested loop data dependence analyses [48], [86] and interprocedural region based and linear inequality driven analyses and transformations [13], [65], [6], [38]. These approaches can work well for very structured Fortran programs where most of the work lies in nested loops, and where accesses to large, flat, static data structures in the loops are affine functions of the iteration space. However, these Fortran techniques do not work as well for programs ....
.... The data dependence analysis for the Id97 compiler is inspired by Fortran loop parallelization techniques [48], [86], [13], [65]. Recent work by the SUIF project has extended this loop parallelization to including loops containing function calls, and loop restructuring for memory locality [6], [38]. The interprocedural analysis necessary for parallelizing Id97 is based on Banning's interprocedural alias analysis [15] and Cooper and Kennedy's advances [19], [20] on Banning's original algorithms. 1.4.5 Related multithreading run time systems The thread representation and scheduling work is a ....
M. W. Hall, S. P. Amarasinghe, B. R. Murphy, S. Liao, and M. S. Lam. Interprocedural Analysis for Parallelization. In Proceedings of Supercomputing '95, December 1995.
....data dependence, control dependence, symbolic constants, and interprocedural information. The data dependence analyzer uses a suite of tests of varying complexity to obtain accurate results quickly. The interprocedural information is computed by an unusually extensive collection of analyses [HMA95]. SUIF structures its interprocedural analyses as a bottom up pass that summarizes the behavior of each subroutine, followed by a top down pass that applies calling contexts to each subroutine's summary description to compute its final analysis result. For optimization, SUIF has a wide assortment ....
....symbolic constant propagation [Pau95]. Parafrase 2, ParaScope, and VFCS [Zim95] perform flow insensitive interprocedural analysis by summarizing where variables are referenced or modified. SUIF's FIAT tool provides a powerful framework for both flow insensitive and flow sensitive analysis [HMA95]. Optimizations A wide range of optimizations is supported by these compilers. Optimizations performed by uniprocessor compilers are termed as traditional. Except Sage, all compilers provide traditional optimizations as listed in Table 2. Notice that SUIF generates native code with two levels ....
M. Hall, B. Murphy, and S. Amarasinghe. Interprocedural analysis for parallelization. In Proceedings of the Eighth Workshop on Languages and Compilers for Parallel Computing, Columbus, OH, August 1995.
.... which examines the SPEC89 and PERFECT benchmark suites using the FIDA system [59]. When considering only those loops containing calls for this set of 16 programs, the SUIF system is able to parallelize greater than five times more of these loops (a comparison with FIDA is presented in detail in [53]). Static loop counts, however, are not good indicators of whether parallelization will be successful. Specifically, parallelizing just one outermost loop can have a profound impact on a program's performance. Dynamic measurements provide much more insight into whether a program may benefit from ....
M. W. Hall, B. R. Murphy, S. P. Amarasinghe, S.-W. Liao, and M. S. Lam. Interprocedural analysis for parallelization. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing. Springer-Verlag, August 1995.
No context found.
Mary W. Hall, Brian R. Murphy, Saman P. Amarasinghe, Shih-Wei Liao, and Monica S. Lam. Interprocedural analysis for parallelization. In Eighth International Workshop on Languages and Compilers for Parallel Computing, August 1995.
....Guru and the visualization. 2.4. Automatic Parallelization The SUIF compiler consists of a large number of parallelization analyses designed to find coarse grain parallelism in a program [51], [52]. Our analysis of array accesses is based on the polyhedral theory of integer programs [7], [56]. Array regions are represented as sets of systems of linear inequalities, and general mathematical algorithms are used to precisely capture the data accesses in a program. All of the parallelization analyses can be applied across whole programs, thus allowing information to be gathered across ....
M. W. Hall, B. R. Murphy, S. P. Amarasinghe, S.-W. Liao, and M. S. Lam. "Interprocedural analysis for parallelization." In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing, Springer-Verlag, August 1995.
.... efficient interprocedural analysis for shared memory machines or compiling for distributed memory machines, but usually not in combination [17]. Others have focused on efficiently solving classic data flow problems interprocedurally [11, 32] or precise interprocedural analyses of array accesses [14, 15, 22, 23, 24, 27]. Several recent distributed memory compilation systems employ integer polyhedra instead of RSDs for greater flexibility [3, 4]. Few other distributed memory compilation systems have discussed interprocedural issues, especially interprocedural optimization. The CM Fortran compiler utilizes ....
M. Hall, B. Murphy, S. Amarasinghe, S. Liao, and M. Lam. Interprocedural analysis for parallelization. In Proceedings of the Eighth Workshop on Languages and Compilers for Parallel Computing, Columbus, OH, August 1995.
....Fiat significantly to support array data flow analysis and flow sensitive analysis. 4 Parallelization Analysis Algorithms This section overviews the parallelization analysis algorithms and describes how the different phases of the analysis fit together. Further description can be found elsewhere [9, 10]. 4.1 Scalar Analysis Our system has interprocedural scalar analysis that encompasses both scalar parallelization analysis and scalar symbolic analysis. For the scalar parallelization analysis, simple flow insensitive analysis interprocedural analysis that does not consider control flow ....
.... have compared our results with the most recent of these empirical studies, which examines the Spec89 and Perfect benchmark suites [14]. When considering only those loops containing calls for this set of 16 programs, the SUIF system is able to parallelize greater than five times more of these loops [9]. The key difference between the two systems is that SUIF contains full interprocedural array analysis, including array privatization and reduction recognition (see Section 5). Static loop counts, however, are not good indicators of whether parallelization will be successful. Specifically, ....
M. Hall, B. Murphy, S. Amarasinghe, S. Liao, and M. Lam. Interprocedural analysis for parallelization. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing. Springer-Verlag, August 1995.
....iteration. If there are no upwards exposed read regions, privatization is safe. Otherwise, privatization is only possible if these upwards exposed read regions are not written by any other iteration. Predicated array data flow analysis extends SUIF's existing array data flow analysis implementation [12]. This analysis computes a four-tuple at each program region, <Read, Exposed, Write, MustWrite>, which are the set of array sections that may be read, may be upwards exposed, may be written and are always written, respectively. Dependence and privatization testing performs comparisons on these ....
Hall, M. W., Murphy, B. R., Amarasinghe, S. P., Liao, S.-W., and Lam, M. S. Interprocedural analysis for parallelization. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing (Columbus, Ohio, August 1995), pp. 61-80.
....to perform this run time testing, rather than pruning the instrumentation arrays using the analysis approach presented here. 3 Background on Parallelization Analysis The system described in this paper augments an existing automatic parallelization system that is part of the Stanford SUIF compiler [8, 9, 10]. The system parallelizes loops whose iterations can be executed in parallel, partitioning the iterations to execute on different processors. To meet this criterion, the memory locations accessed by each iteration of a loop (and thus by each processor) must be independent of locations written by ....
....by the compiler. Further discussion of reductions is omitted in this paper for clarity of presentation. The compiler uses an interprocedural array data flow analysis to determine which loops access independent memory locations, or for which privatization eliminates remaining dependences [10]. The analysis computes data flow values for each program region, where a region is either a basic block, a loop body, a loop, a procedure call, or a procedure body. The data flow value at each region consists of a 4-tuple <Read, Exposed, Write, MustWrite>, with the four components of the tuple ....
[Article contains additional citation context not shown here]
M. W. Hall, B. R. Murphy, S. P. Amarasinghe, S.-W. Liao, and M. S. Lam. Interprocedural analysis for parallelization. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing, pages 61--80, Columbus, Ohio, August 1995.
.... exist to solve them exactly [11]. Even in the presence of some nonlinear subscript expressions, such as the common case of access expressions involving multi-loop induction variables, approaches based on integer linear programming can detect independence with assistance from symbolic analysis [7]. In some cases, accesses to an array variable within a loop must be transformed to make each processor's memory accesses independent. As one example, if all locations read by an iteration are first written within the same iteration, it may be possible to privatize the variable so that each ....
....upwards exposed to the beginning of the iteration and are written within other iterations. The SUIF automatic parallelization system, upon which we base our implementation, performs a single array data flow analysis to test for both independence and privatization, as will be discussed in Section 3 [7]. In the examples in this section, we will show how predicates can refine the compiler's notion of data dependences and upwards exposed reads, producing more precise parallelization results. 2.1 Combining Predicates and Data Flow Values Traditional data flow analysis computes what is called the ....
[Article contains additional citation context not shown here]
M. W. Hall, B. R. Murphy, S. P. Amarasinghe, S.-W. Liao, and M. S. Lam. Interprocedural analysis for parallelization. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing, pages 61--80, Columbus, Ohio, August 1995.
No context found.
M. W. Hall, B. R. Murphy, and S. P. Amarasinghe. Interprocedural Analysis for Parallelization. In Proc. of the 8th Work. on Langs. and Compilers for Parallel Computing, volume 1033 of Lecture Notes in Computer Science, pages 61--80, Columbus, OH, Aug. 1995. Springer-Verlag. 1996.
No context found.
M. W. Hall, B. R. Murphy, and S. P. Amarasinghe, "Interprocedural analysis for parallelization," in Proceedings of the 8th Workshop on Languages and Compilers for Parallel Computing, Columbus, OH, Aug. 1995.
No context found.
M.W. Hall, B.R. Murphy, S.P. Amarasinghe, S.-W. Liao, and M.S. Lam. Interprocedural Analysis for Parallelization. In 8th International Workshop on Languages and Compilers for Parallel Computing, pages 61--80, 1996.