10 citations found.
W. Blume et al. Polaris: Improving the effectiveness of parallelizing compilers. In Languages and Compilers for Parallel Computing, pages 141--154, 1994.

This paper is cited in the following contexts:
Automatic Parallelization for Non-cache Coherent Multiprocessors - Paek, Padua (1996)

....machine with a global memory space and noncoherent caches. 1 Introduction Of the three main classes of today's parallel computers, namely, message passing multiprocessors, cache coherent multiprocessors, and noncoherent cache multiprocessors with a global address space, parallelizing compilers [2, 9, 11, 12] have been extensively studied for only the first two. In this paper, we present a preliminary study on the automatic parallelization of Fortran programs for the third class machine. Our translation algorithms were implemented in the Polaris restructurer [2] which was developed by the authors and ....

....parallelizing compilers [2, 9, 11, 12] have been extensively studied for only the first two. In this paper, we present a preliminary study on the automatic parallelization of Fortran programs for the third class machine. Our translation algorithms were implemented in the Polaris restructurer [2], which was developed by the authors and others at Illinois. Important advances in automatic parallelization for cache coherent multiprocessors have been demonstrated recently with the Polaris restructurer. For the work reported in this paper, we have extended Polaris to generate code for the Cray ....

[Article contains additional citation context not shown here]

B. Blume, et al., Polaris: Improving the Effectiveness of Parallelizing Compilers, Proceedings of the Seventh Workshop on Languages and Compilers for Parallel Computing, OR. Lecture Notes in Computer Science, Aug. 1994, pp. 141-154


The Jrpm System for Dynamically Parallelizing Java Programs - Chen, Olukotun (2003)   (1 citation)

.... 100s of cycles, eliminates any potential speedups for fine grained parallel tasks that either have true dependencies or closely shared cache lines. Modern compilers that perform array dependence analysis can parallelize Fortran like numerical applications on traditional multiprocessors [2][5][20][38]. Unfortunately, numerous challenges have made automatic compiler parallelization of general integer programs difficult. Analyzing pointer aliasing, control flow, irregular array accesses, and dynamic loop limits as well as handling inter procedural analysis complicate static dependence ....

.... for traditional multiprocessor systems as a way to preserve correctness for loops executed in parallel that may have complex dependency patterns [18][36][37][39]. There are numerous commercial and research compilers based on array dependence analysis for parallelizing Fortran programs [2][5][20][38]. Several studies have looked at how these compilers might be applied to general programs [4][25][35]. There is a growing body of related work on TLS compilation addressing different architectures and aspects of the problem. The Multiscalar [43] compiler focuses on compile time ....

Blume, W. et al. Polaris: Improving the Effectiveness of Parallelizing Compilers. In 7th International Workshop on Languages and Compilers for Parallel Computing. Ithaca, NY, August 1994.


High-Level Language Support for User-Defined Reductions - Deitz, Chamberlain, Snyder (2001)

....With this approach, programming is as easy as writing code for a single processor because this is just what the programmer does. The compiler is solely responsible for exploiting parallelism. Traditionally pattern matching and idiom recognition have been used to parallelize reductions [4, 14]. Sophisticated techniques for recognizing broader classes of reductions have also been examined [8, 19]. Commutativity analysis [15] promises to be yet another effective technique. However, it is an undecidable problem to determine whether a function is associative [10]. Moreover, even if a ....

W. Blume, R. Eigenmann, K. Faigin, J. Grout, J. Hoeflinger, D. Padua, P. Petersen, W. Pottenger, L. Rauchwerger, P. Tu, and S. Weatherford. Polaris: Improving the effectiveness of parallelizing compilers. In Proceedings of the Workshop on Languages and Compilers for Parallel Computing, 1994.


Running Parallel Applications on an MP With.. - Krishnan, Zhang..

....time steps (iterations) were reduced to let the simulations complete faster. The Splash applications are explicitly (written by user) parallel programs written in C and use ANL m4 macros for parallel constructs. mxm, emit, fft and cholesky from NASA7 are written in FORTRAN77. We used Polaris [Bea94], an automatic parallelizing compiler, to identify parallel sections of code. Yet another compiler pass was added to automatically generate parallel code using ANL macros to run on the TangoLite environment. These represent the other spectrum of explicit parallel programming where no modification ....

William Blume et al. Polaris: Improving the effectiveness of parallelizing compilers. In Proceedings of the 7th Workshop on Languages and Compilers for Parallel Computing, 1994.


Fusion of Loops for Parallelism and Locality - Manjikian, Abdelrahman (1995)   (12 citations)

....transformation was applied, as well as the length of the longest sequence and the maximum shift peel amounts for any sequence. The loop nests of interest are transformed automatically with a prototype compiler implementation of the shift and peel transformation which is based on Polaris [7], a robust compiler framework from the University of Illinois at Urbana Champaign. Accurate dependence distances are obtained with the Omega test [27] which are then used in the algorithm of Figure 8 to derive the required shifting and peeling. Fusion is then performed using the code manipulation ....

B. Blume, R. Eigenmann, K. Faigin, J. Grout, J. Hoeflinger, D. Padua, P. Petersen, B. Pottenger, L. Rauchwerger, P. Tu, and S. Weatherford. Polaris: Improving the effectiveness of parallelizing compilers. In K. Pingali et al., editors, Languages and Compilers for Parallel Computing, pages 141--154. Springer-Verlag, Berlin, 1995.


Automatic Data and Computation Partitioning on Scalable.. - Tandri, Abdelrahman (1996)   (4 citations)

.... University of Toronto Hector [3] and NUMAchine [4], the KSR1 [5], and the Convex Exemplar [6]. Automatic parallelization of scientific applications on bus based shared memory multiprocessors has been mainly concerned with the detection of parallelism and the scheduling of parallel loop iterations [7, 8]. Hence, it is not surprising that on SSMMs issues related to data placement have been ignored by compilers and delegated to the operating system as part of page management. Policies, such as first hit and round robin place pages in the physically distributed shared memory as these pages are ....

....system policies. We implemented the algorithm in a prototype compiler called Jasmine, which is developed at the University of Toronto. It consists of four major phases: parallelism detection, cache locality enhancement, memory locality enhancement and code generation. We use the Polaris compiler [8] from the University of Illinois as well as the KAP preprocessor from Kuck and Associates for the parallelism detection phase. The algorithm described in this paper constitutes the memory locality enhancement phase. The compiler is used to automatically determine data and computation partitions ....

W. Blume et al. Polaris: Improving the effectiveness of parallelizing compilers. In David Gelernter, Alexandru Nicolau, and David Padua, editors, Languages and Compilers for Parallel Computing, pages 141--154. Springer-Verlag, 1994.


A Heterogeneous Hierarchical Solution to Cost-efficient High.. - Miled, Fortes (1996)

....locality. 3.1. Instruction Temporal Locality with Respect to the Degree of Parallelism The purpose of the experiments presented here is to quantify and analyze instruction temporal locality with respect to parallelism. For each of the selected benchmarks, parallelism is detected by Polaris [5]. The programs are then instrumented to output the value of ADP for each window. Each window is then classified as scalar or parallel for values of MDP of 10, 100 and 1000. The collection of all windows constitutes the entire dynamic instruction trace of the program. The size of each window is the ....

Blume, W., Eigenmann, R., et al. Polaris: Improving the effectiveness of parallelizing compilers. Tech. Rep. UIUCCSRD -1405, CSRD, Univ. of Illinois, 1995.


Non-Linear and Symbolic Data Dependence Testing - Blume, Eigenmann (1998)   (7 citations)  Self-citation (Blume Eigenmann)

....induction variable substitution, and reduction recognition. Because of this, dependence arcs from reductions, induction variables, and private arrays and scalars have already been eliminated when the Range and Omega Tests were executed. Details of these advanced techniques can be found in [11, 10]. From Table 2, we can see that there are cases where the Range Test does better, and cases where the Omega Test does better. This should not be surprising, because the Omega Test has difficulties with non affine expressions while the Range Test was designed to handle such cases. On the other ....

....Test can prove independence for many of the other parallel loops that contain symbolic nonlinear array subscript expressions. We have implemented the Range Test together with a symbolic range propagation algorithm in Polaris, a parallelizing compiler being developed at the University of Illinois [19, 10]. Currently, the Range Test is the only data dependence test implemented in Polaris. To determine its effectiveness, we have run it through an initial compiler test suite, which consists of half of the codes of the Perfect Benchmarks plus other applications gathered from users of high performance ....

[Article contains additional citation context not shown here]

William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, Paul Petersen, Bill Pottenger, Lawrence Rauchwerger, Peng Tu, and Stephen Weatherford. Polaris: Improving the Effectiveness of Parallelizing Compilers. Lecture Notes in Computer Science, 892: Languages and Compilers for Parallel Computing, pages 141--154, 1995.


Automatic Partitioning of Data and Computations - Sudarsan Tandri Ibm

No context found.

W. Blume et al. Polaris: Improving the effectiveness of parallelizing compilers. In Languages and Compilers for Parallel Computing, pages 141--154, 1994.

CiteSeer.IST - Copyright Penn State and NEC