19 citations found.
A. Krishnamurthy and K. A. Yelick. Optimizing parallel programs with explicit synchronization. In SIGPLAN Conference on Programming Language Design and Implementation, pages 196--204, 1995.


This paper is cited in the following contexts:
Compiler Optimization Techniques for OpenMP Programs - Satoh, Kusano, Sato

....by Srinivasan et al. [25], Novillo et al. [16], and Lee et al. [15] present optimization algorithms using concurrent SSA form. Collard [3] proposes array SSA form for parallel programs. Knoop and Steffen [10] present a code motion algorithm for cobegin coend parallel programs. Krishnamurthy and Yelick [13] propose communication optimization for SPMD programs with shared variables. Pugh and Rosser [17] present communication optimization within a single parallel loop. The former work on analyses and optimizations for explicit parallel programs deals with parallel programs whose parallelization ....

A.Krishnamurthy and K.Yelick. Optimizing Parallel Programs with Explicit Synchronization. PLDI'95, pp. 196-204, 1995.


Using Information from the Programmer to Implement Shared-Memory.. - Adve (1998)   (3 citations)

.... of different memory operations can be disambiguated (i.e. the accuracy of the conflict relation). Krishnamurthy and Yelick subsequently performed a more substantial modification to the above algorithm to exploit knowledge about synchronization to eliminate certain spurious critical cycles [27]. Similar to the work on programmer-centric models, their modification requires programmers to use explicit synchronization constructs recognizable by the system, and makes certain assumptions about the behavior of synchronization operations (e.g. only one post per event variable for post wait ....

A. Krishnamurthy and K. Yelick. Optimizing Parallel Programs with Explicit Synchronization. In Proc. SIGPLAN Conf. on Programming Language Design and Implementation, July 1995.


Compiler Optimization Techniques for OpenMP Programs - Satoh, Kusano, Sato (2000)

....et al. [25], Novillo et al. [16], and Lee et al. [15] presented optimization algorithms using concurrent SSA form. Collard [3] proposed array SSA form for parallel programs. Knoop and Steffen [10] presented a code motion algorithm for cobegin coend parallel programs. Krishnamurthy and Yelick [13] proposed communication optimization for SPMD programs with shared variables. Pugh and Rosser [17] presented communication optimization within a single parallel loop. Previous work on analyses and optimizations for explicit parallel programs dealt with programs whose parallelization constructs or ....

A.Krishnamurthy and K.Yelick. Optimizing Parallel Programs with Explicit Synchronization. PLDI'95, pp. 196-204, 1995.


G. Research - Computer Science Research

....and points as first-class objects to ease programming [49] and compiler analysis. • Language (rather than library) constructs for synchronization, for which the compiler provides static analysis to avoid certain synchronization errors [1]. Synchronization analysis also aids in optimization [57]. • A looping construct that specifies a set of legal sequential executions, thereby enabling loop optimizations without traditional dependence analysis. A release for the Titanium compiler is planned for the end of 1997 and will target the UCB NOW platform, Cray T3E and shared memory ....

....overlap memory operations (e.g. prefetching and non-blocking writes), keep cached copies, and other general memory hierarchy optimizations. We have used information about independence on more regular problems, for example, to optimize communication on distributed memory multiprocessors [57]. We will adapt the compiler analysis to unstructured problems by including unstructured and partitioned unstructured types in the language. The optimizations for independent subgraphs are analogous to those allowed on unions of disjoint rectangular domains, which appear in AMR, but the ....

[Article contains additional citation context not shown here]

A. Krishnamurthy and K. Yelick. Optimizing parallel programs with explicit synchronization. In Programming Language Design and Implementation, June 1995.


Optimizing High Performance Software Libraries - Guyer, Lin (2000)   (2 citations)

....perform many kinds of dataflow analysis to determine how the program manipulates data. For example, constant propagation determines whether or not variables have constant values at compile time. Unfortunately, most send optimization, require analyses that can understand parallel program semantics [27, 26]. 5 programming languages have no notion of matrix, let alone matrix distributions. Thus, in order to perform this kind of optimization, we need to tell the compiler about the relevant abstractions and facilitate program analysis in those terms. PLAPACK itself is implemented using MPI, which ....

Arvind Krishnamurthy and Katherine Yelick. Optimizing parallel programs with explicit synchronization. In Programming Language Design and Implementation, June 1995.


An Evaluation of Memory Consistency Models for.. - Parthasarathy..

....Work Compiler optimizations. In choosing a consistency model, the hardware designer must consider both system performance and programmability. The techniques of this paper address hardware performance. However, though there has been some work on compiler optimizations with sequential consistency [36, 20, 21], SC at the application programming level also restricts compiler optimizations. To avoid these restrictions, it is likely that high performance compilers will expose a release consistent model to the applications programmer. If compilers mandate RC, then the improved performance and lower ....

A. Krishnamurthy and K. Yelick. Optimizing Parallel Programs with Explicit Synchronization. In Proceedings SIGPLAN Conference on Programming Language Design and Implementation, July 1995.


Communication Optimizations for Parallel C Programs - Yingchun Zhu And (1998)   (11 citations)

....distance remains invariant across several calls to the function, and all the field accesses with respect to this pointer can be placed before the first call by interprocedural partial redundancy elimination. Currently, we achieve this effect via function inlining. Krishnamurthy and Yelick [15] present communication optimizations in the context of compiling explicitly parallel programs to Split-C programs. Their optimizations include message pipelining by converting remote reads/writes into their split-phase analogues, eliminating acknowledgement traffic, and reusing values from remote ....

Arvind Krishnamurthy and Katherine Yelick. Optimizing parallel programs with explicit synchronization. In Proc. of the ACM SIGPLAN '95 Conf. on Programming Language Design and Implementation, pages 196--204, La Jolla, Calif., Jun. 1995.


Compiling For Multithreaded Architectures - Tang (1999)   (1 citation)

....the optimization impact on threaded programs, especially the heap-based optimization transformations we studied in this thesis. Four of the most related works to ours are: 1. Krishnamurthy and Yelick's work on using explicit synchronization provided by the programmer to optimize Split-C programs [76]; 2. Roh et al.'s work on evaluating the impact of optimizations on threaded code generation [102]; 3. Sohn et al.'s work on identifying the capability of overlapping computation with communication [119]; and 4. Zhu and Hendren's work on reducing the number of communication messages by ....

Arvind Krishnamurthy and Katherine Yelick. Optimizing parallel programs with explicit synchronization. In Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 196--204, La Jolla, California, June 18--21, 1995. SIGPLAN Notices, 30(6), June 1995.


Barrier Inference - Gay (1998)   (13 citations)

....and handling language features such as exception handling. We are also working on an algorithm that uses the results of our inference system to represent the portions of the code that may be executing simultaneously so that SPMD optimisations, such as those proposed by Krishnamurthy and Yelick [14], may be more precise. ....

Arvind Krishnamurthy and Katherine Yelick. Optimizing Parallel Programs with Explicit Synchronization. In ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 196--204, New York, NY, USA, June 1995. ACM Press.


Communication Optimizations for Parallel C Programs - Hendren (1998)   (11 citations)

....function distance remains invariant across several calls to the function, and all the field accesses with respect to this pointer can be placed before the first call by interprocedural partial redundancy elimination. Currently, we achieve this effect via function inlining. Krishnamurthy and Yelick [15] present communication optimizations in the context of compiling explicitly parallel programs to Split-C programs. Their optimizations include message pipelining by converting remote reads/writes into their split-phase analogues, eliminating acknowledgement traffic, and reusing values from remote ....

Arvind Krishnamurthy and Katherine Yelick. Optimizing parallel programs with explicit synchronization. In Proc. of the ACM SIGPLAN '95 Conf. on Programming Language Design and Implementation , pages 196--204.


Optimizing COOP Languages: Study of a Protein Dynamics.. - Zhang, Karamcheti, Ng.. (1996)

....transformations here. General compiler-directed node-level reuse and communication grouping for COOP programs is still an open problem, because the programming model supports a shared namespace in the presence of inherently concurrent threads, requiring additional concurrency [28] and access cycle [26] information. In addition, the pervasive use of heap-based data structures also complicates dataflow analysis of object access pattern. However, the non-recursive data structures, used in IC CEDAR and similar applications, are amenable to high-level data structure descriptions [34, 16] based on ....

Arvind Krishnamurthy and Katherine Yelick. Optimizing parallel programs with explicit synchronization. In Proceedings of the 1995 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 196--204, June 1995.


Recent Advances in Memory Consistency Models for Hardware .. - Adve, Pai, Ranganathan (1999)   (6 citations)

....the effect of certain optimizations using their algorithm, on array-based Split-C programs run on a CM-5 (message passing) multiprocessor. They found reductions in execution time of 20 to 50%. They further improved their algorithm to reduce the impact of bidirectional conflict order edges [22], using some information from the programmer. This improvement reduced the number of critical cycles for which program order needs to be enforced, and gave additional reductions in execution time of 20 to 35% for Split-C programs on a CM-5 multiprocessor. It is not yet known how effective the ....

A. Krishnamurthy and K. Yelick, "Optimizing Parallel Programs with Explicit Synchronization," in Proc. SIGPLAN Conf. on Programming Language Design and Implementation, July 1995.


Barrier Inference - Aiken, Gay (1998)   (13 citations)

....in SPMD programs. In our experience, this extra information is extremely useful for understanding SPMD programs written by others. Barrier inference also gives the compiler a more precise understanding of the portions of the program that execute in parallel, which makes SPMD optimizations, e.g. [14], more precise. There are structurally correct programs our system cannot verify, such as Figure 1h. Intuitively, the problem with this example is that although both branches execute the same number of barriers, our system only infers that the branches each execute some unknown number of barriers ....

A. Krishnamurthy and K. Yelick. Optimizing Parallel Programs with Explicit Synchronization. In ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 196--204, New York, NY, USA, June 1995. ACM Press.


Using Information from the Programmer to Implement System.. - Adve (1996)   (3 citations)

....non-conflicting operations not associated with each other can never be on a critical edge, and can be reordered. The following describes various intermediate mechanisms that are practically easier to implement. The Split-C mechanism. Split-C allows a counter to be associated with every operation [19, 37, 38]. It further defines a sync instruction that is also associated with a counter, and waits for all preceding (by program order) operations of its processor associated with the same counter to complete. This provides a conservative mechanism to associate a sender with operations it sends (by ....

....final acknowledgement to the writing processor to inform it of the completion of the write. We consider eliminating acknowledgements for some writes. This optimization may be important for bandwidth-limited systems, or for systems that support shared memory and process acknowledgements in software [19, 22, 38]. Acknowledgements are needed only to indicate the completion of a write operation, which is required only if some later operation will wait for this write. Thus, to determine when acknowledgements are not required, we need to characterize those writes for which no other operations need to wait ....

[Article contains additional citation context not shown here]

A. Krishnamurthy and K. Yelick. Optimizing Parallel Programs with Explicit Synchronization. In Proc. SIGPLAN Conf. on Programming Language Design and Implementation, July 1995.
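
The counter mechanism described in the first Adve (1996) excerpt above can be pictured with a small toy model. This is only a sketch under assumptions: the class name, method names, and counter labels are hypothetical and are not Split-C syntax, and "completion" is simulated by moving operations out of a pending list rather than by real communication.

```python
from collections import defaultdict


class SplitPhaseProcessor:
    """Toy model (hypothetical) of one processor issuing counter-tagged operations."""

    def __init__(self):
        # counter label -> operations issued but not yet known to be complete
        self.pending = defaultdict(list)
        self.completed = []

    def issue(self, counter, op):
        """Start an operation without waiting for it; record it under `counter`."""
        self.pending[counter].append(op)

    def sync(self, counter):
        """Wait for every earlier operation tagged with `counter`.

        Operations tagged with other counters remain outstanding.
        """
        finished = self.pending.pop(counter, [])
        self.completed.extend(finished)
        return finished


def demo():
    p = SplitPhaseProcessor()
    p.issue("c0", "write x")
    p.issue("c1", "write y")
    p.issue("c0", "read z")
    done = p.sync("c0")          # waits only for the two "c0" operations
    return done, dict(p.pending)
```

In this model, `demo()` completes the two operations tagged `c0` and leaves `write y` outstanding under `c1`, which is the conservative per-counter waiting the excerpt describes.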


Efficient Resource Scheduling in Multiprocessors - Chakrabarti (1996)   (1 citation)

....the programmer's view of consistency of program variables. An early result in this direction finds cycles of conflicting reads and writes and inserts minimal synchronization to break all cycles [102]. This technique has recently been improved for single program multiple data (SPMD) sources [82]. Another related area is instruction scheduling, which has been intensively studied [94]. The performance impact of these scheduling algorithms has grown with the advent of superscalar RISC architectures which have multiple functional units permitting out-of-order execution. 6.2 Multiprocessor ....

A. Krishnamurthy and K. Yelick. Optimizing parallel programs with explicit synchronization. In Programming Language Design and Implementation (PLDI), La Jolla, CA, June 1995. ACM.
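
The cycle-finding idea mentioned in the Chakrabarti (1996) excerpt above can be illustrated with a deliberately simplified sketch. It is not the algorithm of the cited papers: it ignores the minimality conditions on critical cycles and does not compute a minimal synchronization set. It only flags program-order edges that lie on some cycle of program-order and conflict edges. The function name and the access labels are hypothetical.

```python
from collections import defaultdict, deque


def delay_edges(program_edges, conflict_edges):
    """Return the program-order edges (u, v) that lie on a mixed cycle.

    program_edges: directed (earlier, later) pairs within one processor.
    conflict_edges: unordered pairs of accesses to the same variable on
    different processors, at least one of which is a write.
    """
    succ = defaultdict(set)
    for u, v in program_edges:
        succ[u].add(v)           # program order is directed
    for a, b in conflict_edges:
        succ[a].add(b)           # a conflict can be traversed either way
        succ[b].add(a)

    def reaches(src, dst):
        # Breadth-first search over the mixed graph.
        seen, queue = {src}, deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                return True
            for nxt in succ[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False

    # (u, v) is on a cycle exactly when there is a path back from v to u.
    return [(u, v) for u, v in program_edges if reaches(v, u)]
```

For example, with one processor doing `w_x` then `r_y` and another doing `w_y` then `r_x`, both program-order edges are flagged, so neither pair could be reordered in this simplified view.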


Heap Analysis and Optimizations for Threaded Programs - Tang, Ghiya, Hendren, Gao (1997)   (2 citations)

....impact on threaded programs, especially the heap-based optimization transformations we studied in this paper. Three of the most related works to this paper are: 1) Krishnamurthy and Yelick's work on using explicit synchronization provided by the programmer to optimize Split-C programs [11]; 2) Roh et al.'s work on evaluating the impact of optimizations on threaded code generation [16]; and 3) Sohn et al.'s work on identifying the capability of overlapping computation with communication [20]. Krishnamurthy and Yelick use dataflow analysis for synchronization, barriers, and locks ....

Arvind Krishnamurthy and Katherine Yelick. Optimizing parallel programs with explicit synchronization. In Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 196--204, La Jolla, California, June 18--21, 1995. SIGPLAN Notices, 30(6), June 1995.


Heap Analysis and Optimizations for Threaded Programs - Tang, Ghiya, Hendren, Gao (1997)   (2 citations)

....impact on threaded programs, especially the heap-based optimization transformations we studied in this paper. Three of the most related works to this paper are: 1) Krishnamurthy and Yelick's work on using explicit synchronization provided by the programmer to optimize Split-C programs [12]; 2) Roh et al.'s work on evaluating the impact of optimizations on threaded code generation [17]; and 3) Sohn et al.'s work on identifying the capability of overlapping computation with communication [21]. Krishnamurthy and Yelick use the dataflow analysis for synchronization, barriers, and ....

A. Krishnamurthy and K. Yelick. Optimizing parallel programs with explicit synchronization. In Proc. of SIGPLAN PLDI '95, pages 196--204, La Jolla, Calif., Jun. 1995.


Polynomial-time Algorithms for Enforcing Sequential - Consistency In Spmd   Self-citation (Krishnamurthy Yelick)

No context found.

A. Krishnamurthy and K. A. Yelick. Optimizing parallel programs with explicit synchronization. In SIGPLAN Conference on Programming Language Design and Implementation, pages 196--204, 1995.


