A. Krishnamurthy and K. Yelick. Analyses and optimizations for shared address space programs. Journal of Parallel and Distributed Computing, 38:130--144, 1996.
....to safely optimize multithreaded programs even in the presence of accesses to shared data [91, 87, 57, 62, 56, 64]. The presence of multithreading may also inspire optimizations with no obvious counterpart in the optimization of sequential programs. Examples include communication optimizations [59, 100], optimizing mutual exclusion synchronization [30, 31, 79, 3, 98, 11, 13, 21, 82], and optimizing barrier synchronization [96]. A more conservative approach is to ensure that the optimizations preserve the semantics of the original program by first identifying regions of the program that do not ....
.... other threads potentially observe) accesses to shared data [78]. In this context, the alternative to a weak consistency model is to disable these optimizations unless the compiler performs the global analysis required to determine that parallel threads do not observe the reordered memory accesses [59, 64]. Requiring the extraction of this kind of global information as part of the standard compilation process is clearly problematic, primarily because it rules out optimized separate compilation. Another approach is to develop analyses and transformations that restore the abstraction of a single ....
A. Krishnamurthy and K. Yelick. Analyses and optimizations for shared address space programs. Journal of Parallel and Distributed Computing, 38(2), Nov. 1996.
....calls. Perhaps an even better possibility could be to use our SWCC as a tool for an optimizing compiler. Optimizations have been created for the Split-C compiler which will analyze a program to automatically cache global variables in local memory when the compiler detects repeated accesses [6]. However, the ability to evaluate the program at compile time is limited, and our software cache coherence library could be used as a fallback when the compiler isn't able to manage the caching statically. Ideally, the compiler would even determine which variables have a possibility of benefiting ....
A. Krishnamurthy and K. Yelick. Analyses and Optimizations for Shared Address Space Programs. Journal of Parallel and Distributed Computing, 38(2):130--144, 1996.
....of global program behavior. Because these latter protocols minimize remote processor involvement, they can take advantage of capabilities such as remote memory access being encountered in high-performance networks like Myrinet and VIA. Our compiler infrastructure extends interprocedural concurrency [11, 9] and side-effect [5] analyses for SPMD programs previously reported in literature. These analyses, optionally augmented with run-time profiling information, infer the minimal coherence requirements for each view access. Specifically, three kinds of analyses are performed (see Figure 2): 1. ....
A. Krishnamurthy and K. Yelick. Analyses and optimizations for shared address space programs. Journal of Parallel and Distributed Computing, 1996.
....L [12]. Mutex bodies are defined in terms of lock-protected nodes. In general, a mutex body B_L(N) for lock variable L is a multiple-entry, multiple-exit region of the graph that encompasses all the flowgraph nodes that are protected by a common set of lock nodes (N). In contrast, previous work [8, 11] has treated mutex bodies as single-entry, single-exit regions. A mutex structure for a lock variable L is the set of all the mutex bodies for L in the program. 4 Lock Independent Code Motion. Lock Independent Code Motion (LICM) is a code motion technique that attempts to minimize the amount of ....
A. Krishnamurthy and K. Yelick. Analyses and Optimizations for Shared Address Space Programs. Journal of Parallel and Distributed Computing, 38:130--144, 1996.
....equivalent. Furthermore, if the analysis determines that statement s is sometimes protected and sometimes not, this information could be used to warn the user about an anomalous locking pattern. Existing work on mutual exclusion synchronization is based on a structural definition of mutex bodies [2, 4, 6]. A mutex body is indicated by a pair of lock and unlock nodes. All the graph nodes dominated by the lock node and post-dominated by the unlock node are part of the mutex body. Although correct, this notion of mutex body fails to identify some valid locking patterns present in some programs. For ....
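The structural definition quoted above (nodes dominated by the lock node and post-dominated by the unlock node) can be illustrated with a minimal Python sketch. This is not code from any of the cited papers: the function names, the dictionary-based flowgraph encoding, and the example graph `cfg` are all assumptions made for illustration, and the simple iterative dominator computation stands in for whatever algorithm a real compiler would use.

```python
def dominators(succ, entry):
    """Iteratively compute dom[n], the set of nodes dominating n.

    succ maps every node to a list of its successors (assumed encoding);
    every node must appear as a key.
    """
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, outs in succ.items():
        for m in outs:
            pred[m].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in pred[n]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            new = {n} | common
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def reverse(succ):
    """Return the flowgraph with every edge reversed."""
    rev = {n: [] for n in succ}
    for n, outs in succ.items():
        for m in outs:
            rev[m].append(n)
    return rev


def structural_mutex_body(succ, entry, exit_node, lock, unlock):
    """Nodes dominated by `lock` and post-dominated by `unlock`.

    Post-dominators are dominators of the reversed graph rooted at the exit.
    The lock and unlock nodes themselves are included in the result.
    """
    dom = dominators(succ, entry)
    pdom = dominators(reverse(succ), exit_node)
    return {n for n in succ if lock in dom[n] and unlock in pdom[n]}


# Hypothetical flowgraph: a branch between the lock and unlock nodes.
cfg = {
    "entry": ["lock"],
    "lock": ["a"],
    "a": ["b", "c"],
    "b": ["unlock"],
    "c": ["unlock"],
    "unlock": ["exit"],
    "exit": [],
}
body = structural_mutex_body(cfg, "entry", "exit", "lock", "unlock")
```

In this example `body` contains the lock node, the unlock node, and the three nodes between them. A region entered through more than one lock node would not be captured by this single-pair check, which is the limitation the excerpt goes on to describe.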
A. Krishnamurthy and K. Yelick. Analyses and Optimizations for Shared Address Space Programs. J. Parallel and Distributed Computing, 38:130--144, 1996.
....We can estimate the cost of wide references by computing the average time required per edge when all data is stored in the local memory region. In Table 1, we present times collected on a Thinking Machines CM-5 and partial times collected on a Cray T3D. These findings were originally presented in [15] and [24], respectively. The benchmark reveals that the performance cost of using wide references for local data can be profound. Even when the code for reading and writing through wide references is inlined, the CM-5 shows nearly a 75% slowdown compared with simple pointers. This is largely due ....
Arvind Krishnamurthy. Analyses and Optimizations for Shared Address Space Programs. Ph.D. qualifying examination talk, November 1995.
.... detection that allows reordering of memory references in a program to increase concurrency while maintaining the sequential consistency dictated by the code [20]. Krishnamurthy and Yelick extended cycle detection analysis to incorporate additional information from synchronization in the program [13]. Although their work supports post-wait, barrier, and mutual exclusion synchronization, they only focus on optimizing remote memory references on a specific class of explicitly parallel programs. Grunwald and Srinivasan developed data-flow equations to compute reaching definition information on ....
A. Krishnamurthy and K. Yelick. Analyses and optimizations for shared address space programs. Journal of Parallel and Distributed Computing, 1996.
....for one of the weaker consistency models, such as processor consistency or release consistency. Krishnamurthy and Yelick have shown that languages can provide the stronger model of sequential consistency through static analysis, even when the languages execute on hardware with weaker semantics [7]. We are exploring the use of this analysis, which requires good aliasing and synchronization information, in the context of Titanium, but our current consistency model does not rely on such analysis. Instead, we adopt the Java consistency model, which is weakly consistent at the program level, ....