97 citations found.
Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the OSDI'96, pp. 75--88, Oct 1996.


This paper is cited in the following contexts:


Model Checking a Cache Coherence Protocol for a Java.. - W, Fokkink, Hofman..   (Correct)

....processors all store this copy at the same virtual address. The protocol is based on self invalidation, which means the cached copy of a region remains valid until the thread itself invalidates the copy, which occurs whenever it reaches a synchronization point. Jackal combines features of HLRC [28] and TreadMarks [14] As in HLRC, modifications are flushed to a home node; as in TreadMarks, twinning and diffing are used to allow concurrent writes to shared data. Unlike TreadMarks, Jackal uses software access checks inserted before each object usage to detect non local or stable data. The ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release-consistency protocols for shared virtual memory systems. In Proceedings 2nd USENIX Symposium on Operating Systems Design and Implementation, pages 75--88, 1996.


High-Performance Networking for Software DSMs - Rodrigo Weber Dos   (Correct)

....becomes inappropriate, as it requires communicating processors to synchronize at the communication point. The model should also facilitate buffer management for asynchronous messages if the request-reply behavior is to be relaxed as in page based software DSMs such as ADSM [12], AEC [16], or HLRC [17], where some form of update coherence is applied. This restriction makes the remote memory write model (VMMC2) inadequate, since neither senders nor receivers can control the receive buffer usage. In order to avoid overwriting messages in this model, a large amount of buffer space must be coupled ....

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proceedings of the 2nd OSDI, October 1996.


CC-MPI: A Compiled Communication Capable MPI Prototype.. - Karwande, Yuan..   (Correct)

....12.75s CLASS=A MPICH 111.91s 46.71s 28.01s CC-MPI 40.19s 21.34s 11.23s Table 5: Execution time for FT with different MPI libraries, different number of nodes and different problem sizes. SDSM is built within the Filaments package [22] and uses an eager version of home based release consistency [33]. We tested the potential of using Level 1 compiled communication for MPI_Alltoallv to implement exchange of page information through a synthetic application that first modifies a set number of pages on each node and then invokes a barrier. The barrier causes all pages to be made consistent ....

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symp. on Operating Systems Design and Implementation (OSDI'96), pages 75--88, 1996.


Dynamic Data Replication: an Approach to Providing.. - Christodoulopoulou, .. (2003)   (1 citation)  (Correct)

....main memory without interrupting the remote host processor. VMMC also tolerates transient network errors by using packet retransmission, and guarantees FIFO message delivery. 3.2. Original SVM protocol The original SVM protocol, GeNIMA [5] is based on home based lazy release consistency (HLRC) [34] and is designed to take advantage of a number of architectural features in modern clusters and system area networks. In order to comply with the partial order requirements of LRC for shared memory accesses [18] the application execution of each processor on each node is partitioned into time ....

....on each node is partitioned into time intervals that are delimited by consecutive release operations executed by threads on the same SMP. During each time interval all local page updates are recorded into a common update list. Shared pages in GeNIMA are assigned a home node according to HLRC [34], to which writers send their page updates eagerly, upon a release. Nodes propagate page updates in the form of diffs, which consist of the modifications applied to the version of the page before its first write (also called the twin) Diffs address the problem of false sharing as they allow ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proc. of the 2nd Symp. on Operating Systems Design and Implementation, pages 75--88, 1996.


Efficient Runtime Support for Cluster-Based Distributed Shared.. - Speight (1997)   (3 citations)  (Correct)

....The final adaptive protocol used in Brazos is an adaptive page management mechanism. Some previous studies have shown that distributed page based systems outperform home based page systems [27] while yet other studies have argued that home based page systems offer superior performance [53]. We have elected to provide an adaptive mechanism whereby those pages better suited to a home based management technique can switch from the default distributed algorithm. To determine when a page should change page management mechanisms, processes continually monitor the size of the diffs being ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Second USENIX Symposium on Operating System Design and Implementation, pages 75--88, November 1996.


Runtime Optimizations for a Java DSM Implementation - Veldema, Hofman, Bhoedjang.. (2001)   (8 citations)  (Correct)

....runtime system. If a processor runs out of free physical memory, it initiates a global garbage collection that frees both Java objects and physical memory pages. 3. 3 Coherence Protocol and Access Checks Jackal employs an invalidation based, multiple writer protocol that combines features of HLRC [29] and TreadMarks [17] As in HLRC, modifications are flushed to a home node; as in TreadMarks, twinning and diffing is used to allow concurrent writes to shared data. Unlike TreadMarks, Jackal uses software access checks inserted before each object array usage to detect non local and stale data. ....

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release-Consistency Protocols for Shared Virtual Memory Systems. In 2nd USENIX Symp. on Operating Systems Design and Implementation, pages 75--88, Seattle, WA, October 1996.


Coherence-Centric Logging and Recovery for Home-Based.. - Kongmunvattana, Tzeng   (Correct)

....the probability of system failures increases as the system size grows. This existence of vulnerability is unacceptable, especially for long running applications and high availability situations. Hence, a mechanism for supporting fast crash recovery in SDSM is indispensable. Home based SDSM [18] is one type of SDSM developed under the notion of relaxed memory consistency model [1] While it relies on a virtual memory trap as other SDSM systems [2, 5] home based SDSM assigns a home node for each shared memory page to collect updates from all writers of that page. This home node offers ....

....fault nor requires any summary of write modifications, ii) it takes only one round trip message to bring a remote copy of any shared memory page up to date, and (iii) no garbage collection is needed. Due to these advantages, the home based SDSM protocol has been a focus of several recent studies [7, 10, 14, 18]. Unfortunately, no prior work has ever been attempted on crash recovery in home based SDSM. This paper is the very first one to deal with crash recovery in such SDSM. Message logging is a popular technique for providing home less SDSM with fault tolerant capability [6, 11, 12, 13, 17] This ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. In Proc. of the 2nd USENIX Symp. on Operating Systems Design and Implementation (OSDI), pages 75--88, October 1996.


Runtime Optimizations for a Java DSM Implementation - Veldema, Hofman, Bhoedjang.. (2001)   (8 citations)  (Correct)

....runtime system. If a processor runs out of free physical memory, it initiates a global garbage collection that frees both Java objects and physical memory pages. 3. 3 Coherence Protocol and Access Checks Jackal employs an invalidation based, multiple writer protocol that combines features of HLRC [26] and TreadMarks [15] As in HLRC, modifications are flushed to a home node; as in TreadMarks, twinning and diffing is used to allow concurrent writes to shared data. Unlike TreadMarks, Jackal uses software access checks inserted before each object array usage to detect non local and stale data. ....

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release-Consistency Protocols for Shared Virtual Memory Systems. In 2nd USENIX Symp. on Operating Systems Design and Implementation, pages 75--88, Seattle, WA, October 1996.


Dynamic Data Replication for Tolerating Single Node.. - Christodoulopoulou..   (Correct)

....but also enhances scalability because it enables computing nodes to be used as backup nodes while performing useful computation. We use as our starting point the GeNIMA shared virtual memory system [4] that has been optimized for this type of interconnects. GeNIMA is a home based protocol [26, 19] that was shown to provide scalable performance to the 64 processor level [9] We extend GeNIMA by introducing additional operations to guarantee consistency in the presence of failures. GeNIMA, as well as our extensions use low overhead direct remote operations (read, write) provided by ....

....safely given that the execution replay starts after the last release performed locally and there has been no other release until the point of failure. 3 Original SVM Protocol The original shared virtual memory protocol, GeNIMA [4] is based on home based lazy release consistency (HLRC) [26] and is designed to take advantage of a number of architectural features in modern clusters and system area networks. In order to comply with the partial order requirements of LRC for shared memory accesses [10] the application execution of each processor on each node is partitioned into time ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared memory virtual memory systems. In Proc. of the 2nd Symp. on Operating Systems Design and Implementation (OSDI'96), pages 75--88, 1996.


CableS : Thread Control and Memory System Extensions for.. - Jamieson, Bilas (2001)   (Correct)

....targeting both scientific and commercial applications. Shared memory clusters are an attractive approach to providing affordable and scalable compute cycles and I/O. For this reason, there has recently been a lot of work on designing efficient shared virtual memory (SVM) protocols for such clusters [23,16,26,13]. These protocols take advantage of features provided by SANs, such as low latencies for short messages and direct remote memory operations with no remote processor intervention [12,10,9] to improve system performance and scalability [16]. Providing a shared memory programming abstraction on ....

.... improve system performance and scalability [16]. Providing a shared memory programming abstraction on clusters has made it easier to run applications that have been written for more traditional, tightly coupled multiprocessors (both shared bus and distributed shared memory machines). Recent work [23,16,26] has shown that the performance of SVM clusters is competitive for wide ranges of applications to more traditional, tightly coupled multiprocessors. For instance, the authors in [16] find that a 64 processor cluster offers, for most SPLASH 2 [24] applications (after a number of optimizations) at ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, Oct. 1996.


Shared Virtual Memory Clusters with Next--generation.. - Courtney Gibson And (2001)   (Correct)

....consistent software shared memory layer on top of 8 processor SMP nodes. They find that the cost for page invalidations within each node is very high. Another study examined the all software home based HLRC and the original Treadmarks protocols on a 64 processor Intel Paragon multiprocessor [17]. This study focused on the ability of a communication coprocessor to overlap protocol processing with useful computation in the two protocols but it also compared the protocols at this large scale. However, the architectural features and performance parameters of the Paragon system and its ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, Oct. 1996.


Symmetry and Performance in Consistency Protocols - Keleher   (1 citation)  (Correct)

....a nominal home, home nodes are used only to satisfy cold misses. Subsequent misses are satisfied by the last processors to write the page. This symmetry has a number of advantages, including less hot spot contention and less reliance on the original data distribution. However, recent work [10] has shown that asymmetric approaches can result in significant savings of runtime overhead when application specific sharing behavior is available. Home based LRC protocols differ from symmetric LRC protocols in that each page has a statically assigned home. Protocol performance is asymmetric ....

....the case can be made that both are most appropriate for different types of applications. As developers might be unwilling or unable to distinguish between the two situations themselves, we present a new adaptive protocol that attempts to strike a balance between the two. Home based LRC protocols [10] differ from homeless LRC protocols in that all shared modifications are flushed to the page's home at the end of the current synchronization interval. These protocols benefit from the home effect, the ability to dispense with write trapping for modifications made by the home node of a given ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li, "Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems," in Proceedings of the 2nd Symposium on Operating Systems Design and Implementation, October, 1996.


Source-Level Global Optimizations for Fine-Grain.. - Veldema, Hofman.. (2001)   (Correct)

....parts, where P equals the number of processors. Each processor allocates memory only in its own partition, so that computing a region s home node from its virtual address amounts to one divide operation. Jackal employs an invalidation based, multiple writer protocol that combines features of HLRC [32] and TreadMarks [17] Jackal does not use a single writer protocol because it would force the compiler to mark the end of each read write operation, which reduces the opportunity to lift access checks. In addition, end markers increase sequential overhead. As in HLRC and TreadMarks, modifications ....

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release-Consistency Protocols for Shared Virtual Memory Systems. In 2nd USENIX Symp. on OSDI, pages 75--88, Seattle, WA, Oct. 1996.


Limits on the Performance of Software Shared Memory: A.. - Bilas, Jiang, Zhou..   (Correct)

.... mature approaches: page based shared virtual memory (which we will call SVM) and fine grained or variable grained access control through code instrumentation (which we call the fine grained approach) Much excellent research has been done in the design and implementation of these protocols as well [19, 15, 17, 30, 8, 25]. And finally, above the programming model or protocol layer runs the application itself. Each layer has its own functional and performance characteristics that contribute to the end performance seen by a user. Despite all the research and improvements, currently software shared memory systems ....

.... in page based SVM to alleviate the effects of false sharing [1] and the recent lowering of instrumentation costs for fine grained protocols [25] which are able to avoid false sharing by virtue of using fine granularity) Many important improvements have been made in SVM protocols since then [15, 17, 12, 30]. And the emergence of commercial, low latency, high bandwidth network interfaces and system area interconnects, such as Myrinet [4] and Memory Channel [9] have been the great technological enabler to support these protocols more efficiently on clusters. Fine grained protocols such as the SC ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, Oct. 1996.


Update Protocols and Cluster-based Shared Memory - Peter Keleher Keleher (1999)   (8 citations)  (Correct)

....Contributions This paper presents the design and performance of several new protocols to handle this and similar types of applications. We first look at a relatively straightforward update protocol based on multi writer lazy release consistency (LRC) [6]. We then show that modified home based [15] protocols can perform even better than LRC protocols for applications of this type. Whereas home less LRC protocols can perform poorly for applications that modify (and communicate) large amounts of data, home based protocols maintain relatively little state, and such state lives has short ....

....Sections 4 and 5 describe and analyze the performance of extensions to our home based protocols that impose less of a load on the underlying operating system. Finally, Section 6 concludes. 2. Background and Protocol Descriptions Our home based protocols are based on those discussed by Zhou [15]. Home based protocols are, in turn, based on the multi writer lazy release consistent (LRC) protocols used by DSMs such as TreadMarks [1] and CVM [5] 2.1 Distributed Shared Memory (DSM) Systems CVM is a software distributed shared memory (DSM) system. DSM allows processes to assume a globally ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li, "Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems," in Proceedings of the 2nd Symposium on Operating Systems Design and Implementation, October, 1996.


ORION: An Adaptive Home-based Software Distributed Shared Memory .. - Ng, Wong (2000)   (1 citation)  (Correct)

....HRC, sending diffs home is mandatory which can be a performance loss factor. Published data have shown that for certain applications lazy release consistency model (LRC) performs better than home based lazy release consistency model (HLRC) [4] while for some other applications the reverse is true [2]. The main cause for this difference is the choice of the home for a page, which varies from application to application. This points to the need for the system to dynamically adapt to the needs of the applications in order to achieve better performances. The idea of automatic adaptive protocol is ....

Y. Zhou, L. Iftode, and K. Li. "Performance Evaluation of Two Home-based Lazy Release Consistency Protocols for Shared Virtual Memory Systems." In Proceedings of the Second USENIX Symposium on Operating System Design and Implementation, pages 75-88, November 1996.


ORION: An Adaptive Home-based Software DSM - Ng School Of   (Correct)

....HRC, sending diffs home is mandatory which can be a performance loss factor. Published data have shown that for certain applications lazy release consistency model (LRC) performs better than home based lazy release consistency model (HLRC) [4] while for some other applications the reverse is true [2]. The main cause for this difference is the choice of the home for a page, which varies from application to application. This points to the need for the system to dynamically adapt to the needs of the applications in order to achieve better performances. The idea of automatic adaptive protocol is ....

Y. Zhou, L. Iftode, and K. Li. "Performance Evaluation of Two Home-based Lazy Release Consistency Protocols for Shared Virtual Memory Systems." In Proceedings of the Second USENIX Symposium on Operating System Design and Implementation, pages 75-88, November 1996.


The Effect of Contention on the Scalability of.. - de Lara, Hu, Lu.. (2000)   (Correct)

....on performance: eliminating contention reduced execution time by 64 in the most extreme case, even at the relatively modest scale that we consider in this paper. Our experiments are performed on a network of thirty two single processor nodes using both Princeton's home based (HLRC) protocol [10] [6] and Rice's TreadMarks (Tmk) protocol [5]. Both are widely used, page based, multiple writer protocols implementing Lazy Release Consistency (LRC) [4]. From our experiments, we derive three specific conclusions. First, in comparing the results on 32 nodes to 8 nodes, we find that the effects ....

....ran on it. Section 5 presents the results of our evaluation. Section 6 compares our results to related work in the area. Finally, Section 7 summarizes our conclusions. 2 Background 2.1 TreadMarks and Home based LRC The TreadMarks (Tmk) protocol [4] and the Princeton home based (HLRC) protocol [10] are multiple writer implementations of lazy release consistency (LRC) [4] [3]. The main difference between these protocols is in the location where updates are kept and in the way that a processor updates its copy of a page. In Tmk, processors update a page by fetching diffs from the last ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Second USENIX Symposium on Operating System Design and Implementation, pages 75--88, Nov. 1996.


The Effect of Contention on the Scalability of Page-Based.. - de Lara (1999)   (Correct)

.... thesis explores experimentally the relationship between the distribution of messages and performance on a network of thirty two nodes for three page based, multiple writer protocols implementing Lazy Release Consistency (LRC) [11]. We use two existing protocols, Princeton's home based protocol [20] and the TreadMarks protocol [12], and a third novel protocol, Adaptive Striping. Adaptive Striping is a simple extension of the TreadMarks protocol. Its goal is to eliminate load imbalance (within the protocol) at the nodes automatically. We evaluate these protocols using a suite of five ....

.... protocol appeared in the Munin system [5]. Munin, however, used a form of eager RC that is generally inferior to LRC [10]. Since then two multiple writer protocols that are compatible with LRC have been developed: the TreadMarks (Tmk) protocol [11] and the Princeton home based (HLRC) protocol [20]. Some aspects of these protocols are the same. In particular, modifications to (shared) pages are detected and captured using twinning and diffing. Briefly, the implementation works as follows. A page is initially write protected, so that a protection fault occurs at the first write. The system ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Second USENIX Symposium on Operating System Design and Implementation, pages 75--88, nov 1996.


ADSM: A Hybrid DSM Protocol that Efficiently Adapts to.. - Monnerat, Bianchini (1997)   (1 citation)  (Correct)

....[2] on adaptive TreadMarks implementations is the most similar to ours. In this paper, we compare their most important implementation against ADSM and show that our protocol behaves better, as it does not require as much communication traffic. Other indirectly related pieces of work are [16] and [17]. Just as ADSM, the Lazy Hybrid (LH) protocol studied by Dwarkadas et al. in [6] also applies a hybrid invalidate update coherence approach. ADSM differs from LH in that it only updates the single writer pages associated with the lock variable on a lock acquire operation. During a lock acquire ....

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symp. on Operating Systems Design and Implementation (OSDI'96), October 1996.


Java for High-Performance Computing - Lobosco, Amorim, Loques (2001)   (Correct)

....the programmer is responsible for data communication among the nodes running an application. In distributed shared memory (DSM) systems processes share data transparently across node boundaries; data faulting, location, and movement is handled by the underlying system. Treadmarks [30] and HLRC [53] are examples of state of the art software DSM systems. Other aspects such as communication transparency to the programmer, conformity with the language syntax, as well as the overall achieved performance are determined by the mechanism adopted for inter process communication. The second issue ....

Zhou Y, Iftode L, Li K. 1996. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the 2nd Symposium on Operating Systems Design and Implementation, October 1996.


JUMP-DP: A Software DSM System with Low-Latency.. - Cheung, Wang, Hwang (2000)   (1 citation)  (Correct)

....sending the updates made on the page to the faulting processor (e.g. in the form of diffs [7] The faulting processor then applies the updates in order (according to the timestamps attached to the diffs which specify when the updates are made) to obtain the clean copy of the page. It is shown in [6] that the home based protocol is more efficient than the homeless protocol. Moreover, the home based protocol is easier to implement since it does not need to handle timestamps. Although the home based protocol is more efficient than the homeless protocol, a fixed home for every page may not ....

Y. Zhou, L. Iftode and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symposium on Operating Systems Design and Implementation (OSDI'96), pages 75-88, October 1996.


Multigrain Shared Memory - Yeung, Kubiatowicz, Agarwal (2000)   (Correct)

....in a multigrain environment, since it is the only work to compare different DSMP configurations, all software DSMs, and all hardware DSMs on a single experimental platform. Most recently, the work in [6] describes an implementation of the home based lazy release consistency protocol (HLRC) [26] on a cluster of four 4 way Intel Pentium Pro SMPs. Compared to MGS, this system uses a more aggressive software DSM protocol that enforces coherence lazily at acquire operations, and therefore generates fewer inter node messages. The main contribution of the work lies in an extension of LRC for ....

Yuanyuan Zhou, Liviu Iftode, and Kai Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. In Proceedings of the 2nd Symposium on Operating Systems Design and Implementation, October 1996.


Accelerating Shared Virtual Memory via General-purpose.. - Bilas, Jiang, al. (2001)   (1 citation)  (Correct)

....lies at the other end of the spectrum. The network interface can be used not only to avoid interrupting the compute processor but also to perform full blown protocol processing, including diff creation and application and the management of timestamps and write notices. This approach was taken in [53]. [18] reserves a compute processor in an SMP node for protocol processing. The amount of protocol processing involved in SVM systems with SMP nodes was examined also in an earlier simulation study [30] and other ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, October 1996.


Improving Fine-Grained Irregular Shared-Memory Benchmarks.. - Hu, Cox, Zwaenepoel (2000)   (9 citations)  (Correct)

.... Barnes Hut, FMM, and Water Spatial from SPLASH 2 [33] and Moldyn and Unstructured from the Chaos benchmark suite [11] We have evaluated the modified programs on a hardware shared memory machine (a 16 processor SGI Origin 2000 [23] and two software shared memory systems (TreadMarks [1] and HLRC [35] on a cluster of 16 Pentium II based computers) Our results show that data ordering during initialization improves the performance of these applications by 12 99 on the Origin 2000, by 30 366 under TreadMarks, and by 14 269 under HLRC. These improvements result from better spatial ....

....data caches, and a unified 8MB of second level cache and 128 byte block size. The machine has 10GB of main memory and a 16KB page size. 4.1.2 Software DSMs We use the TreadMarks system [1] from Rice and a modified version of TreadMarks that implements Princeton's home based LRC (HLRC) protocol [35]. Both systems are page based and use multiple writer protocols and lazy release consistency to alleviate the worse effects of false sharing. The two protocols differ in the location where modifications are kept and in the method by which they get propagated. A detailed comparison between the two ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Second USENIX Symposium on Operating System Design and Implementation, pages 75--88, Nov. 1996.


Performance Portability and Scalability in Shared-Address-Space.. - Jiang (2000)   (Correct)

....software rather than hardware in tightly coupled systems such as the SGI Origin2000 discussed in Chapter 2. Studies on software coherent shared address space multiprocessors have largely used applications as they were written for hardware cache coherent machines. The performance evaluations so far [43, 22, 46, 61, 31, 89, 35, 8] point out that for certain classes of applications there is a large performance gap between hardware cache coherent and software coherent systems. However, it should be possible to modify or restructure applications to interact better with software coherence protocols and granularities, and to ....

....release operation, diffs are propagated to the designated home of the page (not to the other sharers). The home copy is thus kept up to date. Upon a page fault following a causally related acquire operation, the entire page is fetched from the home [89]. The tradeoffs between home based and traditional LRC protocols have been studied in the literature [46, 34, 89]. Overall, due to software management for communication and coherence, SVM systems suffer high costs in communication, protocol overhead, and synchronization as well as critical section ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, Oct. 1996.


The Effect of Contention on the Scalability of.. - de Lara, Hu, Lu.. (2000)   (Correct)

....eliminating contention reduced execution time by 64% in the most extreme case, even at the relatively modest scale of 32 nodes that we consider in this paper. Our experiments are performed on a network of thirty two single processor nodes using both Princeton's home based (HLRC) protocol [7] [11] and Rice's TreadMarks (Tmk) protocol [6]. Both are widely used, page based, multiple-writer protocols implementing Lazy Release Consistency (LRC) [5]. From our experiments, we derive three specific conclusions. First, in comparing the results on 8 nodes to 32 nodes, we find that the effects of ....

....we ran on it. Section 5 presents the results of our evaluation. Section 6 compares our results to related work in the area. Finally, Section 7 summarizes our conclusions. 2 Background 2.1 Lazy Release Consistency The TreadMarks (Tmk) protocol [5] and the Princeton home based (HLRC) protocol [11] are multiple writer implementations of lazy release consistency (LRC) [5]. Lazy release consistency (LRC) [5] is an algorithm that implements the release consistency (RC) [4] memory model. RC is a relaxed memory model in which ordinary accesses are distinguished from synchronization accesses. ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Second USENIX Symposium on Operating System Design and Implementation, pages 75-88, nov 1996.


Update Protocols and Iterative Scientific Applications - Pete Keleher University (1998)   (15 citations)  (Correct)

....Contributions This paper presents the design and performance of several new protocols to handle this and similar types of applications. We first look at a relatively straightforward update protocol based on multi writer lazy release consistency (LRC) [6]. We then show that modified home based [7] protocols can perform even better than LRC protocols for applications of this type. Whereas homeless LRC protocols can perform poorly for applications that modify (and communicate) large amounts of data, home based protocols maintain relatively little state, and such state lives has short ....

....Sections 4 and 5 describe and analyze the performance of extensions to our home based protocols that impose less of a load on the underlying operating system. Finally, Section 6 concludes. 2. Background and Protocol Descriptions Our home based protocols are based on those discussed by Zhou [7]. Home based protocols are, in turn, based on the multi writer lazy release consistent (LRC) protocols used by DSMs such as TreadMarks [8] and CVM [9]. 2.1 Multi Writer LRC Protocols 2.1.1 Lmw i LRC protocols allow the shared memory system to delay performing shared updates until specific ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li, "Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems," in Proceedings of the 2nd Symposium on Operating Systems Design and Implementation, October, 1996.


Improving the Performance of Shared Virtual Memory on System Area.. - Bilas (1998)   (5 citations)  (Correct)

....have been developed. These greatly alleviate the effects of false sharing, but communication and the propagation of coherence information are still expensive when they do occur. Much research has been done in this area, and many good protocols have been developed [51, 52, 19, 49, 29, 77, 44, 45, 100]. One way to provide the shared memory programming model in clusters is to extend these SVM protocols to use multiprocessor (SMP) rather than uniprocessor nodes. Another view of this approach is that the less efficient SVM is used not as the basic mechanism with which to build multiprocessors out ....

....the constants, the node to network bandwidth may become a bottleneck if it is not increased considerably when going from a uniprocessor to an SMP node. A recent and promising form of SVM protocols is the class of so called home based protocols (HLRC, AURC, Cashmere) [44, 45, 46, 100, 54, 91]. Chapter 3 describes a protocol for home based SVM across SMP nodes (HLRC SMP, AURC SMP) that accomplishes the goals above. The SVM protocol can operate completely in software, or can exploit hardware support for automatic update (AU) propagation of writes to remote memories (also called ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, Oct. 1996.


Experience with an Adaptive Globally-Synchronizing Clock.. - Liao, Martonosi, Clark (1999)   (14 citations)  (Correct)

....of our clock synchronization code on the overall behavior of real applications. To address this, we have run several shared virtual memory (SVM) [11] applications with and without clock synchronization code running. Our SVM system implements a home based lazy release consistency (HLRC) protocol [21]. For the applications run with our synchronized clock implementation, synchronizing overhead measured at the program level is almost negligible: less than 0.3% in all cases. 3.3.2 Perturbation in Microbenchmarks Focusing on finer accuracy issues, we next describe a set of microbenchmark ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proc. of the Operating Systems Design and Implementation Symposium, Oct. 1996.


LOTS: A Software DSM Supporting Large Object Space - Cheung, Wang, Lau (2004)   Self-citation (Li)   (Correct)

No context found.

Y. Zhou, L. Iftode and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symposium on Operating Systems Design and Implementation (OSDI'96), pages 75-88, October 1996.


Memory Management for Networked Servers - Zhou (2000)   Self-citation (Zhou)   (Correct)

No context found.

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, October 1996.


Journal Of Information Science And Engineering.. - Cheung, Wang, Lau (2002)   Self-citation (Li)   (Correct)

....the page in the form of diffs to the faulting processor. The faulting processor then applies these updates in order as specified by the timestamps attached, so that the clean copy of the page can be obtained. Figure 3(b) shows the events for serving a page fault in a homeless protocol. Research [11] shows that the home based protocol is more efficient than the homeless protocol by sending fewer messages in the network. In particular, it reduces the communication overhead in serving a page fault by requesting only one processor for a copy of the page. Moreover, home based protocols are easier ....

....only. Although AURC is dedicated to the SHRIMP multicomputer, which possesses a specialized automatic update hardware, the idea of a home for each shared memory page inspires later research efforts. The most remarkable one is the home based lazy release protocol (HLRC) proposed by Zhou and Iftode [11], which is a protocol implementing lazy release consistency using the home based approach, with no specialized hardware support needed. We have discussed the underlying concept adopted by HLRC when the home based protocol is introduced in Section 2.2. Zhou and Iftode also showed in their paper ....

Y. Zhou, L. Iftode and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symposium on Operating Systems Design and Implementation (OSDI'96), pages 75-88, October 1996.


Relaxed Consistency and Coherence Granularity in DSM.. - Zhou, Iftode, Singh.. (1997)   (19 citations)  Self-citation (Zhou Iftode Li)   (Correct)

....include release consistency [10], entry consistency [2], scope consistency [13]. Lazy release consistency (LRC) [15] is a software implementation of release consistency which delays the coherence action until the acquire time. Most software shared systems today use LRC based protocols [14] [11] [30] [16]. These consistency models employ sophisticated protocols to reduce false sharing and fragmentation. An alternative approach is to preserve the simplicity of sequential consistency, but find some approach to reduce the coherence granularity. Examples of providing fine-grained access control ....

....with fine-grained access control hardware that supports multiple sizes of coherence granularity. We studied the combinations of three consistency protocols (sequential consistency (SC) [17], single writer lazy release consistency (SW LRC) [16], and home based lazy release consistency (HLRC) [30]) with four sizes of coherence granularity. We also studied two mechanisms (polling and interrupt) to handle message arrivals for each case. Our experiments used eight real benchmarks developed for hardware shared memory systems and their variations (so total 12 applications). Our applications ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. In Proceedings of the Operating Systems Design and Implementation Symposium, October 1996.


Dynamic Join and Departure in Shared Object Middlewares - Zhou, Li   Self-citation (Zhou Li)   (Correct)

[Table 1: Comparison (m is the number of directory entries, n is the number of participants, and N is the maximal number of participants the system supports)] ....of sequential consistency. Recent shared virtual memory systems [3, 11, 4, 16, 18, 12] used various relaxed consistency models. Shared object systems [1, 2, 9] maintain memory consistency in the same way as the shared virtual memory approach except at object level. None of these systems, to the best of our knowledge, is used for distributed and collaborative applications because of ....

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. In Proceedings of the Operating Systems Design and Implementation Symposium, October 1996.


Scalable Fault-Tolerant Distributed Shared Memory - Sultan, Nguyen, Iftode (2000)   (4 citations)  Self-citation (Iftode)   (Correct)

....must have low memory overhead, freeing memory for fault tolerance related tasks (for example, logging) and (iii) the protocol must be light weight in terms of state maintained and thus incur less overhead for logging and checkpointing. The HLRC protocol has been shown to have these properties [35]. The challenge is to keep HLRC efficient while integrating fault tolerance. To keep overhead low, our recovery support (i) uses volatile memory for logging, and (ii) aggressively exploits HLRC's semantics to minimize the amount of state that must be logged or checkpointed. Furthermore, to make ....

Y. Zhou, L. Iftode, K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. Proc. 2nd USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp. 75-88, October 1996.


Software Distributed Shared Memory over Virtual Interface.. - And   Self-citation (Iftode)   (Correct)

....into user address space of remote memory without intermediate copies, VIA appears very promising for software DSM. To our best knowledge, ours is the first implementation of a software DSM protocol on VIA. The protocol we have implemented on VIA is home-based lazy release consistency (HLRC) [37, 19]. Previous studies have shown that HLRC provides good scalability by reducing the number of messages and memory overhead compared to the homeless counterpart [37]. Home based protocols have been previously implemented on other memory mapped interconnected clusters both for clusters of ....

....of a software DSM protocol on VIA. The protocol we have implemented on VIA is home-based lazy release consistency (HLRC) [37, 19]. Previous studies have shown that HLRC provides good scalability by reducing the number of messages and memory overhead compared to the homeless counterpart [37]. Home based protocols have been previously implemented on other memory mapped interconnected clusters both for clusters of uniprocessors [22, 17] as well as for clusters of symmetric multiprocessors (SMPs) [31, 28]. Although the communication model of these networks are similar to VIA, there are ....

[Article contains additional citation context not shown here]

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. In Proceedings of the Operating Systems Design and Implementation Symposium, October 1996.


Limits to the Performance of Software Shared Memory: A.. - Bilas, Jiang, Zhou.. (1997)   (1 citation)  Self-citation (Zhou)   (Correct)

....Protocol Programming Model Layer Communication Layer Communication Library Network Figure 1: The layers that affect the end application performance in software shared memory. The last decade has seen a lot of excellent research in the individual layers, especially the lower two system layers [4, 3, 5, 13, 11, 12, 21, 18, 17]. Still, software shared memory systems currently yield performance that is, for several classes of applications, far behind that of hardware coherent systems even at quite small scale. This paper uses a layered framework to examine where the major gains (or losses) in the parallel performance ....

....either hardware support or by code instrumentation in software [18]. In both cases, we assume the coherence protocol runs in software handlers rather than in hardware, and on the main processor rather than on co processors. For SVM, we use the Home based Lazy Release Consistency (HLRC) protocol [21], which implements the lazy release consistency (LRC) model [11] to reduce the impact of false sharing. Both HLRC and older LRC protocols use software twinning and diffing to solve the multiple writer problem, but with different schemes for propagating and merging diffs (updated data). Traditional ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, Oct. 1996.


Limits to the Performance of Software Shared Memory: A.. - Angelos Bilas Dongming (1997)   (1 citation)  Self-citation (Zhou)   (Correct)

....Protocol Programming Model Layer Communication Layer Communication Library Network Figure 1: The layers that affect the end application performance in software shared memory. The last decade has seen a lot of excellent research in the individual layers, especially the lower two system layers [4, 3, 5, 14, 12, 13, 21, 7, 18]. Still, software shared memory systems currently yield performance that is, for several classes of applications, far behind that of hardware coherent systems even at quite small scale. This paper uses a layered framework to examine where the major gains (or losses) in the parallel performance ....

....either hardware support or by code instrumentation in software [7]. In both cases, we assume the coherence protocol runs in software handlers rather than in hardware, and on the main processor rather than on co processors. For SVM, we use the Home based Lazy Release Consistency (HLRC) protocol [21], which implements the lazy release consistency (LRC) model [12] to reduce the impact of false sharing. Both HLRC and older LRC protocols use software twinning and diffing to solve the multiple writer problem, but with different schemes for propagating and merging diffs (updated data) ....

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In Proceedings of the Operating Systems Design and Implementation Symposium, Oct. 1996.


Efficient Categorization of Memory Sharing Patterns in.. - De Castro Computer (2001)   (Correct)

No context found.

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the OSDI'96, pp. 75--88, Oct 1996.


A New Distributed JVM for Cluster Computing - Marcelo Lobosco Anderson (2003)   (Correct)

No context found.

Zhou, Y., et al. Performance Evaluation of Two Home-based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. OSDI, Oct 1996.


L.T. Yang et al. (Eds.): HPCC 2005, LNCS 3726, pp.. - Springer-Verlag.. (2005)   (Correct)

No context found.

Zhou, Y., et al. Performance Evaluation of Two Home-based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. OSDI, Oct 1996.


Distributed Shared Memory in Kernel Mode - Thobias Trevisan Vtor (2002)   (Correct)

No context found.

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symp. on Operating Systems Design and Implementation (OSDI'96), pages 75--88, 1996.


Jupiter/SVM: A JVM-based Single System - Image For Clusters   (Correct)

No context found.

Y. Zhou, L. Iftode, and K. Li, "Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems," in Proc. of OSDI, 1996.


Phoenix : a Parallel Programming Model for.. - Taura, Kaneda, Endo.. (2003)   (Correct)

No context found.

Y. Zhou, L. Iftode, and K. Li. Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems. In ACM OSDI, 1996.


Adaptive Techniques for Home-Based Software DSMs - Whately, Pinto, Bianchini.. (2001)   (Correct)

No context found.

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symp. on Operating Systems Design and Implementation (OSDI'96), pages 75--88, October 1996.


CC-MPI: A Compiled Communication Capable MPI Prototype.. - Karwande, Yuan..   (Correct)

No context found.

Y. Zhou, L. Iftode, and K. Li. Performance Evaluation of Two Home-based Lazy Release Consistency Protocols for Shared Memory Virtual Memory Systems. In Proc. of the 2nd Symp. on Operating Systems Design and Implementation (OSDI'96), pages 75--88, 1996.


A Node Count-independent Logical Clock for Scaling Lazy.. - Arantes, Folliot, Sens (1999)   (Correct)

No context found.

Y. Zhou, L. Iftode and K. Li. Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems. In the 2nd Symposium on Operating Systems Design and Implementation, October 1996.


Optimizing Home-Based Software DSM Protocols - Hu, Shi, Tang   (Correct)

No context found.

Y. Zhou, L. Iftode and K. Li, "Performance Evaluation of Two Home-based Lazy Release Consistency Protocols for Shared Virtual Memory Systems", in Proc. of the 2nd USENIX Sym. on Operating System Design and Implementation, Seattle, Oct. 1996.


CiteSeer.IST - Copyright Penn State and NEC