23 citations found.
Wilson, A.W., R.P. LaRowe, Jr. and M.J. Teller, "Hardware Assist for Distributed Shared Memory," Proceedings of the 13th International Conference on Distributed Computing Systems, pp. 246-255, 1993.


This paper is cited in the following contexts:
Shared Virtual Memory with Automatic Update Support - Iftode, Blumrich.. (1998)   (5 citations)

....to offload some of the communication and coherence overheads from the computation processor. Using simulations they show that such a protocol processor can double the performance of TreadMarks on a 16-node configuration and that diff prefetching is not always beneficial. The PLUS [4], Galactica Net [19], Merlin [25] and its successor SESAME [28] systems implement hardware-based shared memory using a sort of write-through Page Diffs Diffs Application misses created applied Lock Barriers HLRC AURC HLRC AURC HLRC AURC Acquires Barnes 2,517 2,498 3,398 0 3,398 0 20,468 8 FFT 2,240 2,240 0 0 0 0 1 ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Design Choices in the SHRIMP System: An Empirical Study - Blumrich, Alpert, Chen.. (1998)   (12 citations)

.... update is also similar to MemoryChannel (developed independently and concurrently at Digital) in which memory updates are automatically reflected to other nodes [23]. Page-based automatic update approaches were also used in Memnet [18], Merlin [36], SESAME [45], Plus [8] and Galactica Net [29]. These prior systems did not, however, provide for both automatic and deliberate update. This paper also quantifies the relationship between particular low-level hardware primitives and the performance of the higher-level software they support. As with active messages [44], SHRIMP's mechanisms ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Virtual Memory Mapped Network Interface for the.. - Blumrich, Li.. (1994)   (238 citations)

....between separate partitions, but not between processes within a partition. Since packet headers must be constructed by applications, the message passing overhead is still hundreds of CPU instructions. Memnet [10], Merlin [21] and its successor SESAME [33], the Plus system [4] and Galactica Net [17] use the page-based, automatic update approach to support shared memory. These systems do not provide a mechanism for high-bandwidth, low-overhead block data transfer such as deliberate update. Some systems provide communication via direct access to remote memory locations. An early example is ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware assist for distributed shared memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Home-based Shared Virtual Memory - Iftode (1998)   (30 citations)

....that AURC outperforms the all-software home-based LRC (HLRC) [IBD 98, BAB 98]. Other studies have addressed the idea of using remote memory operations as a support for software shared memory with different degrees of hardware intrusion [CF89, SMV95, BR90, JJT93, PL93, PL94, KS96b, KHS 97, BIS98]. 1.6.3 Programmable Network Interface. In an SVM implementation without a dedicated protocol processor, incoming requests interrupt the compute processor and are handled by it. The interrupt overhead is the most significant parameter of the communication ....

....by that writer are also performed on the home's copy. 2.4 AURC with Copyset 2 Optimization. Previous research in hardware support for update-based shared memory protocols has suggested multicast support for a full-fledged update-based coherence scheme, with updates propagated in circular rings [JJT93, BR90]. The automatic update hardware can propagate [Figure: panels "Page Miss (page granularity)" and "Automatic Update (word granularity)"; (a) communication from the local ....]

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Update-Based Cache Coherence Protocols For Scalable.. - Glasco, Delagi, Flynn (1993)   (8 citations)

....may be maintained through software, hardware or a combination of the two. The majority of scalable hardware-based systems with a general interconnect use invalidations to maintain consistency [10, 21, 9, 13]. Several of the software-based schemes use a combination of invalidations and updates [2, 14, 11, 3, 24, 23]. This paper presents two hardware-controlled update-based cache coherence protocols: one based on a centralized directory and the other based on a singly linked distributed directory. The paper considers two major disadvantages of the update-based protocols and develops approaches to overcoming ....

....the more efficient the line-sized transfers of the invalidate protocols are. As the line utilization decreases, single-word updates become more efficient. The line utilization measure is similar to the comparison metrics used in other evaluations of update-based and invalidate-based protocols [24, 7]. As will be shown in section 7, these classifications are a good predictor of the resulting performance of the cache coherence protocols. Currently, none of these applications have migratory data. Applications with migratory data may significantly increase the number of updates required in an ....

Andrew W. Wilson Jr., Richard P. LaRowe Jr. and Marc J. Teller, "Hardware Assist for Distributed Shared Memory", 13th International Conference on Distributed Computing Systems, May 1993.


Update Propagation in the Galactica Net Distributed Shared.. - Andrew Wilson Jr (1993)   (5 citations)

....invalidation based coherence mechanisms typically employed by most software implemented systems, the GIM provides hardware support for update based coherence policies. Our motivation for selecting an update based coherence protocol supported by hardware is the subject of a recent publication [19]. In Galactica Net, when a processor attempts to modify a page that has been replicated (i.e. cached) at multiple nodes, instead of invalidating all of the other copies of the page, the operating system puts the page in update mode. Once the operating system puts a page in update mode, the GIM ....

.... cost factors forced us to limit our implementation aggressiveness) results of those investigations show that through the use of relaxed memory consistency models, much of the update latency discussed in this section can be overlapped with other useful computation for many applications [19][20]. We believe that through a variety of techniques (e.g. more integrated hardware implementation, update combining, process management to improve locality, data management to reduce false sharing effects, compiler optimizations to exploit release consistency, and well-designed application ....

[Article contains additional citation context not shown here]

A. W. Wilson Jr., R. P. LaRowe Jr., and M. J. Teller. Hardware assist for distributed shared memory. In Proceedings of the 13th International Conference on Distributed Computing Systems, pages 246-255, May, 1993.


Design Choices in the SHRIMP System: An Empirical Study - Matthias Blumrich (1998)   (12 citations)

.... automatic update is also similar to MemoryChannel (developed independently and concurrently at Digital) in which memory updates are automatically reflected to other nodes [24]. Page-based automatic update approaches were also used in Memnet [18], Merlin [37], SESAME [45], Plus [8] and Galactica Net [30]. These prior systems did not, however, provide for both automatic and deliberate update. This paper also quantifies the relationship between particular low-level hardware primitives and the performance of the higher-level software they support. As with active messages [22], SHRIMP's ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Overview of Distributed Shared Memory - Judge, Nixon, Cahill, Tangney.. (1998)   (1 citation)

....Some all-software systems, such as that described by Scales and Lam [171], run unmodified both on tightly coupled multiprocessors such as the CM-5 and on loosely coupled networks of workstations running PVM. Other systems propose hardware and software combinations in order to increase performance [83, 168, 200]. Schoinas et al. [173, 59] study both hardware- and software-based access control in the Blizzard system (a fine-grained DSM system running on a tightly coupled architecture over the Tempest system [164]); they conclude that the software system varies from slightly faster to twice as slow as the ....

Andrew W. Wilson, Jr., Richard P. LaRowe, Jr., and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of the 13th International Conference on Distributed Computing Systems [93], pages 246--255.


Early Experience with Message-Passing on the SHRIMP.. - Felten, Alpert.. (1996)   (23 citations)

....which allows physical memory mapping only for a small amount of special memory. Several shared memory architecture projects use the page-based, automatic-update approach to support shared memory, including Memnet [17], Merlin [33] and its successor SESAME [46], the Plus system [9] and Galactica Net [28]. These projects did not study the implementation of message passing libraries. Wilkes's sender-based communication in the Hamlyn system [45] supports user-level message passing, but requires application programs to build packet headers. They have not tried to implement message-passing libraries ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


The Substrate Object Model and Architecture - Arindam Banerji (1993)   (2 citations)

....distributed versions of file systems, sockets and process management. A good example of the kind of problems handled by the substrate approach is revealed by the implementation of XMM. XMM uses an invalidation approach in implementing updates to shared memory. However, differing hardware [Wilson, 1993] that may perform better with an update protocol would require significant painful changes to the kernel. Although some support for scalability exists in the way of load balancing, very little has been done to support evolution of software. Languages such as CLOS [Kiczales, 1991] have used ....

A. Wilson, R. LaRowe & M. Teller (1993) Hardware Assist for Distributed Shared Memory, Proc. of the Thirteenth International Conference on Distributed Computing Systems `93, pp.246-255.


Virtual Memory Mapped Network Interface for the.. - Blumrich, Alpert.. (1993)   (238 citations)

....the message passing overhead is still hundreds of CPU instructions. Several shared memory architecture projects use the page-based, automatic update approach to support shared memory. Examples include Memnet [9], Merlin [19] and its successor SESAME [30], the Plus system [4] and Galactica Net [15]. These systems do not provide a mechanism for high-bandwidth, low-overhead block data transfer. Several parallel architectures use multiple threads [20, 27, 2, 1] to overlap communication with computation. These approaches require applications or compilers to create multiple threads on each node, ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware assist for distributed shared memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Performance Evaluation of the Late Delta Cache.. - de Supinski.. (1996)

....cache blocks holds exactly one variable. Although this cache block size is unrealistically small, we expect increasing the block size to favor the late delta protocol since increasing the block size favors update protocols [WLT93]. We first compare the sensitivity of the systems to contention for shared variables. We then investigate the scalability of the protocols and the effect of varying the probability of referencing shared data. 4.1. Varying Hot Spot Probability. Initially, we compare the behavior of the protocol ....

Wilson, A.W., R.P. LaRowe, Jr. and M.J. Teller, "Hardware Assist for Distributed Shared Memory," Proceedings of the 13th International Conference on Distributed Computing Systems, pp. 246-255, 1993.


Models for Performance Prediction of Cache Coherence.. - Srbljic, Vranesic..

.... [3, 4]. The accuracy and usefulness of our models is assessed by comparing the performance predicted by our models with the results of simulated execution for 15 parallel applications, mostly from the SPLASH and SPLASH 2 benchmark suites [5, 6], and also with simulation results reported by others [1, 7]. These comparisons show that even our simplest model is capable of choosing the right cache coherence protocol for all the applications we considered, while the results of our more sophisticated models lie within 10 of simulation results. As a side effect of our studies, we are also able to show ....

....benefits of supporting hybrid and dynamic hybrid protocols, but that the benefits of dynamic protocols are limited to some applications. Although we focus on tightly coupled multiprocessor systems in this paper, our models can also be easily applied to loosely coupled distributed memory systems [7 11]. In recent years, many models for predicting the performance of parallel systems have been proposed. They can roughly be classified into two groups based on the information they use to characterize the data access behavior of applications. One group of models assumes that shared data accesses are ....

[Article contains additional citation context not shown here]

A.W. Wilson, R.P. LaRowe, and M.J. Teller, "Hardware Assist for Distributed Shared Memory", In Proceedings of the 13th International Conference on Distributed Computing Systems, Pittsburgh, Pennsylvania, pages 246--255, May 1993.


Heterogeneous By Design: An Environment for Exploiting.. - Richard Larowe (1993)   (1 citation)

....link) interconnection networks capable of supporting both shared memory and message-based communication, a portable scalable operating system, hardware support for data format transformations, and a sophisticated heterogeneous programming environment. 3 The Galactica Net architecture [16][17][18][19] (with enhancements to support data format transformations) and the Mach operating system [1] (with real-time and fault tolerance extensions) provide an ideal framework on which to meet these requirements. This is the subject of Sections 2 and 3 of this paper, respectively. 2. Here we use ....

....development of heterogeneous applications, and Section 6 summarizes and concludes the paper. 2 Galactica Net as the Interconnection Architecture. Galactica Net is an architecture for supporting high-performance, scalable computing across a heterogeneous collection of processing nodes [16][17][18][19]. It efficiently supports both shared memory and explicit message-based communication across nodes. A Galactica Net system consists of a set of interconnected nodes, each of which includes its own computing resources (e.g. a shared bus multiprocessor, a graphics workstation, or a SIMD array ....

[Article contains additional citation context not shown here]

A. W. Wilson Jr., R. P. LaRowe Jr., and M. Teller. Hardware Assist for Distributed Shared Memory. Center for High Performance Computing Technical Report 92-006. October, 1992. Submitted for Publication.


Improving Release-Consistent Shared Virtual Memory using.. - Iftode (1996)   (57 citations)

....file to insert a state table lookup before every shared memory reference. This technique works well in the presence of fine-grain sharing; for well-structured programs it imposes substantial overhead compared to more traditional shared virtual memory implementations. The PLUS [2], Galactica Net [12], Merlin [22] and its successor SESAME [27] systems implement hardware-based shared memory using a sort of write-through mechanism which is similar in some ways to automatic update. These systems do more in hardware, and thus are more expensive and complicated to build. Our automatic update ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware assist for distributed shared memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Scope Consistency : A Bridge between Release Consistency and.. - Iftode (1996)   (86 citations)

....an existing executable file to insert a state table lookup before every shared memory reference. This technique works well in the presence of fine-grain sharing. Alternative directory-based protocols for memory-mapped network interfaces were proposed for Cashmere [15]. The Plus [3], Galactica Net [12], Merlin [18] and its successor SESAME [21] systems implement hardware-based shared memory using a sort of write-through mechanism which is similar in some ways to automatic update. The Memory Channel [8] allows remote memory to be mapped into the local virtual address space and have writes ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Scope Consistency: A Bridge between Release Consistency and.. - Iftode, Singh, Li (1996)   (86 citations)

....executable file to insert a state table lookup before shared memory references. Alternative directory-based shared virtual memory protocols for memory-mapped network interfaces supporting automatic update as well as hardware remote reads were proposed for Cashmere [21]. The Plus [4], Galactica Net [18], Merlin [25] and its successor SESAME [30] systems implement hardware-based shared memory using a type of write-through mechanism which is similar in some ways to automatic update. The Memory Channel [13] is a network interface similar to SHRIMP which allows remote memory to be mapped into the ....

Andrew W. Wilson Jr. Richard P. LaRowe Jr. and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proceedings of 13th International Conference on Distributed Computing Systems, pages 246--255, May 1993.


Implementation and Evaluation of Update-Based Cache.. - Grahn, Stenström, Dubois (1995)   (14 citations)

....system software is responsible for choosing which coherence policy to use for each page. This study also shows that most write latency can be hidden by using a relaxed memory consistency model and by choosing an appropriate coherence policy for each page. Another study by the same authors [35] shows that as the block size increases write update becomes preferable to write invalidate in terms of memory traffic. Therefore, overall, write update should be a much better choice than write invalidate for page level coherence in distributed shared memory systems. In this study we show that ....

A.W. Wilson, Jr., R.P. LaRowe, Jr., and M.J. Teller, Hardware assist for distributed shared memory, Proc. 13th Conf. on Distributed Computing Systems, Pittsburgh, PA (May 1993) 246-255.


The Remote Enqueue Operation on Networks of Workstations - Markatos, Katevenis.. (1998)   (2 citations)

.... There have been several projects to provide efficient communication primitives in networks of workstations via a combination of hardware and software: Dolphin's SCI interface [19], PRAM [24], Memory Channel [13], Myrinet [6], ServerNet [26], Active Messages [12], Fast Messages [17], Galactica Net [16], Hamlyn [9], U-Net [27], NOW [1], Parastation [28], StarT Jt [15], Avalanche [10], Panda [2] and SHRIMP [4] provide efficient message passing on networks of workstations based on memory-mapped interfaces. We view our work as complimentary to these projects, in the sense that we propose a fast ....

Andrew W. Wilson Jr., Richard P. LaRowe Jr., and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In Proc. 13-th Int. Conf. on Distr. Comp. Syst., pages 246--255, Pittsburgh, PA, May 1993.


Telegraphos: High-Performance Networking for Parallel Processing .. - Markatos (1996)   (16 citations)

....to support high-performance computing, because communication on them has traditionally been very expensive. There have been several projects to provide efficient communication primitives in networks of workstations via a combination of hardware and software: PRAM [24], MERLIN [20], Galactica Net [14], Memory Channel [12], Hamlyn [7], NOW [1] and SHRIMP [4] provide efficient message passing on networks of workstations based on memory-mapped interfaces. Their shared memory support, though, is limited because several of them do not Block Logic SRAM Notes: gates) Kbits) Central control 1000 ....

Andrew W. Wilson Jr., Richard P. LaRowe Jr., and Marc J. Teller. Hardware Assist for Distributed Shared Memory. In PROC of the Thirteenth International Conference on Distributed Computing Systems, pages 246--255, Pittsburgh, PA, May 1993.


ASPEN: High-Performance Hardware Support for Distributed.. - Maxham (1994)

....list of nodes sharing each cache line [27]. This organization is scalable; each cache line requires a pointer to the previous and next nodes in the list. SCI's protocol must frequently traverse these lists, and as a result the protocol is complex and message intensive. The GalacticaNet from WPI [37] proposes to augment a traditional software DSM approach with simple hardware support for performance-critical operations. Like Aspen, GalacticaNet supports update multicasts. Like SCI, GalacticaNet uses a linked list for its sharing directories, mapping virtual rings upon its physical mesh. The ....

A. W. Wilson, R. P. LaRowe, and Marc J. Teller. Hardware assist for distributed shared memory. In Proceedings of the 12th International Conference on Distributed Computing Systems, May 1993.


Cache Coherence Protocol - Bronis De Supinski

No context found.

Wilson, A.W., R.P. LaRowe, Jr. and M.J. Teller, "Hardware Assist for Distributed Shared Memory," Proceedings of the 13th International Conference on Distributed Computing Systems, pp. 246-255, 1993.


Design and Performance of the Software-controlled COMA - Moga (1998)

No context found.

A.W. Wilson Jr., R.P. LaRowe Jr., and M.J. Teller. Hardware Assist for Distributed Shared Memory. In Proc. of the 13th Int'l Conference on Distributed Computing Systems (ICDCS-13), pages 246--255, May 1993.

