Results 1 - 10 of 104
SoftFLASH: Analyzing the Performance of Clustered Distributed Virtual Shared Memory
- In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, 1996
Abstract - Cited by 81 (0 self)
One potentially attractive way to build large-scale shared-memory machines is to use small-scale to medium-scale shared-memory machines as clusters that are interconnected with an off-the-shelf network. To create a shared-memory programming environment across the clusters, it is possible to use a virtual shared-memory software layer. Because of the low latency and high bandwidth of the interconnect available within each cluster, there are clear advantages in making the clusters as large as possible. The critical question then becomes whether the latency and bandwidth of the top-level network and the software system are sufficient to support the communication demands generated by the clusters. To explore these questions, we have built an aggressive kernel implementation of a virtual shared-memory system using SGI multiprocessors and 100Mbyte/sec HIPPI interconnects. The system obtains speedups on 32 processors (four nodes, eight ...
Brazos: A Third Generation DSM System
- In Proceedings of the 1st USENIX Windows NT Symposium, 1997
Abstract - Cited by 80 (10 self)
Brazos is a third generation distributed shared memory (DSM) system designed for x86 machines running Microsoft Windows NT 4.0. Brazos is unique among existing systems in its use of selective multicast, a software-only implementation of scope consistency, and several adaptive runtime performance tuning mechanisms. The Brazos runtime system is multithreaded, allowing the overlap of computation with the long communication latencies typically associated with software DSM systems. Brazos also supports multithreaded user-code execution, allowing programs to take advantage of the local tightly-coupled shared memory available on multiprocessor PC servers, while transparently interacting with remote "virtual" shared memory. Brazos currently runs on a cluster of Compaq Proliant 1500 multiprocessor servers connected by a 100 Mbps FastEthernet. This paper describes the Brazos design and implementation, and compares its performance running five scientific applications to the performance of Solaris...
Using Multicast and Multithreading to Reduce Communication in Software DSM Systems
1998
Abstract - Cited by 22 (2 self)
This paper examines the performance benefits of employing multicast communication and application-level multithreading in the Brazos software distributed shared memory (DSM) system. Application-level multithreading in Brazos allows programs to transparently take advantage of available local multiprocessing. Brazos uses multicast communication to reduce the number of consistency-related messages, and employs two adaptive mechanisms that reduce the detrimental side effects of using multicast communication. We compare three software DSM systems running on identical hardware: (1) a single-threaded point-to-point system, (2) a multithreaded point-to-point system, and (3) Brazos, which incorporates both multithreading and multicast communication. For the six applications studied, multicast and multithreading improve speedup on eight processors by an average of 38%.
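The traffic reduction this abstract attributes to multicast can be sketched with a back-of-envelope count. The function names and numbers below are illustrative, not measurements from the paper: with P nodes caching a shared page, a point-to-point update protocol sends one unicast per remote sharer, while a single multicast reaches all of them at once.

```python
# Illustrative message-count comparison for consistency updates in a
# software DSM system (hypothetical numbers, not results from the paper).

def p2p_messages(sharers: int, updates: int) -> int:
    # point-to-point: one unicast to each of the other sharers, per update
    return updates * (sharers - 1)

def multicast_messages(sharers: int, updates: int) -> int:
    # multicast: one message covers every sharer, per update
    return updates

P, U = 8, 1000  # 8 processors, 1000 updates to a widely shared page
print(p2p_messages(P, U), multicast_messages(P, U))  # 7000 vs 1000
```

The paper's adaptive mechanisms address the flip side of this trade-off: multicast also delivers updates to nodes that no longer need them.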
Implementing a Caching Service for Distributed CORBA Objects
2000
Abstract - Cited by 19 (1 self)
This paper discusses the implementation of CASCADE, a distributed caching service for CORBA objects. Our caching service is fully CORBA compliant, and supports caching of active objects, which include both data and code. It is specifically designed to operate over the Internet by employing a dynamically built cache hierarchy. The service architecture is highly configurable with regard to a broad spectrum of application parameters. The main benefits of CASCADE are enhanced availability and service predictability, as well as easy dynamic code deployment and consistency maintenance. 1 Introduction One of the main goals of modern middlewares, and in particular of the CORBA standard [45], is to facilitate the design of interoperable, extensible and portable distributed systems. This is done by standardizing a programming language independent IDL, a large set of useful services, the Generic InterORB Protocol (and its TCP/IP derivative IIOP), and bridges to other common middleware...
Performance Evaluation of View-Oriented Parallel Programming
- In Proceedings of the 2005 International Conference on Parallel Processing (ICPP05), pp. 251-258, IEEE Computer Society, 2005
Abstract - Cited by 16 (9 self)
This paper evaluates the performance of a novel View-Oriented Parallel Programming style for parallel programming on cluster computers. View-Oriented Parallel Programming is based on Distributed Shared Memory which is friendly and easy for programmers to use. It requires the programmer to divide shared data into views according to the memory access pattern of the parallel algorithm. One of the advantages of this programming style is that it offers the performance potential for the underlying Distributed Shared Memory system to optimize consistency maintenance. Also it allows the programmer to participate in performance optimization of a program through wise partitioning of the shared data into views. Experimental results demonstrate a significant performance gain of the programs based on the View-Oriented Parallel Programming style.
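The programming style described above (shared data partitioned into views, with every access bracketed by view acquisition) can be mocked in a single process. This is a minimal sketch under stated assumptions: acquire_view/release_view are illustrative names, not necessarily the paper's actual API, and a real DSM system would back each view with distributed memory and run consistency maintenance at the release rather than a local mutex.

```python
# Single-process mock of view-oriented access: each "view" is a partition
# of the shared data, and all reads/writes to it are bracketed by
# acquire_view/release_view (hypothetical names for illustration).
import threading

NVIEWS = 2
views = [{"lock": threading.Lock(), "data": [0] * 64} for _ in range(NVIEWS)]

def acquire_view(v: int) -> None:
    views[v]["lock"].acquire()

def release_view(v: int) -> None:
    # a real DSM would propagate this view's modifications (diffs) here
    views[v]["lock"].release()

def worker(tid: int) -> None:
    v = tid % NVIEWS              # the access pattern maps each thread to a view
    acquire_view(v)
    for i in range(64):
        views[v]["data"][i] += tid + 1
    release_view(v)

threads = [threading.Thread(target=worker, args=(t,)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# threads {0, 2} update view 0, threads {1, 3} update view 1
print(views[0]["data"][0], views[1]["data"][0])
```

Because consistency only needs to be maintained per view at these acquire/release points, the partitioning chosen by the programmer directly bounds the consistency traffic, which is the performance lever the abstract refers to.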
Removing the Overhead from Software-Based Shared Memory
2001
Abstract - Cited by 14 (6 self)
The implementation presented in this paper—DSZOOM-WF— is a sequentially consistent, fine-grained distributed software-based shared memory. It demonstrates a protocol-handling overhead below a microsecond for all the actions involved in a remote load operation, to be compared to the fastest implementation to date of around ten microseconds. The all-software protocol is implemented assuming some basic low-level primitives in the cluster interconnect and an operating system bypass functionality, similar to the emerging InfiniBand standard. All interrupt- and/or poll-based asynchronous protocol processing is completely removed by running the entire coherence protocol in the requesting processor. This not only removes the asynchronous overhead, but also makes use of a processor that otherwise would stall. The technique is applicable to both page-based and fine-grain software-based shared memory. DSZOOM-WF consistently demonstrates performance comparable to hardware-based distributed shared memory implementations.
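The key idea above, running the entire coherence protocol in the requesting processor instead of interrupting or polling at the home node, can be sketched as follows. All names and structures here are illustrative assumptions: in the real system the directory entry would live in remote memory reached through interconnect primitives such as remote atomics, not a local dict.

```python
# Sketch of requester-side coherence: on a remote load, the requesting
# processor itself reads and updates the (conceptually remote) directory
# entry and copies the data, so the home node is never interrupted.
directory = {"page7": {"owner": 1, "sharers": set()}}   # hypothetical directory
memories = {1: {"page7": "payload"}}                    # owner node's memory

def remote_load(requester: int, page: str) -> str:
    entry = directory[page]             # remote atomic read of the directory
    entry["sharers"].add(requester)     # requester runs the protocol itself
    return memories[entry["owner"]][page]  # fetch the data from the owner

print(remote_load(0, "page7"))  # prints "payload"
```

Since the requesting processor would otherwise stall waiting for the load anyway, doing the protocol work there costs little, which is how the paper removes all asynchronous protocol-processing overhead.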
MultiJav: A Distributed Shared Memory System Based on Multiple Java Virtual Machines
- In Proceedings of the Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, 1998
Abstract - Cited by 14 (0 self)
Current distributed shared memory systems suffer from portability problems which hinder their popularity. We present a distributed shared memory system as a distributed implementation of the Java Virtual Machine. The proposed system is unique in that it provides a user-friendly, flexible programming model based on pure Java. It is an object-based memory system which maintains the synchronization scope as the whole address space, like page-based systems. MultiJav demonstrates that it is possible to design an efficient, portable, distributed shared memory system for running parallel and distributed applications written in a standard language. Keywords: distributed shared memory, object-based sharing, Java, memory consistency 1 Introduction Numerous distributed shared memory systems have been designed to promote parallel computing on a cluster of workstations. Through continuous effort of performance optimization, systems of this kind have become a reasonable choice for applications of massive...
Hybrid-DSM: An Efficient Alternative to Pure Software DSM Systems on NUMA Architectures
2000
Abstract - Cited by 13 (12 self)
Shared-memory style programming on loosely coupled architectures such as clusters or networks of workstations is usually supported by pure software distributed shared memory (DSM) systems. With the Scalable Coherent Interface (SCI), which facilitates communication via hardware DSM, high performance PC clusters with NUMA (Non-Uniform Memory Access) characteristics can be built. By exploiting the remote memory access features of SCI and by utilizing the memory management techniques of software DSM systems, a global virtual memory abstraction on top of an SCI-based PC cluster can be provided (the SCI Virtual Memory or SCI-VM). This hybrid DSM approach is the basis for the efficient implementation of shared memory programming models not only for SCI-based systems but also for loosely coupled NUMA architectures in general.
Efficient User-Level Thread Migration and Checkpointing on Windows NT Clusters
1999
VODCA: View-Oriented, Distributed, Cluster-based Approach to parallel computing
- DSM Workshop 2006, in Proceedings of the IEEE/ACM Symposium on Cluster Computing and Grid 2006 (CCGrid06), IEEE Computer Society, 2006
Abstract - Cited by 10 (5 self)
This paper presents a high-performance Distributed Shared Memory system called VODCA, which supports a novel View-Oriented Parallel Programming on cluster computers. One advantage of View-Oriented Parallel Programming is that it allows the programmer to participate in performance optimization through wise partitioning of the shared data into views. Another advantage of this programming style is that it enables the underlying Distributed Shared Memory system to optimize consistency maintenance. VODCA implements a View-based Consistency model and uses an efficient View-Oriented Update Protocol with Integrated Diff to maintain consistency of a view. Important implementation details of VODCA are described in this paper. Experimental results demonstrate that VODCA performs very well and its performance is comparable with MPI (Message Passing Interface) systems.