Results 1 - 10 of 38
Fine-grain Access Control for Distributed Shared Memory
In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), 1994
"... This paper discusses implementations of fine-grain memory access control, which selectively restricts reads and writes to cache-block-sized memory regions. Fine-grain access control forms the basis of efficient cache-coherent shared memory. This paper focuses on low-cost implementations that require ..."
Abstract
-
Cited by 186 (33 self)
- Add to MetaCart
(Show Context)
This paper discusses implementations of fine-grain memory access control, which selectively restricts reads and writes to cache-block-sized memory regions. Fine-grain access control forms the basis of efficient cache-coherent shared memory. This paper focuses on low-cost implementations that require little or no additional hardware. These techniques permit efficient implementation of shared memory on a wide range of parallel systems, thereby providing shared-memory codes with a portability previously limited to message passing. This paper categorizes techniques based on where access control is enforced and where access conflicts are handled. We incorporated three techniques that require no additional hardware into Blizzard, a system that supports distributed shared memory on the CM-5. The first adds a software lookup before each shared-memory reference by modifying the program's executable. The second uses the memory's error correcting code (ECC) as cache-block valid bits. The third is...
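As a concrete illustration of the first technique (an in-line software lookup before each shared-memory reference), a minimal C sketch of such a check might look like the following. The block size, state table, and handler names (block_state_table, protocol_read_miss, protocol_write_miss) are assumptions made for illustration; they are not the paper's actual Blizzard code.

    /* Illustrative sketch only: a per-block state check inserted before
     * shared-memory loads and stores, in the spirit of an in-line software
     * lookup.  Block size, table layout, and handler names are assumptions. */
    #include <stdint.h>

    #define BLOCK_SHIFT 5                  /* assume 32-byte access-control blocks */

    enum block_state { INVALID, READONLY, READWRITE };

    extern uint8_t block_state_table[];            /* one entry per shared block */
    extern void protocol_read_miss(void *addr);    /* hypothetical miss handlers */
    extern void protocol_write_miss(void *addr);

    /* Check inserted before a shared-memory load of *addr. */
    static inline uint32_t checked_load(uint32_t *addr, uintptr_t shared_base)
    {
        uintptr_t block = ((uintptr_t)addr - shared_base) >> BLOCK_SHIFT;
        if (block_state_table[block] == INVALID)
            protocol_read_miss(addr);      /* fetch the block, then continue */
        return *addr;
    }

    /* Check inserted before a shared-memory store to *addr. */
    static inline void checked_store(uint32_t *addr, uint32_t val, uintptr_t shared_base)
    {
        uintptr_t block = ((uintptr_t)addr - shared_base) >> BLOCK_SHIFT;
        if (block_state_table[block] != READWRITE)
            protocol_write_miss(addr);     /* obtain write permission first */
        *addr = val;
    }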
Teapot: Language Support for Writing Memory Coherence Protocols
1996
"... Recent shared-memory parallel computer systems offer the exciting possibility of customizing memory coherence protocols to fit an application’s semantics and sharing patterns. Custom protocols have been used to achieve message-passing performance-while retaining the convenient programming model of a ..."
Abstract
-
Cited by 56 (8 self)
- Add to MetaCart
Recent shared-memory parallel computer systems offer the exciting possibility of customizing memory coherence protocols to fit an application's semantics and sharing patterns. Custom protocols have been used to achieve message-passing performance while retaining the convenient programming model of a global address space, and to implement high-level language constructs. Unfortunately, coherence protocols written in a conventional language such as C are difficult to write, debug, understand, or modify. This paper describes Teapot, a small, domain-specific language for writing coherence protocols. Teapot uses continuations to help reduce the complexity of writing protocols. Simple static analysis in the Teapot compiler eliminates much of the overhead of continuations and results in protocols that run nearly as fast as hand-written C code. A Teapot specification can be compiled both to an executable coherence protocol and to input for a model-checking system, which permits the specification to be verified. We report our experiences coding and verifying several protocols written in Teapot, along with measurements of the overhead incurred by writing a protocol in a higher-level language.
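The continuation idea can be pictured with a small continuation-passing sketch in C: a handler that must wait for a protocol message records the rest of its work as an explicit continuation instead of blocking. Teapot itself is a domain-specific language, so everything below (struct block, handle_read_miss, handle_reply) is an illustrative assumption rather than Teapot code.

    /* Illustrative sketch only: continuation-passing style for a coherence
     * handler that must wait for a data reply.  Names and structure are
     * assumptions; Teapot expresses this far more concisely in its own DSL. */
    #include <stddef.h>

    struct block;
    typedef void (*cont_fn)(struct block *blk, void *reply);

    struct block {
        enum { INVALID, READ_WAIT, READONLY } state;
        cont_fn pending;               /* continuation run when a reply arrives */
    };

    extern void send_read_request(struct block *blk);   /* hypothetical transport */
    extern void supply_data_to_processor(struct block *blk, void *data);

    /* Second half of the read-miss handler, run once the data reply arrives. */
    static void finish_read_miss(struct block *blk, void *reply)
    {
        supply_data_to_processor(blk, reply);
        blk->state = READONLY;
    }

    /* Read miss on an Invalid block: request the data and suspend. */
    void handle_read_miss(struct block *blk)
    {
        send_read_request(blk);
        blk->state = READ_WAIT;
        blk->pending = finish_read_miss;   /* the "continuation", saved explicitly */
    }

    /* Dispatcher invoked when a protocol reply for this block arrives. */
    void handle_reply(struct block *blk, void *reply)
    {
        if (blk->pending) {
            cont_fn k = blk->pending;
            blk->pending = NULL;
            k(blk, reply);
        }
    }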
Space-Time Memory: A Parallel Programming Abstraction for Dynamic Vision Applications
Technical Report, 1997
"... We present a novel parallel programming abstraction called Space-Time Memory. ..."
Abstract
-
Cited by 49 (18 self)
- Add to MetaCart
We present a novel parallel programming abstraction called Space-Time Memory.
pSather: Layered Extensions to an Object-Oriented Language for Efficient Parallel Computation
1993
"... pSather is a parallel extension of the existing object-oriented language Sather. It offers a shared-memory programming model which integrates both control- and dataparallel extensions. This integration increases the flexibility of the language to express different algorithms and data structures, esp ..."
Abstract
-
Cited by 33 (3 self)
- Add to MetaCart
(Show Context)
pSather is a parallel extension of the existing object-oriented language Sather. It offers a shared-memory programming model which integrates both control- and data-parallel extensions. This integration increases the flexibility of the language to express different algorithms and data structures, especially on distributed-memory machines (e.g., the CM-5). This report describes our design objectives and the programming language pSather in detail.
Fine-Grain Distributed Shared Memory on Clusters of Workstations
1997
"... Shared memory, one of the most popular models for programming parallel platforms, is becoming ubiquitous both in low-end workstations and high-end servers. With the advent of low-latency networking hardware, clusters of workstations strive to offer the same processing power as high-end servers for a ..."
Abstract
-
Cited by 30 (10 self)
- Add to MetaCart
Shared memory, one of the most popular models for programming parallel platforms, is becoming ubiquitous both in low-end workstations and high-end servers. With the advent of low-latency networking hardware, clusters of workstations strive to offer the same processing power as high-end servers for a fraction of the cost. In such environments, shared memory has been limited to page-based systems that use virtual-memory page protection to implement shared-memory coherence protocols. Unfortunately, false sharing and fragmentation problems force such systems to resort to weak consistency models that complicate shared-memory programming.
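For contrast with the fine-grain approach, the sketch below shows the generic page-based mechanism such systems rely on: shared pages are mapped with restricted permissions, and a SIGSEGV handler runs the coherence protocol before re-enabling access. This is a minimal sketch of the general technique, not code from the paper's system; fetch_page_from_owner and the initialization details are assumptions.

    /* Illustrative sketch only: page-granularity access control via virtual
     * memory protection.  A real DSM system layers a full coherence protocol
     * (and usually a relaxed consistency model) on top of this mechanism. */
    #define _GNU_SOURCE
    #include <signal.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    extern void fetch_page_from_owner(void *page_addr);   /* hypothetical protocol call */

    static long page_size;

    static void dsm_fault_handler(int sig, siginfo_t *info, void *ctx)
    {
        (void)sig; (void)ctx;
        void *page = (void *)((uintptr_t)info->si_addr & ~(uintptr_t)(page_size - 1));
        fetch_page_from_owner(page);                        /* run the coherence protocol */
        mprotect(page, page_size, PROT_READ | PROT_WRITE);  /* then allow the access */
    }

    /* shared_base is assumed to be page-aligned. */
    void dsm_init(void *shared_base, size_t shared_len)
    {
        struct sigaction sa;
        page_size = sysconf(_SC_PAGESIZE);

        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = dsm_fault_handler;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        /* Start with no access so the first touch of any shared page traps. */
        mprotect(shared_base, shared_len, PROT_NONE);
    }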
Stampede: A Cluster Programming Middleware for Interactive Stream-Oriented Applications
2003
"... Emerging application domains such as interactive vision, animation, and multimedia collaboration display dynamic scalable parallelism and high-computational requirements, making them good candidates for executing on parallel architectures such as SMPs and clusters of SMPs. Stampede is a programming ..."
Abstract
-
Cited by 22 (6 self)
- Add to MetaCart
Emerging application domains such as interactive vision, animation, and multimedia collaboration exhibit dynamic, scalable parallelism and high computational requirements, making them good candidates for execution on parallel architectures such as SMPs and clusters of SMPs. Stampede is a programming system that provides many of the needed facilities: high-level data sharing, dynamic cluster-wide threads and their synchronization, support for task and data parallelism, handling of time-sequenced data items, and automatic buffer management. In this paper, we present an overview of Stampede, its primary data abstractions, the algorithmic basis of its garbage collection, and the issues in implementing these abstractions on a cluster of SMPs. We also present a set of micro-measurements along with two multimedia applications implemented on top of Stampede, which demonstrate the low overhead of this runtime and its suitability for streaming multimedia applications.
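One way to picture the time-sequenced data sharing the abstract mentions is a channel indexed by virtual timestamps: producers put items tagged with a timestamp, consumers get items by timestamp, and items whose timestamps can no longer be requested are reclaimed. The C sketch below is a rough approximation under those assumptions; the names (stm_put, stm_get, stm_collect) and data layout are not Stampede's actual API.

    /* Illustrative sketch only: a timestamp-indexed channel with a simple
     * reclamation pass.  Stampede's real abstractions and garbage collector
     * are considerably richer than this. */
    #include <pthread.h>
    #include <stdlib.h>

    struct stm_item {
        int timestamp;
        void *data;
        struct stm_item *next;
    };

    struct stm_channel {
        pthread_mutex_t lock;
        pthread_cond_t  avail;
        struct stm_item *items;
    };

    /* Producer: publish an item under a virtual timestamp. */
    void stm_put(struct stm_channel *ch, int ts, void *data)
    {
        struct stm_item *it = malloc(sizeof *it);
        it->timestamp = ts;
        it->data = data;

        pthread_mutex_lock(&ch->lock);
        it->next = ch->items;
        ch->items = it;
        pthread_cond_broadcast(&ch->avail);
        pthread_mutex_unlock(&ch->lock);
    }

    /* Consumer: block until an item with the requested timestamp is available. */
    void *stm_get(struct stm_channel *ch, int ts)
    {
        void *data = NULL;
        pthread_mutex_lock(&ch->lock);
        for (;;) {
            for (struct stm_item *it = ch->items; it != NULL; it = it->next)
                if (it->timestamp == ts) { data = it->data; goto out; }
            pthread_cond_wait(&ch->avail, &ch->lock);
        }
    out:
        pthread_mutex_unlock(&ch->lock);
        return data;
    }

    /* Reclaim items older than the lowest timestamp any thread may still request. */
    void stm_collect(struct stm_channel *ch, int min_live_ts)
    {
        pthread_mutex_lock(&ch->lock);
        for (struct stm_item **pp = &ch->items; *pp != NULL; ) {
            if ((*pp)->timestamp < min_live_ts) {
                struct stm_item *dead = *pp;
                *pp = dead->next;
                free(dead);
            } else {
                pp = &(*pp)->next;
            }
        }
        pthread_mutex_unlock(&ch->lock);
    }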
Implementation of Process Migration in Amoeba
In Proceedings of the 14th International Conference on Distributed Systems, 1994
"... The design of a process migration mechanism for the Amoeba distributed operating system is described. The primary motivation for this implementation is to carry out experimental and realistic studies of load balancing algorithms for a distributed operating system. Our aim has been the implementation ..."
Abstract
-
Cited by 14 (4 self)
- Add to MetaCart
The design of a process migration mechanism for the Amoeba distributed operating system is described. The primary motivation for this implementation is to carry out experimental and realistic studies of load balancing algorithms for a distributed operating system. Our aim has been the implementation of a mechanism which is general, efficient, and fully transparent, and which is reliable in the presence of network and processor failures.
Experiences with the Implementation of a Process Migration Mechanism for Amoeba
Australian Computer Science Communications, 1996
"... We describe our experiences with the implementation of a process migration mechanism for the distributed operating system Amoeba. After describing our design goals, we present our implementation for Amoeba. Though our goals have very largely been met, we have fallen short of the goal of complete tra ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
(Show Context)
We describe our experiences with the implementation of a process migration mechanism for the distributed operating system Amoeba. After describing our design goals, we present our implementation for Amoeba. Although our goals have largely been met, we have fallen short of complete transparency, and we discuss the consequences of this. We also present performance figures indicating that the speed of process migration is limited only by the throughput of the network adapters used in our configuration, and that the overhead is comparable to that of process creation. We conclude with a review of the degree to which our design goals have been met and a discussion of the lessons learnt.