Results 1 - 10 of 44,031
Eliminating Invalidation in Coherent-Cache Parallel Graph Reduction
PARLE 94 Parallel Architectures and Languages Europe, 1994
"... Parallel functional programs based on the graph reduction execution model display considerable locality of reference, favouring the use of large cache lines in the implementation of the shared heap on a shared-memory multiprocessor. They also display a very high rate of synchronisation, making conventional weakly-consistent coherency protocols ineffective at avoiding unnecessary contention for write access to cache lines due to false sharing. We present the design of a specially adapted cache coherency protocol and show results of simulation experiments which demonstrate ..."
Cited by 2 (2 self)
A Hierarchical Internet Object Cache
In Proceedings of the 1996 USENIX Technical Conference, 1995
"... This paper discusses the design and performance of a hierarchical proxy-cache designed to make Internet information systems scale better. The design was motivated by our earlier trace-driven simulation study of Internet traffic. We believe that the conventional wisdom, that the benefits of hierarch ..."
Cited by 501 (6 self)
An Incessantly Coherent Cache Scheme for Shared Memory Multithreaded Systems
1994
"... An incessantly coherent cache consistency protocol is proposed in this paper. The protocol supports limitless sharing and obviates the need to invalidate shared cache lines by automatically self invalidating cache lines after the expiry of its lifetime. The protocol mandates that all writes be perfo ..."
Cited by 4 (1 self)
Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers
1990
Reconciling Sharing and Spatial Locality Using Adjustable Block Size Coherent Caches
"... Several studies have shown that the performance of coherent caches depends on the relationship between the cache block size and the granularity of sharing and locality exhibited by the program. Large cache blocks exploit processor and spatial locality, but may cause unnecessary cache invalidations d ..."
Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors
ACM Transactions on Computer Systems, 1991
"... Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting tend to produce large amounts of memory and interconnect contention, introducing performance bottlenecks that become marke ..."
Cited by 567 (32 self)
"... -accessible flag variables, and for some other processor to terminate the spin with a single remote write operation at an appropriate time. Flag variables may be locally-accessible as a result of coherent caching, or by virtue of allocation in the local portion of physically distributed shared memory. We present a ..."
Horus: A flexible group communication system
Comm. of the ACM, 1996
"... innovative system offering application developers an extensively flexible group communication model is described. The emergence of process-group environments for distributed computing represents a promising step toward robustness for mission-critical distributed applications. Process groups have a “natural” correspondence with data or services that have been replicated for availability or as part of a coherent cache. They can be used to support highly available security domains, and group mechanisms fit well with an emerging generation of intelligent network and collaborative work applications."
Cited by 434 (28 self)
Locality and False Sharing in Coherent-Cache Parallel Graph Reduction
PARLE 93 Parallel Architectures and Languages Europe, 1993
"... Parallel graph reduction is a model for parallel program execution in which shared-memory is used under a strict access regime with single assignment and blocking reads. We outline the design of an efficient and accurate multiprocessor simulation scheme and the results of a simulation study of the performance of a suite of benchmark programs operating under a cache coherency protocol that is representative of protocols used in commercial shared-memory machines and in more scalable distributed shared-memory systems. We analyse the influence of cache line size on performance and expose the relative ..."
Cited by 5 (3 self)