Shared memory, one of the most popular models for programming parallel platforms, is becoming ubiquitous in both low-end workstations and high-end servers. With the advent of low-latency networking hardware, clusters of workstations strive to offer the processing power of high-end servers at a fraction of the cost. In such environments, shared memory has been limited to page-based systems, which implement shared memory coherence protocols by using page-level memory protection to control access to shared data. Unfortunately, false sharing and fragmentation force such systems to resort to weak consistency models, which complicate the shared memory programming model. This thesis studies fine-grain distributed shared memory (FGDSM) systems as a way to support shared memory on networks of workstations, and it explores the issues involved in implementing FGDSM systems on networks of commodity workstations running commodity operating systems. FGDSM systems rely on fine-grain memory access control to selectively restrict reads and writes to cache-block-sized memory regions. The thesis presents Blizzard, a family of FGDSM systems running on a network of workstations. Blizzard supports the Tempest
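The fine-grain access control mentioned in the abstract can be illustrated with a minimal sketch. It is not Blizzard's implementation: the 64-byte block size, the three-state tag set, and every name below (`Tag`, `AccessFault`, `FineGrainMemory`) are assumptions made only for illustration. The idea shown is that each cache-block-sized region carries its own access tag, and a load or store that the tag does not permit raises a fault that a coherence protocol could then handle.

```python
from enum import Enum

BLOCK_SIZE = 64  # assumed cache-block size in bytes (illustrative only)


class Tag(Enum):
    """Hypothetical per-block access states."""
    INVALID = 0     # no access allowed
    READ_ONLY = 1   # loads allowed, stores fault
    READ_WRITE = 2  # loads and stores allowed


class AccessFault(Exception):
    """Raised when an access violates the tag of its block."""

    def __init__(self, block, is_write):
        kind = "write" if is_write else "read"
        super().__init__(f"{kind} fault on block {block}")
        self.block = block
        self.is_write = is_write


class FineGrainMemory:
    """Toy memory with one access tag per cache-block-sized region."""

    def __init__(self, size):
        self.data = bytearray(size)
        nblocks = (size + BLOCK_SIZE - 1) // BLOCK_SIZE
        self.tags = [Tag.INVALID] * nblocks  # every block starts inaccessible

    def set_tag(self, block, tag):
        self.tags[block] = tag

    def _check(self, addr, is_write):
        # Look up the tag of the block containing addr and fault if needed.
        block = addr // BLOCK_SIZE
        tag = self.tags[block]
        if tag is Tag.INVALID or (is_write and tag is Tag.READ_ONLY):
            raise AccessFault(block, is_write)

    def load(self, addr):
        self._check(addr, is_write=False)
        return self.data[addr]

    def store(self, addr, value):
        self._check(addr, is_write=True)
        self.data[addr] = value
```

In this sketch, marking block 0 as `READ_ONLY` lets `load(10)` succeed while `store(10, 1)` raises `AccessFault`, and an access to an address in block 1 faults independently of block 0. A page-based system would instead apply one protection setting to the whole page, which is where false sharing arises.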