by Jie Tao, Wolfgang Karl, Martin Schulz
http://wwwbode.informatik.tu-muenchen.de/~tao/papers/SCI.ps
Abstract:
Data locality is a key factor in the performance of parallel systems. In a Distributed Shared Memory (DSM) system, however, it is difficult for users to maintain high data locality, because it is usually not known a priori how the data is distributed among the nodes. In this article we introduce a monitoring framework that allows users to understand the memory behavior of parallel applications. The information provided by the monitoring system enables data to be allocated or redistributed more rationally among the nodes' memories. This reduces remote memory accesses and in turn improves parallel performance.
Keywords: SCI, monitoring, data locality, performance optimization.