Abstract:
In out-of-core computations, disk storage is treated as another level in the memory hierarchy, below cache, local memory, and (in a parallel computer) remote memories. However, the tools used to manage this storage are typically quite different from those used to manage access to local and remote memory. This disparity complicates implementation of out-of-core algorithms and hinders portability. We describe a programming model that addresses this problem. This model allows parallel programs to use essentially the same mechanisms to manage the movement of data between any two adjacent levels in a hierarchical memory system. We take as our starting point the Global Arrays shared-memory model and library, which support a variety of operations on distributed arrays, including transfer between local and remote memories. We show how this model can be extended to support explicit transfer between global memory and secondary storage, and we define a Disk Resident Arrays library that supports such transfers. We illustrate the utility of the resulting model with two applications, an out-of-core matrix multiplication and a large computational chemistry program. We also describe implementation techniques on several parallel computers and present experimental results that demonstrate that the Disk Resident Arrays model can be implemented very efficiently on parallel computers.
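To make the idea of explicit transfer between memory and secondary storage concrete, the following is a minimal single-process sketch of a blocked out-of-core matrix multiplication. It is not the Disk Resident Arrays or Global Arrays API: the class `DiskMatrix`, its methods `read_section` and `write_section`, and the function `out_of_core_matmul` are hypothetical names introduced only for illustration, and a plain file of doubles stands in for a disk-resident array.

```python
import os
import tempfile
from array import array


class DiskMatrix:
    """Hypothetical stand-in for a disk-resident array: a row-major file of doubles."""

    ITEM = 8  # bytes per float64

    def __init__(self, path, rows, cols):
        self.path, self.rows, self.cols = path, rows, cols
        # Pre-size the file with zeros so any section can be written later.
        with open(path, "wb") as f:
            f.write(bytes(rows * cols * self.ITEM))

    def write_section(self, r0, c0, block):
        """Explicitly copy an in-memory block (a list of rows) to disk."""
        with open(self.path, "r+b") as f:
            for i, row in enumerate(block):
                f.seek(((r0 + i) * self.cols + c0) * self.ITEM)
                f.write(array("d", row).tobytes())

    def read_section(self, r0, c0, nrows, ncols):
        """Explicitly copy a rectangular section from disk into memory."""
        block = []
        with open(self.path, "rb") as f:
            for i in range(nrows):
                f.seek(((r0 + i) * self.cols + c0) * self.ITEM)
                row = array("d")
                row.frombytes(f.read(ncols * self.ITEM))
                block.append(list(row))
        return block


def out_of_core_matmul(a, b, c, bs):
    """Compute c = a * b while holding at most three bs-by-bs blocks in memory."""
    n, k, m = a.rows, a.cols, b.cols
    for i in range(0, n, bs):
        bi = min(bs, n - i)
        for j in range(0, m, bs):
            bj = min(bs, m - j)
            acc = [[0.0] * bj for _ in range(bi)]  # in-memory block of c
            for p in range(0, k, bs):
                bp = min(bs, k - p)
                ablk = a.read_section(i, p, bi, bp)  # disk -> memory
                bblk = b.read_section(p, j, bp, bj)  # disk -> memory
                for x in range(bi):
                    for z in range(bp):
                        axz = ablk[x][z]
                        for y in range(bj):
                            acc[x][y] += axz * bblk[z][y]
            c.write_section(i, j, acc)  # memory -> disk


def demo():
    """Multiply a 3x2 by a 2x3 matrix using 2x2 blocks and return the result."""
    with tempfile.TemporaryDirectory() as d:
        a = DiskMatrix(os.path.join(d, "a.bin"), 3, 2)
        b = DiskMatrix(os.path.join(d, "b.bin"), 2, 3)
        c = DiskMatrix(os.path.join(d, "c.bin"), 3, 3)
        a.write_section(0, 0, [[1, 2], [3, 4], [5, 6]])
        b.write_section(0, 0, [[1, 0, 2], [0, 1, 3]])
        out_of_core_matmul(a, b, c, bs=2)
        return c.read_section(0, 0, 3, 3)
```

In this sketch every movement of data between disk and memory is an explicit section read or write, which mirrors the style of transfer the abstract describes. A parallel version would additionally distribute the in-memory blocks across processes, which is outside the scope of this illustration.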