We present a new model of parallel computation, the LogGP model, and use it to analyze a number of algorithms, most notably the single-node scatter (one-to-all personalized broadcast). The LogGP model is an extension of the LogP model for parallel computation [CKP 93], which abstracts the communication of fixed-sized short messages through the use of four parameters: the communication latency (L), overhead (o), bandwidth (g), and the number of processors (P). As evidenced by experimental data, the LogP model can accurately predict communication performance when only short messages are sent (as on the CM-5) [CKP
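To make the four parameters concrete, here is a minimal sketch of how point-to-point message times are commonly estimated under LogP, with the long-message term that LogGP adds. It rests on assumptions not stated in the excerpt above: g is treated as the minimum gap between consecutive short messages (the reciprocal of per-processor bandwidth), G is taken as the gap per byte for long messages, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogPParams:
    """Illustrative container for the LogP parameters (names are hypothetical)."""
    L: float  # communication latency
    o: float  # send/receive overhead on a processor
    g: float  # gap between consecutive short messages (assumed: 1 / bandwidth)
    P: int    # number of processors


def short_message_time(p: LogPParams) -> float:
    """End-to-end time of one short message: send overhead + latency + receive overhead."""
    return p.o + p.L + p.o


def short_message_stream_time(p: LogPParams, n: int) -> float:
    """Time until the last of n back-to-back short messages is received."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Consecutive sends are spaced by the larger of the gap and the overhead.
    return p.o + (n - 1) * max(p.g, p.o) + p.L + p.o


def long_message_time(p: LogPParams, G: float, k: int) -> float:
    """Assumed LogGP estimate for one k-byte message, with G the gap per byte."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return p.o + (k - 1) * G + p.L + p.o


if __name__ == "__main__":
    params = LogPParams(L=6.0, o=2.0, g=4.0, P=8)  # made-up values, arbitrary time units
    print(short_message_time(params))
    print(short_message_stream_time(params, n=3))
    print(long_message_time(params, G=0.5, k=101))
```

The parameter values are invented for illustration and do not describe any measured machine.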

Introduction to Parallel Algorithms and Architectures: Arrays – Leighton, 1992
A bridging model for parallel computation – Valiant, 1990
Active Messages: A mechanism for integrated communication and computation – Eicken, Culler, et al., 1992
Parallel algorithms for shared-memory machines – Karp, Ramachandran, 1990
Parallelism in random access machines – Fortune, Wyllie, 1978
Optimum Broadcasting and Personalized Communication in Hypercubes – Johnsson, Ho, 1989
LogGP: incorporating long messages into the LogP model - one step closer towards a realistic model for parallel computation – Alexandrov, Ionescu, et al., 1995
Towards an architecture-independent analysis of parallel algorithms – Papadimitriou, Yannakakis, 1990
Scans as primitive parallel operations – Blelloch, 1989
Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories – Mehlhorn, Vishkin, 1984
Designing Broadcasting Algorithms in the Postal Model for Message-Passing Systems – Bar-Noy, Kipnis, 1994
A more practical PRAM model – Gibbons, 1989
Communication complexity of PRAMs – Aggarwal, Chandra, et al., 1990
The APRAM: Incorporating asynchrony into the PRAM model – Cole, 1989
Efficient PRAM simulation on a distributed memory machine – Karp, Luby, Meyer auf der Heide, 1992
Meiko CS-2 interconnect Elan-Elite design – Homewood, McLaren, 1993
Parallel and Distributed Computation – Bertsekas, Tsitsiklis, 1989
Introduction to Parallel Computing – Kumar, Grama, et al., 1994
Type Architecture, Shared Memory and the Corollary of Modest Potential – Snyder, 1986
On communication latency in PRAM computations – Aggarwal, Chandra, et al., 1989
Data communication in hypercubes – Saad, Schultz, 1989
Experience with Active Messages on the Meiko CS-2 – Schauser, Scheiman, 1995
Fast parallel sorting under LogP: From theory to practice – Culler, Dusseau, et al., 1993
The NX message passing interface – Pierce, 1994
Message passing on the Meiko CS-2 – Barton, Cownie, et al., 1994
Towards a model for portable parallel performance: Exposing the memory hierarchy – Alpern, Carter, 1994
Measurements of Active Messages Performance on the CM-5 – Liu, Culler, 1994
Efficient parallel communication with the nCUBE 2S processor – Schmidt-Voigt, 1994
Performance Parameters and Results for the Genesis Parallel Benchmarks – Hockney, 1994
CICO: A Practical Shared-Memory Programming Performance Model – Larus, Chandra, et al., 1994