Abstract: This paper describes a comparative performance study of MPI and
Remote Memory Access (RMA) communication models in the context of four scientific
benchmarks: NAS MG, NAS CG, SUMMA matrix multiplication, and
Lennard-Jones molecular dynamics, on clusters with the Myrinet network. It is
shown that RMA communication delivers a consistent performance advantage
over MPI. In some cases an improvement of as much as 50% was achieved. Benefits
of using non-blocking RMA for overlapping computation and...
Cited by:
Host-Assisted Zero-Copy Remote Memory Access Communication on .. - Tipparaju, al.
Active bibliography (related documents):
0.5: Optimizing Synchronization Operations for Remote.. - Buntinas, Saify..
0.4: Fast Parallel Algorithms for Short-Range Molecular Dynamics - Plimpton (1995)
0.3: One-sided Communication on the Myrinet-based SMP Clusters.. - Nieplocha, Ju, Apra (2001)
Similar documents based on text:
0.5: Efficient Barrier using Remote Memory Operations on .. - Gupta, Tipparaju.. (2002)
0.5: Protocols and Strategies for Optimizing Performance .. - Nieplocha.. (2002)
0.4: Optimizing Collective I/O Performance on Parallel.. - Chen, Foster..
BibTeX entry:
V. Tipparaju, M. Krishnan, J. Nieplocha, G. Santhanaraman, D. K. Panda, "Exploiting Nonblocking Remote Memory Access Communication in Scientific Benchmarks on Clusters", in Proc. HiPC'03. http://citeseer.ist.psu.edu/645804.html
@misc{ tipparaju-exploiting,
author = "V. Tipparaju and M. Krishnan and J. Nieplocha and G. Santhanaraman and
D. Panda",
title = "Exploiting Nonblocking Remote Memory Access Communication in Scientific
Benchmarks on Clusters",
text = "V. Tipparaju, M. Krishnan, J. Nieplocha, G. Santhanaraman, D. K. Panda, Exploiting
Nonblocking Remote Memory Access Communication in Scientific Benchmarks
on Clusters, in Proc. HiPC'03.",
url = "citeseer.ist.psu.edu/645804.html" }
Citations (may not include all citations):
217: NASA Ames Research Center - Bailey, Barszcz et al. - 1994
99: Allocating Independent Subtasks on Parallel Processors - Kruskal, Weiss - 1985
57: Fast Parallel Algorithms for Short-Range Molecular Dynamics - Plimpton - 1995
35: Co-Array Fortran for parallel programming - Numrich, Reid - 1998
33: Introduction to UPC and language specification - Carlson, Draper et al. - 1999
23: Global Arrays: A portable "shared-memory" programming model f.. - Nieplocha, Harrison et al. - 1994
21: ARMCI: A Portable Remote Memory Copy Library for Distributed.. - Nieplocha, Carpenter - 1999
10: A Comparison of Three Programming Models for Adaptive Applic.. - Shan, Singh et al. - 2000
10: Communication overlap in multi-tier parallel algorithms - Baden, Fink - 1998
9: Efficient Parallel Implementation of Molecular Dynamics on a.. - Esselink, Smit et al. - 1993
7: SUMMA: Scalable Universal Matrix Multiplication Algorithm - Geijn, Watts - 1997
7: SHMEM's User's Guide - Bariuso, Knies - 1994
7: A Generalized Portable SHMEM Library for High Performance Co.. - Parzyszek, Nieplocha et al. - 2000
5: Writing a multigrid solver using Coarray Fortran - Numrich, Reid et al. - 1998
3: Protocols and Strategies for Optimizing Remote Memory Operat.. - Nieplocha, Tipparaju et al. - 2002
3: Scalable Parallel Molecular Dynamics on MIMD supercomputers - Plimpton - 1992
1: One-sided communication on Myrinet - Nieplocha, Tipparaju et al. - 2003
1: Comparing Performance and Scalability SHMEM and MPI One Sid.. - Silvia, Kraeva et al. - 2002
1: A Comparative Study of the NAS MG Benchmark across Parallel .. - Chamberlain, Deitz et al. - 2000
www.pmodels.org