Results 1 - 10 of 558
Performance Analysis of MPI Collective Operations
- In: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS’05) - Workshop 15
, 2005
"... Previous studies of application usage show that the performance of collective communications is critical for high performance computing and is often overlooked when compared to the point-to-point performance. In this paper we attempt to analyze and improve collective communication in the context ..."
Abstract - Cited by 78 (5 self)
of the widely deployed MPI programming paradigm by extending accepted models of point-to-point communication, such as Hockney, LogP/LogGP, and PLogP. The predictions from the models were compared to the experimentally gathered data and our findings were used to optimize the implementation of collective
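The Hockney model named above predicts the time to send an m-byte point-to-point message as T(m) = α + βm (startup latency plus per-byte cost), and collective costs are estimated by composing such terms along the communication schedule. A minimal sketch of that idea for a binomial-tree broadcast, with illustrative (made-up) α and β values rather than measured ones:

```python
import math

def hockney_ptp_time(m, alpha=2e-6, beta=1e-9):
    """Hockney model: time for an m-byte message is alpha + beta*m.
    alpha (latency, seconds) and beta (seconds/byte) are illustrative."""
    return alpha + beta * m

def binomial_tree_bcast_time(m, p, alpha=2e-6, beta=1e-9):
    """A binomial-tree broadcast over p processes takes ceil(log2(p))
    sequential point-to-point steps along the critical path."""
    return math.ceil(math.log2(p)) * hockney_ptp_time(m, alpha, beta)

# Predicted broadcast time for a 1 MiB message over 64 processes:
print(binomial_tree_bcast_time(1 << 20, 64))
```

LogP/LogGP and PLogP refine this by separating sender/receiver overhead and gap from wire latency; the comparison in the paper is between such predictions and measured collective times.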
Processing MPI Datatypes outside MPI
"... Abstract. The MPI datatype functionality provides a powerful tool for describing structured memory and file regions in parallel applications, enabling noncontiguous data to be operated on by MPI communication and I/O routines. However, no facilities are provided by the MPI standard to allow users to ..."
Abstract - Cited by 1 (0 self)
processing routines. We show the use of MPITypes in three examples: copying data between user buffers and a “pack” buffer, encoding of data in a portable format, and transpacking. Our experimental evaluation shows that the implementation achieves rates comparable to existing MPI implementations. 1
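The “pack” example above is the operation MPI_Pack performs on a noncontiguous datatype such as a vector: gathering strided blocks into one contiguous buffer. A plain-Python sketch of that gathering step (the helper name and byte layout are hypothetical; real MPITypes/MPI code does this in C against datatype descriptions):

```python
def pack_strided(buf, offset, blocklen, stride, count):
    """Copy `count` blocks of `blocklen` bytes, spaced `stride` bytes
    apart starting at `offset`, into one contiguous buffer -- the effect
    of packing a vector datatype."""
    out = bytearray()
    for i in range(count):
        start = offset + i * stride
        out += buf[start:start + blocklen]
    return bytes(out)

data = bytes(range(16))
# Blocks of 2 bytes taken every 4 bytes: picks bytes 0-1, 4-5, 8-9, 12-13.
print(pack_strided(data, 0, 2, 4, 4))
```

Unpacking is the inverse scatter; “transpacking” in the abstract refers to converting directly between two noncontiguous layouts without an intermediate contiguous buffer.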
Automated application-level checkpointing of MPI programs
- In PPoPP ’03: Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
, 2003
"... Because of increasing hardware and software complexity, the running time of many computational science applications is now more than the mean-time-to-failure of high-performance computing platforms. Therefore, computational science applications need to tolerate hardware failures. In this paper, we foc ..."
Abstract - Cited by 99 (15 self)
in the literature are not suitable for implementing this approach. In this paper, we present a suitable protocol, and show how it can be used with a precompiler that instruments C/MPI programs to save application and MPI library state. An advantage of our approach is that it is independent of the MPI implementation
HARNESS and fault tolerant MPI
, 2001
"... Initial versions of MPI were designed to work efficiently on multi-processors which had very little job control and thus static process models. Subsequently forcing them to support a dynamic process model would have affected their performance. As current HPC systems increase in size with greater potent ..."
Abstract - Cited by 35 (8 self)
MPI API. Given is an overview of the FT-MPI semantics, design, example applications, debugging tools and some performance issues. Also discussed is the experimental HARNESS core (G_HCORE) implementation that FT-MPI is built to operate upon.
The LAM/MPI checkpoint/restart framework: System-initiated checkpointing
- In Proceedings, LACSI Symposium, Santa Fe
, 2003
"... As high-performance clusters continue to grow in size and popularity, issues of fault tolerance and reliability are becoming limiting factors on application scalability. To address these issues, we present the design and implementation of a system for providing coordinated checkpointing and rollback ..."
Abstract - Cited by 109 (10 self)
for cluster maintenance and scheduling reasons as well as for fault tolerance. Experimental results show negligible communication performance impact due to the incorporation of the checkpoint support capabilities into LAM/MPI. 1
MPI-ACC: Accelerator-Aware MPI for Scientific Applications
"... Abstract—Data movement in high-performance computing systems accelerated by graphics processing units (GPUs) remains a challenging problem. Data communication in popular parallel programming models, such as the Message Passing Interface (MPI), is currently limited to the data stored in the CPU memor ..."
Abstract
communication-computation patterns in scientific applications from domains such as epidemiology simulation and seismology modeling, and we discuss the lessons learned. We present experimental results on a state-of-the-art cluster with hundreds of GPUs, and we compare the performance and productivity of MPI
Dynamic malleability in MPI applications
- In Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007)
, 2007
"... Malleability enables a parallel application’s execution system to split or merge processes modifying the parallel application’s granularity. While process migration is widely used to adapt applications to dynamic execution environments, it is limited by the granularity of the application’s processes ..."
Abstract - Cited by 3 (3 self)
processes. Malleability empowers process migration by allowing the application’s processes to expand or shrink following the availability of resources. We have implemented malleability as an extension to the PCM (Process Checkpointing and Migration) library, a user-level library for iterative MPI
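When a malleable application's process count changes, the data of an iterative computation must be repartitioned over the new set of processes. A minimal sketch of the block repartitioning step (the function and data layout are illustrative, not the PCM library's API):

```python
def partition(n_items, n_procs):
    """Block-partition n_items over n_procs: each rank gets a
    (start, stop) range; the first n_items % n_procs ranks get
    one extra item so sizes differ by at most one."""
    base, extra = divmod(n_items, n_procs)
    ranges, start = [], 0
    for r in range(n_procs):
        stop = start + base + (1 if r < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

# Shrinking from 4 to 3 processes redistributes the same 10 items:
print(partition(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
print(partition(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

Split and merge operations then move each item from its old owner's range to its new owner's range, which is what limits malleability overhead to data movement.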
Implementing MPI using Interrupts and Remote Copying for the AP1000/AP1000+
- Fujitsu Laboratories Ltd
, 1995
"... This paper documents an experimental MPI [1] library which has been built for both the AP1000 [3] and AP1000+ [2, 4] machines. Although the previous implementation of MPI [5, 6, 7] produced messaging performance that was almost identical to using CellOS calls, the library contained a number of unsat ..."
Abstract - Cited by 4 (0 self)
MPI Implementation of AKS Algorithm
"... Abstract—AKS algorithm is the first deterministic polynomial time algorithm for primality proving. This project uses MPI (Message Passing Interface) along with GNU MP (Multi Precision) library to implement a variant of AKS algorithm. Two different parallelization strategies have been implemented an ..."
Abstract
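AKS rests on the identity that n > 1 is prime iff (x + a)^n ≡ x^n + a (mod n) for a coprime to n. A sketch checking that identity directly via binomial coefficients; note this direct check is exponential in the size of n, and the actual AKS algorithm (and hence the parallelized variant above) gains polynomial time by reducing the congruence mod (x^r − 1, n) for a small set of a values:

```python
from math import comb, gcd

def agrawal_identity_holds(n, a=1):
    """Check (x + a)^n == x^n + a (mod n) coefficient by coefficient:
    every middle coefficient C(n, k) * a^(n-k) must vanish mod n, and
    the constant term a^n must equal a mod n.  Exponential-time sketch
    of the identity underlying AKS, not the AKS algorithm itself."""
    if n < 2 or gcd(a, n) != 1:
        return False
    if pow(a, n, n) != a % n:
        return False
    return all((comb(n, k) * pow(a, n - k, n)) % n == 0 for k in range(1, n))

print([m for m in range(2, 20) if agrawal_identity_holds(m)])
# -> [2, 3, 5, 7, 11, 13, 17, 19], exactly the primes below 20
```

Parallelization in the MPI variant comes naturally: the coefficient checks (or, in full AKS, the different values of a) are independent and can be distributed across ranks.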
Adaptive Multigrid Methods in MPI
- In Proceedings of EuroPVM/MPI
, 2000
"... Abstract. Adaptive multigrid methods solve partial differential equations through a discrete representation of the domain that introduces more points in those zones where the equation behavior is highly irregular. The distribution of the points changes at run time in a way that cannot be foreseen in ..."
Abstract - Cited by 1 (0 self)
the update of the mapping at run time to recover from an imbalance, together with strategies to acquire data mapped onto other processing nodes. An MPI implementation is presented together with some experimental results. 1
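Updating the mapping after refinement means reassigning zones whose point counts have grown unevenly. A toy sketch of one such rebalancing policy, greedy longest-processing-time assignment (the policy and names are illustrative, not the paper's scheme):

```python
def remap(weights, n_nodes):
    """Reassign refinement zones (with per-zone point counts in
    `weights`) to the currently least-loaded node, heaviest first.
    Returns the zone->node mapping and the resulting node loads."""
    loads = [0] * n_nodes
    owner = [None] * len(weights)
    for z in sorted(range(len(weights)), key=lambda z: -weights[z]):
        node = loads.index(min(loads))  # least-loaded node so far
        owner[z] = node
        loads[node] += weights[z]
    return owner, loads

# After refinement the zone point counts became uneven; remap over 3 nodes.
owner, loads = remap([50, 10, 40, 30, 20], 3)
print(owner, loads)  # loads come out perfectly balanced here: [50, 50, 50]
```

The cost the abstract alludes to is that any zone whose owner changes must have its points migrated from the old node, so a practical remapping also weighs balance gain against data movement.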