Results 1 - 10 of 558

Performance Analysis of MPI Collective Operations

by Thara Angskun, George Bosilca, Graham E. Fagg, Edgar Gabriel, Jack J. Dongarra - In: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS’05) - Workshop 15, 2005
"... Previous studies of application usage show that the performance of collective communications is critical for high performance computing and is often overlooked when compared to the point-to-point performance. In this paper we attempt to analyze and improve collective communication in the context ..."
Abstract - Cited by 78 (5 self)
of the widely deployed MPI programming paradigm by extending accepted models of point-to-point communication, such as Hockney, LogP/LogGP, and PLogP. The predictions from the models were compared to the experimentally gathered data and our findings were used to optimize the implementation of collective
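
As context for the point-to-point models named above, the Hockney model charges a start-up latency α plus a per-byte cost β for a message of m bytes. A common textbook-style extension, given here only as an illustration and not necessarily the cost expressions derived in this paper, applies the same terms level by level to a binomial-tree broadcast over p processes:

    T_{p2p}(m) = \alpha + \beta m
    T_{bcast}(m, p) \approx \lceil \log_2 p \rceil \, (\alpha + \beta m)

LogP/LogGP and PLogP refine such estimates by separating sender and receiver overhead from the per-message gap and, in PLogP, making these parameters functions of the message size.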

Processing MPI Datatypes outside MPI

by Robert Ross, Robert Latham, Ewing Lusk, Rajeev Thakur
"... Abstract. The MPI datatype functionality provides a powerful tool for describing structured memory and file regions in parallel applications, enabling noncontiguous data to be operated on by MPI communication and I/O routines. However, no facilities are provided by the MPI standard to allow users to ..."
Abstract - Cited by 1 (0 self)
processing routines. We show the use of MPITypes in three examples: copying data between user buffers and a “pack” buffer, encoding of data in a portable format, and transpacking. Our experimental evaluation shows that the implementation achieves rates comparable to existing MPI implementations.
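
The “pack” buffer example mentioned above can be sketched with the standard MPI pack interface; the MPITypes API itself is not reproduced here. A minimal sketch, assuming a strided column of a small row-major matrix described with MPI_Type_vector:

    /* Copy noncontiguous data (one matrix column) into a contiguous
       "pack" buffer using standard MPI calls, not the MPITypes API. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int matrix[16];                       /* 4x4 row-major matrix */
        for (int i = 0; i < 16; i++) matrix[i] = i;

        MPI_Datatype column;                  /* 4 elements, stride 4: one column */
        MPI_Type_vector(4, 1, 4, MPI_INT, &column);
        MPI_Type_commit(&column);

        int packsize = 0, position = 0;
        MPI_Pack_size(1, column, MPI_COMM_WORLD, &packsize);
        char *packbuf = malloc(packsize);

        MPI_Pack(matrix, 1, column, packbuf, packsize, &position, MPI_COMM_WORLD);
        printf("packed %d bytes into a contiguous buffer\n", position);

        free(packbuf);
        MPI_Type_free(&column);
        MPI_Finalize();
        return 0;
    }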

Automated application-level checkpointing of MPI programs

by Greg Bronevetsky, Daniel Marques, Keshav Pingali, Paul Stodghill - In PPoPP ’03: Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, 2003
"... Because of increasing hardware and software complexity, the running time of many computational science applications is now more than the mean-time-to-failure of highpeformance computing platforms. Therefore, computational science applications need to tolerate hardware failures. In this paper, we foc ..."
Abstract - Cited by 99 (15 self) - Add to MetaCart
in the literature are not suitable for implementing this approach. In this paper, we present a suitable protocol, and show how it can be used with a precompiler that instruments C/MPI programs to save application and MPI library state. An advantage of our approach is that it is independent of the MPI implementation
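
The cited protocol also handles in-flight MPI messages; the sketch below shows only the simpler application-level half of the idea, with each rank periodically writing its own state to a per-rank file. Names such as save_checkpoint are illustrative and are not output of the paper's precompiler:

    /* Hand-written application-level checkpointing sketch: each rank dumps
       its state at a barrier-synchronized step. No message logging or
       protocol coordination from the cited work is included. */
    #include <mpi.h>
    #include <stdio.h>

    static void save_checkpoint(int rank, int step, const double *state, int n)
    {
        char path[64];
        snprintf(path, sizeof path, "ckpt_rank%d.bin", rank);
        FILE *f = fopen(path, "wb");
        if (!f) return;
        fwrite(&step, sizeof step, 1, f);
        fwrite(state, sizeof *state, (size_t)n, f);
        fclose(f);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double state[1024] = {0};
        for (int step = 0; step < 100; step++) {
            /* ...computation and MPI communication would go here... */
            if (step % 10 == 0) {
                MPI_Barrier(MPI_COMM_WORLD);     /* naive coordination point */
                save_checkpoint(rank, step, state, 1024);
            }
        }

        MPI_Finalize();
        return 0;
    }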

HARNESS and fault tolerant MPI

by Graham E. Fagg, Antonin Bukovsky, Jack J. Dongarra, 2001
"... Initial versions of MPI were designed to work efficiently on multi-processors which had very little job control and thus static process models. Subsequently forcing them to support a dynamic process model would have affected their performance. As current HPC systems increase in size with greater potent ..."
Abstract - Cited by 35 (8 self)
MPI API. Given is an overview of the FT-MPI semantics, design, example applications, debugging tools and some performance issues. Also discussed is the experimental HARNESS core (G_HCORE) implementation that FT-MPI is built to operate upon.
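
FT-MPI's extended semantics are not reproduced here; the sketch below shows only a standard MPI hook that fault-tolerance layers relate to, namely switching a communicator from abort-on-error to returning error codes so the application can react:

    /* Standard MPI error handling (not the FT-MPI API): request error
       codes instead of the default abort, then check the result. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int token = rank;
        int rc = MPI_Bcast(&token, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (rc != MPI_SUCCESS) {
            char msg[MPI_MAX_ERROR_STRING];
            int len = 0;
            MPI_Error_string(rc, msg, &len);
            fprintf(stderr, "rank %d: broadcast failed: %s\n", rank, msg);
            /* a fault-tolerant MPI could rebuild the communicator here */
        }

        MPI_Finalize();
        return 0;
    }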

The LAM/MPI checkpoint/restart framework: System-initiated checkpointing

by Sriram Sankaran, Jeffrey M. Squyres, Brian Barrett, Andrew Lumsdaine - in Proceedings, LACSI Symposium, Santa Fe, 2003
"... As high-performance clusters continue to grow in size and popularity, issues of fault tolerance and reliability are becoming limiting factors on application scalability. To address these issues, we present the design and implementation of a system for providing coordinated checkpointing and rollback ..."
Abstract - Cited by 109 (10 self)
for cluster maintenance and scheduling reasons as well as for fault tolerance. Experimental results show negligible communication performance impact due to the incorporation of the checkpoint support capabilities into LAM/MPI.

MPI-ACC: Accelerator-Aware MPI for Scientific Applications

by Ashwin M. Aji, Lokendra S. Panwar, Feng Ji, Karthik Murthy, Milind Chabbi, Pavan Balaji, Keith R. Bisset, James Dinan, Wu-chun Feng, John Mellor-crummey, Xiaosong Ma, Rajeev Thakur
"... Abstract—Data movement in high-performance computing systems accelerated by graphics processing units (GPUs) remains a challenging problem. Data communication in popular parallel programming models, such as the Message Passing Interface (MPI), is currently limited to the data stored in the CPU memor ..."
Abstract
communication-computation patterns in scientific applications from domains such as epidemiology simulation and seismology modeling, and we discuss the lessons learned. We present experimental results on a state-of-the-art cluster with hundreds of GPUs; and we compare the performance and productivity of MPI
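
The CPU-memory limitation described in this entry is usually worked around by manually staging device data through host buffers; the sketch below shows that staging pattern, assuming the CUDA runtime API as the GPU interface, and is not code from MPI-ACC. An accelerator-aware MPI can remove this staging by handling device buffers inside the library.

    /* Manual staging of GPU data through host memory around MPI calls.
       Buffer size and ping-pong pattern are illustrative only. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const size_t n = 1 << 20;
        double *host_buf = malloc(n * sizeof *host_buf);
        double *dev_buf;
        cudaMalloc((void **)&dev_buf, n * sizeof *dev_buf);

        if (rank == 0) {
            /* copy device data to the host, then send it */
            cudaMemcpy(host_buf, dev_buf, n * sizeof *host_buf, cudaMemcpyDeviceToHost);
            MPI_Send(host_buf, (int)n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(host_buf, (int)n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            cudaMemcpy(dev_buf, host_buf, n * sizeof *host_buf, cudaMemcpyHostToDevice);
        }
        /* An accelerator-aware MPI library performs this staging, or a
           direct GPU-to-GPU transfer, internally. */

        cudaFree(dev_buf);
        free(host_buf);
        MPI_Finalize();
        return 0;
    }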

Dynamic malleability in MPI applications

by Kaoutar El Maghraoui, Travis J. Desell, Boleslaw K. Szymanski, Carlos A. Varela - In Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007
"... Malleability enables a parallel application’s execution system to split or merge processes modifying the parallel application’s granularity. While process migration is widely used to adapt applications to dynamic execution environments, it is limited by the granularity of the application’s processes ..."
Abstract - Cited by 3 (3 self)
processes. Malleability empowers process migration by allowing the application’s processes to expand or shrink following the availability of resources. We have implemented malleability as an extension to the PCM (Process Checkpointing and Migration) library, a user-level library for iterative MPI
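
The malleability described above is implemented in the PCM library; as a generic illustration of the underlying capability only, not PCM's interface, standard MPI dynamic process management can grow a running job. "./worker" below is a hypothetical executable name:

    /* Expand an MPI job at run time with MPI_Comm_spawn and merge the new
       processes into one intracommunicator (generic sketch, not the PCM API). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Comm children;
        int errcodes[2];
        MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                       0, MPI_COMM_WORLD, &children, errcodes);

        MPI_Comm everyone;                    /* parents + children together */
        MPI_Intercomm_merge(children, 0, &everyone);

        int size;
        MPI_Comm_size(everyone, &size);
        if (rank == 0) printf("application now spans %d processes\n", size);

        MPI_Comm_free(&everyone);
        MPI_Comm_free(&children);
        MPI_Finalize();
        return 0;
    }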

Implementing MPI using Interrupts and Remote Copying for the AP1000/AP1000+

by David Sitsky - Fujitsu Laboratories Ltd., 1995
"... This paper documents an experimental MPI [1] library which has been built for both the AP1000 [3] and AP1000+ [2, 4] machines. Although the previous implementation of MPI [5, 6, 7] produced messaging performance that was almost identical to using CellOS calls, the library contained a number of unsat ..."
Abstract - Cited by 4 (0 self)

MPI Implementation of AKS Algorithm

by Jayasimha T
"... Abstract—AKS algorithm is the first deterministic polynomial time algorithm for primality proving. This project uses MPI (Message Passing Interface) along with GNU MP (Multi Pre-cision) library to implement a variant of AKS algorithm. Two different parallelization strategies have been implemented an ..."
Abstract - Add to MetaCart
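
One natural way to parallelize the witness loop of AKS, offered here as an assumption rather than a description of the two strategies the project actually implemented, is to split the candidate values of a across ranks and combine the verdicts with a reduction. The congruence test is left as a stub: is_aks_witness is a hypothetical placeholder, and the GNU MP polynomial arithmetic is omitted.

    /* Distribution pattern only: values of 'a' are split cyclically across
       ranks; is_aks_witness stands in for the modular polynomial check
       (X+a)^n ?= X^n + a (mod X^r - 1, n). */
    #include <mpi.h>
    #include <stdio.h>

    static int is_aks_witness(long n, long a)
    {
        (void)n; (void)a;
        return 0;                     /* stub: 1 would mean n is composite */
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        long n = 1000003;             /* number under test (illustrative) */
        long a_max = 2000;            /* bound normally derived from r and log n */

        int local_witness = 0;
        for (long a = 1 + rank; a <= a_max; a += size)   /* cyclic split */
            if (is_aks_witness(n, a)) { local_witness = 1; break; }

        int witness = 0;
        MPI_Reduce(&local_witness, &witness, 1, MPI_INT, MPI_LOR, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("%ld: %s\n", n, witness ? "composite" : "no AKS witness found");

        MPI_Finalize();
        return 0;
    }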

Adaptive Multigrid Methods in MPI

by Fabrizio Baiardi, Sarah Chiti, Paolo Mori, Laura Ricci - In Proceedings of Euro PVM/MPI, 2000
"... Abstract. Adaptive multigrid methods solve partial differential equations through a discrete representation of the domain that introduces more points in those zones where the equation behavior is highly irregular. The distribution of the points changes at run time in a way that cannot be foreseen in ..."
Abstract - Cited by 1 (0 self)
the update of the mapping at run time to recover from an imbalance, together with strategies to acquire data mapped onto other processing nodes. An MPI implementation is presented together with some experimental results.