Results 1 - 10 of 654
NAMD2: Greater Scalability for Parallel Molecular Dynamics
- Journal of Computational Physics, 1998
Cited by 322 (45 self)
Molecular dynamics programs simulate the behavior of biomolecular systems, leading to insights and understanding of their functions. However, the computational complexity of such simulations is enormous. Parallel machines provide the potential to meet this computational challenge. To harness this potential, it is necessary to develop a scalable program. It is also necessary that the program be easily modified by application-domain programmers. ...
Evaluating the viability of process replication reliability for exascale systems
- in Proceedings of the ACM/IEEE International Conference on High Performance Computing, Networking, Storage, and Analysis, 2011
Cited by 71 (7 self)
As high-end computing machines continue to grow in size, issues such as fault tolerance and reliability limit application scalability. Current techniques to ensure progress across faults, like checkpoint/restart, are increasingly problematic at these scales due to excessive overheads predicted to more than double an application's time to solution. Replicated computing techniques, particularly state machine replication, long used in distributed and mission-critical systems, have been suggested as an alternative to checkpoint/restart. In this paper, we evaluate the viability of using state machine replication as the primary fault tolerance mechanism for upcoming exascale systems. We use a combination of modeling, empirical analysis, and simulation to study the costs and benefits of this approach in comparison to checkpoint/restart on a wide range of system parameters. These results, which cover different failure distributions, hardware mean times to failure, and I/O bandwidths, show that state machine replication is a potentially useful technique for meeting the fault tolerance demands of HPC applications on future exascale platforms.
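The checkpoint/restart-versus-replication trade-off described above can be illustrated with a toy efficiency model. This is a back-of-the-envelope sketch, not the paper's model: it assumes Daly's first-order optimal checkpoint interval and the 50% efficiency ceiling of dual redundancy, and the function names and parameter values are illustrative only.

```python
import math

def ckpt_efficiency(mtbf_hours: float, ckpt_cost_hours: float) -> float:
    """Approximate useful-work fraction under checkpoint/restart, using
    Daly's first-order optimal interval tau = sqrt(2 * delta * M)."""
    tau = math.sqrt(2.0 * ckpt_cost_hours * mtbf_hours)
    # Overhead = time spent writing checkpoints + expected rework after a failure.
    overhead = ckpt_cost_hours / tau + tau / (2.0 * mtbf_hours)
    return max(0.0, 1.0 - overhead)

def replication_efficiency() -> float:
    """Dual redundancy dedicates half the nodes to replicas: at most 50% efficiency."""
    return 0.5

# As systems grow, aggregate MTBF shrinks; replication becomes attractive
# once checkpoint/restart efficiency drops below 0.5.
for mtbf in (100.0, 10.0, 1.0, 0.25):  # system MTBF in hours
    print(f"MTBF={mtbf:6.2f} h  ckpt/restart efficiency={ckpt_efficiency(mtbf, 0.1):.3f}")
```

With a 6-minute checkpoint cost, the toy model's crossover sits between an MTBF of one hour and fifteen minutes, which is the qualitative regime the paper studies with far more careful modeling.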
Scalable algorithms for molecular dynamics simulations on commodity clusters
- in SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, 2006
Cited by 68 (5 self)
Although molecular dynamics (MD) simulations of biomolecular systems often run for days to months, many events of great scientific interest and pharmaceutical relevance occur on long time scales that remain beyond reach. We present several new algorithms and implementation techniques that significantly accelerate parallel MD simulations compared with current state-of-the-art codes. These include a novel parallel decomposition method and message-passing techniques that reduce communication requirements, as well as novel communication primitives that further reduce communication time. We have also developed numerical techniques that maintain high accuracy while using single-precision computation in order to exploit processor-level vector instructions. These methods are embodied in a newly developed MD code called Desmond that achieves unprecedented simulation throughput and parallel scalability on commodity clusters. Our results suggest that Desmond's parallel performance substantially surpasses that of any previously described code. For example, on a standard benchmark, Desmond's performance on a conventional Opteron cluster with 2K processors slightly exceeded the reported performance of IBM's Blue Gene/L machine with 32K processors running its Blue Matter MD code.
Anton: A Special-Purpose Machine for Molecular Dynamics Simulation
- in Proc. 34th International Symposium on Computer Architecture (ISCA '07), 2007
Cited by 65 (8 self)
The ability to perform long, accurate molecular dynamics (MD) simulations involving proteins and other biological macromolecules could in principle provide answers to some of the most important currently outstanding questions in the fields of biology, chemistry, and medicine. A wide range of biologically interesting phenomena, however, occur over time scales on the order of a millisecond—about three orders of magnitude beyond the duration of the longest current MD simulations. In this paper, we describe a massively parallel machine called Anton, which should be capable of executing millisecond-scale classical MD simulations of such biomolecular systems. The machine, which is scheduled for completion by the end of 2008, is based on 512 identical MD-specific ASICs that interact in a tightly coupled manner using a specialized high-speed communication ...
Parallelizing Molecular Dynamics Programs for Distributed Memory Machines: An Application of the CHAOS Runtime Support Library
- 1994
Cited by 50 (6 self)
CHARMM (Chemistry at Harvard Macromolecular Mechanics) is a program that is widely used to model and simulate macromolecular systems. CHARMM has been parallelized by using the CHAOS runtime support library on distributed memory architectures. This implementation distributes both data and computations over processors. This data-parallel strategy should make it possible to simulate very large molecules on large numbers of processors. In order to ...
A New Parallel Method for Molecular Dynamics Simulation of Macromolecular Systems
- 1994
Cited by 39 (3 self)
Short-range molecular dynamics simulations of molecular systems are commonly parallelized by replicated-data methods, where each processor stores a copy of all atom positions. This enables computation of bonded 2-, 3-, and 4-body forces within the molecular topology to be partitioned among processors straightforwardly. A drawback to such methods is that the inter-processor communication scales as N, the number of atoms, independent of P, the number of processors. Thus, their parallel efficiency falls off rapidly when large numbers of processors are used. In this article a new parallel method for simulating macromolecular or small-molecule systems is presented, called force-decomposition. Its memory and communication costs scale as N/√P, allowing larger problems to be run faster on greater numbers of processors. Like replicated-data techniques, and in contrast to spatial-decomposition approaches, the new method can be simply load-balanced and performs well even ...
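The scaling contrast in the abstract above (O(N) per processor for replicated-data versus O(N/√P) for force-decomposition) can be checked in a few lines. This is purely illustrative: the function names are mine and constant factors are ignored.

```python
import math

def replicated_data_comm(n_atoms: int, n_procs: int) -> float:
    """Per-processor communication volume (in atoms) for replicated-data
    methods: proportional to N, independent of P."""
    return float(n_atoms)

def force_decomp_comm(n_atoms: int, n_procs: int) -> float:
    """Per-processor communication volume (in atoms) for force-decomposition:
    proportional to N / sqrt(P)."""
    return n_atoms / math.sqrt(n_procs)

n = 100_000  # atoms
for p in (16, 64, 256):
    print(f"P={p:4d}  replicated-data={replicated_data_comm(n, p):9.0f}"
          f"  force-decomp={force_decomp_comm(n, p):8.0f}")
```

Quadrupling the processor count halves the force-decomposition communication per processor, while the replicated-data volume stays flat, which is exactly why its parallel efficiency "falls off rapidly" at scale.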
Parallelizing molecular dynamics using spatial decomposition
- in Scalable High Performance Computing Conference, IEEE, 1994
Lightweight Computational Steering of Very Large Scale Molecular Dynamics Simulations
- 1996
Cited by 30 (6 self)
We present a computational steering approach for controlling, analyzing, and visualizing very large scale molecular dynamics simulations involving tens to hundreds of millions of atoms. Our approach relies on extensible scripting languages and an easy-to-use tool for building extensions and modules. The system is easy to modify, works with existing C code, is memory efficient, and can be used from inexpensive workstations over standard Internet connections. We demonstrate how we have been able to explore data from production MD simulations involving as many as 104 million atoms running on the CM-5 and Cray T3D. We also show how this approach can be used to integrate common scripting languages (including Python, Tcl/Tk, and Perl), simulation code, user extensions, and commercial data analysis packages.
Characterization of scientific workloads on systems with multi-core processors
- in IISWC, 2006
Cited by 24 (0 self)
Multi-core processors are planned for virtually all next-generation HPC systems. In a preliminary evaluation of AMD Opteron Dual-Core processor systems, we investigated the scaling behavior of a set of micro-benchmarks, kernels, and applications. In addition, we evaluated a number of processor affinity techniques for managing memory placement on these multi-core systems. We discovered that an appropriate selection of MPI task and memory placement schemes can result in over 25% performance improvement for key scientific calculations. We collected detailed performance data for several large-scale scientific applications. Analyses of the application performance results confirmed our micro-benchmark and scaling results. Keywords: performance characterization, multi-core processor, AMD Opteron, micro-benchmarking, scientific applications.
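The processor-affinity techniques evaluated above are usually applied through an MPI launcher's binding options, but the underlying mechanism can be sketched at process level with Python's standard library. This is a Linux-specific illustration, not the paper's method:

```python
import os

# Sketch: pin this process to its lowest-numbered allowed core, then restore.
allowed = os.sched_getaffinity(0)   # set of CPUs the scheduler may currently use
one_cpu = {min(allowed)}            # choose a single core to bind to
os.sched_setaffinity(0, one_cpu)    # pin: this process now runs only on that core
pinned = os.sched_getaffinity(0)
os.sched_setaffinity(0, allowed)    # restore the original mask
print(f"was {sorted(allowed)}, pinned to {sorted(pinned)}")
```

Binding each MPI rank to a fixed core (and allocating its memory on the local NUMA node) is the kind of placement scheme the authors found worth over 25% on their dual-core Opteron systems.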
Demonstrating the Scalability of a Molecular Dynamics Application on a Petaflop Computer
- International Journal of Parallel Programming, 2001
Cited by 24 (3 self)
The IBM Blue Gene project has endeavored to develop a cellular architecture computer with millions of concurrent threads of execution. One of the major challenges of this project is demonstrating that applications can successfully exploit this massive amount of parallelism. Starting from the sequential version of a well-known molecular dynamics code, we developed a new application that exploits the multiple levels of parallelism in the Blue Gene cellular architecture. We perform both analytical and simulation studies of the behavior of this application when executed on a very large number of threads. As a result, we demonstrate that this class of applications can execute efficiently on a large cellular machine.