T. W. Clark, R. van Hanxleden, J. A. McCammon, and L. Ridgway Scott. Parallelization strategies for a molecular dynamics program. In Intel Technology Focus Conference Proceedings, April 1992.
....dominate, since the integration is relatively cheap to carry out. Nevertheless, the terms of the sum of interactions $U(x_1, x_2, \ldots, x_N)$ are independent. Altogether, this makes MD to a certain extent inherently parallel. This has been exploited by several parallel MD programs [4, 14, 22, 23, 32, 53, 77, 81, 86, 91, 98, 110, 124, 125]. When parallelizing MD, there are two important considerations. First, the MD program must perform well for a small number of particles, i.e. fewer than 1000. There is considerable interest in carrying out simulations with a few thousand particles over a long time scale, e.g. a protein ....
....communication and load balancing difficulties, especially for a large number of processors. Atom decomposition based on data replication is an easy but memory-expensive approach. It has poor scaling properties due to global communication [91]. Programs using this decomposition include UHGromos [22], Amber [121], CHARMM [15], Moldy [98], and an early version of EGO [31]. Systolic or hypersystolic loop algorithms [78, 114] are a possible remedy to reduce the memory usage and to improve the scaling. Force decomposition involves either force matrix or systolic loop methods. It scales better ....
[Article contains additional citation context not shown here]
T. W. Clark, R. van Hanxleden, J. A. McCammon, and L. Ridgway Scott. Parallelization strategies for a molecular dynamics program. In Intel Technology Focus Conference Proceedings, April 1992.
....of different parallelization strategies for localized N-body methods running on MIMD multiprocessors. Some previous efforts with parallel particle calculations have concentrated on the parallelization of a particular program instead of a general software infrastructure. For example, Clark et al. [49, 50] implemented a parallel version of the GROMOS molecular dynamics application. Their approach uses non-uniform, dynamic partitions similar to our own which were implemented (with considerable effort) using a message-passing library. The parallelization of GROMOS would have been significantly easier ....
T. W. Clark, R. V. Hanxleden, J. A. McCammon, and L. R. Scott, Parallelization strategies for a molecular dynamics program, in Intel Technology Focus Conference Proceedings, April 1992.
....what the optimal compiler could produce and are therefore interesting for performance comparisons. The SPLASH benchmark suite [SWG92] for distributed memory computers contains many real-life applications, some of which can be converted for our system and used as examples. For the FORTRAN D system, [CvHMS92, vHKS93, vH92] examine techniques to facilitate the parallel execution of FORTRAN programs. They concentrate on efficient communication with data prefetch and redundancy elimination and gather the necessary data with inspector-executor loops. There is also a runtime system that performs spatial ....
Terry W. Clark, Reinhard v. Hanxleden, J. Andrew McCammon, and L. Ridgway Scott. Parallelization strategies for a molecular dynamics program. In Technology Focus Conference, Timberline Lodge 1992, Houston, April 5--7 1992. Intel University Partners Program, Supercomputer Systems Division.
....depends on the density of particles, which can vary to a large extent during simulation. Therefore, the most cost-efficient number of processors to use for each routine is different and variable. Different approaches to the parallelisation of an MD program are described in the work of Hanxleden [2]. As part of our work, we parallelised the sequential MD program ARGOS [7, 8] on an SGI Power Challenge, a shared-memory machine with 16 R8000 processors [6].
3 Dynamically Adapting the Degree of Parallelism
Traditional load balancing methods redistribute the amount of work among all ....
R. v. Hanxleden, T. W. Clark, J. A. McCammon, L. R. Scott, Parallelization Strategies for a Molecular Dynamics Program, Intel Technology Focus Conf. Proc., 1992
....calculation depends on the density of particles, which can vary to a large extent during simulation. Therefore, the most cost-efficient number of processors to use for each routine is different. Different approaches to the parallelisation of an MD program are described in the work of Hanxleden [2]. As part of our work, we parallelised the sequential MD program ARGOS [8, 7] on an SGI Power Challenge, a shared-memory machine with 16 R8000 processors [6].
3 Dynamically Adapting the Degree of Parallelism
Traditional load balancing methods redistribute the amount of work among all ....
R. v. Hanxleden, T. W. Clark, J. A. McCammon, L. R. Scott, Parallelization Strategies for a Molecular Dynamics Program, Intel Technology Focus Conf. Proc., 1992
....used by many loops. The following two subsections describe the phases.
Table 2: Common Partitioning Heuristics
Partitioner | Reference | Spatial Information | Connectivity Information | Vertex Weight | Edge Weight
Spectral Bisection | [39] | p p p
Coordinate Bisection | [3] | p p
Hierarchical Decomposition | [10] | p p
Simulated Annealing | [27] | p p p
Neural Network | [27] | p p p
Genetic Algorithms | [27] | p p p
Inertial Bisection | [31] | p p
Kernighan Lin | [22] | p p p
4.1 Data Partitioning
When distributed arrays are partitioned, loop iterations have not yet been assigned to processors. Assume that loop ....
....For instance, sometimes it is important to take estimated computational costs into account when carrying out coordinate or inertial bisection for problems where computational costs vary greatly from node to node. Other partitioners make use of both geometrical and connectivity information [10]. Since the data structure that stores information on which data partitioning is to be based can represent Geometrical, Connectivity and/or Load information, it is called the GeoCoL data structure. More formally, a GeoCoL graph G = (V, E, W_v, W_e, C) consists of 1. a set of vertices V = {v_1, ....
T. W. Clark, R. v. Hanxleden, J. A. McCammon, and L. R. Scott. Parallelization strategies for a molecular dynamics program. In Intel Supercomputer University Partners Conference, Timberline Lodge, Mt. Hood, OR, April 1992.
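The GeoCoL description in the excerpt above can be illustrated with a minimal Python sketch. It is not the cited authors' implementation: the class name `GeoCoL`, its field names, and the function `weighted_coordinate_bisection` are hypothetical, and reading C as per-vertex coordinates is an assumption, since the excerpt is cut off before C is defined. The bisection shows one simple way to take estimated computational costs (vertex weights) into account when cutting along a coordinate axis.

```python
from dataclasses import dataclass, field


@dataclass
class GeoCoL:
    """Hypothetical container for a GeoCoL graph G = (V, E, W_v, W_e, C)."""
    vertices: list                                      # V: vertex ids
    edges: list = field(default_factory=list)           # E: (u, v) pairs
    vertex_weight: dict = field(default_factory=dict)   # W_v: estimated cost per vertex
    edge_weight: dict = field(default_factory=dict)     # W_e: weight per edge
    coords: dict = field(default_factory=dict)          # C: assumed to be coordinates


def weighted_coordinate_bisection(g, axis=0):
    """Cut the vertices along one axis so both sides carry similar total weight.

    Vertices without an entry in vertex_weight count with weight 1.0.
    """
    order = sorted(g.vertices, key=lambda v: g.coords[v][axis])
    total = sum(g.vertex_weight.get(v, 1.0) for v in order)
    left, acc = [], 0.0
    for idx, v in enumerate(order):
        w = g.vertex_weight.get(v, 1.0)
        # Stop before the left side would exceed half of the total weight.
        if acc + w > total / 2 and left:
            return left, order[idx:]
        left.append(v)
        acc += w
    return left, []


g = GeoCoL(
    vertices=[0, 1, 2, 3],
    vertex_weight={0: 3.0, 1: 1.0, 2: 1.0, 3: 1.0},
    coords={0: (0.0,), 1: (1.0,), 2: (2.0,), 3: (3.0,)},
)
left, right = weighted_coordinate_bisection(g)
```

With the weights above, the heavy vertex 0 ends up alone on one side, whereas an unweighted cut would split the four vertices two and two.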
....At2 = partners(At1, pr)
F(At1) = F(At1) + Force(At1, At2)
ENDDO
ENDDO
Figure 13: F90SIMD version of the nonbonded force calculation NBFORCE.
irregular problems [2, 19, 22, 23]. One example is the GROMOS molecular dynamics program, which contains several interesting kernels of this kind [6, 7, 10]. Here we want to focus on the calculation of the nonbonded forces between individual pairs of atoms.
5.1 The application
Since the nonbonded forces between pairs of atoms quickly decrease as the distances between them increase, they are usually approximated by considering only pairs of atoms ....
T. W. Clark, R. v. Hanxleden, J. A. McCammon, and L. R. Scott. Parallelization strategies for a molecular dynamics program. In Intel Supercomputer University Partners Conference, Timberline Lodge, Mt. Hood, OR, April 1992.
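The cutoff approximation and the partner-list loop quoted in the excerpt above can be sketched in Python as follows. This is an illustration only, not the GROMOS kernel: `build_partners`, `pair_force`, and `nbforce` are hypothetical names, the pair list is built with a plain O(N^2) scan, and `pair_force` is a placeholder inverse-square repulsion standing in for the real nonbonded terms.

```python
import math


def build_partners(coords, cutoff):
    """partners[i] lists every j > i whose distance to atom i is within the cutoff."""
    n = len(coords)
    partners = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                partners[i].append(j)
    return partners


def pair_force(ri, rj):
    """Placeholder force on atom i due to atom j (assumed: repulsive, magnitude 1/r^2)."""
    d = [a - b for a, b in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in d))
    return [c / r**3 for c in d]


def nbforce(coords, partners):
    """Accumulate pairwise forces over the partner list."""
    forces = [[0.0, 0.0, 0.0] for _ in coords]
    for i, plist in enumerate(partners):
        for j in plist:
            f = pair_force(coords[i], coords[j])
            for k in range(3):
                forces[i][k] += f[k]
                forces[j][k] -= f[k]  # equal and opposite force on the partner
    return forces


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
partners = build_partners(coords, cutoff=2.0)
forces = nbforce(coords, partners)
```

Because each pair is stored once (j > i), the sketch updates both atoms per pair. The number of partners per atom varies with local density, which is what makes the loop irregular.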
....a compiler will choose a loop iteration partitioning scheme; e.g. partitioning loops so as to minimize non-local distributed array references.
Table 2: Common Partitioning Heuristics
Partitioner | Reference | Spatial Information | Connectivity Information | Vertex Weight | Edge Weight
Spectral Bisection | [37] | p p p
Coordinate Bisection | [3] | p p
Hierarchical Subbox Decomposition | [11] | p p
Simulated Annealing | [29] | p p p
Neural Network | [29] | p p p
Genetic Algorithms | [29] | p p p
Inertial Bisection | [33] | p p
Kernighan Lin | [24] | p p p
Our approach to data partitioning makes an implicit ....
....instance, we find that it is sometimes important to take estimated computational costs into account when carrying out coordinate or inertial bisection for problems where computational costs vary greatly from node to node. Other partitioners make use of both geometrical and connectivity information [11]. Since the data structure that stores information on which data partitioning is to be based can represent Geometrical, Connectivity and/or Load information, we call this the GeoCoL data structure. More formally, a GeoCoL graph G = (V, E, W_v, W_e, C) consists of 1. a set of vertices V = {v_1, ....
T. W. Clark, R. v. Hanxleden, J. A. McCammon, and L. R. Scott. Parallelization strategies for a molecular dynamics program. In Intel Supercomputer University Partners Conference, Timberline Lodge, Mt. Hood, OR, April 1992.
....structures, common to older Fortran programs, is also apparent in Gromos.
3 UHGROMOS
Over the past several years, a parallel implementation of promd has been developed by Clark, McCammon, Scott, and v. Hanxleden of the University of Houston and the Texas Center for Advanced Molecular Computation [2, 4, 3]. Their work has resulted in two parallel implementations: UHGromos and EulerGromos. These implementations differ primarily in the approach to problem decomposition; UHGromos uses an atom-based decomposition approach with replicated data structures while EulerGromos employs a spatial ....
T. W. Clark, R. v. Hanxleden, J. A. McCammon, and L. Ridgway Scott. Parallelization strategies for a molecular dynamics program. In Intel Technology Focus Conference Proceedings, April 1992.
....also be represented as a triplet of mapping functions $(\delta_x, \delta_y, \delta_z)$. These, however, are not all one-dimensional any more; instead, it is $\delta(i, j, k) = \delta_x(i, \delta_y(j, \delta_z(k)))$. Here $\delta$ can be represented with $(S - 1) + (R - 1)S + (Q - 1)RS = P - 1$ integers [CHMS92]; see also Section A.4.1. A general mapping function cannot be represented as a simple composition of subfunctions; i.e. we cannot decouple any of the dimensions from each other. A special case here are recursive mapping functions like the orthogonal recursive bisection shown in Figure 4, ....
....are of type 1 and 3 (see Section 1.1). As described in Appendix A, this project already has led to valuable insights into which of the language concepts well proven for regular problems carry over easily into the irregular world and which concepts have to be modified or extended [CHK 92, CHMS92].
3.3 Communication Analysis for Irregular Problems
One issue related to value-based decompositions, and to irregular decompositions in general, is how to generate the necessary communication for accessing the data distributed this way. We decided to use the Parti communication routines which ....
T. W. Clark, R. v. Hanxleden, J. A. McCammon, and L. R. Scott. Parallelization strategies for a molecular dynamics program. In Intel Supercomputer University Partners Conference, Timberline Lodge, Mt. Hood, OR, April 1992.
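The integer count quoted in the mapping-function excerpt above can be checked with a short Python sketch. It rests on assumptions not confirmed by the excerpt: that the processor count is P = Q·R·S, that the mapping is nested so the cut positions in one dimension depend on the partition already chosen in the previous one, and that cuts are stored as sorted integer boundaries. The names `boundary_count` and `make_delta` are hypothetical.

```python
from bisect import bisect_right


def boundary_count(Q, R, S):
    """Integers needed: S-1 cuts in z, R-1 cuts in y per z slab, Q-1 cuts in x per (z, y) column."""
    return (S - 1) + (R - 1) * S + (Q - 1) * R * S


def make_delta(z_cuts, y_cuts, x_cuts):
    """Build a nested mapping from grid index (i, j, k) to a processor number.

    z_cuts:        S-1 sorted boundaries
    y_cuts[s]:     R-1 sorted boundaries for z slab s
    x_cuts[s][r]:  Q-1 sorted boundaries for column (s, r)
    """
    R = len(y_cuts[0]) + 1
    Q = len(x_cuts[0][0]) + 1

    def delta(i, j, k):
        s = bisect_right(z_cuts, k)         # slab chosen from k alone
        r = bisect_right(y_cuts[s], j)      # y cut depends on the slab
        q = bisect_right(x_cuts[s][r], i)   # x cut depends on slab and column
        return (s * R + r) * Q + q          # processor number in 0 .. P-1

    return delta


# Q = R = S = 2, so P = 8 and the mapping needs 1 + 2 + 4 = 7 integers.
delta = make_delta(
    z_cuts=[4],
    y_cuts=[[2], [6]],
    x_cuts=[[[1], [3]], [[5], [7]]],
)
first = delta(0, 0, 0)
last = delta(9, 9, 9)
needed = boundary_count(2, 2, 2)
```

Under these assumptions the sum telescopes to Q·R·S - 1, which matches the P - 1 stated in the excerpt.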
....instance, we find that it is sometimes important to take estimated computational costs into account when carrying out coordinate or inertial bisection for problems where computational costs vary greatly from node to node. Other partitioners make use of both geometrical and connectivity information [5]. Since the data structure that stores information on which data partitioning is to be based can represent Geometrical, Connectivity and/or Load information, we call this the GeoCoL data structure.
4.1.2 Generating GeoCoL Data Structure
We propose a directive CONSTRUCT that can be employed to ....
T. W. Clark, R. v. Hanxleden, J. A. McCammon, and L. R. Scott. Parallelization strategies for a molecular dynamics program. In Intel Supercomputer University Partners Conference, Timberline Lodge, Mt. Hood, OR, April 1992.