Graham E. Fagg, Sathish S. Vadhiyar, Jack J. Dongarra. "ACCT: Automatic Collective Communications Tuning", Proc of EuroPVM-MPI 2000, Lecture Notes in Computer Science, Vol. 1908, pp354-361, Springer Verlag, 2000.
.... for particular architectures, such as hypercube, mesh, or fat tree, with an emphasis on minimizing link contention, node contention, or the distance between communicating nodes [2 4, 14]. More recently, Dongarra et al. have developed automatically tuned collective communication algorithms [5, 19]. Their approach consists of running tests to measure system parameters and then tuning their algorithms for those parameters. Researchers in Holland and at Argonne have optimized MPI collective communication for wide area distributed environments [8, 9]. In such environments, the goal is to ....
Graham E. Fagg, Sathish S. Vadhiyar, and Jack J. Dongarra. ACCT: Automatic collective communications tuning. In Jack Dongarra, Peter Kacsuk, and Norbert Podhorszki, editors, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pages 354-361. Lecture Notes in Computer Science 1908, Springer, September 2000.
Graham E. Fagg, Sathish S. Vadhiyar, Jack J. Dongarra. "ACCT: Automatic Collective Communications Tuning", Proc of EuroPVM-MPI 2000, Lecture Notes in Computer Science, Vol. 1908, pp354-361, Springer Verlag, 2000.
....LogP model. The MAGPIE model considered only a few network parameters for modeling collective communications. For example, it did not take into account the number of previously posted non-blocking sends, Isends, in determining the network parameters for a given message size. In our previous work [12, 13] we built efficient algorithms for different collective communications and selected the best collective algorithm and segment size for a given communication, number of processors, message .... [Footnote: This work was supported by the US Department of Energy through contract number DE-FG02-99ER25378.]
....Sparc and Pentium workstations and two different types of PowerPC-based IBM SP2 nodes. Fig. 1 shows the results for a tuned MPI broadcast on an IBM SP2 using thin nodes versus the IBM optimised vendor MPI implementation. Similar encouraging results were obtained for other systems, as detailed in [12, 13].
3 Reducing the Number of Experiments
In the experimental method described in the previous sections, a large number of individual experiments have to be conducted. Even though this only needs to occur once, the time taken for all these experiments was considerable and was approximately ....