| M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient collective communication on heterogeneous networks of workstations. In Proceedings of International Conference on Parallel Processing, pages 460--467, 1998. |
....can [Figure 8: A qualitative comparison of the three tree formation schemes (Base, Flat, Hierarchical) by response time and throughput.] improve response time by parallelizing the aggregation work on all non-leaf nodes. This is inspired by the tree-based reduction designs used in MPI [3, 10]. We further extend the previous work by investigating the dynamic formation of a reduction tree for each service call. The tree formation must be load-aware so that the total amount of work for local computation and aggregation is evenly distributed. This allows the response time to be minimized ....
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In ICPP, 1998.
....performance of collective algorithms for heterogeneous workstation clusters. The ECO package [LB96], built on top of PVM, automatically analyzes characteristics of heterogeneous networks to develop optimized communication patterns. Bhat, Raghavendra, and Prasanna [BRP99] extend the FNF algorithm [BMP98] and propose several new heuristics for collective operations. Their heuristics consider the effect that communication links with different latencies have on a system. Banikazemi et al. [BSP99] present a model for point-to-point communications in heterogeneous networks of workstations and use it to study the ....
M. Banikazemi, V. Moorthy, and D. Panda. "Efficient Collective Communication on Heterogeneous Networks of Workstations." In International Conference on Parallel Processing, pp. 460--467, 1998.
....main difference between the two models is that HCGM is not intended to be an accurate predictor of execution times whereas HBSP attempts to provide the developer with predictable algorithmic performance. Several research efforts focus on designing collective operations for heterogeneous systems [1, 13]. Additional work demonstrates the importance of collective communication operations for hierarchical networks. Husbands and Hoe [8] develop MPI-StarT, a system that efficiently implements collective routines for a cluster of SMPs. Kielmann et al. [12] present MagPie, a system that also handles a ....
M. Banikazemi, V. Moorthy, and D. Panda. Efficient collective communication on heterogeneous networks of workstations. In International Conference on Parallel Processing, pages 460--467, 1998.
....of specific architectures; cf. [12, 14, 23, 27]. The further difficulties caused by NOWs' asynchrony and loose coupling have led to yet more insulating models; cf. [13, 15] for dedicated NOWs and [4, 11] for borrowed NOWs. NOWs' algorithmic intransigence increases as NOWs lose their homogeneity [5, 25] and/or communication flatness [9, 10, 26]. Despite advances in algorithmic modeling, there is as yet no architectural model of the scope of [14] for (hyper)clusters. In this paper, we formulate a parameterized model of hyperclusters that enjoy the tri-axial generality described earlier and ....
....parameters. The three types of generality in the HiHCoHP model can exist independently in a NOW architecture. Clusters modeled by LogP [14] are single, node-homogeneous clusters, communicating over single networks. The NOWs of [5, 25] are single, node-heterogeneous clusters, communicating over single networks. [Footnote 2: (2.1) encompasses both pipelining and store-and-forward networks.] The NOWs of [9, 10, 26] embody the full generality of HiHCoHP, but with communication costs summarized by internode cost matrices. Our algorithms in Section 3 are formulated within hyperclusters that are general except ....
M. Banikazemi and D.K. Panda (2000): Efficient collective communication on heterogeneous networks of workstations. Tech. Rpt., Ohio State Univ.
....multiple times, depending on the network topology. Such techniques will not be efficient in wide area heterogeneous networks, since each point to point communication event incurs an additional communication cost. Further, this will also introduce extra network congestion. Recent research efforts [3] have investigated the problem of efficient broadcast and multicast in a network of heterogeneous workstations. The heterogeneity in the communication capabilities of the workstations was represented by associating a message initiation cost with each workstation. However, heterogeneity in the ....
....over the years. Communication libraries for frequently used patterns such as total exchange, one-to-all broadcast, all-to-all broadcast, and gather have been developed [1, 4, 18, 19]. However, collective communication in heterogeneous systems has not been investigated until very recently [11, 3]. The Efficient Collective Operations (ECO) [11] package was developed for networks of heterogeneous workstations. It implements the same functionality as the collective communication suite in the MPI standard. The ECO approach consists of first partitioning the network into subnets. A subnet ....
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient collective communication on heterogeneous networks of workstations. In Proc. Intl. Conf. Parallel Processing, pages 460-- 467, 1998.
....where parallel applications can be designed in a BSP-like manner. Their implementations target clusters of SMPs where communication steps are performed collectively. Others study collective operations in networks of heterogeneous workstations rather than hierarchically structured systems [6, 8, 32]. Compositions of collective operations are optimized in [23] but performance is studied only in the homogeneous case. Several metacomputing projects are currently building the infrastructure on top of which our MagPIe library may utilize distributed computing capacity. Globus [17, 18], Legion ....
M. Banikazemi, V. Moorthy, and D. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, Minneapolis, MN, Aug. 1998.
....Foster et al. [10, 11, 12] describe a wide area version of MPI using the Nexus multi threaded run time system. This work focuses on heterogeneity and interoperability issues. Our system also runs transparently on a LAN and a WAN, and, in addition, optimizes collective operations. Banikazemi et al. [4] investigate optimal communication structures for multicast operations on heterogeneous networks of workstations, focusing on processor speed. Our focus is network speed in wide area meta computers. Lowekamp et al. [20] describe a system that automatically analyzes characteristics of heterogeneous ....
M. Banikazemi, V. Moorthy, and D. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, Minneapolis, MN, 1998.
....multi-threaded run-time system. This work focuses on heterogeneity and interoperability issues. Our system also runs transparently on a LAN and a WAN, and, in addition, optimizes collective operations. PACX-MPI [16] implements some collective operations for clustered systems. Banikazemi et al. [5] investigate optimal communication structures for multicast operations on heterogeneous networks of workstations, focusing on processor speed. Our focus is network speed in wide-area metacomputers. Lowekamp et al. [23] describe a system that automatically analyzes characteristics of heterogeneous ....
M. Banikazemi, V. Moorthy, and D. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, Minneapolis, MN, 1998.
....complete set of MPI's collective operations with in-depth treatment of wide-area optimality and associativity of the reduction operations. Husbands et al. [20] report significantly improved performance on a cluster of SMPs with a handcrafted two-level implementation of MPI_Bcast. Banikazemi et al. [4] investigate optimal communication structures for multicast operations on heterogeneous networks of workstations, focusing on processor speed. Our focus is network speed in wide-area metacomputers. Lowekamp et al. [27] describe a system that automatically analyzes characteristics of heterogeneous ....
M. Banikazemi, V. Moorthy, and D. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, Minneapolis, MN, Aug. 1998.
....Foster et al. [10] describe a wide-area version of MPI using the Nexus multi-threaded run-time system. This work focuses on heterogeneity and interoperability issues. Our system also runs transparently on a LAN and a WAN, and, in addition, optimizes collective operations. Banikazemi et al. [4] investigate optimal communication structures for multicast operations on heterogeneous networks of workstations, focusing on processor speed. Our focus is network speed in wide-area metacomputers. Lowekamp et al. [19] describe a system that automatically analyzes characteristics of heterogeneous ....
M. Banikazemi, V. Moorthy, and D. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, Minneapolis, MN, 1998.
....communication operations for a given HNOW system. The existence of such a model also makes it possible to predict and evaluate the impact of an algorithm for a given collective operation on an HNOW system by simple analytical modeling instead of a detailed simulation. In our preliminary work along this direction [2], we demonstrated that heterogeneity in the speed of workstations can have significant impact on the fixed component of communication send/receive overhead. Using this simple model, we demonstrated how near-optimal algorithms for broadcast operations can be developed for HNOW systems. However, ....
....communication operations can be implemented by using different algorithms (which use different types of trees). For instance, MPICH and many other communication libraries use binomial trees for broadcast. However, it has been shown that binomial trees are not the best choice for all systems [2]. Therefore, in order to find the best scheme for a given collective operation, it is important to compare the performance of different schemes. Our proposed communication model can be used to evaluate the performance of these algorithms analytically. Consider an 8-node system (four fast nodes ....
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, 1998.
....on cluster computing is directed at developing better switching technologies [1, 6, 17] and new networking protocols [9, 10, 18] in the context of homogeneous systems. The effect of heterogeneity in communication has been studied recently to develop efficient collective communication algorithms [3] for heterogeneous clusters. Heterogeneity also has an effect on load balancing schemes making them even more difficult. Efficient scheduling of applications based on the characteristics of computing nodes and different segments of applications have been studied for heterogeneous system [8, 13, ....
M. Banikazemi, V. Moorthy and D. K. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. International Conference on Parallel Processing, 1997, accepted for presentation.
....Network of workstations, Heterogeneous network of workstations, Interprocessor communication, Collective communication, Broadcast, Multicast, Barrier synchronization. 1 A preliminary version of this paper has been presented at the International Conference on Parallel Processing, August 1998 [3]. 2 This research is supported in part by NSF Career Award MIP 9502294, NSF Grant CCR 9704512, and an Ameritech Faculty Fellow Award. Contents 1 Introduction 3 2 Characterizing Heterogeneous Networks of Workstations 4 2.1 Major Characteristics . ....
....fast (intra-cluster) and slow (inter-cluster) communication links. The proposed approach improves the performance of collective operations (and therefore that of the high-performance applications) by minimizing the number of messages on slow links. Bhat et al. [6] have built on the FNF algorithm [3] and have proposed new heuristics for collective operations. In these heuristics, the effect of communication links with different latencies is also taken into account. A new set of performance models for collective operations in heterogeneous systems has been proposed in [4]. These models can be ....
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, 1998.
....communication operations for a given HNOW system. The existence of such a model also makes it possible to predict and evaluate the impact of an algorithm for a given collective operation on an HNOW system by simple analytical modeling instead of a detailed simulation. In our preliminary work along this direction [2], we demonstrated that heterogeneity in the speed of workstations can have significant impact on the fixed component of communication send/recv overhead. Using this simple model, we demonstrated how near-optimal algorithms for broadcast operations can be developed for HNOW systems. However, this ....
....of Algorithms Collective communication operations can be implemented by using different types of trees. For instance, MPICH as well as many other communication libraries, uses binomial trees for broadcast. However, it has been shown that binomial trees are not the best choice for all systems [2]. Therefore, in order to find the best scheme for a given collective operation, it is important to compare the performance of different schemes. Our proposed communication model can be used to evaluate the performance of these algorithms analytically. Consider an 8 node system (4 fast nodes and 4 ....
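The comparison this excerpt describes can be sketched under a deliberately simple cost model. The sketch below assumes that each node has one fixed send cost (in the spirit of the per-workstation message initiation cost mentioned in the excerpts above), that a node sends to one receiver at a time, and that network latency and receive overhead are ignored. The cost values, the function names, and the greedy fastest-first rule are illustrative assumptions, not the cited paper's exact model or algorithm, and the binomial send order is a simple low-to-high variant that may differ from MPICH's.

```python
import heapq


def binomial_broadcast_time(costs):
    """Completion time of a binomial-tree broadcast rooted at rank 0.

    costs[i] is the assumed time node i needs to send one message.
    Rank i forwards to ranks i + 2^k (for every 2^k > i), one send at a time.
    """
    n = len(costs)
    recv = [0] * n  # time at which each rank holds the message
    for i in range(n):  # a parent always has a lower rank than its children
        step = 1
        while step <= i:
            step *= 2
        clock = recv[i]
        while i + step < n:
            clock += costs[i]  # sends from one node are serialized
            recv[i + step] = clock
            step *= 2
    return max(recv)


def greedy_fastest_first_time(costs):
    """Completion time of a greedy schedule that serves fast receivers first.

    Receivers are taken in order of increasing send cost; each one is fed by
    whichever informed node can finish its next send earliest.
    """
    n = len(costs)
    receivers = sorted(range(1, n), key=lambda i: (costs[i], i))
    ready = [(costs[0], 0)]  # (time the node's next send would finish, node)
    finish = 0
    for r in receivers:
        t, sender = heapq.heappop(ready)
        finish = max(finish, t)
        heapq.heappush(ready, (t + costs[sender], sender))  # sender's next slot
        heapq.heappush(ready, (t + costs[r], r))  # receiver can now send too
    return finish


if __name__ == "__main__":
    # Hypothetical 8-node system: 4 fast nodes (cost 1) and 4 slow nodes (cost 4).
    costs = [1, 4, 4, 4, 4, 1, 1, 1]
    print(binomial_broadcast_time(costs))    # 9
    print(greedy_fastest_first_time(costs))  # 3
```

With identical costs on every node the two schedules finish at the same time in this model; the gap appears only when slow nodes end up as interior nodes of the binomial tree.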
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In International Conference on Parallel Processing, pages 460--467, 1998.
....of the nodes. 2.1 Major Characteristics. A typical HNOW system can be characterized by the following four factors: 1) Communication Capabilities of Workstations (Nodes), 2) Network Architectures, 3) Communication Protocols, and 4) Dedicated Support for Communication and Synchronization [2]. These factors are orthogonal to each other. A typical HNOW environment can have one or more of these characteristics. All of the above factors have significant impact on the implementation of collective communication operations on HNOW systems. To illustrate this significance, in this paper, we ....
....be useful in any practical system. In the following section, we propose a near-optimal algorithm which runs in polynomial time. Before describing this near-optimal algorithm, let us look at two important properties of optimal trees presented as the following two lemmas (the proofs can be found in [2]). Lemma 1: Let $W_0$ be the source node of a broadcast (or multicast) operation and $\{W_1, W_2, \ldots, W_{N-1}\}$ be the set of other participating nodes in the order of the time they have received the message. There exists an optimal tree for performing the broadcast (or multicast) ....
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. Technical report OSU-CISRC-03/98-TR07, Dept. of Computer and Information Science, The Ohio State University, March 1998.
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient collective communication on heterogeneous networks of workstations. In Proceedings of International Conference on Parallel Processing, pages 460--467, 1998.
M. Banikazemi, V. Moorthy, and D.K. Panda. Efficient collective communication on heterogeneous networks of workstations. In Proceedings of International Parallel Processing Conference, 1998.
M. Banikazemi, V. Moorthy, and D.K. Panda. Efficient collective communication on heterogeneous networks of workstations. In Proceedings of International Parallel Processing Conference, 1998.
M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient Collective Communication on Heterogeneous Networks of Workstations. In Proc. of International Conference on Parallel Processing, 1998.
M. Banikazemi, V. Moorthy, and D. Panda. Efficient collective communication on heterogeneous networks of workstations. In International Conference on Parallel Processing, Minneapolis, MN, pages 460--467, 1998.