| A.-H. E. P.K. McKinley, H. Xu and L. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, December 1994. |
....in a CSMP. In this paper, we focus on the tree-based, inter-switch broadcasts in CSMPs. Tree-based techniques for broadcast have been used to design various broadcast algorithms on many different interconnects, such as hypercube [JH89] and 2D mesh [BMT92], wormhole-routed networks [MXEN94], Myrinet [BPDS00], and arbitrary topology [RM99]. Distinguished from the existing tree-based broadcasting algorithms, the proposed broadcast algorithm aims at achieving the maximum step concurrency, minimum steps of message passing among switches, and minimum cost for a broadcast to get high ....
....up to three switches, up to 44 SMP nodes. [Figure residue: "Comparison Between the Hierarchical Broadcast and MPI Broadcast"; x-axis: the number of SMP nodes; panels for configurations with up to 38 and up to 28 SMP nodes.] McKinley et al. in [MXEN94] implemented multicast communication in wormhole-routed direct interconnects in the absence of hardware multicast support. The method exploits the properties of the switching technology on n-dimensional mesh and hypercube and uses deterministic dimension-ordered routing of unicast messages. The ....
P.K. McKinley, H. Xu, A.H. Esfahanian, and L.M. Ni, "Unicast-based Multicast Communication in Wormhole-Routed Networks," IEEE Transactions on Parallel and Distributed Systems, vol.5, no. 12, pp. 1254-1265, 1994.
....interconnect and bandwidth per link in a CSMP. In the past literature, the technique of using trees is widely applied to the design of various broadcast algorithms on many different interconnects, e.g. star graphs in [TS97], hypercube in [JH89] and 2D mesh in [BMT92], wormhole-routed networks [MXEN94], Myrinet [BPDS00], channel networks [BFHI97], and arbitrary topology [RM99]. Distinguished from the existing tree-based broadcasting algorithms, our new broadcast algorithm satisfies a couple of objectives: 1) the maximum number of nodes sending at any step, denoted by maximum step concurrency, ....
.... algorithm doesn't put any limits on the underlying topology for inter-switch broadcast instead of the switch-ordered ring in [MSP99]. The topologies for inter-switch broadcast may be arbitrary (e.g. ring and tree). Also, the link scheduling technique is similar to the U-mesh algorithm in [MXEN94] to permit every processor to remain busy with useful work of the broadcast by scheduling the link usage to avoid link contention. As we show in Section 2, no link contention may be too strict a limit on the high-bandwidth switch-based CSMPs. Thus, without loss of generality, our hierarchical ....
[Article contains additional citation context not shown here]
P.K. McKinley, H. Xu, A.H. Esfahanian, and L.M. Ni, "Unicast-based Multicast Communication in Wormhole-Routed Networks," IEEE Transactions on Parallel and Distributed Systems, vol.5, no.12, pp.1,254-1,265, 1994.
....a set of destination nodes. The multicast primitive can also be used as a basis for many other collective operations such as barrier synchronization and cache invalidations in multiprocessor systems. Multicast communication using wormhole switching has been studied extensively for direct networks [4, 5, 6, 7, 8]. However, studies on indirect networks, such as multistage interconnection networks (MINs), have been limited [9, 10]. This paper addresses the issues associated with multicasting in wormhole-switched MINs. MINs have been extensively studied and adopted as an interconnection fabric for ....
P. K. McKinley, H. Xu, A. Esfahanian, and L. M. Ni, "Unicast-based multicast communication in wormhole routed networks," in Proceedings of the International Conference on Parallel Processing,
....and multicasting scheme to achieve faster invalidation. Such schemes are either packet switched based requiring processing at intermediate nodes or need complex hardware support with increased latencies. Our next step is to evaluate the proposed scheme against such broadcast and multicast schemes [18, 19]. ....
P. McKinley, H. Xu, A. Esfahanian, and L. Ni, "Unicast-based multicast communication in wormhole routed networks," In Proceedings of International Conference on Parallel Processing, 1992.
....computations using scattering techniques. The all-to-all personalized communication operation is used in parallel fast Fourier transform, matrix transpose, and parallel database join operations [15]. Sophisticated multicasting schemes using path-based routing [21, 22] and unicast-based schemes [14] have been shown to be efficient for non-personalized multicast. However, personalized multicasts cannot take advantage of these schemes. Hence for systems not having support for these schemes and for personalized multicasts, there is no other alternative but to send messages from the source node ....
P. McKinley et al. Unicast-Based Multicast Communication in Wormhole-Routed Networks. In International Conference on Parallel Processing, volume II, pages 10--19, 1992.
....new injection mechanism on a bidirectional 8 ary 3 cube network (512 nodes) 4.1 Network Model Our simulator models the network at the flit level. Each node in the network consists of a processor, its local memory, a routing control unit, a switch, and several channels. A four port architecture [22] is considered. The routing control unit computes the output channel for a message as a function of its destination node, the current node and the output channel status. The routing algorithm can use any minimal path to forward a message towards its destination. In addition, several virtual ....
P.K. McKinley, H. Xu, A. Esfahanian and L.M. Ni, "Unicast-based multicast communication in wormhole-routed networks," in Proceedings
....messages from the root to the member nodes, but the performance can be significantly improved by reducing the number of messages. For regular interconnection networks, many research efforts have been devoted to developing efficient implementations of barrier synchronization with either software [13, 14, 15, 16] or hardware support [17, 18, 19, 20, 21, 22]. For instance, Xu et al. [15] proposed a software tree approach for barrier synchronization in wormhole-routed hypercube multicomputers. Tree-based schemes perform .... [Footnote 1 of the citing paper: Nodes, in this paper, actually mean PCs or workstations in a cluster system.]
P. K. McKinley, H. Xu, A. -H. Esfahanian, and L. M. Ni, "Unicast-Based Multicast Communication in Wormhole-Routed Networks," IEEE Transactions on Parallel and Distributed Systems, Vol. 5, No. 12, pp. 1252-1265, Dec. 1994.
....in multicast communications, where an operation might consist of several communication phases, the start-up latency has a significant impact on the performance. An efficient software multicasting technique called U-mesh, which does not require any hardware modification, is proposed for meshes [4]. The U-mesh scheme reduces the number of communication steps by allowing some destinations to act like source nodes after they receive the message. However, this technique still involves a large number of communication start-up steps and software overheads. To further reduce the communication ....
....depletion of storage space is negligible. The overhead of the retransmission operation is very high because it involves both software (operating system) and hardware resources. Efficient multicast algorithms can be adopted by using a combination of basic communication services. In the U-mesh algorithm [4], the retransmit operations are used to complete several phases of the multicast operations. The Hamiltonian path-based algorithms [5] are supported by the absorb-and-forward mechanism. The destination addresses are encoded in the message header. The ordering of the destinations in the header ....
P. K. McKinley, H. Xu, A. Esfahanian, and L. M. Ni, "Unicast-based multicast communication in wormhole routed networks," in Proc. of the ICPP, vol. 2, pp. 10--19, 1992.
....as shown in Figure 5.4 for single port modeling. As the source of communication is the same for the whole scattering operation, this node should reconfigure its links after each step. Therefore, the scattering time, S1 F1 , is (N 1) 1) time units. The spanning binomial tree algorithm [91] used for broadcasting multicasting operations can also be used for scattering operation. In this algorithm, the number of informed nodes doubles at each step, and each node stores its own message and forwards the rest of the messages it received, if necessary, to its children. As illustrated in ....
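A minimal Python sketch of the spanning-binomial-tree scatter described in the snippet above: the number of informed nodes doubles at each step, and each node keeps its own message while forwarding the rest. The function name `binomial_scatter`, the choice of node 0 as root, and the contiguous-range splitting rule are illustrative assumptions, not details taken from the citing paper.

```python
def binomial_scatter(messages):
    """Scatter messages[i] to node i along a binomial tree rooted at node 0.

    Returns (held, steps): held[i] is the dict of messages left at node i,
    steps is the number of communication steps used.
    Assumption: each node owns a contiguous range and hands the upper half
    to the first node of that half in every step.
    """
    n = len(messages)
    held = {0: dict(enumerate(messages))}  # root starts with every message
    ranges = {0: (0, n)}                   # node -> half-open range it must serve
    steps = 0
    while any(hi - lo > 1 for lo, hi in ranges.values()):
        new_ranges = {}
        for node, (lo, hi) in ranges.items():
            if hi - lo <= 1:
                new_ranges[node] = (lo, hi)  # only its own message remains
                continue
            mid = lo + (hi - lo + 1) // 2    # sender keeps the lower half
            # Forward every message destined for the upper half to node `mid`.
            held[mid] = {d: held[node].pop(d) for d in range(mid, hi)}
            new_ranges[node] = (lo, mid)
            new_ranges[mid] = (mid, hi)
        ranges = new_ranges
        steps += 1
    return held, steps


held, steps = binomial_scatter(list("abcdefgh"))
print(steps, held[5])
```

Under these assumptions, eight nodes finish in three steps and every node ends up holding only its own message.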
P. K. McKinley, H. Xu, A. -H. Esfahanian and L. M. Ni, "Unicast-based Multicast Communication in Wormhole-routed Networks", IEEE Transactions on Parallel and Distributed Systems, 5(12): 1252-1265, December 1994.
....steps, where every communication step involves the high software overhead for sending and receiving a message. To improve multicast performance, many schemes have been proposed that use a multi-phase approach starting with the source sending out a multicast message to one of its destinations [21, 44]. In succeeding phases, the destinations that have received the message act as secondary sources and transmit the message to other destinations. Such a scheme reduces the impact of the high overhead associated with sending/receiving point-to-point messages by allowing multiple nodes to ....
....Efficient multicast algorithms are typically hierarchical in nature. This means that some destinations serve as intermediate sources, i.e. when they receive a message, they forward copies of it to other destinations. Many such hierarchical algorithms have been proposed in the literature [13, 20, 21, 30] to implement multicast. Figure 3 shows an example of a multicast from a source node to seven other destinations. In the figure, the numbers in brackets indicate the step numbers. [Figure 3 residue: source; step labels [1] [2] [3] [2] [3] [3] [3].] Figure 3: Example of a hierarchical multicast algorithm on a destination ....
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based Multicast Communication in Wormhole-routed Networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, Dec 1994.
.... that make message communication easier by making message routing simpler, lowering the average distance per communication, and/or increasing the bisection bandwidth [11]. For such regular cut-through networks, many multicast/broadcast algorithms have been proposed in the literature in recent years [2, 5, 9, 10, 17, 20, 23, 25, 32]. More recently, cut-through switching is being applied to switch-based interconnects like Myrinet [4] and ServerNet [15] to build networks of workstations, or NOWs (also called workstation clusters), for cost-effective parallel computing. In contrast to traditional parallel systems, these ....
....achieve reduced latency. In these algorithms, some nodes work as intermediate nodes which receive a copy of the message from the source and forward it to other nodes. Typically, tree-structured algorithms are used to minimize the number of communication start-ups (steps) required for multicast [6, 25]. The efficiency of an algorithm is determined by the required number of start-ups for a multicast to complete and the degree of link contention experienced among the messages of the multicast. For regular networks with e-cube routing, the concept of a dimension-ordered chain has been developed ....
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based Multicast Communication in Wormhole-routed Networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, Dec 1994.
....like low-latency communication and reduced communication hardware overhead [10]. These systems use regular network topologies with various deadlock-free routing schemes. For such regular wormhole networks, many multicast/broadcast algorithms have been proposed in the literature in recent years [2, 5, 6, 8, 13]. More recently, wormhole routing is being applied to switch-based interconnects like Myrinet [1] and ServerNet [4] to build networks of workstations for cost-effective parallel computing. Switch-based .... [Footnote of the citing paper: This research is supported in part by NSF Grant MIP 9309627 and NSF Career Award MIP 9502294.]
....and high-bandwidth communication. Such routing flexibility also leads to difficulty in implementing a multicast/broadcast operation in a contention-free manner. Typically, tree-structured algorithms are used to minimize the number of communication start-ups (steps) required for multicast [8]. The efficiency of an algorithm is determined by the required number of start-ups for a multicast to complete and the degree of contention experienced among the messages of the multicast. For regular networks with e-cube routing, the concept of a dimension-ordered chain has been developed [8] to ....
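To make the notion of contention among the messages of one multicast step concrete, here is a small Python sketch for a 2D mesh with dimension-ordered routing. It assumes X-before-Y routing and models contention as two unicasts of the same step sharing a directed channel; the helper names `xy_route` and `contending_pairs` are hypothetical and do not come from the cited papers.

```python
def xy_route(src, dst):
    """Directed channels used by a dimension-ordered route on a 2D mesh.

    Assumption: the X offset is corrected first, then the Y offset.
    """
    (x, y), (dx, dy) = src, dst
    channels = []
    while x != dx:
        nx = x + (1 if dx > x else -1)
        channels.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        channels.append(((x, y), (x, ny)))
        y = ny
    return channels


def contending_pairs(step):
    """Return pairs of (src, dst) unicasts in one step that share a channel."""
    routes = [(msg, set(xy_route(*msg))) for msg in step]
    return [(a, b)
            for i, (a, ra) in enumerate(routes)
            for b, rb in routes[i + 1:]
            if ra & rb]


# Two unicasts sent in the same step along the same row overlap on one channel.
print(contending_pairs([((0, 0), (3, 0)), ((1, 0), (2, 0))]))
```

A multicast schedule could be checked step by step with `contending_pairs`; an empty result for every step means no two messages of that step compete for a channel under this simplified model.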
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A. -H. Esfahanian, and L. M. Ni. Unicast-based Multicast Communication in Wormhole-routed Networks. IEEE TPDS, 5(12):1252--1265, Dec 1994.
....performance because it consists of n sequential communication steps, where every communication step involves the high software overhead for sending and receiving a message. To improve multicast performance, many schemes that use a multi-phase approach have been proposed for systems with regular [25, 52] and irregular topologies [16, 21]. In this approach, first, the source sends out a multicast message to one of its destinations. In subsequent phases, the source and the destinations that have received the message act as secondary sources and forward the message to other destinations. Such an ....
....Efficient multicast algorithms are typically hierarchical in nature. This means that some destinations serve as intermediate sources, i.e. when they receive a message, they forward copies of it to other destinations. Many such hierarchical algorithms have been proposed in the literature [16, 24, 25, 35] to implement multicast. Figure 3 shows an example of a multicast from a source node to seven other destinations. In the figure, the numbers in brackets indicate the step numbers. It can be easily observed that $\lceil \log_2 (n+1) \rceil$ communication steps are required for such a binomial-tree-based ....
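The step count of a binomial-tree multicast can be checked with a short Python sketch. It assumes a one-port model in which every node already holding the message sends it to one new destination per step; the function name `binomial_multicast_schedule` and the first-come assignment of receivers are illustrative assumptions, not the scheme of any particular cited paper.

```python
from math import ceil, log2


def binomial_multicast_schedule(source, destinations):
    """Return a list of steps, each a list of (sender, receiver) pairs.

    Assumption: in every step each informed node sends to at most one
    destination that has not yet received the message.
    """
    informed = [source]
    pending = list(destinations)
    steps = []
    while pending:
        step = []
        for sender in list(informed):  # snapshot: new receivers send next step
            if not pending:
                break
            receiver = pending.pop(0)
            step.append((sender, receiver))
            informed.append(receiver)
        steps.append(step)
    return steps


schedule = binomial_multicast_schedule("src", [f"d{i}" for i in range(1, 8)])
print(len(schedule), [len(step) for step in schedule])
print(ceil(log2(7 + 1)))
```

For one source and seven destinations the schedule has three steps with 1, 2, and 4 sends, which agrees with the bound of ceil(log2(n + 1)) steps for n = 7.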
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based Multicast Communication in Wormhole-routed Networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, Dec 1994.
....networks: up/down [10] and Eulerian trail (ET) [11]. Irregular networks also find it very difficult to support multicast. Multicast is an important collective communication in scalable parallel computers, in which the source node sends the same data to an arbitrary number of destination nodes [2]. Many multicast routing algorithms have been studied for systems with regular topologies [2, 3, 4, 5]. Usually, the proposed multicast algorithms are based on either one of two .... [Footnote of the citing paper: This work was supported in part by National Science Council under grants NSC 86 2213 E 007 043 and NCHC86 08 024.]
....difficult to support multicast. Multicast is an important collective communication in scalable parallel computers, in which the source node sends the same data to an arbitrary number of destination nodes [2]. Many multicast routing algorithms have been studied for systems with regular topologies [2, 3, 4, 5]. Usually, the proposed multicast algorithms are based on either one of two schemes: unicast [2] and multidestination messaging [3]. In the unicast-based approach, multicast is .... [Footnote of the citing paper: This work was supported in part by National Science Council under grants NSC 86 2213 E 007 043 and NCHC86 08 024.]
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A. H. Esfahanian, and L. M. Ni, "Unicast-based Multicast Communication in Wormhole-routed Networks," IEEE Transactions on Parallel and Distributed Systems, 5(12):1252-1265, Dec. 1994.
....configurations are conducted. The results do confirm the advantage of our scheme, under various system parameters and conditions, over other existing broadcasting algorithms (e.g. the U-torus scheme for one-port tori by Robinson et al. [19], the U-mesh scheme for one-port meshes by McKinley et al. [15], the Scatter-Collect and the Edge-Disjoint Spanning Fences schemes for one-port tori/meshes by Barnett et al. [6], the dominating-set approach for all-port meshes by Tsai et al. [20, 21], the dominating-set approach for all-port tori by [22], and the scheme for all-port hypercubes by Ho and Kao ....
....$T_c = 0.5$ sec: (a) $T_s = 10$ sec, and (b) $T_s = 150$ sec. previous torus algorithm is necessary to obtain a mesh algorithm. Let the source node be $P_{x,y}$. Under the one-port model, we use Definition 4 to obtain h DDNs with respect to $P_{x,y}$. In phase 1, we will apply a modified U-mesh scheme based on [15]. In phase 2, we will invoke three known mesh schemes for a mesh: the RD by [15], and the SC and EDSF by [6]. Phase 3 is the same as in the torus algorithm. Under the all-port model, we use Definition 4 to obtain h DDNs with respect to $P_{0,0}$. An initial step to unicast the message to an appropriate ....
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Trans. on Parallel and Distributed Systems, 5(12):1252--1265, Dec. 1994.
....phases. Each communication phase incurs a startup latency that can be several orders of magnitude larger than the actual network latency. Moreover, a lower bound on the number of unicast communication phases required to distribute a message to d destinations is known to be $\lceil \log_2 (d+1) \rceil$ [10]. For this reason, hardware-supported multicast solutions have recently been proposed to substantially reduce the number of communication phases required. Some algorithms use hardware-supported path-based techniques to deliver a message with very few worms while other techniques use tree-based ....
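A short derivation of the lower bound quoted above, given as a sketch under the assumption of a one-port model (each node sends at most one unicast per phase); the symbol $k$ for the number of phases is introduced here for illustration.

```latex
% After each phase, every node holding the message can inform at most one
% new node, so the number of informed nodes at most doubles per phase.
\begin{align*}
  \text{informed nodes after } k \text{ phases} &\le 2^{k}, \\
  \text{source plus } d \text{ destinations must be informed:}\quad 2^{k} &\ge d + 1, \\
  \text{hence}\quad k &\ge \lceil \log_2 (d + 1) \rceil .
\end{align*}
```

For example, d = 7 destinations need at least 3 phases, since 2^2 = 4 < 8 and 2^3 = 8.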
P. McKinley, H. Xu, A-H. Esfahanian, and L. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, Dec. 1994.
....memory update/invalidation for cache coherence in distributed shared memory systems. Both one-to-one communication and broadcast are special cases of multicast. Solutions to the multicast problem can be categorized as unicast-based, tree-based, and path-based. The unicast-based solutions (e.g. [8, 21] for meshes and [31] for tori) make use of 1-to-1 communication to achieve multicast. Disadvantages of this approach include necessary involvement in message propagation at intermediate nodes and required start-up latency in each intermediate node. The tree-based solutions (e.g. [18, 27, 29] for ....
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Trans. on Paral. and Distrib. Sys., 5(12):1252--65, Dec. 1994.
....multicasting is to construct a multicast tree from the source node. For instance, the work by Ho and Raghunath [7] connects the destination nodes as a Hamiltonian path, on which the multicast message can be sent in a pipelined manner. The U-mesh and U-torus schemes proposed by McKinley et al. [10] and Robinson et al. [13] for one-port meshes and tori, respectively, show how to connect the destination nodes as a congestion-free binomial tree, if dimension-ordered routing is followed. Coster et al. [5] further generalize the above idea and show that the Fibonacci trees, which are ....
....Coster et al. [5] further generalize the above idea and show that the Fibonacci trees, which are generated based on the Fibonacci numbers, would be more appropriate for multicasting on meshes. References for fault-tolerant multicasting are also available in [15, 14]. In contrast to [5, 7, 10, 13], which use only one multicast tree, in this paper we use multiple DDNs and DCNs, which may not be trees. Even the numbers of DDNs and DCNs are adjustable parameters, which can be used to optimize the communication latency. In addition, a DDN/DCN may not be a graph in standard graph-theoretical ....
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Trans. on Paral. and Distrib. Sys., 5(12):1252--1265, Dec. 1994.
....1 Introduction. Multicast represents the most complex communication pattern on a multicomputer system [2, 10, 15]. It is shown to be useful [13] in parallel algorithms, data replication, barrier synchronization, and implementation of the distributed shared memory paradigm on a direct-network-based parallel system. In general, the optimal solution to multicasting is NP-complete and remains so even with some restrictions [12]. Several heuristics ....
.... channels for path-based multicasting, the latency of unicast messages increases in the presence of multicast messages due to the demand multiplexing of a set of virtual channels over a physical channel [7]. In order to alleviate these disadvantages, McKinley and his research group have proposed [13] a unicast-based multicast scheme, where multicast messages are propagated using software multicast trees. It requires only the base routing support and incurs $\lceil \log_2 (d) \rceil$ startups for a destination set of size d. The set of intermediate destinations need to partition the destination sets received ....
[Article contains additional citation context not shown here]
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based Multicast Communication in Wormhole-routed Direct Networks. In Proceedings of the International Conference on Parallel Processing, pages II:10--19, 1992.
....by sending a sequence of separate unicast messages to each of these destinations or, alternatively, by sending unicast messages to a subset of its destinations which then participate in sending the message to the remaining destinations. Examples of unicast based multicast routing can be found in [16, 20]. These schemes are commonly employed in multicomputers supporting only unicast communication in hardware. The disadvantage of the unicast based approach is that multiple copies of the same message are introduced into the network, resulting in increased network traffic. More importantly, each copy ....
P. McKinley, H. Xu, A-H. Esfahanian, and L. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, Dec. 1994.
....and thus on the routing algorithm itself. Since startup latency appears to be the dominant term in communication latency, a number of efforts have been made to design multicast routing algorithms that minimize the number of startups required to deliver a message to all of its destinations [8, 12, 15, 19]. However, the number of startups alone is not necessarily an accurate indicator of the performance of a routing algorithm. Consider, for example, the Hamiltonian path-based routing algorithm proposed in [12], which requires at most two startups to deliver a message to an arbitrary set of ....
....destinations. Alternatively, a multicast tree can be used in which the source node sends the message to a subset of the destinations which then participate in recursively retransmitting the message to the remaining set of destinations. Examples of unicast-based multicast routing can be found in [3, 8, 15, 18]. A significant disadvantage of the unicast-based approach is the large number of startups required to send a message to a large set of destination nodes. To address this problem, Lin et al. [12] and Panda et al. [17] have proposed a hardware-supported multicast routing methodology known as ....
P. McKinley, H. Xu, A-H. Esfahanian, and L. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, Dec. 1994.
No context found.
A.-H. E. P.K. McKinley, H. Xu and L. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, December 1994.
No context found.
A.-H. Esfahanian P.K. McKinley, H. Xu and L.M. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, December 1994.
No context found.
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based Multicast Communication in Wormhole-routed Networks. IEEE Transactions on Parallel and Distributed Systems, 5(12):1252--1265, Dec 1994.
No context found.
P. K. McKinley, H. Xu, A.-H. Esfahanian, and L. M. Ni. Unicast-based multicast communication in wormhole-routed networks. IEEE Trans. on Paral. and Distrib. Sys., 5(12):1252-1265, Dec. 1994.