| W. J. Dally. Virtual-Channel Flow Control. In Proceedings of the 17th Annual International Symposium on Computer Architecture (ISCA), 1990. |
....a set of queues linked by a multiplexer. As indicated in [33] router delays can increase substantially when a large number of virtual channels are multiplexed onto physical links. This is due in part to the multiplexer and virtual channel controller delays. Moreover, fully demultiplexed crossbars [34] (i.e. one virtual channel per crossbar port) become prohibitively expensive in silicon area as the number of virtual channels increases. Thus, it becomes necessary to devise an alternate buffer organization to support a large number of virtual channels. Virtual channels in the MMR are organized ....
....Memory access time increases latency by a few clock cycles, but multimedia applications usually tolerate latency well. C. Switch Organization. In most routers the internal switch fabric is implemented as a crossbar. Crossbars can be multiplexed, partially multiplexed or fully demultiplexed [34], depending on the number of ports. Although some routers use a fully demultiplexed crossbar [7] with as many ports as virtual channels, this organization becomes prohibitive when the number of virtual channels is large. Even for a relatively small number of virtual channels, some commercial ....
W. J. Dally, "Virtual-channel flow control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, March 1992.
....make many adaptive turns in order to avoid congestion, meaning that if a header flit reaches a processor where an outgoing channel is blocked, it is allowed to move in another direction. Related Work. Considerable research has been done on making wormhole routing fault tolerant. Papers such as [4] have added virtual channels to the network to handle faults. Virtual channels divide a single physical channel into many channels, sharing the bandwidth between them. Papers such as [8] have used an adaptive turn based model to avoid faults. If a faulty processor is encountered on the network, a ....
....have a ring topology, any two messages introduced into the network by different processors can acquire resources in a circular dependent manner. Both of these problems can be avoided by adding more available channels upon which any processor can initiate a message. A simple solution presented in [4] is to add multiple virtual channels to the network for each physical channel. Virtual channels are logical channels which may share the same physical wire, but each virtual channel contains its own flit buffer, control program (including local variables) and data path. The flit buffers can be ....
W. J. Dally, "Virtual channel flow control," in Proceedings of the 17th International Symposium on Computer Architecture, 1990, pp. 60-68.
....through the SPIDERS project. Section 2. It supports both 2- and 3-dimensional torus (a mesh with wraparound connections) network topologies with up to 256 processing nodes. In addition to the proposed techniques, the simulator features wormhole switching [9] and virtual channel flow control [10]. The former pipelines the flit transfer and the latter reduces the packet blocking. It also utilizes the fully adaptive deadlock recovery routing scheme proposed in [11]. This scheme is believed to be the most efficient routing algorithm to date. Therefore, our simulator is considered one of the most ....
William J. Dally, "Virtual-Channel Flow Control," in IEEE Transactions on Parallel and Distributed Systems, 3(2): 194-205 (1992).
....connected through routers. Within a subnet, switches connect processors and I/O nodes. Processing nodes are attached to an IBA network through Host Channel Adaptors (HCAs) and I/O nodes can be attached to a network through Target Channel Adaptors (TCAs). Virtual Lanes (VLs): Virtual Channels (VCs) [5] provide a mechanism to implement multiple logical flows over a single physical link. A port must support at least 2 VLs and at most 16 VLs. All ports must use VL 15 for subnet management traffic. VL arbitration means selection of a VL to push data to an outgoing link in a switch, router, or ....
W. J. Dally, "Virtual-Channel Flow Control," IEEE TPDS, vol. 3, pp. 194--205, May 1992.
....each flit of a packet can make forward progress to the next switch in its path, as long as there is buffer space available for at least one flit in that next switch. In order to ensure that slow moving packets do not unnecessarily block access to an output link, the concept of virtual lanes [2], also known as virtual channels, is frequently used. Since each flit is marked by the virtual lane it belongs to, in scheduling flits for transmission over an output link, it is not necessary to schedule all flits .... [Footnote in the citing paper: This work was supported in part by U.S. Air Force Contract F30602 00 2 0501.]
....lanes are examined in a round robin fashion, and the first virtual lane that has room for at least one flit is used. The virtual lane of a packet may change during each switch hop as the packet progresses toward a destination. This strategy of virtual lane assignment is different from that in [2] where a virtual lane is randomly chosen among the available virtual lanes. Achieving acceptable fairness bounds provides our motivation to avoid random assignment of virtual lanes. The focus of this paper is on scheduling strategies among virtual lanes for entry into and exit out of dedicated ....
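The round-robin lane selection described in the excerpt above can be sketched as follows. This is only a minimal illustration, not the citing paper's implementation: the function name `pick_lane`, the `free_slots` list, and the `start` pointer are hypothetical names introduced here, and the excerpt does not say where each scan begins.

```python
from typing import Optional, Sequence


def pick_lane(free_slots: Sequence[int], start: int) -> Optional[int]:
    """Scan lanes round-robin from `start`; return the first with room for a flit.

    free_slots[i] is the number of free flit slots in virtual lane i
    (an assumed representation). Returns None if every lane is full.
    """
    n = len(free_slots)
    for offset in range(n):
        lane = (start + offset) % n  # wrap around to lane 0 after the last lane
        if free_slots[lane] >= 1:
            return lane
    return None
```

For example, `pick_lane([0, 0, 3, 1], start=1)` skips the full lane 1 and returns lane 2. Because the scan order is fixed rather than random, the choice is deterministic for a given buffer state, which is the property the excerpt contrasts with the random assignment in [2].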
W. J. Dally, "Virtual Channel Flow Control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 3, pp. 194--205, March 1992.
....all greatly influence the cost and the performance of the design. In this paper, we focus on how routing policies impact network performance as the communication pattern is varied. Specifically, we classify different routing algorithms in terms of their adaptivity and the selection function [5] used to determine the order in which candidate links are considered. Through extensive multi factor experiments, we investigate how selection functions and the amount of routing adaptivity combine with the traffic pattern to determine how well links are utilized and hence, how well the network ....
....on the network topology and the distance a packet must travel. In addition, adaptive algorithms add to the complexity of both the hardware which implements the scheme and the software which must handle the possibility of out-of-order arrivals [13]. Each algorithm invokes a selection function [5] which selects and orders candidate links. Network performance is greatly influenced by the interaction of this function with the communication workload. Selection functions of oblivious algorithms deterministically select a single candidate link independent of the current network conditions. For ....
W. Dally, "Virtual-channel flow control," IEEE Trans. Parallel and Distributed Systems, vol. 3, pp. 194--205, March 1992.
....In wormhole routing, the messages are divided into small flits, typically 8-32 bits long, and sent over the network in a pipeline fashion, one flit after another between adjacent switches. The performance of MINs for packet switching and wormhole networks has been evaluated in various studies [4, 5, 6, 7, 8], based on analytical models that assume simple traffic distributions. These workload models are far from the network traffic observed in a cache-coherent shared memory multiprocessor, which is characterized by invalidation messages, bursty requests, memory hot spots, and different message sizes. ....
....bottlenecks in the network due to traffic convergence and studied the effect of several options to avoid congestion. But, the Cedar multiprocessor, not being a cache coherent multiprocessor, the study does not indicate how the presence of caches affects the performance of an IN. Virtual channels [4] are used in wormhole networks to avoid deadlocks and to improve link utilization and network throughput. Therefore, we have also evaluated the performance of a wormhole routed MIN together with virtual channel flow control in a cache coherent multiprocessor. The evaluation has been done by ....
W. J. Dally, "Virtual-Channel Flow Control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, pp. 194--205, March 1992.
....routing [3] is an efficient technique to reduce the latency of internode messages. In wormhole routing, the messages packets are divided into small flits, typically 8-32 bits long, and sent over the network in a pipeline fashion, one flit after another between adjacent nodes. Virtual channels [4] are used in wormhole networks to avoid deadlocks and to improve link utilization and network throughput. In this paper we evaluate the performance of a 2-D torus network with wormhole routing and virtual channel flow control in shared memory multiprocessors. We selected a 2-D torus network with ....
.... it is a popular topology [5, 6, 7, 8]. Also, mesh networks without end-around connections have significant performance degradations at the boundary nodes, even under uniform communication [6, 9]. The performance of wormhole networks with virtual channels has been evaluated in various studies [4, 7, 10] in a message passing environment. However, all these evaluations are based either on analytical models that assume certain traffic distributions or on simulations using simple probabilistic workloads. Adve and Vernon [7] analyzed the performance of mesh and torus networks using closed queueing ....
W. J. Dally, "Virtual-Channel Flow Control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, pp. 194--205, March 1992.
....are organized as deep FIFO buffers and/or multiple virtual lanes. In a torus, a DOR router requires two virtual channels for deadlock prevention, which are used to break channel dependency cycles at wraparound channels as suggested in [28]. Additional virtual channels are used as virtual lanes [29]. For CR networks, we vary the number of virtual channels while fixing the buffer depth of each virtual channel at two flits. This is the right way to organize buffers for CR because increasing buffer depth only increases padding overhead without performance gain. In CR networks, all of the ....
....resources outperforms DOR. For instance, with equally given two virtual channels, a CR network with 2-flit-deep buffers matches the performance of a DOR network with 16-flit-deep buffers. Figs. 14 (c) and (d) compare CR and DOR's performance for a range of virtual channels. A previous study [29] showed that virtual channels provide more performance benefit than deep FIFO buffers. In the simulations, the DOR networks are given a fixed amount of total buffer space, so more virtual channels mean a lower buffer depth. We observe that, in both CR and DOR networks, an initial increase in ....
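The trade-off between channel count and buffer depth under a fixed budget, mentioned in the excerpt above, can be made concrete with a small sketch. It assumes the budget is split evenly across virtual channels with integer division; the excerpt does not state how the citing paper divides the space, and the 32-flit budget and the name `depth_per_channel` are hypothetical.

```python
def depth_per_channel(total_flits: int, num_vcs: int) -> int:
    """Flit slots per virtual channel when a fixed budget is split evenly."""
    if num_vcs < 1:
        raise ValueError("need at least one virtual channel")
    return total_flits // num_vcs


# Hypothetical 32-flit budget per physical channel (not a figure from the excerpt).
for vcs in (1, 2, 4, 8, 16):
    print(vcs, depth_per_channel(32, vcs))
```

Under these assumptions, doubling the number of virtual channels halves the depth of each one, so 2 channels get 16 flits each while 16 channels get only 2.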
W. J. Dally, "Virtual channel flow control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, pp. 194-205, 1992.
....principles of multiport wormhole systems and show the situations which lead to port contention. Through an example, we show the significance of message ordering by taking into account routing adaptivity and multiple ports. 2.1 Routing Adaptivity and Link Contention. In wormhole-routed systems [5], the header flit of a message establishes the path, the intermediate flits follow the path, and the tail flit releases the path. During the message propagation, if a desired link is already being used by another message, the current message gets blocked. This message waits in the network ....
....as link contention. This phenomenon is very much associated with the underlying routing scheme, topology of the system, and the communication traffic. To alleviate link contention, several routing schemes with varying adaptivity have been proposed in the literature. Deterministic or e-cube routing [5] defines a single path from a source to a destination node and thus has zero adaptivity. Such routing is simple to implement and deadlock-free. However, it does not make effective use of all communication links in a system. Fully adaptive algorithms [9] allow a message to be routed along any of ....
W.J. Dally, "Virtual-channel Flow Control," IEEE Trans. on Parallel and Distributed Systems, Vol. 3, pp. 194-205, March 1992.
....1 Introduction. Message routing on a multicomputer system can be characterized by network topology (mesh, hypercube, torus or tree), switching technique (wormhole, virtual cut-through, circuit or packet switching) and communication pattern (one-to-one, permutation, broadcast or multicast) [5, 9, 10, 11, 13, 18]. Recently, wormhole routing has attracted a lot of interest for its low latency and less requirement of buffer storage. However, this accompanies with high probability of channel contention or even deadlock. Researchers have proposed schemes using virtual channels for increasing physical channel ....
....wormhole routing has attracted a lot of interest for its low latency and less requirement of buffer storage. However, this accompanies with high probability of channel contention or even deadlock. Researchers have proposed schemes using virtual channels for increasing physical channel utilization [5], exploiting adaptivity [6, 16] and avoiding deadlock [2, 4, 6, 16]. Routing algorithms to avoid deadlock are also investigated in [8]. Multicast represents the most complex communication pattern. It is in general NP-complete and remains so even with some restrictions [10, 13]. Heuristics based ....
W. Dally, "Virtual-Channel Flow Control," IEEE Trans. on Parallel and Distributed Systems, Vol. 3, No. 2, March 1992, pp. 194-205.
....However, there is no solution for the assignment problem for general task graphs by taking link contention into account. In recent years, the communication architecture of distributed memory systems is undergoing rapid advances. There is an increasing use of the wormhole routing switching technique [4] in current generation multicomputers (nCUBE 2, iWarp, and Intel's Paragon) due to its communication efficiency over store-and-forward and circuit-switched routing. In wormhole routing, the header flit of a message establishes the path, intermediate flits follow the path, and the tail flit ....
....distributed memory systems, researchers are proposing deadlock-free routing schemes with varying adaptivity to improve system throughput. The adaptivity directly affects link contention and hence completion time of a parallel application. Deterministic (also known as e-cube or oblivious) routing [4] allows a message to take a fixed path from a given source to its destination. On the other hand, adaptive routing strategies (partially adaptive [3] and fully adaptive [7]) permit messages to take alternate paths while avoiding the busy links. Such adaptivity has potential to reduce link ....
W.J. Dally, "Virtual-channel Flow Control," IEEE Transactions on Parallel and Distributed Systems, Vol. 3, March 1992, pp. 194-205.
....block in place when the link requested by the header is busy. So, data flits span over multiple routers, leading to significant link contention. Contention can be reduced by multiplexing physical bandwidth among several messages. This can be achieved by using separate buffers, or virtual channels [9], associated with each physical link. Virtual channels have also been proposed to avoid message deadlock [7]. Another way to reduce the negative effects of link contention consists of using adaptive routing, allowing packets to follow alternative paths. Fully adaptive routing also requires virtual ....
....are enough empty flit buffers to store the whole message [19]. Other routers such as the Intel Cavallino [2] and the SGI SPIDER [13] also employ a few virtual channels per physical link, with the latter implementing fully adaptive routing as well. Both features improve throughput significantly [9, 10], and may reduce the execution time for bandwidth-limited parallel applications. However, virtual channels and adaptive routing have been shown to increment router delay [4], thus increasing the execution time of latency-sensitive parallel applications. For those applications, it has been suggested ....
W.J. Dally, "Virtual-Channel Flow Control," IEEE Trans. on Parallel and Distributed Systems, vol. 3, no. 2, pp. 194-205, March 1992.
No context found.
W. J. Dally, "Virtual-Channel Flow Control", IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, pp. 194-205, March 1992.
No context found.
Dally, W. J. (1992) Virtual channel flow control. IEEE Trans. Parallel Distributed Syst., 3, 194--205.
No context found.
W. J. Dally. Virtual-Channel Flow Control. In Proceedings of the 17th Annual International Symposium on Computer Architecture (ISCA), 1990.
No context found.
W. J. Dally. Virtual-Channel Flow Control. In Proceedings of the 17th Annual International Symposium on Computer Architecture (ISCA), 1990.
No context found.
W. Dally, "Virtual-Channel Flow Control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, March 1992.
No context found.
W. J. Dally, "Virtual Channel Flow Control," IEEE Trans. on Par. and Dist. Syst., vol. 3, no. 2, pp. 194-205, March 1992.
No context found.
W. J. Dally, "Virtual-Channel Flow Control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, pp. 194--205, May 1992.
No context found.
W.J. Dally, Virtual channel flow control, IEEE TPDS, vol. 3, no. 2, pp. 194-205, 1992.
No context found.
W. J. Dally, "Virtual-Channel Flow Control." IEEE Trans. Parallel Distrib. Syst., vol. 3, no. 2, Mar. 1992, pp. 194-205.
No context found.
W. Dally, "Virtual-Channel Flow Control," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, March 1992.
No context found.
W. J. Dally. Virtual-channel flow control. In Proc. of 17th Int. Symp. on Computer Architecture, 1990.
No context found.
Dally, W. Virtual-channel flow control. IEEE Trans. Parallel Distrib. Systems 3 (Mar. 1992), 194--205.