Planar-Adaptive Routing: Low-cost Adaptive Networks for Multiprocessors
In Proceedings of the International Symposium on Computer Architecture, 1992
Cited by 179 (13 self)
Network throughput can be increased by allowing multipath, adaptive routing. Adaptive routing allows more freedom in the paths taken by messages, spreading load over physical channels more evenly. The flexibility of adaptive routing introduces new possibilities of deadlock. Previous deadlock avoidance schemes in k-ary n-cubes require an exponential number of virtual channels [17]. We describe a family of deadlock-free routing algorithms, called planar-adaptive routing algorithms, which require only a constant number of virtual channels, independent of network size and dimension. Planar-adaptive routing algorithms reduce the complexity of deadlock prevention by reducing the number of choices at each routing step. In the fault-free case, planar-adaptive networks are guaranteed to be deadlock-free. In the presence of network faults, the planar-adaptive router can be extended with misrouting to produce a working network which remains provably deadlock-free and is provably livelock-free. In a...
A Cost and Speed Model for k-ary n-cube Wormhole Routers
HOT INTERCONNECTS '93, 1993
Cited by 125 (1 self)
A great deal of research has been published on the performance of wormhole routers with advanced features such as adaptivity and virtual lanes. In most cases, the effectiveness of such novel routers is evaluated on the basis of the achieved network throughput (channel utilization), ignoring the important effects of implementation complexity. In this paper we describe a parameterized cost model for router performance, characterized by two numbers: router delay and flow control time. Grounding the cost model in a 0.8 micron gate array technology, we use it to compare a number of proposed routing algorithms. Based on these design studies, several insights regarding the implementation complexity of adaptive routing are clear. First, header update and selection is expensive in adaptive routers, suggesting that absolute addressing should be reconsidered. Second, virtual channels are expensive in terms of latency and cycle time, so decisions to include them to support adaptivity or even virtual lanes should not be taken lightly. Third, requirements of larger crossbars and more complex arbitration cause some increase in the complexity of adaptive routers, but the rate of increase is small. Finally, the complexity of adaptive routers significantly increases their setup delay and flow control cycle times, implying that claims of performance advantages in channel utilization and low load latency must be carefully balanced against losses in achievable implementation speed.
A Necessary and Sufficient Condition for Deadlock-Free Routing in Cut-Through and Store-and-Forward Networks
1995
Cited by 111 (15 self)
This paper develops the theoretical background for the design of deadlock-free adaptive routing algorithms for virtual cut-through and store-and-forward switching. This theory is valid for networks using either central buffers or edge buffers. Some basic definitions and three theorems are proposed, developing conditions to verify that an adaptive algorithm is deadlock-free, even when there are cyclic dependencies between routing resources. Moreover, we propose a necessary and sufficient condition for deadlock-free routing. Also, a design methodology is proposed. It supplies fully adaptive, minimal and non-minimal routing algorithms, guaranteeing that they are deadlock-free. The theory proposed in this paper extends the necessary and sufficient condition for wormhole switching previously proposed by us. The resulting routing algorithms are more flexible than the ones for wormhole switching. Also, the design methodology is much easier to apply because it automatically supplies deadlock-fr...
Software Overhead in Messaging Layers: Where Does the Time Go?
In Proceedings of the Sixth Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), 1994
Cited by 68 (10 self)
Despite improvements in network interfaces and software messaging layers, software communication overhead still dominates the hardware routing cost in most systems. In this study, we identify the sources of this overhead by analyzing software costs of typical communication protocols built atop the active messages layer on the CM-5. We show that up to 50-70% of the software messaging costs are a direct consequence of the gap between specific network features such as arbitrary delivery order, finite buffering, and limited fault-handling, and the user communication requirements of in-order delivery, end-to-end flow control, and reliable transmission. However, virtually all of these costs can be eliminated if routing networks provide higher-level services such as in-order delivery, end-to-end flow control, and packet-level fault-tolerance. We conclude that significant cost reductions require changing the constraints on messaging layers: we propose designing networks and network interfaces...
The Case for Chaotic Adaptive Routing
IEEE Transactions on Computers, 1994
Cited by 27 (0 self)
Chaotic routers are randomizing, non-minimal adaptive packet routers designed for use in the communication networks of parallel computers. Chaotic routing is reviewed along with other contemporary network routing approaches, including the state-of-the-art oblivious routers. Each routing approach is evaluated for its effectiveness as a multicomputer message router. The results indicate that the Chaos router is the most effective of known routing methods.
1 Introduction
In spite of the fact that network routing has been an active research area in recent years, leading to many diverse proposals, practical experience with routers is extremely limited. The routers used in most implemented parallel computers are all from a single class, known as oblivious routers. Most of the non-oblivious routers have appeared only in single instance machines such as the HEP, CM-2, and CM-5 computers, making it difficult to separate fundamental properties of the routers from artifacts of the specific insta...
The Reliable Router: A Reliable and High-Performance Communication Substrate for Parallel Computers
In Proc. of Parallel Routing and Communication Workshop, 1994
Cited by 27 (0 self)
The Reliable Router (RR) is a network switching element targeted to two-dimensional mesh interconnection network topologies. It is designed to run at 100 MHz and reach a useful link bandwidth of 3.2 Gbit/sec. The Reliable Router uses adaptive routing coupled with link-level retransmission and a unique-token protocol to increase both performance and reliability. The RR can handle a single node or link failure anywhere in the network without interruption of service. Other unique features include a queueless low-latency plesiochronous channel interface, and simultaneous bidirectional signalling.
1 Introduction
Interconnection networks play a major role in performance and reliability of massively parallel processors (MPPs). Previous work on network switching elements implementing oblivious routing such as the J-Machine Router [2], and the Caltech Mesh-Routing Chips [11] did not address the issue of reliability in part because of the inherent unreliability of oblivious routing. Past work...
Chaotic Routing - Design and Implementation of an Adaptive Multicomputer Network Router
1993
Cited by 26 (3 self)
By Kevin Bolding. Chairperson of Supervisory Committee: Professor Lawrence Snyder, Department of Computer Science and Engineering.
A crucial component of a massively parallel multicomputer is the interconnection network which links all of the nodes of the computer together. This network provides the primary method of communication between the hundreds or thousands of processing nodes and is, thus, critical to the successful operation of the multicomputer. Current state-of-the-art interconnection networks use simple, oblivious routing techniques which achieve very good performance when loading is light, but do not perform well in the presence of non-uniform congestion or faults. Chaotic routing, a non-minimal adaptive routing technique, provides a mechanism which takes into account the presence of congestion and faults when choosing a path for a message and can, thus, achieve better performance. Chaot...
Wormhole Routing Techniques for Directly Connected Multicomputer Systems
ACM Computing Surveys, 1998
Cited by 23 (0 self)
Wormhole routing has emerged as the most widely used switching technique in massively parallel computers. We present here a detailed survey of various techniques for enhancing the performance and reliability of wormhole routing schemes in directly connected networks. We start with an overview of the direct network topologies and a comparison of various switching techniques. Next, the characteristics of the wormhole routing mechanism are described in detail along with the theory behind deadlock-free routing. The performance of routing algorithms depends on the selection of the path between the source and the destination, the network traffic, and the router design. The routing algorithms are implemented in the router chips. We outline the router characteristics and describe the functionality of various elements of the router. Depending on the usage of paths between the source and the destination, the routing algorithms are classified as deterministic, fully adaptive, and partially adaptive. ...
Do Faster Routers Imply Faster Communication?
In First International Workshop, PCRCW'94, volume 853 of LNCS, 1994
Cited by 19 (0 self)
Despite significant improvements in network interfaces and software messaging layers, software communication overhead still dominates the hardware routing cost in most parallel systems. In this study, we identify the sources of this overhead by relating user communication services to particular network hardware features. Based on a detailed analysis of the active messages layer on the CM-5, we assign the software messaging cost to specific user communication services and network features. Our study shows that 50-70% of the software cost of messaging can be attributed to providing end-to-end flow control, in-order delivery, and reliable transmission services. This overhead is a direct effect of specific network features -- arbitrary delivery order, finite buffering, and limited fault-handling -- and is unlikely to be eliminated through improved software implementations. We conclude that reducing this software overhead requires changing the constraints on messaging layers...
Efficient Broadcast and Multicast on Multistage Interconnection Networks using Multiport Encoding
In Proceedings of the 8th IEEE Symposium on Parallel and Distributed Processing, 1996
Cited by 18 (10 self)
This paper proposes a new approach for implementing fast multicast and broadcast in unidirectional and bidirectional multistage interconnection networks (MINs) with multiport encoded multidestination worms. For a MIN with n stages, such worms use n header flits each. One flit is used for each stage of the network and it indicates the output ports to which a multicast message needs to be replicated. A multiport encoded worm with $(d_1, d_2, \ldots, d_n)$, $1 \le d_i \le k$, degrees of replication for the respective stages is capable of covering $d_1 \times d_2 \times \cdots \times d_n$ destinations with a single communication start-up. In this paper a switch architecture is proposed for implementing multidestination worms without deadlock. Three grouping algorithms of varying complexity are presented to derive the associated multiport encoded worms for a multicast to an arbitrary set of destinations. Using these worms a multinomial tree-based scheme is proposed to implement the multicast. This s...
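The coverage claim in the abstract above (one worm reaches the product of its per-stage replication degrees) can be checked with a small Python sketch. The function name `destinations_covered` is hypothetical, and reading k as the number of output ports per switch is an assumption; the abstract gives only the bound on each degree.

```python
from math import prod


def destinations_covered(degrees, k):
    """Count the destinations reached by one multiport encoded worm.

    degrees[i] is the replication degree at stage i+1 of the network.
    k is assumed here to be the number of output ports per switch,
    so every degree must lie between 1 and k inclusive.
    """
    for stage, d in enumerate(degrees, start=1):
        if not 1 <= d <= k:
            raise ValueError(f"degree {d} at stage {stage} is outside 1..{k}")
    # Each copy made at one stage is replicated again at the next,
    # so the total coverage is the product of the per-stage degrees.
    return prod(degrees)


if __name__ == "__main__":
    # Three stages with degrees 2, 1 and 4 on switches with 4 ports.
    print(destinations_covered([2, 1, 4], k=4))
```

Under these assumptions, setting every degree to k gives k to the power n destinations, which corresponds to a full broadcast from a single start-up.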

