C. J. Glass and L. M. Ni, "Fault tolerant wormhole routing in meshes," in Proceedings of 23rd Annual International Symposium on Fault Tolerant Computing, 1993, pp. 240-249.
....Related Work. Considerable research has been done on making wormhole routing fault tolerant. Papers such as [4] have added virtual channels to the network to handle faults. Virtual channels divide a single physical channel into many channels, sharing the bandwidth between them. Papers such as [8] have used an adaptive turn based model to avoid faults. If a faulty processor is encountered on the network, a message will choose a path around the failed processor. All of these wormhole routing works are designed to tolerate fail stop faults [12] meaning that one or more processors will cease ....
C. J. Glass and L. M. Ni, "Fault tolerant wormhole routing in meshes," in Proceedings of 23rd Annual International Symposium on Fault Tolerant Computing, 1993, pp. 240-249.
....100,000 messages were exchanged. Generally performances of routing algorithms are measured in terms of the average message latencies and saturation points, which are considered as the highest sustainable message generation rates. These experimental assumptions are similar to those reported in [12, 26]. Our first set of experiments involved comparing the performances of deterministic TP and adaptive TP approaches as discussed above. We simulated message delivery using the low level simulator for both the deterministic and adaptive TP. For network sizes of 8, 16, 32, 64, 128, and 256 nodes we ....
C. Glass and L. Ni, Fault-Tolerant Wormhole Routing in Meshes, Proc. of Int. Symp. on Fault-Tolerant Computing, 1993.
....21, 22] but most of these are applicable only in packet switched (store and forwarding) hypercube networks. Because of their advantages in latency and implementation, the routing community has recently begun to focus on adaptive routing algorithms for wormhole routed, low dimensional networks [2, 4, 3, 23, 24]. Chien and Kim have extended planar adaptive routing with misrouting to support fault tolerance [3]. By reconfiguring faults to be a convex region, the remainder of the network can continue to operate. Gaughan and Yalamanchili enhanced pipelined circuit switching, a variant of wormhole routing, ....
C. Glass and L. Ni, "Fault-tolerant wormhole routing in meshes," in Proceedings of International Symposium on Fault Tolerant Computing, 1993.
.... model 1 Introduction The concept of wormhole routing was first described in a paper by Dally and Seitz [6]. Since then a number of papers have been published addressing e.g. deadlock freedom [9, 28, 10, 2, 14, 21, 5], effective utilization of bandwidth, see e.g. [19, 13], and fault tolerance [20, 12, 7, 15, 27, 11, 3]. In recent years the concept of wormhole routing has been taken up both in products and international standards, in particular in the domain of multiprocessing, see e.g. [22, 25, 24, 26, 16]. The wormhole switching technique is basically an improvement of the virtual cut through technique of ....
C. J. Glass and L. M. Ni. Fault-tolerant wormhole routing in meshes. In J.-C. Laprie, editor, Proceedings of the 23rd Annual International Symposium on Fault-Tolerant Computing (FTCS '93), pages 240-249, June 1993.
....should be sent in such a way that no deadlocks can appear in the system. In this paper we consider the case when wormhole routing [26] is used. According to this method, each message is divided into flits and all flits follow the same path. The major consideration is the absence of deadlocks [4,8]. Most of the known methods for wormhole routing were designed for a specific architecture, such as n dimensional mesh [10,31], while the design of deadlock free routing algorithms in irregular topologies introduces new challenges [23]. Very few papers [23,28] have been published on routing for ....
C. Glass and L. Ni, "Fault-Tolerant Wormhole Routing in Meshes," Proc. of Int. Symp. on Fault-Tolerant Computing, 1993.
....degradation. Even more serious is the possibility that a set of channel faults causes some processors to become entirely unreachable due to routing restrictions, even if the network remains strongly connected. Recently, a number of fault tolerant wormhole routing algorithms have been proposed [2, 8, 9]. Although these algorithms differ from one another in a number of ways, a common feature of these algorithms is that they all ensure freedom from deadlock by imposing a partial ordering on the set of channels using mechanisms such as channel dependency graphs [5], extended channel dependency ....
....labeling if there exists a strictly decreasing path between every pair of vertices. A large number of deadlock free wormhole routing algorithms employing this approach have been proposed [1, 4, 7, 11, 12]. Recently, several fault tolerant routing algorithms have also been proposed [2, 8, 9]. The difficulty in designing a fault tolerant deadlock free routing algorithm is that as channels fail (the failure of a router can be viewed as the failure of all channels incident on that router) there may no longer be strictly decreasing paths between some pairs of nodes with respect to the ....
C. Glass and L. Ni. Fault-tolerant wormhole routing in meshes. In FTCS23, pages 240--249, June 1993.
....network through which the processors can communicate efficiently. Toroidal interconnection networks are simple and practical interconnection networks for multiprocessor systems [11, 15, 17]. Figure 1 gives a two dimensional torus. Fault tolerant routing on torus has attracted much attention [14, 1, 2, 10, 6, 8, 7]. Most of the previous works focus on several specific routing models like storage forward, circuit switched, wormhole, and so on. In this paper, we study node fault tolerant routing in torus at the topological level. We view a multiprocessor system as a graph, nodes express the processors and ....
C.J. Glass and L.M. Ni. Fault-tolerant wormhole routing in meshes. In Proc. of the 23rd Annual International Symposium on Fault-Tolerant Computing, pages 240-249, 1993.
....encounters a permanent fault, it circumvents the faults using alternative paths. To ensure reliable message transfer, each message holds its path until the last data flit reaches the destination. Glass and Ni have proposed an extension of the negative first algorithm to make it fault tolerant [34]. The negative first algorithm provides full adaptivity to the message at all times except when it is routing in the negative edge of the mesh or in the last dimension. The fault tolerant extension of this algorithm is targeted at removing these few cases of non adaptiveness. Their approach is ....
C. J. Glass and L. M. Ni, "Fault-tolerant wormhole routing in meshes," Intl. Symposium on Fault-Tolerant Computing, pp. 240-249, 1993.
....of Fully Adaptive Jesshope, Miller, Yes 8 Number of Virtual Yantchev [48] Channels is Exponential Linder Harden [67] Yes 6 in Dimension of Mesh (if necessary) and then in any other direction without restriction. These routing algorithms also support nonminimal routing. Glass and Ni [41] show how to modify the negative first routing algorithm to produce a routing algorithm that can tolerate at least n - 1 faulty nodes on an n dimensional mesh. Boura and Das [5] use their methodology to propose a large class of partially adaptive routing algorithms. The methodology is fairly ....
C. Glass and L. M. Ni. Fault-Tolerant Wormhole Routing in Meshes. In 23rd Annual International Symposium on Fault-Tolerant Computing, pages 240--249, 1993.
....introducing deadlocks and livelocks. Further, we show, using simulations, that good performance may be achieved even with 10 of the links faulty. Related results. Routing algorithms for wh and virtual cut through switching techniques have been the subject of extensive research in recent years [7, 11, 14, 9, 15, 21, 2]. Several results have been reported for fault tolerant routing in hypercubes. These results exploit the rich interconnection structure of hypercubes and are not suitable for high radix, low dimensional meshes. Reddy and Freitas [20] use global knowledge of faults and routing tables to investigate ....
....low dimensional meshes. Reddy and Freitas [20] use global knowledge of faults and routing tables to investigate the performance limitations caused by faults. Gaughan and Yalamanchili [13] use a pipelined circuit switching (PCS) mechanism with backtracking for fault tolerant routing. Glass and Ni [15] present a partially adaptive algorithm, called negative first, that tolerates up to (n - 1) faults in an n dimensional mesh without any extra virtual channels. Unfortunately, the negative first and its related algorithms do not have good performance for the fault free case [3] and the ....
C. J. Glass and L. M. Ni. Fault-tolerant wormhole routing in meshes. In Twenty-Third Annual Int. Symp. on Fault-Tolerant Computing, pages 240--249, 1993.
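The negative first rule that the excerpts above refer to can be illustrated with a minimal sketch. It covers only the basic fault-free, minimal-path version of the turn model rule (take every hop in a negative direction before any hop in a positive direction); it is not the fault-tolerant extension of Glass and Ni, which these excerpts do not describe. The function name `negative_first_hops` and the coordinate-tuple representation are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def negative_first_hops(cur: Sequence[int], dst: Sequence[int]) -> List[Tuple[int, int]]:
    """Return the minimal next hops allowed by the negative first rule.

    Each hop is a (dimension, step) pair with step equal to -1 or +1.
    Hypothetical helper: nodes are coordinate tuples in an n-dimensional mesh.
    """
    if len(cur) != len(dst):
        raise ValueError("cur and dst must have the same number of dimensions")

    # Phase 1: while any negative-direction hop is still needed, only those
    # hops are permitted (the choice among them is adaptive).
    negative = [(d, -1) for d, (c, t) in enumerate(zip(cur, dst)) if t < c]
    if negative:
        return negative

    # Phase 2: only positive-direction hops remain (again chosen adaptively).
    return [(d, +1) for d, (c, t) in enumerate(zip(cur, dst)) if t > c]
```

When the destination lies in the negative direction in one dimension and the positive direction in another, the sketch returns a single permitted hop until the negative offset is used up, which is why the algorithm is only partially adaptive.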
....low latency message delivery, high throughput, graceful performance degradation, and adaptation to a variety of traffic and fault patterns. Previous work on fault tolerant routing has been concentrated on augmenting the existing adaptive routing algorithms with fault tolerant capabilities. In [5], Ni has extended the partially adaptive turn model algorithm negative first to tolerate n - 1 faults in n dimensional meshes without using any virtual channels. For the low dimensional networks, the number of faults tolerated is very small. Dally and Aoki have presented an adaptive, ....
C. J. Glass and L. M. Ni, "Fault-Tolerant Wormhole Routing in Meshes," Intl. Symposium on Fault-Tolerant Computing, pp. 240-249, 1993.
....While the e-cube is simple to implement and provides high throughput for uniform traffic, it cannot handle even simple node or link faults due to its nonadaptive routing. Adaptive, fault tolerant cut through routing algorithms have been the subject of extensive research in recent years [11, 19, 15, 21, 24, 1, 4, 7, 20, 2, 16, 6]. These results implicitly or explicitly assume routers with centralized crossbars. Therefore, such techniques are not suitable for multiprocessors with PDRs. Several other results (see, for example, [23, 25] and the references therein) exploit the rich interconnection structure of hypercubes and ....
C. J. Glass and L. M. Ni, "Fault-tolerant wormhole routing in meshes," in Twenty-Third Annual Int. Symp. on Fault-Tolerant Computing, pp. 240--249, 1993.
....is a natural communication primitive to handle synchronizations, invalidations and updates of cache lines in distributed shared memory computers, and since parallel computers must be used efficiently even in the presence of faults. There are some recent results on fault tolerant wormhole routing [6, 4, 9, 8, 1, 2] and multicast wormhole routing [13, 11, 3], but very few results exist on fault tolerant multicast routing (for a result on hypercubes with limited number of faults see [14]). In this paper, we show how to provide fault tolerant communication using two recently proposed multicast routing ....
C. J. Glass and L. M. Ni, "Fault-tolerant wormhole routing in meshes," in Twenty-Third Annual Int. Symp. on Fault-Tolerant Computing, pp. 240--249, 1993.
....if minimal routing is required, then at least one half of source-destination pairs will have only a single routing path from the source to destination. Because this algorithm cannot always route messages along shortest paths in the network, it is called partially adaptive routing. The authors of [17] modified the routing algorithm of this turn model to make it (n - 1) fault tolerant for n dimensional meshes. However, for 4 or higher dimensional meshes, it is only an assertion or conjecture, i.e. its validity remains to be proved. Chien and Kim [3] proposed a planar adaptive routing ....
....because both communication and computation units can be integrated into a single chip in contemporary multicomputer systems. The fault on each outgoing link can be treated as a corresponding node fault. There are two fault types for supporting fault tolerant routing: dynamic and static faults [17]. For a system to tolerate dynamic faults, nodes may become faulty or nonfaulty at any time. By contrast, the static case lets faults occur only when the network is shut down. Tolerating dynamic faults can enhance the run time life of a multicomputer, thus increasing reliability. On the other ....
C. J. Glass and L. M. Ni, "Fault-tolerant wormhole routing in meshes," in Proc. of IEEE 23rd Int'l Symposium on Fault-Tolerant Computing, pp. 240--249, 1993.
....of the same length (in hops). Therefore, the e-cube cannot handle even simple node or link faults, because even one fault disrupts many e-cube communication paths. Therefore, adaptive and fault tolerant routing for multicomputer networks has been the subject of extensive research in recent years [5, 9, 7, 11, 2, 10, 3]. Most of the current techniques to handle faults in torus and mesh networks require one or more of the following: (a) new routing algorithms with adaptivity [5, 7, 11], (b) global knowledge of faults, (c) restriction on the shapes, locations, and number of faults [5, 7, 11, 3], and (d) relaxing ....
.... and fault tolerant routing for multicomputer networks has been the subject of extensive research in recent years [5, 9, 7, 11, 2, 10, 3]. Most of the current techniques to handle faults in torus and mesh networks require one or more of the following: (a) new routing algorithms with adaptivity [5, 7, 11], (b) global knowledge of faults, (c) restriction on the shapes, locations, and number of faults [5, 7, 11, 3], and (d) relaxing the constraint of guaranteed delivery, deadlock or livelock free routing. Chalasani's research has been supported in part by the NSF grant CCR 9308966. Boppana's ....
[Article contains additional citation context not shown here]
C. J. Glass and L. M. Ni, "Fault-tolerant wormhole routing in meshes," in Twenty-Third Annual Int. Symp. on Fault-Tolerant Computing, pp. 240--249, 1993.
.... of faults [4,5,8,11]. For example, fault rings are constructed around convex faulty regions using additional virtual channels and attendant routing restrictions [4]. Additionally, source hardware synchronization mechanisms have been proposed to change routing decisions in the presence of faults [20], and partially adaptive routing around convex fault regions with no additional channels is feasible [5], while more recently the use of time outs and deadlock recovery mechanisms has been proposed [22]. Alternatively, in the pipelined circuit switching (PCS) flow control mechanism, the path ....
....TP's partially optimistic behavior results in a severe performance degradation. With conservative routing protocols, no network resources are reserved until a path has been setup between the source and the destination. TP does not require any complex renumbering scheme to provide fault tolerance [19,20], does not require the construction of convex regions [4,5], does not require additional virtual channels [4], and the dynamic fault tolerant version of TP does not rely on time outs [11] or padding of messages [22]. It does, however, result in a more complex channel model which can affect link ....
C. J. Glass and L. M. Ni. Fault-tolerant wormhole routing in meshes. Proceedings of the 23rd International Symposium on Fault-Tolerant Computing, pages 240-249, 1993.
....results were presented in part at the 24th Annual International Conference on Parallel Processing, August 1995. 2 the network to be able to dynamically route messages along alternative, possibly non minimal paths. Techniques for adaptive routing have been proposed as means of avoiding faults [4,5,9,14,19,21,24]. Fault tolerant routing algorithms have been developed for both wormhole switched and packet switched networks. These techniques generally require substantial hardware support within the network routers, but can deliver robust inter processor communication performance in the presence of a ....
....the destination node. That fraction of message may be detected as faulty by using CRC. The idea is not recovering the message. Simply, dead flits can be easily removed with minimum hardware support. There exist other techniques to handle recovery from faults which interrupt messages in progress [21,24,10,18]. In this paper, we consider two fault models. The first one is a coalesced fault model. Adjacent faulty links and faulty nodes are coalesced into fault regions. Fault regions may overlap forming a larger fault region. We assume that the fault regions do not disconnect the network. Fault regions ....
C. J. Glass and L. Ni, "Fault-tolerant wormhole routing in meshes," Proceedings of the Fault Tolerant Computing Symposium, 1993.
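The coalesced fault model mentioned in the excerpt above can be sketched as a connected-components pass over the faulty nodes. This is a minimal illustration under stated assumptions: a 2D mesh, node faults only (faulty links are not modeled), and 4-neighbor adjacency as the meaning of "adjacent". The excerpt does not give the citing paper's exact adjacency rule or region shape, and the name `coalesce_faults` is hypothetical.

```python
from collections import deque
from typing import Iterable, List, Tuple

Node = Tuple[int, int]


def coalesce_faults(faulty: Iterable[Node]) -> List[List[Node]]:
    """Group adjacent faulty nodes of a 2D mesh into fault regions.

    Assumption: two faulty nodes belong to the same region when they are
    mesh neighbors (they differ by one in exactly one coordinate).
    """
    remaining = set(faulty)
    regions: List[List[Node]] = []
    while remaining:
        # Start a new region from the smallest unvisited faulty node so the
        # output order is deterministic.
        seed = min(remaining)
        remaining.remove(seed)
        region = {seed}
        queue = deque([seed])
        while queue:
            x, y = queue.popleft()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    region.add(nb)
                    queue.append(nb)
        regions.append(sorted(region))
    return regions
```

For example, faulty nodes at (0, 0), (0, 1) and (3, 3) form two regions: the first two nodes are neighbors and merge, while (3, 3) stays on its own.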
....be designed for n dimensional meshes. A variant of this algorithm is being used in the reliable router project at MIT [8]. Our routing techniques require only local knowledge of faults and work correctly when faulty components are confined to one or more rectangular blocks [2, 4]. Glass and Ni [13] present the negative first algorithm, which tolerates up to (n - 1) faults in an n dimensional mesh. Chien and Kim show that the planar adaptive routing algorithms can tolerate block faults in the mesh, if no faults are present on the boundaries of the mesh [5]. Dally and Aoki use dimension ....
C. J. Glass and L. M. Ni, "Fault-tolerant wormhole routing in meshes," in Twenty-Third Annual Int. Symp. on Fault-Tolerant Computing, pp. 240--249, 1993.
....freedom from livelock and starvation. Minimality also conserves network resources such as link bandwidth and buffers [5]. Full Adaptivity: Full adaptivity implies that the routing algorithm can choose any of the possible paths between source and destination, subject to the minimality constraint [26, 16, 15, 1]. It is useful for networks under load imbalance, as the adaptivity allows a wider range of possible paths. Fault Tolerance: Fault tolerance is the ability of a routing algorithm to bypass faulty communication links. The Reliable Router requires a particular form called One Fault Tolerant ....
....1] It is useful for networks under load imbalance, as the adaptivity allows a wider range of possible paths. Fault Tolerance: Fault tolerance is the ability of a routing algorithm to bypass faulty communication links. The Reliable Router requires a particular form called One Fault Tolerant Routing [16], where the algorithm tolerates at least a single network failure. Dimension Reversals: Dally and Aoki [6] developed two types of adaptive algorithms (static and dynamic) based upon the concept of dimension reversals. Both algorithms use multiple virtual channels. One of the virtual channels is ....
[Article contains additional citation context not shown here]
Christopher J. Glass and Lionel M. Ni. Fault-tolerant wormhole routing in meshes. 1993.
....optimistic behavior results in a severe performance degradation. With conservative routing protocols, no network resources are reserved until a path has been setup between the source and the destination. However, TP does not require any complex renumbering scheme to provide fault tolerance [19,20], does not require the construction of convex regions to ease routing [4,5,23], does not require additional virtual channels [4], and the dynamic fault tolerant version of TP does not rely on time outs [11] or padding of messages [22]. It does, however, result in a more complex channel model which ....
C. J. Glass and L. M. Ni. Fault-tolerant wormhole routing in meshes. Proceedings of the 23rd International Symposium on Fault-Tolerant Computing, pages 240-249, 1993.
....paths of the same length (in hops). Therefore, the e-cube cannot handle even simple node or link faults, because even one fault disrupts many e-cube communication paths. Adaptive and fault tolerant routing for multicomputer networks has been the subject of extensive research in recent years [2, 3, 4, 6, 8, 10, 11, 12, 16]. Most of the current techniques to handle faults in torus and mesh networks require one or more of the following: (a) new routing algorithms with adaptivity [4, 6, 8, 11, 12], (b) global knowledge of faults, (c) restriction on the shapes, locations, and number of faults [6, 8, 12, 3, 16], and (d) ....
.... fault tolerant routing for multicomputer networks has been the subject of extensive research in recent years [2, 3, 4, 6, 8, 10, 11, 12, 16]. Most of the current techniques to handle faults in torus and mesh networks require one or more of the following: (a) new routing algorithms with adaptivity [4, 6, 8, 11, 12], (b) global knowledge of faults, (c) restriction on the shapes, locations, and number of faults [6, 8, 12, 3, 16], and (d) relaxing the constraints of guaranteed delivery, deadlock or livelock free routing [16]. In this paper, we present fault tolerant routing methods that can be used to augment ....
[Article contains additional citation context not shown here]
C. J. Glass and L. M. Ni. Fault-tolerant wormhole routing in meshes. In Twenty-Third Annual Int. Symp. on Fault-Tolerant Computing, pages 240--249, 1993.
No context found.
C. J. Glass and L. Ni, "Fault-tolerant wormhole routing in meshes," Proc. of the Fault Tolerant Computing Symposium, 1993.