A theory of fault-tolerant routing in wormhole networks (1997)
| Venue: | IEEE Transactions on Parallel and Distributed Systems |
| Citations: | 24 - 5 self |
BibTeX
@ARTICLE{Duato97atheory,
author = {José Duato},
title = {A theory of fault-tolerant routing in wormhole networks},
journal = {IEEE Transactions on Parallel and Distributed Systems},
year = {1997},
volume = {8},
pages = {790--802}
}
Abstract
Fault-tolerant systems aim at providing continuous operation in the presence of faults. Multicomputers rely on an interconnection network between processors to support the message-passing mechanism. Therefore, the reliability of the interconnection network is very important for the reliability of the whole system. This paper analyzes the effective redundancy available in a wormhole network by combining connectivity and deadlock freedom. Redundancy is defined at the channel level. We propose a sufficient condition for channel redundancy, also computing the set of redundant channels. The redundancy level of the network is also defined, proposing a theorem that supplies its value. This theory is developed on top of our necessary and sufficient condition for deadlock-free adaptive routing. The new theory also considers the failure of physical channels when virtual channels are used. Finally, we propose a methodology for the design of fault-tolerant routing algorithms, showing its application to n-dimensional meshes.

Index Terms: Adaptive routing, channel redundancy, fault-tolerant routing, interconnection networks, network redundancy, wormhole switching.







