Results 1 - 2 of 2
SCALABLE COLLECTIVE MESSAGE-PASSING ALGORITHMS
2011
Abstract

Cited by 1 (0 self)
Governments, universities, and companies expend vast resources building the top supercomputers. The processors and interconnect networks become faster, while the number of nodes grows exponentially. Problems of scale emerge, not least of which is collective performance. This thesis identifies and proposes solutions for two major scalability problems. Our first contribution is a novel algorithm for process partitioning and remapping for exascale systems that has far better time and space scaling than known algorithms. Our evaluations predict an improvement of up to 60x for large exascale systems and arbitrary reduction in the large temporary buffer space required for generating new communicators. Our second contribution consists of several novel collective algorithms for Clos and torus networks. Known allgather, reduce-scatter, and composite algorithms for Clos networks suffer the worst congestion when the largest messages are exchanged, damaging performance. Known algorithms for torus networks use only one network port, regardless of how many are available.
On the Reliability of Fat-Trees
Abstract
In this paper we examine the reliability properties of the ideal fat-tree, a general model used to capture both distance and bandwidth constraints of various classes of fat-tree networks. We use the notion (see [NPSY94]) of a random graph of type-G obtained by selecting edges of a given undirected graph G independently and with probability p, thus representing a network in which links fail independently and with probability f = 1 - p. In addition, we may allow vertices of the graph to fail independently with probability f. We show that: 1. At least half of the edge-disjoint paths connecting any subsets of terminal nodes of a fat-tree (whose fat-nodes do not have internal edges) are preserved with probability tending to 1 as the number of nodes in the tree tends to infinity, even in the case of constant vertex and edge failure probabilities f < 0.25. Thus, and by also showing the preservation of the minimum cut (and thus of the maximum flow), we prove that the ideal fat-tree ...
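The link-failure model described in this abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names (`surviving_links`, `connected`, `estimate_connectivity`) and the toy edge list are hypothetical, and it covers edge failures only, not vertex failures. Each edge of a graph G is kept independently with probability p (so it fails with probability f = 1 - p), and the fraction of trials in which two nodes remain connected is estimated by Monte Carlo sampling.

```python
import random


def surviving_links(edges, p, rng):
    """Keep each undirected edge independently with probability p (fail with f = 1 - p)."""
    return [e for e in edges if rng.random() < p]


def connected(u, v, edges):
    """Return True if u and v are joined by a path in the given edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def estimate_connectivity(edges, u, v, p, trials=1000, seed=0):
    """Monte Carlo estimate of the probability that u and v stay connected."""
    rng = random.Random(seed)
    hits = sum(connected(u, v, surviving_links(edges, p, rng)) for _ in range(trials))
    return hits / trials


# Hypothetical toy graph: two leaves attached to one switch by doubled ("fat") links.
toy_edges = [("leaf0", "sw"), ("leaf0", "sw"), ("leaf1", "sw"), ("leaf1", "sw")]
estimate = estimate_connectivity(toy_edges, "leaf0", "leaf1", p=0.75)
```

Doubling the links in the toy graph mimics, very loosely, the extra capacity toward the root of a fat-tree. The abstract's results concern asymptotic behaviour in much larger trees, which this sketch does not attempt to reproduce.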