Derivation of an optimal all-to-all permutation implementation for the Cray T3D.
Abstract: Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalized communication (AAPC) over communication networks such as meshes, hypercubes and omega switches. However, the constant factors of these algorithms are often an obscure function of system parameters such as link speed, processor clock rate, and memory access time. In this paper we investigate these architectural factors, showing the impact of the communication style, the network routing table, and...
Context of citations to this paper:
...the particular AAPC patterns used result in the same route being chosen every time. The solution is a revision of the AAPC patterns (see [8] for details). This improves aggregate performance to 23,910 MByte/s (46.7 MByte/s per processor), or 62% of the nominal bisection...
...but there is one with an outstanding importance: balanced all-to-all routing (also called balanced all-to-all personalized communication [6]). In a balanced all-to-all routing on a network with $P$ PUs, each PU $PU_i$, $0 \le i < P$, initially holds $P$ packets $p_{i,j}$, $0 \le j < P$, of size...
Cited by:
In: Proceedings of the 2005 IEEE International.. - Comprehensive..
Cost/Performance Tradeoffs in Network Interconnects for.. - Kurmann, Rauch, Stricker (2003)
Routing with Finite Speeds of Memory and Network - Jop Sibeyn (1997)
Active bibliography (related documents):
0.5: Global Address Space, Non-Uniform Bandwidth: A Memory System.. - Stricker, Gross (1997)
0.5: Final Report on Research in Parallel Computing.. - December Carnegie (1996)
0.1: Practical Parallel Algorithms for Personalized.. - Bader, Helman.. (1995)
Similar documents based on text:
0.5: An Architecture for Optimal All-to-All Personalized.. - Hinrichs, Kosak.. (1994)
0.3: On Scheduling All-to-All Personalized Connections and.. - Zhang, Qiao (1996)
0.3: Scheduling in Unidirectional WDM Rings and Its Extensions - Xijun Zhang And (1997)
Related documents from co-citation:
3: A comparison of sorting algorithms for the Connection Machine CM - Blelloch, Leiserson et al. - 1991
3: Linpack Users' Guide - Dongarra, Moler et al. - 1979
3: BEOWULF: A Parallel Workstation for Scientific Computation - Sterling, Becker et al. - 1995
BibTeX entry:
Thomas M. Stricker and Jonathan C. Hardwick. From AAPC algorithms to high performance permutation routing and sorting. Technical Report CMU-CS-96-120, School of Computer Science, Carnegie Mellon University, April 1996. To appear. http://citeseer.ist.psu.edu/stricker96from.html
@inproceedings{ stricker96from,
author = "Thomas M. Stricker and Jonathan C. Hardwick",
title = "From {AAPC} Algorithms to High Performance Permutation Routing and Sorting",
booktitle = "Proceedings of the 8th {ACM} Symposium on Parallel Algorithms and Architectures",
pages = "200-203",
month = "June",
year = "1996",
url = "citeseer.ist.psu.edu/stricker96from.html" }
Citations (may not include all citations):
182: A comparison of sorting algorithms for the Connection Machin.. - Blelloch, Leiserson et al. - 1991
26: Radix sort for vector multiprocessors - Zagha, Blelloch - 1991
25: Decoupling synchronization and data transfer in message pass.. - Stricker, Stichnoth et al. - 1995
20: Optimizing memory system performance for communication in pa.. - Stricker, Gross - 1995
8: Complete exchange on the CM-5 and Touchstone Delta - Thakur, Ponnusamy et al. - 1995
5: From AAPC algorithms to high performance permutation routing.. - Stricker, Hardwick - 1996
2: Supporting the hypercube programming model on meshes - Stricker - 1992
2: Empirical evaluation Cray T3D - Krishnamurthy, Yelick et al. - 1995
Documents on the same site (http://www.cs.cmu.edu/~jch/publications.html):
Implementation and Evaluation of an Efficient Parallel Delaunay.. - Hardwick (1997)
Implementation of a Portable Nested Data-Parallel Language - Blelloch (1994)
An Efficient Implementation of Nested Data Parallelism for.. - Hardwick (1996)