
From AAPC Algorithms to High Performance Permutation Routing and Sorting (1996)  (5 citations)
Thomas M. Stricker, Jonathan C. Hardwick
Proceedings of the 8th ACM Symposium on Parallel Algorithms and Architectures



View or download:
cmu.edu/project/scandal/p...spaa96.ps.Z
cmu.edu/~scandal/papers/spaa96.ps.Z
cs.inf.ethz.ch/str...tingsorting.ps.gz

From:  cmu.edu/~jch/publications (more)
From:  cmu.edu/~scandal/papers/
Homepages:  J.Hardwick  

Derivation of an optimal all-to-all permutation implementation for the Cray T3D.

Abstract: Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalized communication (AAPC) over communication networks such as meshes, hypercubes and omega switches. However, the constant factors of these algorithms are often an obscure function of system parameters such as link speed, processor clock rate, and memory access time. In this paper we investigate these architectural factors, showing the impact of the communication style, the network routing table, and...

Context of citations to this paper:

...the particular AAPC patterns used result in the same route being chosen every time. The solution is a revision of the AAPC patterns (see [8] for details). This improves aggregate performance to 23,910 MByte/s (46.7 MByte/s per processor), or 62% of the nominal bisection...

...but there is one of outstanding importance: balanced all-to-all routing (also called balanced all-to-all personalized communication [6]). In a balanced all-to-all routing on a network with P PUs, each PU_i, 0 ≤ i < P, initially holds P packets p_{i,j}, 0 ≤ j < P, of size...
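
The definition quoted above can be made concrete with a small simulation. This is a minimal sketch, not the schedule derived in the paper: it assumes the usual AAPC convention that packet p_{i,j} starts on PU i and is destined for PU j, and it uses a plain cyclic-shift schedule. The function name `balanced_all_to_all` and the packet tuple layout are hypothetical, chosen for illustration only.

```python
def balanced_all_to_all(P, size=1):
    """Simulate a balanced all-to-all exchange among P processing units.

    Assumption: packet p_{i,j} is held by PU i and destined for PU j.
    Each packet is modelled as a tuple (source, destination, size).
    """
    # outbox[i][j] is packet p_{i,j}, initially stored on PU i.
    outbox = [[(i, j, size) for j in range(P)] for i in range(P)]
    # inbox[j][i] will hold the packet that PU j received from PU i.
    inbox = [[None] * P for _ in range(P)]

    # Cyclic-shift schedule: in round r, PU i sends to PU (i + r) mod P.
    # Every round is a permutation, so each PU sends exactly one packet
    # and receives exactly one packet per round.
    for r in range(P):
        for i in range(P):
            j = (i + r) % P
            inbox[j][i] = outbox[i][j]
    return inbox


if __name__ == "__main__":
    result = balanced_all_to_all(4)
    for j, received in enumerate(result):
        print(f"PU {j} received: {received}")
```

After P rounds every PU holds exactly one packet from each PU, so the exchange amounts to a transpose of the P x P packet matrix. Real implementations differ mainly in how the rounds are mapped onto network links, which is where the architectural factors named in the abstract come in.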

Cited by:
In: Proceedings of the 2005 IEEE International.. - Comprehensive..
Cost/Performance Tradeoffs in Network Interconnects for.. - Kurmann, Rauch, Stricker (2003)
Routing with Finite Speeds of Memory and Network - Jop Sibeyn (1997)

Active bibliography (related documents):
0.5:   Global Address Space, Non-Uniform Bandwidth: A Memory System.. - Stricker, Gross (1997)
0.5:   Final Report on Research in Parallel Computing.. - December Carnegie (1996)
0.1:   Practical Parallel Algorithms for Personalized.. - Bader, Helman.. (1995)

Similar documents based on text:
0.5:   An Architecture for Optimal All-to-All Personalized.. - Hinrichs, Kosak.. (1994)
0.3:   On Scheduling All-to-All Personalized Connections and.. - Zhang, Qiao (1996)
0.3:   Scheduling in Unidirectional WDM Rings and Its Extensions - Xijun Zhang And (1997)

Related documents from co-citation:
3:   A comparison of sorting algorithms for the Connection Machine CM-2 - Blelloch, Leiserson et al. - 1991
3:   Linpack Users' Guide (context) - Dongarra, Moler et al. - 1979
3:   BEOWULF: A Parallel Workstation for Scientific Computation - Sterling, Becker et al. - 1995

BibTeX entry:

Thomas M. Stricker and Jonathan C. Hardwick. From AAPC algorithms to high performance permutation routing and sorting. Technical Report CMU-CS-96-120, School of Computer Science, Carnegie Mellon University, April 1996. To appear. http://citeseer.ist.psu.edu/stricker96from.html

@inproceedings{stricker96from,
    author = "Thomas M. Stricker and Jonathan C. Hardwick",
    title = "From {AAPC} Algorithms to High Performance Permutation Routing and Sorting",
    booktitle = "Proceedings of the 8th {ACM} Symposium on Parallel Algorithms and Architectures",
    pages = "200--203",
    month = "June",
    year = "1996",
    url = "http://citeseer.ist.psu.edu/stricker96from.html"
}
Citations (may not include all citations):
182   A comparison of sorting algorithms for the Connection Machin.. - Blelloch, Leiserson et al. - 1991
26   Radix sort for vector multiprocessors - Zagha, Blelloch - 1991
25   Decoupling synchronization and data transfer in message pass.. - Stricker, Stichnoth et al. - 1995
20   Optimizing memory system performance for communication in pa.. - Stricker, Gross - 1995
8   Complete exchange on the CM-5 and Touchstone Delta (context) - Thakur, Ponnusamy et al. - 1995
5   From AAPC algorithms to high performance permutation routing.. - Stricker, Hardwick - 1996
2   Supporting the hypercube programming model on meshes (context) - Stricker - 1992
2   Empirical evaluation Cray T3D (context) - Krishnamurthy, Yelick et al. - 1995





Documents on the same site (http://www.cs.cmu.edu/~jch/publications.html):
Implementation and Evaluation of an Efficient Parallel Delaunay.. - Hardwick (1997)
Implementation of a Portable Nested Data-Parallel Language - Blelloch (1994)
An Efficient Implementation of Nested Data Parallelism for.. - Hardwick (1996)

