Abstract: Many theoretical models of parallel computation are based on overly simplistic
assumptions about the performance of the interconnection network. For example, they
assume constant latency for any communication pattern or infinite bandwidth. This
paper presents a case study based on the FFT transpose algorithm, which is mapped
on two families of scalable interconnection networks, the k-ary n-trees and the k-ary
n-cubes. We analyze in depth the network behavior of a minimal adaptive algorithm
for the...
Context of citations to this paper:
...machine with 256 processors. These processors are connected with an indirect interconnection network using state-of-the-art routers [30]. Based on these figures, there is obviously an uneven and inefficient use of system resources. During the two computational phases of the...
...thesis together with some related work, not shown here for brevity, either appeared in the literature or are still under review [118] [120] [121] [123] [122] [124] [117] [130] [129] [127] [128] [119] [125] [126] [47]. 1.6 Thesis Overview: The thesis is organized as follows...
Cited by:
Minimal Adaptive Routing with Limited Injection on Toroidal .. - Petrini, Vanneschi (1996)
A New Approach to Parallel Program Development and.. - Petrini, Bassetti.. (1999)
Communication Performance of Wormhole Interconnection Networks - Petrini (1997)
Similar documents (at the sentence level):
54.4%: Latency and Bandwidth Requirements of Massively Parallel.. - Petrini, Vanneschi (1999)
10.0%: A Comparison of Wormhole-Routed Interconnection Networks - Petrini, Vanneschi (1997)
Active bibliography (related documents):
0.4: Efficient Total-Exchange in Wormhole-Routed Toroidal Cubes - Petrini (1999)
0.4: Parallel 1D-FFT Computation on Constant-valence Multicomputers - Mazzeo, Villano (1995)
0.3: Towards a Generic Analytical Model of Wormhole Routing Networks - Lysne (1998)
Similar documents based on text:
0.8: Minimal vs. non Minimal Adaptive Routing on k-ary n-cubes - Petrini, Vanneschi
0.3: Performance Analysis of Wormhole Routed k-ary n-trees - Petrini, Vanneschi (1998)
0.2: Network Performance under Physical Constraints - Petrini, Vanneschi (1997)
Related documents from co-citation:
5: SMART: a Simulator of Massive ARchitectures and Topologies - Petrini, Vanneschi (1997)
5: trees: High Performance Networks for Massively Parallel Architectures - Petrini, Vanneschi (1995)
5: A Necessary and Sufficient Condition for Deadlock Free Adaptive Routing in Wormh.. (1995)
BibTeX entry:
Fabrizio Petrini and Marco Vanneschi. Latency and Bandwidth Requirements of Massively Parallel Programs: FFT as a Case Study. Future Generation Computer Systems, 1999. Accepted for publication. http://citeseer.ist.psu.edu/article/petrini99latency.html
@inproceedings{ petrini96latency,
author = "Fabrizio Petrini and Marco Vanneschi",
title = "Latency and Bandwidth Requirements of Massively Parallel Programs: {FFT} as a Case Study",
booktitle = "Euro-Par, Vol. I",
pages = "307--312",
year = "1996",
url = "citeseer.ist.psu.edu/article/petrini99latency.html" }
Citations (may not include all citations):
981: Introduction to Parallel Algorithms and Architectures: Array.. - Leighton (1992)
531: LogP: Towards a Realistic Model of Parallel Computation - Culler, Karp et al. (1993)
462: Deadlock-Free Message Routing in Multiprocessor Interconnect.. - Dally, Seitz (1987)
218: Parallelism in Random Access Machines - Fortune, Wyllie (1978)
171: Advanced Computer Architecture: Parallelism - Hwang (1993)
130: LogGP: Incorporating Long Messages into the LogP Model - One.. - Alexandrov, Ionescu et al. (1995)
129: cube Interconnection Networks - Dally, of (1990)
93: IEEE Transactions on Parallel and Distributed Systems - Dally, Flow (1992)
81: A Bridging Model for Parallel Computation - Valiant (1990)
79: On Communication Latency in PRAM Computation - Aggarwal, Chandra et al. (1989)
79: Communication Complexity of PRAMs - Aggarwal, Chandra et al. (1990)
65: Optimal Broadcast and Summation in the LogP Model - Karp, Sahay et al. (1992)
62: Designing Broadcasting Algorithms in the Postal Model for Me.. - Bar-Noy, Kipnis (1992)
61: Where is Time Spent in Message-Passing and Shared-Memory Pro.. - Chandra, Larus et al. (1994)
61: FFTs in External or Hierarchical Memory - Bailey (1990)
55: Multiprocessor FFTs - Swarztrauber (1987)
38: Assessing Fast Network Interfaces - Culler, Liu et al.
35: Chaotic Routing: Design and Implementation of an Adaptive Mu.. - Bolding (1993)
32: IEEE Transactions on Parallel and Distributed Systems - Gupta, Kumar et al. (1993)
27: SMART: a Simulator of Massive ARchitectures and Topologies - Petrini, Vanneschi (1997)
23: The Message-Driven Processor - Dally (1992)
22: Performance in Parallel Simulation of Interconnection Networ.. - Burger, Wood (1995)
16: A High-Performance FFT Algorithm for Vector Supercomputers - Bailey (1988)
16: FFT Algorithms for Vector Computers - Swarztrauber (1984)
13: Congestion-Free Routing on the CM-5 Data Router - Heller (1994)
11: trees: High Performance Networks for Massively Parallel Arch.. - Petrini, Vanneschi (1997)
11: Hiding Communication Costs in Bandwidth-Limited Parallel FFT.. - Sahay (1992)
8: Dimensional Parallel FFT Benchmark on SUPRENUM - Getov (1992)
8: A High-Performance Fast Fourier Transform Algorithm for the .. - Bailey (1987)
7: Performance Analysis of Wormhole Routed k-ary n-trees - Petrini, Vanneschi (1998)
6: A Necessary and Sufficient Condition for Deadlock-Free Adaptive.. - Duato (1994)
4: Fast Fourier Transform - For Fun and Profit - Gentleman, Sande (1966)
3: Towards an Architecture Independent Analysis of Parallel Alg.. - Papadimitriou, Yannakakis (1988)
3: Physical Architectures: Interconnection and Communication - Petrini, Vanneschi (1995)
3: A generalized FFT algorithm on transputers - Roebbers, Welch et al. (1991)
3: Deadlock-Free Adaptive Wormhole Routing with Disha Concurren.. - Anjan, Timothy (1995)
2: Digital Equipment Corporation - Aspnes, Herlihy et al. (1993)
2: Load Balanced FFT Implementation on the Intel iPSC - Chu (1987)
Documents on the same site (http://www.c3.lanl.gov/cic19/teams/par_arch/Publications.html):
The Performance Realities Of Massively Parallel.. - Lubeck, Simmons.. (1992)
Performance Evaluation of the SGI Origin2000: A.. - Wasserman, Lubeck..
Benchmark Tests on a Silicon Graphics R8000-Based Workstation - Wasserman