Latency and Bandwidth Requirements of Massively Parallel Programs: FFT as a Case Study (1999)  (5 citations)
Fabrizio Petrini, Marco Vanneschi
Euro-Par, Vol. I



View or download:
lanl.gov/cic19/teams/par_arch...fgcs.ps
lanl.gov/~fabrizio/papers/fgcs.ps.gz

From:  lanl.gov/cic19/tea...Publications

Abstract: Many theoretical models of parallel computation are based on overly simplistic assumptions on the performance of the interconnection network. For example they assume constant latency for any communication pattern or infinite bandwidth. This paper presents a case study based on the FFT transpose algorithm, which is mapped on two families of scalable interconnection networks, the k-ary n-trees and the k-ary n-cubes. We analyze in depth the network behavior of a minimal adaptive algorithm for the...

Context of citations to this paper:

...machine with 256 processors. These processors are connected with an indirect interconnection network using state-of-the-art routers [30]. Based on these figures, there is obviously an uneven and inefficient use of system resources. During the two computational phases of the...

...thesis together with some related work, not shown here for brevity, either appeared in the literature or are still under review [118] [120] [121] [123] [122] [124] [117] [130] [129] [127] [128] [119] [125] [126] [47] 1.6 Thesis Overview The thesis is organized as follows....

Cited by:
Minimal Adaptive Routing with Limited Injection on Toroidal .. - Petrini, Vanneschi (1996)
A New Approach to Parallel Program Development and.. - Petrini, Bassetti.. (1999)
Communication Performance of Wormhole Interconnection Networks - Petrini (1997)

Similar documents (at the sentence level):
54.4%:   Latency and Bandwidth Requirements of Massively Parallel.. - Petrini, Vanneschi (1999)
10.0%:   A Comparison of Wormhole-Routed Interconnection Networks - Petrini, Vanneschi (1997)

Active bibliography (related documents):
0.4:   Efficient Total-Exchange in Wormhole-Routed Toroidal Cubes - Petrini (1999)
0.4:   Parallel 1D-FFT Computation on Constant-valence Multicomputers - Mazzeo, Villano (1995)
0.3:   Towards a Generic Analytical Model of Wormhole Routing Networks - Lysne (1998)

Similar documents based on text:
0.8:   Minimal vs. non Minimal Adaptive Routing on k-ary n-cubes - Petrini, Vanneschi
0.3:   Performance Analysis of Wormhole Routed k-ary n-trees - Petrini, Vanneschi (1998)
0.2:   Network Performance under Physical Constraints - Petrini, Vanneschi (1997)

Related documents from co-citation:
5:   SMART: a Simulator of Massive ARchitectures and Topologies - Petrini, Vanneschi - 1997
5:   k-ary n-trees: High Performance Networks for Massively Parallel Architectures - Petrini, Vanneschi - 1995
5:   A Necessary and Sufficient Condition for Deadlock Free Adaptive Routing in Wormh.. - - 1995

BibTeX entry:

Fabrizio Petrini and Marco Vanneschi. Latency and Bandwidth Requirements of Massively Parallel Programs: FFT as a Case Study. Future Generation Computer Systems, 1999. Accepted for publication. http://citeseer.ist.psu.edu/article/petrini99latency.html

@inproceedings{ petrini96latency,
    author = "Fabrizio Petrini and Marco Vanneschi",
    title = "Latency and Bandwidth Requirements of Massively Parallel Programs: {FFT} as a Case Study",
    booktitle = "Euro-Par, Vol. I",
    pages = "307-312",
    year = "1996",
    url = "citeseer.ist.psu.edu/article/petrini99latency.html" }
Citations (may not include all citations):
981   Introduction to Parallel Algorithms and Architectures: Array.. - Leighton - 1992
531   LogP: Towards a Realistic Model of Parallel Computation - Culler, Karp et al. - 1993
462   Deadlock-Free Message Routing in Multiprocessor Interconnect.. - Dally, Seitz - 1987
218   Parallelism in Random Access Machines - Fortune, Wyllie - 1978
171   Advanced Computer Architecture: Parallelism - Hwang - 1993
130   LogGP: Incorporating Long Messages into the LogP Model - One.. - Alexandrov, Ionescu et al. - 1995
129   Performance Analysis of k-ary n-cube Interconnection Networks - Dally - 1990
93   Virtual-Channel Flow Control, IEEE Transactions on Parallel and Distributed Systems - Dally - 1992
81   A Bridging Model for Parallel Computation - Valiant - 1990
79   On Communication Latency in PRAM Computation - Aggarwal, Chandra et al. - 1989
79   Communication Complexity of PRAMs - Aggarwal, Chandra et al. - 1990
65   Optimal Broadcast and Summation in the LogP Model - Karp, Sahay et al. - 1992
62   Designing Broadcasting Algorithms in the Postal Model for Me.. - Bar-Noy, Kipnis - 1992
61   Where is Time Spent in Message-Passing and Shared-Memory Pro.. - Chandra, Larus et al. - 1994
61   FFTs in External or Hierarchical Memory - Bailey - 1990
55   Multiprocessor FFTs - Swarztrauber - 1987
38   Assessing Fast Network Interfaces - Culler, Liu et al.
35   Chaotic Routing: Design and Implementation of an Adaptive Mu.. - Bolding - 1993
32   IEEE Transactions on Parallel and Distributed Systems - Gupta, Kumar et al. - 1993
27   SMART: a Simulator of Massive ARchitectures and Topologies - Petrini, Vanneschi - 1997
23   The Message-Driven Processor - Dally - 1992
22   Performance in Parallel Simulation of Interconnection Networ.. - Burger, Wood - 1995
16   A High-Performance FFT Algorithm for Vector Supercomputers - Bailey - 1988
16   FFT Algorithms for Vector Computers - Swarztrauber - 1984
13   Congestion-Free Routing on the CM-5 Data Router - Heller - 1994
11   k-ary n-trees: High Performance Networks for Massively Parallel Arch.. - Petrini, Vanneschi - 1997
11   Hiding Communication Costs in Bandwidth-Limited Parallel FFT.. - Sahay - 1992
8   Dimensional Parallel FFT Benchmark on SUPRENUM - Getov - 1992
8   A High-Performance Fast Fourier Transform Algorithm for the .. - Bailey - 1987
7   Performance Analysis of Wormhole Routed k-ary n-trees - Petrini, Vanneschi - 1998
6   A Necessary and Sufficient Condition for Deadlock-Free Adaptive.. - Duato - 1994
4   Fast Fourier Transform - For Fun and Profit - Gentleman, Sande - 1966
3   Towards an Architecture Independent Analysis of Parallel Alg.. - Papadimitriou, Yannakakis - 1988
3   Physical Architectures: Interconnection and Communication - Petrini, Vanneschi - 1995
3   A generalized FFT algorithm on transputers - Roebbers, Welch et al. - 1991
3   Deadlock-Free Adaptive Wormhole Routing with Disha Concurren.. - Anjan, Timothy - 1995
2   Digital Equipment Corporation - Aspnes, Herlihy et al. - 1993
2   Load Balanced FFT Implementation on the Intel iPSC - Chu - 1987



Documents on the same site (http://www.c3.lanl.gov/cic19/teams/par_arch/Publications.html):
The Performance Realities Of Massively Parallel.. - Lubeck, Simmons.. (1992)
Performance Evaluation of the SGI Origin2000: A.. - Wasserman, Lubeck..
Benchmark Tests on a Silicon Graphics R8000-Based Workstation - Wasserman

CiteSeer.IST - Copyright Penn State and NEC