by David R. Surma, Edwin H.-M. Sha, Peter M. Kogge
Download: ftp://ftp.cse.nd.edu/pub/Reports/1998/tr-98-11.ps.gz
Abstract:
This paper presents a novel approach to reducing the communication costs incurred when performing multiple multicasts on wormhole-routed k-ary n-cube multiprocessor systems. Both unicast-based and path-based implementations of multicasting incur communication costs due to the inherent message passing and contention for network resources. When the data volume is small, the start-up time dominates the transmission time. When multiple multicasts carry a very large data volume, however, the delays caused by message blocking and resource contention become significant. To address this, we present a hybrid static-dynamic technique that focuses on the ordering and routing of the individual message transmissions. At compile time, each message is assigned a priority using the recently developed collision graph model. At run time, these priorities are used to arbitrate the message transmissions. Dimension-ordered routing serves as the base routing scheme, and some messages are re-routed to further reduce the communication costs. The technique is useful either as a stand-alone algorithm or as a procedure embedded in existing algorithms. For a single multicast, our work performs as well as conventional methods. For multiple multicasts, results show that our approach provides significant improvement over baseline techniques.
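The abstract names the ingredients but not their details, so the following is only an illustrative sketch and not the authors' algorithm. It assumes a k-ary n-cube with wraparound links, routes each message dimension by dimension, links two messages in a collision graph whenever their routes share a directed link, and then ranks messages with a hypothetical heuristic (most conflicts first). The function names and the priority rule are assumptions made for illustration. The sketch omits the paper's re-routing step.

```python
from itertools import combinations


def dimension_ordered_route(src, dst, k):
    """Directed links used from src to dst in a k-ary n-cube.

    Dimensions are corrected one at a time, lowest index first, taking
    the shorter way around each ring (ties go in the +1 direction).
    """
    links = []
    cur = list(src)
    for d in range(len(src)):
        while cur[d] != dst[d]:
            forward = (dst[d] - cur[d]) % k
            step = 1 if forward <= k - forward else -1
            nxt = cur.copy()
            nxt[d] = (cur[d] + step) % k
            links.append((tuple(cur), tuple(nxt)))
            cur = nxt
    return links


def collision_graph(routes):
    """Map each message id to the set of messages sharing a link with it."""
    graph = {m: set() for m in routes}
    for a, b in combinations(routes, 2):
        if set(routes[a]) & set(routes[b]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def assign_priorities(graph):
    """Hypothetical rule: more conflicts means higher priority (lower rank)."""
    order = sorted(graph, key=lambda m: (-len(graph[m]), m))
    return {m: rank for rank, m in enumerate(order)}


# Three unicast legs on a 4-ary 2-cube; A and B contend for link (1,0)->(2,0).
messages = {"A": ((0, 0), (2, 0)), "B": ((1, 0), (2, 1)), "C": ((3, 3), (3, 1))}
routes = {m: dimension_ordered_route(s, t, 4) for m, (s, t) in messages.items()}
priorities = assign_priorities(collision_graph(routes))
```

In a scheme like the one the abstract describes, such priorities would be computed ahead of time and consulted by the routers at run time when two messages request the same channel.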