E. J.-L. Lu and D. I. Okunbor. Massively parallel fast multipole algorithm in three dimensions. In the proceedings of Fifth IEEE International Symposium on High Performance Distributed Computing, August 1996. (to appear).


This paper is cited in the following contexts:
Parallel Implementation of 3D FMA using MPI - Lu, Okunbor   Self-citation (Lu Okunbor)

.... MPIFMA. Since only primitive communication functions such as point-to-point communication and broadcast are implemented in the parallel FMA library, it is easy to port MPIFMA to other communication libraries. For efficiency, we implemented the optimum communication scheme proposed in [13]. The advantages of the optimum communication scheme are that (1) a minimum number of messages is communicated among processors, and (2) computation and communication are overlapped as much as possible. Our preliminary results are remarkable. The force calculation of one million particles took ....

....to processor P_i and vice versa as long as P_i ≠ P_j. Therefore, for each box i, one can build up a list of processors to which box i's MEs will be sent. Since only the required MEs are transmitted, redundancies are eliminated, bringing about minimal transmission. The details can be found in [13]. Experiments were conducted on an Intel iPSC860 system with 1, 2, 4, 8, and 16 nodes. The benchmark system is a 3D particle system with particle positions O randomly generated between -0.5 and 0.5. The size of the particle Total Near P Run ME2LE Field Comm. Eff. Time Force Cost 1 93.291 ....



An Efficient Load Balancing Technique for Parallel FMA in.. - Lu, Okunbor (1996)   Self-citation (Lu Okunbor)

....to parallelization of fast multipole algorithm. Greengard and Gropp [11] presented the parallel version of FMA in two dimensions (2D). Board et al. [14] have done a lot of work in the parallelization of FMA in 3D. Lu and Okunbor developed an efficient massively parallel FMA in three dimensions [15]. All parallel implementations perform well if the particles are distributed uniformly. However, the performance of these parallel algorithms degrades significantly when the particles are not distributed uniformly, due to load imbalance. Computer Science Department, University of Missouri ....

....0.90.

Processors   Non Adaptive   Factor = 0.90
2            1.961          1.945
4            1.959          3.042
8            1.957          5.411
16           3.065          9.596

Table 1: The speedups of domain decomposition scheme and weighted subtrees scheme with factor = 0.90.

We implement the weighted subtrees technique on top of our parallel fast multipole algorithm [15]. The benchmark system is a 3D particle system with particle positions O randomly generated between -0.5 and 0.5 with the restriction that all coordinates for a particle are either positive (0 O 0.5) or negative (-0.5 O 0). All experiments were conducted on an Intel iPSC 860 system ....
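The benchmark described in this excerpt confines every particle to one of two opposite octants of the cube, which is what makes the distribution non-uniform. A minimal sketch of such a generator follows. It assumes uniform sampling within each octant (the excerpt only says "randomly generated"), and the function name `make_benchmark` and its `seed` parameter are hypothetical, not taken from the cited papers.

```python
import random


def make_benchmark(n, seed=0):
    """Return n particles as (x, y, z) tuples.

    Each particle has all three coordinates positive (between 0 and 0.5)
    or all three negative (between -0.5 and 0), as in the excerpt above.
    Uniform sampling is an assumption, not something the source states.
    """
    rng = random.Random(seed)
    particles = []
    for _ in range(n):
        # One sign is drawn per particle and shared by x, y, and z.
        sign = rng.choice((-1.0, 1.0))
        particles.append(tuple(sign * rng.uniform(0.0, 0.5) for _ in range(3)))
    return particles
```

With this restriction the particles occupy only two of the eight octants, so a decomposition that assigns equal volumes of space to each processor would leave most processors with little or no work.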

[Article contains additional citation context not shown here]

E. J.-L. Lu and D. I. Okunbor, A massively parallel fast multipole algorithm in three dimensions, in the proceedings of Fifth IEEE International Symposium on High Performance Distributed Computing, August 1996, pp. 40--48.

