T.-s. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proceedings of the 9th International Parallel Processing Symposium, pages 106--112, Santa Barbara, CA, April 1995.
....significant in our case, nearly optimal parallel speedup for this class of problems. 3. An example of experimental performance analysis for a nontrivial parallel implementation. 2 Related Work Several groups have conducted experimental studies of graph algorithms on parallel architectures [19, 20, 26, 39, 40, 44]. Their approach to producing a parallel program is similar to ours (especially that of Ramachandran et al. [17]), but their test platforms have not provided them with a true, scalable, UMA shared memory environment or have relied on ad hoc hardware [26]. Thus ours is the first study of speedup ....
T.-s. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proceedings of the 9th International Parallel Processing Symposium, pages 106--112, Santa Barbara, CA, April 1995.
....un on the Coarse-Grained Multicomputer (CGM) parallel machine model with p processors in O(log p) communication rounds with O local computation per round. 2 Related Experimental Work Several groups have conducted experimental studies of graph algorithms on parallel architectures (for example, [11, 12, 14, 18, 20, 9]). However, none of these related works use test platforms that provide a true, scalable, UMA shared memory environment, and still other studies have relied on ad hoc hardware [14]. Thus ours is the first study of speedup for over tens of processors (and promise to scale over a significant r ....
T.-S. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proc. 9th Int'l Parallel Processing Symp., pages 106--112, Santa Barbara, CA, April 1995.
....on the Coarse-Grained Multicomputer (CGM) parallel machine model with p processors in O(log p) communication rounds with O local computation per round. 2 Related Experimental Work Several groups have conducted experimental studies of graph algorithms on parallel architectures (for example, [11, 12, 14, 18, 20, 9]). However, none of these related works use test platforms that provide a true, scalable, UMA shared memory environment, and still other studies have relied on ad hoc hardware [14]. Thus ours is the first study of speedup for over tens of processors (and promise to scale over a significant range of ....
T.-S. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proc. 9th Int'l Parallel Processing Symp., pages 106--112, Santa Barbara, CA, April 1995.
....in our case, nearly optimal parallel speedup for this class of problems. 3. An example of experimental performance analysis for a nontrivial parallel implementation. 2 Related Work Several groups have conducted experimental studies of graph algorithms on parallel architectures [19, 20, 26, 39, 40, 44]. Their approach to producing a parallel program is similar to ours (especially that of Ramachandran et al. [17]), but their test platforms have not provided them with a true, scalable, UMA shared memory environment or have relied on ad hoc hardware [26]. Thus ours is the first study of speedup over a ....
T.-S. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proceedings of the 9th International Parallel Processing Symposium, pages 106--112, Santa Barbara, CA, April 1995.
.... and Motivations Although a significant number of parallel machines have been built and many parallel graph algorithms have been written, only a few implementations of those algorithms have been carried out on existing parallel platforms [HRD, KLCY94, KGP94, LKC95, RM94, HRD95]. Being interested in graph problems, we would like not only to design parallel graph algorithms that process very large data as fast as possible, but also to have the implementations of these algorithms on parallel machines be as efficient as in theory, and this for a vast scale of ....
....computation time, the ability to manipulate data larger than those treated by the corresponding sequential algorithms. Our goal is to create a portable and efficient library of parallel graph algorithms. The only existing libraries for parallel graph algorithms are specific to a certain machine [HRD95]. In parallel, we try to validate by experiment the power of BSP-like models for treating irregular structures like graphs. We consider here two models for parallel algorithms: the PRAM model and the CGM (BSP-like) model. The PRAM model (very classical) [Jaj92] is a high-level one in which ....
T.-S. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proc. 9th International Parallel Processing Symposium, pages 106--112, 1995.
....machines and require data layout directives to perform efficiently. Exact synchronicity is not supported as it is not available on the target architectures considered. Other PRAM oriented parallel libraries A large number of PRAM inspired algorithms have already been implemented in NESL [2]. [7] reports on concrete implementations on a MasPar for many of the same algorithms as those in PAD. ....
T.-S. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In 9th International Parallel Processing Symposium, pp. 106--112, 1995.
....where PAD offers a broader selection of more traditional data structures. NESL is targeted towards existing platforms, where PAD presupposes a (virtual) PRAM, and can probably in this context be more efficient. Concrete implementations of many PRAM graph algorithms on the MasPar were discussed in [8]. 4 Status and future work A compiler for Fork95 together with system software and a simulator for the SB PRAM is already available, while a first version of PAD will be released in the summer of 1996. The next phase of PAD will extend the basic library in order to implement more advanced graph ....
T.-S. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In 9th International Parallel Processing Symposium (IPPS), pages 106--112, 1995.
....parallel serial connected components algorithm for distributed memory machines. Their study showed that for certain classes of probabilistic meshes their algorithm performs reasonably well. But its performance was quite poor on other meshes and is likely to be worse on arbitrary graphs. Hsu et al. [17, 19, 18] built a library of pointer based algorithms, including connected components, open ear decomposition, list ranking etc. Greiner [13] compared several parallel connected components algorithms; Narayannan [27] implemented a single source shortest path algorithm; and Hillis and Steele [15] ....
T.-s. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proceedings of the International Parallel Processing Symposium, pages 106--112, Santa Barbara CA, Apr. 1995.
....between the graph manipulation package NETPAD and our parallel programs. We have also written sequential programs for solving the same graph problems and studied the relative speedup of the parallel programs over the sequential ones. The performance presented in this paper is further analyzed in [4] where least squares fit curves are obtained for each set of data. We note a few observations. • PRAM based graph algorithms can be implemented efficiently and easily. The PRAM model has proven to be a very good theoretical model for designing parallel algorithms. By developing a general ....
....Because the MPL programming language requires explicit allocation of PEs, we need to modify our code to handle this. We also need to implement optimal PRAM algorithms to obtain the best speedup results when using virtual processors. These issues are discussed in [4]. • Further extension of the parallel graph algorithm library. Since we have implemented most of the commonly used routines for implementing PRAM undirected graph algorithms, we expect that it will be fairly easy to implement other graph algorithms; for example, the routines for finding an ....
T.-s. Hsu, V. Ramachandran, and N. Dean, Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing, Proc. 9th International Parallel Processing Symp., 1995, pp. 106--112.
....Section 13.2 gives a high level description of our implementation. Section 13.3 describes the implementation details of our parallel graph algorithms library. Section 13.4 gives performance analysis. Finally, Section 13.5 gives the conclusion. The work presented in this chapter also appears in [HRD93]. 13.2 High Level Description of Our Implementation In our earlier implementation of parallel graph algorithms without virtual processing described in Chapter 11, we first provided a general mapping between the architecture of the MasPar and the schematic structure of the PRAM model. This ....
T.-s. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. Technical Report TR-93-14, Dept. of Computer Sciences, Univ. of Texas at Austin, 1993.
....between the graph manipulation package NETPAD and our parallel programs. We have also written sequential programs for solving the same graph problems and studied the relative speedup of the parallel programs over the sequential ones. The performance presented in this paper is further analyzed in [4] where least squares fit curves are obtained for each set of data. We note a few observations. • PRAM based graph algorithms can be implemented efficiently and easily. The PRAM model has proven to be a very good theoretical model for designing parallel algorithms. By developing a general ....
....programming language requires explicit allocation of PEs, we need to modify our code to handle this. We also need to implement optimal PRAM algorithms to obtain the best speedup results when using virtual processors. These issues are discussed in [4]. • Further extension of the parallel graph algorithms library. Since we have implemented most of the commonly used routines for implementing PRAM undirected graph algorithms, we expect that it will be fairly easy to implement other graph algorithms; for example, the routines for finding an ....
T.-s. Hsu, V. Ramachandran, and N. Dean, Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing, Tech. Report TR-93-14, Dept. of Computer Sciences, Univ. of Texas at Austin, 1993.
....the code in Program 3.(a), b contains the summation of all local values of a's that are positive. The variable c contains the absolute value of the summation of all local values of a's that were not positive. The rewritten code with virtual processing is shown in Program 3.(b). kernel given in [19,21]. We classify the parallel primitives into three different categories according to the method we used to implement them with virtual processing. They will be discussed in detail in the following subsections. Every routine in each category was fine tuned for speed and small memory space usage. For ....
....[42] SunOS 4.1.3). We used the quick sort routine provided by the system library for sorting. The performance data for the sequential algorithms was obtained by running our programs on a SPARC 10 41 machine with 32 megabytes of main memory and about 80 megabytes of swapping space. As indicated in [21], the SPARC 10 41 is at least 230 times faster than a single MP-1 PE. Since the MP-1 that we used has 16,384 PEs, the raw computational power of the MP-1 was at least 63 times larger than a SPARC 10 41. [Figure: scanwithAdd32 (SPARC 10 41); x-axis: input size (in units of 10000)] user ....
[Article contains additional citation context not shown here]
T.-s. Hsu, V. Ramachandran, and N. Dean. Implementation of parallel graph algorithms on a massively parallel SIMD computer with virtual processing. In Proc. 9th International Parallel Processing Symp., pages 106--112, 1995.