E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, N. Santoro, and S. W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. International Colloquium Algorithms, Languages and Programming, pages 390--400. LNCS 1256, Springer-Verlag, Berlin, 1997.
....is customary to express speedup as a function of n. Thus the speedup obtained using an NC algorithm is sometimes referred to as exponential. In a coarse grained setting, i.e. the case where n and p are orders of magnitude apart, speedup is expressed as a function of only p and some recent results [4, 7, 9, 15] show that this approach is practically relevant. Second, an NC algorithm is not necessarily work optimal, and thus not resource optimal considering runtime and memory space as resources that one wants to use efficiently. Third, even if we restrict ourselves to work optimal NC algorithms and ....
....is coined LogP, an acronym for the four parameters involved. A common feature of the BSP, LogP, and other related models is their lack of simplicity: each model involves relatively many parameters, making analysis and design of algorithms cumbersome. The Coarse Grained Multicomputer (CGM) model [4, 7] was later proposed in an effort to retain the advantages of BSP while keeping the model simple (reducing the number of parameters). The BSP and its special case CGM have been the primary inspirations for our model. Thus, we believe that many optimal CGM and BSP algorithms can easily be ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In The 24th International Colloquium on Automata Languages and Programming, volume 1256 of LNCS, pages 390--400. Springer Verlag, 1997.
....O(n/log n) processors (with O(n) work) EREW PRAM algorithms. Kosaraju and Delcher [15] developed a simplified version of the algorithm in [8] which runs within the same time and processor bounds (O(log n) time and O(n/log n) processors, with O(n) work, on the EREW PRAM). Recently, several researchers [3, 7] have presented the theoretical observation that this classical PRAM algorithm for tree contraction on a tree T with n vertices can run on the Coarse Grained Multicomputer (CGM) parallel machine model with p processors in O(log p) communication rounds with O local computation per round. 2 Related ....
F. Dehne, A. Ferreira, E. Caceres, S. W. Song, and A. Roncato. Efficient parallel graph algorithms for coarse-grained multicomputers and BSP. Algorithmica, 33:183--200, 2002.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. 24th Int'l Colloquium on Automata, Languages and Programming (ICALP'97), volume 1256 of Lecture Notes in Computer Science, pages 390--400, Bologna, Italy, 1997. Springer-Verlag.
....global information required to be in the processor's local memory (measured in terms of the range of values for the ratio ) for which the algorithm is efficient and applicable. The CGM has proved useful for solving many discrete problems in several areas, such as image processing [11, 12], graphs [2, 7], computational geometry [8, 9], and others [10]. In this paper we show that these theoretical, BSP-like models are well adapted to HPCS. In particular, they yield algorithms that are portable and whose theoretical and practical performance are closely related. Furthermore, they allow a reduction ....
.... search tree construction, in interval graphs [7] and knapsack [10]. On the other hand, graph problems seem to be somewhat more complex, and the best existing algorithms for connected components, list ranking, Euler tour, ear decomposition and biconnectivity require O(log p) communication rounds [2]. The algorithm for computing the maximum weighted clique in interval graphs, presented below, runs in time O(T(n,p)), a considerable improvement over simulation [19]. In addition, it is easy to implement (since all communications are performed by calls to a standard highly optimized ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In P. Degano, R. Gorrieri, and A. Marchetti-Spaccamela, editors, Proceedings of ICALP'97, volume 1256 of Lecture Notes in Computer Science, pages 390-400. Springer-Verlag, 1997.
....that are irregularly structured, the communication efficiency, which is vital to getting satisfactory parallel performance, becomes much harder to achieve. In particular, it is generally acknowledged that graph problems have considerably less regular structures than many other problems studied [23, 24]. An irregular structure results in highly data dependent communication patterns and makes it difficult to achieve communication efficiency. However, a vast number of interesting problems in many fields are defined in terms of graphs. Thus practical and efficient parallel algorithms for ....
....of messages received by a processor in a superstep is required to be at most h = O(n/p). CGM Model. Another general purpose model, the Coarse Grained Multicomputer (CGM) proposed by Dehne [30], is essentially similar to the weak CREW BSP. Work related to the CGM model has been reported in [8, 23, 30, 31, 32, 21]. A CGM computer consists of p processors P_1, ..., P_p, where each processor has O(n/p) local memory. The processors can be connected through any communication medium, i.e. any interconnection network or shared memory. Typically, the local memory is considerably larger than O(1). This ....
CACERES, E., DEHNE, F., FERREIRA, A., FLOCCHINI, P., RIEPING, I., RONCATO, A., SANTORO, N., AND SONG, S. W. Efficient Parallel Graph Algorithms for Coarse Grained Multicomputers and BSP. In Proc. 24th International Colloquium on Automata, Languages and Programming (1997), pp. 390--400.
....size of the graph by a constant factor in every phase. Then, if it has been reduced to N/log N, pointer jumping is applied. Numerous variants of this idea have been developed. More references are given in [11]. A variant of [6] and [2] tuned towards the requirements of the BSP model is given in [5]. Earlier Practical Results. Several recent papers report on implementations of list ranking algorithms on parallel computers. Experiences with algorithms based on the independent set removal idea are described in [10] (for the MasPar) and [22] (for the Paragon). Asymptotically these algorithms ....
Caceres, E., F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, S.W. Song, `Efficient Parallel Graph Algorithms for Coarse Grained Multicomputers and BSP,' Proc. 24th International Colloquium on Automata Languages and Programming, LNCS 1256, pp. 390--400, Springer-Verlag, 1997.
....if the bsp parameter g is close to unity, the resulting bsp algorithms would be optimal. Unfortunately, this condition is not usually met in practice and the performance of the resulting algorithms is often disappointing. More recently, pram simulations have been revived in the form of clipping [9]. Clipping involves simulating a pram algorithm for O(log p) rounds, stopping the algorithm (clipping it), and completing the computation with a specialized CGP algorithm. In doing pram simulations on a cgm, each processor simulates n/p erew pram processors and stores n/p data elements. ....
....of the last list element which is assigned a rank of 0. Each element then gets assigned the pointer of its successor, and adds to its rank the rank of its successor. After repeating this procedure log n times, each element has the last list element as its successor and is correctly ranked. In [9], Cáceres et al. describe a cgm algorithm for the list ranking problem. The algorithm begins by finding a p^2-ruling set of size O(n/p). This is a subset of the original list elements such that no two consecutive elements are further than distance p^2 apart. At the same time, the ....
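The pointer-jumping procedure described in this excerpt can be illustrated with a minimal sequential sketch. It is not the cgm algorithm of [9]; it only simulates the synchronous PRAM-style rounds on one machine. The function name `list_rank` and the convention that the last element points to itself are assumptions made for illustration.

```python
def list_rank(succ):
    """Rank every list element by its distance to the last element.

    succ[i] is the index of the successor of element i; the last
    element is assumed (for this sketch) to point to itself.
    """
    n = len(succ)
    succ = list(succ)
    # The last element starts with rank 0, every other element with rank 1.
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    # ceil(log2 n) rounds are enough: each round doubles the pointer span.
    rounds = (n - 1).bit_length()
    for _ in range(rounds):
        # Synchronous round: read the old arrays, write fresh ones.
        new_rank = [rank[i] + rank[succ[i]] for i in range(n)]
        new_succ = [succ[succ[i]] for i in range(n)]
        rank, succ = new_rank, new_succ
    return rank


# The list 3 -> 0 -> 2 -> 1, where element 1 is the last element.
print(list_rank([2, 1, 1, 0]))
```

Building new arrays in every round mirrors the synchronous PRAM step, in which all elements read the old pointers before any of them is overwritten.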
E. Cáceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proceedings of International conference on Automata, Languages, and Programming, 1997.
....8. NEW EM ALGORITHMS 119 Problem PDM I O Complexity Complexity EM BSP I O Description (see notes ab ) in BSP Model c Complexity d Group C: Graph Algorithms 1. List ranking O( N B logM B N B ) 25] O( N v ) O(log v) O( N log v pBD ) 2. Euler tour of tree M = O( N v ) [20] 3. Lowest common ancestor 4. Tree contraction 5. Expression tree evaluation 6. Connected components e O( E V V DB logM B V B Delta maxf1; log log V BD E g) 65] O( V E v ) O(log v) M = O( V E v ) 20] O( V E) log v pBD ) 7. Spanning forest 8. Ear and ....
....O( N log v pBD ) 2. Euler tour of tree M = O( N v ) 20] 3. Lowest common ancestor 4. Tree contraction 5. Expression tree evaluation 6. Connected components e O( E V V DB logM B V B Delta maxf1; log log V BD E g) 65] O( V E v ) O(log v) M = O( V E v ) [20] O( V E) log v pBD ) 7. Spanning forest 8. Ear and open ear decomposition 9. Biconnected components Figure 8.4: Overview of New EM Algorithms (continued) a PDM I O complexities as listed apply to general values for N ,M ,D, and B. If the constraints required by our techniques are ....
E. Cáceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, N. Santoro, and S. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. Int. Colloquium Algorithms, Languages and Programming, LNCS 1256, pages 390--400, 1997.
....order to obtain reasonably efficient programs on the asynchronous APRAM model. This is due to the approach on which such algorithms are based. It would be interesting to carry out a similar investigation for the approach of the algorithm we propose. Algorithms for CCug on other kinds of machines [30, 19, 8, 3] have also been designed. All of them exhibit a computational structure which exploits various, specific, machine abilities. Among those, the algorithm in [19] is very competitive in time, since it is O(1) but it is expensive in hardware size, since it requires the allocation of v Theta v ....
Caceres E., Dehne F., Ferreira A., Flocchini P., Rieping I., Roncato A., Santoro N., Song S.W., Efficient Parallel Graph Algorithms for Coarse Grained Multicomputers. 24th Inter. Coll. ICALP'97, Automata Languages and Programs, LNCS 1256, Degano, Gorrieri, Marchetti-Spaccamela Eds, 1997.
....to obtain an algorithm for which the number of phases is independent of the input size n as n becomes large. All of the algorithms we present have this feature. Related work on minimizing the number of phases (or supersteps) using the notion of rounds is reported in [11] for sorting and in [4] for graph problems. Several lower bounds for the number of rounds needed for basic problems on the QSM and BSP are presented in [18]. A round is a phase or superstep that performs linear work (O(gn/p) time on s-QSM, and O(gn/p + L) time on BSP). Any linear work algorithm must compute in ....
....measure for lower bounds on the number of phases (or supersteps) needed for a given problem. On the other hand, a computation that proceeds in rounds need not lead to a linear work algorithm if the number of rounds in the algorithm is non constant. In fact, all of the algorithms presented in [4] perform superlinear work. The algorithm in [11] performs superlinear communication when the number of processors is large. In contrast to the cost metric that uses the notion of rounds, in this paper we seek algorithms that perform optimal work and communication and additionally compute in a ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. ICALP, LNCS 1256, pp. 390-400, 1997.
....the same as for exact exponential distributions, with high probability. Now we are in a position to explain the BSP implementation in more detail and to estimate the corresponding cost. Our BSP implementation use the BSP sorting algorithm of Goodrich [6] and the BSP algorithm of Caceres et al. [3] for computing the connected components of a graph Lemma 3 Consider a phase k, 0 k log p. 1. The cost for generating the list L of the edges of G k is W = O i (n k ) 2 p k log (n k ) 2 j ; C = O (n k ) 2 p k log (n k ) 2 log (n k ) 2 p k and L = O log n 2 k log (n ....
....i (n k ) 2 p k log(n k ) 2 j , C = O (n k ) 2 p k log(n k ) 2 log (n k ) 2 p k and L = O log(n k ) 2 log (n k ) 2 p k . 2. For computing the connected components of the implementation of the procedure Compact we make use of the BSP algorithm of Caceres et al. [3]. For the computation time of Compact it holds that W( n k ) 2 ; p k ) O (n k ) 2 p k log p k W( n k ) 2 =2; p k ) where the first part of the right hand side essentially expresses the computation time of the BSP algorithm of Caceres et al. 3] for computing the connected ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, S. W. Song, Efficient parallel graph algorithms for coarse grained multicomputers and BSP, ICALP 97, Bologna, Italy, 1997.
....expression represents the computation time, while the second two represent the communication time. 2.3 The Coarse Grained Multicomputer Model. The coarse grained multicomputer (cgm) model was introduced by Dehne et al. [21] and a number of algorithms have been defined with respect to this model [9, 10, 20, 21, 22, 23]. A coarse grained multicomputer, cgm(m, p), consists of p identical processors, labelled P_0, ..., P_{p-1}, each with Θ(m/p) local random access memory. These processors are interconnected by a communication network capable of routing an h-relation with h = O(m/p). The ....
....capable of routing an h-relation with h = O(m/p). The performance of a cgm algorithm is measured in terms of the amount of local computation performed and the number of supersteps. Both of these quantities can be functions of n and p. The following observation (a version of which appears in [10]) relates cgm algorithms to bsp algorithms: Observation 1. Any cgm(m, p) algorithm which uses O(λ) supersteps and O(T(n, p)) computation time is also a bsp algorithm with running time O(T(n, p) + λ(g m/p + L)). In the cgm model, algorithms are classified based on λ, the number of supersteps in the ....
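As a rough illustration of Observation 1, the sketch below evaluates a BSP cost estimate for a cgm algorithm, ignoring the constants hidden by the O-notation. It assumes the usual BSP accounting in which each superstep routes an h-relation with h = m/p at cost g*h + L; the additive form of the bound is a reading of the excerpt, and the function and parameter names are hypothetical.

```python
def bsp_time(local_time, supersteps, m, p, g, L):
    """Estimate the BSP running time of a cgm(m, p) algorithm.

    local_time -- total local computation time T(n, p)
    supersteps -- number of supersteps used by the algorithm
    m, p       -- total memory and number of processors
    g, L       -- BSP bandwidth and synchronisation parameters
    """
    h = m / p  # size of the h-relation routed in each superstep
    return local_time + supersteps * (g * h + L)


# Example: 3 supersteps on 8 processors holding 800 items in total.
print(bsp_time(local_time=1000, supersteps=3, m=800, p=8, g=2, L=50))
```

The estimate makes visible why the number of supersteps matters: every additional superstep adds a full g*m/p + L term regardless of how little local work it performs.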
E. Cáceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proceedings of International conference on Automata, Languages, and Programming, 1997.
....searching, visibility. • String algorithms: string matching, string edit problem. • Scientific computing: FFT, N-body problem, molecular modelling, computational fluid dynamics, computational electromagnetics. For more details, see the above references and the following additional papers [2, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 35]. The success of this work has shown that BSP provides a convenient framework for analysing and comparing the relative computation, communication and synchronisation requirements of different parallel algorithms. Recent work has also shown that it provides an attractive framework in which to ....
CACERES, E., DEHNE, F., FERREIRA, A., FLOCCHINI, P., RIEPING, I., RONCATO, A., SANTORO, N., AND SONG, S. W. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. 24th International Colloquium on Automata, Languages and Programming. LNCS Vol. 1256 (1997), Springer-Verlag, pp. 390--400.
....to obtain an algorithm for which the number of phases is independent of the input size n as n becomes large. All of the algorithms we present have this feature. Related work on minimizing the number of phases (or supersteps) using the notion of rounds is reported in [12] for sorting and in [5] for graph problems. Several lower bounds for the number of rounds needed for basic problems on the QSM and BSP are presented in [19]. A round is a phase or superstep that performs linear work (O(gn/p) time on s-QSM, and O(gn/p + L) time on BSP). Any linear work algorithm must compute in ....
....measure for lower bounds on the number of phases (or supersteps) needed for a given problem. On the other hand, a computation that proceeds in rounds need not lead to a linear work algorithm if the number of rounds in the algorithm is non constant. In fact, all of the algorithms presented in [5] perform superlinear work. The algorithm in [12] performs superlinear communication when the number of processors is large. In contrast to the cost metric that uses the notion of rounds, in this paper we ask for algorithms that perform optimal work and communication and additionally compute in a ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. ICALP, LNCS 1256, pp. 390-400, 1997.
....( b) Delta min(m=p;n) provided that L g and that the BSP model computes with messages consisting of b bits, where b = Omega Gamma213 p) 3 Basic BSP Algorithms. In this section we present the BSP complexity of some basic algorithms which are used as subroutines in our algorithms. Lemma 3. (a) [8] Ranking a list of size n and computing the Euler tour of a tree of size n can be performed in time O((n/p) log p + g (n/p) log p + L log p) on the BSP model. (b) Let s p and let s keys be distributed evenly among the processors. Selecting the k smallest key takes ....
....technique described in Lemma 4. Erase internal edges (edges whose endpoints are the same) and restore the edge ordering by E[i] Src, as explained in Algorithm 3. 2) Contract every connected component as defined by the forest edges into a single vertex by first applying the Euler tour algorithm of [8], after which every segment leader fetches the new node number. 3) Every processor locally computes the MSF of the graph defined by the remaining supervertices and the edges stored in its local memory. 4) Merge the MSFs as in Step (4) of algorithm mstdense. End of Algorithm The time ....
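The contraction described in steps 1 and 2 of this excerpt (merge every component defined by the forest edges into one supervertex, then erase internal edges) can be sketched sequentially. The sketch swaps in a union-find structure for the parallel Euler tour algorithm of [8], so it illustrates only the result of the contraction, not the BSP procedure. The function name `contract` and the (u, v, weight) edge tuples are assumptions made for illustration.

```python
def contract(num_vertices, forest_edges, edges):
    """Contract forest components into supervertices.

    forest_edges -- (u, v) pairs defining the components
    edges        -- (u, v, weight) triples of the graph
    Returns the supervertex number of each vertex and the edges that
    survive contraction (internal edges are erased).
    """
    parent = list(range(num_vertices))

    def find(v):
        # Find the component representative, halving paths on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in forest_edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    # Give the components consecutive new node numbers.
    label = {}
    comp = []
    for v in range(num_vertices):
        root = find(v)
        if root not in label:
            label[root] = len(label)
        comp.append(label[root])

    # Keep only edges whose endpoints lie in different supervertices.
    contracted = [(comp[u], comp[v], w) for u, v, w in edges
                  if comp[u] != comp[v]]
    return comp, contracted


comp, contracted = contract(
    5, [(0, 1), (2, 3)],
    [(0, 1, 5), (1, 2, 3), (3, 4, 7), (2, 3, 1)])
print(comp, contracted)
```

Each processor could then compute a minimum spanning forest on its share of the contracted edges, as in step 3 of the excerpt.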
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Annual International Colloquium on Automata, Languages and Programming (ICALP), 1997.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song, Efficient Parallel Graph Algorithms for Coarse Grained Multicomputers and BSP, Proc. 24th International Colloquium on Automata, Languages and Programming (ICALP '97), Lecture Notes in Computer Science, Vol. 1256, Springer-Verlag, Berlin, 1997, pp. 390--400.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song, "Efficient parallel graph algorithms for coarse grained multicomputers and BSP," in Proc. 24th International Colloquium on Automata, Languages and Programming (ICALP'97), 1997, Springer Verlag Lecture Notes in Computer Science, Vol. 1256, pp. 390--400.
CACERES, E., DEHNE, F., FERREIRA, A., FLOCCHINI, P., RIEPING, I., SANTORO, N., AND SONG, S. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. Int. Colloquium Algorithms, Languages and Programming, LNCS 1256 (1997), pp. 390--400.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, N. Santoro, and S. W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. Internat. Colloquium on Algorithms, Languages and Programming, pages 390--400. LNCS 1256. Springer-Verlag, Berlin, 1997.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song, "Efficient parallel graph algorithms for coarse grained multicomputers and BSP," in Proc. 24th International Colloquium on Automata, Languages and Programming (ICALP'97), Bologna, Italy, 1997, Springer Verlag Lecture Notes in Computer Science, Vol. 1256, pp. 390-400.
Caceres E, Dehne F, Ferreira A, Flocchini P, Rieping I, Roncato A, Santoro N, Song S. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. Proceedings of ICALP'97 (Lecture Notes in Computer Science, vol. 1256). Springer: Berlin, 1997; 131--143.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro and S.W. Song, Efficient Parallel Graph Algorithms For Coarse Grained Multicomputers and BSP, in Proceedings ICALP '97 - 24th International Colloquium on Automata, Languages, and Programming, P. Degano and R. Gorrieri and A. Marchetti-Spaccamela (Eds.), 1997.
....the total local computation time. The CGM model has the advantage of producing results which are closer to the actual performance on commercially available parallel machines. Some algorithms for computational geometry and graph problems require a constant number of, or O(log p), communication rounds [3, 6]. Contrary to a PRAM algorithm, which is often designed for p = O(N^k) with k ∈ N, and with each processor receiving a small number of input data, in this model we consider the more realistic cases where N ≫ p. The CGM model is particularly suitable in current parallel machines in which the ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song, Efficient parallel graph algorithms for coarse grained multicomputers and BSP, in: P. Degano, R. Gorrieri and A. Marchetti-Spaccamela, eds., Proceedings ICALP '97 - 24th International Colloquium on Automata, Languages, and Programming, Lecture Notes in Computer Science, V. 1256 (1997) 390--400.
....L is considerably higher than the bandwidth parameter g, caused by the high startup cost for messages (e.g. Intel Paragon [21]). Therefore, special attention should be paid to the number of supersteps. Several papers have been published describing graph algorithms for the BSP and similar models [7, 4, 15, 1], but very little effort has been put into experimental validation of these algorithms. Caceres et al. note in [7] that graph problems have considerably less internal structure than many other problems studied. This results in highly data-dependent communication patterns and makes it difficult ....
....[21]). Therefore, special attention should be paid to the number of supersteps. Several papers have been published describing graph algorithms for the BSP and similar models [7, 4, 15, 1], but very little effort has been put into experimental validation of these algorithms. Caceres et al. note in [7] that graph problems have considerably less internal structure than many other problems studied. This results in highly data-dependent communication patterns and makes it difficult to achieve communication efficiency. In this paper, we study communication efficient MST algorithms using the ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. of Intern. Colloquium on Automata, Languages, and Programming (ICALP), pages 390 -- 400, 1997.
....separability = O( N log N v ) O(1) M = O( N B ) 19] O( N pDB ) Group C: Graph Algorithms 1. List ranking, Euler tour of tree, Lowest common ancestor, Tree contraction, expression tree evaluation O( N B logM B N B ) 11] O(log v) O( N v ) M = O( N v ) [10] O( N log v pDB ) 2. Connected components, spanning forest, Ear and open ear decomposition, Biconnected components f O(minf V 2 B logM B V B ; log V M Delta E B logM B E B g) 11] O(log v) O( V E v ) M = O( V E v ) 10] O( V E) log v pDB ) ....
....= O( N v ) M = O( N v ) 10] O( N log v pDB ) 2. Connected components, spanning forest, Ear and open ear decomposition, Biconnected components f O(minf V 2 B logM B V B ; log V M Delta E B logM B E B g) 11] O(log v) O( V E v ) M = O( V E v ) [10] O( V E) log v pDB ) Figure 7: Overview of New EM Algorithms in Comparison To Previous Results. a These results are subject to the conditions N = Omega Gamma vDB) N v 2 B v 2 (v Gamma 1) 2, and N v , where 1 is a constant that depends on the problem. For the problems examined ....
Cáceres, E., Dehne, F., Ferreira, A., Flocchini, P., Rieping, I., Santoro, N., and Song, S. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. Int. Colloquium Algorithms, Languages and Programming, LNCS 1256 (1997), pp. 390--400.
CACERES, E., DEHNE, F., FERREIRA, A., FLOCCHINI, P., RIEPING, I., SANTORO, N., AND SONG, S. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. Int. Colloquium Algorithms, Languages and Programming, LNCS 1256 (1997), pp. 390--400.
.... Communities (ESPRIT Long Term Research Project 20244, ALCOM-IT), DFG-SFB 376 "Massive Parallelität" (Germany), and the Région Rhône-Alpes (France). A preliminary version of this paper was published in the proceedings of the 1997 International Colloquium on Automata, Languages and Programming [5]. y Carleton Univ., Ottawa, Canada, frank dehne.net, www.dehne.net z CNRS I3S INRIA, Sophia Antipolis, France, ferreira sophia.inria.fr x Univ. Federal de Mato Grosso do Sul, Campo Grande, Brasil, edson dct.ufms.br Univ. of São Paulo, São Paulo, Brazil, song ime.usp.br k Facolta ....
....results, and our algorithms for Problems 2-7 are the first practically relevant parallel algorithms for these standard graph problems. 6 Acknowledgements. The authors would like to thank P. Flocchini, N. Santoro, and I. Rieping for their helpful discussions on the first version of this paper [5]. ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song, "Efficient parallel graph algorithms for coarse grained multicomputers and BSP," in Proc. 24th International Colloquium on Automata, Languages and Programming (ICALP'97), 1997, Springer Verlag Lecture Notes in Computer Science, Vol. 1256, pp. 390--400.
....constant overhead cost s. The value of s can be fairly large and the total message overhead cost can have a considerable impact on the speedup observed (see e.g. [9]). In this paper, we will use a more practical version of the BSP model, referred to as the Coarse Grained Multicomputer (CGM) model [5, 9]. It is comprised of a set of p processors P_1, ..., P_p with (size of problem)/p local memory per processor and an arbitrary communication network. An algorithm then consists of a sequence of supersteps, each alternating local computation and a global communication round. Each communication ....
....planar computational geometry [9], one of the main challenges we had to face here was how to map the graph onto the processors in order to perform degree computations efficiently. An easy way to get around this problem consists of globally sorting all the edges lexicographically, as proposed in [5]. But this method would require an Ω((m log m)/p) computation step and would then dominate the O(((n+m)/p) log p) total time of our MIS algorithm. Therefore, we designed a p-quantiles search BSP/CGM algorithm that yields an efficient partitioning of the graph, allowing for quick ....
[Article contains additional citation context not shown here]
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In P. Degano, R. Gorrieri, and A. Marchetti-Spaccamela, editors, Proc. of ICALP'97, volume 1256 of LNCS, pages 390--400. Springer-Verlag, 1997.
No context found.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S.W. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Annual International Colloquium on Automata, Languages and Programming (ICALP), 1997.
....Algorithm 1 is of size at most O(n/p). Lemma 1. After the k-th iteration in Step 4, there are no more than two selected elements among any 2^k subsequent elements of the original list L. Proof. Due to space limitations, the proof is omitted. It can be found in the full version of this paper [5]. In order to show that subsequent elements selected at the end of Algorithm 1 have distance at most O(p^2), we need the following lemmas. Lemma 2. After every execution of Step 4.3, the distance of two subsequent selected elements with respect to the current pointers (represented by vector s) ....
....following lemmas. Lemma 2. After every execution of Step 4.3, the distance of two subsequent selected elements with respect to the current pointers (represented by vector s) is at most O(p). Proof. Due to space limitations, the proof is omitted. It can be found in the full version of this paper [5]. Lemma 3. After the k-th execution of Step 4.3, two subsequent elements with respect to the current pointers (represented by vector s) have distance O(2^k) with respect to the original list L. Proof. Obvious consequence of the fact that only k pointer jumping operations were so far executed ....
[Article contains additional citation context not shown here]
E. Cáceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S.W. Song, "Efficient Parallel Graph Algorithms For Coarse Grained Multicomputers and BSP," on-line Postscript at http://www.scs.carleton.ca/scs/faculty/dehne.html.
....algorithm, namely, the amount of local computation required, the number and type of global communication phases required and the scalability of the algorithm, that is, the range of values for the ratio n/p for which the algorithm is efficient and applicable. Recently, Cáceres et al. [2] showed that many problems in general graphs, such as list ranking, connected components and others, can be solved in O(log p) communication rounds in BSP and CGM. Note that while this work is of significant theoretical interest, these algorithms involve simulation of their corresponding PRAM ....
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. Song. Efficient parallel graph algorithms for coarse grained multicomputers and BSP. In Proc. of ICALP'97, pages 131--143. Lecture Notes in Computer Science. Springer-Verlag, 1997.
No context found.
F. Dehne, A. Ferreira, E. Caceres, S.W. Song, and A. Roncato. Efficient parallel graph algorithms for coarse-grained multicomputers and BSP. Algorithmica, 33(2):183--200, 2002.
No context found.
E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, and S. W. Song. Efficient Parallel Graph Algorithms for Coarse Grained Multicomputers and BSP. In Proc. 24th International Colloquium on Automata, Languages and Programming, 1997, 390-400.