Kruskal, C. P., Rudolph, L. and Snir, M. Efficient parallel algorithms for graph problems. Proc.
....needed to prove that the operation is indeed a reduction. Second, the sequential computation of the reduction must be replaced with a parallel algorithm. In parallel machines of medium to large size, the reduction algorithm is often replaced by a parallel prefix or recursive doubling computation [15, 21]. For reductions on array elements, a typical implementation is to have each processor accumulate partial reduction results in a private array. Then, after the loop is executed, a cross processor merging phase combines the partial results of all the processors into the original, shared array. ....
....section increases with the number of processors. Thus, it is recommended only for low contention reductions. Exploit the fact that a reduction operation is an associative and commutative recurrence. Therefore, it can be parallelized using a parallel prefix or a recursive doubling algorithm [15, 21]. This approach is more scalable. For reductions on array elements, a commonly used implementation of the second method is to create, for each processor, a private version of the reduction array initialized with the neutral element of the reduction operator. During the execution of the ....
C. Kruskal. Efficient Parallel Algorithms for Graph Problems. In Proc.
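The array-reduction scheme described in the two contexts above (a private copy of the reduction array per processor, initialized with the neutral element, followed by a cross-processor merge) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `private_array_reduction`, the use of Python threads as stand-in "processors", the cyclic distribution of iterations, and `+` as the reduction operator are all choices made here, not taken from the cited papers.

```python
from concurrent.futures import ThreadPoolExecutor


def private_array_reduction(indices, values, size, num_workers=4):
    """Compute shared[indices[k]] += values[k] for all k via private arrays.

    Hypothetical helper for illustration; threads stand in for processors.
    """

    def accumulate(worker_id):
        # Private copy of the reduction array, initialized with the
        # neutral element of the operator (0 for addition).
        private = [0] * size
        # Cyclic distribution of loop iterations (an assumption of this sketch).
        for k in range(worker_id, len(indices), num_workers):
            private[indices[k]] += values[k]
        return private

    # "doall" phase: each worker touches only its own private array,
    # so no synchronization is needed inside the loop.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(accumulate, range(num_workers)))

    # Merge phase: combine the partial results into the shared array.
    shared = [0] * size
    for private in partials:
        for i, partial in enumerate(private):
            shared[i] += partial
    return shared
```

The merge phase is written sequentially here for brevity; the contexts above suggest that on larger machines it would itself be parallelized.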
....that it is not scalable and requires synchronizations which can be very expensive in large multiprocessor systems. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [15, 16, 18]. In this case the reduction variable is privatized in the transformed doall, and the final result of the reduction operation is computed in an interprocessor reduction phase following the doall, i.e. a scalar is produced using the partial results computed in each processor as operands for a ....
C. Kruskal. Efficient parallel algorithms for graph problems. pages 869--876, August 1986.
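The interprocessor reduction phase mentioned in the context above can be illustrated with a recursive doubling computation over the per-processor partial results. The sketch below is an assumption-laden illustration, not the cited authors' code: the name `recursive_doubling_scan` is hypothetical, and the steps that would run concurrently on a parallel machine are simulated sequentially. After about log2(p) doubling steps over p partial results, the last entry holds the full reduction.

```python
import operator


def recursive_doubling_scan(values, op=operator.add):
    """Inclusive prefix computation by recursive doubling (simulated).

    Hypothetical helper: element i stands for the partial result held by
    processor i; `op` must be associative.
    """
    vals = list(values)
    stride = 1
    while stride < len(vals):
        # Snapshot so that every update in this step reads the values
        # from the previous step, as concurrent processors would.
        prev = vals[:]
        for i in range(stride, len(vals)):
            vals[i] = op(prev[i - stride], prev[i])
        stride *= 2
    return vals


def recursive_doubling_reduce(values, op=operator.add):
    """Return the reduction of `values`: the last entry of the scan."""
    scanned = recursive_doubling_scan(values, op)
    if not scanned:
        raise ValueError("cannot reduce an empty sequence")
    return scanned[-1]
```

Because operands are always combined in their original left-to-right order, this sketch relies only on associativity of `op`.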
....that it is not scalable and requires synchronizations which can be very expensive in large multiprocessor systems. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [15, 16, 18]. In this case the reduction variable is privatized in the transformed doall, and the final result of the reduction operation is computed in an interprocessor reduction phase following the doall, i.e. a scalar is produced using the partial results computed in each processor as operands for a ....
C. Kruskal. Efficient parallel algorithms for graph problems. August 1985.
....access pattern of array # is read, modify, write, and the function performed by the loop is to add a value computed in each iteration to the value stored in #. Once reduction variables are identified, methods are known for performing the reduction operation in parallel (see, e.g., [11, 14, 16, 35]). 3 Run Time Analysis of Loops. Given a do loop whose access pattern cannot be statically analyzed, compilers have traditionally generated sequential code. Since compile time data dependence analysis techniques cannot be used on such programs, methods of performing the analysis at run time ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proc. of the 1986 Int. Conf. on Parallel Processing, pp. 869--876, Aug. 1986.
.... [10, 11, 18]. (Footnote 1: in this paper we denote by log the logarithm whose base is 2.) An important aspect of the extended use of CCug in various applications is that the graphs to which it applies may have considerable differences in the density of the structure, i.e. the ratio of edges to vertices [22]. Note that the size of a graph, with $v$ vertices and $e$ edges, is $v + e$, and (for undirected, unlabelled graphs) $v^2$ is an upper bound for $e$. In many applications, we can safely assume that CCug deals with graphs which have e v Gamma 2. Nevertheless, in finding the transitive closure of ....
....as the algorithm presented in Section 3 but it is defined for the EREW PRAM model. This fact has several significant aspects which are discussed in Section 9 where a comparative study of the cost of our algorithm is given together with an overview of the algorithms for PRAM currently proposed by [14, 26, 4, 24, 27, 22, 1, 12, 7, 17, 20, 25, 5]. 2 Preliminaries. A graph is a pair $(V, E)$ where $V$ is the set of vertices and $E \subseteq V \times V$ is the set of edges. Let $v = |V|$, i.e. the cardinality of $V$, and $e = |E|$. Then the size of a graph is $e + v$. Given an edge $a = (x, z)$, vertex $x$ is the source and vertex $z$ is the target of $a$ 2 . A ....
Kruskal C.P., Rudolph L. and Snir M. Efficient Parallel Algorithms for Graph Problems. Proc. of the Inter. Conf. on Parallel Processing, (1986), 869-876.
....that it is not scalable and requires synchronizations which can be very expensive in large multiprocessor systems. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [16, 17, 19]. In this case the reduction variable is privatized in the transformed doall, and the final result of the reduction operation is computed in an interprocessor reduction phase following the doall, i.e. a scalar is produced using the partial results computed in each processor as operands for a ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1986 International Conference on Parallel Processing, pages 869--876, August 1986.
....that it is not scalable and requires synchronizations which can be very expensive in large multiprocessor systems. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [16, 17, 19]. In this case the reduction variable is privatized in the transformed doall, and the final result of the reduction operation is computed in an interprocessor reduction phase following the doall, i.e. a scalar is produced using the partial results computed in each processor as operands for a ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1985 International Conference on Parallel Processing, August 1985.
....that it is not scalable and requires synchronizations which can be very expensive in large multiprocessor systems. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [19, 21]. In this case the reduction variable is privatized in the transformed doall, and the final result of the reduction operation is computed in an interprocessor reduction phase following the doall, i.e. the result is produced using the partial results computed in each processor as operands ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1986 International Conference on Parallel Processing, pages 869--876, August 1986.
....are that it is not always scalable and requires synchronizations which can be very expensive in large multiprocessor systems. A scalable method can be obtained by noting that a reduction operation is an associative recurrence and can thus be parallelized using a recursive doubling algorithm [22, 23, 25]. In this case the reduction variable is privatized in the transformed doall, and the final result of the reduction operation is computed in an interprocessor reduction phase following the doall, i.e. a scalar is produced using the partial results computed in each processor as operands for a ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1986 International Conference on Parallel Processing, pages 869--876, August 1986.
....are that it is not always scalable and requires synchronizations which can be very expensive in large multiprocessor systems. A scalable method can be obtained by noting that a reduction operation is an associative recurrence and can thus be parallelized using a recursive doubling algorithm [22, 23, 25]. In this case the reduction variable is privatized in the transformed doall, and the final result of the reduction operation is computed in an interprocessor reduction phase following the doall, i.e. a scalar is produced using the partial results computed in each processor as operands for a ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1985 International Conference on Parallel Processing, August 1985.
....reduction statements have a loop carried flow dependence, i.e. the original sequential computation of the reduction must be replaced with a parallel algorithm. For example, a sequential summation is a reduction which can be replaced by a parallel prefix, or recursive doubling, computation [5, 6]. For some applications, most notably for SPICE 2G6, the data dependence analysis involved cannot be performed at compile time because the reference pattern is not defined at that time. In the case of irregular, sparse programs the access pattern traversed by the sequential reduction during a loop ....
....high. Disadvantages: It can potentially generate a lot of memory traffic, and is applicable only for software DSM environments. 4.1.2 Private Accumulation and Global Update. A reduction operation is an associative recurrence and can thus be parallelized using a recursive doubling algorithm [5, 6]. In a similar manner, we can privatize the reduction variables and accumulate in private storage the partial results and thus allow the original loop to execute as a doall. Then, after loop execution, the partial results are accumulated across processors and the corresponding shared array is ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1986 International Conference on Parallel Processing, pages 869--876, August 1986.
....this may lead to some complications due to variable number of links pointing to or from a tree node. A more practical preorder tree traversal representation involves replacement of each tree node, $u \in V_r$, with two nodes, $u_0$ and $u_1$. This representation has been used by [15] and recently by [8]. Figure 6(b) illustrates the example tree in this representation. [Fig. 6. Linearized representation for one of the trees in the example: Euler Tour representation (a) and another more practical representation (b).] Given the ....
C. P. Kruskal, L. Rudolph, and M. Snir, Efficient parallel algorithms for graph problems, Algorithmica, 5 (1990), pp. 43--64.
....drawbacks of this method are that it is not scalable and that it requires potentially expensive synchronizations. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [10, 12]. In this case, the reduction variable is privatized in the transformed doall. A scalar is then produced using the partial results computed in each processor as operands for a reduction operation (with the same operator) across the processors (Figure 4(c)). The real difficulty encountered by ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1986 International Conference on Parallel Processing, pages 869--876, August 1986.
....$n^2 / \log^2 n$ CREW PRAM [HCS79, CLC82] or $n^2$ EREW PRAM processors [NM82] and $O(\log n)$ time using $n + m$ PRIORITY CRCW PRAM processors [AS87, SV82] or $(n + m) \log\log\log n / \log n$ STRONG CRCW PRAM processors [CV86] using very elaborate techniques. Other parallel algorithms are reported in [KRS90, KR84, Ben80, SJ81]. Recently, [CL93] have improved the running time of [JM91] to $O(\log n \log\log n)$ mainly by providing a recursive version of the growth control schedule. It does not appear, however, that this technique has immediate application on the MST algorithm we present here. The paper is organized as ....
....the remaining steps use $m / \log m$ processors. Thus, the algorithm uses $O(n + m)$ processors. Assuming that there exists an integer sorting algorithm that runs in logarithmic time using $O(m / \sqrt{\log m})$ processors, the whole algorithm will have this processors bound. In fact, the algorithms given in [KRS90] and in [She91, HS90] are within the desired bounds. However, due to space requirements (the former) and to unrealistic machine assumptions (the latter) these algorithms are not considered practical. 3.4 Description of a Sub phase. As we have said, each component C entering a phase holds ....
C.P. Kruskal, L. Rudolph, and M. Snir. Efficient parallel algorithms for graph problems. Algorithmica, 5:43--64, 1990.
....drawbacks of this method are that it is not scalable and that it requires potentially expensive synchronizations. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [12, 13, 14]. In this case, the reduction variable is privatized in the transformed doall. A scalar is then produced using the partial results computed in each processor as operands for a reduction operation (with the same operator) across the processors (Figure 7(c)). The real difficulty encountered by ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proc. of the 1986 Int. Conf. on Parallel Processing, pp. 869--876, August 1986.
....drawbacks of this method are that it is not scalable and that it requires potentially expensive synchronizations. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [12, 13, 14]. In this case, the reduction variable is privatized in the transformed doall. A scalar is then produced using the partial results computed in each processor as operands for a reduction operation (with the same operator) across the processors (Figure 7(c)). The real difficulty encountered by ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proc. of the 1985 Int. Conf. on Parallel Processing, August 1985.
....drawbacks of this method are that it is not scalable and that it requires potentially expensive synchronizations. A scalable method can be obtained by noting that a reduction operation is an associative and commutative recurrence and can thus be parallelized using a recursive doubling algorithm [14, 16]. In this case, the reduction variable is privatized in the transformed doall. A scalar is then produced using the partial results computed in each processor as operands for a reduction operation (with the same operator) across the processors (Figure 3(c)). This last cross processor reduction, ....
C. Kruskal. Efficient parallel algorithms for graph problems. In Proceedings of the 1986 International Conference on Parallel Processing, pages 869--876, August 1986.
....A broadcasting step is used as above to notify all processors about their leaders. Theorem 1 follows. 3 Simulations Results. In this section we give several known results for integer sorting algorithms, and we use them in conjunction with Theorem 1 to derive several simulation results. Fact 5 ([14]) Using $n^{1-\epsilon}$ processors, for any fixed $\epsilon > 0$, $n$ integers from an arbitrary range $[1, m]$ can be sorted on an erew pram in $O(n^{\epsilon} \lg m / \lg n)$ time and $O(n \lg m / \lg n)$ operations. Fact 6 ([5]) $n$ input elements from the integer interval $[1, k]$ can be stably sorted on an erew pram in ....
C. Kruskal, L. Rudolph, and M. Snir. Efficient parallel algorithms for graph problems. Algorithmica, 5:43--64, 1990.
No context found.
Kruskal, C. P., Rudolph, L. and Snir, M. Efficient parallel algorithms for graph problems. Proc.
No context found.
C. P. Kruskal, L. Rudolph and M. Snir. Efficient parallel algorithms for graph problems. Proc. 1986 International Conf. on Parallel Processing, 869-876.
No context found.
Kruskal, C. P., Rudolph, L., Snir, M. Efficient parallel algorithms for graph problems. Proc. of 1986 Int. Conf. on Parallel Processing, pp. 869-876.
No context found.
C. Kruskal, "Efficient Parallel Algorithms for Graph Problems," Proc. 1986.
No context found.
C.P. Kruskal, L. Rudolph, and M. Snir. Efficient parallel algorithms for graph problems. Algorithmica, 5:43--64, 1990.
No context found.
C. P. Kruskal, L. Rudolph, and M. Snir. Efficient parallel algorithms for graph problems. Algorithmica, 5:43-64, 1990.
No context found.
Kruskal, C. P., Rudolph, L., and Snir, M., "Efficient Parallel Algorithms for Graph Problems", Algorithmica, (1990) 5: pp. 43-64.
CiteSeer.IST - Copyright Penn State and NEC