B. Marsolf. Large grain parallel sparse system solver. Technical Report CSRD Report No. 1125, Center for Supercomputing Research and Development, University of Illinois, Urbana, IL, 1991. Master Thesis.
.... to obtain a parallel algorithm). An efficient reordering has been proposed by Hellerman and Rarick [34, 35]. It has been used, with some modifications, by many other authors; see, for example, [8] or [20]. Other preliminary reorderings have also been proposed in the literature; see, for example, [26]. A common feature of all these reorderings is that one always imposes a requirement to obtain square blocks on the main diagonal. Moreover, it is also required that the reordered matrix is either an upper block triangular matrix or a bordered matrix, in both cases with square blocks on the main ....
K. A. Gallivan, B. Marsolf and H. Wijshoff, "A large-grain parallel sparse system solver". In: "PROCEEDINGS OF THE SIAM CONFERENCE ON PARALLEL PROCESSING FOR SCIENTIFIC COMPUTING", pp. 23--28. SIAM, Philadelphia, 1991.
.... while loops that could not be parallelized by any compiler available to us; two loops are from the PERFECT Benchmarks [3], two loops are from MA28, a sparse nonsymmetric linear solver [5], and one loop is extracted from MCSPARSE, a parallel version of a nonsymmetric sparse linear systems solver [6, 7]. Our results are summarized in Table 2. For each method applied to a loop, we give the speedup that was obtained and mention whether backups and time stamping were necessary. Whenever necessary, we performed a simple preventive backup of the variables potentially written in the loop. In some ....
K. Gallivan, B. Marsolf, and H. Wijshoff. A large-grain parallel sparse system solver. In Proc. 4-th SIAM Conf. on Parallel Proc. for Scient. Comp., pages 23--28, Chicago, IL, 1989.
....input matrix into a bordered block upper triangular structure. This structure can be used to exploit large, medium and fine grained parallelism during the subsequent factorization and solve phases. The initial implementation of MCSPARSE was specifically targeted for the Cedar architecture [7, 8, 14, 16]. The Cedar is a cluster-based multivector machine containing a hybrid memory system. Later, a shared memory version of MCSPARSE was developed and implemented for the Cray Y-MP. Also, the preprocessing phase H was further optimized and parallelized [11, 12]. In this paper, we specifically ....
B. A. Marsolf, Large grain parallel sparse system solver, Tech. Rep. CSRD Report No. 1125, Center for Supercomputing Research and Development, University of Illinois, Urbana, IL, 1991. Master Thesis.
....and counting we obtain the results depicted in Table 1. Since Aw ∧ Ar and Aw ∧ Anp are zero everywhere, the loop can be made into a DOALL, but only after privatization, since tw(A) ≠ tm(A).

S1: DO i = 1, n
S2:   A[R1[i]]
S3:   A[W[i]]
S4:   A[R2[i]]
S5: ENDDO

R1[1:8] = [2 2 2 10 8 8 8 10]
W[1:8]  = [1 3 5 4 7 3 6 12]
R2[1:8] = [1 3 2 10 7 3 8 12]

Figure 2:

             Position in shadow arrays              Written  Counted
             1  2  3  4  5  6  7  8  9 10 11 12     tw(A)    tm(A)
Aw[1:12]     1  0  1  1  1  1  1  0  0  0  0  1       8        7
Ar[1:12]     0  1  0  0  0  0  0  1  0  1  0  0
Anp[1:12]    0  1  0  0  0  0  0  1  0  1  0  0
Aw ∧ Ar      0  0  0  0  0  0  0  0  ....

.... Aw ∧ Ar and Aw ∧ Anp are zero everywhere, the loop can be made into a DOALL, but only after privatization, since tw(A) ≠ tm(A).

S1: DO i = 1, n
S2:   A[R1[i]]
S3:   A[W[i]]
S4:   A[R2[i]]
S5: ENDDO

R1[1:8] = [2 2 2 10 8 8 8 10]
W[1:8]  = [1 3 5 4 7 3 6 12]
R2[1:8] = [1 3 2 10 7 3 8 12]

Figure 2:

             Position in shadow arrays              Written  Counted
             1  2  3  4  5  6  7  8  9 10 11 12     tw(A)    tm(A)
Aw[1:12]     1  0  1  1  1  1  1  0  0  0  0  1       8        7
Ar[1:12]     0  1  0  0  0  0  0  1  0  1  0  0
Anp[1:12]    0  1  0  0  0  0  0  1  0  1  0  0
Aw ∧ Ar      0  0  0  0  0  0  0  0  0  0  0  0
Aw ∧ Anp     0  0  0  0  0  0  0  0  0  0  0  0

Table 1: ....
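The shadow-array values quoted in this excerpt can be reproduced with a short sketch. It assumes, since the excerpt does not spell this out, that S2 and S4 read A while S3 writes it, that Ar marks elements read in an iteration that does not write them, and that Anp marks elements read before any write in the same iteration. The function names `mark_shadows` and `verdict` are hypothetical, and only the final branch of `verdict` is confirmed by the excerpt; the other branches are assumptions.

```python
# Index arrays from the excerpt; values are 1-based positions in A[1:12].
R1 = [2, 2, 2, 10, 8, 8, 8, 10]
W = [1, 3, 5, 4, 7, 3, 6, 12]
R2 = [1, 3, 2, 10, 7, 3, 8, 12]
SIZE = 12


def mark_shadows(r1, w, r2, size):
    """Mark shadow arrays for a body that reads A[r1[i]], writes A[w[i]],
    then reads A[r2[i]]. The marking rules are assumptions (see lead-in)."""
    aw = [0] * size   # element written in some iteration
    ar = [0] * size   # element read in an iteration that does not write it
    anp = [0] * size  # element read before any write in the same iteration
    tw = 0            # total number of writes performed
    for first_read, write, second_read in zip(r1, w, r2):
        aw[write - 1] = 1
        tw += 1
        # The first read always precedes the write of this iteration.
        anp[first_read - 1] = 1
        if first_read != write:
            ar[first_read - 1] = 1
        # The second read is covered by the write only if it hits the same element.
        if second_read != write:
            ar[second_read - 1] = 1
            anp[second_read - 1] = 1
    tm = sum(aw)      # number of distinct elements marked as written
    return aw, ar, anp, tw, tm


def verdict(aw, ar, anp, tw, tm):
    """Classify the loop from the shadow arrays (assumed decision order)."""
    if any(x and y for x, y in zip(aw, ar)):
        return "not parallel"
    if tw == tm:
        return "DOALL"
    if any(x and y for x, y in zip(aw, anp)):
        return "not parallel"
    return "DOALL after privatization"


aw, ar, anp, tw, tm = mark_shadows(R1, W, R2, SIZE)
result = verdict(aw, ar, anp, tw, tm)
```

With the index arrays above, the sketch yields tw(A) = 8 and tm(A) = 7, and both conjunctions are zero everywhere, which matches the conclusion in the excerpt that the loop becomes a DOALL only after privatization.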
K. Gallivan, B. Marsolf, and H. Wijshoff. A large-grain parallel sparse system solver. In Proc. Fourth SIAM Conf. on Parallel Proc. for Scient. Comp., pages 23--28, Chicago, IL, 1989.
.... could not be parallelized by any compiler available to us; two loops are from the PERFECT Benchmarks [BCK 89], two loops are from MA28, a sparse nonsymmetric linear solver [Duf77], and one loop is extracted from MCSPARSE, a parallel version of a nonsymmetric sparse linear systems solver [GMW89, GMW91]. Our results are summarized in Table 8.9. For each method applied to a loop, we give the speedup that was obtained and mention whether backups and time stamping were necessary. Whenever necessary, we performed a simple preventive backup of the variables potentially written in the loop. ....
K. Gallivan, B. Marsolf, and H. Wijshoff. A large-grain parallel sparse system solver. In Proc. Fourth SIAM Conf. on Parallel Proc. for Scient. Comp., pages 23--28, Chicago, IL, 1989.
....2.5 Sparse linear systems: Direct methods. Gallivan, Marsolf and Wijshoff have investigated the use of a new hybrid ordering technique, and an associated factorization algorithm for unsymmetric unstructured sparse linear systems on multiprocessor architectures such as Cedar and the Cray-2 [37, 66]. This algorithm, called MCSPARSE, exploits large grain parallelism and divides the problem into partitions which are processed by each of the clusters or processors. On Cedar, finer granularity parallelism is also used within a cluster. For more details we refer to H. Wijshoff's presentation in ....
K. GALLIVAN, B. MARSOLF, AND H. WIJSHOFF, A large-grain parallel sparse system solver, in Procs. Fourth SIAM Conf. Parallel Processing for Scientific Computing, Chicago, IL, 1989.
.... WHILE loops that could not be parallelized by any compiler available to us; two loops are from the PERFECT Benchmarks [3], two loops are from MA28, a sparse unsymmetric linear solver [6], and one loop is extracted from MCSPARSE, a parallel version of a nonsymmetric sparse linear systems solver [7, 8]. Our results are summarized in Table 2. For each method applied to a loop, we give the speedup that was obtained and mention whether backups and time stamping were necessary. Whenever necessary, we performed a simple preventive backup of the variables potentially written in the loop. In some ....
K. Gallivan, B. Marsolf, and H. Wijshoff. A large-grain parallel sparse system solver. In Proc. Fourth SIAM Conf. on Parallel Proc. for Scient. Comp., pages 23--28, Chicago, IL, 1989.
....medium (various parallel row updates strategies) and fine grain (vectorization) parallelism to form a multi grain parallel solver MCSPARSE, which allows adaptation to a wide range of multiprocessor architectures. In [GMW89] initial results with MCSPARSE are presented and more details can be found in [Mar91]. The paper is organized as follows. In Section 2 a global overview of the procedures in MCSPARSE is given. The details of the ordering H are presented in Section 3. Casting is introduced and discussed from an algebraic point of view in Section 4. The details of the implementation of the ....
B. Marsolf. Large grain parallel sparse system solver. Technical Report CSRD Report No. 1125, Center for Supercomputing Research and Development, University of Illinois, Urbana, IL, 1991. Master Thesis.
....parallelism (parallel subsystems of various sizes) exposed by H is combined with medium (various parallel row updates strategies) and fine grain (vectorization) parallelism to form a multi grain parallel solver MCSPARSE, which allows adaptation to a wide range of multiprocessor architectures. In [GMW89] initial results with MCSPARSE are presented and more details can be found in [Mar91]. The paper is organized as follows. In Section 2 a global overview of the procedures in MCSPARSE is given. The details of the ordering H are presented in Section 3. Casting is introduced and discussed from an ....
K. Gallivan, B. Marsolf, and H. Wijshoff. A large-grain parallel sparse system solver. In Proc. Fourth SIAM Conf. on Parallel Proc. for Scient. Comp., pages 23-28, Chicago, IL, 1989.
....a new ordering technique, the hybrid ordering (H), and an associated factorization algorithm for unsymmetric unstructured sparse linear systems. More detail on the reordering can be found in [7] and on the merger of the reordering and the factorization algorithm for multicluster architectures in [4]. 1. The Hybrid Ordering. The hybrid ordering H is composed of two different types of orderings: unsymmetric and symmetric. The unsymmetric ordering changes the associated graph of the matrix, mostly by row or column interchanges. The symmetric orderings only relabel the nodes of the associated ....
....unknown until later in the factorization and not really relying on an accurate estimate of the rank of the diagonal block, simple conservative techniques seem adequate. Detailed experiments concerning the effectiveness of such techniques and various choices of their parameters are presented in [4]. Casting can also be applied when each diagonal block is used to eliminate the corresponding columns in the border. A pivot p from the diagonal block is only applied to an element α in the border if the following test is passed: |p| ≥ 10^{-6} |α| (3). If the test is not passed then ....
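The pivot test (3) quoted in this excerpt can be sketched as a one-line predicate. The comparison operator did not survive extraction, so the sketch assumes the usual threshold form, in which the pivot magnitude must be at least 10^{-6} times the magnitude of the border element. The function name `pivot_passes` and the parameter `tol` are hypothetical.

```python
def pivot_passes(p: float, alpha: float, tol: float = 1e-6) -> bool:
    """Return True if pivot p may be applied to border element alpha.

    Assumes test (3) has the form |p| >= tol * |alpha|; the relation
    symbol is an assumption because it is missing from the excerpt.
    """
    return abs(p) >= tol * abs(alpha)


# A pivot of magnitude 1 against a border element of magnitude 1e5 passes.
ok = pivot_passes(1.0, 1.0e5)
# A tiny pivot against a border element of magnitude 1 is rejected.
rejected = not pivot_passes(1.0e-8, 1.0)
```

What happens when the test fails is cut off in the excerpt, so the sketch only reports the outcome of the test.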
K. Gallivan, B. Marsolf, and H. Wijshoff. A large-grain parallel sparse system solver. Technical report, Center for Supercomputing Research and Development, University of Illinois, Urbana, IL, 1990. in preparation.
....(various parallel row updates strategies) and fine grain (vectorization) parallelism to form a multi grain parallel solver mcsparse which allows adaptation to a wide range of multiprocessor architectures. In [12] initial results with mcsparse are presented and more details can be found in [30]. The paper is organized as follows. In Section 2 a global overview of the procedures in mcsparse is given. The details of the ordering H are presented in Section 3. Casting is introduced and discussed from an algebraic point of view in Section 4. The details of the implementation of the ....
B. Marsolf, Large grain parallel sparse system solver, Tech. Report CSRD Report No. 1125, Center for Supercomputing Research and Development, University of Illinois, Urbana, IL, 1991. Master Thesis.
....(parallel subsystems of various sizes) exposed by H is combined with medium (various parallel row updates strategies) and fine grain (vectorization) parallelism to form a multi grain parallel solver mcsparse which allows adaptation to a wide range of multiprocessor architectures. In [12] initial results with mcsparse are presented and more details can be found in [30]. The paper is organized as follows. In Section 2 a global overview of the procedures in mcsparse is given. The details of the ordering H are presented in Section 3. Casting is introduced and discussed from an ....
K. Gallivan, B. Marsolf, and H. Wijshoff, A large-grain parallel sparse system solver, in Proc. Fourth SIAM Conf. on Parallel Proc. for Scient. Comp., Chicago, IL, 1989, pp. 23--28.
....while allowing the implementation of a global pivoting strategy that yields a factorization with stability similar to more conventional nonsymmetric solvers. The strategy used in MCSPARSE, called casting, is described in detail for both arbitrary and bordered block upper triangular matrices in [15, 20, 38]. We review it briefly here. Casting is a symmetric permutation that decreases the size of the diagonal block containing the cast pivot by moving a row and column into the border, thereby increasing the size of the border by one. More formally: DEFINITION 5.1. A pivot p_ii is said to be cast if the system is ....
....features of the architecture, but this is not the only possible mapping. When implementing the solver, several assumptions about the mapping must be examined. The mapping assumes there are enough rows in the diagonal block for the processors of the cluster to be effective. The current H ordering [38], however, usually generates a large number of small diagonal blocks containing between one and four rows. Since a full Cedar cluster contains eight processors, such diagonal blocks would be unable to use more than half of the processors. A method for overcoming this problem was described in ....
B. MARSOLF, Large grain parallel sparse system solver, Tech. Report CSRD Report No. 1125, Center for Supercomputing Research and Development, University of Illinois, Urbana, IL, 1991. Master Thesis.
....while allowing the implementation of a global pivoting strategy that yields a factorization with stability similar to more conventional nonsymmetric solvers. The strategy used in MCSPARSE, called casting, is described in detail for both arbitrary and bordered block upper triangular matrices in [15, 20, 38]. We review it briefly here. Casting is a symmetric permutation that decreases the size of the diagonal block containing the cast pivot by moving a row and column into the border, thereby increasing the size of the border by one. More formally: DEFINITION 5.1. A pivot p_ii is said to be cast if the system is ....
K. GALLIVAN, B. MARSOLF, AND H. WIJSHOFF, A large-grain parallel sparse system solver, in Proc. Fourth SIAM Conf. on Parallel Proc. for Scient. Comp., Chicago, IL, 1989, pp. 23--28.