Results 1-10 of 40
Multilevel algorithms for partitioning power-law graphs
 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS). IN
, 2006
Cited by 61 (1 self)
Graph partitioning is an enabling technology for parallel processing as it allows for the effective decomposition of unstructured computations whose data dependencies correspond to a large sparse and irregular graph. Even though the problem of computing high-quality partitionings of graphs arising in scientific computations is to a large extent well understood, this is far from being true for emerging HPC applications whose underlying computation involves graphs whose degree distribution follows a power-law curve. This paper presents new multilevel graph partitioning algorithms that are specifically designed for partitioning such graphs. It presents new clustering-based coarsening schemes that identify and collapse together groups of vertices that are highly connected. An experimental evaluation of these schemes on 10 different graphs shows that the proposed algorithms consistently and significantly
Parallel Optimisation Algorithms for Multilevel Mesh Partitioning
 Parallel Comput
, 2000
Cited by 55 (14 self)
Three parallel optimisation algorithms, for use in the context of multilevel graph partitioning of unstructured meshes, are described. The first, interface optimisation, reduces the computation to a set of independent optimisation problems in interface regions. The next, alternating optimisation, is a restriction of this technique in which mesh entities are only allowed to migrate between subdomains in one direction. The third treats the gain as a potential field and uses the concept of relative gain for selecting appropriate vertices to migrate. The results are compared and seen to produce very high global quality partitions, very rapidly. The results are also compared with another partitioning tool and shown to be of higher quality although taking longer to compute. © 2000 Elsevier Science B.V. All rights reserved.
Parallel Implementation and Practical Use of Sparse Approximate Inverse Preconditioners With a Priori Sparsity Patterns
 Int. J. High Perf. Comput. Appl
, 2001
Cited by 30 (2 self)
This paper describes and tests a parallel, message-passing code for constructing sparse approximate inverse preconditioners using Frobenius norm minimization. The sparsity patterns of the preconditioners are chosen as patterns of powers of sparsified matrices. Sparsification is necessary when powers of a matrix have a large number of nonzeros, making the approximate inverse computation expensive. For our test problems, the minimum solution time is achieved with approximate inverses with fewer than twice the number of nonzeros of the original matrix. Additional accuracy is not compensated by the increased cost per iteration. The results lead to further understanding of how to use these methods and how well these methods work in practice. In addition, this paper describes programming techniques required for high performance, including one-sided communication, local coordinate numbering, and load repartitioning.
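The idea of taking a sparsity pattern from a power of a sparsified matrix can be illustrated with a minimal sketch. The absolute drop threshold, the always-kept diagonal, and the function names `sparsify_pattern` and `power_pattern` are assumptions made for illustration; the paper's actual sparsification rule may differ.

```python
def sparsify_pattern(A, threshold):
    """Pattern of A after dropping small entries (assumed rule: keep the
    diagonal and every entry with |a_ij| > threshold)."""
    n = len(A)
    return [{j for j in range(n) if i == j or abs(A[i][j]) > threshold}
            for i in range(n)]

def power_pattern(pattern, k):
    """Structural pattern of the k-th power of a matrix with this pattern.
    Row i of P^(m+1) is the union of the rows of P indexed by row i of P^m."""
    result = [set(row) for row in pattern]
    for _ in range(k - 1):
        result = [set().union(*(pattern[j] for j in row)) for row in result]
    return result

# Small dense-list example: a tridiagonal matrix with one weak coupling.
A = [
    [4.0, -1.0, 0.0, 0.0],
    [-1.0, 4.0, -1.0, 0.0],
    [0.0, -1.0, 4.0, -0.01],
    [0.0, 0.0, -0.01, 4.0],
]
pattern = sparsify_pattern(A, threshold=0.1)
squared = power_pattern(pattern, 2)
nnz_pattern = sum(len(row) for row in pattern)
nnz_squared = sum(len(row) for row in squared)
```

Dropping the weak coupling before squaring keeps the last row diagonal-only, which is how sparsification limits the growth in nonzeros that a higher power would otherwise cause.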
Parallel computation of three-dimensional flows using overlapping grids with adaptive mesh refinement
 J. Comput. Phys
, 2008
Cited by 17 (6 self)
This paper describes an approach for the numerical solution of time-dependent partial differential equations in complex three-dimensional domains. The domains are represented by overlapping structured grids, and block-structured adaptive mesh refinement (AMR) is employed to locally increase the grid resolution. In addition, the numerical method is implemented on parallel distributed-memory computers using a domain-decomposition approach. The implementation is flexible so that each base grid within the overlapping grid structure and its associated refinement grids can be independently partitioned over a chosen set of processors. A modified bin-packing algorithm is used to specify the partition for each grid so that the computational work is evenly distributed amongst the processors. All components of the AMR algorithm such as error estimation, regridding, and interpolation are performed in parallel. The parallel time-stepping algorithm is illustrated for initial-boundary-value problems involving a linear advection-diffusion equation and the (nonlinear) reactive Euler equations. Numerical results are presented for both equations to demonstrate the accuracy and correctness of the parallel approach. Exact solutions of the advection-diffusion equation are constructed, and these are used to check the corresponding numerical solutions for a variety of tests involving different overlapping grids, different numbers of refinement levels and refinement ratios, and different numbers of processors. The problem of planar shock diffraction by a sphere is considered as an illustration of the
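To give a feel for how a bin-packing heuristic evens out computational work, here is a minimal sketch of the standard greedy "largest item first" rule. It is not the modified algorithm of the paper: it treats each grid as an indivisible item, and the function name `assign_grids` and the workload numbers are hypothetical.

```python
import heapq

def assign_grids(workloads, num_procs):
    """Greedy bin packing: visit grids from largest to smallest workload and
    give each one to the processor with the least work so far."""
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor)
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(workloads.items(), key=lambda kv: kv[1], reverse=True)
    for grid, work in ordered:
        load, proc = heapq.heappop(heap)         # least-loaded processor
        assignment[grid] = proc
        heapq.heappush(heap, (load + work, proc))
    loads = [0.0] * num_procs
    for grid, proc in assignment.items():
        loads[proc] += workloads[grid]
    return assignment, loads

# Hypothetical work estimates (e.g. number of grid points) for five grids.
workloads = {"g0": 8, "g1": 7, "g2": 6, "g3": 5, "g4": 4}
assignment, loads = assign_grids(workloads, num_procs=2)
```

The heap keeps the least-loaded processor at the front, so each assignment costs O(log p) for p processors.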
Partitioning sparse matrices for parallel preconditioned iterative methods
 SIAM Journal on Scientific Computing
, 2004
Cited by 14 (9 self)
This paper addresses the parallelization of the preconditioned iterative methods that use explicit preconditioners such as approximate inverses. Parallelizing a full step of these methods requires the coefficient and preconditioner matrices to be well partitioned. We first show that different methods impose different partitioning requirements for the matrices. Then we develop hypergraph models to meet those requirements. In particular, we develop models that enable us to obtain partitionings on the coefficient and preconditioner matrices simultaneously. Experiments on a set of unsymmetric sparse matrices show that the proposed models yield effective partitioning results. A parallel implementation of the right-preconditioned BiCGStab method on a PC cluster verifies that the theoretical gains obtained by the models hold in practice.
Dynamic load balancing of finite element applications with the DRAMA library
 APPLIED MATHEMATICAL MODELLING 25 (2000) 83-98
, 2000
Cited by 14 (2 self)
The DRAMA library, developed within the European Commission funded (ESPRIT) project DRAMA, supports dynamic load balancing for parallel (message-passing) mesh-based applications. The target applications are those with dynamic and solution-adaptive features. The focus within the DRAMA project was on finite element simulation codes for structural mechanics. An introduction to the DRAMA library will illustrate that the very general cost model and the interface designed specifically for application requirements provide simplified and effective access to a range of parallel partitioners. The main body of the paper will demonstrate the ability to provide dynamic load balancing for parallel FEM problems that include adaptive meshing, re-meshing, and the need for multi-phase partitioning.
Multi-constraint mesh partitioning for contact/impact computations
 IN: PROC. SC2003, ACM
, 2003
Cited by 10 (0 self)
We present a novel approach for decomposing contact/impact computations in which the mesh elements come in contact with each other during the course of the simulation. Effective decomposition of these computations poses a number of challenges as it needs to both balance the computations and minimize the amount of communication that is performed during the finite element and the contact search phase. Our approach achieves the first goal by partitioning the underlying mesh such that it simultaneously balances both the work that is performed during the finite element phase and that performed during the contact search phase, while producing subdomains whose boundaries consist of piecewise axes-parallel lines or planes. The second goal is achieved by using a decision tree to decompose the space into rectangular or box-shaped regions that contain contact points from a single partition. Our experimental evaluation on a sequence of 100 meshes shows that this new approach can significantly reduce the communication overhead over existing algorithms.
Partitioning and Dynamic Load Balancing for the Numerical Solution of Partial Differential Equations
 NUMERICAL SOLUTION OF PARTIAL DIFFERENTIAL EQUATIONS ON PARALLEL COMPUTERS
, 2005
Communication Support for Adaptive Computation
, 2001
Cited by 8 (5 self)
This memory cannot be utilized in subsequent phases, decreasing the total memory which is usable for communication, thus potentially increasing the number of phases. Instead, another processor can temporarily move some of its data to this processor to free up space for messages. An example is illustrated in Fig. 3. In this simple example, the top two processors want to exchange 100 units of data, but each has only one unit of available memory. A simplistic approach will require 100 phases. However, the third processor has 100 units of free memory. By parking data on this third processor (i.e. transferring free memory to another processor), the number of phases can be reduced to three.
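The phase counts in this example can be checked with a short sketch. The three-step order used here (park on the third processor, deliver, then forward the parked data) is an assumed reading of the example, and the processor labels `A`, `B`, `C` and both function names are hypothetical.

```python
from math import ceil

def naive_phases(data_units, free_units):
    """Phases needed when each phase can move at most `free_units` of data
    in each direction between the two exchanging processors."""
    return ceil(data_units / free_units)

def parked_exchange(data_units, free):
    """Count phases when A first parks its outgoing data on C.
    `free` maps processor name to units of free memory; every transfer
    must fit into the receiver's free memory."""
    free = dict(free)  # work on a copy
    phases = 0
    # Assumed order: A parks on C, B delivers to A, C forwards to B.
    for src, dst in [("A", "C"), ("B", "A"), ("C", "B")]:
        if data_units > free[dst]:
            raise ValueError(f"{dst} lacks free memory for {data_units} units")
        free[dst] -= data_units   # receiver fills up
        free[src] += data_units   # sender frees the space it held
        phases += 1
    return phases, free

naive = naive_phases(100, 1)
parked, free_after = parked_exchange(100, {"A": 1, "B": 1, "C": 100})
```

With the numbers from the text, `naive` is 100 and `parked` is 3, and each processor ends with the same amount of free memory it started with.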
Graph Partitioning in Scientific Simulations: Multilevel Schemes versus Space-Filling Curves
Cited by 8 (3 self)
Using space-filling curves to partition unstructured finite element meshes is a widely applied strategy when it comes to distributing load among several computation nodes. Compared to more elaborate graph partitioning packages, this geometric approach is relatively easy to implement and very fast. However, results are not expected to be as good as those of the latter, but no detailed comparison has ever been published. In this paper we will...
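The geometric approach is simple enough to sketch in a few lines. The example below uses a Morton (Z-order) curve on integer 2D coordinates purely for brevity; the paper may use a different curve such as Hilbert, and the names `morton_key` and `sfc_partition` are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the bits of non-negative integers x and y (Z-order index)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return key

def sfc_partition(points, num_parts):
    """Sort points along the curve and cut the sequence into `num_parts`
    contiguous chunks of (nearly) equal size. Returns a part id per point."""
    order = sorted(range(len(points)), key=lambda i: morton_key(*points[i]))
    parts = [0] * len(points)
    for rank, idx in enumerate(order):
        parts[idx] = rank * num_parts // len(points)
    return parts

# Element centres of a 4x4 mesh, split among four computation nodes.
points = [(x, y) for y in range(4) for x in range(4)]
parts = sfc_partition(points, 4)
```

On this 4x4 example each part is one 2x2 quadrant. The cut ignores mesh connectivity entirely, which is why the method is fast but is not expected to match the partition quality of graph-based packages.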