Results 1-10 of 180
The development of discontinuous Galerkin methods
, 1999
Cited by 182 (20 self)
In this paper, we present an overview of the evolution of the discontinuous Galerkin methods since their introduction in 1973 by Reed and Hill, in the framework of neutron transport, until their most recent developments. We show how these methods made their way into the mainstream of computational fluid dynamics and how they are quickly finding use in a wide variety of applications. We review the theoretical and algorithmic aspects of these methods as well as their applications to equations including nonlinear conservation laws, the compressible Navier-Stokes equations, and Hamilton-Jacobi-like equations.
Analysis of multilevel graph partitioning
, 1995
Cited by 106 (12 self)
Recently, a number of researchers have investigated a class of algorithms based on multilevel graph partitioning that have moderate computational complexity and provide excellent graph partitions. However, little theoretical analysis exists that could explain the ability of multilevel algorithms to produce good partitions. In this paper we present such an analysis. We show, under certain reasonable assumptions, that even if no refinement is used in the uncoarsening phase, a good bisection of the coarser graph is worse than a good bisection of the finer graph by at most a small factor. We also show that the size of a good vertex separator of the coarse graph projected to the finer graph (without performing refinement in the uncoarsening phase) is higher than the size of a good vertex separator of the finer graph by at most a small factor.
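The scheme this analysis covers (coarsen, bisect the coarsest graph, project the split back with no refinement) can be sketched as follows. The heavy-edge matching rule and the trivial base-case split are illustrative choices for the sketch, not the paper's exact algorithm:

```python
def bisect_multilevel(adj, ew=None, min_size=4):
    """Multilevel bisection with NO refinement during uncoarsening --
    the setting the paper's analysis covers.
    adj: {vertex: set of neighbors}; ew: {frozenset((u, v)): weight}."""
    if ew is None:
        ew = {frozenset((u, v)): 1 for v in adj for u in adj[v]}
    if len(adj) <= min_size:
        return _halve(adj)
    # Coarsening: heavy-edge matching contracts each vertex together
    # with its heaviest-edge unmatched neighbor into one supernode.
    cmap, used, cid = {}, set(), 0
    for v in sorted(adj):
        if v in used:
            continue
        used.add(v)
        free = [u for u in adj[v] if u not in used]
        if free:
            mate = max(free, key=lambda u: ew.get(frozenset((u, v)), 1))
            used.add(mate)
            cmap[mate] = cid
        cmap[v] = cid
        cid += 1
    if cid == len(adj):  # nothing contracted: stop coarsening
        return _halve(adj)
    # Build the coarse graph, summing the weights of parallel edges.
    cadj = {c: set() for c in range(cid)}
    cew, seen = {}, set()
    for v in adj:
        for u in adj[v]:
            e = frozenset((u, v))
            if e in seen:
                continue
            seen.add(e)
            a, b = cmap[v], cmap[u]
            if a != b:
                cadj[a].add(b)
                cadj[b].add(a)
                ce = frozenset((a, b))
                cew[ce] = cew.get(ce, 0) + ew.get(e, 1)
    cpart = bisect_multilevel(cadj, cew, min_size)
    # Uncoarsening by pure projection: every fine vertex inherits its
    # supernode's side; no refinement pass is run.
    return {v: cpart[cmap[v]] for v in adj}

def _halve(adj):
    """Trivial balanced split used at the coarsest level."""
    verts = sorted(adj)
    left = set(verts[: len(verts) // 2])
    return {v: (0 if v in left else 1) for v in adj}

# Hypothetical example graph: two triangles joined by a single edge.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
part = bisect_multilevel(adj)
```

The paper's result is precisely about this no-refinement projection: the bisection it produces is worse than a good bisection of the fine graph by at most a small factor.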
Schism: a Workload-Driven Approach to Database Replication and Partitioning
Cited by 97 (7 self)
We present Schism, a novel workload-aware approach for database partitioning and replication designed to improve the scalability of shared-nothing distributed databases. Because distributed transactions are expensive in OLTP settings (a fact we demonstrate through a series of experiments), our partitioner attempts to minimize the number of distributed transactions while producing balanced partitions. Schism consists of two phases: i) a workload-driven, graph-based replication/partitioning phase and ii) an explanation and validation phase. The first phase creates a graph with a node per tuple (or group of tuples) and edges between nodes accessed by the same transaction, and then uses a graph partitioner to split the graph into k balanced partitions that minimize the number of cross-partition transactions. The second phase exploits machine learning techniques to find a predicate-based explanation of the partitioning strategy (i.e., a set of range predicates that represent the same replication/partitioning scheme produced by the partitioner). The strengths of Schism are: i) independence from the schema layout, ii) effectiveness on n-to-n relations, typical in social network databases, and iii) a unified and fine-grained approach to replication and partitioning. We implemented and tested a prototype of Schism on a wide spectrum of test cases, ranging from classical OLTP workloads (e.g., TPC-C and TPC-E) to more complex scenarios derived from social network websites (e.g., Epinions.com), whose schema contains multiple n-to-n relationships, which are known to be hard to partition. Schism consistently outperforms simple partitioning schemes, and in some cases proves superior to the best known manual partitioning, reducing the cost of distributed transactions by up to 30%.
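The first phase's graph construction can be sketched as follows. The `workload` data and tuple ids are hypothetical; as the abstract describes, a real deployment hands the resulting weighted graph to a balanced min-cut graph partitioner rather than inspecting it directly:

```python
from collections import Counter
from itertools import combinations

def coaccess_graph(transactions):
    """Schism-style phase-one graph: one node per tuple, and an edge
    between every pair of tuples touched by the same transaction,
    weighted by the number of transactions co-accessing the pair."""
    nodes, edges = set(), Counter()
    for txn in transactions:
        tuples = sorted(set(txn))
        nodes.update(tuples)
        for a, b in combinations(tuples, 2):
            edges[(a, b)] += 1
    return nodes, edges

# Hypothetical workload: each transaction lists the tuple ids it touches.
workload = [
    ["u1", "u2"], ["u1", "u2"], ["u1", "u3"],
    ["u4", "u5"], ["u4", "u5"], ["u5", "u6"],
]
nodes, edges = coaccess_graph(workload)
# A heavy edge such as ("u1", "u2") tells the partitioner that splitting
# this pair across nodes would turn two transactions into distributed ones.
```

Because edge weights count co-accesses, a min-cut objective directly minimizes the number of transactions that become distributed.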
Multilevel Diffusion Schemes for Repartitioning of Adaptive Meshes
, 1997
Cited by 72 (6 self)
For a large class of irregular mesh applications, the structure of the mesh changes from one phase of the computation to the next. Eventually, as the mesh evolves, the adapted mesh has to be repartitioned to ensure good load balance. If this new graph is partitioned from scratch, it may lead to an excessive migration of data among processors. In this paper, we present schemes for computing repartitionings of adaptively refined meshes that perform diffusion of vertices in a multilevel framework. These schemes try to minimize vertex movement without significantly compromising the edge-cut. We present heuristics to control the trade-off between edge-cut and vertex migration costs. We also show that multilevel diffusion produces results with better edge-cuts than single-level diffusion, and is better able than single-level diffusion to exploit heuristics that control the trade-off between edge-cut and vertex migration costs.
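A single diffusion sweep of the kind these schemes build on might look like the following sketch. The mesh, partition, and capacity are hypothetical, and the paper's multilevel framework (diffusing on coarse graphs first) is not captured by this single-level loop:

```python
from collections import Counter

def diffuse_once(adj, part, capacity):
    """One single-level diffusion sweep: move boundary vertices out of
    overloaded partitions into underloaded neighboring partitions.
    Restricting moves to boundary vertices keeps data migration low;
    picking the target that holds most of a vertex's neighbors limits
    edge-cut growth. (Illustrative sketch, not the paper's scheme.)"""
    load = Counter(part.values())
    moved = []
    for v in sorted(adj):
        src = part[v]
        if load[src] <= capacity:
            continue
        # Candidate targets: neighboring partitions with spare capacity.
        targets = [q for q in {part[u] for u in adj[v]}
                   if q != src and load[q] < capacity]
        if targets:
            dst = max(targets, key=lambda q: sum(part[u] == q for u in adj[v]))
            part[v] = dst
            load[src] -= 1
            load[dst] += 1
            moved.append(v)
    return moved

# Hypothetical adapted mesh as a path graph; partition 0 is overloaded.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
part = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
moved = diffuse_once(adj, part, capacity=3)
```

Only the single boundary vertex 3 migrates, which restores balance with minimal data movement; repartitioning from scratch could have relabeled far more vertices.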
Fast and Effective Algorithms for Graph Partitioning and Sparse Matrix Ordering
 IBM JOURNAL OF RESEARCH AND DEVELOPMENT
, 1996
Cited by 60 (11 self)
Graph partitioning is a fundamental problem in several scientific and engineering applications. In this paper, we describe heuristics that improve the state-of-the-art practical algorithms used in graph-partitioning software in terms of both partitioning speed and quality. An important use of graph partitioning is in ordering sparse matrices for obtaining direct solutions to sparse systems of linear equations arising in engineering and optimization applications. The experiments reported in this paper show that the use of these heuristics results in a considerable improvement in the quality of sparse-matrix orderings over conventional ordering methods, especially for sparse matrices arising in linear programming problems. In addition, our graph-partitioning-based ordering algorithm is more parallelizable than minimum-degree-based ordering algorithms, and it renders the ordered matrix more amenable to parallel factorization.
Robust Ordering of Sparse Matrices using Multisection
 Department of Computer Science, York University
, 1996
Cited by 50 (2 self)
In this paper we provide a robust reordering scheme for sparse matrices. The scheme relies on the notion of multisection, a generalization of bisection. The reordering strategy is demonstrated to have consistently good performance in terms of fill reduction when compared with multiple minimum degree and generalized nested dissection. Experimental results show that by using multisection, we obtain an ordering which is consistently as good as or better than both for a wide spectrum of sparse problems.

1 Introduction

It is well recognized that finding a fill-reducing ordering is crucial to the success of the numerical solution of sparse linear systems. For symmetric positive-definite systems, the minimum degree [38] and the nested dissection [11] orderings are perhaps the most popular ordering schemes. They represent two opposite approaches to the ordering problem. However, they share a common undesirable characteristic. Both schemes produce generally good orderings, but the ordering qua...
E.: Improving the run time and quality of nested dissection ordering
 SIAM J. Sci. Comp
, 1998
On Improving the Performance of Sparse Matrix-Vector Multiplication
 In Proceedings of the International Conference on HighPerformance Computing
, 1997
Cited by 28 (0 self)
We analyze single-node performance of sparse matrix-vector multiplication by investigating issues of data locality and fine-grained parallelism. We examine the data-locality characteristics of the compressed-sparse-row representation and consider improvements in locality through matrix permutation. Motivated by potential improvements in fine-grained parallelism, we evaluate modified sparse-matrix representations. The results lead to general conclusions about improving single-node performance of sparse matrix-vector multiplication in parallel libraries of sparse iterative solvers.

1 Introduction

One of the core operations of iterative sparse solvers is sparse matrix-vector multiplication. In order to achieve high performance, a parallel implementation of sparse matrix-vector multiplication must maintain scalability. This scalability comes from a balanced mapping of the matrix and vectors among the distributed processors, a mapping that minimizes interprocessor communication. Load balan...
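For reference, sparse matrix-vector multiplication over the compressed-sparse-row representation discussed above can be sketched as follows; note the contiguous streaming of `vals` and `col_idx` versus the indirect, potentially cache-unfriendly accesses to `x` that motivate the paper's permutation study:

```python
def csr_spmv(row_ptr, col_idx, vals, x):
    """y = A @ x with A stored in compressed-sparse-row (CSR) form.
    The inner loop streams vals and col_idx contiguously (good spatial
    locality), while the gather from x is indirect -- the locality
    problem that matrix permutation attempts to mitigate."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += vals[k] * x[col_idx[k]]
        y[i] = s
    return y

# Hypothetical 3x3 matrix:  [[2, 0, 1],
#                            [0, 3, 0],
#                            [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals    = [2.0, 1.0, 3.0, 4.0, 5.0]
y = csr_spmv(row_ptr, col_idx, vals, [1.0, 1.0, 1.0])  # -> [3.0, 3.0, 9.0]
```

Permuting the matrix so that nonzeros cluster near the diagonal narrows the range of `x` indices touched per row, improving reuse of cached entries of `x`.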