Results 1  10
of
150
Numerical solution of saddle point problems
 ACTA NUMERICA
, 2005
"... Large linear systems of saddle point type arise in a wide variety of applications throughout computational science and engineering. Due to their indefiniteness and often poor spectral properties, such linear systems represent a significant challenge for solver developers. In recent years there has b ..."
Abstract

Cited by 322 (25 self)
 Add to MetaCart
(Show Context)
Large linear systems of saddle point type arise in a wide variety of applications throughout computational science and engineering. Due to their indefiniteness and often poor spectral properties, such linear systems represent a significant challenge for solver developers. In recent years there has been a surge of interest in saddle point problems, and numerous solution techniques have been proposed for solving this type of systems. The aim of this paper is to present and discuss a large selection of solution methods for linear systems in saddle point form, with an emphasis on iterative methods for large and sparse problems.
The Combinatorial BLAS: Design, Implementation, and Applications
, 2010
"... This paper presents a scalable highperformance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of highperformance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse mat ..."
Abstract

Cited by 58 (10 self)
 Add to MetaCart
This paper presents a scalable highperformance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of highperformance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to their irregular nature and low operational intensity. Many graph computations, however, contain sufficient coarse grained parallelism for thousands of processors, which can be uncovered by using the right primitives. We describe the Parallel Combinatorial BLAS, which consists of a small but powerful set of linear algebra primitives specifically targeting graph and data mining applications. We provide an extendible library interface and some guiding principles for future development. The library is evaluated using two important graph algorithms, in terms of both performance and easeofuse. The scalability and raw performance of the example applications, using the combinatorial BLAS, are unprecedented on distributed memory clusters.
DOLFIN: Automated finite element computing
, 2009
"... We describe here a library aimed at automating the solution of partial differential equations using the finite element method. By employing novel techniques for automated code generation, the library combines a high level of expressiveness with efficient computation. Finite element variational forms ..."
Abstract

Cited by 57 (6 self)
 Add to MetaCart
(Show Context)
We describe here a library aimed at automating the solution of partial differential equations using the finite element method. By employing novel techniques for automated code generation, the library combines a high level of expressiveness with efficient computation. Finite element variational forms may be expressed in near mathematical notation, from which lowlevel code is automatically generated, compiled and seamlessly integrated with efficient implementations of computational meshes and highperformance linear algebra. Easytouse objectoriented interfaces to the library are provided in the form of a C++ library and a Python module. This paper discusses the mathematical abstractions and methods used in the design of the library and its implementation. A number of examples are presented to demonstrate the use of the library in application code.
Liszt: A Domain Specific Language for Building Portable Meshbased PDE Solvers
"... Heterogeneous computers with processors and accelerators are becoming widespread in scientific computing. However, it is difficult to program hybrid architectures and there is no commonly accepted programming model. Ideally, applications should be written in a way that is portable to many platforms, ..."
Abstract

Cited by 48 (3 self)
 Add to MetaCart
(Show Context)
Heterogeneous computers with processors and accelerators are becoming widespread in scientific computing. However, it is difficult to program hybrid architectures and there is no commonly accepted programming model. Ideally, applications should be written in a way that is portable to many platforms, but providing this portability for general programs is a hard problem. By restricting the class of programs considered, we can make this portability feasible. We present Liszt, a domainspecific language for constructing meshbased PDE solvers. We introduce language statements for interacting with an unstructured mesh, and storing data at its elements. Program analysis of these statements enables our compiler to expose the parallelism, locality, and synchronization of Liszt programs. Using this analysis, we generate applications for multiple platforms: a cluster, an SMP, and a GPU. This approach allows Liszt applications to perform within 12 % of handwritten C++, scale to large clusters, and experience orderofmagnitude speedups on GPUs.
Language Virtualization for Heterogeneous Parallel Computing
"... As heterogeneous parallel systems become dominant, application developers are being forced to turn to an incompatible mix of low level programming models (e.g. OpenMP, MPI, CUDA, OpenCL). However, these models do little to shield developers from the difficult problems of parallelization, data decomp ..."
Abstract

Cited by 35 (10 self)
 Add to MetaCart
(Show Context)
As heterogeneous parallel systems become dominant, application developers are being forced to turn to an incompatible mix of low level programming models (e.g. OpenMP, MPI, CUDA, OpenCL). However, these models do little to shield developers from the difficult problems of parallelization, data decomposition and machinespecific details. Most programmers are having a difficult time using these programming models effectively. To provide a programming model that addresses the productivity and performance requirements for the average programmer, we explore a domainspecific approach to heterogeneous parallel programming. We propose language virtualization as a new principle that enables the construction of highly efficient parallel domain specific languages that are embedded in a common host language. We define criteria for language virtualization and present techniques to achieve them. We present two concrete case studies of domainspecific languages that are implemented using our virtualization approach.
A comparison of eigensolvers for largescale 3d modal analysis using amgpreconditioned iterative methods
 Int. Journal for Numerical Methods in Engg
"... ..."
(Show Context)
Algorithms and data structures for massively parallel generic adaptive finite element codes
 ACM Trans. Math. Softw
, 2011
"... Today’s largest supercomputers have 100,000s of processor cores and offer the potential to solve partial differential equations discretized by billions of unknowns. However, the complexity of scaling to such large machines and problem sizes has so far prevented the emergence of generic software libr ..."
Abstract

Cited by 27 (12 self)
 Add to MetaCart
Today’s largest supercomputers have 100,000s of processor cores and offer the potential to solve partial differential equations discretized by billions of unknowns. However, the complexity of scaling to such large machines and problem sizes has so far prevented the emergence of generic software libraries that support such computations, although these would lower the threshold of entry and enable many more applications to benefit from largescale computing. We are concerned with providing this functionality for meshadaptive finite element computations. We assume the existence of an “oracle ” that implements the generation and modification of an adaptive mesh distributed across many processors, and that responds to queries about its structure. Based on querying the oracle, we develop scalable algorithms and data structures for generic finite element methods. Specifically, we consider the parallel distribution of mesh data, global enumeration of degrees of freedom, constraints, and postprocessing. Our algorithms remove the bottlenecks that typically limit largescale adaptive finite element analyses. We demonstrate scalability of complete finite element workflows on up to 16,384 processors. An implementation of the proposed algorithms, based on the open source software p4est as mesh oracle, is provided
Localized hexagon patterns of the planar SwiftHohenberg equation
 SIAM J. Appl. Dyn. Syst
"... We investigate stationary spatially localized hexagon patterns of the twodimensional Swift–Hohenberg equation in the parameter region where the trivial state and regular hexagon patterns are both stable. Using numerical continuation techniques, we trace out the existence regions of fully localized ..."
Abstract

Cited by 26 (8 self)
 Add to MetaCart
(Show Context)
We investigate stationary spatially localized hexagon patterns of the twodimensional Swift–Hohenberg equation in the parameter region where the trivial state and regular hexagon patterns are both stable. Using numerical continuation techniques, we trace out the existence regions of fully localized hexagon patches and of planar pulses which consist of a strip filled with hexagons that is embedded in the trivial state. We find that these patterns exhibit snaking: for each parameter value in the snaking region, an infinite number of patterns exist that are connected in parameter space and whose width increases without bound. Our computations also indicate a relation between the limits of the snaking regions of planar hexagon pulses with different orientations and of the fully localized hexagon patches. To investigate which hexagons among the oneparameter family of hexagons are selected in a hexagon pulse or front, we derive a conserved quantity of the spatial dynamical system that describes planar patterns which are periodic in the transverse direction and use it to calculate the Maxwell curves along which the selected hexagons have the same energy as the trivial state. We find that the Maxwell curve lies within the snaking region as expected from heuristic arguments.
A Flexible OpenSource Toolbox for Scalable Complex Graph Analysis
, 2011
"... The Knowledge Discovery Toolbox (KDT) enables domain experts to perform complex analyses of huge datasets on supercomputers using a highlevel language without grappling with the difficulties of writing parallel code, calling parallel libraries, or becoming a graph expert. KDT provides a flexible Py ..."
Abstract

Cited by 21 (3 self)
 Add to MetaCart
(Show Context)
The Knowledge Discovery Toolbox (KDT) enables domain experts to perform complex analyses of huge datasets on supercomputers using a highlevel language without grappling with the difficulties of writing parallel code, calling parallel libraries, or becoming a graph expert. KDT provides a flexible Python interface to a small set of highlevel graph operations; composing a few of these operations is often sufficient for a specific analysis. Scalability and performance are delivered by linking to a stateoftheart backend compute engine that scales from laptops to large HPC clusters. KDT delivers very competitive performance from a generalpurpose, reusable library for graphs on the order of 10 billion edges and greater. We demonstrate speedup of 1 and 2 orders of magnitude over PBGL and Pegasus, respectively, on some tasks. Examples from simple use cases and key graphanalytic benchmarks illustrate the productivity and performance realized by KDT users. Semantic graph abstractions provide both flexibility and high performance for realworld use cases. Graphalgorithm researchers benefit from the ability to develop algorithms quickly using KDT’s graph and underlying matrix abstractions for distributed memory. KDT is available as opensource code to foster experimentation.