A CAD-Free Approach to High-Fidelity Aerostructural Optimization
 Proceedings of the 13th AIAA/ISSMO Multidisciplinary Analysis Optimization Conference, Fort Worth, TX, Sept. 2010, AIAA
Cited by 16 (12 self)
Geometry parametrization for high-fidelity multidisciplinary optimization is an important and complex problem. We present a CAD-free geometry parametrization method using a free-form deformation volume approach. This approach yields several important advantages over other parametrization techniques, the most important of which is the efficient computation of analytic derivatives for gradient-based optimization. A parallel, hybrid, algebraic-linear-elasticity mesh perturbation scheme that produces high-quality perturbed meshes with low computational effort is also presented. We couple an Euler CFD solver with a finite-element model that uses fourth-order degenerate shell elements. As a demonstration problem, we perform the aerostructural redesign of a subsonic wing for transonic flight conditions. We show that this optimization problem captures some of the complex multidisciplinary tradeoffs inherent in wing design.
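To illustrate why a free-form deformation (FFD) parametrization gives cheap analytic derivatives, here is a minimal sketch, assuming a 2D lattice with Bernstein (Bezier) basis functions. The paper's actual basis, dimensionality, and implementation are not given in the abstract, so the function names and the choice of basis below are hypothetical. The point is only that an embedded point is a linear combination of control points, so its derivative with respect to a control point is just the basis weight.

```python
from math import comb

def bernstein(n, i, u):
    """Bernstein basis polynomial B_{i,n}(u) on [0, 1]."""
    return comb(n, i) * u**i * (1.0 - u) ** (n - i)

def ffd_point(control, u, v):
    """Map parametric coordinates (u, v) through a 2D Bezier FFD lattice.

    control[i][j] is the (x, y) position of control point (i, j).
    The Bezier basis is an assumption made for this sketch.
    """
    n, m = len(control) - 1, len(control[0]) - 1
    x = y = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            w = bernstein(n, i, u) * bernstein(m, j, v)
            x += w * control[i][j][0]
            y += w * control[i][j][1]
    return x, y

def ffd_sensitivity(n, m, u, v):
    """Analytic derivative of the embedded point w.r.t. each control point.

    Because the map is linear in the control points, the derivative is
    the basis weight itself and does not depend on the control positions.
    """
    return [[bernstein(n, i, u) * bernstein(m, j, v) for j in range(m + 1)]
            for i in range(n + 1)]
```

With an undeformed, uniformly spaced lattice the map reproduces the input coordinates, and moving one control point shifts the embedded point by exactly the corresponding weight times the displacement.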
On large scale diagonalization techniques for the Anderson model of localization
 SIAM REVIEW
, 2005
Cited by 14 (7 self)
We propose efficient preconditioning algorithms for an eigenvalue problem arising in quantum physics, namely the computation of a few interior eigenvalues and their associated eigenvectors for the largest sparse real and symmetric indefinite matrices of the Anderson model of localization. We compare the Lanczos algorithm in the 1987 implementation by Cullum and Willoughby with the shift-and-invert techniques in the implicitly restarted Lanczos method and in the Jacobi-Davidson method. Our preconditioning approaches for the shift-and-invert symmetric indefinite linear system are based on maximum weighted matchings and algebraic multilevel incomplete LDL^T factorizations. These techniques can be seen as a complement to the alternative idea of using more complete pivoting techniques for the highly ill-conditioned symmetric indefinite Anderson matrices. We demonstrate the effectiveness and the numerical accuracy of these algorithms. Our numerical examples reveal that recent algebraic multilevel preconditioning solvers can accelerate the computation of a large-scale eigenvalue problem corresponding to the Anderson model of localization by several orders of magnitude.
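The shift-and-invert idea can be sketched on a toy problem. The example below is not the paper's method: it uses plain inverse iteration instead of Lanczos or Jacobi-Davidson, a dense direct solve instead of a preconditioned iterative solver, and a small 1D chain instead of the large 3D Anderson matrices. All function names are hypothetical. It only shows why repeatedly solving with (A - sigma I) pulls out the eigenvalue closest to the shift sigma, and why that linear solve is the step worth preconditioning.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense n)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (m[k][n] - s) / m[k][k]
    return x

def anderson_1d(n, w, seed=0):
    """Toy 1D Anderson Hamiltonian: unit hopping, random diagonal in [-w/2, w/2]."""
    rng = random.Random(seed)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = rng.uniform(-w / 2, w / 2)
        if i + 1 < n:
            a[i][i + 1] = a[i + 1][i] = 1.0
    return a

def shift_invert(a, sigma, iters=100):
    """Inverse iteration on (A - sigma I); returns (eigenvalue, eigenvector)."""
    n = len(a)
    shifted = [[a[i][j] - (sigma if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [1.0 + i for i in range(n)]  # non-symmetric start vector
    for _ in range(iters):
        y = solve(shifted, x)        # the expensive step in large problems
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in a]
    return sum(p * q for p, q in zip(ax, x)), x  # Rayleigh quotient
```

For example, `shift_invert(anderson_1d(9, 0.0), 0.5)` targets the eigenvalue of the disorder-free chain nearest 0.5; convergence slows when two eigenvalues are almost equally close to the shift.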
Dendro: Parallel algorithms for multigrid and AMR methods on 2:1 balanced octrees
Cited by 14 (7 self)
In this article, we present Dendro, a suite of parallel algorithms for the discretization and solution of partial differential equations involving second-order elliptic operators. Dendro uses trilinear finite element discretizations constructed using octrees. Dendro, which is built on top of PETSc (Argonne National Laboratory), comprises four main modules: a bottom-up octree generation and 2:1 balancing module, a meshing module, a geometric multiplicative multigrid module, and a module for adaptive mesh refinement (AMR). The first two components constitute prior work that we have published elsewhere. Here, we focus on the multigrid and AMR modules. The key features of Dendro are coarsening/refinement, inter-octree transfers of scalar and vector fields, and parallel partition of multilevel octree forests. We describe an algorithm for constructing the coarser multigrid levels starting with an arbitrary 2:1 balanced fine-grid octree discretization. Also, we describe matrix-free implementations for the discretized finite element operators and the intergrid transfer operations. The current implementation of Dendro is most appropriate for problems with smooth variable coefficients. We present scalability results for a Poisson problem, a linear elastostatics problem, and for a time-dependent heat equation. We use the first two equations to illustrate the effectiveness of the multigrid solver. We use the third equation to illustrate the performance of the AMR components. We present results on up
Sparse matrices in Matlab*P: Design and implementation
 In HiPC
, 2004
Cited by 13 (10 self)
Matlab*P is a flexible interactive system that enables computational scientists and engineers to use a high-level language to program cluster computers. The Matlab*P user writes code in the Matlab language. Parallelism is available via data-parallel operations on distributed objects and via task-parallel operations on multiple objects. Matlab*P can store distributed matrices in either full or sparse format. As in Matlab, most matrix operations apply equally to full or sparse operands. Here, we describe the design and implementation of Matlab*P’s sparse matrix support, and an application to a problem in computational fluid dynamics.
HYPERGRAPH-BASED UNSYMMETRIC NESTED DISSECTION ORDERING FOR SPARSE LU FACTORIZATION
Cited by 13 (2 self)
In this paper we present HUND, a hypergraph-based unsymmetric nested dissection ordering algorithm for reducing the fill-in incurred during Gaussian elimination. HUND has several important properties. It takes a global perspective of the entire matrix, as opposed to local heuristics. It takes into account the asymmetry of the input matrix by using a hypergraph to represent its structure. It is suitable for performing Gaussian elimination in parallel, with partial pivoting. This is possible because the row permutations performed due to partial pivoting do not destroy the column separators identified by the nested dissection approach. Experimental results on 27 medium- and large-size highly unsymmetric matrices compare HUND to four other well-known reordering algorithms. The results show that HUND provides a robust reordering algorithm, in the sense that it is the best or close to the best (often within 10%) of all the other methods.
An EScheduler-Based Data Dependence Analysis and Task Scheduling for Parallel Circuit Simulation
 TCAS-II
, 2011
Cited by 12 (9 self)
The sparse matrix solver has become the bottleneck
Communication Requirements and Interconnect Optimization for High-End Scientific Applications
, 2009
Cited by 11 (2 self)
The path towards realizing next-generation petascale and exascale computing is increasingly dependent on building supercomputers with unprecedented numbers of processors. To prevent the interconnect from dominating the overall cost of these ultrascale systems, there is a critical need for scalable interconnects that capture the communication requirements of ultrascale applications. It is therefore essential to understand high-end application communication characteristics across a broad spectrum of computational methods, and utilize that insight to tailor interconnect designs to the specific requirements of the underlying codes. This work makes several unique contributions towards attaining that goal. First, we conduct one of the broadest studies to date of high-end application communication requirements, whose computational methods include: finite-difference, lattice-Boltzmann, particle-in-cell, sparse linear algebra, particle-mesh Ewald, and FFT-based solvers. Using derived communication characteristics, we next present the fit-tree approach for designing network infrastructure that is tailored to application requirements. The fit-tree minimizes the component count of an interconnect without impacting application performance compared to a fully connected network. Finally, we propose a methodology for reconfigurable networks to implement fit-tree solutions. Our Hybrid Flexibly Assignable Switch Topology (HFAST) infrastructure uses both passive (circuit) and active (packet) commodity switch components to dynamically reconfigure interconnects to suit the topological requirements of scientific applications. Overall, our exploration points to several promising directions for practically addressing the interconnect requirements of future ultrascale systems.
A parallel geometric multigrid method for finite elements on octree meshes
, 2008
Cited by 10 (4 self)
In this article, we present a parallel geometric multigrid algorithm for solving elliptic partial differential equations (PDEs) on octree-based conforming finite element discretizations. We describe an algorithm for constructing the coarser multigrid levels starting with an arbitrary 2:1 balanced fine-grid octree discretization. We also describe matrix-free implementations for the discretized finite element operators and the intergrid transfer operations. The key component of our scheme is an octree meshing algorithm, which handles “hanging” vertices in a manner that naturally supports conforming trilinear shape functions. Our MPI-based implementation has scaled to billions of elements on thousands of processors on the Cray XT3 MPP system “Bigben” at the Pittsburgh Supercomputing Center (PSC) and the Intel 64 Linux Cluster “Abe” at the National Center for Supercomputing Applications (NCSA). Although we do not discuss adaptive mesh refinement here, the proposed method can be used efficiently in such problems since it has a low setup cost.
Stable generalized finite element method (SGFEM)
 Comput. Methods Appl. Mech. Engrg
A parallel distributed solver for large dense symmetric systems: applications to geodesy and electromagnetism problems
 Int. J. of High Performance Computing Applications
Cited by 8 (4 self)
In this paper we describe the parallel distributed implementation of a linear solver for large-scale applications involving real symmetric positive definite or complex symmetric non-Hermitian dense systems. The advantage of this routine is that it performs a Cholesky factorization by requiring half the storage needed by the standard parallel libraries ScaLAPACK and PLAPACK. Our solver uses a J-variant Cholesky algorithm and a one-dimensional block-cyclic column data distribution but gives similar Gigaflops performance when applied to problems that can be solved on moderately parallel computers with up to 32 processors. Experiments and performance comparisons with ScaLAPACK and PLAPACK on our target applications are presented. These applications arise from the Earth’s gravity field recovery and computational electromagnetics.
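The storage saving and the data layout mentioned above can be sketched serially. The code below is a rough illustration under stated assumptions, not the authors' solver: it assumes "J-variant" means a left-looking, column-by-column ordering, it stores only the lower triangle in a packed one-dimensional array (n(n+1)/2 entries instead of n^2), and it handles only the real symmetric positive definite case. The function names and the `block` and `nprocs` parameters are hypothetical.

```python
import math

def packed_index(i, j, n):
    """Position of entry (i, j), with i >= j, in column-major packed lower storage."""
    return j * n - j * (j - 1) // 2 + (i - j)

def pack_lower(a):
    """Copy the lower triangle of a dense symmetric matrix into packed storage."""
    n = len(a)
    return [a[i][j] for j in range(n) for i in range(j, n)]

def cholesky_packed(p, n):
    """In-place left-looking Cholesky on a packed lower triangle.

    On return p holds L such that A = L L^T. Assumes A is real
    symmetric positive definite.
    """
    for j in range(n):
        # Update column j with contributions from all previous columns.
        for k in range(j):
            ljk = p[packed_index(j, k, n)]
            for i in range(j, n):
                p[packed_index(i, j, n)] -= p[packed_index(i, k, n)] * ljk
        # Scale column j by the square root of its diagonal entry.
        d = math.sqrt(p[packed_index(j, j, n)])
        p[packed_index(j, j, n)] = d
        for i in range(j + 1, n):
            p[packed_index(i, j, n)] /= d
    return p

def column_owner(j, block, nprocs):
    """Process that owns column j in a 1D block-cyclic column distribution."""
    return (j // block) % nprocs
```

For a 3 x 3 matrix the packed array has 6 entries rather than 9. With `block=2` and `nprocs=3`, columns 0-1 go to process 0, columns 2-3 to process 1, columns 4-5 to process 2, and columns 6-7 wrap around to process 0.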