Results 1 - 10 of 111
Multilevel Fast Multipole Algorithm for Solving Combined Field Integral Equation of Electromagnetic Scattering
, 1995
Cited by 188 (11 self)
The fast multipole method (FMM) has been implemented to speed up the matrix-vector multiply when an iterative method is used to solve the combined field integral equation (CFIE). FMM reduces the complexity from $O(N^2)$ to $O(N^{1.5})$. With a multilevel fast multipole algorithm (MLFMA), it is further reduced to $O(N \log N)$. A 110,592-unknown problem can be solved within 24 hours on a SUN Sparc10. 1. Introduction. The electromagnetic (EM) field scattered by a three-dimensional (3D) arbitrarily shaped conductor can be obtained by finding the solution of an integral equation where the unknown function is the induced current distribution. The integral equation is discretized into a matrix equation by the method of moments (MOM). The resultant matrix equation ... [Footnote: The authors would like to thank L. Hernquist, J.E. Barnes and P. Hut for providing us with copies of their codes, and thank M.B. Woodworth, M.G. Coté, and A.D. Yaghjian for providing us with their numerical and experimental data. This wor...]
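To put the three complexity classes quoted in this abstract side by side, the following is a rough sketch that evaluates the leading-order terms for the 110,592-unknown problem. The function name `matvec_cost_estimates`, the choice of a base-2 logarithm, and the decision to set every constant factor to 1 are assumptions made for illustration; the abstract gives asymptotic orders only, so the numbers indicate relative growth, not measured run times.

```python
import math


def matvec_cost_estimates(n):
    """Leading-order operation counts for one matrix-vector multiply.

    All constant factors are set to 1 and the logarithm is taken base 2;
    both are illustrative assumptions, not values from the paper.
    """
    return {
        "direct": n ** 2,            # dense multiply, O(N^2)
        "fmm": n ** 1.5,             # single-level FMM, O(N^1.5)
        "mlfma": n * math.log2(n),   # multilevel FMM, O(N log N)
    }


costs = matvec_cost_estimates(110_592)
for name, ops in costs.items():
    print(f"{name:>6}: {ops:.3e} operations (up to a constant factor)")
```

Under these assumptions the ordering direct > fmm > mlfma holds for this problem size; real constants differ between the methods and would shift the crossover points.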
A kernel-independent adaptive fast multipole algorithm in two and three dimensions
, 2003
Fast parametric elastic image registration
 IEEE Transactions on Image Processing
, 2003
Cited by 102 (8 self)
We present an algorithm for fast elastic multidimensional intensity-based image registration with a parametric model of the deformation. It is fully automatic in its default mode of operation. In the case of hard real-world problems, it is capable of accepting expert hints in the form of soft landmark constraints. Far fewer landmarks are needed and the results are far superior compared to pure landmark registration. Particular attention has been paid to the factors influencing the speed of this algorithm. The B-spline deformation model is shown to be computationally more efficient than other alternatives. The algorithm has been successfully used for several two-dimensional (2D) and three-dimensional (3D) registration tasks in the medical domain, involving MRI, SPECT, CT, and ultrasound image modalities. We also present experiments in a controlled environment, permitting an exact evaluation of the registration accuracy. Test deformations are generated automatically using a random hierarchical fractional wavelet-based generator. Index Terms: Elastic registration, image registration, landmarks, splines.
Radiation Boundary Condition for the Numerical Simulation of Waves
 Acta Numerica
, 1999
Cited by 90 (3 self)
We consider the efficient evaluation of accurate radiation boundary conditions for time-domain simulations of wave propagation on unbounded spatial domains. This issue has long been a primary stumbling block for the reliable solution of this important class of problems. In recent years, a number of new approaches have been introduced which have radically changed the situation. These include methods for the fast evaluation of the exact nonlocal operators in special geometries, novel sponge layers with reflectionless interfaces, and improved techniques for applying sequences of approximate conditions to higher order. For the primary isotropic, constant coefficient equations of wave theory, these new developments provide an essentially complete solution of the numerical radiation condition problem. In this paper the theory of exact boundary conditions for constant coefficient time-dependent problems is developed in detail, with many examples from physical applications. The theory is used to motivate various approximations and to establish error estimates. Complexity estimates are also derived to
A short course on fast multipole methods
 Wavelets, Multilevel Methods and Elliptic PDEs
, 1997
Cited by 63 (5 self)
In this series of lectures, we describe the analytic and computational foundations of fast multipole methods, as well as some of their applications. They are most easily understood, perhaps, in the case of particle simulations, where they reduce the cost of computing all pairwise interactions in a system of $N$ particles from $O(N^2)$ to $O(N)$ or $O(N \log N)$ operations. They are equally useful, however, in solving certain partial differential equations by first recasting them as integral equations. We will draw heavily from the existing literature, especially Greengard [23, 24, 25]; Greengard and Rokhlin [29, 32]; Greengard and Strain [34].
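For reference, the $O(N^2)$ baseline that fast multipole methods improve on can be sketched as a direct double loop over all particle pairs. This is a minimal illustration, not code from the lectures: the function name `direct_potentials` and the choice of a 1/r (Coulomb-type) kernel are assumptions, since the abstract does not name a specific kernel.

```python
import math


def direct_potentials(points, charges):
    """Evaluate all pairwise interactions directly in O(N^2) operations.

    Assumes a 1/r kernel (hypothetical choice for illustration) and skips
    the singular self-interaction term i == j.
    """
    n = len(points)
    phi = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                phi[i] += charges[j] / math.dist(points[i], points[j])
    return phi


# Three unit charges: particle 0 sees charges at distances 1 and 2.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, 1.0, 1.0]
phi = direct_potentials(points, charges)
```

The inner loop runs N - 1 times for each of the N targets, which is the quadratic cost that the hierarchical methods described in the abstract avoid.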
Skeletons from the Treecode Closet
 J. Comp. Phys
, 1994
Cited by 59 (12 self)
We consider treecodes (N-body programs which use a tree data structure) from the standpoint of their worst-case behavior. That is, we derive upper bounds on the largest possible errors that are introduced into a calculation by use of various multipole acceptability criteria (MAC). We find that the conventional Barnes-Hut MAC can introduce potentially unbounded errors unless $\theta < 1/\sqrt{3}$, and that this behavior, while rare, is demonstrable in astrophysically reasonable examples. We consider two other MACs closely related to the BH MAC. While they don't admit the same unbounded errors, they nevertheless require extraordinary amounts of CPU time to guarantee modest levels of accuracy. We derive new error bounds based on some additional, easily computed moments of the mass distribution. These error bounds form the basis for four new MACs which can be used to limit the absolute or relative error introduced by each multipole evaluation, or, with the introduction of some additional data struc...
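As a point of reference for the criterion discussed in this abstract, the conventional Barnes-Hut MAC is commonly stated as: accept a cell's multipole approximation when the ratio of the cell size to the distance from the evaluation point to the cell's center of mass is below the opening angle $\theta$. The sketch below encodes only that test. The function name `bh_mac_accepts` and its argument layout are hypothetical, and the new error-bound MACs derived in the paper are not reproduced here.

```python
import math


def bh_mac_accepts(cell_size, cell_com, point, theta):
    """Conventional Barnes-Hut acceptability test: size / distance < theta.

    cell_size: edge length of the cell
    cell_com:  center of mass of the cell (coordinate tuple)
    point:     location where the field is evaluated (coordinate tuple)
    theta:     opening-angle parameter
    Returns False when the point coincides with the center of mass.
    """
    d = math.dist(cell_com, point)
    return d > 0.0 and cell_size / d < theta


# A distant cell is accepted; a nearby one must be opened (subdivided).
far = bh_mac_accepts(1.0, (0.0, 0.0, 0.0), (10.0, 0.0, 0.0), theta=0.5)
near = bh_mac_accepts(1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), theta=0.5)
```

Because the test measures distance to the center of mass and not to the cell's geometric boundary, a point can be close to part of the cell yet far from its center of mass, which is the kind of configuration the abstract's worst-case analysis concerns.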
A particle method and adaptive treecode for vortex sheet motion in three-dimensional flow
 J. Comput. Phys
, 2001
Cited by 41 (5 self)
A particle method is presented for computing vortex sheet motion in three-dimensional flow. The particles representing the sheet are advected by a regularized Biot-Savart integral in which the exact singular kernel is replaced by the Rosenhead-Moore kernel. New particles are inserted to maintain resolution as the sheet rolls up. The particle velocities are evaluated by an adaptive treecode algorithm based on Taylor approximation in Cartesian coordinates, and the necessary Taylor coefficients are computed by a recurrence relation. The adaptive features include a divide-and-conquer evaluation strategy, nonuniform rectangular clusters, variable-order approximation, and a run-time choice between Taylor approximation and direct summation. Tests are performed to document the treecode's accuracy and efficiency. The method is applied to simulate the roll-up of a circular-disk vortex sheet into a vortex ring. Two examples are presented, azimuthal waves on a vortex ring and the merger of two vortex rings. © 2001 Academic Press. Key Words: particle method; adaptive treecode; vortex sheet; vortex ring; three-dimensional flow.
PROVABLY GOOD PARTITIONING AND LOAD BALANCING ALGORITHMS FOR PARALLEL ADAPTIVE N-BODY SIMULATION
, 1998
Cited by 37 (3 self)
We present an efficient and provably good partitioning and load balancing algorithm for parallel adaptive N-body simulation. The main ingredient of our method is a novel geometric characterization of a class of communication graphs that can be used to support hierarchical N-body methods such as the fast multipole method (FMM) and the Barnes-Hut method (BH). We show that communication graphs of these methods have a good partition that can be found efficiently sequentially and in parallel. In particular, we show that an N-body communication graph (either for BH or for FMM) can be partitioned into two subgraphs with equal computation load by removing only $O(\sqrt{n} \log n)$ and $O(n^{2/3} (\log n)^{1/3})$ nodes, respectively, for two and three dimensions. These bounds on node-partition imply bounds on edge-partition of $O(\sqrt{n} (\log n)^{3/2})$ and $O(n^{2/3} (\log n)^{4/3})$, respectively, for two and three dimensions. To the best of our knowledge, this is the first theoretical result on the quality of partitioning N-body communication graphs for nonuniformly distributed particles. Our results imply that parallel adaptive N-body simulation can be made as scalable as computation on regular grids and as efficient as parallel N-body simulation on uniformly distributed particles.
Rapid Evaluation Of Nonreflecting Boundary Kernels For Time-Domain Wave Propagation
 SIAM J. Numer. Anal
, 2000
Cited by 36 (4 self)
We present a systematic approach to the computation of exact nonreflecting boundary conditions for the wave equation. In both two and three dimensions, the critical step in our analysis involves convolution with the inverse Laplace transform of the logarithmic derivative of a Hankel function. The main technical result in this paper is that the logarithmic derivative of the Hankel function $H_\nu^{(1)}(z)$ of real order $\nu$ can be approximated in the upper half $z$-plane with relative error $\varepsilon$ by a rational function of degree $d \sim O\left(\log \nu \, \log \frac{1}{\varepsilon} + \log^2 \nu + \nu^{-1} \log^2 \frac{1}{\varepsilon}\right)$ as $\nu \to \infty$, $\varepsilon \to 0$, with slightly more complicated bounds for $\nu = 0$. If $N$ is the number of points used in the discretization of a cylindrical (circular) boundary in two dimensions, then, assuming that $\varepsilon < 1/N$, $O(N \log N \log \frac{1}{\varepsilon})$ work is required at each time step. This is comparable to the work required for the Fourier transform on the boundary. In three dimensions, the cost is proportional to N...
A New Parallel Kernel-Independent Fast Multipole Method
 in SC2003
Cited by 35 (12 self)
We present a new adaptive fast multipole algorithm and its parallel implementation. The algorithm is kernel-independent in the sense that the evaluation of pairwise interactions does not rely on any analytic expansions, but only utilizes kernel evaluations. The new method provides the enabling technology for many important problems in computational science and engineering. Examples include viscous flows, fracture mechanics and screened Coulombic interactions. Our MPI-based parallel implementation logically separates the computation and communication phases to avoid synchronization in the upward and downward computation passes, and thus allows us to fully exploit computation and communication overlapping. We measure isogranular and fixed-size scalability for a variety of kernels on the Pittsburgh Supercomputing Center's TCS-1 Alphaserver on up to 3000 processors. We have solved viscous flow problems with up to 2.1 billion unknowns and we have achieved 1.6 Tflops/s peak performance and 1.13 Tflops/s sustained performance.