@MISC{Berrya_variablebandwidth, author = {Tyrus Berry and John Harlim}, title = {Variable Bandwidth Diffusion Kernels}, year = {} }

Abstract

A practical limitation of operator estimation via kernels is the assumption of a compact manifold. In practice we are often interested in data sets whose sampling density may be arbitrarily small, which implies that the data lies on an open set and cannot be modeled as a compact manifold. In this paper, we show that this limitation can be overcome by varying the bandwidth of the kernel spatially. We present an asymptotic expansion of these variable bandwidth kernels for arbitrary bandwidth functions, generalizing the theory of Diffusion Maps and Laplacian Eigenmaps. Subsequently, we present error estimates for the corresponding discrete operators, which reveal how a small sampling density leads to large errors, particularly for fixed bandwidth kernels. By choosing a bandwidth function inversely proportional to the sampling density (which can be estimated from data), we are able to control these error estimates uniformly over a non-compact manifold, assuming only fast decay of the density at infinity in the ambient space. We numerically verify these results by constructing the generator of the Ornstein-Uhlenbeck process on the real line using data sampled independently from the invariant measure. In this example, we find that the fixed bandwidth kernels yield reasonable approximations for small data sets when the bandwidth is carefully tuned; however, these approximations actually degrade as the amount of data is increased. On the other hand, an operator approximation based on a variable bandwidth kernel does converge in the limit of large data, and for small data sets it exhibits reduced sensitivity to bandwidth selection. Moreover, even for compact manifolds, variable bandwidth kernels give better approximations with reduced dependence on bandwidth selection. These results extend the classical statistical theory of variable bandwidth density estimation to operator approximation.
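The construction described in the abstract can be illustrated with a minimal sketch in Python. It is not the authors' implementation: it only shows a bandwidth function that grows where the estimated sampling density is small, applied to points drawn from a standard normal distribution (assumed here as the invariant measure of an Ornstein-Uhlenbeck process). The function names, the pilot bandwidth `h`, the scale `eps`, the exponent `beta`, and the factor 4 in the kernel exponent are all illustrative assumptions. The plain row normalization stands in for whatever normalizations the paper uses to recover the generator.

```python
import numpy as np

def gaussian_kde_1d(x, h):
    """Fixed-bandwidth Gaussian density estimate, evaluated at the samples themselves."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * h**2)).sum(axis=1) / (len(x) * h * np.sqrt(2.0 * np.pi))

def variable_bandwidth_markov(x, eps, beta=-1.0, h=0.3):
    """Build a variable bandwidth kernel and a row-stochastic matrix from it.

    beta < 0 makes the bandwidth function rho large where the estimated
    density q is small (beta = -1 is the literal "inversely proportional"
    choice; the exponent is an assumption and can be tuned).
    """
    q = gaussian_kde_1d(x, h)          # pilot density estimate, strictly positive
    rho = q ** beta                    # bandwidth function
    d2 = (x[:, None] - x[None, :]) ** 2
    # Symmetric kernel: the local scale is eps * rho(x_i) * rho(x_j).
    K = np.exp(-d2 / (4.0 * eps * rho[:, None] * rho[None, :]))
    # Simple row normalization (assumed here; it omits any density
    # correction needed to approximate a specific generator).
    P = K / K.sum(axis=1, keepdims=True)
    return q, rho, K, P

rng = np.random.default_rng(0)
x = np.sort(rng.standard_normal(500))  # samples from N(0, 1), sorted for inspection
q, rho, K, P = variable_bandwidth_markov(x, eps=0.01)

mid = len(x) // 2
print("bandwidth function at leftmost sample:", rho[0])
print("bandwidth function at median sample:  ", rho[mid])
```

Because the samples are sorted, `rho[0]` belongs to a point in the sparsely sampled tail and `rho[mid]` to a point near the mode, so comparing the two shows how the kernel widens where data are scarce. A fixed bandwidth kernel corresponds to `beta=0`, which makes `rho` identically 1.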