### Improved Graph Laplacian via Geometric Self-Consistency

Abstract

We address the problem of setting the kernel bandwidth used by Manifold Learning algorithms to construct the graph Laplacian. Exploiting the connection between manifold geometry, represented by the Riemannian metric, and the Laplace-Beltrami operator, we set the bandwidth by optimizing the Laplacian’s ability to preserve the geometry of the data. Experiments show that this principled approach is effective and robust.
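The bandwidth-selection idea can be sketched as follows. This is a minimal illustration, not the paper's actual criterion: it builds a random-walk graph Laplacian for several candidate bandwidths and scores each by how far the Laplacian's action on the coordinate functions deviates from zero on a flat patch (a stand-in for the geometric self-consistency objective; the function names and the `distortion` score are illustrative assumptions).

```python
import numpy as np

def graph_laplacian(X, eps):
    """Random-walk graph Laplacian L = I - D^{-1} K from a Gaussian kernel."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    K = np.exp(-D2 / eps)
    d = K.sum(axis=1)
    return np.eye(len(X)) - K / d[:, None]

def distortion(X, eps):
    """Illustrative self-consistency score (NOT the paper's Riemannian-metric
    criterion): on a flat patch the Laplacian should nearly annihilate the
    coordinate functions, so a smaller norm indicates a better bandwidth."""
    L = graph_laplacian(X, eps)
    return np.linalg.norm(L @ X)

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))            # points sampled from a flat 2-D patch
candidates = [0.01, 0.05, 0.1, 0.5]
best = min(candidates, key=lambda e: distortion(X, e))
```

The key design point is that the bandwidth is chosen by a geometric criterion evaluated on the data itself, rather than by a generic heuristic such as the median pairwise distance.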

### Estimating Vector Fields on Manifolds and the Embedding of Directed Graphs

Abstract

This paper considers the problem of embedding directed graphs in Euclidean space while retaining directional information. We model a directed graph as a finite set of observations from a diffusion on a manifold endowed with a vector field. This is the first generative model of its kind for directed graphs. We introduce a graph embedding algorithm that estimates all three features of this model: the low-dimensional embedding of the manifold, the data density, and the vector field. In the process, we also obtain new theoretical results on the limits of “Laplacian-type” matrices derived from directed graphs. The application of our method to both artificially constructed and real data highlights its strengths.
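A rough sketch of the underlying idea, under stated assumptions (the weighting scheme below is illustrative, not the authors' algorithm): samples from a drift diffusion yield asymmetric pairwise affinities, and splitting the affinity matrix into symmetric and antisymmetric parts separates the manifold geometry from the vector field.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt, eps = 300, 0.01, 0.1
# Drifted diffusion on the circle: drift 1.0, unit diffusion coefficient.
theta = np.cumsum(1.0 * dt + np.sqrt(dt) * rng.normal(size=n))
X = np.column_stack([np.cos(theta), np.sin(theta)])

D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-D2 / eps)                       # symmetric base kernel

# Directed affinity (illustrative): weight each edge by the alignment of the
# displacement x_j - x_i with the observed motion at x_i.
V = np.diff(X, axis=0, append=X[-1:])       # finite-difference velocities
align = np.einsum('ijk,ik->ij', X[None, :, :] - X[:, None, :], V)
W = K * (1.0 + np.clip(align, -0.9, 0.9))   # asymmetric kernel, entries > 0

S = 0.5 * (W + W.T)    # symmetric part: geometry (Laplacian-type matrix)
A = 0.5 * (W - W.T)    # antisymmetric part: encodes the vector field
```

The decomposition is the conceptual point: `S` plays the role of the usual graph Laplacian input, while the directional information the abstract refers to lives entirely in `A`.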

### Variable Bandwidth Diffusion Kernels

Abstract
A practical limitation of operator estimation via kernels is the assumption of a compact manifold. In practice we are often interested in data sets whose sampling density may be arbitrarily small, which implies that the data lies on an open set and cannot be modeled as a compact manifold. In this paper, we show that this limitation can be overcome by varying the bandwidth of the kernel spatially. We present an asymptotic expansion of these variable bandwidth kernels for arbitrary bandwidth functions, generalizing the theory of Diffusion Maps and Laplacian Eigenmaps. Subsequently, we present error estimates for the corresponding discrete operators, which reveal how the small sampling density leads to large errors, particularly for fixed bandwidth kernels. By choosing a bandwidth function inversely proportional to the sampling density (which can be estimated from data), we are able to control these error estimates uniformly over a non-compact manifold, assuming only fast decay of the density at infinity in the ambient space. We numerically verify these results by constructing the generator of the Ornstein-Uhlenbeck process on the real line using data sampled independently from the invariant measure. In this example, we find that the fixed bandwidth kernels yield reasonable approximations for small data sets when the bandwidth is carefully tuned; however, these approximations actually degrade as the amount of data is increased. On the other hand, an operator approximation based on a variable bandwidth kernel does converge in the limit of large data, and for small data sets it exhibits reduced sensitivity to bandwidth selection. Moreover, even for compact manifolds, variable bandwidth kernels give better approximations with reduced dependence on bandwidth selection. These results extend the classical statistical theory of variable bandwidth density estimation to operator approximation.
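The core construction can be sketched in a few lines, on the abstract's own example of Gaussian samples on the real line. This is a minimal illustration under assumed choices: the pilot density estimator and the exponent -1/2 in the bandwidth function are picked for concreteness (the paper analyzes general bandwidth functions).

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=400)                  # samples from a Gaussian invariant measure
D2 = (x[:, None] - x[None, :]) ** 2

# Pilot density estimate via a fixed-bandwidth kernel sum (illustrative choice).
rho = np.exp(-D2 / 0.5).sum(axis=1)
rho /= rho.sum()

# Variable bandwidth: larger where the density is small. Here the bandwidth
# function is rho^{-1/2}, one assumed instance of "inversely proportional to
# the sampling density".
eps = 0.05
bw = rho ** -0.5
K = np.exp(-D2 / (eps * bw[:, None] * bw[None, :]))  # variable bandwidth kernel
```

Compared with a fixed-bandwidth kernel `np.exp(-D2 / eps)`, the matrix `K` keeps meaningful affinities between neighboring points far out in the tails, where the fixed-bandwidth entries would underflow to zero.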

### SIAM J. Applied Dynamical Systems, Vol. 0, No. 0, pp. 000–000, © 20XX SIAM. Published by SIAM under the terms of the Creative Commons 4.0 license

Abstract
We present a family of kernels for analysis of data generated by dynamical systems. These so-called cone kernels feature a dependence on the dynamical vector field operating in the phase space manifold, estimated empirically through finite differences of time-ordered data samples. In particular, cone kernels assign strong affinity to pairs of samples whose relative displacement vector lies within a narrow cone aligned with the dynamical vector field. The outcome of this explicit dependence on the dynamics is that, in a suitable asymptotic limit, Laplace–Beltrami operators for data analysis constructed from cone kernels generate diffusions along the integral curves of the dynamical vector field. This property is independent of the observation modality, and endows these operators with invariance under a weakly restrictive class of transformations of the data (including conformal transformations), while it also enhances their capability to extract intrinsic dynamical timescales via eigenfunctions. Here, we study these features by establishing the Riemannian metric tensor induced by cone kernels in the limit of large data. We find that the corresponding Dirichlet energy is governed by the directional derivative of functions along the dynamical vector field, giving rise to a measure of roughness of functions that favors slowly varying observables. We demonstrate the utility of cone kernels in nonlinear flows on the 2-torus and North Pacific sea surface temperature data generated by a comprehensive climate model.
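The cone idea can be illustrated with a simplified affinity. This is not the paper's exact kernel formula: the sketch below merely inflates the effective bandwidth for pairs whose displacement aligns with the empirical velocity, which reproduces the qualitative behavior described above (`zeta` controlling the cone opening is an assumed parameterization).

```python
import numpy as np

def cone_affinity(X, eps=0.1, zeta=0.99):
    """Simplified cone-style affinity (illustrative, not the published formula):
    pairs whose displacement aligns with the finite-difference velocity get an
    effectively larger bandwidth, hence stronger affinity."""
    V = np.diff(X, axis=0, append=X[-1:])      # velocities from time-ordered samples
    disp = X[None, :, :] - X[:, None, :]       # disp[i, j] = x_j - x_i
    d2 = (disp ** 2).sum(-1)
    dot = np.einsum('ijk,ik->ij', disp, V)
    denom = d2 * (V ** 2).sum(-1)[:, None]
    cos2 = np.divide(dot ** 2, denom, out=np.zeros_like(d2), where=denom > 0)
    # zeta -> 1 narrows the cone: only well-aligned displacements keep the
    # reduced exponent and thus a large affinity.
    return np.exp(-d2 * (1.0 - zeta * cos2) / eps)

t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
X = np.column_stack([np.cos(t), np.sin(t)])    # uniform flow on the circle
K = cone_affinity(X)
```

On this circular flow, consecutive samples (displacement parallel to the velocity) receive affinity close to one, while antipodal pairs (displacement nearly orthogonal to the local velocity, and far apart) receive an affinity that is effectively zero.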

### A Variational Approach to the Consistency of Spectral Clustering (2015)

Abstract

This paper establishes the consistency of spectral approaches to data clustering. We consider clustering of point clouds obtained as samples of a ground-truth measure. A graph representing the point cloud is obtained by assigning weights to edges based on the distance between the points they connect. We investigate the spectral convergence of both unnormalized and normalized graph Laplacians towards the appropriate operators in the continuum domain. We obtain sharp conditions on how the connectivity radius can be scaled with respect to the number of sample points for the spectral convergence to hold. We also show that the discrete clusters obtained via spectral clustering converge towards a continuum partition of the ground-truth measure. This continuum partition minimizes a functional describing the continuum analogue of graph-based spectral partitioning. Our approach, based on variational convergence, is general and flexible.
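The discrete pipeline whose continuum limit is analyzed here can be sketched in a few lines. A minimal bipartitioning example, with one simplification: the abstract's construction weights edges within a connectivity radius, while this sketch uses Gaussian weights on all pairs for numerical convenience; the blob positions and bandwidth are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
# Point cloud sampled from a two-component ground-truth measure.
A = rng.normal(size=(100, 2)) * 0.3
B = rng.normal(size=(100, 2)) * 0.3 + np.array([4.0, 0.0])
X = np.vstack([A, B])

# Weighted graph on the point cloud (Gaussian weights in place of a hard radius).
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-D2)
np.fill_diagonal(W, 0.0)

# Unnormalized graph Laplacian and its second eigenvector (Fiedler vector).
L = np.diag(W.sum(axis=1)) - W
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]

# Discrete clusters: threshold the Fiedler vector at its median.
labels = (fiedler > np.median(fiedler)).astype(int)
```

The consistency result concerns exactly this pipeline: as the number of samples grows (with the connectivity radius scaled appropriately), the partition recovered from the Laplacian's low eigenvectors converges to the optimal continuum partition.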