Results 1–10 of 99
Random projections of smooth manifolds
Foundations of Computational Mathematics, 2006
Abstract

Cited by 144 (26 self)
We propose a new approach for nonadaptive dimensionality reduction of manifold-modeled data, demonstrating that a small number of random linear projections can preserve key information about a manifold-modeled signal. We center our analysis on the effect of a random linear projection operator Φ: R^N → R^M, M < N, on a smooth, well-conditioned K-dimensional submanifold M ⊂ R^N. As our main theoretical contribution, we establish a sufficient number M of random projections to guarantee that, with high probability, all pairwise Euclidean and geodesic distances between points on M are well preserved under the mapping Φ. Our results bear strong resemblance to the emerging theory of Compressed Sensing (CS), in which sparse signals can be recovered from small numbers of random linear measurements. As in CS, the random measurements we propose can be used to recover the original data in R^N. Moreover, like the fundamental bound in CS, our requisite M is linear in the “information level” K and logarithmic in the ambient dimension N; we also identify a logarithmic dependence on the volume and conditioning of the manifold. In addition to recovering faithful approximations to manifold-modeled signals, the random projections we propose can also be used to discern key properties of the manifold. We discuss connections and contrasts with existing techniques in manifold learning, a setting where dimensionality-reducing mappings are typically nonlinear and constructed adaptively from a set of sampled training data.
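As an illustrative sketch (not the paper's construction), the distance-preservation effect of a random Gaussian projection Φ: R^N → R^M can be checked numerically in Python; the circle used as the manifold, and all dimensions and sample sizes below, are arbitrary choices for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample points from a 1-D manifold (a circle) embedded in R^N.
N, M, n_pts = 200, 40, 100
t = np.linspace(0, 2 * np.pi, n_pts, endpoint=False)
X = np.zeros((n_pts, N))
X[:, 0], X[:, 1] = np.cos(t), np.sin(t)

# Random linear projection Phi: R^N -> R^M, scaled so that
# squared norms are preserved in expectation.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
Y = X @ Phi.T

# Compare all pairwise Euclidean distances before and after projection.
def pdist(A):
    d = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)
    return d[np.triu_indices(len(A), k=1)]

ratio = pdist(Y) / pdist(X)
print(ratio.min(), ratio.max())  # both concentrated near 1
```

Even with M far below N, every pairwise distance is distorted by only a small factor, which is the qualitative content of the embedding guarantee.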
Maximum likelihood estimation of intrinsic dimension
In Advances in Neural Information Processing Systems, 2005
Abstract

Cited by 143 (7 self)
We propose a new method for estimating the intrinsic dimension of a dataset, derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically and by simulations, and apply it to a number of simulated and real datasets. We also show that it has the best overall performance compared with two other intrinsic dimension estimators.
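A minimal Python sketch of this style of estimator, in its well-known k-nearest-neighbor form (parameter choices here are illustrative, not the paper's):

```python
import numpy as np

def mle_intrinsic_dim(X, k=10):
    """MLE-style intrinsic dimension estimate from k-NN distances.

    X: (n, D) array. Averages log(T_k / T_j) over the k-1 nearest
    neighbors; under a local Poisson approximation this mean has
    expectation 1/d, so its reciprocal estimates the dimension d.
    """
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    T = np.sort(D, axis=1)[:, 1:k + 1]       # drop the zero self-distance
    inv_m = np.log(T[:, -1:] / T[:, :-1]).mean(axis=1)  # per-point 1/d
    return 1.0 / inv_m.mean()

rng = np.random.default_rng(1)
# A 2-D plane embedded in R^10: the estimate should come out near 2.
pts = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 10))
print(mle_intrinsic_dim(pts, k=10))
```

The brute-force distance matrix keeps the sketch short; a real implementation would use a neighbor-search structure for large n.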
Nearest-neighbor searching and metric space dimensions
In Nearest-Neighbor Methods for Learning and Vision: Theory and Practice, 2006
Abstract

Cited by 107 (0 self)
Given a set S of n sites (points) and a distance measure d, the nearest neighbor searching problem is to build a data structure so that, given a query point q, the site nearest to q can be found quickly. This paper gives a data structure for this problem; the data structure is built using the distance function as a “black box”. The structure is able to speed up nearest neighbor searching in a variety of settings, for example: points in low-dimensional or structured Euclidean space, strings under Hamming and edit distance, and bit vector data from an OCR application. The data structures are observed to need linear space, with a modest constant factor. The preprocessing time needed per site is observed to match the query time. The data structure can be viewed as an application of a “k-d tree” approach in the metric space setting, using Voronoi regions of a subset in place of axis-aligned boxes.
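To illustrate black-box use of a metric, here is a compact Python sketch of a vantage-point tree, a related metric data structure (this is not the paper's structure; all names are ours). The distance function is called only as an opaque black box, and subtrees are pruned using nothing but the triangle inequality:

```python
import random

def build_vp(points, dist):
    """Build a vantage-point tree; dist is used only as a black box."""
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return (vp, 0.0, None, None)
    ds = [dist(vp, p) for p in rest]
    mu = sorted(ds)[len(ds) // 2]          # median split radius
    inner = [p for p, d in zip(rest, ds) if d <= mu]
    outer = [p for p, d in zip(rest, ds) if d > mu]
    return (vp, mu, build_vp(inner, dist), build_vp(outer, dist))

def nn(node, q, dist, best=(float("inf"), None)):
    """Nearest neighbor of q; prunes via the triangle inequality only."""
    if node is None:
        return best
    vp, mu, inner, outer = node
    d = dist(q, vp)
    if d < best[0]:
        best = (d, vp)
    # Visit the side containing q first, then the other only if it may help:
    # any site on the far side is at distance at least |d - mu| from q.
    near, far = (inner, outer) if d <= mu else (outer, inner)
    best = nn(near, q, dist, best)
    if abs(d - mu) < best[0]:
        best = nn(far, q, dist, best)
    return best

pts = [(random.random(), random.random()) for _ in range(200)]
euclid = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
tree = build_vp(pts, euclid)
q = (0.5, 0.5)
d, p = nn(tree, q, euclid)
```

The same code works unchanged for strings under edit distance or any other metric, since only `dist` touches the data.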
Riemannian manifold learning
IEEE Trans. Pattern Anal. Mach. Intell., 2008
Abstract

Cited by 42 (0 self)
Recently, manifold learning has been widely exploited in pattern recognition, data analysis, and machine learning. This paper presents a novel framework, called Riemannian manifold learning (RML), based on the assumption that the input high-dimensional data lie on an intrinsically low-dimensional Riemannian manifold. The main idea is to formulate the dimensionality reduction problem as a classical problem in Riemannian geometry: how to construct coordinate charts for a given Riemannian manifold. We implement the Riemannian normal coordinate chart, which has been the most widely used in Riemannian geometry, for a set of unorganized data points. First, two input parameters (the neighborhood size k and the intrinsic dimension d) are estimated based on an efficient simplicial reconstruction of the underlying manifold. Then, the normal coordinates are computed to map the input high-dimensional data into a low-dimensional space. Experiments on synthetic data, as well as real-world images, demonstrate that our algorithm can learn intrinsic geometric structures of the data, preserve radial geodesic distances, and yield regular embeddings.
Translated Poisson mixture model for stratification learning
Int. J. Comput. Vision, 2008
Abstract

Cited by 24 (2 self)
A framework for the regularized and robust estimation of non-uniform dimensionality and density in high-dimensional noisy data is introduced in this work. This leads to learning stratifications, that is, mixtures of manifolds representing different characteristics and complexities in the data set. The basic idea relies on modeling the high-dimensional sample points as a process of Translated Poisson mixtures, with regularizing restrictions, leading to a model that accounts for the presence of noise. The Translated Poisson distribution is useful for modeling a noisy counting process, and it is derived from the noise-induced translation of a regular Poisson distribution. By maximizing the log-likelihood of the process counting the points falling into a local ball, we estimate the local dimension and density. We show that …
Estimating Entropy Rates with Bayesian Confidence Intervals
Neural Computation 17:1531–1576, 2005
Abstract

Cited by 22 (1 self)
The entropy rate quantifies the amount of uncertainty or disorder produced by any dynamical system. In a spiking neuron, this uncertainty translates into the amount of information potentially encoded, and is thus the subject of intense theoretical and experimental investigation. Estimating this quantity from observed, experimental data is difficult and requires a judicious selection of probabilistic models, balancing between two opposing biases. We use a model weighting principle originally developed for lossless data compression, following the minimum description length principle. This weighting yields a direct estimator of the entropy rate, which, compared to existing methods, exhibits significantly less bias and converges faster in simulation. With Monte Carlo techniques, we estimate a Bayesian confidence interval for the entropy rate. In related work, we apply these ideas to estimate the information rates between sensory stimuli and neural responses in experimental data (Shlens, Kennel, Abarbanel, & Chichilnisky, 2004).
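For orientation only, a naive plug-in sketch in Python of estimating an entropy rate from block entropies; this is not the MDL-weighted estimator of the paper, and it inherits exactly the bias/variance trade-off the abstract describes (longer blocks reduce bias but inflate variance):

```python
import math
import random
from collections import Counter

def block_entropy(seq, L):
    """Plug-in (empirical) entropy in bits of length-L blocks of seq."""
    counts = Counter(tuple(seq[i:i + L]) for i in range(len(seq) - L + 1))
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Entropy rate estimate: the increment H(L) - H(L-1) of block entropies.
# For an i.i.d. fair-coin sequence the true rate is 1 bit/symbol.
random.seed(0)
seq = [random.randint(0, 1) for _ in range(20000)]
rate = block_entropy(seq, 5) - block_entropy(seq, 4)
print(rate)
```

With limited data and larger alphabets, the plug-in increment is biased; the paper's point is that a principled weighting over block-length models mitigates this.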
Stratification learning: Detecting mixed density and dimensionality in high dimensional point clouds
In Advances in NIPS 19, 2006
Abstract

Cited by 20 (2 self)
The study of point cloud data sampled from a stratification, a collection of manifolds with possibly different dimensions, is pursued in this paper. We present a technique for simultaneous soft clustering and estimation of the mixed dimensionality and density of such structures. The framework is based on a maximum likelihood estimation of a Poisson mixture model. The presentation of the approach is completed with artificial and real examples demonstrating the importance of extending manifold learning to stratification learning.
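The counting idea behind the Poisson model can be sketched in Python: near a d-dimensional stratum, the number of samples in a ball of radius r grows like r^d, so a log-log slope estimates the local dimension. This toy example (our own construction, not the paper's mixture estimator) mixes a 1-D segment with a 2-D disk:

```python
import numpy as np

rng = np.random.default_rng(2)

def local_dimension(X, x, radii):
    """Slope of log N(r) vs log r around x; N(r) ~ r^d near a d-stratum."""
    d = np.linalg.norm(X - x, axis=1)
    counts = np.array([(d <= r).sum() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

# A stratification: a 1-D segment around (3, 0) and a 2-D disk at the origin.
seg = np.c_[rng.uniform(-1, 1, 800) + 3.0, np.zeros(800)]
disk = rng.uniform(-1, 1, (2000, 2))
disk = disk[np.linalg.norm(disk, axis=1) <= 1]
X = np.vstack([seg, disk])

radii = np.linspace(0.1, 0.4, 8)
d_seg = local_dimension(X, np.array([3.0, 0.0]), radii)    # near 1
d_disk = local_dimension(X, np.array([0.0, 0.0]), radii)   # near 2
print(d_seg, d_disk)
```

The mixture-model machinery in the paper replaces this pointwise slope fit with a joint, regularized likelihood over all points at once.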
Estimates of the information content and dimensionality of natural scenes from proximity distributions
, 2007
A duality view of spectral methods for dimensionality reduction
In ICML ’06: Proceedings of the 23rd International Conference on Machine Learning, 2006
Abstract

Cited by 16 (0 self)
We present a unified duality view of several recently emerged spectral methods for nonlinear dimensionality reduction, including Isomap, locally linear embedding, Laplacian eigenmaps, and maximum variance unfolding. We discuss the duality theory for the maximum variance unfolding problem, and show that the other methods are directly related to either its primal formulation or its dual formulation, or can be interpreted from the optimality conditions. This duality framework reveals close connections between these seemingly quite different algorithms. In particular, it resolves the apparent puzzle that these methods use either the top eigenvectors of a dense matrix or the bottom eigenvectors of a sparse matrix: these two eigenspaces are exactly aligned at primal-dual optimality.
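As a small Python illustration of the "bottom eigenvectors of a sparse matrix" side of this picture (Laplacian eigenmaps on a cycle graph; the example is ours, not the paper's), the embedding comes from the smallest nontrivial eigenvectors of the graph Laplacian:

```python
import numpy as np

# Cycle graph on n nodes: Laplacian eigenmaps embeds with the BOTTOM
# nontrivial eigenvectors of the graph Laplacian L = D - W.
n = 40
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
L = np.diag(W.sum(1)) - W

vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
emb = vecs[:, 1:3]               # skip the constant (eigenvalue-0) vector

# The 2-D embedding traces out a circle: every node at the same radius.
r = np.linalg.norm(emb, axis=1)
print(r.std() / r.mean())        # near 0
```

The duality result in the paper explains why this bottom eigenspace of a sparse Laplacian coincides, at primal-dual optimality, with the top eigenspace of the dense matrix used by methods like Isomap and maximum variance unfolding.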
Continuous dimensionality characterization of image structures
 Image and Vision Computing
Cited by 16 (5 self)