Results 1–10 of 90
Visualizing Data using t-SNE
, 2008
Abstract

Cited by 280 (13 self)
We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large data sets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of data sets and compare it with many other nonparametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the data sets.
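The input side of t-SNE is a symmetrized Gaussian affinity matrix over datapoints. A minimal stdlib-only sketch follows; it uses a single fixed bandwidth sigma, whereas the actual algorithm tunes a per-point bandwidth to match a target perplexity, and all names here are illustrative:

```python
import math

def tsne_affinities(points, sigma=1.0):
    """Symmetrized Gaussian input affinities p_ij as used by t-SNE.

    Sketch with a fixed bandwidth `sigma`; the real algorithm tunes
    sigma per point via binary search on the perplexity.
    """
    n = len(points)
    # Conditional probabilities p_{j|i} from Gaussian similarities.
    cond = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(0.0)  # a point is not its own neighbor
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights.append(math.exp(-d2 / (2 * sigma ** 2)))
        total = sum(weights)
        for j in range(n):
            cond[i][j] = weights[j] / total
    # Symmetrize: p_ij = (p_{j|i} + p_{i|j}) / (2n), so all p_ij sum to 1.
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]
```

The low-dimensional map is then found by gradient descent on the KL divergence between these affinities and heavy-tailed Student-t affinities in the embedding, which is the step that relieves the crowding problem.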
Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps
 Proceedings of the National Academy of Sciences
, 2005
Abstract

Cited by 257 (45 self)
of contexts of data analysis, such as spectral graph theory, manifold learning, nonlinear principal components and kernel methods. We augment these approaches by showing that the diffusion distance is a key intrinsic geometric quantity linking spectral theory of the Markov process, Laplace operators, or kernels, to the corresponding geometry and density of the data. This opens the door to the application of methods from numerical analysis and signal processing to the analysis of functions and transformations of the data.

We provide a framework for structural multiscale geometric organization of graphs and subsets of R^n. We use diffusion semigroups to generate multiscale geometries in order to organize and represent complex structures. We show that appropriately selected eigenfunctions or scaling functions of Markov matrices, which describe local transitions, lead to macroscopic descriptions at different scales. The process of iterating or diffusing the Markov matrix is seen as a generalization of some aspects of the Newtonian paradigm, in which local infinitesimal transitions of a system lead to global macroscopic descriptions by integration. In Part I below, we provide a unified view of ideas from data analysis, machine learning and numerical analysis. In Part II [1], we augment this approach by introducing fast order-N algorithms for homogenization of heterogeneous structures as well as for data representation.
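The "Markov matrices, which describe local transitions" and their iteration can be sketched directly: build a Gaussian kernel on the data, row-normalize it into a transition matrix, and take matrix powers to diffuse to coarser scales. A toy stdlib-only illustration with a fixed kernel scale `eps` (the full construction also normalizes for sampling density, which is omitted here):

```python
import math

def markov_matrix(points, eps=1.0):
    """Row-normalize a Gaussian kernel into the local-transition Markov
    matrix at the heart of diffusion maps (fixed-scale sketch)."""
    ker = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / eps)
            for q in points] for p in points]
    return [[v / sum(row) for v in row] for row in ker]

def diffuse(m, t):
    """Iterate the Markov matrix: the t-step transition matrix M^t,
    which describes the geometry at coarser scales as t grows."""
    n = len(m)
    out = [row[:] for row in m]
    for _ in range(t - 1):
        out = [[sum(out[i][l] * m[l][j] for l in range(n))
                for j in range(n)] for i in range(n)]
    return out
```

The eigenvectors of this row-stochastic matrix (weighted by powers of their eigenvalues) are the diffusion-map coordinates discussed in the abstract.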
Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning and data set parameterization
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2006
Abstract

Cited by 158 (5 self)
We provide evidence that nonlinear dimensionality reduction, clustering and data set parameterization can be solved within one and the same framework. The main idea is to define a system of coordinates with an explicit metric that reflects the connectivity of a given data set and that is robust to noise. Our construction, which is based on a Markov random walk on the data, offers a general scheme for simultaneously reorganizing and subsampling graphs and arbitrarily shaped data sets in high dimensions using intrinsic geometry. We show that clustering in embedding spaces is equivalent to compressing operators. The objective of data partitioning and clustering is to coarse-grain the random walk on the data while at the same time preserving a diffusion operator for the intrinsic geometry or connectivity of the data set up to some accuracy. We show that the quantization distortion in diffusion space bounds the error of compression of the operator, thus giving a rigorous justification for k-means clustering in diffusion space and a precise measure of the performance of general clustering algorithms.
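The coarse-graining result justifies running ordinary k-means on diffusion coordinates; the clustering step itself is just Lloyd's algorithm. A one-dimensional stdlib-only sketch (illustrative only; in the paper the inputs would be diffusion-space coordinates rather than raw values):

```python
def kmeans_1d(xs, k, iters=50):
    """Plain Lloyd's k-means in 1-D. The paper runs this in
    diffusion-coordinate space and bounds the operator-compression
    error by the quantization distortion."""
    # Spread the initial centroids across the sorted data.
    cents = sorted(xs)[::max(1, len(xs) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            # Assign each point to its nearest centroid.
            groups[min(range(k), key=lambda c: (x - cents[c]) ** 2)].append(x)
        # Move each centroid to the mean of its group (keep it if empty).
        cents = [sum(g) / len(g) if g else cents[c]
                 for c, g in enumerate(groups)]
    return sorted(cents)
```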
Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators
 in Advances in Neural Information Processing Systems 18
, 2005
Abstract

Cited by 110 (14 self)
This paper presents a diffusion-based probabilistic interpretation of spectral clustering and dimensionality reduction algorithms that use the eigenvectors of the normalized graph Laplacian. Given the pairwise adjacency matrix of all points, we define a diffusion distance between any two data points and show that the low-dimensional representation of the data by the first few eigenvectors of the corresponding Markov matrix is optimal under a certain mean squared error criterion. Furthermore, assuming that data points are random samples from a density p(x) = e^(−U(x)), we identify these eigenvectors as discrete approximations of eigenfunctions of a Fokker-Planck operator in a potential 2U(x) with reflecting boundary conditions. Finally, applying known results regarding the eigenvalues and eigenfunctions of the continuous Fokker-Planck operator, we provide a mathematical justification for the success of spectral clustering and dimensionality reduction algorithms based on these first few eigenvectors. This analysis elucidates, in terms of the characteristics of diffusion processes, many empirical findings regarding spectral clustering algorithms.
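The diffusion distance defined here compares the t-step transition distributions of two points. The stdlib-only sketch below computes an unweighted variant (it omits the 1/φ₀ stationary-density weighting of the full definition) on a toy graph of two clusters joined by a weak edge:

```python
def row_normalize(w):
    """Turn a non-negative weight matrix into a row-stochastic one."""
    return [[x / sum(row) for x in row] for row in w]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_distance(m, i, j, t=3):
    """L2 distance between the t-step transition rows of points i and j.
    Sketch: the paper's definition additionally weights each term by the
    reciprocal of the stationary density."""
    mt = m
    for _ in range(t - 1):
        mt = matmul(mt, m)
    return sum((a - b) ** 2 for a, b in zip(mt[i], mt[j])) ** 0.5

# Two triangles (with self-loops) joined by one weak edge 0-3.
W = [
    [1, 1, 1, 0.01, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0.01, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
M = row_normalize(W)
```

Points in the same cluster end up much closer in diffusion distance than points separated by the bottleneck edge, which is why the first few eigenvectors separate the clusters.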
Semi-supervised learning in gigantic image collections
 In Advances in Neural Information Processing Systems 22
, 2009
Abstract

Cited by 76 (3 self)
With the advent of the Internet it is now possible to collect hundreds of millions of images. These images come with varying degrees of label information. “Clean labels” can be manually obtained on a small fraction, “noisy labels” may be extracted automatically from surrounding text, while for most images there are no labels at all. Semi-supervised learning is a principled framework for combining these different label sources. However, it scales polynomially with the number of images, making it impractical for use on gigantic collections with hundreds of millions of images and thousands of classes. In this paper we show how to utilize recent results in machine learning to obtain highly efficient approximations for semi-supervised learning. Specifically, we use the convergence of the eigenvectors of the normalized graph Laplacian to eigenfunctions of weighted Laplace-Beltrami operators. We combine this with a label-sharing framework obtained from WordNet to propagate label information to classes lacking manual annotations. Our algorithm enables us to apply semi-supervised learning to a database of 80 million images with 74 thousand classes.
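Setting the scalable eigenfunction machinery aside, the graph-based semi-supervised objective being approximated can be illustrated with plain iterative label propagation on a small graph. This stdlib-only stand-in is not the paper's algorithm (whose point is avoiding exactly this O(n)-per-iteration graph computation at web scale):

```python
def propagate_labels(m, labels, iters=200):
    """Harmonic label propagation on a row-stochastic graph matrix `m`.
    `labels` maps node index -> +1/-1. Unlabeled nodes start at 0 and
    are repeatedly replaced by the weighted average of their neighbors;
    labeled nodes stay clamped to their given values."""
    n = len(m)
    f = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(iters):
        f = [labels[i] if i in labels else
             sum(m[i][j] * f[j] for j in range(n)) for i in range(n)]
    return f

# Path graph 0-1-2-3-4; node 0 labeled +1, node 4 labeled -1.
A = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(5)] for i in range(5)]
M = [[a / sum(row) for a in row] for row in A]
f = propagate_labels(M, {0: 1.0, 4: -1.0})
```

On the path the solution converges to the harmonic interpolation [1, 0.5, 0, -0.5, -1], so unlabeled nodes inherit the label of the nearer labeled endpoint.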
Data fusion and multi-cue data matching by diffusion maps
 IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract

Cited by 57 (5 self)
Data fusion and multi-cue data matching are fundamental tasks of high-dimensional data analysis. In this paper, we apply the recently introduced diffusion framework to address these tasks. Our contribution is threefold: First, we present the Laplace-Beltrami approach for computing density-invariant embeddings, which are essential for integrating different sources of data. Second, we describe a refinement of the Nyström extension algorithm called “geometric harmonics.” We also explain how to use this tool for data assimilation. Finally, we introduce a multi-cue data matching scheme based on nonlinear spectral graph alignment. The effectiveness of the presented schemes is validated by applying them to the problems of lip-reading and image-sequence alignment.

Index Terms—Pattern matching, graph theory, graph algorithms, Markov processes, machine learning, data mining, image databases.
Exploring collections of 3D models using fuzzy correspondences
Abstract

Cited by 42 (14 self)
Large collections of 3D models from the same object class (e.g., chairs, cars, animals) are now commonly available via many public repositories, but exploring the range of shape variations across such collections remains a challenging task. In this work, we present a new exploration interface that allows users to browse collections based on similarities and differences between shapes in user-specified regions of interest (ROIs). To support this interactive system, we introduce a novel analysis method for computing similarity relationships between points on 3D shapes across a collection. We encode the inherent ambiguity in these relationships using fuzzy point correspondences and propose a robust and efficient computational framework that estimates fuzzy correspondences using only a sparse set of pairwise model alignments. We evaluate our analysis method on a range of correspondence benchmarks and report substantial improvements in both speed and accuracy over existing alternatives. In addition, we demonstrate how fuzzy correspondences enable key features in our exploration tool, such as automated view alignment, ROI-based similarity search, and faceted browsing.
Dimensionality Reduction: A Comparative Review
, 2008
Abstract

Cited by 42 (0 self)
In recent years, a variety of nonlinear dimensionality reduction techniques have been proposed, many of which rely on the evaluation of local properties of the data. The paper presents a review and systematic comparison of these techniques. The performances of the techniques are investigated on artificial and natural tasks. The results of the experiments reveal that nonlinear techniques perform well on selected artificial tasks, but do not outperform the traditional PCA on real-world tasks. The paper explains these results by identifying weaknesses of current nonlinear techniques, and suggests how the performance of nonlinear dimensionality reduction techniques may be improved.
An experimental investigation of graph kernels on a collaborative recommendation task
 Proceedings of the 6th International Conference on Data Mining (ICDM 2006)
, 2006
Abstract

Cited by 27 (7 self)
This paper presents a survey as well as a systematic empirical comparison of seven graph kernels and two related similarity matrices (simply referred to as graph kernels), namely the exponential diffusion kernel, the Laplacian exponential diffusion kernel, the von Neumann diffusion kernel, the regularized Laplacian kernel, the commute-time kernel, the random-walk-with-restart similarity matrix, and finally, three graph kernels introduced in this paper: the regularized commute-time kernel, the Markov diffusion kernel, and the cross-entropy diffusion matrix. The kernel-on-a-graph approach is simple and intuitive. It is illustrated by applying the nine graph kernels to a collaborative-recommendation task and to a semi-supervised classification task, both on several databases. The graph methods compute proximity measures between nodes that help study the structure of the graph. Our comparisons suggest that the regularized commute-time and the Markov diffusion kernels perform best, closely followed by the regularized Laplacian kernel.
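One of the surveyed kernels, the exponential diffusion kernel K = expm(αA), can be sketched with a truncated Taylor series over the adjacency matrix. This stdlib-only version is illustrative; a real implementation would use a proper matrix-exponential routine:

```python
def exp_diffusion_kernel(adj, alpha=0.5, terms=20):
    """Exponential diffusion kernel K = expm(alpha * A), computed via
    the truncated series sum_k (alpha * A)^k / k!  Entry K[i][j]
    aggregates walks of all lengths between nodes i and j, with longer
    walks down-weighted factorially."""
    n = len(adj)
    ident = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in ident]   # current series term (alpha A)^k / k!
    k_mat = [row[:] for row in ident]  # running sum, starts at I
    for k in range(1, terms):
        # term <- term @ (alpha * A) / k
        term = [[sum(term[i][l] * alpha * adj[l][j] for l in range(n)) / k
                 for j in range(n)] for i in range(n)]
        k_mat = [[k_mat[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return k_mat
```

On a path graph 0-1-2 the kernel assigns direct neighbors a higher similarity than nodes two hops apart, which is the behavior the recommendation experiments exploit.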
Hubs in space: Popular nearest neighbors in high-dimensional data
Abstract

Cited by 26 (0 self)
Different aspects of the curse of dimensionality are known to present serious challenges to various machine-learning methods and tasks. This paper explores a new aspect of the dimensionality curse, referred to as hubness, that affects the distribution of k-occurrences: the number of times a point appears among the k nearest neighbors of other points in a data set. Through theoretical and empirical analysis involving synthetic and real data sets we show that under commonly used assumptions this distribution becomes considerably skewed as dimensionality increases, causing the emergence of hubs, that is, points with very high k-occurrences which effectively represent “popular” nearest neighbors. We examine the origins of this phenomenon, showing that it is an inherent property of data distributions in high-dimensional vector space, discuss its interaction with dimensionality reduction, and explore its influence on a wide range of machine-learning tasks directly or indirectly based on measuring distances, belonging to supervised, semi-supervised, and unsupervised learning families.
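The k-occurrences statistic N_k(x) is straightforward to compute by brute force; a stdlib-only sketch (names are illustrative):

```python
import random

def k_occurrences(points, k):
    """N_k(x): how many times each point appears among the k nearest
    neighbors of the other points (brute-force O(n^2) sketch)."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Rank all other points by squared distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sum((a - b) ** 2
                              for a, b in zip(points[i], points[j])))
        for j in others[:k]:
            counts[j] += 1
    return counts
```

Every point contributes exactly k neighbor slots, so the counts always sum to n*k; the hubness phenomenon is that, as dimensionality grows, those slots concentrate on a few hub points while many points become anti-hubs with N_k near zero.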