Nonlinear dimensionality reduction by locally linear embedding

by S. Roweis and L. Saul
Venue: Science
Results 1 - 10 of 2,417

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

by Mikhail Belkin and Partha Niyogi, 2003
"... One of the central problems in machine learning and pattern recognition is to develop appropriate representations for complex data. We consider the problem of constructing a representation for data lying on a low-dimensional manifold embedded in a high-dimensional space. Drawing on the correspondenc ..."
Abstract - Cited by 1226 (15 self) - Add to MetaCart
One of the central problems in machine learning and pattern recognition is to develop appropriate representations for complex data. We consider the problem of constructing a representation for data lying on a low-dimensional manifold embedded in a high-dimensional space. Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on the manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for representing the high-dimensional data. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality-preserving properties and a natural connection to clustering. Some potential applications and illustrative examples are discussed.
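
For reference, the optimization behind this algorithm can be sketched as follows; Y denotes the embedding coordinates, W the edge weights of the neighborhood graph, D the degree matrix, and L = D − W the graph Laplacian (notation paraphrased here, not quoted from the paper):

```latex
\min_{Y}\ \sum_{i,j} W_{ij}\,\lVert y_i - y_j \rVert^2
  \;=\; \min_{Y}\ \operatorname{tr}\!\left(Y^{\top} L\, Y\right)
  \quad \text{s.t. } Y^{\top} D\, Y = I
```

The minimizers are the eigenvectors of the generalized problem L y = λ D y with the smallest nonzero eigenvalues; the trivial constant eigenvector at λ = 0 is discarded.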

Distance metric learning, with application to clustering with side-information

by Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, and Stuart Russell - in Advances in Neural Information Processing Systems 15, 2002
"... Abstract Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for ..."
Abstract - Cited by 818 (13 self) - Add to MetaCart
Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the user to manually tweak the input space's metric until sufficiently good clusters are found. For these and other applications requiring good metrics, it is desirable that we provide a more systematic way for users to indicate what they consider "similar." For instance, we may ask them to provide examples. In this paper, we present an algorithm that, given examples of similar (and, if desired, dissimilar) pairs of points in R^n, learns a distance metric over R^n that respects these relationships. Our method is based on posing metric learning as a convex optimization problem, which allows us to give efficient, local-optima-free algorithms. We also demonstrate empirically that the learned metrics can be used to significantly improve clustering performance.
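
As a sketch of the convex program referred to here (paraphrasing the formulation rather than quoting it), write S for the given set of similar pairs, D for the dissimilar pairs, and parameterize the metric by a positive semidefinite matrix A:

```latex
d_A(x, y) = \sqrt{(x - y)^{\top} A\, (x - y)}, \qquad
\min_{A \succeq 0}\ \sum_{(x_i, x_j) \in S} \lVert x_i - x_j \rVert_A^2
\quad \text{s.t.}\ \sum_{(x_i, x_j) \in D} \lVert x_i - x_j \rVert_A \ge 1
```

The constraint on dissimilar pairs rules out the trivial solution A = 0, and since both the objective and the constraint set are convex in A, the problem has no spurious local optima.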

Citation Context

...re the unsupervised ones that take an input dataset, and find an embedding of it in some space. This includes algorithms such as Multidimensional Scaling (MDS) [2], and Locally Linear Embedding (LLE) [9]. One feature distinguishing our work from these is that we will learn a full metric over the input space, rather than focusing only on (finding an embedding for) the points in th...

Semi-Supervised Learning Literature Survey

by Xiaojin Zhu, 2006
"... We review the literature on semi-supervised learning, which is an area in machine learning and more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter ..."
Abstract - Cited by 782 (8 self) - Add to MetaCart
We review the literature on semi-supervised learning, which is an area in machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author's doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf

Laplacian eigenmaps and spectral techniques for embedding and clustering.

by Mikhail Belkin and Partha Niyogi - in Proceedings of Neural Information Processing Systems, 2001
"... Abstract Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami op erator on a manifold , and the connections to the heat equation , we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in ..."
Abstract - Cited by 668 (7 self) - Add to MetaCart
Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering. Several applications are considered. In many areas of artificial intelligence, information retrieval and data mining, one is often confronted with intrinsically low dimensional data lying in a very high dimensional space. For example, grayscale n × n images of a fixed object taken with a moving camera yield data points in R^{n²}. However, the intrinsic dimensionality of the space of all images of the same object is the number of degrees of freedom of the camera; in fact the space has the natural structure of a manifold embedded in R^{n²}. While there is a large body of work on dimensionality reduction in general, most existing approaches do not explicitly take into account the structure of the manifold on which the data may possibly reside. Recently, there has been some interest (Tenenbaum et al., 2000; ...). The core algorithm is very simple, has a few local computations and one sparse eigenvalue problem. The solution reflects the intrinsic geometric structure of the manifold. The justification comes from the role of the Laplacian operator in providing an optimal embedding. The Laplacian of the graph obtained from the data points may be viewed as an approximation to the Laplace-Beltrami operator defined on the manifold. The embedding maps for the data come from approximations to a natural map that is defined on the entire manifold. The framework of analysis ...
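
A minimal sketch of the pipeline this abstract describes (k-nearest-neighbor graph, heat-kernel weights, one eigenvalue problem); the neighborhood size k and heat parameter t are assumed defaults here, not values prescribed by the paper:

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, n_components=2, k=10, t=1.0):
    """Sketch of Laplacian eigenmaps; X is an (n, D) data matrix."""
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    idx = np.argsort(sq, axis=1)[:, 1:k + 1]             # k nearest neighbors
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    W[rows, cols] = np.exp(-sq[rows, cols] / t)          # heat-kernel weights
    W = np.maximum(W, W.T)                               # symmetrize the graph
    D = np.diag(W.sum(axis=1))                           # degree matrix
    L = D - W                                            # graph Laplacian
    # Generalized eigenproblem L f = lambda D f; eigh returns eigenvalues
    # in ascending order, and the constant eigenvector at 0 is skipped.
    vals, vecs = eigh(L, D)
    return vecs[:, 1:n_components + 1]
```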

Manifold regularization: A geometric framework for learning from labeled and unlabeled examples

by Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani - Journal of Machine Learning Research, 2006
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning al ..."
Abstract - Cited by 578 (16 self) - Add to MetaCart
We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning algorithms and standard methods, including Support Vector Machines and Regularized Least Squares, can be obtained as special cases. We utilize properties of Reproducing Kernel Hilbert spaces to prove new Representer theorems that provide a theoretical basis for the algorithms. As a result (in contrast to purely graph-based approaches) we obtain a natural out-of-sample extension to novel examples and so are able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semi-supervised algorithms are able to use unlabeled data effectively. Finally, we briefly discuss unsupervised and fully supervised learning within our general framework.
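
The family of algorithms can be summarized by a single regularized objective; with l labeled and u unlabeled examples, a loss V, an ambient norm ‖f‖_K in the RKHS, and an intrinsic norm ‖f‖_I estimated from the graph Laplacian L over all l + u points (paraphrasing the paper's notation):

```latex
f^{*} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{l} \sum_{i=1}^{l} V\!\left(x_i, y_i, f\right)
  + \gamma_A \lVert f \rVert_K^2
  + \gamma_I \lVert f \rVert_I^2,
\qquad
\lVert f \rVert_I^2 \approx \frac{1}{(l+u)^2}\, \mathbf{f}^{\top} L\, \mathbf{f}
```

Choosing V as the hinge loss gives a Laplacian SVM and the squared loss a Laplacian-regularized least squares; setting γ_I = 0 recovers the standard supervised algorithms.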

Citation Context

...ust attempt to get empirical estimates of these quantities. Note that in order to get such empirical estimates it is sufficient to have unlabeled examples. A case of particular recent interest (e.g., see [27, 33, 5, 19] for a discussion on dimensionality reduction) is when the support of P_X is a compact submanifold M ⊂ R^n. In that case, one natural choice for ‖f‖_I is ∫_M ‖∇_M f‖² dP_X. The optimization problem becomes ...

Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization

by Benjamin Recht, Maryam Fazel, and Pablo A. Parrilo - SIAM Review, 2010
"... Abstract The affine rank minimization problem consists of finding a matrix of minimum rank that satisfies a given system of linear equality constraints. Such problems have appeared in the literature of a diverse set of fields including system identification and control, Euclidean embedding, and col ..."
Abstract - Cited by 562 (20 self) - Add to MetaCart
The affine rank minimization problem consists of finding a matrix of minimum rank that satisfies a given system of linear equality constraints. Such problems have appeared in the literature of a diverse set of fields including system identification and control, Euclidean embedding, and collaborative filtering. Although specific instances can often be solved with specialized algorithms, the general affine rank minimization problem is NP-hard, because it contains vector cardinality minimization as a special case. In this paper, we show that if a certain restricted isometry property holds for the linear transformation defining the constraints, the minimum rank solution can be recovered by solving a convex optimization problem, namely the minimization of the nuclear norm over the given affine space. We present several random ensembles of equations where the restricted isometry property holds with overwhelming probability, provided the codimension of the subspace is Ω(r(m + n) log mn), where m and n are the dimensions of the matrix and r is its rank. The techniques used in our analysis have strong parallels in the compressed sensing framework. We discuss how affine rank minimization generalizes this pre-existing concept and outline a dictionary relating concepts from cardinality minimization to those of rank minimization. We also discuss several algorithmic approaches to solving the norm minimization relaxations, and illustrate our results with numerical examples.
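
In symbols, the relaxation described here replaces rank with the nuclear norm (the sum of singular values σ_i); writing the linear equality constraints as an affine operator 𝒜:

```latex
\min_{X}\ \operatorname{rank}(X)\ \ \text{s.t.}\ \ \mathcal{A}(X) = b
\qquad \rightsquigarrow \qquad
\min_{X}\ \lVert X \rVert_{*}\ \ \text{s.t.}\ \ \mathcal{A}(X) = b,
\qquad \lVert X \rVert_{*} = \sum_{i} \sigma_i(X)
```

This parallels the ℓ1 relaxation of cardinality minimization in compressed sensing: for diagonal X, the nuclear norm reduces to the ℓ1 norm of the diagonal.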

Citation Context

...ld learning, one may be given high dimensional data with low-dimensional structure that can be recovered by searching for a low-dimensional embedding of the data preserving local distance information [64, 75]. A symmetric matrix D ∈ S^n is called a Euclidean distance matrix (EDM) if there exist points x_1, ..., x_n in R^d such that D_ij = ‖x_i − x_j‖². Let V := I_n − (1/n)11^T be the projection matrix onto the h...
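
The double-centering construction this context alludes to (recovering a Gram matrix, and hence an embedding, from an EDM via the projection V) can be sketched in a few lines; this is the standard classical-MDS recipe, not code from the paper:

```python
import numpy as np

def embed_from_edm(D, d):
    """Recover a d-dimensional point configuration from a Euclidean
    distance matrix D with D_ij = ||x_i - x_j||^2 (classical MDS)."""
    n = D.shape[0]
    V = np.eye(n) - np.ones((n, n)) / n   # projection onto centered vectors
    G = -0.5 * V @ D @ V                  # Gram matrix of the centered points
    vals, vecs = np.linalg.eigh(G)        # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]      # keep the d largest eigenvalues
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
```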

Survey of clustering algorithms

by Rui Xu and Donald Wunsch II - IEEE Transactions on Neural Networks, 2005
"... Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the ..."
Abstract - Cited by 499 (4 self) - Add to MetaCart
Data analysis plays an indispensable role in understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. This diversity, on the one hand, equips us with many tools; on the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several tightly related topics, such as proximity measures and cluster validation, are also discussed.

Citation Context

...lity of MDS to explore more complex nonlinear structures in the data. The locally linear embedding (LLE) algorithm addresses the nonlinear dimensionality reduction problem from a different starting point [235]. LLE emphasizes the local linearity of the manifold and assumes that the local relations in the original high-dimensional data space are also preserved in the projected low-dimensional space ...
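
Since LLE [235] is the paper this whole listing cites, a minimal sketch of the two-step algorithm described above may be useful; the neighborhood size k and the regularization constant are illustrative assumptions, not values from Roweis and Saul:

```python
import numpy as np
from scipy.linalg import eigh, solve

def lle(X, n_components=2, k=10, reg=1e-3):
    """Sketch of locally linear embedding; X is an (n, D) data matrix."""
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    nbrs = np.argsort(sq, axis=1)[:, 1:k + 1]      # k nearest neighbors
    W = np.zeros((n, n))
    # Step 1: reconstruct each point from its neighbors with weights
    # that sum to one, by solving a small regularized linear system.
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                      # neighbors, shifted to x_i
        C = Z @ Z.T                                # local Gram matrix
        C += reg * np.trace(C) * np.eye(k)         # regularize for stability
        w = solve(C, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()                # enforce sum-to-one
    # Step 2: find low-dimensional coordinates preserving those weights:
    # bottom eigenvectors of M = (I - W)^T (I - W), skipping the constant
    # eigenvector at eigenvalue 0.
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    vals, vecs = eigh(M)
    return vecs[:, 1:n_components + 1]
```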

Locality-constrained linear coding for image classification

by Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang, and Yihong Gong - in IEEE Conference on Computer Vision and Pattern Recognition, 2010
"... The traditional SPM approach based on bag-of-features (BoF) requires nonlinear classifiers to achieve good image classification performance. This paper presents a simple but effective coding scheme called Locality-constrained Linear Coding (LLC) in place of the VQ coding in traditional SPM. LLC util ..."
Abstract - Cited by 443 (20 self) - Add to MetaCart
The traditional SPM approach based on bag-of-features (BoF) requires nonlinear classifiers to achieve good image classification performance. This paper presents a simple but effective coding scheme called Locality-constrained Linear Coding (LLC) in place of the VQ coding in traditional SPM. LLC utilizes the locality constraints to project each descriptor into its local-coordinate system, and the projected coordinates are integrated by max pooling to generate the final representation. With a linear classifier, the proposed approach performs remarkably better than the traditional nonlinear SPM, achieving state-of-the-art performance on several benchmarks. Compared with the sparse coding strategy [22], the objective function used by LLC has an analytical solution. In addition, the paper proposes a fast approximated LLC method by first performing a K-nearest-neighbor search and then solving a constrained least squares fitting problem, bearing computational complexity of O(M + K²). Hence even with very large codebooks, our system can still process multiple frames per second. This efficiency significantly adds to the practical value of LLC for real applications.
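
A sketch of the approximated coding step the abstract describes, for a single descriptor x and codebook B (K-nearest-neighbor search, then a sum-to-one constrained least squares); the regularization value here is an assumption:

```python
import numpy as np
from scipy.linalg import solve

def approx_llc_code(x, B, k=5, reg=1e-4):
    """Sketch of approximated LLC coding: x is a (D,) descriptor,
    B an (M, D) codebook; returns an (M,) code that is nonzero
    only on the k nearest codewords."""
    M = B.shape[0]
    d2 = ((B - x) ** 2).sum(axis=1)     # squared distances to codewords
    nn = np.argsort(d2)[:k]             # indices of the k nearest codewords
    Z = B[nn] - x                       # shift codewords to the descriptor
    C = Z @ Z.T + reg * np.eye(k)       # local covariance, regularized
    w = solve(C, np.ones(k))            # constrained least squares
    code = np.zeros(M)
    code[nn] = w / w.sum()              # enforce the sum-to-one constraint
    return code
```

An image-level representation is then obtained by max pooling these codes over all descriptors within each spatial cell of the SPM layout.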

Citation Context

... O(M + K²), where K ≪ M. The final implementation of such an approximated LLC process is illustrated in Figure 1 (right). Though this approximated encoding appears to be similar to Locally Linear Embedding [18], the whole procedure of LLC itself differs from LLE clearly, because LLC incorporates an additional codebook learning step where the inference is derived from Eq. (3). The codebook learning step will ...

Locality Preserving Projections

by Xiaofei He and Partha Niyogi - in Neural Information Processing Systems, 2004
"... Abstract Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the ..."
Abstract - Cited by 414 (16 self) - Add to MetaCart
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA), a classical linear technique that projects the data along the directions of maximal variance. When the high dimensional data lies on a low dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and, more crucially, is defined everywhere in the ambient space rather than just on the training data points. This is borne out by illustrative examples on some high dimensional data sets.
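
The variational problem reduces to a generalized eigenvalue problem; with data matrix X, neighborhood weights W, degree matrix D, and Laplacian L = D − W, each projective direction a solves (paraphrasing the derivation):

```latex
\min_{\mathbf{a}}\ \sum_{i,j} \left(\mathbf{a}^{\top} x_i - \mathbf{a}^{\top} x_j\right)^2 W_{ij}
\;\;\Longrightarrow\;\;
X L X^{\top} \mathbf{a} = \lambda\, X D X^{\top} \mathbf{a}
```

The directions with the smallest eigenvalues are kept, and a new point x is embedded simply as aᵀx, which is what makes LPP applicable beyond the training set.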

Citation Context

...(1) and (2) above, we know of no other linear projective technique that has such a property. 4. LPP is defined everywhere. Recall that nonlinear dimensionality reduction techniques like ISOMAP [6], LLE [5], and Laplacian eigenmaps [2] are defined only on the training data points and it is unclear how to evaluate the map for new test points. In contrast, the Locality Preserving Projection may be simply appli...

Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds

by Lawrence K. Saul and Sam T. Roweis - Journal of Machine Learning Research, 2003
"... The problem of dimensionality reduction arises in many fields of information processing, including machine learning, data compression, scientific visualization, pattern recognition, and neural computation. ..."
Abstract - Cited by 385 (10 self) - Add to MetaCart
The problem of dimensionality reduction arises in many fields of information processing, including machine learning, data compression, scientific visualization, pattern recognition, and neural computation.