Results 1 - 10
of
56
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
- Neural Computation
, 2003
"... Abstract One of the central problems in machine learning and pattern recognition is to develop appropriate representations for complex data. We consider the problem of constructing a representation for data lying on a low dimensional manifold embedded in a high dimensional space. Drawing on the corr ..."
Abstract
-
Cited by 519 (12 self)
- Add to MetaCart
Abstract One of the central problems in machine learning and pattern recognition is to develop appropriate representations for complex data. We consider the problem of constructing a representation for data lying on a low dimensional manifold embedded in a high dimensional space. Drawing on the correspondence between the graph Laplacian, the Laplace Beltrami operator on the manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for representing the high dimensional data. The algorithm provides a computationally efficient approach to non-linear dimensionality reduction that has locality preserving properties and a natural connection to clustering. Some potential applications and illustrative examples are discussed. 1 Introduction In many areas of artificial intelligence, information retrieval and data mining, one is often confronted with intrinsically low dimensional data lying in a very high dimensional space. Consider, for example, gray scale images of an object taken under fixed lighting conditions with a moving camera. Each such image would typically be represented by a brightness value at each pixel. If there were n 2
Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
- Advances in Neural Information Processing Systems 14
, 2001
"... Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher ..."
Abstract
-
Cited by 311 (7 self)
- Add to MetaCart
Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering. Several applications are considered.
Locality Preserving Projections
, 2002
"... Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data s ..."
Abstract
-
Cited by 142 (15 self)
- Add to MetaCart
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA) -- a classical linear technique that projects the data along the directions of maximal variance. When the high dimensional data lies on a low dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and more crucially is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high dimensional data sets.
Face recognition using laplacianfaces
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2005
"... Abstract—We propose an appearance-based face recognition method called the Laplacianface approach. By using Locality Preserving Projections (LPP), the face images are mapped into a face subspace for analysis. Different from Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) wh ..."
Abstract
-
Cited by 119 (20 self)
- Add to MetaCart
Abstract—We propose an appearance-based face recognition method called the Laplacianface approach. By using Locality Preserving Projections (LPP), the face images are mapped into a face subspace for analysis. Different from Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) which effectively see only the Euclidean structure of face space, LPP finds an embedding that preserves local information, and obtains a face subspace that best detects the essential face manifold structure. The Laplacianfaces are the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the face manifold. In this way, the unwanted variations resulting from changes in lighting, facial expression, and pose may be eliminated or reduced. Theoretical analysis shows that PCA, LDA, and LPP can be obtained from different graph models. We compare the proposed Laplacianface approach with Eigenface and Fisherface methods on three different face data sets. Experimental results suggest that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition. Index Terms—Face recognition, principal component analysis, linear discriminant analysis, locality preserving projections, face manifold, subspace learning. 1
Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning and data set parameterization
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2006
"... We provide evidence that non-linear dimensionality reduction, clustering and data set parameterization can be solved within one and the same framework. The main idea is to define a system of coordinates with an explicit metric that reflects the connectivity of a given data set and that is robust to ..."
Abstract
-
Cited by 58 (5 self)
- Add to MetaCart
We provide evidence that non-linear dimensionality reduction, clustering and data set parameterization can be solved within one and the same framework. The main idea is to define a system of coordinates with an explicit metric that reflects the connectivity of a given data set and that is robust to noise. Our construction, which is based on a Markov random walk on the data, offers a general scheme of simultaneously reorganizing and subsampling graphs and arbitrarily shaped data sets in high dimensions using intrinsic geometry. We show that clustering in embedding spaces is equivalent to compressing operators. The objective of data partitioning and clustering is to coarse-grain the random walk on the data while at the same time preserving a diffusion operator for the intrinsic geometry or connectivity of the data set up to some accuracy. We show that the quantization distortion in diffusion space bounds the error of compression of the operator, thus giving a rigorous justification for k-means clustering in diffusion space and a precise measure of the performance of general clustering algorithms.
CLUE: Cluster-based Retrieval of Images by Unsupervised Learning
- IEEE Transactions on Image Processing
, 2003
"... In a typical content-based image retrieval (CBIR) system, query results are a set of images sorted by feature similarities with respect to the query. However, images with high feature similarities to the query may be very di#erent from the query in terms of semantics. This discrepancy between low-le ..."
Abstract
-
Cited by 34 (2 self)
- Add to MetaCart
In a typical content-based image retrieval (CBIR) system, query results are a set of images sorted by feature similarities with respect to the query. However, images with high feature similarities to the query may be very di#erent from the query in terms of semantics. This discrepancy between low-level features and high-level concepts is known as the semantic gap. This paper introduces a novel image retrieval scheme, CLUster-based rEtrieval of images by unsupervised learning (CLUE), which attempts to tackle the semantic gap problem based on a hypothesis that images of the same semantics are similar in a way, images of di#erent semantics are di#erent in their own ways. CLUE attempts to capture semantic concepts by learning the way that images of the same semantics are similar and retrieving image clusters instead of a set of ordered images. Clustering in CLUE is dynamic. In particular, clusters formed depend on which images are retrieved in response to the query. Therefore, the clusters give the algorithm as well as the users semantic relevant clues as to where to navigate. CLUE is a general approach that can be combined with any real-valued symmetric similarity measure (metric or nonmetric). Thus it may be embedded in many current CBIR systems. An experimental image retrieval system using CLUE has been implemented. The performance of the system is evaluated on a database of about 60, 000 images from COREL. Empirical results demonstrate improved performance compared with a typical CBIR system using the same image similarity measure. In addition, preliminary results on images returned by Google's Image Search reveal the potential of applying CLUE to real world image data and integrating CLUE as a part of the interface for keyword-based image retrieval systems.
Universal Analytical Forms for Modeling Image Probabilities
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2002
"... Seeking probability models for images, we employ a spectral approach where the images are decomposed using bandpass filters and probability models are imposed on the filter outputs (also called spectral components). We employ a (two-parameter) family of probability densities, introduced in [11] an ..."
Abstract
-
Cited by 31 (8 self)
- Add to MetaCart
Seeking probability models for images, we employ a spectral approach where the images are decomposed using bandpass filters and probability models are imposed on the filter outputs (also called spectral components). We employ a (two-parameter) family of probability densities, introduced in [11] and called Bessel K forms, for modeling the marginal densities of the spectral components, and demonstrate their fit to the observed histograms for video, infrared, and range images. Motivated by object-based models for image analysis, a relationship between the Bessel parameters and the imaged objects is established. Using 2-metric on the set of Bessel K forms, we propose a pseudometric on the image space for quantifying image similarities/differences. Some applications, including clutter classification and pruning of hypotheses for target recognition, are presented.
Optimal cluster preserving embedding of nonmetric proximity data
- IEEE Trans. Pattern Analysis and Machine Intelligence
, 2003
"... Abstract—For several major applications of data analysis, objects are often not represented as feature vectors in a vector space, but rather by a matrix gathering pairwise proximities. Such pairwise data often violates metricity and, therefore, cannot be naturally embedded in a vector space. Concern ..."
Abstract
-
Cited by 31 (3 self)
- Add to MetaCart
Abstract—For several major applications of data analysis, objects are often not represented as feature vectors in a vector space, but rather by a matrix gathering pairwise proximities. Such pairwise data often violates metricity and, therefore, cannot be naturally embedded in a vector space. Concerning the problem of unsupervised structure detection or clustering, in this paper, a new embedding method for pairwise data into Euclidean vector spaces is introduced. We show that all clustering methods, which are invariant under additive shifts of the pairwise proximities, can be reformulated as grouping problems in Euclidian spaces. The most prominent property of this constant shift embedding framework is the complete preservation of the cluster structure in the embedding space. Restating pairwise clustering problems in vector spaces has several important consequences, such as the statistical description of the clusters by way of cluster prototypes, the generic extension of the grouping procedure to a discriminative prediction rule, and the applicability of standard preprocessing methods like denoising or dimensionality reduction. Index Terms—Clustering, pairwise proximity data, cost function, embedding, MDS. 1
Locally Linear Discriminant Analysis for Multimodally Distributed Classes for Face Recognition with a Single Model Image
- IEEE Trans. Pattern Analysis and Machine Intelligence
, 2005
"... Abstract—We present a novel method of nonlinear discriminant analysis involving a set of locally linear transformations called “Locally Linear Discriminant Analysis (LLDA). ” The underlying idea is that global nonlinear data structures are locally linear and local structures can be linearly aligned. ..."
Abstract
-
Cited by 30 (2 self)
- Add to MetaCart
Abstract—We present a novel method of nonlinear discriminant analysis involving a set of locally linear transformations called “Locally Linear Discriminant Analysis (LLDA). ” The underlying idea is that global nonlinear data structures are locally linear and local structures can be linearly aligned. Input vectors are projected into each local feature space by linear transformations found to yield locally linearly transformed classes that maximize the between-class covariance while minimizing the within-class covariance. In face recognition, linear discriminant analysis (LDA) has been widely adopted owing to its efficiency, but it does not capture nonlinear manifolds of faces which exhibit pose variations. Conventional nonlinear classification methods based on kernels such as generalized discriminant analysis (GDA) and support vector machine (SVM) have been developed to overcome the shortcomings of the linear method, but they have the drawback of high computational cost of classification and overfitting. Our method is for multiclass nonlinear discrimination and it is computationally highly efficient as compared to GDA. The method does not suffer from overfitting by virtue of the linear base structure of the solution. A novel gradient-based learning algorithm is proposed for finding the optimal set of local linear bases. The optimization does not exhibit a local-maxima problem. The transformation functions facilitate robust face recognition in a low-dimensional subspace, under pose variations, using a single model image. The classification results are given for both synthetic and real face data. Index Terms—Linear discriminant analysis, generalized discriminant analysis, support vector machine, dimensionality reduction, face recognition, feature extraction, pose invariance, subspace representation. æ 1
Locality Preserving Indexing for Document Representation
- In Proc. of the 27rd ACM SIGIR
, 2004
"... Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Indexing (LSI) is considered effective in deriving such an indexing. LSI essentially detects the most representative features ..."
Abstract
-
Cited by 24 (12 self)
- Add to MetaCart
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Indexing (LSI) is considered effective in deriving such an indexing. LSI essentially detects the most representative features for document representation rather than the most discriminative features. Therefore, LSI might not be optimal in discriminating documents with different semantics. In this paper, a novel algorithm called Locality Preserving Indexing (LPI) is proposed for document indexing. Each document is represented by a vector with low dimensionality. In contrast to LSI which discovers the global structure of the document space, LPI discovers the local structure and obtains a compact document representation subspace that best detects the essential semantic structure. We compare the proposed LPI approach with LSI on two standard databases. Experimental results show that LPI provides better representation in the sense of semantic structure.

