Results 1–10 of 97
See All by Looking at A Few: Sparse Modeling for Finding Representative Objects
Cited by 24 (3 self)
Abstract:
We consider the problem of finding a few representatives for a dataset, i.e., a subset of data points that efficiently describes the entire dataset. We assume that each data point can be expressed as a linear combination of the representatives and formulate the problem of finding the representatives as a sparse multiple measurement vector problem. In our formulation, both the dictionary and the measurements are given by the data matrix, and the unknown sparse codes select the representatives via convex optimization. In general, we do not assume that the data are low-rank or distributed around cluster centers. When the data do come from a collection of low-rank models, we show that our method automatically selects a few representatives from each low-rank model. We also analyze the geometry of the representatives and discuss their relationship to the vertices of the convex hull of the data. We show that our framework can be extended to detect and reject outliers in datasets, and to efficiently deal with new observations and large datasets. The proposed framework and theoretical foundations are illustrated with examples in video summarization and image classification using representatives.
Robust Subspace Clustering
, 2013
Cited by 22 (1 self)
Abstract:
Subspace clustering refers to the task of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space. This paper introduces an algorithm inspired by sparse subspace clustering (SSC) [17] to cluster noisy data, and develops some novel theory demonstrating its correctness. In particular, the theory uses ideas from geometric functional analysis to show that the algorithm can accurately recover the underlying subspaces under minimal requirements on their orientation, and on the number of samples per subspace. Synthetic as well as real data experiments complement our theoretical study, illustrating our approach and demonstrating its effectiveness.
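The SSC machinery this line of work builds on can be sketched directly: each point is regressed on all the others with an ℓ1 penalty (the noise-robust variants replace SSC's equality constraint with exactly this Lasso form), and the magnitudes of the coefficients define an affinity for spectral clustering. A minimal numpy sketch under assumptions of ours, not the paper's setup: the ISTA solver, λ, and the two-orthogonal-subspace toy data are illustrative.

```python
import numpy as np

def ssc_affinity(X, lam=0.05, n_iter=300):
    """Lasso-style SSC: code each column of X over the others, then symmetrize.
    Solves min_c 0.5||x_j - X c||^2 + lam ||c||_1 subject to c_j = 0, via ISTA."""
    N = X.shape[1]
    C = np.zeros((N, N))
    t = 1.0 / np.linalg.norm(X, 2) ** 2
    for j in range(N):
        c = np.zeros(N)
        for _ in range(n_iter):
            g = c - t * (X.T @ (X @ c - X[:, j]))
            c = np.sign(g) * np.maximum(np.abs(g) - lam * t, 0.0)  # soft-threshold
            c[j] = 0.0                        # a point may not represent itself
        C[:, j] = c
    return np.abs(C) + np.abs(C).T            # symmetric affinity for spectral clustering

# Toy data: two orthogonal 3-dimensional subspaces in R^12, 15 unit-norm points each.
rng = np.random.default_rng(1)
X = np.zeros((12, 30))
X[0:3, :15] = rng.standard_normal((3, 15))
X[6:9, 15:] = rng.standard_normal((3, 15))
X /= np.linalg.norm(X, axis=0)
W = ssc_affinity(X)
within = W[:15, :15].sum() + W[15:, 15:].sum()
cross = W[:15, 15:].sum() + W[15:, :15].sum()
```

For these orthogonal subspaces the cross-subspace affinities vanish, so the affinity matrix is block diagonal and spectral clustering on W recovers the two groups.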
Segmentation of moving objects by long term video analysis
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Noisy sparse subspace clustering
 In International Conference on Machine Learning
, 2013
Cited by 14 (3 self)
Abstract:
This paper considers the problem of subspace clustering under noise. Specifically, we study the behavior of Sparse Subspace Clustering (SSC) when either adversarial or random noise is added to the unlabelled input data points, which are assumed to lie in a union of low-dimensional subspaces. We show that a modified version of SSC is provably effective in correctly identifying the underlying subspaces, even with noisy data. This extends the theoretical guarantees for this algorithm to the practical setting and helps justify the success of SSC in a class of real applications.
Greedy Feature Selection for Subspace Clustering
Cited by 10 (2 self)
Abstract:
Unions of subspaces are a powerful nonlinear signal model for collections of high-dimensional data. In order to leverage existing methods that exploit this unique signal structure, the subspaces that signals of interest occupy must be known a priori or learned directly from data. In this work, we analyze the performance of greedy feature selection strategies for learning unions of subspaces from ensembles of high-dimensional data. We develop sufficient conditions that are required for orthogonal matching pursuit (OMP) to select subsets of points from the ensemble that live in the same subspace, a property which we refer to as exact feature selection (EFS). These conditions highlight the link between the sampling of each subspace in the ensemble and the geometry between pairs of subspaces in order to guarantee EFS. Following this analysis, we provide an empirical study of greedy feature selection strategies and characterize the gap between OMP and nearest-neighbor-based approaches. We find that the gap between these two methods is particularly pronounced when the tiling of subspaces in the ensemble is sparse, suggesting that OMP can be used in a number of regimes where nearest neighbor approaches fail to reveal the subspace affinity between points in the ensemble.
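The greedy selection step the abstract analyzes can be sketched concretely: for a given point, OMP repeatedly picks the other point most correlated with the current residual and re-fits by least squares; EFS holds when every picked index comes from the same subspace as the query. A numpy sketch under illustrative assumptions (the orthogonal two-subspace toy data and the stopping rule k=3 are ours, not the paper's).

```python
import numpy as np

def omp_select(X, j, k):
    """Select k 'features' (other points) for column j by orthogonal matching
    pursuit: greedily add the point most correlated with the residual, then
    re-fit on the support by least squares. Exact feature selection (EFS) holds
    when every selected index lies in the same subspace as point j."""
    x = X[:, j]
    support, r = [], x.copy()
    for _ in range(k):
        corr = np.abs(X.T @ r)
        corr[[j] + support] = 0.0             # never pick j itself or repeats
        i = int(np.argmax(corr))
        support.append(i)
        coef, *_ = np.linalg.lstsq(X[:, support], x, rcond=None)
        r = x - X[:, support] @ coef          # residual is orthogonal to the picks
    return support

# Toy data: points 0-14 span one 3-dimensional subspace of R^12, points 15-29 another,
# and the two subspaces are orthogonal, so EFS should hold for every point.
rng = np.random.default_rng(2)
X = np.zeros((12, 30))
X[0:3, :15] = rng.standard_normal((3, 15))
X[6:9, 15:] = rng.standard_normal((3, 15))
X /= np.linalg.norm(X, axis=0)
support = omp_select(X, j=0, k=3)
```

Here point 0 lives in the first subspace, so EFS predicts that all three selected indices fall in the range 0–14.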
Robust subspace clustering via thresholding. arXiv preprint arXiv:1307.4891
, 2013
Cited by 10 (3 self)
Abstract:
The problem of clustering noisy and incompletely observed high-dimensional data points into a union of low-dimensional subspaces and a set of outliers is considered. The number of subspaces, their dimensions, and their orientations are assumed unknown. We propose a simple low-complexity subspace clustering algorithm, which applies spectral clustering to an adjacency matrix obtained by thresholding the correlations between data points. In other words, the adjacency matrix is constructed from the nearest neighbors of each data point in spherical distance. A statistical performance analysis shows that the algorithm succeeds even when the subspaces intersect and that it exhibits robustness to additive noise. Specifically, our results reveal an explicit tradeoff between the affinity of the subspaces and the tolerable noise level. We furthermore prove that the algorithm succeeds even when the data points are incompletely observed with the number of missing entries allowed to be (up to a log-factor) linear in the ambient dimension. We also propose a simple scheme that provably detects outliers, and we present numerical results on real and synthetic data.
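The thresholding step is simple enough to sketch in a few lines: for every point keep only its q largest absolute correlations (its nearest neighbors in spherical distance), symmetrize, and hand the resulting adjacency matrix to spectral clustering. A numpy sketch; q and the orthogonal two-subspace toy data are illustrative assumptions of ours.

```python
import numpy as np

def tsc_adjacency(X, q):
    """Thresholding-based subspace clustering, graph-building step: keep, for each
    point, its q largest |inner products| (nearest neighbors in spherical
    distance), then symmetrize into an adjacency matrix."""
    N = X.shape[1]
    G = np.abs(X.T @ X)
    np.fill_diagonal(G, 0.0)                  # a point is not its own neighbor
    A = np.zeros((N, N))
    for i in range(N):
        nn = np.argsort(G[i])[-q:]            # indices of the q nearest neighbors
        A[i, nn] = G[i, nn]
    return np.maximum(A, A.T)                 # symmetric adjacency for spectral clustering

# Toy data: two orthogonal 3-dimensional subspaces in R^12, 15 unit-norm points each.
rng = np.random.default_rng(3)
X = np.zeros((12, 30))
X[0:3, :15] = rng.standard_normal((3, 15))
X[6:9, 15:] = rng.standard_normal((3, 15))
X /= np.linalg.norm(X, axis=0)
A = tsc_adjacency(X, q=3)
```

Because cross-subspace correlations are exactly zero for orthogonal subspaces, every selected neighbor is a within-subspace point and the adjacency matrix comes out block diagonal.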
Low Rank Subspace Clustering (LRSC)
, 2013
Cited by 9 (1 self)
Abstract:
We consider the problem of fitting one or more subspaces to a collection of data points drawn from the subspaces and corrupted by noise and/or gross errors. We pose this problem as a nonconvex optimization problem, where the goal is to decompose the corrupted data matrix as the sum of a clean and self-expressive dictionary plus a matrix of noise and/or gross errors. By self-expressive we mean a dictionary whose atoms can be expressed as linear combinations of themselves with low-rank coefficients. In the case of noisy data, our key contribution is to show that this nonconvex matrix decomposition problem can be solved in closed form from the SVD of the noisy data matrix. The solution involves a novel polynomial thresholding operator on the singular values of the data matrix, which requires minimal shrinkage. For one subspace, a particular case of our framework leads to classical PCA, which requires no shrinkage. For multiple subspaces, the low-rank coefficients obtained by our framework can be used to construct a data affinity matrix from which the clustering of the data according to the subspaces can be obtained by spectral clustering. In the case of data corrupted by gross errors, we solve the problem using an alternating minimization approach, which combines our polynomial thresholding operator with the more traditional shrinkage-thresholding operator. Experiments on motion segmentation and face clustering show that our framework performs on par with state-of-the-art techniques at a reduced computational cost.
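The noise-free special case of this closed-form structure is easy to illustrate: for clean rank-r data with SVD D = UΣVᵀ, the low-rank self-expressive coefficient matrix is C = V_r V_rᵀ, which reproduces the data exactly. The sketch below shows only this clean case with the rank assumed known; the paper's polynomial thresholding operator for noisy data is not reproduced here.

```python
import numpy as np

def lrsc_clean(D, r):
    """Noise-free, low-rank self-expressive coefficients: with D of rank r and
    SVD D = U S V^T, C = V_r V_r^T is symmetric, rank r, and satisfies D C = D."""
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    Vr = Vt[:r].T
    return Vr @ Vr.T

# Rank-4 data: 40 points in a 4-dimensional subspace of R^20.
rng = np.random.default_rng(4)
D = rng.standard_normal((20, 4)) @ rng.standard_normal((4, 40))
C = lrsc_clean(D, r=4)
recon_err = np.linalg.norm(D @ C - D)
```

In the multi-subspace setting, |C| then serves as the affinity matrix fed to spectral clustering; with noise, the truncation above is replaced by the paper's polynomial thresholding of the singular values.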
Scalable sparse subspace clustering
 CVPR
Cited by 6 (2 self)
Abstract:
In this paper, we address two problems in the Sparse Subspace Clustering (SSC) algorithm: the scalability issue and the out-of-sample problem. SSC constructs a sparse similarity graph for spectral clustering using ℓ1-minimization-based coefficients, and has achieved state-of-the-art results for image clustering and motion segmentation. However, the time complexity of SSC is cubic in the problem size, which makes it inefficient to apply in large-scale settings. Moreover, SSC does not handle out-of-sample data that were not used to construct the similarity graph. For each new datum, SSC must recompute the cluster memberships of the whole data set, which makes it uncompetitive for fast online clustering. To address these problems, this paper proposes an out-of-sample extension of SSC, named Scalable Sparse Subspace Clustering (SSSC), which makes it feasible to cluster large-scale data sets. SSSC adopts a "sampling, clustering, coding, and classifying" strategy. Extensive experimental results on several popular data sets demonstrate the effectiveness and efficiency of our method compared with state-of-the-art algorithms.
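The "coding and classifying" half of that strategy can be sketched as follows: a new point is coded over the already-clustered in-sample points, and assigned to the cluster whose coefficients reconstruct it best. This numpy sketch uses ridge regression for the coding step and a per-cluster residual for the classification; the exact coding scheme in SSSC may differ, and the toy data are an assumption of ours.

```python
import numpy as np

def classify_out_of_sample(X_in, labels, x_new, eps=1e-6):
    """SSSC-style out-of-sample step (sketch): code x_new over the in-sample data
    by ridge regression, then assign it to the cluster whose coefficients give
    the smallest reconstruction residual."""
    N = X_in.shape[1]
    c = np.linalg.solve(X_in.T @ X_in + eps * np.eye(N), X_in.T @ x_new)
    residuals = []
    for k in np.unique(labels):
        ck = np.where(labels == k, c, 0.0)    # keep only cluster k's coefficients
        residuals.append(np.linalg.norm(x_new - X_in @ ck))
    return int(np.argmin(residuals))

# In-sample data: two orthogonal subspaces, already clustered; the new point
# lies in the first subspace, so it should receive label 0.
rng = np.random.default_rng(5)
X_in = np.zeros((12, 30))
X_in[0:3, :15] = rng.standard_normal((3, 15))
X_in[6:9, 15:] = rng.standard_normal((3, 15))
X_in /= np.linalg.norm(X_in, axis=0)
labels = np.array([0] * 15 + [1] * 15)
x_new = np.zeros(12)
x_new[0:3] = rng.standard_normal(3)
label = classify_out_of_sample(X_in, labels, x_new)
```

Only the small sampled subset ever goes through the expensive ℓ1-graph construction; every remaining point is handled by this cheap coding-and-classifying step, which is what makes the scheme scale.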
Correlation adaptive subspace segmentation by trace lasso
 In ICCV
, 2013
Cited by 6 (2 self)
Abstract:
This paper studies the subspace segmentation problem. Given a set of data points drawn from a union of subspaces, the goal is to partition them according to the subspaces they were drawn from. The spectral clustering method is used as the framework. It requires an affinity matrix that is close to block diagonal, with nonzero entries corresponding to pairs of data points from the same subspace. In this work, we argue that both sparsity and the grouping effect are important for subspace segmentation. A sparse affinity matrix tends to be block diagonal, with fewer connections between data points from different subspaces. The grouping effect ensures that highly correlated data, which usually come from the same subspace, can be grouped together. Sparse Subspace Clustering (SSC), using ℓ1-minimization, encourages sparsity in data selection, but it lacks the grouping effect. On the contrary, Low-Rank Representation (LRR), by rank minimization, and Least Squares Regression (LSR), by ℓ2-regularization, exhibit a strong grouping effect, but they fall short in subset selection. Thus the affinity matrix obtained by SSC is usually very sparse, while those obtained by LRR and LSR are very dense. In this work, we propose the Correlation Adaptive Subspace Segmentation (CASS) method based on the trace Lasso. CASS is a data-correlation-dependent method that simultaneously performs automatic data selection and groups correlated data together. It can be regarded as a method that adaptively balances SSC and LSR. Both theoretical and experimental results show the effectiveness of CASS.
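The adaptive balance comes from the trace Lasso itself: Ω(w) = ‖X diag(w)‖_* equals the ℓ1 norm of w when the columns of X are orthonormal (uncorrelated data, SSC-like behavior) and the ℓ2 norm when the columns are identical (fully correlated data, LSR-like behavior). A small numpy check of this interpolation; the matrices below are illustrative.

```python
import numpy as np

def trace_lasso(X, w):
    """Trace Lasso regularizer: the nuclear norm of X diag(w). It interpolates
    between ||w||_1 (orthonormal columns) and ||w||_2 (identical columns)."""
    return np.linalg.norm(X @ np.diag(w), ord="nuc")

rng = np.random.default_rng(6)
w = rng.standard_normal(5)

I = np.eye(5)                                  # orthonormal columns -> l1 behavior
x = rng.standard_normal(5)
x /= np.linalg.norm(x)
R = np.tile(x[:, None], (1, 5))                # identical unit columns -> l2 behavior

l1_case = trace_lasso(I, w)
l2_case = trace_lasso(R, w)
```

In CASS this regularizer is applied to each point's self-expression coefficients, so the penalty automatically tightens toward sparsity or toward grouping depending on how correlated the data actually are.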
Efficient higher-order clustering on the Grassmann manifold
 In ICCV
, 2013
Cited by 5 (0 self)
Abstract:
The higher-order clustering problem arises when data is drawn from multiple subspaces or when observations fit a higher-order parametric model. Most solutions to this problem either decompose higher-order similarity measures for use in spectral clustering or explicitly use low-rank matrix representations. In this paper we present our approach of Sparse Grassmann Clustering (SGC) that combines attributes of both categories. While we decompose the higher-order similarity tensor, we cluster data by directly finding a low-dimensional representation without explicitly building a similarity matrix. By exploiting recent advances in online estimation on the Grassmann manifold (GROUSE) we develop an efficient and accurate algorithm that works with individual columns of similarities or partial observations thereof. Since it avoids the storage and decomposition of large similarity matrices, our method is efficient, scalable and has low memory requirements even for large-scale data. We demonstrate the performance of our SGC method on a variety of segmentation problems including planar segmentation of Kinect depth maps and motion segmentation of the Hopkins 155 dataset for which we achieve performance comparable to the state-of-the-art.
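The online Grassmann machinery the method relies on processes one similarity column at a time: each incoming vector triggers a rank-one geodesic rotation of the current orthonormal basis. Below is a sketch of a single GROUSE-style update in the fully observed case; the step size η and the random data are assumptions of ours, and the actual algorithm also handles partially observed columns.

```python
import numpy as np

def grouse_step(U, v, eta=0.1):
    """One GROUSE-style update (fully observed case): rotate the orthonormal
    basis U toward the new vector v along a Grassmann geodesic. The update is
    rank-one and preserves the orthonormality of U exactly."""
    w = U.T @ v                                # weights of v on the current subspace
    p = U @ w                                  # projection of v onto span(U)
    r = v - p                                  # residual outside the subspace
    sigma = np.linalg.norm(r) * np.linalg.norm(p)
    if sigma < 1e-12:
        return U                               # v already (almost) lies in span(U)
    theta = eta * sigma
    step = ((np.cos(theta) - 1.0) * p / np.linalg.norm(p)
            + np.sin(theta) * r / np.linalg.norm(r))
    return U + np.outer(step, w / np.linalg.norm(w))

# A random 4-dimensional orthonormal basis in R^20, updated with one new vector.
rng = np.random.default_rng(7)
U, _ = np.linalg.qr(rng.standard_normal((20, 4)))
v = rng.standard_normal(20)
U_new = grouse_step(U, v)
```

Because each update touches only one column of similarities and a small basis matrix, the full similarity matrix is never stored or decomposed, which is the source of the method's low memory footprint.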