Robust principal component analysis?
Journal of the ACM, 2011
Cited by 569 (26 self)
This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the ℓ1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
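For reference, the convex program named in this abstract can be written as below. The symbols M (the observed n1 × n2 data matrix), L (the low-rank part), S (the sparse part), and the weight λ are notation introduced here for illustration. The value of λ shown is the one analyzed in the paper itself; it is not stated in the abstract.

```latex
\begin{aligned}
\min_{L,\,S}\quad & \|L\|_{*} + \lambda \|S\|_{1} \\
\text{subject to}\quad & L + S = M,
\end{aligned}
\qquad \lambda = \frac{1}{\sqrt{\max(n_1, n_2)}}
```

Here \|L\|_{*} is the nuclear norm (the sum of the singular values of L) and \|S\|_{1} is the entrywise ℓ1 norm (the sum of the absolute values of the entries of S).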
Stable principal component pursuit
In Proc. of International Symposium on Information Theory, 2010
Cited by 94 (3 self)
We consider the problem of recovering a target matrix that is a superposition of low-rank and sparse components, from a small set of linear measurements. This problem arises in compressed sensing of structured high-dimensional signals such as videos and hyperspectral images, as well as in the analysis of transformation invariant low-rank structure recovery. We analyze the performance of the natural convex heuristic for solving this problem, under the assumption that measurements are chosen uniformly at random. We prove that this heuristic exactly recovers low-rank and sparse terms, provided the number of observations exceeds the number of intrinsic degrees of freedom of the component signals by a polylogarithmic factor. Our analysis introduces several ideas that may be of independent interest for the more general problem of compressed sensing and decomposing superpositions of multiple structured signals.
Restricted strong convexity and weighted matrix completion: Optimal bounds with noise
2012
Cited by 84 (10 self)
We consider the matrix completion problem under a form of row/column weighted entrywise sampling, including the case of uniform entrywise sampling as a special case. We analyze the associated random observation operator, and prove that with high probability, it satisfies a form of restricted strong convexity with respect to weighted Frobenius norm. Using this property, we obtain as corollaries a number of error bounds on matrix completion in the weighted Frobenius norm under noisy sampling and for both exact and near low-rank matrices. Our results are based on measures of the “spikiness” and “low-rankness” of matrices that are less restrictive than the incoherence conditions imposed in previous work. Our technique involves an M-estimator that includes controls on both the rank and spikiness of the solution, and we establish non-asymptotic error bounds in weighted Frobenius norm for recovering matrices lying within ℓq-“balls” of bounded spikiness. Using information-theoretic methods, we show that no algorithm can achieve better estimates (up to a logarithmic factor) over these same sets, showing that our conditions on matrices and associated rates are essentially optimal.
High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity
2011
Cited by 75 (10 self)
Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependencies. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing, and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently nonconvex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing nonconvex programs, we are able to both analyze the statistical error associated with any global optimum, and prove that a simple projected gradient descent algorithm will converge in polynomial time to a small neighborhood of the set of global minimizers. On the statistical side, we provide non-asymptotic bounds that hold with high probability for the cases of noisy, missing, and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm will converge at geometric rates to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing agreement with the predicted scalings.
One-bit compressed sensing by linear programming
2011
Cited by 57 (5 self)
We give the first computationally tractable and almost optimal solution to the problem of one-bit compressed sensing, showing how to accurately recover an s-sparse vector x ∈ R^n from the signs of O(s log²(n/s)) random linear measurements of x. The recovery is achieved by a simple linear program. This result extends to approximately sparse vectors x. Our result is universal in the sense that with high probability, one measurement scheme will successfully recover all sparse vectors simultaneously. The argument is based on solving an equivalent geometric problem on random hyperplane tessellations.
Nonparametric independence screening in sparse ultra-high dimensional additive models
2010
Cited by 54 (8 self)
A variable screening procedure via correlation learning was proposed by Fan and Lv (2008) to reduce dimensionality in sparse ultrahigh-dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To address this issue, we further extend the correlation learning to marginal nonparametric learning. Our nonparametric independence screening (NIS) is a specific type of sure independence screening. We propose several closely related variable screening procedures. We show that with general nonparametric models, under some mild technical conditions, the proposed independence screening methods have a sure screening property. The extent to which the dimensionality can be reduced by independence screening is also explicitly quantified. As a methodological extension, we also propose a data-driven thresholding and an iterative nonparametric independence screening (INIS) method to enhance the finite sample performance for fitting sparse additive models. The simulation results and a real data analysis demonstrate that the proposed procedure works well with moderate sample size and large dimension and performs better than competing methods.
Compressive Sensing
2010
Cited by 50 (12 self)
Compressive sensing is a new type of sampling theory, which predicts that sparse signals and images can be reconstructed from what was previously believed to be incomplete information. As a main feature, efficient algorithms such as ℓ1-minimization can be used for recovery. The theory has many potential applications in signal processing and imaging. This chapter gives an introduction and overview on both theoretical and numerical aspects of compressive sensing.
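The ℓ1-minimization mentioned in this abstract is commonly posed as the program below, often called basis pursuit. The names A (an m × n measurement matrix with m < n), y (the vector of measurements), and z (the optimization variable) are notation assumed here, not taken from the chapter.

```latex
\min_{z \in \mathbb{R}^{n}} \ \|z\|_{1} \quad \text{subject to} \quad A z = y
```

In the real-valued case the objective is piecewise linear and the constraint is linear, so the problem can be recast as a linear program.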
Clustering partially observed graphs via convex optimization.
Journal of Machine Learning Research, 2014
Cited by 47 (13 self)
This paper considers the problem of clustering a partially observed unweighted graph, i.e., one where for some node pairs we know there is an edge between them, for some others we know there is no edge, and for the remaining we do not know whether or not there is an edge. We want to organize the nodes into disjoint clusters so that there is relatively dense (observed) connectivity within clusters, and sparse across clusters. We take a novel yet natural approach to this problem, by focusing on finding the clustering that minimizes the number of "disagreements", i.e., the sum of the number of (observed) missing edges within clusters, and (observed) present edges across clusters. Our algorithm uses convex optimization; its basis is a reduction of disagreement minimization to the problem of recovering an (unknown) low-rank matrix and an (unknown) sparse matrix from their partially observed sum. We evaluate the performance of our algorithm on the classical Planted Partition/Stochastic Block Model. Our main theorem provides sufficient conditions for the success of our algorithm as a function of the minimum cluster size, edge density and observation probability; in particular, the results characterize the tradeoff between the observation probability and the edge density gap. When there are a constant number of clusters of equal size, our results are optimal up to logarithmic factors.
Robust 1-bit compressed sensing and sparse logistic regression: A convex programming approach. Preprint. Available at http://arxiv.org/abs/1202.1212
Cited by 44 (4 self)
This paper develops theoretical results regarding noisy 1-bit compressed sensing and sparse binomial regression. We demonstrate that a single convex program gives an accurate estimate of the signal, or coefficient vector, for both of these models. We show that an s-sparse signal in R^n can be accurately estimated from m = O(s log(n/s)) single-bit measurements using a simple convex program. This remains true even if each measurement bit is flipped with probability nearly 1/2. Worst-case (adversarial) noise can also be accounted for, and uniform results that hold for all sparse inputs are derived as well. In the terminology of sparse logistic regression, we show that O(s log(2n/s)) Bernoulli trials are sufficient to estimate a coefficient vector in R^n which is approximately s-sparse. Moreover, the same convex program works for virtually all generalized linear models, in which the link function may be unknown. To our knowledge, these are the first results that tie together the theory of sparse logistic regression to 1-bit compressed sensing. Our results apply to general signal structures aside from sparsity; one only needs to know the size of the set K where signals reside. The size is given by the mean width of K, a computable quantity whose square serves as a robust extension of the dimension.
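To make the 1-bit measurement model concrete, here is a minimal sketch in plain Python. It is not the convex program from the paper, which the abstract does not spell out. It swaps in a simpler heuristic: correlate the sign measurements with the measurement vectors, keep the s largest-magnitude coordinates, and normalize. The Gaussian measurement vectors, the function names, and the parameters n, s, m, and seed are all assumptions made for illustration. Because signs discard magnitude, only the direction of the signal can be estimated, so the sketch reports cosine similarity with a unit-norm signal.

```python
import math
import random


def one_bit_measurements(x, m, rng):
    """Draw m Gaussian vectors a_i (an assumption) and record sign(<a_i, x>)."""
    n = len(x)
    rows, signs = [], []
    for _ in range(m):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        dot = sum(ai * xi for ai, xi in zip(a, x))
        rows.append(a)
        signs.append(1.0 if dot >= 0 else -1.0)
    return rows, signs


def estimate_direction(rows, signs, s):
    """Heuristic estimate: correlate, keep the s largest entries, normalize."""
    m, n = len(rows), len(rows[0])
    # z_j = (1/m) * sum_i y_i * a_ij, the empirical correlation per coordinate.
    z = [sum(signs[i] * rows[i][j] for i in range(m)) / m for j in range(n)]
    keep = sorted(range(n), key=lambda j: abs(z[j]), reverse=True)[:s]
    x_hat = [0.0] * n
    for j in keep:
        x_hat[j] = z[j]
    norm = math.sqrt(sum(v * v for v in x_hat))
    return [v / norm for v in x_hat]


def demo(n=100, s=5, m=1000, seed=0):
    """Return the cosine similarity between a random s-sparse unit signal and its estimate."""
    rng = random.Random(seed)
    support = rng.sample(range(n), s)
    x = [0.0] * n
    for j in support:
        x[j] = rng.choice([-1.0, 1.0]) / math.sqrt(s)  # unit-norm s-sparse signal
    rows, signs = one_bit_measurements(x, m, rng)
    x_hat = estimate_direction(rows, signs, s)
    return sum(a * b for a, b in zip(x, x_hat))


if __name__ == "__main__":
    print(f"cosine similarity: {demo():.3f}")
```

Calling `demo()` with a larger `m` should push the similarity closer to 1, while this hard-thresholding shortcut carries none of the guarantees claimed for the convex program in the abstract.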
Optimal detection of sparse principal components in high dimension
2013
Cited by 42 (4 self)
We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near optimal detection levels, and it performs well on simulated datasets. Moreover, using polynomial time reductions from theoretical computer science, we bring significant evidence that our results cannot be improved, thus revealing an inherent trade-off between statistical and computational performance.