Results 1  10
of
214
The Convex Geometry of Linear Inverse Problems
, 2010
"... In applications throughout science and engineering one is often faced with the challenge of solving an illposed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However in many practical situations of interest, models are constr ..."
Abstract

Cited by 189 (20 self)
 Add to MetaCart
In applications throughout science and engineering one is often faced with the challenge of solving an illposed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However in many practical situations of interest, models are constrained structurally so that they only have a few degrees of freedom relative to their ambient dimension. This paper provides a general framework to convert notions of simplicity into convex penalty functions, resulting in convex optimization solutions to linear, underdetermined inverse problems. The class of simple models considered are those formed as the sum of a few atoms from some (possibly infinite) elementary atomic set; examples include wellstudied cases such as sparse vectors (e.g., signal processing, statistics) and lowrank matrices (e.g., control, statistics), as well as several others including sums of a few permutations matrices (e.g., ranked elections, multiobject tracking), lowrank tensors (e.g., computer vision, neuroscience), orthogonal matrices (e.g., machine learning), and atomic measures (e.g., system identification). The convex programming formulation is based on minimizing the norm induced by the convex hull of the atomic set; this norm is referred to as the atomic norm. The facial
Structured variable selection with sparsityinducing norms
, 2011
"... We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsityinducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual ℓ1norm and the group ℓ1norm by allowing the subsets to ov ..."
Abstract

Cited by 187 (27 self)
 Add to MetaCart
(Show Context)
We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsityinducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual ℓ1norm and the group ℓ1norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problems. We first explore the relationship between the groups defining the norm and the resulting nonzero patterns, providing both forward and backward algorithms to go back and forth from groups to patterns. This allows the design of norms adapted to specific prior knowledge expressed in terms of nonzero patterns. We also present an efficient active set algorithm, and analyze the consistency of variable selection for leastsquares linear regression in low and highdimensional settings.
Nuclear norm penalization and optimal rates for noisy low rank matrix completion.
 Annals of Statistics,
, 2011
"... AbstractThis paper deals with the trace regression model where n entries or linear combinations of entries of an unknown m1 × m2 matrix A0 corrupted by noise are observed. We propose a new nuclear norm penalized estimator of A0 and establish a general sharp oracle inequality for this estimator for ..."
Abstract

Cited by 107 (7 self)
 Add to MetaCart
(Show Context)
AbstractThis paper deals with the trace regression model where n entries or linear combinations of entries of an unknown m1 × m2 matrix A0 corrupted by noise are observed. We propose a new nuclear norm penalized estimator of A0 and establish a general sharp oracle inequality for this estimator for arbitrary values of n, m1, m2 under the condition of isometry in expectation. Then this method is applied to the matrix completion problem. In this case, the estimator admits a simple explicit form and we prove that it satisfies oracle inequalities with faster rates of convergence than in the previous works. They are valid, in particular, in the highdimensional setting m1m2 n. We show that the obtained rates are optimal up to logarithmic factors in a minimax sense and also derive, for any fixed matrix A0, a nonminimax lower bound on the rate of convergence of our estimator, which coincides with the upper bound up to a constant factor. Finally, we show that our procedure provides an exact recovery of the rank of A0 with probability close to 1. We also discuss the statistical learning setting where there is no underlying model determined by A0 and the aim is to find the best trace regression model approximating the data.
Adaptive forwardbackward greedy algorithm for learning sparse representations
 IEEE Trans. Inform. Theory
, 2011
"... Consider linear prediction models where the target function is a sparse linear combination of a set of basis functions. We are interested in the problem of identifying those basis functions with nonzero coefficients and reconstructing the target function from noisy observations. Two heuristics that ..."
Abstract

Cited by 101 (9 self)
 Add to MetaCart
(Show Context)
Consider linear prediction models where the target function is a sparse linear combination of a set of basis functions. We are interested in the problem of identifying those basis functions with nonzero coefficients and reconstructing the target function from noisy observations. Two heuristics that are widely used in practice are forward and backward greedy algorithms. First, we show that neither idea is adequate. Second, we propose a novel combination that is based on the forward greedy algorithm but takes backward steps adaptively whenever beneficial. We prove strong theoretical results showing that this procedure is effective in learning sparse representations. Experimental results support our theory. 1
Minimax rates of estimation for highdimensional linear regression over balls
, 2009
"... Abstract—Consider the highdimensional linear regression model,where is an observation vector, is a design matrix with, is an unknown regression vector, and is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating in eitherloss andprediction loss, assuming tha ..."
Abstract

Cited by 97 (19 self)
 Add to MetaCart
(Show Context)
Abstract—Consider the highdimensional linear regression model,where is an observation vector, is a design matrix with, is an unknown regression vector, and is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating in eitherloss andprediction loss, assuming that belongs to anball for some.Itisshown that under suitable regularity conditions on the design matrix, the minimax optimal rate inloss andprediction loss scales as. The analysis in this paper reveals that conditions on the design matrix enter into the rates forerror andprediction error in complementary ways in the upper and lower bounds. Our proofs of the lower bounds are information theoretic in nature, based on Fano’s inequality and results on the metric entropy of the balls, whereas our proofs of the upper bounds are constructive, involving direct analysis of least squares overballs. For the special case, corresponding to models with an exact sparsity constraint, our results show that although computationally efficientbased methods can achieve the minimax rates up to constant factors, they require slightly stronger assumptions on the design matrix than optimal algorithms involving leastsquares over theball. Index Terms—Compressed sensing, minimax techniques, regression analysis. I.
Estimation of (near) lowrank matrices with noise and highdimensional scaling
"... We study an instance of highdimensional statistical inference in which the goal is to use N noisy observations to estimate a matrix Θ ∗ ∈ R k×p that is assumed to be either exactly low rank, or “near ” lowrank, meaning that it can be wellapproximated by a matrix with low rank. We consider an Me ..."
Abstract

Cited by 95 (14 self)
 Add to MetaCart
We study an instance of highdimensional statistical inference in which the goal is to use N noisy observations to estimate a matrix Θ ∗ ∈ R k×p that is assumed to be either exactly low rank, or “near ” lowrank, meaning that it can be wellapproximated by a matrix with low rank. We consider an Mestimator based on regularization by the traceornuclearnormovermatrices, andanalyze its performance under highdimensional scaling. We provide nonasymptotic bounds on the Frobenius norm error that hold for a generalclassofnoisyobservationmodels,and apply to both exactly lowrank and approximately lowrank matrices. We then illustrate their consequences for a number of specific learning models, including lowrank multivariate or multitask regression, system identification in vector autoregressive processes, and recovery of lowrank matrices from random projections. Simulations show excellent agreement with the highdimensional scaling of the error predicted by our theory. 1.
Stable principal component pursuit
 In Proc. of International Symposium on Information Theory
, 2010
"... We consider the problem of recovering a target matrix that is a superposition of lowrank and sparse components, from a small set of linear measurements. This problem arises in compressed sensing of structured highdimensional signals such as videos and hyperspectral images, as well as in the analys ..."
Abstract

Cited by 94 (3 self)
 Add to MetaCart
(Show Context)
We consider the problem of recovering a target matrix that is a superposition of lowrank and sparse components, from a small set of linear measurements. This problem arises in compressed sensing of structured highdimensional signals such as videos and hyperspectral images, as well as in the analysis of transformation invariant lowrank structure recovery. We analyze the performance of the natural convex heuristic for solving this problem, under the assumption that measurements are chosen uniformly at random. We prove that this heuristic exactly recovers lowrank and sparse terms, provided the number of observations exceeds the number of intrinsic degrees of freedom of the component signals by a polylogarithmic factor. Our analysis introduces several ideas that may be of independent interest for the more general problem of compressed sensing and decomposing superpositions of multiple structured signals. 1
Restricted strong convexity and weighted matrix completion: Optimal bounds with noise
, 2012
"... We consider the matrix completion problem under a form of row/column weighted entrywise sampling, including the case of uniform entrywise sampling as a special case. We analyze the associated random observation operator, and prove that with high probability, it satisfies a form of restricted strong ..."
Abstract

Cited by 84 (10 self)
 Add to MetaCart
We consider the matrix completion problem under a form of row/column weighted entrywise sampling, including the case of uniform entrywise sampling as a special case. We analyze the associated random observation operator, and prove that with high probability, it satisfies a form of restricted strong convexity with respect to weighted Frobenius norm. Using this property, we obtain as corollaries a number of error bounds on matrix completion in the weighted Frobenius norm under noisy sampling and for both exact and near lowrank matrices. Our results are based on measures of the “spikiness” and “lowrankness” of matrices that are less restrictive than the incoherence conditions imposed in previous work. Our technique involves an Mestimator that includes controls on both the rank and spikiness of the solution, and we establish nonasymptotic error bounds in weighted Frobenius norm for recovering matrices lying with ℓq“balls ” of bounded spikiness. Using informationtheoretic methods, we show that no algorithm can achieve better estimates (up to a logarithmic factor) over these same sets, showing that our conditions on matrices and associated rates are essentially optimal.