Results 1-10 of 98
A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers
Estimation of (near) low-rank matrices with noise and high-dimensional scaling
Cited by 95 (14 self)
We study an instance of high-dimensional statistical inference in which the goal is to use N noisy observations to estimate a matrix Θ* ∈ R^{k×p} that is assumed to be either exactly low rank, or “near” low-rank, meaning that it can be well-approximated by a matrix with low rank. We consider an M-estimator based on regularization by the trace or nuclear norm over matrices, and analyze its performance under high-dimensional scaling. We provide nonasymptotic bounds on the Frobenius norm error that hold for a general class of noisy observation models, and apply to both exactly low-rank and approximately low-rank matrices. We then illustrate their consequences for a number of specific learning models, including low-rank multivariate or multitask regression, system identification in vector autoregressive processes, and recovery of low-rank matrices from random projections. Simulations show excellent agreement with the high-dimensional scaling of the error predicted by our theory.
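As an illustration of the nuclear-norm-regularized M-estimator mentioned in this abstract, the following is a minimal sketch for the multivariate regression case, solved by proximal gradient descent with singular-value soft-thresholding. The function names, the observation model Y = XΘ + noise with X of size N×k, the 1/(2N) loss scaling, and the fixed step size are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def svd_soft_threshold(M, tau):
    """Proximal operator of tau * (nuclear norm): shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nuclear_norm_regression(X, Y, lam, n_iter=500):
    """Hypothetical helper: proximal gradient for
        min_Theta  (1 / 2N) * ||Y - X Theta||_F^2 + lam * ||Theta||_*
    with X of shape (N, k), Y of shape (N, p), Theta of shape (k, p).
    """
    N = X.shape[0]
    # Step size 1/L, where L = sigma_max(X)^2 / N is the gradient's Lipschitz constant.
    step = N / np.linalg.norm(X, 2) ** 2
    Theta = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ Theta - Y) / N
        Theta = svd_soft_threshold(Theta - step * grad, step * lam)
    return Theta
```

A larger `lam` shrinks more singular values to zero and so yields a lower-rank estimate; how `lam` should scale with the noise level and dimensions is the subject of the paper's theory and is not reproduced here.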
Information-theoretic lower bounds on the oracle complexity of convex optimization
2010
Cited by 74 (11 self)
Despite a large literature on upper bounds on complexity of convex optimization, relatively less attention has been paid to the fundamental hardness of these problems. Given the extensive use of convex optimization in machine learning and statistics, gaining an understanding of these complexity-theoretic issues is important. In this paper, we study the complexity of stochastic convex optimization in an oracle model of computation. We improve upon known results and obtain tight minimax complexity estimates for various function classes. We also discuss implications of these results for understanding the inherent complexity of large-scale learning and estimation problems.
Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions
Annals of Statistics, 40(2):1171, 2013
Cited by 61 (8 self)
We analyze a class of estimators based on convex relaxation for solving high-dimensional matrix decomposition problems. The observations are noisy realizations of a linear transformation X of the sum of an (approximately) low rank matrix Θ⋆ with a second matrix Γ⋆ endowed with a complementary form of low-dimensional structure; this setup includes many statistical models of interest, including factor analysis, multitask regression and robust covariance estimation. We derive a general theorem that bounds the Frobenius norm error for an estimate of the pair (Θ⋆, Γ⋆) obtained by solving a convex optimization problem that combines the nuclear norm with a general decomposable regularizer. Our results use a “spikiness” condition that is related to, but milder than, singular vector incoherence. We specialize our general result to two cases that have been studied in past work: low rank plus an entrywise sparse matrix, and low rank plus a columnwise sparse matrix. For both models, our theory yields nonasymptotic Frobenius error bounds for both deterministic and stochastic noise matrices, and applies to matrices Θ⋆ that can be exactly or approximately low rank, and matrices Γ⋆ that can be exactly or approximately sparse. Moreover, for the case of stochastic noise matrices and the identity observation operator, we establish matching lower bounds on the minimax error. The sharpness of our nonasymptotic predictions is confirmed by numerical simulations.
Restricted Eigenvalue Properties for Correlated Gaussian Designs
Cited by 54 (5 self)
Methods based on ℓ1-relaxation, such as basis pursuit and the Lasso, are very popular for sparse regression in high dimensions. The conditions for success of these methods are now well-understood: (1) exact recovery in the noiseless setting is possible if and only if the design matrix X satisfies the restricted nullspace property, and (2) the squared ℓ2-error of a Lasso estimate decays at the minimax optimal rate (k log p)/n, where k is the sparsity of the p-dimensional regression problem with additive Gaussian noise, whenever the design satisfies a restricted eigenvalue condition. The key issue is thus to determine when the design matrix X satisfies these desirable properties. Thus far, there have been numerous results showing that the restricted isometry property, which implies both the restricted nullspace and eigenvalue conditions, is satisfied when all entries of X are independent and identically distributed (i.i.d.), or the rows are unitary. This paper proves directly that the restricted nullspace and eigenvalue conditions hold with high probability for quite general classes of Gaussian matrices for which the predictors may be highly dependent, and hence restricted isometry conditions can be violated with high probability. In this way, our results extend the attractive theoretical guarantees on ℓ1-relaxations to a much broader class of problems than the case of completely independent or unitary designs.
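To make the setting of this abstract concrete, here is a small sketch of the Lasso fitted by iterative soft-thresholding (ISTA) on a Gaussian design whose columns are correlated. The Toeplitz covariance with parameter `rho`, the problem sizes, the noise level, the value of `lam`, and all function names are assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    """Hypothetical helper: ISTA for
        min_beta  (1 / 2n) * ||y - X beta||_2^2 + lam * ||beta||_1
    """
    n = X.shape[0]
    step = n / np.linalg.norm(X, 2) ** 2  # 1/L for the least-squares term
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Correlated Gaussian design: rows drawn from N(0, Sigma), Sigma_ij = rho^|i-j| (assumed).
rng = np.random.default_rng(0)
n, p, k, rho = 200, 50, 3, 0.5
idx = np.arange(p)
Sigma = rho ** np.abs(np.subtract.outer(idx, idx))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

beta_star = np.zeros(p)
beta_star[:k] = 1.0                      # k-sparse ground truth
y = X @ beta_star + 0.1 * rng.standard_normal(n)

beta_hat = lasso_ista(X, y, lam=0.05)
print("l2 error:", np.linalg.norm(beta_hat - beta_star))
```

The columns of `X` here are dependent, so this is the kind of design for which the abstract argues that restricted eigenvalue conditions, rather than restricted isometry, are the appropriate tool.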
Minimax-optimal rates for sparse additive models over kernel classes via convex programming
Cited by 52 (8 self)
Sparse additive models are families of d-variate functions with the additive decomposition f* = ∑_{j∈S} f*_j, where S is an unknown subset of cardinality s ≪ d. In this paper, we consider the case where each univariate component function f*_j lies in a reproducing kernel Hilbert space (RKHS), and analyze a method for estimating the unknown function f* based on kernels combined with ℓ1-type convex regularization. Working within a high-dimensional framework that allows both the dimension d and sparsity s to increase with n, we derive convergence rates in the L²(P) and L²(P_n) norms over the class F_{d,s,H} of sparse additive models with each univariate function f*_j in the unit ball of a univariate RKHS with bounded kernel function. We complement our upper bounds by deriving minimax lower bounds on the L²(P) error, thereby showing the optimality of our method. Thus, we obtain optimal minimax rates for many interesting classes of sparse additive models, including polynomials, splines, and Sobolev classes. We also show that if, in contrast to our univariate conditions, the d-variate function class is assumed to be globally bounded, then much faster estimation rates are possible for any sparsity s = Ω(√n), showing that global boundedness is a significant restriction in the high-dimensional setting.
Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing
2011
Cited by 51 (5 self)
We study the compressed sensing reconstruction problem for a broad class of random, band-diagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala et al. [KMS+11], message passing algorithms can effectively solve the reconstruction problem for spatially coupled measurements with undersampling rates close to the fraction of nonzero coordinates. We use an approximate message passing (AMP) algorithm and analyze it through the state evolution method. We give a rigorous proof that this approach is successful as soon as the undersampling rate δ exceeds the (upper) Rényi information dimension of the signal, d(p_X). More precisely, for a sequence of signals of diverging dimension n whose empirical distribution converges to p_X, reconstruction is with high probability successful from d(p_X) n + o(n) measurements taken according to a band-diagonal matrix. For sparse signals, i.e. sequences of dimension n and k(n) nonzero entries, this implies reconstruction from k(n) + o(n) measurements. For ‘discrete’ signals, i.e. signals whose coordinates take a fixed finite set of values, this implies reconstruction from o(n) measurements. The result
Minimax rates of estimation for sparse PCA in high dimensions
2012
Cited by 29 (3 self)
We study sparse principal components analysis in the high-dimensional setting, where p (the number of variables) can be much larger than n (the number of observations). We prove optimal, nonasymptotic lower and upper bounds on the minimax estimation error for the leading eigenvector when it belongs to an ℓq ball for q ∈ [0, 1]. Our bounds are sharp in p and n for all q ∈ [0, 1] over a wide class of distributions. The upper bound is obtained by analyzing the performance of ℓq-constrained PCA. In particular, our results provide convergence rates for ℓ1-constrained PCA.