Results 1-10 of 411
Near Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
, 2004
Cited by 1513 (20 self)
Suppose we are given a vector f in R^N. How many linear measurements do we need to make about f to be able to recover f to within precision ɛ in the Euclidean (ℓ2) metric? Or more exactly, suppose we are interested in a class F of such objects (discrete digital signals, images, etc.); how many linear measurements do we need to recover objects from this class to within accuracy ɛ? This paper shows that if the objects of interest are sparse or compressible in the sense that the reordered entries of a signal f ∈ F decay like a power law (or if the coefficient sequence of f in a fixed basis decays like a power law), then it is possible to reconstruct f to within very high accuracy from a small number of random measurements. A typical result is as follows: we rearrange the entries of f (or its coefficients in a fixed basis) in decreasing order of magnitude |f|_(1) ≥ |f|_(2) ≥ ... ≥ |f|_(N), and define the weak-ℓp ball as the class F of those elements whose entries obey the power decay law |f|_(n) ≤ C · n^(−1/p). We take measurements ⟨f, X_k⟩, k = 1, ..., K, where the X_k are N-dimensional Gaussian
Decoding by Linear Programming
, 2004
Cited by 1400 (17 self)
This paper considers the classical error-correcting problem which is frequently discussed in coding theory. We wish to recover an input vector f ∈ R^n from corrupted measurements y = Af + e. Here, A is an m by n (coding) matrix and e is an arbitrary and unknown vector of errors. Is it possible to recover f exactly from the data y? We prove that under suitable conditions on the coding matrix A, the input f is the unique solution to the ℓ1-minimization problem (‖x‖_ℓ1 := Σ_i |x_i|) min_{g ∈ R^n} ‖y − Ag‖_ℓ1, provided that the support of the vector of errors is not too large, ‖e‖_ℓ0 := |{i : e_i ≠ 0}| ≤ ρ · m for some ρ > 0. In short, f can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program). In addition, numerical experiments suggest that this recovery procedure works unreasonably well; f is recovered exactly even in situations where a significant fraction of the output is corrupted. This work is related to the problem of finding sparse solutions to vastly underdetermined systems of linear equations. There are also significant connections with the problem of recovering signals from highly incomplete measurements. In fact, the results introduced in this paper improve on our earlier work [5]. Finally, underlying the success of ℓ1 is a crucial property we call the uniform uncertainty principle that we shall describe in detail.
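The remark that the ℓ1 problem can be recast as a linear program can be made concrete with the standard epigraph reformulation. The auxiliary vector t is notation introduced here for illustration and is not taken from the abstract; the inequalities are meant componentwise.

```latex
\min_{g \in \mathbb{R}^n,\; t \in \mathbb{R}^m} \; \sum_{i=1}^{m} t_i
\quad \text{subject to} \quad
-t \;\le\; y - Ag \;\le\; t .
```

At an optimum each t_i equals |(y − Ag)_i|, so the objective value coincides with ‖y − Ag‖_ℓ1 and the two problems share the same minimizers g.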
Sure independence screening for ultrahigh dimensional feature space
, 2006
Cited by 279 (27 self)
Variable selection plays an important role in high-dimensional statistical modeling, which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, estimation accuracy and computational cost are two top concerns. In a recent paper, Candes and Tao (2007) propose the Dantzig selector using L1 regularization and show that it achieves the ideal risk up to a logarithmic factor log p. Their innovative procedure and remarkable result are challenged when the dimensionality is ultra high, as the factor log p can be large and their uniform uncertainty principle can fail. Motivated by these concerns, we introduce the concept of sure screening and propose a sure screening method based on correlation learning, called Sure Independence Screening (SIS), to reduce dimensionality from high to a moderate scale that is below sample size. In a fairly general asymptotic framework, the SIS is shown to have the sure screening property for even exponentially growing dimensionality. As a methodological extension, an iterative SIS (ISIS) is also proposed to enhance its finite sample performance. With dimension reduced accurately from high to below sample size, variable selection can be improved on both speed and accuracy, and can then be ac
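The correlation-based screening step described above can be sketched in a few lines. This is a minimal illustration, assuming that screening means ranking predictors by the absolute marginal (Pearson) correlation with the response and keeping the top d; the function name `sis_screen`, the choice of d, and the toy data are hypothetical and not taken from the paper.

```python
import math
import random


def sis_screen(X, y, d):
    """Return the (sorted) indices of the d columns of X whose absolute
    marginal correlation with y is largest. X is a list of n rows."""
    n = len(y)
    p = len(X[0])
    y_mean = sum(y) / n
    y_c = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_c))
    scores = []
    for j in range(p):
        col = [X[i][j] for i in range(n)]
        c_mean = sum(col) / n
        col_c = [v - c_mean for v in col]
        c_norm = math.sqrt(sum(v * v for v in col_c))
        if c_norm == 0.0 or y_norm == 0.0:
            scores.append(0.0)  # constant column or response: no signal
        else:
            dot = sum(a * b for a, b in zip(col_c, y_c))
            scores.append(abs(dot / (c_norm * y_norm)))
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Toy data (assumption): p = 200 predictors, n = 40 samples,
# and only columns 3 and 17 enter the response.
rng = random.Random(0)
n, p = 40, 200
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [5.0 * row[3] - 4.0 * row[17] + 0.1 * rng.gauss(0.0, 1.0) for row in X]
selected = sis_screen(X, y, d=10)  # d is kept below the sample size n
print(selected)
```

A second-stage method (for example a penalized regression) would then be fitted on the `selected` columns only, which is the sense in which screening reduces the problem to below sample size.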
Population structure and eigenanalysis
 PLoS Genet 2(12): e190 DOI: 10.1371/journal.pgen.0020190
, 2006
Cited by 237 (6 self)
Current methods for inferring population structure from genetic data do not provide formal significance tests for population differentiation. We discuss an approach to studying population structure (principal components analysis) that was first applied to genetic data by Cavalli-Sforza and colleagues. We place the method on a solid statistical footing, using results from modern statistics to develop formal significance tests. We also uncover a general “phase change” phenomenon about the ability to detect structure in genetic data, which emerges from the statistical theory we use, and has an important implication for the ability to discover structure in genetic data: for a fixed but large dataset size, divergence between two populations (as measured, for example, by a statistic like F_ST) below a threshold is essentially undetectable, but a little above threshold, detection will be easy. This means that we can predict the dataset size needed to detect structure.
Regularized estimation of large covariance matrices
 Ann. Statist
, 2008
Cited by 196 (14 self)
This paper considers estimating a covariance matrix of p variables from n observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (log p)/n → 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned families of covariance matrices. We also introduce an analogue of the Gaussian white noise model and show that if the population covariance is embeddable in that model and well-conditioned, then the banded approximations produce consistent estimates of the eigenvalues and associated eigenvectors of the covariance matrix. The results can be extended to smooth versions of banding and to non-Gaussian distributions with sufficiently short tails. A resampling approach is proposed for choosing the banding parameter in practice. This approach is illustrated numerically on both simulated and real data.
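As a rough illustration of the banding operation mentioned above, the sketch below zeroes every entry of a sample covariance matrix that lies more than k positions off the diagonal. It assumes the variables have a meaningful ordering; the helper names, the 1/n normalisation, and the toy data are assumptions made here, and the resampling rule for choosing k is not shown.

```python
def sample_covariance(rows):
    """Sample covariance of n observations (rows) of p variables,
    using the 1/n normalisation (an assumption of this sketch)."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
            for j in range(p)
        ]
        for i in range(p)
    ]


def band(cov, k):
    """Keep entries with |i - j| <= k and set all others to zero."""
    p = len(cov)
    return [
        [cov[i][j] if abs(i - j) <= k else 0.0 for j in range(p)]
        for i in range(p)
    ]


# Toy data (assumption): 4 observations of 3 ordered variables.
data = [
    [1.0, 2.0, 0.5],
    [2.0, 1.0, 1.5],
    [3.0, 4.0, 0.0],
    [4.0, 3.0, 2.0],
]
banded = band(sample_covariance(data), k=1)
for row in banded:
    print(row)
```

With k = 0 only the diagonal survives, and with k ≥ p − 1 the sample covariance is returned unchanged, so k interpolates between those two extremes.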
Sparse Permutation Invariant Covariance Estimation
 Electronic Journal of Statistics
, 2008
Cited by 171 (7 self)
The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both data dimension p and sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. The estimator is required to be positive definite, but we avoid having to use semidefinite programming by reparameterizing the objective function
Covariance regularization by thresholding
, 2007
Cited by 154 (11 self)
This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
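Hard thresholding of a covariance estimate can be sketched as follows. This is a minimal illustration, assuming the operator keeps an entry when its absolute value is at least the threshold s and sets it to zero otherwise, applied to every entry including the diagonal. The function name, the example matrix, and the value of s are hypothetical; the resampling scheme for selecting s is not shown.

```python
def hard_threshold(cov, s):
    """Keep entries with |value| >= s and replace all others by zero.
    Unlike banding, this does not depend on the ordering of the variables."""
    return [[v if abs(v) >= s else 0.0 for v in row] for row in cov]


# Example covariance estimate (made up for illustration).
cov_hat = [
    [2.0, 0.1, -0.6],
    [0.1, 1.5, 0.05],
    [-0.6, 0.05, 1.0],
]
thresholded = hard_threshold(cov_hat, s=0.5)
for row in thresholded:
    print(row)
```

Because the rule acts on each entry separately, the result stays symmetric whenever the input is symmetric, though it is not guaranteed to remain positive definite.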
Error Correction via Linear Programming
, 2005
Cited by 106 (6 self)
Suppose we wish to transmit a vector f ∈ R^n reliably. A frequently discussed approach consists in encoding f with an m by n coding matrix A. Assume now that a fraction of the entries of Af are corrupted in a completely arbitrary fashion. We do not know which entries are affected nor do we know how they are affected. Is it possible to recover f exactly from the corrupted m-dimensional vector y? This paper proves that under suitable conditions on the coding matrix A, the input f is the unique solution to the ℓ1-minimization problem (‖x‖_ℓ1 := Σ_i |x_i|) min_{g ∈ R^n} ‖y − Ag‖_ℓ1, provided that the fraction of corrupted entries is not too large, i.e. does not exceed some strictly positive constant ρ* (numerical values for ρ* are given). In other words, f can be recovered exactly by solving a simple convex optimization problem; in fact, a linear program. We report on numerical experiments suggesting that ℓ1-minimization is amazingly effective; f is recovered exactly even in situations where a very significant fraction of the output is corrupted.
Estimation of (near) low-rank matrices with noise and high-dimensional scaling
Cited by 103 (19 self)
We study an instance of high-dimensional statistical inference in which the goal is to use N noisy observations to estimate a matrix Θ* ∈ R^(k×p) that is assumed to be either exactly low rank, or “near” low-rank, meaning that it can be well-approximated by a matrix with low rank. We consider an M-estimator based on regularization by the trace or nuclear norm over matrices, and analyze its performance under high-dimensional scaling. We provide non-asymptotic bounds on the Frobenius norm error that hold for a general class of noisy observation models, and apply to both exactly low-rank and approximately low-rank matrices. We then illustrate their consequences for a number of specific learning models, including low-rank multivariate or multi-task regression, system identification in vector autoregressive processes, and recovery of low-rank matrices from random projections. Simulations show excellent agreement with the high-dimensional scaling of the error predicted by our theory.