Results 1-10 of 83
Regularized estimation of large covariance matrices
Ann. Statist., 2008
Cited by 185 (14 self)
This paper considers estimating a covariance matrix of p variables from n observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (log p)/n → 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned families of covariance matrices. We also introduce an analogue of the Gaussian white noise model and show that if the population covariance is embeddable in that model and well-conditioned, then the banded approximations produce consistent estimates of the eigenvalues and associated eigenvectors of the covariance matrix. The results can be extended to smooth versions of banding and to non-Gaussian distributions with sufficiently short tails. A resampling approach is proposed for choosing the banding parameter in practice. This approach is illustrated numerically on both simulated and real data.
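As a rough illustration of the banding idea described in this abstract, the sketch below zeroes every entry of a sample covariance matrix that lies more than k positions from the diagonal. The function names `sample_covariance` and `band`, the divisor n, and the plain list-of-lists representation are assumptions made for this example, not details taken from the paper.

```python
def sample_covariance(rows):
    """Sample covariance (divisor n, an assumption of this sketch) of
    n observations, each given as a list of p values."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
         for j in range(p)]
        for i in range(p)
    ]


def band(matrix, k):
    """Keep entries with |i - j| <= k and set all other entries to zero."""
    p = len(matrix)
    return [
        [matrix[i][j] if abs(i - j) <= k else 0.0 for j in range(p)]
        for i in range(p)
    ]


if __name__ == "__main__":
    data = [[1.0, 2.0, 0.5], [3.0, 6.0, 1.5], [2.0, 3.0, 4.0]]
    print(band(sample_covariance(data), 1))
```

Banding presupposes that the variables have a meaningful ordering, so that entries far from the diagonal are plausibly small. The banding parameter k would in practice be chosen by a data-driven scheme such as the resampling approach the abstract mentions, which is not reproduced here.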
Sparse Permutation Invariant Covariance Estimation
Electronic Journal of Statistics, 2008
Cited by 164 (8 self)
The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both data dimension p and sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. The estimator is required to be positive definite, but we avoid having to use semidefinite programming by reparameterizing the objective function
Covariance regularization by thresholding
2007
Cited by 148 (11 self)
This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
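Hard thresholding as described in this abstract can be sketched in a few lines: entries whose absolute value falls below a threshold t are set to zero and the rest are kept unchanged. The name `hard_threshold` is hypothetical, and applying the rule to every entry, diagonal included, is an assumption of this sketch.

```python
def hard_threshold(matrix, t):
    """Zero every entry with absolute value below t; keep the others as-is.

    Assumption of this sketch: the rule is applied to all entries,
    including the diagonal.
    """
    return [[x if abs(x) >= t else 0.0 for x in row] for row in matrix]


if __name__ == "__main__":
    cov = [[2.0, 0.1, -0.7],
           [0.1, 1.5, 0.05],
           [-0.7, 0.05, 1.0]]
    print(hard_threshold(cov, 0.5))
```

Unlike banding, this operation does not depend on the ordering of the variables, since each entry is judged only by its own magnitude.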
OBSTACLES TO HIGH-DIMENSIONAL PARTICLE FILTERING
Cited by 94 (5 self)
Particle filters are ensemble-based assimilation schemes that, unlike the ensemble Kalman filter, employ a fully nonlinear and non-Gaussian analysis step to compute the probability distribution function (pdf) of a system’s state conditioned on a set of observations. Evidence is provided that the ensemble size required for a successful particle filter scales exponentially with the problem size. For the simple example in which each component of the state vector is independent, Gaussian and of unit variance, and the observations are of each state component separately with independent, Gaussian errors, simulations indicate that the required ensemble size scales exponentially with the state dimension. In this example, the particle filter requires at least 10^11 members when applied to a 200-dimensional state. Asymptotic results, following the work of Bengtsson, Bickel and collaborators, are provided for two cases: one in which each prior state component is independent and identically distributed, and one in which both the prior pdf and the observation errors are Gaussian. The asymptotic theory reveals that, in both cases, the required ensemble size scales exponentially with the variance of the observation log-likelihood, rather than with the state dimension per se.
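The simple example in this abstract (independent unit-variance Gaussian state components, each observed separately with independent Gaussian error) can be simulated directly. The sketch below draws particles from the prior, weights them by the observation likelihood, and reports the largest normalized weight. The function name `max_weight`, the unit observation-error variance, and the chosen dimensions and ensemble size are all assumptions made for illustration.

```python
import math
import random


def max_weight(dim, n_particles, rng):
    """Largest normalized importance weight after one update step.

    Assumptions of this sketch: prior N(0, I), observations y = x + e
    with e ~ N(0, I), and particles sampled from the prior.
    """
    truth = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    obs = [x + rng.gauss(0.0, 1.0) for x in truth]
    log_w = []
    for _ in range(n_particles):
        particle = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Gaussian log-likelihood up to an additive constant
        log_w.append(-0.5 * sum((y - x) ** 2 for y, x in zip(obs, particle)))
    top = max(log_w)
    # Subtract the maximum before exponentiating to avoid underflow
    total = sum(math.exp(lw - top) for lw in log_w)
    return 1.0 / total


if __name__ == "__main__":
    rng = random.Random(0)
    for dim in (1, 10, 50, 200):
        print(dim, max_weight(dim, 500, rng))
```

A maximum weight close to 1 means a single particle carries almost all of the posterior mass, which is the collapse the abstract refers to. Running the loop for a fixed ensemble size gives a quick, informal view of how the maximum weight behaves as the dimension grows.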
Optimal rates of convergence for covariance matrix estimation
Ann. Statist., 2010
Cited by 88 (19 self)
The covariance matrix plays a central role in multivariate statistical analysis. Significant advances have been made recently on developing both theory and methodology for estimating large covariance matrices. However, a minimax theory has yet to be developed. In this paper we establish the optimal rates of convergence for estimating the covariance matrix under both the operator norm and Frobenius norm. It is shown that optimal procedures under the two norms are different and consequently matrix estimation under the operator norm is fundamentally different from vector estimation. The minimax upper bound is obtained by constructing a special class of tapering estimators and by studying their risk properties. A key step in obtaining the optimal rate of convergence is the derivation of the minimax lower bound. The technical analysis requires new ideas that are quite different from those used in the more conventional function/sequence estimation problems.
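A tapering estimator multiplies each entry of the sample covariance matrix by a weight that decays with distance from the diagonal. The abstract does not spell out the weights, so the sketch below assumes a trapezoidal scheme: weight 1 up to lag k/2, then a linear decrease to 0 at lag k. The names `taper_weight` and `taper` are hypothetical.

```python
def taper_weight(lag, k):
    """Trapezoidal weight (an assumed choice): 1 for lag <= k/2,
    linear decay to 0 at lag == k, and 0 beyond. Requires k > 0."""
    half = k / 2
    return (max(k - lag, 0) - max(half - lag, 0)) / half


def taper(matrix, k):
    """Multiply entry (i, j) by the weight for lag |i - j|."""
    p = len(matrix)
    return [
        [matrix[i][j] * taper_weight(abs(i - j), k) for j in range(p)]
        for i in range(p)
    ]


if __name__ == "__main__":
    ones = [[1.0] * 6 for _ in range(6)]
    for row in taper(ones, 4):
        print(row)
```

Compared with banding, which cuts entries off abruptly at a fixed lag, tapering shrinks them gradually toward zero.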
Operator norm consistent estimation of large-dimensional sparse covariance matrices
Annals of Statistics
Cited by 69 (1 self)
Estimating covariance matrices is a problem of fundamental importance in multivariate statistics. In practice it is increasingly frequent to work with data matrices X of dimension n×p, where p and n are both large. Results from random matrix theory show very clearly that in this setting, standard estimators like the sample covariance matrix perform in general very poorly. In this “large n, large p” setting, it is sometimes the case that practitioners are willing to assume that many elements of the population covariance matrix are equal to 0, and hence this matrix is sparse. We develop an estimator to handle this situation. The estimator is shown to be consistent in operator norm, when, for instance, we have p ≍ n as n → ∞. In other words the largest singular value of the difference between the estimator and the population covariance matrix goes to zero. This implies consistency of all the eigenvalues and consistency of eigenspaces associated to isolated eigenvalues. We also propose a notion of sparsity for matrices that is “compatible” with spectral analysis and is independent of the ordering of the variables.
High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence
2008
Generalized thresholding of large covariance matrices
J. Amer. Statist. Assoc., 2009
Cited by 64 (4 self)
We propose a new class of generalized thresholding operators that combine thresholding with shrinkage, and study generalized thresholding of the sample covariance matrix in high dimensions. Generalized thresholding of the covariance matrix has good theoretical properties and carries almost no computational burden. We obtain an explicit convergence rate in the operator norm that shows the tradeoff between the sparsity of the true model, dimension, and the sample size, and shows that generalized thresholding is consistent over a large class of models as long as the dimension p and the sample size n satisfy (log p)/n → 0. In addition, we show that generalized thresholding has the “sparsistency” property, meaning it estimates true zeros as zeros with probability tending to 1, and, under an additional mild condition, is sign consistent for nonzero elements. We show that generalized thresholding covers, as special cases, hard and soft thresholding, smoothly clipped absolute deviation, and adaptive lasso, and compare different types of generalized thresholding in a simulation study and in an example of gene clustering from a microarray experiment with tumor tissues.
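Two of the special cases named in this abstract, soft thresholding and smoothly clipped absolute deviation (SCAD), can be written as scalar rules and applied entry by entry. The sketch below uses the standard textbook forms of both rules. The function names, the default a = 3.7 (a conventional choice in the SCAD literature), and the decision to apply the rule to every entry including the diagonal are assumptions of this example rather than details stated in the abstract.

```python
def soft(z, lam):
    """Soft thresholding: shrink z toward zero by lam, clipping at zero."""
    sign = 1.0 if z >= 0 else -1.0
    return sign * max(abs(z) - lam, 0.0)


def scad(z, lam, a=3.7):
    """SCAD thresholding rule (requires a > 2): soft thresholding for small
    |z|, a linear interpolation for moderate |z|, and no shrinkage for
    large |z|."""
    absz = abs(z)
    if absz <= 2 * lam:
        return soft(z, lam)
    if absz <= a * lam:
        sign = 1.0 if z > 0 else -1.0
        return ((a - 1) * z - sign * a * lam) / (a - 2)
    return z


def generalized_threshold(matrix, rule, lam):
    """Apply a scalar thresholding rule to every entry of the matrix."""
    return [[rule(x, lam) for x in row] for row in matrix]


if __name__ == "__main__":
    cov = [[2.0, 0.3, -1.2], [0.3, 1.5, 0.05], [-1.2, 0.05, 1.0]]
    print(generalized_threshold(cov, soft, 0.25))
    print(generalized_threshold(cov, scad, 0.25))
```

Both rules set small entries exactly to zero. They differ in how they treat large entries: soft thresholding shrinks every surviving entry by lam, while SCAD leaves sufficiently large entries unchanged.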
Curse-of-dimensionality revisited: Collapse of the particle filter in very large scale systems
Cited by 36 (0 self)
It has been widely realized that Monte Carlo methods (approximation via a sample ensemble) may fail in large scale systems. This work offers some theoretical insight into this phenomenon in the context of the particle filter. We demonstrate that the maximum of the weights associated with the sample ensemble converges to one as both the sample size and the system dimension tend to infinity. Specifically, under fairly weak assumptions, if the ensemble size grows subexponentially in the cube root of the system dimension, the convergence holds for a single update step in state-space models with independent and identically distributed kernels. Further, in an important special case, more refined arguments show (and our simulations suggest) that the convergence to unity occurs unless the ensemble grows superexponentially in the system dimension. The weight singularity is also established in models with more general multivariate likelihoods, e.g. Gaussian and Cauchy. Although presented in the context of atmospheric data assimilation for numerical weather prediction, our results are generally valid for high-dimensional particle filters.
A path following algorithm for Sparse PseudoLikelihood Inverse Covariance Estimation (SPLICE)
2008
Cited by 20 (0 self)
Given n observations of a p-dimensional random vector, the covariance matrix and its inverse (precision matrix) are needed in a wide range of applications. Sample covariance (e.g. its eigenstructure) can misbehave when p is comparable to the sample size n. Regularization is often used to mitigate the problem. In this paper, we propose an ℓ1 penalized pseudo-likelihood estimate for the inverse covariance matrix. This estimate is sparse due to the ℓ1 penalty, and we term this method SPLICE. Its regularization path can be computed via an algorithm based on the homotopy/LARS-Lasso algorithm. Simulation studies are carried out for various inverse covariance structures for p = 15 and n = 20, 1000. We compare SPLICE with the ℓ1 penalized likelihood estimate and an ℓ1 penalized Cholesky decomposition based method. SPLICE gives the best overall performance in terms of three metrics on the precision matrix and ROC curve for model selection. Moreover, our simulation results demonstrate that the SPLICE estimates are positive-definite for most of the regularization path even though the restriction is not enforced.