Regularized estimation of large covariance matrices
Ann. Statist., 2008. Cited by 196 (14 self).
Abstract:
This paper considers estimating a covariance matrix of p variables from n observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (log p)/n → 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned families of covariance matrices. We also introduce an analogue of the Gaussian white noise model and show that if the population covariance is embeddable in that model and well-conditioned, then the banded approximations produce consistent estimates of the eigenvalues and associated eigenvectors of the covariance matrix. The results can be extended to smooth versions of banding and to non-Gaussian distributions with sufficiently short tails. A resampling approach is proposed for choosing the banding parameter in practice. This approach is illustrated numerically on both simulated and real data.
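The banding operator described in the abstract is simple to sketch in NumPy. The function name, bandwidth choice, and AR(1)-style test covariance below are illustrative assumptions, not the paper's own code:

```python
import numpy as np

def band_covariance(S, k):
    """Band a covariance matrix: zero every entry more than
    k positions away from the main diagonal."""
    p = S.shape[0]
    d = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return np.where(d <= k, S, 0.0)

# Simulate n observations with an AR(1)-style true covariance,
# so that entries decay away from the diagonal and banding is natural.
rng = np.random.default_rng(0)
p, n = 50, 200
true_cov = 0.7 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
S = np.cov(X, rowvar=False)

S_banded = band_covariance(S, k=5)
```

In practice the paper chooses the banding parameter k by a resampling scheme rather than fixing it in advance as done here.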
Sparse Permutation Invariant Covariance Estimation
Electronic Journal of Statistics, 2008. Cited by 171 (7 self).
Abstract:
The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both the data dimension p and the sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. The estimator is required to be positive definite, but we avoid having to use semidefinite programming by reparameterizing the objective function.
Covariance regularization by thresholding
2007. Cited by 154 (11 self).
Abstract:
This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
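A minimal sketch of hard thresholding, using the √((log p)/n) rate from the abstract as the threshold level; the omitted constant and the choice to keep the diagonal untouched are illustrative assumptions:

```python
import numpy as np

def hard_threshold(S, t):
    """Zero every off-diagonal entry of S whose magnitude is at most t;
    variances on the diagonal are kept as-is."""
    T = np.where(np.abs(S) > t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

rng = np.random.default_rng(1)
p, n = 40, 100
X = rng.standard_normal((n, p))     # true covariance is the identity
S = np.cov(X, rowvar=False)
t = np.sqrt(np.log(p) / n)          # rate-driven threshold, constant omitted
S_thr = hard_threshold(S, t)
```

Unlike banding, thresholding does not assume any ordering of the variables: only the magnitude of each entry matters.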
OBSTACLES TO HIGH-DIMENSIONAL PARTICLE FILTERING
Cited by 94 (4 self).
Abstract:
Particle filters are ensemble-based assimilation schemes that, unlike the ensemble Kalman filter, employ a fully nonlinear and non-Gaussian analysis step to compute the probability distribution function (pdf) of a system's state conditioned on a set of observations. Evidence is provided that the ensemble size required for a successful particle filter scales exponentially with the problem size. For the simple example in which each component of the state vector is independent, Gaussian and of unit variance, and the observations are of each state component separately with independent, Gaussian errors, simulations indicate that the required ensemble size scales exponentially with the state dimension. In this example, the particle filter requires at least 10^11 members when applied to a 200-dimensional state. Asymptotic results, following the work of Bengtsson, Bickel and collaborators, are provided for two cases: one in which each prior state component is independent and identically distributed, and one in which both the prior pdf and the observation errors are Gaussian. The asymptotic theory reveals that, in both cases, the required ensemble size scales exponentially with the variance of the observation log-likelihood, rather than with the state dimension per se.
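The key quantity the asymptotics point to, the variance of the observation log-likelihood, can be estimated by simulation in the toy model from the abstract (i.i.d. standard-normal state components observed with independent standard-normal errors). The function and sample sizes here are illustrative assumptions:

```python
import numpy as np

def obs_loglik_variance(dim, n_samples, rng):
    """Monte Carlo estimate of Var(log p(y | x)) when x ~ N(0, I_dim)
    and y = x + N(0, I_dim) noise. Analytically this variance is dim/2,
    so it grows linearly in the state dimension."""
    x = rng.standard_normal((n_samples, dim))
    y = x + rng.standard_normal((n_samples, dim))
    loglik = -0.5 * np.sum((y - x) ** 2, axis=1)  # Gaussian log-density, constant dropped
    return loglik.var()

rng = np.random.default_rng(0)
# Linear growth of this variance is what drives the exponential
# growth of the ensemble size needed to avoid weight collapse.
for dim in (10, 100, 500):
    print(dim, obs_loglik_variance(dim, n_samples=4000, rng=rng))
```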
Optimal rates of convergence for covariance matrix estimation
Ann. Statist., 2010. Cited by 93 (18 self).
Abstract:
The covariance matrix plays a central role in multivariate statistical analysis. Significant advances have been made recently in developing both theory and methodology for estimating large covariance matrices. However, a minimax theory has yet to be developed. In this paper we establish the optimal rates of convergence for estimating the covariance matrix under both the operator norm and the Frobenius norm. It is shown that optimal procedures under the two norms are different, and consequently matrix estimation under the operator norm is fundamentally different from vector estimation. The minimax upper bound is obtained by constructing a special class of tapering estimators and by studying their risk properties. A key step in obtaining the optimal rate of convergence is the derivation of the minimax lower bound. The technical analysis requires new ideas that are quite different from those used in the more conventional function/sequence estimation problems.
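A tapering estimator can be sketched as entrywise weights applied to the sample covariance. The linear taper below is in the spirit of the class studied in the paper, though the exact weight scheme and the parameter k are illustrative assumptions:

```python
import numpy as np

def taper_covariance(S, k):
    """Entrywise tapering: weight 1 within distance k/2 of the diagonal,
    linear decay to 0 between k/2 and k, and 0 beyond distance k."""
    p = S.shape[0]
    d = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    w = np.clip((k - d) / (k / 2.0), 0.0, 1.0)
    return S * w
```

The smooth decay of the weights, rather than the abrupt cutoff of plain banding, is the feature the minimax analysis exploits.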
Operator norm consistent estimation of large-dimensional sparse covariance matrices
Annals of Statistics. Cited by 73 (0 self).
Abstract:
Estimating covariance matrices is a problem of fundamental importance in multivariate statistics. In practice it is increasingly frequent to work with data matrices X of dimension n × p, where p and n are both large. Results from random matrix theory show very clearly that in this setting, standard estimators like the sample covariance matrix perform in general very poorly. In this “large n, large p” setting, it is sometimes the case that practitioners are willing to assume that many elements of the population covariance matrix are equal to 0, and hence this matrix is sparse. We develop an estimator to handle this situation. The estimator is shown to be consistent in operator norm when, for instance, we have p ≍ n as n → ∞. In other words, the largest singular value of the difference between the estimator and the population covariance matrix goes to zero. This implies consistency of all the eigenvalues and consistency of eigenspaces associated to isolated eigenvalues. We also propose a notion of sparsity for matrices that is “compatible” with spectral analysis and is independent of the ordering of the variables.
Generalized thresholding of large covariance matrices
J. Amer. Statist. Assoc. (Theory and Methods), 2009. Cited by 67 (4 self).
Abstract:
We propose a new class of generalized thresholding operators which combine thresholding with shrinkage, and study generalized thresholding of the sample covariance matrix in high dimensions. Generalized thresholding of the covariance matrix has good theoretical properties and carries almost no computational burden. We obtain an explicit convergence rate in the operator norm that shows the trade-off between the sparsity of the true model, the dimension, and the sample size, and show that generalized thresholding is consistent over a large class of models as long as the dimension p and the sample size n satisfy (log p)/n → 0.
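Soft thresholding is one member of the generalized-thresholding class described above (thresholding combined with shrinkage). A minimal sketch, with the choice to leave the diagonal unpenalized being an assumption of this illustration:

```python
import numpy as np

def soft_threshold_cov(S, t):
    """Shrink each off-diagonal entry of S toward zero by t, zeroing
    entries whose magnitude is below t; variances stay untouched."""
    T = np.sign(S) * np.maximum(np.abs(S) - t, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

S = np.array([[2.0, 0.5],
              [0.5, 3.0]])
S_soft = soft_threshold_cov(S, 0.2)   # off-diagonal 0.5 shrinks to 0.3
```

Hard thresholding keeps surviving entries at full size; soft thresholding additionally shrinks them, which is the source of the improved stability the class offers.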
High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence
Electron. J. Stat., 2011. Cited by 54 (6 self).
Abstract:
Given i.i.d. observations of a random vector X ∈ R^p, we study the problem of estimating both its covariance matrix Σ∗ and its inverse covariance or concentration matrix Θ∗ = (Σ∗)^{−1}. When X is multivariate Gaussian, the nonzero structure of Θ∗ is specified by the graph of an associated Gaussian Markov random field, and a popular estimator for such sparse Θ∗ is the ℓ1-regularized Gaussian MLE. This estimator is sensible even for non-Gaussian X, since it corresponds to minimizing an ℓ1-penalized log-determinant Bregman divergence. We analyze its performance under high-dimensional scaling, in which the number of nodes in the graph p, the number of edges s, and the maximum node degree d are allowed to grow as a function of the sample size n. In addition to the parameters (p, s, d), our analysis identifies other key quantities that control rates: (a) the ℓ∞-operator norm of the true covariance matrix Σ∗; (b) the ℓ∞-operator norm of the submatrix Γ∗_SS, where S indexes the graph edges and Γ∗ = (Θ∗)^{−1} ⊗ (Θ∗)^{−1}; (c) a mutual incoherence or irrepresentability measure on the matrix Γ∗; and (d) the rate of decay 1/f(n, δ) of the probabilities P(|Σ̂n_ij − Σ∗_ij| > δ), where Σ̂n is the sample covariance based on n samples. Our first result establishes consistency of our estimate Θ̂ in the elementwise maximum norm. This in turn allows us to derive convergence rates in Frobenius and spectral norms, with improvements upon existing results for graphs with maximum node degree d = o(√s). In our second result, we show that with probability converging to one, the estimate Θ̂ correctly specifies the zero pattern of Θ∗.
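The ℓ1-penalized log-determinant objective can be minimized in many ways. The proximal-gradient loop below is a deliberately simple illustrative solver; the step size, iteration count, and leaving the diagonal unpenalized are assumptions of this sketch, and it is not the algorithm analyzed in the paper:

```python
import numpy as np

def graphical_lasso_ista(S, lam, step=0.01, n_iter=2000):
    """Minimize tr(S @ Theta) - log det(Theta) + lam * ||Theta||_1
    (off-diagonal penalty only) by proximal gradient descent."""
    theta = np.eye(S.shape[0])
    for _ in range(n_iter):
        grad = S - np.linalg.inv(theta)           # gradient of the smooth part
        theta = theta - step * grad
        prox = np.sign(theta) * np.maximum(np.abs(theta) - step * lam, 0.0)
        np.fill_diagonal(prox, np.diag(theta))    # diagonal is not penalized
        theta = (prox + prox.T) / 2.0             # keep the iterate symmetric
    return theta

S = np.array([[1.0, 0.3],
              [0.3, 1.0]])
theta_hat = graphical_lasso_ista(S, lam=0.05)
```

With lam = 0 the iterate converges to S^{-1}; a sufficiently large lam drives the off-diagonal entries of the estimate exactly to zero, recovering an empty graph.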
Curse-of-dimensionality revisited: Collapse of the particle filter in very large scale systems.
Cited by 35 (0 self).
Abstract:
It has been widely realized that Monte Carlo methods (approximation via a sample ensemble) may fail in large scale systems. This work offers some theoretical insight into this phenomenon in the context of the particle filter. We demonstrate that the maximum of the weights associated with the sample ensemble converges to one as both the sample size and the system dimension tend to infinity. Specifically, under fairly weak assumptions, if the ensemble size grows sub-exponentially in the cube root of the system dimension, the convergence holds for a single update step in state-space models with independent and identically distributed kernels. Further, in an important special case, more refined arguments show (and our simulations suggest) that the convergence to unity occurs unless the ensemble grows super-exponentially in the system dimension. The weight singularity is also established in models with more general multivariate likelihoods, e.g. Gaussian and Cauchy. Although presented in the context of atmospheric data assimilation for numerical weather prediction, our results are generally valid for high-dimensional particle filters.
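The max-weight collapse can be reproduced in a few lines for an i.i.d. Gaussian toy model (standard-normal prior components observed with independent standard-normal errors); the particular dimensions and ensemble size are illustrative assumptions:

```python
import numpy as np

def max_particle_weight(dim, n_particles, rng):
    """Largest normalized importance weight after one update step:
    particles drawn from the N(0, I_dim) prior, weighted by a Gaussian
    likelihood for an observation y = truth + N(0, I_dim) noise."""
    truth = rng.standard_normal(dim)
    y = truth + rng.standard_normal(dim)
    particles = rng.standard_normal((n_particles, dim))
    loglik = -0.5 * np.sum((y - particles) ** 2, axis=1)
    w = np.exp(loglik - loglik.max())   # subtract the max for numerical stability
    return (w / w.sum()).max()

rng = np.random.default_rng(3)
# Holding the ensemble size fixed while the dimension grows, the
# largest weight typically climbs toward 1 -- the collapse at issue.
for dim in (5, 50, 500):
    print(dim, max_particle_weight(dim, n_particles=500, rng=rng))
```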