Results 1–10 of 27
Geodesic convexity and covariance estimation
 IEEE Trans. Signal Process
, 2012
Abstract

Cited by 14 (5 self)
Geodesic convexity is a generalization of classical convexity which guarantees that all local minima of g-convex functions are globally optimal. We consider g-convex functions with positive definite matrix variables, and prove that Kronecker products and logarithms of determinants are g-convex. We apply these results to two modern covariance estimation problems: robust estimation in scaled-Gaussian distributions, and Kronecker structured models. Maximum likelihood estimation in these settings involves non-convex minimizations. We show that these problems are in fact g-convex. This leads to straightforward analysis, allows the use of standard optimization methods, and paves the road to various extensions via additional g-convex regularization. Index Terms—Elliptical distributions, geodesic convexity, Kronecker models, log-sum-exp, matrix variate models, robust covariance estimation.
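The g-convexity of the log-determinant claimed in this abstract can be checked numerically: along the geodesic connecting two positive definite matrices, log det is in fact linear (hence both g-convex and g-concave). A minimal NumPy sketch (function names are my own):

```python
import numpy as np

def spd_power(M, t):
    # M^t for a symmetric positive definite matrix via eigendecomposition.
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** t) @ V.T

def spd_geodesic(A, B, t):
    # Geodesic between SPD matrices: A^(1/2) (A^(-1/2) B A^(-1/2))^t A^(1/2).
    Ah = spd_power(A, 0.5)
    Aih = spd_power(A, -0.5)
    return Ah @ spd_power(Aih @ B @ Aih, t) @ Ah

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5)); A = X @ X.T + 5 * np.eye(5)
Y = rng.standard_normal((5, 5)); B = Y @ Y.T + 5 * np.eye(5)

# log det is linear along the geodesic:
#   log det(geo(t)) = (1 - t) log det(A) + t log det(B)
for t in (0.25, 0.5, 0.75):
    lhs = np.linalg.slogdet(spd_geodesic(A, B, t))[1]
    rhs = (1 - t) * np.linalg.slogdet(A)[1] + t * np.linalg.slogdet(B)[1]
    assert np.isclose(lhs, rhs)
```

Linearity follows from det(A^(1/2) M^t A^(1/2)) = det(A) det(M)^t, so local minima of g-convex objectives built from such terms are global.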
A generalized least-square matrix decomposition
 Journal of the American Statistical Association
Abstract

Cited by 13 (6 self)
Variables in many massive high-dimensional data sets are structured, arising for example from measurements on a regular grid, as in imaging and time series, or from spatial-temporal measurements, as in climate studies. Classical multivariate techniques ignore these structural relationships, often resulting in poor performance. We propose a generalization of the singular value decomposition (SVD) and principal components analysis (PCA) that is appropriate for massive data sets with structured variables or known two-way dependencies. By finding the best low-rank approximation of the data with respect to a transposable quadratic norm, our decomposition, entitled the Generalized least-squares Matrix Decomposition (GMD), directly accounts for structural relationships. As many variables in high-dimensional settings are often irrelevant or noisy, we also regularize our matrix decomposition by adding two-way penalties to encourage sparsity or smoothness. We develop fast computational algorithms using our methods to perform generalized PCA (GPCA), sparse GPCA, and functional GPCA on massive data sets. Through simulations and a whole-brain functional MRI example we demonstrate the utility of our methodology for dimension reduction, signal recovery, and feature selection with high-dimensional structured data.
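When the two quadratic-norm weight matrices are positive definite, a decomposition of this kind reduces to whitening the data, taking an ordinary SVD, and mapping the factors back. A sketch under that assumption (the function name and interface are my own, not the paper's):

```python
import numpy as np

def gmd(X, Q, R, k):
    """Rank-k decomposition of X minimizing the quadratic norm
    ||Q^(1/2) (X - approx) R^(1/2)||_F, assuming Q and R are PD."""
    def sqrt_and_inv_sqrt(S):
        w, V = np.linalg.eigh(S)
        return (V @ np.diag(np.sqrt(w)) @ V.T,
                V @ np.diag(1.0 / np.sqrt(w)) @ V.T)
    Qh, Qih = sqrt_and_inv_sqrt(Q)
    Rh, Rih = sqrt_and_inv_sqrt(R)
    # Whiten, take an ordinary truncated SVD, then map factors back.
    U, d, Vt = np.linalg.svd(Qh @ X @ Rh)
    return Qih @ U[:, :k], d[:k], Rih @ Vt[:k].T

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))  # rank 2
Q = np.diag(rng.uniform(0.5, 2.0, 8))  # row weights (e.g. spatial smoother)
R = np.diag(rng.uniform(0.5, 2.0, 6))  # column weights
U, d, V = gmd(X, Q, R, k=2)
assert np.allclose(U @ np.diag(d) @ V.T, X)  # exact for a rank-2 matrix
```

The returned factors are orthonormal in the Q- and R-inner products rather than the Euclidean one, which is how structural relationships enter the decomposition.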
Inference with transposable data: modelling the effects of row and column correlations
 Journal of the Royal Statistical Society: Series B (Statistical Methodology)
, 2012
Abstract

Cited by 13 (0 self)
Summary. We consider the problem of large-scale inference on the row or column variables of data in the form of a matrix. Many of these data matrices are transposable, meaning that neither the row variables nor the column variables can be considered independent instances. An example of this scenario is detecting significant genes in microarrays when the samples may be dependent due to latent variables or unknown batch effects. By modeling this matrix data using the matrix-variate normal distribution, we study and quantify the effects of row and column correlations on procedures for large-scale inference. We then propose a simple solution to the myriad of problems presented by unanticipated correlations: we simultaneously estimate row and column covariances and use these to sphere, or decorrelate, the noise in the underlying data before conducting inference. This procedure yields data with approximately independent rows and columns, so that test statistics more closely follow null distributions and multiple testing procedures correctly control the desired error rates. Results on simulated models and real microarray data demonstrate major advantages of this approach: (1) increased statistical power, (2) less bias in estimating the false discovery rate, and (3) reduced variance of the false discovery rate estimators.
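The sphering step described here is simple once row and column covariances are in hand: pre- and post-multiply by their inverse square roots. A sketch with known (rather than estimated) covariances, using my own function names:

```python
import numpy as np

def inv_sqrtm(S):
    # Inverse symmetric square root of a PD matrix.
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def sphere(X, Sig_row, Sig_col):
    # Decorrelate both dimensions: Sig_row^(-1/2) X Sig_col^(-1/2).
    return inv_sqrtm(Sig_row) @ X @ inv_sqrtm(Sig_col)

rng = np.random.default_rng(2)
p, q, n = 4, 3, 10000
G = rng.standard_normal((p, p)); Sig_row = G @ G.T + p * np.eye(p)
G = rng.standard_normal((q, q)); Sig_col = G @ G.T + q * np.eye(q)

def sqrtm(S):
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(w)) @ V.T

# Matrix-variate normal samples: X = Sig_row^(1/2) Z Sig_col^(1/2).
Rh, Ch = sqrtm(Sig_row), sqrtm(Sig_col)
sphered = [sphere(Rh @ rng.standard_normal((p, q)) @ Ch, Sig_row, Sig_col)
           for _ in range(n)]
# After sphering, the rows are approximately uncorrelated:
row_cov = sum(X @ X.T for X in sphered) / (n * q)
assert np.allclose(row_cov, np.eye(p), atol=0.05)
```

In the paper's setting the covariances themselves must be estimated from the same data, which is the hard part; this sketch only shows why sphering restores approximately independent noise.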
Covariance Estimation in High Dimensions via Kronecker Product Expansions
, 2013
Abstract

Cited by 7 (2 self)
This paper presents a new method for estimating high-dimensional covariance matrices. The method, permuted rank-penalized least-squares (PRLS), is based on a Kronecker product series expansion of the true covariance matrix. Assuming an i.i.d. Gaussian random sample, we establish high-dimensional rates of convergence to the true covariance as both the number of samples and the number of variables go to infinity. For covariance matrices of low separation rank, our results establish that PRLS has significantly faster convergence than the standard sample covariance matrix (SCM) estimator. The convergence rate captures a fundamental trade-off between estimation error and approximation error, thus providing a scalable covariance estimation framework in terms of separation rank, similar to low-rank approximation of covariance matrices [1]. The MSE convergence rates generalize the high-dimensional rates recently obtained for the ML flip-flop algorithm [2], [3] for Kronecker product covariance estimation. We show that a class of block Toeplitz covariance matrices is approximable by low separation rank and give bounds on the minimal separation rank r that ensures a given level of bias. Simulations are presented to validate the theoretical bounds. As a real-world application, we illustrate the utility of the proposed Kronecker covariance estimator for spatio-temporal linear least-squares prediction of multivariate wind speed measurements.
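The device underlying Kronecker product expansions of this kind is the Van Loan–Pitsianis rearrangement: permuting the entries of a (pq)×(pq) covariance turns a sum of r Kronecker products into a rank-r matrix, so the best single-Kronecker approximation falls out of a truncated SVD. A sketch of the rearrangement (not of the PRLS estimator itself, which additionally rank-penalizes the permuted matrix):

```python
import numpy as np

def rearrange(S, p, q):
    # Van Loan–Pitsianis rearrangement: the (i, j) q-by-q block of the
    # (pq)-by-(pq) matrix S becomes row i*p + j.  A kron B maps to the
    # rank-one matrix vec(A) vec(B)^T, and a sum of r Kronecker products
    # maps to a rank-r matrix.
    return np.array([S[i*q:(i+1)*q, j*q:(j+1)*q].reshape(-1)
                     for i in range(p) for j in range(p)])

def nearest_kronecker(S, p, q):
    # Best single Kronecker factorization in Frobenius norm, via the
    # leading singular pair of the rearranged matrix.
    U, s, Vt = np.linalg.svd(rearrange(S, p, q))
    A = np.sqrt(s[0]) * U[:, 0].reshape(p, p)
    B = np.sqrt(s[0]) * Vt[0].reshape(q, q)
    return A, B

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)); A = A @ A.T
B = rng.standard_normal((4, 4)); B = B @ B.T
Ahat, Bhat = nearest_kronecker(np.kron(A, B), 3, 4)
# The factors are only identified up to scale/sign, but their product is exact:
assert np.allclose(np.kron(Ahat, Bhat), np.kron(A, B))
```

Truncating the SVD at r > 1 terms and adding a rank penalty on the rearranged matrix is, roughly, where the separation-rank trade-off described in the abstract lives.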
Convergence Properties of Kronecker Graphical Lasso Algorithms
, 2013
Abstract

Cited by 5 (3 self)
This report presents a thorough convergence analysis of Kronecker graphical lasso (KGLasso) algorithms for estimating the covariance of an i.i.d. Gaussian random sample under a sparse Kronecker-product covariance model. The KGLasso model, originally called the transposable regularized covariance model by Allen et al. [1], implements a pair of ℓ1 penalties on each Kronecker factor to enforce sparsity in the covariance estimator. The KGLasso algorithm generalizes Glasso, introduced by Yuan and Lin [2] and Banerjee et al. [3], to estimate covariances having Kronecker product form. It also generalizes the unpenalized ML flip-flop (FF) algorithm of Dutilleul [4] and Werner et al. [5] to estimation of sparse Kronecker factors. We establish that the KGLasso iterates converge pointwise to a local maximum of the penalized likelihood function. We derive high-dimensional rates of convergence to the true covariance as both the number of samples and the number of variables go to infinity. Our results establish that KGLasso has significantly faster asymptotic convergence than FF and Glasso. Simulations are presented that validate the results of our analysis. For example, for a sparse 10,000 × 10,000 covariance matrix equal to the Kronecker product of two 100 × 100 matrices, the root mean squared error of the inverse covariance estimate using FF is 3.5 times larger than that obtainable using KGLasso.
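The unpenalized flip-flop (FF) baseline that KGLasso generalizes alternates closed-form updates of the two Kronecker factors, each holding the other fixed. A minimal NumPy sketch (variable names are my own; the ℓ1-penalized graphical-lasso steps that distinguish KGLasso are omitted):

```python
import numpy as np

def flip_flop(Xs, iters=20):
    # Unpenalized ML flip-flop for matrix-normal samples Xs of shape
    # (n, p, q), with row covariance Sr (p x p) and column covariance
    # Sc (q x q), so Cov(vec X) = Sc kron Sr.  KGLasso replaces each
    # update with an l1-penalized (graphical lasso) version.
    n, p, q = Xs.shape
    Sr, Sc = np.eye(p), np.eye(q)
    for _ in range(iters):
        Sc_inv = np.linalg.inv(Sc)
        Sr = sum(X @ Sc_inv @ X.T for X in Xs) / (n * q)
        Sr_inv = np.linalg.inv(Sr)
        Sc = sum(X.T @ Sr_inv @ X for X in Xs) / (n * p)
    return Sr, Sc

rng = np.random.default_rng(4)
p, q, n = 3, 4, 1000
Xs = rng.standard_normal((n, p, q))  # truth: Sr = I_p, Sc = I_q
Sr, Sc = flip_flop(Xs)
# The factors are identified only up to scale, so compare the product:
assert np.allclose(np.kron(Sc, Sr), np.eye(p * q), atol=0.15)
```

Each factor update is a standard sample-covariance computation in a transformed coordinate system, which is why the iterates admit the pointwise-convergence analysis described above.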
Kronecker sum decompositions of space-time data
 in 2013 IEEE 5th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP)
, 2013
Abstract

Cited by 5 (5 self)
In this paper we consider the use of the space vs. time Kronecker product decomposition in the estimation of covariance matrices for spatio-temporal data. This decomposition imposes lower-dimensional structure on the estimated covariance matrix, thus reducing the number of samples required for estimation. To allow a smooth trade-off between the reduction in the number of parameters (to reduce estimation variance) and the accuracy of the covariance approximation (affecting estimation bias), we introduce a diagonally loaded modification of the sum-of-Kronecker-products representation in [1]. We derive an asymptotic Cramér–Rao bound (CRB) on the minimum attainable mean squared predictor coefficient estimation error for unbiased estimators of Kronecker structured covariance matrices. We illustrate the accuracy of the diagonally loaded Kronecker sum decomposition by applying it to the prediction of human activity video.
GEMINI: GRAPH ESTIMATION WITH MATRIX VARIATE
Abstract

Cited by 4 (1 self)
Undirected graphs can be used to describe matrix variate distributions. In this paper, we develop new methods for estimating the graphical structures and underlying parameters, namely, the row and column covariance and inverse covariance matrices, from the matrix variate data. Under sparsity conditions, we show that one is able to recover the graphs and covariance matrices with a single random matrix from the matrix variate normal distribution. Our method extends, with suitable adaptation, to the general setting where replicates are available. We establish consistency and obtain the rates of convergence in the operator and the Frobenius norm. We show that having replicates will allow one to estimate more complicated graphical structures and achieve faster rates of convergence. We provide simulation evidence showing that we can recover graphical structures as well as estimate the precision matrices, as predicted by theory.
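The single-sample claim in this abstract rests on the fact that one wide matrix already averages over many columns: if vec(X) ~ N(0, B ⊗ A), then E[X Xᵀ] = tr(B)·A, so the row covariance (and hence the row graph) is estimable from one draw. A quick numerical check (illustrative values; the column covariance B is taken as identity for simplicity):

```python
import numpy as np

def sqrtm(S):
    # Symmetric PD square root via eigendecomposition.
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(w)) @ V.T

rng = np.random.default_rng(5)
p, q = 4, 4000  # one wide matrix: many columns inform the row graph
A = np.array([[2.0, 0.8, 0.0, 0.0],
              [0.8, 2.0, 0.8, 0.0],
              [0.0, 0.8, 2.0, 0.8],
              [0.0, 0.0, 0.8, 2.0]])  # sparse (tridiagonal) row covariance
# With column covariance B = I_q: X = A^(1/2) Z and E[X X^T] = tr(B) A = q A.
X = sqrtm(A) @ rng.standard_normal((p, q))
A_hat = X @ X.T / q  # row-covariance estimate from a single matrix sample
assert np.allclose(A_hat, A, atol=0.3)
```

The paper's contribution is making this precise for general sparse A and B simultaneously, with penalized estimators and rates; the sketch only shows why a single matrix is not information-poor.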
Sparse Biclustering of Transposable Data
, 2013
Abstract

Cited by 2 (0 self)
We consider the task of simultaneously clustering the rows and columns of a large transposable data matrix. We assume that the matrix elements are normally distributed with a bicluster-specific mean term and a common variance, and perform biclustering by maximizing the corresponding log likelihood. We apply an ℓ1 penalty to the means of the biclusters in order to obtain sparse and interpretable biclusters. Our proposal amounts to a sparse, symmetrized version of k-means clustering. We show that k-means clustering of the rows and of the columns of a data matrix can be seen as special cases of our proposal, and that a relaxation of our proposal yields the singular value decomposition. In addition, we propose a framework for biclustering based on the matrix-variate normal distribution. The performances of our proposals are demonstrated in a simulation study and on a gene expression data set. This article has supplementary material online.
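With cluster assignments held fixed, an ℓ1 penalty on a Gaussian bicluster mean yields a soft-thresholded block average, which is where the sparsity comes from. A simplified sketch of that one step (assignments are given, not estimated; names and penalty scaling are my own):

```python
import numpy as np

def soft_threshold(x, lam):
    # Minimizer of 0.5*(x - m)^2 + lam*|m| over m.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def bicluster_means(X, row_lab, col_lab, lam):
    # Penalized mean for each (row-cluster, column-cluster) block:
    # the block average, soft-thresholded by lam / (block size).
    K, L = row_lab.max() + 1, col_lab.max() + 1
    M = np.zeros((K, L))
    for k in range(K):
        for l in range(L):
            block = X[np.ix_(row_lab == k, col_lab == l)]
            M[k, l] = soft_threshold(block.mean(), lam / block.size)
    return M

X = np.array([[5.0, 5.0, 0.1],
              [5.0, 5.0, -0.1],
              [0.0, 0.2, -4.0]])
row_lab = np.array([0, 0, 1]); col_lab = np.array([0, 0, 1])
M = bicluster_means(X, row_lab, col_lab, lam=1.0)
assert M[0, 0] > 4.0                       # strong bicluster survives
assert M[0, 1] == 0.0 and M[1, 0] == 0.0   # weak blocks shrink to zero
```

Alternating this mean update with reassignment of rows and columns gives the "sparse, symmetrized k-means" flavor the abstract describes.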
Separable Factor Analysis with Applications to Mortality Data
, 2012
Abstract

Cited by 2 (0 self)
Human mortality datasets can be expressed as multi-way data arrays, the dimensions of which correspond to categories by which mortality rates are reported, such as age, sex, country and year. Regression models for such data typically assume an independent error distribution, or an error model that allows for dependence along at most one or two dimensions of the data array. However, failing to account for other dependencies can lead to inefficient estimates of regression parameters, inaccurate standard errors and poor predictions. An alternative to assuming independent errors is to allow for dependence along each dimension of the array using a separable covariance model. However, the number of parameters in this model increases rapidly with the dimensions of the array, and for many arrays, maximum likelihood estimates of the covariance parameters do not exist. In this paper, we propose a submodel of the separable covariance model that estimates the covariance matrix for each dimension as having factor analytic structure. This model can be viewed as an extension of factor analysis to array-valued data, as it uses a factor model to estimate the covariance along each dimension of the array. We discuss properties of this model as they relate to ordinary factor analysis, describe maximum likelihood and Bayesian estimation methods, and provide a likelihood ratio testing procedure for selecting the factor model ranks.
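The parameter-count argument here is easy to make concrete. For illustrative array dimensions of my own choosing (age × sex × country × year), the counts for a full covariance, a separable covariance, and a factor-analytic separable submodel differ by orders of magnitude:

```python
import math

def unstructured_params(dims):
    # Free parameters of a full covariance over the vectorized array.
    n = math.prod(dims)
    return n * (n + 1) // 2

def separable_params(dims):
    # One covariance matrix per array dimension.
    return sum(d * (d + 1) // 2 for d in dims)

def separable_factor_params(dims, ranks):
    # Rank-k factor-analytic covariance per dimension: d*k loadings plus
    # d diagonal noise terms (rotational redundancy ignored, kept simple).
    return sum(d * k + d for d, k in zip(dims, ranks))

dims = (40, 2, 9, 23)  # hypothetical age x sex x country x year cells
print(unstructured_params(dims))                     # 137125080 (~1.4e8)
print(separable_params(dims))                        # 1144
print(separable_factor_params(dims, (3, 1, 2, 3)))   # 283
```

The middle count already explains why separable ML estimates can fail to exist for small samples, and the last why a factor-analytic submodel helps further.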
KRONECKER GRAPHICAL LASSO
Abstract

Cited by 1 (1 self)
We consider high-dimensional estimation of a (possibly sparse) Kronecker-decomposable covariance matrix given i.i.d. Gaussian samples. We propose a sparse covariance estimation algorithm, Kronecker Graphical Lasso (KGlasso), for the high-dimensional setting that takes advantage of structure and sparsity. Convergence and limit point characterization of this iterative algorithm are established. Compared to standard Glasso, KGlasso has low computational complexity as the dimension of the covariance matrix increases. We derive a tight MSE convergence rate for KGlasso and show it strictly outperforms standard Glasso and FF. Simulations validate these results and show that KGlasso outperforms the maximum-likelihood solution (FF) in the high-dimensional, small-sample regime. Index Terms—sparsity, structured covariance estimation, penalized maximum likelihood, graphical lasso