Regularized rank-based estimation of high-dimensional nonparanormal graphical models (2012)

by L. Xue, H. Zou (MR3449812)
Venue: The Annals of Statistics

Results 1 - 10 of 32

High-dimensional semiparametric Gaussian copula graphical models

by Han Liu, Fang Han, Ming Yuan, John Lafferty, Larry Wasserman - The Annals of Statistics, 2012
Abstract - Cited by 51 (19 self)
We propose a semiparametric approach called the nonparanormal SKEPTIC for efficiently and robustly estimating high-dimensional undirected graphical models. To achieve modeling flexibility, we consider the nonparanormal graphical models proposed by Liu, Lafferty and Wasserman [J. Mach. Learn. Res. 10 (2009) 2295–2328]. To achieve estimation robustness, we exploit nonparametric rank-based correlation coefficient estimators, including Spearman’s rho and Kendall’s tau. We prove that the nonparanormal SKEPTIC achieves the optimal parametric rates of convergence for both graph recovery and parameter estimation. This result suggests that the nonparanormal graphical models can be used as a safe replacement of the popular Gaussian graphical models, even when the data are truly Gaussian. Besides theoretical analysis, we also conduct thorough numerical simulations to compare the graph recovery performance of different estimators under both ideal and noisy settings. The proposed methods are then applied on a large-scale genomic data set to illustrate their empirical usefulness. The R package huge implementing the proposed methods is available on the Comprehensive R

Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

by Po-Ling Loh, Martin J. Wainwright, 2012
Abstract - Cited by 17 (3 self)
We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. The proof exploits a combination of ideas from the geometry of exponential families, junction tree theory, and convex analysis. These population-level results have various consequences for graph selection methods, both known and novel, including a novel method for structure estimation for missing or corrupted observations. We provide non-asymptotic guarantees for such methods, and illustrate the sharpness of these predictions via simulations.
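
The claim that, for certain graph structures, the support of the inverse covariance matrix of indicator variables reflects the conditional independence structure can be checked by hand on a tiny tree. The following is a minimal sketch, not taken from the paper: it assumes a three-node binary Markov chain X1 - X2 - X3 with hypothetical parameters `p1` and `p_stay`, computes the exact covariance matrix of the three indicators, and inverts it. The (1, 3) entry of the inverse should vanish because X1 and X3 are not adjacent.

```python
from itertools import product


def chain_joint(p1, p_stay):
    """Joint pmf of a binary Markov chain X1 - X2 - X3.

    p1 is P(X1 = 1) and p_stay is P(X_{t+1} = X_t); both are
    illustrative parameters, not values from the paper.
    """
    pmf = {}
    for x1, x2, x3 in product((0, 1), repeat=3):
        prob = p1 if x1 == 1 else 1.0 - p1
        prob *= p_stay if x2 == x1 else 1.0 - p_stay
        prob *= p_stay if x3 == x2 else 1.0 - p_stay
        pmf[(x1, x2, x3)] = prob
    return pmf


def covariance(pmf):
    """Exact 3x3 covariance matrix of (X1, X2, X3) under the pmf."""
    dim = 3
    mean = [sum(p * x[i] for x, p in pmf.items()) for i in range(dim)]
    return [
        [
            sum(p * (x[i] - mean[i]) * (x[j] - mean[j]) for x, p in pmf.items())
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def inverse_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[v / det for v in row] for row in adj]


precision = inverse_3x3(covariance(chain_joint(p1=0.3, p_stay=0.8)))
for row in precision:
    print(["%.6f" % v for v in row])
```

Up to floating-point error, the entry for the non-adjacent pair (X1, X3) is zero while the entries for the two edges are not. The abstract's results cover more general graphs and generalized covariance matrices, which this toy example does not attempt to reproduce.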

Alternating Direction Methods for Latent Variable Gaussian Graphical Model Selection

by Shiqian Ma, Lingzhou Xue, Hui Zou, 2013
Abstract - Cited by 11 (2 self)
Chandrasekaran, Parrilo, and Willsky (2012) proposed a convex optimization problem for graphical model selection in the presence of unobserved variables. This convex optimization problem aims to estimate an inverse covariance matrix that can be decomposed into a sparse matrix minus a low-rank matrix from sample data. Solving this convex optimization problem is very challenging, especially for large problems. In this letter, we propose two alternating direction methods for solving this problem. The first method is to apply the classic alternating direction method of multipliers to solve the problem as a consensus problem. The second method is a proximal gradient-based alternating direction method of multipliers. Our methods take advantage of the special structure of the problem and thus can solve large problems very efficiently. A global convergence result is established for the proposed methods. Numerical results on both synthetic data and gene expression data show that our methods usually solve problems with 1 million variables in 1 to 2 minutes and are usually 5 to 35 times faster than a state-of-the-art Newton-CG proximal point algorithm.
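
The abstract does not spell out the individual updates. As a hedged illustration only: when a sparse component is encouraged through an elementwise l1 penalty, alternating direction schemes commonly reduce the corresponding subproblem to elementwise soft-thresholding. Whether the letter uses exactly this update is an assumption here, and the function name `soft_threshold` and the threshold `tau` are hypothetical.

```python
def soft_threshold(matrix, tau):
    """Elementwise soft-thresholding: the proximal operator of tau * ||.||_1.

    Generic building block, shown for illustration; it is not claimed
    to be the exact update used in the cited letter.
    """
    def shrink(v):
        if v > tau:
            return v - tau
        if v < -tau:
            return v + tau
        return 0.0

    return [[shrink(v) for v in row] for row in matrix]


example = [[1.5, -0.2], [0.3, -2.0]]
print(soft_threshold(example, 0.5))
```

Entries with magnitude at most `tau` are set to zero, which is what produces sparsity; larger entries are shrunk toward zero by `tau`.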

Transelliptical graphical models

by Han Liu, Fang Han, Cun-Hui Zhang - In Advances in Neural Information Processing Systems, 2012
Abstract - Cited by 11 (7 self)
We advocate the use of a new distribution family, the transelliptical, for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work and the transelliptical component analysis proposed by Han and Liu (2012).
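
A rank-based estimator of this kind can be illustrated with Kendall's tau. The sketch below assumes the standard relation between Kendall's tau and the latent correlation in these families, sigma = sin(pi/2 * tau), and uses a plain O(n^2) pairwise count that assumes no ties; the helper names are hypothetical, and the regularization step that turns the correlation estimate into a sparse graph is not shown.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau by direct pair counting (O(n^2), assumes no ties)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))


def latent_correlation(x, y):
    """Map Kendall's tau to a latent correlation via sin(pi/2 * tau)."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))


x = [0.1, 1.3, -0.7, 2.2, 0.5]
y = [0.4, 0.9, -1.1, 1.8, 1.2]

# Strictly increasing marginal transformations leave the ranks unchanged.
x_transformed = [math.exp(v) for v in x]
y_transformed = [v ** 3 for v in y]

print(kendall_tau(x, y), kendall_tau(x_transformed, y_transformed))
print(latent_correlation(x, y))
```

Because the statistic depends only on the ordering of the observations, the estimate is the same before and after the monotone transformations, which is why the unknown marginal functions never need to be estimated.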

Citation Context

...he Kendall’s tau statistic with a computational complexity O(n log n) [4]. Therefore, the incurred computational burden is negligible. Remark 4.1. Similar rank-based procedures have been discussed in [19, 18, 28]. Unlike our work, they focus on the more restrictive nonparanormal family and discuss several rank-based procedures using the normal-score, Spearman’s rho, and Kendall’s tau. Unlike our results, they...

Coda: High dimensional copula discriminant analysis

by Fang Han, Tuo Zhao, Han Liu, Tong Zhang - Journal of Machine Learning Research, 2012
Abstract - Cited by 9 (3 self)
We propose a high dimensional classification method, named the Copula Discriminant Analysis (CODA). The CODA generalizes the normal-based linear discriminant analysis to the larger Gaussian Copula models (or the nonparanormal) as proposed by Liu et al. (2009). To simultaneously achieve estimation efficiency and robustness, the nonparametric rank-based methods including the Spearman’s rho and Kendall’s tau are exploited in estimating the covariance matrix. In high dimensional settings, we prove that the sparsity pattern of the discriminant features can be consistently recovered with the parametric rate, and the expected misclassification error is consistent to the Bayes risk. Our theory is backed up by careful numerical experiments, which show that the extra flexibility gained by the CODA method incurs little efficiency loss even when the data are truly Gaussian. These results suggest that the CODA method can be an alternative choice besides the normal-based high dimensional linear discriminant analysis.

Citation Context

... rho and Kendall’s tau, which are invariant to the strictly increasing functions f_j. They have been shown to enjoy the optimal parametric rate in estimating the correlation matrix (Liu et al., 2012; Xue and Zou, 2012). Unlike previous analysis, a new contribution of this paper is that we provide an extra condition on the transformation functions which guarantees the fast rates of convergence of the marginal mean ...

Estimating Sparse Precision Matrix: Optimal Rates of Convergence and Adaptive Estimation

by T. Tony Cai, Weidong Liu, Harrison H. Zhou
Abstract - Cited by 8 (2 self)
The precision matrix is of significant importance in a wide range of applications in multivariate analysis. This paper considers adaptive minimax estimation of sparse precision matrices in the high dimensional setting. Optimal rates of convergence are established for a range of matrix norm losses. A fully data-driven estimator based on adaptive constrained ℓ1 minimization is proposed and its rate of convergence is obtained over a collection of parameter spaces. The estimator, called ACLIME, is easy to implement and performs well numerically. A major step in establishing the minimax rate of convergence is the derivation of a rate-sharp lower bound. A “two-directional” lower bound technique is applied to obtain the minimax lower bound. The upper and lower bounds together yield the optimal rates of convergence for sparse precision matrix estimation and show that the ACLIME estimator is adaptively minimax rate optimal for a collection of parameter spaces and a range of matrix norm losses simultaneously.

Semiparametric Principal Component Analysis

by Fang Han, Han Liu
Abstract - Cited by 6 (5 self)
We propose two new principal component analysis methods in this paper utilizing a semiparametric model. The corresponding methods are named Copula Component Analysis (COCA) and Copula PCA. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. The COCA and Copula PCA accordingly estimate the leading eigenvectors of the correlation and covariance matrices of the latent Gaussian distribution. The robust nonparametric rank-based correlation coefficient estimator, Spearman’s rho, is exploited in estimation. We prove that, under suitable conditions, although the marginal distributions can be arbitrarily continuous, the COCA and Copula PCA estimators obtain fast estimation rates and are feature selection consistent in the setting where the dimension is nearly exponentially large relative to the sample size. Careful numerical experiments on the synthetic and real data are conducted to back up the theoretical results. We also discuss the relationship with the transelliptical component analysis proposed by Han and Liu (2012).
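
The Spearman's rho step can be sketched as follows. Under a Gaussian copula, the latent correlation is related to the population Spearman's rho by sigma = 2 * sin(pi/6 * rho); the code below applies that map to the sample Spearman's rho of each pair of columns. This is a minimal illustration assuming no ties, with hypothetical helper names; the eigenvector estimation that COCA and Copula PCA perform on the resulting matrix is not shown.

```python
import math


def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean = (n + 1) / 2.0
    num = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    den = sum((a - mean) ** 2 for a in rx)  # identical for ry without ties
    return num / den


def latent_correlation_matrix(columns):
    """Estimate the latent correlation matrix via 2 * sin(pi/6 * rho)."""
    d = len(columns)
    sigma = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            rho = spearman_rho(columns[j], columns[k])
            sigma[j][k] = sigma[k][j] = 2.0 * math.sin(math.pi / 6.0 * rho)
    return sigma


x = [0.1, 1.3, -0.7, 2.2, 0.5]
y = [0.4, 0.9, -1.1, 1.8, 1.2]
z = [-v for v in x]  # perfectly anti-monotone in x

sigma_hat = latent_correlation_matrix([x, y, z])
for row in sigma_hat:
    print(["%.4f" % v for v in row])
```

The matrix `sigma_hat` plays the role of the correlation matrix of the latent Gaussian distribution. It is symmetric with unit diagonal but is not guaranteed to be positive semidefinite, so a downstream method may need to account for that.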

Citation Context

...e transformation functions {f_j^0}_{j=1}^d as [15] did, realizing that {f_j^0}_{j=1}^d ranks of the data, we utilize the nonparametric correlation coefficient estimator, Spearman’s rho, to estimate Σ^0. [14, 22] prove that the corresponding estimators converge to Σ^0 at a parametric rate. In theory, we analyze the general case that X follows the nonparanormal and θ_1 is weakly sparse, here θ_1 is the leadi...

Efficient Estimation of Approximate Factor Models via Regularized Maximum Likelihood

by Jushan Bai, Yuan Liao, 2014
Abstract - Cited by 4 (0 self)
Abstract not found

Mixed graphical models via exponential families

by Eunho Yang, Yulia Baker, Pradeep Ravikumar, Genevera I. Allen, Zhandong Liu - In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, 2014
Abstract - Cited by 4 (2 self)
Markov Random Fields, or undirected graphical models, are widely used to model high-dimensional multivariate data. Classical instances of these models, such as Gaussian Graphical and Ising Models, as well as recent extensions

Citation Context

...odels, beyond the Gaussian-Ising instance, to encompass varied types of heterogeneous variables. While our construction of general mixed graphical models is a natural extension of that of Markov Random Fields for variables of one type, there are possibly other ways of jointly modeling variables of mixed types. First, there has been much recent interest in non-parametric extensions of graphical models based on copula transforms (Dobra and Lenkoski, 2011; Liu et al., 2012) or on robust estimators of relationships between variables, such as Spearman’s rho or Kendall’s tau rank correlation (Xue and Zou, 2012). While such approaches could be employed for mixed types of variables, non-parametric approaches in general might not adequately account for differing domains of mixed variables and likely have less statistical power than parametric methods for recovering graph structure in high-dimensional settings. Second, our construction is closely related to that of conditional random field (CRF) models (Lafferty, 2001), and particularly CRFs constructed via node-conditional exponential families as recently investigated by Yang et al. (2013a). Deriving a mixed MRF from such CRFs by taking a product of a ...

Gaussian graphical model estimation with false discovery rate control

by Weidong Liu - Annals of Statistics, 2013
Abstract - Cited by 3 (1 self)
This paper studies the estimation of high dimensional Gaussian graphical models (GGMs). Typically, the existing methods depend on regularization techniques. As a result, it is necessary to choose the regularization parameter. However, the precise relationship between the regularization parameter and the number of false edges in GGM estimation is unclear. Hence, it is impossible to evaluate their performance rigorously. In this paper, we propose an alternative method based on a multiple testing procedure. Based on our new test statistics for conditional dependence, we propose a simultaneous testing procedure for conditional dependence in GGM. Our method can control the false discovery rate (FDR) asymptotically. The numerical performance of the proposed method shows that our method works quite well.
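
The abstract does not give the test statistics or the exact thresholding rule. To show what FDR control over a set of candidate edges looks like in general, the sketch below uses the classical Benjamini-Hochberg step-up procedure, swapped in here as a generic stand-in rather than the paper's procedure. It assumes one p-value per candidate edge is already available; the function name and inputs are hypothetical.

```python
def benjamini_hochberg(p_values, alpha):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure.

    Generic FDR-controlling rule, used only as an illustration; the
    cited paper builds its own statistics and procedure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value passes.
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# One hypothetical p-value per candidate edge.
edge_p_values = [0.01, 0.02, 0.03, 0.5]
print(benjamini_hochberg(edge_p_values, alpha=0.05))
```

Each rejected index corresponds to an edge declared present, and the target level `alpha` replaces the regularization parameter as the quantity the user chooses.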