Results 1–10 of 51
The huge Package for High-dimensional Undirected Graph Estimation in R
, 2012
"... We describe an R package named huge which provides easytouse functions for estimating high dimensional undirected graphs from data. This package implements recent results in the literature, including Friedman et al. [2007b], Liu et al. [2009] and Liu et al. [2010]. Compared with the existing graph ..."
Abstract

Cited by 21 (10 self)
 Add to MetaCart
We describe an R package named huge which provides easytouse functions for estimating high dimensional undirected graphs from data. This package implements recent results in the literature, including Friedman et al. [2007b], Liu et al. [2009] and Liu et al. [2010]. Compared with the existing graph estimation package glasso, the huge package provides extra features: (1) instead of using Fortan, it is written in C, which makes the code more portable and easier to modify; (2) besides fitting Gaussian graphical models, it also provides functions for fitting high dimensional semiparametric Gaussian copula models; (3) more functions like datadependent model selection, data generation and graph visualization; (4) a minor convergence problem of the graphical lasso algorithm is corrected; (5) the package allows the user to apply both lossless and lossy screening rules to scale up largescale problems, making a tradeoff between computational and statistical efficiency. 1
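The lossless screening mentioned in point (5) has a simple form (due to Witten et al. and Mazumder & Hastie): for the graphical lasso with penalty λ, thresholding the absolute sample covariance at λ and splitting the variables into connected components decomposes the problem exactly. A minimal numpy sketch of that idea (an illustration, not huge's actual implementation):

```python
import numpy as np

def screening_components(S, lam):
    """Lossless screening for the graphical lasso: variables in different
    connected components of the graph with edges {|S_ij| > lam} can be
    estimated independently."""
    d = S.shape[0]
    adj = np.abs(S) > lam
    np.fill_diagonal(adj, False)
    labels = -np.ones(d, dtype=int)
    comp = 0
    for start in range(d):          # depth-first search over components
        if labels[start] >= 0:
            continue
        stack = [start]
        labels[start] = comp
        while stack:
            i = stack.pop()
            for j in np.nonzero(adj[i])[0]:
                if labels[j] < 0:
                    labels[j] = comp
                    stack.append(j)
        comp += 1
    return labels

# Nearly block-diagonal covariance: two groups decouple at lam = 0.5
S = np.array([[1.0, 0.8, 0.0, 0.1],
              [0.8, 1.0, 0.1, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.1, 0.0, 0.9, 1.0]])
print(screening_components(S, 0.5))   # [0 0 1 1]
```

Each component can then be handed to the glasso solver separately, which is where the computational savings come from.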
Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses
, 2012
"... We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects th ..."
Abstract

Cited by 17 (3 self)
 Add to MetaCart
We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a nonGaussian distribution. The proof exploits a combination of ideas from the geometry of exponential families, junction tree theory, and convex analysis. These populationlevel results have various consequences for graph selection methods, both known and novel, including a novel method for structure estimation for missing or corrupted observations. We provide nonasymptotic guarantees for such methods, and illustrate the sharpness of these predictions via simulations.
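The Gaussian fact that this paper generalizes is easy to see numerically: zeros in the precision matrix encode conditional independence, even though the covariance itself is dense. A small numpy check on a 4-node chain graph:

```python
import numpy as np

# Precision matrix of a 4-node chain graph 1-2-3-4: only adjacent
# pairs have nonzero off-diagonal entries.
Theta = np.array([[ 2.0, -1.0,  0.0,  0.0],
                  [-1.0,  2.0, -1.0,  0.0],
                  [ 0.0, -1.0,  2.0, -1.0],
                  [ 0.0,  0.0, -1.0,  2.0]])

Sigma = np.linalg.inv(Theta)     # the covariance is fully dense
recovered = np.linalg.inv(Sigma)

# The covariance hides the graph, but its inverse recovers it:
print(abs(Sigma[0, 3]) > 1e-8)      # True: nodes 1 and 4 are marginally dependent
print(abs(recovered[0, 3]) < 1e-8)  # True: but conditionally independent
```

The paper's contribution is showing when an analogous support statement holds for *generalized* covariance matrices of discrete (non-Gaussian) models.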
Alternating Direction Methods for Latent Variable Gaussian Graphical Model Selection
, 2013
"... Chandrasekaran, Parrilo, andWillsky (2012) proposed a convex optimization problem for graphical model selection in the presence of unobserved variables. This convex optimization problem aims to estimate an inverse covariance matrix that can be decomposed into a sparse matrix minus a lowrank matrix ..."
Abstract

Cited by 11 (2 self)
 Add to MetaCart
Chandrasekaran, Parrilo, andWillsky (2012) proposed a convex optimization problem for graphical model selection in the presence of unobserved variables. This convex optimization problem aims to estimate an inverse covariance matrix that can be decomposed into a sparse matrix minus a lowrank matrix from sample data. Solving this convex optimization problem is very challenging, especially for large problems. In this letter, we propose two alternating direction methods for solving this problem. The first method is to apply the classic alternating direction method of multipliers to solve the problem as a consensus problem. The second method is a proximal gradientbased alternatingdirection method of multipliers. Our methods take advantage of the special structure of the problem and thus can solve large problems very efficiently. A global convergence result is established for the proposed methods. Numerical results on both synthetic data and gene expression data show that our methods usually solve problems with 1 million variables in 1 to 2 minutes and are usually 5 to 35 times faster than a stateoftheart NewtonCG proximal point algorithm.
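The workhorses inside such ADMM schemes are two proximal operators: entrywise soft-thresholding for the sparse (ℓ1) term and singular-value thresholding for the low-rank (nuclear-norm) term. A numpy sketch of the two operators (not the authors' full solver):

```python
import numpy as np

def soft_threshold(A, t):
    """Prox of t*||.||_1: shrink each entry toward zero by t."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def svd_threshold(A, t):
    """Prox of t*||.||_* (nuclear norm): soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

A = np.array([[3.0, -0.2],
              [0.1, -2.0]])
print(soft_threshold(A, 0.5))          # small entries are zeroed out
L = svd_threshold(A, 2.0)
print(np.linalg.matrix_rank(L))        # 1: the smaller singular value is killed
```

An ADMM iteration for the latent-variable problem alternates these maps (plus an eigenvalue step for the log-likelihood term), which is why each iteration stays cheap.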
Transelliptical graphical models
 In Advances in Neural Information Processing Systems
, 2012
"... We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the varia ..."
Abstract

Cited by 11 (7 self)
 Add to MetaCart
(Show Context)
We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rankbased regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work with the transelliptical component analysis proposed by Han and Liu (2012). 1
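The rank-based estimator at the heart of this line of work maps Kendall's tau through sin(π τ / 2), which consistently estimates the latent correlation and is invariant to monotone marginal transformations. A small numpy sketch:

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) pairs / total pairs
    (ties ignored for simplicity)."""
    n = len(x)
    s = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return s / (n * (n - 1) / 2)

def latent_correlation(X):
    """sin(pi/2 * tau) transform: estimates the latent correlation
    matrix of a transelliptical distribution from ranks alone."""
    d = X.shape[1]
    S = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            S[j, k] = S[k, j] = np.sin(np.pi / 2 * kendall_tau(X[:, j], X[:, k]))
    return S

# Rank invariance: exp-transforming a column leaves the estimate unchanged
X = np.array([[0.1, 1.0], [0.5, 0.2], [0.9, 1.5], [0.3, 0.4]])
X2 = X.copy()
X2[:, 0] = np.exp(X2[:, 0])
print(np.allclose(latent_correlation(X), latent_correlation(X2)))  # True
```

That invariance is exactly what makes the estimator robust to the unknown marginal transformations.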
CODA: High Dimensional Copula Discriminant Analysis
 Journal of Machine Learning Research
, 2012
"... We propose a high dimensional classification method, named the Copula Discriminant Analysis (CODA). The CODA generalizes the normalbased linear discriminant analysis to the larger Gaussian Copula models (or the nonparanormal) as proposed by Liu et al. (2009). To simultaneously achieve estimation e ..."
Abstract

Cited by 9 (3 self)
 Add to MetaCart
(Show Context)
We propose a high dimensional classification method, named the Copula Discriminant Analysis (CODA). The CODA generalizes the normalbased linear discriminant analysis to the larger Gaussian Copula models (or the nonparanormal) as proposed by Liu et al. (2009). To simultaneously achieve estimation efficiency and robustness, the nonparametric rankbased methods including the Spearman’s rho and Kendall’s tau are exploited in estimating the covariance matrix. In high dimensional settings, we prove that the sparsity pattern of the discriminant features can be consistently recovered with the parametric rate, and the expected misclassification error is consistent to the Bayes risk. Our theory is backed up by careful numerical experiments, which show that the extra flexibility gained by the CODA method incurs little efficiency loss even when the data are truly Gaussian. These results suggest that the CODA method can be an alternative choice besides the normalbased high dimensional linear discriminant analysis.
Estimating Sparse Precision Matrix: Optimal Rates of Convergence and Adaptive Estimation
"... Precision matrix is of significant importance in a wide range of applications in multivariate analysis. This paper considers adaptive minimax estimation of sparse precision matrices in the high dimensional setting. Optimal rates of convergence are established for a range of matrix norm losses. A ful ..."
Abstract

Cited by 8 (2 self)
 Add to MetaCart
Precision matrix is of significant importance in a wide range of applications in multivariate analysis. This paper considers adaptive minimax estimation of sparse precision matrices in the high dimensional setting. Optimal rates of convergence are established for a range of matrix norm losses. A fully data driven estimator based on adaptive constrained ℓ1 minimization is proposed and its rate of convergence is obtained over a collection of parameter spaces. The estimator, called ACLIME, is easy to implement and performs well numerically. A major step in establishing the minimax rate of convergence is the derivation of a ratesharp lower bound. A “twodirectional ” lower bound technique is applied to obtain the minimax lower bound. The upper and lower bounds together yield the optimal rates of convergence for sparse precision matrix estimation and show that the ACLIME estimator is adaptively minimax rate optimal for a collection of parameter spaces and a range of matrix norm losses simultaneously.
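The constrained ℓ1 minimization underlying CLIME-type estimators solves, for each column j, min ‖β‖₁ subject to ‖Ŝβ − e_j‖∞ ≤ λ, which is a linear program. A sketch of the basic (non-adaptive) column problem via scipy's LP solver, using the standard β = u − v split:

```python
import numpy as np
from scipy.optimize import linprog

def clime_column(S, j, lam):
    """One column of the (non-adaptive) CLIME estimator:
       min ||beta||_1  s.t.  ||S @ beta - e_j||_inf <= lam,
    written as an LP over beta = u - v with u, v >= 0."""
    d = S.shape[0]
    e = np.zeros(d)
    e[j] = 1.0
    c = np.ones(2 * d)                       # ||beta||_1 = sum(u) + sum(v)
    A = np.block([[S, -S], [-S, S]])         # encodes the two-sided inf-norm bound
    b = np.concatenate([lam + e, lam - e])
    res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
    u, v = res.x[:d], res.x[d:]
    return u - v

# Sanity check with S = I: the constraint becomes |beta - e_j|_inf <= lam,
# and the l1-minimal solution is (1 - lam) on coordinate j, zero elsewhere.
beta = clime_column(np.eye(3), 0, 0.1)
print(np.round(beta, 6))
```

ACLIME refines this by choosing the constraint level entrywise in a data-driven way, which is what yields the adaptive minimax optimality claimed in the abstract.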
TIGER: A Tuning-Insensitive Approach for Optimally Estimating Gaussian Graphical Models
, 2012
"... We propose a new procedure for estimating high dimensional Gaussian graphical models. Our approach is asymptotically tuningfree and nonasymptotically tuninginsensitive: it requires very few efforts to choose the tuning parameter in finite sample settings. Computationally, our procedure is signifi ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
(Show Context)
We propose a new procedure for estimating high dimensional Gaussian graphical models. Our approach is asymptotically tuningfree and nonasymptotically tuninginsensitive: it requires very few efforts to choose the tuning parameter in finite sample settings. Computationally, our procedure is significantly faster than existing methods due to its tuninginsensitive property. Theoretically, the obtained estimator is simultaneously minimax optimal for precision matrix estimation under different norms. Empirically, we illustrate the advantages of our method using thorough simulated and real examples. The R package bigmatrix implementing the proposed methods is available on the Comprehensive R Archive Network:
Semiparametric Principal Component Analysis
"... We propose two new principal component analysis methods in this paper utilizing a semiparametric model. The according methods are named Copula Component Analysis (COCA) and Copula PCA. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are ..."
Abstract

Cited by 6 (5 self)
 Add to MetaCart
(Show Context)
We propose two new principal component analysis methods in this paper utilizing a semiparametric model. The according methods are named Copula Component Analysis (COCA) and Copula PCA. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. The COCA and Copula PCA accordingly estimate the leading eigenvectors of the correlation and covariance matrices of the latent Gaussian distribution. The robust nonparametric rankbased correlation coefficient estimator, Spearman’s rho, is exploited in estimation. We prove that, under suitable conditions, although the marginal distributions can be arbitrarily continuous, the COCA and Copula PCA estimators obtain fast estimation rates and are feature selection consistent in the setting where the dimension is nearly exponentially large relative to the sample size. Careful numerical experiments on the synthetic and real data are conducted to back up the theoretical results. We also discuss the relationship with the transelliptical component analysis proposed by Han and Liu (2012). 1
The Nonparanormal SKEPTIC
"... We propose a semiparametric method we call the nonparanormal skeptic for estimating high dimensional undirected graphical models. The underlying model is the nonparanormal family proposed by Liu et al. (2009). The method exploits nonparametric rankbased correlation coefficient estimators, including ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
We propose a semiparametric method we call the nonparanormal skeptic for estimating high dimensional undirected graphical models. The underlying model is the nonparanormal family proposed by Liu et al. (2009). The method exploits nonparametric rankbased correlation coefficient estimators, including Spearman’s rho and Kendall’s tau. In high dimensional settings, we prove that the nonparanormal skeptic achieves the optimal parametric rate of convergence for both graph and parameter estimation. This result suggests that the nonparanormal graphical model can be a safe replacement for the Gaussian graphical model, even when the data are Gaussian.
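The key move is to replace the sample correlation with a rank-based estimate before running any Gaussian graph estimator. A numpy sketch: estimate the latent correlation via sin(π τ̂ / 2) from data with distorted margins, then invert it (the actual method feeds this matrix into glasso or CLIME rather than a plain inverse, which only works in low dimensions):

```python
import numpy as np

def tau_correlation(X):
    """Kendall's tau matrix mapped through sin(pi/2 * tau): the rank-based
    correlation estimate used by the nonparanormal SKEPTIC (ties ignored)."""
    n, d = X.shape
    iu = np.triu_indices(n, 1)
    sgn = [np.sign(X[:, None, a] - X[None, :, a])[iu] for a in range(d)]
    S = np.eye(d)
    for a in range(d):
        for b in range(a + 1, d):
            tau = float((sgn[a] * sgn[b]).mean())
            S[a, b] = S[b, a] = np.sin(np.pi / 2 * tau)
    return S

# Nonparanormal data: a Gaussian chain graph with cubed margins
rng = np.random.default_rng(1)
Theta = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])
Z = rng.multivariate_normal(np.zeros(3), np.linalg.inv(Theta), size=300)
X = Z ** 3                      # monotone marginal transform: data are not Gaussian

Prec = np.linalg.inv(tau_correlation(X))
# The edge (1,2) of the chain shows a large partial correlation; the
# non-edge (1,3) stays near zero despite the distorted margins.
print(np.round(Prec, 2))
```

In the actual SKEPTIC procedure the same tau_correlation matrix is handed to a sparse estimator (glasso or CLIME), which is what yields the optimal-rate graph recovery stated in the abstract.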