Results 1–10 of 11
Latent Variable Graphical Model Selection via Convex Optimization
, 2010
Abstract

Cited by 76 (4 self)
Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistical model over the entire collection of variables? We address this question in the setting in which the latent and observed variables are jointly Gaussian, with the conditional statistics of the observed variables conditioned on the latent variables being specified by a graphical model. As a first step we give natural conditions under which such latent-variable Gaussian graphical models are identifiable given marginal statistics of only the observed variables. Essentially these conditions require that the conditional graphical model among the observed variables is sparse, while the effect of the latent variables is “spread out” over most of the observed variables. Next we propose a tractable convex program based on regularized maximum likelihood for model selection in this latent-variable setting; the regularizer uses both the ℓ1 norm and the nuclear norm. Our modeling framework can be viewed as a combination of dimensionality reduction (to identify latent variables) and graphical modeling (to capture the remaining statistical structure not attributable to the latent variables), and it consistently estimates both the number of hidden components and the conditional graphical model structure among the observed variables. These results are applicable in the high-dimensional setting in which the number of latent/observed variables grows with the number of samples of the observed variables. The geometric properties of the algebraic varieties of sparse matrices and of low-rank matrices play an important role in our analysis.
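The convex program this abstract describes can be sketched as a log-determinant program over a sparse matrix S and a low-rank matrix L; for positive semidefinite L the nuclear norm reduces to the trace. The symbols below (trade-off parameters λ_n and γ, sample covariance Σ̂ of the observed variables) are my labels for the quantities the abstract mentions, not notation quoted from the paper:

```latex
(\hat{S}, \hat{L}) \;=\; \operatorname*{arg\,min}_{S,\,L}\;
-\log\det(S - L) \;+\; \operatorname{tr}\!\big(\hat{\Sigma}\,(S - L)\big)
\;+\; \lambda_n \big( \gamma \, \lVert S \rVert_1 + \operatorname{tr}(L) \big)
\quad \text{subject to } S - L \succ 0,\; L \succeq 0.
```

Here S − L estimates the marginal precision matrix of the observed variables: S is the sparse conditional precision among the observed variables, and L is the low-rank effect of marginalizing out the hidden components.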
Solving multiple-block separable convex minimization problems using two-block alternating direction method of multipliers
, 2013
Node-based learning of multiple Gaussian graphical models. Available at arXiv:1303.5145
, 2013
Abstract

Cited by 6 (0 self)
We consider the problem of estimating high-dimensional Gaussian graphical models corresponding to a single set of variables under several distinct conditions. This problem is motivated by the task of recovering transcriptional regulatory networks on the basis of gene expression data containing heterogeneous samples, such as different disease states, multiple species, or different developmental stages. We assume that most aspects of the conditional dependence networks are shared, but that there are some structured differences between them. Rather than assuming that similarities and differences between networks are driven by individual edges, we take a node-based approach, which in many cases provides a more intuitive interpretation of the network differences. We consider estimation under two distinct assumptions: (1) differences between the K networks are due to individual nodes that are perturbed across conditions, or (2) similarities among the K networks are due to the presence of common hub nodes that are shared across all K networks. Using a row-column overlap norm penalty function, we formulate two convex optimization problems that correspond to these two assumptions. We solve these problems using an alternating direction method of multipliers algorithm, and we derive a set of necessary and sufficient conditions that allows us to decompose the problem into independent subproblems so that our algorithm can be scaled to high-dimensional settings. Our proposal is illustrated on synthetic data, a webpage data set, and a brain cancer gene expression data set.
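The alternating direction method of multipliers mentioned in this abstract alternates cheap closed-form updates over split variables. As a hedged illustration of the generic two-block template (a lasso instance in Python, not the paper's actual node-based updates; all names and parameter values are illustrative):

```python
import numpy as np

def soft_threshold(x, kappa):
    # Elementwise soft-thresholding: the proximal operator of kappa * ||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """Two-block ADMM for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    Splits the smooth and nonsmooth terms as min f(x) + g(z) s.t. x = z,
    the same alternating template used for graphical-model penalties.
    """
    n = A.shape[1]
    x = z = u = np.zeros(n)
    # Cache the matrix used in every x-update
    AtA = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(AtA, Atb + rho * (z - u))  # x-update: ridge solve
        z = soft_threshold(x + u, lam / rho)           # z-update: prox of l1
        u = u + x - z                                  # scaled dual ascent on x = z
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = admm_lasso(A, b, lam=1.0)  # sparse estimate; off-support entries are driven to zero
```

The z-update returns exact zeros, which is why ADMM is a natural fit for the sparsity-inducing penalties used throughout the papers in this list.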
A primal-dual algorithmic framework for constrained convex minimization
, 2014
Abstract

Cited by 3 (2 self)
 Add to MetaCart
We present a primal-dual algorithmic framework to obtain approximate solutions to a prototypical constrained convex optimization problem, and rigorously characterize how common structural assumptions affect the numerical efficiency. Our main analysis technique provides a fresh perspective on Nesterov's excessive gap technique in a structured fashion and unifies it with smoothing and primal-dual methods. For instance, through the choices of a dual smoothing strategy and a center point, our framework subsumes decomposition algorithms, the augmented Lagrangian method, and the alternating direction method of multipliers as special cases, and provides optimal convergence rates on the primal objective residual as well as the primal feasibility gap of the iterates for all of these cases.
Optimal estimation of sparse correlation matrices of semiparametric Gaussian copulas
Abstract

Cited by 2 (2 self)
Statistical inference of semiparametric Gaussian copulas is well studied in the classical fixed-dimension, large-sample-size setting. Nevertheless, optimal estimation of the correlation matrix of a semiparametric Gaussian copula is understudied, especially when the dimension can far exceed the sample size. In this paper we derive the minimax rate of convergence under the matrix 1-norm and 2-norm for estimating large correlation matrices of semiparametric Gaussian copulas when the correlation matrices lie in a weak ℓq ball. We further show that an explicit rank-based thresholding estimator adaptively attains the minimax optimal rate of convergence simultaneously for all 0 ≤ q < 1. Numerical examples are provided to demonstrate the finite-sample performance of the rank-based thresholding estimator.
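The appeal of a rank-based estimator is that Kendall's tau is invariant to the unknown monotone marginal transformations, and for a Gaussian copula it maps to the latent correlation via ρ = sin(πτ/2). The Python sketch below uses that standard bridge plus hard thresholding; the threshold level and all other details are illustrative assumptions, not the paper's exact estimator:

```python
import numpy as np
from scipy.stats import kendalltau

def rank_correlation_estimate(X, threshold):
    """Rank-based estimate of a Gaussian-copula correlation matrix.

    Computes pairwise Kendall's tau, applies the bridge rho = sin(pi*tau/2),
    then hard-thresholds small entries to exploit sparsity.
    """
    n, p = X.shape
    R = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            tau, _ = kendalltau(X[:, j], X[:, k])
            R[j, k] = R[k, j] = np.sin(0.5 * np.pi * tau)
    keep = np.abs(R) >= threshold  # hard-threshold off-diagonal entries
    np.fill_diagonal(keep, True)
    return R * keep

rng = np.random.default_rng(1)
z = rng.standard_normal(500)
# Two correlated coordinates and one independent one, pushed through
# monotone marginals (exp and cube) -- the copula is unchanged by these.
X = np.column_stack([
    np.exp(z + 0.3 * rng.standard_normal(500)),
    (z + 0.3 * rng.standard_normal(500)) ** 3,
    rng.standard_normal(500),
])
R_hat = rank_correlation_estimate(X, threshold=0.2)
```

Because the estimate depends on the data only through ranks, R_hat recovers the strong latent correlation between the first two coordinates despite the heavily non-Gaussian marginals.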
Learning Latent Variable Gaussian Graphical Models
Abstract

Cited by 1 (0 self)
Gaussian graphical models (GGM) have been widely used in many high-dimensional applications ranging from biological and financial data to recommender systems. Sparsity in GGM plays a central role both statistically and computationally. Unfortunately, real-world data often do not fit sparse graphical models well. In this paper, we focus on a family of latent variable Gaussian graphical models (LVGGM), where the model is conditionally sparse given the latent variables but marginally non-sparse. In LVGGM, the inverse covariance matrix has a low-rank-plus-sparse structure and can be learned in a regularized maximum likelihood framework. We derive novel parameter estimation error bounds for LVGGM under mild conditions in the high-dimensional setting. These results complement the existing theory on structural learning and open up new possibilities for using LVGGM for statistical inference.
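The low-rank-plus-sparse structure referenced here follows from the block-inverse (Schur complement) formula: marginalizing out h latent variables subtracts a rank-at-most-h term from the sparse conditional precision. A minimal numpy sketch, with all dimensions and coupling strengths chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
p, h = 8, 2  # observed and (hypothetical) latent dimensions

# Sparse conditional precision among the observed variables (tridiagonal here)
S = np.eye(p) * 2.0
for i in range(p - 1):
    S[i, i + 1] = S[i + 1, i] = 0.4

# Dense coupling between observed and latent variables
K_oh = 0.3 * rng.standard_normal((p, h))
K_hh = np.eye(h) * 2.0

# Joint precision over (observed, latent); marginalizing the latent block
# leaves the Schur complement as the precision of the observed variables.
K = np.block([[S, K_oh], [K_oh.T, K_hh]])
L = K_oh @ np.linalg.inv(K_hh) @ K_oh.T  # low-rank correction, rank <= h
Theta_obs = S - L                        # marginal precision: sparse minus low-rank
```

Even though S is sparse, Theta_obs is generally dense, which is exactly why fitting a sparse GGM directly to such data fails and a sparse-plus-low-rank decomposition is needed.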
Learning Graphical Models With Hubs
Abstract

Cited by 1 (0 self)
We consider the problem of learning a high-dimensional graphical model in which certain hub nodes are highly connected to many other nodes. Many authors have studied the use of an ℓ1 penalty in order to learn a sparse graph in the high-dimensional setting. However, the ℓ1 penalty implicitly assumes that each edge is equally likely and independent of all other edges. We propose a general framework to accommodate more realistic networks with hub nodes, using a convex formulation that involves a row-column overlap norm penalty. We apply this general framework to three widely used probabilistic graphical models: the Gaussian graphical model, the covariance graph model, and the binary Ising model. An alternating direction method of multipliers algorithm is used to solve the corresponding convex optimization problems. On synthetic data, we demonstrate that our proposed framework outperforms competitors that do not explicitly model hub nodes. We illustrate our proposal on a webpage data set and a gene expression data set.
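A row-column overlap norm penalty decomposes the parameter matrix into a sparse component and a component with a few dense columns corresponding to hubs. One concrete instance of such a penalty looks as follows (the exact scalings and the treatment of the diagonal are assumptions here, not quoted from the paper):

```latex
\Omega(\Theta) \;=\;
\min_{Z,\,V \,:\; \Theta \,=\, Z + V + V^{\top}}
\Big\{ \lambda_1 \lVert Z \rVert_1
\;+\; \lambda_2 \lVert V \rVert_1
\;+\; \lambda_3 \sum_{j=1}^{p} \lVert V_{\cdot j} \rVert_2 \Big\}.
```

The ℓ1 terms keep Z and V sparse overall, while the group (ℓ2) penalty on the columns of V drives most columns to exactly zero, so the few surviving nonzero columns pick out the hub nodes.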
REJOINDER: LATENT VARIABLE GRAPHICAL MODEL SELECTION VIA CONVEX OPTIMIZATION
Robust Subspace Discovery via Relaxed Rank Minimization
Abstract
This paper examines the problem of robust subspace discovery from input data samples (instances) in the presence of overwhelming outliers and corruptions. A typical example is the case where we are given a set of images, each containing, e.g., a face at an unknown location and of an unknown size; our goal is to identify/detect the face in each image and simultaneously learn its model. This paper explores this direction by employing a simple generative subspace model and proposes a new formulation to simultaneously infer the label information and learn the model via low-rank optimization. Solving this problem enables us to simultaneously identify which instances belong to the subspace and to learn the corresponding subspace model. We give an efficient and effective algorithm based on the alternating direction method of multipliers (ADMM) and provide extensive simulations and experiments to verify the effectiveness of our method. The proposed scheme can also be applied to tackle many high-dimensional combinatorial selection problems.