Structured Learning of Gaussian Graphical Models
Citations: 8 (1 self)
Citations
4201 | Regression shrinkage and selection via the lasso
- Tibshirani
- 1996
Citation Context: ...is the entrywise ℓ1 norm. The $\hat\Theta$ that solves (1) serves as an estimate of $\Sigma^{-1}$. This estimate will be positive definite for any λ > 0, and sparse when λ is sufficiently large, due to the ℓ1 penalty [10] in (1). We refer to (1) as the graphical lasso formulation. This formulation is convex, and efficient algorithms for solving it are available [6, 4, 5, 7, 11]. 2.2 The fused graphical lasso: In rece...
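For context, formulation (1) is the estimator that scikit-learn ships as GraphicalLasso. A minimal sketch (the library calls are real; the toy data is ours, and note that scikit-learn applies the ℓ1 penalty to off-diagonal entries only):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Simulate n = 100 observations in p = 10 dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))

# alpha plays the role of lambda in (1); larger alpha -> sparser estimate.
model = GraphicalLasso(alpha=0.2).fit(X)
Theta_hat = model.precision_  # estimate of the inverse covariance Sigma^{-1}
print(np.count_nonzero(Theta_hat))  # fewer nonzeros as alpha grows
```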
1589 | Graphical Models
- Lauritzen
- 1996
Citation Context: ...e a GGM on the basis of n observations, $X_1, \ldots, X_n \in \mathbb{R}^p$, which are independent and identically distributed N(0, Σ). It is well known that this amounts to learning the sparsity structure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8,...
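A quick NumPy illustration of why the n > p versus p > n distinction matters (toy dimensions are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 5

# n > p: the empirical covariance S is (almost surely) invertible,
# and S^{-1} is the maximum likelihood estimate of Sigma^{-1}.
X = rng.standard_normal((50, p))
S = np.cov(X, rowvar=False)
Theta_mle = np.linalg.inv(S)

# p > n: S is rank-deficient, so the unpenalized MLE does not exist.
X_small = rng.standard_normal((3, p))
S_small = np.cov(X_small, rowvar=False)
print(np.linalg.matrix_rank(S_small))  # at most n - 1 = 2 < p
```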
1477 | Multivariate Analysis
- Mardia, Kent, et al.
- 1992
Citation Context: ...e a GGM on the basis of n observations, $X_1, \ldots, X_n \in \mathbb{R}^p$, which are independent and identically distributed N(0, Σ). It is well known that this amounts to learning the sparsity structure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8,...
1156 | Model selection and estimation in regression with grouped variables
- Yuan, Lin
- 2006
Citation Context: ...$\max_{\Theta^1 \in S^p_{++},\,\Theta^2 \in S^p_{++}} \{L(\Theta^1, \Theta^2) - \lambda_1\|\Theta^1\|_1 - \lambda_1\|\Theta^2\|_1 - \lambda_2 \sum_{j=1}^p \|\Theta^1_j - \Theta^2_j\|_2\}$, (4) where $\Theta^k_j$ is the jth column of the matrix $\Theta^k$. This amounts to applying a group lasso [15] penalty to the columns of $\Theta^1 - \Theta^2$. Since a group lasso penalty simultaneously shrinks all elements to which it is applied to zero, it appears that this will give the desired node perturbation stru...
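To see concretely how a group lasso penalty zeros out whole columns, here is a sketch of column-wise block soft-thresholding, the proximal operator of the penalty term in (4) (the function name, tolerance, and toy matrix are ours):

```python
import numpy as np

def column_group_soft_threshold(D, lam):
    """Proximal operator of lam * sum_j ||D_j||_2, applied column-wise.
    Each column is shrunk toward zero and set exactly to zero when its
    l2 norm falls below lam -- the mechanism by which the penalty in (4)
    zeros out entire columns of Theta^1 - Theta^2."""
    norms = np.linalg.norm(D, axis=0)
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return D * scale  # broadcasts the per-column scaling

D = np.array([[0.1, 2.0], [-0.2, 1.5]])
print(column_group_soft_threshold(D, 0.5))  # first column becomes all zeros
```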
998 | Distributed optimization and statistical learning via the alternating direction method of multipliers
- Boyd, Parikh, et al.
Citation Context: ...t easy to compute, and in addition the proximal operator of RCON is non-trivial to compute. In this section we present a fast and scalable alternating direction method of multipliers (ADMM) algorithm [22] to solve the problem (7). We first reformulate (7) by introducing new variables, so as to decouple some of the terms in the objective function that are difficult to optimize jointly. This will result...
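For readers new to ADMM, a self-contained sketch of the scaled-form updates from the generic template in [22], applied to a simpler problem (the lasso) so the three-step structure is visible. The splitting used for (7) follows the same pattern; this code is illustrative only, and all names and data are ours:

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=200):
    """Scaled-form ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1  s.t. x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    Atb = A.T @ b
    # The x-update is a ridge-like linear solve; factor the matrix once.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    for _ in range(iters):
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # x-update
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # prox of l1
        u = u + x - z  # scaled dual update
    return z

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 10))
b = A @ np.r_[1.0, -1.0, np.zeros(8)]  # true coefficients are sparse
print(np.round(admm_lasso(A, b, lam=1.0), 2))
```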
595 | Sparse inverse covariance estimation with the graphical lasso
- Friedman, Hastie, et al.
Citation Context: ...ure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8, 9] have considered maximizing the penalized log likelihood $\max_{\Theta \in S^p_{++}} \{\log\det\Theta - \operatorname{trace}(S\Theta) - \lambda\|\Theta\|_1\}$ (1), where S is the empirical covariance matrix based on the n observations, λ is a positive...
332 | Model selection through sparse maximum likelihood estimation for multivariate gaussian or binary data
- Banerjee, Ghaoui, et al.
Citation Context: ...ure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8, 9] have considered maximizing the penalized log likelihood $\max_{\Theta \in S^p_{++}} \{\log\det\Theta - \operatorname{trace}(S\Theta) - \lambda\|\Theta\|_1\}$ (1), where S is the empirical covariance matrix based on the n observations, λ is a positive...
332 | Sparsity and smoothness via the fused lasso
- Tibshirani, Saunders, et al.
- 2005
Citation Context: ...$\hat\Theta^1, \ldots, \hat\Theta^K$ that solve (2) serve as estimates for $(\Sigma^1)^{-1}, \ldots, (\Sigma^K)^{-1}$. In particular, [13] considered the use of $P(\Theta^1_{ij}, \ldots, \Theta^K_{ij}) = \sum_{k < k'} |\Theta^k_{ij} - \Theta^{k'}_{ij}|$, (3) a fused lasso penalty [14] on the differences between pairs of network edges. When λ1 is large, the network estimates will be sparse, and when λ2 is large, pairs of network estimates will have identical edges. We refer to (2)...
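The mechanism behind (3) is easiest to see for K = 2, where the proximal operator of the pairwise fused term acts elementwise: each of the two values moves toward the other until they fuse. A sketch (function name ours; this is not the authors' full algorithm):

```python
import numpy as np

def fused_pair_prox(a, b, lam):
    """Prox of lam * |x - y| at (a, b): each coordinate moves toward the
    other by min(lam, |a - b| / 2), so the two values become exactly equal
    once lam >= |a - b| / 2 -- the 'identical edges' effect of penalty (3)."""
    shift = np.minimum(lam, np.abs(a - b) / 2.0) * np.sign(a - b)
    return a - shift, b + shift

print(fused_pair_prox(np.array([3.0, 1.0]), np.array([1.0, 0.9]), lam=0.5))
# first pair shrinks to (2.5, 1.5); second pair fuses to (0.95, 0.95)
```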
249 | Model selection and estimation in the Gaussian graphical model
- Yuan, Lin
- 2007
Citation Context: ...ure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8, 9] have considered maximizing the penalized log likelihood $\max_{\Theta \in S^p_{++}} \{\log\det\Theta - \operatorname{trace}(S\Theta) - \lambda\|\Theta\|_1\}$ (1), where S is the empirical covariance matrix based on the n observations, λ is a positive...
232 | Group lasso with overlap and graph lasso
- Jacob, Obozinski, et al.
Citation Context: ...t of $\Theta^1 - \Theta^2$ is in both the ith (row) and jth (column) groups. In the presence of overlapping groups, the group lasso penalty yields estimates whose support is the complement of the union of groups [16, 17]. Figure 2 shows a simple example of $(\Sigma^1)^{-1} - (\Sigma^2)^{-1}$ in the case of node perturbation, as well as the estimate obtained using (4). The figure reveals that (4) cannot be used to detect node pertur...
130 | Efficient online and batch learning using forward backward splitting
- Duchi, Singer
Citation Context: ...imal operator corresponding to the ℓ1/ℓq norm. For q = 1, 2, ∞, $T_q$ takes a simple form, which we omit here due to space constraints. A description of these operators can also be found in Section 5 of [25]. Algorithm 1 can be interpreted as an approximate dual gradient ascent method. The approximation is due to the fact that the gradient of the dual to the augmented Lagrangian in each iteration is comp...
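The closed forms omitted in the snippet are standard; the following is our reconstruction from the proximal-operator literature, not quoted from the paper. For a vector $v$ and threshold $\lambda$,

$$(T_1(v,\lambda))_i = \operatorname{sign}(v_i)\,\max(|v_i| - \lambda,\, 0), \qquad T_2(v,\lambda) = \Big(1 - \frac{\lambda}{\|v\|_2}\Big)_{+}\, v,$$

and $T_\infty$ follows from the Moreau decomposition, $T_\infty(v,\lambda) = v - \lambda\,\Pi_{B_1}(v/\lambda)$, where $\Pi_{B_1}$ denotes Euclidean projection onto the unit $\ell_1$ ball.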
104 | First-Order Methods for Sparse Covariance Selection
- d’Aspremont, Banerjee, et al.
- 2008
Citation Context: ...when λ is sufficiently large, due to the ℓ1 penalty [10] in (1). We refer to (1) as the graphical lasso formulation. This formulation is convex, and efficient algorithms for solving it are available [6, 4, 5, 7, 11]. 2.2 The fused graphical lasso: In recent literature, convex formulations have been proposed for extending the graphical lasso (1) to the setting in which one has access to a number of observations...
98 | On the linear convergence of the alternating direction method of multipliers.
- Hong, Luo
- 2012
Citation Context: ...through more than two such groups. Although investigation of the convergence properties of ADMM algorithms for an arbitrary number of groups is an ongoing research area in the optimization literature [23, 24] and specific convergence results for our algorithm are not known, we empirically observe very good convergence behavior. Further study of this issue is a direction for future work. We initialize the...
60 | Sparse inverse covariance selection via alternating linearization methods.
- Scheinberg, Ma, et al.
- 2010
Citation Context: ...ure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8, 9] have considered maximizing the penalized log likelihood $\max_{\Theta \in S^p_{++}} \{\log\det\Theta - \operatorname{trace}(S\Theta) - \lambda\|\Theta\|_1\}$ (1), where S is the empirical covariance matrix based on the n observations, λ is a positive...
60 | Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities
- Verhaak
- 2010
Citation Context: ...nt to which PNJGL with q = 2 outperforms others is more apparent when n is small. 5.2 Inferring biological networks: We applied the PNJGL method to a recently published cancer gene expression data set [26], with mRNA expression measurements for 11,861 genes in 220 patients with glioblastoma multiforme (GBM), a brain cancer. Each patient has one of four distinct clinical subtypes: Proneural, Neural, Cla...
55 | Smoothing proximal gradient method for general structured sparse learning.
- Chen, Lin, et al.
- 2011
Citation Context: ...However, such a general approach does not fully exploit the structure of the problem and will not scale well to large-scale instances. Other algorithms proposed for overlapping group lasso penalties [19, 20, 21] do not apply to our setting, since the PNJGL formulation has a combination of Gaussian log-likelihood loss (instead of squared error loss) and the RCON penalty along with a positive-definite constraint...
43 | Model selection in Gaussian graphical models: high-dimensional consistency of ℓ1-regularized MLE
- Ravikumar, Raskutti, et al.
- 2008
Citation Context: ...ure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8, 9] have considered maximizing the penalized log likelihood $\max_{\Theta \in S^p_{++}} \{\log\det\Theta - \operatorname{trace}(S\Theta) - \lambda\|\Theta\|_1\}$ (1), where S is the empirical covariance matrix based on the n observations, λ is a positive...
43 | The joint graphical lasso for inverse covariance estimation across multiple classes
- Danaher, Wang, et al.
- 2013
Citation Context: ...ber of observations from K distinct conditions. The goal of the formulations is to estimate a graphical model for each condition under the assumption that the K networks share certain characteristics [12, 13]. Suppose that $X^k_1, \ldots, X^k_{n_k} \in \mathbb{R}^p$ are independent and identically distributed from a N(0, $\Sigma^k$) distribution, for k = 1, ..., K. Letting $S^k$ denote the empirical covariance matrix for the kth...
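A small sketch of the inputs this joint setup consumes (all names and toy dimensions are ours): K data matrices with possibly different sample sizes $n_k$, reduced to per-class empirical covariance matrices $S^k$:

```python
import numpy as np

def empirical_cov(X):
    """S^k under the MLE convention (divide by n_k rather than n_k - 1)."""
    Xc = X - X.mean(axis=0)  # center each variable
    return Xc.T @ Xc / X.shape[0]

rng = np.random.default_rng(3)
# K = 2 conditions, p = 8 variables, sample sizes n_1 = 40, n_2 = 25.
X_by_class = [rng.standard_normal((n_k, 8)) for n_k in (40, 25)]
S_by_class = [empirical_cov(X) for X in X_by_class]
print([S.shape for S in S_by_class])  # [(8, 8), (8, 8)]
```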
41 | Joint estimation of multiple graphical models.
- Guo, Levina, et al.
- 2011
Citation Context: ...ber of observations from K distinct conditions. The goal of the formulations is to estimate a graphical model for each condition under the assumption that the K networks share certain characteristics [12, 13]. Suppose that $X^k_1, \ldots, X^k_{n_k} \in \mathbb{R}^p$ are independent and identically distributed from a N(0, $\Sigma^k$) distribution, for k = 1, ..., K. Letting $S^k$ denote the empirical covariance matrix for the kth...
38 | Alternating direction method with Gaussian back substitution for separable convex programming.
- He, Tao, et al.
- 2012
Citation Context: ...through more than two such groups. Although investigation of the convergence properties of ADMM algorithms for an arbitrary number of groups is an ongoing research area in the optimization literature [23, 24] and specific convergence results for our algorithm are not known, we empirically observe very good convergence behavior. Further study of this issue is a direction for future work. We initialize the...
37 | New insights and faster computations for the graphical lasso
- Witten, Friedman, et al.
- 2011
Citation Context: ...ure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8, 9] have considered maximizing the penalized log likelihood $\max_{\Theta \in S^p_{++}} \{\log\det\Theta - \operatorname{trace}(S\Theta) - \lambda\|\Theta\|_1\}$ (1), where S is the empirical covariance matrix based on the n observations, λ is a positive...
28 | Group lasso with overlaps: The latent group lasso approach. Available at arXiv:1110.0413
- Obozinski, Jacob, et al.
- 2011
Citation Context: ...t of $\Theta^1 - \Theta^2$ is in both the ith (row) and jth (column) groups. In the presence of overlapping groups, the group lasso penalty yields estimates whose support is the complement of the union of groups [16, 17]. Figure 2 shows a simple example of $(\Sigma^1)^{-1} - (\Sigma^2)^{-1}$ in the case of node perturbation, as well as the estimate obtained using (4). The figure reveals that (4) cannot be used to detect node pertur...
22 | A primal-dual algorithm for group sparse regularization with overlapping groups
- Mosci, Villa, et al.
- 2010
Citation Context: ...However, such a general approach does not fully exploit the structure of the problem and will not scale well to large-scale instances. Other algorithms proposed for overlapping group lasso penalties [19, 20, 21] do not apply to our setting, since the PNJGL formulation has a combination of Gaussian log-likelihood loss (instead of squared error loss) and the RCON penalty along with a positive-definite constraint...
14 | Efficient first order methods for linear composite regularizers
- Argyriou, Micchelli, et al.
- 2011
Citation Context: ...However, such a general approach does not fully exploit the structure of the problem and will not scale well to large-scale instances. Other algorithms proposed for overlapping group lasso penalties [19, 20, 21] do not apply to our setting, since the PNJGL formulation has a combination of Gaussian log-likelihood loss (instead of squared error loss) and the RCON penalty along with a positive-definite constraint...
5 | Sparse inverse covariance estimation using quadratic approximation
- Hsieh, Sustik, et al.
- 2011
Citation Context: ...ure of $\Sigma^{-1}$ [1, 2]. When n > p, one can estimate $\Sigma^{-1}$ by maximum likelihood, but when p > n this is not possible because the empirical covariance matrix is singular. Consequently, a number of authors [3, 4, 5, 6, 7, 8, 9] have considered maximizing the penalized log likelihood $\max_{\Theta \in S^p_{++}} \{\log\det\Theta - \operatorname{trace}(S\Theta) - \lambda\|\Theta\|_1\}$ (1), where S is the empirical covariance matrix based on the n observations, λ is a positive...
2 | cvx version 1.21, http://cvxr.com/cvx
- Grant, Boyd
- 2010
Citation Context: ...rselves to the case of K = 2 in this paper. 4 An ADMM algorithm for the PNJGL formulation: The PNJGL optimization problem (7) is convex, and so can be directly solved in the modeling environment cvx [18], which calls conic interior-point solvers such as SeDuMi or SDPT3. However, such a general approach does not fully exploit the structure of the problem and will not scale well to large-scale instance...
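cvx is a MATLAB modeling layer; an analogous sketch in Python's CVXPY, writing a PNJGL-style objective directly (our reading of the column-wise penalty with q = 2; all variable names and toy inputs are ours, and this inherits the same poor scaling in p that the authors note):

```python
import cvxpy as cp
import numpy as np

p = 8
S1, S2 = np.eye(p), np.eye(p)  # stand-in empirical covariance matrices
n1 = n2 = 50
lam1, lam2 = 0.1, 0.1

Theta1 = cp.Variable((p, p), PSD=True)  # PSD=True keeps the variable in the PSD cone
Theta2 = cp.Variable((p, p), PSD=True)

loglik = (n1 * (cp.log_det(Theta1) - cp.trace(S1 @ Theta1))
          + n2 * (cp.log_det(Theta2) - cp.trace(S2 @ Theta2)))
sparsity = cp.sum(cp.abs(Theta1)) + cp.sum(cp.abs(Theta2))
# Column-wise group penalty on the difference: sum of l2 norms of columns.
perturbation = cp.sum(cp.norm(Theta1 - Theta2, 2, axis=0))

prob = cp.Problem(cp.Maximize(loglik - lam1 * sparsity - lam2 * perturbation))
prob.solve()  # a generic conic solver handles this, but scales poorly with p
```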
1 | Chemokine CXCL13 is overexpressed in the tumour tissue and in the peripheral blood of breast cancer patients
- Grosso
Citation Context: ...was not identified as a frequently mutated gene in GBM [26]. However, there is recent evidence that CXCL13 plays a critical role in driving cancerous pathways in breast, prostate, and ovarian tissue [27, 28]. Our results suggest the possibility of a previously unknown role of CXCL13 in brain cancer. 6 Discussion and future work: We have proposed the perturbed-node joint graphical lasso, a new approach for...
1 | CXCL13-CXCR5 interactions support prostate cancer cell migration and invasion in a PI3K p110-, SRC- and FAK-dependent fashion
- El-Haibi
Citation Context: ...was not identified as a frequently mutated gene in GBM [26]. However, there is recent evidence that CXCL13 plays a critical role in driving cancerous pathways in breast, prostate, and ovarian tissue [27, 28]. Our results suggest the possibility of a previously unknown role of CXCL13 in brain cancer. 6 Discussion and future work: We have proposed the perturbed-node joint graphical lasso, a new approach for...