## Mixed graphical models via exponential families (2014)

Venue: | In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics |

Citations: | 4 - 2 self |

### Citations

1630 |
Spatial interaction and the statistical analysis of lattice systems (with Discussion)
- Besag
- 1974
Citation Context ...k structure. Simulations as well as an application to learning mixed genomic networks from next generation sequencing and mutation data demonstrate the versatility of our methods. 1 Introduction Markov Networks, or undirected graphical models, are a popular tool for modeling, visualization, inference, and exploratory analysis of multivariate data with wide-ranging applications. The Gaussian Graphical Model, for continuous (Gaussian) variables, and the Ising / Potts model, for binary / categorical variables, are two widely used classes of Markov Networks. Recently, Yang et al. (2012) extending Besag (1974) introduce a more general class of graphical models constructed by assuming the node-conditional distributions arise from a univariate exponential family distribution. [Footnote: Appearing in Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS) 2014, Reykjavik, Iceland. JMLR: W&CP volume 33. Copyright 2014 by the authors.] While this work permits graphical modeling for varied types of variables such as count data (e.g. Poisson graphical models) or left-skewed data (e.g. exponential graphical models), the models assume that all variables belong to the same ty... |

1587 |
Graphical Models
- Lauritzen
- 1996
Citation Context ...ariables from methylation arrays). Scientists are interested in studying relationships both between and within these different types of genomic markers to better understand the genetic basis of disease. To this end, new classes of mixed graphical models are needed that construct Markov Networks for sets of heterogeneous variables. Existing models for mixed graphs are limited to one particular case: a Gaussian and Ising mixed model. This model was initially proposed by Lauritzen and Wermuth (1989) (and further studied in (Frydenberg and Lauritzen, 1989; Lauritzen, 1992; Lauritzen et al., 1989; Lauritzen, 1996)), where they formulated a Markov Network over nodes with a subset of continuous variables and a subset of discrete categorical or binary variables. The construction of this model is simple and assumes that the continuous variables conditioned on all possible configurations of the discrete vector are distributed as multivariate Gaussian. This model specification however scales exponentially with the number of discrete variables, and accordingly several others have proposed specializations of this Gaussian-Ising mixed graphical model. Lee and Hastie (2012) considered a specialization involving ... |

734 | High-dimensional graphs and variable selection with the lasso - Meinshausen, Bühlmann - 2006 |

223 |
Graphical models for associations between variables, some of which are qualitative and some quantitative.
- Lauritzen, Wermuth
- 1989
Citation Context ... from SNP-arrays), copy number variation (categorical variables after processing CGH-arrays), and epigenetic data (continuous variables from methylation arrays). Scientists are interested in studying relationships both between and within these different types of genomic markers to better understand the genetic basis of disease. To this end, new classes of mixed graphical models are needed that construct Markov Networks for sets of heterogeneous variables. Existing models for mixed graphs are limited to one particular case: a Gaussian and Ising mixed model. This model was initially proposed by Lauritzen and Wermuth (1989) (and further studied in (Frydenberg and Lauritzen, 1989; Lauritzen, 1992; Lauritzen et al., 1989; Lauritzen, 1996)), where they formulated a Markov Network over nodes with a subset of continuous variables and a subset of discrete categorical or binary variables. The construction of this model is simple and assumes that the continuous variables conditioned on all possible configurations of the discrete vector are distributed as multivariate Gaussian. This model specification however scales exponentially with the number of discrete variables, and accordingly several others have proposed special... |

172 | Propagation of probabilities, means and variances in mixed graphical association models. - Lauritzen - 1992 |

85 | A human functional protein interaction network and its application to cancer data analysis. - Wu, Feng, et al. - 2010 |

51 | High-dimensional semiparametric Gaussian copula graphical models. - Liu, Han, et al. - 2012 |

38 | Stability approach to regularization selection (StARS) for high dimensional graphical models. arXiv preprint arXiv:1006.3316
- Liu, Roeder, et al.
- 2010
Citation Context ...ted using standard methods described in (Zhang) and merged with the mutation data to form an indicator matrix of whether a point mutation or copy number aberration occurs in each gene biomarker. There are n = 697 patients common to both data sets, and our analysis considers the top 2% of genes filtered by expression variance across samples (pY = 329) and gene aberrations that occurred in at least 15% of patients (pZ = 177). As RNA-sequencing data is count-valued and the mutation status is binary, we fit our Truncated Poisson - Ising Manichean graphical model to this data. Stability selection (Liu et al., 2010) was used to determine the optimal level of regularization. Results visualized in Figure 2 show highly connected modules exhibiting within connections and identified several between connections that are consistent with the cancer genomics literature. For example, TP53 is known to be highly mutated in breast cancer and a regulator of gene expression. Two such genes that have been experimentally validated as influenced by TP53 mutations, DLK1 and THSD4 (Lin et al., 2010; Wu et al., 2010), were identified as inter-connected neighbors to TP53 in our graph. Overall, the formulation of mixed graphic... |

36 |
Decomposition of maximum likelihood in mixed graphical interaction models.
- Frydenberg, Lauritzen
- 1989
Citation Context ...variables after processing CGH-arrays), and epigenetic data (continuous variables from methylation arrays). Scientists are interested in studying relationships both between and within these different types of genomic markers to better understand the genetic basis of disease. To this end, new classes of mixed graphical models are needed that construct Markov Networks for sets of heterogeneous variables. Existing models for mixed graphs are limited to one particular case: a Gaussian and Ising mixed model. This model was initially proposed by Lauritzen and Wermuth (1989) (and further studied in (Frydenberg and Lauritzen, 1989; Lauritzen, 1992; Lauritzen et al., 1989; Lauritzen, 1996)), where they formulated a Markov Network over nodes with a subset of continuous variables and a subset of discrete categorical or binary variables. The construction of this model is simple and assumes that the continuous variables conditioned on all possible configurations of the discrete vector are distributed as multivariate Gaussian. This model specification however scales exponentially with the number of discrete variables, and accordingly several others have proposed specializations of this Gaussian-Ising mixed graphical model. L... |

32 | Regularized rank-based estimation of high-dimensional nonparanormal graphical models.
- Xue, Zou
- 2012
Citation Context ...odels, beyond the Gaussian-Ising instance, to encompass varied types of heterogeneous variables. While our construction of general mixed graphical models is a natural extension of that of Markov Random Fields for variables of one type, there are possibly other ways of jointly modeling variables of mixed types. First, there has been much recent interest in non-parametric extensions of graphical models using things like copula transforms (Dobra and Lenkoski, 2011; Liu et al., 2012) or robust estimators of relationships between variables such as with Spearman’s rho or Kendall’s tau rank correlation (Xue and Zou, 2012). While such approaches could be employed for mixed types of variables, non-parametric approaches in general might not adequately account for differing domains of mixed variables and likely have less statistical power than parametric methods for recovering graph structure in high-dimensional settings. Second, our construction is closely related to that of conditional random field (CRF) models (Lafferty, 2001), and particularly CRFs constructed via node-conditional exponential families as recently investigated by Yang et al. (2013a). Deriving a mixed MRF from such CRFs by taking a product of a ... |

31 | Copula gaussian graphical models and their application to modeling functional disability data.
- Dobra, Lenkoski
- 2011
Citation Context ... under which certain classes of these mixed graphical models are normalizable. Thus, for the first time, our work provides a general class of mixed graphical models, beyond the Gaussian-Ising instance, to encompass varied types of heterogeneous variables. While our construction of general mixed graphical models is a natural extension of that of Markov Random Fields for variables of one type, there are possibly other ways of jointly modeling variables of mixed types. First, there has been much recent interest in non-parametric extensions of graphical models using things like copula transforms (Dobra and Lenkoski, 2011; Liu et al., 2012) or robust estimators of relationships between variables such as with Spearman’s rho or Kendall’s tau rank correlation (Xue and Zou, 2012). While such approaches could be employed for mixed types of variables, non-parametric approaches in general might not adequately account for differing domains of mixed variables and likely have less statistical power than parametric methods for recovering graph structure in high-dimensional settings. Second, our construction is closely related to that of conditional random field (CRF) models (Lafferty, 2001), and particularly CRFs constructed... |

19 | Adaptive multitask Lasso: With application to eQTL detection.
- Lee, Zhu, et al.
- 2010
Citation Context ...es as recently investigated by Yang et al. (2013a). Deriving a mixed MRF from such CRFs by taking a product of a conditional CRF distributions and marginal MRF distributions however, has a key disadvantage in that the resulting distribution ends up with much more complicated terms. (A discussion of such formulations is given in the appendix). Third, relationships between variables of different types could be approached via types of multi-response regression models (Cai et al., 2013); these are particularly popular approaches for eQTL mapping of point mutations to gene expression, for example (Lee et al., 2010). While these approaches may be effective at finding connections between two sets of variables, they cannot model relationships within sets of variables, are limited to only two types of variables, and do not correspond to a coherent joint probabilistic model. In this paper, we make several major contributions including: (1) Construction of a general class of mixed graphical models that permits each node to belong to a potentially different variable type, thus broadly generalizing the applicability of mixed statistical models; (2) Careful discussion of the conditions on the natural parameters ... |

5 |
High-dimensional mixed graphical models. arXiv e-prints, arXiv:1304.2810
- Cheng, Levina, et al.
- 2013
Citation Context ...subset of continuous variables and a subset of discrete categorical or binary variables. The construction of this model is simple and assumes that the continuous variables conditioned on all possible configurations of the discrete vector are distributed as multivariate Gaussian. This model specification however scales exponentially with the number of discrete variables, and accordingly several others have proposed specializations of this Gaussian-Ising mixed graphical model. Lee and Hastie (2012) considered a specialization involving only pairwise interactions between any two variables, while Cheng et al. (2013) further allowed for three-way interactions between two binary and one continuous variable. In addition to these specializations, these recent Gaussian-Ising models are limited to allowing variables to belong to one of two specific types (binary/Ising, and continuous/Gaussian). In this paper, we propose a general class of mixed graphical models that permits each variable to belong to a potentially different type. Our construction is a natural extension of that of the Gaussian-Ising model and the class of exponential family MRFs (Yang et al., 2012). Su... |

5 | Mixed graphical association models [with discussion and reply]. - Lauritzen, Andersen, et al. - 1989 |

5 |
Learning mixed graphical models. arXiv preprint arXiv:1205.5012
- Lee, Hastie
- 2012
Citation Context ...9; Lauritzen, 1992; Lauritzen et al., 1989; Lauritzen, 1996)), where they formulated a Markov Network over nodes with a subset of continuous variables and a subset of discrete categorical or binary variables. The construction of this model is simple and assumes that the continuous variables conditioned on all possible configurations of the discrete vector are distributed as multivariate Gaussian. This model specification however scales exponentially with the number of discrete variables, and accordingly several others have proposed specializations of this Gaussian-Ising mixed graphical model. Lee and Hastie (2012) considered a specialization involving only pairwise interactions between any two variables, while Cheng et al. (2013) further allowed for three-way interactions between two binary and one continuous variable. In addition to these specializations, these recent Gaussian-Ising models are limited to allowing variables to belong to one of two specific types (binary/Ising, and continuous/Gaussian). In this paper, we propose a general class of mixed graphical models that permits each variable to belong to a potentially different type. Our construction is a ... |

4 | Covariate-adjusted precision matrix estimation with an application in genetical genomics.
- Cai, Li, et al.
- 2013
Citation Context ... conditional random field (CRF) models (Lafferty, 2001), and particularly CRFs constructed via node-conditional exponential families as recently investigated by Yang et al. (2013a). Deriving a mixed MRF from such CRFs by taking a product of a conditional CRF distributions and marginal MRF distributions however, has a key disadvantage in that the resulting distribution ends up with much more complicated terms. (A discussion of such formulations is given in the appendix). Third, relationships between variables of different types could be approached via types of multi-response regression models (Cai et al., 2013); these are particularly popular approaches for eQTL mapping of point mutations to gene expression, for example (Lee et al., 2010). While these approaches may be effective at finding connections between two sets of variables, they cannot model relationships within sets of variables, are limited to only two types of variables, and do not correspond to a coherent joint probabilistic model. In this paper, we make several major contributions including: (1) Construction of a general class of mixed graphical models that permits each node to belong to a potentially different variable type, thus broad... |

1 |
A local Poisson graphical model for inferring networks from sequencing data.
- Yang, Baker, et al.
- 2013
Citation Context ...bles such as with Spearman’s rho or Kendall’s tau rank correlation (Xue and Zou, 2012). While such approaches could be employed for mixed types of variables, non-parametric approaches in general might not adequately account for differing domains of mixed variables and likely have less statistical power than parametric methods for recovering graph structure in high-dimensional settings. Second, our construction is closely related to that of conditional random field (CRF) models (Lafferty, 2001), and particularly CRFs constructed via node-conditional exponential families as recently investigated by Yang et al. (2013a). Deriving a mixed MRF from such CRFs by taking a product of a conditional CRF distributions and marginal MRF distributions however, has a key disadvantage in that the resulting distribution ends up with much more complicated terms. (A discussion of such formulations is given in the appendix). Third, relationships between variables of different types could be approached via types of multi-response regression models (Cai et al., 2013); these are particularly popular approaches for eQTL mapping of point mutations to gene expression, for example (Lee et al., 2010). While these approaches may be... |