## Sequences of regressions and their independences

Venue: | TEST |

Citations: | 11 - 3 self |

### Citations

8890 |
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context ...he simple enumeration result for labeled trees in d nodes, d^{d−2}, by Karl-Wilhelm Borchardt (1817-1880), it could be shown that these trees are in one-to-one correspondence to distinct strings of size d − 2; see Cayley (1889). Much later, labeled trees were recognized to form the subclass of directed acyclic graphs with exclusively source Vs and therefore to be also Markov equivalent to chordal concentration graphs; see Castelo and Siebes (2003). In the literature on graphical Markov models, a number of different names have been in use for a sink V, for instance ‘two arrows meeting head-on’ by Pearl (1988), ‘unshielded collider’ by Richardson and Spirtes (2002), and ‘Wermuth-configuration’ by Whittaker (1990), after it had been recognized that, for Gaussian distributions, the parameters of a directed acyclic graph model without sink Vs are in one-to-one correspondence to the parameters in its skeleton concentration graph model. Proposition 3. (Wermuth, 1980), (Wermuth and Lauritzen, 1983), (Frydenberg, 1990). A directed acyclic graph is Markov equivalent to a concentration graph of the same skeleton if and only if it has no collision V. Efficient algorithms to decide whether an undirected graph... |

3166 |
Generalized linear models
- Nelder, Wedderburn
- 1972
Citation Context ...kelihood was recommended by Sir Ronald Fisher (1890–1962) as a general estimation technique that applies also to regressions with categorical or quantitative responses. One of the most attractive features of the method concerns properties of the estimates. Given two models with parameters that are in one-to-one correspondence, the same one-to-one transformation leads from the maximum-likelihood estimates under one model to those of the other. Different single response regressions, such as logistic, probit, or linear regressions, were described as special cases of the generalized linear model by Nelder and Wedderburn (1972); see also McCullagh and Nelder (1989). In all of these regressions, the vanishing of the coefficient(s) of regressors indicates conditional independence from the response given all remaining regressors in the regression model. The general linear model with a vector response, also called multivariate linear regression, has identical sets of regressors for each component of the vector and the individual component variables of the response vector form the set of joint responses. Maximum-likelihood estimation of regression coefficients for a joint Gaussian distribution reduces to linear-least squa... |

3020 |
Principles and Practice of Structural Equation Modeling
- Kline
- 1998
Citation Context ... includes as special cases all available equivalence results for directed acyclic graphs, for covariance graphs and for concentration graphs, as set out in detail in Sections 6 and 7 here. For context variables taken as given, Gaussian regression graph models coincide with a large subclass of structural equation models (SEMs), those permitting local modeling due to the factorisation property (1) and are without any endogenous responses. Such responses have residuals that are correlated with some of their regressors. For traditional uses of SEMs see for instance Jöreskog (1981), Bollen (1989), Kline (2006), while Pearl (2009) advocates SEMs as a framework for causal inquiries. In the econometric literature thirty years ago, independences were always regarded as ‘overidentifying’ constraints. For discrete variables, more attractive features of regression graph models were derived by Drton (2009), who speaks of chain graph models of type IV for multivariate regression chains. He proves that each member in this class belongs to a curved exponential family; for a discussion of this notion see for instance Cox (2006). Discrete type IV models form also a subclass of marginal models; see Rudas, Bergsm... |

2771 |
Structural Equations with Latent Variables
- Bollen
- 1989
Citation Context ...1 is simple and includes as special cases all available equivalence results for directed acyclic graphs, for covariance graphs and for concentration graphs, as set out in detail in Sections 6 and 7 here. For context variables taken as given, Gaussian regression graph models coincide with a large subclass of structural equation models (SEMs), those permitting local modeling due to the factorisation property (1) and are without any endogenous responses. Such responses have residuals that are correlated with some of their regressors. For traditional uses of SEMs see for instance Jöreskog (1981), Bollen (1989), Kline (2006), while Pearl (2009) advocates SEMs as a framework for causal inquiries. In the econometric literature thirty years ago, independences were always regarded as ‘overidentifying’ constraints. For discrete variables, more attractive features of regression graph models were derived by Drton (2009), who speaks of chain graph models of type IV for multivariate regression chains. He proves that each member in this class belongs to a curved exponential family; for a discussion of this notion see for instance Cox (2006). Discrete type IV models form also a subclass of marginal models; see... |

2194 |
An Introduction to Multivariate Statistical Analysis
- Anderson
- 1984
Citation Context ...of these regressions, the vanishing of the coefficient(s) of regressors indicates conditional independence from the response given all remaining regressors in the regression model. The general linear model with a vector response, also called multivariate linear regression, has identical sets of regressors for each component of the vector and the individual component variables of the response vector form the set of joint responses. Maximum-likelihood estimation of regression coefficients for a joint Gaussian distribution reduces to linear-least squares fitting for each component separately; see Anderson (1958). With different sets of regressors for the components of a vector response, seemingly unrelated regressions (SUR) result and iterative methods are needed for estimation; see Zellner (1962). For small sample sizes, a given solution of the likelihood equations of a Gaussian SUR model may not be unique; see Drton and Richardson (2004), Sundberg (2010), while for exclusively discrete variables this will never happen; see Drton (2009). For mixed variables, no corresponding results are available yet. But in general, there often exists a covering model with nice estimation properties. For instanc... |

1585 |
Graphical Models
- Lauritzen
- 1996
Citation Context ...⟹ a⊥bc|d. The standard graph theoretical separation criterion has different consequences for the two types of undirected graph corresponding for Gaussian distributions to concentration and to covariance matrices. We say a path intersects subset c of node set N if it has an inner node in c and let {a, b, c, m} partition N to formulate known Markov properties. The notation is to remind one that with any independence statement a⊥b|c, one implicitly has marginalised over the remaining nodes in m = N \ {a ∪ b ∪ c}, i.e. one considers the marginal joint distribution of Ya, Yb, Yc. Proposition 1. Lauritzen (1996). A concentration graph, G^N_con, implies a⊥b|c if and only if every path from a to b intersects c. Proposition 2. Kauermann (1996). A covariance graph, G^N_cov, implies a⊥b|c if and only if every path from a to b intersects m. Notice that Proposition 1 requires the intersection property and Proposition 2 requires the composition property. A subgraph induced by nodes a ∪ b in G^N_cov is the covariance graph G^{a∪b}_cov. We say that there is an edge between a and b if there is an edge with one node in a and the other node in b. In this case, the graph G^{a∪b}_cov is connected in a and b, otherwise the... |
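The separation criterion quoted in Propositions 1 and 2 can be illustrated with a short sketch. It assumes an undirected graph given as a list of edges and disjoint node sets a, b, c; the function name `separates` and the example chain are hypothetical and not taken from the cited paper.

```python
from collections import deque


def separates(edges, a, b, c):
    """Return True if every path between node sets a and b has an inner node in c.

    Assumes an undirected graph and pairwise disjoint sets a, b, c.
    """
    a, b, c = set(a), set(b), set(c)
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Breadth-first search from a that never steps onto a node in c.
    seen = set(a)
    queue = deque(a)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in c or v in seen:
                continue
            if v in b:
                return False  # found a path from a to b that avoids c
            seen.add(v)
            queue.append(v)
    return True


# Hypothetical chain 1 - 2 - 3 - 4 with a = {1}, b = {4}, c = {2}, m = {3}.
chain = [(1, 2), (2, 3), (3, 4)]
print(separates(chain, {1}, {4}, {2}))  # criterion of Proposition 1 (paths must meet c)
print(separates(chain, {1}, {4}, {3}))  # criterion of Proposition 2 (paths must meet m)
```

Under these assumptions the same routine serves both propositions: pass c as the blocking set for a concentration graph and m for a covariance graph.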

673 | Causation, Prediction, and Search - Spirtes, Glymour, et al. - 1993 |

610 |
Graphical Models in Applied Multivariate Statistics
- Whittaker
- 1990
Citation Context ...), it could be shown that these trees are in one-to-one correspondence to distinct strings of size d − 2; see Cayley (1889). Much later, labeled trees were recognized to form the subclass of directed acyclic graphs with exclusively source Vs and therefore to be also Markov equivalent to chordal concentration graphs; see Castelo and Siebes (2003). In the literature on graphical Markov models, a number of different names have been in use for a sink V, for instance ‘two arrows meeting head-on’ by Pearl (1988), ‘unshielded collider’ by Richardson and Spirtes (2002), and ‘Wermuth-configuration’ by Whittaker (1990), after it had been recognized that, for Gaussian distributions, the parameters of a directed acyclic graph model without sink Vs are in one-to-one correspondence to the parameters in its skeleton concentration graph model. Proposition 3. (Wermuth, 1980), (Wermuth and Lauritzen, 1983), (Frydenberg, 1990). A directed acyclic graph is Markov equivalent to a concentration graph of the same skeleton if and only if it has no collision V. Efficient algorithms to decide whether an undirected graph can be oriented into a directed acyclic graph became available in the computer science literature under... |

558 |
An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias
- Zellner
- 1962
Citation Context ...linear model with a vector response, also called multivariate linear regression, has identical sets of regressors for each component of the vector and the individual component variables of the response vector form the set of joint responses. Maximum-likelihood estimation of regression coefficients for a joint Gaussian distribution reduces to linear-least squares fitting for each component separately; see Anderson (1958). With different sets of regressors for the components of a vector response, seemingly unrelated regressions (SUR) result and iterative methods are needed for estimation; see Zellner (1962). For small sample sizes, a given solution of the likelihood equations of a Gaussian SUR model may not be unique; see Drton and Richardson (2004), Sundberg (2010), while for exclusively discrete variables this will never happen; see Drton (2009). For mixed variables, no corresponding results are available yet. But in general, there often exists a covering model with nice estimation properties. For instance for the SUR model with regression graph ◦ ≻◦ ◦ ≺ ◦ , a general linear model with two independent regressors is a simple covering model. For a vector variable of categorical responses only, t... |

333 |
Covariance selection
- Dempster
- 1972
Citation Context ...not assigned roles of responses or regressors and undirected measures of association are used instead of coefficients of dependence. In the concentration graph models, the undirected associations are conditional given all remaining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gaussian random variables, these had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable Y, the elements of the covariance matrix Σ are σ_ij = E(Y_i Y_j). If Σ is invertible, the covariances σ_ij are in a one-to-one relation with the concentrations σ^ij, the elements of the concentration matrix Σ^{−1}. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution σ^{23.1} = σ^{23} − σ^{12}σ^{13}/σ^{11},... |

321 | Discrete Multivariate Analysis - Bishop, Fienberg, et al. - 1975 |

305 |
Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs
- Tarjan, Yannakakis
- 1984
Citation Context ...ian distributions, the parameters of a directed acyclic graph model without sink Vs are in one-to-one correspondence to the parameters in its skeleton concentration graph model. Proposition 3. (Wermuth, 1980), (Wermuth and Lauritzen, 1983), (Frydenberg, 1990). A directed acyclic graph is Markov equivalent to a concentration graph of the same skeleton if and only if it has no collision V. Efficient algorithms to decide whether an undirected graph can be oriented into a directed acyclic graph became available in the computer science literature under the name of perfect elimination schemes; see Tarjan and Yannakakis (1984). When algorithms were designed later to decide which arrows may be flipped in a given G^N_dag, keeping the same skeleton and the same set of sink Vs, to get to a list of all Markov equivalent G^N_dag's, these results appear not to have been used; see Chickering (1995). The number of equivalent characterizations of concentration graphs that have perfect elimination schemes has increased steadily, since they were introduced as rigid circuit graphs by Dirac (1961). These graphs without chordless cycles in four or more nodes are named also ‘chordal graphs’, ‘triangulated graphs’, ‘graphs with the ru... |
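The notion of a perfect elimination scheme mentioned in this excerpt can be made concrete with a small sketch. It is not the linear-time algorithm of Tarjan and Yannakakis; it is a simpler greedy procedure, assumed here for illustration, that repeatedly removes a node whose remaining neighbours are all pairwise adjacent (a simplicial node). Such an ordering exists exactly when the graph is chordal. The function name `is_chordal` is hypothetical.

```python
def is_chordal(nodes, edges):
    """Greedy test for chordality of an undirected graph.

    Returns (True, order) with a perfect elimination ordering if one exists,
    otherwise (False, partial_order). Simple but not linear-time.
    """
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = list(nodes)
    order = []
    while remaining:
        simplicial = None
        for v in remaining:
            nbrs = adj[v] & set(remaining)
            # v is simplicial if its remaining neighbours form a complete subgraph.
            if all(w in adj[u] for u in nbrs for w in nbrs if u != w):
                simplicial = v
                break
        if simplicial is None:
            return False, order  # a chordless cycle in four or more nodes remains
        order.append(simplicial)
        remaining.remove(simplicial)
    return True, order


# Hypothetical examples: a chordless four-cycle, and the same cycle with a chord.
square = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(is_chordal([1, 2, 3, 4], square)[0])
print(is_chordal([1, 2, 3, 4], square + [(1, 3)])[0])
```

The greedy choice is safe because every induced subgraph of a chordal graph is again chordal and therefore still contains a simplicial node.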

273 |
Information and Exponential Families in Statistical Theory
- Barndorff-Nielsen
- 1978
Citation Context ...mponents of Y having a Gaussian distribution, with σ_{12|3} = σ_{12} − σ_{13}σ_{23}/σ_{33}, (8) where σ_{12|3} denotes the covariance of Y1, Y2 given Y3. Therefore, σ_{12|3} coincides with σ_{12} if and only if σ_{13} = 0 or σ_{23} = 0. By equations (6), (7), (8), a unique independence statement is associated with the endpoints of any V in a trivariate Gaussian distribution. In the context of multivariate exponential families of distributions, concentrations are special canonical parameters and covariances are special moment parameters with estimates of canonical and moment parameters being asymptotically independent; see Barndorff-Nielsen (1978). Regression graphs capture independence structures for more general types of distribution, where operators for transforming graphs mimic operators for transforming different parametrisations of joint Gaussian distributions; see Wermuth, Wiedenbeck and Cox (2006), Wiedenbeck and Wermuth (2010), Wermuth (2011). In particular, by removing an edge from any V of a regression graph, one introduces an additional independence constraint just as in a regular joint Gaussian distribution. This requires that the generated distributions satisfy the composition and intersection property in addition to gene... |

270 |
Equivalence and synthesis of causal models
- Verma, Pearl
- 1990
Citation Context ...s are named also ‘chordal graphs’, ‘triangulated graphs’, ‘graphs with the running intersection property’ or ‘graphs with only complete prime graph separators.’ By contrast, for a covariance graph that can be oriented to be Markov equivalent to a G^N_dag of the same skeleton, chordless paths are relevant. Proposition 4. (Pearl and Wermuth, 1994). A covariance graph with a chordless path in four nodes is not Markov equivalent to a directed acyclic graph in the same node set. For distributions generated over directed acyclic graphs, sink Vs are needed again. Proposition 5. (Frydenberg, 1990), (Verma and Pearl, 1990). Directed acyclic graphs of the same skeleton are Markov equivalent if and only if they have the same sink Vs. Markov equivalence of a concentration graph and a covariance graph model is for regular joint Gaussian distributions equivalent to parameter equivalence, which means that there is a one-to-one relation between the two sets of parameters. Therefore, an early result on parameter equivalence for joint Gaussian distributions implies the following Markov equivalence result for distributions satisfying both the composition and the intersection property. Proposition 6. (Jensen, 1988), (Drton an... |
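Proposition 5 as quoted here translates directly into a check on two arrow lists. The sketch below assumes each graph is given as a list of (tail, head) pairs over the same node set and that both inputs are already acyclic (this is not verified); a sink V is taken to be a pair of non-adjacent nodes with arrows pointing to a common node. All function names are hypothetical.

```python
from itertools import combinations


def skeleton(arrows):
    """Undirected edge set obtained by ignoring arrow directions."""
    return {frozenset(arrow) for arrow in arrows}


def sink_vs(arrows):
    """Set of sink Vs: i -> child <- k with i and k not adjacent."""
    skel = skeleton(arrows)
    parents = {}
    for tail, head in arrows:
        parents.setdefault(head, set()).add(tail)
    found = set()
    for child, pa in parents.items():
        for i, k in combinations(pa, 2):
            if frozenset((i, k)) not in skel:
                found.add((frozenset((i, k)), child))
    return found


def markov_equivalent_dags(arrows1, arrows2):
    """Same skeleton and same sink Vs, assuming both inputs are DAGs."""
    return (skeleton(arrows1) == skeleton(arrows2)
            and sink_vs(arrows1) == sink_vs(arrows2))


# Hypothetical three-node examples with the same skeleton 1 - 2 - 3.
chain = [(1, 2), (2, 3)]      # 1 -> 2 -> 3
fork = [(2, 1), (2, 3)]       # 1 <- 2 -> 3
collider = [(1, 2), (3, 2)]   # 1 -> 2 <- 3, a sink V
print(markov_equivalent_dags(chain, fork))
print(markov_equivalent_dags(chain, collider))
```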

258 |
Fundamentals of statistical exponential families with applications in statistical decision theory; Number 9
- Brown
- 1986
Citation Context ...ent of those of f>i to assure maximization of the joint likelihood by separate fitting of each univariate regression. Consequences are that, for distributions generated over G^N_par, every collision V is association-inducing for its endpoints by conditioning on the inner node and every transmitting V is association-inducing by marginalising over the inner node, just as in a regular joint Gaussian distribution; see as examples the recursive relations of (6), (7), (8). With distributions generated over parent graphs, one excludes incomplete families of distributions; see Lehmann and Scheffé (1955), Brown (1986), Mandelbaum and Rüschendorf (1987), in which independence statements connected with a V may have the inner node both within and outside the conditioning set; see Wermuth (2011), Wermuth and Cox (2004), Darroch (1962). Such independences have been characterized as being not representable in joint Gaussian distributions; see Lnenicka and Matus (2007). More generally, such independences cannot occur whenever the distribution is weakly transitive, that is if, for i, k, l distinct nodes of N and m = N \ {i, k, l}, (i⊥k|l and i⊥k|{l, m}) ⟹ (i⊥m|l or k⊥m|l), or for a dep b|c denoting depend... |

231 |
Introduction to graphical modelling
- Edwards
- 2000
Citation Context ...aining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gaussian random variables, these had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable Y, the elements of the covariance matrix Σ are σ_ij = E(Y_i Y_j). If Σ is invertible, the covariances σ_ij are in a one-to-one relation with the concentrations σ^ij, the elements of the concentration matrix Σ^{−1}. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution σ^{23.1} = σ^{23} − σ^{12}σ^{13}/σ^{11}, (7) where σ^{23.1} denotes the concentration of Y2, Y3 in their bivariate marginal distribution. Thus, the overall concentration σ^{23} coincides with σ^{23.1} if and only if σ^{12} = 0 or σ^{13} = 0. Alternatively in covariance graph mod... |
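The recursive relation (7) quoted in this excerpt can be checked numerically. The sketch below uses exact rational arithmetic and a hypothetical 3 × 3 concentration matrix `K` (an assumption, chosen to be symmetric and positive definite); it inverts `K` to obtain Σ, takes the bivariate margin of Y2, Y3, inverts that margin, and compares the resulting off-diagonal concentration with the right-hand side of (7).

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a small nonsingular square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical concentration matrix for (Y1, Y2, Y3); indices are 0-based in code.
K = [[Fraction(4), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(2), Fraction(1), Fraction(5)]]

Sigma = inverse(K)                                 # covariance matrix
Sigma_23 = [[Sigma[1][1], Sigma[1][2]],
            [Sigma[2][1], Sigma[2][2]]]            # margin of (Y2, Y3)
marginal_conc = inverse(Sigma_23)[0][1]            # sigma^{23.1}
formula = K[1][2] - K[0][1] * K[0][2] / K[0][0]    # right-hand side of (7)

print(marginal_conc, formula)
assert marginal_conc == formula
```

Setting `K[0][1]` (and `K[1][0]`) to zero in this sketch makes the marginal concentration equal to `K[1][2]`, matching the "if and only if" statement in the excerpt.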

223 |
Graphical models for associations between variables, some of which are qualitative and some quantitative.
- Lauritzen, Wermuth
- 1989
Citation Context ... models form also a subclass of marginal models; see Rudas, Bergsma and Nemeth (2010), Bergsma and Rudas (2002). Local independence statements that involve only variables in the past are equivalent to more complex local independences used by Drton (2009); see Marchetti and Lupparelli (2010). These local definitions imply the pairwise independences of equation (2) for any regression graph, G^N_reg. Two other types of chain graph have been studied as joint response models in statistics, the so-called AMP chain graphs of Andersson, Madigan and Perlman (2001), and the LWF chain graphs of Lauritzen and Wermuth (1989) and Frydenberg (1990). These are suitable for modeling data from intervention studies, when they are Markov equivalent to a regression graph, since they have in common that pairwise independences include other nodes of the same connected component. For AMP graphs, in equation (2) (i) is replaced by (i′) i⊥k|g>j−1 \ {i, k} for i, k both in gj , j = 1, . . . , r, and for LWF graphs, (i) is also replaced by (i′) and (iii) by (iii′) i⊥k|g>j−1 \ {i, k} for i in gj with j ≤ r and k in g>j. Not yet systematically approached is the search for simple covering models that capture most but not all independences,... |

175 | The statistical implications of a system of simultaneous equations. - Haavelmo - 1943 |

150 |
A theorem on trees
- Cayley
Citation Context ...rkov equivalence have been obtained quite independently in the mathematical literature on characterizing different types of graph, in the statistical literature on specifying types of multivariate statistical models, and in the computer science literature on deciding on special properties of a given graph or on designing fast algorithms for transforming graphs. For instance, following the simple enumeration result for labeled trees in d nodes, d^{d−2}, by Karl-Wilhelm Borchardt (1817-1880), it could be shown that these trees are in one-to-one correspondence to distinct strings of size d − 2; see Cayley (1889). Much later, labeled trees were recognized to form the subclass of directed acyclic graphs with exclusively source Vs and therefore to be also Markov equivalent to chordal concentration graphs; see Castelo and Siebes (2003). In the literature on graphical Markov models, a number of different names have been in use for a sink V, for instance ‘two arrows meeting head-on’ by Pearl (1988), ‘unshielded collider’ by Richardson and Spirtes (2002), and ‘Wermuth-configuration’ by Whittaker (1990), after it had been recognized that, for Gaussian distributions, the parameters of a directed acyclic graph... |
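A one-to-one correspondence between labeled trees in d nodes and strings of size d − 2 can be illustrated with the Prüfer sequence. That construction is not named in the excerpt and is used here as an assumption; the node labels 1, ..., d and both function names are hypothetical choices for this sketch.

```python
import heapq
from itertools import product


def prufer_encode(edges, d):
    """Map a labeled tree on nodes 1..d (d >= 2) to a sequence of length d - 2."""
    adj = {v: set() for v in range(1, d + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    code = []
    for _ in range(d - 2):
        leaf = min(v for v in adj if len(adj[v]) == 1)  # smallest remaining leaf
        neighbour = next(iter(adj[leaf]))
        code.append(neighbour)
        adj[neighbour].remove(leaf)
        del adj[leaf]
    return code


def prufer_decode(code, d):
    """Map a sequence of length d - 2 over 1..d back to the edges of a labeled tree."""
    degree = {v: 1 for v in range(1, d + 1)}
    for v in code:
        degree[v] += 1
    leaves = [v for v in degree if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


# For d = 4, enumerate all strings of size d - 2 and collect the distinct trees.
d = 4
trees = set()
for code in product(range(1, d + 1), repeat=d - 2):
    tree = prufer_decode(list(code), d)
    assert prufer_encode(tree, d) == list(code)  # round trip recovers the string
    trees.add(frozenset(frozenset(e) for e in tree))
print(len(trees), d ** (d - 2))
```

The round-trip assertion inside the loop checks that decoding followed by encoding is the identity on strings, and the final line compares the number of distinct trees with d^{d−2}.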

150 |
On rigid circuit graphs
- Dirac
- 1961
Citation Context ...ed acyclic graph, became available in the computer science literature under the name of perfect elimination schemes; see Tarjan and Yannakakis (1984). When algorithms were designed later to decide which arrows may be flipped in a given G^N_dag, keeping the same skeleton and the same set of sink Vs, to get to a list of all Markov equivalent G^N_dag's, these results appear not to have been used; see Chickering (1995). The number of equivalent characterizations of concentration graphs that have perfect elimination schemes has increased steadily, since they were introduced as rigid circuit graphs by Dirac (1961). These graphs without chordless cycles in four or more nodes are named also ‘chordal graphs’, ‘triangulated graphs’, ‘graphs with the running intersection property’ or ‘graphs with only complete prime graph separators.’ By contrast, for a covariance graph that can be oriented to be Markov equivalent to a G^N_dag of the same skeleton, chordless paths are relevant. Proposition 4. (Pearl and Wermuth, 1994). A covariance graph with a chordless path in four nodes is not Markov equivalent to a directed acyclic graph in the same node set. For distributions generated over directed acyclic graphs, si... |

149 |
An introduction to chordal graphs and clique trees
- Blair, Peyton
- 1993
Citation Context ...lower one. Continually apply each step until it is not possible to continue applying it further. Then move to the next step. Lemma 5. For a regression graph with a chordal concentration graph and without chordless collision paths in four nodes, Algorithm 1 generates a directed acyclic graph that is Markov equivalent to G^N_reg. Proof. The generated graph is directed since by Algorithm 1, all edges are turned into arrows. Since the block containing full lines is chordal, the graph generated by the perfect elimination order of the maximal cardinality search does not have a directed cycle; see Blair and Peyton (1993), Section 2.4, and Tarjan and Yannakakis (1984). In addition, the arrows present in the graph do not change by the algorithm. Thus, to generate a cycle containing an arrow of the original graph, there should have been a cycle in the directed graph generated by replacing blocks by nodes. But, this is impossible in a regression graph. Therefore in the generated graph, there is no cycle containing arrows that have been between the blocks of the original graph. Within a block, all arrows point from nodes with higher numbers to nodes with lower ones. Otherwise, there would have been at step 3 of the ... |

140 |
Multivariate dependencies: Models, analysis and interpretation.
- Cox, Wermuth
- 1996
Citation Context ...onally independent given their past. Figure 4: A regression graph for 14 variables corresponding to blocks a to e of Figure 3. Graphs with dashed lines are covariance graphs denoted by G^N_cov, those with full lines are concentration graphs denoted by G^N_con; see Wermuth and Cox (1998). The names are to remind one of their parametrisation in regular joint Gaussian distributions, in which the covariance matrix is invertible and gives the concentration matrix. A zero ik-element in G^N_cov means i⊥k and a zero ik-element in G^N_con means i⊥k|{1, . . . , d} \ {i, k}; see Wermuth (1976a) or Cox and Wermuth (1996), Section 3.4. The regression graph of Figure 4 is consistent with the first ordering in Figure 3 since there are no edges or only lines, i.e. undirected edges, within blocks a to e. After statistical analysis, blocks of the first ordering are often subdivided into the connected components of the graph, gj, shown here in Figure 4 with the help of the stacked boxes. For several nodes in gj, each pair of nodes is connected by at least one path within gj, that is via a sequence of edges coupling distinct nodes. For a regression graph, the connected components gj, for j = 1, . . . , J, are the dis... |

138 |
Conditional independence in statistical theory (with discussion)
- Dawid
- 1979
Citation Context ...aussian distributions; see Wermuth, Wiedenbeck and Cox (2006), Wiedenbeck and Wermuth (2010), Wermuth (2011). In particular, by removing an edge from any V of a regression graph, one introduces an additional independence constraint just as in a regular joint Gaussian distribution. This requires that the generated distributions satisfy the composition and intersection property in addition to general properties, as discussed in the next section. 5 Using graphs to combine independence statements We now state the four standard properties of independences of any multivariate distribution; see e.g. Dawid (1979), Studeny (2005), as well as two special properties of joint Gaussian distributions. The six taken together describe the combination and decomposition of independences in regression graphs, for instance those resulting by removing edges. We discuss when these six properties apply also to regression graph models. Let X, Y, Z be random (vector) variables, continuous, discrete or mixed. By using the same compact notation, f_XYZ for a given joint density, a probability distribution or a mixture and by denoting the union of say X and Y by XY, one has X ⊥ Y | Z ⟺ (f_XYZ = f_XZ f_YZ / f_Z), (9) where... |

136 | Identifying independence in Bayesian networks - Geiger, Verma, et al. - 1990 |

112 |
A transformational characterization of equivalent Bayesian network structures
- Chickering
- 1995
Citation Context ...ic graph is Markov equivalent to a concentration graph of the same skeleton if and only if it has no collision V. Efficient algorithms to decide whether an undirected graph can be oriented into a directed acyclic graph became available in the computer science literature under the name of perfect elimination schemes; see Tarjan and Yannakakis (1984). When algorithms were designed later to decide which arrows may be flipped in a given G^N_dag, keeping the same skeleton and the same set of sink Vs, to get to a list of all Markov equivalent G^N_dag's, these results appear not to have been used; see Chickering (1995). The number of equivalent characterizations of concentration graphs that have perfect elimination schemes has increased steadily, since they were introduced as rigid circuit graphs by Dirac (1961). These graphs without chordless cycles in four or more nodes are named also ‘chordal graphs’, ‘triangulated graphs’, ‘graphs with the running intersection property’ or ‘graphs with only complete prime graph separators.’ By contrast, for a covariance graph that can be oriented to be Markov equivalent to a G^N_dag of the same skeleton, chordless paths are relevant. Proposition 4. (Pearl and Wermuth, ... |

103 |
Gaussian Markov distributions over finite graphs.
- Speed, Kiiveri
- 1986
Citation Context ... regressors and undirected measures of association are used instead of coefficients of dependence. In the concentration graph models, the undirected associations are conditional given all remaining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gaussian random variables, these had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable Y, the elements of the covariance matrix Σ are σ_ij = E(Y_i Y_j). If Σ is invertible, the covariances σ_ij are in a one-to-one relation with the concentrations σ^ij, the elements of the concentration matrix Σ^{−1}. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution σ^{23.1} = σ^{23} − σ^{12}σ^{13}/σ^{11}, (7) where σ^{23.1} denotes the concentration ... |

93 | Probabilistic Conditional Independence Structures. - Studeny - 2004 |

70 |
On substantive research hypotheses, conditional independence graphs and graphical chain models
- Wermuth, Lauritzen
- 1989
Citation Context ... c Q, family status Figure 1: A well-fitting regression graph for data on n = 283 adult females; within boxes are Ya, Yb, Yc; corresponding ordered partitioning of the node set on top of the boxes. This just says that prediction of Ya is not improved by knowing the context variable Yc if information on the more recent intermediate variable Yb is available. More interpretations of the independences are given later. When the edges present represent substantial associations, the graph may also be viewed as a research hypothesis, the goodness-of-fit of which can be tested in future studies; see Wermuth and Lauritzen (1990). Two models are Markov equivalent whenever their associated graphs capture the same independence structure, that is the graphs lead to the same set of implied independence statements. Markov equivalent models cannot be distinguished on the basis of statistical goodness-of-fit tests for any given set of data. This may pose a problem in machine learning contexts. More precisely, knowledge about Markov equivalent models is essential for designing search procedures which converge with an increasing sample size to a true generating graph; see Castelo and Kocka (2003) for searches within the class ... |

69 | Alternative Markov Properties for Chain Graphs. - Andersson, Madigan, et al. - 2001 |

68 |
Graphical and recursive models for contingency tables.
- Wermuth, Lauritzen
- 1983
Citation Context ... equivalent to chordal concentration graphs; see Castelo and Siebes (2003). In the literature on graphical Markov models, a number of different names have been in use for a sink V, for instance ‘two arrows meeting head-on’ by Pearl (1988), ‘unshielded collider’ by Richardson and Spirtes (2002), and ‘Wermuth-configuration’ by Whittaker (1990), after it had been recognized that, for Gaussian distributions, the parameters of a directed acyclic graph model without sink Vs are in one-to-one correspondence to the parameters in its skeleton concentration graph model. Proposition 3. (Wermuth, 1980), (Wermuth and Lauritzen, 1983), (Frydenberg, 1990). A directed acyclic graph is Markov equivalent to a concentration graph of the same skeleton if and only if it has no collision V. Efficient algorithms to decide whether an undirected graph can be oriented into a directed acyclic graph became available in the computer science literature under the name of perfect elimination schemes; see Tarjan and Yannakakis (1984). When algorithms were designed later to decide which arrows may be flipped in a given G^N_dag, keeping the same skeleton and the same set of sink Vs, to get to a list of all Markov equivalent G^N_dag's, these resu... |

67 | Recursive causal models. - Kiiveri, Speed, et al. - 1984 |

66 | Principles of statistical inference. - Cox - 2006 |

64 |
Multivariate logistic models.
- Glonek, McCullagh
- 1995
Citation Context ...ion of the likelihood equations of a Gaussian SUR model may not be unique; see Drton and Richardson (2004), Sundberg (2010), while for exclusively discrete variables this will never happen; see Drton (2009). For mixed variables, no corresponding results are available yet. But in general, there often exists a covering model with nice estimation properties. For instance for the SUR model with regression graph ◦ ≻◦ ◦ ≺ ◦ , a general linear model with two independent regressors is a simple covering model. For a vector variable of categorical responses only, the multivariate logistic regression of Glonek and McCullagh (1995) reduces to separate main effect logistic regressions for each component of the response vector provided that certain higher-order interactions vanish; see Marchetti and Lupparelli (2010). In the context of structural equation models (SEMs), dependences of binary categorical variables are modeled in terms of probit regressions. These do not differ substantially from logistic regressions whenever the smallest and largest events occur at least with probability 0.1; see Cox (1966). Multivariate linear regressions as well as SUR models belong to the framework of SEMs even though this general class... |

63 |
Maximum likelihood in three-way contingency tables
- Birch
- 1963
Citation Context ...rom 1945 to 1965, but some issues still need to be settled; see for instance Drton, Eichler and Richardson (2009), Stanghellini and Wermuth (2005). In statistical models that treat all variables on equal standing, the variables are not assigned roles of responses or regressors and undirected measures of association are used instead of coefficients of dependence. In the concentration graph models, the undirected associations are conditional given all remaining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gaussian random variables, these had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable $Y$, the elements of the covariance matrix $\Sigma$ are $\sigma_{ij} = E(Y_i Y_j)$. If $\Sigma$ is invertible, the covariances ... |

63 |
Elements of continuous multivariate analysis. Addison-Wesley series in behavioral sciences.
- Dempster
- 1969
Citation Context ... had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable $Y$, the elements of the covariance matrix $\Sigma$ are $\sigma_{ij} = E(Y_i Y_j)$. If $\Sigma$ is invertible, the covariances $\sigma_{ij}$ are in a one-to-one relation with the concentrations $\sigma^{ij}$, the elements of the concentration matrix $\Sigma^{-1}$. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution $\sigma^{23.1} = \sigma^{23} - \sigma^{12}\sigma^{13}/\sigma^{11}$, (7) where $\sigma^{23.1}$ denotes the concentration of $Y_2, Y_3$ in their bivariate marginal distribution. Thus, the overall concentration $\sigma^{23}$ coincides with $\sigma^{23.1}$ if and only if $\sigma^{12} = 0$ or $\sigma^{13} = 0$. Alternatively in covariance graph models, the undirected measures for variables on equal standing are pairwise marginal associations. For Gaussian variables, these models had been introduced as hypotheses linear in covariances; see Anderson (1973), Kauermann (1996), Wermuth, Cox and Marchetti (2006), Chaudhuri, Drton and Richardson (2007). For categorica... |
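The recursive relation (7) for concentrations quoted in this context can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical 3 × 3 covariance matrix chosen only for illustration; the helper `inverse` and all variable names are assumptions of this example and do not come from the cited works. Exact fractions are used so that the two sides can be compared without rounding error.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    # Augment the matrix with the identity matrix.
    a = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        # Swap a row with a non-zero pivot into place (assumes m is invertible).
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


# Hypothetical covariance matrix of (Y1, Y2, Y3); illustrative values only.
sigma = [[Fraction(v) for v in row]
         for row in [[4, 2, 1], [2, 3, 1], [1, 1, 2]]]
conc = inverse(sigma)  # concentration matrix, the inverse of sigma

# Right-hand side of (7): sigma^{23} - sigma^{12} sigma^{13} / sigma^{11}.
rhs = conc[1][2] - conc[0][1] * conc[0][2] / conc[0][0]

# Left-hand side: concentration of Y2, Y3 in their bivariate marginal
# distribution, obtained by inverting the 2 x 2 marginal covariance matrix.
marginal = [[sigma[1][1], sigma[1][2]], [sigma[2][1], sigma[2][2]]]
lhs = inverse(marginal)[0][1]

assert lhs == rhs
print(lhs, rhs)
```

In this example both sides agree exactly; with floating-point input one would compare within a tolerance instead.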

57 |
Model selection for Gaussian concentration graphs
- Drton, Perlman
- 2004
Citation Context ...ociation are used instead of coefficients of dependence. In the concentration graph models, the undirected associations are conditional given all remaining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gaussian random variables, these had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable $Y$, the elements of the covariance matrix $\Sigma$ are $\sigma_{ij} = E(Y_i Y_j)$. If $\Sigma$ is invertible, the covariances $\sigma_{ij}$ are in a one-to-one relation with the concentrations $\sigma^{ij}$, the elements of the concentration matrix $\Sigma^{-1}$. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution $\sigma^{23.1} = \sigma^{23} - \sigma^{12}\sigma^{13}/\sigma^{11}$, (7) where $\sigma^{23.1}$ denotes the concentration of $Y_2, Y_3$ in their bivariate marginal dist... |

56 |
The multivariate analysis of qualitative data: Interactions among multiple classifications.
- Goodman
- 1970
Citation Context ...s still need to be settled; see for instance Drton, Eichler and Richardson (2009), Stanghellini and Wermuth (2005). In statistical models that treat all variables on equal standing, the variables are not assigned roles of responses or regressors and undirected measures of association are used instead of coefficients of dependence. In the concentration graph models, the undirected associations are conditional given all remaining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gaussian random variables, these had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable $Y$, the elements of the covariance matrix $\Sigma$ are $\sigma_{ij} = E(Y_i Y_j)$. If $\Sigma$ is invertible, the covariances $\sigma_{ij}$ are in a one-to-one relation w... |

54 |
Asymptotically efficient estimation of covariance matrices with linear structure.
- Anderson
- 1973
Citation Context ...ns $\sigma^{ij}$, the elements of the concentration matrix $\Sigma^{-1}$. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution $\sigma^{23.1} = \sigma^{23} - \sigma^{12}\sigma^{13}/\sigma^{11}$, (7) where $\sigma^{23.1}$ denotes the concentration of $Y_2, Y_3$ in their bivariate marginal distribution. Thus, the overall concentration $\sigma^{23}$ coincides with $\sigma^{23.1}$ if and only if $\sigma^{12} = 0$ or $\sigma^{13} = 0$. Alternatively in covariance graph models, the undirected measures for variables on equal standing are pairwise marginal associations. For Gaussian variables, these models had been introduced as hypotheses linear in covariances; see Anderson (1973), Kauermann (1996), Wermuth, Cox and Marchetti (2006), Chaudhuri, Drton and Richardson (2007). For categorical variables, covariance graph models have been studied only more recently; see Drton and Richardson (2008), Lupparelli, Marchetti and Bergsma (2009). Again, no similar estimation results are available for general mixed variables yet. There is also a recursive relation for covariances; see Anderson (1958). It shows for instance, for just three components of $Y$ having a Gaussian distribution, with $\sigma_{12|3} = \sigma_{12} - \sigma_{13}\sigma_{23}/\sigma_{33}$, (8) where $\sigma_{12|3}$ denotes the covariance of $Y_1, Y_2$ given $Y_3$. Therefor... |

54 |
Linear dependencies represented by chain graphs (with discussion).
- Cox, Wermuth
- 1993
Citation Context ...wo of the main new results of this paper can be stated. Theorem 1. Two regression graphs are Markov equivalent if and only if they have the same skeleton and the same sets of collision Vs, irrespective of the type of edge. Theorem 2. A regression graph with a chordal graph for the context variables can be oriented to be Markov equivalent to a directed acyclic graph in the same skeleton, if and only if it does not contain any chordless collision path in four nodes. Sequences of regressions were introduced and studied, without specifying a concentration graph model for the context variables, by Cox and Wermuth (1993), Wermuth and Cox (2004), under the name of multivariate regression chains, reminding one of the sequences of unconstrained models that the class contains for Gaussian joint responses. An extension to graphs including a concentration graph had already been proposed for directed acyclic graphs by Kiiveri, Speed and Carlin (1984). By this type of extension, the so-called global Markov property of the graph remains unchanged. This property permits one to read off the graph all independence statements implied by the graph. A criterion for Markov equivalence of summary graphs has been derived by Sadeghi... |

51 | Marginal models for categorical data.
- Bergsma, Rudas
- 2002
Citation Context ...advocates SEMs as a framework for causal inquiries. In the econometric literature thirty years ago, independences were always regarded as ‘overidentifying’ constraints. For discrete variables, more attractive features of regression graph models were derived by Drton (2009), who speaks of chain graph models of type IV for multivariate regression chains. He proves that each member in this class belongs to a curved exponential family, for a discussion of this notion see for instance Cox (2006). Discrete type IV models form also a subclass of marginal models; see Rudas, Bergsma and Nemeth (2010), Bergsma and Rudas (2002). Defining local independence statements that involve only variables in the past are equivalent to more complex local independences used by Drton (2009); see Marchetti and Lupparelli (2010). These local definitions imply the pairwise independences of equation (2) for any regression graph, $G^N_{\mathrm{reg}}$. Two other types of chain graph have been studied as joint response models in statistics, the so-called AMP chain graphs of Andersson, Madigan and Perlman (2001), and the LWF chain graphs of Lauritzen and Wermuth (1989) and Frydenberg (1990). These are suitable for modeling data from intervention stud... |

44 |
Linear recursive equations, covariance selection, and path analysis.
- Wermuth
- 1980
Citation Context ...to be also Markov equivalent to chordal concentration graphs; see Castelo and Siebes (2003). In the literature on graphical Markov models, a number of different names have been in use for a sink V, for instance ‘two arrows meeting head-on’ by Pearl (1988), ‘unshielded collider’ by Richardson and Spirtes (2002), and ‘Wermuth-configuration’ by Whittaker (1990), after it had been recognized that, for Gaussian distributions, the parameters of a directed acyclic graph model without sink Vs are in one-to-one correspondence to the parameters in its skeleton concentration graph model. Proposition 3. (Wermuth, 1980), (Wermuth and Lauritzen, 1983), (Frydenberg, 1990). A directed acyclic graph is Markov equivalent to a concentration graph of the same skeleton if and only if it has no collision V. Efficient algorithms to decide whether an undirected graph can be oriented into a directed acyclic graph became available in the computer science literature under the name of perfect elimination schemes; see Tarjan and Yannakakis (1984). When algorithms were designed later to decide which arrows may be flipped in a given $G^N_{\mathrm{dag}}$, keeping the same skeleton and the same set of sink Vs, to get to a list of all Markov... |

41 |
On a dualization of graphical Gaussian models
- Kauermann
- 1996
Citation Context ...ents of the concentration matrix $\Sigma^{-1}$. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution $\sigma^{23.1} = \sigma^{23} - \sigma^{12}\sigma^{13}/\sigma^{11}$, (7) where $\sigma^{23.1}$ denotes the concentration of $Y_2, Y_3$ in their bivariate marginal distribution. Thus, the overall concentration $\sigma^{23}$ coincides with $\sigma^{23.1}$ if and only if $\sigma^{12} = 0$ or $\sigma^{13} = 0$. Alternatively in covariance graph models, the undirected measures for variables on equal standing are pairwise marginal associations. For Gaussian variables, these models had been introduced as hypotheses linear in covariances; see Anderson (1973), Kauermann (1996), Wermuth, Cox and Marchetti (2006), Chaudhuri, Drton and Richardson (2007). For categorical variables, covariance graph models have been studied only more recently; see Drton and Richardson (2008), Lupparelli, Marchetti and Bergsma (2009). Again, no similar estimation results are available for general mixed variables yet. There is also a recursive relation for covariances; see Anderson (1958). It shows for instance, for just three components of $Y$ having a Gaussian distribution, with $\sigma_{12|3} = \sigma_{12} - \sigma_{13}\sigma_{23}/\sigma_{33}$, (8) where $\sigma_{12|3}$ denotes the covariance of $Y_1, Y_2$ given $Y_3$. Therefore, $\sigma_{12|3}$ coincides... |

40 | On inclusion-driven learning of Bayesian networks
- Castelo, Kočka
- 2003
Citation Context ...ted in future studies; see Wermuth and Lauritzen (1990). Two models are Markov equivalent whenever their associated graphs capture the same independence structure, that is the graphs lead to the same set of implied independence statements. Markov equivalent models cannot be distinguished on the basis of statistical goodness-of-fit tests for any given set of data. This may pose a problem in machine learning contexts. More precisely, knowledge about Markov equivalent models is essential for designing search procedures which converge with an increasing sample size to a true generating graph; see Castelo and Kocka (2003) for searches within the class of directed acyclic graphs which consist exclusively of arrows and capture independences of ordered sequences in single response regressions. More importantly though, Markov equivalent models may offer alternative interpretations of a given well-fitting model or open the possibility of using different types of fitting algorithms. As we shall see in Section 7, the graph for nodes A, R, B, P, Q in blocks b and c of Figure 1 is Markov equivalent to both graphs of Figure 2. [Figure 2, node labels: R, family distress; A, sexual abuse; P, age; B, schooling ...] |

37 | Discrete chain graph models.
- Drton
- 2009
Citation Context ...structural equation models (SEMs), those permitting local modeling due to the factorisation property (1) and are without any endogenous responses. Such responses have residuals that are correlated with some of their regressors. For traditional uses of SEMs see for instance Joreskog (1981), Bollen (1989), Kline (2006), while Pearl (2009) advocates SEMs as a framework for causal inquiries. In the econometric literature thirty years ago, independences were always regarded as ‘overidentifying’ constraints. For discrete variables, more attractive features of regression graph models were derived by Drton (2009), who speaks of chain graph models of type IV for multivariate regression chains. He proves that each member in this class belongs to a curved exponential family, for a discussion of this notion see for instance Cox (2006). Discrete type IV models form also a subclass of marginal models; see Rudas, Bergsma and Nemeth (2010), Bergsma and Rudas (2002). Defining local independence statements that involve only variables in the past are equivalent to more complex local independences used by Drton (2009); see Marchetti and Lupparelli (2010). These local definitions imply the pairwise independences o... |

37 | Estimation of a covariance matrix with zeros. - Chaudhuri, Drton, et al. - 2007 |

34 | Completeness, similar regions and unbiased estimation. - Lehmann, Scheffe - 1955 |

31 | Contribution à l’analyse statistique des tableaux de corrélation. - Caussinus - 1965 |

31 |
Graphoids: a graph-based logic for reasoning about relevance relations.
- Pearl, Paz
- 1987
Citation Context ... if there is an edge with one node in a and the other node in b. In this case, the graph $G^{a \cup b}_{\mathrm{cov}}$ is connected in a and b, otherwise the graphs in a and b are disconnected. Corollary 1. A covariance graph, $G^N_{\mathrm{cov}}$, or a concentration graph, $G^N_{\mathrm{con}}$, implies $a \perp b$ if and only if in the subgraph induced by $a \cup b$, the graphs in a and b are disconnected. It can be shown that the independence structure of a regression graph is fully specified by the pairwise independences (2) of each missing edge if both properties (v) and (vi) hold in addition to the standard ones; see also Kang and Tian (2009), Pearl and Paz (1987), Marchetti and Lupparelli (2010) for relevant, previous special results. Lemma 1. A regression graph, $G^N_{\mathrm{reg}}$, captures an independence structure for a distribution with density $f_N$ factorizing as (1) if properties (i) to (vi) hold for $f_N$. Proof. The first four properties hold for any density $f_N$. Given the intersection property (vi), any node i with missing edges to nodes k, l in a concentration graph of node set N implies $i \perp \{k, l\} \mid N \setminus \{i, k, l\}$ and given the composition property (v), any node i with missing edges to nodes k, l in a covariance graph given $Y_c$ implies $i \perp \{k, l\} \mid c$. For purely dis... |

31 |
Joint response graphs and separation induced by triangular systems.
- Wermuth, Cox
- 2004
Citation Context ...ts of this paper can be stated. Theorem 1. Two regression graphs are Markov equivalent if and only if they have the same skeleton and the same sets of collision Vs, irrespective of the type of edge. Theorem 2. A regression graph with a chordal graph for the context variables can be oriented to be Markov equivalent to a directed acyclic graph in the same skeleton, if and only if it does not contain any chordless collision path in four nodes. Sequences of regressions were introduced and studied, without specifying a concentration graph model for the context variables, by Cox and Wermuth (1993), Wermuth and Cox (2004), under the name of multivariate regression chains, reminding one of the sequences of unconstrained models that the class contains for Gaussian joint responses. An extension to graphs including a concentration graph had already been proposed for directed acyclic graphs by Kiiveri, Speed and Carlin (1984). By this type of extension, the so-called global Markov property of the graph remains unchanged. This property permits one to read off the graph all independence statements implied by the graph. A criterion for Markov equivalence of summary graphs has been derived by Sadeghi (2009) who also shows t... |

29 | Multimodality of the likelihood in the bivariate seemingly unrelated regressions model.
- Drton, Richardson
- 2004
Citation Context ... of the vector and the individual component variables of the response vector form the set of joint responses. Maximum-likelihood estimation of regression coefficients for a joint Gaussian distribution reduces to linear least-squares fitting for each component separately; see Anderson (1958). With different sets of regressors for the components of a vector response, seemingly unrelated regressions (SUR) result and iterative methods are needed for estimation; see Zellner (1962). For small sample sizes, a given solution of the likelihood equations of a Gaussian SUR model may not be unique; see Drton and Richardson (2004), Sundberg (2010), while for exclusively discrete variables this will never happen; see Drton (2009). For mixed variables, no corresponding results are available yet. But in general, there often exists a covering model with nice estimation properties. For instance for the SUR model with regression graph ◦ ≻◦ ◦ ≺ ◦ , a general linear model with two independent regressors is a simple covering model. For a vector variable of categorical responses only, the multivariate logistic regression of Glonek and McCullagh (1995) reduces to separate main effect logistic regressions for each component of the... |

29 | Analysis of covariance structures. - Joreskog - 1981 |

28 | A New Identification Condition for Recursive Models with Correlated Errors. Structural Equation Modeling. - Brito, Pearl - 2002 |

28 | Markov fields and log-linear models for contingency tables. - Darroch, Lauritzen, et al. - 1980 |

27 |
The omission or addition of an independent variate in multiple linear regression
- Cochran
- 1938
Citation Context ... This is discussed further in the next section. One of the special important features of the linear least-squares regressions is that the residuals are uncorrelated with the regressors. The effect is that the model part coincides with a conditional linear expectation as illustrated here with a model for response $Y_1$ and regressors $Y_2, Y_3$, which we take, as mentioned before, as measured in deviations from their means. For instance, one gets for $Y_1 = \beta_{1|2.3}Y_2 + \beta_{1|3.2}Y_3 + \varepsilon_1$, $E_{\mathrm{lin}}(Y_1 \mid Y_2, Y_3) = \beta_{1|2.3}Y_2 + \beta_{1|3.2}Y_3$. (5) There is a recursive relation for least-squares regression coefficients; see Cochran (1938), Cox and Wermuth (2003), Ma, Xie and Geng (2006). It shows for instance with $\beta_{1|3} = \beta_{1|3.2} + \beta_{1|2.3}\beta_{2|3}$ (6) that $\beta_{1|3.2}$, the partial coefficient of $Y_3$ given also $Y_2$ as a regressor for $Y_1$, coincides with the marginal coefficient, $\beta_{1|3}$, if and only if $\beta_{1|2.3} = 0$ or $\beta_{2|3} = 0$. The method of maximizing the likelihood was recommended by Sir Ronald Fisher (1890–1962) as a general estimation technique that applies also to regressions with categorical or quantitative responses. One of the most attractive features of the method concerns properties of the estimates. Given two models with parameters that... |
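Cochran's recursive relation (6) for least-squares coefficients can be verified on a small example. The sketch below assumes a hypothetical covariance matrix of mean-centred variables (Y1, Y2, Y3); the matrix values and the variable names are illustrative assumptions, not data from the cited papers. The partial coefficients are obtained by solving the 2 × 2 normal equations with Cramer's rule.

```python
from fractions import Fraction

# Hypothetical covariance matrix of mean-centred (Y1, Y2, Y3); illustrative only.
s = [[Fraction(v) for v in row]
     for row in [[4, 2, 1], [2, 3, 1], [1, 1, 2]]]

# Marginal least-squares coefficients with Y3 as the only regressor.
b1_3 = s[0][2] / s[2][2]  # beta_{1|3}
b2_3 = s[1][2] / s[2][2]  # beta_{2|3}

# Partial coefficients of Y1 regressed on (Y2, Y3): solve the normal equations
#   s22 * b2 + s23 * b3 = s12
#   s23 * b2 + s33 * b3 = s13
# by Cramer's rule.
det = s[1][1] * s[2][2] - s[1][2] * s[2][1]
b1_2_3 = (s[0][1] * s[2][2] - s[0][2] * s[1][2]) / det  # beta_{1|2.3}
b1_3_2 = (s[0][2] * s[1][1] - s[0][1] * s[1][2]) / det  # beta_{1|3.2}

# Relation (6): beta_{1|3} = beta_{1|3.2} + beta_{1|2.3} * beta_{2|3}.
assert b1_3 == b1_3_2 + b1_2_3 * b2_3
print(b1_3, b1_3_2, b1_2_3, b2_3)
```

Here neither β1|2.3 nor β2|3 vanishes, so the marginal and the partial coefficient of Y3 differ, as the relation predicts.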

27 | When can association graphs admit a causal interpretation?
- Pearl, Wermuth
- 1994
Citation Context ...e Chickering (1995). The number of equivalent characterizations of concentration graphs that have perfect elimination schemes has increased steadily, since they were introduced as rigid circuit graphs by Dirac (1961). These graphs without chordless cycles in four or more nodes are named also ‘chordal graphs’, ‘triangulated graphs’, ‘graphs with the running intersection property’ or ‘graphs with only complete prime graph separators.’ By contrast, for a covariance graph that can be oriented to be Markov equivalent to a $G^N_{\mathrm{dag}}$ of the same skeleton, chordless paths are relevant. Proposition 4. (Pearl and Wermuth, 1994). A covariance graph with a chordless path in four nodes is not Markov equivalent to a directed acyclic graph in the same node set. For distributions generated over directed acyclic graphs, sink Vs are needed again. Proposition 5. (Frydenberg, 1990), (Verma and Pearl, 1990). Directed acyclic graphs of the same skeleton are Markov equivalent if and only if they have the same sink Vs. Markov equivalence of a concentration graph and a covariance graph model is for regular joint Gaussian distributions equivalent to parameter equivalence which means that there is a one-to-one relation between the t... |
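The criterion of Proposition 5 (same skeleton and same sink Vs) lends itself to a direct check. The following is a minimal sketch, assuming each directed acyclic graph is supplied as a set of hypothetical `(parent, child)` pairs with sortable node labels; the function names are assumptions of this example, and the code does not verify that its inputs are acyclic.

```python
from itertools import combinations


def skeleton(arrows):
    """Undirected edges underlying a set of (parent, child) arrows."""
    return {frozenset(arrow) for arrow in arrows}


def sink_vs(arrows):
    """Sink Vs: i -> k <- j where i and j are not adjacent."""
    skel = skeleton(arrows)
    parents = {}
    for parent, child in arrows:
        parents.setdefault(child, set()).add(parent)
    vs = set()
    for child, pa in parents.items():
        for i, j in combinations(sorted(pa), 2):
            if frozenset((i, j)) not in skel:
                vs.add((frozenset((i, j)), child))
    return vs


def markov_equivalent_dags(d1, d2):
    """Proposition 5 criterion: same skeleton and same sink Vs."""
    return skeleton(d1) == skeleton(d2) and sink_vs(d1) == sink_vs(d2)


chain = {(1, 2), (2, 3)}           # 1 -> 2 -> 3
reversed_chain = {(3, 2), (2, 1)}  # 1 <- 2 <- 3
collider = {(1, 2), (3, 2)}        # 1 -> 2 <- 3, a sink V at node 2

assert markov_equivalent_dags(chain, reversed_chain)
assert not markov_equivalent_dags(chain, collider)
```

The two chains share the skeleton 1 - 2 - 3 and have no sink V, so they pass the check; the collider has the same skeleton but adds a sink V and therefore fails it.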

26 | Markov equivalence for ancestral graphs. - Ali, Richardson, et al. - 2009 |

25 | Binary models for marginal independence.
- Drton, Richardson
- 2008
Citation Context ...es the concentration of $Y_2, Y_3$ in their bivariate marginal distribution. Thus, the overall concentration $\sigma^{23}$ coincides with $\sigma^{23.1}$ if and only if $\sigma^{12} = 0$ or $\sigma^{13} = 0$. Alternatively in covariance graph models, the undirected measures for variables on equal standing are pairwise marginal associations. For Gaussian variables, these models had been introduced as hypotheses linear in covariances; see Anderson (1973), Kauermann (1996), Wermuth, Cox and Marchetti (2006), Chaudhuri, Drton and Richardson (2007). For categorical variables, covariance graph models have been studied only more recently; see Drton and Richardson (2008), Lupparelli, Marchetti and Bergsma (2009). Again, no similar estimation results are available for general mixed variables yet. There is also a recursive relation for covariances; see Anderson (1958). It shows for instance, for just three components of $Y$ having a Gaussian distribution, with $\sigma_{12|3} = \sigma_{12} - \sigma_{13}\sigma_{23}/\sigma_{33}$, (8) where $\sigma_{12|3}$ denotes the covariance of $Y_1, Y_2$ given $Y_3$. Therefore, $\sigma_{12|3}$ coincides with $\sigma_{12}$ if and only if $\sigma_{13} = 0$ or $\sigma_{23} = 0$. By equations (6), (7), (8), a unique independence statement is associated with the endpoints of any V in a trivariate Gaussian distribution. In the cont... |

25 | Marginalizing and conditioning in graphical models. - Koster - 2002 |

21 | Partial inversion for linear systems and partial closure of independence graphs. - Wermuth, Wiedenbeck, et al. - 2006 |

20 |
Analogies between multiplicative models in contingency tables and covariance selection.
- Wermuth
- 1976
Citation Context ...ariable are conditionally independent given their past. [Figure 4: A regression graph for 14 variables corresponding to blocks a to e of Figure 3.] Graphs with dashed lines are covariance graphs denoted by $G^N_{\mathrm{cov}}$, those with full lines are concentration graphs denoted by $G^N_{\mathrm{con}}$; see Wermuth and Cox (1998). The names are to remind one of their parametrisation in regular joint Gaussian distributions, in which the covariance matrix is invertible and gives the concentration matrix. A zero ik-element in $G^N_{\mathrm{cov}}$ means $i \perp k$ and a zero ik-element in $G^N_{\mathrm{con}}$ means $i \perp k \mid \{1, \dots, d\} \setminus \{i, k\}$; see Wermuth (1976a) or Cox and Wermuth (1996), Section 3.4. The regression graph of Figure 4 is consistent with the first ordering in Figure 3 since there are no edges or only lines, i.e. undirected edges, within blocks a to e. After statistical analysis, blocks of the first ordering are often subdivided into the connected components of the graph, $g_j$, shown here in Figure 4 with the help of the stacked boxes. For several nodes in $g_j$, each pair of nodes is connected by at least one path within $g_j$, that is via a sequence of edges coupling distinct nodes. For a regression graph, the connected components $g_j$, for ... |

18 | Chain graph models of multivariate regression type for categorical data. Bernoulli.
- Marchetti, Lupparelli
- 2010
Citation Context ...variables, more attractive features of regression graph models were derived by Drton (2009), who speaks of chain graph models of type IV for multivariate regression chains. He proves that each member in this class belongs to a curved exponential family, for a discussion of this notion see for instance Cox (2006). Discrete type IV models form also a subclass of marginal models; see Rudas, Bergsma and Nemeth (2010), Bergsma and Rudas (2002). Defining local independence statements that involve only variables in the past are equivalent to more complex local independences used by Drton (2009); see Marchetti and Lupparelli (2010). These local definitions imply the pairwise independences of equation (2) for any regression graph, $G^N_{\mathrm{reg}}$. Two other types of chain graph have been studied as joint response models in statistics, the so-called AMP chain graphs of Andersson, Madigan and Perlman (2001), and the LWF chain graphs of Lauritzen and Wermuth (1989) and Frydenberg (1990). These are suitable for modeling data from intervention studies, when they are Markov equivalent to a regression graph, since they have in common that pairwise independences include other nodes of the same connected component. For AMP graphs, in equ... |

18 | Matrix representations and independencies in directed acyclic graphs.
- Marchetti, Wermuth
- 2009
Citation Context ... intersection property (vi) to hold are known; see San Martin, Mouchart and Rolin (2005). Too strong sufficient conditions are for joint Gaussian distributions that they are regular and for discrete variables, that the probabilities are strictly positive. The composition property (v) is known to hold for Gaussian distributions and for triangular binary distributions with at most main effects in symmetric (−1, 1) variables; see Wermuth, Marchetti and Cox (2009). Both properties (v) and (vi) hold, whenever a distribution may have been generated over a so-called parent graph; see Wermuth (2011), Marchetti and Wermuth (2009), Wermuth, Wiedenbeck and Cox (2006). Parent graphs, denoted by $G^N_{\mathrm{par}}$, are directed acyclic graphs with some added properties. Parent graphs are connected directed acyclic graphs to which one fixed, compatible ordering of the nodes is attached and the graph is edge-minimal. An ordering of the nodes is compatible with a given $G^N_{\mathrm{dag}}$ if each ancestor of any node i is within $g_{>i} = \{i + 1, \dots, d\}$. In addition, $G^N_{\mathrm{par}}$ is to be edge-minimal for the generated distribution, that is no edge can be removed from the graph without adding another independence statement to the distribution generated over... |

18 |
On association models defined over independence graphs.
- Wermuth, Cox
- 1998
Citation Context ...responding to Figure 3, Ya is a single response, Yb has two component variables, both of Yc and Ye have four and Yd has three. Each of the blocks b to e shows two stacked boxes, that is subsets of nodes without any undirected edge joining them. This indicates that disconnected components of a given vector variable are conditionally independent given their past. [Figure 4: A regression graph for 14 variables corresponding to blocks a to e of Figure 3.] Graphs with dashed lines are covariance graphs denoted by $G^N_{\mathrm{cov}}$, those with full lines are concentration graphs denoted by $G^N_{\mathrm{con}}$; see Wermuth and Cox (1998). The names are to remind one of their parametrisation in regular joint Gaussian distributions, in which the covariance matrix is invertible and gives the concentration matrix. A zero ik-element in $G^N_{\mathrm{cov}}$ means $i \perp k$ and a zero ik-element in $G^N_{\mathrm{con}}$ means $i \perp k \mid \{1, \dots, d\} \setminus \{i, k\}$; see Wermuth (1976a) or Cox and Wermuth (1996), Section 3.4. The regression graph of Figure 4 is consistent with the first ordering in Figure 3 since there are no edges or only lines, i.e. undirected edges, within blocks a to e. After statistical analysis, blocks of the first ordering are often subdivided into the con... |

18 | Distortions of effects caused by indirect confounding. - Wermuth, Cox - 2008 |

17 |
An approximation to maximum-likelihood estimates in reduced models.
- Cox, Wermuth
- 1990
Citation Context ...raph, since they have in common that pairwise independences include other nodes of the same connected component. For AMP graphs, in equation (2) (i) is replaced by (i′) $i \perp k \mid g_{>j-1} \setminus \{i, k\}$ for i, k both in $g_j$, $j = 1, \dots, r$, and for LWF graphs, (i) is also replaced by (i′) and (iii) by (iii′) $i \perp k \mid g_{>j-1} \setminus \{i, k\}$ for i in $g_j$ with $j \le r$ and k in $g_{>j}$. Not yet systematically approached is the search for simple covering models that capture most but not all independences, but for regression graphs see Propositions 8 to 10. Often, a covering model may be easier to fit than a reduced edge-minimal model; see Cox and Wermuth (1990) and the discussion of Figures 16 and 17 in Section 7. Before we discuss the meaning of different types of missing edges for linear models in more detail, we derive a well-fitting regression graph for data given by Kappesser (1997). 3 Deriving and interpreting a regression graph For 201 chronic pain patients, the role of the site of pain during a three-week stay in a chronic pain clinic was to be examined. In this study, it was of main interest to investigate the changes in two main symptoms and to understand determinants of the overall treatment success as rated by the patients, three month... |

17 |
A general condition for avoiding effect reversal after marginalization.
- Cox, Wermuth
- 2003
Citation Context ...ed further in the next section. One of the especially important features of linear least-squares regressions is that the residuals are uncorrelated with the regressors. The effect is that the model part coincides with a conditional linear expectation, as illustrated here with a model for response Y1 and regressors Y2, Y3, which we take, as mentioned before, as measured in deviations from their means. For instance, one gets for Y1 = β1|2.3Y2 + β1|3.2Y3 + ε1, Elin(Y1|Y2, Y3) = β1|2.3Y2 + β1|3.2Y3. (5) There is a recursive relation for least-squares regression coefficients; see Cochran (1938), Cox and Wermuth (2003), Ma, Xie and Geng (2006). It shows for instance with β1|3 = β1|3.2 + β1|2.3β2|3 (6) that β1|3.2, the partial coefficient of Y3 given also Y2 as a regressor for Y1, coincides with the marginal coefficient, β1|3, if and only if β1|2.3 = 0 or β2|3 = 0. The method of maximizing the likelihood was recommended by Sir Ronald Fisher (1890–1962) as a general estimation technique that applies also to regressions with categorical or quantitative responses. One of the most attractive features of the method concerns properties of the estimates. Given two models with parameters that are in one-to-one corres... |

17 |
Covariance hypotheses which are linear in both the covariance and the inverse covariance.
- Jensen
- 1988
Citation Context ... (Verma and Pearl, 1990). Directed acyclic graphs of the same skeleton are Markov equivalent if and only if they have the same sink Vs. Markov equivalence of a concentration graph and a covariance graph model is for regular joint Gaussian distributions equivalent to parameter equivalence, which means that there is a one-to-one relation between the two sets of parameters. Therefore, an early result on parameter equivalence for joint Gaussian distributions implies the following Markov equivalence result for distributions satisfying both the composition and the intersection property. Proposition 6. (Jensen, 1988), (Drton and Richardson, 2008). A covariance graph is Markov equivalent to a concentration graph if and only if both consist of the same complete, disconnected subgraphs. Fast ways of inserting an edge for every transition V, of deciding on connectivity and on blocking flows have been available in the corresponding Russian literature since 1970; see Dinitz (2006), but these results appear not to have been exploited for the so-called lattice conditional independence models, recognized as distributions generated over GNdags without any transition Vs by Andersson, Madigan, Perlman and Triggs ... |

17 | Covariance chains. - Wermuth, Cox, et al. - 2006 |

16 |
Some procedures associated with the logistic qualitative response curve.
- Cox
- 1966
Citation Context ... model. For a vector variable of categorical responses only, the multivariate logistic regression of Glonek and McCullagh (1995) reduces to separate main effect logistic regressions for each component of the response vector provided that certain higher-order interactions vanish; see Marchetti and Lupparelli (2010). In the context of structural equation models (SEMs), dependences of binary categorical variables are modeled in terms of probit regressions. These do not differ substantially from logistic regressions whenever the smallest and largest events occur at least with probability 0.1; see Cox (1966). Multivariate linear regressions as well as SUR models belong to the framework of SEMs even though this general class had been developed in econometrics to deal with endogenous responses, defined by the existence of correlations between the residuals and some regressors. For endogenous responses, the equation parameters are no longer measures of conditional dependence, as they are in linear least-squares regression models. Estimation methods for SEMs were discussed in the Berkeley symposia on mathematical statistics and probability from 1945 to 1965, but some issues still need to be settled; ... |

15 |
Model search among multiplicative models
- Wermuth
- 1976
Citation Context ...ariable are conditionally independent given their past. Figure 4: A regression graph for 14 variables corresponding to blocks a to e of Figure 3. Graphs with dashed lines are covariance graphs denoted by GNcov, those with full lines are concentration graphs denoted by GNcon; see Wermuth and Cox (1998). The names are to remind one of their parametrisation in regular joint Gaussian distributions, in which the covariance matrix is invertible and gives the concentration matrix. A zero ik-element in GNcov means i⊥k and a zero ik-element in GNcon means i⊥k|{1, ..., d} \ {i, k}; see Wermuth (1976a) or Cox and Wermuth (1996), Section 3.4. The regression graph of Figure 4 is consistent with the first ordering in Figure 3 since there are no edges or only lines, i.e. undirected edges, within blocks a to e. After statistical analysis, blocks of the first ordering are often subdivided into the connected components of the graph, gj, shown here in Figure 4 with the help of the stacked boxes. For several nodes in gj, each pair of nodes is connected by at least one path within gj, that is, via a sequence of edges coupling distinct nodes. For a regression graph, the connected components gj, for ... |

14 | Graphical methods for efficient likelihood inference in Gaussian covariance models
- Drton, Richardson
- 2008
Citation Context ...es the concentration of Y2, Y3 in their bivariate marginal distribution. Thus, the overall concentration σ^{23} coincides with σ^{23.1} if and only if σ^{12} = 0 or σ^{13} = 0. Alternatively in covariance graph models, the undirected measures for variables on equal standing are pairwise marginal associations. For Gaussian variables, these models had been introduced as hypotheses linear in covariances; see Anderson (1973), Kauermann (1996), Wermuth, Cox and Marchetti (2006), Chaudhuri, Drton and Richardson (2007). For categorical variables, covariance graph models have been studied only more recently; see Drton and Richardson (2008), Lupparelli, Marchetti and Bergsma (2009). Again, no similar estimation results are available for general mixed variables yet. There is also a recursive relation for covariances; see Anderson (1958). It shows for instance, for just three components of Y having a Gaussian distribution, with σ12|3 = σ12 − σ13σ23/σ33, (8) where σ12|3 denotes the covariance of Y1, Y2 given Y3. Therefore, σ12|3 coincides with σ12 if and only if σ13 = 0 or σ23 = 0. By equations (6), (7), (8), a unique independence statement is associated with the endpoints of any V in a trivariate Gaussian distribution. In the cont... |

14 | Separation and Completeness Properties for AMP Chain Graph Markov Models. The Annals of Statistics - Levitz, Perlman, et al. - 2001 |

14 |
Ancestral Markov graphical models.
- Richardson, Spirtes
- 2002
Citation Context ... Figure 2 (panels a and b): Two Markov equivalent graphs to the one of Yb, Yc of Figure 1. From knowing the Markov equivalence to the graph in Figure 2a), the joint response model for Yb given Ya may also be fitted in terms of univariate regressions and from the Markov equivalence to the graph in Figure 2b), one knows for instance directly, using Proposition 1 below, that sexual abuse is independent of age and schooling given knowledge about family distress and family status. Regression graphs are a subclass of the maximal ancestral graphs of Richardson and Spirtes (2002) and these are a subclass of the summary graphs of Wermuth (2011). The two types are called corresponding graphs if they result after marginalising over a node set m and conditioning on a disjoint node set c from a given directed acyclic graph. Both are independence-preserving graphs in the sense that they give the independence structure implied by the generating graph for the remaining nodes. The summary graph additionally permits tracing possible distortions of generating dependences as they arise in conditional associations among the remaining variables, for instance in parameters of the ma... |

13 | Estimation of a covariance matrix with zeros. Biometrika 94 - Chaudhuri, Drton, et al. - 2007 |

13 |
Interactions in multi-factor contingency tables.
- Darroch
- 1962
Citation Context ...for its endpoints by conditioning on the inner node and every transmitting V is association-inducing by marginalising over the inner node, just as in a regular joint Gaussian distribution; see as examples the recursive relations (6), (7), (8). With distributions generated over parent graphs, one excludes incomplete families of distributions; see Lehmann and Scheffé (1955), Brown (1986), Mandelbaum and Rüschendorf (1987), in which independence statements connected with a V may have the inner node both within and outside the conditioning set; see Wermuth (2011), Wermuth and Cox (2004), Darroch (1962). Such independences have been characterized as being not representable in joint Gaussian distributions; see Lnenicka and Matus (2007). More generally, such independences cannot occur whenever the distribution is weakly transitive, that is, if, for i, k, l distinct nodes of N and m = N \ {i, k, l}, (i⊥k|l and i⊥k|{l, m}) ⟹ (i⊥m|l or k⊥m|l), or, with a dep b|c denoting dependence of Ya on Yb given Yc for disjoint subsets a, b, c of V and an edge-minimal graph, equivalently, (i dep m|l and k dep m|l and i⊥k|l) ⟹ i dep k|{l, m} and (i dep m|l and k dep m|l and i⊥k|{l, m}) ⟹ i dep k|l. Thus, t... |

13 |
Dinitz’ algorithm: the original version and Even’s version. In:
- Dinitz
- 2006
Citation Context ...ers. Therefore, an early result on parameter equivalence for joint Gaussian distributions implies the following Markov equivalence result for distributions satisfying both the composition and the intersection property. Proposition 6. (Jensen, 1988), (Drton and Richardson, 2008). A covariance graph is Markov equivalent to a concentration graph if and only if both consist of the same complete, disconnected subgraphs. Fast ways of inserting an edge for every transition V, of deciding on connectivity and on blocking flows have been available in the corresponding Russian literature since 1970; see Dinitz (2006), but these results appear not to have been exploited for the so-called lattice conditional independence models, recognized as distributions generated over GNdags without any transition Vs by Andersson, Madigan, Perlman and Triggs (1997). Results on Markov equivalence for chain graphs other than multivariate regression chain graphs have been given by Roverato (2005), Andersson and Perlman (2006) and Roverato and Studeny (2006). With the so-called global Markov property of a graph in node set N and any disjoint subsets a, b, c of N, one can decide whether the graph implies a⊥b|c. To give this property for a regr... |

13 | On Gaussian conditional independence structures. - Lnenicka, Matus - 2007 |

11 | A graphical characterization of lattice conditional independence models. - Andersson, Madigan, et al. - 1997 |

11 | Likelihood factorizations for mixed discrete and continuous variables - Cox, Wermuth - 1999 |

11 | Ignorable common information, null sets and Basu’s first theorem. - Martín, Mouchart, et al. - 2005 |

11 |
On the identification of path analysis models with one hidden variable.
- Stanghellini, Wermuth
- 2005
Citation Context ... belong to the framework of SEMs even though this general class had been developed in econometrics to deal with endogenous responses, defined by the existence of correlations between the residuals and some regressors. For endogenous responses, the equation parameters are no longer measures of conditional dependence, as they are in linear least-squares regression models. Estimation methods for SEMs were discussed in the Berkeley symposia on mathematical statistics and probability from 1945 to 1965, but some issues still need to be settled; see for instance Drton, Eichler and Richardson (2009), Stanghellini and Wermuth (2005). In statistical models that treat all variables on equal standing, the variables are not assigned roles of responses or regressors and undirected measures of association are used instead of coefficients of dependence. In the concentration graph models, the undirected associations are conditional given all remaining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gau... |

10 | Triangular systems for symmetric binary variables. - Wermuth, Marchetti, et al. - 2009 |

9 | Markov Properties for Linear Causal Models with Correlated Errors.
- Kang, Tian
- 2009
Citation Context ...n edge between a and b if there is an edge with one node in a and the other node in b. In this case, the graph Ga∪bcov is connected in a and b, otherwise the graphs in a and b are disconnected. Corollary 1. A covariance graph, GNcov, or a concentration graph, GNcon, implies a⊥b if and only if in the subgraph induced by a ∪ b, the graphs in a and b are disconnected. It can be shown that the independence structure of a regression graph is fully specified by the pairwise independences (2) of each missing edge if both properties (v) and (vi) hold in addition to the standard ones; see also Kang and Tian (2009), Pearl and Paz (1987), Marchetti and Lupparelli (2010) for relevant, previous special results. Lemma 1. A regression graph, GNreg, captures an independence structure for a distribution with density fN factorizing as (1) if properties (i) to (vi) hold for fN. Proof. The first four properties hold for any density fN. Given the intersection property (vi), any node i with missing edges to nodes k, l in a concentration graph of node set N implies i⊥{k, l}|N \ {i, k, l} and given the composition property (v), any node i with missing edges to nodes k, l in a covariance graph given Yc implies i⊥{k... |

9 | Collapsibility of distribution dependence. - Ma, Xie, et al. - 2006 |

9 | A graphical representation of equivalence classes of AMP chain graphs. - Roverato, Studeny - 2006 |

8 |
An incomplete data approach to the analysis of covariance structures.
- Kiiveri
- 1987
Citation Context ... measures of association are used instead of coefficients of dependence. In the concentration graph models, the undirected associations are conditional given all remaining variables on equal standing. For instance, for categorical variables, these models are better known as graphical log-linear models; see Birch (1963), Caussinus (1966), Goodman (1970), Bishop, Fienberg and Holland (1975), Wermuth (1976a), Darroch, Lauritzen and Speed (1980). For Gaussian random variables, these had been introduced as covariance selection models; see Dempster (1972), Wermuth (1976b), Speed and Kiiveri (1986), Kiiveri (1987), Drton and Perlman (2004), and for mixed variables as graphical models for conditional Gaussian (CG) distributions; see Lauritzen and Wermuth (1989), Edwards (2000). For a mean-centered vector variable Y, the elements of the covariance matrix Σ are σij = E(YiYj). If Σ is invertible, the covariances σij are in a one-to-one relation with the concentrations σ^{ij}, the elements of the concentration matrix Σ^{−1}. There is a recursive relation for concentrations; see Dempster (1969). For a trivariate distribution σ^{23.1} = σ^{23} − σ^{12}σ^{13}/σ^{11}, (7) where σ^{23.1} denotes the concentration of Y2, Y3 in the... |

7 | Half-trek criterion for generic identifiability of linear structural equation models - Foygel, Draisma, et al. - 2012 |

7 |
Childhood adversities and suicide attempts: a retrospective study.
- Hardt, Sidor, et al.
- 2008
Citation Context ...ariables, also named the background variables. After statistical analyses, arrows may start from nodes within any block but always end at a node in one of the blocks in the future. Thus, there are no arrows pointing to context variables and all arrows point in the same direction, from left to right. An intermediate variable is a response to some variables and also explanatory for other variables so that it has both incoming and outgoing arrows in the regression graph. As an example, we take data from a retrospective study with 283 adult females answering questions about their childhood when visiting their general practitioner, mostly for some minor health problems; see Hardt et al. (2008). A well-fitting graph is shown in Figure 1. It contains two binary variables, A, B, and six quantitative variables. Except for the directly recorded feature age in years, all other variables are derived from answers to questionnaires, coded so that high values correspond to high scores. The three blocks a, b, c reflect here a time-ordering of vector variables, Ya, Yb, Yc with Ya representing the joint response of primary interest, Yb an intermediate vector variable and Yc a context vector variable. The three individ... |

7 | Marginal log-linear parameterization of conditional independence models. - Rudas, Bergsma, et al. - 2010 |

7 |
Probability models with summary graph structure.
- Wermuth
- 2011
Citation Context ...wo Markov equivalent graphs to the one of Yb, Yc of Figure 1. From knowing the Markov equivalence to the graph in Figure 2a), the joint response model for Yb given Ya may also be fitted in terms of univariate regressions and from the Markov equivalence to the graph in Figure 2b), one knows for instance directly, using Proposition 1 below, that sexual abuse is independent of age and schooling given knowledge about family distress and family status. Regression graphs are a subclass of the maximal ancestral graphs of Richardson and Spirtes (2002) and these are a subclass of the summary graphs of Wermuth (2011). The two types are called corresponding graphs if they result after marginalising over a node set m and conditioning on a disjoint node set c from a given directed acyclic graph. Both are independence-preserving graphs in the sense that they give the independence structure implied by the generating graph for the remaining nodes. The summary graph additionally permits tracing possible distortions of generating dependences as they arise in conditional associations among the remaining variables, for instance in parameters of the maximal ancestral graph models. In the following Section 2, we intr... |

6 | Characterizing Markov equivalence classes for AMP chain graph.
- Andersson, Perlman
- 2006
Citation Context ...h consist of the same complete, disconnected subgraphs. Fast ways of inserting an edge for every transition V, of deciding on connectivity and on blocking flows have been available in the corresponding Russian literature since 1970; see Dinitz (2006), but these results appear not to have been exploited for the so-called lattice conditional independence models, recognized as distributions generated over GNdags without any transition Vs by Andersson, Madigan, Perlman and Triggs (1997). Results on Markov equivalence for chain graphs other than multivariate regression chain graphs have been given by Roverato (2005), Andersson and Perlman (2006) and Roverato and Studeny (2006). With the so-called global Markov property of a graph in node set N and any disjoint subsets a, b, c of N, one can decide whether the graph implies a⊥b|c. To give this property for a regression graph, we use a special type of path that has been called active; see Wermuth (2011). For this, let again {a, b, c, m} partition the node set N of GNreg. Definition 1. A path from a to b in GNreg is active given c if its inner collision nodes are in c or have a descendant in c and its inner transmitting nodes are in m = N \ (a ∪ b ∪ c). Otherwise, the path is said to br... |

5 | Complete and symmetrically complete families of distributions - Mandelbaum, Rüschendorf - 1987 |

5 | Representing modified independence structures - Sadeghi - 2009 |

4 | Parameterization and fitting of discrete bi-directed graph models. - Lupparelli, Marchetti, et al. - 2009 |

4 | Changing parameters by partial mappings.
- Wiedenbeck, Wermuth
- 2010
Citation Context ... endpoints of any V in a trivariate Gaussian distribution. In the context of multivariate exponential families of distributions, concentrations are special canonical parameters and covariances are special moment parameters with estimates of canonical and moment parameters being asymptotically independent; see Barndorff-Nielsen (1978). Regression graphs capture independence structures for more general types of distribution, where operators for transforming graphs mimic operators for transforming different parametrisations of joint Gaussian distributions; see Wermuth, Wiedenbeck and Cox (2006), Wiedenbeck and Wermuth (2010), Wermuth (2011). In particular, by removing an edge from any V of a regression graph, one introduces an additional independence constraint just as in a regular joint Gaussian distribution. This requires that the generated distributions satisfy the composition and intersection property in addition to general properties, as discussed in the next section. 5 Using graphs to combine independence statements We now state the four standard properties of independences of any multivariate distribution; see e.g. Dawid (1979), Studeny (2005), as well as two special properties of joint Gaussian distri... |

3 |
Bedeutung der Lokalisation für die Entwicklung und Behandlung chronischer Schmerzen.
- Kappesser
- 1997
Citation Context ... graphs, (i) is also replaced by (i′) and (iii) by (iii′) i⊥k|g>j−1 \ {i, k} for i in gj with j ≤ r and k in g>j. Not yet systematically approached is the search for simple covering models that capture most but not all independences, but for regression graphs see Propositions 8 to 10. Often, a covering model may be easier to fit than a reduced edge-minimal model; see Cox and Wermuth (1990) and the discussion of Figures 16 and 17 in Section 7. Before we discuss the meaning of different types of missing edges for linear models in more detail, we derive a well-fitting regression graph for data given by Kappesser (1997). 3 Deriving and interpreting a regression graph For 201 chronic pain patients, the role of the site of pain during a three-week stay in a chronic pain clinic was to be examined. In this study, it was of main interest to investigate the changes in two main symptoms and to understand determinants of the overall treatment success as rated by the patients, three months after they had left the clinic. Figure 7 shows a first ordering of the variables derived in discussions between psychologists, physicians and statisticians, which shows only those variables that remained relevant after statistical ... |

3 |
Flat and multimodal likelihoods and model lack of fit in curved exponential families.
- Sundberg
- 2010
Citation Context ...dual component variables of the response vector form the set of joint responses. Maximum-likelihood estimation of regression coefficients for a joint Gaussian distribution reduces to linear least-squares fitting for each component separately; see Anderson (1958). With different sets of regressors for the components of a vector response, seemingly unrelated regressions (SUR) result and iterative methods are needed for estimation; see Zellner (1962). For small sample sizes, a given solution of the likelihood equations of a Gaussian SUR model may not be unique; see Drton and Richardson (2004), Sundberg (2010), while for exclusively discrete variables this will never happen; see Drton (2009). For mixed variables, no corresponding results are available yet. But in general, there often exists a covering model with nice estimation properties. For instance for the SUR model with regression graph ◦ ≻◦ ◦ ≺ ◦ , a general linear model with two independent regressors is a simple covering model. For a vector variable of categorical responses only, the multivariate logistic regression of Glonek and McCullagh (1995) reduces to separate main effect logistic regressions for each component of the response vector ... |

2 | A characterization of moral transitive acyclic directed graph Markov models as labeled trees. - Castelo, Siebes - 2003 |

2 | Markov properties of mixed loopless graphs - Sadeghi, Lauritzen - 2011 |

1 |
A unified approach to the characterisation of Markov equivalence classes of directed acyclic graphs, chain graphs with no flags and chain graphs.
- Roverato
- 2005
Citation Context ...f and only if both consist of the same complete, disconnected subgraphs. Fast ways of inserting an edge for every transition V, of deciding on connectivity and on blocking flows have been available in the corresponding Russian literature since 1970; see Dinitz (2006), but these results appear not to have been exploited for the so-called lattice conditional independence models, recognized as distributions generated over GNdags without any transition Vs by Andersson, Madigan, Perlman and Triggs (1997). Results on Markov equivalence for chain graphs other than multivariate regression chain graphs have been given by Roverato (2005), Andersson and Perlman (2006) and Roverato and Studeny (2006). With the so-called global Markov property of a graph in node set N and any disjoint subsets a, b, c of N, one can decide whether the graph implies a⊥b|c. To give this property for a regression graph, we use a special type of path that has been called active; see Wermuth (2011). For this, let again {a, b, c, m} partition the node set N of GNreg. Definition 1. A path from a to b in GNreg is active given c if its inner collision nodes are in c or have a descendant in c and its inner transmitting nodes are in m = N \ (a ∪ b ∪ c). Oth... |