Controlling Selection Bias in Causal Inference
, 2012
Cited by 21 (12 self)
Selection bias, caused by preferential exclusion of samples from the data, is a major obstacle to valid causal and statistical inferences; it cannot be removed by randomized experiments and can hardly be detected in either experimental or observational studies. This paper highlights several graphical and algebraic methods capable of mitigating and sometimes eliminating this bias. These nonparametric methods generalize previously reported results, and identify the type of knowledge that is needed for reasoning in the presence of selection bias. Specifically, we derive a general condition together with a procedure for deciding recoverability of the odds ratio (OR) from s-biased data. We show that recoverability is feasible if and only if our condition holds. We further offer a new method of controlling selection bias using instrumental variables that permits the recovery of other effect measures besides OR.
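The simplest instance of odds-ratio recoverability can be checked by hand. The sketch below is an illustration only, not the paper's general condition or procedure: it assumes a binary exposure X, a binary outcome Y, and a selection indicator S, and all probabilities and function names are made up for the example. When the chance of being selected depends on Y alone, the odds ratio computed from the selected samples equals the population odds ratio; when selection depends on both X and Y, it generally does not.

```python
def odds_ratio(p):
    """Odds ratio of a 2x2 table indexed as p[x][y]."""
    return (p[1][1] * p[0][0]) / (p[1][0] * p[0][1])


def select(p, keep):
    """Distribution of (X, Y) among selected units, where keep[x][y] = P(S=1 | x, y)."""
    weighted = [[p[x][y] * keep[x][y] for y in (0, 1)] for x in (0, 1)]
    total = sum(sum(row) for row in weighted)
    return [[weighted[x][y] / total for y in (0, 1)] for x in (0, 1)]


# Hypothetical population distribution P(X=x, Y=y).
population = [[0.40, 0.10],
              [0.20, 0.30]]

# Selection driven by the outcome only (same row for x=0 and x=1).
keep_y_only = [[0.1, 0.9],
               [0.1, 0.9]]

# Selection driven by both exposure and outcome.
keep_x_and_y = [[0.1, 0.9],
                [0.5, 0.2]]

or_population = odds_ratio(population)
or_y_only = odds_ratio(select(population, keep_y_only))
or_x_and_y = odds_ratio(select(population, keep_x_and_y))

print(or_population, or_y_only, or_x_and_y)
```

The outcome-only selection weights cancel in the cross-product ratio, which is why the first two numbers agree while the third differs.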
Two algorithms for fitting constrained marginal models
 Computational Statistics and Data Analysis
, 2013
Log-mean linear models for binary data
 Biometrika
, 2013
Cited by 3 (2 self)
This paper is devoted to the theory and application of a novel class of models for binary data, which we call log-mean linear (LML) models. The characterizing feature of these models is that they are specified by linear constraints on the LML parameter, defined as a log-linear expansion of the mean parameter of the multivariate Bernoulli distribution. We show that marginal independence relationships between variables can be specified by setting certain LML interactions to zero and, more specifically, that graphical models of marginal independence are LML models. LML models are code dependent, in the sense that they are not invariant with respect to relabelling of variable values. As a consequence, they allow us to specify submodels defined by code-specific independencies, which are independencies in subpopulations of interest. This special feature of LML models has useful applications. Firstly, it provides a flexible way to specify parsimonious submodels of marginal independence models. The main advantage of this approach concerns the interpretation of the submodel, which is fully characterized by independence relationships, either marginal or code-specific. Secondly, the code-specific nature of these models can be exploited to focus on a fixed, arbitrary, cell of the probability table and on the corresponding subpopulation. This leads to an innovative family of models, which we call pivotal code-specific LML models, that is especially useful when the interest of researchers is focused on a small subpopulation obtained by stratifying individuals according to some features. The application of LML models is illustrated on two datasets, one of which concerns the use of pivotal code-specific LML models in the field of personalized medicine.
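The link between LML interactions and marginal independence can be illustrated for two binary variables. The sketch below rests on an assumption not spelled out in the abstract: that the LML parameter of a subset D is the alternating sum over subsets E of D of (-1)^|D\E| log mu_E, where mu_E = P(X_i = 1 for all i in E). The function names and the example distributions are hypothetical.

```python
from itertools import combinations, product
from math import log


def subsets(items):
    """Yield every subset of items as a frozenset, smallest first."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mean_parameter(pmf, n):
    """mu[D] = P(X_i = 1 for every i in D); mu[empty set] = 1."""
    return {
        d: sum(p for x, p in pmf.items() if all(x[i] == 1 for i in d))
        for d in subsets(range(n))
    }


def lml_parameter(mu, n):
    """Assumed form: gamma[D] = sum over E subset of D of (-1)^|D minus E| * log mu[E]."""
    return {
        d: sum((-1) ** len(d - e) * log(mu[e]) for e in subsets(sorted(d)))
        for d in subsets(range(n))
    }


# Independent pair: P(X0=1) = 0.3, P(X1=1) = 0.6.
marginals = (0.3, 0.6)
pmf_independent = {
    x: (marginals[0] if x[0] else 1 - marginals[0])
    * (marginals[1] if x[1] else 1 - marginals[1])
    for x in product((0, 1), repeat=2)
}

# Dependent pair: mu_0 = 0.5, mu_1 = 0.4, mu_01 = 0.3.
pmf_dependent = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

pair = frozenset({0, 1})
gamma_independent = lml_parameter(mean_parameter(pmf_independent, 2), 2)
gamma_dependent = lml_parameter(mean_parameter(pmf_dependent, 2), 2)

print(gamma_independent[pair], gamma_dependent[pair])
```

For two binary variables the pairwise term reduces to log(mu_01 / (mu_0 * mu_1)), so it vanishes exactly when the two variables are marginally independent.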
Marginal AMP Chain Graphs
 INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
, 2014
Cited by 2 (2 self)
We present a new family of models that is based on graphs that may have undirected, directed and bidirected edges. We name these new models marginal AMP (MAMP) chain graphs because each of them is Markov equivalent to some AMP chain graph under marginalization of some of its nodes. However, MAMP chain graphs subsume not only AMP chain graphs but also multivariate regression chain graphs. We describe global and pairwise Markov properties for MAMP chain graphs and prove their equivalence for compositional graphoids. We also characterize when two MAMP chain graphs are Markov equivalent. For Gaussian probability distributions, we also show that every MAMP chain graph is Markov equivalent to some directed acyclic graph with deterministic nodes under marginalization and conditioning on some of its nodes. This is important because it implies that the independence model represented by a MAMP chain graph can be accounted for by some data-generating process that is partially observed and has selection bias. Finally, we modify MAMP chain graphs so that they are closed under marginalization for Gaussian probability distributions. This is a desirable feature because it guarantees parsimonious models under marginalization.
Constraint-based causal discovery from multiple interventions over overlapping variable sets. arXiv:1403.2150
, 2014
Cited by 2 (0 self)
Scientific practice typically involves repeatedly studying a system, each time trying to unravel a different perspective. In each study, the scientist may take measurements under different experimental conditions (interventions, manipulations, perturbations) and measure different sets of quantities (variables). The result is a collection of heterogeneous data sets coming from different data distributions. In this work, we present algorithm COmbINE, which accepts a collection of data sets over overlapping variable sets under different experimental conditions; COmbINE then outputs a summary of all causal models indicating the invariant and variant structural characteristics of all models that simultaneously fit all of the input data sets. COmbINE converts estimated dependencies and independencies in the data into path constraints on the data-generating causal model and encodes them as a SAT instance. The algorithm is sound and complete in the sample limit. To account for conflicting constraints arising from statistical errors, we introduce a general method for sorting constraints in order of confidence, computed as a function of their corresponding p-values. In our empirical evaluation, COmbINE is more efficient than the only pre-existing similar algorithm; the latter additionally admits feedback cycles, but does not admit conflicting constraints, which hinders its applicability to real data. As a proof of concept, COmbINE is employed to co-analyze 4 real mass-cytometry data sets measuring phosphorylated protein concentrations of overlapping protein sets under 3 different interventions.
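The idea of ranking constraints by confidence and letting stronger ones win can be sketched in a few lines. This is a toy illustration, not COmbINE itself: the confidence score below (1 - p for a dependence claim, p for an independence claim) is a hypothetical stand-in for the function used in the paper, a "conflict" is reduced to two opposite claims about the same variable pair, and the SAT encoding is omitted entirely. All names are assumptions.

```python
from typing import NamedTuple


class Constraint(NamedTuple):
    x: str
    y: str
    dependent: bool  # True: "x and y are dependent"; False: "x and y are independent"
    p_value: float   # p-value of the independence test behind the claim


def confidence(c: Constraint) -> float:
    """Hypothetical score: small p supports dependence, large p supports independence."""
    return 1.0 - c.p_value if c.dependent else c.p_value


def resolve(constraints):
    """Accept constraints in decreasing confidence; reject any that contradicts an accepted one."""
    accepted = {}
    rejected = []
    for c in sorted(constraints, key=confidence, reverse=True):
        pair = frozenset((c.x, c.y))
        if pair in accepted and accepted[pair].dependent != c.dependent:
            rejected.append(c)
        else:
            accepted.setdefault(pair, c)
    return list(accepted.values()), rejected


# Made-up test results from two data sets that disagree about A and B.
constraints = [
    Constraint("A", "B", dependent=False, p_value=0.20),
    Constraint("A", "B", dependent=True, p_value=0.001),
    Constraint("B", "C", dependent=False, p_value=0.70),
]

accepted, rejected = resolve(constraints)
print(accepted)
print(rejected)
```

Processing in confidence order means a weakly supported claim can never displace a strongly supported one; only the ordering step reflects the abstract, and the rest is scaffolding for the example.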
Dichotomization invariant log-mean linear parameterization for discrete graphical models of marginal independence
, 2013
Alternative Markov and Causal Properties for Acyclic Directed Mixed Graphs
We extend Andersson-Madigan-Perlman chain graphs by (i) relaxing the semidirected acyclicity constraint so that only directed cycles are forbidden, and (ii) allowing up to two edges between any pair of nodes. We introduce global, and ordered local and pairwise Markov properties for the new models. We show the equivalence of these properties for strictly positive probability distributions. We also show that when the random variables are continuous, the new models can be interpreted as systems of structural equations with correlated errors. This enables us to adapt Pearl's do-calculus to them. Finally, we describe an exact algorithm for learning the new models from observational and interventional data via answer set programming.
Binary models of marginal independence: a comparison of different approaches
Log-mean linear models for binary data Alberto
, 2012
This paper introduces a novel class of models for binary data, which we call log-mean linear models. The characterizing feature of these models is that they are specified by linear constraints on the log-mean linear parameter, defined as a log-linear expansion of the mean parameter of the multivariate Bernoulli distribution. We show that marginal independence relationships between variables can be specified by setting certain log-mean linear interactions to zero and, more specifically, that graphical models of marginal independence are log-mean linear models. Our approach overcomes some drawbacks of the existing parameterizations of graphical models of marginal independence.