## Estimating Heterogeneous Treatment Effects in the Presence of Self-Selection: A Propensity Score Perspective

### Citations

1087 |
Identification and Estimation of Local Average Treatment Effects.
- Imbens, Angrist
- 1994
Citation Context ...when treatment effect is heterogeneous, IV and RD design can only identify the average causal effect among individuals whose treatment status is influenced by the IV or passage of the “cutoff point” (Imbens and Angrist 1994; Hahn, Todd, and Van der Klaauw 2001). Similarly, fixed effects models can only identify the average causal effect among individuals who change their treatment status over the study period. The second... |

1023 | Identification of Causal Effects Using Instrumental Variables. - Angrist, Imbens, et al. - 1996 |

736 | Statistics and causal inference.
- Holland
- 1986
Citation Context ...istical methods designed for drawing causal inferences can estimate causal effects only at an aggregate level while overlooking within-group, individual-level heterogeneity (Angrist and Krueger 1999; Holland 1986; Morgan and Winship 2007; Xie 2007, 2013). Moreover, when treatment effects vary systematically by treatment status, the average difference in outcome between the treated and untreated units will be ... |

580 |
Dummy Endogenous Variables in a Simultaneous Equation System. Econometrica.
- Heckman
- 1978
Citation Context ...le, using the sampleSelection package in R; see Toomet and Henningsen 2008). This model specification has a long history in econometrics and is usually called the “normal switching regression model” (Heckman 1978; see Winship and Mare 1992 for a review). With the joint normality assumption, the expression for the MTE in equation (12) reduces to $\mathrm{MTE}(x, u_D) = (\beta_1 - \beta_0)'x + \frac{\sigma_{\eta V}}{\sigma_V}\Phi^{-1}(u_D)$, where $\sigma_{\eta V}$ is the covariance between $\eta$ an... |
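The closed-form expression quoted in this context can be evaluated directly. The sketch below is illustrative only: the function name `mte_normal`, its argument names, and the example parameter values are assumptions, not taken from the cited paper, and it presumes the parameters have already been estimated (for example by maximum likelihood).

```python
from statistics import NormalDist


def mte_normal(x, u_d, beta1, beta0, sigma_eta_v, sigma_v):
    """Evaluate MTE(x, u_D) under joint normality of the error terms.

    Implements (beta1 - beta0)'x + (sigma_etaV / sigma_V) * Phi^{-1}(u_D).
    All names here are hypothetical and chosen for illustration.
    """
    if not 0.0 < u_d < 1.0:
        raise ValueError("u_d must lie strictly between 0 and 1")
    if sigma_v <= 0.0:
        raise ValueError("sigma_v is a standard deviation and must be positive")
    # Observed part: treatment effect heterogeneity by covariates x.
    observed_part = sum((b1 - b0) * xi for b1, b0, xi in zip(beta1, beta0, x))
    # Unobserved part: heterogeneity by latent resistance to treatment u_D.
    unobserved_part = (sigma_eta_v / sigma_v) * NormalDist().inv_cdf(u_d)
    return observed_part + unobserved_part


# Made-up parameter values, for illustration only.
example = mte_normal(x=[1.0, 2.0], u_d=0.5, beta1=[0.5, 0.3],
                     beta0=[0.2, 0.1], sigma_eta_v=-0.4, sigma_v=1.0)
```

At $u_D = 0.5$ the unobserved term vanishes because $\Phi^{-1}(0.5) = 0$, so the value reduces to $(\beta_1 - \beta_0)'x$. With a negative $\sigma_{\eta V}$, as in the made-up values above, the function declines in $u_D$.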

334 | Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference.
- Ho, Imai, et al.
- 2007
Citation Context ... traditional methods for causal inference from observational data can be divided into two classes, as shown in the first row of Table 4. The first class, including covariate adjustment, matching (see Ho et al. 2007), and inverse-probability-of-treatment weighting (Robins, Hernan, and Brumback 2000), rely on the assumption of ignorability: after control of a given set of observed confounders, treatment status is... |

301 | Identification and estimation of treatment effects with a regression-discontinuity design. - Hahn, Todd, et al. - 2001 |

278 |
Some Thoughts on the Distribution of Earnings. Oxford Economic Papers.
- Roy
- 1951
Citation Context ...hen individuals (or their agents) possess more knowledge than the researcher about their individual-specific gains from treatment and act on it (Bjorklund and Moffitt 1987; Heckman and Vytlacil 2007; Roy 1951). The bias associated with this type of selection has been termed treatment-effect heterogeneity bias or Type II selection bias (Xie et al. 2012; Zhou and Xie 2014). For example, research considering... |

186 |
Local Instrumental Variables and Latent Variable Models for Identifying and Bounding Treatment Effects.
- Heckman, Vytlacil
- 1999
Citation Context ...ch, developed by James Heckman and his colleagues, accommodates both types of unobserved selection through the use of a latent index model for treatment assignment (Heckman, Urzua, and Vytlacil 2006; Heckman and Vytlacil 1999, 2001a, 2005). Under this model, all of the treatment effect heterogeneity that is relevant for self-selection is captured in the marginal treatment effect (MTE), a function defined as the conditio... |

180 |
Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment.
- Thistlethwaite, Campbell
- 1960
Citation Context ...confounders, treatment status is independent of both baseline outcomes and treatment effects. The second class of methods, including instrumental variables (IV), regression discontinuity (RD) design (Thistlethwaite and Campbell 1960), and fixed effects models, allow for unobserved selection into treatment but require some exogenous variation in treatment status, either between or within units, to identify causal effects (see Gangl... |

174 |
Counterfactuals and Causal Inference: Methods and Principles for Social Research. New York:
- Morgan, Winship
- 2007
Citation Context ...s designed for drawing causal inferences can estimate causal effects only at an aggregate level while overlooking within-group, individual-level heterogeneity (Angrist and Krueger 1999; Holland 1986; Morgan and Winship 2007; Xie 2007, 2013). Moreover, when treatment effects vary systematically by treatment status, the average difference in outcome between the treated and untreated units will be a biased estimate of the ... |

157 | Understanding instrumental variables in models with essential heterogeneity.
- Heckman, Urzua, et al.
- 2006
Citation Context ... we can assume $Y_0 = \mu_0(X) + e$ and $Y_1 = \mu_1(X) + e + \eta$ for any given functions $\mu_0(X)$ and $\mu_1(X)$. In fact, the separability between $X$ and the error terms is not required for identification of the MTE (see Heckman et al. 2006). Now consider a latent index model for selection into treatment. Let $I_D$ be the latent tendency for treatment, which depends on both observed ($Z$) and unobserved ($V$) factors: $I_D = \gamma' Z - V$ (6), $D = 1$... |

157 | Econometric Evaluation of Social Programs, Part III: Distributional Treatment Effects, Dynamic Treatment Effects, Dynamic Discrete Choice,
- Abbring, Heckman
- 2007
Citation Context ...ovariates. This is likely when individuals (or their agents) possess more knowledge than the researcher about their individual-specific gains from treatment and act on it (Bjorklund and Moffitt 1987; Heckman and Vytlacil 2007; Roy 1951). The bias associated with this type of selection has been termed treatment-effect heterogeneity bias or Type II selection bias (Xie et al. 2012; Zhou and Xie 2014). For example, research c... |

133 |
The estimation of causal effects from observational data.
- Winship, Morgan
- 1999
Citation Context ...effects vary systematically by treatment status, the average difference in outcome between the treated and untreated units will be a biased estimate of the average treatment effect in the population (Winship and Morgan 1999). Depending on data and assumptions about how individuals select into treatment, three major approaches have been proposed to study heterogeneous treatment effects. First, we can simply include inter... |

126 |
The shape of the river.
- Bowen, Bok
- 1998
Citation Context ... exhibited higher returns to college than traditional students did, these findings were mostly not statistically significant and susceptible to uncontrolled selection biases (Attewell and Lavin 2007; Bowen and Bok 1998; Dale and Krueger 2011; Maurin and McNally 2008; see Hout 2012 for a review). Nonetheless, if the downward slope in the second component was sufficiently strong, MPRTE(p) would also decline with p. I... |

66 |
Estimating Causal Effects of Treatments
- Rubin
- 1974
Citation Context ...ds on the generalized Roy model for discrete choices that depend partly on potential outcomes (Heckman and Vytlacil 2007; Roy 1951). Following the counterfactual framework of causality (Holland 1986; Rubin 1974), let us consider two potential outcomes, $Y_1$ and $Y_0$, a binary indicator $D$ for treatment status, and a vector of observed variables $X$ that are determined prior to treatment. $Y_1$ denotes the potential o... |

65 |
Self-Selection and the Earnings of Immigrants. American Economic Review
- Borjas
- 1987
Citation Context ... from attending college (Carneiro, Heckman, and Vytlacil 2011; Moffitt 2008; Willis and Rosen 1979). Similar patterns of self-selection have been observed in a variety of contexts, such as migration (Borjas 1987), secondary schooling tracking (Gamoran and Mare 1989), career choice (Sakamoto and Chen 1991), and marriage dissolution (Smock, Manning, and Gupta 1999). The third approach, developed by James Heckm... |

61 |
Models for Sample Selection Bias. Annual Review of Sociology 18:327–50.
- Winship, Mare
- 1992
Citation Context ...leSelection package in R; see Toomet and Henningsen 2008). This model specification has a long history in econometrics and is usually called the “normal switching regression model” (Heckman 1978; see Winship and Mare 1992 for a review). With the joint normality assumption, the expression for the MTE in equation (12) reduces to $\mathrm{MTE}(x, u_D) = (\beta_1 - \beta_0)'x + \frac{\sigma_{\eta V}}{\sigma_V}\Phi^{-1}(u_D)$, where $\sigma_{\eta V}$ is the covariance between $\eta$ and $V$, $\sigma_V$ is the standard dev... |

59 |
An investigation into the robustness of the Tobit estimator to nonnormality,
- Arabmazar, Schmidt
- 1982
Citation Context ...D. The joint normality of error terms, however, is not a verifiable assumption. When errors are not normally distributed, the normality assumption may lead to substantial bias in parameter estimates (Arabmazar and Schmidt 1982). To ameliorate this problem, Heckman et al. (2006) introduced a semiparametric method for estimating the MTE. To understand this method, let us first write out the expectation of the observed outcom... |

34 |
The Estimation of Wage Gains and
- Björklund, Moffitt
- 1987
Citation Context ...s not captured by observed covariates. This is likely when individuals (or their agents) possess more knowledge than the researcher about their individual-specific gains from treatment and act on it (Bjorklund and Moffitt 1987; Heckman and Vytlacil 2007; Roy 1951). The bias associated with this type of selection has been termed treatment-effect heterogeneity bias or Type II selection bias (Xie et al. 2012; Zhou and Xie 201... |

30 |
Who Benefits Most from College? Evidence for Negative Selection in Heterogeneous Economic Returns to Higher Education.
- Brand, Xie
- 2010
Citation Context ...approach for studying treatment effect heterogeneity, by examining how treatment effect varies according to the propensity score, i.e., the probability of treatment given a set of observed covariates (Brand and Xie 2010; Xie, Brand, and Jann 2012). The methodological rationale of this approach is that under the assumption of ignorability, the interaction between treatment status and the propensity score captures all... |

22 |
Passing the Torch: Does Higher Education for the Disadvantaged Pay Off Across the Generations?
- Attewell, Lavin
- 2007
Citation Context ...m less-educated families, exhibited higher returns to college than traditional students did, these findings were mostly not statistically significant and susceptible to uncontrolled selection biases (Attewell and Lavin 2007; Bowen and Bok 1998; Dale and Krueger 2011; Maurin and McNally 2008; see Hout 2012 for a review). Nonetheless, if the downward slope in the second component was sufficiently strong, MPRTE(p) would al... |

22 | Independence, Monotonicity, and Latent Index Models: An Equivalence Result. Econometrica - Vytlacil - 2002 |

21 |
Vive la Révolution! Long-Term Educational Returns of 1968 to the Angry Students.
- Maurin, McNally
- 2008
Citation Context ...raditional students did, these findings were mostly not statistically significant and susceptible to uncontrolled selection biases (Attewell and Lavin 2007; Bowen and Bok 1998; Dale and Krueger 2011; Maurin and McNally 2008; see Hout 2012 for a review). Nonetheless, if the downward slope in the second component was sufficiently strong, MPRTE(p) would also decline with p. In that case, we would (paradoxically) observe a ... |

18 | Bayesian nonparametric modeling for causal inference. - Hill - 2011 |

17 |
Evaluating marginal policy changes and the average effect of treatment for individuals at the margin.
- Carneiro, Heckman, et al.
- 2010
Citation Context ...en used to construct policy-relevant causal effects, the current practice is to evaluate policy effectiveness by estimating the average treatment effect for all marginal entrants to treatment (see Carneiro et al. 2010, 2011). In doing so, policy implications of potential heterogeneity in treatment effects among individuals at the margin are left unexplored. As we will see later, ignoring treatment effect heterogen... |

13 |
Local polynomial modelling and its applications, Volume 66
- Fan, Gijbels
- 1996
Citation Context ...imate the parametric part of equation (15), i.e., $(\beta_0, \beta_1 - \beta_0)$, and store the remaining variation of $Y$ as $R^*_Y = Y - \hat{\beta}_0' x - (\hat{\beta}_1 - \hat{\beta}_0)' x p$. 3. Regress $R^*_Y$ on $p$ using a local polynomial regression (Fan and Gijbels 1996) to estimate the third term in equation (15) and its derivative $E(\eta \mid V = F_V^{-1}(p))$. 4. Construct $\mathrm{MTE}(x, u_D)$, using $\hat{\beta}_1 - \hat{\beta}_0$ from step 2 and the estimates of $E(\eta \mid V = F_V^{-1}(u_D))$ from step 3. Since this ... |
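Step 3 in this context can be illustrated with a minimal local linear fit, the simplest case of local polynomial regression. This is a sketch under stated assumptions rather than the cited procedure: the Gaussian kernel, the polynomial degree of one, the bandwidth argument, and the name `local_linear_fit` are all choices made here for illustration. The fitted slope at each evaluation point is the estimated derivative of the residual function.

```python
import math


def local_linear_fit(p, r, p0, bandwidth):
    """Kernel-weighted linear fit of r on p, centered at p0.

    Returns (level, slope): the fitted value and first derivative at p0.
    Uses a Gaussian kernel; kernel and bandwidth are illustrative choices.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for p_i, r_i in zip(p, r):
        d = p_i - p0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * r_i
        t1 += w * d * r_i
    # Solve the 2x2 weighted least-squares normal equations in closed form.
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("not enough variation in p near p0")
    level = (s2 * t0 - s1 * t1) / det
    slope = (s0 * t1 - s1 * t0) / det
    return level, slope


def derivative_curve(p, r, grid, bandwidth):
    """Estimate the derivative of E[r | p] at each point of a grid."""
    return [local_linear_fit(p, r, g, bandwidth)[1] for g in grid]
```

Here `p` would hold estimated propensity scores and `r` the residualized outcome from step 2. In practice the bandwidth needs to be chosen with care, and a higher polynomial degree is often preferred when the derivative is the target.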

12 | Estimating Marginal Treatment Effects in Heterogeneous Populations. Annales d’Économie et de Statistique, 91-92: 239–262. Special Issue on Econometrics of Evaluation
- Moffitt
- 2008
Citation Context ...ns to schooling has argued that college education was selective because it disproportionately attracted young persons who would gain more from attending college (Carneiro, Heckman, and Vytlacil 2011; Moffitt 2008; Willis and Rosen 1979). Similar patterns of self-selection have been observed in a variety of contexts, such as migration (Borjas 1987), secondary schooling tracking (Gamoran and Mare 1989), career ... |

12 |
Estimating Heterogeneous Treatment Effects with Observational Data. Sociological Methodology
- Xie, Brand, et al.
- 2012
Citation Context ... treatment, traditional regression and matching methods would lead to biased estimates of average causal effects. This bias is usually called pretreatment heterogeneity bias or Type I selection bias (Xie et al. 2012; Zhou and Xie 2014). As Breen, Choi, and Holm (2015) show, this type of selection could easily contaminate estimates of heterogeneous treatment effects by observed covariates or the propensity score.... |

8 | Local Instrumental Variables. Nonlinear Statistical Modeling - Heckman, Vytlacil - 2001a |

8 | Policy-Relevant Treatment Effects. The American Economic Review 91(2 - Heckman, Vytlacil - 2001 |

8 | Sample selection models in R: Package sampleSelection
- Toomet, Henningsen
- 2008
Citation Context ...ations (1), (2), (6), and (7) is fully parameterized, and the unknown parameters $(\beta_1, \beta_0, \gamma, \Sigma)$ can be jointly estimated via maximum likelihood (for example, using the sampleSelection package in R; see Toomet and Henningsen 2008). This model specification has a long history in econometrics and is usually called the “normal switching regression model” (Heckman 1978; see Winship and Mare 1992 for a review). With the joint norm... |

7 | Effect Heterogeneity and Bias in Main-Effects-Only Regression Models. Pp. 327-36 in Heuristics, Probability and Causality: A Tribute to Judea Pearl
- Elwert, Winship
- 2010
Citation Context ...annot recover standard causal parameters such as ATE or TT, but instead estimate a conditional-variance-weighted causal effect that is of little substantive meaning (Morgan and Winship 2007; see also Elwert and Winship 2010). Moreover, it is widely known that when treatment effect is heterogeneous, IV and RD design can only identify the average causal effect among individuals whose treatment status is influenced by the ... |

5 |
Estimating Marginal Returns to Education. American Economic Review
- Carneiro, Heckman, et al.
- 2011
Citation Context ...e conditional independence between $(e, \eta, V)$ and $Z$ given $X$ is required (Heckman et al. 2006). However, without the joint independence assumption, estimation of the MTE is practically challenging (see Carneiro et al. 2011). (Footnote 3) In the classic Roy model (Borjas 1987; Roy 1951), $I_D = Y_1 - Y_0$. In that case, $V = -\eta$. ... $X = x$ and the normalized latent variable $U_D = u_D$: $\mathrm{MTE}(x, u_D) = E(Y_1 - Y_0 \mid X = x, U_D = u_D)$ (10) $= E[(\beta_1 - \beta_0)'X$ ... |

3 |
Inequality and Attainment in a Dual Labor Market. American Sociological Review 56:295–308
- Sakamoto, Chen
- 1991
Citation Context ... and Rosen 1979). Similar patterns of self-selection have been observed in a variety of contexts, such as migration (Borjas 1987), secondary schooling tracking (Gamoran and Mare 1989), career choice (Sakamoto and Chen 1991), and marriage dissolution (Smock, Manning, and Gupta 1999). The third approach, developed by James Heckman and his colleagues, accommodates both types of unobserved selection through the use of a la... |

3 | Otis Dudley Duncan’s Legacy: The Demographic Approach to Quantitative Reasoning |

1 |
volume 3A, chapter 23
- Angrist, Krueger
- 1999
Citation Context ...ct heterogeneity, all statistical methods designed for drawing causal inferences can estimate causal effects only at an aggregate level while overlooking within-group, individual-level heterogeneity (Angrist and Krueger 1999; Holland 1986; Morgan and Winship 2007; Xie 2007, 2013). Moreover, when treatment effects vary systematically by treatment status, the average difference in outcome between the treated and untreated ... |