### Citations

1146 |
Data analysis using regression and multilevel/hierarchical models. New York:
- Gelman, Hill
- 2006
Citation Context ...h partially pools all the group-specific treatment effect estimates toward the overall mean, μ_τ, while allowing the group-specific intercept and treatment effect to co-vary (Bryk & Raudenbush, 2002; =-=Gelman and Hill, 2006-=-). As in the previous sections, the classical approach is mathematically equivalent to the hierarchical model in which σ_α → ∞ and σ_τ → ∞, which implies no pooling ... |

1111 |
Hierarchical Linear Models: Applications and Data Analysis Methods. Sage,
- Bryk, Raudenbush
- 1992
Citation Context ...rages, and we believe there is the potential to learn much more from data. There is a long history of the use of hierarchical models for estimating causal effects, especially in education statistics (=-=Bryk & Raudenbush, 2002-=-). One reason for a revival of this topic now is that statisticians are increasingly able and willing to fit complex regression models using regularization to handle large numbers of predictors and ar... |

1107 |
Estimating Causal Effects of Treatments in Randomized and Non-Randomized
- Rubin
- 1974
Citation Context ...prediction problem, with the effect of a specified treatment on a specified item being the difference between the predicted outcome conditional on the treatment. In the standard notation (Neyman, 1923; =-=Rubin, 1974-=-), for item i there is a treatment Zi that can equal 0 or 1, a set of pre-treatment predictors Xi, and potential outcomes Yi0 and Yi1 corresponding to what would be observed under one treatment or the... |

128 | Unified methods for censored longitudinal data and causality, - Laan, Robins - 2003 |

127 |
Principal stratification in causal inference.
- Frangakis, Rubin
- 2002
Citation Context ...ly-observed subgroups. This is especially promising for principal strata, subgroups defined by the joint distribution of intermediate outcomes, such as treatment take-up, under treatment and control (=-=Frangakis & Rubin, 2002-=-). Since many researchers already fit such models in a Bayesian framework (Hirano, Imbens, Rubin, & Zhou, 2000; Imbens & Rubin, 2015), it is straightforward to extend these models to a multi-level set... |

72 | What Mean Impacts Miss: Distributional Effects of Welfare Reform Experiments.
- Bitler, Gelbach, et al.
- 2003
Citation Context ...rces to a subset of the population (see, e.g., Dehejia, 2005; Imai & Strauss, 2011). Or policymakers are interested in the effects of a given intervention on the distribution of resources in society (=-=Bitler, Gelbach, & Hoynes, 2003-=-). In other settings, variation in a causal effect is itself of interest (Gelman & Huang, 2008). Even when not formally in the model, treatment effect variation is implicitly recognized. Once we start... |

58 |
Beyond linearity by default: Generalized additive models.
- Beck, Jackman
- 1998
Citation Context ... us that this relationship is linear, quadratic, or exponential—there is no strong reason ahead of time to believe that treatment effect is more likely to increase with income rather than log-income (=-=Beck & Jackman, 1998-=-). Some researchers seek to avoid this problem by discretizing their continuous variable, but this simply pushes the problem back to a specification search of a different kind, in which researchers fi... |

35 |
Bayesian Data Analysis. Boca
- Gelman
- 2004
Citation Context ...are predictive of treatment assignment—should be included in the analysis. This principle is supposed to be followed in classical design-based analyses as well as in model-based or Bayesian analyses (=-=Gelman et al., 2013-=-). Design information can be included in various ways, including survey weighting, regression modeling, and poststratification. Moreover, specific tools can be used and interpreted using different sta... |

33 | Strategies for improving precision in group-randomized experiments. Educational Evaluation and Policy Analysis,
- Raudenbush, Martinez, et al.
- 2007
Citation Context ... (Hill & Scott, 2009; Imai, King, & Nall, 2009), political advertising applied at the media market level (Green & Vavreck, 2007), and educational interventions at the classroom or whole-school level (=-=Raudenbush, Martinez, & Spybrook, 2007-=-). Extending hierarchical models to such experiments is simple—we use the same model as for a stratified experiment but include treatment assignment as a group-level rather than individual-level predi... |

30 | Mostly Harmless Econometrics: An Empiricist's Companion - Angrist, Pischke - 2008 |

26 | Heterogeneity of variance in experimental studies: A challenge to conventional interpretations. - Bryk, Raudenbush - 1988 |

18 |
Bayesian subset analysis
- Dixon, Simon
- 1991
Citation Context ... is especially problematic in the context of treatment effect interactions, implying that an interaction effect of zero is as likely as an arbitrarily large interaction—an obviously absurd statement (=-=Dixon & Simon, 1991-=-; Simon, 2002). Moreover, this no-pooling approach can prove especially problematic in the context of trying to estimate multiple weak signals. First, consider the issue of statistical power: imagine ... |

18 | Bayesian nonparametric modeling for causal inference. - Hill - 2011 |

17 |
Nonparametric applications of Bayesian inference.
- Chamberlain, Imbens
- 2003
Citation Context ...-weight infants, rather than on average weights. Modeling quantiles is often more challenging than modeling means. A growing literature has focused on this estimation challenge in a Bayesian context (=-=Chamberlain & Imbens, 2003-=-; Lancaster & Jun, 2009; Taddy & Kottas, 2010). Moreover, many of the hierarchical modeling approaches described above can be extended to quantile regression (Reich, Bondell, & Wang, 2010), simply rep... |

17 | Analysis of cluster-randomized experiments: A comparison of alternative estimation approaches.
- Green, Vavreck
- 2008
Citation Context ...common in the social sciences. Examples include public health interventions rolled out by city (Hill & Scott, 2009; Imai, King, & Nall, 2009), political advertising applied at the media market level (=-=Green & Vavreck, 2007-=-), and educational interventions at the classroom or whole-school level (Raudenbush, Martinez, & Spybrook, 2007). Extending hierarchical models to such experiments is simple—we use the same model as f... |

16 | The specification of the propensity score in multilevel observational studies (Working Paper No. 6). Carlo F. Dondena Centre for Research on Social Dynamics. Retrieved from http://www.dondena.unibocconi.it/wp6
- Arpino, Mealli
- 2008
Citation Context ...tic differences between treatment and control units (Imbens & Rubin, 2015). Just as with more complex experimental designs, propensity scores models can often be improved by a hierarchical structure (=-=Arpino & Mealli, 2011-=-). The econometrics literature has focused on an alternative approach for observational studies, especially when estimating causal effects for panel data. In t... |

16 | Estimating Incumbency Advantage and Its Variation, as an Example of a - Gelman, Huang - 2008 |

12 | Modeling Heterogeneous Treatment Effects in Survey Experiments with Bayesian Additive Regression Trees. Public Opinion Quarterly 76(3):491–511. - Green, Kern - 2012 |

10 | Bayesian quantile regression methods
- Lancaster, Jun
- 2010
Citation Context ... on average weights. Modeling quantiles is often more challenging than modeling means. A growing literature has focused on this estimation challenge in a Bayesian context (Chamberlain & Imbens, 2003; =-=Lancaster & Jun, 2009-=-; Taddy & Kottas, 2010). Moreover, many of the hierarchical modeling approaches described above can be extended to quantile regression (Reich, Bondell, & Wang, 2010), simply replacing the mean by the ... |

9 |
Bayesian subset analysis: application to studying treatment-by-gender interactions
- Simon
Citation Context ...matic in the context of treatment effect interactions, implying that an interaction effect of zero is as likely as an arbitrarily large interaction—an obviously absurd statement (Dixon & Simon, 1991; =-=Simon, 2002-=-). Moreover, this no-pooling approach can prove especially problematic in the context of trying to estimate multiple weak signals. First, consider the issue of statistical power: imagine that a resear... |

7 | Estimating Variation in Program Impacts: Theory, Practice and Applications, - Bloom, Raudenbush, et al. - 2012 |

6 | Comment on “The essential role of pair matching
- Hill, Scott
- 2009
Citation Context ...r receives the same treatment; in other words, randomization occurs at the group level.3 This design is common in the social sciences. Examples include public health interventions rolled out by city (=-=Hill & Scott, 2009-=-; Imai, King, & Nall, 2009), political advertising applied at the media market level (Green & Vavreck, 2007), and educational interventions at the classroom or whole-school level (Raudenbush, Martinez... |

5 |
Smoothed ANOVA with application to subgroup analysis
- Sargent, Hodges
- 1997
Citation Context ...her parameterizations of the simpler models discussed above. For example, there is a growing literature on specifying the prior variance for interaction effects (Hodges, Cui, Sargent, & Carlin, 2007; =-=Sargent & Hodges, 1997-=-). One promising approach is the use of nonparametric prior distributions (Sivaganesan, Laud, & Müller, 2010) or in highly flexible hierarchical array priors (Volfovsky & Hoff, 2012) for these interac... |

5 | Detecting Spillover Effects: Design and Analysis of Multilevel Experiments. - Sinclair, McConnell, et al. - 2012 |

4 | Supplement to “Hierarchical array priors for ANOVA decompositions of cross-classified data.” DOI:10.1214/13-AOAS685SUPP
- Volfovsky, Hoff
- 2013
Citation Context ...t, & Carlin, 2007; Sargent & Hodges, 1997). One promising approach is the use of nonparametric prior distributions (Sivaganesan, Laud, & Müller, 2010) or in highly flexible hierarchical array priors (=-=Volfovsky & Hoff, 2012-=-) for these interaction terms. Another extension is to allow for treatment effects to vary across continuous covariates. Researchers rarely—if ever—have a substantive reason to make a particular assum... |

3 | A Lasso for hierarchical interactions. arXiv preprint arXiv:1205.5050 - Bien, Taylor, et al. - 2012 |

3 | Estimating percentile-specific treatment effects in counterfactual models: a case-study of micronutrient supplementation, birth weight and infant mortality - Dominici, Zeger, et al. - 2006 |

3 | Treatment effects in before-after data. Applied Bayesian Modeling and Causal Inference from an Incomplete Data Perspective. - Gelman - 2004 |

2 | Randomization inference for treatment effect variation. Working paper available at http://scholar.harvard.edu/files/feller/files/ding_feller_miratrix_submission.pdf
- Ding, Feller, et al.
- 2014
Citation Context ...verly restrictive modeling assumption. The key point is not just that treatment effects vary, but that we can both predict this variation and use it to better understand the intervention of interest (=-=Ding, Feller, & Miratrix, 2014-=-). For example, practitioners implementing a program face budget constraints and must target resources to a subset of the population (see, e.g., Dehejia, 2005; Imai & Strauss, 2011). Or policymakers a... |

2 | Testing for heterogeneous treatment effects in experimental data: false discovery risks and correction procedures.
- Fink, McConnell, et al.
- 2014
Citation Context ...archers often look for interactions across many covariates. In a non-hierarchical setting, this creates a classic multiple testing problem, as well as a strong incentive for “specification searches” (=-=Fink, McConnell, & Vollmer, 2011-=-; Pocock, Assmann, Enos, & Kasten, 2002). Pre-analysis plans that specify which subgroups will be analyzed before running the experiment mitigate this issue, but do not completely resolve the multiple... |

2 | Smoothing balanced single-error-term analysis of variance
- Hodges, Cui, et al.
- 2007
Citation Context ...t variation across groups focus on richer parameterizations of the simpler models discussed above. For example, there is a growing literature on specifying the prior variance for interaction effects (=-=Hodges, Cui, Sargent, & Carlin, 2007-=-; Sargent & Hodges, 1997). One promising approach is the use of nonparametric prior distributions (Sivaganesan, Laud, & Müller, 2010) or in highly flexible hierarchical array priors (Volfovsky & Hoff,... |

1 |
Hierarchical Models for Causal Effects 13
- Assmann, Pocock, et al.
- 2000
Citation Context ...em by discretizing their continuous variable, but this simply pushes the problem back to a specification search of a different kind, in which researchers find cutpoints that lead to the best results (=-=Assmann, Pocock, Enos, & Kasten, 2000-=-). Flexible models for continuous covariates, such as splines and Gaussian processes, offer a promising solution to this issue (Feller & Holmes, 2009). 12 EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL ... |

1 | Hierarchical Models for Causal Effects 15 - Kim, Seltzer - 2011 |

1 |
Causal analysis of observational data with Gaussian process potential outcome models. Presentation at the 2013 Joint Statistical Meetings
- Tokdar
- 2013
Citation Context ...so Green and Kern (2012) and Imai and Strauss (2011)). More broadly, cutting-edge Bayesian nonparametric methods, such as Gaussian processes, can be used to flexibly model the response surface (e.g., =-=Tokdar, 2013-=-). Variation across Latent Subgroups. Finally, there is an increasing focus on modeling treatment effect variation across latent or partially-observed subgroups. This is especially promising for princi... |