Adjusting for selection bias in retrospective, case-control studies
Biostatistics
Cited by 23 (3 self)
Retrospective case-control studies are more susceptible to selection bias than other epidemiologic studies because, by design, they require that both cases and controls are representative of the same population. However, as case and control recruitment processes are often different, it is not always obvious that the necessary exchangeability conditions hold. Selection bias typically arises when the selection criteria are associated with the risk factor under investigation. We develop a method which produces bias-adjusted estimates for the odds ratio. Our method hinges on two conditions. The first is that a variable that separates the risk factor from the selection criteria can be identified. This is termed the bias breaking variable. The second condition is that data can be found such that a bias-corrected estimate of the distribution of the bias breaking variable can be obtained. We show by means of a set of examples that such bias breaking variables are not uncommon in epidemiologic settings. We demonstrate using simulations that the estimates of the odds ratios produced by our method are consistently closer to the true odds ratio than standard odds ratio estimates using logistic regression. Further, by applying it to a case-control study, we show that our method can help to determine whether selection bias is present and thus confirm the validity of study conclusions when no evidence of selection bias can be found.
Keywords: selection bias, directed acyclic graphs, conditional independence, confounding, retrospective case-control studies, poststratification, weighting
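For orientation, the unadjusted quantity that the abstract compares against can be sketched as follows. This is only the crude odds ratio from a 2x2 case-control table, not the authors' bias-adjusted estimator; the function name and the counts are hypothetical and chosen purely for illustration.

```python
def crude_odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Crude (unadjusted) odds ratio from a 2x2 case-control table.

    Odds of exposure among cases divided by odds of exposure among controls.
    """
    if min(unexposed_cases, exposed_controls) <= 0:
        raise ValueError("unexposed_cases and exposed_controls must be positive")
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls


# Hypothetical counts, for illustration only.
odds_ratio = crude_odds_ratio(
    exposed_cases=40, unexposed_cases=60,
    exposed_controls=20, unexposed_controls=80,
)
print(round(odds_ratio, 3))
```

If controls are selected in a way that depends on the risk factor, the counts in the control row no longer reflect the source population, and this crude ratio is biased; that is the situation the abstract addresses.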
How Should We Estimate Public Opinion in the States?
Cited by 21 (1 self)
We compare two approaches for estimating state-level public opinion: disaggregation by state of national surveys and a simulation approach using multilevel modeling of individual opinion and poststratification by population share. We present the first systematic assessment of the predictive accuracy of each and give practical advice about when and how each method should be used. To do so, we use an original data set of over 100 surveys on gay rights issues as well as 1988 presidential election data. Under optimal conditions, both methods work well, but multilevel modeling performs better generally. Compared to baseline opinion measures, it yields smaller errors, higher correlations, and more reliable estimates. Multilevel modeling is clearly superior when samples are smaller; indeed, one can accurately estimate state opinion using only a single large national survey. This greatly expands the scope of issues for which researchers can study subnational opinion directly or as an influence on policymaking. Democratic theory suggests that the varying attitudes and policy preference of citizens across states should play a large role in shaping both electoral outcomes and policymaking. Accurate measurements of state-level opinion are therefore needed to study a wide range of related political issues, issues at the heart
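The poststratification step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the per-cell opinion predictions already exist (in MRP they would come from the multilevel model, which is not shown here); the function name, cell labels, and numbers are hypothetical.

```python
from collections import defaultdict


def poststratify(cell_predictions, cell_populations):
    """Combine per-cell predictions into one estimate per state.

    cell_predictions: {(state, cell): predicted proportion in [0, 1]}
    cell_populations: {(state, cell): population count for that cell}
    Each state estimate is the population-weighted mean of its cells.
    """
    weighted_sum = defaultdict(float)
    total_population = defaultdict(float)
    for (state, cell), prediction in cell_predictions.items():
        count = cell_populations[(state, cell)]
        weighted_sum[state] += prediction * count
        total_population[state] += count
    return {s: weighted_sum[s] / total_population[s] for s in weighted_sum}


# Hypothetical two-state, two-cell example.
predictions = {
    ("A", "young"): 0.60, ("A", "old"): 0.40,
    ("B", "young"): 0.50, ("B", "old"): 0.30,
}
populations = {
    ("A", "young"): 300, ("A", "old"): 700,
    ("B", "young"): 500, ("B", "old"): 500,
}
state_estimates = poststratify(predictions, populations)
print(state_estimates)
```

Disaggregation, by contrast, would simply average the raw survey responses within each state, which is why it needs many more respondents per state.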
Deep Interactions with MRP: Election Turnout and Voting Patterns Among Small Electoral Subgroups
2012
Cited by 17 (6 self)
Using multilevel regression and poststratification (MRP), we estimate voter turnout and vote choice within deeply interacted subgroups: subsets of the population that are defined by multiple demographic and geographic characteristics. This article lays out the models and statistical procedures we use, along with the steps required to fit the model for the 2004 and 2008 Presidential elections. Though MRP is an increasingly popular method, we improve upon it in numerous ways: deeper levels of covariate interaction, allowing for nonlinearity and nonmonotonicity, accounting for unequal inclusion probabilities that are conveyed in survey weights, postestimation adjustments to turnout and voting levels, and informative multidimensional graphical displays as a form of model checking. We use a series of examples to demonstrate the flexibility of our method, including an illustration of turnout and vote choice as subgroups become increasingly detailed, and an analysis of both vote choice changes and turnout changes
Understanding the Long-Run Decline in Interstate Migration
2012
Cited by 11 (0 self)
We analyze the secular decline in interstate migration in the United States between 1991 and 2011. Gross flows of people across states are about 10 times larger than net flows, yet have declined by around 50 percent over the past 20 years. We show that micro data rule out many popular explanations for this decline, including aging of the population, the rise of two-earner households, other compositional changes, regional changes, and the rise in real incomes. We argue instead that the fall in migration is due to a decline in the geographic specificity of occupations and an increase in workers' ability to learn about other locations before moving there, through both information technology and inexpensive travel. We develop a theory to formalize these ideas and show that a plausibly calibrated version is consistent with cross-sectional and time-series patterns of interstate migration, occupations, and incomes.
The Transition to Postindustrial BMI values among US children
American Journal of Human Biology, 2008
How should we estimate subnational opinion using MRP? Preliminary findings and recommendations
Presented at Midwest Political Science Association, 2013
Cited by 5 (2 self)
Over the past few years, multilevel regression and poststratification (MRP) has become an increasingly trusted tool for estimating public opinion in subnational units from national surveys. Especially given the proliferation of this technique, more evaluation is needed to determine the conditions under which MRP performs best and to establish benchmarks for expectations of performance. Using data from common content of the Cooperative Congressional Election Study, we evaluate the accuracy of MRP across a wide range of survey questions. In doing so, we consider varying degrees of model complexity and identify the measures of model fit and performance that best correlate to the accuracy of MRP estimates. The totality of our results will enable us to develop a set of guidelines for implementing MRP properly as well as a set of diagnostics for identifying instances where MRP is appropriate and instances where its use may be problematic.
For helpful comments we thank Andrew Gelman. For research assistance we thank Eurry Kim.
The Use of Sampling Weights in Bayesian Hierarchical Models for Small Area Estimation
Cited by 3 (0 self)
Empirical Bayes and Bayes hierarchical models have been used extensively for small area estimation. However, the sampling weights that are required to reflect complex surveys are rarely considered in these models. In this paper, we develop a method to incorporate the sampling weights for binary data when estimating, for example, small area proportions or predicting small area counts. We consider empirical Bayes beta-binomial models and normal hierarchical models. The latter may include spatial random effects, with computation carried out using the integrated nested Laplace approximation, which is fast. A simulation study is presented to demonstrate the performance of the proposed approaches and to compare results from models with and without the sampling weights. The results show that estimation mean squared error can be greatly reduced using the proposed models, when compared with more standard approaches. Bias reduction occurs through the incorporation of sampling weights, with variance reduction being achieved through hierarchical smoothing. We also analyze data taken from the
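To show why sampling weights matter for a binary outcome, here is a minimal sketch of a design-weighted proportion (a weighted mean of 0/1 outcomes) next to the unweighted one. It is not the hierarchical model proposed in the paper and includes no smoothing across areas; the function name and the data are hypothetical.

```python
def weighted_proportion(outcomes, weights):
    """Design-weighted proportion for binary outcomes.

    outcomes: sequence of 0/1 responses from sampled units
    weights:  sampling weights (roughly, how many population
              units each sampled unit represents)
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


# Hypothetical small-area sample: units with outcome 1 were oversampled,
# so they carry smaller weights than units with outcome 0.
outcomes = [1, 1, 0, 0]
weights = [10, 10, 40, 40]

unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_proportion(outcomes, weights)
print(unweighted, weighted)
```

In this toy sample the unweighted proportion overstates the weighted one, which is the kind of bias the abstract attributes to ignoring the weights; the variance reduction it mentions would come from the hierarchical smoothing, which this sketch omits.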
Using county demographics to infer attributes of Twitter users
In ACL Joint Workshop on Social Dynamics and Personal Attributes in Social Media, 2014
Cited by 3 (3 self)
Social media are increasingly being used to complement traditional survey methods in health, politics, and marketing. However, little has been done to adjust for the sampling bias inherent in this approach. Inferring demographic attributes of social media users is thus a critical step to improving the validity of such studies. While there have been a number of supervised machine learning approaches to this problem, these rely on a training set of users annotated with attributes, which can be difficult to obtain. We instead propose training a demographic attribute classifier that uses county-level supervision. By pairing geolocated social media with county demographics, we build a regression model mapping text to demographics. We then adopt this model to make predictions at the user level. Our experiments using Twitter data show that this approach is surprisingly competitive with a fully supervised approach, estimating the race of a user with 80% accuracy.
Bayesian Nonparametric Weighted Sampling Inference
2013
Cited by 2 (1 self)
Survey weighting adjusts for known or expected differences between sample and population. Weights are constructed on design or poststratification variables that are predictors of inclusion probability. In this paper, we assume that the only information we have about the weighting procedure is the values of the weights in the sample. We propose a hierarchical Bayesian approach in which we model the weights of those nonsampled units in the population and simultaneously include them as predictors in a nonparametric Gaussian process regression to yield valid inference for the underlying finite population and capture the uncertainty induced by sampling and the unobserved outcomes. We use simulation studies to evaluate the performance of our procedure and compare it to the classical design-based estimator. We apply our method to data from two ongoing social surveys: the American Community Survey and the Fragile Family Child Wellbeing Study. Our studies find the Bayesian nonparametric finite population estimator to be more robust than the classical design-based estimator without loss in efficiency.
Key words: survey weighting, poststratification, model-based survey inference, Gaussian process prior, Stan.