Joint Program in Survey Methodology
http://hdl.handle.net/1903/2251
2019-06-17T03:06:30Z
Selection Bias in Nonprobability Surveys: A Causal Inference Approach
http://hdl.handle.net/1903/20943
Mercer, Andrew William
Many in the survey research community have expressed concern about the growing popularity of nonprobability surveys. The absence of random selection raises justified worries that self-selection produces biased results, and it means that traditional, design-based estimation is inappropriate. The Total Survey Error (TSE) paradigm’s designations of selection bias as attributable to undercoverage or nonresponse are not especially helpful for nonprobability surveys because they rest on an implicit assumption that selection and inference rely on randomization.
This dissertation proposes an alternative classification for sources of selection bias for nonprobability surveys based on principles borrowed from the field of causal inference. The proposed typology describes selection bias in terms of the three conditions that are required for a statistical model to correct or explain systematic differences between a realized sample and the target population: exchangeability, positivity, and composition. We describe the parallels between causal and survey inference and explain how these three sources of bias operate in both probability and nonprobability survey samples. We then provide a critical review of current practices in nonprobability data collection and estimation viewed through the lens of the causal bias framework.
Next, we show how net selection bias can be decomposed into separate additive components associated with exchangeability, positivity, and composition, respectively. Using 10 parallel nonprobability surveys from different sources, we estimate these components for six measures of civic engagement, with the 2013 Current Population Survey Civic Engagement Supplement as a reference sample. We find that a large majority of the bias can be attributed to a lack of exchangeability.
Finally, using the same six measures of civic engagement, we compare the performance of four approaches to nonprobability estimation based on Bayesian additive regression trees. These are propensity weighting (PW), outcome regression (OR), and two types of doubly-robust estimators: outcome regression with a residual bias correction (OR-RBC) and outcome regression with a propensity score covariate (OR-PSC). We find that OR-RBC tends to have the lowest bias, variance, and RMSE, with PW only slightly worse on all three measures.
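As a rough illustration of how propensity weighting, outcome regression, and a residual bias correction relate to one another, the following Python sketch replaces Bayesian additive regression trees with simple cell means over a single categorical covariate. That substitution, the function names, and the toy data are all assumptions made for illustration and are not taken from the dissertation. The correction step is one common doubly robust form and may differ from the exact OR-RBC specification used there.

```python
from collections import Counter, defaultdict


def cell_means(xs, ys):
    """Mean outcome within each covariate cell (stand-in for an outcome model)."""
    sums, counts = defaultdict(float), Counter()
    for x, y in zip(xs, ys):
        sums[x] += y
        counts[x] += 1
    return {x: sums[x] / counts[x] for x in counts}


def propensity_weights(ref_x, np_x):
    """Odds weights (1 - p) / p, where p is the cell-level share of
    nonprobability cases in the combined samples (stand-in propensity model).
    This simplifies to n_ref(x) / n_np(x)."""
    ref_n, np_n = Counter(ref_x), Counter(np_x)
    return [ref_n[x] / np_n[x] for x in np_x]


def outcome_regression(ref_x, np_x, np_y):
    """OR: fit on the nonprobability sample, predict for the reference sample.
    Assumes every reference cell also appears in the nonprobability sample."""
    means = cell_means(np_x, np_y)
    return sum(means[x] for x in ref_x) / len(ref_x)


def propensity_weighting(ref_x, np_x, np_y):
    """PW: weighted mean of the observed outcomes."""
    w = propensity_weights(ref_x, np_x)
    return sum(wi * yi for wi, yi in zip(w, np_y)) / sum(w)


def or_with_residual_correction(ref_x, np_x, np_y):
    """OR plus the propensity-weighted mean of the outcome-model residuals."""
    means = cell_means(np_x, np_y)
    w = propensity_weights(ref_x, np_x)
    resid = [y - means[x] for x, y in zip(np_x, np_y)]
    correction = sum(wi * ri for wi, ri in zip(w, resid)) / sum(w)
    return outcome_regression(ref_x, np_x, np_y) + correction


# Toy data (hypothetical): the nonprobability sample over-represents "young".
ref_x = ["young"] * 5 + ["old"] * 5
np_x = ["young"] * 8 + ["old"] * 2
np_y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

for estimator in (propensity_weighting, outcome_regression,
                  or_with_residual_correction):
    print(estimator.__name__, estimator(ref_x, np_x, np_y))
```

With a single saturated categorical covariate, all three estimators collapse to the same poststratified mean (0.375 for this toy data, against an unadjusted sample mean of 0.6). They diverge only when the propensity and outcome models are more flexible, as they are with BART.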
2018-01-01T00:00:00Z
Model-Assisted Estimators for Time-to-Event Data
http://hdl.handle.net/1903/20303
Reist, Benjamin Martin
In this dissertation, I develop model-assisted estimators for estimating the proportion of a population that experienced some event by time t. I provide the theoretical justification for the new estimators using time-to-event models as the underlying framework. Using simulation, I compared these estimators to traditional methods. I then applied the estimators to a study of nurses’ health, where I estimated the proportion of the population that had died after a certain period of time. The new estimators performed as well as, if not better than, existing methods. Finally, because this work assumes that all units are censored at the same point in time, I propose an extension that allows units' censoring times to vary.
2017-01-01T00:00:00Z
INVESTIGATION OF ALTERNATIVE CALIBRATION ESTIMATORS IN THE PRESENCE OF NONRESPONSE
http://hdl.handle.net/1903/19939
Han, Daifeng
Calibration weighting is widely used to decrease variance, reduce nonresponse bias, and improve the face validity of survey estimates. In the purely sampling context, Deville and Särndal (1992) demonstrate that many alternative forms of calibration weighting are asymptotically equivalent, so for variance estimation purposes the generalized regression (GREG) estimator can be used to approximate general calibration estimators that have no closed-form solution, such as raking. It is unclear whether this conclusion holds when nonresponse exists and single-step calibration weighting is used to reduce nonresponse bias (i.e., calibration is applied directly to the basic sampling weights without a separate nonresponse adjustment step).
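To make the raking procedure mentioned above concrete, here is a minimal Python sketch of single-step raking (iterative proportional fitting) applied directly to base weights. The function name `rake`, its parameters, and the toy margins are hypothetical. A production implementation would also need to handle empty cells, weight trimming, and failure to converge.

```python
def rake(weights, rows, cols, row_targets, col_targets,
         max_iter=1000, tol=1e-10):
    """Adjust weights so that weighted totals match two sets of margins.

    weights      -- base weights, one per respondent
    rows, cols   -- category label of each respondent on the two dimensions
    *_targets    -- dict mapping each category label to its control total
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        # Alternate between the two margins, rescaling within each category.
        for labels, targets in ((rows, row_targets), (cols, col_targets)):
            totals = {}
            for wi, lab in zip(w, labels):
                totals[lab] = totals.get(lab, 0.0) + wi
            for i, lab in enumerate(labels):
                factor = targets[lab] / totals[lab]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass leaves every weight essentially unchanged.
        if max_change < tol:
            break
    return w


# Toy respondents (hypothetical) with equal base weights.
sex = ["f", "f", "m", "m", "m"]
age = ["y", "o", "y", "o", "o"]
raked = rake([1.0] * 5, sex, age,
             row_targets={"f": 50.0, "m": 50.0},
             col_targets={"y": 40.0, "o": 60.0})
print(raked)
```

Because raking matches only the margins, it implicitly assumes no interaction between the two dimensions. This is the main-effects structure that makes it comparable to a GREG estimator with main-effect covariates only.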
In this dissertation, we first examine whether alternative calibration estimators may perform differently in the presence of nonresponse. More specifically, properties of three widely used calibration estimators, the GREG with only main effect covariates (GREG_Main), poststratification, and raking, are evaluated. In practice, the choice between poststratification and raking is often based on sample sizes and availability of external data. Also, the raking variance is often approximated by a linear substitute containing residuals from a GREG_Main model. Our theoretical development and simulation work demonstrate that with nonresponse, poststratification, GREG_Main, and raking may perform differently, and survey practitioners should examine both the outcome model and the response pattern when choosing between these estimators. Then we propose a distance measure that can be estimated for raking or GREG_Main from a given sample. Our analytical work shows that the distance measure follows a chi-square probability distribution when raking or GREG_Main is unbiased. A large distance measure is a warning sign of potential bias and poor confidence interval coverage for some variables in a survey due to omitting a significant interaction term in the calibration process. Finally, we examine several alternative variance estimators for raking with nonresponse. Our simulation results show that when raking is model-biased, none of the linearization variance estimators under evaluation is unbiased. In contrast, the jackknife replication method performs well in variance estimation, although the confidence interval may still be centered in the wrong place if the point estimate is inaccurate.
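For readers unfamiliar with jackknife replication, the sketch below shows a delete-one jackknife variance estimate for a weighted mean in Python. It is a simplified illustration under stated assumptions: it ignores strata and clusters, and it does not repeat the raking step within each replicate, which a full replication approach would normally do. The function names are hypothetical.

```python
def weighted_mean(y, w):
    """Weighted mean of y using weights w."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def jackknife_variance(y, w, estimator=weighted_mean):
    """Delete-one jackknife variance, centered on the full-sample estimate.

    y and w must be lists. `estimator` is any function of (y, w); passing one
    that re-calibrates the weights would bring this closer to survey practice.
    """
    n = len(y)
    full = estimator(y, w)
    replicates = [
        estimator(y[:i] + y[i + 1:], w[:i] + w[i + 1:])  # drop unit i
        for i in range(n)
    ]
    return (n - 1) / n * sum((r - full) ** 2 for r in replicates)


# Toy data (hypothetical) with equal weights.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0] * 5
print(jackknife_variance(y, w))
```

With equal weights, this estimate reduces to the familiar sample variance divided by n. As the abstract notes, a well-behaved variance estimate does not repair a biased point estimate: the resulting interval can have the right width and still be centered in the wrong place.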
2017-01-01T00:00:00Z
Enhancing the Understanding of the Relationship between Social Integration and Nonresponse in Household Surveys
http://hdl.handle.net/1903/17305
Amaya, Ashley Elaine
Nonresponse and nonresponse bias remain fundamental concerns for survey researchers as understanding them is critical to producing accurate statistics. This dissertation tests the relationship between social integration, nonresponse, and nonresponse bias.
Using the rich frame information available on the American Time Use Survey (ATUS) and the Survey of Health, Ageing, and Retirement in Europe (SHARE) Wave II, structural equation models were employed to create latent indicators of social integration. The resulting variables were used to predict nonresponse and its components (e.g., noncontact). In both surveys, social integration was significantly predictive of nonresponse (regardless of type of nonresponse) with integrated individuals more likely to respond. However, the relationship was driven by different components of integration across the two surveys.
Full sample estimates were compared to respondent estimates on a series of 40 dichotomous and categorical variables to test the hypothesis that variables measuring social activities and roles would suffer from nonresponse bias. The impact of nonresponse on multivariate models predicting social outcomes was also evaluated. Nearly all of the 40 assessed variables suffered from significant nonresponse bias, resulting in the overestimation of social activity and role participation. In general, civic and political variables suffered from higher levels of bias, but the differences were not significant. Multivariate models were not exempt: beta coefficients were frequently biased, although the direction of the bias was inconsistent and its magnitude was often small.
Finally, an indicator of social integration was added to the weighting methodology with the goal of eliminating the observed nonresponse bias. While the addition significantly reduced the bias in most instances compared to both the base- and traditionally-weighted estimates, the improvements were small and did little to eliminate the bias.
2015-01-01T00:00:00Z