Selection Bias in Nonprobability Surveys: A Causal Inference Approach

dc.contributor.advisor: Kreuter, Frauke [en_US]
dc.contributor.author: Mercer, Andrew William [en_US]
dc.contributor.department: Survey Methodology [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2018-07-17T06:08:18Z
dc.date.available: 2018-07-17T06:08:18Z
dc.date.issued: 2018 [en_US]
dc.description.abstract: Many in the survey research community have expressed concern at the growing popularity of nonprobability surveys. The absence of random selection prompts justified concerns about self-selection producing biased results and means that traditional, design-based estimation is inappropriate. The Total Survey Error (TSE) paradigm’s designations of selection bias as attributable to undercoverage or nonresponse are not especially helpful for nonprobability surveys as they are based on an implicit assumption that selection and inferences rely on randomization. This dissertation proposes an alternative classification for sources of selection bias for nonprobability surveys based on principles borrowed from the field of causal inference. The proposed typology describes selection bias in terms of the three conditions that are required for a statistical model to correct or explain systematic differences between a realized sample and the target population: exchangeability, positivity, and composition. We describe the parallels between causal and survey inference and explain how these three sources of bias operate in both probability and nonprobability survey samples. We then provide a critical review of current practices in nonprobability data collection and estimation viewed through the lens of the causal bias framework. Next, we show how net selection bias can be decomposed into separate additive components associated with exchangeability, positivity, and composition respectively. Using 10 parallel nonprobability surveys from different sources, we estimate these components for six measures of civic engagement using the 2013 Current Population Survey Civic Engagement Supplement as a reference sample. We find that a large majority of the bias can be attributed to a lack of exchangeability.
Finally, using the same six measures of civic engagement, we compare the performance of four approaches to nonprobability estimation based on Bayesian additive regression trees. These are propensity weighting (PW), outcome regression (OR), and two types of doubly-robust estimators: outcome regression with a residual bias correction (OR-RBC) and outcome regression with a propensity score covariate (OR-PSC). We find that OR-RBC tends to have the lowest bias, variance, and RMSE, with PW only slightly worse on all three measures. [en_US]
dc.identifier: https://doi.org/10.13016/M2ZP3W38B
dc.identifier.uri: http://hdl.handle.net/1903/20943
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Statistics [en_US]
dc.subject.pquncontrolled: Bayesian statistics [en_US]
dc.subject.pquncontrolled: Causal inference [en_US]
dc.subject.pquncontrolled: Nonprobability surveys [en_US]
dc.subject.pquncontrolled: Online surveys [en_US]
dc.subject.pquncontrolled: Selection bias [en_US]
dc.title: Selection Bias in Nonprobability Surveys: A Causal Inference Approach [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle

Name: Mercer_umd_0117E_18928.pdf
Size: 800.56 KB
Format: Adobe Portable Document Format