Optimizing stratified sampling allocations to account for heteroscedasticity and nonresponse

dc.contributor.advisor: Elliott, Michael R [en_US]
dc.contributor.advisor: Lahiri, Partha [en_US]
dc.contributor.author: Mendelson, Jonathan [en_US]
dc.contributor.department: Survey Methodology [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2023-10-07T05:37:18Z
dc.date.available: 2023-10-07T05:37:18Z
dc.date.issued: 2023 [en_US]
dc.description.abstract: Neyman's seminal paper in 1934 and subsequent developments of the next two decades transformed the practice of survey sampling and continue to provide the underpinnings of today's probability samples, including at the design stage. Although hugely useful, the assumptions underlying classic theory on optimal allocation, such as complete response and exact knowledge of strata variances, are not always met, nor is the design-based approach the only way to identify good sample allocations. This thesis develops new ways to allocate samples for stratified random sampling (STSRS) designs. In Papers 1 and 2, I provide a Bayesian approach for optimal STSRS allocation for estimating the finite population mean via a univariate regression model with heteroscedastic errors. I use Bayesian decision theory on optimal experimental design, which accommodates uncertainty in design parameters. By allowing for heteroscedasticity, I aim for improved realism in some establishment contexts, compared with some earlier Bayesian sample design work. Paper 1 assumes that the level of heteroscedasticity is known, which facilitates analytical results. Paper 2 relaxes this assumption, which results in an analytically intractable problem. Thus, I develop a computational approach that uses Monte Carlo sampling to estimate the loss for a given allocation, in conjunction with a stochastic optimization algorithm that accommodates noisy loss functions. In simulation, the proposed approaches performed as well as or better than the design-based and model-assisted strategies considered, while having clearer theoretical justification. Paper 3 changes focus toward addressing how to account for nonresponse when designing samples. Existing theory on optimal STSRS allocation generally assumes complete response. A common practice is to allocate sample under complete response, then to inflate the sample sizes by the inverse of the anticipated response rates. I show that this practice overcorrects for nonresponse, leading to excessive costs per effective interview. I extend the existing design-based framework for STSRS allocation to accommodate scenarios with incomplete response. I provide theoretical comparisons between my allocation and common alternatives, which illustrate how response rates, population characteristics, and cost structure can affect the methods' relative efficiency. In an application to a self-administered survey of military personnel, the proposed allocation resulted in a 25% increase in effective sample size compared with common alternatives. [en_US]
dc.identifier: https://doi.org/10.13016/dspace/wglp-wsaz
dc.identifier.uri: http://hdl.handle.net/1903/30839
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Statistics [en_US]
dc.subject.pqcontrolled: Social research [en_US]
dc.subject.pquncontrolled: Bayesian design [en_US]
dc.subject.pquncontrolled: sample allocation [en_US]
dc.subject.pquncontrolled: survey methodology [en_US]
dc.subject.pquncontrolled: survey sampling [en_US]
dc.subject.pquncontrolled: survey statistics [en_US]
dc.subject.pquncontrolled: unit nonresponse [en_US]
dc.title: Optimizing stratified sampling allocations to account for heteroscedasticity and nonresponse [en_US]
dc.type: Dissertation [en_US]
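
Illustrative note: the abstract describes a common practice of allocating the sample under complete response and then inflating each stratum's sample size by the inverse of its anticipated response rate. The Python sketch below illustrates that conventional practice only, using the textbook cost-constrained optimal allocation (n_h proportional to N_h * S_h / sqrt(c_h)). It is not the allocation proposed in the dissertation, which this record does not spell out. The function names and all stratum sizes, standard deviations, costs, response rates, and the budget are hypothetical values chosen for illustration.

```python
import math


def optimal_allocation(N, S, c, budget):
    """Textbook cost-constrained optimal allocation under complete response.

    N: stratum population sizes, S: stratum standard deviations,
    c: per-unit costs, budget: total variable budget.
    Returns real-valued stratum sample sizes whose total cost equals budget.
    """
    denom = sum(Nh * Sh * math.sqrt(ch) for Nh, Sh, ch in zip(N, S, c))
    return [
        budget * (Nh * Sh / math.sqrt(ch)) / denom
        for Nh, Sh, ch in zip(N, S, c)
    ]


def inflate_for_nonresponse(n, r):
    """Conventional adjustment: divide each n_h by its anticipated response rate."""
    return [nh / rh for nh, rh in zip(n, r)]


# Hypothetical inputs (not taken from the dissertation).
N = [5000, 3000, 2000]   # stratum population sizes
S = [10.0, 20.0, 40.0]   # stratum standard deviations
c = [1.0, 1.0, 4.0]      # cost per sampled unit
r = [0.5, 0.4, 0.25]     # anticipated response rates
budget = 1000.0

n_complete = optimal_allocation(N, S, c, budget)
n_inflated = inflate_for_nonresponse(n_complete, r)

for h, (nc, ni) in enumerate(zip(n_complete, n_inflated), start=1):
    print(f"stratum {h}: complete-response n = {nc:.1f}, inflated n = {ni:.1f}")
```

In practice the real-valued sizes would be rounded and capped at the stratum population sizes. The inflated sizes are constructed so that the expected number of respondents in each stratum matches the complete-response allocation; the abstract argues that this practice overcorrects for nonresponse.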

Files

Original bundle

Name: Mendelson_umd_0117E_23670.pdf
Size: 5.53 MB
Format: Adobe Portable Document Format
Access: Restricted