Joint Program in Survey Methodology Theses and Dissertations
Recent Submissions
Item: Optimizing stratified sampling allocations to account for heteroscedasticity and nonresponse (2023)
Mendelson, Jonathan; Elliott, Michael R; Lahiri, Partha; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Neyman's seminal paper in 1934 and subsequent developments over the next two decades transformed the practice of survey sampling and continue to provide the underpinnings of today's probability samples, including at the design stage. Although hugely useful, the assumptions underlying classic theory on optimal allocation, such as complete response and exact knowledge of strata variances, are not always met, nor is the design-based approach the only way to identify good sample allocations. This thesis develops new ways to allocate samples for stratified random sampling (STSRS) designs. In Papers 1 and 2, I provide a Bayesian approach for optimal STSRS allocation for estimating the finite population mean via a univariate regression model with heteroscedastic errors. I use Bayesian decision theory on optimal experimental design, which accommodates uncertainty in design parameters. By allowing for heteroscedasticity, I aim for improved realism in some establishment contexts, compared with some earlier Bayesian sample design work. Paper 1 assumes that the level of heteroscedasticity is known, which facilitates analytical results. Paper 2 relaxes this assumption, which results in an analytically intractable problem. Thus, I develop a computational approach that uses Monte Carlo sampling to estimate the loss for a given allocation, in conjunction with a stochastic optimization algorithm that accommodates noisy loss functions. In simulation, the proposed approaches performed as well as or better than the design-based and model-assisted strategies considered, while having clearer theoretical justification. Paper 3 shifts focus to how to account for nonresponse when designing samples.
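The classical optimal allocation this line of work generalizes is Neyman allocation, which assigns sample in proportion to each stratum's size times its standard deviation. A minimal sketch follows; the stratum sizes, standard deviations, and total sample size are invented for illustration and are not from the thesis.

```python
def neyman_allocation(N, S, n_total):
    """Allocate a total sample of n_total across strata in proportion
    to N_h * S_h (Neyman allocation, assuming equal per-unit costs)."""
    weights = [Nh * Sh for Nh, Sh in zip(N, S)]
    total = sum(weights)
    return [n_total * w / total for w in weights]

# Hypothetical establishment population with three strata.
N = [5000, 3000, 1000]   # stratum population sizes
S = [2.0, 5.0, 20.0]     # stratum standard deviations of the outcome
alloc = neyman_allocation(N, S, n_total=500)
print([round(n) for n in alloc])  # the small, highly variable stratum gets the most sample
```

Note how the smallest stratum receives the largest allocation because its units are the most variable; this is the behavior that heteroscedasticity assumptions directly affect.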
Existing theory on optimal STSRS allocation generally assumes complete response. A common practice is to allocate sample under complete response, then to inflate the sample sizes by the inverse of the anticipated response rates. I show that this practice overcorrects for nonresponse, leading to excessive costs per effective interview. I extend the existing design-based framework for STSRS allocation to accommodate scenarios with incomplete response. I provide theoretical comparisons between my allocation and common alternatives, which illustrate how response rates, population characteristics, and cost structure can affect the methods' relative efficiency. In an application to a self-administered survey of military personnel, the proposed allocation resulted in a 25% increase in effective sample size compared with common alternatives.

Item: BAYESIAN METHODS FOR PREDICTION OF SURVEY DATA COLLECTION PARAMETERS IN ADAPTIVE AND RESPONSIVE DESIGNS (2020)
Coffey, Stephanie Michelle; Elliott, Michael R; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Adaptive and responsive survey designs rely on estimates of survey data collection parameters (SDCPs), such as response propensity, to make intervention decisions during data collection. These interventions are made with some data collection goal in mind, such as maximizing data quality for a fixed cost or minimizing costs for a fixed measure of data quality. Data quality may be defined by response rate, sample representativeness, or error in survey estimates. Therefore, the predictions of SDCPs are extremely important. Predictions within a data collection period are most commonly generated using fixed information about sample cases, along with accumulating paradata and survey response data. Interventions occur during the data collection period, however, meaning they are applied based on predictions from incomplete accumulating data.
There is evidence that the incomplete accumulating data can lead to biased and unstable predictions, particularly early in data collection. This dissertation explores the use of Bayesian methods to improve predictions of SDCPs during data collection, by providing a mathematical framework for combining priors, based on external data about covariates in the prediction models, with the current accumulating data to generate posterior predictions of SDCPs for use in intervention decisions. This dissertation includes three self-contained papers, each focused on the use of Bayesian methods to improve predictions of SDCPs for use in adaptive and responsive survey designs. The first paper predicts time to first contact, where priors are generated from historical survey data. The second paper implements expert elicitation, a method for prior construction when historical data are not available. The last paper describes a data collection experiment conducted using a Bayesian framework, which attempts to minimize data collection costs without reducing the quality of a key survey estimate. In all three papers, the use of Bayesian methods introduces modest improvements in the predictions of SDCPs, especially early in data collection, when interventions would have the largest effect on survey outcomes. Additionally, the experiment in the last paper resulted in significant data collection cost savings without having a significant effect on a key survey estimate.
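One simple instance of the prior-plus-accumulating-data updating described above is a beta-binomial model for a response rate: a prior built from historical data stabilizes the estimate early in collection, when the raw rate is noisy. This sketch is illustrative only; the prior counts and daily tallies below are invented, and the dissertation's actual models are richer.

```python
def posterior_response_rate(prior_a, prior_b, responses, attempts):
    """Posterior mean of a response rate under a Beta(prior_a, prior_b)
    prior after observing `responses` completes in `attempts` contacts."""
    return (prior_a + responses) / (prior_a + prior_b + attempts)

# Hypothetical prior from historical data: about a 30% response rate,
# worth roughly 50 cases of information.
prior_a, prior_b = 15, 35

# Early in collection the raw rate (1/2 = 0.50) is unstable; the
# posterior shrinks it toward the historical rate.
print(posterior_response_rate(prior_a, prior_b, responses=1, attempts=2))

# Later, the accumulating data dominate the prior (raw rate 80/400 = 0.20).
print(posterior_response_rate(prior_a, prior_b, responses=80, attempts=400))
```

The early-stage shrinkage is exactly where the dissertation reports the largest gains, since that is when interventions are decided on the least data.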
This work suggests that Bayesian methods can improve predictions of SDCPs that are critical for adaptive and responsive data collection interventions.

Item: Design and Effectiveness of Multimodal Definitions in Online Surveys (2020)
Spiegelman, Maura; Conrad, Frederick G; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
If survey respondents do not interpret a question as it was intended, they may, in effect, answer the wrong question, increasing the chances of inaccurate data. Researchers can bring respondents’ interpretations into alignment with what is intended by defining the terms that respondents might misunderstand. This dissertation explores strategies to increase response alignment with definitions in online surveys. In particular, I compare the impact of unimodal (either spoken or textual) definitions with that of multimodal (both spoken and textual) definitions on question interpretation and, indirectly, response quality. These definitions can be further categorized as conventional or optimized for the mode in which they are presented (for textual definitions, fewer words than in conventional definitions, with key information made visually salient and easier for respondents to grasp; for spoken definitions, a shorter, more colloquial style of speaking). The effectiveness of conventional and optimized definitions is compared, as is the effectiveness of unimodal and multimodal definitions. Amazon MTurk workers were randomly assigned to one of six definition conditions in a 2x3 design: conventional or optimized definitions, presented in a spoken, textual, or multimodal (both spoken and textual) format. While responses for unimodal optimized and conventional definitions were similar, multimodal definitions, and particularly multimodal optimized definitions, resulted in responses with greater alignment with definitions.
Although complementary information presented in different modes can increase comprehension and lead to increased data quality, redundant or otherwise untailored multimodal information may not have the same positive effects. Although not all respondents complied with instructions to read and/or listen to the definitions, compliance rates and the effectiveness of multimodal presentation were high enough to show improvements in data quality, and the effectiveness of multimodal definitions increased when only compliant observations were considered. Multimodal communication in a typically visual medium (such as web surveys) may increase the amount of time needed to complete a questionnaire, but respondents did not consider the multimodal definitions burdensome or otherwise unsatisfactory. While further techniques could be used to help increase respondent compliance with instructions, this study suggests that multimodal definitions, when thoughtfully designed, can improve data quality without negatively impacting respondents.

Item: Improving External Validity of Epidemiologic Analyses by Incorporating Data from Population-Based Surveys (2020)
Wang, Lingxiao; Li, Yan; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Many epidemiologic studies forgo probability sampling and turn to volunteer-based samples because of cost, confidentiality, response burden, and the invasiveness of biological samples. However, the volunteers may not represent the underlying target population, mainly due to self-selection bias. As a result, standard epidemiologic analyses may not be generalizable to the target population, a problem known as lack of “external validity.” In survey research, propensity score (PS)-based approaches have been developed to improve the representativeness of nonprobability samples by using population-based surveys as references.
These approaches create a set of “pseudo-weights” to weight the nonprobability sample up to the target population. There are two main types of PS-based approaches: (1) PS-based weighting methods, which use PSs to estimate participation rates of the nonprobability sample, such as inverse propensity score weighting (IPSW); and (2) PS-based matching methods, which use PSs to measure similarity between the units in the nonprobability sample and the reference survey sample, such as PS adjustment by subclassification (PSAS). Although the PS-based weighting methods reduce bias, they are sensitive to propensity model misspecification and can be inefficient. The PS-based matching methods are more robust to propensity model misspecification and can avoid extreme weights, but matching methods such as PSAS are less effective at bias reduction. This dissertation proposes a novel PS-based matching method, the kernel weighting (KW) approach, that improves the external validity of epidemiologic analyses by achieving a better bias–variance tradeoff. A unifying framework is established for PS-based methods, providing three advances. First, the KW method is proved to provide consistent estimates, yet generally has smaller mean-square error than the IPSW. Second, the framework reveals a fundamental strong exchangeability assumption (SEA) underlying the existing PS-based matching methods that had previously gone unrecognized. The SEA is relaxed to a weak exchangeability assumption that is more realistic for data analysis. Third, survey weights are scaled in propensity estimation to reduce the variance of the estimated PSs and improve the efficiency of all PS-based methods under the framework.
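The basic pseudo-weighting step behind IPSW can be sketched very simply: given estimated propensities of appearing in the volunteer sample, each volunteer is weighted by the inverse of that propensity, so underrepresented kinds of people count for more. The propensities and disease indicators below are invented; in practice the propensities come from a model fit to the combined volunteer and reference-survey data.

```python
def ipsw_pseudo_weights(propensities):
    """Inverse-of-propensity-score pseudo-weights for a nonprobability
    sample: units with a low estimated propensity of volunteering
    (underrepresented in the sample) get larger weights."""
    return [1.0 / p for p in propensities]

# Hypothetical estimated propensities for five volunteers.
ps = [0.8, 0.5, 0.5, 0.2, 0.1]
w = ipsw_pseudo_weights(ps)

y = [1, 0, 1, 0, 0]  # hypothetical disease indicator per volunteer
prevalence = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
print(round(prevalence, 3))  # well below the unweighted mean of 0.4
```

The sensitivity the abstract mentions is visible here: the unit with propensity 0.1 carries a weight of 10, so a misspecified propensity model can move the estimate a long way, which is what motivates the matching-style KW alternative.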
The performance of the proposed PS-based methods is evaluated for estimating the prevalence of diseases and the associations between risk factors and disease in the finite population.

Item: The Use of Email in Establishment Surveys (2019)
Langeland, Joshua Lee; Abraham, Katharine; Wagner, James; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation evaluates the effectiveness of using Email for survey solicitation, nonresponse follow-up, and notifications of upcoming scheduled interviews in an establishment survey setting. Reasons for interest in the use of Email include the possibility that it could reduce printing and postage expenses, speed responses, and encourage online reporting. To date, however, there has been limited research on the extent to which these benefits can in fact be realized in an establishment survey context. In order to send an Email for survey purposes, those administering a survey must have Email addresses for the units in the sample. One method for collecting Email addresses is to send a prenotification letter to sampled businesses prior to the initial survey invitation, informing respondents about the upcoming survey and requesting that they provide contact information for someone within the organization who will have knowledge of the survey topic. Relatively little is known, however, about what makes a prenotification letter more or less effective. The first experiment on which this dissertation reports varies the content of prenotification letters sent to establishments selected for participation in a business survey in order to identify how different features affect the probability of obtaining a respondent's Email address. In this experiment, neither survey sponsorship, appeal type, nor a message about saving taxpayer dollars had a significant impact on response.
The second experiment is a pilot study designed to compare the results of sending an initial Email invitation to participate in an establishment survey to the results of sending a standard postal invitation. Sampled businesses that provided an Email address were randomized into two groups. Half of the units in the experiment received the initial survey invitation by Email and the other half received the standard survey materials through postal mail; all units received the same nonresponse follow-up treatments. The analysis of this experiment focuses on response rates, timeliness of response, mode of response and cost per response. In this production environment, Email invitations achieved an equivalent response rate at reduced cost per response. Units receiving the Email invitation were more likely to report online, but it took them longer on average to respond. The third experiment built on the second and was an investigation into nonresponse follow-up procedures. In the second experiment, at the point when the cohort that received the initial survey invitation by Email received their first nonresponse follow-up, there was a large increase in response. The third experiment tests whether this large increase in response can be achieved by sending a follow-up Email instead of a postal reminder. Sampled units that provided an Email address were randomized into three groups. All units received the initial survey invitation by Email and all units also received nonresponse follow-ups by Email. The treatments varied in the point in the nonresponse follow-up period at which the Emails were augmented with a postal mailing. The analysis focuses on how this timing affects response rates and mode of response. The sequence that introduced postal mail early in nonresponse follow-up achieved the highest final response rate. All mode sequences were successful in encouraging online data reporting. 
The fourth and final experiment studies the use of Email in a monthly business panel survey conducted through Computer Assisted Telephone Interviewing (CATI). After the first month in which an interviewer in this survey collects data from a business, she schedules a date to call and collect data the following month. The current procedure is to send a postcard to the business a few days prior to the scheduled appointment to serve as a reminder of the upcoming interview. The fourth experiment investigates the effects of replacing this reminder postcard with an Email. Businesses in a sample that included both businesses for which the survey organization had an Email address and businesses for which no Email address was available were randomized into three groups. The first group acted as the control and received the standard postcard; the second group received an Email reminder instead of the postcard, provided an Email address was available; and the third group received an Email reminder with an iCalendar attachment instead of the postcard, again provided an Email address was available. Results focus on response rates, call length, percent of units reporting on time, and number of calls to respondents. The experiment found that the use of Email as a reminder for a scheduled interview significantly increased response rates and decreased the effort required to collect data.

Item: A Unifying Parametric Framework for Estimating Finite Population Totals from Complex Samples (2019)
Flores Cervantes, Ismael; Brick, J. Michael; Kreuter, Frauke; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
We propose a unifying framework for improving the efficiency of design-based estimators of finite population characteristics under full response. We call the framework a Parametric (PA) approach.
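The model-assisted estimation that the PA framework extends can be sketched with a textbook GREG-style difference estimator: a working model predicts the outcome for the whole population, and design-weighted residuals correct the prediction. This is a generic illustration under invented data, not the PA estimator itself.

```python
import numpy as np

def greg_total(y, x, pi, X_total, N):
    """Model-assisted (GREG-style) estimate of a population total using
    a simple linear working model y ~ a + b*x fit to the sample, plus a
    design-weighted correction from the sample residuals."""
    w = 1.0 / pi                         # design (inverse-inclusion) weights
    b, a = np.polyfit(x, y, 1, w=w)      # weighted degree-1 working model
    fitted_total = a * N + b * X_total   # model prediction summed over population
    residual_total = np.sum(w * (y - (a + b * x)))
    return fitted_total + residual_total

# Hypothetical sample of 5 units from a population of N=100 whose
# auxiliary variable has known total X_total.
y = np.array([10.0, 12.0, 20.0, 24.0, 30.0])
x = np.array([1.0, 1.2, 2.0, 2.5, 3.0])
pi = np.full(5, 0.05)                    # equal-probability sample
print(round(greg_total(y, x, pi, X_total=200.0, N=100), 1))
```

When the working model fits perfectly, the residual correction vanishes and the estimate reproduces the model prediction exactly; when it fits poorly, the residual term keeps the estimator approximately design-unbiased, which is the "assisted" part of model-assisted.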
The PA framework, an extension of model-assisted theory, uses an algorithmic approach driven by the observed data. The algorithm identifies the relevant subset of auxiliary variables related to the outcome, and the known population totals of these variables are used to compute the PA estimator. We apply the PA framework to three important estimation problems: the identification of the functional form of a design-based estimator based on the observed data; the identification of the working or assisting model; and the development of a methodology for creating new design-based estimators. The PA estimators are theoretically justified and evaluated by simulations. This dissertation is limited to single-stage sample designs with full response, but the framework can be extended to other sample designs and to estimation with nonresponse.

Item: Selection Bias in Nonprobability Surveys: A Causal Inference Approach (2018)
Mercer, Andrew William; Kreuter, Frauke; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Many in the survey research community have expressed concern at the growing popularity of nonprobability surveys. The absence of random selection prompts justified concerns about self-selection producing biased results and means that traditional, design-based estimation is inappropriate. The Total Survey Error (TSE) paradigm’s designations of selection bias as attributable to undercoverage or nonresponse are not especially helpful for nonprobability surveys, as they rest on an implicit assumption that selection and inference rely on randomization. This dissertation proposes an alternative classification of the sources of selection bias in nonprobability surveys based on principles borrowed from the field of causal inference.
The proposed typology describes selection bias in terms of the three conditions that are required for a statistical model to correct or explain systematic differences between a realized sample and the target population: exchangeability, positivity, and composition. We describe the parallels between causal and survey inference and explain how these three sources of bias operate in both probability and nonprobability survey samples. We then provide a critical review of current practices in nonprobability data collection and estimation, viewed through the lens of the causal bias framework. Next, we show how net selection bias can be decomposed into separate additive components associated with exchangeability, positivity, and composition, respectively. Using 10 parallel nonprobability surveys from different sources, we estimate these components for six measures of civic engagement, with the 2013 Current Population Survey Civic Engagement Supplement as a reference sample. We find that a large majority of the bias can be attributed to a lack of exchangeability. Finally, using the same six measures of civic engagement, we compare the performance of four approaches to nonprobability estimation based on Bayesian additive regression trees: propensity weighting (PW), outcome regression (OR), and two types of doubly robust estimators, outcome regression with a residual bias correction (OR-RBC) and outcome regression with a propensity score covariate (OR-PSC). We find that OR-RBC tends to have the lowest bias, variance, and RMSE, with PW only slightly worse on all three measures.

Item: Model-Assisted Estimators for Time-to-Event Data (2017)
Reist, Benjamin Martin; Valliant, Richard; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In this dissertation, I develop model-assisted estimators for estimating the proportion of a population that experienced some event by time t.
I provide the theoretical justification for the new estimators using time-to-event models as the underlying framework. Using simulation, I compared these estimators to traditional methods; I then applied the estimators to a study of nurses’ health, estimating the proportion of the population that had died after a certain period of time. The new estimators performed as well as, if not better than, existing methods. Finally, as this work assumes that all units are censored at the same point in time, I propose an extension that allows units’ censoring times to vary.

Item: INVESTIGATION OF ALTERNATIVE CALIBRATION ESTIMATORS IN THE PRESENCE OF NONRESPONSE (2017)
Han, Daifeng; Valliant, Richard; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Calibration weighting is widely used to decrease variance, reduce nonresponse bias, and improve the face validity of survey estimates. In the purely sampling context, Deville and Särndal (1992) demonstrate that many alternative forms of calibration weighting are asymptotically equivalent, so for variance estimation purposes, the generalized regression (GREG) estimator can be used to approximate some general calibration estimators with no closed-form solutions, such as raking. It is unclear whether this conclusion holds when nonresponse exists and single-step calibration weighting is used to reduce nonresponse bias (i.e., calibration is applied to the basic sampling weights directly, without a separate nonresponse adjustment step). In this dissertation, we first examine whether alternative calibration estimators may perform differently in the presence of nonresponse. More specifically, we evaluate the properties of three widely used calibration estimators: the GREG with only main-effect covariates (GREG_Main), poststratification, and raking. In practice, the choice between poststratification and raking is often based on sample sizes and the availability of external data.
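Raking, one of the calibration estimators compared here, is iterative proportional fitting: weights in a cross-classified table are alternately scaled so that row and column sums match known population margins. A minimal sketch follows; the respondent counts and margins are invented for illustration.

```python
import numpy as np

def rake(weights, margins, tol=1e-8, max_iter=100):
    """Iterative proportional fitting on a 2-D weight table so that its
    row and column sums match known population margins."""
    w = weights.astype(float).copy()
    row_targets, col_targets = margins
    for _ in range(max_iter):
        w *= (row_targets / w.sum(axis=1))[:, None]   # match row margins
        w *= (col_targets / w.sum(axis=0))[None, :]   # match column margins
        if np.allclose(w.sum(axis=1), row_targets, rtol=0, atol=tol):
            break  # rows still match after the column step: converged
    return w

# Hypothetical respondent counts by sex (rows) and age group (columns),
# calibrated to known population margins.
sample = np.array([[30.0, 20.0],
                   [25.0, 25.0]])
raked = rake(sample, margins=(np.array([60.0, 40.0]),
                              np.array([55.0, 45.0])))
print(raked.round(2))
```

Unlike poststratification, raking needs only the marginal totals, not the full cross-classified cell counts, which is why the choice between the two often comes down to sample sizes and what external data are available.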
Also, the raking variance is often approximated by a linear substitute containing residuals from a GREG_Main model. Our theoretical development and simulation work demonstrate that, with nonresponse, poststratification, GREG_Main, and raking may perform differently, and survey practitioners should examine both the outcome model and the response pattern when choosing among these estimators. We then propose a distance measure that can be estimated for raking or GREG_Main from a given sample. Our analytical work shows that the distance measure follows a chi-square distribution when raking or GREG_Main is unbiased. A large distance measure is a warning sign of potential bias and poor confidence interval coverage for some variables in a survey due to omitting a significant interaction term in the calibration process. Finally, we examine several alternative variance estimators for raking with nonresponse. Our simulation results show that when raking is model-biased, none of the linearization variance estimators under evaluation is unbiased. In contrast, the jackknife replication method performs well in variance estimation, although the confidence interval may still be centered in the wrong place if the point estimate is inaccurate.

Item: Enhancing the Understanding of the Relationship between Social Integration and Nonresponse in Household Surveys (2015)
Amaya, Ashley Elaine; Presser, Stanley; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Nonresponse and nonresponse bias remain fundamental concerns for survey researchers, as understanding them is critical to producing accurate statistics. This dissertation tests the relationship between social integration, nonresponse, and nonresponse bias.
Using the rich frame information available in the American Time Use Survey (ATUS) and the Survey of Health, Ageing and Retirement in Europe (SHARE) Wave II, structural equation models were employed to create latent indicators of social integration. The resulting variables were used to predict nonresponse and its components (e.g., noncontact). In both surveys, social integration was significantly predictive of nonresponse (regardless of the type of nonresponse), with integrated individuals more likely to respond. However, the relationship was driven by different components of integration across the two surveys. Full-sample estimates were compared to respondent estimates on a series of 40 dichotomous and categorical variables to test the hypothesis that variables measuring social activities and roles would suffer from nonresponse bias. The impact of nonresponse on multivariate models predicting social outcomes was also evaluated. Nearly all of the 40 assessed variables suffered from significant nonresponse bias, resulting in the overestimation of social activity and role participation. In general, civic and political variables suffered from higher levels of bias, but the differences were not significant. Multivariate models were not exempt: beta coefficients were frequently biased, although the direction of the bias was inconsistent and its magnitude often small. Finally, an indicator of social integration was added to the weighting methodology with the goal of eliminating the observed nonresponse bias.
While the addition significantly reduced the bias in most instances compared to both the base-weighted and traditionally weighted estimates, the improvements were small and did little to eliminate the bias.

Item: Rapport and Its Impact on the Disclosure of Sensitive Information in Standardized Interviews (2014)
Sun, Hanyu; Conrad, Frederick G.; Kreuter, Frauke; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Although there is no universally accepted way to define and operationalize rapport, the general consensus is that it can have an impact on survey responses, potentially affecting their quality. Moderately sensitive information is often collected in interviewer-administered modes of data collection. Although rapport-related verbal behaviors have been found to increase the disclosure of moderately sensitive information in face-to-face interactions, it is unknown whether rapport can be established to the same extent in video-mediated interviews, leading to similar levels of disclosure. Highly sensitive information is usually collected via self-administered modes of data collection. For some time, audio computer-assisted self-interviewing (ACASI) has been seen as one of the best methods for collecting sensitive information. Typically, the respondent first answers questions about nonsensitive topics in computer-assisted personal interviewing (CAPI) and is then switched to ACASI for sensitive questions. None of the existing research has investigated the possibility that the interviewer-respondent interaction prior to the ACASI questions may affect disclosures in ACASI. This dissertation used a laboratory experiment made up of two related studies aimed at answering these questions. The first study compares video-mediated interviews with CAPI to investigate whether rapport can be similarly established in video-mediated interviews, leading to similar levels of disclosure.
There was no significant difference in rapport ratings between video-mediated and CAPI interviews, suggesting no evidence that rapport is any better established in CAPI than in video-mediated interviews. Compared with CAPI, higher disclosure of moderately sensitive information was found in video-mediated interviews, though the effects were only marginally significant. The second study examines whether the interviewer-respondent interaction prior to the ACASI questions may affect disclosure in ACASI. There was no significant difference in disclosure between the same-voice and different-voice conditions. However, there were marginally significant carryover effects of rapport in the preceding module on disclosure in the subsequent ACASI module: respondents who experienced high rapport in the preceding module disclosed more in the subsequent ACASI module. Furthermore, compared with ACASI, the percentage of reported sensitive behaviors was higher in video-mediated interviews for some of the highly sensitive questions.

Item: Testing for Phase Capacity in Surveys with Multiple Waves of Nonrespondent Follow-Up (2014)
Lewis, Taylor Hudson; Lahiri, Partha; Kreuter, Frauke; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
To mitigate the potentially harmful effects of nonresponse, many surveys repeatedly follow up with nonrespondents, often targeting a particular response rate or a predetermined number of completes. Each additional recruitment attempt generally brings in a new wave of data, but returns gradually diminish over the course of a fixed data collection protocol. This is because each subsequent wave tends to contain fewer and fewer new responses, thereby resulting in smaller and smaller changes in (nonresponse-adjusted) point estimates. Consequently, these estimates begin to stabilize.
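This stabilization of wave-over-wave estimates can be illustrated with a deliberately naive check: flag the first wave at which the estimate moves less than some tolerance. The wave estimates and tolerance below are invented, and this is only a stand-in for the formal imputation- and reweighting-based tests the dissertation develops.

```python
def phase_capacity_wave(estimates, tol=0.005):
    """Return the index of the first wave whose estimate changed by less
    than `tol` from the previous wave (a crude stand-in for a formal
    phase capacity test), or None if the estimates never stabilize."""
    for t in range(1, len(estimates)):
        if abs(estimates[t] - estimates[t - 1]) < tol:
            return t
    return None

# Hypothetical estimate of a proportion after each follow-up wave:
# large early movements, then diminishing returns.
waves = [0.42, 0.47, 0.49, 0.493, 0.494]
print(phase_capacity_wave(waves))
```

A formal test replaces the fixed tolerance with a significance test of whether the latest wave changed the (nonresponse-adjusted) estimate, which is the retrospective question the first two studies address.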
This is the notion of phase capacity, which suggests that some form of design change is in order, such as switching modes, increasing the incentive, or, as is considered exclusively in this research, discontinuing the nonrespondent follow-up campaign altogether. This dissertation consists of three methodological studies proposing and assessing various techniques survey practitioners can use to formally test for phase capacity. One of the earliest known phase capacity testing methods proposed in the literature calls for multiply imputing nonrespondents' missing data to assess, retrospectively, whether the most recent wave of data significantly altered a key estimate. The first study introduces an adaptation of this test amenable to surveys that instead reweight the observed data to compensate for nonresponse. A general limitation of the methods discussed in the first study is that they apply to only a single point estimate. The second study evaluates two extensions, each with the aim of producing a universal, yes-or-no phase capacity determination for a battery of point estimates. The third study builds on ideas of a prospective phase capacity test recently proposed in the literature, attempting to address the question of whether an imminent wave of data will significantly alter a key estimate. All three studies include a simulation study and an application using data from the 2011 Federal Employee Viewpoint Survey.

Item: A COMPARISON OF EX-ANTE, LABORATORY, AND FIELD METHODS FOR EVALUATING SURVEY QUESTIONS (2014)
Maitland, Aaron; Presser, Stanley; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
A diverse range of evaluation methods is available for detecting measurement error in survey questions. Ex-ante question evaluation methods are relatively inexpensive, because they do not require data collection from survey respondents.
Other methods require data collection from respondents, either in the laboratory or in the field. Research has explored how effective some of these methods are at identifying problems relative to one another. However, a weakness of most of these studies is that they do not compare the full range of question evaluation methods currently available to researchers. The purpose of this dissertation is to understand how the methods researchers use to evaluate survey questions influence the conclusions they draw about the questions. In addition, the dissertation seeks to identify more effective ways to use the methods together. It consists of three studies. The first study examines the extent of agreement between ex-ante and laboratory methods in identifying problems and compares how well the methods predict differences between questions whose validity has been estimated in record-check studies. The second study evaluates the extent to which ex-ante and laboratory methods predict the performance of questions in the field, as measured by indirect assessments of data quality such as behavior coding, response latency, and item nonresponse. The third study evaluates the extent to which ex-ante, laboratory, and field methods predict the reliability of answers to survey questions, as measured by stability over time. The findings suggest (1) that a multiple-method approach to question evaluation is the best strategy, given differences between the methods in their ability to detect different types of problems, and (2) how to combine methods more effectively in the future.

Item: TOPICS IN MODEL-ASSISTED POINT AND VARIANCE ESTIMATION IN CLUSTERED SAMPLES (2013)
Kennel, Timothy; Valliant, Richard; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation describes three distinct research papers.
Although each research topic is different, all three papers deal with innovations to model-assisted estimators, and all three explore different aspects of estimating totals, means, and rates from clustered samples. New estimators are presented, their theoretical properties are explored, and simulations are used to examine their design-based properties in realistic situations. After an introductory chapter, we show how leverage adjustments can be made to sandwich variance estimators to improve variance estimates of Generalized Regression estimators in two-stage samples. In the third chapter, we explore multinomial logistic-assisted estimators of finite population totals in clustered samples. In the final chapter, we use generalized linear models to assist in estimating finite population totals in cluster samples.

Item Classifying Mouse Movements and Providing Help in Web Surveys(2013) Horwitz, Rachel; Conrad, Frederick G; Kreuter, Frauke; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.) Survey administrators go to great lengths to make sure survey questions are easy to understand for a broad range of respondents. Despite these efforts, respondents do not always understand what the questions ask of them. In interviewer-administered surveys, interviewers can pick up on cues suggesting that the respondent does not understand or know how to answer the question and can provide assistance as their training allows. However, due to the high costs of interviewer administration, many surveys are moving toward other survey modes (at least for some respondents) that do not include costly interviewers, and with that, a valuable source of clarification is gone. In Web surveys, researchers have experimented with providing real-time assistance to respondents who take a long time to answer a question.
Help provided in such a fashion has resulted in increased accuracy, but some respondents do not like the imposition of unsolicited help. There may be alternative ways to provide help that refine or overcome the limitations of using response times. This dissertation is organized into three separate studies, each using a set of independently collected data, that identify indicators survey administrators can use to determine when a respondent is having difficulty answering a question and propose alternative ways of providing real-time assistance that increase accuracy as well as user satisfaction. The first study identifies nine movements that respondents make with the mouse cursor while answering survey questions and hypothesizes, using exploratory analyses, which movements are related to difficulty. The second study confirms use of these movements and uses hierarchical modeling to identify the four movements that are most predictive. The third study tests three different modes of providing unsolicited help to respondents: text box, audio recording, and chat. Accuracy and respondent satisfaction are evaluated for each mode. There were no differences in accuracy across the three modes, but participants reported a preference for receiving help in a standard text box. These findings allow survey designers to identify difficult questions on a larger scale than previously possible and to increase accuracy by providing real-time assistance while maintaining respondent satisfaction.

Item Adjustments for Nonresponse, Sample Quality Indicators, and Nonresponse Error in a Total Survey Error Context(2012) Ye, Cong; Tourangeau, Roger; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.) The decline in response rates in surveys of the general population is regarded by many researchers as one of the greatest threats to contemporary surveys. Much research has focused on the consequences of nonresponse.
However, because the true values for nonrespondents are rarely known, it is difficult to estimate the magnitude of nonresponse bias or to develop effective methods for predicting and adjusting for it. This research uses two datasets that have records on each person in the frame to evaluate the effectiveness of adjustment methods aimed at correcting nonresponse bias, to study indicators of sample quality, and to examine the relative magnitude of nonresponse bias under different modes. The results suggest that neither response propensity weighting nor GREG weighting is effective in reducing the nonresponse bias present in the study data. There are some reductions in error, but they are limited. The comparison between response propensity weighting and GREG weighting shows that, with the same set of auxiliary variables, the choice between the two makes little difference. The evaluation of the R-indicators and the penalized R-indicators, using the study datasets and a simulation study, suggests that the penalized R-indicators perform better in assessing sample quality: the penalized R-indicator tracks the pattern of the estimated biases more closely than the R-indicator does. Finally, the comparison of nonresponse bias to other types of error finds that nonresponse bias in these two datasets may be larger than sampling error and coverage bias, but measurement bias can in turn be larger than nonresponse bias, at least for sensitive questions. Postsurvey adjustments do not result in a substantial reduction in total survey error.
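As a rough illustration of the tools compared in this study, the sketch below implements a simple cell-based response propensity adjustment and the (unpenalized) R-indicator, R = 1 − 2·S(p̂), where S(p̂) is the weighted standard deviation of the estimated propensities. The weighting-cell propensity model and the function names are assumptions for illustration; the penalized R-indicator studied in the dissertation is not shown.

```python
import numpy as np

def cell_propensities(cells, respondent, weights):
    """Estimate response propensities within weighting cells.
    cells      : cell label for every sampled unit
    respondent : boolean, True if the unit responded
    weights    : base (design) weights
    Returns an estimated propensity for every sampled unit."""
    p = np.empty(len(cells), dtype=float)
    for c in np.unique(cells):
        m = cells == c
        p[m] = np.sum(weights[m] * respondent[m]) / np.sum(weights[m])
    return p

def r_indicator(p, weights):
    """R-indicator: 1 - 2 * weighted std. dev. of the estimated propensities.
    Values near 1 suggest a more 'representative' respondent pool."""
    pbar = np.average(p, weights=weights)
    s = np.sqrt(np.average((p - pbar) ** 2, weights=weights))
    return 1.0 - 2.0 * s

def adjusted_mean(y_resp, p_resp, weights_resp):
    """Propensity-adjusted respondent mean: base weights divided by propensities."""
    w = weights_resp / p_resp
    return np.sum(w * y_resp) / np.sum(w)
```

A fully homogeneous set of propensities gives R = 1; the more the propensities vary across cells, the further the indicator falls below 1.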
We conclude that (1) efforts put into dealing with nonresponse bias are warranted; (2) the effectiveness of weighting adjustments for nonresponse depends on the availability and quality of the auxiliary variables; and (3) the penalized R-indicator may be more helpful than the R-indicator in monitoring the quality of the survey.

Item Beyond Response Rates: The Effect of Prepaid Incentives on Measurement Error(2012) Medway, Rebecca; Tourangeau, Roger; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.) As response rates continue to decline, survey researchers increasingly offer incentives to motivate sample members to take part in their surveys. Extensive prior research demonstrates that prepaid incentives are an effective tool for doing so. If prepaid incentives influence behavior at the stage of deciding whether or not to participate, they may also alter the way respondents behave while completing surveys. Nevertheless, most research has focused narrowly on the effect that incentives have on response rates. Survey researchers should have a better empirical basis for assessing the potential tradeoffs associated with the higher response rates yielded by prepaid incentives. This dissertation describes the results of three studies aimed at expanding our understanding of the impact of prepaid incentives on measurement error. The first study explored the effect that a $5 prepaid cash incentive had on twelve indicators of respondent effort in a national telephone survey. The incentive led to significant reductions in item nonresponse and interview length. However, it had little effect on the other indicators, such as response order effects and responses to open-ended items. The second study evaluated the effect that a $5 prepaid cash incentive had on responses to sensitive questions in a mail survey of registered voters.
The incentive resulted in a significant increase in the proportion of highly undesirable attitudes and behaviors to which respondents admitted, and it had no effect on responses to less sensitive items. While the incentive led to a general pattern of reduced nonresponse bias and increased measurement bias for the three voting items for which administrative data were available for the full sample, these effects generally were not significant. The third study tested for measurement invariance in incentive and control group responses to four multi-item scales from three recent surveys that included prepaid incentive experiments. There was no evidence of differential item functioning; however, full metric invariance could not be established for one of the scales. Generally, these results suggest that prepaid incentives had minimal impact on measurement error. These findings should be reassuring for survey researchers considering the use of prepaid incentives to increase response rates.

Item Respondent Consent to Use Administrative Data(2012) Fulton, Jenna Anne; Presser, Stanley; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.) Surveys increasingly request respondents' consent to link survey responses with administrative records. Such linked data can enhance the utility of both the survey and administrative data, yet in most cases this linkage is contingent upon respondents' consent. With evidence of declining consent rates, there is a growing need to understand the factors associated with consent to record linkage. This dissertation presents the results of three research studies that investigate factors associated with consenting. In the first study, we draw upon surveys conducted in the U.S.
with consent requests to describe the characteristics of surveys containing such requests, examine trends in consent rates over time, and evaluate the effects of several characteristics of the survey and consent request on consent rates. The results of this study suggest that consent rates are declining over time and that some characteristics of the survey and consent request are associated with variations in consent rates, including survey mode, administrative record topic, the personal identifier requested, and whether the consent request takes an explicit or opt-out approach. In the second study, we administered a telephone survey to examine the effect of administrative record topic on consent rates using experimental methods and, through non-experimental methods, investigated the influence on consent rates of respondents' privacy, confidentiality, and trust attitudes and of consent request salience. The results of this study indicate that respondents' confidentiality attitudes are related to their consent decision; the other factors examined appear to have less of an impact on consent rates in this survey. The final study used data from the 2009 National Immunization Survey (NIS) to assess the effects of interviewers and interviewer characteristics on respondents' willingness to consent to vaccination provider contact. The results of this study suggest that interviewers vary in their ability to obtain respondents' consent and that some interviewer characteristics are related to consent rates, including gender and amount of previous experience on the NIS.

Item Effects of Acoustic Perception of Gender on Nonsampling Errors in Telephone Surveys(2012) Kenney McCulloch, Susan; Kreuter, Frauke; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.) Many telephone surveys require interviewers to observe and record respondents' gender based solely on respondents' voice.
Researchers may rely on these observations to (1) screen for study eligibility; (2) determine skip patterns; (3) foster interviewer tailoring strategies; (4) contribute to nonresponse assessment and adjustments; (5) inform post-stratification weighting; and (6) design experiments. Gender is also an important covariate for understanding attitudes and behavior in many disciplines. Yet, despite this fundamental role in research, survey documentation suggests there is significant variation in how gender is measured and collected across organizations. Approaches to collecting respondent gender include (1) asking the respondent; (2) interviewer observation only; (3) a combination of observation aided by asking when needed; or (4) another method. But what is the efficacy of these approaches? Are there predictors of observational errors? What are the consequences of interviewer misclassification of respondent gender for survey outcomes? Measurement error in interviewers' observations of respondent gender has never been examined by survey methodologists. Using recent paradata work and the linguistics literature on acoustic gender determination as a foundation, the goal of my dissertation is to identify the implications for survey research of using interviewers' observations collected in a telephone interviewing setting. The dissertation is organized into three journal-style papers. Through a survey of survey organizations, the first paper finds that more than two-thirds of firms collect respondent gender by some form of interviewer observation. Placement of the observation, rationale for the chosen collection methods, and uses of these paradata are documented. In paper two, utilizing existing recordings of survey interviews, the experimental research finds that the accuracy of interviewer observations improves with increased exposure.
The noisy environment of a centralized phone room does not appear to threaten the quality of gender observations. Interviewer- and respondent-level covariates of misclassification are also discussed. Analyzing secondary data, the third paper finds that there are some consequences of incorrect interviewer observations of respondents' gender for survey estimates. Findings from this dissertation will contribute to the paradata literature and provide survey practitioners with guidance in the use and collection of interviewer observations, specifically gender, to reduce sources of nonsampling error.

Item The Use of Responsive Split Questionnaires in a Panel Survey(2012) Gonzalez, Jeffrey Mark; Valliant, Richard; Survey Methodology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.) Lengthy surveys may be associated with high respondent burden, low data quality, and high unit nonresponse. To address these concerns, survey designers may reduce the length of a survey by eliminating questions from the original questionnaire, but this means that some information would never be collected. An alternative is to divide a lengthy questionnaire into subsets of survey items and then administer each subset to a distinct subsample of the full sample. This is referred to as a split questionnaire design and has the benefit of collecting all of the original survey information. We identify a significant deficiency in the current set of split questionnaire methods, namely, the incomplete use of prior information about the sample unit in the design. In most contemporary applications of split questionnaires, generally only characteristics of the survey items (e.g., content, cognitive burden) are used to inform the design; however, if joint consideration is given to characteristics of the survey items as well as of the sample unit when designing a split questionnaire, then there may be potential to improve the split questionnaire's utility.
In this dissertation, we explore whether, and to what extent, jointly considering both types of information at the design stage yields more efficient split questionnaires. We propose various methods for incorporating prior information about the sample unit into the split questionnaire using features of responsive design. We highlight how this specific application of a responsive split questionnaire can be used to address the concerns present in a major federal survey. Finally, we draw from the literature on survey design, experimental design, and epidemiology to develop and implement a framework for evaluating the proposed new elements of our split questionnaire design.
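The basic (non-responsive) split questionnaire mechanic described above, in which every respondent receives a core set of items plus one randomly assigned block of the remaining items, can be sketched as follows. The function name and interface are illustrative assumptions; a responsive version, as proposed in this dissertation, would instead choose each unit's block using prior information about that unit.

```python
import numpy as np

def assign_forms(respondent_ids, item_blocks, core_items, seed=0):
    """Split questionnaire assignment: every respondent gets the core items
    plus one randomly chosen block of the remaining items.
    Returns a dict mapping respondent id -> list of item names."""
    rng = np.random.default_rng(seed)
    # One block index per respondent, drawn uniformly at random
    choices = rng.integers(0, len(item_blocks), size=len(respondent_ids))
    return {r: core_items + item_blocks[c]
            for r, c in zip(respondent_ids, choices)}
```

Across the full sample, every original item is still collected from some subsample, which is the key advantage over simply dropping questions.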