Human Development & Quantitative Methodology

Permanent URI for this community: http://hdl.handle.net/1903/2248

The departments within the College of Education were reorganized and renamed as of July 1, 2011. This department incorporates the former departments of Measurement, Statistics & Evaluation; Human Development; and the Institute for Child Study.

Search Results

Now showing 1 - 10 of 48
  • Item
    A Mean-Parameterized Conway–Maxwell–Poisson Multilevel Item Response Theory Model for Multivariate Count Response Data
    (2024) Strazzeri, Marian Mullin; Yang, Ji Seung; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Multivariate count data arise frequently in the process of measuring a latent construct in human development, psychology, medicine, education, and the social sciences. Some examples include the number of different types of mistakes a student makes when reading a passage of text, or the number of nausea, vomiting, diarrhea, and/or dysphagia episodes a patient experiences in a given day. These response data are often sampled from multiple sources and/or in multiple stages, yielding a multilevel data structure with lower level sampling units (e.g., individuals, such as students or patients) nested within higher level sampling units or clusters (e.g., schools, clinical trial sites, studies). Motivated by real data, a new Item Response Theory (IRT) model is developed for the integrative analysis of multivariate count data. The proposed mean-parameterized Conway–Maxwell–Poisson Multilevel IRT (CMPmu-MLIRT) model differs from currently available models in its ability to yield sound inferences when applied to multilevel, multivariate count data, where exposure (the length of time, space, or number of trials over which events are recorded) may vary across individuals, and items may provide different amounts of information about an individual’s level of the latent construct being measured (e.g., level of expressive language development, math ability, disease severity). Estimation feasibility is demonstrated through a Monte Carlo simulation study evaluating parameter recovery across various salient conditions. Mean parameter estimates are shown to be well aligned with true parameter values when a sufficient number of items (e.g., 10) are used, while recovery of dispersion parameters may be challenging when as few as 5 items are used.
In a second Monte Carlo simulation study, to demonstrate the need for the proposed CMPmu-MLIRT model over currently available alternatives, the impact of CMPmu-MLIRT model misspecification is evaluated with respect to model parameter estimates and corresponding standard errors. Treating an exposure that varies across individuals as though it were fixed is shown to notably overestimate item intercept and slope estimates and, when substantial variability in the latent construct exists among clusters, to underestimate that variance. Misspecifying the number of levels (i.e., fitting a single-level model to multilevel data) is shown to overestimate item slopes, especially when substantial variability in the latent construct exists among clusters, and to compound the overestimation of item slopes when a varying exposure is also misspecified as fixed. Misspecifying the conditional item response distributions as Poisson for underdispersed items and negative binomial for overdispersed items is shown to bias estimates of between-cluster variability in the latent construct. Lastly, the applicability of the proposed CMPmu-MLIRT model to empirical data was demonstrated in the integrative data analysis of oral language samples.
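To make the mean parameterization concrete, the following is a minimal pure-Python sketch (not the dissertation's estimator): it evaluates a truncated Conway–Maxwell–Poisson pmf, whose unnormalized mass at count y is lam**y / (y!)**nu, and solves by bisection for the rate lam that yields a target mean mu. The truncation point and target values are illustrative assumptions.

```python
import math

def cmp_pmf(lam, nu, max_y=300):
    """Truncated CMP(lam, nu) pmf via log-sum-exp normalization."""
    logs = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_y)]
    m = max(logs)
    w = [math.exp(l - m) for l in logs]
    z = sum(w)
    return [v / z for v in w]

def cmp_mean(lam, nu):
    return sum(y * p for y, p in enumerate(cmp_pmf(lam, nu)))

def lam_for_mean(mu, nu, lo=1e-6, hi=1e4, iters=100):
    """Mean parameterization: bisect for the lam whose CMP mean equals mu
    (the CMP mean is monotone increasing in lam)."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if cmp_mean(mid, nu) < mu:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

lam_pois = lam_for_mean(4.0, 1.0)   # nu = 1 reduces to Poisson, so lam ~= mean
lam_under = lam_for_mean(4.0, 1.5)  # nu > 1 gives an underdispersed count model
p_under = cmp_pmf(lam_under, 1.5)
var_under = sum(y * y * q for y, q in enumerate(p_under)) - 4.0 ** 2
```

With nu = 1 the solved rate coincides with the mean (the Poisson special case), while nu > 1 keeps the mean at 4 but pulls the variance below it, which is the underdispersion the abstract refers to.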
  • Item
    THE USE OF RANDOM FORESTS IN PROPENSITY SCORE WEIGHTING
    (2023) Zheng, Yating; Stapleton, Laura; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    An important problem in social science research is the estimation of causal effects in observational studies. Propensity score methods, as effective ways to remove selection bias, have been widely used to estimate causal effects in observational studies. An important step in propensity score methods is estimating the propensity score. Recently, a machine learning method, random forests, has been proposed as an alternative to the conventional method of logistic regression for estimating the propensity score, as it requires less stringent assumptions and provides less biased and more reliable estimates of the treatment effect. However, previous studies covered only limited conditions, with a small number of covariates and medium sample sizes, leaving the generalizability of the results in doubt. In addition, previous studies have seldom explored how to choose the hyperparameters in random forests in the context of propensity score methods. This dissertation, via a simulation study, aims to 1) make a more comprehensive comparison between the use of random forests and logistic regression to determine which model performs better under what conditions, and 2) explore the effects of the hyperparameters on the estimate of the treatment effect in propensity score weighting. An empirical study is also used to illustrate how to choose the hyperparameters in random forests using propensity score weighting in practical settings.
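The basic workflow the abstract describes can be sketched as follows: fit a random forest to predict treatment from covariates, use the predicted class probabilities as propensity scores, and form an inverse-probability-weighted treatment-effect estimate. All data-generating values and hyperparameter settings below are illustrative, not the dissertation's simulation design.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 4))
# Treatment assignment depends on the first two covariates (confounding)
p_true = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
T = rng.binomial(1, p_true).astype(bool)
# Outcome with a true treatment effect of 2.0
Y = 2.0 * T + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

# Propensity scores from a random forest; hyperparameter values are illustrative
rf = RandomForestClassifier(n_estimators=200, min_samples_leaf=50, random_state=0)
ps = rf.fit(X, T).predict_proba(X)[:, 1]
ps = np.clip(ps, 0.01, 0.99)   # guard against extreme weights

# Normalized (Hajek) inverse-probability-weighted ATE estimate
ate = (np.average(Y[T], weights=1.0 / ps[T])
       - np.average(Y[~T], weights=1.0 / (1.0 - ps[~T])))
```

The `min_samples_leaf` setting is exactly the kind of hyperparameter the dissertation investigates: leaves that are too small produce overfit, poorly calibrated propensity scores and unstable weights.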
  • Item
    Construct measurement error in latent social network relationship: An item response theory based latent space model
    (2023) Ding, Yishan; Sweet, Tracy; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Research on measurement error in social network analysis has primarily focused on proxy measurement error, which refers to inadequate or inaccurate observations of proxy measurements of social relationships. However, construct measurement error, a key concern in modern psychometric studies, has received less attention in social network studies. Construct measurement error is particularly relevant for social network relationships that are difficult or impossible to observe explicitly, such as friendships, which are better conceptualized as latent constructs. Researchers have long advocated using multi-item scales for social relationships to address construct measurement error (Marsden, 1990). However, there is a lack of methods tailored for multivariate social network analysis using multi-item measurements. When data on social network ties are collected from multiple items, prevalent strategies involve either choosing a representative item or analyzing each item as a distinct network. To accommodate construct measurement error in social network analysis, this study proposes a new model, termed the IRT-LSM, that integrates an item response theory (IRT) model into a latent space model (LSM). The proposed method leverages the IRT model to take advantage of a multi-item scale to enhance the measurement of latent social relationships, providing a more comprehensive understanding of social relationships than relying on a single item. To evaluate the efficacy of this novel approach, the dissertation comprises three simulation studies: one assessing model feasibility and the impact of construct measurement error, a second exploring various misspecification models, and a third investigating the effects of item parameter distributions. Additionally, an empirical data analysis demonstrates the practical application of the IRT-LSM in real-world settings.
The results underscore the effectiveness of the IRT-LSM in addressing construct measurement error. The model consistently yields unbiased estimates and demonstrates robustness against various factors influencing its performance across the simulated conditions. Notably, the IRT-LSM outperforms naive approaches that neglect construct measurement error, and the two approaches lead to divergent conclusions in the empirical data analyses.
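A schematic generative sketch of the idea, with made-up parameter values: actor positions in a latent space determine a dyadic tie propensity (the LSM part), and each dyad's multiple binary items are noisy indicators of that same latent relationship (the IRT part). The dissertation's actual likelihood and estimation procedure are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(7)
n, K = 30, 5                              # actors; items per dyad
z = rng.normal(size=(n, 2))               # latent social-space positions
alpha = 1.0                               # baseline tie propensity (assumed)
a = rng.uniform(0.8, 1.5, K)              # item discriminations (assumed)
b = rng.normal(0.0, 0.5, K)               # item difficulties (assumed)

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
eta = alpha - d                           # LSM: closer actors -> stronger latent tie

# IRT: each of the K items is a separate fallible indicator of the same tie
# (self-dyads on the diagonal are kept only to simplify the sketch)
P = sigmoid(a[None, None, :] * eta[..., None] - b[None, None, :])
X = rng.binomial(1, P)                    # observed multi-item network data
```

Pooling the K items, as the IRT-LSM does, recovers the latent relationship more reliably than any single item: dyads with higher latent tie strength endorse more items on average.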
  • Item
    A FINITE MIXTURE MULTILEVEL STRUCTURAL EQUATION MODEL FOR UNOBSERVED HETEROGENEITY IN RANDOM VARIABILITY
    (2023) Feng, Yi; Hancock, Gregory R; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Variability is often of key interest in various research and applied settings. Important research questions about intraindividual variability (e.g., consistency across repeated measurements) or intragroup variability (e.g., cohesiveness among members within a team) are piquing the interest of researchers from a variety of disciplines. To address the research needs in modeling random variability as the key construct, Feng and Hancock (2020, 2022) proposed a multilevel SEM-based modeling approach where variability can be modeled as a random variable. This modeling framework is a highly flexible analytical tool that can model variability in observed measures or latent constructs, variability as the predictor or the outcome, as well as the between-subject comparison of variability across observed groups. A major challenge remains, however, in modeling unobserved heterogeneity in random variability. No existing research addresses the methodological considerations of uncovering unobserved sub-populations that differ in intraindividual or intragroup variability, or in the processes and mechanisms involving such variability; the current dissertation study aims to fill this gap in the literature. In the current study, a finite-mixture MSEM for modeling unobserved heterogeneity in random variability (MMSEM-RV) is introduced. Bayesian estimation via MCMC is proposed for model estimation. The performance of MMSEM-RV with Bayesian estimation is systematically evaluated in a simulation study across varying conditions. An illustrative example with empirical PISA data is also provided to demonstrate the practical application of MMSEM-RV.
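The core data situation the model targets can be simulated in a few lines: groups belong to unobserved classes that differ in their typical within-group variability, and each group's own (log) variability is additionally a random effect around its class level. The class proportions and variability levels below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(9)
n_groups, n_members = 200, 15

# Two unobserved latent classes differing in within-group variability
cls = rng.binomial(1, 0.5, n_groups)                   # latent class labels
log_sd = np.where(cls == 1, np.log(2.0), np.log(0.5))  # class-specific levels
log_sd = log_sd + rng.normal(0.0, 0.1, n_groups)       # random variability effect

# Members' scores: same group mean, class-dependent spread
data = rng.normal(0.0, np.exp(log_sd)[:, None], size=(n_groups, n_members))
sample_sd = data.std(axis=1, ddof=1)                   # observed group spread
```

MMSEM-RV works in the opposite direction: given only `data`, it simultaneously recovers the class memberships and the class-specific distributions of the random variability.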
  • Item
    Characterizing the Adventitious Model Error as a Random Effect in Item-Response-Theory Models
    (2023) Xu, Shuangshuang; Liu, Yang; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    When drawing conclusions from statistical inferences, researchers are usually concerned about two types of errors: sampling error and model error. The sampling error is caused by the discrepancy between the observed sample and the population from which the sample is drawn (i.e., the operational population). The model error refers to the discrepancy between the fitted model and the data-generating mechanism. Most item response theory (IRT) models assume that models are correctly specified in the population of interest; as a result, only sampling errors are characterized, not model errors. The model error can be treated either as fixed or random. The proposed framework in this study treats the model error as a random effect (i.e., an adventitious error) and provides an alternative explanation for the model errors in IRT models that originate from unknown sources. A random, ideally small amount of discrepancy between the operational population and the fitted model is characterized using a Dirichlet-Multinomial framework. A concentration/dispersion parameter is used in the Dirichlet-Multinomial framework to measure the amount of adventitious error between the operational population probability and the fitted model. In general, the study aims to: 1) build a Dirichlet-Multinomial framework for IRT models, 2) establish asymptotic results for estimating model parameters when the operational population probability is assumed known or unknown, 3) conduct numerical studies to investigate parameter recovery and the relationship between the concentration/dispersion parameter in the proposed framework and the Root Mean Square Error of Approximation (RMSEA), 4) correct bias in parameter estimates of the Dirichlet-Multinomial framework using asymptotic approximation methods, and 5) quantify the amount of model error in the framework and decide whether the model should be retained or rejected.
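The Dirichlet-Multinomial construction can be sketched directly: the operational population's response-pattern probabilities are a Dirichlet draw centered on the fitted model's probabilities, with a concentration parameter controlling how much adventitious discrepancy is allowed. The probability vector and concentration values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
pi = np.array([0.5, 0.3, 0.2])   # response-pattern probabilities under the fitted model
conc = 100.0                      # concentration: larger => less adventitious error

# Operational-population probabilities: a random perturbation of the model
p_op = rng.dirichlet(conc * pi)
counts = rng.multinomial(1000, p_op)   # observed data given that population

# The Dirichlet mean equals pi, so the fitted model holds "on average";
# its dispersion around pi shrinks as the concentration grows.
draws_tight = rng.dirichlet(conc * pi, size=20000)
draws_loose = rng.dirichlet(10.0 * pi, size=20000)
```

This is why a single concentration/dispersion parameter can index the amount of model error: it governs how far the operational probabilities are allowed to wander from the fitted model before the model should be rejected.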
  • Item
    INVESTIGATING MODEL SELECTION AND PARAMETER RECOVERY OF THE LATENT VARIABLE AUTOREGRESSIVE LATENT TRAJECTORY (LV-ALT) MODEL FOR REPEATED MEASURES DATA: A MONTE CARLO SIMULATION STUDY
    (2023) Houser, Ari; Harring, Jeffrey R; Human Development; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Over the past several decades, a number of highly generalized models have been developed that can reduce, through parameter constraints, to a variety of classical models. One such framework, the Autoregressive Latent Trajectory (ALT) model, is a combination of two classical approaches to longitudinal modeling: the autoregressive or simplex family, in which trait scores at one occasion are regressed on scores at a previous occasion, and latent trajectory or growth curve models, in which individual trajectories are specified by a set of latent factors (typically a slope and an intercept) whose values vary across the population. The Latent Variable-Autoregressive Latent Trajectory (LV-ALT) model has recently been proposed as an extension of the ALT model in which the traits of interest are latent constructs measured by one or more indicator variables. The LV-ALT is presented as a framework by which one may compare the fit of a chosen model to alternative possibilities, or which one may use to empirically guide the selection of a model in the absence of theory, prior research, or standard practice. To date, however, there has not been any robust analysis of the efficacy or usefulness of the LV-ALT model for this purpose. This study uses Monte Carlo simulation to evaluate the efficacy of the basic formulation of the LV-ALT model (univariate latent growth process, single indicator variable) in identifying the true model, model family, and key characteristics of the model under manipulated conditions of true model parameters, sample size, measurement reliability, and missing data. The performance of the LV-ALT model for model selection is mixed. Under most manipulated conditions, the best-fitting of nine candidate models was different than the generating model, and the cost of model misspecification for parameter recovery included significant increases in bias and loss of precision in parameter estimation.
As a general rule, the LV-ALT should not be relied upon to empirically select a specific model, or to choose between several theoretically plausible models in the autoregressive or latent growth families. Larger sample sizes, greater measurement reliability, larger parameter magnitudes, and a constant autoregressive parameter are associated with a greater likelihood of correct model selection.
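The data-generating structure under study combines the two classical ingredients named above, and its basic single-indicator form is easy to simulate: each occasion mixes a person-specific growth trajectory with autoregression on the previous occasion, and the trait is observed through one fallible indicator. All parameter values here are assumptions for illustration, not the study's simulation conditions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_occ = 500, 6
rho = 0.3                                   # constant autoregressive parameter
icept = rng.normal(2.0, 1.0, n)             # random intercepts (trajectory part)
slope = rng.normal(0.5, 0.3, n)             # random slopes (trajectory part)

y = np.zeros((n, n_occ))
y[:, 0] = icept + rng.normal(0.0, 1.0, n)   # predetermined first occasion
for t in range(1, n_occ):
    # ALT structure: growth trajectory plus autoregression on the prior score
    y[:, t] = icept + slope * t + rho * y[:, t - 1] + rng.normal(0.0, 1.0, n)

# LV part: a single indicator measures y with error (reliability < 1)
x = y + rng.normal(0.0, 0.7, (n, n_occ))
```

Constraining `rho` to zero recovers a latent growth model, while dropping the trajectory factors recovers a simplex model, which is exactly why the LV-ALT can nest the nine candidate models compared in the study.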
  • Item
    ESTIMATING THE Q-DIFFUSION MODEL PARAMETERS BY APPROXIMATE BAYESIAN COMPUTATION
    (2023) Tian, Chen; Liu, Yang; Human Development; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The Q-diffusion model is a cognitive process model that considers decision making as an unobservable information accumulation process. Both item and person parameters decide the trace line of the cognitive process, which further decides the observed response and response time. Because the likelihood function for the Q-diffusion model is intractable, standard parameter estimation techniques such as maximum likelihood estimation are difficult to apply. This project applies Approximate Bayesian Computation (ABC) to estimate parameters of the Q-diffusion model. Different from standard Markov chain Monte Carlo samplers that require pointwise evaluation of the likelihood function, ABC builds upon a program for data generation and a metric on the data space to gauge the similarity between imputed and observed data. This project aims to compare the performance of two criteria for gauging this similarity or distance. The limited-information criterion measures the distance in suitable summary statistics (i.e., variances, covariances, and means) between imputed and observed data. The enhanced limited-information criterion additionally considers the dependencies among persons’ responses and response times. Bias, root mean squared error, and coverage of credible intervals were reported. Results show that, when using the posterior median as the point estimate, jointly considering a person’s responses and response times led the enhanced criterion to yield less biased estimation of the population scale of person power and slightly better item parameter estimates. This SMC-ABC algorithm informs researchers about key data features that should be captured when determining the stopping rule for the algorithm.
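The ABC logic is independent of the specific simulator, so a toy rejection-ABC sketch can show it: draw parameters from the prior, simulate data, and keep draws whose limited-information summary statistics land close to the observed ones. A simple lognormal response-time generator stands in for the Q-diffusion simulator here, and the prior, threshold, and sample sizes are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate(drift, n=200):
    """Stand-in data generator: the dissertation's generator would produce
    responses and response times from the Q-diffusion model."""
    return rng.lognormal(mean=drift, sigma=0.5, size=n)

def stats(x):
    """Limited-information summaries (here: mean and variance of log RTs)."""
    lx = np.log(x)
    return np.array([lx.mean(), lx.var()])

observed = simulate(0.8)          # "observed" data; true drift = 0.8
obs_stats = stats(observed)

# Rejection ABC: accept prior draws whose summaries are close to the observed
accepted = []
for theta in rng.uniform(0.0, 2.0, size=5000):   # flat prior on the drift
    if np.linalg.norm(stats(simulate(theta)) - obs_stats) < 0.08:
        accepted.append(theta)
posterior_mean = float(np.mean(accepted))
```

The accepted draws approximate the posterior without ever evaluating a likelihood; the SMC-ABC algorithm in the dissertation replaces this one-shot rejection step with a sequence of shrinking thresholds.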
  • Item
    ACCOUNTING FOR STUDENT MOBILITY IN SCHOOL RANKINGS: A COMPARISON OF ESTIMATES FROM VALUE-ADDED AND MULTIPLE MEMBERSHIP MODELS
    (2023) Cassiday, Kristina; Stapleton, Laura M; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Student mobility exists, but it is not always taken into account in the value-added modeling approaches used to determine school accountability rankings. Multiple membership modeling can account for student mobility in a multilevel framework, but it is more computationally demanding and requires specialized knowledge and software packages that may not be available in state and district departments of education. The purpose of this dissertation was to compare how different multilevel value-added modeling approaches perform at various levels of mobility, in order to provide recommendations to state and district administrators about the type of models best suited to their data. To accomplish this task, a simulation study was conducted, manipulating the percentage of mobility in the dataset and the similarity of the sender and receiver schools of mobile students. Traditional gain score and covariate adjustment models were run, along with comparable multiple membership models, to determine the extent to which school effect estimates and school accountability rankings were affected and to investigate the conditions under which a multiple membership model would produce a meaningful increase in accuracy to justify its computational demand. Additional comparisons were made on measures of relative bias of the fixed effect coefficients, the random effect variance components, and the relative bias of the standard errors of the fixed effects and random effect variance components. The multiple membership models with schools proportionally weighted by time spent were the better-fitting models across all conditions. All multiple membership models recovered the intercept and school-level residual variance better than the other models.
However, when considering school accountability rankings, the proportion of school quintile shifts was close to equal across the traditional and multiple membership models that were structurally similar to each other. These findings suggest that a multiple membership model is preferable when the most accurate parameter and standard error estimates are the goal; if school accountability rankings are of primary interest, however, a traditional VAM performs as well as a multiple membership model. An empirical data analysis was conducted to demonstrate how to prepare data, properly run these various models, and interpret the results, along with a discussion of issues to consider when selecting a model. Recommendations are provided on how to select a model, informed by the findings from the simulation portion of the study.
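The proportional-by-time weighting that distinguished the better-fitting models is simple to express: each mobile student's school effect is a weighted combination of the effects of every school attended, with weights proportional to time spent. The school effects and enrollment months below are hypothetical values for illustration.

```python
import numpy as np

u = np.array([0.5, -0.2, 0.1])    # hypothetical school random effects
# Months (of a 10-month year) each of six students spent in three schools
months = np.array([
    [10,  0, 0],   # non-mobile student: full weight on school 1
    [ 0, 10, 0],
    [ 7,  3, 0],   # moved after 7 months
    [ 2,  0, 8],
    [ 5,  5, 0],
    [ 0,  4, 6],
])
W = months / months.sum(axis=1, keepdims=True)  # weights proportional to time
school_effect = W @ u   # each student's multiple-membership school contribution
```

A traditional VAM instead assigns each student wholly to one school (a single 1 in each row of `W`), which is where the two approaches diverge for mobile students.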
  • Item
    Joint Modeling Of Responses, Response Time, and Answer Changes in Testlet-based Assessment for Cognitive Diagnosis
    (2022) Yin, Chengbin; Jiao, Hong; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    To address the scenario of testlet-based assessment, this research proposes a joint model of responses, response time, and answer change patterns for testlet-based cognitive diagnostic assessments. A simulation study was conducted to assess the impact of accounting for dual item and item time dependency and of incorporating answer change patterns as an additional data source on model fit, classification accuracy at the attribute and attribute profile level, and parameter estimation. Through manipulating three factors, the simulation study examined the extent to which the manipulated factors impacted the performance of the proposed model and two comparison models in recovering model parameters. Application of the proposed model was demonstrated with an empirical dataset.
  • Item
    Multilevel Regression Discontinuity Models with Latent Variables
    (2020) Morell, Monica; Yang, Ji Seung; Liu, Yang; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Regression discontinuity (RD) designs allow estimation of a local average treatment effect (LATE) when assignment of an individual to treatment is determined by their location on a running variable in relation to a cutoff value. The design is especially useful in education settings, where ethical concerns can forestall the use of randomization. Applications of RD in education research typically share two characteristics that can make the use of the conventional RD model inappropriate: 1) the use of latent constructs, and 2) the hierarchical structure of the data. The running variables often used in education research represent latent constructs (e.g., math ability), which are measured by observed indicators such as categorical item responses. While the use of a latent variable model to account for the relationships among item responses and the latent construct is the preferred approach, conventional RD analyses continue to use observed scores, which can result in invalid or less informative conclusions. The current study proposes a multilevel latent RD model which accounts for the prevalence of clustered data and latent constructs in education research, allows for the generalizability of the LATE to individuals further from the cutoff, and allows researchers to quantify the heterogeneity in the treatment effect due to measurement error in the observed running variable. Models are derived for two of the most commonly used multilevel RD designs. Due to the complex and high-dimensional nature of the proposed models, they are estimated in one stage using full-information likelihood via the Metropolis-Hastings Robbins-Monro algorithm. The results of two simulation studies, under varying sample size and test length conditions, indicate the models perform well when using the full sample with at least moderate-length assessments.
A proposed model is used to examine the effects of receiving an English language learner designation on science achievement using the Early Childhood Longitudinal Study. Implications of the results of these studies and future directions for the research are discussed.
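For contrast with the proposed latent multilevel extension, the conventional observed-score RD analysis that the dissertation improves upon can be sketched as a local linear regression around the cutoff: fit an intercept, separate slopes on each side, and a treatment jump, where the jump estimates the LATE. The data-generating values and bandwidth below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
x = rng.normal(size=n)               # observed running variable (e.g., a test score)
cutoff, tau = 0.0, 0.4               # tau: true jump (LATE) at the cutoff
T = (x >= cutoff).astype(float)      # sharp (deterministic) treatment assignment
y = 0.6 * x + tau * T + rng.normal(0.0, 0.5, n)

# Conventional local linear RD within a bandwidth h of the cutoff
h = 0.5
m = np.abs(x - cutoff) < h
xc = x[m] - cutoff
design = np.column_stack([np.ones(m.sum()), xc, T[m], xc * T[m]])
beta, *_ = np.linalg.lstsq(design, y[m], rcond=None)
late = beta[2]                       # estimated jump at the cutoff
```

When `x` is only a fallible measure of a latent running variable, assignment near the cutoff is effectively misclassified for some individuals, which is the heterogeneity the proposed multilevel latent RD model quantifies instead of ignoring.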