A Mean-Parameterized Conway–Maxwell–Poisson Multilevel Item Response Theory Model for Multivariate Count Response Data
Abstract
Multivariate count data arise frequently when measuring a latent construct in human development, psychology, medicine, education, and the social sciences. Examples include the number of different types of mistakes a student makes when reading a passage of text, or the number of nausea, vomiting, diarrhea, and/or dysphagia episodes a patient experiences in a given day. These response data are often sampled from multiple sources and/or in multiple stages, yielding a multilevel data structure with lower-level sampling units (e.g., individuals, such as students or patients) nested within higher-level sampling units or clusters (e.g., schools, clinical trial sites, studies). Motivated by real data, a new Item Response Theory (IRT) model is developed for the integrative analysis of multivariate count data. The proposed mean-parameterized Conway–Maxwell–Poisson Multilevel IRT (CMPmu-MLIRT) model differs from currently available models in its ability to yield sound inferences when applied to multilevel, multivariate count data in which exposure (the length of time, space, or number of trials over which events are recorded) may vary across individuals and items may provide different amounts of information about an individual’s level of the latent construct being measured (e.g., level of expressive language development, math ability, disease severity).

Estimation feasibility is demonstrated through a Monte Carlo simulation study evaluating parameter recovery across a range of salient conditions. Mean parameter estimates are shown to align well with true parameter values when a sufficient number of items (e.g., 10) is used, whereas recovery of dispersion parameters may be challenging when as few as 5 items are used.

A second Monte Carlo simulation study demonstrates the need for the proposed CMPmu-MLIRT model over currently available alternatives by evaluating the impact of model misspecification on parameter estimates and their standard errors. Treating an exposure that varies across individuals as though it were fixed is shown to notably overestimate item intercepts and slopes and, when substantial variability in the latent construct exists among clusters, to underestimate that between-cluster variance. Misspecifying the number of levels (i.e., fitting a single-level model to multilevel data) is shown to overestimate item slopes, especially when substantial variability in the latent construct exists among clusters, and to compound the overestimation of item slopes when a varying exposure is also misspecified as fixed. Misspecifying the conditional item response distributions as Poisson for underdispersed items and as negative binomial for overdispersed items is shown to bias estimates of the between-cluster variability in the latent construct.

Lastly, the applicability of the proposed CMPmu-MLIRT model to empirical data is demonstrated through an integrative data analysis of oral language samples.
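The abstract does not state the model equations. The following is a minimal sketch, assuming the standard Conway–Maxwell–Poisson probability mass function with a mean parameterization (the rate is the value that makes the expected count equal the mean) and a log-link multilevel IRT structure of the kind the abstract implies (an exposure offset, item intercepts and slopes, and a cluster-level latent-construct variance); the symbols $t_{hi}$, $\alpha_j$, $\beta_j$, $\theta_{hi}$, $\delta_h$, $\tau^2$, and $\sigma^2$ are illustrative and not taken from the dissertation.

% Sketch only: standard mean-parameterized CMP pmf for person i in cluster h on item j
\[
  P\!\left(Y_{hij} = y \mid \mu_{hij}, \nu_j\right)
    = \frac{\lambda_{hij}^{\,y}}{(y!)^{\nu_j}\, Z\!\left(\lambda_{hij}, \nu_j\right)},
  \qquad
  Z(\lambda, \nu) = \sum_{k=0}^{\infty} \frac{\lambda^{k}}{(k!)^{\nu}},
\]
where $\lambda_{hij}$ is the rate solving $\sum_{k=0}^{\infty} (k - \mu_{hij})\,\lambda_{hij}^{k}/(k!)^{\nu_j} = 0$, so that $E(Y_{hij}) = \mu_{hij}$, and $\nu_j > 1$, $\nu_j = 1$, and $\nu_j < 1$ correspond to under-, equi-, and overdispersion for item $j$. A plausible latent regression with exposure offset and a two-level latent construct is
\[
  \log \mu_{hij} = \log t_{hi} + \alpha_j + \beta_j\,\theta_{hi},
  \qquad
  \theta_{hi} = \delta_h + \varepsilon_{hi},
  \quad
  \delta_h \sim N(0, \tau^2),\ \varepsilon_{hi} \sim N(0, \sigma^2),
\]
with exposure $t_{hi}$, item intercepts $\alpha_j$, item slopes $\beta_j$, and between-cluster variance $\tau^2$ in the latent construct $\theta_{hi}$.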