AN INVESTIGATION OF GROWTH MIXTURE MODELS WHEN DATA ARE COLLECTED WITH UNEQUAL SELECTION PROBABILITIES: A MONTE CARLO STUDY
Abstract
As researchers begin to use Growth Mixture Models (GMM) with data from nationally representative samples, it becomes increasingly critical to understand the difficulties associated with modeling data that come from complex sample designs. If researchers naively apply GMM to nationally representative data sets without adjusting for the way in which the sample was selected, the resulting parameter estimates, standard errors, and tests of significance may not be trustworthy.
Therefore, the objective of the current study was to quantify the accuracy of parameter estimates and class assignment when subjects are sampled with unequal probabilities of selection. To this end, a series of Monte Carlo simulations empirically investigated the ability of GMM to recover known growth parameters of distinct populations when various adjustments are applied to the statistical model. Specifically, the current research compared the performance of GMM that 1) ignores the sample design; 2) accounts for the sample design via weighting; 3) accounts for the sample design via explicitly modeling the stratification variable; and 4) accounts for the sample design by using weights and modeling the stratification variable.
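The logic of comparing an unadjusted and a weighted analysis under unequal selection probabilities can be illustrated with a minimal Monte Carlo sketch. It is not the study's simulation and does not fit a GMM: it estimates a simple population mean, and the two strata, their means, the population shares, and the function names are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical two-stratum population (illustrative values, not from the study).
POP_SHARE = {"A": 0.9, "B": 0.1}      # share of the population in each stratum
TRUE_MEAN = {"A": 10.0, "B": 20.0}    # stratum-specific outcome means
SAMPLE_SHARE = {"A": 0.5, "B": 0.5}   # disproportionate design: B is oversampled
POP_MEAN = sum(POP_SHARE[s] * TRUE_MEAN[s] for s in POP_SHARE)  # about 11.0


def draw_sample(rng, n=400):
    """Draw one stratified sample and return (outcome, weight) pairs."""
    sample = []
    for stratum in POP_SHARE:
        n_s = int(n * SAMPLE_SHARE[stratum])
        # Weight is proportional to the inverse of the selection probability.
        weight = POP_SHARE[stratum] / SAMPLE_SHARE[stratum]
        for _ in range(n_s):
            sample.append((rng.gauss(TRUE_MEAN[stratum], 2.0), weight))
    return sample


def estimates(sample):
    """Return the unweighted and the weighted estimate of the mean."""
    unweighted = sum(y for y, _ in sample) / len(sample)
    weighted = sum(y * w for y, w in sample) / sum(w for _, w in sample)
    return unweighted, weighted


def monte_carlo_bias(n_reps=500, seed=1):
    """Average each estimator over replications and subtract the true mean."""
    rng = random.Random(seed)
    total_u = total_w = 0.0
    for _ in range(n_reps):
        u, w = estimates(draw_sample(rng))
        total_u += u
        total_w += w
    return total_u / n_reps - POP_MEAN, total_w / n_reps - POP_MEAN


if __name__ == "__main__":
    bias_u, bias_w = monte_carlo_bias()
    print(f"bias ignoring design: {bias_u:.2f}, bias with weights: {bias_w:.2f}")
```

In this toy setting the unweighted estimator is pulled toward the oversampled stratum, while the weighted estimator is approximately unbiased. Whether weighting helps comparably in a growth mixture model is the empirical question the study addresses.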
Results suggested that a model-based approach does not improve the accuracy of parameter estimates when individuals are sampled with disproportionate sampling probabilities. Not only did this method often fail to converge, but when it did converge, the parameter estimates exhibited an unacceptable amount of bias. The weighted model performed best of all the models tested, but it still produced parameter estimates with unacceptably high percentages of bias. One possible explanation is that the distributions of the manifest variables overlap too much, so that the aggregate distribution may be unimodal; this would make it difficult to distinguish among the latent classes and would in turn affect the accuracy of parameter estimates. In sum, the current research indicates that GMM should not be used when data are sampled with disproportionate probabilities. Researchers should therefore attend to the study design and data collection strategies when considering the use of a Growth Mixture Model in the analysis phase.
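The point about overlapping distributions can be made concrete with a small sketch that counts the modes of a two-component normal mixture on a grid. The component means, standard deviations, and mixing proportions below are hypothetical and are not taken from the study; the example is one-dimensional, whereas a GMM involves several manifest variables.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, means, sds, props):
    """Density of a finite normal mixture at x."""
    return sum(p * normal_pdf(x, m, s) for m, s, p in zip(means, sds, props))


def count_modes(means, sds, props, lo=-10.0, hi=15.0, step=0.01):
    """Count strict local maxima of the mixture density on an evenly spaced grid."""
    n = int(round((hi - lo) / step))
    dens = [mixture_pdf(lo + i * step, means, sds, props) for i in range(n + 1)]
    return sum(
        1 for i in range(1, n)
        if dens[i] > dens[i - 1] and dens[i] > dens[i + 1]
    )


if __name__ == "__main__":
    # Two equally sized classes whose means are 1.5 standard deviations apart.
    print(count_modes((0.0, 1.5), (1.0, 1.0), (0.5, 0.5)))
    # The same classes with means 4 standard deviations apart.
    print(count_modes((0.0, 4.0), (1.0, 1.0), (0.5, 0.5)))
```

With the closely spaced classes the aggregate density has a single peak even though two classes generated the data; with well-separated classes two peaks appear. This shows only that overlap can hide class structure in the aggregate distribution, not that it accounts for the bias observed in the study.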