Comparing the Effectiveness of Standard vs. Multilevel Machine Learning Algorithms on Hierarchical Data
Abstract
This dissertation explored the performance of standard and multilevel machine learning classification algorithms on hierarchical datasets, which are prevalent in educational research due to multistage sampling designs (e.g., students nested within schools). Hierarchical data pose unique analytical challenges (e.g., nonindependence of observations) and often require specialized approaches. While multilevel modeling is well established in inferential contexts, its potential in predictive scenarios has remained largely underexplored.
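For concreteness, the following is a minimal sketch, with hypothetical parameter values rather than those used in the dissertation's simulation, of how two-level data with residual level-2 variance can be generated for a Monte Carlo study of this kind: a random intercept per cluster induces the within-cluster dependence described above.

```python
import numpy as np

rng = np.random.default_rng(42)

n_clusters = 50        # level-2 units, e.g., schools (hypothetical value)
n_per_cluster = 30     # level-1 units per cluster, e.g., students
tau2 = 0.5             # residual level-2 (between-cluster) variance
beta0, beta1 = -0.5, 0.8   # fixed intercept and slope (hypothetical)

cluster_id = np.repeat(np.arange(n_clusters), n_per_cluster)
u = rng.normal(0.0, np.sqrt(tau2), size=n_clusters)   # random intercepts
x = rng.normal(size=n_clusters * n_per_cluster)       # level-1 predictor

# Cluster-specific intercepts shift the linear predictor; a logit link
# yields a binary outcome, so observations within a cluster are correlated.
eta = beta0 + beta1 * x + u[cluster_id]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

# Intraclass correlation on the latent logit scale (level-1 variance pi^2 / 3).
icc = tau2 / (tau2 + np.pi**2 / 3)
print(f"latent-scale ICC: {icc:.3f}")
```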
Through a Monte Carlo simulation and empirical analyses using data from the Maryland Longitudinal Data System (MLDS), this study evaluated the suitability of standard and multilevel algorithms for prediction with hierarchical data. Results revealed that multilevel models offered slight advantages in settings with high residual level-2 variance, effectively capturing cluster-level dependencies and producing stable predictions. Standard models performed well when residual level-2 variance was low, and standard models that incorporated cluster IDs as fixed effects performed comparably to multilevel models under many conditions, including high residual level-2 variance scenarios. However, the fixed-effects approach was feasible only when training and testing clusters overlapped, highlighting its limitations for generalizing predictions to unseen clusters.

In an empirical analysis addressing class imbalance, logistic regression and generalized linear mixed models (GLMMs) exhibited the highest sensitivity for identifying STEM completers when training and testing clusters overlapped, while neural networks and XGBoost better identified the minority class when training and testing clusters were distinct. These findings highlighted the complexity of predictive modeling for hierarchical data and provided insights for prediction tasks in educational research.
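As an illustration of the fixed-effects alternative discussed above, the sketch below (using scikit-learn with simulated data; not the dissertation's actual pipeline or the MLDS data) one-hot encodes the cluster ID and appends it to the feature matrix of an otherwise standard classifier. The handle_unknown setting makes the limitation concrete: dummies for clusters absent from training are zeroed out, so predictions for unseen clusters fall back to the pooled model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Simulated two-level data (a stand-in for a real hierarchical dataset).
rng = np.random.default_rng(0)
n_clusters, n_per = 20, 50
cluster = np.repeat(np.arange(n_clusters), n_per)
u = rng.normal(0.0, 1.0, size=n_clusters)[cluster]   # cluster effects
x = rng.normal(size=cluster.size)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.8 * x + u))))
df = pd.DataFrame({"x": x, "cluster": cluster.astype(str), "y": y})

# Cluster IDs enter the standard model as one-hot (dummy) fixed effects.
pre = ColumnTransformer(
    [("id", OneHotEncoder(handle_unknown="ignore"), ["cluster"])],
    remainder="passthrough",
)
clf = make_pipeline(pre, LogisticRegression(max_iter=1000))
clf.fit(df[["x", "cluster"]], df["y"])

# An unseen cluster label gets an all-zero dummy vector, so the model
# reverts to the pooled intercept: the generalization limit noted above.
new = pd.DataFrame({"x": [0.2], "cluster": ["unseen_school"]})
print(clf.predict_proba(new))
```

The same encoding strategy applies to any standard learner (e.g., XGBoost or a neural network), since the cluster dummies are simply additional input features; what changes across models is how flexibly those features interact with the level-1 predictors.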