Comparing the Effectiveness of Standard vs. Multilevel Machine Learning Algorithms on Hierarchical Data
Abstract
This dissertation explored the performance of standard and multilevel machine learning classification algorithms on hierarchical datasets, which are prevalent in educational research due to multistage sampling designs (e.g., students nested within schools). Hierarchical data pose unique analytical challenges (e.g., nonindependence of observations) and often require specialized approaches. While multilevel modeling is well established in inferential contexts, its potential in predictive scenarios has remained largely underexplored.
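For concreteness, the following is a minimal sketch, with hypothetical parameter values rather than those used in the dissertation's simulation, of how two-level data with residual level-2 variance can be generated for a Monte Carlo study of this kind: a random intercept per cluster induces the within-cluster dependence described above.

```python
import numpy as np

rng = np.random.default_rng(42)

n_clusters = 50        # level-2 units, e.g., schools (hypothetical value)
n_per_cluster = 30     # level-1 units per cluster, e.g., students
tau2 = 0.5             # residual level-2 (between-cluster) variance
beta0, beta1 = -0.5, 0.8   # fixed intercept and slope (hypothetical)

cluster_id = np.repeat(np.arange(n_clusters), n_per_cluster)
u = rng.normal(0.0, np.sqrt(tau2), size=n_clusters)   # random intercepts
x = rng.normal(size=n_clusters * n_per_cluster)       # level-1 predictor

# Cluster-specific intercepts shift the linear predictor; a logit link
# yields a binary outcome, so observations within a cluster are correlated.
eta = beta0 + beta1 * x + u[cluster_id]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

# Intraclass correlation on the latent logit scale (level-1 variance pi^2 / 3).
icc = tau2 / (tau2 + np.pi**2 / 3)
print(f"latent-scale ICC: {icc:.3f}")
```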
Through a Monte Carlo simulation and empirical analyses using data from the Maryland Longitudinal Data System (MLDS), this study evaluated the suitability of standard and multilevel algorithms for prediction with hierarchical data. Results revealed that multilevel models offered slight advantages in settings with high residual level-2 variance, effectively capturing cluster-level dependencies and producing stable predictions. Standard models performed well when residual level-2 variance was low, and standard models that incorporated cluster IDs as fixed effects performed comparably to multilevel models under many conditions, including high residual level-2 variance scenarios. However, the fixed-effects approach was feasible only when training and testing clusters overlapped, highlighting its limitations for generalizing predictions to unseen clusters.

In an empirical analysis addressing class imbalance, logistic regression and generalized linear mixed models (GLMMs) exhibited the highest sensitivity for identifying STEM completers when training and testing clusters overlapped, while neural networks and XGBoost better identified the minority class when training and testing clusters were distinct. These findings highlighted the complexity of predictive modeling for hierarchical data and provided insights for prediction tasks in educational research.
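As an illustration of the fixed-effects alternative discussed above, the sketch below (using scikit-learn with simulated data; not the dissertation's actual pipeline or the MLDS data) one-hot encodes the cluster ID and appends it to the feature matrix of an otherwise standard classifier. The handle_unknown setting makes the limitation concrete: dummies for clusters absent from training are zeroed out, so predictions for unseen clusters fall back to the pooled model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Simulated two-level data (a stand-in for a real hierarchical dataset).
rng = np.random.default_rng(0)
n_clusters, n_per = 20, 50
cluster = np.repeat(np.arange(n_clusters), n_per)
u = rng.normal(0.0, 1.0, size=n_clusters)[cluster]   # cluster effects
x = rng.normal(size=cluster.size)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.8 * x + u))))
df = pd.DataFrame({"x": x, "cluster": cluster.astype(str), "y": y})

# Cluster IDs enter the standard model as one-hot (dummy) fixed effects.
pre = ColumnTransformer(
    [("id", OneHotEncoder(handle_unknown="ignore"), ["cluster"])],
    remainder="passthrough",
)
clf = make_pipeline(pre, LogisticRegression(max_iter=1000))
clf.fit(df[["x", "cluster"]], df["y"])

# An unseen cluster label gets an all-zero dummy vector, so the model
# reverts to the pooled intercept: the generalization limit noted above.
new = pd.DataFrame({"x": [0.2], "cluster": ["unseen_school"]})
print(clf.predict_proba(new))
```

The same encoding strategy applies to any standard learner (e.g., XGBoost or a neural network), since the cluster dummies are simply additional input features; what changes across models is how flexibly those features interact with the level-1 predictors.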