MODELING CLUSTERED DATA WITH FEW CLUSTERS: A CROSS-DISCIPLINE COMPARISON OF SMALL SAMPLE METHODS

Date

2015

Abstract

Small sample inference with clustered data has received increased attention recently in the methodological literature, with several simulation studies examining the small sample behavior of various methods. Several different classes of methods can be used to account for clustering, and disciplinary allegiances are quite rigid: for instance, recent reviews have found that 94% of psychology studies use multilevel models, whereas only 3% of economics studies do. In economics, fixed effects models are far more popular, and in biostatistics there is a tendency to employ generalized estimating equations. As a result of these strong disciplinary preferences, methodological studies tend to focus on only a single class of methods (e.g., multilevel models in psychology) while largely ignoring other possible methods. Therefore, the performance of small sample methods has been investigated within classes of methods, but studies have not extended these investigations across disciplinary boundaries to compare more broadly the small sample methods that exist in the various classes for accommodating clustered data.

Motivated by an applied educational psychology study with few clusters, this dissertation introduces the various methods for accommodating clustered data and their small sample extensions. A wide-ranging simulation study is then conducted to compare 12 methods for modeling clustered data with a small number of clusters. Many small sample studies generate data from fairly unrealistic models that feature only a single predictor at each level, so this study generates data from a more complex model with 8 predictors that is more reminiscent of data researchers might have in an applied study. In addition, few studies have investigated extremely small numbers of clusters (fewer than 10), which are quite common in research areas where clusters contain many observations and are therefore expensive to recruit (e.g., schools, hospitals); the simulation study lowers the number of clusters well into the single digits. Results show that some methods, such as fixed effects models and Bayes estimation, clearly perform better than others and that researchers may benefit from considering methods outside those typically employed in their specific discipline.
