Performance of Propensity Score Methods in the Presence of Heterogeneous Treatment Effects

Estimating a single average treatment effect implicitly assumes that individuals and groups respond homogeneously to a treatment or intervention. However, treatment effects are often heterogeneous. The study of heterogeneous treatment effects is motivated by the need to select the most effective treatment, to generalize causal effect estimates to a population, and to identify subgroups for which a treatment is effective or harmful. In observational studies, treatment effects are often estimated using propensity score methods. This dissertation adds to the literature on the analysis of heterogeneous treatment effects using propensity score methods. Three propensity score methods were compared using Monte Carlo simulation: a single propensity score with exact matching on subgroup, matching using group propensity scores, and multinomial propensity scores (MNPS) estimated with generalized boosted modeling (GBM). The methods were evaluated under various group distributions, sample sizes, effect sizes, and selection models. An empirical analysis using data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K) is included to demonstrate the methods studied. Simulation results showed that estimating group propensity scores yielded the smallest mean squared error (MSE), that MNPS performance was comparable to GBM, and that including the group indicator in the propensity score model improved treatment effect estimates regardless of whether group membership influenced selection. In addition, subclassification performed poorly when one group was more prevalent in the extremes of the propensity score distribution.
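
To make the first of the three methods concrete, the following is a minimal sketch of matching on a single propensity score while matching exactly on subgroup. It assumes the propensity scores have already been estimated and uses nearest-neighbor matching with replacement and a caliper; the abstract does not specify the matching algorithm, so these choices, along with the function name `subgroup_att`, the field names, and the toy data, are hypothetical and for illustration only.

```python
from collections import defaultdict


def subgroup_att(units, caliper=0.1):
    """Estimate a treatment effect on the treated within each subgroup.

    units: list of dicts with keys "group", "treated" (0 or 1),
           "ps" (a pre-estimated propensity score), and "y" (outcome).
    Each treated unit is paired with the control from the SAME subgroup
    whose propensity score is closest (matching with replacement).
    Pairs farther apart than `caliper` are discarded.
    """
    arms = defaultdict(lambda: {"treated": [], "control": []})
    for u in units:
        arms[u["group"]]["treated" if u["treated"] else "control"].append(u)

    effects = {}
    for group, arm in arms.items():
        diffs = []
        for t in arm["treated"]:
            if not arm["control"]:
                break  # no controls available in this subgroup
            match = min(arm["control"], key=lambda c: abs(c["ps"] - t["ps"]))
            if abs(match["ps"] - t["ps"]) <= caliper:
                diffs.append(t["y"] - match["y"])
        # None signals that no acceptable match was found in the subgroup
        effects[group] = sum(diffs) / len(diffs) if diffs else None
    return effects


# Toy data (invented for illustration, not from the dissertation)
toy_units = [
    {"group": "A", "treated": 1, "ps": 0.60, "y": 10},
    {"group": "A", "treated": 1, "ps": 0.30, "y": 8},
    {"group": "A", "treated": 0, "ps": 0.58, "y": 7},
    {"group": "A", "treated": 0, "ps": 0.32, "y": 6},
    {"group": "B", "treated": 1, "ps": 0.50, "y": 5},
    {"group": "B", "treated": 0, "ps": 0.52, "y": 5},
    {"group": "B", "treated": 0, "ps": 0.90, "y": 1},
]

effects = subgroup_att(toy_units)
print(effects)  # {'A': 2.5, 'B': 0.0}
```

In this toy example the estimated effect differs by subgroup (2.5 in group A, 0.0 in group B), which is the kind of heterogeneity a single pooled average would hide.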