Estimating Common Odds Ratio with Missing Data
Date
2005-07-26
Authors
Chen, Te-Ching
Advisor
Smith, Paul J.
Abstract
We derive estimates of expected cell counts for $I\times J\times K$
contingency tables where the stratum variable $C$ is always observed
but the column variable $B$ and row variable $A$ might be missing. In
particular, we investigate cases where only row variable $A$ might be
missing, either randomly or informatively. For $2\times 2\times K$
tables, we use Taylor expansions to study the biases and variances of
the Mantel-Haenszel estimator of the common odds ratio and of modified
Mantel-Haenszel estimators that add one pair of pseudotables. We
consider data without missing values and data with missing values; in
the latter case the estimators are based either on the completely
observed subsample or on estimated cell means when both the stratum and
column variables are always observed. We examine both large table and
sparse table asymptotics.
Analytic studies and simulation results show that the Mantel-Haenszel
estimators overestimate the common odds ratio, but adding one pair of
pseudotables reduces bias and variance. Mantel-Haenszel estimators
with jackknifing also reduce the biases and variances. Estimates
using only the complete subsample seem to have larger bias than those
based on full data, but when the total number of observations gets
large, the bias is reduced. Estimators based on estimated cell means
seem to have larger biases and variances than those based only on
the complete subsample when data are randomly missing. With informative
missingness, estimators based on the estimated cell means do not
converge to the correct common odds ratio under sparse asymptotics, and
converge slowly for the large table asymptotics.
The Mantel-Haenszel estimators based on incorrectly estimated cell
means when the variable $A$ is informatively missing behave
similarly to those based only on the complete subsamples. Variance
estimates from the asymptotic variance formula for the ratio estimators
had smaller biases and variances than those based on jackknifing or
bootstrapping.
Bootstrapping may produce zero divisors and unstable estimates, but
adding one pair of pseudotables eliminates these problems and reduces the variability.
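
For readers unfamiliar with the estimator studied here, the following is a minimal sketch of the standard Mantel-Haenszel common odds ratio for a set of $2\times 2$ strata. It is not code from the thesis: the function name `mantel_haenszel_or`, the `(a, b, c, d)` cell layout, and the example counts are assumptions made for illustration, and the pseudotable modification is not implemented because its exact form is not given in the abstract. The sketch returns `None` when the denominator is zero, which is the zero-divisor situation the abstract mentions for bootstrap resamples.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical layout for one 2x2 stratum: (a, b, c, d), where
# a, b are the first-row counts and c, d are the second-row counts.
Table = Tuple[float, float, float, float]


def mantel_haenszel_or(tables: Sequence[Table]) -> Optional[float]:
    """Standard Mantel-Haenszel estimate of the common odds ratio.

    Computes sum_k(a_k * d_k / n_k) / sum_k(b_k * c_k / n_k) over strata k.
    Returns None when the denominator is zero (a zero divisor).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            # An empty stratum carries no information; skip it.
            continue
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        return None
    return numerator / denominator


# Illustrative counts only (not data from the thesis).
example_tables = [(10, 5, 4, 12), (8, 6, 3, 9), (2, 0, 0, 1)]
estimate = mantel_haenszel_or(example_tables)
```

A complete-case version of this estimate, in the sense used above, would apply the same function to tables built only from observations with no missing values.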