Diagnostics for Nonlinear Mixed-Effects Models
Nagem, Mohamed Ould
The estimation methods for Nonlinear Mixed-Effects Models (NLMM) still largely rely on numerical approximation of the likelihood function, and the properties of these methods are yet to be characterized. These methods are available in most statistical software packages, such as S-plus and SAS; however, approaches for assessing the reliability of these estimation methods are still open to debate. Moreover, the lack of a common measure for identifying the best-fitting model is still an open area of research. Common software packages such as SAS and S-plus do not provide a specific method for computing such a measure other than the traditional Akaike's Information Criterion (AIC; Akaike), the Bayesian Information Criterion (BIC; Schwarz), or the likelihood ratio. These criteria are comparative in nature and are very hard to interpret in this context because of the complex structure and dependent nature of the populations they were intended to analyze. This dissertation focuses on approximate methods of estimating the parameters of NLMM. In chapter 1, the general form of a NLMM is introduced, and real data examples are presented to illustrate the usefulness of NLMM where a standard regression model is not appropriate. A general review of the methods for approximating the log-likelihood function is given. In chapter 2, we compared three approximation techniques that are widely used in the estimation of NLMM, through extensive simulation studies motivated by two widely used data sets. We compared the empirical estimates obtained from the three approximations of the log-likelihood function with respect to bias, precision, convergence rate, and 95% confidence interval coverage probability. The three methods were the First Order approximation (FO) of Beal and Sheiner, the Laplace approximation (LP) of Wolfinger, and the Gaussian Quadrature (GQ) of Davidian and Gallant.
We also compared these approaches under different sample size configurations and analyzed their effects on both the fixed effects estimates and the precision measures. Which approximation yields the best estimates, and with what degree of precision, appears to depend on several factors. We explored some of these factors, such as the magnitude of variability among the random effects, the covariance structure of the random parameters, and the way in which the random parameters enter the model, as well as the "linearity" or "close to linearity" of the model as a function of these random parameters. We concluded that, while no method outperformed the others on a consistent basis, both the GQ and LP methods provided the most accurate estimates. The FO method has the advantage that it is exact when the model is linear in the random effects. It also has the advantage of being computationally simple, and it provides reasonable convergence rates. In chapter 3 we investigated the robustness and sensitivity of the three approximation techniques to the structure of the random effect parameters, the dimension of these parameters, and the correlation structure of the covariance matrix. We expanded the work of Hartford and Davidian to assess the robustness of these approximation methods under different scenarios (models) of random effect covariance structures: (1) under the assumption of single random effect models; (2) under the assumption of correlated random effect models; (3) under the assumption of non-correlated random effect models. We showed that the LP and GQ methods are very similar and provided the most accurate estimates. Even though the LP is fairly robust to mild deviations, the LP estimates can be extremely biased due to the difficulty of achieving convergence. The LP method is sensitive to misspecification of the inter-individual model. In chapter 4 we applied the Goodness of Fit measure (GOF) of Hosmer et al.
and Sturdivant and Hosmer to a class of NLMM and evaluated the asymptotic sum of squared residuals statistic as a measure of goodness of fit by conditioning the response on the random effect parameter and using Taylor series approximations in the estimation technique. Simulations of different mixed logistic regression models were evaluated, as well as the effect of the sample size on the statistic. We showed that the proposed sum of squared residuals statistic performs well for a class of mixed logistic regression models with continuous covariates and a modest sample size. However, the same statistic failed to provide adequate power to detect the correct model in the presence of binary covariates.
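
As a rough illustration of the three approximations compared in chapter 2, the sketch below evaluates the marginal log-likelihood contribution of a single subject under FO, LP, and (non-adaptive) GQ. Everything specific in it is an assumption made for illustration and is not taken from the dissertation: the one-compartment-style mean function f(b) = exp(-exp(beta + b) t), the single normal random effect b, the four observations, the parameter values, and all function names. The mode search in the Laplace step uses a simple Gauss-Newton iteration, which is one convenient choice among many.

```python
import numpy as np

def exp_decay_model(t, beta):
    """Hypothetical mean f(b) = exp(-exp(beta + b) * t) and its first two derivatives in b."""
    def f(b):
        return np.exp(-np.exp(beta + b) * t)
    def df(b):
        return -t * np.exp(beta + b) * f(b)
    def d2f(b):
        tk = t * np.exp(beta + b)
        return tk * (tk - 1.0) * f(b)
    return f, df, d2f

def log_cond(y, f, b, sigma_e):
    """log p(y | b) for independent normal errors with standard deviation sigma_e."""
    r = y - f(b)
    return -0.5 * y.size * np.log(2 * np.pi * sigma_e**2) - 0.5 * np.sum(r**2) / sigma_e**2

def loglik_fo(y, model, sigma_e, sigma_b):
    """First Order: linearize f around b = 0, so y is marginally multivariate normal."""
    f, df, _ = model
    z = df(0.0)
    V = sigma_b**2 * np.outer(z, z) + sigma_e**2 * np.eye(y.size)
    r = y - f(0.0)
    _, logdet = np.linalg.slogdet(V)
    return -0.5 * (y.size * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(V, r))

def loglik_laplace(y, model, sigma_e, sigma_b, n_iter=50):
    """Laplace: second-order expansion of log p(y | b) + log p(b) around its mode."""
    f, df, d2f = model
    b = 0.0
    for _ in range(n_iter):  # Gauss-Newton steps toward the mode of the joint log density
        r, g = y - f(b), df(b)
        grad = np.sum(r * g) / sigma_e**2 - b / sigma_b**2
        gn = np.sum(g**2) / sigma_e**2 + 1.0 / sigma_b**2
        b += grad / gn
    r, g = y - f(b), df(b)
    # Negative second derivative of the joint log density at the mode.
    curv = np.sum(g**2 - r * d2f(b)) / sigma_e**2 + 1.0 / sigma_b**2
    log_prior = -0.5 * np.log(2 * np.pi * sigma_b**2) - 0.5 * b**2 / sigma_b**2
    return log_cond(y, f, b, sigma_e) + log_prior + 0.5 * np.log(2 * np.pi / curv)

def loglik_gq(y, model, sigma_e, sigma_b, n_nodes=30):
    """Non-adaptive Gauss-Hermite quadrature over b ~ N(0, sigma_b^2)."""
    f = model[0]
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    b_nodes = np.sqrt(2.0) * sigma_b * x  # change of variable b = sqrt(2) * sigma_b * x
    lc = np.array([log_cond(y, f, b, sigma_e) for b in b_nodes])
    return np.log(np.sum(w * np.exp(lc)) / np.sqrt(np.pi))

# Hypothetical data for one subject (illustrative values only).
t = np.array([0.5, 1.0, 2.0, 4.0])
y = np.array([0.80, 0.50, 0.35, 0.05])
model = exp_decay_model(t, beta=np.log(0.5))
sigma_e, sigma_b = 0.2, 0.3

ll_fo = loglik_fo(y, model, sigma_e, sigma_b)
ll_lp = loglik_laplace(y, model, sigma_e, sigma_b)
ll_gq = loglik_gq(y, model, sigma_e, sigma_b)
print(f"FO: {ll_fo:.4f}  LP: {ll_lp:.4f}  GQ: {ll_gq:.4f}")
```

In a full analysis these per-subject contributions would be summed over subjects and maximized over the fixed effects and variance components. When the mean function is linear in b, the three functions return the same value up to quadrature error, which is consistent with the statement that FO is exact in that case.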