Regularized Variable Selection in Proportional Hazards Model Using Area under Receiver Operating Characteristic Curve Criterion

The goal of this thesis is to develop a statistical procedure for selecting pertinent predictors from a set of covariates in order to accurately predict a patient's survival time. Many variable selection procedures are available in the literature. This thesis focuses on the more recently developed “regularized variable selection procedure”. Based on a penalized likelihood, this procedure performs variable selection and parameter estimation simultaneously, which earlier procedures cannot do. Specifically, this thesis studies the regularized variable selection procedure in the proportional hazards model for censored survival data.
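For orientation, a penalized partial likelihood for the proportional hazards model is commonly written in the form below. This is a generic sketch and not necessarily the thesis's exact formulation. The notation is assumed here: $x_i$ is the covariate vector of subject $i$, $\delta_i$ the event indicator, $R(t_i)$ the risk set at the observed time $t_i$, $d$ the number of covariates, and $p_\lambda$ an unspecified penalty function.

```latex
\ell_{\mathrm{pen}}(\beta)
  = \sum_{i=1}^{n} \delta_i \Bigl\{ x_i^{\top}\beta
      - \log \sum_{j \in R(t_i)} \exp\bigl(x_j^{\top}\beta\bigr) \Bigr\}
    - n \sum_{k=1}^{d} p_{\lambda}\bigl(|\beta_k|\bigr)
```

Maximizing this over $\beta$ shrinks some coefficients exactly to zero for suitable penalties, which is how selection and estimation happen in one step. A larger $\lambda$ removes more covariates.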

Implementing the procedure requires a judicious choice of the amount of penalty on the likelihood, governed by a regularization parameter λ, and the development of computationally intensive algorithms. In this thesis, a new criterion for determining λ, based on the area under the receiver operating characteristic curve (AUC), is proposed. The conventional generalized cross-validation (GCV) criterion is based on the likelihood and its second derivative. Unlike GCV, the AUC criterion is based on the performance of disease classification in terms of patients' survival times. Simulations show that the AUC and GCV criteria perform similarly, but the AUC criterion offers a better interpretation of the survival data.
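The idea of choosing λ by classification performance can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: the classification target is "event before a fixed horizon `t0`", subjects censored before `t0` are simply dropped, and the risk scores for each candidate λ are already available from some fitted model. The function names `auc_at_horizon` and `select_lambda_by_auc` are hypothetical, and the thesis's actual AUC definition may differ.

```python
import numpy as np

def auc_at_horizon(risk, time, event, t0):
    """AUC for classifying 'event before t0' from a risk score.

    Assumption: subjects censored before t0 have unknown status and are
    dropped, which is a simplification of how censoring can be handled.
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    case = (time <= t0) & event      # observed event by t0
    control = time > t0              # still event-free at t0
    pos, neg = risk[case], risk[control]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one case and one control at t0")
    # Mann-Whitney form: fraction of (case, control) pairs ranked correctly,
    # counting ties as half.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size))

def select_lambda_by_auc(scores_by_lambda, time, event, t0):
    """Return the candidate lambda whose risk scores give the largest AUC."""
    aucs = {lam: auc_at_horizon(s, time, event, t0)
            for lam, s in scores_by_lambda.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs

# Toy data (made up for illustration only).
time = [2, 3, 5, 7, 8, 10]
event = [1, 1, 0, 1, 0, 0]
scores_by_lambda = {
    0.1: [0.9, 0.8, 0.5, 0.3, 0.2, 0.1],  # hypothetical scores at lambda = 0.1
    1.0: [0.1, 0.8, 0.5, 0.3, 0.2, 0.9],  # hypothetical scores at lambda = 1.0
}
best_lambda, aucs = select_lambda_by_auc(scores_by_lambda, time, event, t0=6)
```

In practice the risk scores would be computed on held-out data, so that the selected λ is not biased toward overfitted models.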

We also establish the consistency and asymptotic normality of the regularized estimators of the parameters in the partial likelihood of the proportional hazards model. Some oracle properties of the regularized estimators are discussed under certain sparsity conditions. An algorithm for selecting λ and computing the regularized estimates is developed. The procedure is then illustrated with an application to survival data from patients with head and neck cancers. The results show that the proposed method is comparable to the conventional one.