Multivariate Threshold Regression Models with Cure Rates: Identification and Estimation in the Presence of the Esscher Property








Lee, M.-L.T.; Whitmore, G.A. Multivariate Threshold Regression Models with Cure Rates: Identification and Estimation in the Presence of the Esscher Property. Stats 2022, 5, 172-189.


The first hitting time of a boundary or threshold by the sample path of a stochastic process is the central concept of threshold regression models for survival data analysis. Regression functions for the process and threshold parameters in these models are multivariate combinations of explanatory variates. The stochastic process under investigation may be a univariate stochastic process or a multivariate stochastic process. The stochastic processes of interest to us in this report are those that possess stationary independent increments (i.e., Lévy processes) as well as the Esscher property. The Esscher transform is a transformation of probability density functions that has applications in actuarial science, financial engineering, and other fields. Lévy processes with this property are often encountered in practical applications. Frequently, these applications also involve a ‘cure rate’ fraction because some individuals are susceptible to failure and others are not. Cure rates may arise endogenously from the model alone or exogenously from mixing of distinct statistical populations in the data set. We show, using both theoretical analysis and case demonstrations, that model estimates derived from typical survival data may not be able to distinguish between individuals in the cure rate fraction who are not susceptible to failure and those who may be susceptible to failure but escape that fate by chance. The ambiguity is aggravated by right censoring of survival times and by minor misspecifications of the model. Slightly incorrect specifications for regression functions or for the stochastic process can lead to problems with model identification and estimation. In this situation, additional guidance for estimating the fraction of non-susceptibles must come from subject matter expertise or from data types other than survival times, censored or otherwise.
The identifiability issue is confronted directly in threshold regression but is also present when applying other kinds of models commonly used for survival data analysis. Other methods, however, usually do not provide a framework for recognizing or dealing with the issue, so it is often unintentionally ignored. We also set out the theoretical foundations of this work, which include new and somewhat surprising results for the first hitting time distributions of Lévy processes that have the Esscher property.
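
As an illustration of how a cure rate can arise endogenously and become entangled with right censoring, the following minimal sketch takes a Wiener process with drift as one familiar Lévy process. It is not the authors' code or estimation method. The function names, the parameter values, and the censoring time are all assumptions chosen for illustration. The process starts at y0 > 0, fails when it first reaches zero, and has drift mu and standard deviation sigma; when mu > 0 the process drifts away from the threshold and a fraction 1 - exp(-2*y0*mu/sigma^2) of sample paths never fails.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fht_cdf(t, y0, mu, sigma=1.0):
    """P(T <= t) for the first time a Wiener process started at y0 > 0,
    with drift mu and standard deviation sigma, reaches the threshold 0."""
    if t <= 0:
        return 0.0
    s = sigma * math.sqrt(t)
    return (norm_cdf(-(y0 + mu * t) / s)
            + math.exp(-2.0 * y0 * mu / sigma ** 2) * norm_cdf((mu * t - y0) / s))


def cure_fraction(y0, mu, sigma=1.0):
    """Probability that the process never reaches 0 (endogenous cure rate).
    It is zero unless the drift points away from the threshold (mu > 0)."""
    if mu <= 0:
        return 0.0
    return 1.0 - math.exp(-2.0 * y0 * mu / sigma ** 2)


# Hypothetical values: initial level, drift away from the threshold,
# and a right-censoring time for the study.
y0, mu, censor_time = 1.0, 0.5, 5.0

survivors = 1.0 - fht_cdf(censor_time, y0, mu)    # still event-free at censoring
never_fail = cure_fraction(y0, mu)                # will never fail
not_yet_failed = survivors - never_fail           # would fail later if followed

print(f"event-free at censoring: {survivors:.4f}")
print(f"  never fail:            {never_fail:.4f}")
print(f"  have not failed yet:   {not_yet_failed:.4f}")
```

Both groups appear in the data only as censored observations at the same time, so the survival times alone carry no label that separates them. This is the sense in which censoring aggravates the ambiguity described above.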