Abstract
A mixture cure model relies on a model for the cure probability and a model for the survival function of the uncured subjects. For the latter, one often uses a Cox proportional hazards model. We show the identifiability of this model under weak assumptions. The model assumes that the cure threshold is the same for all values of the covariates, which might be unrealistic in certain situations. An alternative mixture cure model is the accelerated failure time (AFT) model. We also show the identifiability of this model under minimal assumptions. The cure threshold in this model depends on the covariates, which often leads to a better fit of the data. This is especially true when the follow-up period is insufficient for certain values of the covariates. We study these two models via simulations both when the follow-up is sufficient and when it is insufficient. Moreover, the two models are applied to data coming from a breast cancer clinical trial. We show that the AFT and the Cox model both fit the data well in the region of sufficient follow-up, but differ drastically outside that region.
Acknowledgements
The authors acknowledge support from the European Research Council (2016–2022, Horizon 2020 / ERC grant agreement No. 694409).
Appendix: Proofs of main results
In this Appendix, we provide the proofs of Propositions 1 and 2.
Proof of Proposition 1
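We need to show that if
\[ \big \{p(z) f_u(y| x)\big \}^{\delta } \big \{1 - p(z) + p(z) S_u(y| x)\big \}^{1-\delta } = \big \{{\tilde{p}}(z) {\tilde{f}}_u(y| x)\big \}^{\delta } \big \{1 - {\tilde{p}}(z) + {\tilde{p}}(z) {\tilde{S}}_u(y| x)\big \}^{1-\delta } \tag{7.1} \]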
for all realizations \((y, \delta , x, z)\) of \((Y, \Delta , X, Z)\), then \({\gamma }=\tilde{{\gamma }}, {\beta }=\tilde{{\beta }}\) and \(S_0 \equiv {\tilde{S}}_0\). Here, \(p(z) = \exp (\gamma ^t z) / [1+\exp (\gamma ^t z)]\), \({\tilde{p}}(z) = \exp ({\tilde{\gamma }}^t z) / [1+\exp ({\tilde{\gamma }}^t z)]\), \(S_u(y| x) = S_0(y)^{\exp (\beta ^t x)}\), \({\tilde{S}}_u(y| x) = {\tilde{S}}_0(y)^{\exp ({\tilde{\beta }}^t x)}\), and \(f_u(y|x)\) and \({\tilde{f}}_u(y|x)\) are the corresponding probability density functions.
First, consider \(y>\tau \) and \(\delta =0\). Note that
\[ P(Y> \tau , \Delta =0 \ | \ X,Z) = \{1 - p(Z)\} \, P(C > \tau \ | \ X,Z) > 0 \]
thanks to (A)(i), (B)(i) and (B)(ii). Hence, \((y>\tau , \delta =0, x, z)\) is a possible realization of \((Y,\Delta , X, Z)\). Equation (7.1) reduces in this case to \(1 - p(z) = 1 - \tilde{p}(z)\). Because the logistic function is one-to-one, this gives \(\gamma ^t z = {\tilde{\gamma }}^t z\) for all z in the support of Z, and hence \(\gamma = {\tilde{\gamma }}\), since \(\hbox {Var}(Z)\) is positive definite by assumption (A)(ii).
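Next, consider any \(0 < y \le \tau \). Then,
\[ P(Y \le y, \Delta =1 \ | \ X,Z) = p(Z) \int _0^y P(C > t \ | \ X,Z) \, f_u(t|X) \, dt. \tag{7.2} \]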
Hence, the corresponding sub-density equals \(p(Z) P(C > y \ | \ X,Z) f_u(y|X)\). Since the support of \(f_u(\cdot |X)\) is \([0,\tau ]\) and since \(p(Z) P(C> \tau \ | \ X,Z) >0\) thanks to assumptions (A)(i), (B)(i) and (B)(ii), it follows that the support of Y when \(\Delta =1\) and given X and Z is also \([0,\tau ]\). For any \(0 < y \le \tau \) and \(\delta =1\), the likelihood contribution in (7.1) is such that \(p(Z) f_u(y| X) = {\tilde{p}}(Z) \tilde{f}_u(y| X)\). Since \(p \equiv {\tilde{p}} >0\), it follows that \(f_u(y| X)=\tilde{f}_u(y| X)\). Since this is true for all \(0 < y \le \tau \), it follows that \(S_u(y|X)={\tilde{S}}_u(y|X)\). Hence,
\[ S_0(y)^{\exp (\beta ^t X)} = {\tilde{S}}_0(y)^{\exp ({\tilde{\beta }}^t X)} \quad \text {for all } 0 < y \le \tau . \]
Taking logarithms on both sides, we get
\[ \exp (\beta ^t X) \log S_0(y) = \exp ({\tilde{\beta }}^t X) \log {\tilde{S}}_0(y), \]
or equivalently, \((\beta -{\tilde{\beta }})^t X = \log \big [\log \tilde{S}_0(y) / \log S_0(y)\big ]\) for all \(0 < y \le \tau \). It follows that
\[ \hbox {Var}\big ((\beta -{\tilde{\beta }})^t X\big ) = (\beta -{\tilde{\beta }})^t \, \hbox {Var}(X) \, (\beta -{\tilde{\beta }}) = 0. \]
Since Var(X) is positive definite by assumption (A)(ii), this is only possible if \(\beta ={\tilde{\beta }}\), which implies that \(S_0 \equiv {\tilde{S}}_0\). \(\square \)
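The last step of this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical baseline survival function \(S_0(y) = (1 - y/\tau )^2\) on \([0,\tau ]\), hypothetical values for \(\beta \) and \(\tau \), and simulated Gaussian covariates. It uses the identity \(\log \{-\log S_u(y|x)\} = \beta ^t x + \log \{-\log S_0(y)\}\), which follows from \(S_u(y|x) = S_0(y)^{\exp (\beta ^t x)}\), and recovers \(\beta \) by least squares at one fixed time point.

```python
import numpy as np

def baseline_survival(y, tau=2.0):
    """Hypothetical baseline survival S_0 with support [0, tau] (an assumption)."""
    return np.clip(1.0 - y / tau, 0.0, 1.0) ** 2

def cox_uncured_survival(y, x, beta, tau=2.0):
    """Cox form used in the proof: S_u(y | x) = S_0(y) ** exp(beta^t x)."""
    return baseline_survival(y, tau) ** np.exp(x @ beta)

rng = np.random.default_rng(0)
beta_true = np.array([0.5, -1.0])   # illustrative value, not from the paper
X = rng.normal(size=(50, 2))        # covariates with positive definite variance
y = 0.7                             # any fixed time point in (0, tau)

# Complementary log-log transform:
# log(-log S_u(y|x)) = beta^t x + log(-log S_0(y)), linear in x for fixed y.
cloglog = np.log(-np.log(cox_uncured_survival(y, X, beta_true)))

# Least squares on (1, x) recovers the slope beta and the baseline term.
design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(design, cloglog, rcond=None)
intercept, beta_recovered = coef[0], coef[1:]

print("recovered beta:", beta_recovered)
print("recovered log(-log S_0(y)):", intercept)
```

The regression is exact here because no sampling noise is involved; it only illustrates that, once \(S_u(\cdot |x)\) is known and the covariates are not collinear, \(\beta \) and \(S_0\) are pinned down.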
Proof of Proposition 2
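We need to show that if
\[ \big \{p(z) f_u(y| x)\big \}^{\delta } \big \{1 - p(z) + p(z) S_u(y| x)\big \}^{1-\delta } = \big \{{\tilde{p}}(z) {\tilde{f}}_u(y| x)\big \}^{\delta } \big \{1 - {\tilde{p}}(z) + {\tilde{p}}(z) {\tilde{S}}_u(y| x)\big \}^{1-\delta } \tag{7.3} \]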
for all realizations \((y, \delta , x, z)\) of \((Y, \Delta , X, Z)\), then \({\gamma }=\tilde{{\gamma }}, {\beta }=\tilde{{\beta }}\) and \(S_0 \equiv {\tilde{S}}_0\). Here, \(p(z) = \exp (\gamma ^t z) / [1+\exp (\gamma ^t z)]\), \({\tilde{p}}(z) = \exp ({\tilde{\gamma }}^t z) / [1+\exp ({\tilde{\gamma }}^t z)]\), \(S_u(y| x) = S_0(y \exp (-\beta ^t x))\), \({\tilde{S}}_u(y| x) = {\tilde{S}}_0(y \exp (-{\tilde{\beta }}^t x))\), and \(f_u(y|x)\) and \({\tilde{f}}_u(y|x)\) are the corresponding probability density functions.
First, consider any \(0 < y \le \tau _0 \exp (\beta ^t x)\) with \((x,z) \in S\). Then, we know from (7.2) that the conditional sub-density of the uncensored Y-values equals \(p(z) P(C> y \ | \ X=x,Z=z) f_u(y|x) = p(z) P(C > y \ | \ X=x,Z=z) f_0(y \exp (-\beta ^t x)) \exp (-\beta ^t x)\). Since the support of \(f_0(\cdot )\) is \([0,\tau _0]\) and since \(p(z) P(C> \tau _0 \exp (\beta ^t x) \ | \ X=x,Z=z) >0\) thanks to assumptions (A)(i), (C)(i) and (C)(ii), it follows that the support of Y when \(\Delta =1\), \(X=x\) and \(Z=z\) is \([0,\tau _0 \exp (\beta ^t x)]\). Hence, \(\beta ^t x = {\tilde{\beta }}^t x\), which is only possible if \(\beta ={\tilde{\beta }}\) since Var\((X|X \in S_X)>0\).
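Next, consider \(y> \tau _0 \exp (\beta ^t x)\) and \(\delta =0\) with \((x,z) \in S\). Note that
\[ P(Y> \tau _0 \exp (\beta ^t x), \Delta =0 \ | \ X=x,Z=z) = \{1 - p(z)\} \, P(C > \tau _0 \exp (\beta ^t x) \ | \ X=x,Z=z) > 0 \]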
thanks to (A)(i), (C)(i) and (C)(ii). Hence, \((y> \tau _0\exp (\beta ^t x), \delta =0, x, z)\) is a possible realization of \((Y,\Delta , X, Z)\). Equation (7.3) reduces in this case to \(1 - p(z) = 1 - {\tilde{p}}(z)\). It follows that \(\gamma = {\tilde{\gamma }}\) since \(\hbox {Var}(Z|Z \in S_Z)>0\).
Finally, for any \(0 < y \le \tau _0 \exp (\beta ^t x)\) and \(\delta =1\), the likelihood contribution in (7.3) is such that \(p(z) f_u(y|x) = {\tilde{p}}(z) \tilde{f}_u(y| x)\). Since \(p(z) = {\tilde{p}}(z) >0\), it follows that \(f_u(y|x)={\tilde{f}}_u(y| x)\). Since this is true for all \(0 < y \le \tau _0 \exp (\beta ^t x)\), it follows that \(S_u(y| x)={\tilde{S}}_u(y| x)\) for all \(0 < y \le \tau _0 \exp (\beta ^t x)\), and hence \(S_0 \equiv {\tilde{S}}_0\). \(\square \)
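The role of the covariate-dependent cure threshold in this proof can be illustrated with a short simulation. This is a sketch under stated assumptions, not the authors' procedure: the values of \(\beta \) and \(\tau _0\), the uniform covariates, and the scaled Beta(2, 2) baseline on \([0,\tau _0]\) are all hypothetical. Uncured event times are generated as \(T = T_0 \exp (\beta ^t x)\), so that \(S_u(y|x) = S_0(y \exp (-\beta ^t x))\), and \(\beta \) is read off from the log-thresholds \(\log \tau _0 + \beta ^t x\).

```python
import numpy as np

def aft_cure_threshold(x, beta, tau0):
    """Upper end of the support of S_u(. | x) under the AFT form: tau0 * exp(beta^t x)."""
    return tau0 * np.exp(x @ beta)

rng = np.random.default_rng(1)
beta_true = np.array([0.8, -0.3])        # illustrative value, not from the paper
tau0 = 1.5                               # hypothetical baseline threshold
X = rng.uniform(-1.0, 1.0, size=(40, 2))

# The threshold varies with x, unlike the Cox model where it is a common tau.
thresholds = aft_cure_threshold(X, beta_true, tau0)

# Uncured event times: T = T0 * exp(beta^t x), with T0 supported on [0, tau0].
baseline_times = tau0 * rng.beta(2.0, 2.0, size=len(X))
event_times = baseline_times * np.exp(X @ beta_true)

# log threshold = log(tau0) + beta^t x, so least squares on (1, x) recovers both.
design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(design, np.log(thresholds), rcond=None)
tau0_recovered, beta_recovered = np.exp(coef[0]), coef[1:]

print("recovered beta:", beta_recovered)
print("recovered tau0:", tau0_recovered)
print("all event times within threshold:", bool(np.all(event_times <= thresholds)))
```

In practice the thresholds are not observed directly and must be inferred from the data; the sketch only shows that knowing the support of the uncensored observations for each \(x\) is enough to determine \(\beta \).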
About this article
Cite this article
Parsa, M., Van Keilegom, I. Accelerated failure time vs Cox proportional hazards mixture cure models: David vs Goliath?. Stat Papers 64, 835–855 (2023). https://doi.org/10.1007/s00362-022-01345-5
DOI: https://doi.org/10.1007/s00362-022-01345-5