Accelerated failure time vs Cox proportional hazards mixture cure models: David vs Goliath?

  • Regular Article
  • Published in Statistical Papers

Abstract

A mixture cure model relies on a model for the cure probability and a model for the survival function of the uncured subjects. For the latter, one often uses a Cox proportional hazards model. We show the identifiability of this model under weak assumptions. The model assumes that the cure threshold is the same for all values of the covariates, which might be unrealistic in certain situations. An alternative mixture cure model is the accelerated failure time (AFT) model. We also show the identifiability of this model under minimal assumptions. The cure threshold in this model depends on the covariates, which often leads to a better fit of the data. This is especially true when the follow-up period is insufficient for certain values of the covariates. We study these two models via simulations both when the follow-up is sufficient and when it is insufficient. Moreover, the two models are applied to data coming from a breast cancer clinical trial. We show that the AFT and the Cox model both fit the data well in the region of sufficient follow-up, but differ drastically outside that region.

Acknowledgements

The authors acknowledge support from the European Research Council (2016–2022, Horizon 2020 / ERC grant agreement No. 694409).

Author information

Corresponding author

Correspondence to Ingrid Van Keilegom.

Appendix: Proofs of main results

In this Appendix, we provide the proofs of Propositions 1 and 2.

Proof of Proposition 1

We need to show that if

$$\begin{aligned}&\Big \{p(z) f_u(y| x)\Big \}^{\delta } \ \Big \{1 - p(z) + p(z)S_u(y| x)\Big \}^{1 - \delta } \nonumber \\&\quad = \Big \{{\tilde{p}}(z) {\tilde{f}}_u(y|x) \Big \}^{\delta } \ \Big \{1 - {\tilde{p}}(z) + {\tilde{p}}(z) \tilde{S}_u(y|x)\Big \}^{1 - \delta } \end{aligned}$$
(7.1)

for all realizations \((y, \delta , x, z)\) of \((Y, \Delta , X, Z)\), then \({\gamma }=\tilde{{\gamma }}, {\beta }=\tilde{{\beta }}\) and \(S_0 \equiv {\tilde{S}}_0\). Here, \(p(z) = \exp (\gamma ^t z) / [1+\exp (\gamma ^t z)]\), \({\tilde{p}}(z) = \exp ({\tilde{\gamma }}^t z) / [1+\exp ({\tilde{\gamma }}^t z)]\), \(S_u(y| x) = S_0(y)^{\exp (\beta ^t x)}\), \({\tilde{S}}_u(y| x) = {\tilde{S}}_0(y)^{\exp ({\tilde{\beta }}^t x)}\), and \(f_u(y|x)\) and \({\tilde{f}}_u(y|x)\) are the corresponding probability density functions.
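To make the two factors in (7.1) concrete, the following minimal Python sketch evaluates the likelihood contribution of a single observation under the Cox mixture cure model. The baseline (uniform on \([0,\tau ]\) with \(\tau = 5\)) and all function names are assumptions chosen for illustration; the proof does not depend on them.

```python
import math

TAU = 5.0  # assumed cure threshold tau (illustrative value only)


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def uncured_probability(gamma, z):
    """p(z) = exp(gamma^t z) / (1 + exp(gamma^t z))."""
    return 1.0 / (1.0 + math.exp(-dot(gamma, z)))


def baseline_survival(y):
    """Assumed toy baseline S_0: uniform distribution on [0, TAU]."""
    return min(1.0, max(0.0, 1.0 - y / TAU))


def baseline_density(y):
    """Density f_0 matching the toy baseline above."""
    return 1.0 / TAU if 0.0 <= y < TAU else 0.0


def cox_uncured_survival(y, x, beta):
    """S_u(y|x) = S_0(y)^{exp(beta^t x)}."""
    return baseline_survival(y) ** math.exp(dot(beta, x))


def cox_uncured_density(y, x, beta):
    """f_u(y|x) = exp(beta^t x) * S_0(y)^{exp(beta^t x) - 1} * f_0(y)."""
    f0 = baseline_density(y)
    if f0 == 0.0:
        return 0.0  # outside the support [0, TAU)
    e = math.exp(dot(beta, x))
    return e * baseline_survival(y) ** (e - 1.0) * f0


def likelihood_contribution(y, delta, x, z, gamma, beta):
    """One side of (7.1): density part if delta = 1, survival part if delta = 0."""
    p = uncured_probability(gamma, z)
    if delta == 1:
        return p * cox_uncured_density(y, x, beta)
    return 1.0 - p + p * cox_uncured_survival(y, x, beta)
```

For a censored observation beyond `TAU`, for instance `likelihood_contribution(6.0, 0, [1.0], [1.0], [0.0], [0.3])`, the uncured survival is zero and the value equals \(1 - p(z)\), whatever the value of \(\beta \).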

First, consider \(y>\tau \) and \(\delta =0\). Note that

$$\begin{aligned} P(Y> \tau , \Delta =0 \ | \ X, Z)&= P(C>\tau , C< T \ | \ X, Z) \\&= P(C>\tau , T=\infty \ | \ X, Z) \\&= P(C>\tau \ | \ X, Z) \ (1-p(Z)) >0, \end{aligned}$$

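thanks to (A)(i), (B)(i) and (B)(ii). Hence, \((y>\tau , \delta =0, x, z)\) is a possible realization of \((Y,\Delta , X, Z)\). Since the survival function of the uncured subjects vanishes beyond \(\tau \), Equation (7.1) reduces in this case to \(1 - p(z) = 1 - \tilde{p}(z)\). Because the logistic function is strictly increasing, this gives \(\gamma ^t z = {\tilde{\gamma }}^t z\) for all z in the support of Z, so that \((\gamma -{\tilde{\gamma }})^t \hbox {Var}(Z) (\gamma -{\tilde{\gamma }}) = 0\). It follows that \(\gamma = {\tilde{\gamma }}\), since \(\hbox {Var}(Z)\) is positive definite by assumption (A)(ii).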
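The step from equal cure probabilities to equal coefficients amounts to inverting the logistic link. As a rough numerical illustration, assume a scalar covariate with an intercept, so that \(\gamma ^t z = g_0 + g_1 z_1\); this parametrization and the function names are assumptions of the sketch, not notation from the paper. Two distinct covariate values then determine both coefficients:

```python
import math


def logistic(u):
    """Logistic function exp(u) / (1 + exp(u))."""
    return 1.0 / (1.0 + math.exp(-u))


def logit(p):
    """Inverse of the logistic function, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def recover_gamma(z_a, p_a, z_b, p_b):
    """Solve logit(p) = g0 + g1 * z from p at two distinct scalar values of z."""
    if z_a == z_b:
        raise ValueError("need two distinct covariate values (Var(Z) > 0)")
    g1 = (logit(p_b) - logit(p_a)) / (z_b - z_a)
    g0 = logit(p_a) - g1 * z_a
    return g0, g1


# Toy check with assumed true values g0 = 0.4 and g1 = -1.2.
g0_hat, g1_hat = recover_gamma(0.0, logistic(0.4), 1.0, logistic(0.4 - 1.2))
```

If the two covariate values coincide, the slope is not determined, which mirrors the role of the variance condition in (A)(ii).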

Next, consider any \(0 < y \le \tau \). Then,

$$\begin{aligned} P(Y \le y, \Delta =1 \ | \ X, Z)&= P(T\le y, T \le C \ | \ X, Z) \\&= \int _0^y P(C > t \ | \ X,Z) \, d P(T \le t \ | \ X,Z). \end{aligned}$$
(7.2)

Hence, the corresponding sub-density equals \(p(Z) P(C > y \ | \ X,Z) f_u(y|X)\). Since the support of \(f_u(\cdot |X)\) is \([0,\tau ]\) and since \(p(Z) P(C> \tau \ | \ X,Z) >0\) thanks to assumptions (A)(i), (B)(i) and (B)(ii), it follows that the support of Y when \(\Delta =1\) and given X and Z is also \([0,\tau ]\). For any \(0 < y \le \tau \) and \(\delta =1\), the likelihood contribution in (7.1) is such that \(p(Z) f_u(y| X) = {\tilde{p}}(Z) \tilde{f}_u(y| X)\). Since \(p \equiv {\tilde{p}} >0\), it follows that \(f_u(y| X)=\tilde{f}_u(y| X)\). Since this is true for all \(0 < y \le \tau \), it follows that \(S_u(y|X)={\tilde{S}}_u(y|X)\). Hence,

$$\begin{aligned} S_0(y)^{\exp (\beta ^t X)} = {\tilde{S}}_0(y)^{\exp ({\tilde{\beta }}^t X)}. \end{aligned}$$

Taking logarithms on both sides, we get

$$\begin{aligned} \exp (\beta ^t X) \log S_0(y) = \exp ({\tilde{\beta }}^t X) \log {\tilde{S}}_0(y), \end{aligned}$$

or equivalently, \((\beta -{\tilde{\beta }})^t X = \log \big [\log \tilde{S}_0(y) / \log S_0(y)\big ]\) for all \(0< y < \tau \) at which \(0< S_0(y) < 1\). The right-hand side does not depend on X, so \((\beta -{\tilde{\beta }})^t X\) is almost surely constant and therefore has zero variance. It follows that

$$\begin{aligned} (\beta -{\tilde{\beta }})^t \text{ Var }(X) (\beta -{\tilde{\beta }}) = 0. \end{aligned}$$

Since Var(X) is positive definite by assumption (A)(ii), this is only possible if \(\beta ={\tilde{\beta }}\), which implies that \(S_0 \equiv {\tilde{S}}_0\). \(\square \)
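The last step can be illustrated numerically. Under the Cox structure, \(\log (-\log S_u(y|x)) = \beta ^t x + \log (-\log S_0(y))\), so at a fixed y the difference across two covariate values isolates \(\beta \). The sketch below assumes a scalar covariate and uses hypothetical function names; it illustrates the identity and is not an estimation procedure.

```python
import math


def cloglog(s):
    """Complementary log-log transform log(-log s), defined for 0 < s < 1."""
    if not 0.0 < s < 1.0:
        raise ValueError("survival value must lie strictly between 0 and 1")
    return math.log(-math.log(s))


def recover_beta(s_a, x_a, s_b, x_b):
    """Recover a scalar beta from S_u(y|x_a) and S_u(y|x_b) at one fixed y."""
    if x_a == x_b:
        raise ValueError("need two distinct covariate values (Var(X) > 0)")
    return (cloglog(s_a) - cloglog(s_b)) / (x_a - x_b)


def recover_baseline(s_a, x_a, beta):
    """Recover S_0(y) from S_u(y|x_a) once beta is known."""
    return s_a ** math.exp(-beta * x_a)


# Toy check with assumed values: S_0(y) = 0.7 at some fixed y and beta = 0.8.
s0_true, beta_true = 0.7, 0.8
s_a = s0_true ** math.exp(beta_true * 1.0)
s_b = s0_true ** math.exp(beta_true * -0.5)
beta_hat = recover_beta(s_a, 1.0, s_b, -0.5)
s0_hat = recover_baseline(s_a, 1.0, beta_hat)
```

Once \(\beta \) is pinned down, the baseline follows by undoing the exponent, which is the order of the argument in the proof.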

Proof of Proposition 2

We need to show that if

$$\begin{aligned}&\Big \{p(z) f_u(y|x)\Big \}^{\delta } \ \Big \{1 - p(z) + p(z)S_u(y| x)\Big \}^{1 - \delta } \nonumber \\&\quad = \Big \{{\tilde{p}}(z) {\tilde{f}}_u(y|x) \Big \}^{\delta } \ \Big \{1 - {\tilde{p}}(z) + {\tilde{p}}(z) \tilde{S}_u(y|x)\Big \}^{1 - \delta } \end{aligned}$$
(7.3)

for all realizations \((y, \delta , x, z)\) of \((Y, \Delta , X, Z)\), then \({\gamma }=\tilde{{\gamma }}, {\beta }=\tilde{{\beta }}\) and \(S_0 \equiv {\tilde{S}}_0\). Here, \(p(z) = \exp (\gamma ^t z) / [1+\exp (\gamma ^t z)]\), \({\tilde{p}}(z) = \exp ({\tilde{\gamma }}^t z) / [1+\exp ({\tilde{\gamma }}^t z)]\), \(S_u(y| x) = S_0(y \exp (-\beta ^t x))\), \({\tilde{S}}_u(y| x) = {\tilde{S}}_0(y \exp (-{\tilde{\beta }}^t x))\), and \(f_u(y|x)\) and \({\tilde{f}}_u(y|x)\) are the corresponding probability density functions.
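In contrast with the Cox specification, the AFT structure rescales time, so the upper end of the support of the uncured survival function moves with the covariates. A minimal sketch, assuming a toy uniform baseline on \([0,\tau _0]\) with \(\tau _0 = 5\) and hypothetical function names:

```python
import math

TAU0 = 5.0  # assumed upper end of the baseline support (illustrative value)


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def baseline_survival(u):
    """Assumed toy baseline S_0: uniform distribution on [0, TAU0]."""
    return min(1.0, max(0.0, 1.0 - u / TAU0))


def aft_uncured_survival(y, x, beta):
    """S_u(y|x) = S_0(y * exp(-beta^t x))."""
    return baseline_survival(y * math.exp(-dot(beta, x)))


def cure_threshold(x, beta):
    """Upper end of the support of S_u(.|x): tau_0 * exp(beta^t x)."""
    return TAU0 * math.exp(dot(beta, x))


def population_survival(y, x, z, gamma, beta):
    """1 - p(z) + p(z) S_u(y|x), the survival function of the whole population."""
    p = 1.0 / (1.0 + math.exp(-dot(gamma, z)))
    return 1.0 - p + p * aft_uncured_survival(y, x, beta)


# Thresholds at three covariate values for an assumed beta = 0.5.
beta = [0.5]
thresholds = {x: cure_threshold([x], beta) for x in (-1.0, 0.0, 1.0)}
```

With \(\beta = 0.5\) the threshold ranges from about 3.03 at \(x=-1\) to about 8.24 at \(x=1\), so a study with, say, 6 time units of follow-up would reach the threshold for \(x=-1\) and \(x=0\) but not for \(x=1\).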

First, consider any \(0 < y \le \tau _0 \exp (\beta ^t x)\) with \((x,z) \in S\). Then, we know from (7.2) that the conditional sub-density of the uncensored Y-values equals \(p(z) P(C> y \ | \ X=x,Z=z) f_u(y|x) = p(z) P(C > y \ | \ X=x,Z=z) f_0(y \exp (-\beta ^t x)) \exp (-\beta ^t x)\). Since the support of \(f_0(\cdot )\) is \([0,\tau _0]\) and since \(p(z) P(C> \tau _0 \exp (\beta ^t x) \ | \ X=x,Z=z) >0\) thanks to assumptions (A)(i), (C)(i) and (C)(ii), it follows that the support of Y when \(\Delta =1\), \(X=x\) and \(Z=z\) is \([0,\tau _0 \exp (\beta ^t x)]\). This support is determined by the distribution of the observed data and is therefore the same under both parametrizations. Hence, \(\beta ^t x = {\tilde{\beta }}^t x\), which is only possible if \(\beta ={\tilde{\beta }}\) since Var\((X|X \in S_X)>0\).

Next, consider \(y> \tau _0 \exp (\beta ^t x)\) and \(\delta =0\) with \((x,z) \in S\). Note that

$$\begin{aligned}&P(Y> \tau _0\exp (\beta ^t x), \Delta =0 \ | \ X=x, Z=z) \\&\quad = P(C> \tau _0\exp (\beta ^t x), C< T \ | \ X=x, Z=z) \\&\quad = P(C> \tau _0\exp (\beta ^t x), T=\infty \ | \ X=x, Z=z) \\&\quad = P(C> \tau _0\exp (\beta ^t x) \ | \ X=x, Z=z) \ (1-p(z)) >0, \end{aligned}$$

thanks to (A)(i), (C)(i) and (C)(ii). Hence, \((y> \tau _0\exp (\beta ^t x), \delta =0, x, z)\) is a possible realization of \((Y,\Delta , X, Z)\). Equation (7.3) reduces in this case to \(1 - p(z) = 1 - {\tilde{p}}(z)\). It follows that \(\gamma = {\tilde{\gamma }}\) since \(\hbox {Var}(Z|Z \in S_Z)>0\).

Finally, for any \(0 < y \le \tau _0 \exp (\beta ^t x)\) and \(\delta =1\), the likelihood contribution in (7.3) is such that \(p(z) f_u(y|x) = {\tilde{p}}(z) \tilde{f}_u(y| x)\). Since \(p(z) = {\tilde{p}}(z) >0\), it follows that \(f_u(y|x)={\tilde{f}}_u(y| x)\). Since this is true for all \(0 < y \le \tau _0 \exp (\beta ^t x)\), it follows that \(S_u(y| x)={\tilde{S}}_u(y| x)\) for all \(0 < y \le \tau _0 \exp (\beta ^t x)\). Because \(\beta ={\tilde{\beta }}\) by the first step, this means \(S_0(y \exp (-\beta ^t x)) = {\tilde{S}}_0(y \exp (-\beta ^t x))\) on that range, and hence \(S_0 \equiv {\tilde{S}}_0\). \(\square \)
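The probability computed in the second step of the proof can be checked by simulation. The sketch below draws data from an AFT mixture cure model under assumptions that are made here for illustration and are not taken from the paper: scalar covariates held fixed, a uniform baseline on \([0,\tau _0]\), and exponential censoring independent of T.

```python
import math
import random


def simulate_aft_cure(n, x, z, gamma, beta, tau0, censor_rate, seed=0):
    """Draw n pairs (Y, Delta) at fixed scalar covariates x and z."""
    rng = random.Random(seed)
    p = 1.0 / (1.0 + math.exp(-gamma * z))   # probability of being uncured
    scale = math.exp(beta * x)               # AFT time rescaling
    sample = []
    for _ in range(n):
        uncured = rng.random() < p
        # Uncured: rescaled baseline draw; cured: event never happens.
        t = scale * rng.uniform(0.0, tau0) if uncured else math.inf
        c = rng.expovariate(censor_rate)     # censoring time, independent of T
        sample.append((min(t, c), 1 if t <= c else 0))
    return sample


x, z, gamma, beta, tau0, rate = 1.0, 1.0, 0.0, 0.2, 2.0, 0.2
threshold = tau0 * math.exp(beta * x)
sample = simulate_aft_cure(200_000, x, z, gamma, beta, tau0, rate)

# Empirical P(Y > threshold, Delta = 0) versus P(C > threshold) * (1 - p(z)).
empirical = sum(1 for y, d in sample if y > threshold and d == 0) / len(sample)
theoretical = math.exp(-rate * threshold) * (1.0 - 0.5)  # p(z) = 0.5 when gamma = 0
largest_event = max(y for y, d in sample if d == 1)
```

The empirical and theoretical probabilities should agree up to Monte Carlo error, and no uncensored observation exceeds \(\tau _0 \exp (\beta x)\), in line with the support argument in the first step.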

Cite this article

Parsa, M., Van Keilegom, I. Accelerated failure time vs Cox proportional hazards mixture cure models: David vs Goliath?. Stat Papers 64, 835–855 (2023). https://doi.org/10.1007/s00362-022-01345-5

  • DOI: https://doi.org/10.1007/s00362-022-01345-5
