A support vector machine based semiparametric mixture cure model

Abstract

The mixture cure model extends standard survival models to analyze survival data with a cured fraction. Many recent developments focus on the latency part of the model, allowing more flexible modeling of the survival distribution of uncured subjects; fewer studies address the incidence part, which models the probability of being uncured (or cured). We propose a new mixture cure model that uses the support vector machine (SVM) to model covariate effects in the incidence part. The new model inherits the flexibility of the SVM in assessing the effects of covariates on the incidence. Unlike existing nonparametric approaches for the incidence part, the SVM method also accommodates potentially high-dimensional covariates. The latency part of the proposed model may be specified semiparametrically. We develop an estimation method for the proposed model and conduct a simulation study, which shows that the proposed model outperforms existing cure models, particularly in incidence estimation. An illustrative example using data from leukemia patients is given.
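
As a rough illustration of the two-part structure described in the abstract, the sketch below evaluates the standard mixture cure survival function, S_pop(t) = (1 - pi) + pi * S_u(t), where pi is the probability of being uncured (incidence) and S_u is the survival function of uncured subjects (latency). It is not the authors' estimation procedure: the logistic mapping from a classifier score to pi, the exponential latency distribution, and all function and parameter names are assumptions made only for this example.

```python
import math


def uncured_probability(score: float) -> float:
    """Incidence part: map a real-valued classifier score to P(uncured).

    Assumption: a logistic link stands in for whatever probability
    mapping is applied to the SVM output in the paper.
    """
    return 1.0 / (1.0 + math.exp(-score))


def latency_survival(t: float, rate: float) -> float:
    """Latency part: survival function of uncured subjects.

    Assumption: an exponential distribution, chosen only for illustration;
    the paper allows semiparametric latency models.
    """
    return math.exp(-rate * t)


def population_survival(t: float, score: float, rate: float) -> float:
    """Mixture cure survival: S_pop(t) = (1 - pi) + pi * S_u(t)."""
    pi = uncured_probability(score)
    return (1.0 - pi) + pi * latency_survival(t, rate)
```

At t = 0 the population survival equals 1, and as t grows it levels off at the cured fraction 1 - pi rather than dropping to zero, which is the plateau that distinguishes cure models from standard survival models.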




Acknowledgements

The first and the last authors gratefully acknowledge financial support from the China Scholarship Council. The first author’s work was partially supported by the Liaoning Social Science Planning Fund (L19CTJ001). The second author’s work was partially supported by a research grant from the Natural Sciences and Engineering Research Council of Canada. The last author’s work was also supported by the Fundamental Research Funds for the Central Universities (DUT19RC(3)042).

Author information

Corresponding author

Correspondence to Yingwei Peng.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

About this article

Cite this article

Li, P., Peng, Y., Jiang, P. et al. A support vector machine based semiparametric mixture cure model. Comput Stat 35, 931–945 (2020). https://doi.org/10.1007/s00180-019-00931-w

Keywords

  • Censored survival time
  • Cure model
  • Support vector machine
  • EM algorithm
  • Multiple imputation