A support vector machine based semiparametric mixture cure model

  • Peizhi Li
  • Yingwei Peng
  • Ping Jiang
  • Qingli Dong
Original paper


The mixture cure model extends standard survival models to analyze survival data with a cured fraction. Many recent developments focus on the latency part of the model, allowing more flexible modeling of the distribution of uncured subjects, while fewer studies address the incidence part, which models the probability of being uncured or cured. We propose a new mixture cure model that employs the support vector machine (SVM) to model the covariate effects in the incidence part. The new model inherits the flexibility of the SVM in assessing the effects of covariates on the incidence. Unlike the existing nonparametric approaches for the incidence part, the SVM method also accommodates potentially high-dimensional covariates. Semiparametric models are allowed in the latency part of the proposed model. We develop an estimation method for the proposed model and conduct a simulation study showing that it outperforms existing cure models, particularly in incidence estimation. An illustrative example using data from leukemia patients is given.
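
For orientation, the incidence and latency parts mentioned above can be placed in the usual mixture cure formulation. This is a sketch in assumed notation, not taken from the text above: π(z) denotes the probability of being uncured given covariates z, S_u(t | x) the survival function of uncured subjects given covariates x, and γ a coefficient vector for the classical logistic incidence model.

```latex
% Population survival function of a mixture cure model (assumed notation)
\begin{align}
  S_{\mathrm{pop}}(t \mid x, z) &= 1 - \pi(z) + \pi(z)\, S_u(t \mid x), \\
  % Classical logistic incidence model, shown only for comparison
  \pi(z) &= \frac{\exp(\gamma^{\top} z)}{1 + \exp(\gamma^{\top} z)}.
\end{align}
```

In this notation, π(z) is the incidence part and S_u(t | x) is the latency part. The proposed model replaces a parametric form for π(z), such as the logistic form in the second line, with an SVM-based estimate.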


  • Censored survival time
  • Cure model
  • Support vector machine
  • EM algorithm
  • Multiple imputation



The first and the last authors gratefully acknowledge financial support from the China Scholarship Council. The first author’s work was partially supported by the Liaoning Social Science Planning Fund (L19CTJ001). The second author’s work was partially supported by a research grant from the Natural Sciences and Engineering Research Council of Canada. The last author’s work was also supported by the Fundamental Research Funds for the Central Universities (DUT19RC(3)042).

Compliance with ethical standards

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Finance, Dongbei University of Finance and Economics, Dalian, China
  2. Department of Public Health Sciences, Queen’s University, Kingston, Canada
  3. Department of Mathematics and Statistics, Queen’s University, Kingston, Canada
  4. School of Statistics, Dongbei University of Finance and Economics, Dalian, China
  5. Faculty of Management and Economics, Dalian University of Technology, Dalian, China
