The mixture cure model extends standard survival models to the analysis of survival data with a cured fraction. Many recent developments focus on the latency part of the model, allowing more flexible modeling of the distribution of uncured subjects; fewer studies focus on the incidence part, which models the probability of being uncured or cured. We propose a new mixture cure model that employs the support vector machine (SVM) to model the covariate effects in the incidence part. The new model inherits the flexibility of the SVM in assessing the effects of covariates on the incidence. Unlike existing nonparametric approaches for the incidence part, the SVM method also accommodates potentially high-dimensional covariates. Semiparametric models are allowed in the latency part of the proposed model. We develop an estimation method for the proposed model and conduct a simulation study showing that it outperforms existing cure models, particularly in incidence estimation. An illustrative example using data from leukemia patients is given.
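The two-part structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimation method: the uncured probability is passed in as a plain number (in the proposed model it would come from the SVM-based incidence part), and the latency part is assumed to be exponential purely for illustration. The function names and the `rate` parameter are hypothetical.

```python
import math


def uncured_survival(t: float, rate: float) -> float:
    """Latency part: survival of uncured subjects.

    Assumption for illustration only: an exponential distribution
    with the given rate, not the semiparametric latency model.
    """
    return math.exp(-rate * t)


def population_survival(t: float, p_uncured: float, rate: float) -> float:
    """Mixture cure survival: (1 - pi) + pi * S_u(t).

    p_uncured (pi) is the incidence, i.e. the probability of being
    uncured; cured subjects never experience the event.
    """
    if not 0.0 <= p_uncured <= 1.0:
        raise ValueError("p_uncured must lie in [0, 1]")
    return (1.0 - p_uncured) + p_uncured * uncured_survival(t, rate)


if __name__ == "__main__":
    # The curve starts at 1 and levels off at the cured fraction 1 - pi.
    for t in (0.0, 1.0, 5.0, 50.0):
        print(t, round(population_survival(t, p_uncured=0.7, rate=0.5), 4))
```

The plateau of the population survival curve at the cured fraction is what distinguishes a cure model from a standard survival model, whose survival function tends to zero.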
The first and the last authors gratefully acknowledge the financial support from the China Scholarship Council. The first author’s work was partially supported by the Liaoning Social Science Planning Fund (L19CTJ001). The second author’s work was partially supported by a research grant from the Natural Sciences and Engineering Research Council of Canada. The last author’s work was also supported by the Fundamental Research Funds for the Central Universities (DUT19RC(3)042).
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Cite this article
Li, P., Peng, Y., Jiang, P. et al. A support vector machine based semiparametric mixture cure model. Comput Stat 35, 931–945 (2020). https://doi.org/10.1007/s00180-019-00931-w
Keywords
- Censored survival time
- Cure model
- Support vector machine
- EM algorithm
- Multiple imputation