Abstract
In clinical studies, one often encounters time-to-event data that are subject to right censoring and for which a fraction of the patients under study never experience the event of interest. Such data can be modeled using cure models in survival analysis. In the presence of a cure fraction, the mixture cure model is popular, since it allows one to model the probability of being cured (called the incidence) and the survival function of the uncured individuals (called the latency). In this paper, we develop a variable selection procedure for the incidence and latency parts of a mixture cure model, consisting of a logistic model for the incidence and a semiparametric accelerated failure time model for the latency. We use a penalized likelihood approach, based on adaptive LASSO penalties for each part of the model, and we consider two algorithms for optimizing the criterion function. Extensive simulations are carried out to assess the accuracy of the proposed selection procedure. Finally, we apply the proposed method to a real dataset on heart failure patients with left ventricular systolic dysfunction.
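To make the structure described in the abstract concrete, the following is a minimal Python sketch of a mixture cure model with a logistic incidence, an accelerated failure time latency, and an adaptive LASSO penalty. It is an illustration under stated assumptions, not the estimation procedure of the paper. All function and parameter names (`incidence`, `latency_survival`, `population_survival`, `adaptive_lasso_penalty`, `gamma`, `beta`, `s0`, `lam`) are hypothetical. The logistic model is written here for the probability of being uncured, so the cure probability is one minus that value. In the semiparametric model the error distribution is left unspecified; the standard normal survival function below is only a placeholder so that the example runs. The penalty uses the usual adaptive LASSO weights, the reciprocal of the absolute value of an initial estimate, which is assumed rather than taken from the paper.

```python
import math
from typing import Callable, Sequence


def incidence(x: Sequence[float], gamma: Sequence[float]) -> float:
    """Logistic probability of being uncured; gamma[0] is the intercept."""
    eta = gamma[0] + sum(g * xi for g, xi in zip(gamma[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))


def latency_survival(t: float, z: Sequence[float], beta: Sequence[float],
                     s0: Callable[[float], float]) -> float:
    """AFT latency: log T = beta'z + eps, so S_u(t | z) = S0(log t - beta'z),
    where S0 is the survival function of the error eps."""
    return s0(math.log(t) - sum(b * zi for b, zi in zip(beta, z)))


def population_survival(t: float, x: Sequence[float], z: Sequence[float],
                        gamma: Sequence[float], beta: Sequence[float],
                        s0: Callable[[float], float]) -> float:
    """Mixture cure model: S(t | x, z) = (1 - p(x)) + p(x) * S_u(t | z)."""
    p = incidence(x, gamma)
    return (1.0 - p) + p * latency_survival(t, z, beta, s0)


def adaptive_lasso_penalty(coefs: Sequence[float], initial: Sequence[float],
                           lam: float) -> float:
    """lam * sum_j |coef_j| / |initial_j|, with initial_j an unpenalized
    first-stage estimate (assumed nonzero)."""
    return lam * sum(abs(c) / abs(c0) for c, c0 in zip(coefs, initial))


def normal_error_survival(e: float) -> float:
    """Placeholder error distribution (assumption): standard normal."""
    return 0.5 * math.erfc(e / math.sqrt(2.0))


# Illustrative values only: with a zero linear predictor the uncured
# probability is 0.5, and at t = 1 with z = 0 the latency survival is 0.5.
s = population_survival(1.0, [1.0], [0.0], [0.0, 0.0], [1.0],
                        normal_error_survival)  # 0.75
pen = adaptive_lasso_penalty([1.0, -2.0], [2.0, 4.0], lam=0.5)  # 0.5
```

In a penalized likelihood approach of the kind the abstract describes, one such penalty would be attached to each part of the model (one for the incidence coefficients, one for the latency coefficients), each with its own tuning parameter.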
Acknowledgements
The authors acknowledge support from the European Research Council (2016-2022, Horizon 2020 / ERC grant agreement No. 694409). For the simulations, they used the infrastructure of the Flemish Supercomputer Center (VSC). Ingrid Van Keilegom also acknowledges funding from the FWO and F.R.S.-FNRS under the Excellence of Science (EOS) programme, project ASTeRISK (grant nr. 40007517).
About this article
Cite this article
Parsa, M., Taghavi-Shahri, S.M. & Van Keilegom, I. On variable selection in a semiparametric AFT mixture cure model. Lifetime Data Anal 30, 472–500 (2024). https://doi.org/10.1007/s10985-024-09619-w
DOI: https://doi.org/10.1007/s10985-024-09619-w