Statistics and Computing, Volume 26, Issue 3, pp 725–741

Variable selection for survival data with a class of adaptive elastic net techniques

  • Md Hasinur Rahaman Khan
  • J. Ewart H. Shaw


Accelerated failure time (AFT) models have proved useful in many contexts, though heavy censoring (as, for example, in cancer survival) and high dimensionality (as, for example, in microarray data) cause difficulties for model fitting and model selection. We propose new approaches to variable selection for censored data, based on AFT models optimized using regularized weighted least squares. The regularized technique uses a mixture of \(\ell_1\) and \(\ell_2\) norm penalties under two proposed elastic net type approaches: the adaptive elastic net and the weighted elastic net. These extend the original approaches proposed by Ghosh (Adaptive elastic net: an improvement of elastic net to achieve oracle properties, Technical Reports 2007) and Hong and Zhang (Math Model Nat Phenom 5(3):115–133 2010), respectively. We also extend the two proposed approaches by adding censored observations as constraints in their model optimization frameworks. The approaches are evaluated on microarray data and by simulation. We compare their performance with that of six other variable selection techniques: three are generally used for censored data and the other three are correlation-based greedy methods used for high-dimensional data.
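To make the idea of regularized weighted least squares for an AFT model concrete, the following is a minimal sketch and not the authors' implementation. It assumes the weights are Kaplan-Meier jump weights in the style of Stute's weighted least squares, and that the penalty takes the generic elastic net form \(\lambda_1 \sum_j a_j |\beta_j| + \lambda_2 \sum_j \beta_j^2\), where the optional factors \(a_j\) stand in for adaptive weights. The function names `stute_weights` and `penalized_objective`, the factor 0.5 on the fit term, and the omission of an intercept are all illustrative assumptions.

```python
import math


def stute_weights(times, events):
    """Kaplan-Meier jump weights for right-censored data.

    times  : observed (possibly censored) survival times
    events : 1 if the event was observed, 0 if censored
    Returns one weight per observation, in the input order.
    """
    n = len(times)
    # Sort by time; on ties, events are placed before censored observations.
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))
    weights = [0.0] * n
    surv = 1.0  # running product of ((n - j) / (n - j + 1)) ** delta_(j)
    for rank, idx in enumerate(order, start=1):
        d = events[idx]
        weights[idx] = d * surv / (n - rank + 1)  # censored points get weight 0
        if d:
            surv *= (n - rank) / (n - rank + 1)
    return weights


def penalized_objective(beta, X, log_times, weights, lam1, lam2, adaptive=None):
    """Weighted least squares fit term plus a generic elastic net penalty.

    adaptive : optional per-coefficient factors on the l1 term
               (all ones gives the plain elastic net penalty).
    """
    if adaptive is None:
        adaptive = [1.0] * len(beta)
    fit = 0.0
    for x_i, y_i, w_i in zip(X, log_times, weights):
        resid = y_i - sum(b * x for b, x in zip(beta, x_i))
        fit += w_i * resid ** 2
    l1 = sum(a * abs(b) for a, b in zip(adaptive, beta))
    l2 = sum(b * b for b in beta)
    return 0.5 * fit + lam1 * l1 + lam2 * l2


# Toy data (made up for illustration): the second observation is censored.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
X = [[0.1], [0.4], [0.9], [1.2]]

w = stute_weights(times, events)
print(w)  # [0.25, 0.0, 0.375, 0.375]

log_times = [math.log(t) for t in times]
print(penalized_objective([1.0], X, log_times, w, lam1=0.1, lam2=0.2))
```

In this toy example the censored observation receives zero weight and its mass is redistributed to the later uncensored observations, so the weights still sum to one. A fitting routine would minimize `penalized_objective` over `beta`; that step is left out here.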


Adaptive elastic net, AFT, Variable selection, Stute’s weighted least squares, Weighted elastic net



The first author is grateful to the Centre for Research in Statistical Methodology (CRiSM), Department of Statistics, University of Warwick, UK, for funding his PhD study.


  1. Akaike, H.: Information theory as an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Second International Symposium on Information Theory, pp. 267–281. Akademiai Kiado, Budapest (1973)
  2. Antoniadis, A., Fryzlewicz, P., Letue, F.: The Dantzig selector in Cox’s proportional hazards model. Scand. J. Stat. 37(4), 531–552 (2010)
  3. Buckley, J., James, I.: Linear regression with censored data. Biometrika 66, 429–436 (1979)
  4. Bühlmann, P., Kalisch, M., Maathuis, M.H.: Variable selection in high-dimensional linear models: partially faithful distributions and the PC-simple algorithm. Biometrika 97(2), 261–278 (2010)
  5. Cai, T., Huang, J., Tian, L.: Regularized estimation for the accelerated failure time model. Biometrics 65, 394–404 (2009)
  6. Candes, E., Tao, T.: The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\). Ann. Stat. 35(6), 2313–2351 (2007)
  7. Cho, H., Fryzlewicz, P.: High dimensional variable selection via tilting. J. R. Stat. Soc. Ser. B 74(3), 593–622 (2012)
  8. Cox, D.R.: Regression models and life-tables. J. R. Stat. Soc. Ser. B 34, 187–220 (1972)
  9. Datta, S., Le-Rademacher, J., Datta, S.: Predicting patient survival from microarray data by accelerated failure time modeling using partial least squares and LASSO. Biometrics 63, 259–271 (2007)
  10. Efron, B.: The two sample problem with censored data. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 4, pp. 831–853. Prentice Hall, New York (1967)
  11. Efron, B., Tibshirani, R.: An Introduction to the Bootstrap. Chapman and Hall, New York (1993)
  12. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32, 407–499 (2004)
  13. Engler, D., Li, Y.: Survival analysis with high-dimensional covariates: an application in microarray studies. Stat. Appl. Genet. Mol. Biol. 8(1), 1–22 (2009). (Article 14)
  14. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
  15. Fan, J., Li, R.: Variable selection for Cox’s proportional hazards model and frailty model. Ann. Stat. 30, 74–99 (2002)
  16. Fan, J., Lv, J.: Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B 70(5), 849–911 (2008)
  17. Faraggi, D., Simon, R.: Bayesian variable selection method for censored survival data. Biometrics 54, 1475–1485 (1998)
  18. Frank, I.E., Friedman, J.H.: A statistical view of some chemometrics regression tools. Technometrics 35(2), 109–135 (1993)
  19. Gehan, E.A.: A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 52, 203–223 (1965)
  20. Ghosh, S.: On the grouped selection and model complexity of the adaptive elastic net. Stat. Comput. 21(3), 451–462 (2011)
  21. Ghosh, S.: Adaptive elastic net: an improvement of elastic net to achieve oracle properties. Technical Reports, Indiana University-Purdue University, Indianapolis, (PR no. 07–01) (2007)
  22. Gui, J., Li, H.: Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics 21, 3001–3008 (2005)
  23. Hong, D., Zhang, F.: Weighted elastic net model for mass spectrometry imaging processing. Math. Model. Nat. Phenom. 5(3), 115–133 (2010)
  24. Hu, S., Rao, J.S.: Sparse penalization with censoring constraints for estimating high dimensional AFT models with applications to microarray data analysis. Technical Reports, University of Miami (2010)
  25. Huang, J., Harrington, D.: Iterative partial least squares with right-censored data analysis: a comparison to other dimension reduction techniques. Biometrics 61(1), 17–24 (2005)
  26. Huang, J., Ma, S.: Variable selection in the accelerated failure time model via the bridge method. Lifetime Data Anal. 16, 176–195 (2010)
  27. Huang, J., Ma, S., Xie, H.: Regularized estimation in the accelerated failure time model with high-dimensional covariates. Biometrics 62, 813–820 (2006)
  28. Hunter, D.R., Li, R.: Variable selection using MM algorithms. Ann. Stat. 33(4), 1617–1642 (2005)
  29. Jin, Z., Lin, D., Wei, L.J., Ying, Z.L.: Rank-based inference for the accelerated failure time model. Biometrika 90, 341–353 (2003)
  30. Jin, Z., Lin, D.Y., Ying, Z.: On least-squares regression with censored data. Biometrika 93(1), 147–161 (2006)
  31. Khan, M.H.R., Shaw, J.E.H.: AdapEnetClass: a class of adaptive elastic net methods for censored data. R package version 1.1 (2014)
  32. Khan, M.H.R.: Variable selection and estimation procedures for high-dimensional survival data. Ph.D. Thesis, Department of Statistics, University of Warwick (2013)
  33. Khan, M.H.R., Shaw, J.E.H.: On dealing with censored largest observations under weighted least squares. CRiSM Working Paper, No 13–07, Department of Statistics, University of Warwick (2013b)
  34. Khan, M.H.R., Shaw, J.E.H.: Variable selection with the modified Buckley-James method and the Dantzig selector for high-dimensional survival data. In: 59th ISI World Statistics Congress Proceedings, Hong Kong, pp. 4239–4244, 25–30 Aug (2013c)
  35. Kriegeskorte, N., Simmons, W.K., Bellgowan, P.S.F., Baker, C.I.: Circular analysis in systems neuroscience: the dangers of double dipping. Nat. Neurosci. 12(5), 535–540 (2009)
  36. Li, H., Luan, Y.: Kernel Cox regression models for linking gene expression profiles to censored survival data. Pac. Symp. Biocomput. 8, 65–76 (2003)
  37. Meinshausen, N., Bühlmann, P.: Stability selection. J. R. Stat. Soc. Ser. B 72(4), 417–473 (2010)
  38. Peduzzi, P.N., Hardy, R.J., Holford, T.R.: A stepwise variable selection procedure for nonlinear regression models. Biometrics 36, 511–516 (1980)
  39. Radchenko, P., James, G.M.: Improved variable selection with Forward-Lasso adaptive shrinkage. Ann. Appl. Stat. 5(1), 427–448 (2011)
  40. Rosenwald, A., Wright, G., Wiestner, A., Chan, W., Connors, J., Campo, E., Gascoyne, R., Grogan, T., Muller Hermelink, H., Smeland, E., Chiorazzi, M., Giltnane, J., Hurt, E., Zhao, H., Averett, L., Henrickson, S., Yang, L., Powell, J., Wilson, W., Jaffe, E., Simon, R., Klausner, R., Montserrat, E., Bosch, F., Greiner, T., Weisenburger, D., Sanger, W., Dave, B., Lynch, J., Vose, J., Armitage, J., Fisher, R., Miller, T., LeBlanc, M., Ott, G., Kvaloy, S., Holte, H., Delabie, J., Staudt, L.: The proliferation gene expression signature is a quantitative integrator of oncogenic events that predicts survival in mantle cell lymphoma. Cancer Cell 3, 185–197 (2003)
  41. Sha, N., Tadesse, M.G., Vannucci, M.: Bayesian variable selection for the analysis of microarray data with censored outcome. Bioinformatics 22(18), 2262–2268 (2006)
  42. Stute, W.: Consistent estimation under random censorship when covariables are available. J. Multivar. Anal. 45, 89–103 (1993)
  43. Stute, W.: Distributional convergence under random censorship when covariables are present. Scand. J. Stat. 23, 461–471 (1996)
  44. Swerdlow, S., Williams, M.: From centrocytic to mantle cell lymphoma: a clinicopathologic and molecular review of 3 decades. Hum. Pathol. 33, 7–20 (2002)
  45. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)
  46. Tibshirani, R.: The lasso method for variable selection in the Cox model. Stat. Med. 16, 385–395 (1997)
  47. Wang, S., Nan, B., Zhu, J., Beer, D.G.: Doubly penalized Buckley-James method for survival data with high-dimensional covariates. Biometrics 64, 132–140 (2008)
  48. Wu, Y.: Elastic net for Cox’s proportional hazards model with a solution path algorithm. Stat. Sin. 22, 271–294 (2012)
  49. Ying, Z.: A large sample study of rank estimation for censored regression data. Ann. Stat. 21(1), 76–99 (1993)
  50. Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B 68, 49–67 (2006)
  51. Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38(2), 894–942 (2010)
  52. Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
  53. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67, 301–320 (2005)
  54. Zou, H., Zhang, H.H.: On the adaptive elastic-net with a diverging number of parameters. Ann. Stat. 37(4), 1733–1751 (2009)

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Applied Statistics, Institute of Statistical Research and Training, University of Dhaka, Dhaka, Bangladesh
  2. Department of Statistics, University of Warwick, Coventry, UK
