Variable Selection for Time-to-Event Data

  • Ai Ni
  • Chi Song
Part of the Methods in Molecular Biology book series (MIMB, volume 2194)


With the increasing availability of large-scale biomedical and -omics data, researchers have unprecedented opportunities to discover novel biomarkers for clinical outcomes. At the same time, they face the challenge of accurately identifying the important biomarkers among numerous candidates. Many novel statistical methodologies have been developed over the last couple of decades to tackle this challenge. When the clinical outcome is a time-to-event, special statistical methods are needed because of the presence of censoring. In this article, we review some of the most commonly used modern statistical methodologies for variable selection with time-to-event data. The reviewed methods are classified into three broad categories: filter-test-based methods, penalized regression methods, and machine learning methods.

Key words

Variable selection, Time-to-event data, Filter test, Penalized regression, Machine learning



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2021

Authors and Affiliations

  1. Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, USA
