Variable Selection for Time-to-Event Data

Ni, Ai; Song, Chi

doi:10.1007/978-1-0716-0849-4_5

Protocol
First Online: 15 September 2020

Part of the book series: Methods in Molecular Biology (MIMB, volume 2194)

Abstract

With the increasing availability of large-scale biomedical and -omics data, researchers have unprecedented opportunities to discover novel biomarkers for clinical outcomes. At the same time, they face the considerable challenge of accurately identifying important biomarkers among numerous candidates. Many novel statistical methodologies have been developed over the last couple of decades to tackle this challenge. When the clinical outcome is a time-to-event variable, special statistical methods are needed because of censoring. In this article, we review some of the most commonly used modern statistical methodologies for variable selection with time-to-event data. The reviewed methods fall into three broad categories: filter-test-based methods, penalized regression methods, and machine learning methods.
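
As a rough illustration of the first category, the sketch below screens binary biomarkers one at a time with a two-sample log-rank test, which accounts for censoring, and then applies the Benjamini-Hochberg procedure to the resulting p-values. It is a toy example under stated assumptions and not the authors' protocol: the data, the marker names, the function names, and the 0.05 false discovery rate are all hypothetical.

```python
import math

def logrank_pvalue(times, events, group):
    """Two-sample log-rank p-value for a binary marker (group coded 0/1).

    events[i] is 1 if subject i had the event and 0 if censored.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, overall and in group 1.
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, group) if ti >= t and g == 1)
        # Events occurring exactly at time t, overall and in group 1.
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, group)
                 if ti == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def benjamini_hochberg(p_values, fdr=0.05):
    """Return the names of markers rejected at the given false discovery rate."""
    names = sorted(p_values, key=p_values.get)
    m = len(names)
    k = 0
    for rank, name in enumerate(names, start=1):
        if p_values[name] <= rank * fdr / m:
            k = rank  # largest rank whose p-value is under its threshold
    return names[:k]

# Hypothetical data: 12 subjects, follow-up times, and event indicators.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
events = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0]
markers = {
    "marker_a": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # tracks early events
    "marker_b": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to timing
}

p_values = {name: logrank_pvalue(times, events, g) for name, g in markers.items()}
selected = benjamini_hochberg(p_values, fdr=0.05)
print(p_values, selected)
```

In this toy data set only the marker that tracks the early events survives the filter. With real -omics data the same loop would run over thousands of candidates, and continuous markers would typically call for a univariate Cox model rather than a log-rank test.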

Ai Ni and Chi Song contributed equally to this work.

Author information

Authors and Affiliations

Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA
Ai Ni & Chi Song

Corresponding authors

Correspondence to Ai Ni or Chi Song.

Editor information

Editors and Affiliations

Department of Cutaneous Oncology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL, USA
Joseph Markowitz

Cite this protocol

Ni, A., Song, C. (2021). Variable Selection for Time-to-Event Data. In: Markowitz, J. (eds) Translational Bioinformatics for Therapeutic Development. Methods in Molecular Biology, vol 2194. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0849-4_5

DOI: https://doi.org/10.1007/978-1-0716-0849-4_5
Published: 15 September 2020
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0848-7
Online ISBN: 978-1-0716-0849-4
eBook Packages: Springer Protocols
