Variable selection for survival data with a class of adaptive elastic net techniques

Khan, Md Hasinur Rahaman; Shaw, J. Ewart H.

doi:10.1007/s11222-015-9555-8

Published: 17 March 2015

Volume 26, pages 725–741, (2016)
Statistics and Computing

Md Hasinur Rahaman Khan¹ &
J. Ewart H. Shaw²

Abstract

Accelerated failure time (AFT) models have proved useful in many contexts, though heavy censoring (as, for example, in cancer survival) and high dimensionality (as, for example, in microarray data) cause difficulties for model fitting and model selection. We propose new approaches to variable selection for censored data, based on AFT models optimized using regularized weighted least squares. The regularized technique uses a mixture of \(\ell_1\) and \(\ell_2\) norm penalties under two proposed elastic net type approaches: the adaptive elastic net and the weighted elastic net. These extend the original approaches proposed by Ghosh (Adaptive elastic net: an improvement of elastic net to achieve oracle properties, Technical Reports 2007) and Hong and Zhang (Math Model Nat Phenom 5(3):115–133 2010), respectively. We also extend the two proposed approaches by adding censored observations as constraints in their model optimization frameworks. The approaches are evaluated on microarray data and by simulation. We compare their performance with six other variable selection techniques: three are generally used for censored data, and the other three are correlation-based greedy methods used for high-dimensional data.
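
The abstract does not state the exact form of the estimators, so the following is only a minimal sketch of the general idea of regularized weighted least squares for an AFT model. It assumes that the observation weights are Kaplan-Meier weights in the sense of Stute (1993), which the reference list suggests but the abstract does not confirm, and that the objective is 0.5 * sum_i w_i (y_i - x_i'b)^2 + lam1 * sum_j v_j |b_j| + 0.5 * lam2 * sum_j b_j^2, where y is the log observed time. The function names km_weights and weighted_enet, the coordinate descent solver, and the optional per-coefficient factors v_j (l1_weights) are illustrative choices, not the interface of the AdapEnetClass package. The censoring-constraint extension mentioned in the abstract is not implemented here.

```python
import numpy as np


def km_weights(y, delta):
    """Kaplan-Meier (Stute-type) weights, returned in the original order of y.

    y: observed (log) times; delta: 1 = event observed, 0 = censored.
    """
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=float)
    n = len(y)
    # Sort by time; at tied times, uncensored observations come first.
    order = np.lexsort((1 - delta, y))
    d = delta[order]
    ranks = np.arange(1, n + 1)
    # Factor ((n - j) / (n - j + 1)) ** delta_(j) for j = 1, ..., n.
    factors = ((n - ranks) / (n - ranks + 1.0)) ** d
    # Product of the factors over j < i (empty product = 1 for i = 1).
    prev_prod = np.concatenate(([1.0], np.cumprod(factors[:-1])))
    w_sorted = d / (n - ranks + 1.0) * prev_prod
    w = np.empty(n)
    w[order] = w_sorted
    return w


def soft_threshold(z, t):
    """Soft-thresholding operator used in the l1 coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_enet(X, y, w, lam1, lam2, l1_weights=None, n_iter=1000, tol=1e-10):
    """Coordinate descent for the weighted elastic net objective above.

    l1_weights are optional per-coefficient l1 factors (assumption: an
    adaptive variant would derive them from an initial estimate).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n, p = X.shape
    if l1_weights is None:
        l1_weights = np.ones(p)
    beta = np.zeros(p)
    resid = y.copy()  # current residual y - X @ beta
    col_norm = (w[:, None] * X ** 2).sum(axis=0)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            denom = col_norm[j] + lam2
            if denom == 0.0:
                continue  # column carries no weight and no ridge term
            old = beta[j]
            # Weighted inner product of column j with the partial residual.
            rho = np.dot(w * X[:, j], resid) + col_norm[j] * old
            new = soft_threshold(rho, lam1 * l1_weights[j]) / denom
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(40, 5))
    log_t = X @ np.array([1.0, 0.0, 0.0, -1.0, 0.0]) + 0.2 * rng.normal(size=40)
    delta = (rng.random(40) < 0.7).astype(float)  # roughly 30% censored
    w = km_weights(log_t, delta)
    print(weighted_enet(X, log_t, w, lam1=0.02, lam2=0.01))
```

In this sketch censored observations receive zero weight and their mass is redistributed to later uncensored observations, so the fit reduces to ordinary (penalized) least squares with weights 1/n when nothing is censored. Tuning of lam1 and lam2, for example by cross-validation, is left out.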

References

Akaike, H.: Information theory as an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Second International Symposium on Information Theory, pp. 267–281. Akademiai Kiado, Budapest (1973)
Antoniadis, A., Fryzlewicz, P., Letue, F.: The Dantzig selector in Cox’s proportional hazards model. Scand. J. Stat. 37(4), 531–552 (2010)
Buckley, J., James, I.: Linear regression with censored data. Biometrika 66, 429–436 (1979)
Bühlmann, P., Kalisch, M., Maathuis, M.H.: Variable selection in high-dimensional linear models: partially faithful distributions and the PC-simple algorithm. Biometrika 97(2), 261–278 (2010)
Cai, T., Huang, J., Tian, L.: Regularized estimation for the accelerated failure time model. Biometrics 65, 394–404 (2009)
Candes, E., Tao, T.: The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\). Ann. Stat. 35(6), 2313–2351 (2007)
Cho, H., Fryzlewicz, P.: High dimensional variable selection via tilting. J. R. Stat. Soc. Ser. B 74(3), 593–622 (2012)
Cox, D.R.: Regression models and life-tables. J. R. Stat. Soc. Ser. B 34, 187–220 (1972)
Datta, S., Le-Rademacher, J., Datta, S.: Predicting patient survival from microarray data by accelerated failure time modeling using partial least squares and LASSO. Biometrics 63, 259–271 (2007)
Efron, B.: The two sample problem with censored data. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 4, pp. 831–853. Prentice Hall, New York (1967)
Efron, B., Tibshirani, R.: An Introduction to the Bootstrap. Chapman and Hall, New York (1993)
Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32, 407–499 (2004)
Engler, D., Li, Y.: Survival analysis with high-dimensional covariates: an application in microarray studies. Stat. Appl. Genet. Mol. Biol. 8(1), 1–22 (2009). (Article 14)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
Fan, J., Li, R.: Variable selection for Cox’s proportional hazards model and frailty model. Ann. Stat. 30, 74–99 (2002)
Fan, J., Lv, J.: Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B 70(5), 849–911 (2008)
Faraggi, D., Simon, R.: Bayesian variable selection method for censored survival data. Biometrics 54, 1475–1485 (1998)
Frank, I.E., Friedman, J.H.: A statistical view of some chemometrics regression tools. Technometrics 35(2), 109–135 (1993)
Gehan, E.A.: A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 52, 203–223 (1965)
Ghosh, S.: On the grouped selection and model complexity of the adaptive elastic net. Stat. Comput. 21(3), 451–462 (2011)
Ghosh, S.: Adaptive elastic net: an improvement of elastic net to achieve oracle properties. Technical Reports, Indiana University-Purdue University, Indianapolis, (PR no. 07–01) (2007)
Gui, J., Li, H.: Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics 21, 3001–3008 (2005)
Hong, D., Zhang, F.: Weighted elastic net model for mass spectrometry imaging processing. Math. Model. Nat. Phenom. 5(3), 115–133 (2010)
Hu, S., Rao, J.S.: Sparse penalization with censoring constraints for estimating high dimensional AFT models with applications to microarray data analysis. Technical Reports, University of Miami (2010)
Huang, J., Harrington, D.: Iterative partial least squares with right-censored data analysis: a comparison to other dimension reduction techniques. Biometrics 61(1), 17–24 (2005)
Huang, J., Ma, S.: Variable selection in the accelerated failure time model via the bridge method. Lifetime Data Anal. 16, 176–195 (2010)
Huang, J., Ma, S., Xie, H.: Regularized estimation in the accelerated failure time model with high-dimensional covariates. Biometrics 62, 813–820 (2006)
Hunter, D.R., Li, R.: Variable selection using MM algorithms. Ann. Stat. 33(4), 1617–1642 (2005)
Jin, Z., Lin, D., Wei, L.J., Ying, Z.L.: Rank-based inference for the accelerated failure time model. Biometrika 90, 341–353 (2003)
Jin, Z., Lin, D.Y., Ying, Z.: On least-squares regression with censored data. Biometrika 93(1), 147–161 (2006)
Khan, M.H.R., Shaw, J.E.H.: AdapEnetClass: a class of adaptive elastic net methods for censored data. R package version 1.1 (2014)
Khan, M.H.R.: Variable selection and estimation procedures for high-dimensional survival data. Ph.D. Thesis, Department of Statistics, University of Warwick (2013)
Khan, M.H.R., Shaw, J.E.H.: On dealing with censored largest observations under weighted least squares. CRiSM Working Paper, No 13–07 Department of Statistics, University of Warwick (2013b)
Khan, M.H.R., Shaw, J.E.H.: Variable selection with the modified Buckley-James method and the Dantzig selector for high-dimensional survival data. In: 59th ISI World Statistics Congress Proceedings, Hong Kong, pp. 4239–4244, 25–30 Aug 2013c
Kriegeskorte, N., Simmons, W.K., Bellgowan, P.S.F., Baker, C.I.: Circular analysis in systems neuroscience: the dangers of double dipping. Nat. Neurosci. 12(5), 535–540 (2009)
Li, H., Luan, Y.: Kernel Cox regression models for linking gene expression profiles to censored survival data. Pac. Symp. Biocomput. 8, 65–76 (2003)
Meinshausen, N., Bühlmann, P.: Stability selection. J. R. Stat. Soc. Ser. B 72(4), 417–473 (2010)
Peduzzi, P.N., Hardy, R.J., Holford, T.R.: A stepwise variable selection procedure for nonlinear regression models. Biometrics 36, 511–516 (1980)
Radchenko, P., James, G.M.: Improved variable selection with Forward-Lasso adaptive shrinkage. Ann. Appl. Stat. 5(1), 427–448 (2011)
Rosenwald, A., Wright, G., Wiestner, A., Chan, W., Connors, J., Campo, E., Gascoyne, R., Grogan, T., Muller Hermelink, H., Smeland, E., Chiorazzi, M., Giltnane, J., Hurt, E., Zhao, H., Averett, L., Henrickson, S., Yang, L., Powell, J., Wilson, W., Jaffe, E., Simon, R., Klausner, R., Montserrat, E., Bosch, F., Greiner, T., Weisenburger, D., Sanger, W., Dave, B., Lynch, J., Vose, J., Armitage, J., Fisher, R., Miller, T., LeBlanc, M., Ott, G., Kvaloy, S., Holte, H., Delabie, J., Staudt, L.: The proliferation gene expression signature is a quantitative integrator of oncogenic events that predicts survival in mantle cell lymphoma. Cancer Cell 3, 185–197 (2003)
Sha, N., Tadesse, M.G., Vannucci, M.: Bayesian variable selection for the analysis of microarray data with censored outcome. Bioinformatics 22(18), 2262–2268 (2006)
Stute, W.: Consistent estimation under random censorship when covariables are available. J. Multivar. Anal. 45, 89–103 (1993)
Stute, W.: Distributional convergence under random censorship when covariables are present. Scand. J. Stat. 23, 461–471 (1996)
Swerdlow, S., Williams, M.: From centrocytic to mantle cell lymphoma: a clinicopathologic and molecular review of 3 decades. Hum. Pathol. 33, 7–20 (2002)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)
Tibshirani, R.: The lasso method for variable selection in the Cox model. Stat. Med. 16, 385–395 (1997)
Wang, S., Nan, B., Zhu, J., Beer, D.G.: Doubly penalized Buckley-James method for survival data with high-dimensional covariates. Biometrics 64, 132–140 (2008)
Wu, Y.: Elastic net for Cox’s proportional hazards model with a solution path algorithm. Stat. Sin. 22, 271–294 (2012)
Ying, Z.: A large sample study of rank estimation for censored regression data. Ann. Stat. 21(1), 76–99 (1993)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B 68, 49–67 (2006)
Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38(2), 894–942 (2010)
Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67, 301–320 (2005)
Zou, H., Zhang, H.H.: On the adaptive elastic-net with a diverging number of parameters. Ann. Stat. 37(4), 1733–1751 (2009)

Acknowledgments

The first author is grateful to the Centre for Research in Statistical Methodology (CRiSM), Department of Statistics, University of Warwick, UK, for offering research funding for his PhD study.

Author information

Authors and Affiliations

Applied Statistics, Institute of Statistical Research and Training, University of Dhaka, Dhaka, 1000, Bangladesh
Md Hasinur Rahaman Khan
Department of Statistics, University of Warwick, Coventry, CV4 7AL, UK
J. Ewart H. Shaw

Corresponding author

Correspondence to Md Hasinur Rahaman Khan.

About this article

Cite this article

Khan, M.H.R., Shaw, J.E.H. Variable selection for survival data with a class of adaptive elastic net techniques. Stat Comput 26, 725–741 (2016). https://doi.org/10.1007/s11222-015-9555-8

Received: 17 April 2014
Accepted: 02 March 2015
Published: 17 March 2015
Issue Date: May 2016
DOI: https://doi.org/10.1007/s11222-015-9555-8
