Variable selection for survival data with a class of adaptive elastic net techniques
Accelerated failure time (AFT) models have proved useful in many contexts, though heavy censoring (as, for example, in cancer survival) and high dimensionality (as, for example, in microarray data) cause difficulties for model fitting and model selection. We propose new approaches to variable selection for censored data, based on AFT models optimized using regularized weighted least squares. The regularization uses a mixture of \(\ell_1\) and \(\ell_2\) norm penalties under two proposed elastic net type approaches: the adaptive elastic net and the weighted elastic net. These extend the original approaches proposed by Ghosh (Adaptive elastic net: an improvement of elastic net to achieve oracle properties, Technical Reports 2007) and Hong and Zhang (Math Model Nat Phenom 5(3):115–133 2010), respectively. We also extend the two proposed approaches by adding censored observations as constraints in their model optimization frameworks. The approaches are evaluated on microarray data and by simulation. We compare their performance with six other variable selection techniques: three are generally used for censored data, and the other three are correlation-based greedy methods used for high-dimensional data.
Keywords: Adaptive elastic net; AFT; Variable selection; Stute’s weighted least squares; Weighted elastic net
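As a rough illustration of the regularized weighted least squares idea described in the abstract, the sketch below computes Kaplan-Meier (Stute) weights from the ordered responses and fits a plain elastic net to the weighted data by coordinate descent. It is a minimal sketch under stated assumptions, not the adaptive or weighted elastic net proposed here: it uses no adaptive penalty weights and no censoring constraints. The function names, the penalty scaling \(\lambda_1 \Vert \beta \Vert_1 + \lambda_2 \Vert \beta \Vert_2^2\), the tie-breaking rule, and the omission of an intercept (covariates and response assumed centered) are all assumptions made for illustration.

```python
from typing import List, Sequence


def stute_weights(y: Sequence[float], events: Sequence[int]) -> List[float]:
    """Kaplan-Meier jump weights, returned in the original observation order.

    y[i] is the (log) observed time; events[i] is 1 for an observed failure
    and 0 for a censored observation.
    """
    n = len(y)
    # Sort by response; at ties, failures come before censored observations
    # (a conventional choice, assumed here).
    order = sorted(range(n), key=lambda i: (y[i], -events[i]))
    weights = [0.0] * n
    surv = 1.0  # running product over the earlier ordered observations
    for rank, idx in enumerate(order, start=1):
        weights[idx] = events[idx] / (n - rank + 1) * surv
        surv *= ((n - rank) / (n - rank + 1)) ** events[idx]
    return weights


def soft_threshold(z: float, gamma: float) -> float:
    """Soft-thresholding operator used by the l1 part of the penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def weighted_elastic_net(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    w: Sequence[float],
    lam1: float,
    lam2: float,
    n_iter: int = 200,
) -> List[float]:
    """Coordinate descent for the (assumed) objective

    0.5 * sum_i w_i (y_i - x_i' beta)^2 + lam1 * |beta|_1 + lam2 * |beta|_2^2
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # weighted covariance of column j with partial residual
            z = 0.0    # weighted sum of squares of column j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += w[i] * X[i][j] * (y[i] - others)
                z += w[i] * X[i][j] ** 2
            denom = z + 2.0 * lam2
            beta[j] = soft_threshold(rho, lam1) / denom if denom > 0 else 0.0
    return beta
```

A call such as `weighted_elastic_net(X, y, stute_weights(y, events), lam1, lam2)` returns the coefficient vector. Censored observations receive zero weight and influence the fit only through the weights of later failures, which is one reason heavy censoring makes estimation difficult.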
The first author is grateful to the Centre for Research in Statistical Methodology (CRiSM), Department of Statistics, University of Warwick, UK, for offering research funding for his PhD study.
- Akaike, H.: Information theory as an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Second International Symposium on Information Theory, pp. 267–281. Akademiai Kiado, Budapest (1973)
- Efron, B.: The two sample problem with censored data. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 4, pp. 831–853. Prentice Hall, New York (1967)
- Ghosh, S.: Adaptive elastic net: an improvement of elastic net to achieve oracle properties. Technical Reports, Indiana University-Purdue University, Indianapolis (PR no. 07–01) (2007)
- Hu, S., Rao, J.S.: Sparse penalization with censoring constraints for estimating high dimensional AFT models with applications to microarray data analysis. Technical Reports, University of Miami (2010)
- Khan, M.H.R., Shaw, J.E.H.: AdapEnetClass: a class of adaptive elastic net methods for censored data. R package version 1.1 (2014)
- Khan, M.H.R.: Variable selection and estimation procedures for high-dimensional survival data. Ph.D. Thesis, Department of Statistics, University of Warwick (2013)
- Khan, M.H.R., Shaw, J.E.H.: On dealing with censored largest observations under weighted least squares. CRiSM Working Paper No. 13–07, Department of Statistics, University of Warwick (2013b)
- Khan, M.H.R., Shaw, J.E.H.: Variable selection with the modified Buckley-James method and the Dantzig selector for high-dimensional survival data. In: 59th ISI World Statistics Congress Proceedings, Hong Kong, 25–30 Aug, pp. 4239–4244 (2013c)
- Rosenwald, A., Wright, G., Wiestner, A., Chan, W., Connors, J., Campo, E., Gascoyne, R., Grogan, T., Muller Hermelink, H., Smeland, E., Chiorazzi, M., Giltnane, J., Hurt, E., Zhao, H., Averett, L., Henrickson, S., Yang, L., Powell, J., Wilson, W., Jaffe, E., Simon, R., Klausner, R., Montserrat, E., Bosch, F., Greiner, T., Weisenburger, D., Sanger, W., Dave, B., Lynch, J., Vose, J., Armitage, J., Fisher, R., Miller, T., LeBlanc, M., Ott, G., Kvaloy, S., Holte, H., Delabie, J., Staudt, L.: The proliferation gene expression signature is a quantitative integrator of oncogenic events that predicts survival in mantle cell lymphoma. Cancer Cell 3, 185–197 (2003)