Variable selection in the accelerated failure time model via the bridge method
 Jian Huang,
 Shuangge Ma
Abstract
In high-throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with the development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expressions are associated with disease-free or overall survival. Because of the high dimensionality of gene expression data, standard survival analysis techniques cannot be directly applied. In addition, among the thousands of genes surveyed, only a subset is disease-associated, so gene selection is needed along with estimation. In this article, we model the relationship between gene expressions and survival using the accelerated failure time (AFT) model. We use bridge penalization for regularized estimation and gene selection. An efficient iterative computational algorithm is proposed. Tuning parameters are selected using V-fold cross-validation. We use a resampling method to evaluate the prediction performance of the bridge estimator and the relative stability of the identified genes. We show that the proposed bridge estimator is selection consistent under appropriate conditions. Analysis of two lymphoma prognostic studies suggests that the bridge estimator can identify a small number of genes and can have better prediction performance than the Lasso.
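For orientation, the model and penalty named in the abstract can be written out in generic notation. This is only a sketch: the symbols $T_i$, $x_i$, $\alpha$, $\beta$, $\lambda$, $\gamma$, $d$ and the unspecified loss $L_n$ are illustrative assumptions, not notation taken from the abstract, and the abstract does not say which loss is used to account for censoring.

```latex
% AFT model: the log failure time is linear in the gene expressions x_i
\log T_i = \alpha + x_i^{\top}\beta + \varepsilon_i, \qquad i = 1, \dots, n,

% Bridge-penalized estimator; L_n is a placeholder (assumed) loss for censored data
(\hat{\alpha}, \hat{\beta}) = \arg\min_{\alpha,\,\beta}
  \Big\{ L_n(\alpha, \beta) + \lambda \sum_{j=1}^{d} |\beta_j|^{\gamma} \Big\},
  \qquad \lambda > 0,\ \gamma > 0.
```

In this generic form, $\gamma = 1$ gives the Lasso penalty and $\gamma = 2$ the ridge penalty; bridge penalties with $0 < \gamma < 1$ are the ones usually associated with sparse selection.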
 Title
 Variable selection in the accelerated failure time model via the bridge method
 Journal

Lifetime Data Analysis
Volume 16, Issue 2, pp 176-195
 Cover Date
 2010-04-01
 DOI
 10.1007/s10985-009-9144-2
 Print ISSN
 1380-7870
 Online ISSN
 1572-9249
 Publisher
 Springer US
 Keywords

 Bridge penalization
 Censored data
 High dimensional data
 Selection consistency
 Stability
 Sparse model
 Authors

 Jian Huang (1, 2)
 Shuangge Ma (3)
 Author Affiliations

 1. Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA, 52242, USA
 2. Department of Biostatistics, University of Iowa, Iowa City, IA, 52242, USA
 3. Department of Epidemiology and Public Health, Yale University, New Haven, CT, 06520, USA