Penalized empirical likelihood inference for sparse additive hazards regression with a diverging number of covariates
High-dimensional sparse modeling with censored survival data is of great practical importance, as exemplified by applications in high-throughput genomic data analysis. In this paper, we propose a class of regularization methods that integrate the penalized empirical likelihood and pseudoscore approaches for variable selection and estimation in sparse, high-dimensional additive hazards regression models. When the number of covariates grows with the sample size, we establish the asymptotic properties of the resulting estimator and the oracle property of the proposed method. The proposed estimator is shown to be more efficient than that obtained from the non-concave penalized likelihood approach in the literature. Based on a penalized empirical likelihood ratio statistic, we further develop a nonparametric likelihood approach for testing linear hypotheses on the regression coefficients and for constructing the corresponding confidence regions. Simulation studies are carried out to evaluate the performance of the proposed methodology, and two real data sets are analyzed.
Keywords: Penalized empirical likelihood; Empirical likelihood ratio; Oracle property; Smoothly clipped absolute deviation; Survival data; Variable selection
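The keywords name the smoothly clipped absolute deviation (SCAD) penalty, a standard non-concave penalty in this literature. The abstract does not say how the penalty enters the penalized empirical likelihood, so the following is only a minimal sketch of the textbook SCAD penalty function itself, not the authors' implementation. The function name `scad_penalty`, the argument names, and the default `a = 3.7` (a value commonly used in the SCAD literature) are assumptions made for illustration.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Textbook SCAD penalty evaluated at a single coefficient.

    theta : coefficient value (hypothetical input)
    lam   : tuning parameter, must be positive
    a     : shape parameter, must exceed 2 (3.7 is an assumed default)
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the lasso.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the penalization rate.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for value in (0.0, 0.5, 2.0, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

The penalty is continuous at both breakpoints, and its flat tail is the feature usually associated with the oracle property: large coefficients incur no additional shrinkage, while small ones are penalized at the lasso rate.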
We are grateful to an Associate Editor and two anonymous referees for their very constructive comments and suggestions which helped improve the paper greatly. This research is supported partly by the Singapore Ministry of Education Academic Research Fund Tier 1 (RG30/12), Tier 2 (MOE2013-T2-2-118) and the National Natural Science Foundation of China (Grant No. 71420107025).