
Statistics and Computing, Volume 27, Issue 5, pp 1347–1364

Penalized empirical likelihood inference for sparse additive hazards regression with a diverging number of covariates

  • Shanshan Wang
  • Liming Xiang

Abstract

High-dimensional sparse modeling with censored survival data is of great practical importance, as exemplified by applications in high-throughput genomic data analysis. In this paper, we propose a class of regularization methods, integrating both the penalized empirical likelihood and pseudoscore approaches, for variable selection and estimation in sparse and high-dimensional additive hazards regression models. When the number of covariates grows with the sample size, we establish asymptotic properties of the resulting estimator and the oracle property of the proposed method. It is shown that the proposed estimator is more efficient than that obtained from the non-concave penalized likelihood approach in the literature. Based on a penalized empirical likelihood ratio statistic, we further develop a nonparametric likelihood approach for testing linear hypotheses on the regression coefficients and, consequently, for constructing confidence regions. Simulation studies are carried out to evaluate the performance of the proposed methodology, and two real data sets are analyzed.

Keywords

Penalized empirical likelihood, Empirical likelihood ratio, Oracle property, Smoothly clipped absolute deviation, Survival data, Variable selection
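
The keywords name the smoothly clipped absolute deviation (SCAD) penalty. As a minimal sketch only, the standard textbook form of that penalty can be written as below. The function name `scad_penalty`, its arguments, and the default a = 3.7 are illustrative assumptions and are not taken from the paper, which may use a different parameterization.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty evaluated at |theta|.

    Hypothetical helper for illustration; `lam` is the tuning parameter
    and `a` (> 2) controls where the penalty flattens out.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region near zero
        return lam * t
    if t <= a * lam:
        # Quadratic transition region
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients receive no additional penalty
    return lam * lam * (a + 1) / 2
```

The penalty grows linearly for small coefficients and becomes constant for large ones, which is the feature usually credited with reducing the bias that a plain Lasso penalty places on large effects.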

Notes

Acknowledgments

We are grateful to an Associate Editor and two anonymous referees for their very constructive comments and suggestions which helped improve the paper greatly. This research is supported partly by the Singapore Ministry of Education Academic Research Fund Tier 1 (RG30/12), Tier 2 (MOE2013-T2-2-118) and the National Natural Science Foundation of China (Grant No. 71420107025).

Supplementary material

Supplementary material 1: 11222_2016_9690_MOESM1_ESM.pdf (PDF, 291 KB)

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. School of Economics and Management, Beihang University, Beijing, China
  2. School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
