Feature selection of ultrahigh-dimensional covariates with survival outcomes: a selective review

Applied Mathematics-A Journal of Chinese Universities

Abstract

Many modern biomedical studies have yielded survival data with high-throughput predictors. The goals of scientific research often lie in identifying predictive biomarkers, understanding biological mechanisms, and making accurate and precise predictions. Variable screening is a crucial first step in achieving these goals. This work conducts a selective review of feature screening procedures for survival data with ultrahigh-dimensional covariates. We present the main methodologies, along with the key conditions that ensure sure screening properties. The practical utility of these methods is examined via extensive simulations. We conclude the review with some future opportunities in this field.

Acknowledgements

We thank Dr. Jialiang Li for providing the code for the survival impact index screening and Ms. Martina Fu for proofreading the manuscript.

Author information

Correspondence to Yi Li.

Additional information

Supported by the National Natural Science Foundation of China (11528102) and the National Institutes of Health (U01CA209414).

About this article

Cite this article

Hong, H.G., Li, Y. Feature selection of ultrahigh-dimensional covariates with survival outcomes: a selective review. Appl. Math. J. Chin. Univ. 32, 379–396 (2017). https://doi.org/10.1007/s11766-017-3547-8

  • DOI: https://doi.org/10.1007/s11766-017-3547-8
