A regularized variable selection procedure in additive hazards model with stratified case-cohort design

Lifetime Data Analysis

Abstract

Case-cohort designs are commonly used in large epidemiological studies to reduce the cost associated with covariate measurement. In many such studies the number of covariates is very large. An efficient variable selection method is needed for case-cohort studies where the covariates are only observed in a subset of the sample. The current literature on this topic has focused on the proportional hazards model. However, in many studies the additive hazards model is preferred over the proportional hazards model, either because the proportional hazards assumption is violated or because the additive hazards model provides more relevant information for the research question. Motivated by one such study, the Atherosclerosis Risk in Communities (ARIC) study, we investigate the properties of a regularized variable selection procedure in a stratified case-cohort design under an additive hazards model with a diverging number of parameters. We establish the consistency and asymptotic normality of the penalized estimator and prove its oracle property. Simulation studies are conducted to assess the finite sample performance of the proposed method with a modified cross-validation tuning parameter selection method. We apply the variable selection procedure to the ARIC study to demonstrate its practical use.

Acknowledgements

This work was partially supported by National Institutes of Health Grants (P01 CA 142538, R01 ES 021900). The authors thank the staff and participants of the ARIC study for their important contributions. The ARIC Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts (N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, N01-HC-55022).

Author information

Corresponding author

Correspondence to Ai Ni.

Electronic supplementary material

Supplementary material 1 (pdf 133 KB)

Cite this article

Ni, A., Cai, J. A regularized variable selection procedure in additive hazards model with stratified case-cohort design. Lifetime Data Anal 24, 443–463 (2018). https://doi.org/10.1007/s10985-017-9402-7

  • DOI: https://doi.org/10.1007/s10985-017-9402-7
