A new comparison of nested case–control and case–cohort designs and methods

  • METHODS
  • European Journal of Epidemiology

Abstract

The existing literature comparing the statistical properties of nested case–control and case–cohort methods has become insufficient for present-day epidemiologists. It has not reconciled conflicting conclusions about the standard methods, and it lacks a comparison that includes newly developed methods, such as inverse probability weighting methods. Two analytical methods for nested case–control studies and six methods for case–cohort studies, all based on the proportional hazards regression model, were summarized and their statistical properties were compared. The answer to which design and method is more powerful was more nuanced than previously reported. For both nested case–control and case–cohort designs, inverse probability weighting methods were more powerful than the standard methods. However, the difference became negligible when the proportion of failure events in the full cohort was very low (<1%). The comparison between the two designs depended on the censoring type and the incidence proportion. With random censoring, nested case–control designs coupled with the inverse probability weighting method yielded the highest statistical power among all methods for both designs. With fixed censoring times, there was little difference in efficiency between the two designs when inverse probability weighting methods were used; however, the standard case–cohort methods were more powerful than the conditional logistic method for nested case–control designs. As the proportion of failure events in the full cohort became smaller (<10%), nested case–control methods outperformed all case–cohort methods and the choice of analytic method within each design became less important. When the predictor of interest was binary, the standard case–cohort methods were often more powerful than the conditional logistic method for nested case–control designs.
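
To make the inverse probability weighting idea for nested case–control data more concrete, the following is a minimal Python sketch of how inclusion probabilities in the style of Samuelsen (1997) might be computed. It assumes no tied event times and that m controls are drawn by simple random sampling, without replacement, from the subjects still at risk at each case's event time. The function names `inclusion_probabilities` and `ipw_weights` are hypothetical and are not taken from the article or its accompanying R code.

```python
from typing import List, Sequence


def inclusion_probabilities(times: Sequence[float],
                            events: Sequence[bool],
                            m: int) -> List[float]:
    """Probability that each cohort member is ever sampled as a control.

    Assumes no tied event times and m controls per case, drawn at random
    from the other subjects still at risk at the case's event time.
    """
    n = len(times)
    probs = []
    for j in range(n):
        never_sampled = 1.0
        for i in range(n):
            # Subject j can serve as a control for case i only if i is a
            # failure, i is not j, and j is still at risk at times[i].
            if i == j or not events[i] or times[j] < times[i]:
                continue
            at_risk = sum(1 for t in times if t >= times[i])
            # m of the (at_risk - 1) non-case members are sampled.
            never_sampled *= 1.0 - min(1.0, m / (at_risk - 1))
        probs.append(1.0 - never_sampled)
    return probs


def ipw_weights(times: Sequence[float],
                events: Sequence[bool],
                m: int) -> List[float]:
    """Weight 1 for cases, 1/p for non-cases (0 if never eligible)."""
    probs = inclusion_probabilities(times, events, m)
    return [1.0 if d else (1.0 / p if p > 0 else 0.0)
            for d, p in zip(events, probs)]


# Toy cohort: follow-up times and failure indicators (illustrative only).
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [True, False, True, False]
toy_weights = ipw_weights(toy_times, toy_events, m=1)
```

In the toy cohort, the subject censored at time 2 is eligible only at the first event, when four subjects are at risk, so its inclusion probability is 1/3 and its weight is about 3. In an actual analysis these weights would be attached only to the sampled controls and the cases, and passed to a weighted Cox regression with a variance estimator suited to the design.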

References

  1. Thomas D. Addendum to ‘Methods of cohort analysis: appraisal by application to asbestos mining’ by Liddell FDK, McDonald JC, Thomas DC. J R Stat Soc. 1977;A140:469–91.

  2. Prentice RL. A case–cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika. 1986;73:1–11.

  3. Self SG, Prentice RL. Asymptotic distribution theory and efficiency results for case–cohort studies. Ann Stat. 1988;16:64–81.

  4. Langholz B, Thomas D. Nested case–control and case–cohort methods of sampling from a cohort: a critical comparison. Am J Epidemiol. 1990;131:169–76.

  5. Barlow WE, Ichikawa L, Rosner D, Izumi S. Analysis of case–cohort designs. J Clin Epidemiol. 1999;52:1165–72.

  6. Barlow WE. Robust variance estimation for the case–cohort design. Biometrics. 1994;50:1064–72.

  7. Samuelsen S. A pseudo-likelihood approach to analysis of nested case–control studies. Biometrika. 1997;84:379–94.

  8. Kim RS, Kaplan R. Analysis of secondary outcomes in nested case–control study designs. Stat Med. 2014;33:4215–26.

  9. Kim RS. Analysis of nested case–control study designs: revisiting the inverse probability weighting method. Commun Stat Appl Methods. 2013;20:455–66.

  10. Lin DY, Ying Z. Cox regression with incomplete covariate measurements. J Am Stat Assoc. 1993;88:1341–9.

  11. Binder DA. Fitting Cox’s proportional hazards models from survey data. Biometrika. 1992;79:139–47.

  12. Lin DY. On fitting Cox’s proportional hazards models to survey data. Biometrika. 2000;87:37–47.

  13. Andersen PK, Gill RD. Cox’s regression model for counting processes: a large sample study. Ann Stat. 1982;10:1100–20.

  14. Borgan O, Goldstein L, Langholz B. Methods for the analysis of sampled cohort data in the Cox proportional hazards model. Ann Stat. 1995;23:1749–78.

  15. Therneau TM, Li H. Computing the Cox model for case cohort designs. Lifetime Data Anal. 1999;5:99–112.

  16. Lin DY, Wei LJ. The robust inference for the Cox proportional hazards model. J Am Stat Assoc. 1989;84:1074–8.

  17. R Development Core Team. R: a language and environment for statistical computing. Vienna: R Development Core Team; 2010.

  18. R code: six case-cohort and two nested case-control methods. http://missionalconsulting.com/methods/rcode-cch-ncc

  19. Zhang H, Goldstein L. Information and asymptotic efficiency of the case–cohort sampling design in Cox’s regression model. J Multivar Anal. 2003;85:292–317.

  20. Goldstein L, Zhang H. Efficiency of the maximum partial likelihood estimator for nested case control sampling. Bernoulli. 2009;15:569–97.

  21. Wacholder S. Practical considerations in choosing between the case–cohort and nested case–control designs. Epidemiology. 1991;2:155–8.

  22. Chen KN. Generalized case–cohort sampling. J R Stat Soc Ser B (Stat Methodol). 2001;63:791–809.

  23. Chen KN. Statistical estimation in the proportional hazards model with risk set sampling. Ann Stat. 2004;32:1513–32.

  24. Chen HY. Double-semiparametric method for missing covariates in Cox regression models. J Am Stat Assoc. 2002;97:565–76.

  25. Scheike TH, Juul A. Maximum likelihood estimation for Cox’s regression model under nested case–control sampling. Biostatistics. 2004;5:193–206.

  26. Prentice RL, Williams BJ, Peterson AV. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–9.

  27. Lubin JH. Case–control methods in the presence of multiple failure times and competing risks. Biometrics. 1985;41:49–54.

  28. Zhang H, Schaubel DE, Kalbfleisch JD. Proportional hazards regression for the analysis of clustered survival data from case–cohort studies. Biometrics. 2011;67:18–28.

  29. Chen F, Chen K. Case–cohort analysis of clusters of recurrent events. Lifetime Data Anal. 2014;20:1–15.

  30. Xue X, Xie X, Gunter M, Rohan TE, Wassertheil-Smoller S, Ho GY, et al. Testing the proportional hazards assumption in case–cohort analysis. BMC Med Res Methodol. 2013;13:1–10.

  31. Bellera C, MacGrogan G, Debled M, de Lara C, Brouste V, Mathoulin-Pelissier S. Variables with time-varying effects and the cox model: some statistical concepts illustrated with a prognostic factor study in breast cancer. BMC Med Res Methodol. 2010;10:1–12.

  32. Lu W, Liu M, Chen Y-H. Testing goodness-of-fit for the proportional hazards model based on nested case–control data. Biometrics. 2014. doi:10.1111/biom.12239.

  33. Ranganathan P, Pramesh CS. Censoring in survival analysis: potential for bias. Perspect Clin Res. 2012;3:40.

  34. Meier EN. A sensitivity analysis for clinical trials with informatively censored survival endpoints. Master’s thesis, University of Washington; 2012.

  35. Braekers R, Veraverbeke N. Cox’s regression model under partially informative censoring. Commun Stat Theory Methods. 2005;34:1793–811.

  36. Lin DY, Robins JM, Wei LJ. Comparing two failure time distributions in the presence of dependent censoring. Biometrika. 1996;83:381–93.

Acknowledgments

This work was supported by the National Institutes of Health Grants 1UL1RR025750-01, P30 CA01330-35; and the National Research Foundation of Korea Grant NRF-2012-S1A3A2033416. The author is deeply thankful for the constructive comments from the anonymous referees, which led to significant improvement of this work.

Conflict of interest

None.

Author information

Corresponding author

Correspondence to Ryung S. Kim.

Electronic supplementary material

Below is the link to the electronic supplementary material.

10654_2014_9974_MOESM1_ESM.tif

The empirical biases of the estimators of β₂. The methods considered are the full cohort analysis; two nested case–control methods, namely the conditional logistic approach by Thomas (1977) and the inverse probability weighting method by Samuelsen (1997); and four case–cohort methods, namely the inverse probability weighting method by Binder (1992) and the methods by Prentice (1986), Self & Prentice (1988), and Barlow (1994). The average sample size n* and the average subcohort proportion π* are shown in the titles. CCH and NCC are abbreviations for case–cohort and nested case–control designs, respectively (TIFF 173 kb)

10654_2014_9974_MOESM2_ESM.tif

The empirical standard errors of the estimators of β₂. The empirical standard errors of the β₂ estimators are shown for the full cohort analysis; two nested case–control (NCC) methods, namely the conditional logistic approach by Thomas (1977) and the inverse probability weighting method by Samuelsen (1997); and four case–cohort (CCH) methods, namely the inverse probability weighting method by Binder (1992) and the methods by Prentice (1986), Self & Prentice (1988), and Barlow (1994). The average sample size n* and the average subcohort proportion π* are shown in the titles. Only the results for N = 500 and 1,000 are shown (TIFF 170 kb)

10654_2014_9974_MOESM3_ESM.tif

Empirical power for testing H₀: β₂ = 0. The nominal type 1 error rate was 0.05. The empirical power of nine methods is measured: the full cohort analysis; the conditional logistic approach by Thomas (1977); the inverse probability weighting method by Samuelsen (1997) coupled with the approximate jackknife (AJK) variance estimator (Kim 2013); the inverse probability weighting method by Binder (1992) coupled with the AJK variance estimator; Prentice (1986); Prentice (1986) coupled with the AJK variance estimator; Self & Prentice (1988); Self & Prentice (1988) coupled with the AJK variance estimator (i.e. Lin & Ying 1993); and Barlow (1994). The average sample size n* and the average subcohort proportion π* are shown in the titles. CCH and NCC are abbreviations for case–cohort and nested case–control designs, respectively. Only the results for N = 500 and 1,000 are shown (TIFF 199 kb)

10654_2014_9974_MOESM4_ESM.tif

N = 1,500. The empirical biases and standard errors of the estimators of β₁, the empirical power for testing H₀: β₁ = 0, and the empirical standard errors of the estimators of β₂ are shown for N = 1,500 (TIFF 172 kb)

About this article

Cite this article

Kim, R.S. A new comparison of nested case–control and case–cohort designs and methods. Eur J Epidemiol 30, 197–207 (2015). https://doi.org/10.1007/s10654-014-9974-4

  • DOI: https://doi.org/10.1007/s10654-014-9974-4
