Lifetime Data Analysis

Volume 21, Issue 4, pp 517–541

Nested case–control studies: should one break the matching?

  • Ørnulf Borgan
  • Ruth Keogh


In a nested case–control study, controls are selected for each case from the individuals who are at risk at the time at which the case occurs. We say that the controls are matched on study time. To adjust for possible confounding, it is common to match on other variables as well. The standard analysis of nested case–control data is based on a partial likelihood which compares the covariates of each case to those of its matched controls. It has been suggested that one may break the matching of nested case–control data and analyse them as case–cohort data using an inverse probability weighted (IPW) pseudo-likelihood. Further, when some covariates are available for all individuals in the cohort, multiple imputation (MI) makes it possible to use all available data in the cohort. In this paper we review the standard method and the IPW and MI approaches, and compare their performance in simulations that cover a range of scenarios, including one and two endpoints.
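As an informal illustration of the ideas summarised above, the following Python sketch draws controls from the risk set of each case, evaluates the matched partial log-likelihood for a single covariate, and computes a Kaplan-Meier-type probability of ever being sampled as a control, of the kind commonly attributed to Samuelsen (1997). It is not the authors' code: the function names, the toy data, the restriction to matching on study time only, and the assumption of no tied event times are all choices made here for illustration.

```python
import math
import random


def sample_nested_case_control(times, events, m, rng):
    """For each case, draw m controls from those still at risk at its event time.

    Returns a list of (case index, control indices, number at risk).
    Assumes no tied event times and matching on study time only.
    """
    sets = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue
        # At risk at t_i: follow-up time >= t_i, excluding the case itself.
        at_risk = [k for k, t_k in enumerate(times) if t_k >= t_i and k != i]
        controls = rng.sample(at_risk, min(m, len(at_risk)))
        sets.append((i, controls, len(at_risk) + 1))
    return sets


def partial_loglik(beta, x, sets):
    """Matched partial log-likelihood: each case vs. its own sampled set."""
    ll = 0.0
    for case, controls, _ in sets:
        denom = sum(math.exp(beta * x[k]) for k in [case] + controls)
        ll += beta * x[case] - math.log(denom)
    return ll


def inclusion_probabilities(times, events, m):
    """Probability that each subject is ever sampled; cases are given 1."""
    case_times = [t for t, d in zip(times, events) if d]
    probs = []
    for t_i, d_i in zip(times, events):
        if d_i:
            probs.append(1.0)
            continue
        not_sampled = 1.0
        for t_j in case_times:
            if t_i >= t_j:  # subject was at risk when this case occurred
                n_at_risk = sum(1 for t_k in times if t_k >= t_j)
                not_sampled *= 1.0 - min(1.0, m / (n_at_risk - 1))
        probs.append(1.0 - not_sampled)
    return probs


# Toy cohort (hypothetical values): follow-up times, event indicators, covariate.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
events = [1, 0, 1, 0, 0]
x = [0.5, -1.0, 1.5, 0.0, 2.0]

sets = sample_nested_case_control(times, events, m=1, rng=random.Random(0))
loglik_at_zero = partial_loglik(0.0, x, sets)
probs = inclusion_probabilities(times, events, m=1)
```

Under these assumptions, an IPW analysis that breaks the matching would weight each sampled subject by the inverse of its inclusion probability in a Cox-type pseudo-likelihood, whereas the matched analysis uses only the within-set comparisons in `partial_loglik`.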


Keywords: Case–cohort; Competing risks; Cox regression; Inverse probability weighting; Matching; Multiple imputation; Nested case–control



Most of this research was done while Ørnulf Borgan was visiting the Department of Medical Statistics at the London School of Hygiene and Tropical Medicine in the spring of 2014. The department is acknowledged for its hospitality and for providing the best working facilities. We also want to thank Nathalie Støer for letting us use her new R package multipleNCC before it was made publicly available.

Supplementary material

Supplementary material 1: 10985_2015_9319_MOESM1_ESM.pdf (PDF, 35 KB)



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Mathematics, University of Oslo, Oslo, Norway
  2. Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
