European Journal of Epidemiology, Volume 33, Issue 1, pp 5–14

Case–control matching: effects, misconceptions, and recommendations

  • Mohammad Ali Mansournia
  • Nicholas Patrick Jewell
  • Sander Greenland


Misconceptions about the impact of case–control matching remain common. We discuss several subtle problems associated with matched case–control studies that do not arise or are minor in matched cohort studies: (1) matching, even for non-confounders, can create selection bias; (2) matching distorts dose–response relations between matching variables and the outcome; (3) unbiased estimation requires accounting for the actual matching protocol as well as for any residual confounding effects; (4) for efficiency, identically matched groups should be collapsed; (5) matching may harm precision and power; (6) matched analyses may suffer from sparse-data bias, even when using basic sparse-data methods. These problems support advice to limit case–control matching to a few strong well-measured confounders, which would devolve to no matching if no such confounders are measured. On the positive side, odds ratio modification by matched variables can be assessed in matched case–control studies without further data, and when one knows either the distribution of the matching factors or their relation to the outcome in the source population, one can estimate and study patterns in absolute rates. Throughout, we emphasize distinctions from the more intuitive impacts of cohort matching.
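Points (1) and (3) can be illustrated with a small expected-count calculation. The sketch below is not taken from the paper: the source population, the prevalences, the baseline risk, and the odds ratio of 4 are all hypothetical values chosen for illustration. The matching factor M is associated with exposure but has no effect on disease given exposure, so it is not a confounder in the source population. Controls are frequency matched 1:1 to cases on M, and the crude odds ratio in the matched data is compared with the M-specific odds ratios.

```python
# Expected-count illustration of selection bias from case-control matching.
# All numeric inputs are hypothetical assumptions, not values from the paper.

TRUE_OR = 4.0                      # assumed exposure-disease odds ratio
P_M = {1: 0.5, 0: 0.5}             # assumed distribution of matching factor M
P_E_GIVEN_M = {1: 0.8, 0: 0.2}     # assumed exposure prevalence within levels of M
RISK_UNEXPOSED = 0.01              # assumed risk among the unexposed

# Risk among the exposed, chosen so that the odds ratio equals TRUE_OR exactly.
_odds0 = RISK_UNEXPOSED / (1 - RISK_UNEXPOSED)
RISK_EXPOSED = TRUE_OR * _odds0 / (1 + TRUE_OR * _odds0)


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio from the four cells of a 2x2 table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


def population_cells(m):
    """Expected population fractions by disease status and exposure in stratum m.

    Risk depends on exposure only, so M has no effect on disease given exposure.
    """
    cells = {}
    for e, risk in ((1, RISK_EXPOSED), (0, RISK_UNEXPOSED)):
        p_e = P_E_GIVEN_M[m] if e == 1 else 1 - P_E_GIVEN_M[m]
        cells[("case", e)] = P_M[m] * p_e * risk
        cells[("noncase", e)] = P_M[m] * p_e * (1 - risk)
    return cells


stratum_or = {}
pooled = {"case_exp": 0.0, "case_unexp": 0.0, "ctrl_exp": 0.0, "ctrl_unexp": 0.0}
source = {"case_exp": 0.0, "case_unexp": 0.0, "ctrl_exp": 0.0, "ctrl_unexp": 0.0}

for m in (1, 0):
    c = population_cells(m)
    n_cases = c[("case", 1)] + c[("case", 0)]
    n_noncases = c[("noncase", 1)] + c[("noncase", 0)]

    # Frequency matching: one control per case within stratum m, sampled from
    # non-cases, so controls keep the non-case exposure distribution of stratum m.
    ctrl_exp = n_cases * c[("noncase", 1)] / n_noncases
    ctrl_unexp = n_cases * c[("noncase", 0)] / n_noncases

    stratum_or[m] = odds_ratio(c[("case", 1)], c[("case", 0)], ctrl_exp, ctrl_unexp)

    pooled["case_exp"] += c[("case", 1)]
    pooled["case_unexp"] += c[("case", 0)]
    pooled["ctrl_exp"] += ctrl_exp
    pooled["ctrl_unexp"] += ctrl_unexp

    # Unmatched comparison: cases versus all non-cases in the source population.
    source["case_exp"] += c[("case", 1)]
    source["case_unexp"] += c[("case", 0)]
    source["ctrl_exp"] += c[("noncase", 1)]
    source["ctrl_unexp"] += c[("noncase", 0)]

crude_matched_or = odds_ratio(**pooled)    # ignores the matching factor
crude_unmatched_or = odds_ratio(**source)  # source population, no matching

print("M-specific odds ratios:", stratum_or)
print("Crude odds ratio, source population:", round(crude_unmatched_or, 3))
print("Crude odds ratio, matched data:", round(crude_matched_or, 3))
```

Under these assumed inputs, the crude odds ratio in the source population equals the M-specific value of 4, so M needs no adjustment there. After matching, the crude odds ratio falls to roughly 2.6, while stratifying on M recovers 4. The matching itself has introduced a bias that the analysis must then remove.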


Keywords: Bias; Case–control studies; Confounding; Matching; Odds ratio



The authors are grateful to David Clayton and the referees for helpful comments on earlier drafts of this paper.



Copyright information

© Springer Science+Business Media B.V. 2017

Authors and Affiliations

  1. Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
  2. Division of Biostatistics, School of Public Health, University of California, Berkeley, USA
  3. Department of Statistics, University of California, Berkeley, USA
  4. Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles, USA
  5. Department of Statistics, College of Letters and Science, University of California, Los Angeles, USA
