Mendelian Randomization

  • Sandeep Grover
  • Fabiola Del Greco M.
  • Catherine M. Stein
  • Andreas Ziegler
Part of the Methods in Molecular Biology book series (MIMB, volume 1666)


Confounding and reverse causality have prevented us from drawing meaningful clinical interpretations even from well-powered observational studies. Confounding can be attributed to the inability to randomize the exposure variable in observational studies. Mendelian randomization (MR) is one approach to overcome confounding. It uses one or more genetic polymorphisms as a proxy for the exposure variable of interest. Polymorphisms are randomly distributed in a population and are static throughout an individual’s lifetime; they may thus help in inferring directionality in exposure–outcome associations. Genome-wide association studies (GWAS) or meta-analyses of GWAS are characterized by large sample sizes and the availability of many single nucleotide polymorphisms (SNPs), making GWAS-based MR an attractive approach. GWAS-based MR comes with specific challenges, including multiple causality. Despite these shortcomings, it remains one of the most powerful techniques for inferring causality.

Because MR is still an evolving concept with complex statistical challenges, the literature offers relatively few working examples based on real datasets. In this chapter, we provide a step-by-step guide to causal inference based on the principles of MR, using a real dataset with both individual and summary data from unrelated individuals. We suggest best practices and give recommendations based on the current literature.
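
To make the summary-data idea concrete, the following is a minimal Python sketch, not the chapter's own code. It assumes each genetic variant comes with a variant–exposure estimate, a variant–outcome estimate, and the standard error of the latter. It computes a per-variant ratio (Wald) estimate and a fixed-effect inverse-variance weighted (IVW) combination. The function names and the numbers are hypothetical and are not taken from the supplementary datasets. The standard errors use a first-order approximation that ignores uncertainty in the variant–exposure estimates.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Ratio (Wald) estimate of the causal effect from a single variant.

    The standard error is a first-order approximation that treats the
    variant-exposure estimate as if it were known without error.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect inverse-variance weighted estimate across variants.

    Equivalent to a weighted regression of the variant-outcome estimates
    on the variant-exposure estimates through the origin, with weights
    1 / se_outcome**2.
    """
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(betas_exposure, betas_outcome, ses_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(betas_exposure, ses_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three independent variants.
bx = [0.10, 0.20, 0.15]   # variant-exposure associations
by = [0.05, 0.10, 0.075]  # variant-outcome associations
se = [0.01, 0.02, 0.015]  # standard errors of the variant-outcome associations

single_estimate, single_se = wald_ratio(bx[0], by[0], se[0])
pooled_estimate, pooled_se = ivw_estimate(bx, by, se)
print(single_estimate, single_se)
print(pooled_estimate, pooled_se)
```

In this toy input every variant implies the same ratio, so the pooled estimate coincides with the single-variant estimate while its standard error is smaller. With real data the per-variant ratios differ, and that heterogeneity is one signal of possible pleiotropy.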

Key words

Mendelian randomization; Instrumental variable; Observational epidemiology; Reverse causation; Unobserved confounding; Genome-wide association study; Individual data; Summary data; Pleiotropy; Causal inference



Acknowledgments

S.G. gratefully acknowledges the organizers of the recent workshop on Mendelian Randomization run through the University of Cambridge’s Department for Public Health and Primary Care. This work was supported by a grant from the German Research Foundation (Research Unit ProtectMove, FOR 2488).

Supplementary material

421123_2_En_29_MOESM1_ESM.csv (59 kb)
Dataset 1 Descriptive table of prioritized SNPs for estimating the causal estimates using individual and summary data (CSV 1 kb)
421123_2_En_29_MOESM2_ESM.csv (1 kb)
Dataset 2 Example dataset with quality-controlled individual data on phenotypic variables and prioritized genetic instruments (CSV 59 kb)
421123_2_En_29_MOESM3_ESM.csv (1 kb)
Dataset 3 Example dataset with summary data on genetic instrument–exposure and genetic instrument–outcome associations (CSV 1019 bytes)


Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  • Sandeep Grover (1)
  • Fabiola Del Greco M. (2)
  • Catherine M. Stein (3)
  • Andreas Ziegler (1, corresponding author)

  1. Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Lübeck, Germany
  2. Center for Biomedicine, EURAC Research, Bolzano, Italy
  3. Case Western Reserve University, Cleveland, USA
