Gene-Gene and Gene-Environment Interactions

  • Andrew T. DeWan
Part of the Methods in Molecular Biology book series (MIMB, volume 1793)


Identifying gene-gene and gene-environment interactions may help us better describe the genetic architecture of complex traits. Although denser panels of genetic variants and larger sample sizes have advanced the identification of variants associated with complex traits, genome-wide interaction analyses still have limited power to detect interactions with small effect sizes or low frequencies, as well as higher-order interactions. This chapter outlines methods for detecting both gene-gene and gene-environment interactions through explicit tests (i.e., ones in which the interaction is tested directly) and non-explicit tests (i.e., ones in which an interaction is allowed for in the test but is not tested directly), as well as approaches for increasing power by reducing the search space. Issues relating to multiple test correction, replication, and the reporting of interaction results in publications are also discussed.
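To illustrate why reducing the search space can increase power, the following minimal sketch counts the tests in an exhaustive pairwise search and derives a per-test significance threshold. It assumes a Bonferroni correction, which is used here only as a simple example and is not necessarily the correction the chapter recommends. The function name and the panel sizes are hypothetical, chosen for illustration.

```python
from math import comb


def bonferroni_threshold(n_variants: int, order: int = 2, alpha: float = 0.05) -> float:
    """Per-test threshold for an exhaustive search over all `order`-way
    variant combinations, assuming a Bonferroni correction."""
    n_tests = comb(n_variants, order)  # number of distinct combinations tested
    if n_tests == 0:
        raise ValueError("n_variants must be at least as large as order")
    return alpha / n_tests


# Hypothetical panel sizes, for illustration only.
for label, n in [("full panel", 500_000), ("filtered subset", 10_000)]:
    print(f"{label}: {comb(n, 2):,} pairwise tests, "
          f"per-test threshold {bonferroni_threshold(n):.2e}")
```

Under these assumptions, restricting the search to a smaller set of variants relaxes the per-test threshold by several orders of magnitude, because the number of pairwise tests grows roughly with the square of the number of variants.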

Key words

Interaction, Epistasis, Environment, GWAS, Power, Replication


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, USA
