Statistics for Testing Gene–Environment Interaction

  • Momiao Xiong
  • Xuesen Wu


This chapter introduces several new gene–environment interaction measures and develops novel statistics based on them. These statistics are simple, computationally light, and easy to implement. It is hoped that these developments may open a new avenue for large-scale genome-wide gene–environment interaction analysis, for deciphering the genetic and physiological meaning of gene–environment interactions, and for developing sophisticated statistical methods to unravel the gene–gene and gene–environment interactions that lead to the development of human cancers.


Keywords: Statistics; Testing interaction between gene and binary or continuous environment; Cancer



M. Xiong is supported by grants from the National Institutes of Health: NIAMS P01 AR052915-01A1 and NIAMS R01AR057120-01.



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. School of Public Health, Human Genetics Center, The University of Texas Health Science Center at Houston, Houston, USA
  2. Department of Epidemiology and Statistics, Bengbu Medical College, Bengbu, China
