Gene-Gene Interactions Detection Using a Two-Stage Model

  • Zhanyong Wang
  • Jae Hoon Sul
  • Sagi Snir
  • Jose A. Lozano
  • Eleazar Eskin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8394)


Genome-wide association studies (GWAS) have discovered numerous loci involved in genetic traits. Virtually all studies have reported associations between individual single nucleotide polymorphisms (SNPs) and traits. However, complex traits are likely influenced by interactions among multiple SNPs. One way to detect SNP interactions is the brute force approach, which performs a pairwise association test between a trait and each pair of SNPs. This approach is often computationally infeasible because of the large number of SNPs collected in current GWAS. We propose a two-stage model, the Threshold-based Efficient Pairwise Association Approach (TEPAA), which reduces the number of tests needed while maintaining almost identical power to the brute force approach. In the first stage, our method performs the single marker test on all SNPs and selects the subset of SNPs that reach a given significance threshold. In the second stage, we perform a pairwise association test between the trait and pairs of the SNPs selected in the first stage. The key insight of our approach is that we derive the joint distribution of the association statistic of a single SNP and the association statistic of a pair of SNPs. This joint distribution allows us to guarantee that the statistical power of our approach closely approximates that of the brute force approach. We applied our approach to the Northern Finland Birth Cohort data and achieved a 63-fold speedup while maintaining 99% of the power of the brute force approach.
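The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' TEPAA implementation: the function names, the correlation-based statistics, and both thresholds are assumptions chosen for illustration, and the sketch does not derive the joint distribution that the paper uses to choose the first-stage threshold with a power guarantee. It also assumes that a pair is tested only when both of its SNPs pass the first stage, and that every SNP column is polymorphic (non-constant).

```python
import itertools
import numpy as np


def single_marker_stats(genotypes, trait):
    """Per-SNP statistic: sqrt(n) times the Pearson correlation with the trait.

    genotypes: (n_individuals, n_snps) array of minor-allele counts (0/1/2).
    Assumes every column is non-constant (illustrative simplification).
    """
    n = genotypes.shape[0]
    g = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    y = (trait - trait.mean()) / trait.std()
    return np.sqrt(n) * (g.T @ y) / n


def pairwise_stat(genotypes, trait, i, j):
    """Pair statistic (assumed form): sqrt(n) times the correlation between
    the trait and the product of the two standardized genotype columns."""
    n = genotypes.shape[0]
    gi = (genotypes[:, i] - genotypes[:, i].mean()) / genotypes[:, i].std()
    gj = (genotypes[:, j] - genotypes[:, j].mean()) / genotypes[:, j].std()
    x = gi * gj
    if x.std() == 0:
        return 0.0  # degenerate product term carries no signal
    x = (x - x.mean()) / x.std()
    y = (trait - trait.mean()) / trait.std()
    return float(np.sqrt(n) * np.mean(x * y))


def two_stage_scan(genotypes, trait, stage1_threshold, stage2_threshold):
    """Stage 1: keep SNPs whose single-marker |statistic| reaches the threshold.
    Stage 2: test only pairs formed from the SNPs kept in stage 1."""
    z = single_marker_stats(genotypes, trait)
    selected = np.flatnonzero(np.abs(z) >= stage1_threshold)
    hits = []
    for i, j in itertools.combinations(selected, 2):
        s = pairwise_stat(genotypes, trait, i, j)
        if abs(s) >= stage2_threshold:
            hits.append((int(i), int(j), s))
    k = len(selected)
    n_tests = genotypes.shape[1] + k * (k - 1) // 2
    return selected, hits, n_tests


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 500, 40
    G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
    # Simulated trait with marginal effects and an interaction of SNPs 3 and 7.
    y = 0.4 * G[:, 3] + 0.4 * G[:, 7] + 0.5 * G[:, 3] * G[:, 7] + rng.normal(size=n)
    selected, hits, n_tests = two_stage_scan(G, y, 2.0, 4.0)
    brute_force_tests = m * (m - 1) // 2
    print(f"stage-1 SNPs: {len(selected)}, tests: {n_tests} vs brute force {brute_force_tests}")
```

With m SNPs, brute force needs m(m-1)/2 pairwise tests, while the sketch needs m single-marker tests plus k(k-1)/2 pairwise tests for the k SNPs that pass the first stage. The saving therefore depends entirely on how strict the first-stage threshold is, which is the quantity the paper's joint distribution is meant to calibrate.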


Keywords: Single Nucleotide Polymorphism; Minor Allele Frequency; Power Loss; Sequence Kernel Association Test; Brute Force Approach





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Zhanyong Wang (1)
  • Jae Hoon Sul (1)
  • Sagi Snir (2)
  • Jose A. Lozano (3)
  • Eleazar Eskin (1)
  1. Computer Science Department, University of California Los Angeles, United States
  2. Institute of Evolution, Department of Evolutionary and Environmental Biology, Faculty of Natural Sciences, University of Haifa, Israel
  3. Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country, Donostia, Spain
