Convolutional Model for Predicting SNP Interactions

  • Suneetha Uppu
  • Aneesh Krishna
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11305)


Single-nucleotide polymorphisms (SNPs) are genetic markers that allow researchers to search for genes associated with complex diseases. Many studies have examined the interaction effects between multi-locus SNPs to determine the status of complex diseases. However, conventional machine learning techniques still have several limitations. Deep learning is a newer class of machine learning techniques that uncovers hidden structure in raw data by transforming it into multiple higher levels of abstraction, drawing on parallel and distributed computing. It has shown promising empirical results in a number of applications, including bioinformatics, where it can offer insight into biological complexity. In multi-locus interaction studies, however, deep learning has yet to reach its potential. In this paper, a convolutional neural network is trained to identify true causative two-locus SNP interactions. The performance of the method is evaluated on hypertension data, and highly ranked two-locus SNP interactions associated with the manifestation of hypertension are identified.


Keywords: Convolutional neural network; SNP-SNP interactions; Deep learning; Multi-locus; Epistasis; Gene-gene interactions



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, Bentley, Perth, Australia
