Protocol for Epistasis Detection with Machine Learning Using GenEpi Package

Epistasis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2212))

Abstract

Developing medical treatments and preventive measures requires identifying the associations between diseases and genetic variants. The main goal of a genome-wide association study (GWAS) is to discover the underlying causes of vulnerability to disease and to use this knowledge to develop prevention and treatment strategies. Although many methods have been proposed to search for epistasis, no standard approach for detecting it exists, and detection remains difficult because of limited statistical power. GenEpi is a Python package that uses a two-level machine learning workflow to detect within-gene and cross-gene epistasis. This protocol chapter demonstrates the use of GenEpi with example data. The package follows a three-step procedure: it reduces dimensionality, selects within-gene epistasis, and then selects cross-gene epistasis. It also provides a means of building prediction models that combine genetic features with environmental influences.
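The three-step procedure named in the abstract can be illustrated with a minimal sketch. This is not the GenEpi API or its implementation: every function name, the toy data, the 0.9 redundancy threshold, and the use of a plain correlation filter in place of the package's machine learning models are assumptions made only to show how the steps might feed into one another.

```python
from itertools import combinations, product
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def prune_redundant(snps, threshold=0.9):
    """Step 1 (hypothetical stand-in for dimensionality reduction):
    drop a SNP that is highly correlated with one already kept."""
    kept = {}
    for name, values in snps.items():
        if all(abs(pearson(values, v)) < threshold for v in kept.values()):
            kept[name] = values
    return kept


def best_products(pairs, phenotype, k=1):
    """Score each pairwise product feature by |correlation| with the
    phenotype and keep the k highest-scoring ones."""
    scored = []
    for (name_a, va), (name_b, vb) in pairs:
        feature = [p * q for p, q in zip(va, vb)]
        score = abs(pearson(feature, phenotype))
        scored.append((score, f"{name_a}*{name_b}", feature))
    scored.sort(key=lambda item: item[0], reverse=True)
    return {name: feature for _, name, feature in scored[:k]}


def run_pipeline(genes, phenotype):
    # Step 2: within each gene, keep the best SNP-by-SNP interaction.
    within = {}
    for gene, snps in genes.items():
        kept = prune_redundant(snps)
        pairs = combinations(kept.items(), 2)
        # A gene with fewer than two SNPs passes its SNPs through as-is.
        within[gene] = best_products(pairs, phenotype) or kept
    # Step 3: combine the per-gene selections across every pair of genes.
    cross = {}
    for g1, g2 in combinations(within, 2):
        pairs = product(within[g1].items(), within[g2].items())
        cross.update(best_products(pairs, phenotype))
    return within, cross


# Toy data: 8 samples, binary-coded genotypes, grouped by gene.
genes = {
    "A": {
        "a1": [0, 0, 1, 1, 0, 1, 1, 0],
        "a2": [0, 1, 0, 1, 1, 0, 1, 0],
        "a3": [0, 0, 1, 1, 0, 1, 1, 0],  # duplicate of a1, pruned in step 1
        "a4": [1, 0, 0, 1, 0, 1, 0, 1],
    },
    "B": {
        "b1": [1, 1, 0, 1, 1, 1, 0, 1],
        "b2": [0, 1, 1, 1, 1, 0, 1, 1],
    },
}
# The phenotype is constructed to equal a1 AND a2, a purely epistatic signal.
phenotype = [0, 0, 0, 1, 0, 0, 1, 0]

within, cross = run_pipeline(genes, phenotype)
print(list(within["A"]), list(within["B"]), list(cross))
```

Because neither `a1` nor `a2` is correlated with the other in this toy data, the signal only becomes visible once their product is scored, which is the motivation for searching over interaction features rather than single variants. Restricting the search to within-gene pairs first keeps the number of candidate products small before any cross-gene combinations are formed.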

Acknowledgments

The work described in this paper was substantially supported by two grants from the Research Grants Council of the Hong Kong Special Administrative Region ([CityU 11203217] and [CityU 11200218]) and the funding from the Hong Kong Institute for Data Science (HKIDS) at the City University of Hong Kong. The work described in this paper was partially supported by a grant from the City University of Hong Kong (CityU 11202219).

Author information

Corresponding author

Correspondence to Ka-Chun Wong.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Petinrin, O.O., Wong, KC. (2021). Protocol for Epistasis Detection with Machine Learning Using GenEpi Package. In: Wong, KC. (eds) Epistasis. Methods in Molecular Biology, vol 2212. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0947-7_18

  • DOI: https://doi.org/10.1007/978-1-0716-0947-7_18

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0946-0

  • Online ISBN: 978-1-0716-0947-7

  • eBook Packages: Springer Protocols
