Abstract
A fundamental goal of precision medicine is to quantitatively decode the genetic basis of complex human diseases, which would enable predictive models of disease risk based on personal genome sequences. To account for the complexity of biological systems across different cellular contexts, the analysis must integrate large-scale regulatory networks. With the rapid accumulation of multiomics and disease genetics data, advanced machine learning algorithms and efficient computational tools are becoming the driving force in predicting phenotypes from genotypes, identifying potential causal genetic variants, and revealing disease mechanisms. Here, we review state-of-the-art methods for this problem and describe a computational pipeline that assembles a series of algorithms to improve disease genetics prediction by delineating regulatory circuitry step by step.
Acknowledgements
Hao Wang, Jiaxin Yang and Jianrong Wang were supported by NIH R01GM131398. The authors would like to thank iCER at MSU for providing the high-performance computing facilities.
Copyright information
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Wang, H., Yang, J., Wang, J. (2021). Leverage Large-Scale Biological Networks to Decipher the Genetic Basis of Human Diseases Using Machine Learning. In: Cartwright, H. (eds) Artificial Neural Networks. Methods in Molecular Biology, vol 2190. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0826-5_11
DOI: https://doi.org/10.1007/978-1-0716-0826-5_11
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0825-8
Online ISBN: 978-1-0716-0826-5
eBook Packages: Springer Protocols