Identification of Pathway-Modulating Genes Using the Biomedical Literature Mining

  • Zhenning Yu
  • Jin Hyun Nam
  • Daniel Couch
  • Andrew Lawson
  • Dongjun Chung
Chapter
Part of the ICSA Book Series in Statistics book series (ICSABSS)

Abstract

Although the biomedical literature is considered a valuable resource for investigating relationships among genes, using it effectively for this purpose remains challenging, mainly because most abstracts contain information about a single gene, whereas the majority of approaches rely on the co-occurrence of genes within an abstract. To address this limitation, we recently developed bayesGO, a Bayesian hierarchical model that identifies indirect relationships between genes by linking them through Gene Ontology (GO) terms. This approach also facilitates interpretation of the identified pathways by automatically associating relevant GO terms with each gene within a unified framework. In this book chapter, we illustrate this approach using the web interface GAIL, which provides PubMed literature mining results based on human gene entities and GO terms, along with the R package bayesGO, which implements the proposed Bayesian hierarchical model. GAIL is currently hosted at http://chunglab.io/GAIL, and the R package bayesGO is publicly available at its GitHub webpage (https://dongjunchung.github.io/bayesGO/).
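The limitation of co-occurrence described in the abstract can be made concrete with a toy example. The sketch below is not the bayesGO model and does not use the GAIL or bayesGO interfaces; it is a minimal Python illustration, with hypothetical abstract records and hypothetical function names, of how genes that never appear in the same abstract can still be linked through a shared GO term.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records (made up for illustration, not real mining output):
# each abstract mentions a single gene together with one GO term.
abstracts = {
    "a1": {"genes": {"TP53"}, "go": {"GO:0006915"}},
    "a2": {"genes": {"BAX"}, "go": {"GO:0006915"}},
    "a3": {"genes": {"BRCA1"}, "go": {"GO:0006281"}},
    "a4": {"genes": {"TP53"}, "go": {"GO:0006281"}},
}


def cooccurring_pairs(records):
    """Gene pairs mentioned together in at least one abstract."""
    pairs = set()
    for record in records.values():
        pairs.update(combinations(sorted(record["genes"]), 2))
    return pairs


def go_linked_pairs(records):
    """Gene pairs that share at least one GO term across abstracts."""
    genes_by_term = defaultdict(set)
    for record in records.values():
        for term in record["go"]:
            genes_by_term[term].update(record["genes"])
    pairs = set()
    for genes in genes_by_term.values():
        pairs.update(combinations(sorted(genes), 2))
    return pairs


direct = cooccurring_pairs(abstracts)   # empty: no abstract names two genes
indirect = go_linked_pairs(abstracts)   # pairs linked through a shared GO term
print(direct, indirect)
```

In this toy data every abstract names one gene, so the co-occurrence search finds no pairs, while grouping by GO term links two pairs. The Bayesian hierarchical model described in the chapter goes well beyond this simple set intersection; the sketch only motivates why GO terms are useful as a bridge.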

Notes

Acknowledgements

This work was supported by the NIH/NIGMS grant (R01 GM122078) and the NIH/NCI grant (R21 CA209848).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Zhenning Yu (1)
  • Jin Hyun Nam (1)
  • Daniel Couch (1)
  • Andrew Lawson (1)
  • Dongjun Chung (1)

  1. Department of Public Health Sciences, Medical University of South Carolina, Charleston, USA
