Identification of Pathway-Modulating Genes Using the Biomedical Literature Mining

  • Chapter
New Frontiers of Biostatistics and Bioinformatics

Part of the book series: ICSA Book Series in Statistics (ICSABSS)


Although the biomedical literature is a valuable resource for investigating relationships among genes, using it effectively for this purpose remains challenging, mainly because most abstracts contain information on a single gene, whereas the majority of approaches rely on the co-occurrence of genes within an abstract. To address this limitation, we recently developed a Bayesian hierarchical model, bayesGO, that identifies indirect relationships between genes by linking them through Gene Ontology (GO) terms. This approach also facilitates interpretation of the identified pathways by automatically associating relevant GO terms with each gene within a unified framework. In this book chapter, we illustrate the approach using the web interface GAIL, which provides PubMed literature mining results based on human gene entities and GO terms, along with the R package bayesGO, which implements the proposed Bayesian hierarchical model. The web interface GAIL is hosted online, and the R package bayesGO is publicly available on GitHub.
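
The limitation of co-occurrence described in the abstract can be illustrated with a minimal sketch. The toy corpus, the gene names (`geneA` to `geneD`), and the GO term assignments below are all made up for illustration and are not GAIL output. The code is also not the bayesGO model, which is a Bayesian hierarchical model; it only shows that when every abstract mentions a single gene, co-occurrence finds no gene pairs, while shared GO terms can still link genes indirectly.

```python
from itertools import combinations

# Hypothetical toy corpus: each abstract lists the genes and GO terms it mentions.
# Every abstract mentions exactly one gene, mirroring the situation in the abstract.
abstracts = [
    {"genes": {"geneA"}, "go": {"apoptotic process", "DNA repair"}},
    {"genes": {"geneB"}, "go": {"DNA repair"}},
    {"genes": {"geneC"}, "go": {"DNA repair", "cell cycle"}},
    {"genes": {"geneD"}, "go": {"cell cycle"}},
]


def cooccurring_pairs(docs):
    """Gene pairs that appear together in at least one abstract."""
    pairs = set()
    for doc in docs:
        for pair in combinations(sorted(doc["genes"]), 2):
            pairs.add(pair)
    return pairs


def go_profiles(docs):
    """Map each gene to the set of GO terms seen in abstracts mentioning it."""
    profiles = {}
    for doc in docs:
        for gene in doc["genes"]:
            profiles.setdefault(gene, set()).update(doc["go"])
    return profiles


def go_linked_pairs(docs):
    """Gene pairs that share at least one GO term, with the shared terms."""
    profiles = go_profiles(docs)
    linked = {}
    for a, b in combinations(sorted(profiles), 2):
        shared = profiles[a] & profiles[b]
        if shared:
            linked[(a, b)] = shared
    return linked


direct = cooccurring_pairs(abstracts)
indirect = go_linked_pairs(abstracts)

print("co-occurring pairs:", len(direct))
for pair, terms in sorted(indirect.items()):
    print(pair, "linked via", sorted(terms))
```

In this toy corpus, co-occurrence yields no pairs at all, whereas linking through GO terms recovers four pairs and reports the shared terms that explain each link. A statistical model such as bayesGO goes further by quantifying the uncertainty of such links rather than relying on a simple set overlap.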


  • Chung, D., Lawson, A., & Zheng, W. J. (2017). A statistical framework for biomedical literature mining. Statistics in Medicine, 36(22), 3461–3474.

  • Frijters, R., Heupers, B., van Beek, P., Bouwhuis, M., van Schaik, R., de Vlieg, J., et al. (2008). Copub: A literature-based keyword enrichment tool for microarray data analysis. Nucleic Acids Research, 36(suppl_2), W406–W410.

  • Jenssen, T. K., Lægreid, A., Komorowski, J., & Hovig, E. (2001). A literature network of human genes for high-throughput analysis of gene expression. Nature Genetics, 28(1), 21–28.

  • Koike, A., & Takagi, T. (2004). Gene/protein/family name recognition in biomedical literature. In Proceedings of BioLink 2004 Workshop: Linking Biological Literature, Ontologies and Databases: Tools for Users (Vol. 42, p. 56).

  • Liu, H., Hu, Z. Z., Zhang, J., & Wu, C. (2005). Biothesaurus: A web-based thesaurus of protein and gene names. Bioinformatics, 22(1), 103–105.

  • Mitsumori, T., Fation, S., Murata, M., Doi, K., & Doi, H. (2005). Gene/protein name recognition based on support vector machine using dictionary as features. BMC Bioinformatics, 6(1), S8.

  • Qin, T., Matmati, N., Tsoi, L. C., Mohanty, B. K., Gao, N., Tang, J., et al. (2014). Finding pathway-modulating genes from a novel Ontology Fingerprint-derived gene network. Nucleic Acids Research, 42(18), e138–e138.



This work was supported by the NIH/NIGMS grant (R01 GM122078) and the NIH/NCI grant (R21 CA209848).

Author information

Corresponding author

Correspondence to Zhenning Yu.


See Table 17.5 for GO term clusters and Tables 17.6–17.8 for gene clusters omitted in the main text.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Yu, Z., Nam, J.H., Couch, D., Lawson, A., Chung, D. (2018). Identification of Pathway-Modulating Genes Using the Biomedical Literature Mining. In: Zhao, Y., Chen, DG. (eds) New Frontiers of Biostatistics and Bioinformatics. ICSA Book Series in Statistics. Springer, Cham.
