Abstract
Although the biomedical literature is a valuable resource for investigating relationships among genes, using it effectively for this purpose remains challenging, mainly because most abstracts contain information about only a single gene, whereas the majority of approaches rely on the co-occurrence of genes within an abstract. To address this limitation, we recently developed bayesGO, a Bayesian hierarchical model that identifies indirect relationships between genes by linking them through gene ontology (GO) terms. This approach also facilitates interpretation of the identified pathways by automatically associating relevant GO terms with each gene within a unified framework. In this book chapter, we illustrate the approach using the web interface GAIL, which provides PubMed literature mining results based on human gene entities and GO terms, along with the R package bayesGO, which implements the proposed Bayesian hierarchical model. GAIL is currently hosted at http://chunglab.io/GAIL, and the R package bayesGO is publicly available at its GitHub webpage (https://dongjunchung.github.io/bayesGO/).
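The contrast between co-occurrence and GO-mediated linking can be made concrete with a toy sketch. The Python below is not the bayesGO model (which is a Bayesian hierarchical model implemented in R) and does not use the GAIL data; it is a plain set-overlap illustration. The abstracts, gene-to-GO assignments, and function names are all hypothetical and were made up for this example.

```python
from itertools import combinations

# Hypothetical toy corpus: each "abstract" lists the genes and GO terms it mentions.
# These assignments are illustrative only, not mined from PubMed.
abstracts = [
    {"genes": {"TP53"}, "go": {"apoptotic process", "DNA repair"}},
    {"genes": {"BRCA1"}, "go": {"DNA repair"}},
    {"genes": {"ATM"}, "go": {"DNA repair", "cell cycle checkpoint"}},
    {"genes": {"TP53", "MDM2"}, "go": {"apoptotic process"}},
    {"genes": {"INS"}, "go": {"glucose homeostasis"}},
]


def cooccurring_pairs(abstracts):
    """Gene pairs that appear together in at least one abstract."""
    pairs = set()
    for abstract in abstracts:
        for g1, g2 in combinations(sorted(abstract["genes"]), 2):
            pairs.add((g1, g2))
    return pairs


def go_profiles(abstracts):
    """Collect, for each gene, every GO term seen in the abstracts mentioning it."""
    profiles = {}
    for abstract in abstracts:
        for gene in abstract["genes"]:
            profiles.setdefault(gene, set()).update(abstract["go"])
    return profiles


def go_linked_pairs(profiles):
    """Gene pairs that share at least one GO term, with the shared terms."""
    linked = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        shared = profiles[g1] & profiles[g2]
        if shared:
            linked[(g1, g2)] = shared
    return linked


direct = cooccurring_pairs(abstracts)
profiles = go_profiles(abstracts)
indirect = go_linked_pairs(profiles)

print("Pairs found by co-occurrence:", sorted(direct))
for pair, terms in sorted(indirect.items()):
    print(pair, "linked via", sorted(terms))
```

In this toy corpus only one abstract mentions two genes, so co-occurrence yields a single pair, while the shared GO terms connect several genes that never appear in the same abstract and also name the term that links them. A statistical model such as bayesGO replaces this hard overlap rule with probabilistic inference; the sketch only shows why GO terms can serve as a bridge.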
Acknowledgements
This work was supported by the NIH/NIGMS grant (R01 GM122078) and the NIH/NCI grant (R21 CA209848).
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Yu, Z., Nam, J.H., Couch, D., Lawson, A., Chung, D. (2018). Identification of Pathway-Modulating Genes Using the Biomedical Literature Mining. In: Zhao, Y., Chen, DG. (eds) New Frontiers of Biostatistics and Bioinformatics. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-99389-8_17
DOI: https://doi.org/10.1007/978-3-319-99389-8_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99388-1
Online ISBN: 978-3-319-99389-8
eBook Packages: Mathematics and Statistics (R0)