RENET: A Deep Learning Approach for Extracting Gene-Disease Associations from Literature

Wu, Ye; Luo, Ruibang; Leung, Henry C. M.; Ting, Hing-Fung; Lam, Tak-Wah

doi:10.1007/978-3-030-17083-7_17

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 11467)

Included in the following conference series:

International Conference on Research in Computational Molecular Biology

Abstract

Over one million new biomedical articles are published every year. Efficient and accurate text-mining tools are urgently needed to automatically extract knowledge from these articles to support research and genetic testing. Among the extraction tasks, gene-disease association extraction has received the most attention. However, existing text-mining tools for extracting gene-disease associations have limited capacity, because they consider each sentence separately. Our experiments show that the best existing tools, such as BeFree and DTMiner, achieve at most 48% precision and 78% recall. In this study, we designed and implemented a deep learning approach, named RENET, which considers the correlation between the sentences in an article to extract gene-disease associations. Our method significantly improves precision and recall to 85.2% and 81.8%, respectively. The source code of RENET is available at https://bitbucket.org/alexwuhkucs/gda-extraction/src/master/.
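To put the precision and recall figures quoted above on a single scale, the short sketch below combines them into an F1 score (the harmonic mean of precision and recall). The abstract does not report F1; the values are derived here only from the quoted percentages, and the helper name `f1_score` is a hypothetical name chosen for illustration. Because the baseline figures are stated as "at most", the baseline F1 computed this way should be read as an upper bound.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, written as fractions.
# The baseline pair is an upper bound ("at most"), not a measured operating point.
reported = {
    "best existing tools (upper bound)": (0.48, 0.78),
    "RENET": (0.852, 0.818),
}

for name, (p, r) in reported.items():
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1_score(p, r):.3f}")
```

Under these assumptions the derived F1 is about 0.59 for the baseline upper bound and about 0.83 for RENET, which suggests the gain comes mainly from precision rather than recall.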

Acknowledgments

This work was supported by Hong Kong ITF Grant ITS/331/17FP and General Research Fund No. 27204518.

Author information

Authors and Affiliations

Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong
Ye Wu, Ruibang Luo, Henry C. M. Leung, Hing-Fung Ting & Tak-Wah Lam

Corresponding author

Correspondence to Tak-Wah Lam.

Editor information

Editors and Affiliations

Tufts University, Cambridge, MA, USA
Lenore J. Cowen

About this paper

Cite this paper

Wu, Y., Luo, R., Leung, H.C.M., Ting, H.-F., Lam, T.-W. (2019). RENET: A Deep Learning Approach for Extracting Gene-Disease Associations from Literature. In: Cowen, L. (ed.) Research in Computational Molecular Biology. RECOMB 2019. Lecture Notes in Computer Science, vol 11467. Springer, Cham. https://doi.org/10.1007/978-3-030-17083-7_17

DOI: https://doi.org/10.1007/978-3-030-17083-7_17
Published: 02 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17082-0
Online ISBN: 978-3-030-17083-7
eBook Packages: Computer Science, Computer Science (R0)
