Automatically Language Patterns Elicitation from Biomedical Literature

Alborzi, Seyed Ziaeddin

doi:10.1007/978-3-319-00951-3_15

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 225)

Abstract

The volume of published research articles has become overwhelming, and it continues to rise each day. This rapid growth, combined with the unstructured nature of natural-language text, has created a need for tools and methods that aid information extraction and make the extracted information more accessible and usable. In this work, we present an approach for acquiring language patterns from the biomedical literature. In our method, all possible patterns are generated (candidate enumeration), and those patterns that have a match in the training corpus are selected. Equipped with glossaries of gene and protein names plus a keyword database, we achieved a recall of 52.2% with a precision of 40.9%, identifying 321 Gene Ontology terms.
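The generate-then-filter idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the pattern format, so everything here is an assumption made for illustration: patterns are short ordered sequences over an `<ENTITY>` placeholder and a keyword list, a pattern "matches" a sentence when its symbols occur in that order, and the glossary, keywords, corpus and all function names are hypothetical toy values rather than the chapter's actual resources.

```python
from itertools import product

ENTITY = "<ENTITY>"  # hypothetical placeholder for any gene/protein name


def normalize(sentence, glossary):
    """Lowercase, strip simple punctuation, and replace glossary names with ENTITY."""
    tokens = sentence.lower().replace(".", "").replace(",", "").split()
    return [ENTITY if tok in glossary else tok for tok in tokens]


def enumerate_candidates(keywords, max_len=3):
    """Candidate enumeration: every sequence of 2..max_len symbols (assumed format)."""
    alphabet = [ENTITY] + sorted(keywords)
    for length in range(2, max_len + 1):
        yield from product(alphabet, repeat=length)


def matches(pattern, tokens):
    """True if the pattern symbols occur in order (gaps allowed) in the tokens."""
    remaining = iter(tokens)
    # `sym in remaining` consumes the iterator, which enforces the ordering.
    return all(sym in remaining for sym in pattern)


def select_patterns(corpus, glossary, keywords, max_len=3):
    """Keep only the candidates that match at least one training sentence."""
    sentences = [normalize(s, glossary) for s in corpus]
    return [
        pattern
        for pattern in enumerate_candidates(keywords, max_len)
        if any(matches(pattern, toks) for toks in sentences)
    ]


# Toy inputs, invented for this example only.
glossary = {"p53", "mdm2", "brca1"}
keywords = {"binds", "regulates"}
corpus = [
    "p53 binds MDM2 in the nucleus.",
    "BRCA1 regulates transcription.",
]

selected = select_patterns(corpus, glossary, keywords)
for pattern in selected:
    print(" ".join(pattern))
```

Exhaustive enumeration grows exponentially with the pattern length, which is why this sketch caps `max_len`; filtering against the training corpus then discards the large majority of candidates that never occur.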

Author information

Authors and Affiliations

Computer Engineering School, Nanyang Technological University, Singapore, Singapore
Seyed Ziaeddin Alborzi

Corresponding author

Correspondence to Seyed Ziaeddin Alborzi.

Editor information

Editors and Affiliations

Dept of Computer Engineering, KTO Karatay University, 130 Karatay, Konya, 42020, Turkey
Dhinaharan Nagamalai
School of Computing and Informatics, University of Louisiana at Lafayette, 214 Oliver Hall, Lafayette, 70504-4330, Louisiana, USA
Ashok Kumar
Dept. of Electrical &, Prairie View A&M University, Prairie View, 77446-0519, Texas, USA
Annamalai Annamalai

About this paper

Cite this paper

Alborzi, S.Z. (2013). Automatically Language Patterns Elicitation from Biomedical Literature. In: Nagamalai, D., Kumar, A., Annamalai, A. (eds) Advances in Computational Science, Engineering and Information Technology. Advances in Intelligent Systems and Computing, vol 225. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00951-3_15

DOI: https://doi.org/10.1007/978-3-319-00951-3_15
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00950-6
Online ISBN: 978-3-319-00951-3
eBook Packages: Engineering, Engineering (R0)
