Learning Recursive Patterns for Biomedical Information Extraction

  • Conference paper
Inductive Logic Programming (ILP 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4455)

Abstract

Text remains a largely unexploited source of biological information. Information Extraction (IE) techniques are needed to map this information into structured representations in which facts relating domain-relevant entities can be recognized automatically. In biomedical IE tasks, extracting patterns that model implicit relations among entities is particularly important, since biological systems intrinsically involve interactions among several entities. In this paper, we adopt an Inductive Logic Programming (ILP) approach to the discovery of mutually recursive patterns from text. Mutual recursion allows dependencies among entities to be explored in the data and extraction models to be applied in a context-sensitive manner. In particular, IE models are discovered in the form of classification rules that encode the conditions for filling a pre-defined information template. We report an application to a real-world dataset composed of publications selected to support biologists in the automatic annotation of a genomic database.
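
As a rough illustration of what mutually recursive extraction rules can look like, the sketch below labels tokens with two template slots whose rules refer to each other, and applies them by forward chaining until no new label is derived. Everything in it is an assumption made for illustration: the slot names `mutation` and `locus`, the regular expression, the trigger words, the toy sentence, and the function `fill_template` are hypothetical and hand-written, whereas the paper learns its rules with ILP and does not necessarily use these conditions.

```python
import re

# Hypothetical base pattern for the 'mutation' slot (e.g. "A100G").
MUTATION_PATTERN = re.compile(r"^[ACGT]\d+[ACGT]$")


def fill_template(tokens):
    """Assign slots to tokens with two mutually dependent, hand-written rules.

    Returns a dict mapping token index -> slot name.
    """
    labels = {}
    changed = True
    # Forward chaining: re-apply the rules until a pass adds no new label.
    while changed:
        changed = False
        for i, tok in enumerate(tokens):
            if i in labels:
                continue
            slot = None
            if MUTATION_PATTERN.match(tok):
                # Base rule: the token alone looks like a mutation.
                slot = "mutation"
            elif i >= 2 and tokens[i - 1] == "in" and labels.get(i - 2) == "mutation":
                # locus depends on mutation: "<mutation> in <locus>".
                slot = "locus"
            elif i >= 2 and tokens[i - 1] == "and" and labels.get(i - 2) == "locus":
                # mutation depends on locus: "<locus> and <mutation>".
                slot = "mutation"
            if slot is not None:
                labels[i] = slot
                changed = True
    return labels


tokens = "A100G in GENEA and 200del in GENEB".split()
result = fill_template(tokens)
print({tokens[i]: slot for i, slot in sorted(result.items())})
```

In this toy sentence, `200del` does not match the base pattern and is labelled only because the preceding locus was itself derived from an earlier mutation. The same phrase in isolation, `200del in GENEB`, yields no labels, which is the sense in which the rules apply in a context-sensitive way.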

Editor information

Stephen Muggleton, Ramon Otero, Alireza Tamaddoni-Nezhad

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berardi, M., Malerba, D. (2007). Learning Recursive Patterns for Biomedical Information Extraction. In: Muggleton, S., Otero, R., Tamaddoni-Nezhad, A. (eds) Inductive Logic Programming. ILP 2006. Lecture Notes in Computer Science, vol 4455. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73847-3_15

  • DOI: https://doi.org/10.1007/978-3-540-73847-3_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73846-6

  • Online ISBN: 978-3-540-73847-3

  • eBook Packages: Computer Science, Computer Science (R0)
