Ontology Based Information Extraction from Text

  • Vangelis Karkaletsis
  • Pavlina Fragkou
  • Georgios Petasis
  • Elias Iosif
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6050)

Abstract

Information extraction systems use ontologies to describe formally the domain knowledge on which they operate. This survey studies the contribution of ontologies to information extraction systems. We believe this will help in specifying a concrete methodology for ontology-based information extraction that exploits all levels of ontological knowledge: domain entities for named entity recognition; conceptual hierarchies for pattern generalization; properties and non-taxonomic relations for pattern acquisition; and the domain model itself for integrating extracted entities and relation instances, discovering implicit information, and detecting inconsistencies.
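
The levels of ontological knowledge named in the abstract can be illustrated with a minimal sketch. It is not taken from the chapter: the toy ontology (`PARENT`, `INSTANCES`, `RELATIONS`), the `locatedIn` relation, and the single "X is based in Y" pattern are all hypothetical, chosen only to show entity lookup, generalization over a concept hierarchy, and inconsistency detection through domain and range constraints.

```python
import re

# Hypothetical toy ontology, invented for illustration only.
# Taxonomy: concept -> parent concept.
PARENT = {
    "City": "Location",
    "Country": "Location",
    "Company": "Organization",
    "Location": "Entity",
    "Organization": "Entity",
}

# Domain entities: surface form -> concept (acts as a gazetteer).
INSTANCES = {
    "Athens": "City",
    "Greece": "Country",
    "Acme": "Company",
}

# Non-taxonomic relation: name -> (domain concept, range concept).
RELATIONS = {
    "locatedIn": ("Organization", "Location"),
}


def ancestors(concept):
    """Return the concept followed by all of its ancestors, nearest first."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept, target):
    """True if `concept` equals `target` or is one of its descendants."""
    return target in ancestors(concept)


def recognize_entities(text):
    """Gazetteer lookup: (surface form, concept) pairs in text order."""
    found = []
    for match in re.finditer(r"\b\w+\b", text):
        token = match.group(0)
        if token in INSTANCES:
            found.append((token, INSTANCES[token]))
    return found


def extract_relations(text):
    """Apply one hand-written pattern and check each candidate triple
    against the domain and range declared for `locatedIn`."""
    domain, range_ = RELATIONS["locatedIn"]
    accepted, rejected = [], []
    for match in re.finditer(r"(\w+) is based in (\w+)", text):
        subj, obj = match.group(1), match.group(2)
        subj_concept = INSTANCES.get(subj)
        obj_concept = INSTANCES.get(obj)
        triple = (subj, "locatedIn", obj)
        # The hierarchy lets a pattern written for Organization/Location
        # cover Company/City without listing every subconcept.
        if (subj_concept and obj_concept
                and is_a(subj_concept, domain)
                and is_a(obj_concept, range_)):
            accepted.append(triple)
        else:
            rejected.append(triple)  # violates the domain model
    return accepted, rejected
```

Run on the text "Acme is based in Athens. Greece is based in Acme.", the sketch accepts the first triple and rejects the second, because a Country is not an Organization in the toy hierarchy. A real system would replace the gazetteer and the single pattern with learned or curated resources.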

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Vangelis Karkaletsis (1)
  • Pavlina Fragkou (1)
  • Georgios Petasis (1)
  • Elias Iosif (1)

  1. Institute of Informatics and Telecommunications, National Center for Scientific Research (N.C.S.R.) “Demokritos”, Aghia Paraskevi, Greece
