Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 87)

Abstract

Most knowledge is available in unstructured texts; however, it must be represented and handled automatically to become truly useful for the construction of knowledge-based systems. Ontologies are an approach to knowledge representation capable of expressing a set of entities and their relationships, constraints, axioms and the vocabulary of a given domain. Ontology population aims to identify instances of the concepts, relationships and properties of an ontology. Manual population by domain experts and knowledge engineers is an expensive and time-consuming task, so automatic or semi-automatic approaches are needed. This article proposes a process for the semi-automatic population of ontologies from text, focusing on the application of natural language processing and information extraction techniques to acquire and classify ontology instances. Experiments using a legal corpus were conducted to evaluate the process. Initial results are promising and indicate that the approach can extract instances with high effectiveness.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Faria, C., Girardi, R. (2011). An Information Extraction Process for Semi-automatic Ontology Population. In: Corchado, E., Snášel, V., Sedano, J., Hassanien, A.E., Calvo, J.L., Ślȩzak, D. (eds) Soft Computing Models in Industrial and Environmental Applications, 6th International Conference SOCO 2011. Advances in Intelligent and Soft Computing, vol 87. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19644-7_34

  • DOI: https://doi.org/10.1007/978-3-642-19644-7_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19643-0

  • Online ISBN: 978-3-642-19644-7

  • eBook Packages: Engineering, Engineering (R0)
