An Unsupervised Method for Ontology Population from the Web

  • Conference paper
Advances in Artificial Intelligence – IBERAMIA 2012 (IBERAMIA 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7637)

Abstract

Knowledge engineers have had difficulty automatically constructing and populating domain ontologies, mainly because of the well-known knowledge acquisition bottleneck. In this paper, we attempt to alleviate this problem by proposing an iterative unsupervised approach to identifying and extracting ontological class instances from the Web. The proposed approach treats the Web as a large corpus and relies on a confidence-weighted metric that combines semantic measures and web-scale statistics as types of evidence. Moreover, our iterative method is able to learn, to some extent, domain-specific linguistic patterns for extracting ontological class instances. We obtained encouraging results for the final ranking of candidate instances, as well as an accuracy of up to 97% for the patterns found by our method.
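The abstract does not spell out the extraction patterns or the confidence metric, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single hand-written "such as" pattern for proposing candidate instances and a pointwise mutual information (PMI) score computed from page hit counts as the web-scale evidence. Every function name, snippet, and hit count below is hypothetical and chosen purely for illustration.

```python
import math
import re

# Leading run of capitalised words, e.g. "New York" in "New York is large".
PROPER_NOUN = re.compile(r"[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*")
# Separators inside an enumeration: ", ", ", and ", " and ", " or ".
SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_candidates(class_plural, snippets):
    """Propose instances from text matching '<class> such as X, Y and Z'."""
    cue = re.compile(
        re.escape(class_plural) + r"\s+such\s+as\s+([^.;:!?]+)", re.IGNORECASE
    )
    candidates = []
    for text in snippets:
        for match in cue.finditer(text):
            for item in SEPARATOR.split(match.group(1)):
                noun = PROPER_NOUN.match(item.strip())
                # Keep only capitalised phrases, without duplicates.
                if noun and noun.group(0) not in candidates:
                    candidates.append(noun.group(0))
    return candidates


def pmi(joint_hits, candidate_hits, class_hits, total_pages):
    """Log-2 PMI between a candidate and a class, estimated from hit counts."""
    if joint_hits == 0 or candidate_hits == 0 or class_hits == 0:
        return float("-inf")
    return math.log2(joint_hits * total_pages / (candidate_hits * class_hits))


def rank_candidates(candidates, hit_counts, class_hits, total_pages):
    """Sort candidates by PMI, highest first.

    hit_counts maps candidate -> (joint_hits, candidate_hits); in a real
    system these would come from a search engine, here they are made up.
    """
    scored = [
        (name, pmi(hit_counts[name][0], hit_counts[name][1], class_hits, total_pages))
        for name in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    snippets = ["Popular cities such as Paris, London and Rome attract tourists."]
    toy_counts = {"Paris": (8, 16), "London": (4, 16), "Rome": (8, 8)}  # hypothetical
    found = extract_candidates("cities", snippets)
    for name, score in rank_candidates(found, toy_counts, class_hits=32, total_pages=1024):
        print(name, score)
```

An iterative version would feed the top-ranked instances back into the corpus to discover new patterns, which is the part of the approach the abstract describes as learning domain-specific linguistic patterns; that loop is deliberately left out of this sketch.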


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tomaz, H., Lima, R., Emanoel, J., Freitas, F. (2012). An Unsupervised Method for Ontology Population from the Web. In: Pavón, J., Duque-Méndez, N.D., Fuentes-Fernández, R. (eds) Advances in Artificial Intelligence – IBERAMIA 2012. IBERAMIA 2012. Lecture Notes in Computer Science, vol 7637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34654-5_5

  • DOI: https://doi.org/10.1007/978-3-642-34654-5_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34653-8

  • Online ISBN: 978-3-642-34654-5

  • eBook Packages: Computer Science, Computer Science (R0)
