Automatic Acquisition of Attributes for Ontology Construction

  • Gaoying Cui
  • Qin Lu
  • Wenjie Li
  • Yirong Chen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5459)

Abstract

An ontology can be seen as a structure of concepts organized according to their relations. A concept is associated with a set of attributes that are themselves concepts in the ontology. Consequently, ontology construction amounts to acquiring concepts and their associated attributes through relations. Manual ontology construction is time-consuming, and the resulting ontologies are difficult to maintain. Corpus-based ontology construction methods must be able to distinguish concepts from concept instances. In this paper, a novel and simple method is proposed for automatically identifying concept attributes using Wikipedia as the corpus. The built-in {{Infobox}} template in Wikipedia is used to acquire concept attributes and to identify the semantic types of those attributes. Two simple induction rules are applied to improve performance. Experimental results show a precision of 92.5% for attribute acquisition and 80% for attribute type identification. This is a very promising result for automatic ontology construction.
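
As an illustration of the idea of reading attributes from an {{Infobox}}, the following is a minimal sketch and not the authors' implementation. It assumes the common wikitext layout in which each infobox parameter sits on its own line as "| name = value", and it treats the parameter names as candidate attributes. The function name extract_infobox_attributes and the sample article text are hypothetical; the induction rules and the attribute type identification described in the abstract are not modelled here.

```python
import re

# One infobox parameter per line, e.g. "| capital = [[Example City]]".
# This layout is an assumption; real articles can deviate from it.
PARAM_LINE = re.compile(r"^[ \t]*\|[ \t]*([^=|\n]+?)[ \t]*=[ \t]*(.*)$", re.MULTILINE)


def extract_infobox_attributes(wikitext):
    """Return {attribute name: raw value} for the first {{Infobox ...}} found."""
    start = wikitext.find("{{Infobox")  # case-sensitive match, a simplification
    if start == -1:
        return {}

    # Find the closing "}}" that matches the opening "{{", skipping over
    # nested templates such as {{convert|...}} by tracking brace depth.
    depth = 0
    i = start
    end = len(wikitext)
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                end = i - 2
                break
        else:
            i += 1

    body = wikitext[start + 2:end]
    return {name.strip(): value.strip() for name, value in PARAM_LINE.findall(body)}


# Hypothetical article text, used only to show the call.
sample = """{{Infobox country
| name = Example Land
| capital = [[Example City]]
| population = 1,234,567
| area_km2 = {{convert|100|km2}}
}}
'''Example Land''' is a country."""

attributes = extract_infobox_attributes(sample)
print(list(attributes))
```

The keys of the returned dictionary are the candidate attribute names for the concept described by the article; the raw values are kept unparsed so that a later step could inspect them.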

Keywords

Attribute acquisition, ontology construction, Wikipedia as resource source

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Gaoying Cui (1)
  • Qin Lu (1)
  • Wenjie Li (1)
  • Yirong Chen (1)
  1. Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
