Extracting Instances of Relations from Web Documents Using Redundancy

  • Viktor de Boer
  • Maarten van Someren
  • Bob J. Wielinga
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4011)


In this paper we describe our approach to a specific subtask of ontology population: the extraction of instances of relations. We present a generic approach with which we are able to extract information from documents on the Web. The method exploits redundancy of information to compensate for the loss of precision caused by the use of domain-independent extraction methods. We present the general approach and describe our implementation for a specific relation instance extraction task in the art domain. For this task, we describe experiments, discuss evaluation measures and present the results.
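
The redundancy idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical noisy extractor that has already produced candidate (subject, object) pairs per document, and it simply ranks each candidate by the number of documents that support it. The function name `rank_by_redundancy` and the sample art-domain pairs are invented for illustration only.

```python
from collections import Counter


def rank_by_redundancy(extractions_per_document):
    """Rank candidate relation instances by how many documents support them.

    extractions_per_document: a list with one entry per document; each entry
    lists the (subject, object) pairs a (hypothetical) extractor found there.
    """
    support = Counter()
    for candidates in extractions_per_document:
        # Count each candidate at most once per document, so a page that
        # repeats itself does not inflate the candidate's support.
        support.update(set(candidates))
    # Highest support first; ties are broken alphabetically for stable output.
    return sorted(support.items(), key=lambda item: (-item[1], item[0]))


# Made-up extractor output for three documents (illustrative data only).
documents = [
    [("Impressionism", "Claude Monet"), ("Impressionism", "Paris")],
    [("Impressionism", "Claude Monet"), ("Impressionism", "Pierre-Auguste Renoir")],
    [("Impressionism", "Claude Monet"), ("Impressionism", "Pierre-Auguste Renoir"),
     ("Impressionism", "Claude Monet")],
]

ranked = rank_by_redundancy(documents)
for candidate, count in ranked:
    print(candidate, count)
```

In this toy data the spurious pair appears in only one document, so it sinks to the bottom of the ranking even though the extractor itself made no attempt to filter it out. A real system would likely weight documents rather than count them equally; that weighting is left out here.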


Extraction Module, Relation Instance, Ontology Learning, Document Score, Seed List
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Viktor de Boer (1)
  • Maarten van Someren (1)
  • Bob J. Wielinga (1)
  1. Human-Computer Studies Laboratory, Informatics Institute, Universiteit van Amsterdam
