How the Minotaur Turned into Ariadne: Ontologies in Web Data Extraction

  • Tim Furche
  • Georg Gottlob
  • Xiaonan Guo
  • Christian Schallhart
  • Andrew Sellers
  • Cheng Wang
Conference paper

DOI: 10.1007/978-3-642-22233-7_2

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6757)
Cite this paper as:
Furche T., Gottlob G., Guo X., Schallhart C., Sellers A., Wang C. (2011) How the Minotaur Turned into Ariadne: Ontologies in Web Data Extraction. In: Auer S., Díaz O., Papadopoulos G.A. (eds) Web Engineering. ICWE 2011. Lecture Notes in Computer Science, vol 6757. Springer, Berlin, Heidelberg

Abstract

Humans require automated support to profit from the wealth of data now available on the web. To that end, the linked open data initiative and others have been asking data providers to publish structured, semantically annotated data. Small data providers, such as most UK real-estate agencies, are overburdened with this task, however: many are only starting to move from simple, table- or list-like directories to web applications with rich interfaces.

We argue that fully automated extraction of structured data can help resolve this dilemma. Ironically, automated data extraction has seen a recent revival thanks to the use of ontologies and linked open data to guide the extraction. First results from the DIADEM project illustrate that high-quality, fully automated data extraction at web scale is possible if we combine domain ontologies with a phenomenology describing the representation of domain concepts. We briefly summarise the DIADEM project and discuss a few preliminary results.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tim Furche (1)
  • Georg Gottlob (1)
  • Xiaonan Guo (1)
  • Christian Schallhart (1)
  • Andrew Sellers (1)
  • Cheng Wang (1)

  1. Department of Computer Science, University of Oxford, UK
