Using Ontologies for Extracting Product Features from Web Pages

  • Wolfgang Holzinger
  • Bernhard Krüpl
  • Marcus Herzog
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4273)


In this paper, we show how to use ontologies to bootstrap a knowledge acquisition process that extracts product information from tabular data on Web pages. Furthermore, we use logical rules to reason about product-specific properties and to derive higher-order knowledge about product features. We also explain the knowledge acquisition process, covering both ontological and procedural aspects. Finally, we give a qualitative and quantitative evaluation of our results.


Regular Expression, Domain Ontology, Semantic Concept, Table Structure, Table Extraction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Wolfgang Holzinger (1)
  • Bernhard Krüpl (1)
  • Marcus Herzog (1)
  1. Database and Artificial Intelligence Group, Vienna University of Technology, Wien, Austria
