In Praise of Laziness: A Lazy Strategy for Web Information Extraction

  • Rifat Ozcan
  • Ismail Sengor Altingovde
  • Özgür Ulusoy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7224)


Many Web information extraction algorithms are based on machine learning techniques. For such algorithms, we propose a lazy learning strategy that builds a specialized model for each test instance, with the aim of improving extraction accuracy and avoiding the disadvantages of constructing a single general model.
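The abstract does not spell out the extraction algorithm, so the following is only a minimal sketch of the general lazy learning idea: learning is deferred until a test instance arrives, and a small model is built from the training instances most similar to it. The function names, the bag-of-tokens features, the cosine similarity, the neighbourhood size k, and the toy data are all assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def featurize(text):
    """Turn a text snippet into a bag-of-tokens vector (illustrative feature choice)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def lazy_predict(train, test_text, k=3):
    """Build a model specialized to one test instance, then apply it.

    train is a non-empty list of (text, label) pairs; k is a hypothetical
    neighbourhood size, not a value from the paper.
    """
    x = featurize(test_text)
    # Lazy step: nothing is learned until the test instance is known.
    # Keep only the k training instances most similar to it.
    neighbours = sorted(train, key=lambda ex: cosine(featurize(ex[0]), x), reverse=True)[:k]
    # The "specialized model" here is one token profile per label,
    # built from the selected neighbours only.
    profiles = {}
    for text, label in neighbours:
        profiles.setdefault(label, Counter()).update(featurize(text))
    # Predict the label whose profile best matches the test instance.
    return max(profiles, key=lambda label: cosine(profiles[label], x))


# Toy training data, invented for this sketch.
train = [
    ("price : 25 usd", "price"),
    ("price : 40 usd", "price"),
    ("phone : 555 0100", "phone"),
    ("phone : 555 0199", "phone"),
    ("email : a@example.org", "email"),
]

print(lazy_predict(train, "price : 30 usd"))  # price
```

An eager learner would instead fit one general model over all of `train` before seeing any test instance; the sketch trades that single up-front cost for a small amount of work per test instance.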


Keywords: Information Extraction, Test Instance, Machine Learning Technique, Training Instance, Extraction Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Rifat Ozcan (1)
  • Ismail Sengor Altingovde (2)
  • Özgür Ulusoy (1)
  1. Computer Engineering Department, Bilkent University, Ankara, Turkey
  2. L3S Research Center, Hannover, Germany
