Evaluation of Web Page Representations by Content Through Clustering

  • Arantza Casillas
  • Víctor Fresno
  • M. Teresa González de Lena
  • Raquel Martínez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3246)

Abstract

Obtaining accurate information from web pages requires a suitable representation of this type of document. In this paper, we present the results of evaluating seven types of web page representations by means of a clustering process.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Arantza Casillas (1)
  • Víctor Fresno (2)
  • M. Teresa González de Lena (2)
  • Raquel Martínez (2)

  1. Dpt. Electricidad y Electrónica, UPV-EHU
  2. Dpt. Informática, Estadística y Telemática, URJC