Evaluation of Web Page Representations by Content Through Clustering

  • Arantza Casillas
  • Víctor Fresno
  • M. Teresa González de Lena
  • Raquel Martínez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3246)

Abstract

To obtain accurate information from web pages, a suitable representation of this type of document is required. In this paper, we present the results of evaluating seven types of web page representations by means of a clustering process.

Keywords

Cluster Process; Term Frequency; External Evaluation; Feature Reduction; Partition Cluster Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Arantza Casillas (1)
  • Víctor Fresno (2)
  • M. Teresa González de Lena (2)
  • Raquel Martínez (2)
  1. Dpt. Electricidad y Electrónica, UPV-EHU
  2. Dpt. Informática, Estadística y Telemática, URJC
