Evaluation of Web Page Representations by Content Through Clustering

  • Conference paper
String Processing and Information Retrieval (SPIRE 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3246)

Abstract

Obtaining accurate information from web pages requires a suitable representation of this type of document. In this paper, we present the results of evaluating seven types of web page representations by means of a clustering process.

Work supported by the Madrid Research Agency, project 07T/0030/2003 1.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Casillas, A., Fresno, V., de Lena, M.T.G., Martínez, R. (2004). Evaluation of Web Page Representations by Content Through Clustering. In: Apostolico, A., Melucci, M. (eds) String Processing and Information Retrieval. SPIRE 2004. Lecture Notes in Computer Science, vol 3246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30213-1_18

  • DOI: https://doi.org/10.1007/978-3-540-30213-1_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23210-0

  • Online ISBN: 978-3-540-30213-1

  • eBook Packages: Springer Book Archive
