Abstract
To obtain accurate information from web pages, a suitable representation of this type of document is required. In this paper, we present the results of evaluating seven types of web page representations by means of a clustering process.
Work supported by the Madrid Research Agency, project 07T/0030/2003 1.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Casillas, A., Fresno, V., de Lena, M.T.G., Martínez, R. (2004). Evaluation of Web Page Representations by Content Through Clustering. In: Apostolico, A., Melucci, M. (eds) String Processing and Information Retrieval. SPIRE 2004. Lecture Notes in Computer Science, vol 3246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30213-1_18
DOI: https://doi.org/10.1007/978-3-540-30213-1_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23210-0
Online ISBN: 978-3-540-30213-1
eBook Packages: Springer Book Archive