Partitional Clustering Experiments with News Documents

  • Arantza Casillas
  • Mayte de González Lena
  • Raquel Martínez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2588)

Abstract

We report experiments in clustering a news corpus. We used two partitional methods and varied two parameters of the clustering tool. In addition, we worked with both the whole document (news item) and representative parts of it, and obtained good results with a representative part. The experiments were carried out on news in Spanish and Basque so that the results in the two languages could be compared.
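
To make the setup concrete, the following is a minimal sketch of partitional clustering applied to whole documents and to a shortened part of each document. It is an illustration under stated assumptions, not the method used in the experiments: the abstract does not name the clustering tool, its parameters, or how the representative part is chosen. Here plain k-means with cosine similarity over term-frequency vectors stands in for the partitional method, the hypothetical helper `lead` takes the first few words as the "representative part", and the toy English documents are invented for the example.

```python
import math
from collections import Counter


def unit_vector(weights):
    """Scale a sparse term->weight mapping to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def dot(a, b):
    """Dot product of two sparse vectors; equals cosine for unit vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def lead(text, n_words):
    """Hypothetical 'representative part': the first n_words tokens."""
    return " ".join(text.split()[:n_words])


def kmeans(docs, k, max_iter=20):
    """Cluster docs into k groups; returns one cluster label per document."""
    vectors = [unit_vector(Counter(d.lower().split())) for d in docs]
    # Assumption: seed centroids with the first k documents (deterministic).
    centroids = [dict(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its most similar centroid.
        new_labels = [
            max(range(k), key=lambda j: dot(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # the partition is stable
        labels = new_labels
        # Update step: each centroid becomes the normalised sum of its members.
        for j in range(k):
            total = Counter()
            for v, label in zip(vectors, labels):
                if label == j:
                    total.update(v)
            if total:  # keep the old centroid if a cluster ends up empty
                centroids[j] = unit_vector(total)
    return labels


# Invented toy corpus: two sports items and two politics items.
news = [
    "football match goal team win",
    "election government vote parliament",
    "team goal league football player",
    "parliament vote minister government law",
]

whole = kmeans(news, k=2)                         # cluster full texts
parts = kmeans([lead(d, 3) for d in news], k=2)   # cluster first 3 words only
print(whole, parts)
```

On this toy corpus both runs produce the same partition, which is the kind of comparison the abstract describes; real news in Spanish or Basque would additionally need language-specific tokenisation and term weighting, which this sketch omits.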

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Arantza Casillas (1)
  • Mayte de González Lena (2)
  • Raquel Martínez (2)
  1. Dpt. de Electricidad y Electrónica, Facultad de Ciencias, Universidad del País Vasco, Spain
  2. Escuela Superior de CC. Experimentales y Tecnología, Universidad Rey Juan Carlos, Spain
