Evaluating Hierarchical Clustering of Search Results

  • Juan M. Cigarran
  • Anselmo Peñas
  • Julio Gonzalo
  • Felisa Verdejo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3772)

Abstract

We propose a goal-oriented evaluation measure, Hierarchy Quality, for hierarchical clustering algorithms applied to the task of organizing search results (such as the clusters generated by the Vivisimo search engine). Our metric considers the content of the clusters, their hierarchical arrangement, and the effort required to find relevant information by traversing the hierarchy starting from the top node. It compares the effort required to browse documents in a baseline ranked list with the minimum effort required to find the same amount of relevant information by browsing the hierarchy (which involves examining both documents and node descriptors).
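
The comparison described in the abstract can be illustrated with a small sketch. It is not the paper's definition of Hierarchy Quality: the cost model (one unit per document read, `descriptor_cost` units per child descriptor examined), the brute-force search over sets of visited nodes, and all names (`ranked_list_effort`, `min_hierarchy_effort`, the toy data) are assumptions made for illustration only.

```python
from itertools import combinations


def ranked_list_effort(ranking, relevant, n_wanted):
    """Documents read from the top of the list until n_wanted relevant ones are seen."""
    found = 0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            found += 1
            if found == n_wanted:
                return position
    return None  # the list does not contain enough relevant documents


def min_hierarchy_effort(children, docs, root, relevant, n_wanted, descriptor_cost=1.0):
    """Minimum effort over every set of visited nodes that is connected to the root.

    Assumed cost model (not taken from the paper): each distinct document in a
    visited node costs 1, and each child descriptor of a visited node costs
    descriptor_cost. Brute force, so only suitable for tiny hierarchies.
    """
    parent = {c: p for p, cs in children.items() for c in cs}
    others = [n for n in docs if n != root]
    best = None
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            visited = {root, *subset}
            # A node can only be reached through its parent.
            if any(parent[n] not in visited for n in subset):
                continue
            read = set().union(*(docs[n] for n in visited))
            if len(read & relevant) < n_wanted:
                continue
            descriptors = sum(len(children.get(n, [])) for n in visited)
            cost = len(read) + descriptor_cost * descriptors
            if best is None or cost < best:
                best = cost
    return best


# Toy data: two clusters under the top node, two relevant documents.
children = {"root": ["A", "B"], "A": [], "B": []}
docs = {"root": set(), "A": {"d1", "d2"}, "B": {"d3", "d4", "d5"}}
ranking = ["d3", "d1", "d4", "d5", "d2"]
relevant = {"d1", "d2"}

list_effort = ranked_list_effort(ranking, relevant, 2)
tree_effort = min_hierarchy_effort(children, docs, "root", relevant, 2)
print(list_effort, tree_effort)
```

In this toy case the ranked list requires reading five documents, while the cheapest path through the hierarchy reads two node descriptors and the two documents in cluster A. How the two efforts are combined into a single score is left to the paper itself.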

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Juan M. Cigarran (1)
  • Anselmo Peñas (1)
  • Julio Gonzalo (1)
  • Felisa Verdejo (1)
  1. Dept. Lenguajes y Sistemas Informáticos, E.T.S.I. Informática, UNED
