A New Method of Clustering Search Results Using Frequent Itemsets with Graph Structures

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 107)

Abstract

The representation of search results from the World Wide Web has received considerable attention in the database research community. Several systems have been proposed that cluster search results into meaningful semantic categories for presentation to the end user. This paper presents a novel clustering algorithm, based on frequent itemset mining over a graph structure, that efficiently generates search result clusters. A performance study shows that the algorithm is highly efficient and significantly outperforms previous approaches to clustering search results.
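
To illustrate the general idea of using frequent itemsets as search result clusters, the following is a minimal sketch. It is not the algorithm proposed in the paper: it uses brute-force enumeration of term sets rather than a graph structure, Apriori, or FP-growth. The function name `frequent_itemsets`, the parameters `min_support` and `max_size`, and the sample snippets are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(docs, min_support=2, max_size=2):
    """Map each frequent term set to the ids of the snippets containing it.

    Illustrative brute-force enumeration (an assumption of this sketch),
    not the graph-based method described in the abstract.
    """
    # Treat every snippet as a set of lowercase, whitespace-separated terms.
    term_sets = [set(doc.lower().split()) for doc in docs]

    # cover[itemset] = ids of the snippets that contain every term in itemset.
    cover = defaultdict(set)
    for doc_id, terms in enumerate(term_sets):
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(terms), size):
                cover[frozenset(combo)].add(doc_id)

    # Keep only itemsets supported by at least min_support snippets.
    return {items: ids for items, ids in cover.items() if len(ids) >= min_support}


# Hypothetical search result snippets for an ambiguous query.
snippets = [
    "jaguar car dealer",
    "jaguar car review",
    "jaguar cat habitat",
    "jaguar cat zoo",
]
clusters = frequent_itemsets(snippets)
```

In this sketch each frequent term set acts as a candidate cluster label and the snippets that contain it form the cluster, so the term set {"jaguar", "car"} groups the first two snippets and {"jaguar", "cat"} groups the last two. Enumerating all term combinations grows quickly with snippet length, which is one reason practical systems rely on more efficient mining structures.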

Keywords

Web clustering engine; Frequent itemsets mining; Hash table; Graph structure

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • I-Fang Su (1)
  • Yu-Chi Chung (2)
  • Chiang Lee (3)
  • Xuanyou Lin (3)

  1. Department of Information Management, Fotech, Kaohsiung, Taiwan
  2. Department of CSIE, CJCU, Tainan, Taiwan
  3. Department of CSIE, NCKU, Tainan, Taiwan
