Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results

  • Yaochun Huang
  • Hui Xiong
  • Weili Wu
  • Sam Y. Sung
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)

Abstract

Hyperclique patterns are groups of objects that are strongly related to each other. Specifically, the objects in a hyperclique pattern have a guaranteed level of global pairwise similarity to one another, as measured by the uncentered Pearson's correlation coefficient. Recent literature has provided an approach to discovering hyperclique patterns in data sets with binary attributes. In this paper, we introduce algorithms for mining maximal hyperclique patterns in large data sets containing quantitative attributes. An intuitive and simple solution is to partition quantitative attributes into binary attributes; however, partitioning can lose information. Instead, our approach is based on a normalization scheme and works directly on quantitative attributes. In addition, we adopt the algorithm structures of three popular association pattern mining algorithms and add a critical clique pruning technique. Finally, we compare the performance of these algorithms for finding quantitative maximal hyperclique patterns on real-world data sets.
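
The pairwise-similarity guarantee mentioned above can be illustrated on the binary case. The following is a minimal sketch, not the paper's quantitative algorithm: it assumes the h-confidence definition from the earlier binary-attribute literature (support of the whole pattern divided by the largest single-item support), and the function names `uncentered_pearson` and `h_confidence` are hypothetical. It does not implement the normalization scheme or the clique pruning technique, which the abstract does not specify.

```python
import math
from itertools import combinations


def uncentered_pearson(x, y):
    """Uncentered Pearson correlation (cosine) of two attribute columns."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen for this sketch: empty column has no similarity
    return dot / (norm_x * norm_y)


def h_confidence(columns):
    """h-confidence of a pattern over binary columns (assumed binary-case definition):
    count of rows containing every item, divided by the largest single-item count."""
    joint = sum(1 for row in zip(*columns) if all(row))
    max_single = max(sum(col) for col in columns)
    return joint / max_single if max_single else 0.0


# Toy binary data: each list is one attribute (item) over four objects.
a = [1, 1, 1, 0]
b = [1, 1, 0, 0]
c = [1, 1, 1, 1]

hc = h_confidence([a, b, c])
pairwise = [uncentered_pearson(x, y) for x, y in combinations([a, b, c], 2)]

# In the binary case, every pairwise similarity is at least the h-confidence.
print(hc, min(pairwise))
```

On this toy data, the smallest pairwise similarity is no lower than the pattern's h-confidence, which is the sense in which a threshold on h-confidence bounds pairwise similarity from below for binary attributes.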

Keywords

Frequent Itemset; Quantitative Attribute; Binary Attribute; Hyperclique Pattern; Maximal Frequent Itemset
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yaochun Huang (1)
  • Hui Xiong (2)
  • Weili Wu (1)
  • Sam Y. Sung (3)
  1. Computer Science Department, University of Texas, Dallas, USA
  2. MSIS Department, Rutgers University, USA
  3. Dept. of Computer Science, South Texas College, USA
