Tight Correlated Item Sets and Their Efficient Discovery

  • Lizheng Jiang
  • Dongqing Yang
  • Shiwei Tang
  • Xiuli Ma
  • Dehui Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4505)

Abstract

We study the problem of mining correlated patterns. Correlated patterns have an advantage over associations in that they cover not only frequent items but also rare items. Tight correlated item sets are a concise representation of correlated patterns in which the items are correlated with each other. Although finding such tight correlated item sets is helpful for applications, the algorithm's efficiency is critical, especially for high-dimensional databases. We therefore first prove two results, Lemma 1 and Lemma 2, and use them to design an optimized RSC (Regional-Searching-Correlations) algorithm. Furthermore, we estimate the amount of pruned search space for data with various support distributions, based on a probabilistic model. Experimental results demonstrate that the RSC algorithm is much faster than other similar algorithms.

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Lizheng Jiang (1)
  • Dongqing Yang (1)
  • Shiwei Tang (1, 2)
  • Xiuli Ma (2)
  • Dehui Zhang (2)

  1. School of Electronics Engineering and Computer Science, Peking University
  2. National Laboratory on Machine Perception, Peking University, Beijing 100871, China
