Discovering Overlapping Quantitative Associations by Density-Based Mining of Relevant Attributes

  • Thomas Van Brussel
  • Emmanuel Müller
  • Bart Goethals
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9616)

Abstract

Association rule mining is a widely used method for finding relationships in data and has been studied extensively in the literature. Most association rule mining methods, however, do not work well for numerical attributes. State-of-the-art quantitative association rule mining algorithms follow a common routine: (1) discretize the data and (2) mine for association rules. This two-step approach can be rather inaccurate because discretization partitions the data space into disjoint intervals and therefore misses rules that are present in overlapping intervals.
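The effect of a partition boundary can be illustrated with a small sketch. It is not taken from the paper: the toy data, the integer value scale, the bin edges, and the helper name `rule_stats` are all assumptions made for illustration. A dense interval straddles the boundary of two equal-width bins, so a rule that is strong on the interval itself looks weak in either bin.

```python
def rule_stats(records, lo, hi):
    """Support and confidence of the rule `lo <= x < hi  ->  y == 1`."""
    body = [y for x, y in records if lo <= x < hi]
    hits = sum(body)
    return hits / len(records), hits / len(body)


# Toy data on an integer scale 0..999 (all values are made up).
# 40 records with y == 1 are packed into the interval [400, 600).
dense = [(400 + 5 * i, 1) for i in range(40)]
# 60 records with y == 0 are spread over the whole attribute range.
background = [(16 * i + 3, 0) for i in range(60)]
records = dense + background

# The interval that actually carries the rule.
overlapping = rule_stats(records, 400, 600)
# Two equal-width bins whose shared boundary (500) cuts through it.
left_bin = rule_stats(records, 0, 500)
right_bin = rule_stats(records, 500, 1000)

print("interval [400, 600):", overlapping)
print("bin [0, 500):       ", left_bin)
print("bin [500, 1000):    ", right_bin)
```

With a minimum confidence of, say, 0.7 (again an assumed threshold), the rule on the interval [400, 600) would be reported, while each bin on its own has a confidence below 0.5 and would be discarded.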

In this paper, we explore the data for quantitative association rules hidden in overlapping regions of numeric data. Our method requires no discretization step and thus avoids the information loss caused by partitioning numeric attributes prior to mining. It uses a statistical test to select relevant attributes, detects relationships between dense intervals in these attributes, and finally combines them into quantitative association rules. We evaluate our method on synthetic and real data to show its efficiency and quality improvement compared to state-of-the-art methods.
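The three steps named above (attribute selection, dense interval detection, rule combination) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract does not say which statistical test or density criterion is used, so the sketch substitutes a chi-square test against a uniform histogram and a simple one-dimensional density rule in the spirit of DBSCAN. The function names, the parameters `eps`, `min_pts`, and `min_conf`, and the toy data are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def is_relevant(values, bins=10, critical=16.92):
    """Assumed test: chi-square statistic of an equal-width histogram against
    a uniform distribution. 16.92 is the 0.05 critical value for 9 degrees
    of freedom, so it only matches the default of 10 bins."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return False
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / (hi - lo) * bins), bins - 1)] += 1
    expected = len(values) / bins
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat > critical


def dense_intervals(values, eps, min_pts):
    """Assumed density rule: a value is a core point if at least min_pts
    values lie within eps of it; core points at most eps apart are merged
    into one interval. Border points are ignored for brevity."""
    xs = sorted(values)
    cores = [x for x in xs
             if bisect_right(xs, x + eps) - bisect_left(xs, x - eps) >= min_pts]
    merged = []
    for x in cores:
        if merged and x - merged[-1][1] <= eps:
            merged[-1][1] = x
        else:
            merged.append([x, x])
    return [(a, b) for a, b in merged]


def mine_rules(data, intervals_by_attr, min_conf):
    """Combine dense intervals of two different attributes into rules
    (attr_a in interval_a) -> (attr_b in interval_b)."""
    found = []
    for a, ivs_a in intervals_by_attr.items():
        for b, ivs_b in intervals_by_attr.items():
            if a == b:
                continue
            for alo, ahi in ivs_a:
                body = [r for r in data if alo <= r[a] <= ahi]
                for blo, bhi in ivs_b:
                    both = sum(1 for r in body if blo <= r[b] <= bhi)
                    conf = both / len(body)
                    if conf >= min_conf:
                        found.append((a, (alo, ahi), b, (blo, bhi),
                                      both / len(data), conf))
    return found


# Toy data (made up): x and y share a dense region, z is evenly spread.
cluster = [(400 + 5 * i, 700 + 5 * i) for i in range(40)]
noise = [(16 * i + 3, 16 * ((7 * i) % 60) + 8) for i in range(60)]
data = [{"x": x, "y": y, "z": 10 * k}
        for k, (x, y) in enumerate(cluster + noise)]

relevant = [a for a in ("x", "y", "z")
            if is_relevant([r[a] for r in data])]
intervals = {a: dense_intervals([r[a] for r in data], eps=10, min_pts=4)
             for a in relevant}
found = mine_rules(data, intervals, min_conf=0.8)
for rule in found:
    print(rule)
```

Because the intervals are derived from the data rather than from a fixed grid, intervals found for different attribute combinations are free to overlap, which a prior partitioning step would rule out.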

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Thomas Van Brussel (1)
  • Emmanuel Müller (1, 2)
  • Bart Goethals (1)
  1. University of Antwerp, Antwerp, Belgium
  2. Hasso-Plattner-Institute, Potsdam, Germany
