Abstract
Association rule mining is one of the most popular data mining techniques, but it usually produces many rules, some of which are neither interesting nor significant for the application at hand. In this paper, we present a systematic framework for verifying the discovered rules and give a rigorous statistical basis for it. The proposed strategy combines data mining and statistical measurement techniques, including redundancy analysis, sampling, and multivariate statistical analysis, to discard non-significant rules. A real-world dataset is used to demonstrate that the unified framework discards many redundant or non-significant rules while preserving the high accuracy of the rule set as a whole.
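To make the idea of statistically screening a rule concrete, here is a minimal sketch, not the authors' implementation. It scores a candidate rule "antecedent implies consequent" with a symmetric form of Goodman and Kruskal's tau computed on a 2x2 contingency table, which is how the Symmetrical Tau measure named in the chapter title is commonly defined. The function names, the toy transactions, and the choice to return 0.0 for degenerate tables are assumptions made for illustration only; the abstract does not specify any of them.

```python
from typing import List, Sequence, Set


def contingency_table(transactions: Sequence[Set[str]],
                      antecedent: Set[str],
                      consequent: Set[str]) -> List[List[int]]:
    """Count transactions by (antecedent present?, consequent present?).

    Row 0 / column 0 mean "present"; row 1 / column 1 mean "absent".
    """
    table = [[0, 0], [0, 0]]
    for items in transactions:
        i = 0 if antecedent <= items else 1
        j = 0 if consequent <= items else 1
        table[i][j] += 1
    return table


def symmetrical_tau(table: Sequence[Sequence[int]]) -> float:
    """Symmetric Goodman-Kruskal tau for a table of counts.

    Returns a value in [0, 1]: 0 for no association, 1 for perfect
    association. Degenerate tables (empty, or both variables constant)
    return 0.0, which is an assumption of this sketch.
    """
    n = sum(sum(row) for row in table)
    if n == 0:
        return 0.0
    p = [[count / n for count in row] for row in table]
    row_marginals = [sum(row) for row in p]
    col_marginals = [sum(col) for col in zip(*p)]

    sum_row_sq = sum(r * r for r in row_marginals)
    sum_col_sq = sum(c * c for c in col_marginals)
    denominator = 2.0 - sum_row_sq - sum_col_sq
    if denominator == 0:
        return 0.0

    # Sum of P(ij)^2 / P(+j) and of P(ij)^2 / P(i+), skipping empty marginals.
    by_col = sum(p[i][j] ** 2 / col_marginals[j]
                 for i in range(len(p))
                 for j in range(len(col_marginals))
                 if col_marginals[j] > 0)
    by_row = sum(p[i][j] ** 2 / row_marginals[i]
                 for i in range(len(p))
                 for j in range(len(col_marginals))
                 if row_marginals[i] > 0)
    return (by_col + by_row - sum_row_sq - sum_col_sq) / denominator


if __name__ == "__main__":
    # Hypothetical toy data, not the dataset used in the paper.
    toy = [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}]
    print(symmetrical_tau(contingency_table(toy, {"a"}, {"b"})))
```

A rule whose score falls below a chosen cut-off could then be treated as a candidate for removal; the cut-off itself would have to come from the application, since none is given here.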
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mohd Shaharanee, I.N., Hadzic, F., Dillon, T.S. (2009). Interestingness of Association Rules Using Symmetrical Tau and Logistic Regression. In: Nicholson, A., Li, X. (eds) AI 2009: Advances in Artificial Intelligence. AI 2009. Lecture Notes in Computer Science, vol 5866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10439-8_43
DOI: https://doi.org/10.1007/978-3-642-10439-8_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10438-1
Online ISBN: 978-3-642-10439-8
eBook Packages: Computer Science, Computer Science (R0)