Summary
Mining association rules from a database is now a fairly simple task: many algorithms have been developed to discover regularities in data. Analysing and interpreting the discovered rules is much harder, and can be practically impossible given the huge number of rules generated. In this paper we propose a three-step strategy to select only the interesting association rules after the mining process. The approach introduces statistical tests to prune logical implications that are not significant.
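The summary does not say which statistical tests the three steps use, so the following is only a minimal sketch of the general idea, under stated assumptions: each rule A → C is checked with a chi-square test of independence (1 degree of freedom) on the 2×2 table of antecedent versus consequent, and rules whose p-value is not below α are pruned. The function names `rule_p_value` and `prune`, the dictionary keys, and the optional Bonferroni-style division of α are hypothetical illustrations, not the authors' procedure.

```python
import math


def rule_p_value(n, n_a, n_c, n_ac):
    """p-value of a chi-square independence test (1 df) for a rule A -> C.

    n    : total number of transactions
    n_a  : transactions containing the antecedent A
    n_c  : transactions containing the consequent C
    n_ac : transactions containing both A and C
    Assumes 0 < n_a < n and 0 < n_c < n, so every expected count is positive.
    """
    observed = [
        [n_ac, n_a - n_ac],
        [n_c - n_ac, n - n_a - n_c + n_ac],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_c, n - n_c]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def prune(rules, n, alpha=0.05, bonferroni=False):
    """Keep only rules whose test is significant at level alpha.

    With bonferroni=True, alpha is divided by the number of rules tested
    (an illustrative correction for multiple tests, not taken from the paper).
    """
    threshold = alpha / len(rules) if (bonferroni and rules) else alpha
    return [
        r for r in rules
        if rule_p_value(n, r["n_a"], r["n_c"], r["n_ac"]) < threshold
    ]


# Hypothetical counts over 1000 transactions.
rules = [
    {"name": "independent", "n_a": 200, "n_c": 500, "n_ac": 100},
    {"name": "associated", "n_a": 200, "n_c": 500, "n_ac": 180},
]
kept = prune(rules, n=1000, alpha=0.05)
```

In this sketch the first rule has exactly the co-occurrence count expected under independence, so it is pruned, while the second is kept. The chi-square test is two-sided, so a real filter would also need to check the direction of the association.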
Notes
2. The Bonferroni effect arising from multiple tests performed on the same data set is under investigation, but it is not expected to substantially modify the structure of the proposed strategy or the idea behind it.
3. This constraint should not be a problem in a Data Mining application.
4. An α value of 0.05 is a typical choice in statistical hypothesis tests.
5. The generated rules contain at most two items in the antecedent and one item in the consequent.
Additional information
This research was supported by the “Ricerca Dipartimentale 1999” grant (Prof. C. Lauro). We are very grateful to Prof. Carlo Lauro for his helpful suggestions.
Cite this article
Bruzzese, D., Davino, C. Statistical Pruning of Discovered Association Rules. Computational Statistics 16, 387–398 (2001). https://doi.org/10.1007/s001800100074
DOI: https://doi.org/10.1007/s001800100074