Interestingness of discovered association rules in terms of neighborhood-based unexpectedness
One of the central problems in knowledge discovery is the development of good measures of interestingness of discovered patterns. With such measures, a user needs to manually examine only the more interesting rules, instead of each of a large number of mined rules. Previous proposals of such measures include rule templates, minimal rule cover, actionability, and unexpectedness in the statistical sense or against user beliefs.
In this paper we introduce neighborhood-based interestingness, which captures unexpectedness in terms of neighborhood-based parameters. We first present novel notions of distance between rules and of neighborhoods of rules. The neighborhood-based interestingness of a rule is then defined in terms of the pattern of fluctuation of confidences, or the density of mined rules, in some of its neighborhoods. Such interestingness can also be defined for sets of rules (e.g. plateaus and ridges) when their neighborhoods have certain properties. Interesting rules can be ranked by combining neighborhood-based characteristics, the support and confidence of the rules, and users' feedback. We discuss how to implement the proposed ideas and compare our work with related work. We also describe a few tendencies of change that are to be expected from rule structure alone, and that should be taken into account when considering unexpectedness. We concentrate on association rules and briefly discuss generalization to other types of rules.
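To make the neighborhood idea concrete, here is a minimal Python sketch. The abstract does not give the actual definitions, so everything below is an assumption for illustration only: the `Rule` class, the distance as a weighted sum of symmetric-difference sizes (with hypothetical weights `w_all`, `w_lhs`, `w_rhs`), the neighborhood as all other mined rules within a given `radius`, and the two scores `confidence_deviation` and `density`.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Rule:
    """An association rule antecedent -> consequent (hypothetical helper type)."""
    antecedent: frozenset
    consequent: frozenset
    confidence: float


def distance(r1, r2, w_all=1.0, w_lhs=1.0, w_rhs=1.0):
    """Assumed distance: weighted sizes of symmetric differences between
    the full item sets, the antecedents, and the consequents."""
    items1 = r1.antecedent | r1.consequent
    items2 = r2.antecedent | r2.consequent
    return (w_all * len(items1 ^ items2)
            + w_lhs * len(r1.antecedent ^ r2.antecedent)
            + w_rhs * len(r1.consequent ^ r2.consequent))


def neighborhood(rule, rules, radius):
    """All other mined rules within `radius` of `rule`."""
    return [r for r in rules if r != rule and distance(rule, r) <= radius]


def confidence_deviation(rule, rules, radius):
    """Absolute gap between a rule's confidence and the mean confidence
    of its neighbors; None when the neighborhood is empty."""
    nbrs = neighborhood(rule, rules, radius)
    if not nbrs:
        return None
    return abs(rule.confidence - mean(r.confidence for r in nbrs))


def density(rule, rules, radius):
    """Number of mined rules in the neighborhood (a sparsity indicator)."""
    return len(neighborhood(rule, rules, radius))


# Toy data, invented for illustration only.
a = Rule(frozenset({"bread"}), frozenset({"milk"}), 0.9)
b = Rule(frozenset({"bread", "butter"}), frozenset({"milk"}), 0.3)
c = Rule(frozenset({"bread", "jam"}), frozenset({"milk"}), 0.3)
d = Rule(frozenset({"beer"}), frozenset({"chips"}), 0.5)
rules = [a, b, c, d]
```

Under these assumptions, rule `a` has two neighbors at radius 2 (`b` and `c`), and its confidence stands well above theirs, which is the kind of fluctuation a neighborhood-based measure would flag. A rule with an unusually low `density` would be a candidate for the sparsity-based notion instead.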