Efficiently Identifying Exploratory Rules’ Significance
Efficiently discarding potentially uninteresting rules is an important research focus in exploratory rule discovery. Many researchers have presented algorithms that automatically remove potentially uninteresting rules using background knowledge and user-specified constraints. Applying a significance test to exploratory rules is desirable because it removes rules that appear interesting only by chance, and so gives users a more compact set of resulting rules. However, applying statistical tests to identify significant rules requires considerable computation and data access to obtain the necessary statistics, and this cost grows with the size of the database. In this paper, we propose two approaches for improving the efficiency of significant exploratory rule discovery. We also evaluate their effect experimentally in impact rule discovery, which is suitable for discovering exploratory rules in very large, dense databases.
Keywords: Exploratory rule discovery, impact rule, rule significance, interestingness measure