Data Mining pp 64-77

Part of the Lecture Notes in Computer Science book series (LNCS, volume 3755)

Efficiently Identifying Exploratory Rules’ Significance

  • Shiying Huang
  • Geoffrey I. Webb

Abstract

Efficiently discarding potentially uninteresting rules in exploratory rule discovery is an important research focus in data mining. Many researchers have presented algorithms that automatically remove potentially uninteresting rules using background knowledge and user-specified constraints. Assessing exploratory rules with a significance test is desirable because it removes rules that appear interesting only by chance, giving users a more compact set of resulting rules. However, applying statistical tests to identify significant rules requires considerable computation and data access to obtain the necessary statistics, and the cost grows with the size of the database. In this paper, we propose two approaches for improving the efficiency of significant exploratory rule discovery. We also evaluate their effect experimentally in impact rule discovery, which is suited to discovering exploratory rules in very large, dense databases.

Keywords

Exploratory rule discovery, impact rule, rule significance, interestingness measure

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Shiying Huang (1)
  • Geoffrey I. Webb (1)
  1. School of Computer Science and Software Engineering, Monash University, Melbourne, Australia
