Abstract
This paper presents a new general measure of rule interestingness. Many known measures, such as chi-square, Gini gain, and entropy gain, can be obtained from it by setting numerical parameters, including the amount of trust we place in the estimate of the probability distribution of the data. Moreover, we show that there is a continuum of measures with chi-square, Gini gain, and entropy gain as boundary cases. Our measure therefore generalizes both conditional and unconditional classical measures of interestingness. Properties and an experimental evaluation of the new measure are also presented.
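For orientation, the three classical measures named in the abstract can be computed from a contingency table of antecedent versus consequent values. The sketch below uses the usual textbook definitions (Pearson chi-square against independence, and impurity gain for the Gini index and Shannon entropy). It is not the generalized measure proposed in the paper; the function name `classical_measures` and the table layout are assumptions made for illustration only.

```python
import math


def classical_measures(table):
    """Textbook chi-square, Gini gain and entropy gain for a contingency table.

    table[i][j] is the number of records with antecedent value i and
    consequent value j (hypothetical layout chosen for this sketch).
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]

    # Pearson chi-square statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    def gini(counts):
        total = sum(counts)
        return 1.0 - sum((c / total) ** 2 for c in counts) if total else 0.0

    def entropy(counts):
        total = sum(counts)
        if not total:
            return 0.0
        return -sum((c / total) * math.log2(c / total) for c in counts if c)

    def gain(impurity):
        # Impurity of the consequent minus its expected impurity
        # once the antecedent value is known.
        conditional = sum(
            row_tot[i] / n * impurity(row) for i, row in enumerate(table)
        )
        return impurity(col_tot) - conditional

    return {
        "chi2": chi2,
        "gini_gain": gain(gini),
        "entropy_gain": gain(entropy),
    }


measures = classical_measures([[30, 10], [10, 30]])
```

All three quantities are zero when antecedent and consequent are independent in the table, and grow as the rows' class distributions diverge from the overall one.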
© 2001 Springer-Verlag Berlin Heidelberg
Jaroszewicz, S., Simovici, D.A. (2001). A General Measure of Rule Interestingness. In: De Raedt, L., Siebes, A. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2001. Lecture Notes in Computer Science, vol 2168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44794-6_21
DOI: https://doi.org/10.1007/3-540-44794-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42534-2
Online ISBN: 978-3-540-44794-8
eBook Packages: Springer Book Archive