Evaluating Learning Algorithms to Support Human Rule Evaluation with Predicting Interestingness Based on Objective Rule Evaluation Indices

  • Hidenao Abe
  • Shusaku Tsumoto
  • Miho Ohsaki
  • Takahira Yamaguchi
Part of the Studies in Computational Intelligence book series (SCI, volume 123)

Summary

In this paper, we evaluate learning algorithms for a rule evaluation support method that uses rule evaluation models based on objective indices for data mining post-processing. Post-processing of mined results is one of the key steps in a data mining process. However, it is difficult for human experts to evaluate several thousand rules mined from a large, noisy dataset in order to find the rarely included valuable ones. To reduce the cost of this rule evaluation task, we have developed a rule evaluation support method with rule evaluation models that are learned from a dataset. This dataset comprises objective index values for the mined classification rules and a human expert's evaluation of each rule. To evaluate the performance of learning algorithms for constructing the rule evaluation models, we carried out a case study on meningitis data mining as an actual problem. Furthermore, we also evaluated our method with twelve rule sets obtained from twelve UCI datasets. Based on these results, we show that our rule evaluation support method is applicable to supporting human experts.
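The idea of learning a rule evaluation model from objective indices plus expert evaluations can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration: the four indices (support, precision, recall, lift) are only a common subset and not necessarily the indices used in the chapter, the rule counts and expert labels are made up, and the single-threshold stump is a stand-in learner rather than any of the algorithms the chapter evaluates.

```python
from dataclasses import dataclass


@dataclass
class RuleCounts:
    """Contingency counts for a rule 'IF A THEN C' (hypothetical container)."""
    n: int     # total instances
    n_a: int   # instances matching the antecedent A
    n_c: int   # instances belonging to the consequent class C
    n_ac: int  # instances matching both A and C


def objective_indices(r: RuleCounts) -> dict:
    """Compute an illustrative subset of objective rule evaluation indices."""
    support = r.n_ac / r.n
    precision = r.n_ac / r.n_a if r.n_a else 0.0
    recall = r.n_ac / r.n_c if r.n_c else 0.0
    prior = r.n_c / r.n
    lift = precision / prior if prior else 0.0
    return {"support": support, "precision": precision,
            "recall": recall, "lift": lift}


def train_stump(rows, labels, positive="interesting"):
    """Return (accuracy, index_name, threshold) of the best single-index split.

    The stump predicts `positive` when the index value is >= threshold.
    """
    best = None
    for name in rows[0]:
        for t in sorted({row[name] for row in rows}):
            hits = sum((row[name] >= t) == (label == positive)
                       for row, label in zip(rows, labels))
            acc = hits / len(rows)
            if best is None or acc > best[0]:
                best = (acc, name, t)
    return best


def predict(model, row, positive="interesting", negative="not-interesting"):
    """Apply a trained stump to the index values of one rule."""
    _, name, t = model
    return positive if row[name] >= t else negative


# Made-up rules and made-up expert evaluations (assumptions, not chapter data).
rules = [
    RuleCounts(n=100, n_a=20, n_c=30, n_ac=18),
    RuleCounts(n=100, n_a=50, n_c=30, n_ac=16),
    RuleCounts(n=100, n_a=10, n_c=40, n_ac=9),
    RuleCounts(n=100, n_a=60, n_c=40, n_ac=25),
]
expert_labels = ["interesting", "not-interesting",
                 "interesting", "not-interesting"]

# Training dataset: one row of index values per rule, paired with its label.
training_rows = [objective_indices(r) for r in rules]
model = train_stump(training_rows, expert_labels)

# Predict the evaluation of a rule the expert has not yet seen.
new_rule = objective_indices(RuleCounts(n=100, n_a=20, n_c=30, n_ac=19))
print(model[1], predict(model, new_rule))
```

In this toy setup the model is used to pre-sort unseen rules so that an expert can look at the predicted "interesting" ones first; any classifier that accepts numeric attributes could take the place of the stump.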

Keywords

Data Mining, Post-processing, Rule Evaluation Support, Objective Rule Evaluation Index

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Hidenao Abe (1)
  • Shusaku Tsumoto (1)
  • Miho Ohsaki (2)
  • Takahira Yamaguchi (3)
  1. Department of Medical Informatics, Shimane University, School of Medicine, Shimane, Japan
  2. Faculty of Engineering, Doshisha University, Kyoto, Japan
  3. Faculty of Science and Technology, Keio University, Kanagawa, Japan
