Rule Induction: Combining Rough Set and Statistical Approaches

  • Wojciech Jaworski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5306)

Abstract

In this paper we propose a hybridisation of rough set concepts and statistical learning theory. We introduce new estimators for rule accuracy and coverage, which are based on the assumptions of statistical learning theory. We then construct a classifier that uses these estimators for rule induction. The estimators allow us to select rules that describe statistically significant dependencies in the data. We test the classifier on benchmark datasets and show its applications in KDD (knowledge discovery in databases).
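To make the quantities named in the abstract concrete, the sketch below computes the usual empirical accuracy and coverage of a single decision rule and attaches a distribution-free lower confidence bound to each estimate. It is only an illustration under stated assumptions: the abstract does not define the paper's estimators, so the one-sided Hoeffding bound used here is a stand-in rather than the author's method, and the function names `rule_stats` and `hoeffding_lower_bound` and the toy data are hypothetical. The bound also assumes i.i.d. samples and a rule fixed before looking at the data.

```python
import math


def rule_stats(rows, condition, decision):
    """Empirical accuracy and coverage of the rule `condition -> decision`.

    accuracy = |matched and decided| / |matched|
    coverage = |matched and decided| / |decided|
    """
    matched = [r for r in rows if condition(r)]
    decided = [r for r in rows if decision(r)]
    both = [r for r in matched if decision(r)]
    accuracy = len(both) / len(matched) if matched else 0.0
    coverage = len(both) / len(decided) if decided else 0.0
    return accuracy, coverage, len(matched), len(decided)


def hoeffding_lower_bound(estimate, n, delta=0.05):
    """Lower bound that holds with probability at least 1 - delta.

    Assumption (not from the paper): one-sided Hoeffding inequality,
    eps = sqrt(ln(1 / delta) / (2 * n)), for n i.i.d. values in [0, 1].
    """
    if n == 0:
        return 0.0
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return max(0.0, estimate - eps)


# Hypothetical toy decision table.
rows = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rain", "play": "yes"},
    {"outlook": "rain", "play": "no"},
    {"outlook": "overcast", "play": "yes"},
]

# Rule under test: outlook = sunny -> play = no.
acc, cov, n_matched, n_decided = rule_stats(
    rows,
    condition=lambda r: r["outlook"] == "sunny",
    decision=lambda r: r["play"] == "no",
)
print("accuracy", acc, "lower bound", hoeffding_lower_bound(acc, n_matched))
print("coverage", cov, "lower bound", hoeffding_lower_bound(cov, n_decided))
```

On a table this small both lower bounds collapse to zero, which shows why a significance-aware selection criterion matters: a rule that looks accurate on three matching objects is not yet supported by the data. With many matching objects the bound tightens at a rate of one over the square root of the sample size.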

Keywords

Rough sets, quality measures, accuracy, coverage, significance, rule induction, rule selection

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Wojciech Jaworski
    Faculty of Mathematics, Computer Science and Mechanics, Warsaw University, Warsaw, Poland
