Algorithms for Filtration of Unordered Sets of Regression Rules

  • Łukasz Wróbel
  • Marek Sikora
  • Adam Skowron
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7694)

Abstract

This paper presents six filtration algorithms for pruning unordered sets of regression rules. Three of the algorithms eliminate rules that cover similar subsets of examples, whereas the other three optimize the rule sets with respect to prediction accuracy. The effectiveness of the filtration algorithms was empirically tested with 5 different rule learning heuristics on 35 benchmark datasets. The results show that, depending on the filtration algorithm, the reduction in the number of rules ranges on average from 10% to 50% and in most cases causes no statistically significant degradation in prediction accuracy.
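The abstract does not specify the algorithms themselves, but the general idea behind the first group of filters (discarding rules that cover near-identical subsets of examples) can be sketched as follows. This is a minimal illustrative sketch, not the authors' method: the greedy scheme, the use of Jaccard similarity, and the `threshold` parameter are assumptions made here for illustration only.

```python
# Illustrative sketch (assumptions, not the paper's exact algorithms):
# greedily keep the best rules and drop any rule whose covered-example set
# overlaps too strongly (Jaccard similarity >= threshold) with a kept rule.

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two coverage sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def filter_by_coverage_overlap(rules, threshold=0.9):
    """
    rules: list of (quality, coverage) pairs, where `coverage` is the set of
           training-example indices the rule covers and `quality` is the value
           of some rule quality measure (higher is better).
    Returns the retained rules, ordered by decreasing quality.
    """
    kept = []
    # Examine the best rules first so that weaker near-duplicates are dropped.
    for quality, coverage in sorted(rules, key=lambda r: r[0], reverse=True):
        if all(jaccard(coverage, kept_cov) < threshold for _, kept_cov in kept):
            kept.append((quality, coverage))
    return kept

# Example with toy coverage sets: the second rule is a near-duplicate of the
# first and is removed; the third covers disjoint examples and is retained.
rules = [(0.80, {1, 2, 3, 4}), (0.78, {1, 2, 3, 4, 5}), (0.60, {6, 7, 8})]
print(filter_by_coverage_overlap(rules, threshold=0.7))
```

The second group of filters described in the abstract (optimizing the rule set for prediction accuracy) would instead evaluate candidate subsets against a validation criterion; that part is not sketched here.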

Keywords

rule-based regression · rule induction · rule filtration · rule quality measures

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Łukasz Wróbel ¹
  • Marek Sikora ¹ ²
  • Adam Skowron ¹
  1. Institute of Computer Science, Silesian University of Technology, Gliwice, Poland
  2. Institute of Innovative Technologies EMAG, Katowice, Poland
