Ensembles of Decision Rules for Solving Binary Classification Problems in the Presence of Missing Values

  • Jerzy Błaszczyński
  • Krzysztof Dembczyński
  • Wojciech Kotłowski
  • Roman Słowiński
  • Marcin Szeląg
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4259)

Abstract

In this paper, we consider an algorithm that generates an ensemble of decision rules. A single rule is treated as a specific, subsidiary base classifier in the ensemble that indicates only one of the decision classes. Experimental results have shown that the ensemble of decision rules is as efficient as other machine learning methods. Here we concentrate on a common problem in real-life data: the presence of missing attribute values. To deal with this problem, we experimented with different approaches inspired by the rough set approach to knowledge discovery. The results of those experiments are presented and discussed in the paper.
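
The idea of a rule acting as a base classifier that indicates only one decision class can be illustrated with a minimal Python sketch. This is not the algorithm from the paper: the names `Rule`, `classify`, the rule weights, the `default` class, and the attribute names are all hypothetical, and the treatment of missing values (a rule abstains when one of its conditions refers to a missing attribute) is just one plausible strategy chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# An example maps attribute names to values; None marks a missing value.
Example = Dict[str, Optional[float]]


@dataclass
class Rule:
    """A conjunction of elementary conditions that votes for one class only."""
    conditions: Dict[str, Callable[[float], bool]]
    decision: int        # +1 or -1: the single class this rule indicates
    weight: float = 1.0  # hypothetical vote strength

    def covers(self, x: Example) -> bool:
        # Assumption: a condition on a missing value counts as not satisfied,
        # so the rule abstains for that example.
        for attr, test in self.conditions.items():
            value = x.get(attr)
            if value is None or not test(value):
                return False
        return True

    def vote(self, x: Example) -> float:
        # A covering rule votes for its own class; otherwise it contributes 0.
        return self.decision * self.weight if self.covers(x) else 0.0


def classify(rules: List[Rule], x: Example, default: int = 1) -> int:
    """Sum the votes of all rules; fall back to `default` on a tie or no cover."""
    total = sum(rule.vote(x) for rule in rules)
    if total > 0:
        return 1
    if total < 0:
        return -1
    return default


rules = [
    Rule({"age": lambda v: v >= 50}, decision=1, weight=0.8),
    Rule({"bmi": lambda v: v < 25, "age": lambda v: v < 50}, decision=-1, weight=0.5),
    Rule({"bmi": lambda v: v >= 30}, decision=1, weight=0.4),
]

print(classify(rules, {"age": 60, "bmi": None}))
print(classify(rules, {"age": 30, "bmi": 22}))
print(classify(rules, {"age": None, "bmi": None}, default=-1))
```

In the first call only the first rule covers the example, because the other two depend on the missing `bmi` value and abstain. Other strategies, such as ignoring conditions on missing attributes instead of abstaining, would change which rules vote.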

Keywords

Decision Rule, Knowledge Discovery, Default Rule, Decision Class, Single Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jerzy Błaszczyński (1)
  • Krzysztof Dembczyński (1)
  • Wojciech Kotłowski (1)
  • Roman Słowiński (1, 2)
  • Marcin Szeląg (1)
  1. Institute of Computing Science, Poznań University of Technology, Poznań, Poland
  2. Institute for Systems Research, Polish Academy of Sciences, Warsaw, Poland
