Compact Ensemble Trees for Imbalanced Data

  • Yubin Park
  • Joydeep Ghosh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6713)

Abstract

This paper introduces a novel splitting criterion, parametrized by a scalar α, for building a class-imbalance-resistant ensemble of decision trees. The proposed criterion generalizes the information gain used in C4.5, and its extended form encompasses the Gini (CART) and DKM splitting criteria as well. Each decision tree in the ensemble is grown with a different splitting criterion, determined by a distinct value of α. Compared with other ensemble methods, the resulting ensemble exhibits improved performance across a variety of imbalanced datasets, even with a small number of trees.
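The abstract leaves the exact form of the α-parametrized criterion to the full text, so the sketch below is only a rough illustration of the idea, not the paper's method. It substitutes the Tsallis entropy family, a well-known α-parametrized impurity that recovers Shannon entropy (the basis of C4.5's information gain) as α → 1 and the Gini index (CART) at α = 2; the DKM criterion does not fall in this exact family. For brevity the ensemble members are depth-1 stumps rather than full trees, and the names `tsallis_impurity`, `best_stump`, and `AlphaStumpEnsemble` are hypothetical.

```python
import numpy as np

def tsallis_impurity(y, alpha):
    """Tsallis entropy of a label vector: recovers Shannon entropy as
    alpha -> 1 and the Gini index at alpha = 2 (an assumed stand-in
    for the paper's criterion)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    if np.isclose(alpha, 1.0):
        return -np.sum(p * np.log(p))  # Shannon limit
    return (1.0 - np.sum(p ** alpha)) / (alpha - 1.0)

def best_stump(X, y, alpha):
    """Exhaustive search for the (feature, threshold) split that
    maximizes the alpha-impurity reduction."""
    n, d = X.shape
    parent = tsallis_impurity(y, alpha)
    best_gain, best_split = -np.inf, None
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:  # thresholds keep both sides nonempty
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            child = (len(left) * tsallis_impurity(left, alpha) +
                     len(right) * tsallis_impurity(right, alpha)) / n
            if parent - child > best_gain:
                best_gain, best_split = parent - child, (j, t)
    return best_split

class AlphaStumpEnsemble:
    """One stump per alpha value; members vote by majority.
    Assumes binary labels coded as 0/1."""
    def __init__(self, alphas=(0.5, 1.0, 2.0)):
        self.alphas = alphas
        self.stumps = []

    def fit(self, X, y):
        for a in self.alphas:
            j, t = best_stump(X, y, a)
            # Each leaf predicts the majority class on its side of the split.
            left_label = np.bincount(y[X[:, j] <= t]).argmax()
            right_label = np.bincount(y[X[:, j] > t]).argmax()
            self.stumps.append((j, t, left_label, right_label))
        return self

    def predict(self, X):
        votes = np.array([np.where(X[:, j] <= t, ll, rl)
                          for j, t, ll, rl in self.stumps])
        return (votes.mean(axis=0) >= 0.5).astype(int)
```

Varying α across members diversifies the ensemble without resampling or reweighting the data, which is what makes it plausible to reach competitive performance with only a small number of trees, as the abstract claims.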



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yubin Park (1)
  • Joydeep Ghosh (1)

  1. Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, USA
