Mining Multiple Class Imbalanced Datasets Using a Specialized Balancing Algorithm and the Adaboost Technique

  • Conference paper
Computational Collective Intelligence (ICCCI 2023)

Abstract

In this paper, we propose an ensemble classifier that extends a specialized bicriteria balancing algorithm originally proposed by the authors for binary imbalanced classification. The approach uses two specialized criteria for oversampling: classification potential and distance from the borderline between minority and majority instances. To mine multiclass imbalanced datasets, the bicriteria oversampling algorithm was adapted to multiclass problems using the one-versus-one (OVO) approach and the AdaBoost technique. We evaluate the proposed ensemble classifier against several state-of-the-art balancing algorithms. The computational experiment shows very good performance of the proposed approach.
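As a rough illustration of the one-versus-one decomposition mentioned in the abstract, the sketch below trains one binary model per pair of classes, balances each pair before training, and predicts by majority vote. It is not the authors' method. The helper names (`random_oversample`, `NearestCentroid`, `fit_ovo`, `predict_ovo`) are hypothetical, plain random oversampling stands in for the bicriteria oversampling algorithm, and a toy nearest-centroid learner stands in for the AdaBoost-boosted base classifier so that the example needs only the standard library.

```python
import random
from collections import Counter
from itertools import combinations


def random_oversample(X, y, rng):
    """Duplicate random instances of smaller classes until all classes match
    the largest one. Stand-in only: the paper uses bicriteria oversampling."""
    counts = Counter(y)
    target = max(counts.values())
    X_bal, y_bal = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_bal.append(rng.choice(pool))
            y_bal.append(label)
    return X_bal, y_bal


class NearestCentroid:
    """Toy base learner (assumption: replaces a boosted classifier)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sqdist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: sqdist(self.centroids[lab]))


def fit_ovo(X, y, seed=0):
    """Train one balanced binary model for every pair of classes."""
    rng = random.Random(seed)
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        pair = [(x, lab) for x, lab in zip(X, y) if lab in (a, b)]
        Xp = [x for x, _ in pair]
        yp = [lab for _, lab in pair]
        Xb, yb = random_oversample(Xp, yp, rng)
        models[(a, b)] = NearestCentroid().fit(Xb, yb)
    return models


def predict_ovo(models, x):
    """Each pairwise model casts one vote; the most voted class wins."""
    votes = Counter(m.predict_one(x) for m in models.values())
    return votes.most_common(1)[0][0]


# Small imbalanced three-class toy dataset (made up for illustration).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8),
     (5, 5), (5, 6), (10, 0)]
y = ["A"] * 6 + ["B"] * 2 + ["C"]
models = fit_ovo(X, y)
```

With k classes, this decomposition trains k(k-1)/2 pairwise models, so each balancing step only has to handle a two-class problem, which is the setting the binary oversampling algorithm was designed for.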


Author information

Corresponding author

Correspondence to Piotr Jedrzejowicz.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Jedrzejowicz, J., Jedrzejowicz, P. (2023). Mining Multiple Class Imbalanced Datasets Using a Specialized Balancing Algorithm and the Adaboost Technique. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2023. Lecture Notes in Computer Science, vol 14162. Springer, Cham. https://doi.org/10.1007/978-3-031-41456-5_62

  • DOI: https://doi.org/10.1007/978-3-031-41456-5_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41455-8

  • Online ISBN: 978-3-031-41456-5

  • eBook Packages: Computer Science, Computer Science (R0)
