Learning from Imbalanced Data Streams Using Rotation-Based Ensemble Classifiers

  • Conference paper
Computational Collective Intelligence (ICCCI 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14162)

Abstract

In this paper, the problem of learning from imbalanced data streams is considered. To solve this problem, an approach is presented based on the processing of data chunks, which are formed using over-sampling and under-sampling. The final classification output is determined using an ensemble approach, which is supported by the rotation technique to introduce more diversification into the pool of base classifiers and increase the final performance of the system. The proposed approach is called Weighted Ensemble with one-class Classification and Over-sampling and Instance selection (WECOI). It is validated experimentally using several selected benchmarks, and some results are presented and discussed. The paper concludes with a discussion of future research directions.
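
The pipeline outlined in the abstract (resampling within each data chunk, rotating the feature space to diversify base classifiers, and combining their outputs) can be illustrated with a minimal sketch. This is not the WECOI algorithm: the one-class base classifiers, instance selection and ensemble weighting of the paper are not reproduced. Random over-/under-sampling, random plane (Givens) rotations, decision stumps and an unweighted majority vote are stand-ins assumed here purely for illustration, and all function names are hypothetical. The sketch also assumes binary labels 0/1, at least two features, and that every chunk contains both classes.

```python
import math
import random
from collections import Counter

def balance_chunk(chunk, rng):
    """Under-sample the larger class and over-sample the smaller one
    so that each class ends up with the mean class size."""
    by_label = {}
    for x, y in chunk:
        by_label.setdefault(y, []).append(x)
    target = len(chunk) // len(by_label)
    balanced = []
    for y, xs in by_label.items():
        if len(xs) >= target:      # under-sampling without replacement
            picked = rng.sample(xs, target)
        else:                      # over-sampling by random duplication
            picked = xs + [rng.choice(xs) for _ in range(target - len(xs))]
        balanced.extend((x, y) for x in picked)
    return balanced

def random_rotation(n_features, rng):
    """Draw a random feature plane (i, j) and a random angle."""
    i, j = rng.sample(range(n_features), 2)
    return i, j, rng.uniform(0.0, 2.0 * math.pi)

def rotate(x, rot):
    """Apply the plane rotation to one feature vector."""
    i, j, theta = rot
    c, s = math.cos(theta), math.sin(theta)
    out = list(x)
    out[i] = c * x[i] - s * x[j]
    out[j] = s * x[i] + c * x[j]
    return out

def fit_stump(data):
    """Axis-aligned stump: threshold at the midpoint of the two class
    means, on the feature with the best training accuracy."""
    best = None
    for f in range(len(data[0][0])):
        mean = {}
        for label in (0, 1):
            vals = [x[f] for x, y in data if y == label]
            mean[label] = sum(vals) / len(vals)
        thr = (mean[0] + mean[1]) / 2.0
        high = 1 if mean[1] >= mean[0] else 0   # label predicted above thr
        acc = sum((high if x[f] > thr else 1 - high) == y for x, y in data)
        if best is None or acc > best[0]:
            best = (acc, f, thr, high)
    return best[1:]

def predict_stump(stump, x):
    f, thr, high = stump
    return high if x[f] > thr else 1 - high

def train_ensemble(chunk, n_members=5, seed=0):
    """Each member sees its own resampled, rotated copy of the chunk."""
    rng = random.Random(seed)
    n_features = len(chunk[0][0])
    members = []
    for _ in range(n_members):
        rot = random_rotation(n_features, rng)
        data = [(rotate(x, rot), y) for x, y in balance_chunk(chunk, rng)]
        members.append((rot, fit_stump(data)))
    return members

def predict(members, x):
    """Unweighted majority vote over the rotated base classifiers."""
    votes = Counter(predict_stump(s, rotate(x, rot)) for rot, s in members)
    return votes.most_common(1)[0][0]

# Synthetic imbalanced chunk: 90 majority (label 0), 10 minority (label 1).
data_rng = random.Random(42)
chunk = [([data_rng.gauss(0, 1), data_rng.gauss(0, 1)], 0) for _ in range(90)]
chunk += [([data_rng.gauss(3, 1), data_rng.gauss(3, 1)], 1) for _ in range(10)]
ensemble = train_ensemble(chunk)
print(predict(ensemble, [3.0, 3.0]), predict(ensemble, [0.0, 0.0]))
```

Stumps are used because they are axis-aligned, so rotating the feature space changes the splits they can find; a distance-based learner would be unaffected by an orthogonal rotation and would gain no diversity from it.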

Notes

  1. In [4], it was shown that the use of ENN was superior to CNN, which held true for all of the datasets considered.

  2. For OUOB, OB and Learn++.NIE, the Hoeffding tree was used for the base classifiers, and the base classifier pool was set to 10.

References

  1. Bernardo, A., Valle, E.D.: An extensive study of C-SMOTE, a continuous synthetic minority oversampling technique for evolving data streams. Expert Syst. Appl. 196, 116630 (2022). https://doi.org/10.1016/j.eswa.2022.116630

  2. Khamassi, I., Sayed Mouchaweh, M., Hammami, M., Ghédira, K.: Discussion and review on evolving data streams and concept drift adapting. Evol. Syst. 9(1), 1–23 (2018)

  3. Shankar, S., Herman, B., Parameswaran, A.G.: Rethinking streaming machine learning evaluation. arXiv (2022). https://doi.org/10.48550/arxiv.2205.11473

  4. Czarnowski, I.: Weighted ensemble with one-class classification and over-sampling and instance selection (WECOI): an approach for learning from imbalanced data streams. J. Comput. Sci. 61(1), 101614 (2022). https://doi.org/10.1016/j.jocs.2022.101614

  5. Benczúr, A.A., Kocsis, L., Pálovics, R.: Online machine learning in big data streams. arXiv (2018). https://doi.org/10.48550/ARXIV.1802.05872

  6. Gomes, H.M., Read, J., Bifet, A., Barddal, J.P., Gama, J.: Machine learning for streaming data: state of the art, challenges, and opportunities. ACM SIGKDD Explor. Newsl. 21(2), 6–22 (2019). https://doi.org/10.1145/3373464.3373470

  7. Ghaderi-Zefrehi, H., Altınçay, H.: Imbalance learning using heterogeneous ensembles. Expert Syst. Appl. 142, 113005 (2020). https://doi.org/10.1016/j.eswa.2019.113005

  8. You, G.-R., Shiue, Y.-R., Yeh, W.-Ch., Chen, X.-L., Chen, Ch.-M.: A weighted ensemble learning algorithm based on diversity using a novel particle swarm optimization approach. Algorithms 13(10), 255 (2020). https://doi.org/10.3390/a13100255

  9. Shiue, Y.-R., You, G.-R., Su, Ch.-T., Chen, H.: Balancing accuracy and diversity in ensemble learning using a two-phase artificial bee colony approach. Appl. Soft Comput. 105, 107212 (2021). https://doi.org/10.1016/j.asoc.2021.107212

  10. Jamalinia, H., Khalouei, S., Rezaie, V., Nejatian, S., Bagheri-Fard, K., Parvin, H.: Diverse classifier ensemble creation based on heuristic dataset modification. J. Appl. Stat. 45(7), 1209–1226 (2018). https://doi.org/10.1080/02664763.2017.1363163

  11. Krawczyk, B., Minku, L.L., Gama, J., Stefanowski, J., Woźniak, M.: Ensemble learning for data stream analysis: a survey. Inf. Fusion 37, 132–156 (2017). https://doi.org/10.1016/j.inffus.2017.02.004

  12. Czarnowski, I., Jędrzejowicz, P.: An approach to data reduction for learning from big datasets: integrating stacking, rotation, and agent population learning techniques. Complexity 2018, 7404627 (2018)

  13. Woźniak, M., Cal, P., Cyganek, B.: The influence of a classifiers’ diversity on the quality of weighted aging ensemble. In: Nguyen, N.T., Attachoo, B., Trawinski, B., Somboonviwat, K. (eds.) ACIIDS 2014. LNAI, vol. 8398, pp. 90–99. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-319-05458-2_10

  14. Adachi, K.: Rotation techniques. In: Adachi, K. (ed.) Matrix-Based Introduction to Multivariate Data Analysis, pp. 193–205. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-4103-2_13

  15. Rodríguez, J.J., Alonso, C.J.: Rotation-based ensembles. In: Conejo, R., Urretavizcaya, M., Pérez-de-la-Cruz, J.-L. (eds.) CAEPIA/TTIA 2003. LNCS (LNAI), vol. 3040, pp. 498–506. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25945-9_49

  16. Xia, J.: Multiple classifier systems for the classification of hyperspectral data. Ph.D. thesis, Université de Grenoble (2014)

  17. Czarnowski, I., Jędrzejowicz, P.: Ensemble online classifier based on the one-class base classifiers for mining data streams. Cybern. Syst. 46(1–2), 51–68 (2015). https://doi.org/10.1080/01969722.2015.1007736

  18. Czarnowski, I., Martins, D.M.L.: Impact of clustering on a synthetic instance generation in imbalanced data streams classification. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds.) ICCS 2022. LNCS, vol. 13351, pp. 586–597. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-08754-7_63

  19. Bifet, A., Holmes, G., Kirkby, R., Pfahringer, B.: MOA: massive online analysis. J. Mach. Learn. Res. 11, 1601–1604 (2010)

  20. Oza, N.C.: Online bagging and boosting. In: Proceedings of the 2005 IEEE International Conference on Systems, Man and Cybernetics, Waikoloa, HI, USA, 10–12 October 2005, vol. 2343, pp. 2340–2345 (2005)

  21. Wang, S., Minku, L.L., Yao, X.: Dealing with multiple classes in online class imbalance learning. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016) (2016)

  22. Ditzler, G., Polikar, R.: Incremental learning of concept drift from streaming imbalanced data. IEEE Trans. Knowl. Data Eng. 25(10), 2283–2301 (2013). https://doi.org/10.1109/TKDE.2012.136

  23. Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: Proceedings of 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 226–235 (2003). https://doi.org/10.1145/956750.956778

Author information

Corresponding author

Correspondence to Ireneusz Czarnowski.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Czarnowski, I. (2023). Learning from Imbalanced Data Streams Using Rotation-Based Ensemble Classifiers. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2023. Lecture Notes in Computer Science (LNAI), vol 14162. Springer, Cham. https://doi.org/10.1007/978-3-031-41456-5_60

  • DOI: https://doi.org/10.1007/978-3-031-41456-5_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41455-8

  • Online ISBN: 978-3-031-41456-5

  • eBook Packages: Computer Science, Computer Science (R0)
