Abstract
This paper considers the problem of learning from imbalanced data streams. The presented approach processes the stream as data chunks, which are formed using over-sampling and under-sampling. The final classification output is determined by an ensemble, supported by the rotation technique, which introduces more diversity into the pool of base classifiers and increases the final performance of the system. The proposed approach is called Weighted Ensemble with one-class Classification and Over-sampling and Instance selection (WECOI). It is validated experimentally on several selected benchmarks, and the results are presented and discussed. The paper concludes with a discussion of future research directions.
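To make the pipeline described in the abstract more concrete, the following is a minimal illustrative sketch and not the WECOI implementation. It assumes binary labels 0/1 and swaps in deliberately simple stand-ins: random duplication of minority instances in place of the paper's over-sampling, a random orthogonal matrix in place of a learned rotation, an axis-aligned decision stump in place of the one-class base classifiers, and an unweighted majority vote in place of the weighted ensemble. All function and class names are hypothetical.

```python
import numpy as np

def random_rotation(n_features, rng):
    """Draw a random orthogonal matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n_features, n_features)))
    return q * np.sign(np.diag(r))  # flip column signs; q stays orthogonal

def oversample_minority(X, y, rng):
    """Duplicate random minority instances until both classes have equal size."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[counts.argmin()]
    deficit = counts.max() - counts.min()
    idx = rng.choice(np.flatnonzero(y == minority), size=deficit, replace=True)
    return np.vstack([X, X[idx]]), np.concatenate([y, y[idx]])

class Stump:
    """Axis-aligned threshold on a single feature (assumes labels 0 and 1)."""
    def fit(self, X, y):
        mean0, mean1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
        thresholds = (mean0 + mean1) / 2.0
        signs = np.where(mean1 >= mean0, 1.0, -1.0)
        preds = (signs * (X - thresholds) > 0).astype(int)
        accuracy = (preds == y[:, None]).mean(axis=0)
        self.feature_ = int(accuracy.argmax())  # keep the best single feature
        self.threshold_ = thresholds[self.feature_]
        self.sign_ = signs[self.feature_]
        return self

    def predict(self, X):
        return (self.sign_ * (X[:, self.feature_] - self.threshold_) > 0).astype(int)

def fit_rotation_ensemble(X, y, n_members, rng):
    """Train each member on a differently rotated copy of the same chunk."""
    members = []
    for _ in range(n_members):
        R = random_rotation(X.shape[1], rng)
        members.append((R, Stump().fit(X @ R, y)))
    return members

def predict_majority(members, X):
    """Unweighted majority vote over members, each using its own rotation."""
    votes = np.stack([clf.predict(X @ R) for R, clf in members])
    return (votes.mean(axis=0) >= 0.5).astype(int)

# One synthetic imbalanced chunk: 180 majority and 20 minority instances.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (180, 2)), rng.normal(6.0, 1.0, (20, 2))])
y = np.array([0] * 180 + [1] * 20)
X_bal, y_bal = oversample_minority(X, y, rng)
ensemble = fit_rotation_ensemble(X_bal, y_bal, n_members=5, rng=rng)
print("chunk accuracy:", (predict_majority(ensemble, X) == y).mean())
```

The stump is used here because an axis-aligned split is sensitive to rotation, so each rotated copy of the chunk can yield a different decision boundary; a rotation-invariant base learner would gain no diversity from this step.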
Notes
1. In [4], it was shown that ENN was superior to CNN on all of the datasets considered.
2. For OUOB, OB and Learn++.NIE, the Hoeffding tree was used as the base classifier, and the size of the base classifier pool was set to 10.
References
Bernardo, A., Valle, E.D.: An extensive study of C-SMOTE, a continuous synthetic minority oversampling technique for evolving data streams. Expert Syst. Appl. 196, 116630 (2022). https://doi.org/10.1016/j.eswa.2022.116630
Khamassi, I., Sayed Mouchaweh, M., Hammami, M., Ghédira, K.: Discussion and review on evolving data streams and concept drift adapting. Evol. Syst. 9(1), 1–23 (2018)
Shankar, S., Herman, B., Parameswaran, A.G.: Rethinking streaming machine learning evaluation. arXiv (2022). https://doi.org/10.48550/arxiv.2205.11473
Czarnowski, I.: Weighted ensemble with one-class classification and over-sampling and instance selection (WECOI): an approach for learning from imbalanced data streams. J. Comput. Sci. 61(1), 101614 (2022). https://doi.org/10.1016/j.jocs.2022.101614
Benczúr, A.A., Kocsis, L., Pálovics, R.: Online machine learning in big data streams. arXiv (2018). https://doi.org/10.48550/ARXIV.1802.05872
Gomes, H.M., Read, J., Bifet, A., Barddal, J.P., Gama, J.: Machine learning for streaming data: state of the art, challenges, and opportunities. ACM SIGKDD Explor. Newsl. 21(2), 6–22 (2019). https://doi.org/10.1145/3373464.3373470
Ghaderi-Zefrehi, H., Altınçay, H.: Imbalance learning using heterogeneous ensembles. Expert Syst. Appl. 142, 113005 (2020). https://doi.org/10.1016/j.eswa.2019.113005
You, G.-R., Shiue, Y.-R., Yeh, W.-Ch., Chen, X.-L., Chen, Ch.-M.: A weighted ensemble learning algorithm based on diversity using a novel particle swarm optimization approach. Algorithms 13(10), 255 (2020). https://doi.org/10.3390/a13100255
Shiue, Y.-R., You, G.-R., Su, Ch.-T., Chen, H.: Balancing accuracy and diversity in ensemble learning using a two-phase artificial bee colony approach. Appl. Soft Comput. 105, 107212 (2021). https://doi.org/10.1016/j.asoc.2021.107212
Jamalinia, H., Khalouei, S., Rezaie, V., Nejatian, S., Bagheri-Fard, K., Parvin, H.: Diverse classifier ensemble creation based on heuristic dataset modification. J. Appl. Stat. 45(7), 1209–1226 (2018). https://doi.org/10.1080/02664763.2017.1363163
Krawczyk, B., Minku, L.L., Gama, J., Stefanowski, J., Woźniak, M.: Ensemble learning for data stream analysis: a survey. Inf. Fusion 37, 132–156 (2017). https://doi.org/10.1016/j.inffus.2017.02.004
Czarnowski, I., Jędrzejowicz, P.: An approach to data reduction for learning from big datasets: integrating stacking, rotation, and agent population learning techniques. Complexity 2018, Article ID 7404627 (2018)
Wozniak, M., Cal, P., Cyganek, B.: The influence of a classifiers’ diversity on the quality of weighted aging ensemble. In: Nguyen, N.T., Attachoo, B., Trawinski, B., Somboonviwat, K. (eds.) ACIIDS 2014. LNAI, vol. 8398, pp. 90–99. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-319-05458-2_10
Adachi, K.: Rotation techniques. In: Adachi, K. (ed.) Matrix-Based Introduction to Multivariate Data Analysis, pp. 193–205. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-4103-2_13
Rodríguez, J.J., Alonso, C.J.: Rotation-based ensembles. In: Conejo, R., Urretavizcaya, M., Pérez-de-la-Cruz, J.-L. (eds.) CAEPIA/TTIA 2003. LNCS (LNAI), vol. 3040, pp. 498–506. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25945-9_49
Xia, J.: Multiple classifier systems for the classification of hyperspectral data. Ph.D. thesis, Université de Grenoble (2014)
Czarnowski, I., Jędrzejowicz, P.: Ensemble online classifier based on the one-class base classifiers for mining data streams. Cybern. Syst. 46(1–2), 51–68 (2015). https://doi.org/10.1080/01969722.2015.1007736
Czarnowski, I., Martins, D.M.L.: Impact of clustering on a synthetic instance generation in imbalanced data streams classification. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds.) ICCS 2022. LNCS, vol. 13351, pp. 586–597. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-08754-7_63
Bifet, A., Holmes, G., Kirkby, R., Pfahringer, B.: MOA: massive online analysis. J. Mach. Learn. Res. 11, 1601–1604 (2010)
Oza, N.C.: Online bagging and boosting. In: Proceedings of the 2005 IEEE International Conference on Systems, Man and Cybernetics, Waikoloa, HI, USA, 10–12 October 2005, vol. 2343, pp. 2340–2345 (2005)
Wang, S., Minku, L.L., Yao, X.: Dealing with multiple classes in online class imbalance learning. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016) (2016)
Ditzler, G., Polikar, R.: Incremental learning of concept drift from streaming imbalanced data. IEEE Trans. Knowl. Data Eng. 25(10), 2283–2301 (2013). https://doi.org/10.1109/TKDE.2012.136
Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: Proceedings of 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 226–235 (2003). https://doi.org/10.1145/956750.956778
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Czarnowski, I. (2023). Learning from Imbalanced Data Streams Using Rotation-Based Ensemble Classifiers. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2023. Lecture Notes in Computer Science, vol. 14162. Springer, Cham. https://doi.org/10.1007/978-3-031-41456-5_60
DOI: https://doi.org/10.1007/978-3-031-41456-5_60
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41455-8
Online ISBN: 978-3-031-41456-5
eBook Packages: Computer Science, Computer Science (R0)