Abstract
Data stream mining is the process of extracting knowledge from continuously generated data stream records such as internet searches, phone conversations, and sensor data. It covers demanding tasks such as frequency counting, clustering, analysis, and classification. Mining information from data streams is often considered complicated because of the high speed of data arrival and the rapid change in the underlying concept, often referred to as concept drift. Moreover, data stream classification is non-stationary: the stream evolves over time. In addition, the classification process is unable to handle imbalanced data or to accommodate new classes. To overcome these problems, an ensemble learning model based on Support Vector Machines (ESVM) is proposed for data stream classification. To achieve higher diversity, each base SVM is trained on a different feature subset and updated as new data instances arrive. However, selecting optimal feature subsets from high-dimensional data streams is complex because of the growth in data size and computational cost. Hence, Dynamic Accelerated Function (DAF) and Dynamic Candidate Solution (DCS) approaches are developed to reduce the classification error and improve performance with the best fitness value. The performance of the proposed method is validated on accuracy, precision, F-score, kappa, and relative error. The experimental results demonstrate that the proposed model is efficient in terms of classification accuracy, training speed, processing time, and kappa score, and that it attains an accuracy of 91.45%.
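The ensemble idea in the abstract (base SVMs trained on different feature subsets and updated as new instances arrive) can be illustrated with a minimal sketch. This is not the authors' ESVM: the feature subsets here are drawn at random rather than selected by the DAF/DCS optimisation, the base learner is a plain linear SVM trained by hinge-loss stochastic gradient descent, and all class names, parameter values, and the synthetic two-class stream are assumptions made for illustration only.

```python
import random


class LinearSVM:
    """Linear SVM updated one instance at a time with hinge-loss SGD."""

    def __init__(self, n_features, lam=0.01, eta0=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lam = lam    # L2 regularisation strength (assumed value)
        self.eta0 = eta0  # initial learning rate (assumed value)
        self.t = 0        # number of updates seen so far

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def partial_fit(self, x, y):
        """One SGD step on a single instance; y must be +1 or -1."""
        self.t += 1
        eta = self.eta0 / (1.0 + self.eta0 * self.lam * self.t)
        margin = y * self.score(x)  # margin under the current weights
        # Regularisation step: shrink the weights slightly.
        self.w = [wi * (1.0 - eta * self.lam) for wi in self.w]
        # Hinge-loss step: only instances inside the margin move the model.
        if margin < 1.0:
            self.w = [wi + eta * y * xi for wi, xi in zip(self.w, x)]
            self.b += eta * y


class SubspaceSVMEnsemble:
    """Majority-vote ensemble; each member sees its own feature subset."""

    def __init__(self, n_features, n_members=5, subset_size=3, seed=0):
        rng = random.Random(seed)
        # Random subsets stand in for the optimised subsets of the paper.
        self.subsets = [
            sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_members)
        ]
        self.members = [LinearSVM(subset_size) for _ in self.subsets]

    @staticmethod
    def _project(x, subset):
        return [x[i] for i in subset]

    def partial_fit(self, x, y):
        for subset, svm in zip(self.subsets, self.members):
            svm.partial_fit(self._project(x, subset), y)

    def predict(self, x):
        votes = sum(
            1 if svm.score(self._project(x, subset)) >= 0 else -1
            for subset, svm in zip(self.subsets, self.members)
        )
        return 1 if votes >= 0 else -1


def prequential_accuracy(n_instances=500, n_features=6, seed=1):
    """Test-then-train evaluation on a synthetic, well-separated stream."""
    rng = random.Random(seed)
    model = SubspaceSVMEnsemble(n_features)
    correct = 0
    for _ in range(n_instances):
        y = rng.choice([-1, 1])
        # Each feature is centred on the class label with Gaussian noise.
        x = [y + rng.gauss(0.0, 0.5) for _ in range(n_features)]
        correct += model.predict(x) == y  # predict first ...
        model.partial_fit(x, y)           # ... then learn from the instance
    return correct / n_instances


if __name__ == "__main__":
    print(f"prequential accuracy: {prequential_accuracy():.3f}")
```

Each instance is scored before it is used for training, which is a common way to evaluate stream classifiers because every prediction is made on unseen data. An odd number of members avoids tied votes.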
About this article
Cite this article
Vidya, R.M., Ramakrishna, M. An adaptive learning paradigm: event detection through a novel dynamic arithmetic optimization-based ensemble SVM for data stream classification. Int. j. inf. tecnol. (2024). https://doi.org/10.1007/s41870-024-01832-y
DOI: https://doi.org/10.1007/s41870-024-01832-y