Abstract
Data streams are highly volatile and can change drastically and unpredictably over time. Typically, the newest data are the most important, because observations age according to their arrival time. Such streams require real-time processing to extract meaningful information that supports targeted responses to changing circumstances. Knowledge mining is therefore performed in real time on a subset of the stream that contains a small but recent part of the observations. Current security requirements call for a further search for approaches capable of improving the reliability and accuracy of the employed classifiers. This research introduces a real-time evolving spiking restricted Boltzmann machine approach for efficient anomaly detection in data streams. Testing indicates that the proposed algorithm improves classification accuracy while reducing computational resource requirements. A comparative analysis shows that it outperforms the other data stream analysis algorithms considered.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Xing, L., Demertzis, K. & Yang, J. Identifying data streams anomalies by evolving spiking restricted Boltzmann machines. Neural Comput & Applic 32, 6699–6713 (2020). https://doi.org/10.1007/s00521-019-04288-5
DOI: https://doi.org/10.1007/s00521-019-04288-5