Identifying data streams anomalies by evolving spiking restricted Boltzmann machines

  • Brain inspired Computing & Machine Learning Applied Research-BISMLARE
Neural Computing and Applications

Abstract

Data streams are highly volatile and can change drastically and unpredictably over time. Typically the newest data are the most important, because data age according to their arrival time. Such flows require real-time processing to extract meaningful information that enables essential, targeted responses to changing circumstances. Knowledge mining is therefore performed in real time on a subset of the stream that holds a small but recent part of the observations. Current security requirements call for a continued search for better approaches that can improve the reliability and accuracy of the classifiers employed. This research introduces a real-time evolving spiking restricted Boltzmann machine approach for efficient anomaly detection in data streams. Testing showed that the proposed algorithm maximizes classification accuracy while minimizing computational resource requirements. A comparative analysis showed that it outperforms other data flow analysis algorithms.
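
To make the idea of mining only a small, recent part of the stream concrete, the following is a minimal sketch of windowed stream processing. It is not the evolving spiking restricted Boltzmann machine described in the abstract: the scoring rule is a plain z-score used as a stand-in, and the names `RecentWindow`, `detect`, `size`, and `threshold` are hypothetical choices made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class RecentWindow:
    """Keeps only the most recent `size` observations of a stream."""

    def __init__(self, size):
        # A bounded deque discards the oldest item once it is full.
        self.values = deque(maxlen=size)

    def is_anomaly(self, x, threshold=3.0):
        """Flag x if it lies more than `threshold` standard deviations
        from the mean of the current window (stand-in scoring rule)."""
        if len(self.values) < 2:
            return False  # not enough history to judge
        mu = mean(self.values)
        sigma = pstdev(self.values)
        if sigma == 0:
            return x != mu
        return abs(x - mu) / sigma > threshold

    def update(self, x):
        """Add the newest observation to the window."""
        self.values.append(x)


def detect(stream, size=5, threshold=3.0):
    """Return the indices of observations flagged as anomalous.

    Each observation is scored against the window that precedes it,
    then added to the window, so only recent data influence the score.
    """
    window = RecentWindow(size)
    flagged = []
    for i, x in enumerate(stream):
        if window.is_anomaly(x, threshold):
            flagged.append(i)
        window.update(x)
    return flagged


if __name__ == "__main__":
    print(detect([10, 11, 10, 9, 10, 50, 10, 11], size=5))
```

In this sketch a flagged value still enters the window, so a single spike temporarily widens the spread used to score the next few observations. Whether to exclude flagged values from the window is a design choice that depends on how quickly the model should adapt to a genuine change in the stream.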

Author information

Corresponding author

Correspondence to Konstantinos Demertzis.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xing, L., Demertzis, K. & Yang, J. Identifying data streams anomalies by evolving spiking restricted Boltzmann machines. Neural Comput & Applic 32, 6699–6713 (2020). https://doi.org/10.1007/s00521-019-04288-5

  • DOI: https://doi.org/10.1007/s00521-019-04288-5
