Abstract
Continuous, dynamic, and short-term learning is an effective strategy in fast-changing environments where concept drift occurs constantly. We focus on a particularly challenging problem: continually learning detection models capable of recognizing network attacks and system intrusions in highly dynamic environments such as communication networks. We consider adaptive learning algorithms for the analysis of continuously evolving network data streams, using a dynamic, variable-length system memory that automatically adapts to concept drift in the underlying data. By continuously learning and detecting concept drift to adapt the memory length, we show that adaptive learning algorithms can sustain high detection accuracy over dynamic network data streams. To deal with big network traffic streams, we deploy the proposed models on a big data analytics platform for network traffic monitoring and analysis, and show that parallelizing off-the-shelf stream learning approaches yields computational speed-ups of up to 5×.
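To make the idea of a variable-length memory concrete, the following is a minimal sketch of a drift-aware window. The abstract does not name the exact algorithm, so this is an assumption: the sketch borrows the general idea of adaptive windowing (compare the means of an older and a newer sub-window, and drop the older part when they differ by more than a Hoeffding-style bound). The class name `AdaptiveWindow`, the parameters `delta` and `min_split`, and the synthetic stream are all hypothetical and chosen for illustration only; the linear scan is deliberately unoptimized.

```python
import math
from collections import deque


class AdaptiveWindow:
    """Simplified variable-length window (illustrative, not the authors' code)."""

    def __init__(self, delta=0.002, min_split=5):
        self.delta = delta          # confidence parameter (hypothetical default)
        self.min_split = min_split  # smallest sub-window size that is tested
        self.window = deque()       # current memory, oldest item on the left

    def update(self, value):
        """Add one observation in [0, 1]; return True if drift was detected."""
        self.window.append(value)
        drift = False
        while self._shrink_once():
            drift = True
        return drift

    def _shrink_once(self):
        """Drop the oldest part of the window if some split shows a change."""
        n = len(self.window)
        if n < 2 * self.min_split:
            return False
        total = sum(self.window)
        delta_prime = self.delta / n
        head_sum = 0.0
        cut = None
        for i, v in enumerate(self.window, start=1):
            head_sum += v
            n0, n1 = i, n - i  # sizes of the older and newer sub-windows
            if n0 < self.min_split or n1 < self.min_split:
                continue
            mean0 = head_sum / n0
            mean1 = (total - head_sum) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(mean0 - mean1) > eps:
                cut = n0
                break
        if cut is None:
            return False
        for _ in range(cut):
            self.window.popleft()  # forget the outdated concept
        return True


# Synthetic per-sample error signal with one abrupt change at t = 200.
detector = AdaptiveWindow()
stream = [0.0] * 200 + [1.0] * 200
drift_points = [t for t, x in enumerate(stream) if detector.update(x)]
```

In this sketch the window grows while the signal is stable and shrinks shortly after the change, so a model retrained on `detector.window` would only see data from the current concept.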
Keywords
- Big data
- Network traffic monitoring and analysis
- Stream machine learning
- Network attacks
Copyright information
© 2021 The Author(s), exclusively licensed to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature
About this paper
Cite this paper
Casas, P., Mulinka, P., Vanerio, J. (2021). NetSEC at High-Speed: Distributed Stream Learning for Security in Big Networking Data. In: Haber, P., Lampoltshammer, T., Mayr, M., Plankensteiner, K. (eds) Data Science – Analytics and Applications. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-32182-6_15
DOI: https://doi.org/10.1007/978-3-658-32182-6_15
Publisher Name: Springer Vieweg, Wiesbaden
Print ISBN: 978-3-658-32181-9
Online ISBN: 978-3-658-32182-6
eBook Packages: Computer Science and Engineering (German Language)