A Cybersecurity Framework for Classifying Non Stationary Data Streams Exploiting Genetic Programming and Ensemble Learning
Intrusion detection systems must cope with several challenging problems, such as unbalanced datasets, fast data streams and frequent changes in the nature of attacks (concept drift). To address these problems, a distributed genetic programming (GP) tool is used to generate the combiner function of an ensemble. Once the classifiers composing the ensemble have been trained, this tool does not need a heavy additional training phase, so it can respond quickly to concept drift, even in fast-changing data streams. This approach is integrated into a novel cybersecurity framework for classifying non-stationary and unbalanced data streams. The framework provides mechanisms for detecting drifts and for replacing classifiers, which make it possible to build the ensemble incrementally. Tests conducted on real data show that the framework is effective both in detecting attacks and in reacting quickly to concept drift.
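The two ideas in the abstract, a combiner function applied to the outputs of already-trained classifiers and a mechanism that signals concept drift, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the primitive set (`avg`, `max`, `min`), the tuple-based tree encoding, the `evaluate` function, and the `WindowDriftDetector` class, which compares error rates over two fixed windows and is a deliberately simple stand-in for whatever drift detector the framework actually uses. The GP search that would evolve the tree is not shown.

```python
from collections import deque
from statistics import mean

# Hypothetical primitive set for the combiner tree (an assumption of this sketch).
PRIMITIVES = {"avg": mean, "max": max, "min": min}


def evaluate(tree, scores):
    """Evaluate a combiner tree on the per-classifier attack scores.

    A leaf is an int index into `scores`; an internal node is a tuple
    (primitive_name, child, child, ...). No training is needed to apply it.
    """
    if isinstance(tree, int):
        return scores[tree]
    name, *children = tree
    return PRIMITIVES[name]([evaluate(child, scores) for child in children])


class WindowDriftDetector:
    """Signals drift when the recent error rate exceeds the reference
    error rate by more than `threshold` (simple illustrative detector)."""

    def __init__(self, window=50, threshold=0.2):
        self.reference = deque(maxlen=window)  # errors seen first
        self.recent = deque(maxlen=window)     # sliding window of latest errors
        self.threshold = threshold

    def update(self, error):
        """Record one 0/1 prediction error; return True if drift is signalled."""
        if len(self.reference) < self.reference.maxlen:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.recent.maxlen:
            return False
        return mean(self.recent) - mean(self.reference) > self.threshold


# Usage: three base classifiers report attack scores for one sample.
scores = [0.9, 0.2, 0.6]
tree = ("max", ("avg", 0, 1), 2)  # hand-written here; GP would evolve it
is_attack = evaluate(tree, scores) >= 0.5
```

In a full pipeline along the lines the abstract describes, a drift signal would trigger replacing one or more base classifiers and regenerating the combiner tree, while the remaining trained classifiers are reused unchanged.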
Keywords: Cybersecurity, Intrusion detection, Genetic programming