OutGene: Detecting Undefined Network Attacks with Time Stretching and Genetic Zooms

Dias, Luís; Reia, Hélder; Neves, Rui; Correia, Miguel

doi:10.1007/978-3-030-36938-5_12

Conference paper
First Online: 10 December 2019

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11928)

Abstract

The paper presents OutGene, an approach for streaming detection of malicious activity that requires neither previous knowledge about attacks nor training data. OutGene uses clustering to aggregate hosts with similar behavior. To assist human analysts in pinpointing malicious clusters, we introduce the notion of genetic zoom, which consists of using a genetic algorithm to identify the features that are most relevant to characterizing a cluster. Adversaries are often able to circumvent attack detection based on machine learning by executing attacks at a low pace, below the thresholds used. To detect such stealthy attacks, we introduce the notion of time stretching. The idea is to analyze the stream of events over different time windows, so that attacks can be identified independently of the pace at which they are performed. We evaluated OutGene experimentally with a recent publicly available dataset and with a dataset obtained at a large military infrastructure. Both genetic zoom and time stretching were found to be useful, and high values of recall and accuracy were obtained.
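The intuition behind time stretching can be illustrated with a minimal sketch. It is not the OutGene implementation: the event format (timestamp in seconds, host name), the function names, the window lengths, and the synthetic stream are all assumptions made for illustration, and a simple per-host event count stands in for whatever features the approach actually extracts before clustering.

```python
from collections import Counter


def count_per_window(events, window_seconds):
    """Count events per (window index, host) for a single window length."""
    counts = Counter()
    for timestamp, host in events:
        counts[(timestamp // window_seconds, host)] += 1
    return counts


def stretched_max_counts(events, window_lengths):
    """For each window length, map each host to its busiest window's count."""
    views = {}
    for length in window_lengths:
        per_host = {}
        for (_, host), n in count_per_window(events, length).items():
            per_host[host] = max(per_host.get(host, 0), n)
        views[length] = per_host
    return views


# Hypothetical stream: "slow" makes one attempt every 10 minutes for a day,
# while "burst" makes 30 attempts within a single minute.
events = [(t, "slow") for t in range(0, 86400, 600)]
events += [(t, "burst") for t in range(1020, 1050)]

# Assumed window lengths: one minute, one hour, one day.
views = stretched_max_counts(events, [60, 3600, 86400])
for length, per_host in views.items():
    print(length, per_host)
```

In the one-minute view the slow host never exceeds a single event per window and is indistinguishable from quiet hosts, whereas in the one-day view its accumulated activity exceeds that of the bursty host. Examining the same stream at several window lengths is what allows low-pace activity to surface.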

This research was supported by national funds through Fundação para a Ciência e Tecnologia (FCT) with reference UID/CEC/50021/2019 (INESC-ID), by the Portuguese Army (CINAMIL), and by the European Commission under grant agreement number 830892 (SPARTA). We warmly thank Prof. Victor Lobo for feedback on a previous version of this work.

Author information

Authors and Affiliations

CINAMIL, Academia Militar, Instituto Universitário Militar, Lisbon, Portugal
Luís Dias & Hélder Reia
INESC-ID, Instituto Superior Técnico, Lisbon, Portugal
Luís Dias, Hélder Reia & Miguel Correia
Instituto de Telecomunicações, Instituto Superior Técnico, Lisbon, Portugal
Rui Neves

Corresponding author

Correspondence to Luís Dias.

Editor information

Editors and Affiliations

Monash University, Clayton, VIC, Australia
Joseph K. Liu
Fujian Normal University, Fuzhou, China
Xinyi Huang

About this paper

Cite this paper

Dias, L., Reia, H., Neves, R., Correia, M. (2019). OutGene: Detecting Undefined Network Attacks with Time Stretching and Genetic Zooms. In: Liu, J., Huang, X. (eds) Network and System Security. NSS 2019. Lecture Notes in Computer Science, vol 11928. Springer, Cham. https://doi.org/10.1007/978-3-030-36938-5_12

DOI: https://doi.org/10.1007/978-3-030-36938-5_12
Published: 10 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36937-8
Online ISBN: 978-3-030-36938-5
eBook Packages: Computer Science, Computer Science (R0)