OutGene: Detecting Undefined Network Attacks with Time Stretching and Genetic Zooms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11928)

Abstract

The paper presents OutGene, an approach for streaming detection of malicious activity that requires neither prior knowledge about attacks nor training data. OutGene uses clustering to group hosts with similar behavior. To help human analysts pinpoint malicious clusters, we introduce the notion of genetic zoom, which uses a genetic algorithm to identify the features most relevant to characterizing a cluster. Adversaries can often circumvent attack detection based on machine learning by executing attacks at a low pace, below the thresholds used. To detect such stealthy attacks, we introduce the notion of time stretching: the stream of events is analyzed in different time windows, so that attacks can be identified regardless of the pace at which they are performed. We evaluated OutGene experimentally with a recent publicly available dataset and with a dataset obtained at a large military infrastructure. Both genetic zoom and time stretching proved useful, and high recall and accuracy were obtained.
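
The intuition behind time stretching can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the time windows differ in length, and it replaces OutGene's clustering step with a fixed per-window event-count threshold purely for illustration. The event stream, host names, window lengths, and thresholds are all hypothetical.

```python
from collections import Counter


def count_per_window(events, window_seconds):
    """Count events per (host, window index) for one window length."""
    counts = Counter()
    for timestamp, host in events:
        counts[(host, int(timestamp // window_seconds))] += 1
    return counts


def flag_hosts(events, window_seconds, threshold):
    """Return the hosts whose event count exceeds the threshold in any window.

    Assumption: a plain threshold stands in for the clustering-based
    analysis described in the abstract.
    """
    counts = count_per_window(events, window_seconds)
    return {host for (host, _), n in counts.items() if n > threshold}


# Hypothetical stream of (timestamp in seconds, host) pairs covering one day.
# "slow" emits one event every 10 minutes; "benign" emits one every 2 hours.
events = [(i * 600, "slow") for i in range(144)]
events += [(i * 7200, "benign") for i in range(12)]

# Hypothetical (window length in seconds, threshold) pairs.
windows = {
    "1 minute": (60, 5),
    "1 hour": (3600, 20),
    "1 day": (86400, 100),
}

flagged = {
    name: flag_hosts(events, length, threshold)
    for name, (length, threshold) in windows.items()
}

for name, hosts in flagged.items():
    print(name, sorted(hosts))
```

In this toy stream the slow host stays under the threshold in the 1-minute and 1-hour windows (1 and 6 events per window, respectively) and only stands out in the 1-day window, where its 144 events accumulate. Analyzing several window lengths side by side is what makes the low-pace activity visible.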

This research was supported by national funds through Fundação para a Ciência e Tecnologia (FCT) with reference UID/CEC/50021/2019 (INESC-ID), by the Portuguese Army (CINAMIL), and by the European Commission under grant agreement number 830892 (SPARTA). We warmly thank Prof. Victor Lobo for feedback on a previous version of this work.

Author information

Corresponding author

Correspondence to Luís Dias.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Dias, L., Reia, H., Neves, R., Correia, M. (2019). OutGene: Detecting Undefined Network Attacks with Time Stretching and Genetic Zooms. In: Liu, J., Huang, X. (eds) Network and System Security. NSS 2019. Lecture Notes in Computer Science, vol 11928. Springer, Cham. https://doi.org/10.1007/978-3-030-36938-5_12

  • DOI: https://doi.org/10.1007/978-3-030-36938-5_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36937-8

  • Online ISBN: 978-3-030-36938-5

  • eBook Packages: Computer Science, Computer Science (R0)
