Abstract
The evolving nature of the tactics, techniques, and procedures used by cyber adversaries has made signature- and template-based methods of modeling adversary behavior almost infeasible. We are moving into an era of data-driven autonomous cyber defense agents that learn contextually meaningful adversary behaviors from observables. In this chapter, we explore what can be learned about cyber adversaries from observable data, such as intrusion alerts, network traffic, and threat intelligence feeds. We describe the challenges of building autonomous cyber defense agents, such as learning from noisy observables with no ground truth, and the brittleness of deep-learning-based agents that adversaries can easily evade. We illustrate three state-of-the-art autonomous cyber defense agents that model adversary behavior from traffic-induced observables without a priori expert knowledge or ground-truth labels. We close with recommendations and directions for future work.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Nadeem, A., Verwer, S., Yang, S.J. (2023). Learning About the Adversary. In: Kott, A. (eds) Autonomous Intelligent Cyber Defense Agent (AICA). Advances in Information Security, vol 87. Springer, Cham. https://doi.org/10.1007/978-3-031-29269-9_6
Print ISBN: 978-3-031-29268-2
Online ISBN: 978-3-031-29269-9