Abstract
Over the last few years, cyber attacks have become increasingly sophisticated. PDF malware, which remains an effective method of attack because malicious files are difficult to classify, is a popular subject of study within the field of machine learning for cybersecurity. The obstacles to using machine learning are many: attack patterns change over time as attackers change their behavior (sometimes automatically), and application security systems are deployed in highly resource-constrained environments, meaning that an accurate but time-consuming machine learning model cannot be deployed.
Motivated by these challenges, we propose an active defender system to adapt to evasive PDF malware in a resource-constrained environment. We observe that this system improves the \(f_1\) score from 0.17535 to 0.4562 over five stages of receiving unlabeled PDF files. Furthermore, the average classification time per file is low across all five stages, and falls from 1.16908 s per file to 1.09649 s per file. Beyond classifying malware, we provide a general active defender framework that can be used to deploy decision systems for a variety of applications operating in resource-constrained environments with adversaries.
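To put the reported figures in perspective, the short sketch below recalls the standard definition of the \(f_1\) score (the harmonic mean of precision and recall) and computes the relative change in the two quantities quoted above. The helper names `f1_score` and `relative_change` are illustrative assumptions, not part of the paper's code, and the confusion-matrix counts in the example call are hypothetical, since the abstract does not report them.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_change(before: float, after: float) -> float:
    """Signed relative change from `before` to `after`."""
    return (after - before) / before


# Figures quoted in the abstract.
f1_gain = relative_change(0.17535, 0.4562)       # f1 over five stages
time_change = relative_change(1.16908, 1.09649)  # seconds per file

# Hypothetical counts, only to show how f1 is computed.
example_f1 = f1_score(tp=1, fp=1, fn=1)

print(f"relative f1 change:   {f1_gain:+.1%}")
print(f"relative time change: {time_change:+.1%}")
```

Under these definitions, the \(f_1\) score rises by roughly 160% relative to its starting value, while the average classification time per file falls by roughly 6%.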
Notes
We make use of the open source library: https://github.com/HDI-Project/BTB.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Perumal, Z., Veeramachaneni, K. (2018). Towards Building Active Defense Systems for Software Applications. In: Dinur, I., Dolev, S., Lodha, S. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2018. Lecture Notes in Computer Science, vol 10879. Springer, Cham. https://doi.org/10.1007/978-3-319-94147-9_12
DOI: https://doi.org/10.1007/978-3-319-94147-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94146-2
Online ISBN: 978-3-319-94147-9
eBook Packages: Computer Science (R0)