Towards Building Active Defense Systems for Software Applications

Perumal, Zara; Veeramachaneni, Kalyan

doi:10.1007/978-3-319-94147-9_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 10879)

Included in the following conference series:

International Symposium on Cyber Security Cryptography and Machine Learning

Abstract

Over the last few years, cyber attacks have become increasingly sophisticated. PDF malware – a continuously effective method of attack due to the difficulty of classifying malicious files – is a popular target of study within the field of machine learning for cybersecurity. The obstacles to using machine learning are many: attack patterns change over time as attackers change their behavior (sometimes automatically), and application security systems are deployed in highly resource-constrained environments, meaning that an accurate but time-consuming machine learning model cannot be deployed.

Motivated by these challenges, we propose an active defender system that adapts to evasive PDF malware in a resource-constrained environment. We observe that this system improves the \(f_1\) score from 0.17535 to 0.4562 over five stages of receiving unlabeled PDF files. Furthermore, the average classification time per file is low across all five stages, and is reduced from 1.16908 s per file to 1.09649 s per file. Beyond classifying malware, we provide a general active defender framework that can be used to deploy decision systems for a variety of applications operating in resource-constrained environments with adversaries.
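To put the reported figures in context, the following minimal Python sketch computes an \(f_1\) score from confusion-matrix counts and the relative change between the first and last reported values. It uses the standard definition of \(f_1\) and is not the authors' evaluation code. The function names `f1_score` and `relative_reduction` are hypothetical, and the only data used are the four numbers quoted in the abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: 2*TP / (2*TP + FP + FN).

    Returns 0.0 when the denominator is zero (no positives predicted
    and none present), which is a convention assumed here.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    # Figures quoted in the abstract (first stage vs. last stage).
    f1_first, f1_last = 0.17535, 0.4562
    time_first, time_last = 1.16908, 1.09649  # seconds per file

    print(f"f1 improvement factor: {f1_last / f1_first:.2f}x")
    print(f"time reduction: {relative_reduction(time_first, time_last):.1%}")
```

On the abstract's numbers, this corresponds to an \(f_1\) score about 2.6 times higher and a per-file classification time about 6% lower.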

Notes

1. Several recent studies suggest that PDF malware is evading classification using various automated methods [8, 11, 23, 30].
2. https://github.com/mzweilin/pdfrwructure.
3. We make use of the open source library: https://github.com/HDI-Project/BTB.

References

1. Contagio dump. http://contagiodump.blogspot.com. Accessed 11 Nov 2016
2. The rise of document-based malware. https://www.sophos.com/en-us/security-news-trends/security-trends/the-rise-of-document-based-malware.aspx
3. The rise of machine learning (ml) in cybersecurity. https://www.crowdstrike.com/resources/white-papers/rise-machine-learning-ml-cybersecurity/
4. Mimicus framework (2017). https://github.com/srndic/mimicus
5. Argyros, G., Stais, I., Jana, S., Keromytis, A.D., Kiayias, A.: Sfadiff: automated evasion attacks and fingerprinting using black-box differential automata learning. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 1690–1701. ACM (2016)
6. Argyros, G., Stais, I., Kiayias, A., Keromytis, A.D.: Back in black: towards formal, black box analysis of sanitizers and filters. In: 2016 IEEE Symposium on Security and Privacy (SP), pp. 91–109. IEEE (2016)
7. Ashford, W.: Cyber criminals catching up with nation state attacks. https://www.computerweekly.com/news/252435701/Cyber-criminals-catching-up-with-nation-state-attacks
8. Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F.: Evasion attacks against machine learning at test time. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 387–402. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40994-3_25
9. Bossert, T.P.: It’s official: north korea is behind wannacry, December 2017. https://www.wsj.com/articles/its-official-north-korea-is-behind-wannacry-1513642537
10. Chen, Y., Nadji, Y., Kountouras, A., Monrose, F., Perdisci, R., Antonakakis, M., Vasiloglou, N.: Practical attacks against graph-based clustering. arXiv preprint arXiv:1708.09056 (2017)
11. Dang, H., Huang, Y., Chang, E.C.: Evading classifiers by morphing in the dark (2017)
12. Hosseini, H., Xiao, B., Clark, A., Poovendran, R.: Attacking automatic video analysis algorithms: a case study of google cloud video intelligence API. arXiv preprint arXiv:1708.04301 (2017)
13. Hu, W., Tan, Y.: Generating adversarial malware examples for black-box attacks based on gan. arXiv preprint arXiv:1702.05983 (2017)
14. Kantchelian, A., Tygar, J., Joseph, A.: Evasion and hardening of tree ensemble classifiers. In: International Conference on Machine Learning, pp. 2387–2396 (2016)
15. Laskov, P., et al.: Practical evasion of a learning-based classifier: a case study. In: 2014 IEEE Symposium on Security and Privacy (SP), pp. 197–211. IEEE (2014)
16. Li, W.-J., Stolfo, S., Stavrou, A., Androulaki, E., Keromytis, A.D.: A study of malcode-bearing documents. In: Hämmerli, B.M., Sommer, R. (eds.) DIMVA 2007. LNCS, vol. 4579, pp. 231–250. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73614-1_14
17. MacFarlane, D., Network, I.C.: Why even smaller enterprises should consider nation-state quality cyber defenses, September 2017. https://www.csoonline.com/article/3223866/cyberwarfare/nation-state-quality-cyber-defenses.html
18. Maiorca, D., Corona, I., Giacinto, G.: Looking at the bag is not enough to find the bomb: an evasion of structural methods for malicious PDF files detection. In: Proceedings of the 8th ACM SIGSAC symposium on Information, Computer and Communications Security, pp. 119–130. ACM (2013)
19. Millman, R.: Nation state cyber-attacks on the rise - detect lateral movement quickly, February 2018. https://www.scmagazineuk.com/nation-state-cyber-attacks-on-the-rise-detect-lateral-movement-quickly/article/746561/
20. Riley, M., Robertson, J., Sharpe, A.: The equifax hack has the hallmarks of state-sponsored pros, September 2017. https://www.bloomberg.com/news/features/2017-09-29/the-equifax-hack-has-all-the-hallmarks-of-state-sponsored-pros
21. Rosenberg, I., Shabtai, A., Rokach, L., Elovici, Y.: Generic black-box end-to-end attack against RNNs and other API calls based malware classifiers. arXiv preprint arXiv:1707.05970 (2017)
22. Sethi, T.S., Kantardzic, M.: Data driven exploratory attacks on black box classifiers in adversarial domains. arXiv preprint arXiv:1703.07909 (2017)
23. Sethi, T.S., Kantardzic, M., Ryu, J.W.: ‘Security theater’: on the vulnerability of classifiers to exploratory attacks. In: Wang, G.A., Chau, M., Chen, H. (eds.) PAISI 2017. LNCS, vol. 10241, pp. 49–63. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57463-9_4
24. Smutz, C., Stavrou, A.: Malicious PDF detection using metadata and structural features. In: Proceedings of the 28th Annual Computer Security Applications Conference, pp. 239–248. ACM (2012)
25. Smutz, C., Stavrou, A.: When a tree falls: using diversity in ensemble classifiers to identify evasion in malware detectors. In: NDSS (2016)
26. Swearingen, T., Drevo, W., Cyphers, B., Cuesta-Infante, A., Ross, A., Veeramachaneni, K.: ATM: a distributed, collaborative, scalable system for automated machine learning. In: IEEE International Conference on Big Data (2017)
27. Tong, L., Li, B., Hajaj, C., Vorobeychik, Y.: Feature conservation in adversarial classifier evasion: a case study. arXiv preprint arXiv:1708.08327 (2017)
28. Veeramachaneni, K., Arnaldo, I., Korrapati, V., Bassias, C., Li, K.: Ai\(^{2}\): training a big data machine to defend. In: 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS), pp. 49–54. IEEE (2016)
29. Wang, B., Gao, J., Qi, Y.: A theoretical framework for robustness of (deep) classifiers under adversarial noise. arXiv preprint arXiv:1612.00334 (2016)
30. Xu, W., Qi, Y., Evans, D.: Automatically evading classifiers. In: Proceedings of the 2016 Network and Distributed Systems Symposium (2016)

Author information

Authors and Affiliations

Data to AI Lab, MIT LIDS, Cambridge, MA, 02139, USA
Zara Perumal & Kalyan Veeramachaneni

Corresponding author

Correspondence to Kalyan Veeramachaneni.

Editor information

Editors and Affiliations

Ben-Gurion University of the Negev, Beer Sheva, Israel
Itai Dinur
Ben-Gurion University of the Negev, Beer Sheva, Israel
Shlomi Dolev
Tata Consultancy Services (India), Chennai, Tamil Nadu, India
Sachin Lodha

About this paper

Cite this paper

Perumal, Z., Veeramachaneni, K. (2018). Towards Building Active Defense Systems for Software Applications. In: Dinur, I., Dolev, S., Lodha, S. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2018. Lecture Notes in Computer Science, vol 10879. Springer, Cham. https://doi.org/10.1007/978-3-319-94147-9_12

DOI: https://doi.org/10.1007/978-3-319-94147-9_12
Published: 17 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94146-2
Online ISBN: 978-3-319-94147-9
eBook Packages: Computer Science, Computer Science (R0)
