Abstract
Despite continuous countermeasures, embedding malware in PDF documents and using them as a malware distribution mechanism remains a threat. This is due to the popularity of PDF as a document exchange format, the lack of user awareness of its dangers, and its ability to carry and execute malware. The academic community has proposed several malicious PDF detection tools to address this threat, all of which suffer from drawbacks that limit their utility. In this paper, we present the drawbacks of the current state-of-the-art malicious PDF detectors. We did so by surveying all recent malicious PDF detectors and then comparatively evaluating the available tools. Our results show that concept drift is a major drawback of the detectors, despite the fact that many of them use machine learning approaches.
Acknowledgements
We would like to thank Mustafa Al-Saegh for helping with dataset cleaning and preparation. We would also like to thank VirusTotal, the owner of the Contagio dataset and TPN for providing access to their files. Finally, we are grateful to the authors and creators of PDFrate and Slayer, for providing access to their tools.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Falah, A., Pan, L., Abdelrazek, M., Doss, R. (2018). Identifying Drawbacks in Malicious PDF Detectors. In: Doss, R., Piramuthu, S., Zhou, W. (eds) Future Network Systems and Security. FNSS 2018. Communications in Computer and Information Science, vol 878. Springer, Cham. https://doi.org/10.1007/978-3-319-94421-0_10
DOI: https://doi.org/10.1007/978-3-319-94421-0_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94420-3
Online ISBN: 978-3-319-94421-0
eBook Packages: Computer Science, Computer Science (R0)