Evaluation of the random forest classifier in the security domain
Security applications such as spam filtering and malware detection have an intrinsic adversarial nature: attackers actively attempt to mislead the detection system. This adversarial nature distinguishes security applications from classical machine learning problems; for instance, an adversary (attacker) may change the distribution of the test data and thereby violate data stationarity, a common assumption in machine learning techniques. Since machine learning methods are not inherently adversary-aware, a classifier designer should investigate the robustness of a learning system under attack. To this end, recent studies have modeled known attacks against machine learning-based detection systems, allowing a classifier designer to evaluate the performance of a learning system against the modeled attacks. Prior research explored a gradient-based approach to devising attacks against classifiers with differentiable discriminant functions, such as SVM. However, several powerful classifiers with non-differentiable decision boundaries, such as Random Forest, are commonly used across security domains and applications. In this paper, we present a novel approach to modeling an attack against classifiers with a non-differentiable decision boundary. In our experiments, we first present an example that visually shows the effect of a successful attack on the MNIST handwritten digit classification task. We then conduct experiments for two well-known security applications: spam filtering and malware detection in PDF files. The experimental results demonstrate that the proposed attack successfully evades the Random Forest classifier and effectively degrades its performance.
Keywords: Machine learning · Security application · Evasion attack · Discriminant function · Surrogate classifier
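The abstract does not spell out the attack mechanics, but the keywords point to a surrogate-classifier strategy: train a differentiable stand-in for the Random Forest and run a gradient-based evasion against it, relying on the attack transferring to the target. Below is a minimal, hypothetical sketch of that idea in Python with scikit-learn; the synthetic data, the RBF-kernel SVM surrogate, the step size, and the `surrogate_gradient` helper are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Toy data standing in for a security dataset (hypothetical setup).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Target: the non-differentiable Random Forest under attack.
target = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Surrogate: a differentiable RBF-kernel SVM trained to mimic the target
# by fitting the target's predicted labels (transferability assumption).
surrogate = SVC(kernel="rbf", gamma=0.1).fit(X, target.predict(X))

def surrogate_gradient(x, svm):
    """Analytic gradient of the RBF-SVM decision function at point x."""
    diff = x - svm.support_vectors_                      # shape (n_sv, d)
    k = np.exp(-svm.gamma * np.sum(diff ** 2, axis=1))   # kernel values
    return -2.0 * svm.gamma * ((svm.dual_coef_[0] * k) @ diff)

# Gradient-descent evasion: nudge a malicious sample (class 1) until the
# Random Forest flips its prediction, even though the forest itself
# exposes no usable gradient.
x = X[y == 1][0].copy()
step = 0.5
for _ in range(50):
    g = surrogate_gradient(x, surrogate)
    x -= step * g / (np.linalg.norm(g) + 1e-12)  # descend the surrogate score
    if target.predict(x.reshape(1, -1))[0] == 0:
        print("evaded the Random Forest")
        break
```

The RBF-SVM is a convenient surrogate choice here because its decision function is smooth and its gradient has the closed form used above, matching the abstract's point that gradient-based attacks require a differentiable discriminant function.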
The authors gratefully acknowledge Dr. Richard Wallace of Riverside Research for reviewing this paper and suggesting edits that improved its grammar and readability.