
Security Evaluation of Support Vector Machines in Adversarial Environments

Chapter in Support Vector Machines Applications

Abstract

Support vector machines (SVMs) are among the most popular classification techniques adopted in security applications such as malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated into real-world security systems, they must be able to cope with attacks that mislead the learning algorithm (poisoning), evade detection at test time (evasion), or extract information about the classifier's internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal, general framework for the empirical evaluation of the security of machine-learning systems. Second, using this framework, we demonstrate the feasibility of evasion, poisoning, and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, in a public repository.
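
To make the evasion setting concrete: the gradient-based evasion strategy evaluated in this chapter (following [8]) descends the SVM's continuous discriminant function \(g(\mathbf{x})\) until the attack point crosses the decision boundary. The sketch below is our minimal illustration of that idea for an RBF-kernel SVM trained with scikit-learn; it is not the authors' released code, the names `decision_gradient` and `evade` are ours, and it assumes the malicious class is labeled +1.

```python
# Minimal sketch (our illustration) of gradient-descent evasion against an
# RBF-kernel SVM: starting from a malicious point x0, repeatedly step against
# the gradient of g(x) = sum_i alpha_i y_i k(x_i, x) + b until g(x) < 0.
import numpy as np
from sklearn.svm import SVC

def decision_gradient(clf, x, gamma):
    """Gradient of g at x for the RBF kernel k(x_i, x) = exp(-gamma ||x_i - x||^2),
    whose gradient w.r.t. x is 2 * gamma * (x_i - x) * k(x_i, x)."""
    sv = clf.support_vectors_          # support vectors x_i
    dual = clf.dual_coef_.ravel()      # dual coefficients alpha_i * y_i
    k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))
    return (dual * k) @ (2.0 * gamma * (sv - x))

def evade(clf, x0, gamma, step=0.1, max_iter=200):
    """Descend g from the malicious point x0 until it is classified benign."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        if clf.decision_function(x[None, :])[0] < 0:
            break                      # crossed onto the benign side
        grad = decision_gradient(clf, x, gamma)
        x -= step * grad / (np.linalg.norm(grad) + 1e-12)  # normalized step
    return x
```

In the chapter's PDF-malware experiments, each step would additionally be projected onto the set of feasible manipulations (e.g., only increments of keyword counts are possible) and may include a mimicry term; this unconstrained sketch leaves both out.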


Notes

  1. This is an instance of the Representer Theorem, which states that solutions to a large class of regularized ERM problems lie in the span of the training data [61]; a worked form for the SVM is sketched after these notes.

  2. In certain abstract models, however, we have shown how regret-minimizing online learning can be used to define reactive approaches that are competitive with proactive security [6].

  3. See [12] for more details on the definition of the data distribution and the resampling algorithm.

  4. http://contagiodump.blogspot.it

  5. We also conducted experiments using C = 0.1 and C = 100, but did not find significant differences compared to the presented results using C = 1.

  6. That is, the learning map's output is not a deterministic function of the training data; the probability in the definition of differential privacy is taken over this randomness. Our treatment here is only as complex as necessary; to be completely general, the events in the definition should be measurable sets \(G \subset \mathcal{H}\) rather than individual elements \(g \in \mathcal{H}\) (the general form is spelled out after these notes).

  7. Recall that the zero-mean multivariate Laplace distribution with scale parameter s has density proportional to \(\exp(-\|\mathbf{x}\|_{1}/s)\); a noise-addition sketch follows these notes.

  8. That is, \(\forall \mathbf{x}\), \(k(\mathbf{x},\mathbf{x}) \leq \kappa^{2}\); e.g., for the RBF kernel the feature-space norm is uniformly one, so \(\kappa = 1\) (worked after these notes); more generally, we can make the standard assumption that the data lies within some \(L_{2}\)-ball of radius \(\kappa\).

  9. Above, we bounded the \(L_{2}\) norms of points in feature space by \(\kappa\); the additional bound on the \(L_{\infty}\) norm here is for convenience and is standard practice in learning-theoretic results.
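
As a worked form of the Representer Theorem instance mentioned in note 1, the SVM solution can be written as a finite kernel expansion over the training points \((\mathbf{x}_i, y_i)\), \(i = 1, \ldots, n\), with dual coefficients \(\alpha_i\):

\[
\mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i\, \phi(\mathbf{x}_i), \qquad g(\mathbf{x}) = \sum_{i=1}^{n} \alpha_i y_i\, k(\mathbf{x}_i, \mathbf{x}) + b .
\]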
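The definition alluded to in note 6, stated over measurable sets as suggested there [30] (we write \(\epsilon\) for the privacy parameter): a randomized learning map \(M\) is \(\epsilon\)-differentially private if, for every pair of training sets \(D, D'\) differing in a single record and every measurable \(G \subset \mathcal{H}\),

\[
\Pr\left[ M(D) \in G \right] \leq e^{\epsilon}\, \Pr\left[ M(D') \in G \right].
\]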
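A minimal sketch, in the spirit of the output-perturbation mechanisms of [21, 60], of how the Laplace noise from note 7 is used: train a linear SVM, then add per-coordinate Laplace noise with scale \(s = \Delta/\epsilon\). The sensitivity \(\Delta\) must be derived analytically and is an assumed input here, not computed by the code, which is our illustration rather than the chapter's exact mechanism.

```python
# Sketch (ours): Laplace output perturbation for a linear SVM's weight vector.
import numpy as np
from sklearn.svm import LinearSVC

def private_svm_weights(X, y, epsilon, sensitivity, C=1.0, seed=None):
    """Train a linear SVM and return its weights plus Laplace noise of scale
    sensitivity / epsilon; `sensitivity` is the L1 sensitivity of the weights
    to changing one training record, assumed to be bounded analytically."""
    rng = np.random.default_rng(seed)
    w = LinearSVC(C=C).fit(X, y).coef_.ravel()
    return w + rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=w.shape)
```

Per-coordinate i.i.d. Laplace draws realize the multivariate density \(\propto \exp(-\|\mathbf{x}\|_{1}/s)\) recalled in note 7.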
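As a concrete check of the kernel bound in note 8, for the standard RBF kernel \(k(\mathbf{x}, \mathbf{x}') = \exp(-\gamma \|\mathbf{x} - \mathbf{x}'\|^{2})\):

\[
k(\mathbf{x}, \mathbf{x}) = \exp(-\gamma \|\mathbf{x} - \mathbf{x}\|^{2}) = e^{0} = 1, \qquad \text{so } \kappa = 1 .
\]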

References

  1. Adobe Systems Incorporated: PDF Reference, sixth edition, Adobe Portable Document Format, Version 1.7, November 2006

  2. Balfanz, D., Staddon, J. (eds.): Proceedings of the 1st ACM Workshop on AISec (AISec). ACM, New York (2008)

  3. Balfanz, D., Staddon, J. (eds.): Proceedings of the 2nd ACM Workshop on Security and Artificial Intelligence (AISec). ACM, New York (2009)

  4. Barreno, M., Nelson, B., Joseph, A., Tygar, J.: The security of machine learning. Mach. Learn. 81, 121–148 (2010). doi:10.1007/s10994-010-5188-5

  5. Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: ASIACCS '06: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp. 16–25. ACM, New York (2006). doi:10.1145/1128817.1128824

  6. Barth, A., Rubinstein, B.I.P., Sundararajan, M., Mitchell, J.C., Song, D., Bartlett, P.L.: A learning-based approach to reactive security. IEEE Trans. Depend. Secure Comput. 9(4), 482–493 (2012)

  7. Biggio, B., Corona, I., Fumera, G., Giacinto, G., Roli, F.: Bagging classifiers for fighting poisoning attacks in adversarial environments. In: The 10th International Workshop on Multiple Classifier Systems (MCS). Lecture Notes in Computer Science, vol. 6713, pp. 350–359. Springer, Berlin (2011)

  8. Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F.: Evasion attacks against machine learning at test time. In: Blockeel, H., et al. (eds.) European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Part III. Lecture Notes in Artificial Intelligence, vol. 8190, pp. 387–402. Springer, Berlin/Heidelberg (2013)

  9. Biggio, B., Didaci, L., Fumera, G., Roli, F.: Poisoning attacks to compromise face templates. In: The 6th IAPR International Conference on Biometrics (ICB) (2013)

  10. Biggio, B., Fumera, G., Pillai, I., Roli, F.: A survey and experimental evaluation of image spam filtering techniques. Pattern Recogn. Lett. 32(10), 1436–1446 (2011)

  11. Biggio, B., Fumera, G., Roli, F.: Design of robust classifiers for adversarial environments. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 977–982 (2011)

  12. Biggio, B., Fumera, G., Roli, F.: Security evaluation of pattern classifiers under attack. IEEE Trans. Knowl. Data Eng. 99, 1 (2013)

  13. Biggio, B., Fumera, G., Roli, F., Didaci, L.: Poisoning adaptive biometric systems. In: Structural, Syntactic, and Statistical Pattern Recognition. Lecture Notes in Computer Science, vol. 7626, pp. 417–425 (2012)

  14. Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. In: Proceedings of the 29th International Conference on Machine Learning (2012)

  15. Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: the SuLQ framework. In: Proceedings of the 24th Symposium on Principles of Database Systems, pp. 128–138 (2005)

  16. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)

  17. Brückner, M., Kanzow, C., Scheffer, T.: Static prediction games for adversarial learning problems. J. Mach. Learn. Res. 13, 2617–2654 (2012)

  18. Cárdenas, A.A., Baras, J.S.: Evaluation of classifiers: practical considerations for security applications. In: AAAI Workshop on Evaluation Methods for Machine Learning (2006)

  19. Cárdenas, A.A., Nelson, B., Rubinstein, B.I. (eds.): The 5th ACM Workshop on Artificial Intelligence and Security (AISec). ACM, New York (2012)

  20. Cauwenberghs, G., Poggio, T.: Incremental and decremental support vector machine learning. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) NIPS, pp. 409–415. MIT Press, Cambridge (2000)

  21. Chaudhuri, K., Monteleoni, C., Sarwate, A.D.: Differentially private empirical risk minimization. J. Mach. Learn. Res. 12, 1069–1109 (2011)

  22. Chen, Y., Cárdenas, A.A., Greenstadt, R., Rubinstein, B. (eds.): Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence (AISec). ACM, New York (2011)

  23. Christmann, A., Steinwart, I.: On robust properties of convex risk minimization methods for pattern recognition. J. Mach. Learn. Res. 5, 1007–1034 (2004)

  24. Corona, I., Biggio, B., Maiorca, D.: Adversarialib: a general-purpose library for the automatic evaluation of machine learning-based classifiers under adversarial attacks. http://sourceforge.net/projects/adversarialib/ (2013)

  25. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)

  26. Dalvi, N., Domingos, P., Mausam, Sanghai, S., Verma, D.: Adversarial classification. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 99–108 (2004)

  27. Dimitrakakis, C., Gkoulalas-Divanis, A., Mitrokotsa, A., Verykios, V.S., Saygin, Y. (eds.): International ECML/PKDD Workshop on Privacy and Security Issues in Data Mining and Machine Learning (2010)

  28. Drucker, H., Wu, D., Vapnik, V.N.: Support vector machines for spam categorization. IEEE Trans. Neural Netw. 10(5), 1048–1054 (1999)

  29. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification. Wiley-Interscience, Chichester (2000)

  30. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference (TCC 2006), pp. 265–284 (2006)

  31. Fogla, P., Sharif, M., Perdisci, R., Kolesnikov, O., Lee, W.: Polymorphic blending attacks. In: Proceedings of the 15th Conference on USENIX Security Symposium (2006)

  32. Globerson, A., Roweis, S.T.: Nightmare at test time: robust learning by feature deletion. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 353–360 (2006)

  33. Golland, P.: Discriminative direction for kernel classifiers. In: Neural Information Processing Systems (NIPS), pp. 745–752. MIT Press, Cambridge (2002)

  34. Greenstadt, R. (ed.): Proceedings of the 3rd ACM Workshop on Artificial Intelligence and Security (AISec). ACM, New York (2010)

  35. Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A.: Robust Statistics: The Approach Based on Influence Functions. Probability and Mathematical Statistics. Wiley, New York (1986). http://www.worldcat.org/isbn/0471735779

  36. Huang, L., Joseph, A.D., Nelson, B., Rubinstein, B., Tygar, J.D.: Adversarial machine learning. In: Proceedings of the 4th ACM Workshop on Artificial Intelligence and Security (AISec), pp. 43–57 (2011)

  37. Joseph, A.D., Laskov, P., Roli, F., Tygar, D. (eds.): Dagstuhl Perspectives Workshop on Machine Learning Methods for Computer Security. Workshop 12371 (2012)

  38. Kloft, M., Laskov, P.: Online anomaly detection under adversarial impact. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 405–412 (2010)

  39. Kloft, M., Laskov, P.: A ‘poisoning’ attack against online anomaly detection. In: Joseph, A.D., Laskov, P., Roli, F., Tygar, D. (eds.) Dagstuhl Perspectives Workshop on Machine Learning Methods for Computer Security. Workshop 12371 (2012)

  40. Kloft, M., Laskov, P.: Security analysis of online centroid anomaly detection. J. Mach. Learn. Res. 13, 3647–3690 (2012)

  41. Kolcz, A., Teo, C.H.: Feature weighting for improved classifier robustness. In: Proceedings of the 6th Conference on Email and Anti-Spam (CEAS) (2009)

  42. Laskov, P., Kloft, M.: A framework for quantitative security analysis of machine learning. In: Proceedings of the 2nd ACM Workshop on Security and Artificial Intelligence, pp. 1–4 (2009)

  43. Laskov, P., Lippmann, R. (eds.): NIPS Workshop on Machine Learning in Adversarial Environments for Computer Security (2007)

  44. Laskov, P., Lippmann, R.: Machine learning in adversarial environments. Mach. Learn. 81, 115–119 (2010)

  45. LeCun, Y., Jackel, L., Bottou, L., Brunot, A., Cortes, C., Denker, J., Drucker, H., Guyon, I., Müller, U., Säckinger, E., Simard, P., Vapnik, V.: Comparison of learning algorithms for handwritten digit recognition. In: International Conference on Artificial Neural Networks, pp. 53–60 (1995)

  46. Lowd, D., Meek, C.: Adversarial learning. In: Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 641–647 (2005)

  47. Lowd, D., Meek, C.: Good word attacks on statistical spam filters. In: Proceedings of the 2nd Conference on Email and Anti-Spam (CEAS) (2005)

  48. Lütkepohl, H.: Handbook of Matrices. Wiley, New York (1996)

  49. Maiorca, D., Corona, I., Giacinto, G.: Looking at the bag is not enough to find the bomb: an evasion of structural methods for malicious PDF files detection. In: Proceedings of the 8th ACM SIGSAC Symposium on Information, Computer and Communications Security (ASIA CCS '13), pp. 119–130. ACM, New York (2013)

  50. Maiorca, D., Giacinto, G., Corona, I.: A pattern recognition system for malicious PDF files detection. In: MLDM, pp. 510–524. Springer, Berlin (2012)

  51. Maronna, R.A., Martin, R.D., Yohai, V.J.: Robust Statistics: Theory and Methods. Probability and Mathematical Statistics. Wiley, New York (2006)

  52. Narayanan, A., Shmatikov, V.: Robust de-anonymization of large sparse datasets. In: Proceedings of the 29th IEEE Symposium on Security and Privacy, pp. 111–125 (2008)

  53. Nelson, B., Barreno, M., Chi, F.J., Joseph, A.D., Rubinstein, B.I.P., Saini, U., Sutton, C., Tygar, J.D., Xia, K.: Exploiting machine learning to subvert your spam filter. In: Proceedings of the 1st USENIX Workshop on Large-Scale Exploits and Emergent Threats, pp. 1–9 (2008)

  54. Nelson, B., Rubinstein, B.I., Huang, L., Joseph, A.D., Lee, S.J., Rao, S., Tygar, J.D.: Query strategies for evading convex-inducing classifiers. J. Mach. Learn. Res. 13, 1293–1332 (2012)

  55. Perdisci, R., Gu, G., Lee, W.: Using an ensemble of one-class SVM classifiers to harden payload-based anomaly detection systems. In: Proceedings of the International Conference on Data Mining (ICDM), pp. 488–498 (2006)

  56. Platt, J.: Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. In: Advances in Large Margin Classifiers, pp. 61–74. MIT Press, Cambridge (2000)

  57. Rizzi, S.: What-if analysis. In: Encyclopedia of Database Systems, pp. 3525–3529. Springer, New York (2009)

  58. Rodrigues, R.N., Ling, L.L., Govindaraju, V.: Robustness of multimodal biometric fusion methods against spoof attacks. J. Vis. Lang. Comput. 20(3), 169–179 (2009)

  59. Rubinstein, B.I., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Antidote: understanding and defending against poisoning of anomaly detectors. In: Proceedings of the 9th Conference on Internet Measurement Conference (IMC), pp. 1–14 (2009)

  60. Rubinstein, B.I.P., Bartlett, P.L., Huang, L., Taft, N.: Learning in a large function space: privacy-preserving mechanisms for SVM learning. J. Privacy Confidentiality 4(1), 65–100 (2012)

  61. Schölkopf, B., Herbrich, R., Smola, A.J.: A generalized representer theorem. In: Computational Learning Theory. Lecture Notes in Computer Science, vol. 2111, pp. 416–426. Springer, Berlin (2001)

  62. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001)

  63. Smutz, C., Stavrou, A.: Malicious PDF detection using metadata and structural features. In: Proceedings of the 28th Annual Computer Security Applications Conference, pp. 239–248 (2012)

  64. Šrndić, N., Laskov, P.: Detection of malicious PDF files based on hierarchical document structure. In: Proceedings of the 20th Annual Network & Distributed System Security Symposium (NDSS) (2013)

  65. Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzz. Knowl. Based Syst. 10(5), 557–570 (2002)

  66. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, New York (1995)

  67. Wittel, G.L., Wu, S.F.: On attacking statistical spam filters. In: Proceedings of the 1st Conference on Email and Anti-Spam (CEAS) (2004)

  68. Young, R.: 2010 IBM X-Force mid-year trend & risk report. Technical Report, IBM (2010)


Acknowledgments

This work has been partly supported by the project CRP-18293 funded by Regione Autonoma della Sardegna, L.R. 7/2007, Bando 2009, and by the project “Advanced and secure sharing of multimedia data over social networks in the future Internet” (CUP F71J1100069 0002) funded by the same institution. Davide Maiorca gratefully acknowledges Regione Autonoma della Sardegna for the financial support of his PhD scholarship (P.O.R. Sardegna F.S.E. Operational Programme of the Autonomous Region of Sardinia, European Social Fund 2007–2013—Axis IV Human Resources, Objective l.3, Line of Activity l.3.1.). Blaine Nelson thanks the Alexander von Humboldt Foundation for providing additional financial support. The opinions expressed in this chapter are solely those of the authors and do not necessarily reflect the opinions of any sponsor.

Author information

Correspondence to Battista Biggio.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Biggio, B. et al. (2014). Security Evaluation of Support Vector Machines in Adversarial Environments. In: Ma, Y., Guo, G. (eds) Support Vector Machines Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-02300-7_4


  • DOI: https://doi.org/10.1007/978-3-319-02300-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02299-4

  • Online ISBN: 978-3-319-02300-7

  • eBook Packages: Engineering, Engineering (R0)
