
Security Evaluation of Support Vector Machines in Adversarial Environments

Chapter in Support Vector Machines Applications

Abstract

Support vector machines (SVMs) are among the most popular classification techniques adopted in security applications such as malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated into real-world security systems, they must be able to cope with attacks that mislead the learning algorithm (poisoning), evade detection at test time (evasion), or extract information about the classifier's internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal, general framework for the empirical evaluation of the security of machine-learning systems. Second, using this framework, we demonstrate the feasibility of evasion, poisoning, and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, in a public repository.
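
To make the evasion setting concrete: the gradient-based evasion strategy evaluated in this chapter (following [8]) descends the SVM's continuous discriminant function \(g(\mathbf{x})\) until the attack point crosses the decision boundary. The sketch below is our minimal illustration of that idea for an RBF-kernel SVM trained with scikit-learn; it is not the authors' released code, the names `decision_gradient` and `evade` are ours, and it assumes the malicious class is labeled +1.

```python
# Minimal sketch (our illustration) of gradient-descent evasion against an
# RBF-kernel SVM: starting from a malicious point x0, repeatedly step against
# the gradient of g(x) = sum_i alpha_i y_i k(x_i, x) + b until g(x) < 0.
import numpy as np
from sklearn.svm import SVC

def decision_gradient(clf, x, gamma):
    """Gradient of g at x for the RBF kernel k(x_i, x) = exp(-gamma ||x_i - x||^2),
    whose gradient w.r.t. x is 2 * gamma * (x_i - x) * k(x_i, x)."""
    sv = clf.support_vectors_          # support vectors x_i
    dual = clf.dual_coef_.ravel()      # dual coefficients alpha_i * y_i
    k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))
    return (dual * k) @ (2.0 * gamma * (sv - x))

def evade(clf, x0, gamma, step=0.1, max_iter=200):
    """Descend g from the malicious point x0 until it is classified benign."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        if clf.decision_function(x[None, :])[0] < 0:
            break                      # crossed onto the benign side
        grad = decision_gradient(clf, x, gamma)
        x -= step * grad / (np.linalg.norm(grad) + 1e-12)  # normalized step
    return x
```

In the chapter's PDF-malware experiments, each step would additionally be projected onto the set of feasible manipulations (e.g., only increments of keyword counts are possible) and may include a mimicry term; this unconstrained sketch leaves both out.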


Notes

  1. This is an instance of the Representer Theorem, which states that solutions to a large class of regularized ERM problems lie in the span of the training data [61]; a worked form for the SVM is sketched after these notes.

  2. In certain abstract models, however, we have shown how regret-minimizing online learning can be used to define reactive approaches that are competitive with proactive security [6].

  3. See [12] for more details on the definition of the data distribution and the resampling algorithm.

  4. http://contagiodump.blogspot.it

  5. We also conducted experiments using C = 0.1 and C = 100, but did not find significant differences compared to the presented results using C = 1.

  6. That is, the learning map's output is not a deterministic function of the training data; the probability in the definition of differential privacy is taken over this randomness. Our treatment here is only as complex as necessary; to be completely general, the events in the definition should be measurable sets \(G \subset \mathcal{H}\) rather than individual elements \(g \in \mathcal{H}\) (the general form is spelled out after these notes).

  7. Recall that the zero-mean multivariate Laplace distribution with scale parameter s has density proportional to \(\exp(-\|\mathbf{x}\|_{1}/s)\); a noise-addition sketch follows these notes.

  8. That is, \(\forall \mathbf{x}\), \(k(\mathbf{x},\mathbf{x}) \leq \kappa^{2}\); e.g., for the RBF kernel the feature-space norm is uniformly one, so \(\kappa = 1\) (worked after these notes); more generally, we can make the standard assumption that the data lies within some \(L_{2}\)-ball of radius \(\kappa\).

  9. Above, we bounded the \(L_{2}\) norms of points in feature space by \(\kappa\); the additional bound on the \(L_{\infty}\) norm here is for convenience and is standard practice in learning-theoretic results.
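
As a worked form of the Representer Theorem instance mentioned in note 1, the SVM solution can be written as a finite kernel expansion over the training points \((\mathbf{x}_i, y_i)\), \(i = 1, \ldots, n\), with dual coefficients \(\alpha_i\):

\[
\mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i\, \phi(\mathbf{x}_i), \qquad g(\mathbf{x}) = \sum_{i=1}^{n} \alpha_i y_i\, k(\mathbf{x}_i, \mathbf{x}) + b .
\]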
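The definition alluded to in note 6, stated over measurable sets as suggested there [30] (we write \(\epsilon\) for the privacy parameter): a randomized learning map \(M\) is \(\epsilon\)-differentially private if, for every pair of training sets \(D, D'\) differing in a single record and every measurable \(G \subset \mathcal{H}\),

\[
\Pr\left[ M(D) \in G \right] \leq e^{\epsilon}\, \Pr\left[ M(D') \in G \right].
\]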
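A minimal sketch, in the spirit of the output-perturbation mechanisms of [21, 60], of how the Laplace noise from note 7 is used: train a linear SVM, then add per-coordinate Laplace noise with scale \(s = \Delta/\epsilon\). The sensitivity \(\Delta\) must be derived analytically and is an assumed input here, not computed by the code, which is our illustration rather than the chapter's exact mechanism.

```python
# Sketch (ours): Laplace output perturbation for a linear SVM's weight vector.
import numpy as np
from sklearn.svm import LinearSVC

def private_svm_weights(X, y, epsilon, sensitivity, C=1.0, seed=None):
    """Train a linear SVM and return its weights plus Laplace noise of scale
    sensitivity / epsilon; `sensitivity` is the L1 sensitivity of the weights
    to changing one training record, assumed to be bounded analytically."""
    rng = np.random.default_rng(seed)
    w = LinearSVC(C=C).fit(X, y).coef_.ravel()
    return w + rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=w.shape)
```

Per-coordinate i.i.d. Laplace draws realize the multivariate density \(\propto \exp(-\|\mathbf{x}\|_{1}/s)\) recalled in note 7.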
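As a concrete check of the kernel bound in note 8, for the standard RBF kernel \(k(\mathbf{x}, \mathbf{x}') = \exp(-\gamma \|\mathbf{x} - \mathbf{x}'\|^{2})\):

\[
k(\mathbf{x}, \mathbf{x}) = \exp(-\gamma \|\mathbf{x} - \mathbf{x}\|^{2}) = e^{0} = 1, \qquad \text{so } \kappa = 1 .
\]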

References

  1. Adobe Systems Incorporated: PDF Reference, sixth edition, Adobe Portable Document Format, Version 1.7, November 2006

  2. Balfanz, D., Staddon, J. (eds.): Proceedings of the 1st ACM Workshop on AISec (AISec). ACM, New York (2008)

  3. Balfanz, D., Staddon, J. (eds.): Proceedings of the 2nd ACM Workshop on Security and Artificial Intelligence (AISec). ACM, New York (2009)

  4. Barreno, M., Nelson, B., Joseph, A., Tygar, J.: The security of machine learning. Mach. Learn. 81, 121–148 (2010). doi:10.1007/s10994-010-5188-5

  5. Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: ASIACCS '06: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp. 16–25. ACM, New York (2006). doi:10.1145/1128817.1128824

  6. Barth, A., Rubinstein, B.I.P., Sundararajan, M., Mitchell, J.C., Song, D., Bartlett, P.L.: A learning-based approach to reactive security. IEEE Trans. Depend. Secure Comput. 9(4), 482–493 (2012)

  7. Biggio, B., Corona, I., Fumera, G., Giacinto, G., Roli, F.: Bagging classifiers for fighting poisoning attacks in adversarial environments. In: The 10th International Workshop on Multiple Classifier Systems (MCS). Lecture Notes in Computer Science, vol. 6713, pp. 350–359. Springer, Berlin (2011)

  8. Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F.: Evasion attacks against machine learning at test time. In: Blockeel, H., et al. (eds.) European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Part III. Lecture Notes in Artificial Intelligence, vol. 8190, pp. 387–402. Springer, Berlin/Heidelberg (2013)

  9. Biggio, B., Didaci, L., Fumera, G., Roli, F.: Poisoning attacks to compromise face templates. In: The 6th IAPR International Conference on Biometrics (ICB) (2013)

  10. Biggio, B., Fumera, G., Pillai, I., Roli, F.: A survey and experimental evaluation of image spam filtering techniques. Pattern Recogn. Lett. 32(10), 1436–1446 (2011)

  11. Biggio, B., Fumera, G., Roli, F.: Design of robust classifiers for adversarial environments. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 977–982 (2011)

  12. Biggio, B., Fumera, G., Roli, F.: Security evaluation of pattern classifiers under attack. IEEE Trans. Knowl. Data Eng. 99, 1 (2013)

  13. Biggio, B., Fumera, G., Roli, F., Didaci, L.: Poisoning adaptive biometric systems. In: Structural, Syntactic, and Statistical Pattern Recognition. Lecture Notes in Computer Science, vol. 7626, pp. 417–425 (2012)

  14. Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. In: Proceedings of the 29th International Conference on Machine Learning (2012)

  15. Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: the SuLQ framework. In: Proceedings of the 24th Symposium on Principles of Database Systems, pp. 128–138 (2005)

  16. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)

  17. Brückner, M., Kanzow, C., Scheffer, T.: Static prediction games for adversarial learning problems. J. Mach. Learn. Res. 13, 2617–2654 (2012)

  18. Cárdenas, A.A., Baras, J.S.: Evaluation of classifiers: practical considerations for security applications. In: AAAI Workshop on Evaluation Methods for Machine Learning (2006)

  19. Cárdenas, A.A., Nelson, B., Rubinstein, B.I. (eds.): The 5th ACM Workshop on Artificial Intelligence and Security (AISec). ACM, New York (2012)

  20. Cauwenberghs, G., Poggio, T.: Incremental and decremental support vector machine learning. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) NIPS, pp. 409–415. MIT Press, Cambridge (2000)

  21. Chaudhuri, K., Monteleoni, C., Sarwate, A.D.: Differentially private empirical risk minimization. J. Mach. Learn. Res. 12, 1069–1109 (2011)

  22. Chen, Y., Cárdenas, A.A., Greenstadt, R., Rubinstein, B. (eds.): Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence (AISec). ACM, New York (2011)

  23. Christmann, A., Steinwart, I.: On robust properties of convex risk minimization methods for pattern recognition. J. Mach. Learn. Res. 5, 1007–1034 (2004)

  24. Corona, I., Biggio, B., Maiorca, D.: Adversarialib: a general-purpose library for the automatic evaluation of machine learning-based classifiers under adversarial attacks. http://sourceforge.net/projects/adversarialib/ (2013)

  25. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)

  26. Dalvi, N., Domingos, P., Mausam, Sanghai, S., Verma, D.: Adversarial classification. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 99–108 (2004)

  27. Dimitrakakis, C., Gkoulalas-Divanis, A., Mitrokotsa, A., Verykios, V.S., Saygin, Y. (eds.): International ECML/PKDD Workshop on Privacy and Security Issues in Data Mining and Machine Learning (2010)

  28. Drucker, H., Wu, D., Vapnik, V.N.: Support vector machines for spam categorization. IEEE Trans. Neural Netw. 10(5), 1048–1054 (1999)

  29. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification. Wiley-Interscience, Chichester (2000)

  30. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference (TCC 2006), pp. 265–284 (2006)

  31. Fogla, P., Sharif, M., Perdisci, R., Kolesnikov, O., Lee, W.: Polymorphic blending attacks. In: Proceedings of the 15th Conference on USENIX Security Symposium (2006)

  32. Globerson, A., Roweis, S.T.: Nightmare at test time: robust learning by feature deletion. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 353–360 (2006)

  33. Golland, P.: Discriminative direction for kernel classifiers. In: Neural Information Processing Systems (NIPS), pp. 745–752. MIT Press, Cambridge (2002)

  34. Greenstadt, R. (ed.): Proceedings of the 3rd ACM Workshop on Artificial Intelligence and Security (AISec). ACM, New York (2010)

  35. Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A.: Robust Statistics: The Approach Based on Influence Functions. Probability and Mathematical Statistics. Wiley, New York (1986). http://www.worldcat.org/isbn/0471735779

  36. Huang, L., Joseph, A.D., Nelson, B., Rubinstein, B., Tygar, J.D.: Adversarial machine learning. In: Proceedings of the 4th ACM Workshop on Artificial Intelligence and Security (AISec), pp. 43–57 (2011)

  37. Joseph, A.D., Laskov, P., Roli, F., Tygar, D. (eds.): Dagstuhl Perspectives Workshop on Machine Learning Methods for Computer Security. Workshop 12371 (2012)

  38. Kloft, M., Laskov, P.: Online anomaly detection under adversarial impact. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 405–412 (2010)

  39. Kloft, M., Laskov, P.: A ‘poisoning’ attack against online anomaly detection. In: Joseph, A.D., Laskov, P., Roli, F., Tygar, D. (eds.) Dagstuhl Perspectives Workshop on Machine Learning Methods for Computer Security. Workshop 12371 (2012)

  40. Kloft, M., Laskov, P.: Security analysis of online centroid anomaly detection. J. Mach. Learn. Res. 13, 3647–3690 (2012)

  41. Kolcz, A., Teo, C.H.: Feature weighting for improved classifier robustness. In: Proceedings of the 6th Conference on Email and Anti-Spam (CEAS) (2009)

  42. Laskov, P., Kloft, M.: A framework for quantitative security analysis of machine learning. In: Proceedings of the 2nd ACM Workshop on Security and Artificial Intelligence, pp. 1–4 (2009)

  43. Laskov, P., Lippmann, R. (eds.): NIPS Workshop on Machine Learning in Adversarial Environments for Computer Security (2007)

  44. Laskov, P., Lippmann, R.: Machine learning in adversarial environments. Mach. Learn. 81, 115–119 (2010)

  45. LeCun, Y., Jackel, L., Bottou, L., Brunot, A., Cortes, C., Denker, J., Drucker, H., Guyon, I., Müller, U., Säckinger, E., Simard, P., Vapnik, V.: Comparison of learning algorithms for handwritten digit recognition. In: International Conference on Artificial Neural Networks, pp. 53–60 (1995)

  46. Lowd, D., Meek, C.: Adversarial learning. In: Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 641–647 (2005)

  47. Lowd, D., Meek, C.: Good word attacks on statistical spam filters. In: Proceedings of the 2nd Conference on Email and Anti-Spam (CEAS) (2005)

  48. Lütkepohl, H.: Handbook of Matrices. Wiley, New York (1996)

  49. Maiorca, D., Corona, I., Giacinto, G.: Looking at the bag is not enough to find the bomb: an evasion of structural methods for malicious PDF files detection. In: Proceedings of the 8th ACM SIGSAC Symposium on Information, Computer and Communications Security (ASIA CCS '13), pp. 119–130. ACM, New York (2013)

  50. Maiorca, D., Giacinto, G., Corona, I.: A pattern recognition system for malicious PDF files detection. In: MLDM, pp. 510–524. Springer, Berlin (2012)

  51. Maronna, R.A., Martin, R.D., Yohai, V.J.: Robust Statistics: Theory and Methods. Probability and Mathematical Statistics. Wiley, New York (2006)

  52. Narayanan, A., Shmatikov, V.: Robust de-anonymization of large sparse datasets. In: Proceedings of the 29th IEEE Symposium on Security and Privacy, pp. 111–125 (2008)

  53. Nelson, B., Barreno, M., Chi, F.J., Joseph, A.D., Rubinstein, B.I.P., Saini, U., Sutton, C., Tygar, J.D., Xia, K.: Exploiting machine learning to subvert your spam filter. In: Proceedings of the 1st USENIX Workshop on Large-Scale Exploits and Emergent Threats, pp. 1–9 (2008)

  54. Nelson, B., Rubinstein, B.I., Huang, L., Joseph, A.D., Lee, S.J., Rao, S., Tygar, J.D.: Query strategies for evading convex-inducing classifiers. J. Mach. Learn. Res. 13, 1293–1332 (2012)

  55. Perdisci, R., Gu, G., Lee, W.: Using an ensemble of one-class SVM classifiers to harden payload-based anomaly detection systems. In: Proceedings of the International Conference on Data Mining (ICDM), pp. 488–498 (2006)

  56. Platt, J.: Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. In: Advances in Large Margin Classifiers, pp. 61–74. MIT Press, Cambridge (2000)

  57. Rizzi, S.: What-if analysis. In: Encyclopedia of Database Systems, pp. 3525–3529. Springer, New York (2009)

  58. Rodrigues, R.N., Ling, L.L., Govindaraju, V.: Robustness of multimodal biometric fusion methods against spoof attacks. J. Vis. Lang. Comput. 20(3), 169–179 (2009)

  59. Rubinstein, B.I., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Antidote: understanding and defending against poisoning of anomaly detectors. In: Proceedings of the 9th Conference on Internet Measurement Conference (IMC), pp. 1–14 (2009)

  60. Rubinstein, B.I.P., Bartlett, P.L., Huang, L., Taft, N.: Learning in a large function space: privacy-preserving mechanisms for SVM learning. J. Privacy Confidentiality 4(1), 65–100 (2012)

  61. Schölkopf, B., Herbrich, R., Smola, A.J.: A generalized representer theorem. In: Computational Learning Theory. Lecture Notes in Computer Science, vol. 2111, pp. 416–426. Springer, Berlin (2001)

  62. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001)

  63. Smutz, C., Stavrou, A.: Malicious PDF detection using metadata and structural features. In: Proceedings of the 28th Annual Computer Security Applications Conference, pp. 239–248 (2012)

  64. Šrndić, N., Laskov, P.: Detection of malicious PDF files based on hierarchical document structure. In: Proceedings of the 20th Annual Network & Distributed System Security Symposium (NDSS) (2013)

  65. Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzz. Knowl. Based Syst. 10(5), 557–570 (2002)

  66. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, New York (1995)

  67. Wittel, G.L., Wu, S.F.: On attacking statistical spam filters. In: Proceedings of the 1st Conference on Email and Anti-Spam (CEAS) (2004)

  68. Young, R.: 2010 IBM X-Force mid-year trend & risk report. Technical Report, IBM (2010)


Acknowledgments

This work has been partly supported by the project CRP-18293 funded by Regione Autonoma della Sardegna, L.R. 7/2007, Bando 2009, and by the project “Advanced and secure sharing of multimedia data over social networks in the future Internet” (CUP F71J1100069 0002) funded by the same institution. Davide Maiorca gratefully acknowledges Regione Autonoma della Sardegna for the financial support of his PhD scholarship (P.O.R. Sardegna F.S.E. Operational Programme of the Autonomous Region of Sardinia, European Social Fund 2007–2013—Axis IV Human Resources, Objective l.3, Line of Activity l.3.1.). Blaine Nelson thanks the Alexander von Humboldt Foundation for providing additional financial support. The opinions expressed in this chapter are solely those of the authors and do not necessarily reflect the opinions of any sponsor.

Author information

Correspondence to Battista Biggio.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Biggio, B. et al. (2014). Security Evaluation of Support Vector Machines in Adversarial Environments. In: Ma, Y., Guo, G. (eds) Support Vector Machines Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-02300-7_4


  • DOI: https://doi.org/10.1007/978-3-319-02300-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02299-4

  • Online ISBN: 978-3-319-02300-7

  • eBook Packages: Engineering, Engineering (R0)
