Multiple classifier systems for robust classifier design in adversarial environments

  • Battista Biggio
  • Giorgio Fumera
  • Fabio Roli
Original Article

Abstract

Pattern recognition systems are increasingly used in adversarial environments such as network intrusion detection, spam filtering, and biometric authentication and verification, in which an adversary may adaptively manipulate data to make a classifier ineffective. Current theory and design methods for pattern recognition systems do not take into account the adversarial nature of such applications. Extending them to adversarial settings is thus necessary to safeguard the security and reliability of pattern recognition systems in adversarial environments. In this paper we focus on a strategy recently proposed in the literature to improve the robustness of linear classifiers to adversarial data manipulation, and experimentally investigate whether it can be implemented using two well-known techniques for constructing multiple classifier systems, namely bagging and the random subspace method. Our results provide some hints on the potential usefulness of classifier ensembles in adversarial classification tasks, a motivation different from those suggested so far in the literature.

Keywords

Adversarial classification, Multiple classifier systems, Robust classifiers, Linear classifiers

Notes

Acknowledgments

This work was partly supported by a grant awarded to B. Biggio by Regione Autonoma della Sardegna, PO Sardegna FSE 2007–2013, L.R. 7/2007 “Promotion of the scientific research and technological innovation in Sardinia”.

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
