Learning under p-tampering poisoning attacks

  • Saeed Mahloujifar
  • Dimitrios I. Diochnos
  • Mohammad Mahmoody


Recently, Mahloujifar and Mahmoody (Theory of Cryptography Conference ’17) studied attacks against learning algorithms using a special case of Valiant’s malicious noise, called p-tampering, in which the adversary may change each training example with independent probability p but is limited to choosing ‘adversarial’ examples with correct labels. They obtained p-tampering attacks that increase the error probability in the so-called ‘targeted’ poisoning model, in which the adversary’s goal is to increase the loss of the trained hypothesis on a particular test example. At the heart of their attack was an efficient algorithm for biasing the expected value of any bounded real-valued function through p-tampering. In this work, we present new biasing attacks for increasing the expected value of bounded real-valued functions. Our improved biasing attacks directly imply improved p-tampering attacks against learners in the targeted poisoning model. As a bonus, our attacks come with a considerably simpler analysis. We also study the possibility of PAC learning under p-tampering attacks in the non-targeted (a.k.a. indiscriminate) setting, where the adversary’s goal is to increase the risk of the generated hypothesis (over a random test example). We show that PAC learning is possible under p-tampering poisoning attacks essentially whenever it is possible in the realizable setting without attacks. We further show that PAC learning under ‘no-mistake’ adversarial noise is not possible if the adversary can choose which (still limited to a p fraction of) tampered examples to substitute with adversarially chosen ones. Our formal model for such ‘bounded-budget’ tampering attackers is inspired by notions of adaptive corruption in cryptography.
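To illustrate the biasing primitive the abstract refers to, the following is a minimal sketch of a greedy p-tampering attack that biases the expected value of a bounded function upward. It is an illustration of the general idea, not the paper's exact algorithm: the names `p_tamper` and `_estimate`, and the use of Monte-Carlo estimation of the conditional expectation, are our own assumptions. Each position is tamperable with independent probability p; when it is, the adversary draws a fresh valid example and keeps whichever candidate has the higher estimated conditional expectation of f given the prefix.

```python
import random


def _estimate(prefix, sample, f, n, trials):
    """Monte-Carlo estimate of E[f(x_1..x_n)] given a fixed prefix."""
    total = 0.0
    for _ in range(trials):
        xs = prefix + [sample() for _ in range(n - len(prefix))]
        total += f(xs)
    return total / trials


def p_tamper(sample, f, n, p, trials=200, rng=random):
    """Greedy p-tampering sketch: bias E[f(x_1..x_n)] upward.

    sample(): draws one honest example (the adversary may only
              substitute other valid examples drawn the same way).
    f(xs):    bounded function of the full n-example sequence.
    Each position is tamperable with independent probability p; when
    it is, the adversary keeps whichever of two candidates has the
    higher estimated conditional expectation of f.
    """
    prefix = []
    for _ in range(n):
        honest = sample()
        if rng.random() < p:  # this position is tamperable
            alt = sample()
            if (_estimate(prefix + [alt], sample, f, n, trials)
                    > _estimate(prefix + [honest], sample, f, n, trials)):
                honest = alt
        prefix.append(honest)
    return prefix
```

For instance, with uniform bits and f the empirical mean, the tampered sequence's mean rises noticeably above the honest value 1/2, roughly by p/4 in this toy case, matching the intuition that a tamperable position lets the adversary keep the better of two draws.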


Keywords: Poisoning attacks · Adversarial machine learning · p-tampering attacks







We thank the anonymous reviewers of the International Conference on Algorithmic Learning Theory (ALT) 2018 as well as the International Symposium on Artificial Intelligence and Mathematics (ISAIM) 2018 for their useful comments on earlier versions of this work.


  1. Awasthi, P., Balcan, M.F., Long, P.M.: The power of localization for efficiently learning linear separators with noise. In: Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pp. 449–458. ACM (2014)
  2. Austrin, P., Chung, K.-M., Mahmoody, M., Pass, R., Seth, K.: On the impossibility of cryptography with tamperable randomness. In: International Cryptology Conference, pp. 462–479. Springer (2014)
  3. Angluin, D., Krikis, M., Sloan, R.H., Turán, G.: Malicious omissions and errors in answers to membership queries. Mach. Learn. 28(2–3), 211–255 (1997)
  4. Aumann, Y., Lindell, Y.: Security against covert adversaries: efficient protocols for realistic adversaries. Theory Cryptogr., 137–156 (2007)
  5. Angluin, D.: Queries and concept learning. Mach. Learn. 2(4), 319–342 (1987)
  6. Beigi, S., Etesami, O., Gohari, A.: Deterministic randomness extraction from generalized and distributed Santha–Vazirani sources. SIAM J. Comput. 46(1), 1–36 (2017)
  7. Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Occam’s razor. Inf. Process. Lett. 24(6), 377–380 (1987)
  8. Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik–Chervonenkis dimension. J. ACM 36(4), 929–965 (1989)
  9. Bshouty, N.H., Eiron, N., Kushilevitz, E.: PAC learning with nasty noise. Theor. Comput. Sci. 288(2), 255–275 (2002)
  10. Bentov, I., Gabizon, A., Zuckerman, D.: Bitcoin beacon. arXiv:1605.04559 (2016)
  11. Benedek, G.M., Itai, A.: Learnability with respect to fixed distributions. Theor. Comput. Sci. 86(2), 377–390 (1991)
  12. Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. In: Proceedings of the 29th International Conference on Machine Learning, pp. 1467–1474. Omnipress (2012)
  13. Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp. 16–25. ACM (2006)
  14. Canetti, R., Feige, U., Goldreich, O., Naor, M.: Adaptively secure multi-party computation. In: 28th Annual ACM Symposium on Theory of Computing, pp. 639–648. ACM Press, Philadelphia (1996)
  15. Chor, B., Goldreich, O.: Unbiased bits from sources of weak randomness and probabilistic communication complexity. In: Proc. 26th FOCS, pp. 429–442. IEEE (1985)
  16. Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23(4), 493–507 (1952)
  17. Charikar, M., Steinhardt, J., Valiant, G.: Learning from untrusted data. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp. 47–60. ACM (2017)
  18. Diochnos, D.I.: On the evolution of monotone conjunctions: drilling for best approximations. In: ALT, pp. 98–112 (2016)
  19. Diakonikolas, I., Kamath, G., Kane, D.M., Li, J., Moitra, A., Stewart, A.: Robust estimators in high dimensions without the computational intractability. In: 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp. 655–664. IEEE (2016)
  20. Diakonikolas, I., Kamath, G., Kane, D.M., Li, J., Steinhardt, J., Stewart, A.: Sever: a robust meta-algorithm for stochastic optimization. arXiv (2018)
  21. Diakonikolas, I., Kane, D.M., Stewart, A.: Statistical query lower bounds for robust estimation of high-dimensional Gaussians and Gaussian mixtures. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pp. 73–84. IEEE (2017)
  22. Diakonikolas, I., Kane, D.M., Stewart, A.: List-decodable robust mean estimation and learning mixtures of spherical Gaussians. In: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1047–1060. ACM (2018)
  23. Diakonikolas, I., Kong, W., Stewart, A.: Efficient algorithms and lower bounds for robust linear regression. arXiv:1806.00040 (2018)
  24. Dodis, Y., Ong, S.J., Prabhakaran, M., Sahai, A.: On the (im)possibility of cryptography with imperfect randomness. In: IEEE Symposium on Foundations of Computer Science (FOCS) (2004)
  25. Dwork, C., Rothblum, G.N., Vadhan, S.: Boosting and differential privacy. In: 2010 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 51–60. IEEE (2010)
  26. Dodis, Y., Yao, Y.: Privacy with imperfect randomness. In: Annual Cryptology Conference, pp. 463–482. Springer (2015)
  27. Ehrenfeucht, A., Haussler, D., Kearns, M.J., Valiant, L.G.: A general lower bound on the number of examples needed for learning. Inf. Comput. 82(3), 247–261 (1989)
  28. Etesami, O., Mahloujifar, S., Mahmoody, M.: Computational concentration of measure: optimal bounds, reductions, and more. arXiv:1907.05401. To appear in SODA 2020 (2019)
  29. González, C.R., Abu-Mostafa, Y.S.: Mismatched training and test distributions can outperform matched ones. Neural Comput. 27(2), 365–387 (2015)
  30. Garg, S., Jha, S., Mahloujifar, S., Mahmoody, M.: Adversarially robust learning could leverage computational hardness. arXiv:1905.11564 (2019)
  31. Goldwasser, S., Kalai, Y.T., Park, S.: Adaptively secure coin-flipping, revisited (2015)
  32. Goldwasser, S., Kalai, Y.T., Park, S.: Adaptively secure coin-flipping, revisited. In: International Colloquium on Automata, Languages, and Programming, pp. 663–674. Springer (2015)
  33. Haitner, I., Ishai, Y., Kushilevitz, E., Lindell, Y., Petrank, E.: Black-box constructions of protocols for secure computation. Cryptology ePrint Archive, Report 2010/164 (2010)
  34. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)
  35. Kearns, M.J., Li, M.: Learning in the presence of malicious errors. SIAM J. Comput. 22(4), 807–837 (1993)
  36. Lai, K.A., Rao, A.B., Vempala, S.: Agnostic estimation of mean and covariance. In: 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp. 665–674. IEEE (2016)
  37. Mahloujifar, S., Mahmoody, M.: Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners. In: Theory of Cryptography Conference, pp. 245–279. Springer (2017)
  38. Mahloujifar, S., Mahmoody, M.: Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners. Cryptology ePrint Archive, Report 2017/950 (2017)
  39. Mahloujifar, S., Mahmoody, M.: Can adversarially robust learning leverage computational hardness? In: Algorithmic Learning Theory, pp. 581–609 (2019)
  40. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008)
  41. Papernot, N., McDaniel, P., Sinha, A., Wellman, M.: Towards the science of security and privacy in machine learning. arXiv:1611.03814 (2016)
  42. Prasad, A., Suggala, A.S., Balakrishnan, S., Ravikumar, P.: Robust estimation via robust gradient estimation. arXiv:1802.06485 (2018)
  43. Pitt, L., Valiant, L.G.: Computational limitations on learning from examples. J. ACM 35(4), 965–984 (1988)
  44. Rao, C.R.: Information and the accuracy attainable in the estimation of statistical parameters. In: Breakthroughs in Statistics, pp. 235–247. Springer (1992)
  45. Rubinstein, B.I.P., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Antidote: understanding and defending against poisoning of anomaly detectors. In: Proceedings of the 9th ACM SIGCOMM Internet Measurement Conference, pp. 1–14. ACM (2009)
  46. Rubinstein, B.I.P., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Stealthy poisoning attacks on PCA-based anomaly detectors. ACM SIGMETRICS Perform. Eval. Rev. 37(2), 73–74 (2009)
  47. Reingold, O., Vadhan, S., Wigderson, A.: A note on extracting randomness from Santha–Vazirani sources. Unpublished manuscript (2004)
  48. Shafahi, A., Huang, W.R., Najibi, M., Suciu, O., Studer, C., Dumitras, T., Goldstein, T.: Poison frogs! Targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems, pp. 6103–6113 (2018)
  49. Sloan, R.H.: Four types of noise in data for PAC learning. Inf. Process. Lett. 54(3), 157–162 (1995)
  50. Shen, S., Tople, S., Saxena, P.: Auror: defending against poisoning attacks in collaborative deep learning systems. In: Proceedings of the 32nd Annual Conference on Computer Security Applications, pp. 508–519. ACM (2016)
  51. Santha, M., Vazirani, U.V.: Generating quasi-random sequences from semi-random sources. J. Comput. Syst. Sci. 33(1), 75–87 (1986)
  52. Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
  53. Valiant, L.G.: Learning disjunctions of conjunctions. In: IJCAI, pp. 560–566 (1985)
  54. Von Neumann, J.: Various techniques used in connection with random digits. Appl. Math. Ser. 12, 36–38 (1951)
  55. Wang, Y., Chaudhuri, K.: Data poisoning attacks against online learning. arXiv:1808.08994 (2018)
  56. Xiao, H., Biggio, B., Brown, G., Fumera, G., Eckert, C., Roli, F.: Is feature selection secure against training data poisoning? In: ICML, pp. 1689–1698 (2015)
  57. Xu, H., Mannor, S.: Robustness and generalization. Mach. Learn. 86(3), 391–423 (2012)
  58. Yamazaki, K., Kawanabe, M., Watanabe, S., Sugiyama, M., Müller, K.-R.: Asymptotic Bayesian generalization error when training and test distributions are different. In: ICML, pp. 1079–1086 (2007)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Virginia, Charlottesville, USA
