
Deep k-NN Defense Against Clean-Label Data Poisoning Attacks

  • Conference paper

Computer Vision – ECCV 2020 Workshops (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12535)

Abstract

Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly labeled, minimally perturbed samples into the training data, causing a model to misclassify a particular test sample during inference. Although defenses have been proposed for general poisoning attacks, no reliable defense for clean-label attacks has been demonstrated, despite the attacks’ effectiveness and realistic applications. In this work, we propose a simple yet highly effective Deep k-NN defense against both feature collision and convex polytope clean-label attacks on the CIFAR-10 dataset. We demonstrate that our proposed strategy detects over 99% of poisoned examples in both attacks and removes them without compromising model performance. Additionally, through ablation studies, we derive simple guidelines for selecting the value of k as well as for applying the Deep k-NN defense to real-world datasets with class imbalance. Our proposed defense shows that current clean-label poisoning attack strategies can be annulled, and it serves as a strong yet simple-to-implement baseline against which to test future clean-label poisoning attacks. Our code is available on GitHub.
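
The defense described above is straightforward to sketch: represent each training point by its deep features (e.g., the penultimate-layer activations of a network trained on the unfiltered data), flag any point whose label is not a plurality label among its k nearest neighbors in that feature space, and retrain on the surviving points. The snippet below is a minimal NumPy/scikit-learn illustration under those assumptions, not the authors' released implementation; the function deep_knn_filter and its argument names are hypothetical.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def deep_knn_filter(features, labels, k):
    """Flag training points whose label disagrees with the plurality
    label of their k nearest neighbors in deep feature space.

    features: (n, d) array of deep (e.g., penultimate-layer) activations
    labels:   (n,) array of non-negative integer class labels
    Returns a boolean mask that is True for points to keep.
    """
    num_classes = int(labels.max()) + 1
    # Query k + 1 neighbors because each point is its own nearest neighbor.
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nbrs.kneighbors(features)  # idx has shape (n, k + 1)

    keep = np.ones(len(labels), dtype=bool)
    for i, neigh in enumerate(idx):
        # Count neighbor labels, dropping the point itself.
        votes = np.bincount(labels[neigh[1:]], minlength=num_classes)
        # Flag the point if its own label is not a plurality label
        # among its neighbors (ties count in the point's favor).
        if votes[labels[i]] < votes.max():
            keep[i] = False
    return keep
```

A typical use would be keep = deep_knn_filter(feats, y, k), followed by retraining the classifier on only the retained points; the paper's ablation studies are the authoritative guide for choosing k and for handling class imbalance.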

N. Peri, N. Gupta, and W. R. Huang contributed equally to this work.


Notes

  1. We use the conventional training loss for all defenses except adversarial training.

  2. There is no filtering in adversarial training.



Acknowledgement

Dickerson and Gupta were supported in part by NSF CAREER Award IIS-1846237, DARPA GARD HR00112020007, DARPA SI3-CMD S4761, DoD WHS Award HQ003420F0035, and a Google Faculty Research Award. Goldstein and his students were supported by the DARPA GARD and DARPA QED4RML programs. Additional support was provided by the National Science Foundation DMS division and the JP Morgan Fellowship program.

Author information


Corresponding author

Correspondence to W. Ronny Huang.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Peri, N. et al. (2020). Deep k-NN Defense Against Clean-Label Data Poisoning Attacks. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12535. Springer, Cham. https://doi.org/10.1007/978-3-030-66415-2_4


  • DOI: https://doi.org/10.1007/978-3-030-66415-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66414-5

  • Online ISBN: 978-3-030-66415-2

  • eBook Packages: Computer Science, Computer Science (R0)
