
DunDi: Improving Robustness of Neural Networks Using Distance Metric Learning

  • Lei Cui
  • Rongrong Xi
  • Zhiyu Hao
  • Xuehao Yu
  • Lei Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11537)

Abstract

Deep neural networks (DNNs), although highly accurate, are vulnerable to adversarial attacks: a slight perturbation applied to a sample may lead the DNN to mispredict, even when the perturbation is imperceptible to humans. This defect leaves DNNs without robustness to malicious perturbations and thus limits their use in many safety-critical systems. To this end, we present DunDi, a metric-learning-based classification model that can defend against adversarial attacks. The key idea behind DunDi is a metric learning model that pulls samples of the same label together while pushing samples of different labels apart. Consequently, the distance between samples and the model's decision boundary is enlarged, so that significantly larger perturbations are required to fool the model. On top of this distance comparison, we propose a two-step classification algorithm that performs efficiently for multi-class classification. DunDi can not only build and train a new customized model but also incorporate available pre-trained neural network models to take full advantage of their capabilities. The results show that DunDi defends against 94.39% and 88.91% of the adversarial samples generated by four state-of-the-art adversarial attacks on the MNIST and CIFAR-10 datasets, respectively, without hurting classification accuracy.
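
To make the core mechanism concrete, the sketch below illustrates the two ingredients the abstract describes: a triplet-style loss that pulls same-label embeddings together while pushing different-label embeddings apart, and a distance-based classification rule. This is a minimal illustration, not the authors' implementation: the function names, the class-mean prototypes, and the single-margin triplet form are assumptions, and DunDi's actual two-step algorithm may differ.

    import numpy as np

    def triplet_loss(anchor, positive, negative, margin=1.0):
        # anchor and positive share a label; negative has a different label.
        # All three are embedding vectors produced by the network.
        d_pos = np.sum((anchor - positive) ** 2)  # distance to same-label sample
        d_neg = np.sum((anchor - negative) ** 2)  # distance to different-label sample
        # Zero loss once the negative is at least `margin` farther away than
        # the positive: minimizing this pulls same-label samples together and
        # pushes different-label samples apart.
        return max(d_pos - d_neg + margin, 0.0)

    def classify_by_distance(embedding, class_prototypes):
        # Illustrative distance-based rule (the paper's two-step algorithm
        # may differ): assign the label whose prototype embedding is nearest.
        # class_prototypes maps each label to a representative embedding,
        # e.g. the mean embedding of that class's training samples.
        return min(class_prototypes,
                   key=lambda label: np.sum((embedding - class_prototypes[label]) ** 2))

Because such training enlarges the margin between classes in the embedding space, an adversary must apply a correspondingly larger perturbation to push a sample across a class boundary, which is the robustness effect the abstract claims.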

Keywords

Robustness · Deep neural network · Metric learning

Notes

Acknowledgement

We thank the anonymous reviewers of ICCS 2019 and the researchers of the George Washington University for their valuable comments. This work was supported by the National Natural Science Foundation of China (grant nos. 61602465 and 61601458), the National Key Research and Development Program of China (grant no. 2016QY04W0804), and the Beijing Natural Science Foundation (grant no. 4172069).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Lei Cui
    • 1
  • Rongrong Xi
    • 1
  • Zhiyu Hao
    • 1
  • Xuehao Yu
    • 2
  • Lei Zhang
    • 1
  1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
  2. State Grid Information and Telecommunication Branch, Beijing, China
