Accelerate Black-Box Attack with White-Box Prior Knowledge

  • Jinghui Cai
  • Boyang Wang
  • Xiangfeng Wang (corresponding author)
  • Bo Jin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11936)

Abstract

We propose an efficient adversarial attack method for the black-box setting. Our Multi-model Efficient Query Attack (MEQA) exploits prior knowledge about the relationships among different models to guide the construction of black-box adversarial examples. At each step, MEQA computes gradients from several white-box surrogate models and selects the “best” one as a substitute for the unavailable black-box gradient. The resulting composite gradient drives a significant increase in the black-box model’s loss on the adversarial images and thus causes misclassification. Our key idea is to approximate the black-box model with several existing white-box models, which substantially improves efficiency in terms of both query count and computation. Compared with gradient-estimation-based black-box attacks, MEQA reduces the number of queries from roughly 10000 to 40, greatly accelerating the black-box adversarial attack. Compared with zero-query black-box attacks, also known as transfer attacks, MEQA boosts the attack success rate by 30%. We evaluate our method on several black-box models and achieve strong performance, which suggests that MEQA can serve as a baseline method for fast and effective black-box adversarial attacks.
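As a rough illustration of the per-step selection idea described in the abstract, the sketch below shows one plausible attack iteration in PyTorch. It is not the authors’ implementation: the names (meqa_step, query_black_box, surrogates) are hypothetical, soft-label (logit) access to the black-box model is assumed so that candidates can be scored by their loss, and an FGSM-style signed-gradient step is used for concreteness since the abstract does not specify the step rule.

```python
import torch
import torch.nn.functional as F

def meqa_step(x_adv, x_orig, label, surrogates, query_black_box,
              epsilon=8 / 255, alpha=2 / 255):
    """One MEQA-style iteration (sketch): take a candidate step along each
    white-box surrogate's gradient, query the black-box model once per
    candidate, and keep the candidate with the highest black-box loss."""
    best_loss, best_x = None, x_adv
    for model in surrogates:
        # Gradient of the loss w.r.t. the input, from this white-box surrogate.
        x = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), label)
        grad, = torch.autograd.grad(loss, x)
        # Signed-gradient step, projected to the epsilon-ball around the
        # original image and to the valid pixel range [0, 1].
        candidate = x_adv + alpha * grad.sign()
        candidate = torch.min(torch.max(candidate, x_orig - epsilon),
                              x_orig + epsilon).clamp(0, 1)
        # One black-box query (assumed to return logits) scores this candidate.
        bb_loss = F.cross_entropy(query_black_box(candidate), label)
        if best_loss is None or bb_loss > best_loss:
            best_loss, best_x = bb_loss, candidate.detach()
    return best_x
```

Iterating this step k times over m surrogates issues k·m black-box queries in total, which is consistent in spirit with the low query budget (around 40 queries) reported in the abstract.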

Keywords

Efficient black-box attack · Gradient estimation · Transfer attack · Model robustness

Notes

Acknowledgement

This work is supported by NSFC grants 61702188 and U1509219.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jinghui Cai (1)
  • Boyang Wang (2)
  • Xiangfeng Wang (1), corresponding author
  • Bo Jin (1)
  1. Shanghai Key Lab for Trustworthy Computing, School of Computer Science and Technology, East China Normal University, Shanghai, China
  2. School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
