Adversarial Minimax Training for Robustness Against Adversarial Examples
In this paper, we propose a novel method for improving robustness against adversarial examples. Conventional defenses train a classifier on adversarial examples generated by one specific attack, and can therefore defend against only that limited class of perturbations. To cover a wider range of adversarial examples, the proposed method uses two networks: a generator network and a classifier network. The generator produces the noise that forms an adversarial example, and the classifier acquires robustness by training on that example, so the two networks play a minimax game. Computer simulation results show that, under black-box attacks with adversarial examples generated by several different methods, the proposed method is more robust than conventional adversarial training.
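The minimax training loop described above can be sketched on a toy problem: a logistic-regression "classifier" is trained to minimise cross-entropy on perturbed inputs, while a linear-plus-tanh "generator" producing an ε-bounded perturbation is updated by gradient ascent on the same loss. Everything here (the toy 2-D data, the ε budget, the linear generator, the learning rates) is an illustrative assumption, not the paper's actual architecture or training setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class data: two well-separated Gaussian blobs in 2-D.
X = np.vstack([rng.normal(-1.5, 0.5, (100, 2)),
               rng.normal(+1.5, 0.5, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

eps = 0.3            # perturbation budget (assumed)
lr_c, lr_g = 0.1, 0.1

w, b = np.zeros(2), 0.0            # classifier: logistic regression
Wg = rng.normal(0, 0.1, (2, 2))    # generator: linear map -> tanh -> eps-ball

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(500):
    # Generator output: bounded perturbation delta = eps * tanh(X @ Wg).
    t = np.tanh(X @ Wg)
    delta = eps * t
    X_adv = X + delta
    p = sigmoid(X_adv @ w + b)
    g = p - y                       # dLoss/dlogit, per sample

    # Classifier step: gradient DESCENT on cross-entropy over X_adv.
    w -= lr_c * (X_adv.T @ g) / len(y)
    b -= lr_c * g.mean()

    # Generator step: gradient ASCENT on the same loss w.r.t. Wg.
    # dL/ddelta_i = g_i * w;  ddelta/dWg via tanh'(u) = 1 - tanh(u)^2.
    d_delta = g[:, None] * w
    grad_Wg = X.T @ (d_delta * eps * (1 - t**2)) / len(y)
    Wg += lr_g * grad_Wg

acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()   # clean accuracy after training
```

Because the generator's perturbation is squashed through `tanh` and scaled by `eps`, the adversary is constrained to an ℓ∞ ball, and the classifier ends up robust within that budget while keeping high clean accuracy on this separable toy problem.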
Keywords: Adversarial examples · Adversarial training · Black box attack
This work was supported in part by JSPS KAKENHI (Grant no. 16K00329).