
API-Net: Robust Generative Classifier via a Single Discriminator

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12358)

Abstract

Robustness of deep neural network classifiers has been attracting increasing attention. For robust classification, a generative classifier typically models the joint distribution of inputs and labels and can therefore better handle off-manifold examples, but at the cost of a concise structure. In contrast, a discriminative classifier models only the conditional distribution of labels given inputs, and its succinct structure makes effective optimization easier. This work aims for a generative classifier that profits from the merits of both. To this end, we propose an Anti-Perturbation Inference (API) method, which searches for anti-perturbations that maximize a lower bound on the joint log-likelihood of inputs and classes. By using this lower bound to approximate Bayes’ rule, we construct a generative classifier, the Anti-Perturbation Inference Net (API-Net), on top of a single discriminator. It exploits generative properties to tackle off-manifold examples while retaining a succinct structure that allows effective optimization. Experiments show that API successfully neutralizes adversarial perturbations, and API-Net consistently outperforms state-of-the-art defenses on prevailing benchmarks, including CIFAR-10, MNIST, and SVHN. Our code is available at github.com/dongxinshuai/API-Net.
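To make the inference procedure concrete, the following is a minimal sketch, in PyTorch, of how anti-perturbation inference could operate on top of a single discriminator. It treats the discriminator's per-class score as a stand-in for the lower bound on the joint log-likelihood, searches an anti-perturbation for each class by projected gradient ascent inside an l-infinity ball, and predicts the class with the highest maximized score, mimicking the Bayes'-rule approximation described above. The function name api_predict, the ball radius, step size, iteration count, and the use of raw logits are illustrative assumptions and not the paper's exact formulation.

    import torch

    def api_predict(f, x, num_classes, eps=8/255, step=2/255, iters=10):
        # Classify a batch x by searching, for every class y, an anti-perturbation r
        # (inside an l_inf ball of radius eps) that maximizes the class-y score of
        # f(x + r); the maximized score serves as a surrogate lower bound on the
        # joint log-likelihood, and the prediction is the argmax over classes.
        scores = []
        for y in range(num_classes):
            r = torch.zeros_like(x, requires_grad=True)       # anti-perturbation, initialized at zero
            for _ in range(iters):
                logit_y = f(x + r)[:, y].sum()                # class-y score under the current perturbation
                grad, = torch.autograd.grad(logit_y, r)
                with torch.no_grad():
                    r += step * grad.sign()                   # gradient ascent step on the class score
                    r.clamp_(-eps, eps)                       # project back into the l_inf ball
                    r.copy_((x + r).clamp(0, 1) - x)          # keep the perturbed image in the valid pixel range
            scores.append(f(x + r.detach())[:, y])            # maximized score for class y
        return torch.stack(scores, dim=1).argmax(dim=1)       # Bayes'-rule-style argmax over classes

A call such as api_predict(model, images, 10) would then return predicted labels for a 10-class problem; in practice the search objective, projection set, and input normalization would follow the paper's formulation rather than these placeholder choices.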

Keywords

Deep learning · Neural networks · Adversarial defense · Adversarial training · Generative classifier

Notes

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. U1705262, No. 6177244, No. 61572410, No. 61802324 and No. 61702136), the National Key R&D Program (No. 2017YFC0113000 and No. 2016YFB1001503), the Key R&D Program of Jiangxi Province (No. 20171ACH80022) and the Natural Science Foundation of Guangdong Province, China (No. 2019B1515120049).

Supplementary material

504454_1_En_23_MOESM1_ESM.pdf (1.3 MB)
Supplementary material 1 (PDF, 1349 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Media Analytics and Computing Lab, Department of Artificial Intelligence, School of Informatics, Xiamen University, Xiamen, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. Noah’s Ark Lab, Huawei Technologies, Shenzhen, China
  4. Huawei Cloud BU, Shenzhen, China
