
DPG: a model to build feature subspace against adversarial patch attack

Published in Machine Learning

Abstract

Adversarial patch attacks in the physical world are a major threat to the application of deep learning. However, current research on adversarial patch defenses focuses on image pre-processing, which has been shown to reduce classification accuracy on clean images and to fail against physically realizable attacks. In this paper, we propose the defense patch GNN (DPG), which takes a new perspective on defending against adversarial patch attacks. First, we extract features from the input image with a feature extraction module to obtain a feature set. We then downsample the feature set with a global average pooling layer to reduce the perturbation that the adversarial patch introduces into the features. Finally, we propose a graph-structured feature subspace that makes the feature representation more robust. In addition, we design an optimization algorithm based on stochastic gradient descent (SGD), which significantly increases the model's generalization ability. We demonstrate empirically the superior robustness of the DPG model against existing adversarial patch attacks, and DPG incurs no accuracy loss when predicting clean images.
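Since the full text is behind the access wall, the following is a minimal, illustrative sketch of a pipeline with the general shape described in the abstract: a backbone feature extractor, global average pooling to downsample the feature set, and a graph built over the pooled feature nodes. It assumes a torchvision ResNet-18 backbone and a simple k-nearest-neighbour graph with mean aggregation; all module names and hyperparameters are hypothetical, and this is not the authors' DPG implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class GraphFeatureSubspace(nn.Module):
    """Illustrative graph layer: builds a k-NN graph over feature nodes and
    aggregates each node with its neighbours (simple mean aggregation)."""

    def __init__(self, dim: int, k: int = 4):
        super().__init__()
        self.k = k
        self.proj = nn.Linear(dim, dim)

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (B, N, D) -- N feature nodes of dimension D per image
        dist = torch.cdist(nodes, nodes)                    # (B, N, N) pairwise distances
        idx = dist.topk(self.k + 1, largest=False).indices  # self plus k nearest neighbours
        neigh = torch.gather(
            nodes.unsqueeze(1).expand(-1, nodes.size(1), -1, -1),
            2, idx.unsqueeze(-1).expand(-1, -1, -1, nodes.size(-1)))
        agg = neigh.mean(dim=2)                             # neighbourhood mean
        return torch.relu(self.proj(agg)) + nodes           # residual update


class PatchRobustClassifier(nn.Module):
    """Backbone features -> global average pooling -> graph subspace -> classifier."""

    def __init__(self, num_classes: int = 10, k: int = 4):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # (B, 512, H, W)
        self.pool = nn.AdaptiveAvgPool2d((4, 4))    # downsample the feature map
        self.graph = GraphFeatureSubspace(dim=512, k=k)
        self.head = nn.Linear(512, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.pool(self.features(x))             # (B, 512, 4, 4)
        nodes = f.flatten(2).transpose(1, 2)        # (B, 16, 512) feature nodes
        nodes = self.graph(nodes)
        return self.head(nodes.mean(dim=1))         # pool nodes, then classify


if __name__ == "__main__":
    model = PatchRobustClassifier(num_classes=10)
    logits = model(torch.randn(2, 3, 224, 224))
    print(logits.shape)  # torch.Size([2, 10])
```

The graph layer here only illustrates the idea of aggregating pooled feature nodes over a learned subspace; the paper's actual graph construction and training procedure are described in the full text.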


Data availability

All datasets used in this work are publicly available.

Code availability

The code is available upon reasonable request.


Funding

This work was supported by the National Natural Science Foundation of China under Grant Nos. U23B2021 and U1936213, the Program of Shanghai Academic Research Leader under Grant No. 21XD1421500, and the Shanghai Science and Technology Commission under Project No. 20020500600.

Author information

Authors and Affiliations

Authors

Contributions

All authors conceived and designed the theoretical, algorithmic and model work together. MW participated in the funding of this work. YSX completed the initial draft of the manuscript. YSX, MW, WH and WWL revised the manuscript. YSX completed the code base and the experiments. YSX, MW and WH analyzed the experiments together. All authors reviewed this paper.

Corresponding author

Correspondence to Mi Wen.

Ethics declarations

Conflict of interest

The authors report no conflicts of interest or competing interests.

Ethical approval

Not applicable.

Consent to participate

All authors consent to participate.

Consent for publication

All authors consent to publication.

Additional information

Editors: Bingxue Zhang, Feida Zhu, Bin Yang, João Gama.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Adversarial patch attacks

This section contains additional details on adversarial attacks. Note that we use the Adversarial Patch, Robust Physical Perturbation and Adversarial Eyeglasses attacks to evaluate our defended models.

1.1 Attack algorithm

The untargeted adversarial patch attack can be expressed as the optimization problem shown below:

$$\begin{aligned} \textbf{x}'=\arg \max _{\textbf{x}'\in \mathcal {A}(\textbf{x})}\mathcal {L}(\mathcal {M}(\textbf{x}'),y) \end{aligned}$$
(8)

where \(\mathcal {M}(\textbf{x}')\) is the prediction confidence output by the model, y is the one-hot encoding of the true class, \(\mathcal {L}\) is the cross-entropy loss, and \(\textbf{x}'\) denotes the adversarial example, drawn from the admissible patch set \(\mathcal {A}(\textbf{x})\), that maximizes the loss. For a targeted attack with target class \(y' \ne y\), the main difference is that we maximize the loss of the correct class while minimizing the loss of the target class, instead of merely maximizing the loss of the correct class:

$$\begin{aligned} \begin{array}{c}\underset{\delta \in \Delta }{\text {maximize}}\,\big (\ell (h_\theta (x+\delta ),y)-\ell (h_\theta (x+\delta ),y_{\text {target}})\big )\\ =\underset{\delta \in \Delta }{\text {maximize}}\Big (h_\theta (x+\delta )_{y_{\text {target}}}-h_\theta (x+\delta )_y\Big )\end{array} \end{aligned}$$
(9)
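The equality in the second line of Eq. (9) follows from the form of the softmax cross-entropy loss, \(\ell (h,y)=\log \sum _j \exp (h_j)-h_y\): the log-sum-exp terms cancel in the difference, leaving only the gap between the two logits:

$$\begin{aligned} \ell (h_\theta (x+\delta ),y)-\ell (h_\theta (x+\delta ),y_{\text {target}})=h_\theta (x+\delta )_{y_{\text {target}}}-h_\theta (x+\delta )_y \end{aligned}$$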

1.2 Adversarial patch

The Adversarial Patch attack replaces part of the image entirely with a patch. During training, random translation, scaling, and rotation are applied to the patch. To formalize this, we define a patch application operator A(p, x, l, t) that first applies the transformation t to the patch p and then places the transformed patch on the image x at location l.

We use a variant of the Expectation over Transformation (EOT) framework to train the patch. EOT maximizes the probability of the target class over a distribution of input transformations while constraining the expected distance between the transformed adversarial example and the transformed original input:

$$\begin{aligned} \begin{array}{ll}\underset{x'}{\text {argmax}} & \mathbb {E}_{t\sim T}[\log P(y_t\mid t(x'))]\\ \text {subject to} & \mathbb {E}_{t\sim T}[d(t(x'),t(x))]<\epsilon \\ & x\in [0,1]^d\end{array} \end{aligned}$$
(10)

where d(·, ·) is the Euclidean distance in LAB color space. Applying Lagrangian relaxation to the distance constraint gives the optimization objective used in the EOT framework:

$$\begin{aligned} \begin{array}{l}\text {dist}=\Vert LAB(t(x'))-LAB(t(x))\Vert _2\\ \underset{x'}{\text {argmax}}\ \mathbb {E}_{t\sim T}\Big [\log P(y_t\mid t(x'))-\lambda \,\text {dist}\Big ]\end{array} \end{aligned}$$
(11)

Using this variant of the EOT framework, we obtain the trained patch \(\hat{p}\). The final objective function, reformulated from the Lagrangian-relaxed form and taken in expectation over images, transformations, and locations, is as follows:

$$\begin{aligned} \widehat{p}=\underset{p}{\text {argmax}}\ \mathbb {E}_{x\sim X,t\sim T,l\sim L}\big [\log \Pr (\widehat{y}\mid A(p,x,l,t))\big ] \end{aligned}$$
(12)
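As an illustration of the objective in Eq. (12), the sketch below optimizes a square patch against a fixed classifier under randomly sampled rotations and placements. It assumes a pretrained PyTorch classifier `model` and a standard image `loader`; the transformation sampling is deliberately simplified (no scaling, uniform placement), so it is not the original Adversarial Patch implementation.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF


def apply_patch(images, patch, angle, top, left):
    """Patch application operator A(p, x, l, t): rotate the patch, then paste it
    at location (top, left), completely replacing that region of each image."""
    p = TF.rotate(patch.unsqueeze(0), angle).squeeze(0)
    out = images.clone()
    out[:, :, top:top + p.shape[1], left:left + p.shape[2]] = p
    return out


def train_patch(model, loader, target_class, patch_size=50, steps=1000, lr=0.05,
                image_size=224, device="cpu"):
    """Maximize log Pr(target | A(p, x, l, t)) over random images, transforms, locations."""
    model.eval()
    patch = torch.rand(3, patch_size, patch_size, device=device, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    it = iter(loader)
    for _ in range(steps):
        try:
            images, _ = next(it)
        except StopIteration:
            it = iter(loader)
            images, _ = next(it)
        images = images.to(device)
        # Sample a random transformation t and location l (simplified EOT sampling).
        angle = float(torch.empty(1).uniform_(-45, 45))
        top = int(torch.randint(0, image_size - patch_size, (1,)))
        left = int(torch.randint(0, image_size - patch_size, (1,)))
        adv = apply_patch(images, patch.clamp(0, 1), angle, top, left)
        logits = model(adv)
        # Maximize the target-class log-probability (minimize its negative).
        loss = -F.log_softmax(logits, dim=1)[:, target_class].mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        patch.data.clamp_(0, 1)
    return patch.detach()
```

In practice the expectation in Eq. (12) is approximated exactly this way: each step samples a batch of images together with a fresh transformation and placement before taking a gradient step on the patch.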

1.3 Robust physical perturbation

Robust Physical Perturbation (RP2) applies the adversarial perturbation to the target object through a mask that projects the computed perturbations onto physical regions of the object's surface. The optimization function is as follows:

$$\begin{aligned} \begin{array}{c}\underset{\delta }{\text {argmin}}\lambda ||M_x\cdot \delta ||_p+\mathbb {E}_{x_i\sim X^V}J(f_\theta (x_i+T_i(M_x\cdot \delta )),y^*)\end{array} \end{aligned}$$
(13)

where \(T_i\) denotes the alignment function that maps transformations of the object to transformations of the perturbation, and \(M_x\) is the perturbation mask: it contains zeroes in regions where no perturbation is added and ones in regions where the perturbation is applied during optimization.
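A minimal sketch of the masked optimization in Eq. (13), assuming a fixed classifier `model`, a batch of victim-object images `images`, a binary `mask` marking where the perturbation may appear, and a target class; the alignment functions \(T_i\) are omitted for brevity, so this is a simplified illustration rather than the published RP2 code.

```python
import torch
import torch.nn.functional as F


def rp2_perturbation(model, images, mask, target_class, steps=500, lr=0.01, lam=1e-4):
    """Optimize a masked perturbation delta following Eq. (13):
    minimize lam * ||M_x * delta||_2 + E_x[ CE(f(x + M_x * delta), y*) ]."""
    model.eval()
    delta = torch.zeros_like(images[0], requires_grad=True)   # one (C, H, W) perturbation
    opt = torch.optim.Adam([delta], lr=lr)
    target = torch.full((images.size(0),), target_class, dtype=torch.long)
    for _ in range(steps):
        masked = mask * delta                                  # restrict to allowed regions
        adv = (images + masked).clamp(0, 1)                    # apply to every victim image
        loss = lam * masked.norm(p=2) + F.cross_entropy(model(adv), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (mask * delta).detach()
```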

1.4 Adversarial eyeglasses

This attack builds on Generative Adversarial Networks (GANs): a GAN is used to generate adversarial eyeglasses. The optimization process is as follows:

$$\begin{aligned} Loss_G(Z,D)=\sum \limits _{z\in Z}\lg \big (1-D(G(z))\big ) \end{aligned}$$
(14)
$$\begin{aligned} Gain_D(G,Z,data)=\sum \limits _{x\in data}\lg \big (D(x)\big )+\sum \limits _{z\in Z}\lg \big (1-D(G(z))\big ) \end{aligned}$$
(15)

Formally, we use three deep neural networks to generate the adversarial glasses: a generator G, a discriminator D, and a pre-trained DNN whose classification function is denoted by F(·). For an image x, G is trained to generate adversarial glasses that fool F by minimizing the following objective function:

$$\begin{aligned} Loss_G(Z,D)-\kappa \cdot \sum \limits _{z\in Z}Loss_F(x+G(z)) \end{aligned}$$
(16)

For untargeted attacks, the classifier loss takes the following form:

$$\begin{aligned} Loss_F(x+G(z))=\sum \limits _{i\ne x}F_{c_i}(x+G(z))-F_{c_x}(x+G(z)) \end{aligned}$$
(17)

while for targeted attacks we use:

$$\begin{aligned} Loss_{F}(x+G(z))=F_{c_{t}}(x+G(z))-\sum _{i\ne t}F_{c_{i}}(x+G(z)) \end{aligned}$$
(18)

where \(F_c(\cdot )\) is the DNN's output for class c. Maximizing \(Loss_F\) decreases the probability of the correct class in untargeted attacks and increases the probability of the target class in targeted attacks. The adversarial glasses are therefore generated by minimizing the objective function in Eq. (16).
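The generator objective in Eqs. (16)–(18) can be assembled as in the sketch below, which assumes a generator `G`, a discriminator `D` producing probabilities in (0, 1), a fixed face classifier `F_net`, the true labels, and a binary `glasses_mask` locating the eyeglass region on the face images; it is a simplified illustration, not the attack's published implementation.

```python
import torch


def generator_attack_loss(G, D, F_net, faces, true_labels, z, glasses_mask,
                          target_class=None, kappa=1.0):
    """Generator objective: GAN realism loss (Eq. 14) minus kappa times the
    classifier loss (Eq. 17 untargeted, Eq. 18 targeted), as in Eq. (16)."""
    glasses = G(z)                                            # generated eyeglass textures
    # Paste the glasses onto the faces inside the eyeglass region.
    adv_faces = faces * (1 - glasses_mask) + glasses * glasses_mask
    gan_loss = torch.log(1 - D(glasses) + 1e-8).sum()         # Loss_G(Z, D)
    logits = F_net(adv_faces)                                 # (B, num_classes)
    total = logits.sum(dim=1)
    if target_class is None:
        # Untargeted, Eq. (17): sum of wrong-class outputs minus the true-class output.
        true_logit = logits.gather(1, true_labels.unsqueeze(1)).squeeze(1)
        loss_F = (total - 2 * true_logit).sum()
    else:
        # Targeted, Eq. (18): target-class output minus the sum of all other outputs.
        loss_F = (2 * logits[:, target_class] - total).sum()
    # The generator minimizes this quantity, which maximizes Loss_F.
    return gan_loss - kappa * loss_F
```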

Appendix 2: Classification accuracy

Table 6 Classification accuracy of original classification models

Table 6 presents the classification accuracy of the original classification models on the three datasets. On the VGGFace dataset, the DPG model improves accuracy by 20% and 12%, respectively, compared to the original classification models, but classification accuracy decreases by 10% with the ResNet model.

Appendix 3: Inference time

Table 7 Inference time for different defense models

Table 7 reports the per-image inference time of the different defense models on the VGGFace validation set. The DPG defense model with ResNet as the feature extraction module has the shortest inference time.

About this article


Cite this article

Xue, Y., Wen, M., He, W. et al. DPG: a model to build feature subspace against adversarial patch attack. Mach Learn (2024). https://doi.org/10.1007/s10994-023-06417-7

