AP-GAN: Adversarial patch attack on content-based image retrieval systems


Key Smart City applications such as traffic management and public security rely heavily on the intelligent processing of video and image data, often in the form of visual retrieval tasks such as person re-identification (ReID) and vehicle ReID. For these tasks, Deep Neural Networks (DNNs) have been the dominant solution for the past decade, owing to their remarkable ability to learn discriminative features from images and thereby boost retrieval performance. However, it has been discovered that DNNs are broadly vulnerable to maliciously constructed adversarial examples. Adding small perturbations to a query image can cause the returned retrieval results to be completely dissimilar from the query image. This poses serious challenges to vital Smart City systems that depend on DNN-based visual retrieval technology: in the physical world, simple camouflage added to the subject (a few patches on the body or car) can render the subject completely untrackable by person or vehicle ReID systems. To demonstrate the potential of such threats, this paper proposes a novel adversarial patch generative adversarial network (AP-GAN) that generates adversarial patches instead of modifying the entire image, and these patches likewise cause DNN-based image retrieval models to return incorrect results. AP-GAN is trained in an unsupervised manner and requires only a small amount of unlabeled data. Once trained, it produces query-specific perturbations that are applied to query images to form adversarial queries. Extensive experiments show that AP-GAN achieves excellent attack performance across various application scenarios that are based on deep features, including image retrieval, person ReID and vehicle ReID. The results of this study serve as a warning that, when deploying a DNN-based image retrieval system, its security and robustness need to be thoroughly considered.
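The difference between a patch and a whole-image perturbation can be illustrated with a minimal sketch. This is not AP-GAN: no generator is trained here, the patch is random rather than learned, and the helper names `apply_patch` and `rank_gallery` are hypothetical names introduced only for illustration. The sketch assumes images are NumPy arrays of shape (height, width, channels) and that retrieval ranks gallery features by cosine similarity to the query feature.

```python
import numpy as np


def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at (top, left).

    Only the pixels under the patch change; the rest of the query is untouched.
    """
    h, w = patch.shape[:2]
    if top < 0 or left < 0 or top + h > image.shape[0] or left + w > image.shape[1]:
        raise ValueError("patch does not fit inside the image")
    out = image.copy()
    out[top:top + h, left:left + w] = patch
    return out


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted by descending cosine similarity to the query."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    return np.argsort(-(g @ q))


# Illustrative data only: a random "query image" and a random (not learned) patch.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
patch = rng.random((16, 16, 3))
adversarial_query = apply_patch(image, patch, top=10, left=20)

# Toy gallery of 2-D features, used only to show how ranking is computed.
toy_gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_ranking = rank_gallery(np.array([1.0, 0.0]), toy_gallery)
```

In an actual attack the patch contents would be optimized (in the paper, produced by the trained generator) so that the features of the patched query move away from those of the original, which changes the ranking returned by a function like `rank_gallery`.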






This work was supported by the National Natural Science Foundation of China (Grant No. 61602487, 61832017), the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China (Grant No. 2015030275, 2018030202).

Author information



Corresponding author

Correspondence to Jiajun Liu.


Cite this article

Zhao, G., Zhang, M., Liu, J. et al. AP-GAN: Adversarial patch attack on content-based image retrieval systems. Geoinformatica (2020). https://doi.org/10.1007/s10707-020-00418-7



Keywords

  • Adversarial attack
  • Adversarial patch
  • Image retrieval
  • GAN