Using LIP to Gloss Over Faces in Single-Stage Face Detection Networks

  • Siqi Yang
  • Arnold Wiliem
  • Shaokang Chen
  • Brian C. Lovell
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11219)

Abstract

This work shows that it is possible to fool (attack) recent state-of-the-art face detectors that are based on single-stage networks. A successful attack on face detectors could be a serious malware vulnerability when deploying a smart surveillance system that relies on them. From a privacy standpoint, such an attack also helps prevent faces from being harvested and stored on a server. We show that existing adversarial perturbation methods are not effective for such an attack, especially when there are multiple faces in the input image. This is because the adversarial perturbation generated specifically for one face may disrupt the adversarial perturbation for another face. In this paper, we call this the Instance Perturbation Interference (IPI) problem. We address the IPI problem by studying the relationship between the receptive field of a deep neural network and the adversarial perturbation. Besides single-stage face detectors, we find that the IPI problem also exists in the first stage of Faster-RCNN, the commonly used two-stage object detector. We therefore propose Localized Instance Perturbation (LIP), which confines the adversarial perturbation to the Effective Receptive Field (ERF) of a target in order to perform the attack. Experimental results show that the LIP method massively outperforms existing adversarial perturbation generation methods, often by a factor of 2 to 10.
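The idea of confining each instance's perturbation to its own region can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `erf_mask` and `localized_perturbation`, the square-window approximation of the ERF, and the constant arrays standing in for per-instance gradients are all assumptions made for illustration. The sketch only shows how masking each perturbation before summation keeps one face's perturbation from overwriting another's.

```python
import numpy as np


def erf_mask(shape, center, erf_size):
    """Binary mask equal to 1 inside a square window around `center`.

    Assumption: the Effective Receptive Field is approximated by a
    square window of side `erf_size` pixels (hypothetical simplification).
    """
    h, w = shape
    cy, cx = center
    half = erf_size // 2
    mask = np.zeros((h, w), dtype=np.float32)
    y0, y1 = max(0, cy - half), min(h, cy + half + 1)
    x0, x1 = max(0, cx - half), min(w, cx + half + 1)
    mask[y0:y1, x0:x1] = 1.0
    return mask


def localized_perturbation(instance_grads, centers, erf_sizes, epsilon=1.0):
    """Crop each per-instance perturbation to its window, then sum.

    `instance_grads` is a list of (H, W, C) arrays, one per face; here
    they are stand-ins for whatever an attack would compute per instance.
    """
    total = np.zeros_like(instance_grads[0], dtype=np.float32)
    for grad, center, size in zip(instance_grads, centers, erf_sizes):
        mask = erf_mask(grad.shape[:2], center, size)
        total += grad * mask[..., None]  # broadcast mask over channels
    return epsilon * total


# Two hypothetical faces, far enough apart that their windows do not overlap.
g0 = np.ones((32, 32, 3), dtype=np.float32)
g1 = 2.0 * np.ones((32, 32, 3), dtype=np.float32)
delta = localized_perturbation([g0, g1], [(8, 8), (24, 24)], [7, 7])
```

In this toy setting, pixels near the first face carry only the first perturbation, pixels near the second face carry only the second, and everything outside both windows is left untouched.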

Keywords

Adversarial · Interference · Effective Receptive Field · Single-stage network · Detection

Notes

Acknowledgements

This work has been funded by Sullivan Nicolaides Pathology, Australia, and the Australian Research Council (ARC) Linkage Projects Grant LP160101797. Arnold Wiliem is funded by the Advance Queensland Early-Career Research Fellowship.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Siqi Yang (1), corresponding author
  • Arnold Wiliem (1)
  • Shaokang Chen (1)
  • Brian C. Lovell (1)

  1. The University of Queensland, Brisbane, Australia
