An attention-erasing stripe pyramid network for face forgery detection

  • Original Paper
Signal, Image and Video Processing

Abstract

Face forgery detection aims to distinguish real from fake facial images or videos by identifying manipulated or forged visual media. The main challenge is model generalization, i.e., satisfactory performance in cross-database scenarios where the training and testing datasets come from different forgery methods. To this end, this paper presents an attention-erasing stripe pyramid network (ASPNet) that utilizes high-frequency noise and exploits both RGB and fine-grained frequency clues. First, since extracting features separately from different scales and granularities ignores their complementarity, we employ a stripe pyramid block (SPB) to learn multi-scale and multi-granularity features simultaneously. Second, to make the model focus on useful information and suppress noise, we introduce a two-stage attention block (TSAB) that combines spatial attention and channel attention to filter out pixel-wise and channel-wise noise in the learned feature maps. Finally, to dynamically guide the model to attend to different areas of the human face, we adopt an attention erasing (AE) scheme that randomly erases units in the attention maps. Extensive experiments demonstrate that ASPNet outperforms \(F^{3}\)-Net on the FaceForensics++ dataset. The area under the receiver operating characteristic curve (AUC) and the accuracy (ACC) of our model reach 77.4% and 70.85%, respectively, improvements of 0.83% and 1.28% over \(F^{3}\)-Net. Our code is available at: https://github.com/NWPU-Zwu.
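
The abstract does not specify how TSAB and AE are implemented, so the following is only a rough illustration of the general idea, not the authors' code. It assumes parameter-free attention (sigmoid of pooled activations), applies spatial attention before channel attention, and erases attention units independently with a fixed probability; all function names and the `erase_prob` parameter are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat):
    # feat: (C, H, W). One weight per pixel from the channel-wise mean (assumed form).
    return sigmoid(feat.mean(axis=0, keepdims=True))        # (1, H, W)

def channel_attention(feat):
    # One weight per channel from global average pooling (assumed form, no learned layers).
    return sigmoid(feat.mean(axis=(1, 2), keepdims=True))   # (C, 1, 1)

def erase_units(attn, erase_prob, rng):
    # Zero each attention unit independently with probability erase_prob.
    keep = rng.random(attn.shape) >= erase_prob
    return attn * keep

def attend(feat, erase_prob=0.0, rng=None):
    # Stage 1: pixel-wise filtering, optionally with random erasing (training only).
    s = spatial_attention(feat)
    if erase_prob > 0.0:
        s = erase_units(s, erase_prob, rng)
    out = feat * s
    # Stage 2: channel-wise filtering of the spatially attended features.
    return out * channel_attention(out)

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))   # toy feature map: 8 channels, 4x4 spatial
out = attend(feat, erase_prob=0.3, rng=rng)
```

Erasing units in the attention map, rather than in the input image, keeps the features themselves intact while preventing the model from relying on the same facial region in every training step.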

Data availability

The original datasets have been published online. The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

References

  1. Chen, S., Yao, T., Chen, Y., et al.: Local relation learning for face forgery detection. Proc. AAAI Conf. Artif. Intell. 35(2), 1081–1088 (2021)

  2. Luo, Y., Zhang, Y., Yan, J., et al.: Generalizing face forgery detection with high-frequency features. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 16317–16326 (2021)

  3. Yang, J., Xiao, S., Li, A., et al.: MSTA-net: forgery detection by generating manipulation trace based on multi-scale self-texture attention. IEEE Trans. Circ. Syst. Video Technol. 32(7), 4854–4866 (2021)

  4. Shang, Z., Xie, H., Zha, Z., et al.: PRRNet: pixel-region relation network for face forgery detection. Pattern Recogn. 116, 107950 (2021)

  5. Liu, Z., Lin, Y., Cao, Y., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 10012–10022 (2021)

  6. Martinel, N., Luca Foresti, G., Micheloni, C.: Aggregating deep pyramidal representations for person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)

  7. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7132–7141 (2018)

  8. Wang, C., Zhang, Q., Huang, C., et al.: Mancs: A multi-task attentional network with curriculum sampling for person re-identification. In: Proceedings of the European conference on computer vision (ECCV), pp. 365–381 (2018)

  9. Zhong, Y., Wang, Y., Zhang, S.: Progressive feature enhancement for person re-identification. IEEE Trans. Image Process. 30, 8384–8395 (2021)

  10. Sun, K., Liu, H., Yao, T., et al.: An information theoretic approach for attention-driven face forgery detection. European conference on computer vision, pp. 111–127. Springer, Cham (2022)

  11. Fei, J., Dai, Y., Yu, P., et al.: Learning second order local anomaly for general face forgery detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 20270–20280 (2022)

  12. Wang, Q., Guo, G.: AAN-face: attention augmented networks for face recognition. IEEE Trans. Image Process. 30, 7636–7648 (2021)

  13. Yu, P., Fei, J., Xia, Z., et al.: Improving generalization by commonality learning in face forgery detection. IEEE Trans. Inf. Forensics Secur. 17, 547–558 (2022)

  14. Cao, J., Ma, C., Yao, T., et al.: End-to-end reconstruction-classification learning for face forgery detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4113–4122 (2022)

  15. Yang, J., Cai, Y., Liu, D., et al.: Multi-scale Siamese prediction network for video anomaly detection. Signal, Image and Video Processing, pp. 1–8 (2022)

  16. Aloraini, M.: FaceMD: convolutional neural network-based spatiotemporal fusion facial manipulation detection. SIViP 17(1), 247–255 (2023)

  17. Atkale, D.V., Pawar, M.M., Deshpande, S.C., et al.: Multi-scale feature fusion model followed by residual network for generation of face aging and de-aging. SIViP 16(3), 753–761 (2022)

  18. Qian, Y., Yin, G., Sheng, L., et al.: Thinking in frequency: face forgery detection by mining frequency-aware clues. European conference on computer vision, pp. 86–103. Springer, Cham (2020)

  19. Wang, L., Fayolle, P.A., Belyaev, A.G.: Reverse image filtering with clean and noisy filters. SIViP 17(2), 333–341 (2023)

  20. Jia, S., Ma, C., Yao, T., et al.: Exploring frequency adversarial attacks for face forgery detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4103–4112 (2022)

  21. Zhao, H., Zhou, W., Chen, D., et al.: Multi-attentional deepfake detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 2185–2194 (2021)

  22. Tao, H., Duan, Q.: Learning discriminative feature representation for estimating smoke density of smoky vehicle rear. IEEE Trans. Intell. Transp. Syst. 23(12), 23136–23147 (2022)

  23. Tao, H., Lu, M., Hu, Z., et al.: Attention-aggregated attribute-aware network with redundancy reduction convolution for video-based industrial smoke emission recognition. IEEE Trans. Industr. Inf. 18(11), 7653–7664 (2022)

  24. Wang, C., Deng, W.: Representative forgery mining for fake face detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 14923–14932 (2021)

  25. Duan, Q., Hu, Z., Lu, M., et al.: Learning discriminative features for person re-identification via multi-spectral channel attention. SIViP (2023). https://doi.org/10.1007/s11760-023-02522-1

  26. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1251–1258 (2017)

  27. Wang, Q., Guo, G.: LS-CNN: characterizing local patches at multiple scales for face recognition. IEEE Trans. Inf. Forensics Secur. 15, 1640–1653 (2019)

  28. Rossler, A., Cozzolino, D., Verdoliva, L., et al.: Faceforensics++: learning to detect manipulated facial images. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 1–11 (2019)

  29. Sagonas, C., Antonakos, E., Tzimiropoulos, G., et al.: 300 faces in-the-wild challenge: database and results. Image Vis. Comput. 47, 3–18 (2016)

  30. Haliassos, A., Vougioukas, K., Petridis, S., et al.: Lips don't lie: a generalisable and robust approach to face forgery detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 5039–5049 (2021)

  31. Zheng, Y., Bao, J., Chen, D., et al.: Exploring temporal coherence for more general video face forgery detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 15044–15054 (2021)

  32. Li, L., Bao, J., Zhang, T., et al.: Face X-ray for more general face forgery detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 5001–5010 (2020)

  33. Fridrich, J., Kodovsky, J.: Rich models for steganalysis of digital images. IEEE Trans. Inf. Forensics Secur. 7(3), 868–882 (2012)

  34. Cozzolino, D., Poggi, G., Verdoliva, L.: Recasting residual-based local descriptors as convolutional neural networks: an application to image forgery detection. In: Proceedings of the 5th ACM workshop on information hiding and multimedia security, pp. 159–164 (2017)

  35. Bayar, B., Stamm, M.C.: A deep learning approach to universal image manipulation detection using a new convolutional layer. In: Proceedings of the 4th ACM workshop on information hiding and multimedia security, pp. 5–10 (2016)

  36. Afchar, D., Nozick, V., Yamagishi, J., et al.: Mesonet: a compact facial video forgery detection network. In: 2018 IEEE international workshop on information forensics and security (WIFS). IEEE, pp. 1–7 (2018)

  37. Nguyen, H.H., Fang, F., Yamagishi, J., et al.: Multi-task learning for detecting and segmenting manipulated facial images and videos. In: 2019 IEEE 10th International Conference on Biometrics Theory, Applications and Systems (BTAS). IEEE, pp. 1–8 (2019)

  38. Ni, Y., Meng, D., Yu, C., et al.: CORE: consistent representation learning for face forgery detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12–21 (2022)

  39. Liu, D., Dang, Z., Peng, C., et al.: FedForgery: generalized face forgery detection with residual federated learning. arXiv preprint arXiv:2210.09563 (2022)

  40. Deepfakes. https://github.com/iperov/DeepFaceLab. Accessed: 2020-05-10

  41. Thies, J., Zollhofer, M., Stamminger, M., et al.: Face2face: Real-time face capture and reenactment of rgb videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2387–2395 (2016)

  42. Faceswap. https://github.com/MarekKowalski/FaceSwap. Accessed: 2020-05-10

  43. Thies, J., Zollhöfer, M., Nießner, M.: Deferred neural rendering: image synthesis using neural textures. ACM Trans. Graph. (TOG) 38(4), 1–12 (2019)

  44. Selvaraju, R.R., Cogswell, M., Das, A., et al.: Grad-cam: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, pp. 618–626 (2017)

  45. Rahmouni, N., Nozick, V., Yamagishi, J., et al.: Distinguishing computer graphics from natural images using convolution neural networks. In: 2017 IEEE workshop on information forensics and security (WIFS). IEEE, pp. 1–6 (2017)

  46. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

Acknowledgements

This work was partly supported by the Fundamental Research Funds for the Central Universities (No. D5000210737), the Key Research and Development Program of Shaanxi Province (No. 2023-ZDLGY-53), and the National Natural Science Foundation of China (No. 62102320). (Corresponding author: Huanjie Tao).

Author information

Contributions

Zhenwu Hu conducted the experiments and wrote the main manuscript text. Qianyue Duan suggested improvements. Peiyu Zhang assisted with the writing. Huanjie Tao provided guidance on the experiments and advice on the writing. All authors reviewed the manuscript.

Corresponding author

Correspondence to Huanjie Tao.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hu, Z., Duan, Q., Zhang, P. et al. An attention-erasing stripe pyramid network for face forgery detection. SIViP 17, 4123–4131 (2023). https://doi.org/10.1007/s11760-023-02644-6

  • DOI: https://doi.org/10.1007/s11760-023-02644-6
