An attention-erasing stripe pyramid network for face forgery detection

Hu, Zhenwu; Duan, Qianyue; Zhang, PeiYu; Tao, Huanjie

doi:10.1007/s11760-023-02644-6

Original Paper
Published: 19 June 2023

Volume 17, pages 4123–4131 (2023)

Signal, Image and Video Processing

Zhenwu Hu¹, Qianyue Duan², PeiYu Zhang² & Huanjie Tao²,³,⁴

282 Accesses
2 Citations

Abstract

Face forgery detection aims to distinguish real from fake facial images or videos by identifying manipulated or forged visual media. Its main challenge is generalization: a model should perform well in cross-database scenarios, where the training and testing data are produced by different forgery methods. To this end, this paper presents an attention-erasing stripe pyramid network (ASPNet) that utilizes high-frequency noise and exploits both RGB and fine-grained frequency clues. First, because extracting features at different scales and granularities separately ignores their complementarity, we employ a stripe pyramid block (SPB) to learn multi-scale and multi-granularity features simultaneously. Second, to make the model focus on useful information and suppress noise, we introduce a two-stage attention block (TSAB) that combines spatial attention and channel attention to filter out pixel-wise and channel-wise noise in the learned feature maps. Finally, to dynamically guide the model to attend to different areas of the human face, we adopt an attention erasing (AE) scheme that randomly erases units in the attention maps. Extensive experiments demonstrate that ASPNet outperforms \(F^{3}\)-Net on the FaceForensics++ dataset. The area under the receiver operating characteristic curve (AUC) and the accuracy (ACC) of our model reach 77.4% and 70.85%, respectively, improvements of 0.83% and 1.28% over \(F^{3}\)-Net. Our code is available at: https://github.com/NWPU-Zwu.
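
To make the attention steps in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a feature map stored as nested lists of shape C x H x W, parameter-free attention (a sigmoid of averaged activations in place of learned layers), channel attention applied before spatial attention, and independent erasing of each attention unit with probability `drop_prob`. The function names, the stage order, and `drop_prob` are all hypothetical; the abstract does not specify them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a sigmoid of its global average (no learned layers)."""
    weights = [sigmoid(sum(map(sum, ch)) / (len(ch) * len(ch[0]))) for ch in feat]
    return [[[v * w for v in row] for row in ch] for ch, w in zip(feat, weights)]


def spatial_attention_map(feat):
    """Return an H x W map: sigmoid of the mean over channels at each position."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
            for i in range(h)]


def erase_attention(att, drop_prob, rng):
    """Zero each attention unit independently with probability drop_prob."""
    return [[0.0 if rng.random() < drop_prob else a for a in row] for row in att]


def two_stage_attention(feat, drop_prob=0.0, rng=None):
    """Channel attention, then spatial attention with optional random erasing.

    The stage order is an assumption made for this sketch.
    """
    rng = rng or random.Random()
    refined = channel_attention(feat)
    att = spatial_attention_map(refined)
    if drop_prob > 0.0:
        att = erase_attention(att, drop_prob, rng)
    out = [[[v * a for v, a in zip(row, arow)] for row, arow in zip(ch, att)]
           for ch in refined]
    return out, att


if __name__ == "__main__":
    # Toy input: 2 channels, 3 x 3 spatial grid, all activations equal to 1.
    feat = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]
    _, att = two_stage_attention(feat, drop_prob=0.3, rng=random.Random(0))
    erased = sum(a == 0.0 for row in att for a in row)
    print("erased attention units:", erased, "of 9")
```

Erasing would normally be applied during training only, so that the model cannot rely on a single facial region; at test time `drop_prob` would be left at 0. Because the erased positions change on every call, different regions are suppressed in different iterations, which is the dynamic guidance the abstract describes.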

Data availability

The original datasets have been published online. The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

This work was partly supported by the Fundamental Research Funds for the Central Universities (No. D5000210737), the Key Research and Development Program of Shaanxi Province (No. 2023-ZDLGY-53), and the National Natural Science Foundation of China (No. 62102320). (Corresponding author: Huanjie Tao).

Author information

Authors and Affiliations

School of Cyberspace Security, Northwestern Polytechnical University, Xi’an, 710129, People’s Republic of China
Zhenwu Hu
School of Computer Science, Northwestern Polytechnical University, Xi’an, 710129, People’s Republic of China
Qianyue Duan, PeiYu Zhang & Huanjie Tao
Engineering and Research Center of Embedded Systems Integration (Northwestern Polytechnical University), Ministry of Education, Xi’an, 710129, People’s Republic of China
Huanjie Tao
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Xi’an, 710129, People’s Republic of China
Huanjie Tao

Contributions

Zhenwu Hu conducted the experiments and wrote the main manuscript text. Qianyue Duan suggested improvements. PeiYu Zhang assisted with the writing. Huanjie Tao provided guidance on the experiments and advice on the writing. All authors reviewed the manuscript.

Corresponding author

Correspondence to Huanjie Tao.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hu, Z., Duan, Q., Zhang, P. et al. An attention-erasing stripe pyramid network for face forgery detection. SIViP 17, 4123–4131 (2023). https://doi.org/10.1007/s11760-023-02644-6

Received: 21 March 2023
Revised: 14 May 2023
Accepted: 25 May 2023
Published: 19 June 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s11760-023-02644-6
