FLAG: frequency-based local and global network for face forgery detection

Zhou, Kai; Sun, Guanglu; Wang, Jun; Wang, Jiahui; Yu, Linsen

doi:10.1007/s11042-024-18751-6

Published: 28 March 2024

Multimedia Tools and Applications

Kai Zhou¹,
Guanglu Sun ORCID: orcid.org/0000-0003-2589-1164¹,
Jun Wang²,
Jiahui Wang¹ &
Linsen Yu¹

Abstract

Deepfake detection aims to mitigate the threat of manipulated content by identifying and exposing forgeries. However, previous methods tend to perform poorly in cross-dataset scenarios. To address this issue, we propose a hybrid network called the Frequency-based Local and Global (FLAG) network, which explores local and global information with the help of frequency-domain cues for better generalization capability. Because forged faces often exhibit flaws in the frequency domain, we design a Frequency-based Attention Enhancement Module (FAEM) to enhance the aggregation of a CNN and a Vision Transformer (ViT). In this design, local features from the CNN are attentively enhanced by selected frequency coefficients in FAEM, facilitating the learning of generalizable global features by the ViT module. The effectiveness of the proposed method is validated through extensive experiments, and generalization performance improves under cross-dataset scenarios. In particular, the proposed method obtains an AUC of 99.26% and an ACC of 96.56% in intra-dataset experiments on FaceForensics++ (C23).
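The abstract does not describe the internals of FAEM, so the following is only a minimal NumPy sketch of the general idea it names: pooling CNN feature channels with selected DCT frequency coefficients, turning the result into a channel gate, and flattening the reweighted map into tokens for a transformer. The function names (`dct_basis`, `frequency_channel_attention`, `to_tokens`), the choice of frequency indices, the two-layer gate, and the random weights are all assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def dct_basis(height, width, u, v):
    """2D DCT-II basis function for frequency index (u, v) on an H x W grid."""
    ys = np.cos(np.pi * (np.arange(height) + 0.5) * u / height)
    xs = np.cos(np.pi * (np.arange(width) + 0.5) * v / width)
    return np.outer(ys, xs)

def frequency_channel_attention(features, freq_indices, w1, w2):
    """Reweight channels of a (C, H, W) feature map using selected DCT coefficients.

    Hypothetical design: channels are split into len(freq_indices) groups and
    each group is pooled with one DCT basis function. Index (0, 0) gives a
    constant basis, so it is proportional to global average pooling.
    """
    channels, height, width = features.shape
    groups = np.array_split(np.arange(channels), len(freq_indices))
    descriptor = np.empty(channels)
    for (u, v), idx in zip(freq_indices, groups):
        basis = dct_basis(height, width, u, v)
        descriptor[idx] = (features[idx] * basis).sum(axis=(1, 2))
    hidden = np.maximum(w1 @ descriptor, 0.0)        # ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate in (0, 1)
    return features * weights[:, None, None], weights

def to_tokens(features):
    """Flatten a (C, H, W) map into (H*W, C) tokens, a common ViT input layout."""
    channels = features.shape[0]
    return features.reshape(channels, -1).T

# Illustrative sizes and random weights only (assumptions, not from the paper).
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 7, 7))               # stand-in for CNN features
w1 = rng.standard_normal((4, 8)) * 0.1
w2 = rng.standard_normal((8, 4)) * 0.1
selected = [(0, 0), (0, 1), (1, 0), (1, 1)]          # hypothetical low frequencies
enhanced, gate = frequency_channel_attention(feats, selected, w1, w2)
tokens = to_tokens(enhanced)
print(tokens.shape)  # (49, 8)
```

In a trained model the gate weights would be learned and the tokens would be passed to transformer encoder layers; here they only show how frequency-selected pooling can sit between a convolutional feature extractor and a token-based model.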

Availability of data and materials

Not applicable.

Code availability

Not applicable.

Acknowledgements

This study is supported in part by the Key Research and Development Project of Heilongjiang Province (2022ZX01A34) and the 2020 Heilongjiang Province Higher Education Teaching Reform Project (SJGY 20200320).

Funding

This study is funded by the Key Research and Development Project of Heilongjiang Province (2022ZX01A34) and the 2020 Heilongjiang Province Higher Education Teaching Reform Project (SJGY 20200320).

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin University of Science and Technology, 150080, Harbin, China
Kai Zhou, Guanglu Sun, Jiahui Wang & Linsen Yu
Department of Information Engineering and Mathematics, University of Siena, 53100, Siena, Italy
Jun Wang

Contributions

Kai Zhou, Guanglu Sun and Jun Wang made substantial contributions to the conception of the work; Kai Zhou and Jiahui Wang drafted the work and made significant contributions to the acquisition, analysis or interpretation of the data; Guanglu Sun, Jun Wang and Linsen Yu revised it critically for important intellectual content.

Corresponding author

Correspondence to Guanglu Sun.

Ethics declarations

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

The authors confirm that the work described has not been published before; that it is not under consideration for publication elsewhere; that its publication has been approved by all co-authors; and that its publication has been approved by the responsible authorities at the institution where the work was carried out.

Competing Interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, K., Sun, G., Wang, J. et al. FLAG: frequency-based local and global network for face forgery detection. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18751-6

Received: 23 December 2023
Revised: 04 February 2024
Accepted: 24 February 2024
Published: 28 March 2024
DOI: https://doi.org/10.1007/s11042-024-18751-6
