
Fake visual content detection using two-stream convolutional neural networks

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Rapid progress in adversarial learning has enabled the generation of realistic-looking fake visual content. Several detection techniques have been proposed to distinguish fake from real visual content. However, the performance of most of these techniques drops significantly when the test and training data are sampled from different distributions, which motivates efforts to improve the generalization of fake detectors. Since current fake content generation techniques do not accurately model the frequency spectrum of natural images, we observe that the frequency spectrum of fake visual data contains discriminative characteristics that can be used to detect fake content. We also observe that the information captured in the frequency spectrum differs from that of the spatial domain. Using these insights, we propose to complement frequency and spatial domain features using a two-stream convolutional neural network architecture called TwoStreamNet. We demonstrate the improved generalization of the proposed two-stream network to several unseen generation architectures, datasets, and techniques. The proposed detector achieves a significant performance improvement over current state-of-the-art fake content detectors, and fusing the frequency and spatial domain streams also improves the generalization of the detector.
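The frequency-domain observation above can be illustrated with a minimal sketch of how inputs for the two streams might be prepared. This is an assumption-laden illustration, not the paper's actual pipeline: the function names are hypothetical, and the centered log-magnitude 2D FFT is one common choice of frequency representation that exposes the spectral artifacts of generated images.

```python
import numpy as np

def frequency_stream_input(image: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Centered log-magnitude 2D FFT spectrum of a grayscale image.

    Hypothetical preprocessing for the frequency-domain stream; the paper's
    exact transform may differ. fftshift moves the DC component to the center
    so low frequencies sit in the middle of the map.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.log(np.abs(spectrum) + eps)

def two_stream_features(image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (spatial, frequency) inputs for the two network streams."""
    spatial = image.astype(np.float32)
    freq = frequency_stream_input(spatial).astype(np.float32)
    return spatial, freq

# Toy usage: a synthetic 64x64 image with a periodic pattern, whose
# spectrum concentrates energy at the pattern's frequency.
img = np.sin(2 * np.pi * np.arange(64)[None, :] / 8).repeat(64, axis=0)
spatial, freq = two_stream_features(img)
print(spatial.shape, freq.shape)  # (64, 64) (64, 64)
```

Each of the two maps would then feed its own convolutional stream, with the stream outputs fused (e.g., by feature concatenation) before the final real/fake classifier.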




Notes

  1. The eyes and mouth are treated as the mesoscopic features for forgery detection in Deepfake videos.

  2. https://www.yf.io/p/lsun.

  3. https://peterwang512.github.io/CNNDetection/.


Funding

The authors did not receive support from any organization for the submitted work.

Author information


Corresponding author

Correspondence to Junaid Qadir.

Ethics declarations

Conflict of interest

We wish to confirm that there are no known conflicts of interest associated with this publication.

Ethical approval

We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us. We understand that the Corresponding Author is the sole contact for the Editorial process (including Editorial Manager and direct communications with the office).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yousaf, B., Usama, M., Sultani, W. et al. Fake visual content detection using two-stream convolutional neural networks. Neural Comput & Applic 34, 7991–8004 (2022). https://doi.org/10.1007/s00521-022-06902-5

