Video splicing detection and localization based on multi-level deep feature fusion and reinforcement learning

Jin, Xiao; He, Zhen; Xu, Jing; Wang, Yongwei; Su, Yuting

doi:10.1007/s11042-022-13001-z

Published: 16 May 2022

Volume 81, pages 40993–41011, (2022)

Multimedia Tools and Applications

Xiao Jin¹ (ORCID: orcid.org/0000-0002-5130-9505), Zhen He¹, Jing Xu¹, Yongwei Wang² & Yuting Su³

581 Accesses
9 Citations
1 Altmetric

Abstract

Splicing forgery refers to copying regions of a video or image into another video or image. Although image splicing detection has been studied for many years, video splicing detection has attracted far less attention. In this paper, we propose a novel framework for video splicing detection that models this forensic task as a video object segmentation problem. Based on the nature of this forgery operation, discontinuous noise distribution and object contours are adopted as traces to guide the localization results. The method consists of three modules: EXIF-consistency prediction, suspected region tracking, and semantic segmentation. To bridge the gap between sensor-level and semantic-level features, the three modules are integrated to detect the final tampered areas. First, the EXIF-consistency prediction module extracts sensor-level traces from tampered areas. Then, a deep reinforcement learning-based method tracks the suspected regions. Finally, a semantic segmentation module localizes the tampered regions. Compared with several state-of-the-art forensic approaches, our method performs better on publicly available datasets. In terms of F1 score, it achieves 0.623 on the GRIP dataset.
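The abstract reports localization quality as an F1 score. The evaluation protocol itself is not given in this excerpt, so the following is only a minimal sketch of the standard pixel-level definition, F1 = 2TP / (2TP + FP + FN), applied to binary tampering masks. The function name `pixel_f1`, the toy masks, and the convention for two empty masks are assumptions made for illustration, not the authors' code.

```python
from typing import Sequence


def pixel_f1(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Pixel-level F1 between two binary masks (1 = tampered, 0 = pristine)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # tampered pixel correctly flagged
            elif p and not t:
                fp += 1          # pristine pixel wrongly flagged
            elif t and not p:
                fn += 1          # tampered pixel missed
    denom = 2 * tp + fp + fn
    # Assumed convention: two all-zero masks count as perfect agreement.
    return 1.0 if denom == 0 else 2 * tp / denom


# Hypothetical 3x4 ground-truth and predicted masks for one frame.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
score = pixel_f1(pred, truth)
print(f"F1 = {score:.3f}")
```

For a video, such a per-frame score would typically be averaged over frames or computed over all pixels at once; which of the two the paper uses is not stated here.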

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 62002177), the Tianjin Natural Science Foundation, China (Grant Nos. 21JCYBJC00110 and 19JCQNJC00300), and the Fundamental Research Funds for the Central Universities of Nankai University (Grant Nos. 63201192 and 63211116).

Author information

Authors and Affiliations

College of Artificial Intelligence, Nankai University, Tianjin, 300350, China
Xiao Jin, Zhen He & Jing Xu
Joint NTU-WeBank Research Centre on Fintech, Nanyang Technological University, Singapore, 639798, Singapore
Yongwei Wang
School of Electronic and Information Engineering, Tianjin University, Tianjin, 300072, China
Yuting Su

Corresponding author

Correspondence to Jing Xu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jin, X., He, Z., Xu, J. et al. Video splicing detection and localization based on multi-level deep feature fusion and reinforcement learning. Multimed Tools Appl 81, 40993–41011 (2022). https://doi.org/10.1007/s11042-022-13001-z

Received: 12 March 2021
Revised: 07 January 2022
Accepted: 27 March 2022
Published: 16 May 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s11042-022-13001-z
