Multi-scale attention context-aware network for detection and localization of image splicing

Efficient and robust identification network

Applied Intelligence

Abstract

Advanced image editing tools produce increasingly realistic tampered images. Intelligent retouching technology in particular keeps lowering the threshold for tampering, so forged images can easily evade image forensic systems and their authenticity becomes harder to verify. Progress in image forensics has been held back by the lack of high-quality splicing datasets. In this paper, we present SMI20K, the first benchmark dataset for image splicing produced with intelligent tampering techniques, which contains a total of 20,000 spliced images. By combining Seamless Cloning with image similarity search, the tampered images carry better-hidden manipulation traces, making the tampered targets difficult to distinguish with the naked eye. SMI20K thus poses a new challenge to the field of image forensics. To address this challenge, we propose the Multi-scale Attention Context-aware Network (MAC-Net). Specifically, we propose a Multi-scale Multi-level Attention Module (MMAM) that not only resolves feature inconsistencies across scales but also automatically adjusts the coefficients of the original input features to preserve detailed features. The fused features are then fed to the proposed Multi-Branch Global Context Module (MGCM), whose three branches enrich the contextual information while preserving the detailed features of the target through automatic coefficient adjustment. Extensive experiments on three public datasets and the proposed dataset show that the proposed model outperforms other state-of-the-art (SOTA) models in image forgery localization.
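The abstract does not give the MMAM equations, so the following is only a minimal sketch of the general idea it describes: features from two scales are fused through a per-position gate, and the original fine-scale input is added back with an adjustable coefficient. The function names, the sigmoid gating rule, and the coefficient `alpha` are illustrative assumptions, not the authors' implementation.

```python
import math


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    rows, cols = len(fmap), len(fmap[0])
    return [[fmap[r // factor][c // factor] for c in range(cols * factor)]
            for r in range(rows * factor)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(fine, coarse, alpha=1.0):
    """Fuse a fine feature map with a coarser one (hypothetical rule).

    The gate g = sigmoid(fine + coarse_up) decides, per position, how much
    of each scale to keep. `alpha` is an assumed coefficient that re-injects
    the original fine features; in a trained network it would be learned.
    """
    factor = len(fine) // len(coarse)
    coarse_up = upsample_nearest(coarse, factor)
    fused = []
    for f_row, c_row in zip(fine, coarse_up):
        row = []
        for f, c in zip(f_row, c_row):
            g = sigmoid(f + c)
            row.append(g * f + (1.0 - g) * c + alpha * f)
        fused.append(row)
    return fused


fine = [[0.0, 1.0], [2.0, 3.0]]  # 2x2 "high-resolution" features
coarse = [[1.0]]                 # 1x1 "low-resolution" features
fused = gated_fuse(fine, coarse, alpha=0.5)
```

Where the two scales agree (here the position with value 1.0 in both maps), the gate has no effect and the output is simply the shared value plus the `alpha`-weighted input; where they disagree, the gate interpolates between them.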

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Corresponding author

Correspondence to Shaozhang Niu.

Cite this article

Ren, R., Niu, S., Jin, J. et al. Multi-scale attention context-aware network for detection and localization of image splicing. Appl Intell 53, 18219–18238 (2023). https://doi.org/10.1007/s10489-022-04421-3

  • DOI: https://doi.org/10.1007/s10489-022-04421-3
