Image Forgery Detection Based on Semantic Image Understanding

  • Kui Ye
  • Jing Dong
  • Wei Wang
  • Jindong Xu
  • Tieniu Tan
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 771)


Image forensics has focused on low-level visual features, paying little attention to the high-level semantic information of an image. In this work, we propose a framework for image forgery detection based on high-level semantics. It has three components: an image understanding module, a normal rule bank (NR) holding semantic rules that comply with common sense, and an abnormal rule bank (AR) holding semantic rules that do not. Ke et al. [1] proposed a similar framework, but ours has the following advantages. First, the image understanding module is built on a dense image captioning model, which requires no human intervention and provides more hierarchical features. Second, our framework can generate thousands of semantic rules automatically for NR. Third, besides NR, we also propose constructing AR. In this way, image forgery detection can be framed not only as anomaly detection with NR but also as a recognition problem with AR. The experimental results demonstrate that our framework is effective and performs better.
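The interplay of the two rule banks can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how rules are represented or matched, so the (subject, relation, object) triple format, the `min_count` threshold, and the function names `build_normal_rules` and `check_image` are all assumptions made for illustration. The toy triples stand in for rules that a dense captioning model would extract.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical rule representation: a (subject, relation, object) triple.
Rule = Tuple[str, str, str]


def build_normal_rules(observed: Iterable[Rule], min_count: int = 2) -> Set[Rule]:
    """Keep rules seen at least `min_count` times (assumed threshold)."""
    counts = Counter(observed)
    return {rule for rule, n in counts.items() if n >= min_count}


def check_image(rules: List[Rule], nr: Set[Rule], ar: Set[Rule]) -> dict:
    """Split an image's rules into AR hits (recognition) and NR misses (anomaly)."""
    recognized = [r for r in rules if r in ar]
    anomalous = [r for r in rules if r not in nr and r not in ar]
    return {
        "recognized": recognized,
        "anomalous": anomalous,
        "suspicious": bool(recognized or anomalous),
    }


# Toy data standing in for rules extracted from captions of authentic images.
authentic = [
    ("man", "riding", "horse"),
    ("man", "riding", "horse"),
    ("dog", "on", "grass"),
    ("dog", "on", "grass"),
    ("cat", "on", "sofa"),
]
nr = build_normal_rules(authentic)   # normal rule bank, generated automatically
ar = {("horse", "riding", "man")}    # abnormal rule bank, listed by hand here

result = check_image([("horse", "riding", "man"), ("dog", "on", "grass")], nr, ar)
```

In this sketch a rule that matches AR is flagged by direct recognition, while a rule that is absent from both banks is flagged as an anomaly relative to NR, which mirrors the two framings named in the abstract.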


Keywords: Image forensics; Image understanding module; NR; AR; Deep learning



This work is supported by NSFC (Nos. U1536120, U1636201, 61502496), the National Key Research and Development Program of China (No. 2016YFB1001003) and China Postdoctoral Science Foundation funded project (No. 2016M601168).


References

  1. Ke, Y., Min, W., Qin, F., Shang, J.: Image forgery detection based on semantics. Int. J. Hybrid Inf. Technol. 7(1) (2014)
  2. Fridrich, J., Soukal, D., Lukáš, J.: Detection of copy-move forgery in digital images. In: Proceedings of Digital Forensic Research Workshop (2003)
  3. Popescu, A.C., Farid, H.: Statistical tools for digital forensics. In: Fridrich, J. (ed.) IH 2004. LNCS, vol. 3200, pp. 128–147. Springer, Heidelberg (2004)
  4. Lin, Z., Wang, R., Tang, X., Shum, H.-Y.: Detecting doctored images using camera response normality and consistency. In: Proceedings of Computer Vision and Pattern Recognition (2005)
  5. Johnson, M.K., Farid, H.: Exposing digital forgeries through chromatic aberration. In: Proceedings of ACM Multimedia and Security Workshop, pp. 48–55 (2006)
  6. Popescu, A.C., Farid, H.: Exposing digital forgeries in color filter array interpolated images. IEEE Trans. Sig. Process. 53(10), 3948–3959 (2005)
  7. Lukáš, J., Fridrich, J., Goljan, M.: Detecting digital image forgeries using sensor pattern noise. In: Proceedings of SPIE Electronic Imaging Security Steganography Watermarking of Multimedia Contents VIII, vol. 6072, pp. 0Y1–0Y11 (2006)
  8. O’Brien, J., Farid, H.: Exposing photo manipulation with inconsistent reflections. ACM Trans. Graph. 31(1), 1–11 (2012)
  9. Kee, E., Farid, H.: Exposing digital forgeries from 3-D lighting environments. In: 2010 IEEE International Workshop on Information Forensics and Security, pp. 1–6. IEEE (2010)
  10. Kee, E., O’Brien, J.F., Farid, H.: Exposing photo manipulation with inconsistent shadows. ACM Trans. Graph. (ToG) 32(3), 28 (2013)
  11. Johnson, J., Karpathy, A., Fei-Fei, L.: Densecap: fully convolutional localization networks for dense captioning. arXiv preprint arXiv:1511.07571 (2015)
  12. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
  13. Lipton, Z.C., Berkowitz, J., Elkan, C.: A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv:1506.00019 (2015)
  14. Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2625–2634 (2015)
  15. Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., Chen, S., Kalantidis, Y., Li, L., Shamma, D.A., et al.: Visual genome: connecting language and vision using crowdsourced dense image annotations. arXiv preprint arXiv:1602.07332 (2016)
  16. Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proceedings of 8th International Conference on Computer Vision, vol. 2, pp. 416–423, July 2001

Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  • Kui Ye (1, 2)
  • Jing Dong (1, 3)
  • Wei Wang (1, 2, 4)
  • Jindong Xu (1)
  • Tieniu Tan (1)

  1. Center for Research on Intelligent Perception and Computing, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. State Key Laboratory of Cryptology, Chinese Academy of Sciences, Beijing, China
  3. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
  4. Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen, China
