Detecting and Recovering Sequential DeepFake Manipulation

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Photorealistic faces can now be readily generated by facial manipulation technologies, and the potential for malicious abuse of these technologies has raised serious concern. Numerous deepfake detection methods have therefore been proposed. However, existing methods focus only on detecting one-step facial manipulation. With the emergence of easily accessible facial editing applications, people can manipulate facial components through multi-step operations applied in sequence. This new threat requires detecting a sequence of facial manipulations, which is vital both for detecting deepfake media and for recovering the original faces afterwards. Motivated by this observation, we emphasize this need and propose a novel research problem called Detecting Sequential DeepFake Manipulation (Seq-DeepFake). Unlike the existing deepfake detection task, which demands only a binary label prediction, detecting Seq-DeepFake manipulation requires correctly predicting a sequential vector of facial manipulation operations. To support a large-scale investigation, we construct the first Seq-DeepFake dataset, in which face images are manipulated sequentially and annotated with the corresponding sequential facial manipulation vectors. Based on this new dataset, we cast detecting Seq-DeepFake manipulation as a specific image-to-sequence task (e.g., image captioning) and propose a concise yet effective Seq-DeepFake Transformer (SeqFakeFormer). Moreover, we build a comprehensive benchmark and set up rigorous evaluation protocols and metrics for this new research problem. Extensive experiments demonstrate the effectiveness of SeqFakeFormer. We also report several observations that may facilitate future research on broader deepfake detection problems.
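
To make the difference between a binary label and a sequential manipulation vector concrete, the following is a minimal sketch, not the authors' implementation. The operation names in `VOCAB`, the maximum length `MAX_STEPS`, and the padding with a "none" token are all assumptions introduced for illustration; the abstract does not specify them.

```python
from typing import List, Sequence

# Hypothetical vocabulary of manipulation operations (assumed, not from the paper).
NO_MANIPULATION = "none"
VOCAB = [NO_MANIPULATION, "eye", "nose", "lip", "hair", "eyebrow"]
TOKEN_ID = {name: i for i, name in enumerate(VOCAB)}
MAX_STEPS = 5  # assumed upper bound on the number of sequential edits


def encode_sequence(ops: Sequence[str], max_steps: int = MAX_STEPS) -> List[int]:
    """Map an ordered list of operations to a fixed-length vector of token ids.

    Unused trailing positions are padded with the "none" token.
    """
    if len(ops) > max_steps:
        raise ValueError(f"at most {max_steps} operations are supported")
    ids = [TOKEN_ID[op] for op in ops]
    return ids + [TOKEN_ID[NO_MANIPULATION]] * (max_steps - len(ids))


def binary_label(ops: Sequence[str]) -> int:
    """Conventional deepfake target: 1 if the face was manipulated at all."""
    return int(len(ops) > 0)


def exact_match(pred: Sequence[int], target: Sequence[int]) -> bool:
    """True only if every step, including its position, is predicted correctly."""
    return list(pred) == list(target)


if __name__ == "__main__":
    a = ["eye", "lip"]
    b = ["lip", "eye"]  # same operations, different order
    print(binary_label(a), binary_label(b))        # both collapse to one label
    print(encode_sequence(a), encode_sequence(b))  # the vectors keep the order
    print(exact_match(encode_sequence(a), encode_sequence(b)))
```

Under these assumptions, two faces edited with the same operations in a different order receive the same binary label but different sequence targets, which is the information a sequence-level detector would need in order to undo the edits step by step.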

Acknowledgements

This work is supported by NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), and under the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s). Ziwei Liu is the corresponding author.

Author information

Corresponding author

Correspondence to Ziwei Liu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Shao, R., Wu, T., Liu, Z. (2022). Detecting and Recovering Sequential DeepFake Manipulation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13673. Springer, Cham. https://doi.org/10.1007/978-3-031-19778-9_41

  • DOI: https://doi.org/10.1007/978-3-031-19778-9_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19777-2

  • Online ISBN: 978-3-031-19778-9

  • eBook Packages: Computer Science, Computer Science (R0)
