Abstract
The advancement of Facial Attribute Editing (FAE) technology allows individuals to effortlessly alter facial attributes in images without discernible visual artifacts. Because facial features play a pivotal role in identity recognition, the misuse of such manipulated images raises serious security concerns, particularly around identity forgery. Existing image forensics algorithms concentrate mainly on traditional tampering methods such as splicing and copy-move, and are often tailored to detect tampering in natural-scene images; consequently, they fail to pinpoint FAE manipulations effectively. In this paper, we introduce two FAE datasets and propose the Multi-Scale Enhanced Dual-Stream Network (MSDS-Net) specifically for FAE localization. Our analysis reveals that FAE artifacts are present in both the spatial domain and the DCT frequency domain. Unlike traditional tampering, where modifications are localized, facial attribute alterations often span the entire image, and the transitions between edited and unedited regions appear seamless, without conspicuous local tampering traces. Our method therefore adopts a dual-stream structure that extracts tampering traces from both the spatial and DCT frequency domains; within each stream, multi-scale units capture editing artifacts across receptive fields of varying size. Comprehensive comparisons show that our approach outperforms existing methods for FAE localization, setting a new performance benchmark. When applied to the task of localizing facial image inpainting, our method also achieves strong results.
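The dual-stream, multi-scale design described above can be illustrated with a minimal PyTorch sketch. This is not the authors' MSDS-Net: the class names, channel widths, and the choice of a full-frame orthonormal 2-D DCT (rather than whatever transform granularity the paper uses) are all assumptions made for illustration. It shows only the overall idea of a spatial stream and a DCT-domain stream, each with a multi-scale convolutional unit, fused into a per-pixel tampering mask.

```python
import torch
import torch.nn as nn


def dct_matrix(n: int) -> torch.Tensor:
    # Orthonormal DCT-II basis; a 2-D DCT is applied as M @ x @ M.T.
    k = torch.arange(n).float()
    m = torch.cos(torch.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= 1 / torch.sqrt(torch.tensor(2.0))
    return m * torch.sqrt(torch.tensor(2.0 / n))


class MultiScaleUnit(nn.Module):
    # Parallel convolutions with different receptive fields (1x1, 3x3, 5x5),
    # concatenated and fused by a 1x1 convolution.
    def __init__(self, ch: int):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, k, padding=k // 2) for k in (1, 3, 5)
        )
        self.fuse = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))


class DualStreamLocalizer(nn.Module):
    # Hypothetical sketch: one stream on the RGB input, one on its 2-D DCT,
    # fused into a per-pixel tampering-probability mask.
    def __init__(self, ch: int = 16):
        super().__init__()
        self.spatial = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), MultiScaleUnit(ch))
        self.freq = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), MultiScaleUnit(ch))
        self.head = nn.Conv2d(2 * ch, 1, 1)

    def forward(self, x):
        m = dct_matrix(x.shape[-1]).to(x.dtype)
        x_dct = m @ x @ m.T  # full-frame 2-D DCT over the spatial dims (square input)
        feats = torch.cat([self.spatial(x), self.freq(x_dct)], dim=1)
        return torch.sigmoid(self.head(feats))  # per-pixel probability mask


mask = DualStreamLocalizer()(torch.rand(1, 3, 64, 64))
print(mask.shape)  # torch.Size([1, 1, 64, 64])
```

Feeding both streams lets the network pick up low-level frequency statistics that a purely spatial encoder can miss, which matters here because FAE edits leave no conspicuous local seams.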
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant Nos. U19B2022, 61972430).
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Huang, J., Luo, W., Huang, W., Xi, Z., Wei, K., Huang, J. (2024). Multi-Scale Enhanced Dual-Stream Network for Facial Attribute Editing Localization. In: Ma, B., Li, J., Li, Q. (eds) Digital Forensics and Watermarking. IWDW 2023. Lecture Notes in Computer Science, vol 14511. Springer, Singapore. https://doi.org/10.1007/978-981-97-2585-4_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2584-7
Online ISBN: 978-981-97-2585-4
eBook Packages: Computer Science, Computer Science (R0)