Abstract
The advancement of Facial Attribute Editing (FAE) technology allows individuals to effortlessly alter facial attributes in images without discernible visual artifacts. Because facial features play a pivotal role in identity recognition, the misuse of such manipulated images raises serious security concerns, particularly around identity forgery. Existing image forensics algorithms concentrate mainly on traditional tampering methods such as splicing and copy-move, and are often tailored to detect tampering in natural-scene images; consequently, they fail to pinpoint FAE manipulations effectively. In this paper, we introduce two FAE datasets and propose the Multi-Scale Enhanced Dual-Stream Network (MSDS-Net) specifically for FAE localization. Our analysis reveals that FAE artifacts are present in both the spatial domain and the DCT frequency domain. Unlike traditional tampering, where modifications are localized, facial attribute alterations often span the entire image, and the transitions between edited and unedited regions appear seamless, without conspicuous local tampering traces. Our method therefore adopts a dual-stream structure that extracts tampering traces from both the spatial and DCT frequency domains; within each stream, multi-scale units capture editing artifacts across receptive fields of varying size. Comprehensive comparisons show that our approach outperforms existing methods for FAE localization, setting a new performance benchmark. When applied to the task of localizing facial image inpainting, our method also achieves strong results.
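The dual-stream, multi-scale design described above can be illustrated with a minimal PyTorch sketch. This is not the authors' MSDS-Net: the class names, channel widths, and the choice of a full-frame orthonormal 2-D DCT (rather than whatever transform granularity the paper uses) are all assumptions made for illustration. It shows only the overall idea of a spatial stream and a DCT-domain stream, each with a multi-scale convolutional unit, fused into a per-pixel tampering mask.

```python
import torch
import torch.nn as nn


def dct_matrix(n: int) -> torch.Tensor:
    # Orthonormal DCT-II basis; a 2-D DCT is applied as M @ x @ M.T.
    k = torch.arange(n).float()
    m = torch.cos(torch.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= 1 / torch.sqrt(torch.tensor(2.0))
    return m * torch.sqrt(torch.tensor(2.0 / n))


class MultiScaleUnit(nn.Module):
    # Parallel convolutions with different receptive fields (1x1, 3x3, 5x5),
    # concatenated and fused by a 1x1 convolution.
    def __init__(self, ch: int):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, k, padding=k // 2) for k in (1, 3, 5)
        )
        self.fuse = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))


class DualStreamLocalizer(nn.Module):
    # Hypothetical sketch: one stream on the RGB input, one on its 2-D DCT,
    # fused into a per-pixel tampering-probability mask.
    def __init__(self, ch: int = 16):
        super().__init__()
        self.spatial = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), MultiScaleUnit(ch))
        self.freq = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), MultiScaleUnit(ch))
        self.head = nn.Conv2d(2 * ch, 1, 1)

    def forward(self, x):
        m = dct_matrix(x.shape[-1]).to(x.dtype)
        x_dct = m @ x @ m.T  # full-frame 2-D DCT over the spatial dims (square input)
        feats = torch.cat([self.spatial(x), self.freq(x_dct)], dim=1)
        return torch.sigmoid(self.head(feats))  # per-pixel probability mask


mask = DualStreamLocalizer()(torch.rand(1, 3, 64, 64))
print(mask.shape)  # torch.Size([1, 1, 64, 64])
```

Feeding both streams lets the network pick up low-level frequency statistics that a purely spatial encoder can miss, which matters here because FAE edits leave no conspicuous local seams.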
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant Nos. U19B2022, 61972430).
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Huang, J., Luo, W., Huang, W., Xi, Z., Wei, K., Huang, J. (2024). Multi-Scale Enhanced Dual-Stream Network for Facial Attribute Editing Localization. In: Ma, B., Li, J., Li, Q. (eds) Digital Forensics and Watermarking. IWDW 2023. Lecture Notes in Computer Science, vol 14511. Springer, Singapore. https://doi.org/10.1007/978-981-97-2585-4_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2584-7
Online ISBN: 978-981-97-2585-4
eBook Packages: Computer Science, Computer Science (R0)