Abstract
Among recent advances in deep learning for realistic image generation and translation, Generative Adversarial Networks (GANs) have delivered favorable results. GANs generate novel samples that are difficult to distinguish from authentic images. This paper proposes a novel generative network for thermal-to-visible image translation. Thermal-to-visible synthesis is challenging because thermal images lack accurate semantic and textural information. Thermal sensors acquire thermal face images by capturing the object’s luminance, which carries fewer details about the actual facial information. However, thermal imaging is advantageous for low-light and night-time vision, where an RGB camera cannot capture image information in a complex environment. We design a new Attention-guided Cyclic Generative Adversarial Network for Thermal to Visible Face transformation (TVA-GAN) by integrating a new attention network. We utilize attention guidance with a recurrent block with an Inception module to simplify the learning space toward the optimum solution. The proposed TVA-GAN is trained and evaluated for thermal-to-visible face synthesis on three benchmark datasets: WHU-IIP, Tufts Face Thermal2RGB, and CVBL-CHILD. The results show promising improvement in face synthesis over state-of-the-art GAN methods. Code for TVA-GAN is available at: https://github.com/GANGREEK/TVA-GAN.
Data availability
The WHU-IIP, Tufts Face Thermal2RGB (http://tdface.ece.tufts.edu/), and CVBL-CHILD datasets (https://cvbl.iiita.ac.in/dataset.php) used in the paper were taken from [67,68,69], respectively. Datasets are available from the authors upon reasonable request.
References
Siesler HW, Ozaki Y, Kawata S, Heise HM (2008) Near-infrared spectroscopy: principles, instruments, applications. John Wiley & Sons, London
Havens KJ, Sharp EJ (2016) Chapter 7—thermal imagers and system considerations. In: Havens KJ, Sharp EJ (eds) Thermal imaging techniques to survey and monitor animals in the wild. Academic Press, Boston, pp 101–119. https://doi.org/10.1016/B978-0-12-803384-5.00007-5
Banfield D, Conrath B, Pearl J, Smith M, Christensen P (2000) Thermal tides and stationary waves on mars as revealed by mars global surveyor thermal emission spectrometer. J Geophys Res 105:9521–9537
FLIR A (2010) The ultimate infrared handbook for R&D professionals. FLIR Systems, Boston
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Mao X, Shen C, Yang YB (2016) Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. Adv Neural Inf Process Syst, pp 2802–2810
Wang L, Sindagi V, Patel V (2018) High-quality facial photo-sketch synthesis using multi-adversarial networks. In: 2018 13th IEEE international conference on automatic face and gesture recognition (FG 2018). IEEE, pp 83–90
Shen Y, Luo P, Yan J, Wang X, Tang X (2018) Faceid-gan: Learning a symmetry three-player gan for identity-preserving face synthesis. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 821–830
Peng C, Wang N, Li J, Gao X (2020) Face sketch synthesis in the wild via deep patch representation-based probabilistic graphical model. IEEE Trans Inf Forensics Security 15:172–183
Xia Y, Zheng W, Wang Y, Yu H, Dong J, Wang FY (2021) Local and global perception generative adversarial network for facial expression synthesis. IEEE Trans Circuits Syst Video Technol
Yang Y, Liu J, Huang S, Wan W, Wen W, Guan J (2021) Infrared and visible image fusion via texture conditional generative adversarial network. IEEE Trans Circuits Syst Video Technol
Isola P, Zhu J, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 5967–5976
Bharti V, Biswas B, Shukla KK (2021) Emocgan: a novel evolutionary multiobjective cyclic generative adversarial network and its application to unpaired image translation. Neural Comput Appl, pp 1–15
Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder—decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1724–1734
Xu S, Zhu Q, Wang J (2020) Generative image completion with image-to-image translation. Neural Comput Appl 32(11):7333–7345
Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2017.632
Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2223–2232
Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp 2849–2857
Liao B, Chen Y (2007) An image quality assessment algorithm based on dual-scale edge structure similarity. In: Second international conference on innovative computing, information and control (ICICIC 2007). IEEE, pp 56–56
Zhang R, Isola P, Efros AA (2016) Colorful image colorization. In: European conference on computer vision. Springer, pp 649–666
Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, et al (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4681–4690
Souly N, Spampinato C, Shah M (2017) Semi supervised semantic segmentation using generative adversarial network. In: Proceedings of the IEEE international conference on computer vision, pp 5688–5696
Abdal R, Qin Y, Wonka P (2019) Image2stylegan: How to embed images into the stylegan latent space? In: Proceedings of the IEEE international conference on computer vision, pp 4432–4441
Yuan M, Peng Y (2019) Bridge-gan: interpretable representation learning for text-to-image synthesis. IEEE Trans Circuits Syst Video Technol 30(11):4258–4268
Liao K, Lin C, Zhao Y, Gabbouj M (2019) Dr-gan: Automatic radial distortion rectification using conditional gan in real-time. IEEE Trans Circuits Syst Video Technol 30(3):725–733
Zhang S, Ji R, Hu J, Lu X, Li X (2018) Face sketch synthesis by multidomain adversarial learning. IEEE Trans Neural Netw Learn Syst 30(5):1419–1428
Serengil SI, Ozpinar A (2020) Lightface: A hybrid deep face recognition framework. In: 2020 Innovations in intelligent systems and applications conference (ASYU). IEEE, pp 23–27. https://doi.org/10.1109/ASYU50717.2020.9259802
Li J, Hao P, Zhang C, Dou M (2008) Hallucinating faces from thermal infrared images. In: 2008 15th IEEE international conference on image processing. IEEE, pp 465–468
Choi J, Hu S, Young SS, Davis LS (2012) Thermal to visible face recognition. In: Sensing technologies for global health, military medicine, disaster response, and environmental monitoring II; and biometric technology for human identification IX, vol 8371. International Society for Optics and Photonics, p 83711L
Chen C, Ross A (2016) Matching thermal to visible face images using hidden factor analysis in a cascaded subspace learning framework. Pattern Recogn Lett 72:25–32
Zhang H, Riggan BS, Hu S, Short NJ, Patel VM (2019) Synthesis of high-quality visible faces from polarimetric thermal faces using generative adversarial networks. Int J Comput Vis 127(6–7):845–862
Hu S, Short NJ, Riggan BS, Gordon C, Gurton KP, Thielke M, Gurram P, Chan AL (2016) A polarimetric thermal database for face recognition research. In: 2016 IEEE conference on computer vision and pattern recognition workshops (CVPRW). IEEE, pp 187–194
Iranmanesh SM, Dabouei A, Kazemi H, Nasrabadi NM (2018) Deep cross polarimetric thermal-to-visible face recognition. In: 2018 international conference on biometrics (ICB). IEEE, pp 166–173
Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1125–1134
Babu KK, Dubey SR (2020) Pcsgan: Perceptual cyclic-synthesized generative adversarial networks for thermal and nir to visible image transformation. Neurocomputing 413:41–50
Mejjati YA, Richardt C, Tompkin J, Cosker D, Kim KI (2018) Unsupervised attention-guided image-to-image translation. In: Adv Neural Inf Process Syst, pp 3693–3703
Tang H, Xu D, Sebe N, Yan Y (2019) Attention-guided generative adversarial networks for unsupervised image-to-image translation. In: 2019 International joint conference on neural networks (IJCNN). IEEE, pp 1–8
Zhang H, Goodfellow I, Metaxas D, Odena A (2019) Self-attention generative adversarial networks. In: International conference on machine learning, pp 7354–7363
Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv:1411.1784
Liu MY, Tuzel O (2016) Coupled generative adversarial networks. Adv Neural Inf Process Syst, pp 469–477
Mejjati YA, Richardt C, Tompkin J, Cosker D, Kim KI (2018) Unsupervised attention-guided image-to-image translation. Adv Neural Inf Process Syst, pp 3693–3703
Zhang H, Goodfellow IJ, Metaxas DN, Odena A (2018) Self-attention generative adversarial networks. arXiv:1805.08318
Lejbølle AR, Nasrollahi K, Krogh B, Moeslund TB (2020) Person re-identification using spatial and layer-wise attention. IEEE Trans Inf Forensics Security 15:1216–1231. https://doi.org/10.1109/TIFS.2019.2938870
Tang H, Liu HC, Xu D, Torr PHS, Sebe N (2019) Attentiongan: Unpaired image-to-image translation using attention-guided generative adversarial networks. arXiv:1911.11897
Tang H, Xu D, Sebe N, Wang Y, Corso JJ, Yan Y (2019) Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2417–2426
Tang H, Chen X, Wang W, Xu D, Corso JJ, Sebe N, Yan Y (2019) Attribute-guided sketch generation. In: 2019 14th IEEE international conference on automatic face and gesture recognition (FG 2019). IEEE, pp 1–7
Chen H, Hu G, Lei Z, Chen Y, Robertson NM, Li SZ (2020) Attention-based two-stream convolutional networks for face spoofing detection. IEEE Trans Inf Forensics Security 15:578–593. https://doi.org/10.1109/TIFS.2019.2922241
Nyberg A, Eldesokey A, Bergstrom D, Gustafsson D (2018) Unpaired thermal to visible spectrum transfer using adversarial training. In: Proceedings of the European conference on computer vision (ECCV) Workshops
Kuang X, Zhu J, Sui X, Liu Y, Liu C, Chen Q, Gu G (2020) Thermal infrared colorization via conditional generative adversarial network. Infrared Phys Technol 107:103338
Zhang T, Wiliem A, Yang S, Lovell B (2018) Tv-gan: Generative adversarial network based thermal to visible face recognition. In: 2018 International conference on biometrics (ICB). IEEE, pp 174–181
Bhat N, Saggu N, Kumar S, et al (2020) Generating visible spectrum images from thermal infrared using conditional generative adversarial networks. In: 2020 5th International conference on communication and electronics systems (ICCES). IEEE, pp 1390–1394
Kantarci A, Ekenel HK (2019) Thermal to visible face recognition using deep autoencoders. In: 2019 International conference of the biometrics special interest group (BIOSIG), pp 1–5
Kezebou L, Oludare V, Panetta K, Agaian S (2020) Tr-gan: thermal to rgb face synthesis with generative adversarial network for cross-modal face recognition. In: Mobile multimedia/image processing, security, and applications 2020, vol 11399. International Society for Optics and Photonics, p 113990P
Lahiri A, Bairagya S, Bera S, Haldar S, Biswas PK (2021) Lightweight modules for efficient deep learning based image restoration. IEEE Trans Circuits Syst Video Technol 31(4):1395–1410. https://doi.org/10.1109/TCSVT.2020.3007723
Tan DS, Lin YX, Hua KL (2021) Incremental learning of multi-domain image-to-image translations. IEEE Trans Circuits Syst Video Technol 31(4):1526–1539. https://doi.org/10.1109/TCSVT.2020.3005311
Xu S, Liu D, Xiong Z (2021) E2i: Generative inpainting from edge to image. IEEE Trans Circuits Syst Video Technol 31(4):1308–1322. https://doi.org/10.1109/TCSVT.2020.3001267
Zhong X, Lu T, Huang W, Ye M, Jia X, Lin CW (2021) Grayscale enhancement colorization network for visible-infrared person re-identification. IEEE Trans Circuits Syst Video Technol, pp 1–1. https://doi.org/10.1109/TCSVT.2021.3072171
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 234–241
Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et al (2018) Attention u-net: learning where to look for the pancreas. arXiv preprint arXiv:1804.03999
Luong MT, Pham H, Manning CD (2015) Effective approaches to attention-based neural machine translation. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 1412–1421
Liang M, Hu X (2015) Recurrent convolutional neural network for object recognition. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 3367–3375. https://doi.org/10.1109/CVPR.2015.7298958
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges C, Bottou L, Weinberger K (eds) Advances in neural information processing systems, vol 25. Curran Associates Inc, Red Hook
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 2818–2826. https://doi.org/10.1109/CVPR.2016.308
Mao X, Li Q, Xie H, Lau RY, Wang Z, Smolley SP (2017) Least squares generative adversarial networks. In: 2017 IEEE international conference on computer vision (ICCV). IEEE, pp 2813–2821
Kancharagunta KB, Dubey SR (2019) Csgan: Cyclic-synthesized generative adversarial networks for image-to-image transformation. arXiv preprint arXiv:1901.03554
Kniaz VV, Knyaz VA, Hladůvka J, Kropatsch WG, Mizginov VA (2018) ThermalGAN: multimodal color-to-thermal image translation for person re-identification in multispectral dataset. In: Computer vision—ECCV 2018 workshops. Springer International Publishing
Wang Z, Chen Z, Wu F (2018) Thermal to visible facial image translation using generative adversarial networks. IEEE Signal Process Lett 25:1161–1165
Panetta K, Wan Q, Agaian S, Rajeev S, Kamath S, Rajendran R, Rao S, Kaszowska A, Taylor H, Samani A, et al (2018) A comprehensive database for benchmarking imaging systems. IEEE Trans Pattern Anal Mach Intell
Kumar S, Singh SK (2018) A comparative analysis on the performance of different handcrafted descriptors over thermal and low resolution visible image dataset. In: 2018 5th IEEE Uttar Pradesh section international conference on electrical, electronics and computer engineering (UPCON), pp 1–6. https://doi.org/10.1109/UPCON.2018.8596897
Dubey SR, Chakraborty S, Roy SK, Mukherjee S, Singh SK, Chaudhuri BB (2020) diffgrad: An optimization method for convolutional neural networks. IEEE Trans Neural Netw Learn Syst 31(11):4500–4511. https://doi.org/10.1109/TNNLS.2019.2955777
Kingma DP, Ba J (2015) Adam: A method for stochastic optimization. In: International conference on learning representations
Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR
Sheikh HR, Bovik AC (2006) Image information and visual quality. IEEE Trans Image Process 15(2):430–444. https://doi.org/10.1109/TIP.2005.859378
Simonyan K, Vedaldi A, Zisserman A (2014) Deep inside convolutional networks: Visualising image classification models and saliency maps. CoRR abs/1312.6034
Lahitani AR, Permanasari AE, Setiawan NA (2016) Cosine similarity to determine similarity measure: Study case in online essay assessment. In: 2016 4th International conference on cyber and IT service management, pp 1–6. https://doi.org/10.1109/CITSM.2016.7577578
Acknowledgements
The authors acknowledge the High Performance Computing facility of IIIT Allahabad used for the experiments in this paper.
Funding
We gratefully acknowledge the Indian Institute of Information Technology Allahabad, Ministry of Education, Govt. of India, for providing the fellowship to pursue this research work.
Ethics declarations
Conflict of interest
We declare no conflict of interest.
Human and animal rights
We use the publicly available WHU-IIP and Tufts Thermal2RGB datasets for the experiments. The CVBL-CHILD dataset was collected following due process and with consent from the subjects. No images are used in a way that can cause embarrassment to the subjects.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yadav, N.K., Singh, S.K. & Dubey, S.R. TVA-GAN: attention guided generative adversarial network for thermal to visible image transformations. Neural Comput & Applic 35, 19729–19749 (2023). https://doi.org/10.1007/s00521-023-08724-5
DOI: https://doi.org/10.1007/s00521-023-08724-5