An end-to-end multi-scale network based on autoencoder for infrared and visible image fusion

Published in Multimedia Tools and Applications

Abstract

Infrared and visible image fusion aims to obtain a more informative fused image by merging an infrared image with a visible image. However, existing methods have shortcomings such as loss of detail information, unclear boundaries, and the lack of an end-to-end design. In this paper, we propose an end-to-end network architecture for the infrared and visible image fusion task. Our network contains three essential parts: two encoders, a residual fusion module, and a decoder. First, we feed the infrared and visible images to the two encoders to extract their shallow features, respectively. Subsequently, the two sets of features are concatenated and fed to the residual fusion module, which extracts multi-scale features and fuses them adequately. Finally, the fused image is obtained from the decoder. We conduct objective and subjective experiments on two public datasets. Comparisons with state-of-the-art methods show that the proposed method achieves better objective metrics and produces fusion results with more detail information and clearer boundaries.
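To make the pipeline described above concrete, the following is a minimal PyTorch sketch of the three-part architecture: two shallow encoders, a residual fusion module applied to the concatenated feature maps, and a decoder, composed as a single end-to-end network. All channel widths, kernel sizes, and the number of residual blocks are illustrative assumptions, and the multi-scale aspect of the paper's fusion module is simplified here to plain residual blocks; this is a sketch, not the authors' actual configuration.

```python
# Minimal sketch of an encoder / residual-fusion / decoder fusion network.
# Channel counts, kernel sizes, and block depth are illustrative assumptions.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """A 3x3 convolutional residual block used inside the fusion module."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        # Skip connection: output = activation(input + residual branch).
        return torch.relu(x + self.body(x))


class FusionNet(nn.Module):
    def __init__(self, feat: int = 64, num_blocks: int = 4):
        super().__init__()
        # Two separate shallow encoders, one per modality.
        self.enc_ir = nn.Sequential(nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(inplace=True))
        self.enc_vis = nn.Sequential(nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(inplace=True))
        # Residual fusion module over the concatenated feature maps.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * feat, feat, 1),
            *[ResidualBlock(feat) for _ in range(num_blocks)],
        )
        # Decoder reconstructs the single-channel fused image.
        self.dec = nn.Sequential(
            nn.Conv2d(feat, feat // 2, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat // 2, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, ir, vis):
        # Concatenate the two sets of shallow features along channels, fuse, decode.
        f = torch.cat([self.enc_ir(ir), self.enc_vis(vis)], dim=1)
        return self.dec(self.fuse(f))


if __name__ == "__main__":
    # Example: fuse a pair of 256x256 grayscale images in a single forward pass.
    net = FusionNet()
    ir = torch.rand(1, 1, 256, 256)
    vis = torch.rand(1, 1, 256, 256)
    fused = net(ir, vis)
    print(fused.shape)  # torch.Size([1, 1, 256, 256])
```

Because the whole mapping from the input pair to the fused image is one differentiable network, it can be trained directly with a reconstruction or similarity loss (e.g., an SSIM-based loss), which is what "end-to-end" means in this context.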



Author information

Corresponding author

Correspondence to Hua Yan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest related to this work. All individuals involved in the experiments were informed and formally consented to participate.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, H., Yan, H. An end-to-end multi-scale network based on autoencoder for infrared and visible image fusion. Multimed Tools Appl 82, 20139–20156 (2023). https://doi.org/10.1007/s11042-022-14314-9
