A robust infrared and visible image fusion framework via multi-receptive-field attention and color visual perception

Abstract

In this paper, a robust infrared and visible image fusion scheme that combines a dual-branch multi-receptive-field neural network with a color vision transfer algorithm is designed to fuse infrared and visible video sequences. The proposed method enables the fused image to highlight thermal targets effectively, retain rich texture information and preserve visual perception quality. The fusion network is an integrated encoder-decoder model with a multi-receptive-field attention mechanism, implemented via hybrid dilated convolution (HDC) and a series of convolution layers, forming an unsupervised framework. Specifically, the multi-receptive-field attention mechanism extracts comprehensive spatial information so that the encoder can separately focus on the salient thermal radiation of the infrared modality and the environmental characteristics of the visible modality. In addition, to ensure that the fused image has rich color, high fidelity and stable brightness, a color vision transfer method is proposed to recolor the fused grayscale result by deriving a color map from the visible image, which serves as the reference. Extensive experiments verify the contribution and robustness of each step under both subjective and objective evaluation, and demonstrate that our method achieves a favorable trade-off among color fidelity, fusion performance and computational efficiency. Moreover, our research content, data and code will be published at https://github.com/DZSYUNNAN/RGB-TIR-image-fusion.
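To make the two components concrete, here is a minimal PyTorch sketch of what a multi-receptive-field attention block built from hybrid dilated convolution might look like. The channel width and the dilation rates (1, 2, 5, a common HDC pattern chosen to avoid gridding artifacts) are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn

class MultiReceptiveFieldAttention(nn.Module):
    """Hypothetical multi-receptive-field attention block.

    Parallel 3x3 convolutions with dilation rates 1, 2 and 5 (a hybrid
    dilated convolution pattern) gather spatial context at several
    receptive fields; their concatenated responses are squashed into a
    single-channel sigmoid map that re-weights the input features.
    """

    def __init__(self, channels, dilations=(1, 2, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=d, dilation=d)
            for d in dilations
        ])
        self.fuse = nn.Sequential(
            nn.Conv2d(channels * len(dilations), 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        attention = self.fuse(multi_scale)   # (N, 1, H, W) spatial weights
        return x * attention                 # re-weighted encoder features

# Usage: re-weight 64-channel encoder features from one modality.
features = torch.randn(1, 64, 128, 128)
print(MultiReceptiveFieldAttention(64)(features).shape)  # (1, 64, 128, 128)
```

Likewise, the color vision transfer step can be approximated by a simple luminance-replacement baseline: the chrominance of the visible reference is kept while the fused result supplies the luminance. This sketch conveys the idea only; the paper's actual algorithm, which additionally targets color fidelity and brightness stability, may differ, and the file names in the usage comment are hypothetical.

```python
import cv2
import numpy as np

def recolor_fused(fused_gray: np.ndarray, visible_bgr: np.ndarray) -> np.ndarray:
    """Recolor a fused grayscale image using the visible image as reference.

    Luminance replacement in YCrCb space: the Cr/Cb channels of the
    visible image carry the color, while the fused result supplies the
    luminance. Inputs are expected to be uint8 and spatially aligned.
    """
    ycrcb = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[..., 0] = fused_gray  # swap in the fused luminance channel
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

# Usage with hypothetical file names:
# visible = cv2.imread("visible.png")                    # H x W x 3
# fused = cv2.imread("fused.png", cv2.IMREAD_GRAYSCALE)  # H x W
# cv2.imwrite("fused_color.png", recolor_fused(fused, visible))
```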

Acknowledgments

The authors would like to thank the editors and the anonymous reviewers for their careful work and valuable suggestions for this study, and declare that there is no conflict of interest regarding the publication of this paper. This work was supported by the National Natural Science Foundation of China (Nos. 62066047, 61966037, 61861045), the "Famous Teacher of Teaching" program of the Yunnan 10000 Talents Program, the Key Project of the Basic Research Program of Yunnan Province (No. 202101AS070031), the General Project of the National Natural Science Foundation of China (No. 81771928), the Yunnan University Research Innovation Fund for Graduate Students (No. 2020298), and the Scientific Research Fund Project of the Education Department of Yunnan Province for Graduate Students (No. 2021Y022).

Author information

Corresponding author

Correspondence to Haiyan Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ding, Z., Li, H., Zhou, D. et al. A robust infrared and visible image fusion framework via multi-receptive-field attention and color visual perception. Appl Intell 53, 8114–8132 (2023). https://doi.org/10.1007/s10489-022-03952-z
