Attention based dual UNET network for infrared and visible image fusion

Wang, Xuejiao; Hua, Zhen; Li, Jinjiang

doi:10.1007/s11042-024-18196-x

Published: 22 January 2024

Multimedia Tools and Applications

Abstract

Accurately extracting the effective information and contour features of source images remains a difficult problem in infrared and visible image fusion. Existing fusion algorithms lack a principled basis for extracting feature information and allocating fusion weights, which can easily cause information loss or redundancy and degrade the fusion result. This paper therefore proposes an attention-based dual U-Net network for fusing infrared and visible images. The method integrates the feature information of two structurally different U-Net branches. The inner, nest-connected network extracts and fuses multi-scale infrared and visible features, and an attention mechanism introduced in the decoding block strengthens the model's focus on the salient features of the image to generate rich semantic information. The outer U-Net extracts deep features of the fused image, enhances its contour features, and improves the retention of spatial features. Comparison with six existing fusion methods on a public dataset shows that the proposed algorithm performs better in both subjective visual quality and objective evaluation.
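The abstract does not say which form of attention the decoding block uses. As a rough illustration only, the sketch below shows one common option, squeeze-and-excitation style channel attention, applied to a toy feature map. It is an assumption and not the authors' implementation: the function names, the layer sizes, and the random weights are all hypothetical, and a real model would use a deep learning framework with trained weights.

```python
import math
import random


def global_average_pool(feature_map):
    """Collapse each channel (a 2D list of floats) to its mean value."""
    pooled = []
    for channel in feature_map:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        pooled.append(total / count)
    return pooled


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vector, weights, activation):
    """Bias-free fully connected layer followed by an activation."""
    return [activation(sum(w * v for w, v in zip(row, vector))) for row in weights]


def channel_attention(feature_map, w_reduce, w_expand):
    """Scale every channel by a score in (0, 1) computed from its global statistics."""
    squeezed = global_average_pool(feature_map)   # one value per channel
    hidden = dense(squeezed, w_reduce, relu)      # bottleneck layer
    scores = dense(hidden, w_expand, sigmoid)     # one weight per channel
    reweighted = [
        [[score * value for value in row] for row in channel]
        for channel, score in zip(feature_map, scores)
    ]
    return reweighted, scores


# Toy stand-in for a decoder feature map: 4 channels of 3x3 values.
# The weights are random placeholders, not learned parameters.
rng = random.Random(0)
channels, height, width, hidden_units = 4, 3, 3, 2
features = [
    [[rng.random() for _ in range(width)] for _ in range(height)]
    for _ in range(channels)
]
w_reduce = [[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(hidden_units)]
w_expand = [[rng.uniform(-1, 1) for _ in range(hidden_units)] for _ in range(channels)]

attended, scores = channel_attention(features, w_reduce, w_expand)
print("per-channel attention weights:", [round(s, 3) for s in scores])
```

Each channel is multiplied by a single scalar, so the spatial layout of the features is unchanged; only the relative emphasis between channels shifts. Spatial attention, which weights positions rather than channels, would be an equally plausible reading of the abstract.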

Data Availability

Data available on request from the authors.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (62002200, 62202268, 62272281), the Shandong Natural Science Foundation of China (ZR2023MF026), and the Yantai Science and Technology Innovation Development Plan (2022JCYJ031).

Author information

Authors and Affiliations

School of Computer Science and Technology, Shandong Technology and Business University, Yantai, 264005, China
Xuejiao Wang & Jinjiang Li
School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai, 264005, China
Zhen Hua

Corresponding author

Correspondence to Jinjiang Li.

Ethics declarations

Conflict of Interests

The authors declare that no potential competing interests exist. There is no undisclosed relationship that may pose a competing interest, and no undisclosed funding source that may pose a competing interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, X., Hua, Z. & Li, J. Attention based dual UNET network for infrared and visible image fusion. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18196-x

Received: 06 May 2022
Revised: 21 September 2023
Accepted: 05 January 2024
Published: 22 January 2024
DOI: https://doi.org/10.1007/s11042-024-18196-x
