Abstract
The goal of infrared and visible image fusion is to combine significant targets and abundant texture details across multiple visual scenarios. However, existing fusion methods do not cope effectively with the variety of scenarios encountered in practice, including small objects, multiple objects, noise, low light, light pollution, and overexposure. To better adapt to multiple visual scenarios, we propose a general infrared and visible image fusion method based on saliency weights, termed MVSFusion. First, we use a support vector machine (SVM) to classify visible images into two categories according to lighting conditions: low-light and brightly lit. Designing fusion rules for each lighting condition ensures adaptability across visual scenarios, and our saliency weights preserve the saliency of both small and multiple objects in different scenes. In addition, we propose a new texture-detail fusion method and an adaptive brightness enhancement technique to better handle scenarios such as noise, light pollution, nighttime, and overexposure. Extensive experiments show that MVSFusion not only surpasses state-of-the-art algorithms in visual quality and quantitative evaluation but also provides effective support for high-level vision tasks. Our code is publicly available at: https://github.com/VCMHE/MVSFusion.
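The SVM-based classification step described above can be illustrated with a minimal sketch. The feature choice here (mean luminance and its standard deviation) and the synthetic training data are assumptions for illustration only; the paper's actual features and training set are not specified in this abstract.

```python
# Hypothetical sketch: an SVM routing visible images to low-light vs
# brightly lit fusion rules, as described in the abstract. Features and
# data are illustrative assumptions, not the paper's implementation.
import numpy as np
from sklearn.svm import SVC

def brightness_features(img):
    """Summarize a grayscale image (float array in [0, 1]) by simple
    luminance statistics used as SVM input."""
    return [float(img.mean()), float(img.std())]

rng = np.random.default_rng(0)
# Synthetic training set: low-light images cluster at low mean intensity,
# brightly lit images at high mean intensity.
dark = [rng.uniform(0.0, 0.3, (8, 8)) for _ in range(20)]
bright = [rng.uniform(0.5, 1.0, (8, 8)) for _ in range(20)]
X = np.array([brightness_features(im) for im in dark + bright])
y = np.array([0] * 20 + [1] * 20)  # 0 = low-light, 1 = brightly lit

clf = SVC(kernel="rbf").fit(X, y)

# A dim test image should be routed to the low-light fusion rules.
test = rng.uniform(0.0, 0.2, (8, 8))
label = int(clf.predict([brightness_features(test)])[0])
print("low-light" if label == 0 else "brightly lit")
```

In a full pipeline, the predicted class would select which set of fusion rules (low-light or brightly lit) is applied to the infrared/visible pair.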
Data availability
The datasets generated and analyzed during the current study are available at: https://github.com/VCMHE/MVSFusion.
Acknowledgements
This work was supported in part by the Yunnan Provincial Major Science and Technology Special Plan Projects under Grant 202202AD080003, in part by the National Natural Science Foundation of China under Grants 62202416, 62162068, and 62162065, in part by the Yunnan Province Ten Thousand Talents Program and Yunling Scholars Special Project under Grant YNWR-YLXZ-2018-022, in part by the Yunnan Provincial Science and Technology Department-Yunnan University "Double First Class" Construction Joint Fund Project under Grant 202301BF070001-025, and in part by the Research Foundation of Yunnan Province under Grant 202105AF150011.
Author information
Authors and Affiliations
Contributions
C.L. and K.H. wrote the main manuscript text; D.X. funded this work; Y.L. and Y.Z. prepared the experimental data and verified the experimental results. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, C., He, K., Xu, D. et al. MVSFusion: infrared and visible image fusion method for multiple visual scenarios. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03273-x