Abstract
Infrared and visible image fusion (IVIF) is a widely used technique in instrument-related fields. It aims to extract contrast information from the infrared image and texture details from the visible image, and to combine these two kinds of information into a single image. Most auto-encoder-based methods train the network on natural images, such as MS-COCO, and test the model on IVIF datasets. Such methods suffer from domain shift and cannot generalize well in real-world scenarios. To this end, we propose a self-supervised test-time training (TTT) approach that facilitates learning a better fusion result. Specifically, a new self-supervised loss is developed to evaluate the quality of the fusion result. This loss function directs the network to improve fusion quality by optimizing the model parameters with a small number of iterations at test time. Besides, instead of manually designing fusion strategies, we leverage a fusion adapter to automatically learn fusion rules. Experimental comparisons on two public IVIF datasets validate that the proposed method outperforms existing methods both subjectively and objectively.
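The core idea of the abstract — optimizing model parameters on the test pair itself with a self-supervised quality loss — can be illustrated with a minimal sketch. Everything below is a hypothetical toy: a single fusion weight stands in for the network, and the loss (keep infrared intensity, keep visible-image texture) is an assumption chosen for illustration, not the paper's actual loss or fusion adapter.

```python
import numpy as np

def ttt_fuse(ir, vis, steps=50, lr=5.0):
    """Toy test-time training: learn one fusion weight per image pair
    by minimising a self-supervised quality loss on the test inputs."""
    theta = 2.0  # logit of the fusion weight (deliberately off-optimum)
    losses = []
    g_ir = np.gradient(ir)    # texture proxy for the infrared image
    g_vis = np.gradient(vis)  # texture proxy for the visible image
    for _ in range(steps):
        a = 1.0 / (1.0 + np.exp(-theta))        # fusion weight in (0, 1)
        fused = a * ir + (1.0 - a) * vis
        # self-supervised loss: preserve IR intensity and visible texture
        l_int = np.mean((fused - ir) ** 2)
        l_tex = sum(np.mean((a * gi + (1 - a) * gv - gv) ** 2)
                    for gi, gv in zip(g_ir, g_vis))
        losses.append(l_int + l_tex)
        # analytic gradient of the loss w.r.t. the fusion weight a
        dl_da = np.mean(2 * (fused - ir) * (ir - vis))
        dl_da += sum(np.mean(2 * (a * gi + (1 - a) * gv - gv) * (gi - gv))
                     for gi, gv in zip(g_ir, g_vis))
        theta -= lr * dl_da * a * (1.0 - a)     # chain rule through sigmoid
    a = 1.0 / (1.0 + np.exp(-theta))
    return a * ir + (1.0 - a) * vis, losses

rng = np.random.default_rng(0)
ir = rng.random((32, 32))   # stand-in infrared image
vis = rng.random((32, 32))  # stand-in visible image
fused, losses = ttt_fuse(ir, vis)
```

In the paper, the per-pair weight would be replaced by the parameters of an auto-encoder plus the fusion adapter, updated for a few iterations per test pair; the sketch only shows why a self-supervised loss makes such test-time updates possible without ground-truth fused images.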
References
Aslantas, V., Bendes, E.: A new image quality metric for image fusion: the sum of the correlations of differences. AEU-Int. J. Electron. Commun. 69(12), 1890–1896 (2015)
Das, S., Zhang, Y.: Color night vision for navigation and surveillance. Transp. Res. Rec. 1708(1), 40–46 (2000)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Gandelsman, Y., Sun, Y., Chen, X., Efros, A.: Test-time training with masked autoencoders. Adv. Neural. Inf. Process. Syst. 35, 29374–29385 (2022)
Gao, Y., Ma, S., Liu, J.: DCDR-GAN: a densely connected disentangled representation generative adversarial network for infrared and visible image fusion. IEEE Trans. Circ. Syst. Video Technol. (2022)
Li, H., Wu, X.J.: DenseFuse: a fusion approach to infrared and visible images. IEEE Trans. Image Process. 28(5), 2614–2623 (2018)
Li, H., Wu, X.J., Durrani, T.: NestFuse: an infrared and visible image fusion architecture based on nest connection and spatial/channel attention models. IEEE Trans. Instrum. Meas. 69(12), 9645–9656 (2020)
Li, H., Wu, X.J., Kittler, J.: Infrared and visible image fusion using a deep learning framework. In: 2018 24th International Conference On Pattern Recognition (ICPR), pp. 2705–2710. IEEE (2018)
Li, H., Wu, X.J., Kittler, J.: MDLatLRR: a novel decomposition method for infrared and visible image fusion. IEEE Trans. Image Process. 29, 4733–4746 (2020)
Li, Q., et al.: A multilevel hybrid transmission network for infrared and visible image fusion. IEEE Trans. Instrum. Meas. 71, 1–14 (2022)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Lin, X., Zhou, G., Tu, X., Huang, Y., Ding, X.: Two-level consistency metric for infrared and visible image fusion. IEEE Trans. Instrum. Meas. 71, 1–13 (2022)
Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., Ma, Y.: Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 171–184 (2012)
Liu, G., Lin, Z., Yu, Y.: Robust subspace segmentation by low-rank representation. In: Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 663–670 (2010)
Liu, H., Wu, Z., Li, L., Salehkalaibar, S., Chen, J., Wang, K.: Towards multi-domain single image dehazing via test-time training. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5831–5840 (2022)
Ma, J., Chen, C., Li, C., Huang, J.: Infrared and visible image fusion via gradient transfer and total variation minimization. Inform. Fusion 31, 100–109 (2016)
Ma, J., Xu, H., Jiang, J., Mei, X., Zhang, X.P.: DDcGAN: a dual-discriminator conditional generative adversarial network for multi-resolution image fusion. IEEE Trans. Image Process. 29, 4980–4995 (2020)
Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(11) (2008)
Piella, G.: A general framework for multiresolution image fusion: from pixels to regions. Inform. Fusion 4(4), 259–280 (2003)
Roberts, J.W., Van Aardt, J.A., Ahmed, F.B.: Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J. Appl. Remote Sens. 2(1), 023522 (2008)
Sheikh, H.R., Bovik, A.C.: Image information and visual quality. IEEE Trans. Image Process. 15(2), 430–444 (2006)
Sun, Y., Wang, X., Liu, Z., Miller, J., Efros, A., Hardt, M.: Test-time training with self-supervision for generalization under distribution shifts. In: International Conference on Machine Learning, pp. 9229–9248. PMLR (2020)
Tang, W., He, F., Liu, Y.: YDTR: infrared and visible image fusion via Y-shape dynamic transformer. IEEE Trans. Multimedia (2022)
Toet, A.: The TNO multiband image data collection. Data Brief 15, 249–251 (2017)
Vishwakarma, A.: Image fusion using adjustable non-subsampled shearlet transform. IEEE Trans. Instrum. Meas. 68(9), 3367–3378 (2018)
Wang, Z., Wu, Y., Wang, J., Xu, J., Shao, W.: Res2Fusion: infrared and visible image fusion based on dense Res2Net and double nonlocal attention models. IEEE Trans. Instrum. Meas. 71, 1–12 (2022)
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 2003, vol. 2, pp. 1398–1402. IEEE (2003)
Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2008)
Xu, H., Ma, J., Jiang, J., Guo, X., Ling, H.: U2Fusion: a unified unsupervised image fusion network. IEEE Trans. Pattern Anal. Mach. Intell. 44(1), 502–518 (2020)
Zhang, Q., Fu, Y., Li, H., Zou, J.: Dictionary learning method for joint sparse representation-based image fusion. Opt. Eng. 52(5), 057006–057006 (2013)
Zhang, X., Demiris, Y.: Visible and infrared image fusion using deep learning. IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Zhang, X., Ye, P., Xiao, G.: VIFB: a visible and infrared image fusion benchmark. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 104–105 (2020)
Acknowledgements
The work was supported in part by the National Natural Science Foundation of China under Grant 82172033, U19B2031, 61971369, 52105126, 82272071, 62271430, and the Fundamental Research Funds for the Central Universities 20720230104.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zheng, G., Fu, Z., Lin, X., Chu, X., Huang, Y., Ding, X. (2024). Infrared and Visible Image Fusion via Test-Time Training. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14434. Springer, Singapore. https://doi.org/10.1007/978-981-99-8549-4_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8548-7
Online ISBN: 978-981-99-8549-4