AFSRNet: learning local descriptors with adaptive multi-scale feature fusion and symmetric regularization

Li, Dong; Liang, Haowen; Lam, Kin-Man

doi:10.1007/s10489-024-05418-w

Published: 20 April 2024

Applied Intelligence

Dong Li¹,
Haowen Liang¹ &
Kin-Man Lam²

Abstract

Multi-scale feature fusion has been widely used in handcrafted descriptors but has not been fully explored in deep learning-based descriptor extraction. Simply concatenating descriptors from different scales has not significantly improved performance on computer vision tasks. In this paper, we propose a novel convolutional neural network based on center-surround adaptive multi-scale feature fusion. Our approach enables the network to focus on different center-surround scales, which improves performance. We also introduce a novel regularization technique that uses second-order similarity to constrain the learning of local descriptors, based on the symmetric property of the similarity matrix. The proposed method outperforms single-scale and simple-concatenation descriptors on two datasets and achieves state-of-the-art results on the Brown dataset. Furthermore, our method demonstrates excellent generalization ability on the HPatches dataset. Our code is released on GitHub: https://github.com/Leung-GD/AFSRNet/tree/main.
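
The two ideas named in the abstract can be illustrated with a small NumPy sketch. This is not the released AFSRNet code, and the exact formulations are not given on this page. The sketch assumes that (a) adaptive fusion can be approximated by a per-sample softmax weighting of a center-scale and a surround-scale feature vector, and (b) the symmetric regularizer penalizes the difference between the anchor-positive similarity matrix and its transpose. All function names and parameters below are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def similarity_matrix(anchors, positives):
    """Cosine similarity between every anchor and every positive descriptor.

    Entry (i, j) compares anchor i with positive j, so the result is (n, n).
    """
    return l2_normalize(anchors) @ l2_normalize(positives).T


def symmetry_penalty(sim):
    """Mean squared difference between sim and its transpose.

    Assumption: this stands in for a regularizer built on the symmetric
    property of the similarity matrix; it is zero when sim is symmetric.
    """
    diff = sim - sim.T
    return float(np.mean(diff ** 2))


def adaptive_fusion(center_feat, surround_feat, logits):
    """Blend two scale-specific feature vectors with softmax weights.

    Assumption: `logits` has shape (n, 2), one score per scale per sample,
    e.g. produced by a small learned branch (not modeled here).
    """
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(z)
    w = w / w.sum(axis=1, keepdims=True)
    return w[:, :1] * center_feat + w[:, 1:] * surround_feat
```

In a training loop, a term such as `symmetry_penalty(similarity_matrix(a, p))` would typically be added to a main descriptor loss with a small weight. The weighting and the main loss used by the authors are not stated here.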

Data Availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported by the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2021A1515011867).

Author information

Authors and Affiliations

Guangdong University of Technology, Guangzhou, 510006, China
Dong Li & Haowen Liang
The Hong Kong Polytechnic University, Hong Kong, 999077, China
Kin-Man Lam

Corresponding author

Correspondence to Haowen Liang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, D., Liang, H. & Lam, KM. AFSRNet: learning local descriptors with adaptive multi-scale feature fusion and symmetric regularization. Appl Intell (2024). https://doi.org/10.1007/s10489-024-05418-w

Accepted: 23 March 2024
Published: 20 April 2024
DOI: https://doi.org/10.1007/s10489-024-05418-w
