Skip to main content
Log in

AFSRNet: learning local descriptors with adaptive multi-scale feature fusion and symmetric regularization

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

Multi-scale feature fusion has been widely used in handcrafted descriptors, but has not been fully explored in deep learning-based descriptor extraction. Simple concatenation of descriptors of different scales has not been successful in significantly improving performance for computer vision tasks. In this paper, we propose a novel convolutional neural network, based on center-surround adaptive multi-scale feature fusion. Our approach enables the network to focus on different center-surround scales, resulting in improved performance. We also introduce a novel regularization technique that uses second-order similarity to constrain the learning of local descriptors, based on the symmetric property of the similarity matrix. The proposed method outperforms single-scale or simple-concatenation descriptors on two datasets and achieves state-of-the-art results on the Brown dataset. Furthermore, our method demonstrates excellent generalization ability on the HPatches dataset. Our code is released on GitHub: https://github.com/Leung-GD/AFSRNet/tree/main.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

Data Availability

The datasets generated during and analysed during the current study are available from the corresponding author on reasonable request.

References

  1. Xue J, Hou X, Zeng Y (2021) Review of image-based 3d reconstruction of building for automated construction progress monitoring. Appl Sci 11(17)

  2. Ganesan K, Ganapathi II, Javed S et al (2023) Multimodal hybrid features in 3d ear recognition. Appl Intell 53(10):11,618-11,635

    Article  Google Scholar 

  3. Cai Y, Li L, Wang D et al (2023) Htmatch: An efficient hybrid transformer based graph neural network for local feature matching. Signal Process 204(108):859

    Google Scholar 

  4. Di Y, Liao Y, Zhou H et al (2023) Femip: detector-free feature matching for multimodal images with policy gradient. Appl Intell 53(20):24068–24088

    Article  Google Scholar 

  5. Zhu F, Zhu X, Huang Z et al (2021) Deep learning based data-adaptive descriptor for non-rigid multi-modal medical image registration. Signal Process 183(108):023

    Google Scholar 

  6. Ma J, Jiang X, Fan A et al (2021) Image matching from handcrafted to deep features: A survey. Int J Comput Vis 129(1):23–79

    Article  MathSciNet  Google Scholar 

  7. Jin Y, Mishkin D, Mishchuk A et al (2021) Image matching across wide baselines: From paper to practice. Int J Comput Vis 129(2):517–547

    Article  Google Scholar 

  8. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. International journal of computer vision 60:91–110

    Article  Google Scholar 

  9. Bay H, Ess A, Tuytelaars T et al (2008) Speeded-up robust features (surf). Comput. Vis. Image Underst 110(3):346–359

    Article  Google Scholar 

  10. Tian Y, Fan B, Wu F (2017) L2-net: Deep learning of discriminative patch descriptor in euclidean space. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)

  11. Mishchuk A, Mishkin D, Radenovic F et al (2017) Working hard to know your neighbor’s margins: Local descriptor learning loss. In: Guyon I, Luxburg UV, Bengio S et al (eds) Advances in Neural Information Processing Systems, vol 30. Curran Associates Inc

    Google Scholar 

  12. Hausler S, Garg S, Xu M, et al (2021) Patch-netvlad: Multi-scale fusion of locally-global descriptors for place recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 14,141–14,152

  13. Xu Y, Gong M, Liu T et al (2019) Robust angular local descriptor learning. In: Jawahar C, Li H, Mori G et al (eds) Computer Vision - ACCV 2018. Springer International Publishing, Cham, pp 420–435

    Chapter  Google Scholar 

  14. Tian Y, Yu X, Fan B, et al (2019) Sosnet: Second order similarity regularization for local descriptor learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  15. Wang S, Guo X, Tie Y, et al (2021) Local feature descriptors with deep hypersphere learning. In: 2021 IEEE international conference on image processing (ICIP), pp 1524–1528

  16. Zhang J, Jiao L, Ma W et al (2023) Rdlnet: A regularized descriptor learning network. IEEE Trans Neural Netw Learn Syst 34(9):5669–5681

    Article  Google Scholar 

  17. Zhang L, Rusinkiewicz S (2019) Learning local descriptors with a cdf-based dynamic soft margin. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV)

  18. Liang P, Ji H, Cheng E et al (2021) Learning local descriptors with multi-level feature aggregation and spatial context pyramid. Neurocomputing 461:99–108

    Article  Google Scholar 

  19. Zhang P, Zhang C, Liu B et al (2022) Leveraging local and global descriptors in parallel to search correspondences for visual localization. Pattern Recognit 122(108):344

    Google Scholar 

  20. He Y, Hu Y, Zhao W, et al (2023) Darkfeat: noise-robust feature detector and descriptor for extremely low-light raw images. In: Proceedings of the AAAI conference on artificial intelligence, pp 826–834

  21. Lin TY, Dollár P, Girshick R, et al (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2117–2125

  22. Deng C, Wang M, Liu L et al (2022) Extended feature pyramid network for small object detection. IEEE Trans Multimed 24:1968–1979

    Article  Google Scholar 

  23. Jiang K, Wang Z, Yi P, et al (2020) Multi-scale progressive fusion network for single image deraining. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8346–8355

  24. Wang G, Gan X, Cao Q et al (2023) Mfanet: multi-scale feature fusion network with attention mechanism. Vis Comput 39(7):2969–2980

    Article  Google Scholar 

  25. He K, Zhang X, Ren S et al (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916

    Article  Google Scholar 

  26. Chen LC, Papandreou G, Kokkinos I et al (2017) Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848

    Article  Google Scholar 

  27. Li Y, Chen Y, Wang N, et al (2019) Scale-aware trident networks for object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6054–6063

  28. Balntas V, Riba E, Ponsa D, et al (2016) Learning local feature descriptors with triplets and shallow convolutional neural networks. In: BMVC, p 3

  29. Tian Y, Barroso Laguna A, Ng T, et al (2020) Hynet: Learning local descriptor with hybrid similarity measure and triplet loss. In: Larochelle H, Ranzato M, Hadsell R, et al (eds) Advances in neural information processing systems, vol 33. Curran Associates, Inc., pp 7401–7412

  30. Brown M, Hua G, Winder S (2011) Discriminative learning of local image descriptors. IEEE Trans Pattern Anal Mach Intell 33(1):43–57

    Article  Google Scholar 

  31. Balntas V, Lenc K, Vedaldi A, et al (2017) Hpatches: A benchmark and evaluation of handcrafted and learned local descriptors. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)

  32. Miao Y, Lin Z, Ma X et al (2021) Learning transformation-invariant local descriptors with low-coupling binary codes. IEEE Trans Image Process 30:7554–7566

    Article  MathSciNet  Google Scholar 

  33. Fan B, Liu H, Zeng H et al (2021) Deep unsupervised binary descriptor learning through locality consistency and self distinctiveness. IEEE Trans Multimed 23:2770–2781

    Article  Google Scholar 

  34. Wang W, Zhang L, Huang H (2023) Revisiting unsupervised local descriptor learning. In: Proceedings of the AAAI conference on artificial intelligence, pp 2680–2688

  35. Yin J, Liu Q, Meng F et al (2022) Stcdesc: Learning deep local descriptor using similar triangle constraint. Knowl Based Syst 248(108):799

    Google Scholar 

  36. Quan D, Wang S, Li Y et al (2021) Multi-relation attention network for image patch matching. IEEE Trans Image Process 30:7127–7142

    Article  Google Scholar 

  37. Yu C, Liu Y, Li C et al (2022) Multibranch feature difference learning network for cross-spectral image patch matching. IEEE Trans Geosci Remote Sensing 60:1–15

    Google Scholar 

Download references

Acknowledgements

This work was supported by the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2021A1515011867).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Haowen Liang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Li, D., Liang, H. & Lam, KM. AFSRNet: learning local descriptors with adaptive multi-scale feature fusion and symmetric regularization. Appl Intell (2024). https://doi.org/10.1007/s10489-024-05418-w

Download citation

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s10489-024-05418-w

Keywords

Navigation