Towards stronger illumination robustness of local feature detection and description based on auxiliary learning

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Local feature detection and description play a crucial role in various computer vision tasks, including image matching. Variations in illumination conditions significantly affect the accuracy of these applications, yet existing methods address this issue inadequately. In this paper, a novel algorithm based on an illumination auxiliary learning module (IALM) is introduced. Firstly, a new local feature extractor named illumination auxiliary Superpoint (IA-Superpoint) is established by integrating IALM into Superpoint. Secondly, illumination-aware auxiliary training captures the effects of illumination variations during feature extraction through tailored loss functions and a joint learning mechanism. Lastly, to evaluate the illumination robustness of local features, a metric is proposed that simulates various illumination disturbances. Experiments on the HPatches and RDNIM datasets demonstrate that our method greatly improves local feature extraction: compared to the baseline, it achieves higher mean matching accuracy and better homography estimation.
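To make the approach concrete, the following is a minimal sketch (not the authors' released code) of what auxiliary learning for illumination robustness can look like: a SuperPoint-style shared encoder with detector and descriptor heads, plus a hypothetical auxiliary head standing in for IALM that regresses the illumination perturbation applied to the training image. All module names, layer shapes, and the scalar-perturbation target are illustrative assumptions.

```python
import torch
import torch.nn as nn

class IASuperPointSketch(nn.Module):
    """Hypothetical IA-Superpoint-like network: shared encoder,
    detector/descriptor heads, and an auxiliary illumination head."""

    def __init__(self, feat_dim=128):
        super().__init__()
        # shared encoder (stand-in for the SuperPoint VGG-style backbone)
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # detector head: 65 channels = 8x8 keypoint cells + "no keypoint" bin
        self.detector = nn.Conv2d(128, 65, 1)
        # descriptor head: dense local descriptors
        self.descriptor = nn.Conv2d(128, feat_dim, 1)
        # assumed auxiliary head: regress the scalar illumination
        # perturbation (e.g. gamma) applied to the input image
        self.illum_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1)
        )

    def forward(self, x):
        f = self.encoder(x)
        return self.detector(f), self.descriptor(f), self.illum_head(f)

def joint_loss(det_logits, det_target, desc_loss, illum_pred, illum_target,
               aux_weight=0.1):
    # main detection term plus a weighted auxiliary term; the auxiliary
    # term pushes the shared encoder to model illumination explicitly
    det_loss = nn.functional.cross_entropy(det_logits, det_target)
    aux_loss = nn.functional.mse_loss(illum_pred.squeeze(1), illum_target)
    return det_loss + desc_loss + aux_weight * aux_loss
```

The design intuition is that the auxiliary loss forces the shared encoder to represent illumination explicitly, so the detector and descriptor heads can learn features that are less sensitive to it; an auxiliary head of this kind is typically discarded at inference time.

The proposed robustness metric is described only as simulating various illumination disturbances. A plausible, simplified version (again an assumption, using a gamma/gain photometric model and OpenCV's ORB as a stand-in detector) measures how repeatable keypoints are under each disturbance, exploiting the fact that a purely photometric change leaves keypoint geometry fixed:

```python
import cv2
import numpy as np

def perturb_illumination(img, gamma, gain):
    # simple photometric model of an illumination change (an assumption)
    out = gain * (img.astype(np.float32) / 255.0) ** gamma
    return np.clip(out * 255.0, 0, 255).astype(np.uint8)

def repeatability(detector, img, gamma, gain, tol=3.0):
    # fraction of keypoints re-detected within `tol` pixels after the
    # disturbance; the geometry is unchanged, so locations should coincide
    kps_ref = detector.detect(img, None)
    kps_pert = detector.detect(perturb_illumination(img, gamma, gain), None)
    if not kps_ref or not kps_pert:
        return 0.0
    pts_pert = np.array([k.pt for k in kps_pert])
    hits = sum(
        np.min(np.linalg.norm(pts_pert - np.array(k.pt), axis=1)) <= tol
        for k in kps_ref
    )
    return hits / len(kps_ref)

# usage: sweep a small grid of disturbances and report the mean score
img = cv2.imread("example.png", cv2.IMREAD_GRAYSCALE)  # assumed test image
orb = cv2.ORB_create()
scores = [repeatability(orb, img, g, a)
          for g, a in [(0.5, 1.0), (1.5, 1.0), (1.0, 0.6), (1.0, 1.4)]]
print("mean illumination repeatability:", np.mean(scores))
```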


Data availability

The datasets used or analysed during the current study are available from the corresponding author on reasonable request.


Funding

This work was supported by the National Natural Science Foundation of China (62073024).

Author information

Contributions

H.B., S.F., and H.Z. wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Lunming Qin.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest as defined by Springer, nor any other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bian, H., Fan, S., Zhang, H. et al. Towards stronger illumination robustness of local feature detection and description based on auxiliary learning. SIViP (2024). https://doi.org/10.1007/s11760-024-03175-4
