A Faster, Lighter and Stronger Deep Learning-Based Approach for Place Recognition

  • Conference paper
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1682)

Abstract

Visual Place Recognition is a vital part of image localization and loop closure detection systems, and it has attracted widespread interest in multiple domains such as computer vision, robotics and AR/VR. In this work, we propose a faster, lighter and stronger approach that produces models with fewer parameters and spends less time in the inference stage. We design RepVGG-lite as the backbone network of our architecture; it is more discriminative than other general-purpose networks in the Place Recognition task, and it is faster while achieving higher performance. In the feature extraction stage, we extract patch-level descriptors at only a single scale from global descriptors. We then design a trainable, attention-based feature matcher that exploits both the spatial relationships and the visual appearance of the features. Extensive experiments on difficult datasets show that the proposed approach outperforms previous advanced learning-based approaches while achieving even higher inference speed. Our system has 14 times fewer parameters than Patch-NetVLAD and 6.8 times lower theoretical FLOPs, and it runs 21 and 33 times faster in feature extraction and feature matching, respectively. Moreover, the Recall@1 of our approach is 0.5% better than that of Patch-NetVLAD. We also conducted experiments on subsets of the Mapillary Street Level Sequences dataset to cover the other challenging conditions.
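
The abstract reports results as Recall@1. As a minimal sketch of how such a retrieval metric is commonly computed in place recognition (not the authors' evaluation code), the Python below ranks database images by cosine similarity and counts a query as correct if any of its top-N retrievals is a true match. The function name `recall_at_n`, the descriptor arrays, and the `positives` ground-truth sets are all hypothetical, introduced only for illustration.

```python
import numpy as np

def recall_at_n(query_desc, db_desc, positives, n_values=(1, 5, 10)):
    """Fraction of queries with at least one true match in the top-N results.

    query_desc: (Q, D) array of query descriptors (hypothetical input).
    db_desc:    (M, D) array of database descriptors (hypothetical input).
    positives:  list of Q sets; positives[i] holds the database indices
                that count as correct matches for query i (assumed given).
    """
    # L2-normalise so that the dot product equals cosine similarity.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    d = db_desc / np.linalg.norm(db_desc, axis=1, keepdims=True)
    sims = q @ d.T  # (Q, M) similarity matrix

    # Sort database indices by decreasing similarity for every query.
    ranked = np.argsort(-sims, axis=1)

    recalls = {}
    for n in n_values:
        hits = sum(
            1 for i, pos in enumerate(positives)
            if set(ranked[i, :n].tolist()) & set(pos)
        )
        recalls[n] = hits / len(positives)
    return recalls

# Toy usage: two queries, three database images.
queries = np.array([[1.0, 0.0], [0.0, 1.0]])
database = np.array([[1.0, 0.1], [0.1, 1.0], [1.0, 1.0]])
ground_truth = [{0}, {2}]
print(recall_at_n(queries, database, ground_truth, n_values=(1, 2)))
```

In the toy data the first query retrieves its true match at rank 1, while the second finds its match only at rank 2, so Recall@1 is lower than Recall@2. How a true match is defined (for example, a distance threshold between camera positions) depends on the dataset and is assumed here to be given.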

References

  1. Arandjelovic, R., et al.: NetVLAD: CNN architecture for weakly supervised place recognition. In: CVPR, pp. 5297–5307 (2016)

  2. DeTone, D., et al.: SuperPoint: self-supervised interest point detection and description. In: CVPR, pp. 224–236 (2018)

  3. Ding, X., et al.: RepVGG: making VGG-style convnets great again. In: CVPR, pp. 13733–13742 (2021)

  4. Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: Part I. IEEE Robot. Autom. Mag. 13(2), 99–110 (2006)

  5. Dusmanu, M., et al.: D2-Net: a trainable CNN for joint description and detection of local features. In: CVPR, pp. 8092–8101 (2019)

  6. Hausler, S., et al.: Patch-NetVLAD: multi-scale fusion of locally-global descriptors for place recognition. In: CVPR, pp. 14141–14152 (2021)

  7. Newman, P., Ho, K.: SLAM-loop closing with visually salient features. In: Proceedings of the 2005 IEEE International Conference on Robotics and Automation, pp. 635–642. IEEE (2005)

  8. Peyré, G., Cuturi, M., et al.: Computational optimal transport: with applications to data science. Found. Trends® Mach. Learn. 11(5–6), 355–607 (2019)

  9. Revaud, J., Almazán, J., Rezende, R.S., de Souza, C.R.: Learning with average precision: training image retrieval with a listwise loss. In: CVPR, pp. 5107–5116 (2019)

  10. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision, pp. 2564–2571. IEEE (2011)

  11. Sarlin, P.-E., DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperGlue: learning feature matching with graph neural networks. In: CVPR, pp. 4938–4947 (2020)

  12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  13. Torii, A., et al.: 24/7 place recognition by view synthesis. In: CVPR, pp. 1808–1817 (2015)

  14. Torii, A., Sivic, J., Pajdla, T., Okutomi, M.: Visual place recognition with repetitive structures. In: CVPR, pp. 883–890 (2013)

  15. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

  16. Warburg, F., et al.: Mapillary street-level sequences: a dataset for lifelong place recognition. In: CVPR, pp. 2626–2635 (2020)

Author information

Corresponding author

Correspondence to Songzhi Su.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Huang, R., Huang, Z., Su, S. (2023). A Faster, Lighter and Stronger Deep Learning-Based Approach for Place Recognition. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1682. Springer, Singapore. https://doi.org/10.1007/978-981-99-2385-4_34

  • DOI: https://doi.org/10.1007/978-981-99-2385-4_34

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-2384-7

  • Online ISBN: 978-981-99-2385-4

  • eBook Packages: Computer Science, Computer Science (R0)
