COLCONF: Collaborative ConvNet Features-based Robust Visual Place Recognition for Varying Environments

  • Research Article - Computer Engineering and Computer Science
  • Published in: Arabian Journal for Science and Engineering

Abstract

Several deep learning features have recently been proposed for visual place recognition (VPR). Some of them exploit the information contained in image sequences, while others utilize regions of interest (ROIs) that reside in the feature maps produced by CNN models. It has been shown in the literature that features drawn from a single layer cannot cope with multiple visual challenges at once. In this work, we present a new collaborative VPR approach that takes advantage of ROI feature maps gathered and combined from two different layers to improve recognition performance. We provide an extensive analysis of ROI extraction and of how performance differs from one layer to another. Our approach was evaluated on several benchmark datasets, including ones posing viewpoint and appearance challenges. The results confirm the robustness of the proposed method against state-of-the-art methods: the area under the curve (AUC) and mean average precision (mAP) measures average 91%, compared with 86% for Max Flow and 72% for CAMAL.
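The collaborative idea can be conveyed with a short sketch. The snippet below is a minimal illustration, not the authors' implementation: it assumes a pre-trained VGG16 backbone, arbitrarily picks the conv4_3 and conv5_3 layers as the two feature sources, treats the most strongly activated spatial cells as ROIs, and averages cosine-similarity matching scores across the two layers.

```python
# Minimal sketch of collaborative two-layer ROI matching.
# Assumptions (not from the paper): VGG16 backbone, layers conv4_3 and
# conv5_3, ROIs = top-k activated cells, cosine similarity, mean fusion.
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
LAYERS = {21: "conv4_3", 28: "conv5_3"}  # indices into vgg16.features

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor()])

def layer_maps(img: Image.Image) -> dict:
    """Forward pass that keeps the feature maps of the two chosen layers."""
    x = preprocess(img).unsqueeze(0)
    maps = {}
    with torch.no_grad():
        for i, block in enumerate(vgg):
            x = block(x)
            if i in LAYERS:
                maps[LAYERS[i]] = x.squeeze(0)  # (C, H, W)
    return maps

def roi_descriptors(fmap: torch.Tensor, k: int = 50) -> torch.Tensor:
    """Toy ROI extraction: the k most activated spatial cells act as
    regions of interest; their channel vectors are the descriptors."""
    energy = fmap.sum(dim=0).flatten()        # total activation per cell
    idx = energy.topk(k).indices              # strongest k cells
    desc = fmap.flatten(1)[:, idx].t()        # (k, C) descriptor matrix
    return F.normalize(desc, dim=1)

def collaborative_score(query: dict, ref: dict) -> float:
    """Fuse the two layers: per layer, match every query ROI to its best
    reference ROI, then average the layer scores into one place score."""
    per_layer = []
    for name, qmap in query.items():
        q, r = roi_descriptors(qmap), roi_descriptors(ref[name])
        per_layer.append((q @ r.t()).max(dim=1).values.mean().item())
    return sum(per_layer) / len(per_layer)

# Usage: score = collaborative_score(layer_maps(img_a), layer_maps(img_b))
```

The layer pair, the ROI extraction scheme, and the score combination are exactly the design choices the paper analyzes; the sketch only shows how descriptors from two layers can collaborate in a single matching score.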


References

  1. Milford, M.J.; Wyeth, G.F.: SeqSLAM: visual route-based navigation for sunny summer days and stormy winter nights. In: 2012 IEEE International Conference on Robotics and Automation, pp. 1643–1649. IEEE (2012)

  2. Cummins, M.; Newman, P.: FAB-MAP: probabilistic localization and mapping in the space of appearance. Int. J. Robot. Res. 27(6), 647–665 (2008)

  3. Khaliq, A.; Ehsan, S.; Chen, Z.; Milford, M.; McDonald-Maier, K.: A holistic visual place recognition approach using lightweight CNNs for significant viewpoint and appearance changes. IEEE Trans. Rob. 36(2), 561–569 (2020)

  4. Chen, Z.; Maffra, F.; Sa, I.; Chli, M.: Only look once, mining distinctive landmarks from ConvNet for visual place recognition. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 9–16. IEEE (2017)

  5. Naseer, T.; Burgard, W.; Stachniss, C.: Robust visual localization across seasons. IEEE Trans. Rob. 34(2), 289–302 (2018)

  6. Bay, H.; Tuytelaars, T.; Van Gool, L.: SURF: speeded up robust features. In: Leonardis, A.; Bischof, H.; Pinz, A. (eds.) Computer Vision—ECCV 2006, pp. 404–417. Springer, Berlin, Heidelberg (2006)

  7. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  8. Hafez, A.H.A.; Singh, M.; Krishna, K.M.; Jawahar, C.V.: Visual localization in highly crowded urban environments. In: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2778–2783 (2013)

  9. Arroyo, R.; Alcantarilla, P.F.; Bergasa, L.M.; Yebes, J.J.; Gámez, S.: Bidirectional loop closure detection on panoramas for visual navigation. In: 2014 IEEE Intelligent Vehicles Symposium Proceedings, pp. 1378–1383. IEEE (2014)

  10. Arandjelovic, R.; Gronat, P.; Torii, A.; Pajdla, T.; Sivic, J.: NetVLAD: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5297–5307 (2016)

  11. Sünderhauf, N.; Shirazi, S.; Dayoub, F.; Upcroft, B.; Milford, M.: On the performance of ConvNet features for place recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4297–4304 (2015)

  12. Chen, Z.; Lam, O.; Jacobson, A.; Milford, M.: Convolutional neural network-based place recognition. CoRR (2014). arXiv:1411.1509

  13. Hafez, A.A.; Alqaraleh, S.; Tello, A.: Encoded deep features for visual place recognition. In: 2020 28th Signal Processing and Communications Applications Conference (SIU), pp. 1–4. IEEE (2020)

  14. Kanji, T.: Self-localization from images with small overlap. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4497–4504. IEEE (2016)

  15. Suenderhauf, N.; Shirazi, S.; Jacobson, A.; Dayoub, F.; Pepperell, E.; Upcroft, B.; Milford, M.: Place recognition with ConvNet landmarks: viewpoint-robust, condition-robust, training-free. In: Hsu, D. (ed.) Robotics: Science and Systems, vol. XI, pp. 1–10 (2015)

  16. Li, Z.; Zhou, A.; Wang, M.; Shen, Y.: Deep fusion of multi-layers salient CNN features and similarity network for robust visual place recognition. In: 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 22–29. IEEE (2019)

  17. Hausler, S.; Jacobson, A.; Milford, M.: Multi-process fusion: visual place recognition using multiple image processing methods. IEEE Robot. Autom. Lett. 4(2), 1924–1931 (2019)

  18. Zeiler, M.D.; Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D.; Pajdla, T.; Schiele, B.; Tuytelaars, T. (eds.) Computer Vision—ECCV 2014, pp. 818–833 (2014)

  19. Perronnin, F.; Dance, C.: Fisher kernels on visual vocabularies for image categorization. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007). https://doi.org/10.1109/CVPR.2007.383266

  20. Jégou, H.; Perronnin, F.; Douze, M.; Sánchez, J.; Pérez, P.; Schmid, C.: Aggregating local image descriptors into compact codes. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1704–1716 (2012)

  21. Sánchez, J.; Perronnin, F.; Mensink, T.; Verbeek, J.: Image classification with the fisher vector: theory and practice. Int. J. Comput. Vis. 105(3), 222–245 (2013)

  22. Sivic, J.; Zisserman, A.: Video Google: a text retrieval approach to object matching in videos. In: Proceedings 9th IEEE International Conference on Computer Vision, vol. 2, pp. 1470–1477 (2003)

  23. Arandjelovic, R.; Zisserman, A.: All about VLAD. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1578–1585 (2013)

  24. Jammula, M.: Content based image retrieval system using integrated ML and DL-CNN. Ann. Roman. Soc. Cell Biol. 9656–9666 (2021)

  25. Hamreras, S.; Boucheham, B.; Molina-Cabello, M.A.; Benitez-Rochel, R.; Lopez-Rubio, E.: Content based image retrieval by ensembles of deep learning object classifiers. Integrated Comput.-Aided Eng. 27(3), 317–331 (2020)

  26. Guo, H.; Liu, J.; Xiao, Z.; Xiao, L.: Deep CNN-based hyperspectral image classification using discriminative multiple spatial-spectral feature fusion. Remote Sens. Lett. 11(9), 827–836 (2020)

  27. Shakarami, A.; Tarrah, H.: An efficient image descriptor for image classification and CBIR. Optik 214, 164833 (2020)

  28. Abdul Hafez, A.H.; Arora, M.; Krishna, K.M.; Jawahar, C.: Learning multiple experiences useful visual features for active maps localization in crowded environments. Adv. Robot. 30(1), 50–67 (2016)

  29. Du, K.; Cai, K.Y.: Comparison research on IOT oriented image classification algorithms. In: ITM Web of Conferences, vol. 7, p. 02006. EDP Sciences (2016)

  30. Wang, P.; Liu, L.; Shen, C.; Huang, Z.; van den Hengel, A.; Tao Shen, H.: Multi-attention network for one shot learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2721–2729 (2017)

  31. Chu, B.; Yang, D.; Tadinada, R.: Visualizing residual networks. CoRR (2017). arXiv:1701.02362

  32. Yu, W.; Yang, K.; Bai, Y.; Xiao, T.; Yao, H.; Rui, Y.: Visualizing and comparing AlexNet and VGG using deconvolutional layers. In: Proceedings of the 33rd International Conference on Machine Learning (2016)

  33. Zaffar, M.; Ehsan, S.; Milford, M.; McDonald-Maier, K.: CoHOG: a light-weight, compute-efficient, and training-free visual place recognition technique for changing environments. IEEE Robot. Autom. Lett. 5(2), 1835–1842 (2020)

  34. Khaliq, A.; Ehsan, S.; Milford, M.; McDonald-Maier, K.: CAMAL: context-aware multi-scale attention framework for lightweight visual place recognition. arXiv preprint (2019). arXiv:1909.08153

  35. Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q.: Learning ROI transformer for oriented object detection in aerial images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2849–2858 (2019)

  36. Kim, K.H.; Hong, S.; Roh, B.; Cheon, Y.; Park, M.: PVANET: deep but lightweight neural networks for real-time object detection. arXiv preprint (2016). arXiv:1608.08021

  37. Liu, B.; Zhao, W.; Sun, Q.: Study of object detection based on Faster R-CNN. In: 2017 Chinese Automation Congress (CAC), pp. 6233–6236. IEEE (2017)

  38. Torii, A.; Arandjelovic, R.; Sivic, J.; Okutomi, M.; Pajdla, T.: 24/7 place recognition by view synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1808–1817 (2015)

  39. Chen, Z.; Jacobson, A.; Sünderhauf, N.; Upcroft, B.; Liu, L.; Shen, C.; Reid, I.; Milford, M.: Deep learning features at scale for visual place recognition. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 3223–3230. IEEE (2017)

  40. Sünderhauf, N.; Shirazi, S.; Jacobson, A.; Dayoub, F.; Pepperell, E.; Upcroft, B.; Milford, M.: Place recognition with ConvNet landmarks: viewpoint-robust, condition-robust, training-free. Robotics: Science and Systems, vol. XI, pp. 1–10 (2015)

  41. Hafez, A.H.A.; Tello, A.; Alqaraleh, S.: Visual place recognition by DTW-based sequence alignment. In: 2019 27th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2019)

  42. Vedaldi, A.; Fulkerson, B.: VLFeat: an open and portable library of computer vision algorithms (2008). http://www.vlfeat.org/

  43. Suenderhauf, N.: OpenSeqSLAM code (2013). https://openslam.org/openseqslam.html

  44. Hagberg, A.A.; Schult, D.A.; Swart, P.J.: Exploring network structure, dynamics, and function using NetworkX. In: Varoquaux, G.; Vaught, T.; Millman, J. (eds.) Proceedings of the 7th Python in Science Conference, pp. 11–15. Pasadena, CA, USA (2008)

Acknowledgements

The Titan Xp used for this research was donated by the NVIDIA Corporation. This work is supported by TUBITAK under project number 117E173.

Author information

Correspondence to A. H. Abdul Hafez.

About this article

Cite this article

Abdul Hafez, A.H., Tello, A. & Alqaraleh, S. COLCONF: Collaborative ConvNet Features-based Robust Visual Place Recognition for Varying Environments. Arab J Sci Eng 47, 2381–2395 (2022). https://doi.org/10.1007/s13369-021-06148-8

Keywords
