
Bag of Sampled Words: A Sampling-based Strategy for Fast and Accurate Visual Place Recognition in Changing Environments

  • Robot and Applications
  • Published: International Journal of Control, Automation and Systems

Abstract

In the field of visual place recognition, a variety of methods based on the Visual Bag of Words model have been proposed to cope with environmental change. This paper presents a sampling-based method that improves both the speed and the accuracy of existing Visual Bag of Words models. We first propose density-aware sampling of image features to speed up the quantization step. Because only the sampled features are quantized, a more accurate but otherwise slow ranking procedure becomes feasible; we therefore propose a ranking procedure that exploits the spatial information of the samples. Lastly, we propose a coarse-to-fine refinement method that increases the accuracy of the system by iteratively updating the similarity between images. Experimental results show that the proposed method improves the performance of existing Visual Bag of Words models in terms of both speed and accuracy.
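The full text is behind a paywall, so as an illustration only, here is a minimal sketch of what density-aware feature sampling could look like. The function `sample_by_density`, the Gaussian-kernel density estimate, and the inverse-density weighting are assumptions for the sake of the example, not the paper's actual algorithm; the idea shown is simply that downsampling features before vocabulary quantization reduces work, while weighting the sample toward low-density regions preserves coverage of the descriptor space.

```python
import math
import random

def sample_by_density(points, n_samples, bandwidth=0.5, seed=0):
    """Illustrative density-aware sampling (an assumption, not the paper's
    exact scheme): keep a subset of feature points, preferring those in
    sparse regions, so the sample still covers the space after downsampling."""
    rnd = random.Random(seed)

    # Gaussian kernel density estimate at one point, over all points.
    def density(p):
        return sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(p, q))
                     / (2 * bandwidth ** 2))
            for q in points
        )

    # Weight each point inversely to its local density.
    weights = [1.0 / density(p) for p in points]

    # Weighted sampling without replacement.
    chosen = []
    idx = list(range(len(points)))
    for _ in range(n_samples):
        total = sum(weights[i] for i in idx)
        r = rnd.random() * total
        acc = 0.0
        for j, i in enumerate(idx):
            acc += weights[i]
            if acc >= r:
                chosen.append(points[i])
                idx.pop(j)
                break
    return chosen

# Toy usage: thin a 5x4 grid of 2-D "keypoints" down to 8 samples,
# which would then be quantized against the visual vocabulary.
keypoints = [(float(i % 5), float(i // 5)) for i in range(20)]
subset = sample_by_density(keypoints, n_samples=8)
print(len(subset))  # 8
```

The quadratic-cost kernel density estimate here is only for readability; a real system would use an approximate neighbor structure (e.g., a k-d tree) to keep the sampling step cheaper than the quantization it is meant to accelerate.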



Author information


Corresponding author

Correspondence to Sung Soo Hwang.

Additional information

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Recommended by Associate Editor Kang-Hyun Jo under the direction of Editor Euntai Kim. This journal was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2016R1D1A3B03934808).

Sang Jun Lee received his B.S. degree in Computer Science and Engineering from Handong Global University, Pohang-si, Korea, in 2017. He is currently pursuing an M.S. degree in the Dept. of Information Technology at Handong Global University. His research interests include SLAM systems for self-driving car localization, robotics, augmented reality with 3D reconstruction, and optimization.

Sung Soo Hwang received his B.S. degree in Electrical Engineering and Computer Science from Handong Global University, Pohang, Korea, in 2008, and his M.S. and Ph.D. degrees from the Korea Advanced Institute of Science and Technology, Daejeon, Korea, in 2010 and 2015, respectively. His research interests include image-based 3D modeling, 3D data compression, augmented reality, and simultaneous localization and mapping (SLAM) systems.


About this article


Cite this article

Lee, S.J., Hwang, S.S. Bag of Sampled Words: A Sampling-based Strategy for Fast and Accurate Visual Place Recognition in Changing Environments. Int. J. Control Autom. Syst. 17, 2597–2609 (2019). https://doi.org/10.1007/s12555-018-0790-6

