Intelligent Service Robotics, Volume 8, Issue 2, pp 115–125

Scene recognition with bag of visual nouns and prepositions

Original Research Paper


The loop closure problem is central to topological simultaneous localization and mapping (SLAM): by associating features between distant portions of a trajectory, the odometry error accumulated between two observations can be eliminated and a more consistent map can be built. Bayesian pattern recognition techniques such as bag of visual words (BoVW) have recently shown outstanding results in solving the loop closure problem entirely in image space using simple, inexpensive cameras, without requiring highly accurate metric information, 3D reconstruction, or camera calibration. In this paper, a modified BoVW descriptor that incorporates simple geometric relationships within an image is used with the fast appearance-based mapping (FAB-MAP) algorithm. In direct comparisons with the traditional BoVW descriptor, an improved recall rate is observed with an acceptable increase in computational time. Proposing a BoVW-compatible descriptor and using it with a well-known BoVW classifier demonstrates that the BoVW metaphor can be generalized, which could pave the way for a wider variety of BoVW descriptors, in the same way that many individual visual feature descriptors exist within the computer vision community.
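The idea of extending a BoVW descriptor with simple geometric relationships can be illustrated with a minimal sketch. This is not the descriptor defined in the paper: the function names, the choice of `left_of` and `above` as relations, and the toy feature list are all assumptions made for illustration. Each feature is taken to be a quantized visual word ID with an image position, and the "noun" part is a binary word-occurrence vector.

```python
from itertools import combinations


def bovw_vector(word_ids, vocab_size):
    """Binary bag-of-visual-words vector: 1 if a word occurs in the image."""
    vec = [0] * vocab_size
    for w in word_ids:
        vec[w] = 1
    return vec


def preposition_pairs(features):
    """Hypothetical 'preposition' features between pairs of visual words.

    features: list of (word_id, x, y); y grows downward as in image coordinates.
    Returns a set of (word_a, relation, word_b) tuples.
    """
    relations = set()
    for (wa, xa, ya), (wb, xb, yb) in combinations(features, 2):
        if wa == wb:
            continue  # a relation between identical words carries no information here
        if xa != xb:
            left, right = (wa, wb) if xa < xb else (wb, wa)
            relations.add((left, "left_of", right))
        if ya != yb:
            upper, lower = (wa, wb) if ya < yb else (wb, wa)
            relations.add((upper, "above", lower))
    return relations


# Toy input (assumed): three detected features already assigned to words 0, 2, 3.
features = [(0, 10.0, 50.0), (2, 80.0, 20.0), (3, 40.0, 90.0)]
nouns = bovw_vector([w for w, _, _ in features], vocab_size=5)
preps = preposition_pairs(features)
```

In a sketch like this, the relation tuples could be mapped to additional vocabulary indices so that the combined descriptor remains a plain occurrence vector, which is one way such a descriptor might stay compatible with a BoVW classifier.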


Bag of visual words · Scene recognition · Loop closure · Place recognition · SLAM



This research was supported by MOTIE under the Industrial Foundation Technology Development Program supervised by KEIT (No. 10051155) and by the Basic Science Research Program through the NRF funded by MSIP (No. 2007-0056094).



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Department of Mechanical Engineering, Korea University, Seoul, South Korea
