Towards Loop Closure Detection for SLAM Applications Using Bag of Visual Features: Experiments and Simulation

da Silva, Alexandra Miguel Raibolt; Casqueiro, Gustavo Alves; Angonese, Alberto Torres; Rosa, Paulo Fernando Ferreira

doi:10.1007/978-3-031-08443-0_3

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1519)

Included in the following conference series:

Latin American Workshop on Computational Neuroscience

Abstract

This paper presents a new approach to exploring sparse and binary convolutional filters in traditional Convolutional Neural Networks (CNNs). Recent advances in the integration of Deep Learning architectures, particularly in mobile autonomous robotics applications, have motivated considerable research into overcoming the limitations of computational resources. One of the biggest challenges in the area is the development of applications that address the Loop Closure Detection problem in Simultaneous Localization and Mapping (SLAM) systems, which demand substantial computational power. Optimizing the resources used by CNN models makes such integration more feasible. Therefore, we propose reformulating convolutional layers through Local Binary Descriptors (LBD) to achieve this kind of optimization. This paper discusses the evaluation of a Bag of Visual Features (BoVF) approach that extracts features through local descriptors (e.g., SIFT, SURF, KAZE) and local binary descriptors (e.g., BRIEF, ORB, BRISK, AKAZE, FREAK). The descriptors were evaluated in the recognition and classification steps on six visual datasets (MNIST, JAFFE, Extended CK+, FEI, CIFAR-10, and FER-2013) using a Multilayer Perceptron (MLP) classifier. Experimentally, we demonstrate that combining BoVF with an MLP classifier can produce promising results. Additionally, we can assume that the descriptors computed by a Local Binary Descriptor, together with the proposed hybrid Deep Neural Network (DNN) architecture, can satisfactorily support the optimization of a CNN's resources applied to the Loop Closure Detection problem.
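
To make the Bag of Visual Features idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the local descriptors have already been extracted (for example by SIFT or ORB) and are given as plain numeric vectors. The codebook is built with a small k-means on Euclidean distance, which is an assumption of this sketch; binary descriptors are more commonly compared with Hamming distance. The function names `build_codebook` and `bovf_histogram` are hypothetical. The resulting fixed-length histogram is the kind of vector that could be fed to an MLP classifier.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the visual word (codebook centre) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def build_codebook(descriptors: Sequence[Vector], num_words: int,
                   iterations: int = 10, seed: int = 0) -> List[List[float]]:
    """Cluster descriptors into num_words visual words with a basic k-means.

    Assumption of this sketch: centres are initialised from randomly chosen
    descriptors, and a centre is left unchanged if its cluster becomes empty.
    """
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(list(descriptors), num_words)]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        for k, members in enumerate(clusters):
            if members:
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bovf_histogram(descriptors: Sequence[Vector],
                   codebook: Sequence[Vector]) -> List[float]:
    """Normalised histogram of visual-word occurrences for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


if __name__ == "__main__":
    # Toy 2-D "descriptors" standing in for real SIFT/ORB output.
    training = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    words = build_codebook(training, num_words=2)
    image_descriptors = [[0, 0], [10, 10], [10, 11]]
    print(bovf_histogram(image_descriptors, words))
```

Whatever number of keypoints an image yields, its histogram has one entry per visual word, so every image maps to a vector of the same length. That fixed length is what allows a standard classifier to consume descriptors from images with varying keypoint counts.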

Acknowledgments

– This work was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

– This work was carried out with the support of the Programa de Cooperação Acadêmica em Defesa Nacional (PROCAD-DEFESA).

Author information

Authors and Affiliations

Instituto Militar de Engenharia (IME), Praça Gen. Tibúrcio, 80 - Urca, Rio de Janeiro, RJ, Brazil
Alexandra Miguel Raibolt da Silva, Gustavo Alves Casqueiro & Paulo Fernando Ferreira Rosa
Faculdade de Ed. Tec. do Estado do Rio de Janeiro (FAETERJ/Petrópolis), Av. Getúlio Vargas, 335 - Quitandinha, Petrópolis, RJ, Brazil
Alberto Torres Angonese

Authors

Alexandra Miguel Raibolt da Silva
Gustavo Alves Casqueiro
Alberto Torres Angonese
Paulo Fernando Ferreira Rosa

Corresponding authors

Correspondence to Alexandra Miguel Raibolt da Silva or Paulo Fernando Ferreira Rosa.

Editor information

Editors and Affiliations

Federal University of Maranhão (UFMA), São Luís, Maranhão, Brazil
Paulo Rogério de Almeida Ribeiro
Federal University of São João del-Rei, São João del-Rei, Minas Gerais, Brazil
Vinícius Rosa Cota
Federal University of Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
Dante Augusto Couto Barone
Federal University of Maranhão (UFMA), São Luís, Maranhão, Brazil
Alexandre César Muniz de Oliveira

About this paper

Cite this paper

da Silva, A.M.R., Casqueiro, G.A., Angonese, A.T., Rosa, P.F.F. (2022). Towards Loop Closure Detection for SLAM Applications Using Bag of Visual Features: Experiments and Simulation. In: Ribeiro, P.R.d.A., Cota, V.R., Barone, D.A.C., de Oliveira, A.C.M. (eds) Computational Neuroscience. LAWCN 2021. Communications in Computer and Information Science, vol 1519. Springer, Cham. https://doi.org/10.1007/978-3-031-08443-0_3

DOI: https://doi.org/10.1007/978-3-031-08443-0_3
Published: 19 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08442-3
Online ISBN: 978-3-031-08443-0
eBook Packages: Computer Science, Computer Science (R0)
