Evaluation of Feature Descriptors for Scene Classification

  • Conference paper
Trends in Sustainable Smart Cities and Territories (SSCT 2023)

Abstract

This article evaluates the performance of local and global descriptors, as well as convolutional neural networks (CNNs), in image recognition tasks in indoor spaces. The tests are designed to reproduce realistic situations that closely resemble the typical working conditions of mobile robots. A robot interacting with its environment may see only portions of a scene, objects viewed from different angles, or lighting that changes between settings. The aim is to investigate how well the different descriptors perform in recognizing scenes under these conditions. To evaluate the effectiveness of visual descriptors and CNNs in classifying images taken from the perspective of mobile robots in indoor environments, a proprietary database was built and subjected to several controlled transformations. These modifications made it possible to analyze the performance of Bag-of-Visual-Words (BoVW), Fisher Vectors (Fisher), Vector of Locally Aggregated Descriptors (VLAD), Global Image Descriptors (GIST), and CNN descriptors in visual categorization tasks according to the situational perception of mobile robots. The findings highlight the advantages of the descriptors in the various test scenarios and point to the need for hybrid models that employ both descriptors and CNNs for scene identification in indoor areas where mobile robots operate.
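To make the difference between two of the aggregated encodings named in the abstract concrete, the following is a minimal sketch of BoVW and VLAD encoding over a fixed codebook. It is not the authors' implementation: the function names, the toy two-word codebook, and the 2-D descriptors are hypothetical, and pure Python is used only to keep the example dependency-free. A real pipeline would extract local descriptors such as SIFT and learn the codebook by clustering.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook centroid closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bovw_encode(descriptors: Sequence[Vector], codebook: Sequence[Vector]) -> List[float]:
    """BoVW: histogram of nearest-word assignments, normalized to sum to 1."""
    counts = [0.0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts


def vlad_encode(descriptors: Sequence[Vector], codebook: Sequence[Vector]) -> List[float]:
    """VLAD: per-word sum of residuals (descriptor - centroid), flattened and L2-normalized."""
    dim = len(codebook[0])
    residuals = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        k = nearest_word(d, codebook)
        for i in range(dim):
            residuals[k][i] += d[i] - codebook[k][i]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


if __name__ == "__main__":
    # Hypothetical toy data: two visual words, three local descriptors.
    codebook = [(0.0, 0.0), (10.0, 10.0)]
    descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0)]
    print(bovw_encode(descriptors, codebook))
    print(vlad_encode(descriptors, codebook))
```

BoVW keeps only how many descriptors fall on each visual word, whereas VLAD also records where they fall relative to the centroid, so its vector has length (number of words) × (descriptor dimension). Fisher Vectors extend the same idea with a Gaussian mixture model in place of hard assignment.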

Acknowledgements

This research has been supported by the project “COordinated intelligent Services for Adaptive Smart areaS (COSASS)”, Reference: PID2021-123673OB-C33, financed by MCIN/AEI/10.13039/501100011033/FEDER, UE.

Author information

Corresponding author

Correspondence to Sebastián López Flórez.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Hernando Ríos González, L., López Flórez, S., González-Briones, A., de la Prieta, F. (2023). Evaluation of Feature Descriptors for Scene Classification. In: Castillo Ossa, L.F., Isaza, G., Cardona, Ó., Castrillón, O.D., Corchado Rodriguez, J.M., De la Prieta Pintado, F. (eds) Trends in Sustainable Smart Cities and Territories. SSCT 2023. Lecture Notes in Networks and Systems, vol 732. Springer, Cham. https://doi.org/10.1007/978-3-031-36957-5_24
