An Evaluation of RetinaNet on Indoor Object Detection for Blind and Visually Impaired Persons Assistance Navigation

Neural Processing Letters

Abstract

Indoor object detection is a computer vision task concerned with detecting specific classes of indoor objects. It has attracted considerable attention in recent years, both because of its importance for indoor navigation assistance for visually impaired people and because of the rapid development of deep convolutional neural networks (deep CNNs). In this paper, we develop a new indoor object detector built on the deep convolutional neural network “RetinaNet”. We evaluate it with several backbones, namely ResNet, DenseNet, and VGGNet, in order to improve detection performance and processing time. The results are very encouraging, with detection precision reaching up to 84.61% mAP.
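
The abstract reports detection quality as mean average precision (mAP). As a rough illustration of how such a figure is typically obtained, the sketch below computes a per-class average precision (AP) as the area under the precision-recall curve and then averages over classes. It assumes the all-point interpolation commonly used in Pascal VOC-style evaluation; the abstract does not state which protocol the authors followed, and the function names, the class names ("door", "chair"), and the toy detections are hypothetical. Matching each detection to a ground-truth box (for example by an IoU threshold) is assumed to have been done beforehand.

```python
from typing import Dict, List, Tuple

# A detection is (confidence score, whether it matched a ground-truth box).
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_gt: int) -> float:
    """Area under the precision-recall curve for one class.

    Assumption: all-point interpolation (VOC-style), not necessarily
    the protocol used in the paper.
    """
    if num_gt == 0 or not detections:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    # Replace each precision by the maximum precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the curve wherever recall increases.
    ap = 0.0
    prev_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(
    per_class: Dict[str, Tuple[List[Detection], int]]
) -> float:
    """Mean of the per-class APs; per_class maps name -> (detections, num_gt)."""
    aps = [average_precision(dets, num_gt) for dets, num_gt in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data: two indoor classes with a handful of detections.
toy_results = {
    "door": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "chair": ([(0.6, True)], 1),
}
toy_map = mean_average_precision(toy_results)
```

Under this convention, a reported mAP such as 84.61% would be the class-averaged AP expressed as a percentage.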

Author information

Corresponding author

Correspondence to Mouna Afif.

About this article

Cite this article

Afif, M., Ayachi, R., Said, Y. et al. An Evaluation of RetinaNet on Indoor Object Detection for Blind and Visually Impaired Persons Assistance Navigation. Neural Process Lett 51, 2265–2279 (2020). https://doi.org/10.1007/s11063-020-10197-9

  • DOI: https://doi.org/10.1007/s11063-020-10197-9

Keywords

Navigation