Abstract
Indoor object detection is a computer vision task concerned with detecting specific indoor object classes. The task has attracted considerable attention, especially in the last few years. This interest can be explained by the importance of the task for indoor navigation assistance for visually impaired people and by the rapid development of deep convolutional neural networks (deep CNNs). In this paper, we propose a new indoor object detector built on the deep convolutional neural network “RetinaNet”. We evaluate it with several backbones, namely ResNet, DenseNet, and VGGNet, in order to improve detection performance and processing time. The results are very encouraging, reaching a detection precision of up to 84.61% mAP.
Cite this article
Afif, M., Ayachi, R., Said, Y. et al. An Evaluation of RetinaNet on Indoor Object Detection for Blind and Visually Impaired Persons Assistance Navigation. Neural Process Lett 51, 2265–2279 (2020). https://doi.org/10.1007/s11063-020-10197-9
DOI: https://doi.org/10.1007/s11063-020-10197-9