Abstract
With the rapid aging of the global population, service robots that provide living assistance in indoor scenes will play a crucial role in the field of elderly care. Such robots must detect multiple targets when completing assistive tasks. However, indoor scenes are usually complex and contain many types of interference, which makes multiple-target detection highly challenging. To overcome this difficulty, this paper proposes a novel improved Mask RCNN method for multiple-target detection in complex indoor scenes. The improved model uses Mask RCNN as the network framework and integrates the Convolutional Block Attention Module (CBAM), which combines channel and spatial attention mechanisms, while comprehensively considering the influence of different backgrounds, distances, angles and interference factors. In addition, a comprehensive evaluation system based on the loss function and Mean Average Precision (mAP) is established to assess the detection and identification performance of the model. For verification, experiments on detection and identification under different distances, backgrounds, postures and interference factors were conducted. The results demonstrate that the designed model achieves higher accuracy and better anti-interference ability than other methods at nearly the same detection speed. This research will promote the application of intelligent service robots in perception and target grasping.
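As a rough structural illustration of the attention design the abstract describes, the NumPy sketch below shows how a CBAM-style module re-weights a convolutional feature map: channel attention first, then spatial attention. This is a simplified sketch, not the paper's implementation — the learned shared MLP and the 7×7 convolution of the real CBAM block are replaced here by plain pooling plus a sigmoid gate, and all tensor shapes are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    # x: feature map of shape (C, H, W).
    # Global average- and max-pool each channel to a scalar, combine them,
    # and gate every channel with a sigmoid weight in (0, 1).
    # (Real CBAM passes both pooled vectors through a shared two-layer MLP.)
    avg = x.mean(axis=(1, 2))            # (C,)
    mx = x.max(axis=(1, 2))              # (C,)
    w = sigmoid(avg + mx)                # (C,)
    return x * w[:, None, None]

def spatial_attention(x):
    # Pool across the channel axis to get per-location statistics, then
    # gate every spatial position.
    # (Real CBAM applies a 7x7 conv over the stacked [avg; max] maps.)
    avg = x.mean(axis=0)                 # (H, W)
    mx = x.max(axis=0)                   # (H, W)
    w = sigmoid(avg + mx)                # (H, W)
    return x * w[None, :, :]

def cbam(x):
    # CBAM applies channel attention first, then spatial attention,
    # so the output keeps the input's (C, H, W) shape.
    return spatial_attention(channel_attention(x))
```

In the paper's setting such a block would be inserted into the Mask RCNN backbone so that feature responses from cluttered indoor backgrounds are suppressed before region proposal and mask prediction.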
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported by grants of the National Key Research and Development Program of China (No. 2022YFE0107300), the Chongqing Natural Science Foundation (No. cstc2020jcyj-msxmX0067), the Scientific and Technological Research Program of Chongqing Municipal Education Commission (No. KJQN202000821), the Chongqing Scientific Research Institutions Performance Incentive and Guidance Project (cstc2022jxjl00009) and the Graduate Scientific Research and Innovation Foundation of Chongqing Technology and Business University (No: yjscxx2022-112-161).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, Z., Wang, J., Li, J. et al. A novel multiple targets detection method for service robots in the indoor complex scenes. Intel Serv Robotics 16, 453–469 (2023). https://doi.org/10.1007/s11370-023-00471-9