Using Deep Learning to Find Victims in Unknown Cluttered Urban Search and Rescue Environments

  • Defense, Military, and Surveillance Robotics (S Ferrari and P Zhu, Section Editors)
  • Published in Current Robotics Reports

Abstract

Purpose of Review

We investigate the first use of deep networks for victim identification in Urban Search and Rescue (USAR). Moreover, we provide the first experimental comparison of single-stage and two-stage networks for body part detection, under partial occlusions and varying illumination, on an RGB-D dataset obtained by a mobile robot navigating cluttered USAR-like environments.
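
As a rough illustration of the two detector families being compared (and not the authors' pipeline), the sketch below runs an off-the-shelf single-stage detector (RetinaNet) and a two-stage detector (Faster R-CNN with a Feature Pyramid Network backbone, used here as a stand-in for a two-stage FPN detector) from torchvision on one RGB frame. The COCO-pretrained weights, the placeholder file name, and the 0.5 score threshold are all assumptions, intended only to show that both families expose the same detection interface; the networks in the paper are trained for body-part detection instead.

```python
# Minimal sketch (assumptions, not the authors' pipeline): run an off-the-shelf
# single-stage detector (RetinaNet) and a two-stage detector (Faster R-CNN with
# an FPN backbone) from torchvision on one RGB frame and compare their outputs.
import torch
import torchvision
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

single_stage = torchvision.models.detection.retinanet_resnet50_fpn(weights="DEFAULT")
two_stage = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# "frame.png" is a placeholder for one 3-channel RGB frame from the robot's camera.
image = convert_image_dtype(read_image("frame.png"), torch.float)   # 3xHxW in [0, 1]

for name, model in [("single-stage", single_stage), ("two-stage", two_stage)]:
    model.eval()
    with torch.no_grad():
        pred = model([image])[0]             # dict with 'boxes', 'scores', 'labels'
    keep = pred["scores"] > 0.5              # arbitrary confidence threshold
    print(f"{name}: {int(keep.sum())} detections above 0.5")
```

In both cases the output is a per-image dictionary of boxes, scores, and class labels, so downstream thresholding and evaluation can treat the two families identically.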

Recent Findings

We considered the single-stage detectors Single Shot MultiBox Detector (SSD), You Only Look Once (YOLO), and RetinaNet, and the two-stage Feature Pyramid Network (FPN) detector. Experimental results show that RetinaNet achieves the highest mean average precision (77.66%) and recall (86.98%) for detecting victims with body part occlusions under different lighting conditions.
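
The preview does not spell out the evaluation protocol, but mean average precision and recall for detectors are commonly obtained by matching detections to ground-truth boxes at an intersection-over-union (IoU) threshold (0.5 is typical), accumulating a per-class precision-recall curve, and integrating it; mAP then averages the per-class AP values. The helper below is a generic sketch of that PASCAL-VOC-style procedure, not the authors' evaluation code, and the box format and threshold are assumptions.

```python
# Generic sketch of per-class average precision (AP) and recall at IoU >= 0.5.
# Follows the common PASCAL-VOC-style protocol, NOT necessarily the paper's;
# box format [x1, y1, x2, y2] and the 0.5 threshold are assumptions.
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def average_precision(detections, gt_boxes, iou_thr=0.5):
    """detections: list of (score, box); gt_boxes: list of boxes, one class and image."""
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(gt_boxes)
    tp, fp = [], []
    for _, box in detections:
        ious = [iou(box, g) for g in gt_boxes]
        best = int(np.argmax(ious)) if ious else -1
        if best >= 0 and ious[best] >= iou_thr and not matched[best]:
            matched[best] = True              # true positive: first match to this GT box
            tp.append(1)
            fp.append(0)
        else:
            tp.append(0)                      # false positive: duplicate or poor overlap
            fp.append(1)
    tp, fp = np.cumsum(tp), np.cumsum(fp)
    recall = tp / max(len(gt_boxes), 1)
    precision = tp / np.maximum(tp + fp, 1e-9)
    # All-point interpolation: take the precision envelope, integrate over recall.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    changed = np.where(mrec[1:] != mrec[:-1])[0]
    ap = float(np.sum((mrec[changed + 1] - mrec[changed]) * mpre[changed + 1]))
    return ap, float(recall[-1]) if recall.size else 0.0

# Tiny example: one true positive (IoU ~0.81) and one false positive -> AP 1.0, recall 1.0.
print(average_precision([(0.9, [10, 10, 50, 50]), (0.6, [200, 200, 240, 240])],
                        [[12, 12, 48, 48]]))
```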

Summary

End-to-end deep networks can be used to find victims in USAR by autonomously extracting RGB-D image features from sensory data. We show that RetinaNet using RGB-D input is robust to body part occlusions and low-light conditions, and that it outperforms the other detectors regardless of the image input type.
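
The preview does not state how the RGB and depth streams are combined. One common option, shown below purely as an assumed illustration, is early fusion: normalize the depth map, stack it as a fourth input channel, and widen the backbone's first convolution to accept four channels. The ResNet-50 backbone, the 10 m depth range, and the filter-initialization heuristic are all assumptions, not details from the paper.

```python
# Assumed illustration of early RGB-D fusion: normalize depth, stack it as a 4th
# input channel, and widen the first convolution of a ResNet-50 backbone.
# This is NOT necessarily the fusion strategy used in the paper.
import torch
import torch.nn as nn
import torchvision

def rgbd_tensor(rgb_uint8, depth_m, max_depth_m=10.0):
    """rgb_uint8: HxWx3 uint8 image; depth_m: HxW depth in metres (assumed range)."""
    rgb = torch.as_tensor(rgb_uint8).permute(2, 0, 1).float() / 255.0        # 3xHxW in [0, 1]
    depth = torch.as_tensor(depth_m).float().clamp(0, max_depth_m) / max_depth_m
    return torch.cat([rgb, depth.unsqueeze(0)], dim=0)                       # 4xHxW

backbone = torchvision.models.resnet50(weights=None)
old = backbone.conv1                                        # Conv2d(3, 64, kernel_size=7, ...)
backbone.conv1 = nn.Conv2d(4, old.out_channels, kernel_size=old.kernel_size,
                           stride=old.stride, padding=old.padding, bias=False)
with torch.no_grad():
    backbone.conv1.weight[:, :3] = old.weight               # keep the original RGB filters
    backbone.conv1.weight[:, 3:] = old.weight.mean(dim=1, keepdim=True)  # depth filters = RGB mean

x = rgbd_tensor(torch.randint(0, 256, (480, 640, 3), dtype=torch.uint8),    # dummy RGB frame
                torch.rand(480, 640) * 10.0)                                # dummy depth map
logits = backbone(x.unsqueeze(0))                           # forward pass on a 4-channel input
```

An alternative is late fusion, where RGB and depth pass through separate branches whose features are merged before the detection head; either choice leaves the rest of the detector unchanged.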

Funding

This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canada Research Chairs (CRC) Program, and the NVIDIA GPU grant.

Author information

Contributions

The authors Angus Fung and Long Yu Wang contributed equally to this work.

Corresponding author

Correspondence to Angus Fung.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Defense, Military, and Surveillance Robotics.

Cite this article

Fung, A., Wang, L.Y., Zhang, K. et al. Using Deep Learning to Find Victims in Unknown Cluttered Urban Search and Rescue Environments. Curr Robot Rep 1, 105–115 (2020). https://doi.org/10.1007/s43154-020-00011-8
