A hierarchical approach for updating targeted person states in human-following mobile robots

  • Original Research Paper
  • Published in Intelligent Service Robotics

Abstract

In the human-following task, human detection, tracking, and identification are fundamental steps that enable a mobile robot to follow its selected target person (STP) while maintaining an appropriate distance and orientation without posing any threat. Recently, along with the widespread deployment of robots in general and service robots in particular, not only safety but also flexibility, naturalness, and sociability are demanded at an ever higher level in human-friendly services and collaborative tasks. This demand poses further challenges for robustly detecting, tracking, and identifying the STP, since human–robot cooperation becomes more complex and unpredictable. Clearly, safe and natural robot behavior cannot be ensured if the STP is lost or the robot misidentifies its target. In this paper, a hierarchical approach is presented to update the states of the STP more robustly during the human-following task. The method is designed to achieve robust, accurate, and fast performance in support of safe and natural robot behaviors on modest hardware. The proposed system is verified by a set of experiments and shows reasonable results.
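
As a rough illustration of the hierarchy sketched in the abstract (and walked through in Appendix 1), the following Python snippet shows one plausible structure for a single update cycle: the 2D-LiDAR leg tracker is trusted on its own while the STP is isolated, and visual human detection, face identification, and body identification are activated in turn as environmental objects or other people come close. This is a minimal sketch under assumed interfaces; the class, function, and threshold names (Track, detect_humans_visually, identify_face, identify_body, AMBIGUITY_RADIUS) are illustrative, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

@dataclass
class Track:
    """A tracked candidate (e.g., a leg cluster from the 2D-LiDAR tracker)."""
    track_id: int
    x: float          # position in the robot frame [m]
    y: float
    is_person: bool   # True for human-leg tracks, False for environmental objects

AMBIGUITY_RADIUS = 0.8  # [m], hypothetical threshold around the STP

def neighbours(stp: Track, tracks: List[Track], radius: float = AMBIGUITY_RADIUS) -> List[Track]:
    """Tracks (people or objects) lying within `radius` of the current STP estimate."""
    return [t for t in tracks
            if t.track_id != stp.track_id
            and (t.x - stp.x) ** 2 + (t.y - stp.y) ** 2 <= radius ** 2]

def update_stp_state(stp: Track,
                     tracks: List[Track],
                     detect_humans_visually: Callable[[], Sequence[int]],
                     identify_face: Callable[[Sequence[int]], Optional[int]],
                     identify_body: Callable[[Sequence[int]], Optional[int]]) -> Optional[int]:
    """One cycle of the hierarchical update; returns the STP track id, or None if lost.

    The three callables stand in for the visual human detection, face
    identification, and body identification modules. Only the cheap LiDAR
    tracker is consulted while the STP is isolated; the visual modules are
    activated on demand.
    """
    near = neighbours(stp, tracks)
    if not near:
        # Level 0: nothing around the STP -> trust the LiDAR leg tracker alone.
        return stp.track_id

    # Level 1: something is close -> activate visual human detection.
    candidate_ids = detect_humans_visually()
    if not any(t.is_person for t in near):
        # Only environmental objects nearby: detection is enough to re-anchor the STP.
        return stp.track_id if stp.track_id in candidate_ids else None

    # Level 2: other people are nearby -> try face identification first.
    face_match = identify_face(candidate_ids)
    if face_match is not None:
        return face_match

    # Level 3: the STP's face is not visible -> fall back to body identification.
    return identify_body(candidate_ids)
```

The point of such a hierarchy is that the comparatively expensive visual modules run only on demand, which is how the approach aims to stay responsive on modest hardware (compare the per-module CPU notes in Appendix 2).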


Code or data availability

Not applicable.

References

  1. Islam MJ, Hong J, Sattar J (2019) Person-following by autonomous robots: a categorical overview. Int J Robot Res 38(14):1581–1618

  2. Rudenko A et al (2020) Human motion trajectory prediction: a survey. Int J Robot Res 39(8):895–935

  3. Leigh A et al (2015) Person tracking and following with 2D laser scanners. In: 2015 IEEE international conference on robotics and automation (ICRA), Seattle, Washington, USA, 26–30 May 2015, pp 726–733

  4. Yuan J et al (2018) Laser-based intersection-aware human following with a mobile robot in indoor environments. IEEE Trans Syst Man Cybern Syst 51(1):354–369

  5. Beyer L et al (2018) Deep person detection in two-dimensional range data. IEEE Robot Autom Lett 3(3):2726–2733

  6. Guerrero-Higueras AM et al (2019) Tracking people in a mobile robot from 2D LIDAR scans using full convolutional neural networks for security in cluttered environments. Front Neurorobot 12:85

  7. Eguchi R, Yorozu A, Takahashi M (2019) Spatiotemporal and kinetic gait analysis system based on multisensor fusion of laser range sensor and instrumented insoles. In: 2019 IEEE international conference on robotics and automation (ICRA), Montreal, QC, Canada, 20–24 May 2019, pp 4876–4881

  8. Duong HT, Suh YS (2020) Human gait tracking for normal people and walker users using a 2D LiDAR. IEEE Sens J 20(11):6191–6199

  9. Cha D, Chung W (2020) Human-leg detection in 3D feature space for a person-following mobile robot using 2D LiDARs. Int J Precis Eng Manuf 21(7):1299–1307

  10. Mandischer N et al (2021) Radar tracker for human legs based on geometric and intensity features. In: 2021 29th European signal processing conference (EUSIPCO), Dublin, Ireland, 23–27 August 2021, pp 1521–1525

  11. Eguchi R, Takahashi M (2022) Human leg tracking by fusion of laser range and insole force sensing with Gaussian mixture model-based occlusion compensation. IEEE Sens J 22(4):3704–3714

  12. Torta E et al (2011) Design of robust robotic proxemic behavior. In: Social robotics: third international conference on social robotics, ICSR 2011, Amsterdam, The Netherlands, 24–25 November 2011, Proceedings 3, pp 21–30

  13. Torta E et al (2013) Design of a parametric model of personal space for robotic social navigation. Int J Soc Robot 5(3):357–365

  14. Truong X-T, Ngo T-D (2016) Dynamic social zone based mobile robot navigation for human comfortable safety in social environments. Int J Soc Robot 8(5):663–684

  15. Van Toan N, Khoi PB (2019) Fuzzy-based-admittance controller for safe natural human-robot interaction. Adv Robot 33(15–16):815–823

  16. Van Toan N, Khoi PB (2019) A control solution for closed-form mechanisms of relative manipulation based on fuzzy approach. Int J Adv Robot Syst 16(2):1–11

  17. Van Toan N, Do MH, Jo J (2022) Robust-adaptive-behavior strategy for human-following robots in unknown environments based on fuzzy inference mechanism. Ind Robot Int J Robot Res Appl 49(6):1089–1100

  18. Van Toan N et al (2023) The human-following strategy for mobile robots in mixed environments. Robot Auton Syst 160:104317

  19. Van Toan N, Khoi PB, Yi SY (2021) A MLP-hedge-algebras admittance controller for physical human–robot interaction. Appl Sci 11(12):5459

  20. Van Toan N, Yi S-Y, Khoi PB (2020) Hedge algebras-based admittance controller for safe natural human-robot interaction. Adv Robot 34(24):1546–1558

  21. Khoi PB, Van Toan N (2018) Hedge-algebras-based controller for mechanisms of relative manipulation. Int J Precis Eng Manuf 19(3):377–385

  22. Fosty B et al (2016) Accuracy and reliability of the RGB-D camera for measuring walking speed on a treadmill. Gait Posture 48:113–119

  23. Koide K, Miura J (2016) Identification of a specific person using color, height, and gait features for a person following robot. Robot Auton Syst 84:76–87

  24. Chen BX, Sahdev R, Tsotsos JK (2017) Integrating stereo vision with a CNN tracker for a person-following robot. In: International conference on computer vision systems; computer vision systems. Springer, Berlin/Heidelberg, pp 300–313

  25. Lee B-J et al (2018) Robust human following by deep Bayesian trajectory prediction for home service robots. In: 2018 IEEE international conference on robotics and automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018, pp 7189–7195

  26. Yang C-A, Song K-T (2019) Control design for robotic human-following and obstacle avoidance using an RGB-D camera. In: 2019 19th International conference on control, automation and systems (ICCAS 2019), Jeju, South Korea, 15–18 October 2019, pp 934–939

  27. Vilas-Boas MC et al (2019) Full-body motion assessment: concurrent validation of two body tracking depth sensors versus a gold standard system during gait. J Biomech 87:189–196

  28. Yagi K et al (2020) Gait measurement at home using a single RGB camera. Gait Posture 76:136–140

  29. Yorozu A, Takahashi M (2020) Estimation of body direction based on gait for service robot applications. Robot Auton Syst 132:103603

  30. Redhwan A, Choi M-T (2020) Deep-learning-based indoor human following of mobile robot using color feature. Sensors (Basel) 20(9):2699

  31. Van Toan N, Hoang MD, Jo J (2022) MoDeT: a low-cost obstacle tracker for self-driving mobile robot navigation using 2D-laser scan. Ind Robot Int J Robot Res Appl 49(6):1032–1041

  32. Ren S et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Proceedings of the 28th international conference on neural information processing systems, Montreal, Canada, 7–12 December 2015, pp 91–99

  33. Girshick R et al (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE conference on computer vision and pattern recognition, Columbus, OH, USA, 23–28 June 2014, pp 580–587

  34. Dai J et al (2016) R-FCN: Object detection via region-based fully convolutional networks. In: Proceedings of the 30th international conference on neural information processing systems, Barcelona, Spain, 5–10 December 2016, pp 379–387

  35. Redmon J et al (2016) You only look once: unified, real-time object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016, pp 779–788

  36. Liu W et al (2016) SSD: Single shot multibox detector. In: European conference on computer vision, Amsterdam, Netherlands, 11–14 October 2016, pp 21–37

  37. Vu T-H, Osokin A, Laptev I (2015) Context-aware CNNs for person head detection. In: 2015 IEEE international conference on computer vision (ICCV), Santiago, Chile, 07–13 December 2015, pp 2893–2901

  38. Rashid M, Gu X, Lee YJ (2017) Interspecies knowledge transfer for facial keypoint detection. In: IEEE conference on computer vision and pattern recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017, pp 6894–6903

  39. Girdhar R et al (2018) Detect-and-track: efficient pose estimation in videos. In: IEEE conference on computer vision and pattern recognition (CVPR), Salt Lake City, Utah, USA, 19–21 June 2018, pp 350–359

  40. Hong M et al (2022) SSPNet: scale selection pyramid network for tiny person detection from UAV images. IEEE Geosci Remote Sens Lett 19:1–5. https://doi.org/10.1109/LGRS.2021.3103069

  41. Howard AG et al (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. Comput Vis Pattern Recognit. https://doi.org/10.48550/arXiv.1704.04861

  42. LabelImg: graphical image annotation tool (labelImg). Available at: https://github.com/heartexlabs/labelImg

  43. King D (2017) High quality face recognition with deep metric learning. Available at: http://blog.dlib.net/2017/02/high-quality-face-recognition-with-deep.html

  44. Huang GB, Learned-Miller E (2014) Labeled faces in the wild: updates and new reporting procedures. Technical Report UM-CS-2014–03, University of Massachusetts, Amherst

  45. Huang GB et al (2007) Labeled faces in the wild: a database for studying face recognition in unconstrained environments. Technical Report 07–49, University of Massachusetts, Amherst

  46. Hermans A, Beyer L, Leibe B (2017) In defense of the triplet loss for person re-identification. Comput Vis Pattern Recognit. https://arxiv.org/abs/1703.07737

  47. Yuan Y et al (2020) In defense of the triplet loss again: learning robust person re-identification with fast approximated triplet loss and label distillation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020, pp 1454–1463

  48. He K et al (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016, pp 770–778

  49. He K et al (2016) Identity mappings in deep residual networks. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision—ECCV 2016. ECCV 2016. Lecture notes in computer science, vol 9908. Springer, Cham. https://doi.org/10.1007/978-3-319-46493-0_38

  50. van der Maaten L (2014) Accelerating t-SNE using tree-based algorithms. J Mach Learn Res 15(93):3221–3245

  51. Ku J, Harakeh A, Waslander SL (2018) In defense of classical image processing: fast depth completion on the CPU. In: 2018 15th conference on computer and robot vision (CRV), Toronto, Canada, 9–11 May 2018, pp 16–22

  52. Ku J et al (2018) Joint 3D proposal generation and object detection from view aggregation. In: 2018 IEEE/RSJ international conference on intelligent robots and systems (IROS), Madrid, Spain, 01–05 October 2018, pp 1–8

  53. Lahoud J, Ghanem B (2017) 2D-driven 3D object detection in RGB-D images. In: 2017 IEEE international conference on computer vision (ICCV), Venice, Italy, 22–29 October 2017, pp 4622–4630

  54. Qi CR et al (2018) Frustum PointNets for 3D object detection from RGB-D data. In: 2018 IEEE conference on computer vision and pattern recognition (CVPR), Salt Lake City, Utah, USA, 18–22 June 2018, pp 918–927

  55. Shi W et al (2018) Dynamic obstacles rejection for 3D map simultaneous updating. IEEE Access 6:37715–37724

Acknowledgements

Not applicable.

Funding

This work was supported by the Research Program funded by the Seoul National University of Science and Technology.

Author information

Corresponding author

Correspondence to Soo-Yeong Yi.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the author(s).

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In the tables below, “Seq” indicates the sequence number of the event, “Time” indicates the moment at which the action starts in the experiment, and “Snapshot” indicates the time at which the action starts in the video.

1.1 Appendix 1: Sequence of events of the experiment in this paper (in the demo video)

Seq | Time (mm:ss) | Action | Snapshot (mm:ss)
1 | 00:00 | Only the 2D-LiDAR sensor is used. The robot detects and tracks its STP and other persons appearing within its detection range | https://www.youtube.com/watch?v=D17h_oRf_cs&t=0s (00:00)
2 | 00:13 | Only the 2D-LiDAR sensor is used. The robot tracks the STP while the STP is very close to environmental objects | https://www.youtube.com/watch?v=D17h_oRf_cs&t=13s (00:13)
3 | 00:29 | Only the 2D-LiDAR sensor is used. The robot misidentifies its STP: the STP's identification number switches to an environmental object | https://www.youtube.com/watch?v=D17h_oRf_cs&t=29s (00:29)
4 | 00:35 | Hierarchical approach. The STP moves too close to environmental objects, so the robot activates the visual human detection | https://www.youtube.com/watch?v=D17h_oRf_cs&t=35s (00:35)
5 | 00:55 | Hierarchical approach. The STP moves away from environmental objects; with nothing around the STP, the visual-based modules are deactivated | https://www.youtube.com/watch?v=D17h_oRf_cs&t=55s (00:55)
6 | 01:20 | Hierarchical approach. Other people are close to the STP. The robot activates the visual human detection and face identification to identify the correct STP | https://www.youtube.com/watch?v=D17h_oRf_cs&t=80s (01:20)
7 | 01:53 | Hierarchical approach. Other people are close to the STP. The robot first activates the human detection and face identification to identify its STP; if the STP's face is not visible, the body identification is activated | https://www.youtube.com/watch?v=D17h_oRf_cs&t=113s (01:53)
8 | 02:37 | Hierarchical approach. No other people or environmental objects are near the STP, so the robot deactivates the visual-based modules | https://www.youtube.com/watch?v=D17h_oRf_cs&t=157s (02:37)
9 | 02:46 | Hierarchical approach. During the visual identification procedure, the STP is tracked (2D RGB human tracking and 3D point-cloud tracking) so that it can be matched with the tracked human legs once the visual identification finishes (see the sketch after this table) | https://www.youtube.com/watch?v=D17h_oRf_cs&t=165s (02:46)
10 | 02:58 | STP data collection for the visual identification procedure | https://www.youtube.com/watch?v=D17h_oRf_cs&t=178s (02:58)
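
Seq 9 above notes that, while the visual identification runs, the STP is tracked in 2D RGB and 3D point-cloud space so that the result can be matched back to a tracked human-leg candidate. A minimal sketch of such a matching step, assuming a plain nearest-neighbour association with a gating distance in the robot frame (the function name and gate value are hypothetical, not the paper's procedure), could look like this:

```python
import math
from typing import Dict, Optional, Tuple

def match_visual_to_legs(visual_xy: Tuple[float, float],
                         leg_tracks: Dict[int, Tuple[float, float]],
                         gate: float = 0.5) -> Optional[int]:
    """Nearest-neighbour association between the visually identified STP and LiDAR leg tracks.

    visual_xy  : STP position recovered from the RGB-D pipeline, already expressed
                 in the robot (LiDAR) frame [m].
    leg_tracks : mapping from leg-track id to its (x, y) position in the same frame.
    gate       : hypothetical gating distance [m]; no match is reported beyond it.
    """
    best_id, best_d = None, gate
    for track_id, (x, y) in leg_tracks.items():
        d = math.hypot(x - visual_xy[0], y - visual_xy[1])
        if d < best_d:
            best_id, best_d = track_id, d
    return best_id

# Example: the visual STP estimate at (1.2, 0.3) m snaps to leg track 7.
print(match_visual_to_legs((1.2, 0.3), {7: (1.25, 0.28), 9: (2.4, -0.6)}))  # -> 7
```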

1.2 Appendix 2: Effects and CPU consumption of the sub-methods in the visual identification procedure of Appendix 1

Seq | Time (mm:ss) | Action | Snapshot (mm:ss)
1 | 00:00 | Only the visual-based human detection is activated | https://www.youtube.com/watch?v=1jlm8ZWK_pA&t=0s (00:00)
2 | 00:31 | Visual-based human detection and body identification are activated | https://www.youtube.com/watch?v=1jlm8ZWK_pA&t=31s (00:31)
3 | 01:24 | Only the face identification is activated | https://www.youtube.com/watch?v=1jlm8ZWK_pA&t=84s (01:24)

1.3 Appendix 3: Sequences of the flexible heading during human–robot cooperation (presented in [17, 18])

Seq | Time (mm:ss) | Action | Snapshot (mm:ss)
1 | 00:00 (in [17]) | The robot changes its heading flexibly as the STP changes moving direction and side with respect to the robot's local coordinates (a toy heading sketch follows this table) | https://www.youtube.com/watch?v=5EJiJUNxSCU&t=0s (00:00)
2 | 03:40 (in [17]) | The human-following task is harassed by other people | https://www.youtube.com/watch?v=5EJiJUNxSCU&t=220s (03:40)
3 | 00:00 (in [18]) | The robot follows and supports the STP in taking food trays in the office | https://www.youtube.com/watch?v=YGrWU6ldKuw&t=0s (00:00)
4 | 01:00 (in [18]) | Human–robot cooperation in narrow areas, surrounded by many environmental objects and prohibited areas | https://www.youtube.com/watch?v=YGrWU6ldKuw&t=60s (01:00)
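
The flexible-heading behavior referenced above is realized in [17, 18] with fuzzy inference. Purely as a geometric illustration of what "the heading follows the STP's relative position" means, a toy sketch (not the controllers of [17, 18]; all names are hypothetical) might compute the commanded bearing as follows:

```python
import math

def desired_heading(stp_x: float, stp_y: float) -> float:
    """Bearing of the STP in the robot's local frame (x forward, y left), in radians.

    Steering toward this bearing is one simple way to make the heading follow
    the STP as the person changes direction or switches sides of the robot.
    """
    return math.atan2(stp_y, stp_x)

def heading_error(current_yaw: float, target_yaw: float) -> float:
    """Smallest signed angle from the current to the target heading, wrapped to [-pi, pi]."""
    return math.atan2(math.sin(target_yaw - current_yaw),
                      math.cos(target_yaw - current_yaw))

# Example: STP 2 m ahead and 1 m to the left -> turn roughly +26.6 degrees.
print(math.degrees(heading_error(0.0, desired_heading(2.0, 1.0))))
```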

1.4 Appendix 4: Sequence of events of the experiment video on object tracking using the fusion of the 2D-LiDAR and RGB-D cameras

Seq | Action | Snapshot (mm:ss)
1 | The robot is stationary while detecting and tracking objects using only the RGB-D cameras | https://www.youtube.com/watch?v=bi42fB3EfWA (00:00)
2 | The robot is stationary while detecting and tracking objects using only the RGB-D cameras; here, the visualization of the filtered 3D point-cloud data is turned on | https://www.youtube.com/watch?v=bi42fB3EfWA&t=58s (00:58)
3 | The robot is stationary while detecting and tracking objects using the fusion of the 2D-LiDAR and RGB-D cameras (a brief fusion sketch follows this table) | https://www.youtube.com/watch?v=z0FMGI5_tsg&t=0s (00:00)
4 | The robot is moving while detecting and tracking objects using the fusion of the 2D-LiDAR and RGB-D cameras | https://www.youtube.com/watch?v=z0FMGI5_tsg&t=50s (00:50)
5 | The robot is stationary while detecting and tracking objects using the fusion of the 2D-LiDAR and RGB-D cameras | https://www.youtube.com/watch?v=z0FMGI5_tsg&t=152s (02:32)
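
The fusion rows above combine the 2D-LiDAR with the RGB-D cameras. One small ingredient of any such fusion is expressing the camera-space detection in the LiDAR frame before associating or blending it with a LiDAR cluster. The sketch below assumes a fixed, hypothetical extrinsic calibration and a placeholder confidence weight; it illustrates the general idea only and is not the paper's fusion method:

```python
import numpy as np

# Hypothetical extrinsic calibration: rigid transform from the RGB-D camera
# frame (x right, y down, z forward) to the 2D-LiDAR/base frame (x forward,
# y left, z up), as a 4x4 homogeneous matrix. Values are placeholders.
T_LIDAR_CAMERA = np.array([
    [0.0,  0.0, 1.0, 0.20],   # camera z (forward) -> lidar x, 0.20 m offset
    [-1.0, 0.0, 0.0, 0.00],   # camera x (right)   -> -lidar y
    [0.0, -1.0, 0.0, 0.45],   # camera y (down)    -> -lidar z, mounted 0.45 m up
    [0.0,  0.0, 0.0, 1.0],
])

def camera_point_to_lidar(p_cam: np.ndarray) -> np.ndarray:
    """Express a 3D point from the camera frame in the LiDAR frame."""
    p_h = np.append(p_cam, 1.0)            # homogeneous coordinates
    return (T_LIDAR_CAMERA @ p_h)[:3]

def fuse_centroids(lidar_xy: np.ndarray, cam_point: np.ndarray,
                   w_lidar: float = 0.7) -> np.ndarray:
    """Blend the LiDAR cluster centroid with the projected RGB-D centroid (x, y only).

    The weight is a hypothetical confidence split; the LiDAR usually gives the
    sharper range estimate, while the camera helps disambiguate the cluster.
    """
    cam_xy = camera_point_to_lidar(cam_point)[:2]
    return w_lidar * lidar_xy + (1.0 - w_lidar) * cam_xy

# Example: object centroid 2.0 m in front of the camera, slightly to the left.
print(fuse_centroids(np.array([2.18, 0.12]), np.array([-0.10, 0.20, 2.00])))
```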

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Van Toan, N., Bach, SH. & Yi, SY. A hierarchical approach for updating targeted person states in human-following mobile robots. Intel Serv Robotics 16, 287–306 (2023). https://doi.org/10.1007/s11370-023-00463-9


Keywords

Navigation