Realtime Human-UAV Interaction Using Deep Learning

  • Conference paper
Biometric Recognition (CCBR 2017)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10568)

Abstract

In this paper, we propose a real-time human gesture identification method for controlling a micro UAV in a GPS-denied environment. Exploiting the breakthrough of deep convolutional networks in computer vision, we develop a robust Human-UAV Interaction (HUI) system that detects and identifies a person's gesture to control a micro UAV in real time. We also build a new dataset with 23 participants to train or fine-tune the deep neural networks for human gesture detection. Based on the collected dataset, the state-of-the-art YOLOv2 detection network is tailored to detect the locations of a person's face and two hands. Then, an interpreter approach is proposed to infer the gesture from the detection results, in which each interpreted gesture corresponds to a UAV flight command. Real flight experiments performed by non-expert users with the Bebop 2 micro UAV validate our proposal for HUI. The gesture detection deep model, together with a demo, will be made publicly available to aid further research.
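The interpreter step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual gesture set, geometric rules, or command names, so everything below is an assumption made for illustration: the `Box` type, the thresholds, the gesture labels, and the `COMMANDS` table are hypothetical and are not the authors' method. The sketch only shows the general idea of mapping one face box and two hand boxes to a flight command.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Detection box in image pixels; cx/cy is the center, y grows downward."""
    cx: float
    cy: float
    w: float
    h: float


# Hypothetical gesture-to-command table (not taken from the paper).
COMMANDS = {
    "both_up": "ascend",
    "both_down": "descend",
    "left_out": "move_left",
    "right_out": "move_right",
    "neutral": "hover",
}


def interpret(face: Optional[Box], hands: List[Box]) -> str:
    """Map one face box and two hand boxes to a command string.

    Falls back to "hover" when the detections are incomplete, so a missed
    detection never triggers a motion command.
    """
    if face is None or len(hands) != 2:
        return COMMANDS["neutral"]

    # Order hands by image x-coordinate: "left" means image-left here,
    # which is the person's right when they face the camera.
    left, right = sorted(hands, key=lambda b: b.cx)

    def is_up(hand: Box) -> bool:
        # Assumed rule: hand center above the top edge of the face box.
        return hand.cy < face.cy - 0.5 * face.h

    def is_out(hand: Box) -> bool:
        # Assumed rule: hand center more than two face widths to the side.
        return abs(hand.cx - face.cx) > 2.0 * face.w

    def is_down(hand: Box) -> bool:
        # Assumed rule: hand center more than two face heights below the face.
        return hand.cy > face.cy + 2.0 * face.h

    if is_up(left) and is_up(right):
        gesture = "both_up"
    elif is_out(left) and not is_out(right):
        gesture = "left_out"
    elif is_out(right) and not is_out(left):
        gesture = "right_out"
    elif is_down(left) and is_down(right):
        gesture = "both_down"
    else:
        gesture = "neutral"
    return COMMANDS[gesture]
```

In a real pipeline the boxes would come from the detector's per-frame output, and the returned string would be translated into a velocity or takeoff/land message for the UAV. Expressing the thresholds in face-box units is one simple way to keep such rules roughly independent of the person's distance from the camera.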

Acknowledgments

The work was supported in part by the Natural Science Foundation of China under Contracts 61672079, 61473086 and 61601466. The work of B. Zhang was supported in part by the Program for New Century Excellent Talents in University within the Ministry of Education, China, and in part by the Beijing Municipal Science and Technology Commission under Grant Z161100001616005. Baochang Zhang is the corresponding author.

Author information

Corresponding author

Correspondence to Baochang Zhang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Maher, A., Li, C., Hu, H., Zhang, B. (2017). Realtime Human-UAV Interaction Using Deep Learning. In: Zhou, J., et al. Biometric Recognition. CCBR 2017. Lecture Notes in Computer Science, vol 10568. Springer, Cham. https://doi.org/10.1007/978-3-319-69923-3_55

  • DOI: https://doi.org/10.1007/978-3-319-69923-3_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69922-6

  • Online ISBN: 978-3-319-69923-3

  • eBook Packages: Computer Science, Computer Science (R0)
