Abstract
We describe the design and validation of a vision-based system for the dynamic recognition of ramp signals performed by airport ground staff. Such a recognizer increases the autonomy of unmanned vehicles and prevents errors caused by visual misinterpretation or lapses of attention by the pilots of manned vehicles. The system is based on supervised machine learning techniques, trained on our own dataset, and comprises two models. The first model combines a pre-trained Convolutional Pose Machine with a classifier, for which we evaluated two options: a Random Forest and a Multi-Layer Perceptron. The second model is a single Convolutional Neural Network that classifies the gestures directly from the raw images. In our experiments, the first model proved more accurate and scalable than the second. Its strength lies in its greater capacity to extract information from the images, transforming the pixel domain into spatial vectors and thereby increasing the robustness of the classification layer. The second model, in turn, is better suited to gesture identification in low-visibility environments, such as night operations, under which the first model struggled to segment the operator's silhouette. Our results support the use of supervised learning and computer vision techniques for the correct identification and classification of ramp hand signals performed by airport marshallers.
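To make the first model's two-stage architecture concrete, the minimal sketch below shows how body keypoints produced by a pose estimator (such as a Convolutional Pose Machine) could be flattened into spatial vectors and classified with a Random Forest. This is an illustration under assumed placeholders, not the authors' implementation: the joint count, the number of gesture classes, and the randomly generated arrays stand in for real pose data and labels.

# Sketch of the first model's classification stage (hypothetical data).
# Keypoints from a pre-trained pose estimator replace raw pixels:
# each sample becomes a flat vector of (x, y) joint coordinates.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_joints = 200, 14                     # 14 joints: assumed, not from the paper
keypoints = rng.random((n_samples, n_joints, 2))  # (x, y) per joint, placeholder data
labels = rng.integers(0, 9, size=n_samples)       # 9 signal classes: assumed

X = keypoints.reshape(n_samples, -1)              # pixel domain -> spatial vectors
X_train, X_test, y_train, y_test = train_test_split(X, labels, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

A Multi-Layer Perceptron (e.g. scikit-learn's MLPClassifier) could be swapped in for the Random Forest without changing the rest of the pipeline; in both cases the classifier operates on spatial vectors rather than raw pixels, which is what the abstract credits for the first model's robustness.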
Data Availability
A repository with the dataset, the software, the figures, and the demonstration videos is publicly available at: https://github.com/astromaf/ramp_hand_signals_recognition
Change history
20 May 2023
Missing Open Access funding information has been added in the Funding Note.
Funding
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Author information
Contributions
Miguel Ángel de Frutos (MAdF) initiated this project during his Master of Science at the Universidad Internacional de la Rioja (UNIR) and carried out the subsequent data analysis and manuscript preparation while pursuing his doctoral research at the Universidad Politécnica de Madrid (UPM). Fernando López Hernández (UCM) and J. Javier Rainer (UNIR) supervised the preparation of the manuscript. All authors discussed and approved the final manuscript.
Ethics declarations
Ethics Approval
The authors declare that this work is original and does not include experiments with animals.
Consent to Participate
All individuals participating in the study provided informed consent. The captured information has nonetheless been adequately anonymized.
Consent for Publication
The participants in the experiments provided informed consent for publication of the related images. Nevertheless, their faces or any other biometric data cannot be recognised in the relevant images.
Conflict of Interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Categories (6), (7).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
de Frutos Carro, M.Á., López Hernández, F.C. & Granados, J.J.R. Real-Time Visual Recognition of Ramp Hand Signals for UAS Ground Operations. J Intell Robot Syst 107, 44 (2023). https://doi.org/10.1007/s10846-023-01832-3