Abstract
This paper presents the theoretical background and implementation of a Long Short-Term Memory (LSTM) neural network architecture that recognizes arm movements from video clips. Pose points corresponding to the positions of six body parts (shoulders, elbows, and wrists) are extracted with a pre-trained Convolutional Pose Machine. Over time, these points form sequences with 66 (x, y) pairs, which are fed to the neural network and classified into 20 movement classes. Our architecture has 128 LSTM cells and achieved \(92.5\%\) accuracy on test data, with an execution time of around 6.64 ms.
Moreover, we present the methodology used to create our dataset, which contains 2400 samples of 20 different arm movements performed by 6 people with different physical appearances and recorded in a controlled environment.
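The classification stage summarized in the abstract (a sequence of pose points passed through 128 LSTM cells and mapped to 20 classes) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the abstract does not name a framework, the per-frame feature layout, or how the output is read out. The sketch assumes each frame is flattened to 12 values (six keypoints, x and y), uses a single LSTM layer whose last hidden state feeds a softmax layer, and runs on untrained random weights and a synthetic 10-frame clip. All function and variable names are hypothetical.

```python
import math
import random

NUM_KEYPOINTS = 6                  # shoulders, elbows, wrists (from the abstract)
NUM_FEATURES = 2 * NUM_KEYPOINTS   # assumption: each frame flattened to 12 values
HIDDEN_UNITS = 128                 # LSTM cells (from the abstract)
NUM_CLASSES = 20                   # movement classes (from the abstract)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def affine(weights, bias, vector):
    # Computes weights @ vector + bias on plain Python lists.
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def init_params(rng):
    # Hypothetical, untrained parameters; a real model would learn these.
    params = {}
    for gate in ("i", "f", "o", "g"):
        params["W_" + gate] = random_matrix(
            HIDDEN_UNITS, NUM_FEATURES + HIDDEN_UNITS, rng)
        params["b_" + gate] = [0.0] * HIDDEN_UNITS
    params["W_out"] = random_matrix(NUM_CLASSES, HIDDEN_UNITS, rng)
    params["b_out"] = [0.0] * NUM_CLASSES
    return params


def lstm_step(params, x, h, c):
    z = x + h  # concatenate the current frame with the previous hidden state
    i = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]    # input gate
    f = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]    # forget gate
    o = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]    # output gate
    g = [math.tanh(v) for v in affine(params["W_g"], params["b_g"], z)]  # candidate
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def classify(params, frames):
    # Run the LSTM over all frames, then apply softmax to the last hidden state.
    h = [0.0] * HIDDEN_UNITS
    c = [0.0] * HIDDEN_UNITS
    for frame in frames:
        h, c = lstm_step(params, frame, h, c)
    logits = affine(params["W_out"], params["b_out"], h)
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
params = init_params(rng)
# Synthetic clip: 10 frames of normalized keypoint coordinates (made-up data).
clip = [[rng.random() for _ in range(NUM_FEATURES)] for _ in range(10)]
probs = classify(params, clip)
predicted_class = max(range(NUM_CLASSES), key=lambda k: probs[k])
```

With random weights the predicted class is meaningless; the sketch only shows how a variable-length keypoint sequence is reduced to one probability vector over the 20 classes. In practice a deep learning framework would replace the hand-written gates and supply training.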
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Rey, A., Ruiz, A., Camacho, C., Higuera, C. (2021). Vision Based Upper Limbs Movement Recognition Using LSTM Neural Network. In: Cortes Tobar, D., Hoang Duy, V., Trong Dao, T. (eds) AETA 2019 - Recent Advances in Electrical Engineering and Related Sciences: Theory and Application. AETA 2019. Lecture Notes in Electrical Engineering, vol 685. Springer, Cham. https://doi.org/10.1007/978-3-030-53021-1_38
DOI: https://doi.org/10.1007/978-3-030-53021-1_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-53020-4
Online ISBN: 978-3-030-53021-1
eBook Packages: Intelligent Technologies and Robotics (R0)