Abstract
A gait history image (GHI) is a spatial template that accumulates regions of motion into a single image in which moving pixels appear brighter than static ones. A new descriptor, the time-sliced averaged gradient boundary magnitude (TAGBM), is also designed to capture the temporal variations of motion. With these templates, the spatial and temporal information of each video can be condensed. The recent success of deep learning architectures for human activity recognition encouraged us to explore the effectiveness of combining them with these templates. Based on this idea, a new method is proposed in this paper. Each video is split into N and M groups of consecutive frames, and a GHI and a TAGBM are computed for each group, resulting in spatial and temporal templates. Transfer learning with fine-tuning is used to classify these templates. The proposed method achieves recognition accuracies of 96.5%, 92.7%, 97.13% and 86.6% on the KTH, UCF Sports, UCF-11 and Olympic Sports action datasets, respectively. It is also compared with state-of-the-art approaches, and the results show that the proposed method performs best.
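The GHI described above belongs to the motion-history-image family of temporal templates: motion regions are accumulated across frames so that more recent movement appears brighter. A minimal sketch of such an accumulation, assuming a simple frame-differencing rule with a hypothetical `diff_threshold` and linear `decay` parameterisation (the paper's exact formulation may differ):

```python
import numpy as np

def gait_history_image(frames, diff_threshold=20, decay=1.0):
    """Accumulate motion regions so that recent movement is brightest.

    frames: sequence of grayscale frames of shape (H, W), uint8.
    diff_threshold and decay are illustrative parameters, not taken
    from the paper: the threshold marks "moving" pixels, and decay
    controls how quickly older motion fades.
    """
    frames = np.asarray(frames, dtype=np.float32)  # avoid uint8 underflow
    ghi = np.zeros_like(frames[0])
    for t in range(1, len(frames)):
        # moving pixels: large absolute difference between consecutive frames
        motion = np.abs(frames[t] - frames[t - 1]) > diff_threshold
        ghi = np.maximum(ghi - decay, 0.0)  # fade older motion
        ghi[motion] = float(t)              # stamp recent motion brighter
    # normalise to [0, 255] so the template can be fed to a CNN as an image
    if ghi.max() > 0:
        ghi = ghi / ghi.max() * 255.0
    return ghi.astype(np.uint8)
```

Computed once per group of consecutive frames, such a template condenses a whole clip segment into a single image suitable as input to a pre-trained network.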
Cite this article
Zebhi, S., AlModarresi, S.M.T. & Abootalebi, V. Human activity recognition using pre-trained network with informative templates. Int. J. Mach. Learn. & Cyber. 12, 3449–3461 (2021). https://doi.org/10.1007/s13042-021-01383-9