Abstract
Computer vision is a broad research area concerned with extracting useful information from images or sequences of images. Human activity recognition is one of its actively researched subfields, with wide application in both research and practice. This paper proposes a two-model approach that combines a convolutional neural network (CNN), applied through transfer learning, with a long short-term memory (LSTM) model. The CNN extracts the feature vectors for each video, and the LSTM network classifies the video's activity. Example activity classes include bench press, horse riding, and basketball dunk. The proposed approach achieved an accuracy of 94.2%.
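The two-stage flow described in the abstract (per-frame features, then a recurrent classifier over the frame sequence) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `frame_features` is a hypothetical stand-in for the pretrained CNN, the weights are random and untrained, and the sizes `FEAT_DIM`, `HIDDEN`, and `NUM_CLASSES` are arbitrary choices for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def frame_features(frame, dim):
    """Hypothetical stand-in for the CNN: average a flat pixel list into `dim` chunks."""
    size = len(frame) // dim
    return [sum(frame[i * size:(i + 1) * size]) / size for i in range(dim)]


def init_lstm(in_dim, hidden, rng):
    """Random (untrained) weights for the input, forget, candidate, and output gates."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {gate: (mat(hidden, in_dim + hidden), [0.0] * hidden) for gate in "ifgo"}


def lstm_step(params, x, h, c):
    """One standard LSTM update given input x, hidden state h, and cell state c."""
    z = x + h  # concatenate the input and the previous hidden state

    def gate(name, activation):
        W, b = params[name]
        return [activation(a + bias) for a, bias in zip(matvec(W, z), b)]

    i = gate("i", sigmoid)
    f = gate("f", sigmoid)
    g = gate("g", math.tanh)
    o = gate("o", sigmoid)
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def classify_video(frames, params, W_out, feat_dim, hidden):
    """Run the LSTM over per-frame features and score classes from the final state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for frame in frames:
        x = frame_features(frame, feat_dim)
        h, c = lstm_step(params, x, h, c)
    return softmax(matvec(W_out, h))


# Illustrative sizes only; none of these values come from the paper.
FEAT_DIM, HIDDEN, NUM_CLASSES = 8, 16, 3
rng = random.Random(0)
params = init_lstm(FEAT_DIM, HIDDEN, rng)
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in range(NUM_CLASSES)]

# A fake "video": 10 frames of 64 random pixel values each.
video = [[rng.random() for _ in range(64)] for _ in range(10)]
probs = classify_video(video, params, W_out, FEAT_DIM, HIDDEN)
```

Here `probs` is a probability distribution over the hypothetical classes. In a real system the feature extractor would be a pretrained image network and the LSTM and output weights would be learned from labelled videos.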
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Pandya, M., Pillai, A., Rupani, H. (2021). Segregating and Recognizing Human Actions from Video Footages Using LRCN Technique. In: Hassanien, A., Bhatnagar, R., Darwish, A. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2020. Advances in Intelligent Systems and Computing, vol 1141. Springer, Singapore. https://doi.org/10.1007/978-981-15-3383-9_1
DOI: https://doi.org/10.1007/978-981-15-3383-9_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3382-2
Online ISBN: 978-981-15-3383-9
eBook Packages: Intelligent Technologies and Robotics (R0)