Abstract
The broad applicability of advanced human-computer interaction technology in daily life has drawn researchers toward the development of more intelligent autonomous systems. Such systems can succeed in real-life settings only by closing the gaps among existing techniques. This work focuses on one prominent human-computer interaction application: human activity recognition. Human activity recognition (HAR) is the process of detecting human gestures, actions, and different types of interactions. The HAR process requires sound knowledge of day-to-day human activities and advanced technology to recognize them. Conventional pattern recognition techniques can recognize human activities using machine learning, but only within controlled environments and for a limited set of actions. In recent years, deep learning techniques have emerged that learn the deep attributes of the problem domain and deliver promising performance. The present paper offers a systematic review of deep learning models for video-based human activity recognition. It describes recent developments in the field through an analysis of different models, discusses the human activity recognition process and the eminent datasets available for experimentation, and concludes with a summary of the work and future directions in the field.
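A typical deep-learning pipeline for video-based HAR combines a per-frame spatial feature extractor (usually a convolutional backbone) with a temporal model that aggregates features across frames before classification. The following is a minimal NumPy sketch of that data flow only, with random weights and hypothetical dimensions; the linear map stands in for a trained CNN and mean pooling stands in for an LSTM/GRU, so it is an illustration of the structure, not any specific model from the literature.

```python
import numpy as np

rng = np.random.default_rng(0)

def frame_features(frame, W):
    """Per-frame feature extractor: a single linear map plus ReLU,
    standing in for a convolutional backbone."""
    return np.maximum(0.0, frame.ravel() @ W)

def temporal_pool(feats):
    """Aggregate per-frame features over time; mean pooling stands in
    for a recurrent temporal model such as an LSTM."""
    return feats.mean(axis=0)

def classify(video, W, Wc):
    """Map a clip (frames x H x W) to a probability over action classes."""
    feats = np.stack([frame_features(f, W) for f in video])
    clip_feat = temporal_pool(feats)
    logits = clip_feat @ Wc
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    return probs / probs.sum()

# Hypothetical clip: 16 frames of 32x32 grayscale video, 5 action classes.
video = rng.standard_normal((16, 32, 32))
W = rng.standard_normal((32 * 32, 64)) * 0.01   # "backbone" weights
Wc = rng.standard_normal((64, 5)) * 0.01        # classifier weights

probs = classify(video, W, Wc)
print(probs.shape)  # (5,)
```

In a real system each stand-in is replaced by a trained deep network, and the whole pipeline is optimized end to end on a labeled video dataset.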
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jindal, S., Sachdeva, M., Kushwaha, A.K.S. (2022). Deep Learning for Video Based Human Activity Recognition: Review and Recent Developments. In: Bansal, R.C., Zemmari, A., Sharma, K.G., Gajrani, J. (eds) Proceedings of International Conference on Computational Intelligence and Emerging Power System. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-16-4103-9_7
Print ISBN: 978-981-16-4102-2
Online ISBN: 978-981-16-4103-9
eBook Packages: Intelligent Technologies and Robotics