
Deep Learning for Video Based Human Activity Recognition: Review and Recent Developments

  • Conference paper
  • First Online:
Proceedings of International Conference on Computational Intelligence and Emerging Power System

Abstract

The broad applicability of advanced human-computer interaction technology in daily life has drawn researchers toward the development of more intelligent autonomous systems. Such systems can succeed in real-life settings only by addressing the gaps in existing techniques. This work focuses on one prominent human-computer interaction application: human activity recognition. Human activity recognition (HAR) is the process of detecting human gestures, actions, and different types of interactions. The HAR process requires thorough knowledge of day-to-day human activities and advanced technology to recognize those activities. Conventional pattern recognition techniques can recognize human activities using machine learning, but only within controlled environments and for a limited set of actions. In recent years, deep learning techniques have emerged that can learn deep attributes of the problem at hand and determine outcomes with promising performance. This paper presents a systematic review of deep learning models for video-based human activity recognition. It describes recent developments in the field through an analysis of different models, discusses the human activity recognition process and the prominent datasets available for experimentation, and concludes with a summary of the work and future directions in the field.



Author information

Correspondence to Sonika Jindal.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Jindal, S., Sachdeva, M., Kushwaha, A.K.S. (2022). Deep Learning for Video Based Human Activity Recognition: Review and Recent Developments. In: Bansal, R.C., Zemmari, A., Sharma, K.G., Gajrani, J. (eds) Proceedings of International Conference on Computational Intelligence and Emerging Power System. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-16-4103-9_7
