
Human action recognition using deep rule-based classifier

Published in: Multimedia Tools and Applications

Abstract

In recent years, numerous techniques have been proposed for human activity recognition (HAR) from images and videos. These techniques fall into two major categories: handcrafted and deep learning based. Deep learning models have produced remarkable results for HAR; however, they suffer from several shortcomings, such as the requirement for massive amounts of training data, their offline nature, and the lack of transparency and interpretability of their internal parameters. In this paper, a new approach for HAR is proposed, consisting of an interpretable, self-evolving, and self-organizing set of 0-order IF...THEN rules. The approach is entirely data-driven and non-parametric: prototypes are identified automatically during the training process. To demonstrate its effectiveness, a set of high-level features is extracted using a pre-trained deep convolutional neural network, and a recently introduced deep rule-based (DRB) classifier is applied for classification. Experiments on the challenging UCF50 benchmark dataset confirm that the proposed approach outperforms state-of-the-art methods. In addition, an ablation study compares the performance of the DRB classifier against four state-of-the-art classifiers; this analysis reveals that the DRB classifier can outperform them even with limited training samples.
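To make the idea of a self-evolving, prototype-based 0-order rule system concrete, the sketch below implements a minimal classifier of that flavor in plain Python: each class accumulates prototypes (the IF part of a rule), a training sample becomes a new prototype only when it is insufficiently similar to the existing ones, and prediction is winner-takes-all over per-class similarity. This is an illustrative toy, not the published DRB formulation; the Cauchy-style similarity, the `novelty_threshold` parameter, and the 2-D toy features are assumptions for demonstration (in the paper, the inputs would be high-level features from a pre-trained CNN).

```python
class ZeroOrderRuleClassifier:
    """Minimal prototype-based 0-order IF...THEN rule classifier.

    Each class holds a list of prototypes (the IF part of a rule); the
    THEN part is simply the class label. A training sample is promoted
    to a new prototype when it is not similar enough to any existing
    prototype of its class, so the rule base grows ("self-evolves")
    from the data without fixing the number of rules in advance.
    NOTE: the similarity measure and novelty threshold here are
    illustrative assumptions, not the authors' exact method.
    """

    def __init__(self, novelty_threshold=0.5):
        self.prototypes = {}  # label -> list of prototype vectors
        self.novelty_threshold = novelty_threshold

    @staticmethod
    def _similarity(a, b):
        # Cauchy-style kernel: 1 at zero distance, decaying with it.
        dist2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return 1.0 / (1.0 + dist2)

    def fit_one(self, x, label):
        # Add x as a new prototype (rule) only if it is novel enough.
        protos = self.prototypes.setdefault(label, [])
        if not protos or max(self._similarity(x, p) for p in protos) < self.novelty_threshold:
            protos.append(list(x))

    def predict(self, x):
        # Winner-takes-all over the best per-class prototype similarity.
        best_label, best_sim = None, -1.0
        for label, protos in self.prototypes.items():
            sim = max(self._similarity(x, p) for p in protos)
            if sim > best_sim:
                best_label, best_sim = label, sim
        return best_label


# Toy usage with hypothetical 2-D "features" for two action classes.
clf = ZeroOrderRuleClassifier(novelty_threshold=0.5)
for x, y in [([0.0, 0.0], "walk"), ([0.1, 0.0], "walk"), ([5.0, 5.0], "jump")]:
    clf.fit_one(x, y)
print(clf.predict([0.05, 0.05]))  # → walk
```

Note how the second "walk" sample is absorbed rather than stored (it is similar to the first prototype), which is the non-parametric, data-driven behavior the abstract refers to: the number of rules is determined by the data, not preset.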

[Figures 1–5 appear in the full article.]


Author information

Correspondence to Allah Bux Sargano.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article


Sargano, A.B., Gu, X., Angelov, P. et al. Human action recognition using deep rule-based classifier. Multimed Tools Appl 79, 30653–30667 (2020). https://doi.org/10.1007/s11042-020-09381-9
