Human activity recognition (HAR) is an important research area in computer vision because of its wide range of applications. Over the past decade in particular, its use has grown enormously in areas such as human-computer interaction, intelligent video surveillance, ambient assisted living, entertainment, human-robot interaction, and intelligent transportation systems. This review provides a comprehensive survey of the state of the art across the different phases of HAR. Techniques for segmenting the image into physical objects, extracting features, and classifying activities are reviewed and compared. The paper concludes with research challenges and future directions.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Bux, A., Angelov, P., Habib, Z. (2017). Vision Based Human Activity Recognition: A Review. In: Angelov, P., Gegov, A., Jayne, C., Shen, Q. (eds) Advances in Computational Intelligence Systems. Advances in Intelligent Systems and Computing, vol 513. Springer, Cham. https://doi.org/10.1007/978-3-319-46562-3_23
DOI: https://doi.org/10.1007/978-3-319-46562-3_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46561-6
Online ISBN: 978-3-319-46562-3
eBook Packages: Engineering, Engineering (R0)