A survey on activity recognition and behavior understanding in video surveillance

  • Original Article
  • The Visual Computer

Abstract

This paper provides a comprehensive survey of activity recognition in video surveillance. It starts with a description of simple and complex human activities and their applications, which are manifold, ranging from visual surveillance through content-based retrieval to human-computer interaction. The paper is organized to cover all aspects of the general framework of human activity recognition, and it summarizes and categorizes recently published research within that framework. Finally, it provides an overview of benchmark databases for activity recognition, a market analysis of video surveillance, and future directions for this application.

References

  1. Aggarwal, J.K., Cai, Q.: Human motion analysis: a review. Comput. Vis. Image Underst. 73(3), 428–440 (1999)

    Google Scholar 

  2. Aggarwal, J.K., Ryoo, M.S.: Human activity analysis: a review. ACM Comput. Surv. 43(3), 1–43 (2011)

    Google Scholar 

  3. Aha, D.W., Kibler, D., Albert, M.K.: Instance-based learning algorithms. Mach. Learn. 06, 37–66 (1991)

    Google Scholar 

  4. Allili, M.S., Bouguila, N., Ziou, D.: A robust video foreground segmentation by using generalized Gaussian mixture modeling. In: 4th Canadian Conf. on Computer and Robot Vision, pp. 503–509 (2007)

    Google Scholar 

  5. Bayona, A., SanMiguel, J.C., Martínez, J.M.: Stationary foreground detection using background subtraction and temporal difference in video surveillance. In: IEEE 17th Int. Conf. on Image Processing, pp. 1–4 (2010)

    Google Scholar 

  6. Blunsom, P.: Hidden Markov models. Tech. rep, Human Language Technology University of Melbourne, Victoria, Australia (2004). http://www.cs.mu.oz.au/460/2004/materials/hmm-tutorial.pdf

  7. Bobick, A.F., Davis, J.W.: The recognition of human movement using temporal templates. IEEE Trans. Pattern Anal. Mach. Intell. 23(3), 257–267 (2001)

    Google Scholar 

  8. Bobick, A.F., Wilson, A.D.: A state-based approach to the representation and recognition of gesture. IEEE Trans. Pattern Anal. Mach. Intell. 19(12), 1325–1337 (1997)

    Google Scholar 

  9. Bose, B., Grimson, E.: Improving object classification in far-field video. In: Proc. of the Int. Conf. on Computer Vision and Pattern Recognition, pp. 181–188. IEEE Computer Society, Washington (2004)

    Google Scholar 

  10. Brown, L.M.: View independent vehicle/person classification. In: Proc. of the ACM 2nd Int. Workshop on Video Surveillance & Sensor Networks, pp. 114–123. ACM Press, New York (2004)

    Google Scholar 

  11. Bucak, S.S., Gunsel, B., Gursoy, O.: Incremental nonnegative matrix factorization for background modeling in surveillance video. In: IEEE 15th Signal Processing and Communications Applications (SIU), pp. 1–4 (2007)

    Google Scholar 

  12. Cai, L., He, L., Yamashita, T., Xu, Y., Zhao, Y., Yang, X.: Robust contour tracking by combining region and boundary information. IEEE Trans. Circuits Syst. Video Technol. 21(12), 1784–1794 (2011)

    Google Scholar 

  13. Campbell, L., Bobick, A.: Recognition of human body motion using phase space constraints. In: ICCV, pp. 624–630 (1995)

    Google Scholar 

  14. Camplani, M., Salgado, L.: Adaptive background modeling in multicamera system for real-time object detection. Opt. Eng. 50(12), 1–17 (2011)

    Google Scholar 

  15. Cavallaro, A., Steiger, O., Ebrahimi, T.: Tracking video objects in cluttered background. IEEE Trans. Circuits Syst. Video Technol. 15(4), 575–584 (2005)

    Google Scholar 

  16. Chai, Y., Shin, S., Chang, K., Kim, T.: Real-time user interface using particle filter with integral histogram. IEEE Trans. Consum. Electron. 56(2), 510–515 (2010)

    Google Scholar 

  17. Chang, S.F.: The holy grail of content-based media analysis. IEEE Multimed. 9(2), 6–10 (2002)

    Google Scholar 

  18. Chen, L., Yang, H., Takaki, T., Ishii, I.: Real-time frame-straddling-based optical flow detection. In: Proc. of IEEE Int. Conf. on Robotics and Biomimetics, pp. 2447–2452 (2011)

    Google Scholar 

  19. Chen, Q., Sun, Q.S., Heng, P.A., Xia, D.S.: Two-stage object tracking method based on kernel and active contour. IEEE Trans. Circuits Syst. Video Technol. 20(4), 605–609 (2010)

    Google Scholar 

  20. Chen, Y., Zhang, L., Lin, B., Xu, Y., Ren, X.: Fighting detection based on optical flow context histogram. In: Proc. of IEEE 2nd Int. Conf. on Innovations in Bio-inspired Computing and Applications, pp. 95–98 (2011)

    Google Scholar 

  21. Cheng, F.H., Chen, Y.L.: Real time multiple objects tracking and identification based on discretewavelet transform. Pattern Recognit. 39, 1126–1139 (2006)

    MathSciNet  MATH  Google Scholar 

  22. Cheung, K., Baker, S., Kanade, T.: Shape-from-silhouette across time part II: applications to human modeling and markerless motion tracking. Int. J. Comput. Vis. 63(3), 225–245 (2005)

    Google Scholar 

  23. Chiverton, J., Mirmehdi, M., Xie, X.: On-line learning of shape information for object segmentation and tracking. In: Proc. of British Machine Vision Conference, pp. 1–11 (2009)

    Google Scholar 

  24. Chiverton, J., Xie, X., Mirmehdi, M.: Automatic bootstrapping and tracking of object contours. IEEE Trans. Image Process. 21(3), 1231–1245 (2012)

    MathSciNet  Google Scholar 

  25. Chomat, O., Crowley, J.L.: Probabilistic recognition of activity using local appearance. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 637–663 (1999)

    Google Scholar 

  26. Cohen, C.J., Morelli, F., Scott, K.A.: A surveillance system for recognition of intent within individuals and crowds. In: Conf. on Technologies for Homeland Security, Waltham, MA, pp. 559–565. IEEE Press, New York (2008)

    Google Scholar 

  27. Cohen, W.W.: Fast effective rule induction. In: Proc. of 12th Int. Conf. on Machine Learning, pp. 115–123. Morgan Kaufmann, San Mateo (1995)

    Google Scholar 

  28. Coifman, B., Beymer, D., McLauchlan, P., Malik, J.: A real-time computer vision system for vehicle tracking and traffic surveillance. Transp. Res., Part C, Emerg. Technol. 6(4), 271–288 (1998)

    Google Scholar 

  29. Collins, R.T., Lipton, A.J., Kanade, T., Fujiyoshi, H., Duggins, D., Tsin, Y., Tolliver, D., Enomoto, N., Hasegawa, O., Burt, P., Wixson, L.: A system for video surveillance and monitoring. Tech. rep, Robotics Institute at Carnegie Mellon University (2000)

  30. Comaniciu, D., Ramesh, V., Meer, P.: Kernel-based object tracking. IEEE Trans. Pattern Anal. Mach. Intell. 25(5), 564–577 (2003)

    Google Scholar 

  31. Cupillard, F., Bremond, F., Thonnat, M.: Group behavior recognition with multiple cameras. In: Proc. 6th IEEE Workshop on Applications of Computer Vision, pp. 177–183 (2002)

    Google Scholar 

  32. Cutler, R., Davis, L.S.: Robust real-time periodic motion detection, analysis, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 781–796 (2000)

    Google Scholar 

  33. Dai, P., Di, H., Dong, L., Tao, L., Xu, G.: Group interaction analysis in dynamic context. IEEE Trans. Syst. Man Cybern. 38(1), 275–282 (2008)

    Google Scholar 

  34. Damen, D., Hogg, D.: Recognizing linked events: searching the space of feasible explanations. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 927–934 (2009)

    Google Scholar 

  35. Darrell, T., Pentland, A.: Space-time gestures. In: Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, pp. 335–340 (1993)

    Google Scholar 

  36. Denman, S., Chandran, V., Sridharan, S.: Adaptive optical flow for person tracking. In: Proc. of the Digital Imaging Computing: Techniques and Applications, DICTA ’05, pp. 1–7 (2005)

    Google Scholar 

  37. Denman, S., Chandran, V., Sridharan, S.: An adaptive optical flow technique for person tracking systems. Pattern Recognit. Lett. 28(10), 1232–1239 (2007)

    Google Scholar 

  38. Denman, S., Fookes, C., Sridharan, S.: Improved simultaneous computation of motion detection and optical flow for object tracking. In: IEEE Digital Image Computing: Techniques and Applications, pp. 175–182 (2009)

    Google Scholar 

  39. Dollar, P., Rabaud, V., Cottrell, G., Belongie, S.: Behavior recognition via sparse spatio-temporal features. In: Int. Conf. on Computer Communications and Networks, vol. 14, pp. 65–72. IEEE Press, New York (2005)

    Google Scholar 

  40. Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. Wiley, Stanford Research Institute, Menlo Park (1973)

    MATH  Google Scholar 

  41. Efros, A.A., Berg, A.C., Mori, G., Malik, J.: Recognizing action at a distance. In: Proc. 9th IEEE Int. Conf. on Computer Vision, vol. 2, pp. 726–733 (2003)

    Google Scholar 

  42. Elgammal, A., Harwood, D., Davis, L.: Non-parametric model for background subtraction. In: Frame-Rate Workshop, pp. 751–767. IEEE Press, New York (2000)

    Google Scholar 

  43. Fazli, S., Pour, H.M., Bouzari, H.: Multiple object tracking using improved GMM based motion segmentation. In: IEEE ECTI-CON, vol. 2, pp. 1130–1133 (2009)

    Google Scholar 

  44. Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 264–271 (2003)

    Google Scholar 

  45. Filipovych, R., Ribeiro, E.: Combining models of pose and dynamics for human motion recognition. In: 3rd International Springer Symposium on Advances in Visual Computing, Aberdeen, Scotland, pp. 21–32 (2007)

    Google Scholar 

  46. Forsyth, D.A., Arikan, O., Ikemoto, L., O’Brien, J., Ramanan, D.: Computational studies of human motion: part 1, tracking and motion synthesis. Found. Trends Comput. Graph. Vis. 1(02/03), 77–254 (2005)

    Google Scholar 

  47. Gallagher, M., Downs, T.: Visualization of learning in multilayer perceptron networks using principal component analysis. IEEE Trans. Syst. Man Cybern. 33, 28–34 (2003)

    Google Scholar 

  48. Gavrilla, D., Davis, L.: 3D Model-based tracking of humans in action: a multi-view approach. In: Int. Proc. of the Computer Vision and Pattern Recognition, pp. 73–80 (1996)

    Google Scholar 

  49. Ghanem, N., DeMenthon, D., Doermann, D., Davis, L.: Representation and recognition of events in surveillance video using Petri nets. In: Conf. on Computer Vision and Pattern Recognition Workshop, pp. 112–121 (2004)

    Google Scholar 

  50. Gilbert, A., Illingworth, J., Bowden, R.: Fast realistic multi-action recognition using mined dense spatio-temporal features. In: IEEE 12th Int. Conf. on Computer Vision, pp. 925–931 (2009)

    Google Scholar 

  51. Girisha, R., Murali, S.: Tracking humans using novel optical flow algorithm for surveillance videos. In: Proceedings of the 4th Annual ACM Bangalore Conf., COMPUTE ’11, pp. 1–8 (2011)

    Google Scholar 

  52. Gong, S., Xiang, T.: Recognition of group activities using dynamic probabilistic networks. In: Proc. 9th IEEE Int. Conf. on Computer Vision, vol. 2, pp. 742–749 (2003)

    Google Scholar 

  53. Gorelick, L., Blank, M., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. IEEE Trans. Pattern Anal. Mach. Intell. 29(12), 2247–2253 (2007)

    Google Scholar 

  54. Gupta, A., Davis, L.S.: Objects in action: an approach for combining action understanding and object perception. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

    Google Scholar 

  55. Gupta, A., Srinivasan, P., Shi, J., Davis, L.S.: Understanding videos, constructing plots learning a visually grounded storyline model from annotated videos. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 2012–2019 (2009)

    Google Scholar 

  56. Haritaoglu, I., Harwood, D., Davis, L.S.: W 4: real-time surveillance of people and their activities. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 309–330 (2000)

    Google Scholar 

  57. Heisele, B., Ho, P., Wu, J., Poggio, T.: Face recognition: component-based versus global approaches. Comput. Vis. Image Underst. 91, 6–21 (2003)

    Google Scholar 

  58. Hoiem, D., Efros, A.A., Hebert, M.: Putting objects in perspective. Int. J. Comput. Vis. 80, 3–15 (2008)

    Google Scholar 

  59. Hu, W., Tan, T., Wang, L., Maybank, S.: A survey on visual surveillance of object motion and behaviors. IEEE Trans. Syst. Man Cybern., Part C, Appl. Rev. 34(3), 334–352 (2004)

    Google Scholar 

  60. Hu, W., Xie, D., Tan, T., Maybank, S.: Learning activity patterns using fuzzy self-organizing neural network. IEEE Trans. Syst. Man Cybern. 34(3), 1618–1626 (2004)

    Google Scholar 

  61. Huang, J., et al.: GPU-accelerated computation for robust motion tracking using the CUDA framework. In: Int. Conf. on Visual Information Engineering, vol. 5, pp. 437–442 (2008)

    Google Scholar 

  62. 11th IEEE Int. Workshop on Performance Evaluation of Tracking and Surveillance (2009). http://www.cvg.rdg.ac.uk/PETS2009/authors.html

  63. Imagery Library for Intelligent Detection Systems (2010). http://www.ilids.co.uk

  64. IMS Research. http://www.imsresearch.com/

  65. Ince, S., Konrad, J.: Occlusion-aware optical flow estimation. IEEE Trans. Image Process. 17(8), 1443–1451 (2008)

    MathSciNet  Google Scholar 

  66. Intille, S.S., Bobick, A.F.: A framework for recognizing multi-agent action from visual evidence. In: AAAI-99, pp. 518–525. AAAI Press, Menlo Park (1999)

    Google Scholar 

  67. Ishii, I., Taniguchi, T., Yamamoto, K., Takaki, T.: 1000 fps real-time optical flow detection system. Proc. SPIE 7538, 1–11 (2010)

    Google Scholar 

  68. Ivanov, Y.A., Bobick, A.F.: Recognition of visual activities and interactions by stochastic parsing. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 852–872 (2000)

    Google Scholar 

  69. Jan, T.: Neural network based threat assessment for automated visual surveillance. In: Int. Joint Conf. on Neural Networks, vol. 2, pp. 1309–1312. IEEE Press, New York (2004)

    Google Scholar 

  70. Jang, D.S., Choi, H.I.: Active models for tracking moving objects. Pattern Recognit. 33(7), 1135–1146 (2000)

    Google Scholar 

  71. Javed, O., Shah, M.: Tracking and object classification for automated surveillance. In: Proc. of the 7th European Conference on Computer Vision, pp. 343–357. Springer, London (2002)

    Google Scholar 

  72. Jeong, Y.S., Jeong, M.K., Omitaomu, O.A.: Weighted dynamic time warping for time series classification. Pattern Recognit. 44, 2231–2240 (2011)

    Google Scholar 

  73. Jiang, H., Drew, M.S., Li, Z.N.: Successive convex matching for action detection. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 1646–1653 (2006)

    Google Scholar 

  74. Joo, S.W., Chellappa, R.: Attribute grammar-based event recognition and anomaly detection. In: Conference on Computer Vision and Pattern Recognition Workshop, CVPRW ’06, pp. 107–114 (2006)

    Google Scholar 

  75. Kameda, Y., Minoh, M.: A human motion estimation method using 3-successive video frames. In: Proc. of Int. Conf. on Virtual Systems, pp. 135–140 (1996)

    Google Scholar 

  76. Kang, W., Deng, F.: Research on intelligent visual surveillance for public security. In: 6th Int. Conf. Comput. and Inf. Sci, pp. 824–829. IEEE/ACIS, Melbourne (2007)

    Google Scholar 

  77. Ke, Y., Sukthankar, R., Hebert, M.: Spatio-temporal shape and flow correlation for action recognition. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

    Google Scholar 

  78. Khan, S.M., Shah, M.: Detecting group activities using rigidity of formation. In: Proc. of the 13th Annual ACM Int. Conf. on Multimedia, pp. 403–406 (2005)

    Google Scholar 

  79. Kim, H., Sakamoto, R., Kitahara, I., Toriyama, T., Kogure, K.: Robust silhouette extraction technique using background subtraction. In: 10th Meeting on Image Recognition and Understand (MIRU), Hiroshima, Japan, pp. 1–6 (2007)

    Google Scholar 

  80. Kim, J.B., Kim, H.J.: Efficient region-based motion segmentation for a video monitoring system. Pattern Recognit. Lett. 24(1/3), 113–128 (2003)

    Google Scholar 

  81. Kim, T.K., Im, J.H., Paik, J.K.: Video object segmentation and its salient motion detection using adaptive background generation. IEEE Power Electron. Lett. 45(11), 542–543 (2009)

    Google Scholar 

  82. Ko, T.: A survey on behavior analysis in video surveillance for homeland security applications. In: AIPR, pp. 1–8. IEEE Press, New York (2008)

    Google Scholar 

  83. Kuno, Y., Watanabe, T., Shimosakoda, Y., Nakagawa, S.: Automated detection of human for visual surveillance system. In: Proc. of the Int. Conf. on Pattern Recognition, ICPR ’96, pp. 865–869. IEEE Computer Society, Washington (1996)

    Google Scholar 

  84. Ladikos, A., Benhimane, S., Navab, N.: A realtime tracking system combining template-based and feature-based approaches. In: VISAPP (2007)

    Google Scholar 

  85. Lalos, C., Anagnostopoulos, V.: Hybrid tracking approach for assistive environments. In: In Int. Conf. Proc. Series, 05, vol. 39/64. ACM Press, New York (2009)

    Google Scholar 

  86. Laptev, I.: On space-time interest points. Int. J. Comput. Vis. 64(2–3), 107–123 (2005)

    Google Scholar 

  87. Laptev, I., Lindeberg, T.: Space-time interest points. In: Proc. 9th IEEE Int. Conf. on Computer Vision, pp. 432–439 (2003)

    Google Scholar 

  88. Laptev, I., Perez, P.: Retrieving actions in movies. In: Proc. of the 11th IEEE Int. Conf. on Computer Vision, pp. 1–8 (2007)

    Google Scholar 

  89. Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1–8 (2008)

    Google Scholar 

  90. Leordeanu, M., Collins, R.: Unsupervised learning of object features from video sequences. In: Proc. of IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, Washington, DC, USA, vol. 1, pp. 1142–1149 (2005)

    Google Scholar 

  91. Li, X., Hu, W., Zhang, Z., Zhang, X.: Robust foreground segmentation based on two effective background models. In: Proc. of the 1st ACM Int. Conf. on Multimedia Information Retrieval, MIR ’08, pp. 223–228. ACM Press, New York (2008)

    Google Scholar 

  92. Liao, H.H., Chang, J.Y., Chen, L.G.: A localized approach to abandoned luggage detection with foreground-mask sampling. In: Proc. of the IEEE 5th Int. Conf. on Advanced Video and Signal Based Surveillance, AVSS’08, pp. 132–139. IEEE Computer Society, Washington (2008)

    Google Scholar 

  93. Lin, F., Chen, B.M., Lee, T.H.: Robust vision-based target tracking control system for an unmanned helicopter using feature fusion. In: 9th IAPR Int. Conf. on Machine Vision Applications, vol. 13, pp. 398–401 (2009)

    Google Scholar 

  94. Lin, H.H., Liu, T.L., Chuang, J.H.: A probabilistic svm approach for background scene initialization. In: Int. Conf. on Image Processing, vol. 3, pp. 893–896 (2002)

    Google Scholar 

  95. Lipton, A.J.: Local application of optic flow to analyse rigid versus non-rigid motion. http://www.eecs.lehigh.edu/FRAME/Lipton/ieevframe.html

  96. Lipton, A.J., Fujiyoshi, H., Patil, R.S.: Moving target classification and tracking from real-time video. In: Proc. of the 4th IEEE Workshop on Applications of Computer Vision, pp. 8–14. IEEE Computer Society, Washington (1998)

    Google Scholar 

  97. Liu, C., Yuen, J., Torralba, A., Sivic, J., Freeman, W.T.: Sift flow: dense correspondence across different scenes. In: Proc. of the 10th European Conference on Computer Vision: Part III, pp. 28–42. Springer, Berlin, Heidelberg (2008)

    Google Scholar 

  98. Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos in the wild. In: IEEE Int. Conf. on Computer Vision and Pattern Recognition, pp. 1–8 (2009)

    Google Scholar 

  99. Lublinerman, R., Ozay, N., Zarpalas, D., Camps, O.: Activity recognition from silhouettes using linear systems and model (in)validation techniques. In: 18th Int. Conf. on Pattern Recognition, vol. 1, pp. 347–350 (2006)

    Google Scholar 

  100. Lucas, B., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: Int. Joint Conf. on Artificial Intelligence, pp. 674–679. AAAI Press, Menlo Park (1981)

    Google Scholar 

  101. Luo, R., Li, L., Gu, I.Y.: Efficient adaptive background subtraction based on multi-resolution background modeling and updating. In: Springer-PCM, pp. 118–127. Springer, Berlin (2007)

    Google Scholar 

  102. Lv, F., Nevatia, R.: Single view human action recognition using key pose matching and Viterbi path searching. In: CVPR, Minneapolis, Minnesota, USA, pp. 1–7. IEEE Computer Society, Washington (2007)

    Google Scholar 

  103. Ma, X., Grimson, W.E.L.: Edge-based rich representation for vehicle classification. In: Proceedings of the Tenth IEEE International Conference on Computer Vision, vol. 2, pp. 1185–1192. IEEE Computer Society, Washington (2005)

    Google Scholar 

  104. McHugh, J.M., Konrad, J., Saligrama, V., Jodoin, P.M.: Foreground-adaptive background subtraction. IEEE Signal Process. Lett. 16(5), 390–393 (2009)

    Google Scholar 

  105. Meyer, F., Bouthemy, P.: Region-based tracking using affine motion models in long image sequences. CVGIP, Image Underst. 60(2), 119–140 (1994)

    Google Scholar 

  106. Migdal, J., Grimson, W.E.L.: Background subtraction using Markov thresholds. In: Proc. of the IEEE Workshop on Motion and Video Computing (WACV/MOTION’05), WACV-MOTION ’05, vol. 2, pp. 58–65. IEEE Computer Society, Washington (2005)

    Google Scholar 

  107. Minnen, D., Essa, I., Starner, T.: Expectation grammars: leveraging high-level expectations for activity recognition. In: Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003, vol. 2, pp. 626–632 (2003)

    Google Scholar 

  108. Moeslund, T.B., Granum, E.: A survey of computer vision-based human motion capture. Comput. Vis. Image Underst. 81(03), 231–268 (2001)

    MATH  Google Scholar 

  109. Moeslund, T.B., Hilton, A., kruger, V.: A survey of advances in vision-based human motion capture and analysis. Comput. Vis. Image Underst. 104(2–3), 90–126 (2006)

    Google Scholar 

  110. Mohan, A., Papageorgiou, C., Poggio, T.: Example-based object detection in images by components. IEEE Trans. Pattern Anal. Mach. Intell. 23(4), 349–361 (2001)

    Google Scholar 

  111. Monnet, A., Mittal, A., Paragios, N., Ramesh, V.: Background modeling and subtraction of dynamic scenes. In: Proc. 9th IEEE Int. Conf. on Computer Vision, vol. 2, pp. 1305–1312 (2003)

    Google Scholar 

  112. Moore, D., Essa, I.: Recognizing multitasked activities from video using stochastic context-free grammar. In: Proc. AAAI National Conf. on AI, pp. 770–776. AAAI Press, Menlo Park (2002)

    Google Scholar 

  113. Moore, D.J., Essa, I.A., Hayes, M.H.: Exploiting human actions and object context for recognition tasks. In: Proc. of 7th IEEE Int. Conf. on Computer Vision, vol. 1, pp. 80–86 (1999)

    Google Scholar 

  114. Morris, B.T., Trivedi, M.M.: A survey of vision-based trajectory learning and analysis for surveillance. IEEE Trans. Circuits Syst. Video Technol. 18(08), 1114–1127 (2008)

    Google Scholar 

  115. Narayana, M., Haverkamp, D.: A Bayesian algorithm for tracking multiple moving objects in outdoor surveillance video. In: CVPR, pp. 1–8. IEEE Press, New York (2007)

    Google Scholar 

  116. Natarajan, P., Nevatia, R.: Coupled hidden semi Markov models for activity recognition. In: IEEE Workshop on Motion and Video Computing, pp. 1–8 (2007)

    Google Scholar 

  117. Nevatia, R., Hobbs, J., Bolles, B.: An ontology for video event representation. In: IEEE Conf. on Computer Vision and Pattern Recognition Workshop, pp. 119–128 (2004)

    Google Scholar 

  118. Nevatia, R., Zhao, T., Hongeng, S.: Hierarchical language-based representation of events in video streams. In: Conf. on Computer Vision and Pattern Recognition Workshop, vol. 4, pp. 39–47 (2003)

    Google Scholar 

  119. Nguyen, N.T., Phung, D.Q., Venkatesh, S., Bui, H.: Learning and detecting activities from movement trajectories using the hierarchical hidden Markov model. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 955–960 (2005)

    Google Scholar 

  120. Niebles, J.C., Wang, H., Fei-fei, L.: Unsupervised learning of human action categories using spatial-temporal words. In: Proc. British Machine Vision Conference (BMVC) (2006)

    Google Scholar 

  121. Niethammer, M., Tannenbaum, A., Angenent, S.: Dynamic active contours for visual tracking. IEEE Trans. Autom. Control 51(4), 562–579 (2006)

    MathSciNet  Google Scholar 

  122. Nowozin, G.S., Bakir, G., Tsuda, K.: Discriminative subsequence mining for action classification. In: ICCV, vol. 11, pp. 1–8. IEEE Press, New York (2007)

    Google Scholar 

  123. Ogale, A.S., Karapurkar, A., Aloimonos, Y.: View-invariant modeling and recognition of human actions using grammars. In: 10th Conf. on Category Curve of Long Video, vol. 10, pp. 115–126, Beijing, China. IEEE Press, New York (2005)

    Google Scholar 

  124. Oh, S., Hoogs, A., et al.: A large-scale benchmark dataset for event recognition in surveillance video. In: Proc. of IEEE Int. Conf. on Computer Vision and Pattern Recognition, pp. 3153–3160 (2011)

    Google Scholar 

  125. Oikonomopoulos, A., Patras, I., Pantic, M., Paragios, N.: Trajectory-based representation of human actions. In: Artificial Intelligence for Human Computing, vol. 4451, pp. 133–154. Springer, Berlin (2007)

    Google Scholar 

  126. Oikonomopoulos, A., Patras, I., Pantici, M.: Spatiotemporal salient points for visual recognition of human actions. IEEE Trans. Syst. Man Cybern. 36(3), 710–719 (2006)

    Google Scholar 

  127. Oliver, N.M., Rosario, B., Pentland, A.P.: A Bayesian computer vision system for modeling human interactions. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 831–843 (2000)

    Google Scholar 

  128. Oliver, N., Horvitz, E., Garg, A.: Layered representations for human activity recognition. In: Proc. 4th IEEE Int. Conf. on Multimodal Interfaces, pp. 3–8 (2002)

    Google Scholar 

  129. Ong, E.J., Gong, S.: The dynamics of linear combinations: tracking 3d skeletons of human subjects. Image Vis. Comput. 20(5/6), 397–414 (2002)

    Google Scholar 

  130. Paragios, N., Deriche, R.: Geodesic active contours and level sets for the detection and tracking of moving objects. IEEE Trans. Pattern Anal. Mach. Intell. 22(3), 266–280 (2000)

    Google Scholar 

  131. Paragios, R., Stenger, B., Ramesh, V., Paragios, N., Buhmann, F.C.J.: Topology free hidden Markov models: application to background modeling. In: IEEE Int. Conf. on Computer Vision, pp. 294–301 (2001)

    Google Scholar 

  132. Parameswaran, V., Chellappa, R.: View invariance for human action recognition. Int. J. Comput. Vis. 66(1), 83–101 (2006)

    Google Scholar 

  133. Parameswaran, V., Singh, M., Ramesh, V.: Illumination compensation based change detection using order consistency. In: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 1982–1989 (2010)

    Google Scholar 

  134. Parikh, D., Zitnick, C.L., Chen, T.: Unsupervised learning of hierarchical spatial structures in images. In: Proc. of IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1–8 (2009)

    Google Scholar 

  135. Park, S., Aggarwal, J.K.: A hierarchical Bayesian network for event recognition of human actions and interactions. Assoc. Comput. Mach. Multimedia Syst. J., 164–179 (2004)

  136. Paruchuri, J.K., Sathiyamoorthy, E.P., Ching, S., Cheung, S., Chen, C.H.: Spatially adaptive illumination modeling for background subtraction. In: IEEE Int. Conf. on Computer Vision Workshops (ICCV Workshops), pp. 1745–1752 (2011)

    Google Scholar 

  137. Pentland, A.: Smart rooms, smart clothes. In: Proc. Fourteenth Int. Conf. on Pattern Recognition, vol. 2, pp. 949–953 (1998)

    Google Scholar 

  138. Peursum, P., West, G., Venkatesh, S.: Combining image regions and human activity for indirect object recognition in indoor wide-angle views. In: 10th IEEE Int. Conf. on Computer Vision, vol. 1, pp. 82–89 (2005)

    Google Scholar 

  139. Pilet, J., Strecha, C., Fua, P.: Making background subtraction robust to sudden illumination changes. In: Proc. European Conf. on Computer Vision, pp. 1–14 (2008)

    Google Scholar 

  140. Pinhanez, C.S., Bobick, A.F.: Human action detection using pnf propagation of temporal constraints. In: Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, pp. 898–904 (1998)

    Google Scholar 

  141. Platt, J.C.: Fast training of support vector machines using sequential minimal optimization. In: Proceedings of Advances in Kernel Methods—Support Vector Learning, pp. 185–208. Microsoft, Redmond (1998)

    Google Scholar 

  142. Poppe, R.: A survey on vision-based human action recognition. Image Vis. Comput. 28, 976–990 (2010)

    Google Scholar 

  143. Porikli, F., Ivanov, Y., Haga, T.: Robust abandoned object detection using dual foregrounds. EURASIP J. Adv. Signal Process. 08, 197875 (2008)

    Google Scholar 

  144. Qi, Y., An, G.: Infrared moving targets detection based on optical flow estimation. In: Proc. of IEEE Int. Conf. on Computer Science and Network Technology, pp. 2452–2455 (2011)

    Google Scholar 

  145. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1999)

    Google Scholar 

  146. Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., Belongie, S.: Objects in context. In: Proc. of the IEEE 11th Int. Conf. on Computer Vision, pp. 1–8 (2007)

    Google Scholar 

  147. Rao, C., Shah, M.: View-invariance in action recognition. In: Proc. of IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 316–322 (2001)

    Google Scholar 

  148. Reddy, V., Sanderson, C., Sanin, A., Lovell, B.C.: Adaptive patch-based background modelling for improved foreground object segmentation and tracking. In: 7th IEEE Int. Conf. on Advanced Video and Signal Based Surveillance (AVSS), pp. 172–179 (2010)

    Google Scholar 

  149. Ren, Y., Chua, C.S.: Bilateral learning for color-based tracking. Image Vis. Comput. 26(11), 1530–1539 (2008)

    Google Scholar 

  150. Rodriguez, M.D., Ahmed, J., Shah, M.: Action MACH: a spatio-temporal maximum average correlation height filter for action recognition. In: CVPR. IEEE Press, New York (2008)

    Google Scholar 

  151. Rui, Y., Huang, T.S.: Image retrieval: current techniques, promising directions and open issues. J. Vis. Commun. Image Represent. 10, 39–62 (1999)

    Google Scholar 

  152. Ryoo, M.S., Aggarwal, J.K.: Recognition of composite human activities through context-free grammar based representation. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 1709–1718 (2006)

    Google Scholar 

  153. Ryoo, M.S., Aggarwal, J.K.: Hierarchical recognition of human activities interacting with objects. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

    Google Scholar 

  154. Ryoo, M.S., Aggarwal, J.K.: Recognition of high-level group activities based on activities of individual members. In: IEEE Workshop on Motion and Video Computing, pp. 1–8 (2008)

  155. Ryoo, M.S., Aggarwal, J.K.: Semantic representation and recognition of continued and recursive human activities. Int. J. Comput. Vis. 82, 1–24 (2009)

  156. Ryoo, M.S., Aggarwal, J.K.: Spatio-temporal relationship match: video structure comparison for recognition of complex human activities. In: IEEE 12th Int. Conf. on Computer Vision, pp. 1593–1600 (2009)

  157. Ryoo, M.S., Aggarwal, J.K.: UT-Interaction Dataset, ICPR contest on Semantic Description of Human Activities (SDHA). http://cvrc.ece.utexas.edu/SDHA2010/Human_Interaction.html (2010)

  158. Ryoo, M.S., Chen, C.C., Aggarwal, J.K., Roy-Chowdhury, A.: An overview of contest on semantic description of human activities 2010. In: Proc. Int. Conf. Pattern Recognition Contests, pp. 1–16 (2010)

  159. Sakaino, H.: A semitransparency-based optical-flow method with a point trajectory model for particle-like video. IEEE Trans. Image Process. 21(2), 441–450 (2012)

  160. Salembier, P., Marques, F.: Region-based representations of image and video: segmentation tools for multimedia services. IEEE Trans. Circuits Syst. Video Technol. 9(8), 1147–1169 (1999)

  161. Sarkar, S., Phillips, P.J., Liu, Z., Vega, I.R., Grother, P., Bowyer, K.W.: The humanID gait challenge problem: data sets, performance, and analysis. IEEE Trans. Pattern Anal. Mach. Intell. 27(2), 162–177 (2005)

  162. Schmaltz, C., Rosenhahn, B., Brox, T., Weickert, J.: Localised mixture models in region-based tracking. In: Proc. of the 31st DAGM Symposium on Pattern Recognition, pp. 21–30. Springer, Berlin (2009)

  163. Schmaltz, C., Rosenhahn, B., Brox, T., Weickert, J.: Region-based pose tracking with occlusions using 3D models. Mach. Vis. Appl. 23(3), 557–577 (2012)

  164. Schuldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local SVM approach. In: Proc. IEEE Computer Society Pattern Recognition, vol. 3, pp. 32–36. IEEE Computer Society Press, Los Alamitos (2004)

  165. Schunck, B.: The image flow constraint equation. Comput. Vis. Graph. Image Process. 35(1), 20–46 (1986)

  166. Schunck, B., Horn, B.: Determining optical flow. In: DARPA81, pp. 144–156 (1981)

  167. Sclaroff, S., Isidoro, J.: Active blobs: region-based, deformable appearance models. Comput. Vis. Image Underst. 89(2/3), 197–225 (2003)

  168. Senst, T., Evangelio, R.H., Sikora, T.: Detecting people carrying objects based on an optical flow motion model. In: IEEE Workshop on Applications of Computer Vision, pp. 301–306 (2011)

  169. Shechtman, E., Irani, M.: Space-time behavior based correlation. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 1, pp. 405–412 (2005)

  170. Sheikh, Y., Javed, O., Kanade, T.: Background subtraction for freely moving cameras. In: IEEE 12th Int. Conf. on Computer Vision, pp. 1219–1225 (2009)

  171. Sheikh, Y., Sheikh, M., Shah, M.: Exploring the space of a human action. In: Tenth IEEE Int. Conf. on Computer Vision, vol. 1, pp. 144–149 (2005)

  172. Shi, J., Tomasi, C.: Good features to track. In: CVPR, pp. 593–600. IEEE Computer Society, Washington (1994)

  173. Shi, Y., Huang, Y., Minnen, D., Bobick, A., Essa, I.: Propagation networks for recognition of partially ordered sequential action. In: Proc. of IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 862–869 (2004)

  174. Shibata, M., Yasuda, Y., Ito, M.: Moving object detection for active camera based on optical flow distortion. In: Proc. of the 17th World Congress the International Federation of Automatic Control, Seoul, Korea, pp. 14,720–14,725 (2008)

  175. Siskind, J.M.: Grounding the lexical semantics of verbs in visual perception using force dynamics and event logic. J. Artif. Intell. Res. 15, 31–90 (2001)

  176. Sivic, J., Zisserman, A.: Video data mining using configurations of viewpoint invariant regions. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, Washington, DC, pp. 1–8 (2004)

  177. Smeaton, A.F., Over, P., Kraaij, W.: Evaluation campaigns and TRECVid. In: Proc. of the 8th ACM Int. Workshop on Multimedia Information Retrieval, Santa Barbara, California, USA, pp. 321–330 (2006)

  178. Starner, T., Pentland, A.: Real-time American sign language recognition from video using hidden Markov models. In: Proceedings International Symposium on Computer Vision, pp. 265–270 (1995)

  179. Stauffer, C.: Automatic hierarchical classification using time-based co-occurrences. In: IEEE Int. Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 333–339 (1999)

  180. Stauffer, C., Grimson, W.E.L.: Learning patterns of activity using real-time tracking. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 747–757 (2000)

  181. Tavakkoli, A., Nicolescu, M., Bebis, G.: A novelty detection approach for foreground region detection in videos with quasi-stationary backgrounds. In: Proc. of the 2nd Int. Symposium on Visual Computing, pp. 40–49. Springer, Berlin, Heidelberg (2006)

  182. Techmer, A.: Contour-based motion estimation & object tracking for real-time applications. In: IEEE Image Processing Proceedings, vol. 3, pp. 648–651 (2001)

  183. Thi, T.H., Zhang, J., Cheng, L., Wang, L., Satoh, S.: Semi-supervised human action recognition and localization using spatially and temporally integrated local features (2009). http://huetuan.net/semiaction.html

  184. Trec Video Retrieval Evaluation Official Website. http://huetuan.net/semiaction.html

  185. Tsai, D.M., Lai, S.C.: Independent component analysis-based background subtraction for indoor surveillance. IEEE Trans. Image Process. 18(1), 158–167 (2009)

  186. Tsuchiya, M., Fujiyoshi, H.: Evaluating feature importance for object classification in visual surveillance. In: Proc. of the 18th Int. Conf. on Pattern Recognition, vol. 2, pp. 978–981. IEEE Computer Society, Washington (2006)

  187. Valera, M., Velastin, S.A.: Intelligent distributed surveillance systems: a review. IEE Proc., Vis. Image Signal Process. 152(2), 192–204 (2005)

  188. Varcheie, P.D.Z., Sills-Lavoie, M., Bilodeau, G.A.: A multiscale region-based motion detection and background subtraction algorithm. Sensors 10, 1041–1061 (2010)

  189. Vaswani, N., Chowdhury, A.R., Chellappa, R.: Activity recognition using the dynamics of the configuration of interacting objects. In: Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 633–640 (2003)

  190. Vaswani, N., Chowdhury, A.R., Chellappa, R.: Shape activity: a continuous state HMM for moving/deforming shapes with application to abnormal activity detection. IEEE Trans. Image Process. 14(10), 1603–1616 (2005)

  191. Veeraraghavan, A., Chellappa, R., Roy-Chowdhury, A.K.: The function space of an activity. In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 1, pp. 959–968 (2006)

  192. Vishwakarma, S., Agrawal, A.: A novel approach for feature quantization using one-dimensional histogram. In: Annual IEEE India Conference (INDICON), pp. 1–4 (2011)

  193. Vishwakarma, S., Sapre, A., Agrawal, A.: Action recognition using cuboids of interest points. In: IEEE Int. Conf. on Signal Processing, Communications and Computing (ICSPCC), pp. 1–6 (2011)

  194. Vlasic, D., Baran, I., Matusik, W., Popović, J.: Articulated mesh animation from multi-view silhouettes. ACM Trans. Graph. 27(3), 97:1–97:9 (2008)

  195. Vogler, C., Metaxas, D.: Parallel hidden Markov models for American sign language recognition. In: IEEE Int. Conf. on Computer Vision, vol. 1, pp. 224–228 (1999)

  196. Vosters, L., Shan, C., Gritti, T.: Background subtraction under sudden illumination changes. In: 7th IEEE Int. Conf. on Advanced Video and Signal Based Surveillance (AVSS), pp. 384–391 (2010)

  197. Vu, V.-T., Bremond, F., Thonnat, M.: Automatic video interpretation: a novel algorithm for temporal scenario recognition. In: Proc. 8th Int. Joint Conf. Artif. Intell, pp. 9–15 (2003)

  198. Waltisberg, D., Yao, A., Gall, J., Gool, L.V.: Variations of a Hough-voting action recognition system. In: Proc. of Int. Conf. on Pattern Recognition, pp. 1–7 (2010)

  199. Wang, J., Bebis, G., Miller, R.: Robust video-based surveillance by integrating target detection with tracking. In: Proc. Conf. on Computer Vision and Pattern Recognition Workshop, CVPRW ’06, pp. 137–144. IEEE Computer Society, Washington (2006)

  200. Weber, M.: Unsupervised learning of models for object recognition. Ph.D. thesis, California Institute of Technology, Pasadena, California (2000)

  201. Weinland, D., Boyer, E., Ronfard, R.: Action recognition from arbitrary views using 3D exemplars. In: ICCV, Rio de Janeiro, Brazil, vol. 11, pp. 1–7. IEEE Computer Society Press, Los Alamitos (2007)

  202. Weinland, D., Ronfard, R., Boyer, E.: Automatic discovery of action taxonomies from multiple views. In: CVPR, vol. 2, pp. 1639–1645. IEEE Computer Society, Washington (2006)

  203. Weinland, D., Ronfard, R., Boyer, E.: Free viewpoint action recognition using motion history volumes. Comput. Vis. Image Underst. 104(02), 249–257 (2006)

  204. Weinland, D., Ronfard, R., Boyer, E.: A survey of vision-based methods for action representation, segmentation and recognition. Comput. Vis. Image Underst. 115, 224–241 (2011)

  205. Wen, Z., Cai, Z.: A robust object tracking approach using mean shift. In: 3rd IEEE Int. Conf. on Natural Computation, vol. 2, pp. 170–174 (2007)

  206. Wong, S.F., Cipolla, R.: Extracting spatiotemporal interest points using global information. In: ICCV, vol. 11, pp. 1–8. IEEE Press, New York (2007)

  207. Wong, S.F., Kim, T.K., Cipolla, R.: Learning motion categories using both semantic and structural information. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1–6 (2007)

  208. Wunsch, P., Hirzinger, G.: Real-time visual tracking of 3D objects with dynamic handling of occlusion. In: Int. Conf. on Robotics and Automation, 97, Albuquerque, New Mexico, USA, vol. 4, pp. 2868–2879 (1997)

  209. Xiang, T.: Video behavior profiling for anomaly detection. IEEE Trans. Pattern Anal. Mach. Intell. 30(5), 893–908 (2008)

  210. Xiao, J., Cheng, H., Han, F., Sawhney, H.: Geo-spatial aerial video processing for scene understanding and object tracking. In: CVPR, pp. 1–8. IEEE Press, New York (2008)

  211. Xu, M., Zuo, L., Iyengar, S., Goldfain, A., DelloStritto, J.: A semi-supervised hidden Markov model-based activity monitoring system. In: 33rd Annual Int. Conf. of the IEEE Engineering in Medicine and Biology Society (EMBC), Boston, Massachusetts USA, pp. 1794–1797 (2011)

  212. Yacoob, Y., Black, M.J.: Parameterized modeling and recognition of activities. In: 6th Int. Conf. on Computer Vision, pp. 120–127 (1998)

  213. Yamato, J., Ohya, J., Ishii, K.: Recognizing human action in time-sequential images using hidden Markov model. In: Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, pp. 379–385 (1992)

  214. Yamazaki, M., Xu, G., Chen, Y.W.: Detection of moving objects by independent component analysis. In: Proc. of the 7th Asian Conf. on Computer Vision, ACCV’06, vol. 2, pp. 467–478. Springer, Berlin, Heidelberg (2006)

  215. Yang, F., Li, B.: Unsupervised learning of spatial structures shared among images. Vis. Comput. 28(2), 175–180 (2011)

  216. Yilmaz, A., Javed, O., Shah, M.: Object tracking: a survey. ACM Comput. Surv. 38(4), 1–45 (2006)

  217. Yilmaz, A., Li, X., Shah, M.: Contour-based object tracking with occlusion handling in video acquired using mobile cameras. IEEE Trans. Pattern Anal. Mach. Intell. 26(11), 1531–1536 (2004)

  218. Yilmaz, A., Shah, M.: Actions sketch: a novel action representation. In: CVPR, vol. 1, pp. 984–989. IEEE Computer Society, Washington (2005)

  219. Yohannes, Y., Hoddinott, J.: Classification and regression trees. Tech. rep., International Food Policy Research Institute, Washington, DC, USA (1999)

  220. Yokoyama, M., Poggio, T.: A contour-based moving object detection and tracking. In: 2nd Joint IEEE Int. Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 271–276 (2005)

  221. Yu, E., Aggarwal, J.K.: Detection of fence climbing from monocular video. In: 18th Int. Conf. on Pattern Recognition, vol. 1, pp. 375–378 (2006)

  222. Yu, T.H., Kim, T.K., Cipolla, R.: Real-time action recognition by spatiotemporal semantic and structural forests. In: Proc. of British Machine Vision Conference, pp. 1–7 (2010)

  223. Zelnik-Manor, L., Irani, M.: Event-based analysis of video. In: Proc. of IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 123–130 (2001)

  224. Zhan, B., Monekosso, D.N., Remagnino, P., Velastin, S.A., Xu, L.Q.: Crowd analysis: a survey. Mach. Vis. Appl. 19(5–6), 345–357 (2008)

  225. Zhang, D., Gatica-Perez, D., Bengio, S., McCowan, I.: Modeling individual and group actions in meetings with layered HMMs. IEEE Trans. Multimed. 8(3), 509–520 (2006)

  226. Zhang, J., Tian, Y., Yang, Y.: Adaptive dynamic model particle filter for visual object tracking. In: ISECS International Colloquium, vol. 1, pp. 333–336. IEEE Press, New York (2009)

  227. Zhang, L., Li, S.Z., Yuan, X., Xiang, S.: Real-time object classification in video surveillance based on appearance learning. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

  228. Zhao, Y., Gong, H., Lin, L., Jia, Y.: Spatio-temporal patches for night background modeling by subspace learning. In: 19th Int. Conf. on Pattern Recognition, pp. 1–4 (2008)

  229. Zhong, H., Shi, J., Visontai, M.: Detecting unusual activity in video. In: Proc. of IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 819–826 (2004)

  230. Zhou, S.K., Chellappa, R., Moghaddam, B.: Visual tracking and recognition using appearance-adaptive models in particle filters. IEEE Trans. Image Process. 13(11), 1491–1506 (2004)

  231. Zhu, Y., Dariush, B., Fujimura, K.: Kinematic self retargeting: a framework for human pose estimation. Comput. Vis. Image Underst. 114(12), 1362–1375 (2010)

Author information

Corresponding author

Correspondence to Sarvesh Vishwakarma.

About this article

Cite this article

Vishwakarma, S., Agrawal, A. A survey on activity recognition and behavior understanding in video surveillance. Vis Comput 29, 983–1009 (2013). https://doi.org/10.1007/s00371-012-0752-6

  • DOI: https://doi.org/10.1007/s00371-012-0752-6
