International Journal of Computer Vision, Volume 98, Issue 3, pp 303–323

Video Behaviour Mining Using a Dynamic Topic Model

Abstract

This paper addresses the problem of fully automated mining of public space video data, a highly desirable capability under contemporary commercial and security considerations. This task is especially challenging due to the complexity of the object behaviours to be profiled, the difficulty of analysis under the visual occlusions and ambiguities common in public space video, and the computational challenge of doing so in real time. We address these issues by introducing a new dynamic topic model, termed a Markov Clustering Topic Model (MCTM). The MCTM builds on existing dynamic Bayesian network models and Bayesian topic models, and overcomes their drawbacks in sensitivity, robustness and efficiency. Specifically, our model profiles complex dynamic scenes by robustly clustering visual events into activities, and these activities into global behaviours with temporal dynamics. A Gibbs sampler is derived for offline learning with unlabelled training data, and a new approximation to online Bayesian inference is formulated to enable dynamic scene understanding and behaviour mining in new video data online and in real time. The strength of this model is demonstrated by unsupervised learning of dynamic scene models for four complex and crowded public scenes, and by successful mining of behaviours and detection of salient events in each.
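The abstract does not state the model's equations, so the following is only a rough sketch of the three-level structure it describes: visual events grouped into activities, activities grouped into behaviours, and behaviours evolving over time as a Markov chain. The function name `generate_clips`, the toy probability tables, the fixed number of events per clip and the uniform choice of starting behaviour are all assumptions made for illustration and are not taken from the paper.

```python
import random


def generate_clips(num_clips, events_per_clip, transition,
                   behaviour_to_activity, activity_to_event, rng):
    """Sample (behaviour, events) pairs from an assumed three-level hierarchy.

    transition[z_prev][z]       : Markov chain over behaviours (temporal dynamics)
    behaviour_to_activity[z][y] : mixture of activities within behaviour z
    activity_to_event[y][x]     : distribution over visual events within activity y
    """
    num_behaviours = len(transition)
    num_activities = len(activity_to_event)
    num_events = len(activity_to_event[0])

    clips = []
    # Assumption: the chain starts from a uniformly chosen behaviour.
    behaviour = rng.randrange(num_behaviours)
    for _ in range(num_clips):
        # The behaviour of this clip depends only on the previous clip's behaviour.
        behaviour = rng.choices(range(num_behaviours),
                                weights=transition[behaviour])[0]
        events = []
        for _ in range(events_per_clip):
            # Each visual event is produced by one activity of the current behaviour.
            activity = rng.choices(range(num_activities),
                                   weights=behaviour_to_activity[behaviour])[0]
            event = rng.choices(range(num_events),
                                weights=activity_to_event[activity])[0]
            events.append(event)
        clips.append((behaviour, events))
    return clips


# Toy parameters (hypothetical): 2 behaviours, 3 activities, 4 visual event types.
transition = [[0.9, 0.1],
              [0.2, 0.8]]
behaviour_to_activity = [[0.7, 0.3, 0.0],
                         [0.0, 0.2, 0.8]]
activity_to_event = [[0.5, 0.5, 0.0, 0.0],
                     [0.0, 0.5, 0.5, 0.0],
                     [0.0, 0.0, 0.5, 0.5]]

clips = generate_clips(5, 6, transition, behaviour_to_activity,
                       activity_to_event, random.Random(0))
for behaviour, events in clips:
    print(behaviour, events)
```

Learning runs in the opposite direction: given only the observed events, a sampler such as the Gibbs sampler mentioned above would estimate the hidden activity and behaviour assignments together with the three probability tables. This sketch covers only the forward, generative direction.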

Keywords

Behaviour profiling; Video behaviour mining; Topic models; Learning for vision; Bayesian methods; Probabilistic modelling

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
