Video Behaviour Mining Using a Dynamic Topic Model

International Journal of Computer Vision

Abstract

This paper addresses the problem of fully automated mining of public space video data, a highly desirable capability under contemporary commercial and security considerations. This task is especially challenging due to the complexity of the object behaviours to be profiled, the difficulty of analysis under the visual occlusions and ambiguities common in public space video, and the computational challenge of doing so in real time. We address these issues by introducing a new dynamic topic model, termed a Markov Clustering Topic Model (MCTM). The MCTM builds on existing dynamic Bayesian network models and Bayesian topic models, and overcomes their drawbacks in sensitivity, robustness and efficiency. Specifically, our model profiles complex dynamic scenes by robustly clustering visual events into activities and these activities into global behaviours with temporal dynamics. A Gibbs sampler is derived for offline learning with unlabelled training data, and a new approximation to online Bayesian inference is formulated to enable dynamic scene understanding and behaviour mining in new video data online and in real time. The strength of this model is demonstrated by unsupervised learning of dynamic scene models for four complex and crowded public scenes, and by successful mining of behaviours and detection of salient events in each.
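
The hierarchy described in the abstract (visual events grouped into activities, activities grouped into behaviours that evolve over time) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple discrete structure in which each clip has one behaviour following a Markov chain, each visual word picks an activity given the behaviour, and each word is drawn given its activity. The parameter names `psi`, `theta` and `phi`, the toy values, and the helper functions are all hypothetical. The filtering step is a standard forward update with fixed point-estimate parameters, shown only to convey the idea of online inference; it is not the approximation derived in the paper.

```python
import math
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def generate_clips(psi, theta, phi, n_clips, words_per_clip, rng):
    """Sample one behaviour per clip (Markov chain) and the clip's visual words."""
    behaviours, clips = [], []
    z = rng.randrange(len(psi))  # assumed: uniform initial behaviour
    for _ in range(n_clips):
        z = sample_categorical(psi[z], rng)  # behaviour given previous behaviour
        words = []
        for _ in range(words_per_clip):
            y = sample_categorical(theta[z], rng)          # activity given behaviour
            words.append(sample_categorical(phi[y], rng))  # word given activity
        behaviours.append(z)
        clips.append(words)
    return behaviours, clips


def filter_step(prior, words, psi, theta, phi):
    """One forward update: posterior over the current behaviour given a new clip."""
    n_z = len(psi)
    # Predict: push the previous posterior through the behaviour transitions.
    predicted = [sum(prior[zp] * psi[zp][z] for zp in range(n_z)) for z in range(n_z)]
    # Update: clip likelihood per behaviour, with activities marginalised out.
    log_post = []
    for z in range(n_z):
        ll = math.log(predicted[z])
        for x in words:
            ll += math.log(sum(theta[z][y] * phi[y][x] for y in range(len(phi))))
        log_post.append(ll)
    m = max(log_post)  # subtract the max before exponentiating, for stability
    unnorm = [math.exp(v - m) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Toy parameters (hypothetical): 2 behaviours, 3 activities, 4 visual words.
psi = [[0.9, 0.1], [0.1, 0.9]]
theta = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
phi = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]

rng = random.Random(0)
behaviours, clips = generate_clips(psi, theta, phi, n_clips=5, words_per_clip=20, rng=rng)

belief = [0.5, 0.5]
for words in clips:
    belief = filter_step(belief, words, psi, theta, phi)
```

In a sketch like this, a clip whose words are poorly explained under every behaviour would receive a low likelihood, which is one plausible way to flag salient events; how the paper scores saliency is not stated in the abstract.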

Author information

Corresponding author

Correspondence to Timothy Hospedales.

About this article

Cite this article

Hospedales, T., Gong, S. & Xiang, T. Video Behaviour Mining Using a Dynamic Topic Model. Int J Comput Vis 98, 303–323 (2012). https://doi.org/10.1007/s11263-011-0510-7

  • DOI: https://doi.org/10.1007/s11263-011-0510-7