Video Behaviour Mining Using a Dynamic Topic Model

Hospedales, Timothy; Gong, Shaogang; Xiang, Tao

doi:10.1007/s11263-011-0510-7

Published: 08 December 2011

International Journal of Computer Vision, volume 98, pages 303–323 (2012)

Abstract

This paper addresses the problem of fully automated mining of public space video data, a highly desirable capability under contemporary commercial and security considerations. This task is especially challenging due to the complexity of the object behaviours to be profiled, the difficulty of analysis under the visual occlusions and ambiguities common in public space video, and the computational challenge of doing so in real-time. We address these issues by introducing a new dynamic topic model, termed a Markov Clustering Topic Model (MCTM). The MCTM builds on existing dynamic Bayesian network models and Bayesian topic models, and overcomes their drawbacks on sensitivity, robustness and efficiency. Specifically, our model profiles complex dynamic scenes by robustly clustering visual events into activities and these activities into global behaviours with temporal dynamics. A Gibbs sampler is derived for offline learning with unlabeled training data and a new approximation to online Bayesian inference is formulated to enable dynamic scene understanding and behaviour mining in new video data online in real-time. The strength of this model is demonstrated by unsupervised learning of dynamic scene models for four complex and crowded public scenes, and successful mining of behaviours and detection of salient events in each.
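The three-level hierarchy described in the abstract (visual events grouped into activities, activities grouped into global behaviours that evolve over time) can be pictured with a toy generative sketch. This is an illustration only, not the MCTM itself: the parameters are fixed by hand rather than learned with a Gibbs sampler, and every name (`generate_clips`, `transition`, `behaviour_to_activity`, `activity_to_event`) is hypothetical and does not come from the paper.

```python
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_clips(transition, behaviour_to_activity, activity_to_event,
                   n_clips, events_per_clip, seed=0):
    """Sample (behaviour, events) pairs from a toy three-level model.

    transition[b]            : distribution over the next clip's behaviour
    behaviour_to_activity[b] : distribution over activities under behaviour b
    activity_to_event[a]     : distribution over visual event ids under activity a
    All three tables are hand-set assumptions for illustration.
    """
    rng = random.Random(seed)
    n_behaviours = len(transition)
    # Start from a uniformly chosen behaviour (an assumption of this sketch).
    behaviour = sample_categorical([1.0] * n_behaviours, rng)
    clips = []
    for _ in range(n_clips):
        events = []
        for _ in range(events_per_clip):
            # Each event first picks an activity, then a visual event id.
            activity = sample_categorical(behaviour_to_activity[behaviour], rng)
            events.append(sample_categorical(activity_to_event[activity], rng))
        clips.append((behaviour, events))
        # Temporal dynamics: the behaviour of the next clip depends on this one.
        behaviour = sample_categorical(transition[behaviour], rng)
    return clips


if __name__ == "__main__":
    # Two behaviours that alternate, e.g. two phases of a traffic-light cycle.
    transition = [[0.0, 1.0], [1.0, 0.0]]
    behaviour_to_activity = [[1.0, 0.0], [0.0, 1.0]]
    activity_to_event = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
    for behaviour, events in generate_clips(
            transition, behaviour_to_activity, activity_to_event,
            n_clips=4, events_per_clip=5):
        print(behaviour, events)
```

Learning would run in the opposite direction: given only the observed events per clip, infer the activity and behaviour assignments and the three tables. The abstract states that this is done offline with a Gibbs sampler and online with an approximate Bayesian inference scheme; neither is reproduced here.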

Author information

Authors and Affiliations

School of Electronic Engineering and Computer Science, Queen Mary University of London, London, E1 4NS, UK
Timothy Hospedales, Shaogang Gong & Tao Xiang

Corresponding author

Correspondence to Timothy Hospedales.

About this article

Cite this article

Hospedales, T., Gong, S. & Xiang, T. Video Behaviour Mining Using a Dynamic Topic Model. Int J Comput Vis 98, 303–323 (2012). https://doi.org/10.1007/s11263-011-0510-7

Received: 21 August 2009
Accepted: 22 November 2011
Published: 08 December 2011
Issue Date: July 2012
DOI: https://doi.org/10.1007/s11263-011-0510-7
