Globally Continuous and Non-Markovian Crowd Activity Analysis from Videos

  • He Wang
  • Carol O’Sullivan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9909)


Automatically recognizing activities in video is a classic problem in computer vision; solving it helps to understand behaviors, describe scenes and detect anomalies. We propose an unsupervised method for these purposes. Given video data, we discover recurring activity patterns that appear, peak, wane and disappear over time. Using non-parametric Bayesian methods, we learn coupled spatial and temporal patterns with minimal prior knowledge. To model how patterns change over time, previous works compute Markovian progressions or locally continuous motifs, whereas we model time in a globally continuous and non-Markovian way. Visually, the patterns depict the flows of major activities. Temporally, each pattern has its own appearance-disappearance cycles. To compute compact pattern representations, we also propose a hybrid sampling method. By combining these patterns with detailed environment information, we interpret the semantics of activities and report anomalies. Our method also fits the data better than previous methods and detects anomalies that were previously difficult to detect.
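To make the idea of globally continuous, non-Markovian time concrete, here is a minimal illustrative sketch. It is not the authors' model. It assumes that each pattern's presence over time can be described by a mixture of Gaussians on a continuous time axis, so that a pattern's strength at time t depends only on t and never on the state at the previous time step. The pattern names, the mixture parameters and the helper functions are all hypothetical.

```python
import math


def gaussian_pdf(t, mean, std):
    """Density of a 1-D Gaussian at time t."""
    z = (t - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def temporal_density(t, components):
    """Density of one pattern at time t.

    components is a list of (weight, mean, std) tuples. Each tuple is one
    appearance-disappearance cycle of the pattern (an assumed form).
    """
    return sum(w * gaussian_pdf(t, m, s) for w, m, s in components)


def pattern_posterior(t, patterns, priors):
    """Probability of each pattern given only the timestamp t.

    No transition matrix is involved: the result is a function of t alone,
    which is what makes this toy model non-Markovian.
    """
    scores = {
        name: priors[name] * temporal_density(t, comps)
        for name, comps in patterns.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        # No pattern explains this time at all; fall back to a uniform guess.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: s / total for name, s in scores.items()}


# Hypothetical patterns over a two-day timeline measured in hours.
PATTERNS = {
    "morning_commute": [(0.5, 8.0, 1.0), (0.5, 32.0, 1.0)],
    "lunch_flow": [(0.5, 12.5, 0.5), (0.5, 36.5, 0.5)],
}
PRIORS = {"morning_commute": 0.5, "lunch_flow": 0.5}

if __name__ == "__main__":
    for hour in (8.0, 12.5, 20.0):
        print(hour, pattern_posterior(hour, PATTERNS, PRIORS))
```

In this toy setting, an observation whose timestamp has very low density under the pattern it spatially belongs to could be treated as a candidate anomaly; the threshold for "very low" would be another assumption.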


Keywords: Gaussian Mixture Model; Gibbs Sampling; Anomaly Detection; Time Topic; Spatial Activity
These keywords were added by machine and not by the authors.




Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Disney Research Los Angeles, Glendale, USA
  2. University of Leeds, Leeds, UK
  3. Trinity College Dublin, Dublin, Ireland
