Surveillance of Crowded Environments: Modeling the Crowd by Its Global Properties

Chapter
Part of The International Series in Video Computing book series (VICO, volume 11)

Abstract

In this chapter, we consider aspects of the crowd that can be modeled holistically, by analyzing global properties. We first discuss the dynamic texture model for representing holistic motion flow, which treats the video as a sample from a linear dynamical system. By defining appropriate distances and kernels between dynamic textures, crowd motion can be recognized with standard classification algorithms. Besides motion flow, crowd size, i.e., the number of objects within a crowd, can also be modeled holistically. From a suitable set of low-level features, crowd counts can be estimated with a regression function that maps the features directly to the number of objects within the crowd. In both cases, the surveillance task is solved by analyzing global scene properties, with no need to detect or track individual objects. As a result, the solutions tend to be robust even when the crowd is large, occlusions are substantial, object interactions are complex, or the objects are small.
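To make the linear dynamical system view concrete, the sketch below fits a dynamic texture of the form y_t = C x_t + mean, x_{t+1} = A x_t to a toy video and resynthesizes frames from it. It is a minimal illustration under stated assumptions, not the chapter's implementation: the fit uses a simple SVD plus least-squares procedure, the noise terms of the model are omitted, and the function names `fit_dynamic_texture` and `synthesize`, the toy data, and the state dimension are all hypothetical choices made for this example.

```python
import numpy as np


def fit_dynamic_texture(frames, n_states):
    """Fit y_t = C x_t + mean, x_{t+1} = A x_t to a (n_pixels, n_frames) array."""
    mean = frames.mean(axis=1, keepdims=True)
    # PCA via SVD: the leading left singular vectors form the observation matrix C.
    U, s, Vt = np.linalg.svd(frames - mean, full_matrices=False)
    C = U[:, :n_states]
    X = s[:n_states, None] * Vt[:n_states]  # hidden states, one column per frame
    # Least-squares estimate of the transition matrix from consecutive states.
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return C, A, X, mean


def synthesize(C, A, x0, mean, n_frames):
    """Roll the noise-free state recursion forward and render the frames."""
    states = [x0]
    for _ in range(n_frames - 1):
        states.append(A @ states[-1])
    return C @ np.stack(states, axis=1) + mean


# Toy "video" (an assumption for illustration): 50 frames of 16 pixels
# driven by a 2-D damped oscillation plus a constant brightness offset.
rng = np.random.default_rng(0)
t = np.arange(50)
latent = np.stack([0.98 ** t * np.cos(0.3 * t), 0.98 ** t * np.sin(0.3 * t)])
video = rng.standard_normal((16, 2)) @ latent + 5.0

C, A, X, mean = fit_dynamic_texture(video, n_states=2)
resynth = synthesize(C, A, X[:, 0], mean, n_frames=50)
```

The fitted parameters (A, C) are what a distance or kernel between dynamic textures would compare; the video itself is never segmented into individual objects.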

Keywords

Entropy, Covariance

Notes

Acknowledgements

The authors wish to thank the Washington State DOT for the videos of highway traffic [85], Jeffrey Cuenco and Zhang-Sheng John Liang for annotating part of the pedestrian video data, Navneet Dalal and Pedro Felzenszwalb for the people detection algorithms [29, 37], and Piotr Dollar for running these algorithms. This work was supported by NSF CCF-0830535, IIS-0812235, IIS-0534985, NSF IGERT award DGE-0333451, and the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 110610).

References

  1. Ali, S., Shah, M.: A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2007)
  2. Bach, F., Lanckriet, G., Jordan, M.: Multiple kernel learning, conic duality, and the SMO algorithm. In: International Conference on Machine Learning, ACM Press (2004)
  3. Bar-Joseph, Z., El-Yaniv, R., Lischinski, D., Werman, M.: Texture mixing and texture movie synthesis using statistical learning. IEEE Trans. Vis. Comput. Graph. 7(2), 120–135 (2001)
  4. Barron, J., Fleet, D., Beauchemin, S.: Performance of optical flow techniques. Int. J. Comput. Vis. 12, 43–77 (1994)
  5. Bauer, D.: Comparing the CCA subspace method to pseudo maximum likelihood methods in the case of no exogenous inputs. J. Time Ser. Anal. 26, 631–668 (2005)
  6. Bissacco, A., Chiuso, A., Ma, Y., Soatto, S.: Recognition of human gaits. In: IEEE Conference on Computer Vision and Pattern Recognition 20, IEEE (2001)
  7. Brostow, G.J., Cipolla, R.: Unsupervised Bayesian detection of independent motion in crowds. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, vol. 1, pp. 594–601 (2006)
  8. Cetingul, E., Chaudhry, R., Vidal, R.: A system theoretic approach to synthesis and classification of lip articulation. In: International Workshop on Dynamical Vision, Springer LNCS (2007)
  9. Chan, A.B.: Beyond dynamic textures: a family of stochastic dynamical models for video with applications to computer vision. PhD thesis, UCSD (2008)
  10. Chan, A.B., Dong, D.: Generalized Gaussian process models. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2011)
  11. Chan, A.B., Vasconcelos, N.: Probabilistic kernels for the classification of auto-regressive visual processes. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, vol. 1, pp. 846–851 (2005)
  12. Chan, A.B., Vasconcelos, N.: Classifying video with kernel dynamic textures. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2007)
  13. Chan, A.B., Vasconcelos, N.: Modeling, clustering, and segmenting video with mixtures of dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell. 30(5), 909–926 (2008)
  14. Chan, A.B., Vasconcelos, N.: Bayesian Poisson regression for crowd counting. In: IEEE International Conference on Computer Vision, IEEE (2009a)
  15. Chan, A.B., Vasconcelos, N.: Layered dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell.: Spec. Issue Probab. Graph. Models Comput. Vis. 31(10), 1862–1879 (2009b)
  16. Chan, A.B., Vasconcelos, N.: Variational layered dynamic textures. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2009c)
  17. Chan, A., Vasconcelos, N.: Counting people with low-level features and Bayesian regression. IEEE Trans. Image Process. 21(4), 2160–2177 (2012)
  18. Chan, A.B., Liang, Z.S.J., Vasconcelos, N.: Privacy preserving crowd monitoring: counting people without people models or tracking. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2008)
  19. Chan, A., Morrow, M., Vasconcelos, N.: Analysis of crowded scenes using holistic properties. In: 11th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS’09) (online) (2009)
  20. Chan, A.B., Coviello, E., Lanckriet, G.R.G.: Clustering dynamic textures with the hierarchical EM algorithm. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2010a)
  21. Chan, A.B., Mahadevan, V., Vasconcelos, N.: Generalized Stauffer-Grimson background subtraction for dynamic scenes. Mach. Vis. Appl. 22(5), 751–766 (2011)
  22. Chaudhry, R., Ravichandran, A., Hager, G., Vidal, R.: Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions. In: IEEE International Conference on Computer Vision and Pattern Recognition, IEEE (2009)
  23. Cho, S.Y., Chow, T.W.S., Leung, C.T.: A neural-based crowd estimation by hybrid global learning algorithm. IEEE Trans. Syst. Man Cybern. 29, 535–541 (1999)
  24. Cock, K.D., Moor, B.D.: Subspace angles between linear stochastic models. In: IEEE Conference on Decision and Control, Proceedings, IEEE, pp. 1561–1566 (2000)
  25. Cong, Y., Gong, H., Zhu, S.C., Tang, Y.: Flow mosaicking: real-time pedestrian counting without scene-specific learning. In: IEEE CVPR, IEEE (2009)
  26. Cooper, L., Liu, J., Huang, K.: Spatial segmentation of temporal texture using mixture linear models. In: Dynamical Vision Workshop in the IEEE International Conference of Computer Vision, Springer LNCS (2005)
  27. Costantini, R., Sbaiz, L., Süsstrunk, S.: Higher order SVD analysis for dynamic texture synthesis. IEEE Trans. Image Process. 17(1), 42–52 (2008)
  28. Cover, T., Thomas, J.: Elements of Information Theory. Wiley, New York (1991)
  29. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, vol. 2, pp. 886–893 (2005)
  30. Davies, A.C., Yin, J.H., Velastin, S.A.: Crowd monitoring using image processing. Electron. Commun. Eng. J. 7, 37–47 (1995)
  31. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39, 1–38 (1977)
  32. Dong, L., Parameswaran, V., Ramesh, V., Zoghlami, I.: Fast crowd segmentation using shape indexing. In: IEEE International Conference on Computer Vision, IEEE (2007)
  33. Doretto, G., Soatto, S.: Dynamic shape and appearance models. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 2006–2019 (2006)
  34. Doretto, G., Chiuso, A., Wu, Y.N., Soatto, S.: Dynamic textures. Int. J. Comput. Vis. 51(2), 91–109 (2003a)
  35. Doretto, G., Cremers, D., Favaro, P., Soatto, S.: Dynamic texture segmentation. In: IEEE International Conference on Computer Vision, IEEE, vol. 2, pp. 1236–1242 (2003b)
  36. Doretto, G., Jones, E., Soatto, S.: Spatially homogeneous dynamic textures. In: ECCV, Springer-Verlag LNCS 3021–3024 (2004)
  37. Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2008)
  38. Fitzgibbon, A.W.: Stochastic rigidity: image registration for nowhere-static scenes. In: IEEE International Conference on Computer Vision, IEEE, vol. 1, pp. 662–670 (2001)
  39. Gelb, A.: Applied Optimal Estimation. MIT, Cambridge (1974)
  40. Ghanem, B., Ahuja, N.: Phase based modelling of dynamic textures. In: IEEE International Conference on Computer Vision, IEEE (2007)
  41. Ghoreyshi, A., Vidal, R.: Segmenting dynamic textures with Ising descriptors, ARX models and level sets. In: Dynamical Vision Workshop in the European Conference on Computer Vision, Springer LNCS (2006)
  42. Horn, B.K.P.: Robot Vision. McGraw-Hill, New York (1986)
  43. Horn, B., Schunck, B.: Determining optical flow. Artif. Intell. 17, 185–204 (1981)
  44. Hu, M., Ali, S., Shah, M.: Detecting global motion patterns in complex videos. In: IEEE International Conference on Pattern Recognition, IEEE (2008a)
  45. Hu, M., Ali, S., Shah, M.: Learning motion patterns in crowded scenes using motion flow field. In: IEEE International Conference on Pattern Recognition, IEEE (2008b)
  46. Isard, M., Blake, A.: Condensation – conditional density propagation for visual tracking. Int. J. Comput. Vis. 29(1), 5–28 (1998)
  47. Kay, S.M.: Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice-Hall, Upper Saddle River (1993)
  48. Kong, D., Gray, D., Tao, H.: Counting pedestrians in crowds using viewpoint invariant training. In: British Machine Vision Conference, BMVA (2005)
  49. Lanckriet, G., Cristianini, N., Bartlett, P., Ghaoui, L.E., Jordan, M.: Learning the kernel matrix with semidefinite programming. J. Mach. Learn. Res. 5, 27–72 (2004)
  50. Larimore, W.E.: Canonical variate analysis in identification, filtering, and adaptive control. In: IEEE Conference on Decision and Control, IEEE, vol. 2, pp. 596–604 (1990)
  51. Leibe, B., Seemann, E., Schiele, B.: Pedestrian detection in crowded scenes. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, vol. 1, pp. 875–885 (2005)
  52. Leibe, B., Schindler, K., Van Gool, L.: Coupled detection and trajectory estimation for multi-object tracking. In: IEEE International Conference on Computer Vision, IEEE (2007)
  53. Lempitsky, V., Zisserman, A.: Learning to count objects in images. In: Advances in Neural Information Processing Systems, NIPS (2010)
  54. Lin, S.F., Chen, J.Y., Chao, H.X.: Estimation of number of people in crowded scenes using perspective transformation. IEEE Trans. Syst. Man Cybern. 31(6), 645–654 (2001)
  55. Liu, C.B., Lin, R.S., Ahuja, N., Yang, M.H.: Dynamic texture synthesis as nonlinear manifold learning and traversing. In: British Machine Vision Conference, vol. 2, pp. 859–868. BMVA (2006)
  56. Lucas, B., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: Proceeding on DARPA Image Understanding Workshop, pp. 121–130. Morgan Kaufmann Publishers (1981)
  57. Mahadevan, V., Vasconcelos, N.: Spatiotemporal saliency in highly dynamic scenes. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 171–177 (2010)
  58. Mahadevan, V., Li, W., Bhalodia, V., Vasconcelos, N.: Anomaly detection in crowded scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE (2010)
  59. Marana, A.N., Costa, L.F., Lotufo, R.A., Velastin, S.A.: On the efficacy of texture analysis for crowd monitoring. In: IEEE Proceedings of Computer Graphics, Image Processing, and Vision, IEEE, pp. 354–361 (1998)
  60. Marana, A.N., Costa, L.F., Lotufo, R.A., Velastin, S.A.: Estimating crowd density with Minkowski fractal dimension. In: IEEE Proceedings of International Conference Acoustics, Speech, Signal Processing, IEEE, vol. 6, pp. 3521–3524 (1999)
  61. Martin, R.J.: A metric for ARMA processes. IEEE Trans. Signal Process. 48(4), 1164–1170 (2000)
  62. Mehran, R., Oyama, A., Shah, M.: Abnormal crowd behavior detection using social force model. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2009)
  63. Mehran, R., Moore, B., Shah, M.: A streakline representation of flow in crowded scenes. In: European Conference on Computer Vision, LNCS (2010)
  64. Monnet, A., Mittal, A., Paragios, N., Ramesh, V.: Background modeling and subtraction of dynamic scenes. In: CVPR, IEEE (2003)
  65. Overschee, P.V., Moor, B.D.: N4SID: subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica 30, 75–93 (1994)
  66. Paragios, N., Ramesh, V.: A MRF-based approach for real-time subway monitoring. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, vol. 1, pp. 1034–1040 (2001)
  67. Polana, R., Nelson, R.C.: Recognition of motion from temporal texture. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp. 129–134 (1992)
  68. Rabaud, V., Belongie, S.J.: Counting crowded moving objects. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2006)
  69. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT, Cambridge (2006)
  70. Ravichandran, A., Vidal, R.: Video registration using dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 158–171 (2011)
  71. Ravichandran, A., Chaudhry, R., Vidal, R.: View-invariant dynamic texture recognition using a bag of dynamical systems. In: IEEE International Conference on Computer Vision and Pattern Recognition, IEEE (2011)
  72. Regazzoni, C.S., Tesei, A.: Distributed data fusion for real-time crowding estimation. Signal Process. 53, 47–63 (1996)
  73. Saisan, P., Doretto, G., Wu, Y., Soatto, S.: Dynamic texture recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, vol. 2, pp. 58–63 (2001)
  74. Saleemi, I., Hartung, L., Shah, M.: Scene understanding by statistical modeling of motion patterns. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2010)
  75. Shumway, R.H., Stoffer, D.S.: An approach to time series smoothing and forecasting using the EM algorithm. J. Time Ser. Anal. 3(4), 253–264 (1982)
  76. Siddiqi, S.M., Boots, B., Gordon, G.J.: A constraint generation approach to learning stable linear dynamical systems. In: Advances in Neural Information Processing Systems, NIPS (2007)
  77. Szummer, M., Picard, R.: Temporal texture modeling. In: IEEE Conference on Image Processing, IEEE, vol. 3, pp. 823–826 (1996)
  78. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, New York (1995)
  79. Vidal, R.: Online clustering of moving hyperplanes. In: Neural Information and Processing Systems, NIPS (2006)
  80. Vidal, R., Favaro, P.: Dynamicboost: boosting time series generated by dynamical systems. In: IEEE International Conference on Computer Vision, IEEE
  81. Vidal, R., Ravichandran, A.: Optical flow estimation & segmentation of multiple moving dynamic textures. In: IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 516–521 (2005)
  82. Viola, P., Jones, M., Snow, D.: Detecting pedestrians using patterns of motion and appearance. Int. J. Comput. Vis. 63(2), 153–161 (2005)
  83. Vishwanathan, S.V.N., Smola, A.J., Vidal, R.: Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes. Int. J. Comput. Vis. 73(1), 95–119 (2007)
  84. Wang, J., Adelson, E.: Representing moving images with layers. IEEE Trans. Image Proc. 3(5), 625–638 (1994)
  85. Washington State Department of Transportation. http://www.wsdot.wa.gov (2005)
  86. Woolfe, F., Fitzgibbon, A.: Shift-invariant dynamic texture recognition. In: ECCV, Springer LNCS (2006)
  87. Wu, B., Nevatia, R.: Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors. In: IEEE International Conference on Computer Vision, IEEE, vol. 1, pp. 90–97 (2005)
  88. Yang, Y., Liu, J., Shah, M.: Video scene understanding using multi-scale analysis. In: IEEE International Conference on Computer Vision, IEEE (2009)
  89. Yuan, L., Wen, F., Liu, C., Shum, H.Y.: Synthesizing dynamic textures with closed-loop linear dynamic systems. In: European Conference on Computer Vision, pp. 603–616. Springer LNCS (2004)
  90. Zhao, T., Nevatia, R.: Bayesian human segmentation in crowded situations. In: IEEE Conference on Computer Vision and Pattern Recognition, IEEE, vol. 2, pp. 459–466 (2003)
  91. Zhong, J., Sclaroff, S.: Segmenting foreground objects from a dynamic textured background via a robust Kalman filter. In: IEEE ICCV, IEEE (2003)

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Computer Science, City University of Hong Kong, Hong Kong, China
  2. Department of Electrical and Computer Engineering, University of California, San Diego, USA
