Crowd Counting and Profiling: Methodology and Evaluation

  • Chen Change Loy
  • Ke Chen
  • Shaogang Gong
  • Tao Xiang
Part of The International Series in Video Computing book series (VICO, volume 11)

Abstract

Video-imagery-based crowd analysis for population profiling and density estimation in public spaces can be a highly effective tool for establishing global situational awareness. Different strategies such as counting by detection and counting by clustering have been proposed, and more recently counting by regression has also gained considerable interest because it is feasible in relatively more crowded environments. However, the scenarios studied by existing regression-based techniques are rather diverse in terms of both evaluation data and experimental settings, which makes it difficult to compare them and draw general conclusions about their effectiveness. In addition, the contributions of individual components in the processing pipeline, such as feature extraction and perspective normalization, remain unclear and less well studied. This study describes and compares the state-of-the-art methods for video-imagery-based crowd counting, and provides a systematic evaluation of different methods using the same protocol. Moreover, we critically evaluate each processing component to identify potential bottlenecks encountered by existing techniques. Extensive evaluation is conducted on three public scene datasets, including a new shopping center environment with labelled ground truth for validation. Our study reveals new insights into solving the problem of crowd analysis for population profiling and density estimation, and considers open questions for future studies.

Keywords

Local Binary Pattern; Partial Least Squares Regression; Gaussian Process Regression; Crowd Density; Crowded Scene
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Chen Change Loy (1)
  • Ke Chen (2)
  • Shaogang Gong (2)
  • Tao Xiang (2)
  1. Department of Information Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
  2. Queen Mary University of London, London, UK
