Multimedia Tools and Applications, Volume 73, Issue 1, pp 41–60

The large-scale crowd analysis based on sparse spatial-temporal local binary pattern

  • Hua Yang
  • Yihua Cao
  • Hang Su
  • Yawen Fan
  • Shibao Zheng


As a particular class of public security problem, large-scale crowd analysis plays an important role in video surveillance applications. This paper proposes a sparse spatial-temporal local binary pattern (SST-LBP) descriptor that extracts the dynamic texture of a walking crowd and can be applied to crowd density estimation and distribution analysis. The proposed approach consists of four steps. First, sparse selected locations, which vary notably in both the spatial and temporal domains, are extracted. Second, an SST-LBP algorithm extracts the local dynamic feature at these locations, and the statistical properties of the local features are used to describe the crowd feature. Third, the overall crowd density level is determined by classifying the crowd feature with a support vector machine. Finally, the local feature is used to represent the local density, from which the overall density distribution is described. To improve accuracy, perspective correction is introduced into both the detection of sparse selected locations and the spectrum analysis of the SST-LBP code. Experiments on different datasets show that the proposed SST-LBP method is effective and robust for large-scale crowd density estimation and distribution analysis, and indicate that the deformity correction is useful. Compared with other methods, the proposed method has low computational complexity and high efficiency. In addition, it performs well at all density levels and can present the local crowd distribution.
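The first two steps can be illustrated with a minimal sketch. It is not the authors' SST-LBP: the abstract does not specify the neighbourhood, the thresholds, or the encoding, so everything below is an assumption made for illustration. The function names (`sparse_points`, `st_lbp_code`, `crowd_feature`), the threshold values, and the 6-bit code built from four spatial neighbours plus the same pixel in the previous and next frames are all hypothetical. Frames are plain nested lists of grey levels; the SVM classification and the local density map (steps three and four) are not shown.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale frame as rows of intensities


def sparse_points(prev: Frame, curr: Frame, nxt: Frame,
                  spatial_thr: int = 20, temporal_thr: int = 20) -> List[Tuple[int, int]]:
    """Keep interior pixels that vary notably in both space and time.

    Assumed criterion: central-difference gradient and frame difference
    must both reach their (hypothetical) thresholds.
    """
    h, w = len(curr), len(curr[0])
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = abs(curr[y][x + 1] - curr[y][x - 1])
            gy = abs(curr[y + 1][x] - curr[y - 1][x])
            dt = abs(nxt[y][x] - prev[y][x])
            if max(gx, gy) >= spatial_thr and dt >= temporal_thr:
                points.append((y, x))
    return points


def st_lbp_code(prev: Frame, curr: Frame, nxt: Frame, y: int, x: int) -> int:
    """Assumed 6-bit spatial-temporal binary pattern at (y, x).

    Bits 0-3: up, right, down, left neighbours in the current frame.
    Bits 4-5: the same pixel in the previous and next frames.
    A bit is set when the neighbour is >= the centre value.
    """
    centre = curr[y][x]
    neighbours = [curr[y - 1][x], curr[y][x + 1], curr[y + 1][x], curr[y][x - 1],
                  prev[y][x], nxt[y][x]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= centre:
            code |= 1 << bit
    return code


def crowd_feature(prev: Frame, curr: Frame, nxt: Frame) -> Tuple[List[float], int]:
    """Normalised 64-bin histogram of codes at the sparse points, plus point count."""
    points = sparse_points(prev, curr, nxt)
    hist = [0.0] * 64
    for y, x in points:
        hist[st_lbp_code(prev, curr, nxt, y, x)] += 1.0
    if points:
        hist = [count / len(points) for count in hist]
    return hist, len(points)
```

In a pipeline along the lines the abstract describes, the histogram would be the input to a density-level classifier, and the per-location codes or point counts would feed a local density map. How the paper actually forms those features is not stated here.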


Keywords: Video surveillance; Crowd density; Local binary pattern; Sparse point; Density distribution



This research is partly supported by NSFC (No. 61102099, No. 61171172), the Scientific and Technological Committee of Shanghai (No. 11231203102, No. 10231204002) and the National Basic Research Program (973 Program, No. 2010CB731406). We sincerely thank the University of Reading (PETS2009) and the University of Minnesota for the test video datasets and for permission to use them.



Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  • Hua Yang (1, 2)
  • Yihua Cao (1)
  • Hang Su (1)
  • Yawen Fan (1)
  • Shibao Zheng (1)

  1. Institution of Image Communication and Information Processing, Department of EE, Shanghai Jiaotong University, Shanghai, China
  2. Shanghai Key Laboratory of Digital Media Processing and Transmission, Shanghai, People’s Republic of China
