Groups and Crowds: Behaviour Analysis of People Aggregations

  • Sadegh Mohammadi
  • Francesco Setti
  • Alessandro Perina
  • Marco Cristani
  • Vittorio Murino
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 693)

Abstract

Automatic analysis of human behavior in social environments is a key topic for the computer vision community, with applications in security and video surveillance. While human behavior at the individual (single-person) level has been widely studied in past years, the analysis of group and crowd behavior is still at a preliminary stage, with room for new approaches to emerge. Recently, significant research effort has been dedicated to developing automated computer vision techniques intended to enhance the safety of our societies by monitoring human behaviors and actions at the group and crowd level. In particular, groups are usually formed by a number of people gathered for a private meeting, birthday party, or wedding, while we consider a crowd to be a huge number of people gathered to take part in a national or religious event, or to protest out of some dissatisfaction. In this chapter, we provide a broad overview of proposed approaches to human behavior analysis at the group and crowd level, as well as a detailed description of some of the most recent state-of-the-art methods, along with extensive experiments and comparisons.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Sadegh Mohammadi (1)
  • Francesco Setti (2)
  • Alessandro Perina (3)
  • Marco Cristani (2)
  • Vittorio Murino (1, 2)

  1. Pattern Analysis and Computer Vision (PAVIS), Istituto Italiano di Tecnologia, Genova, Italy
  2. Department of Computer Science, University of Verona, Verona, Italy
  3. Microsoft Corp, WDG Core Data Science, Redmond, USA