Journal of Real-Time Image Processing, Volume 10, Issue 4, pp 785–803

Predicting and recognizing human interactions in public spaces

  • Fabio Poiesi
  • Andrea Cavallaro
Special Issue Paper

Abstract

We present an extensive survey of methods for recognizing human interactions and propose a method for predicting rendezvous areas in observable and unobservable regions using sparse motion information. Rendezvous areas indicate where people are likely to interact with each other or with static objects (e.g., a door, an information desk or a meeting point). The proposed method infers the direction of movement by calculating prediction lines from displacement vectors and temporally accumulates intersecting locations generated by prediction lines. The intersections are then used as candidate rendezvous areas and modeled as spatial probability density functions using Gaussian Mixture Models. We validate the proposed method to predict dynamic and static rendezvous areas on real-world datasets and compare it with related approaches.
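The intersection step described in the abstract can be illustrated with a minimal sketch. It assumes each observation is a 2D position with a displacement vector and treats each prediction line as a ray extending forward along the displacement. The function names `ray_intersection` and `accumulate_intersections`, the tuple-based data layout, and the tolerance `eps` are hypothetical choices made for this illustration, not taken from the paper. The Gaussian Mixture Model fitting step is not shown.

```python
from itertools import combinations


def ray_intersection(p, d, q, e, eps=1e-9):
    """Return the point where ray p + t*d meets ray q + s*e (t, s >= 0), or None.

    p, q are (x, y) positions; d, e are (dx, dy) displacement vectors.
    """
    cross = d[0] * e[1] - d[1] * e[0]
    if abs(cross) < eps:
        # Parallel directions or a zero displacement: no unique intersection.
        return None
    wx, wy = q[0] - p[0], q[1] - p[1]
    t = (wx * e[1] - wy * e[0]) / cross
    s = (wx * d[1] - wy * d[0]) / cross
    if t < 0 or s < 0:
        # The lines cross behind at least one of the moving points.
        return None
    return (p[0] + t * d[0], p[1] + t * d[1])


def accumulate_intersections(frames):
    """Collect intersection points over time.

    frames is a list; each element is a list of (position, displacement)
    pairs observed at one time step (an assumed layout for this sketch).
    """
    points = []
    for observations in frames:
        for (p, d), (q, e) in combinations(observations, 2):
            hit = ray_intersection(p, d, q, e)
            if hit is not None:
                points.append(hit)
    return points


# Two people walking towards a common location in two consecutive frames.
frames = [
    [((0.0, 0.0), (1.0, 1.0)), ((4.0, 0.0), (-1.0, 1.0))],
    [((1.0, 1.0), (1.0, 1.0)), ((3.0, 1.0), (-1.0, 1.0))],
]
candidates = accumulate_intersections(frames)
```

The accumulated `candidates` play the role of candidate rendezvous locations. A spatial density could then be fitted to them with any Gaussian Mixture Model implementation (for example `sklearn.mixture.GaussianMixture`, if scikit-learn is available); that choice is an assumption of this sketch rather than the authors' implementation.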

Keywords

Human interactions, Motion prediction, Behavior understanding, Crowd analysis, Gaussian mixture models

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Centre for Intelligent Sensing, Queen Mary University of London, London, UK
