Predicting and recognizing human interactions in public spaces

Poiesi, Fabio; Cavallaro, Andrea

doi:10.1007/s11554-014-0428-8

Special Issue Paper
Published: 15 May 2014

Volume 10, pages 785–803, (2015)
Journal of Real-Time Image Processing

Fabio Poiesi¹ &
Andrea Cavallaro¹

Abstract

We present an extensive survey of methods for recognizing human interactions and propose a method for predicting rendezvous areas in observable and unobservable regions using sparse motion information. Rendezvous areas indicate where people are likely to interact with each other or with static objects (e.g., a door, an information desk or a meeting point). The proposed method infers the direction of movement by calculating prediction lines from displacement vectors and temporally accumulates intersecting locations generated by prediction lines. The intersections are then used as candidate rendezvous areas and modeled as spatial probability density functions using Gaussian Mixture Models. We validate the proposed method to predict dynamic and static rendezvous areas on real-world datasets and compare it with related approaches.
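The first two steps described in the abstract (prediction lines from displacement vectors, then temporal accumulation of their intersections) might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the preview does not give the exact formulation, so it assumes 2D image-plane positions per track, treats each prediction line as a forward ray, and intersects rays pairwise. All function names and the input layout are hypothetical. The Gaussian Mixture Model fitting step is not shown; the accumulated points would be its input.

```python
from itertools import combinations


def prediction_line(p_prev, p_curr):
    """Return (origin, direction) of a ray that starts at the current
    position and points along the displacement vector (assumed form)."""
    dx, dy = p_curr[0] - p_prev[0], p_curr[1] - p_prev[1]
    return p_curr, (dx, dy)


def intersect(ray_a, ray_b, eps=1e-9):
    """Intersection of two forward rays, or None when they are parallel,
    when a target is static, or when the crossing lies behind an origin."""
    (ax, ay), (adx, ady) = ray_a
    (bx, by), (bdx, bdy) = ray_b
    denom = adx * bdy - ady * bdx  # 2D cross product of the directions
    if abs(denom) < eps:
        return None
    # Solve a + t * da = b + s * db for the ray parameters t and s.
    t = ((bx - ax) * bdy - (by - ay) * bdx) / denom
    s = ((bx - ax) * ady - (by - ay) * adx) / denom
    if t < 0 or s < 0:
        return None  # the targets are moving away from the crossing point
    return (ax + t * adx, ay + t * ady)


def accumulate_intersections(frames):
    """frames: list of dicts {track_id: (x, y)}, one dict per time step.
    Returns every pairwise intersection found over consecutive steps."""
    points = []
    for prev, curr in zip(frames, frames[1:]):
        rays = [prediction_line(prev[k], curr[k]) for k in curr if k in prev]
        for ray_a, ray_b in combinations(rays, 2):
            p = intersect(ray_a, ray_b)
            if p is not None:
                points.append(p)
    return points


if __name__ == "__main__":
    # Two hypothetical tracks converging on the same location.
    frames = [
        {1: (0, 0), 2: (5, -5)},
        {1: (1, 0), 2: (5, -4)},
        {1: (2, 0), 2: (5, -3)},
    ]
    print(accumulate_intersections(frames))
```

The accumulated points are the candidate rendezvous locations; a density model such as a Gaussian mixture could then be fitted to them to obtain the spatial probability density the abstract mentions.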

Notes

Definition taken from Cambridge Dictionary, Cambridge University Press 2014.
iLIDS, Home Office multiple camera tracking scenario definition (UK), 2008.
http://www.eecs.qmul.ac.uk/~andrea/avss2007_d.html. Last accessed: December 2013.
http://www.robots.ox.ac.uk/ActiveVision/Research/Projects/2009bbenfold_headpose/project.html. Last accessed: December 2013.
http://www.cvg.rdg.ac.uk/PETS2009/a.html. Last accessed: December 2013.
Video results on the full sequence can be found here: ftp://motinas.elec.qmul.ac.uk/pub/ra_results/students003_border.zip.
Video results on the full sequence can be found here: ftp://motinas.elec.qmul.ac.uk/pub/ra_results/pets2009_noborder.zip.
Video results on the full sequence can be found here: ftp://motinas.elec.qmul.ac.uk/pub/ra_results/trainstation_border_a.zip ftp://motinas.elec.qmul.ac.uk/pub/ra_results/trainstation_border_b.zip.

Author information

Authors and Affiliations

Centre for Intelligent Sensing, Queen Mary University of London, Mile End Road, London, E1 4NS, UK
Fabio Poiesi & Andrea Cavallaro

Corresponding author

Correspondence to Fabio Poiesi.

Additional information

This work was supported in part by the Artemis JU and in part by the UK Technology Strategy Board through COPCAMS Project under Grant 332913.

About this article

Cite this article

Poiesi, F., Cavallaro, A. Predicting and recognizing human interactions in public spaces. J Real-Time Image Proc 10, 785–803 (2015). https://doi.org/10.1007/s11554-014-0428-8

Received: 14 September 2013
Accepted: 25 April 2014
Published: 15 May 2014
Issue Date: December 2015
DOI: https://doi.org/10.1007/s11554-014-0428-8
