Detecting small group activities from multimodal observations

Brdiczka, Oliver; Maisonnasse, Jérôme; Reignier, Patrick; Crowley, James L.

doi:10.1007/s10489-007-0074-y

Published: 15 July 2007

Volume 30, pages 47–57, (2009)

Applied Intelligence

Abstract

This article addresses the problem of detecting configurations and activities of small groups of people in an augmented environment. The proposed approach takes as input a continuous stream of observations coming from different sensors in the environment. The goal is to separate distinct distributions of these observations corresponding to distinct group configurations and activities. This article describes an unsupervised method based on the calculation of the Jeffrey divergence between histograms over observations. These histograms are generated from adjacent windows of variable size slid from the beginning to the end of a meeting recording. The peaks of the resulting Jeffrey divergence curves are detected using successive robust mean estimation. After a merging and filtering process, the retained peaks are used to select the best model, i.e. the best allocation of observation distributions for a meeting recording. These distinct distributions can be interpreted as distinct segments of group configuration and activity. To evaluate this approach, five small group meetings, one seminar, and one cocktail party meeting were recorded. The observations of the small group meetings and the seminar were generated by a speech activity detector, while the observations of the cocktail party meeting were generated by both the speech activity detector and a visual tracking system. The authors measured the correspondence between detected segments and labeled group configurations and activities. The obtained results are promising, in particular as the method is completely unsupervised.
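The core step described in the abstract, comparing histograms from two adjacent windows as they slide over the recording, can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details taken from the article: the abstract does not state the divergence formula, so the sketch uses the symmetrised form found in Puzicha et al. (each histogram is compared against the mean of the two); the observations are taken to be hypothetical integer codes; and a single fixed window size is used, whereas the article varies the window size. The function names are illustrative only, and the sketch stops at locating the largest value of the curve; it does not reproduce the successive robust mean estimation, merging, or model selection steps.

```python
import math
from collections import Counter


def histogram(window, symbols):
    """Normalised histogram of the observation codes in one window."""
    counts = Counter(window)
    total = len(window)
    return [counts[s] / total for s in symbols]


def jeffrey_divergence(p, q):
    """Symmetrised divergence of p and q against their mean (assumed form)."""
    total = 0.0
    for pi, qi in zip(p, q):
        m = (pi + qi) / 2.0
        # Terms with zero probability contribute nothing (0 * log 0 := 0).
        if pi > 0:
            total += pi * math.log(pi / m)
        if qi > 0:
            total += qi * math.log(qi / m)
    return total


def divergence_curve(observations, window_size):
    """Divergence between the windows left and right of each position t."""
    symbols = sorted(set(observations))
    curve = []
    for t in range(window_size, len(observations) - window_size + 1):
        left = histogram(observations[t - window_size:t], symbols)
        right = histogram(observations[t:t + window_size], symbols)
        curve.append((t, jeffrey_divergence(left, right)))
    return curve


# Toy stream (hypothetical): one observation code for the first 20 samples,
# a different code for the next 20, so the distribution changes at t = 20.
observations = [0] * 20 + [1] * 20
curve = divergence_curve(observations, window_size=5)
peak_t, peak_value = max(curve, key=lambda pair: pair[1])
print(peak_t, round(peak_value, 3))
```

On this toy stream the curve is zero wherever both windows see the same code and rises to its maximum where the left window holds only the first code and the right window only the second, which is the kind of peak the method treats as a candidate segment boundary.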

References

Aoki PM, Romaine M, Szymanski MH, Thornton JD, Wilson D, Woodruff A (2003) The mad hatter’s cocktail party: a social mobile audio space supporting multiple simultaneous conversations. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 425–432
Basu S (2002) Conversational scene analysis. PhD thesis, MIT Department of EECS, Cambridge, MA
Bilmes JA (1998) A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Technical Report ICSI-TR-97-021, University of California, Berkeley
Bobick A, Intille S, Davis J, Baird F, Pinhanez C, Campbell L, Ivanov Y, Schutte A, Wilson A (1999) The KidsRoom: a perceptually-based interactive and immersive story environment. In: Presence (USA), vol 8, pp 369–393
Brdiczka O, Maisonnasse J, Reignier P (2005) Automatic detection of interaction groups. In: Proceedings of the international conference multimodal interfaces, pp 32–36, October 2005
Brdiczka O, Vaufreydaz D, Maisonnasse J, Reignier P (2006) Unsupervised segmentation of meeting configurations and activities using speech activity detection. In: Maglogiannis I, Karpouzis K, Bramer M (eds) IFIP international federation of information processing. Artificial intelligence applications and innovations, vol 204. Springer, Boston, pp 195–203
Brumitt B, Meyers B, Krumm J, Kern A, Shafer SA (2000) EasyLiving: technologies for intelligent environments. In: Proceedings of the international conference on handheld and ubiquitous computing, pp 12–29
Burger S, MacLaren V, Yu H (2002) The ISL meeting corpus: the impact of meeting type on speech style. In: Proceedings of the international conference on spoken language processing, pp 301–304
Caporossi A, Hall D, Reignier P, Crowley JL (2004) Robust visual tracking from dynamic control of processing. In: Proceedings of the international workshop on performance evaluation for tracking and surveillance, pp 23–32
Choudhury T, Pentland A (2004) Characterizing social interactions using the sociometer. In: Proceedings NAACOS 2004, June 2004
Le Gal Ch, Martin J, Lux A, Crowley JL (2001) Smartoffice: design of an intelligent environment. IEEE Intell Syst 16(4): 60–66
McCowan I, Gatica-Perez D, Bengio S, Lathoud G, Barnard M, Zhang D (2005) Automatic analysis of multimodal group actions in meetings. IEEE Trans Pattern Anal Mach Intell 27(3): 305–317
Muehlenbrock M, Brdiczka O, Snowdon D, Meunier J-L (2004) Learning to detect user activity and availability from a variety of sensor data. In: Proceedings of the IEEE international conference on pervasive computing and communications, March 2004, pp 13–22
Oliver N, Rosario B, Pentland A (2000) A Bayesian computer vision system for modeling human interactions. IEEE Trans Pattern Anal Mach Intell 22(8): 831–843
Puzicha J, Hofmann Th, Buhmann J (1997) Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. In: Proceedings of the international conference on computer vision and pattern recognition, pp 267–272
Qian RJ, Sezan MI, Mathews KE (1998) Face tracking using robust statistical estimation. In: Proceedings workshop on perceptual user interfaces, San Francisco
Suchman L (1987) Plans and situated actions: the problem of human–machine communication. Cambridge University Press, Cambridge
Stiefelhagen R, Steusloff H, Waibel A (2004) CHIL—computers in the human interaction loop. In: Proceedings of the international workshop on image analysis for multimedia interactive services
Vaufreydaz D (2001) IST-2000-28323 FAME: facilitating agent for multi-cultural exchange (WP4). European Commission project IST-2000-28323, October 2001
Zaidenberg S, Brdiczka O, Reignier P, Crowley JL (2006) Learning context models for the recognition of scenarios. In: Maglogiannis I, Karpouzis K, Bramer M (eds) IFIP international federation of information processing. Artificial intelligence applications and innovations, vol 204. Springer, Boston, pp 86–97
Zhang D, Gatica-Perez D, Bengio S, McCowan I, Lathoud G (2004) Multimodal group action clustering in meetings. In: Proceedings of the international workshop on video surveillance & sensor networks

Author information

Authors and Affiliations

INRIA Rhône-Alpes, 655 avenue de l’Europe, 38334, Saint Ismier, Cedex, France
Oliver Brdiczka, Jérôme Maisonnasse, Patrick Reignier & James L. Crowley

Corresponding author

Correspondence to Oliver Brdiczka.

Additional information

A short version of this article [6] obtained the Best Paper Award of the 3rd IFIP Conference on Artificial Intelligence Applications and Innovations (AIAI) 2006.

About this article

Cite this article

Brdiczka, O., Maisonnasse, J., Reignier, P. et al. Detecting small group activities from multimodal observations. Appl Intell 30, 47–57 (2009). https://doi.org/10.1007/s10489-007-0074-y

Received: 25 October 2006
Accepted: 24 May 2007
Published: 15 July 2007
Issue Date: February 2009
DOI: https://doi.org/10.1007/s10489-007-0074-y
