Group Dynamics and Multimodal Interaction Modeling Using a Smart Digital Signage

Tung, Tony; Gomez, Randy; Kawahara, Tatsuya; Matsuyama, Takashi

doi:10.1007/978-3-642-33863-2_36

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7583)

Abstract

This paper presents a new multimodal system for group dynamics and interaction analysis. The framework consists of a microphone array and multiview video cameras placed on a digital signage display, which serves as a support for interaction. We show that visual information processing can be used to localize nonverbal communication events, and that these events can be synchronized with audio information. Our contribution is twofold: 1) we present a scalable, portable system for sensing multimodal interaction among multiple people, and 2) we propose a general framework for modeling audio-visual (A/V) multimodal interaction that employs speaker diarization for audio processing and hybrid dynamical systems (HDS) for video processing. HDS are used to represent communication dynamics between multiple people by capturing the characteristics of temporal structures in head motions. Experimental results show the processing of real-world group communication situations for joint attention estimation. We believe the proposed framework is promising for further research.
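As a rough illustration of what synchronizing visual events with audio information could look like, the sketch below pairs speaker-turn intervals (as a diarization step might output) with head-motion event intervals by temporal overlap. It is not the authors' implementation: the `Interval` type, the `align_events` function, the `min_overlap` threshold, and all labels and timestamps are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A labeled time interval in seconds (hypothetical representation)."""
    label: str
    start: float
    end: float


def overlap(a: Interval, b: Interval) -> float:
    """Return the duration (seconds) during which both intervals are active."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def align_events(speech, head_events, min_overlap=0.1):
    """Pair each speech turn with every head-motion event that overlaps it
    for at least `min_overlap` seconds (an assumed, illustrative threshold)."""
    pairs = []
    for s in speech:
        for h in head_events:
            ov = overlap(s, h)
            if ov >= min_overlap:
                pairs.append((s.label, h.label, round(ov, 3)))
    return pairs


# Made-up example data: two speaker turns and two head-motion events.
speech = [
    Interval("speaker_A", 0.0, 2.5),
    Interval("speaker_B", 2.5, 4.0),
]
head_events = [
    Interval("listener_nod", 1.8, 2.3),
    Interval("listener_turn", 3.5, 4.6),
]

pairs = align_events(speech, head_events)
for speaker, event, seconds in pairs:
    print(f"{event} overlaps {speaker} for {seconds} s")
```

Overlap-based pairing is only one simple way to relate the two modalities; the interval-based HDS modeling named in the abstract captures temporal structure in more detail than this sketch does.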

Author information

Authors and Affiliations

Academic Center for Computing and Media Studies, and Graduate School of Informatics, Kyoto University, Japan
Tony Tung, Randy Gomez, Tatsuya Kawahara & Takashi Matsuyama

Editor information

Editors and Affiliations

Dipartimento di Ingegneria Elettrica, Gestionale e Meccanica (DIEGM), Università degli Studi di Udine, Via delle Scienze, 208, 33100, Udine, Italy
Andrea Fusiello
IIT Istituto Italiano di Tecnologia, Via Morego 30, 16163, Genoa, Italy
Vittorio Murino
Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Modena e Reggio Emilia, Strada Vignolese, 905, 41125, Modena, Italy
Rita Cucchiara

About this paper

Cite this paper

Tung, T., Gomez, R., Kawahara, T., Matsuyama, T. (2012). Group Dynamics and Multimodal Interaction Modeling Using a Smart Digital Signage. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7583. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33863-2_36

DOI: https://doi.org/10.1007/978-3-642-33863-2_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33862-5
Online ISBN: 978-3-642-33863-2
eBook Packages: Computer Science, Computer Science (R0)
