VACE Multimodal Meeting Corpus

Chen, Lei; Rose, R. Travis; Qiao, Ying; Kimbara, Irene; Parrill, Fey; Welji, Haleema; Han, Tony Xu; Tu, Jilin; Huang, Zhongqiang; Harper, Mary; Quek, Francis; Xiong, Yingen; McNeill, David; Tuttle, Ronald; Huang, Thomas

doi:10.1007/11677482_4

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3869)

Included in the following conference series:

International Workshop on Machine Learning for Multimodal Interaction

Abstract

In this paper, we report on the infrastructure we have developed to support our research on multimodal cues for understanding meetings. With our focus on multimodality, we investigate the interaction among speech, gesture, posture, and gaze in meetings. For this purpose, a high-quality multimodal corpus is being produced.

Author information

Authors and Affiliations

School of Electrical Engineering, Purdue University, West Lafayette, IN, USA
Lei Chen, Zhongqiang Huang & Mary Harper
CHCI, Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
R. Travis Rose, Ying Qiao, Francis Quek & Yingen Xiong
Department of Psychology, University of Chicago, Chicago, IL, USA
Irene Kimbara, Fey Parrill, Haleema Welji & David McNeill
Beckman Institute, University of Illinois Urbana Champaign, Urbana, IL, USA
Tony Xu Han, Jilin Tu & Thomas Huang
Air Force Institute of Technology, Dayton, OH, USA
Ronald Tuttle

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, Scotland
Steve Renals
IDIAP Research Institute, Martigny, Switzerland
Samy Bengio

About this paper

Cite this paper

Chen, L. et al. (2006). VACE Multimodal Meeting Corpus. In: Renals, S., Bengio, S. (eds) Machine Learning for Multimodal Interaction. MLMI 2005. Lecture Notes in Computer Science, vol 3869. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677482_4

DOI: https://doi.org/10.1007/11677482_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32549-9
Online ISBN: 978-3-540-32550-5
eBook Packages: Computer Science, Computer Science (R0)
