Language Resources and Evaluation, Volume 41, Issue 3–4, pp 389–407

The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms

  • Djamel Mostefa
  • Nicolas Moreau
  • Khalid Choukri
  • Gerasimos Potamianos
  • Stephen M. Chu
  • Ambrish Tyagi
  • Josep R. Casas
  • Jordi Turmo
  • Luca Cristoforetti
  • Francesco Tobia
  • Aristodemos Pnevmatikakis
  • Vassilis Mylonakis
  • Fotios Talantzis
  • Susanne Burger
  • Rainer Stiefelhagen
  • Keni Bernardin
  • Cedrick Rochet

Abstract

The analysis of lectures and meetings inside smart rooms has recently attracted much interest in the literature, and has been the focus of international projects and technology evaluations. A key enabler for progress in this area is the availability of appropriate multimodal and multi-sensory corpora, annotated with rich human activity information during lectures and meetings. This paper is devoted to exactly such a corpus, developed in the framework of the European project CHIL, “Computers in the Human Interaction Loop”. The resulting data set has the potential to drastically advance the state of the art, by providing numerous synchronized audio and video streams of real lectures and meetings, captured at multiple recording sites over the past 4 years. In particular, it overcomes typical shortcomings of other existing databases, which may contain limited sensory or monomodal data, exhibit constrained human behavior and interaction patterns, or lack data variability. The CHIL corpus is accompanied by rich manual annotations of both its audio and visual modalities. These provide a detailed multi-channel verbatim orthographic transcription that includes speaker turns and identities, acoustic condition information, and named entities, as well as video labels in multiple camera views that provide multi-person 3D head and 2D facial feature location information. Over the past 3 years, the corpus has been crucial to the evaluation of a multitude of audiovisual perception technologies for human activity analysis in lecture and meeting scenarios, demonstrating its utility during internal evaluations of the CHIL consortium, as well as at the recent international CLEAR and Rich Transcription evaluations. The CHIL corpus is publicly available to the research community.
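To make the annotation structure described above concrete, the following is a minimal Python sketch of how the two label types (speech segments and per-camera video labels) might be modeled in code. All class and field names here are hypothetical illustrations for exposition only, not the CHIL corpus's actual file format or schema.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    # Hypothetical data model mirroring the annotation types described above;
    # the real CHIL corpus defines its own transcription and label formats.

    @dataclass
    class SpeechSegment:
        """One speaker turn in the multi-channel orthographic transcription."""
        speaker_id: str                    # identity of the current speaker
        start_time: float                  # segment start, in seconds
        end_time: float                    # segment end, in seconds
        transcript: str                    # verbatim orthographic transcription
        acoustic_condition: str            # e.g. a label for the acoustic environment
        named_entities: List[str] = field(default_factory=list)

    @dataclass
    class VideoLabel:
        """One per-frame visual annotation from a single camera view."""
        camera_id: str                     # which of the multiple camera views
        frame_time: float                  # timestamp, shared clock with audio
        person_id: str                     # supports multi-person labeling
        head_position_3d: Tuple[float, float, float]   # 3D head location in room coordinates
        facial_features_2d: List[Tuple[float, float]]  # 2D facial feature points in the image

In such a design, the shared timestamps are what allow audio and video annotations to be aligned across the synchronized sensor streams.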

Keywords

Multimodal, Corpus, Annotation, Evaluation, Audio, Video

Acknowledgments

The work presented here was partly funded by the European Union under the integrated project CHIL, “Computers in the Human Interaction Loop” (Grant Number IST-506909).

References

  1. AMI—Augmented Multiparty Interaction. http://www.amiproject.org
  2. Burger, S., McLaren, V., & Yu, H. (2002). The ISL meeting corpus: The impact of meeting type on speech style. In Proceedings of the International Conference on Spoken Language Processing (ICSLP), Denver, USA.
  3. CALO—Cognitive Agent that Learns and Organizes. http://www.caloproject.sri.com/
  4. CHIL—Computers in the Human Interaction Loop. http://www.chil.server.de
  5. CLEAR—Classification of Events, Activities, and Relationships Evaluation and Workshop. http://www.clear-evaluation.org
  6. ELRA Catalogue of Language Resources. http://www.catalog.elra.info
  7. Janin, A., Baron, D., Edwards, J., Ellis, D., Gelbart, D., Morgan, N., Peskin, B., Pfau, T., Shriberg, E., Stolcke, A., & Wooters, C. (2003). The ICSI meeting corpus. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hong Kong, China.
  8. Mostefa, D., et al. (2005). CHIL Public Deliverable D7.6: Exploitation material for CHIL evaluation campaign 1. http://www.chil.server.de/servlet/is/8063/
  9. Mostefa, D., Garcia, M.-N., & Choukri, K. (2006). Evaluation of multimodal components within CHIL. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC), Genoa, Italy.
  10. Stiefelhagen, R., Bernardin, K., Bowers, R., Garofolo, J., Mostefa, D., & Soundararajan, P. (2007). The CLEAR 2006 evaluation. In R. Stiefelhagen & J. Garofolo (Eds.), Multimodal technologies for perception of humans: First International CLEAR Evaluation Workshop, CLEAR 2006 (Lecture Notes in Computer Science, vol. 4122, pp. 1–45). Springer.
  11. Stiefelhagen, R., & Garofolo, J. (Eds.). (2007). Multimodal technologies for perception of humans: First International Evaluation Workshop on Classification of Events, Activities and Relationships, CLEAR’06 (Lecture Notes in Computer Science, vol. 4122). Springer.
  12. The AGTK Annotation Tool. http://www.agtk.sourceforge.net
  13. The NIST MarkIII Microphone Array. http://www.nist.gov/smartspace/cmaiii.html
  14. The NIST Smart Space Project. http://www.nist.gov/smartspace/
  15. The Rich Transcription 2006 Spring Meeting Recognition Evaluation Website. http://www.nist.gov/speech/tests/rt/rt2006/spring
  16. The Transcriber Tool Home Page. http://www.trans.sourceforge.net
  17. VACE—Video Analysis and Content Extraction. https://www.control.nist.gov/dto/twiki/bin/view/Main/WebHome

Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  • Djamel Mostefa (1)
  • Nicolas Moreau (1)
  • Khalid Choukri (1)
  • Gerasimos Potamianos (2)
  • Stephen M. Chu (2)
  • Ambrish Tyagi (2, 3)
  • Josep R. Casas (4)
  • Jordi Turmo (4)
  • Luca Cristoforetti (5)
  • Francesco Tobia (5)
  • Aristodemos Pnevmatikakis (6)
  • Vassilis Mylonakis (6)
  • Fotios Talantzis (6)
  • Susanne Burger (7)
  • Rainer Stiefelhagen (8)
  • Keni Bernardin (8)
  • Cedrick Rochet (8)
  1. Evaluations and Language Resources Distribution Agency (ELDA), Paris, France
  2. IBM T.J. Watson Research Center, NY, USA
  3. Department of Computer Science and Engineering, The Ohio State University, Columbus, USA
  4. Universitat Politècnica de Catalunya, Barcelona, Spain
  5. ITC-IRST, Povo, Italy
  6. Athens Information Technology, Peania, Greece
  7. Interactive Systems Labs, Carnegie Mellon University, Pittsburgh, USA
  8. Interactive Systems Labs, Universität Karlsruhe (TH), Karlsruhe, Germany
