An Exchange Format for Multimodal Annotations

  • Thomas Schmidt
  • Susan Duncan
  • Oliver Ehmer
  • Jeffrey Hoyt
  • Michael Kipp
  • Dan Loehr
  • Magnus Magnusson
  • Travis Rose
  • Han Sloetjes
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5509)


This paper presents the results of a joint effort by a group of multimodality researchers and tool developers to improve interoperability between several tools used for the annotation and analysis of multimodality. Each tool has specific strengths, so it can be desirable to use several different tools on the same data within one project. However, this usually requires tedious conversion between formats. We propose a common exchange format for multimodal annotation, based on the annotation graph (AG) formalism, which is supported by import and export routines in the respective tools. In the current version of this format, the information common to all tools (the common denominator) can be reliably exchanged between them, and additional information can be stored in a standardized way.
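To make the annotation graph idea concrete, here is a minimal Python sketch of the underlying data model as it is generally described: nodes that may carry a time offset, and labelled arcs between nodes that represent individual annotations. It is not the exchange format proposed in the paper. The class names, the `tier` field, and the example labels are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A graph node; the time offset (in seconds) is optional."""
    id: str
    offset: Optional[float] = None


@dataclass(frozen=True)
class Arc:
    """A labelled arc between two nodes, i.e. one annotation."""
    start: str  # id of the start node
    end: str    # id of the end node
    tier: str   # hypothetical name for the annotation layer
    label: str


@dataclass
class AnnotationGraph:
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)

    def add_node(self, node_id, offset=None):
        self.nodes[node_id] = Node(node_id, offset)

    def add_arc(self, start, end, tier, label):
        # Arcs may only connect nodes that already exist in the graph.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both end points must be existing nodes")
        self.arcs.append(Arc(start, end, tier, label))

    def tier_intervals(self, tier):
        """Return (start_time, end_time, label) for every arc on one tier."""
        return [
            (self.nodes[a.start].offset, self.nodes[a.end].offset, a.label)
            for a in self.arcs
            if a.tier == tier
        ]


def build_example():
    """Two tiers sharing time-anchored nodes (illustrative data only)."""
    g = AnnotationGraph()
    g.add_node("n0", 0.0)
    g.add_node("n1", 0.8)
    g.add_node("n2", 1.5)
    g.add_arc("n0", "n2", "speech", "look over there")
    g.add_arc("n1", "n2", "gesture", "pointing")
    return g
```

Because annotations on different tiers share the same time-anchored nodes, a tool that only understands time intervals with labels can still read each tier via `tier_intervals`, which is one way to picture the kind of common-denominator information the abstract refers to.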


Keywords (machine-generated, not supplied by the authors): Exchange Format, Annotation Tool, Annotation Graph, Tool Format, Mitre Corporation





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Thomas Schmidt (1)
  • Susan Duncan (2)
  • Oliver Ehmer (3)
  • Jeffrey Hoyt (4)
  • Michael Kipp (5)
  • Dan Loehr (4)
  • Magnus Magnusson (6)
  • Travis Rose (7)
  • Han Sloetjes (8)

  1. University of Hamburg, Germany
  2. University of Chicago, USA
  3. University of Freiburg, Germany
  4. MITRE Corporation, USA
  5. DFKI Saarbrücken, Germany
  6. Human Behavior Laboratory Reykjavik, Iceland
  7. Virginia Tech, USA
  8. MPI for Psycholinguistics Nijmegen, Netherlands
