Proposal of MMI-API and Library for JavaScript

  • Kouichi Katsurada
  • Taiki Kikuchi
  • Yurie Iribe
  • Tsuneo Nitta
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 14)


This paper proposes a multimodal interaction API (MMI-API) and a library for developing web-based multimodal applications. The API and library allow developers to embed multiple synchronized inputs and outputs in an application, and to specify concrete speech inputs and outputs as well as the actions of dialogue agents. Because the API and library are provided for JavaScript, a commonly used web-development language, they run in standard web browsers without special add-ons. Users can therefore experience multimodal interaction simply by visiting a web site in their browser. In addition to outlining the API and library, we present a practical example of the multimodal interaction system: an English pronunciation training application for Japanese students.
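
The abstract does not show the MMI-API itself, so the following is only a minimal sketch of what synchronizing multiple inputs in plain JavaScript might look like. Every name here (`MultimodalFusion`, `receive`, the modality labels, the 1500 ms window) is a hypothetical chosen for illustration and is not taken from the paper. The sketch fires a handler once every required modality has produced an input within a shared time window.

```javascript
// Hypothetical sketch: none of these names come from the MMI-API itself.
class MultimodalFusion {
  constructor(modalities, windowMs, onFused) {
    this.modalities = modalities; // e.g. ["speech", "pointer"]
    this.windowMs = windowMs;     // maximum age gap between fused inputs
    this.onFused = onFused;       // called with one value per modality
    this.pending = new Map();     // modality -> { value, time }
  }

  // Record one input. The timestamp is a parameter so the logic can be
  // exercised without real speech or pointer events.
  receive(modality, value, time = Date.now()) {
    if (!this.modalities.includes(modality)) return false;
    this.pending.set(modality, { value, time });

    // Drop inputs that are too old relative to the newest one.
    for (const [name, input] of this.pending) {
      if (time - input.time > this.windowMs) this.pending.delete(name);
    }
    if (this.pending.size < this.modalities.length) return false;

    // All modalities arrived inside the window: combine and reset.
    const fused = {};
    for (const [name, input] of this.pending) fused[name] = input.value;
    this.pending.clear();
    this.onFused(fused);
    return true;
  }
}

// Usage: a spoken command combined with a pointer target.
const results = [];
const fusion = new MultimodalFusion(["speech", "pointer"], 1500, (f) => results.push(f));
fusion.receive("speech", "play this word", 1000);
fusion.receive("pointer", "#word-3", 1800);
```

In a browser, `receive` would be called from real event handlers (for example a click listener and a speech recognition callback), which is the part the proposed library is described as providing.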


Keywords: MMI-API, JavaScript, web-based multimodal interaction





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Kouichi Katsurada (1)
  • Taiki Kikuchi (1)
  • Yurie Iribe (1)
  • Tsuneo Nitta (1)

  1. Toyohashi University of Technology, Toyohashi, Japan
