Chapter

Machine Learning for Multimodal Interaction

Volume 4299 of the series Lecture Notes in Computer Science pp 407-418

The ISL RT-06S Speech-to-Text System

  • Christian Fügen (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))
  • Shajith Ikbal (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))
  • Florian Kraft (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))
  • Kenichi Kumatani (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))
  • Kornel Laskowski (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))
  • John W. McDonough (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))
  • Mari Ostendorf (Lancaster University; Carnegie Mellon University; Interactive Systems Laboratories, Universität Karlsruhe (TH); Dept. of Electrical Engineering, University of Washington)
  • Sebastian Stüker (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))
  • Matthias Wölfel (Lancaster University; Interactive Systems Laboratories, Universität Karlsruhe (TH))


Abstract

This paper describes the 2006 lecture and conference meeting speech-to-text system developed at the Interactive Systems Laboratories (ISL) for the individual head-mounted microphone (IHM), single distant microphone (SDM), and multiple distant microphone (MDM) conditions. The system was evaluated in the RT-06S Rich Transcription Meeting Evaluation sponsored by the US National Institute of Standards and Technology (NIST). We describe the principal differences between our current system and those submitted in previous years: improved acoustic and language models, cross adaptation between systems with different front-ends and phoneme sets, and the use of various automatic speech segmentation algorithms.