Abstract
Mitsubishi Electric Corporation has been developing speech applications for 20 years. Our main targets are car navigation systems, elevator control systems, and other industrial devices. This chapter describes the automatic speech recognition technologies developed for these applications. To achieve real-time processing with limited computational resources, a syllable N-gram-based text search is proposed. To cope with reverberant environments in elevators, spectral-subtraction-based dereverberation with reverberation time estimation is used. In addition, discriminative methods for acoustic and language models are developed.
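The abstract only names the syllable N-gram-based text search, so the following is a minimal sketch of the general idea rather than the chapter's actual method. It assumes that database entries and the recognizer output are both available as syllable sequences, and that entries are ranked by the number of syllable bigrams they share with the query. The function names, the toy romanized entries, and the scoring rule are all hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(syllables, n=2):
    """Return the contiguous n-grams of a syllable sequence as tuples."""
    return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


def build_index(entries, n=2):
    """Build an inverted index: syllable n-gram -> names of entries containing it."""
    index = defaultdict(set)
    for name, syllables in entries.items():
        for gram in ngrams(syllables, n):
            index[gram].add(name)
    return index


def search(index, query_syllables, n=2):
    """Rank entries by the number of distinct query n-grams they contain."""
    scores = Counter()
    for gram in set(ngrams(query_syllables, n)):
        for name in index.get(gram, ()):
            scores[name] += 1
    return scores.most_common()


# Hypothetical toy database: entry name -> simplified syllable sequence.
ENTRIES = {
    "Tokyo Station": ["to", "kyo", "e", "ki"],
    "Kyoto Station": ["kyo", "to", "e", "ki"],
    "Tokyo Tower": ["to", "kyo", "ta", "wa"],
}
INDEX = build_index(ENTRIES)

# Assumed recognizer output with one syllable error ("ki" heard as "gi").
RESULTS = search(INDEX, ["to", "kyo", "e", "gi"])
print(RESULTS)
```

Because the lookup only touches index entries for the n-grams present in the query, the cost grows with the query length rather than with the size of the database, which is one plausible reason such a scheme suits devices with small resources. A partially misrecognized query can still retrieve the intended entry as long as some n-grams survive.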
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Tachioka, Y., Hanazawa, T., Narita, T., Ishii, J. (2017). Advanced ASR Technologies for Mitsubishi Electric Speech Applications. In: Watanabe, S., Delcroix, M., Metze, F., Hershey, J. (eds) New Era for Robust Speech Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-64680-0_20
DOI: https://doi.org/10.1007/978-3-319-64680-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64679-4
Online ISBN: 978-3-319-64680-0
eBook Packages: Computer Science, Computer Science (R0)