Embedded ViaVoice

Beran, Tomáš; Bergl, Vladimír; Hampl, Radek; Krbec, Pavel; Šedivý, Jan; Tydlitát, Bořivoj; Vopička, Josef

doi:10.1007/978-3-540-30120-2_34

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3206)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

In this paper we present IBM Embedded ViaVoice (EVV), a speech recognizer for embedded devices. It is designed for grammar-based command-and-control applications with medium to large vocabularies. We show which algorithms and technologies were used to cope with the fundamental problems of embedded systems: limited CPU performance, slow memory, no floating-point unit, and the division of memory into ROM and RAM. The scalable EVV system described is capable of real-time performance on embedded platforms as slow as 40 MIPS with as little as about 1 MB of RAM.
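
To make the "no floating point unit" constraint concrete, the following is a minimal illustrative sketch of integer-only (fixed-point) scoring of a diagonal-covariance Gaussian. It is not taken from the paper and is not claimed to be how EVV works: the Q15 format, the function names (`to_fixed`, `fixed_mul`, `neg_log_likelihood_fixed`), and the choice of a Gaussian score are all assumptions made for illustration.

```python
# Illustrative sketch only: fixed-point arithmetic as one generic way to
# work without a floating-point unit. All names and the Q15 format are
# assumptions, not details from the EVV paper.

Q = 15            # number of fractional bits (Q15), assumed for this sketch
ONE = 1 << Q      # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Quantize a float to a Q15 integer (would be done offline,
    e.g. when preparing model tables)."""
    return int(round(x * ONE))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q15 values with rounding, using integer ops only."""
    return (a * b + (1 << (Q - 1))) >> Q


def neg_log_likelihood_fixed(x, mean, inv_var, const):
    """Integer-only score: const + 0.5 * sum((x - mean)^2 * inv_var).

    All arguments are Q15 integers; the result is a Q15 integer.
    """
    acc = 0
    for xi, mi, wi in zip(x, mean, inv_var):
        d = xi - mi                       # difference, still Q15
        acc += fixed_mul(fixed_mul(d, d), wi)
    return const + (acc >> 1)             # shift right by 1 halves the sum


# Usage: quantize the inputs once, then score with integers only.
x = [to_fixed(v) for v in (0.5, -0.25)]
mean = [to_fixed(v) for v in (0.25, 0.0)]
inv_var = [to_fixed(v) for v in (1.0, 0.5)]
const = to_fixed(0.125)
score = neg_log_likelihood_fixed(x, mean, inv_var, const)
```

Python integers never overflow, so this sketch hides a concern that matters on a real 32-bit target, where the intermediate product in `fixed_mul` would need a wider accumulator or saturation. The float-to-fixed conversion is the only step that uses floating point, and it can be performed ahead of time.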

Author information

Authors and Affiliations

Voice Technologies and System Group, IBM Research Prague, “The Park” building, V Parku 2294/4, 148 00, Praha-Chodov, Czech Republic
Tomáš Beran, Vladimír Bergl, Radek Hampl, Pavel Krbec, Jan Šedivý, Bořivoj Tydlitát & Josef Vopička

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Botanická 68a, CZ-602 00, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Karel Pala

About this paper

Cite this paper

Beran, T. et al. (2004). Embedded ViaVoice. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_34

DOI: https://doi.org/10.1007/978-3-540-30120-2_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23049-6
Online ISBN: 978-3-540-30120-2
eBook Packages: Springer Book Archive
