Fixed-Point Arithmetic

Bocchieri, Enrico

doi:10.1007/978-1-84800-143-5_12

Part of the book series: Advances in Pattern Recognition (ACVPR)

There are two main requirements for embedded/mobile systems: one is low power consumption for long battery life and miniaturization, the other is low unit cost for components produced in very large numbers (cell phones, set-top boxes). Both requirements are addressed by CPUs with integer-only arithmetic units, which motivates the fixed-point implementation of automatic speech recognition (ASR) algorithms. Large vocabulary continuous speech recognition (LVCSR) can greatly enhance the usability of devices whose small size and typical on-the-go use hinder more traditional interfaces. The increasing computational power of embedded CPUs will soon allow real-time LVCSR on portable and low-cost devices. This chapter reviews problems concerning the fixed-point implementation of ASR algorithms and presents fixed-point methods yielding the same recognition accuracy as the floating-point algorithms. In particular, the chapter illustrates a practical approach to the implementation of the frame-synchronous beam-search Viterbi decoder, N-gram language models, HMM likelihood computation and the mel-cepstrum front-end. The fixed-point recognizer is shown to be as accurate as the floating-point recognizer in several LVCSR experiments, on the DARPA Switchboard task and on an AT&T proprietary task, using different types of acoustic front-ends, HMMs and language models. Experiments on the DARPA Resource Management task, using the StrongARM-1100 206 MHz and the XScale PXA270 624 MHz CPUs, show that the fixed-point implementation enables real-time performance: the floating-point recognizer, running with floating-point software emulation, is several times slower for the same accuracy.
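
To make the idea of integer-only arithmetic concrete, the following is a minimal sketch of generic Q15 fixed-point operations (16-bit signed values with 15 fractional bits). It is not taken from the chapter: the Q15 format, the rounding and saturation policy, and all function names (`to_fixed`, `fixed_mul`, `fixed_dot`) are assumptions chosen for illustration, and the chapter's recognizer may well use different word lengths and scalings.

```python
# Illustrative Q15 fixed-point helpers (assumed format, not the chapter's).
Q = 15                                   # assumed number of fractional bits
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate16(x):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, x))


def to_fixed(x, q=Q):
    """Quantize a real value to a saturated 16-bit integer with q fractional bits."""
    return saturate16(int(round(x * (1 << q))))


def to_float(n, q=Q):
    """Convert a fixed-point integer back to a float (for inspection only)."""
    return n / (1 << q)


def fixed_mul(a, b, q=Q):
    """Multiply two Q-format values using integer operations only.

    The raw product carries 2*q fractional bits, so it is rounded and
    shifted right by q to return to the original format.
    """
    prod = a * b                         # fits in 32 bits for 16-bit operands
    return saturate16((prod + (1 << (q - 1))) >> q)


def fixed_dot(xs, ws, q=Q):
    """Dot product with a wide accumulator and a single final rescale."""
    acc = 0
    for x, w in zip(xs, ws):
        acc += x * w                     # accumulate at 2*q fractional bits
    return saturate16((acc + (1 << (q - 1))) >> q)


if __name__ == "__main__":
    a, b = to_fixed(0.3), to_fixed(0.2)
    print(to_float(fixed_mul(a, b)))     # close to 0.06, up to quantization error
```

In this sketch the dot product rescales once at the end rather than after every multiplication, which keeps the rounding error from accumulating; on a real integer CPU the accumulator width would have to be chosen so that the sum cannot overflow.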

Author information

Authors and Affiliations

AT&T Labs Research, Florham Park, New Jersey, USA
Enrico Bocchieri

Cite this chapter

Bocchieri, E. (2008). Fixed-Point Arithmetic. In: Automatic Speech Recognition on Mobile Devices and over Communication Networks. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84800-143-5_12

DOI: https://doi.org/10.1007/978-1-84800-143-5_12
Publisher Name: Springer, London
Print ISBN: 978-1-84800-142-8
Online ISBN: 978-1-84800-143-5
eBook Packages: Computer Science (R0)
