Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models
Speech recognition is computationally demanding, particularly the stage that uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC’s processor is therefore advantageous. We present FPGA implementations of the decoder based on discrete and on continuous hidden Markov models (HMMs) representing monophones. The discrete version can process speech nearly 5,000 times faster than real time using just 12% of the slices of a Xilinx Virtex XCV1000, but has a lower recognition rate than the continuous implementation, which runs 75 times faster than real time and occupies 45% of the same device.
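For readers unfamiliar with the decoding stage mentioned in the abstract, the following is a minimal software sketch of Viterbi decoding over a discrete HMM, written in the log domain. It is not the authors' FPGA design: the function name `viterbi`, the helper `safe_log`, and the two-state left-to-right toy model with a two-symbol codebook are all assumptions made purely for illustration.

```python
import math


def safe_log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, log_init, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM.

    obs       : list of observation symbol indices, one per frame
    log_init  : log_init[j]     = log P(first state is j)
    log_trans : log_trans[i][j] = log P(state j follows state i)
    log_emit  : log_emit[j][k]  = log P(symbol k is emitted in state j)
    """
    n_states = len(log_init)
    # delta[j]: best log-probability of any path ending in state j so far
    delta = [log_init[j] + log_emit[j][obs[0]] for j in range(n_states)]
    backptrs = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for j in range(n_states):
            # Pick the predecessor state that maximises the path score
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log_trans[i][j])
            new_delta.append(delta[best_i] + log_trans[best_i][j]
                             + log_emit[j][o])
            ptr.append(best_i)
        delta = new_delta
        backptrs.append(ptr)
    # Trace back from the best final state to recover the state sequence
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path


# Hypothetical toy model: 2 states, left-to-right, 2 codebook symbols
init = [1.0, 0.0]
trans = [[0.5, 0.5],
         [0.0, 1.0]]
emit = [[0.9, 0.1],
        [0.2, 0.8]]

log_prob, path = viterbi(
    [0, 0, 1, 1],
    [safe_log(p) for p in init],
    [[safe_log(p) for p in row] for row in trans],
    [[safe_log(p) for p in row] for row in emit],
)
print(path, math.exp(log_prob))
```

Working in the log domain turns the products of probabilities into sums, which avoids numerical underflow on long utterances. In a continuous HMM the table lookup `log_emit[j][o]` would be replaced by the evaluation of a probability density for each state, which is one plausible reason the continuous implementation reported above needs more logic and runs more slowly.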
- Book Title
- Field-Programmable Logic and Applications: Reconfigurable Computing Is Going Mainstream
- Book Subtitle
- 12th International Conference, FPL 2002 Montpellier, France, September 2–4, 2002 Proceedings
- pp 202-211
- Series Title
- Lecture Notes in Computer Science
- Publisher
- Springer Berlin Heidelberg
- Copyright Holder
- Springer-Verlag Berlin Heidelberg
- Editor Affiliations
- 4. Institut für Datentechnik, FG Mikroelektronische Systeme, Technische Universität Darmstadt
- 5. Microelectronics Department, LIRMM
- Author Affiliations
- 6. Electronic, Electrical and Computer Engineering, University of Birmingham, B15 2TT, Edgbaston, Birmingham, UK