Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models
Speech recognition is computationally demanding, particularly the stage that uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC’s processor is therefore advantageous. We present FPGA implementations of the decoder based on discrete and on continuous hidden Markov models (HMMs) representing monophones. The discrete version processes speech nearly 5,000 times faster than real time using just 12% of the slices of a Xilinx Virtex XCV1000, but has a lower recognition rate than the continuous implementation, which runs 75 times faster than real time and occupies 45% of the same device.
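To make the decoding stage concrete, the following is a minimal software sketch of Viterbi decoding for a discrete HMM, written in the log domain so that probabilities are added rather than multiplied. It is not the FPGA design described in the paper. The function name `viterbi`, its argument layout, and the toy two-state model in the usage lines are all assumptions made for illustration.

```python
import math

def viterbi(obs, log_init, log_trans, log_emit):
    """Most likely state path of a discrete HMM (illustrative sketch).

    obs       : list of observation symbol indices
    log_init  : log_init[s]     = log P(state s at t = 0)
    log_trans : log_trans[r][s] = log P(state s at t + 1 | state r at t)
    log_emit  : log_emit[s][o]  = log P(symbol o | state s)
    Returns (log probability of the best path, list of state indices).
    """
    n_states = len(log_init)
    # Initialisation: score each state against the first observation.
    delta = [log_init[s] + log_emit[s][obs[0]] for s in range(n_states)]
    backptr = []
    # Recursion: for every state keep only the best-scoring predecessor.
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for s in range(n_states):
            best_r = max(range(n_states),
                         key=lambda r: prev[r] + log_trans[r][s])
            delta.append(prev[best_r] + log_trans[best_r][s] + log_emit[s][o])
            ptr.append(best_r)
        backptr.append(ptr)
    # Termination and backtracking from the best final state.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

# Toy model (hypothetical values): 2 states, 3 observation symbols.
def _log_table(rows):
    return [[math.log(p) for p in row] for row in rows]

toy_init = [math.log(0.6), math.log(0.4)]
toy_trans = _log_table([[0.7, 0.3], [0.4, 0.6]])
toy_emit = _log_table([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
toy_score, toy_path = viterbi([0, 1, 2], toy_init, toy_trans, toy_emit)
```

The inner loop over predecessor states is independent for each destination state, which is the kind of regular, repeated computation that lends itself to parallel hardware. A continuous HMM would replace the `log_emit` table lookup with an evaluation of a probability density for each observation vector.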