A Configurable Logic Based Architecture for Real-Time Continuous Speech Recognition Using Hidden Markov Models

Stogiannos, Panagiotis; Dollas, Apostolos; Digalakis, Vassilis

doi:10.1023/A:1008197523254

Published: 01 March 2000

Volume 24, pages 223–240, (2000)

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

An architecture is presented for real-time continuous speech recognition based on a modified hidden Markov model. The algorithm is adapted to the needs of continuous speech recognition by efficient encoding of the state space and logarithmic encoding of the weights, so that products can be computed as sums. The paper presents the algorithm and its application-related modifications, the mapping of the algorithm to a special-purpose architecture, and the detailed design of this architecture using configurable logic. Emphasis is placed on how the attributes of the algorithm are exploited in a configurable-logic-based design. A concrete design example is presented with a coprocessor engine having one large FPGA, 64 Mbytes of synchronous DRAM (SDRAM), a small FPGA as an SDRAM controller, and 2 Mbytes of SRAM. Operating at 66 MHz, this engine performs roughly nine times as fast as a high-end personal computer running a fully optimized version of the same algorithm.
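
The logarithmic encoding mentioned in the abstract can be illustrated with a minimal sketch. What follows is a textbook Viterbi search over a generic hidden Markov model with all weights stored as logarithms, so every product of probabilities becomes a sum. It is not the modified algorithm or the hardware mapping described in the paper, which this page does not detail. The function names (to_log, viterbi_log) and the two-state toy model are assumptions made purely for illustration.

```python
import math

NEG_INF = float("-inf")


def to_log(p):
    """Encode a probability as a log weight; zero probability maps to -inf."""
    return math.log(p) if p > 0.0 else NEG_INF


def viterbi_log(obs, log_init, log_trans, log_emit):
    """Textbook Viterbi search in the log domain (illustrative only).

    log_init[s]     : log P(first state is s)
    log_trans[r][s] : log P(next state is s | current state is r)
    log_emit[s][o]  : log P(observing symbol o | state s)
    Returns (best_log_score, best_state_path).
    """
    n = len(log_init)
    # Products of probabilities become sums of log weights.
    score = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    backptrs = []
    for o in obs[1:]:
        new_score, ptr = [], []
        for s in range(n):
            # Pick the predecessor with the best accumulated log score.
            best_prev = max(range(n), key=lambda r: score[r] + log_trans[r][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s] + log_emit[s][o])
            ptr.append(best_prev)
        score = new_score
        backptrs.append(ptr)
    # Trace the best path backwards from the best final state.
    last = max(range(n), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path


# Hypothetical toy model: 2 states, 2 observation symbols.
init = [to_log(p) for p in (0.6, 0.4)]
trans = [[to_log(p) for p in row] for row in ((0.7, 0.3), (0.4, 0.6))]
emit = [[to_log(p) for p in row] for row in ((0.9, 0.1), (0.2, 0.8))]

best_log_score, best_path = viterbi_log([0, 0, 1], init, trans, emit)
print(best_path, math.exp(best_log_score))
```

Because the inner loop needs only additions and comparisons, a formulation of this kind avoids multipliers, which is the property the abstract points to when it says products can be computed as sums.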

Author information

Authors and Affiliations

Technical University of Crete, 73100, Chania, Crete, Greece
Panagiotis Stogiannos, Apostolos Dollas & Vassilis Digalakis

About this article

Cite this article

Stogiannos, P., Dollas, A. & Digalakis, V. A Configurable Logic Based Architecture for Real-Time Continuous Speech Recognition Using Hidden Markov Models. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 24, 223–240 (2000). https://doi.org/10.1023/A:1008197523254

Issue Date: March 2000