Design and Implementation of a Bayesian Network Speech Recognizer

Wiggers, Pascal; Rothkrantz, Leon J. M.; van de Lisdonk, Rob

doi:10.1007/978-3-642-15760-8_57

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6231)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

In this paper we describe a speech recognition system implemented with generalized dynamic Bayesian networks (DBNs). We discuss the design of the system and the features of the underlying toolkit we constructed, which makes efficient processing of speech and language data with Bayesian networks possible. Features include sparse representations of probability tables, a fast algorithm for inference with probability tables, lazy evaluation of probability tables, algorithms for calculations with tree-shaped distributions, the ability to change distributions on the fly, and a generalization of DBN model structure.
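
To make the first listed feature concrete, the sketch below illustrates what a sparse representation of a probability table can look like. It is not the toolkit's code, which is not shown on this page: the class name `SparseCPT`, the function `predict`, and the toy word probabilities are all hypothetical, chosen only for illustration. The idea is that only nonzero entries are stored, so lookups of missing entries return zero and sums skip them entirely.

```python
from collections import defaultdict


class SparseCPT:
    """Hypothetical sparse table for P(child | parents); zeros are not stored."""

    def __init__(self):
        # parent assignment (tuple) -> {child value: probability}
        self._rows = defaultdict(dict)

    def set(self, parents, child, prob):
        # Zero-probability entries are left implicit.
        if prob > 0.0:
            self._rows[tuple(parents)][child] = prob

    def prob(self, parents, child):
        # A missing entry is an implicit zero.
        return self._rows.get(tuple(parents), {}).get(child, 0.0)

    def support(self, parents):
        # Iterate over the nonzero child values for one parent assignment.
        return self._rows.get(tuple(parents), {}).items()

    def num_stored(self):
        return sum(len(row) for row in self._rows.values())


def predict(cpt, prior):
    """Marginal over the child: sum over parents of prior * P(child | parents).

    Only stored (nonzero) entries are visited.
    """
    out = defaultdict(float)
    for parents, p_parent in prior.items():
        for child, p in cpt.support(parents):
            out[child] += p_parent * p
    return dict(out)


# Toy example (made-up numbers): next-word distribution given the previous word.
cpt = SparseCPT()
cpt.set(("the",), "cat", 0.5)
cpt.set(("the",), "dog", 0.5)
cpt.set(("a",), "cat", 1.0)
marginal = predict(cpt, {("the",): 0.5, ("a",): 0.5})
```

With large vocabularies most entries of such a table are zero, so the cost of the sum in `predict` scales with the number of stored entries rather than with the full table size.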

Author information

Authors and Affiliations

Man–Machine Interaction Group, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands
Pascal Wiggers, Leon J. M. Rothkrantz & Rob van de Lisdonk

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Aleš Horák
Faculty of Informatics, Masaryk University, Botanická 68a, CZ-602 00, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Karel Pala

About this paper

Cite this paper

Wiggers, P., Rothkrantz, L.J.M., van de Lisdonk, R. (2010). Design and Implementation of a Bayesian Network Speech Recognizer. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science, vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_57

DOI: https://doi.org/10.1007/978-3-642-15760-8_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science, Computer Science (R0)
