Abstract
In this paper we describe a speech recognition system implemented with generalized dynamic Bayesian networks (DBNs). We discuss the design of the system and the features of the underlying toolkit, which we built to make efficient processing of speech and language data with Bayesian networks possible. These features include sparse representations of probability tables, a fast inference algorithm for probability tables, lazy evaluation of probability tables, algorithms for calculations with tree-shaped distributions, the ability to change distributions on the fly, and a generalization of DBN model structure.
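As a rough illustration of why sparse probability tables matter for speech and language data, the sketch below stores only nonzero entries and multiplies two tables by joining on their shared variables, so zero-probability combinations are never enumerated. It is not the toolkit's implementation: the class `SparseFactor`, its methods, and the toy word-transition numbers are all hypothetical and chosen only for this example.

```python
class SparseFactor:
    """Hypothetical sparse probability table: only nonzero entries are stored."""

    def __init__(self, variables, table):
        self.variables = tuple(variables)
        # Drop explicit zeros so they cost neither memory nor time.
        self.table = {k: p for k, p in table.items() if p != 0.0}

    def get(self, assignment):
        # Any assignment that is not stored has probability zero.
        return self.table.get(tuple(assignment), 0.0)

    def multiply(self, other):
        """Factor product that visits only pairs of nonzero entries."""
        shared = [v for v in self.variables if v in other.variables]
        extra = [v for v in other.variables if v not in self.variables]
        # Index the other table by its values for the shared variables.
        index = {}
        for key, q in other.table.items():
            row = dict(zip(other.variables, key))
            join_key = tuple(row[v] for v in shared)
            tail = tuple(row[v] for v in extra)
            index.setdefault(join_key, []).append((tail, q))
        result = {}
        for key, p in self.table.items():
            row = dict(zip(self.variables, key))
            join_key = tuple(row[v] for v in shared)
            for tail, q in index.get(join_key, []):
                result[key + tail] = p * q
        return SparseFactor(self.variables + tuple(extra), result)

    def marginalize(self, var):
        """Sum out one variable."""
        i = self.variables.index(var)
        result = {}
        for key, p in self.table.items():
            reduced = key[:i] + key[i + 1:]
            result[reduced] = result.get(reduced, 0.0) + p
        return SparseFactor(self.variables[:i] + self.variables[i + 1:], result)


# Toy example (made-up numbers): a distribution over the previous word and a
# sparse transition table in which the pair ("a", "a") never occurs.
prior = SparseFactor(["prev"], {("a",): 0.5, ("b",): 0.5})
transition = SparseFactor(
    ["prev", "cur"],
    {("a", "b"): 1.0, ("b", "a"): 0.25, ("b", "b"): 0.75},
)

joint = prior.multiply(transition)       # P(prev, cur)
posterior = joint.marginalize("prev")    # P(cur)
```

With a realistic vocabulary most word pairs have zero probability, so the joined table in a scheme like this grows with the number of observed pairs instead of the square of the vocabulary size.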
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wiggers, P., Rothkrantz, L.J.M., van de Lisdonk, R. (2010). Design and Implementation of a Bayesian Network Speech Recognizer. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science, vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_57
DOI: https://doi.org/10.1007/978-3-642-15760-8_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science, Computer Science (R0)