Non-uniform Kernel Allocation Based Parsimonious HMM
In a conventional Gaussian-mixture-based Hidden Markov Model (HMM), all states are usually modeled with a uniform, fixed number of Gaussian kernels. In this paper, we propose allocating kernels non-uniformly to construct a more parsimonious HMM. Different numbers of Gaussian kernels are allocated to the states so as to optimize the Minimum Description Length (MDL) criterion, which combines data likelihood with a model complexity penalty. Using the likelihoods obtained in Baum-Welch training, we develop an efficient backward kernel pruning algorithm and show that it is optimal under two mild assumptions. Two databases, Resource Management and Microsoft Mandarin Speech Toolbox, are used to test the proposed parsimonious modeling algorithm. The new parsimonious models reduce the baseline word recognition error rate by 11.1% and 5.7% relative. Alternatively, at the same performance level, a model compression of 35-50% can be obtained.
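The trade-off between likelihood and model size can be illustrated with a small sketch. This is not the paper's algorithm: it assumes the common two-part MDL form (negative log-likelihood plus (k/2) log N for k free parameters and N samples), assumes each state's log-likelihood depends only on its own kernel count, and uses hypothetical names (`mdl`, `backward_prune`, `state_loglik`) and made-up likelihood values.

```python
import math


def mdl(log_likelihood, num_params, num_samples):
    """Two-part MDL score (lower is better): -log L + (k / 2) * log N.

    Assumed standard form; the paper's exact penalty may differ.
    """
    return -log_likelihood + 0.5 * num_params * math.log(num_samples)


def backward_prune(state_loglik, params_per_kernel, num_samples):
    """Greedily remove kernels, one at a time, while the total MDL decreases.

    state_loglik[s][m - 1] is a hypothetical log-likelihood of state s
    when it is modeled with m kernels. Every state starts with its
    maximum kernel count and keeps at least one kernel.
    """
    counts = [len(table) for table in state_loglik]
    # Penalty saved each time a single kernel is removed.
    penalty = 0.5 * params_per_kernel * math.log(num_samples)
    while True:
        best_state, best_delta = None, 0.0
        for s, table in enumerate(state_loglik):
            m = counts[s]
            if m <= 1:
                continue
            # Change in MDL when going from m to m - 1 kernels:
            # likelihood lost minus penalty saved.
            delta = (table[m - 1] - table[m - 2]) - penalty
            if delta < best_delta:
                best_state, best_delta = s, delta
        if best_state is None:
            return counts  # no single removal lowers the MDL any further
        counts[best_state] -= 1


# Made-up per-state log-likelihoods for 1, 2, 3, ... kernels.
example_loglik = [
    [-500.0, -450.0, -445.0, -444.0],
    [-300.0, -298.0, -297.0],
    [-800.0, -700.0, -650.0, -620.0],
]
allocation = backward_prune(example_loglik, params_per_kernel=4, num_samples=1000)
print(allocation)
```

In this toy setting, states whose extra kernels add little likelihood are pruned down, while a state that benefits from every kernel keeps its full allocation, which is the non-uniform outcome the abstract describes.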
Keywords: Hidden Markov Model, Gaussian Kernel, Minimum Description Length, Word Error Rate, Bayesian Information Criterion