All-Path Decoding Algorithm for Segmental Based Speech Recognition
Conventional speech processing adopts a dividable assumption: a speech utterance can be divided into non-overlapping feature sequences, each of which represents one acoustic event or label. The probability of a label sequence given an utterance is then approximated by the probability of the best segmentation of the utterance for that label sequence. In real speech, however, the feature sequences of acoustic events may partially overlap, especially for neighboring phonemes within a syllable, and the best-segmentation approximation further reinforces the distortion introduced by the dividable assumption. In this paper, we propose an all-path decoding algorithm that fuses the information obtained from different segmentations (or paths) without a noticeable increase in computational load, thereby alleviating the weakness of the dividable assumption. Our experiments show that the new decoding algorithm effectively improves system performance on tasks with heavy insertion and deletion errors.
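The contrast between the best-segmentation approximation and a score that uses every segmentation can be illustrated with a small dynamic program. The sketch below is not the algorithm proposed in the paper, whose fusion rule the abstract does not specify. It assumes that "fusing" paths means summing their probabilities, and the names `decode_scores`, `seg_logp`, and `toy_scorer` are hypothetical, introduced only for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def decode_scores(seg_logp, num_frames, num_labels):
    """Score one fixed label sequence against an utterance.

    seg_logp(k, s, e) is a hypothetical segment scorer (an assumption,
    not from the paper): the log-probability that frames [s, e) were
    produced by the k-th label of the sequence.

    Returns (best_path_logp, all_path_logp):
      best_path_logp keeps only the single best segmentation (max),
      all_path_logp sums over every segmentation (log-sum-exp).
    """
    neg_inf = -math.inf
    # best[k][e] / total[k][e]: score of the first k labels covering frames [0, e)
    best = [[neg_inf] * (num_frames + 1) for _ in range(num_labels + 1)]
    total = [[neg_inf] * (num_frames + 1) for _ in range(num_labels + 1)]
    best[0][0] = 0.0
    total[0][0] = 0.0

    for k in range(1, num_labels + 1):
        for e in range(k, num_frames + 1):
            best_cands = []
            total_cands = []
            # s is the start frame of the k-th segment; every segment is non-empty
            for s in range(k - 1, e):
                score = seg_logp(k - 1, s, e)
                best_cands.append(best[k - 1][s] + score)
                total_cands.append(total[k - 1][s] + score)
            best[k][e] = max(best_cands)
            total[k][e] = logsumexp(total_cands)

    return best[num_labels][num_frames], total[num_labels][num_frames]


def toy_scorer(k, s, e):
    """Toy scorer: every frame contributes log(0.5), whatever the label."""
    return (e - s) * math.log(0.5)


best_logp, all_logp = decode_scores(toy_scorer, num_frames=4, num_labels=2)
print(best_logp, all_logp)
```

With the toy scorer, all three ways of splitting four frames into two non-empty segments score the same, so the all-path score exceeds the best-path score by exactly log 3. Both quantities come out of the same loop over segment boundaries, which suggests why taking every path into account need not add much computation over a best-path search.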
Keywords: Segment Model, Good Path, Acoustic Event, Observation Sequence, Label Sequence