All-Path Decoding Algorithm for Segmental Based Speech Recognition

  • Yun Tang
  • Wenju Liu
  • Bo Xu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4274)

Abstract

Conventional speech processing adopts a dividable assumption: a speech utterance can be divided into non-overlapping feature sequences, with each segment representing an acoustic event or a label. The probability of a label sequence for an utterance is then approximated by the probability of the best segmentation of the utterance for that label sequence. In practice, however, the feature sequences of acoustic events may partially overlap, especially for neighboring phonemes within a syllable, and the best-segmentation approximation further reinforces the distortion introduced by the dividable assumption. In this paper, we propose an all-path decoding algorithm that fuses the information obtained from different segmentations (or paths) without adding noticeable computational load, which alleviates the weakness of the dividable assumption. Our experiments show that the new decoding algorithm effectively improves system performance in tasks with heavy insertion and deletion errors.
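The contrast between the best-segmentation approximation and a score that fuses all segmentations can be illustrated with a small dynamic program. The sketch below is not the algorithm from the paper, whose details are not given in this abstract. It is a generic segmental forward recursion in which the only difference between the two scores is whether candidate segmentations are combined with a maximum or with a log-sum. The function names, the `max_dur` limit, and the toy segment scorer are all assumptions made for illustration.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-scores."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def score_label_sequence(labels, num_frames, seg_logp, max_dur, combine):
    """Score a label sequence over the segmentations of num_frames frames.

    combine=max keeps only the best segmentation (best-path score);
    combine=logsumexp sums over every segmentation (all-path score).
    seg_logp(label, start, end) is a hypothetical segment scorer that
    returns a log-score for frames [start, end) under the given label.
    """
    neg_inf = -math.inf
    # alpha[k][t]: combined log-score of labels[:k] covering frames [0, t)
    alpha = [[neg_inf] * (num_frames + 1) for _ in range(len(labels) + 1)]
    alpha[0][0] = 0.0
    for k, label in enumerate(labels, start=1):
        for t in range(1, num_frames + 1):
            candidates = [
                alpha[k - 1][t - d] + seg_logp(label, t - d, t)
                for d in range(1, min(max_dur, t) + 1)
                if alpha[k - 1][t - d] > neg_inf
            ]
            if candidates:
                alpha[k][t] = combine(candidates)
    return alpha[len(labels)][num_frames]

# Toy segment scorer (made up, unnormalized): each label prefers one duration.
PREFERRED_DURATION = {"a": 2, "b": 3}

def toy_seg_logp(label, start, end):
    return -abs((end - start) - PREFERRED_DURATION[label])

best = score_label_sequence(["a", "b"], 5, toy_seg_logp, 4, max)
total = score_label_sequence(["a", "b"], 5, toy_seg_logp, 4, logsumexp)
print(best, total)
```

In this sketch both scores come from the same recursion and visit the same candidate segments, so summing over paths costs about the same as keeping the best one. The all-path score is never lower than the best-path score, because it also includes the contribution of every other segmentation.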

Keywords

Segment Model; Good Path; Acoustic Event; Observation Sequence; Label Sequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yun Tang (1)
  • Wenju Liu (1)
  • Bo Xu (1)
  1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
