
Speech Fragment Decoding Techniques Using Silent Pause Detection

  • Zhanlei Yang
  • Wenju Liu
  • Wei Jiang
  • Pengfei Hu
  • Mingming Chen
Part of the Communications in Computer and Information Science book series (CCIS, volume 321)

Abstract

Silent pauses occur frequently in spontaneous speech, and they tend to degrade the performance of typical speech recognizers. This paper proposes a fragment decoding method that uses silent pause detection to improve recognizer performance. The method automatically detects silent pauses and cuts a long utterance into speech fragments. At the decoding stage, instead of skipping silent frames, the recognizer decodes these fragments separately. The final transcription of the whole utterance is derived from the corresponding fragment results. A further improvement reduces the run time spent on decoding. Because accurate word boundaries are introduced, misrecognition at silent frames is reduced. Recognition experiments on monologue speech in the tourism domain show that the proposed method outperforms the traditional frame-skipping method.
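The pipeline described in the abstract (detect silent pauses, cut the utterance into fragments, decode each fragment, join the results) can be illustrated with a minimal sketch. The abstract does not say how pauses are detected, so this sketch assumes a simple short-time energy threshold; the function names, the `threshold` and `min_pause_frames` parameters, and the `decode` callable standing in for a recognizer are all hypothetical and not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def split_at_silent_pauses(
    energies: Sequence[float], threshold: float, min_pause_frames: int
) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges of speech fragments.

    Assumption: a run of at least `min_pause_frames` frames below
    `threshold` is a silent pause and closes the current fragment.
    Shorter low-energy runs stay inside the fragment.
    """
    fragments: List[Tuple[int, int]] = []
    start = None      # first frame of the currently open fragment
    silent_run = 0    # consecutive low-energy frames seen so far
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # Close the fragment just before the pause began.
                fragments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Utterance ended inside a fragment; drop trailing silence.
        fragments.append((start, len(energies) - silent_run))
    return fragments


def decode_by_fragments(
    frames: Sequence[float],
    fragments: Sequence[Tuple[int, int]],
    decode: Callable[[Sequence[float]], str],
) -> str:
    """Decode each fragment separately and join the partial results."""
    parts = [decode(frames[s:e]) for s, e in fragments]
    return " ".join(p for p in parts if p)


# Toy usage: per-frame energies with one long pause (frames 6 to 8).
energies = [0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0]
fragments = split_at_silent_pauses(energies, threshold=0.5, min_pause_frames=2)
# fragments == [(2, 6), (9, 11)]
transcript = decode_by_fragments(
    energies, fragments, decode=lambda f: f"<{len(f)} frames>"
)
# transcript == "<4 frames> <2 frames>"
```

The single low-energy frame at index 4 is shorter than `min_pause_frames`, so it stays inside the first fragment, while the three-frame pause splits the utterance. A real system would pass acoustic features to an actual recognizer in place of the toy `decode` callable.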

Keywords

speech recognition, fragment decoding, silent pause detection



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Zhanlei Yang (1)
  • Wenju Liu (1)
  • Wei Jiang (1)
  • Pengfei Hu (1)
  • Mingming Chen (1)
  1. National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, China
