Abstract
Silent pauses occur frequently in spontaneous speech and tend to degrade the performance of typical speech recognizers. This paper proposes a fragment decoding method that uses silent pause detection to improve recognizer performance. The method automatically detects silent pauses and cuts a long utterance into speech fragments. At the decoding stage, instead of being skipped, the silent fragments are decoded separately. The final transcription of the whole utterance is derived from the corresponding fragment results. A further improvement reduces the run time spent on decoding. Because accurate word boundaries are introduced, misrecognition at silent frames is reduced. Recognition experiments on monolog speech in the tourism domain show that the proposed method outperforms the traditional frame skipping method.
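The abstract does not say how silent pauses are detected or how fragment results are combined, so the following is only a minimal sketch of the general idea under stated assumptions: silence is detected with a simple short-time energy threshold, a pause must last a minimum number of frames before the utterance is cut, and fragment transcriptions are joined with spaces. The names `split_on_silent_pauses`, `decode_utterance`, `decode_fragment`, and all parameter values are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def split_on_silent_pauses(
    samples: Sequence[float],
    frame_len: int = 160,            # assumed: 10 ms frames at 16 kHz
    energy_threshold: float = 1e-4,  # assumed silence threshold
    min_pause_frames: int = 20,      # assumed: cut only on pauses >= 200 ms
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of the speech fragments."""
    energies = frame_energies(samples, frame_len)
    fragments: List[Tuple[int, int]] = []
    start = None      # first frame of the fragment being built
    silent_run = 0    # consecutive silent frames seen inside it
    for idx, energy in enumerate(energies):
        if energy >= energy_threshold:
            if start is None:
                start = idx
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # Close the fragment just after its last speech frame.
                end = idx - silent_run + 1
                fragments.append((start * frame_len, end * frame_len))
                start, silent_run = None, 0
    if start is not None:
        # Utterance ended mid-fragment; drop any trailing silent frames.
        end = len(energies) - silent_run
        fragments.append((start * frame_len, end * frame_len))
    return fragments


def decode_utterance(
    samples: Sequence[float],
    decode_fragment: Callable[[Sequence[float]], str],
    **split_kwargs,
) -> str:
    """Decode each fragment separately and join the fragment results."""
    spans = split_on_silent_pauses(samples, **split_kwargs)
    return " ".join(decode_fragment(samples[s:e]) for s, e in spans)
```

Here `decode_fragment` stands in for whatever recognizer is applied to one fragment. Pauses shorter than `min_pause_frames` stay inside a fragment, so only long silences produce a cut.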
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Z., Liu, W., Jiang, W., Hu, P., Chen, M. (2012). Speech Fragment Decoding Techniques Using Silent Pause Detection. In: Liu, CL., Zhang, C., Wang, L. (eds) Pattern Recognition. CCPR 2012. Communications in Computer and Information Science, vol 321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33506-8_71
DOI: https://doi.org/10.1007/978-3-642-33506-8_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33505-1
Online ISBN: 978-3-642-33506-8
eBook Packages: Computer Science (R0)