Abstract
Spelling recognition enhances a speech recognizer by helping it cope with incorrectly recognized words and out-of-vocabulary words. This paper presents a general framework for Thai speech recognition enhanced with spelling recognition. To implement Thai spelling recognition, we analyze the Thai alphabet and the methods used to spell with it. Based on hidden Markov models, we propose a method for constructing a Thai spelling recognition system from an existing continuous speech corpus. To compensate for the difference in speed between spelling utterances and continuous speech utterances, the utterance speed is adjusted. Our system achieves up to 87.37% correctness and 87.18% accuracy with the mix-type language model.
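The abstract reports correctness and accuracy as separate figures without defining them. The sketch below assumes the definitions commonly used by the HTK toolkit's scoring tool, where correctness ignores insertion errors and accuracy penalizes them; the abstract does not confirm that these are the definitions used. The function name `htk_style_scores` and the error counts in the example are hypothetical and are not taken from the paper.

```python
def htk_style_scores(n_ref, deletions, substitutions, insertions):
    """Return (correctness, accuracy) in percent.

    Assumed definitions (HTK-style, not confirmed by the abstract):
      correctness = (N - D - S) / N * 100
      accuracy    = (N - D - S - I) / N * 100
    where N is the number of reference labels.
    """
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    hits = n_ref - deletions - substitutions
    correctness = 100.0 * hits / n_ref
    accuracy = 100.0 * (hits - insertions) / n_ref
    return correctness, accuracy


# Hypothetical counts, chosen only to illustrate the arithmetic.
correctness, accuracy = htk_style_scores(
    n_ref=1000, deletions=40, substitutions=60, insertions=5
)
print(correctness, accuracy)  # 90.0 89.5
```

Under these assumed definitions, the gap between the two scores equals the insertion rate I/N. If the reported 87.37% and 87.18% come from the same test run, the 0.19-point difference would correspond to roughly two inserted letters per thousand reference letters.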
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pisarn, C., Theeramunkong, T. (2004). Speed Compensation for Improving Thai Spelling Recognition with a Continuous Speech Corpus. In: Aagesen, F.A., Anutariya, C., Wuwongse, V. (eds) Intelligence in Communication Systems. INTELLCOMM 2004. Lecture Notes in Computer Science, vol 3283. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30179-0_9
DOI: https://doi.org/10.1007/978-3-540-30179-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23893-5
Online ISBN: 978-3-540-30179-0
eBook Packages: Springer Book Archive