Speed Compensation for Improving Thai Spelling Recognition with a Continuous Speech Corpus

Pisarn, Chutima; Theeramunkong, Thanaruk

doi:10.1007/978-3-540-30179-0_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3283)

Included in the following conference series:

International Conference on Intelligence in Communication Systems

Abstract

Spelling recognition is an approach to enhance a speech recognizer to cope with incorrectly recognized words and out-of-vocabulary words. This paper presents a general framework for Thai speech recognition, enhanced with spelling recognition. To implement Thai spelling recognition, Thai alphabets and their spelling methods are analyzed. Based on hidden Markov models, we propose a method to construct a Thai spelling recognition system using an existing continuous speech corpus. To compensate for speed differences between spelling utterances and continuous speech utterances, the adjustment of utterance speed is taken into account. Our system achieves up to 87.37% correctness and 87.18% accuracy with the mix-type language model.
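
The abstract reports correctness and accuracy separately without defining them. A minimal sketch follows, assuming the conventions common in HMM-based recognizer evaluation (as in HTK's scoring tool): correctness ignores insertion errors, while accuracy penalizes them. The function names and the sample counts are hypothetical and are not taken from the paper.

```python
def correctness(n_ref: int, deletions: int, substitutions: int) -> float:
    """Percent correct, assumed definition: (N - D - S) / N * 100."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * (n_ref - deletions - substitutions) / n_ref


def accuracy(n_ref: int, deletions: int, substitutions: int, insertions: int) -> float:
    """Percent accuracy, assumed definition: (N - D - S - I) / N * 100."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * (n_ref - deletions - substitutions - insertions) / n_ref


# Hypothetical counts for illustration only (not the paper's data).
n_ref, n_del, n_sub, n_ins = 100, 5, 10, 3
corr = correctness(n_ref, n_del, n_sub)
acc = accuracy(n_ref, n_del, n_sub, n_ins)
print(f"correctness = {corr:.2f}%, accuracy = {acc:.2f}%")
```

Under these assumed definitions, the difference between the two figures equals the insertion rate, so the reported gap between 87.37% and 87.18% would correspond to insertions amounting to about 0.19% of the reference labels.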

Author information

Authors and Affiliations

Sirindhorn International Institute of Technology, 131 Moo 5, Tiwanont Rd., Bangkadi, Muang, Phathumthani, 12000, Thailand
Chutima Pisarn & Thanaruk Theeramunkong

Editor information

Editors and Affiliations

Department of Telematics, Norwegian University of Science and Technology (NTNU), N7491, Trondheim, Norway
Finn Arve Aagesen
Shinawatra University, 99 Moo 10 Bangtoey, 12160, Samkok, Pathum Thani, Thailand
Chutiporn Anutariya
School of Engineering and Technology, Asian Institute of Technology, P.O. Box 4, 12120, Klong Luang, Pathum Thani, Thailand
Vilas Wuwongse

About this paper

Cite this paper

Pisarn, C., Theeramunkong, T. (2004). Speed Compensation for Improving Thai Spelling Recognition with a Continuous Speech Corpus. In: Aagesen, F.A., Anutariya, C., Wuwongse, V. (eds) Intelligence in Communication Systems. INTELLCOMM 2004. Lecture Notes in Computer Science, vol 3283. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30179-0_9

DOI: https://doi.org/10.1007/978-3-540-30179-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23893-5
Online ISBN: 978-3-540-30179-0
eBook Packages: Springer Book Archive
