Improving Thai Spelling Recognition with Tone Features
Spelling recognition has been used for several purposes, such as enhancing speech recognition systems and implementing name retrieval systems. In tonal languages, tone information is an important cue for recognizing speech, in addition to phones. In this paper, we present a method to improve the accuracy of spelling recognition in Thai, a tonal language, by incorporating tone-related acoustic features into a well-known front-end feature, Perceptual Linear Prediction (PLP) coefficients. The proposed method uses three kinds of tone information to enhance the original features: fundamental frequency (pitch), pitch delta and pitch acceleration. Compared to the baseline result obtained with the original features, our HMM-based recognition model shows improvements of 1.73%, 2.85% and 3.16% in letter accuracy for close-type, mix-type and open-type language models, respectively.
Keywords: Feature Vector, Speech Recognition, Language Model, Speech Recognition System, Pitch Information
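The feature augmentation described in the abstract can be illustrated with a minimal sketch. It appends pitch, pitch delta and pitch acceleration to each PLP frame. The abstract does not say how the deltas are computed or how many PLP coefficients are used, so several things here are assumptions: the regression-style delta (the formula documented in the HTK Book), the window of 2 frames, the 13-dimensional PLP vectors in the usage note, and the function names `regression_delta` and `append_tone_features`, which are hypothetical. Pitch extraction and the handling of unvoiced frames are not covered.

```python
def regression_delta(track, window=2):
    """Regression-based delta of a 1-D sequence.

    d[t] = sum_k k * (x[t+k] - x[t-k]) / (2 * sum_k k^2), k = 1..window.
    Assumption: this delta formula and window size are illustrative choices,
    not taken from the paper. Edge frames are replicated at both ends.
    """
    n = len(track)
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    deltas = []
    for t in range(n):
        acc = 0.0
        for k in range(1, window + 1):
            ahead = track[min(t + k, n - 1)]   # replicate last frame at the end
            behind = track[max(t - k, 0)]      # replicate first frame at the start
            acc += k * (ahead - behind)
        deltas.append(acc / denom)
    return deltas


def append_tone_features(plp_frames, f0_track, window=2):
    """Return PLP frames extended with [pitch, pitch delta, pitch acceleration].

    plp_frames: list of per-frame PLP coefficient lists (hypothetical input).
    f0_track:   list of per-frame pitch values, one per PLP frame.
    """
    if len(plp_frames) != len(f0_track):
        raise ValueError("PLP frames and pitch track must have the same length")
    pitch_delta = regression_delta(f0_track, window)
    # Acceleration is computed here as the delta of the delta.
    pitch_accel = regression_delta(pitch_delta, window)
    return [
        list(frame) + [f0, d, a]
        for frame, f0, d, a in zip(plp_frames, f0_track, pitch_delta, pitch_accel)
    ]
```

For example, calling `append_tone_features` on ten 13-dimensional frames with a ten-value pitch track returns ten 16-dimensional vectors, which could then be fed to an HMM-based recognizer in place of the original features.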