Improving Thai Spelling Recognition with Tone Features

  • Chutima Pisarn
  • Thanaruk Theeramunkong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4139)


Spelling recognition has been used for several purposes, such as enhancing speech recognition systems and implementing name retrieval systems. In tonal languages, tone information is an important cue for recognizing speech, in addition to phones. In this paper, we present a method to improve the accuracy of spelling recognition in Thai, a tonal language, by incorporating tone-related acoustic features into a well-known front-end feature, Perceptual Linear Prediction (PLP) coefficients. The proposed method augments the original features with three kinds of tone information: fundamental frequency (pitch), pitch delta and pitch acceleration. Compared with the baseline result obtained from the original features, our HMM-based recognition model improves letter accuracy by 1.73%, 2.85% and 3.16% for the close-type, mix-type and open-type language models, respectively.
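The feature augmentation described in the abstract can be illustrated with a minimal sketch. It assumes that a per-frame pitch track and per-frame PLP vectors are already available, and it computes delta and acceleration with a standard regression formula over neighbouring frames. The function names, the window size and the edge handling are illustrative assumptions; the abstract does not state how the authors derived or normalized these features.

```python
from typing import List, Sequence


def delta(values: Sequence[float], window: int = 2) -> List[float]:
    """Regression-based delta over +/- `window` frames.

    Frames beyond either end are replaced by the first or last frame
    (an assumed edge-handling choice, not taken from the paper).
    """
    n = len(values)
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        acc = 0.0
        for k in range(1, window + 1):
            ahead = values[min(t + k, n - 1)]
            behind = values[max(t - k, 0)]
            acc += k * (ahead - behind)
        out.append(acc / denom)
    return out


def append_tone_features(
    plp_frames: Sequence[Sequence[float]], pitch: Sequence[float]
) -> List[List[float]]:
    """Append pitch, pitch delta and pitch acceleration to each PLP frame."""
    if len(plp_frames) != len(pitch):
        raise ValueError("PLP frames and pitch track must have equal length")
    pitch_delta = delta(pitch)
    pitch_accel = delta(pitch_delta)  # acceleration = delta of the delta
    return [
        list(frame) + [p, d, a]
        for frame, p, d, a in zip(plp_frames, pitch, pitch_delta, pitch_accel)
    ]


if __name__ == "__main__":
    # Toy input: three 2-dimensional "PLP" frames and a made-up pitch track in Hz.
    frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    track = [120.0, 125.0, 131.0]
    for row in append_tone_features(frames, track):
        print(row)
```

Each output frame is three dimensions longer than its input frame. In practice, unvoiced frames have no defined pitch, so a real front end would also need a policy for them (for example, interpolation); that step is outside this sketch.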


Keywords: Feature Vector, Speech Recognition, Language Model, Speech Recognition System, Pitch Information
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Chutima Pisarn (1)
  • Thanaruk Theeramunkong (1)
  1. Sirindhorn International Institute of Technology, Bangkadi, Muang, Thailand
