Error Detection and Correction Based on Chinese Phonemic Alphabet in Chinese Text

  • Chuen-Min Huang
  • Mei-Chen Wu
  • Ching-Che Chang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4617)

Abstract

Misspellings and confusions caused by similar pronunciation appear frequently in Chinese texts. Without careful proofreading, the problem is made worse by Chinese input method editors. The quality of Chinese writing could be improved, and the manual proofreading burden reduced, if an effective automatic error detection and correction mechanism were embedded in text editors. To date, research on automatic error detection and correction for Chinese text has faced many challenges and has performed poorly compared with equivalent tools for Western-language text. Given how prominent this problem is in Chinese writing, this study proposes a learning model based on the Chinese phonemic alphabet. The experimental results show that the model finds most incorrectly written words and improves the detection and correction rates.

Keywords

Error detection of Chinese text; Error correction of Chinese text; Language model; Chinese phonemic alphabet

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Chuen-Min Huang (1)
  • Mei-Chen Wu (1)
  • Ching-Che Chang (1)
  1. Department of Information Management, National Yunlin University of Science & Technology, Taiwan, R.O.C.
