Chinese Grammatical Error Correction Using Statistical and Neural Models

  • Junpei Zhou
  • Chen Li (email author)
  • Hengyou Liu
  • Zuyi Bao
  • Guangwei Xu
  • Linlin Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11109)

Abstract

This paper introduces the Alibaba NLP team’s system for the NLPCC 2018 shared task on Chinese Grammatical Error Correction (GEC). Learners of Chinese as a Second Language (CSL) can use the system to correct grammatical errors in their own writing. We propose a method that combines statistical and neural models for the GEC task. The method consists of two modules: a correction module and a combination module. In the correction module, two statistical models and one neural model generate correction candidates for each input sentence. The two statistical models are a rule-based model and a statistical machine translation (SMT)-based model; the neural model is a neural machine translation (NMT)-based model. The combination module works hierarchically. At the lower level, we train several models with different configurations and combine them. At the higher level, we combine the two statistical models and the neural model. Our system ranked second on the official leaderboard of the shared task.
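
The two-level combination described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how candidates are merged, so majority voting is used here as a stand-in, and every name (`vote`, `combine_low`, `combine_high`, the toy correctors) is hypothetical. The sketch also assumes that the lower level merges configurations of one model type, which is one reading of the abstract.

```python
from collections import Counter
from typing import Callable, Sequence

# A corrector maps an input sentence to one corrected sentence.
Corrector = Callable[[str], str]


def vote(candidates: Sequence[str], fallback: str) -> str:
    """Return the most frequent candidate; return `fallback` on a tie or no input.

    Majority voting is an assumption made for this sketch only.
    """
    counts = Counter(candidates).most_common()
    if not counts:
        return fallback
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback
    return counts[0][0]


def combine_low(variants: Sequence[Corrector]) -> Corrector:
    """Lower level: merge several configurations of one model type."""
    def corrector(sentence: str) -> str:
        # On disagreement without a majority, keep the sentence unchanged.
        return vote([v(sentence) for v in variants], fallback=sentence)
    return corrector


def combine_high(families: Sequence[Corrector]) -> Corrector:
    """Higher level: merge the rule-based, SMT-based and NMT-based outputs."""
    def corrector(sentence: str) -> str:
        return vote([f(sentence) for f in families], fallback=sentence)
    return corrector


# Toy stand-ins for trained models (hypothetical; real models would be learned).
def rule_based(s: str) -> str:
    return s.replace("我们们", "我们")

def smt_a(s: str) -> str:
    return s.replace("我们们", "我们")

def smt_b(s: str) -> str:
    return s  # this configuration misses the error

def nmt_a(s: str) -> str:
    return s.replace("我们们", "我们")

def nmt_b(s: str) -> str:
    return s.replace("我们们", "我们")


system = combine_high([
    rule_based,
    combine_low([smt_a, smt_b]),
    combine_low([nmt_a, nmt_b]),
])

print(system("我们们去学校"))
```

In this toy run the two SMT configurations disagree, so the lower level leaves the sentence unchanged for that branch; the rule-based and NMT branches still agree on the correction, and the higher level outputs it. A real system would more plausibly rank candidates with a score (for example from a language model) instead of counting votes.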

Keywords

Grammatical Error Correction, Combination, Statistical machine translation, Neural machine translation


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Junpei Zhou (1, 2)
  • Chen Li (1, email author)
  • Hengyou Liu (1)
  • Zuyi Bao (1)
  • Guangwei Xu (1)
  • Linlin Li (1)

  1. Alibaba Group, Hangzhou, China
  2. Zhejiang University, Hangzhou, China
