Abstract
Chinese Grammatical Error Diagnosis (CGED) is challenging because error types and subtypes are diverse, subtype occurrences are imbalanced, and new subtypes keep emerging, all of which threaten the generalization ability of CGED models. In this paper, we propose a CGED strategy based on sentence editing and character filling that decomposes and transforms the task according to the type of grammatical error and provides a corresponding solution for each. To improve error detection accuracy, we design a refined set of error types that makes better use of the training data. The correction task is transformed into a character slot filling task, whose performance and generalization to long-tail scenarios and the open domain can be improved by large-scale pre-trained models. Experiments on CGED evaluation datasets show that our approach outperforms the comparison models on all evaluation metrics and generalizes well.
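To make the idea of recasting correction as character slot filling concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `build_slot_template`, the `[MASK]` token, the edit tuple layout, and the assumption that the number of slots is already known are all hypothetical. The error labels M (missing), S (selection), and R (redundant) follow the usual CGED shared-task convention; word-order errors and the paper's refined error type set are not modeled here.

```python
from typing import List, Tuple

MASK = "[MASK]"  # assumed mask token, in the style of BERT-like models

# Hypothetical edit layout: (error_type, start, end, n_slots),
# with 0-based, end-exclusive character offsets.
Edit = Tuple[str, int, int, int]


def build_slot_template(sentence: str, edits: List[Edit]) -> List[str]:
    """Turn detected errors into a character sequence with mask slots.

    "M" (missing):   insert n_slots masks before position `start`
    "S" (selection): replace sentence[start:end] with n_slots masks
    "R" (redundant): delete sentence[start:end]; n_slots is ignored
    """
    chars = list(sentence)
    # Apply edits from right to left so earlier offsets stay valid.
    for err, start, end, n_slots in sorted(edits, key=lambda e: e[1], reverse=True):
        if err == "M":
            chars[start:start] = [MASK] * n_slots
        elif err == "S":
            chars[start:end] = [MASK] * n_slots
        elif err == "R":
            del chars[start:end]
        else:
            raise ValueError(f"unsupported error type: {err}")
    return chars


if __name__ == "__main__":
    # One character is assumed missing at the end of the sentence.
    print(build_slot_template("我喜欢吃苹", [("M", 5, 5, 1)]))
    # ['我', '喜', '欢', '吃', '苹', '[MASK]']
```

In a setup like this, a pre-trained masked language model would then be asked to predict a character for each `[MASK]` slot. How the number of slots is chosen and how candidate fillings are ranked are left open in this sketch.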
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xie, H., Chen, X., Lyu, X., Liu, L. (2023). A Generalized Strategy of Chinese Grammatical Error Diagnosis Based on Task Decomposition and Transformation. In: Wang, H., Han, X., Liu, M., Cheng, G., Liu, Y., Zhang, N. (eds) Knowledge Graph and Semantic Computing: Knowledge Graph Empowers Artificial General Intelligence. CCKS 2023. Communications in Computer and Information Science, vol 1923. Springer, Singapore. https://doi.org/10.1007/978-981-99-7224-1_11
DOI: https://doi.org/10.1007/978-981-99-7224-1_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7223-4
Online ISBN: 978-981-99-7224-1
eBook Packages: Computer Science, Computer Science (R0)