A Generalized Strategy of Chinese Grammatical Error Diagnosis Based on Task Decomposition and Transformation

  • Conference paper
Knowledge Graph and Semantic Computing: Knowledge Graph Empowers Artificial General Intelligence (CCKS 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1923)

Abstract

Chinese Grammatical Error Diagnosis (CGED) is challenging because error types and subtypes are diverse, subtype occurrences are imbalanced, and new subtypes keep emerging, all of which threaten the generalization ability of CGED models. In this paper, we propose a CGED strategy based on sentence editing and character filling that decomposes and transforms the task according to the type of grammatical error and provides a corresponding solution for each. To improve error detection accuracy, we design a refined set of error types that makes better use of the training data. The correction task is transformed into a character slot filling task, whose performance and generalization to long-tail scenarios and the open domain can be improved by large-scale pre-trained models. Experiments on CGED evaluation datasets show that our approach outperforms the comparison models on all evaluation metrics and generalizes well.
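As a rough illustration of the transformation described in the abstract, the sketch below turns a detected error span into either a direct sentence edit or a masked template whose slots a pre-trained masked language model could fill. It is not the paper's implementation. The labels R (redundant), M (missing) and S (selection) follow the usual CGED shared-task convention; how each type maps to editing or filling, the function name `build_slot_template`, the BERT-style `[MASK]` token and the fixed slot count are all assumptions made for illustration. Word-order errors are left out.

```python
from typing import List

MASK = "[MASK]"  # assumed BERT-style mask token


def build_slot_template(chars: List[str], error_type: str,
                        start: int, end: int, n_slots: int = 1) -> List[str]:
    """Turn a detected error span into an edited or masked character sequence.

    chars      -- the sentence as a list of characters
    error_type -- "R" (redundant), "M" (missing) or "S" (selection)
    start, end -- 0-based span, end exclusive (for "M", start is the
                  insertion point and end is ignored)
    n_slots    -- number of character slots to fill (hypothetical parameter)
    """
    if error_type == "R":
        # Redundant characters: a pure sentence edit, nothing to fill.
        return chars[:start] + chars[end:]
    if error_type == "M":
        # Missing characters: open empty slots at the insertion point.
        return chars[:start] + [MASK] * n_slots + chars[start:]
    if error_type == "S":
        # Wrong characters: replace the span with slots to be refilled.
        return chars[:start] + [MASK] * n_slots + chars[end:]
    raise ValueError(f"unsupported error type: {error_type}")


if __name__ == "__main__":
    # One character is missing between the 4th and 5th characters.
    print("".join(build_slot_template(list("我昨天去校"), "M", 4, 4)))
    # prints: 我昨天去[MASK]校
```

The masked sequence could then be passed to any masked language model, which proposes characters for each slot. Choosing the number of slots and ranking the candidates are separate problems that this sketch does not address.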

Notes

  1. https://github.com/ymcui/LERT

  2. https://github.com/blcuicall/cged_datasets

  3. http://thuctc.thunlp.org/

  4. http://data.people.com.cn/rmrb

  5. http://www.ltp-cloud.com/download

Author information

Correspondence to Haihua Xie.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xie, H., Chen, X., Lyu, X., Liu, L. (2023). A Generalized Strategy of Chinese Grammatical Error Diagnosis Based on Task Decomposition and Transformation. In: Wang, H., Han, X., Liu, M., Cheng, G., Liu, Y., Zhang, N. (eds) Knowledge Graph and Semantic Computing: Knowledge Graph Empowers Artificial General Intelligence. CCKS 2023. Communications in Computer and Information Science, vol 1923. Springer, Singapore. https://doi.org/10.1007/978-981-99-7224-1_11

  • DOI: https://doi.org/10.1007/978-981-99-7224-1_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7223-4

  • Online ISBN: 978-981-99-7224-1

  • eBook Packages: Computer Science, Computer Science (R0)
