Abstract
Deep learning has shown remarkable effectiveness in a variety of language tasks. This paper presents the work of Huawei Translation Services Center (HW-TSC), named HWCGEC, which achieved the best performance among the seven submitted results in NLPCC 2023 shared task 1, Chinese grammatical error correction (CGEC). CGEC aims to automatically correct grammatical errors that violate language rules, converting noisy input texts into clean output texts. Through experiments, this paper finds that after fine-tuning, BART, a sequence-to-sequence (seq2seq) model, outperforms ChatGLM, a large language model (LLM), when the training data is large and LoRA fine-tuning updates only a small number of parameters. In addition, the BART model achieves good results on the CGEC task through data augmentation and curriculum learning. Although the LLM performs poorly in our experiments, LLMs possess excellent logical abilities. As training sets become more diverse and data augmentation methods become more refined, LLMs trained in the supervised fine-tuning (SFT) mode are expected to achieve significant improvements on CGEC tasks in the future.
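The parameter-efficiency argument behind LoRA is simple to quantify: instead of updating a full weight matrix, LoRA learns a low-rank update ΔW = BA. The sketch below is illustrative only (the matrix size and rank are hypothetical, not the paper's actual ChatGLM configuration) and shows why the number of trainable parameters shrinks dramatically:

```python
# LoRA replaces a full (d_out x d_in) weight update with two low-rank
# factors: B with shape (d_out, r) and A with shape (r, d_in),
# where the rank r is much smaller than min(d_out, d_in).

def full_trainable_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when fine-tuning the full weight matrix."""
    return d_out * d_in

def lora_trainable_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters for one LoRA-adapted matrix: B plus A."""
    return d_out * r + r * d_in

# Hypothetical example: a 4096 x 4096 projection with LoRA rank r = 8.
full = full_trainable_params(4096, 4096)      # 16,777,216
lora = lora_trainable_params(4096, 4096, 8)   # 65,536
print(f"full: {full}, lora: {lora}, ratio: {lora / full:.4%}")
```

With these shapes, LoRA trains under 0.4% of the parameters of full fine-tuning for this single matrix, which is consistent with the observation above that the LoRA mode fine-tunes comparatively few parameters.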
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Su, C. et al. (2023). HWCGEC: HW-TSC's 2023 Submission for the NLPCC 2023 Chinese Grammatical Error Correction Task. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol. 14304. Springer, Cham. https://doi.org/10.1007/978-3-031-44699-3_6
DOI: https://doi.org/10.1007/978-3-031-44699-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44698-6
Online ISBN: 978-3-031-44699-3
eBook Packages: Computer Science (R0)