Abstract
Grammatical Error Correction (GEC) is the task of correcting diverse errors in a text, such as spelling, punctuation, morphological, and word-choice mistakes. With the task framed as sentence correction, neural sequence-to-sequence (seq2seq) models have emerged as a solution. However, neural seq2seq GEC models are computationally expensive in both training and inference. They also tend to generalize poorly because error-corrected data is limited, and so they fail to correct grammar effectively. In this work, we propose a Neural Cascading Architecture and a set of techniques, inspired by the post-editing processes of Neural Machine Translation (NMT), for improving the effectiveness of neural seq2seq GEC models. Our experiments show that, in low-resource NMT models, adopting the presented cascading techniques yields performance comparable to that of high-resource NMT models and improves on state-of-the-art (SOTA) results on the JHU FLuency-Extended GUG (JFLEG) corpus, a parallel corpus for developing and evaluating GEC systems. We extensively explore and evaluate multiple cascading learning strategies and establish best practices for improving neural seq2seq GEC models.
Acknowledgments
We thank Wertyworld, Co. for its contribution to gathering the data used in this work.
Cite this article
Acheampong, K.N., Tian, W. Toward perfect neural cascading architecture for grammatical error correction. Appl Intell 51, 3775–3788 (2021). https://doi.org/10.1007/s10489-020-01980-1
DOI: https://doi.org/10.1007/s10489-020-01980-1