Abstract
In this work, we reinvestigate the classifier-based approach to article and preposition error correction, going beyond linguistically motivated factors. We show that state-of-the-art results can be achieved without relying on a plethora of heuristic rules, complex feature engineering, or advanced NLP tools. The proposed method for detecting spaces for article insertion is even more efficient than methods that use a parser. We examine automatically trained word classes acquired by unsupervised learning as a substitute for commonly used part-of-speech tags. Our best models significantly outperform the top systems from the CoNLL-2014 Shared Task in terms of article and preposition error correction.
Notes
- 1. \(\varnothing\) stands for the English zero article.
Acknowledgements
This work has been funded by the National Science Centre, Poland (Grant No. 2014/15/N/ST6/02330).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Grundkiewicz, R., Junczys-Dowmunt, M. (2018). Reinvestigating the Classification Approach to the Article and Preposition Error Correction. In: Vetulani, Z., Mariani, J., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2015. Lecture Notes in Computer Science, vol 10930. Springer, Cham. https://doi.org/10.1007/978-3-319-93782-3_9
DOI: https://doi.org/10.1007/978-3-319-93782-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93781-6
Online ISBN: 978-3-319-93782-3
eBook Packages: Computer Science, Computer Science (R0)