Abstract
This paper focuses on the use of technology in language learning. Language training requires grouping learners homogeneously and giving them instant feedback on their productions, such as errors [8, 15, 17] or proficiency levels. One possible approach is to assess students' writings and assign each a level. This paper analyses the possibility of automatically predicting Common European Framework of Reference (CEFR) language levels on the basis of manually annotated errors in a written learner corpus [9, 11]. The research question is how well errors predict levels and which error types appear to be criterial features in determining interlanguage stages. Results show that specific errors such as punctuation, spelling and verb tense are significant at specific CEFR levels.
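The idea of predicting a level from annotated errors can be illustrated with a minimal sketch. It is not the model used in the chapter: the abstract does not name the classifier, so a nearest-centroid rule stands in for it. The error tag names, the `error_rates`, `fit_centroids` and `predict` helpers, and the four training texts are all invented for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical error tags; the chapter's actual tag set is not given here.
ERROR_TYPES = ["punctuation", "spelling", "verb_tense"]


def error_rates(error_tags, n_tokens):
    """Return errors per 100 tokens for each type in ERROR_TYPES."""
    counts = Counter(error_tags)
    return [100.0 * counts[t] / n_tokens for t in ERROR_TYPES]


def fit_centroids(samples):
    """Average the feature vectors of each level.

    samples: list of (feature_vector, level) pairs.
    Returns a dict mapping level -> mean feature vector.
    """
    sums, sizes = {}, Counter()
    for vec, level in samples:
        acc = sums.setdefault(level, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        sizes[level] += 1
    return {lvl: [v / sizes[lvl] for v in acc] for lvl, acc in sums.items()}


def predict(centroids, vec):
    """Return the level whose centroid is closest (Euclidean) to vec."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda lvl: dist(centroids[lvl], vec))


# Invented toy data: (error tags found in a text, text length in tokens, level).
train = [
    (error_rates(["spelling"] * 8 + ["punctuation"] * 4, 100), "A1"),
    (error_rates(["spelling"] * 6 + ["punctuation"] * 5, 100), "A1"),
    (error_rates(["verb_tense"] * 3 + ["spelling"] * 1, 100), "B1"),
    (error_rates(["verb_tense"] * 2 + ["spelling"] * 2, 100), "B1"),
]
centroids = fit_centroids(train)

new_text = error_rates(["spelling"] * 7 + ["punctuation"] * 3, 100)
predicted = predict(centroids, new_text)
print(predicted)
```

Normalising counts by text length keeps longer texts from looking more error-prone simply because they contain more tokens. Any supervised classifier could replace the centroid rule without changing the feature extraction.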
This paper benefited from the support of the Partenariat Hubert Curien Ulysses 2019 funding for the project “Investigating criterial features of learner English and AI-driven automatic language level assessment” (ref 43121RJ).
Notes
- 1.
The EFCAMDAT corpus is hosted by the University of Cambridge, and the data are accessible for academic and non-commercial purposes. Our scripts will be available on our GitHub. Data was selected and manipulated independently of the participation of the Cambridge and Education First research teams.
- 2.
For instance, see the Intelligent-Essay-Assessor™ developed at Pearson Knowledge Technologies; the IntelliMetric™-Essay-Scoring-System developed by Vantage Learning.
References
Arnold, T., Ballier, N., Gaillat, T., Lissón, P.: Predicting CEFRL levels in learner English on the basis of metrics and full texts. arXiv:1806.11099 [cs] (2018)
Attali, Y., Burstein, J.: Automated essay scoring with e-rater V.2. J. Technol. Learn. Assess. 4(3), 3–30 (2006)
Barker, F., Salamoura, A., Saville, N.: Learner corpora and language testing. In: Granger, S., Gilquin, G., Meunier, F. (eds.) The Cambridge Handbook of Learner Corpus Research, pp. 511–534. Cambridge Handbooks in Language and Linguistics, Cambridge University Press (2015)
Baur, C., et al.: Overview of the 2018 Spoken CALL Shared Task. In: Interspeech 2018, pp. 2354–2358. ISCA (2018)
Council of Europe, Council for Cultural Co-operation. Education Committee. Modern Languages Division: Common European Framework of Reference for Languages: learning, teaching, assessment. Cambridge University Press, Cambridge (2001)
Crossley, S.A., Kyle, K., Allen, L.K., Guo, L., McNamara, D.S.: Linguistic Microfeatures to Predict L2 Writing Proficiency: A Case Study in Automated Writing Evaluation (2014)
Crossley, S.A., Salsbury, T., McNamara, D.S., Jarvis, S.: Predicting lexical proficiency in language learner texts using computational indices. Lang. Test. 28(4), 561–580 (2011)
Dale, R., Anisimoff, I., Narroway, G.: HOO 2012: a report on the preposition and determiner error correction shared task. In: Proceedings of the Seventh Workshop on Building Educational Applications Using NLP, NAACL HLT 2012, Montreal, Canada, pp. 54–62. Association for Computational Linguistics, Stroudsburg (2012)
Díaz-Negrillo, A., Fernández-Domínguez, J.: Error tagging systems for learner corpora. Spanish J. Appl. Linguist. (RESLA) 19, 83–102 (2006)
Geertzen, J., Alexopoulou, T., Korhonen, A.: Automatic linguistic annotation of large scale L2 databases: the EF-Cambridge Open Language Database (EFCamDat). In: Miller, R.T., et al. (eds.) Proceedings of the 31st Second Language Research Forum. Cascadilla Press, Carnegie Mellon (2013)
Granger, S., Gilquin, G., Meunier, F. (eds.): The Cambridge Handbook of Learner Corpus Research. Cambridge University Press, Cambridge (2015)
Hawkins, J.A., Buttery, P.: Criterial features in learner corpora: theory and illustrations. English Profile J. 1(01), 1–23 (2010)
Higgins, D., Xi, X., Zechner, K., Williamson, D.: A three-stage approach to the automated scoring of spontaneous spoken responses. Comput. Speech Lang. 25(2), 282–306 (2011)
Huang, Y., Murakami, A., Alexopoulou, T., Korhonen, A.L.: Dependency parsing of learner English (2018)
Leacock, C.: Automated Grammatical Error Detection for Language Learners. Morgan & Claypool Publishers, California (2010)
Nedungadi, P., Raj, H.: Unsupervised word sense disambiguation for automatic essay scoring. In: Kumar Kundu, M., Mohapatra, D.P., Konar, A., Chakraborty, A. (eds.) Advanced Computing, Networking and Informatics - Volume 1. SIST, vol. 27, pp. 437–443. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07353-8_51
Ng, H.T., Wu, S.M., Briscoe, T., Hadiwinoto, C., Susanto, R.H., Bryant, C.: The CoNLL-2014 shared task on grammatical error correction. In: Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task, Baltimore, Maryland, pp. 1–14. Association for Computational Linguistics (2014)
Page, E.B.: The use of the computer in analyzing student essays. Int. Rev. Educ. 14(2), 210–225 (1968)
Scrucca, L., Fop, M., Murphy, T.B., Raftery, A.E.: mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R J. 8(1), 205–233 (2016)
Selinker, L.: Interlanguage. Int. Rev. Appl. Linguist. Lang. Teach. 10(3), 209 (1972)
Shermis, M.D., Burstein, J., Higgins, D., Zechner, K.: Automated essay scoring: writing assessment and instruction. Int. Encycl. Educ. 4(1), 20–26 (2010)
Shute, V.J.: Focus on formative feedback. Rev. Educ. Res. 78(1), 153–189 (2008)
Vajjala, S.: Automated assessment of non-native learner essays: investigating the role of linguistic features. Int. J. Artif. Intell. Educ. (2017). arXiv: 1612.00729
Vajjala, S., Loo, K.: Automatic CEFR level prediction for Estonian learner text. NEALT Proc. Ser. 22, 113–128 (2014)
Volodina, E., Pilán, I., Alfter, D.: Classification of Swedish learner essays by CEFR levels. In: Papadima-Sophocleous, S., Bradley, L., Thouësny, S. (eds.) CALL communities and culture - short papers from EUROCALL 2016, pp. 456–461. Research-publishing.net (2016)
Weigle, S.C.: English language learners and automated scoring of essays: critical considerations. Assessing Writ. 18(1), 85–99 (2013)
Huang, Y., Geertzen, J., Baker, R., Korhonen, A., Alexopoulou, T.: The EF Cambridge Open Language Database (EFCAMDAT) information for users (2017)
Yannakoudakis, H., Briscoe, T., Medlock, B.: A new dataset and method for automatically grading ESOL texts. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, HLT 2011, vol. 1, pp. 180–189. Association for Computational Linguistics, Stroudsburg (2011)
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ballier, N., Gaillat, T., Simpkin, A., Stearns, B., Bouyé, M., Zarrouk, M. (2019). A Supervised Learning Model for the Automatic Assessment of Language Levels Based on Learner Errors. In: Scheffel, M., Broisin, J., Pammer-Schindler, V., Ioannou, A., Schneider, J. (eds) Transforming Learning with Meaningful Technologies. EC-TEL 2019. Lecture Notes in Computer Science, vol 11722. Springer, Cham. https://doi.org/10.1007/978-3-030-29736-7_23
DOI: https://doi.org/10.1007/978-3-030-29736-7_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29735-0
Online ISBN: 978-3-030-29736-7
eBook Packages: Computer Science, Computer Science (R0)