Abstract
With the burgeoning popularity and swift advancement of automated writing evaluation (AWE) systems in language classrooms, scholarly and practical interest in this area has noticeably increased. This systematic review investigates current research on three prominent AWE systems: Grammarly, Pigai, and Criterion. Its objectives are to assess each system’s characteristics, advantages, and drawbacks; to analyze the frameworks, methodologies, findings, and implications of prior studies; and to identify research gaps and future directions. The analysis of 39 articles revealed growing interest in AWE systems, with most studies focusing on their efficacy and on learners’ perceptions. Most of the studies examined non-native English-speaking university students over a single academic semester and used quantitative or mixed research methods. The findings indicated that AWE systems had a positive impact on students’ writing proficiency, and both learners and educators expressed positive attitudes towards these tools. However, several noteworthy research gaps remain: the usage patterns of AWE tools need further investigation, participant samples should be expanded to cover a wider range of language proficiency levels, and research comparing AWE feedback with peer feedback is lacking. The review concludes by offering insights and recommendations for educators and researchers, stressing the importance of addressing the identified research gaps and of further exploring the potential of AWE systems in the age of generative artificial intelligence.
Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Ethics declarations
Ethical approval
I confirm that all the research meets ethical guidelines and adheres to the legal requirements of the study country.
Conflict of interests
None.
About this article
Cite this article
Ding, L., Zou, D. Automated writing evaluation systems: A systematic review of Grammarly, Pigai, and Criterion with a perspective on future directions in the age of generative artificial intelligence. Educ Inf Technol (2024). https://doi.org/10.1007/s10639-023-12402-3
DOI: https://doi.org/10.1007/s10639-023-12402-3