Abstract
ChatGPT, the newest pre-trained large language model, has recently attracted unprecedented worldwide attention. Its exceptional performance in understanding human language and completing a variety of tasks in a conversational way has led to heated discussions about its implications for and use in education. This exploratory study represents one of the first attempts to examine the possible role of ChatGPT in facilitating the teaching and learning of English as a Foreign Language (EFL) writing. We examined ChatGPT’s potential to support EFL teachers’ feedback on students’ writing. To this end, we first investigated ChatGPT’s performance in generating feedback on EFL students’ argumentative writing. Fifty English argumentative essays composed by Chinese undergraduate students were collected and used as feedback targets. ChatGPT and five Chinese EFL teachers offered feedback on the content, organisation, and language aspects of the essays. We compared ChatGPT- and teacher-generated feedback in terms of amount and type. The results showed that ChatGPT produced a significantly larger amount of feedback than the teachers. Whereas teacher feedback focused mainly on content-related and language-related issues, ChatGPT distributed its attention relatively equally among the three feedback foci (i.e., content, organisation, and language). Our results also indicated that ChatGPT and the teachers tended to use different feedback types when evaluating different aspects of students’ writing. Additionally, we examined EFL teachers’ perceptions of using ChatGPT-generated feedback to support their own feedback. The five teachers reported both positive and negative perceptions of the features of ChatGPT feedback and of the relation between ChatGPT and teacher feedback. To foster EFL students’ writing skills, we suggest that teachers collaborate with ChatGPT in generating feedback on student writing.
Data availability
The data that support the findings of this study are available from the first author, Kai Guo, upon reasonable request.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
None.
About this article
Cite this article
Guo, K., Wang, D. To resist it or to embrace it? Examining ChatGPT’s potential to support teacher feedback in EFL writing. Educ Inf Technol 29, 8435–8463 (2024). https://doi.org/10.1007/s10639-023-12146-0
DOI: https://doi.org/10.1007/s10639-023-12146-0