1 Introduction

Feedback is essential for effective second/foreign language writing (Bai & Hu, 2017). The development of automated writing evaluation (AWE) systems makes feedback provision more efficient by allowing numerous submissions. Recently, artificial intelligence (AI)-programmed AWE has attracted increasing attention. Students’ engagement with AWE feedback is a complex process and is crucial for improving their learning (Zhang & Hyland, 2022). A careful investigation of AWE feedback and students’ revision and resubmission processes can provide information about how to support their engagement with AWE feedback (Koltovskaia, 2020; Mayordomo et al., 2022; Stevenson & Phakiti, 2019; Zhang, 2017). However, most research on AWE has been conducted in North America (Stevenson & Phakiti, 2014), predominantly with product-based or experimental designs that examine the effectiveness of AWE. Relatively little research has monitored the process of students’ revision in response to AWE (Stevenson & Phakiti, 2019; Storch, 2018; Zhang & Hyland, 2022). To address this gap, we examined how five university English as a foreign language (EFL) students interacted with and responded to feedback from Pigai, the largest AI-programmed AWE system in China. Our analysis involved repeated exchanges between the students and Pigai in a single writing task. Our findings contribute to the limited research capturing the revision process in students’ repeated interactions with AWE (Stevenson & Phakiti, 2019; Tan, 2019; Zhang & Hyland, 2018).

2 Theoretical background and literature

2.1 Students’ engagement with written corrective feedback

In second language learning, research tends to focus on the effectiveness of written corrective feedback (Chong, 2018; Hyland & Hyland, 2019), and it is believed that students’ engagement with written corrective feedback can improve their revision (e.g., Hyland, 2003; Zhang & Hyland, 2018). Some studies categorise learner engagement with corrective feedback into three aspects: cognitive, behavioural, and affective (Han & Hyland, 2015; Zhang & Hyland, 2018). Research has revealed that students apply a variety of cognitive strategies in the process of engaging with teacher feedback, including planning, prioritising, monitoring, and evaluating (Han & Hyland, 2015). However, the extent to which each student engages with the feedback varies quite markedly (Hyland, 2003). Researchers have called for more studies investigating learners’ development and changing behaviours in their engagement with written corrective feedback over time (Han & Hyland, 2015). In contrast to these studies, our research conducted a detailed micro-level analysis of learners’ interactions with written corrective feedback in a single task, informed by the error categories of written corrective feedback (WCF) (Appendix Table 9) and learners’ revision categories (Ferris, 2006; Han & Hyland, 2019).

In the EFL context, some research has revealed that teachers’ WCF mainly focuses on written accuracy, with more direct error feedback (Lee, 2011; Waer, 2021). At the same time, teacher WCF has gradually been replaced by the increasing use of automated writing evaluation programs, as implementing teacher WCF in large EFL classrooms in China is regarded as unrealistic (Yu et al., 2020; Zhang & Hyland, 2018). Therefore, to explore “feedback up (what the student can do better in the same task?)” (Chong, 2018, p. 342), our research explored the process and patterns of Chinese EFL students’ engagement with AWE via multiple submissions to complete one writing task.

2.2 Automated writing evaluation

Automated writing evaluation (AWE) is a machine learning system that provides learners with feedback on spelling, punctuation, grammar, sentences, and coherence (Zhang & Hyland, 2018). However, some aspects of writing, such as style, creativity, and conceptual ideas, cannot be evaluated by AWE (Stevenson & Phakiti, 2014, 2019), and AWE can mark only a limited range of genres, mainly narrative and argumentative texts (Stevenson & Phakiti, 2014). Research on AWE has largely focused on writing products, with little attention paid to the revision process (Stevenson & Phakiti, 2014; Storch, 2018).

There are several key AWE systems for second or foreign language learners, such as Cywrite, Writing Assistant, and Pigai. This research focused on Pigai, which was launched in 2011 by Beijing Cikuu Science and Technology Co., Ltd and provides an online automatic grading service for Chinese EFL learners’ writing based on natural language processing, corpus-based algorithms, and AI technology (Zhang, 2017). Pigai provides “diagnostic feedback in an average speed of 1.2 s” (Zhang & Zhang, 2018, p. 31). The distinctive features of the Pigai system include: (1) detecting collocation errors made by Chinese students and providing written corrective feedback; (2) providing linguistic resources and recommended expressions for students, such as the frequency of expressions/collocations from a database; if an expression is not found in the database, Pigai treats it as a possible Chinese English (Chinglish) expression; and (3) providing positive comments when good expressions are used (Bai & Hu, 2017; Du & Gao, 2022; Zhang, 2017; Zhang & Zhang, 2018). It can also give a holistic mark and record and track students’ multiple revisions (Bai & Hu, 2017; Du & Gao, 2022).
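To make the corpus-based mechanism concrete, the following is a minimal sketch of how frequency-based collocation flagging might work in principle. Pigai’s internal implementation is not publicly documented, so the bigram counts, threshold, stopword list, and function name here are purely illustrative assumptions, not a description of the actual system.

```python
from collections import Counter

# Toy bigram counts standing in for a large reference corpus;
# the numbers are invented for illustration only.
BIGRAM_COUNTS = Counter({
    ("great", "deal"): 8930,
    ("acquire", "knowledge"): 1520,
    ("money", "award"): 2,  # rare pairing, likely non-native usage
})

FREQ_THRESHOLD = 5  # pairs below this count are flagged as possible Chinglish
STOPWORDS = {"a", "an", "the", "of", "to"}  # skipped here; a real system parses properly


def flag_rare_pairs(tokens: list[str]) -> list[tuple[str, str]]:
    """Flag adjacent content-word pairs whose corpus frequency is below the threshold."""
    flagged = []
    for w1, w2 in zip(tokens, tokens[1:]):
        if w1 in STOPWORDS or w2 in STOPWORDS:
            continue
        if BIGRAM_COUNTS.get((w1, w2), 0) < FREQ_THRESHOLD:
            flagged.append((w1, w2))
    return flagged


# "money award" is flagged because it rarely occurs in the reference corpus.
print(flag_rare_pairs("a great deal of money award".split()))  # [('money', 'award')]
```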

Research on AWE can be categorised into two areas: learner-centric and system-centric (Hoang, 2019). The former focuses on learners’ attitudes, evaluations, and responses to AWE; the latter focuses on the mechanisms of AWE programs (Al-Inbari & Al-Wasy, 2022; Hoang, 2019). Learner-centric research has been further classified into studies of (1) the effect of AWE on EFL learners’ final written products; (2) EFL learners’ and teachers’ perceptions of AWE (e.g., Du & Gao, 2022; Mayordomo et al., 2022); and (3) the process of learning, editing, and teaching (Al-Inbari & Al-Wasy, 2022; Lai, 2010).

Regarding the effectiveness of AWE feedback, some research has revealed that AWE feedback can lead to improvement in students’ writing, especially in terms of error rate reduction (Al-Inbari & Al-Wasy, 2022; Liao, 2016; Lu, 2019; Wang et al., 2013), with reduced frequencies of grammatical, vocabulary, spelling, and conjunction errors observed (Liao, 2016; Song, 2019; Tan, 2019). Reduced foreign language writing anxiety has also been observed (Waer, 2021). However, some researchers have argued that students’ integration of AWE feedback did not bring observable improvement in their revision, because the accuracy-oriented EFL context overemphasised language-related errors (Huang & Renandya, 2020). At the same time, research on learners’ perceptions of AWE revealed that the perceived relevance of feedback played an important role in their resubmissions (Mayordomo et al., 2022). Product-focused studies can enhance the validity and reliability of AWE, but they tell us little about the process of how students engage with AWE feedback.

Most studies on AWE have used an experimental design (e.g., Al-Inbari & Al-Wasy, 2022; Lu, 2019; Song, 2019; Tang & Rich, 2017) or surveyed large samples (e.g., Fu et al., 2020; Li et al., 2019). A systematic review of 105 published papers on AWE revealed that 99.9% adopted quantitative methods (Shi & Aryadoust, 2022). The majority of these studies (around 70%) focused on validating AWE scores (Shi & Aryadoust, 2022) or on the correlations between automated scores and human scores (e.g., Shi & Aryadoust, 2022; Wang, 2022). However, experimental designs and large-scale surveys cannot reveal the detailed process and patterns of learners’ engagement with AWE feedback.

Learners’ engagement with feedback has been viewed as a dialogical process in which learners use the feedback to improve their work (Carless, 2015). Research on learners’ engagement with AWE feedback has revealed advantages, such as the immediacy of AWE feedback and the opportunity for multiple revisions and resubmissions at students’ own pace, which increase their autonomy and the internalisation of language input (Lu, 2019; Zhang & Hyland, 2018). The key styles of learner engagement with AWE have been identified as (1) superficial correction/revision with low uptake (Bai & Hu, 2017); (2) very limited time (an average of 6 min) spent on revision after receiving AWE feedback (Warden, 2000); and (3) no revision after receiving feedback (El Ebyary & Windeatt, 2010; Jiang, 2015). However, these styles of engagement were derived from single, or non-successive, interactions with AWE feedback (e.g., Liao, 2016; Wang et al., 2013), and there is limited evidence on student uptake of Pigai’s collocation feedback across the multiple drafting and revision process (Bai & Hu, 2017; Lu, 2019; Song, 2019; Tang & Rich, 2017). Thus, more research is needed to capture the revision process and examine learners’ repeated interactions with AWE (Tan, 2019; Zhang & Hyland, 2018), especially in a real learning context rather than under experimental conditions (Stevenson & Phakiti, 2019; Zhang, 2017). A careful investigation of learners’ revision processes can provide information about how to support their engagement with AWE feedback via resubmissions (Koltovskaia, 2020; Mayordomo et al., 2022; Stevenson & Phakiti, 2019; Zhang, 2017).

Only a few studies have analysed students’ revision and resubmission processes. In one study of a single EFL student’s engagement with Pigai (Zhang, 2017), the student’s submissions and AWE feedback for 10 different assignments, together with an individual interview, revealed that the effectiveness of AWE feedback depended on the student’s level of engagement with it. Koltovskaia (2020) conducted a detailed analysis of two students’ engagement with the automatic writing feedback provided by Grammarly. The analysis showed that the two students barely used revision strategies to refine their drafts. One reason might be that they only had two rounds of revisions.

Some research has compared students’ engagement with AWE feedback with other types of feedback. For instance, Zhang and Hyland (2018) investigated two Chinese university students’ engagement with both teacher WCF and AWE feedback provided by Pigai. The results showed that the advantages of AWE feedback, such as immediacy and the opportunity for resubmission, contributed to students’ higher engagement with AWE feedback than with teacher feedback. At the same time, the sheer amount of AWE feedback and low scores could result in some students’ low engagement with it. Similarly, Zhang and Hyland (2022) analysed students’ drafts across three rounds of revisions of a single essay assignment based on AWE, peer, and teacher feedback respectively. The AWE feedback was categorised according to Han and Hyland’s (2015) taxonomy of error categories, and students’ revisions were categorised as accept, reject, and substitute. A more nuanced analysis of students’ revisions across multiple submissions appears necessary. This study therefore focused on a more sustained process of learner engagement with AWE, observing repeated interactions in a single writing task.

As learners’ interaction with AWE feedback is an iterative and dynamic process, it is necessary to analyse the patterns of the AWE feedback, and learners’ revisions over repeated resubmissions. Based on the gaps in the existing literature, the questions this research aimed to explore were:

  1. What are the salient features of iterative feedback by Pigai over a period of multiple resubmissions?

  2. How do learners interact with Pigai feedback through the process of multiple revisions?

3 Method

3.1 Participant and context

This paper drew its data from an ongoing large project exploring teachers’ and students’ engagement with the AI-supported Pigai system used in a general English course, College English, at a Chinese university in Shanghai. To satisfy the selection criteria, participating students needed to be enrolled in the College English course and to have experience of using Pigai. Participation was voluntary, and participants were required to complete an English writing task unrelated to their course assessment. Participants submitted their initial draft to Pigai and were encouraged to make multiple revisions and resubmissions, but the specific number of resubmissions was decided by each student. The writing task and revisions were done outside of formal class time. The task was an argumentative essay on the topic “Why should sportsmen earn huge amounts of money?”, with a word count requirement of 150 to 200 words. Five sophomore students voluntarily participated in this study. The data were their submissions to Pigai for this single writing task. After receiving Pigai feedback, four of the five students made three rounds of submissions, while one student (student 5), who responded actively to the feedback, made 12 successive submissions for this single writing assignment. This prompted us to analyse the five participants’ successive drafts across versions. Specifically, the analysis examined the number and types of revisions they made based on the AWE feedback, zooming in and out of aspects of learner-machine interaction. Ethics approval was granted by the authors’ university.

3.2 Analytical framework

Data analysis aligned with the two research questions, focusing on the analysis of Pigai feedback and students’ revisions respectively. Each type of data was coded using categories of feedback and revision established in the related literature. To address the first research question, the types of feedback generated by Pigai were analysed in three rounds. In the first round, researchers categorised the Pigai feedback into: direct feedback, which corrected an error or explicitly required correction; indirect feedback, which only indicated errors; and metalinguistic feedback, which provided rules and examples of use (Han & Hyland, 2019). The second round of feedback analysis focused on local or global issues, following the categories from Montgomery and Baker (2007) and Storch and Tapper (2000) (Appendix Table 8). This framework has been used to analyse teacher WCF (e.g., Jiang et al., 2020) and can be applied to analysing AWE feedback over multiple submissions. In the third round, the marking and corrective feedback for each revision were analysed and compared with the detailed error categories summarised by Han and Hyland (2019), based on Ferris’ (2006) error categories (Appendix Table 9), which Zhang and Hyland (2022) used to analyse Pigai feedback in their research.
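As an illustration of how the first two coding rounds could be operationalised, the sketch below represents a coded feedback item as a small data structure. The category labels follow Han and Hyland (2019) as described above, but the class and field names are our own illustrative choices, not part of any published coding tool.

```python
from dataclasses import dataclass
from enum import Enum


class FeedbackType(Enum):
    DIRECT = "direct"                  # corrects the error or explicitly requires correction
    INDIRECT = "indirect"              # only indicates that an error exists
    METALINGUISTIC = "metalinguistic"  # provides rules and examples of use


class Focus(Enum):
    LOCAL = "local"    # grammar, language expression, mechanics
    GLOBAL = "global"  # organisation, structure, cohesion


@dataclass
class FeedbackItem:
    submission: int              # which draft the feedback was attached to
    feedback_type: FeedbackType
    focus: Focus
    error_category: str          # e.g. "verb error", "noun error", "punctuation"
    message: str                 # the feedback text itself


# Example: a coded item from a hypothetical first submission.
item = FeedbackItem(1, FeedbackType.DIRECT, Focus.LOCAL,
                    "punctuation", "Add a comma after the introductory phrase.")
print(item.feedback_type.value, item.focus.value)  # direct local
```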

To address the second research question, students’ revisions were analysed in three rounds according to the coding schemes from Ferris (2006) and Hyland (2003) and the refined coding in Han and Hyland (2015) for students’ revisions based on teacher WCF (Appendix Table 10). The first round of analysis compared the types of revision with the types of feedback, following the categories in Appendix Table 10. The second round focused on coding and selecting text extracts showing revision in students’ writing and tracing the development of students’ revisions over their successive interactions with Pigai feedback. In this round, we sought evidence of how the students processed and acted on the Pigai feedback, and how this affected their subsequent writing (Storch, 2018). The third round synthesised the whole process of participants’ interaction with the Pigai feedback and mapped the related aspects by creating terrain models, showing the relationships between types of revision and highlighting a process of change and development in the learners’ EFL writing.
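Tracing revisions across successive drafts requires aligning each draft with its predecessor. The sketch below shows one straightforward way to extract word-level changes with Python’s standard difflib; we did not use this exact tooling in the study, so it should be read as an illustrative aid for the comparison step rather than a description of our actual procedure.

```python
import difflib


def revision_spans(prev_draft: str, next_draft: str):
    """Return word-level changes between two successive drafts.

    Each change is (operation, old_text, new_text), where operation is
    'replace', 'delete', or 'insert'. A coder can then judge whether a change
    responds to a specific feedback item or is self-initiated.
    """
    prev_words, next_words = prev_draft.split(), next_draft.split()
    matcher = difflib.SequenceMatcher(a=prev_words, b=next_words)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, " ".join(prev_words[i1:i2]),
                            " ".join(next_words[j1:j2])))
    return changes


# Example drawn from the revision data reported later in this paper:
print(revision_spans("They will acquire a great deal of award",
                     "They will acquire a great number of award"))
# [('replace', 'deal', 'number')]
```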

Being aware that the small data set yielded by this study may affect the generalisability of the findings, we made a conscious effort to enhance the credibility and transferability of the analytical tools. Three co-authors took different roles in the data analysis. The first author analysed and coded the data from the participant with 12 submissions. Using the same analytical frameworks, the second author coded the data from the other four participants. The two authors then critically reviewed and recoded each other’s coding, and the third author conducted a final review of all the coding. Where there were ambiguities or inconsistencies, a collective decision was made to refine the coding based on evidence from Pigai’s feedback and students’ submissions. The detailed procedures of the analysis and the coding framework helped ensure the transferability of the analysis.

4 Results

The results section first reports on the analysis of the focal points and categories of Pigai feedback for all participants. In each subsection, the results from the four participants who made three submissions are presented first, followed by the results from student 5 with 12 submissions. The second part of this section reports the analysis of the participants’ engagement with the feedback and their different types of revisions.

4.1 Feedback types and error categories

4.1.1 Error and non-error feedback

The types and numbers of Pigai feedback items for the five participants’ submissions are displayed in Table 1 and Fig. 1. As shown, the number of error corrective feedback items decreased over multiple submissions. The reason might be that direct error corrective feedback was easier to address. Further analysis of how students engaged with the feedback in their multiple revisions and submissions is presented in Sect. 4.2. Compared with the number of error corrective feedback items, the five participants received far more feedback consisting of suggestions and tips for synonyms, word differences, and collocations (shown in Table 2).

Table 1 The types and numbers of Pigai feedback for Students 1–4

Fig. 1 The types and numbers of Pigai feedback for student 5

Table 2 The total numbers of feedback

As shown in Table 2, the error corrective feedback focused on capital letters, grammar, verb errors, noun errors, and punctuation. Another key feature of Pigai feedback was the non-error feedback of suggestions and tips for synonyms and collocations, which accounted for 73% of the total feedback. Examples of non-error feedback are shown in Appendix Tables 11 and 12.

4.1.2 The global and local focus of Pigai feedback

The second round of analysis involved an in-depth coding of the global and local focus of Pigai feedback. As shown in Table 1, the Pigai feedback for the four participants with three submissions focused on local-level errors such as grammar, language expression, and mechanics. However, there were some changes in feedback focus over student 5’s 12 submissions (Fig. 2).

Fig. 2 Global and local focus of Pigai feedback for student 5

As shown, the feedback for the initial submission had a predominantly local focus: grammar (2 items), lexical errors (2), and mechanics (spelling, punctuation, and capitalisation) (5). Feedback with a local focus decreased quickly after the second submission, with only one lexical error identified up to submission 7 and three grammar items across submissions 5 and 8. The general feedback for the first three submissions was the same: “Does not meet the requirement of word count”. Only from submission 4, when student 5’s writing met the word count requirement, did Pigai provide more detailed general feedback with a global focus on the organisation of the essay: its structure, the use of connectives and transitions, and suggestions to use more complex sentences (see the example in Fig. 3). There was no feedback on ideas and content.

Fig. 3 General feedback for resubmission 4

4.1.3 Detailed error types

As shown in Fig. 2, the numbers of sentence errors (both grammar and capitalisation), verb errors, and noun errors were highest in the first submission and largely disappeared thereafter. Although sentence grammar errors and noun errors occasionally reappeared in resubmissions 5 and 8, they were not identified in other submissions. Meanwhile, Pigai also identified a suspected Chinglish (Chinese-style English) expression (the same one repeated from submissions 2 to 7), a type of error often made by Chinese EFL learners. There was no error correction feedback after submission 9. However, some errors in student 5’s writing were not identified by Pigai, such as a singular-plural error and a sentence structure error: “So I think they pay and reward is fair, I am proud of them.” Compared with the 16 error categories suggested by Han and Hyland (2019), errors related to pronouns, fragmentation, and prepositions were not identified by Pigai in this student’s submissions either.

4.2 Students’ engagement with Pigai feedback and revisions

The five participants’ interactions with the feedback were analysed following the revision analysis categories summarised by Han and Hyland (2019). It was found that there were three types of revision: (1) revisions based on error corrective feedback; (2) revisions based on non-error feedback, including general feedback, tips and suggestions; (3) self-initiated revisions.

4.2.1 Revision based on error corrective feedback

The revision patterns of all five participants are shown in Figs. 4 and 5. Students 1, 4 (Fig. 4), and 5 (Fig. 5) had higher levels of engagement with the error corrective feedback in the first two revisions, evidenced by the number of corrections they made relative to the instances of feedback. However, among the four students who made three submissions, there was no consistent pattern in their interaction with the error corrective feedback.

Fig. 4 Students 1 to 4: Revision on error corrective feedback over 3 submissions

Fig. 5 Student 5: Revision on error corrective feedback over 12 submissions

A closer look at student 5’s 11 revisions showed that they corrected most of the corrective feedback focusing on local issues (e.g., grammar, expressions, and mechanics) in the first two revisions, except for one expression flagged as suspected Chinglish: “They will acquire a great deal of money award”. Student 5 did not act on this feedback in the first five submissions. They then attempted to respond to it in revision 6 by shortening the expression to “They will acquire award”, removing “a great deal”, but received similar feedback that it was a Chinglish expression. In revision 7, they changed it back to “acquire a great deal of award” and received the same feedback. In revision 8, they revised it to “acquire a great number of award”, and the suspected-Chinglish feedback disappeared.

This indicated that Pigai identified “acquire … number” only as fragments rather than as a full collocation, and so treated the revision as a correction. This revealed a limitation of Pigai’s data retrieval and matching. At the same time, it showed the process of trial and error the student went through, reflecting their cognitive and behavioural engagement with the AWE feedback. However, without contextual information, specific examples, or translation, it appeared hard for the student to improve, because they had difficulty differentiating the provided synonyms.
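A toy sketch can make this limitation concrete: if a checker accepts a phrase whenever any nearby word pair matches a known fragment, an unidiomatic phrase can still pass. The fragment list and matching rule below are our assumptions for illustration only; Pigai’s internal matching is not publicly documented.

```python
# The fragment set and window-based rule are illustrative assumptions,
# not Pigai's actual matching logic.
KNOWN_FRAGMENTS = {("acquire", "number"), ("great", "number"), ("great", "deal")}


def fragment_match(tokens: list[str], window: int = 4) -> bool:
    """Accept a phrase if any word pair within a short window is a known fragment."""
    for i, w1 in enumerate(tokens):
        for w2 in tokens[i + 1 : i + 1 + window]:
            if (w1, w2) in KNOWN_FRAGMENTS:
                return True
    return False


# "acquire a great number of award" passes because "acquire ... number" matches
# as a fragment, even though the phrase as a whole is still unidiomatic.
print(fragment_match("acquire a great number of award".split()))  # True
```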

4.2.2 Revisions based on non-error feedback

There were two types of non-error feedback: (1) the general feedback provided by Pigai, and (2) suggestions and tips (see Appendix Tables 11 and 12). All five participants’ interactions with these types of feedback are summarised in Figs. 6 and 7. The four participants who made three submissions had a higher level of engagement with the suggestion feedback than the one who made 12 submissions. Student 5 responded to the suggestions and tips from revision 5 onwards by trialling and substituting the suggested words/synonyms (see examples in Fig. 7). This could be interpreted as an attempt to use more academic vocabulary, as prompted by the general feedback. However, their responses to suggestions and tips were much more delayed than those of the other four students. One reason could be that they focused on responding to the general feedback in the first few revisions.

Fig. 6 Students 1 to 4: Revisions to suggestions and tips feedback

Fig. 7 Student 5: Revision according to general feedback and feedback for suggestions and tips

The four students with three submissions did not respond to the general feedback in their revisions. However, student 5 started revising the essay at revision 3, based on the general feedback “字数不符合要求” [zìshù bù fúhé yāoqiú; does not meet the word count requirement], by adding more sentences. This revision received more detailed general feedback, which not only gave positive comments on the flexible use of vocabulary and good structure, but also indicated the areas needing improvement: more academic vocabulary, more long and complex sentences, and more connectives (included in Fig. 3). Student 5 made some revisions according to this general feedback in the next submission, adding some connectives and a complex sentence.

4.2.3 Self-initiated revisions

An interesting finding was that all five participants made some self-initiated additions or revisions, which could not be related to any specific feedback (Tables 3 and 4). Student 5 made some self-initiated revisions after revision 5, some of which introduced language errors in revisions 5, 6, and 7 (Fig. 7). Another example of trial and error in their self-initiated revision is shown in Table 5. It seemed that student 5 made this incorrect revision in response to the general feedback calling for more complex sentences, introducing syntactic and verb agreement errors; they then corrected the errors in the following submission. In addition to the self-initiated revision of syntactic structure, student 5 also revised the organisation of the essay. Although the general feedback gave a positive comment on the structure of the essay for submission 8, student 5 took the initiative and combined the first two paragraphs in the next submission, which tightened up the organisation of the essay. This self-initiated revision could be interpreted as a sign of learner autonomy.

Table 3 Examples of revisions to suggestion and tips feedback
Table 4 Self-initiated revisions
Table 5 Example of incorrect revision in trial and error

4.2.4 A terrain model mapping aspects of revisions

The types of revisions made by all five participants are summarised in Table 6. Their responses to error-corrective feedback, to general feedback, and to suggestions and tips, together with the take-up rates, are shown in Table 7. Although the number and percentage of revisions made according to suggestions and tips were higher than those made according to error-corrective feedback and general feedback, the take-up rate of the latter two types of feedback was higher than that of the former (see the sketch after Table 7). Another marked feature was the self-initiated revision, which demonstrated the emergence of students’ autonomy and a shift from passive to active learning. Although data from the first four students’ submissions did not show a consistent pattern of self-initiated revision, the revisions made by student 5 over 12 submissions revealed some patterns (Fig. 8).

Table 6 Numbers and types of revisions
Table 7 The sum of Pigai feedback and take-up rate
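For clarity, the take-up rate in Table 7 can be read as the proportion of feedback items of a given type that a student acted on. The sketch below computes it under that reading; the counts are invented for illustration and are not the study’s data.

```python
# Take-up rate = revisions made in response to a feedback type / feedback items
# of that type received. The figures below are invented for illustration only.

def take_up_rate(received: int, acted_on: int) -> float:
    """Return the proportion of feedback items that were acted on."""
    return acted_on / received if received else 0.0


# Hypothetical counts: (items received, items acted on).
feedback_counts = {
    "error-corrective": (10, 8),
    "general": (4, 3),
    "suggestions and tips": (40, 12),  # numerous, hence many revisions, but low uptake
}

for ftype, (received, acted) in feedback_counts.items():
    print(f"{ftype}: {take_up_rate(received, acted):.0%}")
# error-corrective: 80%
# general: 75%
# suggestions and tips: 30%
```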
Fig. 8 Student 5: Types and patterns of revision over 12 submissions

The patterns showed that student 5 was selective in their uptake of Pigai feedback during revision. Several patterns were identified. First, they responded to error corrective feedback quickly in the first two revisions (shown in orange in Fig. 8). Second, engagement with the general feedback started from the third revision, after more detailed general feedback was received. Third, this engagement was interrupted, possibly by some incorrect revisions and corrections (revisions 6 to 9), but restarted and increased again in the last two revisions (shown in brown in Fig. 8). Fourth, the revision based on the general feedback seemed to stimulate revision based on the suggestions and tips from revision 6 onwards (shown in light green in Fig. 8). Meanwhile, their self-initiated revisions also started after receiving some positive feedback from revision 6, paused in revisions 9 and 10, but restarted in the final revisions (shown in dark green in Fig. 8). Last, the response to the Chinglish comment was delayed and initially brought no obvious improvement, but the student eventually addressed it through trial and error.

5 Discussion

5.1 Changes and patterns of Pigai’s feedback

Regarding the first research question about the salient features of Pigai feedback, this study provided detailed information on the patterns of Pigai feedback for the five participants’ submissions. As shown in Table 2, 73% of the feedback items were non-error feedback, and the error-corrective feedback focused on capitalisation, vocabulary, grammar, and punctuation errors. These covered most of the error categories listed by Han and Hyland (2019). The analysis of Pigai feedback types and error categories showed that although the feedback varied across types, the majority focused on language-related errors, such as mechanics, grammar, and lexical errors. These direct error corrective feedback items decreased after submission 2 for all five students and remained low until submission 8 for student 5. The general feedback focused on essay organisation but lacked a more comprehensive view of writing in areas such as ideas and content (Huang & Renandya, 2020).

Another feature of Pigai feedback was that the total number of feedback items for synonyms and collocations far exceeded the direct error corrective feedback on mechanical errors (Table 2). This type of indirect, non-error feedback remained at a high level for the students with three submissions and showed an increasing trend for student 5 across 12 repeated submissions. One reason could be that it was hard for students to address this feedback when no examples or contextual information accompanied the list of synonyms or easily confused words.

At the same time, Pigai did not identify certain errors in the writing (Bai & Hu, 2017). The marking was sometimes based on data retrieval and matching, which resulted in matching single words or fragments of a collocation. This points to areas for future improvement of the AWE system, especially in examining complex sentences. Unlike other studies that examined AWE feedback with experimental designs or large-sample surveys (Li et al., 2019; Lu, 2019; Song, 2019; Tang & Rich, 2017), this research offers new findings on the patterns and iterative changes in Pigai feedback across multiple submissions of a single writing task. These findings support a better understanding of students’ engagement with AWE and of the autonomy developed over multiple revisions.

5.2 Patterns of interaction with Pigai feedback

Regarding the second research question, all five participants demonstrated sustained engagement with Pigai feedback via multiple revisions and submissions. Student 5 was particularly highly engaged in responding to Pigai feedback, evidenced by the process and patterns of their revision. This differed from research that reported a lack of revision or re-drafting after receiving feedback (El Ebyary & Windeatt, 2010; Koltovskaia, 2020). In addition, the students revised differently in response to different feedback sources. All five students corrected the errors identified by the error corrective feedback quickly in the first two submissions, evidenced by the high take-up rate of error-corrective feedback and the decrease of this type of feedback after the second submission. This was in line with the finding that error-corrective feedback leads to improvement in students’ writing, especially in terms of error rate reduction (Liao, 2016; Wang et al., 2013). This may have been because it was easy for the students to make superficial mechanical revisions, but more troublesome to revise collocations and synonymous words (Bai & Hu, 2017).

The students’ responses to non-error corrective feedback varied. The four students with three submissions responded to non-error corrective feedback in their two revisions, but did not respond to the general feedback. Student 5 began revising based on the general feedback after receiving more detailed and positive feedback. This finding aligned with other studies in which students’ engagement tended to be enhanced by more specific feedback (Lu, 2019; Zhang, 2017). During this process, student 5’s autonomy was demonstrated through their sustained multiple revisions and trial and error following the general feedback, such as adding more academic vocabulary and connectives and increasing the use of complex and compound sentences. This echoed the findings of other studies in which immediate feedback on a revision further encouraged students to try new expressions and sentence structures (Bai & Hu, 2017; Zhang, 2017; Zhang & Hyland, 2022). It also agreed with the finding that opportunities for revision and resubmission are important for enhancing learners’ engagement with feedback (Mayordomo et al., 2022). The findings further indicated that the general feedback served as both an assessment and a learning tool for students’ writing, but needed to be informative and specific to support revision.

Although Pigai provided linguistic resources for students, for example suggestions for synonyms and collocations, the take-up rate of this feedback was lower than that of error-corrective feedback and general feedback. This was in line with the findings of other studies on Pigai (Zhang, 2017; Zhang & Hyland, 2018, 2022). However, the detailed analysis showed that student 5 had been trialling vocabulary selected from the lists provided by Pigai, even though this trialling was somewhat delayed. This echoed the finding that students sometimes substituted their simple words with more complex ones to increase lexical diversity, but their attempts were not always appropriate due to a lack of knowledge about “stylistic, syntactic, collocational and semantic differences between synonyms” (Bai & Hu, 2017, p. 78). This indicated that Pigai feedback on confusable words/synonyms without explanation may result in errors and a low take-up rate (Bai & Hu, 2017).

Although some research has used learners’ correction and evaluation of Pigai feedback as evidence of autonomy (Bai & Hu, 2017), this research provided an alternative perspective, using the terrain model to illustrate the detailed process of one learner’s autonomy, developed via sustained engagement with Pigai feedback, and the patterns of and relations between types of revisions. For example, autonomy developed from initial superficial corrections to responses to general feedback, evidenced by the student adding more complex sentences and academic expressions. Another sign of autonomy was student 5’s attempts to experiment with the vocabulary and expressions provided in the suggestions and tips, along with gradually increasing self-initiated revisions in the second half of their resubmissions. The other four students also made self-initiated revisions (Table 3). This demonstrated their behavioural engagement with the AWE feedback (Zhang, 2017). We speculate that students applied various strategies when responding to different types of feedback (Zhang, 2017); in future studies, this could be substantiated by interview data to ascertain students’ awareness of using these strategies. Although some of student 5’s attempts contained further errors, they still indicated the learner’s autonomy in creating something new in their own writing, rather than passively following AWE suggestions (Bai & Hu, 2017).

6 Conclusions, implications, and limitations

Our detailed analysis showed that Pigai provided various types of feedback. The majority of error corrective feedback focused on local, language-related errors. At the same time, Pigai provided a significant amount of non-error feedback, which increased through multiple submissions but lacked examples and contextual information. The detailed analysis of all five students’ revisions and resubmissions showed certain patterns in their responses to different types of feedback: initially to error-corrective feedback, then to non-error corrective feedback through trial and error, and to general feedback. It can be concluded that sustained engagement with Pigai feedback could facilitate writing improvement and develop students’ autonomy in using various writing strategies. These findings contribute to the literature on students’ engagement with AWE, which until now has lacked evidence on how students take up AWE feedback (Bai & Hu, 2017; Lu, 2019).

The findings of this study have implications for both language classroom pedagogy and the future development of AWE. Firstly, as the majority of Pigai feedback had a local focus without comments on ideas and content, teachers could adjust their feedback to focus on the content and idea development of student drafts and allocate more time to instruction. Meanwhile, Pigai feedback with its local focus could still be used both for formative assessment and as a learning resource in teaching. Simultaneously, students need guidance on engaging with Pigai feedback by incorporating revisions and resubmissions. In an EFL classroom, some lecture time could be allocated to explaining the features of AWE feedback, including its strengths and weaknesses, and to raising students’ awareness that revision is an integral part of writing (Du & Gao, 2022; Zhang & Hyland, 2022). At the same time, activities could be designed for students to analyse, evaluate, monitor, and regulate their writing and revisions (Zhang & Hyland, 2022). One incentive to promote students’ engagement with AWE could be assigning some assessment marks for revising and editing their writing based on AWE feedback (Du & Gao, 2022). As certain errors in the writing were not identified by Pigai, teachers’ guidance and monitoring are also necessary when AWE is used for learning. One strategy is to combine teacher, peer, and AWE feedback in the revision process (Koltovskaia, 2020; Zhang & Hyland, 2022). At the same time, both teachers and students should be cautious about, and critical of, over-reliance on technology.

The patterns of and changes in interactions with AWE feedback over multiple submissions showed some evidence of autonomous learning. Teachers could capitalise on this to support students in planning and monitoring the revision process. Considering the low take-up of suggestion feedback, students may need extra support from teachers, such as more contextualised examples of vocabulary/expressions, to make effective use or selection of appropriate expressions. This could be supported with other resources, such as grammar references and dictionaries.

There are also implications for future AWE development, such as building more personalised learning models based on learners’ engagement patterns with feedback. As learners’ responses to feedback affect the effectiveness of their learning, it is important to embed the capacity for meaningful engagement in the development of learning technology (Zhang & Zhang, 2018). The content of the feedback could be made more focused and relevant. Recent research in deep learning and AI-supported learning could be incorporated to enhance AWE systems in providing more meaningful and personalised feedback (Shi & Aryadoust, 2022). Our findings also showed that not all errors were picked up in Pigai feedback, which implies that current Pigai feedback is mainly based on data retrieval and matching; tailoring the content and providing more appropriate feedback for optimal learning effectiveness may therefore increase the reliability of future AWE designs. In addition, future AI-programmed applications could be designed to give feedback on conceptual ideas and to cover more genre types.

It is important to note the limitations of this study. First, this paper included only a small sample, with a detailed analysis of five students’ repeated submissions, so the findings cannot be generalised to a wider context. However, the analytical model can be applied in other research analysing learners’ engagement with AWE feedback. In addition, the detailed analysis of the patterns of Pigai feedback and the students’ revisions contributes to an understanding of AWE feedback that cannot be captured by large-scale surveys (Zhang & Hyland, 2022) and provides a baseline of empirical evidence to inform future research with more complex designs involving interviews and multiple cases. Second, this paper focused only on students’ engagement with the AWE feedback rather than on their language learning; the latter will be a future research direction. The computer-generated data in this research could not reveal students’ thinking processes, decision-making, or emotional engagement with feedback in each revision. Future research could explore the strategies students use in revision and their affective engagement with AWE. The incorporation of interview and think-aloud data also warrants future research.