1 Introduction

The past decade has witnessed a rapid proliferation of service robots, which have permeated diverse fields ranging from reception and cargo delivery to home service. According to the International Federation of Robotics (2021), the turnover of the global market for professional service robots rose 12% to 6.7 billion U.S. dollars. Service robots are also regarded as a promising means of mitigating the global teacher shortage (Edwards & Cheok, 2018). It is therefore of critical importance to develop the educational service robot (ESR) into a full-fledged robotic instructor that students can fully accept.

So far, various ESRs have been investigated, including the tele-education robot teacher EngKey in South Korea (Yun et al., 2011), the android robot SAYA in Japan (Hashimoto et al., 2011), RoboThespian in Israel (Verner et al., 2016), and Nao in France (Banaeian & Gilanlioglu, 2021). More recently, how to develop ESRs and why teachers and students accept them have also attracted broad interest (e.g., Kossewska & Kłosowska, 2020; Lin et al., 2021; Naneva et al., 2020; Turja & Oksanen, 2019). However, the study of ESRs is still in its infancy, and developing ESRs remains challenging: programming is complex and difficult for non-technical users, which poses a tremendous obstacle for educational researchers and roboticists (Fogli et al., 2022). Luo et al. (2021) revealed that another reason why novel educational technologies fail to achieve long-term effects might be a lack of awareness of the significance of teachers’ and students’ attitudes. Other scholars have also claimed that students’ acceptance of ESRs is unclear, although the usage intentions of other cohorts such as older adults (Wu et al., 2014), undergraduates (Conti et al., 2015), parents (Lin et al., 2021), and workers (Turja & Oksanen, 2019) have been studied. Therefore, with the popularization of ESRs, more related studies are needed, especially quantitative ones (Turja & Oksanen, 2019).

To make ESRs easy for teachers and students to operate, this study developed an artificially intelligent robot as a classroom teacher (hereafter, AI teacher) based on pedagogical theories and robot programming principles (Huang et al., 2020). Moreover, to assess students’ perspectives on its use in the school environment, an extended technology acceptance model (TAM) is proposed to explore the behavioral mechanism of Chinese pupils in accepting the AI teacher. To the best of our knowledge, this study is one of the first attempts to create an AI teacher together with an integrated model that assesses how elementary school students accept and use it. It paves the way for developing competent AI teachers and may find applications in rural and underdeveloped areas suffering from teacher shortages.

This paper is divided into seven sections. Section 2 reviews the literature on robots as teachers, TAM, and factors influencing people’s acceptance of service robots, and states the objective of this study. Section 3 develops the theoretical model and eight hypotheses. Section 4 introduces the development of the AI teacher, the data collection and analysis methods, and the sample. Section 5 reports the results, and Section 6 discusses them together with their theoretical and practical implications. Finally, Section 7 concludes the study and outlines its limitations and directions for future research.

2 Literature Review

2.1 Robots as teachers

ESRs are frequently used at three levels: as tools operated by teachers and students, as assistants for instructors, and as teachers collaborating with human teachers (El-Hamamsy et al., 2021). Robots as teachers are less common than robots as tools, and scholars have attempted to design robots that behave like human teachers. Their work can be grouped into three categories: effective teaching methods for ESRs, social capabilities that enable ESRs to interact better with learners, and robot programming techniques. Thomaz and Breazeal (2008) presented a reward channel for feedback and future-directed guidance, and emphasized the importance of understanding human-robot interaction when designing algorithms that improve robot learning and teaching. Leite et al. (2012) proposed and justified an empathic model for robots interacting with children. According to Fogli et al. (2022), on-line programming, off-line programming, visual programming, natural language programming, and tangible programming have been used to instruct robots to carry out a task.

2.2 TAM

TAM is one of the most commonly used models in information systems theory for explaining how humans accept and use technologies (Al-Nuaimi & Al-Emran, 2021). It comprises external variables, perceived usefulness (PU), perceived ease of use (PEOU), attitudes towards using (ATU), behavioral intention to use, and actual system use (Davis, 1989). ATU is affected by PU and PEOU both directly and indirectly. The model has been used in several studies of service robot applications. Park and Kwon (2016) used TAM to determine users’ perceptions of robots as teaching assistants and found PU, perceived enjoyment and service quality to be influencing factors. Lee et al. (2018) used PU, PEOU, trust, quality of output, interaction and attitudes towards restaurant robots to predict the implementation of robotics in restaurant services. Go et al. (2020) used iTAM to examine factors (highly interactive systems, increased capability, a user-friendly interface, and perceived interactivity of technology) contributing to consumers’ acceptance of advanced AI robots in hospitality and tourism.

2.3 Determinants of robots acceptance

The determinants of robot acceptance from the individual, technological and interactive perspectives are summarized in Table 1. Users’ geographical location (cultural background and nationality), characteristics (age and sex), technology anxiety, and trust in robots are considered important factors affecting people’s acceptance of robots (Naneva et al., 2020). The survey by Broadbent et al. (2009) in healthcare and elderly care indicated that the design and function of robots exerted a large impact. The intended domain of application and the type of exposure to robots affected users’ experience with robots (Savela et al., 2018).

Table 1 Determinants of service robots acceptance

2.4 Aim of the current study

As discussed earlier, TAM has been used in various studies, but the variables chosen to explore the acceptance of service robots have differed. ESRs have evolved to use AI technology. Thus, following activity theory, which emphasizes interactions among humans (pupils and teachers), tools (AI teachers), and activities (teaching and learning) (Heo & Lee, 2013), this study uses a modified TAM with robot use anxiety (RUA), PEOU and PU as predictors, and expands it with the task-technology fit variable Robot Instructional Task Difficulty (RITD) to assess pupils’ acceptance of the newly developed AI teacher.

RUA and RITD are the key factors and have not been used in any other models to study pupils’ attitudes towards the AI teacher. In addition, many existing studies have focused on robot use in service industries and on the attitudes of middle-aged and elderly people. Thus, this study aims not only to develop the AI teacher but also to contribute new variables to known models. Our findings will provide policymakers, researchers, manufacturers, and teachers with useful information for developing AI teachers effectively, which can enhance the quality of education at lower human cost.

3 Hypotheses development

Drawing on the literature discussed above, the theoretical model, which includes the constructs RUA, PEOU, PU and RITD and depicts the eight hypotheses of this study, is presented in Fig. 1.

Fig. 1 Theoretical model of the AI teacher acceptance

3.1 RUA and acceptance

RUA is the discomfort, insecurity and frustration that people feel when they confront new technologies such as the AI teacher (Mac Callum et al., 2014). It has been shown that people feel anxious when they are unfamiliar with novel technologies, and that anxiety in turn discourages them from using the technology (Özdemir-Güngör & Camgöz-Akdağ, 2018). However, more frequent use reduces anxiety and leads to a change in attitudes. Accordingly, we proposed two hypotheses: RUA has a negative effect on pupils’ acceptance of AI teachers (H1), and the relationship between RUA and acceptance is mediated by PEOU (H8).

3.2 PEOU and acceptance

PEOU refers to the extent to which pupils believe that using AI teachers would involve neither difficulty nor effort (Davis, 1989). According to TAM, PEOU has a significant positive effect on attitudes (Davis, 1989). Thus, we put forward the second hypothesis: PEOU has a positive effect on pupils’ acceptance of AI teachers (H2).

3.3 PU and acceptance

PU is defined as the extent to which pupils perceive the AI teacher to be helpful to their learning outcomes after using it (Davis, 1989). In TAM, PU has a significant positive effect on attitudes (Davis, 1989). Specifically, if pupils perceive that the AI teacher is useful to them, their acceptance of this technology and their likelihood of using it will be enhanced. Therefore, we formulated the third hypothesis: PU has a positive effect on pupils’ acceptance of AI teachers (H3).

3.4 RITD and acceptance

Very few previous works, such as the study by Li et al. (2010), have mentioned that the fit between task conditions and robot technology influences people’s perception of robots. According to Bloom’s taxonomy of educational objectives (1956), the instructional tasks assigned to the AI teacher can be divided into functional and situational tasks. Functional tasks are completed without considering context, while situational tasks require understanding the context, analyzing the conditions for completing the tasks, and using the necessary knowledge and skills to complete them. Compared with functional tasks, situational ones are regarded as more difficult and challenging for the AI teacher, which may influence pupils’ trust in the AI teacher. RITD refers to the degree to which the task represents a personally demanding situation requiring a considerable amount of cognitive or physical effort to develop learners’ knowledge and skill levels (Brehm & Self, 1989). To explore the relationships of RITD with RUA, PU and PEOU, we formulated another four hypotheses. RITD predicts acceptance of the AI teacher: the easier the AI teacher’s instructional task is, the more acceptable the AI teacher will be (H4). In addition, the relationship between RITD and acceptance is mediated by RUA (H5), PU (H6), and PEOU (H7).

4 Methodology

4.1 Development of the AI teacher

An AI teacher is expected to deliver knowledge and provide immediate feedback to students through eye contact, varied vocal and facial expressions, and gestures. Following these criteria, we attempted to develop a qualified AI teacher and the AI Teacher-led Instruction (see Fig. 2; Huang et al., 2020).

Fig. 2 The AI Teacher-led Instruction

The robot employed for secondary development was the Avatar-mind IPAL produced by Nanjing Avatar-Mind Robot Technology (see Fig. 2). Compared with other robotic products on the current Chinese market, it offers advantages in feasibility, customization and affordability. IPAL was chosen for four reasons. First, we considered its appearance. With its humanoid shell (1025 mm × 395 mm × 440 mm), few angles, and no exposed mechanical parts, it looks like a cute child, which, in line with the Uncanny Valley, eliminates pupils’ fear of the robot and the impression of coldness (Lin et al., 2021). Second, IPAL is equipped with touch sensors, ultrasound sensors, infrared sensors, a microphone array for sound direction and detection (MASDD), a megapixel camera in the eyes for facial recognition, a liquid crystal display (LCD) screen and four wheels, which make it multifunctional and knowledgeable. Third, the Android operating system and motion control software are installed on IPAL for the innovative development and maintenance of the robot. Last, IPAL is priced at 1,361 USD, which is affordable for many schools.

Then, based on previous studies and experience, we developed the AI teacher in three stages to increase its performance. First, the hardware and software installed on IPAL were used for collecting, storing, processing and presenting data through the TCP/IP communication protocol (see Fig. 2). Second, teaching materials, language and behaviors were encoded and programmed into IPAL by the human teacher on the open code platform (see Fig. 3). To make the AI teacher resemble an experienced teacher, the following behaviors were programmed into it: ① turning the torso and head and shifting the gaze when talking with the class; ② pointing a finger at the slides projected on the whiteboard to draw pupils’ attention and to emphasize important content; ③ avoiding obstacles, walking slowly towards the pupils, and making appropriate gestures to them (raising hands, opening arms, bending arms, etc.); ④ showing different facial expressions by displaying colorful signs to give feedback to students, such as “♥” to express satisfaction, happiness and affection, “☺” to welcome and greet pupils, and “☹” to show unhappiness and displeasure with pupils’ performance.

Fig. 3 a Motion frame edit, b Content editor and its panels, c Resource library

Third, the human teacher used the iRemoter to control the AI teacher, so as to facilitate interactive knowledge sharing and task execution under a common goal among the AI teacher, the human teacher, pupils and other multimedia (see Fig. 2). The iRemoter is a visual editor for IPAL that is compatible with different operating systems and runs on devices such as mobile phones and personal computers. Its network capability makes it suitable as an interface between humans (teachers and students) and the robot.

4.2 Data collection and analysis

Based on a review of users’ acceptance of robots (e.g., Heerink et al., 2009), we first adapted a questionnaire. Its first part collected demographic information such as gender, experience and preference. Its second part explored acceptance of the AI teacher and comprised 25 items on a five-point Likert scale (1 = strongly disagree, 2 = disagree, 3 = unsure, 4 = agree, 5 = strongly agree) distributed across five variables: 3 items for RUA, 6 items for PEOU, 7 items for PU, 2 items for RITD, and 7 items for acceptance (ACC) (see Appendix 1). A pretest was conducted to ensure the accuracy of the questionnaire.
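The questionnaire structure described above can be sketched as a small scoring routine. This is only an illustration: the item counts per variable come from the text, but the order of items within a response and the use of summed totals are assumptions (the construct means reported later are consistent with summed scores), and `score_response` is a hypothetical helper name.

```python
# Item counts per variable as reported in the text.
# ASSUMPTION: items appear in this order within a response.
ITEMS_PER_CONSTRUCT = {"RUA": 3, "PEOU": 6, "PU": 7, "RITD": 2, "ACC": 7}

def score_response(ratings):
    """Split one pupil's 25 Likert ratings (1-5) into summed construct totals."""
    expected = sum(ITEMS_PER_CONSTRUCT.values())
    if len(ratings) != expected:
        raise ValueError(f"expected {expected} ratings, got {len(ratings)}")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    totals, start = {}, 0
    for construct, n_items in ITEMS_PER_CONSTRUCT.items():
        totals[construct] = sum(ratings[start:start + n_items])
        start += n_items
    return totals

# A made-up pupil who answers "unsure" (3) to every item.
totals = score_response([3] * 25)
print(totals)
```

Under this scoring, the possible range of each construct is its item count times 1 to 5, for example 3 to 15 for RUA and 7 to 35 for PU.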

Second, apart from the elementary schools with which we had collaborated for a long time, we could not enter other schools because of the COVID-19 pandemic and epidemic prevention policies. We therefore adopted both direct and indirect forms of Human–Robot Interaction (HRI), as suggested by Naneva et al. (2020). Specifically, we deployed the AI teacher in the cooperating elementary schools. Meanwhile, we recorded, edited and produced video materials about our AI teacher-led instruction (see Fig. 4).

Fig. 4 a AI teacher-led Chinese lesson, b AI teacher-led math lesson, c AI teacher-led English lesson, d AI teacher-led Moral and Law lesson

Third, participants in the cooperating schools filled out anonymous questionnaires after attending the AI teacher-led class and interacting with the AI teacher, while those in the other elementary schools completed the questionnaires after listening to their human teacher’s introduction of the AI teacher and watching the video we provided (see Fig. 5). All participants provided informed consent, and their answers were kept confidential.

Fig. 5 Data collection procedure

Fourth, the quantitative data were imported into SPSS 23.0 and examined. Structural equation modeling (SEM) is useful for analyzing quantitative data and describing relationships among observed variables, allowing investigators to test theoretical models and extend theories (Thakkar, 2020, p. 1); it was therefore chosen for the data analysis, which was carried out with SPSS 23.0 and Amos 26.0.

4.3 Participants

Convenience sampling was used to select pupils in grades 3–6 from six elementary schools with similar educational environments and resources. A total of 734 questionnaires were distributed on site and 684 were collected, an effective recovery rate of 93%. There were no missing values or recording errors. Nineteen records were deleted because they contained outliers, leaving 665 records for analysis. As shown in Table 2, the participants included 341 boys and 324 girls, aged 9 to 12. Among them, 398 participants had not known about AI teachers until watching the video, while the remaining 267 had already known about AI teachers six months earlier. A total of 641 participants had never used AI teachers before. As to the sex of AI teachers, 90 participants preferred male, 206 preferred female, 129 wanted androgynous, 127 did not care, and the remaining 113 thought it should depend on the subject areas that the AI teachers took charge of.
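As a quick consistency check, the sample figures reported above can be recomputed in a few lines. The numbers are taken from the text; the variable names are illustrative only.

```python
# Figures reported in Section 4.3; variable names are illustrative.
distributed = 734
collected = 684
outliers_removed = 19

recovery_rate = collected / distributed      # share of questionnaires returned
analysed = collected - outliers_removed      # records kept for analysis

boys, girls = 341, 324
knew_before, did_not_know = 267, 398
robot_sex_preference = {
    "male": 90,
    "female": 206,
    "androgynous": 129,
    "no preference": 127,
    "depends on subject": 113,
}

# Every reported breakdown should add up to the analysed sample.
breakdowns_consistent = (
    boys + girls == analysed
    and knew_before + did_not_know == analysed
    and sum(robot_sex_preference.values()) == analysed
)

print(f"recovery rate: {recovery_rate:.1%}")
print(f"analysed sample: {analysed}")
print(f"breakdowns consistent: {breakdowns_consistent}")
```

All three breakdowns (gender, prior knowledge, preferred sex of the AI teacher) sum to the same analysed sample, which matches the deletion of 19 records from the 684 collected.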

Table 2 Characteristics of the participants

5 Results

5.1 Status quo of Chinese elementary school students’ acceptance of the AI teacher

As revealed by Table 3 and Fig. 6, the normality, skewness, and kurtosis of the data were appropriate (Kline, 2015, p. 77). As shown in Table 4, the mean scores of RUA, PEOU, PU, RITD and ACC were 8.22 (SD = 2.867), 22.17 (SD = 5.149), 29.29 (SD = 4.642), 7.53 (SD = 1.99), and 26.82 (SD = 5.401), respectively, indicating that, faced with the new AI teacher, pupils had a moderate level of anxiety, perceived it as practical and simple, valued the fit between the AI teacher and its instructional tasks, and were open to this new technology.
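Because the constructs contain different numbers of items, their mean scores are easier to compare once divided by the item count. The sketch below does this under the assumption that the reported means are summed item scores (which their magnitudes suggest); the item counts are those given in Section 4.2.

```python
LIKERT_MIDPOINT = 3  # "unsure" on the five-point scale

# construct: (number of items, reported mean score)
# ASSUMPTION: the reported means are sums over the construct's items.
construct_stats = {
    "RUA": (3, 8.22),
    "PEOU": (6, 22.17),
    "PU": (7, 29.29),
    "RITD": (2, 7.53),
    "ACC": (7, 26.82),
}

per_item_means = {
    name: total / n_items for name, (n_items, total) in construct_stats.items()
}

for name, value in per_item_means.items():
    side = "above" if value > LIKERT_MIDPOINT else "below"
    print(f"{name}: {value:.2f} per item ({side} the scale midpoint)")
```

On this reading, RUA sits slightly below the scale midpoint while the other four constructs sit above it, which is in line with the interpretation of moderate anxiety and generally favorable perceptions.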

Table 3 Descriptive measures of the items
Fig. 6 Mean score of each item

Table 4 Descriptive measures of the constructs

5.2 Measurement model testing

A measurement model relates measured variables to latent variables. To develop it, we first examined reliability, which measures the internal consistency of a scale or construct; Cronbach’s coefficient alpha (α) is the most frequently used estimator of reliability (Thakkar, 2020, p. 121). The threshold for desirable α values is 0.6 (Rong, 2009). As shown in Table 5, the α values of RUA, PEOU, PU, RITD, and ACC were 0.641, 0.839, 0.859, 0.735, and 0.855, respectively, indicating that the measurement model had good reliability and internal consistency. Although RITD was composed of only two items, its α value met the recommended level (Bollen & Davis, 2009).
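For readers unfamiliar with the statistic, a minimal sketch of Cronbach’s α is given below, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The study itself computed α in SPSS; the function name and the ratings are made up for illustration and are not the study’s data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for one construct.

    rows: one list of item ratings per respondent (same items, same order).
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Made-up ratings of five pupils on a two-item construct (like RITD).
ratings = [[4, 5], [3, 3], [5, 4], [2, 3], [4, 4]]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.3f}, meets 0.6 threshold: {alpha >= 0.6}")
```

The formula works for any construct with two or more items, which is why a two-item construct can still be assessed against the 0.6 threshold.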

Table 5 Results of reliability analysis

Second, before examining convergent validity (CV) and discriminant validity (DV), the Kaiser-Meyer-Olkin (KMO) test and Bartlett’s test of sphericity were conducted (Ferguson & Cox, 1993). As shown in Table 6, the KMO coefficient was 0.949 and Bartlett’s test of sphericity was significant (p < 0.001), meeting the criteria (Ferguson & Cox, 1993).

Table 6 Results of KMO and Bartlett’s test

Third, construct validity was investigated using CV and DV. CV examines correlations among the items of each construct. The thresholds for desirable values are 0.5 for factor loading (FL) (Fornell & Larcker, 1981), 0.6 for composite reliability (CR) and estimates, and 0.36 for squared multiple correlations (SMC) and average variance extracted (AVE) (Bagozzi & Yi, 1988). As seen from Table 7, FL ranged from 0.550 to 0.792, indicating that the 25 items had strong explanatory power for the five constructs. The CR values ranged from 0.642 to 0.862, meeting the recommended level of 0.60 (Fornell & Larcker, 1981) and indicating high internal consistency for the five constructs. The AVE of RITD met the acceptance level of 0.5, whereas the AVEs of RUA, PEOU, PU and ACC ranged from 0.377 to 0.483, slightly below the recommended level of 0.5 (Asghar et al., 2021a, 2021b). According to Fornell and Larcker (1981), because the CR values of all five constructs exceeded the acceptable level, the items could reflect the constructs’ traits and CV was adequate.
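The relation between loadings, CR and AVE can be made concrete with the commonly used formulas based on standardized loadings λ: CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and AVE = Σλ² / n. The sketch below assumes these formulas were the ones applied; the loadings are hypothetical values inside the reported 0.550–0.792 range, not the entries of Table 7.

```python
def composite_reliability(loadings):
    """CR from standardized loadings, with error variance 1 - loading**2."""
    sum_loadings = sum(loadings)
    error_variance = sum(1 - l ** 2 for l in loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + error_variance)

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# HYPOTHETICAL loadings for a six-item construct (not taken from Table 7).
loadings = [0.6] * 6
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
```

With these illustrative loadings, AVE falls below 0.5 while CR stays well above 0.6, which is the situation the paragraph above describes for RUA, PEOU, PU and ACC.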

Table 7 Item loading, CR and AVE

Fourth, DV was tested by comparing the square root of the AVE of each construct with the correlations between that construct and each of the other constructs. A construct shows good DV when its AVE is higher than its squared correlations with the other constructs (Hair et al., 2010; Hu & Bentler, 1999; Fornell & Larcker, 1981). As shown in Table 8, the square roots of the AVE for RITD, RUA, PEOU, PU and ACC were 0.727, 0.614, 0.695, 0.687, and 0.675, respectively, confirming good DV in the measurement model.
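The comparison described above can be sketched as a simple check. The square roots of AVE are the values reported in the text; the inter-construct correlations are hypothetical placeholders, since the entries of Table 8 are not reproduced here, and `fornell_larcker_violations` is an illustrative helper name.

```python
# Square roots of AVE as reported in the text.
sqrt_ave = {"RITD": 0.727, "RUA": 0.614, "PEOU": 0.695, "PU": 0.687, "ACC": 0.675}

def fornell_larcker_violations(sqrt_ave, correlations):
    """Return construct pairs whose |correlation| is not below both sqrt(AVE) values."""
    violations = []
    for (a, b), r in correlations.items():
        if abs(r) >= min(sqrt_ave[a], sqrt_ave[b]):
            violations.append((a, b))
    return violations

# HYPOTHETICAL correlations, for illustration only (not the values of Table 8).
correlations = {
    ("PU", "PEOU"): 0.55,
    ("RUA", "ACC"): -0.20,
    ("RITD", "PU"): 0.60,
}
violations = fornell_larcker_violations(sqrt_ave, correlations)
print("discriminant validity holds" if not violations else f"violations: {violations}")

# Squaring the reported roots recovers the AVE values themselves.
ave = {name: round(value ** 2, 3) for name, value in sqrt_ave.items()}
print(ave)
```

Squaring the reported roots gives AVE values between 0.377 and 0.529, consistent with the range stated in the convergent validity analysis.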

Table 8 DV results for the measurement model

5.3 Structural model testing

Fit indices of the overall model are used to evaluate the overall fit between the theoretical model and the sample data (Thakkar, 2020, p. 33). We referred to seven widely used goodness-of-fit tests (see Table 9). The results showed that the structural model fitted the sample data well and is therefore useful for interpreting the observed data.

Table 9 Recommended and actual values of goodness-of-fit measures

5.4 Structural model path analysis

To validate the eight research hypotheses and examine the hypothesized relationships, SEM was used. Tables 10 and 11 and Fig. 7 present the findings for all relationships.

Table 10 Path analysis and results
Table 11 Mediating effects analysis and results
Fig. 7 Empirical model of pupils’ acceptance of the AI teacher

As shown in Table 10, the relationship between RUA and acceptance (H1) (Std. = -0.02, Unstd. = -0.022, SE = 0.046, p = 0.626 > 0.05) was not significant. In contrast, the relationship between PEOU and acceptance (H2) (Std. = 0.33, Unstd. = 0.326, SE = 0.053, p < 0.001) was positive and significant, as was the relationship between PU and acceptance (H3) (Std. = 0.4, Unstd. = 0.499, SE = 0.067, p < 0.001). The relationship between RITD and acceptance (H4) (Std. = 0.221, Unstd. = 0.211, SE = 0.061, p < 0.001) was also significant.
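The reported significance levels can be approximately reproduced from the unstandardized estimates and standard errors. The sketch below assumes a two-sided test of estimate/SE against the standard normal distribution; because it starts from the rounded figures printed in the text, the recomputed p values may differ slightly from the software output.

```python
import math

# (unstandardized estimate, standard error) as reported for H1-H4.
paths = {
    "H1 RUA -> ACC": (-0.022, 0.046),
    "H2 PEOU -> ACC": (0.326, 0.053),
    "H3 PU -> ACC": (0.499, 0.067),
    "H4 RITD -> ACC": (0.211, 0.061),
}

def two_sided_p(estimate, se):
    """Two-sided p value of estimate/se under a standard normal reference."""
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2))

p_values = {name: two_sided_p(est, se) for name, (est, se) in paths.items()}
for name, p in p_values.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: p = {p:.3f} ({verdict})")
```

The recomputed value for H1 lands close to the reported p = 0.626, and the other three paths fall below 0.001, matching the pattern in Table 10.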

Because the variables were related in complex ways, we tested the mediating roles of RUA, PU, and PEOU using bootstrapping with 5000 resamples (Preacher & Hayes, 2008). This resampling method has greater statistical power and yields more stable results, and it can test multiple mediating effects as well as overall effects. The point estimate of a mediating effect is considered significant if its 95% confidence interval (CI) does not contain zero (Zhao et al., 2010). As shown in Table 11, the hypothesis that RUA mediated the relationship between RITD and ACC (r = -0.027, CI = [-0.063, -0.07]) was accepted (H5). The hypothesis that PU mediated the relationship between RITD and ACC (r = 0.259, CI = [0.19, 0.348]) was accepted (H6). The hypothesis that PEOU mediated the relationship between RITD and ACC (r = 0.242, CI = [0.152, 0.387]) was accepted (H7). The hypothesis that PEOU mediated the relationship between RUA and ACC (r = -0.018, CI = [-0.13, 0.078], which contains zero) was not accepted (H8).
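The logic of a bootstrapped mediation test can be illustrated with a deliberately simplified sketch. It is not the Amos procedure used in the study: it estimates a single indirect effect a × b by ordinary least squares on synthetic data, uses a plain percentile interval, and runs 1000 resamples to keep the example fast (the study used 5000). All function names, variable names and data are assumptions for illustration.

```python
import random

def _cov(u, v):
    """Sample covariance of two equally long sequences."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y):
    """a*b for the simple mediation X -> M -> Y, estimated by least squares."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx                                          # slope of M on X
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)   # slope of Y on M, controlling for X
    return a * b

def bootstrap_ci(x, m, y, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample respondents with replacement
        estimates.append(
            indirect_effect([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
        )
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper

# SYNTHETIC data with a built-in indirect effect (not the study's data).
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + 0.2 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

estimate = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y, n_boot=1000)
significant = not (lo <= 0 <= hi)  # decision rule: the CI must exclude zero
print(f"indirect effect = {estimate:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}], significant: {significant}")
```

The decision rule in the last lines mirrors the criterion stated above: a mediating effect is treated as significant only when the interval does not contain zero.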

6 Discussions and implications

Human teachers and AI teachers teaching hand in hand seems to be no longer a choice but an emerging necessity. This study might be a good starting point for stakeholders. In the fields of robotics and educational technology, it is one of the first attempts to develop the ESR into an AI teacher, to investigate pupils’ acceptance of it, to specify the determinants of acceptance, and finally to construct and test, through SEM, an acceptance model that explains pupils’ intention to use the AI teacher.

6.1 What is the level of acceptance of the AI teacher among Chinese elementary school students?

As shown in Table 2, fewer than half of the Chinese pupils knew about the AI teacher before taking part in this research, and the majority of them had not used an AI teacher before. Although their previous experience with AI teachers varied, ranging from none to veteran, they showed positive attitudes towards the newly developed AI teacher, as indicated by the mean score of the acceptance factor (see Table 3 & Fig. 6). The mean score of PEOU revealed that Chinese pupils required technical coaching, and it suggests increasing pupils’ time using the AI teacher and their familiarity with it. This result further supports the idea that PEOU is affected by users’ prior experience of the technology (Di Nuovo et al., 2018). Experience is decisive at the beginning of use, but loses importance as use continues and experience accumulates (Kossewska & Kłosowska, 2020). The high mean score of PU confirmed the previous finding that when individuals use a service robot to accomplish tasks, they expect the robot to improve their performance or deliver useful services (Park & Kwon, 2016). This study also revealed that pupils who rated RITD highly perceived a good fit between instructional tasks and the AI teacher. Chinese pupils placed a high value on the efficient delivery of courses and the real-time interactive capabilities of the AI teacher, which matches the findings of earlier studies (e.g., Strader et al., 2015).

6.2 What factors influence the acceptance of the AI teacher among Chinese elementary school students, and how do they relate to each other?

Except for RUA, the other three factors influenced Chinese pupils’ acceptance of the AI teacher in the following order: PU > PEOU > RITD. In addition, RITD had not only direct effects on acceptance but also indirect effects through the mediating variables RUA, PU and PEOU. These main findings are analyzed further below.

This study revealed three crucial factors. The first is PU. It outweighs the other three factors, which is consistent with prior works (e.g., Kossewska & Kłosowska, 2020). It is understandable that the usefulness of the AI teacher has been valued most by Chinese pupils, as they study under the existing exam-oriented educational system. Pupils perceived the AI teacher as useful in four respects: delivering knowledge and skills to learners, assisting learners in finding solutions to learning problems, establishing and maintaining interpersonal relations between learners and human teachers, and interacting with learners and encouraging their ideas and opinions. These can be classified accordingly as informational usefulness (Rudat et al., 2014), instrumental usefulness (Ranson, 2008), relational usefulness (Asghar & Pilkington, 2018), and communicative usefulness (Riera-Gil, 2019).

Consistent with previous works (Kossewska & Kłosowska, 2020), this study found another core predictor of accepting the AI teacher: PEOU, which had a positive effect on Chinese pupils’ acceptance. As the AI teacher has not yet entered frequent use worldwide, it remains unfamiliar and apparently complex to students. Therefore, to provide a better experience for pupils, AI teachers should be configured with high-performance servers to improve the stability and response time of their operations, and should provide smooth interfaces and easy navigation for pupils.

The third factor is task characteristics, that is, the fit between task conditions and the AI teacher’s capabilities. ESRs have been used to play built-in songs and stories for entertainment rather than for teaching. Moreover, the substance of what robots teach, and how it changes students’ perceptions, has not been noticed before. Our findings revealed that pupils reported greater use of the AI teacher as the fit between instructional task characteristics and the AI teacher’s affordances increased. Acceptance dropped when the equivocality of the instructional task was high. Hence, the design of instructional tasks and the functions of the AI teacher need to be considered together.

In the current research, we extended TAM by including RUA and RITD to explore changes in Chinese pupils’ acceptance of the AI teacher. In terms of theoretical implications, first, TAM was effectively updated and applied in a new context, namely AI teacher-led instruction in China. Second, we used TAM to produce empirical evidence on the elements involved in pupils’ decision to accept the AI teacher. As far as we know, this is one of the few works to discuss the factors influencing acceptance of the AI teacher in Chinese elementary schools. Third, the RUA and RITD constructs incorporated into this work may deepen the understanding of why pupils’ acceptance of the AI teacher differs.

In terms of practical implications, first, the findings of this study suggest that, as there is a lack of AI teachers in China (Huang, 2021), university and school executives need to provide related robotics training programs for students (El-Hamamsy et al., 2021), help them recognize AI teachers’ utility, increase PU and PEOU, and allay their anxiety. Second, AI teachers remain new, so any attempt to improve students’ acceptance will be valuable. This study suggests that educational robot researchers could offer AI teachers suited to different disciplines, pilot AI teacher-led classrooms with volunteer schools, and then extend them to others who would like to follow. Third, robot manufacturers and designers are in charge of the development and sales of ESRs. The findings of this research help them identify the most important determinants to be incorporated into ESRs, and suggest that they communicate with school users, especially those giving low ratings, so as to identify and resolve problems. In addition, they need to develop ESRs that can work with different computers and mobile phones. Finally, this study implies that schools, teachers and pupils need to identify which kind of AI teacher is appropriate for them, and be aware of the consequences of misusing AI teachers, such as privacy leaks.

7 Conclusions, limitations and future research

This study set out to examine Chinese pupils’ acceptance of a new AI teacher and to validate the proposed model in order to determine the factors that affect their acceptance. To achieve these goals, this study collected quantitative data on site and analyzed them with SPSS and Amos. It is concluded that Chinese pupils were positive towards the AI teacher, and that PEOU, PU and RITD were indicators of acceptance. In addition, RUA, PU and PEOU significantly mediated the relationship between RITD and acceptance.

This study has some limitations. First, some participants only watched videos and listened to their teachers’ explanations; their attitudes might have been more positive if they had interacted directly with the AI teacher. Second, results were collected via a survey-based questionnaire, which might restrict the observed effects of the factors on acceptance. Third, a larger sample could give a more comprehensive picture of pupils’ acceptance of the AI teacher, especially in rural areas. Fourth, some variables were not examined, such as technological literacy and self-efficacy, as well as moderating factors including age, gender and experience.

The above limitations are expected to be addressed in future studies. First, the extended TAM in this paper needs to be validated in other settings in order to give a wider picture of the determinants of acceptance of the AI teacher. Second, the empirical evidence of this study could be reexamined using actual behavioral data produced by other research tools. Third, this study explored the determinants of acceptance of the AI teacher from pupils’ perspectives; future works could involve other stakeholders such as in-service or pre-service teachers. Finally, additional studies on the factors influencing pupils’ acceptance of the AI teacher are critical to elucidating the AI teacher’s role. Thus, future studies could take new factors into account. When these are added to our model, the understanding of the acceptance of ESRs might become more complete.