1 Introduction

In the era of swift advancements in information technology, artificial intelligence (AI) has achieved significant breakthroughs, transforming multiple facets of existence. AI-enhanced English learning applications are gaining popularity primarily due to their adaptability and advanced features. Notable examples include Busuu, which employs AI for vocabulary and grammar training; ELSA, which serves as an English language speech facilitation tool; and DeepL, which is recognized for its AI-driven translation capabilities. One outstanding development in this field is ChatGPT. ChatGPT, as an advanced language model, employs deep learning methodologies to comprehend and generate human-like communication, allowing for seamless conversational exchanges with users (Lock, 2022). It has found utility in various domains, including serving as a valuable tool for information retrieval, creative writing assistance, and language translation (Roose, 2022). Students frequently employ ChatGPT for tasks such as converting languages, editorial refinement, and the composition of assignments. Likewise, leveraging AI-assisted training programs, ChatGPT has established itself as a potent instrument for enhancing students’ academic performance and motivation in educational contexts (Srinivasa et al., 2022). AI technologies like ChatGPT contribute to an improved learning experience and foster greater engagement and immersion in the educational process by providing personalized and interactive assistance to students. As AI continues to evolve, its potential to transform language education further is immense. Accordingly, investigating ChatGPT’s impact on ESL learners is both timely and essential in an era marked by digital integration in education.

The Hedonic-Motivation System Adoption Model (HMSAM) is a theoretical framework emphasizing hedonic and motivational factors in technology adoption (Lowry et al., 2013). In this research, HMSAM is aptly suited for several reasons. Firstly, the study investigates ChatGPT’s role in enhancing ESL students’ intrinsic motivation and its effectiveness in promoting focused immersion in language learning. This focus aligns with HMSAM’s emphasis on understanding the hedonic and motivational dimensions that drive technology adoption. Such an alignment is particularly pertinent for addressing the unique needs and challenges faced by ESL learners, especially Chinese students in the UK. Secondly, HMSAM encompasses a wide range of dimensions that are instrumental in exploring the complex relationships between various factors in technology adoption. This research aims to delve into these multidimensional relationships, and HMSAM provides a comprehensive set of dimensions that facilitate a thorough examination of the intricate dynamics at play. Through HMSAM, this study can explore the nuanced interplay of factors that contribute to the effective integration of ChatGPT in language education, offering valuable insights into fostering a conducive learning environment for ESL students. By applying HMSAM, the study aims to investigate the interplay between hedonic motivations, user engagement, and the effectiveness of AI tools like ChatGPT in language education by hypothesizing and validating a model of the relationship between these constructs. This exploration is pivotal in highlighting the significant role of motivation in the language learning journey of Chinese students in the UK, aiming to offer a detailed understanding of the motivational dynamics at play and insights into optimizing language learning strategies and interventions in ESL contexts.

2 Literature review

The literature review comprehensively examines the dynamic interplay between Computer-Assisted Language Learning (CALL), Generative Artificial Intelligence (GAI), and the innovative application of tools like ChatGPT in enhancing the learning journey of English as a Second Language (ESL) learners. It delves into the transformative impact of technological advancements in CALL, highlighting how GAI, particularly through platforms such as ChatGPT, facilitates personalized and interactive learning experiences. Furthermore, the review explores theoretical frameworks like HMSAM, elucidating the psychological and motivational drivers that underpin the adoption and effective utilization of such technologies in educational settings. Empirical evidence is woven throughout to underscore the potential of GAI in revolutionizing ESL education by offering nuanced insights into learner engagement, motivation, and the overall efficacy of GAI-enhanced language learning strategies. In addition, the comprehensive analysis of CALL, GAI, and the application of ChatGPT in ESL education, together with an exploration of HMSAM, establishes a robust groundwork for the forthcoming research hypotheses. By integrating these aspects, the literature review effectively lays a comprehensive theoretical foundation for this study, offering a well-rounded introduction to the role of educational technology in language learning. This synthesis not only enriches the academic discourse on the use of technology in language acquisition but also establishes a solid foundation for the ensuing sections of this study.

2.1 CALL, Generative AI, and ChatGPT in relation to ESL learners

The journey of acquiring a second language is a multifaceted one, encompassing various elements that lead to diverse outcomes. The extent of one’s integration into a new culture, exposure to clear and understandable language input, focus on specific language attributes, and chances to engage in meaningful communication are pivotal (Ortega, 2008). The endeavour to learn additional languages involves deliberate strategies to enhance linguistic knowledge and skills (Bereknyei et al., 2010). This includes grasping essential language components such as grammar and vocabulary, which are crucial for mastering reading, writing, listening, and speaking, along with associated elements like pronunciation (Tindall & Nisbet, 2010; Lan, 2016). Historically, ESL educators have utilized computers for supplementary activities. With technological advancements, the integration of computer-based tools has become a fundamental aspect of ESL education. Wang (2011) has explored the effectiveness of computer technology and software in teaching writing, finding a significant positive impact on learner engagement and interest in English writing, with approximately 80% of students acknowledging the benefits of Computer-Assisted Language Learning (CALL). Similarly, Beatty (2013) indicated that when CALL is thoughtfully integrated into the curriculum, it significantly enhances language comprehension, aiding both instructors and students.

In the recent decade, GAI has emerged as a pivotal force in the development of CALL for ESL students. Within the educational context, GAI facilitates immediate and authentic interactions with ESL learners (Fryer et al., 2020). For educational purposes, GAI technologies, exemplified by platforms like ChatGPT, assist students in writing essays and comprehending theories and concepts (Johnston et al., 2024). It can provide students with round-the-clock support, ensuring learners receive assistance precisely when necessary (Haleem et al., 2022). With the ability to correct grammar, suggest improvements, and identify weak areas, ChatGPT can provide students with immediate feedback on their academic work, helping them to rectify their errors and progressively advance their language competencies (Lee et al., 2023). Continuously learning and improving from previous interactions, these systems are capable of engaging in intelligent communication with ESL students, serving as tireless assistants in language learning (Fryer et al., 2019). This timely intervention empowers learners to recognise, rectify, and refine their linguistic capability (Baskara, 2023). In essence, GAI chatbots, epitomised by ChatGPT, assist students in grasping abstract theories and concepts and optimise their overall academic efficacy (Song & Song, 2023).

Empirical studies have delved into the relationship between ChatGPT and ESL learning, shedding light on its effectiveness. For instance, Smith et al. (2022) conducted a study involving English language learners and found that integrating ChatGPT into language learning activities increased learners’ involvement and enthusiasm. The research also underscored the positive influence of ChatGPT on students’ oral and written skills. This effectiveness is partly due to ChatGPT’s ability to simulate natural dialogues, offering a practical platform for learners to hone their conversational skills. By engaging users in interactive exchanges, ChatGPT bridges the theoretical aspects of language learning with real-world applications, thus contributing to a more comprehensive development of language proficiency. Additionally, Dai et al. (2023) investigated the effectiveness of ChatGPT in providing personalized feedback to learners. Their findings revealed that learners perceived ChatGPT’s feedback as helpful and valuable, addressing their language needs and facilitating targeted language practice. At this point, ChatGPT functions as a dynamic educational tool, offering real-time, tailored feedback on vocabulary and grammar by analysing learner inputs, pinpointing inaccuracies, and recommending specific corrections. Similarly, Chen et al. (2023) undertook a comparative analysis to evaluate how ChatGPT fares against conventional language learning materials. The results indicated that learners who interacted with ChatGPT demonstrated significantly improved vocabulary acquisition and grammatical accuracy compared to conventional materials. This distinction arises from ChatGPT’s interactive and adaptive dialogue capabilities, which contrast with the static nature of traditional resources, offering a more captivating and tailored educational experience that effectively enhances language proficiency.

While previous research has explored the potential educational applications of ChatGPT, limited attention has been directed toward examining the role of intrinsic motivation in its adoption. Intrinsic motivation is a person’s inclination to participate in an activity driven by their interests and inner enjoyment (Lepper & Henderlong, 2000). According to Deci and Ryan (1985), empirical evidence reveals that students with heightened intrinsic motivation towards an activity show an enhanced propensity to engage voluntarily. In other words, a positive correlation has been observed between intrinsic motivation among students and their aptitude for effectively acquiring intricate knowledge and demonstrating creative capabilities within the specific domain of the activity (Deci & Ryan, 1985). In the realm of technology uptake, intrinsic motivation is crucial in shaping students’ views on the benefits of using technology (Sun & Gao, 2020). It is believed that intrinsic motivation offers valuable insights into the potential of computer-assisted learning to serve as influential catalysts for transformative change and tools for reconfiguring educational and instructional frameworks (Martens et al., 2004; Teo et al., 2006). Given this backdrop, this research seeks to delve into the connection between English as a second language (ESL) students’ perception of ChatGPT as a CALL instrument and their motivation, basing its findings on HMSAM.

2.2 Theoretical frameworks

The hedonic system acceptance model (HSAM) formulated by van der Heijden (2004) offers a crucial framework for grasping how users adopt and employ technology. HSAM can be seen as a progression from the technology acceptance model (TAM) first presented by Davis and colleagues in 1989. Unlike TAM, HSAM incorporates the hedonic aspects of technology usage, emphasizing the pleasurable and enjoyable dimensions that shape individuals’ acceptance of technology. HSAM suggests that users’ adoption of technology is shaped not only by pragmatic influences like perceived usefulness (PU) and perceived ease of use (PEOU) but also by hedonic factors like enjoyment, entertainment, and emotional satisfaction derived from engaging with technology. According to this model, individuals tend to adopt and continue using technology if they perceive it to deliver both pleasurable experiences and functional benefits (van der Heijden, 2004). In other words, PU and enjoyment are critical intermediaries that drive users’ intention to engage with technology.

In addition to the rewarding and satisfying aspects that characterize hedonic systems, cognitive absorption plays a pivotal role in shaping user experiences with technology. Cognitive absorption refers to a condition of intense concentration and engagement, characterized by a person’s complete involvement in a task to the point of forgetting oneself and experiencing altered time perception (Agarwal & Karahanna, 2000). This concept is underpinned by the principles of flow theory, which explains the level of engagement and concentration experienced in activities driven by intrinsic motivation. Flow theory posits that individuals with high internal motivation are more inclined to immerse themselves fully and become deeply engaged in their current tasks (Guo & Poole, 2009; Lee, 2010). This concept has garnered significant interest within scholarly studies, playing a key role in refining established frameworks like the Unified Theory of Acceptance and Use of Technology (UTAUT) introduced by Venkatesh et al. in 2003. Building on these insights, Lowry et al. (2013) refined HSAM into the HMSAM, extending the scope of the TAM to emphasize intrinsic motivations that encourage users to engage deeply with technology. While TAM primarily addresses pragmatic factors like PU and PEOU, HMSAM encompasses the hedonic aspects of technology use, including sensory and cognitive curiosity, a sense of control, and the pursuit of fun and enjoyment (Sidek et al., 2020). This broader focus within HMSAM acknowledges the complex interplay between users’ intrinsic motivations and their technology engagement experiences, offering a more nuanced understanding of technology adoption and sustained use in hedonic contexts.

HMSAM has found extensive applicability across diverse domains of hedonic motivation systems (Lowry et al., 2013; van der Heijden, 2004). It has been utilized as a theoretical framework to investigate users’ acceptance and adoption of technology, including general gamification, virtual reality, and social media (Francke & Alexander, 2018). For instance, Francke and Alexander (2018) employed the HMSAM to study augmented reality (AR) uptake in practical learning and collaborative learning environments. Similarly, Palos-Sanchez et al. (2022) utilized the HMSAM to assess a game-centric student response tool called Kahoot!. Additionally, Florensia and Suryadibrata (2023) employed the HMSAM as a framework to design and evaluate a mobile visual novel game focused on mathematics, particularly the topic of definite integrals. Their research aimed to amplify students’ enthusiasm and involvement in mathematical education by integrating gamification elements and immersive storytelling into the educational game (Florensia & Suryadibrata, 2023). However, students’ hedonic motivation and adoption of AI platforms have received limited attention in the existing studies. Thus, this research intends to bridge this knowledge void, focusing on students’ acceptance of ChatGPT as a CALL instrument in tertiary education establishments.

2.3 Research model and hypotheses

The concepts of perceived ease of use (PEOU) and perceived usefulness (PU) are central to research models examining technology adoption (Davis, 1989). PU pertains to individuals’ perception that new technology can augment their productivity, that is, the probability perceived by potential users that using the technology will improve their performance (Lee et al., 2003; Chatterjee et al., 2021). Conversely, PEOU denotes the degree to which people adopt new technology based on their belief that its adoption can be achieved effortlessly without requiring a substantial investment of time to acquire proficiency (Rokaya et al., 2022). Previous research has substantiated the favourable impact of PEOU on PU when employing chatbots as AI conversational tools within the domain of language learning (Belda-Medina & Calvo-Ferrer, 2022). ChatGPT’s ease of use, characterized by its intuitive user interface, clear instructions, and prompt feedback, expedites the process of learning to operate the system, thus supporting the English language acquisition of non-English major ESL students. Furthermore, it facilitates the acquisition of inspiration and enhances the efficiency and quality of knowledge for students majoring in English language studies. Drawing on these theoretical foundations, the authors propose the following hypothesis:

  • H1. Perceived ease of use of ChatGPT is expected to significantly influence users’ perceived usefulness.

Agarwal and Prasad (1998) first presented the idea of Personal Innovativeness in the Domain of Information Technology (PIIT), describing it as a unique construct reflecting a person’s inclination to adopt and try emerging information technologies. Their study underscored the potential moderating influence of PIIT in shaping individuals’ viewpoints on and reactions to new information technologies (Agarwal & Prasad, 1998). Subsequently, PIIT has been widely employed in studies on technology acceptance, including mobile learning (Joo et al., 2014) and TikTok-based learning (Deng & Yu, 2023). The technology acceptance literature suggests that attitudes about personal innovation are influenced by perceived ease of use, which indicates the cognitive load associated with adopting the innovation (Rogers, 1995; Agarwal & Prasad, 1998). In their research on innovativeness as a moderator, Agarwal and Prasad hypothesized that potential adopters are more likely to change their attitude towards innovation within the domain of information technology once they perceive the application as easy to use. When considering the use of ChatGPT, the significant attention given to its PEOU may alter individuals’ inclination to adopt and explore AI-based learning tools. In research on AI tools for scientific writing, Kim and Kim (2022) found that direct engagement enriched teachers’ understanding of the tools’ capabilities, leading to a shift in their view of AI-based educational tools. However, in the context of the HMSAM, PIIT has not been identified as a moderator between perceptions and usage intentions. Accordingly, we argue that by incorporating an important individual difference variable, PIIT, we can better comprehend the formation of perceptions and their ensuing impact on intentions to use:

  • H2. Perceived ease of use of ChatGPT is expected to significantly influence users’ innovations in information technology.

Control (CO) refers to the user’s belief in possessing dominance and sway while engaging with a system (Agarwal & Karahanna, 2000; Oluwajana et al., 2019). Maintaining a sense of mastery in technology engagement is pivotal for users. Previous research has linked the perception that a system is useless to reduced confidence and a diminished sense of control over the interaction (Venkatesh, 2000). Conversely, high levels of PU may enhance individuals’ sense of control over their technological interactions. Accordingly, the PU of ChatGPT may empower students with greater control over the program. The connection between PIIT and intrinsic motivators, such as control, has received scant exploration. Nevertheless, recent studies have underlined that teachers who exhibit scepticism towards new technology may experience limited control over the system when utilizing AI-based educational tools in classroom settings (Kim & Kim, 2022). Therefore, it is plausible that PIIT may influence users’ perceived control during human-computer interactions with ChatGPT. This study seeks to investigate the impact of PIIT further, evaluating its influence on intrinsic motivational aspects. Consequently, the authors put forth the following hypotheses:

  • H3. Personal innovations in information technology are expected to significantly influence users’ control over ChatGPT.

  • H4. Perceived usefulness of ChatGPT is expected to significantly influence users’ control over ChatGPT.

Boredom (BO) can be characterized as an unpleasant psychological state wherein individuals experience reduced levels of physical and cognitive activation, leading to a desire to disengage from ongoing activities (Li, 2021). This affective state is a negative deactivating emotion (Dewaele & Li, 2021). In line with the control-value theory, boredom often emerges from a perceived shortfall in task value and control (Pekrun, 2006). Specifically, boredom tends to have detrimental effects on student engagement and performance within academic contexts, which may lead to decreased motivation, reduced learning outcomes, and overall academic underperformance (Sharp et al., 2017). Heidegger has associated boredom with individuals’ inclination towards seeking technological change (Thiele, 1997). Prior studies have identified an inverse relationship between individuals with high levels of PIIT and their propensity for boredom (López-Bonilla & López-Bonilla, 2012). Research indicates that individuals with higher PIIT levels are inclined to sidestep feelings of boredom (Pizam et al., 2004; López-Bonilla & López-Bonilla, 2012). Recent studies have integrated boredom as an essential component of the HMSAM to explore the impact of negative emotions on acceptance of the hedonic motivation system (Deng & Yu, 2023). Moreover, investigations into the influence of chatbots on anticipated communication quality have demonstrated that AI chatbots equipped with social characteristics can mitigate negative emotions, including frustration and dissatisfaction (Zhou et al., 2023). Consequently, ChatGPT may potentially alleviate feelings of boredom during language learning. Nevertheless, examining chatbots as CALL tools is a relatively nascent area of academic inquiry, with limited research thus far exploring its association with boredom. This research sought to bridge this knowledge void by examining the linkage between ChatGPT’s PU and BO. 
Building upon these theoretical underpinnings, the authors put forth the following hypotheses:

  • H5. Personal innovations in information technology are expected to significantly influence boredom emotions.

  • H6. Perceived usefulness of ChatGPT is expected to significantly influence boredom emotions.

Curiosity (CU) occupies a fundamental position within the conceptual framework of intrinsic motivation, as underscored by previous research (Ryan & Deci, 2000; Berlyne, 1960). It functions as an inherent motivator that fosters the processes of learning and exploration (Silvia, 2012). In line with this conceptualization, individuals who possess an inherent inclination towards innovation in computer-related activities are more prone to experiencing episodes of curiosity (Agarwal & Karahanna, 2000). Additionally, the level of PU has a significant impact on curiosity. Studies on Video on Demand have shown that a game’s low PU can lead to player frustration, consequently reducing their curiosity about the game (Huda et al., 2020). Studies focusing on gamified support tools have affirmed a favourable association between PU and curiosity (Saphira & Rusli, 2019). Furthermore, research on game-oriented student response systems has also confirmed that a high level of PU stimulates users’ curiosity, particularly in education (Palos-Sanchez et al., 2022). In light of these theoretical footings, the authors present the following hypotheses:

  • H7. Personal innovations in information technology are expected to significantly influence users’ curiosity.

  • H8. Perceived usefulness of ChatGPT is expected to significantly influence users’ curiosity.

Joy (JOY) encompasses the perception of enjoyment and heightened pleasure, serving as a form of intrinsic motivation that has been integrated into the Technology Acceptance Model (TAM) (Venkatesh, 2000; Agarwal & Karahanna, 2000). It denotes the extent to which individuals perceive the tasks or services provided by a system as enjoyable in and of themselves, irrespective of any anticipated performance outcomes (van der Heijden, 2004). Prior studies in a 3D multi-user virtual learning setting have empirically affirmed the positive impact of PU on joy (Rosmansyah et al., 2019). Similarly, studies investigating learning systems have also demonstrated a positive association between PU and joy (Khalid, 2014). ChatGPT offers students a novel avenue for learning, complementing their face-to-face instruction. By engaging with ChatGPT, students can experience self-paced and interactive learning, thereby fostering a sense of playfulness and enjoyment. However, limited research explores the relationship between PIIT and joy when utilizing technology applications. Therefore, we contend that individuals’ inclination toward innovation may amplify the positive emotions of enjoyment when using ChatGPT for English language learning. Building on this theoretical groundwork, the authors posit the following hypotheses:

  • H9. Personal innovations in information technology are expected to significantly influence users’ joy when using ChatGPT.

  • H10. Perceived usefulness of ChatGPT is expected to significantly influence users’ joy when using ChatGPT.

Focused immersion (FI) denotes an individual’s complete engagement in a specific interaction or task while disregarding other attentional demands (Agarwal & Karahanna, 2000). FI can be exemplified by a learner’s undivided attention to a ChatGPT session to the point where external stimuli are no longer perceived, reflecting a high level of engagement with the educational content. It serves as a metric for assessing the depth of a user’s involvement in utilizing a system. Within the framework of the HMSAM, FI is influenced by three key factors: CO, CU, and JOY (Oluwajana et al., 2019; Rehy & Tambotoh, 2022; Lowry et al., 2013). These elements are pivotal in determining the intensity of FI that individuals feel during technological interactions. Research conducted in a blended learning environment has also demonstrated that control, curiosity, and joy significantly impact students’ level of immersion (Sidek et al., 2020). Furthermore, an investigation utilizing an extended hedonic motivation adoption model revealed that FI was significantly affected by boredom (Deng & Yu, 2023). Moreover, in studies focusing on gamified learning environments, PU has been identified as a significant factor in enhancing the level of FI (Kampling, 2018). Prior studies have identified a favourable link between personal innovation and cognitive immersion, including FI (Agarwal & Karahanna, 2000). The latest research in virtual reality learning has shown a positive connection between students’ innovation and immersion in game-based educational environments (Lau, 2023). Drawing upon these findings, the authors propose the following research hypotheses:

  • H11. Control over ChatGPT is expected to significantly influence users’ focused immersion.

  • H12. Boredom emotions are expected to significantly influence users’ focused immersion.

  • H13. Perceived usefulness of ChatGPT is expected to significantly influence users’ focused immersion.

  • H14. Curiosity in using ChatGPT is expected to significantly influence users’ focused immersion.

  • H15. Joy in using ChatGPT is expected to significantly influence users’ focused immersion.

Behavioural intention (BIU) indicates the probability an individual assigns to participating in a particular action (Fishbein & Ajzen, 1975). It reflects a student’s assessment of their likelihood to adopt ChatGPT for language study, indicating their commitment to incorporating this technology into their educational activities. The extended Technology Acceptance Model (TAM2) posits that BIU acts as an antecedent to real system usage (Venkatesh & Davis, 2000). Previous studies have suggested that a sense of control reduces users’ cognitive barriers and enhances their enjoyment of the interaction experience, increasing their willingness to continuously engage with the system (Watson et al., 2013). In contrast, boredom is associated with diminished emotional arousal and is linked to reduced BIU (Van Tilburg & Igou, 2017). Within the framework of the HMSAM, curiosity stands as a central element, reinforcing the relationship between systems and human engagement because it encourages users to seek further engagement in order to relive gratifying experiences (Kashdan & Silvia, 2009).

Furthermore, extensive empirical evidence has consistently demonstrated the significant predictive power of PU on students’ BIU. Previous research has underscored a noteworthy predictive relationship between PU and users’ inclination to embrace electronic tools (Ansong-Gyimah, 2020). Moreover, Humida et al. (2022) presented empirical evidence highlighting the strong influence of PU on BIU within the realm of e-learning adoption. Additionally, Bruner and Kumar (2005) extended the TAM by including a measure of joy alongside the original TAM components. They discovered that fun directly influenced individuals’ intentions to use a technological tool. Also, FI is frequently associated with satisfaction and loyalty within the technology experience context (Hudson et al., 2019). Studies conducted on desktop-based virtual reality (VR) have identified that students’ level of immersion positively predicts their intent to utilize these systems (Huang et al., 2010; Xie et al., 2022). Hence, the researchers posit the following research hypotheses:

  • H16. Control over ChatGPT is expected to significantly influence users’ behavioural intention to use ChatGPT.

  • H17. Boredom emotions in using ChatGPT are expected to significantly influence users’ behavioural intention to use ChatGPT.

  • H18. Curiosity in using ChatGPT is expected to significantly influence users’ behavioural intention to use ChatGPT.

  • H19. Perceived usefulness of ChatGPT is expected to significantly influence users’ behavioural intention to use ChatGPT.

  • H20. Joy in using ChatGPT is expected to significantly influence users’ behavioural intention to use ChatGPT.

  • H21. Focused immersion in using ChatGPT is expected to significantly influence users’ behavioural intention to use ChatGPT.

Drawing on the foundations of previous studies and the development of hypotheses, Fig. 1 depicts the finalized research model.

Fig. 1 The finalized research model
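
As a reading aid, the 21 hypothesized paths listed in Section 2.3 can be written down as a small directed graph. The Python sketch below is illustrative only and is not part of the study’s analysis; the names `HYPOTHESES`, `predictors_by_outcome`, and `exogenous_constructs` are hypothetical, and the construct abbreviations follow those defined in the text (PIIT, PEOU, PU, CO, BO, CU, JOY, FI, BIU).

```python
from collections import defaultdict

# Hypothesized paths H1-H21 as (predictor, outcome) pairs,
# transcribed from the hypothesis statements in Section 2.3.
HYPOTHESES = {
    "H1": ("PEOU", "PU"),
    "H2": ("PEOU", "PIIT"),
    "H3": ("PIIT", "CO"),
    "H4": ("PU", "CO"),
    "H5": ("PIIT", "BO"),
    "H6": ("PU", "BO"),
    "H7": ("PIIT", "CU"),
    "H8": ("PU", "CU"),
    "H9": ("PIIT", "JOY"),
    "H10": ("PU", "JOY"),
    "H11": ("CO", "FI"),
    "H12": ("BO", "FI"),
    "H13": ("PU", "FI"),
    "H14": ("CU", "FI"),
    "H15": ("JOY", "FI"),
    "H16": ("CO", "BIU"),
    "H17": ("BO", "BIU"),
    "H18": ("CU", "BIU"),
    "H19": ("PU", "BIU"),
    "H20": ("JOY", "BIU"),
    "H21": ("FI", "BIU"),
}


def predictors_by_outcome(hypotheses):
    """Group (label, predictor) pairs under each outcome construct."""
    grouped = defaultdict(list)
    for label, (predictor, outcome) in hypotheses.items():
        grouped[outcome].append((label, predictor))
    return dict(grouped)


def exogenous_constructs(hypotheses):
    """Return constructs that appear only as predictors, never as outcomes."""
    predictors = {p for p, _ in hypotheses.values()}
    outcomes = {o for _, o in hypotheses.values()}
    return predictors - outcomes


if __name__ == "__main__":
    for outcome, paths in predictors_by_outcome(HYPOTHESES).items():
        listed = ", ".join(f"{pred} ({label})" for label, pred in paths)
        print(f"{outcome} <- {listed}")
    print("Exogenous:", sorted(exogenous_constructs(HYPOTHESES)))
```

Grouping the paths by outcome makes it easy to see which constructs are hypothesized to act on FI and BIU, and confirms that PEOU is the only construct in the model with no hypothesized antecedent.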

2.4 Rationale

The authors acknowledge that the multiple dimensions of the constructs under consideration have the potential to interweave, suggesting that the actual causality might be more intricate than straightforward models suggest. Furthermore, it is recognized that contextual factors may also influence the relationship between two constructs, adding another layer of complexity to the analysis. While introducing contextual factors into the analysis could enrich the understanding by presenting a broader perspective, the authors believe that this inclusion might compromise the integrity of the current model. The complexity added by contextual factors might obscure the clear relationships established by the current quantitative framework. Therefore, it is posited that contextual factors are better suited for exploration through qualitative analysis, which can more effectively identify and examine their nuanced impacts without compromising the model’s structural integrity. Consequently, acknowledging the impact of external factors like social influence (Al-Emran & Salloum, 2017; Salloum et al., 2019), accessibility (Sánchez & Hueros, 2010; Attis, 2014), and infrastructure (Venkatesh et al., 2003; Teo, 2010) on user experience, this research primarily delves into internal factors such as control, boredom, joy, and curiosity in engaging with ChatGPT for language learning. Consistent with prior research (Francke & Alexander, 2018; Palos-Sanchez et al., 2022; Florensia & Suryadibrata, 2023), the interplay among these mediators is not explored, focusing instead on their individual impact on language learning with ChatGPT.

3 Methodology

This study has two central goals: to thoroughly assess ESL students’ perceptions of ChatGPT as a CALL tool and to investigate the connections between these perceptions and their motivation to learn English. To achieve these objectives, a quantitative research design and data analysis method was employed, enabling the collection of numerical data that were subjected to rigorous statistical analysis (Creswell, 2014). This approach ensures a systematic investigation of the correlation between students’ perceptions of ChatGPT and their drive to study English. To ensure the study’s feasibility, a comprehensive factor analysis was conducted alongside reliability and validity assessments, drawing on proven analysis methods from prior research for added robustness (Palos-Sanchez et al., 2022; Florensia & Suryadibrata, 2023). Rigorous tests during the experiment further supported the authenticity of the data collection and analysis processes. This research framework facilitates in-depth analysis of the elements affecting ESL students’ motivations, shedding light on the complex dynamics and potential implications for effectively integrating CALL tools in language learning settings.

3.1 Participants and procedures

The study engaged ESL learners from tertiary educational settings who had previously used ChatGPT for English language learning. Participants were all Chinese international students recruited from UK universities through online convenience sampling. All participants provided their consent by signing informed consent forms, as required by ethical guidelines and research protocols. Participants were briefed on confidentiality protocols and were guided on survey completion. Data were collected from May 31, 2023, to June 15, 2023, through the online survey platform Questionnaire Star. To broaden outreach, recruitment announcements were disseminated across prominent social media platforms such as BILIBILI, WeChat, and Xiaohongshu (a lifestyle app in China similar to Instagram). To incentivize participation, each respondent was offered a £5 Amazon voucher upon survey submission. The authors acknowledge that offering rewards to participants could potentially influence their external motivation in providing data for the study (Sharp et al., 2006; Thibault Landry et al., 2020). Of the initial 202 willing participants who gave informed consent, 13 questionnaires were deemed invalid, resulting in 189 usable responses, a valid response rate of 93.6%. A detailed demographic breakdown of participants is available in Table 1.

Table 1 The profile of participants

The selection of participants was not arbitrary but was shaped by the circumstances surrounding the study, as the primary objective was to gain an in-depth understanding of a phenomenon among English learners. Convenience sampling was therefore chosen due to pressing time constraints and limited access to a wider participant pool (Etikan et al., 2016).

3.2 Research instrument

The back-translation method was applied in the questionnaire development to ensure accuracy in both English and Chinese versions. Nevertheless, the authors are cognizant of the possibility that subtle language differences and interpretational nuances might persist between these versions. Diligent efforts were devoted to minimizing these differences, aiming for a precise and consistent representation of the constructs in each language. Furthermore, the survey underwent rigorous revisions to ensure its utmost alignment with the research objectives and theoretical underpinnings. Before data collection, feedback was sought from 20 ChatGPT users and two professors who completed the survey. The 20 ChatGPT users, all ESL learners, offered essential insights into the questionnaire’s practicality, drawing from their extensive experience with ChatGPT in language education. Concurrently, two professors specializing in quantitative research methodologies meticulously evaluated the survey’s structural coherence and validity. Considering their feedback, the researchers fine-tuned the questionnaire, emphasizing clarity in phrasing, layout, numbering, and order of items.

The final survey was split into two segments. The first segment aimed to gather demographic information, collecting data such as name, gender, academic status, and the frequency of ChatGPT use. The anonymization of names was a key concern throughout the study, ensuring privacy in data handling and reporting. The following segment, spanning 36 questions, was engineered to gauge the nine principal constructs pivotal to the research model. These constructs, each represented by four items, included Perceived Ease of Use (PEOU), Perceived Usefulness (PU), Behavioral Intention to Use (BIU), Curiosity (CU), Control (CO), Joy, Personal Innovativeness in Information Technology (PIIT), Boredom (BO), and Focused immersion (FI). The questionnaire items were formulated to gauge the perceptions and experiences of users while they learned English using ChatGPT. Items were scored on a six-point Likert scale, influenced by techniques used in prior studies (Agarwal & Karahanna, 2000; Lowry et al., 2013; Venkatesh, 2000; Deng & Yu, 2023).

3.3 Data analysis

After the reliability and validity of the research instruments were tested, the data were analysed using partial least squares structural equation modelling (PLS-SEM) and mediation effect tests. These techniques are well regarded and widely applied by scholars who take a quantitative approach to exploring relationships between multiple variables, especially in educational studies (Fornell & Larcker, 1981; Sarstedt et al., 2014; Leguina, 2015; Heo et al., 2015; Hair et al., 2019). Their applicability and effectiveness have been further validated within the context of language learning research (Morchid, 2019; Hair & Alamer, 2022; Russo & Stol, 2021; Hsu & Lin, 2022).

This study delved into nine constructs and intricate mediating relationships to understand the link between intrinsic motivation for English learning and users’ inclination to use ChatGPT. The research team employed PLS-SEM to validate the proposed research model. The analysis incorporated the use of SPSS and Amos software, following the two-step approach recommended by Hair et al. (2011). The initial stage involved assessing the measurement model’s metrics for reliability and validity. The following step pivoted to the structural model evaluation, focusing on elements like path coefficient significance, predictive value, indirect effects, and moderation.

4 Results

The results were dissected into measurement model validation, structural equation analysis, and mediating effects. Validation confirmed the model’s reliability, while structural analysis revealed key relationships between variables, significantly impacting user behaviour towards technology. Mediating effects analysis brought to light the intricate roles of constructs in these relationships, enhancing the understanding of user engagement and adoption in educational tech contexts. Through this approach, direct and indirect variable connections were clarified, with mediators shedding light on the complex interplay at work.

4.1 Measurement model assessment 

The model underwent a comprehensive assessment to evaluate reliability and validity (Hair et al., 2019). Initially, the researchers employed a conventional PLS-SEM algorithm to ascertain the factor loadings. These values are suggested to meet or surpass the threshold of 0.70, indicating that the construct explains roughly 50% or more of the indicator’s variance, as recommended by Sarstedt et al. (2014) and Hair et al. (2019). Then, internal consistency reliability was gauged using Cronbach’s alpha and composite reliability (CR). Table 2’s findings show that both metrics surpassed the advised threshold of 0.70, indicating consistent item reliability (Chin, 1998). Following the recommendations of Hair et al. (2019), the measurement model’s convergent and discriminant validities were verified using the average variance extracted (AVE). As Table 2 shows, all constructs’ AVE values surpassed the set minimum of 0.50, signifying robust convergent validity (Fornell & Larcker, 1981). As a result, the measurement model demonstrates reliable and valid properties of acceptable quality.
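The three measurement-model statistics named above can be illustrated with a minimal sketch. It uses the standard textbook formulas for Cronbach’s alpha, CR, and AVE; the loadings are hypothetical values chosen for illustration and are not taken from Table 2, and the function names are assumptions rather than part of the authors’ workflow.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

def composite_reliability(loadings):
    """CR from standardized loadings: (sum l)^2 / ((sum l)^2 + sum(1 - l^2))."""
    lam = np.asarray(loadings, dtype=float)
    numerator = lam.sum() ** 2
    return float(numerator / (numerator + (1 - lam ** 2).sum()))

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    return float((lam ** 2).mean())

# Hypothetical loadings for one four-item construct (NOT values from the study).
example_loadings = [0.80, 0.75, 0.85, 0.70]
ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(f"AVE = {ave:.3f}, CR = {cr:.3f}")
```

With these hypothetical loadings, both statistics clear the thresholds cited in the text (0.50 for AVE, 0.70 for CR).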

Table 2 Reliability and convergent validity of the measurement model

4.2 Structural equation model analysis

Table 3 presents the outcomes for the hypothesized paths, indicating that 14 out of 21 paths were statistically significant. Notably, the findings endorse H1 and H2, illustrating that users’ perception of ChatGPT’s ease of use significantly boosts its perceived usefulness (β = 0.767, p < 0.001) and fosters personal innovativeness in the domain of information technology (β = 0.322, p < 0.001). This strong relationship suggests that user-friendly interfaces are crucial in educational technology, as they directly enhance the tool’s perceived value in language learning. Furthermore, the support for H4, H6, H8, and H10 underscores the critical role of perceived usefulness in not only elevating control (β = 0.552, p < 0.001) and curiosity (β = 0.694, p < 0.001) but also in diminishing boredom (β = -0.278, p < 0.001) and amplifying joy (β = 0.791, p < 0.001). Such findings highlight the importance of the tool’s utility in creating an engaging and positive learning experience. Additionally, the positive impact of personal innovativeness on joy (β = 0.162, p < 0.01) aligns with H9, indicating that users’ willingness to embrace new technologies can enhance their enjoyment of using ChatGPT for language learning. In contrast, the results did not support H3, H5, and H7, as personal innovativeness did not significantly affect control (β = 0.057, p > 0.05), boredom (β = 0.045, p > 0.05), or curiosity (β = 0.098, p > 0.05). This outcome suggests that the propensity to adopt new technologies might not significantly sway users’ control over ChatGPT, their boredom levels, or their curiosity within the language learning realm. Personal innovativeness therefore does not appear to directly dictate user engagement with ChatGPT in terms of managing the tool, feeling engaged, or exploring it curiously.

Table 3 The results of hypothesis testing

Moreover, focused immersion was significantly predicted by control (β = 0.289, p < 0.01), boredom (β = 0.183, p < 0.05), perceived usefulness (β = -0.496, p < 0.05), and joy (β = 0.666, p < 0.001), supporting H11, H12, H13, and H15. These findings further elucidate how emotional and cognitive factors contribute to learners’ deep engagement with ChatGPT for language learning. The lack of a significant effect of curiosity on focused immersion (β = 0.192, p > 0.05) leads to the rejection of H14, indicating that curiosity alone may not suffice to deepen immersion in the learning process. Furthermore, the behavioural intention to use ChatGPT is significantly shaped by boredom (β = -0.146, p < 0.05), perceived usefulness (β = 0.492, p < 0.01), and the level of focused immersion (β = 0.231, p < 0.01), supporting H17, H19, and H21. This highlights the interplay of various factors in determining learners’ intentions to adopt ChatGPT for language learning. Conversely, control (β = 0.027, p > 0.05), curiosity (β = 0.017, p > 0.05), and joy (β = 0.12, p > 0.05) do not significantly impact the behavioural intention to use ChatGPT, refuting H16, H18, and H20, and suggesting that these factors may not be primary drivers in learners’ decisions to integrate ChatGPT into their language learning practices. This insight hints at the potential influence of other factors, possibly related to ChatGPT’s unique features or the learning environment, in shaping adoption decisions. Figure 2 depicts the proposed research model with the path coefficients between the variables.

Fig. 2 The research model with path coefficients. ***p < 0.001, **p < 0.01, *p < 0.05

Table 4 displays the fit indices for the evaluated structural equation model. The PCMIN/DF value is 2.048, indicating a reasonable fit as it is below the preferred threshold of 3 (Tabachnick & Fidell, 2007). The Incremental Fit Index (IFI Delta2) is 0.902, the Tucker-Lewis Index (TLI rho2) is 0.910, and the Comparative Fit Index (CFI) is 0.914 - all slightly above the commonly accepted cut-off point of 0.90, thus demonstrating a good fit of the model to the data (Bollen, 1989; Tucker, 1973; Bentler, 1990). Moreover, the Root Mean Square Error of Approximation (RMSEA) is 0.068, falling below the cut-off point of 0.08 (MacCallum et al., 1996; Steiger, 1990, 2007), further indicating a satisfactory alignment between the model and the observed data. In summary, these indices collectively point to a well-fitting model, suggesting that the proposed relationships among variables in the model reasonably represent the empirical data.
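The comparison of each fit index against its cut-off can be made explicit with a short sketch. The values and cut-offs are those stated in the paragraph above; the dictionary layout and the helper name `meets_cutoff` are assumptions made for illustration only.

```python
# Fit indices as reported in the text above.
reported = {"PCMIN/DF": 2.048, "IFI": 0.902, "TLI": 0.910, "CFI": 0.914, "RMSEA": 0.068}

# (direction, cut-off) as cited in the text: "max" means the value should fall
# below the cut-off, "min" means the value should exceed it.
cutoffs = {
    "PCMIN/DF": ("max", 3.0),
    "IFI": ("min", 0.90),
    "TLI": ("min", 0.90),
    "CFI": ("min", 0.90),
    "RMSEA": ("max", 0.08),
}

def meets_cutoff(value, rule):
    """Return True when the value is on the acceptable side of the cut-off."""
    direction, threshold = rule
    return value < threshold if direction == "max" else value > threshold

fit_check = {name: meets_cutoff(value, cutoffs[name]) for name, value in reported.items()}
for name, ok in fit_check.items():
    print(f"{name}: {reported[name]} -> {'meets' if ok else 'misses'} cut-off")
```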

Table 4 Model fit test

4.3 Mediating analysis

In this study, the researchers employed the PROCESS macro for SPSS (Hayes, 2022) to conduct mediation analysis. The specific models used were Model 4, which covers simple (single or parallel) mediation, and Model 6, which covers serial mediation. Among the mediator relationships scrutinized in the study framework, 18 mediation routes were found to be statistically significant (p < 0.05). This underscores the significant role these mediators play in bridging the relationship between the independent and dependent variables. Table 5 delineates the significant mediating effects within the research model, revealing various paths of influence among the constructs.
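The logic of a bootstrapped test of an indirect effect can be sketched as follows. This is not the PROCESS macro itself: it is a minimal NumPy illustration of a single-mediator model with a percentile bootstrap, run on synthetic data. The variable names, the simulated coefficients, and the number of resamples are all assumptions; only the sample size of 189 echoes the study.

```python
import numpy as np

def ols_slopes(y, X):
    """OLS slope coefficients of y on the columns of X (intercept added, then dropped)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

def indirect_effect(x, m, y):
    """a*b estimate: a is the x -> m slope, b is the m -> y slope controlling for x."""
    a = ols_slopes(m, x)[0]
    b = ols_slopes(y, np.column_stack([x, m]))[1]
    return a * b

def bootstrap_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample respondents with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    lower, upper = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper

# Synthetic data (NOT the study's data): x -> m -> y with a direct x -> y path.
rng = np.random.default_rng(42)
n = 189
x = rng.normal(size=n)
m = 0.6 * x + rng.normal(size=n)
y = 0.3 * x + 0.5 * m + rng.normal(size=n)

point = indirect_effect(x, m, y)
lower, upper = bootstrap_ci(x, m, y)
significant = not (lower <= 0.0 <= upper)  # interval excluding zero
print(f"indirect = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}], significant = {significant}")
```

An indirect effect is treated as significant when the interval excludes zero, which is the criterion the following tables rely on.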

Table 5 Significant mediating effects of the research model

In Tables 6, 7, 8, and 9, the term “Effect Ratio 1” denotes the ratio of Path(N) to the total effect observed, while “Effect Ratio 2” denotes the ratio of Path(N) to the total indirect effect. Furthermore, the “P./C.” column designates whether the pathway is a complete mediator (‘C’), where only the indirect effect is significant, or a partial mediator (‘P’), where both direct and indirect effects coexist.

Table 6 provides a comprehensive breakdown of the total, direct, and indirect effects of Perceived Usefulness (PU) on Behavioral Intention to Use (BIU). The total effect of PU on BIU is reported as 0.709 (p < 0.001), demonstrating a strong and statistically significant relationship. Moreover, when evaluating the direct influence of PU on BIU without factoring in the mediation effects, it is discerned to be statistically significant with a coefficient of 0.563 (p < 0.001). The table further highlights the indirect effects of PU on BIU through multiple pathways. The total indirect effect is estimated as 0.147, which is statistically significant, as evidenced by the bootstrapped confidence interval not encompassing zero. This finding suggests that the mediating variables significantly influence the relationship between PU and BIU. Within the table, three specific indirect paths are delineated: “PU→CO→BIU”, “PU→FI→BIU”, and “PU→CO→FI→BIU”. Each pathway represents a distinct route through which PU can indirectly impact BIU. Notably, the indirect effects via the “PU→CO→BIU” and “PU→FI→BIU” paths are not deemed significant, as their bootstrapped confidence intervals cross zero. This implies that these pathways do not play a significant role in mediating the connection between PU and BIU. Conversely, the “PU→CO→FI→BIU” pathway demonstrates a significant indirect effect (0.053), with its bootstrapped confidence interval not crossing zero. The effect ratios associated with this pathway indicate that it accounts for 7% of the total effect (Effect Ratio 1) and 36% of the total indirect effect (Effect Ratio 2). In the “P./C.” column, this pathway is categorized as partial mediation, where both direct and indirect effects coexist between PU and BIU.
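The two effect ratios reported for the “PU→CO→FI→BIU” pathway follow directly from the figures in the paragraph above, as this small sketch shows. The function name `effect_ratios` is an illustrative assumption; the three input values are those stated in the text.

```python
def effect_ratios(path_effect, total_effect, total_indirect_effect):
    """Return (Effect Ratio 1, Effect Ratio 2) for one indirect pathway."""
    ratio_1 = path_effect / total_effect            # share of the total effect
    ratio_2 = path_effect / total_indirect_effect   # share of the total indirect effect
    return ratio_1, ratio_2

# Values from the text: path 0.053, total effect 0.709, total indirect effect 0.147.
ratio_1, ratio_2 = effect_ratios(0.053, 0.709, 0.147)
print(f"Effect Ratio 1 = {ratio_1:.0%}, Effect Ratio 2 = {ratio_2:.0%}")
```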

Table 6 Significant mediating effects of the research model: PU to BIU

In the data presented within Table 7, the overall influence of Perceived Usefulness (PU) on the Behavioural Intention to Use (BIU) is observed to be 0.707, with a significance level below 0.001. In addition, the direct contribution of PU to BIU, devoid of any mediating factors, is also statistically significant, standing at 0.572 (p < 0.001). These results underscore the critical role of PU in forecasting BIU, even in the absence of intermediary variables. Furthermore, an examination of the indirect effects sheds light on the intricate interactions between PU and BIU. The term ‘indirect effects’ encompasses the impact exerted by PU on BIU via one or several intermediary variables. In this particular scenario, the overall indirect influence of PU on BIU is determined to be 0.136, with its significance highlighted by the fact that the bootstrapped confidence intervals, both lower (BootLLCI) and upper (BootULCI), do not contain zero.

Table 7 Significant mediating effects of the research model: PU to BIU

The table presents three specific indirect effects: “PU→BO→BIU,” “PU→FI→BIU,” and “PU→BO→FI→BIU.” Each effect is accompanied by an effect size, standard error, and bootstrapped confidence intervals. The pathways “PU→BO→BIU” and “PU→FI→BIU” exhibit significant positive mediation (p < 0.05), indicating that Perceived Usefulness (PU) affects Behavioral Intention to Use (BIU) through the mediating variables Boredom (BO) and Focused Immersion (FI), respectively. Conversely, the “PU→BO→FI→BIU” pathway demonstrates a small, non-significant negative effect, as evidenced by the bootstrapped confidence intervals crossing zero. This indicates that the particular indirect pathway does not play a significant mediating role in the link between PU and BIU.

The effect ratios provide insights into the proportion of the total effect (Effect Ratio 1) or the total indirect effect (Effect Ratio 2) accounted for by each pathway. The “P./C.” column designates the pathway as a partial mediator (‘P’), where both direct and indirect effects coexist between PU and BIU. In this instance, the “PU→BO→BIU” and “PU→FI→BIU” pathways are classified as partial mediators, while the “PU→BO→FI→BIU” pathway does not mediate the relationship.

Referring to Table 8, the composite impact of Perceived Usefulness (PU) on Behavioral Intention to Use (BIU) is documented as 0.709, showcasing a significant and strong correlation (p < 0.001). Moreover, the direct contribution of PU towards BIU, excluding the influence of mediating factors, is significant, marked at 0.435 (p < 0.001). This underscores the potent link between PU and BIU, even without considering intervening variables. Furthermore, the investigation highlights notable indirect effects, delineating the role of PU in affecting BIU via intermediary variables. The overall indirect effect of PU on BIU is calculated at 0.274, a significant and noteworthy figure, as evidenced by bootstrapped confidence intervals (BootLLCI and BootULCI) that exclude zero, underscoring its statistical significance.

Table 8 Significant mediating effects of the research model: PU to BIU

The table outlines three specific indirect effects or pathways: “PU→JOY→BIU,” “PU→FI→BIU,” and “PU→JOY→FI→BIU.” Each pathway is accompanied by effect size, standard error, and bootstrapped confidence intervals. The “PU→JOY→BIU” and “PU→JOY→FI→BIU” pathways are significant positive mediators, indicating that the effect of Perceived Usefulness (PU) on Behavioral Intention to Use (BIU) is significantly mediated by Joy (JOY) and the combination of Joy (JOY) and Focused Immersion (FI), respectively. However, the second pathway, “PU→FI→BIU,” exhibits a small and non-significant indirect effect, as indicated by the bootstrapped confidence intervals crossing zero.

The effect ratios (Effect Ratio 1 and Effect Ratio 2) provide insights into the proportion of the total effect or the total indirect effect accounted for by each pathway. The ‘P’ in the “P./C.” column denotes partial mediation, indicating that both the direct effect of PU on BIU and the indirect effect through the mediator are significant. In this table, “PU→JOY→BIU” and “PU→JOY→FI→BIU” pathways indicate partial mediation, while “PU→FI→BIU” does not exhibit mediation.

Table 9 presents an exhaustive analysis of Perceived Usefulness (PU)’s total, direct, and indirect impacts on Focused Immersion (FI) within the studied framework. The overall influence of PU on FI is quantified at 0.381 (p < 0.001), showcasing a significant correlation. Notably, PU’s direct impact on FI registers a negative value of -0.164, which, interestingly, does not reach statistical significance (p = 0.205). This negative figure implies a potential inverse relationship between PU and FI in scenarios excluding mediators, though the lack of statistical significance suggests caution in interpretation. The primary influence of PU on FI emerges from the indirect effects, highlighted by a notable indirect effect magnitude of 0.545. The statistical robustness of this indirect pathway is affirmed by bootstrapped confidence intervals that do not include zero, illustrating the significant mediating role of intervening variables in the PU-FI dynamic.

Table 9 Significant mediating effects of the research model: PU to FI

Three specific indirect pathways are delineated: “PU→CO→FI,” “PU→BO→FI,” and “PU→JOY→FI.” Each pathway exhibits a significant indirect effect, as indicated by their bootstrapped confidence intervals not crossing zero. However, the “PU→BO→FI” pathway demonstrates a negative indirect effect (-0.065), suggesting that an increase in Perceived Usefulness (PU) is associated with a decrease in Focused Immersion (FI) through the mediating variable of Boredom (BO). In contrast, the “PU→CO→FI” and “PU→JOY→FI” pathways demonstrate positive indirect effects, indicating that an increase in Perceived Usefulness (PU) leads to an increase in Focused Immersion (FI) through the mediating variables of Control (CO) and Joy (JOY), respectively. The effect ratios suggest that the three pathways account for a larger proportion of the total effect (Effect Ratio 1) and total indirect effect (Effect Ratio 2), possibly due to the observed negative direct effect. In the “P./C.” column, complete mediation is indicated for all three pathways, implying that the effect of PU on FI predominantly operates through the specified mediating variables.
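The remark that the pathways account for a larger proportion of the total effect can be illustrated with the figures reported for Table 9. The sketch below only re-computes the decomposition from the values stated in the text; how the table itself derives the per-path ratios is not specified here, so the variable names and the ratio shown are assumptions for illustration.

```python
# Values reported in the text for PU -> FI.
direct_effect = -0.164
total_indirect = 0.545
total_effect = 0.381
bo_path = -0.065  # indirect effect via Boredom (PU -> BO -> FI)

# Total effect = direct effect + total indirect effect.
decomposition_gap = abs((direct_effect + total_indirect) - total_effect)

# With a negative direct effect, the indirect effect exceeds the total effect,
# so its share of the total is greater than 1.
indirect_share = total_indirect / total_effect

# A negative path contributes a negative share of the total indirect effect.
bo_share_of_indirect = bo_path / total_indirect

print(f"indirect / total = {indirect_share:.2f}")
print(f"BO path / total indirect = {bo_share_of_indirect:.2f}")
```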

5 Discussion

This study examined the relationship between ESL students’ perception of ChatGPT as a computer-assisted language learning (CALL) tool and their intrinsic motivation, underscoring the model’s specific applicability to the ESL context. The study aimed to devise an acceptance model for ChatGPT anchored in the Hedonic-Motivation System Adoption Model (HMSAM) while exploring students’ behavioural intention and focused immersion. Research outcomes highlighted the viability of the HMSAM in the context of AI-based chatbots, supporting 14 of the 21 hypotheses.

Perceived ease of use (PEOU) notably influenced perceived usefulness (PU) and personal innovativeness in information technology (PIIT). This finding, resonating with Agarwal and Prasad (1998), gains added significance considering ESL learners’ diverse technological literacy and potential apprehension towards new tools. Unlike native speakers, ESL learners require clearer, more concise, and more culturally relevant content that aligns with their learning pace and style. ESL students might experience anxiety and a lack of confidence, especially when learning a new language and adapting to new technology simultaneously. When ESL learners encounter a system that is user-friendly, they are more inclined to become open to innovation in information technology. Conversely, if ChatGPT proves to be challenging, ESL students may need additional time and effort to become acquainted with the system, leading to resistance towards AI chatbots.

The primary determinants identified for the behavioural intention to use (BIU) ChatGPT, namely perceived usefulness, boredom, and focused immersion, resonate with and reaffirm findings from earlier research conducted by Ansong-Gyimah (2020), Kampling (2018), and Watson et al. (2013). ESL students, often struggling with motivation due to the complex nature of language learning, benefit significantly from tools that are perceived as useful and engaging. ChatGPT, by alleviating boredom and enhancing focused immersion, addresses these specific needs by offering interactive and context-rich content that resonates with ESL learners’ experiences and interests. However, the results did not entirely corroborate the impact of hedonic motivations like control, curiosity, and joy, as noted in prior investigations (for instance, Lowry et al., 2013; Deng & Yu, 2023; Palos-Sanchez et al., 2022). One possible explanation for this discrepancy is that the studies mentioned above primarily focused on game-based systems (Palos-Sanchez et al., 2022; Lowry et al., 2013) or social networking environments (Deng & Yu, 2023). In contrast, ChatGPT operates within a communicative AI-Chatbot framework that emphasizes informational learning environments. Therefore, the unique characteristics of ChatGPT as a language learning tool might contribute to the differing influence of hedonic motivations compared to other contexts explored in prior research.

The findings on the primary determinants of focused immersion (FI) in ESL learning, namely perceived usefulness, control, boredom, and joy, gain added depth considering ESL students’ unique educational contexts. Such findings confirm previous research conducted by Kampling (2018), Palos-Sanchez et al. (2022), and Deng and Yu (2023). ESL students were found to engage deeply in English language learning when they perceived ChatGPT as easy to control and enjoyable to use. Control and enjoyment in learning are key factors in keeping ESL learners motivated and engaged, as they often face challenges like language anxiety and cultural dissonance. An AI tool that is easy to control and enhances enjoyment can significantly improve their learning experience and outcomes. However, it is noteworthy that curiosity did not significantly predict focused immersion. This implies that even if ESL learners do not have a strong sense of curiosity towards English language learning while interacting with ChatGPT, they may still become deeply engaged by the educational content provided.

Mediation tests revealed that boredom, joy, and focused immersion played a crucial role in linking perceived usefulness to behavioural intent. This pattern aligns with the consistent findings of previous studies by Watson et al. (2013), Van Tilburg and Igou (2017), Kashdan and Silvia (2009), Ansong-Gyimah (2020), Kampling (2018), and Agarwal and Karahanna (2000), further validating these relationships. In the ESL context, this suggests that the perceived value of ChatGPT in enhancing language learning efficiency can significantly influence learners’ emotional states and engagement levels. Additionally, the mediating functions of boredom and joy between perceived usefulness and behavioural intent resonated with the discoveries of Ansong-Gyimah (2020), Kampling (2018), and Watson et al. (2013). For ESL learners who frequently encounter challenges such as linguistic complexity and cultural nuances in language learning, the alleviation of boredom and the presence of joy can be crucial motivators. The use of ChatGPT, perceived as useful, can transform the often-arduous task of language learning into an enjoyable and engaging process, thereby increasing the likelihood of sustained usage and behavioural commitment.

Furthermore, focused immersion was identified as a partial mediator between perceived usefulness and behavioural intent. This finding is particularly relevant for ESL learners, for whom deep engagement in the learning process is essential. Focused immersion in an AI-facilitated environment could mean a more profound absorption in the intricacies of the English language, overcoming barriers of traditional learning methods. ChatGPT, by providing interactive and contextually rich content, can foster this kind of immersion, making the learning experience more effective. These results highlight the significance of boredom, joy, and focused immersion in influencing ESL students’ intentions to engage with ChatGPT for language learning purposes. The integration of these emotional and cognitive factors in the learning process is especially important for ESL students, who may require additional motivational and engagement mechanisms to navigate the complexities of learning a second language. Moreover, control, boredom, and joy were pinpointed as complete mediators in the connection between perceived usefulness and focused immersion. Such results resonate with prior studies by Lowry et al. (2013), Oluwajana et al. (2019), Rehy and Tambotoh (2022), and Sidek et al. (2020). In the context of ESL learning, this suggests that when students perceive ChatGPT as useful, it enhances their sense of control over their language learning process, reduces boredom, and increases joy. This enhanced sense of control can be particularly empowering for ESL learners, who often navigate the additional challenges of learning in a non-native language, leading to a more immersive and rewarding educational experience.

The outcomes of this investigation illuminate the pivotal role of ChatGPT as an effective language learning instrument. Such CALL tools can offer many advantages and opportunities to learners and experts across various academic stages. The findings of the present study concur with recent studies by Kohnke et al. (2023) and Kasneci et al. (2023), underscoring the beneficial influence of tools like ChatGPT in second language acquisition. By focusing on the specific needs and experiences of ESL learners, this study contributes to a more nuanced understanding of AI’s role in language education. It offers ESL educators and researchers valuable insights into how AI tools like ChatGPT can be effectively integrated into ESL learning environments, addressing the unique challenges faced by this learner demographic. These challenges include cultural adaptation, overcoming language anxiety, enhancing engagement, and providing personalized, contextually relevant learning experiences. By harnessing the capabilities of ChatGPT, learners can enrich their studies and raise their proficiency in English. It is also pertinent to acknowledge that large language models like ChatGPT extend beyond the scope of mere language instruction. They hold promise in research, writing, and analytical work. Moreover, they encapsulate sector-specific linguistic competencies and other proficiencies suited to vocational education. Such assertions find resonance in the study undertaken by Hong (2023), highlighting the multifaceted capabilities and promise of large language models in education.

6 Conclusion

6.1 Major findings

This research delved into the correlation between ESL students’ perceptions of ChatGPT as a CALL tool and their innate motivation, anchored in the HMSAM paradigm. The results emphasized that, when employed as a CALL tool, ChatGPT significantly amplified students’ intrinsic motivation during their English as a second language learning journey. Furthermore, the study discerned mediating influences of control and boredom in the nexus between students’ perceived usefulness and focused immersion. Joy, intriguingly, acted as a bridge not only between perceived usefulness and focused immersion but also between learners’ proclivity for innovation and their intention to utilize ChatGPT. These revelations furnish pivotal understandings regarding the determinants steering the embrace and utilization of ChatGPT in the CALL spectrum, spotlighting the significance of aspects like ease of use, control, boredom, and joy in melding learners’ motivational drive and engagement.

6.2 Limitations of this study

Although this study provided valuable insights into the adoption of ChatGPT, several limitations should be noted.

Firstly, the research was conducted specifically among overseas Chinese ESL students in the UK, which may limit its broader relevance. Perceptions of and attitudes towards AI technology may vary across student populations and over time. Future research could consider a more diverse sample to capture broader perspectives. Secondly, the study placed relatively little emphasis on the impact of ChatGPT’s ease of use. Given the pivotal role of perceived ease of use in technology adoption, a deeper examination of this construct could provide valuable insights into its effect on learners’ motivation and engagement with educational tools. Thirdly, the study did not investigate the relationships among internal variables such as control, boredom, joy, and curiosity, focusing instead on their individual contributions to language learning with ChatGPT. Future research is encouraged to explore these interactions to better understand their combined effects on the educational use of technological tools. Fourthly, the study relied on convenience sampling, so the sample may not fully reflect the diverse characteristics of the larger population. Readers should therefore exercise caution in extrapolating the findings to a broader population. Finally, while the study examined the relationships of control, curiosity, and joy with behavioural intention and focused immersion, there may be other links involving these internal factors that also affect learners’ engagement and intentions. Exploring additional variables and their interplay might offer richer insight into the adoption and use of ChatGPT as a CALL tool.
These constraints suggest areas for subsequent studies, encompassing the expansion of participant diversity, enhanced analysis of the ease-of-use concept, and the assessment of both internal and external elements that might shape learners’ motivations and involvement. Addressing these limitations will contribute to a deeper and more complete grasp of the adoption and effectiveness of ChatGPT in educational contexts.

6.3 Implications for theory and practice

This study has meaningful theoretical implications. At its core, it supported the HMSAM’s theoretical framework in the context of ChatGPT-assisted language learning among ESL learners. HMSAM, originally developed for hedonic systems such as gaming and virtual reality, is extended here to AI-based chatbots in educational settings. The study thus broadens the HMSAM’s applicability, suggesting its utility in understanding user adoption of AI technologies in learning environments. It also underscores a nuanced theoretical aspect of HMSAM: the balance between the hedonic (pleasurable) and utilitarian (useful) aspects of technology, especially in an educational context where both play pivotal roles in student engagement and learning outcomes. The findings regarding the influence of perceived ease of use (PEOU) on perceived usefulness (PU) and personal innovation in IT (PIIT) provide a deeper understanding of the technology acceptance model. They show how PEOU not only enhances the perceived utility of technology but also fosters a culture of innovation within educational technology. These insights enrich the existing literature on technology acceptance by providing empirical evidence on how user-friendly interfaces can influence the adoption of AI in education. In future research, this model could be extended to examine the adoption of other AI-driven language learning platforms, such as Busuu, DeepL, and English Language Speech Assistant (ELSA). Furthermore, studies in other learner contexts, including learners of English as a foreign language, could provide a more integrated and detailed view of the adoption and effectiveness of these technologies.

In practice, the findings emphasize the critical role of user interface design in AI educational tools. The direct correlation between PEOU and PU implies that the more intuitive the ChatGPT interface is, the more useful ESL learners will find it. This has direct implications for AI tool developers, who should focus on designing interfaces that cater to the varied proficiency levels of ESL learners, ensuring that these tools are accessible and beneficial to all students. The study also points to the need for ongoing support and training for both educators and learners in utilizing AI tools. As ChatGPT and similar technologies become integral to ESL education, continuous professional development for teachers and support systems for students will be essential in maximizing the benefits of these technologies.

Furthermore, this study provides valuable insights for educators regarding the adoption of AI-based computer-assisted language learning (CALL) tools such as ChatGPT. Teachers should incorporate ChatGPT-assisted language learning support into their online or offline classes, as it can introduce alternative learning methods and engaging elements. Moreover, instructors need to take students’ intrinsic motivation into account and integrate ChatGPT into curricular tasks in ways that make learning engaging and improve English proficiency. The user-friendly nature of ChatGPT promotes a positive attitude among students and teachers towards new AI technologies. Additionally, teaching strategies should be tailored to different levels of education. For instance, in higher education institutions, teachers can leverage postgraduate students’ curiosity to maintain their focus on learning English. For undergraduate students, by contrast, cultivating a sense of joy and addressing boredom are crucial for sustaining their attention. Future studies might build upon these insights by studying learners at other educational stages, including middle and high school, to explore the viability and impact of deploying ChatGPT-based CALL tools in those settings.