1 Introduction

Education is influenced by the considerable changes driven by the new challenges that society has to face, such as ending poverty, saving the planet, and improving people’s lives and perspectives (Rieckmann, 2018). Today, we live in a complex scenario where technology pervades every facet of our lives, especially teaching conditions (Richey, 2008). Educational technology has become a core element for both students and teachers, but this scenario is dynamic, and hardware and software are constantly changing their configurations. As a result, new possibilities are emerging that go beyond the traditional face-to-face classroom space, such as online learning, mobile learning, new educational materials and media, new methodological designs, and new ways to assess academic outcomes (Fombona et al., 2020). However, perhaps the most important is the emergence of a new model of relationships between students, and between students and teachers (Sáez López et al., 2022). These huge challenges are framed in the 17 objectives of the UN’s Sustainable Development Agenda 2030 (Biswas et al., 2021), which also include delivering quality education for all, in which inclusion, equity, quality and opportunities for learning throughout life must underpin all innovative proposals designed for educational centers. Previous research has analysed serious games like those used in this study and demonstrated that they support sustainable curricula (Peña et al., 2020; Saitua-Iribar et al., 2020).

1.1 Students at a Socio-Educational Disadvantage

This research focuses on the academic needs of Spanish students in a situation of socio-educational disadvantage and, therefore, at risk of failure or dropping out of school, due to socioeconomic and cultural conditions. Our study explores the possibilities that serious games offer to help eliminate the academic gap that separates them from their peers.

The statistical yearbook for the 2019–2020 academic year, compiled by the General Directorate of Statistics and Studies of the Ministry of Education and Vocational Training of Spain, defines students in a situation of socio-educational disadvantage as those who:

present a significant school lag, with two or more courses difference between their level of curricular competence and the course in which they are enrolled, due to being in situations of socio-educational disadvantage derived from their belonging to ethnic and/or cultural minorities, by social, economic, or geographical factors, or by educational insertion difficulties associated with irregular schooling (National Institute of Educational Assessment, 2021, p. 239).

Students at risk due to socioeconomic and cultural conditions have high academic failure and early drop-out rates (Agasisti et al., 2021; Akbulut, 2022), and are also deemed responsible for their own situation because of a supposed lack of effort or merit (Owens & de St Croix, 2020; Talib, 2020). However, the link between social class and academic success or failure (King & Trinidad, 2021; Nunes et al., 2021) should not be overlooked, especially in relation to the day-to-day living conditions that affect families. Lower-class families often have to deal with financial problems, situations of tension and stress, low aspirations and poor-quality linguistic interaction between parents and children, all of which influence children’s academic performance (Shin & So, 2018; Tamm, 2021).

There are also well-documented literacy gaps among certain groups of students due to social or socio-economic circumstances, or to the use of a dialect that differs from the standard variety of the majority language (Luo et al., 2021; Mendive et al., 2020). Schwab and Lew-Williams (2016) found that these students tend to present lower linguistic development indicators and have fewer means to access technological resources. Linguistic competence can determine academic achievement in disadvantaged students, though numerous other variables also relate to performance at school. Some correspond to the individual, while others relate to the socio-economic and cultural context in which the student interacts. Researchers have found that the acquisition of linguistic competence determines students’ performance at school (Hong et al., 2020; Taghinezhad & Riasati, 2020). More specifically, lexical knowledge alone can be a reliable predictor of academic achievement (Szabo et al., 2020). Research concludes that a higher degree of lexical competence leads to better academic performance (Heeren et al., 2021; Schuth et al., 2017; Wood et al., 2021), since better skills are available for understanding, analysis, synthesis, and communication. In reading comprehension, deficiency in this sub-competence has been found to lead to difficulties in decoding, confusion as to task requirements, poor vocabulary, limited background knowledge, mnesic imbalance, low self-esteem, and lack of motivation (Capin et al., 2021; Torppa et al., 2019).

Despite all this research, complex or situational vulnerabilities of a more indirect kind, such as those related to students’ socioeconomic and cultural environment, are not being sufficiently addressed or studied (van der Lubbe et al., 2021). Nor are pedagogical alternatives being offered that respond to the needs of students at risk of academic failure or dropout due to socioeconomic and cultural conditions.

In the current environment, which sets out to improve education, there is clearly a need for innovative initiatives that enable students who are socio-educationally disadvantaged to attain a level of linguistic development that guarantees them access to a successful academic life and a professional future on equal terms with their less disadvantaged peers. In this study, the use of serious games is proposed to boost students’ motivation and involvement in their schoolwork (Schindler et al., 2017), which is fundamental for disadvantaged students. At this point, it is worth asking what applying serious games to a category of students with specific needs can contribute to research on serious games and learning analytics. Regarding this question, Zhao et al. (2022) propose a new line of research, stating that taking into account the knowledge gap of certain students improves the effectiveness of the games, since it allows both the content and the most appropriate type of game to be personalized for each learning level. Therefore, working with specific groups of students with particular characteristics makes it possible to contribute to research on the content and design of serious games.

1.2 Serious Games and Learning Analytics for the Specific Needs of Some Students

In the educational context, games have become one of the pedagogical tools that can effectively promote learning, and several studies (Al-Tarawneh, 2016; Cheung & Ng, 2021) have demonstrated the effectiveness of educational games in the acquisition of science concepts in primary education. These games can be used to teach a wide range of science topics. Moreover, the computer environment offers applications that can be used in classrooms, including many cases of software that is not specifically educational but is used in schools. Serious games, by contrast, are designed (Abt, 1970) with a primary educational purpose rather than pure entertainment (Zhao et al., 2022). They are used in a variety of settings, including business, industry, health, and education, and they combine learning strategies with game elements to teach specific skills, content, and attitudes. Their main features combine fun, simulation, topics, and attractive nuances. In serious games, players develop their knowledge and practice their skills by tackling obstacles (Zhonggen, 2019); these games strike a balance between fun and learning, ensuring efficacy and effectiveness in the didactic processes.

Previous research provides evidence of the ability of serious games to improve psychological and physical well-being, specifically subjective well-being, physical activity, nutritional education, and self-efficacy (de Vlieger et al., 2022; Scarpa et al., 2021). It has been found that properly designed and implemented serious games could be effective in improving student engagement, motivation, attitudes, concentration, and learning performance (Tapingkae et al., 2020; Taub et al., 2020; Wronowski et al., 2019). In addition, serious games promote greater student participation in the classroom (Saleem et al., 2021), encouraging students to take part in decision-making in a healthy and pleasant environment for exploration, and to be more reflective in their decisions and behavior (Hallifax et al., 2019).

In terms of higher cognitive functions or processes, the use of this methodology favors processes such as critical thinking, problem solving and creativity (Dindar, 2018; Pratama & Setyaningrum, 2018), as well as the development of self-regulation processes (Saiz-Manzanares et al., 2020). In a more academic field, serious games allow a better acquisition of skills related to language, mathematics, software engineering and scientific education (Clark et al., 2016; Fraga-Varela et al., 2021; Tokac et al., 2019; Wouters et al., 2013). Linguistic competence has a transversal character and is important in the rest of the subjects; thus, in both Mathematics and Sciences, understanding and verbal expression with accuracy and precision are needed (Enkvist, 2011; Navarra, 2020). Considering this, serious games can be an effective tool to improve student learning, but it is important to keep in mind that they must be contextualized in the educational curriculum and in the specific needs of students (Sun, 2023; Zhonggen, 2019). On the other hand, serious games need to be based on systematic instructional designs that guarantee their effectiveness (Schrader, 2023). Regarding these two questions, a research gap needs to be addressed.

The benefits of serious games arise not only at an individual level but also at a social level, which is relevant for students at risk of academic failure or dropout due to socioeconomic conditions. Children show progress in their communicative competence and creativity, which has repercussions on their interaction with peers and teaching staff (Hutagaol et al., 2023; Li et al., 2022; Tacoronte Sosa & Peña Hita, 2023). Improved interaction brings an increase in self-esteem: children show a more assertive pattern of communication, are more aware of their own rights and needs, and express their wishes and preferences in a clearer and more appropriate way (Flogie et al., 2020). This benefit in communicative style has, in turn, an emotional impact, since children are more capable of expressing their frustrations and anger without harming others (Davies et al., 2020). At the same time, they can express arguments and solutions, which indicates better teamwork skills. Therefore, children who have used serious games are more socially competent.

Regarding the benefits that serious games can provide to children with specific learning difficulties, previous studies have used digital resources, such as Augmented Reality, to support learning in children with dyslexia and dyspraxia (Cano et al., 2022) or digital applications that adapt to different learning rates in children with ASD (López-Bouzas & Moral-Pérez, 2023; Lussier-Desrochers et al., 2023). However, this is considered insufficient, and research on how to use serious games to support children with specific learning difficulties can be expanded greatly. In this sense, a research gap was detected, which it would be interesting to fill.

With respect to students in a situation of socio-educational disadvantage due to socioeconomic and cultural conditions, some research has been carried out that has evaluated the effects of serious games on the academic performance of these students specifically. In an acceptance analysis carried out by López et al. (2021) to evaluate the effects of serious games on academic performance, it was concluded that the effect was moderate and depended on a series of factors such as the design of the game, the context in which it is implemented, and the characteristics of the students. Another study by Deng et al. (2023) reached similar conclusions. However, research in this area is limited. More research is needed to better understand how ICT can be used to improve the academic performance of these students and develop effective intervention models, as pointed out by Haleem et al. (2022). These authors carried out a systematic review of the literature on the integration of ICT in the inclusive education of students in vulnerable situations, and the results indicate that the studies focus on the use of ICT to improve the academic performance of students in general, but do not focus specifically on students in vulnerable situations.

Returning to the academic field, serious games have been shown to be a powerful predictor of academic performance (Hautala et al., 2020; Thomson et al., 2020). Predicting academic achievement is a key element in education that enables practitioners to identify needs and develop preventive actions. In the context of serious games, an evaluation system based on Learning Analytics (LA) (Loh & Sheng, 2015; Valtonen et al., 2022) is beginning to gain strength; applied to the field of educational games, it is called Game Learning Analytics (GLA). Learning analytics is defined as “the measurement, collection, analysis and reporting of data about students and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs” (Ferguson, 2012, p. 305). LA collects a large amount of data about students, such as demographics, course performance, and activity logs (Rienties et al., 2019; Tempelaar et al., 2018). Specifically, GLA provides precise, evidence-based information on students’ progress within an educational setting (Alonso-Fernández et al., 2021). Assessment of the results from serious games is normally made using external questionnaires, while GLA data have yet to become a common feature of the assessment process (Alonso-Fernández et al., 2019). However, it is coherent for an intervention proposal based on serious games to assess students using the learning analytics data provided by the game’s own software, in order to monitor the student’s interaction with the game and compile more accurate metrics in real time. Thus, this research aims to contribute to expanding knowledge about evaluating the effectiveness of serious games through these learning analytics.

We propose using high-level metrics to measure learning and predict performance (Alonso-Fernández et al., 2019), which would allow the teacher to monitor the student’s progress in real time through learning analytics dashboards (Charleer et al., 2017). In this sense, it is necessary to consider what the scientific literature tells us about the predictive capacity of GLA. Previous studies have confirmed that serious games can predict students’ academic performance, but that the predictive capacity varies depending on the subject or global indicator analyzed (Fraga-Varela et al., 2021; López et al., 2021). In general, predictive capacity is greater in subjects that focus on the development of complex cognitive skills (Jovanović et al., 2021). This question will be examined in the present study.

Another relevant issue is that GLA can be a valuable tool for improving the integration of serious games into the educational curriculum. Learning analytics can provide information about students’ progress, the areas where they need support, and their interactions with the game (Daoudi, 2022; Swiecki et al., 2022). This information can be used to identify students with specific difficulties and to improve the instructional design of the games (Khalil et al., 2023). Although research in this area is growing rapidly, there are still important challenges, such as the lack of high-quality data and the difficulty of interpreting the results of prediction models (Hlosta et al., 2022). Here, we believe our research can help fill this knowledge gap.

2 Research on Serious Games and Linguistic Competence

This research is part of a broader study, the LingüisTIC project, whose aim is to use serious games to improve linguistic competence in socio-educationally disadvantaged students in primary education. The project is systematic in design and application, and includes a coding and data analysis system, as well as predesigned programs for the stated objectives that function as an instrument for the assessment of learning.

The LingüisTIC project is made up of four fundamental elements:

  • Identification of the sub-competences of linguistic competence in Primary Education through the analysis of the corresponding educational curriculum.

  • Selection of digital resources following the planned design of Dichev and Dicheva (2017) and Zichermann and Cunningham (2011). Additionally, the following references have been considered for the selection of serious games:

    • The relationship between game mechanics and learning mechanics, following the methodology known as Learning Mechanics-Gaming Mechanics, LM-GM, by Arnab et al. (2015).

    • The relationship between the game objectives and the learning objectives to be met (Lameras et al., 2016), considering that the learning objectives should be what guides game mechanics.

    • The three fundamental components of a serious game: game, pedagogy, and reality (Harteveld et al., 2007). A serious educational game is created with a pedagogical purpose, that is, for students to learn some content. This content must be valid and relevant enough for the student to use it outside of the game. These three elements must be balanced; otherwise, the game may not meet its objectives.

    • The 4C-ID instructional design methodology by Van Merriënboer et al. (2002). This methodology focuses on four components of the learning materials: learning tasks, supportive information, procedural information, and part-task practice. It treats the serious game as one more learning material (Akpınar et al., 2023).

  • Specification of the application based on the optimal time and the most appropriate number of sessions. This systematization allows the creation of a framework for applying the online programs, following the guidelines proposed by Mora et al. (2017).

  • Data collection and analysis using a non-invasive system, the GLA, following the protocol established by Alonso-Fernández et al. (2021).

In the process of selecting predesigned programs to improve linguistic competence, together with piloting experience prior to this study, it was decided to base this study on two serious games that fit the requisites for quality education software (Ulicsak & Williamson, 2011): Leobien and Walinwa.

The objective of Leobien (https://www.supertics.com/) is to improve the student’s reading speed and reading comprehension. The program uses an artificial intelligence system that can adapt to the student’s pace and curricular level. Players move forward or backward based on their successes and errors. In addition, there are intermediate tasks that favor the student’s progress, and all these activities are challenges adjusted to an optimal level of difficulty: high enough to maintain the student’s interest, but not so high that the student cannot overcome them. Leobien is structured around 8 sub-materials: Attention, Comprehension, Letter and Phrase, Memory, Word, Sequencing, Syllable and Text, and Reading Speed. Each sub-material contains different levels according to academic year, and there is a specific coding system for data gathering and analysis. The program’s work procedure consists of daily sessions of around 15 min. Leobien uses an adaptive behavioral methodology and enables the student to acquire skills and strategies fundamental to the reading process, helping the student to become a competent reader (see Fig. 1).

Fig. 1 Screen capture from Leobien: avatar design scenario

The aim of Walinwa (https://www.walinwa.com/) is to help users improve in spelling and vocabulary, as a way to enhance their written and oral expression and reading comprehension. Walinwa enables personalization of learning pathways by focusing on reinforcement processes in areas where students are deficient, consolidating those areas in which the student’s progress is adequate or good. The training sessions were designed to last for approximately 15 min due to the decline in student attention after 10–15 min (Almendingen et al., 2021). Therefore, this time is taken as a reference for making a decision regarding the maximum implementation time of each serious game. Walinwa is structured in 44 content items grouped around five broader sub-material areas: main topic, secondary topic, accent marking, grammar topic and other content in the Walinwa method. This investigation used three sub-material categories in other content in the Walinwa method as they were relevant to our study of linguistic competence in socio-educationally disadvantaged students, namely attention, memory, and vocabulary. The proposed content and categories in the program were developed by Walinwa in collaboration with a panel of expert teachers, and the program complies with the primary education curriculum.

Walinwa applies artificial intelligence to set the level of difficulty in line with the individual user’s progress. When the student completes the set of daily sessions (between 20 and 30), they receive a prize in the form of walinwos for reaching the proposed objectives in each session. The student is given a final score out of 10 as feedback on their performance in the program (see Fig. 2).

Fig. 2 Screen capture of Walinwa’s gameplay

Another significant question relates to the results assessment method used by Leobien and Walinwa. Their assessment system is based on the Game Learning Analytics (GLA) provided by the serious games, which are analyzed and correlated to the student’s academic performance, in line with Shute and Ventura (2013) and the GLA measurement and prediction objectives in Alonso-Fernández et al. (2019).

The Leobien program gives information on the progress of the individual student and of the group in the various sub-materials and provides two global indicators (see Fig. 3): Effectiveness (correct exercises divided by total exercises) and Performance (the number of exercises completed by the student divided by the mean number of exercises completed by students in the same course).
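As a rough illustration of how the two Leobien global indicators described above could be computed, the following Python sketch is offered under stated assumptions. The function names and sample counts are hypothetical and do not come from Leobien itself; in particular, reading Performance as the student's completed exercises divided by the course mean is an interpretation of the report's definition, not a documented formula.

```python
from statistics import mean


def effectiveness(correct: int, total: int) -> float:
    """Share of attempted exercises answered correctly (0.0 to 1.0)."""
    if total <= 0:
        raise ValueError("total must be a positive number of exercises")
    return correct / total


def performance(completed_by_student: int, completed_in_course: list[int]) -> float:
    """Student's completed exercises relative to the course mean.

    A value above 1.0 means the student completed more exercises than
    the average student in the same course (assumed interpretation).
    """
    course_mean = mean(completed_in_course)
    if course_mean <= 0:
        raise ValueError("course mean must be positive")
    return completed_by_student / course_mean


# Hypothetical counts, for illustration only.
student_effectiveness = effectiveness(correct=45, total=60)
student_performance = performance(120, [100, 80, 120])
```

Both indicators are simple ratios, so they can be compared across students who completed different numbers of exercises.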

Walinwa provides several types of report and data both on individual and class group progress, with specific reports on each session and more general reports on students’ progress by subject. The latter in particular was a data source used in the analysis for this study (see Fig. 4).

Fig. 3 Typical individual report provided by Leobien

Fig. 4 Individual student report by topic provided by Walinwa

Walinwa and Leobien enable the teacher to monitor each student’s progress in acquisition of knowledge in each part of the game, and to identify the areas they find more difficult and those in which they excel.

3 Research questions

Based on the literature, it seems that serious games are resources capable of improving psychological and physical well-being (de Vlieger et al., 2022; Scarpa et al., 2021), as well as motivation, attitudes, concentration of the students, learning achievement (Tapingkae et al., 2020; Taub et al., 2020; Wronowski et al., 2019), and language and mathematics-related skills (Clark et al., 2016; Fraga-Varela et al., 2021; Tokac et al., 2019; Wouters et al., 2013). However, there is a lack of studies that explore how GLAs have been designed, integrated, and implemented in serious games. There are few studies that focus on the educational improvement of students with specific needs and on complex or situational needs such as educational disadvantage due to socioeconomic conditions. Likewise, there are no studies that address the instructional design of interventions based on serious games considering the academic curriculum defined by educational legislation. Considering this, the research questions about disadvantaged students using serious games were:

Research Question 1 (RQ1): Development of linguistic skills. Can the use of serious games integrated into the curriculum and adjusted to the learning objectives of each educational stage facilitate the development and acquisition of key competencies in students in situations of socio-educational disadvantage?

The research gap that this question aims to address is that there is not enough research on the integration of serious games into the educational curriculum following a systematic instructional design. As stated in the theoretical framework, it is essential that these interventions be contextualized in the educational curriculum and in the specific needs of students (Sun, 2023; Zhonggen, 2019).

Research Question 2 (RQ2): Similarity with the rest of their peers. Can socio-educationally disadvantaged students catch up with their peers if educational interventions based on ICT and, specifically, serious games are systematically designed and implemented?

In this case, two research gaps are detected. First, scant attention has been paid to the educational needs of students in situations of complex vulnerability due to socioeconomic and cultural conditions (van der Lubbe et al., 2021). Second, the ICT-based interventions designed for this purpose are of low quality and rigor: most of them are unsystematic and supported by few empirical studies, as pointed out by Haleem et al. (2022). This prevents us from testing whether an ICT-based intervention can help these students catch up with their peers in terms of academic performance.

Research Question 3 (RQ3): Possibility to assess developed key competences and specific needs. Can the level acquired in a key competence be evaluated and specific needs identified in students with academic difficulties using GLA, as well as monitoring the progress obtained after the use of serious games in which they are integrated?

The research gap in this case is that serious games and GLAs are still rarely used to detect specific needs in students and track their progress (Khalil et al., 2023), even though we know that GLAs can provide information about students' progress, the areas in which they need support, and students' interactions with the game (Daoudi, 2022; Swiecki et al., 2022).

Research Question 4 (RQ4): Similarity of the results in these games with the results in other subjects. Is there any relationship detected between the scores obtained in serious games based on the curriculum and academic grades of the students, and is it possible to predict students’ performance in academic subjects through the scores collected in the analysis of learning embedded in serious games?

The research gap here is that, although this area is growing rapidly, there are still significant challenges, such as the lack of high-quality data and the difficulty of interpreting the results of prediction models (Hlosta et al., 2022).

4 Method

An exploratory pilot study with a single-group pretest–posttest pre-experimental design was conducted (Shadish et al., 2002). In pilot studies, pre-experimental designs are particularly useful for evaluating the viability of an intervention even though there is no control group (Campbell & Stanley, 1963). They can also provide valuable information on the feasibility, efficacy, and safety of an intervention; for example, a pilot study can be used to determine whether an intervention produces changes in the dependent variable. However, it is important to note that pre-experimental designs cannot provide conclusive evidence of causality (Campbell & Stanley, 1963).

A pretest–posttest was carried out to determine the students’ level of improvement in linguistic competence and academic performance using serious games. To determine the predictive capacity of the programs’ GLA, the data from the two programs were correlated with the students’ performance in class, and the percentage of variance explained by the program data was calculated.
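The correlation step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: it assumes a Spearman rank correlation (a common non-parametric choice) and approximates the percentage of variance explained as the squared coefficient times 100. The function names and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def variance_explained_pct(coefficient):
    """Squared coefficient expressed as a percentage (rough approximation)."""
    return coefficient ** 2 * 100


# Hypothetical game scores and class grades, for illustration only.
game_scores = [6.5, 7.0, 8.2, 5.1, 9.0, 7.4]
class_grades = [6, 8, 7, 5, 9, 7]
rho = spearman(game_scores, class_grades)
explained = variance_explained_pct(rho)
```

Working on ranks rather than raw scores keeps the coefficient meaningful when the data are not normally distributed, which is why a rank-based coefficient is assumed here.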

Several linguistics studies have used similar methodologies based on pretest and posttest measures (Abu & Farrah, 2020; Loewen et al., 2020), and in these cases the results quantified the development of communicative sub-competences. In the present study, the data on this evolution are provided by the games themselves through learning analytics.

The pretest consists of a linguistic competence test based on the prior knowledge that each student must start the academic year with (this is knowledge that they should already have consolidated from the previous year). If the student does not reach that level, serious games adapt to the starting level and promote progress towards the level that corresponds to them by age.

For the posttest, the algorithm of the serious games themselves and their artificial intelligence system establish a final level reached in relation to the initial level in each of the sub-competencies.

This follows the recommendation of Alonso-Fernández et al. (2019) that GLA data should be part of the evaluation process, instead of resorting to external tests.

Thus, the assessment starts from a norm-referenced evaluation, which establishes each student’s level relative to classmates in the same course, whereas the algorithm and artificial intelligence system operate in a criterion-referenced way, adjusting the game activities to the pace of each student rather than to the level of their classmates.

4.1 Participants

The sample was selected by intentional non-probabilistic sampling. It consisted of 75 students considered to be socio-educationally disadvantaged, attending four state schools in the Principality of Asturias (Spain) located in vulnerable neighborhoods, of whom 48% were girls (n = 36) and 52% boys (n = 39). In terms of school years, 6.7% were in the fourth year of primary education (n = 5), which includes children from 9 to 10 years of age; 52% were in the fifth year (n = 39), with children from 10 to 11 years of age; and 41.3% were in the sixth year (n = 31), with children from 11 to 12 years of age (M = 10.6; SD = 0.7). This information can be seen in Table 1. The school years allowed us to establish a comparison in terms of academic performance and acquisition of skills with respect to educational level. In this case, it is especially interesting to consider performance in relation to the norm to establish the level of socio-educational disadvantage of the students. Lamas (2015) defined academic performance as the level of knowledge demonstrated in an area or subject. This provides information about the degree of socio-educational disadvantage that students present, which is defined by a significant academic lag, with two or more grades of difference between their level of curricular competence and the grade in which they are enrolled. Therefore, their level of curricular competence and the academic level at which they were enrolled were considered as variables.

Table 1 Characteristics of the participating sample

4.2 Instrument

The two programs’ own Game Learning Analytics (GLA) were used to evaluate linguistic competence. Both programs chart the progress of each student in the learning process in each of the sub-competences worked on in each session and provide an initial and final mark. Leobien also provides two global indicators: effectiveness and performance. As well as providing data on individual progress in each of the sub-materials worked on, Walinwa has a global indicator called mean global result. Both programs can compare individual results to the mean result for the group to which each student belongs and to the mean result obtained nationally from their databases. These programs also house the academic results for each student in the term before implementation of the programs, and at the end of the term in which the program was implemented.

4.3 Procedure

The research was carried out in the second term of academic year 2020–2021. The serious games used were integrated into the first part of each class of Spanish Language and Literature. The point in the class at which the programs were deployed was chosen based on the piloting experience gained in 2019–2020 and on previous investigations, such as that of Myles et al. (2007), who noted that students’ interest in ICT generated behaviors related to academic achievement, such as preparing themselves for the class or showing greater initiative in developing work that they could share with classmates. The intervention ran for 50 sessions of approximately 15 min each, with one session a day, five days a week. The games were alternated weekly to avoid monotony and to maintain student motivation.

4.4 Data Analysis

The data obtained from the Leobien and Walinwa programs were analyzed using the SPSS (version 24.0) statistical package. The Kolmogorov–Smirnov (K-S) test (n > 50) yielded values of Sig. < .001 for the initial academic results (Table 2), which meant that the criterion for normal distribution was not met, so non-parametric tests were used in the subsequent analyses of these results.

Table 2 Distribution of the initial academic results according to the results of the Kolmogorov–Smirnov test, skewness and kurtosis

For the sub-materials of Leobien and Walinwa, the null hypothesis of normality was rejected wherever the significance of the Kolmogorov–Smirnov–Lilliefors statistic was below 0.05. Table 3 gathers these tests for all the sub-materials and global scores of Leobien and Walinwa considered in this research; some met the assumption of normality while others did not. This was taken into account in the subsequent data analysis.

Table 3 Distribution of Leobien and Walinwa sub-materials data according to the results of the Kolmogorov–Smirnov test, skewness, and kurtosis
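The decision rule described above (reject normality when the test is significant at the 0.05 level, then choose parametric or non-parametric tests accordingly) can be sketched as follows. This is only an illustration and not the SPSS procedure used in the study: the function names and the data are hypothetical, and the p-value is replaced by the approximate large-sample Lilliefors critical value 0.886/√n for α = .05, which is an assumption valid only for n > 30.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(scores):
    """K-S distance between the sample and a normal curve fitted to it.

    The mean and SD are estimated from the sample itself, which is the
    Lilliefors variant of the Kolmogorov-Smirnov test.
    """
    xs = sorted(scores)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # Largest gap between the empirical and the fitted distribution.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def rejects_normality(scores):
    """True when normality is rejected at alpha = .05 (approximation)."""
    n = len(scores)
    if n <= 30:
        raise ValueError("the 0.886 / sqrt(n) approximation assumes n > 30")
    return ks_statistic_normal(scores) > 0.886 / math.sqrt(n)


def choose_tests(scores):
    """Return (pretest-posttest test, correlation coefficient) to use."""
    if rejects_normality(scores):
        return ("Wilcoxon signed-rank", "Spearman")
    return ("Student t", "Pearson")
```

With 75 students the approximate critical value is 0.886/√75 ≈ 0.102, so any variable whose K-S distance exceeds that figure would be routed to the non-parametric branch.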

The data were distributed as follows (Table 3):

  • Data that followed a normal distribution pattern: the sub-materials of reading speed, effectiveness, and performance in Leobien, and main topic, secondary topic, accent marking, grammar topic, other content of the Walinwa method and mean global score in Walinwa. Student's t-test was used in the pretest–posttest analysis, and Cohen’s d was calculated as the effect size and interpreted using Cohen’s (1988) own threshold conventions. Pearson’s correlation coefficient was used to calculate the correlations.

  • Data that did not follow a normal distribution pattern: the categories of memory, attention, and vocabulary in Walinwa, and all the sub-materials in Leobien, except reading speed and the effectiveness and performance global indices. The Wilcoxon signed-rank test was used in the pretest–posttest analysis. Where significant differences were detected, the effect size was calculated with the statistic r = |z|/√N (with N = 75, √N ≈ 8.66) (Field, 2018; Fritz et al., 2012), and r was interpreted using the Cohen (1988) criterion with the Rosenthal (1996) extension. Spearman’s Rho was used to calculate the correlations.
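The two effect-size calculations named in the list above can be sketched as follows. Several points are assumptions rather than details reported in the study: Cohen’s d is computed here as the mean gain divided by the standard deviation of the gains (one of several paired-sample variants), the Wilcoxon z comes from the plain normal approximation for paired (signed-rank) data without tie or continuity corrections, and the 8.66 quoted in the text is read as √N for N = 75. The labels use only Cohen’s (1988) small, medium and large thresholds; function names are hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d_paired(pre, post):
    """Mean gain divided by the SD of the gains (assumed variant of d)."""
    gains = [b - a for a, b in zip(pre, post)]
    return mean(gains) / stdev(gains)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def wilcoxon_signed_rank_z(pre, post):
    """z from the normal approximation; zero differences are dropped."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - expected) / sd


def effect_size_r(z, n):
    """r = |z| / sqrt(N)."""
    return abs(z) / math.sqrt(n)


def label(value, small, medium, large):
    """Verbal label from three ascending thresholds."""
    size = abs(value)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "below small"
```

For example, `label(effect_size_r(3.0, 75), 0.1, 0.3, 0.5)` classifies a hypothetical z of 3.0 in a sample of 75, and `label(d, 0.2, 0.5, 0.8)` does the same for a Cohen’s d.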

The correlations between the posttest results for the sub-materials and the programs’ global indices and academic results were calculated to determine the programs’ capacity to predict academic performance. The decision to alternate the games weekly, described in the Procedure, was based on the testing of the programs before the pilot study, in which different implementation methods were tried for eight weeks. Alternating weeks was the method that maintained the highest motivation. Using either of the two serious games for two consecutive weeks caused unwanted effects, such as boredom, lack of motivation, or low performance.

5 Results and Analysis

The results indicated the effectiveness of serious games in improving students' linguistic competence and academic performance. The specific nature of the categories of serious games and GLA allows a detailed analysis of the different sub-competences, identifying specific needs and improvements. Finally, the predictive capacity of GLA for students' academic grades is demonstrated. The following sections indicate how the results of the study respond to the research questions.

5.1 Serious Games and Improvement in Linguistic Competence (Sub-materials) and Academic Results

To respond to RQ1 and RQ2, a pretest–posttest comparison was made to detect any gains achieved following use of the serious games (Table 4). After using the programs, there was a statistically significant improvement (p < .001) of considerable magnitude in the Leobien sub-materials of comprehension (M: pretest = 26.11; posttest = 30.67), attention (M: pretest = 23.52; posttest = 31.13), letter and phrase (M: pretest = 26.92; posttest = 31.92), memory (M: pretest = 27.58; posttest = 32.15), word (M: pretest = 26.34; posttest = 31.41), sequencing (M: pretest = 24.59; posttest = 31.68), syllable and text (M: pretest = 25.95; posttest = 31.86) and reading speed (M: pretest = 41.45; posttest = 46.37). The effect size was particularly large in memory and sequencing. These results also respond to RQ3, since students in a situation of socio-educational disadvantage present special difficulties in attention and sequencing. After using the program, the improvement is considerable, especially in sequencing.

Table 4 Results of the pretest–posttest comparison, and effect size of the sub-materials

A statistically significant difference was also observed between the mean pretest and posttest scores in the Walinwa sub-materials of main topic (M: pretest = 4.96; posttest = 5.98), secondary topic (M: pretest = 6.06; posttest = 6.68), accent marking (M: pretest = 4.76; posttest = 5.60), grammar topic (M: pretest = 5.42; posttest = 6.11) and other content of the Walinwa method (M: pretest = 5.19; posttest = 6.31). The same is true for the Walinwa global indicator, mean global score (M: pretest = 5.58; posttest = 6.39). Following Cohen’s threshold conventions, the effect size was large in all sub-materials, especially in main topic. In this case, and in response to RQ3, special needs are observed in main topic and accent marking, in which the pretest mean did not reach the minimum passing grade. After using the program, the improvement is considerable.

In the memory, attention, and vocabulary categories in Walinwa, which belong to the sub-material of other content of the Walinwa method, there was a statistically significant difference between the pretest and posttest scores, with a moderate effect size.

In terms of the academic results, the pretest–posttest comparison again showed significant differences (Table 5) in Natural Sciences (M: pretest = 5.26; posttest = 5.86), Social Sciences (M: pretest = 5.50; posttest = 5.88), Spanish Language and Literature (M: pretest = 5.33; posttest = 5.82), Mathematics (M: pretest = 5.07; posttest = 5.65) and English (M: pretest = 5.06; posttest = 5.76). The effect size is large for Spanish Language and Literature, Mathematics and English, and medium for Natural and Social Sciences. In response to RQ3, and taking into account that the score required to pass a subject is 5/10, especially low pretest scores are observed in English and Mathematics. After using the program, the improvement is significant in both subjects.

Table 5 Results of the pretest–posttest comparison, and effect size of the academic results

5.2 Serious Games and Predicting Academic Performance

We used the square of the correlation coefficient, which is the coefficient of determination, R². We chose this coefficient because it is used in the context of a model to predict future outcomes or test a hypothesis. It is of interest because it expresses the proportion of the variance in the dependent variable that is explained by the independent variable (López-Roldán & Fachelli, 2016), and therefore indicates how well the model reproduces the observed results. By calculating the correlation and coefficient of determination between the students’ final scores in the programs and their schoolwork scores following implementation of the project, we could determine the ability of the programs to predict each student’s academic performance (Table 6) and in this way respond to RQ4. The correlations for the sub-materials and categories in which these results were significant are shown below.

Table 6 Correlations between the scores for sub-materials/global indicators and the academic results
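As a minimal sketch of how a correlation is turned into a proportion of explained variance, the functions below compute Pearson’s r, Spearman’s rho (Pearson’s r on average ranks) and their square. The function names are hypothetical and the code is not the SPSS procedure used in the study; the two numeric checks in the usage note use coefficients reported in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def r_squared(r):
    """Coefficient of determination for a single predictor."""
    return r * r
```

Squaring the reported coefficients reproduces the reported proportions: `r_squared(0.757)` rounds to .573 (57.3% of the variance) and `r_squared(0.540)` rounds to .292.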

First, the effectiveness global indicator in Leobien showed a close-to-moderate correlation with Natural Sciences and with the other subjects. The coefficient of determination showed that the indicator predicted around 10% of the variance in the academic results prior to the intervention. For the performance global indicator, there was a small correlation with Mathematics (R² = .017) and English (R² = .167); in the latter, the performance indicator explained 16.7% of the variance.

In the Walinwa sub-materials, main topic presented significant correlations with all the school subjects. The correlations with Natural Sciences, Social Sciences, Mathematics and English were small and positive, while for Spanish Language and Literature the correlation was moderate, and the coefficient of determination showed that this sub-material explained 24.8% of the posttest score for the subject. The grammar topic sub-material had a small positive correlation with Mathematics, as did the other content of the Walinwa method sub-material with Spanish Language and Literature, Mathematics and English. These sub-materials explained between 9 and 13% of the variance in the academic scores. Finally, the mean global score global indicator presented stronger correlations with the academic subjects, in particular with Spanish Language and Literature, where it explained 25.3% of the variance in the mean scores after using the program.

The score that best predicted academic performance was Walinwa’s mean global score indicator in all the subjects except English, where it was Leobien’s performance global indicator.

5.2.1 Predictive Capacity in Low-Performance Students

An important feature of this study was the analysis of these programs’ capacity to predict academic performance in low-achieving students in Spanish Language and Literature. This analysis is relevant to both RQ3 and RQ4. To this end, the data were segmented by setting a score of 5 (Pass) as the cut-off point for the students’ starting score in Spanish Language and Literature (n = 18).

Of particular importance was the correlation obtained between the syllable and text sub-material in Leobien and the posttest score in Spanish Language and Literature (rs = .540; R² = .292). The vocabulary category of the other content of the Walinwa method sub-material explained 57.3% of the variance of the average score obtained in Mathematics (rs = .757; R² = .573) (Table 7).

Table 7 Correlations between the scores for the sub-materials / global indicators and the academic scores for the low-achiever group
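The segmentation step behind this low-achiever analysis can be sketched as a simple filter on the starting mark. The record fields and function name are hypothetical, and the text does not state whether a starting score of exactly 5 falls inside the low-achiever group, so that choice is left as an explicit parameter rather than assumed.

```python
from dataclasses import dataclass


@dataclass
class Student:
    student_id: str
    start_language: float  # starting mark in Spanish Language and Literature (0-10)
    post_language: float   # mark in the same subject after the intervention


def low_achievers(students, cutoff=5.0, include_cutoff=False):
    """Students whose starting mark is below (or at most) the cut-off.

    include_cutoff controls whether a mark equal to the cut-off counts
    as low achieving; the study does not specify this detail.
    """
    if include_cutoff:
        return [s for s in students if s.start_language <= cutoff]
    return [s for s in students if s.start_language < cutoff]
```

The correlations for this subgroup would then be computed only on the records returned by `low_achievers`, with the small group size (n = 18 in the study) kept in mind when interpreting them.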

6 Discussion

The aim of this study was to determine the impact of the use of serious games on improving linguistic competence and academic performance in socio-educationally disadvantaged students.

The students benefited greatly by acquiring the sub-competences that were developed to boost their linguistic communication skills. These results are in line with those found in Clark et al. (2016), Pires et al. (2019) and Wouters et al. (2013). The findings in our study confirm that the use of gamification strategies and game-based educational actions facilitates the development of competences (Erhel & Jamet, 2019; Lamb et al., 2018; Papastergiou, 2009). This question is related to RQ1, which asks whether the use of serious games integrated into the curriculum and adjusted to the learning objectives of each educational stage can facilitate the development and acquisition of key competencies; the results show that these students can benefit from these strategies and resources if they have been designed based on the curriculum. In terms of the socio-educationally disadvantaged students at the center of this research, our findings match those in Salinas and Garr (2009), Hornstra et al. (2015) and Van Oers and Duijkers (2013), which show that the performance of these students is on a par with that of the rest of the students when teachers deploy innovative strategies such as serious games. This finding helps to address the research gap pointed out in RQ2, which asked whether socio-educationally disadvantaged students could become equal to their peers if educational interventions based on serious games were systematically designed and implemented.

Regarding the identification of specific needs of students in a situation of socio-educational disadvantage, the ability of GLA to detect these needs and the progress obtained after the use of serious games has been demonstrated. The vocabulary category is one that showed significant improvement, in line with the study by Van Oers and Duijkers (2013) about acquisition of vocabulary in primary school students of different socio-economic levels, and in which disadvantaged students were shown to benefit most from sets of relevant combined activities. Among the low-achiever group, it was found that the score for vocabulary could largely predict the final score for Mathematics, which makes this finding even more relevant, and along the lines of Szabo et al. (2020) who demonstrated that lexical knowledge constitutes a reliable predictor of academic performance. On the other hand, the improvement observed in the academic performance of the students in the study may be related to this variable, contributing to the line of research in which it is determined that a higher degree of lexical competence leads to better academic performance (Heeren et al., 2021; Schuth et al., 2017; Wood et al., 2021), since better comprehension, analysis, synthesis, and communication skills are available. These findings contribute to filling the research gap noted in RQ3.

The intervention favors an improvement in academic grades, in line with previous research that reports this benefit for academic performance (Clark et al., 2016; Tokac et al., 2019; Wouters et al., 2013), thanks especially to the emotional and motivational component inherent in the game (Nazry et al., 2017). Motivation thus becomes both a strong reason for deploying serious games and an effect of their use (Almeida, 2017; Fokides et al., 2019).

The results also show a significant moderate-to-high relation between the scores the students obtained in the programs and their scores in schoolwork. This relation is particularly marked between the global indicators and the academic scores, which demonstrates the predictive capacity of GLA. These findings expand knowledge about the possibilities of GLA as predictors of academic performance, an issue included in RQ4. These results match the findings in previous research (Hautala et al., 2020; Medina et al., 2020; Tenpipat & Akkarajitsakul, 2020; Thomson et al., 2020) that showed how the use of learning analytics in education enables teachers to draw up reports, predict behavior, design actions to improve learning, and focus on specific competences in real time.

7 Conclusions

The study's exploration of the effective integration of serious games within the educational curriculum to reinforce comprehensive competencies and skills revealed several key considerations. First, contextualization emerges as a crucial facet, an issue referred to by Clark et al. (2016); aligning serious games with learning objectives, curricular content and student profiles optimizes their effectiveness, as Zhonggen (2019) and Sun (2023) had pointed out. Furthermore, it is necessary to adopt systematic instructional designs that support their effectiveness, which means adhering to the principles of learning psychology and, at the same time, adapting approaches to fit the characteristics of serious games, as stated by Schrader (2023).

Regarding the assessment of serious games and students' academic performance, this study highlighted their predictive capacity. However, this predictive capacity varies depending on the subjects, with a greater influence observed in areas that emphasize complex cognitive abilities, as reported in the meta-analysis by Jovanović et al. (2021). Interestingly, in this study, serious games exhibited greater predictive ability among lower-performing students, suggesting their potential as support tools for students who face specific difficulties.

In terms of integrating learning analytics into serious games for real-time assessment and instructional refinement, their symbiosis has emerged as a powerful fusion. Learning analytics provides valuable information about student progress, identifies areas requiring intervention, and evaluates interactions between students and games. This integration makes it easier to specifically identify struggling students for personalized support and to refine game designs to improve their effectiveness and engagement. This finding follows the line of studies such as those of Daoudi (2022) and Swiecki et al. (2022), who found that GLAs can help identify students who have difficulty learning a particular subject or skill.

To ensure the effectiveness of learning analytics, its integration requires meticulous alignment with the learning objectives, comprehensive monitoring of student interactions, and rigorous data analysis. This alignment ensures the acquisition of relevant and crucial information to refine the dynamics of the game and to improve students’ learning experiences.

In summary, the combination of serious games and learning analytics holds great promise in education. Their seamless integration not only identifies struggling students for personalized support but also refines instructional design, ultimately fostering a more engaging and effective learning environment.

7.1 Contribution of the Current Study

This study provides empirical evidence that serious games can be effective tools for improving students' language proficiency and academic achievement, even for those with specific difficulties. This finding is important because the existing scientific literature on the subject is relatively limited and, in general, does not provide conclusive evidence on the efficacy of serious games.

In addition, this analysis provides information on how the integration of learning analytics into serious games can further enhance their efficacy. Learning analytics can provide valuable information about student progress, the areas in which they need support, and student interactions with the game. This information can be used to identify students with specific difficulties, provide personalized support, and tailor the game to the needs of students.

This study fills the existing research gap in the following aspects:

  • Efficacy of serious games in students with specific difficulties: this study provides empirical evidence that serious games can be an effective tool to improve the language proficiency and academic achievement of students with specific difficulties.

  • Integration of learning analytics into serious games: this study demonstrates how the integration of learning analytics into serious games can further enhance their efficacy.

  • Contextual and instructional design considerations: this study highlights the importance of taking context and instructional design into account when integrating serious games into the educational curriculum.

In general, the study makes a significant contribution to the scientific literature on serious games and their application in the field of education.

8 Limitations and Future Research Directions

A relevant issue is that this study is based solely on quantitative data; the results obtained should therefore be complemented with data from qualitative techniques, combining both methods. These techniques allow for the collection of valuable information for the design and adaptation of the program.

On the other hand, additional data are required to ensure that there is no contamination from other variables. In this sense, it would be necessary to consider all the variables that affect academic performance, for which it would be essential to carry out experimental studies with a control group. Additionally, it is necessary to examine the heterogeneity of students in a situation of sociocultural disadvantage by establishing subgroups or different profiles.

Another relevant issue is the context of the educational center in which the programs are developed, with its own educational philosophy and teaching practices, as well as other aspects, such as the value given to ICT, digital teacher training, and the resources available for carrying out this type of intervention. All the centers studied are in vulnerable neighborhoods, but there is diversity among them that must be considered.