1 Introduction

Reading is a complex skill based on the development of several component skills (Setorini et al., 2022). In modern society, reading is strongly associated with academic success (McGee et al., 2002; Duncan et al., 2007), socioeconomic status (Ritchie & Bates, 2013), creativity (Wang, 2012; Ritchie et al., 2013), and intelligence (Torgesen, 2005; Ritchie et al., 2015), more specifically, verbal intelligence (Cunningham & Stanovich, 1998). Therefore, special attention must be drawn to the development of pre-reading skills among kindergarten-aged children and early reading interventions (Ehri, 2012; Torgesen, 2014), especially among children at risk for reading disorders.

The role of phonological awareness in reading acquisition and its relationship with the level of later reading are among the most widely studied areas as regards the cognitive precursors of reading (Lonigan et al., 2000; Ziegler & Goswami, 2005; Blomert & Csépe, 2012; Phillips et al., 2016; Stanley et al., 2018). More specifically, results indicate that the level of phonological awareness among children just coming out of kindergarten is distributed over a wide range and that objective mapping and directed development of this awareness are essential for later learning and cognitive development. There are also different views on the development of phonological awareness. There is agreement that two main levels can be distinguished in this area: both phonological awareness (Brady et al., 2011; Andrews & Wang, 2014; Mayer & Trezek, 2014) and phoneme awareness (Hatcher et al., 2006; Smith et al., 2008) play a key role in the process of learning to read. These skills are significant in decoding (Blomert & Csépe, 2012) and text comprehension as well (Nation & Snowling, 2004). The methods for measuring pre-reading skills show a varied picture: while a number of abilities are examined internationally (e.g., synthesis, segmentation and deletion), the examination of speech sound discrimination is the most authoritative in Hungary. Furthermore, the development of children in the sensitive period of the kindergarten-school transition may be monitored by subjective observation alone, as kindergarten teachers are not obliged to support the development of children’s abilities with measurements. Subjective observation may be significantly less effective than objective observation. At the level of individual children, it may also be the case that a student entering primary school has not taken any tests of their cognitive abilities beyond the compulsory speech therapy/logopedic screenings until the first grade. Students can thus start the first grade without diagnostics.
Even if there is an objective measurement of the cognitive components of school readiness, educators can choose from a number of tests and test variants, the results of which they can use to compile a number of development tasks and development programs, even at no cost. Although this methodological freedom allows teachers to choose among several aspects of subjective observation and among several assessment tools as well, this diversity does not result in a unified cohort that can be monitored from the same aspect or with the same assessment tools. Rigorous experimental research is therefore needed to provide personalized training tools to raise phonological awareness to maximize learning outcomes, promote appropriate learning processes at this very early stage of schooling, and help prevent dyslexia (Anthony & Francis, 2005).

Face-to-face development in a classroom environment without a diagnosis and personalized training does not provide the greatest effect (Molnár & Lőrincz, 2012). However, with technology and technology-based assessment and development, this issue can be addressed and training can be personalized so that students can receive truly individualized stimuli and immediate feedback, allowing them to take their own learning path through the exercises (Csapó & Molnár, 2019; Csapó et al., 2012). Previous research findings have illustrated the advantages of computer-based development in the following areas: (1) enhancing phonological awareness (Mitchell & Fox, 2001; Lonigan et al., 2003; Cassady & Smith, 2004; Segers & Verhoeven, 2005; Macaruso & Walker, 2008; Wild, 2009; de Graaff et al., 2009; Nelson, 2010; Adlof et al., 2010; Al Otaiba et al., 2012; Savage et al., 2013), (2) facilitating the identification of letter-sound relationships (Segers & Verhoeven, 2005; Macaruso et al., 2006), (3) improving word recognition skills (van Daal & Reitsma, 2000; Hecht & Close, 2002; Shelley-Tremblay & Eyer, 2009; Macaruso & Rodman, 2011) and (4) advancing word reading abilities (Johnson et al., 2010; Saine et al., 2011). Online measurement and development of pre-reading skills not only offer a more cost-effective solution, but also allow a number of students to complete tasks at the same time, and can increase student motivation and reduce examiner and scorer variance (Rathvon, 2004).

In this paper, we present an online training program in Hungarian designed to develop pre-reading skills in first graders (6–8-year-old students). Our program has been tested empirically, and we provide the direct results of an evaluation study. We established the students’ ability levels before the development by measuring their pre-test results, and then we evaluated the effectiveness of the online training program using a pre- and post-test design. To improve the validity of our study, we used propensity score matching to assign students to either the intervention or control groups, taking into account factors such as school, class, gender, and grade. We first examined the overall effectiveness of the intervention program and then assessed which ability groups showed the most improvement in the development program, both at the surface and deeper levels.

1.1 Phonological awareness as a pre-reading skill and its development

Phonological awareness is a pre-reading skill which plays a central role in reading acquisition and refers to the ability to identify and manipulate syllables and/or sounds within a word (Zugarramurdi et al., 2022). During an individual’s language development, phonological awareness at the syllable level appears as early as preschool age (Chard & Dickson, 1999; Ziegler & Goswami, 2005). Certain “simpler” subskills of phoneme awareness, such as identifying and differentiating speech sounds, also develop at this age.

To become a successful reader, there are four main areas that need to be developed: (1) the ability to recognize and manipulate individual sounds in words (phonemic awareness), (2) understanding the relationships between letters and sounds (phonics), (3) reading with speed and accuracy (fluency), and (4) understanding the meaning of what is being read (comprehension). Another group of experts identified five key steps to learning to read: (1) becoming aware of the sounds in words (phonological awareness), (2) learning the letters and the sounds they represent (letter-sound relationships), (3) expanding vocabulary, (4) reading smoothly and accurately (fluency), and (5) using strategies to understand what is being read (comprehension strategies). Phonological awareness and a subskill of this umbrella term, phoneme awareness, are paramount in these studies. Although at different linguistic levels (the syllable and phoneme levels), phonological awareness operates with the same mental operations (Elmesalamy & El-Ater, 2022): (1) segmentation, that is, breaking linguistic units down into linguistic elements of different sizes; (2) synthesis, that is, concatenating linguistic units into words; (3) isolating, that is, separating a given linguistic unit from a word; and (4) manipulation of language units.

There is agreement in the literature that the development of phonological awareness from large to small sound units is universal among languages, with the rate of populations speaking different languages varying in order and proficiency at each level (Anthony & Francis, 2005). Anthony and Francis (2005) also point out that since phonological awareness develops in the pre-reading phase, the spoken language factors that differ between languages (e.g., saliency and complexity of word structures, phoneme position, and articulatory factors) significantly affect the development of the ability. Therefore, it is not possible to build a universal model for the exact sequence of the development of each component skill of phonological awareness. Anthony and Francis (2005) first highlight the interdependence of syllable-level manipulation, onset and rhyme manipulation, and the formation of individual phonemes within intrasyllabic word units. The second main process stresses the interdependence of detecting similar- and dissimilar-sounding words, the manipulation of sounds within words, and the interdependence of sound manipulation and segmenting phonological information (Anthony et al., 2003).

Farrall (2012) also points out the process of moving from larger to smaller language units. Manipulation at the level of words can be detected among children entering kindergarten, and word awareness already appears in this age group at the age of 3. With age, children recognize ever smaller linguistic units (in the first years of kindergarten, syllable and rhyme consciousness dominates) before they approach the output stage and the final year of kindergarten. At the age of five, children are able to identify sounds at the beginning of the word and segment short words into sounds. By the time children leave kindergarten and move on to school, they can separate speech sounds in the initial position of the word and synthesize sounds into words (Farrall, 2012).

Anthony and Francis (2005) highlighted that the peculiarities of oral and written language play an important role in the development of phonological awareness, which causes differences between nations and languages. In the Anglophone world, schooling commences at the age of five, and children begin to learn the letters in the final year of kindergarten. However, targeted education for children of similar ages is not common in Hungary. Of course, this does not mean that there are no children in Hungarian kindergartens who acquire the ability spontaneously and/or in other, non-institutional settings.

In English, rhyming appears first, then syllable segmentation and synthesis, while syllable segmentation and synthesis emerge first in Hungarian, followed by finding rhymes. In both cases, the line of development is followed by manipulation of phonemes (Jordanidisz, 2011). As for differences in the phonological structure, two characteristics of the types of language appear: the isolating feature of English and the agglutinative nature of Hungarian as well as related aspects of complex words. Another important factor in the process of teaching reading is that, in contrast to the Anglophone world, where teaching within an institutional framework begins at the age of 5, Hungarian children begin to learn the basics of reading and writing 1–2 years later, at 6–7. That is why it is extremely important in Hungary to set the goal of practicing basic skills, which are first introduced and developed in kindergarten, in the early stages of primary school.

1.2 Directions of fostering phonological awareness

Computer-assisted education has been utilized in schools since the 1980s and has demonstrated its effectiveness, particularly for students at risk. Numerous studies have emphasized the advantages it offers, such as increased motivation (Papastergiou, 2009) and reduced cognitive burden (Ricci et al., 2009). Additionally, computer-based programs have been designed to improve students’ reading skills, with a specific focus on aspects like phonological awareness, letter-sound associations, word recognition, and reading comprehension. Nonetheless, there is a noticeable absence of comprehensive reading intervention programs in the curriculum designed for students who do not have specific learning needs. Kloos et al. (2019) employed the MindPlay Virtual Reading Coach program to enhance the reading skills of second and fourth-grade students, resulting in enhanced reading fluency as the program usage increased. Prescott et al. (2018) utilized the Core5 online program to assist underprivileged students with their reading, targeting various reading components and achieving positive outcomes across all grade levels. Macaruso et al. (2019) further corroborated the efficacy of the Core5 program in a long-term study involving disadvantaged students. A meta-analysis by Jamshidifarsani et al. (2019) revealed that technology-based reading programs can effectively teach letter and word recognition, progressively developing reading skills. They also recommended harnessing technology for innovative teaching methodologies. To sum up, technology-supported reading development programs often cater to specific learning disabilities, leaving limited options for students experiencing developmental delays.

In recent years, the Internet has represented one of the primary areas of edutainment. Teachers, parents, and students can find a number of online and offline games that they may use routinely, including games which may develop phonological awareness. However, the educational impact of most entertaining materials has never been tested or even examined. At best, these programs are designed to entertain children, and the programmers are usually not interested in knowing their cognitive and affective effects on them.

Beyond games aimed at improving phonological awareness, there is face-to-face training with proven efficacy. Bdeir et al. (2020) showed a significant improvement in phonological awareness as a result of a twelve-week enhancement effort involving four kindergarten groups. In a quasi-experimental study of 124 first- to third-grade students, Damasceno et al. (2022) reported higher performance due to the training, which was administered by a speech therapist twice a week in ten 50-minute sessions in a classroom setting.

Twenty-first-century ICT has produced a number of online programs for the development of phonological awareness (Mitchell & Fox, 2001; Lonigan et al., 2003; Cassady & Smith, 2004; Segers & Verhoeven, 2005; Macaruso & Walker, 2008; de Graaff et al., 2009; Wild, 2009; Adlof et al., 2010; Nelson, 2010; Al Otaiba et al., 2012; Savage et al., 2013; Amorim et al., 2020). Most of the evidence-based digital training programs are designed either for students with developmental disorders or for students with dyslexia. In these programs, a combination of visual and acoustic stimuli or the teaching of correct articulation mostly predominates. For example, the purpose of LiPs (Lindamood Sequencing Program for Reading, Spelling and Speech) is to practice the correct pronunciation of phonemes (Jamshidifarsani et al., 2019), where the visual and acoustic stimuli are connected and hearing problems can be compensated for by visual stimuli. The two stimuli are combined to complement each other, so it is beneficial for deaf children. Magnan et al. (2004) also reported on a successful development program for dyslexic learners, highlighting the development of phonological awareness through training sessions held four times a week for 30 minutes each. The programs published in Hungary to help students learn to read or catch up are also organized around the same topics but are more targeted to students with impaired or reduced abilities.

To sum up, most of the programs referenced above, whether face-to-face or online, focus on students with hearing difficulties, dyslexia, or dysgraphia. However, they do not deal with students who have no disorders, organic problems, or learning difficulties, only a developmental delay as a result of different or missing kindergarten and/or school instruction in disruptive times. To resolve this issue and speed up students’ development, the staff of the MTA-SZTE Digital Learning Technologies Research Group and the MTA-SZTE Research Group on the Development of Competencies have started to develop an online training program in phonological awareness. There are many options for measuring and developing phonological awareness online. Molnár (2016) classifies technology-based measurement according to nine criteria of effectiveness: (1) the economics of testing; (2) the test/testing design; (3) the possibility of immediate, objective, standardized feedback; (4) changes in children’s motivation; (5) innovative task-editing opportunities; (6) the potential for adaptive testing; (7) an expanded body sample; (8) contextual testing and efficient data capture and analysis; and (9) the potential to improve test reliability in innovative testing. Online development is not only a more time- and cost-effective development method, but objectivity can also be increased during the test-taking process. During face-to-face development, the teacher sits in front of the student, they see each other, and they can read each other’s facial expressions and/or gestures. Despite the best intentions of the teacher, the student can be influenced, since the teacher can suggest the right answer with his or her tone of voice and non-verbal signals. Therefore, teachers not only have to pay attention to keeping their voice monotone, but also have to concentrate on avoiding any gestures, on the correct pronunciation/articulation of the sounds, and on recording the results accurately.
When teacher and student are face to face, the testing/developing procedure requires a lot of time, concentration, and self-discipline. In an online environment, students can participate in the development in groups, with the number of participants only limited by the school’s infrastructure. Pre-recorded audio ensures that all the students hear the same instructions with the same pronunciation, and they can move between tasks at their own pace. The responsibilities and workload of the teachers will be reduced, as will the time of the measurements, especially because of the group recording format. The surveyor does not need to record the answers to the tasks; the online system stores them (Molnár, 2016).

1.3 Aims of the present study and research questions

The aim of the study is twofold: first, to confirm earlier research results that the ages of 6–8, that is, the first years of schooling, represent a period in which to enhance and accelerate the development of phonological awareness, and, second, to examine the effectiveness of a computer-based training program in phonological awareness at this early age of schooling.

Thus, we intend to answer two research questions which we examined based on the effectiveness of the intervention program.

First, the general effectiveness of the intervention program is examined with RQ1:

  • RQ1: How effective is the online personalized training program for pre-reading skills in an educational context for ages 6–8?

Then we explore which subgroups benefit the most from the training if the groups are formed by starting level (RQ2):

  • RQ2: Which starting ability level of phonological awareness is the most sensitive for the training program? Do latent level analyses confirm the results of ability group comparisons?

2 Materials and methods

2.1 Participants

The sample for the study was drawn from first-grade students (aged 6–8) in the partner school network of our research groups using convenience sampling. Elementary schools were included in the sample on the basis of voluntary participation and the presence of the appropriate infrastructure. The infrastructure requirement did not mean that only students of higher ability were selected, since the developmental programme involved entire grades and classes within the participating elementary schools. The average age of the participants was 6.55 years (SD = .59 years). A total of 336 students from 30 classes in 20 elementary schools participated in the study. After the propensity score matching, 168 students (51.3% boys, 16 missing) were assigned to the intervention group and 168 students (49.3% boys, 24 missing) to the control group. On the sample level, the proportion of boys and girls was almost the same (50.3% boys and 49.7% girls).

2.2 Instruments

The effectiveness of the training programme was evaluated with an online computer-based test of phonological awareness. The same 20-item test was used as the pre- and post-test to monitor the students’ phonological awareness. Both the pre- and post-tests proved to be reliable (Cronbach’s α = .80), and correlations were significant (p = .01) among the items on both the pre- and post-test. The reliability coefficient could not be improved by omitting any individual item. The test consisted of 20 multiple-choice tasks: acoustic differentiation (five items), segmentation (five items), synthesis (five items), and visual skills (five items). On the acoustic differentiation tasks, students were asked to decide which animal’s sound they had heard. The sounds are similar in place and/or manner of articulation, thus making acoustic differentiation difficult for the students (Fig. 1). During the segmentation tasks, students were expected to click on as many sunflowers as there were claps for the example word, i.e., as many syllables as they heard in the word. During the synthesis tasks, students heard the word for a given picture broken up into separate sounds and were asked to decide which picture matched the word they had heard (e.g., students heard the /l/−/e/−/m/−/o/−/n/ sounds, chose the picture of a lemon and clicked on it). On the visual skills tasks, spatial orientation was examined, which plays a prominent role in the development of reading ability. During these tasks, students were asked to find all the arrows pointing in the same direction and distinguish them from their rotated images (e.g., students were expected to find all the arrows pointing down).
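The reliability check described above can be illustrated with a minimal sketch. It computes Cronbach’s α and the "α if item deleted" values for dichotomously scored items; the function names and the toy data are hypothetical and are not taken from the study, and the actual analysis may have been run in a dedicated statistics package.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student item-score lists."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across students, and of the total score.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item left out in turn."""
    k = len(scores[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in scores])
        for i in range(k)
    ]


# Hypothetical toy data: 4 students x 3 items scored 0/1.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(toy)
alpha_without = alpha_if_item_deleted(toy)
```

If no value in `alpha_without` exceeds `alpha`, omitting an item would not improve reliability, which is the situation reported for the test.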

Fig. 1

Example task from the acoustic differentiation subtest (The snake hisses /ss/. The bee buzzes /zz/. Click on the one you hear: /ss/)

Pre-recorded instructions were provided via headsets. The online delivery made it possible to administer exactly the same sounds/phonemes, i.e., the same stimuli, to all the students, thus avoiding the differences that arise in face-to-face data collection. It enabled us to evaluate the responses of the pupils the same way, pre-empting raters’ harshness in face-to-face data collection and boosting the validity of the results (Csapó et al., 2014).

The online training is built up consistently, based on the developmental process by Farrall (2012) noted above, moving more and more toward complex capabilities. The training program, which contains 148 training exercises, serves to establish and develop phonological awareness skills. The program includes game-based exercises related to phoneme awareness, including the recognition and differentiation of sounds focusing on the manner and place of articulation, the visual similarity of shapes, and the segmentation and synthesis of the operational levels of phonological awareness. Acoustic seriality and analysis already appear in the last members of the exercise series.

Feedback was provided for all 148 training exercises. If students failed to complete a task, they received 2–3 pieces of supportive feedback to help them overcome their difficulties; a total of 318 helping tasks were thus included in the training program. This feedback does not simply indicate whether the response was correct. It provides another opportunity for the learner to successfully complete the task and offers minimal additional assistance in finding the correct solution. In the case of pre-reading skills, this help takes two forms: it either reduces the number of possible answers for the task or it guides the student towards the correct answer with helpful advice and emphasis. If, despite both forms of help, the student does not complete the task correctly, the program proceeds to the next exercise (Fig. 2).
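The feedback logic described above can be sketched as a simple loop: a first attempt, then one retry after each supportive hint, and a move to the next exercise once the hints are exhausted. This is only an illustration under our own assumptions; the function name, the simulated responses and the hint texts are hypothetical and do not reproduce the eDia implementation.

```python
def run_exercise(correct_answer, responses, hints):
    """Simulate one exercise with supportive feedback.

    responses: the student's successive answers (hypothetical input).
    hints: ordered supportive prompts for this exercise (e.g., 2-3).
    Returns (solved, hints_shown).
    """
    hints_shown = []
    attempts = iter(responses)
    # One initial attempt plus one retry per available hint.
    for step in range(len(hints) + 1):
        answer = next(attempts, None)  # None models a missing response
        if answer == correct_answer:
            return True, hints_shown
        if step < len(hints):
            hints_shown.append(hints[step])
    # All help was used without success: the program moves on.
    return False, hints_shown


hints = ["repeat the instruction", "name the pictures in order"]
solved_late = run_exercise("dagger", ["break", "dagger"], hints)
never_solved = run_exercise("dagger", ["break", "fork", "purple"], hints)
```

In the first call the student succeeds after one hint; in the second, both hints are shown and the exercise ends unsolved.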

Fig. 2

Example of an intervention task (Script: Step 1. Click on the pictures for which you hear the word. Dagger, Lilla. Step 2. Let’s try again. I’ll tell you the words for two pictures. Find them and click on them. Listen carefully: dagger, Lilla. Step 3. I’ll help. Let’s say the words for the pictures in order, going from left to right: break, dagger, purple, Lilla, fork. Now I’ll tell you two of them. Click on them. Dagger, Lilla. Note on the task: in Hungarian, some of the example words sound similar. The aim of the task was to differentiate between long and short vowels and single and double consonants. In Hungarian, the words for the pictures are as follows: tör (break), tőr (dagger), lila (purple), Lilla (Lilla, the name of the girl), and villa (fork). The task requires the student to click on the picture of the dagger (tőr) and Lilla (the girl). The difficulty of the task lies in the fact that word pairs with short and long vowels and single and double consonants appear among the pictures (tör and tőr; lila and Lilla))

2.3 Procedure

The evaluation study followed a quasi-experimental design. That is, students in both the intervention and control groups were assessed at two different time points: before and after the intervention (Fig. 3). Both tests were administered via the eDia online platform (Csapó & Molnár, 2019).

Fig. 3

Educational processes in the intervention group and the control group

To minimize the impact of teachers’ individual characteristics and teaching methods, we paired students based on their pre-test results using a statistical method called propensity score matching. We took into account various factors such as the students’ school, class, gender, and grade level, while excluding any potential effects caused by the teacher. The matching was primarily based on the average test scores of the students. Each pair was then randomly divided into an intervention group and a control group.

The final sample from the 30 classes included in the study was formed in three steps: (1) all the students who had not taken both tests (the pre- and post-tests) were deleted from the database; (2) data from all the students who had not participated in at least 85% of the training exercises were deleted from the intervention group; and (3) propensity score matching was applied: students in the intervention and control groups were matched by grade based on their pre-test result, and additional students were excluded, mainly from the control group (and from the intervention group only when no match was found). As a result of this process, half of the classes were assigned to the intervention group and the other half to the control group.
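The three sample-formation steps can be illustrated with a simplified sketch. Note that this is not a full propensity score model: instead of estimating propensity scores from several covariates, it matches students greedily, one to one, on the pre-test score within the same grade. The dictionary keys, the caliper of one point and the toy records are all hypothetical.

```python
TOTAL_EXERCISES = 148  # number of training exercises reported in the text


def keep(student):
    """Steps 1-2: both tests taken; >= 85% of exercises if trained."""
    has_both = student["pre"] is not None and student["post"] is not None
    if student["group"] == "intervention":
        return has_both and student["done"] / TOTAL_EXERCISES >= 0.85
    return has_both


def match_pairs(intervention, control, caliper=1.0):
    """Step 3 (simplified): greedy 1:1 nearest-neighbour matching."""
    available = list(control)
    pairs = []
    for s in sorted(intervention, key=lambda x: x["pre"]):
        candidates = [
            c for c in available
            if c["grade"] == s["grade"] and abs(c["pre"] - s["pre"]) <= caliper
        ]
        if not candidates:
            continue  # no match found: the student is excluded
        best = min(candidates, key=lambda c: abs(c["pre"] - s["pre"]))
        available.remove(best)  # each control student is used once
        pairs.append((s["id"], best["id"]))
    return pairs


# Hypothetical records, not data from the study.
students = [
    {"id": "i1", "group": "intervention", "grade": 1, "pre": 10, "post": 14, "done": 148},
    {"id": "i2", "group": "intervention", "grade": 1, "pre": 15, "post": 17, "done": 100},
    {"id": "i3", "group": "intervention", "grade": 1, "pre": 19, "post": 20, "done": 140},
    {"id": "c1", "group": "control", "grade": 1, "pre": 11, "post": 12, "done": 0},
    {"id": "c2", "group": "control", "grade": 1, "pre": 18, "post": None, "done": 0},
    {"id": "c3", "group": "control", "grade": 1, "pre": 10, "post": 11, "done": 0},
]
kept = [s for s in students if keep(s)]
pairs = match_pairs(
    [s for s in kept if s["group"] == "intervention"],
    [s for s in kept if s["group"] == "control"],
)
```

Greedy matching is order-dependent; an optimal matching algorithm or a model-based propensity score would be the more rigorous choice for real data.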

The online intervention lasted three months. The eDia system provided an opportunity for linear concatenation of developmental exercises. This form of linear development provided continuous progress between opportunities for development. Students had one training opportunity in the first week of the development program, two in the second week, and three in the third. This schedule allowed us to verify that development took place as a gradual process, since students could not complete all eight developmental tasks in the first week.

Pupils completed the assignments online in the computer room at their own institution, receiving instructions via headphones so that they could proceed at their own pace without disturbing their peers. A supervisor offered assistance in entering the child’s secret assessment ID but did not provide any further help in completing the tasks or the developmental exercise sets.

The intervention was conducted over a period of three months, with one or two sessions per week during regular school hours, each lasting around 15 to 20 minutes. Trained classroom or afternoon teachers supervised the intervention, which was automatically scored by a computer system. Before the programme started, participating teachers received a briefing session, and the researchers provided continuous help throughout the implementation process. The briefing included information on the effects of the lockdowns on primary school students’ pre-reading skills, the concept, structure, content, skills focus and task types for the intervention tasks for each session, and the use of the eDia online platform. Teachers were instructed to limit pupils to no more than three sessions per week, with one session per day and at least one day off between two sessions. Pupils were required to complete at least six out of the eight sessions, and teachers were responsible for resolving any technical issues but were not allowed to help pupils solve the tasks during the sessions, thus ensuring consistent intervention conditions across all schools. In the case of the control group, there was no special development in addition to the normal classroom lessons, and we did not give directions on the instruction and development of the control group. The development sessions took place in the afternoon, since the children no longer had guided lessons at that time. The time allotted for the development thus fell within the supervised study time in the afternoon, thereby ensuring that the students participating in the development did not miss the daily lessons, while the children in the control group engaged in leisure activities at that time and thus took part neither in the development nor in class sessions.
The intervention group received development in addition to the normal school lessons, with the development always being delayed compared to the content practiced in the lessons and aimed at developing abilities that were naturally embedded in the lesson (e.g., acoustic differentiation, seriality, and working memory). Thus, during the development of the intervention tasks, we also took into account the students’ first-year curriculum.

The consent of students and their parents was based on an agreement with the school. The research was approved by the Ethics Committee at the Doctoral School of Education, University of Szeged.

2.4 Analytical method

Beyond descriptive statistics, a paired-sample t-test was employed to investigate first graders’ achievement differences between the pre- and post-tests on the sample level, and a linear model was used to model and interpret the relationship between the pre- and post-tests, that is, the change in learning performance between the two measurement points. To confirm the results obtained from these dichotomous data, we used analysis of covariance (ANCOVA), which compares group means while statistically controlling for covariates.
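As an illustration of the paired-sample comparison, the following minimal sketch computes the t statistic and its degrees of freedom from pre- and post-test scores of the same students. The function name and the toy scores are hypothetical; obtaining a p value would additionally require the t distribution (for example, from a statistics package), which is omitted here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Paired-sample t statistic and degrees of freedom."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical scores for five students, not data from the study.
pre_scores = [10, 12, 9, 14, 11]
post_scores = [13, 14, 10, 17, 12]
t_value, df = paired_t(pre_scores, post_scores)
```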

Cohen’s d (Cohen, 1988) was used to measure the extent of achievement differences between groups. This effect size metric is commonly used to indicate the degree of differentiation between groups, with larger values representing a greater difference (Salkind, 2008). Values around .2 are considered to indicate a small effect, values around .5 a medium effect, and values larger than .8 a large effect.
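A minimal sketch of the effect size computation follows, using the pooled standard deviation of two independent groups. The function names and toy scores are hypothetical, and the exact cut-offs in `label` are one common reading of Cohen’s benchmarks rather than thresholds stated by the authors.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label(d):
    """Verbal label for |d| (assumed cut-offs: .2, .5, .8)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical post-test scores, not data from the study.
intervention_scores = [14, 15, 16, 17, 18]
control_scores = [12, 13, 14, 15, 16]
d = cohens_d(intervention_scores, control_scores)
effect = label(d)
```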

Analyses with observed variables have several limitations (Alessandri et al., 2017): they examine intervention effects exclusively at the group level and rest on strong normality and homoscedasticity assumptions, which are rarely met in real treatment data. We therefore also used latent-curve modeling with a three-step approach (Little et al., 2002) to evaluate the generalizability of the results on a latent level. Latent variable approaches can easily accommodate the lack of normality in the data and provide an opportunity to test stability and change as a result of the training program not only at the group level (such as the average group-level effect), but also in individuals; participants’ sensitivity to responding differently to the intervention can thus also be modeled (Curran & Muthén, 1999). The means of the two continuous latent variables, the intercept and the slope, describe the average initial level (the pre-test performance) and the average rate of change; their variances show the amount of inter-individual variability around the average initial level and the average rate of change (Alessandri et al., 2017). If the mean and variance of the latent slope are not significant in the control group but are significant in the intervention group, this shows that some participants were affected by the program differently (Muthén & Curran, 1997).
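As a compact illustration, a two-wave latent change specification of this kind can be written as follows. The notation is ours and serves only as a sketch: η denotes the first-order latent phonological awareness factors at pre- and post-test, ξ0 and ξ1 the second-order intercept and slope factors, and ζ the residuals. The fixed loadings (1 for the intercept; 0 and 1 for the slope) are the conventional choice for two measurement points and are assumed here rather than taken from the text.

```latex
\begin{aligned}
\eta_{i,\mathrm{pre}}  &= \xi_{0i} + \zeta_{i,\mathrm{pre}},\\
\eta_{i,\mathrm{post}} &= \xi_{0i} + \xi_{1i} + \zeta_{i,\mathrm{post}},\\
\xi_{0i} &\sim (\mu_0,\ \sigma_0^2), \qquad \xi_{1i} \sim (\mu_1,\ \sigma_1^2).
\end{aligned}
```

Here μ0 and σ0² are the mean and variance of the initial level, and μ1 and σ1² are the mean and variance of the change; a no-change specification corresponds to omitting the slope factor ξ1.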

By comparing the relative fit indices of the models, we gained further insights into students’ development through regular school instruction (control group) and students’ development through explicit training beyond regular school instruction. First, we specified a no-change model for both groups (intervention and control), assuming that neither normal school education nor the additional intervention produced any meaningful effect. This model only includes a second-order intercept factor representing students’ initial performance, with the mean and variance of the second-order intercept factor being freely estimated across groups.

Second, we used a latent change model for the intervention group and a no-change model for the control group. That is, we additionally estimated a slope growth factor in the intervention group to capture any possible change. This model anticipates a significant intervention effect in the intervention group but not in the control group. Finally, to assess a possible normative, intervention-independent mean-level change in the control group, we estimated a latent change model for both groups. The first model was then compared to the second model to evaluate the need for the slope growth latent factor in the intervention group; the changes in the fit indexes between the second and third models were then used to evaluate the need for this further latent factor in the control group as well. The model comparison approach is described in detail in Alessandri et al. (2017).

We compared the model fit indexes CFI (Comparative Fit Index), TLI (Tucker–Lewis Index), and RMSEA (Root Mean Square Error of Approximation) with its associated 90% confidence interval, as well as the changes in fit indexes between the different models. We accepted CFI and TLI values >.90 and RMSEA values <.08 (see Kline, 2016). The Akaike information criterion (AIC; Burnham & Anderson, 2004) was also used, as it “rewards goodness of fit and includes a penalty that is an increasing function of the number of parameters estimated” (Alessandri et al., 2017). If the AIC of a model differs by more than 2 from that of the best-fitting model, the model has considerably less support; if the difference is larger than 10, there is essentially no support for that model. The differences between the CFI and RMSEA values were also used in identifying the best-fitting model. According to Chen (2007), if the differences between the CFI and RMSEA values of two models exceed .01, the data support the two models to different degrees.
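The decision rules above can be written down compactly. The sketch below is only an illustration: the function names, the model names, and the AIC values are hypothetical, and the label "comparable support" for AIC differences of 2 or less is an assumption, since the text only defines the two larger categories.

```python
def acceptable_fit(cfi, tli, rmsea):
    """Apply the cut-offs stated in the text: CFI and TLI > .90, RMSEA < .08."""
    return cfi > 0.90 and tli > 0.90 and rmsea < 0.08


def aic_support(aics):
    """Map each model name to (AIC difference from the best model, label)."""
    best = min(aics.values())
    result = {}
    for name, aic in aics.items():
        delta = aic - best
        if delta > 10:
            label = "no support"
        elif delta > 2:
            label = "considerably less support"
        else:
            # Assumed label; the text does not name this category.
            label = "comparable support"
        result[name] = (delta, label)
    return result


# Hypothetical AIC values for the three nested models (illustration only).
example_aics = {
    "no_change_both_groups": 5012.4,
    "change_intervention_only": 4998.1,
    "change_both_groups": 4999.0,
}
support = aic_support(example_aics)
for model_name, (delta, label) in support.items():
    print(f"{model_name}: delta AIC = {delta:.1f} ({label})")
```

With these invented values, the no-change model would be rejected, while the two models with a slope factor would be hard to distinguish on AIC alone, which is why the changes in CFI and RMSEA are consulted as well.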

As only two measurement points were available, which is generally insufficient to estimate latent change models, we followed Steyer et al. (1997) and Little et al. (2002) and created two parallel forms of the phonological awareness scale based on the factor loading values. That is, we partitioned the items, thus dividing the scale into two parcels that can be treated as parallel forms.
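One way to partition items into two parcels on the basis of their factor loadings is sketched below. The text does not specify the exact assignment rule, so the serpentine allocation (A, B, B, A, ...), the function name `make_parcels`, the item names, and the loadings are all assumptions made for illustration.

```python
def make_parcels(loadings):
    """Split items into two parcels with similar loading profiles.

    Items are ranked by loading and assigned in a serpentine order
    (A, B, B, A, A, B, B, A, ...) so that neither parcel collects
    all of the strongest items.
    """
    ranked = sorted(loadings, key=loadings.get, reverse=True)
    parcel_a, parcel_b = [], []
    for position, item in enumerate(ranked):
        if position % 4 in (0, 3):
            parcel_a.append(item)
        else:
            parcel_b.append(item)
    return parcel_a, parcel_b


# Hypothetical standardized loadings for eight items (illustration only).
example_loadings = {
    "item1": 0.80, "item2": 0.75, "item3": 0.70, "item4": 0.65,
    "item5": 0.60, "item6": 0.55, "item7": 0.50, "item8": 0.45,
}
parcel_a, parcel_b = make_parcels(example_loadings)
print(parcel_a, parcel_b)
```

In this invented example both parcels contain four items and have the same summed loading, which is the property that allows them to be treated as approximately parallel forms.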

3 Results

3.1 Results for research question 1

  • RQ1: How effective is the online personalized training program for pre-reading skills in an educational context for ages 6–8?

On the pre-tests, the distribution curves for the intervention and control groups overlap, thus confirming that students with similar abilities were indeed included in the two groups. There were therefore no significant differences between intervention and control group achievement on the pre-test (t = 1.37, p = .17). After the training, both distribution curves shifted to the right. However, the curve for the intervention group was more pointed, steeper, and located in the upper ability range. It differed significantly from the control group curve in a positive direction, reflecting an additional improvement through the training among members of the intervention group at all levels of pre-reading skills (t = −2.28, p = .02) (Fig. 4).

Fig. 4 Student-level changes in performance from pre- to post-test in both control and intervention groups in Grade 1

Analysis of covariance (ANCOVA) was used to compare the changes between the groups. Levene’s test showed that the variances were equal across the groups (F = 3.38; df = 334; p = .07), so the ANCOVA could be applied. The analysis demonstrated a significant difference between the intervention and control groups (F = 11.08; df = 1; p = .001; η² = .032).

The findings from the group-level analysis are reinforced by the individual-level analyses presented in Fig. 5, which compares the results from the pre- and post-tests. The horizontal axis represents the results from the pre-test, while the vertical axis shows the results from the post-test. The dots on the graph represent individual students. If a dot falls on the middle line or between the two outer lines, which mark two standard deviations from the mean, the student performed similarly on both tests. If the dot lies above the upper line, the student made significant progress from the pre-test to the post-test. Conversely, if the dot lies below the lower line, the student’s performance declined significantly between the two assessments.

Fig. 5 Relationship between student performance on pre- and post-tests in control and intervention groups

3.2 Results for research question 2

  • RQ2: Which starting ability level of phonological awareness is the most sensitive for the training program? Do latent level analyses confirm the results of ability group comparisons?

Following propensity score matching, there were no substantial differences between the intervention and control groups’ performance before the intervention (Mcont = 58.9, SDcont = 19.0; Mint = 55.9, SDint = 20.2; t = 1.38, p = .17). However, the intervention group demonstrated higher performance on the post-test than the control group following the three-month computer-based training (t = 2.28, p = .02) (Fig. 6).

Fig. 6 Average performance of control and intervention groups on pre- and post-tests

Students’ mean performance was significantly higher in both of the groups on the post-test than on the pre-test. However, this difference did not reach one-fifth of one standard deviation (Cohen’s d = .17) in the control group, and it was more than half a standard deviation (Cohen’s d = .60) for the intervention group (Table 1).

Table 1 Descriptive statistics for sample

The participants in the study were divided into three groups based on their initial performance on the pre-test. The first group, which included 62 students classified as low achievers, scored more than one standard deviation below the average (0–39%). The second group, consisting of 223 students, scored within one standard deviation of the average (40–78%). The third group, with 51 students, scored more than one standard deviation above the average (79–100%). The results indicate that the differences in development between the control and intervention groups were larger in the two lower ability groups (Cohen-d1cont = 1.86, Cohen-d1int = 2.63; Cohen-d2cont = .06, Cohen-d2int = .47). In contrast, there was a smaller difference between the control and intervention groups in the higher ability group (Cohen-d3cont = .80, Cohen-d3int = .99). The intervention group showed a significant improvement in both of the lower ability groups, as demonstrated in Fig. 7.

Fig. 7 Changes in level of ability among students based on standardized mean difference for three skill groups (the first columns refer to the Cohen’s d values for the control groups and the second ones to those of the intervention groups; the first column pair represents the low achievers, the second the mean achievers, and the third the high achievers)
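The division into three ability groups described above, with cut-offs one standard deviation below and above the pre-test mean, can be sketched as follows. The function name `ability_group` and the scores are hypothetical; only the rule of one standard deviation around the mean is taken from the text.

```python
import statistics
from collections import Counter


def ability_group(score, mean, sd):
    """Classify a pre-test score relative to the sample mean and SD."""
    if score < mean - sd:
        return "low"
    if score > mean + sd:
        return "high"
    return "average"


# Hypothetical pre-test scores in percent (illustration only).
pre_test = [20, 30, 50, 58, 62, 70, 90, 95]
sample_mean = statistics.mean(pre_test)
sample_sd = statistics.stdev(pre_test)

groups = [ability_group(score, sample_mean, sample_sd) for score in pre_test]
group_sizes = Counter(groups)
print(dict(group_sizes))
```

Effect sizes such as Cohen’s d can then be computed separately within each of the three groups, as reported in Fig. 7.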

The second part of RQ2 aims to evaluate the impact of the intervention program using latent curve modeling.

First, we tested a model to measure phonological awareness by combining all indicators under one general factor using a preferred estimator for categorical variables. This model showed acceptable fit based on pre-test results (χ² = 261.5; df = 54; p = .00; CFI = .928; TLI = .912; RMSEA = .110 [CI: .080, .128]). Second, we created two parallel forms of the phonological awareness scale based on the factor loading values. We compared the fit of three alternative models using different fit indexes (Table 2). The data supported the need for a slope growth latent factor in the intervention group, but not in the control group. The slope factor in the intervention group had a positive, significant mean (κ = 0.315, p = .00), indicating that performance in this group increased on average. The negative correlation between the intercept and slope values (r = −.72) showed that the lower-ability students developed more than their peers due to the intervention. Overall, the fit indexes suggested that the second model fitted better than the first, confirming the need for a slope growth latent factor in the intervention group. The third model did not significantly improve the overall fit.

Table 2 Goodness-of-fit indices for tested models

4 Discussion

Adequate reading skills represent an essential element of integrating into society (Allen et al., 2014). The direct development of pre-reading skills begins as early as kindergarten age (Ehri, 2012; Torgesen, 2014). There are many efforts to develop these abilities as well as to diagnose early skills in children. However, these efforts do not show a unified picture (Anthony et al., 2003; Anthony & Francis, 2005). Students enter the first year of primary school with a wide range of competencies that the teacher needs to address. There is no uniform methodology to handle this problem, and differentiated development often appears to be a burden.

Practice shows that phonological awareness, as a determining cognitive factor in reading, can be developed appropriately either in face-to-face (Bdeir et al., 2020; Damasceno et al., 2022) or computer-based mode (Magnan et al., 2004; Jamshidifarsani et al., 2019). Most of these programs focus on students with hearing difficulties, dyslexia, or dysgraphia, but not on students without any disorders, organic problems, or learning difficulties. An online training program that would enhance the cognitive factor of reading has therefore been considered.

The study presents a pre-reading skills development program, focused on the development of phonological awareness, which followed a quasi-experimental design with a total of 336 first graders. For students to be included in the analyses, we set three main criteria: completing the pre- and post-tests, participating in at least 85% of the development program, and being retained in the propensity score matching. The development program was framed by pre- and post-tests, which were limited to the basic skills that are foundational to reading and writing.

With regard to RQ1, results showed that the students had already achieved significant progress after completing 85% of the training program, but spontaneous improvement can also be detected in the performance of the control group. Although the intervention and control groups started the development program with almost the same performance distribution, by the end of the program the frequency of lower-performing students had decreased on the post-test, while the frequency of better-performing students had increased (Fig. 4). As regards the student-level analyses, we examined the differences in performance on the pre- and post-tests within the intervention and control groups. From the results, we can conclude that the ratio of the pre- and post-tests is in the average range in the control group. That is, no major reorganization was observed. However, in the intervention group, a performance difference between the pre- and post-tests can be observed in relation to reorganization, with the students achieving better performance on the post-test compared to the pre-test. Reorganization is most visible among students who performed poorly on the pre-test (Fig. 5). Both the analyses of achievement frequencies and those at the student level point to the effectiveness of the development program for the entire sample as well as at the level of individual students. Spontaneous development highlights the relationship between reading and phonological awareness. The literature has explored the indisputable relationship between the two abilities (Lonigan et al., 2013) and their causal (Morais, 1991), interdependent, and mutually supportive relationship (Klicpera & Gasteiger-Klicpera, 1995). Jordanidisz (2011) observed that phonological awareness can develop faster through reading instruction. There is thus an improvement within the control group as well, which can be considered a result of literacy teaching.
The development of the control group can be explained by the great importance of institutional instruction, and of reading and writing in particular, for a child’s language development. In addition to learning to read and write from the first year of elementary school, students also acquire the vocabulary required for each discipline. Researchers assume three types of relationships between reading and the development of phonological awareness: the development of phonological awareness affects reading (Bradley & Bryant, 1983; Wagner et al., 1994; Hatcher et al., 2006), the child’s phonological awareness develops while learning to read (Morais, 1991), and the connection between the development of phonological awareness and learning to read is bidirectional (Perfetti et al., 1987; Cataldo & Ellis, 1988; Stuart & Coltheart, 1988; Burgess & Lonigan, 1998; Castles & Coltheart, 2004). Although views differ, researchers agree that the development of the two components is closely related. The development of the control group is not significant and is not as large as the difference in performance between the pre- and post-tests taken by the students in the program.

In the case of RQ2, we performed latent and manifest level tests to identify the effectiveness of the online training program. Students’ mean performance was significantly higher in both groups on the post-test than on the pre-test. The difference between the effect sizes measured between the pre- and post-tests in the intervention and control groups supports the extent of development (Cohen-dint = .60, Cohen-dcont = .17); the effect size for the whole training program was Cohen-d = .11. The advantage of the intervention group was also shown in the examination of the three skill groups. The difference in effect size is larger in the lower two skill groups (Cohen-d1cont = 1.86, Cohen-d1int = 2.63; Cohen-d2cont = .06, Cohen-d2int = .47), but the training program also had a positive effect on the ability development of the upper skill group (Cohen-d3cont = .80; Cohen-d3int = .99). The results showed that the current design of the development program had a greater impact on bridging the gap among the lower skill groups, with children low in phonological awareness benefiting the most from the intervention.

The study used structural equation modeling to analyze the effect of the intervention program at a deeper level. Different models were compared to find the best-fitting trajectory, which showed a significant positive effect of the training on the intervention group, while the control group did not show any significant change. The results also indicated that there was significant variability among students in their response to the training program. In summary, the analysis at the latent level supported the positive impact of the intervention program and highlighted the individual differences in how students responded to it. After three months, students who showed lower initial levels of phonological awareness were more affected by the intervention. In contrast, students with higher phonological awareness skills on the pre-test remained overall stable at their high level of phonological awareness. Results also demonstrated that the 6–8-year-old period is a sensitive one for reading acquisition. In the Hungarian context, reading and writing instruction begins in this period (Jordanidisz, 2011), which in itself significantly improves phonological awareness and results in the development of reading skills. However, the training program accelerated development beyond what this regular instruction achieved.

To sum up, the results suggest that the elaboration of this online training program can be considered successful. It develops 6–8-year-old first graders’ pre-reading skills in a playful way. The results indicate that pre-reading skills can undergo significant and efficient development, not only in a traditional face-to-face setting but also in a computer-based environment. The development program has accomplished its objective, specifically by targeting lower skilled groups and helping them bridge the developmental gap. However, the extent to which students in first grade struggle to acquire the skills that are fundamental to reading and writing acquisition is surprising. It may therefore be worth implementing this development program at an even younger age.

The study presents a development program that does not require the continuous presence of a teacher and provides students with independently usable and easily accessible, gamified tasks. Students receive continuous feedback between tasks, making the program suitable for use during regular classes or afternoon activities. Due to the immediate feedback, teachers can constantly monitor their students’ performance.

5 Limitations of the study

The present research shows the change between two points of measurement. However, at least 3–4 measurement points are usually required to run the latent change model used in this study. In addition to the low number of measurement points, a relatively low number of items was used due to the young age of the first-grade sample, which also made it difficult to create the two parallel forms of phonological awareness that were essential for a latent analysis of the results. There are also limitations in connection with the sample, one of which is the use of convenience sampling, as schools completed the tests after volunteering. Therefore, the sample is non-representative. In addition, only half of the original sample met the analysis criteria, as not all students who took the pre-test also took the post-test; this substantial dropout after the pre-test requires further research. As the control group did not receive any additional development or classroom lessons during the period in which the experimental group was participating in the development program, it would be worth investigating the placebo effect of the development program in the future.

6 Conclusion

The research presents a quasi-experimental design that implements a pre-reading skills development program for first-grade students. The program progresses from rudimentary skills to more advanced skills while providing learners with continuous feedback. The online development program facilitated the students’ significant developmental advancement over their control group counterparts. The program had the most profound impact on the development of students with lower ability scores, offering an opportunity to remedy subsequent delays. The findings of the research demonstrate that an online environment can foster the development of phonological awareness and pre-reading skills, providing teachers and students with an objective form of measurement and development.