Background

This study was conducted to investigate the benefits of rhythm training on the literacy and working memory performance of 6–8-year-old Primary school children. The research setting was designed to achieve high ecological validity and the study was implemented as a part of a weekly class schedule. Previous research investigating the effects of music education methods on literacy has addressed the benefits of training, but the research is still inconclusive in some respects (e.g., Dumont et al., 2017). The current study was conducted in response to the existing literature on the proposed yet not comprehensively confirmed relationship between rhythm perception and reading skills (Bonacina et al., 2021; Gordon et al., 2015; Lundetræ & Thomson, 2018; Ozernov-Palchik et al., 2018; Politimou et al., 2019; Steinbrink et al., 2019) and the benefits of music education methods for these skills (Degé et al., 2015; Flaugnacco et al., 2015; Slater et al., 2014; Tierney & Kraus, 2013a). Grounded in prior literature, the rhythm intervention in this study was specifically designed to develop musical rhythm skills in this age and literacy group of students.

Music Interventions and Reading Skills

The benefits of general music education on language skills such as reading have been studied for half a century (Ozernov-Palchik et al., 2018). Music education has been suggested to be associated with the development of literacy skills (Slater et al., 2014; Tierney & Kraus, 2013a), which points to a link between children’s musical aptitude/general music perception skills and reading (e.g., Degé et al., 2015; Flaugnacco et al., 2015; Welch et al., 2021). However, a recent critical review of music interventions by Dumont et al. (2017) found that research on music interventions and reading skills remained inconclusive. Across the eight studies that met their criteria, results varied dramatically from supportive to null. One approach in recent music and reading intervention research has been to examine the effectiveness of different interventions, for example by comparing singing-based and rhythm-based interventions (as in Patscheke et al., 2019). As these authors point out, however, it can be very challenging to disentangle different musical elements within music interventions and to conduct, for instance, “only a singing intervention” or “only a rhythm intervention.”

The importance of rhythm perception skills in language acquisition has been recognized in the literature over the past decade. Both children (e.g., Bonacina et al., 2015; Corriveau & Goswami, 2009) and adults (e.g., Bekius et al., 2016; Fiveash et al., 2020; Leong & Goswami, 2014; Thomson et al., 2006) diagnosed with dyslexia appear to have challenges with rhythm skills. Previous research suggests that rhythm perception performance in typically developing readers correlates with pre-literacy skills and literacy development (Bonacina et al., 2021; Chern et al., 2018; Gordon et al., 2015; Lee et al., 2020; Steinbrink et al., 2019). Researchers such as Usha Goswami (Goswami, 2011, 2018), Barbara Tillmann (e.g., Ladányi et al., 2020) and Simone Dalla Bella (e.g., Bégel et al., 2022) have suggested that deficits in rhythmic priming should be considered as one of the determinants of challenges related to literacy. Previous research on preschool populations has concluded that rhythm perception and production are the best predictors of phonological awareness, both for 3- to 4-year-olds (Politimou et al., 2019) and for 5- to 6-year-olds (Gordon et al., 2015; Lundetræ & Thomson, 2018; Ozernov-Palchik et al., 2018; Steinbrink et al., 2019). Bonacina et al. (2021) tested the beat synchrony skills of 3- to 5-year-old children (N = 150) and reported that the children who performed well on a beat synchrony task significantly outperformed their peers on pre-literacy measures. This recent research suggests a relationship between rhythm perception and literacy, particularly pre-literacy skills.

There seems to be a lack of research on music interventions that focus strictly on rhythm interventions and literacy development. However, there is research on rhythm interventions that cannot be strictly defined as “music interventions”. For example, a recent study by Harrison et al. (2018) used speech rhythm-based training to investigate the effects on the literacy skills of typically developing 4- to 5-year-old British children. After the 10-week training period and at the 3-month follow-up, the speech-rhythm-based training groups performed significantly better on word reading compared to the control intervention groups (phonological awareness and mathematics intervention groups). When reviewing previous research, country-specific differences in literacy skills and literacy curricula should be taken into account. For example, when children in the UK start school (see Harrison et al., 2018), they are customarily between four and five years old and begin to learn the basics of reading. In Finland, on the other hand, according to the national curriculum, children enter Primary school much later, at the ages of 6 to 7 (the first year of Primary school) and are thus only expected to have acquired reading skills by the end of the first year or, at the latest, by the end of second year at the age of 8. This gap of two years is an extremely long period in children’s natural development. We suggest, therefore, that international reading research should be compared and referenced with great caution.

Executive Functions in Cognition

In the current study, an act of learning is conceived as a function of a neuropsychological system that relies heavily on the ability to use executive functions (Chan et al., 2008; Diamond, 2013; Goswami, 2019). Executive functions (EFs) essentially enable all of our engagement with the world around us: we learn, care, love, and function at higher cognitive and physiological levels because of these cognitive attributes. Executive functions is an umbrella term used to describe a neural system that controls and manages other subordinate cognitive processes that are thought to be critical for behavior and learning. In particular, EFs are also reported to be intertwined with the mechanisms that enable and support reading skills (Kieffer & Christodoulou, 2020; Meixner et al., 2019; Michel et al., 2019; Nouwens et al., 2021). These functions can be measured by various standardized tests, and their scoring procedures are well established in the literature (e.g., Anderson, 2002; Chan et al., 2008; Spreen & Strauss, 1998). It has been suggested that the three core EFs are working memory, inhibition and cognitive flexibility (Diamond, 2013). The contexts in which these functions are most commonly assessed are educational settings and hospitals (Chan et al., 2008).

Music Interventions, Motor Functions and EF

Various forms of music education, ranging from instrumental training (e.g., Loui et al., 2019) to dance training (e.g., Shen et al., 2020), have been shown to have a positive impact on executive and general cognitive functions. A review of related research suggests that music training also has a positive specific impact on the development of cognitive functions required for the development of reading skills (de Diego-Balaguer et al., 2016; Hallam & Himonides, 2022; Kraus et al., 2014; Swaminathan & Schellenberg, 2020). However, music intervention research has been subject to criticism, as the actual mechanisms by which “music” and musical training influence cognition seem to remain a mystery. A recent statistically based meta-analysis by Sala and Gobet (2020) even concluded that the assumption of a clear impact of music interventions on other aspects of cognition could be challenged once the quality of the studies is controlled for.

Both general music education (Degé et al., 2011) and musical instrument training (Bergman et al., 2014) have been reported to improve working memory performance. Both short-term (6 weeks) (Guo et al., 2018) and long-term (Saarikivi et al., 2019) settings have yielded supportive results for improved working memory performance following music training. Lesiuk (2015) investigated differences in music perception (musical tone and timing discrimination) between typically developing children and children diagnosed with ADHD (attention deficit/hyperactivity disorder). It has been reported that children diagnosed with ADHD have difficulties with, for example, working memory performance (Martinussen et al., 2005). Lesiuk found that the level of performance on working memory tasks correlated with rhythm ability in children diagnosed with ADHD and concluded that working memory performance was the only executive function that significantly predicted rhythm discrimination in this population. Ireland et al. (2018) observed similar results with a typically developing population of children while creating an age-equivalent musicality test. Within their sample (N = 213), they found that working memory performance (recorded with digit span and letter-number sequencing tests) correlated with a rhythm discrimination task (c-RST, the child version of a rhythm synchronization task developed by Chen et al., 2008) and a syllable sequence discrimination task (the baseline task for the Melody Discrimination Task by Foster & Zatorre, 2010).

It has been suggested that motor skills, the ability to use our bodies, are intertwined with our cognition (Diamond, 2013; Kantomaa et al., 2013; Piek et al., 2008; Tierney & Kraus, 2013a; Michel et al., 2019; Vazou et al., 2020; Viholainen et al., 2006; Westendorp et al., 2011). Although a link between motor and cognitive skills is widely supported in the literature, the effects of rhythm and movement, or other music and movement-defined interventions, on executive functions have only been investigated by a few researchers using controlled experimental designs (Frischen et al., 2019; Maróti et al., 2019; Vazou et al., 2020). As the role of motor coordination and the effects of motor skills on cognitive functions has gained support in the literature (for a systematic review, see, e.g., Jylänki et al., 2022), the question of the effectiveness of music and movement-defined exercises on motor skills is relevant. Martins et al. (2018) reported that an Orff-based music training significantly improved children’s bimanual coordination and manual dexterity skills compared to a sports training control group and a passive control group. Zachopoulou et al. (2004) compared two differentiated motor skills interventions, one with and one without musical rhythm. Their research setting included music and movement-defined training for the experimental group and general exercise training for the control group. The experimental group achieved higher scores on both motor performance tests (jumping and dynamic balance) after eight weeks of training.

Close-to-Practice, Quasi-Experimental and Experimental Study Design

Close-to-practice research (Wyse et al., 2018) refers to educational research that often involves researchers working in partnership with practitioners. It could be stated that close-to-practice is a fairly standard design in educational research, representing a type of study conducted in a naturalistic school setting. In our study, one of the aims was to conduct research without interfering with the regular teaching routines and practices of the participating educational institution in order to enhance the ecological validity of the research setting. In this type of research, the use of intact research groups could be described as a convention, and the study designs are often quasi-experimental and/or experimental. Depending on the acquired resources, a naturalistic research design might reduce internal validity and potentially increase selection bias, but it could be suggested that these settings can achieve higher ecological validity than strictly randomized controlled trials. The current study was conducted in a naturalistic school setting, in close integration with the pedagogical and curricular aims of the participating Primary school, and this defined the operational framework of the research design. The close-to-practice design allowed for reflective and immediate support from the research to the school’s education specialists, which in turn supported both the effectiveness and feasibility of the research. However, because of this close collaboration, it was not possible, for example, to fully randomize the research groups, as intact research groups were more feasible for the teaching practice of the participating Primary school.

The effectiveness of rhythm and movement/Orff-defined/music and movement exercises for motor development has been supported in the research literature (e.g., by Martins et al., 2018). Rhythm and movement, and music and movement interventions appear to be more effective than sports training for motor function. In the referenced literature, the introduced relationship between literacy and rhythm perception (e.g., Ladányi et al., 2020) seems to have a strong background in the research field, but the lack of research focusing strictly on music rhythm interventions is striking. Music training is suggested to have a positive impact on reading development (Ozernov-Palchik et al., 2018), but research disentangling the effectiveness in more detail has yet to be conducted in a sufficiently large number of intervention studies. The aim of the current study was to disentangle this phenomenon further by designing an experimental and control training to compare, in particular, the effects of embodied rhythm training.

Aims of the Current Study

In the current study, we aimed to investigate the effect of enhanced rhythm training on reading skills and related cognitive functions. For executive cognitive functions, the focus was on working memory performance, a function that has been reported to promote reading skills, including word reading and reading comprehension (Ahokas et al., 2023; Carretti et al., 2009; Christopher et al., 2012; Fischbach et al., 2014; Gray et al., 2019; Towse et al., 2010). Previous literature has shown the strong importance of timing (Politimou et al., 2019; Ozernov-Palchik et al., 2018; Steinbrink et al., 2019; Gordon et al., 2015; Lundetræ & Thomson, 2018; Bonacina et al., 2021) and the ability to produce such behaviors. Both beat perception (e.g., Carr et al., 2014) and motor skills (Michel et al., 2019; Viholainen et al., 2006) have been reported to be associated with the development and improvement of literacy skills. The intervention method of the current study was specifically designed to challenge and develop participants’ beat perception and motor abilities. Our research question was: Does enhanced rhythm training improve the development of literacy skills and working memory performance in pupils learning to read in their first and second years of school? According to the national curriculum in Finland, children should learn to read by the end of their second year in school. Our research sample consisted of pupils in the first and second years of school, so we expected both groups to improve their reading skills over the course of the training period. However, we hypothesized that there might be an effect of rhythm training in the subgroup of so-called lower starting level readers, that is, those students who scored lower at baseline.

Research Methods

Participants

The participants of this study were all primary school pupils aged 6 to 8 who were in their first or second year of school. The participants’ guardians gave written consent for the research before the study began, and ethical and official permission to conduct the research was confirmed by the education division of the City of Helsinki. Overall, there were 70 students in Years 1 and 2. Of these, 56 were able to participate in the pre- and post-measures of the working memory test. The experimental group consisted of 29 pupils, and the comparison group had 27. See Table 1 for the age and gender division in the participant groups. Fifty-two students were able to complete the literacy skill evaluation at all four measurement points (pre, post and follow-ups 1 and 2); the experimental and comparison groups each comprised 26 students. See Table 2 for gender division in literacy skill evaluation.

Table 1 Participants’ age and gender division in EF (working memory) test
Table 2 Participants’ gender division in literacy skill evaluation including lower starting level (1–3) readers

The research environment was a small Finnish primary school located in the city of Helsinki. During the period of the fieldwork, the whole student population of the school was approximately 120 students. The primary school has Years 1 to 4 and one special unit that is a smaller class for students with special educational needs. As a research environment, the school can be interpreted as a representative example of an average Finnish capital area school. The school was typical of its region, where the majority of children’s parents were middle-income. Approximately 30% of the student population had a home language other than Finnish. This is relevant for this study, as the focus was on the participants’ literacy skills in relation to the Finnish language. These participants with non-Finnish speaking backgrounds were equally divided between the two research groups (n = 11 in both groups).

Our study design was an experimental setting in which participants were divided into experimental and comparison groups on the basis of their assessed literacy performance. The literacy performance test used was the standardized reading assessment tool ALLU (Ala-asteen Lukutesti [Reading Test for Primary School], Lindeman, 2005). Half of the participants were first-year pupils who were still inexperienced in school rules and group work. In order to ensure the operational quality of the teaching, the class teachers’ expert assessment of the group cooperation skills of the whole group of participants was used in the allocation. In sum, the allocation was randomized within three parameters: the existing intact class levels (pupils in Year 1 and Year 2, respectively), an equal distribution of lower starting level readers in both research groups, and a reflective consultation with the teaching staff of the participating school. Therefore, the setting was not fully quasi-experimental, but was defined as an experimental research design. See Table 2 for the distribution of lower starting level readers in the experimental and comparison groups.

Materials (Apparatus)

Measures

The literacy assessment method used, the national literacy assessment test ALLU (Ala-asteen Lukutesti [Reading Test for Primary School], Lindeman, 2005), is used as a screening tool by special education specialists in the Finnish education system. This Finnish literacy assessment tool is used for the annual assessment of primary school pupils’ literacy skills and as a light initial screening for possible challenges in the development of literacy skills. The tool presents results in three different ways in order to accommodate different purposes for examining the data: raw scores, relative T-scores standardized for performance at the national grade level, and level group norms. The level group norms are sensitive to the age of the participants, and it is recommended that those students whose score remains at levels 1–3 should be followed up, as a low starting level may indicate challenges in developing literacy skills, or even an impairment in literacy skills. The reliability of the outcome measure has recently been assessed by Ståhlberg et al. (2020), with Cronbach’s alpha = 0.82 for the technical literacy skills tool. At all four measurement points in our study, the test was administered as a paper-and-pencil test with a test-specific time limit to complete the task. The assessment was carried out in collaboration with the first author (pre- and post-measures) or with the special educator of the participating school (follow-ups).

Participants’ working memory performance was assessed before and after the intervention using a visuospatial test called Corsi blocks (Vandierendonck et al., 2004). The Corsi blocks task administered was a computerized, visual, forward-order recall memory task. The test was obtained from PEBL (the Psychology Experiment Building Language, version 0.13; Mueller & Piper, 2014; available at http://pebl.sourceforge.net), an open-source software platform that contains a collection of neuropsychological tests that can be administered electronically. The reliability of the Corsi blocks outcome measure was assessed by Berg (2008) with typically developing child participants, where Cronbach’s alpha was 0.67. Participants’ visuospatial working memory performance was assessed by the first author in a quiet classroom environment using a laptop computer in individual test sessions (approximately 10 min per participant). Farrell Pagulayan et al. (2006) concluded that the mean memory span for 7-year-old typically developing participants using this test is 5. Within our data, the mean raw performance score at the pre-measurement point (0 months) was 4.5 for the experimental group and 4 for the comparison group.

Rhythm Intervention

The research intervention was embedded within regular weekly music lessons. During the Autumn term of 2017 (a period of approximately 3 months, 13 sessions in total), half of the school’s Year 1 and Year 2 student population received either specifically designed music instruction or regular basic music lessons delivered by the same teacher. Both groups received an equivalent amount of music instruction, but the content of the instruction was different. The experimental group (EG) received music education with a strong focus on rhythm, whilst the comparison group (CG) received similar music education content, but without a specific focus on rhythm. The intervention for both groups was controlled by the same music educator. Put simply, the difference between the EG and CG was in the choice of instruments (EG used percussive instruments, such as djembes and body percussion, whereas CG did not) and the inclusion (EG) or exclusion (CG) of music and movement activities in the structured music curriculum. The music education of the experimental group can be characterized as lessons that emphasized percussion, body percussion, rhythm and movement. The lessons of the comparison group did not include any movement-based teaching methods, nor specific rhythmic exercises during the intervention period.

Rhythm and Movement Exercises is an umbrella term for embodied music education methods that use the learner’s body as a tool to experience and learn the basic elements of music, such as dynamics, rhythm and intonation. These exercises aim to use multisensory learning mechanisms (Shams & Seitz, 2008) as fully as possible. Participants used their bodies, space and some simple percussion instruments (e.g., djembes) as tools to study the basic elements of music, which was the basis of Carl Orff’s Schulwerk educational approach in the 1930s (Orff, 1963). These fundamentally “body-centered” teaching methods expand the possible sensory inputs of learning to include the sense of touch, sense of balance and spatial perception (Kayili & Kuşcu, 2020).

Lessons lasted from 20 to 40 min, depending on the size of the group (group sizes ranged from 10 to 17 pupils). Shorter (20 min) lessons were implemented for smaller (maximum 10 pupils) groups and longer (40 min) music lessons for larger (maximum 17 pupils) groups. This division of group size and length was implemented at the request of the participating Primary school; this was the weekly lesson plan schedule that the school could provide. The total length of the music treatment was controlled to be the same for both the experimental and comparison groups. The program for the experimental group was designed to promote awareness of beat perception and the internalization of musical rhythm by the participants. The intervention exercises were designed and implemented by the first author and were based mainly on Western music education materials (e.g., specialized music and movement). For the comparison group, the training was designed to include music education objectives that were pedagogically similar to those of the experimental group (e.g., learning the same song), but the teaching methods for the comparison training did not include additional rhythm training, nor movement-based practice. Examples of the content of the intervention during one week in the experimental and comparison groups can be found in Supplementary Material 1.

Procedure

Timeline of the Study

The study was based on an intervention with a pre-test/post-test design. At the pre-measurement point (0 months), all participants were tested with the national literacy skill evaluation ALLU test and a visuospatial working memory test (Corsi blocks). At the post-measurement point (3 months), the same battery was used to evaluate any possible change in literacy skill and working memory performance. The literacy skill development was followed up at two measurement points (8 and 20 months) with the ALLU test. See Fig. 1 for the study timeline.

Fig. 1 Timeline of the study

Statistical Analyses

Non-parametric tests were chosen for the analyses because the sample sizes were small and varied between analyses. Reading skills were evaluated at four time-points (before the intervention and at 3, 8 and 20 months after the starting measurement) and working memory at two time-points (before the intervention and after the intervention period at 3 months). Differences between the comparison and experimental groups in the development of literacy skills across the four time-points were tested using an independent-samples Mann-Whitney U test, and development within each group was tested using a related-samples Friedman’s two-way analysis of variance. Change in working memory performance across the two time-points was investigated within each group with a related-samples Wilcoxon signed rank test. A third analysis was conducted to detect possible effects in the literacy skill performance of the at-risk group, i.e., students whose results were at levels 1–3 at the initial (0 months) measurement point. This was tested using an independent-samples Mann-Whitney U test and subsequently investigated with a related-samples Friedman’s two-way analysis of variance. All analyses were conducted using IBM SPSS Statistics software for Windows, Versions 26.0 and 27.0.
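To make explicit what the between-group comparison computes, the following is a minimal sketch of the Mann-Whitney U statistic obtained by counting pairwise comparisons. It is an illustration only: the function name and the gain scores are hypothetical and are not the study's data, and the sketch does not compute a p-value, which the authors obtained from SPSS.

```python
from itertools import product


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples by pairwise counting."""
    # Each (xi, yj) pair adds 1 to U_x if xi > yj, 0.5 if tied, and 0 otherwise.
    u_x = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi, yj in product(x, y)
    )
    # The two U values always sum to the number of pairs.
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical gain scores (later level minus pre level); NOT the study's data.
experimental_gain = [3, 2, 4, 1, 3]
comparison_gain = [1, 0, 2, 1]

u_exp, u_cmp = mann_whitney_u(experimental_gain, comparison_gain)
u_stat = min(u_exp, u_cmp)  # many textbooks report the smaller of the two
print(u_exp, u_cmp, u_stat)
```

A U value far from half the number of pairs (here 20 / 2 = 10) indicates that one group's scores tend to exceed the other's.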

Results

Literacy Skill Evaluation (ALLU) Results

For the subsequent statistical analysis, we used the standards of the grade level corresponding to the age of the participants. This was done according to the protocol for “comparing the effects of training on research groups” (Lindeman, 2005).

There was no statistically significant difference in literacy skill development between the research groups at any of the time-points according to the independent-samples Mann-Whitney U test. The related-samples Friedman’s two-way analysis of variance showed statistically significant literacy skill development in both groups over the time course of 0 to 20 months (experimental group: χ²F(3) = 17.587, p = .001, W = 0.2; comparison group: χ²F(3) = 10.815, p = .013, W = 0.1). Across the whole measurement period, the development of literacy skills was highly similar in the two groups. Both groups improved in their literacy skills by the end of the first academic year (see measurement points 1 to 3 in Fig. 2A). A minor deterioration of results between measurement points 3 and 4 most likely results from a weakness of the ALLU evaluation tool: when scored against the grade-level standards corresponding to participants’ age, it does not respond sensitively enough for students who achieved basic literacy skills during the first academic year of this study. This development can be seen as a peak between measurement points 2 and 3 in Fig. 2A.
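The reported W values are consistent with the standard relation between the Friedman statistic and Kendall's coefficient of concordance, W = χ² / (N(k − 1)). The sketch below checks this under two assumptions taken from the Participants and Statistical Analyses sections rather than from the results themselves: N = 26 pupils per group and k = 4 measurement points. The function name is hypothetical.

```python
def kendalls_w(chi_square: float, n_subjects: int, n_conditions: int) -> float:
    """Kendall's W from a Friedman chi-square: W = chi2 / (N * (k - 1))."""
    return chi_square / (n_subjects * (n_conditions - 1))


# Assumed from the text: 26 pupils per group, 4 measurement points.
w_experimental = kendalls_w(17.587, n_subjects=26, n_conditions=4)
w_comparison = kendalls_w(10.815, n_subjects=26, n_conditions=4)

print(round(w_experimental, 1), round(w_comparison, 1))
```

W ranges from 0 (no consistent change in ranks across measurement points) to 1 (every participant shows the same rank order).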

Fig. 2 Mean plots representing the mean score differences between the research groups. Note. (A) Mean plot of the age-matched level values of the literacy skill evaluation (ALLU) scores from pre (0 months), post (3 months), follow-up 1 (8 months) and follow-up 2 (20 months) measurement points for all participants. (B) Mean plot of the pre (0 months) and post (3 months) test scores of the Corsi blocks working memory test for all participants. (C) Mean plot of the age-matched level values of the literacy skill evaluation (ALLU) scores from pre (0 months), post (3 months), follow-up 1 (8 months) and follow-up 2 (20 months) measurement points within the lower starting level participants

Visuospatial Working Memory Test (Corsi Blocks) Results

The raw visuospatial working memory performance score (the length of the sequence recalled by participants) was converted using the PEBL platform to a more representative value of overall task performance, namely the memory span score (the minimum list length with the total number of correct answers divided by the number of lists of each length). This score was used as the main performance outcome score in the analysis.

The Corsi blocks test performance was investigated before (0 months) and after (3 months) the intervention period. Possible differences in development between the experimental and comparison groups across this time were analyzed with a non-parametric Wilcoxon signed rank test. The analyses revealed a moderate change in the experimental group between the pre (Mdn = 4.0) and post (Mdn = 4.0) measures (T = 207.500, Z = 2.654, p = .008, r = .5), while the change in the comparison group (pre Mdn = 3.5, post Mdn = 3.5) was small and not statistically significant (T = 162.500, Z = 1.193, p = .233). Figure 2B represents the mean score difference between the research groups.
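The reported effect size can be reproduced with the common conversion r = Z / √N. The sketch below assumes that N is the number of participants in each group for the working memory test (29 and 27, as given in the Participants section); some authors instead use the total number of observations, which would give smaller values. The function name is hypothetical.

```python
import math


def effect_size_r(z: float, n: int) -> float:
    """Effect size r for a rank-based test: r = Z / sqrt(N)."""
    return z / math.sqrt(n)


# Assumed group sizes from the Participants section: 29 and 27 pupils.
r_experimental = effect_size_r(2.654, 29)
r_comparison = effect_size_r(1.193, 27)

print(round(r_experimental, 2), round(r_comparison, 2))
```

Under this assumption the experimental group's value rounds to the reported r = .5, while the comparison group's value is about .23.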

Literacy Skill Evaluation Measure (ALLU) Results Among the Subgroup of Lower Starting Level Readers

The number of lower starting level readers in the experimental group (n = 26) was 15 by the last measurement point. In the comparison group (n = 26), the number of lower starting level readers had dropped to 12 towards the end of the study. A significant difference was observed in the development of literacy skills among the lower starting level readers in the experimental group. A Mann-Whitney U test revealed that the experimental group’s subgroup advanced significantly more in literacy skills (Mdn = 2.5) than the comparison group’s subgroup (Mdn = 0.75) when comparing performance at the pre- and follow-up 2 measurement points (U = 47.5, p = .037). When investigated with the related-samples Friedman’s two-way analysis of variance, the development of the experimental group’s subgroup across all four measurement points showed a large effect (χ²F(3) = 26.979, p < .001, W = 0.6), whilst the development in the comparison group’s subgroup remained small and non-significant (χ²F(3) = 3.667, p = .300). The literacy skill development of the lower starting level participants in the experimental group (n = 15) compared to the comparison group (n = 12) is illustrated in Figs. 2C and 3.

Fig. 3 Boxplots representing lower starting level participants’ scores from pre (0 months) and follow-up 2 (20 months) measurement points in literacy skill evaluation

Discussion

Our research question was: Does enhanced rhythm training improve the development of literacy skills and working memory performance in pupils learning to read in their first and second years of school? Due to the natural development of reading in Years 1 and 2, we hypothesized that both groups would develop their reading skills over the course of our study. We also hypothesized that it might be possible to see an effect of the rhythm training in the subgroup of readers with a lower starting level. As expected, all the children learned to read, and there was no statistically significant difference between the experimental and control groups across the whole sample.

Nevertheless, our findings suggest that a rhythm-based music education intervention could be beneficial for the development of literacy skills among students with lower starting levels. In our sample, we found evidence of transfer effects within the subgroup of lower-starting-level readers, which accounted for about half of the participants. When we started the study, the enrolled students represented the average first and second years of Primary school in Finland. These students were screened for possible reading difficulties before the intervention using the national reading assessment test ALLU (Ala-asteen Lukutesti [Reading Test for Primary School], Lindeman, 2005). From the total sample of 52 students, those considered to be at a lower starting level (students whose scores were at levels 1–3) were divided into 15 students in the experimental group and 12 in the control group. Perhaps the most important finding of the current study is that the differences between the experimental and control groups were particularly pronounced among these lower-starting-level students.

Our results also show that, among students who started at a lower level, the literacy benefits of the intervention relative to the control group were sustained 17 months after training. In our data, the experimental group’s lower starting-level students outperformed their control group peers almost one and a half years after the training (see Figs. 2C and 3). The analysis of the performance of students with lower initial literacy levels supports other findings that suggest a contribution of timing skills to pre-literacy and literacy skills in general (Bonacina et al., 2021; Gordon et al., 2015; Lundetræ & Thomson, 2018; Ozernov-Palchik et al., 2018; Politimou et al., 2019; Steinbrink et al., 2019) and particularly among children with reading difficulties (e.g., Bégel et al., 2022; Corriveau & Goswami, 2009; Goswami et al., 2013; Huss et al., 2011; Thomson & Goswami, 2008). Considering our findings together with the previous literature, we suggest that training musical rhythm skills with embodied, dance-like music and movement exercises may be beneficial for the development of literacy skills. Depending on the student’s own interests and aptitudes, this training could be particularly useful for students who need extra help with their literacy development.

A moderate positive effect on working memory performance was observed in the experimental group (EG) at the level of the whole group. As outlined in the aims of this study, working memory functions have been reported to support reading skills. Working memory functions are thought to be important in three main areas of literacy (e.g., Baddeley & Hitch’s working memory model, 1974): phonological, visuospatial, and executive information processing. The Corsi Blocks forward test used in our study is thought to tap into the visuospatial sketchpad of working memory (WM) and, more specifically, the capacity for dynamic information processing (see Fischbach et al., 2014, for more on the comparison of neuropsychological measures of WM with primary school children in the context of literacy). The experimental intervention in our study was music and movement-based training that specifically targeted the participants’ spatial perception. Working memory functions are considered to be one of the main causal factors in literacy disorders (e.g., Fischbach et al., 2014), and therefore our moderately positive outcome within the experimental group in these skills post-intervention can be seen as a supportive finding. As working memory continues to develop into early adulthood, it is a function that can be, and is being, trained, particularly in school settings. According to our findings, one way to develop working memory skills could be training similar to the embodied music rhythm training used in our study. Our findings are also in line with previous studies reporting the contribution of working memory performance to rhythm perception (Ahokas et al., 2023; Ireland et al., 2018; Lesiuk, 2015).

Interdisciplinary research has indicated that there is a connection between rhythm perception and production abilities, motor performance and literacy skills. However, there is little evidence of a strong line of practical rhythm and movement implementations related to literacy skill development in the fields of music education, education, and special education. Only a few registered methods have been developed for remediation, such as Marion Long’s Rhythm for Reading programme in the UK (Long, 2014). Such a programme does not exist in Finland, and the supportive research findings reporting a link between pre-literacy and rhythm skills (e.g., Bonacina et al., 2021; Gordon et al., 2015; Lundetræ & Thomson, 2018; Ozernov-Palchik et al., 2018; Politimou et al., 2019; Steinbrink et al., 2019) have not yet reached the fields of practical teaching and special education. Furthermore, it can be argued that the reported benefits of music and movement, rhythm and movement or dance educational interventions (Bugos & DeMarie, 2017; Frischen et al., 2019; Rudd et al., 2021; Shen et al., 2020; Zach et al., 2015; Zinelabidine et al., 2022) could be more widely used in practical applications, such as in classrooms or as rehabilitation tools. The collaborative project between a music school and a general primary school initiated by the lead researcher of this study, called the Koskela project (http://koskelakmo.blogspot.com), is still in operation eight years after its initiation, providing the Primary school with additional music education two days a week and an ‘El Sistema’-style community orchestra activity. In Finland, the expertise of music professionals in relation to the development of literacy skills is relatively unexplored. Launching the Koskela project between 2016 and 2018 and now reporting on its literacy outcomes is a step towards building and disseminating research-based knowledge about why and to whom music and rhythm-based activities could be offered as part of broader learning goals.

Practising music educators may even be unaware of their own expertise in supporting literacy skills. Music educators might perceive this additional expertise as a burden, as an extra in an already overloaded curriculum or, above all, as practical work that is not included in the (Finnish) primary school arts curriculum in the first place. Moreover, it likely requires additional skills to be able to cooperate and share their knowledge with general teachers in a way that would benefit the general teachers’ practical work in this particular field. This, in turn, requires resources and expertise in the training phase of music educators and, in particular, general educators for whom music is a marginal curricular focus. It has been suggested that the area of research that focuses on the impact of music education, grounded in the curricular realities of schools, could be more deeply developed and supported (Kraus & Chandrasekaran, 2010). By focusing on the effects of music education in a mainstream Primary school, the current study sought to meet some of the critical standards of ecological and naturalistic validity. The choice of setting was based on the aim of providing benefits to all involved (Primary school, children, research centre, and music education specialists). The study aimed to support the dialogue between research and practice and to test and develop music education methods that could potentially be effective in educational and rehabilitation contexts (cf. Ahokas, 2022).

Our study was conducted in a research setting that could be defined as a close-to-practice design (Wyse et al., 2018). In such settings, the role of the participating school is fundamental. Similar novel research designs in a naturalistic setting could achieve higher quality by making more elaborate agreements with the participating schools at the outset, for example regarding the research design. Collaboration requires considerable work, as well as collaborative and communication skills from the researcher that are different from typical academic or researcher skills (Lai et al., 2016; Lawson, 2004). In addition, for the collaboration to be carried out to the highest possible quality, the participating school requires a feasible amount of additional time and resources. As the aim of a naturalistic study design embedded in the school day is to support both the study and the curricular aims of the school, the implementation of such a design is likely to require a highly multidisciplinary approach.

Ecological validity is a high-standard goal for any scientific work (Andrade, 2018). It could be suggested that in science there is a continuous balance between validity in the clinical laboratory and ecological validity in the real world, and that perfection or balance between the two might even be an impasse. The challenges of a natural and exploratory research setting (close to practice) can include, for example, the lack of clinical power (Andrade, 2018). A laboratory research setting might offer better experimental control over potential extraneous variables that might confound the results, and might prove cheaper to run. However, the clear potential benefits of a developmental project embedded in the school day for educators (including special educators), psychologists and, most importantly, pupils may not be achieved to the same extent. In this study, the challenges were directly manifested in the possibility of controlling the randomization of the study groups, which could be said to compromise the validity of the research. As the setting was close to practice, the objective had to be in line with the curricular objectives of the participating primary school, and therefore randomization was carried out in collaboration with classroom teachers. In other words, the randomization of the groups depended on the expertise of the class teachers as to how the group would function in a way that would support the participants’ learning holistically, but still make the groups research-ready. Another highly relevant challenge of this study was related to the age group, as the curriculum for first- and second-year pupils in Finland is essentially focused on learning the rudiments of reading, and therefore it was expected (and certainly hoped) that the reading performance of both the experimental and comparison groups would improve during their first two years of primary school. This was expected to reduce the statistical power to detect the hypothesized effect of experimental rhythmic training on language learning. It is also important to note that the objectivity and neutrality of the research can be questioned, as the researcher who provided the instruction was aware of the hypothesis and the group assignments. One limitation of the study was that we only had access to the level values of the literacy evaluation (ALLU), not the raw scores.

Conclusions

The present study investigated the effects of rhythm and movement-based training on reading ability and related cognitive functions. One of the main aims was to increase knowledge about the possible role of rhythm and movement as one of the important mechanisms underlying the effects of music on cognition reported elsewhere in the literature (e.g., Loui et al., 2019; Shen et al., 2020). Our findings focused on working memory and literacy, and we suggest that it would be worthwhile to further investigate the role of rhythm and motor skill development as a mechanism linking effective music education to broader cognitive development, including the other aspects of executive function. We suggest that the current design should be replicated with a larger number of participants to confirm or challenge the findings on a larger scale.