Theoretical background

Response to intervention

Response to Intervention (RTI) represents a major shift in educational methodologies. Instead of relying on reactive models, RTI places a significant emphasis on proactive and early interventions, specifically targeting students who present academic or behavioral challenges (Fletcher and Vaughn 2009; Hughes and Dexter 2011; Gersten et al. 2020). RTI’s academic and societal aim is to ensure equitable opportunities for all students, irrespective of their inherent abilities or socio-economic backgrounds (Horner et al. 2017). Its distinct multi-tiered structure is designed to optimize the distribution of educational resources, contrasting sharply with traditional models such as the IQ-discrepancy method used in the identification of learning disabilities (Fuchs and Fuchs 2006).

The RTI model is structured around a tiered system designed not only to identify students in need of additional support, but also to integrate specialized interventions with the core curriculum:

Tier 1: Universal instruction and screening

At this foundational level, all students participate in high-quality, research-based instruction provided within the general education environment. Periodic universal screenings occur, assessing students’ progress and mastery of core curriculum content. These screenings serve as proactive measures to detect early signs of academic struggles, ensuring timely interventions (Greenwood et al. 2011).

Tier 2: Targeted interventions

For students identified through Tier 1 screenings as not making adequate progress in the core curriculum, Tier 2 offers targeted interventions. These interventions are tailored to students’ unique needs, often in small group settings, and complement the foundational Tier 1 instruction. Progress monitoring tools are deployed more frequently here, providing educators with data on the efficacy of the targeted interventions and whether students are closing the academic gap (Cho et al. 2014).

Tier 3: Intensive interventions

This tier is reserved for students who, despite Tier 2 interventions, continue to display minimal academic progress. The interventions at Tier 3 are heightened in intensity, often delivered in one-on-one settings. The data from these interventions, combined with a comprehensive evaluation, can serve as a basis for considering students for special education services if required (Jimerson et al. 2016).

Throughout the RTI tiers, the connection between interventions and the core curriculum is vital. Each intervention layer is designed to scaffold and reinforce the foundational content provided in the core instruction. This structure ensures not only the remediation of skill deficits but also the alignment of interventions with grade-level expectations and standards (Jimerson et al. 2016).

Extant literature underscores RTI’s efficacy in both outcomes and cost-effectiveness compared to traditional methods (Van Der Heyden et al. 2007; Torgesen 2009). Of note is RTI’s inclusiveness, especially benefiting low-socioeconomic-status students who might be underrepresented using older models like the IQ-discrepancy method (Fuchs and Fuchs 2006; Eccles and Roeser 2011).

RTI and multi-tiered system of supports

Another important educational framework is the Multi-Tiered System of Supports (MTSS). The MTSS framework is widely adopted to optimize student outcomes by integrating both academic and non-academic interventions. MTSS is grounded in data-driven decision-making processes to identify and support students who may be at risk in various domains, from academic challenges to behavioral and socio-emotional concerns (Sugai and Horner 2003). Through its tiered structure, MTSS ensures that interventions are appropriately intensified based on individual student needs, promoting a holistic educational environment where students receive support tailored to their unique requirements (McIntosh et al. 2009).

Response to Intervention (RTI), while similarly tiered in its structure, is a narrower framework primarily concentrated on academic interventions. It serves as a mechanism to address learning difficulties in the early stages of education, typically during Kindergarten or grades 1 or 2, especially when children are acquiring foundational skills in areas like reading and mathematics (Gersten et al. 2017). In essence, while RTI can be viewed as a component or subset of MTSS focusing on academics, MTSS offers a more expansive and encompassing approach by addressing both academic and non-academic student needs.

Several studies highlight the effectiveness of both frameworks. RTI, with its proactive early academic intervention, has been correlated with improved student performance in reading and mathematics and a reduction in referrals to special education compared to traditional educational models (Fuchs and Fuchs 2006). MTSS, due to its broader scope, is associated not only with enhanced academic outcomes but also with a decrease in behavioral issues and an improvement in the socio-emotional well-being of students (Benner et al. 2013). In environments devoid of RTI or MTSS structures, the introduction of these systems often yields heightened positive student outcomes, streamlined resource allocation, and increased teacher satisfaction.

RTI’s implementation

Despite these recognized advantages, few educational systems have implemented an RTI model; notable exceptions are the USA (Berkeley et al. 2020) and Finland (Jahnukainen and Itkonen 2016). The main reason is that, in most countries, the models of providing special education services are top-down, i.e., developed by governments and imposed on schools, and educational policymakers are historically slow to implement innovations (Serdyukov 2017). A second challenge arises because the model depends heavily on human resources for screening, monitoring, and intervention, which makes it financially unfeasible for many educational systems (Fletcher and Vaughn 2009).

One of the educational systems that implements neither RTI nor MTSS is the Catalan system, where this study takes place. In Spain, each autonomous community holds competence over its educational system. Catalonia, as one of these autonomous communities, sets its educational parameters independently. Historically, within the Catalan education system, and indeed across the whole of Spain, there has been no official policy or framework implementing either RTI or MTSS. Despite international research supporting these frameworks, they have not yet been integrated into Spanish regional educational systems. This absence, while in line with broader global hesitations towards systemic changes in education, underscores the potential benefits that could be realized should such frameworks be considered for future adoption. Notably, none of the participating schools in this study had prior knowledge of or exposure to the concepts of RTI or MTSS. This unfamiliarity offers a unique opportunity: to observe the potential impacts and benefits of these interventions in an environment free of preconceived notions or biases related to these specific educational strategies.

Difficulties in implementing RTI

Given the assessment-centric nature of RTI, refining traditional screeners, especially in mathematics, becomes a pressing concern. Current screeners often narrow their focus on specific mathematical concepts, which can neglect broader areas of mathematical understanding (Butterworth 2003). This narrow focus not only restricts their diagnostic reach, but can also misrepresent a student’s holistic mathematical abilities (Reigosa-Crespo et al. 2012). Additionally, many of these tools bear cultural and linguistic biases, which may disadvantage diverse student groups in increasingly multicultural classrooms (Geary et al. 2017).

Moreover, the substantial reliance on teacher observations, prevalent in many screeners, introduces the potential for subjective biases. While the insights of educators are invaluable, relying solely on them may not paint a comprehensive picture of a student’s genuine capacities or needs (Butterworth 2003).

Digital, semi-automated tools emerge as a promising solution. Not only can they reduce the human resource costs associated with RTI’s implementation, but they can also enhance the accuracy and consistency of screenings and interventions (Wilson et al. 2006; Cho et al. 2014; Butterworth and Laurillard 2016).

Arithmetic fluency

Our study focuses on a specific academic area: arithmetic fluency. Arithmetic fluency, defined as the ability to solve basic arithmetic operations quickly and accurately, serves as a cornerstone for higher-level mathematical reasoning and problem-solving (Jordan et al. 2009). Jordan et al. (2009) elucidate that a solid grasp of arithmetic in the early years significantly predicts success in algebra and other advanced math topics later on. Furthermore, research by Vasilyeva et al. (2015) indicates that deficiencies in arithmetic fluency can cascade into challenges in more complex mathematical tasks, as students struggle with basic calculations, diverting cognitive resources from grasping higher-level concepts.

Recognizing its paramount importance, interventions targeting arithmetic fluency have shown promise in not only enhancing this foundational skill but also in better equipping students for subsequent mathematical challenges (Geary 2011). By aligning RTI with these evidence-based interventions, there lies a profound opportunity to substantially impact students’ mathematical trajectories.

In this study, two critical cutoff points, based on indices of the screener, were employed to categorize students based on their arithmetic fluency. First, a cutoff at the 30th percentile was used to define students in the “low average” category. This cutoff is common in educational research for differentiating students who perform below the average range but are not necessarily at significant risk for learning disabilities (Desoete and Grégoire 2006). Importantly, the 30th percentile was chosen as it lies on the upper end of commonly used cutoff percentages, which often extend as low as 25% (Geary et al. 2007). The rationale for selecting the upper limit was to capture as broad a spectrum of students as possible.

Second, a more stringent cutoff at the 15th percentile was used to identify students “at risk of Mathematics Learning Difficulties” (MLD). This percentile roughly corresponds to being one standard deviation below the mean, a metric regularly used in both psychological and educational research to identify significant deviation from average performance (Jordan and Hanich 2003). While the 15th percentile is at the upper limit of commonly used cutoffs—some research employs even lower percentiles like the 10th—it serves as an inclusive measure that aims to reduce the risk of false negatives (Jordan and Hanich 2003; Desoete and Grégoire 2006; Geary et al. 2007).
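The correspondence between the 15th percentile and one standard deviation below the mean can be checked numerically. The following is a minimal sketch that assumes scores are approximately normally distributed (an assumption made here for illustration; the Analysis plan notes that the fluency gains were not normally distributed). The variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist()

# Share of a normal population scoring more than 1 SD below the mean.
share_below_minus_1sd = standard_normal.cdf(-1.0)   # about 0.159

# z-scores that correspond to the two cutoffs used in this study.
z_at_15th = standard_normal.inv_cdf(0.15)           # about -1.04
z_at_30th = standard_normal.inv_cdf(0.30)           # about -0.52

print(f"P(z < -1)          = {share_below_minus_1sd:.3f}")
print(f"z at 15th pctile   = {z_at_15th:.2f}")
print(f"z at 30th pctile   = {z_at_30th:.2f}")
```

Under this assumption, one standard deviation below the mean sits near the 16th percentile, so the 15th percentile is only slightly stricter, and the 30th percentile lies roughly half a standard deviation below the mean.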

It is worth noting that both the 30th and 15th percentile cutoffs were chosen to be intentionally inclusive. Given that our assessment tools might not be as precise as those employed by trained psychologists, the upper-threshold cutoffs helped to mitigate the risk of false negatives. The focus was to ensure that students who might require additional support were not inadvertently excluded from receiving targeted interventions.
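The two cutoffs described above can be sketched as a simple classification rule. This is an illustration only, not the screener's actual implementation: it assumes that a student's percentile rank is the share of scores strictly below their own within the screened sample, and the function names and toy scores are hypothetical. Students below the 15th percentile are also below the 30th; the sketch reports only the most severe label.

```python
def percentile_rank(score, all_scores):
    """Percentage of scores in the sample that are strictly below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)


def classify(score, all_scores, low_cutoff=30, risk_cutoff=15):
    """Label a student using the 15th and 30th percentile cutoffs."""
    rank = percentile_rank(score, all_scores)
    if rank < risk_cutoff:
        return "at risk of MLD"
    if rank < low_cutoff:
        return "low achievement"
    return "medium-high"


# Toy data: 20 students with distinct fluency scores 1..20 (not study data).
scores = list(range(1, 21))
labels = [classify(s, scores) for s in scores]

for label in ("at risk of MLD", "low achievement", "medium-high"):
    print(label, labels.count(label))
```

With other percentile definitions (for example, mid-rank handling of ties or external norms), students near a cutoff could be labeled differently, which is one reason inclusive cutoffs reduce the risk of false negatives.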

Study overview

In this research, the principal objective was to examine the efficacy of semi-automated digital tools designed to bolster the implementation of Response to Intervention (RTI) in schools unfamiliar with the framework. This inquiry was driven by the understanding that, while the literature reviewed above supports RTI as more effective than traditional models at identifying and addressing learning difficulties, its adoption remains limited, largely because of the challenges of implementing it in the absence of institutional policies.

The study encompassed 13 schools in total, with 5 schools that integrated the RTI framework through the utilization of the digital tools, and 8 schools that continued their regular teaching methods without the aid of RTI—serving as the control group. The results highlighted two key findings: First, students in the intervention group demonstrated a significant improvement in arithmetic fluency compared to their peers in the control group. Second, these students were also more likely to progress out of zones categorized as “low achievement” or “at risk of Mathematics Learning Difficulties”. Together, these results point to the statistical significance and practical importance of the intervention in improving arithmetic fluency and helping students move out of risk categories.

This study explores the potential of these tools within a K-12 curriculum context, offering an approach that integrates with regular curriculum tools. This integration seeks to address barriers such as resistance from educators and administrators while allowing for a scalable, grassroots approach to RTI in educational landscapes where policy does not back it.

Materials and methods

Participants

The study involved 418 first-grade students from 13 schools in Catalonia, Spain (Fig. 1). Among these, 5 schools (comprising 149 children) were provided with RTI tools, while the other 8 schools (comprising 269 children) served as a control group. Among the intervention schools, 4 were public and had low socioeconomic status (SES), and the remaining one was a charter school with a high SES. The control schools included 6 public schools, with 4 having low SES and 2 with average SES. The remaining 2 control schools were charter schools, with one having a high SES and the other having an average SES.

Fig. 1

An overview of the study’s methodology and key findings. We administered an arithmetic fluency test to 418 grade 1 students in 13 schools in Catalonia, Spain. Of these schools, 5 were in the intervention group and 8 in the control group. In the intervention schools, we selected the 48 students who scored below the overall 30th percentile (Low Achievement (LA) regime). These students then participated in an intervention that consisted of 15 min of extra mathematics practice per day, four days per week, for 15 weeks. Afterward, we administered the same test again to all the students. We found that the children who underwent the intervention showed statistically significantly higher arithmetic fluency. Additionally, after the intervention, a significantly smaller percentage of them remained in the LA regime compared to those in the control group. Note: normally achieving children who fell into the LA regime in the post-test are still shown in green in order to keep the overall message clear.

All 13 schools were approached at the beginning of the academic year and invited to participate in the study. Each school chose whether to be included in the control or the experimental group. Schools signed a data protection agreement with Innovamat Education SL (the curriculum provider), as well as with the families of the children participating in the study.

Tools

Tier I

Regarding the multi-tiered instruction required by RTI, the first Tier consists of the research-based mathematics curriculum proposal developed by Innovamat, which is currently implemented in more than 1,700 schools across 7 countries. The main teaching ideas behind the Innovamat curriculum can be summarized as follows:

  • Contents and processes: A global tendency in mathematical teaching practices is to opt for a dual approach which includes both content and processes (OECD 2018). This core curriculum achieves it by providing the teacher with a very detailed guide for every session, in which every activity is educationally justified. Each activity contributes to one or more of the mathematical processes as described by NCTM Principles and Standards (2008): problem-solving, reasoning and proof, connections, communication and representation. This dual focus is critical for performance outcomes, as it builds both conceptual understanding and procedural fluency (Hiebert and Grouws 2007).

  • Fostering a problem-solving environment: Research shows that problem-solving fosters mathematical understanding and improves performance outcomes (Schonfeld 2016). Innovamat encourages students to engage in open-ended questions and peer discussions, fostering an environment conducive to research, analysis of mistakes, comparison of strategies, and reasoning. This aligns with Vygotsky’s Social Constructivism, suggesting that peer interactions can significantly impact learning (Vygotsky 1978).

  • From manipulation to abstraction: based on the ideas by V. M. D. Heuvel-Panhuizen (2008), a scaffolding mechanism is implemented in which learning trajectories start from concrete materials to reduce abstraction. In subsequent steps, abstraction is sought as materials are gradually withdrawn until children, through representation, have understood the mathematical concepts behind them. This scaffolding technique has been shown to facilitate understanding and improve performance in mathematical tasks (Gersten et al. 2009).

  • Practicing in meaningful contexts: two types of practice are included, productive and reproductive. Productive practice is carried out by the whole class after an open-ended question has been introduced. Research indicates that context-based, meaningful practices improve retention and application of mathematical skills (Boaler 1993). Reproductive practice is carried out through a digital application, which helps children practice mathematical concepts and procedures in a gamified setup that self-adapts to every student’s path and provides continuous formative feedback. Numerous studies provide evidence that adaptive learning technologies can enhance performance outcomes (Kulik 2003; Shute 2008; Koedinger et al. 2010).

Tier II

In this study, a second Tier of support was designed specifically for low-achieving children. The Tier II support was facilitated using a digital app for 15–20 min, four times per week. While many activities in Tier II are analogous to those in Tier I, a distinct and targeted activity was integrated for children demonstrating low achievement in mathematics. This activity is rooted in the insights of Butterworth and Laurillard (2016).

Special activity for Tier II

At the heart of our Tier II intervention is a game-based activity inspired by Butterworth and Laurillard’s (2016) work (Fig. 2). The activity deploys a series of beads that children can assemble or disassemble to produce sets with varied item counts. A student’s task involves utilizing these digital beads to replicate a specific target number presented to them. Each unique bead count is color-coded, aiding students in associating a particular hue with a number. As students become accustomed to the game, the Arabic numeral equivalent of the bead count is incorporated into the design, fostering an association between numerosity and its symbolic representation. Once students exhibit mastery over this association, the color cues are withdrawn, ensuring that their numerical understanding is not color-dependent. The cardinality of these target numbers is progressively amplified, starting from 2 and culminating at 10.

Fig. 2

Screenshot of the activity to build the concept of number as a set. In this activity, children can split or join strings of beads to match the objective set, above

Children spent around 5 min a day practicing this activity, 5 more minutes on Kindergarten-level activities, and 5 more minutes on grade 1 activities. The latter two were drawn from the usual Tier I practice activities, and their details are elaborated in Supplementary Material I.

Screening

The screening materials have also been developed in the framework of this study and comprise a set of 8 tasks: motor speed, whereby the child has to press a stimulus appearing on the screen; visual processing speed, similar to WISC’s symbol search (Wechsler 2014); number knowledge, where children have to select the bigger of two numbers presented simultaneously; an Approximate Number Sense task based on Halberda et al. (2008); a numerical line of range 0–100; a Working Memory task, where children need to remember some images while being distracted by other images; a reasoning task based on Raven’s progressive matrices (Raven and Raven 2003); and a normed arithmetic fluency test, where children need to solve additions and subtractions in the 0–20 range as fast as possible. A detailed explanation of the tasks is provided in Table 1. For more information about their validity and reliability measures, please see Supplementary Material II.

Table 1 Explanation of the tasks involved in the pre- and post-screening tests

The test is administered collectively, and each task is preceded by up to 5 examples that the whole class completes together. The test administrator makes sure that all children have understood the task before instructing the entire class to perform it by themselves, individually. The total duration of the test is around 45 min.

Arithmetic fluency (reflecting better addition and subtraction strategies as well as automation of basic arithmetic facts) is pivotal in the Curriculum Based Measurement in grade 1 (Fuchs and Fuchs 2006; Hosp et al. 2016) and is used in this study as a single-skill probe of mathematical achievement. The rest of the tasks were given to the students for two reasons: first, we wanted to give teachers a broader cognitive profile of each student and second, we wanted to analyze the effect of the intervention on cognitive areas not directly related to numerosity.

For the purpose of this study, the screening materials have been developed on a separate platform, with different technology and a different user experience from the one students use when learning with Innovamat (e.g., keyboard appearance). This was done in order to minimize the advantage that children in the experimental group might gain from familiarity with the tool.

Procedure

At the start of the 2021–2022 school year, participating schools were informed of the details of the study. They were then given a 2-h individual session on the topic of learning difficulties in math and the RTI framework. They were then asked to identify a suitable time and location to carry out the intervention within the school and outside the normal mathematics classes. The intervention took place in the school because we could not ensure that families would carry it out correctly at home, whether for lack of resources or of willingness to participate.

In January 2022, all children were screened by members of the research team, and the statistical analysis of the results was conducted during the final week of the month. Based on these results, the 30% of children with the lowest performance on the arithmetic fluency task were selected for the intervention group. Schools were given two weeks to become familiar with the intervention materials and to report any issues or questions. In the third week of February, they were instructed to begin the daily 15 min practice sessions that comprised Tier II, which lasted for 15 weeks. After this, during the last two weeks of June, the research team re-screened the 13 schools using the same test as in January.

Once a month, schools were sent a report with information on the level of participation and of successful completion of the activities that were given.

Analysis plan

We computed a final score for each task by adding the correct results and subtracting the wrong results, multiplied by a coefficient found in Supplementary Table 1. The first variable of analysis was the difference between the pre- and post-test arithmetic fluency scores. Since it was a non-normally distributed continuous variable, we used the Mann-Whitney U test to test the hypothesis of equality of medians between the control and intervention groups.
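The scoring rule and the group comparison described above can be sketched as follows. This is a stdlib-only Python illustration, not the study's R code (in R, the test is available as `wilcox.test`). Two assumptions are made here: the score is read as correct answers minus the coefficient times wrong answers, and the p-value uses the plain normal approximation without tie or continuity corrections, which is only rough for small samples. All function names and data are hypothetical.

```python
from math import erfc, sqrt


def corrected_score(n_correct, n_wrong, coefficient):
    """Correct answers minus a coefficient-weighted penalty for wrong ones."""
    return n_correct - coefficient * n_wrong


def midranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p from the normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u_x - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u_x, p_two_sided


# Toy gain scores (not study data).
gains_intervention = [7, 9, 10, 12]
gains_control = [1, 2, 3, 4, 5]
u_stat, p_value = mann_whitney_u(gains_intervention, gains_control)
print(u_stat, round(p_value, 4))
```

A rank-based test is a natural choice here because it does not require the gain scores to be normally distributed.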

The second variable of study was the percentage of students still in the low achievement or in the “at risk of mathematics difficulties” zones. These categorical data were presented as absolute numbers and percentages, and the control and intervention groups were compared using the chi-squared test with Yates’ continuity correction.
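For a 2×2 table (group by remained/left the zone), the corrected statistic can be sketched in a few lines. This is a stdlib-only Python illustration rather than the study's R code (R's `chisq.test` applies the Yates correction to 2×2 tables by default). The function name and the counts are hypothetical, chosen only to show the computation.

```python
from math import erfc, sqrt


def chi2_yates_2x2(a, b, c, d):
    """Chi-squared test with Yates' correction for the table [[a, b], [c, d]].

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates' correction shrinks |ad - bc| by n/2, never below zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Toy counts (not study data): rows = intervention, control;
# columns = remained in the zone, left the zone.
chi2, p_value = chi2_yates_2x2(10, 40, 25, 25)
print(round(chi2, 3), round(p_value, 4))
```

The correction makes the test more conservative, which matters most when cell counts are small, as with the single-digit counts of students remaining in a zone.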

All analyses were performed with R statistical software version 4.1.

Results

Intervention adherence

Regarding participation in the intervention, the average number of sessions per student per week was 2.22 (standard deviation 0.45). This is considerably less than the 4 weekly sessions that schools were instructed to carry out. Participation was particularly low at the very beginning of the intervention and after the Easter break, when schools had a hard time picking up the pace. Variability among schools was very small, so differences in the amount of intervention between schools are unlikely to have affected the results. We show the distribution of sessions per student per week in Supplementary Fig. 1.
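The adherence figures above translate into delivered practice time as follows. This is a back-of-the-envelope sketch that assumes every session lasted 15 min (the Tier II description gives 15–20 min) and that the weekly average held across all 15 weeks; the constant names are illustrative.

```python
SESSION_MINUTES = 15               # assumed session length
PLANNED_SESSIONS_PER_WEEK = 4      # as instructed to schools
OBSERVED_SESSIONS_PER_WEEK = 2.22  # reported average
WEEKS = 15

planned_minutes = SESSION_MINUTES * PLANNED_SESSIONS_PER_WEEK * WEEKS
observed_minutes = SESSION_MINUTES * OBSERVED_SESSIONS_PER_WEEK * WEEKS
adherence = OBSERVED_SESSIONS_PER_WEEK / PLANNED_SESSIONS_PER_WEEK

print(f"planned:   {planned_minutes} min ({planned_minutes / 60:.1f} h)")
print(f"delivered: {observed_minutes:.0f} min ({observed_minutes / 60:.1f} h)")
print(f"adherence: {adherence:.1%}")
```

Under these assumptions, students received a little over half of the planned 15 hours of extra practice.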

Arithmetic fluency

In Table 2, we show the number of corrected operations in the fluency task before and after the intervention period, along with the difference between the pre- and post-test scores. We divide children into three groups: “Medium-High fluency” (i.e., scoring at or above the 30th percentile on the pre-test), “Low fluency: Control” (i.e., scoring below the 30th percentile but attending a control school and thus not receiving the intervention), and “Low fluency: Intervention” (i.e., scoring below the 30th percentile and receiving the intervention).

Table 2 Arithmetic fluency task results

In Table 2 and Fig. 3, we can see that the pre-test arithmetic fluency scores are lower for students who received or needed the intervention (by construction). Within these two low-fluency groups, pre-test scores are somewhat higher for those who received the intervention, although the difference is not statistically significant. On the other hand, the post-test arithmetic fluency for students who received the intervention is significantly higher than for those who required it but did not receive it (11.1 vs. 7.2 operations, p = 0.001). This indicates that the extra practice resulted in higher fluency. The effect size, calculated using Cohen’s d, is 0.62, indicating a moderate effect of the intervention on arithmetic fluency. Looking at the gains, students in the intervention group also improved significantly more than the control group (7.3 vs. 4.2 operations, p = 0.007), with an effect size of d = 0.54, again indicating a moderate effect.
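For readers unfamiliar with the effect size reported above, Cohen's d can be sketched as the difference in group means divided by a pooled standard deviation. The text does not state which variant of d was computed, so the pooled-variance form below is an assumption, and the function name and data are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy scores (not study data): means differ by 1, pooled SD is 2.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)
```

By the usual rule of thumb, values around 0.5 are read as moderate effects, which is consistent with the interpretation of d = 0.62 and d = 0.54 given above.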

Fig. 3

Comparison of the difference between pre- and post-test arithmetic fluency, given in total corrected operations, for the three groups in the study. The boxes represent the interquartile range (IQR), stretching from Q1 to Q3, and contain the middle 50% of the data for each study group. The two p-values shown at the top are the results of two Mann-Whitney U tests between the study groups

Other dimensions

We present descriptive statistics for all tasks in the pre- and post-tests, as well as their differences, in Table 3 and Fig. 4. When comparing the two “Low Achievement” groups, we find statistically significant differences in the gains for the Numerical Line (p = 0.03) and Number Knowledge (p = 0.01) tasks, but not for any other task (note that Working Memory is close to significance, p = 0.07).

Table 3 Medians and interquartile ranges for all the extra tasks given to students
Fig. 4

Difference between pre- and post-scores for all extra tasks given to students

This result is a further indication that the intervention succeeded in building the numerical abilities of children, and that other cognitive abilities were not mediating the final improvement in arithmetic fluency.

Low achievement and at risk of MLD zones

In Table 4 and Fig. 5, we show that the percentage of students who leave the “low achievement” zone (below the 30th percentile) and the “at risk of Mathematics Learning Difficulties” zone (below the 15th percentile) is significantly higher for students who received the intervention. Of the 48 students who received the intervention, only 9 remained in the “low achievement” zone afterward (19%), compared to 47% of students in the control group (p = 0.001, Fig. 5A). Similarly, only 4 students in the intervention group remained at risk of MLD afterward (8%), compared to 33% of students in the control group (p = 0.003, Fig. 5B). This indicates that the RTI intervention was effective in helping students leave the “low achievement” and “at risk of MLD” zones.
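The intervention-group percentages quoted above follow directly from the reported counts, as the short check below shows. The control-group percentages cannot be recomputed in the same way from the text alone, since the underlying counts appear only in Table 4. Names are illustrative.

```python
INTERVENTION_N = 48          # students who received the intervention
REMAIN_LOW_ACHIEVEMENT = 9   # still below the 30th percentile afterward
REMAIN_AT_RISK = 4           # still below the 15th percentile afterward


def percent(part, whole):
    """Express `part` as a percentage of `whole`."""
    return 100 * part / whole


pct_low = percent(REMAIN_LOW_ACHIEVEMENT, INTERVENTION_N)   # 18.75
pct_risk = percent(REMAIN_AT_RISK, INTERVENTION_N)

print(f"remain in low achievement zone: {round(pct_low)}%")
print(f"remain at risk of MLD:          {round(pct_risk)}%")
```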

Fig. 5

Percentage of students who remain: A below the 30th percentile (low achievement regime); B below the 15th percentile (at risk of MLD regime), depending on whether they had intervention or not

Table 4 Absolute numbers and percentages of students who remain below the 30th percentile (low achievement regime) or 15th percentile (at risk of MLD regime) depending on whether they had intervention or not

Discussion

In this study, we evaluated whether it is feasible to develop a scalable semi-automated RTI framework in an educational system where RTI is not policy. For this purpose, we developed a screening test centered on arithmetic fluency that was administered to 418 first-grade students in 13 schools. In 5 of these schools, we selected the lowest-performing students (below the 30th percentile) and gave them digitized extra practice of 15 min per day, four days a week, for 15 weeks.

Interpretation of results

The core finding of our study is the efficacy of an RTI framework even in environments where RTI is not policy. Students who were provided with the RTI intervention displayed a marked improvement in arithmetic fluency compared to their peers in the control group. Notably, the interventions did not directly target arithmetic fluency, but instead aimed at building foundational number concepts. This outcome thus emphasizes that a better understanding and automatization of numbers can substantially bolster arithmetic capabilities, an insight consistent with other studies (Östergren and Träff 2013; Malone et al. 2020).

While we observed that 8% of students did not show a significant response to the intervention, considering the general prevalence of dyscalculia, this outcome aligns with existing literature (Shalev et al. 2000). This showcases the RTI framework’s sensitivity in identifying potential cases of dyscalculia, an invaluable asset for early intervention.

The challenges faced during the study year, notably the pandemic-induced disruptions and teacher strikes, highlight the resilience of the framework. Achieving over half the recommended sessions in such a challenging environment is testament to the framework’s adaptability and effectiveness.

Implications of the findings

Our findings hold significant implications for educational practices globally. Even in regions without a formal RTI policy, schools can successfully integrate a semi-automated RTI framework to improve student outcomes. This holds promise for democratizing quality education, providing students with structured, individualized support irrespective of systemic policies.

The effectiveness of our intervention also underscores the significance of foundational number concepts in enhancing arithmetic fluency. This suggests that educators might need to re-evaluate and potentially redesign curricula that prioritize rote learning over foundational understanding.

Limitations of the study

Several limitations should be considered while interpreting our findings. The most significant limitation was the inability to complete all recommended RTI sessions due to unforeseen challenges during the study year. This may have impacted the full potential of the intervention’s effectiveness.

Another limitation pertains to the specificity of the educational context in which the study was conducted. Given that educational systems, pedagogical practices, and student demographics can vary widely across regions and countries, our findings are rooted in the unique attributes of the participating schools and their respective environments. Thus, while our results provide valuable insights for this particular setting, caution should be exercised when attempting to generalize the outcomes to other educational contexts or diverse student populations. Further studies in varied settings would be essential to ascertain the broader applicability and adaptability of our RTI framework.

Conclusions

In summary, our study presents a robust case for the potential scalability of RTI frameworks even in systems where it is not formally integrated as policy. Our results suggest a strong link between foundational number understanding and arithmetic fluency, challenging traditional pedagogical approaches. With continuous refinement and integration of feedback, such RTI frameworks could transform the educational landscape, providing individualized, targeted support to those who need it most.