Introduction

Formative assessment (FA) is considered an effective framework for promoting learning in school (Black & Wiliam, 2009; Kingston & Nash, 2011; Klute et al., 2017; Lane et al., 2019). Previous research on FA has focused on its effects on student achievement, which are well documented. However, in addition to achievement, student motivation is another important goal dimension in education as formulated in school curricula (Department for Education, 2013; Kunter, 2005). At the same time, motivation is an essential prerequisite for successful learning (e.g., Gottfried, 1990; Lemos & Veríssimo, 2014). Nevertheless, effects of FA on motivation have rarely been studied (for exceptions, see Hondrich et al., 2018; Rakoczy et al., 2019). In particular, little is known about the underlying mechanisms mediating effects of FA on students’ intrinsic motivation and about personal learner characteristics moderating these effects. These two specific questions are addressed in this article. A theoretical rationale for hypotheses about mediating processes is provided by self-determination theory (Ryan & Deci, 2000), which postulates that intrinsic motivation arises as a consequence of experiencing competence. When instruction is carried out in the sense of FA (i.e., adapting instruction to the students’ achievement level and providing appropriate feedback), students can have positive learning experiences and experience themselves as competent. In this regard, according to the well-known “Offer-and-Use-Model for Teaching Effectiveness” (Fend, 1981; Helmke, 2003) or “Multilevel supply-use model of student learning” (Brühwiler & Blatchford, 2011), students’ perceptions and interpretations of instruction ultimately determine outcomes of instruction in school. More precisely, linking self-determination theory to the importance of students’ perceptions suggests that teacher behavior must be perceived by the students as supporting their competencies in order to affect their intrinsic motivation. Hence, this study assumes that the effect of FA on students’ intrinsic motivation is mediated by students’ perceived competence support.

Formative assessment and teacher behavior

FA is a proven concept with adequate evidence of its potential to promote school learning (Klute et al., 2017; Lane et al., 2019). It refers to “…all those activities undertaken by teachers, and/or by their students, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged” (Black & Wiliam, 1998, pp. 7–8). The assessment of students’ achievement, the use of this assessment information for individualized feedback, and the adaption of teaching activities to individual learning needs can be identified as central features of FA (Wiliam & Thompson, 2008). At this point, it already becomes evident that teachers play the key role in this process as they decide whether and how to implement FA practices in their day-to-day teaching. Both an intervention study employing classroom recordings (Yin et al., 2008) and meta-analyses (Klute et al., 2017; Lane et al., 2019) show that the effectiveness of FA depends on the specific implementation in the classroom, that is, teachers’ use of effective FA practices. Within the related framework of data-based decision-making (Mandinach, 2012), it is also assumed that the effects of data assessment on student learning are mediated by teachers’ data interpretation and changes in teaching behavior in terms of adjusting instructions to the students’ skills (e.g., Hebbecker et al., 2022).

Learning progress assessment

Evidence on the concrete design of FA features and their effective use in the classroom, however, is still scarce. Despite the high potential associated with FA (Black & Wiliam, 2009), a successful implementation poses typical challenges. For instance, the assessments need to be easy to integrate into the classroom and provide valuable information about existing misconceptions; teachers should be supported in using the data to make informed decisions about further instruction, and a flexible adaptation of the materials to teachers’ routines should be ensured (Hondrich et al., 2016; Yin et al., 2008). One prominent approach to assist diagnostic processes in mathematics education is provided by instruments for progress monitoring, such as learning progress assessment (LPA, Förster & Souvignier, 2015). LPA refers to a standardized assessment of students’ learning progress with short parallel tests repeated at intervals of about 3 weeks throughout the school year (e.g., Souvignier et al., 2021). These data collected in the learning process provide teachers with meaningful information about students’ achievement level and should inform teachers’ instructional decisions. It has been empirically confirmed that making use of LPA contributes to student learning (Förster & Souvignier, 2014, 2015). However, translating assessment information into adaptive support and learning conducive feedback remains a major challenge for teachers (Visscher, 2021). Hence, it might be useful to provide teachers with materials that support them in adaptive teaching and providing feedback. In the present study, we therefore apply two differently structured approaches of FA: The effects of (a) an online-based LPA tool alone and (b) a combination of the LPA tool with prepared material for feedback and adaptive teaching (thereby covering all central FA features) are compared to (c) conventional mathematics instruction.

Students’ intrinsic motivation and experience of competence

Intrinsic motivation is of particular relevance for school learning. It represents the prototype of autonomous behavior and thus provides the basis for self-determined learning. A central characteristic is that the performance of the action is experienced as inherently interesting or enjoyable (Deci & Ryan, 1985; Ryan & Deci, 2000). Intrinsic motivation has positive consequences for student outcomes, including better learning, increased persistence, deeper processing, and increased well-being (Guay et al., 2000; Howard et al., 2021; Vansteenkiste et al., 2006). The question of how intrinsic motivation can be fostered by teacher behavior is therefore of central importance. A starting point for teachers’ influence on students’ motivation might be students’ experience of competence as a facet of intrinsic motivation. According to self-determination theory (Ryan & Deci, 2000), fulfilling the need for competence (together with autonomy and relatedness) contributes to the development of intrinsic motivation. A recently published meta-analysis (Bureau et al., 2022) shows that competence is the strongest positive predictor of self-determined motivation. Besides self-determination theory, further prominent motivational theories (such as social-cognitive theory: Bandura, 1997, or the expectancy-value model: Eccles & Wigfield, 2002; Wigfield & Eccles, 2000) posit that students’ perceived competence or related concepts like self-efficacy or ability self-concept are prerequisites for experiencing intrinsic motivation.

Thus, these theories provide a theoretical rationale for how FA practices can influence intrinsic motivation via an enhanced experience of competence. For instance, from a theoretical point of view, the feeling of competence can be enhanced by the teacher showing the discrepancy between current achievement level and learning goal and providing information on how to move forward to reach this goal without feeling under pressure or control (feedback, Ryan & Deci, 2002). In addition, matching the instruction to the student’s current learning level should lead to a better fit between skills and demands, which should reduce the likelihood of being overchallenged. Following a theoretical paper by Linnenbrink-Garcia et al. (2016) as well as considerations on the basic dimensions of teaching quality, especially student support (Praetorius et al., 2018), teachers’ differentiated, high-quality instruction which is well-structured, appropriately paced, and ensures that demands are slightly above students’ current achievement level should optimally foster students’ feeling of competence (adaption).

First empirical evidence also suggests positive effects of teachers’ FA use on intrinsic motivation, feelings of competence, or related constructs (Faber et al., 2017; Hondrich et al., 2018; Miller & Lavin, 2007; Rakoczy et al., 2019). For instance, Hondrich et al. (2018) investigated the impact of FA in third-grade science education. The researchers found that students whose teachers were trained in how to realize FA (including assessment tasks, feedback, and adaption of instruction) had a stronger feeling of competence in terms of perceived learning gains and reported higher intrinsic motivation than students in a control condition. In an exploratory study, Miller and Lavin (2007) found that students benefit from a wide range of techniques associated with FA (e.g., improved questioning techniques) in terms of self-esteem and beliefs about their competence. Likewise, in a cluster randomized field trial, Rakoczy et al. (2019) investigated the impact of FA in ninth-grade mathematics instruction revealing that compared to a regular mathematics instruction, teachers’ use of FA practices (using assessment tasks and feedback including hints for improvement) had a positive effect on students’ change in self-efficacy.

Some studies also investigated the effect of specific teacher behavior (e.g., feedback or adaptive teaching) instead of the application of a range of FA practices. Positive effects of feedback on motivational outcomes have been shown in most of these studies (e.g., Harks et al., 2014; Rakoczy, Klieme, Bürgermeister, & Harks, 2008; Wisniewski et al., 2019; for exceptions, see, for example, Drost & Todorovich, 2017). Results of survey studies indicate that students who perceive their teachers to frequently apply formative feedback report higher feelings of competence and intrinsic motivation (Leenknecht et al., 2020; Pat-El et al., 2012). Further, a differentiated teaching style in girls’ physical education classes can positively influence their development of intrinsic motivation (Goudas et al., 1995) and reduce negative consequences of the big-fish-little-pond effect regarding low-achieving students’ academic self-concept (Roy et al., 2015).

Students’ perceived competence support

To achieve desired outcomes in school (here, increased intrinsic motivation), students’ perception of the learning environment appears to be central (Kunter et al., 2007). The critical role of students’ perception is also postulated in well-known models of instructional impact (Brühwiler & Blatchford, 2011; Fend, 1981; Helmke, 2003). Thus, it might be relevant whether students subjectively feel supported in their development of competencies through the instructional decisions made by their teachers. If students perceive that they are supported in their competencies by the teacher’s behavior, an increase in their experience of competence and subsequently in intrinsic motivation seems likely. In this context, the focus lies on the effect of students’ perception of competence support on intrinsic motivation, while the possible intermediate step of an increased experience of competence is neglected. However, the influence of students’ perceived competence support (PCS) on intrinsic motivation has so far rarely been examined. Initial evidence for secondary school mathematics instruction suggests that PCS can positively influence subsequent self-determined motivation (Rakoczy, Klieme, & Pauli, 2008). Further, results of a laboratory experiment in ninth-grade mathematics classes showed that process-oriented feedback had a positive effect on the development of interest compared to social-comparative feedback which was mediated by PCS (Rakoczy et al., 2013). The present study therefore focuses on the influence of FA on intrinsic motivation mediated by PCS.

Achievement level as a moderating variable

In intervention studies, the question of differential effects is of particular interest (Fuchs & Fuchs, 2019). Not least because the FA approach aims at creating an optimal fit between learning requirements and learning opportunities, differential effects at the extremes of the achievement spectrum seem conceivable. Assessment information might make it easier for teachers to pay attention to differentiation in the classroom, especially in the case of particularly high-achieving or low-achieving students, and to allow these children to work on suitable tasks. Such an increased fit through FA might also lead to positive effects on motivational variables. However, such effects have rarely been studied to date. In an exploratory analysis by Miller and Lavin (2007) of the effects of FA on self-esteem and self-competence, changes in the two constructs were significant for lower and higher ability group members, but not for middle ability group members. However, experimental control was rather limited in this study. Slightly more evidence on potential moderation effects of FA is available for student achievement as an outcome. Some of these few studies show differential effects in favor of high-achieving children (Faber et al., 2017) and some in favor of low-achieving children (Bokhove & Drijvers, 2012; Koedinger et al., 2010), while Lee et al. (2020) did not find evidence for differential effects regarding different classroom types in their systematic review. Even when the individual components of FA are considered, no consistent pattern emerges. Studies suggest that the learning benefits of various types of feedback differ depending on the student’s level of achievement (for an overview, see Shute, 2008). Regarding differentiated instruction, a large-scale intervention study in which teachers participated in a professional development training on adaptive instruction in mathematics classrooms showed significant, small effects on mathematics achievement in one of two cohorts (Prast et al., 2018). Effects were equal for students at different achievement levels. Nevertheless, the limited research base does not allow us to formulate directed hypotheses about the differential effectiveness of FA.

The present study

In the present study (see Fig. 1), we investigate the impact of (a) LPA and (b) LPA in combination with additional support consisting of material for feedback and adaptive instruction (= LPA+) on students’ motivational development compared to business-as-usual instruction. More specifically, we aim at investigating the influence of students’ PCS on intrinsic motivation as well as the influence of teachers’ use of the aforementioned FA practices on students’ PCS. Further, we will analyze whether effects of FA on intrinsic motivation are mediated by students’ PCS. While the focus here is on students’ perceptions of competence support and their influence on intrinsic motivation as a final desired outcome in school, students’ experience of competence is not assessed. In addition, differential effects depending on students’ achievement level will be examined.

Fig. 1

Assumptions regarding the effect of FA practices on students’ intrinsic motivation via students’ perceived competence support. Note. EG experimental group, LPA learning progress assessment

The following hypotheses will be addressed:

  • Hypothesis 1: We expect PCS to positively influence intrinsic motivation.

  • Hypotheses 2a, b: We expect (a) LPA and (b) LPA+ to have positive effects on PCS compared to a control group.

  • Hypothesis 2c: The positive effect on PCS is higher for students in the LPA+ group compared to the LPA group.

  • Hypotheses 3a, b: We expect (a) LPA and (b) LPA+ to have an indirect effect on intrinsic motivation mediated by PCS compared to a control group.

  • Hypothesis 3c: The positive indirect effect on intrinsic motivation via PCS is higher for students in the LPA+ group compared to the LPA group.

As an additional exploratory question, we analyze if the treatment effects are moderated by students’ achievement level.

Method

Sample

Our sample consisted of N = 27 third-grade mathematics classes from 18 schools recruited in Germany. Participation in the study was voluntary. Participating teachers had a mean age of M = 40.35 years (SD = 10.08, Min = 28, Max = 62). The majority of teachers were female (87%) and had studied mathematics (72.7%). A total of N = 613 students participated in the study. At pre-test, students were on average M = 8.74 years old (SD = 0.47). Of the students, 52.5% were female and 92.0% were born in Germany.

Design

The study was conducted as a quasi-experimental field experiment with three conditions. FA was realized using two different approaches: Teachers in the experimental group one (EG1 = LPA) used a digital LPA tool, whereas teachers in the experimental group two (EG2 = LPA+) were provided with additional support consisting of materials for feedback and differentiated instruction in their classrooms. Teachers in the control group (CG) conducted their business-as-usual instruction. In all three groups, the regular curriculum for the third school year was taught, with the intervention focussing on arithmetic and math text problems. Assignment to a condition was made at the class level, making sure that classes from the same school were in the same condition. Teachers in both experimental conditions participated in teacher trainings prior to the study. The assessment of both students’ mathematics competencies and motivational variables as well as teacher surveys was conducted at the beginning (September) and the end (June) of the school year (see Fig. 2).

Fig. 2

Study design. Note. CG control group, LPA learning progress assessment, LPA+ learning progress assessment plus support, AI adaptive instruction, FB feedback

Treatment

Teachers of both treatment groups embedded components of FA in their mathematics classes. The individual components of the interventions are briefly explained below.

LPA group

Quop

The online-based assessment tool “quop”, which monitors students’ math progress, was applied (Souvignier et al., 2021). The program includes eight short (8–15 min), computer-based math tests in multiple-choice format that students take at intervals of about 3 weeks over the course of half a school year (see Fig. 2, LPA1–LPA8). The math tests cover arithmetic (sample item: 135 + 17 = ?; response options: 142, 152, 155, 118), geometry, and calculation with units. They are scored by computer, and the results can be inspected by both teachers and students immediately after the tests have been administered. Teachers can inspect test results at both the individual and the class level. Results are reported separately for the three areas of competence, with the proportion of correctly solved tasks being indicated. In addition, students’ individual test results are compared with a norm sample. Thus, teachers receive objective assessment data that can be used to make instructional decisions and adjustments (for further information on psychometric properties, see Souvignier et al., 2021).

LPA+ group

The LPA tool quop was also used by the LPA+ group. Additionally, a more differentiated assessment and material for feedback and adaptive instruction were provided. Examples are presented in Table 1.

Table 1 Implementation of formative assessment activities in the LPA+ group (for semi-written subtraction in the number range up to 1000 with hundreds)

Additional assessment by differentiated evaluation of quop results and supplementary tests

Going beyond the standard evaluation of quop results, teachers in EG2 were provided with a more differentiated report of quop results focussing on basic arithmetic operations. The percentage of correctly completed tasks per basic arithmetic operation type (addition, subtraction, multiplication, division) was displayed for each student. This information was intended to help teachers recognize which students had difficulties with which basic arithmetic operation and to provide the basis for selecting one supplementary test. Supplementary tests were used to identify individual strengths and weaknesses within one specific type of basic arithmetic operation. For each basic arithmetic operation type, tests were available for semi-written procedures, written procedures (addition and subtraction only), and math text problems. Teachers selected one supplementary test per student in advance (e.g., semi-written subtraction). Due to time constraints, the tests were administered six times during the school year following quop tests. All tests covered different domains of difficulty (e.g., for semi-written subtraction: subtraction in the number range up to 20, subtraction in the number range up to 100). The feedback on test results for each student indicated which domains of difficulty (e.g., subtraction in the number range up to 20) of the respective supplementary test (e.g., semi-written subtraction) were mastered and included a support recommendation consisting of two domains of difficulty that were not mastered (indicated by a test result of less than 50%).

Adaptive instruction

Teachers were supported in their individualized instruction for students. To implement adaptive teaching, they were provided with support toolboxes. For each domain of difficulty of each supplementary test (e.g., semi-written subtraction in the number range up to 20), there was an explanation card as well as practice and solution cards. Particularly high-performing students were given tasks designed to promote mathematically gifted students (Käpnick, 2016). The support usually took place within the framework of a double lesson after each supplementary test. Each student was expected to work on the two not yet mastered domains of difficulty assigned to them. Students worked on the support toolbox in achievement-homogeneous or achievement-heterogeneous learning tandems (depending on the teacher’s preference).

Feedback

The goal of the oral feedback was to provide students with information about their individual achievement level and assistance in their learning process. Teachers could decide on the feedback setting (individual vs. in groups) and were asked to follow guidelines for learning-promoting feedback. In particular, they were required to address the three central feedback questions: “How am I going?”, “Where am I going?”, and “Where to next?” (Hattie & Timperley, 2007), in terms of reporting strengths and weaknesses, setting learning goals, and providing appropriate strategies to achieve the learning goals. Other feedback characteristics concerned, for example, temporal proximity or attribution. Teachers were supported in giving feedback by a feedback guide and strategy cards for each domain of difficulty. A learning map was used to visualize oral feedback and learning progress.

Professional development trainings

Teachers of the two experimental groups participated in professional development trainings prior to the study. The trainings were held separately at university but by the same training teams. In both groups, the general project procedure and the project goals were presented. Afterwards, the computer-based LPA tool quop was introduced, and the implementation of the tool in the classroom was discussed. Instructions for the implementation, evaluation, and interpretation of the learning process data were given. In EG1 (LPA; duration of the training: 90 min), the teachers were given no specific information on how to implement feedback and adaptive instruction but were instructed to carry out these two steps at their own discretion. Teachers in EG2 (LPA+; duration of the training: 120 min) were provided with additional information on the more differentiated quop results, the supplementary tests, the use of the support toolbox, how to provide feedback, and recommendations on how to implement feedback and support material in their instruction.

Measures

Data were collected from both teachers and students at the beginning and end of the school year. At pre-test, teachers’ demographic information was collected. At post-test, teachers in the experimental groups were asked to provide information on the implementation of progress monitoring with quop (EG1 and 2) and additional support (EG2, see “Treatment fidelity”).

On student level, math achievement was assessed at pre-test; PCS and intrinsic motivation were assessed at pre- and post-test. Both PCS and intrinsic motivation were assessed with self-report ratings. Items were rated on a four-point Likert scale ranging from 1 = “does not apply” to 4 = “applies exactly.” Items at pre-test referred to general practice in math classrooms, whereas items at post-test were related to the mathematics lessons of the last 3 months. Apart from that, items were identical. Due to practical reasons and internal consistency issues, the number of items differed between pre- and post-test. After item selection, all scales displayed satisfactory to good internal consistency.
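The internal consistencies of the scales in this section are reported as α coefficients. As a minimal sketch, and assuming these are Cronbach’s α values (the text does not name the coefficient explicitly), the statistic can be computed from item-level ratings as shown below. The function name and the example ratings are hypothetical and serve illustration only; they are not study data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(ratings[0])
    # Sample variance of each item across respondents
    item_variances = [variance([row[i] for row in ratings]) for i in range(k)]
    # Sample variance of the respondents' sum scores
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of five students on four items (scale 1 to 4)
example_ratings = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 3, 4],
    [2, 3, 2, 2],
]
alpha = cronbach_alpha(example_ratings)
print(round(alpha, 2))
```

Because α depends on the number of items as well as on their intercorrelations, the differing item counts at pre- and post-test mean that the coefficients of the two measurement points are not directly comparable.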

Perceived competence support

PCS was assessed using an adapted version of the perceived competence scale of the “IGEL”-project (Decristan et al., 2015; Hardy et al., 2011) based on Kunter (2005) and Prenzel et al. (1996). Six items (α = 0.76) were included at pre-test, and eight items (α = 0.82) at post-test, e.g., “In math classes, I was supported in better understanding the topics that were covered in class.”

Intrinsic motivation

The intrinsic motivation scale was also based on the IGEL-project and was adapted from Blumberg (2008) and Bos et al. (2005). It consisted of four items (α = 0.68) at pre-test and six items (α = 0.74) at post-test, e.g., “Why do you put effort into math classes? Because I want to understand more about the subject.”

Math achievement

Math achievement was assessed using the subtests arithmetic and math text problems of the DEMAT 2+ (Krajewski et al., 2004). Twenty-four items were employed (α = 0.89, sample item “Take the double! 5 ➔ X”).

Treatment fidelity

Our strategy for assessing treatment fidelity was twofold: In both experimental groups, we used self-report ratings of teaching practices. In the LPA+ group, teachers’ feedback practices were additionally assessed at random by trained observers.

Self-report ratings

At the end of the school year, teachers were asked to rate their implementation fidelity on a four-point Likert scale ranging from 1 = “does not apply” to 4 = “applies exactly.” In both groups, the use of quop was assessed with four items. Teachers in EG2 additionally provided information on their use of the supplementary tests (five items), their feedback given to students (seven items), and adaptive instruction using the support material (four items). Implementation fidelity varied across teachers with the majority of the teachers reporting a satisfactory level of implementation fidelity. Nonetheless, some teachers indicated low implementation fidelity for LPA+. One teacher (1) did not participate in the professional development training and reported very low implementation fidelity. For another teacher (2), no information on implementation fidelity was available. We decided to conduct the analyses with all participating classes. At the same time, we conducted analyses after excluding either class (1) or classes (1) and (2). The results proved to be stable. We provide information on treatment fidelity and the results of the other analyses in the supplemental material in the Open Science Framework (https://osf.io/pk5rm/).

Observer ratings

Teachers’ feedback practices were rated once by two trained observers if teachers agreed to the observation. The rating scheme covered 19 aspects of promotive feedback, e.g., identifying strengths/weaknesses, providing strategies, setting learning goals, and attribution. In each feedback situation, the presence or absence of these characteristics was rated (0 = “characteristic not available,” 1 = “characteristic partially available,” 2 = “characteristic available”). In case of inconsistency, observers agreed on a common rating. Interrater reliability across all characteristics was satisfactory (κ = 0.71). Overall, implementation fidelity was satisfactory except for providing (detailed) strategies. Detailed information is provided in the supplemental material.
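The text reports a single agreement coefficient of 0.71 for the two observers without naming the statistic. The following is a minimal sketch under the assumption that an unweighted Cohen’s kappa on the three-level rating (0, 1, 2) was meant; the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters who rated the same items."""
    n = len(rater_a)
    # Proportion of items on which both raters chose the same category
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten feedback characteristics by two observers
observer_1 = [2, 2, 1, 0, 2, 1, 0, 2, 1, 2]
observer_2 = [2, 2, 1, 0, 1, 1, 0, 2, 2, 2]
kappa = cohens_kappa(observer_1, observer_2)
print(round(kappa, 2))
```

Kappa corrects raw agreement for agreement expected by chance, which is why it is lower than the simple proportion of matching ratings (0.80 in this hypothetical example).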

Data analysis

Our study design is consistent with cross-lagged panel models for half-longitudinal designs (Cole & Maxwell, 2003; Preacher, 2015). In these models, the effect of the treatment on the mediator at T2 and the effect of the mediator at T1 on the outcome at T2 are estimated controlling for pre-test differences in both the mediator and the outcome. Given the assumption of stationarity, implying that the effect of the mediator on the outcome is stable over time, the indirect effect can be estimated as the product of the two path coefficients (see Fig. 3).

Fig. 3

Illustration of the half-longitudinal mediation model. Note. The indirect effects are calculated by multiplying a × c and b × c, respectively. For better clarity, the direct effects of group membership on motivation are omitted in this figure
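The half-longitudinal model can be written out as two regression equations. The following is a sketch only: the path labels a, b, and c follow Fig. 3, whereas the dummy variables D, the remaining coefficient symbols (s, g, t, d), and the residual terms are notation introduced here for illustration. The achievement covariate follows the model specification given in the next paragraph. The `align` environment assumes the amsmath package.

```latex
\begin{align}
\mathrm{PCS}_{T2} &= \beta_0 + a\,D_{\mathrm{LPA}} + b\,D_{\mathrm{LPA+}}
                     + s\,\mathrm{PCS}_{T1} + g\,\mathrm{ACH}_{T1} + \varepsilon_1 \\
\mathrm{MOT}_{T2} &= \gamma_0 + d_1\,D_{\mathrm{LPA}} + d_2\,D_{\mathrm{LPA+}}
                     + c\,\mathrm{PCS}_{T1} + t\,\mathrm{MOT}_{T1} + \varepsilon_2 \\
\mathrm{IE}_{\mathrm{LPA}} &= a \times c, \qquad
\mathrm{IE}_{\mathrm{LPA+}} = b \times c
\end{align}
```

Under the stationarity assumption, the path c from PCS at T1 to motivation at T2 stands in for the unobserved path from PCS at T2 to motivation at a later time point, which is why the product with a or b can be read as an indirect effect.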

All variables were grand-mean-centered at T1. Motivation and PCS at T2 were specified as outcome variables. We estimated two separate path models to examine hypotheses 1 to 3 (model 1) and the exploratory research question (model 2), using the lavaan package (Rosseel, 2012) in R (R Core Team, 2022). For both models, we entered the treatment variables as predictors using two dummy-coded variables with the control condition representing the reference category (LPA vs. CG: 0 = CG, 1 = LPA; LPA+ vs. CG: 0 = CG, 1 = LPA+). We estimated the effects of the treatment variables and PCS at T1 on both outcomes. In addition, the effect of achievement at T1 on PCS at T2 and the effect of motivation at T1 on motivation at T2 were included in the model. To test hypothesis 1, we examined the effect of PCS at T1 on motivation at T2 (path c, cf. Fig. 3). To test hypotheses 2a, b, we examined the effects of the treatment variables on PCS at T2 (paths a and b). The indirect effects of the treatment variables on motivation at T2 via PCS (hypotheses 3a, b) were tested based on the products of the path coefficients a × c and b × c, respectively. The differences between the effects of the two treatment variables were tested for statistical significance (hypothesis 2c, effects on PCS at T2; hypothesis 3c, indirect effects on motivation at T2). For model 2, to investigate the exploratory research question, we additionally specified the interaction effects (a) treatment variables × achievement at T1 on PCS at T2 and (b) PCS at T1 × achievement at T1 on motivation at T2. In both models, we used robust maximum likelihood estimation. For the direct effects, we calculated Cohen’s f² (Cohen, 1988; conventions small = 0.02, moderate = 0.15, large = 0.35). For the indirect effects, we report bias-corrected bootstrap confidence intervals (DiCiccio & Efron, 1996). To adequately account for the hierarchical data structure and obtain correct standard errors, the variance estimation procedure “cluster” (Rosseel et al., 2022) was applied in both models. Except for the exploratory research question, all hypotheses were tested one-tailed at the 0.05 α-level.
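The data preparation steps and the effect size described above can be sketched as follows. This is an illustration in Python rather than the R/lavaan workflow used in the study, and all function names and example values are hypothetical. The text does not state which variant of Cohen’s f² was computed; the local form below, which compares a model with and without the predictor of interest, is an assumption.

```python
from statistics import mean


def dummy_code(groups, reference="CG"):
    """One 0/1 indicator per non-reference condition (reference category coded 0)."""
    levels = sorted({g for g in groups if g != reference})
    return {level: [1 if g == level else 0 for g in groups] for level in levels}


def grand_mean_center(values):
    """Subtract the overall sample mean; missing values (None) are left missing."""
    grand_mean = mean(v for v in values if v is not None)
    return [None if v is None else v - grand_mean for v in values]


def cohens_f2(r2_full, r2_reduced=0.0):
    """Cohen's f2; with r2_reduced = 0 this is the global form R2 / (1 - R2)."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)


# Hypothetical condition labels for five students
conditions = ["CG", "LPA", "LPA+", "LPA", "CG"]
dummies = dummy_code(conditions)

# Hypothetical pre-test scores, one of them missing
centered = grand_mean_center([2.0, 3.0, None, 4.0])

# Hypothetical R2 values with and without the predictor of interest
f2_local = cohens_f2(0.20, 0.15)
print(dummies, centered, round(f2_local, 4))
```

With the control group as reference category, each dummy coefficient expresses the difference between one treatment condition and business-as-usual instruction, holding the pre-test covariates constant.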

Missing data analysis revealed that the percentage of missing data was below 15% for all constructs (PCS T1, 9.8%; PCS T2, 12.6%; motivation T1, 10.9%; motivation T2, 10.4%; achievement T1, 6.9%). The two-step procedure suggested by Jamshidian and Jalal (2010) and Jamshidian et al. (2014) was used for analyzing the missing data pattern. The assumption of Missing Completely At Random (MCAR) was not rejected: Although the Hawkins test was significant (p < .001), indicating a violation of normality or homoscedasticity, the subsequent non-parametric test of homoscedasticity was not (p = .689). Thus, we considered it justified to deal with missing data using the conventional full information maximum likelihood procedure (FIML), which requires only the weaker assumption that values are missing at random (MAR; Enders, 2001; Lüdtke et al., 2007).

Results

Means and standard deviations of all examined variables are presented in Table 2.

Table 2 Means and standard deviations for all variables

Both proposed models displayed very good model fit (model 1: χ²(2) = 3.76, RMSEA = 0.05, SRMR = 0.02, CFI = 0.99, TLI = 0.95; model 2: χ²(4) = 6.75, RMSEA = 0.04, SRMR = 0.01, CFI = 0.99, TLI = 0.96; robust fit indices are reported).

Results of the path model for research questions 1–3 (model 1) are provided in Table 3 (for an illustration, see Fig. 4). Consistent with hypothesis 1, PCS at T1 significantly predicted intrinsic motivation at T2. Regarding research question 2, significant, yet very small, positive effects on PCS at T2 were found for both LPA (hypothesis 2a) and LPA+ (hypothesis 2b). Contrary to hypothesis 2c, the positive effect on PCS at T2 was not larger for LPA+ than for LPA (EG2 β = −0.024, p = .430). The indirect effects of the treatment variables on intrinsic motivation via PCS (hypotheses 3a, b) were calculated by multiplying the corresponding path coefficients. A significant indirect effect was found for LPA (β = 0.062, 90% CI [0.007, 0.117]) but not for LPA+ (β = 0.057, 90% CI [−0.009, 0.122]). The indirect effects (hypothesis 3c) did not differ between the two treatments (β = −0.005, p = .429). In this model, the explained variance was R² = 0.15 for both PCS at T2 and intrinsic motivation at T2.
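The decision rule behind the indirect effects can be sketched in a few lines of Python. The sketch assumes that, with one-tailed testing at α = .05, an indirect effect counts as significant when its 90% confidence interval excludes zero. The function names are hypothetical and the path coefficients in the first example are made up, while the interval bounds are the ones reported above.

```python
def indirect_effect(path_to_mediator, path_to_outcome):
    """Product-of-coefficients estimate of an indirect effect."""
    return path_to_mediator * path_to_outcome


def ci_excludes_zero(lower, upper):
    """True if the confidence interval lies entirely above or below zero."""
    return lower > 0 or upper < 0


# Hypothetical path coefficients, for illustration only.
print(indirect_effect(0.2, 0.3))

# 90% confidence intervals reported in the text.
print(ci_excludes_zero(0.007, 0.117))   # LPA: interval excludes zero
print(ci_excludes_zero(-0.009, 0.122))  # LPA+: interval includes zero
```

Applied to the reported intervals, the rule reproduces the pattern in the text: the indirect effect is significant for LPA but not for LPA+.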

Table 3 Results of the path model for research questions 1–3 (model 1)
Fig. 4 Half-longitudinal mediation for treatment effects on motivation via PCS (model 1). Note. CG = control group, LPA = learning progress assessment, LPA+ = learning progress assessment plus support, T1 = pre-test, T2 = post-test. For better clarity, coefficients that did not reach statistical significance were omitted, and presented coefficients were rounded to the second decimal place. *p < .05; **p < .001

To examine potential moderator effects of students’ achievement, the interactions between achievement at T1 and the treatment variables as well as between achievement and PCS at T1 were integrated in a second path model (model 2; see Table 4 and Fig. 5, respectively). A significant interaction effect was found for LPA+, indicating that the higher the level of math achievement at T1, the greater the positive effect of LPA+ on PCS at T2. No significant interaction effects were found for LPA × achievement at T1 or for PCS at T1 × achievement at T1. The explained variances for PCS at T2 and motivation at T2 were R² = 0.16 and R² = 0.15, respectively.
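How such an interaction translates into achievement-dependent treatment effects can be sketched as follows. The coefficients are hypothetical placeholders rather than the estimates from Table 4, and the function name is an assumption. Because achievement was grand-mean-centered, a value of 0 represents a student with average achievement at T1.

```python
def conditional_effect(b_treatment, b_interaction, achievement_centered):
    """Treatment effect on the outcome at a given (centered) achievement level."""
    return b_treatment + b_interaction * achievement_centered


# Hypothetical coefficients, for illustration only (not taken from Table 4).
b_lpa_plus = 0.10        # effect of LPA+ at average achievement
b_lpa_plus_x_ach = 0.05  # LPA+ x achievement interaction

for achievement in (-1.0, 0.0, 1.0):
    effect = conditional_effect(b_lpa_plus, b_lpa_plus_x_ach, achievement)
    print(f"achievement = {achievement:+.1f}: effect of LPA+ = {effect:.2f}")
```

A positive interaction coefficient means that the conditional effect grows with achievement, which is the pattern described for LPA+.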

Table 4 Results of the path model for research question 4 (model 2)
Fig. 5 Half-longitudinal mediation for treatment effects on motivation via PCS with interaction effects (model 2). Note. CG = control group, LPA = learning progress assessment, LPA+ = learning progress assessment plus support, T1 = pre-test, T2 = post-test. For better clarity, coefficients that did not reach statistical significance were omitted, and presented coefficients were rounded to the second decimal place. *p < .05; **p < .001

Discussion

The aims of this study were manifold: First, we examined whether students’ PCS is positively associated with intrinsic motivation. Second, we analyzed whether teachers’ use of FA practices has a positive effect on students’ PCS and whether there is an indirect effect of FA on intrinsic motivation mediated by students’ PCS. Further, we investigated whether the effects on motivation depend on students’ achievement level. As a particular strength of the study, two differently structured approaches to FA were realized: While teachers in EG1 (LPA) received results from a digital LPA tool, teachers in EG2 (LPA+) were provided with the combination of LPA results and additional support consisting of materials for feedback and adaptive instruction.

As expected, we found PCS to significantly predict intrinsic motivation (hypothesis 1). For both LPA (hypothesis 2a) and LPA with an additional support component (hypothesis 2b), significant, small positive effects on PCS were shown compared to the control group. Contrary to our expectations, these effects did not differ between the two experimental conditions (hypothesis 2c). There are indications of small indirect effects for LPA (hypothesis 3a) and LPA+ (hypothesis 3b) on intrinsic motivation mediated by PCS. However, statistical significance was only reached for LPA. There was no difference in the indirect effect between LPA and LPA+ (hypothesis 3c). Regarding possible differential effects depending on students’ achievement level, a significant interaction effect for LPA+ emerged. The higher the students’ achievement level, the stronger the positive effects of the intervention. For LPA, in contrast, no interaction effect was found. The effect of PCS on intrinsic motivation is not related to students’ achievement level.

Our results highlight the importance of students’ perception of supportive teacher behavior for their motivational development: The more students feel supported in the development of their competencies by their teachers’ adaptive behavior, the greater their intrinsic motivation. Given the positive outcomes associated with intrinsic motivation (Howard et al., 2021), this finding highlights the need for promoting the feeling of competence support in the classroom. Both experimental groups were associated with increased PCS compared to the control group. This suggests that FA practices are effective in changing teaching behavior towards an enhanced competence support in the mathematics classroom. This finding is consistent with previous studies on the relationship between the use of FA practices and perceived competence (Leenknecht et al., 2020; Miller & Lavin, 2007; Pat-El et al., 2012).

The very small effects of LPA and LPA+ may be due to the fact that in these conditions, regular mathematics classes were enriched by teachers’ use of FA practices, but adaptive instruction and feedback were not necessarily implemented in all lessons. Likewise, only very weak indirect effects were found for both groups (which were significant only for LPA). It appears that the influence of the implemented methods might not be strong enough to achieve greater effects on intrinsic motivation mediated by students’ PCS. The assumed mechanism, namely that students’ PCS strengthened by teachers’ use of FA practices is beneficial for their intrinsic motivation, could nevertheless be confirmed. A possible reason for the lack of greater indirect effects might be a low frequency of FA application. For instance, whereas the present study extended over a long period of time but with a comparatively low frequency of FA, the implementation of FA in a quite similar study by Hondrich et al. (2018) occurred over a relatively short period of time but was very intensive (two units of 9 h each in slightly over 2 weeks). Thus, a higher dosage of FA application may be needed to achieve the desired effects.

A strength of the study relates to the comparison of two different approaches of formative assessment. Unexpectedly, a comparison of the effects of LPA and LPA+ failed to yield an advantage of the additional support component. Therefore, the provision of progress monitoring information appears to be a key element in promoting PCS. This finding seems surprising in view of both feedback (Krijgsman et al., 2019; Rakoczy et al., 2013) and adaptive teaching (Goudas et al., 1995; Roy et al., 2015) being associated with the experience of competence or related constructs. As a possible explanation, results from feedback research (Harks et al., 2014; Rakoczy et al., 2019) can be considered, suggesting that the effect of feedback (in this case, on interest in math) is mediated by its perceived usefulness. It may be that feedback was not perceived as more useful by students in the LPA+ group. Classroom observations which indicated that teachers in the LPA+ condition rarely taught students appropriate strategies for achieving the learning goals support this notion. It may therefore be assumed that, despite the support components teachers were provided with, the quality of feedback did not substantially differ between the experimental groups (for the challenge of improving teachers’ feedback practice in mathematics instruction, see Schütze et al., 2017). Possibly, this is also true for adaptive instruction. It seems that teachers in both groups used the assessment information in a similar way for adaptive teaching, with the provision of differentiation options through the LPA+ materials having no substantial additional effect. The potentially fundamental role of LPA in the context of FA was also discussed in a study conducted by Hebbecker and Souvignier (2018). 
In reading classrooms, the researchers found no additional effects of feedback or support components on either achievement or motivational variables and interpreted this result (in addition to considerations of a possibly limited implementation of the components) as an indication that LPA alone was already sufficiently stimulating to put the concept of FA into practice. However, more information on how teachers in the LPA group dealt with the information on learning progress would be useful to explain the missing additional effect of feedback and adaptive instruction. Another explanation could relate to the duration of the teacher training. For practical reasons, the training sessions differed by only 30 min between the intervention groups. Although teachers in the LPA+ group were provided with feedback material and support over the course of the school year, more time for professional development might have been helpful to get better acquainted with the material.

Examining differential effects, it was found that the effect on PCS in the LPA group was similar for students of different achievement levels, but in the LPA+ group, high-achieving students appeared to particularly benefit from the intervention. It is possible that the high-achieving students feel that their potential is seen and supported to a greater extent than in the regular classroom, especially as a result of the tasks designed for promoting highly talented mathematics students. The ready-made material obviously made it easier for the teachers to offer tasks to the high-achieving students in particular that showed a good fit with their abilities. Similarly, it is conceivable that the high-achieving students did particularly well with the learning setting, that is, learning in tandems and working independently with new materials.

Limitations

This study makes a valuable contribution to a better understanding of the motivational effects of FA practices. Still, some limitations should be acknowledged when interpreting the results. First, the rather small sample size at the class level (N = 27) did not allow for analyses of the half-longitudinal model in a multilevel path model. Estimation of the effects therefore occurred at the individual level, with standard errors corrected for the nested data structure. The analysis of mediation effects by half-longitudinal designs, moreover, implies further restrictions (Cole & Maxwell, 2003). While it is possible to test whether PCS functions as a partial mediator of group membership on intrinsic motivation, a direct test of a complete mediation is not possible. Further, the assumption of stationarity cannot be tested without at least three waves of data. If this assumption is violated, estimates of the indirect effect will be biased. A confirmation of the observed effects in a longitudinal design with at least three waves of data would therefore be promising. In terms of implementing assessment-based instruction, the supplementary tests need further improvement. At the beginning of the study, too many students mastered each domain of difficulty, so the cut-off criterion had to be tightened again during the course of the study. This, in turn, may have led to many children working with tasks that were too challenging for them. However, when high-achieving children are correctly identified, the materials seem particularly conducive to motivation. Another limitation relates to the lack of control of the actual classroom activities.
Even if the self-reports of the teachers allow for an approximate evaluation of treatment fidelity, we do not know how exactly the individual components were implemented in the classroom and did not have the opportunity to monitor teachers’ classroom activities more objectively, for instance, based on video observations or expert ratings. An approximation of these external judgments is provided by the observation of teachers in the LPA+ group in one feedback situation by two independent observers. However, not all teachers agreed to this. At the same time, the self-reports of some classes show a rather low implementation fidelity, especially for LPA+. Not all teachers implemented the core elements of feedback and adaptive instruction to a desirable extent. The study can therefore help to draw conclusions about the effect of the intervention when the components are implemented “as conducted” rather than “as intended” (Century & Cassata, 2016).

Implications

Overall, the study enables us to better understand the underlying mechanisms of motivational effects of FA. Although only small effects were found, the study suggests that FA is a powerful approach to enhancing students’ PCS in the classroom, which leads to an increase in intrinsic motivation. The implementation of FA practices in everyday school life should thus be further supported. In particular, the dosage and quality of FA should be increased to achieve stronger effects. The results give reason to reflect on the role of each feature of FA. It seems that LPA can play a key role in promoting adaptive teaching behavior and thereby making students feel supported in their competencies, which is why the dissemination of practicable tools for LPA should be given priority. The differential effect for high-achieving children in the LPA+ condition indicates that elaborated materials can support the implementation of adaptive instruction. However, to implement the concepts presented in this study on a broader level, some adjustments are still needed. Intensive professional development training to support teachers in implementing FA can play a crucial role here. The study provides clear evidence that students’ perception of supportive teacher behavior has a motivating effect. Consequently, teachers need to be assisted in being able to continuously demonstrate this behavior in the classroom.