Affect is important in learning mathematics (McLeod, 1992; Schukajlow, Rakoczy, & Pekrun, 2017; Zan, Brown, Evans, & Hannula, 2006). Particularly in higher education, where students have less guidance and more self-directed learning, student motivation is crucial for achievement and may explain drop-out (Di Martino & Gregorio, 2018). However, university students often lose their motivation, which can be explained to some extent by their beliefs (Daskalogianni & Simpson, 2002; Liebendörfer & Schukajlow, 2017) or experiences (Liebendörfer & Hochmuth, 2017). Yet, not much is known about the theoretical mechanisms that help university teachers improve students’ motivation and more specifically students’ interest in mathematics.

Theory-based utility-value interventions have recently shown promising results in terms of improvements in students’ interest in higher education. Given the importance of enhancing students’ interest, the overall goal of the present study was to examine how utility-value interventions improve students’ interest in mathematics teacher education. Our primary goal was to test whether the hypothesized mechanism proposed by expectancy-value theory would hold in mathematics teacher education. Our secondary goal was to analyze the key characteristics of interest-fostering reflections. The results of this analysis should be helpful for designing interest-fostering interventions in the future.

We designed a utility-value intervention to increase students’ interest. First, we conducted a quantitative analysis of the effects of the intervention, reflections on the utility value of mathematics, and prior knowledge on interest at posttest while controlling for interest at pretest. Second, we analyzed the quality of students’ written materials to explore the characteristics of interest-fostering reflections.

1 Theoretical background

1.1 Utility value

Utility value is a central concept in the expectancy-value framework (Barron & Hulleman, 2015; Hulleman, Barron, Kosovich, & Lazowski, 2016; Wigfield & Eccles, 2000). The basic model of students’ motivation involves the two factors of expectancies (Do students think they can solve the task?) and values (Do students want to solve the task?). The higher students’ expectancies and values are, the more likely they are to try to solve a task. Values can be viewed as task-specific (e.g., the value of a problem), activity-specific (e.g., the value of problem solving), or object-specific (e.g., the value of a course or the value of learning materials; Harackiewicz & Priniski, 2018; Krawitz & Schukajlow, 2018) constructs. We focused on utility value, which has been found to be one major value component that is important for learning (e.g., Berger & Karabenick, 2011). Utility value describes the usefulness of activities for students’ goals at the present time or in the future. The direct connections to the individual’s goals distinguish the concept of utility value from other forms of relevance that, for example, may also be present if there is utility value for others but not for the individual (Hulleman, Kosovich, Barron, & Daniel, 2017). Research has shown positive connections between utility value and students’ motivation, behavior, and achievement (Hulleman et al., 2016; Hulleman, Thoman, Dicke, & Harackiewicz, 2017), particularly in mathematics education (Schukajlow, 2017). In mathematics education, values are generally related to students’ performance (Schukajlow, 2017; Seah, 2018) and have been found to predict career choices (Watt, 2006; Watt et al., 2012), enrollment intentions (Bong, 2001), mastery goal orientation (Chouinard, Karsenti, & Roy, 2007), learning activities (Luttrell et al., 2010), and performance (Bong, 2001).

1.2 Utility value of mathematics

Mathematics is highly relevant both in everyday life and in many important vocational areas. Fostering learners’ utility value is an important goal of mathematics education in both school and teacher education (Maaß, 2006). Yet, the commonly accepted relevance of mathematics need not automatically turn into individual utility value, which needs direct connections to an individual’s goals (Hulleman, Kosovich, et al., 2017). Despite its ubiquitous character, people can often avoid contact with mathematics in everyday situations and consequently do not need to connect their goals with mathematics. Mathematics has helped create technologies, such as electronic cash registers, that make mathematics superfluous for the user (Heymann, 2003). Heymann even claimed that “neither students nor adults, as long as the latter are not engaged in mathematics-related professions, are now in a position to experience any practical usefulness of the mathematical knowledge they learned in school, apart from a few very elementary skills” (Heymann, 2003, p. 2). Mathematics educators call this “discrepancy between the objective social significance of mathematics and its subjective invisibility […] the relevance paradox” (Niss, 1994, p. 371, original emphasis).

Heymann (2003, p. 94 ff.), however, added that mathematical knowledge might have more utility if we focus not only on standard applications but also on modeling. Whereas in standard applications, the appropriate model is obvious, in modeling problems, a mathematical model must be formulated, and results must be evaluated (Niss, Blum, & Galbraith, 2007, p. 12). Both the construction and evaluation make use of demanding translation processes between the real world and the mathematical model that benefit from sophisticated mathematical knowledge. Considering modeling may thus enhance utility value more than just thinking about standard applications. Research has shown that modeling may help university students see mathematics as highly relevant in their own lives (Hernandez-Martinez & Vos, 2017).

1.3 Interest

Interest is a motivational variable defined as a person–object relationship that is specific to a person, but, unlike other motivational concepts (Eccles & Wigfield, 2002), it is also specific to a (mental) object. The cognitive component of interest refers to high personal value, and the emotional component refers to positive affect (Hidi, 2006; Hidi, Renninger, & Krapp, 2004; Krapp, 1993, 2007). We focused on individual interest, which is a rather stable disposition. Motivational theories have thus suggested that values positively affect the development of interest and interest-related outcomes (e.g., intrinsic motivation).

Individual interest is an important variable in learning processes. Empirical evidence has been found for connections to students’ use of learning strategies (metacognition, effort, deep learning) in diverse settings (Schiefele & Schreyer, 1994) and students’ enjoyment (Schukajlow, 2015). Consequently, interest is positively related to learning outcomes for students in mathematics in school (Heinze, Reiss, & Rudolph, 2005; Köller, Baumert, & Schnabel, 2001; Singh, Granville, & Dika, 2002) and in mathematics teacher education (Schwippert, Feld, Doll, & Buchholtz, 2013). Further, teachers’ interest in their subject is very important because teachers’ interest is connected to their students’ motivation and learning (Long & Hoy, 2006). Interested teachers report more enjoyment and flow (Schiefele, Streblow, & Retelsdorf, 2013) and have a lower risk of burnout (Kunter, Frenzel, Nagy, Baumert, & Pekrun, 2011).

In mathematics teacher education, students’ interest often decreases substantially during the first weeks or months of their studies (Cooper, 1990). In Germany, longitudinal studies have provided evidence of a decline in the interest of future primary school teachers with compulsory mathematics lectures (Kolter, Liebendörfer, & Schukajlow, 2016) and higher secondary school preservice teachers who chose to study mathematics (Liebendörfer, 2014, 2018; Rach & Heinze, 2013). This decline has even occurred in innovative course designs that focused less on mathematical theory and more on mathematical thinking and problem solving (Kuklinski et al., 2018). Surprisingly, no such decline was found for future lower secondary school teachers, perhaps because their courses also included topics from mathematics education and real-life connections (Liebendörfer & Schukajlow, 2017). Students in such combined courses may see more utility value in the material, thus preventing their interest from dropping.

1.4 Utility-value interventions

The introduction of utility-value interventions was a major step toward fostering students’ interest. Short-term interventions of this kind have yielded promising results for students’ interest, particularly in higher education (see Harackiewicz & Priniski, 2018, for an overview). In school mathematics, such interventions have demonstrated success in fostering students’ utility value (Rosenzweig et al., 2019) and increasing the percentage of students who pass a class (Kosovich, Hulleman, Phelps, & Lee, 2019).

The basic idea of these interventions is that a brief reflection on the utility of the course material may initiate a recursive process in which utility, motivation, and experiences positively influence each other during the subsequent weeks or months (Yeager & Walton, 2011). Students’ improved values may increase their motivation, which can lead to more positive learning outcomes. In turn, the better outcomes may reinforce students’ values and motivation. Thus, long-lasting effects have been found on the basis of one or a few short reflections, often taking less than 1 hour (see Harackiewicz & Priniski, 2018, for an overview). The central mechanism for the effects of utility-value interventions is based on the number of connections that students make with the material. “Students who make more connections between course material and existing knowledge may be more likely to find usefulness in the course, which may enhance motivation” (Hulleman, Kosovich, et al., 2017, p. 389).

Most utility-value interventions have been based on either directly communicated explanations provided by teachers or requested self-generated explanations (Priniski, Hecht, & Harackiewicz, 2018). Reflections on self-generated rationales may have advantages over directly communicated rationales because they are experienced “from within” and are thereby less strongly associated with personal resistance. In many situations, reflections on self-generated explanations are more persuasive (Aronson, 1999). Furthermore, students can reflect on the most suitable arguments that correspond with their own understanding if they generate the rationales themselves (Hulleman, Godes, Hendricks, & Harackiewicz, 2010). In practice, students were often asked to write a letter to a significant person about the relevance and usefulness of the course material as part of their homework (Harackiewicz & Priniski, 2018). Students in control groups were often requested to write a text of similar length on a topic from the course (e.g., Hulleman et al., 2010). The writings have typically been as short as one to three pages (Hulleman et al., 2010) and sometimes even shorter. Although directly communicated information has been effective in some studies, it can undermine interest, particularly for less confident students (Canning & Harackiewicz, 2015; see also Durik, Shechter, Noh, Rozek, & Harackiewicz, 2015). Thus, if students already have knowledge of the potential utility of the material, reflections on their self-generated arguments seem more advantageous at the university level. However, this advantage has not always been confirmed empirically (Ivanov, 2016).

1.5 The quality of students’ reflections

Research on the quality of students’ reflections has shown that their compliance and task involvement may explain different effects (Nagengast et al., 2018; Shechter, Durik, Miyamoto, & Harackiewicz, 2011). Even simple quality indicators (e.g., the number of personal relations given in a text or its length) were found to explain the effects of an intervention on utility value (Rosenzweig et al., 2019) and performance (Harackiewicz, Canning, Tibbetts, Priniski, & Hyde, 2016). Thus, researchers have called for a closer investigation of the quality of texts as the analysis “may offer new insights into how different groups internalize intervention messages and what types of writing interventions have the greatest benefits for students” (Harackiewicz & Priniski, 2018, p. 432).

In the present article, we answered this call by investigating the quality of reflections using theoretically derived a priori categories in a quantitative path analysis and an exploratory qualitative a posteriori analysis. Drawing on research on modeling and applications (Hernandez-Martinez & Vos, 2017; Heymann, 2003), we distinguished between reflections on modeling and reflections on standard applications.

1.6 Effects of prior performance on students’ reflections and interest

Research has shown that students with poor prior performance, in particular, may benefit from a utility-value intervention (Hulleman et al., 2010; Hulleman, Kosovich, et al., 2017) but may also lose interest during an intervention (Canning, Priniski, & Harackiewicz, 2019). Similarly, there have been interaction effects with performance-related measures (e.g., success expectancies) showing both positive and negative interactions depending on the study design (Durik et al., 2015). Again, the connection frequency may explain these varying effects. Whereas there was sometimes a positive correlation between students’ prior performance and the length of their texts, particular groups of disadvantaged students may produce longer essays that reflect active thinking (Harackiewicz et al., 2016). The crucial question seems to be whether the manipulation positively challenges students or instead appears daunting to them (Durik et al., 2015).

In university mathematics, low-performing students are often marginalized and may see the utility-value intervention as a good chance to participate by voicing their opinion, which they often cannot do (Solomon, 2007). In particular, self-generated reflections may help students who have less knowledge, as these students can adjust their examples to match their own competencies (Durik et al., 2015). We thus asked whether students with lower prior knowledge would take advantage of this opportunity such that prior performance would moderate the intervention’s effect on students’ reflections on the utility value of modeling and standard applications. Because these two qualities represent only part of students’ reflections, we also asked whether students’ prior performance would moderate the direct effect of “the request to reflect on utility value” on students’ interest.

2 Set-up of the studies, analyses, and research questions

On the basis of prior research, we aimed to investigate the interplay of students’ connections to both modeling and standard applications and their prior performance in a utility-value intervention for the development of interest. We further sought to identify characteristics of interest-promoting reflections. Thus, we conducted an experiment using self-generated reflections on utility value. Our research interests in both effects of the intervention and characteristics of reflections called for a mixed-methods design. We used quantitative methods to estimate effects of the intervention and qualitative methods for the “development of explanations and the identification of additional variables which help to explain variance in the quantitative data” (Kelle & Buchholtz, 2015, p. 340). We aimed to address the following research questions (RQs):

  • RQs 1a, b, c, and d: Are there positive indirect effects of the request to reflect on utility value on students’ interest via their reflections on the utility value of (a) modeling and (b) standard applications? (c) Is there a positive direct effect? (d) Do these effects add up to a positive total effect?

  • RQs 2a, b, and c: Do students with lower prior performance reflect more on the utility value of (a) modeling and (b) standard applications? (c) Is there a higher direct effect on interest?

  • RQ 3: Which characteristics distinguish between reflections on the utility value of mathematics that improve students’ interest versus other reflections?

RQs 1 and 2 produced the model in Fig. 1. The hypothesized path model links the treatment (request to reflect on the utility value of mathematics) and students’ prior interest as well as their prior performance with the outcome (future interest). Reflections on the utility values of modeling and standard applications serve as intervening variables between the treatment and future interest. Prior performance is included as a moderator of the direct and indirect effects of the treatment on the outcome. Prior interest is included as a predictor of future interest.
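As a reading aid, the hypothesized paths can be written as a set of regression equations. This is a sketch under stated assumptions, not the authors' exact specification: the symbols are introduced here for illustration, with X the request to reflect on utility value, P prior performance, I_1 and I_2 prior and future interest, and M_1 and M_2 the reflections on the utility value of modeling and standard applications. Which covariates enter each equation in the fitted model is an assumption.

```latex
\begin{align}
M_j &= \alpha_j + a_j X + d_j P + g_j (X \cdot P) + \varepsilon_j, \qquad j \in \{1, 2\} \\
I_2 &= \alpha_3 + c' X + b_1 M_1 + b_2 M_2 + s\, I_1 + d_3 P + g_3 (X \cdot P) + \varepsilon_3 \\
\text{indirect}_j &= a_j b_j, \qquad \text{total} = c' + a_1 b_1 + a_2 b_2
\end{align}
```

In this notation, RQs 1a and 1b concern the products a_1 b_1 and a_2 b_2, RQ 1c concerns c', RQ 1d concerns their sum, and RQs 2a to 2c concern the interaction coefficients g_1, g_2, and g_3. With P standardized, the indirect and total effects as written apply to students with average prior performance.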

Fig. 1 Theoretical model based on RQs 1 and 2

3 Study design and procedure

The sample for both studies comprised 58 students enrolled in the second part of a two-semester course on arithmetic and geometry (14 weeks each) taught by the second author and scheduled in the first year of a lower secondary school teachers’ program. All but three of them were in their second semester, 15 were male, and their ages ranged from 18 to 33 (M = 21).

The students were asked to answer a questionnaire in week 2 (T1) of the course. The treatment was assigned in weeks 5 to 6. Following the treatment (week 7), the students were given another questionnaire (T2).

In the lectures, applications of mathematics were mentioned, but students were not trained to reflect on the utility of mathematics. For the treatment, students were randomly assigned to one of two groups. Students in the experimental condition (n1 = 28) were asked to: “Write a letter (1,000–1,500 words) to someone close to you (e.g., a relative, partner, or friend) explaining (a) the meaning and (b) the utility of mathematics. Please refer to the contents of the geometry lecture.” Students in the control condition (n2 = 30) were asked to write an essay (1,000–1,500 words) elaborating on a topic from the geometry lecture. All students were allowed to use literature on mathematics or mathematics education as additional resources. This task replaced the usual weekly exercise sheet and did not affect their grades. To ensure that the students worked on the correct task, the tasks were sent by e-mail. Students were asked to work on the task individually. All students provided a solution in accordance with these requirements. Still, students from one group might have known the other group’s task.

4 Study 1: Quantitative analysis

4.1 Method

4.1.1 Instruments

We measured interest with the mathematics interest scale of the Project for the Analysis of Learning and Achievement in Mathematics (PALMA), whose validity was tested with quantitative and qualitative analyses (Frenzel, Pekrun, Dicke, & Goetz, 2012). Out of six items, we used the three that had been most strongly associated with the three most important interest components (emotional experience, value of mathematics, and behavioral engagement): “I am not interested in mathematics” (reverse-scored), “I like to read books or solve brain teasers related to mathematics,” and “Doing mathematics is one of my favorite activities.” These items have been shown to form a reliable scale in prior research (Schukajlow et al., 2012; Schukajlow & Krug, 2014).

The answers were given on a 6-point Likert scale ranging from 1 (strongly disagree) to 6 (strongly agree). The internal consistencies (Cronbach’s alphas) of the scale were 0.62 and 0.73 at T1 and T2, respectively. We further asked students to provide the grade from their last mathematics course in school as an estimate of prior performance. German grades range from 1 (best) to 6 (worst) but were recoded so that higher values indicated better performance.
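The scoring steps described here (reverse-scoring the negatively worded item, recoding grades so that higher is better, and computing Cronbach's alpha) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the five response rows are hypothetical, and the resulting alpha has no relation to the reported values of 0.62 and 0.73.

```python
from statistics import variance


def reverse_score(value, low=1, high=6):
    """Mirror a response on a low..high scale, e.g. 1 -> 6 and 6 -> 1."""
    return low + high - value


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the sum score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical raw answers (NOT study data). Column 0 stands for the
# negatively worded item "I am not interested in mathematics".
raw = [[1, 5, 6], [2, 4, 5], [4, 3, 2], [5, 2, 2], [3, 4, 3]]
scored = [[reverse_score(r[0]), r[1], r[2]] for r in raw]
alpha = cronbach_alpha(scored)

# The same mirroring recodes German grades (1 = best) so that higher is better.
recoded_grade = reverse_score(2)
```

With only three items, alpha is sensitive to each item's variance, which is one reason short scales often yield moderate values.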

We coded the texts for the two forms of applications to assess the number of reflections on utility value. The texts were split into segments of up to 10 sentences that addressed the same idea or aspect. Each segment was then coded as either containing no application, modeling (e.g., finding an optimal package for a given volume), or standard applications (e.g., calculating the change in a supermarket; see other examples in the “Results” section). Modeling was coded if three criteria were satisfied. First, the segment had to refer to a situation in the real world that contained an application of mathematics. Second, a translation of the situation into a mathematical model was necessary to solve the given problem. Third, this translation had to contain a decision about how to mathematize the information (e.g., by choosing one of several possible mathematical models or making assumptions about vague conditions). We coded all 58 essays comprising 955 segments in both the experimental and control groups. To check for intercoder reliability, a second coder who was blind to the goals of the study was trained and then given 17% of the essays. Out of 146 segments, 122 (84%) were coded consistently (Cohen’s kappa = 0.72; Fleiss, Levin, & Paik, 2003). Inconsistency usually resulted when the second coder coded a section for standard applications when the main coder did not (e.g., statements that mathematics has many applications that did not name examples). The resulting numbers of segments coded for modeling or standard applications were taken as measures of reflections on the utility value of modeling or standard applications, respectively.
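The two agreement statistics reported above can be illustrated with a short sketch. The function names and the ten example codes are hypothetical; only the counts 122 and 146 come from the text. Cohen's kappa corrects the observed agreement for the agreement expected by chance from each coder's category frequencies.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of segments to which both coders assigned the same category."""
    return sum(x == y for x, y in zip(codes_a, codes_b)) / len(codes_a)


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two coders."""
    n = len(codes_a)
    p_o = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement from the product of the coders' marginal proportions.
    p_e = sum(freq_a[c] * freq_b[c] for c in set(codes_a) | set(codes_b)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for ten segments (NOT study data):
# N = no application, M = modeling, S = standard application.
main_coder = list("NNMSSNMSNS")
second_coder = list("NNMSNNMSSS")
kappa = cohen_kappa(main_coder, second_coder)

# Agreement as reported in the text: 122 of 146 segments.
reported_agreement = 122 / 146
```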

4.1.2 Statistical analysis

To address all questions with one model, we conducted a path analysis. The ratio of the sample size to the number of free parameters was 1.81 (58/32) and was thus below the critical value of 5 (Kline, 2005). To ensure the stability of the results, we additionally tested all single regressions in separate models with almost identical results. We present the general model for the sake of consistency and comprehensibility. We computed chi-square statistics, the comparative fit index (CFI), and the root mean square error of approximation (RMSEA). According to Hooper, Coughlan, and Mullen (2008), the CFI should be above 0.95 and the RMSEA should be below 0.08. The chi-square was χ2 = 0.78, p = 0.85, d.f. = 3, χ2/d.f. = 0.26, and because χ2/d.f. < 1, we obtained CFI = 1.00 and RMSEA = 0.00. Thus, the data fit the hypothesized path model well according to all fit indices.
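The link between the chi-square statistic and the two fit indices can be made concrete with a small sketch. It uses common textbook definitions of RMSEA and CFI, which may differ in detail from the software's implementation (some programs divide by N rather than N − 1). The baseline-model values passed to `cfi` are hypothetical, because the text does not report them; the result does not depend on them whenever the model chi-square is below its degrees of freedom.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 variant)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2, df, chi2_base, df_base):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2 - df, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


# Values reported in the text: chi2 = 0.78, d.f. = 3, N = 58, 32 free parameters.
ratio = 58 / 32
fit_rmsea = rmsea(0.78, 3, 58)
# chi2_base and df_base are HYPOTHETICAL placeholders (not reported).
fit_cfi = cfi(0.78, 3, chi2_base=60.0, df_base=10)
```

Because both indices are built on max(χ2 − d.f., 0), any model with χ2 below its degrees of freedom reaches the boundary values, so they carry little additional information in that case.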

The proportion of missing data ranged from 7 to 11% and resulted from a small number of students not participating in one of the two surveys. Thus, we assumed no systematic connection between missingness and the values of the missing variables (missing at random) and used a maximum-likelihood estimator based on incomplete data sets (Graham, 2009). Because the mediation coefficients could be non-normally distributed, we used the bias-corrected bootstrap test for indirect effects (Fritz & MacKinnon, 2007). The path analysis was computed in AMOS 25.0.0 (Arbuckle, 2018).
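The logic of a bias-corrected bootstrap test for an indirect effect can be sketched for the simplest case of one predictor x, one mediator m, and one outcome y. This is an illustration under simplifying assumptions and not the AMOS procedure used in the study: the data are simulated, complete cases are assumed, and all names are hypothetical. The indirect effect is the product of the slope of m on x and the slope of y on m controlling for x. The percentile limits are shifted by a correction z0 that reflects how far the bootstrap distribution is off-center relative to the original estimate.

```python
import random
from statistics import NormalDist, fmean


def indirect_effect(rows):
    """a*b for rows of (x, m, y): a = slope of m on x, b = slope of y on m given x."""
    xs, ms, ys = zip(*rows)
    mx, mm, my = fmean(xs), fmean(ms), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    smm = sum((m - mm) ** 2 for m in ms)
    sxm = sum((x - mx) * (m - mm) for x, m in zip(xs, ms))
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    smy = sum((m - mm) * (y - my) for m, y in zip(ms, ys))
    a = sxm / sxx
    # Partial slope of y on m from the two-predictor normal equations.
    b = (smy * sxx - sxm * sxy) / (smm * sxx - sxm ** 2)
    return a * b


def bc_bootstrap_ci(rows, n_boot=2000, alpha=0.05, seed=1):
    """Bias-corrected bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    estimate = indirect_effect(rows)
    boots = []
    while len(boots) < n_boot:
        sample = rng.choices(rows, k=len(rows))
        try:
            boots.append(indirect_effect(sample))
        except ZeroDivisionError:
            continue  # skip degenerate resamples without variance in x or m
    boots.sort()
    nd = NormalDist()
    # Bias correction: share of bootstrap estimates below the original estimate.
    prop = sum(b < estimate for b in boots) / n_boot
    prop = min(max(prop, 1 / n_boot), 1 - 1 / n_boot)
    z0 = nd.inv_cdf(prop)
    z = nd.inv_cdf(1 - alpha / 2)
    lo_idx = min(n_boot - 1, int(nd.cdf(2 * z0 - z) * n_boot))
    hi_idx = min(n_boot - 1, int(nd.cdf(2 * z0 + z) * n_boot))
    return estimate, boots[lo_idx], boots[hi_idx]


# Simulated data (NOT study data): x is a 0/1 condition, true a = 2, true b = 1.
data_rng = random.Random(0)
rows = []
for i in range(60):
    x = i % 2
    m = 2.0 * x + data_rng.gauss(0, 0.5)
    y = 1.0 * m + data_rng.gauss(0, 0.5)
    rows.append((x, m, y))

estimate, ci_low, ci_high = bc_bootstrap_ci(rows)
```

An indirect effect is judged significant at the chosen alpha level when the interval excludes zero. The bootstrap avoids assuming that the product a*b is normally distributed.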

4.2 Results

A preliminary analysis indicated that the experimental and control groups were not significantly different on prior performance or interest (see Table 1).

Table 1 Correlations, means, and SDs for the quantitative variables

The correlations between prior performance and interest were as expected (i.e., interest was relatively stable over time and was positively correlated with prior performance; see Table 1). The path model is presented in Fig. 2.

Fig. 2 Path model for effects of the request to reflect on utility value. Note. Solid paths represent significant regression coefficients. The request to reflect on utility value was dummy-coded (0 = no request, 1 = request). All other variables were standardized. The solid path from Request to reflect on utility value to Reflection on the utility value of standard applications, for example, indicates that students in the experimental condition had 1.56 SD more reflections on the utility value of standard applications. **p < 0.01, ***p < 0.001

RQ 1: Reflections on the utility value of modeling affected students’ interest, but students in the experimental condition did not reflect more on the utility value of modeling than those in the control group. Students in the experimental condition reflected more on the utility value of standard applications, which did not predict future interest. Consequently, bootstrapping revealed the absence of indirect effects of the intervention on interest (RQs 1a and 1b; p ≥ 0.44). The direct effect of the intervention was also nonsignificant (RQ 1c; p = 0.71). Accordingly, these effects did not add up to a significant total effect (RQ 1d; p = 0.36).

RQ 2: Students’ prior performance did not serve as a moderator of the effect of the intervention on the number of connections to modeling (p = 0.42), answering RQ 2a. It moderated the effect on reflecting on the utility value of standard applications, but unexpectedly, students with poorer prior performances reflected less, answering RQ 2b. The estimates show that in the experimental condition, students with lower prior performance (−1 SD) wrote 0.76 segments, average students wrote 5.71 segments, and students with higher prior performance (+1 SD) wrote 10.66 segments when reflecting on the utility value of standard applications. Finally, prior performance did not moderate the direct effect of the intervention on students’ interest (p = 0.38), answering RQ 2c.
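The three estimates for RQ 2b follow a simple linear pattern in standardized prior performance, which the sketch below makes explicit. The value 5.71 is taken from the text; the slope of 4.95 segments per SD is back-calculated here as (10.66 − 0.76) / 2 and is not a coefficient reported by the authors. The function name is hypothetical.

```python
def predicted_segments(z_prior, at_mean=5.71, per_sd=4.95):
    """Predicted number of standard-application segments in the experimental
    condition for a student z_prior SDs from mean prior performance.

    per_sd is back-calculated from the reported estimates (an assumption).
    """
    return at_mean + per_sd * z_prior


low, average, high = (predicted_segments(z) for z in (-1, 0, 1))
```

Read this way, one SD of prior performance corresponds to roughly five additional segments on standard applications among students who received the request to reflect.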

5 Study 2: Qualitative analysis of students’ reflections

The quantitative analysis did not confirm an effect of the utility-value intervention on students’ interest. For 40% of the students with complete data, interest had fallen by 0.84 SD on average; for 16%, it remained unchanged; and for 44%, it increased by 0.63 SD on average. Thus, some students’ interest increased, and their texts may indicate interest-raising reflections.

5.1 Method

We thus turn to RQ 3: Which characteristics distinguish between reflections on the utility value of mathematics that improve students’ interest versus other reflections? Variations in the quality of the reflections may be manifold. Yet, in the path analysis, reflections on the utility value of modeling explained increasing interest. Therefore, we compared segments on modeling written by students with an above-average increase in interest (≥ 0.63 SD; “interest-fostering reflections”) with other students’ texts. The analysis should reveal aspects of quality that had not been worked out in the literature before. The cut-off for interest-fostering reflections was set to the average increase to obtain a suitable number of essays, resulting in 12 segments from seven students that were to be contrasted with 126 segments from 48 students.

5.1.1 Methods for developing categories

Based on the given segments and their deductive coding for modeling and standard applications, an inductive category formation was carried out (Kuckartz, 2019; Mayring, 2015, p. 374 ff.). In the first coding cycle, the text segments of interest-fostering reflections were analyzed, and their special features were recorded in keywords as provisional categories. For each segment, we examined whether old categories should be changed or merged or whether new categories should be included in the coding schema. The provisional categories covered diverse aspects (e.g., references to the lecture or other sources; connections to the past, the present, or the future, to utility for oneself, close people, specific groups, or all of society, and to the level of detail provided in the segment). This resulted in a thematic matrix of provisional categories and students (Kuckartz, 2019, p. 187), where the cells of the matrix contained text excerpts by the corresponding students coded with the respective category. In the second coding cycle, this matrix was contrasted with all segments of standard applications by other students. The provisional categories were modified or deleted to ensure that their intersection described interest-fostering reflections only. In a third cycle, the final categories were then used to code all segments that referred to any applications to show the differences between interest-fostering reflections and other reflections.

5.2 Results

5.2.1 Preliminary remarks

Many essays appeared superficial. Some read more like repetitions of statements from the lecture. Other essays, such as the following, drew incorrect conclusions:

Route plans of busses and trains are an example that I would not understand without knowledge of Eulerian trails. You probably don’t know Eulerian trails, but because this mathematical principle was discovered at that time, we can apparently simply read bus plans and find our way around. [Footnote 1]

In similar ways, several students made questionable connections between mathematics and standard applications without elaborating on the role of mathematics. Further, these and other statements illustrate that students wrote about utility without referring to their own personal goals.

5.2.2 Characteristics of interest-fostering reflections

We found three characteristics that could distinguish between interest-fostering reflections and other reflections: lecture references, originality, and general utility. A lecture reference was made when an application could be connected to material presented in the lecture. The connection could be made explicitly but was often made implicitly by using mathematical concepts from the lecture. Originality was given when the application or parts of it had not been presented in the lecture in the way they were presented in the text segment. Giving a new context or stating a new problem in a given context could fulfill this characteristic. General utility was given when the application helped solve a problem that was relevant to society.

In Table 2, we first present three text segments that were coded as reflections on the utility value of standard applications and represented typical essays written by students in the experimental group without substantial changes in their interest. In contrast, segments 4–6 were coded as reflections on the utility value of modeling written by three students whose interest had increased by 0.67 to 1 point on the scale (0.8 to 1.2 SD).

Table 2 Text segments and their coding for (L) lecture reference, (O) originality, and (G) general utility

In Segment 1, planar graphs and all the descriptions had been covered in the lecture, so the lecture reference was met, but originality was not met because there was nothing new. General utility was met, because problems with overlapping pipes are present in engineering. In Segment 2, the examples had no direct connection to the lecture on geometry, so the lecture reference was not met, but originality was because mobile phones were not addressed in the lecture. General utility was also met because the use of mobile phones is always present. Segment 3 showed a direct connection to symmetries, which were part of the lecture, so the lecture reference was met. Originality was also given, because snowflakes were not discussed in the lecture. However, there was no general utility in this segment because no problem was presented. Other segments missing general utility included applications of problems that were only fictional (e.g., discussing a fictional character’s training in a manga series) or historical (the use of geometry in ancient Egypt). In Segment 4, the lecture reference was met by the connections to geometric figures and pi. Originality was met by the context of painting curvilinear surfaces. The more general problem of estimating how much paint or material is needed is common, so general utility was also met. Segment 5 refers to the problem of an optimal package. The lecture reference was met by the geometric context, and originality was met because the packaging problem and the history of Tetra Pak had not been covered in the lecture. Finally, general utility was given because packaging problems occur in economics. Segment 6 takes up curves of constant width from the lecture and elaborates on Wankel engines and noncircular drills, so the lecture reference was given. The examples had only been identified in the lecture, so the elaborations (e.g., considering the movement of the center of the drill bit) were coded for originality. General utility was also met, as Wankel engines are still produced, and drilling holes with different shapes may have present and future applications in engineering.

These three categories were used to code all essays. All three categories appeared frequently. Of all segments relating to applications, 72% were coded for a lecture reference, 46% for originality, and 63% for general utility. Of the students with positive interest development, 64% had at least one segment coded for a lecture reference, 45% had a segment coded for originality, and 55% had a segment coded for general utility. Of those with negative or no development, 65% had a segment coded for a lecture reference, 46% had a segment coded for originality, and 50% had a segment coded for general utility. Fisher’s exact test revealed no significant association between the student group (positive interest development or not) and the categories (each p ≥ 0.78).

Although there was overlap between these categories, segments that met all three characteristics were very rare (7% of segments relating to any applications). Four students had segments specific to modeling that met all three characteristics, and each of them had increased their interest by at least 0.5 SD. The association between an above-average interest increase and having a segment that met all three criteria was significant (p = 0.001, Fisher’s exact test). This result indicates that the combination of a lecture reference, originality, and general utility might be a distinguishing feature of a high-quality, interest-fostering reflection.
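For readers less familiar with Fisher’s exact test, the following minimal sketch shows how a two-sided p value can be computed for a 2×2 table that crosses a reflection feature (present or absent) with the interest outcome (above-average increase or not). The function name `fisher_exact_two_sided` and the cell counts in the usage lines are hypothetical and chosen for illustration only; they are not the counts from the present study, and the sketch is not the software used for the analyses reported here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: feature present / absent. Columns: outcome yes / no.
    Returns the probability, under fixed margins, of all tables that are
    at most as likely as the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


# Hypothetical counts (NOT the study's data): 4 students with the feature,
# all with an above-average increase; 26 without, of whom 6 increased.
p_value = fisher_exact_two_sided(4, 0, 6, 20)
print(f"two-sided p = {p_value:.4f}")
```

With very small cell counts such as these, the exact test is preferable to a chi-square approximation, which is presumably why it suits a comparison in which only four students show the feature of interest.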

6 General discussion

The primary goal of our research was to test whether the hypothesized mechanism proposed in expectancy-value theory would hold in mathematics teacher education. The secondary goal was to identify the key characteristics of interest-fostering reflections. We now discuss empirical, theoretical, and practical contributions as well as strengths and limitations that need to be noted.

6.1 Empirical contributions

Our first finding was a null total effect of the intervention. This result conflicted with most studies (Hulleman et al., 2010; Kera & Nakaya, 2017) but was in line with others (Ivanov, 2016; Lindeman, 2017). The null effect of the request to reflect on utility value may have different sources. One reason might be students’ low task involvement, as students’ compliance and task involvement were previously found to predict effects of utility-value interventions (Nagengast et al., 2018). Further, some students did not follow the instructions completely and did not refer to the lecture in their reflections. As all students’ texts met the required length, and the number of reflections on standard applications (but not modeling) was much higher in the experimental group than in the control group, we believe that many students did their best to follow the instructions. The new contribution of our study that adds to other findings is that utility-value interventions seem to work only under certain conditions (Nagengast et al., 2018; Rosenzweig et al., 2019; Shechter et al., 2011), and this result calls for the identification of such conditions in future studies.

Our second finding is that reflections on the utility value of modeling (but not standard applications) explained future scores on interest in mathematics. Thus, the quality of reflections is important for students’ interest. This is a new contribution because, besides superficial indicators of quality (e.g., the number of personal relations), the quality of the reflections had not yet been analyzed. This contribution calls for investigations of quality, which leads to our next two findings.

Our third finding stems from the analysis of interest-fostering reflections. Three characteristics may help in identifying interest-fostering reflections: lecture references, originality, and general utility. These characteristics indicate that reflections build on the lecture’s content, contain ideas original to the students’ work, and address a topic that has utility for at least some groups in society.

Our fourth finding is the most prominent one: Only a few students were able to write reflections of good quality on their own. Two indicators underline this conclusion. First, students in the experimental condition reflected more on the utility value of standard applications but not modeling. Second, students’ prior performance moderated the effect of the intervention on their reflections on the utility value of standard applications, but, contrary to our expectations, students with low prior performance did not reflect more but actually reflected less on standard applications. It thus seems that reflecting on the utility value of modeling was a demanding task for all students, and reflecting on the utility value of standard applications was demanding for students with low prior performance. The qualitative analysis strongly supported this conclusion. Several reflections seemed questionable in terms of the proposed utility. Surprisingly, no students linked their reflections to their personal experiences with modeling or to their personal goals. Further, whereas the single characteristics of interest-fostering reflections were met rather often, only four reflections included a statement that satisfied all three of them at once. Because students rarely produced a reflection that had all three characteristics, we believe they had trouble finding suitable examples for their reflections. Some utility examples that students proposed (e.g., understanding a bus plan using knowledge of Eulerian trails) were not convincing. Contrary to our expectations, the task of reflecting on the utility of a lecture on geometry could have been too hard for most of them.

6.2 Theoretical contributions

On a theoretical level, we further expanded the theory of utility-value interventions by bringing up assumptions about which connections between course materials and existing knowledge could foster students’ interest. Pursuing our first goal, we confirmed the assumption from the literature (Harackiewicz & Priniski, 2018) that the quality of the reflections can help explain their effect. To our knowledge, this is the first study to provide evidence for this claim. Because quality turned out to be specific to mathematics, it is an open question what indicators of quality might be identified in other domains.

By contrast, we found the assumption that students can make the most convincing arguments on the basis of their own understanding (Hulleman et al., 2010) invalid for the case of preservice mathematics teachers. Earlier studies also reported on preservice lower secondary school teachers’ reflections by focusing on simplistic situations such as counting change in the supermarket (Maaß, 2006). Finding utility value in higher mathematics seems particularly difficult due to the relevance paradox (Niss, 1994). In teacher education, large parts of the content are beyond what Heymann (2003) labeled practically useful, so we should not expect students to recognize many everyday applications of their course material. Unlike other undergraduates, preservice teachers are not headed toward a profession that applies mathematics outside academia, so they might have trouble finding vocational applications that connect to their personal goals. This might also explain the lack of connections to students’ individual goals, which are central for the intervention’s success (Hulleman, Kosovich, et al., 2017).

We further confirmed the importance of modeling for the development of motivation, as suggested in mathematics education. Referring to modeling may help circumvent the relevance paradox (Heymann, 2003; Niss, 1994). Modeling includes choosing appropriate mathematical models and calculations as well as interpreting and validating results. Knowledge of modeling may thus provide utility value even if technology is used to replace mathematical knowledge, and it has consequently been shown to be helpful for fostering university students’ perceived relevance (Hernandez-Martinez & Vos, 2017). Because modeling is only one specific aspect of mathematics, future research should investigate what else may constitute quality in reflections on the utility value of mathematics.

Pursuing our second goal, describing the quality of interest-fostering reflections, we set up three criteria: lecture references, originality, and general utility. The first two ensured that a personal connection was made between applications and the material that students learned. They are in line with the theoretical mechanism based on making connections (Hulleman, Kosovich, et al., 2017). The third characteristic, general utility, ensured that the reflections included some use of mathematics that was not only for individual people. This adds a new perspective to the theory of utility-value interventions. General utility is closely linked to what Di Martino (2019) called “social utility” and follows the “general approach” in Wedege’s (2007) terms. According to these concepts, the utility value of mathematics is not based on individual needs but on societal or economic demands. The reference to general utility may replace the focus on utility value in a narrower sense (referring to personal goals). Referring to societal relevance may motivate people (Di Martino, 2019; Wedege, 2007), particularly preservice teachers who aspire to a career that serves society.

6.3 Practical contributions

The present study suggests that in teaching, educators can try motivating students not only by referring to applications of mathematics but by pointing precisely to modeling. Despite or even because of its more complex nature, thinking about modeling may be more persuasive for students than thinking about standard applications.

Further, making connections between the material and general utility may be beneficial for students’ motivation. Although students might not personally need a certain piece of mathematical knowledge in their future lives, addressing general utility might still motivate them, particularly educational multipliers such as future teachers.

Given students’ limited ability to generate suitable reflections on their own, we should not simply ask students to reflect on the utility of mathematics. It seems much more promising to give students prepared examples that are related to modeling and general utility.

6.4 Strengths, limitations, and future directions

The longitudinal design of the study and the randomization would allow causal conclusions to be drawn if there were no further limitations, and the setting in a real lecture ensured high ecological validity. Students’ reflections were assigned like regular homework to ensure that all of them completed the task, and the lengths of the texts indicated that all students followed the instructions. However, the text segments might reflect events outside the treatment that happened after T1 but before the students wrote their texts. In particular, as a contamination effect, students from the control group might also have begun reflecting after finding out about the other students’ tasks. Then, the request to reflect on utility value could have caused reflections on the utility value of modeling in both groups. But this would not explain the null effect in both groups and the often low quality of students’ reflections. Further, the texts might not have perfectly represented students’ reflections as they might not have written down all they had reflected on.

Although the items we selected for assessing interest were demonstrated to be valid in prior studies (Frenzel et al., 2012), and the scale we used was found to be reliable in prior research (Schukajlow et al., 2012; Schukajlow & Krug, 2014), the internal consistency at T1 was low. A more reliable scale should be used in future studies. We should further think of using instruments that cover aspects of students’ interest that are more closely linked to their personal experiences or anticipated use of mathematics as future teachers.

As our sample size was too small to give precise estimates of effects by testing the hypothesized mediation model, we also analyzed three separate models that confirmed the stability of our results. However, these results should be interpreted with caution, and they should be replicated in larger samples to collect more indications of the generalizability of these findings.

The qualitative analysis of students’ texts provided deeper insights into their level of reflection and extended the understanding of both the positive role of interest-fostering reflections and the null effect of the request to reflect on utility value in the present study. As the categories were drawn from a restricted sample of texts, they have only a hypothetical status, and other characteristics of interest-fostering reflections may be found in the future. Whether the characteristics we found really help to predict students’ interest is an open question that must be tested with new data.

Future research should further examine how students can best be helped to reflect on the value of mathematics. Many students, particularly when they are fresh out of school, are not used to reflecting on applications of mathematics, as this is atypical for school mathematics. Presenting modeling examples in the lecture, demonstrating how the course material helps in solving modeling examples, and encouraging students to find and reflect on additional examples might be promising approaches in future studies aimed at improving interest via utility-value interventions. Besides such training, directly communicated interventions (Canning & Harackiewicz, 2015) can be offered to ensure that students reflect on suitable examples.

Working out suitable examples, however, is a demanding task. There are many resources on the impact of mathematics in the modern world, including mathematical simulations in research, big data, and artificial intelligence (e.g., European Mathematical Society, 2020). However, connecting such resources to the specific material learned in a course may be difficult. We know from modeling in school mathematics that real-life problems may seldom be presented in the same way professionals would deal with them. Yet, preparing such problems would be valuable for teacher education.

Because general utility transcends personal utility, we should also rethink the framing of such interventions. In the expectancy-value framework (Barron & Hulleman, 2015; Hulleman et al., 2016), references to students’ identity instead of their personal goals would mean that the reflections affect their attainment value rather than their utility value. Similar interventions based on a self-transcendent purpose for learning could improve students’ academic self-regulation (Yeager et al., 2014). Interventions in teacher education in particular could focus on attainment value by addressing applications that are important to society and are thus related to students’ identity. The first attainment value interventions seem to be in the making (Hecht, Canning, Tibbetts, Priniski, & Harackiewicz, 2016).

6.5 Conclusion

Our study shows that utility-value interventions do not always increase students’ interest. The main finding of the quantitative analysis is that the effects of asking students to reflect on the utility of their lectures depend strongly on the quality of students’ reflections. In the present study, many preservice lower secondary school teachers reflected not on modeling but on standard applications. An important finding of the qualitative analysis is that interest-fostering reflections included three characteristics: lecture references, originality, and general utility. Future research should focus on stimulating reflections with the specific qualities that were related to an increase in students’ interest in mathematics. This includes reflections on the utility value of modeling and reflections that integrate lecture references, originality, and general utility.