Introduction

Retrieval practice is a learning strategy that is based on the well-established finding that repeated retrieval of to-be-remembered information from long-term memory, usually through repeated testing, is highly beneficial for learning and retention when compared to other learning strategies (Carpenter et al., 2008; Dunlosky et al., 2013; Karpicke & Blunt, 2011; Rawson et al., 2013; Roediger & Karpicke, 2006; Rowland, 2014). In recent years, the research field of retrieval practice (also called practice testing, test-enhanced learning, etc.) has shifted from more experimental settings to classroom-based research using educationally relevant material. A number of reviews and meta-analyses have found that the benefits of retrieval practice can be transferred to classroom activities (Agarwal et al., 2021; Lamotte et al., 2021; Moreira et al., 2019; Schwieren et al., 2017; Sotola & Credé, 2021; Yang et al., 2021). In a review by Agarwal et al. (2021), the aim was to suggest practical recommendations for when retrieval practice is beneficial for learning. The findings indicated that all retrieval practice conditions in the included studies resulted in positive effects, which suggests that as long as students participate in some form of retrieval practice activity, they will experience a learning benefit. However, in all studies included in the review by Agarwal et al. (2021), participants were required to take part in the retrieval practice activities, which raises the question of how transferable the results are to classroom settings where students’ actions are less controlled. With the effectiveness of retrieval practice in mind, it is crucial that students self-regulate their learning and actually use retrieval practice on a voluntary basis. 
The overall focus in the present study is to explore how students use the effective learning strategy of retrieval practice when it is optional, and whether individual differences might be related to self-regulated use of retrieval practice.

Self-regulated learning (SRL) concerns the processes students use to direct their own study behaviors in order to achieve their goals (Pintrich, 2000; Pintrich & Zusho, 2007; Zimmerman, 2001, 2013). More specifically, Zimmerman (2002) suggests that self-regulation processes include three phases. The first phase includes setting learning goals and planning for how to achieve them. The second phase involves action to achieve the goals set, which includes the use of learning strategies. In the final phase, learning is evaluated in relation to the goals set in the first phase. The ability to regulate one’s own learning has been found to be an important skill for achieving academic success (Broadbent & Poon, 2015), and has also been suggested to contribute to or explain gender differences in school achievement (Weis et al., 2013). A common finding regarding gender differences in school achievement is that female students tend to have higher course grades regardless of school subject, whereas males tend to score higher on achievement tests (see, for example, Voyer & Voyer, 2014). Studies examining gender differences in SRL indicate that females have a greater tendency than males to employ strategies of SRL (Panadero et al., 2017; Zimmerman & Martinez-Pons, 1990), but there are mixed results regarding gender and the different components of SRL (Martinez-Lopez et al., 2017; Stanikzai, 2019). Note that the cited sources use the term gender, whereas the present study recorded participants’ biological sex, which is why different terms are used.

Self-regulated use of retrieval practice

Although there are many studies about self-regulated learning in general, self-regulated use of retrieval practice has not received much attention. A few recent studies suggest that although there is vast support for the effectiveness of retrieval practice, students tend to use the learning strategy to a low extent. Instead, students in both undergraduate (Blasiman et al., 2017) and secondary school (Dirkx et al., 2019) contexts tend to favor study strategies less beneficial for learning, such as restudy, and this appears to be a common flaw in students’ self-regulated learning. For example, Tullis and Maddox (2020) investigated self-reported use of retrieval practice among middle- and high-school students, and found that while both age groups used retrieval practice, it was used to a lesser extent than re-reading. In a study investigating the optional use of retrieval practice, undergraduate students were provided optional online reviews, either in test format (a quiz) or read format (students were provided question and answer) (Corral et al., 2020). The results indicated that students underutilize retrieval practice, as only 12 percent of the total reviews were completed. Moreover, only 55 percent of the participants completed at least one review (either test or read format) during the semester (Corral et al., 2020). Trumbo et al. (2016) also investigated the optional use of quizzes but compared the effect of optional vs. required quizzes on test scores and final course grade. Participants in the optional quiz group spent less time on each quiz and completed fewer quizzes in comparison with participants in the required quiz group. This difference was reflected in final grade performance such that participants in the required quiz group achieved better final grade performance than participants in the optional quiz group. The authors suggest that students may need extrinsic motivation in order to utilize quizzes as a study method (Trumbo et al., 2016). 
To sum up, a vast amount of research shows that retrieval practice is an effective learning strategy. However, getting students to employ this strategy in their own studying seems to be a challenge, as findings indicate that students use retrieval practice to a low extent when the use is optional. One interesting question regarding the use of retrieval practice is whether the employment of this strategy might be related to individual differences.

As research indicates that there are sex-related differences in self-regulated learning (Marrs & Sigler, 2012), it appears worthwhile to examine sex differences with respect to use of retrieval practice. Studies on the use of retrieval practice have often neglected this question. However, a study by Gagnon and Cormier (2019) examined Canadian college students’ use of self-testing and distributed practice. Results indicated that females (63%) reported that they used self-testing as a study strategy to a greater extent than males (57%).

Self-regulated learning, achievement, and individual differences

The framework of SRL involves cognitive, motivational and emotional aspects of learning (Panadero, 2017). Recently, it has been highlighted that individual differences need to be incorporated into SRL theory (Azevedo, 2020) and researchers argue that such an integration, including both cognitive and non-cognitive factors, might provide a better understanding of students’ academic achievement (Wolters & Hussain, 2015). Cognitive factors, such as fluid intelligence, crystallized intelligence, working memory and previous academic performance, as well as non-cognitive factors, such as personality, procrastination and emotional intelligence, have been found to be associated with SRL (see, e.g., Pérez-González et al., 2022; Richardson et al., 2012). Further, it has been suggested that SRL mediates the relationship between personality characteristics and academic achievement (Pintrich, 2000). Thus, it would be of interest to investigate possible relationships between non-cognitive factors such as personality and self-regulated use of retrieval practice.

The five-factor model (FFM) of personality has been suggested as a comprehensive indicator of non-cognitive factors (Borghans et al., 2008). The FFM consists of five broad personality factors: openness to experience, conscientiousness, extraversion, agreeableness and neuroticism (Costa & McCrae, 1992; Goldberg, 1993). Research on the relationship between FFM and academic achievement has shown conscientiousness to be the most consistent predictor of performance (Hakimi et al., 2011; Poropat, 2009; Sorić et al., 2017). Conscientiousness is defined as willingness to comply with conventional rules, and comprises facets of striving for achievement and self-discipline (Borghans et al., 2008). As such, the construct of conscientiousness is closely linked to persistence and motivation, and thus related to concepts such as grit and possibly also need for cognition (NFC). Grit is defined as “an individual’s perseverance and passion for long-term goals” (Duckworth et al., 2007, p. 1087), and might explain differences in individuals’ persistence or determination to succeed in learning situations. NFC is defined as the need to engage in and enjoy thinking and cognitively demanding tasks (Cacioppo & Petty, 1982), and might explain differences in individuals’ motivation when engaging in learning activities. The constructs of conscientiousness, grit and NFC have been found to be positively associated with academic performance (see, for example, Colling et al., 2022; Poropat, 2009; Wolters & Hussain, 2015). On a theoretical level, it also seems that NFC is highly related to the openness factor in the FFM, which has been defined as “the degree to which a person needs intellectual stimulation, change, and variety” (Borghans et al., 2008, p. 983). 
As with conscientiousness, openness has also been found to be positively associated with learning and academic achievement, while the other factors of the FFM seem to have weaker associations with performance (Bidjerano & Dai, 2007; Poropat, 2009).

Individual differences and retrieval practice

As regards retrieval practice, rather few studies have examined individual differences, and those studies conducted have mainly focused on relationships to the effect of retrieval practice (i.e., the so-called testing effect) rather than the use of this learning strategy. Studies have investigated the relationship between retrieval practice and cognitive functioning known to be associated with academic achievement. However, the findings are mixed. Regarding working memory capacity (WMC), some studies show no association between WMC and effect of retrieval practice (Bertilsson et al., 2017, 2021; Brewer & Unsworth, 2012; Wiklund-Hörnqvist et al., 2014), whereas others indicate that individuals with lower levels of WMC benefit more from retrieval practice (Agarwal et al., 2017). Regarding studies on episodic memory and fluid intelligence, Jonsson et al. (2021) found no association between general cognitive ability and the testing effect, whereas Brewer and Unsworth (2012) found that retrieval practice is more beneficial for individuals with poorer episodic memory and lower fluid intelligence. Studies on the testing effect have also investigated the relationships with non-cognitive factors such as NFC and grit. Previous research has not found any evidence that benefit from retrieval practice is influenced by individual differences in grit or NFC (Bertilsson et al., 2017, 2021; Stenlund et al., 2017; Wiklund-Hörnqvist et al., 2022). One possible explanation for these results could be that because the constructs of grit and NFC both include aspects of motivation, they may not be as relevant for explaining differences in performance when the researchers or teachers, and not the participants themselves, initiate the use of retrieval practice, as is most often the case in studies investigating retrieval practice. However, it might be the case that both cognitive and non-cognitive aspects are related to self-regulated use of retrieval practice.
To our knowledge, there are only two studies (Fellman et al., 2020a, b) where individual differences have been investigated in relation to how students use retrieval practice. In these studies, medical students’ optional use of online quizzes was examined in relation to individual differences, specifically in relation to reasoning or fluid intelligence (as measured by Raven’s Advanced Progressive Matrices), NFC and grit (Fellman et al., 2020b) and working memory (Fellman et al., 2020a). The results showed that quiz use was related to reasoning and verbal working memory, but not to any of the non-cognitive variables. However, these two studies have limitations in the respect that homogenous samples of high-performing students were used, and one lacked demographical data, such as sex and age (Fellman et al., 2020a), restricting the possibility to generalize the results. In order to extend our knowledge on self-regulated use of retrieval practice, broader samples from other school settings need to be studied. In line with this, the present study examines optional use of retrieval practice in an upper-secondary school setting using educationally relevant material and taking cognitive and non-cognitive individual differences into account.

Purpose of the study

The overall aim of the study was to investigate self-regulated use of retrieval practice when integrated in courses in mathematics and Swedish, and to examine whether individual differences are related to this use. More specifically, the following research questions were addressed:

  1. Does the use of retrieval practice differ between optional (i.e., outside the classroom) and non-optional (i.e., inside the classroom) retrieval practice conditions, and are there sex-related differences?

  2. Is the optional use of retrieval practice related to cognitive and non-cognitive aspects?

Method

Participants

One hundred forty-six upper-secondary school students (mean age = 16.23 years, SD = 0.49, 27% female) in science and technical programs in the northern part of Sweden were included in the study. The students were enrolled in classes in mathematics (96 students), Swedish (26 students), or both mathematics and Swedish (24 students), resulting in a total of 120 mathematics students (31 female) and 50 Swedish students (16 female). The proportion of males and females in the study was relatively representative of the proportions of males/females in science and technical programs in upper-secondary schools on a national level (27% females in science programs, 19% females in technical programs; Statistics Sweden, 2023). The classes were led by teachers who were selected by the school to take part in the research project. Data from the mathematics classes were collected in two cohorts during the school years of 2018–2019 and 2019–2020. Data from four students were excluded from the study because they did not take part in the intervention or due to missing data in the measures of cognitive and non-cognitive aspects.

Materials and measures

Retrieval practice material

The material used for the retrieval practice intervention consisted of eight quizzes containing a total of 20 items in mathematics and Swedish, respectively. In mathematics, the content of each quiz corresponded to the content of a chapter in the course book (e.g., algebra & equations, geometry, etc.), with a focus on mathematical terms. In Swedish, the content of each quiz corresponded to different subtopics of the course (e.g., grammar, literature, etc.). The short-answer questions consisted of a definition of a mathematical term or concept related to Swedish and the name of the term or concept was typed in as an answer. The quizzes were made available to the students on their online school platform. Pre- and posttests were conducted before and after each chapter/topic, but results from these were not included in the current study.

Non-cognitive measures

Short Grit Scale (GRIT-S)

Grit was measured using a Swedish version of the Short Grit Scale (GRIT-S; Duckworth & Quinn, 2009). GRIT-S is a self-report instrument consisting of eight items, and is an adaption of the original Grit Scale (Duckworth et al., 2007). Half of the items assess consistency of interest (e.g., “I often set a goal but later choose to pursue a different one”) and the other half assess perseverance of effort (e.g., “I have achieved a goal that took years of work”). Four of the items are phrased negatively and are reversely scored. The items are responded to on a five-point scale ranging from “not like me at all” (1) to “very much like me” (5). The total score is generated by adding up the points awarded to each item and then dividing by the number of items. Higher scores indicate higher levels of grit. To ensure a high-quality and accurate translation of the questionnaire, back-translation by a professional translator was utilized. GRIT-S has demonstrated acceptable validity and reliability, with an internal consistency ranging between α = 0.73 and α = 0.84 (Duckworth & Quinn, 2009). In the current study α = 0.61.
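The scoring rule described above (reverse-score the negatively phrased items, then average across items) can be sketched as follows. This is a minimal illustration only: the positions of the reverse-scored items in `REVERSED_ITEMS` and the example responses are hypothetical and are not taken from the questionnaire or the study data.

```python
from statistics import mean

# Hypothetical positions (0-based) of the four reverse-scored items.
# Which items are reversed is an assumption made for illustration.
REVERSED_ITEMS = {0, 2, 4, 5}
SCALE_MIN, SCALE_MAX = 1, 5  # five-point response scale


def score_scale(responses, reversed_items=REVERSED_ITEMS):
    """Return the mean item score after reverse-scoring the flagged items."""
    if any(not SCALE_MIN <= r <= SCALE_MAX for r in responses):
        raise ValueError("responses must lie on the 1-5 scale")
    adjusted = [
        (SCALE_MIN + SCALE_MAX - r) if i in reversed_items else r
        for i, r in enumerate(responses)
    ]
    return mean(adjusted)


# Made-up responses from one respondent to the eight items.
example_responses = [2, 4, 1, 5, 2, 1, 4, 5]
grit_score = score_scale(example_responses)
```

On a 1 to 5 scale, reverse scoring maps a response r to 6 − r, so a higher total always indicates more of the construct regardless of how the item was phrased.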

The Mental Effort Tolerance Questionnaire (METQ)

NFC was measured using a Swedish adaption of the original NFC scale (Cacioppo & Petty, 1982), the Mental Effort Tolerance Questionnaire (METQ; Dornic et al., 1991; Stenlund & Jonsson, 2017). METQ is a self-report scale consisting of 30 items, which represent both positive (e.g., “I really enjoy a task that involves coming up with new solutions”) and negative (e.g., “I only think as hard as I have to”) attitudes toward engaging in and enjoying thinking. Responses are given on a five-point scale, ranging from “strongly disagree” (1) to “strongly agree” (5). The items that capture negative attitudes are reversely scored. Responses are summed to a total score and high scores on the METQ indicate a high need for cognition. METQ has demonstrated good psychometric properties (Dornic et al., 1991; Stenlund & Jonsson, 2017), including an internal consistency of 0.80 in the present study.

Mini-IPIP

The Mini International Personality Item Pool (Mini-IPIP; Donnellan et al., 2006) is a brief measure of the five-factor model of personality, which includes the dimensions of: neuroticism, extraversion, openness to experience, conscientiousness, and agreeableness. The instrument assesses each of these five personality factors using only four items per factor. In the present study, a Swedish translation of the Mini-IPIP was used and only the subscales of conscientiousness and openness to experience were included. The subscale of conscientiousness comprised four items about the tendency to be organized and self-disciplined, written as short statements: “Get chores done right away,” “Often forget to put things back in their proper place” (scored reversely), “Like order,” and “Make a mess of things” (reversely scored). The subscale of openness to experience comprised four items about imagination and the tendency to enjoy abstract thinking: “Have a vivid imagination,” “Have difficulty understanding abstract ideas” (scored reversely), “Am not interested in abstract ideas” (scored reversely), and “Do not have a good imagination” (scored reversely). Responses are given on a five-point scale ranging from very inaccurate (1) to very accurate (5). The Mini-IPIP is well-validated in general population samples (Donnellan et al., 2006), and has demonstrated acceptable reliability in college samples (Baldasaro et al., 2013) as well as in the current study (α = 0.58 and α = 0.75 respectively for the included scales).
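The internal consistency values (α) reported for the scales above are presumably Cronbach's alpha; the text does not state how they were computed. As a rough sketch, the standard formula can be written in a few lines. The function name and the toy data are illustrative assumptions, not the study's analysis code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-respondent item-score lists.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores[0])                   # number of items
    columns = list(zip(*item_scores))         # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Toy data: five respondents, four items (already reverse-scored).
toy_scores = [
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(toy_scores)
```

With only four items per subscale, as in the Mini-IPIP, alpha tends to be lower than for longer scales, because the coefficient depends on the number of items as well as on how strongly they intercorrelate.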

Cognitive measures

Raven’s Advanced Progressive Matrices (RAPM)

RAPM were used to capture participants’ general cognitive ability (Raven, 1990). RAPM is a non-verbal test, where items consist of a 3 × 3 matrix of geometric patterns with the bottom-right area missing. The participants are asked to complete the pattern by selecting one option among eight alternatives. The difficulty of the items advances progressively during the test. The original test includes 48 items, of which the first 12 are often used as practice items. However, in this study, a short version including 18 items and six practice items was used. The participants had 25 min to complete the tasks. The total number of correctly scored items was used as a dependent variable, where higher scores indicated higher general cognitive ability. Previous studies have indicated that RAPM has good construct validity (Schweizer et al., 2007) and demonstrates good psychometric properties in college samples (Arthur et al., 1999). Internal consistency in the current study was 0.77.

The Operation Span Task (Ospan)

To measure working memory capacity (WMC), a standardized complex working memory task, an automated version of the Operation Span Task (Ospan; Unsworth et al., 2005), was used. The Ospan is computer-administered and comprises two tasks: a letter span and a concurrent math task. The participant is asked to solve simple arithmetic tasks (processing demand), while simultaneously maintaining the presented letters in memory (storage demand). These tasks are combined in sets that range from three to seven blocks, and each set size is performed for three trials, resulting in a total of 75 math problems and 75 letters. After each trial, the participant is shown a matrix of 12 letters, and is asked to recall the 3–7 letters in the order they were shown. Participants must have at least 85% correct on the math tasks, in order to ensure that they do not ignore these tasks in favor of rehearsing the letters. Ospan has demonstrated good test–retest reliability, r = 0.83, acceptable internal consistency, α = 0.78, and good construct validity (Unsworth et al., 2005).
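The item totals and the accuracy criterion described above follow from simple arithmetic, sketched here. The function name `passes_math_criterion` is a hypothetical helper, and counting accuracy as correct answers divided by attempted problems is an assumption about how the 85% rule is applied.

```python
SET_SIZES = range(3, 8)        # set sizes 3, 4, 5, 6 and 7
TRIALS_PER_SET_SIZE = 3

# Each trial of set size s presents s math problems and s letters.
total_items = sum(size * TRIALS_PER_SET_SIZE for size in SET_SIZES)


def passes_math_criterion(correct, attempted, threshold=0.85):
    """Check the 85% accuracy requirement on the math problems."""
    return attempted > 0 and correct / attempted >= threshold
```

Summing the set sizes (3 + 4 + 5 + 6 + 7 = 25) and multiplying by three trials gives the 75 math problems and 75 letters stated in the text.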

Procedure

The design of the retrieval practice implementation was conducted in collaboration with four teachers who were employed at the upper-secondary school. The teachers contributed with valuable insight regarding the challenges associated with teaching in upper-secondary school and gave advice on how to design an intervention that could be a realistic part of everyday education for their students. The practical implementation was conducted by the teachers. At the start of each cohort, the students were introduced to the project and given an inspiring lecture about retrieval practice and how it should be used to improve retention. This was done to ensure that the students had knowledge about the benefits of retrieval practice, as previous studies have shown that students often tend to choose other, less-effective learning strategies.

At the start of each section of the courses, the quiz was made available to the students on their learning platform. The quiz was available for the duration of the corresponding section of the course, with an average time open of 11 days for the quizzes in the Swedish course, and 22 and 24 days, respectively, for the two cohorts in the mathematics course. For all sections, the quiz was available for the students to use of their own volition. In addition, for half of the sections, the quiz was also used once a week in the classroom (i.e., 3–4 times per chapter in mathematics and two times per section in Swedish). Sections/chapters for which quizzing was completely optional will be referred to as “optional” and sections/chapters for which quizzing was conducted in the classroom in addition to being available for voluntary use will be referred to as “non-optional”. It is important to note that while quizzing was an in-class activity during non-optional sections, students were sometimes absent from class for various reasons, which means that some students may not have completed any or only very few quizzes even in non-optional sections. Non-optional and optional quizzes were alternated between the sections, resulting in an ABAB design (A = non-optional/in the classroom and B = optional/outside the classroom). For practical reasons, the second cohort in mathematics was designed in the reverse order and started with an optional section (i.e., a BABA design). The purpose of the ABAB design was to investigate whether there were differences in quiz use between the two conditions.

The measures of non-cognitive aspects and cognitive abilities were collected in a group setting at the school, during two sessions of about 90 min each. The cognitive tasks were conducted on the students’ school-provided computers via an online platform. Non-cognitive aspects were measured through self-report questionnaires, using paper and pen. The students received two movie tickets for their participation in the study. The study was approved by the Regional Ethical Review Board, Sweden (2017/517–31), and written informed consent was obtained in accordance with the Declaration of Helsinki.

Statistical analyses

First, the data was subjected to descriptive analysis. In this analysis, two outliers (defined as z-score > 3.29 or < -3.29) were identified in the measures of NFC and RAPM, respectively. Due to the sensitivity of the analysis used, these values were excluded. Secondly, paired-samples t-tests were used to examine differences in average quiz use between optional and non-optional conditions, and independent-samples t-tests examined differences between males and females as well as between cohorts in mathematics. Cohen’s d was used as a measure of effect size, for which values of 0.2 are considered a small, 0.5 a medium, and 0.8 a large effect (Cohen, 1992). Bivariate correlations between optional quiz use and cognitive and non-cognitive variables were calculated in order to investigate the strengths and directions of the relationships, as well as to detect potential problems with multicollinearity as indicated by elevated VIF values (> 10) (Tabachnick & Fidell, 2019). No multicollinearity was found. In order to determine which of the included independent variables were most predictive of optional quizzing, as well as the potential influence of sex, the correlation analyses were followed up with hierarchical linear regression analyses. The dependent variable for the analyses was the total number of quizzes completed during the optional sections. Independent variables were the measures of fluid intelligence, WMC, NFC, grit, conscientiousness, and openness. Sex was added as a predictor in the first step, fluid intelligence and WMC were added in the second step to control for cognitive abilities, and the non-cognitive variables NFC, grit, conscientiousness, and openness were added in the third step.
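To make the first two analysis steps concrete, the sketch below flags outliers with the |z| > 3.29 rule and computes a paired-samples t statistic together with an effect size. It is an illustration under stated assumptions: the text does not say which variant of Cohen's d was used for the paired comparison, so the version here (mean difference divided by the standard deviation of the differences) is an assumption, and all data are made up.

```python
from math import sqrt
from statistics import mean, stdev

Z_CUTOFF = 3.29  # outlier criterion stated in the text


def flag_outliers(values, cutoff=Z_CUTOFF):
    """Return one boolean per value: True if |z| exceeds the cutoff."""
    m, s = mean(values), stdev(values)
    return [abs((v - m) / s) > cutoff for v in values]


def paired_t_and_d(condition_a, condition_b):
    """Paired-samples t statistic and Cohen's d based on difference scores.

    Assumption: d = mean(diff) / sd(diff); other variants of d exist.
    """
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    d = mean(diffs) / stdev(diffs)
    t = d * sqrt(len(diffs))      # same as mean(diff) / (sd(diff) / sqrt(n))
    return t, d


# Made-up quiz counts for four students in the two conditions.
non_optional = [5, 6, 7, 8]
optional = [1, 2, 2, 4]
t_value, cohens_d = paired_t_and_d(non_optional, optional)
```

A paired test is used here because each student contributes a quiz count in both conditions, so the comparison is within participants rather than between independent groups.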

Results

Use of retrieval practice

First, a series of analyses were performed to investigate how the optional and non-optional quizzes had been used. Paired-samples t-tests showed that there was a clear difference in frequency of quiz use between optional and non-optional quizzing in both school subjects (see Figs. 1 and 2). The average number of quizzes completed was significantly higher in non-optional sections than in completely optional sections, in both mathematics, t = 20.02, p < 0.001, d = 1.8, and Swedish, t = 7.11, p < 0.001, d = 1.0 (see Table 1 for descriptive statistics). The variation in number of quizzes completed by each participant is also quite different between the two conditions.

Fig. 1 Average number of completed quizzes for each chapter in mathematics

Fig. 2 Average number of completed quizzes for each section in Swedish

Table 1 Descriptive statistics for optional and non-optional quiz use in mathematics and Swedish, as well as means for male and female students separately

As Table 1 illustrates, the quiz use is lower for the optional sections than for the non-optional sections. Moreover, the optional sections had a much larger share of students who had completed no or only very few quizzes. There was also a significant difference in number of completed quizzes between male and female students (see Table 1), with females completing more quizzes in Swedish in both the optional (t = -3.30, p < 0.01, d = 0.9) and non-optional sections (t = -3.87, p < 0.001, d = 1.2). In mathematics, there was a significant difference in completed quizzes between males and females for the non-optional quizzes (t = -1.89, p = 0.03, d = 0.4), but not for the optional quizzes. There was no significant difference between the two cohorts in mathematics in terms of completed quizzes (t = 1.40, p = 0.08, d = 0.3).

Individual differences associated with the optional use of retrieval practice

In order to investigate whether students’ individual differences in cognitive and non-cognitive factors were related to their self-regulated use of quizzes, correlational analyses and hierarchical regression analyses were performed. As the purpose was to examine self-regulated quiz use, the dependent variable used in these analyses was the total number of completed optional quizzes. The correlations suggested that grit and conscientiousness have weak to moderate positive associations with optional quiz use in mathematics and Swedish (see Table 2). In Swedish, there was also a positive relationship between NFC and optional quiz use.

Table 2 Pearson correlations between the dependent variables optional quiz use in mathematics and Swedish, and the independent variables fluid intelligence, WMC, NFC, grit, conscientiousness, and openness

In the next step, hierarchical linear regression analyses were performed to examine the predictive ability of the independent variables. The results showed that, controlling for sex in the first step, conscientiousness was the only statistically significant predictor for optional quiz use in mathematics, explaining 11 percent of the variance (see Table 3). For optional quiz use in Swedish, a somewhat different pattern emerged. Sex was a significant positive predictor in all three steps, and in the third step, NFC and conscientiousness were statistically significant predictors, with positive associations with quiz use, whereas openness was significantly and negatively related to quiz use (see Table 3). NFC and sex had the most predictive power, followed by conscientiousness and openness. Together, the significant non-cognitive predictors explained 28 percent of the variance in optional quiz use in Swedish.

Table 3 Hierarchical regression analyses using sex, fluid intelligence, WMC, NFC, grit, conscientiousness, and openness as predictors of optional quiz use in mathematics and Swedish
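The logic of a hierarchical regression, in which blocks of predictors are entered cumulatively and the change in explained variance is inspected at each step, can be sketched in plain Python. This is a simplified illustration and not the software or procedure used in the study: the function names are hypothetical, the toy data are invented, and significance testing of the R² change is omitted.

```python
def ols_r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def hierarchical_r2(blocks, y):
    """Enter predictor blocks cumulatively; return (R^2, delta R^2) per step."""
    results, previous = [], 0.0
    cumulative = [[] for _ in y]
    for block in blocks:
        cumulative = [c + list(b) for c, b in zip(cumulative, block)]
        r2 = ols_r_squared(cumulative, y)
        results.append((r2, r2 - previous))
        previous = r2
    return results


# Toy data: step 1 enters a binary predictor, step 2 adds a second predictor.
step1 = [[0], [1], [0], [1], [0], [1]]
step2 = [[1], [2], [3], [4], [5], [6]]
outcome = [2, 4, 6, 8, 10, 12]
steps = hierarchical_r2([step1, step2], outcome)
```

Each tuple in `steps` holds the cumulative R² after a step and the increment over the previous step, which corresponds to asking how much variance a later block explains beyond the variables already entered.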

Discussion

Although a large amount of research has shown convincing evidence that retrieval practice is a highly effective learning strategy, it seems to be an underutilized study strategy. The present study focused on the group of students that actually use retrieval practice on their own, and whether they differ from other students, which has thus far been a neglected research area. More precisely, the aim was to investigate the self-regulated use of retrieval practice (i.e., repeated online quizzing) in an intervention for upper-secondary school students, and to determine to what extent the use of optional quizzes outside the classroom is related to differences between males and females, as well as to individual differences in cognitive and non-cognitive factors related to academic success.

First, we compared optional (outside the classroom) with non-optional (in the classroom) use of retrieval practice in two school subjects, mathematics and Swedish, and examined whether sex-related differences could be found. As expected, and in line with previous research (Corral et al., 2020; Trumbo et al., 2016), the number of quizzes completed differed between the optional condition and the non-optional condition, with significantly more quizzes being completed during non-optional sections for both mathematics and Swedish. What makes this finding especially interesting is the fact that all students had been informed about the benefits of retrieval practice at the start of the intervention. Knowing that retrieval practice can be an unintuitive learning technique, starting off the intervention with an inspirational lecture about the method was an attempt to inspire the students to stick with the technique until they could see its effectiveness for themselves. While it is impossible to know what the use of quizzes would have been without this introduction, the generally low usage suggests that informing the participants of the observed benefits of retrieval practice did not have a strong inspirational effect (but see also Ariel & Karpicke, 2017). Thus, it seems important that teachers encourage students to use retrieval practice activities inside the classroom, and maybe more importantly, support students’ development of the skill to self-regulate their learning so that they choose effective learning strategies (Zimmerman, 2002, 2013).

From a self-regulated learning (SRL) perspective, it is probable that there are sex differences with respect to the use of retrieval practice (see, for example, Panadero et al., 2017). While this has been a neglected research topic, one previous study does suggest that females use self-testing as a study strategy to a greater extent than males (Gagnon & Cormier, 2019). In line with this, the present study found that females completed more quizzes than males overall. With respect to optional quizzing, there was a significant difference between the sexes in Swedish, but not in mathematics. One explanation for this finding might be that more quizzes were completed in mathematics overall (in both the optional and non-optional conditions) in comparison with Swedish (see Figs. 1 and 2). It is possible that the use of quizzes is a more natural and traditional part of teaching, and way of learning, in the mathematics classroom than in the Swedish classroom. For example, in a study examining Canadian teachers’ assessment practices in mathematics (N = 1096), quizzes were used to a large extent, both as formative assessments in order to get a sense of students’ understanding (89%), and as summative assessments (79%; Suurtamm et al., 2010).

Second, besides examining non-optional and optional use of retrieval practice and sex-related differences in this use, we also examined whether there are individual differences with respect to cognitive and non-cognitive factors related to the optional use of retrieval practice. Previous studies have indicated that cognitive abilities such as fluid intelligence and working memory capacity (Brewer & Unsworth, 2012; Minear et al., 2018), and non-cognitive factors such as grit, conscientiousness and openness are related to self-regulated learning and academic achievement (Pérez-González et al., 2022; Richardson et al., 2012). To our knowledge, however, there are only two studies (Fellman et al., 2020a, b) that have investigated individual differences in relation to optional use of retrieval practice. The findings from these studies indicated that higher cognitive ability and verbal working memory were related to greater optional use of retrieval practice, whereas no differences were observed in the non-cognitive variables grit and NFC. In contrast, the findings from the present study showed that NFC (Swedish), as well as conscientiousness (mathematics, Swedish), were positively related to optional quiz use, which is in line with earlier research in the SRL area (Pérez-González et al., 2022; Richardson et al., 2012), although no relationships to cognitive abilities were found. From a theoretical perspective, it has been suggested that both cognitive ability and personality characteristics, such as conscientiousness, might serve important functions in self-regulated learning with respect to employing study strategies (Pérez-González et al., 2022). The different student populations (undergraduate vs. upper-secondary school) and the different subject areas examined in the present study and by Fellman et al. (2020a, b) might explain why the results differ between the studies.
In addition, as regards the results related to intelligence, Fellman et al. (2020b) found that a one-unit increase in RAPM score was associated with an odds ratio of 1.18 for belonging to the high-retrieval practice group, but additional analyses on high-retrieval practice users revealed that intelligence did not predict quiz use per session. Thus, the practical relevance of the observed association should be considered.
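To give a sense of what such an association means in practice, the following minimal sketch converts an odds ratio into implied group-membership probabilities. It reads the reported value as an odds ratio of 1.18 per one-unit increase in RAPM score, which is an interpretation of the text above. The baseline probability of .50 and the function name are hypothetical and chosen purely for illustration; they are not taken from Fellman et al. (2020b).

```python
def probability_after_increase(baseline_prob: float,
                               odds_ratio: float,
                               units: float) -> float:
    """Probability implied after `units` one-unit increases in a predictor,
    given a baseline probability and a per-unit odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    # Convert the baseline probability to odds.
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    # In a logistic model, the odds are multiplied by the odds ratio
    # for every one-unit increase in the predictor.
    new_odds = baseline_odds * odds_ratio ** units
    # Convert the resulting odds back to a probability.
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    BASELINE = 0.50       # hypothetical baseline probability (assumption)
    OR_PER_POINT = 1.18   # odds ratio per RAPM point, as read from the text
    for points in (0, 1, 5):
        p = probability_after_increase(BASELINE, OR_PER_POINT, points)
        print(f"+{points} RAPM points: probability = {p:.3f}")
```

Under these assumed values, a one-point increase moves the implied probability from .50 to roughly .54. The shift depends on the chosen baseline, so the sketch illustrates scale only and does not re-analyze the original data.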

Another interesting finding from the present study was that in mathematics, conscientiousness was the only statistically significant predictor of optional quiz use, whereas in Swedish, several statistically significant predictors emerged. Sex, conscientiousness, and NFC were positively related to optional quiz use, while openness to experience was negatively related to quiz use. Thus, in mathematics, students who were more conscientious completed more quizzes, while quiz use in Swedish was related to being conscientious and having a higher NFC. The negative relationship to openness can be understood in the sense that those who score low on openness, that is, those who are conventional and traditional in their behavior and prefer familiar routines to a greater extent (Aitken Harris, 2004), use optional quizzing more frequently as it can become a form of routine. The differences between the two school subjects in the non-cognitive factors related to the optional use of retrieval practice might be explained by differences in how these subjects are traditionally taught and in familiarity with quizzes, as mentioned above.

Overall, the finding that self-regulated use of retrieval practice is related to non-cognitive factors that have been found to be strong predictors of academic success implies that students who are already more likely to succeed in their academic endeavors are those using retrieval practice outside the classroom. Importantly, this was found despite giving all students information about retrieval practice that could have inspired them to use the strategy in a different way than their usual manner of engaging in educational tasks. Therefore, the results emphasize the important role of the educator in ensuring that students actively engage with effective learning strategies by including retrieval practice in classroom activities.

Limitations

A strength of the present study is the inclusion of both cognitive and non-cognitive factors to examine individual differences in self-regulated use of retrieval practice, which contributes to the research area. Nevertheless, the study has some limitations that need to be mentioned. Considering that the implementation of retrieval practice was conducted with a focus on ecological validity, rather than experimental rigor, some limitations are to be expected. Unforeseen circumstances necessitated a slight adjustment to the original design, resulting in the change from an ABAB design in cohort 1 to a BABA design in cohort 2. However, since the purpose of the study was to investigate voluntary quiz use, this change should not impact the results in any significant way. This is supported by a t-test indicating that the two cohorts did not differ in terms of completed quizzes. In addition, a quasi-experimental design inherently carries some general limitations. In this case, the inclusion of two subjects and four separate teachers may have introduced differences between the groups or affected how the quizzing was practically implemented in the different classrooms. The effects of teacher and group characteristics on students’ study behaviors are important to consider and should be further explored in future research. Another limitation is the large difference in sample size between the mathematics and Swedish courses, which makes it difficult to make comparisons between the subjects. Finally, although the proportion of females in the sample was comparable to the proportion of female students in science and technical programs in general, the relatively low number of female students implies that the sex-related differences found should be interpreted with caution. Hence, it is important to examine individual differences in self-regulated use of effective learning strategies, such as retrieval practice, in larger studies using more representative samples.

Conclusions and further studies

Not surprisingly, the results of the present study show that use of retrieval practice outside the classroom is low when compared to the use of retrieval practice inside the classroom. The results also suggest that sex-related and individual differences in non-cognitive factors can predict self-regulated use of retrieval practice. Female students seem to use retrieval practice as an optional strategy to a greater extent than males. Moreover, students who are more persistent (more conscientious) and who have a higher motivation to engage in learning activities (NFC) as well as a preference for routines (low on openness) use retrieval practice as an optional study strategy to a greater extent. To make sure that all students, regardless of their sex and non-cognitive characteristics, benefit from effective learning strategies such as practice testing, the strategy needs to be implemented in the classroom under non-optional conditions. At the same time, we also need to learn more about students’ motivations and what might inspire them to use practice testing as a study strategy.