Self-assessment of learning is linked to greater self-regulation (Andrade, 2018; Yan, 2019) and achievement (Brown & Harris, 2013). Furthermore, the ability to evaluate one’s own work and processes is an important objective of higher education (Tai et al., 2017). However, our understanding of how students integrate feedback within their self-assessment processes is limited (Panadero et al., 2016), even though there is considerable knowledge on how feedback concerning task, process, and self-regulatory processes improves educational outcomes (Butler & Winne, 1995; Hattie & Timperley, 2007). In one of the few studies exploring self-assessment and external feedback, Yan and Brown (2017) interviewed teacher education students and found that students claim to seek external feedback to form a self-assessment. It is therefore important to understand how to support the development of realistic and sophisticated self-assessment. One successful formative assessment practice has been the introduction of rubrics or scoring guides into classroom practice (Brookhart & Chen, 2015). Hence, it was expected that students would describe more complex self-assessment processes when provided feedback based on a rubric.

In a randomized experiment with university students, this study systematically extends our understanding of the role feedback plays in self-assessment by manipulating the type of feedback, its timing, and the expertise level of tertiary students. It also probes the self-assessment “black box” by examining the strategies and criteria students used. Hence, this study provides new insights into how we can support robust self-assessment.

Self-assessment

Self-assessment “involves a wide variety of mechanisms and techniques through which students describe (i.e., assess) and possibly assign merit or worth to (i.e., evaluate) the qualities of their own learning processes and products” (Panadero et al., 2016, p. 804). This definition indicates that self-assessment can take different shapes, from self-grading (e.g., Falchikov & Boud, 1989) to formative approaches (e.g., Andrade, 2018). However, what exactly happens when students self-assess remains largely a mystery.

Yan and Brown (2017) interviewed 17 undergraduate students from a teacher education institute using six general learning scenarios (e.g., How good are you at learning a new physical skill?) and five questions specific to self-assessment (e.g., What criteria did you use to conduct self-assessment?). From those data, the authors built a schematic cyclical self-assessment process consisting of three subprocesses: (1) determining performance criteria, (2) self-directed feedback seeking, and (3) self-reflection. Despite being an early effort to unpack the black box, the results are limited by the small sample and the highly descriptive, interpretive analysis of the interview data.

More recently, Panadero et al. (2020) analyzed the behavior of 64 secondary education students when self-assessing Spanish and mathematics tasks. Multi-method data (i.e., think-aloud protocols, direct observation, and self-report questionnaires) were used to characterize self-assessment actions as either strategies or criteria. The study showed that (1) the use of self-assessment strategies and criteria was more frequent and advanced without feedback and among girls, (2) there were different self-assessment patterns by school subject, (3) patterns of strategy and criteria use differed by school year, and (4) none of the self-assessment strategies or criteria had a statistically significant effect on self-efficacy.

Factors influencing self-assessment

Feedback in general has been shown to improve academic performance, especially when focused on specific tasks, processes, and self-regulation (Hattie & Timperley, 2007; Wisniewski et al., 2020). Butler and Winne’s (1995) feedback review showed that self-regulated learners adjust their internal feedback mechanisms in response to external feedback (e.g., scores, comments from teachers). Scholars have claimed that students need instructor’s feedback about their self-assessments as well as about content knowledge (Andrade, 2018; Boud, 1995; Brown & Harris, 2014; Panadero et al., 2016). Yet previous studies have shown little effect of external feedback on student self-assessment (Panadero et al., 2012, 2020; Raaijmakers et al., 2019). Thus, it is important to understand how external feedback, whether from the instructor or via instruments (e.g., rubrics), can influence students’ self-assessment.

Among feedback factors that influence student outcomes (Lipnevich et al., 2016), the timing of feedback is important. In general, delayed feedback is more likely to contribute to learning transfer, whereas prompt feedback is useful for difficult tasks (Shute, 2008). However, research linking feedback timing to self-assessment is relatively rare. Panadero et al. (2020) found that secondary education students self-assessed using fewer strategies and criteria after receiving feedback. This has crucial implications for when instructors should deliver their feedback if they want students to develop calibrated self-assessments.

One potentially powerful mechanism for providing feedback is a marking, scoring, or curricular rubric, which has been shown to have stronger effects on performance than other assessment tools, such as exemplars (Lipnevich et al., 2014). The use of rubrics in education and research has grown steadily in recent years (Dawson, 2017) due to their instructional value, with positive effects for students, teachers, and even programs (Halonen et al., 2003). Rubric use has been associated with positive effects on self-assessment interventions and academic performance (Brookhart & Chen, 2015), and a rubric alone has been shown to produce better results than combining rubrics with exemplars (Lipnevich et al., 2014). Although previous research has explored the effects of rubrics compared or combined with feedback (Panadero et al., 2012, 2020; Wollenschläger et al., 2016), we still need insights into the impact of rubrics with or without feedback on student self-assessment.

The self-assessment literature has established that more sophisticated and accurate self-assessments are conducted by older and more academically advanced students (Barnett & Hixon, 1997; Boud & Falchikov, 1989; Brown & Harris, 2013; Kostons et al., 2009, 2010). As Boud and Falchikov (1989) demonstrated, it was subject-specific competence that reduced the discrepancy between self-assessments and teacher evaluations. However, recent research shows that the relationship might not be so straightforward (Panadero et al., 2020; Yan, 2018). Additionally, it is unclear at what level of higher education students have sufficient expertise to self-assess appropriately. Thus, an investigation with students in consecutive years of study in the same domain might clarify the role of year level in self-assessment capacity.

Research aim and questions

The current study adds to this body of research by examining the number and type of self-assessment strategies and criteria among higher education students in a randomized experiment that manipulated three feedback conditions (rubric vs. instructor’s vs. combined); there was no control group because the university Ethics Committee did not grant permission for one. Importantly, we also examined feedback occasion (before vs. after) and year level (1st, 2nd, and 3rd year university undergraduates). This is a single-group, multi-method study (i.e., think aloud, observation, and self-report, though only the first two are analyzed here).

We explored three research questions (RQ):

  1. RQ1. What are the self-assessment strategies and criteria that higher education students implement before and after feedback?

    • Hypothesis 1 (H1): Self-assessment strategies and criteria will decrease when feedback is provided, in line with Panadero et al. (2020).

  2. RQ2. What are the effects of feedback type and feedback occasion on self-assessment behaviors (i.e., number and type of strategy and criteria)?

    • H2: Rubric feedback will lead to better self-assessment practices than the other feedback types, in line with Lipnevich et al. (2014).

  3. RQ3. What is the effect of student year level on the results?

    • H3: Students in higher years within a discipline will use more sophisticated strategies and criteria in their self-assessments. Previous results point in different directions, from no differences in primary education but lower self-reported self-assessment among more advanced secondary education students (Yan, 2018), to more similarities than expected, albeit with some differences, among secondary education students (Panadero et al., 2020). Nevertheless, as our participants are higher education students, we expected more advanced students to show higher self-assessment skills.

Method

Sample

A convenience sampling method at the university where the first author worked produced a sample of 126 undergraduate psychology students (88.1% female) across the first, second, and third years of study (34.9%, 31.7%, and 33.3%, respectively). Participants were randomly assigned to one of three feedback conditions: rubric only (n = 43), instructor’s written feedback (n = 43), and rubric and instructor’s written feedback combined (n = 40). Participants received credit in accordance with the faculty volunteering programme. For a 3 × 3 ANOVA, given a risk level of α = 0.005 and statistical power of 1 − β = 0.800, the current sample size would detect a medium effect size of f = 0.280 (G*Power 3.1.9.2; Faul et al., 2007).
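For readers wishing to reproduce the sensitivity analysis, the sketch below approximates it in Python with statsmodels, solving for the smallest detectable Cohen’s f given the sample size, alpha, and power reported above. It maps one factor of the 3 × 3 design onto a one-way test with three groups, which need not match G*Power’s exact model, so the value obtained may differ somewhat from f = 0.280.

```python
# Sensitivity analysis sketch: smallest detectable effect size (Cohen's f)
# for the reported sample, approximating one main effect of the 3 x 3
# design as a one-way ANOVA with three groups.
from statsmodels.stats.power import FTestAnovaPower

smallest_f = FTestAnovaPower().solve_power(
    effect_size=None,  # left as None so it is solved for
    nobs=126,          # total sample size
    alpha=0.005,       # risk level as reported
    power=0.80,        # target power (1 - beta)
    k_groups=3,        # three feedback conditions (or three year levels)
)
print(f"Smallest detectable effect size: f = {smallest_f:.3f}")
```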

Data collection and instruments

Data from the video-recorded think-aloud protocols were coded using the categories defined in a previous study (Panadero et al., 2020). In addition, two structured feedback intervention tools were used (i.e., the rubric and the instructor’s feedback).

Coded video-recorded data

Think-aloud protocols

Participants were asked to think aloud while conducting two self-assessments of their written essay. The first was an unguided self-assessment in which students were asked to evaluate the quality of their essay and give the reasons for their evaluation. Participants were asked to express their thoughts and feelings and were reminded that, if they fell silent, they would be prompted to think out loud. After the feedback was provided, students were asked to talk about their thoughts and feelings concerning the feedback and to repeat the think-aloud process of self-assessing their essay. If a participant remained silent for more than 30 s, they were verbally reminded to think out loud. There were no time restrictions on the self-assessment.

A closed coding process was followed, as the codes had already been defined in a previous study with secondary education students (see Panadero et al., 2020). In that study, a deductive approach was employed to create the two general coding categories of self-assessment elements, strategies and criteria, and codes were then created within those general categories. The categories were contrasted with the present data using an inductive approach to ensure that they were applicable to the new sample and procedure.

The video-recorded think-aloud content was coded to identify the strategies and criteria each student used. As in our previous study, we further organized each set of 13 categories into four levels (0–3) for clarity of interpretation. These levels classify the categories by type and complexity. Details of the levels, categories, definitions, and exemplar comments are provided in Table 1.

Table 1 Category description and examples

Intervention prompts

Rubric (Appendix 1)

The rubric was created for this study using experts’ models of writing composition. It contains three types of criteria: (1) writing process; (2) structure and coherence; and (3) sentences, vocabulary, and punctuation. There are three levels of quality: low, average, and high. The rubric is analytic, as the three criteria are scored independently. The rubric was provided to some of the students during the experimental procedure, depending on the experimental condition, but it was not explicitly used by the instructor to provide feedback on the essays.

Instructor’s feedback (Appendix 2)

The instructor provided feedback on each essay using the same categories as the rubric. For the “writing process” criterion, which was not directly observable by the instructor, he provided feedback by suggesting whether some of those strategies (e.g., planning) had been put into place. The feedback also included a grade ranging from 0 to 10 points. All essays were evaluated by the second author; the first author evaluated a third of the essays, reaching full agreement on the rubric categories.

Procedure

This randomized experiment is part of a larger study; this report focuses on the specific self-assessment strategies and criteria students elicited (see Fig. 1), as measured via think-aloud protocols and observations. After attending a 3-h group seminar on academic writing, participants wrote a short essay answering the question: “Why is the psychologist profession necessary?”. This topic was directly relevant to the participants’ psychology programme. There was no length limit for the essays, which were written on the participants’ computers and then submitted to the research team. The essay had no implications outside the research experiment, but we emphasized its usefulness for the students’ academic progress in the programme. Approximately 1 week later, participants came individually to the laboratory, where they completed the experiment face-to-face with one of the authors.

Fig. 1

Experimental procedure

First, participants received the instructions for self-assessing their essay, which was handed back to them in its original form, in other words, with no feedback. Students were instructed to think aloud while self-assessing, verbalizing their thoughts, emotions, and motivational reactions. They then performed the first think-aloud self-assessment of the essay they had written. Right after, participants were given feedback on their essay according to the condition to which they had been assigned (rubric vs. instructor vs. combined) and were asked to self-assess again. The rubric group was handed the rubric with the instruction to use it for their self-assessment. The instructor’s feedback group was told to use the instructor’s feedback for their self-assessment. Finally, the combined group received both instructions. After reading the feedback, each participant repeated the think-aloud self-assessment.

Data analysis

The coding of the think-aloud utterances for strategies and criteria was evaluated in three rounds of inter-judge agreement. In round one, agreement between two judges on 15 videos reached an average Krippendorff’s α = 0.78, with three categories below 0.70. After discussion and consensus building around the low-agreement categories, a second set of 15 videos was coded with an average Krippendorff’s α = 0.83. A third round, using 15 new videos, produced Krippendorff’s α = 0.87. This indicates that the final coding values are dependable. Direct observation was performed in situ during data collection and, more intensively, during the coding of the video data. The observation data were used as supplementary evidence to inform and confirm the think-aloud categories by characterizing the participants’ behavior.
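As an illustration of the agreement computation, the sketch below uses the third-party krippendorff Python package on a toy coding matrix; the judge codes shown are hypothetical placeholders, not the study’s data.

```python
# Krippendorff's alpha for two judges coding the same think-aloud segments.
# A minimal sketch with hypothetical nominal category codes; np.nan marks
# segments a judge did not code.
import numpy as np
import krippendorff  # pip install krippendorff

reliability_data = np.array([
    [1, 2, 3, 3, 2, 1, 3, np.nan, 2, 1],  # judge 1
    [1, 2, 3, 3, 2, 2, 3, 1,      2, 1],  # judge 2
])
alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")
print(f"Krippendorff's alpha = {alpha:.2f}")
```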

The categorical variables were described using multiple dichotomous frequency tables, as each participant could display more than one behavior. To study the effect of the factors (feedback occasion, condition, and year level) on the frequencies of self-assessment strategies and criteria, we conducted ANOVAs and chi-square tests to compare differences among the levels.
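To make these analytic steps concrete, the sketch below shows how such an analysis could be run in Python with pandas and SciPy. The file and column names (e.g., selfassessment_codes.csv, n_strategies, strategy_level) are hypothetical placeholders, not the study’s materials.

```python
# Hypothetical analysis pipeline: dichotomous frequencies, a one-way ANOVA
# on strategy counts by condition, and a chi-square test on level use.
import pandas as pd
from scipy import stats

df = pd.read_csv("selfassessment_codes.csv")  # one row per participant

# Multiple dichotomous frequencies: each strategy column is coded 0/1.
strategy_cols = [c for c in df.columns if c.startswith("strategy_")]
print(df[strategy_cols].sum())

# ANOVA: number of strategies by feedback condition.
groups = [g["n_strategies"].to_numpy() for _, g in df.groupby("condition")]
print(stats.f_oneway(*groups))

# Chi-square: distribution of strategy levels (0-3) across conditions.
table = pd.crosstab(df["condition"], df["strategy_level"])
chi2, p, dof, _ = stats.chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")
```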

Results

RQ1: What are the self-assessment strategies and criteria that higher education students implement before and after feedback?

Type of strategies

Table 2 shows the multiple self-assessment strategies enacted by the participants. The most used before feedback were Read the essay, Think of different responses, and Read the instructions. After the feedback, the most used were Read the feedback or rubric received and Compare essay to feedback or rubric. These strategies are low level according to our coding scheme, except for Think of different responses, which reflects a deeper level of self-assessment elaboration. Three main results can be extracted. First, the strategies used before and after feedback are similar in nature, with five categories occurring at both moments. Second, once the students received the feedback, there was a general decrease in the frequency of strategies, with three of the five strategies showing significant decreases. This is logical, as most of the strategies were basic and participants did not need to enact them again (e.g., Read the essay, which they had done just minutes before). In addition, two new strategies appeared that were not present before the feedback because they are specific to its reception (i.e., Read the feedback or rubric received and Compare essay to feedback or rubric). Third, after the feedback, there was also a new category that participants had not activated before: Compares question and response.

Table 2 Type of strategies deployed by feedback condition and time

Type of criteria

As students could choose more than one criterion, we described multiple dichotomous variables. In general, the most used criteria before the feedback were Sentences and punctuation marks, Negative intuition, Positive intuition, and Paragraph structure (Table 3). The most used after the feedback were Feedback received, Sentences and punctuation marks, Paragraph structure, and Writing process. In terms of trajectories, most of the criteria frequencies decreased significantly after receiving the feedback. However, three criteria increased after feedback (significantly for Writing process and Paragraph structure, non-significantly for Sentences and punctuation marks), all of them advanced criteria, and all increasing in the rubric and combined conditions but decreasing in the instructor’s condition. Additionally, a new criterion, Feedback received, was used, which, for obvious reasons, only occurred after feedback.

Table 3 Type of criteria deployed by feedback condition and time

RQ2: What are the effects of feedback type and feedback occasion on number and type of strategy and criteria in self-assessment behaviors?

At time 1, before receiving feedback, the number of strategies by condition (Table 4) differed statistically (F(2, 121) = 4.22, p = 0.017, η2 = 0.065), with a significant post hoc difference between the instructor condition (M = 2.78, SD = 0.183) and the combined condition (M = 2.06, SD = 0.185); the rubric condition did not differ from either (M = 2.37, SD = 0.179). In terms of the number of criteria used, the conditions were equivalent (F(2, 121) = 0.48, p = 0.62, η2 = 0.008, 1 − β = 0.127), with no differences among the three groups: instructor (M = 3.32, SD = 0.224), rubric (M = 3.51, SD = 0.219), or combined (M = 3.63, SD = 0.227). We also analyzed whether there were differences in the levels of strategies (χ2(6) = 8.38, p = 0.21) and the levels of criteria (χ2(6) = 6.32, p = 0.39), but both were equivalently distributed across conditions.
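As a check on the effect size reported above, partial eta-squared for an F-test can be recovered from the F ratio and its degrees of freedom: η2 = (F × df1) / (F × df1 + df2) = (4.22 × 2) / (4.22 × 2 + 121) = 8.44 / 129.44 ≈ 0.065.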

Table 4 Number and level of strategies deployed by condition and time

At time 2, after feedback, the number of strategies by condition (Table 4) did not differ (F(2, 121) = 0.42, p = 0.66, η2 = 0.007, 1 − β = 0.118): instructor (M = 2.56, SD = 0.976), rubric (M = 2.44, SD = 0.765), or combined (M = 2.40, SD = 0.671), showing that the rubric had no meaningful impact on the number of strategies. However, the number of criteria differed substantially (F(2, 121) = 25.30, p < 0.001, η2 = 0.295), with significant post hoc differences: the rubric (M = 4.48, SD = 0.165) and combined (M = 4.50, SD = 0.171) conditions outperformed the instructor condition (M = 3.02, SD = 0.169), both at p < 0.001. As with the number of strategies, the level of strategies was equivalently distributed across conditions (χ2(6) = 2.29, p = 0.89). However, and as expected, the level of criteria differed significantly (χ2(4) = 12.00, p = 0.02), which is likely a function of the large differences in the total number of criteria across conditions at Time 2 (i.e., 193, 134, and 180, respectively). When analyzed as percentages of responses at each level, the difference was not statistically significant (χ2(4) = 7.74, p = 0.10).
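The report does not specify which post hoc procedure was used; as one common option, the sketch below runs Tukey’s HSD on criteria counts by condition using statsmodels, with toy data and hypothetical column names.

```python
# Post hoc pairwise comparisons of criteria counts across the three
# feedback conditions, using Tukey's HSD on toy data.
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

df = pd.DataFrame({
    "condition": ["instructor"] * 4 + ["rubric"] * 4 + ["combined"] * 4,
    "n_criteria": [3, 2, 4, 3, 5, 4, 4, 5, 4, 5, 5, 4],  # toy counts
})
result = pairwise_tukeyhsd(endog=df["n_criteria"],
                           groups=df["condition"], alpha=0.05)
print(result.summary())
```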

When we explored the condition by feedback occasion interaction, we found no significant effect on self-assessment strategies (F(2, 121) = 1.74, p = 0.180, η2 = 0.028). However, we found a significant effect of condition on the pre-post change in self-assessment criteria (F(2, 115) = 7.97, p = 0.001, η2 = 0.116). The pre-post increase in the number of criteria deployed was greater (post hoc p = 0.002) in the rubric condition (M = 0.938, SE = 0.247) than in the instructor’s feedback condition (M = −0.291, SE = 0.253). The combined condition (M = 0.881, SE = 0.256) also yielded a greater increase (post hoc p = 0.004) than the instructor’s feedback condition.

RQ3: What is the effect of student year level on the results?

We calculated the differences in strategies and criteria by year level between pre- and post-feedback in two-way ANOVAs with condition and year level as factors. For the use of strategies, neither the main effects (year level, F(2, 115) = 1.04, p = 0.359, η2 = 0.018, 1 − β = 0.227; feedback type, F(2, 115) = 1.72, p = 0.183, η2 = 0.029, 1 − β = 0.355) nor the interaction (F(2, 115) = 0.973, p = 0.425, η2 = 0.033, 1 − β = 0.300) was significant, possibly due to lack of power. For the use of criteria, year level (F(2, 115) = 1.68, p = 0.192, η2 = 0.028, 1 − β = 0.347) and the interaction (F(2, 115) = 0.25, p = 0.911, η2 = 0.009, 1 − β = 0.102) were likewise non-significant, although feedback type was significant (F(2, 115) = 7.57, p < 0.001, η2 = 0.116, 1 − β = 0.940). Therefore, our hypothesis that older students would show more advanced self-assessment actions is not supported.

Discussion

This study explored the effects of three factors (i.e., feedback type, feedback occasion, and year level) on self-assessment strategies and criteria. This study contributes to our understanding of what happens in the “black box” of self-assessment by disentangling the frequency and type of self-assessment actions in response to different types of feedback.

Effects on self-assessment: strategy and criteria

For RQ1, we categorized self-assessment actions in a writing task in terms of strategies and criteria. Strategies were categorized by their depth or sophistication, ranging from very basic activities (e.g., Read the essay) to advanced ones (e.g., Think of different responses). Understandably, the most common strategies were relatively low level, as they are foundational to understanding the task. However, once feedback was received, most of the strategies focused on the content of that feedback (e.g., Compare essay to feedback or rubric), making the feedback the anchor point of comparison (Nicol, 2021). In consequence, the strategies used prior to feedback were greatly reduced in number, indicating that, with feedback, self-assessment strategies were led by that information. Self-assessment criteria demonstrated similar effects. Prior to feedback, students used a wide range of criteria, from very basic (e.g., Negative intuition) to advanced (e.g., Writing process). Upon receipt of feedback, most of the criteria responded to the feedback, becoming more sophisticated especially in the presence of rubrics.

In terms of the three feedback conditions (RQ2), the two conditions containing rubrics outperformed the instructor’s feedback group in terms of criteria and closed the initial gap in strategies. Although the instructor’s feedback condition displayed a higher number of self-assessment strategies before the intervention than the combined group, that difference vanished after feedback. Both the rubric and combined conditions had a higher number and more advanced types of criteria after feedback than the instructor’s feedback condition, by large margins. No statistically significant differences in self-assessment strategies and criteria were found across year levels (RQ3), regardless of feedback presence or type.

Regarding the alignment of our results with previous research, first, the feedback occasion effects on self-assessment strategies are very similar to those in a study with secondary education students (Panadero et al., 2020), as these strategies decreased significantly after feedback except for the ones related to the use of the feedback itself. In contrast, while the secondary education students decreased the number and type of criteria they used, the university students here increased the number of criteria and used more advanced criteria when using rubrics, an instrument that was not implemented in Panadero et al. (2020). Wollenschläger et al. (2016) compared three conditions (rubric; rubric and individual performance feedback; rubric and individual performance-improvement feedback), finding that the last was more powerful in increasing performance than the first two. An important difference is that our study examined the impact of rubric and feedback on self-assessment, whereas Wollenschläger et al. (2016) examined the effects on academic performance. Hence, the impact of feedback appears to be contingent upon the kind of assessment being implemented.

Also, the secondary education students in the Panadero et al. (2020) study showed differences across year levels, which was not found here with university students. This lack of year level effects aligns with Yan’s (2018) findings for primary education students, where no differences were found, but not with the same study’s comparison of secondary education students, where significant differences emerged (i.e., older students self-reporting lower levels of self-assessment). Unlike studies that have reported clearly delineated phases of self-assessment (Yan & Brown, 2017), the think-aloud protocols in this study did not identify clear-cut phases, revealing instead a naturally evolving process. While Panadero et al. (2012) reported that scripts were better than rubrics, this study found that the presence of rubrics led to more sophisticated criteria use; future research would need to determine whether script-based feedback would have any greater impact.

Three main conclusions can be drawn from this study. First, there are different effects depending on the form of feedback, with rubric-assisted feedback being especially promising for self-assessment. The effect of rubrics corrected the initial difference between the instructor’s feedback and the combined group so that, after receiving the feedback and/or rubric, all conditions were equal in the number of self-assessment strategies. More interestingly, the rubric conditions showed larger effects on the use of criteria even though participants had already self-assessed freely beforehand. This might indicate that rubrics are indeed very useful tools for stimulating student reflection on their work (Brookhart, 2018), more so than instructor’s feedback, which may have been perceived as external criticism rather than as support for improvement. This effect could be caused by instructor’s feedback putting students in a passive position (e.g., they are being evaluated, they are recipients of feedback), while rubrics provided them with guides to explore and reflect by themselves. This also speaks to the importance of tools, such as rubrics, for supporting active self-assessment, rather than the importance of providing corrective or evaluative feedback. This result might seem logical, as rubrics contain clear criteria and performance levels to which performance can be anchored, which may be especially pertinent to higher education students who are used to being assessed and graded against standards (e.g., Brookhart & Chen, 2015). Therefore, one viable conclusion is that the best type of feedback among those explored here is the rubric, followed by the combination of rubric and instructor’s feedback.

Second, the introduction of feedback does impact self-assessment practices: it decreased the number of strategies and increased the level of criteria used. A feature of this study is that students had to self-assess before they received feedback and then again upon receiving it. This process shows the impact of feedback in that it changed the strategies and criteria students used. Therefore, for educational benefit, feedback may best be presented after students have been required to implement their own self-assessment based on their own strategies and criteria. It may be that performance feedback delivered prior to self-assessment discourages students from the constructive strategies and criteria they exhibited in the pre-feedback stage.

And third, although self-assessment strategies did not become more advanced over years of study among our participants (i.e., our year level variable), this is unlikely to be because of a ceiling effect in the task itself, as it is possible for students to exhibit more sophisticated strategies and criteria in such a task. It may be that, once entry to higher education is achieved, self-assessment is relatively homogeneous for this type of task. Perhaps much more demanding tasks (e.g., a research thesis) would require more sophisticated self-assessment behaviors.

Limitations and future research

First, our participants conducted the first self-assessment without any structure or teaching on how to effectively evaluate one’s own work. Future research could introduce an intervention on self-assessment prior to the introduction of feedback to better eliminate confounds between self-assessment and feedback. Second, feedback focused on the essay writing task, not on the self-assessment process; feedback on the latter may have had an effect on the quality of subsequent self-assessments (e.g., Andrade, 2018; Panadero et al., 2016). Third, the absence of a no-feedback control group is a limitation, although our conditions may be more realistic comparisons than no feedback, as it is unusual to find activities without some kind of feedback in real educational settings. Additionally, internal feedback seems to be ubiquitous and automatic in any event (Butler & Winne, 1995), so even in the absence of experimenter-controlled feedback, there will be feedback. Fourth, the rubric contained an assessment criterion (i.e., writing process) that only the students could assess, as the instructor did not have access to the process. Fifth, it would be an interesting line of work to explore peer feedback and how it affects self-assessment strategies and criteria. While there has been some research in that direction (To & Panadero, 2019), it would be valuable to explore these effects using our methodology to fulfill the aim of “opening the black box of self-assessment.” Sixth, greater insights into self-assessment could likely be achieved by combining this self-reported approach with technology such as eye tracking (Jarodzka et al., 2017) or physiological measurement equipment (Azevedo et al., 2018). These additional tools may allow a more precise understanding of the underlying cognitive, emotional, and motivational processes in self-assessment and in response to feedback. And seventh, future research should seek to determine whether there are gender or content-specific effects on self-assessment and feedback (Panadero et al., 2016).

Conclusions

In general, this study shows that rubrics have the greatest potential to improve the quality of student self-assessment behaviors. The study also indicates that feedback has a mixed effect on the use of self-assessment strategies and criteria. This may explain in part why reliance on feedback from peers or markers has been shown to have a negative impact on overall academic performance (Brown et al., 2016): students who rely more on their own evaluative and self-regulatory learning strategies are more likely to discount external feedback. The provision of rubrics is likely to enable more effective and thoughtful self-assessed judgements about learning priorities. All in all, this study helps us better understand the specific strategies and criteria higher education students enact while self-assessing, which is key to truly understanding how self-assessment works.