Putting post-decision wagering to the test: a measure of self-perceived knowledge in basic sciences?
Students learn more effectively when they know what they do not know. Gaining insight into students’ metacognitive awareness is needed because misalignment between actual and self-perceived knowledge impedes their learning process. The optimal method of measuring self-perceived knowledge is still under debate. In this study, we evaluate the use of psychology-derived post-decision wagering for mapping students’ self-perceived knowledge.
Students (n = 71) performed a pre-test on medical physiology, followed by a teacher-moderated discussion and a post-test with isomorphic questions. Half of the students rated their self-perceived knowledge on each question using post-decision wagering, i.e. betting 1–5 points on the correctness of their answer, whereas the other half used a 5-point Likert scale to rate their confidence.
Self-perceived knowledge scores were higher for post-decision wagering (pre: 3.75 ± 0.14, post: 4.60 ± 0.07) compared with Likert scales (pre: 3.13 ± 0.08, post: 3.92 ± 0.08) despite similar actual knowledge scores. Furthermore, Likert ratings showed a near-normal distribution, whereas wagers were placed preferentially using the outer ends of the scale. Correlations between mean actual and self-perceived knowledge scores were low in both groups. On average, 8.5% of responses were classified as misconceptions, defined as highly confident incorrect answers.
Despite the presumed reliability of post-decision wagering, our findings suggest that we should adhere to the use of Likert scales as a balanced measure for self-perceived knowledge in medical education. Moreover, the prevalence of misconceptions did not alter after instruction, indicating a need for instructional designs that enhance students’ conceptual understanding in basic sciences.
Keywords: Medical education · Instructional methods · Quantitative research methods · Metacognition
What this paper adds
The importance of ‘knowing what you do not know’ is well-acknowledged in medical education. However, the optimal method of measuring self-perceived knowledge is still under debate. In this study, a psychology-derived tool to measure self-perceived knowledge called post-decision wagering was evaluated and considered less effective for educational settings than traditional Likert scales. We recommend including Likert scales in multiple-choice formats to establish confidence-weighted practice tests. Such tests may inform both students and educators on the student’s level of understanding, making them a more powerful tool for improving teaching and learning.
Learning basic sciences in medicine is recognized as an important, but challenging undertaking. It requires students to achieve understanding of complex learning material by analyzing, conceptualizing, and integrating knowledge. Strikingly, students are often unaware of what they do and do not know, which can be detrimental to learning [1, 2]. We cannot expect students to perform as effective learners if they are unaware of their own knowledge deficiencies.
Being aware of knowledge deficiencies is considered part of one’s metacognition. Metacognition, put simply, refers to thinking about one’s own thinking. The ability to evaluate one’s knowledge (i.e. self-perceived knowledge) after performing a task or test (i.e. actual knowledge) is a subcomponent of metacognition, which we refer to as metacognitive evaluation: knowing how well you did after finishing a task [3, 4, 5]. Metacognitive evaluation is considered a critical component of learning as it informs both students and educators about students’ potential knowledge deficiencies [6, 7]. Moreover, some researchers have shown that including metacognitive evaluation through confidence marking during multiple-choice practice tests increases the predictive value of testing [8, 9, 10].
However, studies assessing metacognitive evaluation in medical education mainly focus on clinical knowledge [11, 12, 13, 14, 15, 16, 17, 18], while the domain of basic sciences remains largely unexplored. Particularly within this latter domain, misunderstandings of physiological scientific concepts should be detected and corrected, since these concepts form a foundation for accurate clinical reasoning [19, 20]. Regarding basic science knowledge, Dawson et al. report that, according to faculty teachers, physiological topics are experienced by medical students as being among the most difficult. This finding is supported by research showing a high prevalence of misunderstandings among students in medical physiology education [22, 23, 24, 25]. According to conceptual change theory, these so-called misconceptions differ from a mere lack of knowledge; misconceptions are robust to change and are therefore difficult to alleviate by traditional teaching methods. Misconceptions are generally indicated by incorrect answers given by students with high confidence, indicating that students are unaware of their incorrect knowledge. The role of awareness in establishing conceptual understanding illustrates that conceptual change theory and metacognitive theory are strongly intertwined in practice. Recognizing and evaluating existing conceptions, and deciding whether to reconstruct or review one’s understanding, are all metacognitive processes. Conceptual change theory emphasizes the key role of awareness in accomplishing the shift from a misconception to the scientifically correct conception [26, 29, 30, 31]. Awareness can be created by explicitly assessing students’ metacognitive evaluation in the classroom, as it forces both learners and educators to think about their understanding of the subject matter.
Research on science learning in higher education illustrates how metacognitive evaluation can be measured in classroom settings using a multi-tier approach [32, 33, 34]. Multi-tier assessments consist of multiple-choice exercises with a complementary Likert scale question: ‘How confident are you that your answer to this question is correct?’. This is an example of measuring metacognitive evaluation: assessing students’ actual knowledge through test scores and their self-perceived knowledge through Likert scales. The use of Likert scales as a self-report tool for self-perceived knowledge is very common in educational research, with scale lengths ranging from 3 points upward [11, 12, 13, 14, 15, 32, 33, 34, 35]. However, there is some controversy about the objectivity of Likert scales for measuring self-perceived knowledge. As noted in a review by Koch and Preuschoff: ‘Studying consciousness is difficult because asking subjects to report their awareness of a stimulus perturbs that awareness’. In psychology research, an alternative method to measure awareness of visual stimuli more implicitly was introduced, called post-decision wagering (PDW). This method is suggested to determine awareness more objectively by indirectly assessing individuals’ self-perceived knowledge through betting points (or money) on their decisions. If an individual chooses the correct answer, points are gained, whereas a bet on an incorrect answer loses points. Additionally, researchers suggest that PDW enhances individuals’ motivation to reveal their confidence in their answer compared with self-reports on numerical confidence scales [38, 39]. In contrast, various studies show a substantial influence of loss aversion in PDW, reducing its accuracy as a measure of self-perceived knowledge; participants are less prone to indicate a high confidence level through wagering [40, 41]. Moreover, wagers have also been suggested to depend on wager size, indicating that PDW remains a subjective tool for measuring awareness.
The use of PDW as a tool to measure self-perceived knowledge has not yet been reported in medical education. In this study, our primary objective was to evaluate psychology-derived PDW as a measure of self-perceived knowledge in educational contexts. We compared PDW with Likert scales and hypothesized that confidence wagers rather than ratings would be more aligned with students’ actual knowledge due to their supposed more objective nature. To further determine the practical usefulness of the instruments in quantifying self-perceived knowledge, we examined the distribution of responses along the scales to compare effective resolution and discriminative power. As a secondary objective, we investigated the prevalence of misconceptions among students to gain insight into their conceptual understanding of the subject matter. In the present study, we focused on basic sciences particularly given the high prevalence of misconceptions [21, 22, 23, 24, 25, 43]. Misconceptions may not only affect students’ conceptual understanding, but also have significant impact on their clinical reasoning skills [23, 44, 45], and are therefore considered an important subject of investigation in medical education.
A cohort of first-year bachelor Biomedical Sciences students (n = 71) from Leiden University Medical Center participated in this study. In the Netherlands, similar physiology courses are part of both the biomedical and medical bachelor curriculum.
For this study, the online platform Socrative was used to develop multiple-choice knowledge tests. The knowledge tests were implemented in a compulsory 2‑hour supervised seminar on cardiovascular physiology that was taught in five small working groups (13–15 students/group) by the same teacher. Allocation of the students to the working groups was arbitrary except for the aim to have a similar female/male ratio in all groups. The topics were introduced in a preceding plenary lecture, and students were provided with a handout during the seminar which contained some basic figures and diagrams. Although the knowledge tests were a mandatory part of the seminar, students could voluntarily decide if their anonymous answers could be used for educational research purposes by giving informed consent. No incentives were offered for participation and test results did not affect the course grade. Students could withdraw their permission at any time. This experiment was part of a larger study also investigating the influence of peer discussion on confidence, the data of which are not reported in this paper. This study was approved by the Educational Research Review Board (ERRB) of Leiden University Medical Center: ERRB reference number: OEC/ERRB/20180612/3.
The knowledge tests were taken individually under exam conditions, at the beginning (pre) and at the end (post) of the seminar. Both tests consisted of 10 multiple-choice questions and examined the same knowledge base. The post-test used slightly different phrasing of the questions, which can be considered isomorphic or so-called near-transfer questions. All students were instructed to answer each question individually and to provide a confidence rating (Likert scale: 3 out of 5 groups, n = 42) or wager (PDW: 2 out of 5 groups, n = 29) immediately after each question. To compute the actual knowledge score, students received one point for every correct answer on the multiple-choice tests, yielding a maximum score of 10 points per test. During the main body of the seminar, the questions from the pre-test and related topics were explained and discussed.
Questions were framed according to the revised version of Bloom’s taxonomy of the cognitive domain and were all categorized as comprehension-type questions. To ensure validity of the knowledge test, the physiology questions were designed by an expert physiologist (PS) and derived from a database of previously used exam questions. PS is an educational professor at Leiden University Medical Center, actively involved in shaping the biomedical and medical curriculum regarding medical physiology education, and familiar with the learning goals of the human biology course. Consequently, the questions were designed in such a way that they would meet the overall course objectives regarding physiological knowledge.
For confidence rating in the Likert scale group (LS group), students used a 5-point Likert scale rating instrument: (1) Completely unsure (just guessing), (2) Rather unsure, (3) Neutral (50/50), (4) Rather sure, (5) Very sure (almost 100%). Students received additional instruction on the neutral item (3) as the lecturer indicated that this rating should be given when a student doubted between two remaining answer choices considered equally likely. In the PDW group, students were instructed to place their bets on the correct answer. In order to compare results with the LS group, wagers ranged from 1 to 5 points. Students could bet 1–5 points per question, which were gained if the answer was correct and lost if the answer was incorrect. Students received their total wagering scores after the seminar.
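The two scoring schemes described above can be made concrete with a short sketch. This is an illustration only: the function and variable names are hypothetical and not part of the study materials, but the rules match the text (one point per correct multiple-choice answer, and wagers of 1–5 points gained on a correct answer and lost on an incorrect one):

```python
# Illustrative sketch of the two instruments; names are hypothetical.

def actual_knowledge_score(answers_correct):
    """One point per correct multiple-choice answer (max 10 per test)."""
    return sum(1 for correct in answers_correct if correct)

def wagering_score(answers_correct, wagers):
    """Post-decision wagering: each wager (1-5 points) is gained on a
    correct answer and lost on an incorrect one."""
    return sum(w if correct else -w
               for correct, w in zip(answers_correct, wagers))

def mean_confidence(ratings):
    """Mean self-perceived knowledge on a 1-5 scale (Likert or wager)."""
    return sum(ratings) / len(ratings)

# Example: 10 questions, 7 answered correctly
correct = [True, True, False, True, True, True, False, True, False, True]
wagers  = [5, 4, 2, 5, 3, 5, 1, 4, 2, 5]

print(actual_knowledge_score(correct))   # 7
print(wagering_score(correct, wagers))   # 31 gained - 5 lost = 26
```

Note the asymmetry this creates: under PDW a low wager caps the possible loss, which is exactly why loss aversion is expected to pull wagers down, while a Likert rating carries no such penalty.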
Mean scores for students’ actual and self-perceived knowledge

Group (test)      No. of students   Actual knowledge (score, max 10)   Self-perceived knowledge (score, max 5)
PDW (pre-test)    29                6.78 ± 0.27                        3.75 ± 0.14
LS (pre-test)     42                6.38 ± 0.27                        3.13 ± 0.08
PDW (post-test)   29                8.31 ± 0.23                        4.60 ± 0.07
LS (post-test)    42                7.68 ± 0.27                        3.92 ± 0.08
The aim of this study was to evaluate psychology-derived post-decision wagering (PDW) as a measure of self-perceived knowledge by comparing it with the commonly used Likert scales. Despite similar levels of actual knowledge in both groups, students who used wagers indicated more confidence in their answers compared with students who used ratings. PDW confidence scores were also less evenly distributed compared with Likert scales and did not show a normal distribution, resulting in a less proportional use of the instrument. Contrary to our a priori hypothesis, PDW thus represents a less balanced measure of self-perceived knowledge than the traditional Likert scale.
Misalignment between actual and self-perceived knowledge was present in both groups, indicating that students generally do not know what they know or do not know. Some students using PDW mentioned that they would easily go ‘all in’ and bet all their points even when they were not completely sure, despite the rule that points were lost in case of an incorrect answer. Contrary to these findings, psychological research and behavioural economics report that individuals who wager generally show lower confidence compared with verbal confidence reports due to so-called risk aversion [41, 42]. The apparent absence of risk aversion in our students might be due to the use of imaginary ‘points’ as incentives. Interestingly, however, studies in laboratory settings have reported similar responses when using real versus imaginary incentives. We suggest that the educational environment in which students are socially engaged might influence their response to risk aversion. The effects of classroom versus laboratory settings on the use of self-perceived knowledge measures await further investigation.
Our findings are in line with previous studies reporting discrepancies between students’ actual and self-perceived knowledge consistently across disciplines [1, 6, 15, 24, 46, 47, 48, 49]. Kruger and Dunning have suggested that incompetence deprives an individual of the ability to recognize one’s own shortcomings. Based on this reasoning, our reported underestimation of performance is an unexpected finding. An explanation for this discrepancy may be that metacognitive evaluation of the specific materials in this study did not resemble the more global self-assessments mentioned in Kruger and Dunning’s studies. Such global judgements are, for example, self-competence (‘I feel I am able to …’) and self-efficacy (‘how confident are you that you can …’). These are prospective judgements and can therefore be considered part of one’s metacognitive knowledge, whereas metacognitive evaluation (as measured in the present study) is an on-line judgement which is made after one has performed a specific task [4, 5]. Students might be more cautious with estimating their actual scores after they have performed a task compared with a situation in which a global prospective judgement is requested. Furthermore, gender differences may have influenced our outcomes. Research has indicated that women tend to underestimate their academic capacities, as illustrated by the study by Ravesloot et al. on progress testing in medical education. Most of the students (70%) in this experiment were female, which might partly explain the relatively large percentage of correct low-confidence responses.
Overall, teacher instruction during the seminar reduced the percentage of lack of knowledge and lucky guesses and increased the number of correct responses, whereas the number of misconceptions was not greatly altered. Based on conceptual change theory and schema theory, we assume that the seminar discussion may not have benefitted the students with misconceptions, as the instructional design to achieve conceptual change should include more explicit activation of students’ prior knowledge. A learner’s understanding is facilitated by adding new information to an existing mental model or schema in the brain, which comprises relevant prior knowledge a learner has already obtained [51, 52, 53]. Cognitive neuroscientists have shown that new information can be added to schemas (i.e. stable neural networks) faster when this information fits the prior knowledge [54, 55]. Thus, when a learner’s schema encompasses a misconception, it may be difficult for the instructor to teach the scientific conception using traditional teaching methods. By activating students’ prior knowledge, their current understanding of concepts and potential misconceptions become detectable for instructors [56, 57]. Future studies may focus on the use of conceptual change instructions, including activation of prior knowledge, to alleviate the number of misconceptions and enhance students’ conceptual understanding.
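The four response categories referred to above combine correctness with confidence. As a sketch, the rule can be written down as a small classifier. The confidence threshold and category labels below are illustrative assumptions (consistent with defining a misconception as a highly confident incorrect answer), not the study’s exact cutoffs:

```python
def classify_response(correct, confidence, high=4):
    """Classify a response by correctness and 1-5 confidence.
    `high` is an assumed threshold for 'high confidence', not the
    study's published cutoff."""
    if correct and confidence >= high:
        return "correct (confident)"
    if correct:
        return "lucky guess"            # correct, but low confidence
    if confidence >= high:
        return "misconception"          # confidently incorrect
    return "lack of knowledge"          # incorrect and aware of it

print(classify_response(False, 5))  # misconception
print(classify_response(True, 1))   # lucky guess
```

Applied to pre- and post-test responses, such a rule makes the pattern reported above explicit: instruction moves responses out of the ‘lack of knowledge’ and ‘lucky guess’ cells, while the ‘misconception’ cell stays largely unchanged.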
Our study has several limitations. We focused on the comparison of different measurement instruments for self-perceived knowledge. The relatively low difficulty level of the questions might have limited the prevalence of misconceptions, as these were comprehension-type rather than conceptually oriented questions. Additionally, the post-test questions were not identical but isomorphic to the pre-test questions, which may have resulted in unanticipated changes in item difficulty because of the near-transfer conversions of the original items. Furthermore, students had limited time to process the information, as the total length of the seminar was approximately 120 min. An additional retention test might have provided more insight into the stability of students’ conceptual understanding.
This study was conducted to evaluate post-decision wagering as an instrument for measuring students’ self-perceived knowledge in educational contexts. Our findings add to the growing number of studies that demonstrate the subjective properties of PDW and thus do not support previous literature on the objectivity of PDW as a measure of confidence. In all, this study supports the use of Likert scales over post-decision wagering as an instrument for measuring self-perceived knowledge in educational settings.
Conflict of interest
M. Versteeg and P. Steendijk declare that they have no competing interests.
- 1. Ziegler B, Montplaisir L. Student perceived and determined knowledge of biology concepts in an upper-level biology course. CBE Life Sci Educ. 2014;13:322–30.
- 4. Zohar A, Barzilai S. A review of research on metacognition in science education: current and future directions. Stud Sci Educ. 2013;49:121–69.
- 6. Cavers M, Ling J. Confidence weighting procedures for multiple-choice tests. In: Chen DG, Chen J, Lu X, Yi G, Yu H, editors. Advanced statistical methods in data science. ICSA Book Series in Statistics. Singapore: Springer; 2016.
- 7. Sparck EM, Bjork EL, Bjork RA. On the learning benefits of confidence-weighted testing. Cogn Res Princ Implic. 2016;1:3.
- 10. Bokhorst FD. Confidence-weighting and the validity of achievement tests. Psychol Rep. 1986;59:383–6.
- 12. Kampmeyer D, Matthes J, Herzig S. Lucky guess or knowledge: a cross-sectional study using the Bland and Altman analysis to compare confidence-based testing of pharmacological knowledge in 3rd and 5th year medical students. Adv Health Sci Educ. 2015;20:431–40.
- 19. Tanner K, Allen D. Approaches to biology teaching and learning: understanding the wrong answers—teaching toward conceptual change. CBE Life Sci Educ. 2005;4:112–7.
- 25. Badenhorst E, Hartman N, Mamede S. How biomedical misconceptions may arise and affect medical students’ learning: a review of theoretical perspectives and empirical evidence. Health Prof Educ. 2016;2:10–7.
- 26. Posner GJ, Strike KA, Hewson PW, Gertzog WA. Accommodation of a scientific conception: toward a theory of conceptual change. Sci Educ. 1982;66:211–27.
- 27. Hasan S, Bagayoko D, Kelley EL. Misconceptions and the certainty of response index (CRI). Phys Educ. 1999;34:294–9.
- 32. Caleon IS, Subramaniam R. Do students know what they know and what they don’t know? Using a four-tier diagnostic test to assess the nature of students’ alternative conceptions. Res Sci Educ. 2010;40:313–37.
- 33. Sreenivasulu B, Subramaniam R. Exploring undergraduates’ understanding of transition metals chemistry with the use of cognitive and confidence measures. Res Sci Educ. 2014;44:801–28.
- 34. Yan YK, Subramaniam R. Using a multi-tier diagnostic test to explore the nature of students’ alternative conceptions on reaction kinetics. Chem Educ Res Pract. 2018;19:213–26.
- 45. Goldszmidt M, Minda JP, Devantier SL, Skye AL, Woods NN. Expanding the basic science debate: the role of physics knowledge in interpreting clinical findings. Adv Health Sci Educ. 2012;17:547–55.
- 48. Lindsey BA, Nagel ML. Do students know what they know? Exploring the accuracy of students’ self-assessments. Phys Rev Spec Top Phys Educ Res. 2015;11:020103.
- 50. Ravesloot CJ, Van der Schaaf MF, Muijtjens AM, et al. The don’t know option in progress testing. Adv Health Sci Educ. 2015;20:1325–38.
- 51. Anderson RC, Pearson PD. A schema-theoretic view of basic processes in reading comprehension. In: Pearson PD, Barr R, Kamil ML, Mosenthal P, editors. Handbook of reading research. New York: Longman; 1984. pp. 255–91.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.