Introduction

Individuals who intend to expand their knowledge or make personal decisions on the basis of scientific evidence are often confronted with competing knowledge claims, particularly when it comes to ill-structured scientific issues (Greene and Yu 2016; Sinatra et al. 2014). Therefore, the ability to evaluate scientific controversies is crucial for laypeople as it enables informed decisions and democratic participation (Carey and Smith 1993; Kuhn 2005). However, it is not possible for laypeople to know and understand all relevant scientific findings on any given subject. They must therefore develop the competence to weigh and evaluate the contradictory knowledge claims they encounter (Bromme and Goldman 2014). Specifically, key to evaluating a scientific controversy is the ability to detect and interpret its underlying causes (Britt et al. 2014). Recent research has stressed the role of epistemic beliefs when evaluating conflicting scientific knowledge claims (Bråten et al. 2011; Greene and Yu 2016; Sinatra and Hofer 2016). In particular, beliefs about the certainty or uncertainty of knowledge (Hofer and Pintrich 1997) have been proposed to be related to how students evaluate conflicting information pertaining to scientific issues such as global warming (Bråten et al. 2011; Bråten and Strømsø 2010).

Two lines of previous research have focused on either individuals’ professed epistemic beliefs or their enacted epistemic beliefs. Professed epistemic beliefs refer to personal views about knowledge and knowing (Hofer and Pintrich 1997), usually measured by questionnaires (i.e., offline data). In contrast, data sources such as verbal reports have been used to capture individuals’ enacted epistemic beliefs while engaged in certain tasks (i.e., online or process data). The relation between individuals’ professed epistemic beliefs and their enacted epistemic beliefs, however, has remained unclear in most previous studies.
Therefore, a major goal of the present study was to investigate the interplay of individuals’ professed and enacted epistemic beliefs regarding the uncertainty of scientific knowledge and how they relate to the evaluation of scientific controversies.

Science-related epistemic beliefs

The term epistemic beliefs refers to individuals’ personal views about knowledge and the process of knowing (Hofer and Bendixen 2012; Hofer and Pintrich 1997). A dominant line of epistemic belief research has identified systems of relatively independent belief dimensions (Hofer and Pintrich 1997; Schommer 1990) that target the nature of knowledge and knowing and include, for instance, beliefs about the certainty (or uncertainty) of knowledge or beliefs about the justification of knowledge. Beliefs about the certainty or uncertainty of knowledge (hereafter referred to as uncertainty beliefs) constitute a core dimension in most epistemic belief frameworks (Bromme et al. 2008; Trautwein and Lüdtke 2007). Uncertainty beliefs target the nature of knowledge, ranging from views that knowledge is absolute and unchanging (i.e., certain) to views that knowledge is tentative and evolving (i.e., uncertain; Hofer 2001; Hofer and Pintrich 1997). Prior research has linked uncertainty beliefs to successful learning and achievement (Cano and Cardelle-Elawar 2004; Trautwein and Lüdtke 2007), particularly in the domain of science (Conley et al. 2004; Elby et al. 2016; Winberg et al. 2019).

A promising approach for studying uncertainty beliefs in the domain of science lies in investigating how students deal with conflicting knowledge claims, or more specifically, how they evaluate scientific controversies (Bråten et al. 2016; Flemming et al. 2015). Even though believing in uncertain knowledge might not be advantageous under all circumstances (Sinatra et al. 2014), acknowledging the uncertainty of scientific knowledge appears to be an important prerequisite for individuals to compare and evaluate multiple conflicting knowledge claims (Bråten et al. 2011; Bråten and Strømsø 2010; Britt et al. 2014; Schraw et al. 1995). In this regard, Bråten et al. (2011) introduced a theoretical framework that specifies how different epistemic belief dimensions influence the understanding of multiple, partly conflicting information sources. Specifically, beliefs in uncertain knowledge are proposed to be beneficial for juxtaposing inconsistent information, whereas beliefs in certain knowledge are assumed to prompt readers to search for a single correct answer. Accordingly, uncertainty beliefs should lead to more in-depth processing when readers are confronted with scientific controversies (Bråten et al. 2011; Bråten et al. 2016).

Professed versus enacted science-related epistemic beliefs

Usually, epistemic beliefs are either assessed with self-report measures such as questionnaires, or they are measured directly in a particular context, for example by using verbal reports (see Mason 2016 and Sandoval et al. 2016, for an overview). In line with these different approaches, several authors have introduced dichotomous terms to distinguish the two assessment approaches: professed and enacted epistemic beliefs (Louca et al. 2004), espoused beliefs and beliefs in practice (Chai and Khine 2008), or formal and practical epistemology (Sandoval 2005). The present study builds upon Louca et al.’s (2004) terminology of professed and enacted epistemic beliefs, differentiating between professed uncertainty beliefs (PUB) and enacted uncertainty beliefs (EUB).

In self-report measures that attempt to assess PUB, respondents are asked to rate their agreement with statements about the certainty or uncertainty of knowledge either in general (e.g., Schommer 1990) or in relation to a particular subject domain such as science (e.g., Conley et al. 2004). However, criticism has been raised that questionnaires provide only decontextualized measures of science-related beliefs because “what students say about knowledge, science, or experiments in general might have little connection with their actual epistemic practices of reasoning and thinking about real matters” (Sinatra and Chinn 2012, p. 264, see also Bendixen and Rule 2004; Greene and Yu 2014).

EUB are usually measured with verbal data such as cognitive interviews or thinking aloud (e.g., Ferguson et al. 2012; Greene et al. 2010; Hofer 2004; Mason et al. 2011; Mason et al. 2010a; Muis et al. 2014). Whereas cognitive interviews are prone to elicit information that participants consider only because they were asked the respective questions (Hofer and Sinatra 2010; Schraw 2000), thinking aloud has the advantage of producing information about cognitive processes while individuals complete a task (Mason et al. 2010b). Furthermore, van Gog et al. (2005) proposed cued retrospective reporting, a procedure in which participants are shown a record of their own task performance (e.g., a video of their own eye movements) as a cue for retrospective thinking aloud. Compared to concurrent thinking aloud, this approach can result in more verbal utterances on a cognitive and metacognitive level without altering the quality of participants’ responses (Brand-Gruwel et al. 2017; Hyrskykari et al. 2008) and without impairing task performance (Fox et al. 2011).

However, rather than assessing either PUB or EUB, in the present paper, we propose to measure science-related PUB and EUB in conjunction, as such triangulation of data sources is likely to produce more valuable insights into the construct of uncertainty beliefs and how it relates to the evaluation of scientific controversies than could be gathered by only one data source (Muis 2007).

Relation between professed and enacted epistemic beliefs

The relation between professed and enacted epistemic beliefs, that is, between what individuals say they think about knowledge and knowing and what they actually think in a certain context, has recently been described by Alexander (2016) as one of the big unresolved questions in the field. Similarly, it has been documented in the science education literature that science teachers’ epistemic beliefs are not always in alignment with their teaching practices (e.g., Schraw and Olafson 2003; Tobin and McRobbie 1997), emphasizing also the practical relevance of clarifying the interrelation between PUB and EUB. It has been suggested that individuals’ professed epistemic beliefs can inform their enacted epistemic beliefs in a given context such as scientific controversies in the sense that “epistemic beliefs are the content upon which epistemic cognition processes act” (Greene et al. 2016, p. 5). Both PUB and EUB can be adaptive for the evaluation of scientific controversies. With respect to PUB, such beliefs have been shown to be beneficial for readers in integrating multiple texts addressing a controversial socio-scientific issue (Strømsø et al. 2008) and in constructing arguments with regard to such issues (Bråten and Strømsø 2010). Furthermore, an eye-tracking study by Mason and Ariasi (2010) showed that PUB were positively correlated with readers’ fixation times on controversial or ambiguous information about a biological topic. In addition, a study by Richter and Schmid (2010) showed that students with stronger PUB reported more advanced strategies to check (in)consistencies within an academic text. Finally, a recent meta-analysis revealed that PUB, among other factors, predicted academic achievement such as individuals’ argumentation and conceptual knowledge (Greene et al. 2018a).

Likewise, with respect to EUB, several studies have also provided evidence for a positive relation to individuals’ performance in terms of online learning regarding a socio-scientific issue (Cho et al. 2018), self-regulation strategies with respect to academic reading (Richter and Schmid 2010), and science learning strategies (Lee et al. 2016). Moreover, Mason et al. (2010a) directly investigated the interplay of science-related PUB and EUB in 8th grade students conducting a web search on a controversial scientific topic. Results showed that PUB were related to EUB as measured by retrospective interviews (i.e., by the question “How stable over time do you think the information you found on the Internet is?”). However, students’ EUB were not significantly related to their learning outcome measured after the web search by means of a set of open-ended questions, nor were such relations reported for PUB in this study.

The present study

The aim of the present study was to investigate the relation between science-related PUB and EUB and their role in university students’ evaluation of scientific controversies. Drawing on prior research, PUB were defined as individuals’ self-reported beliefs about the uncertainty of scientific knowledge and EUB as individuals’ verbalized beliefs about the uncertainty of knowledge related to their task processing, that is, the evaluation of scientific controversies. The present study focused on university students because scientific controversies both play a central role in their academic careers (Sinatra and Chinn 2012) and become increasingly important for young adults’ personal life decisions (Bromme and Goldman 2014; Feinstein 2011; Greene and Yu 2016). Based on the assumption that professed and enacted uncertainty beliefs are interrelated in the sense that individuals activate the beliefs they hold in the context for which these beliefs are adaptive (Bråten et al. 2016; Sandoval et al. 2016), a positive correlation between PUB and EUB was expected (Hypothesis 1).

Furthermore, we tested the respective relations of both PUB and EUB with individuals’ performance when evaluating scientific controversies. Based on previous findings (e.g., Greene et al. 2018b for PUB, or Cho et al. 2018 for EUB), it was hypothesized that individuals’ evaluation of scientific controversies would be predicted by both PUB (Hypothesis 2) and EUB (Hypothesis 3). Moreover, it was expected that due to their close link to individuals’ actual cognition (Barzilai and Zohar 2014), the effect of EUB on controversy-evaluation performance would be larger than the effect of PUB (Hypothesis 4).

Finally, it was predicted that the positive relation between PUB and controversy-evaluation performance would be mediated by EUB (Hypothesis 5). Such an enactment of uncertainty beliefs, which would manifest as this mediation effect, has been proposed in different theoretical models: Several authors have suggested that underlying epistemic beliefs influence performance through adaptive epistemic cognitive processes (Bråten et al. 2016; Hofer 2001; Muis 2007). Thus, the present study analyzed whether this prediction would hold for uncertainty beliefs when university students evaluate scientific controversies.

Method

Participants

Participants were N = 83 university students. Data from four students had to be excluded due to technical problems or because they did not complete the study. Thus, all analyses were conducted with N = 79 students (mean age = 20.8 years; SD = 2.08; 70% female). The study took place at a large German university with participants from different majors (45 from the natural sciences, 20 from the social sciences and humanities, 7 from economics and business, 7 from psychology and cognitive science). They received 12 € for their participation. German was the first language of all participants. The study was approved in advance by the local ethics committee, and participants gave their written consent at the beginning of the study.

Materials

Controversy-evaluation test

The dependent measure was students’ performance in evaluating scientific controversies. This was measured with a controversy-evaluation test that required the evaluation of five texts, each describing a scientific controversy between two scientists, together with claims regarding central aspects of the controversy (Kramer et al. 2020). The controversy-evaluation test was an element from the National Educational Panel Study (NEPS; Oschatz et al. 2017) and had the aim of assessing the ability to critically reflect on opposing scientific claims as an indicator of individuals’ ability to evaluate scientific controversies. Because the controversy-evaluation test was initially developed for high school students (i.e., grades 12 and 13), the difficulty of the test was assumed to be appropriate for this sample of undergraduate students. Each of the five texts included a vignette introducing a scientific debate and describing the controversy between the two scientists. Table 1 provides an overview of the titles, content, and length of the five scientific controversies. As can be seen from the titles, the controversy-evaluation test covered a range of different topics within the domain of science. Each of the controversies was presented on one page (M = 349.6 words, SD = 52.68), containing a short introduction to the topic followed by a description of the opposing perspectives of two fictitious scientists on the respective issue. Each controversy was accompanied by five to seven items stating possible reasons for the controversy or the conflicting claims. Participants were asked to judge each item as correct or incorrect. In total, the controversy-evaluation test consisted of 32 items, resulting in a score range of 0 to 32. Reliability in the present study was acceptable with α = .66.

Table 1 Description of the scientific controversies from the controversy-evaluation test

The items were carefully designed so that they can be solved by interested laypeople. They neither require prior knowledge of the underlying topics nor address mere text comprehension. Rather, readers are required to make inferences about possible causes of the controversy through critical reflection. Such critical reflection is distinct from individuals’ uncertainty beliefs (Thomm et al. 2017); instead, uncertainty beliefs can be seen as a prerequisite for engaging in it adequately. For high test scores, readers need to infer plausible causes that relate to inherent aspects of the controversies, such as the justification of theoretical assumptions made by the opposing scientists or the validity of the respective measurement approaches. In the following, this rationale is illustrated by two sample items from the controversy “Chemical plant.”

In the respective controversy, two chemists presented analyses of soil samples taken at varying distances from a chemical plant and arrived at contradictory claims as to whether the plant’s emissions were harmful. One of the chemists stressed that children playing close to the chemical plant would be in particular danger. One item relating to this controversy stated that “The argumentation of Scientist A would also be plausible if he didn’t refer to playing children.” To answer correctly, participants had to recognize that the reference to playing children did not, by itself, constitute a valid argument. Therefore, the correct answer to this item was “correct” because the plausibility of the arguments was not affected by the reference to playing children. Another item addressing this controversy stated “Because scientist B wants to demonstrate the harmlessness of the chemical plant, it is scientifically correct that he publishes only the matching results.” Here, participants needed to recognize that the collection and presentation of scientific data should not be determined by desired results. Therefore, the correct answer to this item was “incorrect.”
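As a minimal sketch of the scoring logic described above (the key and responses below are hypothetical illustrations, not the actual test materials), each of the 32 correct/incorrect judgments is compared against the scoring key, and matches are summed:

```python
# Hypothetical sketch of the controversy-evaluation scoring: each item is
# judged "correct"/"incorrect" and compared against the scoring key; the
# test score is the number of judgments that match the key.

def score_controversy_test(responses, key):
    """Return the number of items judged in line with the key (0..len(key))."""
    if len(responses) != len(key):
        raise ValueError("response vector must cover all items")
    return sum(r == k for r, k in zip(responses, key))

# Illustrative key/responses for the two sample items from "Chemical plant"
key = ["correct", "incorrect"]         # keyed answers, as described in the text
responses = ["correct", "correct"]     # the second judgment misses the key
print(score_controversy_test(responses, key))  # → 1
```

A full protocol would pass 32-element vectors, yielding the 0–32 score range reported above.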

Professed uncertainty beliefs

To measure participants’ professed (i.e., self-reported) uncertainty beliefs in the domain of science, two subscales from the Scientific Epistemological Beliefs Questionnaire developed by Conley et al. (2004) were used. These subscales are labeled certainty of scientific knowledge (six items, e.g., “Scientists always agree about what is true in science”) and development of science-related knowledge (six items, e.g., “Ideas in science sometimes change”). Following Mason et al. (2008), these two subscales were collapsed into one scale in order to achieve a conceptual correspondence with the original measurement of uncertainty beliefs by Hofer and Pintrich (1997), which contains both aspects (i.e., uncertainty and development of knowledge). Items from the certainty subscale were recoded so that high values for all items of the scale indicated beliefs about uncertain (i.e., tentative and evolving) knowledge. The resulting scale consisted of 12 items that were answered on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree), with an acceptable reliability of α = .75.
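The recoding and reliability estimation can be sketched as follows (the response matrix is fabricated for illustration; `cronbach_alpha` implements the standard formula, not the authors’ actual analysis script):

```python
import numpy as np

def reverse_code(x, scale_min=1, scale_max=5):
    # On a 1-5 Likert scale, reverse-coding maps 1->5, 2->4, ..., 5->1,
    # so that high values indicate beliefs in uncertain knowledge.
    return scale_max + scale_min - np.asarray(x)

def cronbach_alpha(items):
    """items: participants x items matrix; standard Cronbach's alpha."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Fabricated responses: 4 participants x 3 items (already recoded so that
# high values = stronger beliefs in uncertain knowledge)
data = np.array([[5, 4, 5],
                 [2, 2, 3],
                 [4, 4, 4],
                 [1, 2, 1]])
print(round(cronbach_alpha(data), 2))  # → 0.96
```

In the study, the same computation over the 12 recoded items yielded α = .75.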

Enacted uncertainty beliefs

Cued retrospective verbal reports (van Gog et al. 2005) were used to measure participants’ enacted uncertainty beliefs, that is, their epistemic cognitive processes while they were working on the controversy-evaluation test. In this procedure, participants retrospectively verbalize their thought processes when working on a task, prompted by their eye movements shown to them as cues. To obtain these cues, an SMI (SensoMotoric Instruments) remote eye-tracking system with a sampling rate of 250 Hz, using infrared cameras positioned below a 22-in. Dell monitor (with a resolution of 1680 × 1050 pixels), was used. A chin rest was used to avoid head movements during data recording and to guarantee a fixed distance of about 70 cm between the eyes and the eye-tracking device. Participants’ eye movements were recorded as they looked at the computer screen to complete the controversy-evaluation test. After participants had completed the task, for each controversy, their own test-taking behavior was played back to them at 50% speed (Brand-Gruwel et al. 2017; Kammerer et al. 2013; van Gog et al. 2005) as a so-called gaze replay, a screen-recording video with their eye movements superimposed. Specifically, participants were shown only the parts of the gaze replay that depicted how they answered the items. That is, participants were presented with indicators of their covert cognitive processes (i.e., their own eye movements, depicted as a yellow dot representing their fixation points) as well as their overt actions (i.e., mouse movements and clicks; van Gog and Jarodzka 2013; van Gog et al. 2009) while they answered the items on the controversy-evaluation test.

Before watching the gaze replay, participants received the following instructions for the verbalization (in line with the standards described by Ericsson and Simon 1993):

“In the following, you will be shown a video with a recording of your eye movements when answering the questions. Please watch the video and tell me everything you were thinking then. In the video, you will see a yellow dot that moves across the screen. This is the recording of your eye movements. The video will be played at half speed so that you have the opportunity to comment on your eye movements. Just act as though you were alone in the room talking to yourself. It is important that you verbalize everything that comes to mind. This is not about your thoughts being correctly formulated or thought through. If you don’t say anything for a while, I will ask you to speak. I will play the video now and start a new recording that records what you are saying. Please keep in mind that you should verbalize everything you were thinking about when answering the questions.”

Coding of the verbal protocols

Participants’ verbalizations during cued retrospective reporting were audiotaped and transcribed. Transcripts were then segmented into idea units that each comprised a coherent statement. Note that idea units could range from a few words to several sentences, depending on the semantic structure of what participants expressed rather than on grammatical considerations. In order to identify idea units that referred to uncertainty beliefs in the verbal protocols, a coding scheme was developed in a deductive process, taking into account the theoretical framework of the study as well as the context of test-taking when participants completed the controversy-evaluation test (see Table 2). Coding an idea unit as referring to uncertain or certain knowledge involved two steps. In Step 1, idea units were coded with respect to whether they referred to (a) the content of the respective controversy or to (b) participants’ actions during the test-taking process (e.g., “Now I’m reading the next question”). If an idea unit was coded as referring to content, the coding decision in Step 2 was whether the idea unit referred to (a) uncertain knowledge, (b) certain knowledge, or (c) neither the certainty nor the uncertainty of knowledge (residual category). Sample utterances reflecting uncertainty beliefs are “I don’t think that new views or findings should be valued less than old ones”, or “It is hard to say whether such statements are correct, because in my opinion several views can be correct. And they can be more or less substantiated, and there are models that sometimes apply and sometimes they don’t”. A sample utterance reflecting certainty beliefs is “No, I mean evolutionary biology is just as up-to-date as it was then. That doesn’t make a difference, nothing is changing”.
As can be seen from the provided examples, when coding for beliefs about the uncertainty or certainty of knowledge, the idea units were carefully examined for whether they suggested that scientific knowledge is complex and subject to change (or simple and unchanging, respectively) and whether they acknowledged disagreement between the two scientists (or endorsed a single correct answer, respectively). Using this coding scheme, two raters familiar with the study materials independently coded a random subsample of 20% of the verbal protocols. For Step 1, interrater agreement as measured by Krippendorff’s α was .90, both for idea units coded as content of the controversy and for idea units coded as test-taking process. For Step 2, Krippendorff’s α was .81 for idea units coded as referring to uncertain knowledge, .80 for idea units coded as referring to certain knowledge, and .85 for the residual category (i.e., other content-related utterances). All disagreements were resolved through discussion. Then, one rater coded the remaining verbal protocols.
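For readers who wish to reproduce this agreement statistic: for nominal codes from two raters with no missing data, Krippendorff’s α can be computed from a coincidence matrix. The following is a minimal sketch (the codes and example units are invented, not taken from the study’s protocols):

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(rater1, rater2):
    """Krippendorff's alpha for nominal codes, two raters, no missing data."""
    assert len(rater1) == len(rater2)
    # Coincidence matrix: each unit contributes both ordered pairs of its two
    # codes, each weighted by 1/(m - 1) = 1 for m = 2 raters per unit.
    coincidences = Counter()
    for a, b in zip(rater1, rater2):
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1
    n = 2 * len(rater1)  # total number of pairable values
    marginals = Counter()
    for (a, _b), count in coincidences.items():
        marginals[a] += count
    observed_disagreement = sum(c for (a, b), c in coincidences.items() if a != b)
    expected_disagreement = sum(
        marginals[a] * marginals[b] for a, b in permutations(marginals, 2)
    )
    if expected_disagreement == 0:  # only one category ever used
        return 1.0
    return 1 - (n - 1) * observed_disagreement / expected_disagreement

# Invented example: codes for 5 idea units assigned by two raters
r1 = ["uncertain", "process", "certain", "uncertain", "residual"]
r2 = ["uncertain", "process", "certain", "residual", "residual"]
print(round(krippendorff_alpha_nominal(r1, r2), 2))  # → 0.76
```

Unlike simple percentage agreement, α corrects for the disagreement expected by chance given the category marginals, which is why it is preferred for unevenly distributed codes such as the rare certainty-belief category here.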

Table 2 Coding scheme and interrater agreement for the verbal protocols

Control variables

Because of the text-intensive nature of the controversy-evaluation test, individuals’ reading comprehension ability was assessed as a control variable. Reading comprehension ability was measured with a standardized German cloze test (LGVT 6-12 by Schneider et al. 2007). On this test, participants are given 4 min to work through a text containing 23 gaps; at each gap, the target word is presented next to two distractor words, and participants underline the word that fits the context. They receive 2 points for every correctly underlined word, −1 point for every incorrectly underlined word, and 0 points if no word is underlined, resulting in a reading comprehension score ranging from −23 to +46.
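The scoring rule translates directly into code; the sketch below assumes a simple per-item response coding (an illustration, not the published scoring procedure for the LGVT 6-12):

```python
def score_cloze_test(item_results):
    """item_results: one entry per cloze gap, each 'correct', 'incorrect',
    or 'none' (no word underlined).
    Scoring: +2 per correct word, -1 per incorrect word, 0 per omission."""
    points = {"correct": 2, "incorrect": -1, "none": 0}
    return sum(points[r] for r in item_results)

# A participant who solves 20 gaps, errs on 2, and omits 1 of the 23:
results = ["correct"] * 20 + ["incorrect"] * 2 + ["none"]
print(score_cloze_test(results))  # → 38
```

The extremes of the rule reproduce the stated score range: all 23 incorrect gives −23, all 23 correct gives +46.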

The length of the gaze replay that was shown to participants to collect cued retrospective verbal reports varied depending on how long it took participants to answer the items. That is, participants with longer processing times during test-taking were also confronted with longer gaze replays. Hence, to account for the time available for verbalization during cued retrospective reports, gaze replay duration was used as another control variable.

Procedure

Participants’ PUB were measured approximately 1 week before the lab sessions with an online questionnaire in order to avoid carryover effects to the subsequent assessment of EUB. Then, participants were tested in single sessions in the lab. The five scientific controversies from the controversy-evaluation test and their accompanying items were presented successively, each on a single page on a computer screen, and the items were answered via mouse click. Calibration of the eye-tracking system was repeated before each controversy to increase measurement accuracy. After participants had completed the controversy-evaluation test, they were shown the gaze replay of their test-taking performance for all five controversies in the original order and provided their cued retrospective verbal reports. Hence, whereas the resulting score on the controversy-evaluation test served as a measure of participants’ controversy-evaluation performance, their utterances about test-taking, as captured by the cued retrospective reports, were used as online measures of EUB. Finally, participants completed the reading comprehension ability test.

Results

Descriptive results and intercorrelations with control variables

Table 3 provides an overview of the descriptive and correlational results. On average, participants answered M = 24.41 items (SD = 3.73) out of 32 correctly. Participants’ average gaze replay duration was 20.09 min (SD = 5.25), and in the verbal protocols across all five controversies of the controversy-evaluation test, an average of 49.19 idea units (SD = 13.12) were coded per participant. About two-thirds of the idea units were related to the content of the controversies (M = 32.77, SD = 11.10) and one-third to participants’ test-taking process (M = 16.42, SD = 14.47). Among the content-related idea units, beliefs in uncertain knowledge (i.e., EUB) were coded M = 2.46 times (SD = 2.24), and beliefs in certain knowledge were coded M = 0.32 times (SD = 0.69) per participant. Due to the low frequency of the latter (only 16 out of the 79 participants uttered at least one idea unit related to certain knowledge), as in Mason et al. (2010a), this category was excluded from further analyses. Participants’ average reading comprehension score was M = 20.41 (SD = 7.91). Note that according to the norms of Schneider et al. (2007), a comprehension score of 20 would be at percentile rank 89 for twelfth-grade academic-track students. Reading comprehension ability was negatively correlated with gaze replay duration (r = − .29, p = .010); that is, the higher students’ reading comprehension ability, the less time they spent answering the items. However, because neither reading comprehension ability nor gaze replay duration was significantly correlated with controversy-evaluation performance (p = .247 and p = .533, respectively), these variables were not considered in the remaining analyses. In contrast, there was a significant negative correlation between the number of idea units addressing the test-taking process and controversy-evaluation performance (r = − .29, p = .010). Furthermore, the number of idea units addressing the test-taking process was significantly negatively correlated with EUB (r = − .40, p < .001). Thus, we controlled for this variable in an additional analysis (see below). In addition, EUB and gaze replay duration were significantly positively correlated (r = .36, p = .001).

Table 3 Summary of intercorrelations and descriptive statistics for all measured variables

In the following sections, the results regarding our five hypotheses will be presented. In the results of the multivariate analyses, standardized coefficients will be reported to allow for easier interpretation.

Interrelations between professed uncertainty beliefs, enacted uncertainty beliefs, and controversy-evaluation performance

Hypothesis 1 predicted that PUB and EUB would be interrelated. As shown in Table 3, there was a small but significant positive correlation between these variables (r = .23, p = .045), confirming this hypothesis. Furthermore, we also expected positive correlations between controversy-evaluation performance and PUB (Hypothesis 2) and between controversy-evaluation performance and EUB (Hypothesis 3). As shown in Table 3, these hypotheses were also confirmed (r = .45, p < .001 and r = .33, p = .003, respectively). In addition, as reported above, EUB were negatively correlated with the number of idea units addressing the test-taking process, which in turn was negatively correlated with participants’ controversy-evaluation performance (see also Table 3). Thus, to investigate whether this variable would alter the relation between EUB and controversy-evaluation performance, a multiple linear regression analysis was conducted, with controversy-evaluation performance as the dependent variable and EUB and the number of idea units addressing the test-taking process as predictor variables. In the resulting model, R2 = .14, F(2, 76) = 6.16, p = .003, the number of idea units addressing the test-taking process did not significantly predict controversy-evaluation performance (β = − .18, p = .118), whereas EUB was still a significant positive predictor (β = .26, p = .028).

Strength of interrelations of controversy-evaluation performance with professed uncertainty beliefs and enacted uncertainty beliefs

In addition, Hypothesis 4 predicted that performance in the controversy-evaluation test would be more strongly related to EUB than to PUB. However, the correlation between PUB and controversy-evaluation performance was higher than the correlation between EUB and controversy-evaluation performance (see Table 3), although this difference was not significant (z = 0.91, p = .182). Thus, Hypothesis 4 was not supported, and there was even a tendency toward the opposite pattern, that is, a stronger association of participants’ performance in the controversy-evaluation test with their PUB than with their EUB.

Mediating role of enacted uncertainty beliefs

Finally, Hypothesis 5 predicted that the positive relation between PUB and performance in the controversy-evaluation test would be mediated by EUB. The prerequisites for a mediation model were met (cf. Baron and Kenny 1986): There was a positive correlation between the predictor and the mediator (i.e., PUB and EUB), between the mediator and the dependent variable (i.e., EUB and controversy-evaluation performance), and between the predictor and the dependent variable (i.e., PUB and controversy-evaluation performance; see Table 3). The fourth prerequisite is that, when controlling for the mediator, the effect of the predictor decreases (partial mediation) or vanishes (complete mediation). The corresponding indirect effect can be tested for significance with bootstrapping-based techniques (Hayes 2013). Thus, we tested for an indirect effect of PUB on performance in the controversy-evaluation test through EUB, as shown in Fig. 1, using 10,000 bootstrap samples. When including EUB as a mediator, there was still a strong association between PUB and controversy-evaluation performance (B = 4.01, SE = 1.06, β = .39, p < .001). However, there was also a significant indirect effect of PUB on controversy-evaluation performance through EUB (B = 0.58, SE = 0.36, β = .06, 95% bootstrapped CI [0.006, 0.149]), indicating a partial mediation.
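The logic of the bootstrap test can be illustrated with a minimal sketch (the data below are simulated under assumed path coefficients, not the study data; the implementation is a simplified stand-in for the procedure of Hayes 2013, estimating the indirect effect as the product of the a-path and b-path OLS coefficients):

```python
import numpy as np

def indirect_effect(x, m, y):
    # a-path: slope of the mediator m regressed on the predictor x
    a = np.polyfit(x, m, 1)[0]
    # b-path: slope of m in the regression of y on both x and m
    design = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design, y, rcond=None)[0][2]
    return a * b

rng = np.random.default_rng(42)
n = 79                                              # sample size as in the study
pub = rng.normal(size=n)                            # simulated predictor (PUB)
eub = 0.5 * pub + rng.normal(size=n)                # simulated mediator (EUB)
perf = 0.4 * pub + 0.5 * eub + rng.normal(size=n)   # simulated outcome

# Percentile bootstrap CI for the indirect effect (10,000 resamples)
boot = []
for _ in range(10_000):
    idx = rng.integers(0, n, size=n)
    boot.append(indirect_effect(pub[idx], eub[idx], perf[idx]))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(round(indirect_effect(pub, eub, perf), 2), (round(ci_low, 2), round(ci_high, 2)))
```

The indirect effect is deemed significant when the percentile confidence interval excludes zero, exactly the criterion applied to the CI of [0.006, 0.149] reported above.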

Fig. 1 Mediation of professed uncertainty beliefs on controversy-evaluation performance through enacted uncertainty beliefs

Discussion

Summary of empirical findings

The purpose of this study was to investigate the relation between PUB and EUB as well as the roles of these variables in the evaluation of scientific controversies. Results revealed a small but significant correlation between PUB and EUB (Hypothesis 1). This implies that, as expected, individuals’ general perceptions of the uncertainty of scientific knowledge are related to the ways in which they reflect on the uncertainty of knowledge in the context of evaluating scientific controversies. Note that, in line with previous research (Mason et al. 2010a), participants in the present study mainly expressed beliefs in uncertain knowledge, but not in certain knowledge, in their verbalizations. Furthermore, both PUB and EUB predicted performance in the controversy-evaluation test (Hypotheses 2 and 3). In Hypothesis 4, we predicted that EUB would be more closely linked to participants’ controversy-evaluation performance than PUB would be (cf. Barzilai and Zohar 2014). This hypothesis was not confirmed, and the relation between performance in the controversy-evaluation test and PUB tended to be even stronger than the respective relation with EUB. Finally, in line with Hypothesis 5, EUB were found to partially mediate the relation between PUB and students’ performance in the controversy-evaluation test. This mediation model reflects the enactment of underlying uncertainty beliefs in a given context, as has been assumed in the literature (e.g., Hofer 2001; Muis et al. 2016). However, it should be noted that the mediation effect was small and that the remaining direct effect of PUB on controversy-evaluation performance was substantially larger than the indirect effect through EUB. Moreover, due to the correlational data structure, these results should not be interpreted as causal effects.

Regarding the investigated control variables, neither reading comprehension ability nor gaze-replay duration was correlated with students’ performance in the controversy-evaluation test. Given that study participants were university students, they possibly possessed sufficient reading skills to process the textual information on a semantic level. Prior research drawing on more heterogeneous samples, however, indicates that less skilled readers who process textual information on a surface level do not show the same levels of epistemic understanding when it comes to reading controversial information (Cho et al. 2018). In addition, gaze-replay duration was positively correlated with EUB in our study. One possible explanation for this result is that participants holding strong EUB spent more time reflecting on the controversial positions during test-taking, resulting in longer gaze-replay durations. Importantly, however, gaze-replay duration did not predict students’ performance in the controversy-evaluation test over and above EUB.

Theoretical implications

The results of the present study provide novel theoretical insights into the relations between professed and enacted beliefs about the uncertainty of scientific knowledge and their role in individuals’ evaluation of scientific controversies. Whereas prior research has focused on either professed or enacted uncertainty beliefs (see also Sandoval et al. 2016), the results of the present study suggest that a direct juxtaposition of the two conceptualizations yields greater theoretical clarity as to how individuals evaluate conflicting information.

Specifically, on a descriptive level, the relation between PUB (i.e., the decontextualized measure of science-related uncertainty beliefs) and controversy-evaluation test performance was found to be stronger than that of EUB. It appears that epistemic beliefs are not an entirely contextualized phenomenon, but rather that PUB and EUB are different facets of the same construct. Whereas PUB as measured with a questionnaire might represent participants’ explicit and more general beliefs about the uncertainty of scientific knowledge, EUB as measured by cued retrospective verbal reports probably reflect more tacit and context-specific beliefs. A potential post hoc explanation for the stronger association between PUB and performance in the controversy-evaluation test is that in both of these measures, participants were asked to explicitly state their agreement with different written claims, as compared with the open-ended, oral format of the cued retrospective verbal reports. Moreover, correlations of this magnitude are typical when comparing offline measures such as questionnaires and online measures such as verbal reports (e.g., Cromley and Azevedo 2007). The different assessment modalities might also, at least in part, account for the smaller effect of the mediator EUB on the test score in comparison with the large effect of PUB, because PUB and the controversy-evaluation test drew on similar data sources. Another possible explanation for the relatively weak relation between EUB and individuals’ performance in the controversy-evaluation test concerns the task demands of cued retrospective verbal reports. This method may have been uncomfortable for some participants or may have exceeded their cognitive capacities (Chinn et al. 2011; Schraw 2000). This conclusion needs to remain speculative, however, and more research is needed to clarify how the cognitive demands of a task influence the quality of verbal protocols (Jarodzka and Boshuizen 2017).

Furthermore, PUB and EUB correlated only moderately. The theory of integrated domains in epistemology (TIDE; Muis et al. 2006) offers a potential explanation for this result. According to the TIDE, epistemic beliefs operate on different levels, from general to specific, with reciprocal relations between these levels (see also Merk et al. 2018). In the present study, PUB were measured in the domain of science, while the EUB measure reflected participants’ science-related beliefs at the topic-specific level (across five different scientific issues). Furthermore, whereas PUB are assumed to be relatively stable, EUB also depend in part on the context of the respective topic in which they are enacted, which is why a one-to-one correspondence between PUB and EUB is unlikely (cf. Muis et al. 2006). Future research could examine PUB at the topic-specific level as well (cf. Mason et al. 2010a). In the context of the present research, however, this would mean assessing PUB separately for the five different topics.

Whereas prior research on epistemic beliefs has primarily relied either on analyses of verbal data (often using small samples) or on quantitative assessments of questionnaires, a strength of the present study is the integration of the two approaches. Hence, the present study provides an example of how different conceptualizations of an epistemic belief dimension can translate into respective measurement approaches, aligning the employed measurements with the constructs in question (cf. Barzilai and Zohar 2014; Mason et al. 2010a; Sandoval et al. 2016). Given that both PUB and EUB explained variance in controversy-evaluation performance, we argue that EUB should not be conceptualized as entirely context-dependent, nor are PUB likely to fully determine how individuals think about—in this case—scientific controversies. Rather than striving for a true or direct measurement of uncertainty beliefs, both explicit and tacit measures seem necessary for understanding how individuals evaluate conflicting scientific information (Limón 2006; Sandoval and Millwood 2007). Whereas PUB might serve as an underlying mindset that affects, for instance, which tasks individuals select, EUB have the added value of explaining the epistemic cognitive processes that occur when individuals are engaged in such tasks (Hofer 2004; Muis 2007; Pieschl et al. 2014).

Practical implications

In the following, some key practical implications of the present results for science instruction are outlined. The finding that both PUB and EUB are important for individuals in dealing with scientific controversies suggests that both of these facets should be an integral part of science curricula. In many science classrooms, science is taught as a body of knowledge rather than having students engage in scientific inquiry and familiarizing them with the concept of uncertainty in the respective knowledge domain (Kirch 2012). Making students aware of the epistemic underpinnings of science and also having them engage in epistemic practices will likely advance both their professed and enacted beliefs about the uncertainty of scientific knowledge inside as well as outside an academic setting, for example, when they are searching the Internet for science-related information (Strømsø and Kammerer 2016). Borrowing from Veenman et al.’s (2006) principles of metacognitive instruction, educators might be advised to (a) connect the content matter to instruction about the uncertainty of knowledge (e.g., introduce multiple, conflicting viewpoints on a biological theory), (b) explain to students the usefulness of enacting their uncertainty beliefs for solving the task (e.g., the solution might lie in an integration of the different viewpoints), and (c) have students apply these skills repeatedly in order to internalize the critical evaluation of the uncertainty of knowledge (e.g., confront them with opposing or changing viewpoints in different topics or domains). Zohar and Barzilai (2013) concluded that this kind of metacognitive instruction, in which students’ ways of thinking about knowledge and knowing are made salient, can best advance the epistemic understanding of science.
Moreover, having access to one’s beliefs about the uncertainty of scientific knowledge and knowing when and how to apply them will likely help individuals in our modern knowledge-based society to draw more valid conclusions from competing knowledge claims pertaining to science-related topics of personal relevance. This will also allow them to make more informed decisions (Feinstein 2011; Roth and Lee 2004; Yang and Tsai 2010).

Limitations and future directions

The present study is one of the first attempts to provide a joint empirical, quantitative examination of professed and enacted epistemic beliefs about the uncertainty of scientific knowledge and their mediational relationship in the context of evaluating scientific controversies. Bearing this in mind, the study is not without its limitations, but it also points toward promising possibilities for future research.

One limitation of the present study is that due to the correlational data structure, no firm conclusions can be drawn about causal relationships between PUB, EUB, and the evaluation of scientific controversies. Whereas the present study tested the prediction of the enactment of uncertainty beliefs when evaluating scientific controversies, there is also evidence that, conversely, being confronted with contradictory information can have an impact on individuals’ professed epistemic beliefs (Barzilai and Zohar 2016; Flemming et al. 2017; Kienhues et al. 2016). Similarly, other potential confounding variables that were not accounted for in this study (e.g., general cognitive ability) might, in part, provide alternative explanations for the present results. Future studies should clarify these questions by using experimental designs with repeated measurements of PUB, EUB, and potential moderators.

Relatedly, because participants were confronted with controversial positions in the controversy-evaluation test, the study setting itself might have prompted them to reflect on the uncertainty of scientific knowledge above and beyond their own epistemic dispositions. For example, Ferguson et al. (2013) found that readers who were confronted with contradictory positions tended to develop stronger beliefs in uncertain knowledge, whereas those reading consistent positions did not. However, the abovementioned framework of epistemic beliefs and multiple document comprehension by Bråten et al. (2011) suggests that readers holding beliefs in certain knowledge will try to find a single correct answer despite being confronted with multiple conflicting knowledge claims.

Moreover, beliefs in uncertain knowledge are not adaptive in every instance. For example, a study by Lee et al. (2016) found that students who believed in uncertain knowledge used deep strategies (such as making connections to other school subjects) less frequently when learning biology. Indeed, it does not seem beneficial to question the certainty of scientific knowledge for phenomena on which there is broad consensus (e.g., “The earth is round.”; see also Sinatra et al. 2014). Accordingly, individuals should not equate the epistemic belief that scientific knowledge is tentative and evolving with a form of naive multiplism, in which all scientific claims are regarded as equally valid or invalid (cf. Rosman et al. 2017). Instead, they need to develop the competence to identify sound scientific evidence and valid scientific arguments (e.g., Bromme et al. 2013). Presumably, as the ambiguity and complexity of scientific issues increase, so does the relevance of uncertainty beliefs in explaining the different opposing viewpoints. It is therefore plausible to assume that uncertainty beliefs are beneficial for the evaluation of scientific controversies, but caution is warranted against overgeneralizing their adaptiveness to other contexts or tasks.

Finally, given the complexity of the research question, the present study focused only on one specific epistemic belief dimension and in a particular context using a relatively homogeneous sample of university students. This narrow focus comes with certain restrictions in terms of generalizability. Future research might investigate the adaptiveness of uncertainty beliefs in different contexts and with different age groups or educational backgrounds (Greene and Yu 2014). For example, individuals might apply their uncertainty beliefs differently in conditions that are less standardized than the controversy-evaluation task, such as a free web search (Greene et al. 2018a; Greene et al. 2014; Kammerer et al. 2013; Mason et al. 2010a; Mason et al. 2011). Furthermore, participants without a university background might differ in their uncertainty beliefs or might apply them differently (Kammerer et al. 2015). Moreover, future studies could aim to identify other individual or situational factors that contribute to the enactment of uncertainty beliefs when individuals evaluate conflicting scientific information, such as individuals’ cognitive engagement (Ravindran et al. 2005) or the nature of the task (e.g., summary tasks versus argument tasks, see Gil et al. 2010). In addition, future research should investigate whether the findings of the present study can be replicated for other epistemic belief dimensions, bearing in mind that different dimensions of epistemic beliefs might be adaptive for different kinds of tasks (Sandoval et al. 2016). In summary, instead of discarding the seemingly outdated construct of (professed) epistemic beliefs, it might in fact be more promising to clearly state the conceptual overlap and differences between individuals’ professed and enacted epistemic beliefs (Alexander 2016; Hofer 2016).