May a witness challenge the conviction? (Some) Confirmation bias in legal experts

Criminal investigations and trials are always guided by assumptions, such as that a suspect or defendant has committed a certain crime. Research on confirmation bias suggests that such prior assumptions bias subsequent information processing, fostering a confirmation of those assumptions. Biased information processing, in turn, would pose a severe threat to legal decision-making. Since previous evidence regarding professional decision makers is sparse and inconsistent, the present paper investigated whether legal experts showed evidence of confirmation bias. Specifically, we provided case materials pointing to a certain suspect and investigated evaluations of subsequent eyewitness evidence as a function of whether it was consistent or inconsistent with the initial suspicion. Although the witnessing conditions were identical, they were rated as significantly poorer for inconsistent (vs. consistent) statements, thus indicating confirmation bias. The effects were rather small, but this finding did not hinge on professional training, as another study with (law) students suggested. We argue that even small effects may threaten fair judgments and that our findings likely underestimate real-world effects.

When in March 2009, Rudolf Rupp's vehicle and body were recovered from the Danube, investigators were puzzled. After all, in 2005 Rupp's wife and the ex-boyfriend of one of his daughters had been sentenced to 8.5 years imprisonment for not only killing Rudolf Rupp but also dismembering his body and feeding the body parts to their animals. Even though the body's discovery was inconsistent with the findings made by the court in its verdict, a retrial was rejected at first since the fact that "the farmer may have been killed in a way other than described in the court ruling does not alter the other findings of the court decision, namely that the act was planned, that the farmer came home that night, that he was expected by the convicted and killed in a joint plan of action" (Spiegel TV, March 19, 2011). This was not the first instance of an astonishing tolerance to inconsistencies in this case, however, as not a single trace of blood had been found at the alleged crime scene.
Much research indicates that people show a strong tendency to reject, discredit or downplay belief-inconsistent information. Such a confirmation bias would pose a severe threat to a fair trial, however, if it was present in legal decision-making as well.

Confirmation bias in legal decision making
Belief-consistent information processing is ubiquitous (Nickerson 1998): Whenever people hold prior beliefs, they "seek, interpret, and create new evidence in ways that verify their pre-existing beliefs" (Kassin et al. 2013, p. 44). Individuals prefer belief-consistent over belief-inconsistent information (Hart et al. 2009), they erroneously perceive new information to confirm their beliefs (Lord and Taylor 2009), and they tend to discredit belief-inconsistent information (Ditto and Lopez 1992). All of this may lead to a persistent belief despite contrary evidence (Davies 1997), which is even more pronounced after people have already made far-reaching decisions (Festinger et al. 2011/1955). Importantly, confirmation bias is not limited to cases in which people want their beliefs to be true (Snyder and Swann 1978; Snyder and Uranowitz 1978). Thus, it is not surprising that confirmation bias is found even in contexts where people are deliberately motivated to be unbiased (Lord et al. 1985), such as in scientific research (Greenwald et al. 1986) as well as in forensic investigations: Even if people are personally indifferent about the quality of a statement and motivated to provide an accurate evaluation, they are still influenced by prior information about the credibility of that person (Bogaard et al. 2014). Moreover, experts may be influenced by prior information: During the investigation, crime scene investigators may interpret the same scene differently depending on prior information (Dror et al. 2006; Elaad et al. 1994; van den Eeden et al. 2016). Additionally, forensic investigations sooner or later focus on certain leads coming along with specific hypotheses (e.g., a certain person being suspected, Fahsing and Ask 2013; Singelnstein 2016), and thus, confirmation bias is almost invited here.
In its worst version, it may then lead to a self-fulfilling prophecy as guilt presumption fosters an interrogation style that makes the interviewed person more nervous and defensive (Lidén et al. 2018c), which is then interpreted as a sign of guilt (Kassin et al. 2003), even if innocent people react the same way (Hill et al. 2010). Hence, guilt-presumptive interrogation may produce false confessions (Costanzo and Costanzo 2014; May et al. 2022; Volbert 2013). Still, when it comes to trial, the defense may take all legal measures to challenge the perspective of the prosecution. That is, trials are an institutionalized attempt to counter (confirmation) bias (Singelnstein 2016), at least in theory.
Although there is much evidence for confirmation bias in the forensic context (Kassin et al. 2013), only a few studies on professional legal decision-making exist. Their comparability is already limited because they were conducted in different legal systems (cf. Lind et al. 1973). Moreover, some studies have methodological shortcomings: Research examining case studies or reviewing court records (Dobbie et al. 2018; Findley and Scott 2006; Lidén et al. 2018a, b; Schünemann 1988) always faces the challenge of providing conclusive evidence for confirmation bias. Pretrial detentions, for instance, which increase the probability of guilty pleas and the probability of a conviction (Dobbie et al. 2018), are likely predominantly issued in cases that are characterized by a higher a priori probability of conviction. Moreover, pretrial detentions may produce the very result by exerting pressure on the accused to plead guilty (Euvrard and Leclerc 2017). Also, the extent to which the court deviates from the prosecution's plea does not provide ultimate evidence for a bias as long as deviations from the defense cannot be determined (Schünemann 1988). Finally, some of the older studies (Bandilla and Hassemer 1989; Schünemann 1995, Study 1) comprised sample sizes that are regarded as insufficient today (Nelson et al. 2018).
Of the remaining research, there are two studies demonstrating a confirmation bias in the search for information and the postdecisional evaluation of information (Lidén et al. 2018d; Schweizer 2005; Schmittat and Englich 2016). The evidence is mixed, however, concerning the evaluation of novel information: Two studies by Rassin (2017) found that the same evidence was evaluated as stronger when it followed another piece of evidence that pointed in the same direction, whereas the evaluation of a novel piece of ambiguous evidence was not affected by prior detention decisions, which were supposed to elicit a stronger guilt presumption (Lidén et al. 2018d). Being confronted with evidence that is inconsistent with a prior hypothesis is extremely common in court, however. On the one hand, there is a clear hypothesis guiding each trial: charges are pressed and a trial is conducted only if a conviction is more likely than an acquittal (Schünemann 1995). On the other hand, prosecution and defense argue from different perspectives, which is envisaged to prevent confirmation bias. But does it work that way? The present research addressed this question for two reasons. First, the empirical basis to date is inconsistent when it comes to professional decision-makers. Second, we cannot generalize from studies with laypeople as legal training may foster the ability to control for unwarranted influences (Kahan et al. 2011). Therefore, we conducted a study with judges and prosecutors testing for confirmation bias in the evaluation of novel information that is consistent vs. inconsistent with their prior assumption. Additionally, we examined whether confirmation bias varies as a function of the strength of the prior assumption (operationalized by the amount of evidence for it, Study 2) as this is likewise an unsolved question so far: On the one hand, research found uncertainty to increase belief-consistent information processing (Sawicki et al. 2011). On the other hand, stronger prior evidence led to an even stronger search for incriminating evidence (Rassin et al. 2010).

Study 1

Participants and design
A power analysis conducted with G*Power (Faul et al. 2009) suggested a sample size of N = 128 for detecting medium-sized effects (d = 0.5, independent t-tests, α = 0.05, 1-β = 0.80). Altogether, 143 German judges (n = 102) and prosecutors (n = 38; n = 3 missing data) completed the study (89 women, 51 men, 3 participants did not indicate their biological sex). The mean age was 33.33 years (SD = 6.87) and they had an average of 3.42 years of professional trial-related experience (SD = 6.26; range: 0-30). Assignment to one of two experimental conditions (consistent, inconsistent witness testimony) was random.
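The target of N = 128 can be approximately reproduced without G*Power. The following is a minimal sketch under stated assumptions, not the authors' procedure: it uses the normal approximation for a two-sided independent-samples t-test plus a common small-sample correction term, whereas G*Power works with the noncentral t distribution. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided independent-samples t-test.

    Assumption: normal approximation with the correction term z_alpha^2 / 4;
    this is not G*Power's exact noncentral-t computation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2 + z_alpha ** 2 / 4
    return math.ceil(n)


# Medium effect (d = 0.5), alpha = 0.05, power = 0.80, two equal groups
total_n = 2 * n_per_group(0.5)
print(total_n)  # 128
```

With these inputs the approximation yields 64 participants per condition, which agrees with the total of 128 reported above.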

Materials and procedure
Data were collected in the context of professional training for legal experts. Participants were told that we were interested in their evaluation of a case. We used the translated materials by Ask and Granhag (2007a; see https://osf.io/z8fcq/ for materials, data, and supplemental information). After informed consent, each participant received a booklet with a short case vignette designed to create an initial suspicion against the father of the severely injured victim and the summarized reports of two witnesses. The first witness report was consistent with the initial suspicion for all participants. The second witness was either consistent with the initial suspicion (recognizing the father's voice) or inconsistent (stating that the voice had sounded much too young for the father).
We assessed the probability that the father was guilty of the crime (1 = very unlikely, 9 = very likely) and the strength of evidence against the father (1 = very weak, 9 = very strong) after reading the case description and after reading the two witness reports. After each testimony, participants were asked to evaluate the reliability of the witness statement (1 = very low, 9 = very high), the perceived trustworthiness of the witness (1 = very low, 9 = very high), the perceived witnessing conditions (1 = very poor, 9 = very good), the evidential value of the testimony (1 = very low, 9 = very high) and the legal relevance they would assign to the testimony relative to the other evidence (1 = very low, 9 = very high). Finally, variables that are not reported here (see OSF) and demographic data were assessed. Participants were then fully debriefed.

Results
As a manipulation check, we tested whether the consistency of evidence affected the overall evaluation of the case. Guilt ratings increased after receiving two pieces of evidence that were consistent with the initial suspicion, while they remained the same when receiving mixed evidence (one consistent and one inconsistent piece of evidence), F(1,140) = 64.07, p < 0.001, ηp² = 0.32 (see OSF). Consequently, our experimental manipulation was successful.

Confirmation bias
Contrary to guilt ratings, evaluations of witness reliability, trustworthiness, and witnessing conditions should not be affected by the report's consistency with the initial suspicion. An impact of evidence consistency would thus represent a confirmation bias. To test this, we conducted a multivariate analysis of variance (MANOVA) with witness reliability, trustworthiness, and witnessing conditions regarding the second witness report as dependent variables, and consistency of this statement with the initial suspicion (consistent, inconsistent) as between-subjects factor. There was no global effect of the experimental manipulation, F(3,139) = 1.65, p = 0.181. Neither the reliability of the witness report, F(1,141) = 0.82, p = 0.366, nor the perceived trustworthiness of the witness, F(1,141) = 0.09, p = 0.761, was affected by the statement's (in)consistency with the initial suspicion. Witnessing conditions, however, differed significantly as a function of the report's consistency, F(1,141) = 4.53, p = 0.035, ηp² = 0.03: They were judged as significantly poorer when the statement challenged the initial suspicion (Fig. 1). Thus, we found evidence for confirmation bias in experts' evaluations for one out of three variables and the size of this effect was rather small.³
Fig. 1 Reliability of the statement, trustworthiness of the witness, and witnessing conditions as a function of the statement's consistency with the initial suspicion (standard errors)
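For readers who want to check the effect size for witnessing conditions, partial eta squared for a univariate F-test can be recovered from the F value and its degrees of freedom via ηp² = F · df_effect / (F · df_effect + df_error). A minimal sketch, with a hypothetical helper name:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from a univariate F statistic."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Witnessing conditions in Study 1: F(1,141) = 4.53
eta = partial_eta_squared(4.53, 1, 141)
print(round(eta, 2))  # 0.03
```

The value agrees with the ηp² = 0.03 reported above and illustrates why the effect is described as rather small.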

Discussion
We found evidence for a confirmation bias in legal experts in one out of three dependent variables. Only evaluations of the circumstances were susceptible to confirmation bias: The situation that led to the statement was evaluated as less favorable when evidence contradicted the initial assumption. In principle, nothing more is needed to discredit a witness account: Questioning the witnessing conditions could effectively call the statement into doubt. In fact, witnessing conditions were positively correlated with statement reliability, r = 0.637, p < 0.001. But evaluations of statement reliability did not differ significantly between experimental conditions. Thus, confirmation bias in our study was small and limited to the evaluation of the situation (which was descriptively most ambivalent, and, thus, might have provided more room for bias to occur), leaving the general and the specific competence of the witness (trustworthiness and statement reliability) unaffected. Although rather weak evidence for confirmation bias had been obtained in some studies as well (Ask and Granhag 2007b; Lidén et al. 2018d), other research documented larger effects (Schmittat and Englich 2016; Schünemann 1988; Schweizer 2005). Therefore, we tested whether the small effects were due to our professional sample by inviting two samples with less legal training for Study 2: law students and students from other disciplines. Additionally, we explored whether the magnitude of confirmation bias varied as a function of the strength of prior evidence.
³ We reran our main analysis after excluding those 14 participants who had already heard of confirmation bias. The pattern of results was identical (see OSF).

Study 2

Participants
To enable the detection of small to medium-sized effects (f = 0.15), we aimed at a minimum sample size of N = 351 (as suggested by G*Power for an ANOVA with between-subjects factors, α = 0.05, 1-β = 0.80, 8 groups, Faul et al. 2009). Altogether, N = 519 participants (391 women, 120 men, 1 diverse person and 7 who did not indicate their biological sex) took part and agreed to have their data analyzed. Mean age was 23.13 (SD = 7.23) and participants of both samples (n = 227 law students and n = 292 students without any background in law, henceforth "laypeople") were randomly assigned to one of four experimental conditions.
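The order of magnitude of this target can be checked with a rough calculation. The sketch below is an approximation under explicit assumptions, not a reproduction of G*Power: it assumes an effect with one numerator degree of freedom (as for any main effect or interaction in a 2 × 2 × 2 design) and replaces the noncentral F distribution with a normal approximation, so that the required noncentrality λ = f² · N is about (z for 1-α/2 + z for 1-β)². The function name `approx_total_n` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_total_n(f, alpha=0.05, power=0.80):
    """Approximate total N for a 1-df ANOVA effect of size Cohen's f.

    Assumption: normal approximation to the noncentral F distribution,
    so the result slightly underestimates an exact calculation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    required_lambda = (z_alpha + z_beta) ** 2  # noncentrality needed
    return math.ceil(required_lambda / f ** 2)


print(approx_total_n(0.15))  # 349
```

The approximation lands slightly below the N = 351 stated above, which is expected because the exact calculation accounts for the finite error degrees of freedom.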

Materials, design, and procedure
Case materials were identical to Study 1 with the following exception. We varied the case composition by additionally realizing a case in which participants did not receive the first witness report (which did not vary between subjects) but only read the case vignette (introducing the initial suspicion) and the (originally second) witness report, which varied in its content (consistent vs. inconsistent with initial suspicion). Therefore, the study comprised a 2 (sample: law students vs. laypeople) × 2 (case composition: two vs. one witness reports) × 2 (evidence: consistent vs. inconsistent) between-subjects design. The procedure was otherwise identical to Study 1, except that the study was administered online.
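The structure of this design can be made explicit in a few lines. The sketch below enumerates the eight cells and shows one possible way to randomize participants to the four manipulated conditions within each sample. The text does not describe how randomization was implemented, so the simple random draw used here is an assumption, and all names are hypothetical.

```python
import random
from itertools import product

SAMPLES = ["law students", "laypeople"]                     # measured, not assigned
CASE_COMPOSITIONS = ["two witness reports", "one witness report"]
EVIDENCE = ["consistent", "inconsistent"]

# All cells of the 2 x 2 x 2 between-subjects design
cells = list(product(SAMPLES, CASE_COMPOSITIONS, EVIDENCE))

# Only case composition and evidence are manipulated, giving four conditions
conditions = list(product(CASE_COMPOSITIONS, EVIDENCE))


def assign_condition(rng):
    """Draw one of the four manipulated conditions at random (assumed scheme)."""
    return rng.choice(conditions)


rng = random.Random(0)  # seeded only to make the illustration repeatable
condition = assign_condition(rng)
print(len(cells), len(conditions))  # 8 4
```

The eight cells correspond to the 8 groups entered into the power analysis, while each participant is randomized among only four conditions because sample membership is a given.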

Results
We will limit our report to confirmation bias (i.e., effects involving the factor "evidence", see OSF for further results). A multivariate analysis of variance (MANOVA) across all three variables measuring confirmation bias as dependent variables (witness reliability, witness trustworthiness, witnessing conditions) and with case composition (one, two witness reports) and evidence (consistent, inconsistent) as well as sample (law students vs. laypeople) as between-subjects factors yielded a main effect of evidence, F(3,509) = 7.40, p < 0.001, ηp² = 0.04, but no interactions, Fs < 1.18, ps > 0.316. The main effect of evidence was significant for two out of the three dependent variables: reliability of the witness report, F(1,511) = 5.00, p = 0.026, ηp² = 0.01, and witnessing conditions, F(1,511) = 21.47, p < 0.001, ηp² = 0.04, whereas witness trustworthiness was unaffected, F(1,511) = 1.50, p = 0.221. Participants rated the witnessing conditions as significantly poorer when the witness report was inconsistent, M = 4.87, SD = 1.89, versus consistent, M = 5.62, SD = 1.83, with prior evidence. Also, the statement was evaluated as less reliable when inconsistent, M = 6.11, SD = 1.46, compared to consistent, M = 6.38, SD = 1.52, with prior evidence.
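To put these mean differences on a standardized scale, the reported means and standard deviations can be converted into Cohen's d. This is an illustrative calculation rather than a statistic reported by the authors, and it assumes equally sized groups in the consistent and inconsistent conditions because cell sizes are not given here. The helper name `cohens_d` is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the pooled SD of two groups (assumes equal group sizes)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Consistent vs. inconsistent witness report, values as reported for Study 2
d_conditions = cohens_d(5.62, 1.83, 4.87, 1.89)   # witnessing conditions
d_reliability = cohens_d(6.38, 1.52, 6.11, 1.46)  # statement reliability
print(round(d_conditions, 2), round(d_reliability, 2))  # 0.4 0.18
```

Under this assumption, the difference in witnessing conditions is roughly twice as large in standardized terms as the difference in statement reliability.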

Discussion
Study 2 revealed evidence for a confirmation bias on two (vs. one) out of three variables but the effects were likewise rather small. Consequently, professional experience does not seem to make a substantial difference (see also Guilbault et al. 2004). This might be further supported by the fact that there was also no difference between students with and without some initial training in law (on average 2.44 semesters). Additionally, the amount of incriminating evidence prior to the critical (consistent vs. inconsistent) witness report did not affect biased information processing in our study.

General discussion
In sum, confirmation bias was small and evident only on some variables. Participants were more critical of the witnessing conditions of a statement that was inconsistent (vs. consistent) with prior evidence. Discrediting the circumstances of witnessing was, in turn, related to the perceived reliability of the statement (Study 1), which translated into a (smaller) confirmation bias on this variable that was detectable when statistical power was high (Study 2). We had, however, used a conservative operationalization of confirmation bias. That is, we focused on variables for which (in)consistency should not matter: the description of the witness and the circumstances had been identical for all statements. The obtained effects of statement (in)consistency thus clearly argue for confirmation bias. But what about the evidential value of the statement? We obtained significant effects of statement (in)consistency here as well. Could this speak for confirmation bias, too?
The answer to this question varies with the interpretation of those items. Evidential value, for one, could be interpreted narrowly in terms of the reliability of the witness's testimony: Does the testimony correspond to the witness's perception? In other words, does it accurately reflect her knowledge? Since the circumstances for the witness statements were identical, the significantly lower ratings for the evidential value of the inconsistent (vs. consistent) testimony could indicate confirmation bias. The evidential value was significantly positively correlated with the reliability of the testimony, r = 0.637, p < 0.001. However, it is also conceivable that "evidential value" was understood more broadly, namely as the importance of the testimony for the judicial conviction of the suspect's guilt. And in this respect, it certainly matters whether the content of the testimony incriminates or exonerates the specific suspect. If a second witness statement incriminates the suspect along with the first one, it should enhance the conviction that the suspect committed the crime. In this understanding, it would be rational (and thus not indicative of confirmation bias) to assign a higher evidential value to a (consistent) incriminating statement than to an (inconsistent) exonerating statement.
Considering the different possible interpretations, we are cautious with our conclusions. Likewise, it is noteworthy that confirmation bias might also be hidden in the evaluation of the second witness report: Without an independent rating of the evidential value of the second witness report, it cannot be precluded that prior evidence led to an undue discrediting of the inconsistent report (and its effect on guilt ratings). But this discussion already points to the greater complexity of real-world settings. And although we aimed at an ecologically valid examination of confirmation bias, our experiment still lacks crucial features of real-world settings (Konečni and Ebbesen 1992): Participants in our studies had no stakes in the outcome or in the prior assumptions, and several studies suggest that this enhances perseverance effects (Festinger et al. 2011/1955; Jonas et al. 2001; Schmittat and Englich 2016). More importantly, various actors and processes interact (Singelnstein 2016): Investigators may commit themselves to a hypothesis too early (Fahsing and Ask 2013) and be biased by it (O'Brien 2009), when interviewing suspects (Hill et al. 2010; Kassin et al. 2003; Lidén et al. 2018c) or evaluating evidence (Ask and Granhag 2007a; Appleby and Kassin 2016; Simon et al. 2004). Consulted experts may be biased by information provided to them (van den Eeden et al. 2016) or by the side that retained them (Murrie et al. 2013). The investigative proceedings, however, effectively shape the file upon which it is determined whether charges are pressed and a trial is opened (Schünemann 1988). Consequently, the prosecution has a pronounced influence on the trial, which may undermine the idea of institutionalized dissonance that is supposed to result from the work of opposing parties (Singelnstein 2016).
Furthermore, the decision to press charges as well as to conduct a trial may bias subsequent information processing as it further increases commitment to prior assumptions (suspicion against the defendant, Schünemann 1995) but possibly also because all participants in the legal trial act according to their role (Babcock et al. 1995). In other words, the real world does not only entail more opportunities for confirmation bias to step in, but those opportunities are not independent of one another. Rather, earlier stages affect later stages by setting the course and laying the foundations for the file, the trial, and the verdict (Singelnstein 2016). Consequently, bias likely accumulates during the process. Against this background, it becomes clear how minuscule the fraction of reality is that we examined here. Therefore, our results are anything but an all-clear signal. This is not only because we did find confirmation bias but also because our results likely provide a rather conservative estimate, with real-world effects likely being larger. Therefore, our findings should not be taken lightly.
Funding Open Access funding enabled and organized by Projekt DEAL.

Conflict of interest A. Oeberst and I. Goeckenjan declare that they have no competing interests.