How many times has any of us said something that we thought was perfectly clear, only to learn that our intended meaning had failed? During his presidential campaign, in Saginaw, Michigan, George W. Bush (2000) deviated from his prepared speech on fish conservation by stating, “I know the human being and fish can coexist peacefully.” While it might have been clear in Bush’s mind that government should practice responsible ecology, the media interpreted this statement differently: What behaviors can fish exhibit to live more peacefully with humans? Bush’s statement demonstrates the persistence of hindsight bias in daily oral communication. Hindsight bias occurs when outcome knowledge compromises one’s ability to appreciate one’s own prior or another person’s naïve knowledge (Fischhoff, 1975). Likewise, the clarity of one’s thoughts during oral communication may lead the speaker to overestimate the clarity of the message for the listener.

Hindsight bias occurs in many real-life situations, including investments, legal decisions, emergencies, and clinical judgments (Hawkins & Hastie, 1990). Most hindsight bias studies involve written materials in which people try to ignore event outcomes and answers to almanac questions when recalling their own ability to foresee those outcomes and answers (Blank, Musch, & Pohl, 2007). Hindsight bias has also been observed in the gustatory (Pohl, Schwarz, Sczesny, & Stahlberg, 2003) and visual (Bernstein, Erdfelder, Meltzoff, Peria, & Loftus, 2011; Harley, Carlsen, & Loftus, 2004) domains.

Despite the vast hindsight bias literature, little work has examined the auditory hindsight bias. In a well-cited unpublished study, Newton (1990, cited in Griffin & Ross, 1991) assigned participants to be tappers or listeners and presented them with 25 familiar songs. The tappers chose a song and tapped its rhythm, and the listeners guessed the song from the tapped rhythm. Although the listeners rarely identified the songs, the tappers greatly overestimated how many songs listeners would identify.

Overconfidence in communication can affect both speakers (Keysar & Henly, 2002) and listeners (Vokey & Read, 1985). Speakers often overestimate the clarity of their message, and listeners overestimate their own understanding of unclear messages. In each case, speakers fail to clarify their message, and listeners fail to seek clarification (Chang, Arora, Lev-Ari, D’Arcy, & Keysar, 2010). We maintain that when speakers and listeners incorrectly believe that the listener understands the intended message (see, e.g., Keysar, 1994), this miscommunication results from auditory hindsight bias. Avoiding miscommunication requires that speakers and listeners account for their respective differences (Todd, Hanko, Galinsky, & Mussweiler, 2011).

The epitome of hindsight bias in listeners is hearing what one expects to hear. For example, people who hear the song “Another One Bites the Dust” played backward while listening for the words “It’s fun to smoke marijuana” hear just that (Epley, Keysar, Van Boven, & Gilovich, 2004). There are also practical consequences to such errors: Court transcripts of degraded audio recordings make those recordings sound clearer than they are (Lange, Thomas, Dana, & Dawes, 2011); knowing what to listen for makes degraded audio stimuli sound clear.

With the evolution of communication media, understanding the mechanisms of auditory hindsight bias is critical to understanding and preventing miscommunication. Face-to-face communication has evolved over generations, but the 20th and 21st centuries have ushered in new forms of communication without visual cues to assist in deciphering messages (e.g., telephone, e-mail, and texting). This has created misunderstandings due to communicators’ overestimation of their receivers’ ability to understand the messages (Kruger, Epley, Parker, & Ng, 2005). The present work systematically explores the occurrence of auditory hindsight bias for words and sentences.

Experiment 1: Words

Method

Participants

A group of 78 undergraduates (55 female, 23 male; mean age 22.7 years) participated in exchange for credit.

Materials

We recorded 58 common-object words (e.g., “barn”), and degraded each using a low-pass filter in MATLAB. Low-pass filters reduce the amplitudes of the sound frequencies above the filter’s cutoff frequency. Lowering the filter’s cutoff frequency dampens the higher-frequency sounds associated with consonants in spoken language. At a frequency cutoff of 2000 Hz, people with normal hearing begin confusing consonant sounds (Sher & Owens, 1974). For this reason, researchers use low-pass filters to simulate high-tone hearing loss in people of normal hearing (Scott, Green, & Stuart, 2001). Lowering the cutoff frequency for the low-pass filter makes it harder to identify the degraded word. We created 30 levels of degradation, containing cutoff frequencies between 500 and 10000 Hz, and then investigated these stimuli in a pilot study to find the frequencies at which only 10 %–20 % of participants could identify the degraded words after one hearing. We chose a low identification base rate because hindsight bias tends to increase as task difficulty increases (see Harley et al., 2004). The final set of 40 words contained a low-pass filter frequency between 930 and 1728 Hz. We recorded the final presentation of the stimuli onto audio CD.
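To make the degradation step concrete, the following minimal sketch shows how a low-pass filter attenuates content above its cutoff more than content below it. It is not the MATLAB procedure used in the study, whose filter design is not described here. It assumes a simple first-order (one-pole) filter, a 44100 Hz sample rate, pure test tones in place of recorded speech, and a 1000 Hz cutoff chosen only because it falls within the 930–1728 Hz range of the final stimuli. All function names are hypothetical.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate):
    """First-order (one-pole) low-pass filter: attenuates content above cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor; a lower cutoff gives a smaller alpha
    out = []
    prev = 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def sine(freq_hz, sample_rate, seconds):
    """Pure tone standing in for one frequency component of speech."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def rms(samples):
    """Root-mean-square amplitude."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


SAMPLE_RATE = 44100  # assumed; the source does not report a sample rate
CUTOFF_HZ = 1000     # illustrative value within the 930-1728 Hz range

low_tone = sine(500, SAMPLE_RATE, 0.2)    # below the cutoff (vowel-like energy)
high_tone = sine(4000, SAMPLE_RATE, 0.2)  # above the cutoff (consonant-like energy)

low_gain = rms(low_pass(low_tone, CUTOFF_HZ, SAMPLE_RATE)) / rms(low_tone)
high_gain = rms(low_pass(high_tone, CUTOFF_HZ, SAMPLE_RATE)) / rms(high_tone)

print(f"gain at 500 Hz:  {low_gain:.2f}")
print(f"gain at 4000 Hz: {high_gain:.2f}")
```

A first-order filter rolls off gently, so a study of this kind would more plausibly use a steeper design. The sketch only illustrates the direction of the effect: lowering `CUTOFF_HZ` removes more of the high-frequency energy that distinguishes consonants.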

Procedure

The experiment had one within-subjects variable (task: naïve identification or hindsight estimation). We randomly divided the 40 words into two blocks (Block 1 contained Words 1–20; Block 2, Words 21–40), and we fixed the word orders within each block. We counterbalanced block presentation order (Block 1 first, Block 2 first) and task order (naïve identification first, hindsight estimation first). In the naïve-identification task, participants tried to identify degraded words (see the Electronic Supplementary Material for the words and the instructions). For each trial, a 0.5 s warning tone preceded the degraded word, presented 1.5 s after tone onset. The interitem interval between tones was 12 s, leaving participants approximately 8.5 s to write down the word. In the hindsight estimation task, the 0.5 s warning tone preceded a clear word, followed by a degraded version of the word. The clear word is the functional equivalent of the outcome knowledge in more traditional hindsight bias studies. Unlike traditional hindsight bias studies, which involve judgments about one’s own ability to foresee outcomes while ignoring outcome knowledge, here we asked participants to ignore outcome knowledge while estimating what percentage of their naïve peers would correctly identify the degraded word when their peers had not heard the clear word first. Again, the interitem interval was 12 s, with the clear word presented at 1.5 s and the degraded word presented 5.5 s after tone onset. Participants had approximately 4.5 s to respond. Pilot work indicated that the two tasks required different amounts of time, hence the different time limits. We tested the participants in groups and presented the stimuli via a portable stereo.
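The counterbalancing scheme crosses two block presentation orders with two task orders, giving four between-subjects cells. The sketch below enumerates those cells and assigns participants to them in rotation. The rotation rule and the resulting cell sizes are assumptions for illustration; the source does not report how participants were assigned or how many fell in each cell.

```python
from collections import Counter
from itertools import product

BLOCK_ORDERS = ("Block 1 first", "Block 2 first")
TASK_ORDERS = ("naive identification first", "hindsight estimation first")


def counterbalance(n_participants):
    """Assign participants to the 2 x 2 counterbalancing cells in rotation."""
    cells = list(product(BLOCK_ORDERS, TASK_ORDERS))
    return [cells[i % len(cells)] for i in range(n_participants)]


assignments = counterbalance(78)  # 78 participants, as in Experiment 1
cell_counts = Counter(assignments)

for cell, count in cell_counts.items():
    print(cell, count)
```

With four cells and 78 participants, the error term of a between-subjects effect has 78 − 4 = 74 degrees of freedom, which matches the F(1, 74) tests reported for Experiment 1.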

Results and discussion

Table 1 lists the mean naïve-identification rates (i.e., the percentages of words correctly identified) and the mean hindsight estimations for all experiments. There was less variability in naïve identification than in hindsight estimates; therefore, we report analyses assuming either equal or unequal variance, accordingly. We report effect sizes as point-biserial correlations (r²) in order to circumvent the problems associated with heterogeneity of variance and unequal sample sizes.
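For readers who want to relate a reported t statistic to the squared point-biserial correlation, one common textbook conversion for an independent-samples t test is r² = t² / (t² + df). The source does not state which formula was used, so the sketch below is an assumption, and the function name is hypothetical. It reproduces the two equal-variance between-subjects values reported in Experiments 2 and 4, but it does not reproduce every effect size in the paper (e.g., those accompanying tests with fractional degrees of freedom), so the correlations may have been computed directly from the data.

```python
def r_squared_from_t(t, df):
    """Squared point-biserial correlation implied by an independent-samples t test."""
    return t * t / (t * t + df)


# Experiment 4: t(47) = 7.40, reported effect size .54
print(round(r_squared_from_t(7.40, 47), 2))   # 0.54

# Experiment 2 vs. Experiment 1 comparison: t(23) = -1.10, reported effect size .05
print(round(r_squared_from_t(-1.10, 23), 2))  # 0.05
```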

Table 1. Mean (and SEM) accuracy and sample size in a naïve-identification task, and percentage estimates in a hindsight estimation task in Experiments 1–4

In Experiment 1, we conducted an analysis of variance (ANOVA) with task (naïve identification, hindsight estimation) as the within-subjects variable, and block presentation order (Block 1 items first, Block 2 items first) and task order (hindsight estimation first, hindsight estimation second) as between-subjects variables. We found a main effect of task such that participants’ hindsight estimates (M = 54.39, SEM = 1.72) far exceeded their own naïve-identification rates (M = 19.03, SEM = 1.22), F(1, 74) = 333.27, p < .001, r² = .63. The only other significant effect was a Task × Task Order interaction, F(1, 74) = 16.76, p < .001. The participants’ naïve-identification rates improved if they first performed the hindsight estimation task (M = 24.02, SEM = 2.22, vs. M = 14.04, SEM = 2.00), t = 4.02, p < .001, r² = .18, and their hindsight estimates decreased if they first performed the naïve-identification task (M = 50.74, SEM = 2.30, vs. M = 58.03, SEM = 2.55), t = −2.08, p < .05, r² = .05. This interaction suggests that performance in both tasks improved with practice processing degraded words. To eliminate issues of task practice, in Experiments 2–4, we will report between-subjects analyses on the first block of trials.

Finally, a reviewer wondered whether the auditory hindsight bias was due to participants overestimating their naïve-identification performance. We asked a subgroup of the participants (n = 37), when they finished the experiment, to estimate their own accuracy retrospectively by indicating how many naïve-identification words they had guessed correctly. Even though the participants overestimated their own accuracy (M = 34.05, SEM = 2.69; correct naïve identification = 20.95, SEM = 2.21), t(36) = 4.60, p < .001, r² = .16, their retrospective estimates were still significantly lower than their hindsight estimates (M = 52.44, SEM = 2.52), t(36) = −5.97, p < .001, r² = .26. Thus, the auditory hindsight bias in Experiment 1 occurred despite participants overestimating their own naïve-identification performance.

Experiment 2: Words with warning

Previous studies of hindsight bias have failed to reduce or eliminate hindsight bias by informing participants about the bias (e.g., Pohl & Hell, 1996). To test whether auditory hindsight bias is also robust against warnings, in Experiment 2, we informed our participants about the nature of auditory hindsight bias and asked them to avoid making this error (Griffin, Dunning, & Ross, 1990; Harley et al., 2004).

Method

A group of 17 undergraduates (11 female, 6 male; mean age 20.2 years) participated for credit. Ten completed the naïve-identification task, and 7 completed the hindsight estimation task on the Block 1 words (Items 1–20) from Experiment 1. The procedure was identical to that of Experiment 1, except we added the following warning instructions for the hindsight estimation group:

We are investigating whether first knowing a word will affect your prediction of how many of your peers will be able to identify a degraded version of that word. Typically, knowing the word makes people think others will be able to identify degraded versions of that word, when others actually cannot identify the word. This is called “hindsight bias.” Please try to avoid this bias and be as accurate as possible when estimating how many of your peers will identify the words.

Results and discussion

Hindsight estimates (Table 1) again far exceeded the naïve-identification rates, t(6.50) = 5.31, p = .001, r² = .73. Indeed, there was no reduction in the hindsight bias as compared to the same set of items in Experiment 1, t(23) = −1.10, p > .10, r² = .05. Thus, auditory hindsight bias occurred even when the participants knew about the effect and tried to avoid it.
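The fractional degrees of freedom in t(6.50) are what an unequal-variance (Welch) t test produces when one group is much more variable than the other. The sketch below computes the Welch t statistic and the Welch–Satterthwaite degrees of freedom. The two samples are invented purely to show the calculation and are not data from the study; the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite df."""
    va = variance(a) / len(a)  # squared standard error of group a's mean
    vb = variance(b) / len(b)  # squared standard error of group b's mean
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented illustrative samples, NOT data from the study.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]

t_stat, df = welch_t(group_a, group_b)
print(f"t({df:.2f}) = {t_stat:.2f}")
```

When the group variances differ sharply, the degrees of freedom fall toward the size of the more variable group minus one. This is consistent with the reported value of 6.50 for groups of 10 and 7, given that hindsight estimates were more variable than naïve-identification rates.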

Experiment 3: Sentences

Although Experiments 1 and 2 demonstrated the robustness of auditory hindsight bias, they reveal little about how it influences natural discourse and communication. Here, we used full sentences instead of individual words to test the ecological validity of the effect.

Method

Participants

A group of 44 undergraduates (34 female, 10 male; mean age 21.4 years) participated for credit; 20 completed the naïve-identification task, and 24 completed the hindsight estimation task.

Materials

We recorded 60 sentences from 5 to 12 words long (M = 7.7, SD = 1.64) and then degraded and piloted each one, as in Experiment 1. The final stimuli consisted of 40 degraded sentences, of which we report on the first 20 (see the supplementary materials), which were degraded with a low-pass filter frequency of 1031–1405 Hz.

Procedure

The instructions and procedure were identical to those of Experiment 1, except as noted here. Sentences varied in length from 2 to 3.5 s. In the naïve-identification task, the interitem interval was 20 s, giving participants approximately 15 s to write down the sentence and prepare for the next item. In the hindsight estimation task, the interitem interval was 15 s, giving participants up to 7 s to estimate what percentage of their peers would correctly identify the degraded sentence when their peers had not heard the clear sentence first.

Results and discussion

Contrary to our expectations from the pilot study, only one participant identified a sentence 100 % correctly. Therefore, we adopted a liberal definition of a correct answer: We scored as correct any sentence in which a participant identified at least 75 % of the words in their correct order (e.g., “I practiced writing my ________” when the correct response was “I practiced writing my signature”). We scored as correct minor morphemic errors (e.g., “movie” vs. “movies”) and misspellings and homonyms (e.g., “buy” vs. “by”), and we ignored extra words (e.g., “in my hair” vs. “in my long hair”). Even under this liberal definition of a correct answer, the answers still demonstrate a basic understanding of the sentence and resemble how people might interpret individual sentences in a conversation. As Table 1 shows, hindsight estimates far exceeded the naïve-identification rates, t(27.44) = 15.26, p < .001, r² = .83, demonstrating a robust auditory hindsight bias for sentences.
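The 75 % rule can be sketched in code as follows. This is one possible operationalization, not the scoring procedure used in the study: it assumes that "words in their correct order" can be approximated by the longest common subsequence of words between response and target, it ignores extra words as a side effect of that choice, and it does not implement the leniency for morphemic errors, misspellings, or homonyms. The function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def is_correct(response, target, threshold=0.75):
    """True if at least `threshold` of the target words appear in order."""
    resp_words = response.lower().split()
    target_words = target.lower().split()
    if not target_words:
        return False
    return lcs_length(resp_words, target_words) / len(target_words) >= threshold


target = "I practiced writing my signature"
print(is_correct("I practiced writing my", target))            # True: 4 of 5 words
print(is_correct("I practiced", target))                       # False: 2 of 5 words
print(is_correct("signature my writing practiced I", target))  # False: wrong order
```

Using a subsequence rather than an exact positional match means an inserted word does not count against the participant, which is in the spirit of ignoring extra words.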

Experiment 4: Less-degraded words

The hindsight estimates in Experiments 1–3 hovered around 50 % (as in Newton, 1990, cited in Griffin & Ross, 1991). When participants must ignore privileged knowledge to estimate for their naïve peers, the participants may simply guess. In Experiment 4, we reduced the degradation of the words used in Experiment 2 to a point at which approximately 50 % of the words could be identified. If hindsight estimates are merely guesses, participants should still estimate that around 50 % of their peers would identify the degraded words, even though they are now easier to identify. In such a case, participants would be accurate in their hindsight estimates, and thus, no hindsight bias would occur. If, however, participants calibrate their hindsight estimates from how easily they can identify the degraded words (Nickerson, Baddeley, & Freeman, 1987), they should still overestimate their naïve peers’ ability to identify the words.

Method

A group of 49 undergraduates (42 female, 7 male; mean age 22.1 years) participated for credit; 27 completed the naïve-identification task, and 22 completed the hindsight estimation task. The materials were identical to those of Experiment 1, except that the filter frequency range was 1031–2124 Hz, based on a pilot study in which correct naïve identification of the words was roughly 50 %. The procedure was identical to that of Experiment 1.

Results and discussion

Once again, hindsight estimates far exceeded naïve-identification rates, t(47) = 7.40, p < .001, r² = .54, despite the words being easier to identify. The increased mean in the hindsight estimation task with easier words (cf. Exps. 1 and 2) shows that participants are not simply guessing in this task. Instead, they adjust and calibrate their hindsight estimates according to how easy it is for them to identify the words (see also Lange et al., 2011).

General discussion

We obtained auditory hindsight bias in four experiments. In a naïve-identification task, participants tried to identify degraded words or sentences. In a hindsight estimation task, participants heard words or sentences clearly before estimating the percentage of their naïve peers who would be able to identify degraded versions of those words or sentences. Participants consistently overestimated their peers’ ability to identify degraded common words (Exp. 1) and sentences (Exp. 3) when the participants themselves knew the words and sentences. This bias persisted despite attempts to avoid it (Exp. 2) and when we made it easier to identify the degraded versions of the words (Exp. 4). These results demonstrate a large and pervasive gap between one’s actual ability and one’s perception of others’ ability to decipher spoken language when one knows what to listen for.

We maintain that hindsight bias contributes to the miscommunication that results when speakers and listeners overestimate the clarity of their message or the depth of their understanding (see Krauss & Fussell, 1991). Speakers often overestimate the clarity of their message and of listeners’ understanding (Jaccard & Jacoby, 2010; Keysar & Henly, 2002), and listeners often overestimate how much they understand (Lange et al., 2011; Vokey & Read, 1985). Although speakers could benefit from listeners’ feedback that the speakers’ messages are unclear, listeners often withhold such feedback because they believe that they understood the intended message (Keysar, 1994). Additionally, extraneous noise or distortion can produce phonemic restoration, in which listeners correctly or incorrectly restore missing phonemes (Samuel, 1996). Given the prevalence of miscommunication, how might hindsight bias be overcome in discourse?

Our and others’ results show that we cannot overcome hindsight bias through awareness or intention alone (see Lilienfeld, Amirati, & Landfield, 2009; Pohl & Hell, 1996; Sanna, Schwarz, & Stocker, 2002). Hindsight bias can, however, be reduced, eliminated, or reversed by surprise (Calvillo & Gomes, 2011; Pezzo, 2003), by considering alternative explanations (Koriat, Lichtenstein, & Fischhoff, 1980), and by taking another person’s perspective (Todd et al., 2011). Considering alternative explanations or taking another person’s perspective could alert the speaker to a gap in the listener’s knowledge base, thus leading the speaker to provide more information and clarify the message. Conversely, being highly surprised during a conversation may cause a listener to seek clarification. Receiving feedback can facilitate the speaker’s effort in tailoring the message to the listener (Butterfield & Metcalfe, 2006; see Hoch & Loewenstein, 1989), thus improving communication.

Future research should explore the mechanisms underlying auditory hindsight bias and ways to eliminate this bias. Currently, we are investigating the role of fluency—that is, speed of processing—in the auditory hindsight bias (see also Bernstein & Harley, 2007; Fessel & Roese, 2011; Harley et al., 2004; Werth & Strack, 2003). Briefly, hindsight bias could arise in part from a misattribution of processing fluency caused by knowing the outcome when reasoning from a naïve perspective: Privileged knowledge increases the fluency with which one processes a degraded stimulus or question. To avoid hindsight bias, one must attribute this fluency to its correct source (knowing the outcome). Both speakers and listeners often fail to appreciate their privileged knowledge (Nickerson, 1999), thus paving the way for miscommunication.