We aimed to recruit at least 100 participants and continued data collection until the end of the week in which this goal was reached. One participant had to be excluded because of a severe visual impairment and one participant could not complete the study because the computer mouse was defective. The remaining sample consisted of 122 students (100 of whom were female) who were recruited on campus at Heinrich Heine University Düsseldorf. Their age ranged from 18 to 38 years with a mean age of 23 (SD = 4) years. Written informed consent was obtained from all participants. Up to five participants were seated in individual cubicles in a quiet room. They wore headphones with high-insulation hearing protection covers to further shield them from any remaining background sounds. With a final sample size of N = 122, α = .05, and 160 items in the memory test, we were able to detect small effects of size w = 0.03 with a statistical power of 1 – β = .95 in the multinomial analysis of the guessing parameters.
We created 260 statements about products such as “The snack mini pretzels of Beaxen are made of purely organic ingredients.” In a norming study, 15 participants rated these product statements on a scale ranging from −3 (highly non-credible) to +3 (highly credible). Participants were instructed to read each statement carefully and to answer spontaneously, without much deliberate reflection. To maximize the difference between the conditions, the 80 statements with the lowest credibility ratings and the 80 statements with the highest credibility ratings were selected for Experiment 1. To illustrate, an example of a statement with low credibility is “Only the cornflakes from the brand Auve make breakfast irresistible” while an example of a credible statement is “Glucose tablets from Delklate performed very well in a standardized comparative test.” The product statements comprised 260 brand names that were created using the pseudoword generator Wuggy (Keuleers & Brysbaert, 2010). Examples are Admel, Bastol, Calta, Daubort, Fuson, Gafet, Hörter, and Ibutu. The brand names were randomly assigned to the product statements.
The encoding of the product statements and their sources was incidental. Participants were instructed that they would see statements about products. Each statement originated from one of two sources: They were either paid advertisements or press releases of a renowned independent (fictitious) institute for product testing (“Foundation for Brand Testing”). In the presentation phase, 80 product statements were shown. Half of these statements had a low a priori credibility and the other half had a high a priori credibility. The statements were written in 32 pt Times font. Half of the statements in each condition were assigned to the untrustworthy source and labeled as “Advertisement,” while the other half were assigned to the trustworthy source and labeled “Foundation for Brand Testing.” The labels were shown in white 29 pt Avenir font in the upper left corner of the frame in which the product statement was written (Fig. 1). The label “Advertisement” appeared in front of a red background while the label “Foundation for Brand Testing” appeared in front of a blue background to increase the perceptual discriminability of the source tags. For each participant, a different set of 40 product claims with high a priori credibility and 40 product claims with low a priori credibility were randomly drawn from the pool of 160 statements and randomly assigned to the two source conditions so that half of the 40 statements of each category were associated with each source. Each participant saw the statements in a different, randomly determined order. The label appeared 1 s before the product statement was shown. The participants were instructed to indicate for each statement how non-credible or credible they thought the statement was given its content and source. The statement’s credibility was rated on a scale ranging from −3 (highly non-credible) to +3 (highly credible).
Upon clicking a “continue” button, the statement disappeared from the screen, and the next trial started after an inter-trial-interval of 1 s. A progress bar at the bottom of the screen showed the percentage of trials that had been completed.
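The stimulus randomization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the experiment: the function name `assign_statements`, the dictionary keys, and the treatment of the undrawn statements as the new items of the later test are hypothetical choices that are merely consistent with the reported counts (80 low-credibility and 80 high-credibility statements in the pool, 40 of each drawn per participant, 20 per source).

```python
import random


def assign_statements(low_pool, high_pool, n_per_level=40, rng=None):
    """Draw presentation items and assign them to the two source labels.

    Returns (trials, new_items). `trials` is the shuffled presentation list;
    `new_items` holds the statements that were not drawn (assumed here to
    serve as the new items of the memory test).
    """
    rng = rng or random.Random()
    trials, new_items = [], []
    half = n_per_level // 2
    for level, pool in (("low", low_pool), ("high", high_pool)):
        # rng.sample returns the drawn items in random order, so splitting
        # the list in the middle is a random assignment to the two sources.
        drawn = rng.sample(pool, n_per_level)
        drawn_set = set(drawn)
        for i, statement in enumerate(drawn):
            source = "Advertisement" if i < half else "Foundation for Brand Testing"
            trials.append(
                {"statement": statement, "credibility": level, "source": source}
            )
        new_items.extend(s for s in pool if s not in drawn_set)
    rng.shuffle(trials)  # different presentation order for each participant
    return trials, new_items


if __name__ == "__main__":
    # Placeholder statement identifiers, not the actual stimuli.
    low = [f"low_{i}" for i in range(80)]
    high = [f"high_{i}" for i in range(80)]
    trials, new_items = assign_statements(low, high, rng=random.Random(2024))
    print(len(trials), len(new_items))
```

Passing a seeded `random.Random` instance makes a participant's assignment reproducible, which is a convenience of this sketch rather than something reported in the text.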
Following the presentation phase, 20 trials of serial recall had to be performed as a distractor task. In each trial, eight digits were presented, one after another, for 1 s each. Immediately after the presentation of the digits, eight question marks appeared that had to be replaced by the digits in the order of their presentation, using the numeric keypad of the computer keyboard. After each trial, participants received feedback about their performance. The distractor task lasted about 10 min.
A surprise source-memory test then followed. Participants saw the 80 statements from the presentation phase, randomly intermixed with 80 new statements, half of which had a low a priori credibility and half of which had a high a priori credibility. All statements were presented in black 30 pt Arial font against a white background at the center of the screen. Participants first rated the credibility of the statements on a scale ranging from −3 (highly non-credible) to +3 (highly credible). Then they were asked to indicate whether they had seen the statement before or not by clicking “old” or “new.” After a statement had been classified as “old,” participants were asked to provide a source judgment by indicating whether the statement came from an advertisement or from the Foundation for Brand Testing. This follows the standard procedure of source-memory tests (Bayen et al., 1996). A progress bar at the bottom of the screen indicated the percentage of trials that had been completed.
A 2 × 2 × 2 repeated-measures analysis of variance with a priori credibility (high, low), source (advertisement, brand testing), and phase (presentation, test) as independent variables and credibility ratings as dependent variable revealed main effects of a priori credibility, F(1,121) = 1358.44, p < .01, ηp2 = .92, source, F(1,121) = 129.41, p < .01, ηp2 = .52, and phase, F(1,121) = 125.30, p < .01, ηp2 = .51 (Table 1). Statements with high a priori credibility received higher ratings than statements with low a priori credibility, showing that a priori credibility was manipulated successfully. Product statements that were labeled as advertisement received lower credibility ratings than statements that had been labeled as coming from the trustworthy source. On average, the credibility ratings decreased from the presentation phase to the test phase. Phase did not interact with a priori credibility, F(1,121) = 1.07, p = .30, ηp2 = .01. However, there was an interaction between phase and source, F(1,121) = 102.07, p < .01, ηp2 = .46. Source had a pronounced effect in the presentation phase but the influence of source was markedly reduced in the test phase. A priori credibility interacted with the source, F(1,121) = 12.93, p < .01, ηp2 = .10, reflecting the fact that source had a stronger influence on statements with high a priori credibility than on statements that were not credible from the outset. The interaction between phase, a priori credibility, and source was significant, F(1,121) = 18.87, p < .01, ηp2 = .13, suggesting that the interaction between a priori credibility and source was stronger in the presentation phase than in the test phase. Overall, the data suggest that the influence of the source tags markedly decreased from the presentation phase to the test phase. 
This may reflect forgetting of the source labels, but note that the credibility ratings likely reflect the joint influence of different types of processes such as item recognition, source memory, and source guessing, so that it is necessary to disentangle these processes via cognitive modeling before drawing conclusions about them.
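As a reading aid, the reported effect sizes can be recovered from the F statistics and their degrees of freedom through the standard identity partial eta squared = (F × df_effect) / (F × df_effect + df_error). The helper name `partial_eta_squared` below is introduced for illustration only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # F values reported above, each with df = (1, 121)
    reported = {
        "a priori credibility": 1358.44,
        "source": 129.41,
        "phase": 125.30,
        "phase x source": 102.07,
    }
    for effect, f_value in reported.items():
        print(effect, round(partial_eta_squared(f_value, 1, 121), 2))
```

With these inputs, the computed values agree with the reported effect sizes to two decimal places (.92, .52, .51, and .46).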
To analyze the performance in the source-monitoring test, we first assessed the proportion of statements that were attributed to advertising (Fig. 2). Statements that had been labeled as advertisements were more likely to be attributed to the advertising source than statements that came from the trustworthy source, F(1,121) = 175.96, p < .01, ηp2 = .59, which suggests that participants had some memory for the labels. However, a priori credibility also had a pronounced influence on source attributions, F(1,121) = 84.61, p < .01, ηp2 = .41, even though there actually was a zero contingency between a priori credibility and source. Statements with low a priori credibility were more likely to be attributed to advertising. This was true both for the statements that were labeled as advertisements, F(1,121) = 43.78, p < .01, ηp2 = .27, and for the statements of the trustworthy source, F(1,121) = 102.53, p < .01, ηp2 = .46. The effect was somewhat stronger for the misattributions of the trustworthy statements than for the correct attributions of the advertising statements, F(1,121) = 12.96, p < .01, ηp2 = .10.
Figure 3 shows the source-monitoring model of Bayen et al. (1996), adapted for the present purposes. To illustrate, the first tree of the model depicts the processes that occur in response to product statements that were labeled as advertisements. The parameters represent cognitive processes that may occur with certain probabilities that vary in the [0, 1] interval. With probability DAd, these statements are recognized as old. When a statement is recognized as old, participants may also have source memory for the statement with probability dAd, in which case the statement is correctly classified as an advertisement. With the complementary probability 1 – dAd, participants have no source memory for the statement, in which case they have to guess, with probability aAd, that the statement is an advertisement, or, with the complementary probability 1 – aAd, that the statement is from brand testing. When the statement is not recognized as old, which occurs with probability 1 – DAd, participants can still guess that it is old with probability b, in which case they may guess that the statement is an advertisement with probability gAd or from brand testing with probability 1 – gAd. With probability 1 – b, the statement is guessed to be new. Similar processes are thought to occur in response to items labeled “Foundation for Brand Testing” (second tree of Fig. 3), and to new items (bottom tree of Fig. 3). For analyzing the results, we needed two sets of the model trees described above, one for statements with high a priori credibility and one for statements with low a priori credibility. This model is widely used in source-monitoring research to distinguish between source memory and guessing (Bayen & Kuhlmann, 2011; Bayen et al., 2000; Kuhlmann et al., 2012; Schaper et al., 2019).
It is well validated as it has been empirically shown that the model parameters allow for an uncontaminated measurement of item recognition, source memory, and guessing (Bayen et al., 1996; Bröder & Meiser, 2007).
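The tree structure described above translates directly into response-category probabilities. The following sketch spells this out for one credibility level. The first tree follows the verbal description given above; the trees for items from the Foundation for Brand Testing and for new items are written out by analogy with the standard two-high-threshold source-monitoring model and are an assumption here, as are the function names and the parameter values used in the example, which are arbitrary and not estimates from the present data.

```python
import math


def old_item_probs(D, d, a, b, g, source_is_ad=True):
    """Probabilities of responding "Ad", "Test", or "new" to an old item.

    D: item recognition, d: source memory, a: guessing "Ad" for recognized
    items, b: guessing "old" for unrecognized items, g: guessing "Ad" for
    unrecognized items that were guessed to be old.
    """
    remembered = D * d  # recognized and source remembered: correct source
    guess_ad = D * (1 - d) * a + (1 - D) * b * g
    guess_test = D * (1 - d) * (1 - a) + (1 - D) * b * (1 - g)
    p_new = (1 - D) * (1 - b)  # not recognized and guessed to be new
    if source_is_ad:
        return {"Ad": remembered + guess_ad, "Test": guess_test, "new": p_new}
    return {"Ad": guess_ad, "Test": remembered + guess_test, "new": p_new}


def new_item_probs(D_new, b, g):
    """Probabilities of responding "Ad", "Test", or "new" to a new item."""
    return {
        "Ad": (1 - D_new) * b * g,
        "Test": (1 - D_new) * b * (1 - g),
        "new": D_new + (1 - D_new) * (1 - b),
    }


if __name__ == "__main__":
    # Arbitrary illustrative parameter values
    probs = old_item_probs(D=0.6, d=0.2, a=0.5, b=0.4, g=0.5)
    print(probs, math.isclose(sum(probs.values()), 1.0))
```

Because every branch of a tree ends in exactly one response category, the three probabilities of each tree sum to 1, which is a quick consistency check on any hand-written set of model equations.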
The model displayed in Fig. 3 contains eight parameters (DAd, DTest, DNew, dAd, dTest, b, aAd, gAd), each representing the probability with which certain cognitive processes occur. However, there are only six independent data categories to fit, which means that the model is not identifiable. Therefore, equality restrictions have to be imposed on the parameters to obtain an identifiable base model (Bayen et al., 1996). Assumptions that are imposed on the unrestricted model to make it identifiable include (1) the assumption that the probability of detecting an old item as old is identical to the probability of detecting a new item as new, which is the standard assumption of two-high threshold models of item recognition (Snodgrass & Corwin, 1988). Validation studies have found that models incorporating this restriction perform better than alternative models assuming that new items cannot be detected as new and perform as well as signal-detection-based models (Bayen et al., 1996; Schütz & Bröder, 2011; Snodgrass & Corwin, 1988). We therefore included the assumptions that the detection of an item as old did not differ as a function of the source and was equal to the detection of new items (DAd = DTest = DNew). To ensure model identifiability, we also included (2) the assumption that source guessing does not differ as a function of the old-new recognition status of the items (aAd = gAd), which is a common assumption in studies examining schematic guessing biases (e.g., Bayen & Kuhlmann, 2011; Bayen et al., 2000; Schaper et al., 2019). The base model incorporating these restrictions fit the data well, G2(2) = 2.95, p = .23, which indicates that the assumptions incorporated in the model were compatible with the data (for an additional empirical test of these assumptions, see Experiment 2). The model-based analyses were performed using multiTree (Moshagen, 2010).
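The parameter counting behind the identifiability argument, and the p value of the goodness-of-fit test, can be retraced with a few lines of code. The sketch assumes that each of the three trees has three response categories (hence two independent categories per tree) and that the base model has five free parameters per credibility level (D, dAd, dTest, b, and a = g); this count is a reconstruction that is consistent with the reported two degrees of freedom rather than a detail stated in the text. The function names are hypothetical. For 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form exp(−x/2).

```python
import math


def model_df(n_free_params, n_sets, n_trees=3, n_responses=3):
    """Independent data categories minus free parameters.

    A negative value means that there are more parameters than independent
    categories, so the model cannot be identifiable.
    """
    independent_categories = n_trees * (n_responses - 1) * n_sets
    return independent_categories - n_free_params * n_sets


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


if __name__ == "__main__":
    print(model_df(n_free_params=8, n_sets=1))  # unrestricted model, one set of trees
    print(model_df(n_free_params=5, n_sets=2))  # assumed base model, two sets of trees
    print(round(chi2_sf_df2(2.95), 2))          # p value for G2(2) = 2.95
```

Under these assumptions, the unrestricted model has 6 − 8 = −2 degrees of freedom, whereas the base model applied to both credibility levels has 12 − 10 = 2, and the computed p value rounds to the reported .23.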
Item recognition was higher for statements with low credibility than for statements with high credibility, ΔG2(1) = 47.08, p < .01 (Table 2). Implausible or unbelievable statements may stick in memory because of a bizarreness effect (Macklin & McDaniel, 2005).
Overall, source memory was rather poor (Table 2). Descriptively, source memory was somewhat better for the trustworthy source than for the untrustworthy source. The difference between the corresponding parameters was not significant for items with high a priori credibility, ΔG2(1) = 0.24, p = .62, but it was significant for statements with low a priori credibility, ΔG2(1) = 5.26, p = .02.
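Each of these comparisons tests one equality restriction, so the difference in fit is evaluated against a chi-square distribution with 1 degree of freedom. A minimal sketch of the corresponding p value uses the identity P(χ² > x) = erfc(√(x/2)), which holds for 1 degree of freedom; the function name `chi2_sf_df1` is introduced here for illustration.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    # Fit differences reported above for the source-memory comparisons
    for delta_g2 in (0.24, 5.26):
        print(delta_g2, round(chi2_sf_df1(delta_g2), 2))
```

The computed values round to the reported p = .62 and p = .02.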
The source-guessing parameter reflects the probability of guessing that a statement had been labeled as an advertisement (Fig. 4). Statements with low a priori credibility were more likely to be attributed to advertising than statements with high a priori credibility, ΔG2(1) = 42.14, p < .01, confirming that people rely on schematic knowledge to reconstruct the sources when memory fails.
The influence of source tags on credibility judgments decreased markedly from the presentation phase to the test phase. This is most likely due to forgetting of the source tags. In line with this interpretation, source memory was rather poor. The parameter representing the conditional probability of remembering the source tag provided that the statement was still remembered ranged from .02 for the fact that a non-credible statement was an advertising message to .37 for the fact that a credible statement came from a trustworthy source. Participants were thus more likely to forget than to remember the source even after a rather short period of time.
In line with previous studies (Nadarevic & Erdfelder, 2013, 2019), the present results refute the idea that people remember untrustworthy sources particularly well. In the present study, the trustworthy source was remembered even somewhat better than the untrustworthy source. Here this is demonstrated only for particular types of untrustworthy and trustworthy sources (advertisements and brand testing), but the pattern of results fits well with the context-dependent model of source tagging proposed by Nadarevic and Erdfelder (2019). According to this model, trustworthy sources are more informative, and, thus, more important to remember, than untrustworthy sources when people are generally skeptical of new information. Consistent with this idea, the memory advantage for trustworthy sources was numerically stronger – and, in fact, only significant – for statements with low a priori credibility.
When source memory is no longer available due to forgetting, reconstructive guessing processes become increasingly important. Naturally, guessing does not always lead to the reconstruction of the correct source, but may also lead to the misattribution of statements to the wrong source. As expected, the a priori credibility of the statements had a pronounced effect on these guessing processes. The tendency towards guessing that a statement was an advertising message was much stronger for non-credible statements than for credible statements.