To test whether the metamemory illusion observed in Experiment 1 generalizes to a different stimulus domain, we used a sequential prisoner’s dilemma game (Clark & Sefton, 2001) to pair trustworthy-looking and untrustworthy-looking faces with cheating and cooperative behavior in Experiment 2. As in Experiment 1, the item information (e.g., a trustworthy face) was associated with either unexpected or expected source information (cheating or cooperative behavior). As in Experiment 1, we assessed participants’ metamemory in judgments of item learning (Rhodes, 2016) and judgments of source learning (Kuhlmann & Bayen, 2016) after each round of the prisoner’s dilemma game. In a subsequent memory test, we assessed how well participants remembered the partners’ faces (item memory) and the associated cheating or cooperative behavior (source memory). After the memory test, participants were asked to provide postdictions for item and source memory in all four cells of the design.
We collected as much data as possible in the 6 weeks the laboratory was available to us. A total of 185 participants (124 female) took part in Experiment 2. They were compensated with course credit or money. They were alternately assigned to either the with-judgment group (n = 92, 59 female) or the without-judgment group (n = 93, 65 female). Three additional data files had to be excluded because of repeated participation. Age ranged between 17 and 40 years (Mage = 23 years, SDage = 4 years). All participants gave written informed consent in accordance with the Declaration of Helsinki.
A sensitivity power analysis showed that with a sample size of 185 participants (>90 per group), 80 answers per participant in the memory test, and given α = .05, effect sizes of w = 0.04 could be detected in the model-based comparisons with a statistical power of 1 − β = .95. This power analysis was calculated using G*Power (Faul et al., 2007).
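To illustrate the logic behind such a sensitivity analysis, the power of a single model-based comparison can be checked in pure Python. This is a minimal sketch, not a reproduction of the G*Power computation: it assumes a chi-square test with df = 1 (the ΔG² comparisons reported below have 1 or 2 df), whereas the exact value reported above depends on the degrees of freedom and test family specified in G*Power. The function name power_chi2_df1 is illustrative.

```python
import math
from statistics import NormalDist

def power_chi2_df1(w, n_obs):
    """Power of a chi-square test with df = 1 and alpha = .05 for effect
    size w with n_obs observations. With df = 1 the noncentral chi-square
    variable is a squared shifted standard normal, so power has a
    closed form based on the normal CDF."""
    crit = math.sqrt(3.841459)        # sqrt of chi-square critical value (df = 1, alpha = .05)
    delta = math.sqrt(w * w * n_obs)  # sqrt of the noncentrality parameter
    z = NormalDist()
    return (1 - z.cdf(crit - delta)) + z.cdf(-crit - delta)

# 185 participants contributing 80 test responses each
power = power_chi2_df1(w=0.04, n_obs=185 * 80)
```

Under these simplified assumptions, an effect of w = 0.04 with 185 × 80 observations is detected with power well above .95, consistent with the sensitivity analysis reported above.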
Prisoner’s dilemma game
Participants played a one-shot sequential prisoner’s dilemma game with 20 trustworthy-looking and 20 untrustworthy-looking partners. For each participant, the faces of the partners were drawn from a pool of 80 frontal facial photographs of women with neutral facial expressions (250 × 375 pixels) from the FERET database (Phillips, Wechsler, Huang, & Rauss, 1998). Of these faces, 40 were trustworthy looking and 40 were untrustworthy looking according to an independent norming study with N = 21 students. The faces were identical to those used in a previous study in which participants showed enhanced source memory for trustworthy-looking cheaters (Mieth et al., 2016). Mean trustworthiness ratings in the norming study on a scale ranging from 1 to 6 were M = 4.28 (SD = 0.23; min = 4.00; max = 4.86) for the trustworthy-looking and M = 2.75 (SD = 0.24; min = 1.90; max = 3.10) for the untrustworthy-looking faces. The faces were randomly drawn from the pool of 80 faces and randomly assigned to the conditions with the restriction that half of the cheating partners and half of the cooperating partners looked trustworthy, while the other half looked untrustworthy.
The prisoner’s dilemma game was identical to that used in previous studies (Bell et al., 2012; Bell, Giang, Mund, & Buchner, 2013; Mieth et al., 2016). Participants were required to invest money into a joint business venture with partners whose faces were shown on the screen. Participants knew that they played for real money (relative to an initial endowment of 100 cents, they could win or lose money depending on their own and their partners’ decisions). In each trial, participants were required to invest a small amount of money (30 or 15 cents). Cooperating partners always reciprocated the participants’ investment, which led to a small gain for both partners (10 or 5 cents). Cheating partners invested nothing, which led to a comparatively large gain for the partner (20 or 10 cents) at the expense of the participant, who lost money (−10 or −5 cents).
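The payoff arithmetic of a single round follows directly from the rules described in the next paragraph (joint investments plus a one-third bonus, with the total split evenly). A minimal sketch, using an illustrative helper name (payoff) that is not part of the experimental software:

```python
def payoff(participant_investment, partner_investment):
    """Return (participant_net, partner_net) in cents for one round of the
    sequential prisoner's dilemma game: investments are summed, a bonus of
    one third of the sum is added, and the total is split evenly
    regardless of who invested what."""
    total = participant_investment + partner_investment
    total += total // 3   # bonus of one third of the joint sum
    share = total // 2    # even split between the two players
    return (share - participant_investment, share - partner_investment)
```

For example, a reciprocated 30-cent investment yields a 10-cent gain for both players, whereas a cheating partner who invests nothing gains 20 cents while the participant loses 10 cents, matching the payoffs stated above.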
A silhouette at the left side of the screen represented the participant (see Fig. 5). At the right side of the screen, a photograph of the partner was shown. In each trial, participants decided to invest either 15 or 30 cents by pressing a button on a response box. The decision was displayed on screen for 500 ms. The investment was then shown in an arrow for 500 ms, which moved to the center of the screen within an additional 500 ms. After 500 ms, the partner’s investment was shown in an arrow for 500 ms, which also moved to the center of the screen within 500 ms. After 500 ms, the sum of the investments, a bonus of one third of this sum, and the total sum were presented in the center of the screen, each for 500 ms. After 500 ms, the total sum was split evenly between the partners, regardless of their previous investments. The partner’s share was shown in an arrow that moved toward the partner’s photograph within 500 ms. After 500 ms, the participant’s share was shown in an arrow that moved toward the participant’s silhouette within 500 ms. After 1 s, both interactants’ gains and losses were shown below the photograph and the silhouette, respectively, followed after 500 ms by the updated account balances. Participants gained money when interacting with cooperators and lost money when interacting with cheaters. The losses that resulted from interactions with cheaters were as large as the gains that resulted from interactions with cooperative partners. A verbal description of the interaction was shown until the participant pressed a continue button on the response box to start the next trial. Two practice trials were provided to familiarize the participants with the game.
Judgments of item and source learning
Participants in the without-judgment group performed the prisoner’s dilemma game without providing metamemory judgments, as in previous studies (Bell et al., 2012; Mieth et al., 2016). Participants in the with-judgment group provided metamemory judgments immediately after each round of the prisoner’s dilemma game. Judgments were assessed as in Experiment 1. First, participants were asked to predict the probability that they would later remember the partner’s face (judgment of item learning). Second, they were asked to predict the independent probability that they would later remember the partner’s behavior (judgment of source learning).
Source monitoring test
After the game phase, participants received instructions for a standard source monitoring test (Bell et al., 2012; Mieth et al., 2016). The 40 faces from the prisoner’s dilemma game were randomly intermixed with 40 new faces from the same pool of faces. Twenty of the new faces were trustworthy looking and 20 were untrustworthy looking. All 80 faces were presented in random order, one at a time, at the center of the screen. For each face, participants first rated the likability of the face on a scale ranging from 1 (not likable at all) to 6 (very likable). Then participants were required to indicate whether they had seen the face in the game before. If so, they were asked whether the face was paired with cheating or cooperation. By pressing the continue button, participants initiated the next trial.
Measuring source memory
The multinomial source monitoring model (Bayen et al., 1996) was used to estimate parameters for item memory (D), item guessing (b), source memory (d), and source guessing (g). Here, Source A and Source B refer to cheating and cooperation in the prisoner’s dilemma game, respectively. Four sets of the model trees shown in Fig. 1 were needed, one for each combination of facial trustworthiness (trustworthy looking vs. untrustworthy looking) and judgment group (without judgments vs. with judgments).
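As an illustration of how the four parameters generate response probabilities, the processing trees of the two-high-threshold source-monitoring model can be sketched as follows. This is a minimal sketch assuming the common submodel in which source guessing does not depend on whether the face was recognized (a = g in Bayen et al.'s notation); the helper name tree_probs is illustrative and not part of any analysis software.

```python
def tree_probs(D, d, b, g):
    """Response-category probabilities of the two-high-threshold
    source-monitoring model for Source A items (cheaters), Source B
    items (cooperators), and new faces. D = item memory, d = source
    memory, b = item guessing, g = probability of guessing Source A."""
    source_a = {
        'A':   D * d + D * (1 - d) * g + (1 - D) * b * g,
        'B':   D * (1 - d) * (1 - g) + (1 - D) * b * (1 - g),
        'new': (1 - D) * (1 - b),
    }
    source_b = {
        'A':   D * (1 - d) * g + (1 - D) * b * g,
        'B':   D * d + D * (1 - d) * (1 - g) + (1 - D) * b * (1 - g),
        'new': (1 - D) * (1 - b),
    }
    new = {
        'A':   (1 - D) * b * g,
        'B':   (1 - D) * b * (1 - g),
        'new': D + (1 - D) * (1 - b),
    }
    return source_a, source_b, new
```

In each tree, the branches for detection (D), source memory (d), and the guessing processes (b, g) sum to 1, and parameters are estimated by fitting these predicted probabilities to the observed response frequencies.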
After the memory test, participants provided postdictions (Schaper et al., 2019a) for all four combinations of facial trustworthiness (trustworthy looking and untrustworthy looking) and partner behavior (cheating and cooperation), as in Experiment 1.
We used a 2 × 2 × 2 mixed design with facial trustworthiness (trustworthy looking vs. untrustworthy looking) and partner behavior (cheating vs. cooperation) as within-subject factors. Judgment group (without judgments vs. with judgments) was again a between-subjects factor. Dependent variables were judgments of item and source learning averaged across items, objective memory (item and source memory), and postdictions of item and source memory.
Metamemory judgments and source monitoring processes were analyzed as in Experiment 1. Additionally, investments in the prisoner’s dilemma game and test-phase likability ratings were analyzed with repeated-measures ANOVAs.
Investments in the prisoner’s dilemma game and likability ratings at test
Before focusing on metamemory and memory, it is worth noting that both game investments and test-phase likability ratings reflected the manipulations of facial and behavioral trustworthiness, as expected (see Table 2). Participants invested more in the prisoner’s dilemma game when interacting with trustworthy-looking partners than when interacting with untrustworthy-looking partners, F(1, 183) = 142.84, p < .001, ηp2= 0.44. At test, trustworthy-looking faces were rated as more likable than untrustworthy-looking faces, F(1, 183) = 515.79, p < .001, ηp2= 0.74, and cooperators were rated as more likable than cheaters, F(1, 183) = 56.84, p < .001, ηp2= 0.24. Judgment group had no main effect on any of these variables and did not interact with any other factor (all Fs ≤ 3.35). These findings suggest that facial trustworthiness and partner behavior were successfully manipulated and that cheating and cooperative behavior significantly influenced the socioemotional evaluation of the partners.
Judgments of item and source learning
In the with-judgments group, judgments of item learning were neither influenced by facial trustworthiness, F(1, 91) = 1.01, p = .317, ηp2= 0.01, nor by partner behavior, F(1, 91) = 1.27, p = .263, ηp2= 0.01. There was no interaction between facial trustworthiness and partner behavior, F(1, 91) = 0.53, p = .467, ηp2= 0.01 (see Fig. 2c).
For judgments of source learning, there were no main effects of facial trustworthiness, F(1, 91) = 2.04, p = .157, ηp2= 0.02, and partner behavior, F(1, 91) = 1.16, p = .284, ηp2= 0.01. However, critically, there was a significant interaction between facial trustworthiness and partner behavior, F(1, 91) = 25.69, p < .001, ηp2= 0.22. On average, participants predicted better source memory for trustworthy-looking cooperators than for trustworthy-looking cheaters, F(1, 91) = 10.96, p = .001, ηp2= 0.11, and for untrustworthy-looking cheaters than for untrustworthy-looking cooperators, F(1, 91) = 16.20, p < .001, ηp2= 0.15 (see Fig. 2d). Thus, in their metamemory predictions participants expressed the belief that they would have better source memory for expected behaviors.
Item and source memory
We used the same equality restrictions for the base model as in Experiment 1. The base model fit the data well, G2(4) = 3.60, p = .462, which suggests that the restriction that item memory did not differ between cheater and cooperator faces (implied by the base model) is compatible with the data (cf. Barclay & Lalumière, 2006; Bell et al., 2012; Bell, Mieth, & Buchner, 2015; Mehl & Buchner, 2008; Mieth et al., 2016). However, item memory was higher for untrustworthy-looking faces than for trustworthy-looking faces, ΔG2(2) = 7.16, p = .028, w = 0.02 (see Fig. 3c). Furthermore, participants in the with-judgments group had better item memory than participants in the without-judgments group, ΔG2(2) = 57.17, p < .001, w = 0.06. This suggests that the metacognitive processing benefitted item memory, consistent with previous studies (Rhodes, 2016; Schaper et al., 2019a).
Source-memory parameter d represents the conditional probability that participants remember the cheating or the cooperation of a partner given that they have recognized the face as old. As in Experiment 1, we further restricted the model by using one parameter representing source memory for expected item–source pairings (trustworthy-looking cooperators and untrustworthy-looking cheaters) and one for unexpected item–source pairings (trustworthy-looking cheaters and untrustworthy-looking cooperators). The resulting model was compatible with the data, G2(8) = 5.81, p = .669. This new base model was used for the following analyses of the source memory parameters. Source memory for expected and unexpected source pairings differed significantly between groups, ΔG2(2) = 9.85, p = .007, w = 0.03. In the without-judgments group, participants showed an expectancy violation effect, replicating previous studies (Bell et al., 2012; Mieth et al., 2016; Suzuki & Suga, 2010). Source memory was better for unexpected than for expected item–source pairings, ΔG2(1) = 9.24, p = .002, w = 0.02 (left side of Fig. 3d). As in Experiment 1, this expectancy-violation advantage was not significant for participants who provided metamemory judgments at encoding, ΔG2(1) = 1.05, p = .306, w < 0.01 (right side of Fig. 3d).
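For readers unfamiliar with the model-comparison statistic used throughout, the log-likelihood ratio statistic G² can be sketched in a few lines. This is an illustrative helper (g_squared is not part of any analysis software); in practice, the expected frequencies come from the maximum-likelihood fit of the multinomial model.

```python
import math

def g_squared(observed, expected):
    """Log-likelihood ratio statistic G^2 = 2 * sum(obs * ln(obs / exp)).
    Categories with zero observed counts contribute nothing to the sum."""
    return 2 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)

# A restricted model is rejected when the increase in misfit,
# delta G^2 = G^2(restricted) - G^2(general), exceeds the chi-square
# critical value with df equal to the number of parameters restricted.
```

G² is zero when the model reproduces the observed frequencies exactly and grows with misfit, which is why both absolute fit (e.g., G2(8) = 5.81 above) and nested comparisons (ΔG²) are evaluated against the chi-square distribution.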
Parameter g—which reflects the probability of guessing that a face was associated with cheating—differed significantly between trustworthy-looking faces and untrustworthy-looking faces, independent of whether participants provided metamemory judgments, ΔG2(1) = 17.05, p < .001, w = 0.03, or not, ΔG2(1) = 47.12, p < .001, w = 0.06 (see Table 3). Participants in both groups guessed that trustworthy-looking partners had been associated with cooperation and that untrustworthy-looking partners had been associated with cheating.
Postdictions of item and source memory
After the memory test, participants in both groups provided postdictions for all four cells of the design. Postdictions for item and source memory are shown in Figs. 4c–d. Participants thought that they had remembered trustworthy-looking faces better than untrustworthy-looking faces, F(1, 183) = 5.22, p = .024, ηp2= 0.03. They also thought that item memory had been better for cheaters than for cooperators, F(1, 183) = 5.43, p = .021, ηp2= 0.03. These main effects were qualified by a significant interaction, F(1, 183) = 6.46, p = .012, ηp2= 0.03. Participants thought that item memory had been better for untrustworthy-looking cheaters than for untrustworthy-looking cooperators, F(1, 183) = 12.64, p < .001, ηp2= 0.06, while no such difference was obtained for trustworthy-looking faces, F(1, 183) = 0.04, p = .835, ηp2< 0.01. There was neither a main effect of judgment group nor any two-way or three-way interactions with this variable (all Fs ≤ 0.20).
For the postdictions of source memory, there was no main effect of facial trustworthiness, F(1, 183) = 1.93, p = .167, ηp2= 0.01. The main effect of partner behavior, in contrast, was significant—participants thought they had remembered cheating better than cooperative behavior, F(1, 183) = 18.36, p < .001, ηp2= 0.09. The interaction was not significant, F(1, 183) = 3.60, p = .059, ηp2= 0.02. There was neither a main effect of judgment group nor any two-way or three-way interactions with this variable (all Fs ≤ 0.44).
Experiment 2 tested whether metamemory in source monitoring with socially relevant materials follows the same principles as metamemory for nonsocial materials. Most notably, Experiment 2 provided evidence of a metamemory expectancy illusion in social source memory. Participants on average predicted better source memory when the cheating or cooperative behavior of the partner confirmed their expectations about the trustworthy-looking or untrustworthy-looking person. In stark contrast, veridical source memory was better for unexpected behaviors than for expected behaviors in the without-judgment group. A similar metamemory expectancy illusion was obtained in Experiment 1 and in other studies (Schaper et al., 2019a, 2019b) with nonsocial stimulus material. The present results demonstrate that people fall prey to the same metacognitive illusion even when making judgments about information that is socially relevant and associated with real financial gains and losses. This suggests that metamemory is governed by similar principles for social and nonsocial information.
Another parallel to the findings of Experiment 1 and Schaper et al. (2019a) is that the source memory advantage for unexpected information was reduced when participants provided metamemory judgments after each encoding trial. This is also consistent with the finding of Soderstrom et al. (2015) that memory for related word pairs was selectively increased when participants provided judgments of learning. This finding thus strengthens the general conclusion that memory for cheaters and cooperators is determined by the same principles as memory for nonsocial information (Bell & Buchner, 2012).
Postdictions about item and source memory assessed immediately after test did not differ between the judgment groups in Experiment 2. In contrast to the in-the-moment judgments of source learning, source-memory postdictions were characterized by a main effect of partner behavior, suggesting that participants believed after test that they had remembered cheating better than cooperative behavior. The interaction between facial trustworthiness and partner behavior did not attain significance. This pattern of results is thus different from that obtained in Experiment 1. Unlike the judgments of source learning obtained at encoding, the postdictions thus seem to differ between social and nonsocial stimuli. A potential explanation for this difference between postdictions and judgments of source learning obtained during encoding is that postdictions—as global judgments—are more reflective of the participants’ beliefs or naïve theories about memory, whereas judgments of source learning during encoding are more strongly determined by in-the-moment processing experiences (Frank & Kuhlmann, 2017). In fact, it seems plausible that people have different beliefs about social memory versus nonsocial memory. This explanation implies that people hold the belief that cheating is better remembered than cooperative behavior, which is plausible considering the salience of the negative experience of being cheated. However, this assumption needs to be tested. Experiment 3 therefore examines whether people hold the belief that cheaters are better remembered than cooperators.