Metaphor creates intimacy and temporarily enhances theory of mind

Metaphor is a type of figurative language that alters the literal meaning of words and phrases. Although once considered an intentionally misleading and cognitively taxing form of language (see summary by Gibbs, 1994), current research shows metaphor is commonly used in conversation and is comprehended with relative ease. Cameron (2008) estimated that we use 50 metaphors per thousand words in discourse, with Glucksberg (1989) hypothesizing that one person over a 60-year life span uses millions of metaphors and other types of nonliteral language.

The main research emphasis in metaphor studies has been on how one infers a nonliteral meaning from the explicit or surface meaning of a sentence (e.g., Gibbs, 1994; Glucksberg, 2001). There is a much smaller literature focused on understanding why people speak metaphorically when literal language might have been used. Most of these explanations involve communicative or cognitive goals, such as providing a compact and efficient way to state a complex message; enhancing the vividness of the message; and serving to illuminate, clarify, or explain a concept that is not easily understood with literal language (Ortony, 1975). Other cognitive roles for metaphor have also been suggested, such as being especially persuasive (Sopory & Dillard, 2002) or creating a stronger memory trace (e.g., Whitney, Budd, & Mio, 1996).

Without discounting the cognitive or communicative function of metaphor, we emphasize here and expand upon an even smaller research literature that implicates metaphor usage in creating social bonds and in understanding others’ intentions. The general thrust of this argument is captured by philosopher Ted Cohen. Cohen (1978) made the following claim:

I want to suggest a point in metaphor which is independent of the question of cognitivity and which has nothing to do with its aesthetic character. I think of this point as the achievement of intimacy. There is a unique way in which the maker and the appreciator of a metaphor are drawn closer to one another. (p. 8)

He explained that intimacy is created by the speaker issuing “a kind of concealed invitation” to which the hearer “expends a special effort to accept the invitation” resulting in a shared understanding and the creation of common ground. Cohen claims these acts are true of all language, “but in ordinary literal discourse their involvement is so pervasive and routine they go unremarked” (p. 8). Cohen speculated further that intimacy created by metaphor relies on cognitive effort that psychologists nowadays would attribute to theory of mind (ToM), or the representations that people have of each other’s beliefs.

Cohen’s (1978) proposal was a work of speculation. Since his work, a limited empirical literature has emerged supporting the view that metaphor in particular is strongly tied to understanding intention and conveying social information. Use of metaphor serves as a cue to gender identity (Hussey & Katz, 2009), knowledge of a person’s occupation aids in the recognition of a statement as metaphoric or not (Katz & Pexman, 1997), and social knowledge is a factor involved in shaping use and understanding of metaphoric discourse (Gibbs & Cameron, 2008). The most direct empirical evidence for Cohen’s hypothesis is found in Horton (2007) and is conceptually replicated in Horton (2013). Horton (2007) presented participants with texts containing conversations between two characters whose relationship is ambiguous. In each narrative, one character replied either metaphorically or literally to some personal information provided by the other character. Of immediate importance, Horton found that his readers judged the characters as “closer” (i.e., knowing each other better) when one of the characters responded with a metaphorical expression.

Consider again the reasons that intimacy might have been created. According to Cohen (1978), metaphor engages inferential mental process regarding the mental representations of the interlocutor. To our knowledge, none of the studies that postulate a social role for metaphor have shown that processing of metaphor, in fact, enhances one’s ability to infer the mental state of others. For instance, in Horton’s (2007) seminal study, it is possible that intimacy was created by engaging the “concealed invitation” proposed by Cohen, thus sensitizing people to the internal states of interlocutors. However, it is also possible that the research participants went through a reasoning process wherein they decided that informal or unusual language is more likely to be used by friends than by strangers and, in the absence of additional information, used that reasoning to infer the interlocutors must know each other well.

In the three experiments reported here we directly test whether the act of processing metaphor enhances the ability to infer mental states, by asking participants to complete an ostensibly unrelated task after processing metaphor (or a literal counterpart). This task, the Reading the Mind in the Eyes Test (RMET; Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001), provides a behavioral measure of one’s ability to infer the internal state of a person from subtle affective facial expressions. The RMET consists of 36 black-and-white photographs of the eye region (midway along the nose to just above the eyebrow) of 18 male and 18 female targets. For each of the 36 photographs, four mental-state descriptions are also presented (one target and three foil terms, of the same emotional valence). From the visual information alone, respondents are required to choose the word that best describes what the person in the picture is thinking or feeling.

The RMET has been widely used in assessing individual differences in identifying mental states, both in normal and clinical populations. A recent meta-analysis identified over 250 studies that have used this instrument (Baker, Peterson, Pulos, & Kirkland, 2014). The test has acceptable internal consistency (with Cronbach’s alphas in the 0.6 to 0.8 range; see Vellante et al., 2013) and good one-month test–retest reliability (r above .80 in Vellante et al., 2013). Performance on this task reliably identifies individuals along the autism spectrum (Happe, 1993). The test has shown discriminant validity: for instance, scores on this task have been shown to be independent of performance on general executive functioning tasks (Gregory et al., 2002) and Stroop interference tasks (Mimura, Oeda, & Kawamura, 2006).

Whereas the RMET specifically asks for the identification of emotions, a range of studies indicates that performance on the RMET taps the ability to identify mental states in social situations more generally and is associated with a network of brain regions involved in social cognition, including the inferior frontal gyrus (IFG), a brain area associated with social perception. Indeed, a recent study concludes that “the left IFG plays a crucial role in reading the mind in the eyes, and it is probably part of a more general semantic working memory system that allows people to reason about the mental states of others” (Dal Monte et al., 2014, p. 14). In addition to brain imaging and lesion studies, there is an ever-increasing set of studies showing that performance on the RMET varies with levels of oxytocin, a neuropeptide associated with interpersonal closeness, affiliation, and attachment (see Domes, Heinrichs, Michel, Berger, & Herpertz, 2007). RMET performance is also facilitated by mindfulness meditation in which people are asked to relax and consider their bodily states (e.g., Tan, Lo, & Macrae, 2014). It is this general sensitivity of the RMET to inferring mental states (albeit mainly affective) that made it an attractive way of testing whether metaphor facilitates inferences about the mental states of others.

We examine whether RMET performance differs as a function of reading text containing metaphor or literal counterparts. To our knowledge, there has been no prior study of this nature, although recently Kidd and Castano (2013) have shown that the mere act of reading fiction led to better scores on the RMET than did the act of reading nonfiction. Kidd and Castano concluded that reading literary fiction temporarily enhances the detection and processing of information regarding the goals and intentions of other people and that the results more broadly suggest that the activation of these processes might be influenced by merely engaging with works of art. Here we argue for a specific role for fiction that involves going beyond the literal or surface sense of a sentence, such as found with metaphor. Consistent with the proposal of the philosopher Cohen (1978), we argue that the processing of metaphor invites one to consider the goals and the intentions of the person who issued the metaphor and that this invitation comes with a heightened sensitivity to seeing whether the invitation has been accepted. We argue here that compared to the reading of literal sentence counterparts, reading metaphor, even mundane metaphor one would encounter in daily conversation, not only creates a sense of intimacy but will also temporarily enhance one’s ability to detect the mental states of others, as operationalized by performance on the RMET. Thus, our general prediction is that, if the processing of metaphor enhances one’s ability to infer the state of mind of other people, we should find better performance on the RMET when participants read metaphors compared to when literal counterparts are read.

The three studies presented here follow a progression sequence. Since, to our knowledge, Horton (2007) has not been replicated, we conceptually replicate it in Experiment 1 and show further that the degree of intimacy found when metaphor is processed varies with RMET performance. Given that in Experiment 1 the effect on the RMET employed a correlational design, and thus the independent contributions of metaphor to RMET performance cannot be clearly identified, Experiments 2 and 3 employed a between-subjects experimental design. In Experiment 2, participants read target sentences (either metaphor or literal counterparts) and created for each a meaningful discourse context before completing the RMET. To anticipate the results, those who processed metaphor exhibited better RMET performance compared to those who processed the literal sentence counterparts. To examine whether the mere act of reading metaphor had an effect on RMET scores, participants read metaphors or literal counterparts presented without any context in Experiment 3.

Experiment 1

Horton (2007) demonstrated that when people read short texts involving interlocutors and one of the characters used metaphor, the characters were perceived as being in a closer relationship than when the character used a literal counterpart. To our knowledge, this finding has not been directly replicated, though Horton (2013) did find a related result. In Horton (2013), participants read brief stories that described interactions between two characters portrayed as being familiar with one another to varying degrees. The critical finding was that readers were as fast to read metaphoric utterances as literal utterances in the context of close, familiar relationships, but were slower to read metaphoric utterances in the context of unfamiliar relationships, indicating the degree of character intimacy played a role in construing the meaning of the text.

Here we attempt to replicate Horton (2007) and to more directly test whether degree of character intimacy is related to detecting the mental state of others. It should be noted that in Horton (2007), as well as in most metaphor studies, interpersonal effects are shown when people are directly asked questions pertaining to characters in the text (e.g., closeness of friends), and, presumably, involve the evocation of knowledge held by speaker and hearer of one another (see chapters in Colston & Katz, 2005, for reviews). Consequently, as in Horton (2007), participants will be asked to read a set of brief stories, similar to those he employed, and after each story will be asked a set of questions pertaining to the text they just read. Unexpectedly, after reading all of the stories, participants will be asked to complete the RMET. The critical test is whether degree of intimacy perceived for the characters in the story is related to the ability to identify emotions on the RMET.

Method

Participants

There were 24 undergraduate students employed in a pretest of the stimuli. A separate sample of 40 undergraduate students (mean age = 19.4, SD = 2.8; 23 females) from Western University, with English as their first language, was tested in the main study. Participants received one research credit for completing the study. None of the participants in the main study had served in the pretest.

Materials and procedure

As a pretest, 24 undergraduate participants rated a set of metaphorical and literal comments created by the researchers, similar to those found in Katz and Pexman (1997). Each item was rated along three 5-point Likert scales: familiarity, emotional intensity, and exaggeration. From this pretest, eight metaphorical (e.g., “What a gem of an idea”) and eight literal statements (e.g., “What a very good idea”), matched on a set of variables, constituted the final stimulus set (familiarity: 3.6 versus 3.3; emotionality: 3.2 versus 3.0; exaggeration: 2.7 versus 2.5). The 16 statements were each placed in a discourse context involving two friends. The stories consisted of a short description of an event involving two people, and then one of the characters made a statement relevant to the event. Two versions of each context were created, which differed in only one way: one version ended with a metaphorical statement whereas the other ended with a literal statement relevant to the situation. Each person was presented only one version of a scenario. Thus, the manipulation was within subject and consistent with procedures used in Horton (2007). To provide a sense of the stimuli employed, two examples are presented here:

Frank knew that Edward wasn’t reliable. Frank had told him some personal information and Edward told the rest of their friends about it. Edward suggested that Frank was prone to problems. Frank warned Kyle: “Be careful what you say to him.” (Metaphorical: “Watch your back around him”).

Maria had just completed a nursing course and graduated with honors. She thought that she would be able to get a good job. She was ready to celebrate her hard work. Julia saw Maria later that day and suggested they go out for dinner. Maria responded, “What a very good idea.” (Metaphorical: “What a gem of an idea”).

For each story, participants answered five questions on 5-point Likert scales (closeness of the speakers, degree of perceived emotional intensity, the degree with which the speaker’s friend could relate to the speaker’s experience, if the speaker might be like someone the participant knows, and someone the participant can relate to). As an analog to the measure employed by Horton (2007), the main question of interest was that involving closeness of the speakers, as an indicant of intimacy. The other questions were employed to provide additional information about interpersonal effects. Following this portion of the task, participants were unexpectedly asked to complete the RMET.

Results

Separate repeated measures ANOVAs were run for each of the five questions, with language condition (metaphor or literal language) as the independent variable. Two effects were reliable. When metaphor was used, participants believed the friends in the story to be significantly closer to one another (M = 3.7, SD = .42) relative to when a literal counterpart was employed (M = 3.5, SD = .48); F(1, 39) = 6.17, p = .02, with Cohen’s d for repeated measures using pooled SD corrected for the correlated factor = 0.99. The second reliable effect was that the perceived characters’ experience was rated as significantly more emotionally intense (M = 3.8, SD = .60) when metaphor was employed than when the literal counterpart was used (M = 3.6, SD = .60); F(1, 39) = 5.39, p = .02, Cohen’s d = 0.58.

Given that emotional intensity and closeness ratings differed between the metaphoric and literal conditions, we correlated the ratings for those two conditions for each participant separately with their respective accuracy on the RMET. Although variability on the emotional intensity ratings was more associated with the variability on performance on the RMET for the metaphorical conditions, the effects were not reliable for either condition. Importantly, we find effects with the critical measure of intimacy. Increasing ratings of the closeness perceived in the characters in the text are related to increasing accuracy on the RMET, but only in the condition in which the friends employed metaphor in the text, r(38) = .34, p = .03 and not when the text characters spoke literally, r(38) = .14, p = .36.

We observed as well that the metaphoric and literal closeness ratings were reliably correlated, r(38) = .70, p < .001, indicating that some people perceived the characters as closer to one another than did others, regardless of target sentence type. To eliminate effects due to this individual difference, we used partial correlation to assess the unique contribution of metaphoricity to our findings. Closeness ratings given in the metaphor condition continued to be significantly correlated with the RMET, r(37) = .34, p = .03, when we removed variance attributable to the literal condition. When we computed the reverse, the correlation of the literal ratings with the RMET score, controlling for the contribution of metaphorical language, not only remained nonsignificant but was now in the opposite direction, r(37) = -.14, p = .38.
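The partial correlations can be approximately reconstructed from the three zero-order correlations reported above, using the standard first-order partial correlation formula. The sketch below is illustrative only and is not the authors' analysis script: the function and variable names are hypothetical, and because the inputs are the rounded values printed in the text, the results can differ from the reported statistics in the second decimal place.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Rounded zero-order correlations as reported in the text (assumed inputs).
r_met_rmet = 0.34  # metaphor closeness ratings with RMET accuracy
r_lit_rmet = 0.14  # literal closeness ratings with RMET accuracy
r_met_lit = 0.70   # metaphor closeness with literal closeness ratings

# Metaphor ratings with RMET, removing variance shared with literal ratings.
met_given_lit = partial_corr(r_met_rmet, r_met_lit, r_lit_rmet)

# Literal ratings with RMET, removing variance shared with metaphor ratings.
lit_given_met = partial_corr(r_lit_rmet, r_met_lit, r_met_rmet)

print(f"metaphor | literal: {met_given_lit:.3f}")
print(f"literal | metaphor: {lit_given_met:.3f}")
```

With these rounded inputs the first value stays close to .34 and the second turns negative, which mirrors the pattern described in the text.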

In summary, we replicated the basic finding of Horton (2007) who showed reading metaphor produces a greater sense of social intimacy and extended this in a novel way: the degree of inferred intimacy is related to performance on the RMET. These findings are intriguing, but because the findings with the RMET are based on a correlational design we cannot infer that the processing of metaphor led to a temporary enhancement of the ability to infer the mental states of others. In the next two studies, between-group experimental designs will be employed.

Experiment 2

As noted above, we employ a between-subject experimental design here to see if the sample that processes metaphor will perform better on the RMET than the sample that processes literal counterparts. In addition, we employ a task that should provide additional information about the state of mind induced by processing metaphor: participants are given a set of metaphoric sentences or literal sentence counterparts and asked to provide a short narrative context in which the sentence would be used. It should be emphasized that no mention is made of the literal or nonliteral nature of the stimuli. Given the challenge to create “matched” literal and metaphorical sentences, the stimuli we employed were taken from a recent norming study (Cardillo, Schmidt, Kranjec, & Chatterjee, 2010). These norms provide a set of metaphoric and literal sentences that permitted us to choose items that controlled for a number of potentially confounding variables.

The aim of the production manipulation is twofold. First, reading the target sentence and thinking of an appropriate discourse context should force participants to elaborate each target sentence and place it in a metaphoric or literal discourse that they themselves had generated. Emphasizing the metaphoric or literal nature of the target sentences should provide personally relevant contexts for the target sentences. If the findings suggested in Experiment 1 were due to the preconstructed aspect of the contexts employed, then any differences in RMET performance between the two groups should be eliminated here. Naturally, finding better performance for the metaphor group on the RMET would be consistent with Cohen’s (1978) notion that metaphor induces a state that facilitates making inferences about the mental state of others, even when the metaphor is processed in a self-generated context. Second, the narrative context generated should provide empirical data on what characteristics came to the participants’ minds when reading metaphor or the literal counterpart. Consequently, the narratives produced will be content analyzed to see whether (and how) participants contextualize metaphors and literal counterparts, a procedure employed successfully to study sarcastic usages (Campbell & Katz, 2012).

Importantly, following the completion of the context-creation task, participants were again asked to complete an ostensibly unrelated task: the RMET. Because intimacy might be created by self-disclosure, a potential consequence of the production methodology we employ, we wanted to check whether metaphor might also induce more self-disclosure than literal statements. Finding differences here would implicate disclosure as a potential contributor to any RMET effects. Consequently, after completing the context-creation task, two scales from the emotional self-disclosure scale (Snell, Miller, & Belk, 1988) were also included.

Method

Stimuli were short sentences, taken from Cardillo et al.’s norms (2010; supplemental materials: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2952404/). The items were created by Cardillo and colleagues to be highly similar to one another. For instance, one such pair is: Metaphorical: “The woman dove into her knitting”; Literal: “The woman dove into the pool.” Cardillo et al. (2010) provided a number of different ratings on their materials, which we used to ensure the items we chose were equated on relevant, potentially confounding, factors. In our study, all of the sentences employed were written in the third person and were matched on familiarity and emotional valence in Cardillo et al.’s norms. The stimuli we employed were familiar: Mmet = 5.15, SD = 1.00, Mlit = 5.51, SD = .87, t(30) = 1.11, p = .26, on the 7-point scale, where 1 = not familiar. In the norms, emotional valence is based on the proportion of participants who indicated the stimuli had a positive valence. Our stimuli were only moderately emotionally positive: Mmet = .22, SD = .23, Mlit = .19, SD = .30, t(30) = .30, p = .75. To supplement information on Cardillo et al.’s (2010) norms, the sentences were analyzed with the LIWC word analysis software, and we ensured further that the stimuli were matched for number of pronouns overall, personal pronouns, affective words, adverbs, mental state words, words that refer to people, and motion. We discuss the LIWC next.

The Linguistic Inquiry and Word Count (LIWC) and coding the narrative protocols

The LIWC is a text-analysis program (Pennebaker, Francis, & Booth, 2001) that identifies and places words in 80 psychologically meaningful categories. Tausczik and Pennebaker (2010) provided a comprehensive discussion of the history of the LIWC and the research done with this instrument. The output from the program is the percentage of words within a text that come from a given category. Thus, if a text consisting of 500 words is examined, and 25 of those words are “cognitive mechanism” words, the score on that category for that text would be 25/500 = 5.00%. Given our question of interest, namely whether, relative to literal counterparts, the sense of intimacy created in reading metaphor is in part reflected by an enhanced ability to infer the mental states of interlocutors, we examined only a theoretically relevant subset of the categories available in the LIWC, namely, two types of category: one that refers to emotion and a second that refers to human thought (categorized in the LIWC as “cognitive mechanism” words).
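The category score described above is simply the share of a text's words that fall in a category, expressed as a percentage. The following sketch illustrates that arithmetic only. It is not the LIWC program: the tokenizer, the three-word dictionary, and the function name are assumptions made for illustration, whereas the real LIWC dictionaries are far larger and also match word stems.

```python
import re


def category_percentage(text: str, category_words: set[str]) -> float:
    """Percentage of tokens in `text` that belong to `category_words`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in category_words)
    return 100.0 * hits / len(tokens)


# Worked example from the text: 25 category words in a 500-word text.
worked_example = 100.0 * 25 / 500

# Hypothetical mini-dictionary standing in for a "cognitive mechanism" list.
COGNITIVE_WORDS = {"think", "feel", "intend"}

sample_context = "I think they intend to leave"
sample_score = category_percentage(sample_context, COGNITIVE_WORDS)

print(f"worked example: {worked_example:.2f}%")
print(f"sample context: {sample_score:.2f}%")
```

Because the score is a percentage of the total word count, it is comparable across contexts of different lengths.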

In our analyses, a WordPad file was created for each participant’s set of contexts created in response to the prompt sentences and analyzed with the Linguistic Inquiry and Word Count program. The following categories were extracted:

  • Cognitive mechanism words. This category consists of words such as think, feel, and intend and has been shown to be used when a person is conscious of what he or she is saying in order to frame a message (Pennebaker, Slatcher, & Chung, 2005). These words are relevant here for a number of reasons: use of mental-state terms has been shown to be predictive of the ability of children to understand the mental states of others (Adrian, Clemente, Villanueva, & Rieffe, 2005) and tends to be found more frequently in fictive and interactive, interpersonal contexts, such as blogs and conversation, than in factual or literal documents, such as scientific articles and nonemotional writing (Pennebaker et al., 2007). Moreover, greater use of cognitive mechanism words has been found with friends compared to strangers (Marsh, Tversky, & Hutson, 2005), and such words are employed in pretense and lies (Newman, Pennebaker, Barry, & Richards, 2003), two processes that are highly dependent on recognizing and manipulating the beliefs held by others of oneself (Winner, Brownell, Happe, Blum, & Pincus, 1998).

  • Emotional categories. Several different categories were extracted. Affect word usage (e.g., words such as happy, sad, and abandon) was examined because it is a direct indicant of the emotional state of mind in which the participant placed the metaphor or literal statement. Adverbs (e.g., very or really) were examined because they add emphasis to speech and text and tend to occur with emotionally intense experiences (see Gayle & Priess, 1999) and, importantly, are found in nonliteral language use in communication with friends (Whalen, Pexman, & Gill, 2009). We examined the use of personal pronouns (particularly the first-person pronoun) because they convey self-disclosure of the writer (see Van Hell, Verhoeven, Tak, & Van Oosterhaut, 2005) and because of evidence that personal pronouns “reveal how the writers view themselves, their relationships with readers and their relationship to the discourse community in which they belong” (Kuo, 1999, p. 123). It is claimed that pronouns enhance common ground (Jucker & Smith, 1996) and help an audience embody a speaker or writer’s perspective (Brunye, Ditman, Mahoney, Augustyn, & Taylor, 2009).

  • Idiomatic coding. Finally, given that LIWC (and most other programs of its kind) cannot identify metaphor, we were concerned that a blind use of the program might underestimate markers of import to the metaphoric intention participants might use in the contexts they created. Given that the RMET asks participants to identify the emotions being conveyed in pictures of eyes, we were especially interested in whether the constructed contexts provided additional evidence of emotional content. As such, we examined the contexts and coded for idiomatic expression of emotion (e.g., “I can’t stand it anymore!”). Emotional idioms have been shown to play a role relevant to interpersonal intimacy. For instance, idioms are used in intimate relationships to express emotion and promote cohesiveness (Hopper, Knapp, & Scott, 1981) and are used to signal one’s social presence in online communities, suggesting this type of speech can capture an audience’s attention (e.g., Delfino & Manca, 2007). The items analyzed here were identified as idioms of emotion, and agreed upon, by two raters working independently, with an intraclass correlation of .90 between raters.

Participants and procedure

Sixty-nine participants from Western University (49 females, mean age = 18.00, SD = 2.40), with English as a first language, took part in the study. The sample size employed in this study is slightly larger than is typically found in the experimental metaphor literature to ensure that it is sufficiently large to find LIWC effects. Participants completed the study online for course credit. They were provided a link that randomly assigned them to a metaphor (n = 33) or literal (n = 36) condition with the related counterbalancing of the individual differences measures. Baron-Cohen et al. (2001) reported that scores on the RMET task in undergraduate and normal populations fall between 17 and 35 out of a possible 36. Five participants were eliminated from analyses involving the RMET because their scores fell below 17 (two from the metaphorical group and three from the literal group).
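The screening rule described above, which drops RMET scores below the lower bound of 17 before the RMET analyses, can be sketched as follows. This is a minimal illustration rather than the procedure actually coded by the authors: the participant identifiers and scores are invented, and the function name is hypothetical.

```python
RMET_MIN_SCORE = 17  # lower bound of the range cited from Baron-Cohen et al. (2001)
RMET_MAX_SCORE = 36  # number of items on the test


def split_by_rmet_floor(
    scores: dict[str, int],
) -> tuple[dict[str, int], dict[str, int]]:
    """Return (kept, excluded) participants based on the minimum RMET score."""
    for participant, score in scores.items():
        if not 0 <= score <= RMET_MAX_SCORE:
            raise ValueError(f"impossible RMET score for {participant}: {score}")
    kept = {p: s for p, s in scores.items() if s >= RMET_MIN_SCORE}
    excluded = {p: s for p, s in scores.items() if s < RMET_MIN_SCORE}
    return kept, excluded


# Invented scores for illustration only.
example_scores = {"p01": 28, "p02": 16, "p03": 17, "p04": 31}
kept, excluded = split_by_rmet_floor(example_scores)

print("kept:", sorted(kept))
print("excluded:", sorted(excluded))
```

A score of exactly 17 is retained here, on the reading that only scores falling below 17 were eliminated.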

Participants were told they were going to complete a writing task followed by two questionnaire tasks. Each participant was presented with either the 16 metaphorical sentences or the 16 literal counterparts, one at a time. They were not made aware of the nature of the questionnaire tasks. It should be emphasized that no mention was made of the literal or nonliteral nature of the stimuli. Participants were provided the following instructions: “In this study you will read short sentences and will create a context or scenario in which you think these sentences would occur. You will write approximately 2–4 lines per scenario. You can write anything you want as long as it is able to be comprehended.” Following completion of the writing task, participants completed the RMET and the scales from the emotional self-disclosure task (counterbalanced). Upon completion, participants were debriefed online.

Results and discussion

LIWC analysis of contexts

Before the items were subjected to the LIWC, prompt sentences were removed if they were used by the participants in creating their context. Results show that the number of words used did not differ reliably when constructing contexts to metaphorical (M = 383.94, SD = 144.62) and literal (M = 329.91, SD = 135.53) prompts, t(67) = 1.60, p = .12. The data below are reported as the percentage of words in the constructed text that were classified to a given category, controlling further for any differences in the size of the discourse context produced. Idiom production is based on the number of idioms produced by each participant in each of the constructed contexts.

Of theoretical interest, participants in the metaphor context-building condition used a significantly greater percentage of cognitive mechanism words (M = 15.96, SD = 2.98) than those in the literally prompted group (M = 12.33, SD = 4.05), t(67) = 4.22, p = .001, Cohen’s d = 1.02. Moreover, the contexts created to metaphoric prompts contained significantly more idiomatic emotional expressions (M = 1.70, SD = 1.77) than those in the literal group (M = .69, SD = .82), t(67) = 3.14, p = .003, Cohen’s d = 0.73. There were no differences between the metaphor and literal context-building groups on the percentage of adverbs, t(67) = 1.09, p = .24, or of affect words, t(67) = .27, p = .78, or personal pronouns, t(67) = -.98, p = .35.
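As a rough check on how such statistics relate to the descriptive values, an independent-samples t and Cohen's d can be recomputed from the reported means, standard deviations, and group sizes. The sketch below assumes a pooled-variance (Student) t test and a pooled-SD version of d; the source does not state which variants were used, so both choices, along with the function names, are assumptions. Because the inputs are rounded, the results only approximate the reported values.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def cohens_d(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference using the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Pooled-variance t statistic with n1 + n2 - 2 degrees of freedom."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / standard_error


# Cognitive mechanism words: metaphor group (n = 33) vs. literal group (n = 36).
cog_stats = (15.96, 2.98, 33, 12.33, 4.05, 36)
cog_d = cohens_d(*cog_stats)
cog_t = student_t(*cog_stats)

print(f"cognitive mechanism words: t(67) ~ {cog_t:.2f}, d ~ {cog_d:.2f}")
```

For the cognitive mechanism comparison, this gives values within a few hundredths of the reported t and d.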

The Reading the Mind in the Eyes Test (RMET) task and self-disclosure measures

The website’s randomization resulted in more participants completing the self-disclosure task first and the RMET task second. Therefore, order of the task was included as a covariate in overall group difference analyses. Importantly for the hypothesis under consideration, participants in the metaphor group performed reliably better on the RMET (M = 27.69, SD = 3.13) than those in the literal group (M = 25.75, SD = 4.30); F(1, 66) = 4.22, p = .04, Cohen’s d = 0.52.

Some exploratory analyses were conducted examining performance on the RMET and percentage of words produced in the select LIWC categories examined here. The only reliable finding was that performance on the RMET was significantly positively correlated with the percentage of affective words produced in contexts created for the metaphoric prompts, r(31) = .39, p = .03, but not when produced to the literal prompts, r(34) = -.03, p = .82. RMET performance did not correlate with use of idioms or cognitive mechanism words. Finally, no differences were found between the metaphor and literal prompted groups on any analysis performed on the self-disclosure scales.

In summary, this experiment has provided two novel findings. First, when asked to create comprehensible contexts for sentences, participants situate metaphors in contexts richer in both cognitive mechanism words and idiomatic emotional expressions than the contexts they create for comparable literal sentences. Second, in line with the findings of Kidd and Castano (2013) and consistent with theorizing by Gallese (2006), reading metaphors and thinking of their wider context temporarily enhances the ability to identify the mental states of others, as measured by the RMET. Neither reading metaphor nor RMET performance was associated with the self-disclosure ratings. One should note that the contexts created for both types of prompt were forms of fiction, and as such the RMET differences we observe are not simply a matter of fiction versus nonfiction.

In Experiment 1, metaphors and literal sentences were read within a short, predetermined discourse context and, in Experiment 2, metaphors or literal counterparts were employed to prompt the creation of a discourse context. Thus, the effects relating RMET performance to metaphor processing shown in our first two studies could be due to reading the metaphors per se, to considering the metaphor within an explicit fictive context, or to both.

In Experiment 3, we examine whether one can show facilitation on RMET performance when extended fictive context is not presented or generated, and metaphorical or literal statements are read in isolation. In addition to the RMET task, we introduce another task that is intended to tap what the participant might be thinking when reading the target sentences. Recall that in Experiment 2 we employed a task in which participants were asked to create discourse contexts to give us insights into their thinking while processing metaphor or literal counterparts. This study is thus a conceptual replication of Experiment 2, with the exception that participants did not have to create fictive contexts for the sentences they were reading, which, hence, permits us to examine whether the mere act of reading metaphor transfers to facilitated performance on the RMET.

Experiment 3

In this experiment, participants were asked to read a set of metaphorical sentences or literal counterparts on a computer screen, and we recorded the speed with which they did so. Critically, following the reading task, participants were asked to do two ostensibly unrelated tasks: (1) as in the two earlier studies, pick the correct emotion depicted from each set of photographs of eyes (the RMET; Baron-Cohen et al., 2001), and (2) generate a different noun for each of a set of verbs. We predict that if the mere act of reading metaphor induces an orientation towards social interaction, we should find reliably better identification of emotional states in the seemingly unrelated RMET. The noun-generation task is another gauge of social inference. If the reading of metaphor invites consideration of a social interpersonal context, we should see a greater number of nouns produced that involve a human agent (e.g., mother, father) after reading metaphorical sentences relative to when literal counterparts had been read.

Method

Participants

The scores from 39 undergraduate students (25 females) from Western University with English as a first language (mean age = 18.56, SD = 1.80) were analyzed. Two additional participants were removed from the study: one for showing reaction times longer than 2 standard deviations above the mean and one for failing to complete all parts of the study. Of the removed participants, one was from the literal group and one from the metaphorical group. Participants received one research credit for completing the study. Participants were randomly placed in a metaphor (n = 20) or literal (n = 19) group. None of the participants had served in either of the first two studies.

Procedure

Materials were 44 metaphorical and literal statements taken from the Cardillo et al. (2010) norms and were chosen so that the last word of the sentences was the same between the two groups (e.g., Metaphorical: “The price change was a major drop”; Literal: “The bungee jump was a scary drop”). Item pairs had the same number of words per sentence, and all items were matched on emotional valence, as determined by the proportion of people who rated the comment as positive (Metaphor M = .27, SD = .31; Literal M = .26, SD = .28, t(114) = .10, p = .91). Additionally, sentences were analyzed with LIWC and matched on pronouns, affect, social, motion, and cognitive mechanism words.

These short sentences were presented on a computer screen using E-Prime (Schneider, Eschman, & Zuccolotto, 2002). Participants read the sentences word by word and occasionally answered comprehension questions about the text they had just read, to ensure they were paying attention. The questions indexed basic comprehension. For instance, given the sentence “His illness was a slow drift,” the question would be “Was he sick?” (yes/no). There were 14 questions, randomly presented, for each group. Overall, less than 1% of the questions were answered erroneously, and the few trials on which errors occurred were eliminated from further analyses. Following the reading task, participants completed the RMET and the noun-generation task (counterbalanced across participants).

The noun-generation task, akin to verb-generation tasks (e.g., Holland et al., 2001), requires participants to produce a noun for a given action verb (e.g., hugging). Participants are asked to provide the first noun that comes to mind for 30 verbs presented one at a time on the computer screen. All participants were given the same verbs as prompting cues. Half of the verbs were taken from the sentences the participants read, and half were verbs not previously presented.

Results

RMET

Participants in the metaphor group did significantly better on the RMET (M = 29.60, SD = 2.16) than those in the literal group (M = 25.80, SD = 3.50), t(37) = 4.04, p = .001, Cohen’s d = 1.31, replicating the findings of Experiment 2. Performance on the RMET did not correlate with RTs to the last word in either the metaphor sentences, r(18) = .16, p = .4, or the literal sentences, r(17) = .11, p = .6.
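As a check on how the effect size relates to the summary statistics, Cohen's d and the independent-samples t can be recomputed from the reported means, standard deviations, and group sizes (n = 20 and n = 19, from the Participants section). This is a minimal sketch that assumes the pooled-standard-deviation form of d and an equal-variance Student's t test; the text does not say which formulas were used, and the function names are our own.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation (an assumption here)."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def independent_t(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance Student's t and its degrees of freedom."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1.0 / n1 + 1.0 / n2)
    return (m1 - m2) / se, n1 + n2 - 2


# RMET summary statistics as reported: metaphor group first, literal group second.
d = cohens_d(29.60, 2.16, 20, 25.80, 3.50, 19)
t, df = independent_t(29.60, 2.16, 20, 25.80, 3.50, 19)
print(f"Cohen's d = {d:.2f}, t({df}) = {t:.2f}")
```

The recomputed d agrees with the reported value of about 1.31. Small discrepancies in t are to be expected, because the inputs are rounded summary statistics rather than raw scores.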

Noun-generation task

The nouns generated for each of the 30 verbs were analyzed with the Linguistic Inquiry and Word Count program (LIWC; Pennebaker et al., 2001). The data here are average word counts, with a maximum of 30 responses per participant. We are interested in the LIWC category that captures nouns denoting social roles (e.g., mother, father, friend; see Footnote 2). We found that the group who had read the metaphors prior to doing the noun-generation task produced significantly more social role words (e.g., mother, when given a verb such as hugging), M = 7.3, SD = 4.0, than the group who read the literal sentences prior to the task, M = 4.6, SD = 2.6, t(37) = 2.41, p = .02, Cohen’s d = 0.80. Paired t tests show that responses did not differ when participants were prompted with old verbs (M = 3.6, SD = 1.9) or new verbs (M = 3.7, SD = 2.4), t(19) = -.36, p = .72.

Because of the exploratory nature of this task in this context, we examined the whole range of categories. Only one other category exhibited a reliable difference between the groups: people reading literal sentences produced significantly more biological words (e.g., hand, to the same verbs), M = 3.9, SD = 1.5, compared to those who read metaphors, M = 2.4, SD = 1.6, t(37) = -2.64, p = .02, Cohen’s d = 0.97. There were no reliable differences between groups in the generation of biological nouns when prompted with new or old verbs. Although we had only predicted an effect for social words, this unexpected finding is consistent with the growing literature on the bodily embedding of actions (e.g., Willems, Hagoort, & Casasanto, 2010).

In summary, and consistent with the other experiments presented in this paper, we find that the mere act of reading metaphors led to an orientation towards interpersonal social information, demonstrated implicitly by superior recognition of emotional states on an unrelated task (the RMET), and by a reliably greater use of words that describe human agents when cued by a verb. These effects occur without an elaborative text in which a speaker utters the metaphor and, indeed, when there is no obvious connection to the secondary tasks employed.

General discussion

The results of the three studies are as follows. First, when tested between subjects, reading metaphor led to superior performance on the RMET. This was found when the metaphors were employed as a prompt for creating a comprehensible fictive context (Experiment 2) or when there was no explicit discourse context at all (Experiment 3). Second, when tested within subjects, the sole reliable relationship was that the degree of intimacy created in the mind of the reader through reading metaphor (relative to a literal counterpart) was related to the degree to which emotional states were identified on the RMET. Third, relative to reading literal information, reading metaphor creates what can generally be considered a social or interpersonal impact. This is shown by the differences in the discourse contexts created to metaphors rather than literal prompts (Experiment 2), by the inducement of a sense of intimacy and emotional impact (Experiment 1), and by the implicit activation of social role actors in the ostensibly unrelated noun-generation task (Experiment 3). It should be noted that the effect in the noun-generation task was found both for verbs used as the prompting stimuli and for a set of verbs that had not been used previously, suggesting that reading metaphor generates a set or orientation for interpersonal information that is independent of specific priming of previously seen verbs. Finally, the data presented here suggest that RMET performance is tied to specific characteristics engendered by reading metaphor, namely the generation of affective context (Experiment 2) and the creation of interpersonal intimacy (Experiment 1).

The natural question, of course, is why reading metaphor leads to enhanced RMET performance (relative to the literal controls). The RMET was constructed as a first-order theory of mind (ToM) test, that is, a test of the ability to infer the mental state of another person (e.g., “I believe she is happy”) and not a second-order ToM test, or a test of the inferences drawn regarding what activated the mental state (e.g., “I believe she is happy because she has had her paper just accepted in Memory & Cognition”). As such, one can speculate that knowledge about the mechanisms that underlie first-order ToM performance would inform the findings reported here. Unfortunately, there is no universally accepted theory about the processes involved in ToM performance or even about the best way to test ToM activities (see Byom & Mutlu, 2013). Nonetheless, there are various speculations about ToM that might prove useful as frameworks for understanding our findings.

In their review of ToM, Byom and Mutlu (2013) argued that people use three types of cues in making attributions of another person’s mental state: shared world knowledge, the perception of social cues, and making inferences from the actions of others. They further situate RMET performance as an ability falling within the use of social cues, perhaps dependent on knowledge learned early in life about where and when people gaze at another person or object. From this perspective, metaphor would serve both in activating shared social knowledge—as indicated here by the contextualizing of metaphor as engaging mental and emotional content (Experiment 2), interpersonal intimacy (Experiment 1), and social agents (Experiment 3)—and in increasing one’s sensitivity to the social cues in their environment. Although one can speculate that these types of cues play a role in the results we obtain, there is theoretical disagreement in how they might be implemented and used, as discussed below.

In general, two overarching approaches have been proposed in the theoretical and philosophical literature for results such as those found here, namely, theory-theory and simulation-theory (see Goldman, 1992). The former position holds that over their lifetime individuals learn basic or naive (folk) theories of psychology and use these beliefs to compute the mental states of others. In essence, performance here is based on explicit reasoning and not implicit mechanisms. As an instantiation, a person, on seeing a certain expression, could activate a well-articulated stored theory and can then make the decision that, for instance, “Something I just said to Joe has made Joe angry.” One can see how this approach can explain performance on the RMET in general and could be used to explain some of the group differences that have been seen with that instrument, such as poor performance by samples from individuals with autism and from various other psychological disorders, if one assumes that in these groups either the prior learning of folk theories was deficient or that the disorder has compromised reasoning abilities. However, it is less clear how theory-theory can explain the modulation found here unless one wishes to postulate and defend the notion that the reading of metaphor (compared to literal counterparts) is more likely to activate folk theories of mind, thus priming reasoning about these theories when faced subsequently with the RMET.

Simulation-theory, on the other hand, argues that people do not go around with fairly sophisticated theories about the mind but rather use their own, sometimes fragmentary, bodily reactions to make inferences about others. One can now find an ever-growing literature postulating how simulations might occur, the embodiment of comprehension, and the neural underpinnings involved. For instance, Gallese (2005, 2006) has proposed a neural mechanism based on mirror neurons for simulation that mediates between one’s own experiences and the understanding of the emotions and sensations that are produced in an interlocutor. His theory also suggests the creation of intimacy, such as we found in Experiment 1. As Gallese (2006, p. 16) put it, “there is also an experiential dimension of interpersonal relationships which enables a direct grasping of the sense of the actions performed by others, and the emotions they experience. This dimension of social cognition is embodied in that it mediates between the experiential knowledge we hold of our lived body and the experience we make of others”. He claims further that when confronting the intentional behavior of others, people experience a specific phenomenal state of “intentional attunement.” This phenomenal state generates a peculiar quality of familiarity with other individuals, allowing correspondence between one’s own intentions with those of the interlocutor. Moreover, Gallese (2006) posited that a deficiency in this neural mechanism might be responsible for some of the social impairments found with individuals with autism, which, as noted earlier, are impairments captured by performance on the RMET.

Metaphor in particular has been analyzed from embodiment and simulation perspectives (see Gibbs, 2006; Ritchie, 2006). Ritchie, for instance, claims that metaphor has its origins in social interactions, such as speech, and that processing metaphor, even when presented without context (as we did in Experiment 3), evokes an extralinguistic context involving past memories, thoughts, and emotions. He avers that literal language calls upon this extralinguistic contextual support to a lesser extent. As extended to the data presented here, the simulation perspective would be that these bodily reactions would sensitize one to the cues to bodily reactions presented in the RMET pictures. Gibbs (2006) provided the most detailed explanation of embodiment in the processing of metaphor, giving examples of how the mere act of reading metaphor activates motor or sensory mechanisms as part of the act of comprehending. The evidence is that people simulate the motoric, sensory, or emotional states described in a metaphor. As with the theory-theory account, extending this account to our findings necessitates fleshing out the processes whereby the embodied reactions of the person who read the metaphor sensitize him or her more generally to identifying mental states in others, and to a greater extent than when reading literal sentence counterparts.

Whether theory-theory or simulation-theory approaches provide a better fit for the data presented here remains a task for further research, as does identifying how long the effect of metaphor on RMET performance lasts. Regardless, the findings presented here provide novel evidence that metaphor plays a special role in orienting one to the mental states of others. Across reading and writing tasks, we see effects of metaphor on the RMET. The findings are congruent with the philosophical speculations of Cohen (1978) and the findings on intimacy reported initially by Horton (2007), and they extend the findings showing that reading fiction promotes higher scores on the RMET (Kidd & Castano, 2013).