Metaphors (e.g., lawyers are sharks) are extremely common in language, and can make complex or abstract concepts easier to understand. They are composed of a topic (lawyer), the subject to which features are ascribed, and the vehicle (shark), the object whose features are borrowed. How metaphors are processed has, for several decades, been the subject of intense debate within the field of psychological science. On one hand, metaphor comprehension can be viewed as a cognitively demanding task, such that when a metaphor is encountered, both the figurative (lawyers are aggressive) and the literal (lawyers have fins) meanings are activated and compete for access, and individuals need to inhibit the literal meanings in order to successfully compute the intended (figurative) meaning (dual access account; e.g., Giora, 1997). On the other hand, metaphors are ingrained in our everyday language, and often there is no need to resolve any competition for access between literal and figurative meanings because the figurative meaning can be directly accessed without the need to evaluate the literal meanings (direct access account; e.g., Gibbs, 1994). Which of the two proposed theories better explains how metaphors are actually processed? The answer is not that simple. A central issue is the consideration of other factors that could contribute to metaphor comprehension.

One primary difference between the dual access and direct access accounts is that the former involves the activation and subsequent inhibition of literal information from consideration, which suggests that executive functions might be necessary to actively inhibit features that do not map semantically between the topic and vehicle items. Executive functions comprise three core mental processes: shifting (directing attention to newly relevant information), updating (screening incoming information for task-relevance and then revising or integrating it with the current goal), and inhibition (suppressing task-irrelevant information; Miyake et al., 2000). All three processes share a common underlying mechanism that contributes to metaphor comprehension, as indicated by a strong relationship between measures of executive functions and metaphor comprehension (e.g., Carriedo et al., 2016; Chiappe & Chiappe, 2007). Moreover, these studies seem to use both the broader term, executive functions, and the more specific term, inhibitory control. While executive functions may underlie all three processes, inhibitory control may play a more specific role in metaphor comprehension. Indeed, there is evidence that inhibitory control processes filter out inappropriate features of the metaphor vehicle to compute the figurative meanings (e.g., Gernsbacher et al., 2001). However, there is also evidence that individuals can interpret some metaphors faster than they can interpret literal phrases, suggesting that the corresponding figurative meanings may be readily available without the need to evaluate or inhibit literal meanings (Glucksberg et al., 1982).

In addition to the role of executive functions, and more specifically inhibition, an important moderating factor potentially contributing to these conflicting findings is the metaphor itself: not all metaphors are alike. A metaphor when first encountered is novel, but with repeated exposure becomes conventionalized. Thus, metaphors can vary from being novel (ideas are diamonds) to more conventional (arms are steel), which could affect the extent to which inhibitory control processes are recruited to facilitate comprehension. For example, the figurative meaning of a conventional metaphor may be easily accessed because the meaning is lexicalized, whereas the figurative meaning of a novel metaphor needs to be computed first and therefore requires inhibitory control to select the appropriate meaning from alternate, literal meanings. Unlike the dual access account, the direct access account does not make claims about metaphor conventionality and therefore does not assign a special role to inhibitory control. Instead, it emphasizes the role of context, such that there is no difference in processing between literal and figurative phrases if sufficient contextual support is available (e.g., Gibbs, 2002). Conversely, the dual access account proposes that the involvement of inhibitory control depends on the salience (e.g., conventionality) of the metaphor. The more frequently used or conventional a metaphor, the more salient or accessible the figurative meaning, without the need for inhibitory control. The less frequently used or more novel a metaphor, the more salient or accessible the literal meaning, which then requires inhibitory control to successfully compute the figurative meaning (Giora, 1997). With these two theoretical accounts in mind, the objective of the current project was to examine the extent to which inhibitory control processes are recruited during the processing of metaphors that vary in conventionality.

Previous research on different types of metaphors shows that the threshold for the activation of figurative meanings among more established metaphors is often as strong as or stronger than that of their corresponding literal meanings (Bowdle & Gentner, 2005). In the case of novel metaphors, the figurative meaning may not be initially obvious, and inhibitory control processes may be required to override the activation of irrelevant features. It is at precisely these times when literal meanings need to be inhibited. Indeed, it has been found that novel metaphors tend to be associated with a greater processing cost compared with more familiar or conventional metaphors (e.g., Bowdle & Gentner, 2005; Lai & Curran, 2013; Lai et al., 2009; Pierce et al., 2010), and that frontal brain regions associated with more effortful semantic processing are recruited during novel metaphor processing (Mashal et al., 2007; Rutter et al., 2012). Moreover, Amanzio et al. (2008) found that compared with healthy controls, Alzheimer’s patients, who have deficits in executive functions, have trouble processing novel metaphors, but not conventional metaphors. The authors proposed that for novel metaphors, for which no figurative meanings are accessible, executive functions are necessary to compute the meanings based on the possible relations between novel topic-vehicle pairings, but for conventional pairings, the associated meanings are easily accessible, without the need to evaluate the literal meanings. Incidentally, the finding that nonliteral meanings are typically retrieved from semantic memory is evident in the literature on idiom processing (Gibbs, 1980; Swinney & Cutler, 1979). Although this line of research has contributed significantly to our understanding of metaphor comprehension, there are some methodological differences that should be addressed with respect to the roles of inhibitory control and metaphor conventionality.

First, studies that examine executive functions, including inhibitory control, in metaphor comprehension sometimes include measures that recruit not only domain-general processes, such as inhibitory control and working memory, but also some aspects of domain-specific processes, such as reading ability and vocabulary knowledge. For example, in the original reading span task, participants read a series of sentences and remember the final word of each sentence for later recall (Daneman & Carpenter, 1980). So, although this task draws on several executive processes, it also relies heavily on specific reading abilities related to the auditory-verbal subsystem (Baddeley, 2003; Daneman & Merikle, 1996; Ericsson & Kintsch, 1995), such that reduced performance on such reading tasks is associated with higher instances of reading disability (Siegel, 1994). Thus, from a theoretical standpoint, such measures make it difficult to interpret the true role of executive functions, particularly inhibitory control, which should be domain general: if individuals indeed inhibit literal meanings, this process should not be related to their reading ability or vocabulary. Moreover, using a domain-specific measure could artificially inflate the relationship between executive functions and metaphor comprehension because conventional metaphors are well established due to repeated usage and are therefore more likely to be associated with crystallized knowledge and vocabulary. Evidence for this assertion comes from Beaty and Silvia (2013), who reported that the production of conventional metaphors was associated with crystallized intelligence, but not with fluid intelligence (which is closely linked to inhibitory control; Ackerman et al., 2005), suggesting that conventional metaphor generation may recruit prior knowledge with minimal executive resources.

Second, while studies have examined how executive functions interact with different types of metaphors, these metaphors tended to vary in familiarity (e.g., Columbus et al., 2015; Mashal, 2013) rather than in conventionality. Whereas familiarity of a metaphor is determined by the topic-vehicle pairing (the overall sense of familiarity; e.g., alcohol is a crutch), conventionality of a metaphor is determined primarily by the vehicle (ice cream is a crutch; Bowdle & Gentner, 2005). That is, familiarity reflects the extent to which both the topic and the vehicle are used together figuratively, while conventionality reflects the extent to which the vehicle is conventionally used to build a metaphoric meaning. Hence, conventional metaphoric uses of a vehicle can occur either with a frequent topic pairing or with an infrequent topic pairing. The two types of metaphors (familiar vs. conventional) have different effects on metaphor comprehension, although the exact nature of these differences is debatable and open to investigation. Perhaps the extent to which executive functions, particularly inhibitory control processes, are engaged in computing figurative meanings is similar between unfamiliar and novel/nonconventional metaphors, but differs between familiar and conventional metaphors. Familiar metaphors may be easier to interpret because, unlike conventional metaphors that consist only of a lexicalized vehicle, familiar metaphors consist of both a lexicalized vehicle and a lexicalized topic. As such, familiar metaphors may be more likely to be associated with a figurative meaning and therefore easily retrievable, and more likely to use semantic processing that involves long-term storage (Mashal, 2013). Another related issue is that several metaphor studies use the terms familiarity and conventionality interchangeably, when they are in fact measuring familiarity (e.g., Kenett et al., 2018).

To summarize, treating familiarity as a measure of conventionality and treating domain-specific and domain-general processes alike as indications of true executive functions are overgeneralizations that could be contributing to the disagreement among findings in the dual access literature. In the current set of experiments, we vary metaphor conventionality and use domain-general measures to examine the relation between inhibitory control and metaphor comprehension.

Current project

In Experiment 1, we used an individual-difference paradigm to examine whether an inhibitory control measure predicts response time on a sense–nonsense task. In the sense–nonsense task, participants indicated as quickly as possible whether a metaphorical or matched literal phrase had a sensible meaning. Response times on this task provide information about how easily a metaphor is interpreted. In the inhibition task, a modified version of the Eriksen flanker task (Eriksen & Eriksen, 1974; Ridderinkhof et al., 2020), participants indicated the direction of the central arrow (pointing left or right) as quickly and accurately as possible while ignoring flanking peripheral arrows. Response times on this task assess an individual’s ability to inhibit task-irrelevant responses. According to the dual access account, if more novel metaphors require greater inhibitory control (because the literal meaning needs to be inhibited) and more conventional metaphors do not (because their meanings are directly retrieved from long-term memory), then scores on the flanker task should predict response time on the sense–nonsense task only for novel metaphors. However, if the flanker task scores do not predict response time for either novel or conventional metaphors, it would provide evidence for the direct access account, as this account does not make processing distinctions between different types of metaphors (i.e., it does not assign any special role to inhibitory control).

In Experiment 2, we used a dual-task paradigm in which participants performed a sense–nonsense task while simultaneously performing an n-back (executive functions) task designed to tax inhibitory control processes required for novel metaphor processing. In this n-back task (Kirchner, 1958), a temporal sequence of numbers was presented in between each metaphorical and literal phrase presentation, and participants indicated whether a currently presented number was identical to the number presented two trials ago (dual load) or if the currently presented number was odd or even (single load or control condition). This paradigm tests the notion that if the primary task (sense–nonsense task) recruits inhibitory control processes, then responding to the secondary task (n-back task) simultaneously will exceed attentional capacity and performance on either task will be impaired. But, if the primary task does not recruit inhibitory control processes, then performance should remain unaffected on either task. The dual access account proposes that inhibitory control contributes differently depending on the conventionality of the metaphor. We predict that accuracy on the sense–nonsense task should be similar for more conventional metaphors between single and dual load conditions because the figurative meanings of these metaphors should be easily accessible. However, accuracy should be lower for more novel metaphors in the dual load than in the single load condition because computing figurative meanings of these metaphors should require the inhibition of the literal meanings, a process that will presumably be taxed under the load condition. Conversely, the direct access account does not make any claims about differences in the processing of conventional and novel metaphors. Therefore, we predict no differences in accuracy between the two load conditions regardless of metaphor conventionality.
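The two response rules of the secondary task described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the software used in the experiment; the function names and the example number sequence are hypothetical.

```python
from typing import List, Optional


def two_back_targets(numbers: List[int], n: int = 2) -> List[Optional[bool]]:
    """Dual-load rule: does the current number match the one shown n trials ago?

    The first n positions have nothing to be compared with, so they are None.
    """
    answers: List[Optional[bool]] = []
    for i, value in enumerate(numbers):
        if i < n:
            answers.append(None)
        else:
            answers.append(value == numbers[i - n])
    return answers


def parity_answers(numbers: List[int]) -> List[str]:
    """Single-load (control) rule: is the current number odd or even?"""
    return ["even" if value % 2 == 0 else "odd" for value in numbers]


# Hypothetical sequence of numbers shown between phrase presentations.
sequence = [3, 7, 3, 8, 3, 8]
print(two_back_targets(sequence))
print(parity_answers(sequence))
```

The contrast between the two functions mirrors the load manipulation: the parity rule depends only on the current number, whereas the two-back rule requires holding and updating the two preceding numbers while a phrase is being judged.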

Experiment 1

Experiment 1 used an individual-difference paradigm to examine the extent to which metaphor conventionality and inhibitory control contribute to metaphor comprehension. Although several studies in the literature equate metaphor processing with some aspect of intellectual ability and higher-level cognition, most of them typically assume that all individuals process different metaphors in a similar manner. But not all individuals are alike; they vary in their ability to inhibit task-irrelevant thoughts and responses. This suggests that different individuals might process different metaphors differently. To the extent that inhibitory control processes are required to interpret at least some types of metaphors, individuals with more efficient inhibitory control may be faster at processing those metaphors compared with individuals with less efficient inhibitory control.

Only recently have researchers begun to isolate the variance associated with individuals that is often left in the general error term, and to include it as a dependent measure to better understand the executive processes involved in metaphor comprehension. These studies show that an individual’s working memory capacity can constrain metaphor comprehension. Working memory is related to inhibitory control (Chiappe et al., 2000; Hasher et al., 1999) and is responsible for coordinating different language units that affect meaning computation. Inhibitory control requires a high working memory capacity to simultaneously hold both literal and figurative interpretations of a phrase, identify the relevant (figurative) aspects, inhibit the irrelevant (literal) aspects, and update the ongoing meaning. Chiappe and Chiappe (2007) examined the time individuals took to generate metaphor interpretations as a function of their working memory capacity and inhibitory control, as measured by the listening span task and Stroop task, respectively. Those with higher working memory spans and better inhibitory control were faster and more accurate at interpreting metaphors than those with lower working memory spans and poor inhibitory control. In a different study, Pierce et al. (2010) further demonstrated that working memory plays an important role in the course of metaphor comprehension. They reported that the metaphor interference effect (MIE)—longer response time to judge metaphors to be literally false than scrambled phrases—was smaller for individuals with higher than with lower working memory spans, as measured by the forward letter span task. The authors suggested that the MIE arises because scrambled phrases do not give rise to shared properties and therefore are quickly rejected as false, whereas metaphoric phrases give rise to shared properties, and hence form a metaphoric interpretation. 
However, individuals with higher working memory spans would be less susceptible to the MIE because they would be better at identifying whether a retrieved meaning was metaphorical or literal, which, in turn, would make them quicker to reject the metaphorical meaning than those with lower working memory spans.

Other individual-difference studies also support the role of executive functions during metaphor comprehension. Individuals with higher IQ (which is related to working memory and inhibitory control; Ackerman et al., 2005) produce better and more accurate metaphor interpretations than their counterparts (Kazmerski et al., 2003), and individuals with executive control deficits have difficulty processing metaphors (Mashal et al., 2013). These studies seem to support the dual access account, as they all report a relation between executive functions and metaphor comprehension. Moreover, with respect to the metaphor characteristics that may contribute to comprehension, the evidence suggests that individual variation in executive functions may be especially important when processing unfamiliar or novel metaphors (e.g., Columbus et al., 2015). For example, Mashal (2013) found that individuals with higher working memory spans, as measured by the backward digit span task, had better recall, comprehension, and recognition for unfamiliar metaphors than for familiar word pairs compared with those with lower working memory spans. In another study, Carriedo et al. (2016) analyzed the contribution of different executive functions to metaphor comprehension across development. They demonstrated that updating in working memory and cognitive inhibition (e.g., resistance to interference and suppression of information), as measured by go/no-go and flanker tasks, predicted individual and developmental differences, such that working memory and cognitive inhibition became increasingly involved when metaphor comprehension was highly demanding, either when metaphors were relatively unfamiliar or when individuals had processing difficulties (e.g., low levels of reading experience or low semantic knowledge, as observed in children).

As mentioned earlier, the individual-difference measures used in the majority of the studies recruit a mixture of domain-specific processes, such as reading ability and vocabulary knowledge in complex span tasks (e.g., reading span task) and memory and storage capacity in simple span tasks (backward and forward span tasks). Although certain aspects of domain-general processes are recruited through these tasks simultaneously (e.g., performance on the reading span task is based on language processing, vocabulary knowledge, and some form of executive functions, often inhibition of information), from a theoretical standpoint, this mixture makes it difficult to determine the independent contribution of these factors to metaphor comprehension. In the current experiment, we addressed this issue by incorporating an individual-difference measure (i.e., a modified Eriksen flanker task) that captures the domain-general aspect of executive functions, particularly inhibitory control, and by examining how this measure interacts with metaphor processing on a sense–nonsense task.

Method

Participants

Forty-six introductory psychology students at the University of Alberta participated for course credit. They were native English speakers. Data from four participants were excluded from the statistical analysis because they had unusual response time patterns, resulting in 42 participants.

Materials and design

The experiment consisted of two experimental tasks: a sense–nonsense task, in which participants indicated whether a given phrase made sense (regardless of whether it made literal or metaphorical sense) or did not make sense, and a modified version of the Eriksen flanker task (Eriksen & Eriksen, 1974), in which participants indicated whether a target arrow was pointing left or right while ignoring surrounding flanker arrows that pointed in the same or the opposite direction.

The sense–nonsense task, programmed on Supercard, consisted of 74 metaphorical phrases (e.g., a friend is an anchor). An equal number of nonsense (literally false) phrases (e.g., fruits are furniture) were also included (see Appendix A for a list of these phrases). All phrases were in the “X is Y” format. The metaphorical phrases varied in conventionality. We derived a continuous measure of the conventionality of the phrases from a pilot study, in which 51 native English speakers rated 78 metaphorical phrases (e.g., time is money), each on the strength of association between the vehicle and the specific figurative meaning. On a scale from 1 to 7, participants rated, for example, the extent to which money is used to convey value in the phrase time is money. On any given phrase, a lower rating referred to a more novel association, and a higher rating referred to a more conventional association. Participants’ average rating on these metaphorical phrases was M = 5.30, 95% CI [4.98, 5.42]. Although we did not have an equal number of metaphorical phrases at each rating level (i.e., from 1 to 7), each phrase had an intended figurative meaning. In a majority of studies, the different types of metaphors are categorized in a binary manner (e.g., familiar vs. unfamiliar; conventional vs. novel). However, factors such as familiarity and conventionality are likely to vary on a continuum. Thus, treating metaphor conventionality as a continuous measure would likely increase statistical power and yield a corresponding gain in specificity.

The current study used a modified version of the flanker task programmed on Supercard so that it could be administered on the same software as the sense–nonsense task. In a conventional flanker task, participants are presented with an array of seven letters, and they make directional responses to a centrally located target letter while ignoring distractor letters presented simultaneously on the left and right of the target. In relation to the central target, these flankers are either congruent (e.g., both central target and flanker letters correspond to the same response) or incongruent (e.g., central target and flanker letters correspond to opposite responses). Our version was modified in two ways. (1) Unlike the original flanker task, which used letters as stimuli, we used arrowheads, which are nonlanguage-based stimuli (Ridderinkhof et al., 2020). This frees participants’ working memory from having to remember the directional responses assigned to certain letters (e.g., a right response to the target letters H and K, and a left response to S and C). (2) Unlike the original flanker task, which arranged only seven letters in a horizontal array, we included horizontal, vertical, and diagonal arrays of up to 25 arrows, as shown in Fig. 1. The impact of flankers varies based on the spatial distance between the target and flanker stimuli (e.g., Miller, 1991). Therefore, the flanker arrows in the current experiment were displayed both closer to and further away from the target arrow to increase flanker noise. This makes it more challenging for participants to focus attention on a particular spatial position, and it allows for a more precise estimate of smaller differences in a participant’s ability to inhibit conflicting information.

Fig. 1 Experiment 1 procedure. Participants completed two tasks, one to measure individual differences in inhibitory control (Eriksen flanker task) and the other to measure metaphor comprehension (sense–nonsense task)

The current task, which used left- and right-pointing arrows as stimuli, included a total of 180 trials: 60 neutral, 60 congruent, and 60 incongruent. As illustrated in Fig. 1, each trial consisted of a target arrow, pointing towards the right or left, that was presented in the middle (third arrow and third row) of the computer screen, and flanker arrows on both the left and the right side of the screen. In the neutral trials, the flankers were empty boxes on each side of the target arrow, and in the congruent and incongruent trials, the flankers were arrows pointing in the same or the opposite direction as the target arrow, respectively. Each trial started with a centered red dot, followed by the stimulus (target and flanker arrows) after 100 ms. The stimulus remained on the screen until the participant responded with a valid key press, after which the next fixation dot appeared. Participants responded by pressing the left key for a left-pointing target arrow, or the right key for a right-pointing target arrow. The difference in response time between incongruent and congruent flanker trials was used to compute the inhibition score for each participant, such that lower scores reflected more efficient inhibitory control.

There were three variations of the number of flankers presented: 1 × 3, 3 × 3, or 5 × 5, with the first digit indicating the number of rows of squares or arrows and the second digit indicating the number of columns. For example, 1 × 3 indicated one row of squares or arrows in three columns. Within each of the three different types of trials (neutral, congruent, and incongruent), the target arrow pointed to the left on 30 trials and to the right on 30 trials. Within each of these sets of 30 trials, 10 were 1 × 3, 10 were 3 × 3, and 10 were 5 × 5. We have no reason to believe that these modifications resulted in differential effects on reaction time because the general task, the three different conditions, and the response required were identical to those in other studies using the flanker task.
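The trial structure described above (three trial types, two target directions, three array sizes, 10 trials per combination) can be enumerated with a short Python sketch. It is an illustration only and not the Supercard program used in the study; the function name, the dictionary keys, and the array-size labels (taken from the three variations listed at the start of the paragraph) are assumptions.

```python
import random
from collections import Counter
from itertools import product

TRIAL_TYPES = ["neutral", "congruent", "incongruent"]
DIRECTIONS = ["left", "right"]          # direction of the target arrow
ARRAY_SIZES = ["1x3", "3x3", "5x5"]     # rows x columns, labels are assumptions
REPEATS_PER_CELL = 10


def build_flanker_trials(seed=None):
    """Enumerate every type x direction x array-size cell, then shuffle the order."""
    trials = [
        {"type": trial_type, "target": direction, "array": array_size}
        for trial_type, direction, array_size in product(
            TRIAL_TYPES, DIRECTIONS, ARRAY_SIZES
        )
        for _ in range(REPEATS_PER_CELL)
    ]
    random.Random(seed).shuffle(trials)  # randomized order for each participant
    return trials


trials = build_flanker_trials(seed=1)
print(len(trials))                            # total number of trials
print(Counter(t["type"] for t in trials))     # trials per trial type
```

Enumerating the full crossing before shuffling guarantees the balanced counts (60 trials per type, 30 per direction within each type) regardless of the random order.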

Procedure

Figure 1 illustrates the general procedure of Experiment 1. All participants completed both experimental tasks. The order in which the tasks were presented was counterbalanced, such that half of the participants completed the sense–nonsense task first, whereas the other half completed the flanker task first. All instructions were automated and presented on the screen. In the sense–nonsense task, participants were explicitly asked to indicate whether or not the phrase made sense; it could make literal sense or metaphorical sense. The task consisted of 12 practice trials (i.e., 12 phrases: six nonsense, six metaphorical) followed by 148 trials (i.e., 148 phrases: 74 nonsense, 74 metaphorical; see Footnote 1). Presentation order of the trials was randomized for each subject. Each trial began with the message “Ready?,” and participants pressed the space bar to initiate the trial. A single phrase was presented during each trial and stayed on the screen until participants responded by pressing the letter key J (the phrase made sense) or F (the phrase did not make sense).

In the flanker task, participants completed six practice trials (two congruent, two incongruent, and two neutral), followed by 180 experimental trials (60 congruent, 60 incongruent, and 60 neutral). The direction of the target arrow, which pointed towards either the right or the left, determined which key press was required: a right arrow key press if the target arrow pointed to the right, or a left arrow key press if the target arrow pointed to the left. Presentation order of the trials was randomized for each subject. Each trial began with a red dot presented in the middle of the screen for 100 ms, followed by the target and flanker stimuli, which remained on the screen until participants pressed one of the two keys. To reiterate, we computed the inhibition score by taking the difference in response time between incongruent and congruent flanker trials for each participant. Lower scores reflect more efficient inhibitory control. It is worth noting that in the current study, we used a single measure (i.e., the flanker task) to assess individual differences in inhibitory control. Using multiple measures of inhibitory control to create a composite inhibition score could help eliminate concerns related to any idiosyncratic correlations between flanker task variance and metaphor conventionality.

Results and discussion

The data were analyzed with linear mixed effects (LME) regression models in Stata 14, using the mixed function for the response time data and the melogit function for the accuracy data. Metaphor conventionality rating (ranging from 1 to 7) and inhibition score (computed by subtracting the congruent from the incongruent response time on correct trials) were entered as fixed continuous factors, and participants and items were entered as crossed random factors. Following a significant interaction, we used the margins function with the dydx option to test the slope for our factors. The dependent variable, response time on the sense–nonsense task, was log-transformed to reduce skewness. A Q-Q plot of the transformed data confirmed that the transformation was appropriate (see Appendix B). To eliminate anticipation and inattention errors, trials with a response time of less than 100 ms or greater than two standard deviations from a participant’s mean reaction time were excluded from the analyses (n trials = 350).
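The exclusion and transformation steps described above can be illustrated with a minimal Python sketch for one participant's response times. It is not the Stata code used for the analyses, and it rests on several assumptions that the text does not specify: the mean and standard deviation are computed over all of the participant's trials before any exclusion, the sample standard deviation is used, the two-standard-deviation criterion is applied in both directions, and the natural logarithm is used. The function name and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def trim_and_log(rts_ms, floor_ms=100.0, sd_cutoff=2.0):
    """Drop anticipations (< floor_ms) and outliers (> sd_cutoff SDs from the
    participant's mean), then log-transform the remaining response times.

    Assumption: mean and SD are computed on the untrimmed data.
    """
    participant_mean = mean(rts_ms)
    participant_sd = stdev(rts_ms)  # sample SD; needs at least two trials
    kept = [
        rt
        for rt in rts_ms
        if rt >= floor_ms and abs(rt - participant_mean) <= sd_cutoff * participant_sd
    ]
    return [math.log(rt) for rt in kept]  # natural log, an assumption


# Hypothetical response times (ms) for one participant.
example_rts = [500, 520, 480, 510, 490, 50, 3000]
print(trim_and_log(example_rts))
```

In this made-up example, the 50 ms trial is removed as an anticipation and the 3000 ms trial as an outlier, leaving five log-transformed response times.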

Participants correctly responded “nonsense” to 98%, 95% CI [97.79, 98.73], of the nonsense phrases, and “sense” to 71%, 95% CI [68.37, 72.91], of the metaphorical phrases. The raw inhibition scores ranged from −105.94 to 89.30 (M = 30.43; 95% CI [29.18, 31.69]). For the statistical analyses, nonsense-phrase responses were excluded, and inhibition scores were scaled. Raw inhibition scores were computed by subtracting the congruent from the incongruent response time on correct trials, and they were scaled by subtracting the mean raw inhibition score from each participant’s inhibition score. This means that a scaled inhibition score of 0 indicates an average ability to inhibit goal-irrelevant responses (i.e., the flanker arrows), and a negative or a positive score indicates better or worse inhibitory control abilities, respectively. For example, a scaled inhibition score of −200 would indicate that the participant’s slowing on incongruent relative to congruent trials was 200 ms smaller than the sample average, and therefore reflects better inhibitory control abilities, whereas a score of +200 would indicate that this slowing was 200 ms greater than the sample average, and therefore reflects poorer inhibitory control abilities.
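The two steps described above, computing a raw inhibition score per participant and then scaling it by subtracting the sample mean, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis code: it assumes that each condition is summarized by its mean response time on correct trials, and the function names, data layout, and example values are hypothetical.

```python
from statistics import mean


def inhibition_score(trials):
    """Raw score: mean incongruent RT minus mean congruent RT, correct trials only.

    `trials` is a list of dicts with keys 'type', 'rt' (ms), and 'correct'.
    """
    incongruent = [t["rt"] for t in trials if t["type"] == "incongruent" and t["correct"]]
    congruent = [t["rt"] for t in trials if t["type"] == "congruent" and t["correct"]]
    return mean(incongruent) - mean(congruent)


def center_scores(raw_scores):
    """Scaled score: each participant's raw score minus the sample mean."""
    sample_mean = mean(list(raw_scores.values()))
    return {pid: score - sample_mean for pid, score in raw_scores.items()}


# Hypothetical flanker data for one participant.
demo_trials = [
    {"type": "congruent", "rt": 400, "correct": True},
    {"type": "congruent", "rt": 420, "correct": True},
    {"type": "incongruent", "rt": 450, "correct": True},
    {"type": "incongruent", "rt": 470, "correct": True},
    {"type": "incongruent", "rt": 900, "correct": False},  # error trial, ignored
]
print(inhibition_score(demo_trials))

# Hypothetical raw scores for three participants.
print(center_scores({"p1": 10.0, "p2": 30.0, "p3": 50.0}))
```

After centering, a score of 0 corresponds to the sample-average flanker effect, so negative values mark participants with a smaller-than-average effect (better inhibitory control) and positive values a larger-than-average effect.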

Figure 2 shows accuracy on the sense–nonsense task as a function of metaphor conventionality and inhibition score. The LME analysis with accuracy data yielded only a main effect of conventionality, b = 1.00, SE = 0.02, z = 7.45, p < .001, suggesting that the more conventional metaphors were more accurately identified to make sense. The main effect of inhibition was nonsignificant, b = −0.01, SE = 0.01, z = −0.78, p = .436, and the interaction between conventionality and inhibition was also nonsignificant, b = 0.00, SE = 0.00, z = 1.06, p = .287.

Fig. 2

Experiment 1 accuracy results. Accuracy on the sense–nonsense task as a function of inhibition score and metaphor conventionality. Lower inhibition scores indicate higher levels of inhibitory control

The LME analysis with response time data yielded a significant main effect of conventionality, b = −0.09, SE = 0.02, z = −5.96, p < .001, and a main effect of inhibition, b = 0.01, SE = 0.002, z = 2.98, p = .003. There was also a significant conventionality by inhibition interaction, b = −0.001, SE = 0.00, z = −3.98, p < .001. Figure 3 shows the relation between inhibition scores and response times, with separate lines corresponding to the different levels of conventionality. Recall that the inhibition score for each participant was computed based on the difference in response time between incongruent and congruent flanker trials, where lower scores reflect better inhibitory control abilities. The inhibition slope increased as metaphor conventionality decreased. Specifically, the inhibition slope for more novel metaphors was greater than the inhibition slope for more conventional metaphors. The margins command with the dydx option was used to compute the inhibition slope for conventionality that ranged from 1 (not conventional at all) to 7 (very conventional). As seen in Table 1, for metaphorical phrases that were less conventional at rating Levels 1 and 2, the inhibition slopes were significant. For metaphorical phrases that ranged from rating levels between 3 and 6, the inhibition slopes were not significantly different from zero (p > .05). Finally, for the most conventional phrases (i.e., rating level of 7), the inhibition slope was significant.
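The way an inhibition slope is obtained at each conventionality level can be sketched as follows. The sketch assumes a model of the form log RT = b0 + b1·conventionality + b2·inhibition + b3·conventionality·inhibition, in which case the slope of inhibition at a given conventionality level is b2 + b3·conventionality. The function name and the coefficient values are hypothetical placeholders chosen only to show the pattern; they are not the fitted estimates, which are rounded in the text and summarized in Table 1.

```python
def inhibition_slope(b_inhibition, b_interaction, conventionality):
    """Simple slope of inhibition at one conventionality level.

    Follows from differentiating the assumed interaction model with
    respect to the inhibition score.
    """
    return b_inhibition + b_interaction * conventionality


# Hypothetical coefficients, NOT the estimates from the study.
B_INHIBITION = 0.004
B_INTERACTION = -0.001

for level in range(1, 8):
    slope = inhibition_slope(B_INHIBITION, B_INTERACTION, level)
    print(f"conventionality {level}: inhibition slope = {slope:+.3f}")
```

With a negative interaction coefficient, the slope shrinks as conventionality increases and can change sign at the most conventional levels, which is the qualitative pattern the paragraph above describes.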

Fig. 3

Experiment 1 response time results. Response time on the sense–nonsense task as a function of inhibition score and metaphor conventionality. Lower inhibition scores indicate higher levels of inhibitory control

Table 1 Regression statistics for response time on the sense–nonsense task summarizing the inhibition slopes as a function of conventionality in Experiment 1

Experiment 1 used an individual-difference approach to examine whether individual differences in inhibitory control affect response times on novel and conventional metaphor comprehension. Individuals with higher inhibitory control use controlled executive processes to reach their goal-relevant response in the face of distraction, whereas individuals with lower inhibitory control are more susceptible to dominant or irrelevant responses. According to the dual access account, if the processing of novel metaphors requires greater inhibitory control because the literal meanings need to be inhibited, whereas the processing of conventional metaphors does not because their meanings are directly accessible, then individuals with lower inhibitory control should take longer to process novel metaphors compared with conventional metaphors. Results of Experiment 1 are consistent with this notion. In fact, metaphor conventionality appeared to matter considerably more for participants with poor inhibitory abilities than for those with efficient inhibitory abilities: those with low inhibitory control were faster at responding to conventional metaphors, suggesting that they may be more susceptible to dominant or salient responses.

Experiment 2

The goal of Experiment 2 was to use a dual-task paradigm to further examine the role of inhibition in metaphor comprehension. In Experiment 1, we found that inhibitory control processes seem critical for more novel metaphor comprehension, but not for more conventional metaphor comprehension, and individual differences seemed to contribute to this finding, such that compared with participants with higher inhibitory control, those with lower inhibitory control were slower at processing more novel metaphors, and interestingly, faster at processing more conventional metaphors. This implies that they were less efficient at computing the figurative meanings of more novel metaphors, but were more efficient at retrieving or accessing the figurative meanings of more conventional metaphors, likely stored in semantic memory.

Indeed, research in cognitive science shows that individuals are able to retrieve information from memory with relative ease sometimes, and with much more effort at other times (Anderson, 2003; Levy & Anderson, 2008). The amount of cognitive resources available at the time of retrieval plays an important role in how quickly and accurately the information gets retrieved. If there are sufficient executive resources (which includes inhibitory control), retrieval may be successful, but if executive resources are taxed or burdened, retrieval may not be successful (Hicks & Marsh, 2000; Jacoby, 1991; Lozito & Mulligan, 2010). This is particularly true for novel tasks. In the current case, computing figurative meanings of novel metaphors amidst other activities that also require executive functions, particularly inhibitory control, compromises metaphor comprehension. Conversely, cognitive research also shows that retrieval is surprisingly resilient to executive demand manipulations if what is being retrieved is primarily automatic (Baddeley et al., 1984; Craik et al., 1996; Naveh-Benjamin et al., 1998), or salient as would be the case when accessing figurative meanings of conventional metaphors according to the dual access account. In Experiment 2, we used a dual-task paradigm to test the notion that the processing of novel metaphors, that presumably require greater inhibitory control, would be more susceptible to interference effects under divided or dual attention that taxes inhibitory control resources, than the processing of conventional metaphors.

Dual-task paradigms require participants to perform a primary task while simultaneously performing an attention-demanding secondary task designed to tax executive function resources. The basis of the paradigm is that if both primary and secondary tasks require controlled, effortful processing, attentional capacity is exceeded and performance on either task is impaired. However, if parts of the task involve automatic processes, performance will likely remain unaffected (Marois & Ivanoff, 2005; Pashler, 1998). For example, in one study, De Neys (2006) had participants perform a reasoning task with various congruent and incongruent problems along with a secondary task (i.e., memorization of a dot pattern). Performance on congruent problems was unaffected, whereas performance on incongruent problems decreased. De Neys argued that on congruent problems, belief-based processes—which operate automatically and are unaffected by reduced attentional resources—trigger the correct response. In contrast, controlled processes—which require inhibitory control and attentional focus—are needed to derive the correct response on incongruent problems, and thus, taxing attention impairs performance on these tasks.

In a different study, Baddeley et al. (1984) used the dual-task paradigm to examine the role of executive functions in retrieval of long-term episodic memories, presumed to be accessed easily. Participants concurrently performed a card-sorting task or held a digit load in mind while they responded to recall and recognition memory tests. They concluded that retrieval was automatic because divided attention produced no reduction in memory performance compared with full attention. Dual-task studies that have found performance impairments under divided attention show that these deficits may only present when task demands are high (e.g., Mangels et al., 2002). This body of research is relevant to the current experiment because task demands could be related to computing the meaning of a novel metaphor. According to the dual access account, novel metaphors require more inhibitory control resources than conventional metaphors to compute the appropriate interpretation because the former requires inhibition of literal meanings whereas the latter requires the (presumably automatic) retrieval of salient figurative meanings from memory. Thus, under divided attention, when inhibitory control resources are taxed by another task, performance should be disrupted when processing more novel metaphors, but not more conventional metaphors.

To our knowledge, there are no studies to date that have examined metaphor comprehension using the dual-task paradigm, but our prediction may be comparable to the results observed in the cognitive domains of automaticity and skill acquisition: performance on well-learned tasks (e.g., visual array search, word retrieval, playing piano) that are primarily driven by proceduralized or automated knowledge representations does not degrade under dual-task conditions because such tasks generally require minimal executive processes during task execution (Allport et al., 1972; Craik et al., 1996). To compute the meaning of a novel metaphor, individuals must hold all alternate meanings in mind initially, and inhibit the literal meaning. Divided attention theories would posit that if this process is taxed due to a secondary task, then performance will be impaired. In contrast, to compute the meaning of a conventional metaphor, individuals can easily select the salient, figurative meaning over the literal one without inhibitory control processes, and as such their performance should remain unaffected by the secondary task.

Method

Participants

Forty-four introductory psychology students from the University of Alberta, who were not part of Experiment 1, participated for course credit. They were native English speakers. Data from two participants were excluded from the analysis because their response time data were highly variable. This resulted in a total of 42 participants.

Materials and design

Experiment 2 consisted of two tasks. The primary task was the sense–nonsense task, wherein participants indicated whether a given phrase made sense (literally or metaphorically) or not. These phrases were the same as the ones used in Experiment 1. The secondary task manipulated inhibitory load, implemented through an odd–even task (single load) and the n-back task (dual load). In the odd–even task, participants indicated whether a given number was odd or even, and in the n-back task, participants judged whether a new number was identical to the second number back in a sequentially presented list of numbers. For example, in the sequence 2–6–3–6–1, the second 6 is identical to the number presented two positions back (6). Unlike the odd–even task, the n-back task requires online monitoring, updating, and inhibition to discard numbers presented more than two trials ago, and is therefore assumed to place great demands on executive functions, including inhibitory control processes (Jonides et al., 1997; McElree, 2001). Moreover, if the current number matches a previously presented number, but not the one n items back in the sequence, inhibitory control and interference resolution processes are recruited to resolve this conflict (Kane et al., 2007; Oberauer, 2005). To further emphasize the role of inhibition, Jonides et al. (1997) argued that to successfully perform an n-back task, participants must use inhibitory control processes to reduce traces of previous trials in memory, and replace them with current and upcoming trials. In this way, an n-back task involves competition between relevant and irrelevant information. The primary and secondary task trials in the current experiment were interleaved such that a phrase was always followed by a number. Figure 4 illustrates the design of Experiment 2.

Fig. 4

Experiment 2 procedure. Participants completed a secondary task that was interpolated within the primary, sense–nonsense task. The secondary task required participants to either evaluate whether a given number was odd or even (single load), or evaluate whether a given number was presented n trials ago (dual load)

In the sense–nonsense task, there were 72 metaphorical and 72 matching nonsense (literal false) phrases. In the secondary task, there were 144 single-digit numbers that ranged from 1 to 9. Each phrase trial was followed by a number trial. The 144 phrases and 144 matching numbers were divided into two lists, each list containing 72 phrases (36 metaphorical and 36 nonsense), and 72 numbers. For half of the participants, List 1 was assigned to the single load condition, and List 2 to the dual load condition. This assignment was reversed for the other half of the participants, such that List 2 was assigned to the single load condition, and List 1 to the dual load condition. In the single load condition, half of the number trials within each phrase type (i.e., metaphorical and nonsense) were even and half were odd. Similarly, in the dual load condition, half of the trials within each phrase type were 2-back and the other half were random or non-2-back. Whether a participant completed the single load or the dual load condition first was also counterbalanced, such that half of the sample completed the single load condition first, and the other half completed the dual load condition first.

Procedure

Figure 4 illustrates the general procedure of Experiment 2. Participants completed two sets of primary and secondary task practice trials, one at the beginning of the single load condition and another one at the beginning of the dual load condition. Participants first practiced responding to eight phrases (four metaphorical and four nonsensical), then practiced responding to eight numbers (four odd and four even in the single load condition, and four 2-back and four random-back in the dual load condition). Finally, they practiced responding to both the phrases and the numbers together (six phrases and six numbers) as would be presented in the experiment. The procedure for this experiment was identical to the sense–nonsense task from Experiment 1, with the exception of the message “Ready,” which was replaced with a number presented in the middle of the screen that prompted a response. A single phrase or number was presented during each trial, and stayed on the screen until participants responded. For the phrases, the response was either the letter key J (the phrase made sense) or F (the phrase did not make sense). In the single load condition, participants pressed the J key if the number was even, and the F key if the number was odd. In the dual load condition, participants indicated via key press (F or J key) whether the number in the current trial matched (i.e., was identical to) or mismatched (i.e., was different from) the number they had seen within the sequence two trials back. Henceforth, we will refer to trials of the match situation as 2-back trials, and to trials of the mismatch situation as n-back trials. On every trial, participants pressed the J key in case of a 2-back target, whereas they pressed the F key if the currently presented number was not the 2-back target. For example, in the sequence 3–6–9–5–9, the J key press would only be given to the final number, as the number 9 is the same as the number presented two trials previously, and the F key press would be given otherwise. For all tasks, participants were instructed to respond as quickly and accurately as possible.
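The response rule for the dual load condition can be written out as a short sketch. The function name is hypothetical, and the sketch makes two assumptions that the text does not spell out: the comparison runs over the number trials only (ignoring the interleaved phrases), and the first two numbers, which have no number two trials back, are treated as mismatches.

```python
def two_back_keys(numbers, n=2, match_key="J", mismatch_key="F"):
    """Return the correct key press for each number in an n-back sequence.

    A number is a target when it equals the number presented n
    positions earlier; positions without such a predecessor are
    treated as mismatches (an assumption).
    """
    keys = []
    for i, current in enumerate(numbers):
        is_match = i >= n and numbers[i - n] == current
        keys.append(match_key if is_match else mismatch_key)
    return keys


# The example sequence from the procedure: only the final 9 is a 2-back target.
print(two_back_keys([3, 6, 9, 5, 9]))  # ['F', 'F', 'F', 'F', 'J']
```

The comparison with `numbers[i - n]` is what makes the task demanding: each new number must be held in mind for exactly two trials and then discarded.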

Results and discussion

As in Experiment 1, the data were analyzed with linear mixed effects (LME) regression models, using the mixed function for the response time data and the melogit function for the accuracy data in Stata 14. Metaphor conventionality rating (ranging from 1 to 7) and load (single vs. dual) were entered as fixed continuous and categorical factors, respectively, and participants and items were entered as crossed random factors. Following a significant interaction, which was only observed for accuracy on the sense–nonsense task, we used the margins function with the dydx option to compute the effect of load at each level of conventionality. The dependent variable, response time on the sense–nonsense task, was log-transformed to reduce skewness. A Q-Q plot of the transformed data confirmed that the transformation was appropriate (see Appendix B). To eliminate anticipation and inattention errors, trials with a response time of less than 100 ms or more than two standard deviations from a participant’s mean response time were excluded from the analyses (n = 410). Descriptive statistics for both the primary sense–nonsense task and the secondary numerical task are summarized in Table 2. For the statistical analyses, nonsense-phrase responses and secondary task responses (i.e., responses to numbers) were excluded.

Table 2 Mean accuracy and response times on primary and secondary tasks in both single and dual load conditions

Figure 5 illustrates response time on the sense–nonsense task as a function of conventionality and load. We did not have any predictions for response time. The LME analysis for response time on correct trials yielded a significant main effect of conventionality, b = −0.06, SE = 0.02, z = −3.49, p < .001, suggesting that more conventional metaphorical phrases were responded to faster than more novel phrases. The main effect of load was nonsignificant, χ2(1) = .08, p = .778, and the interaction between conventionality and load was also nonsignificant, χ2(1) = .18, p = .673.

Fig. 5

Response time on the sense–nonsense task as a function of metaphor conventionality and single versus dual load conditions

For accuracy on sense–nonsense task, the LME yielded a significant main effect of conventionality, b = 1.24, SE = 0.15, z = 8.56, p < .001, and a significant main effect of load, χ2(1) = 31.13, p < .001. There was a significant conventionality by load interaction, χ2(1) = 40.56, p < .001. Figure 6 shows accuracy on the sense–nonsense task as a function of conventionality and load. The interaction is both clear and interesting; accuracy was greater under single load for more novel metaphorical phrases, but greater under dual load for more conventional metaphorical phrases. We followed up on this interaction by examining the simple effects of load within each level of conventionality. The summary statistics are reported in Table 3. For metaphorical phrases that were less conventional at rating Levels 1–3, accuracy was higher under single load versus dual load. For metaphorical phrases that were more conventional at rating Levels 5 and 6, accuracy was higher under dual load than single load.

Fig. 6

Accuracy on the sense–nonsense task as a function of metaphor conventionality and single versus dual load conditions

Table 3 Regression statistics for accuracy on the sense–nonsense task summarizing the difference in accuracy between single versus dual load conditions at each conventionality in Experiment 2

It is important to note that having a secondary task, regardless of whether it imposed a dual load or a single load, decreased accuracy; the accuracy for nonsense phrases and metaphorical phrases, respectively, was 74% and 57% in the single load condition and 76% and 54% in the dual load condition. Moreover, even a single load secondary task decreased performance compared with having no secondary task at all, as was the case in Experiment 1, where the accuracy for nonsense phrases and metaphorical phrases was 98% and 71%, respectively. Thus, it could be the case that regardless of the load, a secondary task was taxing cognitive resources, but in different ways depending on metaphor conventionality.

Most critical to the experiment, we find that the dual-load condition seemed to limit the processing of multiple meanings in a way that the single-load condition did not: the dual-load condition seemed to have different consequences for metaphors depending on conventionality. Specifically, dual load encouraged participants to process and select the dominant or salient meanings of more conventional metaphors, but it may not have encouraged participants to process and compute the meanings of more novel metaphors. Given that we did not rate the metaphors in the current experiment for meaningfulness, one possibility is that participants did not understand the more novel metaphors, which could have resulted in lower accuracy, particularly under dual load.

Experiment 2 used a dual-task approach to examine how performance on novel and conventional metaphor comprehension is differentially affected when inhibitory control processes are taxed. The dual access account proposes that performance on the sense–nonsense task should remain unchanged for more conventional metaphors under divided attention, but should decrease for more novel metaphors because the former are processed based on accessibility of salient figurative meanings, whereas the latter require inhibitory control to inhibit the literal meanings. Conversely, the direct access account may predict that performance should remain the same for both novel and conventional metaphors regardless of whether the processing occurred under divided attention or not, as this account does not make claims about the role of inhibitory processes in metaphor comprehension. The results show lower performance on novel metaphors, and higher performance on conventional metaphors under dual load compared with single load, consistent with the notion that novel metaphors rely on inhibitory control, whereas conventional metaphors presumably rely on access to stored figurative meanings. The fact that performance on conventional metaphors increased under divided attention suggests that when inhibitory control processes are taxed, more dominant or salient responses are employed.

General discussion

A central question in metaphor research is whether and when literal meanings are activated and subsequently inhibited in the course of metaphor comprehension. The direct access account makes the claim that metaphor processing does not necessarily involve the activation of literal meanings (Gibbs, 1994; Glucksberg et al., 1982), and that the debate between possible literal and figurative meanings is unnecessary, particularly when metaphors are supported by context. This account does not make distinctions between conventional and novel metaphors. Conversely, the dual access account, particularly the graded salience theory (Giora, 1997) does make distinctions between different types of metaphors and places great emphasis on salience. When a metaphor is encountered, salient meanings (conventional or familiar) are activated and processed without the need to inhibit the less salient (literal) meanings. However, if the metaphor is not associated with a salient meaning, as would be the case with novel metaphors, then both literal and figurative meanings are activated. To the extent that inhibitory control processes are engaged anytime literal information needs to be inhibited, we employed two approaches to examine how inhibitory control contributes differently as a function of metaphor conventionality.

Experiment 1 examined the interaction between metaphor conventionality and individual differences in inhibitory control as it relates to metaphor comprehension. To the extent that both literal and figurative meanings are brought to mind during metaphor comprehension, inhibitory control processes may be recruited to inhibit the irrelevant meanings. The more novel metaphors should require the inhibition of irrelevant meanings, and individuals with higher inhibitory control should be better able to execute task-relevant responses in the face of interference. Consistent with this notion, we found that, compared with their counterparts, participants with lower inhibitory control were slower at processing more novel metaphorical phrases. If conventional metaphors acquire a new literal sense, their figurative meanings should be easily accessible with minimal involvement of inhibitory control processes (Kittay, 1987; Utsumi, 2007). The results show that participants with lower inhibitory control were faster at responding to highly conventional phrases compared with individuals with higher inhibitory control. This latter finding is consistent with the notion that in the face of interference (e.g., alternate literal and figurative meanings) individuals with low inhibitory control may be even more susceptible to dominant or salient responses (e.g., Xu et al., 2014). Evidence for this assertion comes from studies that examine how individual differences in inhibitory control interact with dominant processes (which are default, unintentional, implicit, and effortless) and controlled processes (which are slow, intentional, explicit, and effortful; e.g., Barrett et al., 2004; Fabio, 2009). One reason why some individuals are more susceptible to dominant processes is that they have limited cognitive resources to control what is retrieved or gets activated. What is retrieved often tends to be the dominant response, unless the task is novel or has competing demands, at which point controlled processes would need to be recruited to override or inhibit the retrieved responses.

Experiment 2 examined the interaction between metaphor conventionality and inhibitory control by taxing inhibitory control processes. We predicted that accuracy on conventional metaphors should remain unchanged regardless of load because the associated figurative meanings are more salient, and thus easily accessible. The results show that, in fact, accuracy increased under divided attention, suggesting that the load depleted inhibitory resources and facilitated the activation of dominant responses. This finding is consistent with research in the cognitive domain, suggesting that single or no load (i.e., full attention demanding) environments can be counterproductive. For example, in several studies, researchers reported that, compared with single-load conditions, dual-task conditions designed to tax executive function resources from well-learned primary tasks (i.e., golf putting and soccer) resulted in more accurate performance (Beilock et al., 2002; Hardy et al., 1996; Masters, 1992). One reason for this may be that divided attention does not allow sufficient time for processing and forces dominant responses to occur, particularly if they are highly practiced or in the current case, easily accessible. These dominant responses are likely to also be the correct responses.

So far, we have been interpreting the results with the assumption that the figurative meanings of conventional metaphors are easily accessible, and the literal meanings of these metaphors receive little to no processing attention. However, it may be possible that a given conventional metaphor has several possible figurative meanings, with one being more salient than the others. Thus, it could be that competition among possible meanings still exists; however, the need for inhibition, at least for conventional metaphors, may be driven not by literal versus figurative meanings, but rather by several plausible figurative meanings. This hypothesis still needs to be tested, but could account for the findings in the current study.

Another plausible explanation is that perhaps inhibition is still involved during the processing of conventional metaphors, but to a lesser degree. In a study relevant to the Experiment 2 results, Lavie (1995) found that an easy perceptual task benefited more than a difficult perceptual task from divided attention. The researcher proposed that individuals generally invest all attentional resources available to a given task. In the easy task (analogous to processing conventional metaphors), not all these resources are necessary, and some attention will spill over to the distractors (literal meanings of metaphors). The distractors are then processed up to a relatively high level, where they interfere with the processing of the primary task. This additional, unnecessary processing is bypassed under divided attention. Based on Lavie’s rationale, it may be possible that conventional metaphors, like novel metaphors, still activate alternate meanings, but the figurative meaning reaches a higher level of activation, as it is the most salient one. Thus, although the current set of experiments offers some insight, the question of whether only the figurative meaning is activated for conventional metaphors, or whether alternate meanings are activated as well, needs to be explored further.

There are a few methodological limitations that should be addressed by future research. For example, we did not collect ratings on other constructs, such as familiarity. It is quite possible that the conventional metaphors used in the current set of experiments were also familiar metaphors. In fact, Bowdle and Gentner (2005) argue that two metaphors can be equally conventional (because they share the same vehicle) but differ in familiarity (familiar metaphor: his anger is fire vs. unfamiliar metaphor: his panic is fire). This means that while both types of conventional metaphors can be processed with ease, they may be processed differently (e.g., Glucksberg & Keysar, 1990). Another limitation is that while the n-back task used in Experiment 2 is an executive functions task, it is not a pure inhibitory control task, and as such it is unclear whether the dual-task cost depended specifically on inhibitory control or whether it resulted from a more general executive control cost.

Multiple theories, even within the dual access framework (e.g., Carston & Wearing, 2011; see Holyoak & Stamenković, 2018, for an overview), have been proposed to explain metaphor comprehension, and several of them have highlighted the role of metaphor conventionality in the involvement of inhibitory control. However, very few have systematically evaluated and specified the possible interactions between the two variables, particularly using the individual-difference and dual-task paradigms. The current project attempts to address this issue. For example, we tried to better explain why different individuals might process metaphors differently and more or less efficiently. Katz et al. (1988) showed that different individuals process the exact same metaphors with great variability, although they did not propose reasons for why that occurred. Experiment 1 explained the conjoint effects of stimulus characteristics (metaphor conventionality) with an individual difference measure, a point made in Trick and Katz (1986), although with different metaphor characteristics and different individual-difference variables.

Further, while the current project does not discriminate between existing dual access theories (e.g., the graded salience theory vs. the career of metaphor theory), it attempts to encapsulate the fundamental component of a dual access account—the differential role of inhibitory control in novel and conventional metaphor comprehension—and use well-known paradigms in the study of attention and effortful processing to better integrate the study of metaphor within more mainline cognitive approaches. By exploring possible interactions between metaphor conventionality and inhibitory control, this line of research has the potential to resolve some conflicting results in the literature, especially those concerning the direct access of a figurative meaning versus the activation of both figurative and literal meanings of a metaphor.

To conclude, the current study is unique in that it attempts to quantify the involvement of inhibitory control during metaphor comprehension, particularly when metaphors are less conventional. These findings extend those of previous studies by revealing a more causal link between inhibitory control and metaphor conventionality. Moreover, two novel findings emerged: (1) Individuals with lower inhibitory control are actually faster than individuals with higher inhibitory control at processing more conventional metaphorical phrases, and (2) individuals are more accurate at processing conventional metaphors under dual load compared with single load. Both these findings seem to suggest that accessing the figurative meanings of conventional metaphors relies on salient or dominant responses.