In Experiment 2, we attempted to design a word production task in which it would be easier to access the semantic and lexical representations of abstract words than it had been from their definitions in Experiment 1. Crutch and Warrington (2005) argued that concrete words tend to be represented in memory by a particular set of semantic features. Conversely, they claimed that abstract words are more likely to be represented in semantic memory in terms of their associative connections with other words. Consistent with these claims, a series of neuropsychological investigations and a number of studies with unimpaired participants have shown that abstract words are more efficiently processed in associative contexts, whereas concrete words are better processed in the context of semantically similar items such as members of the same semantic category (e.g., Crutch, Connell, & Warrington, 2009; Crutch & Jackson, 2011).
If it is assumed that retrieval from definitions depends heavily on an attempt to recall a word on the basis of its semantic features, it follows that the naming-to-definition task would likely prove particularly difficult for abstract words. During normal language production, however, words are retrieved in the context of a sentence, and there are often rich associative connections between each word and the sentence context in which it appears. In Experiment 2, therefore, we gave participants sentences that described events in which an abstract word was missing (e.g., In court, Joan entered a _______ of “not guilty” to the crime she was accused of) and sentences in which a concrete word was missing (e.g., Jason played a game of _______ on the famous links). Participants were asked to try to generate the word that best fit into that sentence (plea and golf, respectively, in the sentences above).
We assume that the retrieval of a word’s lexical representation from an event context such as these will reflect a greater influence of associative processing and a weaker influence of semantic similarity than is the case for the definition task. If so, it then follows from Crutch and Warrington (2005) that the problems associated with the production of abstract words (fewer correct retrievals and increased probability of Step 1 failure relative to concrete words) may be reduced or eliminated when they have to be retrieved from an event context. Should the differences between the probabilities of Step 1 failure be reduced significantly, a key question would then be whether the probability of phonological retrieval failure (Step 2) would remain substantially greater for abstract than for concrete words. If the connection between the lexical and phonological levels in the production system is a primary problem for abstract words, then significant phonological retrieval problems would remain even if the Step 1 problems with abstract words were resolved. Conversely, if the phonological access problems for abstract words occur because the lexical representations of such words are activated strongly enough to elicit a TOT but not the target word, then higher levels of Step 2 retrieval failure for abstract words should no longer be observed.
Method
Participants
A group of 58 undergraduate students at the University of Essex took part in the experiment. None of them had been participants in Experiment 1. Half were randomly allocated to the definition condition and half to the event condition.
Materials
For participants in the definition condition, the presentation was the same as in Experiment 1, except that none of the participants received any letter length information. For participants in the event condition, the definitions used in Experiment 1 were replaced by new sentences that described a simple event. These sentences were generated so as to create a context in which the target word was highly predictable. However, no attempt was made to match the degree of lexical association between the target and any of the words in the concrete and abstract event sentences. To ensure that the concrete and abstract words were equally congruent with their sentence contexts, the sentences were given to six new participants with the target word underlined. These participants were asked to indicate on a scale of 1–5 how well each underlined target word fit into the sentence in which it appeared (1 = very poorly, 5 = very well). The overall mean (with SD) for the concrete sentences was 4.75 (0.23), and the overall mean for the abstract sentences was 4.73 (0.26). These means did not differ significantly, t(66) = 0.33, p = .742. These suitability ratings were very similar to those provided for the definitions by Allen and Hulme’s (2006) participants (see Exp. 1 for details). The Appendix provides details of all of the event sentences that were used in Experiment 2. Three new words that were similar in meaning to each target word were generated specifically for use as recognition test foils in this experiment.
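As a rough arithmetic check on the comparison of suitability ratings, the t statistic can be approximately recovered from the rounded summary values reported above. The sketch below assumes 34 concrete and 34 abstract sentences (an even split that is consistent with the 66 degrees of freedom) and an independent-samples t test with pooled variance. The function name is illustrative only, and because the inputs are rounded, the result can only approximate the reported value.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic with pooled variance, from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# Rounded means and SDs from the text; 34 items per word type is an assumption.
t_value, df = pooled_t(4.75, 0.23, 34, 4.73, 0.26, 34)
print(f"t({df}) = {t_value:.2f}")
```

With these rounded inputs the statistic comes out at roughly 0.34 on 66 degrees of freedom, which is close to the reported value of 0.33; the small discrepancy is what would be expected from rounding of the means and standard deviations.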
Procedure
Participants were given a set of 68 sentences in which a word was missing. They were told that their task was to produce the word that best fit into that sentence context. If they were unable to recall the word, they were asked to indicate whether they were experiencing a TOT state, exactly as in Experiment 1. At the end of the experiment, the participants undertook a recognition test for those items for which they had experienced a TOT. They heard the sentence and were shown four words on a computer screen. For example, the foils for dare were provoke, bully, and goad, and the foils for lake were pool, lagoon, and spring. They were asked to indicate whether any of the four words was the item that had previously elicited their TOT state. It was emphasized that their task was not to indicate which was the item that they now believed best fit the definition. If the participants selected the target item, the TOT was considered to be a positive TOT. If they selected a foil or stated that none of the four words had elicited their TOT, it was considered a negative TOT.
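The scoring rule for the recognition test can be written out as a small sketch. The function name and the use of `None` to stand for a response that none of the four words had elicited the TOT are illustrative assumptions, not part of the original procedure.

```python
from typing import Optional

def classify_tot(selected: Optional[str], target: str) -> str:
    """Classify a reported TOT from the recognition test response.

    `selected` is the word the participant chose from the four shown,
    or None if they stated that none of the words had elicited the TOT.
    """
    if selected is not None and selected.strip().lower() == target.strip().lower():
        return "positive"   # the target itself was recognized
    return "negative"       # a foil was chosen, or no word was accepted

print(classify_tot("dare", "dare"))   # target chosen
print(classify_tot("goad", "dare"))   # foil chosen
print(classify_tot(None, "lake"))     # none of the four words
```

Only the first of these three calls counts as a positive TOT; both a foil selection and a rejection of all four words are scored as negative.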
Results
Number correct
Performance in Experiment 2 is summarized in Tables 3 and 4. There was a significant effect of concreteness, F(1, 58) = 127.31, MSE = .05, p = .00001, and significantly more words were correctly recalled from event than from definition sentences, F(1, 58) = 50.51, MSE = .05, p = .001. The interaction was also highly significant, F(1, 58) = 16.01, MSE = .05, p = .001. Tests of simple main effects showed a significant effect of concreteness in both the definition, F(1, 116) = 116.81, MSE = .05, p = .001, and event, F(1, 116) = 26.51, MSE = .05, p = .001, conditions. The interaction seems to have come about because the concreteness effect was larger in the definition than in the event condition. To demonstrate this point formally, an analysis compared the sizes of the advantage for concrete words in the two conditions. This revealed a significantly larger advantage for concrete words in the definition than in the event condition, F(1, 58) = 28.95, MSE = 17.63, p = .001.
Table 3 Mean proportions of different types of responses for abstract and concrete target words in Experiment 2
Table 4 Probabilities of failure at Steps 1 and 2 (Gollan & Brown, 2006) for abstract and concrete target words in Experiment 2
Step 1 failure
We found a significantly higher probability of Step 1 failure for abstract than for concrete words, F(1, 58) = 176.53, MSE = .04, p = .001, and a significantly higher probability of Step 1 failure in the definition than in the event condition, F(1, 58) = 49.63, MSE = .04, p = .001. There was also a significant interaction, F(1, 58) = 45.10, MSE = .04, p = .001. As with the number of correct responses, the interaction seems to reflect a stronger effect of concreteness in the definition than in the event condition.
Significantly fewer alternates were reported in the event than in the definition condition, F(1, 58) = 7.70, MSE = .01, p = .007. As in Experiment 1, more alternates were produced in response to sentences about abstract than about concrete words, F(1, 58) = 72.84, MSE = .01, p = .001. The interaction between these factors was not significant.
Significantly fewer failures to respond occurred in the event than in the definition condition, F(1, 58) = 17.87, MSE = .02, p = .001, and significantly more failures to respond occurred for abstract than for concrete words, F(1, 58) = 23.73, MSE = .02, p = .001. The interaction was significant, F(1, 58) = 29.93, MSE = .02, p = .001, with tests of simple main effects revealing a concreteness effect in the definition condition, F(1, 116) = 53.48, MSE = .01, p = .001, but not in the event condition, F < 1.
We found significantly more negative TOTs in the definition than in the event condition, F(1, 58) = 14.11, MSE = .01, p = .001. The effect of concreteness on negative TOTs was also significant, F(1, 58) = 9.20, MSE = .01, p = .004, as was the interaction, F(1, 58) = 8.05, MSE = .01, p = .006. Tests of simple main effects revealed a significant effect of concreteness in the definition, F(1, 116) = 17.23, MSE = .01, p = .001, but not in the event, F < 1, sentences.
Step 2 failure
Most importantly, we found a significantly higher probability of Step 2 failure for abstract than for concrete words, F(1, 58) = 9.43, MSE = .03, p = .003, and a significantly higher probability of Step 2 failure in the definition than in the event condition, F(1, 58) = 9.59, MSE = .04, p = .003. The interaction was also significant, F(1, 58) = 4.63, MSE = .03, p = .04. Tests of simple main effects showed a significant effect of concreteness with definitions, F(1, 116) = 13.64, MSE = .02, p = .001, but not with event sentences. Significantly more positive TOTs were reported in the definition condition than in the event condition, F(1, 58) = 6.82, MSE = .02, p = .01, but no effect of concreteness and no significant interaction appeared, both Fs < 1.
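For readers unfamiliar with the two-step measures, the sketch below shows one way the two failure probabilities analyzed above might be computed from a participant's response counts. It assumes the usual reading of Gollan and Brown's (2006) procedure, which is not spelled out in this section: Step 1 failures are alternates, failures to respond, and negative TOTs (the response types reported under Step 1 above) as a proportion of all items, and Step 2 failures are positive TOTs as a proportion of the items on which Step 1 succeeded (correct responses plus positive TOTs). The class and function names are hypothetical, and the counts are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class ResponseCounts:
    correct: int        # target word produced
    positive_tot: int   # TOT confirmed for the target at recognition
    negative_tot: int   # TOT that was not for the target
    alternate: int      # a different word produced
    no_response: int    # failure to respond

def step_failure_probabilities(c: ResponseCounts):
    """Return (P(Step 1 failure), P(Step 2 failure)) under the assumed formulas."""
    step1_failures = c.alternate + c.no_response + c.negative_tot
    step1_successes = c.correct + c.positive_tot
    total = step1_failures + step1_successes
    p_step1 = step1_failures / total if total else float("nan")
    # Step 2 is only defined for items on which Step 1 succeeded.
    p_step2 = c.positive_tot / step1_successes if step1_successes else float("nan")
    return p_step1, p_step2

# Made-up counts for one participant and 34 target words (not data from the study).
example = ResponseCounts(correct=18, positive_tot=2, negative_tot=1,
                         alternate=9, no_response=4)
p1, p2 = step_failure_probabilities(example)
print(f"P(Step 1 failure) = {p1:.2f}, P(Step 2 failure) = {p2:.2f}")
```

Conditioning Step 2 on Step 1 success is what allows the two measures to dissociate: a condition can produce many Step 1 failures without necessarily raising the proportion of successfully selected words that then fail at phonological retrieval.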
Word category
Although Allen and Hulme’s (2006) sets of abstract and concrete words were matched on a number of important variables, they were not matched for word category. Whereas all of the concrete words were nouns, the abstract words comprised nine adjectives, 11 nouns, and 14 verbs. We therefore investigated whether we could find any evidence in Experiment 2 that the observed differences in performance were caused by word category rather than concreteness.
In the definition condition, the probabilities of Step 1 failure were .78 for abstract adjectives, .72 for abstract nouns, and .57 for abstract verbs. These values are all much higher than the probability of Step 1 failure for concrete words (.39). The probabilities of Step 2 failure were .17 for abstract adjectives, .22 for abstract nouns, and .10 for abstract verbs, so the value for Step 2 failure for abstract nouns was descriptively higher than that for concrete words (.10).
In the event condition, the probabilities of Step 1 failure were .36 for abstract adjectives, .32 for abstract nouns, and .41 for abstract verbs, as compared to .26 for the concrete words. Although these results suggest that the event condition may have been particularly beneficial for abstract adjectives and nouns, evidence remains of an advantage for concrete relative to abstract nouns, verbs, and adjectives. The probabilities of Step 2 failure in the abstract condition were .02 for adjectives, .06 for nouns, and .03 for verbs, as compared to .04 in the concrete word condition.
Strength of semantic competitors
An attempt was made to investigate whether the abstract and concrete target words faced equally strong semantic competitors during word production in Experiment 2. Additional data were therefore collected to determine how well the most frequent alternates to the target words fit into the event and definition sentences. A group of 16 new participants, drawn from the same population as those who took part in the main experiment, were shown each of the definition sentences together with three words. One of the words was the target word, and the other two words were the most frequent alternates produced by participants in the definition condition in the main experiment. The participants were asked to indicate how well each of these words fit into its corresponding sentence on a scale of 1–5, where 5 = very well and 1 = very badly. A further 15 new participants were shown the target word and the two most frequent alternates produced by participants for the event sentences in the main experiment and rated how well each of these words fit into its corresponding sentence, on the same 1–5 scale.
The mean ratings are shown in Table 5. In the event condition, a significant effect of type of word emerged, F(2, 66) = 162.2, MSE = .54, p = .001, but no effect of concreteness, F < 1, and no significant interaction, F < 1. Newman–Keuls tests revealed that the target words received significantly higher ratings than did the highest-rated alternate, which in turn received higher ratings than the second alternate (both ps < .01). In the definition condition, we found a significant effect of concreteness, F(1, 66) = 14.65, MSE = .57, p = .001, a significant effect of type of word, F(2, 66) = 150.81, MSE = .57, p = .001, and a significant interaction, F(2, 66) = 15.71, MSE = .54, p = .001. Newman–Keuls tests revealed no significant effect of concreteness on the ratings given to the concrete and abstract target words. However, the alternate words in the abstract sentences were given significantly higher ratings than were the alternate words in the concrete condition (p < .01).
Table 5 Participants’ ratings of how well the target word and alternates fit into the definition and event sentences (1 = very badly, 5 = very well)
What these findings reveal is that the abstract and concrete target words fit equally well into the definition sentences and into the event sentences. Such an outcome is consistent with the normative data collected for the event sentences prior to Experiment 2, as well as with the normative data collected for the definitions by Allen and Hulme (2006). No evidence emerged from the analyses that the abstract target words faced greater competition from alternates than did the concrete words in the event sentences. The situation was somewhat different for the definition sentences, however. Here, the alternates for abstract target words were seen as fitting the definitions better than the alternates for the concrete words did. It therefore appears that abstract words in the definition condition may face stronger competition from alternates than do the concrete target words.
Discussion
In Experiment 2, the effect of concreteness on the proportion of words correctly retrieved was significantly smaller when participants were asked to produce words from event sentences than when they were asked to produce words from dictionary definitions. The use of event sentences seemed, therefore, to have had the desired effect of substantially improving the probability that the semantic and lexical representations of abstract words would be successfully activated relative to concrete words. The critical issue was whether differences would remain between abstract and concrete words in terms of Step 2 (phonological) retrieval.
As in Experiment 1, we again found a significantly greater probability of Step 2 failure for abstract than for concrete words in the definition task. One possibility is that this effect arose because of weaker connections between the lexical and phonological levels in the word production system for abstract than for concrete words. If this was indeed the case, concreteness should have exerted an effect on the probability of Step 2 failure when the task was to retrieve words from event scenarios. However, the results of Experiment 2 revealed no significant effect of concreteness on the probability of Step 2 failure in the event condition. It therefore appears that the increased probability of Step 2 failure for abstract words in the definition condition was a direct consequence of the weaker activation of an abstract word’s lexical representation from its dictionary definition. When the semantic and lexical activation of abstract words was increased by presenting event sentences instead of definitions, abstract words were no longer associated with a greater probability of Step 2 retrieval failure.
It might be argued that this interaction came about because of a floor effect in the number of TOTs in the event condition. However, Table 4 shows that the probability of Step 1 failure for abstract words in the event condition was similar to that of Step 1 failure for concrete words in the definition condition. This suggests approximately equivalent levels of lexical access in these two conditions. If there were an independent phonological retrieval problem for abstract words, then a greater probability of Step 2 failures for abstract than for concrete words should have been observed in these two conditions. As Table 4 shows, this was clearly not the case. Experiment 2, therefore, provided no evidence that abstract words suffer from an independent problem at Step 2 of the word production process of the kind that appears to exist for words of low frequency (Kittredge et al., 2008). Nevertheless, it must be acknowledged that for the present study we used only binary measures. One cannot rule out the possibility that more continuous measures, such as reaction times, might reveal more subtle effects of concreteness on Step 2 retrieval.
The difference that remained between the correct retrieval of abstract and concrete words in the event condition was directly related to the larger number of alternates that participants generated for abstract words. A possible explanation for the large number of remaining alternates is that the abstract words may have had more synonyms than the concrete words. However, the additional normative data collected at the end of Experiment 2 provided no evidence that the most common alternates for abstract words in the event condition were more closely related to the meaning of the event sentences than were the most common alternates for the concrete words. Because we found no evidence that abstract words faced stronger competition from alternates in the event condition, it appears that the greater number of alternates produced in the abstract word condition is a consequence of the failure to access the correct lexical representation as a result of weak semantic–lexical weights (Hanley et al., 2004).
In the definition condition, however, participants did rate the most common alternates to the abstract target words as being more compatible with the definitions than the most common alternates to the concrete target words. This observation is consistent with Newton and Barry’s (1997) claim that abstract words are difficult to retrieve in the definition task because they face stronger competition from semantic neighbors. It seems unlikely, however, that this is one of the reasons why abstract words were recalled less well. If this had been the case, a larger effect of concreteness on the number of alternates produced should have been observed in the definition condition than in the event condition. In fact, though, this interaction failed to approach significance. The poor retrieval of abstract relative to concrete words in Experiment 2 is therefore explicable solely in terms of weaker semantic–lexical weights for abstract words (Hanley et al., 2004); there is no need to suggest that interference from competitors of abstract words plays an important additional role. As in Experiment 1, the evidence suggests that the increased number of alternates is a consequence rather than a cause of poor lexical retrieval of abstract words. Nevertheless, the finding that the alternates of abstract words were considered to fit the definitions so well merits further investigation. No independent formal evidence has been found that abstract words have more synonyms, although the use of latent semantic analysis (Hoffman, Rogers, & Lambon Ralph, 2011) has clearly established that abstract words have more senses, and therefore are more ambiguous, than concrete words. Whether abstract words also have more synonyms would be an interesting issue for future research on latent semantic analysis to address.
Although we do not have any direct measure of the degree of association between the target words and the event sentences, it seems reasonable to assume that recalling words in response to specific events is likely to require more associative processing and less featural processing than recalling words from their definitions. Consequently, the significantly greater improvement in the retrieval of abstract than of concrete words in the event condition provides some support for the views of Crutch and Warrington (2005) that abstract words tend to be represented in semantic memory in terms of their associative connections with other words rather than of their semantic features. Nevertheless, recall of concrete words did improve to some extent in the event condition. So, as Crutch and Warrington maintained, any differential dependence of concrete and abstract words on associative and featural information appears to be relative rather than absolute. An alternative explanation of the greater improvement for abstract words would be the presence of a ceiling effect in performance for concrete words in the event condition. Although it is impossible to entirely dismiss this possibility, the similarity of the standard deviations for concrete words correctly recalled from the definition and event sentences suggests that performance had not reached ceiling in the event condition (see Table 3).
Finally, the Experiment 2 Results section provided some evidence that word category might have had an influence on performance in Experiment 2. For example, although Step 1 retrieval in the event condition was consistently more difficult for abstract than for concrete words regardless of word category, the effect was descriptively smaller for nouns and adjectives than for verbs, and it appeared that Step 1 access might be more difficult for abstract verbs than for abstract nouns or adjectives. This outcome must be treated with caution, as the items from these different word categories were not equated on variables such as frequency or age of acquisition. Nevertheless, the effects of word category on lexical retrieval in the event and definition tasks would be an interesting issue for future research to address.