Semantic richness effects in lexical decision: The role of feedback
Across lexical processing tasks, it is well established that words with richer semantic representations are recognized faster. This suggests that the lexical system has access to meaning before a word is fully identified, and is consistent with a theoretical framework based on interactive and cascaded processing. Specifically, semantic richness effects are argued to be produced by feedback from semantic representations to lower-level representations. The present study explores the extent to which richness effects are mediated by feedback from lexical- to letter-level representations. In two lexical decision experiments, we examined the joint effects of stimulus quality and four semantic richness dimensions (imageability, number of features, semantic neighborhood density, semantic diversity). With the exception of semantic diversity, robust additive effects of stimulus quality and richness were observed for the targeted dimensions. Our results suggest that semantic feedback does not typically reach earlier levels of representation in lexical decision, and further reinforce the idea that task context modulates the processing dynamics of early word recognition processes.
Keywords: Stimulus quality; Semantic richness; Visual word recognition; Lexical decision; Semantic feedback; RT distributional analyses
Across a number of lexical processing paradigms, including perceptual identification, lexical decision (i.e., classifying letter strings as words or nonwords such as flirp), speeded pronunciation (i.e., reading letter strings aloud), and semantic categorization (e.g., classifying words as animate or inanimate), it is well established that semantically rich words, which are associated with relatively more semantic information, are recognized faster (Pexman, Hargreaves, Siakaluk, Bodner, & Pope, 2008; Yap, Pexman, Wellsby, Hargreaves, & Huff, 2012). Importantly, the richness of a word’s semantic representation is not a unitary construct and can be reflected by a number of dimensions, including the number of semantic features associated with its referent (McRae, Cree, Seidenberg, & McNorgan, 2005), its semantic neighborhood density (Shaoul & Westbury, 2010), its number of senses (Hoffman, Lambon Ralph, & Rogers, 2013; Miller, 1990), the number of distinct first associates elicited by the word in free association (Nelson, McEvoy, & Schreiber, 1998), imageability, the extent to which the word evokes mental imagery (Cortese & Fugett, 2004), body-object interaction, the extent to which a human body can interact with the word’s referent (Siakaluk, Pexman, Aguilera, Owen, & Sears, 2008), sensory experience ratings, the extent to which a word evokes a sensory or perceptual experience (Juhasz & Yap, 2013), and emotional valence (i.e., whether a word is positive, negative, or neutral; Yap & Seow, 2014).
These findings collectively converge on the idea that the lexical system has access to meaning before a word is fully identified (Balota, 1990). While the mere existence of meaning-based influences on visual word recognition is no longer contentious, the processes and mechanisms underlying these influences remain poorly understood (for reviews, see Balota, Ferraro, & Connor, 1991; Pexman, 2012). For example, the role of word meaning is minimal in theories of lexical access (Larsen, Mercer, Balota, & Strube, 2008), and this is reflected in how computational models of word recognition have generally not implemented semantics (but see Harm & Seidenberg, 2004, for a notable exception).
Richness effects through semantic feedback
While the feedback activation account is predicated on the idea that lexical-level activity drives responses on word recognition tasks, there exist competing theoretical accounts which can accommodate semantic richness effects in lexical decision without requiring semantics-to-orthography feedback. For example, according to Borowsky and Besner’s (1993) multistage activation model, lexical decisions are primarily based on activity within the semantic system; such a framework yields semantic effects without feedback. However, Pexman and Lupker (1999) have argued that certain empirical findings are difficult to reconcile with this perspective. Specifically, if we assume that lexical decisions are driven by semantic-level activity, it is unclear how a common process can simultaneously explain effects of homophony (i.e., slower responses for homophones) and number of senses (i.e., faster responses for words with many senses) in lexical decision. For example, suppose the delayed responses for homophones (e.g., maid) are due to their activating multiple semantic representations (i.e., those for made and maid) which subsequently compete with each other, thereby prolonging semantic settling times. If this view is correct, then words with many senses (e.g., bank), which map onto multiple semantic representations, should also elicit slower responses. However, when the effects of number of senses and homophony were examined simultaneously within the same lexical decision experiment, response times (RTs) were slower for homophones but faster for words with many senses (Pexman & Lupker, 1999). The feedback account explains these findings in a principled and unified manner. Specifically, feedback from phonological to orthographic representations underlies the homophone effect, while feedback from semantic to orthographic representations underlies the number of senses effect.
The present study
In summary, the available evidence is consistent with the idea that feedback activation between different levels of representation in the lexical system is necessary for accommodating both semantic richness and homophone effects in the word recognition literature. While researchers have explored feedback from semantic- to lexical-level representations (Pexman et al., 2002), and from phonological to orthographic representations (Pexman et al., 2001), the role of word-to-letter feedback has received less attention. As described earlier, the classic explanation for the word superiority effect is based on the top-down influence of word- on letter-level representations (McClelland & Rumelhart, 1981). As a result, the architectural assumption of word-to-letter feedback is a fundamental aspect of influential word recognition models, including the dual-route cascaded (DRC) model (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001), the multiple read-out model (Grainger & Jacobs, 1996), the bimodal interactive activation framework (Grainger, Muneaux, Farioli, & Ziegler, 2005), and the CDP+ and CDP++ models (Perry, Ziegler, & Zorzi, 2007; Perry, Ziegler, & Zorzi, 2010).
More pertinently, the interaction between semantic priming and target degradation has been explained using semantic feedback to letter-level representations by way of lexical-level representations. For example, in lexical decision, words are recognized more quickly when preceded by a semantically related word (e.g., doctor – NURSE) than by an unrelated control (e.g., porter – NURSE); this is known as the semantic priming effect. A robust finding in the semantic priming literature is that semantic priming effects are larger when targets are visually degraded, compared to when they are presented clearly (Balota, Yap, Cortese, & Watson, 2008; Meyer, Schvaneveldt, & Ruddy, 1975). Using an interactive activation framework (Stolz & Besner, 1996; 1998) much like the one depicted in Fig. 2, McNamara (2005) suggested that this interaction arises because the presentation of a prime word (e.g., doctor) activates the semantic representations of related concepts (e.g., nurse, medicine, sick), and these related concepts, through feedback pathways, will then preactivate their respective lexical- and letter-level representations (see also Brown, Stolz, & Besner, 2006). As a consequence of this compensatory feedback, targets preceded by related, compared to unrelated, primes will be disrupted to a lesser extent by visual degradation, thereby yielding the overadditive priming × stimulus quality interaction.
Despite the pervasiveness of the assumption that meaning-level information reaches the letter level, this assumption has not, to our knowledge, been empirically tested. In two experiments, we explore the role of word-to-letter feedback in mediating semantic richness effects, by studying the joint effects of stimulus quality (clear vs. degraded) with four theoretically important richness dimensions (E1: imageability & number of features; E2: semantic neighborhood density & number of senses). Assuming that semantic richness effects reflect partially activated letter-level representations, the predictions are straightforward. Specifically, in addition to the main effects of stimulus quality and richness, one should observe an overadditive interaction wherein the effects of stimulus degradation are smaller for words which are semantically richer.
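The contrast between additive and interactive outcomes can be expressed as a difference of differences over the four cells of each 2 × 2 design. The sketch below is purely illustrative: the function name is hypothetical and the cell means are made-up round numbers chosen to show the two patterns, not values from the experiments.

```python
def interaction_contrast(clear_rich, degraded_rich, clear_poor, degraded_poor):
    """Difference of differences for a 2 x 2 design (all values in ms).

    Returns the degradation effect for semantically poorer words minus
    the degradation effect for semantically richer words.
    """
    degradation_rich = degraded_rich - clear_rich
    degradation_poor = degraded_poor - clear_poor
    return degradation_poor - degradation_rich


# Hypothetical cell means (NOT taken from the experiments).
additive = interaction_contrast(clear_rich=550, degraded_rich=630,
                                clear_poor=580, degraded_poor=660)
interactive = interaction_contrast(clear_rich=550, degraded_rich=610,
                                   clear_poor=580, degraded_poor=660)

print(additive)     # 0: degradation costs 80 ms for both word types
print(interactive)  # 20: degradation costs 20 ms less for richer words
```

A contrast near zero corresponds to additivity, whereas a reliably positive contrast corresponds to the interaction predicted if richness effects reach the letter level.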
In order to characterize observed effects in a more fine-grained manner, the data are examined both at the level of mean RTs and at the level of RT distributional characteristics. Analyzing the influence of factors on mean RTs alone has been shown to be inadequate and indeed sometimes misleading (see Balota & Yap, 2011, for a review). For example, Heathcote, Popiel, and Mewhort (1991) examined color-naming RTs to congruent (e.g., RED displayed in red) and neutral (e.g., XXX displayed in red) Stroop stimuli, and found no difference in mean RTs. However, when they analyzed the effect of variables on different portions of the RT distributions, they found a facilitatory effect of congruency (i.e., congruent faster than neutral) on the modal portion of the RT distribution but an inhibitory effect (i.e., congruent slower than neutral) in the slow tail of the distribution. These opposing effects cancelled each other out, thereby producing a spurious null effect in means.
In the present study, empirical RT distributions are fitted to the theoretical ex-Gaussian function, which is a convolution of a normal and exponential distribution. This yields three parameter estimates: μ and σ (mean and standard deviation of the normal distribution) and τ (mean of the exponential distribution). Ex-Gaussian analysis allows us to evaluate the extent to which an effect is reflected by distributional shifting (μ) and/or an increase in the tail of the distribution (τ). These analyses are complemented by quantile plots, which provide a graphic representation of distributional effects. These distributional analyses will fulfill two important objectives. First, our results will help shed more light on the impact of semantic richness on RT distributions. For example, Yap and Seow (2014) reported that emotional valence effects in lexical decision (i.e., slower responses to neutral words, relative to positive and negative words) reflected both distributional shifting and an increase in the tail of the distribution. These results are difficult to reconcile with the view that valence effects in lexical decision are fully attributable to early, preconscious processes (cf. Kousta, Vinson, & Vigliocco, 2009); relatively automatic effects (e.g., masked repetition or semantic priming) are typically mediated exclusively by distributional shifting (Balota et al., 2008; Gomez, Perea, & Ratcliff, 2013). Instead, the findings are more consistent with the idea that positive and negative words, which are semantically richer, elicit stronger semantic feedback to word-level representations, thereby making lexical decision less attentionally demanding (Balota & Chumbley, 1984) for such words. It is unclear if other semantic richness effects (e.g., imageability, number of features, semantic neighborhood density, number of senses) are similarly mediated by distributional shifting and changes in the slow tail.
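To make the three parameters concrete, the sketch below simulates ex-Gaussian RTs and recovers μ, σ, and τ with a simple method-of-moments estimator. This is an assumption-laden illustration rather than the fitting procedure used in the study: the function names, the simulated parameter values, and the choice of a moments estimator are all hypothetical, and likelihood-based fitting is generally preferable for real data.

```python
import random
import statistics


def exgauss_moments(rts):
    """Method-of-moments ex-Gaussian estimates (mu, sigma, tau) from RTs.

    Relies on three identities of the ex-Gaussian distribution:
    mean = mu + tau, variance = sigma^2 + tau^2,
    third central moment = 2 * tau^3.
    """
    n = len(rts)
    mean = statistics.fmean(rts)
    m2 = sum((x - mean) ** 2 for x in rts) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in rts) / n  # third central moment
    if m3 <= 0:
        raise ValueError("no positive skew; tau cannot be estimated")
    tau = (m3 / 2.0) ** (1.0 / 3.0)
    mu = mean - tau
    sigma = max(m2 - tau ** 2, 0.0) ** 0.5  # clamp at 0 if tau^2 exceeds variance
    return mu, sigma, tau


def simulate_exgauss(n, mu, sigma, tau, seed=0):
    """Draw n values as the sum of a normal and an exponential variate."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


# Hypothetical generating values, chosen only for illustration.
sample = simulate_exgauss(50_000, mu=500.0, sigma=50.0, tau=100.0)
mu_hat, sigma_hat, tau_hat = exgauss_moments(sample)
print(round(mu_hat), round(sigma_hat), round(tau_hat))
```

Because the ex-Gaussian mean equals μ + τ, an effect on mean RT can arise from a shift in μ, a change in τ, or both, which is why the two parameters are examined separately.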
More importantly, there is compelling evidence that semantic richness effects do not tap a single undifferentiated dimension, but instead reflect distinct theoretical frameworks (Pexman, Siakaluk, & Yap, 2014). Consistent with this, intriguing between-task dissociations have been reported in the literature. For example, semantic neighborhood density facilitates lexical decision performance, but has no effect on semantic classification performance (Yap et al., 2012). Likewise, while words with more senses (i.e., more ambiguous) enjoy a processing advantage in lexical decision, the effect of ambiguity is less clear in tasks which place an emphasis on semantic activation, such as semantic categorization or semantic relatedness (i.e., are these two words related?). Specifically, there is in some cases an ambiguity disadvantage in semantic relatedness (Hoffman & Woollams, 2015; Pexman, Hino, & Lupker, 2004; Piercey & Joordens, 2000) while ambiguity effects are either inhibitory or null in semantic categorization (Hino, Lupker, & Pexman, 2002). By ascertaining how stimulus quality and semantic variables modulate the shape, rather than just the mean, of distributions, one may find dissociations that are apparent only at the level of distributional characteristics.
Forty undergraduates (31 females) from the National University of Singapore participated for partial course credit. The participants’ first language was English, and they had normal or corrected-to-normal vision.
Two 2 × 2 designs were incorporated within the same experiment, with non-overlapping items used to examine the effects of each variable. Specifically, we examined Stimulus Quality (clear or degraded) × Imageability (high or low) and Stimulus Quality × Number of Features (high or low). All variables were manipulated within-participants and the dependent variables were RTs and accuracy rates.
Table: Descriptive statistics for the word and nonword stimuli used in Experiment 1. Conditions: high imageability (N = 60) vs. low imageability (N = 60); high number of features (N = 60) vs. low number of features (N = 60). Measures: number of letters, number of syllables, orthographic neighborhood size, number of features.
PC-compatible computers running E-prime software (Schneider, Eschman, & Zuccolotto, 2001) were used for stimulus presentation and data collection. Participants were individually tested in sound-attenuated cubicles, and positioned approximately 60 cm from the computer screen. Participants were instructed to decide whether the letter string presented formed a word or nonword by making the appropriate button press (slash key for words and Z key for nonwords). Participants were encouraged to respond quickly but not at the expense of accuracy. There were 20 practice trials, followed by six experimental blocks of 80 trials each, with breaks between blocks. The order in which stimuli were presented was randomized anew for each participant. Stimuli were presented in uppercase 14-point Courier New, and each trial comprised the following order of events: (a) a fixation point (+) at the center of the monitor for 400 ms, (b) a blank screen for 400 ms, and (c) the target. The target remained on the screen for 4,000 ms or until a response was made. If a response was incorrect, a 170-ms tone was presented simultaneously with the word “Incorrect” displayed slightly below the fixation point for 450 ms. Half the targets were degraded by rapidly alternating letter strings with a randomly generated mask of the same length. For example, the mask @$#&% was presented for 14 ms, followed by a five-letter target word for 28 ms; the two rapidly alternated until a response was detected. Mask patterns were consistent within a trial, and were generated from random permutations of the following symbols: &@?!$*%#?. Across participants, targets were counterbalanced across degraded and clear conditions. This degradation method has been used in a number of studies (Balota et al., 2008; Thomas, Neely, & O’Connor, 2012; Yap & Balota, 2007; Yap, Tse, & Balota, 2009) and has been shown to yield qualitatively similar effects to contrast reduction (O’Malley, Reynolds, & Besner, 2007).
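The degradation procedure can be sketched as a frame schedule in which one mask alternates with the target. The code below is a minimal illustration under stated assumptions, not the original E-Prime script: the function names and the `n_cycles` parameter are hypothetical (in the experiment the alternation continued until a response), and drawing symbols without replacement is only one reading of "random permutations".

```python
import random

MASK_SYMBOLS = "&@?!$*%#?"  # symbol set as listed in the procedure
MASK_MS = 14                # mask duration per cycle
TARGET_MS = 28              # target duration per cycle


def make_mask(target, rng):
    """Return a mask as long as the target, drawn without replacement
    from the symbol set (assumption: a random partial permutation)."""
    if len(target) > len(MASK_SYMBOLS):
        raise ValueError("target is longer than the symbol set")
    return "".join(rng.sample(MASK_SYMBOLS, len(target)))


def degraded_frames(target, n_cycles, rng):
    """List (text, duration_ms) frames: mask, then target, repeated.

    The same mask is reused on every cycle, mirroring the statement
    that mask patterns were consistent within a trial.
    """
    mask = make_mask(target, rng)
    frames = []
    for _ in range(n_cycles):
        frames.append((mask, MASK_MS))
        frames.append((target, TARGET_MS))
    return frames


rng = random.Random(1)
for text, duration in degraded_frames("NURSE", n_cycles=2, rng=rng):
    print(f"{text}  {duration} ms")
```

With these durations the target is visible for two thirds of each 42-ms cycle.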
Results and discussion
Table: Mean response times (RTs) and accuracy rates as a function of imageability/number of features and stimulus quality. Rows: high and low imageability, high and low number of features, each with its stimulus quality effect.
Imageability
For RTs, the main effect of Imageability was significant by participants, Fp(1, 39) = 25.89, p < .001, MSE = 1015.34, ηp² = .40, but not by items, p = .14; RTs were faster for high-imageability words (M = 596 ms) than for low-imageability words (M = 621 ms). The main effect of Stimulus Quality was significant by participants, Fp(1, 39) = 119.96, p < .001, MSE = 2178.28, ηp² = .75, and by items, Fi(1, 118) = 194.87, p < .001, MSE = 2116.70, ηp² = .62; RTs were faster for clear words (M = 568 ms) than for degraded words (M = 649 ms). The Stimulus Quality × Imageability interaction was not significant by participants or by items, Fs < 1. In order to establish the robustness of the non-significant by-participants interaction in RTs (see Gomez & Perea, 2014), we used the package BayesFactor (Morey, Rouder, & Jamil, 2015) to compute Bayes factors (BFs) for the various alternative hypotheses in our design (see Rouder, Morey, Speckman, & Province, 2012) against the null hypothesis that there are no differences across conditions. For example, a BF of 10 means that there is 10:1 evidence in favor of the specific alternative hypothesis being tested. The additive (i.e., two main effects) model, BF = 3.97 × 10²³, was preferred over all other models, including the model with the interaction, BF = 8.77 × 10²². Put another way, the data were 4.53 (i.e., 3.97 × 10²³ / 8.77 × 10²²) times more likely to occur under the additive model, compared to the interactive model. Turning to accuracy rates, the main effect of Imageability was not significant by participants or by items, Fs < 1. The main effect of Stimulus Quality was significant by participants, Fp(1, 39) = 11.94, p = .001, MSE = .002, ηp² = .23, and by items, Fi(1, 118) = 8.42, p = .004, MSE = .005, ηp² = .07; accuracy rates were higher for clear words (M = .93) than for degraded words (M = .90). The Stimulus Quality × Imageability interaction was not significant by participants or by items, Fs < 1.
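The model comparison reduces to simple arithmetic: when two models are each evaluated against the same null model, the ratio of their Bayes factors gives the evidence for one over the other. The sketch below reproduces the reported ratio for imageability; the function and variable names are hypothetical, and only the two Bayes factors come from the text.

```python
def relative_bayes_factor(bf_model_a, bf_model_b):
    """Bayes factor of model A over model B, given each model's Bayes
    factor against the same null model."""
    return bf_model_a / bf_model_b


bf_additive = 3.97e23     # two main effects vs. null (reported value)
bf_interactive = 8.77e22  # main effects plus interaction vs. null (reported value)

ratio = relative_bayes_factor(bf_additive, bf_interactive)
print(f"{ratio:.2f}")  # 4.53
```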
We now turn to the ex-Gaussian parameters. For μ, the main effect of Imageability was significant, Fp(1, 39) = 8.71, p = .005, MSE = 1056.95, ηp² = .18; μ was greater for low-imageability words (M = 488 ms) than for high-imageability words (M = 473 ms). The main effect of Stimulus Quality was significant, Fp(1, 39) = 54.00, p < .001, MSE = 1626.22, ηp² = .58; μ was greater for degraded words (M = 504 ms) than for clear words (M = 457 ms). The Stimulus Quality × Imageability interaction was not significant, F < 1. For σ, none of the effects were significant. Finally, for τ, the main effect of Imageability approached significance, Fp(1, 39) = 3.46, p = .071, MSE = 2540.14, ηp² = .08, and the main effect of Stimulus Quality was significant, Fp(1, 39) = 10.01, p = .003, MSE = 4603.63, ηp² = .20; τ was greater for less imageable words (M = 137 ms) than for more imageable words (M = 122 ms), and τ was greater for degraded words (M = 147 ms) than for clear words (M = 113 ms). The Stimulus Quality × Imageability interaction was not significant, F < 1.
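Because the mean of an ex-Gaussian distribution equals μ + τ, the reported parameter means can be checked against the reported mean RTs for imageability. The sketch below does this with the values given in the text; the dictionary layout and variable names are hypothetical. Small discrepancies are to be expected, presumably because of rounding and because the parameters are estimated separately for each participant (an assumption, as the text does not say).

```python
# Reported means of the ex-Gaussian estimates and of the RTs (ms).
reported = {
    "high imageability": {"mu": 473, "tau": 122, "mean_rt": 596},
    "low imageability": {"mu": 488, "tau": 137, "mean_rt": 621},
    "clear": {"mu": 457, "tau": 113, "mean_rt": 568},
    "degraded": {"mu": 504, "tau": 147, "mean_rt": 649},
}

gaps = {}
for condition, values in reported.items():
    implied_mean = values["mu"] + values["tau"]  # ex-Gaussian mean = mu + tau
    gaps[condition] = implied_mean - values["mean_rt"]
    print(f"{condition}: mu + tau = {implied_mean} ms, "
          f"reported mean = {values['mean_rt']} ms")
```

In every condition the implied mean falls within a few milliseconds of the reported mean.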
Number of features
The main effect of Number of Features was significant by participants, Fp(1, 39) = 32.72, p < .001, MSE = 1080.01, ηp² = .46, and by items, Fi(1, 118) = 10.06, p = .002, MSE = 6212.87, ηp² = .08; RTs were faster for words with more features (M = 575 ms) than for words with fewer features (M = 605 ms). The main effect of Stimulus Quality was significant by participants, Fp(1, 39) = 161.88, p < .001, MSE = 1650.19, ηp² = .81, and by items, Fi(1, 118) = 189.99, p < .001, MSE = 2142.63, ηp² = .62; RTs were faster for clear words (M = 549 ms) than for degraded words (M = 631 ms). The Stimulus Quality × Number of Features interaction was not significant by participants or by items, Fs < 1. For the by-participants data, the additive model was preferred over all other models; the data were 4.28 times more likely to occur under the additive model, BF = 4.83 × 10²⁷, compared to the model with the interaction, BF = 1.13 × 10²⁷. Turning to accuracy rates, the main effect of Number of Features was significant by participants, Fp(1, 39) = 27.77, p < .001, MSE = .001, ηp² = .42, and by items, Fi(1, 118) = 6.23, p = .014, MSE = .008, ηp² = .05; accuracy rates were higher for words with more features (M = .96) than for words with fewer features (M = .93). The main effect of Stimulus Quality was significant by participants, Fp(1, 39) = 7.70, p = .008, MSE = .001, ηp² = .16, and by items, Fi(1, 118) = 5.34, p = .023, MSE = .003, ηp² = .04; accuracy rates were higher for clear words (M = .96) than for degraded words (M = .94). The Stimulus Quality × Number of Features interaction approached significance by participants, p = .06, and was significant by items, Fi(1, 118) = 4.30, p = .04, MSE = .003, ηp² = .04; the degradation effect was larger for words with fewer features than for words with more features.
In Experiment 1, reliable additive effects of stimulus quality and semantic richness were observed on RTs. That is, responses were faster for clear words and for semantically richer words, whether semantic richness reflected imageability or number of features, but there was no hint of an interaction for either dimension. The RT distributional analyses further revealed that the effects of imageability and number of features were mediated by a combination of distributional shifting and an increase in the tail of the distribution, replicating the pattern observed by Yap and Seow (2014) for emotional valence. Importantly, the interaction between stimulus quality and semantic richness was not significant for any ex-Gaussian parameter for both imageability and number of features, confirming that the mean-level additive effects generalize to the distributional characteristics. Given the theoretical importance of this pattern, Experiment 2 was designed to establish if these results were replicable when one examines two additional semantic richness dimensions, semantic neighborhood density and number of senses.
Fifty-six undergraduates (42 females) from the University of Calgary participated for partial course credit. The participants’ first language was English, and they had normal or corrected-to-normal vision.
Like Experiment 1, two 2 × 2 designs were incorporated within the experiment: Stimulus Quality × Semantic Neighborhood Density (dense or sparse) and Stimulus Quality × Semantic Diversity (high or low). All variables were manipulated within-participants and the dependent variables were RTs and accuracy rates.
Table: Descriptive statistics for the word and nonword stimuli used in Experiment 2. Conditions: high neighborhood density (N = 60) vs. low neighborhood density (N = 60); high semantic diversity (N = 60) vs. low semantic diversity (N = 60). Measures: number of letters, number of syllables, orthographic neighborhood size.
Same as Experiment 1.
Results and discussion
Table: Mean response times (RTs) and accuracy rates as a function of semantic neighborhood density/semantic diversity and stimulus quality. Rows: high and low neighborhood density, high and low semantic diversity, each with its stimulus quality effect.
Semantic neighborhood density
The main effect of Semantic Neighborhood Density was significant by participants, Fp(1, 55) = 46.81, p < .001, MSE = 1498.84, ηp² = .46, and by items, Fi(1, 118) = 9.76, p = .002, MSE = 9269.66, ηp² = .08; RTs were faster for words in dense neighborhoods (M = 729 ms) than for words in sparse neighborhoods (M = 764 ms). The main effect of Stimulus Quality was significant by participants, Fp(1, 55) = 114.90, p < .001, MSE = 10985.46, ηp² = .68, and by items, Fi(1, 118) = 471.98, p < .001, MSE = 2729.66, ηp² = .80; RTs were faster for clear words (M = 672 ms) than for degraded words (M = 822 ms). The Stimulus Quality × Semantic Neighborhood Density interaction was not significant by participants or by items, ps > .17. For the by-participants data, the additive model was preferred over all other models; the data were 3.78 times more likely to occur under the additive model, BF = 6.70 × 10³³, compared to the model with the interaction, BF = 1.77 × 10³³. Turning to accuracy rates, the main effect of Semantic Neighborhood Density was significant by participants, Fp(1, 55) = 36.03, p < .001, MSE = .003, ηp² = .40, and by items, Fi(1, 118) = 6.24, p = .014, MSE = .015, ηp² = .05; accuracy rates were higher for words in dense neighborhoods (M = .94) than for words in sparse neighborhoods (M = .90). The main effect of Stimulus Quality was significant by participants, Fp(1, 55) = 41.96, p < .001, MSE = .003, ηp² = .43, and by items, Fi(1, 118) = 41.70, p < .001, MSE = .003, ηp² = .26; accuracy rates were higher for clear words (M = .95) than for degraded words (M = .90). The Stimulus Quality × Semantic Neighborhood Density interaction was not significant by participants or by items, ps > .22.
Semantic diversity
The main effect of Semantic Diversity was significant by participants, Fp(1, 55) = 30.00, p < .001, MSE = 2555.38, ηp² = .35, and by items, Fi(1, 118) = 7.06, p = .009, MSE = 12275.82, ηp² = .06; RTs were faster for more ambiguous words (M = 729 ms), compared to less ambiguous words (M = 766 ms). The main effect of Stimulus Quality was significant by participants, Fp(1, 55) = 128.28, p < .001, MSE = 10330.36, ηp² = .70, and by items, Fi(1, 118) = 392.36, p < .001, MSE = 3431.45, ηp² = .77; RTs were faster for clear words (M = 671 ms) than for degraded words (M = 824 ms). The Stimulus Quality × Semantic Diversity interaction was not significant by participants or by items, Fs < 1. For the by-participants data, the additive model was preferred over all other models; the data were 5.08 times more likely to occur under the additive model, BF = 3.22 × 10³², compared to the model with the interaction, BF = 6.34 × 10³¹. Turning to accuracy rates, the main effect of Semantic Diversity was significant by participants, Fp(1, 55) = 50.92, p < .001, MSE = .003, ηp² = .48, and by items, Fi(1, 118) = 7.28, p = .008, MSE = .018, ηp² = .06; accuracy rates were higher for more ambiguous words (M = .93), compared to less ambiguous words (M = .88). The main effect of Stimulus Quality was significant by participants, Fp(1, 55) = 38.74, p < .001, MSE = .004, ηp² = .41, and by items, Fi(1, 118) = 45.64, p < .001, MSE = .003, ηp² = .28; accuracy rates were higher for clear words (M = .93) than for degraded words (M = .88). The Stimulus Quality × Semantic Diversity interaction was not significant by participants or by items, Fs < 1.
Like Experiment 1, Experiment 2 yielded robust additive effects of stimulus quality and semantic richness on RTs, for both semantic neighborhood density and semantic diversity. The results of the distributional analyses are less clear-cut. The pattern for semantic neighborhood density was identical to those observed for imageability and number of features in Experiment 1. Specifically, the semantic neighborhood density effect was mediated by distributional shifting and an increase in the distributional tail, and this was not moderated by stimulus quality. Unexpectedly, however, semantic diversity effects were larger for degraded, compared to clear, words in μ. That is, if one examines the modal portion of the RT distribution, words higher on semantic diversity (i.e., more ambiguous words) were affected less by stimulus degradation. This was counteracted by a non-significant opposing trend for the slowest RTs, wherein more ambiguous words were associated with a larger degradation effect (see Table 4); the trade-off between μ and τ produced additivity at the level of the mean. We will postpone discussion of this intriguing pattern until the General discussion.
In two lexical decision experiments, we examined the joint effects of stimulus quality and four semantic richness dimensions (imageability, number of features, semantic neighborhood density, and semantic diversity). Although there is a substantial literature examining the interactions between stimulus quality and word-frequency (e.g., Becker & Killion, 1977; Stanners, Jastrzembski, & Westbrook, 1975), and between stimulus quality and semantic priming (e.g., Meyer et al., 1975), this is, to our knowledge, the first study to examine whether stimulus quality and semantic richness produce additive or interactive effects. Broadly speaking, our findings are straightforward and easy to summarize. Specifically, stimulus quality and each of the four targeted variables produced robust additive effects in mean RTs, with no interactions; with the exception of semantic diversity, this additivity also held for RT distributional characteristics. We will now consider the implications of these findings.
Semantic richness effects: The role of feedback
The semantic feedback account has been an influential perspective for explaining semantic richness effects. Specifically, researchers (e.g., Hino & Lupker, 1996; Pexman et al., 2002) have argued that the facilitation afforded by semantically rich representations is mediated by semantics-to-orthography and semantics-to-phonology feedback. Implicit in this account is the premise that meaning-level activation also reaches the letter level by way of the word level. Indeed, such an assumption is an integral aspect of computational models such as the DRC, multiple read-out, and CDP+/CDP++ models. However, the present results are difficult to reconcile with this view. Specifically, if semantically rich words activate their corresponding letter representations more strongly, then the effect of visual degradation should be smaller for such words. Instead, the major finding from our study is that visual degradation effects are equivalent for words that are high and low in semantic richness, both at the level of mean and at the level of RT distributional characteristics. As such, feedback from semantics to lexical-level representations does not appear to extend to earlier levels of representation in lexical decision.
Although this is the first study to show that stimulus quality and semantic richness produce additive effects in lexical decision, there is a literature (e.g., Balota & Abrams, 1995; Becker & Killion, 1977; O’Malley et al., 2007; O’Malley & Besner, 2008; Plourde & Besner, 1997; Stanners et al., 1975; Yap & Balota, 2007) indicating that the effects of stimulus quality and word-frequency are similarly additive. In general, additive effects of factors on RTs are most naturally accommodated by models based on serially organized discrete stages where processing is thresholded (Borowsky & Besner, 1993; Sternberg, 1969). These effects pose a special problem for computational models incorporating cascaded and interactive processing (O’Malley et al., 2007), which typically produce interactions in simulations (see Reynolds & Besner, 2004). In order for computational models to simulate the additive effects of stimulus quality with both word-frequency and semantic richness, a relatively simple solution is to implement thresholded (as opposed to cascaded) output from the letter level (Besner & Roberts, 2003; Reynolds & Besner, 2004). Activation from the lexical level onwards is cascaded and interactive, and semantic richness effects can then be explained by top-down semantic feedback to lexical-level orthographic and phonological representations.
Of course, this raises the question of why letter-level processing would be thresholded. There have been suggestions that such thresholding is adaptive and reflective of a lexical processor that can flexibly adjust to specific task demands (see Balota & Yap, 2006). For example, in the lexical decision task, the ultimate goal of the participant is to discriminate between familiar/meaningful real words and unfamiliar/meaningless nonwords, a procedure which strongly emphasizes familiarity-based information (Balota & Chumbley, 1984). Yap and Balota (2007) speculated that in experimental contexts where familiarity is useful for driving binary decisions, it might be necessary to perceptually normalize stimuli in order to recover the familiarity-based information. Consistent with this, when the utility of familiarity is undermined in lexical decision by increasing the word-nonword overlap (e.g., by using wordlike distracters such as brane), the effects of stimulus quality and frequency are interactive in the modal portion of the RT distribution (Yap, Balota, Tse, & Besner, 2008). Similarly, O’Malley and Besner (2008), who found additive effects of stimulus quality and frequency in speeded pronunciation when both words and nonwords were presented, suggested that thresholding helps to reduce the likelihood of lexical capture for degraded words. Specifically, a degraded nonword may activate a word strongly enough such that the participant incorrectly reads the nonword as a word; thresholding at the letter level reduces the likelihood of this happening.
The important implication here is that the lexical processing system may not be as modular or inflexible as suggested by frameworks such as the interactive activation model. Instead, in a flexible lexical processing system (Balota, Paul, & Spieler, 1999; Balota & Yap, 2006), different processing pathways support the computation of orthography, phonology, and meaning, and the influences of these pathways are modulated by attentional control systems that are sensitive to experimental task demands. Balota et al. (1999) mainly discussed how particular tasks might emphasize different pathways; for example, lexical decision is primarily driven by the connections between orthography and meaning while pronunciation is driven by the connections between orthography and phonology. Our results, along with those from Besner and colleagues, lend further support to the idea that task context can also modulate the processing dynamics of early word recognition processes, such that letter-level output can be thresholded or cascaded.
At this point, we need to acknowledge that aspects of our account may seem incompatible with certain theoretical frameworks. For example, a central assumption we make is that the influence of stimulus quality is limited to early word processing (i.e., the feature and letter levels) and does not extend to higher levels of representation and processing. Blais, O’Malley, and Besner (2011) have argued that such an assumption is implausible in light of the joint effects of stimulus quality, word frequency, and repetition priming. The repetition priming effect refers to the finding that word recognition is faster on the second presentation of a word than the first (see Tenpenny, 1995, for a review). As described earlier, the effects of stimulus quality and word-frequency are robustly additive (Yap & Balota, 2007), which is consistent with these two factors selectively and respectively influencing letter-level and lexical-level processing. Repetition priming and word-frequency have also been found to interact in lexical decision (Forster & Davis, 1984), with stronger priming for low-frequency words, which suggests that lexical-level representations are affected by both repetition and word-frequency. However, it is the case that stimulus quality and repetition priming also interact (Besner & Swan, 1982; den Heyer & Benson, 1988), with stronger priming for degraded words. If the effect of stimulus quality indeed does not extend beyond letter-level representations, and repetition priming’s influence begins at the lexical level, then it is unclear why stimulus quality and repetition priming interact.
In order to accommodate this complex constellation of findings, Blais et al. (2011) proposed a model in which: (1) stimulus quality affects feature-, letter-, and lexical-level processing, (2) word-frequency, rather than modulating activation levels of lexical representations, is instead reflected in the weights between lexical and semantic representations, and (3) processing is thresholded at the lexical, not letter, level. While the finer details of Blais et al.’s (2011) framework are beyond the scope of the present paper, it is able to explain the stimulus quality × repetition priming interaction, the additive effects of stimulus quality and word-frequency, and the joint effects of repetition priming and word-frequency. More pertinently, Blais et al.’s (2011) assumption that the influence of stimulus quality goes beyond letter-level representations appears to pose a challenge for our account. However, while thresholding at the lexical level predicts additive effects of stimulus quality and semantic richness (which is what we found), this requires lexical decision to be based on semantic activity. As already discussed in the Introduction, such a premise is difficult to reconcile with Pexman and Lupker’s (1999) finding that a facilitatory effect of ambiguity and an inhibitory effect of homophony can be concurrently observed in lexical decision. Furthermore, lexical-level thresholding is largely motivated by the empirical observation that degradation effects are smaller for repeated targets. However, the relevant studies in this domain (e.g., Besner & Swan, 1982; Blais et al., 2011; den Heyer & Benson, 1988) have exclusively used unmasked primes (i.e., primes which can be consciously processed); such primes can establish episodic memory traces which can be retrieved when the word is subsequently presented (Forster & Davis, 1984). 
As such, it is difficult to definitively rule out the possibility that the stimulus quality × repetition priming interaction partly reflects episodic memory traces, which are outside the scope of computational word recognition models (Blais et al., 2011). For example, in Forster and Davis’ (1984) lexical decision study, the effects of repetition priming and word-frequency were additive with masked primes and overadditive with unmasked primes (see also Versace & Nevers, 2003), suggesting that the interaction reflected the influence of the prime’s episodic trace. To shed more light on this issue, future research might explore whether the intriguing interaction between stimulus quality and repetition priming holds up when masked primes are used.
Implications for models of lexical processing
Interestingly, some recent work by Thomas et al. (2012) speaks to this issue by providing compelling evidence against the idea that the priming × stimulus quality interaction is mediated by semantic feedback. In their study, they examined how the interaction was moderated by the symmetry of the prime-target relationship. Forward asymmetric pairs (e.g., keg – BEER) have a strong prime-to-target association but no target-to-prime association, backward asymmetric pairs (e.g., small – SHRINK) have a strong target-to-prime association but no prime-to-target association, and symmetric prime-target pairs (e.g., cat – DOG) are strongly related in both directions. For our purposes, the key finding was that the priming × stimulus quality interaction was reliable only for pairs with a target-to-prime association (i.e., symmetric and backward asymmetric pairs), suggesting that the interaction was carried by a process that depended on a relationship from the target to the prime. This effectively ruled out spreading activation and its attendant semantic feedback as an explanation, which predicts an interaction for pairs with a prime-to-target association (i.e., symmetric and forward asymmetric pairs).
Instead, Thomas et al. (2012) proposed that the priming × stimulus quality interaction is mediated by a strategic process called backward semantic matching, whereby participants determine whether the target is semantically related to the prime after lexical access of the target has occurred (Neely, Keefe, & Ross, 1989). In line with this, Stolz and Neely (1995) also reported that the interactive effects of priming and stimulus quality became additive when relatedness proportion (i.e., the proportion of word targets preceded by a related prime) was decreased from .50 to .25, consistent with the idea that a low relatedness proportion (i.e., low payoff) gives the participants less incentive to engage the semantic matching mechanism. Importantly, the additive effects of stimulus quality and priming (under low relatedness proportion conditions) provide converging evidence that early processes in visual word recognition are not moderated by feedback created by spreading activation or by semantically rich representations.
Effects of stimulus quality and semantic richness: Going beyond the mean
This is the first study to explore an array of semantic richness effects at the level of RT distributional characteristics. Across imageability, number of features, semantic neighborhood density, and semantic diversity, our findings were relatively straightforward. Specifically, richness effects were mediated by distributional shifting (μ) and changes in the tail of the distribution (τ). More simply put, as RTs become longer, semantic richness effects become larger. This trend, which closely matches what Yap and Seow (2014) reported for emotional valence, is consistent with the idea that the stronger feedback afforded by semantic richness speeds up lexical decision by increasing stimulus familiarity and making word/nonword discrimination less attentionally demanding (see Andrews & Heathcote, 2001; Balota & Spieler, 1999). More relevantly for present purposes, the analyses revealed that for three of the targeted richness dimensions (imageability, number of features, semantic neighborhood density), the joint effects of stimulus quality and richness were additive at the level of the mean and at the level of underlying distributional characteristics.
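The relationship between mean RT and the distributional parameters can be made concrete with a small simulation. The sketch below is only an illustration: it assumes that μ and τ are ex-Gaussian parameters (as is standard in this literature), it uses simple method-of-moments estimation (which may well differ from the fitting procedure used for the reported analyses), and all parameter values and function names are hypothetical. Because the mean of an ex-Gaussian equals μ + τ, a richness effect on mean RT can be split into a shifting component (μ) and a tail component (τ).

```python
import random
import statistics


def simulate_ex_gaussian(mu, sigma, tau, n, seed=0):
    """Draw n RTs as the sum of a normal and an independent exponential variable."""
    rng = random.Random(seed)
    # expovariate takes a rate, so the exponential mean tau is passed as 1 / tau.
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def fit_ex_gaussian_moments(rts):
    """Method-of-moments estimates of (mu, sigma, tau).

    Uses the identities mean = mu + tau, variance = sigma^2 + tau^2,
    and third central moment = 2 * tau^3.
    """
    n = len(rts)
    mean = statistics.fmean(rts)
    var = statistics.pvariance(rts, mean)
    m3 = sum((x - mean) ** 3 for x in rts) / n
    # Guard against a negative sample skew, which has no valid tau.
    tau = (max(m3, 0.0) / 2.0) ** (1.0 / 3.0)
    sigma_sq = max(var - tau ** 2, 0.0)
    return mean - tau, sigma_sq ** 0.5, tau


# Hypothetical conditions: richer words have both a smaller mu and a smaller tau.
low_rich = simulate_ex_gaussian(mu=550, sigma=50, tau=150, n=20000, seed=1)
high_rich = simulate_ex_gaussian(mu=535, sigma=50, tau=130, n=20000, seed=2)

mu_lo, _, tau_lo = fit_ex_gaussian_moments(low_rich)
mu_hi, _, tau_hi = fit_ex_gaussian_moments(high_rich)
print("richness effect in mu: ", round(mu_lo - mu_hi, 1))
print("richness effect in tau:", round(tau_lo - tau_hi, 1))
```

In this toy example the two parameter effects sum to the effect on the mean, and the τ component is what makes the richness effect grow in the slow tail of the distribution.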
There was an interesting exception to the foregoing trends. Although the effects of stimulus quality and semantic diversity were additive for mean RTs, the distributional analyses (see Table 4 and Fig. 6) revealed an unexpected trade-off between an overadditive interaction in μ (smaller degradation effects for more ambiguous words) and a non-significant underadditive interaction in τ (larger degradation effects for more ambiguous words). This suggests that words with more senses are able to somehow compensate for the deleterious effect of visual degradation. It is unclear why this effect is seen only for semantic diversity (a measure of ambiguity), but not for imageability, number of features, or semantic neighborhood density. As discussed earlier, Yap et al. (2008) observed a similar trade-off in their data when they examined the joint effects of stimulus quality and word-frequency in the context of pseudohomophones. Can the present results be similarly explained by resorting to the idea that participants rely less on familiarity-based information and thresholded processing when they are presented with words that are high and low on semantic diversity (than when presented with words that vary on our other semantic richness dimensions)? Unfortunately, because the degree of word/nonword overlap was controlled for across the different semantic richness dimensions, this account seems implausible and we have to look elsewhere for an answer.
There is mounting evidence in the literature that semantic diversity/ambiguity effects diverge from other semantic effects in interesting ways (Pexman, 2012). That is, while the facilitatory influence of other richness dimensions tends to be quite consistent across lexical decision and other semantic tasks, the processing advantage for ambiguous words appears to be specific to lexical decision (Piercey & Joordens, 2000); an ambiguity disadvantage is often seen in semantic tasks such as semantic relatedness decision (Hoffman & Woollams, 2015; Piercey & Joordens, 2000) and semantic categorization (Hino et al., 2002). The dissociation between ambiguity and other dimensions is also reflected in neural consequences. For example, although more ambiguous words are associated with more cortical activation during semantic categorization (Hargreaves, Pexman, Pittman, & Goodyear, 2011), words with more distinct first associates are associated with less cortical activation (Pexman, Hargreaves, Edwards, Henry, & Goodyear, 2007). One major difference between ambiguity and the other three variables is that the former implicates multiple referents and meanings whereas the latter forms of richness are associated with a single referent and meaning (Pexman, 2012). Based on their modeling work, Hoffman and Woollams (2015) also demonstrated that the many-to-one semantic-to-orthography mappings for semantically diverse words yielded noisy, unstable, and underspecified semantic representations.
However, in spite of the foregoing dissociations between semantic diversity and other richness variables, it remains unclear why degradation effects are smaller for more ambiguous, compared to less ambiguous, words, particularly in the fastest RTs. Further speculation on these findings would likely exceed the explanatory power of the present dataset. Future work should be directed towards establishing the robustness of this interaction with a different set of items, and determining if similar distributional trade-offs are observed in tasks (e.g., semantic categorization) which place more emphasis on semantic-level activity. Given the complexity of semantic ambiguity effects, we agree with Pexman (2012) that much remains to be worked out in future research.
Limitations and future directions
For over two decades, an embellished version of the interactive activation framework (Balota, 1990; Balota et al., 1991) has provided a useful metaphor for explaining semantic richness effects in word recognition. The results of the present study suggest that one central aspect of this framework, the interactive activation between letter- and lexical-level representations, cannot be reconciled with how semantic richness effects unfold in visual word recognition. Instead, our results are more consistent with a flexible lexical processor which can strategically toggle between thresholded and cascaded early processing, depending on the specific demands of the task or the composition of the stimuli (see O’Malley & Besner, 2008, for a similar perspective). We have suggested that the additive effects of stimulus quality and richness in lexical decision might be a function of the task’s emphasis on familiarity. To test this, a future experiment could examine the joint effects of stimulus quality and semantic richness in other lexical processing tasks. For example, in the semantic categorization task (e.g., is a word abstract or concrete?), a binary decision is also required but familiarity is not helpful in driving decisions. The prediction is that stimulus quality and semantic richness should interact, with larger richness effects seen for degraded targets.
While the metaphorical framework presented in Fig. 7 provides a useful extension to Balota’s (1990) original account, it is almost certainly too simple to accommodate the complex joint effects of the various factors that have been shown to influence word recognition. We have assumed that semantic feedback to letter-level representations would yield smaller degradation effects for semantically rich words. However, in the absence of a specific implemented model, it is impossible to tell in advance what feedback would do. For example, Besner and colleagues (Besner, Wartak, & Robidoux, 2008; Borowsky & Besner, 2006) have shown that for the parallel distributed processing model described by Plaut and Booth (2000), one can produce underadditivity, additivity, or overadditivity between stimulus quality and a second factor, depending on the portion of the input–output sigmoidal activation function being examined. We look forward to further explorations of these effects within implemented models. Finally, Yap et al. (2009) found that the joint effects of priming and frequency critically depended on the vocabulary knowledge of the participants. Specifically, participants with more vocabulary knowledge produced additive effects while participants with less vocabulary knowledge produced interactive effects. This pattern is difficult to reconcile with any theoretical position that rigidly adheres to serially arranged discrete stages or interactive cascaded processing. Instead, it is more likely that the variable nature of activation dynamics (i.e., cascaded vs. thresholded) applies not only to early processes but also to processes later in the time-course.
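The point about the sigmoidal activation function can be illustrated with a toy calculation. The sketch below is not the Plaut and Booth (2000) model or any implemented simulation: it assumes a plain logistic function, treats each factor as a fixed hypothetical increment to the input, and computes the difference of differences in output activation for a 2 × 2 design. Whether a given sign corresponds to over- or underadditivity in RTs depends on how activation is mapped onto latency, which this sketch leaves open.

```python
import math


def logistic(x):
    """Standard sigmoidal input-output function."""
    return 1.0 / (1.0 + math.exp(-x))


def interaction_contrast(baseline, effect_a, effect_b):
    """Difference of differences in output activation for a 2 x 2 design.

    Each factor adds a fixed amount to the input; the contrast asks whether
    the effect of factor A on the output changes across levels of factor B.
    """
    y00 = logistic(baseline)
    y10 = logistic(baseline + effect_a)
    y01 = logistic(baseline + effect_b)
    y11 = logistic(baseline + effect_a + effect_b)
    return (y11 - y01) - (y10 - y00)


# The same two input effects examined on three portions of the function.
for label, baseline in [("lower (accelerating) portion", -4.0),
                        ("middle (symmetric) portion", -1.0),
                        ("upper (decelerating) portion", 2.0)]:
    print(label, round(interaction_contrast(baseline, 1.0, 1.0), 4))
```

With identical input effects, the contrast is positive on the accelerating portion, essentially zero when the four inputs straddle the midpoint symmetrically, and negative on the decelerating portion, which is why the direction of an interaction cannot be read off without an implemented model.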
The Stimulus Quality × Number of Features interaction was marginally significant for accuracy rates. To follow up on this, we computed the magnitude of the Stimulus Quality × Number of Features interaction in accuracy rates and RTs for each participant. The correlation between RT and accuracy interactions was not significant, r = −.02, p = .89, suggesting that the additive pattern in RTs is not artifactually driven by a speed-accuracy tradeoff.
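The follow-up analysis described in this note can be sketched as follows. This is a hypothetical reconstruction rather than the analysis script itself: the cell labels, the direction of the contrast, and the function names are assumptions, and the example values are made up purely to show the arithmetic.

```python
import math


def participant_interaction(cells):
    """Interaction magnitude for one participant from four cell means.

    `cells` maps (quality, richness) to a mean RT or accuracy rate. The
    contrast is the degradation effect for low-richness words minus the
    degradation effect for high-richness words.
    """
    low = cells[("degraded", "low")] - cells[("clear", "low")]
    high = cells[("degraded", "high")] - cells[("clear", "high")]
    return low - high


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up RT cell means (ms) for a single participant.
example_rt = {("clear", "high"): 600, ("clear", "low"): 630,
              ("degraded", "high"): 680, ("degraded", "low"): 710}
print(participant_interaction(example_rt))
```

Applying `participant_interaction` to each participant's RT and accuracy cell means yields two lists of contrasts, and `pearson_r` on those lists gives the correlation of interest; a correlation near zero is what would be expected if the RT pattern does not trade off against accuracy.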
This work was supported in part by a Natural Sciences and Engineering Research Council (NSERC) of Canada Discovery Grant to P.M.P. Portions of this research were carried out as an undergraduate honors thesis by G.Y.L. under the direction of M.J.Y. We thank Ellen Lloyd for her help with data collection, and Derek Besner and Pablo Gomez for helpful comments on an earlier version of this article.
- Andrews, S., & Heathcote, A. (2001). Distinguishing common and task-specific processes in word identification: A matter of some moment? Journal of Experimental Psychology: Human Perception and Performance, 27, 514–544.
- Balota, D. A. (1990). The role of meaning in word recognition. In D. A. Balota, G. B. Flores d’Arcais, & K. Rayner (Eds.), Comprehension processes in reading (pp. 9–32). Hillsdale: Lawrence Erlbaum Associates.
- Balota, D. A., & Yap, M. J. (2006). Attentional control and flexible lexical processing: Explorations of the magic moment of word recognition. In S. Andrews (Ed.), From inkmarks to ideas: Current issues in lexical processing (pp. 229–258). New York: Psychology Press.
- Balota, D. A., Ferraro, F. R., & Connor, L. T. (1991). On the early influence of meaning in word recognition: A review of the literature. In P. J. Schwanenflugel (Ed.), The psychology of word meanings (pp. 187–218). Hillsdale: Erlbaum.
- Balota, D. A., Paul, S., & Spieler, D. H. (1999). Attentional control of lexical processing pathways during word recognition and reading. In S. Garrod & M. Pickering (Eds.), Language processing (pp. 15–57). East Sussex: Psychology Press.
- Becker, C. A., & Killion, T. H. (1977). Interaction of visual and cognitive effects in word recognition. Journal of Experimental Psychology: Human Perception and Performance, 3, 389–401.
- Coltheart, M., Davelaar, E., Jonasson, J., & Besner, D. (1977). Access to the internal lexicon. In S. Dornic (Ed.), Attention and performance VI (pp. 535–555). Hillsdale: Erlbaum.
- Forster, K. I., & Davis, C. (1984). Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 680–698.
- Hino, Y., & Lupker, S. J. (1996). Effects of polysemy in lexical decision and naming: An alternative to lexical access accounts. Journal of Experimental Psychology: Human Perception and Performance, 22, 1331–1356.
- Hino, Y., Lupker, S. J., & Pexman, P. M. (2002). Ambiguity and synonymy effects in lexical decision, naming, and semantic categorization tasks: Interactions between orthography, phonology, and semantics. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 686–713.
- Meyer, D. E., Schvaneveldt, R. W., & Ruddy, M. G. (1975). Loci of contextual effects on visual word-recognition. In P. M. A. Rabbitt (Ed.), Attention and performance V (pp. 98–118). London: Academic Press.
- Morey, R. D., Rouder, J. N., & Jamil, T. (2015). BayesFactor: Computation of Bayes factors for common designs. R package 0.9.11–1, http://cran.r-project.org/web/packages/BayesFactor/index.html
- Nelson, D. L., McEvoy, C. L., & Schreiber, T. A. (1998). The University of South Florida word association, rhyme, and word fragment norms. http://web.usf.edu/FreeAssociation/
- O’Malley, S., Reynolds, M. G., & Besner, D. (2007). Qualitative differences between the joint effects of stimulus quality and word frequency in reading aloud and lexical decision: Extensions to Yap and Balota (2007). Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 451–458.
- Pexman, P. M. (2012). Meaning-based influences on visual word recognition. In J. S. Adelman (Ed.), Visual word recognition (Meaning and context, individuals, and development, Vol. 2, pp. 24–43). Hove: Psychology Press.
- Pexman, P. M., Siakaluk, P. D., & Yap, M. J. (Eds.) (2014). Meaning in mind: Semantic richness effects in language processing. Frontiers Media SA.
- Reynolds, M., & Besner, D. (2004). Neighborhood density, word frequency, and spelling-sound regularity effects in naming: Similarities and differences between skilled readers and the dual route cascaded computational model. Canadian Journal of Experimental Psychology, 58, 13–31.
- Schneider, W., Eschman, A., & Zuccolotto, A. (2001). E-Prime user’s guide. Pittsburgh: Psychology Software Tools.
- Stolz, J. A., & Besner, D. (1996). Role of set in visual word recognition: Activation and activation blocking as nonautomatic processes. Journal of Experimental Psychology: Human Perception and Performance, 22, 1166–1177.
- Stolz, J. A., & Neely, J. H. (1995). When target degradation does and does not enhance semantic context effects in word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 596–611.
- Yap, M. J., Balota, D. A., Tse, C.-S., & Besner, D. (2008). On the additive effects of stimulus quality and word frequency in lexical decision: Evidence for opposing interactive influences revealed by RT distributional analyses. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 495–513.