Introduction

Referent identification during sentence comprehension unfolds over time and is driven by multiple sources of linguistic and nonlinguistic information (see Barr & Keysar, 2006; Tanenhaus & Trueswell, 2006; and references therein). A focus on the dynamics of processing leads to a natural division of these information sources: reference in the moment is constrained by both local (lexical) and global (contextual) factors. Although it is obvious that the individual words being heard play a central role in what is considered as a referent, use of this information must be regulated when it clashes with prior context. Yet theories of sentence comprehension have disagreed on whether contextual information is consulted early or late during processing (see Dahan & Tanenhaus, 2004 for a review of competing accounts). In addition, in spite of growing evidence for the involvement of cognitive control in various aspects of sentence and discourse processing (Novick, Kan, Trueswell, & Thompson-Schill, 2009; Nozari, Arnold, & Thompson-Schill, 2014; Nozari, Mirman, & Thompson-Schill, 2016; Nozari & Thompson-Schill, 2015), the link between control processes and inhibition of context-incompatible information has received little attention (but see Brown-Schmidt, 2009; Nilsen & Graham, 2009). Critically, it is unclear whether inhibitory resources regulate the constraining effect of context, and whether such resources are shared between linguistic and non-linguistic domains or are domain-specific. This study answers these questions.

Sensitivity to context in sentence processing

Numerous studies have reported early context effects in lexical and sentential processing (e.g., Altmann & Kamide, 1999; Barr, 2008; Chambers & San Juan, 2008; Dahan & Tanenhaus, 2004; Magnuson, Tanenhaus, & Aslin, 2008; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). For example, the classic “cohort competitor” effect in the visual world paradigm (i.e., looks toward a buckle when hearing “bucket”) can be eliminated in a constraining context (“Empty the…”) as compared to an unconstraining one (“Click on the…”) (Barr, 2008, Experiment 2). There are, however, studies that show evidence of processing of related competitors even when rendered implausible by the context (e.g., Kukona, Fang, Aicher, Chen, & Magnuson, 2011; Swinney, 1979; Tabor, Galantucci, & Richardson, 2004; Tanenhaus, Leiman, & Seidenberg, 1979). Most recently, Kukona, Cho, Magnuson and Tabor (2014) showed that, upon hearing a sentence such as “The boy will eat the brown cake”, participants also considered a brown car, even though it was incompatible with the verb “eat” (see also Kukona & Tabor, 2011). While this study provides strong evidence for local, context-insensitive processing of information, this effect might be driven in part by salient bottom–up information in the scene: it is possible that the color brown activates the lexical semantic category for “brown” even before the word is heard, and, when the time comes for choosing, there are two objects with the same salient feature competing for capturing visual attention. If attending to the brown car is truly due to bottom–up capture of attention by color (see Simons, 2000 for a review of the conditions where color is a pop-out feature), then there is no need for a self-organizing account.

The current design builds on Kukona et al. (2014), with one important difference: consideration of semantic competitors could not be explained by prior bottom–up activation of visual features. In the experimental trials, participants heard sentences like “She will eat the red pear” and looked at a scene with four black-and-white line drawings: a pear (target), a banana (a verb competitor), a heart (an adjective competitor = local attractor), and a fourth unrelated object (Fig. 1; Table 1). In control trials, the adjective competitor was replaced with a picture incompatible with the color adjective (igloo). The difference between fixation proportions to the adjective competitor (heart) and the control picture (igloo) indexed “local attraction”, i.e., the context-insensitive influence of the local word.

Fig. 1

An example of the visual display on a trial of the eye-tracking task. Note that all pictures are presented as seen here, in black and white, providing no direct bottom–up cues for the adjective (i.e., no red color in the display)

Table 1 An example of a set with its four trial types; each subject only gets one of the four

This factor was crossed with a contextual manipulation. Constraining trials had a constraining verb like “eat”, while non-constraining trials had verbs like “see”. This second manipulation followed Dahan and Tanenhaus (2004), and served to gauge the sensitivity of local attraction to the presence of constraining context. Similar to the current study, Dahan and Tanenhaus examined the effect of semantically constraining context, but in their study they examined consideration of phonological cohorts. Looking to the phonological competitor was only found with cross-spliced stimuli that favored the acoustic form of the competitor, and this effect was not sensitive to contextual constraint in an early time window. The current study investigates a similar issue but with adjectives that supported reference to a local competitor.

Cognitive control and sentence comprehension

While there is solid evidence for the involvement of control processes in comprehension (e.g., Nozari et al., 2016; Sommers & Danielson, 1999; see Novick, Hussey, Teubner-Rhodes, Harbison, & Bunting, 2014; Novick, Kan, Trueswell, & Thompson-Schill, 2009 for reviews), relatively little work has explored the role and nature of such processes in limiting the consideration of context-incompatible information. An exception is a recent study by Brown-Schmidt (2009; see also Nilsen & Graham, 2009 for a similar concept in children). Brown-Schmidt (2009) had participants play a game with the experimenter, in which they had to jointly determine whether certain criteria were met for the arrangement of objects on a visual array. The experimenter read sentences aloud to the participant, and the participant’s fixations were tracked. The goal was to test whether, between two objects of the same kind, the participant considered only the one that the experimenter could not see (and would therefore ask about) or also the one that the experimenter could see. The results showed that the participant’s fixations on the pragmatically infelicitous object (in this case, the object in the common ground) could be predicted from their incongruency scores on a linguistic Stroop task, but not on a non-linguistic no-go task in which participants withheld a button-press response if a certain object appeared on the screen. These results might be interpreted as compatible with a domain-specific control process: scores on a linguistic task were predictive of perspective taking, but scores on a non-linguistic task were not. If so, the results can be taken to support theories that posit specialized resources for specific domains (e.g., Fedorenko, Behr, & Kanwisher, 2011), and even sub-processes within a domain (e.g., Caplan & Waters, 1999; Waters & Caplan, 1996), against those that postulate domain-general resources shared by multiple systems (e.g., Novick et al., 2009; Thompson-Schill, Bedny, & Goldberg, 2005).

However, the tasks used in Brown-Schmidt (2009) differed in more than material type: no-go tasks canonically tap into response selection and execution while Stroop also imposes strong conflict at the level of stimulus processing (Ni et al., 2000). Thus, another interpretation of Brown-Schmidt’s results is that the ability that determined suppression of irrelevant information in perspective taking was suppression at the level of stimulus processing (i.e., excluding an object from possibly being a referent) and not the response (i.e., suppressing an eye movement towards a given object that might still be a cognitive candidate).

Thus, while these results suggest links between cognitive control and inhibition of competitors that were incompatible with at least one type of context (common ground), Brown-Schmidt’s (2009) study leaves two questions open: (1) Are similar control processes involved in inhibiting other context-incongruent competitors? Specifically, is cognitive control involved in resolving the competition between global and local information during referent selection? (2) Are these processes domain-specific? These questions are addressed in the current study. To answer the first, we examined if suppression of semantic competitors that clash with the sentence’s verb can be predicted from individuals’ performance on cognitive control tasks. To answer the second, we used a variant of the Flanker task, with an embedded no-go task. The Flanker task requires suppression of irrelevant visual stimuli (the flanking objects) in order to determine the direction of the central object. We used cartoon fish as stimuli, facing left or right, and response buttons whose positions corresponded to the direction of interest (left button to indicate a central fish facing left, right button to indicate a central fish facing right). The non-linguistic nature of the stimuli and the spatial congruence of the response buttons and the target direction minimize reliance on the language system. Thus, the effect size, calculated as RTs in the incongruent trials (central and flanking fish facing opposite directions) minus RTs in the congruent trials (all fish facing the same direction), provides an index of inhibitory control in the non-linguistic domain. We can then test if such an index predicts the magnitude of inhibitory control required to suppress the adjective competitor during sentence comprehension. A positive and reliable correlation speaks to an inhibitory control process that is shared between the linguistic and non-linguistic domains. Absence of such a correlation is consistent with specialized inhibitory control processes in each domain.

The task also involves a no-go component. Flanker trials prominently involve stimulus conflict, some response conflict, and little to no response execution difficulty, as indexed by low error rates (Ni et al., 2000), distinguishing them from no-go trials which prominently index response conflict and response inhibition. Importantly, the same non-linguistic materials were used for both trial types (Flanker and no-go). Finding an effect similar to that reported by Brown-Schmidt would support the involvement of domain-general control processes that mediate conflict resolution at the level of stimulus processing.

Methods

Participants

Thirty-two undergraduates of the University of Pennsylvania (17 females, mean age = 21.03 ± 1.87 years), all right handed and native English speakers, participated in the study in exchange for payment.

Materials

The eye-tracking task

A complete list of stimuli, along with the rationale and criteria for the selection of experimental materials, is presented in the Appendix. Twenty sets were created, each containing four trial types (see Table 1). A 2 × 2 design manipulated context (constraining vs. non-constraining verb) and local attraction (local attractor present = experimental vs. absent = control). Constraining (e.g., “eat”) and non-constraining (e.g., “see”) verbs were comparable in frequency (SUBTLEX; Brysbaert & New, 2009), and their inclusion was determined by norming on Amazon’s Mechanical Turk (e.g., Buhrmester, Kwang, & Gosling, 2011). In the context of our paradigm, we use the term “competitor” to refer to pictures that compete for selection as a referent, based on the information available to the participant at each point in time. Local attraction was manipulated by including a picture (adjective competitor) that was compatible with the adjective but not with the constraining verb (“heart” in Table 1). Choice of the adjective competitor was also determined by norming on Mechanical Turk, such that the adjective competitor would be at least as compatible with the adjective as the target. Seven out of the 20 adjectives were color adjectives (see the Appendix for a complete list). The control picture (“igloo” in Table 1) was selected by re-shuffling the adjective competitor pictures, such that, for a given trial, it was incompatible with both the adjective and the verb. Thus, adjective competitors acted as their own controls across different trials. In addition, 20 fillers were created with adjectives that, unlike in the experimental and control trials, provided no useful information in localizing the target (e.g., “good” compatible with all four pictures), and verbs that varied in how constraining they were. The use of adjectives in sentences that did not necessarily require an adjective may seem like an unnatural feature of the task, but recent work has shown that overspecification, especially with color adjectives, is not unusual in speakers (Tarenskeen, Broersma, & Geurts, 2015).

Pictures were 300 × 300-pixel black and white line-drawings taken from either the IPNP corpus (Szekely et al., 2004), or Google images. Sentences, which had the fixed format “She will [verb] the [adjective] [noun].”, were recorded by a native English speaker at 44.1 kHz. A mixed design was employed, such that each subject only encountered one trial type from each set (for a total of 20 trials, 5 of each type + 20 fillers + 4 practice trials in the beginning). There was, therefore, no repetition of auditory or visual stimuli in individual participants.
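The assignment of trial types to subjects can be illustrated with a short sketch. The text states only that each subject saw one of the four trial types from each of the 20 sets, five of each type; the Latin-square rotation below, the list index, and all names in the code are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter

N_SETS = 20
# Order follows the numbering used in the table captions:
# types 1-2 have a constraining verb, types 3-4 a non-constraining verb.
TRIAL_TYPES = [
    ("constraining", "experimental"),
    ("constraining", "control"),
    ("non-constraining", "experimental"),
    ("non-constraining", "control"),
]

def build_list(list_index):
    """Return one (set_id, context, attraction) triple per stimulus set.

    Hypothetical rotation scheme: set s on list k receives trial type (s + k) mod 4.
    """
    trials = []
    for set_id in range(N_SETS):
        context, attraction = TRIAL_TYPES[(set_id + list_index) % len(TRIAL_TYPES)]
        trials.append((set_id, context, attraction))
    return trials

# Four presentation lists; a subject would be assigned to exactly one of them.
lists = [build_list(k) for k in range(len(TRIAL_TYPES))]
counts = Counter((context, attraction) for _, context, attraction in lists[0])
print(counts)
```

Under this rotation, every list contains each set exactly once and five trials of each type, and across the four lists every set appears in all four trial types.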

The Fish-Flanker task

The Fish-Flanker task was a variation of the classic Flanker task (Eriksen & Eriksen, 1974), with five cartoon fish, and three trial types (Fig. 2). Participants indicated the direction of the central fish by pressing a button on the same side of the keyboard (left or right) as the direction that the fish was pointing. In the congruent trials, the central fish and the flanking fish all faced the same direction, while they faced opposite directions in the incongruent trials (100 trials; 50 facing left, 50 facing right for each). In the no-go trials, the flanking fish were dotted, cueing the subject not to respond (100 trials; 25 of each of the four direction combinations). There were a total of 300 trials, with 12 initial practice trials.

Fig. 2

The three trial types in the Fish-Flanker task. A manual response was required for the Incongruent and Congruent trials where all fish were striped, but participants were instructed to refrain from pressing any buttons if the flanking fish were dotted

Apparatus

Participants were seated in a dimly-lit room, approximately 25 inches (c. 63.5 cm) from a 17-inch (c. 43.2-cm) monitor with the resolution set to 1024 × 768 pixels. Stimuli were presented using E-Prime Professional, v.2.0 software (Psychology Software Tools, www.pstnet.com). An Eyelink 1000 eye-tracker with chin-rest recorded participants’ monocular gaze position at 500 Hz. Fish-Flanker responses were registered by E-Prime via adjacent keys on a keyboard.

Procedure

Participants completed the eye-tracking task followed by the Fish-Flanker task in one session. For the eye-tracking task, they were instructed to “listen and look at the pictures” (no response was required). They completed four practice trials, followed by 40 trials (20 critical, 20 fillers intermixed). Each trial began with a 1375-ms preview. In the first 1000 ms, the four line-drawings were presented in the four corners, while, in the last 375 ms, a shrinking red dot appeared at the center to draw the gaze back to the central location. After the preview, the sentence was presented through speakers at a comfortable listening volume. The position of the four pictures was randomized on every trial.

After a 5-min break, participants completed the Fish-Flanker task (12 practice trials with feedback, followed by 100 congruent, 100 incongruent and 100 no-go trials intermixed). In each trial, a fixation cross was presented at the center of the screen. The duration of presentation of the cross (ITI) was sampled from a uniform distribution ranging between 500 and 1500 ms. Next, the five fish were presented at the center of the screen for 1000 ms or until a response was made. Participants responded with either the index or the middle finger of their (dominant) right hand or made no response if the trial was a no-go trial. When a response was required, the position of the response button was congruent with the direction of the central fish (e.g., left button for the fish facing left). The spatial congruency was chosen to minimize the need for verbal strategies during response selection.
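The trial structure described above can be sketched as follows. This is a minimal illustration rather than the E-Prime script used in the study; the function name, the dictionary keys, and the fixed random seed are assumptions.

```python
import random

def build_flanker_trials(seed=0):
    """Build 300 intermixed Fish-Flanker trials with a jittered ITI (hypothetical layout)."""
    rng = random.Random(seed)  # fixed seed only so that the sketch is reproducible
    trials = []
    for central in ("left", "right"):
        opposite = "right" if central == "left" else "left"
        for _ in range(50):
            # Go trials: 50 per central-fish direction in each congruency condition.
            trials.append({"type": "congruent", "central": central, "flankers": central})
            trials.append({"type": "incongruent", "central": central, "flankers": opposite})
        for flankers in ("left", "right"):
            for _ in range(25):
                # No-go trials (dotted flankers): 25 per direction combination.
                trials.append({"type": "no-go", "central": central, "flankers": flankers})
    rng.shuffle(trials)
    for trial in trials:
        # Fixation-cross duration sampled uniformly between 500 and 1500 ms.
        trial["iti_ms"] = rng.uniform(500, 1500)
        # Response side matches the central fish; no response on no-go trials.
        trial["correct_response"] = None if trial["type"] == "no-go" else trial["central"]
    return trials

trials = build_flanker_trials()
print(len(trials), trials[0])
```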

Results

The eye-tracking task

Data were analyzed using Growth Curve Analysis (GCA; Mirman, Dixon, & Magnuson, 2008), a variant of multilevel modeling developed specifically to analyze time course data, in R 3.0.3 (http://www.R-project.org). For all analyses, the pattern of fixations was analyzed using cubic orthogonal polynomial models, with random intercept and slopes for subjects. The critical effect is reflected in the interaction between condition and the model’s polynomial terms. To keep the results interpretable, only the intercept, linear and quadratic terms are included in this interaction. In the discussion of the results, we focus primarily on the intercept, which reflects the average height of the curve, and can be used directly to compare the proportion of fixations in one condition versus another. For critical (i.e., adjective competitor) effects, we report the full model in tables. For all the analyses that follow, unless stated otherwise, we picked a pre-defined analysis window starting 200 ms after the onset of the adjective, to allow for planning and execution of an eye movement, and ending at the average noun offset. The average durations of adjectives and nouns were 480 and 707 ms, respectively, making the analysis window 987 ms long (480 + 707 minus the initial 200 ms).
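As an illustration of the two ingredients of this analysis, the sketch below derives the analysis window from the reported durations and builds orthogonal polynomial time terms by Gram-Schmidt orthogonalization. The analyses themselves were run in R; this Python version, the 50-ms bin width, and the function name are assumptions made for illustration only.

```python
ADJ_MS, NOUN_MS, LAG_MS = 480, 707, 200   # durations reported in the text

window_start = LAG_MS                     # 200 ms after adjective onset
window_end = ADJ_MS + NOUN_MS             # average noun offset
window_len = window_end - window_start    # length of the analysis window

def orthogonal_poly(times, degree):
    """Return orthonormal polynomial time terms of order 1..degree.

    Powers of centered time are orthogonalized against the constant term and
    against all lower-order terms, so the terms are mutually uncorrelated.
    """
    n = len(times)
    mean_t = sum(times) / n
    centered = [t - mean_t for t in times]
    columns = [[1.0] * n]                 # constant term, used only for orthogonalizing
    for d in range(1, degree + 1):
        col = [c ** d for c in centered]
        for prev in columns:
            coef = sum(a * b for a, b in zip(col, prev)) / sum(b * b for b in prev)
            col = [a - coef * b for a, b in zip(col, prev)]
        norm = sum(a * a for a in col) ** 0.5
        columns.append([a / norm for a in col])
    return columns[1:]

BIN_MS = 50                               # hypothetical bin width, not stated in the text
times = list(range(window_start, window_end, BIN_MS))
poly = orthogonal_poly(times, 3)          # linear, quadratic, cubic terms
print(window_len, len(times), len(poly))
```

Because the terms are orthogonal to the constant, the model intercept corresponds to the average height of the curve over the window, which is why it can be compared directly across conditions.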

Figures 3 and 4 show fixation proportions (±SE) to the target, verb competitor and adjective competitor when the verb was non-constraining and constraining, respectively. Local attraction was measured by comparing looks to the adjective competitor in the experimental and control conditions. When the verb was non-constraining, there were significantly more looks to the adjective competitor (heart) than to the control picture (igloo; t = 5.05, p < 0.001). Critically, when the verb was constraining, there were also reliably more looks to the adjective competitor than to the control picture (t = 2.21, p = 0.034; see Table 2 for full results). An interaction analysis revealed that the magnitude of local attraction was reliably smaller when the verb was constraining (t = –2.096, p = 0.038; see Table 3 for full results). Complementary analyses of looks to the target revealed fewer looks to the target in the presence of the adjective competitor when the verb was non-constraining (t = –2.66, p = 0.001), no interaction between context and the presence or absence of the adjective competitor (t = 0.080, p = 0.42), and a significant effect of the adjective competitor on looks to the target on the quadratic term (t = 2.12, p = 0.04) when the verb was constraining.

Fig. 3

Looks to the target (T, left panel), the adjective competitor (AC, right panel), and the verb competitor (middle panel) in the experimental (AC = heart) and control (AC = igloo) conditions, when the verb is non-constraining (e.g., see)

Fig. 4

Looks to the target (T, left panel), the adjective competitor (AC, right panel), and the verb competitor (middle panel) in the experimental (AC = heart) and control (AC = igloo) conditions, when the verb is constraining (e.g., eat)

Table 2 Results of comparing fixations on the adjective competitor in the experimental and control conditions when the verb is constraining (trial types 1 and 2)
Table 3 Results of comparing the magnitude of local attraction (looks to the adjective competitor in the experimental minus control condition) when the verb is constraining and when it is not (trial type 1−2 vs. trial type 3−4)

In summary, the results showed a reliable effect of local attraction in the presence of constraining context, with a timeline similar to that reported by Kukona et al. (2014, Fig. 6). Local attraction started late, shortly before the noun onset, continued throughout the noun zone, and was extinguished at the noun offset. In addition, comparison of the magnitude of local attraction when the verb was and was not constraining revealed a reduction in the size of the effect in the constraining context. These findings show that local attraction is reliable and its magnitude is modulated by contextual constraints. Next, we asked whether this modulation could be predicted from a domain-general inhibitory process.

Fish-Flanker task

Mean error rate was 8.31 (SE = 1.58). The majority of errors were commission errors in the no-go condition: 6.34 (SE = 1.12). Error rates were slightly higher in the congruent (1.13, SE = 0.23) than in the incongruent (0.84, SE = 0.15) condition, but this difference was not significant (t(31) = 1.06, p = 0.30). Mean RT for correct incongruent trials (519 ms, SE = 11) was, however, significantly longer than that of congruent trials (494 ms, SE = 10) when the distributions of log-transformed RTs were compared (t(31) = 8.89, p < 0.001), replicating the classic congruency effect in the Flanker task. The Flanker effect size was calculated for each subject as RT(incongruent) minus RT(congruent). The average effect size was 25 ms (SE = 3).
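The per-subject Flanker effect size can be computed as in the following sketch. The trial representation (dictionary keys) and the demonstration values are made up for illustration; restricting the computation to correct go trials is an assumption based on the RT description above.

```python
from statistics import mean

def flanker_effect(trials):
    """Mean RT of correct incongruent trials minus mean RT of correct congruent trials (ms)."""
    rts = {"congruent": [], "incongruent": []}
    for trial in trials:
        # No-go trials and error trials do not contribute to the RT-based effect.
        if trial["correct"] and trial["condition"] in rts:
            rts[trial["condition"]].append(trial["rt_ms"])
    return mean(rts["incongruent"]) - mean(rts["congruent"])

# Made-up trials for one hypothetical subject.
demo_trials = [
    {"condition": "congruent", "rt_ms": 480, "correct": True},
    {"condition": "congruent", "rt_ms": 500, "correct": True},
    {"condition": "congruent", "rt_ms": 700, "correct": False},   # excluded: error
    {"condition": "incongruent", "rt_ms": 510, "correct": True},
    {"condition": "incongruent", "rt_ms": 530, "correct": True},
]
effect = flanker_effect(demo_trials)
print(effect)  # 520 - 490 = 30
```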

Analysis of individual differences

Our results, in keeping with those of past studies (Dahan & Tanenhaus, 2004; Kukona et al., 2014), showed late looks to the adjective competitor when the verb was constraining (~400 ms after adjective onset). This analysis investigated whether looks to the adjective competitor in this late time window were predicted by the strength of domain-general inhibitory processes. For each individual, the magnitude of local attraction was calculated as average fixation proportions on the adjective competitor in the experimental minus control conditions when the verb was constraining. Participants varied considerably in their fixation proportions to the adjective competitor in the experimental and control conditions, with an average effect size of 0.04 (SD = 0.1). This effect size indexes how well a participant inhibited looks to the referent activated by the sentence’s adjective, with smaller values indicating stronger inhibition.
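A minimal sketch of the local-attraction measure for one participant is given below. It assumes gaze samples time-locked to adjective onset and labeled with the fixated picture; the label "critical_picture" (the adjective competitor in experimental trials, the control picture in control trials), the data layout, and the demonstration values are all hypothetical.

```python
from statistics import mean

def fixation_proportion(samples, region, start_ms, end_ms):
    """Proportion of gaze samples inside [start_ms, end_ms) that fall on `region`."""
    in_window = [s for s in samples if start_ms <= s["t_ms"] < end_ms]
    if not in_window:
        return 0.0
    return sum(s["region"] == region for s in in_window) / len(in_window)

def local_attraction(experimental_trials, control_trials, start_ms=200, end_ms=1187):
    """Mean looks to the critical picture: experimental minus control trials."""
    exp = mean([fixation_proportion(t, "critical_picture", start_ms, end_ms)
                for t in experimental_trials])
    ctl = mean([fixation_proportion(t, "critical_picture", start_ms, end_ms)
                for t in control_trials])
    return exp - ctl

# Made-up samples: one experimental and one control trial, four samples each.
exp_trial = [{"t_ms": t, "region": r} for t, r in
             [(300, "critical_picture"), (500, "critical_picture"),
              (700, "target"), (900, "target")]]
ctl_trial = [{"t_ms": t, "region": r} for t, r in
             [(300, "critical_picture"), (500, "target"),
              (700, "target"), (900, "target")]]
attraction = local_attraction([exp_trial], [ctl_trial])
print(attraction)  # 0.5 - 0.25 = 0.25
```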

Flanker effect size was not reliably correlated with baseline performance on the Flanker task (i.e., RT in the congruent condition; Spearman’s rho = 0.19, p = 0.30). This shows that the Flanker effect size was not a reflection of the basic abilities for carrying out cognitive tasks that both Flanker and the linguistic task require (e.g., visual perception, speed of processing). This effect size can therefore be taken as a measure of the ability to inhibit irrelevant information in the visual domain. We then asked if the effect sizes in the eye-tracking and Flanker tasks were correlated. The upper and lower panels in Fig. 5 show the correlations of the Flanker effect size and the number of no-go errors, respectively, with the magnitude of local attraction. The two predictors were themselves not correlated (r = –0.08, p = 0.66), so they were both entered as regressors in a GLM with the magnitude of local attraction as the dependent variable. Only the Flanker effect size was reliably predictive of local attraction (t = 2.17, p = 0.038; no-go effect: t = 1.23, p = 0.22; model’s R² = 0.17).
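The structure of this two-predictor model can be sketched as an ordinary least-squares fit. Treating the GLM as OLS with an intercept is an assumption, as are the function name and the demonstration data, which are constructed so that only the first predictor matters; they are not the study's data.

```python
def ols(y, predictors):
    """Least-squares fit of y on an intercept plus the given predictors.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    Solves the normal equations by Gaussian elimination with partial pivoting.
    """
    n = len(y)
    X = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    c = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for j in range(col, k):
                A[r][j] -= factor * A[col][j]
            c[r] -= factor * c[col]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    fitted = [sum(bi * xi for bi, xi in zip(b, row)) for row in X]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return b, 1 - ss_res / ss_tot

# Made-up data for six hypothetical subjects.
flanker_ms = [10, 20, 30, 40, 25, 15]          # Flanker effect sizes
nogo_errors = [3, 1, 4, 1, 5, 9]               # no-go commission errors
attraction = [0.01 + 0.002 * f for f in flanker_ms]
coefs, r2 = ols(attraction, [flanker_ms, nogo_errors])
print(coefs, r2)
```

In the demonstration data the outcome depends only on the Flanker predictor, so the fitted no-go coefficient is (numerically) zero; with real data, the t and p values reported above would additionally require standard errors for each coefficient.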

Fig. 5

Correlations between the size of local attraction in the eye-tracking task (looks to the adjective competitor in the experimental minus control conditions when the verb was constraining) and two measures in the Fish-Flanker task. The upper panel shows the correlation with the size of the Flanker effect (difference in RTs between the incongruent and congruent conditions for the go trials). The lower panel shows the correlation with the number of commission errors on the no-go trials

General Discussion

This study tested the influence of earlier semantic context on inhibiting the consideration of incompatible semantic competitors of later words in the sentence, and found a reliable local attraction. This finding extended Kukona et al.’s (2014) claims of local attraction by showing that the effect could be observed in the absence of direct mapping of words to referents in the visual scene. Previously, Dahan and Tanenhaus (2004) had reported looks to phonological competitors if the cross-spliced phonetic information temporarily biased the listener towards the competitors. They, however, reported no interaction with the context, while our results showed sensitivity to context. One possible explanation is the different nature of the biasing information (phonetic vs. semantic). However, a more likely explanation is the difference in the windows of analysis. Those authors used a narrow window of 350–500 ms from the onset of the critical word, because the main question of that study was whether context can impose an early effect on selection, while we were interested in consideration of competitors at any point. Indeed, the visual inspection of their data suggests an at least numerically smaller local attraction in the presence of the constraining verb when a larger window is considered (Dahan & Tanenhaus, 2004, Fig. 3), very similar to the current findings.

Regardless of the differences mentioned above, both the current study and that of Dahan and Tanenhaus (2004) found a late consideration of context-incongruent competitors. Why this late effect? Kukona et al. (2014) proposed a self-organizing model that predicts exactly such a pattern from the parallel influence of context and local attraction: the early context (e.g., “eat”) activates the target and the verb competitor, whose rising activation increasingly suppresses the adjective competitor until the adjective arrives. This arrival has two consequences: (1) it drives down the activation of the verb competitor, thus reducing its imposed inhibition on the adjective competitor, and (2) it directly supports the activation of the adjective competitor. Together, these two processes gradually lead to the late increased activation of the adjective competitor, unless it is suppressed by top–down control.

We then turned to the critical question of what determines the magnitude of local attraction. Kukona et al.’s (2014) simulations predict a practice effect: early in its training the model shows large local attraction, but this effect diminishes as the model receives more training. This is not surprising, given that initial processing is highly bottom–up, and it is only through feedback and learning that such bottom–up processing becomes sensitive to constraints. The model, thus, predicts that the more mature the linguistic system, the more constrained the bottom–up processing. Comparing linguistic systems of different strengths has its challenges. One approach would be to compare local attraction in children versus adults, where the linguistic systems are truly at different maturational stages; but so are some non-linguistic systems, such as the cognitive control system (e.g., Davidson, Amso, Anderson, & Diamond, 2006). Moreover, evidence suggests that developmental delay in cognitive control has consequences for real-time sentence comprehension in children (Choi & Trueswell, 2010; Trueswell, Sekerina, Hill, & Logrip, 1999; Weighall, 2008), although these developmental changes may be related to cognitive flexibility rather than inhibitory control (Woodard, Pozzan, & Trueswell, 2016). It would thus be difficult to claim that any given difference between children and adults in local attraction is really due to the maturation of their language system alone. Another approach would be to use adult speakers, but to somehow quantify their linguistic competence. Vocabulary size (e.g., Borovsky, Elman, & Fernald, 2012) is a likely candidate, but it is unclear how much of the variability in linguistic competence of adults can be captured by that index. Moreover, above and beyond the maturation of the language system, other abilities may contribute to modulation of local attraction. This study posited that domain-general inhibitory control is one such factor.

Note that the local coherence phenomenon itself is already a challenge to one kind of mental modularity, namely, the mutual modularity of non-overlapping phrases in sentence representations. However, it is potentially compatible with the assumption that language processing is modular with respect to other cognitive processes. The difference between Kukona et al. (2014) and the current design is critical here. In the former, the presence of colored objects on the screen immediately draws attention to color, as has been shown in many studies of visual search (e.g., Theeuwes, 1994); the induced competition is therefore primarily visual, similar to Flanker tasks. It would thus not have been greatly surprising if the magnitude of inhibition in Kukona and colleagues’ design was correlated with that in Flanker. The current design avoids the confound of strongly-guided visual capture for the following reason: we set up the materials such that two of the pictures were rated as incompatible with the adjective (i.e., banana and the fourth picture, e.g., antlers, were never picked during norming to go with the adjective “red”). Of the two pictures that were compatible with the adjective, one (i.e., the target pear) had a lower probability of being associated with the adjective (see the Appendix for details). It is therefore quite unlikely that, upon viewing such a scene, the adjective was automatically activated. Past evidence supports this claim at least for color adjectives. Yee, Ahmed and Thompson-Schill (2012) found no evidence of color-based priming between pairs such as “cucumber” and “emerald”, unless the priming task was preceded by a Stroop task which drew participants’ attention specifically to color.

In summary, the probability of participants becoming aware of the upcoming competition simply by inspecting the visual scene, without hearing the sentence, is very low in the current design. It is at the point where the adjective becomes critical for referent selection (i.e., “She will eat the red…” to distinguish between banana and pear) that the adjective activates the adjective competitor. Thus, the ensuing competition is a direct result of sentence processing, as opposed to visual pop-out. We can now ask whether the ability to resolve the competition induced by sentence comprehension can be predicted from the ability to resolve the competition induced by non-linguistic visual cues as in the Flanker task.

Our results revealed that participants’ ability to suppress irrelevant information in a Flanker task, but not their ability to withhold responses, predicted the magnitude of local attraction. Note, however, that the Flanker effect is indexed by RTs, which can be more sensitive in capturing variations among individuals than the error rates in the no-go task, thus making the null less reliable. Nevertheless, this null replicates an earlier finding reported by Brown-Schmidt (2009), who found that no-go scores were not predictive of inhibition in perspective taking, while Stroop scores were. Similar to Flanker cost, Stroop cost taps more strongly into suppression of irrelevant information at the stimulus level than into response inhibition; thus, the correlation with local attraction reflects a true suppression of linguistic information, as opposed to suppression of eye movements. Together, these findings suggest that cognitive control is involved in suppression of competitors that are in conflict with different types of contextual constraints. The results also answer a second question left open by Brown-Schmidt (2009): inhibitory control processes in a non-linguistic task were still predictive of performance in sentence comprehension, suggesting the domain-generality of such processes.

In summary, these results rule out the possibility, consistent with prior work on lexical local coherence, that local attraction is simply due to low-level feature pop-out capture of attention interfering with sentence interpretation. On the other hand, prior accounts of this phenomenon have posited a purely bottom–up mechanism, while our results show that top–down cognitive control mediates the strength of local coherence effects. Collectively, these results support a sentence-processing model in which activation and suppression of semantically related information are decided by the interplay of context and local attraction (as in a self-organizing model), regulated by domain-general top–down control.