
Best not to bet on the horserace: A comment on Forrin and MacLeod (2017) and a relevant stimulus-response compatibility view of colour-word contingency learning asymmetries


One powerfully robust method for the study of human contingency learning is the colour-word contingency learning paradigm. In this task, participants respond to the print colour of neutral words, each of which is presented most often in one colour. The contingencies between words and colours are learned, as indicated by faster and more accurate responses when words are presented in their expected colour relative to an unexpected colour. In a recent report, Forrin and MacLeod (2017b, Memory & Cognition) asked to what extent this performance (i.e., response time) measure of learning might depend on the relative speed of processing of the word and the colour. With keypress responses, learning effects were comparable when responding to the word and to the colour (contrary to predictions). However, an asymmetry appeared in a second experiment with vocal responses, with a contingency effect only present for colour identification. In a third experiment, the colour was preexposed, and contingency effects were again roughly symmetrical. In their report, they suggested that a simple speed-of-processing (or “horserace”) model might explain when contingency effects are observed in colour and word identification. In the present report, an alternative view is presented. In particular, it is argued that the results are best explained by appealing to the notion of relevant stimulus–response compatibility, which also resolves discrepancies between horserace model predictions and participant results. The article presents simulations with the Parallel Episodic Processing model to demonstrate this case.

In the study of contingency learning, one useful and highly robust tool is the colour-word contingency learning paradigm (Schmidt & Besner, 2008; Schmidt, Crump, Cheesman, & Besner, 2007; Schmidt & De Houwer, 2016a; for related paradigms, see Carlson & Flowers, 1996; Levin & Tzelgov, 2016; Miller, 1987; Mordkoff & Halterman, 2008; Schmidt & De Houwer, 2012b, 2012c). In the typical preparation, participants are presented with coloured neutral words (e.g., “plate” in green), and their task is to identify the print colour while ignoring the word. Critically, each word is presented most often in one colour (e.g., “plate” most often in green, “month” most often in red). Contingency learning is revealed by faster and more accurate responses to high-contingency trials, where the word is presented in the correlated colour (e.g., “plate” in green), relative to low-contingency trials, where the word is presented in an infrequently paired colour (e.g., “plate” in red). Acquisition is extremely rapid (Schmidt & De Houwer, 2016b; Schmidt, De Houwer, & Besner, 2010; Lin & MacLeod, in press), the effect magnitude is influenced by the contingency strength (Forrin & MacLeod, 2017a) and contingency awareness (Schmidt & De Houwer, 2012a, 2012d), and the effect can also be observed between languages (Atalay & Misirlisoy, 2012).

Horserace model

In a recent paper, Forrin and MacLeod (2017b) used both the typical colour-identification version of the paradigm and a related word-identification variant in which the task goal was reversed: identify the word and ignore the colours, which are again correlated. The authors then explored whether the magnitude of the contingency effect in both colour-identification and word-identification variants was influenced by relative speed of processing of words and colours. In particular, they appealed to the notion of a horserace between the word and colour, an analogy that has been discussed (and subsequently discarded) in the colour-word Stroop literature (Dunbar & MacLeod, 1984; Dyer, 1973; Klein, 1964; Morton & Chambers, 1973; Palef & Olson, 1975; Warren, 1972).

In their Experiment 1, they predicted that the contingency learning effect for word identification would be smaller than the contingency learning effect for colour identification. This was inspired by the notion (with a major caveat to be discussed later) that processing of words is faster than processing of colours (Cattell, 1886; Fraisse, 1969). This notion is illustrated visually in Fig. 1. In particular, the notion is that the word-processing pathway (Pathway A) “runs” faster than the colour-processing pathway (Pathway B), meaning that the word can influence colour identification to a greater extent than the colour can influence word identification. That is, a faster-to-process word will influence colour keypresses more than slow-to-process colours will influence word keypresses. This asymmetry was not, however, observed: The contingency effect was roughly equivalent in both conditions.

Fig. 1

A simple horserace model as it applies to contingency learning paradigms. The “word horse” runs faster to the response “finish line” (checkered) than the “colour horse,” producing an asymmetry in the magnitude of colour and word identification contingency effects

Why is this? The authors reasoned that the keypress response modality used for Experiment 1 complicated matters. Words were, contrary to their expectations, responded to slightly slower than colours (albeit only marginally). The authors went on to suggest that the mapping of words to keys might have been less intuitive than the mapping of colours to keys. And more importantly, they pointed out that both colour-to-key and word-to-key mappings are arbitrary (unlike with vocal responding), which may work against the “word horse” advantage. The current report will expand on this latter point to a much greater extent later. However, before presenting an alternative view to the simple horserace model, it is first worth considering the remaining two experiments of Forrin and MacLeod (2017b).

In their Experiment 2, the task was identical, save that keypress responses were replaced with vocal responses (i.e., colour naming and word reading). Their prediction, which was (mostly) confirmed, was that with vocal responses the word would be able to beat the “colour horse” to a vocal response, boosting the contingency effect in colour naming. In contrast, the slower colour would not be able to beat the “word horse” to a vocal response, thus attenuating the contingency effect in word reading. Indeed, a contingency effect was present for colour naming but not for word reading. As one anomaly, however, it is noteworthy that the contingency effect was decreased in vocal relative to keypress responding for both word reading and, more critically, colour naming. The decrease in the colour-naming condition is not consistent with their account: Because the word is “winning the race” to a much greater extent with a vocal response, the contingency effect presumably should have been boosted. In the alternative account of their data to be presented later, however, this decrease in both conditions, including the more drastic decrease for word reading, is to be expected.

Finally, their Experiment 3 was identical to Experiment 2, except that the colour was preexposed with a coloured rectangle before the coloured word was presented. The notion was that this temporal head start for the colour would shift the advantage away from word reading and toward colour naming (for similar logic, see Schmidt & De Houwer, 2016b). Thus, the larger contingency effect of words on colour naming was predicted to be diminished or even reversed. Consistent with this, the contingency effect decreased for colour naming (relative to Experiment 2) and increased for word reading (both effects, however small, were still significant). In the alternative account to be given below, the interpretation of the difference between Experiments 2 and 3 will be similar to that presented by Forrin and MacLeod (2017b).

Conceptual considerations

The alternative account to be presented in the current report does not disagree with the broader idea of Forrin and MacLeod (2017b) that the relative speed with which the distracter (via a contingency) and the target bias a response matters in the observed magnitude of the contingency effect. However, it does differ in where it is supposed that a word-over-colour advantage is observed. To begin illustrating this point, consider the slightly expanded version of a simple horserace model, presented in Fig. 2. In this variant, we consider both the initial processing of the stimuli, followed by the conversion of a decision about the identity of the stimulus to a response.

Fig. 2

An expanded horserace model as it applies to colour-naming and word-reading contingency learning paradigms. Most critically, it is unclear why words should influence colour naming at a particularly strong rate when the word is not a potential response (i.e., why C in the top panel should be stronger than D in the bottom panel, indicated as learned connections)

The reader might note that Fig. 2 depicts the connections between stimulus inputs and decisions as equally strong for words and colours (Pathways A and B). Why is this? A simple horserace model might suggest a stimulus-processing advantage for words, contrary to the figure. However, this is not a reasonable assumption. Past work has shown that word reading is faster than colour naming (Cattell, 1886; Fraisse, 1969). Contrary to the common assumption that tends to echo throughout the literature, this finding does not mean that word stimuli are processed faster than colour stimuli (Melara & Algom, 2003). Rather, the time between stimulus presentation and verbalisation is faster for words than for colours. From the perspective of a very simple horserace model this might sound like the same thing, simply worded two different ways. However, if we consider stimulus identification and the translation (Sugg & McDonald, 1994; Virzi & Egeth, 1985) of that identified stimulus to a vocal response as two different things, then it may actually be the case that words are not (visually) processed especially fast, but only that the identified word can be rapidly converted to a vocal output. That is, the path from the representation (e.g., lexical) of a word to its pronunciation is much more direct than the path from a colour representation (e.g., pictorial) to the appropriate colour label pronunciation. Indeed, reading words is much more heavily practiced than naming colours. However, we see colours (literally everything has a colour) even more frequently than words. Indeed, word detection does not seem to be especially fast (Fraisse, 1969). Furthermore, no word-identification benefit was observed in the keypress experiments of Forrin and MacLeod (2017b), which is also similar to keypress Stroop studies (e.g., Blais & Besner, 2006). 
Thus, the proposition here is that the advantage that words have over colours with a vocal response (reading/naming) is not a benefit in stimulus-processing speed but a benefit in the compatibility between targets and responses (i.e., response-selection speed).

If the word-over-colour advantage is in response selection (rather than in stimulus processing), then should task-irrelevant words not still retrieve responses faster than task-irrelevant colours? As the figure caption indicates, the hidden assumption with this notion is that words will speed responses with any vocal response modality, regardless of whether the word stimuli (e.g., “plate”) match the vocal responses (e.g., “green”). In a Stroop task (Stroop, 1935), for instance, the word red produces a quick-response activation of the “red” vocal response because of the overtrained compatibility between the word and its verbalisation. For the same logic to work with the colour-word contingency learning paradigm, it would have to be assumed that, for example, the distracting word plate is quickly “read” as green (i.e., if plate is presented most often in green) because of the vocal modality, and much faster than the distracting green print colour will be “read” as plate. Though it is not impossible that the sheer nature of the vocal response modality leads to rapid translation of a word to a vocal response (e.g., month translated to a “red” response) via a contingency in the colour-naming task (Connection C; or perhaps faster processing of the word via Connection A), this seems less obvious than in the case of an overlearned word-vocalisation association (e.g., red translated to a “red” response) in the Stroop task (i.e., Connection C in word reading).

Alternative interpretation

As Forrin and MacLeod (2017b) correctly point out, the horserace metaphor they discuss did not fare well long-term in the Stroop literature (for a review, see MacLeod, 1991). This is inevitably because a horserace metaphor is too simple. That is, the horserace model served as an interesting analogy for thinking about the very basic finding of a congruency-effect asymmetry in simple vocal Stroop experiments (i.e., large effect in colour naming, and no effect in word reading) but proved a blunt tool in explaining further details of Stroop task performance. Here, it is suggested that the same is true for a simple horserace model of colour-word contingency effects.

Consider instead a model of Stroop (and related) effects that has fared better: the dimensional-overlap model (Kornblum, Hasbroucq, & Osman, 1984; Kornblum & Lee, 1995; Kornblum, Stevens, Whipple, & Requin, 1999; Zhang & Kornblum, 1998; Zhang, Zhang, & Kornblum, 1999). According to this model, the presence and magnitude of conflict effects, such as those in the Stroop or Simon tasks, are determined by the overlaps between stimulus and/or response dimensions (see also Augustinova, Silvert, Ferrand, Llorca, & Flaudias, 2015; De Houwer, 2003, 2004; Melara & Algom, 2003; Risko, Schmidt, & Besner, 2006; Schmidt & Cheesman, 2005). Of particular importance, distracting stimuli are said to interfere to the extent that they overlap with responses. For instance, distracting horizontal (left/right) locations can be compatible or incompatible with left/right response keys to another stimulus (Simon, Craft, & Webster, 1973; Simon & Rudell, 1967). In the case of the vocal (and keypress) studies of Forrin and MacLeod (2017b), it is important to stress that this irrelevant stimulus-response (S-R) compatibility did not exist. Words (e.g., plate) were not potential colour-naming responses, and colours were not potential word-reading responses.

On the other hand, the relevant S-R compatibility between target stimuli and their assigned responses does increase when switching to vocal responses. For instance, mapping of words to keys is arbitrary in a manual task (e.g., “Press the J key for plate”). Reading the words (e.g., saying “plate” to the word plate), however, is heavily overtrained. The same is also true for colour targets. Colour-to-key mappings are arbitrary (e.g., “Press the J key for red”), whereas naming colours is nonarbitrary (e.g., saying “red” to a red stimulus). In both cases, we would expect the “target horse” to have an advantage over the “distracter horse,” even if the distracter horse runs at the same speed in keypress, word reading, and colour naming. However, because word reading is more heavily trained than colour naming, we should anticipate a “target horse” advantage to a greater extent with word reading.

Thus, according to the alternative interpretation presented here, the difference between keypress and vocal contingency learning tasks is exclusively due to relevant stimulus-response compatibility. That is, it is assumed that a target stimulus can be more quickly translated into a response with a vocal response modality because the vocal response directly corresponds to an overlearned reading/naming response. That is to say, Connection D (see Fig. 2) is strengthened in colour naming (e.g., when participants are saying “red” to a red stimulus), and Connection C is strengthened in word reading (e.g., when participants are saying “plate” to the word plate). Similar to Forrin and MacLeod (2017b), it is assumed that the latter word-reading translation is more heavily overlearned than the colour-naming translation. With keypress responses, it is assumed that Connections C and D do not exist at all (i.e., no overlearned colour-key or word-key associations). In addition, it is assumed that there is no (meaningful) difference at all between vocal and keypress (visual) processing speeds of words and colours early on (Connections A and B).

Contingency effects emerge from episodic retrieval (Schmidt et al., 2010). In particular, on each trial a new episode is formed, linking the stimuli presented to the response that was made. During retrieval on subsequent trials, these episodes automatically bias responding. Because, for instance, most memories of the word plate point to a green response, simple presentation of the word plate will lead to a strong retrieval bias of the green response. This facilitates performance on high-contingency trials, producing a contingency effect. Critical to the current argument, it is here assumed that contingency learning proceeds identically in all experiments for all stimuli (i.e., no advantage for words over colours). That is, words do not produce a stronger influence on responses than colours via episodic retrieval.
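The retrieval logic described above can be illustrated with a minimal sketch. This is not the PEP model itself: the class `EpisodicMemory`, its method names, and the example trial history are hypothetical, and retrieval bias is reduced to a simple proportion of stored episodes.

```python
from collections import Counter, defaultdict


class EpisodicMemory:
    """Toy instance store (hypothetical): one episode per trial,
    linking the distracter that was shown to the response that was made."""

    def __init__(self):
        self.episodes = defaultdict(Counter)

    def store(self, distracter, response):
        # Each trial adds a new episode to memory.
        self.episodes[distracter][response] += 1

    def retrieval_bias(self, distracter):
        # Proportion of retrieved episodes pointing to each response.
        counts = self.episodes.get(distracter, Counter())
        total = sum(counts.values())
        if total == 0:
            return {}
        return {response: n / total for response, n in counts.items()}


memory = EpisodicMemory()
# Assumed history: "plate" seen 8 times in green, once in red, once in yellow.
for response in ["green"] * 8 + ["red", "yellow"]:
    memory.store("plate", response)

bias = memory.retrieval_bias("plate")
print(bias)
```

In this sketch, presenting the word plate retrieves mostly "green" episodes, so a green response would be favoured on high-contingency trials. The same store could be keyed on colours for word identification, in line with the assumption that learning proceeds identically for both stimulus types.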

Parallel Episodic Processing (PEP) model

To assess the stimulus-response compatibility account of contingency learning effects, the current report uses the Parallel Episodic Processing (PEP) model (Schmidt, 2013a, 2013b, 2016a, 2016b; Schmidt, De Houwer, & Liefooghe, 2017; Schmidt, De Houwer, & Rothermund, 2016; Schmidt & Weissman, 2016). This model learns both what to respond (contingency learning) and (less relevant for the current report) when to respond (temporal learning) on the basis of memories of past events. In particular, the model stores a new episodic memory of each trial that it experiences. On each trial, it retrieves memories on the basis of similarity (e.g., the word plate will retrieve memories of the word plate) in order to anticipate the likely response. For instance, if plate was presented most often in green, then most “plate” memories will point to a green response, thereby facilitating a green response. The PEP model is similar to other episodic (aka, instance or exemplar) models of memory (e.g., Hintzman, 1984, 1986, 1988; Logan, 1988; Medin & Schaffer, 1978; Nosofsky, 1988a, 1988b; Nosofsky & Palmeri, 1997; Nosofsky, Little, Donkin, & Fific, 2011), but is structured for the purpose of simulating performance (response times) rather than recall, recognition, or categorization. The PEP model can simulate a range of phenomena from a diverse range of research fields, including work on practice, contingency learning, temporal learning, feature integration, instruction and goal implementation, timing, and various so-called cognitive control tasks. Most critical for the current report, of course, is the ability of the model to simulate colour-word contingency learning effects. The model thus provides a means to assess the qualitative predictions of the stimulus-response compatibility account presented here. 
As will be demonstrated, this model predicts (a) no meaningful asymmetry with keypress responses (Simulation 1), (b) reduced contingency effects with verbal responses, especially for word reading (Simulation 1), and (c) a shift in asymmetries with colour preexposure (Simulation 2).

Simulation 1: Experiment 1 versus Experiment 2

For the current simulations, Version 3.0 of the PEP model (the current working version) is used in an unaltered state. Note, however, that the simulations to be presented here are straightforward enough that any version of the model should produce the same qualitative results. Here, for brevity, only a brief conceptual overview of the relevant features of the PEP model is presented. Fully documented source code can, however, be freely downloaded from the website of the lead author, and prior reports explain the functioning of the model in further detail. It is also critical to stress that the model is not “parameter hacked” on a simulation-by-simulation basis to produce good quantitative fit, but is merely structured to provide insights on qualitative predictions in a fixed parameter framework. Thus, exact effect magnitudes should not be interpreted too strongly. Note also that no inferential statistics are reported, because enough simulated participants are run to ensure correct model description. All discussed effects are, however, statistically significant, generally by a gigantic margin.

Figure 3 presents the PEP model as it applies to colour identification/naming (top panel) and word identification/reading (bottom panel). For both colour identification and word identification, there are input nodes for each of the colours (red, yellow, and green) and each of the words (plate, month, and under). In colour identification, there are three decision nodes (one for each colour) and three corresponding colour responses. The same is true in word identification, except that the decision and response nodes are changed (i.e., one for each word). Target input nodes are directly connected to the matching decision nodes (e.g., the green colour input node is connected to the green decision node).

Fig. 3

The PEP model as it applies to colour-naming (top) and word-reading (bottom) contingency learning tasks. The overtrained decision-response connections were the only connections modified in Simulation 1

In simulations of keypress experiments, however, the response nodes are changed to keys, and there is no connection at all between decision and response nodes. That is, the fact that the response key for red is J is completely arbitrary and nothing that the model could know in advance. However, the model can still perform the task well, even from the very first trial. This is because the model stores the task instructions at the start of the simulation (e.g., that red should be responded to with a J key). The model can then solve for the response via memory retrieval. These instructions are then automatized during performance of the task, because memories of the trials themselves effectively reencode the instructions (e.g., memory of actually seeing red and pressing the J key in response to it).

Because stimuli are simply represented arbitrarily as nodes in a stimulus array, the colour-identification and word-identification models are identical in all respects for keypress. That is, it is arbitrary whether we call the target input nodes “green,” “yellow,” and “red” or “under,” “month,” and “plate,” and the same holds true for the distracter input and decision nodes. Thus, by definition, a simulation of Experiment 1 will produce no difference between word-identification and colour-identification contingency learning procedures. More interesting is when we consider the role of relevant (target) stimulus-response compatibility. To simulate relevant stimulus-response compatibility, decision and response nodes are connected. For instance, in colour naming, the green decision node is connected to the “green” vocal response. Similarly, in word reading, the month decision node is connected to the “month” reading response. As mentioned earlier, the assumption is that there is a compatibility between target decisions and responses for both word reading and colour naming but that this compatibility is stronger for word reading. Thus, the only difference between the instantiation of word reading and colour naming is the strength of the weightings between decision and response nodes.

Rather than just selecting two relatively arbitrary values, one for colour naming and one for word reading, stimulus-response compatibility was manipulated parametrically in the current simulations, with weightings of 0, 1, 2, 3, and 4. A connection weight of zero, of course, indicates no compatibility and is the keypress reference point. The remaining levels are to show parametrically what happens in the model as connection weightings increase. In particular, the predictions are that (a) overall response speed should increase, and (b) the contingency effect should decrease as connection weights increase. The latter prediction is largely a by-product of the former: the faster that the response can be determined directly via the overtrained compatible stimulus-response associations, the less time there is for contingency learning processes to affect behaviour. The learning mechanism itself does not change, however. Five hundred simulated participants were run for each of the five stimulus-response weightings. Each simulated participant was presented with 360 trials, selected randomly with replacement from the same contingency matrix as used by Forrin and MacLeod (2017b), presented in Table 1.
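The trial-selection step can be sketched as follows. The cell frequencies below (8 for the high-contingency pairing, 1 otherwise) are placeholders chosen for illustration and are not the Table 1 values. The function names are likewise hypothetical, and the pairing of under with yellow is assumed.

```python
import random

WORDS = ["plate", "month", "under"]
COLOURS = ["green", "red", "yellow"]  # assumed pairing: plate-green, month-red, under-yellow


def build_matrix(high=8, low=1):
    """Hypothetical contingency matrix: relative frequency of each word-colour cell."""
    return {
        (word, colour): (high if i == j else low)
        for i, word in enumerate(WORDS)
        for j, colour in enumerate(COLOURS)
    }


def sample_trials(matrix, n_trials=360, seed=0):
    """Draw trials randomly with replacement, weighted by cell frequency."""
    rng = random.Random(seed)
    cells = list(matrix)
    weights = [matrix[cell] for cell in cells]
    return rng.choices(cells, weights=weights, k=n_trials)


matrix = build_matrix()
trials = sample_trials(matrix)
n_high = sum(1 for cell in trials if matrix[cell] == 8)
print(f"{n_high} of {len(trials)} trials are high contingency")
```

Because sampling is with replacement, the proportion of high-contingency trials varies somewhat from one simulated participant to the next rather than being fixed exactly by the matrix.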

Table 1 Contingency manipulation of simulations

As can be observed in Fig. 4, the contingency effect decreases rapidly as the decision-response connections are strengthened: 21, 16, 12, 7, and 3 cycles, respectively. This is consistent with the notion that the stronger the overtrained compatibility between target stimuli and responses, the less the distracter is able to influence responding. For instance, with very strong decision-response compatibility for word reading (e.g., connection weight of 4), the contingency effect is practically eliminated (the effect is still robust, but only because of the very large number of simulated participants). However, with weaker colour-naming weightings (e.g., connection weight of 2), the contingency effect is also decreased relative to an arbitrarily instructed keypress response (i.e., connection weight of zero), but not eliminated.
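The reasoning behind this pattern can be illustrated with a deliberately simplified accumulator. It is a toy sketch and not the PEP model: every parameter (`threshold`, `base`, `gain`, `max_bias`, `ramp`) is a hypothetical value, and it is simply assumed that the retrieval bias from the distracter builds up gradually over cycles.

```python
def cycles_to_respond(weight, high_contingency,
                      threshold=100.0, base=0.25, gain=0.02,
                      max_bias=0.03, ramp=400):
    """Toy race (hypothetical parameters): count cycles until the correct
    response node reaches threshold."""
    activation = 0.0
    cycle = 0
    while activation < threshold:
        cycle += 1
        # Target drive: instructed route plus the overtrained
        # decision-response connection (weight 0 = keypress).
        activation += base + gain * weight
        if high_contingency:
            # Assumed: retrieval bias from the distracter ramps up over time.
            activation += max_bias * min(1.0, cycle / ramp)
    return cycle


effects = []
mean_cycles = []
for weight in range(5):
    low = cycles_to_respond(weight, high_contingency=False)
    high = cycles_to_respond(weight, high_contingency=True)
    effects.append(low - high)
    mean_cycles.append((low + high) / 2)
    print(f"weight {weight}: low={low}, high={high}, effect={low - high}")
```

As the decision-response weight grows, the threshold is reached sooner, so the slowly accumulating retrieval bias contributes less before a response is made. The absolute numbers from this sketch carry no meaning; only the direction of the change is of interest.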

Fig. 4

Simulation 1 cycle times for high-contingency and low-contingency trials as a function of overtrained decision-response weightings

Note as well that mean RT also decreases as the decision-response connections are strengthened. The averages of the high-contingency and low-contingency means were 439, 419, 400, 383, and 368 cycles, respectively, for connection weights 0 to 4. This is also consistent with the observation that word reading was overall much faster than colour naming in the original sample. As one minor discrepancy, colour identification was nominally slower in vocal colour naming than in keypress colour identification in the original report. The reverse is true in the simulated data. However, this might merely reflect slower response initiation in vocalisations (e.g., longer to begin speaking “green”) relative to keypresses (e.g., quicker finger depression of the key assigned to green), which is not modelled in the PEP framework. As another note of interest, the size of the contingency effect is not directly proportional to mean RT in the simulated data but instead decreases much more quickly (Fig. 4 might be deceptive in this respect, given the restricted scale). The contingency effect divided by mean RT (i.e., average of the high-contingency and low-contingency means) for the 0 to 4 connection weights was, respectively, 4.7%, 3.8%, 3.0%, 1.9%, and 0.9%. To use horserace-metaphor terminology, this is because episodic retrieval on the basis of the distracter must, to some extent, “beat” the target to a response. That is, with a heavily overtrained target-response correspondence (e.g., reading of words), episodic search on the basis of a neutral distracting stimulus (e.g., colour) does not have enough time to meaningfully bias responses.
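The proportional effect can be recomputed from the figures reported above. Note that the inputs here are the rounded cycle values given in the text, so the recomputed percentages may differ from the reported ones by roughly a tenth of a percentage point.

```python
# Values reported in the text for connection weights 0 to 4 (rounded to whole cycles).
contingency_effects = [21, 16, 12, 7, 3]
mean_rts = [439, 419, 400, 383, 368]


def proportional_effect(effect, mean_rt):
    """Contingency effect as a percentage of mean RT."""
    return 100.0 * effect / mean_rt


proportions = [
    proportional_effect(effect, rt)
    for effect, rt in zip(contingency_effects, mean_rts)
]
for weight, pct in enumerate(proportions):
    print(f"weight {weight}: {pct:.1f}% of mean RT")
```

If the effect scaled directly with mean RT, these percentages would be constant across weights; instead they fall steadily, which is the point made in the text.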

Simulation 2: Experiment 2 versus Experiment 3

In their Experiment 3, Forrin and MacLeod (2017b) tested to what extent preexposing the colour might serve to reduce and/or reverse the asymmetry between colour-naming and word-reading contingency learning effects. In particular, the colour was preexposed for 200 ms as a coloured rectangle before the coloured word was presented. The logic, then, is that, although the word is ordinarily processed more quickly, the colour receives a “head start” to reduce the word-reading advantage. In the current instantiation of the PEP model, the logic is similar to this. That is, it is indeed assumed that the colour receives a head start. This is implemented in the model by adding three colour rectangle nodes to the model, which are presented 200 cycles in advance of the coloured word. In both colour naming and word reading, this rectangle always matches the word print colour (i.e., the target colour in colour naming and the distracting colour in word reading), as in the original study. These rectangle nodes are not connected to any other nodes (see General Discussion). However, head start aside, the only difference between colour naming and word reading in the simulation is the relevant (target) stimulus-response compatibility. Partially arbitrarily and partially on the basis of the quantitative results in Simulation 1, relevant word-response connections are set to 4 and relevant colour-response connections to 2. However, the general principle to be taken from the following simulation will hold regardless of which parameters are chosen. In particular, the predictions are that preexposing the colour will (a) lead to a reduction in the colour-naming contingency effect, and (b) lead to an increase in the word-reading contingency effect. With colour naming, the head start of the colour will leave less time for episodic memory to bias responding on the basis of the word-response contingency. 
With word reading, it is the reverse: The head start of the colour gives extra time for memory to be biased by the colour-response contingency. As with the previous simulation, 500 simulated participants were run per condition with the same number of trials (360) and the same contingency matrix (see Table 1).
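The head-start logic can be illustrated with a toy accumulator. Again, this is a sketch under stated assumptions and not the PEP model. All parameters are hypothetical, the retrieval bias is assumed to ramp up gradually, and the rectangle is assumed to weakly prime the correct colour response (`rect_rate`) when the colour is the target.

```python
def cycles_after_word_onset(weight, high_contingency, preview, colour_is_target,
                            threshold=100.0, base=0.25, gain=0.02,
                            max_bias=0.03, ramp=400, rect_rate=0.05):
    """Toy race (hypothetical parameters); returns cycles counted from word onset."""
    activation = 0.0
    # Preview period: the colour rectangle is shown before the word.
    for step in range(1, preview + 1):
        if colour_is_target:
            # Assumed: the rectangle weakly primes the correct colour response.
            activation += rect_rate
        elif high_contingency:
            # The colour is the distracter: its retrieval bias starts early.
            activation += max_bias * min(1.0, step / ramp)
    # Retrieval from the distracter has a lead only if the distracter was preexposed.
    retrieval_lead = 0 if colour_is_target else preview
    cycle = 0
    while activation < threshold:
        cycle += 1
        activation += base + gain * weight
        if high_contingency:
            activation += max_bias * min(1.0, (cycle + retrieval_lead) / ramp)
    return cycle


def contingency_effect(weight, preview, colour_is_target):
    low = cycles_after_word_onset(weight, False, preview, colour_is_target)
    high = cycles_after_word_onset(weight, True, preview, colour_is_target)
    return low - high


# Weights follow the text: 2 for colour naming, 4 for word reading.
colour_naming = {p: contingency_effect(2, p, True) for p in (0, 200)}
word_reading = {p: contingency_effect(4, p, False) for p in (0, 200)}
print("colour naming:", colour_naming)
print("word reading:", word_reading)
```

In this sketch, a head start for the target shortens the window in which the distracter can contribute, whereas a head start for the distracter lengthens it. Only the direction of each change is meaningful here, not its size.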

The simulation results are presented in Fig. 5. Two critical features are of note. First, the contingency effect for colour naming decreased with colour preexposure, as predicted. In particular, the contingency effect decreased from 12 to 10 cycles, a two-cycle decrease. The decrease in the contingency effect, though robust, is small in the model. However, the model is not parameterized for perfect quantitative fit. More critical is that, qualitatively, a decrease in the effect is expected with a colour preview (see General Discussion for further considerations). Second, the contingency effect for word reading increased with colour preexposure, also as predicted. In particular, the contingency effect increased from 3 to 18 cycles, a 15-cycle increase. Thus, preexposure of the target (as is the case with colour naming here) will give the distracter less time to influence processing via memory retrieval, whereas the exact reverse is true with preexposure of the distracter (as is the case with word reading here).

Fig. 5

Simulation 2 contingency effect as a function of colour preview and target dimension

General discussion

At a broader level, the present report agrees with the explanation put forward by Forrin and MacLeod (2017b) that the relative time with which the target and distracter-contingent responses are activated impacts the size of the contingency effect and the extent to which an asymmetry will be observed between word identification (whether reading or keypress) and colour identification (whether naming or keypress). At a more detailed level, however, the suggestion presented here is that the compatibility between the target dimension and response modality (in addition to any stimulus preview advantages, of course) will be the primary factor influencing effect magnitude. At least in the preparations considered in the present report, speed of processing might be less relevant for the distracting dimension. In both cases, there is no compatibility between colours and reading responses of noncolour words or between noncolour words and naming responses of colours. Of particular importance, the suggestion is that distracting words do not necessarily boost a contingency effect simply because the response modality is vocal. These ideas, of course, deviate from the simple horserace metaphor, which suggests that the word “horse” is fast and the colour “horse” is slow, hard stop. As in the Stroop literature, then, a simple horserace model again falls short, whereas dimensional overlap proves more informative. Indeed, Forrin and MacLeod themselves depart from a simple horserace model in their General Discussion, considering a hybrid between a dual process model (Moors, Spruyt, & De Houwer, 2010) and the PEP model.

Of course, one can rightly view the horserace model as simply a much more abstracted (i.e., simplified) version of more developed models of performance, such as the dimensional-overlap account presented here. That is, the dimensional-overlap model is more precise about when a given “horse” runs fast or slow, and at what stages of processing. In that sense, the present report can be viewed as providing a more “microscopic” investigation of speed of processing in colour-word contingency learning paradigms (i.e., with a horserace model being a “macroscopic” version of the same idea). The microscopic focus of the present report is useful, however, as it helps us to better comprehend the observed results. For instance, both (a) the lack of an asymmetry between word and colour identification with keypresses and (b) the decrease in contingency effects when switching to both word reading and colour naming might seem surprising from the perspective of a horserace model (i.e., as words should “run” faster than colours at all times) but are completely in line with expectations from a dimensional-overlap model.

One interesting feature of the present report is that the insights obtained from the current analysis of the Forrin and MacLeod (2017b) experiments were obtained by following the logic of two large-scale performance frameworks (i.e., the PEP and dimensional-overlap models). In particular, the PEP modelling framework allows us to make predictions about contingency learning via memory retrieval, and the stimulus-response compatibility notions from the dimensional-overlap model help us to understand how contingency effects (like compatibility effects) are modified by changes in response modality. Future neural network research might aim to integrate these two frameworks even further.

As one caveat, the decrease in the contingency effect in colour identification with a colour preview was rather small in the current Simulation 2. Although exact effect magnitudes should not be interpreted too strongly in the simulated data, one reason for this smaller decrease (i.e., relative to the participant data) might be the way that colour rectangles were treated. Colour rectangles were treated as task-irrelevant stimuli, meaning that they did not have the strong connections to decision nodes that targets do. In fact, these rectangle nodes were not connected to anything. Though these rectangles will aid in selecting the colour (due to the 100% contingency), actual participants may treat the coloured rectangles as targets. As the rectangle colour is perfectly correlated with the word print colour, participants might deliberately respond to the rectangle rather than (or in addition to) the print colour, as Forrin and MacLeod (2017b) rightly point out in their original article. If so, selection of the colour will be even faster, leaving the word less time to influence performance. Attention to the rectangle might also draw attention away from the word, further reducing learning on the basis of the word. Future research might explore this issue (e.g., by making the colour rectangle not perfectly predictive of the print colour). In the current simulations, rectangles were treated as task-irrelevant merely to make the comparison to word reading (where they must be task irrelevant) clearer. Incidentally, this decision only worked against predictions, as the contingency effect for colour naming would have been reduced further with rectangle–decision node connections, which would narrow the biggest (albeit only quantitative) discrepancy in the simulations presented here.

To summarize, the horserace metaphor provides an interesting description of some simple findings in the Stroop literature (e.g., vocal responding asymmetries). However, the account falls short in explaining the finer details, making it a blunt tool. Further, the current manuscript suggests that it may not be a particularly useful metaphor for colour-word contingency learning asymmetries, especially given the lack of stimulus-response compatibility between distracting stimuli and responses. Like a solar system model of the atom, taking the horserace analogy too far will result in misprediction. A more developed model, such as the dimensional-overlap model (when combined with assumptions about how learning occurs, as in the PEP model), might provide a much better account of the data. Of course, one might reasonably expand the horserace metaphor to encompass the added considerations discussed in this manuscript. For example, the race could be split into different legs, with an initial dash out of the starting gate (stimulus processing) and a final sprint to the finish line (response selection), with further specifications for faster running horses in less bumpy lanes (stimulus-response compatibility) and routes to the finish line via a “memory lane” (learned stimulus-response contingencies). This, of course, does undermine the initial simplicity of the horserace metaphor, and also merely serves to force horserace terminology on already existing theories, such as the dimensional-overlap and PEP models. For this reason, it might be best to keep our betting money in the episodic memory bank.

References


  1. Atalay, N. B., & Misirlisoy, M. (2012). Can contingency learning alone account for item-specific control? Evidence from within- and between-language ISPC effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38, 1578–1590.

  2. Augustinova, M., Silvert, L., Ferrand, L., Llorca, P. M., & Flaudias, V. (2015). Behavioral and electrophysiological investigation of semantic and response conflict in the Stroop task. Psychonomic Bulletin & Review, 22, 543–549.

  3. Blais, C., & Besner, D. (2006). Reverse Stroop effects with untranslated responses. Journal of Experimental Psychology: Human Perception and Performance, 32, 1345–1353.

  4. Carlson, K. A., & Flowers, J. H. (1996). Intentional versus unintentional use of contingencies between perceptual events. Perception & Psychophysics, 58, 460–470.

  5. Cattell, J. M. (1886). The time it takes to see and name objects. Mind, 11, 63–65.

  6. De Houwer, J. (2003). On the role of stimulus-response and stimulus-stimulus compatibility in the Stroop effect. Memory & Cognition, 31, 353–359.

  7. De Houwer, J. (2004). Spatial Simon effects with nonspatial responses. Psychonomic Bulletin & Review, 11, 49–53.

  8. Dunbar, K., & MacLeod, C. M. (1984). A horse race of a different color: Stroop interference patterns with transformed words. Journal of Experimental Psychology: Human Perception and Performance, 10, 622–639.

  9. Dyer, F. N. (1973). The Stroop phenomenon and its use in study of perceptual, cognitive, and response processes. Memory & Cognition, 1, 106–120.

  10. Forrin, N. D., & MacLeod, C. M. (2017a). The influence of contingency proportion on contingency learning. Manuscript submitted for publication.

  11. Forrin, N. D., & MacLeod, C. M. (2017b). Relative speed of processing determines color-word contingency learning. Memory & Cognition.

  12. Fraisse, P. (1969). Why is naming longer than reading? Acta Psychologica, 30, 96–103.

  13. Hintzman, D. L. (1984). Minerva 2: A simulation model of human memory. Behavior Research Methods, Instruments, & Computers, 16, 96–101.

  14. Hintzman, D. L. (1986). “Schema abstraction” in a multiple-trace memory model. Psychological Review, 93, 411–428.

  15. Hintzman, D. L. (1988). Judgments of frequency and recognition memory in a multiple-trace memory model. Psychological Review, 95, 528–551.

  16. Klein, G. S. (1964). Semantic power measured through the interference of words with color-naming. American Journal of Psychology, 77, 576–588.

  17. Kornblum, S., Hasbroucq, T., & Osman, A. (1984). The dimensional overlap model for stimulus-response compatibility. Bulletin of the Psychonomic Society, 22, 276–276.

  18. Kornblum, S., & Lee, J. W. (1995). Stimulus-response compatibility with relevant and irrelevant stimulus dimensions that do and do not overlap with the response. Journal of Experimental Psychology: Human Perception and Performance, 21, 855–875.

  19. Kornblum, S., Stevens, G. T., Whipple, A., & Requin, J. (1999). The effects of irrelevant stimuli: 1. The time course of stimulus-stimulus and stimulus-response consistency effects with Stroop-like stimuli, Simon-like tasks, and their factorial combinations. Journal of Experimental Psychology: Human Perception and Performance, 25, 688–714.

  20. Levin, Y., & Tzelgov, J. (2016). Contingency learning is not affected by conflict experience: Evidence from a task conflict-free, item-specific Stroop paradigm. Acta Psychologica, 164, 39–45.

  21. Lin, O. Y.-H., & MacLeod, C. M. (in press). The acquisition of simple associations as observed in color-word contingency learning. Journal of Experimental Psychology: Learning, Memory, and Cognition.

  22. Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95, 492–527.

  23. MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163–203.

  24. Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207–238.

  25. Melara, R. D., & Algom, D. (2003). Driven by information: A tectonic theory of Stroop effects. Psychological Review, 110, 422–471.

  26. Miller, J. (1987). Priming is not necessary for selective-attention failures: Semantic effects of unattended, unprimed letters. Perception & Psychophysics, 41, 419–434.

  27. Moors, A., Spruyt, A., & De Houwer, J. (2010). In search of a measure that qualifies as implicit: Recommendations based on a decompositional view of automaticity. In B. Gawronski & B. K. Payne (Eds.), Handbook of implicit social cognition: Measurement, theory, and applications (pp. 19–37). New York: Guilford Press.

  28. Mordkoff, J. T., & Halterman, R. (2008). Feature integration without visual attention: Evidence from the correlated flankers task. Psychonomic Bulletin & Review, 15, 385–389.

  29. Morton, J., & Chambers, S. M. (1973). Selective attention to words and colors. Quarterly Journal of Experimental Psychology, 25, 387–397.

  30. Nosofsky, R. M. (1988a). Exemplar-based accounts of relations between classification, recognition, and typicality. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 700–708.

  31. Nosofsky, R. M. (1988b). Similarity, frequency, and category representations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 54–65.

  32. Nosofsky, R. M., Little, D. R., Donkin, C., & Fific, M. (2011). Short-term memory scanning viewed as exemplar-based categorization. Psychological Review, 118, 280–315.

  33. Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based random walk model of speeded classification. Psychological Review, 104, 266–300.

  34. Palef, S. R., & Olson, D. R. (1975). Spatial and verbal rivalry in a Stroop-like task. Canadian Journal of Psychology, 29, 201–209.

  35. Risko, E. F., Schmidt, J. R., & Besner, D. (2006). Filling a gap in the semantic gradient: Color associates and response set effects in the Stroop task. Psychonomic Bulletin & Review, 13, 310–315.

  36. Schmidt, J. R. (2013a). The Parallel Episodic Processing (PEP) model: Dissociating contingency and conflict adaptation in the item-specific proportion congruent paradigm. Acta Psychologica, 142, 119–126.

  37. Schmidt, J. R. (2013b). Temporal learning and list-level proportion congruency: Conflict adaptation or learning when to respond? PLOS ONE, 8, e0082320.

  38. Schmidt, J. R. (2016a). Context-specific proportion congruent effects: An episodic learning account and computational model. Frontiers in Psychology, 7(1806).

  39. Schmidt, J. R. (2016b). Proportion congruency and practice: A contingency learning account of asymmetric list shifting effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42(9), 1496–1505.

  40. Schmidt, J. R., & Besner, D. (2008). The Stroop effect: Why proportion congruent has nothing to do with congruency and everything to do with contingency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 514–523.

  41. Schmidt, J. R., & Cheesman, J. (2005). Dissociating stimulus-stimulus and response-response effects in the Stroop task. Canadian Journal of Experimental Psychology, 59, 132–138.

  42. Schmidt, J. R., Crump, M. J. C., Cheesman, J., & Besner, D. (2007). Contingency learning without awareness: Evidence for implicit control. Consciousness and Cognition, 16, 421–435.

  43. Schmidt, J. R., & De Houwer, J. (2012a). Adding the goal to learn strengthens learning in an unintentional learning task. Psychonomic Bulletin & Review, 19, 723–728.

  44. Schmidt, J. R., & De Houwer, J. (2012b). Contingency learning with evaluative stimuli: Testing the generality of contingency learning in a performance paradigm. Experimental Psychology, 59, 175–182.

  45. Schmidt, J. R., & De Houwer, J. (2012c). Does temporal contiguity moderate contingency learning in a speeded performance task? Quarterly Journal of Experimental Psychology, 65, 408–425.

  46. Schmidt, J. R., & De Houwer, J. (2012d). Learning, awareness, and instruction: Subjective contingency awareness does matter in the colour-word contingency learning paradigm. Consciousness and Cognition, 21, 1754–1768.

  47. Schmidt, J. R., & De Houwer, J. (2016a). Contingency learning tracks with stimulus-response proportion: No evidence of misprediction costs. Experimental Psychology, 63, 79–88.

  48. Schmidt, J. R., & De Houwer, J. (2016b). Time course of colour-word contingency learning: Practice curves, pre-exposure benefits, unlearning, and relearning. Learning and Motivation, 56, 15–30.

  49. Schmidt, J. R., De Houwer, J., & Besner, D. (2010). Contingency learning and unlearning in the blink of an eye: A resource dependent process. Consciousness and Cognition, 19, 235–250.

  50. Schmidt, J. R., De Houwer, J., & Liefooghe, B. (2017). Modelling the effects of instructions and goals: Perpetuation of instructed task rules in episodic memory. Manuscript submitted for publication.

  51. Schmidt, J. R., De Houwer, J., & Rothermund, K. (2016). The Parallel Episodic Processing (PEP) Model 2.0: A single computational model of stimulus-response binding, contingency learning, power curves, and mixing costs. Cognitive Psychology, 91, 82–108.

  52. Schmidt, J. R., & Weissman, D. H. (2016). Congruency sequence effects and previous response times: Conflict adaptation or temporal learning? Psychological Research, 80, 590–607.

  53. Simon, J. R., Craft, J. L., & Webster, J. B. (1973). Reactions toward stimulus source: Analysis of correct responses and errors over a five-day period. Journal of Experimental Psychology, 101, 175–178.

  54. Simon, J. R., & Rudell, A. P. (1967). Auditory S-R compatibility: Effect of an irrelevant cue on information processing. Journal of Applied Psychology, 51, 300–304.

  55. Stroop, J. R. (1935). Studies on interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662.

  56. Sugg, M. J., & McDonald, J. E. (1994). Time-course of inhibition in color-response and word-response versions of the Stroop task. Journal of Experimental Psychology: Human Perception and Performance, 20, 647–675.

  57. Virzi, R. A., & Egeth, H. E. (1985). Toward a translational model of Stroop interference. Memory & Cognition, 13, 304–319.

  58. Warren, R. E. (1972). Stimulus encoding and memory. Journal of Experimental Psychology, 94, 90–100.

  59. Zhang, H., & Kornblum, S. (1998). The effects of stimulus-response mapping and irrelevant stimulus-response and stimulus-stimulus overlap in four-choice Stroop tasks with single-carrier stimuli. Journal of Experimental Psychology: Human Perception and Performance, 24, 3–19.

  60. Zhang, H., Zhang, J., & Kornblum, S. (1999). A parallel distributed processing model of stimulus-stimulus and stimulus-response compatibility. Cognitive Psychology, 38, 386–432.

Author information



Corresponding author

Correspondence to James R. Schmidt.

Additional information

This research was supported by Grant BOF16/MET_V/002 of Ghent University to Jan De Houwer and by the Interuniversity Attraction Poles Program initiated by the Belgian Science Policy Office (IUAPVII/33).

Cite this article

Schmidt, J.R. Best not to bet on the horserace: A comment on Forrin and MacLeod (2017) and a relevant stimulus-response compatibility view of colour-word contingency learning asymmetries. Mem Cogn 46, 326–335 (2018).


  • Contingency learning
  • Neural networks
  • Episodic memory
  • Speed of processing
  • Stimulus–response compatibility