Ishihara color vision test
On average, participants identified M = 5.72 (SD = 0.68) of the six Ishihara plates correctly (see Footnote 5). Four participants were excluded from all subsequent analyses because they had difficulties correctly identifying the numbers displayed on at least two of the six color plates (see participant section).
Perceptual fluency test
We analyzed mean RTs for correct responses of the perceptual fluency test (94% of the responses) by means of a 2 (target: present vs. absent) \(\times\) 3 (color: green vs. red vs. gray) repeated-measures ANOVA. There was no color main effect, F(1.74, 73.18) = 1.02, p = 0.357, \({\eta }_{p}^{2}\) = 0.02, CI [0.00, 0.09], nor a target main effect or a color by target interaction, Fs < 1. Hence, once again, the different colors were equally discriminable on the black screen.
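As a reading aid, the partial eta squared values reported in this section can be cross-checked from each F ratio and its degrees of freedom. The sketch below assumes the standard identity \({\eta }_{p}^{2}\) = (F × df_effect) / (F × df_effect + df_error); the function name `partial_eta_squared` is hypothetical and is not taken from the authors' analysis scripts.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    Assumes the standard identity SS_effect / (SS_effect + SS_error),
    rewritten in terms of F, df_effect, and df_error.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Color main effect of the perceptual fluency test: F(1.74, 73.18) = 1.02
eta_p2 = partial_eta_squared(1.02, 1.74, 73.18)
print(round(eta_p2, 2))  # 0.02, matching the reported value
```

Because the inputs are the rounded values printed in the text, small deviations in the third decimal are to be expected.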
Stroop task
Response times
In line with the prior experiments, we excluded RTs for incorrect responses (block 1: 6.3%, block 2: 14.6%) and omitted participants’ fastest and slowest correct response per block. We then z-standardized RTs for each participant per block to account for the different response deadlines of test block 1 (fixed deadline: 5000 ms) and test block 2 (adaptive response deadline: M = 814 ms, SD = 179 ms). The z-transformed RTs were submitted to a 2 (block: 1 vs. 2) \(\times\) 3 (color: green vs. red vs. gray) \(\times\) 2 (validity: true vs. false) repeated-measures ANOVA. Again, we report Greenhouse–Geisser corrected degrees of freedom whenever the sphericity assumption does not hold (as indicated by Mauchly’s test). All descriptive results are displayed in Fig. 4. Mean unstandardized RTs per condition are listed in Table 3.
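The preprocessing steps described above (dropping incorrect responses, removing the fastest and slowest correct response per block, then z-standardizing per participant and block) might be sketched as follows. This is a minimal illustration, not the authors' code: the trial keys `participant`, `block`, `rt`, and `correct` are hypothetical, and the use of the sample standard deviation (n − 1) is an assumption, since the text does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def preprocess_rts(trials):
    """Return correct trials with an added 'z_rt' key.

    trials: iterable of dicts with the (hypothetical) keys
    'participant', 'block', 'rt' (in ms), and 'correct' (bool).
    """
    # Step 1: keep correct responses only, grouped by participant and block.
    cells = defaultdict(list)
    for trial in trials:
        if trial["correct"]:
            cells[(trial["participant"], trial["block"])].append(trial)

    standardized = []
    for cell in cells.values():
        # Step 2: omit the single fastest and single slowest correct response.
        trimmed = sorted(cell, key=lambda t: t["rt"])[1:-1]
        if len(trimmed) < 2:
            continue  # too few trials left to estimate a standard deviation
        rts = [t["rt"] for t in trimmed]
        cell_mean = mean(rts)
        cell_sd = stdev(rts)  # assumption: sample SD (n - 1)
        if cell_sd == 0:
            continue  # identical RTs cannot be standardized
        # Step 3: z-standardize within this participant-by-block cell.
        for t in trimmed:
            standardized.append({**t, "z_rt": (t["rt"] - cell_mean) / cell_sd})
    return standardized


# Toy data for one participant in one block (values are made up).
demo = [
    {"participant": 1, "block": 1, "rt": 300, "correct": False},
    {"participant": 1, "block": 1, "rt": 400, "correct": True},
    {"participant": 1, "block": 1, "rt": 500, "correct": True},
    {"participant": 1, "block": 1, "rt": 600, "correct": True},
    {"participant": 1, "block": 1, "rt": 700, "correct": True},
    {"participant": 1, "block": 1, "rt": 2000, "correct": True},
]
z_scores = [t["z_rt"] for t in preprocess_rts(demo)]
```

Standardizing within each participant-by-block cell puts the RTs from the fixed-deadline block and the adaptive-deadline block on a common scale, which is the stated purpose of the transformation.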
Table 3 Mean (SD) unstandardized RTs and error rates for each condition of Experiment 3
Similar to the previous experiments, participants responded faster to true-related words than to false-related words, F(1, 42) = 40.37, p < 0.001, \({\eta }_{p}^{2}\) = 0.49, CI [0.31, 0.62]. Once again, this validity effect was moderated by test block, F(1, 42) = 6.45, p = 0.015, \({\eta }_{p}^{2}\) = 0.13, CI [0.02, 0.30]. A comparison of simple main effects indicated a larger validity effect in test block 1, F(1, 42) = 60.44, p < 0.001, \({\eta }_{p}^{2}\) = 0.59, CI [0.43, 0.70], than in test block 2, F(1, 42) = 9.59, p = 0.003, \({\eta }_{p}^{2}\) = 0.19, CI [0.04, 0.36]. There was no significant color main effect on RTs, F < 1, but a significant color by validity interaction, F(1.93, 80.88) = 97.73, p < 0.001, \({\eta }_{p}^{2}\) = 0.70, CI [0.61, 0.76]. The latter was further qualified by a three-way interaction of block, color, and validity, F(1.95, 81.76) = 3.61, p = 0.032, \({\eta }_{p}^{2}\) = 0.08, CI [0.00, 0.17]. A separate analysis per test block showed that the color by validity interaction was considerably stronger in test block 2, F(1.85, 77.49) = 70.49, p < 0.001, \({\eta }_{p}^{2}\) = 0.63, CI [0.52, 0.70], than in test block 1, F(1.99, 83.44) = 35.84, p < 0.001, \({\eta }_{p}^{2}\) = 0.46, CI [0.33, 0.56]. However, importantly, the overall pattern was similar for both test blocks (see Fig. 4). The subsequent follow-up tests of the color by validity interaction therefore cover both blocks.
Simple main effect analyses indicated that color significantly affected RTs for true-related words, F(1.89, 79.36) = 61.19, p < 0.001, \({\eta }_{p}^{2}\) = 0.59, CI [0.48, 0.67], as well as false-related words, F(1.94, 81.57) = 40.64, p < 0.001, \({\eta }_{p}^{2}\) = 0.49, CI [0.36, 0.59]. Planned comparisons within each validity condition with gray words as the reference group yielded the following results: For true-related words, RTs were shorter when such words appeared in green, F(1, 42) = 34.30, p < 0.001, \({\eta }_{p}^{2}\) = 0.45, CI [0.26, 0.59], and longer when they appeared in red, F(1, 42) = 35.17, p < 0.001, \({\eta }_{p}^{2}\) = 0.46, CI [0.27, 0.60]. Vice versa, for false-related words, RTs were shorter when such words appeared in red, F(1, 42) = 16.57, p < 0.001, \({\eta }_{p}^{2}\) = 0.28, CI [0.11, 0.45], and longer when they appeared in green, F(1, 42) = 24.71, p < 0.001, \({\eta }_{p}^{2}\) = 0.37, CI [0.18, 0.53].
Accuracy
We compared mean error rates for the different combinations of block, color, and validity by means of a 2 \(\times\) 3 \(\times\) 2 repeated-measures ANOVA. The descriptive results are illustrated in Fig. 4. As in the previous experiments, participants made more errors in test block 2 than in test block 1, F(1, 42) = 130.20, p < 0.001, \({\eta }_{p}^{2}\) = 0.76, CI [0.65, 0.82]. Moreover, the error rate was higher for false-related words than for true-related words, F(1, 42) = 22.00, p < 0.001, \({\eta }_{p}^{2}\) = 0.34, CI [0.16, 0.50]. Additionally, there was a main effect of color, F(1.98, 83.20) = 28.99, p < 0.001, \({\eta }_{p}^{2}\) = 0.41, CI [0.27, 0.52]. The latter effect was qualified by a color by block interaction, F(1.93, 80.91) = 6.04, p = 0.004, \({\eta }_{p}^{2}\) = 0.13, CI [0.03, 0.23], a color by validity interaction, F(1.37, 57.50) = 72.39, p < 0.001, \({\eta }_{p}^{2}\) = 0.63, CI [0.53, 0.71], and a significant three-way interaction of block, color, and validity, F(1.76, 73.85) = 30.74, p < 0.001, \({\eta }_{p}^{2}\) = 0.42, CI [0.29, 0.53]. Similar to the RT data, the color by validity interaction was considerably stronger in test block 2, F(1.51, 63.25) = 68.18, p < 0.001, \({\eta }_{p}^{2}\) = 0.62, CI [0.51, 0.69], than in test block 1, F(1.62, 67.87) = 23.62, p < 0.001, \({\eta }_{p}^{2}\) = 0.36, CI [0.22, 0.47]. Importantly, however, the overall pattern was similar for both test blocks (see Fig. 4; for exact values of mean percent errors per condition, see Table 3). Thus, again, the subsequent follow-up tests of the color by validity interaction cover both blocks.
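The fractional degrees of freedom in the paragraph above reflect the Greenhouse–Geisser correction, which multiplies both the effect and the error degrees of freedom by an estimate ε of the departure from sphericity. The sketch below recovers the ε implied by the color by validity interaction on error rates, F(1.37, 57.50). It assumes a sample of 43 participants, which is inferred here from the error degrees of freedom of 42 in the uncorrected tests rather than stated in this section; the function name `greenhouse_geisser_df` is hypothetical.

```python
def greenhouse_geisser_df(epsilon: float, df_effect: int, n_participants: int):
    """Return (corrected effect df, corrected error df) for a within-subject effect."""
    df_error = df_effect * (n_participants - 1)  # uncorrected error df
    return epsilon * df_effect, epsilon * df_error


N = 43  # assumption: inferred from the error df of 42, not stated in this section
DF_EFFECT = (3 - 1) * (2 - 1)  # color (3 levels) by validity (2 levels)

# Epsilon implied by the reported error df of 57.50.
epsilon = 57.50 / (DF_EFFECT * (N - 1))
df1, df2 = greenhouse_geisser_df(epsilon, DF_EFFECT, N)
print(round(epsilon, 3), round(df1, 2), round(df2, 2))
```

An ε close to 1 indicates little violation of sphericity, which is why the RT analyses (for example, F(1.93, 80.88)) show degrees of freedom near the uncorrected values of 2 and 84, whereas the error-rate analyses are corrected more strongly.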
Simple main effect analyses indicated that color significantly affected response accuracy for true-related words, F(1.49, 62.67) = 54.00, p < 0.001, \({\eta }_{p}^{2}\) = 0.56, CI [0.44, 0.65], as well as for false-related words, F(1.51, 63.37) = 59.24, p < 0.001, \({\eta }_{p}^{2}\) = 0.59, CI [0.47, 0.67]. Planned comparisons within each validity condition revealed that participants’ mean error rates were significantly higher for the true-related words displayed in red than for such words displayed in gray, F(1, 42) = 60.84, p < 0.001, \({\eta }_{p}^{2}\) = 0.59, CI [0.43, 0.70]. In contrast, mean error rates did not significantly differ between true-related words displayed in green and gray, F(1, 42) = 2.70, p = 0.108, \({\eta }_{p}^{2}\) = 0.06, CI [0.00, 0.21]. The opposite pattern emerged for the false-related words. Mean error rates were considerably higher for false-related words presented in green than in gray, F(1, 42) = 67.20, p < 0.001, \({\eta }_{p}^{2}\) = 0.62, CI [0.46, 0.72], and somewhat lower when such words were presented in red compared to gray, F(1, 42) = 5.23, p = 0.027, \({\eta }_{p}^{2}\) = 0.11, CI [0.01, 0.27].
Explicit color–validity associations
When asked about their explicit color–validity associations, all participants indicated that they associated green with the attribute true and red with the attribute false. A complete list of participants’ explicit color associations is provided in Appendix A.