Introduction

Mental imagery is the term used to describe the representation of, or process of representing, an experience or activity in the absence of the represented sensory stimuli or behavior. The processing mechanisms and experience of mental imagery of perceptual events can strongly resemble the experience of perceiving the actual external event. Indeed, many studies suggest mental imagery of perception is simply a weak form of perception (for a recent review, see Pearson et al. 2015). Consistent with this idea, imagining visual information (visualization) can lead to classical conditioning (Lewis et al. 2013) and perceptual learning (Tartaglia et al. 2009). This general framework dovetails with promising work suggesting that mental imagery can be a useful tool in clinical settings (e.g., Foa et al., 1980) and in musical and athletic training (e.g., Zatorre et al., 2007; Guillot and Collet, 2008).

An interesting recent study by Reinhart et al. (2015) demonstrated that attentional mechanisms can be effectively trained to select target objects through visualization. Moreover, they found that visualizing search for a particular target in an imagined array can speed reaction time to find the target on a subsequent trial even more than actually searching for the target in an array presented on the screen. In their experiments, participants were first shown a green C oriented in a particular direction, and then shown an array of 12 C’s organized in a circle, only one of which was green. Participants had to press a key to indicate whether the one green C in the array was oriented in the same or a different direction as the C they had just been shown. Reaction times became faster when the C of a particular orientation repeated over a run of up to seven trials. On some runs of trials, participants were asked to “visualize search” and were shown the oriented C but not the array for the first two trials of the run. On trial 3 of these runs, reaction time to find the target was significantly faster, not only compared to the first trial of the run but also compared to trial 3 of runs during which the search array had actually been shown on trials 1 and 2. The authors also measured EEG during the experiment and found that the N2PC (an early component of stimulus-evoked EEG associated with attentional focus) followed a pattern similar to that of reaction time, with an enhanced N2PC to the third target in a run following two visualization trials; again, this was relative not only to the first trial in the run but also to the third target following trials in which the search array was physically present rather than just visualized. The results suggest, as the title of the article states, that “visualization trumps vision in training attention.”

On the one hand, this result is consistent with previous work reviewed above demonstrating that visualization has meaningful effects on subsequent perception and behavior. On the other, this result is unique in demonstrating that the effects of visualization can surpass those of actual perception; on its face, this is inconsistent with the idea that mental imagery is “weak” perception. One possible reason for the superiority of visualization over perception, however, is that the perceptual learning over a run of trials when the target and distractors are displayed in fact has components of both interference and facilitation on the training of attention, and visualizing removes some of the interference but maintains the facilitation. For example, the participant may not imagine the distractors very vividly, and this could strengthen the representation of the target orientation, speeding reaction time to match it to the green C in the array on trial 3. Consistent with this interpretation, when the black distractor C’s were removed from the display (Experiment 4), the difference between visualization and practice with actual stimuli was eliminated. This could reflect a floor effect, however, as reaction times in general in this experiment were much faster than in the experiments with distractors. It is also the case that the target can be rapidly selected on the basis of its color in this experiment, so processing of distractors would be expected to be minimal even when they are presented. If visualization was facilitated relative to search because the interference from distractors was minimized, even more facilitation for visualization should be observed when search is difficult and the distractors during “real” search need to be processed in more detail.

Definitively establishing why visualization is more effective than actually performing the task is of critical importance not only for understanding the nature of visual imagery but also because it could lead to improved imagery-based training or therapy regimes like those noted above. If it is the case that visualization is more effective because it isolates the facilitative components of the task, for example, vague or incomplete representations may in fact be superior to more vivid or detailed ones. We therefore plan to first replicate the original study by Reinhart et al. (2015), focusing specifically on the behavioral results. Omitting the recording of EEG allows us to use a slightly simplified stimulus design to test the original results, without the second irrelevant colored distractor (only included in the original study to allow for comparison of the lateralized N2PC component between the two hemifields). This will provide independent confirmation of this interesting result. We will also analyze the data in more depth using linear mixed-effects models and explore how the visualization effect changes over time and is affected by serial dependencies (Fischer and Whitney 2014). Provided we replicate the original findings, we also plan to run a second experiment using a more traditional visual search paradigm with an array of C’s of uniform color, in which the observer’s task is to determine whether the target (a C of the cued orientation) is present or absent. If the superiority of visualization over practice at search is due to proactive interference from the distractors when the search array is presented, there should be even larger benefits of visualization when serial search of all the items is required.

Experiment

The aim of this experiment is to directly replicate the behavioral effect of visualization in visual search (Reinhart et al. 2015). We have made two minor changes to the original paradigm: (i) we have removed the red distractor element used in the original study (included to measure the N2PC component of the EEG; we will not measure EEG), and (ii) we will use runs of length three, four, and five, rather than three, five, and seven. The second change was made because (1) the practice effect asymptotes by trial 5 of the runs, and (2) runs of longer length are unimportant for testing the critical result: that reaction times for the first three trials in a run will be faster in the visualize condition than those in the practice condition.

Methods

Participants

Thirty participants will be recruited from amongst the student population at the University of Aberdeen. All participants will have normal or corrected-to-normal vision.

The sample size of n = 30 is based on a power analysis for a one-tailed paired t test, with a power of 0.9 and effect size of d = 0.55. This should be sufficient for the replication given the original effect sizes of 0.61 < d < 0.72 (n = 18).
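The sample-size figure can be roughly checked by hand. The sketch below is not the calculation used for the study: it assumes a significance level of 0.05 (not stated above) and uses the normal approximation with Guenther's small-sample correction rather than an exact noncentral-t computation. The function name `paired_t_sample_size` is introduced here for illustration only.

```python
from math import ceil
from statistics import NormalDist


def paired_t_sample_size(d, power, alpha=0.05):
    """Approximate n for a one-tailed paired t test.

    Returns (n from the plain normal approximation,
             n with Guenther's correction of z_alpha**2 / 2).
    alpha = 0.05 is an assumption; the text does not state it.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha)        # one-tailed critical value
    z_power = z.inv_cdf(power)            # quantile for the desired power
    n_normal = ((z_alpha + z_power) / d) ** 2
    n_corrected = n_normal + z_alpha ** 2 / 2
    return ceil(n_normal), ceil(n_corrected)


n_normal, n_corrected = paired_t_sample_size(d=0.55, power=0.9)
print(n_normal, n_corrected)  # prints: 29 30
```

Under these assumptions the corrected approximation agrees with the planned sample of 30; the uncorrected normal approximation slightly underestimates the requirement, as expected for small samples.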

Stimuli

The search stimuli consisted of twelve Landolt C’s arranged in a circle (radius 8) around a central fixation cross. The Landolt C’s had a radius of 0.62 (with thickness = 0.25, gap width = 0.19) and had one of eight possible orientations (\(\phi = n\frac{\pi}{4}\), \(n = 0, \ldots, 7\)). One of the C’s was colored green, which indicated that it was the target. An example stimulus is shown in Fig. 1.

Fig. 1

Example stimulus from Experiment 1. Before the stimulus is shown, the observer is shown a green Landolt C. Their task is to decide if the green C in the stimulus matches the orientation of the cued C
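The array geometry described above can be sketched as follows. This is an illustration only: it assumes the twelve items are equally spaced with the first item at angle 0, which the text does not specify, and it keeps the radius in whatever units the text intends. The names `item_positions` and `orientations` are hypothetical.

```python
import math

N_ITEMS = 12          # Landolt C's in the array (from the text)
ARRAY_RADIUS = 8.0    # radius of the array, in the units used in the text
N_ORIENTATIONS = 8    # phi = n * pi / 4 for n = 0..7


def item_positions(n_items=N_ITEMS, radius=ARRAY_RADIUS):
    """(x, y) centers of items on a circle around fixation at (0, 0).

    Equal spacing and a starting angle of 0 are assumptions.
    """
    step = 2 * math.pi / n_items
    return [(radius * math.cos(i * step), radius * math.sin(i * step))
            for i in range(n_items)]


def orientations(n=N_ORIENTATIONS):
    """Possible gap orientations in radians."""
    return [k * math.pi / 4 for k in range(n)]


positions = item_positions()
angles = orientations()
```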

Procedure

The first block of trials was preceded by a set of ten practice trials. The practice trials included the visual search condition only. There were then four blocks of 30 trials (15 for each condition). Each trial began with the presentation of a fixation cross (1200–1600 ms) followed by the presentation of a green Landolt C (the cue stimulus) for 100 ms, followed by an interval of 1000 ms. The search array was then presented in the visual search condition for a maximum of 2000 ms. In the visualize search condition, the cue stimulus was followed by instructions to ‘visualize search’ before the presentation of a further fixation cross. The orientation of the green C, the location of the green C within the search array, and whether the green C matched the cue were randomized for each trial.
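The trial timeline can be sketched as a small generator. This is a hypothetical illustration, not the experiment code: the uniform fixation jitter, the 50/50 match probability, and the uniform choice among the seven non-matching orientations are assumptions, as are the function and field names.

```python
import random


def make_trial(condition, cue_orientation, rng):
    """Timing (ms) and randomized parameters for one trial."""
    trial = {
        "condition": condition,                   # "search" or "visualize"
        "fixation_ms": rng.randint(1200, 1600),   # jittered fixation cross
        "cue_ms": 100,                            # green cue C
        "interval_ms": 1000,                      # interval after the cue
        "cue_orientation": cue_orientation,       # n in phi = n * pi / 4
    }
    if condition == "search":
        is_match = rng.random() < 0.5             # assumed 50/50 split
        if is_match:
            target = cue_orientation
        else:
            others = [n for n in range(8) if n != cue_orientation]
            target = rng.choice(others)
        trial.update(
            array_max_ms=2000,                    # array shown for at most 2000 ms
            target_position=rng.randrange(12),    # slot of the green C
            target_orientation=target,
            is_match=is_match,
        )
    return trial


rng = random.Random(1)
# A visualize run: two visualized trials followed by a real search trial.
run = [make_trial("visualize", 3, rng),
       make_trial("visualize", 3, rng),
       make_trial("search", 3, rng)]
```

The cue orientation is passed in rather than drawn per trial so that it can be held constant within a run.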

Planned analyses

We plan to carry out the analysis in several different ways. Firstly, we will repeat the analysis from Reinhart et al. (2015) in which the median reaction time (Footnote 1) for trials 1, 2, and 3 within a (normal) run is compared to the median reaction time for trial 3 in a visualize-run. A one-tailed paired t test will be used for this comparison. We will include a Bayesian t test (Footnote 2), which will enable comparison of the expected difference between visualization and trials 1–3 in the run to the null hypothesis (no difference). We will also directly compare/combine our results with those of Reinhart et al. (2015), who have already provided us with the summary data from their Experiment 1.
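The first planned comparison can be sketched as below: take each participant's median reaction time per condition, then compute a paired t statistic on those medians. The numbers are made up for illustration and are not data from the study, and `paired_t` is a hypothetical helper. The one-tailed p value would then be read from a t distribution with the returned degrees of freedom, which is not computed here because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, median, stdev


def paired_t(x, y):
    """t statistic, degrees of freedom and Cohen's d_z for paired samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    d_z = mean(diffs) / stdev(diffs)      # standardized mean difference
    return d_z * sqrt(n), n - 1, d_z


# Hypothetical raw RTs (ms), one inner list per participant. NOT study data.
search_trial1 = [[650, 700, 720], [600, 640, 610],
                 [710, 690, 750], [580, 620, 600]]
visualize_trial3 = [[640, 660, 700], [605, 600, 630],
                    [690, 700, 705], [590, 570, 610]]

x = [median(p) for p in search_trial1]      # per-participant medians
y = [median(p) for p in visualize_trial3]
t, df, d_z = paired_t(x, y)
```

With this sign convention a positive t means trial 1 of a normal run was slower than trial 3 of a visualize run, which is the direction predicted by the original study.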

Additionally, we will analyze our data in more detail using a linear mixed-effects model (lme4; Bates et al., 2015; R Core Team 2015), following the guidelines on model design given by Barr et al. (2013). This allows us to include trial-to-trial variation in the analysis rather than using aggregate statistics. As the distribution of reaction times is expected to be skewed, we will use log reaction times in the analysis. We will treat trial number (within run) as a numerical predictor, and will investigate non-linear regression if required (e.g., if there is a clear asymptote). However, given the results presented by Reinhart et al. (2015), a linear model should suffice. We will follow the model simplification procedure put forward by Crawley (2012, chapter 9). p values will be obtained via the Anova function from the car package (Fox 2011).

Finally, we will also compare the visualization effect to the effect of serial dependency. More specifically, we will investigate how reaction time varies on trial 3 depending on what preceded it (present-present, absent-absent, present-absent, absent-present, visualize-visualize). This will be modeled using another linear mixed-effects model with a five-level factor describing the previous two trials, and a two-level factor for whether trial 3 is a target-absent or target-present trial (Footnote 3).

Results and discussion

All participants were accurate at the task (88–99 %) and incorrect trials were removed from further analysis. The reaction times from our study are presented together with those of Reinhart et al. (2015) in Fig. 2 and the t test-based analysis is given in Table 1. The first two lines of this table compare performance across successive trials of search for the same target and demonstrate that we have a repetition benefit of around 30 ms on each trial where the target repeats from the last trial. The remaining lines of the table repeat the t tests used by Reinhart et al. (2015) to compare the effect of visualizing search to actually performing search. Participants were slower on trial 3 after visualization than on any of the three trials within a run of actual search. Because we planned to do a one-tailed test, and the direction of these effects is the opposite of what was predicted, they are deemed non-significant.

Fig. 2

Results over all participants. The mean reaction time was first computed for each participant, and then the mean of the means and associated standard error were plotted. Note: in order to facilitate comparison with Reinhart et al. (2015), the error bars indicate ± standard error, rather than 95 % confidence intervals

Table 1 t test (p value) results

The results from each of the 30 participants are shown separately in Fig. 3. As the reaction times have a skewed distribution, we calculated the mean and 95 % confidence intervals after log-transforming the data. We further analyzed these data with a linear mixed-effects model. Reaction times were log-transformed, trial number was treated as a continuous variable, and we allowed for a maximal (crossed) random effects structure. All effects were statistically significant (p < 0.05) and are shown in Table 2. The negative effect of trial number confirms that we have a practice effect: as with Reinhart et al. (2015), reaction times decrease over the course of a run of trials. However, as visualization has a positive effect, we conclude that participants are slower in the visualization condition. The interaction is also statistically significant, with reaction times decreasing faster for trials after the visualization condition. Upon inspection of Fig. 3, this interaction appears to be primarily driven by a few participants with relatively slow reaction times on Trial 3 in the visualization condition, followed by an improvement for Trials 4 and 5. This pattern is consistent with task-switching costs (e.g., Rogers and Monsell, 1996).

Fig. 3

Results from each participant. Means and 95 % confidence intervals are computed after a log transform. The results have been transformed (using the exponential function) back into the original units for ease of reading the plots and comparing to the original data
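The log-transform and back-transform steps described for Fig. 3 can be sketched as follows. This is a simplified illustration: it uses a normal quantile for the interval, whereas a t quantile would be more appropriate for small numbers of trials, and the function name `log_mean_ci` and the example values are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean, stdev


def log_mean_ci(rts_ms, level=0.95):
    """Mean and confidence interval computed on log RTs, then
    transformed back to milliseconds with the exponential function.

    Uses a normal quantile as an approximation (an assumption of
    this sketch, not a detail taken from the text).
    """
    logs = [log(rt) for rt in rts_ms]
    m = mean(logs)
    se = stdev(logs) / sqrt(len(logs))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return exp(m), exp(m - z * se), exp(m + z * se)


# Hypothetical reaction times (ms), not data from the study.
center, lower, upper = log_mean_ci([500, 600, 720])
```

The back-transformed center is the geometric mean of the reaction times, and the interval is asymmetric around it in the original units, which suits right-skewed data.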

Table 2 Linear mixed-effects model fit

Finally, we use Bayesian methods (Dienes 2008) to compute the posterior distribution of the effect, taking the previously published findings from Reinhart et al. (2015) as the prior. As can be seen in Fig. 4, when we update our beliefs based on the new data presented here, we end up with a posterior distribution centered around zero and conclude that there is no evidence of an effect in either direction.

Fig. 4

The purple graph shows the posterior distribution of the difference in mean reaction times between Trial 1 in the stimulus condition and Trial 3 in the visualize condition. The red graph shows the prior, taken from Reinhart et al. (2015), while the green line shows the likelihood of our data
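One simple way to see how a prior and new data combine into a posterior like the one in Fig. 4 is the conjugate normal update, sketched below. It assumes both the prior and the likelihood are normal, which may differ from the exact computation used here. The numbers are invented to show the mechanics and are not the values from either study.

```python
from math import sqrt


def normal_update(prior_mean, prior_sd, data_mean, data_se):
    """Posterior mean and SD for a normal prior and a normal likelihood.

    Each source is weighted by its precision (1 / variance).
    """
    prior_prec = 1 / prior_sd ** 2
    data_prec = 1 / data_se ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
    return post_mean, sqrt(1 / post_prec)


# Hypothetical values (ms): a prior favoring a benefit and new data of
# equal precision pointing the opposite way. NOT the published estimates.
post_mean, post_sd = normal_update(prior_mean=30, prior_sd=10,
                                   data_mean=-30, data_se=10)
```

When the prior and the new data are about equally precise and opposite in sign, the posterior mean lands near zero, which is the qualitative pattern described for Fig. 4.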

General discussion

We find no evidence that “visualizing trumps vision in training attention.” Indeed, mean reaction times to find a target are actually slower when performing search following visualization compared to an equivalent amount of practice actually searching for that target. We offer two explanations for the discrepancy between our findings and the original results.

First, the most salient difference between our paradigm and the one presented by Reinhart et al. (2015) is the additional task-irrelevant color singleton, included in the original design to rule out physical-stimulus confounds in the lateralized ERP component of interest (the N2PC). This additional colored item in the array seemed like an unimportant detail when we planned our (purely behavioral) replication, so we did not include it. However, this small modification may have made our search task slightly easier than theirs. Reaction times in our version of the experiment were indeed slightly faster than in the original, although it is important to note that it is not the case that participants who responded relatively slowly in our experiment were more likely to show an effect in the original direction (quite the opposite in fact, see Fig. 3). Obviating the need to search, as they did in their Experiment 4, eliminated the benefit of visualization. It may be the case that removing one salient distractor, as we did in our study, has a similar impact. If so, this is an important detail to note for any researcher interested in trying to study the effect of visualization on search: the search may need to be of a minimum level of difficulty for the effect of visualization to be observed.

A second important difference is that we did not measure EEG. While the measurement of EEG itself does not influence behavior, perhaps participants wearing an EEG cap believed that their compliance with instructions to “visualize” could actually be verified by the experimenter, and hence were more likely to follow that instruction. Effects along these lines have been found in the eye-tracking literature (Nasiopoulos et al. 2015) in which participants modify their viewing behavior when they know their eyes are being tracked. Instructions to visualize did have an impact on reaction time in our study (albeit opposite to what was found previously) so there was some evidence that our participants were attempting to comply with instructions. However, they may not have been as consistent or effortful in their imagery compared to a group of participants wearing a device that they have been told measures their brain activity. If this is the case, it would be important for researchers to be aware that a putative ability to monitor compliance with visualization instructions is a key requirement for finding the effect. For the record, the participants in our study were instructed as follows: “When prompted to visualize search you are required to generate a mental image of the search array in your mind and imagine searching through this array for the green cue C you have been shown in the trial.” A precise description of how researchers instruct and/or verify visual imagery in their participants should be made available in future studies on this topic so that the importance of this factor can be determined.

Although we failed to replicate the original effect, we hope that this report can provide important guidance to future researchers who may wish to build on the findings of Reinhart et al. (2015). To be useful in clinical or training settings, as suggested by the authors, the imagery effect must be robust and the boundary conditions need to be clearly and consistently documented. Moreover, for the potential theoretical implications of their results to be fully developed, we require a thorough understanding of the conditions under which the effect does, and does not, occur.