After more than a century of being on the fringes of perceptual science (Galton, 1880), synesthesia has seen renewed interest in the past few years with the application of modern psychophysical methods (Cytowic, 2002; Ramachandran & Hubbard, 2001a, 2001b). Synesthesia is both involuntary and stable (Brang & Ramachandran, 2011; Cytowic & Eagleman, 2009); as a perception in one modality that occurs as a result of stimulation in another, it represents a failure of accurate perception of the properties of the world. In this way, synesthesia is a tool for uncovering perceptual mechanisms, which are often investigated by exploring the limits of perceptual capability. Investigating when and how perception breaks down (e.g., measuring thresholds) often informs researchers about the mechanisms of perception. The natural breakdown of correspondence between physical stimulation and perception can thus be informative about perceptual mechanisms.

The best-documented type of synesthesia is grapheme–color synesthesia, in which letters and numbers evoke an idiosyncratic experience of color for each grapheme. We can begin to locate the level of grapheme–color synesthesia in the brain by examining the point of perceptual breakdown using direct psychophysical methods. In this study, we employed a metacontrast-masking paradigm to compare the performance of grapheme–color synesthetes and nonsynesthete control participants using both dichromatic and monochromatic stimuli. If both synesthetes and controls experienced stronger masking under monochromatic than under dichromatic conditions, this would indicate that metacontrast masking is processed before synesthesia; the synesthetic advantage of experiencing color would not affect metacontrast perception. This would suggest a later processing site for synesthesia within the visual stream. If synesthete performance were dissimilar from nonsynesthete performance with monochromatic stimulus presentation, such that metacontrast masking was weakened or eliminated, this would indicate that metacontrast masking is processed either after synesthesia or at the same level. The idiosyncratic experience of color in this case would interfere with metacontrast masking, which would suggest an early, or perhaps a multilevel, processing site for synesthesia.

Metacontrast masking

Metacontrast, a type of backward visual masking, occurs when a briefly presented stimulus, the target, becomes less visible or is visually eliminated if it is immediately followed by another briefly presented stimulus, the mask. Metacontrast is itself an interesting phenomenon, because the mask obscures visual perception of the target backward in time. Metacontrast follows a U-shaped masking function (Alpern, 1953), alternately referred to as a Type B masking function (Kolers, 1962; Kahneman, 1968). The point for optimal visual suppression is within a 50- to 60-ms stimulus onset asynchrony (SOA; Alpern, 1953; Stigler, 1910) or a 50- to 60-ms stimulus termination asynchrony (STA). By manipulating the relative durations of the target and mask, Macknik and Livingstone (1998) found that masking functions follow the STA more closely than the SOA; previously the two measures had been confounded, because the target and mask generally had equal durations.
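The relation between the two timing measures can be sketched in a few lines. This is a minimal illustration, assuming that STA is measured as mask offset minus target offset and SOA as mask onset minus target onset; the function name and the durations used are hypothetical and are not taken from any of the cited studies.

```python
def sta_from_soa(soa_ms: float, target_ms: float, mask_ms: float) -> float:
    """Stimulus termination asynchrony (mask offset minus target offset).

    With target onset at time 0, the target ends at target_ms and the
    mask, whose onset is at soa_ms, ends at soa_ms + mask_ms.
    """
    target_offset = target_ms
    mask_offset = soa_ms + mask_ms
    return mask_offset - target_offset


# Equal durations: STA equals SOA, so the two measures are confounded.
equal = sta_from_soa(soa_ms=50.0, target_ms=20.0, mask_ms=20.0)

# Unequal durations: the measures differ by (mask_ms - target_ms).
unequal = sta_from_soa(soa_ms=50.0, target_ms=20.0, mask_ms=40.0)
```

Under these assumptions the two measures can only be separated when the target and mask durations differ, which is the manipulation attributed above to Macknik and Livingstone (1998).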

Metacontrast lends itself nicely to this project because of what is already known about it. It can be obtained dichoptically (Kolers & Rosner, 1960; Schiller & Smith, 1968; Werner, 1940), which eliminates lateral geniculate nucleus or retinal explanations (Bridgeman, 1971). This indicates that the earliest possible site of metacontrast must be V1, the earliest site of binocular convergence. Evidence of metacontrast has been found in single cells of V1 in both the cat (Bridgeman, 1975) and the monkey (Bridgeman, 1980). However, metacontrast can also be obtained using illusory or subjective contours (Gilden, MacDonald, & Lasaga, 1988), and area V2 is the first processing site for subjective contours (Petry, 1987). If metacontrast were unaffected by synesthesia, we could therefore place the seat of grapheme–color synesthesia beyond the level of V2.

Method

Measure of equiluminance

Metacontrast is generally considered to be stronger if the target and mask are of the same color, but the relationship is complex (Breitmeyer, 1984). By displaying target–mask pairs that are equiluminant with the background for each participant, we were able to record two distinct masking functions for dichromatic and monochromatic stimulus presentations for nonsynesthete control participants.

Finding an observer’s equiluminant point is often a difficult and time-consuming process, as well as being prone to error because of chromatic adaptation during a series of trials. To locate each observer’s subjective equiluminant point quickly and efficiently, we used a graphic interface (Footnote 1) in which two flickering fields alternated two opposed color gradients. The top of the first field was bright red and faded gradually to black at the bottom. This field alternated with a second field that was bright green at the bottom and faded gradually to black at the top. These two fields were alternated at a flicker rate above the chromatic flicker fusion rate but below the luminance flicker fusion rate.

At some intermediate height in the field, the decreasing red and the increasing green will have the same luminance for the observer. That individual’s equiluminant point then pops out by flicker fusion as the location at which the observer sees the least flicker, or even a stationary line; in other words, the location where the luminances of the red and green gradients match. However, this is a false perception, and thus a garden path, because the entire array is flickering at the same rate. Each participant was asked to adjust the relative brightness levels of the two fields until the equiluminant point was exactly in the middle of the pattern. For the present study, this process was repeated for each participant in order to locate the RGB triplet values for equiluminant red and blue and for equiluminant red and yellow, as needed, for example, for a red background, blue target, and yellow mask.
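The geometry of the opposed gradients can be illustrated with a small sketch. It assumes, purely for illustration, that each gradient is linear in perceived luminance (real displays are generally not linear in RGB value) and that the peak luminances are given in arbitrary observer-specific units; the function name is hypothetical.

```python
def crossing_height(red_peak: float, green_peak: float) -> float:
    """Height (0 = bottom, 1 = top) at which the two gradients match.

    Red rises linearly from 0 at the bottom to red_peak at the top;
    green falls linearly from green_peak at the bottom to 0 at the top.
    Solving red_peak * h == green_peak * (1 - h) for h gives the result.
    """
    if red_peak <= 0 or green_peak <= 0:
        raise ValueError("peak luminances must be positive")
    return green_peak / (red_peak + green_peak)


# Matched peaks: the minimum-flicker line sits in the middle of the field.
centered = crossing_height(1.0, 1.0)

# Red perceived as three times brighter: the line sits lower in the field.
shifted = crossing_height(3.0, 1.0)
```

In this simplified picture, adjusting the relative brightness until the line sits at mid-height is equivalent to matching the perceived peak luminances of the two colors.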

Procedure

Four female grapheme–color synesthetes, as well as seven female and two male nonsynesthete controls, were recruited from the undergraduate student body at the University of California, Santa Cruz. All of the synesthetes volunteered their time, while the nonsynesthete control observers volunteered for credit for a class requirement.

For each of our synesthetes, we validated with a test–retest method their synesthetic associations of letters, numbers, and ordered time units (days of the week and months of the year). All four synesthetes reported the same synesthetic associations two days apart with 100 % consistency between the two time periods. If the synesthetic observers did not associate a particular grapheme with a specific color, they were asked to leave it blank.
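A test–retest check of this kind can be sketched as follows. The scoring rule is an assumption: items left blank in either session are skipped, and the remaining items count as consistent only when the reported colors match exactly. The function name and the example associations are hypothetical and are not participant data.

```python
from typing import Dict, Optional


def retest_consistency(first: Dict[str, Optional[str]],
                       second: Dict[str, Optional[str]]) -> float:
    """Proportion of items given the same color in both sessions.

    Items left blank (None) in either session are not scored.
    """
    scored = [item for item in first
              if first[item] is not None and second.get(item) is not None]
    if not scored:
        raise ValueError("no items were answered in both sessions")
    matches = sum(first[item] == second[item] for item in scored)
    return matches / len(scored)


# Illustrative associations only; "7" was left blank in both sessions.
session_1 = {"A": "red", "B": "blue", "7": None, "Monday": "yellow"}
session_2 = {"A": "red", "B": "blue", "7": None, "Monday": "yellow"}
consistency = retest_consistency(session_1, session_2)
```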

All synesthetes and nonsynesthete controls were run individually in the same darkened experimental room using the same computer and screen for both determination of equiluminance and metacontrast masking. Each participant sat with eyes 60 cm from the center of the screen. We located each participant’s equiluminant point using Bridgeman’s garden path procedure. The RGB triplet values obtained were then used for the remainder of the study for the corresponding observer. Before the experimental trials, each synesthete’s color experience was activated by displaying a stationary target and mask pair together until the synesthetic color was experienced, which occurred almost immediately. An interval of at least 1 s was interposed between the activation and masking stimuli, to prevent the activation from distorting the masking. The graphemes that each synesthete reported as evoking the strongest color experience among the colors that we used were assigned as the target–mask pairs. Each participant completed 30–50 practice trials in both chroma conditions prior to experimental participation.

The target stimuli consisted of a pair of isolated horizontal bars composed of repeated letters or numbers 0.28º high, one above and one below the fixation point (Fig. 1). Each target was bordered by a mask consisting of a pair of nonoverlapping bars, each 0.28º high, composed of a different repeated letter or number. The target–mask separation was 0.09º. One of the targets consisted of eight repeated symbols, while the other consisted of seven. All of the masks were eight letters wide.
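The angular sizes above can be converted to physical sizes on the screen with the standard visual-angle formula. This is a minimal sketch, assuming a flat screen viewed head-on at the 60-cm distance given in the Procedure and an extent centered on the line of sight; the function name is hypothetical.

```python
import math


def visual_angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical extent that subtends angle_deg at a viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 60.0  # viewing distance stated in the Procedure

bar_height_cm = visual_angle_to_cm(0.28, VIEWING_DISTANCE_CM)
gap_cm = visual_angle_to_cm(0.09, VIEWING_DISTANCE_CM)
```

Under these assumptions, the 0.28º bars are roughly 2.9 mm high and the 0.09º target–mask separation is roughly 0.9 mm.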

Fig. 1

Stimulus array, scaled as in the experiment. In this example, the shorter target is in the lower target–mask pair. In the experiment, upper and lower short targets were assigned randomly for each trial. For backward masking, the upper panel would be presented before the lower panel. Following the targets and masks, a decision window remained on the screen until response

The masking paradigm was based on a two-alternative forced choice (2AFC) task in which participants reported by keypress whether the upper or the lower target–mask pair contained the shorter target bar. The two target–mask combinations were displayed simultaneously on a CRT screen refreshed at 60 Hz. The target duration was one frame, and mask duration was two frames. For the “dichromatic” condition, the target and mask were presented in different, equiluminant colors. For the control, “monochromatic” condition, the same letters or numbers as in the dichromatic condition were used in both target and mask, but both were presented in the same color—for instance, both blue or both green. Thus, any distortion of masking due to the use of different letters in the target and mask would be equilibrated across conditions.
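The nominal stimulus durations follow directly from the refresh rate. The sketch below is simple arithmetic on the stated 60-Hz rate; the names are illustrative, and the values are nominal frame times, since the light emitted by a CRT phosphor within each frame is briefer than the frame itself.

```python
REFRESH_HZ = 60.0
FRAME_MS = 1000.0 / REFRESH_HZ  # duration of one refresh cycle in ms


def frames_to_ms(n_frames: int) -> float:
    """Nominal duration in milliseconds of a whole number of frames."""
    return n_frames * FRAME_MS


target_ms = frames_to_ms(1)  # target shown for one frame
mask_ms = frames_to_ms(2)    # mask shown for two frames
```

The target therefore lasted nominally about 16.7 ms and the mask about 33.3 ms.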

Seven different timing conditions were based on the STA, which has been found to predict masking performance more reliably than does the SOA (Macknik & Livingstone, 1998). The seven timing conditions were −33 (forward paracontrast masking), 0, 66, 99, 132, 165, and 199 ms. These timing conditions were presented in a randomized order for each participant, and each participant completed the same number of trials for each of the seven STAs. A block consisted of 154 trials (22 trials at each STA) of either dichromatic presentation or monochromatic presentation. Each participant completed one monochromatic and one dichromatic block, and the condition presented in the first block alternated between monochromatic and dichromatic from one participant to the next.
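One way to build such a block is sketched below. The STA values and trial counts are those given above; the randomization scheme (a single shuffle of the full trial list, with the short target assigned to the upper or lower position independently on each trial) is an assumption, and the function and field names are hypothetical.

```python
import random
from typing import Dict, List, Optional

STAS_MS = [-33, 0, 66, 99, 132, 165, 199]  # timing conditions from the text
TRIALS_PER_STA = 22


def build_block(chroma: str,
                seed: Optional[int] = None) -> List[Dict[str, object]]:
    """Return a shuffled list of trials for one chroma condition."""
    rng = random.Random(seed)
    trials = [
        {
            "chroma": chroma,
            "sta_ms": sta,
            # Position of the shorter (seven-symbol) target on this trial.
            "short_target": rng.choice(["upper", "lower"]),
        }
        for sta in STAS_MS
        for _ in range(TRIALS_PER_STA)
    ]
    rng.shuffle(trials)
    return trials


block = build_block("monochromatic", seed=1)
```

Each block built this way contains 7 × 22 = 154 trials, with every STA represented equally often.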

Analysis

Our analysis included one between-subjects variable, synesthesia, and two within-subjects variables, chroma and STA. The runs for each participant were averaged in order to obtain a single masking function for each participant. We performed a 2 (synesthesia) × 2 (chroma) × 7 (STA) mixed design analysis of variance.
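The averaging step that yields one masking function per participant can be sketched as follows. This is only an illustration of computing the proportion of correct 2AFC responses at each STA; the data layout and function name are assumptions, and the example responses are made up rather than taken from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Trial = Tuple[int, bool]  # (sta_ms, response was correct)


def masking_function(trials: Iterable[Trial]) -> Dict[int, float]:
    """Proportion correct at each STA for one participant and condition."""
    counts = defaultdict(lambda: [0, 0])  # sta -> [n_correct, n_total]
    for sta_ms, correct in trials:
        counts[sta_ms][0] += int(correct)
        counts[sta_ms][1] += 1
    return {sta: n_correct / n_total
            for sta, (n_correct, n_total) in sorted(counts.items())}


# Made-up responses for illustration only; not data from the study.
toy_trials = [(0, True), (0, True),
              (66, True), (66, False), (66, False), (66, True)]
toy_function = masking_function(toy_trials)
```

Because the task is two-alternative forced choice, a proportion of .5 corresponds to chance performance, that is, to complete masking.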

In two phases, we tested the null hypothesis that the monochromatic and dichromatic conditions would yield indistinguishable masking functions for the synesthetes. This would mean that synesthetic color reduced masking in the same way as real (physical) color. The first, preliminary phase engaged the control observers, to ensure that our dichromatic stimulus conditions would yield less metacontrast masking than would our monochromatic stimulus conditions among the nonsynesthetic controls. The second phase tested our synesthetes in the monochromatic and dichromatic conditions.

Results

In the control observers, the dichromatic condition resulted in weaker metacontrast than did the monochromatic condition (Fig. 2), establishing a baseline difference between the stimulus conditions against which the synesthete performance could be compared.

Fig. 2

Nonsynesthesia data, averaged over nine participants. STA stands for stimulus termination asynchrony; that is, at STA = 0 ms, the target and mask terminate simultaneously. Error bars indicate between-subjects standard errors

The synesthetic observers also showed weaker masking in the dichromatic condition (Fig. 3), a significant main effect, F(1, 11) = 89.3, p < .001. The difference between the monochromatic and dichromatic conditions was as strong in the synesthetic observers as in the controls, F(1, 11) < 1, n.s., indicating that the difference between dichromatic and monochromatic masking occurred for both synesthetes and controls. Thus, our null hypothesis of no significant difference in masking between the conditions for the synesthetes was rejected. The synesthetes were unable to use their synesthetic colors to differentiate the target from the mask, and therefore showed strong metacontrast in the monochromatic condition, even though they were able to use physical color to defeat masking in the dichromatic condition. There was also a significant main effect of STA condition, F(6, 54) = 34.40, p < .001.

Fig. 3

Synesthesia data, averaged over four participants. STA stands for stimulus termination asynchrony. Error bars indicate between-subjects standard errors

The average performance of the synesthetes collapsed across STAs was no better in the dichromatic condition than was the performance of nonsynesthetes (M = .939, SE = .10, and M = .932, SE = .10, respectively), as tested by a post-hoc t test, t(27) = 0.081, p = .94. The synesthetes reported seeing their synesthetic colors in the masking trials, however. Furthermore, the nonsynesthetes did not perform significantly better than the synesthetes in the monochromatic condition (M = .827, SE = .015, and M = .862, SE = .015, respectively), t(27) = 0.395, p = .67.

Performance in both the dichromatic and monochromatic conditions followed a U-shaped function for the backward side of the metacontrast function. However, performance was categorically better in the dichromatic than in the monochromatic condition for both synesthetes and controls. While the point of optimal masking was the same for both conditions, the degradation in performance was not as dramatic in the dichromatic condition.

Discussion

The results of the present study suggest that metacontrast masking and synesthesia are mutually exclusive and that synesthesia occurs at a later processing stage than does metacontrast in the visual stream. This distinction is reflected qualitatively in the differences between synesthetic and real color perception—for instance, in the lack of a complementary-color afterimage upon the disappearance of a letter seen in synesthetic color (Bridgeman, Winter, & Tseng, 2010). In Bridgeman et al.’s study, synesthetes perceived both a synesthetic and a real color together; when a black high-contrast letter abruptly disappeared, one synesthete, for example, saw the afterimage color as “white, but still red [the synesthetic color for that figure in that person].” The present results also suggest that synesthetic color experience does not influence color processing in metacontrast masking. A U-shaped metacontrast masking function appeared in the monochromatic stimulus condition, with an optimal masking point at 66 ms, whether or not the observer experienced synesthetic colors. That is, the synesthetes were incapable of using their synesthetic colors to identify the masked target.

A related interpretation of our results can also be made: It is possible that, rather than coming after metacontrast, synesthesia is based on a different system that is not involved in metacontrast masking. (We thank Vince DiLollo for suggesting this possibility.)

By using a metacontrast-masking procedure, we were able to build on what we know of metacontrast and its location in the brain to begin to locate the level of grapheme–color synesthesia, which must occur beyond the level of V2 if it does involve the same system as metacontrast. Measuring thresholds and recording natural breaks in perception with a psychophysical experiment affords a more direct measure of experience than do neuroimaging methods such as fMRI (Hubbard, Arman, Ramachandran, & Boynton, 2005; Nunn et al., 2002; Sperling, Prvulovic, Linden, Singer, & Stirn, 2006), though fMRI studies are consistent with ours in identifying color–grapheme synesthesia with activity beyond V2. Specifically, they have identified activity in V4 with synesthetic color (see, however, Hupé, Bordier, & Dojat, 2012). When research locates the exact level of synesthesia in the brain and accurately maps its neural structure, we will better understand how the human brain creates and binds its own perceptions.