Our 85 participants comprised 28 self-declared LG synesthetes (26 females, two males, mean age = 46.21 years, SD = 14.43) and 57 self-declared nonsynesthetes (40 females, 17 males, mean age = 48.32 years, SD = 16.39). An independent-samples t test showed no significant differences between the groups in age [t(83) = 0.577, p = .566]. Our synesthetes were recruited from our database of synesthete participants who had previously contacted the University of Sussex to offer to take part in our synesthesia research, and via the UK Synesthesia Association, whom they had previously contacted to report their LG synesthesia. The control participants were recruited through advertisements in the media and from Prolific.ac, an online participant recruitment platform that holds a database of individuals who have expressed an interest in taking part in research studies. Both experiments presented here were approved by the local university ethics committee, and the study was conducted in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki.
Word stimuli were 30 words in English (mean length = 6, SD = 1.86, range = 3–10), typically acquired between the ages of 3 and 6 years (mean age-of-acquisition [AoA] rating = 301.30, SD = 52.14, range = 206–381). The words were especially common words in English, with an average CELEX word frequency of 115.23 (SD = 48.82, range = 57.65–248.88; Baayen, Piepenbrock, & Gulikers, 1995) and a mean familiarity rating of 579.63 (SD = 26.95, range = 500–630; Davis, 2005; Gilhooly & Logie, 1980; Toglia & Battig, 1978).
Participants also saw a palette of food names, divided hierarchically into superordinate and subordinate categories. This food palette was based on the DAFNE (Data Food Networking) Food Classification System, used in the UK and throughout Europe (http://ec.europa.eu/health/ph_projects/2002/monitoring/dafne_code_en.pdf). Minor changes were made to reflect the food experiences that are often described by LG synesthetes (see Ward & Simner, 2003). For example, synesthetes’ flavors are weighted toward sugary produce and chocolate, so the category of “Sugar/Sugar products” was expanded in this regard. Table 1 shows the final palette of foods, and Fig. 1 shows an example of the way these foods were hierarchically presented on screen during our test. Before running the study, we ran a pilot study that tested the usability of the test interface, to ensure that individuals would be able to consistently report tastes using it. The data from this pilot study can be found in the supplementary materials.
Participants were tested remotely via an online interface hosted on our testing platform, The Synaesthesia Toolkit, and entered the test by clicking on its URL. On entering the test, participants first provided demographic information, such as age and gender. Participants then proceeded to the main test, which screened for synesthesia in the two-step process of a self-report questionnaire followed by an objective test of consistency.
Participants read the following description about synesthesia, and were then required to self-report whether or not they experienced LG synesthesia:
This study is looking at synaesthesia, a rare condition that causes a kind of “merging of the senses.” We are interested in taste synaesthesia, a condition where thinking about words causes unusual taste sensations. For example, hearing the word “door” might trigger the taste of blackcurrants. Synaesthesia is rare and not many people have it. Synaesthesia is NOT the kind of associations everyone makes. E.g. the word “tin” or “can” probably make everyone think of beans or peas or coke. This is NOT synaesthesia. Synaesthesia is automatically linking words to foods, even if the word isn’t normally related to food at all. In synaesthesia, tastes can flood the mouth (like real tastes), or even just be strong thoughts that come automatically to mind. For example, hearing the word “door” might trigger the taste of blackcurrants in the mouth, or the thought of blackcurrants in the mind. Both are synaesthesia (so long as it’s automatic and has happened a lot since childhood).
Participants were then asked the following question, to allow them to self-report having or not having synesthesia: “Have you felt since you were little that some words, like ‘door,’ always have their own tastes? (even if the words aren’t related to food at all).” They responded by ticking either: “YES, I’ve thought this since I was little” or “NO, not really . . . but I could probably make some up today if I tried.” If participants answered “no,” they were told they would be required to invent word–food associations. If they answered “yes,” they were prompted to indicate whether they experienced the food association as a veridical flavor in the mouth (which we refer to in our analyses as projector synesthesia) or as thoughts in the mind (referred to as associator synesthesia). A third option was the chance for the participant to reject his or her previous self-report of synesthesia (i.e., “I’ve made a mistake – I DON’T feel that words have their own tastes”). If one of the first two options was chosen (i.e., “flavors in the mouth” or “thoughts in the mind”), participants were asked to provide two examples of a word and the flavor it triggered. If participants stated they had made a mistake, they were shown the same text presented to those who answered “no” to having synesthesia. Following this, all participants clicked to begin the objective consistency test. Figure 2 outlines the flow of the questions and the possible responses for synesthetes and nonsynesthetes.
Objective consistency test
Participants were given the following instructions: “In this test, we will show you a list of words and ask you to think of a taste for each word. The taste can be a food or drink etc. E.g. if we give you the word ‘filter,’ you might associate this with the taste of coffee.” The individuals classed as nonsynesthetes on the basis of their questionnaire response were given the additional instructions to just invent these associations (“Just read the word and think of the first taste that comes to mind. We know this is an unusual thing to ask but we want you to get creative!”). Words were presented onscreen individually alongside our food palette. Participants were required to select their food association from the palette by first clicking on a food category and then selecting one of the subordinate foods within that category. Figure 1 shows a screenshot based on the example target word “distance” and the interface seen as if a participant selected the food category “Condiments/Sauces/Soups.”
Participants were also asked to rate the strength/intensity of the association, on a scale from Very weak to Extremely strong, using a slider. There was no preset value, and a response marker appeared on the scale only when participants had clicked on it. Participants were told they could press a “no-taste” button if it was impossible for them to answer, but they were urged not to press the button too often and to try hard to think of a flavor for each word, even if the flavor association was not instantly obvious. Participants clicked “Select” when they were ready to move on to the next trial, but the screen would not advance until they had selected either a subordinate food (e.g., mayonnaise) together with an intensity rating, or “No taste.” Participants completed two blocked repetitions of the word list. Words were fully randomized within each block. Once participants had responded to each of the 30 words twice, they were debriefed and thanked for their participation.
As expected, all the LG synesthetes, and no controls, self-reported having LG synesthesia. Within the LG synesthetes, 11 reported having associator synesthesia, and 17 reported having projector synesthesia.
Objective consistency test
Our two aims were to determine whether our test of consistency would (a) discriminate group-wise between self-declared LG synesthetes and nonsynesthetes, and (b) provide a useful threshold cutoff for future test users, to effectively diagnose LG synesthesia in new individuals.
Scoring the test
For each participant, we compared food responses to the first and second presentations of each word (e.g., we compared the responses for the first and second presentations of the word “distance”). A score of 2 points was awarded for an exact match across the two presentations (i.e., the same category and the same subordinate food; e.g., “Fats/Butter”–“Fats/Butter”). A score of 1 was awarded for a partial match [i.e., same food category but different subordinate foods; e.g., “Fats/Butter”–“Fats/Vegetable fat (e.g., margarine)”]. The total number of consistent trials excluding “no-taste” responses was converted to a percentage, out of the maximum number of available points. For example, a participant responding with four consistent foods, one partial match, five inconsistent foods, and 20 no-taste responses would score nine points out of a possible 20 (2 points available for each of the ten words for which at least one food was provided) and would be given a score of 45.00%. We excluded consistent no-taste responses in order to prevent highly consistent datasets that would consist predominantly of no-taste responses (e.g., in the previous example, this poor-performing participant would otherwise have scored 81.70%, because they would have scored a further 40 points from consistent “no-taste” responses, and the total of 49 points would be scored out of 60, the sum of 2 points per every trial). The intensity responses were scored from 1 (Very weak) to 100 (Extremely strong), with 0 being assigned to any word that was given a no-taste response on one presentation and a taste response on the other.
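The scoring rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scoring script used in the study: the function name `consistency_score`, the representation of a response as a `(category, food)` tuple, the use of `None` for a no-taste response, and the "Fruit/Apple" label are all hypothetical. Following the worked example, a word that received a taste on only one presentation is assumed to contribute 2 available points and 0 earned points.

```python
from typing import List, Optional, Tuple

# A response is a (category, food) pair, or None for a "no-taste" response.
Response = Optional[Tuple[str, str]]


def consistency_score(first: List[Response],
                      second: List[Response]) -> Optional[float]:
    """Return the consistency percentage, or None if no word received a taste."""
    earned = 0
    available = 0
    for a, b in zip(first, second):
        if a is None and b is None:
            continue          # consistent no-taste: excluded from scoring
        available += 2        # at least one food was given for this word
        if a is None or b is None:
            continue          # taste on one presentation only: 0 points
        if a == b:
            earned += 2       # exact match: same category and same food
        elif a[0] == b[0]:
            earned += 1       # partial match: same category, different food
    if available == 0:
        return None
    return 100.0 * earned / available


# The worked example from the text: 4 exact matches, 1 partial match,
# 5 inconsistent responses, and 20 consistent no-taste responses.
butter = ("Fats", "Butter")
first = [butter] * 4 + [butter] + [butter] * 5 + [None] * 20
second = ([butter] * 4 + [("Fats", "Vegetable fat")]
          + [("Fruit", "Apple")] * 5 + [None] * 20)
example_score = consistency_score(first, second)
print(example_score)
```

With these inputs the participant earns 9 of 20 available points, matching the 45.00% given in the text.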
Figure 3 shows the distributions of consistency scores for our two groups of participants. We compared the groups using nonparametric tests because the scores were nonnormally distributed for synesthetes, W(28) = 0.88, p = .005. We found that the LG synesthetes were significantly more consistent (Mdn = 85.90%) at reporting flavor associations than were the nonsynesthete controls (Mdn = 45.00%), U = 203.00, p < .0005, r = .60. However, despite the group difference, Fig. 3 shows that no clear cutoff value separates synesthetes from nonsynesthetes.
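The text does not state how the effect size r was derived. One common convention is r = |Z|/√N, with Z taken from the normal approximation to the Mann–Whitney U statistic. The sketch below assumes that convention and applies no tie correction; the function name `mann_whitney_r` is hypothetical.

```python
import math


def mann_whitney_r(u: float, n1: int, n2: int) -> float:
    """Effect size r = |Z| / sqrt(N), using the normal approximation to U.

    Assumes no correction for ties.
    """
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return abs(z) / math.sqrt(n1 + n2)


# Values reported in the text: U = 203.00, with 28 synesthetes and 57 controls.
r_consistency = mann_whitney_r(203.0, 28, 57)
print(round(r_consistency, 2))
```

Under these assumptions, the reported U and group sizes give a value that rounds to the reported r = .60.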
To rule out the possibility that the number of words to which participants assigned tastes might have accounted for the difference in performance across the synesthete and nonsynesthete groups, we ran a two-step hierarchical linear regression, predicting consistency scores from the percentage of words given tastes on both list presentations and from synesthete status. The first model was significant, F(1, 83) = 4.78, p = .032, explaining 5.00% of the variability in consistency scores; consistency increased as the number of words with assigned tastes decreased, β = −.23, t = −2.19, p = .032. The addition of synesthete status as a predictor resulted in another significant model, F(2, 83) = 23.30, p < .0005, this time explaining 36.20% of the variability in consistency scores. The change in the percentage of variability explained was significant (p < .0005). Crucially, once synesthete status was added to the model, it became the only significant predictor, β = .59, t(82) = 6.29, p < .0005, and the percentage of words given tastes no longer significantly predicted the consistency score, β = −.03, t = −0.31, p = .759. Overall, this shows that the group difference in the number of words with tastes did not account for the relationship between consistency and synesthete status: although synesthetes assigned tastes to significantly fewer words, and although the number of “tasty” words predicted the consistency score, synesthete status explained significantly more variability in consistency scores than the number of words with tastes did.
To explore this result further, we applied receiver operating characteristics (ROC) analysis to the data, to examine how effective our test is at predicting participants’ status as an LG synesthete or nonsynesthete. We used self-reports to classify the presence and absence of synesthesia and used consistency scores as a predictor. The analysis computed a continuum of potential cutoff scores (see Fig. 4) that can be used for a diagnostic test, and for each one provided measures of sensitivity and specificity. Sensitivity is represented by the proportion of self-declared synesthetes with consistency scores greater than the cutoff (i.e., hits), and 1-specificity is represented by the proportion of nonsynesthetes with consistency scores greater than the cutoff (i.e., false alarms). The area under the curve (AUC) is taken to represent the overall predictive accuracy of a diagnostic tool. This statistic runs linearly from .5 (guessing rate) to 1 (perfect predictive power). Our consistency test yielded an AUC of .86, p < .0005, SE = .05, 95% CI [.77, .96], indicating good but not excellent predictive power.
Our analysis revealed that maximum sensitivity (i.e., classifying all self-declared synesthetes as synesthetes) would come with a score threshold of 45.83% (see Table 2 for the sensitivity and specificity values corresponding to each cutoff score value). This threshold would, however, also classify 45.61% of self-declared nonsynesthetes as synesthetes. A threshold of 95% would achieve maximum specificity (i.e., it would classify all those individuals who reported not having synesthesia as nonsynesthetes), but it would also classify 85.71% of self-declared synesthetes as nonsynesthetes. On the basis of our data, the cutoff with maximum efficiency—that is, the test threshold score that would pass the largest number of self-declared synesthetes (67.86%) while also passing the smallest number of nonsynesthetes (8.77%)—is 75%.
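The way a cutoff maps onto sensitivity and false alarms, and the way the AUC summarizes the whole range of cutoffs, can be illustrated with a short Python sketch. The scores below are invented for illustration and are not data from the study, and the function names are hypothetical. The AUC is computed here as the proportion of synesthete–nonsynesthete pairs in which the synesthete has the higher score (ties counted as half), which equals the area under the empirical ROC curve.

```python
from typing import List, Tuple


def rates_at_cutoff(syn: List[float], non: List[float],
                    cutoff: float) -> Tuple[float, float]:
    """Return (sensitivity, false-alarm rate) for scores greater than cutoff."""
    hits = sum(s > cutoff for s in syn) / len(syn)
    false_alarms = sum(s > cutoff for s in non) / len(non)
    return hits, false_alarms


def auc(syn: List[float], non: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison of the two groups."""
    wins = 0.0
    for s in syn:
        for n in non:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(syn) * len(non))


# Hypothetical consistency scores (percentages), for illustration only.
syn_scores = [90.0, 80.0, 70.0, 50.0]
non_scores = [75.0, 40.0, 30.0, 60.0]

sensitivity, false_alarm_rate = rates_at_cutoff(syn_scores, non_scores, 65.0)
area = auc(syn_scores, non_scores)
print(sensitivity, false_alarm_rate, area)
```

For the sample sizes in this study, the percentages reported at the 75% cutoff are consistent with 19 of 28 self-declared synesthetes (67.86%) and 5 of 57 nonsynesthetes (8.77%) scoring above it.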
We also looked at whether the consistency of food choices separated projector from associator LG synesthetes. The data were not normally distributed for either associators, W(11) = .82, p = .019, or projectors, W(17) = .88, p = .029, so a nonparametric test was used. There was no significant difference between associators (Mdn = 91.66) and projectors (Mdn = 83.33) in this measure of consistency, U = 79.00, p = .517, r = .13.
We next examined participants’ consistency at rating the intensity of flavor associations across the two presentations of the word list. To calculate our dependent measure for the consistency of intensity, we correlated the intensity ratings given by each participant in the first presentation with those given in the second presentation, for the same words. Hence, our intensity consistency measure (a correlation coefficient) ranged from −1 to 1. When a no-taste response was given on only one of the two presentations, an intensity of 0 was assigned to the word and was correlated against the intensity given for the taste response in the other presentation. If no-taste responses were given in both presentations of the same word, the trial was not included in the correlation. This was again done to avoid data sets with a small number of inconsistent responses attaining a high score due to the predominance of no-taste responses. The distribution of these scores as a function of self-declared synesthete status can be seen in Fig. 5. The synesthete data were not normally distributed, W(28) = .910, p = .019, and variance was heterogeneous across groups, F(1, 83) = 8.84, p = .036, so nonparametric comparisons were used. The correlations between intensity ratings given on the first and second presentations of the word list were significantly higher in the synesthete group (Mdn = .60) than in the nonsynesthete group (Mdn = .27), U = 390.00, p < .0005, r = .41. However, a ROC analysis of the intensity correlation scores and self-declared synesthete status showed that intensity scores did not fare any better at discriminating between self-declared synesthetes and nonsynesthetes than did our previous measure: AUC = .76, p < .0005, SE = .06, 95% CI [.64, .87]. Finally, we note that there were no differences in the consistency of intensity across associators (M = .55, SD = .41) and projectors (M = .56, SD = .36), t(26) = 0.073, p = .942, Cohen’s d = 0.03.
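The intensity consistency measure described above can be sketched in Python as follows. This is an illustration only: the text does not name the type of correlation coefficient, so a Pearson correlation is assumed here, and the function names, the use of `None` for a no-taste response, and the example ratings are all hypothetical.

```python
import math
from typing import List, Optional, Tuple


def paired_intensities(first: List[Optional[float]],
                       second: List[Optional[float]]) -> List[Tuple[float, float]]:
    """Pair up ratings per word; None marks a no-taste response."""
    pairs = []
    for a, b in zip(first, second):
        if a is None and b is None:
            continue  # no-taste on both presentations: trial excluded
        # no-taste on one presentation only: that rating is scored as 0
        pairs.append((0.0 if a is None else float(a),
                      0.0 if b is None else float(b)))
    return pairs


def pearson_r(pairs: List[Tuple[float, float]]) -> Optional[float]:
    """Pearson correlation; None if undefined (too few pairs or no variance)."""
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings for five words across the two presentations.
first_ratings = [80, None, 50, None, 20]
second_ratings = [70, 30, None, None, 10]
pairs = paired_intensities(first_ratings, second_ratings)
intensity_consistency = pearson_r(pairs)
print(pairs, intensity_consistency)
```

In this example the fourth word is dropped because it received no-taste responses on both presentations, while the second and third words each contribute one rating of 0.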
Above we saw that LG synesthetes were more consistent in their intensity ratings, but they also gave higher ratings overall: we looked at the average intensity ratings (on a scale from 0 to 100) within each presentation of the word list, and ran a mixed 2×2 analysis of variance crossing word list presentation (first, second) and group (synesthete, nonsynesthete). Although there was no significant effect of presentation, F(1, 83) = 1.12, p = .292, ηp² = .01, and no significant interaction, F(1, 83) = 0.78, p = .375, ηp² = .01, we did observe a main effect of group, F(1, 83) = 14.93, p < .0005, ηp² = .15. This indicated that flavor associations were significantly stronger for self-declared synesthetes (M = 57.43, SD = 19.70) than for nonsynesthetes (M = 39.86, SD = 19.70). Within our group of LG synesthetes, associators (M = 60.03, SD = 12.13) and projectors (M = 55.75, SD = 14.66) reported similar levels of intensity; we found no group difference in the intensity of word–taste associations, F(1, 26) = 0.37, p = .548, ηp² = .01, no main effect of presentation, F(1, 26) = 0.02, p = .887, ηp² = .001, and no interaction, F(1, 26) = 0.02, p = .880, ηp² = .001.
In our experiment, we tested a group of self-declared LG synesthetes and self-declared nonsynesthetes. Our test aimed to distinguish synesthetes from nonsynesthetes using a consistency measure in which words are associated with foods selected from a hierarchical list of food names. Words were presented twice, and we calculated the consistency with which the same words were given the same food association for each participant. We found that the synesthete group was significantly more consistent in their food associations across test and retest, and they were also significantly more consistent when rating the intensity of those word–food associations. Synesthetes also rated their flavors as being more intense overall. Finally, when we looked within our group of LG synesthetes, we found that associators and projectors performed similarly on every measure.
We might also conclude that we selected our target words well. Firstly, the synesthetes provided synesthetic tastes for 83% of the words in Experiment 1, and for 87% in Experiment 2. These hit rates are high in comparison to the low rates previously recorded from LG synesthetes in other studies (e.g., less than 60% in the word list of Ward et al., 2005). Secondly, all 30 words elicited a taste from at least 50% of synesthetes in Experiment 1, and from at least 38% in Experiment 2, with the majority of words (27/30) eliciting a taste response in more than half of the synesthete sample.
Although our test showed a number of group-wise differences, there was some degree of overlap in the consistency with which food associations were given over time, across synesthetes and nonsynesthetes. Our ROC analysis showed good, but not excellent, discriminability. A threshold high enough to recognize at least eight out of ten self-declared synesthetes (a score of approximately 60%) would nonetheless have a 32% chance of classifying nonsynesthetes as synesthetes. Reducing this error rate to only 8% would only pass around 6.7 out of ten of the self-declared synesthetes. For this reason, we present an alternative way to diagnose LG synesthetes below.