Introduction

It is not uncommon in our daily lives to perceive auditory information in challenging conditions. Whether it is the chatter of multiple conversations at a coffee shop or music coming from a speaker yards away on the beach, auditory perception is not always easy and can be influenced by many factors. One example of a challenging situation is when there could be multiple interpretations of an auditory stimulus, thus creating ambiguity – for instance, if a loud environment causes someone to be unsure if their friend said one word or another. In this type of situation, visual information can help guide auditory perception toward one percept or the other. In the present study, we were interested in understanding factors that can influence one’s auditory perception under conditions of ambiguity – specifically, whether visual information can shift perception of an auditory illusion. To assess this, we used a particular auditory illusion known as the Tritone Paradox (Deutsch, 1986, 1987), which is designed to be heard in two different ways.

The tritone paradox is an auditory illusion consisting of two consecutive tones that can be perceived as either ascending (from low to high pitch) or descending (from high to low pitch). This illusion is created by two musical tones that are separated by a half octave, or a tritone. The tones of the tritone paradox have also been described as successive octave-ambiguous tones that span a half-octave interval, i.e., a tritone (Deutsch, 1986; Repp, 1997; Shepard, 1964). Each of these tones is classified along a pitch class circle (see Fig. 1), which places tritone pairs directly across from each other along the edge of the circle. Tones on one side of the circle tend to be heard as higher, while those on the opposite side tend to be perceived as lower (Deutsch, 1986). While perception of the tritones is typically consistent within a given person over time, there are individual differences between listeners; one person may hear a given pair in an ascending direction, while their friend may insist the same tritone pair is in fact descending (Shepard, 1964). The orientation of the pitch class circle varies from person to person, producing variations in perceptions of the tones as more often ascending or descending. This exemplifies the ambiguity of the tritone paradox.

Fig. 1

Pitch-class circle for the tritone paradox

Previous research has shown that various individual factors can influence one’s auditory perception of the tritone paradox. One such factor is a person’s range of vocal frequencies. For instance, prior research has identified a positive correlation between a listener’s perception of the tritone paradox and the range of frequencies that comprise the listener’s voice (Deutsch et al., 1990). Additionally, perception of the tritone paradox seems to have a hereditary component; biological mothers and their children tend to perceive the tritone paradox similarly (Deutsch, 2007). Likewise, one’s native language can influence perception of the tritone paradox. For example, speakers of Asian tonal languages for whom different sounds indicate entirely different words tend to experience the tritone paradox differently than those who only speak English (Deutsch, 1991; Deutsch et al., 2004). Even one’s geographical location can influence their perception; listeners from the USA and Canada were more likely to perceive the tritone as ascending with their highest pitch class at the note D, whereas listeners from England perceived the same tone pair as descending, with the highest pitch class at note G (Dawe et al., 1998; Deutsch, 1991; Deutsch et al., 1987). Even within the USA, perception has been shown to differ by region (Ragozzine & Deutsch, 1994). These past findings all speak to how perception of the tritone paradox can vary based on individual and demographic factors.

Not only does perception of the tritone paradox often vary from individual to individual (perhaps depending in part on the factors identified above), but it can also be manipulated within a given individual. For instance, some research has suggested that action can influence one’s perception of the tritone paradox (Repp & Knoblich, 2007). In these studies, two groups of participants – one of skilled pianists and one of non-musicians – were prompted to press two keys on a keyboard in either a right-to-left or left-to-right direction while listening to an instance of the tritone paradox. They were then asked to report whether they heard the ambiguous tritones as either ascending or descending. The results showed that the pianists (but not the non-musicians) heard the tritone paradox more often as ascending with a left-to-right key-press order than with a right-to-left key-press order, which makes sense given that pitch increases from left to right on piano keyboards. Importantly, the participants were able to look at their hands while playing the keys; thus, it was unclear whether the participants’ perception of the tritone paradox was influenced by the action of moving their hands or by the visual input, or both. A follow-up experiment showed that the same results were obtained even if the pianists were simply watching someone else play the tones on the keyboard and not actually performing the action themselves (Repp & Knoblich, 2009). This implies a strong influence of the visual input of someone else playing the keyboard on perception of the tritone paradox, at least for pianists (non-pianists were not tested in this follow-up experiment).

To further separate the effect of action from the effect of the visual input that prompted or coincided with that action, Repp and Goehrke (2011) conducted a similar study with two conditions. In the active condition, pianists saw a musical notation of a rising or falling melody and played it on the keyboard, while in the passive condition, they saw the musical notation (but did not play it) and heard the melody played to them. The results showed that in both conditions, pianists heard the tritone paradox more often as ascending when they saw a rising melody and more often as descending when they saw a falling melody. The fact that the effect was the same in both conditions demonstrated that the visual input of the musical notation, rather than the action of playing those notes, influenced perception of the tritone paradox. It is uncertain whether this effect would also exist for non-musicians as only skilled pianists were tested in this study.

The above studies suggest that visual input – specifically, musical information – can influence one’s perception of the tritone paradox. It is important to note that in these previous studies, the visual input was perceived consciously; that is, participants were aware of the visual stimuli presented to them, since the visual information served as a prompt for their actions. The question that arises is whether visual information, such as a notated melody (as in Repp & Goehrke, 2011), would still influence auditory perception if it were not consciously perceived. Previous work has not addressed unconscious influences on perception of an ambiguous auditory illusion. Moreover, most of the prior studies focused on skilled musicians, so it remained unclear how visual input might influence auditory perception in non-musicians. The current study aimed to fill these gaps in the literature by assessing whether one’s auditory perception of the tritone paradox can be influenced by visual information (a musical notation) presented outside of conscious awareness, regardless of musical experience. Doing so would provide insight into the bigger picture of how non-reportable visual information can influence auditory perception under conditions of ambiguity. Understanding nonconscious visual influences on auditory perception is important for two reasons. Firstly, it is not uncommon in our daily lives to receive visual input that we don’t ultimately perceive consciously (for a review, see Kouider & Dehaene, 2007), so gaining insight into how that unconscious information can influence other senses will further clarify our daily perceptual experiences. Secondly, this study will further the field’s understanding of priming by showing that it can occur automatically, unconsciously, and across senses, specifically for situations of auditory ambiguity.

To place the musical notations outside of conscious awareness in the current study, a masked priming paradigm was employed (Breitmeyer et al., 2006; Forster & Davis, 1984). In masked priming, initial exposure to a stimulus (the prime) which is “hidden” by a subsequently presented mask can influence responses to a later stimulus (the target). The vast majority of past research using masked priming has been within a given modality (Dehaene & Changeux, 2011) – for instance, visual primes with visual targets or auditory primes with auditory targets. The few existing cross-modal priming studies have shown that visual information, like words and numbers, can indeed prime auditory targets (Chng et al., 2019; Grainger et al., 2003; Kiyonaga et al., 2007; Kouider & Dehaene, 2009; Kouider & Dupoux, 2001; Nakamura et al., 2006). This priming effect has been observed both for consciously perceived primes (Kouider & Dupoux, 2001) and for unconsciously processed primes (Ansorge et al., 2016; Chng et al., 2019; Kouider & Dehaene, 2009). This prior work provides a foundation for the current study by showing that unconscious visual information can influence auditory perception. Here, we extend this cross-modal priming work by using an image of a musical notation as the prime (rather than word or number primes) and the tritone paradox as the target – a combination that has not been tested previously.

Based on the previous research on the tritone paradox and the research on cross-modal masked priming, we hypothesized that masked musical notations would unconsciously alter perception of the tritone paradox. Specifically, we expected that masked ascending music notes would unconsciously shift perception towards hearing the tones as ascending, whereas masked descending music notes would shift perception towards hearing the tones as descending.

Experiment 1

Methods

Participants

The participants were 66 undergraduate volunteers with normal or corrected-to-normal vision and hearing at California Polytechnic State University in San Luis Obispo. Of the 66 participants, 40 identified as female and 26 identified as male. These students were either currently enrolled in a Psychology course or had taken at least one during their time at the University. All participants signed up online through the SONA system, software designed for managing research participants at the University. All participants received and signed an informed consent form prior to experimentation, and this was approved by the Cal Poly Institutional Review Board. The population of interest in this study was undergraduate students who had little to no prior knowledge of the tritone paradox and masked priming.

Stimuli and apparatus

There were two types of primes: a “neutral” prime and a “meaningful” prime (see Fig. 2). The neutral prime was an image of a black empty music staff led by a black treble clef. The meaningful prime was the same music staff, with the addition of two black quarter-notes (D and G). The notes were either arranged in an ascending order (G–D) or a descending order (D–G) depending on the participant’s condition (see Design section). All primes subtended 5.97 (H) × 8.34 (W) degrees of visual angle and were presented in the center of the screen on a white background. Contrast luminance was calculated for each stimulus in terms of the proportion of the pixels in the image that were white (RGB 255, 255, 255) versus black (RGB 0, 0, 0), based on how others have conceptualized luminance previously in binary images (e.g., Trujillo et al., 2010). The neutral primes were 93.26% white and 6.64% black, while the meaningful primes, due to the addition of the two black music notes, were 91.14% white and 8.85% black. The mask image that appeared after the prime was a meaningless pattern consisting of randomly placed blobs and circles, similar to the music notes, and was 50% black and 50% white. The mask was the same height, width, and location as the prime.
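The black/white pixel proportions described above can be illustrated with a minimal sketch. It assumes a binary image stored as rows of 0 (black) and 255 (white) values; the function name and the toy image are hypothetical and are not the study's stimuli or analysis code.

```python
def pixel_proportions(image):
    """Return (white_fraction, black_fraction) for a binary image.

    `image` is a list of rows; each pixel is 255 (white) or 0 (black).
    """
    total = sum(len(row) for row in image)
    if total == 0:
        raise ValueError("image has no pixels")
    white = sum(1 for row in image for px in row if px == 255)
    black = sum(1 for row in image for px in row if px == 0)
    return white / total, black / total


# Hypothetical 4 x 5 toy image (NOT one of the study's primes).
toy = [
    [255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255],
    [255, 255, 255, 255, 255],
    [255, 255,   0, 255, 255],
]
white, black = pixel_proportions(toy)
print(f"{white:.2%} white, {black:.2%} black")
```

For a strictly binary image the two fractions sum to 1, which is a quick sanity check on any reported white/black percentages.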

Fig. 2

Trial structure. All participants received both the neutral prime condition (treble clef and staff only) and the meaningful prime condition (musical notation), but whether the meaningful prime was ascending or descending depended on their perception in the neutral condition. See text for details

All of the tritone stimuli presented in this study were based on those used by Deutsch (1986, 1987) and others who have studied the tritone paradox (Repp & Goehrke, 2011; Repp & Knoblich, 2007, 2009). The tritone pairs were created using the digital audio workstation Logic Pro X developed by Apple. Each tritone pair consisted of two successive tones that were each 500 ms long, with no pause between the tones. There were 12 pairs in total: C#-G, G#-D, D-G#, A#-E, B-F, A-D#, D#-A, F-B, E-A#, C-F#, G-C#, F#-C, as in previous studies (Deutsch, 1986).
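As a check on the stimulus list above, the following sketch confirms that each listed pair spans six semitones (a tritone) and that every pitch class serves once as the first tone. The pitch-class numbering (C = 0 through B = 11) is the standard convention; the variable and function names are illustrative.

```python
# Standard pitch-class numbering, C = 0 ... B = 11.
NOTE_TO_PITCH_CLASS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}

# The 12 pairs exactly as listed in the text.
PAIRS = ["C#-G", "G#-D", "D-G#", "A#-E", "B-F", "A-D#",
         "D#-A", "F-B", "E-A#", "C-F#", "G-C#", "F#-C"]


def interval_in_semitones(pair):
    """Pitch-class distance from the first to the second tone (mod 12)."""
    first, second = pair.split("-")
    return (NOTE_TO_PITCH_CLASS[second] - NOTE_TO_PITCH_CLASS[first]) % 12


all_tritones = all(interval_in_semitones(p) == 6 for p in PAIRS)
first_notes = {p.split("-")[0] for p in PAIRS}
print(all_tritones, len(first_notes))
```

Because a tritone is exactly half an octave, the interval is six semitones whether it is counted upward or downward, which is what makes the direction of each pair ambiguous.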

Stimuli were presented on a 21-in. Apple iMac desktop computer running jsPsych (de Leeuw, 2015) with the participant seated at a distance of 65 cm from the screen. A white screen background was used throughout the entire experiment (luminance 100%). Tritone stimuli were presented over CB3 Hush wireless noise-cancelling headphones at 50% volume and with the noise-cancelling feature enabled.
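For readers who wish to reproduce the stimulus size, the reported visual angles and the 65-cm viewing distance can be converted to approximate on-screen dimensions. The sketch below uses the standard relation size = 2 × distance × tan(angle / 2); it is an illustration under that assumption, not a description of how the stimuli were actually sized.

```python
import math


def size_from_visual_angle(angle_deg, distance_cm):
    """On-screen extent (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Prime dimensions and viewing distance as reported in the text.
height_cm = size_from_visual_angle(5.97, 65)
width_cm = size_from_visual_angle(8.34, 65)
print(f"height = {height_cm:.2f} cm, width = {width_cm:.2f} cm")
```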

Experimental design and procedure

All participants received both the neutral prime and the meaningful prime conditions. The neutral prime condition was always presented first (i.e., as block 1) in order to establish a baseline level of perception of the tritone paradox. Block 2 was the meaningful prime condition; whether the participant received the ascending or descending music notes depended on their baseline perception in block 1 (see below). Importantly, each participant was only given one type of meaningful prime: either ascending or descending. All 12 tritone pairs were presented twice: once in each blocked condition (neutral and meaningful). The order of presentation within each condition was randomized.

Prior to the first experimental trial, instructions were presented on the computer screen while being read aloud by the experimenter. An example of the tritone paradox was played over the headphones, and participants were asked to practice indicating whether they heard it as ascending or descending. The instructions emphasized that there was no “right” or “wrong” answer, but rather that it was an illusion that could be heard in different ways. Participants were also told that they would see random “flickering blobs” on the screen (i.e., the pattern mask) prior to each auditory stimulus. Importantly, they were naive to what these blobs meant and to the condition they were in. Nothing about musical notations was mentioned in the instructions or practice. Once the experimenter ensured that the participant understood the tritone paradox as well as the task, the experiment began.

The trial structure of this experiment is depicted in Fig. 2. Each trial began with a fixation cross in the center of the screen. The participant initiated each trial by pressing the space bar, after which the prime was presented for 25 ms, followed immediately by the mask for 300 ms. We chose backward masking with these exposure durations based on previous research that also used pictures as masked primes (e.g., Dannlowski et al., 2007). After the mask, a screen that read “Ascending or descending?” was shown while the tritone pair was played over the headphones. The participant indicated whether they heard that tritone pair as ascending or descending by pressing “J” or “F” on the keyboard, respectively. They were given unlimited time to respond, although the researcher emphasized to each participant that they should respond as quickly as possible with their initial gut response. After each response, a 3-s delay occurred before the start of the next trial.

Once the participant finished the first 12 trials with the neutral prime (block 1), the percentage of the time that they heard the tones as ascending was automatically calculated in addition to the average response time. If the participant heard the tritone pairs more often as ascending than descending (i.e., > 50% heard as ascending) across the 12 neutral prime trials in block 1, then they were given the descending music notes prime in block 2. Conversely, if they heard it more often as descending (i.e., < 50%) during the neutral prime condition, then they were shown the ascending music notes as the meaningful prime. This was done in order to optimize the potential to shift participants’ perception in the opposite direction. In an instance where a participant perceived the tritones as ascending 50% of the time in block 1, they were randomly given either the ascending or descending music notes as the meaningful prime in block 2. Importantly, participants were unaware of which meaningful prime they were receiving, since the prime was not consciously perceived.
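The assignment rule described above can be summarized in a short sketch. The function names and the string labels are hypothetical; the logic (greater than 50% ascending receives the descending prime, less than 50% receives the ascending prime, exactly 50% is assigned at random) follows the text.

```python
import random


def percent_ascending(responses):
    """Percentage of responses labeled 'ascending' in a block."""
    if not responses:
        raise ValueError("no responses in block")
    return 100 * sum(r == "ascending" for r in responses) / len(responses)


def assign_meaningful_prime(pct_ascending, rng=random):
    """Choose the block 2 prime so that it opposes baseline perception."""
    if pct_ascending > 50:
        return "descending"
    if pct_ascending < 50:
        return "ascending"
    # Exactly 50%: random assignment, as described in the text.
    return rng.choice(["ascending", "descending"])


# Hypothetical block 1 data: 4 of 12 trials heard as ascending.
block1 = ["ascending"] * 4 + ["descending"] * 8
baseline = percent_ascending(block1)
print(round(baseline, 2), assign_meaningful_prime(baseline))
```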

After the completion of block 2, each participant was asked to complete a post-experiment questionnaire. This was used in order to gather demographic information, and asked questions about anxiety, stress, mood, linguality, past musical experience (both instrumental and singing), and hearing and visual impairments. A variety of question-style formats were used, including a Likert Scale, forced choice format, differential format, and free-response questions. Importantly, one section of the questionnaire asked participants to indicate if they were able to see anything discernable in the “flickering blobs” (i.e., the quickly presented prime and mask, which looked like flickering blobs to the participant), and if so, to describe what they saw. If a participant indicated that they saw anything discernable, such as music notes of any kind or direction or even just the music staff, they were eliminated from all subsequent analyses. This was done in order to increase the likelihood that the prime was not being consciously perceived, since our question of interest was the unconscious influence of the music notes on perception of the tritone paradox. Even with such strict exclusion criteria, only two participants were eliminated for being potentially aware of the prime. By taking a conservative approach in our questioning (i.e., eliminating participants if they expressed even the slightest possibility that they saw anything), we can be reasonably confident that the remaining 66 participants who were included in our analyses did not consciously perceive the prime. Indeed, participants were quite surprised to learn during the verbal debriefing that a musical notation appeared in the “flickering blobs,” stating that they were unaware that it had been there.

Results

Perception of the tritone paradox

Our measure of interest was participants’ perception of the tritone paradox (ascending vs. descending), which was calculated as the percentage of trials on which the tritone pairs were heard as ascending. In other words, a score of 100% would indicate that the participant always heard the tritones as ascending, while 0% means they always heard them as descending. This calculation was done separately for the neutral prime condition (block 1) and the meaningful prime condition (block 2) for each participant.

In order to gain a sense of participants’ perception of the tritone paradox at baseline – that is, in the absence of any meaningful prime – we first looked at responses in just the neutral prime condition (block 1). In this condition alone, participants on average heard the tritone paradox as ascending 41.04% of the time. In other words, participants tended to hear the tritone pairs as descending more often than ascending in our sample.

Our initial question, then, was how the masked prime influenced this baseline perception. To address this, the participants who heard the tritone paradox more often as descending in the neutral prime condition (i.e., < 50% of trials heard as ascending) were then given the ascending music notes as the meaningful prime (N = 47, or 71% of participants). We refer to this as the “ascending group” because they received the ascending prime. Likewise, those who heard it more often as ascending (i.e., > 50% of trials heard as ascending) were given the descending music notes as the meaningful prime (N = 19, or 29% of participants). This was the “descending group,” as they received the descending prime.

A 2 × 2 mixed-design analysis of variance (ANOVA) was conducted on the average percentage of trials heard as ascending, with factors of prime condition (neutral vs. meaningful) and group (ascending vs. descending). The results, shown in Fig. 3, first revealed the expected main effect of group [F(1,64) = 35.162, p < .001, ηp² = .355]; participants who received the ascending prime, on average, heard the tritone paradox more often as descending (M = 35.60%), while those who received the descending prime heard it more often as ascending (M = 55.83%). This of course was intentional and serves as a manipulation check. There was no significant main effect of prime condition (p > .218).
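The reported effect size can be recovered from the F statistic and its degrees of freedom using the standard identity for partial eta squared, ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the main effect of group; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of group as reported in the text: F(1,64) = 35.162.
group_effect = partial_eta_squared(35.162, 1, 64)
print(round(group_effect, 3))
```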

Fig. 3

Experiment 1 results. Average percent of trials heard as ascending is plotted as a function of prime condition (neutral/meaningful) and whether participants received the ascending or descending meaningful prime. Error bars represent standard error of the mean. *p < .05, **p < .01

Importantly, the results of the ANOVA also revealed a significant crossover interaction between prime condition and group [F(1,64) = 14.998, p < .001, ηp² = .190]. To further understand this interaction, follow-up paired-samples t-tests were conducted separately on each group (ascending and descending) comparing the difference in perception of the tritone paradox between the neutral versus meaningful prime conditions. In the ascending group, the results showed a significant effect of prime condition [t(46) = -2.957, p = .005, d = .360]; participants on average heard the tritone paradox as ascending more often when they received the meaningful (ascending) prime (M = 38.21%, SD = 15.98) compared to the neutral prime (M = 32.96%, SD = 13.09). The descending group also showed a significant effect of prime condition [t(18) = 2.260, p = .036, d = .731], but in the opposite direction; participants heard the tritone paradox as descending more often in the meaningful (descending) prime condition (M = 50.74% heard as ascending, SD = 17.54) versus the neutral prime condition (M = 60.95% heard as ascending, SD = 10.42).

This critical finding demonstrates that, even though it was not consciously perceived, the masked visual music notes influenced participants’ auditory perception of the tritone paradox. Specifically, participants were more likely to hear the tritone pairs as the direction indicated by the preceding masked prime (ascending or descending).

To further confirm the effect of the prime across both groups (ascending and descending), we calculated a measure that we referred to as the directional difference score. This measure was a difference score between the percent of trials heard as ascending during the neutral prime condition versus the meaningful prime condition. However, to take into account the opposite expected (and observed) directions of influence for the group who received the ascending prime versus the descending prime, the difference score was calculated in the direction of expected influence. For the ascending group, it was calculated as meaningful prime − neutral prime, since the meaningful prime was expected to raise scores (percentages of trials heard as ascending); for the descending group, it was calculated as neutral prime − meaningful prime, since the meaningful prime was expected to lower scores. By calculating the difference scores in this way, a positive value represents a positive influence of the prime in the expected direction (i.e., more ascending for the ascending prime, more descending for the descending prime), and a negative value indicates that the prime altered participants’ perception in the opposite direction than expected.
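The directional difference score defined above can be written as a small function. The function name and group labels are illustrative; the example applies it to the group means reported earlier (rather than to individual participants' data, which are not reproduced here).

```python
def directional_difference(neutral_pct, meaningful_pct, group):
    """Difference score signed so that positive = shift in the expected direction.

    `group` is the prime the participant received: 'ascending' or 'descending'.
    """
    if group == "ascending":
        return meaningful_pct - neutral_pct
    if group == "descending":
        return neutral_pct - meaningful_pct
    raise ValueError(f"unknown group: {group!r}")


# Group means reported in the text (percent of trials heard as ascending).
asc_shift = directional_difference(32.96, 38.21, "ascending")
desc_shift = directional_difference(60.95, 50.74, "descending")
print(round(asc_shift, 2), round(desc_shift, 2))
```

Both values come out positive, reflecting that each group's mean moved in the direction of its prime.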

A one-sample t-test on average directional difference scores across both groups (ascending and descending) showed that these scores were significantly greater than 0 [t(65) = 3.68, p < .001, d = .453], indicating that the prime did significantly influence perception of the tritone paradox in the expected direction, regardless of the direction of the prime. On average, the meaningful prime shifted participants’ perception of the tritone paradox by 6.68% in the expected direction. Furthermore, the magnitude of this influence did not differ significantly between the groups; an independent-samples t-test comparing the directional difference scores for the ascending versus descending groups showed no significant difference (M = 10.21% influence for the descending group; M = 5.25% influence for the ascending group; p > .219).
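As a consistency check on these figures, the overall shift should equal the sample-size-weighted average of the two group means, and a one-sample Cohen's d can be approximated as t / √N. The sketch below uses only values reported in the text; the t / √N relation is a standard identity for one-sample tests and is assumed here rather than taken from the authors' analysis scripts.

```python
import math

# Group sizes and mean directional difference scores from the text.
n_ascending, n_descending = 47, 19
mean_ascending, mean_descending = 5.25, 10.21

# Weighted average across all 66 participants.
overall_shift = (
    n_ascending * mean_ascending + n_descending * mean_descending
) / (n_ascending + n_descending)

# One-sample effect size implied by the reported t statistic.
t_value, n_total = 3.68, 66
cohens_d = t_value / math.sqrt(n_total)

print(round(overall_shift, 2), round(cohens_d, 3))
```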

Together, the results of these analyses show that the music note prime did significantly influence participants’ perception of the tritone paradox in the expected direction, and that the magnitude of this influence did not differ by the direction of the music notes (ascending vs. descending).

Using the post-experiment questionnaire, we were also able to analyze whether any other factors might have influenced perception of the tritone paradox. Post hoc analyses on participants’ reported level of stress, anxiety, mood (all 1–7 Likert scales), and amount of sleep the night prior across both experiments showed no significant correlation with participants’ baseline perception of the tritone paradox in the neutral prime condition or with the influence of the prime as measured by the directional difference scores (all ps > .123; see Table 1).

Table 1 Additional factors assessed in post-experiment questionnaire in Experiment 1

Reaction time

In order to determine the influence of the prime on reaction times (in ms) to the tritone paradox, a 2 × 2 mixed-design ANOVA was conducted with factors prime condition (neutral vs. meaningful) and group (ascending vs. descending). The results, shown in Fig. 4, revealed a significant main effect of prime condition [F(1,64) = 45.075, p < .001, ηp² = .413]; participants responded faster on average to tritone pairs that followed the meaningful versus neutral primes (2,164 vs. 2,941 ms, respectively), regardless of whether that meaningful prime consisted of ascending or descending music notes. Indeed, there was no significant interaction with or main effect of group (ps > .408).

Fig. 4

Experiment 1 reaction times. Average reaction times are plotted as a function of prime condition (neutral/meaningful) and whether participants received the ascending or descending meaningful prime. Error bars represent standard error of the mean. **p < .01

The finding of faster reaction times to the meaningful prime condition might be because the musical prime speeded responses, or alternatively (or additionally) because this condition was always second (block 2), and thus participants may have simply been responding faster as the experiment progressed due to practice effects. This possibility is further explored in Experiment 2.

Experiment 2

Experiment 1 showed that a masked visual prime – a picture of a musical notation – can shift one’s perception of the tritone paradox in the direction indicated by the music notes. In Experiment 2, we sought to further understand this effect by testing whether the shift in perception occurred unconsciously (Experiment 2a) or could have been due to regression to the mean (Experiment 2b).

In Experiment 1, we assumed that the shift in perception was unconscious, since the prime was presented very briefly and was masked, as others have done previously (e.g., Chng et al., 2019; Dannlowski et al., 2007). We assessed this qualitatively in Experiment 1 by asking participants in the post-experiment questionnaire whether they perceived any music notes in the “flickering” on the screen, and participants generally indicated that they did not (and if they did, they were removed from the analysis). However, it is possible that participants did consciously perceive the primes on a trial-by-trial basis but were unable to self-report on their experience after the experiment had concluded. Indeed, we did not include a quantitative measure or assessment of prime visibility in Experiment 1, so it remained unclear whether the influence of the masked primes was conscious or unconscious.

In Experiment 2a, we aimed to address this by including a measure of prime visibility. This experiment was the same as Experiment 1 but with the addition of a third block of trials, the “control” condition, on which participants were asked to respond to the visual masked prime rather than to the auditory target, as others have done previously to assess prime visibility (Ansorge et al., 2016; Wernicke & Mattler, 2019). If the prime was indeed outside of awareness and not consciously perceived, then we expected that participants would not be able to accurately respond to it.

In Experiment 2b, we further explored the effect of the prime by assessing whether the observed change in perception could have been due to regression toward the mean, rather than due to the masked musical prime itself. Indeed, the results of Experiment 1 showed that participants whose perception started at a lower average percent heard as ascending shifted higher (i.e., toward 50%) in block 2, whereas participants whose perception started higher exhibited the opposite pattern. In other words, both groups’ averages moved towards each other in block 2, suggesting possible regression to the mean as an explanation for the change in perception from block 1 to block 2, rather than the influence of the meaningful prime itself. In Experiment 2b, we assessed this alternative explanation by recruiting a third group of participants who received two blocks of the neutral prime (rather than one block of the neutral prime and one block of the meaningful music note prime as in Experiments 1 and 2a). If the shift in perception is driven solely by regression to the mean, then we would expect to observe a similar change in perception of the tritones between blocks 1 and 2 in Experiments 2a (consisting of the meaningful prime in block 2) and 2b (consisting of the neutral prime in block 2). If, however, the effect is driven by the masked musical prime shifting auditory perception above and beyond any regression to the mean, then we expect to observe a significantly greater change in perception between blocks 1 and 2 in Experiment 2a than in Experiment 2b.
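The regression-to-the-mean concern can be illustrated with a small simulation in which there is no prime at all. Every parameter below (the number of simulated listeners, the uniform distribution of each listener's true probability of hearing "ascending", and the random seed) is a hypothetical choice for illustration and is not drawn from the study's data.

```python
import random


def simulate_regression_to_mean(n_participants=5000, n_trials=12, seed=0):
    """Mean block 2 minus block 1 shift (in percentage points) with NO prime.

    Each simulated listener has a fixed true probability of responding
    'ascending'; both blocks are noisy 12-trial samples of that probability.
    Listeners are then split on their block 1 score, as in the experiment.
    """
    rng = random.Random(seed)
    shifts = {"low": [], "high": []}
    for _ in range(n_participants):
        p = rng.uniform(0.2, 0.8)  # hypothetical spread of true tendencies
        block1 = sum(rng.random() < p for _ in range(n_trials)) / n_trials
        block2 = sum(rng.random() < p for _ in range(n_trials)) / n_trials
        if block1 < 0.5:
            shifts["low"].append(block2 - block1)
        elif block1 > 0.5:
            shifts["high"].append(block2 - block1)
    return {group: 100 * sum(v) / len(v) for group, v in shifts.items()}


mean_shift = simulate_regression_to_mean()
print(mean_shift)
```

Under these assumptions the group selected for low block 1 scores drifts upward and the group selected for high scores drifts downward, even though nothing changed between blocks. This is why a neutral-prime comparison group is needed to separate a priming effect from a selection artifact.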

Methods

Participants

The participants were 75 volunteers in Experiment 2a (43 female, 29 male, one non-binary, two declined to report gender) and 62 volunteers in Experiment 2b (35 female, 27 male), all undergraduate students at California Polytechnic State University. As in Experiment 1, this sample had little to no prior knowledge of the tritone paradox and masked priming.

Stimuli and apparatus

The stimuli in Experiment 2 were the same as in Experiment 1 (see Fig. 2), except without the meaningful prime in Experiment 2b. Because Experiment 2 was conducted during the pandemic protocol that was in effect during the 2020–2021 academic year, participants completed the experiment online (rather than in person, as in Experiment 1) using their own computer and headphones. The study was run using jsPsych (de Leeuw, 2015) hosted on a public GitHub Pages repository to which participants were linked. Upon clicking on the link, participants were provided with on-screen informed consent and instructions. Participants were specifically instructed to complete the experiment on a laptop or desktop computer (rather than a mobile device) and at a time when they were in a quiet, distraction-free space. They were told that the experiment consisted of an auditory component and to put on headphones (if available) and turn up the volume to a comfortable level. Conducting this study online meant that we were unable to control precise aspects of the visual or auditory stimuli, such as the visual angle and auditory volume that we were able to control in Experiment 1.

Participants were asked in the post-experiment questionnaire whether they were able to focus completely on the experiment, whether they experienced any distractions or technical issues during the experiment, and whether they adopted any strategies during the experiment. Eight additional participants were excluded from Experiment 2a and four from Experiment 2b for reporting distractions during the experiment that influenced their ability to focus. One additional participant was removed from the analysis of Experiment 2b for reporting that they closed their eyes for the duration of the experiment. No participants reported experiencing any technical issues that would exclude them from the study (e.g., internet browser unresponsive, sound not playing).

Experimental design and procedure

The design and procedure in Experiment 2 were the same as in Experiment 1, except that in Experiment 2a, participants received one extra “control” block of trials, and in Experiment 2b, participants received two identical blocks of the neutral prime trials (and no meaningful prime block).

In Experiment 2a, as in Experiment 1, participants were given 12 trials with the neutral masked prime, followed by 12 trials with the meaningful masked prime (ascending or descending, depending on their perception in block 1). In the third “control condition” block, participants were given 24 trials – 12 with the ascending meaningful prime and 12 with the descending musical prime (randomly intermixed). The trial structure was the same as in Experiment 1 (see Fig. 2), with the prime followed by the mask, followed by the tritone pair.

Prior to the control block, participants were told that on these trials, they might see a picture of two music notes appear briefly on the screen prior to the “flickering blobs.” They were informed that these notes would appear on a music staff and would be either in an ascending or a descending orientation. They were instructed to respond on each trial whether they saw ascending music notes, descending music notes, or no music notes at all. The auditory illusion still played, but participants were told to respond based on what they saw rather than what they heard. The purpose of this control condition was to ensure that the prime was subliminal; participants should not have been able to correctly identify whether the prime was ascending or descending (i.e., performance at 50%).

After the experiment was completed, participants in both Experiment 2a and Experiment 2b were linked to an online post-experiment questionnaire consisting of the same questions as in Experiment 1, with the addition of the question asking whether they had experienced any distractions or technical difficulties. As in Experiment 1, they were asked whether they saw any ascending or descending music notes in the “flickering blobs” in the first half of the experiment, when they were responding to the auditory illusion. Participants, as in Experiment 1, responded that they did not visually perceive any music notes in the first part of the experiment, with many participants also reporting that they could not visually perceive any music notes even when they were asked to do so in the second half of Experiment 2a (the control condition). This provides some preliminary qualitative evidence, along with the quantitative analysis conducted below, that the primes were not consciously perceived.

Experiment 2a: Results

Perception of the tritone paradox

As in Experiment 1, we calculated and analyzed the percentage of trials on which the tritone pairs were heard as ascending, based on condition. We first assessed perception in the neutral prime condition alone and found that participants on average heard the tritone paradox as ascending 34.53% of the time. This shows that participants in our sample tend to hear the tritones as descending more often than ascending, just as in Experiment 1.

In order to assess the influence of the masked prime, we first assessed only the neutral and meaningful conditions in order to determine whether we replicated the effect observed in Experiment 1, even though the experiment was conducted online rather than in the laboratory. As before, we had two groups of participants – those who heard the tritones more often as descending in the neutral prime condition and therefore were given the ascending meaningful prime (the “ascending” group, N = 59), and those who heard the tritone more often as ascending in the neutral condition and thus were given the descending meaningful prime (the “descending” group, N = 16). A 2 × 2 mixed-design ANOVA was conducted on the average percentage of trials heard as ascending, with factors of block (neutral block 1 vs. meaningful block 2) and group (ascending vs. descending). The results, shown in Fig. 5, revealed a significant main effect of group [F(1,73) = 43.32, p < .001, ηp2 = .372]; participants who received the ascending prime, on average, heard the tritone paradox more often as descending (M = 29.4%), while those who received the descending prime heard it more often as ascending (M = 53.6%), as expected and observed in Experiment 1. More importantly, the results of the ANOVA also showed a significant crossover interaction between block and group [F(1,73) = 18.72, p < .001, ηp2 = .204]. As before, follow-up paired-samples t-tests were conducted separately on each group. 
The results showed a significant effect of block in both the ascending and the descending groups [ascending: t(58) = -2.94, p = .005, d = .381; descending: t(15) = 4.47, p = .001, d = .882], but in opposite directions; participants in the ascending group heard the tritones as ascending more often when they received the ascending/meaningful prime in block 2 (M = 32.2%, SD = 15.8) compared to the neutral prime in block 1 (M = 26.5%, SD = 14.5), while participants in the descending group heard the tritones as descending more often in the descending/meaningful prime condition in block 2 (M = 47.9%, SD = 16.2) versus the neutral prime condition in block 1 (M = 59.4%, SD = 10.1).
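The effect sizes reported with these F tests can be related to the test statistics through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below only illustrates that identity using the values reported above; it is not the analysis code used in the study, and the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported above for the Experiment 2a block x group ANOVA.
group_effect = partial_eta_squared(43.32, 1, 73)  # main effect of group
interaction = partial_eta_squared(18.72, 1, 73)   # block x group interaction

print(round(group_effect, 3), round(interaction, 3))  # 0.372 0.204
```

Both values agree with the reported effect sizes (.372 and .204), which makes this identity a convenient consistency check on reported F statistics and degrees of freedom.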

Fig. 5

Experiment 2a and 2b results. Average percent of trials heard as ascending is plotted as a function of block, prime group, and experiment. Error bars represent standard error of the mean. *p < .05, **p < .01, n.s. = not significant

These results are consistent with the primary results of Experiment 1. In fact, an ANOVA as above but including the additional factor of experiment (Experiment 1 vs. Experiment 2) revealed no main effects of or interactions with experiment (all ps > .101), indicating that the overall effect did not differ between the two experiments – in other words, that conducting the experiment in-person versus online did not influence our pattern of results.

To further ascertain the effect of the prime in the current experiment, directional difference scores were calculated as they were in Experiment 1, where a positive value represented a positive influence of the prime in the expected direction (i.e., more ascending for the ascending prime, more descending for the descending prime). A one-sample t-test on average directional difference scores across both groups revealed that these scores were significantly greater than 0 [t(74) = 4.235, p < .001, d = .489], suggesting that the prime influenced perception of the tritone paradox in the expected direction for both prime groups. Specifically, the meaningful prime shifted participants’ perception of the tritone paradox by 6.91% in the expected direction – a value not significantly different from that observed in Experiment 1 (p > .32). Moreover, as in Experiment 1, an independent-samples t-test showed that the influence of the prime, as measured by these directional difference scores, did not differ by group (M = 11.38% influence for the descending group; M = 5.69% influence for the ascending group; p > .155).
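The directional difference score described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the participant data are made up for the example, and the sign convention (block 2 minus block 1 for the ascending group, block 1 minus block 2 for the descending group) is inferred from the description that positive values indicate a shift in the direction of the prime.

```python
import math
import statistics


def directional_difference(block1_pct, block2_pct, prime):
    """Change in percent heard as ascending, signed so positive = toward the prime."""
    if prime == "ascending":
        return block2_pct - block1_pct
    if prime == "descending":
        return block1_pct - block2_pct
    raise ValueError(f"unknown prime direction: {prime!r}")


def one_sample_t(scores, mu=0.0):
    """Return (t, df, Cohen's d) for a one-sample t-test of the scores against mu."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 denominator)
    d = (mean - mu) / sd           # Cohen's d for a one-sample design
    t = d * math.sqrt(n)           # equivalent to (mean - mu) / (sd / sqrt(n))
    return t, n - 1, d


# Hypothetical data: (block 1 %, block 2 %, prime direction) per participant.
participants = [
    (25.0, 33.3, "ascending"),
    (16.7, 25.0, "ascending"),
    (33.3, 33.3, "ascending"),
    (58.3, 50.0, "descending"),
    (66.7, 50.0, "descending"),
]
scores = [directional_difference(b1, b2, p) for b1, b2, p in participants]
t, df, d = one_sample_t(scores)
print(scores, round(t, 2), df, round(d, 2))
```

Under this definition d equals t divided by the square root of n; applied to the reported test, 4.235 / √75 ≈ .489, which matches the effect size given above.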

These results, obtained in this experiment via an online study, replicated Experiment 1’s in-person results – participants’ perception of the tritone paradox was shifted in the direction of the masked music notes.

Reaction time

A 2 × 2 mixed-design ANOVA was conducted on average reaction times with factors of block (neutral block 1 vs. meaningful block 2) and group (ascending vs. descending). As in Experiment 1, the results, shown in Fig. 6, revealed a significant main effect of block [F(1,73) = 35.65, p < .001, ηp2 = .328], with faster responses on average to tritones that followed the meaningful primes in block 2 versus the neutral primes in block 1 (2,314 vs. 2,999 ms, respectively). As before, there was no significant interaction with or main effect of group (ps > .19). This main effect may again have been due to practice effects, since, as in Experiment 1, the meaningful prime was always presented second by design. An ANOVA comparing the results of Experiment 2 to Experiment 1 again revealed no significant main effect of or interaction with experiment (ps > .15), indicating that the pattern of reaction times did not differ when the study was conducted in person versus online.

Fig. 6

Experiment 2a and 2b reaction times. Average reaction times are plotted as a function of block, prime group, and experiment. Error bars represent standard error of the mean. * p < .05, **p < .01

Effect of musical experience

A natural question that arose in this study was whether participants’ level of musical experience influenced either (1) their baseline perception of the tritone paradox, or (2) the influence of the meaningful prime on participants’ perception of the tritone paradox. Indeed, prior work has shown that pianists, for example, might be more affected by the tritone paradox than non-musicians (Repp & Knoblich, 2007), and other previous studies focused on perception of the tritone paradox in pianists only (Repp & Goehrke, 2011; Repp & Knoblich, 2009). Thus, we might have expected to observe different effects in musicians compared to non-musicians.

In our post-experiment questionnaire, we asked participants to indicate whether they played a musical instrument (yes/no), and if so, how long they had played (number of years) and how frequently (daily/weekly/monthly/occasionally/rarely). From this, we grouped participants into musicians and non-musicians. We categorized a participant as a musician only if they responded “yes” to playing a musical instrument, had played for more than 5 years, and played more frequently than “occasionally.” Using these guidelines, 71 participants were categorized as musicians and 69 as non-musicians across Experiments 1 and 2a (data were pooled from both experiments since the same measures were used).
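The categorization rule above can be expressed as a small predicate. This is a sketch only: the function and constant names are hypothetical, and it assumes that the frequency options are ordered from most to least frequent as listed, so that "more frequently than occasionally" means daily, weekly, or monthly.

```python
# Assumption: these are the response options more frequent than "occasionally".
FREQUENT = {"daily", "weekly", "monthly"}


def is_musician(plays_instrument: bool, years_played: float, frequency: str) -> bool:
    """Apply the three criteria: plays an instrument, > 5 years, frequent practice."""
    return bool(
        plays_instrument
        and years_played > 5
        and frequency.lower() in FREQUENT
    )


print(is_musician(True, 8, "weekly"))         # True
print(is_musician(True, 5, "daily"))          # False: not more than 5 years
print(is_musician(True, 10, "occasionally"))  # False: not frequent enough
```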

An independent-samples t-test comparing perception of the tritone paradox at baseline (i.e., during the neutral prime condition only) between musicians and non-musicians showed no significant difference (p > .200); musicians heard the tritone paradox as ascending on an average of 35% of trials, while for non-musicians it was 39% (see Fig. 7a). This means that, at least in our sample of participants, musicians and non-musicians did not perceive the tritone paradox differently at baseline.

Fig. 7

Effect of musical experience on perception of the tritone paradox across Experiments 1 and 2a. (a) Percent of trials heard as ascending in the neutral prime condition only (block 1) for non-musicians and musicians. (b) Directional difference score, which reflects the change from neutral to meaningful prime, for non-musicians and musicians. All participants (both “ascending” and “descending” prime groups) are included here. Error bars represent standard error of the mean. * p < .05

To determine whether musical experience affected the influence of the musical prime on one’s perception of the tritone paradox, an independent-samples t-test was conducted comparing the directional difference scores (ascending and descending groups together) between musicians and non-musicians (see Fig. 7b). The results showed that musicians on average were more influenced by the prime than non-musicians [t(138) = 2.097, p = .038, d = .35]; the meaningful prime shifted non-musicians’ perception an average of 4.2% in the expected direction (toward ascending for ascending primes, toward descending for descending primes), while musicians’ perception was shifted an average of 9.2% in the expected direction. One-sample t-tests revealed that both of these values were significantly higher than 0% [non-musicians: t(68) = 2.506, p = .015, d = .302; musicians: t(70) = 5.411, p < .001, d = .642], indicating that both musicians’ and non-musicians’ perception was significantly influenced by the prime. Jointly, these results suggest that although musical experience did not influence participants’ baseline perception of the tritone paradox, it did affect the magnitude of the influence of the prime on perception of the tritone paradox, with musicians being more affected by the musical prime than non-musicians.

Assessment of prime visibility

The most important aspect of Experiment 2a was the addition of the control condition in which participants were asked to respond to the masked prime rather than to the auditory target, in order to assess prime visibility. An initial analysis showed that – when asked to indicate whether they saw ascending music notes, descending music notes, or no music notes at all on each trial – participants responded “no notes” on 56.11% of trials. There were always music notes in the primes on these trials (half ascending, half descending); that participants reported “no notes” on over half of trials provides some initial evidence that the primes were not consciously perceived.

To further assess prime visibility, trials on which participants responded “no notes” were excluded from subsequent analyses such that accuracy could be assessed. There were 12 participants out of the 75 who responded “no notes” on every trial, and an additional four who responded “no notes” on all but one trial; the other 59 participants responded “ascending” or “descending” on more than one trial. Out of these 59 participants, average accuracy – that is, reporting “ascending” when the ascending prime was shown and reporting “descending” when the descending prime was shown – was 55.2%. A follow-up one-sample t-test revealed that this was not statistically greater than chance performance of 50% (p > .112). This result suggests that participants were just guessing and were unable to consciously perceive the prime.
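The accuracy measure described above can be sketched as follows. The names and the example trials are hypothetical, and the `min_responses=2` default reflects one reading of the inclusion rule (participants who gave a directional response on more than one trial); it is not the authors' code.

```python
NO_NOTES = "no notes"


def visibility_accuracy(trials, min_responses=2):
    """Proportion of directional responses that match the prime direction.

    trials: list of (prime_direction, response) pairs for one participant.
    Returns None when too few directional responses remain after
    excluding "no notes" trials.
    """
    judged = [(prime, resp) for prime, resp in trials if resp != NO_NOTES]
    if len(judged) < min_responses:
        return None
    correct = sum(prime == resp for prime, resp in judged)
    return correct / len(judged)


# Hypothetical participant: 3 directional responses, 2 of them correct.
example = [
    ("ascending", "ascending"),
    ("descending", "ascending"),
    ("ascending", "no notes"),
    ("descending", "descending"),
]
print(visibility_accuracy(example))
```

Per-participant accuracies computed this way would then be compared against the chance level of 0.5.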

As an additional measure of prime visibility, a d’ analysis of sensitivity was conducted on the 59 participants indicated above. A d’ analysis is a measure of discriminability between two conditions, in this case the ascending and descending prime conditions; a low d’ score would indicate a low degree of discriminability and thus of conscious perception of the prime. In line with signal detection theory (Macmillan & Creelman, 2004), d’ scores were calculated using the hit rates (H) and false-alarm rates (F) of the two-alternative choices (ascending vs. descending), with z as the inverse of the normal distribution function, as per the equation below:

$$d' = \frac{1}{\sqrt{2}}\left[z(H) - z(F)\right]$$
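The equation above can be computed directly with the Python standard library. This is a minimal sketch, not the analysis code used in the study. The function name is hypothetical, and treating "ascending" as the signal category (so that a hit is responding "ascending" to an ascending prime and a false alarm is responding "ascending" to a descending prime) is an assumption for illustration.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """d' = [z(H) - z(F)] / sqrt(2), with z the inverse standard normal CDF.

    Both rates must lie strictly between 0 and 1; NormalDist.inv_cdf
    raises StatisticsError otherwise.
    """
    z = NormalDist().inv_cdf
    return (z(hit_rate) - z(false_alarm_rate)) / 2 ** 0.5


# Equal hit and false-alarm rates mean no discriminability.
print(d_prime(0.5, 0.5))  # 0.0

# A modest advantage of hits over false alarms yields a small positive d'.
print(round(d_prime(0.6, 0.4), 3))
```

Rates of exactly 0 or 1 have no finite z score, so any correction applied to such cases would need to be chosen and reported separately.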

The results showed an average d’ score of .167, which is extremely low and not statistically different from 0 (p > .10), indicating that participants were unable to distinguish between the prime conditions. Indeed, others have used similar procedures and analyses to assess prime visibility, showing that accuracy scores no different from chance and d’ scores not significantly above 0 – as we have found here – indicate lack of prime awareness (e.g., Ansorge et al., 2016; Wernicke & Mattler, 2019).

Together, the results of Experiment 2a replicate those of Experiment 1 and further confirm that this effect – that a masked visual musical notation prime can influence perception of an auditory illusion – can be observed without conscious awareness of the prime.

Experiment 2b: Results

Perception of the tritone paradox

As in Experiments 1 and 2a, we calculated and analyzed the percentage of trials on which the tritone pairs were heard as ascending in order to determine if perception changed (in this case, regressed toward the mean) from blocks 1 to 2.

Assessing perception in just block 1 revealed that, as in the neutral condition (block 1) in the first two experiments, participants tend to hear the tritone paradox more often as descending (average of 34.5% of trials heard as ascending across all participants). To be consistent with the analyses of Experiments 1 and 2a, we split participants into two groups based on their perception in block 1, with 46 participants who heard the tritone paradox more often as descending in block 1, and 16 who heard it more often as ascending in block 1.

First, a 2 × 2 mixed-design ANOVA was conducted on the average percentage of trials heard as ascending in just Experiment 2b, with factors of block (block 1 vs. block 2) and group (ascending vs. descending initial perception). The results showed no significant interaction (p > .52), which is in contrast to the interaction observed in both Experiment 1 and Experiment 2a, which employed the meaningful prime. Indeed, in Experiment 2b where both blocks contained the neutral prime with no music notes, there was no significant change in perception from blocks 1 to 2 in either group of participants (those who heard it initially as descending vs. ascending, ps > .66), suggesting that regression to the mean did not drive the results in Experiments 1 and 2.

To further ascertain that the influence of the meaningful prime in the prior experiments goes above and beyond any regression to the mean, a 2 × 2 × 2 ANOVA was conducted with factors of block (block 1 vs. block 2), group (those who initially heard the tritones as descending vs. ascending), and experiment (Experiment 2a vs. Experiment 2b, which both had the same running conditions being online experiments). The results, shown in Fig. 5, revealed a significant three-way interaction with experiment [F(1,133) = 8.342, p = .005, ηp2 = .059].

To understand this three-way interaction, separate 2 × 2 ANOVAs were conducted on each group of participants (those who initially heard the tritones as descending vs. ascending). The results showed a significant interaction between block and experiment in both groups [descending: F(1,103) = 4.06, p = .04, ηp2 = .048; ascending: F(1,30) = 4.96, p = .03, ηp2 = .142]. Specifically, in both groups of participants, the difference between block 1 and block 2 in Experiment 2a (where block 2 consisted of the meaningful music note prime) was significantly larger than in Experiment 2b (where block 2 consisted of the neutral, meaningless prime).

The 2 × 2 ANOVAs did also show significant main effects of block in both groups [descending: F(1,103) = 5.97, p = .016, ηp2 = .055; ascending: F(1,30) = 8.59, p = .006, ηp2 = .223], suggesting that there may have been some regression toward the mean (i.e., toward 50%) across both experiments, as would be expected in a repeated-measures design. However, the significant interactions demonstrate that the change in perception from block 1 to block 2 in Experiment 2a goes above and beyond the influence of regression to the mean and is instead driven by the direction of the music notes in the masked prime.

Reaction time

A 2 × 2 mixed-design ANOVA was conducted on average reaction times with factors of block (block 1 vs. block 2) and group (ascending vs. descending). The results, shown in Fig. 6, revealed a significant main effect of block [F(1,60) = 36.97, p < .001, ηp2 = .381], with participants faster overall in block 2 than block 1 in Experiment 2b. Follow-up paired-samples t-tests showed that this change in reaction times from block 1 to block 2 was significant in both groups of participants [descending: t(45) = 7.407, p < .001, d = 1.09; ascending: t(15) = 2.495, p = .025, d = .62]. This pattern resembles that observed in Experiments 1 and 2a, where responses were also faster in block 2. Indeed, a 2 × 2 × 2 ANOVA on average reaction times with factors of block, group (ascending vs. descending), and experiment (2a vs. 2b) showed no significant main effect of or interactions with experiment, all ps > .61. This finding indicates that the difference in reaction times between blocks 1 and 2 observed in the previous two experiments was not due to the meaningful prime speeding responses; instead, the faster responses in block 2 were likely due to practice effects since they were also observed here in Experiment 2b where there was no meaningful prime in block 2. Our primary question of interest and hypothesis only pertained to perception of the tritone paradox (ascending/descending), not reaction times, so this result did not affect our main conclusions. Even so, this finding is further addressed in the general discussion.

General discussion

The goal of this study was to investigate whether a masked visual musical notation could unconsciously affect perception of an ambiguous auditory illusion – the tritone paradox. Our hypothesis was supported, as we saw a significant shift in perception of the tritone paradox in both meaningful prime groups (ascending and descending) from the neutral baseline prime condition. Moreover, the direction of influence depended on the direction indicated by the musical notation in the meaningful prime; participants who received the ascending music note prime shifted their perception in the ascending direction, and vice versa for participants who received the descending music note prime. This was observed in two separate experiments, one conducted in person and one conducted online, which speaks to the robustness of this cross-modal priming effect. Importantly, the prime influenced participants’ perception of the tritone paradox unconsciously. This was confirmed in Experiment 2a with a quantitative assessment of prime visibility, which revealed that participants were not consciously aware of the masked prime. Experiment 2b further ascertained that these results were not due to regression to the mean. Together, these findings are noteworthy because this is the first study to show that perception of an auditory illusion can be unconsciously influenced by relevant masked visual information.

The current study is consistent with prior work showing that visual information can influence perception of the tritone paradox (Repp & Goehrke, 2011; Repp & Knoblich, 2007, 2009). Specifically, this previous research demonstrated that watching oneself or someone else play notes on a keyboard either in a right-to-left or a left-to-right direction made participants more likely to hear the tritone paradox in that pitch direction (Repp & Knoblich, 2007, 2009) – an effect that was attributed to the visual input rather than the action (Repp & Goehrke, 2011). Here, we extend this previous research by using a cross-modal masked priming paradigm to show that the visual input of the musical notation need not be consciously perceived in order to exert an influence on auditory perception.

While there has been other research on cross-modal masked priming, none of the previous studies have used an image of a musical notation as the prime or tested perception of the tritone paradox. Rather, they have often used visual and auditory words or numbers as the primes and targets (Chng et al., 2019; Grainger et al., 2003; Kiyonaga et al., 2007; Kouider & Dehaene, 2009; Kouider & Dupoux, 2001; Nakamura et al., 2006). One study showed that visually presented spatial words, such as “up,” facilitated localization of spatially congruent auditory targets (Ansorge et al., 2016). These studies laid the foundation for the current study by showing that visual information can unconsciously influence auditory perception. We add to this cross-modal masked priming literature by demonstrating this effect for the first time with an image prime and an auditory illusion.

Our study also extends previous research by testing perception in both musicians and non-musicians. Prior studies on visual influences on the tritone paradox have only observed the effect in musicians (specifically pianists); some studies did not test non-musicians (Repp & Goehrke, 2011; Repp & Knoblich, 2009), while others found that non-musicians’ perception of the tritone paradox was not significantly influenced by the visual input (Repp & Knoblich, 2007). In the present study, we found that while both musicians and non-musicians were influenced by the masked musical notation, musicians were influenced to a greater extent than non-musicians. This is consistent with the prior work showing visual influences on the tritone paradox particularly in musicians. Here, we may have also observed the effect in non-musicians, while previous studies did not (Repp & Knoblich, 2007), because of the prime stimuli given to participants in this study versus previous studies. The visual input in the prior work was the right-to-left or left-to-right action of pressing the keys on the keyboard. The pianists may have been more likely to interpret a left-to-right key-press as ascending pitch and a right-to-left key-press as descending pitch based on their experience, whereas non-musicians may not have such a strong association between keyboard direction and pitch. In the current study, the music notes in the prime were more obviously rising or falling as one was always visibly higher than the other on the music staff. Even without any musical experience, one could more easily interpret the notes as ascending versus descending, had they been consciously perceived. It is even possible that the influence of the prime in our study came from a more basic visual level (i.e., one element being visually higher than the other) rather than the music notes specifically and the pitches they represent. 
While this explanation could account for the influence of the prime on non-musicians’ perception, the fact that the priming effect was significantly greater for musicians suggests that priming in these participants was due to more than just the basic visual elements of the image; musicians’ experience with music notes and the pitches they represent likely bolstered the priming effect, leading to the difference between musicians and non-musicians observed in the current study. Future studies could further explore what categories of visual primes, aside from musical notations, can influence one’s perception of the tritone paradox.

Although perception of the tritone paradox (ascending or descending) was our main measure of interest in this study, we also assessed participants’ reaction times. In doing so, we discovered that, in all experiments, participants responded to the tritones significantly faster in block 2 than in block 1. Experiment 2b confirmed that this result was likely due to practice effects, as the same change in reaction times was observed when block 2 consisted of the neutral prime as when it consisted of the meaningful prime. Thus, it was not the case that the meaningful prime speeded responses above and beyond practice effects. The results of Experiment 2b also allow us to rule out other potential explanations for the difference in reaction times, such as the possibility that simply having any single note on the staff enhanced the semantic or spatial correspondence to the heard tritones, thereby speeding responses. We can also rule out the possible explanation that congruence between participants’ responses and the prime (e.g., responding ascending when the prime was ascending) speeded responses. Indeed, a post hoc analysis found that congruent responses were not faster than incongruent responses in block 2 (p = .27), nor was the difference in reaction times between block 2 and block 1 significantly greater for congruent versus incongruent responses (p = .22), further confirming that response congruency did not drive the speeded reaction times in block 2 in Experiment 1. While the reaction time results are likely due to practice effects, the important point is that our main finding – that the musical prime influenced the percent of trials heard as ascending – cannot be due to practice effects or music notes alone. 
Rather, our effect must be due to the arrangement of the music notes in the meaningful prime, since participants’ perception of the tritone paradox was shifted in opposite directions, depending on the prime (ascending or descending musical notes), an effect not driven by regression to the mean. From this, we can conclude that the 25-ms prime exposure duration was long enough that it was processed to the extent that it influenced behavior, but short enough that it was not consciously perceived. This was confirmed quantitatively in Experiment 2a, where participants were unable to accurately report on the identity of the visual masked prime, and qualitatively in all experiments, where participants reported in the post-experiment questionnaire that they did not see anything meaningful in the prime (or that there was even a “prime” at all).

In addition to asking about potential conscious perception of the prime in our post-experiment questionnaire, we also asked participants to report on their stress, anxiety, mood, amount of sleep the night prior, as well as demographic information such as their gender, languages spoken, and origin. We included these questions in order to investigate factors that may have influenced our results (the shift in perception from the neutral prime to the meaningful prime) as well as whether those factors influence baseline perception of the tritone paradox. Indeed, prior research has found that various individual factors can influence one’s perception of the tritone paradox, such as their vocal range, geographic location, and language experience (Deutsch, 1991, 2007; Deutsch et al., 1990, 2004). However, no previous work has investigated the influence of factors such as sleep, stress, anxiety, and overall mood. Follow-up analyses on these various factors in our main experiment, Experiment 1, showed no correlation between participants’ stress, anxiety, mood, or sleep and their baseline perception of the tritone paradox, or their shift in perception from the neutral prime condition to the meaningful prime condition. This suggests that these factors did not correlate with our primary result, and more generally, that they do not correlate with perception of the tritone paradox. Although we collected data on participants' language(s) and country/area of origin, we could not analyze these factors in the current study because we did not have a diverse enough sample; we only tested participants in central California and did not seek participants of different locations or languages. We note that assessment of all of these factors was tangential to the main goal of this study; future research could further investigate these and other factors that influence one’s perception of this auditory illusion.

One potential limitation in the current study is the smaller number of participants in the “descending” group versus the “ascending” group in both Experiment 1 and Experiment 2a. This was because there was a significantly larger proportion of our participants (71% in Experiment 1, 65% in Experiment 2a) who initially heard the tritone pairs more often as descending, which meant that they were given the ascending meaningful prime. We do not believe that the fact that we had unequal groups detracts from the interpretation or the importance of our results, especially since (1) the effect was still observed in both groups of participants separately, and (2) combining the groups into one measure (the directional difference score) revealed an even stronger overall effect of the prime on auditory perception. Further research may continue to analyze mechanisms that influence how one perceives the tritone paradox in order to help explain why the majority of participants initially heard the tritone pairs as descending.

In conclusion, the results of this study show that masked visual musical notations can unconsciously shift perception of the tritone paradox – a finding that contributes to the literature on cross-modal interactions, masked priming, and ambiguous auditory perception. Future research could continue to investigate the extent to which different types of unconsciously processed information can influence perception under conditions of ambiguity.