Experimental Evidence for a Cochlear Source of the Precedence Effect
- Bianchi, F., Verhulst, S. & Dau, T. JARO (2013) 14: 767. doi:10.1007/s10162-013-0406-z
The precedence effect (PE) refers to the dominance of directional information carried by a direct sound (lead) over the spatial information contained in its multiple reflections (lags) in sound localization. Although the processes underlying the PE have been extensively investigated, the extent to which peripheral versus central auditory processes contribute to this perceptual phenomenon has remained unclear. The present study investigated the contribution of peripheral processing to the PE through a comparison of physiological and psychoacoustical data in the same human listeners. The psychoacoustical experiments, comprising a fusion task, an interaural time difference detection task and a lateralization task, demonstrated a time range from 1 to 4.6–5 ms, in which the PE operated (precedence window). Click-evoked otoacoustic emissions (CEOAEs) were recorded in both ears to investigate the lead–lag interactions at the level of the basilar membrane (BM) in the cochlea. The CEOAE-derived peripheral and monaural lag suppression was largest for inter-click intervals (ICIs) of 1–4 ms. Auditory-evoked brainstem responses (ABRs) were used to investigate monaural and binaural lag suppression at the brainstem level. The responses to monaural stimulation reflected the peripheral lag suppression observed in the CEOAE results, while the binaural brainstem responses did not show any substantial contribution of binaural processes to monaural lag suppression. The results demonstrated that the lag suppression occurring at the BM in a time range from 1 to 4 ms, as indicated by the suppression of the lag-CEOAE, was the source of the reduction in the lag-ABRs and a possible peripheral contributor to the PE for click stimuli.
Keywords: precedence effect; lag suppression; periphery; click-evoked otoacoustic emission; auditory brainstem response
In an enclosed environment, the signal generated from a sound source reaches the listener both through a direct path and from multiple reflections off the room's surfaces. Although the listener receives reflections from different locations, the auditory system is generally able to localize the sound source rather accurately by suppressing the directional cues carried by the numerous reflections. The perceptual phenomenon of dominance of the directional information contained in the first arriving sound is known as the precedence effect (PE) (Wallach et al. 1949; Zurek 1987).
This natural situation of a direct sound followed by multiple reflections can be simplified by considering a direct sound with a single reflection. The direct sound (lead) and its reflection (lag) can be reproduced in the free field by two loudspeakers at different locations, driven with identical click stimuli with a delay between the onsets (lead–lag delay or inter-click interval (ICI)).
The perception of the lead–lag pair depends on the ICI and varies both in the number of perceived stimuli and in their perceived location. Although this variation is gradual and stimulus dependent, some approximate ranges of perception can be defined: a summing window, a precedence window, and an echo window (Fitzpatrick et al. 1999; Litovsky et al. 1999).
The summing window is defined by an ICI range between 0 and 1 ms (e.g., Litovsky et al. 1999), where the lead and the lag are perceptually fused in one single image and both contribute to the perceived localization of the fused event. The precedence window is defined by an ICI range from 1 ms up to the echo threshold (Fitzpatrick et al. 1999; Litovsky et al. 1999). Here, the percept is a fused event localized at the lead location. For this time range, the directional cues contained in the lag are weighted less heavily than those of the lead (Wallach et al. 1949; Litovsky et al. 1999). The echo window refers to the ICI range above the echo threshold, where the lead and the lag are audible as two separate sound images, each perceived at its own location (Blauert 1997). The echo threshold estimates the ICI at which the fused auditory event perceptually splits into two sound images. For clicks, the echo threshold occurs at ICIs of 2–10 ms (Freyman et al. 1991; Yang and Grantham 1997b; Litovsky et al. 1999), and studies using headphones generally observe smaller values (2–4 ms) than those using loudspeakers (Fitzpatrick et al. 1999; Litovsky et al. 1999).
Although the PE has been intensively studied over the last two decades (Lindemann 1986; Divenyi and Blauert 1987; Freyman et al. 1991; Fitzpatrick et al. 1995; Litovsky and Yin 1998; Fitzpatrick et al. 1999; Liebenthal and Pratt 1999; Hartung and Trahiotis 2001; Damaschke et al. 2005; Xia and Shinn-Cunningham 2011), the question of whether the lag-suppression mechanism results from peripheral or central processes has remained unresolved. Previous studies have suggested the existence of monaural and peripheral mechanisms responsible for a reduction in the sensitivity to the spatial cues contained in the lagging stimulus (Tollin 1998; Tollin and Henning 1998, 1999; Hartung and Trahiotis 2001; Wolf et al. 2010; Xia and Shinn-Cunningham 2011). However, these studies consisted either solely of psychoacoustical experiments (Tollin and Henning 1998, 1999), of tests of computational models against psychoacoustical results (Tollin 1998; Hartung and Trahiotis 2001; Xia and Shinn-Cunningham 2011), or of physiological findings in animals (Wolf et al. 2010). Monaural neural correlates of lag suppression were also reported by Wickesberg and Oertel (1990), Fitzpatrick et al. (1995), Parham et al. (1996), Fitzpatrick et al. (1999), and Tollin et al. (2004).
The current study investigated contributions to the PE at different stages along the auditory pathway by comparing psychoacoustical and physiological data obtained in the same human listeners. Three psychoacoustical experiments (a fusion task, an interaural time difference (ITD) detection task, and a lateralization task) were performed to investigate the perceptual phenomena related to the PE. Furthermore, two noninvasive physiological methods, click-evoked otoacoustic emissions (CEOAEs) and auditory-evoked brainstem responses (ABRs), were used to systematically examine the effect of the leading click on the lagging click at cochlear and brainstem levels and to experimentally test the hypothesis of a peripheral source of the PE.
Six normal-hearing subjects (three females and three males), aged 24 to 34 years, participated in the experiments. All had audibility thresholds of less than 20 dB hearing level at the frequencies in a standard audiogram. The experiments took place in a double-walled soundproof booth that was electrically shielded for the CEOAE and ABR experiments. All signals were generated digitally in MATLAB at a sampling rate of 48 kHz and consisted of 83 μs clicks.
The psychoacoustical experiments investigated two perceptual phenomena that characterize the perception of the lead–lag pair in the precedence window (Litovsky et al. 1999): fusion, which refers to the perception of one single, fused auditory event, and lag-discrimination suppression, which refers to the listener's difficulty in discriminating directional information contained in the lag.
An adaptive one-interval, two-alternative forced-choice (2AFC) procedure was adopted to determine the echo threshold, i.e., the ICI for which the deviant was perceived as two separate clicks. Each presentation consisted of a deviant with a lag-ITD of 300 μs, for which the ICI was varied between 1 and 7 ms. The test was carried out both for monaural and binaural stimulation to investigate the contribution of binaural processing to fusion. In both tests, the subjects' task was to specify whether they perceived a single click (SC; fused image) or two separate clicks (lead and lag). The subjects were instructed to give the two-click response only when they could hear two auditory events clearly separated in time (monaural test) or in space (binaural test). The starting value of the ICI was 1 ms; the ICI was increased after each single-click response and decreased after two consecutive two-click responses. The initial step size was 1 ms and was reduced after a lower reversal, first to 0.5 ms and then to 0.3 ms, as the threshold was approached. The echo threshold was obtained after six reversals and corresponded to the 70.7 % point on the psychometric function. Thresholds were obtained as the average of three repeated measurements.
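The adaptive rule described above can be sketched as follows. This is a minimal illustration in Python, not the authors' MATLAB implementation: the function name run_staircase, the simulated listener passed as hears_two, the trial cap, and the clipping of the ICI to the 1–7 ms range are assumptions made for the example. How the six reversals were combined into a threshold is not stated in the text, so the function only returns the reversal points.

```python
def run_staircase(hears_two, start=1.0, steps=(1.0, 0.5, 0.3),
                  n_reversals=6, ici_min=1.0, ici_max=7.0, max_trials=1000):
    """Track the echo threshold with the rule described in the text.

    hears_two(ici) -> True for a two-click response, False for a
    single-click response (hypothetical listener model).
    Returns the ICIs (in ms) at which the track reversed direction.
    """
    ici = start
    step_idx = 0          # index into the step sizes 1, 0.5, 0.3 ms
    two_in_row = 0        # consecutive two-click responses at this ICI
    direction = 0         # +1 = ICI increasing, -1 = decreasing, 0 = no move yet
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if hears_two(ici):
            two_in_row += 1
            if two_in_row < 2:
                continue              # need a second two-click response
            two_in_row = 0
            move = -1                 # two in a row: decrease the ICI
        else:
            two_in_row = 0
            move = +1                 # single click: increase the ICI
        if direction != 0 and move != direction:
            reversals.append(ici)
            # A lower reversal (track turns upward) shrinks the step size.
            if move == +1 and step_idx < len(steps) - 1:
                step_idx += 1
        direction = move
        ici = min(ici_max, max(ici_min, ici + move * steps[step_idx]))
    return reversals


# Deterministic example listener who hears two clicks from 4.6 ms upward.
reversal_points = run_staircase(lambda ici: ici >= 4.6)
```

With this idealized listener the track brackets the 4.6-ms boundary; a real listener responds probabilistically, which is why the rule converges on the 70.7 % point rather than on a sharp boundary.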
This test investigated lag-discrimination suppression by studying lag-ITD detection as a function of ICI. Seven sequences containing references and deviants were presented, one for each of the following ICIs: 0, 1, 2, 3, 4, 5, and 8 ms. Within each sequence, the ICI was constant and the deviants were randomly presented among the references, allowing a minimum of three references between the presentation of two deviants (Damaschke et al. 2005). The deviants contained ITDs ranging from 150 to 900 μs with a step size of 150 μs. Each ITD was repeated three times within the same sequence for a total of 18 deviants per sequence (six lag-ITDs repeated three times). The interval between the onset of one lead–lag pair and the onset of the following pair was 1 s. The subjects' task was to hit a button on the keyboard whenever a noncentered click pair (i.e., a deviant) was detected among the centered references. The response was considered correct when the button was pressed within 1 s after the presentation of the deviant. False alarms were accounted for by calculating the ratio between the number of correct hits and the total hits for each sequence. Subjects were asked to repeat those sequences where the ratio was below 70 %. The ITD-detection threshold was calculated as the lag-ITD that corresponded to 67 % correct performance, i.e., when the lag-ITD was correctly detected at least two times out of three for each sequence.
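The scoring of one sequence can be illustrated with a short sketch. The function names and the dictionary layout (lag-ITD in μs mapped to the number of correct detections out of three) are hypothetical, and taking the smallest lag-ITD that meets the two-out-of-three criterion is an assumption about how the threshold was read off.

```python
def hit_ratio(correct_hits, total_hits):
    """Ratio of correct hits to all button presses in one sequence."""
    return correct_hits / total_hits if total_hits else 0.0


def two_of_three_threshold(detections_by_itd, min_hits=2):
    """Smallest lag-ITD (in microseconds) detected at least min_hits times.

    detections_by_itd maps each lag-ITD to the number of correct
    detections out of three presentations. Returns None if no ITD
    reaches the criterion.
    """
    for itd in sorted(detections_by_itd):
        if detections_by_itd[itd] >= min_hits:
            return itd
    return None


# Hypothetical sequence: 12 correct detections among 14 button presses.
counts = {150: 0, 300: 1, 450: 2, 600: 3, 750: 3, 900: 3}
needs_repeat = hit_ratio(12, 14) < 0.70        # False: ratio is about 0.86
threshold_us = two_of_three_threshold(counts)  # 450
```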
The stimulus presentation consisted of one interval containing two lead–lag pairs: a reference followed by a deviant. The reference and deviant had the same ICI, with one of the following values: 0, 1, 2, 3, 4, 5, or 8 ms. The deviant contained a lag-ITD in the right channel, which was randomly varied among 0, 150, 300, 450, 600, 750, 900, and 1,000 μs. Each ITD was repeated three times for each ICI. After the presentation of each reference-deviant pair, subjects were asked to press one of the six response buttons ((1) left, (2) center, (3) center and center, (4) center and right, (5) center and left, and (6) center and left and right) according to the perceived lateralization of the deviant with respect to the reference. The six buttons were designed to take all possible percepts of the deviant into account, both when fusion occurred and when fusion was no longer present. In the case of a fused percept, an SC was perceived, either to the left (when the ITD was detected) or at the center. Otherwise, lead and lag were perceived as two separate clicks, where the lead was always perceived as centered, and the lag was perceived either at the center, left, or right, or as two clicks to the left and to the right. Although the lag ITD was leading to the left ear, the percepts of the lag either to the right, or to the left and right, were included to account for the possibility of different monaural suppressions of the lagging clicks in the left and right ear (e.g., for large ITDs). The lateralization threshold was calculated for each ICI as the minimum ITD producing a noncentered percept of the deviant at least two times out of three (67 %).
The stimuli were sent via the open source software pa-wavplay to the soundcard (RME FireFace 800 A/D-D/A converter, RME Intelligent Audio Solutions, Germany). The clicks were calibrated at a level of 65 dB peSPL in a BK-2012 ear-canal coupler (Brüel & Kjær Sound & Vibration Measurement A/S, Denmark), attached to a BK-4157 artificial ear. After insertion of the recording probe in the ear canal, in situ calibration was performed using a TDT-PA5 programmable attenuator (Tucker-Davis Technologies, Alachua, FL) to ensure that the levels of the clicks in the ear canal were equal in each ear. The stimuli were presented to the left and right ear of the test subjects via two ER-2 earphones (Etymotic Research, Inc., Elk Grove Village, IL). Recordings were performed using two ER-10B+ low-noise microphones and were bandpass filtered between 0.6 and 5 kHz (analog Rockland 852 HI/LO filter). Click pair stimuli were designed for seven different ICIs (0, 1, 2, 3, 4, 5, and 8 ms) and a lag-ITD of 300 μs.
The response recorded to the double-click stimulus consists of a CEOAE to the lead click, a CEOAE to the lag click, and a nonlinear component that depends on the ICI (Verhulst et al. 2011a). Kemp and Chum (1980a) developed a technique to remove the CEOAE component from the leading click while keeping the CEOAE component to the lagging click and the nonlinear component due to the ICI. This technique, as adapted by Kapadia and Lutman (2000b), was used here to calculate the derived suppressed (DS) response of the lagging click. Figure 1B illustrates this interleaved procedure adopted for stimulus presentation (Verhulst et al. 2011a). For each ICI and ITD condition, 1,800 repetitions of the following three stimuli were presented: SC, double click (DC; two condensation clicks), and double-click inverted (DCI; one condensation and one rarefaction click). The unsuppressed response (US) corresponded to the SC recordings. The DS response was obtained by subtracting the DCI response from the DC response and by halving the result. The DS response thus consisted of the CEOAE component due to the lagging click and the nonlinear component due to the ICI. The lag suppression was calculated as the root-mean-square (rms) level difference between DS and US responses in a time frame of 6–18 ms after click onset. Both monaural and binaural stimulations were tested. As no difference in lag suppression level was found between the two stimulations, it was decided to present the stimuli binaurally to extract monaural CEOAE lag suppression.
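The derivation of the DS response and of the lag suppression can be sketched as below. This is an illustrative example rather than the authors' analysis code: the function names are hypothetical, the sign convention (US level minus DS level, so that positive values mean suppression) is an assumption, and the analysis window is taken relative to the start of the supplied waveforms.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def derived_suppressed(dc, dci):
    """DS response: (DC - DCI) / 2, computed sample by sample."""
    return [(a - b) / 2.0 for a, b in zip(dc, dci)]


def lag_suppression_db(us, ds, fs_hz=48000, t0_ms=6.0, t1_ms=18.0):
    """Level difference (dB) between US and DS in the 6-18 ms window."""
    i0 = int(round(t0_ms * 1e-3 * fs_hz))
    i1 = int(round(t1_ms * 1e-3 * fs_hz))
    return 20.0 * math.log10(rms(us[i0:i1]) / rms(ds[i0:i1]))


# Synthetic check: the lag emission is the single-click emission halved.
n = 1000
us = [math.sin(2 * math.pi * 1500 * k / 48000) for k in range(n)]
lead = [0.3 * math.cos(2 * math.pi * 1200 * k / 48000) for k in range(n)]
dc = [l + 0.5 * u for l, u in zip(lead, us)]    # lead plus lag
dci = [l - 0.5 * u for l, u in zip(lead, us)]   # lead plus inverted lag
ds = derived_suppressed(dc, dci)
suppression = lag_suppression_db(us, ds)        # about 6 dB
```

In the synthetic check the lead component cancels in the subtraction, so only the halved lag emission remains and the level difference is 20·log10(2).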
The electrodes were placed according to the 10–10 system (American Clinical Neurophysiology Society), using a tight-fitting elastic cap that holds the electrodes in position (Picton 2011). Four electrodes were used: Cz (at the vertex, halfway between nasion and inion), Fz (at the forehead at three tenths of nasion–inion distance), M1 (left mastoid), and M2 (right mastoid). The electrode Cz was used as a reference and the electrode Fz as ground. Low impedances (below 2 kΩ) were achieved by carefully degreasing the test subject's scalp with alcohol and an abrasive electrolyte gel. The stimuli were played back and sent to the soundcard (RME FireFace 800 D/A converter, RME Intelligent Audio Solutions, Germany). The clicks were calibrated at a level of 75 dB peSPL in a BK-2012 ear-canal coupler (Brüel & Kjær Sound & Vibration Measurement A/S, Denmark), attached to a BK-4157 artificial-ear calibrator. The stimuli were presented to the left and right ear of the test subjects via two ER-2 earphones (Etymotic Research, Inc., Elk Grove Village, IL). The electrodes were connected to an EEG amplifier (Synamps 5803), responsible for the amplification and A/D conversion of the recorded potentials. The output of the amplifier was connected to the recording PC where the EEG-data were post-processed. The average, variance, and covariance of the evoked responses were calculated, and the resulting waveform was bandpass filtered with a FIR filter with cut-off frequencies of 200 and 1,500 Hz. Deviants were presented for seven different ICIs (0, 1, 2, 3, 4, 5, and 8 ms) and a lag-ITD of 300 μs. For each ICI and ITD condition, the 25-ms-long epoch containing the deviant stimulus was presented 2,000 times (Fig. 1C).
In the data analysis, the wave V amplitude peaks of the lead were determined as the maximum voltage (absolute value) in a time range of 6.5–7.5 ms after stimulus onset (Damaschke et al. 2005). The wave V amplitude peaks of the lag were determined with a similar procedure, in a time range shifted in latency according to the ICI and the ITD.
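A minimal sketch of this peak-picking step is given below. The function name, the sampling rate used in the example, and the way the latency shift is passed in (as a single value in milliseconds, e.g., the ICI plus the ITD for the delayed channel) are assumptions for illustration; the ABR sampling rate is not stated in the text.

```python
def wave_v_peak(trace, fs_hz, shift_ms=0.0, window_ms=(6.5, 7.5)):
    """Maximum absolute voltage in the wave V window.

    trace     : averaged ABR waveform, one value per sample
    fs_hz     : sampling rate of the trace (hypothetical value in the example)
    shift_ms  : latency shift applied to the window for the lag
    window_ms : search window after stimulus onset for the lead
    """
    i0 = int(round((window_ms[0] + shift_ms) * 1e-3 * fs_hz))
    i1 = int(round((window_ms[1] + shift_ms) * 1e-3 * fs_hz))
    return max(abs(v) for v in trace[i0:i1 + 1])


# Synthetic trace: lead peak at 7 ms, lag peak 3 ms later.
fs = 20000
trace = [0.0] * 500
trace[140] = 0.4     # 7.0 ms
trace[200] = -0.3    # 10.0 ms
lead_peak = wave_v_peak(trace, fs)                # 0.4
lag_peak = wave_v_peak(trace, fs, shift_ms=3.0)   # 0.3
```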
The data obtained for the DS and US conditions were divided into five blocks of 360 averages each. Mean and rms level were calculated for each block and suppression was calculated for the 25 combinations of level difference between the DS and US conditions. The standard deviation (SD) was calculated over the 25 values of suppression (Verhulst et al. 2011a).
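The block-wise estimate can be sketched as follows, assuming that each block has already been reduced to an rms level in dB. The function name is hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not specify it.

```python
import itertools
import statistics


def suppression_mean_sd(us_levels_db, ds_levels_db):
    """Mean and SD of suppression over all US/DS block combinations.

    With five blocks per condition this yields the 25 level
    differences (US minus DS, in dB) described in the text.
    """
    diffs = [us - ds
             for us, ds in itertools.product(us_levels_db, ds_levels_db)]
    return statistics.mean(diffs), statistics.stdev(diffs)


# Hypothetical block levels in dB.
us_blocks = [10.1, 10.3, 9.9, 10.0, 10.2]
ds_blocks = [5.8, 6.1, 6.0, 5.9, 6.2]
mean_db, sd_db = suppression_mean_sd(us_blocks, ds_blocks)
```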
SDs of the ABR recordings were calculated as the square root of the time-averaged variances. Normal distributions were built from the mean and SD of the wave-Vs of lead and lag. A normal distribution of lag-wave V suppression and its SD were obtained by random sampling from the distributions of the lead and lag wave-Vs.
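The random-sampling step can be illustrated with a small Monte Carlo sketch. Several details are assumptions made for the example and are not given in the text: suppression is expressed as 20·log10(lead/lag) in dB, draws that are not strictly positive are discarded so that the logarithm is defined, and the number of draws and the seed are arbitrary. The sketch assumes positive mean amplitudes.

```python
import math
import random
import statistics


def lag_wave_v_suppression(lead_mean, lead_sd, lag_mean, lag_sd,
                           n=10000, seed=0):
    """Mean and SD (dB) of lag wave V suppression by random sampling.

    Lead and lag wave V amplitudes are drawn from independent normal
    distributions built from their means and SDs.
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        lead = rng.gauss(lead_mean, lead_sd)
        lag = rng.gauss(lag_mean, lag_sd)
        if lead > 0 and lag > 0:          # keep only valid amplitude pairs
            samples.append(20.0 * math.log10(lead / lag))
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical amplitudes in microvolts.
mean_db, sd_db = lag_wave_v_suppression(0.40, 0.05, 0.25, 0.05, n=2000)
```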
Confidence interval and significance testing
Table 1. Mean lag suppression and standard deviation (in decibels), calculated for each subject, for ICIs below and above the individual echo thresholds
Column headers: Lag suppression (dB); 95 % CI (dB)
OAE mono L: 4.9 ± 0.2, 1.7 ± 0.2, 4.7 ± 0.1, 1.0 ± 0.1, 3.8 ± 0.2, 1.4 ± 0.3, 3.9 ± 0.1, 1.0 ± 0.1, 3.3 ± 0.3, 1.7 ± 0.4, 4.5 ± 0.1, 0.3 ± 0.2
OAE mono R: 6.6 ± 0.1, 1.4 ± 0.2, 4.0 ± 0.1, 1.2 ± 0.2, 4.6 ± 0.2, 2.5 ± 0.2, 4.3 ± 0.1, 1.6 ± 0.1, 4.2 ± 0.7, 2.0 ± 0.3, 3.8 ± 0.1, 0.8 ± 0.2
ABR mono L: 4.7 ± 1.9, −0.3 ± 0.5, 4.7 ± 2.0, 1.8 ± 1.2, 3.3 ± 1.8, 0.7 ± 0.7, 4.3 ± 2.5, 0.3 ± 0.6, 2.1 ± 2.1, −0.3 ± 1.0, 3.4 ± 1.3, 1.9 ± 0.9
ABR mono R: 2.2 ± 0.6, −0.2 ± 0.3, 3.0 ± 0.6, 0.8 ± 0.5, 3.7 ± 2.0, 2.1 ± 0.7, 2.9 ± 1.8, 2.0 ± 2.5, 3.4 ± 3.4, 0.4 ± 1.0, 2.7 ± 0.6, 0.7 ± 0.6
(row label missing): 3.2 ± 0.5, 0.6 ± 0.2, 4.6 ± 1.4, 1.1 ± 0.3, 6.2 ± 1.8, 0.8 ± 0.3, 2.0 ± 0.8, 0.6 ± 0.5, 3.9 ± 1.8, 1.2 ± 0.7, 4.4 ± 1.4, 1.6 ± 0.2
Figure 2B presents the mean ITD-detection thresholds (squares) and lateralization thresholds (circles). The ITD-detection threshold, i.e., the minimum lag-ITD to obtain a noncentered percept of the deviant, increased up to 590 μs for ICIs between 0 and 4 ms, and then decreased again for ICIs above 4 ms. Large threshold values indicated strong lag-discrimination suppression. For an ICI of 0 ms, no lag-discrimination suppression occurred (i.e., lead and lag had the same weight in lateralization) and all subjects could detect the deviants at the shortest ITD presented (150 μs). For an ICI of 8 ms, the ITD threshold was 340 μs, which was significantly higher than the baseline threshold for an ICI of 0 ms (p < 0.05, two-sample right-tailed t test) and not significantly lower than the threshold at 5 ms (p = 0.074, two-sample right-tailed t test), indicating that lag-discrimination suppression was still present for a lead–lag delay of 8 ms (and ITDs below the threshold). The ITD-threshold obtained here showed an ICI range over which lag-discrimination suppression occurred that is in agreement with previous studies (Zurek 1980; Damaschke et al. 2005).
The lateralization test refined the ITD-detection test by specifying the lateralization of a lead–lag pair as a function of the ICI. The difference from the previous test was that the task in this experiment was not only to detect the ITD contained in the lead–lag pair, but also to specify the perceived lateralization of the lead–lag pair. For each subject, the threshold was calculated as the minimum ITD producing at least two (out of three) noncentered percepts of the deviant. Figure 2B shows the mean lateralization threshold (circles), where the error bars indicate the standard error of the mean. The lateralization threshold curve showed values similar to those of the ITD-detection threshold function for all ICIs except at 3 ms, where the lateralization threshold was significantly larger than the detection threshold (p = 0.029, Wilcoxon rank sum test). The largest thresholds were obtained for ICIs of 2–3 ms. For longer ICIs, the threshold curve decreased again, reaching 300 μs for an ICI of 8 ms. Although not at baseline level (150 μs), this value was significantly lower than the threshold at 5 ms (p = 0.021, two-sample right-tailed t test).
In Figure 2C, the mean lateralization threshold (black curve) is shown together with the lateralizations that were most frequently reported by the six subjects. The different symbols represent the different response buttons, whereas the size of the symbols shows the percentage of the 18 responses (six subjects and three repetitions) in which a lateralization was reported. Small symbols indicate the lateralizations that were reported fewer than nine times (i.e., below 50 %). Medium-sized and large symbols represent lateralizations reported in between 50 and 70 % and in more than 70 % of the responses, respectively. The black symbols indicate perception of the lead–lag pair at the lead location, i.e., when lag-discrimination suppression occurred. Colored symbols show the release from lag-discrimination suppression. Fused percepts are indicated by the square symbols.
For an ICI of 0 ms, the blue squares show that lead and lag had the same weight in lateralization (i.e., summing location), as subjects reported hearing an SC towards the left more than 70 % of the time. For ICIs between 1 and 4 ms, lag-ITDs below 600 μs show a strong lag-discrimination suppression (black symbols), whereas ITDs above 600 μs indicate a release from lag-discrimination suppression (colored symbols), even though difficulties were reported in consistently lateralizing the lag (small symbols). For ICIs above 4 ms, the results for all ITDs indicated that lead and lag were no longer perceived as fused. Despite the breakdown of fusion, lag-discrimination suppression was still observed for ICIs of 5 and 8 ms at short ITDs (black diamonds). For large ITDs, the subjects reported perceiving a diffuse sound image inside the head (green circles).
In summary, the results from the three perceptual experiments estimated fusion to occur within an ICI range up to 4.6 ms, and lag-discrimination suppression to last for longer ICIs (at least up to 8 ms).
When the auditory system is stimulated by a click, the forward travelling wave created along the basilar membrane (BM) can be reflected by preexisting random BM impedance irregularities (Shera and Guinan 1999; Zweig and Shera 1995). These irregularities are inherent to a healthy cochlea and may reflect small cell-to-cell differences in outer-hair cell amplification and alignment, which can be thought of as place-fixed BM impedance irregularities. Through a mechanism of coherent reflection, the BM irregularities are assumed to give rise to a backwards traveling wave that can be recorded in the ear canal as a CEOAE (Zweig and Shera 1995). CEOAEs contain information about the BM processing at the cochlear regions where the emission was generated (Moleti et al. 2008; Shera et al. 2002). When the cochlea is stimulated with lead–lag pairs, both the lead and lag elicit a CEOAE. It has been shown that, when preceded by the lead, the CEOAE elicited by the lag is reduced in amplitude compared with a CEOAE elicited by the lag presented in isolation (Kapadia and Lutman 2000; Verhulst et al. 2011a). This CEOAE amplitude reduction, which depends on the lead–lag delay, presumably reflects attenuation of the BM response to the lagging click, and will be referred to as peripheral lag suppression in the following.
In Figure 3B, peripheral lag suppression is represented as a function of ICI. The figure shows individual (gray curves) and mean data (black curves) of peripheral lag suppression for monaural left (left panel) and monaural right (right panel) stimulation, for lead–lag pairs with an ITD of 300 μs. The mean data show a large suppression of the lag (between 3 and 6 dB) for lead–lag delays up to 4 ms. Above an ICI of 4 ms, the mean peripheral lag suppression decreased to 2 dB at 5 ms and 0.5 dB at 8 ms.
A statistical analysis was conducted on the null hypothesis that the difference of individual suppression, calculated for ICIs below and above individual echo thresholds, was zero (95 % CI). All test subjects showed peripheral lag suppression that was significantly larger for ICIs below the individual echo threshold than above it (Table 1).
ABRs are auditory-evoked potentials that reflect synchronized neural activity generated at the level of the auditory nerve (AN) and the auditory brainstem. Wave V is typically the most prominent peak in the ABR and is considered to reflect activity stemming from the superior olivary complex in the brainstem (Picton 2011).
Effect of frequency range and implications for peripheral processing
Previous studies regarding the auditory processes underlying the PE (Divenyi 1992; Divenyi and Blauert 1987; Dizon and Colburn 2006; Shinn-Cunningham et al. 1995; Tollin and Henning 1999; Wolf et al. 2010; Xia and Shinn-Cunningham 2011) investigated the frequency dependence of localization dominance and lag-discrimination suppression. Two main hypotheses emerged: Divenyi and Blauert (1987) and Blauert and Divenyi (1988) proposed the “spectral overlap” concept, where lag-discrimination suppression was greatest (i.e., ITD thresholds were largest) for a large spectral overlap between the lead and the lag stimuli. Thus, they suggested that discrimination suppression operated within frequency bands (corresponding to peripheral auditory filters). An alternative concept of “localization strength” was proposed by Divenyi (1992) who found that localization dominance decreased with decreasing lead center frequency, i.e., a low-frequency lead suppressed the spatial information of a high-frequency lag more strongly than when they were both centered at the same high frequency. This second hypothesis assumed a discrimination suppression mechanism operating across frequency bands. Consistent with the localization strength hypothesis, Shinn-Cunningham et al. (1995) showed that low frequency stimuli dominated over high-frequency stimuli in ITD-detection tasks. Yang and Grantham (1997b) suggested that spectral overlap (i.e., processes operating within frequency bands) and localization strength (i.e., processes across frequency bands) are two independent processes governing discrimination suppression.
Other studies investigated the frequency dependence of the PE by using spectrally identical lead and lag stimuli. By varying the center frequency of the lead–lag pair, these studies investigated within frequency-band effects as a function of frequency. Localization dominance was found to be longer lasting and more pronounced for low frequency lead and lag stimuli than for high frequency stimuli (Lindemann 1986; Tollin and Henning 1999; Dizon and Colburn 2006; Wolf et al. 2010). This frequency-dependent behavior, where localization dominance was demonstrated to decrease with increasing center frequency, strongly supported the contribution of peripheral auditory processing to the PE (Tollin 1998; Hartung and Trahiotis 2001; Wolf et al. 2010; Xia and Shinn-Cunningham 2011). In fact, due to the mechanical properties of the BM, lead and lag exhibit shorter impulse responses and, therefore, shorter interactions when they are both centered at higher frequencies than at lower frequencies.
The current study tested this hypothesis experimentally, by measuring CEOAEs to spectrally identical lead and lag stimuli. The results revealed that the CEOAE lag suppression was highly frequency dependent, with longer lasting suppression at low frequencies (Fig. 3A). By experimentally supporting the previously mentioned studies, these results provide a strong link between BM impulse response duration and within-frequency channel effects reported in psychoacoustical experiments measuring the PE. Although across-frequency processes may also be present, this study shows how within-frequency band lead–lag interactions change over frequency and how this mechanism could affect the perception of a lead–lag pair.
The abovementioned studies investigated the frequency dependence of lead dominance and lag-discrimination suppression (i.e., localization tasks), whereas the current study also presented measures of fusion, which does not necessarily involve the extraction of spatial cues. Fusion and discrimination suppression might, to some extent, rely on independent mechanisms, as previously suggested (Yang and Grantham 1997a), and operate in different frequency regions. It has been shown that ITD detection most likely relies on low frequencies (Dizon and Colburn 2006; Tollin and Henning 1999), where the extraction of ITDs is most effective. In contrast, echo thresholds may be dominated by high frequencies, where the lead and lag impulse responses produce shorter interactions on the BM and can, therefore, be separated out for shorter delays than at lower frequencies. The psychoacoustical results of the current study (Fig. 2) showed slightly different ICI ranges over which fusion and lag-discrimination suppression occurred. While fusion broke down at 4.6 ms (Fig. 2A), lag-discrimination suppression was still strong for an ICI of 5 ms and present for an ICI of 8 ms (for an ITD of 150 μs, Fig. 2B, C). The shorter time range over which fusion occurred would, thus, support the hypothesis of dominance of high frequencies for echo threshold determination, where one can extract cues for the number (one or two) of perceived clicks at shorter ICIs than for lateralization.
Effects of peripheral processing on the PE
The CEOAE results (Fig. 3B) showed that peripheral suppression of the lagging click was maximal for lead–lag delays up to 4 ms, in agreement with previous studies (Kapadia and Lutman 2000; Verhulst et al. 2011a). For an ICI of 0 ms, the stimulus in the left channel was an SC with double amplitude. Here, no lag suppression occurred and the reduction of 3–4 dB with respect to the single-click condition resulted from the compressive behavior of the CEOAE level curve (Verhulst et al. 2011a). Thus, peripheral lag suppression, defined as the suppressive effect of the lead on the lag, was largest for ICIs between 1 and 4 ms.
These results provide evidence for a monaural and peripheral component of lag suppression, occurring for lead–lag delays within the precedence window, and suggest a relation between peripheral suppression effects and the perceptual PE.
The lag suppression observed in the CEOAEs is of peripheral origin and likely related to the processing at local sites of the BM where the emission was generated. The frequency-dependent release of suppression as a function of ICI (Fig. 3A) appears to be linked especially to the duration of the local BM impulse response, where short ICIs lead to overlapping impulse responses that can cause lag suppression for both low and high frequency cochlear locations, whereas longer ICIs are only able to affect low frequency BM impulse responses. Although there is no invasive study that relates CEOAEs with impulse responses recorded from the BM, a large body of OAE literature provides evidence that spectral components in CEOAEs reflect local BM processing (Kemp and Chum 1980b; Neely et al. 1988; Zweig and Shera 1995; Shera and Guinan 1999; Harte et al. 2009). Moreover, cochlear dispersion combined with coherent reflection filtering can explain why the short latencies of the CEOAE waveform contain high frequencies and the longer latencies contain low frequencies (Jedrzejczak et al. 2005; Moleti and Sisto 2008). The above studies support the view that lag suppression observed in CEOAE frequency components can be considered as reflecting complex interactions (both in phase and magnitude) of local BM impulse responses at those cochlear regions where the emission was generated. This view is further supported by two AN studies that performed recordings from single AN fibers to acoustic click pairs (Goblick and Pfeiffer 1969; Parham et al. 1996). While Parham et al. (1996) did not clarify whether the origin of lag suppression arose from adaptation in the AN itself or from cochlear processing that served as an input to the AN, Goblick and Pfeiffer (1969) referred to dynamics in local BM amplification to explain lag suppression.
Modeling studies that account for BM as well as higher-level processing can provide insight into this matter (Tollin 1998; Hartung and Trahiotis 2001; Xia and Shinn-Cunningham 2011). In the model of Hartung and Trahiotis (2001), two monaural lead–lag stimuli were processed through a left- and right-ear gammatone filterbank (Patterson et al. 1995) and a hair-cell transduction stage (Meddis 1986) before the outputs were processed by a binaural cross-correlation operation. Based on the monaural effects of BM filtering, (inner) hair-cell processing and subsequent binaural processing, the model was shown to qualitatively account for some of the behavioral data associated with the PE (Wallach et al. 1949; Shinn-Cunningham et al. 1995). However, whereas that modeling study stressed the role of inner-hair-cell (IHC) processing, the results from the present study suggest that BM processing, and not IHC/AN processing, might provide the major link between the observed CEOAE-derived lag-suppression data and the behavioral data (in agreement with the model of Tollin 1998). Adaptation effects in the AN and subsequent neural stages may further contribute to the peripheral lag suppression that was shown to affect the perception of the PE in this study. For the click stimuli used in the present study, lag suppression caused by BM impulse response interactions may dominate over AN adaptation effects, which might be stronger for longer-duration stimuli.
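How a monaural reduction of the lag could shape the output of a subsequent binaural cross-correlation stage can be sketched with a toy example. This is not the model of Hartung and Trahiotis (2001): the gammatone filterbank and hair-cell stages are omitted, the clicks are idealized unit impulses, and the parameter `lag_gain` is a hypothetical stand-in for peripheral lag suppression applied identically in both ears. The sampling rate (48 kHz), ITDs (14 samples, about 0.3 ms) and ICI (96 samples, 2 ms) are likewise assumed values.

```python
def click_pair(n_samples, lead_pos, lag_pos, lag_gain):
    """Idealized lead-lag click pair: unit lead impulse, scaled lag impulse."""
    sig = [0.0] * n_samples
    sig[lead_pos] = 1.0
    sig[lag_pos] = lag_gain
    return sig

def cross_correlation(left, right, max_lag):
    """c[k] = sum_i left[i] * right[i + k] for k in [-max_lag, max_lag]."""
    n = len(left)
    out = {}
    for k in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + k
            if 0 <= j < n:
                total += left[i] * right[j]
        out[k] = total
    return out

def estimate_itd_samples(left, right, max_lag):
    """Interaural delay (in samples) at the cross-correlation maximum."""
    cc = cross_correlation(left, right, max_lag)
    return max(cc, key=cc.get)

def demo(lag_gain=0.5):
    """Lead favors the left ear (+14 samples), lag favors the right ear (-14)."""
    itd, ici, n = 14, 96, 400
    left = click_pair(n, 100, 100 + ici + itd, lag_gain)
    right = click_pair(n, 100 + itd, 100 + ici, lag_gain)
    return estimate_itd_samples(left, right, max_lag=48)

if __name__ == "__main__":
    print(demo(lag_gain=0.5))
```

With an attenuated lag (`lag_gain` below 1), the cross-correlation maximum falls at the interaural delay of the lead, because the lag peak scales with the square of the gain. With equal lead and lag amplitudes the two peaks tie in this toy example, so it only illustrates the direction of the effect and not its magnitude.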
CEOAEs and monaural ABRs
The mean wave-V amplitude reductions (Fig. 4B, blue and red curves) obtained from ABR recordings for monaural stimulation were largest in a shorter ICI range (1–2 ms) than the peripheral lag suppression observed in the CEOAEs (Fig. 3B). Several aspects may account for this difference. First, peripheral lag suppression was measured as an amplitude reduction of the backward travelling wave, which contains information about specific reflection sites along the BM (e.g., Zweig and Shera 1995; Shera et al. 2002). In contrast, the ABR reflects neural activity elicited by the forward travelling wave and, in particular, represents the synchronous activity of neurons across the whole cochlear partition (Dau et al. 2000; Junius and Dau 2005). Even though both the OAE and ABR results comprise monaural lead–lag interactions, the OAE only contains a subset of the frequency components present in the ABR. CEOAEs are, in fact, dominated by frequency components in the 1–2 kHz range where the middle-ear gain is largest (Puria 2003). Moreover, peripheral lag suppression in CEOAEs was observed to be frequency dependent, with longer-lasting suppression at low frequencies than at high frequencies (Verhulst et al. 2011b, Fig. 3A). Thus, the shorter time range of suppression obtained in the ABR results may be explained by the wider frequency window effective in ABRs versus CEOAEs. Second, ABRs not only reflect outer-hair-cell processing, as in the case of CEOAEs, but also represent effects of IHC processing and neural recovery times in the AN and brainstem.
Contributions of binaural processes
The mean lag-wave V reduction obtained with binaural stimulation (black curve in Fig. 4B, left panel) was not larger than the one obtained with monaural left stimulation (blue curve). The absence of binaural attenuation at the brainstem is consistent with previous results, which showed correlates of binaural lag suppression only in middle-latency responses but not in early-latency responses (Liebenthal and Pratt 1999), and with results showing correlates of binaural lag suppression in the pattern of late auditory-evoked potentials (Damaschke et al. 2005). Although the present study is in agreement with the absence of a binaural contribution to lag suppression at the brainstem level (Damaschke et al. 2005), the conclusion here differs with respect to the monaural mechanism occurring at stages below the brainstem. While previous studies (Damaschke et al. 2005; Fitzpatrick et al. 1999) concluded that monaural lag-suppression mechanisms occurring for ICIs below 5 ms originate from recovery times in neurons of the AN and brainstem, the present study presents evidence for mechanical BM lead–lag interactions as the main source of lag suppression for ICIs between 1 and 4 ms. When the cue for lateralization is carried by the lag, a mechanism of monaural suppression would account for the rise in the lateralization threshold for short ICIs. This is consistent with results from a recent study (Fisher et al. 2011) where monaural instantaneous frequency glides in the BM could account for characteristic features of binaural ITD processing. For ICIs larger than 5 ms (e.g., for an ICI of 8 ms in the current paper), where no peripheral suppression occurs, central (binaural) processes are likely responsible for raising the lateralization thresholds.
Furthermore, the comparison of monaural and binaural behavioral echo thresholds (Fig. 2A) did not show any contribution of binaural processes to fusion, in agreement with previous studies (Litovsky et al. 1997; Rakerd et al. 1997), suggesting that binaural processes might not be involved in echo threshold determination.
In conclusion, the results of the present study show a correlation between mechanical cochlear processes and psychoacoustical measures of the PE for short ICIs. Although low-level effects cannot be sufficient to account for all aspects of precedence, experimental evidence was provided that monaural peripheral suppression plays a fundamental role in the binaurally perceived PE for short lead–lag delays (i.e., 1–4 ms). Not only do BM lead–lag interactions occur within the same time range as the behaviorally determined precedence window for clicks, they also represent the main component of lag suppression at the level of the auditory brainstem. The findings of the present study apply to click stimuli. For stimuli of longer duration than clicks, inhibitory processes may account for some aspects of the PE (Braasch and Blauert 2003; Lindemann 1986; Xia et al. 2010). Longer durations of suppression (above 5 ms) may be explained by central processes occurring at stages above the brainstem (Blauert 1997; Damaschke et al. 2005; Liebenthal and Pratt 1999; Sanders et al. 2008).
This work was supported by the Technical University of Denmark and the Oticon Foundation.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.