Determining the location of a sound source is an important function of the auditory system that relies heavily on input from both ears. Although the original intention behind cochlear implants (CIs) was to restore speech understanding for people with severe hearing loss, it has been shown that some basic localization ability can also be restored when using two CIs (Seeber et al. 2004; Litovsky et al. 2009). However, in most studies, localization performance with CIs was investigated in favorable conditions with little reverberation or background noise. In real rooms, where sound reflects from the room boundaries and overlaps with the direct sound, our preliminary studies showed that localization is disrupted for most CI users. Nevertheless, some CI users seemed to struggle less with reverberation than others. To identify the factors most important for sound localization in challenging environments, this study compares a range of basic psychoacoustic measures with measures of localization performance in an anechoic and a reverberant room in a group of seven CI users.

The ability to localize sounds depends mainly on binaural cues: interaural time differences (ITDs) carried in the signal waveform (also called the temporal fine structure [TFS]) and the envelope, and interaural level differences (ILDs). Normal-hearing listeners use all cues for localization, weighted according to the stimulus and listening situation. At low frequencies, ITDs from the signal waveform contribute dominantly to localization of most broadband sounds, whereas ILDs and ITDs in the envelope form the main cues for high-frequency stimuli (Macpherson and Middlebrooks 2002). In contrast, several studies with CI users showed that they mainly use ILD cues for localization in a quiet, anechoic space and that other cues contribute little (van Hoesel and Tyler 2003; Seeber and Fastl 2008; Grantham et al. 2008).

The ability to correctly localize sounds in rooms is associated with the precedence effect (Blauert 1997; Litovsky et al. 1999). The effect exhibits two key elements: the ability to locate a source correctly despite the presence of the reflection(s), and the perceptual fusion of the reflection(s) with the direct sound. Accurate localization is achieved by weighting binaural cues in steep onsets stronger than in the later ongoing part of a sound (Houtgast and Aoki 1994; Rakerd and Hartmann 1986). This helps because binaural cues are least corrupted by room reflections at, and shortly after, stimulus onset. Nevertheless, binaural cues in the TFS and from envelope modulations of the ongoing part also contribute to the localization percept in reverberant rooms (Dizon and Colburn 2006).

Monaural spectral and temporal cues play a key role in the fusion of direct sound and reflection(s) (Seeber and Hafter 2011). Let us first consider a simple precedence effect paradigm with one leading and one lagging sound. The lag is more likely to be audible if the sound is brief and contains steep onsets; hence echo detection thresholds are shortest for clicks (Litovsky et al. 1999). When the delay between lead and lag click is increased, the lag will eventually become audible because forward masking by the lead no longer masks it. For temporally overlapping lead and lag stimuli the situation is more complicated, since simultaneous masking, the detection of spectral coloration, and binaural unmasking will additionally contribute (Buchholz 2007). However, the absence of true binaural processing in CI users due to the lack of encoding of TFS cues in most devices, and the reduction of temporal and spectral detail due to current spread and processing in few discrete filter bands, can be thought to alter the fusion and segregation processes involved. These altered processes were also seen in an earlier study (Kerber and Seeber 2010), where, in some CI users, echo thresholds were much longer than in normal-hearing listeners. Hence it is important to identify the low-level monaural and binaural factors that underlie the disruption to CI users’ localization performance in reverberation.

The general approach taken here follows that of several studies attempting to predict higher level auditory performance from basic psychoacoustic measures in listeners with sensorineural hearing loss or with CIs (Throckmorton and Collins 1999; Strelcyk and Dau 2009; Neher et al. 2011). Most of these studies focused on factors underlying speech understanding, but Neher et al. (2011) also searched for predictors of localization performance. They looked for correlations between localization performance in anechoic space and the audiogram, performance in a spectral-ripple discrimination task, outcomes of an interaural phase discrimination task, and two cognitive measures. No significant correlations were found.

The present study aimed to determine predictive parameters for localization performance with bilateral CIs in reverberation. The hypothesis was that CI users with better access to binaural cues perform better in reverberation because unimpaired access to cues in the direct sound is crucial for correct localization in the presence of reflections (Seeber and Hafter 2011). In addition we tested if patients with faster decaying forward masking thresholds localize better in reverberant rooms than listeners showing a slower decay. The hypothesis was that faster forward masking decay might lead to greater perceptual segregation between direct sound and reflections, thereby reducing the influence of the reverberant tail on localization of the direct sound. Sensitivity to these basic perceptual cues was assessed with seven users of bilateral CIs in a battery of psychoacoustic tests with direct computer control of the CIs. The basic psychoacoustic measures were then related to measures of localization performance in reverberation.



Seven bilateral CI users took part in this study. They had all been regular users of bilateral CI devices made by Cochlear Ltd. for at least 0.5 years. An overview of the participants’ details, including etiology of hearing loss, duration of implant use, and device types, is given in Tables 1 and 2.

TABLE 1 Details of the seven bilateral CI users who participated in the present study
TABLE 2 Clinical devices and direct stimulation parameters

All participants used CIs with fully inserted electrodes and, with the exception of BiCI09, were implanted sequentially. Electrode arrays consisted of 22 intracochlear contacts and two extracochlear return electrodes (Patrick et al. 2006). They were numbered 1 to 22 with low numbers denoting more basal locations in the cochlea. Participants were fitted with either the SPEAK or the ACE strategy in their clinical programs which were set to deliver biphasic current pulses in monopolar mode. The participants used pulse-widths of 25 μs in their everyday program, with the exception of BiCI09 who used 50 μs at the left ear only.

Ethics approval was given by the North Nottingham Research Ethics Committee (09/H0407/17) and all participants gave written informed consent.

Localization test

Localization ability was assessed in the free-field using the “Simulated Open Field Environment” installed in an anechoic chamber (SOFE; Seeber et al. 2010). The setup consisted of 48 loudspeakers, of which 36 surrounded the participant in 10 ° steps in the horizontal plane. The other 12 loudspeakers were at elevations of −50 ° and +50 °. Loudspeakers between −120 ° (left) and +120 ° (right) were hidden behind acoustically transparent curtains. In the anechoic baseline condition a target sound was played from a randomly selected loudspeaker at −80 °, −60 °, −40 °, −20 °, −10 °, 0 °, 10 °, 20 °, 40 °, 60 °, or 80 ° and participants indicated the perceived location by pointing a light spot to it. This light spot was projected on the curtains and could be moved in the horizontal plane by means of a trackball (ProDePo pointing method; Seeber 2002).

The same method was used to assess localization ability in a simulated reverberant room, but here the sound from one of the target locations was accompanied by its reflections played from all loudspeakers of the SOFE, including the loudspeakers at different elevations to represent floor and ceiling reflections. The reflection pattern was derived using a mirror image method (Seeber et al. 2010). The simulated room had dimensions of 5.825 m × 4 m × 2.5 m, with a carpeted floor, a wooden ceiling and gypsum-cardboard covered side walls. Room reflections were auralized by mapping them to the SOFE loudspeaker most closely corresponding to their location, thereby recreating the spatial, spectral and temporal properties of the sound field in a real room. The location of the sources and the listener was such that the first reflection always originated from the hemi-field opposite to the direct sound. The room had a reverberation time of about 410 ms. The listener was placed 1.3 m from the source, yielding a direct-to-reverberant ratio (DRR) at the listener’s position of −3 dB. This DRR was calculated directly from the simulated room impulse response by comparing the energy of the isolated direct sound to the summed energy of all reflections.
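The DRR calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the toy impulse response, and the idea of separating direct sound from reflections at a fixed sample index (`direct_end_index`) are assumptions made for the example.

```python
import math

def direct_to_reverberant_ratio_db(impulse_response, direct_end_index):
    """DRR in dB: energy of the direct sound relative to the summed energy
    of everything that follows it in the impulse response."""
    direct_energy = sum(x * x for x in impulse_response[:direct_end_index])
    reflected_energy = sum(x * x for x in impulse_response[direct_end_index:])
    return 10.0 * math.log10(direct_energy / reflected_energy)

# Toy impulse response (hypothetical): one direct-sound tap with energy 1.0,
# followed by two reflections whose energies sum to 2.0.
toy_ir = [1.0, 0.0, 1.0, 0.0, 1.0]
drr_db = direct_to_reverberant_ratio_db(toy_ir, direct_end_index=1)
print(round(drr_db, 2))  # about -3.01 dB
```

A DRR of about −3 dB therefore means that the reflections together carry roughly twice the energy of the direct sound.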

Target stimuli were trains of pulses cut out of uniform exciting noise (Fastl and Zwicker 2007) of 60 dB SPL and a bandwidth of 300 Hz to 10 kHz. The pulsetrains consisted of six pulses separated by gaps of 120 ms. Each pulse lasted 10 ms and had Gaussian ramps of 2 ms. The overall level of the target stimuli was roved over a range of −6 to +6 dB in 3-dB steps. A 12-dB range was selected because it corresponds approximately to the average natural ILD of lateral sources (Feddersen et al. 1957), and thus to the size of the monaural level cues created by the ILD. The approximately equally sized rove renders the monaural level cues unreliable, while it keeps rove size and thus dynamic range requirements small. In a related study, we verified that these levels were low enough to avoid disruption of ILD cues caused by activation of the AGC in the speech processors (Kerber and Seeber 2012).

In all localization tests participants used their own CI devices with no changes to their standard clinical maps. Prior to data collection each participant was familiarized with the localization task using a shortened version of the test. The familiarization session was also used to adjust the volume and sensitivity settings on the participant’s speech processors. To this end, the experimenter spoke directly in front of the participants and the CI users were asked to alternately remove the implant coils and to compare and adjust the loudness between the two devices until both matched and until the speech was perceived in the front. Formal data collection did not start until participants showed a localization pattern symmetric about the median plane in the familiarization runs in anechoic conditions, and devices were re-adjusted if necessary. Localization performance was then tested in anechoic conditions in a single run of 110 trials (11 directions × 5 roving levels × 2 trials per roving level). This was followed by tests in different simulated rooms over four to eight runs, depending on the participant, with 110 trials in each run (11 directions × 4–8 room conditions × 5 roving levels × 2 trials per roving level). Room conditions were randomly interleaved within and across the multiple runs, but this article presents results only for the previously described room with a DRR of −3 dB.

Direct stimulation tests

General method

The external hardware used for the experiments consisted of two modified L34 processors provided by Cochlear Ltd. All testing was done in monopolar mode using biphasic negative-leading current pulses with both return electrodes used as reference electrodes (MP1 + 2 mode). Current pulses were initially presented with a pulse-width of 25 μs, an inter-phase gap of 8 μs, and a rate of 900 pulses per second (pps). However, for stimulation on single electrodes some participants were unable to reach their highest comfortable level (C level) with these pulse parameters, meaning that sounds were still perceived as too quiet at the maximum output current of 255 CU. For these participants the pulse-width was increased to 40 μs (Table 2). Using the final pulse parameters for each participant, two target stimuli were produced. The first stimulus, henceforth referred to as “ongoing-pulsetrain”, consisted of 270 pulses and had a total duration of 300 ms; it was perceived by the CI users as a single tone-like sound. The second stimulus, “pulsed-pulsetrain”, was akin to the stimulus used in the localization test. It comprised sequences of nine pulses with fixed current (i.e., a burst of 10-ms duration), followed by 108 pulses with zero current (i.e., 120 ms of silence). Six repetitions of this pattern made up the second target stimulus.
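The pulse counts and durations of the two stimuli follow directly from the 900-pps rate. The sketch below checks that arithmetic and lays out the two patterns as plain Python lists of per-pulse current levels. It does not use or imitate the NIC toolbox; all names and the example current of 180 CU are hypothetical.

```python
PULSE_RATE_PPS = 900
BURST_PULSES = 9       # pulses with fixed current per burst
GAP_PULSES = 108       # pulses with zero current after each burst
N_BURSTS = 6
ONGOING_PULSES = 270

def duration_ms(n_pulses, rate_pps=PULSE_RATE_PPS):
    """Duration covered by n_pulses at a constant pulse rate."""
    return 1000 * n_pulses / rate_pps

def ongoing_pulsetrain(current_cu):
    """Constant-current train: one level per pulse slot."""
    return [current_cu] * ONGOING_PULSES

def pulsed_pulsetrain(current_cu):
    """Six repetitions of (9 pulses at current_cu, 108 pulses at zero current)."""
    return ([current_cu] * BURST_PULSES + [0] * GAP_PULSES) * N_BURSTS

burst_ms = duration_ms(BURST_PULSES)      # 10 ms
gap_ms = duration_ms(GAP_PULSES)          # 120 ms
ongoing_ms = duration_ms(ONGOING_PULSES)  # 300 ms
pulsed = pulsed_pulsetrain(180)           # 180 CU is an arbitrary example level
```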

Stimuli were streamed directly from the computer to the implants using the Nucleus Implant Communication (NIC 2.0) toolbox for Matlab. During development of the test procedures, the L34 processors were connected to an implant-in-a-box and a storage oscilloscope to verify current amplitudes and the time course of the pulses. We measured the accuracy for ITDs of our stimuli to be below 1 μs for this system.

Prior to testing each participant was introduced to the direct stimulation hardware and briefed on safety precautions. Electrode impedances were then measured using the standard procedure of the Cochlear “Custom Sound” software to discover any irregularities in single electrodes, i.e., disconnected or shorted electrodes. These would be marked by abnormally high or low impedances. No such irregularities were found. We then defined the reference ear as the ear with better everyday performance as reported by the participant (Table 2). This reference ear was later used to present the anchor sound during the loudness matching task and for the forward masking measurements.

Determination of threshold and comfortable current levels

Comfort levels (C levels) and threshold levels (T levels) were measured first. The difference between C and T levels defines the electrodynamic range (EDR). We also refer to stimulation currents between C and T levels as a percentage value of EDR.

C levels were measured using an adjustment procedure. Participants were presented with a graphical user interface with seven buttons, of which six were to influence the playback level and one was to confirm the adjustment. The playback level could be raised or lowered in three different step sizes (±15, ±5 and ±1 CU) and each step-size was associated with a different button. Participants were instructed to adjust the level of the ongoing-pulsetrain presented on a single electrode until it sounded comfortably loud, but not too loud. This was done one ear and one electrode at a time for each test electrode indicated in Table 2. Each measurement was carried out twice and the final C level was calculated as the arithmetic mean of the two estimates. C levels for intermediate electrodes were linearly interpolated.
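As an illustration of the two level conventions used here, the sketch below interpolates a level for an unmeasured electrode and converts a percentage of the EDR into current units. The electrode numbers and levels are hypothetical, and treating a percentage of EDR as a linear position between T and C level in CU is an assumption based on the definition of EDR given above.

```python
def interpolate_level(electrode, measured_levels):
    """Linearly interpolate a level (CU) between the nearest measured electrodes.
    `measured_levels` maps electrode number -> measured level in CU."""
    if electrode in measured_levels:
        return float(measured_levels[electrode])
    lower = max(e for e in measured_levels if e < electrode)
    upper = min(e for e in measured_levels if e > electrode)
    fraction = (electrode - lower) / (upper - lower)
    return measured_levels[lower] + fraction * (measured_levels[upper] - measured_levels[lower])

def level_at_percent_edr(t_level, c_level, percent):
    """Current level (CU) at a given percentage of the EDR (C level minus T level)."""
    return t_level + percent / 100.0 * (c_level - t_level)

# Hypothetical C levels measured on electrodes 4 and 8 only.
c_levels = {4: 180, 8: 200}
c_level_e6 = interpolate_level(6, c_levels)              # halfway: 190 CU
reference_level = level_at_percent_edr(100, 200, 70)     # 70 % of a 100-CU EDR
```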

T-levels were measured using a two-interval, two-alternative forced-choice procedure. The two intervals were indicated by two buttons on a graphical user interface which changed color during playback. The ongoing-pulsetrain was played in one randomly selected interval, the other interval was silent. Participants indicated which interval they believed the sound to be in. Feedback was provided. The measurement electrodes were identical to those used for the C-level estimation (Table 2). Thresholds were estimated using a two-down/one-up adaptive procedure (Levitt 1971), with a step size of 10 CU up to the first reversal, 5 CU up to the second reversal, and 2 CU thereafter. The final threshold was calculated as the arithmetic mean of the level at the last eight reversals at 2-CU step size. T levels were measured one electrode and one ear at a time.
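The adaptive rule can be sketched as follows. This is a simplified illustration rather than the experimental software: the listener is a deterministic stand-in that is always correct at or above a hypothetical true threshold of 100 CU (so guessing in the forced-choice task is not modeled), and the step size is assumed to change immediately at the first and second reversal.

```python
def two_down_one_up(respond, start_level, n_final_reversals=8):
    """Two-down/one-up track. `respond(level)` returns True for a correct answer.
    Returns the mean level of the reversals visited at the 2-CU step size."""
    level = start_level
    correct_in_row = 0
    last_direction = 0          # -1 = last move was down, +1 = up, 0 = no move yet
    n_reversals = 0
    final_reversal_levels = []

    while len(final_reversal_levels) < n_final_reversals:
        if respond(level):
            correct_in_row += 1
            if correct_in_row < 2:
                continue        # two correct answers in a row are needed to step down
            direction = -1
        else:
            direction = +1
        correct_in_row = 0

        if last_direction != 0 and direction != last_direction:
            n_reversals += 1
            if n_reversals > 2:             # reversals reached with 2-CU steps
                final_reversal_levels.append(level)
        last_direction = direction

        # 10 CU up to the first reversal, 5 CU up to the second, 2 CU thereafter.
        step = 10 if n_reversals == 0 else 5 if n_reversals == 1 else 2
        level += direction * step

    return sum(final_reversal_levels) / len(final_reversal_levels)

def ideal_listener(level, true_threshold=100):
    """Hypothetical listener: correct whenever the level reaches the threshold."""
    return level >= true_threshold

estimate = two_down_one_up(ideal_listener, start_level=150)
```

With this idealized listener the track ends up alternating between 98 and 100 CU, so the estimate settles just below the true threshold. For a real listener, a two-down/one-up rule converges on the 70.7 % correct point (Levitt 1971).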

Loudness matching

For use in the pitch magnitude estimation (PME) task, presentation levels that elicited equal loudness on all test electrodes were determined with the ongoing-pulsetrain stimulus. This was necessary because in electric hearing loudness differences greatly influence pitch percepts (McKay 2004). The loudness matching task was also used to match the loudness across the electrode pairs selected for the binaural sensitivity tests. The matching was done independently for the ongoing-pulsetrain and the pulsed-pulsetrain (Table 2).

The loudness matching task was implemented as an adjustment procedure. Participants matched the loudness of a probe stimulus to that of a reference stimulus. The reference stimulus was always the ongoing-pulsetrain presented at a current of 70 % of the EDR on electrode 11 on the reference ear. After a silent interval of 500 ms a probe stimulus was presented on a randomly selected test electrode on a randomly selected ear (cf. Table 2 for test electrodes and the reference ear for all participants). The starting level of the test stimulus was set randomly between 40 % and 100 % of the EDR of the test electrode. The adjustment was made by turning a trackball left/right to increase/decrease the level of the probe stimulus in 1-CU steps. The reference-probe stimulus pair was repeated after a 1-s pause until participants had finished their adjustment, which they confirmed by a button press. Four adjustments were made for each configuration, and the median value was used as the loudness-matched level. The whole experiment consisted of at least 48 trials (i.e., number of test electrodes × 2 ears × number of stimuli × 4 trials), split into two runs of no more than 15 min each. Participants were familiarized with the task prior to the experiment.

Pitch magnitude estimation

The PME task was intended to find electrode pairs for the binaural sensitivity tests. Pitch-based place matches have been shown to yield electrode pairs that are most sensitive to interaural differences (Long et al. 2003; van Hoesel 2004). In the PME task, participants assigned a number between 0 and 100 to the perceived pitch of the ongoing-pulsetrain stimulus. Higher numbers corresponded to higher pitch percepts. The assignment was done using a graphical user interface with a slider as a visual analog scale and the value was additionally displayed numerically. Stimuli were presented once to a randomly selected electrode and ear, with the test ear being indicated on the computer screen one second prior to each stimulus presentation. Test electrodes were the same as in the loudness matching task (Table 2). Stimuli were presented at the previously established loudness-matched level for each electrode. Ten trials were collected for each electrode resulting in a total of at least 180 trials (2 ears × 9–10 electrodes × 10 trials) split into two runs of no longer than 15 min each.

As CI users often have difficulties estimating pitch (Carlyon et al. 2010), participants received substantial training prior to data collection. This ensured that participants used the full scale between 0 and 100 and that they produced stable results. Training runs were similar to the experimental runs but with fewer repetitions per test electrode. To remind participants of the full range of possible pitches, prior to each run all stimuli were presented in ascending electrode order until the participant felt confident to begin.

Binaural sensitivity tests

Sensitivity to ILDs and envelope-based ITDs was measured in separate lateralization tasks. Participants indicated the perceived intracranial location of the test stimulus by positioning a marker on a horizontal line between two endpoints labeled "left ear" and "right ear". Compared to the ILD/ITD discrimination tasks more commonly used in psychoacoustics, this type of lateralization task provides a more direct estimate of how interaural cues contribute to perceived location (Seeber and Fastl 2008).

Prior to the experiments, three pitch-matched bilateral electrode pairs located in the basal, mid, and apical area of the intracochlear electrode array were identified for each participant using the results of the PME task (Table 2). The loudness-matched ongoing-pulsetrain was presented on each electrode pair and participants were asked to confirm if this led to the perception of a single intracranial sound. If this requirement was fulfilled then presentation levels were further fine-tuned in an interactive session until sounds were perceived in the centre of the head for all electrode pairs. In cases where an initially selected electrode pair did not lead to the perception of a single intracranial sound, a new electrode pair was selected from adjacent electrodes on one or both ears until an electrode combination was found that led to the perception of a single intracranial sound. This procedure was also completed with the pulsed-pulsetrain stimulus prior to the measurement of envelope ITD sensitivity using the same electrode pairs.

ILD sensitivity was measured with the ongoing-pulsetrain stimulus modified to have a slow onset ramp of 50 ms to render the onset-ITD cue less salient. Nominal ILDs of ±16, ±8, ±4, ±2, ±1, and 0 CU were applied such that half of the ILD was subtracted from one ear, and half was added to the other. Note that positive ILDs refer to locations toward the right hemisphere. Current level was roved to restrict any use of the monaural stimulus level/loudness as a lateralization cue (Yost and Dye 1988). The rove was either −10 %, 0 % or +10 % of the EDR of the test-electrode pair. Binaural stimuli were streamed simultaneously to both implants, i.e., each pulse on one ear was simultaneous to the corresponding pulse on the other ear (zero ITD). The ILD experiment consisted of 396 trials (11 ILDs × 3 electrode pairs × 3 roving levels × 4 trials), split into three blocks of less than 15 min each. All trials were presented in random order.
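The trial count and the symmetric application of the ILD can be sketched as below. The dictionary layout, the seed, and the function names are hypothetical. The sketch keeps half-CU values as floats because the text does not state how odd nominal ILDs were rounded to whole current units, and it records the rove only as a label rather than applying it.

```python
import itertools
import random

NOMINAL_ILDS_CU = [-16, -8, -4, -2, -1, 0, 1, 2, 4, 8, 16]
ELECTRODE_PAIRS = ["basal", "mid", "apical"]
ROVES_PERCENT_EDR = [-10, 0, 10]
REPEATS = 4

def build_trials(seed=0):
    """All ILD x electrode pair x rove combinations, repeated and shuffled."""
    trials = [
        {"ild_cu": ild, "pair": pair, "rove_percent_edr": rove}
        for ild, pair, rove in itertools.product(
            NOMINAL_ILDS_CU, ELECTRODE_PAIRS, ROVES_PERCENT_EDR
        )
        for _ in range(REPEATS)
    ]
    random.Random(seed).shuffle(trials)
    return trials

def apply_ild(left_cu, right_cu, ild_cu):
    """Positive ILDs point to the right: half is subtracted on the left ear
    and half is added on the right ear."""
    return left_cu - ild_cu / 2, right_cu + ild_cu / 2

trials = build_trials()
left_level, right_level = apply_ild(150, 150, 8)
```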

ITD sensitivity was measured using the pulsed-pulsetrain, which was streamed to the two CIs with a time delay in the whole waveform and at the levels yielding centered percepts obtained in the previous interactive session. Although the whole waveform was delayed, participants were likely sensitive to the ITD carried in the envelope and onsets only, because the ability to evaluate ITDs in single pulses vanishes for constant-rate pulse trains above 300 pps (van Hoesel 2007), and 900 pps were used here. The ITDs tested were ±1,200, ±800, ±400, ±200, ±100, and 0 μs. Positive numbers denote locations toward the right hemisphere. Two participants (BiCI07 and BiCI08) showed no lateralization even at the largest ITD of 1,200 μs. They were additionally tested with ITDs of ±1,600 μs. Unlike in the ILD experiment, no additional onset/offset ramp and no level rove was applied. The whole ITD experiment consisted of at least 264 trials in random order (11–13 ITDs × 3 electrode pairs × 8 trials) split into two runs of less than 15 min each.

The order of the two lateralization experiments was randomly chosen for each participant. Prior to data collection in both lateralization experiments participants received training to familiarize themselves with the task.

Forward masking

Forward masking, the effect of a masker stimulus on the detection threshold of a probe that follows the masker, was measured using a three-interval, two-alternative forced-choice procedure. Participants identified the interval containing the masker plus probe stimulus, which was presented in either the second or third interval. In the other intervals the masker was presented alone. After each trial the level of the probe stimulus was adjusted following a two-down/one-up tracking procedure (Levitt 1971). The step size was 10 CU up to the first reversal, 5 CU up to the second reversal, and 2 CU thereafter. The threshold of a single track was determined as the median of six reversals at the lowest step size.


Localization tests

Localization accuracy was assessed by comparing the true target location to the location pointed to by the participant. Overall performance was quantified by two measures: (1) the root-mean-square (RMS) error, which is a measure of the average error between the actual and perceived location; and (2) the coefficient of determination r², which describes the extent to which localization judgments followed the idealized linear relationship between actual and perceived location. RMS error was calculated from the pooled results of all test directions. The coefficient of determination r² was calculated as the squared Pearson correlation between all individual localization responses and their corresponding target directions. Figure 1 shows the RMS error in anechoic conditions, which varied between 17.5 ° and 44.1 ° (median: 26.9 °) across participants. Error bars depict the standard deviation of the RMS error at single target locations. The bottom panel of Figure 1 shows r² values, which varied from 0.72 to 0.97 (median: 0.88).
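The two measures capture different aspects of performance, which a small sketch makes concrete. The responses below are invented for illustration: a listener who compresses all responses toward the front by a factor of two follows a perfectly linear relationship (squared Pearson correlation of 1) yet still has a large RMS error.

```python
import math

def rms_error(targets, responses):
    """Root-mean-square difference between response and target angles (degrees)."""
    squared = [(r - t) ** 2 for t, r in zip(targets, responses)]
    return math.sqrt(sum(squared) / len(squared))

def squared_pearson(targets, responses):
    """Squared Pearson correlation between target and response angles."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_r = sum(responses) / n
    cov = sum((t - mean_t) * (r - mean_r) for t, r in zip(targets, responses))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_r = sum((r - mean_r) ** 2 for r in responses)
    return cov * cov / (var_t * var_r)

# Hypothetical data: responses compressed to half the target angle.
targets = [-80, -40, 0, 40, 80]
responses = [-40, -20, 0, 20, 40]
rms = rms_error(targets, responses)          # about 28.3 degrees
r2 = squared_pearson(targets, responses)     # 1.0
```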

FIG. 1

Root-mean-square (RMS) localization error (top) and coefficient of determination r² (bottom) for the localization test in anechoic conditions (white bars) and in a simulated reverberant room with a direct-to-reverberant ratio of −3 dB (grey bars). Error bars show the standard deviation of the RMS errors across different target locations.

For all participants RMS error was greater in reverberation than in anechoic conditions, as expected (Fig. 1). The difference between the RMS error in the two environments was between 2.8 ° and 21.1 °, indicating that some participants were severely affected by reverberation, while for others there was little change. The a priori assumed trend toward poorer performance in reverberation was significant (Wilcoxon signed-rank test, one-sided, p < 0.01).
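With only seven participants, the smallest one-sided p value a Wilcoxon signed-rank test can produce is 1/2⁷ ≈ 0.0078, and it arises exactly when all seven differences have the same sign, as reported here. The sketch below verifies this by enumerating all sign patterns. It assumes an exact test without ties or zero differences; the source does not state which variant was used. Only the smallest and largest differences (2.8 ° and 21.1 °) come from the text; the intermediate values are hypothetical placeholders.

```python
from itertools import product

def exact_one_sided_wilcoxon_p(differences):
    """Exact P(W+ >= observed W+) under the null hypothesis.
    Assumes no zero differences and no tied absolute differences."""
    order = sorted(range(len(differences)), key=lambda i: abs(differences[i]))
    ranks = [0] * len(differences)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    observed = sum(r for d, r in zip(differences, ranks) if d > 0)
    n_as_extreme = sum(
        1
        for signs in product((0, 1), repeat=len(ranks))
        if sum(r * s for r, s in zip(ranks, signs)) >= observed
    )
    return n_as_extreme / 2 ** len(ranks)

# Reverberant minus anechoic RMS error; middle five values are hypothetical.
rms_increase_deg = [2.8, 5.1, 8.4, 11.0, 14.7, 17.9, 21.1]
p_value = exact_one_sided_wilcoxon_p(rms_increase_deg)   # 1/128
```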

The coefficient of determination r² decreased in the reverberant room for all participants by a magnitude between 0.03 and 0.25. This difference proved significant (Wilcoxon signed-rank test, one-sided, p < 0.01). Thus, room reflections caused localization judgments to deviate more from the idealized linear relationship between actual and perceived direction.

Binaural sensitivity tests

Normal-hearing listeners can make use of ILDs as well as ITDs carried both in the signal’s waveform and envelope to localize sounds. This multitude of cues makes localization robust in reverberation where uncorrupted ITD information is available mainly in signal onsets. In contrast, bilateral CI users base their localization percepts in quiet mainly on the evaluation of ILDs (Seeber and Fastl 2008; Grantham et al. 2008). However, in contemporary CIs, potentially useful ITD information is also transmitted in signal envelopes. The binaural sensitivity tests were designed to find out if the CI users who localize better in reverberation rely on good sensitivity to ILDs or to envelope ITDs.

Before the binaural sensitivity tests, pitch-matched electrodes at both ears were identified based on the results of the PME task. Previous studies showed that binaural sensitivity is best at matched electrode pairs (Long et al. 2003; van Hoesel 2004). For five of seven participants, no significant difference between pitch estimates on corresponding electrodes of the left and right ears was found. This was calculated by pairwise comparison of pitch estimates on corresponding electrodes at the left and right ear using a Wilcoxon signed-rank test and Bonferroni correction for the number of electrodes. Electrodes were considered to elicit equal pitch percepts if their pitch estimates were not significantly different at p ≤ 0.05. Pitch estimates across ears differed significantly for BiCI01 and BiCI07. Closer inspection revealed that this difference originated from single electrode pairs (electrode pair 9–9 for BiCI01 and 15–15 for BiCI07), which were not selected for the binaural sensitivity tests in these participants.

All participants were able to lateralize stimuli based on the imposed ILDs (Fig. 2). This held for basal, mid and apical electrodes alike. Perceived lateral position was positively and significantly correlated with imposed ILD for all participants and all tested electrode pairs (Spearman correlation on raw data points, p < 0.001).

FIG. 2

Perceived lateral position as a function of ILD plotted as medians with interquartile ranges. The lateralization magnitudes of −1 and +1 refer to lateralization at the left and right ear, respectively. The sensitivity to ILDs is given as ILD discriminability (D_ILD) at the top right corner of each panel. Panels are arranged left to right from basal to apical electrode pairs. For all participants and test electrode pairs, Spearman correlations between lateral position and ILD were significant (p < 0.001).

Sensitivity to the ILD cue was quantified by dividing the median of the standard deviation at each nominal ILD magnitude by the slope of a linear fit to the data (Kerber and Seeber 2012). The resulting value, “ILD Discriminability” (D_ILD), gives an average discrimination performance across all tested ILDs. An ILD of magnitude D_ILD would evoke a discriminable lateralization percept in 69.1 % of trials in a hypothetical ideal observer. This follows from signal-detection theory and is explained in greater detail by Kerber and Seeber (2012). Note that D_ILD was derived from the results of the lateralization experiment and might differ from the result that would be obtained by directly measuring the ILD threshold in a discrimination experiment.
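A minimal sketch of this discriminability measure is given below, using invented lateralization data. Two choices are assumptions rather than details from the text: responses are grouped by signed nominal cue value, and the sample standard deviation is used. The last line only checks that 69.1 % equals the standard normal distribution function evaluated at 0.5; the derivation itself is in Kerber and Seeber (2012).

```python
from statistics import NormalDist, median, stdev

def discriminability(trials):
    """Median per-cue standard deviation of the responses divided by the slope
    of a least-squares line through all (cue, response) pairs."""
    by_cue = {}
    for cue, response in trials:
        by_cue.setdefault(cue, []).append(response)
    median_sd = median([stdev(responses) for responses in by_cue.values()])

    n = len(trials)
    mean_x = sum(cue for cue, _ in trials) / n
    mean_y = sum(response for _, response in trials) / n
    sxy = sum((cue - mean_x) * (response - mean_y) for cue, response in trials)
    sxx = sum((cue - mean_x) ** 2 for cue, _ in trials)
    return median_sd / (sxy / sxx)

# Hypothetical data: lateral position changes by 0.1 per CU, with a scatter
# of +/-0.1 around the mean response at each ILD (four trials per ILD).
toy_trials = [
    (ild, 0.1 * ild + offset)
    for ild in (-4, 0, 4)
    for offset in (-0.1, 0.1, -0.1, 0.1)
]
d_ild = discriminability(toy_trials)            # about 1.15 CU
ideal_observer_rate = NormalDist().cdf(0.5)     # about 0.691
```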

For the comparison with localization performance, the best D_ILD value for each participant was chosen. A closer look at these D_ILD values revealed that they covered only a narrow range for five of the seven participants, who all showed D_ILD values between 4 and 6 CU (Fig. 2 and Table 3). For the two other participants, D_ILD values were higher, at 10 and 26 CU. The narrow range of D_ILD is likely because all participants were successful and experienced CI users, and it complicates the prediction of localization performance as it leaves little scope for a correlation.

TABLE 3 Summary performance measures for the localization test and all direct stimulation tests

For some participants, D_ILD might have underestimated performance at small nominal ILDs where response curves were steeper than for larger ILDs (see, e.g., the sigmoidal response curves of BiCI04). To check for this possibility, D_ILD was re-calculated including only responses at nominal ILDs between −4 CU and +4 CU. The re-calculated values did not significantly differ from D_ILD previously calculated over the whole range of test ILDs (Wilcoxon signed-rank test, two-sided, p > 0.52). Closer inspection of the data revealed that steeper response curves at small ILDs were accompanied by increased variance, thus not leading to an overall improvement in ILD discrimination.

Lateralization performance based on envelope ITDs varied widely across participants (Fig. 3). However, all participants except BiCI07 had at least one electrode pair for which there was a significant correlation between perceived lateral position and imposed ITD (Spearman correlation on raw data points, p < 0.05).

FIG. 3

Perceived lateral position as a function of ITD plotted as in Figure 2. The sensitivity to ITDs is given as ITD discriminability (D_ITD) at the top of each panel. D_ITD values marked with an asterisk denote electrode pairs for which there was a significant Spearman correlation between lateral position and ITD.

Similarly to the calculation of D_ILD, a measure of “ITD Discriminability” (D_ITD) was calculated. D_ITD was between 256 μs (for BiCI09) and 6,402 μs (for BiCI07), both measured at the mid electrode pair (Fig. 3 and Table 3). Note that for BiCI07 there was effectively no lateralization of the sound away from midline for test ITDs up to 1,600 μs. For several other participants, ITD discrimination performance as given by D_ITD was still outside the physiologically useful range of about 700 μs (Feddersen et al. 1957).

Forward masking

We hypothesized that the rate of decay of forward-masked thresholds with increasing delay is related to localization performance in reverberation. To facilitate comparison across participants with different thresholds and EDRs, masked thresholds were normalized. Masked thresholds were converted into a normalized level (NL), equal to the difference between the masked threshold at the delay in question and the masked threshold at a delay of 100 ms, expressed as a percentage of the EDR:

$$ \mathrm{NL} = \frac{100\,\left( S - \mathrm{TH}_{100} \right)}{\mathrm{EDR}}, $$

where S is the current level at masked threshold for the delay in question, TH100 the masked threshold current level at 100-ms delay, and EDR the EDR of the test electrode. All thresholds entered the calculation in Cochlear current units. This procedure differs from that commonly used in other forward masking studies with direct stimulation (e.g., Shannon 1990), which normalize relative to the absolute threshold of the probe stimulus in isolation. The alternative approach was adopted in the present study because, for some participants, it was not possible to establish reliable absolute thresholds for the short probe stimulus in isolation. This was due to problems with the direct stimulation devices discovered after study completion. However, we are interested in the slope of the decay, which is little affected by the different normalization provided that the maskers were of equal loudness across subjects after normalization.
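
The normalization can be illustrated with a minimal Python sketch. The function name and the numerical values are hypothetical and chosen only for illustration; they are not data from this study. All inputs are assumed to be in Cochlear current units (CU).

```python
def normalized_level(masked_threshold_cu, threshold_100ms_cu, edr_cu):
    """Return the normalized level NL in percent of the EDR.

    NL = 100 * (S - TH100) / EDR, with all inputs in current units (CU).
    """
    if edr_cu <= 0:
        raise ValueError("EDR must be positive")
    return 100.0 * (masked_threshold_cu - threshold_100ms_cu) / edr_cu


# Hypothetical example: masked threshold of 150 CU at a short delay,
# 130 CU at the 100-ms delay, and an EDR of 50 CU.
print(normalized_level(150, 130, 50))  # 40.0, i.e., 40 % of the EDR
print(normalized_level(130, 130, 50))  # 0.0 at the 100-ms reference delay
```

By construction, NL is zero at the 100-ms reference delay, so values at shorter delays express the remaining masking as a fraction of each participant's EDR.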

Normalized masked thresholds decayed with increasing delay for all participants (Fig. 4). A broken stick function was fitted to the individual thresholds using a least-squares method and with delay expressed in logarithmic units. The function was defined to show a linear decay for delays up to a breakpoint, and was zero for longer delays. The line slope was used as a measure for the decay of forward masking. It was between −38.2 and −13.2 normalized units per decade.
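
One possible way to implement such a fit is sketched below in Python. The sketch makes assumptions that are not stated in the text: the function is taken to be continuous at the breakpoint, the breakpoint is found by a grid search over log-delay, and at least two data points are required on the sloping segment. The function name and the synthetic data are hypothetical.

```python
import math


def fit_broken_stick(delays_ms, nl_values, n_candidates=201):
    """Least-squares fit of NL = slope * (log10(delay) - xb) below xb, 0 above.

    Returns (slope per decade, breakpoint in ms, sum of squared errors).
    The breakpoint xb is searched on an evenly spaced grid in log10(delay);
    for each candidate the optimal slope has a closed form.
    """
    x = [math.log10(d) for d in delays_ms]
    lo, hi = min(x), max(x)
    best = None
    for k in range(n_candidates):
        xb = lo + (hi - lo) * k / (n_candidates - 1)
        below = [(xi - xb, yi) for xi, yi in zip(x, nl_values) if xi < xb]
        if len(below) < 2:  # assumption: need two points to define the slope
            continue
        sxx = sum(dx * dx for dx, _ in below)
        slope = sum(dx * y for dx, y in below) / sxx
        sse = sum((y - slope * dx) ** 2 for dx, y in below)
        sse += sum(yi ** 2 for xi, yi in zip(x, nl_values) if xi >= xb)
        if best is None or sse < best[2]:
            best = (slope, 10.0 ** xb, sse)
    if best is None:
        raise ValueError("not enough data points to fit a broken stick")
    return best


# Synthetic, noise-free data: -20 units per decade up to a 50-ms breakpoint.
delays = [2, 5, 10, 20, 50, 100, 200]
nl = [max(0.0, -20.0 * (math.log10(d) - math.log10(50))) for d in delays]
slope, breakpoint_ms, sse = fit_broken_stick(delays, nl)
print(round(slope, 1), round(breakpoint_ms))
```

The grid search keeps the sketch free of external dependencies; a nonlinear least-squares routine would serve equally well.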

FIG. 4

Normalized forward-masked thresholds for each participant with broken stick fits to the data (see text for details). The normalization was done relative to the individual electrodynamic range of the test electrodes. The slope of the first part of the broken stick fit is given in the top left corner of each panel.

Relations between localization performance and the basic psychoacoustic measures

Potential relations between localization performance in anechoic conditions, the decline in localization performance caused by room reverberation, and the outcomes of the direct stimulation tests were investigated. The detrimental effect of reverberation for each participant was quantified by the difference in the coefficient of determination r 2 between the anechoic and reverberant localization tests; recall that r 2 differed significantly between the two conditions. We chose r 2 rather than RMS error as it is more sensitive to small changes in the localization pattern due to reverberation. r 2 can be interpreted as a measure of the linearity of the translation of binaural cues into localized positions, which is of primary interest. Table 3 gives r 2 values and the difference in r 2 between anechoic and reverberant conditions along with a summary of individual performance in the direct stimulation tests, i.e., lowest D ITD and D ILD values across all tested electrode pairs and the forward masking slope. Note that we normalized D ILD for the prediction given below using the mean EDR for each participant (Table 2). The normalization accounts for the EDR-dependent compression applied in the CI system, which effectively normalizes the acoustic ILD on the speech processor by the EDR.

Multiple linear regressions were performed using the statistics toolbox of Matlab. The dependent variable was either the coefficient of determination r 2 Anechoic in the anechoic condition, or r 2 Reverb in the reverberant condition. In both cases the predictor variables were the measures derived from the direct stimulation tests (Table 3). Structure coefficients (b) that show the contribution of a single isolated factor to the regression model were calculated. These are correlation coefficients between each predictor variable and the predicted dependent variable, related to the overall coefficient of determination, R 2, of the model as a whole.
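
The analysis was run in Matlab; the following Python sketch only illustrates the general idea of an ordinary least-squares regression followed by structure coefficients, computed here as the Pearson correlation between each predictor and the model prediction. It is not the authors' code, the function names are hypothetical, and the numbers are made up for illustration and are not the values in Table 3.

```python
from statistics import mean


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


def fit_with_structure_coefficients(predictors, y):
    """Fit y = b0 + sum(b_k * x_k) by least squares (normal equations).

    predictors: dict mapping a predictor name to a list of values.
    Returns (coefficients incl. intercept, R squared, structure coefficients).
    """
    names = list(predictors)
    rows = [[1.0] + [predictors[k][i] for k in names] for i in range(len(y))]
    p = len(names) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    y_hat = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    r_squared = pearson(y, y_hat) ** 2  # valid for models with an intercept
    structure = {k: pearson(predictors[k], y_hat) for k in names}
    return beta, r_squared, structure


# Made-up data for seven hypothetical participants (not from Table 3).
example_predictors = {
    "D_ITD": [0.3, 0.5, 0.9, 1.2, 2.0, 3.1, 6.4],
    "D_ILD": [4.0, 5.0, 4.5, 6.0, 5.5, 10.0, 26.0],
    "FWD": [-38.0, -20.0, -25.0, -13.0, -30.0, -18.0, -22.0],
}
example_r2 = [0.95, 0.93, 0.90, 0.88, 0.80, 0.70, 0.40]
beta, r_squared, structure = fit_with_structure_coefficients(
    example_predictors, example_r2
)
print(round(r_squared, 2), {k: round(v, 2) for k, v in structure.items()})
```

Because a structure coefficient is a correlation with the model prediction, it describes how closely a single predictor tracks the fitted values, independent of the scaling of that predictor.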

Figure 5 shows that the explained variance of the model was high, and that it allowed for similarly good prediction of r 2 values in the anechoic and the reverberant condition. This is reflected in R 2 values of 0.84 for both conditions.

FIG. 5

Predicted coefficient of determination r 2 plotted against the coefficient of determination for localization in anechoic conditions (r 2 Anechoic; left panel) and in reverberant conditions (r 2 Reverb; right panel). The text in each panel gives the overall quality of the fit R 2, and the structure coefficients for each predictor variable (see text for details).

Focusing on the anechoic condition first, the main contributor to the prediction of r 2 Anechoic was the sensitivity to ILDs with a structure coefficient b ILD of −0.92, followed by the sensitivity to envelope ITDs with a structure coefficient b ITD of −0.87. Performance in the forward masking test contributed marginally to the prediction with a b FWD of 0.01. These results are generally consistent with other studies that found that localization with CIs in anechoic conditions is mediated mainly by ILDs, although here the relative contribution of ITDs is high (van Hoesel and Tyler 2003; Seeber and Fastl 2008; Grantham et al. 2008).

In the reverberant room, the main influence on the predicted r 2 Reverb came from the sensitivity to ITDs (b ITD = −1.03), although ILDs retained some importance indicated by a b ILD of −0.76. In the reverberant room, the forward masking decay also rose in importance, but remained small (b FWD = 0.24) compared to binaural cues. Overall, this suggests that good sensitivity to envelope ITDs is relatively more important for performance in reverberant than in anechoic rooms.

This hypothesis was tested directly by correlating the basic performance measures with the degradation of r 2 between anechoic and reverberant conditions, r 2 Anechoic − r 2 Reverb. The degradation in localization performance (Fig. 6, right column) was best predicted by the sensitivity to ITDs (Pearson correlation, r P = 0.92, p < 0.01). No other correlations were significant, despite the fact that in anechoic conditions r 2 Anechoic correlated most strongly with the normalized sensitivity to ILDs (Pearson correlation, r P = −0.77, p < 0.05; Fig. 6, left column). However, the ILD correlation has to be interpreted carefully since it was driven by an outlier. Most participants had good and similar ILD sensitivity (4–6 CU), leaving little scope to correlate. Even more surprising is the fact that, despite the similar and good ILD sensitivity, reverberation affected participants differently. The high correlation between the degradation of localization performance in reverberation (Fig. 6, right column) and ITD sensitivity suggests that good sensitivity to envelope ITDs helps CI users to maintain localization performance in reverberant rooms. In fact, going back to Figure 1, it could already be seen that those CI users with the best sensitivity to envelope ITDs (i.e., smallest D ITD) are also those with the least degraded localization performance in reverberation.

FIG. 6

r 2 in anechoic conditions (left column) and the degradation in r 2 (right column) plotted against the three predictive variables ITD discriminability, normalized ILD discriminability, and forward masking slope (top to bottom, respectively). Numbers in the panels are correlation coefficients (+ p < 0.05; *p < 0.01).

Users of the ACE processing strategy showed the highest sensitivity to envelope ITDs, while users of the SPEAK strategy were less sensitive. ACE users also performed better in the localization task in reverberation. However, neither the reduction of r 2 in the localization task nor D ITD was significantly different between the users of ACE and the users of SPEAK in this study (Wilcoxon rank sum test, two-sided, p > 0.06 for r 2 Anechoic − r 2 Reverb; p > 0.11 for D ITD).


Localization performance of bilateral CI users was assessed in anechoic space and in the simulated reverberant sound field of a common living room (Seeber et al. 2010). Localization ability expressed as r 2 values and RMS errors declined significantly in reverberation. To shed light on the underlying reasons for the decline, subjects completed several tests using direct electrical stimulation. While subjects’ sensitivity to ILDs predicted their localization performance in anechoic conditions best, sensitivity to ITDs in signal envelopes (D ITD) was the best predictor both for localization performance in reverberation and for the decline of localization performance due to reverberation. The slope of the forward masking decline added little to the predictions. We conclude that CI users maintain better localization performance in reverberation if they are able to exploit the envelope ITD cues provided by contemporary CI devices. However, further research with a larger and more diverse test group and other test stimuli will be necessary to clarify the role of other low-level psychoacoustic cues.

The extent of the ITD cue

The direct stimulation measures obtained here show that some, but not all, bilateral CI users are sensitive to envelope ITDs. ITD sensitivity was found to be between 256 and 6,402 μs, which agrees well with measurements in other laboratories (Laback et al. 2004; van Hoesel 2007; Laback et al. 2011). Laback et al. (2004) measured envelope ITD thresholds through clinical sound processors and found performances in the range of 259–1,400 μs for click-train stimuli, which is comparable with the results of the better performing bilateral CI users in the present study. Van Hoesel (2007) found slightly lower ITD thresholds of approximately 50–550 μs in his experiments for carrier pulse rates of 6,000 pps and modulation frequencies up to 400 Hz. Finally, Laback et al. (2011) used direct stimulation with a pulse rate more comparable to ours (1,515 pps) and found envelope ITD thresholds around 250 μs for their best performers. Note that we derived ITD thresholds from the results of a lateralization task, whereas Laback et al. and van Hoesel used a discrimination paradigm. The similarity in outcomes suggests that the supra-threshold measurement of lateralization yields similar results, though a study in the same subjects would be needed to confirm this.

The relatively low sensitivity to envelope ITDs and the high sensitivity to ILDs are consistent with previous localization studies in anechoic space showing that bilateral CI users localize based on ILDs with no, or only marginal, help from envelope ITDs (e.g., van Hoesel and Tyler 2003; Grantham et al. 2007; Seeber and Fastl 2008). This study adds the notion that in reverberant environments sensitivity to envelope ITDs becomes relatively more important to maintain localization performance. However, the fact that we used a very homogeneous group of good CI users (CI experience >0.5 years, no problems with the devices, speech reception thresholds below +6 dB) might have somewhat confounded the result. Sensitivity to ILDs was close to ceiling for most CI users, thus calling for some caution when interpreting the correlations with localization measures. Nevertheless, although most users showed similar, good sensitivity to ILDs, performance declined strongly for some listeners in reverberation, indicating that ILDs are not the only relevant cue.

The high sensitivity to ILDs in direct stimulation experiments is caused by the steep loudness growth function, which is compensated by the logarithmic compression in the clinical CI processor. However, for lateral sounds at high sound levels, caused for example by intense lateral room reflections overlapping with the target sound, the dynamic range limitation in the CI processor might disrupt the ILD cue (Kerber and Seeber 2012). CI users with sufficient sensitivity to envelope ITDs might be able to compensate for the disrupted ILD cue, thus showing better localization performance in a reverberant room. Note that, unlike the ILD cue, onset ITDs are not disrupted by the dynamic range limitations in the speech processor.

Normal-hearing listeners are usually unaware of the presence of reverberation in a room, possibly due to their ability to evaluate ITDs in the TFS and the envelope. The relative importance of TFS and envelope cues is debated in the literature. Devore and Delgutte (2010), in line with Rakerd and Hartmann (2010), found that reverberation has a more detrimental effect on sensitivity to envelope ITDs than on sensitivity to ITDs in the TFS. Studies in our lab showed that ITDs in envelope and TFS are similarly affected by reverberation (Monaghan 2012). Ruggles et al. (2012) investigated how interaural coherence was degraded by reverberation. Interaural coherence is a measure of the similarity of signals across ears and thus important for the evaluation of ITDs. They found that reverberation reduces envelope coherence less than coherence of the fine structure, making the latter a less robust cue for localization in reverberation. While this suggests that the theoretical benefit from additional TFS information may be limited in CI users, the brain may optimally use and integrate this information to gain benefit in specific listening situations. However, it remains an open question what these situations would be and what degree of benefit could be obtained.

Considering the signal waveform itself, it is not surprising that sensitivity to envelope ITDs helps localization in reverberation. Reflections overlap with the direct sound only after a short delay, so signal onsets carry uncorrupted binaural information even in a reverberant room (Rakerd and Hartmann 1986); thereafter, the reflections disrupt binaural information. CI users with sensitivity to envelope ITDs should be able to extract ITDs from the onset, though not from the uncorrupted TFS in the non-overlapping part near the onset. Amplitude modulation found in natural stimuli will create additional “onsets” prior to modulation maxima, thus providing the auditory system with “multiple looks” of the ITD in the envelope.

Users of the ACE processing strategy showed somewhat less degraded localization performance in reverberation than users of the SPEAK strategy, but the difference did not reach significance. SPEAK uses a lower rate (250 pps) than ACE (900 pps) which might affect the accuracy of encoding onsets. Users of ACE also showed better sensitivity to ITDs than users of SPEAK. It might thus be that ACE users, due to the better availability of onset ITDs in their devices, have learned to use ITD cues after implantation while SPEAK users have not. To our knowledge, this hypothesis has not been explored.

Improving envelope ITD perception in CI users

This study has demonstrated that good sensitivity to envelope ITDs is important for localization in challenging acoustic environments. The question remains how sensitivity to envelope ITD can be restored in CI users who currently lack it, and how better access to envelope ITDs can be facilitated by the processing in CI devices.

Auditory training may help to partially restore sensitivity to envelope ITDs. Training has proven successful in related domains, e.g., to improve speech understanding with CIs (Fu and Galvin 2008; Wilson and Dorman 2008). Rowan and Lutman (2006) studied to what extent sensitivity to envelope ITDs can be trained in normal-hearing listeners. Participants discriminated ITDs in the envelope of amplitude-modulated tones with carrier frequencies between ears being either identical or mistuned. This simulates the mismatch which might occur because of incomplete electrode insertion in a bilateral CI fitting. Rowan and Lutman showed that learning of envelope-ITD discrimination was possible in ten of their 16 normal-hearing participants, even with mistuned carrier frequencies. To our knowledge, there has been no research on binaural sensitivity training with CI users. Furthermore, it remains an open question whether better performance in an ITD training task would also translate to better localization performance in reverberation.

Sharpening signal envelopes enhances ITD sensitivity in normal-hearing listeners (e.g., Bernstein and Trahiotis 2009) as well as in CI users (Laback et al. 2011). Therefore, signal processing strategies which sharpen envelopes might help CI users to use envelope ITD cues more effectively. Such a strategy was developed and evaluated by Monaghan (2012) in our laboratory. The strategy identifies specific envelope peaks in each CI channel and sets the preceding audio samples to zero. This effectively sharpens the signal envelope in onsets belonging to the direct sound. An evaluation of lateralization performance in vocoder studies with normal-hearing listeners showed greater extents of laterality and better discrimination of envelope ITDs carried in the direct sound. Speech intelligibility was unaffected by the processing. Such strategies could thus be a worthwhile extension to CI processing algorithms.