Rationale
The choice of paradigm to be used to study concurrent sound segregation by CI listeners is somewhat constrained compared to those previously employed with NH listeners. One approach to studying segregation in NH has been to manipulate the influence of a component or subset of components on the pitch, timbre, or phonetic identity of a complex sound, by varying some physical parameter such as their onset times or frequencies (Bregman and Pinker 1978; Darwin 1981; Darwin and Sutherland 1984; Darwin and Gardner 1986; Darwin and Ciocca 1992; Hukin and Darwin 1995). Such an approach requires a good understanding of what the perceptual effect of removing that component or those components would be. Unfortunately, we do not have such a priori knowledge when considering the effects of removing or segregating one channel of electrical stimulation.
A second approach, which has been used profitably in NH, is to require subjects to determine whether an isolated “probe” tone is also present as a component of a complex tone (e.g., Hartmann et al. 1990). We adapted this approach by requiring CI listeners to compare stimulation on a single channel of their implant to that applied concurrently to four channels. We then introduced manipulations, such as differences in onset times and pulse rate between the “target” and “nontarget” channels in the mixture, and investigated whether, by improving segregation of the target, these manipulations increased performance on the task. This increase was measured relative to a baseline task in which these additional cues were absent. Because we did not want our CI subjects to be faced with an impossible task in any condition, we increased the current applied to the target channel, relative to that of the others, in all conditions including the baseline.
Subjects
Five adult postlingually deafened users of the Nucleus CI24 implant took part. Their details are given in Table 1.
TABLE 1 Details of the CI users who took part in the experiments
Experiment 1a: overview of conditions
Experiment 1a consisted of a “baseline” condition and several experimental conditions. The baseline condition is illustrated schematically in Figure 1a. Each half of each two-interval trial consisted of a 400-ms “mixture” preceded, 200 ms earlier, by a 400-ms “probe.” The two halves of the trial were separated by 2 s. The mixture consisted of four 100-pps pulse trains, each presented concurrently to one of four electrodes. Because the electrodes used differed from subject to subject, we will refer to the four electrodes as A, B, C, and D, where A is the most apical and D the most basal. Each electrode in the mixture except one was presented at the same percentage of the dynamic range (DR) for that electrode when presented alone. One electrode, the “target,” was presented at a higher percentage of its DR than the rest. This is indicated in Figure 1a by the “taller” pulses on channel B. The target electrode was always B or C and was the same for the two presentations in each trial. In one presentation in a trial the probe was identical to the pulse train presented on the target electrode in the mixture. This presentation was termed the “signal” interval. In the “standard” interval, the probe was the same as the pulse train applied to the other possible target electrode (i.e., C or B) and had the same level as on those trials in which that other electrode was the target.
It is worth pointing out two important features of the above design. First, although the target electrode was presented at a higher current level than the others, subjects could not perform the task simply by comparing the two mixtures on each trial, in a way analogous to “profile analysis” in NH (Green 1988), because the two mixtures were identical on each trial. Second, although the probes differed between the two halves of the trial, subjects could not perform the task by identifying the “signal” probe, because this differed from trial to trial. Rather, the subject had to compare the probe to one channel in the mixture: they were instructed to identify that interval where the probe was most clearly present in the subsequent mixture. A practice run was performed before each test condition, but no feedback was given during the tests.
The “Δrate” condition is illustrated in Figure 1b. It differed from the baseline condition in that the pulse rate applied to the target electrode was reduced from 100 to 77 pps. We wished to determine whether this rate difference helped the listener to “hear out” the target in the mixture, thereby improving performance relative to the baseline condition. Because we wished to maximize the chances of subjects being able to exploit rate differences, we chose to introduce a difference between two quite low rates, where sensitivity in a sequential rate discrimination task is quite good, and to avoid higher rates where performance in sequential tasks often deteriorates (Shannon 1983; Tong and Clark 1985; Townshend et al. 1987; McKay et al. 2000; Zeng 2002). It is also worth noting that, at least for sequential tasks, discrimination of these low-rate pulse trains is better than that of the modulation rate applied to high-rate carriers, such as those used in some CI speech-processing strategies (Baumann and Nobbe 2004). By choosing stimuli that are discriminable in a sequential task, we hoped to maximize our chances of observing any effect of Δrate in the segregation of concurrent stimuli. In this Δrate condition, the current level on all channels was kept the same as in the baseline condition, because the function relating loudness to pulse rate is flat between 77 and 100 pps (McKay and McDermott 1998).
Figure 1c illustrates one trial in the “Delay” condition. It was identical to the baseline condition except that the pulse train presented to the target electrode started 200 ms after that applied to the others, and ended at the same time as them. We wished to determine whether this onset delay helped listeners to “hear out” the target in the mixture, thereby improving performance relative to the baseline condition. Because this reduced the duration of the pulse train on the target channel from 400 to 200 ms, the duration of the probe was also reduced to 200 ms.
Two further experimental conditions were included. As Figure 2 shows, introducing a rate difference also introduces a large and time-varying asynchrony between the pulses in the target and nontarget channels. To determine whether any effect of the rate difference was due to this asynchrony, we included the “Asynch” condition, in which the pulses in the target channel were delayed by 5 ms (Fig. 2). We also included an “AsynchDelay” condition, which was a combination of the “asynchrony” and “delay” cues. It was identical to the Asynch condition, except for the introduction of a 200-ms onset delay and reduction of the probe duration to 200 ms.
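The time-varying asynchrony produced by a rate difference can be illustrated with a short calculation. The sketch below computes, for each pulse of a 77-pps train, its lag behind the most recent pulse of a 100-pps train. It assumes that both trains start at time zero and ignores the small within-period offsets between channels; the function name is illustrative and is not part of the experimental software.

```python
from fractions import Fraction

def pulse_lags_ms(target_pps, nontarget_pps, duration_ms):
    """Lag (ms) of each target pulse behind the latest nontarget pulse.

    Assumption for illustration: both pulse trains start at t = 0.
    """
    target_period = Fraction(1000, target_pps)        # ms between target pulses
    nontarget_period = Fraction(1000, nontarget_pps)  # ms between nontarget pulses
    lags = []
    t = Fraction(0)
    while t < duration_ms:
        # Time elapsed since the most recent nontarget pulse.
        lags.append(float(t % nontarget_period))
        t += target_period
    return lags

# 77-pps target against 100-pps nontargets over a 400-ms mixture.
lags = pulse_lags_ms(77, 100, 400)
```

Under these assumptions the lag changes from pulse to pulse across the 10-ms nontarget period, whereas in the Asynch condition it would be fixed at 5 ms.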
Each biphasic pulse in each channel consisted of two 100-μs phases of opposite polarity separated by a gap of 43 μs. All presentation was in so-called “BP+1” mode, with the return electrode being separated from the stimulating electrode by a single “unused” contact. The timing of pulses within each period in the baseline condition is shown schematically in Figure 3, for the case where the target was on channel B. There was a 50-μs gap between successive pulses, and, in all conditions, the order was basal to apical except that the last pulse in each period was always on the target channel. The four pulses were repeated every 10 ms, to give a rate of 100 pps per electrode, except for the target channel in the Δrate condition, where the rate was 77 pps. To check that subjects were not performing the task by attending to the very last pulse in the stimulus, which was also always on the target channel, we included an additional baseline condition, “BaseDrop,” in which this final pulse was dropped.
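As a rough illustration of the within-period timing described above, the following sketch lists the onset time of each electrode's pulse in one period. It assumes that the 50-μs gap runs from the end of one biphasic pulse to the start of the next, which the text does not state explicitly; the function and constant names are hypothetical.

```python
PHASE_US = 100       # duration of each phase of the biphasic pulse
INTERPHASE_US = 43   # gap between the two phases
GAP_US = 50          # gap between successive pulses (assumed end-to-start)

def pulse_onsets_us(target, electrodes="ABCD"):
    """Onset (in microseconds) of each electrode's pulse within one period.

    Order is basal to apical (D first), except that the target comes last.
    """
    order = [e for e in reversed(electrodes) if e != target] + [target]
    pulse_us = 2 * PHASE_US + INTERPHASE_US  # total duration of one pulse
    return {e: i * (pulse_us + GAP_US) for i, e in enumerate(order)}

# Baseline condition with the target on channel B.
onsets = pulse_onsets_us("B")
```

With these assumptions the four pulses occupy only a little over 1 ms of each 10-ms period.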
Experiment 1a: preliminary measures
Prior to the start of the experiment, we selected the subset of electrodes to be tested for each subject. This choice was guided by the requirements that none of the electrodes in the set had abnormal impedances or were deactivated in clinical use for any other reason. The positions of electrodes A, B, C, and D are indicated separately for each subject in the last column of Table 1. The spacing between electrodes in the Nucleus CI24 device is 0.75 mm. Because each electrode was separated from its nearest neighbor by four electrode positions, the spacing between adjacent electrodes was 3 mm.
Once the electrode set had been selected, pretests confirmed that pulse trains on electrodes B and C were easily discriminable when presented on their own. Threshold (“T”) and most-comfortable (“C”) levels were obtained for each electrode and are shown in Table 2a. Thresholds were obtained using a two-interval forced-choice task and a 2-up 1-down adaptive procedure (Levitt 1971). Thresholds were estimated from the mean of the last eight turn-points in the procedure, for which the step size was five clinical units (CUs). C levels were obtained by presenting the stimulus at a moderate level and then increasing the level on subsequent presentations until it reached the maximum level that the subject said he/she would be comfortable listening to intermittently for 2 or 3 h.
TABLE 2 Part a shows the T and C levels for each electrode and subject, expressed in CUs. Part b shows the levels applied to the target and nontarget electrodes, expressed as a percentage of the DR for each electrode presented alone
We then performed a set of pilot measures to identify the levels to be used for each electrode and subject. The current levels on the targets and nontargets in a mixture were set so that they did not exceed a comfortable level when presented together. We also aimed to choose levels that produced performance which, when averaged across the two target electrodes, was neither at ceiling nor at chance. The number of measures needed to converge on this solution differed across subjects, but in all cases, the following procedures were adopted: (1) The current levels applied to each electrode of a four-electrode mixture were set to the same percentage of the subject’s DR for that electrode alone, defined as C–T in linear microamperes. These levels were then covaried using a loudness-balancing procedure (see below) so that the mixture’s loudness was the same as that of electrode B presented at 65% of DR. The resulting value, termed x% DR, was always less than 65 due to loudness summation across electrodes. (2) We then confirmed that electrodes B and C, presented at 65% of DR, had equal loudness. (3) The stimulus level for each target electrode was initially set to 70% DR, and that of the nontarget electrodes to x% DR. However, these target and nontarget levels sometimes had to be adjusted in order to avoid excessive loudness and/or to ensure that performance in the baseline condition was between chance and ceiling. The final levels chosen for each subject are shown in Table 2b. Because the target in each mixture had the same amplitude as the probe, and because the nontargets had current levels greater than zero, the loudness of the mixture was greater than that of the probe. Note that for one subject, CI5, the nontarget electrodes were set to −10% DR, meaning that each nontarget electrode was stimulated at a level below the threshold for that electrode alone.
These nontarget electrodes were, however, almost certainly audible when presented together; the threshold for all four electrodes presented together at the same percent DR was −42%. It therefore appears that this subject showed a large amount of loudness summation. Our research software specified levels in CUs, which we converted to microamperes using the formula \(\mu\text{A} = 10 \times 175^{\text{CU}/255}\). This formula, provided by Cochlear, was verified using a test implant and digital oscilloscope.
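The conversion from clinical units to current, and the expression of a level as a percentage of the DR, can be sketched as follows. The conversion follows the formula quoted above (10 times 175 raised to the power CU/255). The linear interpolation between T and C in microamperes is our reading of "C–T in linear microamperes"; the function names and the T and C values in the usage lines are hypothetical.

```python
def cu_to_microamps(cu):
    """Convert a level in clinical units (CU) to current in microamperes."""
    return 10.0 * 175.0 ** (cu / 255.0)

def level_at_percent_dr(t_cu, c_cu, percent):
    """Current (microamperes) at a given percentage of the dynamic range.

    The DR is taken as C minus T in linear microamperes, so 0% corresponds
    to T, 100% to C, and negative percentages fall below threshold.
    """
    t_ua = cu_to_microamps(t_cu)
    c_ua = cu_to_microamps(c_cu)
    return t_ua + (percent / 100.0) * (c_ua - t_ua)

# Hypothetical electrode with T = 150 CU and C = 200 CU.
target_ua = level_at_percent_dr(150, 200, 70)
nontarget_ua = level_at_percent_dr(150, 200, -10)
```

A negative percentage, such as the −10% DR used for the nontarget electrodes of subject CI5, yields a current below that at T for the electrode alone.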
When loudness balancing was performed, the two stimuli to be balanced were presented in random order, and the subject indicated which one was the louder. The level of the variable stimulus was then adjusted on the next trial by an amount that was initially 10% DR. When the direction of change in the variable stimulus had reversed twice, the step size was reduced to 5% DR, and the procedure continued until 10 reversals had taken place. The current levels at the last eight reversals were then averaged. This procedure was performed four times, twice with the variable stimulus starting at a low level and twice at a level that was louder than that of the fixed stimulus. These four estimates were then averaged. When a multielectrode stimulus was varied, the levels applied to each electrode were varied by the same percentage of the DR, as measured for that electrode when presented alone.
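The adaptive rule used for loudness balancing can be sketched in a few lines. The version below assumes that a "reversal" is recorded at the level at which the listener's response first changes the direction of adjustment, and it replaces the real subject with a hypothetical deterministic listener; all names and the balance point of 42% DR are illustrative only.

```python
def balance_loudness(variable_is_louder, start_level, big_step=10.0,
                     small_step=5.0, n_reversals=10, n_averaged=8):
    """Run one adaptive loudness-balancing track (levels in % DR).

    variable_is_louder(level) returns True when the variable stimulus is
    judged louder than the fixed one.
    """
    level = start_level
    step = big_step
    last_direction = None
    reversal_levels = []
    while len(reversal_levels) < n_reversals:
        # Move the variable stimulus towards the fixed stimulus's loudness.
        direction = -1 if variable_is_louder(level) else +1
        if last_direction is not None and direction != last_direction:
            reversal_levels.append(level)
            if len(reversal_levels) == 2:
                step = small_step  # smaller step after two reversals
        last_direction = direction
        level += direction * step
    tail = reversal_levels[-n_averaged:]
    return sum(tail) / len(tail)

def simulated_listener(level, balance_point=42.0):
    """Hypothetical listener: 'louder' whenever level exceeds balance_point."""
    return level > balance_point

# Two tracks starting low and two starting high, then averaged.
starts = [20.0, 20.0, 70.0, 70.0]
estimates = [balance_loudness(simulated_listener, s) for s in starts]
balanced_level = sum(estimates) / len(estimates)
```

With a noiseless listener the track simply oscillates around the balance point; with a real listener the averaging over reversals and over the four tracks reduces the influence of the starting level.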
Methods: experiment 1b
Experiment 1b investigated whether CI listeners can use rate, onset, or asynchrony cues in the absence of any additional current level boost applied to the target channel. The methods for experiment 1b were the same as for experiment 1a except that there was no level increment on the target channel; instead, all channels in the mixture were stimulated at the same percentage of their DR. Similarly, the probe was always presented at the same level as the corresponding channel in the mixture. There were no baseline conditions as, in the absence of onset, rate, or asynchrony cues, there was no a priori reason to expect the target channel to differ from any of the others. Two subjects, CI1 and CI2, were tested on all the conditions of experiments 1a and 1b. These subjects were selected for experiment 1b because they had shown a relatively large effect of onset delay in experiment 1a.
Procedure
In the main part of experiments 1a and 1b, each block of 20 trials consisted of a single condition with the target on electrode B on half of those trials and on electrode C on the other half. Subjects completed one practice block (10 trials) and two test blocks (40 trials) of each condition in turn before repeating the sequence of blocks in reverse order. The order was not formally counterbalanced but differed across subjects. This procedure was repeated until there were at least 160 trials per condition. The results were analyzed both with the data averaged across the two targets and with the two target electrodes considered separately. Our discussion of the results will start with the former analysis.
Results: experiment 1a
Data averaged across the two target electrodes
Percent-correct scores averaged across the two target electrodes are shown for each subject in Figure 4a. Performance in the two baseline conditions (“base” and “BaseDrop”) did not differ significantly from each other, indicating that the final pulse of the stimulus did not have an effect on performance. In the following discussion, we compare the effects of each manipulation relative to the baseline condition and consider whether any such effects are greater than would be expected from random variation.
Performance in the Δrate condition did not differ significantly from that in the baseline condition (t = 0.37, df = 4, p = 0.73), suggesting that subjects could not use rate differences to “hear out” one channel in a mixture. This is consistent with our previous finding, using simulations of CI hearing presented to NH listeners, that across-channel differences in rate do not provide a useful cue for concurrent sound segregation (Deeks and Carlyon 2004). Performance in the Asynch condition also did not differ significantly from baseline (t = 0.67, df = 4, p = 0.54). However, there is some evidence that these two negative findings did not simply arise from a null effect or from random variation. Figure 4b shows performance in each condition with that in the baseline condition subtracted out; it can be seen that, when an asynchrony changed performance in a particular direction, a change in rate produced a change in the same direction. This relationship is reflected by a significant correlation between the change in performance re baseline in these two conditions (r = 0.84, df = 4, p < 0.05). In principle, such a correlation between two difference scores could simply arise from the fact that the two differences have one score—that in the baseline condition—in common. Hence, even if scores were randomly distributed across conditions and subjects, then those subjects who happened to have a high score in the baseline condition would, on average, show a lower score when this baseline performance was subtracted out. However, the correlation between the two difference scores reported here remained significant when the correlation between each of these two difference scores and the baseline condition was partialed out (partial correlation = 0.97, t = 6.13, df = 2, p < 0.05). Our interpretation is that an asynchrony can either enhance or impair performance, and that any effect of a rate difference is driven by the across-channel asynchrony that it produces (cf. Fig. 2). 
A caveat is that, in common with many CI experiments, we tested a fairly small number of subjects and that correlations obtained with low numbers of observations should be treated with some caution.
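The partial-correlation analysis described above can be sketched as follows. The code uses the standard first-order partial correlation formula and the usual t statistic with n − 3 degrees of freedom; whether the original analysis was computed in exactly this way is an assumption. The scores in the usage lines are invented for illustration and are not the data of Figure 4.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y with z partialed out, plus t and df."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    r = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    df = len(x) - 3
    t = r * math.sqrt(df / (1 - r ** 2))
    return r, t, df

# Hypothetical percent-correct scores for five subjects.
baseline = [70, 65, 80, 75, 60]
d_rate = [2, -5, -2, 5, -2]     # rate condition minus baseline
d_asynch = [4, -7, -3, 7, -3]   # Asynch condition minus baseline
r, t, df = partial_correlation(d_rate, d_asynch, baseline)
```

With five subjects the test has only two degrees of freedom, which is one reason why such correlations should be interpreted cautiously.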
The only condition to provide a significant improvement re the baseline condition was the Delay condition, in which there was a 200-ms onset delay on the target channel (t = 3.87, df = 4, p = 0.018). This condition also differed significantly from the BaseDrop condition (t = 3.13, df = 4, p = 0.035). However, it should be noted that this improvement was generally modest, ranging from 2 to 11 percentage points. The small size of the improvement is perhaps surprising, given the important role for onset delays that has been reported in several experiments with NH listeners (Darwin and Carlyon 1995). The remainder of the experiments reported here focus on the usefulness of such onset delays.
Data analyzed separately for the two target electrodes
The percentage-correct scores are shown separately for each target and for each subject and condition in Table 3a. It can be seen that, for a given subject, performance could differ between the two types of target. When this happened, the difference in performance was usually consistent across conditions—for example, subject CI 3 always did better with the target on electrode C, whereas CI 5 usually did better with the target on electrode B. These differences in performance between the two target electrodes may have been due to a simple response bias, or to the “partial loudness” of the two targets not being equal. Differences in partial loudness could arise if partial masking produced by adjacent electrodes were not equal for the two targets.
TABLE 3 Results of experiments 1a (part a) and 1b (part b) shown separately for trials where the target was on channel B and on channel C
Table 3a also shows that, although performance in the baseline condition was always below ceiling when averaged across the two electrodes (Fig. 4a), this was not always the case for each electrode when analyzed separately. This raises the possibility that the near-ceiling performance for some electrodes could have obscured differences between conditions. To test this idea, we repeated the analyses described in the previous section, taking into account only the data for the target electrode yielding the worse performance for each subject: electrode B for CI 1, CI 3, and CI 4, and electrode C for the other two subjects. The general pattern of results was the same as when the data were averaged across electrodes. Specifically, paired-sample t tests revealed no significant improvement re baseline in the Δrate (p = 0.77) and Asynch (p = 0.50) conditions, but a significant improvement in the Delay condition (p < 0.01). Performance in the AsynchDelay condition was also marginally (p = 0.05) better than that in the baseline condition.
Results: experiment 1b
Data averaged across the two target electrodes
Figure 5 shows the results for the two subjects, CI 1 and CI 2, who took part in experiment 1b. Because, in this experiment, there was no consistent level cue to the target channel, we would expect that, in the absence of any additional cue, performance would be at chance (50%). It is therefore instructive to compare the scores obtained in each condition and for each subject to the chance level of 50%. Instances where scores differ significantly from chance are indicated by asterisks in the figure. A significant difference was defined as one where chance performance (50%) did not fall between the 95% confidence limits surrounding a given score.
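The comparison with chance can be illustrated with a simple calculation. The sketch below uses the normal approximation to the binomial distribution to obtain 95% confidence limits around a proportion correct; the text does not specify how the limits were computed, so this method, the function name, and the trial counts in the usage lines are assumptions.

```python
import math

def differs_from_chance(n_correct, n_trials, chance=0.5, z=1.96):
    """Return (significant, (lower, upper)) for a proportion-correct score.

    The score differs significantly from chance when the chance level lies
    outside the approximate 95% confidence limits around the score.
    """
    p = n_correct / n_trials
    half_width = z * math.sqrt(p * (1 - p) / n_trials)
    lower, upper = p - half_width, p + half_width
    return not (lower <= chance <= upper), (lower, upper)

# Hypothetical scores out of 160 trials.
above, limits_above = differs_from_chance(100, 160)
near, limits_near = differs_from_chance(88, 160)
```

Note that a score can differ significantly from chance in either direction, so performance significantly below 50% is flagged in the same way as performance above it.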
The effects of the Δrate and Asynch cues were consistent with those obtained for the same subjects in experiment 1a. For subject CI 1, performance in the Asynch and Δrate conditions was worse than baseline in experiment 1a and significantly worse than chance in experiment 1b. For subject CI 2, performance in these two conditions was similar to baseline in experiment 1a and close to chance in experiment 1b. The fact that, for each subject, the effect of Δrate and of an asynchrony was similar is consistent with our conclusion that the effect of a rate difference is driven by the asynchrony that it produces.
The effect of the 200-ms onset delay was also consistent with experiment 1a in that performance was significantly above chance for both subjects. No other differences were significant, except for subject CI 2 in the AsynchDelay condition. Performance for this condition and subject was also significantly above baseline in experiment 1a.
Data analyzed separately for the two target electrodes
Performance analyzed for the two targets separately is shown in Table 3b. It can be seen that listener CI 1 shows a large difference in performance between the two targets, being below chance for electrode B and above chance for electrode C. This could be due to a response bias, or to the partial loudness for electrode C being higher, causing it to “pop out” of the mixture even when there was no additional cue to make it do so. The effect of this asymmetry would be to reduce the overall percentage correct when averaged across the two electrodes; an extreme asymmetry would limit performance to 50% in all conditions, even if the subject were sensitive to the cues introduced in particular conditions. Hence, the average scores shown in Figure 5 probably underestimate this subject’s underlying sensitivity. However, the asymmetry would strongly affect our interpretation of the pattern of results only if it varied markedly across conditions. To test this idea, we calculated a new measure in which we first defined a “hit” as the case where the subject correctly identified an interval when the target and probe were on electrode B, and a “false alarm” as the case when the target was on electrode C but the subject picked the interval when the probe was on electrode B. We then calculated a measure of asymmetry, or “bias,” in a way similar to that used to calculate the criterion “c” in a yes–no task: bias = −0.5[z(H) + z(FA)] (Macmillan and Creelman 1991). The absence of any bias would yield a score of zero. The resulting values were similar across conditions for both subjects: for CI 1 they varied between 1.24 and 1.48 across conditions, and for CI 2 they ranged from −0.03 to +0.52.
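The bias measure defined above, bias = −0.5[z(H) + z(FA)], can be computed as in the sketch below, where z is the inverse of the standard normal cumulative distribution. The hit and false-alarm rates in the usage lines are hypothetical and the function name is illustrative.

```python
from statistics import NormalDist

def response_bias(hit_rate, false_alarm_rate):
    """Bias = -0.5 * [z(H) + z(FA)], analogous to the criterion c.

    H is the proportion of target-on-B trials answered correctly; FA is the
    proportion of target-on-C trials on which the B-probe interval was chosen.
    Rates of exactly 0 or 1 are not handled and would raise an error.
    """
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return -0.5 * (z(hit_rate) + z(false_alarm_rate))

# Hypothetical rates: no bias, and a reluctance to choose the B interval.
unbiased = response_bias(0.7, 0.3)
biased = response_bias(0.4, 0.1)
```

When the hit and false-alarm rates are symmetric about 0.5 the measure is zero; positive values indicate a tendency to avoid the interval in which the probe was on electrode B.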