INTRODUCTION AND OUTLINE

Psychophysical studies have revealed that there are two basic cues for pitch perception in cochlear implant subjects (Tong et al. 1982; Townshend et al. 1987) and that these cues are independent of each other (McKay et al. 2000; Tong et al. 1983). The first cue is related to the site of excitation along the cochlea (place pitch). The more the excitation is located toward the base, the higher the pitch. The second cue is related to the repetition rate in time of a stimulation sequence on one location (temporal pitch); the higher the repetition rate, the higher the pitch.

Temporal pitch is known to saturate at rates above 300 Hz for most listeners (Shannon 1983), and the just noticeable differences (JNDs) in rate of cochlear implant subjects are considerably larger than those of normal-hearing subjects (Townshend et al. 1987), even when the normal-hearing subjects use purely temporal pitch cues (Carlyon and Deeks 2002). However, temporal pitch cues in cochlear implant recipients have been shown to provide a means for melody recognition (Pijl and Schwarz 1995) and musical interval identification (McDermott and McKay 1997; Pijl 1997).

Place pitch sensitivity in cochlear implants is usually measured in an electrode discrimination task. The subjects have to indicate the highest (or different) one out of two or more stimuli that are played on different channels (Tong and Clark 1985; Townshend et al. 1987). Similarly, scaling tasks have been used where subjects had to indicate quantitatively the perceived pitch of a certain sound. However, it is difficult to extract sensitivity measures from scaling experiments (e.g., Busby et al. 1994; Cohen et al. 1996b). Both measures (scaling and discrimination) are highly correlated (Collins et al. 1997) but are not completely equivalent, because a second perceptual dimension (apart from place pitch) also changes as electrode insertion depth is varied (Collins and Throckmorton 2000). This second dimension is of minor influence, however, and it is unclear what this second dimension relates to. The scaling task measures the difference in place pitch while the discrimination task takes into account both perceptual dimensions.

Studies of place pitch sensitivity have typically revealed a large intersubject variability, ranging from near-perfect discrimination of adjacent channels to weak discrimination of channels at opposite extremes of the electrode array (Nelson et al. 1995). This large range of JNDs and intersubject variability has been corroborated by other studies (Busby and Clark 1996; Collins et al. 1997; McKay et al. 1999; Pfingst et al. 1999; Zwolan et al. 1997). Reducing the loudness of the stimuli tends to decrease place pitch sensitivity or electrode discrimination, though the effect is small (McKay et al. 1999) and might not be present in all subjects (Pfingst et al. 1999). Random roving of the loudness also decreases place pitch sensitivity (Henry et al. 2000).

The perceived magnitude of place pitch has been shown to be related to the center of gravity of the forward-masking pattern for single-channel stimuli and thus to the center of gravity of the excitation pattern (Cohen et al. 1996a). Furthermore, intermediate place pitches can be elicited by stimulating adjacent channels simultaneously (Townshend et al. 1987) or nonsimultaneously with temporally interleaved pulses (McDermott and McKay 1994).

Although place pitch has not been studied extensively in a musical context, McDermott and McKay (1997) studied musical interval identification using single-channel pulse trains in one musically trained subject. They found that musical interval identification is possible using place pitch although a large variability existed in the responses, and that the size of the interval is approximately linearly related to distance along the electrode array.

Most studies that report on place pitch discrimination use bipolar stimulation, where both the active and the return electrode are intracochlear. However, in current clinical practice, more and more cochlear implant devices are stimulated in monopolar mode where the return electrode is distant and located outside the cochlea (Pfingst et al. 2001). Monopolar stimulation requires lower stimulation levels and can lead to better speech perception in cochlear implant recipients (Zwolan et al. 1996). In the present study, place pitch discrimination was determined using monopolar stimulation.

Studies of pitch sensitivity in cochlear implant subjects, based on either temporal or place pitch cues, typically involve simple stimuli, usually stimulating only one channel or electrode at a time. However, current speech processors stimulate several or even all channels, either simultaneously or with temporally interleaved pulses, depending on the stimulation strategy used. Few studies have measured the effect of multiple-channel activation on pitch sensitivity. In a study by Geurts and Wouters (2001), the discrimination of the fundamental frequency of synthetic vowels was measured after processing with a CIS strategy. They found that both temporal and place pitch cues can contribute to the discrimination of the fundamental frequency of synthetic vowels. In an additional temporal pitch discrimination experiment, they concluded that multiple-channel stimulation improves temporal pitch sensitivity. However, in their experiment the loudness of the multichannel stimuli was higher than the loudness of the single-channel stimuli. This might have contributed to better discrimination for the multichannel stimuli because rate discrimination improves as loudness increases (Morris and Pfingst 2000).

In the present study we measure the effect of multiple-channel stimulation on place pitch sensitivity. We hypothesize that place pitch sensitivity will be affected by the number of active channels. Place pitch sensitivity might decrease because of the increased overlap in excitation patterns. Similarly, place pitch sensitivity might decrease with an increasing number of active channels if the sensitivity is based upon the peakedness of the excitation pattern. In contrast, the distance between the channels that are not common to both stimuli in one trial becomes larger as the number of active channels increases, although the overall shift of active channels is fixed. This larger distance might improve place pitch sensitivity. Finally, if place pitch discrimination is based upon the size of the difference in excitation at the edges or flanks of the excitation patterns, it will be unaffected by the number of active channels.

METHODS

A pitch-ranking experiment using single- and multichannel stimuli examined whether the number of active channels affects the subjects’ sensitivity to place pitch. The channels were stimulated in monopolar mode. JNDs were estimated from this pitch-ranking experiment and expressed as a function of the number of active channels. The multichannel stimuli were loudness balanced to the single-channel stimuli in order to minimize cues related to loudness. The subjects were trained with the psychophysical task prior to the start of the experiment in order to minimize training effects.

Subjects

Four postlingually deafened users of the Nucleus CI24 cochlear implant participated in this study. All subjects performed all tests and were paid for their collaboration. Some relevant details about the subjects can be found in Table 1. Subjects S1 and S2 are implanted with the straight version of the Nucleus electrode array CI24R(ST), Nucleus24k. Subjects S3 and S4 are implanted with the perimodiolar version of the Nucleus electrode array CI24R(CS), Nucleus24 Contour. All subjects achieve relatively good speech recognition.

Table 1 Relevant information about each of the subjects who participated in the experiments. The last column shows the results for a phoneme recognition task with monosyllabic consonant–vowel–consonant (CVC) words of the NVA list (Wouters et al. 1994)

Stimuli

The stimuli in the experiments consisted of constant-amplitude biphasic pulse trains with phase duration and interphase gap set to 25 μs and 8 μs, respectively. The pulse rate was 900 pulses per second (pps) per channel. The stimulation mode was monopolar with both return electrodes active (MP 1 + 2), so every stimulation channel consisted of one intracochlear electrode and two extracochlear electrodes. All stimuli were 500 ms in duration with a 500-ms silent gap in between the two stimuli of a test trial.

The single-channel stimuli consisted of 450 pulses of equal amplitude presented to the desired channel. The multichannel stimuli consisted of several adjacent single-channel stimuli presented with pulses temporally interleaved. This was done by delaying every single-channel pulse train added to the stimulus by 138 μs (equivalent to the interpulse interval of an ACE strategy with 8 maxima at 900 pps per channel) relative to the last added pulse train. This allows a maximum of 8 simultaneously active pulse trains. Multichannel stimuli are indicated by their center electrode for stimuli containing an odd number of active channels. For stimuli containing an even number of channels, the stimuli are indicated by the more apical electrode of the middle two electrodes. The stimuli used in the present study were centered on electrode 13 or 17 and the number of active intracochlear channels was 1, 2, 3, 4, 5, or 8 depending on the condition. The electrodes are numbered from base to apex, according to clinical practice of Cochlear devices. Figure 1 displays a detail of two examples of stimuli: a single-channel stimulus and a 4-channel stimulus both centered on electrode 17.

Figure 1

Amplitude–time waveforms of parts of two different stimuli, both centered on electrode 17. Electrodes are numbered from base to apex. The upper panel displays a part of a single-channel stimulus. The lower panel displays a part of a four-channel stimulus.
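The stimulus construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the order in which channels are added (base to apex) is an assumption, since the text only states that each added pulse train is delayed by 138 μs relative to the previous one.

```python
PULSE_RATE_PPS = 900     # per-channel pulse rate (from the text)
DURATION_S = 0.5         # stimulus duration (from the text)
CHANNEL_DELAY_US = 138   # delay between successively added pulse trains (from the text)
MAX_CHANNELS = 8         # at most 8 interleaved pulse trains fit in one period


def active_electrodes(center, n_channels):
    """Electrodes of the stimulus labeled `center` (numbered base to apex).

    Odd n_channels: `center` is the middle electrode.
    Even n_channels: `center` is the more apical (higher-numbered) of the
    two middle electrodes.
    """
    lowest = center - n_channels // 2
    return list(range(lowest, lowest + n_channels))


def pulse_onsets_us(center, n_channels):
    """Map each active electrode to its pulse onset times in microseconds."""
    if not 1 <= n_channels <= MAX_CHANNELS:
        raise ValueError("n_channels must be between 1 and 8")
    n_pulses = round(PULSE_RATE_PPS * DURATION_S)   # 450 pulses per channel
    period_us = 1e6 / PULSE_RATE_PPS                # about 1111 microseconds
    onsets = {}
    # Assumption: channels are added in base-to-apex order.
    for k, electrode in enumerate(active_electrodes(center, n_channels)):
        onsets[electrode] = [k * CHANNEL_DELAY_US + i * period_us
                             for i in range(n_pulses)]
    return onsets


# A biphasic pulse lasts 25 + 8 + 25 = 58 microseconds, so the eighth pulse
# train (offset 7 * 138 = 966 microseconds) still ends within one period.
four_channel = pulse_onsets_us(17, 4)
print(sorted(four_channel))   # electrodes of the 4-channel stimulus on 17
```

For the 4-channel stimulus centered on electrode 17, the labeling rule gives electrodes 15 to 18, with 17 being the more apical of the two middle electrodes.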

The amplitude of the single-channel stimuli was at C-level or 100% of the dynamic range for the respective channel and subject. The amplitude of the multichannel stimuli was set to a fixed percentage between 0% and 100% of the dynamic range expressed in clinical current units. The clinical current units follow a logarithmic scale and one clinical current unit corresponds to approximately 0.17 dB. The amplitudes of all channels of one multichannel stimulus were set to the same percentage. This percentage was chosen to make the single- and multichannel stimuli equally loud and was obtained through the loudness-balancing procedure discussed later. The amplitude of the pulses to each channel in a multichannel stimulus was set to the current level that corresponded to this percentage of the dynamic range for the respective channel. Consequently, the amplitude of pulses was constant within a channel but not across channels. Setting the amplitude for each channel at the same percentage of the dynamic range makes all channels approximately equally loud when presented separately, at least when assuming that the loudness growth functions as a function of dynamic range are similar over neighboring channels. This assumption is also used in the clinical implementation and fitting of speech processors.

The dynamic ranges for each channel and each subject were obtained from the clinically determined T- and C-levels for single-channel pulse trains at 900 pps.

To minimize cues related to loudness, random current intensity roving was applied by multiplying each stimulus’ amplitude by a random value between 0.85 and 1.1, where the amplitude was expressed in clinical current units relative to the T-level. The roving factor was the same for all channels of a multichannel stimulus.
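The amplitude setting and roving described above can be illustrated with a short sketch. The function names and the T- and C-level values used in the example are hypothetical, and levels are kept as floating-point numbers here, whereas a real device would use integer clinical current units.

```python
import random


def current_level(t_level, c_level, percent_dr, rove=1.0):
    """Current level (clinical units) for one channel.

    percent_dr is the percentage of the dynamic range (0 = T-level,
    100 = C-level). The roving factor scales the amplitude expressed
    relative to the T-level, as described in the text.
    """
    above_t = (percent_dr / 100.0) * (c_level - t_level)
    return t_level + rove * above_t


def roved_levels(t_levels, c_levels, percent_dr, rng=random):
    """Levels for all channels of one stimulus with a shared roving factor."""
    rove = rng.uniform(0.85, 1.1)   # one random factor per stimulus
    return [current_level(t, c, percent_dr, rove)
            for t, c in zip(t_levels, c_levels)]


# Hypothetical T- and C-levels for a 3-channel stimulus at 40% of the
# dynamic range; the values are made up for illustration only.
print(roved_levels([100, 105, 110], [180, 185, 195], 40, random.Random(0)))
```

Because every channel uses the same percentage and the same roving factor, the pulse amplitude is constant within a channel but differs across channels whenever their T- and C-levels differ.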

Loudness balancing

The multichannel stimuli were loudness balanced for each subject to the loudness of a single-channel stimulus. This single-channel stimulus was presented at full dynamic range or C-level and at the center of the respective multichannel stimulus. The loudness balancing procedure was performed for stimuli centered on electrodes 17, 13, and 6 for subjects S2 and S4. For subjects S1 and S3, the loudness balancing was done for electrodes 17 and 13.

The loudness-balancing procedure was based on a 2-interval–2-alternative forced-choice (2I2AFC) task. The subject was presented, in random order, with the single-channel stimulus at C-level and a multichannel stimulus whose amplitude was varied, and was asked to indicate the louder of the two. The amplitude of the multichannel stimulus was adaptively varied using a 1-up 1-down procedure (Levitt 1971). The amplitude of the pulses in the multichannel stimuli was varied in steps of 5% of the dynamic range for amplitudes between 10% and 100% of the dynamic range. For amplitudes between 0% and 10% of the dynamic range, the amplitude was varied in steps of 1%.

The subjects performed two runs of the adaptive procedure for each center electrode and each number of active channels. At the start of the first run, the multichannel stimulus (presented at 10%–20% of the dynamic range) sounded softer than the single-channel stimulus. In contrast, the second run started with the multichannel stimulus (presented at 80%–100% of the dynamic range) sounding louder than the single-channel stimulus.

Each run was stopped after 8 reversals. The mean of the last 4 reversals of both runs (and rounded to the accuracy of one of the steps of the dynamic range used in the loudness balancing) was taken as the amplitude of the multichannel stimulus after loudness balancing. The average absolute difference between both runs was 0.04%. The resulting loudness-balanced amplitudes, presented as a percentage of the dynamic range, are depicted in Figure 2 for each subject and for each center electrode location. The amount of current reduction increased monotonically as a function of the number of active channels for each condition in every subject. For subjects S2, S3, and S4, there was virtually no variation over the different electrode locations. For subject S1, the multichannel stimuli centered on electrode 13 required more current reduction for loudness balancing than the multichannel stimuli centered on electrode 17. The average standard errors over the different electrode locations are 0.17%, 0.04%, 0.01%, and 0.02% of the dynamic range for each subject, respectively.

Figure 2

Results of the loudness balancing of multichannel stimuli relative to a single-channel stimulus presented at C-level. Loudness-balanced stimuli were always centered on the same electrode. Each panel corresponds to one subject and the different lines within each panel represent results for different center electrodes. The loudness of the multichannel stimuli is adjusted by setting the amplitude of each channel to a percentage of the dynamic range. The latter percentage is fixed over the different channels that make up the multichannel stimulus.
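The adaptive loudness-balancing track can be sketched as follows. This is a minimal illustration under stated assumptions: the listener is replaced by a hypothetical callback, a reversal is recorded at the level where the response direction changes, and a trial cap is added so the loop always terminates. None of these details are specified in the text.

```python
def run_staircase(start_percent, multichannel_is_louder,
                  n_reversals=8, max_trials=500):
    """1-up 1-down track on the multichannel amplitude (% of dynamic range).

    multichannel_is_louder(level) stands in for the subject's response.
    Returns the levels at which the track reversed direction.
    """
    level = start_percent
    last_direction = 0
    reversals = []
    for _ in range(max_trials):
        if len(reversals) == n_reversals:
            break
        # Judged louder -> decrease the amplitude, otherwise increase it.
        direction = -1 if multichannel_is_louder(level) else 1
        if last_direction and direction != last_direction:
            reversals.append(level)
        last_direction = direction
        # 5% steps between 10% and 100%, 1% steps between 0% and 10%.
        if direction > 0:
            step = 5 if level >= 10 else 1
        else:
            step = 5 if level > 10 else 1
        level = min(100, max(0, level + direction * step))
    return reversals


def balanced_level(run_a, run_b):
    """Mean of the last 4 reversals of both runs, rounded to the step size."""
    last = run_a[-4:] + run_b[-4:]
    mean = sum(last) / len(last)
    step = 5 if mean >= 10 else 1
    return step * round(mean / step)


# Hypothetical deterministic listener whose balance point is 42% of the
# dynamic range; one run starts soft, the other starts loud.
listener = lambda level: level > 42
ascending = run_staircase(15, listener)
descending = run_staircase(90, listener)
print(ascending, descending, balanced_level(ascending, descending))
```

Starting one run below and one run above the expected balance point, as the text describes, lets any bias from the starting level partly cancel in the average.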

Psychophysical procedure

Place pitch sensitivity was measured at two reference locations along the electrode array (electrodes 17 and 13) and for six different numbers of active channels (1, 2, 3, 4, 5, and 8). For subject S2, a third reference location along the electrode array, electrode 6, was also measured.

Place pitch sensitivity was measured in a pitch-ranking experiment using a 2I2AFC task. In one trial, the subjects were presented two stimuli in random order and were instructed to indicate the highest one in pitch and to ignore the loudness of the stimuli. Both stimuli contained the same number of active channels, but one stimulus was centered on the reference electrode and the other stimulus was centered on an electrode located 1 or 2 electrodes more toward the base or 1 or 2 electrodes toward the apex.

The trials were presented in blocks. Each combination of reference electrode location and number of active channels was measured in a separate block of trials, leading to 12 (2 electrode locations × 6 numbers of active channels) different blocks per subject. Within each block, the four different shifts (−2, −1, +1, +2) of active channels were repeated ten times, leading to 40 trials per block. Each block of trials was presented twice to the subjects. Accordingly, each point on the psychometric function was based on the mean of 20 trials, and each psychometric function was based on a total of 80 trials per subject.
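The block structure above amounts to simple bookkeeping, sketched here. The dictionary layout and function name are hypothetical, and shuffling the trials within a block is an assumption; the text only states that the two stimuli within a trial were presented in random order.

```python
import random

SHIFTS = (-2, -1, 1, 2)          # shift of the comparison center electrode
REPEATS_PER_BLOCK = 10
REFERENCES = (17, 13)
CHANNEL_COUNTS = (1, 2, 3, 4, 5, 8)

# One block per combination of reference electrode and number of channels.
BLOCKS = [(ref, n) for ref in REFERENCES for n in CHANNEL_COUNTS]


def make_block(reference, n_channels, rng=random):
    """Build the 40 trials of one block (4 shifts x 10 repetitions)."""
    trials = []
    for shift in SHIFTS:
        for _ in range(REPEATS_PER_BLOCK):
            trials.append({
                "reference": reference,
                "comparison": reference + shift,
                "n_channels": n_channels,
                # Random presentation order of the two stimuli in a trial.
                "reference_first": rng.random() < 0.5,
            })
    rng.shuffle(trials)   # assumption: trial order within a block is random
    return trials


block = make_block(17, 4, random.Random(0))
print(len(BLOCKS), len(block))   # 12 blocks per subject, 40 trials per block
```

With each block run twice, every shift contributes 2 × 10 = 20 trials to a point on the psychometric function, and each function rests on 2 × 40 = 80 trials.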

The amplitude of the pulse trains of all stimuli in one block was adjusted to match the result of the loudness-balancing procedure for the reference stimulus. Because center electrode location affected the loudness-balancing results only mildly (except for subject S1) and because the shifts in center electrode location are small within one block, the same loudness adjustment in percentages is used for the comparison stimuli as for the reference stimuli. In addition, any possible remaining loudness effects are expected to be eliminated by the amplitude roving.

Training

Because none of the subjects had prior experience in performing psychophysical tasks, they were trained on three different pitch discrimination tasks prior to the experiment, all based on a 2I2AFC pitch-ranking procedure. The first two training experiments were electrode-ranking tasks in which the subjects were presented with two stimuli consisting of single-channel pulse trains on different electrodes and had to indicate the higher one. In the first training task, the active electrodes of the two stimuli were 4, 8, or 12 electrodes apart and the amplitude was roved as described before. Feedback was presented immediately after each trial, but only during this first training task. This test was repeated until the subjects performed the task with a success rate of 100% over 20 consecutive trials, which took no longer than 15 minutes.

The second training task was similar to the first training task, but electrodes were spaced more closely together and no feedback was given to the subjects. Feedback was removed from this and further experiments in order to make the subjects use their intuitive pitch sensation rather than any trained pattern they might learn during training with feedback. This test consisted of a 2I2AFC pitch-ranking task and amplitude was roved as described before. The active electrodes were spaced 1, 2, or 3 electrodes apart and the electrode discrimination or place pitch sensitivity was determined at four sites along the electrode array (electrodes 21, 17, 13, and 6).

In the third task, the subjects performed a pitch-ranking experiment with complex multichannel stimuli. This task involved more than 1000 trials per subject in at least 6 sessions of 1.5 h. Plateau performance on pitch-ranking tests was therefore expected before the data reported here were gathered, although this was not verified explicitly.

Analysis of results

The data of each block of trials were converted into one single parameter that is a measure of the subjects’ place pitch sensitivity in that particular condition. A typical result of a block of trials for one subject is depicted in Figure 3, where the proportion of times that the reference stimulus was perceived to be higher in pitch than the comparison stimulus is plotted as a function of the shift in location of the center electrode of the comparison stimulus. This proportion was fitted to a normal cumulative distribution function using a Gauss–Newton nonlinear least-squares fitting method. The slope of this fitted normal cumulative distribution function was taken as a performance measure of place pitch sensitivity. This fitting procedure is similar to a linear regression with the data points converted into d′ values. JNDs were derived from the fitted curves as the minimal distance expressed in number of electrodes to obtain 75% correct pitch rankings.

Figure 3

Typical result of one block of trials of place pitch sensitivity for one subject. The thin line depicts the proportion of times that the reference stimulus was perceived to be higher in pitch than the comparison stimulus. This proportion is expressed as a function of shift in location along the electrode array of the comparison stimulus relative to the reference stimulus. Negative shifts are displacements toward the base. The thick line represents the fitted cumulative normal distribution function. The 75% point or JND is indicated on the fitted curve.

The fitting procedure did not converge for blocks of trials that contained only 100% correctly pitch-ranked trials for all shifts in location. For these blocks, a 5% error (equivalent to half a trial error because each trial was repeated 10 times) in pitch-ranking performance was introduced in the proportion of correctly ranked trials of the smallest shift in location before the fitting procedure, as suggested by MacMillan and Creelman (1991).
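The analysis described above can be sketched with the Python standard library. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, step halving is added to the Gauss–Newton update for robustness, the 5% correction is applied to both smallest shifts, and the JND is taken as the distance between the 50% and 75% points of the fitted curve. All of these are assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()   # standard normal: cdf, pdf, inv_cdf


def sse(xs, ps, mu, sigma):
    """Sum of squared errors between data and the cumulative normal."""
    return sum((p - STD_NORMAL.cdf((x - mu) / sigma)) ** 2
               for x, p in zip(xs, ps))


def fit_cumulative_normal(xs, ps, mu=0.0, sigma=1.0, n_iter=50):
    """Least-squares fit of p = Phi((x - mu) / sigma) by Gauss-Newton."""
    for _ in range(n_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, p in zip(xs, ps):
            z = (x - mu) / sigma
            r = p - STD_NORMAL.cdf(z)             # residual
            d_mu = -STD_NORMAL.pdf(z) / sigma     # d(model)/d(mu)
            d_sigma = d_mu * z                    # d(model)/d(sigma)
            a11 += d_mu * d_mu
            a12 += d_mu * d_sigma
            a22 += d_sigma * d_sigma
            g1 += d_mu * r
            g2 += d_sigma * r
        det = a11 * a22 - a12 * a12
        if det <= 1e-18:
            break                                 # singular normal equations
        step_mu = (a22 * g1 - a12 * g2) / det
        step_sigma = (a11 * g2 - a12 * g1) / det
        current = sse(xs, ps, mu, sigma)
        scale = 1.0
        while scale > 1e-6:                       # halve step until no worse
            new_mu = mu + scale * step_mu
            new_sigma = sigma + scale * step_sigma
            if new_sigma > 0 and sse(xs, ps, new_mu, new_sigma) <= current:
                break
            scale /= 2
        else:
            break                                 # no acceptable step found
        mu, sigma = new_mu, new_sigma
    return mu, sigma


def correct_perfect_block(shifts, ps, error=0.05):
    """Introduce a 5% error at the smallest shifts of a perfect block."""
    if any(0.0 < p < 1.0 for p in ps):
        return list(ps)
    smallest = min(abs(s) for s in shifts)
    return [abs(p - error) if abs(s) == smallest else p
            for s, p in zip(shifts, ps)]


def jnd_electrodes(sigma, criterion=0.75):
    """Distance (in electrodes) from the 50% point to the criterion point."""
    return STD_NORMAL.inv_cdf(criterion) * sigma


shifts = [-2, -1, 1, 2]
perfect = correct_perfect_block(shifts, [0.0, 0.0, 1.0, 1.0])
mu_hat, sigma_hat = fit_cumulative_normal(shifts, perfect)
print(mu_hat, sigma_hat, jnd_electrodes(sigma_hat))
```

The slope of the fitted function is inversely related to `sigma`, so a smaller `sigma` corresponds to both a steeper psychometric function and a smaller JND.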

RESULTS

Average JND estimates of the four subjects are depicted in Figure 4 as a function of the number of active channels. JNDs are averaged over the two center electrode locations (17 and 13) and both test runs. Subjects scored very well in the discrimination task. The subjects made only between zero and seven errors in pitch ranking per block of 40 trials, with an average of 2.1 errors per block, leading to estimated JNDs that were often smaller than the smallest difference presented to the subjects. Subject S2 was able to rank all stimuli perfectly in pitch on all trials for all numbers of active channels and for all stimuli centered on electrodes 17 and 13. The JND for subject S2 was estimated based on the introduction of 5% errors as described above.

Figure 4

The average JND of place pitch sensitivity (based on the fitting procedure) as a function of the number of active channels for the four subjects. The JNDs are averaged over the two electrode locations and the two test runs, and they are expressed as electrode separation (number of electrodes). Error bars indicate ±1 standard deviation. Increasing the number of active channels did not affect the place pitch sensitivity. Subject S2 had perfect discrimination over all conditions and all runs. The estimated JNDs shown are based upon a 5% error for the smallest tested shift (see text).

The data were analyzed using a three-factorial repeated-measures analysis of variance (ANOVA). The three factors were center electrode location (electrode 17 or 13), number of active channels (1, 2, 3, 4, 5, or 8 active channels), and test run (first or second). None of the three factors had a significant effect. The effects of center electrode location [F(1) = 0.027; p = 0.879] and test run [F(1) = 0.113; p = 0.758] were not significant. The effect of the number of active channels was not significant [F(1) = 2.038; p = 0.249] using the lower-bound statistics. There were no significant interaction effects. There were, however, significant differences in performance between subjects [F(1) = 50.008; p = 0.006].

The average JNDs for subjects S1 to S4 are 0.59, 0.34, 0.61, and 0.40 electrodes, respectively. As electrodes in the Nucleus electrode array are spaced 0.75 mm apart, these JNDs correspond to sensitivities of 2.26 d′/mm, 3.92 d′/mm, 2.19 d′/mm, and 3.33 d′/mm. Because the data from the training experiments indicated a slightly worse place pitch sensitivity at electrode 6 for subject S2, this additional condition was presented to subject S2. On average, subject S2 made 1.5 errors per block in this extra condition, resulting in an average JND of 0.45 electrodes or 2.96 d′/mm. The JNDs of the subjects are derived from a fitting procedure as described in the Methods section. Most JNDs are interpolated values because they are smaller than the smallest difference actually presented to the subjects.
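The conversion between JNDs in electrodes and sensitivity in d′/mm can be checked with a short sketch. The relation used here, d′/mm = 1 / (JND × spacing), is not stated explicitly in the text; it is inferred from the reported numbers, which it reproduces when the JND is taken to correspond to d′ ≈ 1. The function names are hypothetical.

```python
ELECTRODE_SPACING_MM = 0.75   # Nucleus electrode spacing (from the text)


def jnd_to_dprime_per_mm(jnd_electrodes, spacing_mm=ELECTRODE_SPACING_MM):
    """Sensitivity in d'/mm, assuming d' is about 1 at the JND."""
    return 1.0 / (jnd_electrodes * spacing_mm)


def dprime_per_mm_to_jnd(dprime_per_mm, spacing_mm=ELECTRODE_SPACING_MM):
    """Inverse conversion: JND in electrodes from sensitivity in d'/mm."""
    return 1.0 / (dprime_per_mm * spacing_mm)


reported_jnds = {"S1": 0.59, "S2": 0.34, "S3": 0.61, "S4": 0.40}
for subject, jnd in reported_jnds.items():
    print(subject, round(jnd_to_dprime_per_mm(jnd), 2))
```

The same relation also reproduces the value given for subject S2 at electrode 6 (0.45 electrodes, 2.96 d′/mm).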

DISCUSSION

The obtained channel discrimination JNDs are within the range of JNDs reported for Nucleus CI22 subjects stimulated in BP or BP + 1 mode. In the study of Nelson et al. (1995), the average performance of 14 subjects in a 2I2AFC pitch-ranking experiment with loudness-balanced stimuli ranged from very poor (0.12 d′/mm) to excellent (3.16 d′/mm), which translates to JNDs of 11 electrodes and 0.4 electrodes, respectively. Compared to these results, the subjects of the present study all have very good place pitch sensitivity.

Subjects S3 and S4 of the present study were implanted with a precurved electrode array, the CI24R(CS) Nucleus24 Contour device. JNDs in bipolar mode (BP + 1) in three subjects implanted with a prototype precurved electrode array ranged between 0.5 and 3 electrodes, measured with a 4I4AFC discrimination task and with loudness-balanced stimuli using an adaptive procedure (Cohen et al. 2001). The JNDs of subjects S3 and S4 of the present study are equivalent to the JNDs of the better-performing subject in the study of Cohen et al. The amplitude roving of the stimuli in the present study might have reduced place pitch sensitivity. The JNDs of the subjects in the study of Cohen et al. increased to between 0.7 electrode and 6 electrodes when loudness was roved between 35% and 80% of the dynamic range (Cohen et al. 2001). However, the loudness roving in that study was deliberately increased until the task was significantly more difficult for each subject. The amount of loudness roving was smaller in the present study and, consequently, the decrease in place pitch sensitivity due to loudness roving might have been smaller.

The present study measured place pitch sensitivity in monopolar stimulation mode, whereas most studies, including those of Nelson et al. (1995) and Cohen et al. (2001), report channel discrimination in bipolar mode. Few studies have reported channel discrimination in monopolar mode. Townshend et al. (1987) examined channel discrimination in three subjects with monopolar stimulation using a 2I2AFC procedure. For one subject the pitch varied nonmonotonically with distance from the base. The two remaining subjects were able to pitch rank the channels in an orderly way from apex to base. The d′ per electrode in those two subjects ranged from 0.1 to 0.9. This corresponds to JNDs ranging from 33 to 3.7 Nucleus electrodes (or sensitivities ranging from 0.04 d′/mm to 0.36 d′/mm), respectively, taking into account that the electrodes in the study of Townshend et al. were separated by 2.5 mm. The JNDs reported by Townshend et al. are appreciably higher than the JNDs reported in this study.

Although modeling results and physiologic studies in animals showed that stimulation in monopolar mode excited very broad regions (Black and Clark 1980), recent data by Nelson et al. (2003) show that forward-masking patterns obtained using monopolar mode can be relatively narrow with relatively steep slopes. They also found that the range of widths of the forward-masking patterns is similar for bipolar and monopolar stimulation modes, although the comparison was not done within the same subjects. The steep slopes of the excitation patterns, even when using monopolar stimulation, might explain the excellent place pitch discrimination found in the present study because narrower excitation patterns are assumed to be more easily discriminated. Although the multichannel stimuli excite wider and more overlapping populations compared to single-channel stimuli, the discrimination was not diminished. This is consistent with the idea (although it does not conclusively prove it) that discrimination is based upon the edges of the excitation pattern and not upon the overlapping region. This last hypothesis is consistent with the model of Zwicker (1970) for frequency discrimination with acoustic stimulation. According to Zwicker’s model, frequency discrimination depends on the detection of differences in the time-averaged patterns of excitation. Zwicker assumed that a change between stimuli can be detected whenever the excitation patterns differ at any point by more than a criterion value. The size of the JND is predicted to be inversely proportional to the slope of the excitation pattern at its steepest point and directly proportional to the criterion value. The model of Zwicker, however, is not capable of explaining data with roving intensity. Lyzenga and Horst (1995) modified the model of Zwicker to account for frequency discrimination using randomly varying intensity by equalizing the average overall level of the excitation patterns of the comparison stimuli prior to the comparison of the two excitation patterns. This model is capable of explaining frequency discrimination when temporal cues are not present in the excitation pattern in the auditory system (Lyzenga and Horst 1997), as is the case with the stimuli in our experiment. In a study examining the effect of current intensity on electrode discrimination, McKay and colleagues (1999) also found that their data were consistent with such a model.
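The excitation-pattern argument above can be written compactly. The notation is introduced here for illustration only and does not appear in the original models as cited: E(x) denotes the excitation level at cochlear place x, ΔE_c the criterion difference, and a bar the average over place.

```latex
% Zwicker-type prediction: the JND scales with the criterion value and
% inversely with the steepest slope of the excitation pattern.
\mathrm{JND} \;\propto\; \frac{\Delta E_{c}}{\max_{x}\left|\dfrac{\partial E(x)}{\partial x}\right|}

% Level-equalized comparison in the spirit of Lyzenga and Horst (1995):
% two stimuli are discriminable when their mean-corrected patterns differ
% somewhere by more than the criterion.
\max_{x}\,\Bigl|\bigl(E_{1}(x)-\bar{E}_{1}\bigr)-\bigl(E_{2}(x)-\bar{E}_{2}\bigr)\Bigr| \;>\; \Delta E_{c}
```

Under this reading, adding channels widens the pattern but leaves the slopes at its flanks largely unchanged, which would be consistent with a JND that does not depend on the number of active channels.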

CONCLUSIONS

Place pitch discrimination or site of excitation discrimination was measured in four Nucleus CI24 cochlear implant recipients using single- and multichannel stimuli, whereas previous studies measured single-channel discrimination. The results from this study can be summarized as follows:

1. Place pitch sensitivity or electrode discrimination performance for single- and multichannel stimuli in monopolar stimulation mode can be very good. JNDs are smaller than the distance between two adjacent electrodes (which corresponds to 0.75 mm for the Nucleus CI24 device). This performance agrees with the data of the better performers in other studies for single-channel stimulation in bipolar mode.

2. Place pitch sensitivity is not affected as the number of active channels, or equivalently the width of the excitation pattern, is increased from 1 to 8, even when the excitation patterns of the different stimuli largely overlap.

3. Discrimination of single- and multichannel stimuli based upon place pitch in a cochlear implant is consistent with Zwicker’s model (Zwicker 1970), at least for monopolar stimulation.