Introduction

In human listeners, a tone complex comprised of successive harmonics of a fundamental frequency (F0), but having no acoustic energy at the F0, evokes a pitch corresponding to the missing F0. This phenomenon is known as the “pitch of the missing fundamental” and is a well-established attribute of human pitch perception. The pitch of the missing F0 remains unchanged when the tone complex is presented in a background of masking noise centered on the F0 (Thurlow and Small 1955; Licklider 1956; Patterson 1969). These masking studies suggest that the pitch of the missing F0 does not arise through the reintroduction of the F0 as a cochlear distortion product. The perception of the missing F0 is not unique to human listeners and also appears to exist in cats (Heffner and Whitfield 1976; Chung and Colavita 1976), monkeys (Tomlinson and Schwarz 1988), birds (Cynx and Shapiro 1986), and goldfish (Fay 2005). Although Chung and Colavita (1976) state in their discussion that cats responded to missing F0 tone complexes in the presence of low-pass masking noise, there has been no systematic investigation using masking noise to assess the possible contribution of cochlear distortion products to the perception of the missing F0 in an animal model.

Understanding the contribution of cochlear distortion products to periodicity perception in animals is important for understanding the neural mechanisms underlying pitch. The magnitude of the distortion product should depend in part on cochlear tuning, which has recently been debated in the literature (e.g., Shera et al. 2002; Ruggero and Temchin 2005). If cochlear tuning is broader in non-human mammals (Shera et al. 2002), then the overlap between the traveling wave components would be greater, which would produce larger cochlear distortion products. Evidence from otoacoustic emissions suggests that the cochleae of rodents may produce larger distortion products than those of humans (Withnell et al. 2003). If distortion products are larger in non-human mammals, then they could potentially play a greater role in animal periodicity perception.

Perceptual questions in animals can be addressed using stimulus generalization paradigms. In a stimulus generalization paradigm, an animal is trained to respond to a specific stimulus, and then responses are measured to test stimuli that are presented infrequently and vary systematically along one or more stimulus dimensions (Mallott and Mallott 1970). Test stimuli that evoke behavioral responses similar to those evoked by the training or signal stimulus indicate a perceptual equivalence or perceptual invariance (see Hulse 1995) among these stimuli. A systematic change in behavioral response along the physical dimension of the stimulus is known as a generalization gradient and is consistent with the hypothesis that the animal possesses a psychological dimension related to the physical dimension of the stimulus (Guttman 1963). In generalization experiments, the animal is not trained specifically to discriminate among the various test stimuli, and consequently, the behavioral responses observed to the test stimuli reflect the stimulus features and resulting perceptual dimensions used by the animal in analyzing the stimuli during testing (Fay 2005). In the present study, behavioral responses to missing F0 harmonic tone complexes in the presence of low-pass masking noise were measured in chinchillas using operant conditioning and stimulus generalization paradigms in order to assess the possible contribution of distortion products to periodicity perception.

Methods

The procedures used in this study were reviewed and approved by the Institutional Animal Care and Use Committee for the Bloomington Campus of Indiana University.

Subjects

Four adult, male chinchillas (Chinchilla laniger) served as subjects in these experiments. The ages of the chinchillas were between 1 and 2 years when training began. None of the four chinchillas (c12, c15, c24, and c36) used in the present study had previous experience in the behavioral paradigm. Chinchillas were food-restricted, but they had free access to water. Animals received food pellet rewards during behavioral testing, and their diets were supplemented with chinchilla chow to maintain their body weights between 80% and 90% of their normal body weight. Chinchillas received a raisin as a treat following each daily behavioral test session. Animals were housed individually in rabbit cages in a room in the animal care facility, and the ambient noise level of the room was measured at 44-dB SPL (A-weighted). All animals were in good health during the period when their data were being collected. Animals were tested daily in a session that typically lasted 1 h.

Acoustic stimuli

Stimuli were generated on a digital array processor (AP2 Tucker-Davis Technologies) at a sampling rate of 50 kHz and stored as stimulus files on disk. Stimulus presentation and data acquisition were under the control of a Gateway computer system and Tucker-Davis Technologies System II modules. Stimuli were played through a D/A converter (TDT DD1 module) at a conversion rate of 50 kHz and low-pass filtered at 15 kHz. The output of the low-pass filter was amplified (Bryston power amplifier), attenuated (TDT PA4 module), and played through a loudspeaker (RCA Pro-X880AV from Radio Shack). The frequency response of the system is shown in Figure 1A.

FIG. 1

A Frequency response of the acoustic system measured using a 40-μs click played through the loudspeaker. The output of the loudspeaker was measured with a calibration microphone placed at the approximate position of the animal’s head and analyzed with an Ivie IE-33 spectrum analyzer. The horizontal lines indicate that the frequency response was relatively flat, varying ±8 dB from 100 to 10,000 Hz. B Example acoustic spectra obtained from the calibration microphone and spectrum analyzer for the low-pass masking noise at an overall level of 72-dB SPL (black line) and the noise floor of the system (gray line). The triangle shows the level of the 125-Hz component obtained at 73-dB SPL for the 125-Hz missing F0 tone complex. The diamond shows the level of the 500-Hz component obtained at 73-dB SPL for the 500-Hz missing F0 tone complex. Note that there was no measurable acoustic energy above the noise floor at the F0 for the missing F0 stimuli. The dashed line indicates a spectrum level of 50 dB/Hz.

Stimuli consisted of harmonic complex tones comprised of successive harmonics of a F0 up to 10 kHz. Individual components were of equal amplitude and added in cosine-starting phase. Stimuli had durations of 500 ms with 10-ms rise/fall times; the intervals between stimuli were 500 ms. Unless specified, the root-mean-squared amplitudes of the harmonic tone complexes were scaled to be equal in order to eliminate the use of overall level as a cue during testing. The overall sound pressure level of the harmonic complex tones was determined by placing a condenser microphone at the approximate position of an animal’s head and measuring the sound pressure level with a sound spectrum analyzer (Ivie IE-33). In this study, the sound pressure level was fixed at 73-dB SPL for all harmonic complex tones. In many conditions, harmonic complex tones were presented in a continuous background of low-pass noise. Wideband, Gaussian noise was generated (TDT WG1 module) and low-pass filtered (TDT PF1 module) with a cut-off frequency of 710 Hz (Fig. 1B).
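The stimulus description above can be sketched in a few lines of Python. This is only an illustrative sketch, not the TDT array-processor code used in the study: the function name, the linear shape of the rise/fall ramps, and the arbitrary target root-mean-squared value are assumptions, whereas the 50-kHz sampling rate, 500-ms duration, 10-ms ramps, 10-kHz upper limit, equal component amplitudes, and cosine starting phase come from the text.

```python
import math

def harmonic_complex(f0, lowest_harmonic=1, f_max=10_000.0, fs=50_000,
                     duration=0.5, ramp=0.010, target_rms=0.1):
    """Equal-amplitude harmonics of f0 in cosine phase, gated and RMS-scaled."""
    n_samples = round(duration * fs)
    n_ramp = round(ramp * fs)
    highest_harmonic = int(f_max // f0)      # highest harmonic not above f_max
    harmonics = range(lowest_harmonic, highest_harmonic + 1)
    wave = []
    for i in range(n_samples):
        t = i / fs
        wave.append(sum(math.cos(2 * math.pi * h * f0 * t) for h in harmonics))
    # Onset/offset gating; a linear ramp shape is an assumption.
    for i in range(n_ramp):
        gain = i / n_ramp
        wave[i] *= gain
        wave[-1 - i] *= gain
    # Scale to a common RMS so overall level cannot serve as a cue.
    rms = math.sqrt(sum(x * x for x in wave) / n_samples)
    return [x * target_rms / rms for x in wave]
```

Under these assumptions, `harmonic_complex(125.0)` would give the F0-present standard (harmonics 1–80) and `harmonic_complex(500.0, lowest_harmonic=2)` a missing F0 version of the signal (harmonics 2–20), both with the same root-mean-squared amplitude.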

The presence of nonlinear distortion in the acoustic system at the F0s of 125 and 500 Hz was checked by measuring the amplitude of the F0 components from the spectrum analyzer for F0 present complexes and missing F0 tone complexes. There was no measurable distortion component above the noise floor from the speaker system at either F0 (Fig. 1B).

Behavioral procedure

Chinchillas were placed into a cage (61 cm width × 61 cm length × 36 cm high); animals were not restrained in any way, but were free to roam around the cage. The cage was located on a card table in a single-walled sound-attenuating chamber (Series 40a Industrial Acoustics) having internal dimensions of 1.6 m width × 1.6 m length × 2 m high. A pellet dispenser was located at one end of the cage with a reward chute attached to a response lever. The loudspeaker was placed next to the pellet dispenser approximately 30° to the right of center at an approximate distance of 15 cm in front of the animal. The behavioral procedure was based on an operant conditioning paradigm and has been employed previously to study the perception of periodicity strength in chinchillas (Shofner 2002; Shofner et al. 2005; Shofner and Whitmer 2006) and was similar to a procedure used by Ohlemiller et al. (1999) to study categorical perception of consonant-vowel syllables in chinchillas.

A standard 125-Hz F0 tone complex was presented continually in 500-ms bursts at a rate of once per second, regardless of whether or not a trial was initiated. Chinchillas were trained to discriminate a 500-Hz F0 signal tone complex from a 125-Hz F0 standard tone complex. A trial was initiated when the chinchilla pressed down on the response lever. The lever had to be depressed for a specified duration of time; this is called the holdtime. After the lever was depressed, the standard tone complex was presented for an additional number of bursts to create a holdtime prior to the start of a trial. The number of additional bursts was determined for each trial from a rectangular probability distribution and ranged from one to eight bursts for c12, c24, and c36, but ranged from one to six bursts for c15. If the chinchilla released the lever before the holdtime expired, then the trial was aborted. That aborted trial began again with the same holdtime with the next lever press. If the animal depressed the lever for the duration of the holdtime, then two 500-ms bursts of a selected tone complex were presented for that trial. The response window was coincident with the duration of the two 500-ms bursts (2,000 ms), except that the response window actually began 150 ms after the onset of the first burst and lasted until the onset of the next burst of the continual standard stimulus. Consequently, the actual duration of the response window was 1,850 ms. The resulting holdtimes ranged from 1.15 to 8.15 s for c12, c24, and c36 and ranged from 1.15 to 6.15 s for c15. A release of the lever during the response window was considered to be a positive response, whereas continuing to depress the lever for the duration of the response window was considered to be a negative response.
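The timing arithmetic in this paragraph can be checked with a short sketch. The relation between the number of additional bursts and the holdtime (one second per burst plus the 150-ms delay) is inferred from the reported ranges rather than stated explicitly, and all names below are hypothetical.

```python
import random

BURST_PERIOD_S = 1.0    # one 500-ms burst plus one 500-ms silent interval
WINDOW_DELAY_S = 0.150  # response window opens 150 ms after burst onset

def holdtime_s(n_bursts):
    """Holdtime implied by n additional standard bursts (inferred relation)."""
    return n_bursts * BURST_PERIOD_S + WINDOW_DELAY_S

def draw_holdtime_s(max_bursts, rng=random):
    """Draw the burst count from a rectangular (uniform) distribution."""
    return holdtime_s(rng.randint(1, max_bursts))

def response_window_s(n_trial_bursts=2):
    """Window spans the trial bursts minus the initial 150-ms delay."""
    return n_trial_bursts * BURST_PERIOD_S - WINDOW_DELAY_S
```

With these definitions, one to eight bursts give holdtimes of 1.15 to 8.15 s, one to six bursts give 1.15 to 6.15 s, and the response window is 1.85 s, which agrees with the values reported above.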

A signal trial consisted of two bursts of the 500-Hz F0 harmonic tone complex. If the animal released the lever during the response window of a signal trial, then this positive response was treated as a hit. A negative response during a signal trial was treated as a miss. A blank trial consisted of two additional bursts of the 125-Hz F0 harmonic tone complex. If the animal released the lever during the response window of a blank trial, then this positive response was treated as a false alarm. A negative response during a blank trial was treated as a correct rejection. A hit was considered a correct response and was rewarded with a food pellet, whereas misses and false alarms were incorrect responses and as such were not rewarded with food pellets. Correct rejections were also considered as correct responses. Food pellet rewards for correct rejections were occasionally given to c15 in order to induce the animal to continue to hold the lever down and reduce false alarms, but it was not necessary to reinforce correct rejections for the other three chinchillas. A test trial consisted of two bursts of a tone complex that was different from the signal and standard tone complexes. Although positive and negative responses during signal or blank trials were treated as either correct or incorrect responses, the responses obtained to test trials were considered to be neither correct nor incorrect. Consequently, chinchillas did not receive food pellet rewards for responses to test stimuli, regardless of whether the behavioral response was positive or negative.
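The scoring rules described above amount to a small decision table. The following sketch is illustrative only; the function names and string labels are hypothetical, and it encodes the default reward rule (the occasional reinforcement of correct rejections for c15 is not modeled).

```python
def classify(trial_type, released):
    """Map a trial type and lever release (True/False) to an outcome label."""
    if trial_type == "signal":
        return "hit" if released else "miss"
    if trial_type == "blank":
        return "false alarm" if released else "correct rejection"
    if trial_type == "test":
        # Test trials are scored as neither correct nor incorrect.
        return "test response" if released else "test no-response"
    raise ValueError(f"unknown trial type: {trial_type!r}")

def is_rewarded(outcome):
    """Only hits earn a food pellet under the default rule."""
    return outcome == "hit"
```

For example, `classify("blank", True)` returns `"false alarm"`, and no outcome of a test trial is rewarded.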

Chinchillas were trained and tested in blocks consisting of 40 trials. Details of the training procedure have been described previously (Shofner 2002). During training sessions, no test stimuli were presented, and a block of 40 trials consisted of 32 signal trials and eight blank trials. During testing sessions, test stimuli were presented infrequently in the block of trials such that in each block, 60% of the trials were signal trials (24 of 40 trials), 20% were blank trials (eight of 40 trials), 10% were test stimulus no. 1 trials (four of 40 trials), and 10% were test stimulus no. 2 trials (four of 40 trials). Behavioral responses were considered to be under stimulus control if the percent correct for the discrimination of the signal from the standard was at least 81% for each block. Behavioral responses were collected for a minimum of 50 blocks (i.e., 2,000 total trials). Consequently, responses to each of the test complex tones are based on at least 200 trials. Behavioral responses are expressed as percent response, defined as

$$ \% \;{\hbox{Response}} = \frac{{\# \;{\text{of}}\;{\text{lever}}\;{\text{releases}}}}{{\# \;{\hbox{of}}\;{\hbox{trials}}}} \times 100. $$

Animals typically completed two to eight blocks per day. The specific signal, standard, and test tone complexes were dependent on the specific experiment (see below).
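As a quick check of the block composition and the percent response measure defined above, the following sketch tallies trials per stimulus type. The names are hypothetical, and the release count used in the usage note is invented purely for illustration.

```python
from collections import Counter

TRAINING_BLOCK = ["signal"] * 32 + ["blank"] * 8
TESTING_BLOCK = ["signal"] * 24 + ["blank"] * 8 + ["test1"] * 4 + ["test2"] * 4

def percent_response(n_releases, n_trials):
    """Percent response = lever releases / trials x 100."""
    return 100.0 * n_releases / n_trials

def trials_per_type(block, n_blocks):
    """Total number of trials of each type accumulated over n_blocks."""
    return {kind: count * n_blocks for kind, count in Counter(block).items()}
```

Over the minimum of 50 testing blocks, `trials_per_type(TESTING_BLOCK, 50)` gives 200 trials for each test stimulus, and a hypothetical 150 releases in those 200 trials would correspond to `percent_response(150, 200)`, i.e., 75%.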

Results

Experiment 1: Effect of low-pass masking noise on behavioral responses

The purpose of the first series of experimental conditions was to determine whether harmonic tone complexes having no acoustic energy at the F0 are perceptually equivalent to harmonic tone complexes in which the F0 is present. In this condition, the standard was a 125-Hz F0 tone complex comprised of the F0 and all successive harmonics up to 10 kHz (i.e., harmonics 1–80); the signal was a 500-Hz F0 tone complex comprised of the F0 and all successive harmonics up to 10 kHz (i.e., harmonics 1–20). It should be noted that when the overall sound level was fixed at 73-dB SPL for the standard, the measured level of the 125-Hz F0 component was 52.8-dB SPL, and the measured level of the 500-Hz F0 component was 60.4-dB SPL. The amplitudes of these two F0 components were 30 and 47.8 dB above the noise floor, respectively, and are higher than the critical ratio reported for the chinchilla (Seaton and Trahiotis 1975; Miller 1988; Niemiec et al. 1992). Consequently, in the absence of the external low-pass masking noise, both the 125- and 500-Hz components of the standard and signal tone complexes were at a level that was detectable. The two test stimuli used were missing F0 harmonic tone complexes having F0s of 125 and 500 Hz (i.e., harmonics 2–80 and 2–20, respectively). Thus, these test stimuli were only missing the F0 component.
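The measured component levels can be compared with an idealized calculation. The sketch below assumes equal-amplitude components whose powers simply add and a perfectly flat playback system; neither the function name nor this idealization comes from the study.

```python
import math

def component_level_db(overall_db, n_components):
    """Level of each of n equal-amplitude components given the overall level."""
    return overall_db - 10.0 * math.log10(n_components)

standard_component = component_level_db(73.0, 80)  # 125-Hz F0, harmonics 1-80
signal_component = component_level_db(73.0, 20)    # 500-Hz F0, harmonics 1-20
```

Under these assumptions, the expected per-component levels are about 54.0-dB SPL for the standard and 60.0-dB SPL for the signal, within roughly 1 dB of the measured values of 52.8 and 60.4-dB SPL; the small differences presumably reflect the fact that the frequency response of the real system is not perfectly flat.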

The individual behavioral responses of four chinchillas trained to discriminate the 500-Hz F0 present tone complex from the 125-Hz F0 present tone complex and tested with missing F0 tone complexes are shown by the white bars in Figure 2. When tone complexes were presented in quiet, behavioral responses were high to the 500-Hz F0 present signal complex (i.e., animals released the lever during signal presentation), whereas behavioral responses to the 125-Hz F0 present standard complex were low (i.e., animals continued to hold the lever down during standard presentation). Behavioral responses to a missing 500-Hz F0 tone complex were similar to those obtained to the 500-Hz F0 present signal; responses to a missing 125-Hz F0 complex were similar to those obtained from the 125-Hz F0 present standard for all animals. These behavioral responses suggest that missing F0 harmonic tone complexes are perceptually equivalent to F0 present tone complexes in quiet.

FIG. 2

Bar graph shows the percent responses obtained from each of four chinchillas to F0 present and missing F0 stimuli at 73-dB SPL. Responses were obtained in quiet (white bars) and in the presence of low-pass masking noise at overall levels of 52-dB SPL (striped bars), 62-dB SPL (gray bars), and 72-dB SPL (black bars).

In order to evaluate the role of distortion products in the perceptual equivalence between missing F0 and F0 present tone complexes, the standard, signal, and test stimuli were presented in the generalization task in the presence of a continuous low-pass noise. The masking noise had an upper cut-off frequency of 710 Hz, and behavioral responses to test stimuli were obtained with the overall level of the masking noise fixed at 52-, 62-, and 72-dB SPL (striped, gray, and black bars in Fig. 2, respectively). Note that the behavioral responses obtained for each of the masking noise levels are similar to those obtained in quiet for each of the two F0s.

Figure 3 shows the d′ for the discrimination of the 500-Hz F0 signal (i.e., F0 present) from the 125-Hz standard (i.e., F0 present) obtained in quiet and in the three levels of low-pass masking noise for each individual animal. The d′ was computed as z(Hits) − z(False Alarms). Note that d′ remains relatively constant as the masking noise level increases for all four chinchillas; that is, there is no decrease in d′ as masking noise level increases. In particular, when the masking noise is 72-dB SPL, the d′s are similar to those obtained in quiet for each animal. At a level of 72-dB SPL, the spectrum level of the masking noise is approximately 50 dB/Hz (see Fig. 1), and the level of the 500-Hz component is 60.4-dB SPL. This 10-dB difference between noise and tone levels at 500 Hz falls below the critical ratio for a 500-Hz tone (Seaton and Trahiotis 1975; Miller 1988; Niemiec et al. 1992). If behavioral performance reflects frequency discrimination between the 500- and 125-Hz auditory filters, then there should be a decrease in d′ as the level of the masking noise increases to 72-dB SPL. In contrast, however, behavioral performance remains constant as masking noise level increases (see black symbols in Fig. 3).
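The d′ computation, z(Hits) − z(False Alarms), can be sketched with the Python standard library. The function name is hypothetical, and how hit or false alarm rates of exactly 0 or 1 were handled is not stated in the text, so no correction is applied here.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit rate) - z(false alarm rate), with z the inverse normal CDF."""
    z = NormalDist().inv_cdf  # raises StatisticsError for rates of 0 or 1
    return z(hit_rate) - z(false_alarm_rate)
```

For hypothetical rates, `d_prime(0.84, 0.50)` is close to 1, the threshold criterion used in Figure 3, whereas equal hit and false alarm rates give a d′ of 0.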

FIG. 3

Behavioral performance expressed as d′ obtained from four chinchillas for discriminating the 500-Hz F0 present signal complex from the 125-Hz F0 present standard complex in quiet and in low-pass masking noise (black symbols). Tone complexes were presented at an overall level of 73-dB SPL, which corresponds to 60.4-dB SPL per component for the 500-Hz F0 tone complex. Also shown are individual behavioral responses obtained for discriminating a 500-Hz pure tone from a 125-Hz F0 present standard complex in quiet and in low-pass masking noise for pure tones presented at 60-dB SPL (open symbols) and 50-dB SPL (smaller gray symbols). Data are shown as squares for c12, diamonds for c15, triangles for c24, and circles for c36. Lines are used to group data from individuals together. The horizontal dashed line indicates threshold performance (d′ = 1).

In order to verify that masking occurs when the low-pass noise is at a level of 72-dB SPL, animals were re-trained to discriminate a 500-Hz pure tone from a 125-Hz pure tone. The overall levels of the tones were varied, and psychometric functions using the method of constant stimuli were obtained in quiet and in low-pass masking noise at 72-dB SPL for all four chinchillas. The open symbols in Figure 3 show the d′s obtained from the psychometric functions when the pure tones were presented at a level of 60-dB SPL. A sound level of 60-dB SPL corresponds to the level of the individual harmonic components in the 500-Hz F0 tone complex. The d′s are lower in masking noise than in quiet for each animal; note that the open symbols for the 72-dB SPL masker fall below the line indicating where d′ equals 1 (open symbols in Fig. 3). Psychometric functions were also obtained for three chinchillas when the low-pass masking noise was presented at an overall level of 62-dB SPL. For this masker level, d′ is low when the tones are 50-dB SPL (gray symbols in Fig. 3), but not when the tones are at 60-dB SPL (open symbols in Fig. 3). Note that for the 62-dB SPL masker level, gray symbols fall below the line indicating where d′ equals 1, but the open symbols fall above this line.

The amplitudes of cochlear distortion products are below the amplitudes of the primary frequencies (Goldstein 1967; Buunen and Rhode 1978; Nuttall and Dolan 1990, 1993; Robles et al. 1997; Cooper and Rhode 1997). Since the levels of the primaries for the 500-Hz F0 tone complex were 60-dB SPL, the distortion product at 500 Hz should have been below 60-dB SPL. For missing F0 harmonic tone complexes, the amplitudes of distortion products at F0 have been estimated to be 10–15 dB below the levels of the primary frequencies for human listeners (Pressnitzer and Patterson 2001). Given that the levels of the primaries for the 500-Hz F0 tone complex were 60-dB SPL, the level of the distortion product at 500 Hz could be estimated to have been around 50-dB SPL. Thus, the pure tone results shown in Figure 3 indicate that when the low-pass masking noise is at an overall masker level of 62-dB SPL, the level of the noise will be sufficient to mask distortion products if their amplitudes are 50-dB SPL (gray symbols in Fig. 3), but will not be sufficient to mask distortion products having amplitudes of 60-dB SPL (open symbols in Fig. 3). However, when the low-pass noise is at an overall level of 72-dB SPL, the level of the noise will be sufficient to mask distortion products at the F0 even if the amplitudes of those distortion products are equal to 60-dB SPL (open symbols in Fig. 3). Thus, in all subsequent experimental conditions described below, the low-pass masking noise was presented at an overall level of 72-dB SPL in order to adequately mask any distortion products at F0.

Behavioral responses to missing F0 tone complexes were also obtained for two other test conditions. In one condition, the 500-Hz test complex was comprised of the second to 20th harmonics whereas the 125-Hz test complex was comprised of the eighth to 80th harmonics. Consequently, the spectral bandwidth was between 1,000 and 10,000 Hz for both test complexes. In the second condition, the 500-Hz test complex was comprised of the third to 20th harmonics, and the 125-Hz test complex was comprised of the 12th–80th harmonics; the spectral bandwidth was between 1,500 and 10,000 Hz for both test complexes. Again, a low-pass noise at an overall level of 72-dB SPL was presented to mask distortion products. Figure 4 shows that the average behavioral responses to the 500-Hz missing F0 test complexes were equal to those obtained to the signal. The responses to the 125-Hz missing F0 test complexes were not equal to the signal, but were similar to those obtained to the 125-Hz F0 standard.

FIG. 4

Bar graph shows the average percent responses obtained from four chinchillas to F0 present and missing F0 stimuli at 73-dB SPL. Complexes were presented with the low-pass masking noise at an overall level of 72-dB SPL. Note that the bandwidths of the 125- and 500-Hz missing F0 tone complexes were equal. The passbands (PB) were 1–10 kHz (striped bars) or 1.5–10 kHz (gray bars). (Note that both test 2 sounds illustrated in the Supplement for Fig. 6 correspond to the 1.5–10 kHz passband test condition.) The harmonic numbers (Harm) of the tone complexes are indicated in the figure. Error bars are ±1 standard deviation.

The above results suggest that behavioral performance does not reflect frequency discrimination between auditory filters centered at 125 and 500 Hz. Moreover, the results are consistent with the hypothesis that the perceptual equivalence of missing F0 tone complexes to F0 present tone complexes is not due to the reintroduction of cochlear distortion products at the F0s. The standard and signal harmonic tone complexes differ in three stimulus features that could be analyzed by the auditory system. The most obvious is the difference in the periodicity of the 125- and 500-Hz F0 tone complexes, which would argue for a pitch-like percept associated with the sounds. However, in addition to the periodicity differences, there are concurrent differences in spectral bandwidth and in the amplitude of individual harmonic components. The contributions of each of these three features are evaluated in the experiments below.

Experiment 2: Evaluating the role of fundamental frequency

The purpose of this experiment was to determine whether there exists a gradient in behavioral responses as the F0 systematically varies. In this experiment, all stimuli were missing F0 tone complexes. The 500-Hz signal tone complex was comprised of harmonics 2–20, whereas the 125-Hz standard complex was comprised of harmonics 2–80. Note that both of these sounds are comprised of successive harmonics up to 10 kHz. Test stimuli were harmonic tone complexes also missing only the F0. The F0s varied between 125 and 500 Hz in 0.125 octave steps. Consequently, the upper frequency limit of 10 kHz was generally not harmonically related to the F0, except for when the F0 was 250 Hz. The test complexes were generated with all successive harmonics (beginning with harmonic 2) up to the highest frequency that did not exceed 10 kHz. For example, the upper frequency limit for the missing F0 harmonic tone complex with the 354-Hz F0 was 9,912 Hz (i.e., harmonic 28), not 10,266 Hz (i.e., harmonic 29). In this experiment, harmonic tone complexes were played in the presence of continuous low-pass noise fixed at an overall level of 72-dB SPL.
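The series of test F0s and the highest harmonic of each complex follow directly from the description above. The sketch below is illustrative, with hypothetical function names; only the 125–500 Hz range, the 0.125-octave step, and the 10-kHz limit come from the text.

```python
import math

def test_f0s(low=125.0, high=500.0, step_octaves=0.125):
    """F0s from low to high (inclusive) in equal octave steps."""
    n_steps = round(math.log2(high / low) / step_octaves)
    return [low * 2 ** (k * step_octaves) for k in range(n_steps + 1)]

def top_harmonic(f0, f_max=10_000.0):
    """Highest harmonic number whose frequency does not exceed f_max."""
    return int(f_max // f0)

f0s = test_f0s()
```

This gives 17 F0s spanning two octaves. The entry 1.5 octaves above 125 Hz is about 353.6 Hz, which rounds to the 354 Hz quoted above, and `top_harmonic(354.0)` is 28 (9,912 Hz). Only the endpoints and the 250-Hz F0 have a harmonic that falls exactly on 10 kHz.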

Figure 5 shows the behavioral responses to missing F0 complexes obtained as a function of the F0 in the presence of low-pass masking noise. For each chinchilla, there is a systematic increase in behavioral response as F0 increases from the missing 125-Hz F0 standard to the missing 500-Hz F0 signal. That is, there is a gradient in behavioral responses as F0 varies. Generalization gradients were modeled with a best fitting logistic function having the form

$$ \% \;{\hbox{Response}} = \left\{ {p\left( {\hbox{FA}} \right) + \left[ {p\left( {\hbox{Hit}} \right) - p\left( {\hbox{FA}} \right)} \right] \times \frac{1}{{1 + {e^{\left[ { - \theta \left( {{\rm{F}}0 - \mu } \right)} \right]}}}}} \right\} \times 100 $$

where p(FA) and p(Hit) are the probabilities of a false alarm and hit, respectively, θ is the slope of the function, and μ is the mean F0 of the function. There was no theoretical reason for choosing a logistic fit, but rather a logistic fit was chosen for convenience. The means of the logistic functions indicate the half-way point along the generalization gradient. The arithmetic mean of the 125-Hz standard and 500-Hz signal is 312.5 Hz, and it is interesting to note that the means of the logistic functions for each of the four chinchillas (Fig. 5) are 306, 309, 324, and 364 Hz. The average mean of the four logistic functions is 325.75 Hz; a two-tailed t test showed that this is not significantly different from 312.5 Hz (t = 0.993; P = 0.394).
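The logistic gradient and the t test described above can be reproduced with a short sketch. The function names are hypothetical, the factor of 100 is read as applying to the whole expression so that the asymptotes are 100·p(FA) and 100·p(Hit), and the P value uses the closed-form Student t distribution for three degrees of freedom rather than whatever software was used in the study.

```python
import math
from statistics import mean, stdev

def gradient_percent_response(f0, p_fa, p_hit, theta, mu):
    """Logistic generalization gradient, expressed in percent."""
    logistic = 1.0 / (1.0 + math.exp(-theta * (f0 - mu)))
    return 100.0 * (p_fa + (p_hit - p_fa) * logistic)

def one_sample_t(values, mu0):
    """One-sample t statistic against the hypothesized mean mu0."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / math.sqrt(n))

def two_tailed_p_df3(t):
    """Two-tailed P value from the closed-form t CDF with 3 degrees of freedom."""
    x = abs(t) / math.sqrt(3.0)
    cdf = 0.5 + (x / (1.0 + x * x) + math.atan(x)) / math.pi
    return 2.0 * (1.0 - cdf)

logistic_means_hz = [306, 309, 324, 364]   # values reported in the text
t_stat = one_sample_t(logistic_means_hz, 312.5)
p_value = two_tailed_p_df3(t_stat)
```

With the four reported means, this yields a sample mean of 325.75 Hz, t of about 0.993, and P of about 0.394, in agreement with the values given above. At F0 = μ the modeled response is midway between the false alarm and hit percentages.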

FIG. 5

Percent responses as a function of F0 obtained from four chinchillas. In this condition, the 125-Hz standard, 500-Hz signal, and all test tone complexes were missing F0 stimuli (only the F0 was missing). Stimuli were presented at an overall level of 73-dB SPL in the presence of the low-pass masking noise at an overall level of 72-dB SPL. The gradients in behavioral responses obtained were fit with a modified logistic function as described in the text.

Experiment 3: Evaluating the role of spectral bandwidth

Experiment 2 suggests that the missing F0 is the stimulus feature that is being analyzed by the chinchilla auditory system. However, in addition to changes in F0 among the tone complexes, there were concomitant changes in spectral bandwidth among the test tone complexes. That is, if spectral bandwidth was analyzed rather than F0, then the gradients observed in Figure 5 could reflect a perceptual dimension related to bandwidth rather than F0. However, it should be noted that the bandwidths of both of the test complexes used to obtain the responses presented in Figure 4 were equal, having bandwidths of 9,000 and 8,500 Hz for the two conditions, respectively. If spectral bandwidth is the stimulus feature being analyzed, then it would be expected that the behavioral responses to these two test stimuli should be equal to each other. Figure 4 clearly shows that the responses obtained to the two test stimuli are not equal to each other, but are different for both conditions. Rather, behavioral responses were consistent with the appropriate F0s of the signal and standard tone complexes.

To further evaluate the potential role of spectral bandwidth, another set of conditions was examined in which the passband of the 125-Hz standard was 2,500–10,000 Hz and the passband of the 500-Hz signal was 1,000–10,000 Hz. The passbands of the test complexes were 2,000–10,000 and 1,500–10,000 Hz. Note that for these conditions, all stimuli including the standard and the signal were missing F0 tone complexes (see Supplement for Fig. 6 for schematic spectra and corresponding *.wav files). In these conditions, there was a systematic increase in spectral bandwidth across the tone complexes (i.e., 7,500, 8,000, 8,500, and 9,000 Hz). If spectral bandwidth is the stimulus feature that is being analyzed during the discrimination, then there should be a systematic change in behavioral response as bandwidth changes (i.e., a stimulus generalization gradient), regardless of the F0 of the test stimuli. Again, a low-pass noise fixed at 72-dB SPL was presented to mask distortion products. When the F0s of the two test stimuli were both at 500 Hz, the behavioral responses to the test stimuli were equal to those observed for the 500-Hz F0 signal (striped bars in Fig. 6). That is, a gradient in behavioral responses was not obtained for any of the chinchillas tested. When the F0s of the two test stimuli were both at 125 Hz, the behavioral responses to the test stimuli were equal to those observed for the 125-Hz F0 standard (gray bars in Fig. 6). If bandwidth was the feature being analyzed, it would be expected that the behavioral responses for each of these conditions would be the same. In contrast, the behavioral responses are consistent with what would be expected if F0 was the feature being analyzed. When the test stimuli have a 500-Hz F0, regardless of bandwidth, the behavioral responses are high and are similar to those obtained for the 500-Hz signal. When the test stimuli have a 125-Hz F0, regardless of bandwidth, the behavioral responses are low and are similar to those obtained for the 125-Hz standard.
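The bandwidths in this set of conditions follow directly from the stated passbands. The sketch below uses hypothetical labels and names, and simply computes each bandwidth along with the lowest harmonic number implied for each F0.

```python
# (label, lower edge in Hz, upper edge in Hz), ordered by increasing bandwidth
PASSBANDS = [
    ("125-Hz standard", 2500, 10000),
    ("test (narrower)", 2000, 10000),
    ("test (wider)", 1500, 10000),
    ("500-Hz signal", 1000, 10000),
]

def bandwidth_hz(low, high):
    """Spectral bandwidth of a passband."""
    return high - low

def lowest_harmonic(low_edge, f0):
    """Harmonic number of f0 that sits at the lower passband edge."""
    return low_edge // f0

bandwidths = [bandwidth_hz(low, high) for _, low, high in PASSBANDS]
```

The resulting bandwidths are 7,500, 8,000, 8,500, and 9,000 Hz. A 2,000-Hz lower edge corresponds to harmonic 4 of 500 Hz or harmonic 16 of 125 Hz, so the same passband can be filled with either F0, which is what allows bandwidth and F0 to be dissociated.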

FIG. 6

Bar graph shows average percent responses obtained from four chinchillas as bandwidth (BW) systematically increases from left to right. Striped bars show responses obtained when the two test complexes had F0s equal to 500 Hz. Gray bars show responses obtained when the two test complexes had F0s of 125 Hz. The harmonic numbers (Harm) of the tone complexes are indicated in the figure. Error bars are ±1 standard deviation.

Experiment 4: Evaluating the role of the amplitude of individual harmonics

For all of the above experiments, overall level was fixed at 73-dB SPL for all harmonic tone complexes in order to eliminate overall level as a potential cue. As a result, the amplitudes of the individual components for the 125-Hz F0 80-component standard were 6 dB lower than the individual components for the 500-Hz F0 20-component signal. Thus, it is possible that the discrimination between the signal and standard could result simply through intensity discrimination if the animal monitors the output of an auditory filter having a center frequency common to the harmonics of both the standard and signal. For example, the auditory filter centered at 1,000 Hz is at the second harmonic of the 500-Hz F0 signal and at the eighth harmonic of the 125-Hz F0 standard. The output of the 1,000-Hz auditory filter will increase by 6 dB when the standard is followed by the signal. In order to assess whether the discrimination between the 125-Hz F0 standard and 500-Hz F0 signal was due to intensity discrimination in a single channel, animals were tested using 125- and 500-Hz F0 tone complexes in which individual components were all at the same equal levels rather than having equal overall levels. That is, the standard and signal tones complexes as well as the missing F0 test complexes all had components of equal levels. As in experiment 1, standard and signal tones complexes had the F0 present and test complexes were missing F0 stimuli. Tone complexes were presented in continuous low-pass noise at a fixed level of 72-dB SPL. Figure 7 shows the behavioral responses when overall level was fixed (striped bars) and when the level of each individual component was fixed (gray bars). For both conditions, responses to the signal were high, and the responses to the standard are low. 
Moreover, for all animals, the responses to the missing F0 125-Hz test complex were similar to those obtained for the 125-Hz standard, and responses to the missing 500-Hz test complex were similar to those obtained for the signal. That is, the perceptual equivalence between missing F0 and F0 present tone complexes remained unchanged when the amplitude of individual harmonic components was eliminated as a cue.
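
The level relationships in this experiment follow from power summation of equal-level components. The sketch below is only an illustration of that arithmetic, under the assumption that the components of a complex sum in power; the function names are hypothetical and not part of the original methods.

```python
import math

def component_level(overall_db, n_components):
    """Per-component level (dB SPL) when n equal-level components
    sum in power to the given overall level (assumed power summation)."""
    return overall_db - 10 * math.log10(n_components)

def overall_level(component_db, n_components):
    """Overall level (dB SPL) of n components, each at component_db,
    under the same power-summation assumption."""
    return component_db + 10 * math.log10(n_components)

# Equal overall level (73-dB SPL): per-component levels differ.
standard = component_level(73, 80)  # 125-Hz F0 standard, 80 components
signal = component_level(73, 20)    # 500-Hz F0 signal, 20 components
print(round(signal - standard, 2))  # level step at a shared harmonic, e.g. 1,000 Hz

# Equal component level (54-dB SPL): overall levels differ.
print(round(overall_level(54, 80), 1), round(overall_level(54, 20), 1))
```

Under this assumption the per-component difference is 10 log10(80/20), or about 6 dB, and a 54-dB SPL component level gives overall levels of about 73- and 67-dB SPL for the 80- and 20-component complexes, in agreement with the values reported for Figure 7.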

FIG. 7

Bar graph shows the average percent responses obtained from four chinchillas to F0 present and missing F0 stimuli in the presence of low-pass masking noise at an overall level of 72-dB SPL. Striped bars show responses obtained when individual harmonics were adjusted such that all tone complexes had equal overall level of 73-dB SPL. Gray bars show responses obtained when individual harmonics were adjusted such that all components were at an equal level of 54-dB SPL for all tone complexes. For this condition, the overall level of the 125-Hz F0 tone complex was 73-dB SPL and the level of the 500-Hz F0 tone complex was 67-dB SPL. Error bars are ±1 standard deviation.

Discussion

Behavioral responses were measured from chinchillas to harmonic tone complexes in a stimulus generalization paradigm. In stimulus generalization paradigms, an animal is trained to respond to a specific signal stimulus and then tested using stimuli that vary along some physical dimension. Responses by the animal to test stimuli that are similar to the responses obtained to the signal stimulus indicate a perceptual equivalence between the test and signal stimulus (Hulse 1995). If the animal does not respond to the test stimuli, it suggests that there is no perceptual equivalence between the test and signal stimuli. A gradient in responses that is obtained as the physical dimension varies systematically suggests that a perceptual dimension exists in the animal that corresponds to the physical dimension. Responses that are intermediate between the responses to the signal and the standard are an indication of where the percept of the test sound lies between the percepts of signal and standard sounds.

The physical dimension of interest in the present study is the F0 of a harmonic tone complex. The results of the present study show that missing F0 harmonic complex tones are perceptually equivalent to F0 present harmonic complex tones in chinchillas in quiet. When animals were trained to discriminate a 500-Hz F0 present signal complex from a 125-Hz F0 present standard complex and then tested with tone complexes having missing F0s, the responses to the 500-Hz missing F0 tone complexes were equal to those obtained for the 500-Hz F0 present signal. Similarly, when tested with missing 125-Hz F0 complexes, the responses to the missing 125-Hz complex were similar to those observed for the 125-Hz F0 present tone complex. Thus, missing F0 harmonic tone complexes are perceived to be equivalent to F0 present harmonic complexes in chinchillas. These results along with those from similar studies in cats (Heffner and Whitfield 1976; Chung and Colavita 1976), monkeys (Tomlinson and Schwarz 1988), birds (Cynx and Shapiro 1986), and goldfish (Fay 2005) suggest that the perception of the missing F0 is not limited to humans, but rather is a general vertebrate trait.
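
To make the stimulus distinction concrete, the following minimal sketch synthesizes an F0 present and a missing F0 harmonic complex and checks two properties: the missing F0 complex has no energy at the F0, yet its waveform still repeats with a period of 1/F0. The sampling rate (40 kHz), duration (40 ms), cosine starting phase, unit component amplitude, and the use of harmonics 2 through 20 for the missing F0 complex are all assumptions chosen for illustration, and the function names are hypothetical; this is not the stimulus-generation procedure used in the study.

```python
import math

FS = 40000  # assumed sampling rate in Hz

def harmonic_complex(f0, n_harmonics, include_f0=True, fs=FS, dur=0.04):
    """Sum of equal-amplitude, cosine-phase harmonics of f0.
    Uses harmonics 1..n_harmonics, or 2..n_harmonics if include_f0 is False."""
    first = 1 if include_f0 else 2
    n_samples = int(round(fs * dur))
    return [
        sum(math.cos(2 * math.pi * h * f0 * n / fs)
            for h in range(first, n_harmonics + 1))
        for n in range(n_samples)
    ]

def amplitude_at(signal, freq, fs=FS):
    """Amplitude of the sinusoidal component at freq (single-bin DFT).
    Exact only when the window holds a whole number of cycles of freq."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

missing = harmonic_complex(500, 20, include_f0=False)
present = harmonic_complex(500, 20, include_f0=True)

print(round(amplitude_at(present, 500), 6))   # component at F0 when F0 is present
print(round(amplitude_at(missing, 500), 6))   # component at F0 when F0 is missing
print(round(amplitude_at(missing, 1000), 6))  # second harmonic is still present

# The missing F0 waveform still repeats every 1/F0 seconds (80 samples here).
period = FS // 500
periodic = all(abs(missing[i] - missing[i + period]) < 1e-6
               for i in range(len(missing) - period))
print(periodic)
```

With these assumed parameters, the sketch should report essentially zero amplitude at 500 Hz for the missing F0 complex, unit amplitude at 1,000 Hz, and a waveform that remains periodic at the F0.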

In order to assess the possible contribution of distortion products to the perception of the missing F0 in chinchillas, the present study also used low-pass noise in order to mask any distortion products at the F0. The perceptual equivalence that was observed between missing F0 tone complexes and F0 present complexes in quiet was not altered by the presence of low-pass masking noise. If the perception of the missing F0 in chinchillas were due to the detection of distortion products through an auditory filter centered at F0, then there should be a decrease in discrimination performance as the masking noise level increased. Although the discrimination of pure tones was degraded by low-pass masking noise, the discrimination of tone complexes in the presence of low-pass masking noise was identical to that obtained in quiet (Fig. 3). Thus, it does not appear that the auditory system of the chinchilla monitors the output of the auditory filter centered at the F0 for tone complexes. Similar to the missing F0 percept in human listeners (Thurlow and Small 1955; Licklider 1956; Patterson 1969), any underlying reintroduction of distortion products plays a relatively minor role in the perception of the missing F0 in chinchillas, even though the magnitude of distortion products may be larger in non-human mammals (Withnell et al. 2003). That is, the chinchilla is not attending to or analyzing any distortion product at the missing F0.

The perceptual equivalence between missing F0 and F0 present complexes observed in the present study raises the question as to whether there is a psychological or perceptual dimension related to F0 in the chinchilla. It has been argued that if a psychological dimension corresponding to a physical dimension does exist, then a gradient in behavioral responses should be observed as the physical dimension changes systematically (Guttman 1963). In experiment 2, chinchillas discriminated a 500-Hz missing F0 signal from a 125-Hz missing F0 standard and were then tested with missing F0 complexes in which the F0 varied systematically between 125 and 500 Hz. A gradient in behavioral responses was observed for each of the four chinchillas, suggesting that a perceptual dimension of missing F0 exists in the chinchilla. Again, since the missing F0 generalization gradients were obtained in the presence of low-pass masking noise, it is unlikely that the psychological dimension of F0 is due to the detection of distortion products at F0.

If the acoustic feature being analyzed by the chinchilla auditory system is not a component corresponding to the F0, then what feature of the harmonic complex tones are the animals responding to during the discrimination of the 500-Hz signal complex from the 125-Hz standard complex? For the specific stimuli used in the present study, there are concomitant changes in two other acoustic features in addition to changes in F0 or periodicity. These features include spectral bandwidth and the level of individual harmonic components. During the discrimination between the 500-Hz signal complex and the 125-Hz standard complex, the chinchillas could potentially be attending to either of these acoustic features rather than F0 (or periodicity).

The results of experiment 3 argue that spectral bandwidth is not the stimulus feature being analyzed. In this experiment, the bandwidth of the tone complexes varied systematically, whereas the F0 did not. If bandwidth was the feature being analyzed, then a gradient in behavioral responses should be observed. However, a gradient in behavioral responses was not observed, but rather the behavioral responses were consistent with the missing F0. When the F0 of the test stimuli corresponded to 500 Hz (Fig. 6), behavioral responses to the test complexes were similar to those of the 500-Hz signal complex; but when the F0 of the test stimuli corresponded to 125 Hz (Fig. 6), behavioral responses to the test complexes were similar to those of the 125-Hz standard complex. Thus, behavioral responses appear to reflect an analysis of the missing F0 and not an analysis of spectral bandwidth.

For most of the experiments in the present study, the overall level was fixed for all of the stimuli. Consequently, there are differences in the level of the individual harmonic components across stimuli. For example, because the 500-Hz F0 present signal was comprised of 20 components and the 125-Hz F0 present standard was comprised of 80 components, the level of the individual components will differ between these two sounds. Thus, the discrimination between the signal and standard could be performed by monitoring the output of an auditory filter centered at one of the harmonics and responding to the intensity difference produced between the standard and signal. If this intensity difference was the cue used in the discrimination, then behavioral performance should be degraded if the level of the individual harmonics was equalized. However, when the harmonics for all stimuli were adjusted to be equal in level, there was no effect on behavioral performance. That is, behavioral responses to the 500-Hz F0 present signal were high, whereas responses to the 125-Hz F0 present standard were low. Moreover, responses to missing F0 test complexes were also unaltered; the 500-Hz missing F0 test complex was perceptually equivalent to the 500-Hz F0 present signal, and the 125-Hz missing F0 test complex was perceptually equivalent to the 125-Hz F0 present standard. These results argue that the feature being analyzed is not the level of individual harmonics.

Understanding the role of cochlear distortion products in periodicity perception is important for understanding the neural mechanisms underlying pitch. Langner and co-workers (Langner and Schreiner 1988; Schreiner and Langner 1988; Langner et al. 2002; Biebel and Langner 2002) have reported the existence of topographic maps of best modulation frequency (BMF) in the inferior colliculus (IC) of mammals with the organization of BMF being orthogonal to the tonotopic map of best frequency. These authors have argued that these BMF maps are an important neural substrate for pitch perception. The controversy regarding the role of these units in pitch perception is whether modulation tuning arises from specific neural mechanisms in the IC or simply reflects cochlear distortion products (see McAlpine 2004). The results of the present behavioral study establish that the perception of the missing F0 in non-human mammals is not altered by the presence of low-pass masking noise at the F0 sufficient to mask distortion products at the F0. Based on these results, one would expect that the physiological responses of any presumptive “pitch-specific” neuron in the mammalian central auditory system to missing F0 tone complexes would also not be affected by the presence of masking noise. If masking noise does not affect IC modulation tuning, then it would suggest that modulation tuning does not arise from distortion products and that these units are likely to be important for pitch perception. However, if masking noise has an effect on modulation tuning, then it would suggest that the IC responses simply reflect distortion products and thus, are not likely to be important in mammalian pitch perception. In addition, “pitch-selective” units which respond to both pure tones and missing F0 tone complexes having similar F0s have been described in the auditory cortex of marmosets (Bendor and Wang 2005). 
More recently, Bendor and Wang (2010) have reported that for pitch-selective neurons, rate responses are not affected by the presence of masking noise, whereas the rate responses of “non-pitch” neurons to the distortion product at the F0 are reduced by the presence of masking noise. The former is a physiological property expected of pitch-specific neurons in the mammalian central auditory system. The results of masking studies on modulation tuning in the IC would provide important insights into whether pitch-selective cortical units receive input from modulation-tuned IC units.

In conclusion, the results of the present study argue that chinchillas possess a pitch-like perception of the missing F0 that is unlikely to arise from the reintroduction of the F0 through cochlear distortion products. In addition to the perception of the missing F0, chinchillas also show spectral dominance (Shofner and Yost 1997), a perceptual dimension related to the delay of iterated rippled noise (Shofner et al. 2007), and a perceptual dimension related to periodicity strength (Shofner et al. 2005). All of the above-mentioned pitch attributes also exist in human listeners (Thurlow and Small 1955; Licklider 1956; Ritsma 1967; Plomp 1967; Patterson 1969; Bilsen and Ritsma 1970; Yost and Hill 1978; Yost 1996a, b; Leek and Summers 2001; Shofner and Selas 2002). The similarities in pitch attributes between humans and chinchillas suggest that the mechanisms underlying pitch perception reflect mechanisms of periodicity processing common to all mammals rather than mechanisms specific to the human auditory system. Moreover, although aspects of learning can influence pitch perception in chinchillas (Shofner et al. 2005; Shofner and Whitmer 2006), none of the chinchillas used in the present study or in the previously cited studies from our lab had been trained specifically to discriminate speech sounds. Consequently, none of the animals tested had acquired any speech recognition skills, yet they still possessed the same psychological attributes of pitch as human listeners. This finding argues that the mechanisms underlying pitch do not require a learning stage based on repeated exposure to speech sounds (Terhardt 1974; Divenyi 1979).