Acoustic Cues for Sound Source Distance and Azimuth in Rabbits, a Racquetball and a Rigid Spherical Model
There are numerous studies measuring the transfer functions representing signal transformation between a source and each ear canal, i.e., the head-related transfer functions (HRTFs), for various species. However, only a handful of these address the effects of sound source distance on HRTFs. This is the first study of HRTFs in the rabbit where the emphasis is on the effects of sound source distance and azimuth on HRTFs. With the rabbit placed in an anechoic chamber, we made acoustic measurements with miniature microphones placed deep in each ear canal to a sound source at different positions (10–160 cm distance, ±150° azimuth). The sound was a logarithmically swept broadband chirp. For comparisons, we also obtained the HRTFs from a racquetball and a computational model for a rigid sphere. We found that (1) the spectral shape of the HRTF in each ear changed with sound source location; (2) interaural level difference (ILD) increased with decreasing distance and with increasing frequency. Furthermore, ILDs can be substantial even at low frequencies when the source is close; and (3) interaural time difference (ITD) decreased with decreasing distance and generally increased with decreasing frequency. The observations in the rabbit were reproduced, in general, by those in the racquetball, albeit greater in magnitude in the rabbit. In the sphere model, the results were partly similar to and partly different from those in the racquetball and the rabbit. These findings refute the common notions that ILD is negligible at low frequencies and that ITD is constant across frequency. These misconceptions became evident when distance-dependent changes were examined.
Keywords: head-related transfer function (HRTF), sound localization, acoustics, auditory distance, interaural time difference (ITD) cues, interaural level difference (ILD) cues, spectral cues
There are numerous studies measuring the transfer functions representing signal transformation between a sound source and each ear canal, i.e., the head-related transfer functions (HRTFs), for various species. However, only a handful of these addressed the effects of sound source distance on HRTFs (Zahorik 2002; Kim et al. 2008; Kuwada et al. 2010a). The ability to recognize the distance of a sound source, be it predator or prey, is important for an animal's survival.
In this study, we measured the effects of sound source distance and azimuth on HRTFs in the rabbit. This is the first study of HRTFs in the rabbit, an animal that has prominent pinnas and is being increasingly used in auditory neurophysiology (e.g., Zheng et al. 2002; Coffey et al. 2006; Kuwada et al. 2006; Devore and Delgutte 2008; Ebert et al. 2008). We compared the rabbit HRTFs to those of a racquetball and a rigid sphere model (Duda and Martens 1998) that had diameters similar to the rabbit's interaural distance. This comparison was made in order to gain insights into how the various attributes of the HRTF are affected by components of the head and body not present in simple structures, i.e., racquetball and sphere model. The goal was to provide systematic information about how monaural and binaural acoustic cues associated with sound source location change with sound source distance, azimuth, and frequency. Such information could be used to generate virtual auditory space stimuli to investigate spatial hearing in the animal both behaviorally and physiologically. The virtual auditory space method presents HRTF-filtered signals through earphones. In humans, the perception created by the virtual space stimuli mimics the perception of the corresponding real sound presented in external space (Wightman and Kistler 1989; Kulkarni and Colburn 1998).
Measuring spatial signal transformations between a sound source and the ears is ideally achieved if the sound emanates from a point in space. When the sound source is not at a point in space, the consequences become increasingly problematic with decreasing distance. Because we used sound sources at close distances, this issue was important for our acoustic measurements.
We designed and constructed a sound source to approximate an acoustic point source. It consisted of a loudspeaker with a 3-in. diaphragm (Fostex FF85K) enclosed in an airtight, foam-filled box. The sound source displayed total harmonic distortion <1% at the signal levels used. The diaphragm was sealed onto a custom rigid conical coupler with either a 15-mm aperture and a 227-mm axis length or a 30-mm aperture and a 177-mm axis length. The interior of the rigid cone had a Pellon cloth inner cone to serve as a cross-mode suppressor. The inner cone was separated from the inner wall of the outer cone by several felt standoffs (5-mm cubes). The tips of outer and inner cones were loosely packed with acoustic damping material (AcustaStuff).
We used the small aperture source for distances between 10 and 40 cm and the larger aperture source for distances between 56 and 160 cm. Because the intensity decreases with increasing distance, we used the source with the larger aperture at distances beyond 40 cm in order to produce sufficient intensity at the microphones. The frequency responses of the small (15 mm) and large (30 mm) aperture sound sources were measured with a ½-in. microphone (B&K Type 4190) facing the aperture at a distance of 5 cm. For both sound sources, the output was constant within ±9 dB over a range of 0.2–20 kHz. The small aperture sound source level was on average 7.2 dB lower than the large aperture source.
We measured the directionality of each sound source for three frequency bands as a function of azimuth for each of several distances. The levels were relative to the level at 0° azimuth at each distance and for each frequency band. Each sound source was fixed in space such that the axis of the conical coupler was in a horizontal plane. The azimuth and distance of the microphone (B&K Type 4190) were varied relative to the tip of the sound source's conical coupler. An ideal point source exhibits equal level across all azimuths. Overall, our sound sources approximated this behavior. For both sound sources, for all azimuths and distances, the sound levels were essentially constant within 10 dB in the low and mid-frequency bands. In the high frequency band, the sound levels remained constant within 10 dB for azimuths between ±90° for the small aperture source, whereas the levels remained within 10 dB for azimuths between ±50° for the larger aperture source. In the actual recording situation with the rabbit in place, the angle subtended by the rabbit's head was ∼±30° at 10 cm distance and ∼±1.8° at 160 cm. So, for these azimuths, the sound level remained within 1 dB for all frequency bands. The sound source was positioned from −180° to 165° in 15° steps along the azimuth and from 10 to 160 cm in half doubling steps (factor of 1.4) along the distance. For the close distances, the azimuths towards the back could not be measured because the body of the sound source was an obstruction.
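The grid of source positions described above can be enumerated in a few lines. This is only an illustrative sketch: it assumes that a "half doubling step" is an exact factor of √2 (the text quotes 1.4), so the computed values are nominal and may differ by about a centimeter from the distances quoted elsewhere in the paper (e.g., 56 cm). The variable names are hypothetical.

```python
# Source positions described in the text: azimuth from -180 to 165 degrees in
# 15-degree steps, distance from 10 to 160 cm in half-doubling steps.
azimuths_deg = list(range(-180, 166, 15))

# Assumption: a half-doubling step multiplies distance by sqrt(2) (about 1.4),
# so two steps double the distance and eight steps span 10 to 160 cm.
n_steps = 8
distances_cm = [10 * 2 ** (k / 2) for k in range(n_steps + 1)]

print(len(azimuths_deg), "azimuths from", azimuths_deg[0], "to", azimuths_deg[-1])
print("nominal distances (cm):", [round(d, 1) for d in distances_cm])
```

Under this assumption the grid has 24 azimuths and 9 distances. As noted in the text, not every combination was measurable at close range.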
Measurements were made in an anechoic chamber designed so that the sound field complies with the performance specified in ISO 3745 (1977). All surfaces of the anechoic chamber were lined with fiberglass wedges designed to be anechoic between 0.11–200 kHz. The space between the wedge tips was approximately 9 × 4 × 4 m, which permitted acoustic experiments to be conducted with sound sources both near and far from the animal.
We chose the rabbit because this animal has been used in neurophysiological studies (Kuwada et al. 2006; Nelson and Carney 2007; Devore and Delgutte 2008; Fitzpatrick et al. 2009). It is a good model because the frequency range of the rabbit's hearing (Heffner and Masterton 1980) largely overlaps with that of human hearing. Furthermore, Carney et al. (2010) recently showed that the Dutch-belted rabbit can discriminate the distance of a sound source.
Figure 1 illustrates the outer ear of the rabbit (panel A) with a positive cast of the deep, hidden part of the outer ear superimposed onto the visible part (panel B). The visible part of the outer ear (pinna) is relatively simple and is expected to act as a collector that funnels the sound into the deeper, more complex part of the outer ear. The deeper part has ridges that create two to three air compartments (panel B) that the sound must travel through before reaching a cylindrical part that ends with the tympanum.
Cross-sections of the cast at four levels are shown in the right part of panel B. The deepest section shows an embedded miniature microphone (Knowles FG-3329, 2.5-mm diameter × 2.6-mm length) used to measure the acoustic signals. We inserted a pair of such earmold tips into the ears and then recorded ear canal signals, i.e., a blocked meatus approach (Moller et al. 1995). This location was chosen because it is near the point where the ear canal becomes cylindrical and constant in diameter. Beyond this point, little if any influence of source direction on the acoustic signal is expected because, in humans and cats, the signal in the ear canal was direction-independent (Wiener and Ross 1946; Wiener et al. 1966; Middlebrooks and Green 1990; Moller et al. 1995).
We used two rabbits. Rabbits 1 and 2 had body weights of 2.3 and 2.15 kg, respectively. The lengths of the outer ear from the bottom of the slit (Fig. 5a. arrow) to the tip for rabbit 1 were 80 and 81.2 mm for the right and left ears, respectively. The interaural distance of this rabbit measured at the level of the slits was 57.3 mm. Rabbit 2 died unexpectedly, and measurements of its outer ears were not available.
Head-related transfer functions
These functions were obtained by presenting a periodic logarithmic-chirp (0.05–49 kHz, 671 ms duration, repeated eight times) to the rabbit, at each spatial position described above. We used a PC with custom Matlab-based software interfaced to an RX6 processor (TDT). The processor output was connected to an attenuator (HP 350D) and then a power amplifier (Bryston 2B-LP) to drive the sound source. The output of each miniature microphone (Knowles FG-3329) was processed with a custom-designed amplifier–filter (40 dB gain, 0.03 to 30 kHz pass band). The RX6 processor concurrently sampled, at 98 kHz, the outputs of the amplifier–filter. The frequency response of the total measurement system that included the sound source and each microphone was obtained with each microphone placed 5 cm in front of the sound source. This frequency response served as the reference signal for the determination of the HRTF. The frequency responses measured with the miniature microphones were indistinguishable from those measured with the B&K ½-in. microphone. For each sound source location, the HRTF of each ear was obtained by dividing the spectrum of the signal in the ear by the spectrum of the reference signal. The head-related impulse response (HRIR) was obtained by taking an inverse Fourier-transform of the HRTF.
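The spectral division and inverse transform described above can be sketched as follows. This is a minimal illustration using NumPy, not the authors' Matlab software: the function name is hypothetical, and a random noise burst stands in for the averaged chirp recordings so that the result can be checked against a known delay and gain.

```python
import numpy as np

def hrtf_and_hrir(ear_signal, reference_signal):
    """Divide the ear-canal spectrum by the reference spectrum, then invert."""
    ear_spec = np.fft.fft(ear_signal)
    ref_spec = np.fft.fft(reference_signal)
    hrtf = ear_spec / ref_spec            # complex transfer function
    hrir = np.real(np.fft.ifft(hrtf))     # periodic (circular) impulse response
    return hrtf, hrir

# Synthetic check (assumed data, not measurements): the "ear" signal is the
# reference delayed circularly by 5 samples and halved in amplitude.
rng = np.random.default_rng(0)
reference = rng.standard_normal(1024)
ear = 0.5 * np.roll(reference, 5)

hrtf, hrir = hrtf_and_hrir(ear, reference)
peak_index = int(np.argmax(np.abs(hrir)))
print("HRIR peak at sample", peak_index, "with amplitude", round(float(hrir[peak_index]), 3))
```

Because the discrete Fourier transform treats the signal as periodic, the recovered HRIR is circular. This matches the periodic stimulus described in the text, and it is why the last point of the HRIR is contiguous with the first.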
In order to separate the direct signal component from the late-arriving reverberant component, we applied a temporal window to the HRIR. We centered a Blackman window (Oppenheim et al. 1999) at the onset of the direct signal component. The full width of the window was 40 ms. Thus, the effective width of the window (i.e., the width of an equivalent rectangular window) was 20 ms. Since the HRIR was periodic where the last point was contiguous with the first point, the temporal window often encompassed both the initial portion and the last portion of the HRIR. We took a Fourier-transform of the direct HRIR to obtain the direct HRTF. The amplitude of the HRTF was converted into level in decibels (relative to the reference signal) while the phase of the HRTF was converted into “phase delay” in milliseconds (viz., phase/frequency; Blauert 1997). The amplitude and phase versus frequency functions were smoothed by convolving them with a Blackman logarithmic-frequency window with a 1/6 octave half-width. This approach is analogous to that of Kulkarni and Colburn (1998). We determined ILD for each frequency by taking a difference between the monaural levels (right–left level). Likewise, we determined ITD for each frequency by taking a difference between the monaural phase delays.
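The windowing and cue extraction steps can be sketched as below. This is an illustrative outline under stated assumptions, not the authors' implementation. The function names are hypothetical, the ITD is taken as right minus left by analogy with the ILD (the text does not state the sign convention), phase delay is computed as minus the unwrapped phase divided by 2πf, and the 1/6-octave smoothing is omitted. The two "ears" are idealized pure delays so that the expected cues are known.

```python
import numpy as np

def window_direct(hrir, onset_index, width_samples):
    """Keep the direct component with a Blackman window centered on the onset.

    Indices wrap modulo len(hrir) because the HRIR is periodic.
    """
    window = np.blackman(width_samples)
    idx = (onset_index - width_samples // 2 + np.arange(width_samples)) % len(hrir)
    direct = np.zeros_like(hrir, dtype=float)
    direct[idx] = hrir[idx] * window
    return direct

def level_db(hrtf):
    """Level in decibels relative to the reference signal."""
    return 20.0 * np.log10(np.abs(hrtf))

def phase_delay_ms(hrtf, freqs_hz):
    """Phase delay in ms; assumes the phase at the first frequency is unwrapped."""
    phase = np.unwrap(np.angle(hrtf))
    return -phase / (2.0 * np.pi * freqs_hz) * 1e3

# Windowing demo: direct impulse at sample 2, a later reflection at sample 40.
hrir = np.zeros(64)
hrir[2], hrir[40] = 1.0, 0.3
direct = window_direct(hrir, onset_index=2, width_samples=9)

# Cue demo with idealized ears: right ear 0.1 ms delay and unit gain,
# left ear 0.3 ms delay and half gain (assumed values, not measurements).
freqs_hz = np.arange(200.0, 2001.0, 100.0)
right = 1.0 * np.exp(-2j * np.pi * freqs_hz * 0.1e-3)
left = 0.5 * np.exp(-2j * np.pi * freqs_hz * 0.3e-3)

ild_db = level_db(right) - level_db(left)                                    # right - left
itd_ms = phase_delay_ms(right, freqs_hz) - phase_delay_ms(left, freqs_hz)    # assumed sign
print("ILD (dB):", round(float(ild_db[0]), 2), " ITD (ms):", round(float(itd_ms[0]), 3))
```

With measured data, phase unwrapping needs to start at a frequency low enough that the phase has not yet wrapped. Otherwise the phase delay is offset by a multiple of one period.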
We made measurements in a racquetball in the anechoic chamber so that we could compare it with the rabbit and the computational model of a rigid sphere. We chose a racquetball because its diameter (56 mm) is close to that of the rabbit's interaural distance. The racquetball was filled with Plaster of Paris and fitted with two miniature microphones (Knowles FG-3329) at ±90° azimuths on the equator. The surface of each microphone was flush with the surface of the ball.
Rigid sphere model
We used a quantitative model of a rigid sphere (Duda and Martens 1998) that incorporates sound source distance and azimuth as independent variables. We used their model without modification and derived level and phase of monaural signals. We used these signals to determine ITD and ILD as a function of frequency, distance, and azimuth as a basis for comparison with measurements in the rabbit and racquetball.
Level and temporal characteristics of the racquetball and rigid sphere model
Variability of HRTF measurements in the rabbit
The variability of HRTFs measured both within and across sessions was small. This was assessed by making multiple measurements at the same sound source location. We recorded ten HRTFs over 2 days in the right and left ears of a rabbit to a sound source at the same location (0° azimuth and 20-cm distance). In both ears, the spectral shapes remained stable across the multiple measurements. The standard deviation was less than 2.8 dB except at 20 kHz where it was 5 dB.
Monaural and binaural level characteristics of the rabbit
Across distance, the level in both ears increased with decreasing distance as expected, except at high frequencies (Fig. 5, top and middle rows). The level in the right ear (ear closer to the sound source, top row) generally increased as the sound source azimuth moved from 0° (directly in front of the rabbit) to 90°. The level in the left ear (middle row) displayed an opposite pattern to the right ear in that the level decreased as the sound source azimuth moved from 0° to 90°. These opposite changes in the levels of the two ears indicate that the ILD should generally increase as the sound source azimuth is moved from 0° to 90°. This prediction was verified by the ILDs (bottom row) derived from the monaural levels. Between 0.2 and 1 kHz, the ILD was flat and increased with increasing azimuth and with decreasing distance. At the closest distance (10 cm), it reached ∼10 dB whereas it was near 0 dB at the farthest distance (160 cm). At high frequencies, the ILDs were larger but less systematically related to distance. The maximum ILD was much larger (42 dB) for the rabbit than for the ball (15 dB) or the sphere model (12 dB). This difference must be due to the complex geometry of the rabbit's outer ear, head, and body.
Monaural and binaural temporal characteristics of the rabbit
Effect of pinna position on ILD and ITD
All of the previous results from the rabbits were from ear canal measurements where the ears were upright and symmetrical. Here, we show the effects of ear position on ITD and ILD. By rotating the trapeze the right ear could be placed forward, and the left ear placed backward relative to the symmetrical position. The other variant was to reverse this configuration, i.e., right ear backward and left ear forward.
The shifts in the ITD functions (Fig. 14D) were similar to those seen in the ILD functions (Fig. 14A–C). Relative to the symmetrical ear position the amount of shift was larger when the right ear was forward than when the left ear was forward. The asymmetrical ear positions also reduced the range of ITDs from ±296 µs to ±259 µs (red) or to ±270 µs (blue) for the opposite ear position. The asymmetrical shifts and reduction in range may reflect the same factors discussed above.
This is the first study of HRTFs in the rabbit. Our main finding is that features of the rabbit HRTF changed substantially with sound source distance and azimuth. We found that (1) the spectral shape of the HRTF in each ear changed with sound source location; (2) ILD increased with decreasing distance and with increasing frequency. Furthermore, ILDs can be substantial even at low frequencies at close distances; and (3) ITD decreased with decreasing distance and generally increased with decreasing frequency. The information described in this paper extends the literature for acoustic cues for sound localization with respect to distance. The present results will be useful, e.g., for generating virtual auditory space stimuli that can facilitate behavioral and neural studies of sound localization (Kuwada et al. 2010b).
Studies that measured HRTFs as a function of distance are few. Brungart and Rabinowitz (1999) measured sound source distance effects using the human mannequin, KEMAR. Zahorik (2002) and Kuwada et al. (2010a) performed these measurements in humans, and Kim et al. (2008) measured this in barn owls. A rigid sphere model by Duda and Martens (1998) provides a most useful theoretical tool for computing HRTFs for any sound source location in 3-dimensional space.
Effects of distance and azimuth on spectral shape of HRTFs
Distance-dependent changes in spectral shape occurred mainly at frequencies above ∼3 kHz in the rabbit and in the studies cited above. This change was not observed in the racquetball or spherical model, implying that these spectral changes arise from the phase cancelations created by the complex shapes of the head and outer ear. This is consistent with the finding that pinna removal in the rat and the cat led to a reduction of the gain and smoothing of the spectral shape (Wiener et al. 1966; Koka et al. 2008; Tollin and Koka 2009).
Both the rabbit and human exhibited remarkably similar positive gains. For example, at a distance of 14 cm and 90° azimuth, the maximum gain was 15.3 dB at 2.8 kHz in the rabbit (data not shown) and 14.6 dB at 4.3 kHz in the human (Kuwada et al. 2010a, b). This similarity was also observed when cats were compared with humans regarding both the maximum gain and the frequency at the maximum gain (Wiener and Ross 1946; Wiener et al. 1966).
We showed that the pattern of change of the spectral shape across distance and azimuth was recognizable in the ear on the same side as the sound source. In contrast, the changes of the spectral shape in the opposite ear were more complex (Fig. 6). This is consistent with the observation that human monaural sound localization is based mainly on the information in the ear on the side of the sound source (Humanski and Butler 1988; Morimoto 2001; Jin et al. 2004; Van Wanrooij and Van Opstal 2004). However, Van Wanrooij and Van Opstal (2004) observed that the head shadowing effect was also a contributing factor to monaural sound localization, thus implicating a role of monaural information in the ear opposite to the sound source. Slattery and Middlebrooks (1994) observed that some unilaterally deaf humans were able to achieve reasonably accurate sound localization whether the sound source was on the same side as or the opposite side from the intact ear. This further implies that the information reaching the intact ear contributes to sound localization whether the source is on the same side or the opposite side.
Effects of distance, azimuth, and frequency on ILD
In the rabbit, when the sound source was off the median plane, the ILD increased with decreasing sound source distance. At close distances, ILDs could be substantial even at low frequencies. Brungart (1999) showed in humans that very low-frequency ILDs are important for distance perception. The increase in ILD with decreasing distance has been reported in the studies of humans, a human mannequin, and barn owls cited above. The largest ILDs were seen at lateral sound source positions and at high frequencies. This has been extensively reported in humans (Wightman and Kistler 1998) and different nonhuman species (cat: Musicant et al. 1990; gerbil: Maki and Furukawa 2005; ferret: Schnupp et al. 2003; guinea pig: Sterbing et al. 2003; rat: Koka et al. 2008; barn owl: Keller et al. 1998).
ILDs in the rabbit were much larger than those in the racquetball and sphere model. This difference must be due to the complex geometry of the rabbit's outer ear, head, and body. Consistent with this view is the finding that removal of the pinna in the rat reduced the ILDs (Koka et al. 2008).
Effects of distance, azimuth, and frequency on ITD
The ITD behavior in the rabbit across distance and frequency was partly similar to and partly different from that in the racquetball. Both showed a decrease in ITD with decreasing distance, albeit a much larger one, by a factor of 4.4, in the rabbit. The maximum ITD was greater in the rabbit (350 vs. 250 µs), which may arise from the rabbit's outer ears and body. This may be consistent with the finding of Koka et al. (2008) that pinna removal reduced the envelope ITDs in the rat. At frequencies <1 kHz and distances ≥40 cm, the ITD in the rabbit increased with decreasing frequency, whereas, in the ball, it was approximately constant. This constant ITD across frequency of the ball was well-explained by the rigid sphere model.
The increase in ITD with decreasing frequency at far distances described in the rabbit has also been reported in the human mannequin (Kuhn 1977; Brungart and Rabinowitz 1999), cat (Roth et al. 1980), and guinea pig (Sterbing et al. 2003). These observations refute the common misconception that the physiological range of ITD is constant across frequency. The observation that the best ITD of a neuron increases with decreasing best frequency (McAlpine et al. 2001) may be, at least in part, due to this acoustic phenomenon.
The effect of distance and frequency on ITD has been studied in the human mannequin and the matching rigid sphere model by Brungart and Rabinowitz (1999). They found that ITD increased with decreasing frequency in both the human mannequin and the sphere model. This is consistent with the present finding and with a separate measurement of the human mannequin reported by Kuhn (1977). Furthermore, Brungart and Rabinowitz (1999) found that for frequencies above 0.5 kHz, ITD increased with decreasing distance in both the mannequin and sphere model, albeit by a smaller amount in the sphere model. Our findings of a decrease in ITD with decreasing distance in the rabbit and the racquetball seemingly contradict those of Brungart and Rabinowitz (1999). However, Kuwada et al. (2010a) found that, in humans and in a human-head-size ball, ITD did decrease with decreasing distance at frequencies between 0.2 and ∼0.5 kHz, a pattern much like that seen in the rabbit. Since Brungart and Rabinowitz (1999) did not include ITDs for frequencies below 0.5 kHz, the above pattern was missed.
The behaviors of different diameter rigid sphere models are identical when the variables are properly normalized (Duda and Martens 1998). In this approach, the normalized frequency is proportional to the diameter of the sphere. Considering that we used a diameter of 0.056 m for the rabbit-size sphere and Brungart and Rabinowitz (1999) used a diameter of 0.18 m for the human-size sphere, the normalized frequency is 3.21 times higher for the human-size sphere. For example, 2 kHz for the rabbit-size sphere is equivalent to 0.62 kHz for the human-size sphere. At these frequencies, ITD increased with decreasing distance in both size spheres.
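The scaling argument above amounts to a short calculation. The sketch below assumes only what the text states, namely that normalized frequency is proportional to sphere diameter, so two spheres behave alike when the product of frequency and diameter is equal. The function and variable names are hypothetical.

```python
# Sphere diameters quoted in the text (meters).
rabbit_diameter_m = 0.056
human_diameter_m = 0.18

# Ratio of normalized frequencies at the same physical frequency.
ratio = human_diameter_m / rabbit_diameter_m

def equivalent_frequency_khz(f_khz, from_diameter_m, to_diameter_m):
    """Frequency on the second sphere with the same normalized frequency.

    Equal normalized frequency means f * diameter is the same for both spheres.
    """
    return f_khz * from_diameter_m / to_diameter_m

f_human_khz = equivalent_frequency_khz(2.0, rabbit_diameter_m, human_diameter_m)
print("ratio:", round(ratio, 2))                          # about 3.21
print("2 kHz (rabbit-size) maps to", round(f_human_khz, 2), "kHz (human-size)")
```

The same function run in the other direction converts human-size frequencies to their rabbit-size equivalents, which is useful when comparing against the human mannequin literature.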
The acoustic data indicate that the maximum ITDs were 301 and 254 µs for rabbit 1 and 2, respectively. The bulk of the characteristic delays (CD) and best ITDs of neurons in the dorsal nucleus of the lateral lemniscus (DNLL) and inferior colliculus (IC) recorded in the unanesthetized rabbit were within ±300 µs (DNLL, CD = 92%, best ITD = 88%, IC, CD = 83%, best ITD = 79%; Kuwada et al. 2006) and heavily biased towards the contralateral azimuthal field. Thus, the bulk of the neural CDs and best ITDs fall within the range of acoustic ITDs measured in the rabbit.
Resolving ambiguity in distance and azimuth
Different combinations of sound source azimuth and distance can produce similar values of an acoustic cue, thus creating ambiguity in localizing a sound source. Is it possible to resolve such ambiguities by unique combinations of multiple acoustic cues? For example, ITD alone is an ambiguous indicator of azimuth because it changes with distance. This is illustrated in Figure 8 for rabbit 1. The ITD of −234 µs corresponds to a source either at 10 cm, −90° or at 40 cm, −45°. This ambiguity can be resolved if ITD and ILD cues are combined as follows. In Figure 4, for the same rabbit, the low band ILD is 11 dB at 10 cm, 90° or 2.3 dB at 40 cm, 45°. So, when the two cues are combined, the ambiguity is resolved, e.g., a combination of −234 µs ITD and 11 dB ILD uniquely identifies the location to be at a distance of 10 cm and an azimuth of −90°. In general, combining multiple acoustic cues derived from monaural and binaural spectra would help resolve ambiguities in sound source location (Slattery and Middlebrooks 1994; Jin et al. 2004; Van Wanrooij and Van Opstal 2004).
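The cue-combination argument can be illustrated with a toy lookup that uses only the two locations and cue values quoted above. This is a sketch of the reasoning, not a localization model: the table structure, tolerances, and function name are hypothetical, and the ILD values are entered as the magnitudes given in the text.

```python
# Two candidate locations and their cues, taken from the example in the text.
# ILD values are the magnitudes quoted for the low band.
candidates = [
    {"distance_cm": 10, "azimuth_deg": -90, "itd_us": -234.0, "ild_db": 11.0},
    {"distance_cm": 40, "azimuth_deg": -45, "itd_us": -234.0, "ild_db": 2.3},
]

def match_locations(itd_us, ild_db=None, itd_tol_us=10.0, ild_tol_db=1.0):
    """Return candidates consistent with the observed cues.

    The tolerances are illustrative assumptions, not values from the paper.
    """
    matches = [c for c in candidates if abs(c["itd_us"] - itd_us) <= itd_tol_us]
    if ild_db is not None:
        matches = [c for c in matches if abs(c["ild_db"] - ild_db) <= ild_tol_db]
    return matches

itd_only = match_locations(-234.0)                 # ambiguous: two locations
itd_and_ild = match_locations(-234.0, ild_db=11.0)  # resolved: one location
print(len(itd_only), "match(es) with ITD alone")
print(len(itd_and_ild), "match(es) with ITD and ILD:", itd_and_ild)
```

With ITD alone both locations remain candidates. Adding the ILD leaves only the source at 10 cm and −90°.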
Effects of pinna position on binaural cues
The position of the pinna could markedly alter the ILD and ITD cues for a fixed sound source location. Thus, given its mobile pinna, the rabbit could use these ear movements to place the sound source location at the center of the ILD-ITD ranges. Furthermore, the shifts in ILD and ITD in azimuth were remarkably similar, indicating that the information based on these cues was coherent. For a broadband sound source, the high-frequency ILD information and the low-frequency ITD information would lead to a consistent determination of the sound source location. Pinna position was previously found to affect the HRTFs in cats (Young et al. 1996). It has also been shown that neural spatial receptive fields in the superior colliculus of the cat shifted in response to changes in pinna position (Middlebrooks and Knudsen 1987).
We thank Dr. Susanne Sterbing-D'Angelo and Mr. Jason Sijie Wang for their assistance. We also thank Drs. Anthony Brammer and Pavel Zahorik for their advice in the design of our sound source. This study was supported by NIH grant R01 DC002178.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Blauert J (1997) Spatial Hearing: The Psychophysics of Human Sound Localization, revised edition. Cambridge, MA: MIT Press
- Carney LH, Koch K, Abrams R, Idrobo F (2010) Acoustic source distance discrimination in rabbit. Assoc Res Otolaryngol Abstr 33:313
- Devore S, Delgutte B (2008) Effects of reverberation on neuronal sensitivity to fine time structure and envelope ITD in the inferior colliculus of awake rabbit. Assoc Res Otolaryngol Abstr 31:868
- Kuwada S, Bishop B, Kim D (2010b) Responses of inferior colliculus neurons in the unanesthetized rabbit to virtual auditory space stimuli. Assoc Res Otolaryngol Abstr 33:838
- Moller H, Sorensen MF, Hammershoi D, Jensen CB (1995) Head-related transfer functions of human subjects. J Audio Eng Soc 43:300–321
- Oppenheim A, Schafer R, Buck J (1999) Discrete-Time Signal Processing. Upper Saddle River, NJ: Prentice Hall