Subjects and test environment
The subjects were two bottlenose dolphins: SAY (female, 37 years) and APR (female, 32 years). Upper-cutoff frequencies for their hearing, defined as the frequency at which psychophysical thresholds reached a sound pressure level (SPL) of 100 dB re 1 µPa, were ~ 110 kHz for APR and ~ 140 kHz for SAY, indicating full hearing bandwidth for SAY and limited high-frequency loss for APR.
All tests were conducted within a 9 m × 9 m floating, netted enclosure at the US Navy Marine Mammal Program facility in San Diego Bay, California. During each trial, the dolphin positioned itself on an underwater “biteplate” supported at a depth of 80 cm by vertical posts spaced 1.8 m apart. The biteplate was oriented so the dolphin faced San Diego Bay through an enclosure gate opening containing a netted frame with a central observation aperture (Fig. 1). Beyond the gate opening, at a distance of 1.3 m, a piezoelectric transducer (TC4013, Reson Inc., Slangerup, Denmark) was positioned for use as the echo projector. The nearest land masses within ± 20° of the dolphin’s main biosonar transmission axis while on the biteplate were ~ 500 m away. Mean water depth was ~ 4–5 m. Background ambient noise at the test site was dominated by snapping shrimp and other dolphins, with occasional contributions from passing vessels and aircraft.
Task description
The biosonar task required the dolphin to produce echolocation clicks, listen to returning echoes created by convolving the received click with an impulse function, and produce a conditioned acoustic response (a whistle) when the echoes changed from non-jittering (Echo A) to jittering (Echo B). For non-jittering echoes, echo delay and echo polarity were fixed. For jittering echoes, echo delay and echo polarity could vary on alternate echoes (i.e., different values for even or odd numbered echoes within a trial, see Fig. 2). Three experiments were conducted: Experiments 1 and 2 featured jittered echo delay only and Exp. 3 featured jittered echo delay and echo polarity (Table 1). Non-jittering echo delay was fixed at 12.56 ms (~ 9.4 m simulated target range) for all experiments.
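The correspondence between the 12.56-ms echo delay and the ~ 9.4 m simulated target range follows from the two-way travel time. The short sketch below illustrates the conversion; the nominal sound speed of 1500 m/s is an assumption (the text does not state the value used), and the function names are hypothetical.

```python
# Assumed nominal sound speed in seawater; not stated in the text.
SOUND_SPEED_M_PER_S = 1500.0


def delay_to_range_m(echo_delay_s, sound_speed=SOUND_SPEED_M_PER_S):
    """Convert a two-way echo delay (s) to a one-way target range (m)."""
    return sound_speed * echo_delay_s / 2.0


def range_to_delay_s(range_m, sound_speed=SOUND_SPEED_M_PER_S):
    """Convert a one-way target range (m) to a two-way echo delay (s)."""
    return 2.0 * range_m / sound_speed


# Non-jittering echo delay used in all three experiments.
simulated_range_m = delay_to_range_m(12.56e-3)
print(f"{simulated_range_m:.2f} m")  # 9.42 m
```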
Table 1 Echo delay and polarity characteristics for the three experiments
Experimental sessions typically consisted of 60 trials and lasted ~ 30 min. Within each session, 80% of the trials were designated as echo-change trials, where the echoes changed from non-jittering to jittering after a random interval of 5–10 s, followed by a 1-s response window. On the remainder of the trials (control trials), non-jittering echoes were presented for the entire 6–11-s trial duration. If the dolphin responded during the 1-s response interval after an echo change (a hit), or withheld the response for an entire control trial (a correct rejection), it was rewarded with one fish. The dolphin was recalled to the surface with no fish reward for responding during a control trial (a false alarm) or for failing to respond during a response interval following an echo change (a miss). If the dolphin responded before the echoes changed during an echo-change trial, it was immediately recalled to the surface with no fish reward, and the trial was re-classified as a control trial and scored as a false alarm. If the dolphin did not echolocate during a trial, stopped echolocating before the echoes changed, left the biteplate, or was visually observed to be echolocating on another object, it was recalled and the trial data were discarded.
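The scoring rules above can be summarized as a small decision function. This is only an illustrative sketch of the stated rules for valid (non-discarded) trials; the function names, the representation of times in seconds from trial start, and the use of `None` for "no echo change" or "no response" are assumptions, not part of the original procedure.

```python
from typing import Optional, Tuple

RESPONSE_WINDOW_S = 1.0  # 1-s response window following an echo change


def score_trial(change_time_s: Optional[float],
                response_time_s: Optional[float]) -> Tuple[str, str]:
    """Return (trial type as scored, outcome) for one valid trial.

    change_time_s: time of the echo change, or None for a control trial.
    response_time_s: time of the whistle response, or None if withheld.
    """
    if change_time_s is None:
        outcome = "correct rejection" if response_time_s is None else "false alarm"
        return ("control", outcome)
    if response_time_s is None:
        return ("echo change", "miss")
    if response_time_s < change_time_s:
        # Early response: re-classified as a control trial, scored as a false alarm.
        return ("control", "false alarm")
    if response_time_s <= change_time_s + RESPONSE_WINDOW_S:
        return ("echo change", "hit")
    return ("echo change", "miss")


def is_rewarded(outcome: str) -> bool:
    """Hits and correct rejections earned one fish; other outcomes did not."""
    return outcome in ("hit", "correct rejection")
```

For example, `score_trial(7.2, 7.8)` is scored as a hit on an echo-change trial, whereas `score_trial(7.2, 4.0)` is counted as a false alarm on a control trial.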
During data collection sessions, jitter delay varied from 50 µs down to 2, 1, or 0 µs for Exps. 1, 2, and 3, respectively (initial training utilized much larger jitter delay values). Sessions were divided into 10-trial blocks with constant jitter delay within each block. Each session typically featured six jitter delay values, which were tested in descending order. Within each experiment, at least 40 echo-change trials were conducted for each value of jitter delay; this required 21–23 sessions for Exp. 1 and 9–10 sessions for Exps. 2 and 3 (Exps. 2 and 3 featured fewer values of jitter delay).
Echo generation
Biosonar echoes were created using a phantom echo generator (PEG, Fig. 3) based on a TMS320C6713 floating point digital signal processor (Texas Instruments, Dallas, TX) with an analog input/output (I/O) daughtercard (AED109, Signalware Corp., Colorado Springs, CO). The system operated in an “open-loop” fashion, where click signals that exceeded a threshold triggered the creation of echo waveforms that were then held in digital memory before transmission to the dolphin. This operating mode is in contrast to a digital delay line that would simply delay all received signals and transmit them back to the dolphin. Clicks emitted by the dolphin were recorded using a hydrophone (TC4013, Reson Inc., Slangerup, Denmark) embedded in a silicone suction cup and attached to the dolphin’s melon along the main biosonar transmission axis. This arrangement minimized potential echo timing errors associated with head movements. The contact hydrophone signal (Fig. 3) was amplified and filtered (5–200 kHz bandwidth: VP-1000, Reson Inc., Slangerup, Denmark and 3C module, Krohn-Hite Corporation, Brockton, MA), then digitized by the AED109 with a 1-MHz sampling rate and 12-bit resolution. The digitized hydrophone signal was passed to a threshold-crossing click detector. If a click was detected, the click waveform was scaled in amplitude, delayed by the appropriate time, then converted to analog (AED109). The outgoing analog waveform was filtered (5–200 kHz, 3C module, Krohn-Hite Corporation, Brockton, MA), attenuated if necessary (PA5, Tucker-Davis Technologies, Alachua, FL), amplified (M7600, Krohn-Hite Corp.), and used to drive the echo transmitter. The echo level relative to the emitted click, rather than the absolute echo level, was held constant; i.e., echo levels changed dynamically with changes in emitted click level. Energy flux density levels of Echoes A and B were approximately − 72 dB relative to the received click at the contact hydrophone (about 20 dB above echo-detection threshold).
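The open-loop principle (detect a click by threshold crossing, then scale and delay the captured waveform) can be sketched offline as follows. This is a simplified illustration, not the PEG firmware, which ran in real time on the DSP; the function name, the fixed `click_len` capture window, and the `threshold` and `gain_db` values are hypothetical.

```python
import numpy as np

FS_HZ = 1_000_000  # PEG A/D and D/A sampling rate (1 MHz)


def make_echo(samples, threshold, click_len, delay_samples, gain_db):
    """Return an output buffer holding a scaled, delayed copy of the first
    click that crosses the threshold, or None if no click is detected.

    threshold, click_len and gain_db are illustrative parameters only.
    """
    samples = np.asarray(samples, dtype=float)
    above = np.flatnonzero(np.abs(samples) >= threshold)
    if above.size == 0:
        return None  # open loop: nothing is transmitted without a detection
    start = int(above[0])
    click = samples[start:start + click_len]
    gain = 10.0 ** (gain_db / 20.0)  # amplitude scale factor from dB
    echo_start = start + delay_samples
    out = np.zeros(echo_start + click.size)
    out[echo_start:] = gain * click
    return out
```

Because the echo is a scaled copy of each detected click, its level tracks the emitted click level, which is consistent with the constant relative echo level described above.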
The dolphin clicks and electronic echo waveforms were digitized at 2 MHz and 16-bit resolution by a PXIe-6368 multifunction data acquisition device (National Instruments, Austin, TX) and stored for later analysis.
Ambient noise was monitored using a hydrophone (TC4032, Reson Inc.) located ~ 50 cm above and to the side of the biteplate. The signal from this “off-axis” hydrophone was high-pass filtered at 100 Hz before being digitized at 2 MHz and 16-bit resolution by the same PXIe-6368 used for click recording.
Changes in echo delay (i.e., jitter in echo delay) were accomplished by changing the position of the echo waveform in the 1-MHz digital-to-analog (D/A) converter output buffer on an echo-by-echo basis. This approach had the advantage of preventing delay-dependent changes in echo amplitude and spectral content, but echo-delay resolution was limited to a single sample interval (1 µs). Experiments 1 and 3 featured echo delays that symmetrically jittered about the non-jittering echo delay (Table 1); therefore, the PEG D/A sampling rate restricted jitter delays to integral multiples of 2 µs. For Exp. 2, the shorter of the jittering delay values matched the non-jittering delay; therefore, the minimum jittered delay resolution was 1 µs.
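The following sketch illustrates why symmetric jitter is restricted to multiples of 2 µs while one-sided jitter can be set in 1-µs steps. Here "jitter delay" is taken as the difference between the two alternating delays; the function name and the choice of which alternate echo receives the shorter delay are assumptions made for illustration.

```python
SAMPLE_INTERVAL_US = 1   # one D/A sample at 1 MHz
BASE_DELAY_US = 12_560   # non-jittering echo delay (12.56 ms)


def jittered_delays_us(jitter_us, n_echoes, symmetric=True):
    """Return the echo delay (in µs) for each successive echo.

    symmetric=True : delays alternate about the base delay (Exps. 1 and 3),
                     so each is offset by jitter/2 and jitter must be even.
    symmetric=False: the shorter delay equals the base delay (Exp. 2).
    """
    if symmetric:
        if jitter_us % (2 * SAMPLE_INTERVAL_US) != 0:
            raise ValueError("symmetric jitter must be a multiple of 2 µs")
        half = jitter_us // 2
        pair = (BASE_DELAY_US - half, BASE_DELAY_US + half)
    else:
        pair = (BASE_DELAY_US, BASE_DELAY_US + jitter_us)
    # Which alternate echo gets the shorter delay is arbitrary in this sketch.
    return [pair[i % 2] for i in range(n_echoes)]


print(jittered_delays_us(2, 4))                   # [12559, 12561, 12559, 12561]
print(jittered_delays_us(1, 3, symmetric=False))  # [12560, 12561, 12560]
```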
Operation of the PEG was verified before each session by replacing the dolphin click signal input to the PEG analog-to-digital (A/D) converter with an electronic signal resembling a dolphin click and inspecting the resulting electronic echo waveform. A high-speed digital oscilloscope was used to ensure the actual jitter delay values matched the desired values. Calibration measurements using the oscilloscope revealed potential errors (i.e., unavoidable, random jitter) in echo delay of less than ± 15 ns for the electronic echoes (i.e., before transmission into the water). Potential movement of the dolphin relative to the echo projector (not more than approximately 3 cm within a trial) could have caused larger changes in echo delay. However, as movement would have occurred on a relatively slow time scale compared to the changes arising from jittering echo delay from one echo to the next, the effect was considered to be negligible.
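As a rough order-of-magnitude check on the movement effect, the sketch below converts a displacement into a change in acoustic travel time. It assumes a nominal sound speed of 1500 m/s and that only the one-way path from the echo projector to the dolphin is affected (since clicks were picked up by the contact hydrophone on the melon); both are assumptions rather than statements from the text.

```python
# Assumed nominal sound speed in seawater; not stated in the text.
SOUND_SPEED_M_PER_S = 1500.0


def delay_change_s(displacement_m, sound_speed=SOUND_SPEED_M_PER_S):
    """Change in one-way travel time (s) for a change in path length (m)."""
    return displacement_m / sound_speed


max_shift_s = delay_change_s(0.03)  # 3 cm of movement
electronic_error_s = 15e-9          # upper bound on electronic jitter error

print(f"{max_shift_s * 1e6:.0f} us")  # 20 us
```

Under these assumptions a 3-cm shift corresponds to about 20 µs, roughly three orders of magnitude above the ± 15 ns electronic error, but it would accumulate slowly over a trial rather than alternating from echo to echo.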
Analysis
Each dolphin’s performance in the echolocation task was quantified using the hit rate
$$H = N_H / N_{\text{EC}}, \tag{1}$$
the false alarm rate
$$F = N_{\text{FA}} / N_{\text{NC}}, \tag{2}$$
and the error rate (i.e., the number of incorrect responses divided by the total number of trials)
$$E = (1 - H)\frac{N_{\text{EC}}}{N} + F\frac{N_{\text{NC}}}{N}, \tag{3}$$
where H is the hit rate, F is the false alarm rate, E is the error rate, \(N_H\) is the number of hits, \(N_{\text{FA}}\) is the number of false alarms, \(N_{\text{EC}}\) is the number of echo-change trials, \(N_{\text{NC}}\) is the number of control trials, and \(N = N_{\text{EC}} + N_{\text{NC}}\) is the total number of trials. The error rate is used here to facilitate comparison with jittered-echo-delay results from bats, which are presented as error rates from the two-alternative forced-choice paradigm (e.g., Simmons 1979).
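A minimal sketch of Eqs. (1)–(3) is given below. The function name is hypothetical, and the trial counts in the example are invented purely to illustrate the arithmetic; they are not results from the study.

```python
def performance(n_hits, n_false_alarms, n_echo_change, n_control):
    """Return (hit rate H, false alarm rate F, error rate E)."""
    n_total = n_echo_change + n_control
    hit_rate = n_hits / n_echo_change
    false_alarm_rate = n_false_alarms / n_control
    # Misses and false alarms, each weighted by the share of its trial type.
    error_rate = ((1.0 - hit_rate) * n_echo_change / n_total
                  + false_alarm_rate * n_control / n_total)
    return hit_rate, false_alarm_rate, error_rate


# Hypothetical counts: 40 echo-change trials with 36 hits,
# 10 control trials with 1 false alarm.
H, F, E = performance(n_hits=36, n_false_alarms=1, n_echo_change=40, n_control=10)
```

With these counts, E reduces to (4 misses + 1 false alarm) / 50 trials, i.e., the number of incorrect responses divided by the total number of trials.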
Ambient noise during experimental sessions was quantified by computing the pressure spectral density from the off-axis hydrophone signal over 4096-sample (~ 2 ms) time intervals just before the generation of each echo. Biosonar click emissions were quantified by extracting emitted clicks from the digitized contact hydrophone signal, then computing the 10th, 50th, and 90th amplitude percentiles at each time value. Echo calibration was done with representative echoes, obtained by replacing the contact hydrophone input to the PEG with a waveform representing the median of the recorded clicks, then measuring the acoustic echo waveform projected back to the listening position (midpoint between the dolphin’s lower jaws) without the dolphin present. This allowed the use of coherent averaging to extract echoes from the background ambient noise. Echo waveforms were temporally aligned during averaging using the time delay of the peak of the cross-correlation function between each echo and the mean of the preceding (previously aligned) echoes (Woody 1967). Representative examples of farfield dolphin clicks were also obtained from recordings made during preliminary training/testing with a hydrophone located at a distance of 1 m on the biosonar main transmit axis. For representative clicks and echoes, the peak–peak sound pressure level (SPL), energy flux density level, energy spectrum, and the ACR function were calculated. For comparison to the click/echo ACR functions, the XCR function between the farfield click and echo was also computed. Envelopes for ACR and XCR functions were derived using the magnitude of the analytic function whose real part is the function itself and imaginary part is the Hilbert transform of the function (Au 1993).
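The alignment-and-averaging step and the envelope computation can be sketched with NumPy as follows. This is an illustrative reading of the procedure rather than the authors' analysis code: the function names are hypothetical, integer-sample alignment is assumed, and `np.roll` shifts circularly, which presumes each echo sits well inside its analysis window.

```python
import numpy as np


def woody_average(echoes):
    """Coherently average echoes (one per row) after aligning each one to the
    running mean of the previously aligned echoes via cross-correlation."""
    echoes = np.asarray(echoes, dtype=float)
    n = echoes.shape[1]
    aligned = [echoes[0]]
    for echo in echoes[1:]:
        template = np.mean(aligned, axis=0)
        xcorr = np.correlate(echo, template, mode="full")
        # Index n - 1 is zero lag; a positive lag means the echo is late.
        lag = int(np.argmax(xcorr)) - (n - 1)
        aligned.append(np.roll(echo, -lag))  # circular shift (simplification)
    return np.mean(aligned, axis=0)


def envelope(x):
    """Magnitude of the analytic signal (x plus j times its Hilbert transform),
    built by zeroing the negative-frequency half of the spectrum."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))
```

The same `envelope` function could be applied to an autocorrelation or cross-correlation sequence (for example, the output of `np.correlate(click, echo, mode="full")`) to obtain the kind of envelope described above.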
For optimal receivers, maximum accuracy in echo delay estimation (minimum standard deviation of the delay estimate, \(\sigma\)) can be expressed as \(\sigma = (2\pi B d)^{-1}\), where B is the echo bandwidth, \(d = \sqrt{2E/N_0}\) is the echo amplitude signal-to-noise ratio (SNR), E is the echo energy flux density, and \(N_0\) is the noise pressure spectral density (Menne and Hackbarth 1986; Schnitzler et al. 1985; Simmons et al. 2004). The definition of bandwidth varies depending on whether the receiver is coherent or semicoherent. For a coherent receiver, echo delay is estimated using the maximum peak in the fine structure of the XCR function and bandwidth is given by the rms bandwidth, \(B_{\text{rms}}\):
$$B_{\text{rms}} = \left( \frac{\int_{-\infty}^{\infty} f^2 \left| S(f) \right|^2 \,\text{d}f}{\int_{-\infty}^{\infty} \left| S(f) \right|^2 \,\text{d}f} \right)^{1/2}, \tag{4}$$
where f is frequency and S(f) is the Fourier transform of the signal waveform (Au 1993; Menne and Hackbarth 1986; Simmons et al. 2004). For a semicoherent receiver, echo delay is estimated using the peak of the envelope of the XCR function and bandwidth is defined by the centralized rms bandwidth, \(B_{\text{crms}}\):
$$B_{\text{crms}} = \left( \frac{\int_{-\infty}^{\infty} \left( f - f_{\text{c}} \right)^2 \left| S(f) \right|^2 \,\text{d}f}{\int_{-\infty}^{\infty} \left| S(f) \right|^2 \,\text{d}f} \right)^{1/2} = \sqrt{B_{\text{rms}}^2 - f_{\text{c}}^2}, \tag{5}$$
where the center (centroid) frequency \(f_{\text{c}}\) is defined as (Au 1993; Menne and Hackbarth 1986; Simmons et al. 2004):
$$f_{\text{c}} = \frac{\int_{-\infty}^{\infty} f \left| S(f) \right|^2 \,\text{d}f}{\int_{-\infty}^{\infty} \left| S(f) \right|^2 \,\text{d}f}. \tag{6}$$
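The bandwidth definitions in Eqs. (4)–(6) and the delay-accuracy expression σ = 1/(2πBd) can be evaluated numerically as sketched below. Several choices here are assumptions: the integrals are approximated by sums over the one-sided (positive-frequency) FFT spectrum, which is the usual practical convention for real waveforms; the test pulse is synthetic; and the E/N0 value is hypothetical and not taken from the study.

```python
import numpy as np


def rms_bandwidths(waveform, fs_hz):
    """Return (f_c, B_rms, B_crms) in Hz from the one-sided energy spectrum."""
    spectrum = np.fft.rfft(waveform)
    freqs = np.fft.rfftfreq(len(waveform), d=1.0 / fs_hz)
    energy = np.abs(spectrum) ** 2
    total = energy.sum()
    f_c = (freqs * energy).sum() / total                             # Eq. (6)
    b_rms = np.sqrt((freqs ** 2 * energy).sum() / total)             # Eq. (4)
    b_crms = np.sqrt(((freqs - f_c) ** 2 * energy).sum() / total)    # Eq. (5)
    return f_c, b_rms, b_crms


def delay_accuracy_s(bandwidth_hz, energy_to_noise_ratio):
    """sigma = 1 / (2*pi*B*d), with d = sqrt(2*E/N0)."""
    d = np.sqrt(2.0 * energy_to_noise_ratio)
    return 1.0 / (2.0 * np.pi * bandwidth_hz * d)


# Synthetic Gaussian-windowed 100-kHz pulse sampled at 2 MHz (illustrative only).
fs_hz = 2_000_000
t = (np.arange(2048) - 1024) / fs_hz
pulse = np.exp(-t ** 2 / (2 * (20e-6) ** 2)) * np.cos(2 * np.pi * 100e3 * t)

f_c, b_rms, b_crms = rms_bandwidths(pulse, fs_hz)
e_over_n0 = 50.0  # hypothetical E/N0, giving d = 10
sigma_coherent = delay_accuracy_s(b_rms, e_over_n0)
sigma_semicoherent = delay_accuracy_s(b_crms, e_over_n0)
```

Because the rms bandwidth is never smaller than the centralized rms bandwidth, the coherent-receiver bound on σ is never larger than the semicoherent one for the same SNR.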