12.1 Introduction

Echolocation, a term coined by Griffin (1944, 1958), is an active sensory system. Echolocating animals emit sound signals and perceive their surroundings by way of the returned echoes. Using this approach, echolocators can determine the direction and distance to an object, the type of object, and whether it is moving or stationary. Echolocation (also known as biosonar) is used by most bats, odontocetes (toothed whales), oilbirds, and some swiftlets to negotiate, respectively, night skies, deep waters, or dark caves. In addition, soft-furred tree mice use echolocation in darkness for orientation (He et al. 2021). These are all habitats characterized by limited visibility, likely a key evolutionary driver for echolocation. Echo feedback may also provide functional sensory abilities in shrews and tenrecs.

The discovery of echolocation traces back to Lazzaro Spallanzani’s suggestion in 1794 that bats could “see” with their ears. Griffin (1944, 1958) verified this idea much later when he demonstrated that bats produce ultrasonic sounds to collect information about their surroundings and concluded that “echolocation is an eye-opening discovery about animal behavior.”

Demonstrating echolocation behavior means showing that the animal uses echoes of its own outgoing sounds to locate and identify objects in its path. Several robust protocols exist for assessing echolocation ability and capacity in terrestrial and marine animals (Griffin 1958; Norris et al. 1961). Echolocation and ultrasound are not inherently linked. Many animals echolocate with signals fully or partly composed of frequencies readily audible to humans, such as the clicks of some odontocetes, certain bat species, and birds. Conversely, many non-echolocating animals use ultrasonic sounds for intraspecific communication.

A primary advantage of echolocation is that it allows animals to operate and orient in uncertain lighting conditions. At the same time, information leakage is a primary disadvantage of echolocation. The signals used in echolocation are audible to many other animals, such as competing conspecifics, predators, and prey. The evolutionary arms race between echolocating bats and several families of insects sensitive to ultrasound is a classic example of predator–prey co-evolution (Miller 1983; Miller and Surlykke 2001). Some fishes (Alosinae) hear high-frequency sounds (Mann et al. 1997; Wilson et al. 2008), which could suggest similarly co-evolving sensory abilities between odontocetes and their fish prey (Wilson et al. 2013).

In this chapter, we review basic concepts about echolocation, the variety of animals known to echolocate, the main types of echolocation signals they use, and how they produce and receive those signals. The topic of perception by echolocating animals is beyond the scope of this chapter.

12.2 Characteristics of Echolocation Signals

Echolocating animals use two broad classes of sounds. Toothed whales, rousette bats, and birds generate broadband clicks produced at varying rates. The vast majority of bats, however, use tonal echolocation signals, characterized by longer duration and either a constant frequency or, more commonly, frequency modulation (FM; i.e., sweeping across several frequencies over time). With the exception of certain bat species, echolocating animals time their outgoing pulses so the echo from a previous pulse does not overlap with the next outgoing signal, especially during general orientation and searching for prey. This separation ensures that the strong outgoing signal does not mask the fainter returning echoes from the previous signal (Jen and Suga 1976; Kalko and Schnitzler 1989; Verfuss et al. 2009). Bats and odontocetes both show characteristic changes in echolocation behavior as they approach objects. Notably, most species in both groups adjust the sound emission rate to the distance of the target. The click rate increases as they approach objects and numerous species emit a terminal buzz (i.e., a series of pulses or clicks in rapid succession) during prey capture (Fig. 12.1). In bats, these temporal changes are accompanied by a change from narrow to wider bandwidths and lower to higher frequencies as they move from an open to a cluttered aerial environment or detect an airborne insect prey. Such pronounced, systematic changes have not been documented in oilbirds or swiftlets.

Fig. 12.1 Echolocation sequence from a harbor porpoise (Phocoena phocoena) and a Daubenton’s bat (Myotis daubentonii) as they approach and capture prey. Both species increase the rate of sound emission as they approach prey and emit a terminal buzz immediately before prey capture

Echolocation signals are often much higher in amplitude than other sounds produced by animals. Amplitudes of bat echolocation signals are typically given at a reference distance of 0.1 m in front of the mouth or nostril. For whales and birds, source levels are referenced to a distance of 1 m in front of the animal. Source levels of bats are variable, but generally higher in aerial-feeding bats that fly and search for prey in the open sky (typically 100–130 dB re 20 μPa at 0.1 m). Bats that fly and forage in vegetation use lower-amplitude signals. Among these, the so-called “whispering bats” (e.g., slit-faced bats (Nycteridae), false vampire bats (Megadermatidae), and many New World leaf-nosed bats (Phyllostomidae)), emit echolocation sounds at about 65–70 dB re 20 μPa at 0.1 m (Jakobsen et al. 2013a). The source level of a dolphin’s echolocation signal is several orders of magnitude greater than that of a bat’s signal, primarily owing to the different properties of the two media (see next section) (Madsen and Surlykke 2014). Echolocation clicks of bottlenose dolphins (Tursiops truncatus) can reach source levels of 225 dB re 1 μPa at 1 m peak-to-peak (Au 1993, p. 78). Source levels of oilbirds (Steatornis caripensis) are around 100 dB re 20 μPa root-mean-square (rms) at 1 m (Brinkløv et al. 2017), corresponding to roughly 120 dB re 20 μPa at 0.1 m, which is comparable to estimates from many bat species. Little has been documented about the source levels of swiftlets, tenrecs, and shrews.

Bats and toothed whales both emit the acoustic signal energy in a focused beam, with specific vertical and horizontal transmission patterns, akin to an “acoustic flashlight” focused on a certain search area. The open mouth of a bat, or the nose in nasal-emitting bats, shapes the transmitted beam (Hartley and Suthers 1987, 1989), which is much broader than that of dolphins (Madsen and Surlykke 2014). The dolphin’s melon transmits the outgoing echolocation signals with a slightly elevated vertical beam above the rostrum (Au 1993). There is no information on signal directionality from oilbirds or swiftlets.

12.3 Differences in Echolocation Signals in Air and Water

Only a few of the 71 known species of toothed whales are proven to use echolocation, but by inference probably all of them do (Culik 2011), as do presumably more than 1000 species of bats. For echolocators, there are three important differences between sound in air and sound in water: (1) density of the medium, (2) reflectivity of targets, and (3) maneuverability of the target (Madsen and Surlykke 2014). These differences severely influence the way echolocation has evolved in the two media (Au and Simmons 2007).

First, water is about 770 times denser than air: 1000 and 1.3 kg/m³, respectively, partly explaining why sound travels about 4.4 times faster in water than in air (1520 m/s versus 344 m/s). For the same frequency of sound, the wavelength in water is about 4.4 times longer than in air. Longer wavelengths limit detection to larger targets because reflection depends on the relationship between the wavelength of the impinging sound and the size of the reflecting object (Urick 1983; also see Chap. 5, section on reflection). Sound at a given frequency reflects more effectively from smaller objects in air than in water. For example, the wavelength of a 100-kHz signal is 3.4 mm in air and 15 mm in water. Thus, a sphere with a circumference greater than 3.4 mm strongly reflects the 100-kHz sound in air, while in water, the circumference of the sphere must exceed 15 mm.
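The wavelength figures above follow directly from λ = c/f. The following is a minimal sketch of that arithmetic, using the sound speeds quoted in the text; the function and constant names are illustrative and not part of the chapter.

```python
# Wavelength of a sound: lambda = c / f.
C_AIR_M_S = 344.0     # speed of sound in air, as quoted in the text
C_WATER_M_S = 1520.0  # speed of sound in water, as quoted in the text


def wavelength_mm(frequency_hz: float, sound_speed_m_s: float) -> float:
    """Return the wavelength in millimetres for a frequency in a given medium."""
    return sound_speed_m_s / frequency_hz * 1000.0


# Compare the two media at the 100-kHz example frequency used in the text.
for medium, speed in (("air", C_AIR_M_S), ("water", C_WATER_M_S)):
    print(f"{medium}: {wavelength_mm(100e3, speed):.1f} mm at 100 kHz")
```

The ratio of the two results is the same factor of about 4.4 that separates the two sound speeds.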

The absorption coefficient (see Chaps. 5 and 6 on sound propagation) of the medium is a function of several factors, but frequency is the most important for echolocators. In seawater, the absorption coefficient for sound at 100 kHz is about 0.038 dB/m, while in air at the same frequency, it is much larger: 3.3 dB/m. In addition, sound pressure is lost through geometric spreading in both air and water. For spherical spreading, each time the distance is doubled, the sound pressure of the emitted signal is halved (i.e., the sound pressure level is reduced by 6 dB). Taken together, sound absorption and geometric spreading mean that an echolocating dolphin can detect an object at much longer distances than can an echolocating bat (Madsen and Surlykke 2014).
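To illustrate how spreading and absorption combine, the sketch below adds a spherical-spreading term to a linear absorption term for the one-way path. It uses the 100-kHz absorption coefficients quoted above; the 1-m reference distance, the example ranges, and the function name are assumptions chosen only for illustration.

```python
import math

ALPHA_AIR_DB_PER_M = 3.3         # absorption in air at 100 kHz, from the text
ALPHA_SEAWATER_DB_PER_M = 0.038  # absorption in seawater at 100 kHz, from the text


def one_way_loss_db(distance_m: float, alpha_db_per_m: float,
                    reference_m: float = 1.0) -> float:
    """One-way loss: spherical spreading plus frequency-dependent absorption."""
    spreading = 20.0 * math.log10(distance_m / reference_m)
    absorption = alpha_db_per_m * (distance_m - reference_m)
    return spreading + absorption


# Illustrative ranges (assumed): the spreading term is identical in both media,
# so the difference between the columns is entirely due to absorption.
for distance in (2.0, 10.0, 100.0):
    air = one_way_loss_db(distance, ALPHA_AIR_DB_PER_M)
    sea = one_way_loss_db(distance, ALPHA_SEAWATER_DB_PER_M)
    print(f"{distance:6.1f} m  air: {air:6.1f} dB  seawater: {sea:5.1f} dB")
```

With absorption set to zero, doubling the distance gives the 6-dB loss described above; at longer ranges the absorption term dominates in air but remains small in seawater.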

Investigators often want to get a relative notion of the difference in amplitude of bat and dolphin echolocation signals. However, such a comparison should be done cautiously because of the different physical properties of air and water and the two different reference pressures. To compare a sound intensity level measured in dB in water to a reading in air, subtract 36 dB to compensate for the differences in acoustic impedance (i.e., density × sound speed; see Chap. 4, introduction to acoustics) between the two media. For the same source intensity, sound pressure in water is 60 times greater than in air (i.e., ~36 dB).

$$ {I}_{\mathrm{water}}/{I}_{\mathrm{air}}={\left({p}^2/\rho\ c\right)}_{\mathrm{water}}/{\left({p}^2/\rho\ c\right)}_{\mathrm{air}}=1/3570 $$
$$ 10\ {\log}_{10}\left(1/3570\right)=-36\ \mathrm{dB} $$

where p is sound pressure, I is intensity, ρ is density, c is the speed of sound, and ρc is acoustic impedance. Then, subtract 26 dB (20 log10 (20/1) = 26 dB) to correct for the different reference pressures used for the decibel scales of sound in air and in water; i.e., 1 μPa in water and 20 μPa in air (Fig. 12.2). For example, if the sound pressure level of a dolphin click were 220 dB re 1 μPa (Au 1993), then a source with the same power would produce a click of 158 dB re 20 μPa in air (220 − 36 − 26 = 158 dB re 20 μPa), which is a very high sound pressure in air and well above the maximum sound pressure levels achieved by bats.
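The two-step correction can be written out as a short calculation. This is a minimal sketch using the impedance ratio (3570) and reference pressures given in the text; the function name and its default arguments are illustrative. Because the terms are not rounded to 36 and 26 dB before subtracting, the result differs slightly from the rounded figure in the text.

```python
import math


def water_level_to_air_db(level_db_re_1upa: float,
                          impedance_ratio: float = 3570.0,
                          p_ref_air_upa: float = 20.0,
                          p_ref_water_upa: float = 1.0) -> float:
    """Convert a level in water (dB re 1 uPa) to the level in air (dB re 20 uPa)
    that a source of the same power would produce."""
    impedance_term_db = 10.0 * math.log10(impedance_ratio)                   # about 36 dB
    reference_term_db = 20.0 * math.log10(p_ref_air_upa / p_ref_water_upa)  # about 26 dB
    return level_db_re_1upa - impedance_term_db - reference_term_db


# Pressure ratio for the same source intensity: square root of the impedance ratio.
print(f"pressure ratio water/air: {math.sqrt(3570.0):.1f}")
print(f"220 dB re 1 uPa in water is about "
      f"{water_level_to_air_db(220.0):.1f} dB re 20 uPa in air")
```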

Fig. 12.2 For sound sources of the same power or intensity, the sound pressure levels in air and water differ by 62 dB

In air, there is a considerable difference in acoustic impedance between the medium and bat food, such as flying insects. There is, however, little impedance difference between seawater and toothed whale prey, such as fish or squid (Madsen et al. 2007). Accordingly, most sound from an echolocating toothed whale goes right through a fish or squid, producing low echo levels and making it difficult for the animal to detect its prey. In contrast, the air-filled swim bladders of some fish and hard features, such as the pen and beak of squid, reflect sound well, resulting in strong echoes.

In spite of substantial differences in the impedance and reflectivity of prey in air and in water, echo levels from airborne and aquatic prey are about the same. The target strength (TS) is the difference between the echo level (EL) measured 1 m from the target and the incident sound (IS) at the target: TS = EL − IS, where EL and IS are measured in dB re 20 μPa in air and 1 μPa in water, and TS is in dB as the reference levels cancel out. Maximum target strength depends on the frequency of the echolocation signal and the reflectivity, size, and orientation of the prey with respect to incident sound. For cod, haddock, and saithe (400 to 500 mm long) the TS (at 30 kHz) is −32 to −40 dB. For a moth (Arctia caja) with a 25–35 mm wingspan, TS (at 20–50 kHz) is −42 dB; for the stonefly (Plecoptera sp.) with a wingspan of ~15 mm, TS (at 10–37 kHz) is −47 dB (Miller 1983; Rydell et al. 1999). Despite more than an order of magnitude of difference in size, the target strengths of fish and insect prey are similar because of a combination of the differences in acoustic impedance of the medium and reflectivity of the prey.

Viscosity differences between air and water make toothed whales much less agile than bats. Toothed whales swim at about 2 m/s when capturing prey while bats fly at 2–10 m/s. After detection, a bat arrives at its prey much sooner than the toothed whale. A bat catching prey moves quickly because it is hardly hindered by friction from air. Bats typically take about a second to capture prey, while porpoises and dolphins need several seconds because the higher viscosity of water hinders their mobility. These differences occur despite similar ratios between body length of predator and prey; a 3-m long dolphin is 6–15 times larger than its fish prey (20 to 50 cm long) and a 3–8 cm long bat is 5–10 times bigger than its insect prey. Bats often use their wing and tail membranes and even their feet to catch and manipulate insects. Toothed whales are streamlined with only pectoral and dorsal fins and flukes as appendages; they must catch and manipulate prey with their teeth and mouths (Miller 2010).

Despite very different selective pressures placed on bats and toothed whales, most of which are founded in the density and viscosity differences between air and water, they operate their biosonar in very similar ways. This similarity of the biosonar systems of bats and toothed whales (Fig. 12.1) is a wonderful example of convergent evolution (Madsen and Surlykke 2014; Wilson et al. 2013).

12.4 Echolocation in Bats

Bats are the second-most species-rich order of mammals, currently comprising almost 1400 species (Burgin et al. 2018), and they play several trophic roles. Echolocating bats eat a diverse range of food including animals (insects, vertebrates), plant materials (leaves, fruit, nectar, and pollen), and even blood. The non-echolocating pteropodid bats all eat mainly plant materials. Traditionally, bats were divided into two suborders, the echolocating Microchiroptera and the non-echolocating Megachiroptera, but recent phylogenetic studies do not support this division. Bats are now divided into Yinpterochiroptera and Yangochiroptera (Teeling 2009; Teeling et al. 2005). The non-echolocating pteropodid bats are found in the Yinpterochiroptera. This new division is intriguing because it allows two alternative scenarios for the evolution of bat echolocation: a single origin followed by loss of echolocation in the pteropodids, or two independent origins. The current consensus favors a single origin of echolocation and subsequent loss in the pteropodids (Thiagavel et al. 2018; Wang et al. 2017).

12.4.1 Sound Production and Signal Characteristics

With the exception of the tongue-clicking Rousettus bats (10 species belonging to the pteropodid family), all ~1200 species of echolocating bats produce their echolocation signals in the larynx (Suthers and Hector 1988). The larynges and associated structures in bats are specialized to varying degrees from the basic mammalian pattern. Notably, the entire structure ossifies much earlier during development than in most mammals, and for many species the vocal tract and nasal passages are modified to filter frequencies used for echolocation (Au and Suthers 2014). Most echolocating bats emit sound through the open mouth, but bats in several families emit sound through the nostrils (Pedersen 1993). Bats emitting sound through the mouth generally have plain faces, while the bats emitting sound through the nose typically have elaborate structures surrounding the nostrils, such as a nose-leaf that aids in sound radiation (Fig. 12.3).

Fig. 12.3 Variation in bat facial morphology. (a) Nyctalus noctula, (b) Murina cyclotis, (c) Plecotus auritus, (d) Mimon crenulatum, (e) Rhinolophus rouxii, (f) Hipposideros lankadiva. Bats a and b are mouth-emitting echolocators while c–f are nose emitters. Note that c does not have the associated nasal structures common in nose emitters. Photos by S. Brinkløv

The vast majority of echolocating bats are insectivorous. Most insectivorous bats hunt flying insects and typically vary the structure of their echolocation calls as they progress from searching to approaching and capturing prey. Traditionally, prey capture is divided into three phases (Fig. 12.4): a search, an approach, and a terminal phase (Griffin 1958; Griffin et al. 1960). In the search phase, bats emit long-duration, lower-frequency, narrowband signals (search calls) at a low repetition rate. After an object of interest is detected, the bats gradually reduce the duration and intensity of the signals while increasing the rate and the bandwidth as they approach the object (approach calls). In the terminal phase, immediately before prey capture, the repetition rates may exceed 150 calls per second (the terminal buzz). Several reasons underlie these progressive changes in call emission. The search calls facilitate a long detection range because lower frequencies are attenuated much less than higher frequencies (Lawrence and Simmons 1982b) and the long duration and narrow bandwidth focus the energy of the call in a narrow range of the sensory system. These calls are, however, not ideal for accurate localization and object classification. Short-duration, broadband, high-frequency calls are much better suited for these tasks (Simmons et al. 1975). The switch from long-duration, narrowband, low-frequency calls in the search phase to short-duration, broadband, higher-frequency calls in the approach phase is a clear indication of object detection and it has been used to estimate detection distance in echolocating bats. However, it is important to note that this is a minimum measure, as the bat may well have detected the object before adjusting its call parameters (Kalko and Schnitzler 1989, 1993).

Fig. 12.4 Echolocation call sequence emitted by a foraging soprano pipistrelle (Pipistrellus pygmaeus), illustrating the progressive change in call characteristics and emission rate as the bat searches for, approaches, and captures insect prey

Most echolocating bats, like toothed whales, emit an echolocation call and wait for echoes from objects of interest before emitting the next call (Madsen and Surlykke 2014). While this avoids perceptual errors associated with potentially assigning echoes to the wrong calls, it also means that the distance between the bat and objects of interest limits the call emission rate. As the bats approach an object, echoes return with progressively shorter delays and the bat can emit the calls at a higher rate, exceeding 200 calls/s during the terminal buzz (Simmons et al. 1979; Fig. 12.4). While this is an impressively high call rate, the echoes are still received well before the next call is emitted. At the short distances between the bat and the prey when the buzz is emitted, the bat could theoretically increase the call rate to 1000 calls/s and still avoid call-echo ambiguity. Instead, the call rate is limited by the maximum speed of the superfast muscles that control each call emission (Elemans et al. 2011). Concurrent with the increase in call rate, the call duration decreases as distance to the object decreases. This shortening likely prevents overlap between the emitted call and the returning echo, since the much louder call emission will mask the quieter returning echo if the two overlap (Kalko and Schnitzler 1989, 1993). Hence, echoes from objects of interest are received in a clearly defined window between the end of call emission and the beginning of the next call. For example, a bat emitting calls of 8 ms duration at a call rate of 10 calls/s can resolve echoes from objects between 1.4 and 17 m distance without masking the returning echo during call emission and without the risk of call-echo ambiguity (Fig. 12.5).
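The limits of the overlap-free window follow from the two-way travel time of sound. The sketch below reproduces the example above (8-ms calls at 10 calls/s), assuming a sound speed of 344 m/s in air as used elsewhere in this chapter; the function name is illustrative.

```python
def overlap_free_window_m(call_duration_s: float, call_rate_hz: float,
                          sound_speed_m_s: float = 344.0) -> tuple[float, float]:
    """Return (nearest, farthest) target distance in metres for which the echo
    arrives after the call has ended and before the next call starts."""
    inter_pulse_interval_s = 1.0 / call_rate_hz
    # The factor of 2 accounts for the sound travelling to the target and back.
    nearest_m = sound_speed_m_s * call_duration_s / 2.0
    farthest_m = sound_speed_m_s * inter_pulse_interval_s / 2.0
    return nearest_m, farthest_m


nearest, farthest = overlap_free_window_m(call_duration_s=0.008, call_rate_hz=10.0)
print(f"overlap-free window: {nearest:.1f} m to {farthest:.1f} m")
```

Shortening the call moves the near limit closer to the bat, and raising the call rate pulls the far limit in, which matches the adjustments bats make as they close in on a target.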

Fig. 12.5 Schematic illustration of why most echolocating bats adjust call duration and call emission rate relative to target distance. Echoes received during call emission are masked by the louder call and echoes received after emission of the next call may create ranging ambiguity if assigned to the incorrect call. IPI: inter-pulse interval

While call rate and call duration define an overlap-free window, it is the energy and frequency of the emitted call together with the bat’s hearing threshold and the nature of the echo-generating object that determine the range of the echolocation system. Echoes have to return with enough energy to be detected by the bat. Emitting more energy, either by increasing the intensity or duration of the call, increases the detection distance. Emitting lower frequencies also increases the detection distance because acoustic attenuation is less for lower frequencies. On the reflection side, small objects return quieter echoes and are therefore detectable only at shorter ranges than large objects (Fig. 12.6). The structure and texture of the object also affect the level of the returning echo. Hard objects reflect more sound than soft objects and the same is true for plane or convex surfaces compared to concave surfaces (Urick 1983; also see Chap. 5, section on reflection). Additionally, the relationship between the wavelength of the sound impinging on the object and the size of the object affects how efficiently the sound is reflected. If the wavelength becomes too long (i.e., the frequency too low) relative to the size of the object, very little sound is reflected (Fig. 12.6). This means that prey size imposes a lower frequency limit on bat echolocation (Houston et al. 2004; Pye 1993).

Fig. 12.6 Target strength of three types of insect as a function of echolocation frequency illustrating how reflection depends on the relationship between object size and frequency. Smaller insects have lower target strength and require higher frequencies for efficient reflection. Indicated sizes are wing length. Based on data from Houston et al. (2004)

Bats are limited both physically and physiologically in how high a sound pressure they can produce. Presumably, the main reason why they emit long-duration calls in the search phase is to increase the energy of the call. Emitting sound directionally also increases the source level, that is, the sound level measured directly in front of the animal. All bats studied to date emit directional echolocation calls. Most bats increase their source level by 10 dB or more purely by focusing the sound as opposed to radiating sound equally in all directions (Jakobsen et al. 2013a). The highest source levels measured from bats are around 140 dB re 20 μPa rms at 0.1 m for the greater bulldog bat (Noctilio leporinus), but most reports from open-space aerial hawking bats are around 130 dB re 20 μPa rms at 0.1 m (Holderied et al. 2005; Hulgard et al. 2016; Surlykke and Kalko 2008). Combining knowledge of source level, signal frequency, hearing threshold, and the echo-generating object, the detection distance is relatively easy to estimate using a variation of the sonar equation (Urick 1983) (also see Chap. 6, section on the sonar equation):

$$ RL= SL-2\times PL+ TS $$
$$ PL=20\times {\log}_{10}\ \left(\mathrm{distance}/0.1\ \mathrm{m}\right)+\alpha \times \left(\mathrm{distance}-0.1\ \mathrm{m}\right) $$

Here, RL is the received level, SL is the source level emitted by the bat, PL is the propagation (formerly, transmission) loss, α is the frequency-dependent attenuation in air, and TS is the target strength, a measure of how much sound is reflected from the object at 0.1 m relative to the sound impinging on the target. For an object to be detected by the bat, RL simply has to be above the bat’s hearing threshold. The maximum distance that satisfies this requirement is the maximum detection distance. Estimated detection distances vary greatly between species, but it is clear that bat echolocation is a short-range system; the furthest estimates for large insect prey are around 10 m with most estimates below 5 m (Kalko and Schnitzler 1989, 1993; Nørum et al. 2012; Surlykke and Kalko 2008; Stilz and Schnitzler 2012).
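As a worked illustration of the sonar equation above, the sketch below searches for the largest distance at which RL still reaches the hearing threshold. The source level of 130 dB re 20 μPa at 0.1 m comes from the text; the target strength (−20 dB), attenuation (1 dB/m), and hearing threshold (20 dB re 20 μPa) are hypothetical values chosen only to show the procedure, and all function names are illustrative.

```python
import math

REFERENCE_M = 0.1  # reference distance for bat source levels, as in the text


def propagation_loss_db(distance_m: float, alpha_db_per_m: float) -> float:
    """One-way loss PL: spherical spreading plus atmospheric attenuation."""
    return (20.0 * math.log10(distance_m / REFERENCE_M)
            + alpha_db_per_m * (distance_m - REFERENCE_M))


def received_level_db(distance_m: float, source_level_db: float,
                      target_strength_db: float, alpha_db_per_m: float) -> float:
    """Sonar equation: RL = SL - 2 * PL + TS."""
    return (source_level_db
            - 2.0 * propagation_loss_db(distance_m, alpha_db_per_m)
            + target_strength_db)


def max_detection_distance_m(source_level_db: float, target_strength_db: float,
                             alpha_db_per_m: float, hearing_threshold_db: float,
                             upper_m: float = 100.0) -> float:
    """Bisect for the largest distance at which RL >= hearing threshold.
    RL decreases monotonically with distance, so bisection is sufficient."""
    def detectable(distance_m: float) -> bool:
        level = received_level_db(distance_m, source_level_db,
                                  target_strength_db, alpha_db_per_m)
        return level >= hearing_threshold_db

    if not detectable(REFERENCE_M):
        return 0.0
    if detectable(upper_m):
        return upper_m
    low, high = REFERENCE_M, upper_m
    for _ in range(60):
        mid = (low + high) / 2.0
        if detectable(mid):
            low = mid
        else:
            high = mid
    return low


# Hypothetical inputs except for the source level (see lead-in).
distance = max_detection_distance_m(source_level_db=130.0, target_strength_db=-20.0,
                                    alpha_db_per_m=1.0, hearing_threshold_db=20.0)
print(f"estimated maximum detection distance: {distance:.1f} m")
```

With these assumed inputs the estimate falls in the range of a few metres, in line with the short detection distances reported above; changing any one input shows how sensitive the estimate is to that parameter.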

The directional echolocation calls of bats allow an increased detection distance ahead of the bat while reducing the sound levels off to the sides and the back. This reduction in off-axis sound level offers an additional benefit as it reduces echoes from objects in these directions that are likely of little interest to the bats. Echoes from irrelevant objects are known as clutter echoes and reducing them simplifies the acoustic scene that the bats experience. The obvious disadvantage in emitting directional echolocation calls is the loss of echoes from relevant off-axis objects. The degree to which the benefits outweigh the costs of emitting a very directional echolocation call varies with the environment and the behavioral context. The directionality of the echolocation call is determined by the emitted frequency and the shape and size of the sound emitter. For mouth-emitting bats, this is the shape and size of the open mouth, and for nose-emitting bats, the shape and size of the nostrils and the nose-leaf (Hartley and Suthers 1987, 1989; Strother and Mogus 1970). Higher frequencies and larger emitters produce higher directionality (Fig. 12.7). Varying the frequency, shape, and size of the emitter allows the bats to adjust the directionality of the emitted call to suit their environment (Kounitsky et al. 2015; Surlykke et al. 2009b). During the final buzz of prey pursuit, bats can broaden their echolocation beam to increase peripheral echo levels and better track the prey (Jakobsen et al. 2015; Jakobsen and Surlykke 2010; Matsuta et al. 2013; Motoi et al. 2017). This is achieved in several species by a sudden drop in call frequency by nearly an octave (as illustrated in Figs. 12.4, 12.7, and 12.8) and is often referred to as the buzz II phase.

Fig. 12.7 Echolocation call directionality as a function of emitter size and frequency. Directionality increases with increasing frequency and increasing size. Reprinted by permission from Springer Nature. Jakobsen L, Ratcliffe JM, Surlykke A. Convergent acoustic field of view in echolocating bats. Nature 493 (7430):93–96. https://www.nature.com/articles/nature11664. © Springer Nature, 2013b. All rights reserved

Fig. 12.8 Echolocation calls emitted by a low duty-cycle bat (Myotis daubentonii) with strongly frequency-modulated calls (left) and a high duty-cycle bat (Rhinolophus formosae) with mostly constant frequency calls (right)

The majority of echolocating bats, and the focus of our description so far, hunt flying insects (aerial hawking bats) using relatively short-duration echolocation calls, also known as low duty-cycle calls. Duty cycle is the duration of the call divided by the call period (the time from the start of one call to the start of the next call). There are, however, many species that forage and echolocate differently. About 150 species, including the Old World horseshoe bats and hipposiderid bats, as well as Pteronotus parnellii and closely related species in the family Mormoopidae from the New World, also feed on flying insects. These bats are so-called high duty-cycle echolocators and are able to broadcast and receive sound at the same time. While low duty-cycle bats maintain a clear time separation between the emitted call and returning echo, high duty-cycle bats separate call and echo by frequency. They all emit much longer duration, constant-frequency echolocation calls with short intervals to navigate and forage (Fig. 12.8; Fenton et al. 2012). When an echo-generating object, such as a moth, moves relative to the bat, the echo returns to the bat at a slightly different frequency than the emitted call because of the Doppler shift. The classical example used to explain the Doppler shift phenomenon is the moving ambulance. When an ambulance moves toward a nearby listener, the siren sounds higher in frequency to the listener than to someone riding in the ambulance, for whom the pitch does not change. The effect is most apparent when the ambulance passes and moves away from the listener: the perceived frequency abruptly drops from higher to lower in pitch. Doppler shift occurs because the motion of the ambulance shortens the wavelength of the sound ahead of it and lengthens the wavelength behind it, raising or lowering the perceived pitch of the siren. The amount of the Doppler shift is doubled for echolocating animals, as the frequencies of both outgoing and returning signals are shifted.
The Doppler shift experienced by an echolocating animal may be computed as:

$$ \Delta f=\left({v}_1+{v}_2\right)\times f\times \cos \uptheta \times \frac{2}{c} $$

Here, Δf is the amount of Doppler shift in Hz, v1 is the speed of the echolocating animal in m/s, v2 is the speed of the target in m/s (+ indicates movement toward the echolocator; − indicates movement away from the echolocator), f is the emitted frequency in Hz, θ is the angle in degrees between the echolocator and the target, and c is the speed of sound in the medium (about 344 m/s in air and 1500 m/s in water).
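A short numerical example may help put the formula in scale. The sketch below evaluates it for a bat flying at 5 m/s (an assumed speed within the 2–10 m/s range given earlier) directly toward a stationary target while emitting at 82 kHz; the target is taken as stationary so that only the echolocator's own speed contributes. The function name is illustrative.

```python
import math


def doppler_shift_hz(v_echolocator_m_s: float, v_target_m_s: float,
                     emitted_hz: float, angle_deg: float,
                     sound_speed_m_s: float = 344.0) -> float:
    """Two-way Doppler shift: (v1 + v2) * f * cos(theta) * 2 / c."""
    return ((v_echolocator_m_s + v_target_m_s) * emitted_hz
            * math.cos(math.radians(angle_deg)) * 2.0 / sound_speed_m_s)


# Assumed scenario: 5 m/s flight speed, stationary target, head-on approach.
shift = doppler_shift_hz(v_echolocator_m_s=5.0, v_target_m_s=0.0,
                         emitted_hz=82_000.0, angle_deg=0.0)
print(f"Doppler shift: {shift:.0f} Hz")
```

A shift of roughly 2.4 kHz on an 82-kHz carrier is large compared with the few-Hz resolution described for horseshoe bats, and the cosine term reduces the shift to zero for a target at 90 degrees.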

Perception of a Doppler shift by an echolocator is facilitated by emitting long signals tuned to one frequency (narrowband or constant frequency) and by having acute hearing in the frequency band of the Doppler-shifted echo. Specifically, Doppler-shifted echoes are dominated by different frequencies than those dominating outgoing pulses (Fenton et al. 2012) and bats using this strategy are therefore not sensitive to overlap of the two.

Greater horseshoe bats (Rhinolophus ferrumequinum) detect the frequency and amplitude modulations of the Doppler-shifted echo from an insect to within a few Hz of the ~82 kHz carrier-frequencies of their echolocation calls (Neuweiler 2000). The bats that use Doppler-shifted echoes readily detect the wing beats of a fluttering insect and distinguish the prey from the background. Flutter-detection is a recurring theme among bats that exploit Doppler shifts (Goldman and Henson 1977; Schnitzler and Flieger 1983; Lazure and Fenton 2011).

Bats that exploit Doppler-shifted echoes are known as Doppler-shift compensators (DSC; Hiryu et al. 2016) because they continuously adjust the outgoing signal to ensure that the Doppler-shifted echoes remain at the frequencies to which their acoustic foveae are tuned (Schuller and Pollack 1979; Schnitzler 1968; Schnitzler and Flieger 1983; Hiryu et al. 2016).
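The compensation can be illustrated by inverting the Doppler relation given earlier in this section. The sketch below is a first-order approximation for a head-on approach, not a model of the bat's actual control mechanism: it assumes the echo frequency is f × (1 + 2v/c) and solves for the emitted frequency that returns the echo at a fixed reference frequency. The 82-kHz reference and 5 m/s closing speed are example values, and the function names are illustrative.

```python
def echo_frequency_hz(emitted_hz: float, closing_speed_m_s: float,
                      sound_speed_m_s: float = 344.0) -> float:
    """Echo frequency under the first-order two-way Doppler approximation."""
    return emitted_hz * (1.0 + 2.0 * closing_speed_m_s / sound_speed_m_s)


def compensated_call_frequency_hz(reference_hz: float, closing_speed_m_s: float,
                                  sound_speed_m_s: float = 344.0) -> float:
    """Emitted frequency that makes the echo return at the reference frequency."""
    return reference_hz / (1.0 + 2.0 * closing_speed_m_s / sound_speed_m_s)


# Example values (assumed): echo held at 82 kHz while closing at 5 m/s.
emitted = compensated_call_frequency_hz(reference_hz=82_000.0, closing_speed_m_s=5.0)
print(f"emit at {emitted:.0f} Hz so the echo returns at "
      f"{echo_frequency_hz(emitted, 5.0):.0f} Hz")
```

Under these assumptions the bat would lower its call by a little over 2 kHz; a faster approach requires a larger downward adjustment.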

There is no current evidence that toothed whales or other echolocators using broadband clicks are capable of Doppler-shift compensation. However, the small harbor porpoise would be a good species to test for Doppler-shift sensitivity, as it has narrow auditory filters (Popov et al. 2006) and uses relatively long (100 μs), narrowband echolocation clicks centered around 130 kHz.

High duty-cycle bats, in general, have a highly specialized hearing to facilitate this type of echolocation and they modify their emitted echolocation calls such that the frequency of the returning echoes always falls within a very narrow frequency range for which their hearing is optimized (Fig. 12.8 and Sect. 12.4.2) (Schnitzler 1973; Schuller 1977). In spite of the large differences between high and low duty-cycle bats, the overall call emission pattern when catching flying insects is still remarkably similar. High duty-cycle bats still emit calls that correspond to the three phases of search, approach, and buzz when they pursue flying insects, including similar call-structure changes to those in the low duty-cycle bats: gradual source-level reduction, duration shortening, increasing repetition rate (Ratcliffe et al. 2013), and broadening of the echolocation beam during the terminal buzz (Matsuta et al. 2013).

Bats that do not forage for flying insects generally search for more conspicuous food. Many species hunt non-flying insects in dense vegetation, a strategy known as gleaning. Gleaning bats, in general, emit very short low-intensity calls that sweep over a broad range of frequencies (Denzinger and Schnitzler 2013). As noted earlier, such calls provide excellent localization and classification and the low intensities greatly weaken clutter echoes, which is particularly important when flying in dense vegetation. Fruit and nectar eating can be considered variations on the gleaning strategy, and the echolocation behavior of fruit-eating and nectar-drinking bats very closely resembles that of insect-gleaning bats (Denzinger and Schnitzler 2013). Notably, while these species often cluster their calls in groups with increased repetition rates when faced with increasing acoustic complexity, they do not emit the terminal buzz characteristic of bats that target flying insect prey (Gonzalez-Terrazas et al. 2016). In addition, they often rely on additional sensory input, such as olfactory cues (Gonzalez-Terrazas et al. 2016), or, in the special case of vampire bats, thermoreception (Kürten and Schmidt 1982).

12.4.2 Hearing Anatomy and Echolocation Abilities

The hearing of echolocating bats is based on standard mammalian hearing anatomy, including recognizable pinnae, tragus, ear canal, tympanic membrane, three middle ear bones, and a coiled cochlea. With few exceptions, they even have the same hearing threshold at their best frequencies as most other mammals: 0 dB re 20 μPa (Fay 1988; Fig. 12.9). There are, however, notable specializations related to echolocation in which bats differ from most mammals. Most bats have a larger than average pinna and tragus, but there is considerable variation across species in size and shape that likely relates to the bat’s echolocation signals and foraging ecology (Coles et al. 1989; Obrist et al. 1993) (Fig. 12.3). In general, bats that complement their echolocation by passive listening for prey-generated sounds have larger pinnae than bats that rely solely on echolocation (Obrist et al. 1993). The pinna provides substantial directionality and acoustic gain depending on the relationship between pinna size and sound frequency. The pinnae of gleaning bats commonly amplify sound well below the bats’ echolocation frequencies (Coles et al. 1989; Guppy and Coles 1988; Obrist et al. 1993; Schmidt et al. 1983). The acoustic gain provided by the large pinnae affords some bats extremely low hearing thresholds, such as the impressive −20 dB re 20 μPa hearing threshold found in the brown long-eared bat (Plecotus auritus) and the Indian false vampire bat (Megaderma lyra) (Coles et al. 1989; Schmidt et al. 1983). While pinna structure plays a crucial role in bat echolocation, large external ears are a disadvantage during flight. Large ears create substantial drag, and it is likely that the ears of fast-flying bats are shaped as much by the aerodynamics of flight as by echolocation (Gardiner et al. 2008; Johansson et al. 2016; Vanderelst et al. 2015).

Fig. 12.9

Audiograms of three echolocating bats and two echolocating bird species. A non-echolocating bird is shown for comparison. Bat thresholds are based on behavioral experiments, bird thresholds are derived from neurophysiological experiments. Green: big brown bat (Eptesicus fuscus, from Dalland 1965); light blue: Egyptian fruit bat (Rousettus aegyptiacus, from Koay et al. 1998); purple: greater horseshoe bat (Rhinolophus ferrumequinum, from Long and Schnitzler 1975); dark blue: oilbird (Steatornis caripensis, from Konishi and Knudsen 1979); red: swiftlet (Aerodramus spodiopygia, from Coles et al. 1987); yellow: black-capped chickadee (non-echolocating, from Wong and Gall 2015). Thresholds are not directly comparable between species due to differences in experimental conditions

As mentioned above, bats progressively decrease their emitted intensity as they approach objects. This is believed to function primarily as gain control for the auditory system, a phenomenon also seen in echolocating odontocetes (see Sect. 12.5.2). If the bats kept their output level constant, the echo level would increase progressively by many orders of magnitude as the bat approached an object. Considering small insects as point sources, this increase would follow 40 × log10(r), or 12 dB per halving of distance r. In practice, the output call level generally decreases by 6 dB per halving of distance (Boonman and Jones 2002; Brinkløv et al. 2013; Hartley 1992a, b; Lewanzik and Goerlitz 2018). Such a reduction results in a constant intensity at the object/prey, but a progressive increase in echo strength at the bat of 6 dB per halving of distance. However, the bat’s auditory system reduces its sensitivity by an additional 6 dB per halving of distance, because as the bat vocalizes, the middle ear muscles contract to avoid self-deafening, increasing the bat’s hearing threshold. The gradual relaxation of the middle ear muscles then progressively decreases the bat’s hearing threshold back to resting level. This time-dependent change in hearing threshold corresponds almost perfectly to the missing 6 dB per halving of distance and presumably provides a constant perceived echo level for the bat (Hartley 1992a, b; Henson 1965; Suga and Jen 1975). It is worth noting that these results were obtained under very predictable laboratory conditions and that in a real-life field scenario, bats encounter much more unpredictable conditions and prey behavior. Recordings of prey capture in the field reveal that intensity reduction is much more variable and commonly exceeds 6 dB per halving of distance (Nørum et al. 2012). This subject is also discussed below for harbor porpoises and dolphins.
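The decibel bookkeeping in this paragraph can be checked with a short calculation. The sketch below is illustrative only: it assumes an idealized point target with spherical spreading on both the outgoing and returning path, ignores atmospheric absorption, and uses hypothetical function names. The nominal "6 dB" steps are the rounded values quoted in the text (the exact value of 20 × log10(2) is about 6.02 dB).

```python
import math

# Nominal step quoted in the text: 6 dB per halving of distance.
NOMINAL_STEP_DB = 6.0


def echo_gain_constant_output_db(r_far_m, r_near_m):
    """Rise in echo level (dB) at the bat when the range to a point target
    shrinks from r_far_m to r_near_m and the call level stays constant.
    Two-way spherical spreading gives 40*log10(r_far / r_near)."""
    return 40.0 * math.log10(r_far_m / r_near_m)


def perceived_echo_change_db(r_far_m, r_near_m,
                             call_step_db=NOMINAL_STEP_DB,
                             hearing_step_db=NOMINAL_STEP_DB):
    """Net change in perceived echo level (dB) once the bat lowers its call
    level by call_step_db and its hearing sensitivity by hearing_step_db
    for every halving of distance."""
    halvings = math.log2(r_far_m / r_near_m)
    raw_gain = echo_gain_constant_output_db(r_far_m, r_near_m)
    return raw_gain - halvings * (call_step_db + hearing_step_db)


# Approach from 4 m: compare uncompensated and compensated echo levels.
for r_near in (2.0, 1.0, 0.5):
    raw = echo_gain_constant_output_db(4.0, r_near)
    net = perceived_echo_change_db(4.0, r_near)
    print(f"4.0 m -> {r_near} m: uncompensated {raw:+.1f} dB, "
          f"compensated {net:+.1f} dB")
```

Setting `hearing_step_db=0.0` reproduces the intermediate case described above, in which call-level reduction alone still leaves roughly 6 dB of echo increase per halving of distance.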

Bat hearing is certainly specialized for echolocation and for high frequencies (Fig. 12.9). Other small mammals such as mice and rats have similar high-frequency hearing. Bats are, however, much more sensitive up to their high-frequency limit and have very high sensitivity over a much wider range of frequencies. In echolocating bats, the cochlea is significantly larger relative to skull size than in non-echolocating bats, and the basilar membrane, where frequency coding occurs, is longer in echolocating bats than in all other mammals (Kössl and Vater 1995). High duty-cycle bats have the longest basilar membranes, which contain an acoustic fovea: a large region of the membrane dedicated to a very narrow frequency range. The acoustic fovea provides the crucial frequency resolution and sharp tuning that allows high duty-cycle bats to separate call and echo by frequency instead of time (Bruns and Schmieszek 1980).

Bats use the time delay between their outgoing call and the returning echo to determine the distance to a target. They determine the horizontal direction to the object by comparing the input on the two ears. For bats, interaural intensity differences likely provide the main cues (Pollak 1988). The vertical direction is mainly coded by frequency-dependent reflections from the pinna and tragus (Lawrence and Simmons 1982a). Bats have excellent spatial resolution and accuracy. They consistently aim their echolocation beam to within less than 5° of their target both horizontally and vertically (Ghose and Moss 2003; Jakobsen and Surlykke 2010; Masters et al. 1985; Surlykke et al. 2009a) and can discriminate between two objects in the horizontal plane if they are more than 1.5° apart (Simmons et al. 1983) and, in the vertical plane, if they are more than 3° apart (Lawrence and Simmons 1982a).
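The relation between echo delay and target distance described above can be written out explicitly. The following is a minimal sketch, assuming a constant speed of sound in air of 343 m/s (a typical value for about 20 °C, not a figure from this chapter) and a stationary bat and target; the function names are hypothetical.

```python
# Assumed speed of sound in air at roughly 20 degrees C (not from the chapter).
SPEED_OF_SOUND_AIR_M_S = 343.0


def target_range_m(echo_delay_s, c_m_s=SPEED_OF_SOUND_AIR_M_S):
    """Distance to the target from the call-to-echo delay.
    The sound travels out and back, so the one-way range is c * t / 2."""
    return c_m_s * echo_delay_s / 2.0


def echo_delay_s(range_m, c_m_s=SPEED_OF_SOUND_AIR_M_S):
    """Call-to-echo delay expected for a target at the given range."""
    return 2.0 * range_m / c_m_s


# Delays expected for targets at a few example ranges.
for range_m in (0.5, 1.0, 5.0):
    print(f"range {range_m} m -> delay {echo_delay_s(range_m) * 1000:.2f} ms")
```

Under these assumptions, each millisecond of delay corresponds to roughly 17 cm of target distance.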

Aerial hawking bats can easily be tricked into catching small pebbles thrown in the air. This is not because bats cannot distinguish pebbles from insects, but likely because most airborne items of a given size are edible to bats. Classification of small objects is based on temporal and spectral features of the echo generated by one or more reflections from the objects (Schmidt 1988; Simmons et al. 1990; Weissenbacher and Wiegrebe 2003), while the classification of large objects such as trees is more complex (Grunwald et al. 2004). The bat’s resolution of a target depends on both the frequency of the emitted call (higher frequencies reflect more efficiently off smaller structures than do lower frequencies; Fig. 12.6 and Urick 1983) and the bat’s ability to perceive these reflections. Bats are capable of distinguishing similar-sized objects with very minute textural differences. They can clearly distinguish small disks from mealworms when both are thrown in the air and smooth hanging beads from textured beads with the same overall echo-strength (Falk et al. 2011; Griffin et al. 1965).
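The frequency dependence mentioned above follows from the wavelength of the call: structures much smaller than the wavelength return only weak echoes. The sketch below illustrates this with the dimensionless size parameter ka = 2πa/λ, where a is the radius of an idealized spherical scatterer. The speed of sound (343 m/s), the example radius, and the function names are assumptions made for illustration.

```python
import math

# Assumed speed of sound in air at roughly 20 degrees C (not from the chapter).
SPEED_OF_SOUND_AIR_M_S = 343.0


def wavelength_mm(frequency_hz, c_m_s=SPEED_OF_SOUND_AIR_M_S):
    """Wavelength in millimetres for a given frequency: lambda = c / f."""
    return 1000.0 * c_m_s / frequency_hz


def size_parameter_ka(radius_mm, frequency_hz, c_m_s=SPEED_OF_SOUND_AIR_M_S):
    """Dimensionless size parameter ka = 2*pi*a / lambda for a sphere of
    radius a. Values well below 1 indicate weak (Rayleigh) scattering."""
    return 2.0 * math.pi * radius_mm / wavelength_mm(frequency_hz, c_m_s)


# A hypothetical 1 mm radius scatterer at three call frequencies.
for f_khz in (20, 50, 100):
    f_hz = f_khz * 1000.0
    print(f"{f_khz} kHz: wavelength {wavelength_mm(f_hz):.2f} mm, "
          f"ka = {size_parameter_ka(1.0, f_hz):.2f}")
```

For the example radius, ka is below 1 at 20 kHz and above 1 at 100 kHz, which is consistent with the statement that higher frequencies reflect more efficiently off small structures.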

Our account of bat echolocation only contains broad strokes. With around 1200 species of echolocating bats, the variation in echolocation design is vast, and while most follow the outline given here, there are many deviations and many bat species that utilize their echolocation in puzzling ways that are as yet unexplained.

12.5 Echolocation in Odontocetes

Among cetaceans, only species in the suborder Odontoceti (toothed whales) are known to echolocate (Au 1993). Bioacoustical research has focused on bottlenose dolphins, belugas, false killer whales, and killer whales (all in the families Monodontidae and Delphinidae) as well as porpoises (Phocoenidae), sperm whales (Physeteridae), and a few species of beaked whales (Ziphiidae).

Odontocetes use echolocation to orient in the aquatic environment, to detect, chase, and capture prey, and to socialize (Thomas et al. 2004; Thomas and Turl 1990). They have broadband hearing and a good ability to discriminate a signal in noise. Their echolocation signals have narrow beam patterns that can be modified, as can the amplitude and frequency content of outgoing clicks.

The bottlenose dolphin has been the “laboratory rat” of odontocete biosonar studies. A series of experiments by US Navy researchers examined the ability of captive bottlenose dolphins (Tursiops truncatus) to detect subtle differences in human-made objects for military reconnaissance purposes (Au 1993, 2015; Moore and Popper 2019). They showed that dolphins wearing eyecups (so they could not see their targets) and using only echolocation could: (1) distinguish objects of the same shape, but of different materials (e.g., cylinders of glass, metal, or rock), (2) distinguish objects of the same material but different shapes (e.g., PVC cylinders, plates, squares, and tubes), (3) detect a 3-inch hollow metal sphere at about 115 m distance and a sphere of a few millimeters at a distance of about 50 m, (4) feed normally if blind, but if hearing-impaired become disoriented, (5) discriminate metal cylinder targets with different wall-thickness (difference as little as 0.001 mm), and (6) control the amplitude and frequency of their outgoing pulses, such that in areas of high ambient noise, they produced louder and higher-frequency pulses.

12.5.1 Sound Production and Signal Characteristics

Most dolphins emit whistles and burst-pulse sounds for intraspecific communication and brief broadband clicks for echolocation. Figure 12.10 shows four echolocation clicks from a false killer whale (Pseudorca crassidens). Each click generally has four to eight cycles and a duration of 15–70 μs. Peak-to-peak source levels can be very high, from 210 to over 225 dB re 1 μPa at 1 m. High-intensity signals from dolphins generally are broadband and can contain frequencies beyond 100 kHz. The frequencies of dolphin clicks vary almost linearly with the signal intensity, such that, as the peak frequency of echolocation signals increases, the intensity of clicks increases (Au and Suthers 2014).

Fig. 12.10

Left: Waveform of false killer whale biosonar signals with increasing averaged peak-to-peak source level in dB re 1 μPa (relative amplitudes are drawn). Right: Spectra of the corresponding signal type showing increasing peak-frequency with increasing signal amplitude. Adapted by permission from Springer Nature. Au WWL, Suthers RA. Production of Biosonar Signals: Structure and Form, pp. 61–105, in Surlykke A, Nachtigall PE, Fay RR, Popper AN (eds) Biosonar. Springer, New York, NY, USA; https://link.springer.com/chapter/10.1007/978-1-4614-9146-0_3. © Springer Nature, 2014. All rights reserved

All odontocetes studied thus far produce echolocation signals using one or two pairs of phonic lips located in the nasal passages. The lips contain bursae, which are rod-like fatty structures situated just below the blowhole (AB, PB in Fig. 12.11b). The phonic lips produce both echolocation clicks and communication whistles (Cranford et al. 1996).

Fig. 12.11

Schematic sagittal reconstruction of the head of an adult harbor porpoise showing the nasal structures and the position of the larynx (LA). (a) Overview. (b) Detail of boxed area in (a). Blue: air spaces of the upper respiratory tract; gray: digestive system; light gray: cartilage and bone of the skull; yellow: fat bodies. AB: rostral bursa cantantis; AL: rostral phonic lip; AN: anterior nasofrontal sac; AS: angle of nasofrontal sac; BC: brain cavity; BH: blowhole; BL: blowhole ligament; BM: blowhole ligament septum; C: caudal; CS: caudal sac; DI: diagonal membrane; DP: low density pathway; IV: inferior vestibulum; LA: larynx; MA: mandible; ME: melon; MT: melon terminus; NA: nasal passage; NP: nasal plug; NS: nasofrontal septum; PB: caudal bursa cantantis; PE: premaxillary eminence; PN: posterior nasofrontal sac; PS: premaxillary sac; PX: pharynx; RO: rostrum; sm, sphincter muscle of larynx; TO: tongue; TR: trachea; TT: connective tissue theca; V: ventral; VE: vertex of skull; VP: vestibulum of nasal passage; VS: vestibular sac; VV: folded ventral wall of vestibular sac. Reprinted with permission from John Wiley and Sons. Huggenberger S, Rauschmann MA, Vogl TJ, Oelschläger HHA. Functional Morphology of the Nasal Complex in the Harbor Porpoise (Phocoena phocoena L.). The Anatomical Record 292:902–920; https://anatomypubs.onlinelibrary.wiley.com/doi/full/10.1002/ar.20854. © John Wiley and Sons, 2009. All rights reserved

Amundin (1991) and Huggenberger et al. (2009) studied click-production in the harbor porpoise, which can serve as a general example for odontocetes other than sperm whales. Figure 12.11 shows an overview and details of the harbor porpoise sound-producing apparatus (Huggenberger et al. 2009). Air passages are shown in blue, fat in yellow, bone in white, and other tissues in red. Air in the bony nares (NA) is pressurized by the nasopharyngeal pouch and the sphincter muscle of the larynx (sm), possibly with the help of the piston-like action of the rostral end of the larynx (LA) and epiglottis (Ridgway and Carter 1988). The nasal plug (NP) and the blowhole ligament septum (BM) control the flow of pressurized air past the phonic lip pair (AL: Anterior Lip/PL: Posterior Lip) in each naris, resulting in a click-like vibration in the bursae (Anterior Bursa, AB and Posterior Bursa, PB), primarily on the right side. Each click projects from the bursae through a low-density pathway (DP) to the melon (ME) and from there to the water. This low-density pathway (DP) is characteristic of the family Phocoenidae (porpoises) and the subfamily Cephalorhynchinae (small dolphins). In the bottlenose dolphin, and most other delphinids, the anterior bursa (AB) directly abuts the melon. The small amount of air needed to produce a single click ends up in the vestibular air sac (VS) and eventually is recycled to the nasal cavity (NA), rather than exhaled through the blowhole (BH) (Norris et al. 1971; Dormer 1979). This process appears to be the same in all odontocetes.

Dormer (1979) showed that in three delphinids, the right pair of phonic lips produces high-frequency clicks and the left pair produces whistles. Whistles, like clicks, are also transmitted to the melon and into the water but are much less directional due to their lower frequencies. There is conflicting evidence for click-production by the left pair of phonic lips (Madsen et al. 2013; Cranford et al. 2011, 2015). Carefully designed experiments and field recordings are needed to elucidate the full function of the left pair of phonic lips, particularly in species such as porpoises that do not whistle.

In dolphins, porpoises, and river dolphins, the melon (ME in Fig. 12.11) and associated tissues are the primary structures for transmitting echolocation clicks from the phonic lips to the water (Cranford et al. 1996). In the bottlenose dolphin melon, fat is not homogeneous; rather it is composed of varying amounts of triglycerides and wax esters that differentially affect the sound transmission velocity through the melon (Au 1993, 2015). The same is true for the harbor porpoise (Au et al. 2006; Madsen et al. 2010), where the melon contains mainly triglycerides, probably of many different types (chain lengths and degree of saturation) producing different densities (acoustical impedances). The lowest density is near the low-density pathway (DP in Fig. 12.11), while the highest density approximates that of seawater and occurs in the dorsal part of the melon about four centimeters caudal to the upper lip of the harbor porpoise (Kuroda et al. 2015).

The density of muscle and connective tissue above and lateral to the melon (TT in Fig. 12.11) is greater than the density of the melon tissue and keeps sound from leaking out of the melon. In dolphins and the harbor porpoise, a vestibular air sac (VS) is associated with the melon and also acts like a shield to prevent sound leakage. Recent results indicate that the melon of the harbor porpoise functions as an acoustic waveguide (Wei et al. 2017, 2018).

The foreheads of beaked whales (Ziphiidae) and the pygmy and dwarf sperm whales (family Kogiidae) are quite different. Here, the anterior bursae lie against a spermaceti organ filled with wax esters (Cranford et al. 1996). The spermaceti organ abuts the melon, so an echolocation click first passes through the spermaceti organ into the melon and out into the sea. Beaked whales have an extensive sheet of thick, dense, connective tissue rather than air sacs above the spermaceti organ and melon (Cranford et al. 2008). Beaked whales dive deep and hunt at depths of more than 1000 m (Johnson et al. 2006). At such extreme pressures, air sacs would collapse, but the structural adaptation of the forehead would still protect against acoustic leakage from the melon. Song et al. (2015) measured the acoustical properties of the melon in the pygmy sperm whale (Kogia breviceps). The density of the melon tissue, and the velocity and impedance of sound, are highest in the center of the melon. These physical characteristics keep sound from leaking through connective and muscular tissue surrounding the melon. In addition, air sacs above the spermaceti organ of Kogia keep sound in the spermaceti organ. It is unknown how deep Kogia dives, but the presence of air sacs above the spermaceti organ suggests that it does not dive as deeply as beaked whales. Kogia has extreme right-sided asymmetry of the skull bones, the function of which remains unclear.

The bioacoustical system of the sperm whale differs from all other odontocetes (Cranford et al. 1996). Sperm whales (Physeter macrocephalus) have only the right pair of phonic lips, which projects to the tip of the giant rostrum (Fig. 12.12). Click-production is essentially like that of other odontocetes. Air is pressurized in the right naris (Rn) causing a click from the right pair of phonic lips (Mo). A very small amount of sound energy escapes through the distal air sac (Di) at click-production (P0 Fig. 12.12b). The major portion of sound energy projects back through the spermaceti organ (So, heavy dashed line), hits the frontal air sac (Fr) and is reflected through the “junk” (Jo, heavy dashed line) into the water as a powerful and broadband click (P1 in Fig. 12.12b). The sperm whale P1 click is the most powerful biological sound known (with maximum source levels of 236 dB re 1 μPa rms at 1 m, Møhl et al. 2003), and is probably used as a long-distance biosonar probe signal (see Fig. 12.13b). But it has been proposed that these powerful clicks could stun prey. Norris and Møhl (1983) suggested a “big bang theory” for bottlenose dolphins and sperm whales that produce especially loud, single pulses (or bangs). These pulses could debilitate prey for easy capture, but this has never been proven. In fact, a new study using D-tags on sperm whales recorded no “big bangs,” but normal odontocete prey capture behavior (Fais et al. 2016).

Fig. 12.12

A schematic drawing of a sperm whale head. Bl Blow hole; Di Distal air sac; Fr Frontal air sac; Jo Junk organ; Ln Left naris; Mo Monkey lips (museau de singe); Rn Right naris; So Spermaceti organ. (a) communication or coda clicks and (b) echolocation clicks, p1 being the strongest. According to the bent horn model, the production of an intense echolocation click (the solid black dashed lines and p1 in b) generates multiple weaker pulses (p2, p3, p4 in b) owing to reverberation of the initial sound (p1) between Di and Fr (the thin dashed lines). The whale can modify click generation to produce coda, or weaker communication clicks (the red solid line). This indicates that the whale can somehow control where the click, generated by the monkey lips (Mo), reflects off the frontal air sac (Fr) thus exiting near the distal air sac (Di). Modified from Caruso et al. (2015). © Caruso et al. 2015; https://doi.org/10.1371/journal.pone.0144503. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/.

Fig. 12.13

Multi-pulse structure of a sperm whale click. The P1 click is the most intense and broadest in frequency. It is the most powerful biological sound known. The following clicks of decreasing amplitude (P2–P4) are caused by reverberations in the nose of the whale (see also Fig. 12.12). From Møhl et al. (2003). © Acoustical Society of America, 2003. All rights reserved

A fraction of P1 energy reflects from the distal air sac causing a P2 click to be emitted at a delay consistent with the length of the head (spermaceti organ). The reverberation continues (P1 to P4 in Figs. 12.12b and 12.13a), resulting in a multi-pulse structure. Cranford et al. (1996) proposed that the spermaceti organ and the junk are homologous with the posterior and anterior bursae in the dolphin, respectively.
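The link between pulse delay and head length can be illustrated with a simple two-way travel-time estimate. This is a rough sketch rather than an established sizing method: it assumes each successive pulse has made one extra round trip through the spermaceti organ, and the default sound speed of 1400 m/s is a placeholder assumption, not a value given in this chapter. The function names are hypothetical.

```python
# Placeholder sound speed in the spermaceti organ (an assumption for this
# sketch, not a measured value from the chapter).
ASSUMED_SOUND_SPEED_M_S = 1400.0


def acoustic_path_length_m(inter_pulse_interval_s,
                           c_m_s=ASSUMED_SOUND_SPEED_M_S):
    """One-way acoustic path length implied by the delay between successive
    pulses (e.g., P1 to P2), assuming one extra round trip: L = c * IPI / 2."""
    return c_m_s * inter_pulse_interval_s / 2.0


def inter_pulse_interval_s(path_length_m, c_m_s=ASSUMED_SOUND_SPEED_M_S):
    """Inter-pulse interval expected for a given one-way path length."""
    return 2.0 * path_length_m / c_m_s


# Hypothetical inter-pulse intervals in milliseconds.
for ipi_ms in (3.0, 5.0, 7.0):
    length = acoustic_path_length_m(ipi_ms / 1000.0)
    print(f"IPI {ipi_ms} ms -> path length of about {length:.2f} m")
```

Because the result scales directly with the assumed sound speed, any estimate of this kind is only as good as that assumption.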

Although the sound-generating apparatus is basically similar in odontocetes, the outgoing sound from the melon can differ substantially among species. Initially, the action of the phonic lips, controlled by pneumatic pressure, influences the intensity of the click. Stronger hammer-action of a phonic lip pair means the transmission of more intense and higher-frequency clicks (Finneran et al. 2014; Fig. 12.10).

During orientation, most delphinids produce short, broadband echolocation clicks (Au 1993) often of high intensity. They produce less intense, but rapidly repeated clicks, analogous to a bat’s buzz when approaching objects or prey (see Fig. 12.1). A single click of a wild white-beaked dolphin lasts about 15 μs and has energy from about 30 kHz to over 200 kHz (Rasmussen and Miller 2002). The sperm whale also fits into this category (Møhl et al. 2003) with a broadband P1 click (Fig. 12.13b).

At present, it seems that the modulation of clicks in the harbor porpoise occurs in the whale’s forehead and that the basic echolocation signals entering the forehead are short-duration, broadband clicks. Madsen et al. (2010) used contact hydrophones to show that a harbor porpoise click recorded near the right (or left) phonic lip pair is broadband. The same click recorded on the melon, along the midline of the animal near the exit point of the sound, has the typical polycyclic narrowband structure. The narrowband high-frequency click (Fig. 12.14) somehow results from the melon and associated tissues, but the details of this mechanism are unknown.

Fig. 12.14

(a) Echolocation click from a harbor porpoise. (b) Spectrum of a harbor porpoise click. The harbor porpoise is one of several smaller toothed whales that use a high-frequency narrowband echolocation click (Galatius et al. 2019). From Fig. 12.1 in Miller and Wahlberg (2013); © Miller and Wahlberg 2013; https://doi.org/10.3389/fphys.2013.00052. Licensed under CC BY 3.0; https://creativecommons.org/licenses/by/3.0/

Beaked whales regularly use frequency-modulated up-swept clicks for orientation and when searching for prey. These are relatively broadband and about 200 μs long (Fig. 12.15). Clicks used during prey capture in the buzz are less than 100 μs long, slightly more broadband than the regular clicks and similar to dolphin clicks. It is unknown how the upsweep of the regular click is generated, but by analogy to the porpoise, the basic signal is likely a broadband click somehow shaped in the forehead of the whale.

Fig. 12.15

Beaked whale click waveform (a), spectrogram (b Hann window, 40-point FFT, 98% overlap), and spectrum (c Hann window, 256-point FFT; dashed line shows ambient noise). Baumann-Pickering et al. (2010). © Acoustical Society of America, 2010. All rights reserved

The directionality of the echolocation sound beam in odontocetes has been studied for many years (Au 1993, 2015; Au et al. 1985, 1986, 1999; Kloepper et al. 2012; Koblitz et al. 2012). Recent work reveals that odontocetes control the shape and direction of the beam (Moore et al. 2008; Wisniewska et al. 2015). A bottlenose dolphin with its head stationary and its mouth on a biteplate moved its sound beam by 26° to the left and 21° to the right when echolocating a movable sphere 9 m away (Moore et al. 2008). Wisniewska et al. (2015) used two-dimensional hydrophone arrays to verify that harbor porpoises approaching a target (a dead fish) voluntarily change the diameter of their echolocation beam to increase the ensonified area by 100–200%, while reducing the interval between clicks in the buzz phase just before prey capture (Fig. 12.16). These changes are analogous to what a bat will do when capturing an insect (Jakobsen et al. 2015). Wild Amazon river dolphins (Inia geoffrensis) also increase the beam width during prey capture (Ladegaard et al. 2017). Increasing the beam width helps the porpoise (or bat) track a moving prey at close proximity. Presumably, the musculature around the melon helps control the beam width and direction in porpoises and dolphins (Moore et al. 2008), but this needs verification.
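The reported 100–200% increase in ensonified area can be translated into a change in beam diameter. The sketch below assumes a circular beam cross-section at a fixed range, so that area scales with the square of the diameter; this geometric simplification and the function name are assumptions made for illustration.

```python
import math


def diameter_factor_for_area_increase(percent_increase):
    """Factor by which the beam diameter must grow for the ensonified area
    (assumed circular, area proportional to diameter squared) to increase
    by the given percentage."""
    return math.sqrt(1.0 + percent_increase / 100.0)


# Area increases reported in the text: 100% and 200%.
for pct in (100.0, 200.0):
    factor = diameter_factor_for_area_increase(pct)
    print(f"+{pct:.0f}% area -> beam diameter x {factor:.2f}")
```

Under this assumption, doubling the ensonified area requires the beam diameter to widen by only about 41%, and tripling it by about 73%.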

Fig. 12.16

The harbor porpoise can increase the ensonified area by nearly 200% during the buzz phase with short inter-click intervals (ICI in b, blue). The large diameter circle (solid in a) illustrates the beam width for clicks with short intervals. The small diameter circle (dashed in a) shows the beam width of clicks with longer intervals emitted in the search phase at longer distances (ICI in b, red). © Wisniewska et al. 2015; https://elifesciences.org/articles/05651. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/. All rights reserved

The direction of the sound beam from the head of a porpoise carcass can be changed by artificially inflating the vestibular air sacs (Miller 2010). With no air in the vestibular air sacs, a broadband click generated by a small hydrophone between the right pair of phonic lips projects left of the midline and vice versa with an artificial click generated between the left phonic lip pair. With air in the vestibular air sacs, the artificial clicks project out the midline (Fig. 12.17; see also Starkhammar et al. 2011; Cranford et al. 2014). Incidentally, the exiting click remained broadband in these experiments indicating that the living melon and associated tissues are necessary for producing a high-frequency, narrowband click typical for the harbor porpoise (Madsen et al. 2010).

Fig. 12.17

Short broadband artificial clicks generated between the phonic lips (right lip: solid arrow and curve; left lip: dashed arrow and curve) of a cadaver harbor porpoise. With air in the vestibular air sacs (right image), the clicks emerge at the midline. Without air in the vestibular air sacs (left image), the clicks emerge on either side of the midline depending on where the artificial click was generated (clicks generated between the right pair of phonic lips emerge to the left and vice versa). Adapted with permission from Miller LA (2010); Prey Capture by Harbor Porpoises (Phocoena phocoena): A Comparison Between Echolocators in the Field and in Captivity; J Marine Acoust Soc Jpn 37 (3):156–168. © The Marine Acoustics Society of Japan, 2010

The primordial odontocete echolocation signal was probably a short, broadband click similar to the clicks used by most living dolphins and the sperm whale (Fig. 12.10, left). In contrast, the La Plata dolphin (Pontoporia blainvillei), six small dolphins (family Delphinidae), all porpoises (family Phocoenidae, six species with four documented), and the pygmy and dwarf sperm whales (family Kogiidae) use narrowband, high-frequency (NBHF) echolocation clicks (see Fig. 12.14). The change from broadband to NBHF echolocation clicks could reflect predation pressure by killer whales (and their ancestors), as well as environmental factors (Andersen and Amundin 1976; Madsen et al. 2005; Morisaka and Connor 2007; Miller and Wahlberg 2013; Galatius et al. 2019). NBHF clicks appear to be generated in the melon and associated tissues (Madsen et al. 2010). It is assumed that all odontocetes can control the amplitude of echolocation clicks, steer the sound beam, and manipulate its width (Moore et al. 2008; Wisniewska et al. 2015). These features are of obvious advantage for detecting and tracking prey. There are rich possibilities in future research of sound production and the use of echolocation by odontocete whales.

12.5.2 Hearing Anatomy and Echolocation Abilities

We refer to Vol. 2 Chap. 9 on aquatic mammals for more detail on hearing anatomy and abilities. Here, we focus on the hearing abilities of odontocetes as they relate to the tasks of obstacle and prey detection by echolocation.

Experimental studies show that the bottlenose dolphin (Li et al. 2011), the false killer whale (Nachtigall and Supin 2008), and the harbor porpoise (Linnenschmidt et al. 2012, 2013) have voluntary control over the level of the emitted click and of their auditory sensitivity during echolocation tasks. The results from the harbor porpoise clearly illustrate active hearing during the echolocation of targets: the porpoise maintains a constant level of auditory perception independent of target distance. If the distance to a target is doubled, the level of a click impinging on the target is halved (−6 dB). To compensate for this, the porpoise doubles the level of the outgoing click (+6 dB), keeping the level of the incident sound on the target constant and independent of distance (within a certain range). However, the returning echo is still halved (−6 dB) at double the distance. Linnenschmidt et al. (2012) showed that there is an “automatic gain control” in the auditory system of the porpoise such that its hearing increases in sensitivity by about 6 dB to compensate for the loss in echo level per doubling of distance. Without the compensation in outgoing click level and the gain control in the auditory system, the perceived echo level would drop to 1/4 (−12 dB) per doubling of distance to the target, making echolocation more difficult for the whale.
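The pairing of "halved" with −6 dB and "1/4" with −12 dB in this paragraph is consistent with ratios of sound pressure rather than of intensity. The short sketch below makes the conversion explicit; reading the ratios as pressure ratios is an interpretation of the text, and the function names are hypothetical.

```python
import math


def pressure_ratio_to_db(ratio):
    """Level change in dB for a ratio of sound pressures: 20*log10(ratio)."""
    return 20.0 * math.log10(ratio)


def intensity_ratio_to_db(ratio):
    """Level change in dB for a ratio of intensities: 10*log10(ratio)."""
    return 10.0 * math.log10(ratio)


# The same ratio gives different dB values for pressure and intensity.
for ratio in (2.0, 0.5, 0.25):
    print(f"ratio {ratio}: pressure {pressure_ratio_to_db(ratio):+.1f} dB, "
          f"intensity {intensity_ratio_to_db(ratio):+.1f} dB")
```

A halving of intensity would correspond to only about −3 dB, so the dB values quoted above match pressure ratios.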

Toothed whales obviously find their prey using echolocation, but how they discriminate between prey species is not known and, to our knowledge, has not been studied experimentally. Probably the most spectacular use of echolocation to find prey is shown by bottlenose dolphins off Grand Bahama Island. The dolphins often find fish under the sand using their echolocation, stick their rostrum down into the sand, sometimes as far as the pectoral fins, and come up with a fish in their mouths (Rossbach and Herzing 1997). What echo information they use for this unusual behavior is unknown. Harbor porpoises can discriminate between identical spheres of different materials (Wisniewska et al. 2012). Three harbor porpoises were easily able to distinguish between an aluminum sphere and spheres of plexiglas, PVC, and brass. Two of the three had problems differentiating aluminum from steel spheres. The echo spectra of these two spheres were very similar, so we assume the harbor porpoises were using spectral information to detect the differences among the spheres. Perhaps they also use spectral information together with target strength to distinguish between different fish species.

All echolocating toothed whales have a U-shaped audiogram (Fig. 12.18) and a broad range of hearing extending up to 200 kHz. In general, the hearing of odontocetes is most sensitive at the frequencies used for echolocation. For example, the harbor porpoise, a narrowband high-frequency (NBHF) species, is most sensitive at around 130 kHz, the peak frequency of its narrowband signal. The killer whale uses lower frequencies in its echolocation signals and its best hearing is accordingly lower (Fig. 12.18).

Fig. 12.18 Underwater audiograms of four odontocetes. Blue: Harbor porpoise behavioral audiogram using a 50-ms sound stimulus (Kastelein et al. 2010). Orange: White-beaked dolphin auditory evoked response audiogram using a 1-s sinusoidal amplitude-modulated stimulus (Nachtigall et al. 2008). Purple: Risso’s dolphin (Grampus griseus) auditory evoked response audiogram using a 20-ms sinusoidal amplitude-modulated stimulus (Nachtigall et al. 2005). Yellow: Killer whale average behavioral audiogram of two animals using a 2-s tone (Szymanski et al. 1999)

12.6 Echolocation in Birds

The oilbird (Steatornis caripensis, family Steatornithidae) and a subset of the swiftlets, family Apodidae (about 16 of 27 species, currently including Aerodramus spp. and Collocalia troglodytes), are the only birds known to echolocate (Griffin 1958; Novick 1959; Chantler et al. 1999; Price et al. 2004). Neither seems to use echolocation to find food, but rather for crude orientation in dark caves or tunnels where they roost and nest. Arguably, bird echolocation systems are not a highly evolved sensory specialization in the same sense as in bats and odontocetes.

Disregarding nesting habits, oilbirds and swiftlets have very different ecologies. Oilbirds are nocturnal fruit-eaters from the tropical part of South America (Chantler et al. 1999). Swiftlets occur across the Indo-Pacific and use vision to locate insect prey during the day. There are records of swiftlets hunting at dusk, but it is unclear if they use echolocation during this activity (Price et al. 2004; Fullard et al. 1993).

12.6.1 Sound Production and Signal Characteristics

Like other birds, oilbirds and swiftlets produce sounds, including their biosonar signals, by inducing vibrations in the air flowing past membranous structures in their syrinx (see Vol. 2, Chap. 6). Suthers and Hector (1982, 1985) revealed distinct differences in the syringeal morphology of oilbirds and swiftlets (Fig. 12.19) but proposed similar sound production mechanisms in both. Oilbirds have a bronchial syrinx located caudal to the tracheal bifurcation. The two half-syringes are placed asymmetrically in the two bronchi (Suthers and Hector 1985). The swiftlet syrinx is tracheobronchial (i.e., located where the trachea splits into the two bronchi; Suthers and Hector 1982).

Fig. 12.19 Schematic of syrinx anatomy in the oilbird (based on Suthers and Hector 1988, Fig. 12.2) and the Australian grey swiftlet (Aerodramus (formerly Collocalia) spodiopygia; based on Suthers and Hector 1982, Fig. 12.2), showing the trachea and its bifurcation into the two bronchi. Note the lack of intrinsic syringeal muscles (mm. broncholateralis) in the swiftlet. Note also the asymmetry of the bronchial oilbird syrinx with a more cranial placement of the right semi-syrinx. Adapted by S. Brinkløv

Suthers and Hector suggested that biosonar signals in both oilbirds and swiftlets are produced when contraction of the extrinsic sternotrachealis muscles pulls the trachea caudally. This reduces tension across the syrinx and causes the syringeal membranes to fold into the syrinx lumen, where they induce vibrations of the expiratory airflow. In contrast to their other vocalizations, oilbirds and swiftlets actively terminate their echolocation clicks but do so by using different sets of muscles. In oilbirds, termination is controlled by contraction of the broncholateralis muscles intrinsic to the syrinx (Suthers and Hector 1985). Swiftlets lack intrinsic syringeal muscles (Fig. 12.19) and instead contract extrinsic tracheolateralis muscles to terminate their echolocation clicks (Suthers and Hector 1982).

Bird biosonar signals are relatively broadband and without structured frequency changes over time (Pye 1980). In this sense, they resemble the tongue-clicks of rousette bats more than the signals produced by other echolocators, but they have a narrower frequency range, longer duration, and less well-defined onsets and offsets (Fig. 12.20).

Fig. 12.20 Waveform and spectrogram displays of bird echolocation click sequences. Top panel: oilbird (Steatornis caripensis) exiting cave roost, recorded at Dunstan’s Cave, Asa Wright Nature Centre, Trinidad. Bottom panel: swiftlet (Aerodramus unicolor) returning to its nest in a Sri Lankan railway tunnel. The overall timescale is 1 s, frequency scale is from 0 to 20 kHz. Spectrogram settings: FFT size 256, Hann window, 98% overlap. Both recordings are high-pass filtered at 1 kHz (second-order Butterworth filter)

In the wild, oilbirds emit click-bursts of two or more single clicks in rapid succession (Fig. 12.20). Their clicks and click intervals are stereotyped within such a burst, with click durations of 0.5–1 ms and click intervals of ~2.5 ms. Clicks recorded from oilbirds in the wild have the most energy around 10–15 kHz but extend from 7 to 23 kHz measured at −6 dB from the peak frequency (Brinkløv et al. 2017). The intervals between click-bursts are more variable, but often around 200 ms (Griffin 1953). Each click-burst is perceived by human ears as one coherent sound (Konishi and Knudsen 1979). It is unresolved whether the number of individual clicks in a burst has functional meaning to the oilbird, but recent studies indicate that oilbirds may add click subunits to a burst as a means to increase overall burst energy and, as a result, the echolocation range (Brinkløv et al. 2017). Click-bursts typically have source levels of around 100 dB re 20 μPa rms at 1 m (Brinkløv et al. 2017).
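The suggestion that adding click subunits raises burst energy and echolocation range can be made concrete with a rough calculation. The sketch below is a simplified illustration, not the method of Brinkløv et al. (2017): it assumes identical clicks whose energies add, a receiver that integrates energy over the whole burst, and two-way spherical spreading with no absorption (40 dB per tenfold increase in range). The function names are hypothetical.

```python
import math


def burst_energy_gain_db(n_clicks):
    """Energy gain (dB) of a burst of n identical clicks over a single click."""
    if n_clicks < 1:
        raise ValueError("a burst needs at least one click")
    return 10.0 * math.log10(n_clicks)


def range_factor(gain_db, two_way_db_per_decade=40.0):
    """Factor by which detection range grows for a given energy gain,
    assuming echo level falls by two_way_db_per_decade per tenfold range."""
    return 10.0 ** (gain_db / two_way_db_per_decade)


if __name__ == "__main__":
    for n in (2, 4):
        gain = burst_energy_gain_db(n)
        print(n, round(gain, 1), round(range_factor(gain), 2))
    # prints:
    # 2 3.0 1.19
    # 4 6.0 1.41
```

Under these assumptions, doubling the number of clicks adds about 3 dB of energy but extends range by less than 20%, so any range benefit of longer bursts would be modest.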

Data from captive oilbirds differ somewhat from field recordings. Konishi and Knudsen (1979) reported that oilbird signals had most energy around 2 kHz and described each click as a pulse-like sound burst of 20 ms or more. Suthers and Hector (1985) described a large signal variation including continuous pulsed signals of 40–80 ms and shorter single or double pulses. This difference between field and captive data possibly indicates that the sounds of captive birds do not accurately reflect the echolocation behavior of birds in the wild since vocalization could be affected by reverberant confines or the stress of handling/being restrained.

Swiftlets emit biosonar signals either as single or double clicks (two single clicks in rapid succession, Thomassen et al. 2004; Fig. 12.20). As in oilbirds, it is unclear if the difference between single and double clicks has functional meaning to the swiftlets or is merely an artifact of the sound production mechanism (Suthers and Hector 1982). Of 12 swiftlet species studied, only the Atiu swiftlet (Aerodramus sawtelli) appears to consistently produce single clicks (Fullard et al. 1993), while the rest emit both single and, more often, double clicks. Each click of a pair is 1–8 ms long, with the second often of higher amplitude and slightly longer duration (Griffin and Suthers 1970; Suthers and Hector 1982; Coles et al. 1987). Clicks within a pair have intervals of 1–25 ms and click-pairs are emitted at intervals of 50–350 ms. Swiftlet clicks have most energy below 10 kHz (see spectrogram in Fig. 12.20).

12.6.2 Hearing Anatomy and Echolocation Abilities

While the auditory systems of echolocating bats and odontocetes include specializations that confer increased acuity and sensitivity, only a few such morphological or neurological specializations have been found in echolocating birds. Thomassen et al. (2007) used three-dimensional micro-CT scans to model the middle ear function of a range of swiftlet species. They found no morphological adaptations in the middle ear single bone-lever system of the birds (Fig. 12.21) to improve impedance-matching in echolocating compared to non-echolocating species. Both had low tympanum-to-oval-window ratios relative to bird auditory specialists such as owls. Birds have a straight, rather than coiled, cochlea (Fig. 12.21) and generally do not hear much above 10 kHz (Fig. 12.9, also see Manley 1990, p. 238).

Fig. 12.21 Overview of avian and mammalian middle and inner ear anatomy. Left: Birds have a single middle ear bone (columella) and a straight cochlea. Right: Mammals have three middle ear bones (malleus, incus, and stapes) and a coiled cochlea. Adapted by permission from Springer Nature. Manley GA, Peripheral hearing mechanisms in reptiles and birds; https://www.springer.com/gp/book/9783642836176. © Springer Nature, 1990. All rights reserved

While peripheral auditory adaptations for echolocation seem absent in birds, there is some evidence that certain of the brain nuclei involved in auditory processing are enlarged in echolocating bird species. Thomassen (2005) found that echolocating swiftlets have larger nuclei magnocellularis and nuclei laminaris compared to non-echolocating swiftlets, structures that are both involved in temporal coding of auditory stimuli. The nucleus angularis appears to be enlarged in oilbirds (Kubke et al. 2004) and is known to process intensity information in barn owls (Tyto alba). Iwaniuk et al. (2006) concluded that oilbirds and swiftlets may have enlarged MLds (nucleus mesencephalicus lateralis, pars dorsalis), a structure homologous to the mammalian inferior colliculus. However, this enlargement was only apparent compared to closely related non-echolocating species, not to non-echolocating birds in general.

The hearing abilities of both oilbirds and swiftlets have been tested using neurophysiological approaches and indirectly through obstacle avoidance experiments. Measurements of cochlear and evoked potentials from the forebrain nucleus of anesthetized oilbirds empirically support the absence of inner ear specializations for echolocation. Oilbirds appear to be more or less insensitive to frequencies above 6 kHz and their best auditory sensitivity is at ~2 kHz (Fig. 12.9, and Konishi and Knudsen 1979). Single neuron recordings from the midbrain auditory nucleus of the echolocating Australian grey swiftlet showed best thresholds at 1–5 kHz (Fig. 12.9 and Coles et al. 1987). Hence, both oilbirds and swiftlets appear to have the ‘standard’ bird hearing range, with lowest thresholds between 2 and 4 kHz and poor sensitivity above 10 kHz (Dooling 1980). Curiously, it appears that oilbirds in the wild emit echolocation clicks that are not well-aligned to their best area of hearing. The lack of external ear structures in oilbirds and swiftlets means that directional cues occur at frequencies predicted by head size.

With echolocation signals matching their most sensitive area of hearing, oilbirds and swiftlets should detect objects down to at least 17 cm in diameter, equal to the wavelength of the signal at 2 kHz. For oilbirds, this prediction is supported by obstacle-avoidance experiments, suggesting that they detect discs 20 cm in diameter suspended from the ceiling of their cave roost (Konishi and Knudsen 1979). However, detection thresholds between 0.6 and 2 cm have been found for swiftlets (Griffin and Suthers 1970; Fenton 1975; Griffin and Thompson 1982; Smyth and Roberts 1983), indicating that they may somehow extract echo information from the upper, albeit weaker, frequency range of their signals.
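The 17-cm figure follows from the relation between wavelength, sound speed, and frequency (wavelength = c / f). The short sketch below reproduces it and also converts the swiftlet detection thresholds into the frequencies whose wavelengths match those object sizes. It assumes a sound speed in air of 343 m/s (roughly 20 °C), a value not stated in the text, and treats "object size equals wavelength" as a rule of thumb only. The function names are hypothetical.

```python
SPEED_OF_SOUND_AIR_M_S = 343.0  # assumed value for air at about 20 degrees C


def wavelength_cm(freq_hz, c=SPEED_OF_SOUND_AIR_M_S):
    """Wavelength in cm of a tone of freq_hz travelling at c m/s."""
    return 100.0 * c / freq_hz


def matching_frequency_khz(object_size_cm, c=SPEED_OF_SOUND_AIR_M_S):
    """Frequency in kHz whose wavelength equals object_size_cm."""
    return c / (object_size_cm / 100.0) / 1000.0


if __name__ == "__main__":
    print(round(wavelength_cm(2000.0), 2))        # 17.15
    print(round(matching_frequency_khz(2.0), 2))  # 17.15
    print(round(matching_frequency_khz(0.6), 2))  # 57.17
```

With these assumptions, objects of 0.6–2 cm correspond to wavelengths at roughly 17–57 kHz, well above the frequencies where swiftlet clicks carry most of their energy, which is why the reported thresholds are surprising.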

Like bats and odontocetes, oilbirds and swiftlets detect obstacles in dark spaces using echolocation. Unlike bats and odontocetes, echolocating birds, even the nocturnal oilbird, are also vision specialists and presumably do not forage by echolocation. The importance of vision in oilbirds is reflected in their specialized retinal morphology with multiple layers of photoreceptors (Martin et al. 2004). Initial behavioral experiments revealed that oilbirds flying in darkness consistently produced sounds but could not avoid obstacles if their ears were blocked. With the lights on, in contrast, the birds produced fewer or no sounds and negotiated obstacles even with their ears blocked (Griffin 1953).

Biosonar signals of birds are generally stereotyped (Thomassen and Povel 2006) and there is no indication that birds have the kind of adaptive control over signal frequency seen in most echolocating bats. However, Brinkløv et al. (2017) recently found that the intensity of oilbird echolocation signals increased on darker nights relative to nights with more ambient light. The higher intensity of click-bursts emitted on darker nights resulted both from an increase in the amplitude of individual clicks and an increase in the number of individual clicks per click-burst. Several studies have noted that swiftlets increase click repetition rate as they approach obstacles (Griffin and Suthers 1970; Coles et al. 1987) and Atiu swiftlets emit signals at a higher repetition rate when they enter than when they emerge from their cave roost (Fullard et al. 1993).

Nesting in dark places, such as caves, mines, tunnels, and other places where the lighting is uncertain, is a common feature of the ecology of oilbirds and echolocating swiftlets. Both start clicking as they cross a threshold from light to dark (Fenton 1975; Thomassen 2005; Brinkløv et al. 2017). Neither has been shown to use echolocation for foraging, although oilbirds may be able to detect some of the larger fruits they eat (palm fruits up to 6 cm) by echolocation (Snow 1961, 1962; Bosque et al. 1995).

12.7 Orientation and Echolocation in Insectivores and Rodents

12.7.1 Echo-Based Orientation in Insectivores: Tenrecs and Shrews

Tenrecs and shrews are small insectivorous mammals that forage in dense vegetation or under leaf-litter (Fig. 12.22). Tenrecs are largely endemic to Madagascar, but shrews have a wide distribution across Eurasia and North America. Both have tiny eyes and a presumably well-developed olfactory sense and emit a variety of sounds. The use of sounds by shrews and tenrecs, as they approach and explore unfamiliar objects in their surroundings, led to initial suggestions that they may use echolocation. However, few studies have successfully tested this hypothesis directly. The current consensus is that shrews and tenrecs may use a simple echo-based orientation system to obtain rough acoustic input about their surroundings at short range beyond their snout and vibrissae. As stated by Siemers et al. (2009): “Except for large and thus strongly reflecting objects, such as a big stone or tree trunk, shrews probably are not able to disentangle echo scenes, but rather derive information on habitat type from the overall call reverberations. This might be comparable to human hearing whether one calls into a forest or into a reverberant cave.”

Fig. 12.22 Photographs (from left) of lowland streaked tenrec (Hemicentetes semispinosus), lesser hedgehog tenrec (Echinops telfairi), and northern short-tailed shrew (Blarina brevicauda). Photo of lowland streaked tenrec by Frank Vassen, 2010, https://commons.wikimedia.org/wiki/File:Lowland_Streaked_Tenrec,_Mantadia,_Madagascar.jpg#filelinks. Photo of lesser hedgehog tenrec by Wilfried Berns, 2006, https://en.wikipedia.org/wiki/Lesser_hedgehog_tenrec#/media/File:Kleiner-igeltanrek-a.jpg. Photo of northern short-tailed shrew by Giles Gonthier, 2007, https://en.wikipedia.org/wiki/Northern_short-tailed_shrew#/media/File:Blarina_brevicauda.jpg. All photos licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/deed.en

Gould et al. (1964) and Gould (1965) provided the most direct evidence for echo-based orientation in several species of shrews and tenrecs. After unsuccessful attempts to use an obstacle-avoidance set-up, the animals were instead tested using a so-called disc-platform apparatus. They were trained to find and jump onto a platform suspended at a vertical distance below a disc with an area of partial overlap. The location of the overlap was varied at random between trials. Both tenrecs and shrews emitted sounds during this task in the dark, but animals with their ears blocked were less successful in finding and landing on the platform than control animals. The control experiments included two tenrecs that were blindfolded.

Gould (1965) recorded the sound pulses emitted by captive tenrecs (Echinops telfairi, Hemicentetes semispinosus, and Nesogale (formerly Microgale) dobsoni) as they explored the disc-platform apparatus. The tenrecs emitted series of tongue clicks, each less than 2 ms long with most energy between 10 and 16 kHz. The clicks were produced as singles, doubles, or triplets. Streaked tenrecs (Hemicentetes semispinosus) emitted clicks of low intensity, while those of Nesogale dobsoni were audible to humans at 7 m.

Gould et al. (1964) found that, contrary to the audible pulses of tenrecs, shrews (Sorex vagrans, S. cinereus, S. palustris, and Blarina brevicauda) searching for the platform emitted ultrasonic pulses with most energy between 30 and 60 kHz. The pulses were about 5 ms in duration with inter-pulse intervals of about 20 ms. Sanchez et al. (2019) recorded five Sorex unguiculatus in three different experimental setups, including soft and hard barrier obstacles. Under all three conditions, the shrews emitted a variety of calls, including clicks and several tonal pulse types ranging in frequency between 5 and 45 kHz with durations of 3–40 ms. While several studies have shown that shrews and tenrecs do show context-dependent changes in vocalization rate, there is little direct evidence for echolocation by these animals (Buchler 1976; Tomasi 1979; Forsman and Malmquist 1988; Siemers et al. 2009; Sanchez et al. 2019).

No morphological adaptations for echolocation have been found in the auditory systems of tenrecs or shrews. The limited data on hearing in these animals indicate that at least tenrecs hear well across the frequency range of their tongue-clicks. Sales and Pye (1974) reported that the hearing of streaked tenrecs is most sensitive from 2 to 60 kHz. Drexl et al. (2003) used otoacoustic emissions and auditory evoked potentials from the inferior colliculus and the auditory cortex to determine that the auditory range of lesser hedgehog tenrecs (Echinops telfairi) extends from 5 to 50 kHz at 40 dB SPL, with a lowest threshold at 16 kHz. Siemers et al. (2009) reported a best hearing range of shrews between 2 and 20 kHz.

12.7.2 Echolocation in Rodents

One important test for echolocation is to blind the echolocator. This was done by Griffin (1958) for bats and by Norris et al. (1961) for dolphins. Although such a “blinding test” was not performed, a multifaceted study by He et al. (2021) provides convincing evidence that soft-furred tree mice (Typhlomys) should be added to the list of echolocating animals. Through behavioral experiments in total darkness, filmed with an infrared video camera, they showed that all four species of soft-furred tree mouse emitted acoustic pulses at a higher rate, and grouped pulses more, in complex space than in open space and during obstacle avoidance. Further, three species (T. cinereus, T. daloushanensis, and T. nanus) were tested in a disc-platform setup similar to that used by Gould et al. (1964) for shrews and tenrecs. The tree mice spent more time, and emitted pulses at higher rates, on the sector of the disc above the platform before dropping down onto the platform. This preference was lost when their ears were blocked but regained when the ears were unplugged or fitted with hollow tubes. The study also used laboratory house mice (Mus musculus) as a control to demonstrate the absence of any location preference or sound emission during the disc-platform test. Myriad tests and field studies document the functional use of echolocation by bats and toothed whales, but such studies are not available for insectivores and rodents.

Supplementing the behavioral part of their study, He et al. (2021) also conducted anatomical scans to reveal that the stylohyal bone of soft-furred tree mice is fused with the tympanic bone, which is characteristic of echolocating bats. Lastly, they used genetic analyses to document a strong convergence of hearing-related genes with those of other echolocating mammal groups, including the prestin gene associated with echolocation in bats and toothed whales (Liu et al. 2014). All four species of soft-furred tree mice emit similar short (~2 ms) ultrasonic pulses ranging from 65 to 140 kHz (He et al. 2021).

12.8 Are Echolocation Signals also Used for Communication?

Studies on the role of echolocation signals for intraspecific communication have included observations and recordings, playback experiments, and combinations of these approaches. Echolocation signals elicited territorial behavior in foraging spotted bats, served in individual recognition, and assisted in maintaining group cohesion among foraging molossids (Fenton 1995). Furthermore, bats use buzzes (high pulse repetition rates) not only when attacking prey, but also during landing and drinking and, in several species, in social settings (e.g., Schwartz et al. 2007). Many bat species roost in large groups in caves and emerge at dusk as a group to forage. Several toothed whale species forage in large numbers. Echolocation in bats and odontocetes likely plays a role in maintaining spacing among group members during foraging or during large group movements. However, there has been little research on whether all or only specific animals echolocate while foraging as a group. Groups of flying bats and swimming toothed whales probably eavesdrop on each other’s echolocation signals to gain general information about prey location, but the benefits of such eavesdropping remain to be studied. The energetic cost of sound production for flying bats and for clicking dolphins is negligible (Speakman and Racey 1991; Noren et al. 2017).

Evidence suggests that toothed whales use their echolocation clicks as communication signals. These comprise repeated patterns of rising, falling, or constant click repetition rates up to near 1000 clicks/s. Clicks used for communication by dolphins and porpoises have the same spectral properties as those used for echolocation, but this does not hold true for the coda-clicks of sperm whales, as explained below.

In toothed whales, most is known about the communication role of echolocation clicks from studies of captive harbor porpoises, captive bottlenose dolphins, and wild sperm whales. Porpoises and dolphins communicate with changing click repetition rates, rather like Morse code, without changing the temporal and spectral properties of the clicks (Rasmussen and Miller 2002; Clausen et al. 2010). These “pulse-bursts” (or burst-pulse sounds) of high repetition rate clicks with narrow sound beams are especially good for close range and directed communication (Clausen et al. 2010).

Figure 12.23 shows click rates used in five behavioral contexts between a mother harbor porpoise and her calf. The porpoises used the highest click rates in aggressive encounters, the lowest in grooming and echelon swimming (Clausen et al. 2010). The mother may be aggressive toward her calf and toward males. Aggressive signals were usually higher in intensity and repetition rates and always resulted in the other animal moving away from the emitter. Both mother and calf emitted approach signals, but only the calf emitted contact signals and only the mother emitted grooming signals. Wild harbor porpoises also use rapid click rates for communication (Sørensen et al. 2018).

Fig. 12.23 Use of echolocation click rates by harbor porpoise as communication signals. Five different acoustic behaviors with seven events in each are shown. Note the very rapid increase in click repetition rate up to 1000 clicks/s during aggressive encounters. Reprinted with permission from Taylor & Francis. Clausen KT, Wahlberg M, Beedholm K, Dereuiter S, Madsen PT, Click communication in harbor porpoises (Phocoena phocoena). Bioacoustics 20:1–28; https://www.tandfonline.com/doi/abs/10.1080/09524622.2011.9753630. © Taylor & Francis, 2011. All rights reserved

Bottlenose dolphins use both echolocation clicks and whistles as communication signals. Blomkvist and Amundin (2004) studied two captive female bottlenose dolphins that used high-frequency, high repetition rate pulse-bursts during aggressive behavior. The pulse-bursts lasted up to 900 ms with click repetition rates from 100 to 940 clicks/s. Like the echolocation clicks used for orientation and foraging, the pulses were between 60 and 150 kHz. The metabolic rate of dolphins producing clicks was only slightly greater than that of silent dolphins indicating that echolocation is not energetically costly (Noren et al. 2017).

Several free-ranging species of dolphins (Tursiops truncatus, Stenella attenuata, S. longirostris, S. frontalis, Orcinus orca, and Cephalorhynchus hectori) use pulse-bursts mostly during affiliative and aggressive behavior (Dawson 1991; Herzing 2000; Lammers et al. 2004). Rasmussen et al. (2016) played back artificial pulse-burst signals (repeated at 300 clicks/s for 2 s) to 21 free-ranging white-beaked dolphins. Rather than responding with aggressive behavior, the dolphins mostly changed swimming direction and swam around the projection equipment, mirroring the retreat of individual captive harbor porpoises receiving an ‘aggressive’ pulse-burst. The pulse-bursts, or rasps, of Blainville’s beaked whale are only emitted at depths below 200 m and are composed of a series of short FM clicks similar to its FM echolocation clicks, except with a lower peak frequency. The communication context is not known (Arranz et al. 2011).

Sperm whales are social and form social units in subtropical and tropical waters worldwide. Up to 12 females with young of both sexes gather in long-term stable social units. Sperm whales in all ocean basins communicate using rhythmic “coda” clicks (see Fig. 12.12), which are a unique specialization among toothed whales (Watkins and Schevill 1977) and may even signify individual identity. The composition of codas can have many repetitive patterns, such as one click + a group of three clicks: 1 + 3, or 2 + 1 + 1 + 1, 1 + 1 + 3, etc. The coda patterns are not stereotyped; click intervals within a coda can vary and seem to contain information for the receiver. One stable social unit of five adult females, a juvenile male, and a calf in the waters off Dominica used 15 different codas. All individuals in the unit used several codas and one individual used 11 of the 15 codas (Antunes et al. 2011). A more recent study (Oliveira et al. 2016) confirmed and extended the findings of Antunes et al. (2011). Using digital data acquisition tags (D-tags) attached to five individual sperm whales near the Azores, Oliveira et al. (2016) found strong indications that codas from these sperm whales contained individual identification information. Some of the patterns can be distinct from one area to another while others, like the five-click coda, occur in geographically widespread social units. We have yet to reach a detailed understanding of the use of codas by sperm whales, but codas may carry specific behavioral information from individual sperm whales.

Sperm whale coda-clicks resemble biosonar-clicks (Fig. 12.12) and the same basic mechanism likely underlies the production of both. However, whereas the biosonar-click largely bypasses the distal air sac, reducing the strength of back reflections (P1 etc. in Fig. 12.12), the initial pulse (Po) of the coda-click seems to exit the rostrum more dorsally (see Fig. 12.12). It thus hits a larger portion of the distal air sac and reflects to a larger extent back to the frontal air sac, producing the P1. This difference is indicated by the smaller dB difference between the Po and P1 components for coda clicks relative to biosonar clicks (Fig. 12.12). The large muscle and tendon layer extending from the dorsal edges of the cranium to the tip of the rostrum could play a role in directing the click. The initial pulse (Po) of the coda click is lower in frequency and intensity than that of the biosonar click (Fig. 12.12, relative amplitude values). The intervals between repetitions of a coda click match those of a biosonar click from the same animal (Fig. 12.12b) and reflect the distance between the distal (Di) and frontal (Fr) air sacs (see Fig. 12.12). The properties of the coda clicks make them more suited for close-range and less directional communication than the more intense, higher frequency biosonar clicks (Fig. 12.13).
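The link between pulse intervals and air sac spacing can be illustrated with a simple two-way travel-time calculation. This is a rough sketch under stated assumptions, not a method taken from this chapter: the sound pulse is assumed to travel once back and once forward between the two air sacs, the sound speed inside the nose (1370 m/s) is an assumed placeholder value, and the 3-m sac spacing is a hypothetical example. The function names are likewise hypothetical.

```python
ASSUMED_SOUND_SPEED_M_S = 1370.0  # placeholder sound speed in the nose, not from the text


def inter_pulse_interval_ms(sac_distance_m, c=ASSUMED_SOUND_SPEED_M_S):
    """Interval (ms) between successive pulses for a given air sac spacing,
    assuming one round trip between the distal and frontal sacs."""
    return 1000.0 * 2.0 * sac_distance_m / c


def sac_distance_m(ipi_ms, c=ASSUMED_SOUND_SPEED_M_S):
    """Air sac spacing (m) implied by a measured inter-pulse interval (ms)."""
    return (ipi_ms / 1000.0) * c / 2.0


if __name__ == "__main__":
    ipi = inter_pulse_interval_ms(3.0)    # hypothetical 3-m spacing
    print(round(ipi, 2))                  # 4.38
    print(round(sac_distance_m(ipi), 2))  # 3.0
```

Because the interval depends only on geometry and sound speed in this model, coda clicks and biosonar clicks from the same animal would be expected to share the same pulse interval, as noted above.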

Whether echolocation signals serve a role in intraspecific communication in birds and insectivores has, to our knowledge, not been studied. However, Suthers and Hector (1988) hypothesized that individual differences in syrinx anatomy, specifically the position of the syringeal membranes, would allow oilbirds to distinguish their own signals from those of conspecifics by differences in the spectral characteristics of their clicks.

12.9 Summary

To date, highly specialized echolocation systems have evolved in many bat species and in toothed whales. Oilbirds and swiftlets also make use of a cruder type of echolocation, independent of obvious auditory specializations, for orientation when their visual abilities become insufficient. A more complete understanding of echolocation by birds awaits future studies. A form of echo-based orientation may be present in shrews and tenrecs, but the exact extent of its function still needs proper documentation.

Most echolocators use ultrasonic signals, either broadband clicks (including most toothed whales, rousette bats, oilbirds and swiftlets) or, as in most bats, tonal echolocation calls of constant frequency, frequency-modulated sweeps, or a combination of these call types. Generally, echolocation signals have high amplitude to promote long-range transmission. Bats and dolphins emit echolocation signals in a narrow beam, a sort of acoustic flashlight, to focus their search. In both bats and dolphins, the repetition rate of signals increases as they approach a target. Bats and dolphins can adjust the frequency and amplitude of their biosonar signals to adapt to noisy ambient conditions. Most echolocators do not broadcast and receive echolocation signals at the same time but separate the outgoing pulse from the echo in time to minimize the masking of faint echoes by the next outgoing signal. However, some families of bats are overlap-tolerant and emit long echolocation signals of constant frequency while listening for Doppler-shifted echoes returned by prey items.

Hearing anatomy, physiology, and abilities in bats and dolphins have been well-studied. Bats have a tragus and grooves in their pinnae that aid in signal reception and directional hearing. In contrast, dolphins do not have pinnae but have evolved asymmetrical skull bones that aid in directional hearing. Some bats emit echolocation signals through their nose and have elaborate nose-leafs while others are open-mouth echolocators. Bats produce their echolocation sounds in the larynx. Dolphins emit echolocation sounds through the melon within their forehead and from here into the water. They have phonic lips in their nasal passage to produce their echolocation clicks and communication whistles.

A primary advantage of echolocation is allowing animals to operate and orient in situations where light is uncertain, unpredictable, or entirely absent. But as with other sensory capacities, echolocation often does not stand alone. The cross-modal sensory interactions between echolocation and sensory abilities such as touch, olfaction, and vision are an area awaiting further exploration.

Information leakage is a primary disadvantage of echolocation. The signals used in echolocation are audible to many other animals, such as competing conspecifics, predators, and prey. The evolutionary arms race between echolocating bats and some insect prey is a classic example of predator–prey co-evolution. Signals used in echolocation also can function in communication, as shown in echolocating bats and toothed whales.

Both bats and odontocetes are affected by anthropogenic activities, as exemplified by the high mortality experienced by some bat species from wind turbines and incidents of drowning, for example, in porpoises accidentally entangled in stationary gillnets. Anthropogenic sound sources like road or shipping noise may interfere with efficient foraging in bats and toothed whales and seismic explosions used for offshore oil exploration can affect the behavior of toothed whales and other marine mammals. Echolocating birds are also affected by humans, for example, from poaching or nest collecting and habitat-destructive mining activity. Gaining an increased understanding of echolocation behavior in these animals could have important implications for such issues and for wildlife management in general.

12.10 Additional Resources

For a more in-depth view of bat echolocation, we strongly recommend Griffin’s book Listening in the Dark. While now more than 60 years old, the original observations and insights detailed by Griffin (1958) are still very much to the point and relevant today. The Springer Handbook of Auditory Research volumes Hearing by Bats, Bat Bioacoustics, Hearing by Whales and Dolphins, and Biosonar are also highly recommended as they hold much more detail than the present description. Finally, Thomas, Moss, and Vater edited a book on Echolocation in Bats and Dolphins in 2002.