11.1 Introduction

The study of animal communication, which is sometimes called zoosemiotics (as opposed to anthroposemiotics, the study of human communication), is fundamental to the areas of ethology, evolutionary biology, and animal cognition. Here, we are not so emboldened as to claim that humans are separate from other “animals.” In fact, we are ordinary mammals. Therefore, other than a brief discussion of human language at the end of the chapter, we will not discuss anthroposemiotics. Instead, we highlight and discuss what much of the rest of the Kingdom does.

In Acoustic behavior of animals (edited by Busnel 1963, p. 751), Tembrock stated that, “the production of sounds is not a fancy of Nature, but an expression of biological needs.” Moles (also in Busnel 1963), in what are believed to be the main lines of acoustic communication in animals, included a code that is received and acted upon (p. 112). Groundbreaking as this volume was, knowledge of acoustic communication in animals has come a long way since. Just 20 years later, Kroodsma (1982) published Acoustic communication in birds. The first volume of this multivolume publication discussed the significant advances made in recording animal signals, as well as the advancement in knowledge of the anatomy of neural and auditory structures, the physical characters of signal transmission, signaler motivation and coding, species-specific signaling, and the use of signals in behaviors such as spacing and mating (Morton 1982). The second volume (Kroodsma and Miller 1982) discussed issues of signal ontogeny, mimicry, vocal learning, and the ecological, behavioral, and genetic implications of variations within vocalizations. Other early compendiums, such as Sebeok (1977), provided an extensive summary of high-quality research studies from an expanding discipline of behavior and animal communication.

Bioacoustics is defined as the study of mechanical communication by acoustic (sound) waves. It is a widely used term when referring to animal communication. Biotremology is a relatively recent term. It was conceived to refer to communication signals that comprise substrate-borne vibrations and that are detected as surface vibrations by specialized perception organs such as slit-sense organs in spiders, subgenual organs in insects, hair receptors, or Pacinian and Herbst corpuscles in vertebrates (Hill and Wessel 2016). Substrate-borne vibrations are sensed via “…pressure waves traveling through … solid matter … detected via the surface vibrations they elicit or the airborne waves (sound) they induce” (Hill and Wessel 2016). Bioacoustical (sound) communication refers to signals that are encoded in acoustic waves and are detected using the ear. Vibrational communication has been recognized as evolutionarily older than bioacoustic communication and is much more prevalent among some animal groups (e.g., arthropods; Fig. 11.1). Therefore, researchers are also interested in how these mechanical vibrations affect behavior.

Fig. 11.1

Biotremology examines mechanical communication such as that produced by many insects, including planthoppers (Apache degeeri; common in places such as North Carolina, USA). Photo “9 Apache degeeri (planthopper)” by Wildreturn; https://wordpress.org/openverse/image/4323324f-25c8-408f-9b88-8c5b3ae93655/. Licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/

Both areas of study use similar equipment to record and analyze communication signals. However, scientists in the field of biotremology also use devices such as laser Doppler vibrometers, which detect faint vibrational emissions made by animals, and analytical methods such as wavelet analysis. In addition, electromagnetic transducers produce signals and, when in contact with the substrate, serve as vibration generators for artificial playback experiments.

Now, nearly 60 years after Busnel’s (1963) paradigm of bioacoustics, tremendous changes in recording technology and analysis have occurred. Acoustic identification of anything from birds to bats can be carried out using an iPhone, an acoustic detection application, and a Bluetooth speaker or microphone!

11.2 The Origins of Substrate-Borne Vibrational and Acoustic Communication

Communication is the transfer of information from one animal (sender) to another animal (receiver) that can affect the current or future behavior of the receiver. In other words, communication conveys information. It is adaptive, in that a successful communication exchange enhances the survival of one or both participants. Vibrational communication has been suggested to have evolved, along with chemical communication, concurrently with evolution of the Metazoa (all animals; Endler 2014). We know that any movement of an animal, whether in water or at the boundary between air and any type of substrate, creates vibrations that can be detected by any other organism with receptors capable of receiving and translating them. Increasing evidence also suggests that invertebrate hearing organs evolved from vibrational precursors millions of years ago (Stumpner and von Helversen 2001; Lakes-Harlan and Strauss 2014). Therefore, the discussion of origins of communication in this section is restricted to the more recently evolved acoustic communication.

The origins of acoustic communication are likely to be in nonverbal sounds made by chance as the animal moves through the environment. These sounds could be scraping, a stick breaking, footfalls, opening or flapping of wings, or scratching. They are the result of environmental disturbance, which in turn makes a sound through the air, earth, or water. By just being made, these sounds convey to others the presence of the animal, and something about what it might be doing. It is then a simple developmental step for a particular sound to become associated with a particular situation and thus carry a particular message to the recipient. Examples of nonverbal sounds are sounds from an elephant breaking sticks as it moves through the environment, a sigh, a cough, or a sneeze. Originally, these sounds may not have been made to communicate. However, sounds that provide an advantage for an individual, or a population, will be perpetuated if they enhance the fitness of the species. This, ultimately, gives them an evolutionary advantage that would reinforce further refinement of this new sensory mode.

This origin likely gave the evolutionary opening to develop specialized body parts that could produce auditory signals, in tandem with sophisticated sensory capabilities to receive them (Narins et al. 2009). One such specialized body part is the respiratory tract. Once a respiratory tract had developed in vertebrates, sounds associated with breathing could convey information to others, and so the necessary adaptations for sound generation began to develop. For example, holding the breath and then letting it out as a sigh or a cough produces various sounds. These sounds are then associated with situations being experienced by the sender, meaning this information is available to all who hear it. Presumably, it was this evolutionary process that gave rise to sound-making organs in the respiratory tract to the point where vocal communication now involves a larynx.

Ritualization is the evolutionary process by which a pattern of behavior changes to become more effective as a signal (Huxley 1966; Morris 1957). The behavior is performed in a consistent way and is either stereotyped or incomplete. Incomplete behaviors may be used for activities such as courtship. For example, a drake mallard (Anas platyrhynchos), when preening and displaying to a female, acts as if he is addressing a skin irritation (Morris 1956), but he may not even touch his feathers during the display. In other words, the behavior seems to be a preening behavior, but is in fact a courtship behavior. To increase the effectiveness of the ritualized signal, anatomical modifications may also have evolved. A classic example of this is the elaborate colors of the Mandarin drake (Aix galericulata). During the courtship of a female, the male will highlight these colors by pointing to them during incomplete, exaggerated, and stereotypical preening.

Exaggerated signal ritualization is characterized by a clear signaling behavior, such as the ears of a horse (Equus caballus) flattening back as a precursor signal to biting. This exaggerated ear movement has a clearer meaning than just putting the ears back. Ritualistic behavior is usually no longer tied to its original role because it has become more important for the signaler’s fitness to communicate, rather than being used for its original purpose. Therefore, the signal has evolved to produce a clear message.

Signals can also evolve to become more effective by redundancy, or by emulation of another’s acoustic or vibrational expression. Redundancy in animal acoustic communication is the repeated use of a signal. Vocal signals, for example, can be repeated for long periods of time, such as the continuous chorusing of frogs advertising during mating sessions. Redundancy reduces the risk that a signal will be missed or misinterpreted and assures that the signal is heard even when environmental conditions are poor (e.g., when there are masking sounds from the environment and/or human sources). This continual production of sounds in chorus can also sustain the state of arousal or excitement, which may be necessary for completion of the behavior.

Signal emulation is when other members of a group join in when a signal is given. An example of this is when a group of domestic dogs (Canis lupus familiaris) hear the high-pitched siren of an emergency vehicle. One may start to howl, and others soon join in (Fig. 11.2). When one individual calls, this often stimulates others to make the same call. Other examples include the greeting calls (trumpeting) between mother and offspring elephant (Elephas maximus), or the “see-saw” vocalized inspiration and expiration call and reply signals of bull cattle (Bos taurus). Sound emulation is also common in humans. The vocalization is copied and repeated by a recipient and can cause increased arousal in both the sender and the recipient (Kiley 1972). Animals copying new sounds, which often happens by emulation, requires vocal learning (Janik and Slater 2000).

Fig. 11.2

Emulative acoustic behavior is seen when a domestic dog (Canis lupus familiaris) hears a siren or other high-pitched signal. Photo “Howling white husky” by Tambako the Jaguar; https://wordpress.org/openverse/image/7d77b8d9-3dc4-4f3d-9c04-318833d1759e/. Licensed under CC BY-ND 2.0; https://creativecommons.org/licenses/by-nd/2.0/

A more complex version of this is antiphonal singing, which is an acoustic exchange between animals where they call at the same time to produce a chorus. There are benefits to this emulative calling behavior. Males that chorus, such as frogs and toads (Anura), cicadas (Cicadoidea), and humpback whales (Megaptera novaeangliae), may attract more females to a localized area. For example, millions of cicadas gather to mate in a forest in the eastern US, where the singing males produce loud, pure-tone sounds above 90 dB SPL (Fig. 11.3; Bennet-Clark 1998, 2000). Prairie mole cricket (Gryllotalpa major) males in the south-central US sing in choruses from burrows in the soil that individuals construct in aggregations. At 20 cm from the burrow entrance, the males’ loud harmonic songs average 96 dB SPL (Hill 1998).

Fig. 11.3

17-year cicada (Magicicada sp.). Photo by the U.S. Department of Agriculture; https://www.flickr.com/photos/usdagov/8672057401/in/photostream/. Licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/

The larynx and various resonating cavities in the respiratory tract (throat, mouth, and nasal cavities that can be specialized into trunks or elongated noses) are collectively responsible for an enormous range of vocal sounds made by different species. Vocal signals have evolved to convey a great variety of messages, encompassing many meanings that can be interpreted by the recipients. The development of this messaging system becomes intricate with human language. Whether the degree of development of the young at birth (which could relate to cognitive development; Scheiber et al. 2017; Wilson-Henjum et al. 2019) influences the complexity of vocalizations and other displays is yet to be determined.

11.3 A Summary of Communication

Communication occurs when a signaler encodes a message in a signal, which passes through some medium (air, water, soil, plant organs, etc.), and is received, decoded, and acted upon by the receiver. The receiver’s response benefits the fitness of the signaler, and perhaps itself. It is a common misconception that communication always consists of a simple signal that is reciprocated with a single response. In fact, communication often uses multimodal sensory combinations of visual, olfactory, tactile, gustatory, electrical (as in electric fish or the duck-billed platypus, Ornithorhynchus anatinus), substrate-borne vibrational, and acoustical modes. The use of multimodal signals helps ensure that the message is unmistakable. For example, a cat can swish her tail, pull back her ears, swipe with her claws, and hiss to give an aggressive signal of potential attack, whereas just hissing or swishing her tail is a less clear message.

The focus of this chapter is substrate-borne (vibrational) and acoustic (sound) communication. A signal, for the purposes of this chapter, contains substrate-borne or acoustic information that is broadcast by an individual and is available to be received by another individual. The receiver may be the intended target of the signal or an unintended eavesdropper. Any individual in the environment with the appropriate receptor can receive the signal (Wiley 1983). The receiver of a signal may recognize it as containing information beyond that of just sensing the signal and the presence of the signaler.

11.3.1 Communication Concepts

Marler (1961) recognized four functions of signals: identifiers, designators, prescribers, and appraisers. For example, a male seal swims into the territory of another seal and the territory holder sends out a warning call. This call identifies the place and time of the territory holder (identifier), reports that he is the territory holder (designator), warns that the intruder should stop approaching (prescriber), and allows the intruder to react to his call (appraiser). Smith (1969) expanded this into 12 generalized categories for vertebrates. Since then, with technological and analytical advancements, signal functions have been expanded to include complex displays, either vocal or nonvocal, and the other categories explored below.

Displays are behaviors that use one or several signals. These signals have evolved and become specialized to convey specific information. A classic example of a display behavior is the chest-beating of a mountain gorilla (Gorilla beringei), made famous by King Kong movies. This signal is given only by the dominant silverback male when he encounters a threat, such as another gorilla male, though the display can be practiced or mocked by the young (Fig. 11.4). The chest-beating forms part of a complex threat display, which involves nine steps, and includes both visual and acoustic modalities (Schaller 1964). In other words, the threat display can encompass several different signals.

Fig. 11.4

Displays such as shown by this young gorilla (Gorilla beringei) often accompany both vibrational and acoustic communication. Photo “Gorilla Holding Baby Sister and Beating Her Chest” by Eric Kilby; https://www.flickr.com/photos/ekilby/36360289044. Licensed under CC BY-SA 2.0; https://creativecommons.org/licenses/by-sa/2.0/

A similar threatening display is produced by a dog (Canis lupus familiaris), drawing back its lips and exposing its teeth (visual), as well as growling (acoustic) (Fig. 11.5). Again, this is a complex display involving multiple steps and multiple modalities. However, displays can be simpler, such as a grasshopper (Orthoptera) scraping its wings as an acoustic signal to indicate location and readiness to mate.

Fig. 11.5

Yellow Labrador retriever growls at a border collie, while using a mix of visual displays and vocalizations; the collie responds. "Growl" by smerikal is licensed under CC BY-SA 2.0; https://commons.wikimedia.org/wiki/File:Labrador_Growl.jpg

Much of the communication in insects, other invertebrates, and nonmammalian vertebrates such as fish and amphibians, involves stereotyped signals. That is, the signal is produced in a constant form and the response is evoked only by that signal. As a result, this signal/response relationship becomes characteristic of that species. In this way, stereotyped signals can be important in evolution. For example, if a signal influences mate selection, then a slight alteration in the signal could lead to failure to reproduce, or if mating is successful, it might give rise to a new species.

11.3.2 Biotremology

Vibrational behavior in animals has gained momentum in general awareness and research in the last few decades (Narins 1990; Hill 2008; Cocroft et al. 2014; Hill et al. 2019). Any sort of motion of a living organism produces vibrations in the various media around them, including the soil, air, plants, water surface, or spider webs. Some vibrations can be signals, while others are incidental cues not produced purposefully, or to benefit the sender. The rather new branch of behavioral biology studying vibrational communication is called “biotremology” and is concerned with substrate-borne mechanical waves used as a communication channel (Hill and Wessel 2016). In contrast to airborne sound, which consists of pressure waves only (see Chap. 4, Sect. 4.2.2), in solid substrates mechanical energy can travel in several waveforms, especially at the surface (i.e., the boundary between two distinct media; Fig. 11.6). Surface-borne waves are of special interest as most animals that make use of vibrational communication receive the signals by detector organs. These organs are in contact with a substrate surface, be it the ground, the surface of plant stems and leaves, or the water surface.

Fig. 11.6

Mechanical wave forms produced by a signaling plant-dwelling insect. A planthopper is one of the small relatives of the cicadas. It has a tymbal organ to produce vibrations, which are transferred through its legs, then the thin air layer between its body and the plant surface, to the plant on which it is sucking fluids. By doing this, the planthoppers produce a very faint sound, which can be propagated through the air or soil. The planthopper tymbal organ is homologous to the “drumming organ” of the large singing cicadas. Tens of thousands of these smaller hemipteran bugs use tymbal organs to produce “silent songs.” Reprinted by permission from Elsevier. Hill P SM, Wessel A (2016). Biotremology. Current Biology 26, R181–R191; https://doi.org/10.1016/j.cub.2016.01.054 © Elsevier, 2016. All rights reserved

In addition to pressure waves (P-waves) and shear waves (S-waves) traveling inside the body of a solid (see Chap. 4), Rayleigh waves (R-waves) and Love waves (L-waves) occur at the substrate surface. Both R- and L-waves show particle oscillation perpendicular to the direction of the wave, but they differ in their propagation characteristics. P- and L-waves, for example, both have a higher propagation velocity than R-waves. Animals that can detect those waveforms separately could localize the source of these waves, be it a communication partner, a predator, or prey.

In 1979, Brownell and Farley showed that scorpions localize their prey by using differences in the propagation velocity of P- and R-waves (150 m/s vs. 50 m/s), which they perceive using different sensory organs (tarsal hair receptors vs. basitarsal slit sensilla). That was a significant discovery on the path to biotremology. Until then, the substrate the scorpions use, loose sand, was considered unsuitable for the transmission of vibrational signals and for the differential detection of different waveforms. Since the establishment of the view that a host of natural substrates are suitable for vibrational communication, a great number of (apparently) well-known behaviors are now seen in a new perspective, and new discoveries are made for almost all animal groups with increasing frequency (Hill et al. 2022).
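To illustrate why two wave types with different speeds can carry distance information, the following minimal sketch converts the delay between the arrival of a faster and a slower wave into a source distance. It assumes straight-line propagation at constant speeds, and it uses the 150 m/s and 50 m/s values quoted above as defaults. The function names are hypothetical, and the calculation is only a physical illustration, not a model of how a scorpion's nervous system processes the signals.

```python
def arrival_delay(distance_m, v_fast=150.0, v_slow=50.0):
    """Delay (s) between the fast and slow wave arriving from a source.

    Assumes both waves start together and travel the same straight path
    at constant speeds (illustrative defaults: 150 m/s and 50 m/s).
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if v_fast <= v_slow or v_slow <= 0:
        raise ValueError("need v_fast > v_slow > 0")
    return distance_m / v_slow - distance_m / v_fast


def source_distance(delay_s, v_fast=150.0, v_slow=50.0):
    """Distance (m) to the source, obtained by inverting arrival_delay."""
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    if v_fast <= v_slow or v_slow <= 0:
        raise ValueError("need v_fast > v_slow > 0")
    # d / v_slow - d / v_fast = delay  =>  d = delay * v_fast * v_slow / (v_fast - v_slow)
    return delay_s * v_fast * v_slow / (v_fast - v_slow)


if __name__ == "__main__":
    delay = arrival_delay(0.15)          # source 15 cm away
    print(f"delay: {delay * 1000:.1f} ms")
    print(f"recovered distance: {source_distance(delay):.2f} m")
```

With these assumed speeds, each millisecond of delay corresponds to 7.5 cm of distance, so a source 15 cm away produces a delay of about 2 ms.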

The production of vibrational signals or cues can be accomplished in different ways: drumming (any sort of percussion event where a body part impacts the substrate of soil or a plant or water, etc.), tremulation (a body shaking/trembling that does not strike the substrate as the signal travels through the signaler’s legs to the surface on which they are standing), stridulation (rubbing together a specialized file and scraper, which may be found on a variety of body parts), buckling of tymbal organs in animals that have them, vocalizations, and perhaps others, such as scraping a surface while signaling, or even scratching against a tree, or rolling on the ground. Some of these signal production mechanisms, such as drumming, stridulation, and vocalization, always produce both a substrate-borne (vibrational) and an airborne (acoustic) component with a single action, even if only one of the potential signals is capable of eliciting a response in a receiver.

Arthropods, and especially insects, show the greatest variety of specialized organs to produce vibrational signals. All mentioned means of vibration production, except for vocalization, are present in several groups of arthropods and may have evolved several times, independently. For a subgroup of the insect order Hemiptera, the Tymbalia or tymbal bugs, comprising tens of thousands of species including plant- and leafhoppers, cicadas, and true bugs (Heteroptera), vibrational communication is known to be evolutionarily old and ubiquitous (Hoch et al. 2006; Wessel et al. 2014).

In mammals, most vibrational signals are produced by drumming or vocalization. Curiously, the vibrational communication of the largest land animal, the African savanna elephant (Loxodonta africana), was discovered by O’Connell-Rodwell in the 1990s, when she noticed peculiar behaviors. A freezing behavior in the elephant and change in orientation, without an apparent cause, nevertheless reminded her of the behaviors of the tiny planthoppers whose vibrational communication she had studied earlier (Fig. 11.7). O’Connell-Rodwell and colleagues demonstrated that the signals the elephants generate with low frequency “rumbles” (about 20 Hz) could be very useful for intraspecific long-distance communication (O’Connell-Rodwell et al. 1997, 2000).

Fig. 11.7

Elephant vibration detection posture. (a) To detect a signal, an elephant appears to focus solely on somatosensory detection via receptors in the trunk. Its ears are relaxed suggesting no airborne assessment for signals. (b) Elephant vibration detection posture, where it appears to be using its toenails and trunk to assess a ground-borne signal. Again, its ears are not fully extended. This suggests it uses both bone conduction through the toenails and a somatosensory pathway through Pacinian corpuscles in the trunk for signal detection. Elephants may also lean forward on their front legs with ears flat, sometimes lifting one of the front feet off the ground (possibly for triangulation or better coupling). If focused on an acoustic signal, an elephant will hold its ears out and scan its head back and forth in the general direction of the sound. Reprinted by permission from Springer Nature. Biotremology: Studying vibrational behavior, edited by P. S. M. Hill, R. Lakes-Harlan, V. Mazzoni, P. M. Narins, M. Virant-Doberlet and A. Wessel, pp. 259–276, Vibrational communication in Elephants: A case for bone conduction, C. O’Connell-Rodwell, X. Guan and S. Puria; https://link.springer.com/chapter/10.1007/978-3-030-22293-2_13. © Springer Nature, 2019. All rights reserved

Also, drumming is a type of long-range vibrational signal production. For instance, drumming by prairie chickens (Tympanuchus cupido) can be detected up to 5 km away from the source (Jackson and DeArment 1963). Kangaroo rats (Dipodomys deserti, D. ingens, and D. spectabilis) drum the soil surface (seismic communication) with their feet to communicate such things as territorial ownership, their competitiveness, and their presence and location to other kangaroo rats (Fig. 11.8, Randall 1984; Randall and Lewis 1997; Cooper and Randall 2007).

Fig. 11.8

Kangaroo rats (genus Dipodomys) produce seismic signals by drumming the soil surface with their large hind feet. (left) Photo of “Kangaroo Rat by Stuart Wilson” by cameraclub231 is licensed under CC BY 2.0 (https://www.flickr.com/photos/135081788@N03/49936422922). (right) Ord’s Kangaroo rat (Dipodomys ordii). Photo of “Two Ord’s Kangaroo rats, Alberta” by Andy Teucher licensed under CC BY-NC 2.0; https://www.flickr.com/photos/63265212@N03/8736679123

Many species of marsupial kangaroos (Macropodidae) are known to produce a foot thump when confronted by predators. The intended recipient of the vibration is not known and could be either a predator or other kangaroos (Narins et al. 2009). Sheep and many other ungulates stamp their feet when frightened or aroused in other ways.

As every movement of an animal causes particles in the surrounding media to oscillate and evokes all possible sorts of mechanical waves, it is the mechanism of reception of mechanical signals or cues that defines acoustic vs. vibrational communication. It also follows that every act of communication establishes, at least potentially, a complex communicational network in the realm of the “acousto-vibro-active-space,” whereby the active space for vibrational signals can be surprisingly wide, even bridging air gaps (Fig. 11.9; Virant-Doberlet et al. 2014; Mazzoni et al. 2014; Gordon et al. 2019). On an ecosystems level, we have begun to think of, and to study, a whole complex multilevel vibroscape (Šturm et al. 2021).

Fig. 11.9

Types of communication acts by a vibrational signaler. The signaling lycosid wolf spider establishes vibrational communication with a conspecific receiver, even one that is not on the same substrate as the sender. Likewise, a vibrationally communicating prey (e.g., a planthopper) and an acoustically orienting parasite (e.g., a braconid wasp) are eavesdropping on the spider, thereby establishing a complex communication network. Reprinted by permission from Elsevier. Hill P SM, Wessel A (2016). Biotremology. Current Biology 26, R181–R191; https://doi.org/10.1016/j.cub.2016.01.054. © Elsevier, 2016. All rights reserved

Despite the importance of reception mechanisms for the study of vibrational communication, they are, for now, the least understood aspect in biotremology. Arthropods have in their bauplan—in every body segment and at every joint of their legs—mechanosensitive stretch organs (chordotonal organs) that are responsible for body and movement control, but could also pick up environmental vibrations. In some groups, such as grasshoppers, crickets, and cicadas, chordotonal organs have evolved into ears with a tympanum attached to one end of the stretch organ. It is hypothesized that in every such case these hearing organs transformed through an evolutionary intermediate stage of vibration receptors, i.e., vibrational reception is evolutionarily older than hearing.

A recent breakthrough was the demonstration of the complete pathway, from signaling through reception, to perception, and response behavior, of the vibrational component of the courtship of the fruit fly Drosophila melanogaster. It is the vibrational signaling of the male that triggers the female to freeze at the end of the courtship, facilitating copulation (McKelvey et al. 2021). The male’s vibrational signals are transmitted through the common courtship floor (overripe fruits) and are picked up by a subset of neurons of the female’s femoral chordotonal organ. Through genetic knockout experiments on several mechanotransducer ion channels, McKelvey et al. (2021) also identified an involved protein that is known to be responsible for gentle touch sensitivity in vertebrates, suggesting a deep evolutionary origin of vibrational communication.

In several cases, we need to consider a bimodal acousto-vibrational communication on the signal production as well as on the reception side that results in a complex perception of the environment outside of the experience of human beings. Elephants, for example, produce low-frequency signals by vocal “rumbles” and “foot stomps” that produce airborne vibrations (sound) as well as seismic waves (O’Connell-Rodwell et al. 2000). New findings point to a simultaneous monitoring of the signaling by three reception pathways: sound hearing by the ear’s tympanum, bone conduction hearing, and somatosensory detection via receptors in the trunk (Fig. 11.7; O’Connell-Rodwell et al. 2019). In this way, the overall chance of detecting a signal at all in a heterogeneous environment is improved, and the animals could also make use of the different propagation velocities for assessing the distance to the source of the signal.

11.3.3 Diversity in Communication

Recent evidence indicates that many messages may be conveyed auditorily in nonhuman primates when the larynx is not used. These commonly take the form of rumbling of the stomach, farting, breaking sticks, swishing of grass, sounds during digging or flying, and others. In fact, many sounds made by an individual can carry information to those who hear, but the question is whether they are used for communication. These sounds could just be the result of physiological or environmental adjustments that the sender may or may not be able to control, or that are not recognized as significant in communication. One example is surface behavior in humpback whales. Humpback whales can launch their body out of the water, turn, and splash down on their side or back (breach), or slap the water with their pectoral fins, tail flukes, and even their head. These behaviors produce loud “bang” sounds, thought to be used as communication signals during periods of high underwater noise when vocal signals are not as effective (Dunlop et al. 2010).

In general, the use of these sounds for communication has not been given much research time to date, except for cases where they have been ritualized to carry information to others. For example, we do know, from centuries of hunters’ anecdotal evidence, that an antelope, elephant, or even a rhino will move much more carefully, so as not to make a sound, when it is being hunted than when it is traveling or grazing in a group (e.g., Baze 1950). If this is the case, the individual must recognize that the sound will carry a message (Heyes and Dickinson 1990).

In invertebrates and non-primate vertebrate animals, ascertaining whether or not these signals are being used for communication is more of a challenge. Each movement of an animal’s body creates vibrations that propagate through the environment, and production of these vibrations cannot be eliminated by the individual, even if walking more softly does lower the amplitude. Therefore, we can be certain that in both vertebrate and invertebrate predators, a substrate-borne vibration or sound that alerts potential prey to the presence and direction of movement of the predator is not communication. In animal communication, we refer to this class of unintended information as a cue. On the other hand, we may also be familiar with a hunting dog moving through a meadow and flushing birds on the ground into flight with the result that the hunter can shoot them. We simply do not know if this sort of behavior exists in a more natural, less domesticated setting.

11.4 The Advantages and Disadvantages of Vibrational and Acoustic Communication

Substrate-borne vibrational and acoustic signals are used in communication by almost all invertebrates and vertebrates. Sometimes each type of signal is used by a single species but in different contexts. There are many examples of the two being used across animal taxa in the same basic context. Some major groups of animals have evolved a heavier dependence on one than the other. For example, only as recently as 2015 did we observe the first described substrate-borne signaling in mating birds (Ota and Soma 2022) and in the very well-studied fruit fly Drosophila melanogaster (McKelvey et al. 2021), both of which were well-known for acoustic and visual signaling. These signals are essential for many species to find a mate, keep in contact (such as between mother and young), maintain territory, warn conspecifics of predators, link food location, reinforce social living, communicate emotional state, and many other types of information (Bradbury and Vehrencamp 1998). For any animal, being out in the world advertising your presence has many advantages, but it also has its disadvantages. The advantages of using vibrational and acoustic communication signals are essentially the same. There is no need for light—so signals can be detected at night. Sound can flow around obstacles, so acoustic signals can be heard anywhere and anytime, and even though the substrate filters vibrational signals and cues in ways that are difficult to predict, they still can be detected without respect to time. Compared with other signals, most vibrational and acoustic signals do not need a great deal of energy to produce. Because of the physics of signal propagation, vibrational and acoustic signals can travel over long distances. For instance, in primates, the roaring of howler monkeys (genus Alouatta) can travel up to 1 km.

However, there are disadvantages to vibrational and acoustic communication. These include energetic and developmental costs, such as requiring special structures for signal production and reception. Being able to produce a loud signal often requires new, and possibly elaborate, structures, such as the larynx of vertebrates and the melon of sperm whales (Physeter macrocephalus). Invertebrates have also evolved specialized structures, such as the stridulatory apparatus in insects, which requires a receptor such as the subgenual organ (for substrate-borne vibrations) and the ear (for sound) to pick up the messages. Many animals have evolved specialized receptors to detect substrate-borne vibration signals (Pacinian corpuscles, Meissner’s corpuscles, Eimer’s organ; Narins and Lewis 1984; Narins et al. 2009).

The disadvantages of signaling can, however, be subtle—such as a wasted broadcast when there is no one to receive it or alerting others and then being overcome by a predator. “Blurting out” who and where one is means others can find you. By listening in, these others, or unintended receivers, which could be predators, prey, or even eavesdropping conspecifics, can obtain valuable information about the signaler. This may come at a cost to the signaler. If the unintended receiver is a predator, the cost is obvious: by listening in on the sound signals, the predator can recognize the signaler as prey and locate it. Conversely, prey can be alerted to, and identify, a signaling predator and its location, thus making it easier for prey to avoid predation. A conspecific eavesdropper can gain important information about the signaler/receiver relationship without having to directly take part in the interaction. Siamese fighting fish (Betta splendens), for example, eavesdrop on fighting males to gain information about their strength, which they then use in future interactions (Oliveira et al. 1998; Peake and McGregor 2004). To add further complexity, the presence of an eavesdropper audience can affect communicative interactions and force signalers to change their signaling behavior according to who else may be listening in. This is known as the audience effect and was first documented in a study of domestic chickens (Gallus gallus; Evans and Marler 1991, 1994).

Despite these and other disadvantages, it is obvious that substrate-borne vibrational and acoustic communication and all that they entail have provided extraordinary benefits in competing, surviving, and propagating the next generation. The stories of the development of vibrational and acoustic communication are ongoing and much knowledge about the mechanisms, meanings, and extent of these systems is yet to be discovered.

11.5 The Influence of the Environment on Acoustic and Vibrational Communication

For the most part, animals do not sit in a studio, acoustic lab, or anechoic chamber when signaling acoustically or with substrate-borne vibrations. They are usually in a natural environment subject to atmospheric and other conditions. Signals may be affected by spatial separation and by movement of the caller, and they may even vary spatially or geographically. Environmental noise is a significant factor influencing animal signaling behavior. While few studies to date have addressed vibrational environmental noise, this topic is the focus of a recent review of both terrestrial and marine anthropogenic noise topics and literature, including previously unpublished case studies that can be used as guides for future work (Roberts and Howard 2022).

11.5.1 Atmospheric Conditions

Atmospheric conditions, which include changes in temperature and wind, exert powerful and predictable influences on animal sounds. These influences can cause the ability to detect a signal to change rapidly. The transmission of a signal may be prolonged or modulated by topography, regional weather, seasonality, and climate. Mammalian carnivores, such as coyotes (Canis latrans) and wolves (Canis lupus), live in areas with lower nocturnal temperatures (Mech and Boitani 2003). These animals show crepuscular calling to maximize their chances of being heard over the longest possible distances. Vibrations in the soil or other substrates due to wind or rain can also interfere with normal signal production and reception, to the extent that individuals will stop courtship displays under windy or rainy conditions.

11.5.2 Masking Sounds

Masking sounds are environmental sounds, such as a stream, wind moving through the trees, and sounds from other animals, which cover, or dilute, the signal. In birds and other animals, spatially separating a signal from a masking sound is one way to improve signal detectability. If the signal and masking sound are separated spatially, the receiver can focus efforts to hear the signal. This “spatial release from masking” has been demonstrated in the behavior and physiology of the northern leopard frog (Lithobates pipiens) (Ratnam and Feng 1998). Bee (2007) showed that female Cope’s gray treefrogs (Dryophytes chrysoscelis) approached a target signal more readily when the signal was spatially separated by 90° from a masking sound, implying that this spatial separation aided signal reception. Spatial release from masking has also been shown to occur in budgerigars (Melopsittacus undulatus; Dent et al. 1997) and killer whales (Orcinus orca; Bain and Dahlheim 1994).

A similar mechanism to spatial release from masking is known as the cocktail party effect. Here, the receiver focuses its attention on the signaler, while selectively filtering out other stimuli such as other sounds. At a party, humans can “tune in” to one conversation when many are taking place. Many frogs and songbirds have also been shown to successfully communicate in noisy party-like situations. Frogs can recognize, localize, and respond to signals within a cacophony of chorusing (Gerhardt and Bee 2006; Wells and Schwartz 2006). Songbirds are able to recognize conspecific song and songs from other species within a dawn chorus (Benney and Braaten 2000; Hulse et al. 1997). Offspring and parents are clearly able to reunite successfully within noisy penguin colonies (Aubin and Jouventin 1998).

The above mechanisms demonstrate how the receiver overcomes masking sounds to improve signal detectability. Another way to improve signal detectability is for a signaler to change the way it calls. For example, a signaler could increase its call amplitude, call duration, and/or call at a different frequency. These changes are collectively known as the “Lombard Effect.” The Lombard effect has been demonstrated in species such as the Japanese quail (Coturnix japonica; Potash 1972), budgerigars (Manabe et al. 1998), chickens (Gallus gallus domesticus; Brumm et al. 2009), nightingales (Luscinia megarhynchos; Brumm and Todt 2002), white-rumped munia (Lonchura striata; Brumm and Zollinger 2011), and zebra finches (Taeniopygia guttata; Cynx et al. 1998) and even in large whales such as the humpback whale (Dunlop et al. 2014).
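The amplitude component of the Lombard effect can be illustrated numerically. The following is a minimal sketch, not a method taken from any of the studies cited above: it assumes paired measurements of background noise level and call level (both in dB) and fits an ordinary least-squares slope. The function name `lombard_slope` and the example values are hypothetical and chosen only for illustration. Changes in call duration or frequency are not covered by this sketch.

```python
def lombard_slope(noise_db, call_db):
    """Least-squares slope of call level against noise level (dB per dB)."""
    n = len(noise_db)
    if n != len(call_db) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(noise_db) / n
    mean_y = sum(call_db) / n
    # Sum of squared deviations of the noise levels.
    sxx = sum((x - mean_x) ** 2 for x in noise_db)
    if sxx == 0:
        raise ValueError("noise levels must vary to estimate a slope")
    # Sum of cross-deviations between noise level and call level.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(noise_db, call_db))
    return sxy / sxx


# Hypothetical values for illustration only (not data from any study).
noise_levels = [40.0, 50.0, 60.0, 70.0]
call_levels = [70.0, 74.0, 78.0, 82.0]
slope = lombard_slope(noise_levels, call_levels)
print(f"call level rises {slope:.2f} dB per dB of added noise")
```

Under these assumptions, a positive slope would be consistent with the signaler raising its call amplitude as noise increases, whereas a slope near zero would suggest no amplitude compensation.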

11.5.3 Geographic Variation and Dialects

Changes in the environment may lead to geographic variation, although geographic variation is not necessarily due to changes in the environment. This variation can eventually separate animals within a species into different populations. While this is occurring, geographic separation can lead to the formation of dialects. A dialect can evolve where a species is dispersing and acoustic contact among its members becomes limited (Slater 1986, 1989). As a result, individuals within a population may exhibit similar sounds to each other, but these sounds may be quite different in structure from those of other separated and more distant populations (Catchpole and Slater 2008; Gannon and Lawlor 1989). This results in within-species vocal variation.

Dialects are also known from biotremology studies. For example, the well-known southern green stink bug (Nezara viridula) has spread throughout the world (except for the Arctic and Antarctic) from its native Ethiopia in the past 100 years. Geographically isolated populations (e.g., California and Florida in the United States, the French Antilles, Australia, Japan, Slovenia, and France) have distinct differences in duration and repetition time of male and female signals. Individuals appear to be able to recognize adults from other populations but prefer to mate with those of their own dialect/population (Virant-Doberlet and Čokl 2004).

The study of population dialects offers a means to explore the causes and the functions of signal variation and change (Henry et al. 2015). Geographic variation in acoustic signals can reflect historical evolutionary changes within species. Not only can these signals be used to assess links between geographic variations and population connectivity, but they can be used to provide important information for the conservation of a species. For example, geographic variation in calls could indicate how birds disperse through a fragmented habitat, meaning the study of dialects can be used as a noninvasive tool to assess population connectivity (Kroodsma and Miller 1982; Amos et al. 2014).

The formation of dialects can occur through several mechanisms: as a side-effect or “epiphenomenon” of learning via incorporating copying errors (such as adding or omitting parts of the call), due to structural changes to call elements through drift, or as a possible indicator of the level of behavioral or genetic variation in a population (Baptista and Gaunt 1997; Catchpole and Slater 2008; Podos and Warren 2007; Keighley et al. 2017). Another mechanism that helps maintain variable acoustic dialects is social adaptation. Social adaptation refers to the ability to adjust behavior to a prevailing pattern in a population. Migrating birds, for example, learn calls quickly (Salinas-Melgoza and Wright 2012), which provides reproductive benefits because the calls are acoustically familiar to potential mates (Catchpole and Slater 2008; Farabaugh and Dooling 1996). In this way, newly arriving immigrants fit in quickly and do not introduce changes to the songs of the residents, thereby maintaining the local dialect.

Vocal dialects can act as precursors to genetic isolation (e.g., in coastal US chipmunks, genus Neotamias). Dialects can also be maintained over time if the populations are separated and have little acoustic contact. This separation can be reinforced by geographic boundaries, or other isolation mechanisms, that reduce breeding chances (Gannon and Lawlor 1989). Examples include the pika (Ochotona), grasshopper mice (Onychomys), white-crowned sparrows (Zonotrichia), prairie dogs (Cynomys), and bats (Myotis evotis), which have all been shown to exhibit dialects due to geographic variation. Several species of birds, such as the chaffinch (Fringilla coelebs), have been identified as having song dialects and therefore are described as having distinct “cultures” (Slater 1981). One of the most striking examples of cultural influences is the rapid spread of new humpback whale songs across the South Pacific basin. All male humpback whales within a population generally conform to the same song pattern, making it a cultural trait. These song types move eastward across the South Pacific basin in a series of cultural waves at a geographic scale unparalleled in the animal kingdom (Garland et al. 2011).

Behavioral repertoires are malleable; that is, they are affected by the environment, learning, and interactions within a population. Variants in signal characteristics are no exception (Brumm et al. 2009). Thus, signal characteristics can act as precursors to variants in other genetic characteristics, and eventually, speciation. Notably, O’Farrell et al. (2000) examined nearly 2500 calls from 43 sites in Hawaii and mainland United States for the hoary bat (Lasiurus cinereus; Fig. 11.10). They found some geographic variation within the calls, but the variation could not be explained by isolation (a distance of about 2300 miles (3800 km) between the vicinity of San Francisco, CA, USA, on the mainland and Honolulu, Oahu, Hawaii, USA). They were unable to exclude the effects of context, behavior, or in some cases low sample size. Bats of this species, regardless of where they were recorded, could be identified as L. cinereus. In other words, these bats were showing variations in call structure and behavior but had not yet evolved into different species.

Fig. 11.10

Hoary bat (Lasiurus cinereus). “Hoary bat” (https://www.flickr.com/photos/33247428@N08/48546621027) by Oregon State University is licensed under CC BY-SA 2.0; https://creativecommons.org/licenses/by-sa/2.0/

There are instances in which different species have evolved. Several studies in mammals have found that research into the geographic variation of acoustic signals is taxonomically important because it can reveal cryptic species. Chipmunks (Neotamias) occurring mostly along the US coasts of California, Oregon, and Washington were thought to be one species (Eutamias townsendii) with several subspecies. The species was characterized mostly by cranial and pelage features. It was not until localities throughout the range of the four subspecies within E. townsendii were sampled acoustically, and examined statistically, that variation of the calls was shown to be dramatic enough to warrant elevation to four distinct species. This elevation, originally based on acoustic data, was confirmed by genitalia and genetic information (Gannon and Lawlor 1989; Sutton and Nadler 1974; Sullivan et al. 2014).

11.6 Information Content or the Meaning of Signals

Vocal signals can be used to provide (a) static information about the species, including the size and shape of the vocal apparatus, or (b) dynamic information, that is, the motivational state of the sender. Vocal signals can be context-dependent, where the same call can mean different things in different situations, or context-independent, where the call has a specific meaning whatever the context. Species recognize one another from their vocalizations, and produce signals related to various situations, such as alarm calls in the presence of a predator, distress calls when separated from a parent, singing and chorusing to attract or deter conspecifics, or calls that reflect behavioral changes. The question then arises: how does the recipient know what the caller means in that situation? The answer is that, at least in birds and mammals, the receiver assesses call meaning by observing the sender and the context in which the signal is sent.

11.6.1 Static Information

The anatomy of the vocal apparatus in mammals determines features of its sounds, and these features correlate with the animal’s body size (Fitch 1997 in rhesus macaques, Macaca mulatta). Larger lungs can produce longer vocalizations. Vocal folds that are longer and thicker produce sounds at lower fundamental frequencies (for example, pika, Ochotona alpina; Volodin et al. 2018). A longer vocal tract concentrates the energy in the lower frequencies (Ey et al. 2007). Thus, correlations have been found between an animal’s vocal tract length, body mass, and formant dispersion (e.g., domestic dog, Canis lupus familiaris, Riede and Fitch 1999; southern elephant seals, Mirounga leonina, Sanvito et al. 2007).

As a result, information about the sender’s body size, sex, age, and sometimes rank can be acquired from their vocalizations. Sounds from small or young animals are typically higher in frequency than those of larger or older animals (see Riondato et al. 2021 for an exception). Sometimes this information is used by females selecting males. For example, the “roar” of the male red deer (Cervus elaphus) contains information on its sex and size. The larger the animal, the lower the frequency of the roar. Females choose mates based on their roar and have been found to prefer the roars of larger males (Charlton et al. 2007). The signaler’s dominance rank can also be signaled using size-related formants (e.g., male fallow deer, Dama dama, Vannoni and McElligott 2008; and baboons, Papio ursinus, Fischer et al. 2004). As the sender’s features do not change (e.g., their sex), or change slowly over time (e.g., their size or age), this is known as static information.
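The relationship between vocal tract length and formant spacing can be illustrated with a short calculation. The following is a minimal sketch, assuming the vocal tract behaves as a uniform tube closed at one end, so that formants are evenly spaced and the formant dispersion equals c/(2L), where c is the speed of sound and L is the vocal tract length. The value of 350 m/s for c, the function names, and the example formant values are assumptions made for illustration and are not taken from the studies cited above.

```python
# Assumed speed of sound in the warm, humid air of a vocal tract (m/s).
SPEED_OF_SOUND_M_S = 350.0


def formant_dispersion(formants_hz):
    """Mean spacing between successive formant frequencies, in Hz."""
    if len(formants_hz) < 2:
        raise ValueError("at least two formants are required")
    ordered = sorted(formants_hz)
    # The mean of successive differences reduces to (last - first) / (N - 1).
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def estimated_vocal_tract_length(formants_hz, c=SPEED_OF_SOUND_M_S):
    """Tube-model estimate of vocal tract length in metres: L = c / (2 * dispersion)."""
    return c / (2.0 * formant_dispersion(formants_hz))


# Hypothetical formant values for two callers (illustration only).
small_caller = [500.0, 1500.0, 2500.0]
large_caller = [250.0, 750.0, 1250.0]
for label, formants in (("small", small_caller), ("large", large_caller)):
    length_m = estimated_vocal_tract_length(formants)
    print(f"{label}: dispersion {formant_dispersion(formants):.0f} Hz, "
          f"estimated tract length {length_m:.3f} m")
```

Under this simplified model, halving the formant dispersion doubles the estimated vocal tract length, which is the direction of the relationship described in the text: longer vocal tracts produce more closely spaced, lower formants.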

11.6.2 Dynamic Information

A second type of information is known as dynamic. This information relates to the sender’s motivation or arousal. Dynamic, or context-dependent, calls follow a motivational code (Morton 1977). A loud or long sound, for example, is associated with the signaler experiencing high arousal that may be due to aggression, fear, frustration, distress, or pain. Signalers in hostile contexts tend to emit longer, lower-frequency “harsh” (broadband) sounds, which can signify signaler size. These sounds function to mediate aggressive interactions between the signaler and the receiver. High-frequency tonal sounds that mimic infant sounds are more likely to be emitted in appeasing (fearful) contexts, given they potentially have an “appeasing” effect on the receiver. Distress calls (often “scream” or “whistle-like” vocalizations) are used when “fear” and “aggression” are conflicting motivations. A short quiet signal is often associated with pleasure, close contact between animals that like each other (such as mother to young), or between social partners when close (Morton 1977).

Affiliative calls can indicate a welcoming, or “I am fond of you” context. For example, familiar elephants meeting each other after a long separation may trumpet for pleasure/joy (a high state of arousal). They also murmur to a friend, infant, or person they like who has been close, indicating a low level of arousal but a similar emotion (Kiley-Worthington 2017).

Aggressive calls include territorial calls and calls used as threats, and like affiliative calls, the agonistic call structure can change because of arousal. A highly aroused bull (Bos taurus), for example, will give visual signals (pawing, lowering his head, withdrawing his chin, and rubbing his horns in the earth) at the same time as roaring. At the highest level of threat, the roar has a vocalized inspiration as well as a vocal expiration, known as a “see saw” call (Kiley 1972).

11.6.3 Context-Dependent Meanings

Context-dependent communication is where the same signal may be used in different contexts but has different meanings. For example, a male eastern kingbird (Tyrannus tyrannus) emits a “kitter” call in three different contexts: (1) when the bird is indecisive or concerned about attempting to approach some object (to perch, mate, or toward another bird), (2) when lone males fly from perch to perch in a new delimited territory, or (3) as an appeasement signal by the male when approaching his mate. Another example is the familiar roar of a lion (Panthera leo) that, from the viewpoint of a human, is a spectacular vocal display during aggressive interactions. However, the call also helps individuals belonging to the same pride find, and identify, each other and can serve as a bonding signal for members of a pride to gather. It can also separate neighboring prides.

Affiliative calls can also be food calls (Kondo and Watanabe 2009). Food calls can be context-dependent, given these signals are directed at other conspecifics and can indicate the presence of food. The variation in these food calls can indicate food quality and quantity. For example, spider monkeys (genus Ateles) are known to produce a higher call rate in response to greater quantities and quality of food. Acoustic signals can attract group members to food locations, and these calls can also be used to protect the food resource from others (Clay et al. 2012). These authors examined food-associated calls made by some birds and mammals (see page 326, Table 11.1 in Clay et al. 2012) and found that most species did not produce unique calls for different foods. More commonly, signalers varied their calling rate to advertise food quality or abundance.

Table 11.1 The variety of situations that give rise to the major call types of Bos taurus (reproduced from Kiley 1969)

Therefore, context-dependent vocalizations may not necessarily convey information about the type of situation but can act as an analogue system to inform the recipient about the general level of arousal of the sender, and consequently, how (or if) to respond. In some species, calls are graded, meaning that there are intermediates between one call and another. Humpback whales, for example, use a repertoire of graded signals, and the use of these signals is likely related to the motivation and arousal of the signaler (Dunlop 2017). “Grumbles” and “snorts” are used by a female and her calf while migrating by themselves, presumably in a low-arousal context. Female–calf pairs can be joined by male escorts and form a competitive group, where males are fighting for access to a breeding female. In these groups, where arousal level is much higher, “grumbles” turn into harsh-sounding “roars” and “purrs,” and become more modulated to sound more like “groans” and “moans.”

Different levels of graded calls can be given in one situation. For example, cattle may give a low “mmmmm” call when in close contact with other cattle. When the animal opens its mouth, the sound gains an added syllable, “en,” giving “mmen.” When the animal is sufficiently aroused, a “hh” syllable is added, which is the result of letting the remaining air out of its respiratory tract. The call can change even further with higher excitement or arousal by being repeated. Finally, at the highest level of arousal, the inspiratory phase of the call is also vocalized (Table 11.1). This is a very different type of auditory communication from context-independent calls such as human language, where auditory communication can reflect environmental contexts, come from some thought or idea generated by cognition, or both.

11.6.4 Species Recognition

To assess whether a call maintains the same structure (and can therefore be recognized as having the same message), researchers use a number of measures, including call interval, maximum frequency, minimum frequency, fundamental or predominant frequency, call length, duration, amplitude or loudness, and repetition rate, all of which are found in both acoustic and vibrational signals. These characteristics, combined with the presence of harmonics, form patterns that are often characteristic of a species or individual. As a result, other animals are likely to be able to identify individuals from their calls, as we can with human voices. For example, many species of vespertilionid bat can be identified by time and frequency characters measured from their echolocation calls (Gannon et al. 2003). Individual recognition is also evident in bats. Playback responses in common vampire bats (Desmodus rotundus) suggested they vocally recognized individual bats, given they were biased toward callers that had fed them more (food sharing), but not biased toward kin (Carter and Wilkinson 2016). Crickets (Teleogryllus spp.) can be differentiated based on the amplitude and repetition of their call, not just their call “note” (that is, the fundamental). The mean frequency of this signal is approximately 4 kHz, but the pattern and call rate increase as the cricket’s motivation changes from “calling” to “encountering” to “fighting” to “courtship” and finally “copulating.”
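Several of the measures listed above (call duration, call interval, minimum and maximum frequency, and repetition rate) can be computed directly once individual calls have been detected in a recording. The following is a minimal sketch under stated assumptions: each call has already been reduced to a start time, an end time, and its minimum and maximum frequency. The `Call` record, the `summarize_calls` function, and the example values are hypothetical and serve only to show how the measures relate to one another.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One detected call (hypothetical record used for illustration)."""
    start_s: float
    end_s: float
    min_freq_hz: float
    max_freq_hz: float


def summarize_calls(calls):
    """Return basic temporal and spectral measures for a sequence of calls."""
    if not calls:
        raise ValueError("at least one call is required")
    ordered = sorted(calls, key=lambda c: c.start_s)
    durations = [c.end_s - c.start_s for c in ordered]
    # Call interval: silent gap between the end of one call and the start of the next.
    intervals = [b.start_s - a.end_s for a, b in zip(ordered, ordered[1:])]
    span_s = ordered[-1].end_s - ordered[0].start_s
    return {
        "n_calls": len(ordered),
        "mean_duration_s": sum(durations) / len(durations),
        "mean_interval_s": sum(intervals) / len(intervals) if intervals else None,
        "min_freq_hz": min(c.min_freq_hz for c in ordered),
        "max_freq_hz": max(c.max_freq_hz for c in ordered),
        # Repetition rate: calls per second over the whole calling bout.
        "repetition_rate_hz": len(ordered) / span_s if span_s > 0 else None,
    }


# Hypothetical calls centered near 4 kHz (illustration only).
example = [
    Call(0.0, 0.25, 3800.0, 4200.0),
    Call(0.5, 0.75, 3900.0, 4100.0),
    Call(1.0, 1.25, 3850.0, 4150.0),
]
for name, value in summarize_calls(example).items():
    print(name, value)
```

A comparison of such summaries across recordings, for example between motivational states of the same individual, is one simple way to check whether call rate changes while the frequency band stays roughly constant.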

11.6.5 Context-Independent Meanings

Some calls in animals, like human language, have a specific meaning, whatever the context. These calls often include alarm calls used to alert a group to the danger of an approaching predator, territorial invader, or other “alarm” in the caller’s environment. The alarm call may elicit a response by recipients to retreat, freeze in place, or conduct defensive behavior. Slobodchikoff et al. (2009) discussed the complexity of alarm calls in prairie dogs (Cynomys gunnisoni) in the southwestern United States. He and his students have found that prairie dogs are precise in their signaling and can communicate a description of the predator, its size, its speed, and even its color. Wild boars (Sus scrofa), which use context-dependent calls such as “grunts” and “screams” whose meanings relate to the context, also emit a specific “warning bark,” a context-independent short sharp call that serves as an alarm call and is difficult to locate (Kiley 1972). This alarm call works to conceal the position of the signaler but conveys that a disturbing object has been sighted.

The importance of altruism (or lack of it) when vocalizing has been investigated within the context of emitting alarm calls and food calls. For example, studies have shown that even those calls that are difficult to locate (ventriloquial calls) will increase the chances of being detected by a predator (Fig. 11.11). However, studies on kinship and altruism have yet to relate the ease of locating an alarm call by a predator to the rate of vocalizations and to actual predation (Reznikova 2019). Still, it seems that coterie members of prairie dogs (Cynomys ludovicianus) alert others to the presence of potential predators using alarm calls, and that these alarms significantly reduce predation (Wilson-Henjum et al. 2019).

Fig. 11.11

Young prairie dogs (Cynomys ludovicianus) at Rocky Mountain Arsenal National Wildlife Refuge, Commerce, CO, USA. One pup giving a yipping call. US Fish and Wildlife Service Photo Credit: Rich Keen at RMA; https://commons.wikimedia.org/wiki/File:Yipping_Prairie_Dog_Pups.jpg. Licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/

Functionally referential signals are those that provide very specific information. They are structurally distinct and reflect a stimulus-specific meaning used only in a very specific set of circumstances. Most alarm calls are nonspecific, but the vervet monkey (Chlorocebus pygerythrus) uses a lexicon of four or five sounds to identify the type of intruder. When a major bird or mammal predator is nearby, the vervet produces a “chirp” and “bark” (Struhsaker 1966). A nearby snake evokes a special “chutter” call, a minor bird or mammalian predator is indicated by an abrupt “uh” or “nyow” sounding signal, and a major bird predator elicits a “rraup.”

Distress calls can be context-independent, such as the calls used by young to attract adults to their location. African wild dog (Lycaon pictus) pups, for example, emit a “lamenting call” when they are deserted by their parents. Precocial birds, such as domestic fowl, ducks, or geese, “pipe” in the same way as when they are cold or hungry. Young collared lemmings (Dicrostonyx groenlandicus) emit ultrasonic chirps when they are abandoned, cold, or feel as if they are in danger (Sales and Pye 1974). Young primates, including humans, shriek or scream when threatened or abandoned.

11.6.6 Songs

Songs are composed of call notes that have been elaborated in structure and length. The main function of song is to identify the singer as a member of a species, sexually mature, on a territory, prone to territorial defense, and ready for courtship. Song refers to the melodic quality (with harmonics) of songs, as opposed to broadband “noise,” and bird song is often analyzed into themes and phrases, where researchers try to interpret the meaning or function of the different phrases. Marler and Tamura (1964) and Marler and Doupe (2000) believe that certain parts of the song contain certain types of information and that birds decode the songs. Emlen (1972) experimentally modified the songs of male indigo buntings (Passerina cyanea), and based on responses to playbacks, could identify the meaning of certain elements in the song (Fig. 11.12).

Fig. 11.12

Male indigo bunting (Passerina cyanea) produces a song where certain elements of the song provide meaning to the listener. Photo “IndigoBuntingonPlant.jpg” by Kevin Bolton; https://wordpress.org/openverse/image/15bcd71f-0728-4bda-8122-38fcf4a82ce6/. Licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/

The male humpback whale is a well-known marine singer. Males within each population of whales sing the same song, but each population of whales has its own unique song (rather like a dialect), which can sound different from the song in other populations. Within each population, the song structure changes gradually over the mating season and between years. A call unit can drop out of the repertoire, be replaced with another unit, or units can be added. These changes are known as song evolutions, as the song structure evolves gradually within a population over time. Songs can also completely change from one year to the next, which is known as a song revolution. This is thought to be due to the influx of males from a different population, carrying with them their own song. Males from the original population then pick up and learn this new song, causing the song within that population to completely change (Noad et al. 2000).

A duet is an exchange of sounds or substrate-borne vibrations between a pair of animals, often produced in rapid succession (Fig. 11.13). The duet may be so rapid that it is difficult to distinguish which animal is producing the various parts. It functions as a contact-maintaining signal, and individual mated pairs within a species can develop their own unique duet, helping them to maintain contact with their partner. Duets are especially common in frogs, birds (cranes, sea eagles, geese, quail, grebes, woodpeckers, barbets, megapode scrub hens, kingfishers, ravens, cuckoo-shrikes, and honey-eaters), tree shrews (mammalian order Scandentia), and siamang (Symphalangus syndactylus), as well as being common in major groups of insects that communicate via substrate-borne vibrations. Species that perform duets often are monogamous (such as siamangs), and the two sexes resemble each other in appearance (that is, they are not dimorphic).

Fig. 11.13

A duet of ravens (Corvus corax). Photo “Ravens’ Duet” by Ron Mead; https://www.flickr.com/photos/14093853@N04/2678807340 . Licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/

Duets are used when mated pairs are required to remain in touch over long periods of time. Duetting can be especially important in environments such as dense vegetation, where birds cannot see each other. By duetting, pairs stay close to each other and in synchrony, so that when conditions in a variable environment become right, mating can be achieved quickly and efficiently. In most gibbon species (family Hylobatidae), males, and some females, sing solos that function to attract mates and advertise their territory. If a male and female like one another’s song, they will find each other and conduct a short mating dance followed by a long, vigorous mating ritual. The song dialect is used to identify the singing gibbon’s species and the area it is from. Therefore, duetting also reduces hybridization with closely related species (Mitani and Marler 1989).

11.6.7 From Chorusing to Copulation

Males that chorus (e.g., frogs, toads, and insects such as locusts (order Orthoptera) and cicadas (order Hemiptera)) attract females to a localized area. A classic example is the periodical cicadas (Magicicada spp.). Millions of 17-year-cycle cicadas gather to mate in forests in the eastern United States. Males aggregate into chorus centers and attract mates by producing high-intensity sounds (Fig. 11.13). The desert locust (Schistocerca gregaria) forms some of the most intense swarms (Fig. 11.14), and can be found in countries such as Kenya, Somalia, India, and Saudi Arabia. Their loud chorusing is a means of sexual advertisement. BBC News reported on the “biblical locust plagues of 2020”, when these insects swarmed in large numbers in East Africa (BBC News 2020).

Fig. 11.14

Desert locusts (Acrididae) emerge and go into flight en masse. Photo “Locust” by [nivs]; https://www.flickr.com/photos/42805979@N00/34263361. Licensed under CC BY-SA 2.0; https://creativecommons.org/licenses/by-sa/2.0/

The gecko Ptenopus garrulus produces loud continuous chirruping during a dusk chorus (Walker 1998). These calls strengthen social bonding during sexual and courtship activities and are often produced together with visual and tactile behaviors.

Leks are an example of a more spatially contained event, used by male sage grouse (Centrocercus urophasianus) to attract mates acoustically and visually. Male sage grouse form large courtship leks in a social arena, where they produce elaborate visual displays with their gular pouches, accompanied by sounds of “swish-swish-coo-oo-poink” (Fig. 11.15; Bush et al. 2010). This study (p. 343) found that despite lekking behavior, male–male competition was spread out spatially and females often covered the entire social arena before copulating. Leks are also increasingly being recognized in invertebrates that communicate through substrate-borne vibrations, such as the prairie mole cricket (Gryllotalpa major). In this species, a male stridulates from inside a burrow he constructs in the soil, producing an airborne (sound) component that signals to flying females as a sexual advertisement. The same stridulation event has a substrate-borne component (vibration) that is used by nearby males to aid in spacing their burrows (Hill 1999).

Fig. 11.15

Male Greater Sage-Grouse (Centrocercus urophasianus) by USFWS Pacific Southwest Region; https://www.flickr.com/photos/54430347@N04/6928668188. Licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/

After mate attraction comes copulation. Ovulation in female alpacas (Vicugna pacos) is thought to be stimulated during copulation, when the male produces a loud “orrgle” for 30 to 45 minutes while mounting the female (Abba et al. 2013). Calling may continue even after copulation; for example, the tree frog Phyllomedusa (Hylidae) gives a separate call after oviposition.

11.7 Comparing Human Language to Nonhuman Auditory Communication

Given the phenomenal array of auditory communication across species, what are the defining characteristics of human language? Human language involves the use of vocal sounds that are symbolic of meanings, and therefore context independent. Thus, human language can be understood in the total absence of the communicator, such as when written, or when heard on the telephone.

There is a vast literature on human language, and a whole field of study: linguistics. Many scientists believe that the development of human language was the most important evolutionary step in distinguishing humans biologically. It is also widely maintained that development of human language was responsible for the further cognitive development of humans. Interestingly, nonhumans respond to general sounds and emotions in human language. More recent work has shown that some primates, dogs, marine mammals, horses, and elephants comprehend individual words and phrases. In fact, with experience, they understand a great deal more human language than we previously assumed (e.g., de Waal 2016; Kiley-Worthington 2017). Young human or nonhuman mammals do not only learn the meaning of words by conditioning as the behaviorists believed (Skinner 1957), but they also learn by observing others, imitation, and learning about cause and effect.

One of the first experiments to test whether nonhumans could learn to speak a human language was the Kelloggs’ study (Kellogg and Kellogg 1933). This family raised a young chimpanzee (Pan troglodytes) with their son and treated her similarly. At the end of several years, although their son was talking, the chimp had great difficulty making human sounds, and managed only “mama.” The conclusion was that the chimp’s inability to learn language implied that chimps have lower intelligence than humans. However, it was later discovered that the reason for her difficulty in making speech sounds was not cognitive but physiological. She did not have the necessary muscles to control the sophisticated movements of the tongue, larynx, and buccal and nasal cavities needed to make the different sounds (Lyn 2012). More recently, Fitch (2011) has argued that humans have what he called a “language ready brain.” However, Savage-Rumbaugh et al. (2009) argue strongly that human language may not be any more sophisticated than ape languages. This is supported by the recognition of the many mental homologies between humans and other mammals (e.g., Kiley-Worthington 2017).

Since the middle of the twentieth century, the distinguishing features found in human language have been widely discussed, and the synopsis developed by Hockett (1960) is still widely adhered to. The first question is to what degree these defining features are found in other species (Table 11.2).

Table 11.2 Design features of human language and whether they have been recorded in other species. The species listed here are only examples, since there are others for which better evidence exists

This list has been elaborated, extended, and modified, to include tactile, visual, taste, and olfactory communication (e.g., Christin 1999). The vocal repertoire of many species has been shown to fulfill most of these characteristics, and a list of some of the most pertinent studies is given here (e.g., Fitch 2011; Herman et al. 1984; Schusterman and Kastak 1998; Nehaniv and Dautenhahn 2002; Rendell and Whitehead 2001; Christiansen and Kirby 2003).

To simplify the differences between human spoken language and the communication of other species, two human specializations can be identified. The first is that human spoken language, unlike the auditory communication of many (although not all) other species, is mainly (but not exclusively) context independent. That is, the same word means the same thing in any context. Humans have developed this characteristic much further than other species, and as a result, the meaning of what they are saying can be assessed whatever the situation, whether it be on the telephone, read, or written. However, it is true that many words can have multiple meanings or are used in specific contexts. Furthermore, using the same word in different communication contexts can change its meaning. Meanwhile, primate alarm calls seem to share many features of words. The other important characteristic is that human language is highly symbolic. Again, this is not a unique characteristic of human language. For example, movements such as a horse swishing his tail, which may mean he will kick you, and ritualized displays, such as the courtship preening of Mandarin ducks (Aix galericulata; Fig. 11.16), are also highly symbolic. However, humans have taken symbolism further so that symbols can be built on top of each other. For example, one dog can be seen to be a dog and only one, but it can also be represented by a 1. Another 1 can be added, and the result is represented as 2. This led to the emergence of mathematics, and to further symbolic links in formulae, culminating in our explanations of gravity, electricity, and other phenomena in the world.

Fig. 11.16

Mandarin ducks (Aix galericulata) perform a specialized courtship routine. The males shake and bob their heads, and mock drinking and preening, while raising their crest and orange sail feathers to “show off.” They also incorporate sound into their courtship in the form of a whistling call. “Mandarin duck” by Tambako the Jaguar; https://www.flickr.com/photos/8070463@N03/853400195. Licensed under CC BY-ND 2.0; https://creativecommons.org/licenses/by-nd/2.0/

Some research has concentrated on teaching apes and marine mammals to develop and use a language that has features characteristic of human language. This includes teaching chimpanzees sign languages and, more recently, the use of computer symbols. Washoe, one of the first chimps in these studies, was taught American Sign Language and eventually managed to combine symbols to produce new meanings. For example, when asked what a duck swimming in the water was, she signed that it was a “water bird” (Gardner and Gardner 1984). Gluck (2016), in his account of grappling with central philosophical problems in animal ethics, recollects one of his weekly lab meetings (he was part of a research lab known for numerous breakthroughs in psychology and animal behavior) where the graduate students would discuss their research and topics of the day; signing chimps was a hot topic at the time. He noted that one of the students, a bit of a maverick, inquired whether the chimp ever asked “Can I go home now?” or “Can I leave?” Gluck and the other students dismissed this as foolhardy and would spend the next two decades exploring how primate models could inform human biomedical and behavioral science. But that is still the question of our time. If a captive animal could, would it ask to be released? Would it ask “Why are you doing this to me?” These animal-intensive tests came under extreme criticism from other scientists (Terrace 1985). Since then, a gorilla, bonobos (Pan paniscus), and other chimps have learned to use computer symbols as a human-type language (Hopkins and Savage-Rumbaugh 1991). Kenneally explored the origin of the first word and speculated on which great apes might have been capable of speaking it. Among other things, she said that such a speaker would have to have the anatomical and physiological capacity for speech, but would also have to have something to say. In her view, this probably eliminated chimps, which she thought were immature and lacking in focus, rather than cognitively limited (Kenneally 2007).

Thomas Nagel’s (1974) thought-provoking question “What is it like to be a bat?” argues that humans might imagine what it is like to be another being but can never know the conscious mental state of being that species, or even another human. We can look at systems, patterns, and responses, but each species and every human retains its own secrets and has its own experiences. That does not mean we should not try to understand nonhuman auditory and vibrational communication signals. These different world views, or knowledge of the world, lead us to a study of the epistemology of different species. Let us hope that we begin seriously to investigate this before it is too late and many species have become extinct due to our actions, most of which are the consequences of human language.

11.8 Summary

With modern technological aids and further studies, the study of acoustic and substrate-borne vibrational communication has advanced considerably since Busnel’s (1963) seminal work. The origins of acoustic communication are likely to be from sounds associated with moving about in the environment and breathing in and out through respiratory passages. These sounds have become specialized for communication. Likewise, as animals move, regardless of how quietly, the motions lead to vibrations through the substrate that can be detected by others of the same or different species. Responses to these vibrations by others are reinforced or are lethal to the receiver, but likely also inform the sender. The first step is for the sounds or vibrations to become ritualized, leading to displays. The development of the necessary sending and receiving structures, such as the larynx or the insect tymbal, and a sensory apparatus such as the ear or subgenual organ, facilitated the evolution of an extremely diverse range of auditory and vibratory signals and cues, of which only some are described here.

Auditory and vibratory communication each has advantages and disadvantages. Though a signal can travel through substrates, meaning the signaler does not have to be in visual range, it can be overheard by others. Atmospheric conditions can influence the signal and other sounds/vibrations can mask it. Geographic separation of animals within a population can cause auditory and vibrational signals to evolve over time into different dialects and cultural waves. This variation can eventually separate animals within a species into different populations. One thing that is becoming increasingly clear is that there is not much time to uncover more about the complexities of auditory and substrate-borne vibrational communication in nonhumans before the behavior of our species, as human language users, has led to the extinction of many species.