10.1 Introduction

Audiometric studies, using behavioral or physiological methods, describe and quantify the hearing capabilities of animals. Audiometric studies using behavioral methods test hearing directly, by requiring an animal to make an observable response when it hears a target sound. The required response can be a natural, untrained response to sound, or the response can be one the animal is trained to make using classical or operant conditioning procedures. Physiological audiometric data, which do not require training, are more easily obtained than are behavioral data based on conditioning procedures. However, physiological methods can assess the perceptual process of hearing only indirectly. If it is shown that an animal’s auditory system is capable of responding to sounds, the ability to hear may be inferred but is not guaranteed. For this reason, behavioral methods are considered the “gold standard” for audiometric assessment.

Animals hear sounds across a range of frequencies, and their sensitivity to audible sounds varies with frequency. By employing behavioral or physiological methods, researchers can determine the range of sound frequencies that animals hear, the amount of energy needed for the detection of sounds at each frequency, and the particular sound frequencies to which animals are most sensitive. Determining what sounds animals hear provides information about their acoustic environment and insight into the evolution of hearing among taxa. For example, toothed whales, microchiropteran bats, some shrews, and oil birds have evolved hearing abilities adapted for echolocation (see Chap. 12 on echolocation and the taxon-specific chapters in upcoming Volume 2), and some insect and fish prey have evolved keen hearing to detect their echolocating predators. Sounds to which animals are most sensitive are the ones most relevant to intraspecies communication and survival (because they provide information about mating partners or about predators and other sources of danger) and therefore are of particular interest.

In addition to providing information about normal hearing capabilities of animals, audiometric studies can show how hearing changes as a function of aging, environmental challenges, and experimental manipulations. Like humans, animals can experience presbycusis (i.e., loss of hearing with age; Willott 1991; McFadden et al. 1997) and they can develop hearing loss if exposed to ototoxic drugs, such as aminoglycoside antibiotics or platinum-based anti-cancer medications (Henderson et al. 1999). Hearing loss in wildlife due to noise exposure is of increasing concern because of widespread noise sources associated with anthropogenic activities in the ocean and on land (see Chap. 13 on the effects of noise). Audiometric studies of animals can also contribute to the understanding and treatment of human hearing and hearing disorders. For example, the study of the genetic and biological bases of hearing disorders often involves audiometric testing of animals with induced genetic conditions (e.g., knockin and knockout mice in which an existing gene is replaced or disrupted with an artificial piece of DNA, thereby altering or eliminating its function), and pharmacological influences on human hearing are investigated in laboratory animals.

Audiometric studies have been conducted on many aquatic and terrestrial species, with the choice of species guided by availability and the particular questions (biological, medical, or evolutionary) that the experimenter poses. Hearing abilities have been studied extensively in traditional laboratory mammals (Fig. 10.1) including the house mouse (Mus musculus), chinchilla (Chinchilla lanigera), Mongolian gerbil (Meriones unguiculatus), guinea pig (Cavia porcellus), and laboratory rat (Rattus norvegicus). These species are easy to obtain, easily bred in the laboratory, and readily trained in conditioning procedures, and so have long served as models for both normal and impaired human hearing. Audiometric studies have been conducted with many non-mammal species, including insects, amphibians, reptiles, fishes, and birds (see Volume 2). Many species are challenging to obtain, to house, and to train in a laboratory environment. For these reasons, behavioral audiograms are sometimes based on data from only one or very few animals, which limits the generalizability of the results. Further, hearing in some species is estimated by phonotaxis and evoked calling methods, which do not require training but which likely underestimate the animals’ true hearing sensitivity. Understanding the auditory capabilities of non-traditional species provides insight into how hearing has become adapted to the challenges that animals face in a variety of natural environments. Unfortunately, for the vast majority of species, and even major taxa, there are no audiometric data available.

Fig. 10.1

Left: Behavioral audiograms of rodents commonly used as laboratory animal models for hearing. Tones were presented through loudspeakers, and the animals’ conditioned responses measured. All of the audiograms are U-shaped, with frequencies of best sensitivity (tip of the audiogram, at the lowest sound pressure level) within the range of 4–16 kHz. These species differ considerably in the low-frequency limit of hearing, with the chinchilla being more sensitive to a broader range of low frequencies than the domestic mouse. Plots are averaged thresholds based on 50% correct detection. Data were collected by Heffner and Heffner (1991, from three chinchillas); Koay et al. (2002, from two domestic mice); Heffner et al. (1994, from four Norway rats); and Heffner et al. (1971, from four Mongolian gerbils). Right: The photo of a mouse participating in a behavioral hearing test is courtesy of Micheal Dent, University at Buffalo, The State University of New York (Screven and Dent 2019)

10.2 What Is an Audiogram?

An audiogram is a graph of hearing threshold as a function of frequency (ANSI/ASA S3.20-2015; ISO 18405:2017). Frequency refers to the rate of sinusoidal vibration of a pure tone (sine wave), expressed in cycles/s. The hearing threshold of a listener is defined as the minimum stimulus level that evokes an auditory sensation in a specified fraction of trials at a given frequency. On an audiogram (Fig. 10.1), low threshold values correspond to high sensitivity to sound at that frequency and vice versa. The stimulus level is often a root-mean-square sound pressure level (SPL) expressed in dB with a reference of 20 μPa when testing in air or 1 μPa when testing under water; see Chap. 4, Introduction to Acoustics. The stimulus level may also be a root-mean-square sound particle velocity level (e.g., in the case of some fish audiograms) specified in dB re 1 nm/s. Because audiograms may be measured with signals other than pure tones (e.g., tone pips or clicks), signal type, threshold level, and reference value should be reported, along with the measured ambient noise levels. If the ambient noise is negligible, the hearing threshold is referred to as an unmasked threshold. If the ambient noise is high enough to raise the hearing threshold above its unmasked level, the hearing threshold is called a masked threshold (ISO 18405:2017).
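As a minimal illustration of the level definition above, the following Python sketch converts a root-mean-square pressure to a sound pressure level using the in-air and underwater reference pressures given in the text. The function name `spl_db` and the example pressure are illustrative assumptions and are not taken from the cited standards.

```python
import math

# Reference pressures named in the text, in pascals.
P_REF_AIR = 20e-6    # 20 uPa, used for measurements in air
P_REF_WATER = 1e-6   # 1 uPa, used for measurements under water


def spl_db(p_rms_pa, p_ref_pa):
    """Return the sound pressure level in dB for an rms pressure in Pa."""
    if p_rms_pa <= 0 or p_ref_pa <= 0:
        raise ValueError("pressures must be positive")
    return 20.0 * math.log10(p_rms_pa / p_ref_pa)


# The same physical pressure yields different numbers under the two references.
p = 0.002  # hypothetical rms pressure of 2 mPa (assumed for illustration)
level_air = spl_db(p, P_REF_AIR)      # dB re 20 uPa
level_water = spl_db(p, P_REF_WATER)  # dB re 1 uPa
print(round(level_air, 1), round(level_water, 1))
```

For any given pressure, the two values differ by 20·log10(20), or about 26 dB, which is one reason thresholds reported in air and under water cannot be compared without attending to the reference value.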

There are two general approaches to assessing the auditory thresholds of live animals: behavioral and physiological. The behavioral hearing threshold is the lowest level that evokes a behaviorally measurable auditory sensation in a specified fraction of trials (ISO 18405:2017). The pure-tone behavioral hearing threshold measurement procedure (prescribed in ANSI/ASA S3.21-2004) recommends that the behavioral hearing threshold be defined as the lowest input level at which responses occur in at least 50% of a series of ascending trials (i.e., trials in which signal level is systematically increased). The behavioral hearing threshold provides an integrated, whole-organism response to signal detection.
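The 50% criterion described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure prescribed by the standard: the function name, the data layout (responses pooled by stimulus level), and the example data are all hypothetical.

```python
def lowest_level_meeting_criterion(responses_by_level, criterion=0.5):
    """Return the lowest level (dB) whose response proportion meets the criterion.

    responses_by_level maps a stimulus level in dB to a list of booleans,
    one per trial (True = the animal responded). Returns None if no tested
    level meets the criterion.
    """
    for level in sorted(responses_by_level):
        trials = responses_by_level[level]
        if trials and sum(trials) / len(trials) >= criterion:
            return level
    return None


# Hypothetical data: responses pooled over a series of ascending runs.
example = {
    10: [False, False, False, False],
    15: [False, True, False, False],
    20: [True, False, True, False],
    25: [True, True, True, False],
    30: [True, True, True, True],
}
threshold = lowest_level_meeting_criterion(example)
print(threshold)
```

Because the result can only be one of the tested levels, the step size between levels limits the resolution of the threshold estimate in this sketch.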

An electrophysiological hearing threshold is the lowest level that evokes a detectable and reproducible electrophysiological response (ISO 18405:2017). Both the ambient noise and the background electrophysiological noise levels should be reported. Electrophysiological noise is the non-acoustic self-noise arising from myogenic and neurogenic sources plus any artifact due to non-biological electrical interference. Electrophysiological hearing threshold estimates can be determined from different physiological processes (e.g., microphonic potentials, auditory brainstem response, cortical evoked responses), which characterize auditory processing at different levels of the auditory system. Various threshold estimation procedures also exist; each carries with it associated errors and assumptions, so the method for threshold estimation should be specified.

Electrophysiological methods are not equivalent to behavioral procedures, and electrophysiological hearing thresholds can differ from behavioral hearing thresholds (even for the same test animal). Within each of these two approaches, several methods can be employed, depending on the species being tested and the goals of the researcher. Behavioral techniques can be based on either unconditioned responses that the animal makes spontaneously and as part of its natural repertoire, or conditioned responses that the animal is trained to make. Common physiological techniques measure otoacoustic emissions (OAEs; i.e., sounds generated by outer hair cells in the inner ear and measured using a very sensitive microphone) and auditory evoked potentials (AEPs; i.e., summed electrical responses of hair cells and auditory neurons recorded from electrodes). Results from behavioral and AEP experiments in the same species or even in the same animal can produce audiograms that are similar in shape and frequency range but may differ in absolute thresholds (see Sect. 10.4.3).

Audiograms in most species are typically U-shaped, but not symmetrical (Fig. 10.1). The frequency region of best sensitivity encompasses those sound frequencies at the trough of the U-shaped curve, where thresholds are lowest. The animal’s best hearing sensitivity (or lowest threshold) corresponds to the threshold range at the frequency region of best sensitivity. The range of hearing specifies the sound frequencies that are audible to an animal at some specified level (e.g., 60 dB) above the lowest threshold. The range of hearing for sounds at high sound levels is wider than the range of hearing for sounds at low sound levels because the audiogram is broad and U-shaped. The range of hearing should be expressed as between X Hz and Y Hz at Z dB above the best hearing sensitivity. Unfortunately, many publications do not include the number of decibels above the best hearing sensitivity when reporting the range of hearing for an animal or species, and they may not indicate whether the highest and lowest frequencies shown in an audiogram reflect the limits of testing or the limits of the animal’s hearing capabilities.
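The recommended way of reporting the range of hearing (between X Hz and Y Hz at Z dB above the best hearing sensitivity) can be illustrated with a short Python sketch. It assumes that frequencies are sorted in ascending order and that crossings are estimated by linear interpolation on a log-frequency axis; both choices, the function name, and the example audiogram are assumptions made for illustration.

```python
import math


def hearing_range(freqs_hz, thresholds_db, db_above_best):
    """Estimate the low and high frequency limits at a cutoff level.

    The cutoff is the lowest threshold plus db_above_best. A limit is None
    when the audiogram never rises above the cutoff on that side, meaning
    the tested frequencies, not the animal's hearing, set the limit.
    """
    cutoff = min(thresholds_db) + db_above_best
    best = thresholds_db.index(min(thresholds_db))

    def crossing(i_out, i_in):
        # Interpolate between a point above the cutoff (i_out)
        # and a neighboring point at or below it (i_in).
        x0, x1 = math.log10(freqs_hz[i_out]), math.log10(freqs_hz[i_in])
        y0, y1 = thresholds_db[i_out], thresholds_db[i_in]
        frac = (y0 - cutoff) / (y0 - y1)
        return 10 ** (x0 + frac * (x1 - x0))

    low = high = None
    for i in range(best, 0, -1):              # walk toward low frequencies
        if thresholds_db[i - 1] > cutoff:
            low = crossing(i - 1, i)
            break
    for i in range(best, len(freqs_hz) - 1):  # walk toward high frequencies
        if thresholds_db[i + 1] > cutoff:
            high = crossing(i + 1, i)
            break
    return low, high


# Hypothetical audiogram (not data from any cited study).
freqs = [250, 1000, 4000, 16000, 64000]
thresholds = [70, 40, 10, 20, 80]
low, high = hearing_range(freqs, thresholds, db_above_best=40)
print(f"{low:.0f} Hz to {high:.0f} Hz at 40 dB above best sensitivity")
```

Returning None for an unreached limit keeps the distinction the text calls for between the limits of testing and the limits of the animal's hearing.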

In terrestrial mammals, the main contributors to the U-shape of the audiogram and the location of the frequency of best sensitivity are the acoustic properties of the auditory periphery: the pinnae, external auditory meatus, and middle ear (Tonndorf 1976; Hellström 1995). The pinna serves to funnel sounds into the external auditory meatus (i.e., the ear canal), with sounds from some directions being amplified and those from other directions being attenuated. The external auditory meatus is an acoustic resonator that boosts the amplitude of received frequencies at and near its resonant frequency. The resonant frequency of the ear canal is inversely proportional to its length, so animals with short ear canals, such as mice, have their best hearing sensitivity at high frequencies, whereas animals with long ear canals, such as elephants, have their best hearing sensitivity at low frequencies. The resonant characteristics of the external auditory meatus, coupled with the sound transfer properties of the middle ear, help determine the acoustic energy levels reaching the inner ear.
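The inverse relation between ear canal length and resonant frequency can be illustrated with a common idealization that is assumed here and not stated in the text: a uniform tube closed at one end by the eardrum, which resonates at roughly a quarter wavelength, f = c / (4L). Real ear canals are neither uniform nor rigid, so the sketch below gives only order-of-magnitude values. The canal lengths are hypothetical and are not measurements from any species.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s, approximate value in air near 20 degrees C


def quarter_wave_resonance_hz(canal_length_m, c=SPEED_OF_SOUND_AIR):
    """Resonant frequency of an idealized tube closed at one end."""
    if canal_length_m <= 0:
        raise ValueError("canal length must be positive")
    return c / (4.0 * canal_length_m)


# Hypothetical canal lengths, chosen only to show the inverse relation.
for length_mm in (5.0, 25.0, 100.0):
    f = quarter_wave_resonance_hz(length_mm / 1000.0)
    print(f"{length_mm:6.1f} mm -> {f / 1000.0:.2f} kHz")
```

Under this idealization, doubling the canal length halves the resonant frequency, consistent with the trend described above for short and long ear canals.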

Often, audiograms are incorrectly interpreted as illustrating hard thresholds to sounds, assuming that sounds at amplitudes just below the published audiogram are inaudible and sounds just above the audiogram are always audible. That is not the case. The faintest sound that an animal can hear depends on many factors, including stimulus characteristics (e.g., duration, repetition rate), environmental factors (e.g., ambient noise level, testing context such as anechoic chamber versus natural environment), and individual factors (e.g., health, response bias, attention, age). A given animal may show a loss of sensitivity due to aging, noise exposure, or exposure to ototoxic drugs, and even due to repeated or prolonged exposure to the stimulus during testing that leads to sensory adaptation and/or cognitive habituation. At high ambient noise levels or when additional sounds are present, an animal might lose the ability to hear a sound it previously heard in a quiet environment. This is because of masking, in which the presence of non-target sounds or noise decreases the detectability of the sound of interest.

Within a species, there can be significant individual differences in hearing sensitivity, which can reflect differences in attention to the task, age, health, and history of exposure to sounds, among other factors. Because there can be considerable variability among animals of a given species, it is important to test many animals when possible. Also, it is important to know when examining an audiogram whether the curve is based on a single animal or a group of animals.

Audiograms from three beluga whales (Delphinapterus leucas) are shown in Fig. 10.2. From this graph, it can be seen that testing was conducted in water because the dB reference is 1 μPa, rather than 20 μPa for sounds presented in air (as in Fig. 10.1). In belugas, hearing sensitivity increased from low frequencies around 250 Hz to the best frequency range around 30 kHz (threshold around 37 dB re 1 μPa), and then decreased toward higher frequencies up to 120 kHz; this results in a U-shaped hearing curve. The range of hearing at 60 dB above lowest threshold extends from about 1 kHz to about 110 kHz.

Fig. 10.2

Left: Underwater behavioral audiograms of three beluga whales obtained at two different times 10 years apart. Data were obtained using an ascending Method of Limits (described in Sect. 10.3.3). The whales were trained to leave a station when they heard a tone and swim to the trainer for a food reward. Thresholds were defined as the tone level at which the whales detected the signal 50% of the time. The red triangles show the mean audiogram from one male and one female beluga whale reported by White et al. (1978). The arrow shows the most sensitive frequency at 30 kHz. The blue circles show averaged data from the same male and female and an additional juvenile male, obtained by Awbrey et al. (1988). The gray squares show the ambient noise level in the test pool, which was close to the measured thresholds at 4 and 8 kHz, indicating that the whales’ actual thresholds at these frequencies were likely lower than indicated on this graph. The gray dashed line is 60 dB above the lowest threshold at 30 kHz, where the range of hearing was measured. Right: Photo of two beluga whales at Vancouver Aquarium

10.3 Behavioral Methods for Audiometric Studies on Live Animals

Behavioral approaches can be divided into two general types, unconditioned response techniques and conditioned response techniques. Unconditioned response techniques are based on behaviors that the animal naturally makes to sound and are readily employed in the animal’s natural habitat. Animals must be trained to make conditioned responses, and this training should be based on the species’ typical behavioral repertoire. Klump et al. (1995) provide a full discussion of different methods used to study hearing sensitivity in animals.

For both techniques, establishing stimulus control over an animal’s behavior is crucial. A pure tone is typically the test signal, although broadband clicks and noises of varying bandwidths can be used, depending on the research question. How signals are generated and presented must be carefully controlled and monitored. The sound may be delivered via a loudspeaker to animals that range freely, are confined to the experimental chamber, or are trained to hold station (e.g., at a bite plate or in a hoop), or it may be delivered via tubes, insert earphones, or headphones (Fig. 10.3). Stimuli can be presented using several different protocols, each of which has its own assumptions and limitations. Ambient noise can influence thresholds and so must also be controlled. Ambient noise can be minimized if the animal is tested in an anechoic chamber or a sound-attenuating chamber (Fig. 10.4). If animals are tested in their natural environments where ambient noise levels cannot be controlled, researchers must take periodic measurements of the amount of ambient noise present during hearing tests.

Fig. 10.3

Photos of a budgerigar (Melopsittacus undulatus) wearing headphones during a sound localization experiment (left; Welch and Dent 2011) and receiving a reward during a frequency discrimination experiment (right; Dent et al. 2000). Courtesy of Micheal Dent, University at Buffalo, The State University of New York

Fig. 10.4

A sound attenuating chamber set up for acoustic startle reflex (ASR) testing in small animals such as mice and rats. The animal is placed in a plastic tube or a wire restraining device on an accelerometer platform. Voltages produced by the movement of the animal on the platform are recorded and quantified. Typical ASR measures are peak amplitude and response latency

10.3.1 Behavioral Methods Using Unconditioned Behaviors

Preyer Reflex and Acoustic Startle Response

The Preyer reflex and the acoustic startle response (ASR) are behaviors triggered automatically by unexpected, high-amplitude sounds. These are reflexive responses to sound that require no training of the animal and thus are relatively easy to implement. On the other hand, animals can habituate to repeated presentations of high-amplitude sounds that best evoke these reflexes. Thus, sound-evoked reflexes can be useful as fast and easy screening tests for bracketing an animal’s hearing abilities but are not good measures for determining absolute thresholds of hearing.

The Preyer reflex has been described as an orientation or attentional reflex (Jero et al. 2001). In mammalian species that are able to move their pinnae, it involves a quick retraction of the ears, a rapid twitch of the ears, or a change in orientation of the pinnae toward the source of the sound. In species with immobile pinnae, turning of the head toward the sound source (which brings the source of the sound into the animal’s line of vision) is the measure of orientation. In some studies, a trained observer simply rates the Preyer reflex as present or absent. The reflex also can be monitored using a motion-tracking camera system and reflective markers attached to each of the animal’s pinnae, as described in a study using the guinea pig (Berger et al. 2013). The magnitude and latency of the Preyer reflex can then be determined by measuring pinnae displacement during sound presentation.

The ASR is a whole-body response to unexpected sounds presented at very high amplitudes (typically above 90 dB re 20 μPa) and has been interpreted as a protective or alarm reflex. It can be elicited in a wide range of adult and developing vertebrates, including fishes and most mammals, and typically is quantified in terms of response amplitude and response latency. In teleost fish, the ASR is called the tail-flip reflex or C-start response, and it involves an initial full flexion of the body followed by a weaker flexion in the opposite direction, so that the animal bends and swims away from the source of the stimulus. The response is mediated by the Mauthner cells, a pair of giant neurons located at the level of the auditory-vestibular nerve in the hindbrain. The Mauthner cells receive input from the auditory nerve and then send signals to motor neurons on the opposite side of the body, which then produce the behavioral response. The ASR in fishes can be measured by placing the animals in small acrylic plates filled with water and mounted on top of a vibration device that produces particle motion stimulation. A high-speed video camera is needed to visualize the C-start response (Bhandiwad and Sisneros 2016).

In small mammals such as rodents, the ASR consists of hunching of the shoulders, dorsiflexion of the neck, and rapid extension then flexion of the limbs. The ASR in rodents is typically measured by placing the animal on a platform that measures displacement and force or acceleration caused by limb extension (Fig. 10.4). In primates, the ASR involves the reflex contraction of striated skeletal muscles, primarily muscles of the face, neck, shoulders, and arms (Braff et al. 2001).

An animal that twitches its ears or startles repeatedly (e.g., in at least two out of three presentations) in response to finger snaps, hand claps or pure tones at different frequencies has demonstrated an ability to hear. At the same time, however, the presence of a startle response does not mean the animal has normal hearing. This was demonstrated clearly in a study of the sensitivity and specificity of the Preyer reflex by Jero et al. (2001). The researchers used hand claps or the metallic sound of two hammers hitting together to elicit startle responses from young adult albino laboratory mice of the FVB strain. They found that the reflex test was effective for identifying profound hearing loss, but was insensitive for identifying less severe hearing losses.

Reflex responses to sound can be used to show differences between groups of animals as a function of age or experimental treatment. Bhandiwad and Sisneros (2016) examined the development of hearing in two species of larval fishes, the three-spined stickleback (Gasterosteus aculeatus) and the zebrafish (Danio rerio), by quantifying the probability of a startle reflex in response to sounds of different frequencies at different ages post-fertilization. McFadden et al. (2010) showed declines in the amplitude and increases in the latency of the ASR with age in laboratory rats. Age-related changes in one or more of the components of the ASR circuit or to brain regions providing inhibitory input to this circuit can account for ASR changes observed in older animals and humans.

Startle responses also can be useful for determining the range of frequencies that an animal can hear. Bowles and Francine (1993) determined that kit foxes (Vulpes macrotis) have a functional hearing range from 1 to 20 kHz by observing startle responses of four wild-caught kit foxes to playbacks of tones of different frequencies. An additional advantage of startle reflex testing is that a group of animals can be tested simultaneously. Kastelein et al. (2008) determined the frequency range of hearing for eight species of marine fish by noting the frequencies at which 50% or more of the fish in a school reacted to the sound stimulus by increasing swimming speed and making tight turns. Disadvantages of using startle responses are that they require presentation of high amplitude stimuli and they habituate quickly.

Prepulse Inhibition (PPI) and Reflex Modification

Although the ASR is a reflex that is not typically under voluntary control, it is sensitive to and can be modified by ongoing behaviors and attentional status of an animal. The ASR can be potentiated under some circumstances and attenuated or inhibited under others. Animals typically show larger ASRs when they are afraid or anxious than when they are not, so fear-potentiated startle paradigms commonly are used to study fear and anxiety states in animals. When an animal is processing another stimulus, such as a brief low-level sound or a puff of air or a flash of light, it will startle less to a sudden, loud sound than when it is not otherwise engaged. The ability of an auditory, tactile, or visual prepulse stimulus to reduce the amplitude of the ASR is termed prepulse inhibition (PPI).

Even an auditory prepulse stimulus near the hearing threshold of an animal can attenuate the ASR, and this makes the PPI paradigm suitable for testing threshold levels of sound and determining subtle effects of treatments on auditory function. PPI has been used to study the auditory sensitivity of fishes, frogs, and mammals (Fig. 10.5). In larval zebrafish, the probability of an ASR to a high-amplitude tone was reduced when the tone was preceded by other tones at sub-startle levels (Bhandiwad and Sisneros 2016). Thresholds obtained by PPI in this species were lower than thresholds obtained by using the ASR alone.
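PPI is commonly quantified as the percentage reduction in startle amplitude on prepulse trials relative to startle-alone trials. The Python sketch below illustrates that calculation. The formula is the conventional one rather than one quoted from the studies cited here, and the function name and amplitude values are hypothetical.

```python
from statistics import mean


def percent_ppi(startle_alone, startle_with_prepulse):
    """Percent reduction of mean startle amplitude caused by a prepulse.

    Both arguments are sequences of startle amplitudes in the same
    arbitrary units. Positive values indicate inhibition; negative
    values indicate that the prepulse increased the startle.
    """
    baseline = mean(startle_alone)
    if baseline <= 0:
        raise ValueError("mean startle-alone amplitude must be positive")
    return 100.0 * (1.0 - mean(startle_with_prepulse) / baseline)


# Hypothetical amplitudes (arbitrary units), for illustration only.
alone = [100.0, 120.0, 80.0]
with_prepulse = [60.0, 70.0, 50.0]
print(f"{percent_ppi(alone, with_prepulse):.1f}% inhibition")
```

Repeating this calculation across prepulse levels gives a function relating inhibition to prepulse level, from which a threshold can be read once a criterion amount of inhibition has been chosen.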

Fig. 10.5

Schematic drawing of a setup used to study prepulse inhibition of the ASR in Mongolian gerbils. The top drawing shows a gerbil placed into an acrylic tube 10 cm in front of a loudspeaker. The force sensor under the acrylic tube monitors the gerbil’s movements. The C label shows the position of the stimulation/recording computer. Center drawing shows the timing of acoustic stimulation (dB) with the pre-stimulus (lower amplitude trace) preceding the startle-producing stimulus (higher amplitude trace). Bottom drawing shows the response measured by the force sensor. Here, the response occurs only to the stimulus and not to the pre-stimulus. After repeated pairings of the pre-stimulus and stimulus, the response to the stimulus declines (Walter et al. 2012). © Walter et al. 2012; https://www.scirp.org/journal/paperinformation.aspx?paperid=17796. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Reflexes other than acoustic startle responses can be modified by the prior presentation of a sound; these paradigms are termed reflex modifications (Hoffman and Ison 1980). Simmons and Moss (1995) adapted this paradigm to obtain audiograms for two species of frogs, the American bullfrog (Lithobates catesbeianus) and the green treefrog (Dryophytes cinereus). Frogs were constrained inside a small dish (1–2 cm larger in diameter than the animal), which was then placed on top of a stabilimeter that picked up the frog’s movements within the dish. Two copper strips cemented to the side of the dish produced a mild electric shock that evoked small reflex contractions of the frog’s hind limbs. The reflex evoked by the electric shock was modified in strength by prepulses of pure tones, with the extent of modification varying with prepulse amplitude. At any given tone frequency, the amplitude of the prepulse producing 10% inhibition of the reflex response was defined as the threshold to that frequency. The magnitude of the reflex modification effect varied with the amplitude of the prepulse, but only when stimulation was spaced at intervals wide enough to avoid habituation.

Phonotaxis

Some animals have a natural tendency to approach sound (positive phonotaxis) or make evasive movements away from sound (negative phonotaxis). Sounds that elicit positive phonotaxis include species advertisement calls (i.e., mating calls), while sounds that elicit negative phonotaxis include sounds made by predators. These natural behavioral responses to sound can be exploited to estimate hearing sensitivity in those species for which training procedures based on conditioned responses are extremely difficult to implement. Phonotaxis experiments are readily conducted in the animal’s habitat and so can provide crucial information on the acoustic features animals use to recognize conspecific (own species) vocal signals such as advertisement and aggressive calls. These kinds of field studies are particularly important for identifying the impact of the entire soundscape on sound detection and discrimination, and for assessing the effects of environmental variables, such as air temperature and humidity, on acoustic communication.

Phonotaxis has been especially useful for studying auditory capabilities of female orthopteran insects, frogs, and songbirds, because these animals naturally approach stationary calling males in order to mate with them. For example, gravid female frogs readily approach loudspeakers broadcasting sounds (tone bursts, amplitude-modulated tones, or frequency-modulated tones) which they recognize as components of the advertisement calls of males of their own species, or even a synthetic version of these conspecific calls (Gerhardt 1995). The sensitivity of females to these sounds is measured in experiments in which sounds of different levels, frequencies, or temporal patterning are broadcast from a loudspeaker, and the female’s approach to the loudspeaker is quantified. Sounds can be broadcast from one source (one-speaker design) to estimate sound detection or from two sources (choice or two-speaker design) to estimate sound discrimination. The researcher can obtain an estimate of the female’s relative sensitivity to sounds (if sound frequency is varied) or her ability to distinguish sounds of two intensities (if sound level is varied). Responses are quantified in terms of the nearness and the path of the phonotactic approach, the latency of the response, and the presence of orientation movements, such as head-turning toward the sound source. Data are typically presented as the proportion of females responding to a particular stimulus as a function of whatever parameter is being varied, with the 50% correct point on the resulting function defined as the threshold in a one-choice experiment and the 75% correct point (midway between chance and perfect performance) defined as the threshold in a two-choice experiment (see Volume 2, Chap. 3 on amphibians).
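The threshold criteria described above (the 50% point in a one-choice experiment and the 75% point in a two-choice experiment) can be read from a response function by interpolation. The Python sketch below assumes linear interpolation between adjacent tested levels; the function name and the response proportions are hypothetical.

```python
def threshold_from_proportions(levels_db, proportions, criterion):
    """Interpolate the level at which the response proportion reaches criterion.

    Returns None when the criterion is not bracketed by two adjacent
    tested levels, so no threshold is extrapolated beyond the data.
    """
    pairs = sorted(zip(levels_db, proportions))
    for (l0, p0), (l1, p1) in zip(pairs, pairs[1:]):
        if p0 < criterion <= p1:
            return l0 + (criterion - p0) * (l1 - l0) / (p1 - p0)
    return None


levels = [40, 50, 60, 70, 80]  # playback levels in dB (hypothetical)

# One-speaker design: proportion of females approaching; chance is near 0.
one_speaker = [0.0, 0.2, 0.4, 0.8, 1.0]
t_one = threshold_from_proportions(levels, one_speaker, criterion=0.5)

# Two-speaker design: proportion choosing the target; chance is 0.5.
two_speaker = [0.5, 0.6, 0.7, 0.8, 0.95]
t_two = threshold_from_proportions(levels, two_speaker, criterion=0.75)

print(t_one, t_two)
```

The criterion differs between the two designs because chance performance differs: a female choosing at random between two loudspeakers would select the target about half the time.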

Because most species of insects and frogs call at night, visualizing their movements in a phonotaxis experiment can be challenging. Figure 10.6 shows a new technique designed to monitor phonotactic movements of frogs in both the laboratory and the natural environment (Aihara et al. 2017). In this technique, a female Australian orange-eyed treefrog (Ranoidea chloris) wears a miniature LED backpack. A video camera records the light emitted from the LEDs, thus allowing researchers to track the frog’s movements. Sounds are broadcast through multiple loudspeakers, and monitored by separate LED sound indication devices, each of which has a different pattern of illumination. In this way, researchers can determine not only the female’s movements but also which of several loudspeakers is playing the preferred sound.

Fig. 10.6

(a) An image of a sound indication device that consists of a miniature microphone and a light-emitting diode (LED). The LED is illuminated when the device detects sound. (b) Photo of a female orange-eyed treefrog wearing an LED backpack. (c) Arena playback experiment. Two loudspeakers at each end of the arena present sounds. A sound indication device is placed in front of each loudspeaker. The female wearing the backpack is released from the middle of the arena. The lights emitted by the sound indication device and the LED backpack are recorded by a video camera. (d) Natural habitat of the orange-eyed treefrog. The position of the sound-indication device is shown (Aihara et al. 2017). © Aihara et al. 2017; https://www.nature.com/articles/s41598-017-11150-y. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

There are limitations to the use and interpretation of phonotaxis data. Although phonotaxis experiments can tell us which sounds animals prefer and how sensitive they are to these sounds, they are not suitable for the compilation of entire audiograms or estimates of an animal’s entire range of hearing. When a female fails to approach a sound source, it may be because she does not hear it or because she does not recognize it as an advertisement call. Moreover, females of many species will show phonotaxis only when they are gravid. This limits the timespan during which experiments can be conducted, although phonotaxis can be induced by hormone injections (Gerhardt 1995). Male insects and frogs typically exhibit phonotaxis only in response to a high amplitude sound resembling an advertisement call or an aggressive call from a rival male. Males treat aggressive calls from rivals as threats and respond aggressively, by approaching the source and attempting to engage it physically. Because males are less likely than females to approach sound sources, descriptions of their hearing sensitivity based on phonotaxis are not reliable.

Evoked Calling

Evoked calling is another method based on unconditioned responses that can be used to estimate hearing sensitivity and acoustic preferences. Males of some species (orthopteran insects, frogs, songbirds) vocalize in response to playbacks of signals resembling conspecific advertisement or aggressive calls. The male’s sensitivity to these playbacks can be estimated by lowering the amplitude of the signal until the male no longer vocalizes back. Varying the acoustic features (frequency, temporal patterning) of the signal can provide estimates of sensitivity to these particular features (Fay and Simmons 1999). Evoked calling experiments, like phonotaxis experiments, can be implemented either in the laboratory or in the field. As with the phonotaxis technique, the evoked calling technique does not measure audibility per se but can be useful for determining what acoustic features of communication signals are most important for mediating behavioral responses. Despite their limitations, phonotaxis and evoked calling techniques are useful because they provide insight into what sounds animals pay attention to in their natural environment and thus into perceptual decision-making in a biologically relevant context.

10.3.2 Behavioral Methods Using Conditioned Behaviors

Classical Conditioning

Classical conditioning techniques have been used to train several species of animals for audiometric studies. In classical conditioning, an unconditioned stimulus that naturally elicits an unconditioned response is paired with a conditioned stimulus. After a number of pairings of the conditioned stimulus with the unconditioned stimulus, presentation of the conditioned stimulus alone elicits a conditioned response that is the same as or similar to the unconditioned response.

Fay (1995) described the use of classical respiratory conditioning to estimate auditory thresholds in the goldfish (Carassius auratus). The goldfish was restrained in a cloth bag and submerged in a small tank. An underwater loudspeaker was placed on the bottom of the tank. A tone of a particular frequency was presented shortly before a brief electric shock (unconditioned stimulus) that produced an unconditioned suppression of the fish’s respiration. Changes in the amplitude and rate of the fish’s respiration were measured by a thermistor placed in front of the fish’s mouth. After multiple pairings of the tone and shock, presentation of the tone alone produced a conditioned suppression of respiration. By determining the amplitude level of the tone that no longer produced a conditioned response, the fish’s sensitivity to that tone frequency could be determined.

Ehret and Romand (1981) used both unconditioned and classically conditioned pinnae movements and eye-blink responses to track the postnatal development of auditory thresholds in domestic kittens (Felis catus). Unconditioned movements of the pinnae and/or facial muscles in response to high-intensity tone bursts were observed in one group of kittens up to 12 days of age. A second group of kittens (aged 10 days to 1 month) was trained with tone-shock pairs to make conditioned movements of their eyelids and pinnae when they heard a sound. Ehret and Romand’s results showed that some kittens as young as 1–2 days of age were able to respond to some frequencies, and that sensitivity to low, mid, and high frequencies developed at different ages.

Operant Conditioning

There are many responses animals can make to indicate when sounds are heard (or not heard), such as touching a response paddle, pressing a lever with a nose or paw, lifting a paw, licking a tube from a water bottle, swimming across a barrier, or vocalizing. It is important to choose a response that is based on an animal’s natural behaviors and thus is easy to learn. Once the response is chosen, there are several behavioral methods that can be used to train animals to make the response when a sound is detected or refrain from the response when no stimulus is presented. These different paradigms have been implemented successfully with a large number of species, with modifications that take into account species-typical behaviors and habitats.

Operant conditioning techniques can use positive or negative reinforcement procedures for training or “shaping” a conditioned response. Positive reinforcement methods establish the behavior by providing a reward, such as food, water, or even verbal praise or tactile stimulation, whenever the animal makes the appropriate response. Negative reinforcement methods remove an unpleasant or aversive stimulus (usually mild electric shock) whenever the animal makes the appropriate response. Methods can also be used to decrease unwanted or incorrect responses; these are termed punishment procedures. For example, a time-out period might be imposed when an animal makes an incorrect response (negative punishment, because the opportunity to earn reinforcement is temporarily withdrawn). After the desired behavior has been established through an appropriate schedule of reinforcement during a training phase, the animal is then tested using various frequencies and amplitudes of sound to determine the audiogram. Sometimes animals mistakenly respond when there is no signal present; this is a false alarm. Some animals are more inclined to make false alarms than others. To assess this bias, “catch trials” (i.e., control trials in which no signal is presented) are interspersed at random in the stimulus series. Some researchers assess the animal’s attentiveness to a hearing task by conducting a set of easily heard “warm-up trials” at the beginning of a session and a set of easily heard “cool-down trials” at the end of a session. Criteria can be set such that if the animal’s performance does not reach a certain percentage of correct responses (e.g., 80%) during either the warm-up or the cool-down trials, testing is discontinued for that session or data from that session are eliminated.
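The session-acceptance rule and the catch-trial check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the list-of-booleans data layout, and the example outcomes are hypothetical, and the 80% criterion is simply the example value mentioned in the text.

```python
def percent_correct(outcomes):
    """Percentage of correct responses in a list of booleans (True = correct)."""
    return 100.0 * sum(outcomes) / len(outcomes)


def session_is_valid(warm_up, cool_down, criterion=80.0):
    """Accept a session only if BOTH the warm-up and cool-down blocks meet the criterion."""
    return (percent_correct(warm_up) >= criterion
            and percent_correct(cool_down) >= criterion)


def false_alarm_rate(catch_trial_responses):
    """Fraction of catch trials (no signal) on which the animal responded anyway."""
    return sum(catch_trial_responses) / len(catch_trial_responses)


# Hypothetical session: 10 warm-up trials, 10 cool-down trials, 20 catch trials.
warm_up = [True] * 9 + [False]        # 90% correct
cool_down = [True] * 7 + [False] * 3  # 70% correct, below the 80% criterion
catch = [False] * 18 + [True] * 2     # responded on 2 of 20 catch trials

valid = session_is_valid(warm_up, cool_down)
fa_rate = false_alarm_rate(catch)
```

In this made-up session the cool-down block falls below criterion, so the data would be discarded even though the warm-up block was passed.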

In conditioned suppression/avoidance paradigms, an animal learns to suppress an ongoing behavior when it detects a sound that signals shock (Heffner and Heffner 2001). The shock levels used in these studies are kept low so that the animals do not become agitated or develop a fear of the test apparatus that would impair their performance. Heffner et al. (2014) used the conditioned suppression procedure to determine behavioral audiograms and sound localization abilities of three young male alpacas (Vicugna pacos). Thirsty alpacas were trained to break contact with a water spout when they heard a tone or noise signal (a conditioned stimulus) that warned of impending shock (unconditioned stimulus) and to resume drinking water following a safety signal. The safety signal for tone threshold testing was a shock indicator light that turned off when shock was terminated. Hit rates (measuring the percentage of correct detections of sound, indicated by breaking contact with the water bowl when the tone signal was present) and false alarm rates (measuring the percentage of false alarms, indicated by breaking contact with the water bowl when no tone was present) were determined for each stimulus intensity. The pure-tone thresholds of the three alpacas showed little variability among individuals. Indeed, Heffner and Heffner (2001) argued that individual variation among animals is less when using conditioned suppression compared to methods based on positive reinforcement.

Another common technique based on positive reinforcement, used with many aquatic (Fig. 10.7) and terrestrial species, is a go/no-go response paradigm. Thomas et al. (1990) used this technique to measure the audiogram of a subadult male Hawaiian monk seal (Neomonachus schauinslandi). At the start of each trial, a trainer sent the seal, using a hand cue, to station under water with its chin resting on a headstand. If a tone was heard, the seal was expected to leave the station, touch a response paddle, and swim to the trainer for a fish reward (go response). If no tone was heard (either a control trial or an inaudible signal), the seal was supposed to stay at the station, wait for the trainer to give a release whistle, and then swim back to the trainer for a reward (no-go response). Half the trials were signal-present and half were signal-absent controls; the order of presentation of the trial types was pseudorandomized throughout a session so that the animal would adopt a neutral response bias. The trainer then called the seal back to the initial station with a whistle and the next trial commenced.
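A pseudorandomized trial order of the kind described above might be generated as in the following Python sketch. The source does not say how the order was constrained, so the limit on consecutive trials of the same type (`max_run`), the function names, and the seed are assumptions made purely for illustration.

```python
import random


def longest_run(schedule):
    """Length of the longest run of identical consecutive trial types."""
    longest = current = 1
    for previous, trial in zip(schedule, schedule[1:]):
        current = current + 1 if trial == previous else 1
        longest = max(longest, current)
    return longest


def make_schedule(n_trials, max_run=3, seed=None):
    """Return a shuffled list with half signal-present (True) and half
    signal-absent (False) trials and no run longer than max_run.

    The max_run constraint is an assumed rule, not one stated in the text.
    """
    if n_trials <= 0 or n_trials % 2:
        raise ValueError("n_trials must be a positive even number")
    rng = random.Random(seed)
    schedule = [True] * (n_trials // 2) + [False] * (n_trials // 2)
    while True:
        rng.shuffle(schedule)              # reshuffle until the constraint holds
        if longest_run(schedule) <= max_run:
            return schedule


schedule = make_schedule(20, max_run=3, seed=1)
```

Limiting run length is one common way to keep an animal from learning the sequence; other constraints could be substituted without changing the overall structure.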

Fig. 10.7

Photo of a beluga whale holding station in front of an underwater loudspeaker during behavioral training for later audiogram measurements at Vancouver Aquarium. During the actual experiment, the computer operator moved behind the rock wall, out of sight of trainers and whale

There are several drawbacks of behavioral audiometric studies based on conditioning procedures. Most notably, weeks or months may be required to train the animal to respond reliably. It is important to maintain the animal’s motivation to respond and attention to the task, both of which can wane if there are changes in the social environment, routine, or the animal’s health.

Because a long period is required to train and test an animal for a behavioral audiogram, and because the number of individuals in captivity is limited for many species, hearing data for some marine mammals are available for only a single animal. Hall and Johnson (1972) measured a behavioral audiogram of a captive killer whale (Orcinus orca) and reported that this species had much worse high-frequency hearing than other toothed whales tested to that date. Later, Bain et al. (1993) measured behavioral audiograms of five killer whales and found their hearing was very typical of other toothed whales. Upon investigation, the researchers found that the original test subject had been given high dosages of an ototoxic antibiotic. So, the first killer whale tested was likely hearing impaired as a result of antibiotic-induced death of hair cells in the high-frequency region of the cochlea. By now, another eight individuals have been tested, confirming more typical delphinid audiograms in killer whales (Branstetter et al. 2017).

10.3.3 Signal Presentation Paradigms for Behavioral Audiograms

There are three classic paradigms commonly used for signal presentation in behavioral audiogram tests with animals (Levitt 1970; Klump et al. 1995): the Method of Constant Stimuli, the Method of Limits, and the Up/Down Staircase method (also called “adaptive tracking method”). One important factor to keep in mind when choosing a signal presentation paradigm is the time available for measuring thresholds, as there is a trade-off between the number of trials and the accuracy and reliability of hearing-threshold measurements.

Method of Constant Stimuli

The Method of Constant Stimuli provides the greatest accuracy and reliability for threshold measurements. In this paradigm, the animal is tested at one frequency in a session with blocks of trials having an equal number of different signal levels ranging from very low to very high amplitude (i.e., no silent controls), presented in random order. The animal makes a response when a signal is heard, and the results for each signal presentation (“Yes” the tone was heard or “No” the tone was not heard) are tallied by amplitude level (Fig. 10.8 left panel). After all responses are tallied, a psychometric function (i.e., a plot of the animal’s responses, typically the percentage of “Yes” responses, versus amplitude level; Fig. 10.8 right panel) is constructed. The threshold level is determined (often by interpolation) as the level at which the animal indicated it heard the signal on 50% of the trials.

Fig. 10.8

Illustration of the Method of Constant Stimuli. Left panel: Fifty stimuli were presented at each of nine stimulus levels (450 trials total). The number of times the subject indicated that the stimulus was heard at each level was tallied in the Number column and converted to a percentage in the Percent column. At stimulus levels below threshold, the subject rarely responded, whereas at the highest stimulus levels, the subject reported detection on all 50 trials (100%). Right panel: Data from the tallies chart were used to plot a psychometric function, showing performance as a function of stimulus level. Threshold, defined as the stimulus level at which the subject made a detection response on 50% of the trials, was interpolated to be 5.2 in this example
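The tallying and interpolation steps of the Method of Constant Stimuli can be sketched in Python as follows. The tallies below are hypothetical (they are not the values plotted in Fig. 10.8), the stimulus levels are in arbitrary units, and linear interpolation between neighboring levels is an assumed choice; other fitting methods are possible.

```python
def percent_yes(tallies, trials_per_level):
    """Convert the number of "Yes" responses at each level into a percentage."""
    return {level: 100.0 * n / trials_per_level for level, n in tallies.items()}


def interpolate_threshold(psychometric, criterion=50.0):
    """Linearly interpolate the level at which percent "Yes" reaches the criterion."""
    points = sorted(psychometric.items())
    for (lo_level, lo_pct), (hi_level, hi_pct) in zip(points, points[1:]):
        # Find the first pair of neighboring levels that brackets the criterion.
        if lo_pct <= criterion <= hi_pct and hi_pct > lo_pct:
            fraction = (criterion - lo_pct) / (hi_pct - lo_pct)
            return lo_level + fraction * (hi_level - lo_level)
    raise ValueError("criterion is not bracketed by the tested levels")


# Hypothetical tallies: "Yes" responses out of 50 trials at each of nine levels
# (9 x 50 = 450 trials, the same design as in the figure, but invented counts).
tallies = {1: 0, 2: 1, 3: 3, 4: 10, 5: 20, 6: 30, 7: 42, 8: 48, 9: 50}
psychometric = percent_yes(tallies, trials_per_level=50)
threshold = interpolate_threshold(psychometric)
```

With these invented counts, the 50% point lies halfway between level 5 (40% "Yes") and level 6 (60% "Yes").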

The stimulus presentation levels cover a wide range that brackets the animal’s threshold, so additional points on the psychometric function can be estimated. Randomized presentation of stimuli prevents the animal from anticipating the stimulus level on the next trial. Many of the stimulus levels are well above threshold, so the animal is not required to make difficult detections on every trial. On the other hand, the method is time-consuming, and the choice of stimulus levels to present requires some prior knowledge of likely thresholds at a specific frequency.

Method of Limits

The Method of Limits involves the presentation of stimuli in small steps (typically 2 to 5 dB) over a fixed range of stimulus levels. At each level, the experimenter records whether the animal responded to the test tone or not (Fig. 10.9). Stimuli may be presented in an ascending series, from the lowest amplitude to the highest, or in a descending series, from the highest amplitude to the lowest. Multiple runs are conducted, and for each run, the crossover level (i.e., the level halfway between the stimulus level not heard and the next level heard, e.g., 22.5 dB for run 1 and 27.5 dB for run 2 in Fig. 10.9) is determined. The mean threshold is estimated by averaging all of the crossover levels for that frequency.

Fig. 10.9

Illustration of the Method of Limits. Five series of trials (runs) were used, with test tones at six stimulus levels (15–45 dB re 20 μPa) presented in each run. Stimuli were presented from the highest level to the lowest (i.e., in descending order) on the first, third, and fifth runs, and from the lowest level to the highest (i.e., in ascending order) on the second and fourth runs. The crossover level was recorded for each run, then crossover levels were averaged to estimate threshold. In this example, a total of 30 trials were conducted across five runs, and the threshold was estimated to be 24.5 dB re 20 μPa
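The crossover-and-average computation used in the Method of Limits can be written out as a short Python sketch. The response patterns below are hypothetical; they were chosen so that the first two crossover levels (22.5 and 27.5 dB) and the mean (24.5 dB) agree with the values quoted for Fig. 10.9, and the six levels in 5-dB steps are an assumption rather than the exact levels used in the figure.

```python
def crossover_level(levels, responses):
    """Return the level halfway between the two adjacent levels at which the
    response changes.

    levels: stimulus levels (dB) in presentation order (ascending or descending).
    responses: True ("Yes") or False ("No") for each presented level.
    """
    for i in range(1, len(levels)):
        if responses[i] != responses[i - 1]:
            return (levels[i] + levels[i - 1]) / 2.0
    raise ValueError("no change of response in this run")


descending = [40, 35, 30, 25, 20, 15]   # assumed levels, 5-dB steps
ascending = descending[::-1]

# Hypothetical runs: descending on runs 1, 3, 5 and ascending on runs 2, 4.
runs = [
    (descending, [True, True, True, True, False, False]),
    (ascending, [False, False, False, True, True, True]),
    (descending, [True, True, True, True, False, False]),
    (ascending, [False, False, False, True, True, True]),
    (descending, [True, True, True, True, False, False]),
]

crossovers = [crossover_level(levels, responses) for levels, responses in runs]
threshold = sum(crossovers) / len(crossovers)
total_trials = sum(len(levels) for levels, _ in runs)
```

Alternating the direction of the runs, as in this sketch, is what lets the habituation and anticipation errors partly cancel in the average.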

Presenting all runs solely in descending order or solely in ascending order may produce a strong response bias that influences threshold estimates. When trials are presented using the descending Method of Limits, the animal can become accustomed to reporting that it perceives a stimulus and can continue reporting hearing the signal below the threshold; this is known as the error of habituation. Alternatively, in the ascending Method of Limits, the animal can anticipate that the stimulus is about to become detectable and make an error by responding in the absence of the signal; this is known as the error of anticipation. The bias introduced by signal predictability is a drawback of using the Method of Limits. The influence of habituation and anticipation errors can be partly overcome by alternating an equal number of ascending and descending runs on the same subject.

The Method of Limits is often preferred over the Method of Constant Stimuli because of its greater efficiency in bracketing thresholds; i.e., fewer trials are needed for a reliable estimate of threshold. In the example shown in Fig. 10.9, responses to test tones at six stimulus levels were recorded across five runs; this required 30 trials total. If the Method of Constant Stimuli had been used, with 50 signals presented at each of the six stimulus levels, a total of 300 trials would have been presented.

Up/Down Staircase Method

The Up/Down Staircase method, or adaptive tracking signal presentation paradigm, is a variation of the Method of Limits that was developed by von Békésy (1960) as a way of efficiently determining thresholds (Fig. 10.10). This method is also referred to as a Modified Method of Limits. The test begins with the presentation of a high-amplitude signal that is likely to be easily heard. Then, the amplitude is reduced in 2- to 10-dB steps until the animal does not respond to the signal. When the animal signifies it can no longer hear the signal, the dB level is immediately increased (in 1- to 5-dB steps) until the animal reports it again hears the sound. At that level, the direction is reversed and the procedure is repeated. Thus, this method includes both descending and ascending staircases, with reversals triggered by a change in the animal’s response. The hearing threshold can be estimated by taking the average of the signal levels at a designated number of reversals or by noting the lowest level with a criterion number of “Yes” responses on ascending trials. Catch trials, or silent control trials (i.e., controls in which all electronics are switched on but no test signal is projected), may be used to control for response bias (see the example audiometric study of a Hawaiian monk seal described earlier in this chapter). In addition, the time interval between signal presentations can be varied, so that the subject does not develop a pattern of responding based on predictable timing.

Fig. 10.10

Example of “bracketing” a hearing threshold using the Up/Down Staircase method (Modified Method of Limits). The first signal was presented at a level that the subject easily heard (“Yes” at 40 dB re 20 μPa). Signal level was then decreased in 5-dB steps until the subject no longer signaled detection (“No” at 25 dB re 20 μPa). The change of response from “Yes” to “No” triggered the first reversal, from a descending series to an ascending one. Thereafter, each change of response triggered an immediate reversal. Signals were presented at random intervals to prevent the subject from developing a response bias based on timing. In this example, the predetermined criterion for threshold was the lowest signal level with three “Yes” responses on ascending trials (circled responses), so 30 dB re 20 μPa was the threshold for this frequency. Testing at this frequency terminated when the criterion for threshold was met
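The tracking logic of the Up/Down Staircase method can be illustrated with a small Python simulation. Everything here is a sketch under assumptions: the `hears` callable stands in for the animal, the simulated subject is an idealized, noise-free listener with a hypothetical 28-dB threshold, and the threshold is taken as the mean of the reversal levels (one of the two estimators mentioned in the text).

```python
def run_staircase(hears, start_level, step_down, step_up, n_reversals,
                  max_trials=200):
    """Track a threshold with an up/down staircase.

    hears: callable taking a level (dB) and returning True for a "Yes" response.
    Returns (track, reversal_levels); track is a list of (level, response).
    """
    level = start_level
    direction = -1                      # begin by descending from an audible level
    track, reversal_levels = [], []
    while len(reversal_levels) < n_reversals and len(track) < max_trials:
        response = hears(level)
        track.append((level, response))
        # A "No" while descending, or a "Yes" while ascending, triggers a reversal.
        if (direction == -1 and not response) or (direction == +1 and response):
            reversal_levels.append(level)
            direction = -direction
        level += step_up if direction == +1 else -step_down
    return track, reversal_levels


def ideal_subject(level):
    """Hypothetical listener that always hears levels of 28 dB or more."""
    return level >= 28


track, reversals = run_staircase(ideal_subject, start_level=40,
                                 step_down=5, step_up=5, n_reversals=6)
threshold = sum(reversals) / len(reversals)
```

For this idealized listener the track settles into alternating between 25 and 30 dB, so only nine trials are needed to collect six reversals. A real animal responds probabilistically near threshold, so actual tracks are less regular.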

The Up/Down Staircase procedure can be difficult for an animal, because many trials are presented at near-threshold levels. This could affect an animal’s motivation to respond. However, rewarding correct responses on both signal trials and silent control trials helps reduce these negative effects. The major advantage of the adaptive tracking method over the Method of Constant Stimuli and the Method of Limits is that fewer trials need to be conducted, resulting in a shorter test session for both the researcher and the animal subject.

10.3.4 Receiver Operating Characteristic (ROC) Curves

Animals, like humans, can have a bias toward a more conservative or liberal response during a hearing test (Klump et al. 1995), which could lead to underestimating or overestimating the hearing threshold, respectively. Procedures have been developed to separate response bias from actual behavioral sensitivity in psychophysical experiments. In a yes/no (audible/inaudible signal) detection task, there are four possible outcomes of each trial: (1) correct detection or hit (i.e., responding that a signal is present when it is broadcast), (2) correct rejection (i.e., responding that a signal is absent when it is not broadcast), (3) false alarm (i.e., responding that a signal is present when it is not, or indicating “yes” before the signal is broadcast), and (4) missed detection or miss (i.e., responding that a signal is absent when a signal is broadcast or failing to respond). The four response choices of an animal in a behavioral hearing test are illustrated in Fig. 10.11.

Fig. 10.11

A two-by-two decision matrix relating the signal condition (signal presence versus signal absence) to the animal’s possible responses (indicating signal presence versus signal absence) during audiometric tests

Response bias can be disentangled from sensory capabilities by constructing a Receiver Operating Characteristic (ROC) curve (Green and Swets 1966). Upon signal presentation, the animal can respond either “yes” or “no” and so the probability of correct detection, P(CD), and the probability of missed detection, P(MD), add to 1: P(CD) + P(MD) = 1. Similarly, in the case of no signal presented, the probabilities of false alarm, P(FA), and correct rejection, P(CR), add to 1: P(FA) + P(CR) = 1. In other words, the probabilities computed from the animal responses in Fig. 10.11 are not all independent. In the ROC plot, therefore, two independent probabilities are plotted against each other: P(CD) versus P(FA). As illustrated in Fig. 10.12a, the major diagonal line marks all the points at which P(CD) = P(FA), which would be expected if the subject were making random choices or simply guessing. Below this line, the animal would perform worse than by chance; i.e., the animal would be making deliberate mistakes. The minor diagonal corresponds to P(CD) + P(FA) = 1 and so represents neutral response bias, with responses falling to the left of the line indicating a conservative response bias (i.e., low false alarm probability) and to the right a liberal response bias (i.e., high false alarm probability). The best possible performance is at the point where P(FA) = 0 and P(CD) = 1, where the animal detects all signals and does not report any false alarms. Actual results from a beluga whale (Fig. 10.12b) detecting played-back beluga calls in icebreaker noise are shown in Fig. 10.12c. At decreasing signal-to-noise ratio (from 0 to −30 dB), the animal’s hit rate decreased (i.e., decreasing P(CD)). False alarms were made only at a low signal-to-noise ratio (−24 dB), indicating an overall conservative response bias. Data are based on the study by Erbe and Farmer (1998); see Fig. 10.7 for a photo of the training setup.

Fig. 10.12

(a) Receiver Operating Characteristic (ROC) plot showing the lines and areas relating the probability of correct detection, P(CD), and the probability of false alarm, P(FA). (b) Photo of a beluga whale at Vancouver Aquarium. (c) ROC plot of this animal’s performance when presented with a beluga call mixed into icebreaker noise at signal-to-noise ratios of 0, −6, −12, −18, −24, and −30 dB. The animal was trained to indicate whenever it heard the call in the noise. The animal’s performance decreased with decreasing signal-to-noise ratio. The animal adopted a very conservative response bias (Erbe and Farmer 1998)
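To make the ROC quantities concrete, the following Python sketch computes P(CD) and P(FA) from the four outcome counts of the decision matrix and locates the resulting point relative to the two diagonals. The trial counts are hypothetical (they are not the beluga data), and the function names are chosen only for this illustration.

```python
def detection_probabilities(hits, misses, false_alarms, correct_rejections):
    """Return (P(CD), P(FA)) from the four outcome counts."""
    p_cd = hits / (hits + misses)                              # signal-present trials
    p_fa = false_alarms / (false_alarms + correct_rejections)  # signal-absent trials
    return p_cd, p_fa


def classify_bias(p_cd, p_fa, tol=1e-9):
    """Locate a point relative to the minor diagonal, P(CD) + P(FA) = 1."""
    total = p_cd + p_fa
    if abs(total - 1.0) <= tol:
        return "neutral"
    return "conservative" if total < 1.0 else "liberal"


def above_chance(p_cd, p_fa):
    """True if the point lies above the major diagonal, P(CD) = P(FA)."""
    return p_cd > p_fa


# Hypothetical counts for one signal-to-noise ratio: 50 signal trials, 50 catch trials.
p_cd, p_fa = detection_probabilities(hits=30, misses=20,
                                     false_alarms=2, correct_rejections=48)
bias = classify_bias(p_cd, p_fa)
```

Here the point (P(FA) = 0.04, P(CD) = 0.6) lies above the chance diagonal and to the left of the minor diagonal, which corresponds to a conservative response bias.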

The bias of the animal in these hearing tests can be manipulated by changing the reinforcement regimen. If the possible responses from Fig. 10.11 are differently rewarded (e.g., positive reinforcement for the two correct responses and negative reinforcement for the two false responses), then the animal will aim to maximize the percentage of correct responses. If the four responses are all differently rewarded, then the perceived values and risks will influence the animal’s response. For example, in a study with an Arctic fox (Vulpes lagopus; Stansbury et al. 2014), correct detections and correct rejections were rewarded with 3–4 pieces of kibble. When the animal missed a signal, it was rewarded with 1 piece of kibble. False alarms resulted in a 2–3 s time-out, after which the animal was restationed for the next trial. By rewarding misses (i.e., one of the two false responses) and with only false alarms receiving no food but instead a time-out, the animal was conditioned to avoid false alarms but accept misses. The reinforcement regimen directly influenced the animal’s conservative bias. Similar conditioning likely happened with the beluga whale (Erbe and Farmer 1998). After the animal stationed, a sound was played randomly within a 30-s period. The animal indicated a detection (of the beluga call mixed into icebreaker noise) by breaking from the station. If the animal did not detect a call, it held station for the full 30 s. Correct detections were rewarded with fish within 2 s. False alarms received a time-out. A “no” response received a delayed (by up to 30 s) fish reward; these would have been correct rejections (i.e., signal-absent trials) and missed detections (i.e., signal-present trials, but under the assumption that the signal was too quiet to be detected). Effectively, the animal thus also received a reward (albeit delayed) for missed detections, even if the signal was above threshold on some of the trials. Without knowing the animal’s hearing threshold in advance, it is impossible to tell whether the animal truly did not hear the signal when it indicated “no” on a low-level signal-present trial.

An even greater benefit of ROC analysis is realized by measuring actual ROC curves (rather than settling for scatter plots of data as in Fig. 10.12c). To do that, the animal’s bias needs to be actively manipulated using reinforcement. For example, the beluga experiment could be redone with the same animal, but instead of rewarding both correct responses with one fish, the animal might be given three fish for a correct detection and only one fish for a correct rejection. The animal might begin to favor the “yes” response, exhibiting a more liberal response bias. So, rather than having just one data point at, say, −12 dB signal-to-noise ratio, we would get a curve for −12 dB, with the points along the curve corresponding to the same sensitivity (hence also called an isosensitivity curve) but to different biases, which were driven by the different reinforcement regimens. This is exactly what was done by Schusterman et al. (1975) with a California sea lion (Zalophus californianus) and a bottlenose dolphin (Tursiops truncatus), yielding actual ROC curves. Other ways of actively changing the bias include changing the percentage of catch trials (whereby fewer catch trials render the animal more liberal; Schusterman and Johnson 1975) or even changing the probability of handing out a reward (i.e., not all correct trials are rewarded all the time; Schusterman 1976). The resulting ROC curves then allow the separation of the animal’s actual sensitivity from its bias (Green and Swets 1966; Au 1993), but much more experimental time is needed to collect all these data.

10.4 Physiological Methods for Audiometric Studies on Live Animals

Behavioral tests of hearing can be too time-consuming to conduct, too difficult to employ because of animals’ limitations in learning or performing a behavioral task, or impractical for some other reason such as animal health, disposition, or developmental status. Physiological methods offer a practical, complementary approach because they do not require training the animal and they can be completed in a relatively short period of time. However, because physiological methods do not require a behavioral response from the animal that indicates the sound was perceived, they are considered to be tests of “auditory function” rather than “hearing” per se. The relationship between behavioral and physiological measures of hearing is discussed later in this chapter.

As in behavioral studies, physiological studies test responses to different kinds of acoustic stimulation and must take into account ambient noise that can affect thresholds. Other factors to consider in physiological studies are body temperature and whether or not the animal is anesthetized, because these factors can affect neural thresholds, amplitudes, and latencies. Anesthesia is commonly used in physiological studies because it is difficult to keep an unanesthetized animal in a fixed position in a sound field during testing and physical restraint can be stressful. However, anesthesia can affect brain activity and severely diminish or abolish neural responses to sound (Cui et al. 2017; Kiebel et al. 2012; McFadden and Kiebel 2013; Fig. 10.13). Anesthesia can also impair thermoregulation, resulting in changes in body temperature that can be countered by placing the animal on a heating pad during testing. When brain responses must be obtained from awake animals (see Fig. 10.13), electrical artifacts created by movements during exploration or grooming can be problematic, and many trials may be required to achieve acceptable signal-to-noise ratios.

Fig. 10.13

Top: Testing apparatus devised by Kiebel et al. (2012) for recording auditory evoked potentials from awake mice. The mice were placed on a platform (i.e., an inverted jar about 3″ in diameter) in a plastic tub containing warm water in a recording chamber. Mice were acclimated to the apparatus in daily 10-min sessions for 1–2 days prior to the first recording session. Typically, a mouse placed on the platform for the first time would enter the water and after a brief period of swimming, would climb back on the platform and remain there until removed by the researcher. In subsequent sessions, the mouse typically remained on the platform for the entire testing session (30–45 min). Stimuli were delivered from a headphone speaker placed 7″ above the animal’s head. A computer-controlled camera was used to monitor the mouse, and recording was manually paused when the animal groomed or became active. Bottom: Auditory evoked responses recorded from a mouse while it was awake and then again after it had been anesthetized. The waveforms are responses to 12 kHz tones at 90 dB re 20 μPa, averaged across 100 artifact-free trials in each condition

10.4.1 Otoacoustic Emission Methods

Otoacoustic emissions (OAEs) are sounds generated by hair cells in the inner ear, either in the absence of acoustic stimulation (spontaneous otoacoustic emissions) or in response to acoustic stimulation (transient otoacoustic emissions, TOAEs, elicited by a single tone or click; and distortion product otoacoustic emissions, DPOAEs, elicited by two primary tones, f1 and f2). OAEs reflect nonlinear processing in the inner ear and occur due to the action of a “cochlear amplifier,” which functions to increase sensitivity to low-level sounds. Moreover, they are frequency-specific and so will emerge at those frequencies where hearing is near normal (Kemp 2002). DPOAE testing has become popular as a rapid, non-invasive way to assess the functional integrity of hair cells in a wide variety of species, including frogs, lizards, birds, and mammals (Manley 2001). DPOAEs are abolished by loss or dysfunction of outer hair cells, and also by middle ear dysfunction that prevents retrograde transmission of acoustic energy from the cochlea to the ear canal. It is important to recognize, however, that the absence of OAEs is not necessarily evidence of outer hair cell dysfunction, because OAEs are not recordable from all normal ears. The technique is not very useful for pinnipeds because their stapedial reflex shuts down the auditory meatus as an adaptation for diving.

DPOAE tests in mammals typically use a probe assembly that is inserted into the external auditory meatus to form a closed acoustic system. For animals lacking ear canals (e.g., fishes, frogs, reptiles, and birds), the probe tip is placed inside a plastic tube that is then coupled to the animal’s ear using silicone grease or Vaseline to seal any gaps (Bergevin et al. 2008). The probe tip contains a very sensitive external microphone and tubes from two external sound sources (Fig. 10.14). Two primary test tones, f1 and a higher frequency tone f2, are generated by separate channels of a sound-generating system and presented through the sound tubes, and the sound in the ear canal is sampled by the microphone for a fixed period of time. The output of the microphone is filtered, digitized, averaged over a number of trials, and then analyzed using a computerized signal-analysis system. A normal inner ear will generate several nonlinear distortion products that will be propagated in a reverse direction back through the middle ear and into the ear canal (when present). When this occurs, spectrum analysis of the sound recorded by the microphone will show not only the original f1 and f2 tones that were delivered to the ear, but also several new tones that were generated as nonlinear distortion products. The largest distortion product is the cubic DPOAE, with a frequency equal to 2f1 − f2. For example, if f1 = 1000 Hz and f2 = 1200 Hz, then the cochlea will generate a cubic DPOAE at 800 Hz. Because 2f1 − f2 is the largest DPOAE produced (typically 30–40 dB below the level of the primary tones) and is less variable than other distortion products, it is typically the only one reported in animal studies.
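The arithmetic behind the cubic distortion product is simple enough to check directly. The short Python sketch below reproduces the worked example from the text (f1 = 1000 Hz, f2 = 1200 Hz); the function names are hypothetical and chosen only for this illustration.

```python
def cubic_dpoae_frequency(f1_hz, f2_hz):
    """Frequency of the cubic distortion product, 2*f1 - f2, in Hz.

    f1 is the lower primary tone and f2 the higher primary tone.
    """
    if not 0 < f1_hz < f2_hz:
        raise ValueError("expected 0 < f1 < f2")
    return 2 * f1_hz - f2_hz


def primary_ratio(f1_hz, f2_hz):
    """Frequency ratio f2:f1 of the two primary tones."""
    return f2_hz / f1_hz


dpoae_hz = cubic_dpoae_frequency(1000, 1200)   # worked example from the text
ratio = primary_ratio(1000, 1200)
```

Because f1 < f2, the cubic distortion product always falls below both primaries, which is why it appears at 800 Hz in this example.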

Fig. 10.14

A commercially available low-noise microphone with two external sound sources. The probe tip containing the microphone and sound tubes is covered with a foam or plastic ear tip and inserted into the ear canal to form a closed acoustic system. For animals without ear canals, the probe can be inserted into a plastic tube that is then sealed in place against the ear of the animal

The frequency ratio f2:f1 of the primary tones, the level of the higher-frequency primary tone L2, and the difference between the levels of the two primary tones L1 − L2 are selected to maximize the amplitude of the cubic DPOAE in the ear canal. These parameters are species-specific and must be determined empirically. For all combinations of stimulus parameters (f2:f1, L2 and L1 − L2), the amplitude of the cubic DPOAE increases as the level of the primary tones increases until it saturates. DPOAEs can be difficult to measure at low frequencies due to masking by low-frequency ambient sounds in the ear canal (i.e., high noise-floor levels occur at low frequencies). But it is possible to measure low-frequency DPOAEs if great care is taken to ensure deep insertion and a good seal of the probe assembly in the ear canal.

Shaffer and Long (2004) measured low-frequency DPOAEs in two species of kangaroo rats to test the hypothesis that a large foot-drumming species (Dipodomys spectabilis) has better low-frequency sensitivity than a small foot-drumming species (D. merriami). In both species, DPOAEs were generated at low frequencies between 225 and 900 Hz. DPOAE amplitudes were greater in the larger kangaroo rat species than in the smaller species. Additionally, the authors found good correspondence between DPOAE amplitudes, behavioral hearing thresholds, and electrophysiological hearing thresholds in D. merriami. This suggests that DPOAE amplitudes are good estimates of hearing sensitivity.

10.4.2 Auditory Evoked-Potential and Auditory Brainstem Response Methods

Auditory evoked-potential (AEP) methods record stimulus-evoked electrical activity at various levels of the auditory nervous system. Hair cells and neurons in the auditory system function by generating electrical potentials in response to sounds, and measurements of these stimulus-evoked potentials can provide information about the functional state of the inner ear, auditory nerve, central auditory nuclei, and their fiber pathways (Salvi et al. 2000; McFadden 2007).

There are many ways of classifying AEPs. Common classifications are based on: (1) the region involved in the generation of the response (e.g., cochlea, brainstem, thalamus, or cortex), (2) the latency of the response (i.e., short-, middle-, and long-latency potentials reflecting generation by neural elements at progressively higher regions of the auditory system), (3) electrode placement (invasive near-field recordings made with an electrode inserted into an auditory nucleus versus noninvasive far-field recordings made from electrodes placed on the scalp), (4) the type of electrode used (high-impedance microelectrodes for recording potentials from individual cells versus low-impedance surface or needle electrodes for recording activity from large groups of neurons from the scalp), and (5) the size of the cellular population contributing to the response (e.g., local field potentials reflecting the extracellular electrical activity of a discrete group of neurons versus gross potentials generated by large populations of cells such as those recorded from scalp electrodes).

Electrical potentials generated by the cochlea and auditory nerve include the cochlear microphonic potential (CM potential) generated by outer hair cells, the summating potential (SP) generated primarily by inner hair cells, and the compound action potential (CAP) generated by the synchronous depolarization of auditory nerve fibers. AEPs generated by the auditory nerve and neurons in the auditory brainstem (i.e., cochlear nucleus, superior olive, lateral lemniscus, and inferior colliculus) contribute to the short-latency scalp-recorded auditory brainstem response (ABR). AEPs recorded from electrodes implanted into the auditory midbrain of mammals are referred to as inferior colliculus evoked potentials (IC-EVPs). AEPs generated by forebrain regions (thalamus and cortex) include long-latency potentials recorded from electrodes implanted into the brain or from surface electrodes.

AEP methods share a number of common procedures. Stimuli can be presented using the same paradigms discussed in Sect. 10.3.3 (Method of Constant Stimuli, Method of Limits, Up/Down Staircase method) with the criterion for threshold being an electrophysiological, rather than a behavioral, response. Responses are recorded and averaged over a number of trials (e.g., 50–2000 trials); the number of trials depends on the size of the response relative to background electrical noise (i.e., the signal-to-noise ratio). They are typically quantified in terms of response amplitude (e.g., peak-to-peak voltage or peak voltage relative to a baseline voltage level) and latency (i.e., the lag-time between the onset of the stimulus and a defined portion of the response). Threshold is variously defined as the lowest stimulus level that elicits a detectable physiological response, the lowest level at which a peak replicates, the midpoint between the level at which a response replicates and the next lower level at which it does not, or the sound pressure level at which the amplitude of a particular peak reaches a criterion voltage level. Other parameters that are commonly measured from AEP waveforms include peak amplitudes, peak latencies, and in the case of the ABR, inter-peak intervals (i.e., time between different peaks, reflecting neural conduction time). Results are summarized as input-output functions that show response magnitude or latency as a function of stimulus level, or as an audiogram, showing threshold as a function of stimulus frequency.
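Three of the threshold definitions listed above can be sketched in Python as follows. The function names and the amplitude values are hypothetical and serve only as illustration; the stimulus levels follow the 5-dB series from 90 to 55 dB used in Fig. 10.15, with responses assumed to replicate down to 65 dB.

```python
def lowest_replicating_level(levels_db, replicated):
    """Lowest stimulus level (dB) at which the response replicated."""
    hits = [lvl for lvl, ok in zip(levels_db, replicated) if ok]
    if not hits:
        raise ValueError("no replicating response at any level")
    return min(hits)


def midpoint_threshold(levels_db, replicated):
    """Midpoint between the lowest replicating level and the next lower
    tested level at which the response did not replicate."""
    lowest = lowest_replicating_level(levels_db, replicated)
    misses = [lvl for lvl, ok in zip(levels_db, replicated)
              if not ok and lvl < lowest]
    if not misses:
        raise ValueError("no non-replicating level below the lowest response")
    return (lowest + max(misses)) / 2


def criterion_threshold(levels_db, amplitudes_uv, criterion_uv):
    """Level (dB) at which peak amplitude reaches a criterion voltage,
    found by linear interpolation between adjacent tested levels."""
    pairs = sorted(zip(levels_db, amplitudes_uv))
    for (l0, a0), (l1, a1) in zip(pairs, pairs[1:]):
        if a0 < criterion_uv <= a1:
            return l0 + (criterion_uv - a0) * (l1 - l0) / (a1 - a0)
    raise ValueError("criterion is not bracketed by the measured amplitudes")


if __name__ == "__main__":
    levels = list(range(90, 50, -5))             # 90, 85, ..., 55 dB
    replicated = [lvl >= 65 for lvl in levels]   # assumed outcome per level
    print(lowest_replicating_level(levels, replicated))  # 65
    print(midpoint_threshold(levels, replicated))        # 62.5

    # Hypothetical peak amplitudes (microvolts) at the five lowest levels.
    amps = [0.0, 0.25, 0.75, 1.5, 2.0]
    print(criterion_threshold([55, 60, 65, 70, 75], amps, 0.5))  # 62.5
```

The sketch shows that the same set of recordings can yield different threshold values depending on the definition chosen, which is one reason why thresholds from different studies are not always directly comparable.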

Because the ABR is an onset response that requires synchronous activity of an ensemble of neural elements, stimuli with very short rise/fall times are most effective. Clicks, which are brief (e.g., 5–100 μs) and therefore spectrally broad, often are used as stimuli, particularly for screening of auditory function. Pure tones with a rapid onset are preferred when more frequency-specific information is required, as for testing the frequency range of hearing. Sinusoidally amplitude-modulated tones provide even greater frequency specificity.
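The link between click duration and spectral breadth can be illustrated with a rough calculation. The sketch below assumes an idealized rectangular electrical pulse, whose magnitude spectrum has its first null at the reciprocal of the pulse duration; the acoustic click actually delivered also depends on the transducer, which is not modeled here. The function name is hypothetical.

```python
def first_spectral_null_hz(duration_s: float) -> float:
    """First null (Hz) in the spectrum of an ideal rectangular pulse."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return 1.0 / duration_s


# The two click durations quoted in the text as examples.
for duration_us in (5, 100):
    null_khz = first_spectral_null_hz(duration_us * 1e-6) / 1e3
    print(f"{duration_us} us click: first spectral null near {null_khz:.0f} kHz")
```

Under this assumption, the shorter the click, the wider the band of frequencies over which its energy is spread.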

At high stimulus levels that are clearly audible to an animal, several characteristic peaks are typically present in the response waveform, with latencies that correspond to their progressively higher anatomical sites of generation. ABRs from mammals typically have five prominent peaks (Fig. 10.15). The first peak of the waveform has a cochlear origin, reflecting the summed synchronous neural activity from the peripheral portion of the auditory nerve, and the second peak most likely reflects neural activity from the central portion of the auditory nerve at the level of the cochlear nucleus. Subsequent peaks are generated by brainstem regions between the cochlear nucleus and the lateral lemniscus or inferior colliculus. In all species studied, peak amplitudes of the ABR increase and latencies decrease as the stimulus level increases (Fig. 10.15). The rate of stimulus presentation can influence response amplitudes and thresholds. Data acquisition time is shortened by using a rapid signal presentation rate, but there is a cost in terms of response size, with high signal rates resulting in decreased peak amplitudes in the response waveform and increased response latencies.

Fig. 10.15

Left: Photo of a squirrelfish (Sargocentron sp.) with subcutaneous electrodes about to undergo ABR testing. Photo courtesy of Rob McCauley, Centre for Marine Science and Technology, Curtin University. Right: ABR waveforms obtained from an anesthetized C57BL/6J mouse. Needle electrodes (pictured at top left) were inserted under the skin at the top of the head (active), behind the right ear (reference), and at the base of the tail (ground). Two waveforms were collected at each stimulus level, in 5-dB steps from 90 to 55 dB re 20 μPa. Threshold, defined as the lowest level with a repeatable response, was 65 dB re 20 μPa for this frequency. The first two peaks of the ABR (short bracket) show activity from the auditory nerve, whereas the subsequent peaks (long bracket) arise from successively more rostral regions of the central auditory nervous system. Note the decrease in peak amplitude and increase in peak latency with decreasing stimulus level, typical of ABR waveforms

Preparation of animals for ABR testing is minimal. Typically, the animal is restrained or sedated or anesthetized to keep it still during the recording session. Aquatic animals under human care can be trained to remain still at a station (e.g., in a hoop) and are maintained at a good ambient water temperature in a pool. Terrestrial animals are placed on a heating pad to maintain normal body temperature. Electrodes for recording electrical activity are then applied. For most animals, the electrodes are low-impedance needle electrodes that are inserted under the skin; however, other types of electrodes, such as surface electrodes and suction-cup electrodes that attach to the surface of the head (Fig. 10.16) are suitable as well. One electrode, termed the active, non-inverting, or positive electrode, is placed at the vertex (upper surface of the head, along the midline, and between the ears) and another, termed the reference, inverting, or negative electrode, is placed behind the pinna or in another relatively neutral region of the head. A third electrode, which serves as a ground, is placed in the pool water or in a non-neural site on the animal (e.g., beneath the skin of the neck, back, or leg).

Fig. 10.16

Photo of a harbor porpoise (Phocoena phocoena) stationing during an ABR test of its hearing at Fjord & Bælt Denmark. The recording electrodes, attached to the animal’s head and back using suction cups, measure small electrical voltages produced by the brain in response to acoustic stimulation. Photo courtesy of Solvin Zankl, Fjord & Bælt and the Marine Biological Research Center, University of Southern Denmark, Kerteminde, Denmark

One advantage of ABR testing is that it requires less time to collect a complete set of data (often 1 h or less to obtain a complete audiogram from an anesthetized animal), as compared to the weeks or months needed to train an animal for compiling behavioral audiograms. In addition, ABR testing is practical to use in studies requiring many animals and multiple measurements (e.g., before and after a treatment is applied), and for testing young animals in developmental studies. For example, McFadden et al. (1996) used ABR methods to study the ontogeny of auditory function in the Mongolian gerbil and identified three phases of development based on frequency-threshold curves. ABRs were elicited by intense stimuli in the low- and mid-frequency range as early as 10 post-natal days (pnd) in a small proportion of animals. By 16 pnd, all gerbils were responding reliably to tones between 125 Hz and 32 kHz, similar to adult animals.

ABR testing has become the AEP method of choice for audiometric testing in a wide range of species. In particular, ABRs are useful for estimating hearing capabilities of animals that are difficult to test using other methods. For example, Hu et al. (2009) used ABR recordings to assess hearing in two cephalopods: the oval squid (Sepioteuthis lessoniana) and the common octopus (Octopus vulgaris). Each cephalopod was anesthetized and then transferred to a holder inside a plastic tub filled with seawater. Teflon-coated silver needle electrodes were inserted on the head between the eyes (non-inverting) and on the mantle (inverting) and a wire was placed in the tub to serve as the ground. In both cephalopods, the ABR had only one prominent peak. The resulting ABR audiogram showed that the squid responded to a wider frequency range (400–1500 Hz vs. 400–1000 Hz) and had significantly lower thresholds at 600 Hz (its frequency of best sensitivity) compared to the octopus.

Comparisons of ABR audiograms can show the effects of factors such as age, noise exposure, drug treatment, and genetic mutations. The ABR audiograms shown in Fig. 10.17, for example, show the effects of an induced genetic mutation of the gene that codes for the copper-zinc form of superoxide dismutase (SOD1) on auditory sensitivity in mice. SOD1, an enzyme found in the cytosol of all cells, serves as a first line of defense against oxidative damage and has been implicated in numerous degenerative disorders and age-related hearing loss (McFadden et al. 2001a, b). For example, hearing thresholds of aged (13-month-old), wild type (WT) mice with normal levels of SOD1 are lower at all four tested frequencies than those of SOD1-deficient littermates. SOD1 deficiency had a greater effect on thresholds at 16 and 32 kHz than at lower frequencies (8 and 4 kHz).

Fig. 10.17

Average ABR thresholds (dB re 20 μPa) from aged mice with normal levels of SOD1 enzyme (WT) compared to thresholds from littermates missing 50% (HET) or 100% (KO) of SOD1 due to genetic manipulation of the copper-zinc superoxide dismutase gene. WT = wildtype mice (with two normal gene alleles and normal levels of SOD1); HET = heterozygous knockout mice (with one abnormal allele, resulting in 50% reduction of SOD1); KO = homozygous knockout mice (with two abnormal alleles, resulting in complete elimination of SOD1)

10.4.3 Comparison of Behavioral and Physiological Audiograms

It is important to compare data obtained from physiological and behavioral methods to determine their reliability and validity. Even in the same species, experiments might use different stimulus presentation paradigms and different threshold criteria, making direct comparisons of results difficult. Although ABR and behavioral audiograms in the same species can have the same overall shape and similar frequencies of best hearing sensitivity, actual thresholds may differ considerably (Fig. 10.18). Some authors argue that these audiograms should not be considered equivalent (Sisneros et al. 2016). Ladich and Fay (2013) compiled AEP and behavioral audiograms of goldfish collected in different studies in different laboratories. They found that, at frequencies below 1000 Hz, median ABR thresholds were about 10 dB higher than behavioral thresholds, while at higher frequencies, ABR thresholds were lower than behavioral thresholds.

Fig. 10.18

Comparison of underwater hearing thresholds of individual bottlenose dolphins collected by behavioral (black) versus ABR (red) methods. Data from Johnson (1966), Popov and Supin (1990), Brill et al. (2001), Houser and Finneran (2006), Finneran et al. (2008), Finneran et al. (2011)

Schlundt et al. (2007) quantified differences in audiograms recorded from bottlenose dolphins in a variety of underwater test conditions (in a quiet pool and in a noisy bay). AEPs were recorded using a transducer embedded in a suction cup on the jawbone. In behavioral tests, the dolphins were conditioned by the trainer’s whistle to respond when the same tone was heard. Thresholds measured using the two techniques were very similar, although there was less variability in behavioral data.

10.5 Other Audiometric Measurements

Other crucial aspects of hearing can be examined using variations on the basic audiometric methods outlined above. These include frequency discrimination, intensity discrimination, equal-loudness functions, frequency selectivity (e.g., critical ratios, critical bandwidths, and psychophysical tuning curves), masking (i.e., forward, backward, and simultaneous), duration discrimination, stimulus generalization, and directional hearing (i.e., sound localization). All of these aspects of hearing have been studied in a wide range of vertebrate species. Fay (1988) compiled results of behavioral experiments from a large number of different species. Klump et al. (1995) provided complete descriptions of behavioral methods that have been developed for these kinds of experiments. Selected examples of these experiments are discussed briefly below. It is important to note that physiological techniques can also be used to obtain information on these other aspects of hearing, but that again, estimates of sensitivity may differ.

10.5.1 Frequency and Intensity Discrimination

Frequency and intensity discrimination experiments measure the smallest difference in frequency or intensity that an animal can detect—called the just noticeable difference (jnd) or the difference limen (DL). To measure a frequency DL using behavioral methods, the animal is trained to detect a frequency difference (ΔF) between two test tones. In a typical paradigm, the animal is presented with a constant stimulus (i.e., a tone burst of one frequency) that sometimes changes in frequency, and the animal is trained to respond when it perceives a frequency change. The smallest frequency difference that the animal can perceive reliably, according to some set criterion, is the jnd or DL. Because the animal is discriminating between two frequencies, a common criterion for threshold is 75% correct, which is midway between chance and perfect performance.

Heffner and Heffner (1982) measured frequency DLs in an Indian elephant (Elephas maximus indicus) housed in a zoo. The elephant was trained to press one of two response buttons on a panel with its trunk upon hearing a sound. When she heard a train of tone pulses that all had the same frequency, the correct response was to press the left button. When she heard a train of tone pulses that alternated between two different frequencies, the correct response was to press the right button. Correct responses were rewarded with a fruit-flavored sugar solution. The DL was determined by reducing the frequency difference between the tones in the two types of pulse trains, until the animal no longer detected the difference reliably. A psychometric function for a tone frequency of 1000 Hz, a frequency of best sensitivity for the elephant, is plotted in Fig. 10.19. The 75% correct discrimination threshold is at 1030 Hz, giving a DL of 30 Hz. The DLs calculated from psychometric functions at different tone frequencies are plotted in Fig. 10.19 as the Weber fraction (ΔF/F), the ratio of the DL to the test frequency. The Weber fraction increases with frequency, showing that the ability to discriminate differences in tone frequency becomes absolutely worse with increases in frequency. Changes in the Weber fraction with tone frequency have implications for understanding how frequency is coded in the nervous system across different species.
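As a sketch of how a DL and a Weber fraction are read off a psychometric function, the Python example below interpolates linearly to the 75% correct criterion. The percent-correct values are hypothetical and were chosen here only so that the result matches the 30-Hz DL reported at 1000 Hz; they are not the data of Heffner and Heffner (1982). The function names are likewise illustrative.

```python
def interpolate_dl(deltas_hz, percent_correct, criterion=75.0):
    """Frequency difference (Hz) at which performance reaches the criterion,
    by linear interpolation between adjacent points (sorted by delta)."""
    points = list(zip(deltas_hz, percent_correct))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 < criterion <= p1:
            return d0 + (criterion - p0) * (d1 - d0) / (p1 - p0)
    raise ValueError("criterion is not bracketed by the data")


def weber_fraction(dl_hz, base_frequency_hz):
    """Ratio of the difference limen to the test frequency."""
    return dl_hz / base_frequency_hz


if __name__ == "__main__":
    deltas = [20, 25, 35, 50, 100]     # frequency differences (Hz), hypothetical
    correct = [60, 70, 80, 90, 100]    # percent correct, hypothetical
    dl = interpolate_dl(deltas, correct)
    print(dl)                          # 30.0
    print(weber_fraction(dl, 1000.0))  # 0.03
```

A DL of 30 Hz at a base frequency of 1000 Hz thus corresponds to a Weber fraction of 0.03.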

Fig. 10.19

Psychometric function at a tone frequency of 1000 Hz (left) and a graph of the Weber fraction across frequency (middle) collected from an Indian elephant (right). Left: A psychometric function showing percent correct detection of a frequency difference between two tones. The base frequency is 1000 Hz, and frequency differences range from 20 to 100 Hz. The solid gray line shows the elephant’s performance and the dashed gray line shows the 75% correct criterion for the frequency DL. At 1000 Hz, the frequency difference limen is 30 Hz. Middle: The Weber fraction (ΔF/F) increases with frequency. The Weber fraction is low at frequencies of 250 and 500 Hz, indicating good ability to discriminate frequency differences, and increases at higher frequencies, indicating poorer acuity. Data collected by Heffner and Heffner (1982). Image of the elephant from Evelyn Fuchs, University of Vienna

The psychometric function illustrated in Fig. 10.19 is based on actual data points. Some investigators use a statistical procedure called Probit Analysis to find the best-fitting regression line through the data points, and then base the estimate of the DL on that regression (Levitt 1970). The center of the best-fitting regression line can then be taken as the most probable threshold value. Probit analysis is useful because it provides a standard error for the hearing threshold values.
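The idea behind a probit fit can be sketched as follows. This is a simplified illustration and not a full Probit Analysis: proportions are transformed to z-scores and fitted with an ordinary unweighted regression line, whereas a complete analysis uses weighted maximum-likelihood fitting and also returns the standard error mentioned above. The correction for guessing in a two-choice task, the function name, and the data values are all assumptions made for this example.

```python
from statistics import NormalDist


def probit_threshold(stimulus, proportion_correct, chance=0.5):
    """Stimulus value at the center of a straight line fitted to z-scores.

    With chance = 0.5, the center (z = 0) corresponds to 75% correct.
    """
    # Rescale so that chance performance maps to 0 and perfect performance to 1.
    corrected = [(p - chance) / (1.0 - chance) for p in proportion_correct]
    # inv_cdf is defined only for proportions strictly between 0 and 1.
    z = [NormalDist().inv_cdf(p) for p in corrected]

    # Ordinary least-squares fit of z = intercept + slope * stimulus.
    n = len(stimulus)
    mean_x = sum(stimulus) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in stimulus)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(stimulus, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x

    # The center of the fitted line is where z crosses zero.
    return -intercept / slope


if __name__ == "__main__":
    deltas_hz = [10, 20, 30, 40, 50]           # hypothetical differences
    p_correct = [0.55, 0.65, 0.75, 0.85, 0.95]  # hypothetical proportions
    print(round(probit_threshold(deltas_hz, p_correct), 3))
```

Because the fitted line uses all data points, the threshold estimate is less sensitive to variability at any single stimulus value than an estimate interpolated between two neighboring points.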

Intensity DLs are estimated using procedures similar to those used for estimating frequency DLs, except that tone frequency is kept constant while tone intensity is varied. Difference limens are also commonly measured for noise. These measurements are useful for estimating a species’ dynamic range of hearing, the intensity range over which changes in sound levels can be perceived. Determining an animal’s sensitivity to the depth of amplitude modulation in a sound and its ability to detect a short, silent gap between two sounds are also problems of intensity discrimination.

10.5.2 Frequency Selectivity

Frequency selectivity refers to the perceptual ability to discriminate two simultaneous signals of different frequency (e.g., a signal against noise). Behavioral measures of frequency selectivity are used to estimate the width of internal auditory filters (i.e., the physical space including number of hair cells and portion of the sensory epithelia) devoted to a particular frequency or frequency range along the basilar membrane or sensory surface in the inner ear. Thus, behavioral measures of frequency selectivity provide an estimate of the resolving power of the ear. Physiological techniques are used to provide a more direct measurement. Auditory filters are often thought of as a series of contiguous bands of frequency in which the auditory system analyzes incoming sound, and sounds of different frequencies are processed in different filters (i.e., independently of one another) without mutual interference. For ease of modeling, auditory filters often are assumed to be rectangular in shape. For very sharp frequency selectivity, hence good ability to separate signals from noise, auditory filters should be narrow. Wide auditory filters are susceptible to greater masking. Different measures of frequency selectivity exist (e.g., Fletcher critical bands, critical bandwidths, equivalent rectangular bandwidths, etc.; Fig. 10.20).

Fig. 10.20

Graph of frequency selectivity in marine mammals. *: Critical bandwidths. ★: Equivalent rectangular bandwidths. +: 3-dB bandwidths. O: 10-dB bandwidths. Some of these data were collected behaviorally, others electrophysiologically. For pinnipeds, both in-air and underwater measurements are shown (Erbe et al. 2016). © Erbe et al. 2016; https://www.sciencedirect.com/science/article/pii/S0025326X15302125. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Critical Ratio

The critical ratio (CR) can be thought of as the minimum signal-to-noise ratio for detecting a tone against a background of broadband masking noise. It is defined as the mean-square sound pressure of a narrowband signal (e.g., a tone) divided by the mean-square sound pressure spectral density of the masking noise, at the level at which the signal is just detectable (ISO 18405:2017). ‘Just detectable’ again refers to a specified fraction of trials in behavioral experiments. The CR is typically expressed as a level quantity in dB with a reference value of 1 Hz. Therefore, the CR can also be computed as the difference between the sound pressure level of the signal and the power spectral density level of the noise, both taken at detection threshold. To measure the CR, the level of the signal (or of the noise) is varied. As with measuring audiograms, the CR can be measured behaviorally using the Method of Constant Stimuli, the Method of Limits, or the Up/Down Staircase method. The CR can also be measured electrophysiologically.
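The two equivalent ways of computing the CR described above can be sketched in Python. The function names and all numerical values are hypothetical and are used only to show that the ratio form and the level-difference form agree.

```python
import math


def critical_ratio_from_levels(signal_spl_db, noise_psd_level_db):
    """CR (dB re 1 Hz) as the signal level minus the noise spectral density
    level, both taken at detection threshold."""
    return signal_spl_db - noise_psd_level_db


def critical_ratio_from_ratio(signal_mean_square, noise_spectral_density):
    """CR (dB re 1 Hz) from the mean-square sound pressure of the signal
    divided by the mean-square sound pressure spectral density of the noise.
    The inputs must share the same pressure unit (e.g., Pa^2 and Pa^2/Hz)."""
    return 10.0 * math.log10(signal_mean_square / noise_spectral_density)


if __name__ == "__main__":
    # Hypothetical masked threshold: the tone is just detectable at 80 dB
    # in noise with a spectral density level of 60 dB.
    print(critical_ratio_from_levels(80.0, 60.0))  # 20.0
    # Equivalent ratio form: the signal mean-square pressure is 100 times
    # the noise mean-square pressure per hertz.
    print(critical_ratio_from_ratio(100.0, 1.0))   # 20.0
```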

CR measurements are relatively easy to obtain and are thus available for a number of species. In the horseshoe bat (Rhinolophus ferrumequinum) and in the green treefrog, for example, CRs are lowest, implying sharper filters, at the spectral peaks within these species’ echolocation and advertisement calls, respectively (Long 1977; Moss and Simmons 1986). In many other species, CRs gradually increase with tone frequency (e.g., Fay 1988; Erbe et al. 2016). In the absence of CR data, 1/3 octave bands are often used (in particular in the noise impact assessment literature). While this is a good approximation in birds (e.g., Dooling and Blumenrath 2013), in several species, 1/3 octave bands overestimate CRs at some frequencies (Fig. 10.21).
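To compare a measured CR with the 1/3 octave approximation, the width of the 1/3 octave band can be expressed as a level in dB re 1 Hz. The sketch below assumes base-2 fractional octave bands, with band edges one sixth of an octave below and above the center frequency; the function names and the CR value used for comparison are hypothetical.

```python
import math


def fractional_octave_bandwidth_hz(center_hz, fraction=3):
    """Width (Hz) of a 1/fraction octave band centered on center_hz,
    assuming base-2 octaves."""
    half_step = 2.0 ** (1.0 / (2 * fraction))
    return center_hz * (half_step - 1.0 / half_step)


def bandwidth_level_db(bandwidth_hz):
    """Bandwidth expressed as a level in dB re 1 Hz."""
    return 10.0 * math.log10(bandwidth_hz)


if __name__ == "__main__":
    bw = fractional_octave_bandwidth_hz(1000.0)
    level = bandwidth_level_db(bw)
    print(f"1/3 octave band at 1 kHz: {bw:.1f} Hz, {level:.1f} dB re 1 Hz")

    hypothetical_cr_db = 20.0  # illustrative CR at 1 kHz, not a measured value
    print(f"difference from CR: {level - hypothetical_cr_db:.1f} dB")
```

A positive difference indicates that the 1/3 octave band is wider than the band implied by the CR at that frequency, which is the sense in which 1/3 octave bands overestimate CRs.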

Fig. 10.21

Graphs of critical ratios in dB re 1 Hz of marine mammals under water (Erbe et al. 2016). Fractional octave lines are shown for comparison. © Erbe et al. 2016; https://www.sciencedirect.com/science/article/pii/S0025326X15302125. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

The CR is often taken as an estimate of the width of the auditory filters. In this case, it should be referred to as the Fletcher critical band (ANSI/ASA S3.20-2015). If CR is in dB re 1 Hz, then the Fletcher critical band in Hz is computed as 10^(CR/10). The Fletcher critical band is an indirect estimate of the size of the auditory filter. It is a good approximation in some bird species (Langemann et al. 1995) but in many other species differs from a more direct measure, the critical bandwidth.

Critical Bandwidth

The critical bandwidth (CB) refers to a band of frequencies within which sound at any frequency can interfere with sound at the center frequency (ANSI/ASA S3.20-2015; ISO 18405:2017). The critical bandwidth is typically measured in noise-widening experiments. The listener tries to detect a tone at the center of a band of masking noise. As the noise band is widened, the level of the tone has to increase for it to remain audible. Eventually, a bandwidth is reached beyond which further widening of the masking noise band no longer affects the level of the tone at detection threshold. This is the critical bandwidth. The difference between a CR and a CB experiment thus is that the listener has to detect a tone in broadband masking noise in the former and in noise of variable (increasing) bandwidth in the latter. CBs are time-consuming to collect, because they require determining masked thresholds at each tone frequency at many different noise bandwidths. For this reason, measurements of CB are available for fewer species than are measurements of CR.

Psychophysical Tuning Curves

Psychophysical tuning curves are another measure of behavioral frequency selectivity. In these experiments, a tone is fixed in frequency and amplitude just above (typically, 10 dB) its absolute threshold. The animal is trained to detect the tone in the presence of a masker (either other tones or narrowband noise). The masker can be presented simultaneously with the tone (simultaneous masking), or prior to the tone (forward masking). Psychophysical tuning curves are typically V-shaped, so that as the frequency separation between the tone and the masker increases, the level of the masker required to mask the tone increases (Fig. 10.22). They are similar in shape to tuning curves of auditory nerve fibers, and so can provide non-invasive estimates of neural frequency selectivity (Serafin et al. 1982). The drawback of this technique is that it is time-consuming to conduct, so that data are available for only a few animal species.

Fig. 10.22

Psychophysical tuning curves (left) for the Pig-tailed macaque monkey (Macaca nemestrina; right), measured in a forward masking paradigm. Animals were trained to detect tones using positive reinforcement. Tones were presented via earphones, and the animals were seated inside a sound-attenuating chamber. Masked thresholds to probe tones (0.5, 2, and 8 kHz; blue, dark red, dark gray, respectively; x-axis) were determined using an adaptive tracking procedure and defined as the mean of eight reversal points at each frequency. Probe tones (25-ms duration) were presented at a level of 10 dB above absolute threshold. Masker tones (130-ms duration, with frequencies varying around that of the probe tone) were presented 2 ms before the onset of the probe tone. The blue, dark red, and dark gray curves show the psychophysical tuning curves plotting the level of the masker (y-axis) needed to just mask the probe tone at each masker frequency. The black dashed line shows the animals’ absolute thresholds (audiogram). Data collected by Serafin et al. (1982). © Stauss, 2006; https://commons.wikimedia.org/w/index.php?curid=1733069. Licensed under CC BY-SA 3.0; https://creativecommons.org/licenses/by-sa/3.0/

10.6 Summary

Describing and quantifying the hearing capabilities of different animals is essential in bioacoustical studies. Basic features of hearing, such as the range of audibility, thresholds of hearing as a function of frequency, and the frequency range of best hearing, are easily shown on an audiogram. Hearing sensitivities are best in young, healthy animals and may decline in some animals as they age or if they are exposed to ototoxic antibiotics. Acute exposure to high-amplitude noise or long-term exposure to lower levels of noise also can temporarily or permanently reduce hearing sensitivity.

A variety of behavioral and physiological methods can be used to test hearing in live animals. The aims of a study and the characteristics of the animals should be considered carefully when selecting the appropriate audiometric methods to use. This chapter described common behavioral and physiological methods, along with some of their strengths and weaknesses. Testing hearing abilities in animals is not as easy as in humans because animal subjects cannot verbally report to the researcher when a test signal is heard. Instead, animals indicate that they heard a sound by making unlearned or learned responses in behavioral studies. Thresholds based on conditioned responses are the most accurate and reliable, but conditioning procedures are not suitable for all animals or research questions. Some animals are not trainable or are unable to participate in a behavioral study due to age, health, or some other factor. Physiological methods, especially auditory brainstem response testing, can be particularly helpful in these situations. While ABR and other physiological methods provide useful information about auditory function, it is important to recognize that the results they provide are not equivalent to those from behavioral studies that assess hearing directly; thresholds obtained using physiological methods may under- or over-estimate behavioral thresholds in an unpredictable manner.

Research on hearing abilities in animals has advanced beyond documenting the basic audiogram of a species. Data on frequency and intensity discrimination, sound localization, and the effects of noise on hearing in animals are current topics of study for many animal species. Information on hearing and an animal’s abilities to adapt to noise can have important applications for the conservation of species in areas of high anthropogenic noise.