7.1 Introduction

Whether listening in a forest or on an open plain, by the side of a river or in the ocean, at the outskirts of suburbia or right downtown, the Earth abounds with sounds. The use of the term “soundscape” in the literature has increased rapidly since 2000 (Fig. 7.1) and can be traced back to Southworth’s (1969) article on the sonic environment of Boston, MA, USA. The Canadian music composer and researcher Schafer later defined soundscapes as “the auditory properties of landscapes” (Schafer 1977). Schafer was a pioneer in highlighting the need for soundscape research and management. In his book, The New Soundscape, Schafer and his students documented rapid changes in soundscapes over the course of human civilization (Schafer 1969). Common settings of primitive cultures surrounded by an abundance of natural sounds (e.g., wind, water, and animals) rapidly changed after the Industrial Revolution to cities dominated by sounds from machinery. Schafer further noticed that most people had ceased to listen to the sounds of the environment and actively tried to ignore unpleasant sound (i.e., noise). With the goals of studying and archiving soundscapes, creating public awareness of noise pollution, and creating healthy soundscapes through acoustic design, Schafer founded the World Soundscape Project (WSP 1972–1979; Torigoe 1982). Soundscape studies by the WSP were human-centered, focusing on the acoustic composition of cities and villages, studying only humans as receivers of acoustic information, and emphasizing the negative effects of noise on humans (Truax 1984, 1996). Krause (1987, 1993) adopted an animal-centered approach to the study of soundscapes. He recorded and archived sounds of different animal species as well as of entire ecosystems. According to Krause, acoustic sampling of an area over a period of time and under different conditions allows us to study, and ultimately predict, how human-induced changes might affect ecosystems (Krause 1987).

Fig. 7.1
Number of articles with “soundscape” in the abstract, listed by Scopus, versus publication year; retrieved 10 June 2022

While the term “soundscape” has different uses in the literature, the International Organization for Standardization officially defined “soundscape” as “an acoustic environment as perceived or experienced and/or understood by a person or people, in context” and “acoustic environment” as the “sound at the receiver from all sound sources as modified by the environment” (International Organization for Standardization [ISO] 2014). A soundscape is thus a perceptual construct that requires a human listener, while the acoustic environment is a physical phenomenon, extending in frequency beyond the human hearing limits, including infrasounds and ultrasounds. In the field of underwater acoustics, however, a soundscape is the “characterization of the ambient sound in terms of its spatial, temporal and frequency attributes, and the types of sources contributing to the sound field” (International Organization for Standardization [ISO] 2017). “Soundscape” in underwater acoustics thus does not require a listener. In essence, the usage of the term “soundscape” in the literature is variable and perhaps related to specific research objectives (Scarpelli et al. 2020).

The components of a soundscape may be grouped by their origin. Sounds produced by animals are grouped as biophony, sounds produced by atmospheric or geophysical events make up the geophony, and sounds produced by human activities or machinery are referred to as anthropophony (Fig. 7.2; Krause 2008). Sounds created by machinery (including power generators, motors, etc.) are sometimes grouped as technophony (Mullet et al. 2016), which is the component of anthropophony typically associated with noise pollution. The identification of soundscape components is a key element in the research field of ecoacoustics, which investigates the relationship of natural and anthropogenic sounds with the environment on a range of scales in space and time (Farina and Gage 2017). The research field of soundscape ecology investigates the interaction of organisms with their environment, mediated through sound (Pijanowski et al. 2011a, b). For example, sound sources distributed within an environment provide acoustic cues (i.e., soundmarks), by which animals can orientate, navigate, and make habitat choices (Slabbekoorn and Bouton 2008). Under the Acoustic Habitat Hypothesis, the habitats that sound-dependent species select and occupy exhibit acoustic characteristics that suit a species’ functional needs and match its sound production and reception capabilities (Mullet et al. 2017a). Acoustic habitat specialists are species whose acoustic habitat is unique and vital to its functional needs, while acoustic habitat generalists occupy acoustic habitats that are less than unique but still important to the species’ functional needs (Mullet et al. 2017a). Under the Acoustic Adaptation Hypothesis, the sounds of soniferous animals evolved to optimize propagation within the animals’ habitat (Morton 1975), characterized by its soundscape and sound propagation conditions. 
Under the Acoustic Niche Hypothesis, animals evolved species-specific sounds in certain frequency bands and temporal patterns to minimize competition (i.e., masking) with sounds from other animals and the environment (Krause 1993). An interesting and related question is how animal (and human) listeners make sense of the myriad of sounds received from all directions, overlapping in frequency and time, and thus masking each other. A listener must separate the parts belonging to different sources and merge the parts belonging to the same source to make sense of the acoustic scene. This is called auditory scene analysis (Bregman 1990; Lewicki et al. 2014).

Fig. 7.2
Sketch of the sound sources within soundscapes ranging from wilderness to countryside, to city. Biophony decreases and anthropophony increases while the geophony might vary comparatively little. Example species are sketched along the way with decreasing density and biodiversity. Acoustic habitat generalists occur in multiple, different soundscapes, while acoustic habitat specialists only occur in quite specific soundscapes (Mullet et al. 2017a)

Natural soundscapes are appreciated for their esthetic and recreational value (e.g., Davies et al. 2013; Francis et al. 2017; Franco et al. 2017) and also have a significant ecological and scientific value. Soundscapes should, therefore, be considered a natural resource, worthy of study, management, and conservation (National Park Service [NPS] 2000; Farina and Gage 2017; Pavan 2017). How many undisturbed soundscapes remain in this world of decreasing biodiversity, changes in land-use, and rising anthropogenic noise? Can the soundscape of a pristine habitat function as a model to restore a degraded habitat (Pavan 2017; Gordon et al. 2019; Righini and Pavan 2020)? This chapter gives an overview of terrestrial and aquatic soundscapes, outlines how soundscapes may change or have changed over time, provides tools for analyzing and quantifying soundscapes, and discusses how passive acoustic monitoring applies to soundscape ecology research, management, and conservation.

7.2 Terrestrial Soundscapes

Terrestrial soundscapes may vary widely within as well as between ecosystems (e.g., Krause 2012; Yip et al. 2017; Priyadarshani et al. 2018). While some soundscapes might have been studied more than others (Scarpelli et al. 2020), there often are key sounds (i.e., sounds characteristic of an ecosystem) by which an ecosystem may be identified. For example, a listener may identify the terrestrial soundscape of a nearshore ecosystem off central California, USA, by the barks of California sea lions (Zalophus californianus), the squawks of sea gulls (Larus californicus), and the tapping sounds made by sea otters (Enhydra lutris) that use a rock to crack open shellfish.

7.2.1 Biophony

The terrestrial biophony includes sounds produced by insects (e.g., Brady 1974; Römer and Lewald 1992; Polidori et al. 2013), anurans (e.g., Cunnington and Fahrig 2010; Zhang et al. 2017), reptiles (e.g., Crowley and Pietruszka 1983; Galeotti et al. 2005), birds (e.g., Lengagne et al. 1999; Charrier et al. 2001; Catchpole and Slater 2008), bats (e.g., Gadziola et al. 2012; Prat et al. 2016), and other mammals (such as dogs and seals; e.g., van Opzeeland et al. 2010; Mumm and Knörnschild 2014; Bowling et al. 2017). Typically, multiple (vocal) taxa occur in the same environment and so, evidence for the Acoustic Niche Hypothesis has been demonstrated in various ecosystems among insects (Sueur 2002), anurans (Villanueva-Rivera 2014), birds (Azar and Bell 2016), and a combination of species (Hart et al. 2015).

Terrestrial soundscape ecology studies have been dominated by research on birds (Ferreira et al. 2018). Most bird species are diurnal vocalizers, with peak activity at dawn and dusk. Birds may emit single calls as well as sounds arranged into long and complex songs (Fig. 7.3). Calls have a variety of functions and are, for example, produced to raise alarm (Gill and Bierema 2013), contact conspecifics (Bond and Diamond 2005), or beg for food (Klenova 2015). While bird song was long thought to be an exclusive male trait used for territorial defense and female attraction, there is mounting evidence that female bird song is globally widespread and used for territorial and reproductive purposes (Odom et al. 2014). Terrestrial birds primarily communicate within the frequency range of human hearing, with recorded fundamental frequencies (see Chap. 4) as low as 23 Hz for southern cassowary (Casuarius casuarius; Mack and Jones 2003) and as high as 13 kHz for the Ecuadorian hillstar hummingbird (Oreotrochilus chimborazo; Duque et al. 2018). Marine birds that are heard within terrestrial soundscapes produce calls with fundamental frequencies <2 kHz (e.g., Charrier et al. 2001; Bourgeois et al. 2007; Cure et al. 2009; Mulard et al. 2009; Dentressangle et al. 2012). Lesser-known sounds of birds are those produced by wings while in flight and while perched (Clark 2021). Because these sounds may be audible to the animal itself, conspecifics, and other species (e.g., predators and prey), Clark (2021) suggested that these sounds may be selected to evolve from by-product to communication signal.

Fig. 7.3
Soundscape of a temperate forest at dusk showing song of the chiffchaff (Phylloscopus collybita), squawks of a mallard duck (Anas platyrhynchos), and calls from a marsh frog (Pelophylax ridibundus)

Insects are another common source of biophony, with seasonal and diurnal choruses produced by cicadas and crickets at dominant frequencies between 2 and 50 kHz (Bennet-Clark 1970; Robillard et al. 2013; Hart et al. 2015; Buzzetti et al. 2020). These typically male insect choruses, produced to attract females, can be intense and potentially affect the timing and frequency of other species’ vocalizations. Hart et al. (2015), for example, found that birds in a Costa Rican tropical rainforest either ceased vocalizing or changed their call frequency to avoid acoustic overlap with cicada choruses (Fig. 7.4). As do birds, insects produce sounds in flight, with dominant frequencies between 140 and 250 Hz (Fig. 7.5; Kawakita and Ichikawa 2019).

Fig. 7.4
A comparison of the soundscapes at two different moments of the morning in a secondary wet forest at Las Cruces Biological Station, Costa Rica. Top spectrogram recorded minutes prior to the onset of Zammara smaragdina cicada morning choruses, displaying vocalizations from seven bird species (Arremon aurantiirostris, Picumnus olivaceus, Arremon torquatus, Catharus aurantiirostris, Arremon aurantiirostris, Phaeothlypis fulvicauda, and Formicarius analis). Bottom spectrogram recorded at the same location just after the onset of cicada morning choruses. © Hart et al. (2015); https://academic.oup.com/view-large/figure/79529274/beheco_arv018_f0001.jpeg. Published under CC BY 3.0; https://creativecommons.org/licenses/by/3.0/

Fig. 7.5
Spectrograms of the flight sound produced by the European honeybee (Apis mellifera; a) and the Japanese yellow hornet (Vespa simillima xanthoptera; b). Sound files from Kawakita and Ichikawa (2019). Spectrogram of chorusing frogs in a pond in Colli Euganei, Italy. Yellow-bellied toad (Bombina variegata) with 500-Hz tonals and overtones and the European tree frog (Hyla arborea) with higher-pitched, broadband sounds starting at around 5 s and increasing in intensity and bandwidth from 13 s onwards (c). Recording courtesy of Marco Pesente

Social wasps, honeybees, bumble bees, and some hoverflies produce sounds with dominant frequencies between 152 and 317 Hz when attacked by predators, potentially as a warning signal (Rashed et al. 2009). Smaller velvet ants (family of wasps) also produce distress calls but at higher frequencies between 4 and 17 kHz (Polidori et al. 2013). Ants produce distress calls extending in frequency above 70 kHz (Pavan et al. 1997).

In many anuran species, males aggregate and produce evening choruses of varying complexity to advertise for females (i.e., courtship vocalizations; Grafe 2005). Most male anuran species cycle air through a vocal sac to produce calls with main energy between 400 Hz and 10 kHz (Fig. 7.5c; Cunnington and Fahrig 2010; Narins and Meenderink 2014; Villanueva-Rivera 2014), although some species produce sounds that extend into the ultrasonic range (i.e., >20 kHz; Feng et al. 2006; Arch et al. 2008). White-lipped frogs (Leptodactylus albilabris) also thump their vocal sac on the underlying substrate while vocalizing, thereby creating a seismic signal, which potentially plays a role in seismic communication with conspecifics (Narins 1990).

Courtship vocalizations have also been recorded for at least 35 species of tortoises. Call characteristics of 11 tortoise species were studied in detail by Galeotti et al. (2005), revealing dominant frequencies between 110 and 600 Hz and energy between 100 Hz and 3 kHz. Snakes may produce a broadband hiss (3–13 kHz; Young 1991), rattle (2–23 kHz; Young and Brown 1993), or rasping sound (200 Hz–11 kHz; Young 2003) when threatened. Crocodiles produce sounds with main energy <2 kHz (e.g., Vergne et al. 2009, 2011; Reber et al. 2017). Crocodile hatchlings emit calls before, during, and after hatching, which function to synchronize hatching, alert the mother to their due arrival, and stay in contact (Vergne et al. 2011; Chabert et al. 2015). Adult crocodiles produce calls during courtship, during territorial defense, and to maintain group cohesion with offspring (Fig. 7.6; Vergne et al. 2009; Reber et al. 2017).

Fig. 7.6
Male (a) and female (b) American alligator (Alligator mississippiensis) bellows that may be produced during courtship and territorial defense (Vergne et al. 2009). Modified from Reber et al. (2017). © Reber et al. (2017); https://www.nature.com/articles/s41598-017-01948-1/figures/2. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Mammalian species vocalize at frequencies that, for some taxa, are inversely related to their body size (Bowling et al. 2017). African elephants (Loxodonta africana) and Asian elephants (Elephas maximus), for example, vocalize within the infrasonic range (i.e., <20 Hz; fundamental frequency as low as 14 Hz). These low-frequency calls function to coordinate movement and to advertise an individual’s reproductive status over distances as far as 2.5 km (Soltis 2010). Elephants also produce vibrations that propagate through the substrate and so provide additional cues to listening conspecifics (Payne et al. 1986; O’Connell-Rodwell et al. 2000). The majority of aerial feeding bats, at the opposite end of the body-size scale, produce short echolocation calls (biosonar) in the ultrasonic range (15–110 kHz), for navigation and hunting (Fenton et al. 1998). Bat social calls, potentially related to agonistic encounters and courtship, are also characterized by harmonics that extend well into the ultrasonic range (Fig. 7.7; Behr and van Helversen 2004; Lattenkamp et al. 2019).

Fig. 7.7
Common social calls with ultrasonic components emitted by the pale spear-nosed bat (Phyllostomus discolor). Modified figure. © Lattenkamp et al. (2019); https://www.frontiersin.org/files/Articles/447704/fevo-07-00116-HTML/image_m/fevo-07-00116-g002.jpg. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Primate vocalizations cover a wide frequency range from approximately 100 Hz in western gorillas (Gorilla gorilla; Salmi et al. 2013) to 16 kHz in pygmy marmosets (Cebuella pygmaea; Pola and Snowdon 1975). Primate vocalizations play an important role in intergroup communication, predominantly facilitating social interactions and group movement (Cheney and Seyfarth 1996, 2018). Primates are also known to use various alarm calls, which were previously suggested to be functionally referential signals (e.g., Cheney and Seyfarth 1996). However, recent studies have shown that primates often use general alarm calls and infer meaning from previous experiences or contextual information (Fichtel 2020).

Marine mammals, such as polar bears (Ursus maritimus), pinnipeds (i.e., seals, sea lions, and walruses), and sea otters (Enhydra lutris nereis) also produce in-air sounds. Nursing female polar bears frequently emit a low-intensity, repetitive, pulsed sound when initiating or continuing body contact with their cub (20 Hz–2 kHz; Wemmer et al. 1976). Pinnipeds produce in-air sounds with main energy <9 kHz (Fig. 7.8). Mother and pup recognize each other by individually unique calls that help them to reunite amidst all other individuals of the colony (Insley et al. 2010), while males produce individually unique calls during agonistic behavior (e.g., Fernández-Juricic et al. 1999; Van Parijs and Kovacs 2002). Female and pup sea otters produce individually distinct calls with main energy <5 kHz, which also seem to function as contact calls between separated individuals (McShane et al. 1995).

Fig. 7.8
In-air vocalizations produced by (a) a New Zealand fur seal (Arctocephalus forsteri) and (b) an Australian sea lion (Neophoca cinerea). © Erbe et al. (2017); https://doi.org/10.1007/s40857-017-0101-z. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Urbanized areas may be characterized by the sounds of domesticated animals (i.e., pets and livestock). Dogs bark to greet conspecifics and humans, during play (i.e., excitement), when raising alarm, or when seeking attention (Yin and McCowan 2004), sometimes to the nuisance of the neighborhood (Flint et al. 2014). Barks are short acoustic signals with main energy between 300 Hz and 2.5 kHz (Fig. 7.9), often repeated in bouts (Yin and McCowan 2004). Ewes and their lambs recognize each other by unique calls with main energy <5 kHz (Sèbe et al. 2008), resulting in a cacophony of bleats in lambing season.

Fig. 7.9
Example spectrograms of dog barks (a) and bleating sheep (b). Sheep bleats were produced by a ewe (solid box), her lamb (dashed box), and a distant lamb (dotted box)

7.2.2 Geophony

The prevailing geophonic source of sound is wind. Wind acts on vegetation, thereby contributing to sound levels <1 kHz in leafless trees, <4 kHz in leafed trees, and <10 kHz in open grasslands, with a positive correlation between wind speed and sound intensity (Boersma 1997; Bolin 2009). Wind noise may affect the audible range of biological sounds. The detection of bird song in open grasslands in New Zealand significantly decreased with increasing wind speeds from calm (<4 km/h) to windy (>15 km/h) conditions (Priyadarshani et al. 2018). Precipitation also creates sound (Fig. 7.10). Rain increased sound levels within a deciduous forest (Ardennes, France) within the frequency band of 100 Hz to 10 kHz (Lengagne and Slater 2002). The increase in sound levels resulted in a reduction of acoustic communication space (i.e., area over which an individual can communicate with conspecifics) for tawny owls (Strix aluco) to 1/69th of the space without rain, with a simultaneous marked decrease in vocal activity. Thunder is the most common loud natural sound with a peak frequency near 100 Hz, although sounds extend into the infrasonic and mid-frequency range (250 Hz–4 kHz; Fig. 7.10). Other sources of terrestrial geophony are rivers, waterfalls, earthquakes, and volcanic eruptions. Infrasonic monitoring of soundscapes can identify the location of continuous geophonic sound sources, such as waterfalls and seismic activity, as well as transient (i.e., short-duration) sound sources, such as thunder, up to distances of 10 km (Johnson et al. 2006).
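To put the reported reduction to 1/69th of the communication space into perspective, the following back-of-the-envelope sketch converts an area reduction into a range reduction. It assumes, purely for illustration, that the communication space is a circle centered on the caller, so that area scales with the square of the maximum communication distance. The circular geometry and the function name are assumptions made here and are not taken from Lengagne and Slater (2002).

```python
import math


def range_fraction(area_fraction: float) -> float:
    """Fraction of the original communication radius that remains when the
    communication area shrinks to `area_fraction` of its original size.

    Assumption: circular communication space, so area ~ radius squared.
    """
    if not 0.0 < area_fraction <= 1.0:
        raise ValueError("area_fraction must be in (0, 1]")
    return math.sqrt(area_fraction)


# Area reduced to 1/69th of the no-rain value, as reported for tawny owls.
remaining = range_fraction(1.0 / 69.0)
print(f"Remaining range: {remaining:.1%} of the range without rain")
print(f"Range shrinks by a factor of about {1.0 / remaining:.1f}")
```

Under this simplifying assumption, a 69-fold loss in area corresponds to a roughly 8-fold loss in communication distance.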

Fig. 7.10
Spectrogram of a thunderstorm recorded in the Netherlands, depicting high-frequency (i.e., >8 kHz) sound from raindrops falling nearby, constant high-frequency (i.e., 9–12 kHz) rain in the background, and low-frequency (i.e., <1 kHz) sound from thunder

7.2.3 Anthropophony

Anthropophony identifies the presence and activities of human beings. Some of these sounds give cues about local culture, tradition, language, working habits, and religion (e.g., voices, music, cow and sheep bells, church bells, etc.) and can enrich a soundscape (Stack et al. 2011; Pavan 2017). However, with the Industrial Revolution, new sound sources have emerged at an unprecedented level and spatial extension, with consequent impacts on natural soundscapes and human health.

Terrestrial anthropophony includes sounds from transportation (e.g., road vehicles, trains, snowmobiles, ships, and airplanes; Ernstes and Quinn 2016; Mullet et al. 2017b; White et al. 2017; Duarte et al. 2019), recreational boats (Kariel 1990; Bernardini et al. 2019), machinery (e.g., excavation devices, drilling devices, generators, and chain saws; Potočnik and Poje 2010; Deichmann et al. 2017), gunshots (Wrege et al. 2017), fireworks (Kukulski et al. 2018), and outdoor events (Greta et al. 2019; Kaiser and Rohde 2013). The intensity of anthropophony correlates with the degree of urbanization (Joo et al. 2011; Kuehne et al. 2013) and is considered noise pollution with an impact on both human (European Environment Agency [EEA] 2014) and animal health (Barber et al. 2010; Shannon et al. 2016), potentially affecting entire ecosystems (Pavan 2017).

Low-frequency sound, mostly generated by engines, propagates over large distances and appears to be the most invasive and pervasive sound related to transportation infrastructures. Sound from cars and heavy trucks caused by tire-pavement interaction, aerodynamic sources, and engines peaks around 100 Hz (Rochat and Reiter 2016), but may reach as high as 10 kHz when measured close to the source (Fig. 7.11a). Both birds (e.g., Halfwerk and Slabbekoorn 2009) and anurans (e.g., Cunnington and Fahrig 2010; Caorsi et al. 2017) have been found to change vocal behavior in response to traffic noise (see Chap. 13). Conventional railway sound (i.e., electrified railway with a service speed <200 km/h) has a broad peak between 10 Hz and 2 kHz, whereas high-speed railway sound (i.e., electrified railway with a service speed >200 km/h) peaks <100 Hz (Di et al. 2014).

Fig. 7.11
(a) Spectrogram of a passing car at 2-m and a truck at 5-m distance. (b) Spectrogram of a commercial passenger airplane flying overhead at an altitude of ~300 m after take-off. Note the Doppler shift from high to low frequency (from 2.8 to 2 kHz) around the time of closest approach (at ~12 s) and the bird vocalizations between 7 and 9 kHz. (c) Spectrogram of a 3-m recreational power boat with a 3-hp 2-stroke engine, passing at 5-m distance; bird vocalizations within the gray dashed boxes. (d) Spectrogram of a jackhammer breaking tar

Sound from aircraft, especially near airports, is perceived by humans as a source of disturbance and may have negative effects on children’s learning, human sleep, and human health (Basner et al. 2017). In addition, sound during take-off and landing overlaps with biophony resulting in acoustic and behavioral responses (Fig. 7.11b; Sánchez-Pérez et al. 2013; Vidović et al. 2017). Birds near international airports in Spain, for example, were found to advance their dawn chorus to reduce overlap with aircraft sound (Gil et al. 2015), which is a common response to noise for urban species (Bermúdez-Cuamatzin et al. 2020). However, common chiffchaffs (Phylloscopus collybita) near airports in the UK and the Netherlands were found to sing songs with a lower maximum and peak frequency than conspecifics in nearby control areas, thus resulting in an increased overlap with aircraft sound (Wolfenden et al. 2019). In addition, airport populations sang at a slower rate and responded more aggressively to song playbacks. In South Africa, the critically endangered Pickersgill’s reed frog (Hyperolius pickersgilli) called more frequently and at higher frequencies during and after aircraft overflights than before (Kruger and Du Preez 2016). Even in wild remote areas, aircraft flying at ~8000 m altitude may produce noise below 500 Hz at 60 dB re 20 μPa (unweighted) at ground level (Pavan 2017; Farina et al. 2021). It is also essential to consider that take-off and landing corridors, where the noise levels are much higher, may cross more rural lands where airplane sound creates a stark contrast with ambient sound levels.

Smaller transport vehicles, such as powered two wheelers and snowmobiles, also contribute to the soundscape (Paviotti and Vogiatzis 2012; Mullet et al. 2017b). Mullet et al. (2017b) found that snowmobile noise, with main energy <2 kHz, affected 39% of the Alaskan wilderness open to snowmobiles and may mask vocalizations from common winter bird species. In-air ship noise from machinery and ventilation systems may propagate to areas near channels, ports, and coasts (Badino et al. 2012; Borelli et al. 2016). Small recreational power boats on lakes, on rivers, and near shore also increase in-air sound levels, predominantly below 1 kHz (Fig. 7.11c), with potential negative effects on bird species and hauled-out sea lions (York 1994; Tripovich et al. 2012).

Construction equipment may generate strong sounds that are audible over long ranges. Pneumatic tools, for example, generate repetitive, broadband sound (Fig. 7.11d). Heavy and stationary equipment, such as earth-moving machinery and air-compressors, generate sounds at frequencies <2 kHz (e.g., Berglund et al. 1996; Roberts 2009). Although one may associate construction sounds with urban areas, there are many examples in rural and remote areas, too. In the western Amazon (Peru), sounds from the construction and operation of a natural gas-well and pipeline (i.e., generators, helicopters, and pneumatic tools) were audible up to 250 m from the source (Deichmann et al. 2017). Anthropogenic sources in rural areas include farming machinery dominating <500 Hz (Gulyas et al. 2002), chainsaws recorded in forests with main energy between 100 Hz and 9 kHz (Potočnik and Poje 2010), and transient, broadband gunshots (Prince et al. 2019), which can provide valuable information on illegal hunting, in particular in remote areas that are difficult to patrol. In urban settings, additional sources of anthropophony originate from outdoor events, such as (music) festivals (Greta et al. 2019), fun parks (Kaiser and Rohde 2013), and Formula 1 races (Payne et al. 2012).

7.2.4 Sound Propagation in Terrestrial Environments

The propagation of sound, from its source through an environment, affects the local soundscape. In environments with good sound propagation conditions, sources from far away contribute to the local soundscape; whereas in environments with poor sound propagation conditions, only nearby sources contribute. Sound propagation is affected by air temperature, humidity, ground cover (bare rock versus grasslands or bush), wind, turbulence, and the presence of sound absorbers (e.g., snow), scatterers (e.g., trees), and reflectors (e.g., cliffs or buildings; see Chap. 5).

As sound spreads, it is transmitted into and through different media, absorbed, reflected, scattered, and diffracted. Many of these effects depend on frequency, meaning that sound propagates differently at different frequencies and that the environment changes the spectral characteristics of the sound. If the wavelength of sound is smaller than features of the environment (e.g., rocks), then sound will reflect. The wavelength can be computed as the ratio of sound speed (about 330 m/s in air) and frequency (e.g., a 100-Hz tone has a wavelength of about 3.3 m in air; see Chap. 4). At wavelengths much greater than features in the environment, sound will travel unhindered.
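The ratio of sound speed to frequency can be illustrated with a short sketch. The function name and the default of 330 m/s (the approximate value quoted above for air) are illustrative choices; the actual sound speed varies with temperature and humidity.

```python
def wavelength_m(frequency_hz: float, sound_speed_m_s: float = 330.0) -> float:
    """Wavelength in meters, computed as sound speed divided by frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return sound_speed_m_s / frequency_hz


# Wavelengths in air across the nominal range of human hearing.
for f in (20.0, 100.0, 1_000.0, 10_000.0):
    print(f"{f:>8.0f} Hz -> {wavelength_m(f):8.3f} m")
```

With 330 m/s, a 100-Hz tone has a wavelength of about 3.3 m, whereas a 10-kHz tone has a wavelength of only about 3.3 cm. Features such as tree trunks or rocks are therefore large relative to the wavelength of high-frequency sound and small relative to that of low-frequency sound.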

The air may be layered, with layers at different altitudes having different acoustic properties. Higher temperature and higher humidity increase the speed of sound. By Snell’s law of refraction, sound bends toward the horizontal when the speed of sound increases and away from the horizontal when the speed of sound decreases. During the day, temperature typically decreases with increasing altitude, leading to an upward refracting environment that exhibits so-called shadow zones that have reduced sound levels. In the morning or in winter, the air near the ground is often relatively cold, while there might be a warmer layer of air at higher altitude; this situation is called a temperature inversion. Sound is downward refracted and channeled close to the ground. Hence, in winter, sound might travel very far at low altitude (see Chap. 5).
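The refraction rule above can be sketched numerically for two air layers. With the angle measured from the horizontal, Snell's law reads cos(θ1)/c1 = cos(θ2)/c2. The sound speed formula used below, c ≈ 331.3·sqrt(1 + T/273.15) m/s with T in °C, is the common ideal-gas approximation for dry air. It is an assumption introduced here for illustration, not a formula given in this chapter, and it ignores humidity and wind. Both function names are hypothetical.

```python
import math
from typing import Optional


def sound_speed_air(temp_c: float) -> float:
    """Approximate sound speed (m/s) in dry air at temp_c degrees Celsius.

    Assumption: ideal-gas approximation; humidity and wind are ignored.
    """
    return 331.3 * math.sqrt(1.0 + temp_c / 273.15)


def refracted_angle_deg(theta1_deg: float, c1: float, c2: float) -> Optional[float]:
    """Angle from the horizontal (degrees) after passing from a layer with
    sound speed c1 into a layer with sound speed c2.

    Uses Snell's law in the form cos(theta1) / c1 = cos(theta2) / c2.
    Returns None when no transmitted ray exists, i.e., the ray is turned back.
    """
    cos_theta2 = math.cos(math.radians(theta1_deg)) * c2 / c1
    if cos_theta2 > 1.0:
        return None
    return math.degrees(math.acos(cos_theta2))


# Temperature inversion: cold air near the ground, warmer air above.
c_ground = sound_speed_air(5.0)
c_aloft = sound_speed_air(15.0)

for angle in (10.0, 20.0, 45.0):
    result = refracted_angle_deg(angle, c_ground, c_aloft)
    if result is None:
        print(f"{angle:4.0f} deg: ray is turned back toward the ground")
    else:
        print(f"{angle:4.0f} deg -> {result:5.1f} deg from the horizontal")
```

In this sketch, rays leaving the ground at shallow angles are turned back by the warmer layer, which is the channeling effect described for temperature inversions. Steeper rays pass through but are bent toward the horizontal.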

Vegetation attenuates sound, so in temperate areas with high vegetation, the same sound during summer propagates over shorter distances than during winter (Aylor 1972). Areas or seasons of full vegetative cover have soundscapes different from those bare of vegetation (Attenborough et al. 2012). Both temperature and humidity near the ground may change quickly; therefore, sound propagation conditions, soundscapes, and the communication space of terrestrial animals can vary within a few hours.

7.3 Aquatic Soundscapes

The vast majority of aquatic soundscape studies have focused on marine and estuarine environments, where soundscapes vary among geographic regions from the northern marginal ice-zone via equatorial regions to Antarctic waters (Haver et al. 2017), from the deep ocean (e.g., Dziak et al. 2017) to shallow coastal waters (e.g., McWilliam and Hawkins 2013), and from urban rivers (e.g., Marley et al. 2016) to estuarine reserves (e.g., Ricci et al. 2016). Soundscape studies in freshwater are less common but have covered a variety of settings from frozen lakes in Canada (Martin and Cott 2016) to urbanized lakes in the UK (Bolgan et al. 2016, 2018b), from pristine swamps in Costa Rica (Gottesman et al. 2020) to urbanized lowlands in the Netherlands (van der Lee et al. 2020), and from little streams in the USA (Holt and Johnston 2015) to the busy Ganges river in India (Dey et al. 2019). As in the terrestrial environment, each soundscape is characterized by a unique composition of biophony, geophony, and anthropophony.

Ambient sound encompasses all of the sounds at a given location and time, except for any specific signal of interest (International Organization for Standardization [ISO] 2017). Fig. 7.12 gives the spectra of characteristic ambient sounds in the ocean, as originally compiled by Wenz (1962), with updates from Cato (2008). Below 100 Hz, ambient sound is dominated by distant shipping, and, in shallow water, wind. Above 100 Hz, ambient sound is mostly wind driven. The prevailing limits of ambient sound decrease with increasing frequency from a maximum of 140 dB re 1 μPa²/Hz at 1 Hz to a minimum of 15 dB re 1 μPa²/Hz at 30 kHz. Above 30 kHz, molecular agitation limits the spectra of recorded ambient sound.
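The spectral density levels quoted above are logarithmic, which can hide how large the span between the upper and lower limits is. The sketch below converts levels in dB re 1 μPa²/Hz to linear units and estimates the level in a frequency band under the simplifying assumption that the spectral density is flat across the band. The function names and the 100-Hz example bandwidth are illustrative and do not come from Wenz (1962) or Cato (2008).

```python
import math


def psd_db_to_linear(level_db: float) -> float:
    """Convert a level in dB re 1 uPa^2/Hz to a linear value in uPa^2/Hz."""
    return 10.0 ** (level_db / 10.0)


def band_level_db(psd_level_db: float, bandwidth_hz: float) -> float:
    """Level (dB re 1 uPa^2) in a band of width bandwidth_hz.

    Assumption: the spectral density is constant across the band.
    """
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth_hz must be positive")
    return psd_level_db + 10.0 * math.log10(bandwidth_hz)


upper_db = 140.0  # prevailing upper limit at 1 Hz
lower_db = 15.0   # prevailing lower limit at 30 kHz

ratio = psd_db_to_linear(upper_db) / psd_db_to_linear(lower_db)
print(f"Difference: {upper_db - lower_db:.0f} dB, a factor of {ratio:.2e} in uPa^2/Hz")

# Example: level in a hypothetical 100-Hz-wide band at the lower limit.
print(f"Band level over 100 Hz: {band_level_db(lower_db, 100.0):.0f} dB re 1 uPa^2")
```

The 125-dB span between the two limits corresponds to more than twelve orders of magnitude in mean-square pressure spectral density.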

Fig. 7.12

Spectra of prevailing and local underwater sound sources between 1 Hz and 100 kHz (after Wenz 1962; Cato 2008)

7.3.1 Biophony

Aquatic species are well adapted to produce, sense, and use sounds in water (e.g., Schmitz 2002; Ladich and Winkler 2017). The aquatic biophony includes sounds produced by invertebrates (e.g., Iversen et al. 1963; Coquereau et al. 2016; Gottesman et al. 2020), frogs (Brunetti et al. 2017), turtles (e.g., Giles et al. 2009), fish (e.g., Kasumyan 2008; Bolgan et al. 2018b), birds (Thiebault et al. 2019), and mammals (e.g., Klinck et al. 2012; Erbe et al. 2017; Dey et al. 2019). The freshwater biophony is not well described, so sounds frequently cannot be linked to specific species (Rountree et al. 2019; Gottesman et al. 2020; Putland and Mensinger 2020). This lack of knowledge currently impedes the full utilization of freshwater soundscape studies as an ecological tool (Linke et al. 2020).

With regard to marine biophony, snapping shrimps are well-known contributors, producing broadband sounds from a few hundred hertz up to 200 kHz (Fig. 7.13a; Knowlton and Moulton 1963; Au and Banks 1998). This short, intense, repetitive sound is a byproduct of many shrimps rapidly closing their snapper claw, which creates a jet stream used in agonistic encounters and to stun prey (Herberholz and Schmitz 1999). As snapping shrimps predominantly live in large aggregations (Duffy 1996; Duffy and Macdonald 1999), their sounds can be heard as a constant ‘crackling’ chorus with temporal and spatial variations in intensity (e.g., Bohnenstiehl et al. 2016; Lillis et al. 2017). Other well-known sound-producing invertebrates are lobsters and sea urchins. Lobsters produce broadband pulse trains when facing predators or competing conspecifics (Staaterman et al. 2010; Jézéquel et al. 2019). Jézéquel et al. (2019) characterized pulse trains of the European spiny lobster (Palinurus elephas) as signals with a mean bandwidth of 5–23 kHz. Sea urchins scrape algae from rocks. This foraging strategy causes the fluid inside the sea urchin to resonate, producing sound at frequencies between 700 Hz and 2 kHz (Radford et al. 2008). In New Zealand, groups of foraging endemic Kina sea urchins (Evechinus chloroticus) increase sound levels between 18:00 and 20:00 compared to mid-day levels (Radford et al. 2008). Further examples of sounds from invertebrate movement and foraging activities are displayed in Fig. 7.13b, c (Coquereau et al. 2016).

Fig. 7.13

Spectrograms of (a) snapping shrimp, (b) a swimming great scallop (Pecten maximus), and (c) a feeding spider crab (Maja brachydactyla). Spectrograms b and c were created from supplementary material in Coquereau et al. (2016). Reprinted by permission from Springer Nature. Coquereau L, Grall J, Chauvaud L, et al. Sound production and associated behaviours of benthic invertebrates from a coastal habitat in the north-east Atlantic. Mar Biol 163: 127; https://doi.org/10.1007/s00227-016-2902-2. © Springer Nature, 2020. All rights reserved

Kaatz (2011) estimated that over 1200 fish species produce sounds, of which 800 were confirmed soniferous species (Kaatz 2002; Rountree et al. 2006). Fish produce sounds in a variety of behavioral contexts, such as courtship (Amorim et al. 2015), agonistic interactions (Ladich 1997), and when in distress (Knight and Ladich 2014). It is therefore not surprising that fish are common contributors to aquatic soundscapes, most noticeably when large numbers vocalize in chorus (e.g., Rice et al. 2017; Pagniello et al. 2019). Parsons et al. (2016) summarized fish chorus patterns over a 2-year period in Darwin Harbour, Australia. Nine different chorus types were detected (Fig. 7.14), dominating the frequency band from 50 Hz to 3 kHz and displaying cycles on several temporal scales (i.e., diurnal, lunar, seasonal, and annual). Fish chorusing was also associated with environmental parameters, including water temperature, depth, salinity, and tidal cycle.

Fig. 7.14

Spectrograms of the fish calls making up nine fish choruses (50 Hz–3 kHz) in Darwin Harbour, Australia. The middle panel shows the chorus levels over time, in hours relative to sunrise and sunset. There is a peak in chorusing activity shortly after sunset. Figure created from material in Parsons et al. (2016), by permission from Oxford University Press. Parsons MJG, Salgado-Kent CP, Marley SA, et al., Characterizing diversity and variation in fish choruses in Darwin Harbour. ICES J Mar Sci 73:2058–2074; https://doi.org/10.1093/icesjms/fsw037. © International Council for the Exploration of the Sea, 2016; https://global.oup.com/academic/rights/permissions/. All rights reserved. Reuse requires permission from OUP

Marine mammal sounds range from infrasounds of mysticetes (baleen whales; e.g., Mellinger and Clark 2003) to ultrasounds of odontocetes (toothed whales; e.g., Hiley et al. 2017). Calls may function as contact or warning signals. For example, northern right (Eubalaena glacialis) and southern right (E. australis) whale upsweeps (i.e., upcalls; 50–235 Hz) seem to be used as a contact call (Fig. 7.15a; Clark 1982; Parks et al. 2007). Another characteristic call of right whales is a strong, brief, broadband pulse with energy up to 16 kHz (called gunshot), which may serve as an advertisement call and/or agonistic call produced by male individuals (Parks et al. 2006). However, female right whales sometimes also produce this sound (Gerstein et al. 2014). Foraging humpback whales (Megaptera novaeangliae) produce a characteristic tonal call with a fundamental frequency between 400 Hz and 1 kHz (Cerchio and Dahlheim 2001), which may function to herd prey, coordinate group movement, or recruit individuals into a feeding group (Cerchio and Dahlheim 2001; Fournet et al. 2018).

Fig. 7.15

Spectrograms of marine mammal sounds. (a) Southern right whale upcall. (b) Humpback whale song. (c) Common dolphin (Delphinus delphis) whistles and (d) clicks and burst-pulse sounds. (e) Leopard seal (Hydrurga leptonyx) and (f) Ross seal (Ommatophoca rossii), both under water. © Erbe et al. (2017); https://doi.org/10.1007/s40857-017-0101-z. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Blue whales (Balaenoptera musculus), bowhead whales (Balaena mysticetus), fin whales (Balaenoptera physalus), and others arrange calls into patterned song, which may last from hours to days. Humpback whale song is particularly complex in structure, consisting of a variety of units that have peak frequencies between 20 Hz and 6 kHz (Fig. 7.15b; Payne and McVay 1971). The functions of whale song may include female attraction, male-male interactions, and long-range sonar (Herman 2017; Mercado 2018). Odontocete echolocation clicks with peak energy between ~10 and ~150 kHz are used for navigation and prey capture (Au 1993). Odontocete tonal calls (i.e., whistles) with fundamental frequencies between ~1 and ~50 kHz and broadband burst-pulse sounds are used for communication (Fig. 7.15c, d; Herzing 1996). Some odontocete species also communicate with clicks (e.g., sperm whales, Physeter macrocephalus, and porpoises, Phocoenidae; Weilgart and Whitehead 1993; Clausen et al. 2010). Delphinids may arrange their whistles and burst-pulse sounds into patterned sequences (e.g., killer whales, Orcinus orca, Wellard et al. 2020; and pilot whales, Globicephala melas, Courts et al. 2020). Seals, sea lions, and walruses use underwater vocalizations particularly during the breeding season and in social interactions (Schusterman et al. 1966; Stirling et al. 1987; Van Parijs and Kovacs 2002). The majority of pinniped underwater vocalizations fall within the frequency range between 10 Hz and 6 kHz (Fig. 7.15e, f), although Weddell seals (Leptonychotes weddellii) were found to produce calls containing energy up to 13 kHz (Thomas and Kuechle 1982). Mysticetes, odontocetes, and pinnipeds also produce non-vocal surface-generated sounds through breaching, pectoral fin slapping, and tail slapping (e.g., Dunlop et al. 2007).

7.3.2 Geophony

The aquatic geophony comprises sounds from wind acting on the water surface (e.g., Knudsen et al. 1948); precipitation (e.g., Nystuen 1986); ice movement, pressure cracking, and melting (e.g., Mikhalevsky 2001; Martin and Cott 2016); subsea volcanoes and earthquakes (e.g., Fox et al. 2001; Dziak and Fox 2002); and sediment displacement (e.g., Lorang and Tonolla 2014). Geophony can be nearly continuous and dominate the soundscape in certain regions at certain times (e.g., wind noise in southern Australia; Erbe et al. 2021). Wind-driven sound lies between 100 Hz and 20 kHz (typical peak at 500 Hz; Wenz 1962). Rainfall can contribute to the underwater soundscape over frequencies between 500 Hz and 50 kHz depending on drop size, rainfall rate, and impact angle related to wind speed (Ma et al. 2005). In the Perth Canyon, Australia, rainfall is often accompanied by strong wind. Consequently, the weather-related sound spectrum shows two peaks: one dominated by wind at 300–600 Hz and another dominated by rain at about 3 kHz (Fig. 7.16a; Erbe et al. 2015). In polar regions and underneath frozen lakes, sounds of colliding, oscillating, breaking, and melting ice range from <10 Hz to 8 kHz (Talandier et al. 2006; Martin and Cott 2016). Sound from polar ice can be detected thousands of kilometers away at tropical latitudes (Matsumoto et al. 2014). Underwater volcanic eruptions generate impulsive sounds as well as harmonic tremors <100 Hz, which can travel over distances greater than 12,000 km through the Sound Fixing And Ranging (SOFAR) channel (Tepp et al. 2019). Similarly, earthquakes can be detected at thousands of kilometers in distance as low-frequency (<100 Hz) rumbles, lasting several minutes (Fig. 7.16b; Erbe et al. 2015). Sediment flow may generate sound in rivers and streams, creating acoustic cues for freshwater species (Tonolla et al. 2010, 2011). Depending on grain size and flow velocity, the spectrum may range from tens of hertz to kilohertz.

Fig. 7.16

Sources of aquatic geophony. (a) Underwater power spectral density (PSD) levels illustrating an increase in levels under increased wind speeds (m/s) and rainfall rates (mm/h). (b) Spectrogram of an earthquake recorded in the Perth Canyon, Australia. Colors indicate PSD level (dB re 1 μPa²/Hz). Note the logarithmic frequency axes. Both figures were modified; © Erbe et al. (2015); https://doi.org/10.1016/j.pocean.2015.05.015. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

7.3.3 Anthropophony

In the last century, human activities began to contribute significantly to underwater sound levels. The anthropophony has raised ambient sound levels rapidly compared to evolutionary time scales, making it hard for animals to adapt (see Chap. 13). Anthropogenic sound may be present in aquatic soundscapes far away from human activities, owing to the long-range propagation of low-frequency sound in water (see Chap. 6). The aquatic anthropophony includes personal watercraft (e.g., jetskis; Erbe 2013), small boats (e.g., Erbe et al. 2016a; Dey et al. 2019), electric ferries (Parsons et al. 2020), merchant ships (e.g., Ross 1976; Hatch et al. 2008; McKenna et al. 2012), offshore hydrocarbon exploration and production (e.g., marine seismic surveys and drilling; Wyatt 2008; Erbe and King 2009; Erbe et al. 2013), near-shore construction including geotechnical work and pile-driving (e.g., Erbe 2009; Dahl et al. 2015; Erbe and McPherson 2017), windfarms (e.g., Koschinski et al. 2003; Tougaard et al. 2009), dredging (e.g., Reine et al. 2014), explosions (e.g., Soloway and Dahl 2014), military sonars (e.g., Ainslie 2010), acoustic alarms on fishing gear or shark nets (e.g., Erbe and McPherson 2012), snowmobiles and vehicles on ice-covered lakes (Martin and Cott 2016), bridge traffic (Holt and Johnston 2015; Martin and Popper 2016), augers (i.e., ice drills; Putland and Mensinger 2020), airplanes (e.g., Martin and Cott 2016; Erbe et al. 2018), and activities alongside, rather than on, the water (Kuehne et al. 2013). Lesser-known anthropophony originates from unpowered recreational activities (e.g., scuba diving and swimming; Erbe et al. 2016c).

Sound from ship traffic is the most pervasive anthropogenic sound in the ocean (e.g., Sertlek et al. 2019). The level of sound emitted depends on ship type, size, speed, and operational mode (e.g., reversing, idling, carrying, or towing load; MacGillivray and de Jong 2021). In water <300 m deep, large ships (>300 t) can temporarily increase sound levels up to 125 kHz within 500 m from shipping routes (Hermannsen et al. 2014; Veirs et al. 2016). In deep water, low-frequency sound from ships can travel farther, especially when entering the SOFAR channel (Fig. 7.17; Erbe et al. 2019). The number of small, recreational boats that occupy coastal waters is on the rise in many places and these vessels may raise sound levels between 100 Hz and 20 kHz in coastal and estuarine habitats, depending on boat type, hull type, length, propulsion system, operational mode, and speed (Parsons et al. 2021).

Fig. 7.17

Sketch of the propagation of sound from a 156-m ship (at 0 km range) sailing at a speed of 15 knots above the continental slope in the absence of ambient sound. Propagation modeled with RAMGeo in AcTUP V2.8 (https://cmst.curtin.edu.au/products/underwater/) with an equatorial sound speed profile as indicated in the left panel and a hard, dense, limestone seafloor. Colors represent received level (RL). © Erbe et al. 2019; https://www.frontiersin.org/files/Articles/476898/fmars-06-00606-HTML/image_m/fmars-06-00606-g001.jpg. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Another common anthropogenic sound that has received much concern over its potential impacts on marine life (see Chap. 13) is produced by seismic surveys, used for seabed profiling and hydrocarbon exploration. Surveys are done with a vessel towing an array of airguns. Airguns are metal chambers storing compressed air, which is rapidly released, producing an acoustic pulse with energy up to at least 10 kHz (Dragoset 2000; Hermannsen et al. 2015). Airguns exist with different operating volumes and firing pressures, affecting the spectrum and level of the acoustic pulses (Fig. 7.18a; Erbe and King 2009; Hermannsen et al. 2015). Airgun arrays can be tuned to focus acoustic emission down into the seabed, yet some sound ends up traveling horizontally through the water. Hence, sounds from seismic surveys may affect marine life at both short and long ranges (Gordon et al. 2003; Slabbekoorn et al. 2019). A typical seismic survey may last several weeks, during which the airgun array is discharged every few seconds.

Fig. 7.18

Spectrograms of impulsive sound sources. (a) Seismic airgun pulses recorded off Western Australia (Erbe et al. 2021). (b) Pile driving recorded in Moreton Bay, Queensland, Australia (Erbe 2009)

Other common sounds of concern are emitted by pile driving, explosions, and acoustic alarms. Pile driving for windfarm construction and detonations of World War II ammunition are regular sources of sound within European waters (Bailey et al. 2010; von Benda-Beckmann et al. 2015). Impact pile driving generates high-intensity pulses with energy exceeding 40 kHz at close range (Fig. 7.18b). Acoustic alarms are devices that purposefully emit sound between a few hundred hertz and tens of kilohertz to deter marine animals from potential hazards, such as pile driving sites, aquaculture farms, or bather protection nets (e.g., Jacobs and Terhune 2002; Erbe and McPherson 2012), yet their efficacy remains controversial (e.g., see Erbe et al. 2016d). Acoustic alarms differ widely in their signal type, frequency, and source level (Findlay et al. 2018).

7.3.4 Sound Propagation in Aquatic Environments

Underwater, the propagation of sound is affected by water temperature, salinity, hydrostatic pressure (i.e., depth below the sea surface), sea surface roughness, potential ice cover, bathymetry, seafloor roughness, upper seafloor geology (i.e., sediment type and thickness), depth and type of the underlying bedrock, and the presence of sound absorbers, scatterers, and reflectors (e.g., aquatic fauna, bubble clouds, or suspended sediment; see Chap. 6).

The speed of sound in water changes gradually with depth. As a result, sound does not travel in straight lines. Instead, sound paths are bent by refraction. By Snell’s law, paths bend toward local minima in sound speed. The most pronounced local minimum occurs in all non-polar oceans at a depth of about 1000 m below the sea surface. Sound reaching this depth at not too steep angles can get trapped in the so-called SOFAR channel by being repeatedly refracted toward the channel axis. This is how sound can traverse entire oceans, with sound sources contributing to soundscapes thousands of kilometers away (e.g., Gavrilov 2018). The SOFAR channel does not only trap sounds from deep-water sound sources (e.g., submarines or diving megafauna) located within the channel, but also from sources near the sea surface (e.g., ships or whales) because sound can radiate into the SOFAR channel with just one reflection off a downward sloping seafloor (Fig. 7.17). The minimum in sound speed (and so the axis of the SOFAR channel) rises to shallower depths in polar waters. In fact, in the polar oceans, the speed of sound is the smallest at the surface. This leads to a surface duct, in which sound travels by repeated reflection off the sea surface and refraction at depth.
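
The trapping of sound around the channel axis can be illustrated numerically. The following Python sketch is not taken from the studies cited above. It assumes an idealized Munk-type sound speed profile whose minimum is placed at 1000 m; the function names, the 1000-m scale depth, the 1500 m/s axis speed, and the 5000-m seafloor are illustrative choices. It applies Snell's law in the form cos θ(z)/c(z) = constant to find the depths at which a ray launched from the axis turns back toward it.

```python
import math

AXIS_DEPTH_M = 1000.0   # assumed depth of the sound speed minimum (m)
C_AXIS = 1500.0         # assumed sound speed at the axis (m/s)
EPSILON = 0.00737       # perturbation coefficient of the Munk canonical profile
SCALE_M = 1000.0        # assumed scale depth of the profile (m), illustrative
SEAFLOOR_M = 5000.0     # assumed water depth (m), illustrative


def sound_speed(depth_m):
    """Idealized Munk-type sound speed profile with its minimum at AXIS_DEPTH_M."""
    eta = 2.0 * (depth_m - AXIS_DEPTH_M) / SCALE_M
    return C_AXIS * (1.0 + EPSILON * (eta - 1.0 + math.exp(-eta)))


def _bisect_turning_depth(c_turn, z_limit, tol=0.01):
    """Find the depth between the axis and z_limit where c(z) equals c_turn.

    Returns None if the ray reaches z_limit (surface or seafloor) before turning.
    """
    if sound_speed(z_limit) < c_turn:
        return None
    lo, hi = AXIS_DEPTH_M, z_limit   # c(lo) < c_turn <= c(hi)
    while abs(hi - lo) > tol:
        mid = 0.5 * (lo + hi)
        if sound_speed(mid) < c_turn:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def turning_depths(launch_angle_deg):
    """Upper and lower turning depths (m) of a ray launched from the axis.

    The angle is measured from the horizontal. By Snell's law the ray becomes
    horizontal (turns) where c(z) = c_axis / cos(launch angle).
    """
    c_turn = C_AXIS / math.cos(math.radians(launch_angle_deg))
    upper = _bisect_turning_depth(c_turn, 0.0)
    lower = _bisect_turning_depth(c_turn, SEAFLOOR_M)
    return upper, lower


if __name__ == "__main__":
    for angle in (5.0, 10.0, 15.0):
        print(angle, turning_depths(angle))
```

Under these assumptions, a ray leaving the axis at 10° turns at roughly 230 m and 2500 m and therefore stays trapped, whereas a ray launched at 15° reaches the sea surface before it can turn.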

Snell’s law creates additional interesting phenomena such as shadow zones and convergence zones. Sound does not distribute evenly throughout the oceans. There are patterns of shadow zones (into which sound cannot travel by direct paths, and which receive little to no sound) and convergence zones (where received levels are enhanced; Fig. 7.17). These zones will be in different places for different source locations. In addition, sound at low frequencies does not travel far in shallow water. The waveguide concept and normal modes nicely explain this (see Chap. 6). The water depth can be too small to “fit” sound of large wavelength. As a result, ship noise may be attenuated quickly in coastal water and the spectral hump of distant shipping is characteristic only in offshore water (see Sect. 7.5.3.2). Ergo, soundscapes may differ with location and depth, merely because of sound propagation.
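
The notion that shallow water cannot "fit" long wavelengths can be made quantitative with a textbook idealization that is not part of this chapter. For a Pekeris waveguide (a homogeneous water layer of depth h over a homogeneous fluid seabed), the lowest normal mode propagates only above a cutoff frequency f_c = c_w / (4 h sqrt(1 − (c_w/c_b)²)). The sketch below evaluates this expression; the function name and the sound speeds of 1500 m/s in water and 1700 m/s in the seabed are assumed example values.

```python
import math


def cutoff_frequency_hz(depth_m, c_water=1500.0, c_seabed=1700.0):
    """Cutoff frequency of the lowest mode in an idealized Pekeris waveguide.

    depth_m  : water depth h (m)
    c_water  : sound speed in water c_w (m/s), assumed example value
    c_seabed : sound speed in the seabed c_b (m/s), assumed example value
    """
    if c_seabed <= c_water:
        raise ValueError("this idealization requires c_seabed > c_water")
    return c_water / (4.0 * depth_m * math.sqrt(1.0 - (c_water / c_seabed) ** 2))


if __name__ == "__main__":
    for depth in (5.0, 10.0, 50.0, 200.0):
        print(f"{depth:6.1f} m  ->  {cutoff_frequency_hz(depth):6.1f} Hz")
```

With these example values, the cutoff is about 80 Hz in 10 m of water and about 8 Hz in 100 m, which is consistent with low-frequency ship noise being attenuated quickly in very shallow coastal water.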

7.4 Soundscape Changes Over Space and Time

Soundscapes may vary on a range of spatial scales, exhibit temporal cycles (e.g., because of diurnal animal behaviors, periodic animal presence, or seasonal weather events; Erbe et al. 2015; Caruso et al. 2017; McWilliam et al. 2017), or gradually change over longer periods of time. Such changes may be natural or, directly or indirectly, related to human activity. Understanding natural variability is important for using soundscapes (1) as an ecological tool to study animal behavior and (2) as a management tool of the potential effects of human activity. Our understanding of the function of animal calls and natural or anthropogenic interferences is based on limited observational data (Slabbekoorn et al. 2018) and so interpreting changes in sounds is even more difficult. Gavrilov et al. (2012), for example, recorded the underwater soundscape between 21 and 27 May in 2002, 2006, and 2010 off Cape Leeuwin, Australia. Between years, an increase in sound levels at the frequencies characteristic of fin whales and Antarctic blue whales (Balaenoptera musculus intermedia) was seen (Fig. 7.19). This could be due to an increase in whale population sizes or changes in migration routes (i.e., closer to the recorder). The authors further noted that the frequency of Antarctic blue whale calls decreased for unknown reasons.

Fig. 7.19

Power spectral density (PSD) of the soundscape off Cape Leeuwin, Australia, showing increases in level and decreases in frequency of the fin and Antarctic blue whale characteristic sounds over eight years. Figure courtesy of Sasha Gavrilov, Curtin University, Perth, Australia

7.4.1 Spatial Patterns

Soundscapes vary naturally over large and small spatial scales, abruptly or gradually, resulting in different soundscapes between and within habitats. Slabbekoorn (2004) sampled multiple sites within a contiguous rainforest and an adjacent ecotone forest in Cameroon. He found spatial differences in ambient noise, which were due to differences in wind and species vocalizations (insects, frogs, and birds). Over time, ambient noise can affect the vocal characteristics of individuals, populations, and species (see Chap. 13). Consistent ambient noise may drive the features of a species’ vocalizations, so that call transmission is optimized within the acoustic environment (Acoustic Adaptation Hypothesis). Just as temporal changes in ambient noise may result in vocalization changes, spatial changes in ambient noise may result in spatial differences in vocalizations (Slabbekoorn and Smith 2002). If ambient noise differs consistently across a species’ habitat, acoustic adaptation might result in acoustic divergence between populations of the same species (Dingle et al. 2008). If the calls of these populations diverge so much that they are no longer recognized by all populations, sexual selection may lead to the segregation into distinct (sub)species (Dingle et al. 2010; Burbidge et al. 2015). For research on soundscapes and acoustic ecology, spatial replication in sampling is paramount.

7.4.2 Natural Cycles

Soundscapes vary naturally with diurnal, lunar, seasonal, or annual cycles because of temporal patterns in animal presence and behavior (e.g., night-time foraging, lunar spawning, seasonal hibernation, and annual migration) as well as weather (e.g., annual monsoon). In Alaska, ambient sound increased rapidly in early spring due to an influx of migratory bird species and the awakening of species from dormancy and hibernation (Mullet et al. 2016). Gage and Axel (2014) studied the diurnal and seasonal patterns in ambient sound within 1-kHz frequency bands at Michigan Lake, USA, from 2009 to 2012. At 2–3 kHz, power levels were highest in early spring with the presence of spring peepers (Pseudacris crucifer, Hylidae). Levels dropped progressively toward early fall when spring peepers disappeared and increased again in late fall because of chorusing insects. In contrast, at 4–5 kHz, levels were low in early spring but increased in late spring with the presence of breeding birds. Levels subsequently dropped yet increased again in late summer and early fall because of insects. Diurnal changes in ambient sound were related to ecological activity. Within the 2–4 kHz frequency band, for example, spring peepers dominated the soundscape at night until singing birds took over at dawn. Under water, in the Ionian Sea, echolocation activity of dolphins occurred at nighttime and crepuscular hours (Caruso et al. 2017). In contrast, communication signals (i.e., whistles) were mostly produced during the day. Seasonal variation, with a peak number of clicks in August, was also evident, but no effect of lunar cycle was observed. Off Western Australia, pygmy blue whales (Balaenoptera musculus brevicauda) are a seasonally dominant contributor to the marine soundscape and simply by listening, their seasonal migration can be traced along the coast (Fig. 7.20; Erbe et al. 2016b).

Fig. 7.20

Seasonal timing of pygmy blue whale migration along the west and south coasts of Australia based on passive acoustic monitoring. The chart shows the locations of sound recordings (red dots). The diagram shows counts of pygmy blue whale singers as 24-h means. The red horizontal lines indicate when the recorders were operating (Erbe et al. 2016b)

7.4.3 Human Activities

In many habitats, soundscapes have changed significantly over the last century, with habitat degradation by humans as a root cause. Humans add sound to soundscapes, change biodiversity through land-use, and directly remove animals from habitats (e.g., by hunting). Humans also contribute to climate change, with greenhouse gas emissions resulting in environmental changes, which can have direct and indirect effects on ecosystems and related soundscapes. The conservation of soundscapes is important not only for scientific and ecological reasons but also for touristic interests and human welfare (Pavan 2017).

7.4.3.1 Anthropophony

Humans alter soundscapes by adding to the anthropophony through an increase in transportation, construction, mineral and hydrocarbon exploration and production, military exercises, recreational activities, etc. These activities produce sounds over a wide range of frequencies and at a variety of intensities (see Sects. 7.2.3 and 7.3.3). While some activities are temporary, others result in sustained increases in ambient sound levels over time. For example, underwater sound from shipping has increased ambient sound levels between 10 and 100 Hz in large parts of the world’s oceans by up to 3 dB per decade (e.g., Andrew et al. 2011; Chapman and Price 2011; Miksis-Olds et al. 2013).
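
To put a rate of 3 dB per decade into perspective, the following sketch converts a level change ΔL in decibels into a ratio of mean-square pressure, 10^(ΔL/10), and a ratio of rms pressure, 10^(ΔL/20). The helper names and the spans of one, two, and four decades are illustrative and not taken from the cited studies.

```python
def power_ratio(delta_db):
    """Ratio of mean-square pressures corresponding to a level change in dB."""
    return 10.0 ** (delta_db / 10.0)


def pressure_ratio(delta_db):
    """Ratio of rms pressures corresponding to a level change in dB."""
    return 10.0 ** (delta_db / 20.0)


if __name__ == "__main__":
    rate_db_per_decade = 3.0  # upper value quoted in the text
    for decades in (1, 2, 4):
        delta = rate_db_per_decade * decades
        print(f"{decades} decade(s): +{delta:.0f} dB, "
              f"power x{power_ratio(delta):.1f}, "
              f"pressure x{pressure_ratio(delta):.2f}")
```

An increase of 3 dB corresponds to roughly a doubling of mean-square pressure, so a sustained rate of 3 dB per decade would mean an approximate doubling every ten years.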

Seismic surveys produce intense sound over a few weeks at a time to explore a specified area; yet, Nieukirk et al. (2004, 2012) detected airgun pulses along the Mid-Atlantic ridge from seismic survey vessels located 3000–4000 km away. In 1999, airgun signals were routinely detected for more than 80% of the days in a month, which increased to 95% in 2005. Finally, anthropogenic sounds may affect animal behavior (i.e., physical or acoustic, Slabbekoorn et al. 2018; see Chap. 13), which can further alter soundscapes.

7.4.3.2 Land Use

Humans transform natural landscapes to increase agricultural land coverage, to build infrastructure (e.g., roads, buildings, and power supply systems), or to extract resources (e.g., tree logging and mining). These activities generate sound and affect animal density and biodiversity, ultimately changing soundscapes (Phillips et al. 2017). In 1962, ecologist Rachel Carson expressed her concern about the use of chemicals and pesticides in agriculture, killing not only soil micro-fauna but also macro-fauna (Carson 1962). She foresaw a silent natural world without the songs of insects, frogs, and birds, if they were lost due to urbanization or chemical pollution. She was one of the first to consider animal sounds as an expression of ecosystem integrity and quality. Kerr and Cihlar (2004) found a correlation between high-intensity, high-biomass agriculture and high numbers of endangered species on both national and regional levels in Canada.

Danielsen and Heegaard (1995) compared the species richness and abundance of birds, primates, squirrels, tree-shrews, and bats between undisturbed, logged, and transformed patches of forest (i.e., to rubber and oil palm plantations) in eastern Sumatra, Indonesia. Logging changed the composition of bird species, revealing a decrease in the number of specialized insectivorous species and an increase in insectivore-frugivore generalist species. The species richness of bats also decreased with a concomitant increase in abundance of the most dominant bat species. However, logging impacts differed between geographical regions and management strategies (e.g., conventional selective, salvage, or reduced-impact logging; Chaudhary et al. 2016; LaManna and Martin 2017). Land transformation to plantations resulted in a dramatic decrease in biodiversity with the disappearance of primates, squirrels, and tree-shrews as well as a reduction in bird and bat species richness by 90–95% and 75–87%, respectively.

7.4.3.3 Direct Takes

Accidental, illegal, or over-harvesting of animal species occurs in both terrestrial and aquatic habitats (e.g., Challender and MacMillan 2014; Anderson et al. 2020), resulting in population declines and species extinctions (Hoffmann et al. 2011; Dulvy et al. 2014). Perhaps one of the greatest examples is the removal of millions of whales during the nineteenth and twentieth centuries (Rocha Jr. et al. 2014), which unequivocally changed marine soundscapes world-wide. A modern example is the threat of disappearing Gulf corvina (Cynoscion othonopterus) choruses in the Colorado River delta because of overfishing (Erisman and Rowell 2017). Overfishing can also result in excessive growth of algae, ultimately changing soundscapes. Freeman et al. (2018), for example, found a positive correlation between sound levels and macroalgae coverage on Hawaiian coral reefs, attributable to ringing bubbles emitted during photosynthesis.

7.4.3.4 Climate Change

The Earth is experiencing rapid climate change, affecting soundscapes in a variety of ways. The geophony is affected by changing weather patterns (i.e., wind, precipitation, and storms; Sueur et al. 2019). Rising temperatures reduce sea and land ice, which is changing polar soundscapes (Intergovernmental Panel on Climate Change [IPCC] 2014). Climate change further modifies the acoustic properties of the environment with direct effects on sound propagation and thus the audible distances of sounds. Larom et al. (1997) calculated that the effective communication range for African elephant calls varied between 2 and 10 km with temperature and windspeed. Ocean acidification, itself a result of climate change, reduces the absorption of low-frequency sounds (Gazioğlu et al. 2015). Thus, low-frequency sound sources, such as ships and whales, may become more prominent in future marine soundscapes.

Climate change may also directly affect a species’ vocal behavior, distribution pattern, or timing of behavioral events, such as migration and mating (Krause and Farina 2016; Sueur et al. 2019). Narins and Meenderink (2014) found that Puerto Rican coqui frogs (Eleutherodactylus coqui), over a period of 23 years, moved to higher altitudes, while their calls increased in pitch and decreased in duration. These changes in distribution and call characteristics corresponded to an overall increase in temperature of 0.37 °C, with a concomitant decrease in body size. A different response was seen in four frog species near Ithaca, NY, USA, which advanced the start of their breeding season by 14 days between 1900–1912 and 1990–1999, as evident from recordings of mating calls (Gibbs and Breisch 2001). During this time, temperatures increased on average 0.7–1.7 °C. Insects also depend on air temperature for the expression of their behavior, including sound emission (Ciceran et al. 1994). Rossi et al. (2016a, b) found that snapping shrimp (family Alpheidae) reduced their snap rate (i.e., snaps per minute) and intensity under increased levels of CO2. This might affect the behavior of species that rely on acoustic cues from snapping shrimp for navigation (Rossi et al. 2016b). The eastern Chukchi Sea beluga whale (Delphinapterus leucas) population delayed timing of migration from foraging habitats by 2–4 weeks, corresponding to a delay in regional sea-ice freeze-up (Hauser et al. 2016). These examples stress the importance of collecting environmental data together with acoustic data, to correlate changes in animal distribution patterns and behavior with environmental change (Kloepper and Simmons 2014).

7.5 How to Analyze Soundscapes

Soundscape analysis may involve various, sometimes sequential, methods ranging from listening to recordings, via visual inspection of spectrograms, to automated detection of target signals, and computation of several acoustic metrics. Often, the larger the acoustic monitoring project, the more automated the tools: long-term projects that compare multiple recording sites may gather terabytes of data, which are virtually impossible to analyze by hand.

7.5.1 Standard Soundscape Measurements

Initial assessments of soundscapes typically involve the computation of spectrograms and some general statistics, such as the broadband root-mean-square (rms) Sound Pressure Level (SPLrms) in either dB re 20 μPa or dB re 1 μPa in air and water, respectively (see Chap. 4). This allows an initial quality-check of the recordings and the identification of potential spatial or temporal patterns in overall sound levels, highlighting areas or temporal events of interest for further investigation (e.g., very quiet or very noisy areas or times of day, Fig. 7.21). However, broadband SPLrms levels are strongly influenced by the noisiest events and cannot identify the myriad of soundscape components and contributors to spatial and temporal differences.

Fig. 7.21

Spectrograms (top) and time series (bottom) of broadband (20 Hz–22 kHz) sound pressure levels of a 24-h recording period at three sites around Bora Bora Island, French Polynesia. Recording schedule was set at 60 s every 10 min. Note the increase in sound levels at night (shaded areas) as well as the strong fluctuation in sound levels between 60-s segments (Bertucci et al. 2020). Reprinted by permission from Springer Nature. Bertucci F, Guerra AS, Sturny V, et al., A preliminary acoustic evaluation of three sites in the lagoon of Bora Bora, French Polynesia. Environ Biol Fishes 103:891–902; https://doi.org/10.1007/s10641-020-01000-8. © Springer Nature, 2020. All rights reserved

As sound sources are often known to cover certain frequency bands, it is beneficial to compute SPLs within purposefully chosen frequency bands or standard octave or 1/3 octave bands. Buscaino et al. (2016) used Octave Band Levels (OBLs) at center frequencies from 62.5 Hz to 64 kHz to study temporal patterns in the soundscape of a shallow-water Marine Protected Area in the Mediterranean Sea. Seasonal patterns were seen within the lower (63 Hz–1 kHz) and higher (4–64 kHz) OBLs due to increases in wind in winter and snapping shrimp activity in summer, respectively. In contrast, sound levels within the 2-kHz octave band remained stable as sound from both wind and snapping shrimp entered this frequency band, thus attenuating seasonal fluctuations. Sound levels in the 1/3 octave bands centered at 63 and 125 Hz were set as indicators of ship noise by the European Commission Joint Research Centre (Tasker et al. 2010). Ship noise studies in shallow water, however, highlight that natural sound sources (i.e., wind) and propagation characteristics may render these indicators less useful in coastal areas and that bandlevels at 200 and 315 Hz should be included, particularly in areas frequented by smaller recreational vessels (Garrett et al. 2016; Picciulin et al. 2016).
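Band levels such as those discussed above can be sketched by summing the mean-square pressure of all FFT bins that fall within a band. The example below is an illustration only: it assumes calibrated pressure in pascals, uses base-2 band edges (center frequency times 2^(±1/6)), and the helper names and synthetic two-tone signal are hypothetical.

```python
import numpy as np

def third_octave_edges(fc):
    """Lower and upper edge of a 1/3 octave band (base-2 definition, assumed here)."""
    return fc / 2 ** (1 / 6), fc * 2 ** (1 / 6)

def band_level(pressure_pa, fs, f_lo, f_hi, p_ref):
    """Level (dB re p_ref) of the mean-square pressure between f_lo and f_hi."""
    p = np.asarray(pressure_pa, dtype=float)
    n = len(p)
    spec = np.fft.rfft(p)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # One-sided mean-square pressure per FFT bin (Parseval's theorem).
    ms = np.abs(spec) ** 2 / n ** 2
    ms[1:] *= 2.0
    if n % 2 == 0:
        ms[-1] /= 2.0                         # the Nyquist bin is not doubled
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    return 10.0 * np.log10(np.sum(ms[in_band]) / p_ref ** 2)

# Synthetic example (assumption): tones at 63 Hz (1 Pa) and 125 Hz (0.1 Pa).
fs = 8000
t = np.arange(fs) / fs
p = np.sin(2 * np.pi * 63 * t) + 0.1 * np.sin(2 * np.pi * 125 * t)

levels = {fc: band_level(p, fs, *third_octave_edges(fc), p_ref=1e-6)
          for fc in (63, 125)}                # dB re 1 μPa
```

For real recordings, the band levels would normally be computed per time segment and then summarized, so that temporal patterns within each band remain visible.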

7.5.2 Identification of Sound Sources

Soundscape ecology involves the identification of sound sources and whether they are part of the biophony, geophony, or anthropophony. Most sources have a unique sound signature (see examples earlier in this chapter), which can be identified from power spectra. Knowing to which soundscape component a sound belongs helps to evaluate how pristine an environment is and pinpoint possible impacts from human activities. Choruses by insects (Brown et al. 2019), anurans (Nityananda and Bee 2011), birds (Baker 2009), marine invertebrates (Radford et al. 2008), and fish (Parsons et al. 2016) are so distinct that they are easily identified as biophony. Knowledge on species-specific vocalizations helps to monitor species behavior and species-specific responses to environmental stressors (such as noise) as demonstrated with insects (e.g., Walker and Cade 2003), amphibians (e.g., Gibbs and Breisch 2001), birds (Fig. 7.22; e.g., Jahn et al. 2017), and mammals (e.g., Nijman 2001; Parks et al. 2007). Similarly, the sounds of the geophony and anthropophony have characteristic spectral features by which they can be identified.

Fig. 7.22

Spectrograms highlighting the difference in vocalizations between 14 different tanager species, which can be used to monitor behavior and response to environmental change (Mason and Burns 2015). Reprinted by permission from Oxford University Press. Mason NA, Burns KJ, The effect of habitat and body size on the evolution of vocal displays in Traupidae (tanagers), the largest family of songbirds. Biol J Linn Soc 114:538–551; https://doi.org/10.1111/bij.12455. © The Linnean Society of London, 2015; https://global.oup.com/academic/rights/permissions/. All rights reserved. Reuse requires permission from OUP

Studies differ, however, in their methodology to identify sound sources. By listening to sounds while observing their spectrograms in real-time (see Sect. 7.5.3.1), experts can employ their personal experience to separate biotic and abiotic sounds and to identify species. Alternatively, sounds can be compared to labeled recordings in sound libraries (see URLs at the end of this chapter) and spectrograms can be compared to those found in the literature. However, manual inspection of sound files is labor intensive; and so, some studies make use of automatic detection and classification software (see Chap. 8).

7.5.3 Visual Displays of Soundscapes

7.5.3.1 Spectrograms

A spectrogram displays acoustic power density as a function of time and frequency. Each column in the spectrogram is a result of Fourier-transforming a section of the recorded time series of sound pressure. The frequency and time resolutions of the spectrogram are affected by the window length and type of window function used (see Chap. 4). Techniques such as zero-padding (i.e., expanding a time window with zeros) and overlapping time windows may enhance the apparent resolution in frequency and time. Each pixel (or cell) of the spectrogram eventually represents an average sound power, averaged into time and frequency bins. Spectrograms are a useful tool to examine the time, frequency, and amplitude details of a sound at different time scales, potentially identifying the sound source. Spectrograms that contain the vocalizations of multiple sound sources can provide information on species vocal dynamics, acoustic niches, and how animals may be affected by acoustic changes in their surroundings. For example, mixed anuran species’ breeding choruses in Minnesota, USA, revealed acoustic niche partitioning within the frequency domain (Fig. 7.23), while fin whale vocalizations were masked by ship noise in Italy (Fig. 7.24).
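The construction of a spectrogram from windowed Fourier transforms can be sketched as follows. This is a simplified illustration using a Hann window; the function name, parameter names, and the synthetic tone are assumptions, and dedicated signal-processing libraries offer more complete implementations.

```python
import numpy as np

def stft_spectrogram(x, fs, n_window=1024, overlap=0.5, n_fft=None):
    """Power spectral density spectrogram; rows are frequency, columns are time."""
    x = np.asarray(x, dtype=float)
    n_fft = n_fft or n_window                 # n_fft > n_window means zero-padding
    hop = max(1, int(n_window * (1.0 - overlap)))
    window = np.hanning(n_window)
    n_frames = 1 + (len(x) - n_window) // hop
    frames = np.stack([x[i * hop: i * hop + n_window] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, n=n_fft, axis=1)   # one FFT per time window
    psd = np.abs(spec) ** 2 / (fs * np.sum(window ** 2))
    psd[:, 1:] *= 2.0                         # fold negative frequencies (one-sided)
    if n_fft % 2 == 0:
        psd[:, -1] /= 2.0                     # the Nyquist bin is not doubled
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + n_window / 2) / fs
    return freqs, times, psd.T

# Synthetic example (assumption): 1 s of a 2-kHz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 2000 * t)

freqs, times, psd = stft_spectrogram(x, fs)                  # 15.6-Hz bin spacing
freqs_zp, _, psd_zp = stft_spectrogram(x, fs, n_fft=4096)    # zero-padded
```

Zero-padding to 4096 points yields four times as many frequency bins, but these interpolate the same windowed spectrum; the ability to separate two closely spaced tones is still set by the 1024-sample window.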

Fig. 7.23

Anuran choruses recorded in Minnesota comprising calls of four species. Note the occupation of different frequency bands by these species, suggesting acoustic niche partitioning within the frequency domain. Modified image; © Nityananda and Bee (2011); https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0021191. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Fig. 7.24

Spectrograms of (a) 20-Hz fin whale vocalizations off Sicily, Italy, and (b) a passing ship, which masked the fin whale sounds

Long-term monitoring programs typically make use of long-term spectral averages (LTSAs), which are spectrograms that were averaged into observation windows much longer than the underlying FFT windows. Observation windows may range from tens of seconds, to one minute, to several hours, to the length of one recording within a duty cycle (e.g., Gavrilov and Parsons 2014). LTSAs highlight persistent soundscape contributors (e.g., shipping or storms), repetitive soundscape contributors (e.g., night-time choruses), and dominant events (e.g., an earthquake). They can be used to identify specific days or hours rich in sounds, quiet versus noisy periods, or correlations between acoustic patterns and environmental factors. Fig. 7.25 shows a 3-week LTSA, in which dominant events were marked (e.g., nightly fish chorus, whale choruses, stormy days, and passing ships). Break-out spectrograms show specific signals on a finer temporal scale (Erbe et al. 2016b). Alternatively, long-term spectrograms may display minimum (LTSmin), maximum (LTSmax), median (LTSmed), or other percentile levels (e.g., LTS75), computed within each frequency bin over some time window (Righini and Pavan 2020). The minima will track the quietest baseline and the maxima can highlight strong but brief events, which would otherwise be averaged and potentially missed in LTSAs. Fig. 7.26 shows three 24-h LTSmax of an Italian soundscape on different dates and under different weather conditions (Righini and Pavan 2020). The images show sound sources present from midnight to midnight: (top) one day in June 2015 with some bursts of rain, (middle) one day with good weather and a clear image of the biophony concentrated between dawn and dusk in the frequency range 1.5–9 kHz, and (bottom) one day recorded in August, with a less dense biophony during daylight hours but Orthopteran choruses in the night. In August, a short period of light rain is also shown on the left side. In addition, the stream noise below 1 kHz in August was lower than in June. The faint band between 12 and 18 kHz present in all 3 panels was due to the intrinsic noise of the recorder.
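The difference between an LTSA and an LTSmax can be illustrated with a short sketch that collapses spectrogram columns into longer observation windows. The function and its arguments are hypothetical, and the input is assumed to be a spectrogram in linear power units (not dB), with rows as frequency bins and columns as time steps.

```python
import numpy as np

def long_term_spectrogram(psd, times, window_s, stat="mean"):
    """Collapse spectrogram columns into observation windows of window_s seconds.

    stat="mean" gives an LTSA; "max", "min", and "median" give
    LTSmax, LTSmin, and LTSmed, respectively.
    """
    reducers = {"mean": np.mean, "max": np.max, "min": np.min, "median": np.median}
    reduce = reducers[stat]
    # Index of the observation window to which each spectrogram column belongs.
    idx = np.floor((times - times[0]) / window_s).astype(int)
    cols = [reduce(psd[:, idx == k], axis=1) for k in np.unique(idx)]
    return np.stack(cols, axis=1)

# Synthetic example (assumption): flat background with one brief, strong event.
times = np.arange(60.0)                  # one spectrogram column per second
psd = np.ones((4, 60))                   # 4 frequency bins, linear power units
psd[2, 35] = 100.0                       # 1-s event in frequency bin 2

ltsa = long_term_spectrogram(psd, times, window_s=10.0, stat="mean")
ltsmax = long_term_spectrogram(psd, times, window_s=10.0, stat="max")
```

In this example, the brief event is diluted to about one tenth of its power in the LTSA but is fully retained in the LTSmax, which mirrors the trade-off described in the text.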

Fig. 7.25

Spectrograms of the marine soundscape in the Perth Canyon, Australia. Middle panel shows a 3-week LTSA, computed with a 10-min observation window. The surrounding panels display short-term spectrograms of example sounds (Erbe et al. 2016b)

Fig. 7.26

LTSmax spectrograms from the same location (Sasso Fratino Integral Nature Reserve, Italy) on three different dates and under different weather conditions. Biophony is concentrated between 1.5 and 9 kHz and decreased in August. LTSmax produced with SeaPro software by combining 48 frames of 10 min each, recorded every 30 min (Righini and Pavan 2020)

7.5.3.2 Power Spectral Density Percentile Plots

While spectrograms (including LTSAs) show how the sound spectrum changes over time (from one FFT window to the next or from one LTSA observation window to the next), there might be a need to quantify this variability. Power spectral density (PSD) percentile plots quantify the spectrum variability over the duration of a temporal analysis window. PSD is plotted against frequency. At each frequency, several percentile levels are shown, commonly the median (50th percentile) and the quartiles (25th and 75th percentiles), but perhaps also additional percentiles (e.g., 1st, 5th, 95th, and 99th). The nth percentile gives the levels that were exceeded n% of the time. There is no standard for the length of the temporal analysis window, and selection depends on the specific study questions. Temporal analysis windows of 24 h, one season, or one full year are common. Dominant contributors to the soundscape can then be identified by the shape and levels of the curves. Additional information is provided by plotting the Spectral Probability Density (SPD) as background colors that represent the probability of levels being reached based on normalized histograms of sound levels within each frequency bin (Fig. 7.27; Merchant et al. 2013). Merchant et al. (2015) gave detailed information on how to compute PSDs and SPDs with their publicly available software PAMGuide. Also see Chap. 4.
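The following sketch shows one way to compute percentile levels and a spectral probability density from a matrix of PSD levels in dB (rows are frequency bins, columns are time steps). It is not the PAMGuide implementation; function names and the synthetic data are assumptions. Following the definition used in the text, the level exceeded n% of the time corresponds to the (100 − n)th statistical percentile as returned by numpy.percentile. Some software reports the statistical percentile instead, so the convention should be checked before comparing results.

```python
import numpy as np

def exceedance_levels(psd_db, n_percent):
    """Levels exceeded n% of the time in each frequency bin.

    Returns an array of shape (len(n_percent), n_freq).
    """
    q = 100.0 - np.asarray(n_percent, dtype=float)
    return np.percentile(psd_db, q, axis=1)

def spectral_probability_density(psd_db, level_edges):
    """Normalized histogram of levels within each frequency bin."""
    n_freq = psd_db.shape[0]
    spd = np.empty((n_freq, len(level_edges) - 1))
    for i in range(n_freq):
        spd[i], _ = np.histogram(psd_db[i], bins=level_edges, density=True)
    return spd

# Synthetic example (assumption): normally distributed levels around 80 dB.
rng = np.random.default_rng(0)
psd_db = 80.0 + 5.0 * rng.standard_normal((3, 10000))
level_edges = np.arange(40.0, 121.0, 1.0)     # 1-dB histogram bins

levels = exceedance_levels(psd_db, [5, 50, 95])
spd = spectral_probability_density(psd_db, level_edges)
```

Plotting the rows of `levels` against frequency gives the percentile curves, and `spd` can be displayed as the colored background of the same plot.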

Fig. 7.27

Plot of power spectral density percentiles and probability density for the annual soundscape of the Perth Canyon, Australia. The strongest sound sources were pygmy blue whales and nearby ships at 10–200 Hz, humpback whales at 300 Hz, and fishes at 2 kHz, whereas the most common sound sources were distant shipping at 10–100 Hz and wind at 300 Hz–3 kHz (Erbe et al. 2016b)

7.5.3.3 Soundscape Maps

Soundscape maps literally show sound levels on a map. Such maps are mostly produced by modeling sound propagation from multiple sources, distributed over the area. Model results may be validated by point measurements (i.e., recordings at selected places; Erbe et al. 2014, 2021; Schoeman et al. 2022). Sound maps may be produced for specific frequencies of interest (e.g., relevant to human audiology; Bozkurt and Demirkale 2017) or for a specified receiver height or depth (e.g., migrating whales below the sea surface; Tennessen and Parks 2016; Bagočius and Narščius 2018). Sound propagation maps typically focus on specific sound sources (e.g., highways or railways; Fig. 7.28; Aletta and Kang 2015; Drozdova et al. 2019).

Fig. 7.28

Noise-map of a roadway in an urban area. Red indicates highest noise levels and green represents the quietest areas. © Cai et al. 2018; https://www.hindawi.com/journals/jat/2018/7031418/fig4/. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Maglio et al. (2015) developed a near real-time model that shows the propagation of sound from individual ships in the Ligurian Sea. However, focus can also be placed on cumulative or average sound levels over a specified time frame to identify areas of long-term risk to humans or animals from noise exposure. Erbe et al. (2012) computed a map of average sound levels from annual ship tracks to highlight areas along the Canadian coast where ship noise exceeded the European criterion of 100 dB re 1 μPa rms (Fig. 7.29). The same concept was later used to identify areas where (a) strong sound levels overlapped with high animal density (identifying areas of risk; Fig. 7.30; Erbe et al. 2014), and (b) low sound levels overlapped with high animal density (identifying areas of opportunity for conservation management; Fig. 7.30; Williams et al. 2015).

Fig. 7.29

Illustration of the conversion of cumulative hours of ship traffic along the Canadian coast to cumulative noise levels (a) to identify areas where annual average received levels exceeded the European criterion for low-frequency ambient noise of 100 dB re 1 μPa rms (b; Erbe et al. 2012). © Acoustical Society of America 2012. All rights reserved

Fig. 7.30

Maps of (a) harbor porpoise (Phocoena phocoena) density, (b) audiogram-weighted ship noise, (c) areas of risk (i.e., high animal density and high noise), and (d) areas of opportunity (i.e., high animal density and low noise) in British Columbia, Canada. © Williams et al. 2015; https://doi.org/10.1016/j.marpolbul.2015.09.012. Licensed under CC BY-NC-ND 4.0; https://creativecommons.org/licenses/by-nc-nd/4.0/

7.5.4 Acoustic Indices

Apart from sound level statistics (such as SPL measures, PSD percentiles, and SPD), additional metrics, such as acoustic indices, exist, which may quantify soundscapes as a whole or quantify the biophony, geophony, and anthropophony separately or in comparison. Acoustic indices can be used as a tool to assess the quality of soundscapes and the underlying ecosystem. Historically, researchers assessed the number of species (i.e., species richness) and number of individuals belonging to each species (i.e., species evenness) by counting the number of acoustic identifications while walking along survey transects or listening to recordings (Obrist et al. 2010). However, this approach is inefficient, subjective, and limited to brief observation times. In contrast, a transect or grid of automated recording systems allows acoustic surveys in remote areas, over extended periods, and in most field conditions (Acevedo and Villanueva-Rivera 2006).

To support the analyses and interpretation of consequent large datasets, researchers have been developing acoustic indices that summarize and score the structure and distribution of acoustic power over frequency and/or time, reflecting a correlation with species presence and distribution (e.g., Towsey et al. 2014). While traditionally developed for terrestrial communities, acoustic indices are now also increasingly applied to the aquatic environment (e.g., Parks et al. 2014; Harris et al. 2016; Bolgan et al. 2018a). In particular when the same instruments and protocols are used, acoustic indices allow for comparisons of soundscapes between multiple sites recorded over the same period or an evaluation of the changes of a soundscape over time (Righini and Pavan 2020; Farina et al. 2021).

Examples of acoustic indices include:

  1. Bioacoustic Index (BI): Aims to quantify biophonic activity by thresholding spectral power in biophony-specific frequency bands (Fig. 7.31; Boelman et al. 2007),

  2. Entropy Index (H): Equals the product of two sub-indices, spectral (Hf) and temporal entropy (Ht), computed on the average frequency spectrum and on the Hilbert amplitude envelope of the raw bioacoustic signal, respectively (Sueur et al. 2008b),

  3. Acoustic Diversity Index (ADI): Divides the spectrum into specific frequency bins, selects the bins surpassing a preset power threshold, and applies the Shannon entropy to these bins (Villanueva-Rivera et al. 2011),

  4. Acoustic Evenness Index (AEI): Divides the spectrum into specific frequency bins, selects the bins surpassing a preset power threshold, and considers the distribution of strong frequency bins by computing the Gini coefficient (Villanueva-Rivera et al. 2011),

  5. Acoustic Complexity Index (ACI): Measures the temporal variation in acoustic power by calculating sequential power differences (from one FFT window to the next), in all frequency bands separately, then sums over frequency (Fig. 7.31; Pieretti et al. 2011), and

  6. Normalized Difference Soundscape Index (NDSI): Equals the normalized difference between high-frequency power (indicative of biophony) and low-frequency power (indicative of anthropophony), ranging from −1 to +1, to capture the level of anthropogenic disturbance (Kasten et al. 2012).
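Three of the indices listed above (H, ACI, and NDSI) can be sketched in a few lines of Python. These are simplified illustrations under stated assumptions, not the implementations of the cited R packages: ACI is computed over a single temporal step, NDSI is computed as the normalized difference (biophony − anthropophony)/(biophony + anthropophony), and the band limits of 1–2 kHz and 2–8 kHz are assumed defaults that should be adapted to the study system. All function names are hypothetical.

```python
import numpy as np

def _shannon_bits(weights):
    """Shannon entropy (bits) of non-negative weights treated as probabilities."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def hilbert_envelope(x):
    """Amplitude envelope from the analytic signal (FFT-based Hilbert transform)."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))

def entropy_index(x, mean_spectrum):
    """H = Ht * Hf, with each sub-index scaled to the range 0-1."""
    ht = _shannon_bits(hilbert_envelope(x)) / np.log2(len(x))
    hf = _shannon_bits(mean_spectrum) / np.log2(len(mean_spectrum))
    return ht * hf

def acoustic_complexity_index(amp_spec):
    """ACI of an amplitude spectrogram (n_freq, n_time), single temporal step."""
    diffs = np.abs(np.diff(amp_spec, axis=1)).sum(axis=1)   # per frequency bin
    totals = amp_spec.sum(axis=1)
    return float(np.sum(diffs / totals))                    # summed over frequency

def ndsi(psd, freqs, anthro_band=(1000.0, 2000.0), bio_band=(2000.0, 8000.0)):
    """Normalized difference of biophony and anthropophony band power (-1 to +1)."""
    a = psd[(freqs >= anthro_band[0]) & (freqs < anthro_band[1])].sum()
    b = psd[(freqs >= bio_band[0]) & (freqs < bio_band[1])].sum()
    return float((b - a) / (b + a))

# Synthetic examples (assumptions).
rng = np.random.default_rng(1)
noise = rng.standard_normal(4096)
h_noise = entropy_index(noise, np.abs(np.fft.rfft(noise)))

aci = acoustic_complexity_index(np.array([[1.0, 3.0, 1.0, 3.0],
                                          [2.0, 2.0, 2.0, 2.0]]))

freqs = np.arange(0.0, 12001.0, 100.0)
ndsi_flat = ndsi(np.ones_like(freqs), freqs)
```

In the ACI example, only the fluctuating frequency bin contributes to the index, whereas the constant bin contributes zero. A flat spectrum gives a positive NDSI here simply because the assumed biophony band is six times wider than the anthropophony band, which illustrates how strongly index values depend on the chosen settings.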

Fig. 7.31

Bioacoustic Index (BI) and Acoustic Complexity Index (ACI) for three Italian locations in the Integral Nature Reserve of Sasso Fratino, Italy, showing a strong peak at sunrise, followed by a gradual decline with a second peak at sunset

These and other indices are coded in shareware R packages, such as seewave (Sueur et al. 2008a; Sueur 2018), soundecology (Villanueva-Rivera and Pijanowski 2018), and bioacoustics (Marchal et al. 2020). However, the analysis of long-term recordings can also aim at recognizing individual species’ signatures by listening, by observing spectrograms, and by using sound recognition tools to identify the presence and recurrence of defined sound models. The R package monitoR (Katz et al. 2016) can be used to identify user-defined sound models.

It should be noted that acoustic indices applied in two different environments can produce confounding results and so the robustness of these indices to environmental change and to different soundscape compositions has been questioned (Harris et al. 2016; Bolgan et al. 2018a).

Parks et al. (2014) found that seismic airgun pulses interfered with the Entropy Index, which therefore did not accurately reflect species richness in areas of the Atlantic Ocean where seismic surveys were commonly detected. Bolgan et al. (2018a) assessed the robustness of the Acoustic Complexity Index to fine variations in fish sound abundance (i.e., number of sounds) and diversity (i.e., number of different calls); both changed index values. Hence, it would be difficult to infer whether a change in this index resulted from a change in fish abundance or fish species diversity. Biophony and anthropophony can overlap in frequency and time as well as vary with frequency and time. Acoustic index performance depends greatly on the frequency and time resolutions used in the computation of the various quantities and is affected by temporal (and spatial) patterns as well as local (and temporally variable) sound propagation conditions (Mooney et al. 2020). As a result, acoustic indices are sometimes tuned for specific environments, limiting comparability across environments and time.

7.6 Applications of Soundscape Studies

Soundscape studies can reveal information on animal distribution, abundance, and behavior; species diversity; and changes of all of these over time under environmental and human influences. Hence, soundscape analyses can be used as ecological tools to understand, conserve, and restore soundscapes as part of conservation management plans (Pavan 2017).

7.6.1 Conservation of Natural Soundscapes

7.6.1.1 Management

Documenting, analyzing, and understanding a soundscape can provide important information for wildlife and habitat managers on species richness, animal behavior patterns, effects of anthropogenic sounds, land-use, and climate change. Documenting relatively pristine soundscapes before they disappear (Righini and Pavan 2020; Farina et al. 2021) can aid re-establishment of degraded acoustic habitats through habitat restoration, animal relocation, elimination of invasive species, or restrictions of activities that generate anthropogenic sound and affect animal behavior. The success of soundscape restoration can then be demonstrated through acoustic monitoring and analysis (Pavan 2017).

Development and implementation of a comprehensive acoustic monitoring program can aid management of a protected area in several ways. Firstly, storage of quantitative data about the acoustic environment can be used to create pivotal repositories for immediate or future analyses of spatial and temporal patterns and differences at large scales. LTSA spectrograms, for example, provide a summary of day-by-day acoustic settings and the possibility to display information, not only on the diversity of acoustic species (as in a census) but also on the density and richness of the biophonic components. The study of an Integral Nature Reserve (Sasso Fratino, Casentinesi Forests National Park, Italy) demonstrated that the biophony dominated both geophony and anthropophony, with undisturbed daily cycles (Righini and Pavan 2020; Farina et al. 2021). Secondly, monitoring soundscapes can help managers detect unwanted and unlawful activities in protected areas. Human voices can be used to identify trespassers, gunshots to locate hunters and poachers, humming chainsaws to find illegal logging, vehicle sounds to document unauthorized vehicle use, and sounds from livestock to pinpoint unlawful grazing. Wrege et al. (2017) found that gunshot sounds within a closed-canopy forest of the Congo could be detected over a 7–10 km² area, depending on the gun used and orientation to the acoustic receiver. Eight years of acoustic monitoring did not reveal a correlation between illegal hunting of forest elephants (Loxodonta cyclotis) and time of day or season. However, hunting intensity seemingly decreased after initiating patrols in 2009, highlighting the potential use of soundscape studies to monitor for illegal human activities and to assess the effectiveness of conservation efforts.

Investigation of underwater soundscapes can also aid in the detection of foreign vessels by the military, unauthorized commercial fishing vessels, unlawful vessels in restricted areas (i.e., no-go zones or marine protected areas; Kline et al. 2020), and illegal fishing activities with explosives (Xu et al. 2020).

7.6.1.2 Education

The rates of biodiversity loss, habitat loss, invasion of alien species, and species extinctions are high (Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services [IPBES] 2019). Helping citizens and stakeholders appreciate biodiversity is a necessity to establish a general willingness to address anthropogenic causes of ecosystem demise. In this context, animal sound and soundscape recordings not only serve science but have the potential to trigger people’s curiosity to learn more about the importance of ecosystems and their preservation, which will lead to conservation efforts. Such transfer of science, via education, to conservation has been demonstrated in several case studies (e.g., Padua 1994; Macharia et al. 2010; Pavan 2017; Barthel et al. 2018). Exhibits and educational programs on the sounds from nature in museums, zoos, park visitor centers, and websites can stimulate interest in and care about the acoustic environment. An example is Bernie Krause’s Great Animal Orchestra exhibition. Alternatively, listening to animal sounds during a guided nature walk can generate an appreciation for soniferous animals, which can result in long-term public engagement and commitment to conservation by citizen scientists. Soundscape studies can help to create publicly available sound libraries and help to identify areas within a park for visitors to experience songbirds, calling frogs, chorusing insects, waterfalls, rushing streams, etc. One example of integrating soundscape monitoring and education is the Natural Sound Program, established in 2000 by the U.S. National Park Service (National Park Service [NPS] 2000). This program aims to manage the acoustic environment while providing for educational and inspirational visitor experiences.

7.6.2 Monitoring the Health of Agroecosystems

High productivity from agricultural fields can be maintained through insecticides, pesticides, and fertilizers, but the use of these products may result in chemical pollution with consequent loss of plant and animal biodiversity (e.g., Carson 1962; Boatman et al. 2004; Kerr and Cihlar 2004; Kleijn et al. 2009). Hence, habitats connected to agricultural lands might exhibit poorer soundscapes. In contrast, organic farmers strive to maintain productivity through natural agroecosystems, ensuring environment quality and ecological balances. Bird, insect, amphibian, and bat communities serve as indicators of ecosystem health, and an agroecosystem should have a balance of mixed species that provide natural pest control. The ecological quality of an agroecosystem can therefore be evaluated by the species-richness of its soundscape (e.g., Hole et al. 2005; Kleijn et al. 2011; Pavan 2017). Doohan et al. (2019) identified bird and bat species-specific or guild-specific bioindicators as successful biomonitoring tools for agricultural industries. Systematic monitoring of biological sounds can provide an accurate and practical assessment tool for farmers, policymakers, researchers, and others interested in maintaining or restoring farmland ecosystems, and ultimately encourage the adoption of beneficial and sustainable farming practices.

7.6.3 Improving Captive Animal Welfare

Noise may be omnipresent for captive animals in livestock-operations, zoos, aquaculture, and aquaria. While wind and rain contribute naturally to ambient sound in outdoor animal enclosures (Wiseman et al. 2014), anthropogenic sound from mechanical devices (e.g., Wysocki et al. 2007; Scheifele et al. 2012b), background music (Scheifele et al. 2012a), and visitors (e.g., Quadros et al. 2014; Sherwen and Hemsworth 2019) is characteristic of many indoor, outdoor, and underwater animal holding facilities. O’Neal (1998), for example, found that underwater sound pressure levels were 25 dB (20–6400 Hz) louder in exhibits inside the Monterey Bay Aquarium than in a nearby natural offshore environment, predominantly due to sound from machinery. Similarly, Scheifele et al. (2012b) detected an increase in sound pressure levels by 10–20 dB (20 Hz–1 kHz) when air pumps were switched on within the Georgia Aquarium. These increases in sound levels can have adverse effects on animal welfare because of physiological and behavioral changes (e.g., Owen et al. 2004).

Sound sources that may impact animals might not be audible to humans, and so animal keepers might not be aware of acoustic disturbance to kept animals. For example, laboratory mice are sensitive to ultrasound, above the human hearing range. Laboratory equipment (e.g., air conditioners and lighting) may emit ultrasound and, unknown to humans, stress animals within these facilities (Sales et al. 1988). Identifying such sources is necessary for the improvement of acoustic conditions to increase captive animal welfare (De Queiroz 2018). Sound can further be exacerbated by hard reflective surfaces and the geometry of an exhibit; hence, some noise problems can be solved by improving exhibit design (Wark 2015; De Queiroz 2018). Restricting visitor group sizes, reducing operation hours, limiting the number of shows, and reducing the level of background music can also mitigate negative impacts of noise on captive animals.

7.7 Conclusion

Soundscapes are composed of a myriad of sounds that can be grouped into biophony, geophony, and anthropophony based on their origin. Natural soundscapes have ecological value and modifying these natural assets could lead to changes in ecosystem functioning and biodiversity. At present, natural soundscapes are disappearing at an unprecedented rate because of human interference. Human activities create sound, change land-use patterns, directly remove animals from their habitat through overharvesting and illegal hunting, and lead to climate change, thereby directly and indirectly affecting both geophony and biophony. Soundscape studies can be used as an ecological tool to study animal distribution, behavior, biodiversity, and the effects of environmental stressors (such as anthropogenic noise or climate change). Soundscape studies can subsequently inform conservation management and assess the effectiveness of management and conservation efforts.

7.8 Additional Resources

Below is a selection of free, online resources; last accessed 20 June 2022.

7.8.1 Sound Libraries

Sound libraries can serve as reference during the identification of sound sources. They are also an educational tool to create awareness of the myriad of sounds that may contribute to a soundscape.

7.8.2 Ocean Acoustic Observatories

Ocean acoustic observatories provide a continuous stream of acoustic data either in real-time or archived:

7.8.3 Software for Soundscape Analysis

7.8.4 Software for Sound Propagation Modeling

7.8.5 Software for Automatic Signal Detection

Some of the software packages for soundscape analysis include signal detectors:

  • CHORUS includes detectors for pygmy blue whale song, fin whale 20-Hz downsweeps, and an unidentified spot-call.

  • PAMGuard includes detectors for odontocete and mysticete vocalizations.

Other automatic signal detection resources: