1 Introduction

The paper presents an exploratory study primarily aimed at assessing the presence and amount of ultrasound at five urban locations (and one rural location) in and around the city center of a relatively small-sized city, Aalborg, located in northern Denmark. As a necessary part of the study, we employ methods for measuring and comparing the presence of ultrasound to human-audible sound and the total sound power in our recordings as well as for comparing how different the spectral content of this ultrasound is between these locations. What is presented here is an extended and redrafted version of the paper presented at the Audio Mostly 2022 conference [1].

This initial section opens with a short exegesis on the urban greening movement and definitions of sound and ultrasound before briefly describing non-human species’ relationships to ultrasound, the anthropogenic production of ultrasound, effects on human health, and the regulatory framework. In Section 2, we briefly describe some related work and follow this, in Section 3, with a short account of the context of the study. Next, in Section 4, we detail our procedure and the equipment used to carry out the recordings as well as the methods of analysis used to quantify the amount of ultrasound present and how different the spectral content of this ultrasound is across recordings. In Section 5, we present our results and discuss our findings. Finally, in Section 6, we summarize our conclusions and reflect on possible future work.

1.1 Urban greening

With the growing awareness of human effects on climate change, one of the responses to this crisis has been a move in recent years to more sustainable and green urban centers which, through their plant life, promote biodiversity in insect, bird, and mammal species (Footnote 1). Biodiversity is necessary to self-sustain these ecosystems once established and is a goal in itself, but, in addition, the greening of urban centers also has benefits in terms of human health. For example, a 2022 report in The Netherlands suggests that increasing the green footprint in urban areas reduces the need for (and thus the associated public health costs of) mental and physical care [2], and that the effect increases for an area of greenery as the population with access to that area increases. Pedersen Zari provides a brief review of benefits to humans that, in addition to physiological and psychological benefits, also lists cultural and economic benefits [3].

1.2 Anthropocentric sound definitions

The problem with sound is that its definitions, its measurements, its uses, and awareness and discussion of adverse effects are all too human centered. Consequently, this affects our relationship to the technologies that produce it, possibly biasing how accepting we are of such technologies as well as our understanding of their effects on the non-human world. In the case of ultrasound, this problem is even more apparent. We can take as an example the American National Standards Institute (ANSI) Acoustics documentation which defines infrasound as “sound at frequencies less than 20 Hz” and ultrasound as “sound at frequencies greater than 20 kHz,” with a further note that 20 Hz and 20 kHz mark the approximate lower and upper bounds of normal human hearing, respectively [4]. In the document’s definitions of sound, one definition states that sound is an “oscillation in pressure, stress, particle displacement, particle velocity etc., propagated in a medium with internal forces (e.g., elastic or viscous) or the superposition of such propagated oscillation,” while another definition states that it is an “[a]uditory sensation” caused by the oscillation in a medium described by the former definition. Here, too, there is the suggestion that both ultrasound and infrasound are phenomena that do not “evoke an auditory sensation” [4]: that is, they do not produce sound. With such definitions, quite interesting interpretations of what sound is are possible, such as the pataphysical statement that not all sounds evoke a sound [5]. Nonetheless, these observations illustrate that (1) our definitions of sound are (unsurprisingly) human centered and (2) this human centeredness in defining sound has possibly led to a significant gap in our understanding of the effects that sonic by-products of our technologies (particularly those outside the range of normal human hearing) might have on the environment and its non-human inhabitants.

Notwithstanding the ANSI definition and its threshold of 20 kHz for ultrasound, Leighton [6] has noted that various bodies have selected different frequencies for the lower threshold of ultrasound (10, 15, 16, 18, and 20 kHz), but, in modern times, authors have adopted his suggestion [7] of 17.8 kHz [8,9,10,11,12] because guidelines on maximum permissible levels (MPLs) apply for third octave bands (TOBs), so that those set for 20 kHz have always applied down to the lowest frequency in the TOB centered on 20 kHz, which is 17.8 kHz. The oft-used assumption that humans cannot hear above 20 kHz is unreliable [6, 7] with the measurable threshold recorded in some people up to 28 kHz [13].
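The 17.8 kHz figure follows directly from the width of a third-octave band. The short sketch below reproduces the arithmetic; it assumes the base-2 band definition (edges at the center frequency divided and multiplied by 2^(1/6)) and the nominal 20 kHz center, and the function name `third_octave_edges` is ours, introduced only for illustration. Standards that use the base-10 definition or exact (non-nominal) center frequencies give slightly different edges.

```python
from typing import Tuple


def third_octave_edges(center_hz: float) -> Tuple[float, float]:
    """Return the (lower, upper) edge frequencies of a base-2 third-octave band."""
    # A third-octave band spans 2**(1/3); each edge lies half of that
    # (a factor of 2**(1/6)) away from the center frequency.
    half_band_ratio = 2 ** (1 / 6)
    return center_hz / half_band_ratio, center_hz * half_band_ratio


# Nominal 20 kHz band, as used in the guidelines discussed in the text.
lower_hz, upper_hz = third_octave_edges(20_000.0)
print(f"{lower_hz / 1000:.1f} kHz to {upper_hz / 1000:.1f} kHz")
# prints "17.8 kHz to 22.4 kHz"
```

The upper edge, about 22.4 kHz, is the same value that appears as a band limit in the exposure guidelines discussed in Section 1.6.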

1.3 Airborne ultrasound and non-human species

It has been known for some time that some species of plants emit ultrasound and appear to respond to such emissions, but the research on such plant-to-plant communication is still in its infancy [see, e.g., 14, 15]. The evidence so far, collected from tomato and tobacco plants, indicates that such plants emit ultrasound when stressed and that this can be in the range of approximately 50–58 kHz and 61–66 dBSPL (mean peak frequency and SPL, respectively) at 10 cm distance (the variance depends on the type of plant and whether the stress stimulus was drought or cutting) [14]. Other than tomato plants, which might be found in urban allotments, gardens, and window boxes, it is not clear what other potentially urban species of plants (which are numerous and diverse across the globe) might have ultrasound capabilities.

A little more is known about insects and their capabilities regarding ultrasound. A number of insect orders are sensitive to ultrasound, and this is thought to have evolved as a counter strategy to predators such as bats or as male-to-female sound signaling. The order Lepidoptera is one of these [15]—a variety of butterflies and, particularly, moths can exist in urban areas. The Texas field cricket (Gryllus texensis of the order Orthoptera, suborder Ensifera [katydids or crickets]) is a denizen of urban areas in the warmer parts of the USA (being attracted to lights) and has an ultrasound threshold of c. 70–75 dBSPL at 30 kHz [16]. Indeed, the Ensifera generally have quite phenomenal abilities regarding ultrasound production and reception: “The sound frequencies exploited by different species of katydids vary across a huge range from 2 to 150 kHz” [17].

While birds typically do not hear ultrasound (as an urban example, a pigeon’s hearing peaks at about 10 kHz [18]), there are many mammals other than humans to be found in urban areas: pets, such as dogs and cats, for example, have acute ultrasound hearing (up to, respectively, c. 45 kHz [19] and c. 85 kHz [20]), and some species of rats not only can hear up to about 50 kHz [21] but can also vocalize up to that same frequency [22]. But, for the purposes of this paper, bats are common in many cities and are of particular relevance in the context of ultrasound. As noted above, bats predating on insects use ultrasound echolocation to home in on their prey, just as the prey depends on ultrasound detection to evade the bats. Some species of bats produce and are sensitive to frequencies of more than 200 kHz [23], but bat echolocation typically ranges across 20–60 kHz with often high intensities (e.g., Cormura brevirostris emits echolocation pulses in the range 25–31 kHz, and this has been measured at 121 dBSPL at 5–10 m distance [24]).

1.4 Human production of airborne ultrasound

Although the presence of airborne ultrasound in the workplace has been recorded for decades, and the use of ultrasonic pest deterrents has undoubtedly accidentally exposed the public in the past, it was not until 2016 that the widespread emission of airborne ultrasound from fixed devices in public places from many mundane and pervasive forms of urban technologies was recorded [6]. Since then, instances have been found in the UK [6, 7, 9, 25], Switzerland [11], Germany and France [26], and Japan [27]. This is of concern for the adverse effect on humans, flora, and fauna. Leighton [6] notes particular concern for children who are exposed to airborne ultrasound in an environmental setting [28], because data for MPLs is based on small sample sizes of adults [6], and indeed he recorded a case study of airborne ultrasound causing a proportion of students in a school class to fail to concentrate on their work [26, 29].

Take, for example, Fig. 1a and b, which show the prominent ultrasound signatures of two workplace-based devices widespread in the building we ourselves work in (Rendsburggade 14, Aalborg, Denmark). Such devices are increasingly common in various public urban spaces such as libraries, cafes, universities, and airports, and we are often in close proximity to them: both devices shown here are in every office of our building, while the loudspeaker is also placed approximately every 5 m in corridors. For the device in Fig. 1a, measured at 1 m distance, the mean sound pressure level for the 1/3-octave band centered on 32 kHz is 81.3 dBSPL, while, for the device in Fig. 1b, we observe similarly high sound pressure levels at approximately 21 kHz. Regarding the motion sensor which controls office lighting, while we can praise the efforts to conform to a sustainability agenda through the use of such technology, perhaps we should also question the advisability of increasing our ultrasound footprint in an effort to lower our carbon footprint. The emission of ultrasound from a loudspeaker at rest might seem curious, but it stems from the fact that this is a Public Address Voice Alarm (PAVA) system, first recorded as emitting ultrasound by Leighton [6] and since by a range of authors [9, 30]. Leighton explains that, “Many PAVA systems in shopping centres, airports, public buildings, department stores and most UK stadia are monitored using 20 kHz tones. These tones report to an ‘end of line’ device on each loudspeaker circuit which ‘acknowledges’ receipt of the signal, which is then returned to the amplifier and repeated continuously across the hundreds of circuits throughout the stadium/building/space. Manufacturers require some method of monitoring because EU legislation [31] makes supervision of evacuation systems mandatory.”

Fig. 1

a Spectrogram of an excerpt from the continuous ultrasound recording of a motion detector at 1 m distance, where color indicates the magnitude of the power spectral density. b Spectrogram of an excerpt from the continuous ultrasound recording of a fire alarm speaker at 1 m distance during rest (i.e., not activated), where color indicates the magnitude of the power spectral density

In common with all other complex waveforms, the human production of sound waves will almost always have an ultrasound component, whether the source is vocal, or otherwise corporeal, or mechanical (acoustically produced, such as many musical instruments, or electrically produced, as from loudspeakers or machinery). This is a matter of acoustics, where there is a proliferation of overtones above a fundamental in a complex waveform, many of them being in the ultrasound range. With increasing frequency, most overtones decrease significantly in amplitude. It is both the human production of ultrasound as a by-product and the deliberate production of ultrasound, at relatively high intensity as in the two examples above, that are potential causes of concern.

There exists a substantial body of knowledge regarding the ultrasound produced by specific human technologies in a number of particular areas (e.g., medicine, communication, and industry), and, given the industries and fields involved, one can safely assume that the human production of ultrasound has inexorably increased hand in hand with technological developments. As far back as 1982, for example, the World Health Organization [WHO] identified a number of such (then current) consumer devices and industrial applications that produce airborne ultrasound, including cleansing, emulsifying, welding, flaw detection, dog whistles, pest controllers, alarms, and camera rangefinders, among others [32]. In 2007 (and later in 2016 [6]) Leighton [33] argued that there had been no census regarding human “ultrasound noise exposure,” and, to the best of our knowledge, this is still the case. Since 1982, the list of ultrasound-producing devices has only grown, not least with the addition of mobile- and smartphones (which can both emit [6, 34] ultrasound and be used to detect it [15]). It should also be noted that, since the early 2000s, ultrasonic motion sensors (such as the one producing the ultrasound of Fig. 1a) have proliferated in commercial/work environments and domestic environments, both indoor and outdoor.

Leighton goes on to list three “categories of exposure of humans to ultrasound in air” [33]:

  • exposure to airborne ultrasound generated as a by-product of the operation of machinery (e.g., ultrasonic cleaning baths);

  • exposure to airborne ultrasound as a result of some machinery requiring the generation of ultrasound for its operation (e.g., ultrasonic range finder);

  • deliberate exposure to airborne ultrasound as a means of eliciting some response (e.g., pest control devices).

1.5 Health effects of ultrasound on humans

Leighton’s executive summary from a 2016 paper is worthwhile providing in full:

For over 40 years, there have been reports of hearing threshold shift . . . and a range of subjective effects (nausea, dizziness, migraine, fatigue, tinnitus and ‘pressure in the ears’ . . .) from ultrasound in air to which workers have been routinely exposed (plus other symptoms that have not occurred in more than one study . . .). The degree of response, from significant to none, varied between workers. The evidence base has not studied sufficient numbers of subjects, and has not been sufficiently sensitive to the presence of sensitive individuals, or sensitive subgroups, within the population, to support the guidelines required today. Studies focused on occupational exposure to ultrasound, which tends also to occur alongside high levels of audible sound, and the effect of this audio frequency noise on the observed effects must be isolated because there is increasing public exposure to ultrasound without such audible cues. Measurement methods and audiological procedures in the past have tended to follow extrapolations of methods used in the audio frequency range which, alongside the calibrations and allowed tolerances of the equipment used, must be critically examined. . . . It is therefore no simple matter to measure the very high frequency/ultrasonic (VHF/US) fields to which people are exposed, either in situ or during audiological testing, and relate those to the levels quoted in past studies. Consequently, the evidence to date has been wholly inadequate to inform the development of guidelines for the increasing exposure of the public to ultrasound in air, and is suspect for occupational exposureFootnote 2 [6].

It would seem then that, while there is an increasing amount of human-generated ultrasound in the environment, there is a lack of consensus concerning the effects this ultrasound might have on the health and well-being of humans and non-human creatures alike, and concerning how to measure those effects. In this regard, the 1982 publication by the WHO lists a number of potential adverse health effects on humans exposed to airborne ultrasound (Footnote 3), including “temporary threshold shifts, altered blood sugar levels, electrolyte imbalance, fatigue, headaches, nausea, tinnitus, and irritability” [32].

Leighton’s third category encompasses technologies deliberately producing ultrasound in order to provoke a response (in humans or otherwise). Examples include pest control devices as well as devices for warding off defensive dogs [10] and loitering teenagers [9]. A less overtly aggressive example is the increasing use of parametric arrays [6, 35] (acoustic spotlights) in spaces such as museums and even the home. In all cases, however, humans are often exposed to ultrasound, inadvertently or not. One increasing problem is the use of airborne ultrasound for entertainment systems, such as acoustic spotlights or haptic feedback devices, where use may entail the immersion of the head of a child in an intense beam of airborne ultrasound for hours at a time [36, 37].

Despite the proliferation of such technologies, Leighton notes that due to the potentially high intensity of ultrasound energy required for their operation along with “the paucity of information on the safe levels for human exposure to ultrasound in air... and the lack of traceability for the measurement of such fields... this could be a safety issue” [33]. Unfortunately, as Leighton’s executive summary above notes, the situation appears not to have changed much in the decade following this observation [6].

1.6 Regulation of ultrasound

The anthropocentric definitions of sound (see Section 1.2) are reflected in society at large with numerous regulations across the world governing noise pollution and acceptable sound levels within urban environments only within the approximate human-audible range of 20 Hz to 20 kHz. Guidelines (the first dating from 1984 [38]) and regulations are not only human centered in the frequencies and intensities that they permit but are also human centered in other ways too: for example, regulations tend to be less strict during daytime when humans are expected to be active (but when bats, for example, are typically inactive).

Where there are regulations governing maximum permissible levels of ultrasound (typically in the workplace), these tend to be set higher than for human-audible sound. For example, permissible levels (usually for 8-h exposure periods) up to 22.4 kHz range between 75 and 85 dBSPL while maximum permissible levels above 20 kHz can be from 105 to 115 dBSPL depending on the country [26, 39, 40] and whether the guidelines are for occupational exposure or public exposure (increased threshold). It might be that the higher level for ultrasound reflects the lack of audible annoyance (viz. awareness) for humans. In Denmark, where this study takes place, there are only exposure recommendations (“Vejledende ultralydgrænser”): 100 dBSPL at 20 kHz (1/3-octave band, center frequency) and 110 dBSPL for bands at 25 kHz, 31 kHz, and 40 kHz and higher [41]. A study of the ultrasound emission of a parametric array concluded that public exposure guidelines were significantly exceeded [35], while a study of some pest controllers and hairdryers [25] concluded that the guidelines were exceeded in several cases.
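To make the Danish recommendations concrete, the following minimal sketch compares a measured third-octave band level against them. It uses only the values quoted above (100 dBSPL for the 20 kHz band; 110 dBSPL for bands at 25 kHz and higher) and the 81.3 dBSPL level reported for the device in Fig. 1a. The function name `danish_guideline_db` is hypothetical, treating the band "centered on 32 kHz" as falling under the 31 kHz entry is our assumption, and exposure duration is ignored entirely.

```python
from typing import Optional


def danish_guideline_db(center_khz: float) -> Optional[float]:
    """Recommended limit in dB SPL for a 1/3-octave band, or None if the
    band is not covered by the recommendations quoted in the text."""
    if center_khz == 20:
        return 100.0
    if center_khz >= 25:
        return 110.0
    return None  # bands below 20 kHz are outside these recommendations


# Level reported in the text for the motion sensor of Fig. 1a (1 m distance).
measured_db = 81.3
limit_db = danish_guideline_db(32)

# A positive margin means the measured level is below the recommendation.
margin_db = limit_db - measured_db
print(f"{margin_db:.1f} dB below the recommended limit")
# prints "28.7 dB below the recommended limit"
```

On these assumptions the device sits well below the recommended level for humans, which says nothing about whether the level matters to other species.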

Leighton’s first category above encompasses, for example, the ultrasonic noise pollution produced by wind turbines. Unfortunately, while there are current regulations governing the infrasound produced by these machines—regulations designed, in part, with the aim to manage the negative effects of infrasound on humans—there are few, if any, regulations on wind turbines regarding ultrasound [42]. There are, however, guidelines on curtailing the effects of wind turbine-generated ultrasound on non-human species, such as bats. Bats often die in collisions with wind-turbine blades, and there is some evidence to suggest that they are attracted to the ultrasonic signatures of the blades moving through the air [43,44,45]. Interestingly, one solution suggested to deter bats from the vicinity of wind farms is to emit ultrasound designed to “jam” the bats’ echo-location abilities [44]; a curious case of fighting fire with fire.

2 Related work

Despite the importance of investigating ultrasound, most research with respect to sound in urban spaces has focused on human-audible sound, such as with the study of urban soundscapes [46] and the field of acoustic ecology [47]. More recent work includes citizen-science sonic surveys of large cities (Footnote 4) and the development of various methods within the domain of machine listening that have been applied to the tasks of detecting and classifying various sound events and human-audible pollution in such cities [48].

It is perhaps quite understandable that our initial focus when it comes to sound and noise pollution is one that is human centered both in the source of the sound and the object of its effects. One of the first to study the sonic environments of cities was Michael Southworth in a 1969 study of the soundscape of Boston [46] (Footnote 5). His concern was two-fold: a study of noise pollution and annoyance to humans created by certain sounds; and to create awareness of the impact of sound on the visual perception of the city. In this, the study was entirely human centered and took a descriptive, qualitative approach to human-audible sound. Schafer, and colleagues such as Truax, developed the field of acoustic ecology which concerns itself with the role of sound in typically natural ecologies [47, 49]. To some extent, this relates to the effects of human-produced and human-audible sound on other species, but it encompasses the studies of natural ecologies as revealed by the sounds they produce. As an example, migration patterns can be tracked over time against changing patterns of noise pollution by monitoring the intensity, timing, and variety of sounds produced by different species as they travel into and out of an area.

As a citizen-science project, SONYC provides an interesting model for studying the intensities of human-produced sound in a city (New York) and for using machine-learning models to analyze the results. A number of datasets of sounds have been developed (Footnote 6) along with software for various tasks including the synthesis of soundscapes made with the collected audio and, of more interest to the present study, the training of audio classification models (machine learning; see also Bello and colleagues [48], who also describe a number of techniques used in the SONYC project). While it provides useful pointers for expanding our work (especially for datasets of ultrasonic sources, machine learning for automatic classification, and the citizen-science approach), SONYC concerns itself with human-audible sound only and the effects of such sounds on the human population of New York.

Citizen-science studies of airborne ultrasound in the urban and natural environment began with uploads to Instagram with the hashtag #UltrasoundInAir (see Appendix A), following the instructions in Leighton’s 2016 paper [6] explaining how the citizen scientist could equip a smartphone to detect airborne ultrasound. Spectra and spectrograms from Europe, USA, Australia, and Japan were uploaded. This led to citizen reports included in references [26, 27]. A notable study was by Fletcher and colleagues, which used citizen-science techniques to collect ultrasound measurements mainly across the UK (other measurements were taken in some European countries) with more than 50% of the recordings in the London urban area [36]. The intention was to identify sources of very high-frequency sound (11.2–17.8 kHz) and ultrasound (described as > 17.8 kHz, the lower limit of the 1/3-octave band [TOB] centered on 20 kHz), as opposed to our intention of a general survey of ultrasound. Initial recordings were made with smartphones (capable of sensing and recording up to c. 22 kHz) and then checked and confirmed with ultrasound recorders and microphones with a low pass cutoff at 100 kHz. Significant ultrasound sources were found to include public-address voice-alarm systems, pest control devices, and CRT TVs (used for surveillance), with the conclusion that “some individuals may be regularly exposed to [ultrasound] at levels of around 85–100 dB SPL (20 kHz TOB) from relatively commonplace devices” (Footnote 7).

3 The study and its context

The motivation for this study arose from a desire to investigate human ultrasound production in urban environments, placing the study at the nexus of six contexts: the growth of urban greening, the human centeredness of sound definitions, non-human species sensitivity to and communication with ultrasound, ultrasound emissions from human activity, the health effects of ultrasound on humans, and the regulatory framework for ultrasound against the background of increasing human production of ultrasound.

In line with an urban greening agenda, there is another way to conceptualize human production of ultrasound. Resourcification is the social process of “transforming things into resources [a process] that rejects sufficiency and finitude” [50]. While the originators of the concept do not appear to view space itself as a resource (preferring instead to state that resourcification is situated in space), it is, in fact, useful to view space as an immaterial resource that is increasingly extensively mined by humans. For the purposes of this paper, this is the space of the medium through which soundwaves travel (for our purposes, air, but it could just as well be fresh water or seawater) and the frequency range utilized. Pre-industrialization and electrification, this sound space, for humans, was restricted, in the main, to the human-audible range. To use von Uexküll’s term, this would be the sonic horizons of the umweltFootnote 8 [51], where species have evolved to exploit a sensory niche. But, from the start of the twentieth century in particular, humans have gradually resourcified more and more of this sonic umwelt, expanding their communication bandwidth, not only in terms of intensity and frequency in the human-audible range (see, for example, the effects of traffic noise on birdsong [52, 53]) but also in terms of intensity and frequency in the ultrasonic spectrum. Thus, (ultra)sound space is a resource to be taken from other species.

In summary of the points made in Section 1, we can state the following:

  • there is strong evidence to support the assertion that the presence and intensity of airborne ultrasound has increased in urban centers with increasing population density and technology use;

  • this increase appears to militate against the desire for urban greening where many of the species to be introduced and encouraged have their sonic umwelts in the ultrasonic range, and where, for example, ultrasound is intimately involved in the survival of certain species (e.g., bats and their prey);

  • we tend to be unaware of this increase, not only because we, as humans, cannot hear ultrasound but also because our very definitions of sound, human centered as they are, direct attention away from other species;

  • following from this, our regulatory framework, likewise, is too anthropocentric;

  • we assume that ultrasound production is ultrasound pollution with a negative effect on ecologies, urban or otherwise.

If high levels of human-produced ultrasound, along with human-audible sound of course, are indeed present in urban environments—by definition, areas heavily populated by humans—then it might be the case that our interactions within such spaces will become negatively affected. The presence of such urban ultrasound must therefore be taken into account in current movements to “green” our cities and to make them more sustainable through the introduction of additional plant life and by encouraging insect and animal species in these spaces. Given that non-human species, plants, insects, and animals, often have sensory thresholds above 20 kHz, and given that human-audible noise pollution (and, presumably, ultrasound pollution) increases as human presence increases, if we wish to green our urban areas, it is clearly important to gather data and knowledge about the presence and effects on urban life of urban ultrasoundscapes. First, though, the presence and intensity of ultrasound in the urban environment must be assessed. Thus, we view the study presented here as necessary to begin providing the groundwork for more extensive studies of the ecological impact of human production of ultrasound in the urban environment.

4 Method and materials

In this section, we present our exploratory study with the primary aim of answering the following basic research question: Is there a significant enough presence of ultrasound in the Aalborg city center to warrant further investigation? Additionally, we have a secondary aim to investigate how different the sonic characteristics of any ultrasound that we do discover may be from one location to another within this urban space. We first explain the procedure and equipment used in making our ultrasound recordings and identify the recording locations in and around the Aalborg city center. We conclude with the basic audio-processing methods used in our primary and secondary analyses to measure the amount of ultrasound present in the recordings made at each of these locations and the levels of dissimilarity in their overall spectral content, respectively. Should our initial findings prove promising, the long-term goals would be to expand the urban ultrasound survey across Aalborg and other locations in Denmark as well as make specific recordings of the possible sources of the ultrasound—both plant and animal species as well as technologies—that contribute to urban ultrasoundscapes.

4.1 Procedure

Six volunteer research assistants each made recordings at their personal residences over the course of approximately a week during the period between September 22 and November 3, 2021 (Footnote 9). Of these six recordings, five were made at locations in central Aalborg, Denmark, a relatively small-sized, former industrial city with an urban population of roughly 143,000 inhabitants and an urban density of 2400/km² (https://en.wikipedia.org/wiki/Aalborg). The sixth recording was made at two separate locations around a farm in the countryside outside of the city center. In all, 3.6 TB of data were recorded at a 500 kHz sampling rate, continuously through the day and night, at each of the six locations. Figure 2 shows the five urban locations (A–E) in and around the Aalborg city center where the continuous ultrasound recordings were made over the course of about a week. Note that the rural location (latitude: 57.248886; longitude: 10.198458) is not shown but can be found approximately 35 km northeast of Aalborg.

Fig. 2

Five ultrasound recording locations (A–E) in and around the Aalborg, Denmark city center (Google Maps ©2022)

As shown in Fig. 2, locations A, B, D, and E all lie rather close to what might be considered a major road, while location C lies in the heart of the city center along walking streets furthest from any major roadways. For these locations, research assistants were instructed to position the microphone outside their apartments, either on a window or on a balcony, pointing down toward the street as this was assumed to be the primary source of any ultrasound in urban environments. The height above street level for each microphone varied from approx. 4 m on the 1st floor to approx. 10 m on the 3rd floor. Upon returning the recording equipment, the research assistants provided the investigators with photographs of the area of focus for the microphones, GPS coordinates, and a description of the outside area including any notable potential sources of noise. As an example of the recordings made, Fig. 3a shows a spectrogram of a short excerpt from an ultrasound recording made at location C (shown in Fig. 3b) where the placement of the microphone has been circled.

Fig. 3

a Spectrogram of an excerpt from the continuous ultrasound recording made at location C, where color indicates the magnitude of the power spectral density. b Location C

One will note in Fig. 3a the presence of sound occurring both within the normal range of human hearing (i.e., 20 Hz to 20 kHz) and as seemingly correlated ultrasound (i.e., above 20 kHz) in the signal. As the power spectral density indicates, the majority of the ultrasound power resides within the frequency band of 20 kHz to 75 kHz with little if any occurring above this threshold. However, the majority of the total sound power can be found in the audible range.

4.2 Recording equipment

The Wildlife Acoustics, Inc. SM4BAT-FS bioacoustics recorder and SMM-U2 ultrasonic microphone were used to make the six recordings. The SM4BAT-FS recorder is monophonic and capable of recording 16-bit PCM WAV files at sample rates of up to 500 kHz. Recordings can be triggered at certain frequency thresholds or can be continuous up to a 2 GB WAV limit, at which point a new recording is automatically started. Two SD card slots support a total of 1 TB of memory, meaning that, theoretically, a total of 250 h or so can be recorded at maximum settings, with each 2 GB file representing just over 30 min of recording time. In practice, however, the rechargeable batteries (4 × NiMH D size) hold enough charge to record roughly 6–7 days of material with reliability tailing off toward the end of this period. The SMM-U2 microphone has a cardioid pattern and can register frequencies up to approximately 240 kHz although its sensitivity falls 50–60 dBV/Pa starting from about 100 kHz and above. Both the recorder and microphone are enclosed in rugged, waterproof polycarbonate suitable for outside use. Prior to recording, all equipment was checked and calibrated with the Wildlife Acoustics Ultrasonic Calibrator in both CAL and CHIRP modes and the correct date and time were set on each recorder. For each recording, the SM4BAT-FS recorder’s 16 kHz high-pass filter was disabled, the option for continuous recording at a 500 kHz sample rate was set, and the uncompressed WAV recording format was chosen.
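The storage figures quoted above can be checked with a short back-of-the-envelope calculation. The following Python sketch is our own illustration rather than part of the original tooling; it assumes decimal units (1 GB = 10⁹ bytes, 1 TB = 10¹² bytes) and ignores the small WAV header, so the exact values depend on how the recorder counts bytes.

```python
# Back-of-the-envelope recording capacity for 16-bit mono PCM at 500 kHz.
SAMPLE_RATE_HZ = 500_000       # maximum sample rate of the recorder
BYTES_PER_SAMPLE = 2           # 16-bit PCM, one channel
FILE_LIMIT_BYTES = 2e9         # assumption: 2 GB counted as 2 * 10**9 bytes
CARD_CAPACITY_BYTES = 1e12     # assumption: 1 TB counted as 10**12 bytes


def recording_seconds(n_bytes, sample_rate_hz=SAMPLE_RATE_HZ,
                      bytes_per_sample=BYTES_PER_SAMPLE):
    """Seconds of audio that fit into n_bytes of raw sample data."""
    return n_bytes / (sample_rate_hz * bytes_per_sample)


minutes_per_file = recording_seconds(FILE_LIMIT_BYTES) / 60
hours_per_card = recording_seconds(CARD_CAPACITY_BYTES) / 3600

print(f"{minutes_per_file:.1f} min per 2 GB file")
print(f"{hours_per_card:.1f} h per 1 TB of storage")
```

Under these assumptions each file holds roughly 33 min and 1 TB holds roughly 278 h, which is in line with the approximate figures given above.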

4.3 Analysis

In our primary analysis, we elected to use average band power (W) in the frequency range 20 kHz to 200 kHz as a measure of the amount of ultrasound in each of the six locations (five urban and one rural) where recordings were made. We report both the absolute average band power between 20 and 200 kHz and this band power as a percentage of the total average band power from 0 to 200 kHz. Because the absolute average band power for ultrasound is rather low in most natural environments (i.e., not technologically mediated ones such as when near an ultrasonic scanner, cf. Leighton’s category 2 [6]), the percentage indicates how significant this amount is with respect to the total average band power recorded; this is particularly important at night, for example, where the absolute average band power in the ultrasonic range may remain low but nonetheless may make up a considerable amount of the overall sound present. For example, the absolute average power found in the evening recording of location C (shown in Fig. 3a) is extremely small at 1.04 × 10⁻⁵ W, and the ultrasonic range accounts for 1.48 × 10⁻⁴% of the total power of the signal. For comparison, the average band power of the motion detector (shown in Fig. 1a) in the range of 20 kHz to 50 kHz is considerably higher at 0.0259 W, and this range accounts for 99.955% of the total power of the signal.Footnote 10
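To make the two reported quantities concrete, the following is a minimal Python sketch of a band-power calculation based on a one-sided periodogram. It is an illustrative stand-in, not the Matlab bandpower() routine used in the study, and the function name band_power, the synthetic test signal, and its amplitudes are assumptions made only for this example. Power is expressed in squared sample units; converting to watts would additionally require the calibration of the recording chain.

```python
import numpy as np


def band_power(x, fs, f_lo, f_hi):
    """Average power of x within [f_lo, f_hi] Hz from a one-sided periodogram."""
    x = np.asarray(x, dtype=float)
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Scale so that summing all bins gives mean(x**2) (Parseval's theorem).
    power = np.abs(np.fft.rfft(x)) ** 2 / n**2
    power[1:] *= 2              # fold negative frequencies onto positive ones
    if n % 2 == 0:
        power[-1] /= 2          # the Nyquist bin has no negative counterpart
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return power[mask].sum()


# Hypothetical test signal: a strong audible tone plus a weak ultrasonic tone.
fs = 500_000
t = np.arange(fs // 10) / fs                      # 0.1 s of samples
x = 1.0 * np.sin(2 * np.pi * 1_000 * t) + 0.1 * np.sin(2 * np.pi * 40_000 * t)

p_ultra = band_power(x, fs, 20_000, 200_000)      # absolute ultrasonic power
p_total = band_power(x, fs, 0, 200_000)           # total power up to 200 kHz
pct_ultra = 100 * p_ultra / p_total               # ultrasonic share in percent

print(f"ultrasonic power: {p_ultra:.4g}, share of total: {pct_ultra:.3f}%")
```

In this synthetic case the ultrasonic tone carries very little absolute power and only about 1% of the total, which illustrates why the absolute and relative measures are reported side by side.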

In our secondary analysis, we computed the spectral dissimilarity of the ultrasound recordings using spectral centroid, spectral roll-off point, and spectral flux as features extracted from the frequency spectrum above 20 kHz and the pairwise Euclidean distance between these features for the recordings from each location. It is important to note that when analyzing the recordings, we were unable to load each 2 GB (approx. 32 min) recording in full due to insufficient computer memory, so we loaded 5 × 10⁶ samples randomly selected from each recording and computed our measurements from these subsets. Specifically, for each location recording we first randomly sampled 5 × 10⁶ samples and used a high-pass filter to attenuate frequencies below the 20 kHz cut-off point in each subset of samples. Next, the newly created set of sampled subsets from all location recordings was made mono and normalized such that the maximum absolute value was 1 (e.g., the range of values for one recording is [−1, 0.9248]). Then, the spectral centroid, spectral roll-off point, and spectral flux were extracted from the audio (median value, 30 ms windows, 20 ms overlap) of each location and averaged across the given recording, resulting in a single value for each feature of each recording (e.g., example values for one recording are 102197.661 Hz, 224005.327 Hz, and 0.001 (dimensionless) for spectral centroid, spectral roll-off point, and spectral flux, respectively). Finally, the set of averaged extracted features from each recording was scaled using standardization across the entire set (e.g., the same values just provided became 2.032, 1.972, and −0.087) before computing the pairwise Euclidean distances between the standardized features from each respective recording. The reported Euclidean distances are those averaged over 10 passes made using the procedure described above. All analysis calculations were made either in Matlab (v. 9.11.0.1769968 R2021b) using functions provided in the Audio and Signal Processing Toolboxes (e.g., bandpower(), spectrogram(), audioFeatureExtractor(), and movmedian()), or Kaleidoscope Pro (v. 5.4.2) by Wildlife Acoustics, Inc. All plots were made using either the same versions of Matlab and Kaleidoscope Pro or R (v. 4.1.2).
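The steps of the secondary analysis can be sketched in Python as follows. This is a simplified illustration under several assumptions, not a reproduction of the Matlab pipeline: frequencies below 20 kHz are simply discarded in the spectrum instead of being removed by a high-pass filter, a 95% roll-off threshold and a Hann window are assumed, features are averaged with a plain mean (no moving median), the standardization uses the sample standard deviation, and the averaging over 10 random passes is omitted. The function names and the three synthetic "recordings" are hypothetical.

```python
import numpy as np


def spectral_features(x, fs, f_min=20_000.0, win=0.030, overlap=0.020, rolloff=0.95):
    """Mean spectral centroid (Hz), roll-off point (Hz), and flux above f_min."""
    x = np.asarray(x, dtype=float)
    x = x / np.max(np.abs(x))                    # peak-normalize to [-1, 1]
    n_win = int(round(win * fs))
    hop = n_win - int(round(overlap * fs))
    freqs = np.fft.rfftfreq(n_win, d=1.0 / fs)
    keep = freqs >= f_min                        # crude stand-in for a high-pass filter
    window = np.hanning(n_win)
    starts = range(0, x.size - n_win + 1, hop)
    mags = np.array([np.abs(np.fft.rfft(x[s:s + n_win] * window))[keep]
                     for s in starts])
    f = freqs[keep]
    power = mags ** 2
    centroid = (power * f).sum(axis=1) / power.sum(axis=1)
    cum = np.cumsum(power, axis=1)
    roll = f[(cum >= rolloff * cum[:, -1:]).argmax(axis=1)]
    flux = np.linalg.norm(np.diff(mags, axis=0), axis=1)
    return np.array([centroid.mean(), roll.mean(), flux.mean()])


def pairwise_dissimilarity(features):
    """Standardize each feature across recordings, then take Euclidean distances."""
    feats = np.asarray(features, dtype=float)
    z = (feats - feats.mean(axis=0)) / feats.std(axis=0, ddof=1)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Hypothetical recordings: ultrasonic tones at different frequencies plus noise.
fs = 500_000
rng = np.random.default_rng(0)
t = np.arange(fs // 5) / fs
recordings = [np.sin(2 * np.pi * f0 * t) + 0.01 * rng.standard_normal(t.size)
              for f0 in (30_000, 32_000, 90_000)]

features = np.array([spectral_features(r, fs) for r in recordings])
D = pairwise_dissimilarity(features)
print(np.round(D, 3))
```

In this toy example the two recordings with neighboring tone frequencies end up closer to each other than either is to the third, which is the kind of relationship the dissimilarity matrix is intended to expose.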

5 Results and discussion

As a first step in our analysis, we looked at the amount of ultrasound present at each hour of the day over the course of a week at each of our five urban locations. Figure 4a shows the hourly average power of ultrasound averaged over the course of a week at five urban locations in Aalborg, Denmark, while Fig. 4b shows this same average power as a percentage of the total average power. Note that while the average powers of locations are aligned by hour, the corresponding days over the week-long period might not be the same because the recordings were made during different weeks.

Fig. 4

a Hourly average power (W) of ultrasound over the course of a week at five urban locations in Aalborg, Denmark. b Percentage of hourly average power of ultrasound (compared to total average power) over the course of a week at five urban locations in Aalborg, Denmark
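The hourly aggregation behind these plots can be illustrated with a short Python sketch. It assumes that one band-power value and one start time are available per recorded file; the function name hourly_mean_power and the example values are hypothetical and are not taken from the recordings.

```python
from collections import defaultdict
from datetime import datetime


def hourly_mean_power(measurements):
    """Average band power per hour of day over all days in the input.

    measurements: iterable of (datetime, band_power) pairs, one per file.
    Returns a dict mapping hour of day (0-23) to the mean band power.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stamp, power in measurements:
        sums[stamp.hour] += power
        counts[stamp.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


# Hypothetical per-file values spanning two days at one location.
example = [
    (datetime(2021, 9, 22, 1, 5), 1.0e-5),
    (datetime(2021, 9, 22, 1, 40), 3.0e-5),
    (datetime(2021, 9, 23, 1, 10), 2.0e-5),
    (datetime(2021, 9, 22, 13, 0), 4.0e-6),
    (datetime(2021, 9, 23, 13, 30), 6.0e-6),
]

hourly = hourly_mean_power(example)
for hour, power in hourly.items():
    print(f"{hour:02d}:00  {power:.2e}")
```

Because values are grouped only by hour of day, files from different calendar days (and, across locations, different weeks) contribute to the same hourly bin, as noted above.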

It is important to note that we should not expect to find high levels of power in the ultrasonic range under ordinary environmental conditions as natural airborne sources of ultrasound are generally low in intensity and such sound attenuates quickly in air, so the low power observed in Fig. 4a is not surprising. What is interesting to note, however, are the relative contours of average power observed at each location. We can see, for example, that the contours for all locations remain relatively flat with each having only a few “spikes” in power. Interestingly, these spikes are found at different times for each location with location B having a notable increase at 01:00 followed by 14 h of relatively flat activity, while location C has a notable increase at 06:00 and 08:00 followed by 12 h of relatively constant activity. Locations A and D are notable for having comparatively fewer periods of constant activity and location E is notable for having the highest increase in activity at 10:00. Moreover, if we look at Fig. 4b, we can see that as a percentage of the total average power in their recordings, locations A, B, and D have the highest overall contributions from ultrasound. Despite the sharp increase in absolute power observed in location E, as a percentage of the overall power, this location has one of the two lowest contributions of ultrasound along with location C, both of which remain rather steady in their activity. Under the assumption that traffic and other human activity are the primary sources of urban ultrasound, this finding could be explained by the fact that both locations C and E reside furthest away from any major roadways while locations A, B, and D all reside closest to such streets. Interestingly, we can see for all locations (but most notably for locations A, B, and D) that the greatest percentage of ultrasound was found in the middle of the night approximately between the hours of 20:00 and 06:00 with the most noteworthy drop in percentage occurring at 13:00. This trend was consistent for all locations except C (and perhaps also E) which is in the heart of the pedestrian walking area of the city center and the furthest away from busy main streets.

For comparison with our findings in urban spaces, we can look at the amount of ultrasound in our rural area outside of the Aalborg city center (latitude: 57.248886; longitude: 10.198458). Figure 5a shows the hourly average power of ultrasound averaged over the course of a week in a rural location outside of Aalborg, Denmark, while Fig. 5b shows this same average power as a percentage of the total average power. We can see that the absolute average power in Fig. 5a is an order of magnitude lower than the average power found in the urban locations (shown in Fig. 4a), which suggests that the greatest sources of ultrasound in northern Denmark, as measured by power, may be those predominantly found in urban environments. However, the absolute average power is considerably more varied than in many of the urban locations, with more frequent and sporadic increases in the detected ultrasound, suggesting a less constant level of activity in this frequency range. If we look at Fig. 5b, we can see, moreover, that the percentage of average power comprising the total power present in the rural location is approximately equal to the average of percentages observed in the urban locations, relative to their different absolute average powers. Nonetheless, the total range in the percentage of average power observed across urban locations is notably greater (approx. 5 to 65% as shown in Fig. 4b) when compared to the rural location (approx. 24 to 34% as shown in Fig. 5b). We might expect to see these findings, however, given the differences in levels of human presence and activity (and thus human-audible and human-produced sound that may dominate the level of ultrasound) between urban and rural locations.

Fig. 5

a Hourly average power (W) of ultrasound over the course of a week at a rural location outside of Aalborg, Denmark. b Percentage of hourly average power of ultrasound (compared to total average power) over the course of a week at a rural location outside of Aalborg, Denmark

Turning now to our secondary analysis, Fig. 6 shows the (dis)similarity in spectral content of the ultrasound between recordings made at each of our six locations (A–F). Similar to the varying levels of reported ultrasound between locations discussed above, the observed (dis)similarity in the characteristics of this ultrasound is interesting to note. The most similar locations, in terms of spectral content in their respective ultrasonic ranges, are F and D (dissimilarity of 0.9293), which reside in a rural location and a city park, respectively, and are both far from heavy traffic. Moreover, in terms of the percentage of average band power in this range when compared to the total, both locations had approximately the same contributions of ultrasound (Figs. 4b and 5b). Interestingly, even though locations may physically reside close to one another, such as with locations A and E, the characteristics of the sound observed here can in fact be quite different (dissimilarity of 3.445). One obvious outlier in Fig. 6, however, is the much higher dissimilarity location A has with all other locations. In looking at its location on the map (Fig. 2), location A resides next to a hospital, which presumably has considerable activity that, according to its spectral content, has rather different ultrasonic characteristics from those observed at the other locations. Moreover, the level of this different ultrasonic activity, as shown in Fig. 4b, is the highest out of all locations. Together, these two observations may suggest that regular and recurring events, such as ambulance sirens at hospitals and early morning trash collection vehicles at other locations, may have an impact on the overall levels and characteristics of ultrasound in long, continuous recordings when analyzed using the methodologies we have employed here.

Fig. 6

Dissimilarity in spectral content (spectral centroid, spectral roll-off point, spectral flux) of ultrasound recordings above 20 kHz made at five urban locations (A–E) and one rural location (F) in and around Aalborg, Denmark. Note that dissimilarity is measured by the pairwise Euclidean distance of standardized spectral content averaged over 10 randomly sampled selections (5 × 10⁶ samples) from each of the continuous recordings made at the six locations

5.1 Limitations

As this was an exploratory study, we will conclude with a reflection on several aspects of our employed methodology, the nature of ultrasound, and the technical issues that may have arisen when recording such sound in urban spaces. It proved challenging to extract information contained within long, continuous recordings that would prove sufficiently meaningful in drawing any definitive conclusions. For example, the few and short random selections we elected to use as part of our methodology most certainly failed to capture all ultrasound activity and averaging the observed band power across all recordings likely further suppressed the importance of specific instances of ultrasound activity that were found. Aggravating such problems further, human activity (and thus production of sound and ultrasound) is variable not only over a 24-h period but also over longer timespans. That is, activity does not only follow a rough daily or weekly pattern, it can also show an annual cycle. Figure 7 shows an instance of one such yearly activity.

Fig. 7

Spectrogram of an excerpt from the continuous ultrasound recording of fireworks in Copenhagen, Denmark, made on New Year’s Eve 2022/2023 where color indicates the magnitude of the power spectral density

While our secondary analysis served as a starting point for exploring how (dis)similar the characteristics of ultrasound were given what we observed with respect to their respective levels at the locations in our primary analysis, similar issues concerning our methodology limit the full extent to which we can comment on the nature of sound in these recordings. In order to effectively do so, it would be necessary to pursue a more fine-grained approach that investigates isolated instances of ultrasound and tries to identify their sources and correlate these with possibly concurrent, human-audible sound activity.

Regarding our chosen technical equipment, the original intention for disabling the 16 kHz high-pass filter on the SM4BAT-FS recorder and electing to use continuous recordings rather than triggered recordings (that begin when sounds above a certain threshold are detected) was to also capture human-audible sound so that we could get a general idea of how much of the ultrasonic sound is a by-product of sources that also generate sound within the range of normal human hearing (e.g., cars) and how much of this sound exists entirely within the ultrasonic frequency range. Without a more thorough analysis, it would not be possible to draw any definitive conclusions, but in looking at Fig. 3a, for example, we can see that the ultrasound does indeed appear to be highly correlated with sources in the audible range. A possibly more serious issue concerns the 50–60 dBV/Pa sensitivity drop of the SMM-U2 microphone above approximately 100 kHz. Due to the relatively quick attenuation in air of ultrasonic frequencies and the recording height above street level at our locations, it may be that there is a considerably greater amount of ultrasound above (or indeed below) 100 kHz than was recorded with the SMM-U2. It is quite possible that more precise (i.e., flatter frequency response) results might have been obtained had we elected to use other available ultrasound recording technology. For example, Brüel & Kjær market systems capable of recording up to 140 kHz,Footnote 11 but these are unfortunately prohibitively expensive for an exploratory study and are neither suitable for outdoor locations nor robust enough for extended use.

6 Conclusions and future work

In this paper, we presented an exploratory study aimed at measuring the presence of often overlooked ultrasound in urban spaces. We collected continuous ultrasound recordings over the course of approximately a week at five urban locations in Aalborg and one rural location outside the city. Our results indicate that there is indeed ultrasound present in both urban and rural spaces that warrants further investigation. In most urban locations, there were notable increases in ultrasound, as measured by absolute average power, at certain times of the day, with a lower and more or less constant presence at locations that were furthest from major streets, suggesting traffic as a possible source. As a percentage of the overall average power found in both the urban and rural recordings, however, the highest percentages of ultrasound were found during the night and the lowest percentages were found during midday. Our secondary analysis revealed that the spectral contents of the ultrasound present at locations either in or closest to green spaces were most similar and that these characteristics could differ significantly between locations that were quite close in proximity to one another. Finally, our location close to a hospital produced the highest levels and most dissimilar ultrasound when compared to all other locations. While limitations concerning our methodology prevent definitive statements regarding the nature of the recorded sound sources, our analyses nonetheless highlight a very likely relationship between a location’s proximity to green spaces and various events, such as traffic, and the levels and spectral characteristics of the ultrasound present.

In future work, it would be interesting to investigate the actual sources of ultrasound (plant, animal, and technological) found in urban ultrasoundscapes. This would serve to lay the groundwork for the possible development of new datasets of urban ultrasonic sources with sustainable biodiversity in mind that could be used, for example, to train machine-listening algorithms to detect and classify these sources, as is already being done extensively with human-audible sound in large cities [49], or to correlate the already well-studied human-audible sources of noise pollution in cities with the presence and intensity of ultrasonic sources that we have begun to explore with this study. Toward this first aim, we have already compiled a dataset containing thousands of short-duration field recordings of ultrasonic sources from five different types of electronic devices with initial results from a deep learning neural network showing promising classification accuracy in being able to distinguish them across the entire frequency spectrum.