Introduction

The effective management and conservation of wildlife are increasingly important in a world of limited resources and intensifying anthropogenic pressure on natural ecosystems (Marques et al., 2013; Gibb et al., 2018). Basic estimates of animal populations, such as presence/absence and density, and whether such estimates vary over time due to environmental changes, anthropogenic pressure, or conservation measures, are essential to inform wildlife management and international conservation policy. Estimates of the population size or area of occupancy of a species, for example, are essential criteria for defining the International Union for Conservation of Nature (IUCN) conservation status of a species (Thomas & Marques, 2012). Although such estimates of a species play a substantial role in ecology, they often are difficult and costly to obtain (Aide et al., 2013; Davis et al., 2020; Spear et al., 2021). Therefore, it is essential to conservation biology to test and develop new, effective, and reliable methods to estimate data on presence/absence and the abundance of species (Marques et al., 2013; Jones et al., 2013; Jetz et al., 2019).

Traditionally, wildlife abundance is predominantly monitored with two methods: visual observations and trapping (Marques et al., 2013). Distance sampling based on observations (Buckland et al., 2010) is probably the most common method for estimating wildlife abundance in terrestrial mammals (Thomas & Marques, 2012), including arboreal primates. Where animals can be distinguished individually based on natural or artificial markings, mark-capture-recapture studies also can be used to monitor wildlife abundance. Mark-capture-recapture studies do not necessarily involve physical capture of the animals and may use camera traps (Gilbert et al., 2021). Both approaches often are difficult when sampling rare, cryptic, or elusive species (Thompson, 2013; Deichmann et al., 2018; MacKenzie et al., 2005).

Passive acoustic monitoring (PAM) is a promising tool to support monitoring and inventories of biodiversity. Several authors even argue that PAM has a variety of advantages over traditional monitoring methods (Mennill et al., 2012; Sugai & Llusia, 2019; Darras et al., 2018a), such as lower survey costs, easier sampling of rare and elusive species, reduced avoidance effects, and reduced potential interobserver errors between surveys. PAM uses autonomous recording units to record soundscapes or target sounds that can later be analyzed for the presence of particular species using computer software. Most studies using PAM have been conducted on marine mammals (Mellinger et al., 2007), especially cetaceans (Marques et al., 2013). However, several recent studies have demonstrated the applicability of PAM for biodiversity monitoring in terrestrial taxa, including bats (Froidevaux et al., 2014), birds (Darras et al., 2018a), frogs (MacLaren et al., 2018), insects (Aide et al., 2013), elephants (Thompson et al., 2009, 2010), and primates (Heinicke et al., 2015; Kalan et al., 2015, 2016; Clink et al., 2018; Vu & Tran, 2019). Furthermore, PAM can assess biodiversity at the soundscape level and provide bioacoustic time capsules for the future (Sugai & Llusia, 2019).

PAM for many terrestrial taxa, including primates, is still under development, and various considerations for data collection and analysis are essential to develop PAM as an informative and effective method for monitoring wild primate populations in the future. General considerations for PAM survey design were recently reviewed (Sugai et al., 2019). First, the radius within which an acoustic signal is detected by an acoustic sensor (detection space) is critical to standardize sampling effort and relate vocal activity to a particular spatial scale (Darras et al., 2016; Sugai et al., 2019). Second, the temporal scale monitored has to be adapted to the target species or taxa as most taxa show daily or seasonal variation in acoustic activity (e.g., in Indri: Torti et al., 2013, in Phaner: Hending, Sgarlata, et al., 2020a), which will influence detectability (Hagens et al., 2018). Third, from a practical point of view, PAM equipment should be affordable for low-income countries, although this problem has already been solved with the introduction of AudioMoth (Hill et al., 2018). Furthermore, acoustic sensors must work under various environmental conditions and record high-quality audio data. Finally, the key to standardized and regular use of PAM is the development and reliable performance of automatic detection algorithms capable of processing large amounts of collected data.

PAM is particularly suited for primates as many species are very vocal and often produce loud calls that evolved for transmission over long distances (Mitani & Stuht, 1998). Many of these species have individual (Lisette et al., 2011; Kulahci et al., 2014), sex (Giacoma et al., 2010), population (Hending et al., 2020a), and/or species-specific vocal signatures (Meyer et al., 2012; Thinh et al., 2011), which are relevant for distinguishing among species and potentially also among sexes or even individuals in soundscape data. Vocalizations have long been used to assist assessments of presence/absence or abundance of some primate species, such as gibbons or titi monkeys (Brockelman & Srikosamatara, 1993; Kidney et al., 2016; Gestich et al., 2017). Therefore, using PAM to assist conservation practice in Madagascar's lemurs is especially interesting as the lemurs are one of the most endangered groups of mammals (Schwitzer et al., 2014). New methods for cost-effective monitoring of lemur populations are essential to assess ongoing conservation efforts and for their future survival (Schwitzer et al., 2013).

The pale fork-marked lemur, Phaner pallescens, is one of the most vocal primate species. It is patchily distributed in dry deciduous forests along the west coast of Madagascar from the Fiherenana River in the south up to Soalala in the north and is considered Endangered by the IUCN (Borgerson, 2020) (Supplemental Fig. 6). This species is pair-living (Schülke, 2005). Home ranges of pair partners overlap extensively, but pair partners rarely spend time together during their activity period (Schülke & Kappeler, 2003). Average home range size was 4.65 ha for females and 5.01 ha for males (Schülke & Kappeler, 2003). The species feeds almost exclusively on the sap and gum of trees and climbers (Charles-Dominique & Petter, 1980). With the onset of the night, Phaner individuals exit their shelter and usually start calling with a series of “kiu” vocalizations (Petter et al., 1971). Kiu calls are short, impulsive calls with a fundamental frequency around 2.3 kHz shaped as a parabola (Fig. 1). During the animals’ activity phase, which usually ends shortly before sunrise, individuals roam fast and extensively from one gum tree to the next in their territory to explore active gum resources and reopen closing ones for the next visit. This behavior is regularly interrupted by single or series of “kiu” vocalizations. A second short, impulsive, and loud vocalization called “kea,” with a fundamental frequency of around 2.6 kHz, is emitted more irregularly and only by males (Charles-Dominique & Petter, 1980) (Fig. 1).

Fig. 1

Examples of the most abundant call types in Phaner pallescens recorded during the night of September 13-14, 2018 in Kirindy Forest, Madagascar. For the typical series of several kiu calls, one or two other animals are emitting kiu calls from further away. We created spectrograms using the spectro function of the seewave package with a Hanning window, a window length of 256 samples, and an overlap of 90%.

Vocal activity in Phaner has been described as more or less continuous, with a clear peak at the beginning of the night and another peak between 21:00 and 23:00 (Petter et al., 1971). However, it is not clear how many individuals were counted in this study, if they were single calls or series of calls, and which type of calls were counted. When several individuals meet at the border of their territory, they often have “acoustic fights,” which can last for several minutes (Charles-Dominique & Petter, 1980; Petter et al., 1971). P. pallescens seems to be able to discriminate conspecific “kiu” and “kea” calls from allopatric heterospecific calls (Forbanka, 2019). Although it has been suggested that vocalizations might function in pair-partner coordination and to communicate position between pair-partners, the latter do not necessarily respond to calls of their partners (Schülke, 2005). Males have limited knowledge about a female’s position (Schülke & Kappeler, 2003).

We tested the suitability of PAM in estimating the density of Phaner pallescens. We aimed to answer the following questions:

  • What is the area in which vocalizations of Phaner pallescens can be detected with certainty (probability of detection ≈ 1) by a human listener?

  • Can we use the estimated area and call rates to calculate Phaner pallescens density?

Methods

Study Area

We conducted our study at the Kirindy Forest/CNFEREF research station of the German Primate Center. The site includes three main research grids (CS5, CS7, and N5) with intersecting trails every 25 m, facilitating systematic experiments on sound attenuation. The trail system creates 25-m × 25-m plots, which have been part of the research area of the German Primate Center since 1995. Kirindy Forest is a dry deciduous forest. Most trees drop their leaves during the dry season lasting from April/May to September/October. Canopy height is on average around 12 m and only exceeded by some tree species, such as baobabs of the genus Adansonia. Kirindy Forest and the adjacent forest of the Aire Protegée Menabe Antimeana (APMA) have been subject to drastic deforestation during the past 20 years (Zinner et al., 2014; Markolf et al., 2020; Kappeler et al., in press). A detailed map of the study area is provided in the supplemental material (Supplemental Fig. 6).

Recording Equipment

We assembled four Solo audio recording units (ARUs) following the instructions on GitHub (https://solo-system.github.io/home.html) (Whytock & Christie, 2017). Each unit consists of an omnidirectional stereo microphone (Primo EM172Z1, signal-to-noise ratio: 80 dB at 1 kHz) connected to a USB soundcard (Soundblaster Play 3), which itself is connected to a Raspberry Pi A+. The Raspberry Pi is powered by a USB power bank (16,500 mAh). Each ARU splits recordings into 10-min .wav files. The starting time of each .wav file is provided by a clock module (different models) and written in the filename. To record, we placed the system in a customized IP67 enclosure. We used a rugged nylon panel mount to attach the stereo microphones to both sides of the enclosure box. A picture of the full recorder, with a complete list of the equipment, can be found in the supplemental material (Supplemental Table 3; Supplemental Pictures 1-2).

Recording Protocol

We attached the ARUs to a tree of 10- to 20-cm diameter at a height of approximately 150 cm using a tension belt. We started the recorders on site by connecting the recorder to the power bank, usually in the morning before conducting playback experiments. In addition to Phaner, our recording protocol aimed to record soundscapes for other diurnal and nocturnal bird and mammal species present in Kirindy. The equipment recorded until it ran out of battery power unless stated otherwise. We set ARUs in September/October 2018 and in September 2019.

Annotation of Recordings

We annotated acoustic data in Raven Pro (Version 1.6). Annotation consists of drawing a rectangle around the signal of interest (SOI). The SOI is then defined in two dimensions by the start and end time and the lowest and highest frequency and saved in an annotation table. We recorded and saved additional measurements or information, such as species, root-mean-square amplitude, distance to playback, and call type, in this table. We exported annotation tables to R using the package Rraven (Araya-Salas, 2020) or customized R scripts. We annotated Phaner calls for one single, complete night (18:00:00-05:40:00) at location CS7 (grid) E5 (plot). As little is known about Phaner call types, we distinguished kius and keas and double, triple, and multi kius or keas, representing very fast sequences of individual kius or keas (Fig. 1). A second person verified all annotations. We did not attempt a qualitative analysis of different call types but instead made a quantitative description of vocal activity based on listening and visual inspection of the spectrogram.

Calibration of Recordings and Sound Pressure Level Measurements

We calibrated our recordings using a sine wave of 1 kHz played toward the ARUs at 1-m distance before conducting playback experiments and recording environmental sounds. We measured the peak sound pressure level of the sine tone four times at 1-m distance using a PCE-322A sound level meter. We used the mean of the four measurements (87.5 dB, A-weighted) to calibrate the recordings in Raven Pro following the “End-to-end” calibration using a recorded calibration signal of known sound pressure as described in the Raven Pro FAQs (https://ravensoundsoftware.com/knowledge-base/calibrating-recordings-in-raven-pro/).

Using the calibrated recordings, we calculated sound pressure level (SPL) for all annotations based on Raven’s root-mean-square amplitude measurement, which is on a linear pascal scale and which we converted to decibels referenced to 20 μPa, the standard reference pressure in air.
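As an illustration of this conversion, the following Python sketch applies the standard definition SPL = 20 log10(p_rms / p_ref) with p_ref = 20 μPa. It is not the code used in this study, which relied on Raven Pro and R; the function name and the example input are hypothetical.

```python
import math

# Standard reference pressure for SPL in air: 20 micropascals.
P_REF_AIR_PA = 20e-6


def rms_pascal_to_spl_db(p_rms_pa: float) -> float:
    """Convert an RMS pressure in pascals to SPL in dB re 20 uPa."""
    if p_rms_pa <= 0:
        raise ValueError("RMS pressure must be positive")
    return 20.0 * math.log10(p_rms_pa / P_REF_AIR_PA)


# Hypothetical example: an RMS pressure of 1 Pa is roughly 94 dB SPL.
example_spl = rms_pascal_to_spl_db(1.0)
print(round(example_spl, 2))
```

Because the scale is logarithmic, a tenfold change in RMS pressure corresponds to a 20 dB change in SPL.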

Inferring a Detection Space for a Human Listener Detecting Phaner Vocalizations with Certainty

We used two different experimental approaches to estimate the detection space for Phaner vocalizations.

Playback Experiments

We placed one ARU inside one of the 25-m × 25-m plots in the grid systems of the Kirindy Forest research area to determine the detection space for Phaner pallescens. Subsequently, we took the ARU's GPS point and used it to measure the distance between the speaker (JBL Flip 4) and the ARU. Before playing the sounds back, we adjusted the speaker volume so that playback calls averaged around 95 dB SPL (A-weighted, measured at 1 m with a PCE-322A sound level meter). We mounted the speaker onto an adjustable bar to reach different heights. The experimenter moved away from the ARU using a handheld GPS and stopped at 10 m, 20 m, 40 m, 80 m, 160 m, and 320 m. At each distance, we played a total of 8 different calls of Phaner 2 times, resulting in 16 Phaner calls for each distance and each height. We mixed Phaner playback calls with calls of other birds and mammals from Madagascar, including several lemurs, and with pure sine tones at 1 kHz, 2 kHz, 4 kHz, 8 kHz, and 16 kHz. We incorporated pure sine tones to get a better understanding of frequency-dependent attenuation of sounds with distance. We used the 1-kHz sine tone to calibrate the sound pressure level of our recordings. We played back calls in a random order at each distance so that the listener did not know the order of animal vocalizations.

We repeated this procedure 12 times in randomly chosen plots. We conducted experiments between 08:00 and 11:00 h to reduce the natural noise of other calling animals, especially Phaner pallescens.

After retrieving the recordings, we annotated the data in Raven Pro as described above. For the playback experiments, the annotation tables included information about the species, the call type, the distance of the playback to the ARU, the SPL, the plot, and whether the call was detected by a human listener (1) or not (0). We used a generalized linear mixed model (GLMM) with a binomial error structure to model the probability of detection along the distance gradient using the R package lme4 (Bates et al., 2015, b). We ran separate models for the sine waves and Phaner vocalizations. We used plot as a random factor (plotsample) and z-transformed distance and height (categorical predictor) as fixed factors in the Phaner model. For the sine wave model, we additionally entered frequency (z-transformed) as a fixed factor. We tested for significance by comparing the full models, including the fixed effects, to a null model including only the random effect of the plot, using a likelihood ratio test (Dobson & Barnett, 2018). For the remaining lemur species, we only show probability curves in the supplemental material.

To investigate call attenuation further, we plotted the decline in SPL against distance using a logarithmic smoothing function in ggplot2 (y ~ log(x)). In addition, we calculated a theoretical sound attenuation curve and plotted it using the function “attenuation” of the seewave package. This function calculates theoretical attenuation values based on a reference SPL and a reference distance. We set the reference SPL to 84.013 dB and 83.365 dB, which were the mean SPL measured at 10 m for the height categories of 1 m and 3 m, respectively, and the reference distance to 10 m.
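The theoretical curve can be approximated by hand under the assumption of pure spherical spreading, which is how the figure captions describe it: SPL falls by 20 log10(d / d_ref) relative to the reference value, or about 6 dB per doubling of distance. The Python sketch below illustrates that relationship and is not the seewave implementation. The function name is hypothetical, and atmospheric absorption, vegetation, and ground effects are ignored.

```python
import math


def spherical_attenuation_db(spl_ref_db: float, d_ref_m: float, d_m: float) -> float:
    """Expected SPL at distance d_m under spherical spreading only.

    spl_ref_db is the SPL measured at the reference distance d_ref_m.
    """
    if d_ref_m <= 0 or d_m <= 0:
        raise ValueError("distances must be positive")
    return spl_ref_db - 20.0 * math.log10(d_m / d_ref_m)


# Reference values from the text: mean SPL of 84.013 dB at 10 m (1-m height).
playback_distances_m = (10, 20, 40, 80, 160, 320)
for d in playback_distances_m:
    print(d, round(spherical_attenuation_db(84.013, 10.0, d), 1))
```

Measured values that fall below this curve point to excess attenuation beyond geometric spreading, for example from vegetation.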

Stationary Recorders Along a Transect

Because the natural SPL of Phaner calls is not known, and playbacks did not account for calling heights above 3 m, we developed a second approach that would yield comparable results for sound attenuation but control for the natural loudness (amplitude variation) of Phaner kiu calls and the height of the animal during calling. We attached the first of four ARUs 1.5 m off the ground on a known feeding tree; this served as the ARU with a horizontal distance equal to 0 m. We observed several individuals (at least three) feeding and vocalizing on the tree in the nights prior to recording. We installed three other ARUs 50 m, 100 m, and 150 m away from the first at the same height. All four recorders had clock modules that we synchronized just before data collection. The recorders stayed in the forest for three nights.

After retrieving the recordings, we searched manually for the calls with the highest amplitude in the recordings of the closest ARU. We assumed calls measured with an SPL of more than 70 dB (re 20 μPa) in Raven Pro were potentially emitted from the feeding tree or at least very close to it. We chose this value based on observation, SPL measurements, and recordings conducted at the feeding tree two nights before the experiment. We annotated those calls with a distance of 0 m and then searched for them in the recordings of the more distant ARUs using the time synchronization of the recordings. Finding the corresponding call is usually not too difficult as animals emit vocalizations in repeated sequences of calls. When we identified the same call, we annotated it with the distance to the first recorder. We plotted the SPL against the distance to visualize sound attenuation as described for playbacks above. We added a constant of 10 to the distance variable assuming that the calling animal was at least ±10 m away from the recorder at the feeding tree. We measured the mean ambient sound pressure level of 16 different areas across different recordings and different ARUs. To match the duration of these measurements to that of the calls, we copied the annotation rectangles of annotated calls within the same file and placed the copies in an area without calling activity of Phaner pallescens but with other background noise, such as insect noise.

Choosing a Sound Level for Distance Truncation

Before calculating density based on bout rates per hour, we filtered out all calls (and bouts) below an SPL of 50 dB (re 20 μPa), corresponding to our estimated detection space based on the results of the playback and stationary transect experiments. Distance truncation aimed to keep detection probability within our detection space at p = 1. We based our decision to truncate at 150 m and an SPL of 50 dB on the decline of SPL with distance in the fixed transect experiment and the ambient SPL.

Additionally, we fitted a linear model of the logarithm of distance as a function of SPL to the playback data and used it to estimate a distance for each of the 10,380 annotated calls from its measured SPL. We plotted the resulting distances in a frequency histogram. Furthermore, we used the R-package Distance (Miller et al., 2017) to calculate and plot a detection function (key = hazard rate) for the distance data averaged for bouts to further support our choice of truncation at 150 m and illustrate the application of distance sampling to data from ARUs. We used the dht2 function of the Distance package to estimate density based on bouts for each hour separately (see supplemental material for a detailed description of distance sampling).
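The idea of predicting distance from SPL can be sketched as an ordinary least-squares fit of ln(distance) on SPL, followed by back-transformation. The Python example below is an illustration only. It assumes the natural logarithm, uses hypothetical function names, and is fitted to synthetic, noise-free values generated from spherical spreading rather than to the playback data of this study.

```python
import math


def fit_log_distance_model(spl_db, distance_m):
    """Least-squares fit of ln(distance) = intercept + slope * SPL."""
    log_d = [math.log(d) for d in distance_m]
    n = len(spl_db)
    mean_x = sum(spl_db) / n
    mean_y = sum(log_d) / n
    sxx = sum((x - mean_x) ** 2 for x in spl_db)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(spl_db, log_d))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_distance_m(spl_db, intercept, slope):
    """Back-transform the fitted model to a distance in metres."""
    return math.exp(intercept + slope * spl_db)


# Synthetic calibration data (NOT measured values): spherical spreading
# from 84.013 dB at 10 m, evaluated at the playback distances.
distances = [10, 20, 40, 80, 160, 320]
spls = [84.013 - 20.0 * math.log10(d / 10.0) for d in distances]

intercept, slope = fit_log_distance_model(spls, distances)
estimated = predict_distance_m(64.013, intercept, slope)
print(round(estimated, 1))  # close to 100 m for this noise-free toy data
```

With real playback data the fit is noisy, so distances estimated this way carry considerable uncertainty and are best read as a distribution rather than as individual positions.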

Plotting of Call Activity, Vocal Bout Rate, and Density Calculation

We calculated descriptive statistics for different call types using custom R-scripts and Rraven (Araya-Salas, 2020). We used the R-package lubridate (Grolemund & Wickham, 2011) to import and group by time stamps. We generated plots using ggplot2 (Wickham, 2016).

We calculated a vocal bout rate per hour based on focal animal behavioral observations of radio-collared individuals in the same study area (CS7) at Kirindy Forest in September and December 1999 and April 2001. A bout consists of a series of vocalizations by the focal animal. We defined vocalizations separated by less than 7 s as belonging to the same bout. Although these data represent and are treated as a cue rate, we prefer the term bout rate to highlight that these cues almost always consist of series of individual vocalizations. The bout rate data were collected as part of a study on the Phaner social system (Schülke, 2002; Schülke, 2003b; Schülke & Kappeler, 2003; Schülke, 2005). The observer recorded the number of calls or series of kiu and kea calls emitted by the focal animal per 2.5-min interval during 2-h focal samples between 18:00 and 02:00. They observed 15 individuals (7 females and 8 males) for a total of 174.8 h, resulting in 4,196 data points (1,031 in September 1999, 786 in December 1999, and 2,379 in April 2000). We calculated an overall mean bout rate and its 95% confidence interval (lower and upper bounds) using custom R-scripts. We also calculated mean bout rates per hour separately for each month. We used a nonparametric Kruskal-Wallis test with pairwise post hoc comparisons to test for significant differences in mean bout rate per hour between months.
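The 7-s bout criterion can be expressed as a simple grouping rule over annotated call times. The Python sketch below is an illustration under stated assumptions, not the R scripts used in this study: the function name and the example call times are hypothetical, the gap is measured from the end of one call to the start of the next, and a gap of exactly 7 s is treated as starting a new bout.

```python
def group_calls_into_bouts(calls, max_gap_s=7.0):
    """Group (start_s, end_s) call annotations into bouts.

    A call joins the current bout when the silence since the latest
    call end is shorter than max_gap_s; otherwise it starts a new bout.
    """
    bouts = []
    last_end = None
    for start, end in sorted(calls):
        if last_end is not None and start - last_end < max_gap_s:
            bouts[-1].append((start, end))
            last_end = max(last_end, end)
        else:
            bouts.append([(start, end)])
            last_end = end
    return bouts


# Hypothetical call times in seconds from the start of a recording.
example_calls = [(0.0, 0.3), (2.0, 2.3), (5.5, 5.8), (20.0, 20.3), (21.0, 21.2)]
example_bouts = group_calls_into_bouts(example_calls)
print(len(example_bouts), [len(b) for b in example_bouts])
```

Counting the bouts whose first call starts within each clock hour then gives the bouts-per-hour values used for the density calculation.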

For Phaner calls in the automated recordings (kiu and kea calls and series), we followed the protocol of the focal samples in defining a bout. We treated vocalizations that were separated by more than 7 s of silence as a new bout. However, we did not distinguish or count separately if more than one individual was vocalizing, as it is impossible to separate individuals without information on their location. A bout within the automated recordings can therefore contain calls of more than one individual. We grouped data by month, date, focal animal, and hour to calculate a mean bout rate per hour. We divided this mean bout rate per hour, derived from the ARU, by the bout rate derived from behavioral observations to calculate the number of individuals in the ARU detection space of 150 m radius (0.07 km2) and extrapolated to 1 km2. We calculated density using the following formula:

$$D=\frac{n}{a \cdot b},$$

where D is density, n is the number of bouts counted per hour, a is the area of sampling, and b the mean bout rate emitted by individuals per hour. We also calculated density using the lower and upper border of the 95% confidence interval of the bout rate of individuals per hour as a measure of uncertainty. To calculate a mean density over the night, we excluded the first and last hours because data were not based on a full hour. Additionally, we ran a correlation of bouts per hour of the full dataset and the reduced dataset to test whether the filtered dataset reflected the pattern observed in the full dataset.
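As a worked illustration of this formula, the Python sketch below plugs in the values reported in the Results: 10.5 filtered bouts per hour, a sampling area of 0.07 km2, and an individual bout rate of 3.9 bouts per hour. The function names are hypothetical, and the code is not the original analysis script.

```python
import math


def circle_area_km2(radius_m: float) -> float:
    """Area of a circular detection space, converted to square kilometres."""
    return math.pi * (radius_m / 1000.0) ** 2


def density_per_km2(bouts_per_hour: float, area_km2: float,
                    bout_rate_per_individual: float) -> float:
    """D = n / (a * b): individuals per square kilometre."""
    if area_km2 <= 0 or bout_rate_per_individual <= 0:
        raise ValueError("area and bout rate must be positive")
    return bouts_per_hour / (area_km2 * bout_rate_per_individual)


# Values as reported in the Results.
n_bouts = 10.5   # filtered bouts per hour (full hours only)
area = 0.07      # km2, rounded area of a circle with 150-m radius
bout_rate = 3.9  # bouts per individual per hour

individuals = n_bouts / bout_rate                    # about 2.69
density = density_per_km2(n_bouts, area, bout_rate)  # about 38.5
print(round(individuals, 2), round(density, 1))
print(round(circle_area_km2(150.0), 4))              # about 0.0707 km2
```

Using the unrounded area of about 0.0707 km2 instead of 0.07 km2 lowers the estimate slightly, to about 38.1 individuals/km2, which shows how sensitive the result is to the assumed detection radius.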

Ethical Note

All experiments were noninvasive. This research was conducted under the research permit of the Malagasy Ministere de l’Environnement de l’Ecologie et des Forets (N°215/18/MEEF/SG/DGF/DSAP/SCB.Re). The authors declare no conflicts of interest.

Results

Playback Experiments

The playback experiment revealed a significant influence of distance on the probability of call detection by a human listener. Overall, the model was highly significant (likelihood ratio test comparing full and null model: χ2 = 1753.6, df = 2, P < 0.001). Height had no significant effect on call detection, but the effect of distance was highly significant (Table I; Supplemental Fig. 7). Within a detection radius of 100 m, the probability of detecting the species' call is still almost 90% (Fig. 2). At 150 m, the probability of detecting the call is slightly above 50%, and it drops to less than 25% beyond 200 m. At 320 m, no Phaner calls could be detected by human listeners. Up to 160 m, most undetected calls were masked by environmental noise during the experiments (see also Supplemental Figs. 8a-d for other species).

Table I Model estimates of playback experiments for Phaner calls and sine waves conducted in Kirindy Forest in 2019

Fig. 2

Probability of detection of Phaner playback calls over distance and two height categories in Kirindy Forest in 2019. A human listener manually annotated calls. We plotted the curve using the “geom_smooth” function with binomial family in ggplot2. The dotted line indicates a probability threshold of 90%.

The full model analyzing detections of sine waves was significant (likelihood ratio test comparing full and null model: χ2 = 394.4, df = 3, P < 0.001). Distance and frequency were both significant influences on detection, whereas height was not (Table I; Supplemental Fig. 9). Sine waves of 1 and 2 kHz can be detected as far as 320 m. Higher frequencies (4, 8, and 16 kHz) get lost much earlier as indicated by the earlier flattening of the probability curves (Supplemental Fig. 10).

Stationary Recorders Along a Transect

The four ARUs recorded a total of 22 calls that were measured at all four recorders. We found that calls emitted directly in the tree where the first ARU was located ranged between 73.3 dB and 91.5 dB in SPL (Fig. 3). The same calls at a distance of 50 m to the first ARU ranged between 53 dB and 71.9 dB. At 100 m, SPL ranged between 46 dB and 60.4 dB, and at 150 m, the SPL ranged from 47 dB to 62.3 dB. Our results indicate that calls measured with SPL of more than 50 dB are very likely emitted inside a detection space of 150 m. This value is only slightly above the mean ambient SPL of 49.9 dB (SD = 3.16, n = 16) during the transect experiment.

Fig. 3

Fall off of sound pressure level (SPL) of real Phaner calls against distance along a fixed transect of acoustic recording units in Kirindy Forest in September 2019. A filtering threshold at 50 dB has been identified as SPL above which detections are likely to occur within a detection space of 150 m. The black dashed line indicates ambient sound level (mean = 49.9 dB SPL). The black dotted line represents theoretical spherical sound pressure level attenuation.

Call and Bout Rates and Density

Between 18:00:00 and 05:40:00, we counted 10,380 Phaner pallescens calls. Of those, 8,641 were kiu calls, which are sometimes emitted as single vocalizations but are mostly part of very long kiu series (Fig. 4a). We only considered kiu and kea calls and variations of these two call types in the form of fast repeating kius or keas (e.g., multi kea, multi kiu). The highest vocal activity occurred between 21:00-22:00 and 22:00-23:00, with up to 1,950 and 1,597 calls, respectively (Supplemental Fig. 11). We counted 402 bouts consisting of series of calls separated by less than 7 s. Calls per bout ranged from one single call to 679 calls. We counted a mean of 33.5 (SD = 9.94) bouts per hour. The distribution of bouts is similar to that of single calls, with the highest activity between 21:00-22:00 (Fig. 4b).

Fig. 4

Number of (a) call types and (b) bouts per hour of Phaner pallescens over one night (Sep 13-14, 2018) in Kirindy Forest.

Only one data point was below SPL of 50 dB at 150 m in the transect experiment (Fig. 3), and only two data points were below this level in the playback at a height of 3 m (Fig. 5). The frequency plot of distances estimated based on our model of the SPL fall-off indicates that a cutoff at 150 m would truncate most detections but ensure a probability of detection close to 1 (Supplemental Fig. 12). The detection function fitted to the estimated bout distances confirms this as well (Supplemental Figs. 13a-b).

Fig. 5

Fall off of sound pressure level (SPL) of the emitted signal against distance for two height categories during playback experiments conducted in Kirindy Forest in September 2019. The black dashed line represents theoretical spherical sound pressure level attenuation.

Applying the SPL filter of 50 dB or higher reduced the number of bouts to a mean of 9.33 bouts per hour considering the full dataset. Considering full hours only, we estimated a total of 105 bouts with a mean of 10.5 bouts per hour. The numbers of bouts per hour in the full and SPL-filtered datasets were significantly and fairly strongly positively correlated (Pearson’s r = 0.64, t = 2.61, df = 10, P = 0.026; Supplemental Fig. 14).

The focal animal observations (data from Schülke, 2002) revealed a mean of 3.90 bouts per hour. We found a relatively small but significant difference in bout rate between September and December (Kruskal-Wallis chi-squared = 7.19, df = 2, P = 0.02, Bonferroni post hoc test: P = 0.049; Supplemental Fig. 15). Based on the mean number of bouts per hour recorded by the ARU and the mean individual bout rate of 3.9 bouts per hour, we estimated that we detected around 2.69 individuals per hour within our detection space. Considering a detection space with a radius of 150 m (0.07 km2), we estimated a density of 38.5 individuals/km2 based on a single night at a single location (Table II).

Table II Number of calls and vocal bouts before and after applying a filter of ≥50 dB, and density estimates calculated based on recordings of a single autonomous acoustic recording unit in Kirindy Forest in September 2018

Density estimates calculated via distance sampling with truncation at 150 m were only slightly lower with 34.8 individuals/km2 but almost doubled (72.8 individuals/km2) with truncation at 200 m (Supplemental Table 4).

Discussion

Using open-source, low-cost autonomous sound recorders during the dry season, we recorded 402 vocal bouts of Phaner pallescens at Kirindy Forest during a single night with a single ARU. We further estimated that Phaner individuals produce on average 3.9 vocal bouts per hour based on focal animal observations. After filtering the number of vocal bouts to those that likely occur in our experimentally derived detection space of 150 m, we estimated a density of Phaner of 38.5 individuals/km2.

We estimated a very high number of calls across a single night (>10,000 detected calls). With more than 800 calls per hour, the most abundant calls were “kiu” calls, mainly emitted as a long series. Whether the fast sequences of “kiu” and “kea” in our study classified as double, triple, and multi kius or keas represent calls with different meanings is currently unclear. Calls might be more intense due to arousal during vocal fights that frequently happen at territory boundaries (Charles-Dominique and Petter, 1980; Petter et al., 1979). Detailed playback experiments would be necessary to investigate this aspect. Although we did not aim to describe and classify the different call types of fork-marked lemurs, we hope our study will stimulate future research on the function of different call types in this species.

The distribution of call activity over the night is partly consistent with the pattern described by Petter et al. (1979), in which call activity peaked at the beginning of the night. Our data showed a clear peak between 21:00 and 23:00, but more data from different locations and seasons are needed to describe the variation in call activity. Preliminary analyses indicate that activity might be influenced by lunar light intensity (unpublished data).

Experiments to Estimate Detection Space

The results of our three experimental approaches to infer detection space were mostly consistent, indicating a high probability that a human listener detects vocalizations of Phaner pallescens beyond 150 m. The frequency range of kiu and kea calls lies between 0.5 kHz and 15 kHz (Forbanka, 2019), with fundamental frequencies around 2.2 and 2.6 kHz. Consistent with the results of the playbacks of sine waves, the fundamental and dominant frequencies are transmitted over longer distances than frequencies above 3 kHz, which are attenuated more strongly. This results in many faint calls, likely emitted from far away, that can still be detected by a human listener familiar with the sound but might be challenging for automatic algorithms to detect and classify with high certainty. Applying a filtering threshold might be a practical solution to improve future automatic detection algorithms if the study is designed to calculate densities based on a given spatial area. Our SPL threshold for truncation (50 dB) was just above ambient noise level. Although calls of our study species can still easily be detected by a human listener and by visual analysis of the spectrogram, calls below ambient noise level are probably more difficult to detect by automatic algorithms. However, whether studies can intentionally dismiss data points beyond a certain distance threshold depends on the study species, because not all species are as vocally active as Phaner pallescens.
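The filtering step discussed above can be illustrated with a minimal sketch. It assumes that each detection already carries a measured SPL value and that detections at or above the threshold are retained; the class name, field names, and example values are hypothetical and serve only as an illustration, not as our analysis pipeline.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected call (hypothetical structure for illustration)."""
    start_s: float  # start time within the recording, in seconds
    spl_db: float   # measured sound pressure level of the call, in dB


def truncate_by_spl(detections, threshold_db=50.0):
    """Keep only detections whose SPL is at or above the threshold.

    Treating the threshold as inclusive is an assumption of this sketch.
    """
    return [d for d in detections if d.spl_db >= threshold_db]


# Hypothetical example values, not data from this study.
example = [
    Detection(start_s=12.0, spl_db=63.5),
    Detection(start_s=40.2, spl_db=48.9),  # below threshold, dropped
    Detection(start_s=75.8, spl_db=50.0),
]
kept = truncate_by_spl(example)
print([d.start_s for d in kept])
```

Because SPL serves here as a proxy for distance, the appropriate threshold would need to be calibrated for each recorder type, habitat, and species.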

We did not conduct playback experiments during the animals’ activity period, as the animals react strongly to playbacks (Forbanka, 2019) and distinguishing between playbacks and natural vocalizations would not have been feasible. A drawback of this approach, however, was that playbacks were often masked by environmental sounds, such as parrots and other birds, which usually do not sing at the time of Phaner’s vocal activity. Most calls missed by the human listener during the playbacks up to 160 m can be attributed to masking by other environmental sounds. Except for calls of the crested coua (Coua cristata) at sunset, masking by other sounds and confusion of Phaner pallescens calls with vocalizations of other species did not seem to be a major issue at Kirindy during our recording period.

Environmental variables in Kirindy are strongly influenced by seasonality and the pronounced dry season from April/May to October/November. Future experiments should consider controlling for seasonality, ambient temperature, and humidity, which might affect sound transmission (Ellinger and Hödl, 2003). However, the results of the playback experiments were overall confirmed by our experiment using a fixed transect during the animals’ activity period, indicating that temperature and time of day are unlikely to have a significant influence on our results. Previous test experiments conducted in 2018 also included ecological measurements, such as DBH and canopy cover, taken in different plots across the study area. However, as these ecological parameters had no significant effect on sound transmission in different plots, they were excluded from subsequent experiments (Supplemental Fig. 16).

Playback experiments almost never reflect the full reality, especially if the vocalizations’ natural sound pressure level is unknown. Maximum measured SPL above 90 dB at distance zero during the transect experiment indicates that the natural SPL of Phaner calls can be higher than that broadcast during our playbacks. In contrast, the transect approach relied on the assumption that calls are emitted very close to our ARU at distance “zero,” which might not always have been the case. We consider both approaches valuable, although each has advantages and disadvantages for estimating detection space and sound transmission. The applicability of either method will depend mostly on the scope of the study, especially the taxa under study. Our transect experiment might not be feasible for many primates, and other approaches need to be developed. For example, one approach is to use time-synced recorders to estimate detection space using distance truncation in birds (Hedley et al., 2020). Other authors used caged birds to estimate detection distances from environmental sound recordings in the rainforest of Indonesia (Darras et al., 2018b).

Calculation of Density

Our calculation of density assumes that bout rate or call activity is positively correlated with the number of individuals found in a specific area. Future studies need to test whether this assumption is correct and stable throughout the year by comparing abundance estimates collected using other methods (e.g., capture-recapture, visual distance sampling) with abundance estimates from PAM, ideally at the same time. Additionally, more data over a longer study period and from different locations need to be analyzed to estimate how stable call activity is across different nights, as call rate might also be influenced by environmental factors such as seasonality, time, temperature, rainfall, or social activities (Bearder et al., 2002; Starr et al., 2012; Clink et al., 2020). Future analyses will need to test whether abundance can be reliably estimated using call activity in Phaner pallescens and other primate species. Our estimate of 38.5 individuals/km2 is slightly lower than estimates of 60-70 individuals/km2 from the same area based on capture-recapture studies in previous years (Schülke, 2003a). However, assuming a detection space with a radius of 200 m would result in a minimum of 21.6 individuals/km2, and reducing the detection space to a radius of 100 m would result in a maximum of 87.1 individuals/km2. Our estimate is in a plausible range, although it is based on a single night at a single location. More data from different plots will increase confidence in the estimate. As such, our study provides a proof of principle for estimating lemur abundance using ARUs.
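The sensitivity of the estimate to the assumed detection radius can be illustrated with a minimal sketch. It assumes a circular detection space and, as a simplification of our own, holds the number of animals implied by the filtered bouts fixed while only the radius changes, so that density scales with the inverse square of the radius. The function names are hypothetical.

```python
import math


def circle_area_km2(radius_m):
    """Area of a circular detection space, in square kilometers."""
    return math.pi * (radius_m / 1000.0) ** 2


def rescale_density(density_per_km2, radius_m, new_radius_m):
    """Rescale a density to another assumed detection radius.

    Assumption of this sketch: the number of animals implied by the
    original estimate stays fixed; only the sampled area changes.
    """
    implied_animals = density_per_km2 * circle_area_km2(radius_m)
    return implied_animals / circle_area_km2(new_radius_m)


# Start from the reported estimate of 38.5 individuals/km2 at 150 m.
for radius in (100, 150, 200):
    density = rescale_density(38.5, 150, radius)
    print(f"radius {radius} m: {density:.1f} individuals/km2")
```

Under this simplification, the rescaled values for 100 m and 200 m fall close to the reported 87.1 and 21.6 individuals/km2; small differences are expected because the sketch does not reproduce the original calculation.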

Our approach of using 150 m as the radius of the detection space follows Thomas and Marques (2012) for point sampling, assuming that all calls are detected with certainty (probability of detection = 1). We have good reasons to assume that detection probability is close to one. First, the duration of our count periods is quite long (1 h) for point counts and should ensure sufficient availability of animal cues within the counting period. The minimum bout rate calculated for the 15 individuals was one bout (range 1-16, mean 3.9) per hour. Emitting no cue for 1 h should be very rare in Phaner pallescens. Second, we showed that cues are easily detectable by a human listener well beyond 150 m (Supplemental Fig. 12). Third, most (>90%) cues occur in bouts, sometimes lasting several minutes. Although single calls show considerable attenuation of higher frequencies, a human listener can hardly miss these long series. However, these series seem to pose a challenge for reliable automatic acoustic detection algorithms (unpublished data).

As described by Thomas and Marques (2012), additional multipliers could be included in the above formula to account for factors, such as detectability, group size, and false-positive or false-negative rates when using automatic detections. In the appendix, we show that detectability can be modelled using distance sampling approaches as several authors have shown that sound level measurements from acoustic recordings can provide objective distance estimates (Darras et al., 2018b; Hedley et al., 2020; Yip et al., 2019). We do not think that our dataset is currently suitable for estimating density using distance sampling for several reasons. First, we only have data from a single location. Second, Phaner pallescens moves very fast through the forest, contradicting the distance sampling assumption of detecting the object at its initial location. Third, a bout can contain vocalizations of more than one individual, which needs to be accounted for in future models.
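How such multipliers enter the calculation can be illustrated with a minimal sketch of a generic cue-counting estimator for a single recorder: the number of detected cues, corrected for false positives, is divided by the product of sampled area, detection probability, recording time, and cue rate. The function name, parameter names, and input values are our own and hypothetical; the notation may differ from the formula given in the Methods, and a group-size multiplier is omitted for brevity.

```python
import math


def cue_count_density(n_cues, radius_m, hours, cue_rate_per_hour,
                      detection_prob=1.0, false_positive_rate=0.0):
    """Animal density (individuals/km2) from cue counts at one point.

    Generic form: n * (1 - c) / (pi * w^2 * P * T * r), where c is the
    false-positive proportion, w the truncation radius, P the detection
    probability within w, T the recording time, and r the cue rate.
    """
    if min(radius_m, hours, cue_rate_per_hour) <= 0:
        raise ValueError("radius, hours, and cue rate must be positive")
    if not 0.0 < detection_prob <= 1.0:
        raise ValueError("detection_prob must be in (0, 1]")
    if not 0.0 <= false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be in [0, 1)")
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    true_cues = n_cues * (1.0 - false_positive_rate)
    return true_cues / (area_km2 * detection_prob * hours * cue_rate_per_hour)


# Hypothetical inputs for illustration only (not data from this study).
baseline = cue_count_density(n_cues=117, radius_m=150, hours=10,
                             cue_rate_per_hour=3.9)
adjusted = cue_count_density(n_cues=117, radius_m=150, hours=10,
                             cue_rate_per_hour=3.9,
                             detection_prob=0.8, false_positive_rate=0.1)
print(f"baseline: {baseline:.1f}, adjusted: {adjusted:.1f} individuals/km2")
```

With detection probability set to one and no false positives, the function reduces to the simple point-sampling case discussed above; lowering the detection probability raises the estimate, whereas a nonzero false-positive rate lowers it.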

Future of Passive Acoustic Monitoring for Lemurs and Other Primates

PAM has high potential to improve the future conservation of lemur populations, because species of all genera show regular vocal behavior during their activity period. Indri, Varecia, Phaner, and Lepilemur produce regular territorial defense, intergroup, or pair spacing calls with the onset of their activity period (Petter and Charles-Dominique, 1979; Fichtel and Hilgartner, 2012; Geissmann and Mutschler, 2006; Seiler et al., 2015). Mirza in the Sambirano region and in Beanka vocalize more frequently at the beginning of the night than Mirza in Kirindy (Markolf personal observation), and female Mirza zaza produce loud mating advertisement calls during estrus (Hending et al., 2020b; Seiler et al., 2019). Eulemur reacts and responds to playbacks of alarm calls more often close to sunset, which suggests that this call (croak) serves as a group advertisement or territorial defense call (Markolf et al., 2013). Eulemurs emit several other distinct calls during foraging and traveling (Pflüger and Fichtel, 2012). Besides the “Tchi-fak” call that gives them their name, sifakas (genus Propithecus) also produce several types of calls during group encounters, for group coordination, and in response to terrestrial or aerial predators (Fichtel, 2020). Avahi occidentalis emits at least three distinct vocalizations for spacing and group coordination (Ramanankirahina et al., 2016). A preliminary vocal repertoire for Prolemur simus suggests that vocalizations also play an important role in bamboo lemurs (Bergey and Patel, 2008). Almost nothing is known about the use and function of vocalizations in Cheirogaleus, but ultrasonic vocalizations were recorded between mother and infants (Cherry et al., 1987). Microcebus also produces ultrasonic vocalizations and has recently been the subject of a study using automated acoustic recording equipment in the Sambirano region (Hending et al., 2017). Finally, Daubentonia madagascariensis produces several types of vocalizations (Stanger and Macedonia, 1994), including a long-distance call (Ramsier and Dominy, 2014). Acoustic divergence between primate species often correlates with genetic distance (Rakotonirina et al., 2016; Meyer et al., 2012; Thinh et al., 2011). Moreover, sympatric species are likely to be acoustically more different, which is the case for sympatric species of Eulemur (Rakotonirina et al., 2016; Markolf et al., 2013). When closely related taxa are sympatric, PAM can only be effective if they show species-specific call structures, which allow technical discrimination of vocalizations among species. Species-specific vocalizations are known in many non-human primates (Mitani and Stuht, 1998), including several lemur genera, such as Eulemur (Gamba et al., 2012; Markolf et al., 2013), Propithecus (Fichtel, 2014), and Microcebus (Braune et al., 2008; Hending et al., 2017). Thus, our knowledge of acoustic communication in lemurs suggests that PAM could help to monitor lemur populations.

Recommendations for Future PAM Studies

Automated acoustic recording is likely to be more cost-effective for long-term monitoring, where applicable, and once appropriate protocols are established. PAM needs fewer human resources and can be deployed for extended periods (Darras et al., 2018a; Vu and Tran, 2019). The method also facilitates regular data collection in areas difficult to access or inaccessible during certain times. For example, many National Parks in Madagascar are difficult to access at night to monitor nocturnal lemurs or due to seasonal flooding of roads. Furthermore, PAM reduces interobserver problems caused by different researchers collecting the data (Darras et al., 2018a). For the future, regular environmental recordings will represent bioacoustic time capsules, potentially providing ecological data of past ecosystems (Sugai and Llusia, 2019) and primate populations. However, analyzing PAM data is still more time-consuming than traditional surveys, and further research is needed to improve automatic detection methods (Gibb et al., 2018; Sugai et al., 2020). Numerous studies have been published, and many groups are working to develop more efficient and widely available algorithms (Clink and Klinck, 2019; Corrada Bravo et al., 2017; Katz et al., 2016).

Today, 33% of lemurs are considered critically endangered by the IUCN Red List (IUCN Red List, 2020). Developing new cost-effective methods for monitoring populations is essential, and lemur conservation could benefit greatly from the development and use of PAM. We recommend several important steps necessary for improving PAM and adopting the methods into lemur and other primate conservation programs. First, we need more detailed species-specific studies on vocal behavior and sound transmission to obtain information on the influence of seasonal and habitat-related differences on the variability of call rates and sound attenuation. For example, preliminary data indicate that vocal activity in Phaner pallescens could be influenced by the phases of the moon (Bader, 2019, unpublished BSc thesis). Acoustic activity also might vary between different seasons (e.g., mating vs. non-mating season). Such influences could result in enormous inaccuracies in abundance and occupancy estimates if not considered. Estimates of species-specific rates of vocalizations could benefit from collars with microphones, which have been successfully applied in several taxa (Buil et al., 2019; Couchoux et al., 2015; Yan et al., 2019).

Second, for genetic studies and many other fields of biology, it is obligatory to publish data associated with publications in public databases. The same should be standard for acoustic data, as acoustic reference data will be critical to train efficient algorithms for fast analyses of the data. In most acoustic studies on primates, however, the information is limited to a printed spectrogram of example calls, limiting future research in primate acoustic communication. Several audio databases exist but are mostly focused on other taxa and groups (e.g., Macaulay Library, Xeno-canto). Research on primate acoustic communication would benefit greatly from mandatory sharing of audio files in a public database. Some platforms also include functionalities to facilitate PAM, such as Arbimon (https://arbimon.rfcx.org), ecoSound-web (https://ecosound-web.uni-goettingen.de/ecosound_web/), or Xeno-canto (https://www.xeno-canto.org). A reference call database for primates for use in PAM studies has been initiated on the platform ecoSound-web (formerly called BioSounds) (Darras et al., 2020). The platform also allows shared annotation of soundscape recordings via a browser window, facilitating species identification and collaboration among countries and research groups. Finally, it will be key for the use of PAM in primate conservation to develop standardized protocols and, more importantly, to provide teaching and training to local students, scientists, and conservation staff from primate range countries.

Conclusions

PAM will improve the estimation of abundance and occupancy of a wide range of primates in the future. We hope that our work contributes to and facilitates the development of PAM for the conservation of lemurs and other primates.