Accommodating false positives within acoustic spatial capture-recapture, with variable source levels, noisy bearings and an inhomogeneous spatial density

Passive acoustic monitoring is a promising method for surveying wildlife populations that are easier to detect acoustically than visually. When animal vocalisations can be uniquely identified on an array of sensors, the potential exists to estimate population density through acoustic spatial capture-recapture (ASCR). However, sound classification is imperfect, and in some situations a high proportion of sounds detected on just a single sensor ('singletons') are not from the target species. We present a case study of bowhead whale calls (Baleana mysticetus) collected in the Beaufort Sea in 2010 containing such false positives. We propose a novel extension of ASCR that is robust to false positives by truncating singletons and conditioning on calls being detected by at least two sensors. We allow for individual-level detection heterogeneity through modelling a variable sound source level, model inhomogeneous call spatial density, and include bearings with varying measurement error. We show via simulation that the method produces near-unbiased estimates when correctly specified. Ignoring source level variation resulted in a strong negative bias, while ignoring inhomogeneous density resulted in severe positive bias. The case study analysis indicated a band of higher call density approximately 30km from shore; 59.8% of singletons were estimated to have been false positives.


Introduction
In recent decades, passive acoustic monitoring (PAM) has increasingly been used to study both terrestrial (Sugai et al. 2019) and marine animals, particularly cetaceans (Zimmer 2011).Compared with more traditional visual survey methods, acoustic monitoring works day and night, is robust to variation in environmental conditions such as weather, and in some habitats has the potential to detect animals at greater distances, hence increasing the area surveyed (Marques et al. 2011).It has enabled studies of rare and elusive species such as the vaquita (Thomas et al. 2017) and several species of beaked whale (Hildebrand et al. 2015;Yack et al. 2013) which, despite being visually cryptic, produce frequent sounds that can be detected by PAM systems.
One important application of PAM is to estimate population density or abundance (Marques et al. 2013).In the situation where multiple acoustic sensors are deployed simultaneously with a spatial separation such that some vocalisations can be detected on multiple sensors then an appropriate statistical framework for estimating density is spatial capture-recapture (SCR; also known as spatially explicit capture-recapture) (e.g., Borchers et al. 2015;Stevenson et al. 2015).SCR is an extension of long-established capture-recapture (otherwise known as mark-recapture or mark-resight) methods where data on the detection ('capture') of individually-identified animals is supplemented by data on the spatial location of the survey effort and the detections (Borchers and Efford 2008;Royle et al. 2009;Kidney et al. 2016).Recording not just whether but also where each animal was detected increases the accuracy of abundance and density estimates and potentially allows estimation of a spatially inhomogeneous animal density surface.
Acoustic spatial capture-recapture (ASCR) is a special case of SCR where the detections are of individual animal vocalisations or 'cues'.Standard SCR relies on animals moving between detection locations and hence typically requires multiple capture occasions, while in ASCR the sound travels almost instantaneously from its source and hence can be detected on multiple sensors in a single occasion.Estimation is of cue spatial density; to convert to animal density an estimate of average cue production rate is required (Marques et al. 2013;Stevenson et al. 2021).As well as the location of detection, additional information is often available about the location of the vocalisation, e.g., the bearing, received sound level or time of arrival on multiple sensors.Borchers et al. (2015) showed that using this additional information further improved estimation accuracy.
PAM data can also present challenges that require accounting for to avoid bias in ASCR analysis.First, sound classification is imperfect, leading to false positive detections of sounds not originating from the target species.Second, vocalisation source level can vary considerably, causing heterogeneity in detectability.Third, there can be considerable measurement error in the additional information, particularly the bearings.Forth (in common with other SCR studies), spatial density of source locations can vary substantially.
The methods presented here are motivated by a case study that demonstrates all four of the above issues: estimation of call density of bowhead whales (Baleana mysticetus) migrating through the Beaufort Sea.Multiple arrays of acoustic sensors were deployed in the Beaufort Sea during the migration season and recorded millions of bowhead whale calls.Automated detection and classification methods were therefore used to process the data, yielding call detections, received sound levels and bearings, and linking calls across detectors.However, a high proportion of the detections made on only one sensor ('singletons') were thought to be false positives.Naively including these singletons in the analysis would lead to a positive bias in the estimated abundance of unknown but likely substantial magnitude.To avoid this, we excluded all singletons from the analysis and conditioned the ASCR likelihood to only include calls detected on multiple sensors ('multiply-detected calls').Truncating the data in this manner, rather than attempting to model the proportion of false positives, is a good strategy when data are plentiful (see Discussion).Conn et al. (2011) proposed a similar procedure in a mark-resight study as a way to differentiate between resident and transient bottlenose dolphin populations, assuming that transients were never detected more than once.To our knowledge, this is the first time this approach has been used in SCR.
Our case study features some additional complications that may commonly appear in real-world data but are not typically all dealt with.Call source level was thought to vary substantially and so we include source level as a random effect in our model.Exploratory analysis showed that, while most estimated bearings were accurate, some appeared to be very inaccurate.We therefore developed a two-part discrete mixture model for bearing error, extending the bearing error methods of Borchers et al. (2015).Finally, we allowed for an inhomogeneous density model to accommodate the spatial preference of the migrating whales (and thus their calls).
In Section 2 we describe the case study in more detail.We then present the extended ASCR model in Section 3. In Section 4 we evaluate the model performance via simulation, and show how ignoring some of the real-world issues can result in substantial bias.Section 5 gives results from a proof-of-concept application of the model to a single day of data.Finally, in Section 6, we discuss results, limitations and alternatives.

Case Study
Every year from August to October, the Bering-Chukchi-Beaufort (BCB) population of bowhead whales (Baleana mysticetus) migrates westwards through the Beaufort Sea to their wintering areas in the Bering Sea (Blackwell et al. 2007).They travel mainly over the continental shelf in waters less than 25 meters deep, approximately 10-75 km offshore (Greene et al. 2004).During this migration, bowhead whales are known to produce a wide variety of calls (Ljungblad et al. 1982).The purpose of these calls remains largely unknown, although they may be used for long-range communication (Thode et al. 2020) or to navigate through ice (George et al. 1989).
Between 2007 and 2014, up to 35 Directional Autonomous Seafloor Acoustic Recorders (DASARs; Greene et al. 2004) were deployed at several sites off the north coast of Alaska to monitor the calling behaviours of the migrating whales during seismic surveys.An automated detection and classification procedure was developed to handle the more than one million detected calls over the period 2007-2014 (Thode et al. 2012(Thode et al. , 2020)).This procedure could identify discrete sounds as bowhead whale calls, and subsequently link them with detections from other DASARs within the array if they were the same call (Thode et al. 2012).
Even though it was not the original purpose of the monitoring, the detection and linking of calls created detection histories for every detected call, making these data suitable for ASCR.ASCR theory assumes capture histories without errors, i.e., calls can be missed but not incorrectly positively identified or matched.Several data cleaning procedures were required to meet this assumption as far as possible.Sometimes, calls would be wrongly identified as other discrete sounds.Distant airgun signals were occasionally  570) that were estimated to be valid using the methods developed in this paper.misidentified as bowhead whale calls; bearded seals (Erignathus barbatus) and walruses (Odobensu rosmarus divergens) could appear similar as well, but these were generally rare and much quieter (Thode et al. 2012).Moreover, if bowhead whale calls overlapped in time, they could be incorrectly matched as being the same call.Cheoo (2019) showed that the number of detection histories that involved just one sensor ('singletons') in the automated data was not in line with expectations based on sound propagation theory (see Figure 1).It was therefore hypothesised that these singletons contain a high-degree of incorrectly classified calls ('false positives'), mostly consisting of random fluctuations in the noise field or reverberations of whale calls, which were unlikely to be associated among multiple detectors.The solution we present here is to exclude singleton detection histories all together and modify the ASCR likelihood to be conditional on the capture histories involving at least two sensors.To ensure that the multiply-detected calls would contain no false positives, we cleaned the remaining data using a procedure described in Appendix A in Supplementary Materials.
In this study, we focus on the data from one specific day, 31 August 2010, from site 5, the most eastward site, as this site was one of the more intensely studied subsets of the data.Moreover, calls accumulated somewhat evenly across the day, resulting in a low expected rate of overlapping calls.For every call the following data were recorded: a detection history as well as bearings, received sound levels, and noise levels for every involved DASAR.The source level and origin of the call are unobserved, and hence treated as latent variables.Throughout this manuscript, all sound measurements are denoted on the decibel scale (RMS; dB re 1 µPa).More details on the background, availability, and pre-processing of the data are presented in Appendix A in Supplementary Materials.

Model
In this section, we first introduce the full likelihood.Following that, we derive each element of the likelihood individually.Consider an acoustic survey of an array of K sensors operating within a survey region A ⊂ R 2 over a period of time T .A total of n unique animal calls are detected by at least two sensors.For any multiply-detected call i ∈ 1, ..., n, let ω ij be 1 if the call was detected at sensor j ∈ 1, ..., K, and 0 otherwise.We define the matrix containing all detection histories as Ω = (ω 1 , ..., ω n ) , where the detection history for call i is denoted ω i = (ω i1 , ..., ω iK ) and denotes the transpose.For every call i that was detected at sensor j, we also observe the bearing y ij , measured in degrees clockwise relative to true north, and the received sound level r ij .These data are contained in Y = (y 1 , ..., y n ) and R = (r 1 , ..., r n ) , respectively.Latent variables are the spatial origins of calls, denoted by X = (x 1 , ..., x n ) where x i is a location in the Cartesian plane, and the source levels, denoted by s = (s 1 , ..., s n ) .The support for s is denoted S. Throughout this manuscript, we do not distinguish explicitly in notation between random variables and specific observations/realisations -this should be clear from the context.
The parameter vectors used in the model are (following previous literature as closely as possible): φ for parameters associated with the spatial density of emitted calls, ν for those associated with the source levels of calls, η for those associated with source sound propagation and received levels, θ for those associated with detectability of calls, and γ for those associated with measurement error of the bearings.For notational convenience we define the joint parameter vector ξ = (φ, ν, η, θ, γ).A list of model assumptions is presented in Section 3.10.

Likelihood Specification
The likelihood is formed from the joint distribution of all observed and latent variables introduced above.We denote this likelihood L f (ξ) where the subscript f stands for 'full' to distinguish it from the conditional likelihood L c (ξ) which is conditional on the observed number of detections.Denoting both probability mass and density functions as f (.), we define the full likelihood as (1) As we do not observe the call locations and source levels, we marginalise over the unobserved X and s to obtain The double integral in Equation ( 2) is of dimension 3n, making this likelihood intractable.We follow Stevenson et al. (2015) in assuming calls to be independent of each other in all respects, allowing us to specify Equation (2) as the product of n 3-dimensional integrals: where we assume independence between y i and r i given x i , and ω * i := j∈1:k w ij denotes the total number of sensors involved in the detection of call i.The separation of the joint distribution inside the integrals in (3) follows from repeatedly applying Bayes' formula.Note that conditioning the joint distribution on n in (2) is equivalent to conditioning every marginal distribution on involving at least two sensors in (3); for the second and third element this conditioning is implicit in conditioning on detection history ω i .If we were to assume a constant spatial density of calls, it would be sufficient to simply maximise L c , as the parameter estimates from the conditional MLE are identical to those obtained by maximising the full likelihood -a Horvitz Thompson-like estimator could then be used to derive the mean density (Borchers and Efford 2008).In our case study, however, the spatial density of calls is known to be inhomogeneous, so the full likelihood is required.
In the following sections, we specify in further detail f (n; φ, θ, ν) and the five components in Equation (3).For readability we will hereforth omit the indexing of parameters in the probability functions.

Detection Probability, p(x i , s i )
A fundamental part of ASCR is the concept of a detection probability, which is the probability that an emitted call is detected by a sensor -this is the way ASCR accommodates for missed calls, i.e., false negatives.The function that relates this probability to covariates is called the detection function, denoted p(.).For ASCR it was proposed to be most appropriate to use a detection function based on the received (sound) level, also known as signal strength, which is primarily a function of source level and range (Efford et al. 2009;Stevenson et al. 2015).We propose a detection function for sensor j where g 0 ∈ (0, 1) denotes the detection probability when the true received level of a call surpasses threshold t r , and zero probability else, as follows where Φ denotes the standard normal cumulative density function (cdf) and σ r denotes the measurement and propagation error on received level (see Section 3.4).The threshold t r should be set by the analyst, and in this study is roughly equal to the maximum level of ocean background noise.As there is error on the received levels, calls with an expected received level close to t r can have a detection probability between 0 and g 0 .

Detected Calls, f (n)
To construct a probability function for the number of detected calls, we start with the distribution of emitted calls, which are assumed to occur independently of one another in space and time.Thus, let the number of emitted calls at point x in period T be a realisation of a spatially-inhomogeneous Poisson point process with intensity D(x).As we only observe multiply-detected calls, we take the product of this intensity and the probability of multiply detection, denoted by where ω * := j∈1:k ω j .Note that Equation ( 5) is the part of the method that deviates from conventional ASCR and allows us to exclude all singletons.We rewrite Equation ( 5) and marginalise over source level to get the filtered Poisson point process with spatially varying rate parameter S D(x)p.(x,s)f (s)ds.Lastly, we marginalise over x to get the distribution of the total number of multiply-detected calls over time T : The expected total number of emitted calls is then derived by integrating the density over the study area, such that E[N ] = A D(x)dx.
3.4 Received Level, f (r i |ω i , x i , s i ; η) Sound waves propagating through water lose strength through various mechanisms (Jensen et al. 2011), a process known as 'transmission loss'.We approximate this process by allowing a single parameter β r to determine the acoustic transmission loss, such that where d j (x i ) returns the distance from sensor j to location x i .Here, β r = 20 would reflect purely spherical spreading loss while β r = 10 would reflect cylindrical spreading loss; in reality the dominant process will be range-and depth-dependent, with other factors also playing a role (Jensen et al. 2011).To capture potential error in the propagation model and the measurement of received level, we follow Stevenson et al. (2015) and assume Gaussian error on the received levels, giving where N (µ, σ 2 ) denotes a normal distribution with mean µ and variance σ 2 .Unlike Stevenson et al. (2015), we do not assume all calls above threshold t r to be detected with certainty.Instead, we allow for a single detection probability for calls with received levels above the threshold, since the signal processing used in our case study meant that other factors also determined detectability (see Section 3.2).As r ij is only recorded if sensor j detected the call, we condition the indexing on the j th DASAR detecting the call.Assuming independence between the sensors, the third component of Equation where φ denotes the standard normal probability density function (pdf).This is in effect a normal distribution truncated at t r .
3.5 Bearings, f (y i |ω i , x i ; γ) DASARs were designed to record the direction to discrete sounds; these recorded bearings contain measurement errors.Following Stevenson et al. (2015) and Borchers et al. (2015), we capture this error by assuming a Von Mises distribution with concentration parameter κ on the bearings.Analogous to the received levels, we only record bearings at DASARs that detected a call.Assuming that the sensors are independent of each other, the second component of Equation ( 3) would therefore be with I 0 denoting the modified Bessel function of degree 0. Exploratory research found that, while most bearings appeared relatively accurate, a small proportion seemed highly inaccurate, and we hence apply a 2-part discrete mixture model on the bearings.In effect, we fit a Von Mises distribution with a lower accuracy (dispersion is κ) for some proportion of the bearings, ψ κ , and a Von Mises distribution with higher accuracy (dispersion is κ+δ κ , where increment δ κ is non-negative) to the remaining proportion of the bearings, 1 − ψ κ .
The second component of Equation ( 3) thus becomes 3.6 Call location given s and at least two detections, f (x i |s i , ω * i ≥ 2; φ, θ) To evaluate the pdf of call locations, we assume independence between call locations and source levels, and use Bayes' formula to obtain Although f (x i ) is unknown, we do know that it is proportional to the emitted call density, such that f Thus we can simplify Equation (12) as 3.7 Source level given at least two detections, f (s i |ω * i ≥ 2; φ, θ, ν) Analogous to above, we assume independence between and among call locations and source levels, to derive Note that a part of the numerator in Equation ( 14) cancels out against the denominator in Equation ( 13), and that the denominator of Equation ( 14) denotes the effective sampled area.Thode et al. (2020) estimated source levels using the estimated origin of the call, the received level on the sensor nearest the origin, and a transmission loss parameter β r of 15.Based on the observed distribution of these estimated source levels, we assume a normal distribution on s i truncated at 0, such that 3.8 Detection History, f If we assume sensors to be independent, we can view the detection history of a call as a realisation of a binomial process with size K and non-constant probability p j with j = 1, .., K, where the order is relevant and hence the binomial coefficient is absent.This gives where p.(x i , s i ) appears in the denominator to account for the conditioning on at least two sensors in every call detection history (see Equation ( 5)).

Variance Estimation
We do not use the Hessian matrix to extract standard errors, as these are only asymptotically normal.Instead, we estimate uncertainty using a non-parametric bootstrap, where we re-sample the calls with replacement (Borchers et al. 2002) and fit the model every time.Following that, we estimate the standard error by taking the standard deviation of all bootstrapped parameter estimates, and we take the 2.5% and 97.5% percentiles to estimate their 95% confidence intervals.

Assumptions
The method presented in this manuscript relies on several assumptions, as follows.1) Call origins are a realisation of a Poisson point process, thus calls are spatially and temporally independent given this process.2) Calls are omnidirectional and equally detectable given only the received level (i.e., no unmodelled heterogeneity).3) Sensors are identical in performance and operate independently; 4) Calls are matched without error and identified correctly, but can be missed (i.e., no false positive, but false negatives are allowed).5) The transmission loss model is correctly specified.6) Uncertainty on bearings and received levels are independent.7) Source level of a call is independent of space and time.These assumptions are discussed in more detail in Appendix B in Supplementary Materials.

Practical Implementation
We fitted the model using maximum likelihood estimation (MLE) in R 4.1.0(R Core Team 2021) with some components written in C++ and linked to R through the Rcpp library (Eddelbuettel 2013).We standardised the covariates in the density model to improve the convergence.The estimates were found through optimisation with the function nlminb() (R Core Team 2021).We used a spatial mesh with non-uniform grid spacing to reduce run-time, with increased grid spacing farther from the sensor array (see Appendix C in Supplementary Materials for details).

Simulation Study
We used simulations to evaluate the performance of the model under variable source levels (scenario 1) and fixed source levels (scenario 2).Both scenarios featured measurement error on bearings simulated from a two-part mixture model, and inhomogeneous call density specified as follows: where d denotes the (scaled) shortest distance to the coast.The parameter values used were chosen to match those from the case study data analysis and are given in Table 1.
For each scenario, we simulated 100 data sets and analysed each data set with 5 models: (a) the true model, i.e., that corresponding to the simulated scenario; (b) a model with incorrect assumption about source level, i.e., for scenario 1 the model assumed fixed source level and for scenario 2 the model assumed variable source level; (c) a simpler bearing model assuming a von Mises distribution on bearing error but no two-part mixture; (d) a model that omitted the bearing information altogether; and (e) a model assuming a -53.0 -68.5 Table 1: The parameter values used for the simulation study.These were based on initial fits to the real data, in order to keep the simulations as realistic as possible.
homogeneous spatial density.We did not include a simulation scenario where we naively fit model to false positives, as the effect on the estimates is already known: it will induce a positive bias that will increase with increasing false positive rate.
For each scenario and model combination (1a-e and 2a-e) we evaluated performance by calculating the coefficient of variation (CV), relative error and relative bias in estimated total abundance.Further details are given in Appendix D in Supplementary Materials.

Results
Both correctly specified models (1a and 2a) gave near-unbiased results (Figure 2).Fitting a single source level model in the variable source level scenario (1b) introduced a strong negative bias with low variance; fitting a variable source level in the fixed source level scenario (2b) introduced a strong negative bias, although this bias could be explained by a flat likelihood surface and resulting sensitivity to starting values.Using an incorrect, simple bearing model (1c and 2c) did not induce bias but did increase variance.Ignoring the bearing information induced a small positive bias and greater variance in the variable source level scenario (1d) but had little effect on the fixed source level scenario (2d).Lastly, using an incorrectly specified constant spatial density model (1e and 2e) resulted in severe overestimation of abundance (> 300% and > 200% bias for variable and fixed source level scenarios, respectively).

Bowhead Whale Analysis
We used the proposed method to estimate bowhead whale call density and abundance at site 5 on 31 August 2010.We used data from a single day, and as we present our estimate as call density per day, we did not need to adjust for our study time (T = 1).We set t r at 96 dB, as the observed background noise never surpassed this level.The true call density surface is unknown and therefore we fitted several candidate models, which we compared using Akaike's information criterion (AIC; Akaike (1998)).As the bowhead whales are thought to exhibit a spatial preference based on their distance from the coast and ocean depth during fall migration (Greene et al. 2004), we fitted a canditate set of 35 models containing combinations of distance and depth as linear, quadratic or smooth covariates -see Appendix E in Supplementary Materials for full details.Measures of uncertainty in abundance (standard error, CV, and quantile coefficient of dispersion (QCD)) were derived by bootstrapping the calls 999 times, refitting the AIC-best model each time.The QCD is a relative measure and insensitive to outliers, and is defined as , where Q 1 and Q 3 are the first and third quartile, respectively.To further illustrate the potential benefits of modelling source level heterogeneity, we included point estimates of the best model equivalent with a fixed source level.

Results
The lowest AIC model included density modelled as a smooth function distance to coast and a quadratic relation with depth.We observe an increase in density, followed by a decrease, as we increase the distance from the coast (Figure 3, left panel).Moreover, density is higher just east of the sensor array, potentially due to some effect of ocean depth.Even though we observe increases in uncertainty in the southern regions and slightly in the north, overall QCD estimates are low (Figure 3, right panel).Total abundance of bowhead whale calls was estimated at 5741 in the study area, and the CVs of the parameter estimates varied considerably, ranging from 0.95% for μs to 58.22% for β s(depth).3(Table 2).

Discussion
We have developed and tested a novel extension of acoustic spatial capture-recapture by conditioning on a minimum of two detections per individual call.Removing single detections may be necessary when the validity of these calls is challenged, and our simulation study shows that the method gives unbiased results in both variable and single source level scenarios.Even though this model applies to (marine) acoustic data, the extension proposed can be applied to all forms of SCR.Fitting a single source level model to variable source level data, thus ignoring heterogeneity in the detectability of the calls, resulted in severe underestimation of abundance.This represents a cautionary tale for other applications of SCR, both terrestrial and aquatic, when it is hypothesised that there is some random variable affecting individual detectability -it does not have to be something as tangible as a source level.Moreover, we confirmed results from Stevenson et al. (2015) on adding bearing information to a SCR model and found a further increase in precision when allowing for a mixture of 'good' and 'bad' bearings.Finally, we also confirm that incorrectly assuming a constant call density surface can lead to severe overestimates of the abundance.
The notion that density of bowhead whale(s) (calls) is highest 10-75 km from the coast (Greene et al. 2004) seems to be confirmed by the best model, which finds a density that is highest a bit away from the coast, but not too far.The high call spatial density in the northeastern part of the study area (Figure 3) was unexpected, but could have been a consequence of using only data from one day.This likely strongly violated the Poisson assumption about call spatial location, since whale location is spatially autocorrelated, especially on small time scales (Wursig et al. 1985).Alternatively, it has been hypothesised that the higher estimated density of calls in the east could be due to the directionality of the calls or the effect of water depth on sound propagation (Blackwell   et al. 2012).In this study, the authors observed 2.1 times as many calls east of the array as west.Future research could explore whether a more gradual increasing/decreasing density slope is found if more days were included in the analysis, resulting in reduced autocorrelation among the call origins.
The model-based expected number of singletons was 570 whereas there were 1417 observed singletons, which is 149% more than expected and confirmed our suspicions regarding false positives (see Figure 1).
An alternative to truncating singletons would be to retain all data and try to model the occurrence of false positive detections.For example, one could assume that an unknown proportion of detections are false positives and include a parameter for this proportion in an analogous way to the M t,α model presented by Yoshizaki (2007).Detection at any detector j could have three outcomes: no detection with probability 1 − p j , false detection with probability αp j , and correct detection with probability p j .Moreover, we would have to implement some modification of the M b model (Otis et al. 2019) to account for a change in α for the multiply-detected calls, which would likely involve an additional parameter.A benefit of this M b,α model would be that it could allow for false positives in multiply-detected calls, however, it increases complexity and introduces hardto-test assumptions on the false positive rate.Given that a large amount of acoustic data are potentially available (we used data from just one day), including poor-quality data seemed undesirable given the additional complexity and run time.Hence we consider the truncation approach developed here to be best for our application of ASCR.
The main model in our study conditions on at least two positive detections for every detection history, but can readily be extended to condition on a minimum of three or more involved sensors.This requires generalising Equation 5 where ω * min denotes the minimum number of sensors involved.Preliminary simulations showed no apparent bias, although depending on the situation, conditioning this way can rapidly reduce the amount of data available and hence decrease estimator precision.This is illustrated by the relative frequencies in Figure 1.
We estimated variance empirically by bootstrapping the calls.This assumes independence between the calls, which we know is violated as the calls are produced by whales and whales are spatially autocorrelated.Moreover, calls can potentially trigger responses from nearby whales, leading to temporal correlation between them (Thode et al. 2020).Some of the spatial and temporal dependence was eliminated by removing all detections with a received level below 96 dB, as this thinned the data, assuming independence between call characteristics and ocean noise.To create more accurate variance estimates, one could stratify the data by time and bootstrap these time chunks, thereby removing some of the autocorrelation.Even better, one could sample short time frames from several days and combine those to obtain the desired amount of data to analyse, in which case the analysed observations themselves could be assumed independent and likelihoodbased variance estimates are acceptable (given a large enough data set).The length and frequency of these time chunks will be case specific, as this will depend on the calling behaviour of the studied species.
We estimated model parameters using MLE as opposed to in a Bayesian inferential framework, with MLE having coding simplicity and reduced run time as the main bene-fits.We did find that convergence was sometimes slowed due to flat likelihood surfaces as a consequence of the many sources of variation in our model.A benefit of using a Bayesian framework would be the possibility to include informative prior distributions based on previous research, especially on the latent variable source level, which would likely improve convergence and may improve precision of density and abundance estimates.
Our method derives a call abundance, which can be converted into an animal abundance by correcting this estimate for the call rate, similarly to Marques et al. (2013).Ideally, this call rate is estimated for the same population and alongside the collection of the data, but this is not always possible.Blackwell et al. (2021) estimated a call rate for the BCB population over a longer period of time.Naively dividing our estimate or total calls by their median call rate estimate of 31.2 calls/whale/day (interquartile range of 12-129.6 calls/whale/day) gives us N whales = 5741.39/31.2≈ 184 individuals.We present this number for illustration alone; for correct inference, one should properly account for the variance around call rate and call abundance estimates.Alternatively, there are methods that directly estimate animal density based on cues (for more information, see e.g., Fewster et al. (2016) and Stevenson et al. (2019)).
When background noise is highly variable, selecting a truncation level that ensures that noise (almost) never surpasses the received level might result in most data being discarded.When data are scarce, it may then be beneficial to use a detection function based on the signal-to-noise ratio (SNR).Here, detection probability is assumed to depend on the ratio of the received level to the background noise, and could increase either stepwise or gradually as a function of SNR.This way, no data will be lost due to truncation.However, using an SNR detection function requires a random sample of accurate noise measurements for an additional Monte Carlo integration in the likelihood, and noise measurements at all sensors for every cue with a detection history to be able to estimate the detection probabilities.A detailed description of the SNR likelihood is presented in Appendix F in Supplementary Materials.
A potential weakness of the model is the fact that we assume the same propagation loss model for the entire study area.For most of the study site this assumption is reasonable, as it concerns a shallow plateau with little variation in ocean depth.However, the ocean floor drops in the northern part of the study site, resulting in altered propagation conditions.It is assumed that the bowhead whales migrate mainly over the shallow plateau and hence depth-induced inhomogeneous propagation is not a practical issue in our case study.In general, however, it is something that should be considered when modelling sound propagation in variable (ocean) landscapes.Phillips (2016) and Royle (2018) presented ways in which it is possible explicitly to model variable sound attenuation.Such a model requires accurate and sometimes high resolution information on environmental variables that affect sound attenuation, such as ocean depth.
The model presented here extends the scope of SCR to provide reliable inference on spatial density and abundance from passive acoustic data.Removing single detections will also be of use in other SCR applications where single detections are unreliable -for example when association between detections is not perfect such that repeat detections of the same individual are sometimes incorrectly classified as single detections of a new individual.This can happen in situations where natural animal characteristics are used for identification, such as photo-and genetic-ID.

Appendix A: Data Background, Availability & Cleaning Data availability
To handle the vast number of calls detected by the DASARs, Thode et al. (2012) developed a method to detect and match calls using a neural network trained using manually processed data from 2008 and 2009.From these data a collection of detection histories could be derived, where for every call and every DASAR it contains a 1 if detected, and a 0 otherwise.Sometimes, calls would be wrongly identified and/or matched, as other discrete sounds produced by, among others, airguns, bearded seals (Erignathus barbatus) and walruses (Odobensu rosmarus divergens) can appear similar to bowhead whale calls (Thode et al. 2012).It was thought that single detection calls ('singletons') contain a high proportion of falsely identified calls (false positives), thus we removed these detections altogether.
In this study, we focus on the data from one specific day, 31 August 2010, from site 5, the most eastwards site, as this was of one the more intensely studied subsets of the data.Moreover, calls accumulated somewhat evenly across the day, resulting in a low expected rate of overlapping calls.For this day, the following observations were extracted: • A matrix containing detection histories for all calls and all sensors, with 1 indicating detections and 0 indicating no detection.
• A matrix containing received levels for the positive detections, and NA values otherwise.
• A matrix containing noise measurements for all sensors and all calls.Noise at DASARs that detected a call was derived using a 6 second window, consisting of 3 seconds before and 3 seconds after the call, using the same frequency band as the call.Noise recordings for DASARs that did not detect a call were derived using the widest observed frequency window for the DASARs that were involved in the detection of the call, using a similar time window.
• A matrix containing noise measurements derived over 6-second windows, taken at a random time points and frequency bands within the study period.These were subsequently checked to not contain any calls, as that would not qualify as 'noise' -other discrete sounds, such as walrus calls or airguns, do qualify as noise, as they potentially interfere with the identification of bowhead whale calls.
The last two matrices containing noise information are only used for the detection function based on SNR (see Appendix F).
Two variables in our likelihood are latent: the origin of the call and the source level of the call.To limit the run time, we only integrate over the smallest subset of the domain that includes at least 99.9% of the associated probability density/surface, at the lowest resolution that ensures good estimates.Thus, the remainder of our data includes: • A matrix of at least 300 evenly spaced spatial points covering an area such that calls that originated at the bounds have a probability of being detected on at least two sensors of < 0.1.For every grid point, the ocean depth and distance to coast was also known; these were used as potential covariates in the density model.
• A vector of source levels, ranging from 100 to 220 dB, with increments of 3 dB, which were the largest increments that still resulted in good parameter estimates.

Data cleaning
ASCR theory requires perfect identification, and to meet this assumption as closely as possible, several cleaning procedures were required.The false positives among calls that involved two or more detections did not pose a problem for the original studies (Thode et al. 2012).Here, the main goal was to triangulate the bearings of detected calls to estimate the spatial origin of the call.Every involved DASAR would record a bearing, and a localisation procedure would add varying weights to these bearings until it resulted in a successful localisation, or not (Thode et al. 2012).For example, if the bearings of a call with two detections would never meet, e.g., because one of them belonged to a wrongly identified sound, a localisation would fail.If, however, a call that involved three detections had two bearings that matched and one that did not (the 'outlier'), it would give the latter a weight close to 0, and the others a weight closer to 1, thus resulting in a successful localisation.As we only wanted the 'valid' detections in our detection histories, we used these weights to remove wrongly identified and matched calls.This was done using the following procedure: 1. We start by only including call events that led to a successful localisation.This first step got rid of misidentified airgun signals, as those originated too far from to array to be localisable.
2. If the call event contained at least one bearing with a weight < 0.02, we removed the detection with the lowest weight.
3. The call was localised again (now excluding the previously removed bearing), and the newly assigned weights were evaluated.
4. If the call was successfully localised and still contained at least one weight < 0.02, we removed that detection and went back to step 2. Otherwise, the call was now considered valid.
The weight of 0.02 was found to most accurately remove wrongly identified and matched calls; through visual checks we found that lower values still allowed for some incorrect bearings to be included, and that increase the weight did not make a difference in how many incorrect bearings were removed.Finally, one call had a failed noise recording and was thus excluded from the analysis.
This left us with a total of 5793 'correct' calls detected on at least two sensors at site 5 on 31 August 2010.The next stage was to truncate the data to include only detections that exceed the highest sampled background noise level of 96dB.This left 443 calls after truncation that were used in the analysis reported in the main paper.

(In)accurate bearings
When visually exploring our data, we noticed that not all bearings were equally precise.Plotting a subset of the bearings suggested a split in bearings of mostly accurate and some inaccurate ones.It was there hypothesised to best allow for a two-part mixture model on bearing accuracy, as presented in the main paper.An example of inaccurate versus accurate bearings is presented in Figure 1 below.calls could be louder when ocean background noise is louder, an effect that is also known as the 'Lombard effect' (Thode et al. 2020).When ocean noise is then spatio-temporally independent, so is source level.We believe that a violation of this assumption would mainly affect uncertainty estimates and not point estimates.

Appendix C: Spatial Mesh
In spatial capture-recapture, and therefore also in acoustic spatial capture-recapture, we estimate a density surface.Total abundance is derived by integrating over space, giving a total abundance for the area.In practice, we approximate the integral by summing the function over a discrete mesh.We cannot compute the approximation over infinite 2-dimensional space, so we only approximate the integral over our study area A ⊂ R 2 The condition on this study area is that the probability of a call originating at the bounds or farther out has a probability of being detected on at least two sensors is at most 1%.This makes the integration computationally feasible whilst inducing negligible bias.
In our case study, loud calls could potentially be audible hundreds of kilometres from the source.To keep computer run-time low, we had to limit the number of grid cells in the order of 500, which limits the resolution of our mesh.As our study site did not contain many steep gradients in the covariates, we did not need a high resolution to capture this.However, we did need a finer resolution closer to the sensor array to accommodate the bearings.Recall that we included bearing data with errors in our model.If we used a low resolution grid close to our array, we would have greatly reduced the probability that one of the midpoints of our grid cells was close to what the recorded bearings point towards, and thus it would have been hard for the model to estimate the bearing accuracy.Our solution was to use a mesh with increasing grid spacing the farther the cells are from the sensor array.
This mesh had a two-stage grid spacing: the 'high resolution' cells within a 10,000 meter radius from the outer sensors of the array had an area of 6.25 km 2 each; the 'low resolution' cells between 10,000 and 50,000 meters from the outer sensors had an area of 25 km 2 .This way we created higher resolution near the sensors whilst keeping the number of grid cells in the mesh to 438.Every grid cell was rectangular on the Albers projection, and the associated covariate information (i.e., ocean depth and distance to the coast) was derived for the midpoint of the cell.Plots of the mesh with detection probabilities associated with the simulation parameters (which were based on parameter estimates from the real data) are presented in Appendix D. These plots that the detection probabilities are well below 1% at the bounds.
6. Using the distances and the transmission loss parameter β r , derive the received levels for all calls at all sensors, and add measurement error from N (0, σ 2 r ).
7. Assign detection probability g 0 to every call at a sensor when r ij is at least truncation value t r , and 0 otherwise.
8. Simulate from U (0, 1) and accept as a positive detection if the value simulated is less than or equal to the detection probability.Then, remove all calls and associated data if the total number detections for that call is less than two.9.Return the detection histories, received levels, and bearings for the remaining calls.
Alternatively, if a mixture on the bearing precision is desired, step (4) is replaced by: 4a.For all calls, derive the bearings from the sensors to the call.4b.Simulate from U (0, 1) and label the detection of a call on a sensor as 'bad/low-precision' if the value simulated is less than or equal to (ψ κ ).To the bearings for these sensor-call interactions, add error from VonMises(0, κ).4c.For the remaining bearings (the 'good/high-precision' bearings) add an error from VonMises(0, κ + δ κ ).We link the two distributions through their dispersion parameters to ensure identifiability by making it explicit that the second distribution always has a larger dispersion parameter, and thus lower variance.

Simulation studies for functionality
We first performed several simulation studies to evaluate the bias and variance of the estimates.The code used to simulate data can be found in simulate data Rcpp.R.The parameters values used for the simulations were based on exploratory fits on the real data.Using a truncation value of 96 dB for the detection function, we fit a homogeneous and non-homogeneous density model.Both models included a mixture on bearing precision and source level as a latent variable.
We did not centre the covariates because this makes the spatial density covariates harder to interpret, and an exploratory analysis with centred covariates yielded almost identical results.

Checking the buffer width
Figures 1-4 are associated with the parameters from Table 1, corresponding to the model with a variable source level.Figures 5-8 are associated with the parameters from Table 2, corresponding to the model with a fixed source level.They show the midpoints of the grid cells, and whether a call produced at that point is expected to surpass the multiplydetection threshold (blue) or not (pink), which is either 1 or 0.1.We used a threshold of 0.1 for our study, but these figures show how we could have reduced the size or our mesh by increasing this threshold to 1.The plots below are associated with the parameters from Table 2, corresponding to models with a fixed source level.They show the midpoints of the grid cells, and whether a call produced at that point is expected to surpass the multiply-detection threshold (blue) or not (red), which is either 1 or 0.1.The red points represent the sensor array.D ~distance to coast + depth + depth:distance to coast, D ~distance to coast + distance to coast2 + depth + depth2 + depth:distance to coast)) An overview of the model fits is presented below.We derive the AIC, AICc and BIC for all models, and order based on AIC.AIC assumes independence between the observation (as it is a likelihood based statistic) and penalises for amount of parameters in the model.However, if the log likelihood is overestimated (and thus the information) due to unaccounted for dependence between the (some of) the variables, AIC could incorrectly prefer more complex models.We acknowledge this, however, we still use it as the case study was mainly functions to show the functionality of the method -if one would be explicitly interested in the actual call abundance, finding a more effective model selection criteria would be more important.In the following table, 'ID' is a unique identifier, 'Density Model' is the R specification of the density model, 'N hat' is the estimated abundance, '#par' is the number of model parameters, and 'AIC' is the AIC score (Akaike 1998)

Results from the 999 bootstraps
To find the uncertainty around our point estimates from Model 31, we bootstrapped (i.e., re-sampled with replacement) the original calls 999 times.This creates 999 'equally likely' data sets, assuming that calls are independent.The results from these bootstraps are presented below.We present the parameter estimate, the lower and upper limits of the absolute (LCL and UCL) and relative (LRCL and URCL) 95% confidence intervals, the standard deviation (SD) and the variance, and the coefficient of variation (CV).Estimates are rounded to two significant figures.
Par A surprising result is the high CV of log(ψ κ ) (1423%).This is caused by a negative outlier (-234.24)and a mean close to 0. As these estimates are on the log scale and the outlier is << 0, these effects are dampened when converting to the real scale, which is why CV( κ) = 22 (see Table 2 in the main paper).subtraction as r and c are denoted on the decibel scale.Setting the lower limit θ L to 0 this function simplifies to (2) It is fundamental to understand that we often cannot use the recordings to measure SNR or received level directly, as we only have those for sensors that actually detect the call.There are K sensors and generally only some detected the call.To evaluate the likelihood of detection and of non-detection, we need a unique detection probability for every sensor, which means we need an explanatory variable that is different for every sensor.Thus, we use the expectation of the SNR to determine the probability of detection at a sensor, taking the other information as given (see Equation 24).
As was already clear, we have two latent variables: the source level and spatial origin of the call.We deal with this by marginalising over these variables, as described below.
Joint distribution of the detection histories, bearings and received levels, given two or more detections and noise, f (ω, y, r|ω * ≥ 2, c) We start by specifying the joint distribution of our observed stochastic variables.We know this distribution does not only depend on detection, but also on spatial origin and the source level of the detected call.Since these variables are latent, we integrate them out, which gives the following formula for the joint distribution where S is the support for s, which in theory is (0, ∞) but in practice it suffices to use a smaller support such as long as it sufficiently covers the probability density of s.Using standard conditional probability theory we can rewrite this distribution as (4) We will now focus on each component in Equation 6separately, starting with f Distribution of the source level of the call, given at least two detections and noise, f (s|ω * ≥ 2, c) Using Bayes' formula and the addition of latent variables similar to before, we can rewrite the distribution of s|ω * ≥ 2, c the following way Note that some probability distributions, such as the one for source level, are not denoted as conditional on noise.This is because the variables s and x are assumed independent of noise.(Actually, a recent paper by Thode et al. (2020) showed that the source level of bowhead whale vocalisations in related to other discrete sounds and noise, but here we assume them to be independent.) Distribution of the spatial origin and source level of the call, given at least two detections, noise, and source level, f (x|ω * ≥ 2, c, s) The probability density of the spatial origin of an emitted call x is proportional to the density at x, such that f (x) ∝ D(x).The normalising constant is therefore the integral over space, R 2 D(x)dx and this gives We have not defined a spatial density model D(x) for the calls yet.It is common practice to try several density models and then choose to best one using some criterion such as AIC.The density of x|s, In both definitions we assume the distribution of emitted calls to be unrelated to s and c, thus f (x|s, c) ≡ f (x) (note that this is not true for the distribution of detected calls, which changes when s and c changes).As noted before, in this study, we assume s ∼ N (µ s , σ 2 s ).Combining Equations 8 and 9 gives where in the last step the constants R 2 D(x)dx cancel out and a : Finally, the distribution of f (ω * ≥ 2|x, s, c) is a probability mass function, where ω * ≥ 2 can either attain the value 0 when detected at most once or 1 when detected twice or more.This is represented by the following: where g j (x, s, c j ) denotes the detection function at detector j, which is specified in Equation 24.Equation 9 can readily be extended to set the minimum number of detections per call at 3, 4, etc. (although the formula and computational complexity significantly increases; see the Discussion in the main paper).
Joint distribution of the capture history, bearings and received level for a call, given its spatial origin, source level, at least two detections and noise, f (ω, y, r|x, s, ω * ≥ 2, c) First, we observe that Since y and s are unrelated, and so are r and y, we can rewrite the above formula as We now evaluate this joint density from left to right, starting with the density of ω|x, s, ω * ≥ 2, c.First, we rewrite f (ω|x, s, ω * ≥ 2, c) as Conditioning on the detection history gives a probability of two or more detections, since the probability of detection at least twice is implied by the existence of the detection history.
We can view the detection history of a call as a set of Bernoulli trials with size K and non-constant probability p j (x, s, c) with j = 1, .., K.This gives The second term in Equation 14is the density of y|ω, x, ω * ≥ 2, c, which we can rewrite as where we assume independence between bearings and noise.
To keep this section as simple as possible, we assume that all bearings follow a Von Mises distribution with a single concentration parameter κ (unlike the main paper, where we assume a 2-part mixture), which gives f (y|ω, x) = j∈{1:K|ω j =1} exp{κ cos(y j − E [y j |x])} 2πI 0 (κ) , where I 0 is the modified Bessel function of order 0 and E [y j |x] is the expected bearing at sensor j for origin x.
Finally, we rewrite the density of r|ω, x, s, ω * ≥ 2, c as f (r|ω, x, s, ω * ≥ 2, c) = f (ω * ≥ 2|r, ω, x, s, c)f (r|ω, x, s, c) f (ω * ≥ 2|ω, x, s, c) Following the model described in the main paper, we assume a Gaussian error on received levels to capture measurement error and error in the propagation model, such that Received level itself is a function of: source level, denoted by s; the Euclidean distance in meters, denoted by d j (x), between the spatial origin of the call x and sensor j; and the transmission loss per distance travelled by the sound, β r .Standard sound propagation theory (Jensen et al. 2011) gives the following formula for the expectation for r j : E [r j |s, x] = s − β r log 10 (d j (x)) , where β r is a parameter indexing the rate at which received level decreases with distance.More complex propagation models can be used, as was shown by Phillips (2016) and Royle (2018).As r j and c j are both denoted on the dB scale, the expected signal-to-noise ratio at sensor j is simply given by Signal strength is only recorded if the call is detected, and thus the density is only specified when ω j = 1.Assuming independence between the detectors, we can write f (r|ω, x, s, c) = j∈{1:K|ω j =1} f (r j |ω j = 1, x, s, c j ), (20) where Bayes' formula gives f (r j |ω j = 1, x, s, c j ) = f (ω j = 1|r j , x, s, c j ) × f (r j |x, s, c j ) f (ω j = 1|x, s, c j ) (21) The first element of the numerator in Equation 21is the probability of detection given the received level and the noise, which is given by Equation 4 but now with r j known.This means that x and s are redundant, thus giving f (ω j = 1|r j , x, s, c j ) = f (ω j = 1|r j , c j ) = p(r j − c j ) The second element of the numerator in Equation 21 is the probability density function of received level given the spatial origin and the source level, which is specified in Equation 19.As true received level and noise are assumed independent, we get f (r j |x, s, c j ) = f (r j |x, s) = 1 σ r ϕ r j − E (r j |x, s) σ r , where ϕ denotes the standard normal probability density function.The denominator in Equation 21 is the probability of detection given the spatial origin, source level, and noise at sensor j.As we also need the received level to derive this probability, we condition on and marginalise over r j , which gives f (ω j = 1|x, s, c j ) = where R ∈ (−∞, ∞) denotes the support for the received level.Equation 24 is called the detection function and, in line with other literature, is also denoted by g j (x, s, c j ).We combine the three elements to rewrite Equation 20 (26)

Generalising to n calls
The above gave the density of the detection history, bearing and received level for a single detection.Here we generalise ton calls that were detected at least twice.We assume that all calls are independent of one another (this assumption is certainly violated due to spatial correlation in the individuals creating the calls).We define the (n × 2)-matrix of all spatial origins X = (x 1 , ..., x n ), the (n × 1)-vector of all source levels s = (s 1 , ..., s n ), the (n × K)-matrix of all detection histories Ω = (ω 1 , ..., ω n ), the (n × K)-matrix of all bearings Y = (y 1 , ..., y n ), the (n × K)-matrix of all received levels as R = (r 1 , ..., r n ) and the (n × K)-matrix of all the noise as C = (c 1 , ..., c n ).Using this notation, we can create the joint distribution of n, Ω, Y , R|C as Borchers and Efford (2008)).This gives: f (Ω, Y , R,X, s|n, C) = f (Ω|n, C)f (Y , R, X, s|Ω, n, C) f (ω i , y i , r i , x i , s i |ω * ≥ 2 i , c i ).
(29) with n n 1 ,...,n U being the multinomial coefficient where n 1 , ..., n U are the frequencies for U unique detection histories.This coefficient appears as the order of the calls does not matter.Adding the integrals again and using the independence assumption between calls, we get the following n n 1 , ..., n U n i=1 S R 2 f (ω i , y i , r i , x i , s i |ω * ≥ 2 i , c i )dx i ds i , (30) where

Figure 1 :
Figure 1: Counts of detected calls by number of DASARs (sensors) involved per call at site 5 on 31 August 2010.Detections were included if the received level was at least 96 dB.The high proportion of singletons (detections on a single DASAR; 1417) raised concerns about the validity of those calls.The dashed line shows the number of singletons (570) that were estimated to be valid using the methods developed in this paper.

Figure 2 :
Figure 2: Relative error, relative bias and coefficient of variation for N from 100 simulations.The black dots correspond to the relative bias (RB; the mean relative error) and the grey shading shows the spread of relative error per scenario.The RB and coefficient of variation (CV) for every scenario and model combination is shown just above the xaxis.Simulation scenario 1 is variable sound source level and 2 is fixed sound source level.Analysis model (a) is the true model, (b) the incorrect source level model, (c) an incorrect simple bearing model, (d) a model with bearing data, and (e) an incorrect homogeneous spatial density model.

Figure 3 :
Figure 3: A map of the study site 5 from 31 August 2010 with the estimated density surface (left) and the associated empirical quartile coefficient of dispersion (QCD; right).Locations of the 6 DASARs are indicated with the triangles.The square grid cells within 10 km of the array each have an area of 6.25 km 2 , and and area of 25 km 2 elsewhere.The density map corresponds to the model with the lowest AIC, which is model 33 in Appendix D in Supplementary Materials.The QCD is derived from 999 Monte Carlo simulations based on the same model, where the calls are re-sampled.

Figure 1 :
Figure 1: Plots of five calls from the data from Site 5 on 31 August 2020.Most bearings are fairly accurate, but these five plots show that can occasionally be way off.

Figure 2 :
Figure 2: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the homogeneous density parameters for the variable source level model.The sensor array is indicated in red.

Figure 3 :
Figure 3: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the homogeneous density parameters for the variable source level model.The sensor array is indicated in red.

Figure 4 :
Figure 4: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the inhomogeneous density parameters for the variable source level model.The sensor array is indicated in red.

Figure 5 :
Figure 5: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the inhomogeneous density parameters for the variable source level model.The sensor array is indicated in red.

Figure 6 :
Figure 6: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the homogeneous density parameters for the fixed source level model.The sensor array is indicated in red.

Figure 7 :
Figure 7: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the homogeneous density parameters for the fixed source level model.The sensor array is indicated in red.

Figure 8 :
Figure 8: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the inhomogeneous density parameters for the fixed source level model.The sensor array is indicated in red.

Figure 9 :
Figure 9: Plot of midpoints of the grid cells.Blue indicates that the expected probability of detecting a call produced at that point at least twice is at least the threshold, and pink indicates otherwise.Detection probabilities were derived from the inhomogeneous density parameters for the fixed source level model.The sensor array is indicated in red.

Table 2 :
The point estimates and associated uncertainties.SE is standard error, CV stands for coefficient of variation, and Single SL are point estimates from the equivalent model without source level heterogeneity.N is a derived quantity; g 0 through ψ κ are the observation parameters presented on the real scale; and the remainder are the density parameters presented on their log link scale.

Table 1 :
Parameter estimates for the variable source level model with a homogeneous or inhomogeneous density model, and the simulation parameters.Our second simulation scenario involved simulating from a model with a fixed source level.Fitting a fixed source level model to the data, we get the following parameter estimates.

Table 2 :
Parameter estimates for the fixed source level model with a homogeneous or inhomogeneous density model, and the simulation parameters. ).