Background

False positives in ecological data can substantially affect the interpretation of results [1]. In some areas of ecology, such as automated acoustic monitoring of amphibian, bird, cetacean, and bat sounds, issues related to false-positive data have been assessed in detail [2–4] and their effects on results of analyses, such as species distribution models, have been accounted for [1]. In other ecological fields, false-positive data have been poorly examined. One such field is aquatic acoustic telemetry monitoring of tagged animals, herein referred to as acoustic monitoring.

Acoustic monitoring has become a popular tool in the study of aquatic animals as it enables a large number of subjects to be tracked simultaneously [5–7]. While there are several equipment manufacturers and standards, the most commonly used is the VEMCO VR2 system, which uses a proprietary pulse position modulation coding scheme. The VR2 system has two parts: acoustic receivers are deployed in a study area to listen for coded signals from transmitters deployed on released animals. Each transmitter produces a signal with a unique tag ID code. The number of tag ID codes available to users depends on the coding scheme used, with multiple code spaces (i.e. coding schemes that define the number of available tag ID codes that can be encoded) available to increase the number of unique tag ID codes and allow the provision of different forms of telemetry data, e.g. temperature, depth, and acceleration. Animal detection data from receivers are used to generate information on movements, residency, behaviour, and many other ecological aspects [5].

Acoustic monitoring researchers have long known about the occurrence of false-positive data (known more commonly as ‘false detections’) in their datasets. For example, Heupel et al. [8] excluded detections of known tags using a speed criterion in data from grey reef sharks (Carcharhinus amblyrhynchos) that had potentially moved long distances (>100 km) between two acoustic arrays. The importance of false detections has commonly been overlooked because research primarily focuses on known tag ID codes from animals released in relatively small areas. As acoustic monitoring increases in popularity and data sharing grows amongst research groups through the implementation of regional, national, and international databases of detections (e.g. the Atlantic Cooperative Telemetry Network, http://www.theactnetwork.com/home; the Australian Animal Tracking and Monitoring System, https://aatams.emii.org.au/aatams; and the Ocean Tracking Network, http://oceantrackingnetwork.org), the potential for false detections to arise in datasets increases drastically, as does the potential for erroneous interpretation. It is therefore becoming increasingly important to understand the nature of false detections and the processes that create them, and so to identify which tag ID codes are relevant to analyses.

False detections occur when transmissions from two or more transmitters collide, which results in a detection of a different tag ID code by the receiver [9]. Within a dataset two situations can occur. First, a false detection can produce an unknown tag ID code, i.e. a code from a transmitter that has not been released in a specific study location (but may have been released elsewhere), herein referred to as a Type A false detection. For researchers using data from their own study area, these Type A false detections are relatively easy to identify and discard from analyses because the unique tag ID codes erroneously created differ from those released. However, in shared datasets these are much more difficult to identify. Alternatively, a false detection may result in a tag ID code that is the same as the one from a transmitter that has been released; herein referred to as a Type B false detection. Differentiating Type B false positives from true-positive data is inherently more difficult. Importantly, shared data may include Type A false detections that are difficult to identify without the broader information and context derived from the local study and experimental design. Thus, without adequate quality control algorithms false detections may be used in data analyses and lead to biased or erroneous outcomes.

To better understand the occurrence of false detections in acoustic monitoring data, we used detection data from a range test experiment carried out over a 9-month period. This experiment allowed for the examination of false detections in a controlled situation. The specific aims were to (1) determine the occurrence of Type A and Type B false detections, (2) examine the consistency in the occurrence of Type A false detections between receivers, (3) determine if there were patterns in the tag ID codes of Type A false detections, (4) examine the pattern of Type A false detections over time, and (5) examine if false detections occurred as a result of random processes or if they were correlated with the performance of receivers.

Results

A total of 151 unique tag ID codes were recorded by the eight receivers during 235 days of monitoring. These tag codes resulted in 1 975 165 detections. Twelve of the tag ID codes were from tags deployed for the range test experiment and two were from tags registered in the AATAMS database: one white shark (Carcharodon carcharias) and one wobbegong shark (Orectolobus maculatus). The remaining 137 tag ID codes were from no identifiable source and were potentially the result of Type A false detections. All of the known tag ID codes were identified by the VUE False Detection Analyser as likely to be from valid tags (experiment tags: 96 tag ID–receiver combinations; white shark: eight tag ID–receiver combinations; and wobbegong shark: one tag ID–receiver combination), while all of the unknown codes were identified as likely to come from Type A false detections (384 tag ID–receiver combinations). The tag ID codes that were identified as likely from false detections accounted for 918 detections (<0.05 % of detections) throughout the study period. The majority of the 137 tag ID codes (98.5 %) identified as Type A false detections were from the A69-1303 code space, i.e. the same code space as all of the experiment tags and the two detected sharks. Single tag ID codes from the A69-1105 and A69-1008 code spaces were also detected. Examination of time differences between detections of the experimental tags did not identify any Type B false detections, with all time differences greater than or equal to the specified repeat rate.

The Type A false detection tag ID codes were not randomly distributed throughout the available code space (Wald–Wolfowitz runs test, µ = −11.53, p < 0.0001). The tag ID codes identified as Type A false detections were more likely to occur within 5000 codes of the experimental tags (Fig. 1), showing that false detection tag ID codes were likely to be close to codes from real tags. The stepped nature of the tag ID distribution plot also suggested that Type A false detection tag ID codes occurred in groups. The ratio between the variance and the mean was very large (VMR = 4367), supporting the hypothesis that the false detection tag ID codes had a clumped distribution.

Fig. 1 Distribution of tag ID codes within the available code space relative to deployed experimental tags

The Type A false detection tag ID code discovery curves for the near-surface, mid-water, and one of the near-bottom/upward facing receivers showed a decreasing rate of new codes being added the longer the study ran (Fig. 2a–c). The remaining near-bottom receivers, and all receivers combined, had relatively straight discovery curves indicating that new tag ID codes were consistently detected throughout the study period (Fig. 2c, d).

Fig. 2 Discovery curves for tag ID codes from Type A false detections on a near-surface, b mid-water, c near-bottom, and d all receivers combined. Black lines in a–c indicate downward facing receivers, grey lines indicate upward facing receivers. Note that the y-axis scale of d differs from a–c

Forty-eight percent of tag ID codes identified as Type A false detections occurred on only one receiver, while only 2.2 % occurred on all eight receivers (Fig. 3). Individual receivers detected between 29 and 67 tag ID codes from Type A false detections (Table 1). Seven of the eight receivers recorded ten or fewer unique tag ID codes from Type A false detections, i.e. codes detected on only that specific receiver, but one (near-bottom/downward facing) recorded 24 unique tag ID codes (Table 1). The downward facing/near-bottom receivers had the largest numbers of tag ID codes from Type A false detections, the largest numbers of unique tag ID codes from Type A false detections, and the largest numbers of total detections from Type A false detection tag ID codes (Table 1).

Fig. 3 Number of receivers on which individual tag ID codes identified as Type A false detections were detected

Table 1 Distribution of false detection tag ID codes by receiver

The number of Type A false detections recorded across all eight receivers each day varied considerably (Fig. 4), with no false detections recorded on any receiver on 18 days. The maximum number of Type A false detections across the eight receivers in a single 24-h period was 13. The numbers of Type A false detections on individual receivers also varied from day to day, from a maximum of six (downward facing/near-bottom receiver) to zero (all receivers). The distribution of the number of Type A false detections per day was significantly different to that expected from random events (comparison to a Poisson distribution using a chi-squared test; all p < 0.0001) (Table 1). Comparison of the mean number of Type A false detections per day showed that receivers with the same placement on the deployment, i.e. depth and orientation, did not differ in the numbers of Type A false detections recorded (Table 2). There were also some similarities in the occurrence of false detections between most near-bottom receivers irrespective of their orientation, and the mid-water receivers had some similarities to both the near-bottom and near-surface receivers (Table 2). The mean daily numbers of Type A false detections for both surface receivers were significantly lower than those for all of the near-bottom receivers.

Fig. 4 Number of Type A false detections per day for all eight receivers combined

Table 2 Results of pairwise Wilcoxon rank-sum tests comparing mean numbers of daily Type A false detections between receivers

The distribution of daily numbers of Type A false detections by each receiver did not conform to a Poisson distribution suggesting that the numbers of Type A false detections were not the result of a random process (Table 1). There were no strong correlations between the daily numbers of Type A false detections and any of the five receiver performance metrics (Table 3). The two near-bottom/downward facing receivers consistently had the highest correlation values for all metrics, with all values greater than 0.324 or less than −0.327. There was also no trend in numbers of Type A false detections by the time of day (Fig. 5).

Table 3 Correlation coefficient values between daily numbers of Type A false detections and receiver performance metrics

Fig. 5 Number of Type A false detections by all receivers combined in 1-h periods throughout the day

Discussion

The results of this study demonstrate that false detections can occur in acoustic monitoring studies using the VEMCO VR2 system, but that their occurrence is rare relative to the number of detections of valid tags (<0.05 % of detections). Only Type A false detections were identified, suggesting that Type B false detections are much less common in acoustic monitoring datasets. It is possible that Type B false detections did occur but could not be identified by looking for sequential detections at shorter intervals than the repeat rate of the transmitters. Other studies have identified Type B false detections (e.g. [8]) using limits on animal movement ability. While such an approach could not be used for this study because the transmitters were stationary, it may be particularly useful in studies that combine data from receivers placed large distances apart. The ability of the VEMCO False Detection Analyser to correctly identify known transmitters as likely valid and unknown tags as likely invalid indicates that this approach is suitable for identifying Type A false detections in datasets.

The placement of receivers in the water column appears to have an effect on the number of false detections, the number of false detection tag ID codes, and the consistency through time with which new tag ID codes were discovered. The near-bottom receivers, and especially the downward facing ones, performed worst in all of the false detection metrics examined. This may be the result of an increased likelihood of sound waves reflected from the bottom reaching these receivers, which would increase the likelihood of two signal transmissions colliding. It is also possible that the reflection of a single transmission interferes with itself, resulting in a false detection. Environmental noise can also be a factor affecting acoustic detections [5]. Receivers at the same level in the water column detected similar numbers of false detections, although they did not necessarily detect the same tag ID codes. Differences in tag ID codes between receivers with similar placement suggest that tag collision events that lead to false detections are interpreted differently even between closely located units. The greater differences in the number of false detections between receivers with different placement may suggest that relatively small changes in distance from the transmitters of the colliding signals affect the tag ID code recorded. The data also suggested that receivers further from the bottom produced fewer Type A false detections. This may be the result of increased reflection of sound and biological noise closer to the bottom. These results provide clear guidance on receiver placement for acoustic monitoring practitioners wishing to minimise the occurrence of false detections in their data. Receivers further from the bottom are likely to produce fewer false detections, and if they are placed close to the bottom, then they should face upwards. However, researchers must weigh this advantage against the ability of the receivers to detect the tagged animals.

The repeat rate of the experimental transmitters (the period between signal transmissions by the tag) used in this study was substantially longer than that used by most researchers in the field. The long repeat rate was used deliberately in this experiment to reduce the likelihood of tag collisions affecting range testing. This suggests that for many animal-based studies that routinely use transmitters with faster repeat rates, there may be more false detections than observed in this study. Whether this would result in more tag ID codes from Type A false detections, greater numbers of a similar set of tag ID codes, or both, remains to be investigated. The lack of asymptotes in the tag ID code discovery curves suggests that more codes would have been detected in this study if a shorter repeat rate had been used. Faster repeat rates are also likely to increase the probability of Type B false detections, and further work to identify this type of false detection is needed.

The processes that lead to false detections suggest that they should occur randomly through time. However, in this study the number of Type A false detections recorded each day was not random, suggesting that other processes may affect their occurrence. We were able to exclude changes in the performance of receivers [1] as factors that affected the numbers of Type A false detections. Further investigation will be required to identify the processes that affect the occurrence of false detections, including changes in environmental conditions and biofouling [10, 11].

The distribution of Type A false detection tag ID codes within the code space used in this study was also non-random, with codes more likely to occur close to real tag ID codes in the system and also occurring in groups. This suggests that collisions between pairs (or more) of real tag ID codes can only create a subset of false detection codes because of the way that codes are transmitted. This hypothesis was supported by the curvature in the tag ID discovery curves from near-surface and mid-water receivers, which suggested that few new codes were being recorded even though only a small proportion of possible tag ID codes had been detected. This situation may also have been exacerbated by the stationary nature of this study, and as such the distribution of false detection codes may be different in studies where animals are moving relative to receivers.

Studies examining false detections in acoustic monitoring data are limited, and there is a need to further understand their occurrence and the factors that affect their prevalence. In an age of increasing data sharing and public storage of scientific data, these issues are of significant concern. Ensuring that the data are quality controlled and used within the context of the experimental design parameters is crucial to avoiding erroneous conclusions. This study demonstrated that there are patterns in the Type A false detection tag ID codes and their occurrence was not always random. It will be important for researchers to understand the occurrence of false detections within datasets, especially where data are obtained from sources other than their own receivers. In these situations, it will be critical to employ checks for false detections to ensure that only valid data are used.

Methods

Experiment design

The range test experiment from which the data to investigate false detections were derived commenced on December 15, 2008 and ended on July 7, 2009. During this period, eight VEMCO VR2W acoustic receivers were suspended in the water column offshore from Bondi Beach (33.9259°S, 151.3546°E) on a mooring with a subsurface float approximately 15 m below the surface in approximately 85 m of water. Pairs of receivers were attached to the mooring line at three depths (near-surface ~21 m, mid-water ~54 m, near-bottom ~78 m; see Fig. 6). Two pairs of receivers were used in the near-bottom area, one pair oriented upwards and one pair oriented downwards. Pairs of receivers were placed about one metre apart on the mooring line to reduce the chances of interference among units. Twelve VEMCO V16-4L acoustic transmitters (147 dB, tested for consistency by the manufacturer) were then anchored at varying distances from the receiver mooring. Transmitters were attached to mooring lines with cable ties and the battery end taped to the line to avoid any movement that would cause noise. Mooring lines were fitted with subsurface floats positioning the tags at depths of ~60 m. The closest transmitter was at 200 m, the next at 300 m, and the remainder were placed at 50-m increments from 350 to 800 m (Fig. 6). All transmitters emitted a signal at 69 kHz at pseudo-random repeat rates between 550 and 650 s in the A69-1303 code space. This code space encompasses 65,536 possible tag ID codes numbered sequentially from one. An acoustic release was used to retrieve the acoustic receivers at the end of the experiment. All data used in this analysis are publicly available via the Integrated Marine Observing System’s Australian Animal Tracking and Monitoring System (AATAMS) database (https://aatams.emii.org.au/aatams/).

Fig. 6 Experimental design showing the location and depth of receivers and transmitters. Receivers labelled with asterisks were upward facing; all others were downward facing

Data analysis

Data from each of the receivers were downloaded into a VEMCO VUE database from which information on the number of detections of individual tag ID codes by individual receiver and receiver performance was obtained. In addition, the False Detection Analyser function in VUE version 2.1.3 was used to evaluate the likelihood that individual tag ID codes detected on individual receivers (i.e. false detections were analysed for each tag ID at each receiver, referred to as a tag ID–receiver combination) were the result of false detections. The False Detection Analyser used an algorithm that determines the time between detections of a tag ID code on each receiver and computes the ratio of short periods between detections (default value <30 min) to long periods between detections (default value >12 h) [9]. Only the default values were used in the False Detection Analyser. If there were more long periods between detections than short periods, then a tag ID code was considered to have a high likelihood of being the result of false detections; otherwise detections from tag ID codes were considered to be valid. The False Detection Analyser could not be used for tag ID codes that had a single detection and these were assumed to have a high likelihood of being the result of a false detection.
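The interval-ratio rule described above can be sketched as follows. This is a minimal illustration based only on the description in the text, not VEMCO's implementation; the function name `classify_tag_receiver` and the label strings are hypothetical, and the handling of intervals that are neither short nor long (they are simply ignored) is an assumption.

```python
from datetime import datetime, timedelta

# Default thresholds quoted in the text.
SHORT = timedelta(minutes=30)  # "short period" between detections
LONG = timedelta(hours=12)     # "long period" between detections


def classify_tag_receiver(detection_times):
    """Classify one tag ID-receiver combination (hypothetical helper).

    detection_times: datetimes at which one tag ID code was logged by
    one receiver. Returns "likely false" or "likely valid".
    """
    times = sorted(detection_times)
    # A single detection cannot be analysed; the text assumes it is false.
    if len(times) < 2:
        return "likely false"
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    n_short = sum(gap < SHORT for gap in gaps)
    n_long = sum(gap > LONG for gap in gaps)
    # More long gaps than short gaps indicates a likely false detection.
    return "likely false" if n_long > n_short else "likely valid"
```

A code logged every ten minutes would be classed as likely valid under this rule, whereas a code logged three times over several days would be classed as likely false.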

We distinguished between two types of false detections: false detections that generated tag ID codes that are unknown (Type A) and those that resulted in tag ID codes from known tags (Type B). To determine if false detections produced tag ID codes that were the same as those from the experimental tags (Type B), we examined the data from each receiver for detections of known tags at less than the specified repeat rate (550–650 s) by calculating the difference in time between detections for each receiver and experimental tag. Time differences of less than 520 s (the shortest specified repeat rate less a 30-s buffer, to ensure that there were clear differences from the repeat rate) indicated likely false detections of known tags. This approach could not identify Type B false detections that occurred at times longer than the repeat rate, so the results were considered a conservative estimate.
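A minimal sketch of the Type B screening step follows, assuming detections are available as (receiver, tag, time in seconds) tuples. The tuple layout and the function name `flag_short_intervals` are illustrative assumptions, not part of the original analysis; the 520-s threshold is the value given in the text.

```python
from collections import defaultdict


def flag_short_intervals(detections, min_interval_s=520.0):
    """Flag consecutive detections of a known tag that are too close in time.

    detections: iterable of (receiver_id, tag_id, time_s) tuples
    (assumed layout). Returns a list of
    (receiver_id, tag_id, earlier_time_s, later_time_s) tuples for each
    pair of consecutive detections less than min_interval_s apart.
    """
    times_by_pair = defaultdict(list)
    for receiver_id, tag_id, time_s in detections:
        times_by_pair[(receiver_id, tag_id)].append(time_s)

    flagged = []
    for (receiver_id, tag_id), times in times_by_pair.items():
        times.sort()
        for earlier, later in zip(times, times[1:]):
            if later - earlier < min_interval_s:
                flagged.append((receiver_id, tag_id, earlier, later))
    return flagged
```

Intervals are computed separately for each receiver and tag, so the same transmission logged by two receivers is not flagged.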

A data set that contained the number of detections of each tag ID code, receiver, and day was generated for analysis. Tag ID codes that were known from the experiment and any others that were included in the AATAMS database (https://aatams.emii.org.au/aatams) or were identified by the False Detection Analyser as “likely valid” tags were removed from this data set. This filtering process left a data set that contained only those tag ID codes that were likely the result of Type A false detections. To determine if the tag ID codes in this data set occurred randomly throughout the A69-1303 code space, the tag ID code numbers were used in a Wald–Wolfowitz runs test. To test if tag ID codes from false detections occurred close in number sequence to the tag ID codes used in the experiment, the absolute value of the difference between the false detection code numbers and the lowest experimental code number was calculated (each tag ID code includes a unique code number; in this case between 1 and 65,536). To determine if the rate of adding new Type A false detection tag ID codes decreased over time, discovery curves were constructed by plotting the number of unique tag codes recorded from the start of the study to each of the 232 days following deployment. Discovery curves that were straight indicated a consistent rate of adding new tag ID codes, while those that reached an asymptote indicated that no new tag IDs were detected toward the end of the experiment.
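The runs test and discovery curves described above can be sketched as follows. The text does not state how the code numbers were encoded for the runs test; this sketch assumes a presence/absence sequence over the whole code space (1 to 65,536) and uses the normal approximation for the number of runs. The function names `runs_test_z` and `discovery_curve` are hypothetical.

```python
import math


def runs_test_z(false_codes, code_space_size=65536):
    """Wald-Wolfowitz runs test Z statistic (normal approximation).

    Assumption: the code space is treated as a binary sequence in which
    each code number is either present among the false detection codes
    or absent. A strongly negative Z indicates fewer runs than expected,
    i.e. clumping of codes.
    """
    present = set(false_codes)
    seq = [code in present for code in range(1, code_space_size + 1)]
    n1 = sum(seq)           # codes that were detected
    n2 = len(seq) - n1      # codes that were not detected
    if n1 == 0 or n2 == 0:
        raise ValueError("both present and absent codes are required")
    runs = 1 + sum(a != b for a, b in zip(seq, seq[1:]))
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (runs - expected) / math.sqrt(variance)


def discovery_curve(codes_by_day):
    """Cumulative number of unique tag ID codes after each day.

    codes_by_day: list in which element d holds the codes recorded on
    day d (assumed layout).
    """
    seen = set()
    curve = []
    for codes in codes_by_day:
        seen.update(codes)
        curve.append(len(seen))
    return curve
```

A curve that keeps rising by a similar amount each day corresponds to the straight discovery curves described in the text, while a curve that flattens corresponds to an asymptote.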

To determine whether the daily number of receivers on which false detections occurred, and the daily number of detections on a single receiver, were the result of random processes, their distributions were compared to a Poisson distribution using a chi-squared goodness-of-fit test. The mean numbers of daily false detections were compared between pairs of receivers using Wilcoxon rank-sum tests since the data were not normally distributed. A Bonferroni correction was applied to the critical probability value for this test because 28 comparisons were made using the same dataset.
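The goodness-of-fit statistic and the Bonferroni adjustment can be illustrated with the sketch below. Several details are assumptions because the text does not give them: the Poisson mean is estimated from the data, counts at or above `max_bin` are pooled into one tail bin, and the uncorrected critical value is taken as 0.05. The function name `poisson_chi2` is hypothetical, and the p-value lookup is left to a statistics package.

```python
import math
from collections import Counter
from itertools import combinations


def poisson_chi2(daily_counts, max_bin):
    """Chi-squared statistic comparing daily counts with a Poisson model.

    Counts >= max_bin are pooled into a single tail bin (assumed binning).
    Returns (statistic, degrees_of_freedom); one degree of freedom is
    removed for the mean estimated from the data.
    """
    n = len(daily_counts)
    lam = sum(daily_counts) / n
    if lam == 0:
        raise ValueError("all counts are zero; the test is undefined")
    observed = Counter(min(c, max_bin) for c in daily_counts)
    probs = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(max_bin)]
    probs.append(1.0 - sum(probs))  # probability of the pooled tail bin
    stat = sum((observed.get(k, 0) - n * p) ** 2 / (n * p)
               for k, p in enumerate(probs))
    dof = len(probs) - 1 - 1
    return stat, dof


# Bonferroni correction: eight receivers give 28 pairwise comparisons.
receiver_pairs = list(combinations(range(8), 2))
ALPHA = 0.05  # assumed uncorrected critical value; not stated in the text
corrected_alpha = ALPHA / len(receiver_pairs)
```

With eight receivers the corrected critical value under this assumption is 0.05 / 28, or approximately 0.0018.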

Daily performance metrics for each acoustic receiver were calculated following Simpfendorfer et al. [12]. Five metrics were used: the number of pings (the number of acoustic pings recorded at the receiver’s working frequency; an A69-1303 tag ID code is made up of eight pings), number of detections, code detection efficiency (the ratio of detections to synchronisation codes), rejection coefficient (number of codes rejected due to an invalid checksum divided by the number of synchronisation codes), and noise quotient (the number of pings received over and above those that resulted from tag detections). The correlation coefficient between daily values of the number of false detections and values of performance metrics was calculated to determine if numbers of false detections were related to receiver performance or factors that also affected receiver performance.
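The five metrics and the correlation step can be sketched as below. The formulas follow the verbal definitions in the text rather than the cited source, so the noise quotient (pings minus eight pings per detection) is one reading of that definition. The text does not say which correlation coefficient was used; Pearson's r is assumed here. The names `daily_metrics` and `pearson_r` are hypothetical.

```python
import math

PINGS_PER_CODE = 8  # an A69-1303 tag ID code is made up of eight pings


def daily_metrics(pings, syncs, detections, rejected):
    """Receiver performance metrics for one day (assumed formulas)."""
    return {
        "pings": pings,
        "detections": detections,
        # ratio of detections to synchronisation codes
        "code_detection_efficiency": detections / syncs if syncs else float("nan"),
        # codes rejected for an invalid checksum per synchronisation code
        "rejection_coefficient": rejected / syncs if syncs else float("nan"),
        # pings over and above those accounted for by tag detections
        "noise_quotient": pings - detections * PINGS_PER_CODE,
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either sequence is constant
    return sxy / math.sqrt(sxx * syy)
```

Applying `pearson_r` to the daily false detection counts and each daily metric for a receiver would give one row of a table like Table 3.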