Findings

Background and aims

Accelerometer data loggers are increasingly used to study animal behaviour and to derive indirect proxies of energy expenditure in the wild [1–6]. A key challenge is deciding how to assign specific behaviours to large sets of acceleration data recorded at high frequencies (typically 10 to 50 Hz). To date, no common solution to this problem has been agreed; consequently, multiple processing approaches have been adopted, ranging from black-box machine learning techniques [7, 8] to more visual, and often manually applied, hierarchical key-based methods [9]. Classification usually requires a ground-truthing stage during which the animal is monitored (although unsupervised attempts have also been made [10]).

Here we combined a number of previously tested methods with novel analytical techniques to develop an extensible classifier with high potential utility across species. We used: (a) the k-nearest neighbour (k-NN) algorithm and (b) a sliding window combined with an intuitively constructed decision tree to facilitate the automated classification of behaviours. The sliding window and decision tree approach also allowed for the simultaneous visualisation of behaviours from continuous streams of accelerometer data. The Eurasian badger (Meles meles) was selected as a study species as it is widely dispersed across much of Europe and Asia [11], attracts much attention as a reservoir of disease (particularly tuberculosis (TB) [12]) and serves as a model quadruped for mammalian studies in general [13].

Materials and methods

Accelerometer and collar

A tri-axial (orthogonal) accelerometer (X8M-3; Gulf Coast Data Concepts, LLC, Waveland, MS, USA; recording range: ±8 g; resolution: 0.001 g) was encased in epoxy resin (Evo-Stik, Bostik Limited, Stafford, UK) and configured to record at a sample rate of 25 Hz. The device was attached to an adjustable nylon clip-on dog collar (Ancol Pet Products Limited, Walsall, UK; circumference: 20 to 30 cm) with a layer of waterproof self-amalgamating tape (ultratape™; Bruce Douglas Marketing, Dundee, UK), which was additionally fastened with three cable ties and then covered with tesa® tape (No. 4651; tesa AG, Hamburg, Germany). To ensure that the collar remained in position on the ventral side of the neck, it was fitted with an additional weight consisting of four metal nuts. The total mass in air of the fitted collar was 146 g. The y-axis of the accelerometer was positioned to correspond with ‘surge’ motion (front-back acceleration), the x-axis with ‘sway’ (left-right acceleration) and the z-axis with ‘heave’ (up-down acceleration) (Figure 1).

Figure 1

Tri-axial accelerometer axes in relation to the study animal. The three orthogonal accelerometer axes in relation to the body of the animal. The badger is shown wearing a collar with accelerometer attached and positioned ventrally on the neck.

Subject and enclosure

The study took place in January 2013 at a wildlife rehabilitation centre (Secret World Wildlife Rescue) in Somerset, UK (latitude: 51.205803, longitude: −2.963007), under licence from Natural England. The subject was an adult male badger (body mass: 12 kg). As the animal was tame, the collar was attached without the need for anaesthesia and the animal could be observed from close range (<2 m). The badger was then allowed to roam freely within a fenced 20 × 20 m outdoor enclosure consisting of flat grassland. Observations were also made within an indoor artificial sett through an observation window.

Behaviours

In total, four discrete behaviours were studied, three of which were recorded in the outdoor enclosure: ‘walking’ (head up and travelling forwards), ‘trotting’ (fast leg movements with rump swaying from side to side) and ‘snuffling’, the primary mode of foraging (i.e. standing, head lowered with nose to the ground). A third locomotory gait that has been documented in badgers, ‘galloping’ [14], was not observed. A continuous sequence of movements was recorded on a video camera (Samsung Galaxy Tab 10.1; Samsung Electronics Co, Ltd, Suwon, South Korea) for a total of 35 min. At no point did the collar rotate. Therefore, specific behaviours were recorded consistently in the three axes. The final behaviour, ‘resting’ (lying down), was recorded whilst the animal was in the artificial sett over a two-day period, during which a total of 10 min were filmed. Video footage was time-stamped to ensure synchronisation with the accelerometer.

Analysis

The collar was removed from the animal and the accelerometer data were downloaded. All analyses were performed using R version 3.0.2 [15]. To ground-truth the data, the records were annotated manually with behavioural labels by examining the video footage frame by frame (30 FPS). In total, 25 samples of each of the four behaviours, each 2 s long (sufficient time to capture repetitive cycles of movement), were selected at random using a custom R function, and the corresponding accelerometer data were isolated (Additional file 1a and b). From each segment, we extracted features that have proved useful in the determination of behaviours in other species [16, 17], as described below.

Static and dynamic acceleration

Acceleration can be described using a number of components [18]. ‘Static’ acceleration is defined as the acceleration due to the Earth’s gravitational field and provides information on posture. ‘Dynamic’ acceleration results from movement of the accelerometer and is superimposed on the static acceleration to give the raw data. Static acceleration was approximated by taking a 2-s running mean of the raw acceleration data (Additional file 2a). The dynamic acceleration was derived by subtracting the static acceleration from the raw acceleration (Additional file 2b). As an indicator of posture, the mean static acceleration was calculated for each axis. The mean absolute dynamic acceleration was calculated to gauge the magnitude of acceleration resulting from movement in each axis.
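The separation described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, the centred window and the shrinking of the window at the ends of the record are all assumptions made for illustration. Only the 25 Hz sample rate and the 2-s running mean come from the text.

```python
from statistics import mean

SAMPLE_RATE_HZ = 25   # logger sample rate stated in the text
WINDOW_S = 2          # running-mean length stated in the text


def running_mean(raw, width):
    """Centred running mean of roughly `width` points.

    Assumption: the window is centred (width 50 gives 51 points) and
    simply shrinks near the start and end of the record.
    """
    half = width // 2
    out = []
    for i in range(len(raw)):
        lo = max(0, i - half)
        hi = min(len(raw), i + half + 1)
        out.append(mean(raw[lo:hi]))
    return out


def split_acceleration(raw, sample_rate=SAMPLE_RATE_HZ, window_s=WINDOW_S):
    """Return (static, dynamic) acceleration for one axis, in g."""
    width = int(sample_rate * window_s)
    static = running_mean(raw, width)
    # Dynamic acceleration is the raw signal minus the static component.
    dynamic = [r - s for r, s in zip(raw, static)]
    return static, dynamic


def posture_and_activity(raw):
    """Mean static and mean absolute dynamic acceleration for one axis."""
    static, dynamic = split_acceleration(raw)
    return mean(static), mean(abs(d) for d in dynamic)
```

For a motionless logger reading a constant 0.5 g on one axis, this sketch returns a mean static acceleration of 0.5 g and a mean absolute dynamic acceleration of 0 g.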

Spectral analysis

Spectral analysis was used to determine the periodicity of the various behaviours. The dynamic acceleration data for each of the three axes were transformed from the time to the frequency domain by computing the discrete Fourier transform. This enabled the acceleration data to be categorised in terms of the frequency (Hz) and amplitude of the signal components. Raw acceleration data were first centred around zero to remove the constant component (i.e. static acceleration). A Hanning window (50 points long) was applied to reduce spectral leakage. The fast Fourier transform algorithm was then used to compute the frequency spectra from which the maximum amplitude (normalised by a factor of 1/N where N is the number of points), standard deviation of the amplitude and absolute period of movement (peak frequency, precision: 0.5 Hz) were calculated (Additional file 1c).
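A rough Python sketch of this feature extraction follows. It is an illustration only: the function names are hypothetical, a naive discrete Fourier transform stands in for the fast Fourier transform (both give the same spectrum), and the symmetric form of the Hanning window, the use of the sample standard deviation and the exclusion of the 0 Hz bin from the peak search are assumptions. With 50 points sampled at 25 Hz, the spacing between frequency bins is 25/50 = 0.5 Hz, which matches the stated precision.

```python
import cmath
import math

SAMPLE_RATE_HZ = 25  # from the text; a 2-s window is then 50 points


def hann(n):
    """Symmetric Hann (Hanning) window of length n (assumed form)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectral_features(segment, sample_rate=SAMPLE_RATE_HZ):
    """Return (max amplitude, amplitude SD, peak frequency in Hz) for one axis."""
    n = len(segment)
    centre = sum(segment) / n
    # Centre on zero to remove the constant component, then taper.
    windowed = [(x - centre) * w for x, w in zip(segment, hann(n))]
    # Naive O(n^2) DFT over the non-negative frequencies.
    amps = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(windowed))
        amps.append(abs(coeff) / n)  # 1/N normalisation, as in the text
    mean_amp = sum(amps) / len(amps)
    sd = math.sqrt(sum((a - mean_amp) ** 2 for a in amps) / (len(amps) - 1))
    # Skip the 0 Hz bin when looking for the dominant frequency.
    peak_bin = max(range(1, len(amps)), key=amps.__getitem__)
    return amps[peak_bin], sd, peak_bin * sample_rate / n
```

Applied to a synthetic 2 Hz sine wave of 2 s duration, the sketch reports a peak frequency of 2 Hz. Because of the taper and the normalisation, the reported amplitude is smaller than the amplitude of the sine wave itself.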

Behaviour classification

To establish whether the four target behaviours could be distinguished, we used the supervised non-parametric k-NN machine learning algorithm, in which an object is classified based on a majority vote of its k nearest (in Euclidean distance) neighbours. Principal component analysis was used to reduce the dimensionality of the 100 isolated behavioural sequences. Principal components with eigenvalues greater than 1.0 were retained, in line with the Kaiser criterion [19]. The retained principal component scores were used in lieu of the 15 original variables as the training data set of the k-NN classifier (k = 3) with leave-one-out cross-validation. Subsequently, a decision tree (Figure 2) was constructed (e.g. [9, 17]) based on the descriptive statistics derived for each of the behaviours (Additional file 3). These rules were implemented in custom-written software designed to scan and identify animal behaviours from streams of accelerometer data. Features were extracted within a 2-s overlapping sliding window (e.g. [20]) and threshold values were refined through iterative testing on the randomly selected sequences (Figure 3). This software was used to process the 45-min continuous sequence of data from both the outdoor enclosure and the artificial sett (see Additional file 4 for video output).
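The k-NN step with leave-one-out cross-validation can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the principal component analysis is omitted, the function names are hypothetical, and the input is assumed to be a list of (feature vector, label) pairs. Tied votes fall to the label encountered first among the nearest neighbours, which is also an assumption.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote of the k nearest neighbours (Euclidean distance).

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, the label seen first (i.e. the nearer neighbour) wins.
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k=3):
    """Classify each sample from all the others; return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        correct += knn_predict(rest, features, k) == label
    return correct / len(samples)
```

In the study's setting, each feature vector would hold the retained principal component scores for one of the 100 isolated sequences and each label would be one of the four behaviours.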

Figure 2

Binary decision tree used to classify behaviour of the Eurasian badger. The logic of this decision tree was implemented in custom software designed to automate the classification of animal behaviours from accelerometer data within a sliding window. g, gravitational acceleration; max., maximum.

Figure 3

Eurasian badger (Meles meles) accelerometer traces with sliding-window feature extraction and behavioural classification. For each axis, the blue section represents a 2-s (50 points long) sliding window (in steps of 0.04 s, 98% overlap), from which features were extracted. The simultaneous decision tree classification for a given set of features is displayed below (see Additional file 4 for video output). The 100 randomly selected behavioural sequences (Additional file 3) were flagged in the continuous stream of accelerometer data as a training data set. Threshold values were then refined through iterative testing on these sequences to obtain the best overall fit. Finally, the selected tree was used to classify the remainder of the data set. g, gravitational acceleration; max., maximum.

Additional file 4: Eurasian badger (Meles meles) accelerometer traces with real-time feature extraction and behavioural classification. A 25-s video clip showing real-time feature extraction within a 2-s overlapping sliding window. The video frame rate (FPS) and accelerometer sample rate (Hz) both equal 25. The simultaneous decision tree classification is also displayed. (To guarantee playback, download and view with the latest version of VLC media player [30].) A streaming version can be viewed at: http://youtu.be/WGYWoID1lUk. (MP4 15 MB)

Results

Descriptive statistics and decision tree

Low absolute dynamic acceleration was indicative of resting behaviour (surge: x̄ = 0.011 g, s = 0.002; sway: x̄ = 0.011 g, s = 0.001; heave: x̄ = 0.011 g, s = 0.001). Trotting had a distinctively high degree of periodicity (maximum amplitude: x̄ = 0.966 g, s = 0.235) and high absolute dynamic acceleration in the heave axis (x̄ = 0.792 g, s = 0.153). Relatively high static surge acceleration (x̄ = 0.539 g, s = 0.141) was the most distinguishing feature of snuffling behaviour. Walking exhibited low to intermediate periodicity (maximum amplitude: x̄ = 0.378 g, s = 0.186) and intermediate levels of absolute dynamic acceleration in the heave axis (x̄ = 0.336 g, s = 0.112). See Additional files 1, 2 and 3 for more details. The root node of the decision tree (Figure 2) first evaluated the magnitude of absolute dynamic acceleration in the three axes. If these values were less than or equal to the threshold (0.03 g), the behaviour was classified as resting. The next internal node examined the static surge acceleration to consider the position of the badger’s head in relation to the ground. If the static surge threshold (0.345 g) was met or exceeded, the behaviour was classified as snuffling. If the value was less than the snuffling threshold, a subsequent internal node examined the degree of periodicity in the heave axis, terminating as either trotting (if the maximum amplitude was greater than or equal to 0.664 g) or walking (less than 0.664 g).
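The decision rules just described translate directly into a short Python sketch. The thresholds are those given in the text, but the function and parameter names are hypothetical, and the reading that all three axes must fall at or below 0.03 g for a window to count as resting is an assumption. A helper that enumerates window positions is included; a 50-point window advanced one sample at a time corresponds to the 98% overlap at 25 Hz described for Figure 3.

```python
def classify_window(dyn_surge, dyn_sway, dyn_heave, static_surge, heave_max_amp):
    """Classify one window from its features (all values in g).

    dyn_*         : mean absolute dynamic acceleration per axis
    static_surge  : mean static acceleration in the surge axis
    heave_max_amp : maximum spectral amplitude in the heave axis
    """
    # Root node: very little movement in every axis (assumed to mean all three).
    if max(dyn_surge, dyn_sway, dyn_heave) <= 0.03:
        return "resting"
    # Head lowered towards the ground raises the static surge acceleration.
    if static_surge >= 0.345:
        return "snuffling"
    # Strong periodicity in the heave axis separates trotting from walking.
    if heave_max_amp >= 0.664:
        return "trotting"
    return "walking"


def sliding_windows(n_samples, width=50, step=1):
    """Yield (start, stop) sample indices of each overlapping window."""
    for start in range(0, n_samples - width + 1, step):
        yield start, start + width
```

The order of the tests matters: a window is only assessed for periodicity once resting and snuffling have been ruled out, which mirrors the structure of the tree in Figure 2.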

Classification accuracy

The first three components from the principal component analysis had eigenvalues greater than 1.0 and were retained. These components accounted for 80.2% of the total variance. Overall, the accuracy of the k-NN algorithm was 92% (Cohen’s kappa = 0.89, average proportion of votes = 0.93). Resting behaviour was the most readily identifiable with an accuracy of 100% (average proportion of votes = 1.0), followed by trotting (96%, average proportion of votes = 0.91), snuffling (88%, average proportion of votes = 0.95) and, lastly, walking (84%, average proportion of votes = 0.87) (Table 1). A total of 21 min from the continuous sequence were manually annotated as belonging to one of the four behaviours of interest (excluding training data). For this period, the decision tree matched the manual annotation 87.1% of the time overall, or 83.2% when the behaviours were weighted equally. As with the k-NN results, resting had the highest classification accuracy (99.4%), followed by trotting (78.7%), snuffling (77.5%) and walking (77.4%) (Table 2).
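As a consistency check, the summary figures above can be recovered from the per-behaviour values. The calculation below assumes the standard definition of Cohen's kappa and takes "weighted equally" to mean the unweighted mean of the four per-behaviour accuracies; the symbols p_o, p_e and m_i (the number of sequences assigned to behaviour i) are notation introduced here, not the authors'. With 25 sequences per behaviour, the stated accuracies correspond to 25, 24, 22 and 21 correct classifications.

```latex
\begin{align*}
p_o &= \frac{25 + 24 + 22 + 21}{100} = 0.92, \\
p_e &= \sum_{i=1}^{4} \frac{25}{100} \cdot \frac{m_i}{100}
     = \frac{25}{100} \cdot \frac{\sum_i m_i}{100} = 0.25, \\
\kappa &= \frac{p_o - p_e}{1 - p_e} = \frac{0.92 - 0.25}{0.75} \approx 0.89, \\
\text{equally weighted match} &= \frac{99.4 + 78.7 + 77.5 + 77.4}{4} = 83.25\%.
\end{align*}
```

Because every behaviour contributes the same number of sequences, the expected agreement p_e does not depend on how the misclassified sequences are distributed. The equally weighted figure agrees with the reported 83.2% to within rounding of the per-behaviour values.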

Table 1 Confusion matrix for k-nearest neighbour algorithm with leave-one-out cross-validation output
Table 2 Confusion matrix of decision tree output with time engaged in each behaviour

Discussion

The ability to identify behaviours retrospectively from animal-borne data loggers is an important goal in field ecology [21, 22]. Our use of the k-NN algorithm and an automated decision tree applied to accelerometer data shows how this can be achieved with a model mammalian quadruped, the Eurasian badger. Using features that relate clearly to animal movement and posture, both methods demonstrated variable (depending on behaviour), but high, classification accuracy. The k-NN results indicate that the behaviours could be distinguished from isolated sequences of accelerometer data and, as such, lend themselves to the creation of a database of behavioural information to help classify future traces. Using a sliding window and decision tree classification approach, we showed that behaviours could also be accurately identified under more natural conditions from continuous streams of accelerometer data containing multiple transitions between behaviours.

In terms of the misclassification rate, variable walking speed and hence periodicity may explain some of the observed confusion between walking and trotting. While moving forwards, the animal occasionally lowered its head making the distinction between walking and snuffling less clear. Optimisation of window length and the extraction of additional features could further improve classification accuracy [23]. It is also worth noting the extensibility of the feature extraction methods detailed here as they may be used in conjunction with a variety of other classification algorithms (such as those explored by Nathan et al. [8]).

In future studies, these methods will enable the automated generation of detailed behaviour-time budgets from large quantities of accelerometer data collected on wild animals. Whilst the figures presented here may serve as an initial benchmark for badgers, further work should focus on the characterisation of a wider repertoire of behaviours occurring over a range of terrain, as both substrate and gradient can affect the mechanical work requirements of terrestrial locomotion, which may be reflected in body acceleration [24, 25]. Cross-validation between animals of the same species could also be performed to confirm the implication that accelerometer data from one individual can accurately predict the behaviour of another. Promisingly, other recent validation studies suggest this may be the case both at intra-specific [17] and, to varying degrees of success depending on morphology, inter-specific levels [26]. Specifically for badgers, accelerometers could be used to examine the differential occurrence of behaviours between TB-infected and uninfected individuals, with the potential to expand on recent studies [27–29]. Such a high resolution of behavioural information could prove invaluable in future badger biosecurity work.

Availability of supporting data

The analysis software is under continued development and will be made available at: https://github.com/dmcclune/accelerometeR.