Background

The development of animal-attached devices that provide data on animal movement, behaviour or physiology (without the need to directly observe the animal) has proved a powerful way to quantify animal behaviour [1, 2]. In particular, three-dimensional accelerometers have been used to reconstruct animal behaviour [1, 3]. Accelerometers have been used most widely in studies of marine mammals and birds [3, 4], but recent advances in bio-logging technologies have made devices smaller, cheaper and longer-lasting, drawing interest from researchers working with a wider diversity of species [5]. Behaviours identified from acceleration data can range from simple active–inactive behaviour [6, 7] to the dynamics of prey capture [8] or even the classification of ‘internal state’ [9].

Bio-logging techniques are seldom used on non-human primates, probably due to the long tradition of direct observation by researchers in the field [10, 11, 12]. However, bio-logging can provide information on elusive or out-of-sight behaviours that are difficult to record [13, 14] and reduce potentially negative outcomes associated with observer presence [15], or habituation to observation [16, 17]. To date, only a handful of studies have used accelerometer data to infer behaviours in primates, but these analyses have been limited to broad levels of activity, rather than specific behavioural states (rhesus monkeys, Macaca mulatta [18], vervet monkeys, Chlorocebus pygerythrus [19] and owl monkeys, Aotus azarai [20]). To our knowledge, the only study that has used accelerometers to identify specific behaviour was undertaken by Sellers and Crompton [21], who successfully identified locomotion events in captive red-ruffed lemurs (Varecia variegata rubra). Therefore, unlike other terrestrial species [22–25], no acceleration ethogram (a catalogue of different acceleration footprints produced by different behaviours of an animal) exists for any non-human primate.

An acceleration ethogram would be particularly useful to collect fine-scale behavioural data with high temporal resolution. Among many applications, such methods could be used to document situations where primates are in conflict with people in species ranging from chimpanzees (Pan troglodytes) [26] to macaques (Macaca mulatta) [27], allowing us to quantify the occurrence of such events and their spatial–temporal dynamics. One of the most high-profile non-human primate–human conflicts occurs between people and baboons in the Cape Peninsula, South Africa, with baboons raiding bins and properties or taking food directly from people daily [28]. We are particularly interested in using acceleration data to document baboons’ behaviour in this environment to understand baboons’ behavioural responses to anthropogenic change, but in order to complete this latter goal (not developed in the present work), we first need to define a reliable method to assess behaviour through acceleration.

A key challenge that is common to all studies involving the use of accelerometer data lies in the analysis of the data. This is particularly pertinent for datasets that extend over weeks or even months, which are typically extremely large. To infer behaviour from acceleration data, researchers manually annotate the signal by expert interpretation [3], or ‘label’ behaviours in the acceleration signals, ideally using time-matched behavioural observation, to teach machine learning algorithms [29–31]. The broad approaches can be applied across taxa, although the specific selection of variables is likely to vary with and reflect characteristic movement modes, behavioural categories and habitat types of a particular species or population.

Here, we aim to describe a reliable ‘end to end’ process to quantify major behavioural states from tri-axial acceleration, applied here to baboons, but potentially transferable to other primates or contexts. We equipped n = 10 adult male baboons in the Cape Peninsula, South Africa, with tri-axial accelerometers and used video footage of the collared baboons ranging in their natural environment to generate a labelled dataset. We then used random forest models [32] to match behaviour and acceleration in the dataset with a focus on the identification of locomotion gaits and foraging behaviour [12]. Finally, we compared the model predictions to our observations to test the model’s accuracy and validate our procedure.

Methods

The general workflow of the methods is summarised in Fig. 1. All data processing and analyses were conducted in R (R version 3.2.2, R Core Team (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.), and all code used is provided in Additional file 1 (Computation of acceleration variables for behavioural identification) [22, 30].

Fig. 1 General workflow. Process for identification of behaviours from accelerometer data in a wild social primate

Study site and subjects

We studied the ‘Constantia’ baboon troop that ranges in a varied landscape at the edge of the City of Cape Town (S −34.0349, E 18.4156) (for more details see [33]) for 30 days from mid-May to mid-June 2015. The troop comprised 13 adult males, 25 adult females, 4 subadult males and approximately 30 juveniles of both sexes.

Acceleration data (Fig. 1a)

Ten adult males were fitted with SHOAL group in-house constructed collars (F2HKv2 collars, see Additional file 2, Baboon collar development). Each collar contained a tri-axial accelerometer (‘Daily Diary’ sensor [31]) recording acceleration in each axis at 40 Hz, which allows for the study of behaviours of most terrestrial animals whose fastest movements last between 0.5 and 1 s. Baboons were cage-trapped in accordance with the local ‘baboon management team’-approved protocol before being sedated by a certified veterinary surgeon and fitted with the collar. Collars weighed less than 3% of the body mass of the baboons and were approved for use by Swansea University Ethics Committee (Swansea University IP-1314-5). Of the 10 baboons fitted with collars, one dispersed before we were able to collect video data (see below) and so our sample is based on n = 9 individuals.

Video data (Fig. 1b)

Baboons were habituated to close (≤10 m) human observation and could thus be followed on foot by one or two observers without affecting their behaviour. Collar equipped individuals were video recorded using an AEE SD100 camera (PNJ SARL, Paris, France) for 15.3 h in total with a mean ± standard deviation of 1.7 ± 0.96 h per individual.

Video processing (Fig. 1c)

Footage was time-stamped to allow synchronisation with the accelerometer, and the signal was annotated using Framework4 [31]. We labelled behaviours at time steps of one second, relevant for most behaviours (mean duration of one behavioural bout (±SD) = 33 s ± 62 s, median = 12 s) [22, 34], leading to a sample size of 33,619 s. This created a dataset with n = 18 labelled behaviours (Tables 1, 2) for a total of 9.3 h. All rare behaviours with less than 100 s of observations (representing in total 7.3% of their time budget) were discarded from further analysis, bringing the labelled sample down to 33,387 s, i.e. 9.2 h (on average 1.2 ± 1.3 h (SD) per behaviour and on average 1.0 ± 0.6 h (SD) per individual, Table 2).
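The filtering of rare behaviours described above can be sketched as follows. This is an illustrative Python sketch only (the authors' analyses were run in R); the function name `drop_rare_behaviours` and the list-of-labels input are assumptions, with one label per second of annotated footage.

```python
from collections import Counter

def drop_rare_behaviours(labels, min_seconds=100):
    """Keep only the seconds whose behaviour was observed for at least
    `min_seconds` seconds in total (one label = one second of footage)."""
    counts = Counter(labels)
    return [label for label in labels if counts[label] >= min_seconds]

# Hypothetical toy data: 150 s of walking and 40 s of a rare behaviour.
labels = ["walking"] * 150 + ["drinking"] * 40
kept = drop_rare_behaviours(labels)
```

Here the 40 s of the rare behaviour fall below the 100 s threshold and are removed, leaving the 150 s of walking.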

Table 1 Ethogram of baboon behaviours
Table 2 Observed baboon behaviours and sample sizes

Analyses of acceleration data (Fig. 1d)

There are essentially two main types of variables that can be derived from tri-axial acceleration data that are relevant to the identification of behaviour. These are static acceleration, which is dependent on gravity and describes the posture of the animal, and dynamic body acceleration, which reflects the body movement of the animal. These variables can be measured in each of the three-dimensional axes (with X for ‘surge’, Y for ‘sway’ and Z for ‘heave’; Fig. 2). Data from the three axes can also be combined to give a general index of body motion.

Fig. 2 Schematic of baboon with a collar and acceleration data example. a Schematic of a male baboon wearing the GPS/acceleration collar. The three axes measured by the accelerometers are indicated by the arrows. b Example of labelled acceleration signal from three axes. Sections are coloured (and labelled) according to the observed behaviours (upper row) and predicted behaviours (lower row)

In order to match our behavioural sampling frequency (1 Hz) and identify behaviours at this frequency, we computed mean values over one second for a total of 25 variables that describe both static (Fig. 1b) and dynamic (Fig. 1c) acceleration data across our individuals. The list that follows summarises each of these variables, which are numbered 1–25 in round parentheses: (1–3) tri-axial static acceleration [1]; (4–5) pitch and roll [1]; (6) vectorial dynamic body acceleration (VeDBA); (7) smoothed vectorial dynamic body acceleration (VeDBAs) [35, 36]; (8–10) tri-axial partial dynamic body acceleration (PDBA) [1]; (11–13) the tri-axial PDBA-to-VeDBA ratio. In addition to these descriptive statistics, we processed the dynamic part of the acceleration further by computing its (14–16) tri-axial power spectrum density (PSD); (17–19) maximum frequencies associated with the tri-axial PSDs; (20–22) the second maximum frequencies associated with the tri-axial PSDs; (23–25) the associated frequency for each axis. We provide a full description for each of these variables (1–25), in turn, below.

(1–3) The static (st) component of acceleration for each axis (stX, stY and stZ) is directly influenced by the orientation of the logger with respect to gravity and is therefore indicative of the posture of the animal [1, 37]. The tri-axial static acceleration was calculated from the raw acceleration with a running mean of 3 s [38]. From the resulting 3D static acceleration, the angles (4) pitch and (5) roll were calculated, converting the 3D orientation towards gravity (measured in g) to two angles (in degrees), using the plane and upright position as reference [1]. Pitch was calculated as the arcsine of stX and roll as the arcsine of stY [1].
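The computation of static acceleration, pitch and roll can be sketched as below. This is a minimal Python illustration, not the authors' R code (see Additional file 1). It assumes a centred running mean whose window shrinks at the ends of the series, and it clamps values to [−1, 1] before taking the arcsine; the function names and both of those choices are assumptions.

```python
import math

def static_acceleration(raw, rate_hz=40, window_s=3):
    """Centred running mean of one raw acceleration axis (in g).
    The window spans about `window_s` seconds (2 * half + 1 samples)
    and shrinks at the edges of the series (an assumption)."""
    half = (rate_hz * window_s) // 2
    smoothed = []
    for i in range(len(raw)):
        lo = max(0, i - half)
        hi = min(len(raw), i + half + 1)
        smoothed.append(sum(raw[lo:hi]) / (hi - lo))
    return smoothed

def angle_deg(static_g):
    """Arcsine of a static acceleration value (in g), in degrees.
    Applied to stX it gives pitch; applied to stY it gives roll."""
    clamped = max(-1.0, min(1.0, static_g))  # guard against |value| > 1 g
    return math.degrees(math.asin(clamped))

# Hypothetical example: a constant 0.5 g on the surge (X) axis.
st_x = static_acceleration([0.5] * 400)
pitch = angle_deg(st_x[200])  # about 30 degrees
```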

Tri-axial dynamic body acceleration (DBA), which represents overall body movement [1, 35], was calculated as the difference between raw and static acceleration from each axis. We note that centripetal acceleration can also affect the acceleration signal (e.g. when an animal ‘pulls g’ by cornering); however, this is unlikely to be a main factor affecting the acceleration signal in baboons. (6) The vectorial dynamic body acceleration (VeDBA) was computed using the dynamic components of the signal to assess the ‘activity level’ of the individual, bringing the three axes (x, y, z) together as given by Eq. 1.

$$VeDBA = \sqrt {X^{2} + Y^{2} + Z^{2} }$$
(1)

To allow for a general estimation of activity, reducing the impact of short high-amplitude bursts of activity, we (7) smoothed the VeDBA using a running mean of 3 s, giving the ‘smoothed VeDBA’. (8–10) Partial dynamic body acceleration (PDBA) [1] was also calculated for each axis in order to describe the amplitude of the movement, as the absolute value of the dynamic acceleration on that axis. (11–13) The PDBA-to-VeDBA ratio provided an estimate of the contribution of each axis to the VeDBA, calculated as the ratio of PDBA to VeDBA for each of the three axes.
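For a single sample, the dynamic variables described above can be sketched as follows. This is an illustrative Python sketch rather than the authors' R implementation; the function name `dynamic_metrics`, the tuple-based inputs and the choice to return a ratio of 0 when VeDBA is 0 are assumptions.

```python
import math

def dynamic_metrics(raw_xyz, static_xyz):
    """Return (VeDBA, PDBA per axis, PDBA-to-VeDBA ratio per axis)
    for one sample; inputs are (x, y, z) tuples in g."""
    # Dynamic body acceleration: raw minus static, axis by axis.
    dynamic = tuple(r - s for r, s in zip(raw_xyz, static_xyz))
    # Eq. 1: vectorial sum of the three dynamic components.
    vedba = math.sqrt(sum(d * d for d in dynamic))
    # PDBA: absolute dynamic acceleration on each axis.
    pdba = tuple(abs(d) for d in dynamic)
    # Ratio: contribution of each axis (0 when there is no movement).
    ratio = tuple(p / vedba if vedba > 0 else 0.0 for p in pdba)
    return vedba, pdba, ratio

# Hypothetical sample: animal upright (1 g on Z) with some surge and sway.
vedba, pdba, ratio = dynamic_metrics((0.3, -0.4, 1.0), (0.0, 0.0, 1.0))
```

In this toy sample VeDBA is about 0.5 g, with the X and Y axes contributing ratios of about 0.6 and 0.8, respectively.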

To characterise the oscillations in the dynamic body acceleration for each axis, we computed (14–16) power spectrum densities (PSDs) and their associated frequencies using Fourier analysis [21, 39]. Fast Fourier analysis decomposes the signal into frequencies and amplitudes. It can therefore be used to indicate at which frequency the signal varies the most, providing an overview of large body movements and ignoring signal associated with small body movements. For each second, we defined (17–19) the maximum PSD and (20–22) the second maximum PSD together with their (23–25) associated frequencies, over a window of 3 s (the focal second plus 1 s before and 1 s after, in order to sample enough oscillations to define a frequency even for slow cyclic behaviours such as walking).
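The idea of extracting the two strongest spectral components from a 3 s window can be illustrated with a plain discrete Fourier transform. The sketch below is written in Python for illustration and is not the authors' R code. The PSD normalisation (squared magnitude divided by the number of samples), the removal of the window mean and the use of the two largest frequency bins rather than true local peaks are all assumptions.

```python
import cmath
import math

def power_spectrum(signal, rate_hz):
    """Return a list of (frequency in Hz, power) pairs for the positive
    frequencies of `signal`, using a direct DFT on the mean-centred data."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centred))
        spectrum.append((k * rate_hz / n, abs(coeff) ** 2 / n))
    return spectrum

def two_largest_bins(spectrum):
    """Return the (frequency, power) pairs with the largest and the
    second largest power."""
    ranked = sorted(spectrum, key=lambda pair: pair[1], reverse=True)
    return ranked[0], ranked[1]

# Hypothetical 3 s window at 40 Hz: a 2 Hz stride-like oscillation plus
# a weaker 5 Hz component.
window = [math.sin(2 * math.pi * 2 * t / 40)
          + 0.3 * math.sin(2 * math.pi * 5 * t / 40) for t in range(120)]
(f1, p1), (f2, p2) = two_largest_bins(power_spectrum(window, 40))
```

With this synthetic window the strongest component sits at 2 Hz and the second strongest at 5 Hz. A direct DFT is used here only to keep the sketch dependency-free; a fast Fourier transform gives the same spectrum more efficiently.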

Time matching (Fig. 1e) and building datasets (Fig. 1f)

Acceleration variables were time-matched with our video-based behavioural data to obtain a labelled dataset. Of this dataset, 60% was used as a training dataset (20,111 s, 5.6 h) and 40% as a validation dataset which we later used to test the success of our model predictions (13,276 s, 3.7 h, Fig. 1e).
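A 60/40 partition of a labelled dataset can be sketched as below. The text does not state how seconds were assigned to the two sets, so the simple random shuffle used here is an assumption, as are the function name `split_dataset` and the fixed seed; the sketch is in Python for illustration, whereas the authors worked in R.

```python
import random

def split_dataset(rows, train_fraction=0.6, seed=1):
    """Randomly partition `rows` into a training and a validation list.
    Assumes a simple random split by row (one row = one labelled second)."""
    rng = random.Random(seed)   # fixed seed for a reproducible split
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical toy dataset of 100 labelled seconds.
training, validation = split_dataset(list(range(100)))
```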

Model fitting via random forest models (Fig. 1h; Fig. S1)

To be able to assign behaviours according to the 25 descriptive variables (see above), we used random forests. Random forests are based on classification trees and, in summary, build many trees using a random subset of the data each time, and a random subspace of variables for each classification step. Thanks to the large number of iterations (here 500) and two ‘layers of randomness’ [40], this model has the advantage of being more powerful than classical classification trees, limiting the risk of overfitting and coping better with unbalanced datasets [40].

The first step of the random forest is to sample the initial training set randomly with replacement (such that one observation can be drawn many times), resulting in several bootstrapping sets with the same number of observations as the initial training set (Additional file 3: Fig. S1). Due to replacement, not all observations are represented in every bootstrapping set.
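The bootstrapping step can be sketched in a few lines. This is an illustrative Python sketch (the authors used R); the function name `bootstrap_sample` is hypothetical, and observations are represented simply by their row indices.

```python
import random

def bootstrap_sample(n_obs, rng):
    """Draw `n_obs` row indices with replacement. Returns the drawn
    indices and the sorted 'out-of-bag' indices that were never drawn."""
    drawn = [rng.randrange(n_obs) for _ in range(n_obs)]
    out_of_bag = sorted(set(range(n_obs)) - set(drawn))
    return drawn, out_of_bag

rng = random.Random(0)  # fixed seed so the sketch is reproducible
drawn, out_of_bag = bootstrap_sample(1000, rng)
```

For large samples, roughly 63% of the observations are expected to appear at least once in a given bootstrap set (1 − 1/e), which is why each tree sees a slightly different portion of the training data.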

From one of these artificial sets of data, the model builds one classification tree which aims to classify the full set of observations into different classes (here, behaviours) by building a set of hierarchical decision rules based on the given variables (Additional file 3: Fig. S1 [32]). At each node (a set of observations, represented by a circular graph where two branches split in Additional file 3: Fig. S1), the model will aim to split this set of observations into two smaller and ‘purer’ subsets, i.e. each subset contains fewer different classes (here, behaviours). A subset is considered pure when it contains only one behavioural class. This purity, or its absence, is quantified with the Gini impurity index (Eq. 2), which tends to zero when only one class is represented in a subset.

$$G = \mathop \sum \limits_{i = 1}^{n} p_{i} \left( {1 - p_{i} } \right)$$
(2)

where n is the number of behavioural classes and p_i is the proportion of each class in the set of observations. At each step of the classification, the model uses a random selection of the total variables available and tests each of them with different thresholds to define a rule that will minimise the Gini index in the two descendent subsets (Additional file 3: Fig. S1). This process continues until no more rules can be found to split the dataset into purer subsets. The local importance of a variable is calculated as the Gini index of the parent set of observations minus the Gini indexes of the two descendent subsets. We ranked the variables according to their overall importance and performed Kruskal–Wallis tests to compare the mean of the most discriminating variables according to the different behaviours.
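Eq. 2 and the resulting decrease in impurity at a split can be illustrated as follows. This Python sketch is not the authors' code. It weights each descendent subset by its share of the parent observations, which is the usual convention in random forest implementations but is an assumption with respect to the text above; the function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity (Eq. 2): sum over classes of p_i * (1 - p_i)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of
    its two descendent subsets (weighting is an assumed convention)."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# Hypothetical node with two behaviours, split perfectly by one rule.
parent = ["walking", "walking", "resting", "resting"]
decrease = gini_decrease(parent, ["walking", "walking"], ["resting", "resting"])
```

The mixed parent node has an impurity of 0.5 and both descendent subsets are pure, so this rule achieves the largest possible decrease (0.5) for a two-class node.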

When using random forests, the model will simply grow many trees, each built from a different random portion of the training set (60% of our initial dataset). Because the advantages of random forests come from the high number of iterations, we built 500 trees. We tested post hoc the minimum number of trees required to obtain the best classification and found that the best results were reached above 300 trees (Additional file 4: Fig. S2).

Model validation (Fig. 1g, i)

Once we built our model, we used it to predict the behaviour of our validation dataset. All analyses were conducted in the R environment (version 3.2.2) with the randomForest package [41]. Each time unit is therefore classified according to the 500 trees, each assigning one behavioural class to the time unit, resulting in 500 predictions (Additional file 3: Fig. S1). Then, the most frequent prediction across all trees was selected as the final prediction (Additional file 3: Fig. S1). We then compared the predicted behaviour with the observed behaviour and built a confusion matrix to assess the recall and the precision of the model [30] (Fig. 1h) as described in Eqs. 3 and 4:

$${\text{Precision}} = TP/\left( {TP + FP} \right)$$
(3)
$${\text{Recall}} = TP/\left( {TP + FN} \right)$$
(4)

where TP is true positive, TN true negative, FP false positive and FN false negative for each behaviour.
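The voting step and Eqs. 3 and 4 can be sketched as below. This is an illustrative Python sketch rather than the authors' R code; the function names are hypothetical, ties in the vote are broken by first occurrence (an assumption), and a metric is returned as NaN when its denominator is zero.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Final prediction for one time unit: the most frequent class among
    the per-tree predictions (ties go to the class seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]

def precision_recall(observed, predicted, behaviour):
    """Precision (Eq. 3) and recall (Eq. 4) for one behaviour."""
    pairs = list(zip(observed, predicted))
    tp = sum(o == behaviour and p == behaviour for o, p in pairs)
    fp = sum(o != behaviour and p == behaviour for o, p in pairs)
    fn = sum(o == behaviour and p != behaviour for o, p in pairs)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall

# Hypothetical validation data: five seconds of observed vs predicted labels.
observed = ["walking", "walking", "resting", "resting", "walking"]
predicted = ["walking", "resting", "resting", "resting", "walking"]
walk_precision, walk_recall = precision_recall(observed, predicted, "walking")
```

In this toy example every second predicted as walking was indeed walking (precision = 1), but one of the three observed walking seconds was missed (recall = 2/3), the same pattern of high precision and lower recall reported for some behaviours in the Results.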

Results

Acceleration ethogram (model fitting)

Of the variables calculated from our acceleration data, static acceleration on all axes, X, Y and Z (which provides information on the ‘posture’ of the baboon) was the most important for distinguishing behaviours, with stX, stY and stZ ranked 1st, 6th and 13th and pitch and roll ranked 2nd and 12th in our model (Fig. 3a). The static acceleration for the X axis (stX) differed between resting (sitting, median [1st and 3rd quartile]: 0.62 g [0.50 g–0.73 g], Additional file 5: Table S1) and behaviours in standing postures such as locomotion and foraging (median [1st and 3rd quartile]: 0.01 g [−0.29 g–0.23 g], Additional file 5: Table S1, Kruskal–Wallis Chi-squared = 20,264.87, df = 7, p value <0.001, Fig. 3b).

Fig. 3 Random forest model results. a Variable importance for the identification of baboon behaviour. Variables are ordered according to the mean decrease in Gini index (see “Methods” for more details). b Density histogram plots for major behaviours as a function of mean static acceleration, stX, which scored the highest mean decrease in the Gini index (i.e. was most important to classification of behaviours). c Precision and Recall for all identified behaviours

Power spectrum densities (PSDs) were the next most important class of variables, with four out of six of these measures ranked in the top ten. The PSD2 on the X axis and PSD1 on the Z axis were, as expected, highest for running behaviour (PSD1Z median [1st and 3rd quartile]: 0.5870 [0.2614–0.6792]; PSD2X median [1st and 3rd quartile]: 0.0405 [0.0220–0.0586], Additional file 5: Table S1), indicating high-amplitude movements happening at a regular frequency. Walking behaviour was represented by intermediate values for these variables (PSD1Z median [1st and 3rd quartile]: 0.0157 [0.0092–0.0268]; PSD2X median [1st and 3rd quartile]: 0.0025 [0.0043–0.0075], Additional file 5: Table S1), while foraging behaviour, with low-amplitude movements, was represented by lower values (PSD1Z median [1st and 3rd quartile]: 0.0008 [0.0004–0.0016]; PSD2X median [1st and 3rd quartile]: 0.0007 [0.0004–0.0016], Additional file 5: Table S1). Overall, we found significant differences between the three behaviours on these variables (Kruskal–Wallis Chi-squared = 22,773.87, df = 7, p value <0.001, Fig. 3). In contrast, the ratio and frequency measures did not contribute much to the model’s ability to classify behaviours (Fig. 3).

Model accuracy (validation procedure)

The random forest model reached an average precision of 88.3% (±8.5%) and a mean recall of 70.7% (±29.3%) across all behaviours (Fig. 3c; Table 3). The recognition (or extraction) of foraging, resting, running and walking shows a high precision and recall (>85%), while lying and grooming (both when focal is actor and receiver) have high precision (>85%) but lower recall (>60% for lying and grooming [receiver] and >20% for grooming [actor]); the missing instances being principally classified in other low-amplitude behaviours (Fig. 3c; Table 3). The extraction of standing behaviours was poor (recall: 38.9%, precision: 67.9%), and instances of standing that were misclassified tended to be labelled as foraging and resting.

Table 3 Confusion matrix

Discussion

We have successfully used acceleration data to identify six behaviours performed by adult male chacma baboons. These behaviours represented 93.3% of the time budget recorded from video observations, and together they constitute the first ethogram from acceleration data for a wild non-human primate. Behaviours relevant to raiding (foraging, running, walking and resting), which are important with respect to the study population [33], were successfully identified with good precision and recall (>85%). We discuss the variables calculated from our accelerometer data that contributed to identification of these behaviours in turn.

Static acceleration and the smoothed vectorial dynamic body acceleration (VeDBA) were among the most important variables and were found to be different between active and inactive behaviours. Differentiating between active and inactive behaviours is commonly done with variables such as VeDBA or VeDBAs [1, 7, 18, 24]. Interestingly, static acceleration metrics are not always included as discriminators in machine learning algorithms [30] and so our findings suggest that this could be an important factor for other primate researchers to include in their models.

The performance of our model varied when it came to the different inactive behaviours: while resting was extracted with a high recall and precision, standing behaviour was less accurately described. A standing posture is adopted within a range of other behaviours (during locomotion, or foraging, for example), and this may explain the difficulty in identifying resting or vigilance when standing, particularly when other activities are executed slowly. Differentiating non-active behaviours has proved difficult in other species too; for example, in vultures static acceleration was not useful for differentiating between different types of passive flight behaviour, such as gliding, thermal soaring or slope soaring [25]. This problem is therefore not unique to baboons and suggests there is an inherent problem in using acceleration data to classify behaviours which involve subtle movements, especially when these movements are performed in similar postures [25]. Nevertheless, the identification of a broad inactivity category is likely sufficient for most researchers and questions. For instance, the identification of inactivity could enable us to identify habitats that can be used as refuges. More generally, in the case of our baboon research in the Cape Peninsula, identifying inactivity before and after a raiding event, or short inactive pauses while travelling, could indicate periods of vigilance and can overall be used to explore how and when baboons adopt ‘sit and wait’ raiding (foraging) strategies [12]. Similarly, identification of inactivity can provide interesting insights into energy expenditure and recovery time in other systems [24, 19, 42].

Locomotion behaviours (walking and running) were the best identified by our model (92% of the misclassifications occurring in running involved walking: either running segments classified as walking or walking segments classified as running). This efficiency in the recognition of locomotion has been observed in other species and reflects a regular signal in the heave channel [1, 3, 23] and/or frequency of the general acceleration [1, 21, 43]. This result is also consistent with the importance of the 1st maximum power spectrum density peak in the Z channel (PSD1Z) describing the amplitude of the sinusoidal pattern during locomotion on the Z axis (Fig. 2). Because locomotion generates the patterns that we frequently describe using GPS data and mathematical models of animal movement [44–46], accurately distinguishing locomotion phases from sedentary phases via acceleration can allow the user to correct GPS errors and investigate movements which are happening between fixes. Combining GPS and acceleration data will therefore increase the reliability of both data streams that are at the basis of the burgeoning field of movement ecology [44]. In the context of our larger programme of research on baboons, the high precision and recall for locomotion will enable us to explore the dynamics of forays into urban areas by raiding baboons [12], such as the speed of the approach or the sinuosity of trajectories [46].

Our model was extremely successful at extracting foraging, despite the fact that a wide diversity of ‘types’ of foraging behaviours are exhibited by baboons [47]. Indeed, our model successfully extracted most foraging events (recall = 93.5%), which is particularly important for the study of a short-lived behaviour such as raiding [12]. Foraging in baboons is almost never performed in isolation from other behaviours, as it can take place while being stationary (sitting, or standing, i.e. stationary foraging [48]) or while walking (i.e. travel foraging [48]). Interestingly, the 2nd maximum power spectrum density peak in the X axis (PSD2X), the fourth most important variable, was important for quantifying a sinusoidal pattern in the secondary amplitude. As such, PSD2X was important for identifying behaviours of smaller amplitude, such as chewing (foraging), that co-occur with other activities of higher amplitude, for example walking (Fig. 2). We therefore suggest that this variable can be of interest for accelerometer users looking at the foraging ecology of primates or species sharing a similar foraging behaviour. Indeed, researchers have had success identifying foraging in other terrestrial species [23, 43, 49, 50], in birds using location clues, for example, from GPS or pressure sensors [8, 39, 51] and in sea mammals using, for example, mandible accelerometers [52, 53]. This suggests that ‘complex’ foraging behaviours in fact lend themselves to identification from acceleration (and other bio-logger) signals, and offers a useful avenue for further research.

While the main focus of our study was locomotion and foraging behaviours, we also identified grooming from our collar data. The maintenance of social affiliation by baboons is mostly mediated through grooming, especially for females [11, 12, 54]. As such, grooming has been studied in various contexts across primates [54], and it constitutes one of the most used metrics to build social networks [54, 55]. Grooming behaviour is traditionally identified by direct observation only and is therefore limited by the number of observers available to witness it and their ability to recognise individuals’ identity, so that only one or a few interactions can be recorded at a time. We were able to identify grooming with >60% recall and precision when the focal individual was the receiver and >20% when the focal individual was the actor. Further work would be needed to confirm this extraction since our model included grooming events from two baboons with few independent events, which could have led to overfitting in the model. Because adult males rarely if ever groom one another, it is likely that, by collaring females that spend a high proportion of their time grooming each other [56], grooming behaviour (active and passive) could be resolved with greater confidence (since the dyad would be stationary and grooming one another). To identify even a fraction of grooming events remotely using tracking collars could transform our ability to explore the spatial–temporal dynamics of social relationships in baboons (and other grooming species) [57]. In the future, grooming data identified through acceleration would afford researchers the opportunity to comprehensively investigate questions relating to ‘biological market theory’ [56, 58], or enhance our understanding of decision processes in movements and leadership [59, 60], for example.

Conclusion

Overall our study shows that the use of accelerometers can document foraging strategies and social behaviour in wild primates. Such methodology provides advantages in gathering data with limited direct observation [15, 16] and offers an alternative to habituation of wild primates [17]. We hope that researchers interested in primate behavioural ecology will be inspired by the ‘end to end’ process that we have described here. This paper offers a full protocol—from collar design and construction to the identification of behaviours from accelerometers—for any researcher working with a medium- to large-sized primate. We hope that researchers in the fields of both primatology and biotelemetry make the most of these exciting new opportunities.