In cooperation with our collaborators from biology, we used our concept implementation to analyse changes in animal behaviour, targeting two of their main research questions. First, we wanted to know whether pregnancy dates can be reliably predicted from the movement data of female cheetahs, using an automated analysis based on machine learning methods followed by interactive visual inspection. Second, we wanted to analyse the dynamics in the spatial tactics of male cheetahs. The idea is to integrate the visualisation of machine learning results into our visual analytics environment, so that our collaborators can analyse the results together with the trajectory data and the environment.
Pregnancy prediction
As a threatened species with a small and sparsely distributed population, cheetahs face extinction if the mortality rate exceeds the recruitment rate. An important tool for monitoring both birth events and cub survival is to predict the pregnancy of a female cheetah as early as possible, and to use the opportunity to document the birth and monitor cub survival. This is currently difficult to achieve in the field, because the GPS data from the collars are only downloaded every two to three weeks during flights with an aeroplane. It is possible to identify from the GPS data that a female has given birth by visually detecting a cluster representing the lair, but planning a field trip to verify the newborn cubs is time-consuming and the schedule can be tight. For this reason, visualising an early pregnancy indication or prediction in our visual analytics environment would help the field biologists to identify when to monitor an animal more closely and to start preparing the trip to the lair in advance.
During pregnancy, a change in movement dynamics and animal behaviour is expected, which might not be obvious in the GPS data or the distances travelled, and is therefore not easily detected by simple visual inspection or rule-based analysis. After conception, the female cheetah moves on her own for approximately three months until she gives birth. Before giving birth, she is likely to look for a safe hiding site for her cubs and thus changes her movements. Once the cubs are born, her movements change drastically. Because the cubs are not very mobile in the first two months and stay in the lair, the mother returns regularly to this place after hunting or resting further away, see Fig. 9. After that period, the cubs start to join the mother on her trips, but slow her down.
To aid biologists in the pregnancy prediction task, a machine learning pipeline on GPS data was set up and a model was pre-trained for further prediction and visualisation. As birth events are relatively rare, and GPS data are not available in a similar manner for all animals, the data of 10 animals were selected. In total, 806,593 data points were collected, of which 51,805 were labelled as pregnant based on prior knowledge from observation. Before machine learning, the data sets were pre-processed in Python, as described in the following. In general, the tag obtains a GPS position every 15 minutes. To save energy, the tag was scheduled such that if it does not record any movement for 30 minutes, for example because the animal sleeps, the scheduled GPS positions are skipped until the animal becomes active again (Brown et al. 2012). This results in discontinuities in the time points of the raw data during times of no movement. The resulting gaps in the GPS data are filled with the last available GPS position, which was recorded after arrival at the resting place during the first 30 minutes before the tag went inactive. The same procedure also fills locations missing due to bad satellite coverage, which in our study amount to less than one percent of the scheduled positions.
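The gap filling described above can be sketched as a forward fill onto the 15-minute schedule. This is only an illustration under stated assumptions, not the pre-processing code used in the study: the function name `fill_gaps`, the tuple layout `(timestamp, lat, lon)` and the example coordinates are hypothetical, and the fixes are assumed to be sorted by time.

```python
from datetime import datetime, timedelta


def fill_gaps(fixes, step=timedelta(minutes=15)):
    """Forward-fill skipped schedule slots with the last known position.

    `fixes` is a time-sorted list of (timestamp, lat, lon) tuples
    (hypothetical layout). Every slot that was skipped between two
    recorded fixes receives a copy of the last recorded position.
    """
    if not fixes:
        return []
    filled = [fixes[0]]
    for t, lat, lon in fixes[1:]:
        last_t, last_lat, last_lon = filled[-1]
        next_t = last_t + step
        # Repeat the last position for each scheduled slot that is missing.
        while next_t < t:
            filled.append((next_t, last_lat, last_lon))
            next_t += step
        filled.append((t, lat, lon))
    return filled


# Illustrative data: the tag skipped the 00:30 and 00:45 slots while resting.
raw = [
    (datetime(2012, 1, 1, 0, 0), -22.50, 17.00),
    (datetime(2012, 1, 1, 0, 15), -22.51, 17.01),
    (datetime(2012, 1, 1, 1, 0), -22.52, 17.02),
]
regular = fill_gaps(raw)
```

After filling, consecutive entries are exactly one schedule step apart wherever the recorded fixes lie on the schedule, so resting periods contribute zero-distance steps to any later distance computation.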
To make the prediction independent of the GPS locations, the distance between two consecutive GPS locations was used as the input. Since a change in dynamics over the course of pregnancy is expected, not only the current distance value at a time point was considered; the following 672 distance values were added to capture the movement dynamics over 7 days. To capture similar movement patterns among the animals, the data sets were cut to begin and end at 12 pm. To classify cheetahs as non-pregnant or pregnant, data points were labelled as pregnant starting six weeks before the birth event, since the change in behaviour during pregnancy is not expected from the conception date onwards, but rather as the pregnancy progresses. In addition, to capture the last days of pregnancy and the changed behaviour shortly after giving birth, the labelling of data points as pregnant was extended until the cubs were approximately one week old.
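The feature and label construction described above can be sketched as follows. The sketch rests on several assumptions that are not taken from the study: the haversine great-circle formula is used as the distance measure, positions are `(lat, lon)` tuples in degrees, and all function names are hypothetical. The alignment of the data sets to 12 pm is not shown.

```python
import math
from datetime import datetime, timedelta


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (assumed distance measure)."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def step_distances(positions):
    """Distance between each pair of consecutive (lat, lon) positions."""
    return [haversine_m(*a, *b) for a, b in zip(positions, positions[1:])]


def window_features(distances, n_following=672):
    """One feature row per time point: the current distance value plus the
    following `n_following` values (672 values at 15-minute steps = 7 days)."""
    width = n_following + 1
    return [distances[i:i + width] for i in range(len(distances) - width + 1)]


def is_pregnant_label(t, birth, before=timedelta(weeks=6), after=timedelta(weeks=1)):
    """True from six weeks before the birth until the cubs are about one week old."""
    return birth - before <= t <= birth + after
```

For the male data sets, the same helper would be called with `n_following=288` to obtain the 3-day window.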
Because the number of documented pregnancies and the information in our data sets are limited, several machine learning models were pre-trained, each using all data sets except one for training (one data set corresponds to one animal). Performance was then evaluated and visualised on the data set left out. For all models, the XGBoost classifier implemented in Python was used with standard settings, except for the number of estimators and the maximum depth, which were set to 100 and 5, respectively. Random under-sampling of the majority class (non-pregnant) was applied before training. In total, testing was performed on the data of five different animals.
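The evaluation scheme described above, leaving out one animal at a time and balancing the training data by random under-sampling, can be sketched in plain Python. This is a minimal illustration under assumptions: the function names, the fixed random seed and the index-based data layout are hypothetical, and the classifier itself is not shown.

```python
import random


def leave_one_animal_out(animal_ids):
    """Yield (held_out, train_idx, test_idx) for each distinct animal.

    `animal_ids[i]` is the animal that data point i belongs to.
    """
    for held_out in sorted(set(animal_ids)):
        train = [i for i, a in enumerate(animal_ids) if a != held_out]
        test = [i for i, a in enumerate(animal_ids) if a == held_out]
        yield held_out, train, test


def undersample_majority(indices, labels, seed=0):
    """Randomly drop data points until every class has as many points
    as the smallest class among `indices`."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    n_min = min(len(members) for members in by_class.values())
    kept = []
    for members in by_class.values():
        kept.extend(rng.sample(members, n_min))
    return sorted(kept)


# Illustrative data: three animals, label 1 = pregnant, 0 = non-pregnant.
animal_ids = ["a", "a", "a", "b", "b", "c"]
labels = [0, 0, 1, 0, 1, 0]
for held_out, train_idx, test_idx in leave_one_animal_out(animal_ids):
    balanced_idx = undersample_majority(train_idx, labels)
    # A classifier would be fitted on balanced_idx and evaluated on test_idx.
```

Under-sampling is applied to the training indices only, so the data of the held-out animal keeps its natural class distribution when the predictions are visualised.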
The classification results were visualised on the Cesium timeline to help analyse the performance of the machine learning and to discover new pregnancies at an early stage in the future. Predicted values were mapped to a colour scheme from dark brown (non-pregnant) to dark green (pregnant). Fig. 9 shows the results of the pregnancy prediction for one animal, which gave birth twice during the time course of the GPS data. The first birth took place on 10 December 2012 and is marked in dark green on the timeline, meaning a high probability of the female being pregnant. In addition, the postpartum periods and the lair can easily be identified by visual inspection, as the movement pattern of the female changes significantly, with relatively short trips around the lair and frequent returns, which are marked as yellow high-density points, see Fig. 9. For monitoring cub survival, it is crucial for our experts to identify the pregnancy at an early stage. Looking at the time before the first birth, the female is first marked as pregnant on 10 October 2012, approximately two months before giving birth. Nevertheless, there is a two-week window in between with higher uncertainty of pregnancy, which is indicated by the white colour. The second birth is temporally very close to the first. Normally, the cubs move with their mother for approximately 17 to 19 months before the mother gives birth again. Our experts provided the information that in this case the cubs did not survive. By visual inspection of the movement data in our visual analytics environment, it was spotted that the cubs died on approximately 21 December 2012, because the period of frequent returns usually lasts eight weeks, but here it ends abruptly at an early stage. The second birth took place on 03 April 2013. It lies in a window where pregnancy is marked as less likely (white), after a period of high pregnancy probability that starts on 01 March 2013, approximately one month before the birth. For this female, the prediction worked well for both birth events, and a high probability was found one month before giving birth. For other females, however, the prediction was more difficult, as the period of frequent returns was sometimes misclassified as pregnant, or the probability only increased close to the actual birth event. These problems might also arise because several factors influence pregnancy for the individual female, for example the number of previous birth events or the age and experience of the female. Thus, while our results are promising, there is still potential for improvement, and further markers might be found that indicate pregnancy and improve the predictive power.
Territorial-floater prediction
One important and difficult task is to classify the territorial behaviour of male cheetahs. In particular, there is a need to detect changes in behaviour from one spatial tactic to the other, for example when a floater starts to claim a territory (switcher). To this end, a machine learning model was pre-trained, and its predictions were assessed and visualised for test cases in our visual analytics environment. It is known that the home range is a crucial, but insufficient, factor influencing the predictive power. Since the dynamics and movement behaviour change when a floater starts claiming a territory, we used the same approach for feature generation as for pregnancy prediction, except that 288 distance values (3 days) were used to capture more detailed behavioural changes. Pre-processing of the data sets was also identical to that of the female data sets described in Sect. 6.1, and target values were determined from a list of territorials, floaters and switchers provided by our domain experts. In total, 13 animals (1,270,838 data points) showed territorial, 19 animals (917,484 data points) floating, and one animal (28,417 data points) switching behaviour. As much more data are available for this classification task than for the females, all animals except two territorial males, one floater and one switcher were used to train a single machine learning model. The Random Forest classifier implemented in Python was used with standard settings, except for the number of trees and the maximum depth, which were set to 200 and 7, respectively. In addition, random under-sampling of the majority class (territorial) was performed before applying machine learning. The classification results were visualised on the Cesium timeline to help analyse the performance of the machine learning and to identify animal behaviour in the future. Predicted values for time segments were mapped to a colour scheme from dark brown (strong territorial behaviour) via light brown to white (unspecific behaviour) and via light green to dark green (strong floating behaviour).
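The diverging colour scheme described above can be sketched as a piecewise linear interpolation between three anchor colours. The RGB values, the function names and the assumption that the prediction is a score in the range 0 to 1 are illustrative choices, not taken from the implementation. The light brown and light green tones arise from the interpolation.

```python
# Assumed anchor colours as (R, G, B); the actual values used in the
# visual analytics environment are not given in the text.
DARK_BROWN = (84, 48, 5)
WHITE = (255, 255, 255)
DARK_GREEN = (0, 60, 48)


def lerp_colour(c0, c1, t):
    """Linear interpolation between two RGB colours, t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def diverging_colour(score):
    """Map a score in [0, 1] to dark brown (0), white (0.5) or dark green (1),
    with light brown and light green in between."""
    score = min(1.0, max(0.0, score))  # clamp out-of-range scores
    if score < 0.5:
        return lerp_colour(DARK_BROWN, WHITE, score / 0.5)
    return lerp_colour(WHITE, DARK_GREEN, (score - 0.5) / 0.5)


def to_hex(rgb):
    """Format an RGB tuple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

For example, `to_hex(diverging_colour(0.5))` yields `#ffffff`, the white used for uncertain or unspecific predictions.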
Overall, the inspection of the machine learning classification showed good results on the test cases. One of the test cases is a bipolar territorial male, and our classification marked it mostly as territorial. Our visual aggregation shows only one time period marked as floater, in which the animal switches between its two activity centres, consistent with the bipolar behaviour. In total, 70% of its data points are classified correctly. One of the two floater test cases (animal A) had 68% of its data points correctly classified and was marked as floater over large periods of time, see Fig. 10. Still, there are some uncertain predictions (white ranges) and one time period in which it is classified as territorial. This probably happened because floaters have periods of rather restricted local movement, and the one period with the highest density, where the animal resides for some time, might indicate territorial behaviour.
The most interesting case for our domain experts is the behaviour of males that switch between the two spatial tactics. Here, animal B changes its behaviour from floating to territorial over the course of time, which is also indicated by an alternating pattern in the classification. With additional visual inspection, a small territory is already clearly visible, and the time spent there is mostly classified as territorial, see Fig. 11. Periods during which the animal shows floating behaviour are also mostly identified correctly, see Fig. 12. Nevertheless, switching behaviour poses a challenge to machine learning, as there are continuous and frequent changes between the behaviours over long periods of time.
Another good indicator of territorial behaviour is marking. Territorials mark several sites in their territories to claim them and to exchange information with other cheetahs. Since marking is a continuous process in which the territorial cheetah returns frequently for a short time, it can be analysed with a clustering approach in our visual analytics environment. To evaluate the usability of our visual analytics framework for marking spot detection, the movement trajectories of a typical territorial and a typical floater, each covering approximately one year, were analysed and compared.
For the territorial cheetah, the density map in Fig. 2 shows a major region of high revisitation (dark yellow) in the movement trajectory. The yellowish area indicates the approximate outline of the territory of the respective cheetah, as this is the area where it mainly spends its time. In comparison, the density map of a typical floater looks different, see Fig. 3. There are three major regions of high revisitation (dark yellow), and the floater moves regularly between them, as indicated by the yellowish connections. Overall, the floater covers much more space and frequently revisits a larger area than the territorial. Nevertheless, density maps alone are not sufficient to fully distinguish territorial from floater behaviour, since bipolar territorials also exist, which own two territories that are frequently revisited and marked.
By filtering for clusters and taking a closer look at the dense (dark yellow) areas, a better distinction between territorials and floaters is possible. When filtering the revisitation clusters of the territorial cheetah, we can see a number of clusters that are close together, see Fig. 8. All of those clusters have a revisitation rate of 39 to 103, with an average of 54.2 revisitations, and on average 4 to 11 days until revisitation. This kind of regular revisitation over a long time span is a clear sign of a marking spot. When checking the individual marking spots further, it was clearly visible that they are located next to or at a tree, which is a typical marking spot for cheetahs, see Fig. 13. This will aid in the identification of new marking spots in the future, since the visualisation clearly shows which clusters are close to potential marking spots, e.g. a tree. Our collaborators evaluated the marking site prediction, which is based on our clustering approach, and confirmed that the automatically selected clusters were exactly the marking sites of the cheetah that had been determined through observation and camera traps (DC2).
The floater data set also contains some clusters with frequent returns, see Fig. 14. In contrast to those of the territorial cheetah, they are more widely distributed and far away from each other. The revisitation rate is between 19 and 57, and the average of 32.7 revisitations is much lower than the 54.2 of the territorial. On average, the clusters are revisited every 4 to 17 days, which is a longer interval than for the territorial.
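How revisitation figures of this kind might be derived from the time stamps of the fixes inside one cluster can be sketched as follows. The text does not define the exact procedure, so everything here is an assumption: fixes are grouped into one visit as long as consecutive time stamps are at most `max_gap` apart (6 hours is an arbitrary choice), every visit after the first counts as a revisitation, and the revisitation interval is measured between the starts of consecutive visits. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def split_visits(times, max_gap=timedelta(hours=6)):
    """Group time stamps into visits; a gap above `max_gap` starts a new visit."""
    visits = []
    for t in sorted(times):
        if visits and t - visits[-1][-1] <= max_gap:
            visits[-1].append(t)
        else:
            visits.append([t])
    return visits


def revisit_stats(times, max_gap=timedelta(hours=6)):
    """Return (number of revisitations, mean days between visit starts).

    The mean is None when the cluster was visited fewer than two times.
    """
    visits = split_visits(times, max_gap)
    starts = [visit[0] for visit in visits]
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(starts, starts[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return max(len(visits) - 1, 0), mean_gap


# Illustrative cluster: visits on day 0, day 5 and day 9.
day0 = datetime(2013, 1, 1)
cluster_times = [
    day0,
    day0 + timedelta(minutes=15),
    day0 + timedelta(days=5),
    day0 + timedelta(days=9),
    day0 + timedelta(days=9, minutes=15),
]
n_revisits, mean_days = revisit_stats(cluster_times)
```

Both numbers depend strongly on the chosen `max_gap` and on how the clusters were delineated, so they are comparable between animals only when the same settings are used.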
It is known from the literature that a floater moves between cheetah territories and does not spend much time in them. To confirm this, the sliding time window feature was used to compare the movement of the animals in two separate one-week periods. For the territorial, we can observe that in each week the clusters are located close together and that the animal mostly moves close to the frequently visited marking spots, see Fig. 15. In contrast, the floater moves much more, covers a huge area between these clusters (as highlighted in Fig. 14) and does not spend much time in one area, see Fig. 16. This feature enables a good visual distinction between the two types of animal behaviour evaluated here.