1 Introduction

Technological developments over the last decades offer unprecedented opportunities to monitor the Earth system. In particular, the derived downstream data products are very valuable for understanding processes at the land surface. International research projects like ESDL and BACI are joint efforts to provide free-of-charge, unified, and high-quality Earth Observations (EOs) from satellite-based remote sensing measurements. Within this framework, the concept of the ‘Earth System Data Cube’ arose as a practical and intuitive way of storing and representing multivariate spatiotemporal databases.

The ability to detect and monitor anomalous behavior in multivariate environmental time series is crucial. Such events signal changes in the underlying dynamical system, and their detection can be used as an early warning system for land ecosystems. Classical extreme value theory (Coles 2001; Dey and Yan 2016) is not an option, since the available EO records are still relatively short (at most three decades). Recently, Zscheischler et al. (2014) and Zscheischler et al. (2014) proposed a univariate approach based on threshold exceedances to analyze the global interannual variability of gross primary production. The methodology presented here, in contrast, tackles the problem from a multivariate point of view. In that setting, the definition of an extreme event should also include those constellations in which no single variable is extreme but their combination is (multivariate extreme or compound event) (Reichstein et al. 2013; Flach et al. 2017; Zscheischler et al. 2018). Therefore, the extrapolation from the univariate to the multivariate case is not trivial.

A common approach for multivariate analysis in geoscience is to look for those events where multiple variables present abnormal behavior simultaneously, often called co-exceedances (Donges et al. 2011; Donges et al. 2016). This approach is based on fixing a threshold at each variable and analyzing the probability of occurrence of events above those thresholds either simultaneously or with a certain lag between variables. However, this might be a very conservative approach. A good alternative which has become very popular lately is the application of copulas. Copula models are based on Sklar’s theorem which states that any multivariate distribution can be written in terms of univariate marginal distribution functions of the variables involved and a copula function that describes the dependence between these variables (Sklar 1959). Whereas copulas are well studied in the bivariate case, higher-dimensional cases still present some limitations. Elliptical (i.e., multivariate Gaussian and Student’s t distributions) and Archimedean (i.e., Clayton, Frank and Gumbel) families are the most suitable ones for practical multivariate applications (Ma et al. 2013; Corbella and Stretch 2013). Nonetheless, there are authors arguing against the use of copulas and that they do not present any particular advantage when dealing with multivariate distributions (Mikosch 2005).

The main objective of this study is to propose a methodology to detect abnormal events in multivariate environmental time series. By combining different statistical methods, we are able to tackle the spatiotemporal dependencies. The methodology we propose can be divided into two main steps: feature extraction and event detection. The first step is based on the assumption that the time series of each variable can be represented by an autoregressive moving average (ARMA) process, and that anomalies are those time instances that are not well represented by the estimated ARMA model (Chandola et al. 2009). We use the Mahalanobis distance to measure how far the multivariate residuals (the difference between the observations and the ARMA model output) at a given time step deviate from their joint distribution.

For the second step, two event detection methods are presented in this paper. The first is to use a fixed threshold at a certain percentile of the Mahalanobis distance distribution applied on each time step independently. However, adjacent points in time and space are most likely to belong to the same event, whether it is normal or anomalous. The exploitation of the spatiotemporal regularity of the obtained novelty score can, on one hand, help to reduce the uncertainty in the estimation of the Mahalanobis distance from the noisy observations of a single point in time and space, and on the other hand, can help to directly define the spatial and temporal extent of the detected events. Hence, as an alternative solution, we propose to approach the problem of detecting abnormal events as detection of spatiotemporal clusters of high novelty score (Mahalanobis distance). Based on the proposed approach, the statistics (mean and variance) of the detected clusters, rather than a fixed percentile threshold, can be used to define the intensity of the anomalies. This is advantageous since the optimal selection of a fixed percentile threshold might vary according to the season as well as the climate area.

Markov random fields (MRFs) (Geman and Geman 1984) provide an effective and theoretically well-established mathematical tool for integrating spatiotemporal dependency into the classification of image time series (Melgani and Serpico 2003; Benedek et al. 2015). To this end, the obtained Mahalanobis distance over the entire study area is treated as a time series of images. We use an adaptation of the multi-layer fusion MRF classification model presented in Sziranyi and Shadaydeh (2014) and Shadaydeh et al. (2017) to classify these Mahalanobis distance images into three classes: intense anomaly, possible anomaly and normal.

The remainder of this article proceeds as follows. Section 2 gives a short description of the used data and study area. In Sect. 3, the steps of the methodology are explained in detail. Experimental results and validation based on known historic events are presented in Sect. 4. Finally, a conclusion is drawn in Sect. 5.

2 Data and study area

Data from the Earth System Data Cube (ESDC) developed within the ESDL project have been used as the primary source of biosphere data for this study. The ESDC comprises spatiotemporal data consisting of time, latitude, longitude and multivariate Earth Observations. The version used in this study covers the period from January 2001 to December 2012 with 8-daily observations and a spatial grid with a resolution of 0.25\(^{\circ }\). More than 30 biosphere and atmosphere parameters are included in this database. Out of these variables, we have used the 5 that mainly measure terrestrial biosphere activity: Gross Primary Productivity (GPP), Latent Energy (LE), Net Ecosystem Exchange (NEE), Sensible Heat (SH) and Terrestrial Ecosystem Respiration (TER), which were kindly provided by the FLUXCOM initiative (Tramontana et al. 2016).

The study area comprises Africa and Europe (see Fig. 1). This area was defined as the main study area within the EU project BACI. The BACI project aims to develop a ‘Biosphere Atmosphere Change Index’ to detect climate-induced ecosystem changes and to assess their impacts on socioeconomic and ecological processes.

3 Methodology

The methodology we propose can be divided into the following three steps: preprocessing, feature extraction and event detection. The workflow is illustrated in Fig. 2. Each step is described in more detail in the following subsections.

3.1 Preprocessing

Deseasonalization and normalization. To avoid inconsistencies in later steps, the data need to be preprocessed. We have applied techniques commonly used in environmental sciences. First, the seasonal pattern usually present in environmental variables has been removed by subtracting the mean seasonal cycle. Then each deseasonalized variable was normalized by subtracting its mean, \(\mu \), and dividing by its standard deviation, \(\sigma \). This is done for all 5 variables locally at each pixel of the grid.
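
The two preprocessing operations can be sketched for a single pixel and variable as follows. This is a minimal illustration, not the authors' implementation: the function name, the synthetic series and the layout of 46 eight-day steps per year are assumptions, and the scaling uses the standard deviation.

```python
import numpy as np

def deseasonalize_and_standardize(series, steps_per_year):
    """Remove the mean seasonal cycle, then z-score the anomalies.

    series: 1-D array whose length is a multiple of steps_per_year.
    """
    x = np.asarray(series, dtype=float)
    years = x.reshape(-1, steps_per_year)       # one row per year
    seasonal_cycle = years.mean(axis=0)         # mean over years for each step
    anomalies = (years - seasonal_cycle).ravel()
    return (anomalies - anomalies.mean()) / anomalies.std()

# Synthetic example: 12 years of 46 eight-day steps (an assumed layout).
rng = np.random.default_rng(0)
t = np.arange(12 * 46)
raw = 3.0 * np.sin(2 * np.pi * t / 46) + rng.normal(size=t.size)
z = deseasonalize_and_standardize(raw, steps_per_year=46)
```

In a gridded setting the same function would be applied independently to every pixel and variable; missing values would need separate handling.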

Regionalization. Once the seasonality has been removed and the variables have been normalized, the grid was clustered into regions of similar climate conditions. This regionalization was done according to the climate types defined by the Köppen Climate Classification (Chen and Chen 2013). The Köppen Climate Classification is a widely used vegetation-based empirical clustering that divides the world into up to 31 climate regions. Of these 31 climate regions, 23 are present in our study area. Figure 1 shows the climate regions, with the legend explaining the codes that define them.

Fig. 1 Area of study clustered according to the Köppen climate classification. Gaps represent areas where there is no data available

Fig. 2 Flowchart of the proposed methodology

3.2 Feature extraction

3.2.1 ARMA models

An abnormal event can be defined as those points within the time series that are not well represented by a previously fitted statistical model (Chandola et al. 2009). Following this intuitive concept, we have applied an autoregressive moving average (ARMA) model and afterward computed the residuals between the model and the data. Those points where the differences (residuals) between model and data are significantly high can be considered as abnormal events that the model is not able to represent correctly. An ARMA (p, q) model consists of two parts, an autoregressive part (AR) and a moving average part (MA). The coefficients p and q refer to the order of each part:

$$\begin{aligned} X_t =\varepsilon _t + \displaystyle \sum _{i=1}^{p} \varphi _i X_{t-i} + \displaystyle \sum _{i=1}^{q} \theta _i \varepsilon _{t-i} \end{aligned}$$
(1)

where \(\varphi _1,...,\varphi _p\) and \(\theta _1,...,\theta _q\) are parameters of the model and \(\varepsilon _t\) is an error term assumed to be i.i.d. Gaussian noise.
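
Rearranging Eq. (1) gives the residual at each time step, \(\varepsilon _t = X_t - \sum _{i=1}^{p} \varphi _i X_{t-i} - \sum _{i=1}^{q} \theta _i \varepsilon _{t-i}\). The sketch below applies this recursion for known coefficients; the function name and the convention of treating values before the start of the series as zero are assumptions made for illustration, and parameter estimation is not shown.

```python
def arma_residuals(x, phi, theta):
    """Recover eps_t from Eq. (1) for given AR (phi) and MA (theta) coefficients.

    Values before the start of the series are treated as zero (an assumption).
    """
    eps = []
    for t in range(len(x)):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[i] * eps[t - 1 - i] for i in range(len(theta)) if t - 1 - i >= 0)
        eps.append(x[t] - ar - ma)
    return eps

# Simulate a short ARMA(1, 1) series from known noise, then recover the noise.
noise = [0.5, -1.0, 0.25, 2.0, -0.75]
phi, theta = [0.6], [0.3]
x = []
for t, e in enumerate(noise):
    prev_x = x[t - 1] if t > 0 else 0.0
    prev_e = noise[t - 1] if t > 0 else 0.0
    x.append(e + phi[0] * prev_x + theta[0] * prev_e)

recovered = arma_residuals(x, phi, theta)
```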

For each climate region, a representative point that is geographically centered in the region, and is hence assumed to reflect its average behavior, has been selected. A univariate ARMA model has been fitted to each of the 5 variables at every representative point. To select the best model order (p, q), the Bayesian Criterion (Schwarz 1978) was applied to all possible combinations between (0,0) and (5,5). Table 1 shows the selected ARMA order for each variable and climate region.
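
The order search can be sketched as a grid search that keeps the (p, q) pair with the lowest value of the Schwarz criterion, \(k \ln n - 2 \ln \hat{L}\). In this sketch, `fit_loglik` is a hypothetical callback standing in for whichever ARMA estimator supplies the maximized log-likelihood, and counting \(k = p + q + 1\) parameters (coefficients plus noise variance) is an assumption.

```python
import math
from itertools import product

def bic(log_likelihood, n_params, n_obs):
    """Schwarz criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def select_order(fit_loglik, n_obs, max_p=5, max_q=5):
    """Return the (p, q) with minimal BIC over the grid (0, 0)..(max_p, max_q).

    fit_loglik(p, q) is a hypothetical callback returning the maximized
    log-likelihood of an ARMA(p, q) fit.
    """
    best = None
    for p, q in product(range(max_p + 1), range(max_q + 1)):
        k = p + q + 1  # AR and MA coefficients plus noise variance (assumed count)
        score = bic(fit_loglik(p, q), k, n_obs)
        if best is None or score < best[0]:
            best = (score, (p, q))
    return best[1]

# Toy log-likelihood that stops improving beyond ARMA(2, 1); purely illustrative.
def toy_loglik(p, q):
    return -800.0 + 40.0 * min(p, 2) + 25.0 * min(q, 1)

order = select_order(toy_loglik, n_obs=552)
```

With the toy log-likelihood, extra parameters beyond (2, 1) add the \(\ln n\) penalty without improving the fit, so the search settles on that order.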

Although there are multivariate approaches available (e.g., Cai 2011; Soares and Cunha 2000), we have decided to work with univariate ARMA models independently fitted to each variable at each point because of their higher flexibility and easier interpretation. Preez and Witt (2003) compared the performance of univariate and multivariate models and recommended the use of univariate models, especially in cases where cross-correlations between variables are not particularly strong. Multivariate models involve a greater number of parameters, which becomes a disadvantage for rather short time series, while their performance is comparable to that of univariate approaches.

Accordingly, for each climate region and variable, we have selected the orders (p, q) of the ARMA models to be fitted. We then proceed with the entire grid, fitting at each point an ARMA\((p_{ij},q_{ij})\) model, where i refers to the climate region and j to the variable (see Table 1). Note that for some variables the selected ARMA model is of order (0, 0); in those cases, the Bayesian Criterion indicates that it is better to work directly with the variables themselves instead of with ARMA model residuals.

Table 1 (p, q) ARMA parameters selected for each climate region and variable

We have additionally tested ARIMA (autoregressive integrated moving average) models, which are appropriate when the variables are non-stationary. Comparing ARMA and ARIMA models by means of the Bayesian Criterion, we found that including the extra parameter of the ARIMA models does not lead to better results for the relatively short (12 years) ESDL data. Table 2 shows the comparison between the ARMA and ARIMA models for each climate region and variable. The ARIMA(p, D, q) models introduce a non-seasonal integration term defined by the parameter D. As can be seen, D is equal to 0 for the majority of the climate regions and variables. Therefore, following the principle of parsimony, we have opted for the ARMA models.

Table 2 Comparison between (p, q) ARMA parameters and (p, D, q) ARIMA parameters for each climate region and variable

3.2.2 Residuals

Next, we fit the models at all the pixels of the grid; by comparing the predictions of the ARMA models with the variables themselves, we obtain the time series of residuals at each point. These residual time series are used to detect abnormal events. To verify the fit of the estimated ARMA models, we have checked the autocorrelation of the residuals to confirm the absence of seasonal patterns and a low correlation.

3.2.3 Mahalanobis distance

At each pixel of the grid, we have combined the residuals of the 5 variables in a vector \(\mathbf x \) and estimated its Mahalanobis distance (Mahalanobis 1936; Hotelling 1947). Compared to other metrics, this distance measure has the advantage of taking into account the shape of the joint distribution. Unlike the Euclidean distance, it takes into account not only the mean but also the covariance matrix. The Mahalanobis distance (in squared units) is defined as:

$$\begin{aligned} d_m(\mathbf x ) =(\mathbf x -\bar{\mathbf{x }})^T {\varvec{\Sigma }} ^{-1}(\mathbf x -\bar{\mathbf{x }}) \end{aligned}$$
(2)

where \(\bar{\mathbf{x }}\) and \({\varvec{\Sigma }}\) are the mean and covariance matrix of the multivariate residuals vector \(\mathbf x \), respectively. The mean and the covariance were estimated from the entire time series. We chose this global estimate because of the short length of the time series used and its coarse temporal resolution.
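
Equation (2) can be evaluated for all time steps of one pixel at once. The following is a minimal sketch under the stated choice of estimating the mean and covariance from the whole series; the function name and the synthetic residuals (with one artificially shifted time step) are assumptions for illustration.

```python
import numpy as np

def mahalanobis_sq(residuals):
    """Squared Mahalanobis distance of each row of an (n_times, n_vars) array,
    using the mean and covariance estimated from the whole series (Eq. 2)."""
    r = np.asarray(residuals, dtype=float)
    centered = r - r.mean(axis=0)
    cov = np.cov(r, rowvar=False)
    # Solve cov @ y = centered.T rather than forming the explicit inverse.
    solved = np.linalg.solve(cov, centered.T).T
    return np.einsum("ij,ij->i", centered, solved)

# Synthetic residuals for 5 variables; time step 100 is shifted in all of them.
rng = np.random.default_rng(1)
res = rng.normal(size=(552, 5))
res[100] += 6.0
d = mahalanobis_sq(res)
```

Solving the linear system instead of inverting the covariance matrix is a numerical design choice; it gives the same distances when the covariance is well conditioned.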

Figure 3 shows the scatter-plot matrix of the residuals at one location (50.875\(^\circ \)N, 11.625\(^\circ \)E). The diagonal of the matrix shows the autocorrelation plots of the 5 variables, while the remaining subplots show all the pair-wise scatter plots. The colors assigned to the dots in the scatter plots correspond to the Mahalanobis distance estimated with the 5 variables. Although the joint distribution of the residuals does not follow a multivariate Gaussian distribution, it does not present a clear multimodality. Therefore, the Mahalanobis distance is still a robust approach, as argued by Warren et al. (2011). The negligible autocorrelation values for lags greater than zero indicate that the models are well fitted. This supports the use of univariate ARMA models and their correct fit for this study case, considering the low temporal resolution of the data. However, multivariate autoregressive models could be implemented at the feature extraction step for higher temporal resolution data without changing the subsequent event detection step.

Fig. 3 Scatter-plot matrix and autocorrelation plots of the ARMA residuals at the location 50.875\(^\circ \)N, 11.625\(^\circ \)E. The color of the dots represents the Mahalanobis distance associated with the residuals

3.3 Event detection

Once the Mahalanobis distance for all the points of the grid has been estimated, the following question arises: how could we discern between normal and abnormal values of this metric? At this point, we have considered two options. As a first approach, we have used a fixed threshold at a certain percentile of the Mahalanobis distance distribution. And as a second and more complex approach, we define abnormal events as those spatiotemporal clusters of high novelty score (Mahalanobis distance). To this end, we use unsupervised K-means clustering followed by MRF spatiotemporal smoothing. Details of these two approaches are described in the following two sections.

3.3.1 Event detection using fixed threshold

The easiest way to distinguish between normal and abnormal events is to set a threshold and look for the events surpassing it. We are interested only in the largest values of the Mahalanobis distance; therefore, we have set the threshold at the 97.5th percentile of its distribution (all the Mahalanobis distance values across the entire region). We have then looked for the events above this threshold, which are the 2.5\(\%\) of observations with the largest Mahalanobis distance.
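
As a minimal sketch of this thresholding, assuming the novelty scores are held in a (time, lat, lon) array without missing values, the empirical percentile can be taken over all values at once; the function name and the stand-in array are illustrative.

```python
import numpy as np

def flag_exceedances(distances, percentile=97.5):
    """Return a boolean mask of values above the empirical percentile,
    together with the threshold itself."""
    d = np.asarray(distances, dtype=float)
    threshold = np.percentile(d, percentile)
    return d > threshold, threshold

# Stand-in for a (time, lat, lon) cube of Mahalanobis distances.
d = np.arange(1.0, 1001.0).reshape(10, 10, 10)
mask, thr = flag_exceedances(d)
```

Grid cells without data would need `np.nanpercentile` or explicit masking instead.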

3.3.2 Event detection using spatiotemporal MRF model

Markov random field models are widely adopted to quantify the spatial/temporal dependency among adjacent pixels in time series of images. An MRF is defined on an undirected graph in which nodes denote image pixels and edges denote conditional dependencies. The dependency between adjacent pixels can be modeled by conditional probabilities within a neighborhood system.

Sziranyi and Shadaydeh (2014) proposed a multi-layer fusion MRF model for change detection in multi-temporal optical images. This method has been further improved in Shadaydeh et al. (2017) to deal recursively with time series of images.

In this study, the obtained Mahalanobis distance over the entire study area is treated as a time series of images. We use an adaptation of the model proposed in Shadaydeh et al. (2017) for the classification of the time series of Mahalanobis distance images into K classes using unsupervised K-means clustering followed by spatiotemporal MRF-based segmentation applied recursively on each \(L \ge 1\) consecutive images. The K classes represent K different intensities of the novelty score clusters.

An image \(\mathcal {S}=\{s_1, s_2, ..., s_H\}\) is considered as a two-dimensional grid of pixels. Let \(\mathcal {G}_s\) be the set of pixels which are neighbors of a pixel \(s \in \mathcal {S}\); \(\mathcal {G}= \{ \mathcal {G}_s \mid s \in \mathcal {S} \}\) is a neighborhood system (Kato and Zerubia 2012) if: \(\forall s,r \in \mathcal{S}: s \notin \mathcal {G}_s\); and \(s \in \mathcal {G}_r \Longleftrightarrow r \in \mathcal {G}_s\).

Each of the image pixels may take a label \(\lambda \) from a finite set of labels \(\varLambda \). Let \(\varOmega = \{\omega =(\omega _{s_1},\ldots , \omega _{s_H}): \omega _{i}\in \varLambda , 1\le i \le H\}\) be the set of all possible labelings of the image pixels. The image segmentation is equivalent to a global labeling \(\varOmega =\{ \omega _s\ |\ s\in \mathcal {S}\}\). The label field \(\varOmega \) is modeled as a Gaussian Markov Random Field. There are several methods to estimate the globally optimal labeling, either through iterative processes or through graph cuts (Kato and Zerubia 2012).

Let \(d_s(t)\) denote the novelty score (Mahalanobis distance) at location s and time point t. The proposed method consists of the following four steps applied recursively for each time step t (cf. Fig. 4):

  1. The novelty scores \(d_s(t)\) for \(L=2N+1\) successive time instants are represented by the feature vector:

    $$\begin{aligned} \mathbf d _s = [ d_s(t-N), \cdots , d_s(t), \cdots , d_s(t+N) ]^{T}. \end{aligned}$$
    (3)

  2. K clusters are found in the multi-layer image \(\mathbf X =\{\mathbf{d}_s, s \in \mathcal {S}\}\) using the unsupervised K-means clustering algorithm.

  3. MRF segmentation is run on the multi-layer image \(\mathbf X \) using the K-means clustering parameters to obtain the multi-layer labeling \(\varOmega _L\).

  4. The multi-layer labeling \(\varOmega _L\) is used as training map for another MRF segmentation applied on the novelty score image \(\mathbf d (t)= \{d_s(t), s \in \mathcal {S}\} \), resulting in a labeling \(\varOmega _t\).
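
Steps 1 and 2 can be sketched as follows; the MRF segmentations of Steps 3 and 4 are not covered. This is an illustration under assumptions, not the authors' code: the function names, the plain Lloyd iterations with random initialization, and the synthetic score cube are all hypothetical choices.

```python
import numpy as np

def feature_vectors(scores, t, n):
    """Step 1: stack the L = 2n + 1 images around time t into one vector per pixel."""
    window = scores[t - n : t + n + 1]               # shape (L, rows, cols)
    return window.reshape(window.shape[0], -1).T     # shape (rows * cols, L)

def kmeans(x, k, n_iter=50, seed=0):
    """Step 2: plain Lloyd iterations; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):   # keep the old center if a cluster empties
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers

# Synthetic (time, rows, cols) novelty scores with one persistent high-score block.
rng = np.random.default_rng(2)
scores = rng.uniform(0.0, 2.0, size=(5, 8, 8))
scores[:, 2:5, 2:5] += 20.0

x = feature_vectors(scores, t=2, n=1)      # L = 3 consecutive images
labels, centers = kmeans(x, k=2)
label_map = labels.reshape(8, 8)           # back to image layout
```

The cluster means and variances obtained this way would then parameterize the data term of the MRF segmentation in Step 3.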

We use a maximum a posteriori (MAP) estimator for the label field of the MRF. The MAP estimator combines the conditional random field of the observed data \(P({{\varvec{f}}}_s | \omega _s )\) (\({{\varvec{f}}}_s=\mathbf d _s\) in Step 1, and \({{\varvec{f}}}_s=d_s(t)\) in Step 4) and the unconditional Potts model (Potts 1952). The global labeling \(\widehat{\varOmega }\) is defined by the energy minimum:

$$\begin{aligned} \widehat{\varOmega }=\text {argmin}_\varOmega \Bigg [{\sum _{s \in S} {-\log {P({{\varvec{f}}}_s | \omega _s )}} + \sum _{r,s \in S}\varTheta (\omega _r,\omega _s)}\Bigg ], \end{aligned}$$
(4)

where \(\varTheta (\omega _r,\omega _s)\) is the neighborhood-energy term. It is set to zero if s and r are not neighboring pixels; for neighboring pixels, it penalizes differing labels with the homogeneity weight \(\beta \):

$$\begin{aligned} \varTheta (\omega _r,\omega _s)=\left\{ \begin{array}{ll} 0 &{}\quad \mathrm {if}\ \omega _r=\omega _s \\ \beta &{}\quad \mathrm {if}\ \omega _r\ne \omega _s \end{array} \right. . \end{aligned}$$
(5)
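
To make the objective in Eqs. (4) and (5) concrete, the sketch below evaluates the energy of one given labeling on a 4-connected grid. It only computes the objective and does not perform the minimization; the function name, the array layout of the data term, and counting each neighboring pair once are assumptions.

```python
import numpy as np

def potts_energy(labels, neg_log_lik, beta=1.0):
    """Energy of Eq. (4) for one labeling on a 4-connected grid.

    labels: (rows, cols) integer array of class indices.
    neg_log_lik: (rows, cols, n_classes) array holding -log P(f_s | class).
    Each neighboring pair is counted once (an assumption about the double sum).
    """
    rows, cols = np.indices(labels.shape)
    data_term = neg_log_lik[rows, cols, labels].sum()
    vertical = (labels[1:, :] != labels[:-1, :]).sum()
    horizontal = (labels[:, 1:] != labels[:, :-1]).sum()
    return data_term + beta * (vertical + horizontal)

labels = np.array([[0, 0, 1],
                   [0, 1, 1]])
cost = np.zeros((2, 3, 2))   # flat data term isolates the smoothness penalty
energy = potts_energy(labels, cost, beta=1.0)
uniform_energy = potts_energy(np.zeros((2, 3), dtype=int), cost, beta=1.0)
```

With a flat data term, the example labeling has three neighboring pairs with differing labels, so a larger \(\beta \) favors spatially smoother label maps.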

Fig. 4 The workflow of the proposed spatiotemporal MRF classification model applied on each \(L=3\) consecutive novelty score images

4 Experimental results and discussion

4.1 Validation process

Validating models that try to reproduce environmental processes is not a trivial task. There are no well-defined ground-truth events that can be used to assess the models’ performance and level of accuracy. Here, with the help of experts on the topic, we have selected a list of well-known historical events that caused perturbations in the biosphere within the time span of our data.

This validation is not meant to be as detailed and precise as the topic would require. For this purpose, within the BACI project, there are specific work packages with experts in the field currently working and analyzing the models from a geophysical point of view and studying the socioeconomic and biodiversity impacts of these anomalies as well as the causes behind them. With this validation, we only want to check quantitatively the performance of the proposed methodology without detailed knowledge on the physical drivers behind the events.

Table 3 lists the historical events selected for this validation. We have selected three main events that happened at different locations within the area of study and with different causes and impacts. These three events are well documented in the literature: 1. the droughts and floods in the Horn of Africa during 2006; 2. the Russian Heatwave of 2010 and 3. the severe floods in southern Africa between the end of 2010 and the beginning of 2011.

In 2006, the Horn of Africa, especially Somalia, confronted a series of disasters, both natural and human-made. From the beginning of that year, a severe drought caused a dramatic decrease in crops, threatening almost 2 million people with starvation. This was aggravated by the ongoing political conflict as well as the worst flooding in a decade by the end of the year (Isar 2010).

In summer 2010, western Russia, Ukraine, Belarus, Georgia and Kazakhstan experienced a heat wave that led to record-high temperatures in several cities and a considerably high number of casualties related to the extreme heat (Barriopedro et al. 2011; Trenberth and Fasullo 2012; Coumou and Rahmstorf 2012).

By the end of 2010 and the beginning of 2011, a series of floods took place in the southern part of Africa, mainly in South Africa, Mozambique, Zimbabwe and southern Botswana. The heavy rainfalls caused flooding on the main rivers in that area. Severe damage to property and loss of human lives were reported in the countries affected. This event has been linked to the La Niña event that occurred that winter and led to similar events in several other countries around the world (Nilsson 2012).

Table 3 Known historical extreme events used in the validation

In the following two sections, the results given by the two event detection methods for those 3 events are presented.

4.2 Results for fixed threshold event detection

For each of the events presented before, the areas with higher values of Mahalanobis distance are shown in Figs. 5, 6 and 7 with the largest contour line at the 97.5th percentile threshold marked in red.

Figure 5 shows the months of October and November 2006. At that time, an important food crisis caused by large floods devastated the Horn of Africa. Figure 6 clearly depicts the Russian Heatwave of 2010. It can be seen how the anomaly moves from East to West and finally hits the Nordic countries. The time steps between December 2010 and January 2011 are plotted in Fig. 7. During those months, serious floods took place in southern Africa.

Fig. 5 Floods at the Horn of Africa from October to November 2006 (Mahalanobis distance images). The largest contour line at the 97.5th percentile of the Mahalanobis distance distribution is marked in red

Fig. 6 Russian Heatwave from May to October 2010 (Mahalanobis distance images). The largest contour line at the 97.5th percentile of the Mahalanobis distance distribution is marked in red

Fig. 7 Floods in southern Africa from December 2010 to January 2011 (Mahalanobis distance images). The largest contour line at the 97.5th percentile of the Mahalanobis distance distribution is marked in red

4.3 Results for spatiotemporal MRF model event detection

Considering the same events, we applied the four steps of the spatiotemporal MRF classification model presented in Sect. 3.3.2 to each time step. The classification maps of the Mahalanobis distance into the classes intense anomaly, possible anomaly and normal are shown in Figs. 8, 9 and 10. The intensities of the possible and intense anomaly classes for each time step are annotated in the subplots’ legends as PA and IA, respectively.

In all the experiments, we set the homogeneity weight \(\beta =1\). We used a graph cut-based \(\alpha \)-expansion algorithm for the energy minimization of the MRF with the implementation accompanying Szeliski et al. (2006). The selection of the value of L is based on the temporal resolution of the data. In our experiments, we set \(L=3\), ensuring that the data belong to the same season. However, it is possible to use higher values of L when using data with a higher temporal resolution. We initially assume that we have \(K=9\) clusters. We use MRF segmentation with 9 classes representing 9 different intensities of the Mahalanobis distance. For anomaly detection, we then merge all classes with mean values exceeding the value of the Mahalanobis distance distribution (chi-square distribution) at the 97.5th percentile into the intense anomaly class; classes with mean values between the 95th and the 97.5th percentiles are classified as possible anomaly, and classes with mean values below the 95th percentile are considered normal. The mean and variance of the three new clusters are calculated again from the points of the merged sub-clusters.
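
The merging of the K classes into three can be sketched as below. The function name and the toy labels are illustrative, and the two thresholds are an assumption: they are the chi-square quantiles for 5 degrees of freedom (one per variable), roughly 11.07 at the 95th and 12.83 at the 97.5th percentile.

```python
import numpy as np

def merge_classes(labels, scores, low_thr, high_thr):
    """Map K cluster labels to 0 = normal, 1 = possible anomaly,
    2 = intense anomaly according to each cluster's mean novelty score."""
    merged = np.zeros_like(labels)
    for k in np.unique(labels):
        mean_k = scores[labels == k].mean()
        if mean_k > high_thr:
            merged[labels == k] = 2
        elif mean_k > low_thr:
            merged[labels == k] = 1
    return merged

# Assumed chi-square quantiles for 5 degrees of freedom (95th and 97.5th).
LOW, HIGH = 11.07, 12.83
scores = np.array([1.0, 2.0, 11.5, 12.0, 14.0, 30.0])
labels = np.array([0, 0, 1, 1, 2, 2])
merged = merge_classes(labels, scores, LOW, HIGH)
```

The mean and variance of each merged class can then be recomputed from `scores[merged == c]`.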

Fig. 8 Floods at the Horn of Africa from October to November 2006 (spatiotemporal MRF classification maps). The intensity of the possible and intense anomaly classes is annotated in the subplots’ legends as PA and IA, respectively

Fig. 9 Russian Heatwave from May to October 2010 (spatiotemporal MRF classification maps). The intensity of the possible and intense anomaly classes is annotated in the subplots’ legends as PA and IA, respectively

Fig. 10 Floods in southern Africa from December 2010 to January 2011 (spatiotemporal MRF classification maps). The intensity of the possible and intense anomaly classes is annotated in the subplots’ legends as PA and IA, respectively

4.4 Discussion

The plots shown in the previous sections demonstrate the methodology’s capability of effectively detecting abnormal events in the biosphere. Both of the proposed event detection methods have advantages and disadvantages. Fixing a threshold is a simple approach that allows for a quick analysis of the results. It ensures that the detected events have at least the same intensity (the threshold), but the selection of the threshold requires a carefully considered decision. The second proposed method, in contrast, automatically detects the spatial extents as well as the thresholds between classes at each step. This allows for a more flexible definition of the detected anomalies but makes the comparison of events more complicated, since they might have different intensity levels.

Taking a closer look at the plots of the three selected events, we can observe the following. Larger events like the Russian Heatwave are well represented by both methods (Figs. 6 and 9). The sequence of time steps clearly shows how the anomaly moved from East to West with a later displacement toward northern latitudes. While the anomalies initially seem to be scattered across the whole of Europe, they become more concentrated around western Russia, Ukraine and Belarus. According to the intensities from the MRF model, the event intensifies from May until it reaches a peak in the last 2 weeks of July, before decreasing again.

The event at the Horn of Africa in 2006 (Figs. 5 and 8) is detected more clearly using the MRF-based method than with the fixed threshold. This can be related to the fact that the intensities for this event are a bit lower than the ones obtained for the Russian Heatwave.

Finally, the event in southern Africa between the end of 2010 and the beginning of 2011 is shown in Figs. 7 and 10. Although shorter in duration (only 6 time steps), this anomaly is clearly detected by both methods. Here, another difference between the two methods becomes apparent: the three classes defined by the MRF model-based method allow for a better spatial definition of the event compared to the fixed threshold method.

It should be noted that the low temporal and spatial resolution of the data used in this study limits the detection of anomalies of short duration or small spatial extent.

5 Conclusions

A new methodology to detect anomalies in biosphere time series has been described. The procedure is computationally efficient and hence practical to apply. Our approach comprises two main steps after preprocessing the data: feature extraction and event detection. The feature extraction is achieved by means of autoregressive models followed by the estimation of Mahalanobis distance of the multivariate residuals. Event detection is based on the concept of detecting clusters of high novelty score using a spatiotemporal MRF classification model.

The proposed methodology has been applied to a large area that covers Europe and Africa. Results show that the method is able to detect the spatial extent of three known historic events and also provides a useful tool to define sensitive regions.

The modular nature of the methodology allows for further improvements in different sub-steps. It is also envisaged that the methodology can be applied to other kinds of data (e.g., atmospheric parameters). This, however, could require different preprocessing steps.

The proposed methodology could be enhanced by the use of models other than the univariate ARMA models suggested here; for example, with longer time series, multivariate autoregressive models could be implemented at the feature extraction step without significantly changing the event detection step. Further research is also needed on a better-suited climate classification. Although the Köppen Climate Classification provides a useful approach, a definition of climate regions with transitional borders between regions, instead of hard ones, could avoid the seasonal effects observed in the models’ behavior. Finally, regarding the spatiotemporal MRF classification model, one potential improvement would be the use of a hyper-parameter-free clustering method instead of K-means. If training data for normal and/or anomalous classes are available, it is possible to use supervised training for the spatiotemporal MRF model instead of the unsupervised K-means clustering.