1 Introduction

The credibility of climate scenarios rests largely on the ability of climate models to reproduce the observed climate. The quality of a model's reconstruction of the climate is evaluated by comparing fields of meteorological parameters. Fields of continuous parameters, such as temperature, can be evaluated by comparing values at the grid points, and a statistical evaluation of these point differences gives a realistic view of the quality of the reconstruction. The situation is different for phenomena such as precipitation. Precipitation occurs in a limited area and is usually short-lived, so it is difficult to reconstruct precisely even in high-resolution models. However, minor deviations in the duration and location of the phenomenon should not be decisive in assessing the adequacy of the fields. The same problem arises in the verification of numerical weather forecasts and has been studied in the “Spatial Forecast Verification Methods Inter-Comparison Project” (Ahijevych et al. 2009a, b). Many publications related to this project propose methods better suited to analyzing the compatibility of precipitation fields (Gilleland et al. 2010; Ahijevych et al. 2009a, b; Gilleland et al. 2009). The methods and ideas from this project can also be applied to the evaluation of climate models. An example of applying such techniques to climate data is the work of Moise and Delange (2011), in which an object-oriented technique for contiguous rain areas (CRAs) was developed.

The aim of our research is to present methods for comparing very general properties of precipitation fields, methods which do not take into account the details of the location and intensity of precipitation. This assumption allows climate indices to be used to evaluate climate scenarios. The analysis is carried out in two ways. The first analyzes the absence or occurrence of precipitation using the principle of the nearest neighborhood: the properties of precipitation at a grid point are replaced by the properties of the field in the vicinity of that point. The percentage of days with precipitation exceeding the threshold of 1 mm in the EOBS data, the ENSEMBLES Observations gridded dataset (Haylock et al. 2008), is then compared with that for ERA40, the reanalysis of meteorological observations produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) (Uppala et al. 2005), and with that for the model scenarios. The evaluation is done by comparing the fields of the percentage of wet days. The maximum daily total precipitation is also analyzed.

The second method is based on cluster analysis, in which the value of precipitation is replaced by a class of precipitation defined by thresholds. This method identifies objects consisting of points of the precipitation field that have a similar precipitation class and are located close to each other. Each such object is represented by its exemplar, that is, by a location and a class of precipitation. We then compare the characteristics of the occurrence of these objects in the subregions of the domain.

2 Methods

The method based on the principle of the nearest neighborhood analyzes precipitation at a grid point by considering the values in its immediate vicinity, which consists of nine points. Given the resolution of the data (0.25°), this neighborhood is a 50-km × 50-km rectangle and is the smallest symmetric neighborhood of a point in the grid. The method compares the precipitation fields using two climate indices, both calculated with the program Climate Data Operators (CDO 2015): eca_rr1, the wet day index per time period (the number of days on which RR is at least 1 mm), and eca_rx1day, the highest 1-day precipitation amount per time period (the maximum daily sum of precipitation in the period). Here RR denotes the daily sum of precipitation, and a wet day is a day with RR ≥ 1 mm. The index eca_rr1 is used to evaluate the percentage of wet days. Wet days are defined on the basis of the average values in the neighborhood. Further analysis compares the possibility of occurrence of areas in which the number of wet days $N_{\mathrm{wet}}$ exceeds 50 % or 33 % of the total number of days $N$. The probabilities of exceeding these thresholds, $p_{50}$ and $p_{33}$, are calculated at each grid point from the values of $N_{\mathrm{wet}}$ at the $K$ points in the neighborhood. It is assumed that $p_{50} = k_{50}/K$, where $k_{50}$ is the number of grid points in the neighborhood for which $N_{\mathrm{wet}} > 0.5\,N$ ($p_{33}$ and $k_{33}$ are defined analogously). The maps of $p_{33}$ and $p_{50}$ are then compared. The index eca_rx1day is used to assess the probability that the maximum precipitation RRmax is greater than 50 mm. In this case, the analysis refers to the value of eca_rx1day at the $K$ points in the neighborhood. The probability that RRmax exceeds the threshold of 50 mm is $p^{50}_{\max} = k^{50}_{\max}/K$, where $k^{50}_{\max}$ is the number of points for which eca_rx1day > 50 mm.

The maps of the spatial distribution of $p_{33}$, $p_{50}$, and $p^{50}_{\max}$ are evaluated visually and with the fractional skill score FSS (Ebert 2008) (Fig. 1). The comparison of precipitation fields using these tools is presented in Section 4.
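
The neighborhood probability described above can be illustrated with a minimal Python sketch. It assumes that the input is a rectangular grid of wet-day fractions (the number of wet days divided by the total number of days) and that, at the edge of the domain, the neighborhood is simply truncated to the points that exist; the text does not state how edges are handled, so this is an assumption. The function name and argument names are hypothetical.

```python
def neighborhood_probability(wet_fraction, threshold=0.5):
    """Return, for each grid point, k/K: the share of points in its
    3x3 neighborhood whose wet-day fraction exceeds `threshold`.

    Assumption: near the domain edge the neighborhood is truncated,
    so K can be smaller than 9.
    """
    n_rows, n_cols = len(wet_fraction), len(wet_fraction[0])
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            k = 0  # neighborhood points above the threshold
            K = 0  # neighborhood points inside the domain
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n_rows and 0 <= jj < n_cols:
                        K += 1
                        if wet_fraction[ii][jj] > threshold:
                            k += 1
            result[i][j] = k / K
    return result


# One point with more than 50 % wet days in an otherwise drier field.
grid = [[0.6, 0.4, 0.4],
        [0.4, 0.4, 0.4],
        [0.4, 0.4, 0.4]]
p50 = neighborhood_probability(grid, threshold=0.5)
```

The same function gives the 33 % map with `threshold=0.33`, and it can be applied to a field of eca_rx1day values with `threshold=50` to obtain the probability of maximum precipitation above 50 mm.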

Fig. 1 Spatial distribution of the percentage of wet days ($N_{\mathrm{wet}}$: RR ≥ 1 mm)

The motivation for using cluster analysis to verify the precipitation fields reconstructed by regional climate models (RCMs) is to find a way to compare the very general properties of these fields. The first generalization concerns the daily total precipitation: the analysis is applied to data categorized by threshold values. The values of daily total precipitation RR are divided into five categories:

  • No rain: RR < 0.1 mm (for this case variable cl = 0)

  • Very light rain: 0.1 mm ≤ RR < 1 mm (cl = 10)

  • Light rain: 1 mm ≤ RR < 5 mm (cl = 20)

  • Moderate rain: 5 mm ≤ RR < 10 mm (cl = 30)

  • Strong rain: RR ≥ 10 mm (cl = 40)

In this way, the field (lon, lat, RR) is transformed into (lon, lat, cl). Cluster analysis applied to these data then creates clusters, i.e., groups of points that lie close to each other and have the same precipitation class. It is difficult to identify in advance the appropriate number of clusters for every precipitation field. Marzban and Sandgathe (2008) used hierarchical cluster analysis for forecast verification. In their application, the number of clusters varied from the total number of grid points in the data down to 1, and these numbers were interpreted as the resolution of the analysis. Konca-Kedzierska (2012) commented on the selection of the clustering procedure and showed the need to choose the number of clusters individually for each field. The method chosen here is “affinity propagation” (Frey and Dueck 2007), which takes measures of similarity between pairs of data points as input and simultaneously considers all data points as potential exemplars. Affinity propagation is implemented in R in the package APCluster (Bodenhofer et al. 2011). This method selects the number of clusters individually, as suits the particular field of precipitation classes.
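
The transformation of (lon, lat, RR) into (lon, lat, cl) can be sketched in a few lines of Python. The thresholds and class codes are those listed above; the function names and the representation of a field as a list of tuples are assumptions made only for illustration, and the clustering step itself is not reproduced here.

```python
from bisect import bisect_right

# Lower bounds (mm) of the classes cl = 10, 20, 30, 40.
THRESHOLDS_MM = [0.1, 1.0, 5.0, 10.0]


def precipitation_class(rr_mm):
    """Map a daily precipitation total (mm) to its class code cl."""
    # bisect_right counts how many lower bounds are <= rr_mm,
    # which is the class index 0..4; the codes are multiples of 10.
    return 10 * bisect_right(THRESHOLDS_MM, rr_mm)


def classify_field(field):
    """Transform a list of (lon, lat, RR) into a list of (lon, lat, cl)."""
    return [(lon, lat, precipitation_class(rr)) for lon, lat, rr in field]


# Hypothetical grid points with daily totals in mm.
sample = [(19.0, 52.0, 0.0), (19.25, 52.0, 3.2), (19.5, 52.0, 12.5)]
classified = classify_field(sample)
```

Because each class includes its lower bound, a value exactly on a threshold (for example RR = 10 mm) falls into the higher class.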

The results of clustering are used to assess a very general view of the spatial distribution of precipitation classes. The result of the clustering procedure is a set of clusters, each represented by its exemplar ($\mathrm{lon}_{\mathrm{exp}}$, $\mathrm{lat}_{\mathrm{exp}}$, $\mathrm{cl}_{\mathrm{exp}}$), i.e., its location and precipitation class. As shown in Fig. 2, the domain has been split into four areas: northeast (NE), northwest (NW), southeast (SE), and southwest (SW). Other divisions are of course possible, based on criteria other than the geography of the domain, for example, an administrative division. It is assumed that precipitation RR ≥ 10 mm occurs in a subregion if that subregion contains an exemplar with cl = 40. Each of the four areas either contains such an exemplar or does not, so, depending on the positions of the exemplars, each day of the period 1971–1990 can be classified into one of $2^4 = 16$ situations, for example:

  • 00: exemplars with cl = 40 do not appear in any of the regions

  • NE: exemplars with cl = 40 are found only in the area NE

  • NW: exemplars with cl = 40 are found only in the area NW

  • NESE: exemplars with cl = 40 are found in the areas NE and SE

  • NENWSESW: exemplars with cl = 40 are found in all areas
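
The classification of a day into one of these situations can be sketched as follows. The exact boundaries between the four areas are shown only in Fig. 2, so the sketch assumes, purely for illustration, that the domain is split at its midpoints (18.5° E and 51.5° N); these split values and all function names are hypothetical.

```python
from itertools import combinations

# ASSUMPTION: the domain is split at its midpoints; the real
# boundaries are those of the division shown in Fig. 2.
LON_SPLIT = 18.5
LAT_SPLIT = 51.5
REGION_ORDER = ("NE", "NW", "SE", "SW")  # order used in the labels


def region_of(lon, lat):
    """Return the area (NE, NW, SE, or SW) containing a point."""
    north_south = "N" if lat >= LAT_SPLIT else "S"
    east_west = "E" if lon >= LON_SPLIT else "W"
    return north_south + east_west


def situation_label(exemplars, target_class=40):
    """Label one day from its exemplars, given as (lon, lat, cl)."""
    hit = {region_of(lon, lat)
           for lon, lat, cl in exemplars if cl == target_class}
    if not hit:
        return "00"
    return "".join(region for region in REGION_ORDER if region in hit)


def all_situation_labels():
    """Enumerate the 16 possible labels: "00" plus every non-empty
    combination of the four areas."""
    labels = ["00"]
    for size in range(1, len(REGION_ORDER) + 1):
        for combo in combinations(REGION_ORDER, size):
            labels.append("".join(combo))
    return labels


# Hypothetical day: strong-rain exemplars in the NE and SE areas.
day = [(21.0, 53.0, 40), (22.0, 49.5, 40), (15.0, 50.0, 20)]
label = situation_label(day)  # "NESE"
```

Counting the labels over all days of a data set gives the histogram of situations that is compared between the observational and model data.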

Fig. 2 Division of the domain into four regions (NE, NW, SE, and SW) that correspond to its geographical parts

The next step is to compare the frequency of these situations in the observational and model data. This makes it possible to use the methodology for examining the agreement of contingency tables. Furthermore, as a measure of the difference between histograms, we calculate the sum of the percentage differences in the frequencies of the histogram classes, $\mathrm{TSD}^m$, defined by the following formula:

$$ \mathrm{TSD}^m=\frac{1}{N}\sum_{i=1}^{16} n_i^m\times 100 $$

where

$i = 1,\dots,16$ indexes the possible locations of cl = 40 (“00,” “NE,” “NW,” …, “NENWSESW”)

$m$ = ERA40, REF, DMI is the indicator of the analyzed model

$n_i^m = \left|N_i^{\mathrm{EOBS}} - N_i^m\right|$ is the absolute value of the difference in the number of cases

$N_i^{\mathrm{EOBS}}$ is the number of cases for the $i$th location in the EOBS data

$N_i^m$ is the number of cases for the $i$th location in the data of model $m$

This method does not take into account the co-occurrence of events; it only compares the fit of the distributions, which is sufficient to evaluate a climate model. A detailed analysis of the occurrence of clusters with precipitation class cl = 40 is given in Section 5.
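
A minimal Python sketch of the TSD calculation is given below. The normalizing constant N is not defined explicitly in the text; the sketch assumes that it is the total number of days (cases) in the EOBS data unless another value is passed. The function name, the dictionary representation of the histograms, and the example counts are hypothetical.

```python
def tsd(counts_eobs, counts_model, n_total=None):
    """Sum of absolute differences between two histograms of
    situations, as a percentage of the total number of cases.

    counts_eobs, counts_model: dicts mapping a situation label
    (e.g. "00", "NE", "NESE") to a number of days.
    n_total: ASSUMED to be the number of days in the EOBS data
    when not given explicitly.
    """
    if n_total is None:
        n_total = sum(counts_eobs.values())
    labels = set(counts_eobs) | set(counts_model)
    abs_diff = sum(abs(counts_eobs.get(label, 0) - counts_model.get(label, 0))
                   for label in labels)
    return 100.0 * abs_diff / n_total


# Invented counts for three situations over 100 days.
eobs = {"00": 50, "SE": 30, "SW": 20}
model = {"00": 60, "SE": 20, "SW": 20}
print(tsd(eobs, model))  # prints 20.0
```

A value of 0 means that the two histograms are identical, and larger values mean a poorer fit; the index says nothing about whether the situations occur on the same days.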

3 Data

The study concerns precipitation fields reconstructed by regional climate models (RCMs) for the period 1971–1990. The domain is 13° E–24° E and 48° N–55° N, which surrounds Poland. The resolution of the data is 0.25°, meaning that the distance between grid points is about 25 km. The climate of the reference period is described by the EOBS data (Haylock et al. 2008) downloaded from the European Climate Assessment & Dataset project (http://eca.knmi.nl/dailydata/index.php). The reanalysis from the ERA-40 project (http://data-portal.ecmwf.int/) is also used as reference data. The climate scenarios are taken from the Polish project “Climatic service” (http://klimat.icm.edu.pl/serv_climate.php), which is based on data published in the ENSEMBLES project (http://www.ensembles-eu.org/), and from the project KLIMAT carried out at IMWM. Table 1 lists the selected RCM scenarios, together with references and the global circulation models (GCMs) from which the boundary conditions were taken.

Table 1 The list of models being compared in the method of the nearest neighborhood

4 The principle of the nearest neighborhood

The percentage of wet days is calculated from the climate index eca_rr1 of the program Climate Data Operators. The results, presented as maps of the percentage of wet days, are shown in Fig. 1. The distribution of the model DMI-HIRHAM5_ARPEGE is most similar to the spatial distribution of the observational EOBS data. Figure 1 shows that the models overestimate the percentage of wet days. For the observational EOBS data, the number of wet days does not exceed 50 % at any point in the domain. The same holds for the ERA40 data, whereas the models produce areas where the number of wet days exceeds 50 %. For the models DMI-HIRHAM5_ARPEGE and MPI-M-REMO, these unrealistic areas are very small and occur in mountainous terrain along the southern Polish border. The largest areas with the number of wet days exceeding 50 % are found for the models REF and RM5.1. The fractional skill score FSS is used to compare the maps of $p_{50}$ objectively. The values of FSS are shown in Table 2. The highest value of FSS, which means the best agreement with the EOBS data, is achieved by the model DMI-HIRHAM5_ARPEGE.

Table 2 The values of fractional skill score FSS for maps of probability of exceeding the threshold by climate indices for analyzed models compared to the EOBS data
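
For orientation, the fractional skill score in its commonly used form compares two fields of neighborhood fractions as FSS = 1 − Σ(P_model − P_ref)² / (ΣP_model² + ΣP_ref²). The Python sketch below applies this form to two probability maps, assuming that the maps of exceedance probability play the role of the fraction fields; whether the values in Table 2 were computed in exactly this way is not stated in the text. The function name is hypothetical.

```python
def fractional_skill_score(model_map, reference_map):
    """FSS between two equally shaped maps of fractions in [0, 1].

    Returns 1.0 for identical maps and 0.0 for maps with no overlap.
    Returns NaN when both maps are zero everywhere, where the
    score is undefined.
    """
    squared_error = 0.0
    reference_energy = 0.0
    for row_model, row_ref in zip(model_map, reference_map):
        for p_model, p_ref in zip(row_model, row_ref):
            squared_error += (p_model - p_ref) ** 2
            reference_energy += p_model ** 2 + p_ref ** 2
    if reference_energy == 0.0:
        return float("nan")
    return 1.0 - squared_error / reference_energy


# Invented 1x2 maps: the model places the event in the wrong cell.
print(fractional_skill_score([[1.0, 0.0]], [[0.0, 1.0]]))  # prints 0.0
```

Because the score is computed from neighborhood fractions rather than point values, a small displacement of a feature lowers it only gradually, which matches the intention of tolerating minor location errors.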

The probability $p_{33}$ that the number of wet days exceeds the threshold of 33 % is also examined. Nearly all of the models and the ERA40 data show $p_{33} = 1$ throughout the territory of Poland. The exception is the DMI-HIRHAM5_ARPEGE model, which shows a large area in the western part of Poland with $p_{33} = 0$, an area that is also found in the EOBS data.

Spatial distributions of RRmax are compared visually on maps of eca_rx1day. On this basis, we conclude that the maps for the models DMI_HIRHAM5_ARPEGE, DMI_HIRHAM5_BCM, and KNMI-RACMO2 are most similar to that for the observational EOBS data. However, neither DMI_HIRHAM5 model reproduces the area with RRmax > 100 mm along the western Polish border that occurs in the observational EOBS data. This area is only weakly reproduced by the KNMI-RACMO2 model. The next step is to determine $p^{50}_{\max}$, the probability that the maximum daily precipitation RRmax exceeds the threshold of 50 mm. It is calculated from eca_rx1day using the principle of the nearest neighborhood, and the resulting maps of $p^{50}_{\max}$ are then evaluated. The area in which $p^{50}_{\max}$ is close to 1 is too large for all models. The visual assessment indicates that the spatial distribution for the DMI-HIRHAM5_BCM model is most similar to that of the observational EOBS data. For the ERA40 data, $p^{50}_{\max}$ is equal to zero in almost the entire domain, whereas for the EOBS data there are significant areas with $p^{50}_{\max} = 1$ in the northern, western, and southern parts of the domain. The field $p^{50}_{\max}$ is evaluated objectively using the fractional skill score FSS, and the results are shown in Table 2. The highest FSS values (FSS > 0.6) are achieved by the three aforementioned models, while the ERA40 data have the lowest value, FSS = 0.24. Together, climate indices, the principle of the nearest neighborhood, and the fractional skill score FSS define a method of comparing climate scenarios. The results obtained with FSS are consistent with the subjective assessment of the climate index maps.

5 The cluster analysis method

The ERA40 reanalysis and the REF data, obtained at IMWM from the RegCM model, are selected for comparative analysis. Of the other models, DMI_HIRHAM5_ARPEGE is also selected because it has the highest FSS value obtained in Section 4. In this section, the model DMI_HIRHAM5_ARPEGE is denoted by the acronym DMI.

The cluster analysis method is used to analyze the spatial occurrence of significant precipitation. A selected class of precipitation is defined to occur in a subregion of the domain if an exemplar with this class is located in that subregion. The analysis concerns the occurrence of class cl = 40, which means precipitation of at least 10 mm. Situations in which the whole domain has no significant precipitation (no exemplar with class cl = 40) are numerous, and their number differs significantly between the data sets. As shown in Fig. 3, the number of such cases is highest for the ERA40 data, nearly twice that for EOBS. The number of such situations for the DMI model (3965) is almost equal to that for the EOBS data (3972); the difference is only 7 days.

Fig. 3 The number of cases of absence of significant precipitation in the domain

The occurrence of class cl = 40 is analyzed using contingency tables. Pearson’s chi-squared ($\chi^2$) test is performed, and measures of association such as the contingency coefficient, the Phi-coefficient, and Cramér’s V are calculated (Meyer et al. 2012).
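
The measures named above follow from the Pearson statistic by standard formulas: with n the total number of cases, the Phi-coefficient is √(χ²/n), the contingency coefficient is √(χ²/(χ² + n)), and Cramér’s V is √(χ²/(n · min(r − 1, c − 1))) for a table with r rows and c columns. The Python sketch below computes them for a table of counts, assuming no continuity correction and non-zero marginal totals; the function name and the example table are hypothetical, and the p-value of the test is not computed.

```python
from math import sqrt


def association_measures(table):
    """Pearson chi-squared statistic and derived association measures
    for a contingency table given as a list of rows of counts.

    Assumes every row total and column total is non-zero and applies
    no continuity correction.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_sums), len(col_sums)) - 1
    return {
        "chi2": chi2,
        "phi": sqrt(chi2 / n),  # usually reported for 2x2 tables only
        "contingency": sqrt(chi2 / (chi2 + n)),
        "cramers_v": sqrt(chi2 / (n * k)),
    }


# Invented 2x2 table: rows = reference ("00", "~00"), columns = model.
measures = association_measures([[10, 20], [20, 10]])
```

For a 2 × 2 table, Cramér’s V equals the Phi-coefficient, so the two measures differ only for the three-class and 16-class tables.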

Three selected situations of coexistence of class cl = 40 in the observed EOBS data and the model data are explored. The first concerns the absence of class cl = 40 in the whole domain (situation denoted by “00”) and the opposite event, the appearance of this class anywhere in the domain (situation denoted by “∼00”). Figure 4 shows Bangdiwala’s agreement plots, which provide a simple graphic representation of the strength of agreement in a contingency table. The agreement is strongest for the ERA40 data because, in this case, the black areas in the figure, which measure co-occurrence, are the largest. The numbers in Bangdiwala’s agreement plots in Fig. 4 are the rounded marginal percentages of the situations “00” and “∼00.” Table 3 contains the exact percentage distributions for the contingency tables that correspond to the agreement plots in Fig. 4. Marginal homogeneity is shown directly by the position of the dark squares relative to the diagonal line. In Fig. 4, we see that the ERA40 data overestimate situation “00” (85.4 %), whereas the REF data underestimate it (34.1 %). The DMI model data show marginal homogeneity, because in this case all squares are located on the diagonal line. This means that the marginal percentages for the EOBS and DMI data are identical. The highest values of several measures of strength of agreement are obtained for the ERA40 data, for which the co-occurrence of ERA40 and EOBS reaches 67.8 %. The lowest values are obtained for the DMI data, for which the co-occurrence of the classes “00” and “∼00” is lowest (49.4 %). Only for DMI does the $\chi^2$ test (at the significance level of 0.01) give no reason to reject the hypothesis of independence from the EOBS data.

An analogous analysis is performed for the division of the data into three categories: “00” for the absence of cl = 40 in the whole domain, “S” for cl = 40 located in the southern part of the domain (this part includes mountains), and “∼S” for cl = 40 located in the rest of the domain. Bangdiwala’s agreement plots for this case are shown in Fig. 5. The numbers in the plots in Fig. 5 are the numbers of cases. The plot for ERA40 shows a strong underestimation of the number of occurrences of cl = 40 in the southern part of the domain. What is reconstructed is correct, but it amounts to only 24 % of the occurrences of cl = 40 in this part of the domain in the EOBS data. For the EOBS data, the largest group is “00” (55 %), while for the REF data, the largest group is “S” (48 %). The REF model overestimates the number of cases of significant precipitation in the southern part of the domain by about 33 %. The DMI model underestimates the number of situations “S” by 6 %. The measures of association for these contingency tables have values similar to those for the two-class tables “00” and “∼00.” Similar calculations are also performed for all 16 classes of possible locations of significant precipitation cl = 40. The contingency coefficient in this case is 0.7 for ERA40, 0.23 for REF, and 0.18 for the DMI model. ERA40 appears to be the best model, but it strongly overestimates the absence of significant precipitation “00,” by approximately 57 %. Taking into account the co-occurrence of events, the percentages of conformity with the EOBS data (in Bangdiwala’s agreement plots, this corresponds to the total area of the black rectangles) are as follows: ERA40 59 %, model DMI 32 %, and model REF 25 %. On the other hand, the analysis shows that the highest marginal homogeneity occurs for the model DMI, and this criterion is more important from the point of view of climatology.
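
The two aspects contrasted above, co-occurrence and marginal homogeneity, can both be read from a square contingency table. The Python sketch below assumes that rows are the EOBS categories and columns are the model categories in the same order. It reports the share of cases on the diagonal, the marginal percentages of both data sets, and Bangdiwala’s B statistic, B = Σ n_ii² / Σ n_i+ n_+i, which is the ratio of the black area to the area of the rectangles in an agreement plot. Which of these quantities corresponds to the percentages quoted in the text is an assumption; the function name and the example counts are hypothetical.

```python
def agreement_summary(table):
    """Summarize agreement in a square contingency table of counts
    (rows: reference categories, columns: model categories)."""
    size = len(table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    diagonal = [table[i][i] for i in range(size)]
    return {
        # Share of cases in which both data sets show the same category.
        "co_occurrence_pct": 100.0 * sum(diagonal) / n,
        # Bangdiwala's B: black area over the area of the rectangles.
        "bangdiwala_b": sum(d * d for d in diagonal)
        / sum(r * c for r, c in zip(row_sums, col_sums)),
        # Equal marginal percentages indicate marginal homogeneity.
        "reference_marginals_pct": [100.0 * r / n for r in row_sums],
        "model_marginals_pct": [100.0 * c / n for c in col_sums],
    }


# Invented 2x2 table for the situations "00" and "~00".
summary = agreement_summary([[40, 10], [10, 40]])
```

In this invented example the marginal percentages are identical (50 % and 50 % in both data sets) although one case in five is off the diagonal, which illustrates how marginal homogeneity can hold without full co-occurrence.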

Fig. 4 Bangdiwala’s agreement plots for the occurrence (∼00) and absence (00) of class cl = 40. The numbers correspond to the percentage of cases

Table 3 Contingency tables with the percentages for the analysis of the occurrence or absence of class cl = 40

Fig. 5 Bangdiwala’s agreement plots for three locations of cl = 40: “00” (absence), “S” (in the southern part), and “∼S” (in the rest of the domain). The numbers in the graph correspond to the number of cases

Figure 6 shows histograms of the location of precipitation events of class cl = 40 for all 16 locations. The histograms for both models, REF and DMI, fit better than the histogram for the ERA40 reanalysis. Visual assessment shows that the histogram most similar to the EOBS histogram is that of the DMI model. In the histograms for the EOBS and DMI data, apart from the case “00,” there are two clearly predominant locations, namely the southern parts of the domain, SE and SW. The values of the TSD index for the analyzed data are $\mathrm{TSD}^{\mathrm{ERA40}} = 64$ %, $\mathrm{TSD}^{\mathrm{REF}} = 42$ %, and $\mathrm{TSD}^{\mathrm{DMI}} = 14$ %, which confirms that the best fit is for the model DMI.

Fig. 6 Histograms of the frequency distribution of precipitation class cl = 40 in the regions into which the domain was divided. 00, NE, NW, …, NENWSESW describe the possible coexistence of class cl = 40 in the subregions

6 Conclusion

This paper considers two ways to assess whether precipitation fields are compatible without comparing their values at the grid points. The first method (Section 4) compares the properties of the fields in the neighborhood of a grid point instead of the value at that point. The probabilities of exceeding the thresholds for the number of wet days (daily total RR ≥ 1 mm) and for maximum precipitation are determined from climate indices in the neighborhood. The correctness of the reconstruction of past climate is assessed by comparing the maps for the EOBS data and the regional models. The results of the visual assessment are confirmed by the values of the fractional skill score FSS. The best rating in this analysis is obtained by the DMI_HIRHAM5_ARPEGE model. Furthermore, the analysis of the number of wet days shows that the models overestimate this number, especially in mountainous areas along the southern Polish border. In contrast to the observed EOBS data, the models produce regions in which the number of wet days exceeds 50 %. The analysis with the climate index eca_rx1day shows that the regional climate models (RCMs) estimate RRmax better than the ERA40 reanalysis.

The second method (Section 5) examines the location of the cluster representatives within the assumed division of the domain. The analysis of the occurrence of significant precipitation (daily sum of precipitation RR ≥ 10 mm) shows that the ERA40 data overstate the absence of such precipitation in the entire domain, which amounts to 86 %. For the EOBS data, cases with no significant precipitation amount to 54 %. All these cases are accurately reconstructed in the ERA40 data. For this reason, the measures of agreement in the contingency tables and Bangdiwala’s agreement plots, which are sensitive to the simultaneity of events, are best for the ERA40 data. Co-occurrence of events is not relevant for climate assessment, so it is sufficient to evaluate the histograms of the exemplar locations. In this evaluation, the histogram for the model DMI_HIRHAM5_ARPEGE is closest to the histogram for EOBS. The difference between these histograms is estimated at 14 %. Categorizing the data allows the scenarios and the reference fields to be compared for different levels of precipitation. Further research is planned on the use of cluster analysis for the other classes of precipitation, which will allow a full analysis of the model fit.

The presented examples provide only a very general description of the time series of precipitation fields reconstructed by an RCM. Nevertheless, they make it possible to assess which models better reconstruct precipitation fields for the reference period.