Spatial gridding of daily maximum and minimum temperatures in Europe
DOI: 10.1007/s00703-011-0160-x
- Cite this article as:
- Krähenmann, S., Bissolli, P., Rapp, J. et al. Meteorol Atmos Phys (2011) 114: 151. doi:10.1007/s00703-011-0160-x
Abstract
We developed an operationally applicable land-only daily high-resolution (5 km × 5 km) gridding method for station observations of minimum and maximum 2 m temperature (T_{min}/T_{max}) for Europe (WMO region VI). The method involves two major steps: (1) the generation of climatological T_{min}/T_{max} maps for each month of the year using block regression kriging, which considers the spatial variation explained by applied predictors; and (2) interpolation of transformed daily anomalies using block kriging, and combination of the resulting anomaly maps with climatological maps. To account for heterogeneous climatic conditions in the estimation of the statistical parameters, these steps were applied independently in overlapping climatic subregions, followed by an additional spatial merging step. Uncertainties in the gridded maps and the derived error maps were quantified: (a) by cross-validation; and (b) comparison with the T_{min}/T_{max} maps estimated in two regions having very dense temperature observation networks. The main advantages of the method are the high quality of the daily maps of T_{min}/T_{max}, the calculation of daily error maps and computational efficiency.
1 Introduction
Monitoring temperature extremes is important because they have major impacts on the economy, ecology and human life. Estimating the 2 m minimum and maximum temperature (T_{min}/T_{max}) at high spatial resolution is particularly important. The daily T_{min}/T_{max} is often of more interest than the daily mean temperature (e.g., as an indicator of frost or heat stress in agriculture). However, T_{min}/T_{max} data are usually available from fewer monitoring stations, and the values are less normally distributed than mean temperature values. Therefore, the gridding of T_{min}/T_{max} is more challenging than gridding mean temperature, and so is the focus of this study.
For high-resolution gridding (25 km²) in Europe, at least 40,000 stations would be needed if a density of one station per grid square was required. However, as only a few thousand stations are available, it is essential to generate an appropriate interpolation algorithm that most efficiently uses the available station information and additional predictor information (such as orography or continentality).
A number of studies have investigated the interpolation of daily climate variables to create regional or global data sets. Regression is a well-established interpolation technique that can be used in a variety of ways. Stahl et al. (2006) compared 12 variations of regression and kriging methods for interpolating daily maximum and minimum temperatures in British Columbia (Canada). Gradient plus inverse distance squared (GIDS) methods, which are based on multiple linear regression against predictors, yielded the best results where a high-density station network was available. However, it was noted that methods that determine regression functions locally (such as GIDS) should not be applied where the observed predictor range is too restrictive (e.g., no observations at high elevations). In such cases, the use of ordinary kriging (OK) was recommended.
Numerous studies have shown that a number of very similar techniques are suitable for interpolating temperature and other climate variables (Jarvis and Stewart 2001; Stahl et al. 2006). In universal kriging (UK), the trend of a variable is modeled as a function of the spatial coordinates. If the trend is defined as a linear function of the predictors, it is referred to as kriging with external drift (KeD). Alternatively, the estimation of trends and residuals can be undertaken separately and combined later in a process termed regression kriging (RK), which was proposed by Ahmed and de Marsily (1987) and Odeh et al. (1995). Regression kriging enables the application of nonlinear dependence to predictors. In cases of linear dependence, Hengl et al. (2007) demonstrated the equivalence of KeD and RK. In the present study, RK was preferred because of its greater computational efficiency and robustness.
The quality of the interpolation method is influenced by the explanatory predictors considered, such as elevation (Goovaerts 2000), aspect ratio and distance to the coast (Daly et al. 2002; Hiebl et al. 2009), seasonality and weather conditions (Hewitson and Crane 2005). The unexplained spatial variability of a variable can be considered in an additional step by interpolation of anomalies using other methods (New et al. 2001).
Haylock et al. (2008) described a high-resolution gridded data set of daily precipitation and daily 2 m T_{min}/T_{max} for Europe. They used a three-step interpolation method involving: (a) interpolation of climatological monthly values with three-dimensional thin-plate smoothing splines; (b) kriging interpolation of the daily anomalies with respect to the monthly climatologies; and (c) adding the interpolated anomalies to the climatologies to produce the final result. Splitting of the interpolation process facilitates selection of the most appropriate and efficient method with regard to time and space scales.
Section 2 briefly introduces the station data used in the study, Section 3 gives an overview of the predictor data, and Section 4 describes the new algorithm and explains the method for calculating the uncertainty. Section 5 presents the results and an evaluation of the new method, and Section 6 concludes.
2 Station data
This section briefly describes the station data sets used in the interpolation exercise, and the data sets used for evaluation of the new algorithm. To estimate the monthly climatologies, we used data from approximately 3,000 CLIMAT stations (Hoefrichter 2009) providing long-term averages (1961–1990) of monthly mean daily T_{min} and T_{max} observations. The CLIMAT data set has been thoroughly quality controlled and inhomogeneities have been corrected, making this data set a reliable basis for our analysis.
The daily anomalies were based on data from approximately 3,000 synoptic stations (SYNOP). For the period 2005–2008, only about 33% of SYNOP and CLIMAT stations were co-located, as a consequence of the addition of new stations and the removal of other stations since 1990. Data from SYNOP stations are available in near real time, but are not subject to a high level of quality control. Therefore, we introduced a degree of quality control by excluding daily observations that deviated by more than ±5 standard deviations from the averaged regional anomalies.
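The screening rule described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `screen_anomalies`, the `(station_id, anomaly)` input layout, and the choice to compute the regional mean and standard deviation from all stations of one subregion on one day are assumptions made for the example.

```python
from statistics import mean, stdev


def screen_anomalies(anomalies, n_sigma=5.0):
    """Split (station_id, anomaly) pairs into kept and rejected lists.

    An observation is rejected when it deviates from the regional mean
    anomaly by more than n_sigma regional standard deviations.
    (Assumption: mean and spread are taken over all stations passed in.)
    """
    values = [a for _, a in anomalies]
    centre = mean(values)
    spread = stdev(values)
    kept, rejected = [], []
    for station_id, a in anomalies:
        if spread > 0 and abs(a - centre) > n_sigma * spread:
            rejected.append((station_id, a))
        else:
            kept.append((station_id, a))
    return kept, rejected
```

Because the suspect value itself inflates the standard deviation, a rule of this kind can only flag outliers when the region contains a few dozen stations or more.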
The density of the CLIMAT and SYNOP stations is variable, with the highest density occurring in Central Europe and the lowest on the Greenland ice sheet (Fig. 1). Approximately 20 CLIMAT and 20 SYNOP stations are located along the southern coastline of Greenland. No CLIMAT data are available for the entire period 1961–1990 for high-elevation areas including the Greenland ice sheet, and currently there is only one SYNOP station on the ice sheet (at the Summit, 3,208 m). Despite the sparse station network in Greenland, we included it in the study area because this provided the opportunity to conduct a feasibility study on the use of our method in remote areas.
To evaluate the new algorithm for daily T_{min} and T_{max} we used data from two independent regional observation networks having very high spatial resolution. One, situated in southern Austria, has been in operation since January 2007 and comprises 200 stations covering an area of 29 km × 14 km (Kabas et al. 2008). The other network was situated in the Black Forest, Germany, and was operated in summer 2007. It comprises 96 stations covering an area of 8 km × 12 km (Schneider et al. 2008).
3 Predictors and predictor data sets
In the gridding method, the target variable (mean monthly T_{min} or T_{max}) was linearly regressed against multiple predictors. Useful predictors have to be physically related to temperature, readily available and applicable to the entire interpolation domain of interest, and statistically robust in the regression process. These criteria exclude some apparently good predictors. For example, objective weather classifications (e.g., Bissolli and Dittmann 2001) are only regionally defined, so they are not applicable to the entire domain and are therefore not appropriate predictors. Satellite data are potentially useful for detecting cloud and fog, but thin high clouds can screen low fog layers (which may be associated with temperature inversions). Thus, satellite products are not robust predictors, and their period of availability is limited. Hiebl et al. (2009) accounted for urban effects in the generation of a high-resolution climatology in an alpine region, but concluded that there was no general, operationally usable relationship between urban effects and temperature at the daily time scale.
Inversion is the reversal of the temperature lapse rate and leads to strongly modified regional temperature patterns. It predominantly occurs in winter and is often accompanied by fog. Fog prevents marked cooling during the night by limiting the emission of heat, and reduces warming during the day by reflecting solar radiation. Therefore, fog weakens the daily temperature cycle. We tested the inversion index proposed by Daly et al. (2002), but it was not robust and consequently was not used. The selected predictors are described below.
Elevation (Fig. 1a) correlates well with surface temperature, is globally available and is thus a useful predictor. We used high-resolution elevation data obtained from the shuttle radar topography mission (SRTM; see http://dds.cr.usgs.gov/srtm/version2_1/SRTM3/), which provides data within 60° north and south, and represents an improvement on previous high-quality, high-resolution digital elevation model (DEM) products (Jarvis et al. 2004). Poleward of this latitude, the data set is complemented by data provided by the United States Geological Survey (USGS; see http://eros.usgs.gov/#/Find_Data/Products_and_Data_Available/gtopo30_info). The original grid spacing of the SRTM data is about 90 m (the gtopo30 grid spacing is approximately 1,000 m) at the equator and increases toward the poles. The two data sets were aggregated to the target resolution (5 km × 5 km) by calculating the average of all grid values within each target grid box. The two DEMs were then merged by linearly weighting across an overlapping area extending 100 km southwards from 60°N.
Climate is strongly affected by the land–sea distribution. We represented this using the continentality index \( K = 1.7\frac{A}{\sin \varphi } - 20.4 \) (Gorczynski 1920) as a predictor. This index is based on the geographical latitude (φ) and the mean annual range of monthly temperatures over the period 1961–1990 (A in °C). The index usually takes values from 0 to 100, with lower values indicating a maritime climate and higher values indicating a continental climate. We derived this predictor (Fig. 1b) from the Climatic Research Unit (CRU) data (CRU TS 3.00; Mitchell and Jones 2005) for the period 1961–1990.
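As a small worked illustration, the continentality index can be evaluated directly from the formula above. The function name `gorczynski_continentality` and the use of the absolute latitude (so that the sketch also accepts southern latitudes) are assumptions of this example and are not taken from the paper.

```python
import math


def gorczynski_continentality(annual_range_c, latitude_deg):
    """Continentality index K = 1.7 * A / sin(phi) - 20.4.

    annual_range_c: mean annual range of monthly temperatures A (in deg C).
    latitude_deg:   geographical latitude phi in degrees.
    Assumption: the absolute latitude is used; the index is undefined
    at the equator, where sin(phi) is zero.
    """
    sin_lat = math.sin(math.radians(abs(latitude_deg)))
    if sin_lat == 0.0:
        raise ValueError("continentality index is undefined at the equator")
    return 1.7 * annual_range_c / sin_lat - 20.4
```

For a fixed latitude the index grows linearly with the annual temperature range, which is why larger values indicate a more continental climate.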
Geographical latitude is a potential predictor of the poleward decrease of mean temperature. However, we used the zonally averaged monthly CRU climate, which typically yields better cross-validation results as a predictor than geographical latitude (not shown).
RMSE (in °C) for temperature predictions using the predictor sets elevation (z), continentality index (K) and zonal mean temperature (b)
Predictor relation | January T_{min} | January T_{max} | July T_{min} | July T_{max} | Average |
---|---|---|---|---|---|
t ~ z | 6.1 | 5.8 | 3.3 | 4.4 | 4.9 |
t ~ K | 5.0 | 5.1 | 3.3 | 3.9 | 4.3 |
t ~ b | 5.6 | 4.8 | 2.5 | 3.2 | 4.0 |
t ~ K + b + z | 3.1 | 3.0 | 2.3 | 2.7 | 2.8 |
The addition of the inversion index as a predictor resulted in a small reduction of the RMSE. However, inversion is a local phenomenon that cannot be adequately captured when determining the regression coefficients for the climate of an entire subregion (overfitting). Furthermore, artifacts can occur at the inversion height. Thus, only elevation, zonal mean temperature and the continentality index were ultimately selected for inclusion in the study.
4 Gridding method
The gridding method involved a two-step process. Firstly, the climatological monthly T_{min}/T_{max} data were interpolated using block RK, which applied the selected predictors in a multiple linear regression step (Wackernagel 2003). The resulting product is termed a monthly prediction map. Secondly, block simple kriging (SK) was applied to interpolate the normal score transformed daily anomalies. The back-transformed daily anomaly map was added to the monthly prediction map, yielding the final result. These steps are detailed below.
The gridding was performed in a rotated geographical coordinate system with pole at 180°W and 38°N. This placed the center of the study area at the equator of the rotated coordinate system and yielded a quasi-equal area grid across the target region. This enabled application of an isotropic variogram in kriging, minimized the number of grid nodes necessary and maximized the numerical efficiency.
4.1 Block regression kriging of monthly observations
The regression coefficients necessary for block RK can be estimated from point data (observations and predictors at station locations) if the applied regression function is linear (Heuvelink and Pebesma 1999; Leopold et al. 2006). To account for different climates, we determined the regression coefficients separately for each climate region. The regression coefficients were then applied to block-averaged (25 km^{2}) predictor maps. The predictors, the coverage of the feature space and the covariance of the regression residuals (determined from the variogram, see below) have to be taken into account in determining the regression error (Hengl et al. 2003).
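The regression step can be sketched as an ordinary least squares fit at station locations, with the fitted coefficients then applied to block-averaged predictors. This is only an illustration under stated assumptions: the helper names (`solve_linear_system`, `fit_linear_trend`, `predict`) are hypothetical, the fit uses plain normal equations, and the residual kriging and regression error terms discussed in the text are not included.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear_trend(predictors, temps):
    """Least squares fit of t ~ b0 + b1*x1 + ... from station (point) data."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, temps)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, predictor_row):
    """Apply fitted coefficients to one row of (block-averaged) predictors."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], predictor_row))
```

Because the trend is linear in the predictors, applying the coefficients to block-mean predictors gives the same value as averaging point predictions over the block, which is why point data suffice for estimating the coefficients.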
To ensure the data were normally distributed, a normal score transformation (Deutsch and Journel 1998) was applied to the regression residuals prior to interpolation. For the kriging process, we used a spherical variogram model, which has a linear behavior near the origin and reaches the sill at the range beyond which autocorrelation becomes zero. For the necessary variogram estimate, we adopted a suboptimal but robust approach (Ahrens and Beck 2008). We estimated a climatological variogram range from normal score transformed monthly regression residuals (separately for T_{min}/T_{max} and each subregion). The range of the variogram of residuals was shorter (approximately 180 and 1,200 km for Central Europe and Greenland, respectively) than the monthly temperature range, which suggests that the trend had been removed. We also estimated a climatological nugget variance, which was 5–20% of the sill variance (dependent on the subregion, and the network density). However, as the nugget variation in the residual variogram reflects the trend estimation error within the regression step (Ali et al. 2005), it was generally not smaller than the nugget variance in the variogram of monthly temperatures.
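For reference, the spherical variogram model can be written as a short function. The sketch below uses the standard textbook form; the parameter names and the convention that `sill` denotes the total sill (nugget included) are assumptions of this example, and the numbers in the usage note are only illustrative values within the ranges quoted above.

```python
def spherical_variogram(h, nugget, sill, vrange):
    """Spherical semivariogram gamma(h).

    h:      separation distance (same unit as vrange, e.g. km)
    nugget: nugget variance
    sill:   total sill (nugget plus partial sill)
    vrange: range beyond which the variogram stays at the sill
    """
    if h <= 0.0:
        return 0.0
    if h >= vrange:
        return sill
    r = h / vrange
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
```

With a range of 180 km, a unit sill (as is natural for normal score transformed data) and a nugget of 10% of the sill, `spherical_variogram(90.0, 0.1, 1.0, 180.0)` evaluates the model at half the range.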
Following interpolation of the normal score transformed regression residuals using SK, a back-transformation was applied, which yielded a residual map that was added to the regression map. These subregional maps were merged using linear weighting in the 250 km-wide overlap, to yield the monthly prediction maps showing the mean over the period 1961–1990 for T_{min} and T_{max} for the WMO region VI.
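The merging of two neighbouring subregional maps by linear weighting can be illustrated for a single grid cell. This is a sketch under assumptions: the function names are hypothetical, and the weight is taken to vary linearly with the distance travelled into the overlap zone, measured from the edge nearest subregion A.

```python
def blend_weight(distance_into_overlap_km, overlap_width_km=250.0):
    """Weight given to subregion B: 0 at the edge nearest A, 1 at the far edge."""
    w = distance_into_overlap_km / overlap_width_km
    return min(1.0, max(0.0, w))


def merge_values(value_a, value_b, distance_into_overlap_km, overlap_width_km=250.0):
    """Linearly weighted combination of two subregional estimates for one cell."""
    w = blend_weight(distance_into_overlap_km, overlap_width_km)
    return (1.0 - w) * value_a + w * value_b
```

In the middle of the overlap both subregions contribute equally, so the merged field changes smoothly from one regional estimate to the other without a step at the boundary.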
Solving the kriging system provided the block SK error variance (Chilès and Delfiner 1999). This was used to produce the quartile map of the interpolated residuals, which were back-transformed. We used half of the inter-quartile range (IQR/2) as an error measure. The quartile map of the interpolated residuals and the regression error map were combined to yield the monthly RK quartile map, following the additive relation described by Hengl et al. (2003).
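One plausible reading of the quartile construction is sketched below: the kriged value and the kriging variance live in normal score space, the 25% and 75% quantiles are formed there, and both are mapped back to temperature units before taking half their difference. The rank-based transform with plotting position (k + 0.5)/n, the linear interpolation in the back-transformation, the clamping to the observed data range, and all function names are assumptions of this example rather than details given in the paper.

```python
from bisect import bisect_left
from statistics import NormalDist

STD_NORMAL = NormalDist()


def normal_score_table(values):
    """Sorted (score, value) pairs: the k-th smallest value is assigned
    the standard normal quantile of (k + 0.5) / n."""
    ordered = sorted(values)
    n = len(ordered)
    return [(STD_NORMAL.inv_cdf((k + 0.5) / n), v) for k, v in enumerate(ordered)]


def back_transform(score, table):
    """Map a normal score back to data units by linear interpolation,
    clamped to the range of the observed values."""
    scores = [s for s, _ in table]
    if score <= scores[0]:
        return table[0][1]
    if score >= scores[-1]:
        return table[-1][1]
    i = bisect_left(scores, score)
    (s0, v0), (s1, v1) = table[i - 1], table[i]
    return v0 + (v1 - v0) * (score - s0) / (s1 - s0)


def half_iqr(kriged_score, kriging_variance, table):
    """Half the inter-quartile range in data units for one grid cell."""
    sd = kriging_variance ** 0.5
    z75 = STD_NORMAL.inv_cdf(0.75)  # about 0.674
    q25 = back_transform(kriged_score - z75 * sd, table)
    q75 = back_transform(kriged_score + z75 * sd, table)
    return 0.5 * (q75 - q25)
```

Because the back-transformation is nonlinear, the two quartiles are generally not symmetric about the back-transformed estimate, which is one reason to report the quartile range rather than a standard deviation.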
4.2 Block simple kriging of the daily anomaly
The regionally interpolated daily anomaly maps were back-transformed, merged, and added to the monthly prediction map. This yielded the daily prediction maps for T_{min} and T_{max}.
The daily error variance estimated by block SK of the normal score transformed anomalies was used to produce the quartile map of the daily anomalies, which were back-transformed. As for the interpolation of the monthly residuals, we used IQR/2 as the error measure. The error maps of the monthly predictions and daily anomalies were then combined (Hengl et al. 2003).
Thus, the proposed method applies regression kriging for monthly data and simple kriging for daily anomalies, and is henceforth referred to as regression kriging kriging (RKK).
5 Results and discussion
This section illustrates the application of the block RKK gridding method and provides an evaluation of the temperature and error maps generated. An important part of the evaluation was a comparison with the E-OBS product (Haylock et al. 2008).
5.1 Temperature maps
The final steps in generating the daily products are illustrated in Fig. 5. The daily anomalies were interpolated using block SK, as shown in panel (a). Most of the large-scale variability was already adequately explained by the monthly product, but interpolation of the daily anomalies provided important additional information at the regional scale. The cold anomaly over Central Europe and the warm anomaly over Eastern Europe are particularly noteworthy. The daily anomaly map highlighted regions with marked daily temperature anomalies, which were then superimposed on the monthly mean product in the final daily product, shown in panel (b). Figure 5 also shows the uncertainty estimates for the gridded anomalies and the final product. The uncertainty of the final product was dominated by uncertainty in the daily anomaly interpolation (the daily anomalies were greater by one order of magnitude than the monthly regression residuals).
5.2 Cross-validation and comparison with simpler algorithms
We evaluated the performance of the RKK algorithm in three climate regions (Greenland, Central Europe and the Mediterranean) using cross-validation (Wackernagel 2003) and comparison with simpler but commonly used interpolation methods including ordinary kriging (OK) of daily observations and inverse distance weighting (IDW) (Ahrens 2006) of daily observations.
Cross-validation involves sequential removal of each observation from the observational data set, and re-estimation of the observed temperature from the observations in the amended data set using an interpolation method. Thus, the goal was to re-estimate point values at particular station locations. Therefore, these results were based on point interpolation, and not block interpolation. The evaluation criteria were RMSE (perfect score 0°C) and the ratio of the variance of interpolated temperatures and the variance of observed values (VARI, perfect score 1).
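The leave-one-out procedure and the two scores can be sketched as follows. Inverse distance weighting is used as the interpolator only because it is short to write down; it stands in for any point interpolation method and is not the RKK algorithm. The function names, the planar coordinates and the distance power of 2 are assumptions of this example.

```python
import math
from statistics import pvariance


def idw_estimate(target_xy, stations, power=2.0):
    """Inverse distance weighted estimate at target_xy.

    stations: list of ((x, y), value) pairs.
    """
    num = den = 0.0
    for (x, y), value in stations:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            return value
        w = d ** -power
        num += w * value
        den += w
    return num / den


def leave_one_out(stations, interpolate=idw_estimate):
    """Re-estimate every station value from all remaining stations."""
    observed, estimated = [], []
    for i, (xy, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        observed.append(value)
        estimated.append(interpolate(xy, others))
    return observed, estimated


def rmse(observed, estimated):
    """Root mean square error (perfect score 0)."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / len(observed))


def vari(observed, estimated):
    """Variance of estimates divided by variance of observations (perfect score 1)."""
    return pvariance(estimated) / pvariance(observed)
```

A VARI below 1 indicates that the interpolator smooths the field, whereas a value above 1 indicates that it adds spatial variability that is not present in the observations.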
Evaluation of RKK, OK and IDW for Central Europe in January and July 2007, using cross-validation for daily T_{min}/T_{max}
Method | January T_{min} RMSE | January T_{min} VARI | January T_{max} RMSE | January T_{max} VARI | July T_{min} RMSE | July T_{min} VARI | July T_{max} RMSE | July T_{max} VARI |
---|---|---|---|---|---|---|---|---|
Point RKK | 1.8 | 1.1 | 1.8 | 1.1 | 1.5 | 1.2 | 1.7 | 1.0 |
OK | 2.1 | 0.7 | 2.1 | 0.7 | 2.0 | 0.6 | 2.6 | 0.7 |
IDW | 2.1 | 0.8 | 2.2 | 0.7 | 2.0 | 0.7 | 2.7 | 0.7 |
Similarly, Table 2 demonstrates the superiority of RKK over OK and IDW (using Central Europe as an example) in terms of both RMSE and VARI. However, RKK overestimated the spatial variability, especially for T_{min} in July. This was a result of the spatial variability imposed by the chosen predictors in the regression step during interpolation.
5.3 Validation of the uncertainty measure
A key issue was to provide a robust uncertainty measure with the RKK, and for this purpose we proposed the IQR. Correct modeling of the local uncertainty implies that 50% of the true values are within the local IQR.
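The 50% criterion can be checked by counting how many observations fall between the predicted lower and upper quartiles. The sketch below is a minimal illustration; the function name `iqr_coverage` and the input layout (three equally long sequences of observations, lower quartiles and upper quartiles) are assumptions of this example.

```python
def iqr_coverage(observed, q25, q75):
    """Percentage of observations lying within [q25, q75] of the prediction.

    A well-calibrated uncertainty estimate yields a value close to 50.
    """
    hits = sum(1 for o, lo, hi in zip(observed, q25, q75) if lo <= o <= hi)
    return 100.0 * hits / len(observed)
```

Values well above 50 indicate that the predicted quartile range is too wide (uncertainty overestimated), and values well below 50 indicate that it is too narrow.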
To assess the uncertainty measure, the RKK was tested in two independent regions with dense station networks: the WegenerNet network in southern Austria and a network in the Black Forest, Germany. We used block RKK to estimate the daily T_{min} and T_{max} for each test region at a spatial resolution of 5 km × 5 km (observations aggregated to the grid of the block RKK).
Average percentage of observations falling into the IQR for daily T_{min} and T_{max} in January and July 2007 for Central Europe and the Mediterranean Sea
Month | Central Europe T_{min} | Central Europe T_{max} | Mediterranean Sea T_{min} | Mediterranean Sea T_{max} |
---|---|---|---|---|
January | 54 | 57 | 46 | 52 |
July | 47 | 56 | 44 | 50 |
Total | 51 | 56 | 45 | 51 |
However, the widths of the box plots indicate that there were large daily differences. The quality of the predicted temperature and its related uncertainty measure were associated with particular weather conditions, topographical complexity of the region and coverage by network observations. The RKK provided an uncertainty measure that on average (across the climate subregion, and over long time periods) yielded the target precision (50% of the observed values within the IQR), but for single day events at a local scale the uncertainty could be grossly overestimated or underestimated. This is also evidenced in Table 3, which shows that if using non-transformed data the method overestimated the uncertainty (about 80% of the observations fell within the IQR) in the regional observation network areas. Thus, error estimation remains uncertain.
5.4 Comparison with the E-OBS product
We compared maps estimated using the block RKK method with maps from a similar product (E-OBS; Haylock et al. 2008). As with the RKK, the E-OBS algorithm splits the interpolation process into two parts: (1) generation of a monthly map and (2) interpolation and addition of daily anomalies. E-OBS uses elevation and geographical coordinates as predictors by applying three-dimensional thin-plate smoothing splines for interpolation of monthly data, and uses KeD for interpolation of daily anomalies.
Average ratio of the estimated field variances of the daily products of RKK and E-OBS (T_{min} and T_{max} for the period 2005–2008 for Central Europe and Mediterranean Sea)
Month | Central Europe T_{min} | Central Europe T_{max} | Mediterranean Sea T_{min} | Mediterranean Sea T_{max} |
---|---|---|---|---|
January | 1.1 | 1.2 | 1.1 | 1.0 |
July | 1.0 | 1.0 | 1.0 | 1.0 |
Total | 1.1 | 1.1 | 1.0 | 1.0 |
Figure 8a shows the average daily estimation error for each observation station using RKK. The largest estimation errors occurred in areas where there was most difference between the daily maps produced by E-OBS and RKK, including the alpine area, Scandinavia and the Mediterranean Sea. This highlights that large uncertainties mostly arise in areas with low network density and complex terrain.
6 Conclusion
We developed an operationally applicable algorithm (RKK) for generating a high-resolution gridded map of daily minimum and maximum temperatures for the WMO region VI. These maps are unique in their spatial extent, resolution and the number of observation stations used.
The RKK method separates the interpolation process into the estimation of monthly climatologies and the interpolation of daily anomalies. This enables selection of the most appropriate interpolation method for the climatological and daily timescales, respectively. For operational use the target region was separated into seven subregions. This decreased the calculation time by reducing the size of the variance–covariance matrix, which had to be inverted within the kriging process. In addition, the use of climatic subregions enabled local climatic characteristics to be accounted for, which resulted in smaller regression residuals.
An evaluation process indicated the usefulness of the method. For example, the RKK method showed a similar performance to the E-OBS data set (Haylock et al. 2008), and was similarly limited in regions with low station density. Kyselý and Plavcová (2010) recently demonstrated this for the E-OBS data set by comparison with an alternative data set that was generated from a high-density network in the Czech Republic.
A major outcome of the study was in establishing an accurate estimate of the daily interpolation uncertainty (calculated as half the inter-quartile range). The application of a normal score transformation of the monthly regression residuals and daily anomalies prior to interpolation reduced the estimation error and improved the quality of the RKK uncertainty measure (observations within the IQR in approximately 50% of cases with normal score transformation and 80% without transformation).
The use of other predictors should be investigated, such as the incorporation of land cover data, which reflects local energy fluxes and thus has predictive potential for local temperature patterns. The main challenge for the future is that simple static indexes, either topographical or weather based, can capture only a small part of the complex processes involved.
Acknowledgments
Monthly climate temperatures, based on various station networks, and daily observations were obtained from the Deutscher Wetterdienst (DWD; German Meteorological Service, Offenbach). The authors SK and BA also acknowledge the funding from the DWD and the Hessian initiative for development of scientific and economic excellence (LOEWE) at the Biodiversity and Climate Research Centre (BiK-F), Frankfurt/Main.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.