1 Introduction

When researchers study climate changes on a global or regional scale, they often use a regional average to analyze the characteristics of variations (Jones and Briffa 1992). The global temperature trend presented in all previous reports of the Intergovernmental Panel on Climate Change (IPCC) is also based on a gridded dataset of global surface temperature. For the regions where the horizontal distribution of meteorological stations is very non-uniform, a simple regional arithmetic mean of observation data often assigns an excessive weight to one area with denser stations, which may reduce the representativeness of the regional mean. Therefore, performing spatial interpolation from irregularly meteorological station data to regularly gridded data can effectively reduce the influence of the unevenly distributed stations on the regional average (Cao et al. 2013). In addition, using the regularly gridded data is more convenient for the evaluation of numerical models compared with the irregularly station data. Therefore, it is essential to interpolate station data onto regular grid points.

Temperature is one of the most important meteorological elements for studying regional and global climate changes, and numerous studies on the gridding of temperature have been performed. The US National Climate Data Center (NCDC) developed a global gridded monthly historical temperature dataset (GHCN) with a 5° × 5° spatial resolution (Vose et al. 1992; Peterson and Vose 1997). The UK Climate Research Unit (CRU) of the University of East Anglia, Norwich, published multiple global historical climatic gridded datasets with various resolutions (Harris et al. 2014; Brohan et al. 2006; Jones et al. 2012). In addition, the US National Aeronautics and Space Administration (Goddard Institute for Space Studies, GISS) developed a global gridded temperature dataset with a spatial resolution of 2° × 2° (Hansen et al. 2010). Robert et al. (2005) constructed a gridded dataset of global land surface air temperature (SAT) with a high horizontal resolution using an interpolation method of the thin plate spline (TPS). Nynke et al. (2009) also constructed a gridded dataset of SAT with a high resolution across Europe using the TPS interpolation method. However, these gridded datasets contain limited observational station data in China. For example, the CRU only used data at approximately 400 meteorological stations. Therefore, the resolution of the previous gridded datasets is relatively low, and it is difficult to objectively characterize temperature variations at a high horizontal resolution in China.

In recent years, some scholars have used observation data at dense meteorological stations to establish gridded surface temperature datasets for China. For example, Li and Li (2007) and Zhang et al. (2009) used observation data at approximately 700-1000 stations and the Kriging interpolation method to develop a gridded temperature dataset with a resolution of 2.5° × 2.5° during 1951–2004 and with a 1° × 1° gridded temperature dataset during 1951–2007, respectively. Yan et al. (2005) also used observation data at 752 stations to develop a monthly average gridded temperature dataset with a high resolution of 0.01° × 0.01° during 1971–2000, but they did not utilize more observation stations. Xu et al. (2009) further constructed a gridded dataset of SAT with a horizontal resolution of 0.5° using approximately 700 station data and the TPS method. Recently, Wu and Gao (2013) used temperature data at about 2400 stations in China to develop a 0.25° × 0.25° gridded temperature dataset.

Observation data in China are affected by the relocation of meteorological stations, instrument changes, differences in observation times and calculation methods, and urbanization, which will probably cause inhomogeneities in long-term observation data series, particularly in temperature. This heterogeneity can cause deviations in the estimated long-term trends in China, resulting in an uncertainty in the trends (Li 2011). The temperature dataset developed by Wu and Gao (2013) is based on the inhomogeneous dense station data in China. Therefore, we constructed a gridded temperature dataset using a homogenized SAT dataset at more than 2400 stations in China that was recently developed by the National Meteorological Information Center (NMIC) of China Meteorological Administration. This homogenized temperature dataset is currently the best available dataset in China, and it remarkably improves the integrity, quality, and temporal heterogeneity of observation datasets (Ren et al. 2012; Cao et al. 2016). As temperature is considerably affected by latitude and elevation above sea level, we utilized the three-dimensional (longitude, latitude, and elevation) TPS interpolation method to develop a gridded monthly temperature dataset with a horizontal spatial resolution of 0.5° in China during 1961–2015 (hereafter China Homogenized Gridded Temperature (CHGT) Dataset). In addition, we performed cross-validation and error statistics to diagnose and analyze interpolation errors and evaluated the uncertainty of the gridded data. Finally, using the CHGT gridded temperature dataset, we compared differences in climatic trends between the CHGT and previous gridded datasets.

2 Data and methods

2.1 Data

We adopted the NMIC homogenized SAT dataset at more than 2400 surface meteorological stations in China (Cao et al. 2016). This dataset, including maximum (Tmax), mean (Tm), and minimum (Tmin) temperatures, makes great improvement in the integrity, heterogeneity, and quality of temperatures through the following three steps. (1) By comparing the digitized temperature data with the original paper-based records, missing data at some stations due to the digitization of the paper reports were supplemented, which increased the integrity of the data. (2) By comparing the different versions of the archived digitized data, the erroneous data in the original dataset (including the metadata information) were corrected, which improved the quality of the data (Ren et al. 2012). (3) By the RHtest homogenization method (Wang and Feng 2010), the heterogeneity in SAT was assessed based on detailed historical information at stations, and the heterogeneity caused by non-natural factors was corrected at 55% stations, in which the station relocation remains the primary cause of the inhomogeneity. The varying trend in the homogenized temperature is more reliable. For example, the original temperature dataset shows an exceptionally large decreasing trend in northeastern China that is incoherently nested in a large-scale warming pattern. The homogenized dataset remarkably improves the inconsistency (Cao et al. 2016).

The number of surface meteorological stations in China rapidly increased from fewer than 200 stations in the early 1950s to approximately 2000 stations in the 1960s. After the 1960s, the number of stations increased, and the integrity of the data is significantly improved (Fig. 1). Our statistical analysis indicates that from 1961 to 2015, there were 2484 national stations in China, with mean data missing rate of 5.5%. In addition, there were no missing records at approximately 70% stations, which implies the high integrity of the observation data. Furthermore, the terrain varies considerably in China, most of which is characterized by mountains and hills. The observation stations are relatively dense in the middle and lower reaches of the Yangtze River and northern China where the terrain is relatively flat. There are fewer stations in mountainous northeast China and in hilly southern China. There are scarce stations in western China where there exists a large topographical undulation and there are almost no observation stations in certain areas (Fig. 2). For example, in 2015, there were 2207 surface meteorological stations in eastern China (east of 100° E), while there were only 217 stations in western China (west of 100° E). It is evident that the number of stations in the eastern area is ten times that in the western area. Sinha (2006) found that the selection of a horizontal resolution of a gridded product should fully consider the spatial density of the observation station. Because this great difference in the density of observation stations between the eastern and western areas may affect the representativeness of a regional average value, we took into account the spatial density of stations and the coverage of grid points. Our analysis shows the distance between stations ranges from 5 to 70 km, with a mean distance of 33.4 km. Therefore, we select a horizontal resolution of 0.5° for the gridded product. Based on the monthly temperature data at surface meteorological stations during 1961–2015 and the TPS interpolation method with the terrain features, we produced a gridded temperature data at a spatial resolution of 0.5°. Specifically, using a global digital elevation dataset with a resolution of 30 s (Global 30 Arc Second Elevation Dataset, GTOPO30) and the bilinear method, we generated a digital elevation model (DEM) with a resolution of 0.5° over the mainland of China (Fig. 3), in which the DEM was inputted into the interpolation model.

Fig. 1
figure 1

Temporal curve of the monthly number of surface meteorological stations in China from 1951 to 2015

Fig. 2
figure 2

Spatial distribution of the missing rate of observation data at surface meteorological stations in China from 1961 to 2015

Fig. 3
figure 3

Digital elevation model (DEM; unit: m) with a resolution of 0.5° in China

2.2 Interpolation method

2.2.1 Principles of the interpolation model

We used the ANUSPLIN software package version 4.4 (Hutchinson and Xu 2013) which implements the thin plate spline (TPS) procedure described by Hutchinson (1995). The TPS interpolation was developed primarily by Wahba and Wendelberger (1980) and was applied to a climate analysis by Hutchinson (1991). This method can provide better climatic estimates by allowing for spatially varying dependence on topography, directly estimating interpolation errors, and efficiently diagnosing data errors (Hutchinson and Gessler 1994). An underlying statistical model for a partial TPS with three independent position variables (latitude, longitude, elevation) is given as follows.

$$ {Z}_i=f\left({x}_i\right)+{b}^T{y}_i+{e}_i\left(i=1,\dots, N\right), $$
(1)

where each xi is a three-dimensional vector of spline independent variables, f is an unknown smooth function of the xi, each yi is a three-dimensional vector of independent covariates, b is an unknown three-dimensional vector of coefficients of the yi, and each ei is an independent, zero mean error term with variance wiσ2, where wi is termed the relative error variance (known) and σ2 is the error variance which is constant across all data points, but normally unknown (Hutchinson 1991). The model reduces to an ordinary TPS model when there are no covariates (p = 0).

The function f and the coefficient vector b are determined by minimizing

$$ \sum \limits_{i=1}^N{\left(\frac{z_i-f\left({x}_i\right)-{b}^T{y}_i}{w_i}\right)}^2+\rho {J}_m(f), $$
(2)

where Jm(f) is a measure of the complexity of f, the “roughness penalty” defined in terms of an integral of mth-order partial derivatives of f, and ρ is a positive number called the smoothing parameter. As ρ approaches zero, the fitted function approaches an exact interpolant. As ρ approaches infinity, the function f approaches a least squares polynomial, with order depending on the order m of the roughness penalty. The value of the smoothing parameter is normally determined by minimizing a measure of predictive error of the fitted surface given by the generalized cross-validation (GCV). The GCV also provides a predictive error estimate of interpolation directly by removing each data point in turn and summing the square of deviation between each omitted value and the corresponding interpolation value.

The ANUSPLIN interpolation model has been extentively applied in global studies (Hutchinson 1991; New et al. 1999; Price et al. 2004; Hijmans et al. 2005; Yan et al. 2005; Xu et al. 2009), performed well compared with multiple interpolation techniques (Hartkamp et al. 1999; Jarvis and Stuart 2001), and is computationally efficient and easy to run.

2.2.2 Setting of model parameters

The mandatory input parameters in the ANUSPLIN interpolation model mainly include independent variables and the order of spline, and the main optional parameter is the covariate. These parameters are used to improve the weighting of variables, various variable transformations, or normalization (such as multiple, square root, and natural logarithm transformations) (Hutchinson and Xu 2013). Because temperature is considerably affected by the elevation above sea level, following Willmott and Matsuura (1995), the elevation along with longitude and latitude is treated as an independent variable or covariate in the model. Meanwhile, in the ANUSPLIN interpolation model, we performed a transformation of the elevation variable (that is, divided by 1000), which thereby increased the role of the elevation relative to the longitude and latitude independent variables (Hutchinson 1995; Hutchinson 1998). Using the observed monthly mean temperature data in 1 year, we tested effects of different parameter combinations on interpolated results by treating the elevation as an independent variable or a covariate and by selecting the order of the spline (from two to four) and a transformation of the elevation variable. By minimizing the GCV value or the root of GCV (RGCV) and the signal-to-noise ratio (SNR) in the model, the optimum parameters were determined (Liu 2008). Table 1 lists the 12 parameter combinations of the interpolation model and the interpolation error. A comparison indicates that the RGCVs and SNRs of parameter combinations 11 and 12 are the smallest. Considering the influence of the spline order on the computational efficiency, we selected three independent variables (longitude, latitude, and elevation) and the cubic spline interpolation as the optimum choice in the TPS model.

Table 1 Combinations of different parameters in the interpolation model and the statistic errors

2.3 Data quality and evaluation criterion

The ANUSPLIN software package provides a set of statistical parameters for measuring the interpolation error and accuracy. In this study, RGCV, mean bias error (MBE), mean absolute error (MAE), and root-mean-square error (RMSE) were used to evaluate the errors of the gridded product. In particular, cross-validation was conducted via a three-step process, that is, (1) removing the observational temperature value at an observation station, (2) interpolating the missing temperature at this station from the remaining surrounding data based on the TPS method, and (3) assessing the error between the observed and the interpolated values. The formulas of three error indices are as follows.

$$ \mathrm{MBE}=\frac{1}{N}\sum \limits_{i=1}^N\left({P}_i-{O}_i\right), $$
(3)
$$ \mathrm{MAE}=\frac{1}{N}\sum \limits_{i=1}^N\left|\left({P}_i-{O}_i\right)\right|, $$
(4)
$$ \mathrm{RMSE}=\sqrt{\frac{1}{N}\sum \limits_{i=1}^N{\left({P}_i-{O}_i\right)}^2,} $$
(5)

where Pi and Oi are the interpolated and original values, respectively, and N is the number of samples in the statistical analysis.

3 Evaluation of gridding performance

Figure 4 a shows the changes in the annual mean RGCVs of Tm, Tmax, and Tmin from 1961 to 2015. These RGCVs generally decreased in the 1960s and remained stable since the 1970s. The RGCVs of Tm mainly vary in the range of 0.5–0.6 °C, the RGCVs of Tmax are approximately 0.5 °C, and the range of RGCVs of Tm is 0.8–0.9 °C. The annual mean RGCVs of Tm, Tmax, and Tmin are 0.57 °C, 0.51 °C, and 0.83 °C, respectively. Tmin shows the largest variance in RGCVs, followed by Tm and Tmax, which is associated with their mean square deviations (SD) (figures not shown). We also calculated the annual variation in the RGCV relative to the observed annual mean temperature (hereafter MEAN). The ratios of the Tm, Tmax, and Tmin RGCVs to the MEAN are generally low, with the range of 4–5%, below 3%, and in the range of 9–12%, respectively (Fig. 4b). These results suggest that the RGCVs of Tm and Tmax are very small compared with their temperatures, whereas Tmin has a relatively large RGCV. Moreover, the Tm and Tmin RGCVs show remarkable seasonal variations (Fig. 5), with the largest value in winter, the moderate value in spring and fall, and the smallest value in summer. The seasonal variation in the Tmax RGCV is not distinct.

Fig. 4
figure 4

Interannual variations of RGCV (unit: °C) for Tm, Tmax, and Tmin in China (a) and the ratio of their RGCV to the MEAN (b) from 1961 to 2015

Fig. 5
figure 5

Seasonal variations of RGCV (unit: °C) for Tm, Tmax, and Tmin in China from 1961 to 2015

Figure 6 a, c, and e show the MBEs of Tm, Tmax, and Tmin at more than 2400 stations from 1961 to 2015. It is seen that the MBE is relatively low in general. On the average, the MBE is ± 0.4 °C, ± 0.2 °C, and ± 0.6 °C for Tm, Tmax, and Tmin, respectively, and their frequencies are 87.5%, 81.7%, and 82.8%, respectively. Moreover, the MBE is not significant at most stations, with significant MBEs at the 95% level only 1.8%, 2.0%, and 7.3% of the total number of stations for Tm, Tmax, and Tmin, respectively. For Tm and Tmax, the significant MBEs are mainly located in transition regions with complex terrain, such as in the eastern and southern parts of the Tibetan Plateau and the Yunnan-Guizhou Plateau, and large MBEs are beyond 1 °C. For Tmin, large MBEs are located in the Tianshan and Taihang mountainous areas. Although the MBEs in these mountainous regions are large, the differences between the fitted and observed values are generally small, which implies a good performance of the interpolation model. Figure 6 b, d, and f show the spatial distributions of Tm, Tmax, and Tmin RMSEs, respectively. In general, the spatial distributions of the Tm and Tmax RMSEs are similar and their RMSEs are generally below 0.6 °C, with the frequencies of 88.2% and 94.0%, respectively. The RMSE of Tmin is slightly higher than those of Tm and Tmax, mainly ranging from 0.2 to 0.8 °C, with a frequency of 79%. Similar to the MBE, relatively large RMSE between the fitted and observed values is also mainly located in the mountainous regions. This pattern is likely related to the relatively poor spatial representativeness of the observation stations in these regions. Thus, the interpolation model should be further improved in these regions.

Fig. 6
figure 6

Spatial distribution of MBE (a, c, e; unit: °C) and RMSE (b, d, f; unit: °C) for Tm, Tmax, and Tmin from 1961 to 2015 and the frequency distribution (shown in the inset figure at the lower left), in which “●” is significant at the 95% level and “+” is not significant at the 95% level

We compared the interpolation error of the CHGT data with the errors of the previous temperature gridded products. For example, Yan et al. (2005) applied the TPS method to interpolate temperature data at 752 stations in China onto regular grid points with a resolution 0.01° × 0.01° from 1971 to 2000 (hereinafter referred to as the Yan data). Our statistical analysis shows that the SD difference in Tmax and Tmin between the CHGT (from 1961 to 2015) and Yan (from 1971 to 2000) datasets is between 0.8 and 1.5 °C (Fig. 7). The ratios of the CHGT Tmax and Tmin RGCVs and RMSEs to the SD values are clearly less than those of the Yan data. These results indicate that the interpolation errors of the CHGT Tmax and Tmin data are lower compared with the Yan results, suggesting a higher quality of our gridded product. The higher quality is attributed to the following reasons. (1) In the TPS interpolation model of Yan et al. (2005), longitude and latitude are treated as independent variables and the elevation is treated as a covariate. In our interpolation model, the elevation is treated as the independent variable rather than the covariate, which can remarkably improve the quality of the interpolated data. (2) Yan et al. (2005) used the observation data at fewer stations to interpolate a high-resolution gridded data, while we considered the station density in selecting the gridded data resolution. Denser station data are used to interpolate the gridded data at an appropriate resolution, which could also have caused a better interpolation result.

Fig. 7
figure 7

Statistic errors for January, April, July, and October. (a)–(e) For Tmax and (f)–(j) same as (a)–(e) but for Tmin

4 Climatic warming in China according to CHGT data

Based on the CHGT grid data, we calculated the linear trend of the annual mean temperature series in China and also made a comparison with the linear trend derived from the CRU TS4.02 data during the low-temperature period (1961–1984), the rapidly warming period (1985–1997), and the slowly warming period (1998–2015) (Fig. 8). In general, the regional mean CHGT temperature over China shows a mean warming rate of 0.22 °C/10 years for Tmax, 0.27 °C/10 years for Tm, and 0.38 °C/10 years for Tmin during 1961–2015, while the CRU TS4.02 temperature shows the warming trends as 0.17 °C/10 years for Tmax, 0.24 °C/10 years for Tm, and 0.33 °C/10 years for Tmin. It is evident that the CHGT data shows larger warming trends from 1961 to 2015. Specifically, the CRU TS4.02 data overestimates the trend of temperature in China during the low-temperature period and underestimates the trend during the rapidly and slowly warming periods. Moreover, during the slowly warming period, the ratio of the difference (between the CRU and CHGT trends) to the CHGT temperature was the greatest, especially for Tmax, with a ratio of 333%. During the rapidly warming period, the ratio was the smallest. During the slowly warming period, the CHGT warming trend is 0.04 °C/10 years in China. After 1998, the CHGT Tmax is showing a trend of − 0.03 °C/10 years while the CHGT Tmin continued to increase (but with a small rate of 0.15 °C/10 years).

Fig. 8
figure 8

Trends of the CHGT and CRU TS4.02 annual mean Tm (a), Tmax (b), and Tmin (c) (unit: °C/10 years)

Figure 9 shows the distribution of the annual mean tendency for Tm, Tmax, and Tmin. A warming trend generally occurred across the most parts of China from 1985 to 1998, with a larger warming trend in eastern China than in western China. Moreover, temperature exhibited downward trends in northern China, the middle and lower reaches of the Yangtze River, and some of southern China from 1961 to 1984. The warming trend is clearly lower in eastern China than in western China from 1999 to 2015 (Li et al. 2009; Zhao et al. 2014; Sun et al. 2018). Thus, owing to the unevenly distributed stations in China, the simple regional arithmetic average of the station data is biased to the feature in eastern China, and the warming trend in China based on the gridded data appears to be more reasonable.

Fig. 9
figure 9

Trends (unit: °C/10 years) of annual mean Tm (a, d, g), Tmax (b, e, h), and Tmin (c, f, i) in China, in which the left, middle, and right panels are during 1961–1984 (a–c), 1985–1998 (d–f), and 1999–2015 (g–i)

Our results are further compared with those of the gridded data based on a small number of observation stations. Figure 10 shows the trends in seasonal and annual mean temperatures obtained by Li et al. (2015). They used 545 stations in China (referred to as the Li data) and the CRU TS4.02 data. Except for Tm and Tmax in spring and Tmin in winter, the CHGT warming trends in seasonal and annual mean temperatures in China generally exceed those from the Li and CRU datasets. This difference is particularly high in summer and fall, in which Tmax and Tmin in summer are higher by more than 20% for Li data. In fall, the difference is even higher by more than 40% for the CRU data, which may be due to more stations used in our product, especially in the northern and western parts of China. Additionally, to reduce the influence of urbanization on observation environments, many stations were moved from urban to rural areas. Therefore, the relocation of observation sites usually causes a non-climatic, decreasing trend bias in the unadjusted temperature series (Trewin 2013; Cao et al. 2016).

Fig. 10
figure 10

Trends (unit: °C/10 years) of the CHGT, Li, and CRU TS4.02 Tm (a), Tmax (b), and Tmin (c) in China from 1961 to 2015

Finally, we compared the warming trend in China with that of the globe estimated by the CRUTEM4 dataset (Morice et al. 2012). The changes in Chinese and global mean temperatures are generally consistent, but the warming trend is larger in China than in the globe (Fig. 11). In particular, from 1985 to 1997, the global warming rate was 0.24 °C/10 years, while the warming rate was 0.4 °C/10 years in China. However, after 1998, the global warming rate (0.16 °C/10 years) exceeded that in China (0.04 °C/10 years).

Fig. 11
figure 11

Temporal curves of annual mean temperature anomalies from the climatology (1961–1990) (unit: °C) in China (blue solid line) and the globe (red solid line) from 1961 to 2015. The dashed lines are their linear trends

5 Conclusions

In this study, using the homogenized temperature data at more than 2400 surface meteorological stations in China, we constructed a gridded temperature dataset with a horizontal resolution of 0.5° from 1961 to 2015. By testing different combinations of parameters in the interpolation model, we selected the optimum three-dimensional parameters (longitude, latitude, and elevation), developing a fitting curve surface for temperature in China, which considerably reduced the uncertainty and error caused by the interpolation. Our evaluation shows that the annual mean GCV values of Tm, Tmax, and Tmin are 0.57 °C, 0.51 °C, and 0.83 °C, respectively, and are small relative to the observations. Comparing our results with those of Yan et al. (2005), we found that our error is smaller, with a higher reliability. This improvement could be attributed to the optimum setting of our interpolation parameters, the increase of the station data, and the selection of an appropriate spatial resolution.

We used the CHGT data to further analyze the warming trend in China from 1961 to 2015. Since 1961, temperature has experienced a three-stage variation, that is, a low-temperature period (1961–1984), a rapidly warming period (1985–1997), and a slowly warming period (1998–2015). The CHGT temperature in China shows a mean warming rate of 0.22 °C/10 years for Tmax, 0.27 °C/10 years for Tm, and 0.38 °C/10 years for Tmin from 1961 to 2015. Generally, the CHGT temperature shows larger warming trends compared with the CRU TS4.02 data. Specifically, the CRU TS4.02 data overestimated the temperature trend in China during the low-temperature period and underestimated the temperature trend during the rapidly and slowly warming periods. These differences are due to the usage of both the homogenized high-density station data and the TPS interpolation method in this study.

Compared with previous temperature trends based on few observation stations in China, our trends are generally higher, particularly in fall. Compared with the global warming trend from 1961 to 2015, the overall warming rate in China is greater. After 1998, the warming trend suspended in China (0.04 °C/10 years). Meanwhile, Tmax in China also suspended (− 0.03 °C/10 years), while Tmin continued to increase (but with a slower rate of 0.15 °C/10 years).