1 Introduction

Change in the frequency and intensity of climate extremes, especially at regional or local scales, has great impact on human mortality, regional economics, and natural ecosystems. Therefore, the detection and attribution of change in climate extremes at regional scales is important for effective risk management and adaptation, and worldwide efforts have thus been devoted to this area of study (Bindoff et al. 2013; Stott et al. 2010).

As a typical type of human-induced land use change forcing, urbanization may have contributed a substantial influence on extreme temperature trends in rapidly developing regions, although it is unlikely to have caused more than 10 % of the measured global land averaged centennial trend in surface temperature (Bindoff et al. 2013; Hartmann et al. 2013b). Much attention has been paid to China (Hartmann et al. 2013b), which has undergone rapid economic development and urbanization over the last few decades, and this urbanization trend is set to continue at least in the near future. Ren and Zhou (2014) estimated that urbanization has led to a 37.8 % tropical nights increasing trend, a 10.1 % frost days decreasing trend, a 12.8 % summer days increasing trend, a 17.6 % cold nights decreasing trend, and a 26.4 % warm nights increasing trend, as observed based on China-averaged annual mean temperature. Sun et al. (2014) reported about 24 % of the trend in summer temperature, which was highly correlated with heatwave days, could be attributed to urbanization in eastern China. At a smaller scale in eastern China, such as in the North China Plain, the contribution of urbanization to trends in temperature extremes can be even greater (Zhou and Ren 2011); for example, around 48 % of the annual warm nights increasing trend was estimated to derive from urbanization. Quantifying the urbanization effect on trends in climate extremes and adjusting this urbanization bias in China are therefore important for the detection and attribution of other human influences, such as greenhouse gas forcing, on trends in climate extremes or on the risks of an extreme event.

A fundamental problem in this area is how to accurately estimate a trend and its statistical significance. First, climate extremes are sometimes, if not often, non-Gaussian distributed (e.g., Klein Tank et al. 2009; Qian et al. 2015). At smaller scales, the central limit theorem may not work because the sample size is small. The non-Gaussian characteristics will make the frequently used ordinary least squares (OLS) fit biased in estimating a linear trend (e.g., Qian et al. 2015). Second, even if a time series has a Gaussian distribution, its trend may not be linear (Wu et al. 2007), as usually assumed. “There is no a priori physical reason why the long-term trend in climate variables should be linear in time” (IPCC 2013, p. 179). The rate of real urbanization may not be constant either. For example, Ren et al. (2007) reported an enhanced urban warming at the megacity stations of Beijing and Wuhan for the latter 20 years of the period 1961–2000. Third, a trend can always be identified provided it is non-zero; the key question is whether the trend is statistically significant. Different from the independent assumption usually made, a time series may have serial dependence, either short-range dependence or long-range dependence, which affects the statistical significance of the corresponding trend (Franzke 2010, 2012).

In the present paper, the impact of the method for the estimation of the trend and corresponding statistical significance on the quantification of the urbanization effect on trends in climate extremes is illustrated by using hot extremes in the megacity of Shanghai as an example. Located in the Yangtze River Delta region in East China, Shanghai is the largest city by population in China, with more than 20 million people. Its level of urbanization is the highest in China. It is also one of the country’s largest economic centers. Hot extremes and urbanization in such a city therefore have substantial impacts on both human health and the regional economy. Both linear and nonlinear trend estimation methods were used to quantify how much of the observed changes in the hot extremes in Shanghai could be attributed to urbanization forcing. The trend and the corresponding statistical significances were estimated by taking into account potential non-Gaussian and serial dependence in the extreme indices.

The remainder of the paper is organized as follows: In Sect. 2, the homogenized data and the methods for the trend estimation and significance testing, and for the estimation of the urbanization effect and its contribution, are described. The results in terms of the effect and contribution of urbanization, as estimated from linear and nonlinear trends, are presented in Sect. 3. The selection of the reference series and the implications of this study for regional climate change analysis in China are discussed in Sect. 4. A summary of the key findings and concluding remarks regarding the implications of the results for the international community are given in the final section.

2 Data and methods

2.1 Data

The temperature data used included daily mean temperature (T mean), daily maximum temperature (T max) and daily minimum temperature (T min), all of which were from the recently homogenized daily surface air temperature datasets of 2419 meteorological stations in China, produced by China’s National Meteorological Information Center (NIMC). The data have been quality controlled and homogeneity adjusted at the NMIC using the RHtest of Wang et al. (2007). Furthermore, they have been recently used by Sun et al. (2014) to study the rapid increase in the risk of extreme summer heat in Eastern China. Homogenized data are important for climate change analysis, especially for assessing the urbanization effect in China where station relocations are frequent (Ren and Zhou 2014; Yan et al. 2014). Most of the station observations in China started after the mid-to-late 1950s, and thus many previous studies focused their analysis starting from 1961 (e.g., Ren and Zhou 2014). To be consistent with previous studies, the period of 1961–2013 was analyzed in the present study. Xujiahui station (station ID: 58367, elevation: 46 m) was used to represent downtown Shanghai. It is fully surrounded by buildings (Fig. 1). Pinghu station (station ID: 58464, elevation: 54 m) was used as a rural station. Its location, which remained the same throughout the analysis period, is outside a small town named Zhapu (Fig. 1). The distance from Pinghu station to Xujiahui station is approximately 73 km. These two stations were representative of a distinct contrast in land surface characteristics, as can clearly be seen from the satellite map (Fig. 1). To support the usage of Pinghu station as a rural reference, five suburban stations of Shanghai municipality with temperature records starting before 1961 were also used. These stations were Minhang (station ID: 58361), Baoshan (station ID: 58362), Jiading (station ID: 58365), Chongming (station ID: 58366) and Nanhui (station ID: 58369).

Fig. 1
figure 1

Map showing Xujiahui station (58367, upper yellow marker) and Pinghu station (58464, lower yellow marker)

The missing value of T min at Pinghu station on 30 July 1968 was linearly interpolated using a 5-day window centered on the day of missing data. The same method was used to deal with the missing values for T mean on 12 September and 18 November 2013 at Pinghu station, on 10 September 1962 at Minhang station, on several individual days in 1961 and 1962 at Chongming station, and on 23 June 2006 at Nanhui station. Since the entire month of October in 2013 at Baoshan station had no T mean in the datasets used, only four suburban stations (all except Baoshan) in Shanghai municipality were averaged to calculate the suburban-average annual mean T mean, whereas all five stations were used to calculate that of the June–July–August (JJA) mean.

2.2 Definition and calculation of the extreme temperature indices

A set of seven hot extreme temperate indices (Table 1) were used. These seven indices were adopted or revised from the expert team on climate change detection and indices (ETCCDI) (see http://etccdi.pacificclimate.org/list_27_indices.shtml). Among them, the hottest day index (TXx), warmest night index (TNx), warm days index (TX90p), and warm nights index (TN90p) were directly adopted; while the hot days index (TX35), hot nights index (TN28) and heat wave index (HW) were revised from the index of number of summer days, the index of number of tropical nights, and the warm spell duration index, respectively, according to local climate conditions. The TX35 index has been adopted in previous studies in China (Zhai and Pan 2003; Sun et al. 2014). The seven indices can be classified into three categories: extreme value (TXx and TNx), absolute threshold (TX35 and TN28), and relative threshold (TX90p, TN90p, and HW) indices. The former four indices often reflect extremes that happen during the warm season. We counted the latter three indices during JJA only. Therefore, all seven of these indices reflect hot extremes during the warm season that could have significant impact on human mortality. To ensure consistency in the calculation of the indices with other regions, the RClimDex, version 1.1, software packages (Zhang and Yang 2004) were used. Percentiles required for some of the temperature indices were calculated from the base period of 1961–1990 using a bootstrapping method to avoid possible inhomogeneities (Zhang et al. 2005). The same base period of 1961–1990 was used, as recommended in ETCCDI, since using different base periods would result in different anomalies and make the results difficult to compare with others (Qian et al. 2011).

Table 1 Details of the seven hot extremes indices used

2.3 Methods for normality testing

The exact distribution for a given time series is not the aim of this study. Rather, since the OLS regression method used in many previous studies on the quantification of the urbanization effect assumes the regression residual follows a Gaussian distribution, the focus here is on whether or not the distribution is Gaussian/normal. To determine whether or not a time series follows a Gaussian distribution, we adopted the three methods used in Qian and Zhou (2014): histograms, quantile–quantile (Q–Q) plots, and the Jarque–Bera test. A minor difference in the present study was that the theoretical quantile was estimated by the inverse of the normal cumulative distribution function, and the 95 % confidence interval was added in the Q–Q plots. When the distribution of a time series did not follow a straight line exactly, but was still within the 95 % confidence interval range, it was still considered as a Gaussian distribution at the 5 % level.

2.4 Linear and nonlinear trend estimation methods

2.4.1 Linear trend

The linear trend for a time series with sample size N was estimated using both OLS regression and the nonparametric Kendall’s tau-based Sen–Theil estimator (Sen 1968). The former assumes that the regression residuals are independent, identically distributed random variables with Gaussian distribution, whereas the latter is free of distribution and is much less sensitive to outliers in the time series. However, the Sen–Theil estimator requires the sample data to be serially independent. Thus, an iterative method, proposed by Zhang et al. (2000) and later refined by Wang and Swail (2001, Appendix A), was adopted to pre-whiten the sample data before using the Sen–Theil estimator, and the resultant trend is referred to as WS2001. This method accounts for the first-order autocorrelation [hereafter AR(1)].

The statistical significance of a standard OLS trend was estimated using the Student’s t test with N − 2 degrees of freedom under the assumption of an independent regression residual. However, this assumption is sometimes, if not often, unrealistic. Therefore, a modified OLS method (hereafter OLS-M) was adopted to allow AR(1) in the regression residual (Santer et al. 2008; Hartmann et al. 2013a), achieved by reducing the number of degrees of freedom (Santer et al. 2008) when the estimated AR(1) of the OLS regression residual (hereafter r 1) is positive (Hartmann et al. 2013a). The details can be found in Hartmann et al. (2013a), but a brief introduction is provided as follows:

For a given confidence interval for probability level p, say 95 %, the OLS regression residuals for target data y with sample size N are \(\hat{e}(t) = y(t) - (\hat{a} + \hat{b}t),t = 1, \ldots N.\)

The effective sample size for data with a serially correlated residual is reduced to \(N_{e} = \left\{ {\begin{aligned} {N\frac{{1 - r_{1} }}{{1 + r_{1} }}, \quad r_{1} > 0} \\ {N, \quad r_{1} \le 0} \\ \end{aligned} } \right.\)

The variance of the trend slope estimator is obtained: \(\hat{\sigma }_{b} = \left[ {\frac{{\sum\limits_{t = 1}^{N} {\hat{e}(t)^{2} } }}{{N_{e} - 2}}/\sum\limits_{t = 1}^{N} {(t - \overline{t} } )^{2} } \right]^{{\frac{1}{2}}}.\) Then, the regression coefficient with a 95 % confidence interval is \(b = \hat{b} \pm q\hat{\sigma }_{b},\) where q is the \(\frac{1 + p}{2}\) quantile of the Student’s t distribution with N e  − 2 degrees of freedom. If b > 0, then the OLS trend is considered as statistically significant at the (1 − p) level.

The statistical significance of a WS2001 trend was estimated by a revised nonparametric Mann–Kendall test. The revised Mann–Kendall test also used the iterative method (Zhang et al. 2000; Wang and Swail 2001) to pre-whiten the sample data before using the original Mann–Kendall test, since it requires the sample data to be serially independent. For comparison, the OLS trend and its Student’s t test results were still estimated when the data were non-Gaussian, even though they were not appropriate.

2.4.2 Nonlinear trend

The ensemble empirical mode decomposition (EEMD) method (Wu and Huang 2009; Huang and Wu 2008), which is an adaptive and temporally local (e.g., Wu and Huang 2009; Qian et al. 2011) data analysis tool, was used to extract the nonlinear trend (Wu et al. 2007). As an improved version of the empirical mode decomposition (EMD) method (Huang et al. 1998; Huang and Wu 2008), the EEMD method has been applied in many geophysical studies to deal with nonlinear and nonstationary data and to extract their nonlinear trends (e.g., Wu and Huang 2009; Qian et al. 2009, 2011; Franzke 2010, 2012; Wu et al. 2011; Ji et al. 2014). The MATLAB code of EEMD was used in this study. In the EEMD calculation, the noise added had a standard deviation of 0.3 times that of the target data and the ensemble size was 10,000. The last component of the EEMD calculation [hereafter R n(t), which is a function of time t] for a time series was treated as the nonlinear trend. According to Wu and Huang (2009), the final standard deviation of error decreases as the ensemble size increases. To save computational time and constrain the final error in a limited range, 100 such EEMD calculations were conducted in the present study to estimate the uncertainties due to the ensemble size. The median of these 100 EEMD nonlinear trends was considered as the final trend and the 5th and 95th percentiles of these 100 EEMD nonlinear trends were considered as the uncertainty range.

The significance test for the EEMD nonlinear trend was estimated previously by surrogate data generated from three methods: (1) surrogate data with Gaussian distribution and short range dependence, namely, AR(1) (Ji et al. 2014) or long range dependence (Franzke 2010); (2) surrogate data with an a priori determined well-known distribution and short range or long range dependence (Franzke 2012); (3) surrogate data with exactly the same autocorrelation structure as the target data (Franzke 2012), achieved by a phase randomization approach (Theiler et al. 1992), but the distribution may be different from that of the observation. However, a significance test is still lacking for sample data with both non-Gaussian distribution and serial dependence, especially long range dependence characteristics, which require surrogate data with both the same non-Gaussian distribution and the same autocorrelation structure. Therefore, the following method for use with adaptive surrogate data is proposed to test the statistical significance of an EEMD nonlinear trend.

2.5 Significance testing for an EEMD trend using adaptive surrogate data

To generate surrogate data having both the same non-Gaussian distributions and the same autocorrelation functions, an iterative approach, which was proposed by Schreiber and Schmitz (1996) and reviewed in Schreiber and Schmitz (2000), was adopted. The basic idea of this approach is that surrogate data have the same autocorrelations as the observation if they have the same power spectra according to the Wiener–Khinchin theorem, and that they have the same distribution as the observation if the surrogate data are rescaled to the empirical distribution of the observation. Both the spectrum and distribution of the surrogate data can be corrected toward the same as those in the observation through an iterative process. The steps are as follows:

  1. 1.

    Calculate the squared amplitudes of the discrete Fourier transform of the original data and obtain a time series {|S k |2}. Copy the original data and sort them to obtain a time series {c n }.

  2. 2.

    Randomly shuffle (permutation) the original data and obtain {r (0) n }.

  3. 3.

    Take the discrete Fourier transform of the shuffled time series {r (i) n }. Replace its squared amplitudes by those of the original time series (i.e., {|S k |2}) but keep the phases of the shuffled time series (θ (i) k ), and then transform back into physical space:

    $$s_{n}^{(i)} = \frac{1}{\sqrt N }\sum\limits_{k = 0}^{N - 1} {\exp (i\theta_{k}^{(i)} )} \left| {S_{k} } \right|\exp \left( { - \frac{i2\pi kn}{N}} \right)$$
    (1)

    Thus, this step enforces the correct spectrum but usually the distribution will be modified.

  4. 4.

    Rank-order the resultant series so as to assume exactly the values taken from {c n }:

    $$r_{n}^{(i + 1)} = c_{{rank(s_{n}^{(i)} )}}$$

Thus, this step corrects the distribution; however, the spectrum will be modified again. Therefore, an iteration of steps 3 and 4 is required to correct deviations in both the spectrum and distribution until the discrepancies are sufficiently small.

For a one-sided test at the 99 % level of significance (α = 0.01), a collection of 1/α − 1 = 99 surrogate time series were generated following Schreiber and Schmitz (2000). An example of 99 surrogates for the TXx index at Pinghu station is shown in Fig. 2, which confirms that each surrogate has exactly the same distribution and almost the same autocorrelation structure as the observation. If a positive (negative) EEMD nonlinear trend from the observation was larger (smaller) than any of those obtained from the 99 surrogates, then the observed EEMD nonlinear trend was considered as statistically significant at the 1 % level. If a positive (negative) EEMD nonlinear trend from the observation was larger (smaller) than any of those obtained from the first 19 [1/0.05 − 1 = 19, according to Schreiber and Schmitz (2000)] surrogates, then the observed EEMD nonlinear trend was considered as statistically significant at the 5 % level.

Fig. 2
figure 2

Probability density function a and autocorrelation function b of the TXx index at Pinghu station from observation (red line) and 99 surrogates (blue lines)

2.6 Estimation of the urbanization effect and urbanization contribution

The urbanization effect (E u ) on the temperature trends for downtown Shanghai during the period 1961–2013 was assessed using the urban minus rural (UMR) method, which is a widely used method for assessing the urbanization effect (Hartmann et al. 2013b; Ren et al. 2008; Yang et al. 2011; Ren and Zhou 2014). The E u here was estimated by the trends in the UMR difference series between the urban and rural station (T u  r ). This calculation was also used by Ren et al. (2008) and Ren and Zhou (2014). Note that in the first formula of Ren and Zhou (2014), they also used another expression of E u as the difference in the trends between the urban and rural station (T u  − T r ). For the OLS trend, these two calculations are the same; but for the WS2001 and EEMD trend, they are different, as will be shown later in this study. The urbanization contribution (C u ) is defined as the proportion that the urbanization effect accounts for the trends in the temperature series at Xujiahui station (T u ), which is the same as in Ren et al. (2008), Zhou and Ren (2011), and Ren and Zhou (2014). The detailed formulas are as follows:

$$E_{u} = T_{u - r,}$$
(2)
$$C_{u} = \, \left| {E_{u} /T_{u} } \right| \times 100\% .$$
(3)

C u was only calculated when the urbanization effect was statistically significant, following Ren and Zhou (2014). Thus, the significance test for a trend is also important for the quantification of urbanization impact.

3 Results

3.1 Urbanization effect on annual- and JJA-mean temperature

The urbanization effect on annual- and JJA-mean T mean at Xujiahui station is first estimated by the OLS trend, as commonly used in the previous studies, to give background information. The correlation coefficient between Xujiahui and Pinghu station is 0.92 for the annual-mean T mean and 0.93 for the JJA-mean T mean during 1961–2013. In addition, their variability is almost the same (Fig. 3a, b), indicating that Pinghu station is a good reference station and the trend in the UMR difference is primarily related to the urbanization effect. According to Table 2, the annual- and JJA-mean T mean at Xujiahui and Pinghu station all follow a Gaussian distribution at the 5 % level, according to the three normality test approaches (figures not shown, but similar to Fig. 4b, c). The OLS trend in the annual-mean (JJA-mean) T mean is 0.37 (0.27) °C decade−1 for Xujiahui station and 0.23 (0.13) °C decade−1 for Pinghu station. The OLS regression residual is not an independent random sample, but with an r 1 of 0.18 (0.27) for the annual residual series at Xujiahui (Pinghu) station. Therefore, the 95 % confidence interval of the OLS-M trend is wider than that of the original OLS trend and is less able to give a statistically significant result. Nevertheless, the annual trends for Xujiahui and Pinghu stations are both large enough to suggest that the trends are both statistically significant at the 1 % level. For the JJA T mean, only the linear trend at Xujiahui station is statistically significant. It is noted that for both Xujiahui and Pinghu station, 2013 was not the warmest year for annual T mean, but only for JJA T mean (Fig. 3a, b).

Fig. 3
figure 3

Temperature anomalies (relative to 1961–1990) for Xujiahui station (red solid line) and for Pinghu station (blue solid line) along with their OLS trends (dashed lines): a for annual-mean T mean; b for JJA-mean T mean. The blue lines in (a) and (b) are the results of subtracting the 1961–1970 average temperature difference between Pinghu station and Xujiahui station to make the two time series possess the same 10-year average during the 1960s. Panels (c) and (d) are the urban minus rural differences (black lines) for annual-mean T mean and JJA-mean T mean, respectively, along with the corresponding OLS trends (red lines)

Table 2 Distribution characteristics and OLS trends with a 95 % confidence interval in annual- and JJA-mean T mean for Xujiahui station and Pinghu station, along with the urbanization effect and urbanization contribution for Xujiahui station
Fig. 4
figure 4

Time series (black line) and the corresponding trends (colored lines) of four T max -related hot-extreme indices for Xujiahui station (a, d, g, and j), along with the corresponding normality test using histograms and Jarque–Bera tests (b, e, h, and k) or Q–Q plots (c, f, i, and l). In (a, d, g, and j), the EEMD trend (green line) is the median of 100 EEMD calculations, along with the 5th–95th percentile range (gray area) of these 100 EEMD calculations. Note that the differences in these 100 EEMD results are rather minor; therefore, the range (gray area) is barely visible. In the histograms (b, e, h, and k), the blue bars are the results of each hot extreme index and the red line is the fitted Gaussian distribution. The p > 0.05 in the JarqueBera tests indicates that the target data are normally distributed. In the Q–Q plots (c, f, i, and l), red circles indicate the distribution of each hot extreme index and the black solid lines represent the Gaussian distribution, with the 95 % confidence interval shown as black dashed lines. The approximate linearity of the circles suggests that the target data are normally distributed

The UMR difference series also suggest a Gaussian distribution for both annual and JJA T mean (Table 2). The urbanization effect for Xujiahui station is 0.14 °C decade−1 for both annual T mean (Fig. 3c) and JJA T mean (Fig. 3d), as estimated from formula (2). These urbanization effects are both statistically significant at the 1 % level according to the OLS-M results, although the r 1 is larger than 0.4 for either annual or JJA T mean. Therefore, the urbanization contribution for Xujiahui station is 38.3 % (50.4 %) for annual (JJA) T mean, as estimated from formula (3). It is noted that the urbanization contribution in annual T mean and that in JJA T mean are different; therefore, it is inappropriate to remove the urbanization bias calculated from annual-mean T mean if one is to investigate the summer temperature. It is also noted that, although the best estimates of the linear trend and the corresponding urbanization contributions estimated from the original OLS and from the OLS-M method are the same for the case reported here, there will be cases of urbanization contribution that cannot be estimated based on OLS-M but can be estimated based on the original OLS. These cases would occur when the UMR difference used to estimate the urbanization effect possesses serial correlation in the regression residual, and thus the corresponding linear trend would not be statistically significant based on OLS-M but would be statistically significant based on OLS. Therefore, the OLS-M method is used in the following analysis.

3.2 Urbanization effect on warm season hot extremes

3.2.1 Trends at Xujiahui station

For Xujiahui station, most of the seven hot-extreme indices do not follow a Gaussian distribution, except for the TXx index at the 5 % level (Figs. 4, 5; Table 3). The distributions of the TX35 (Fig. 4e, f), TN28 (Fig. 5e, f) and HW indices (Fig. 4k, l) deviate from a Gaussian distribution substantially. Three out of fifty-three points in the JJA TN90p time series lie outside the 95 % confidence interval range (Fig. 5i), and the Jarque–Bera test value barely exceeds 0.05 (Fig. 5h); therefore, this index is also considered as non-Gaussian distributed. In terms of autocorrelation, all of the seven indices have large AR(1) values, ranging from 0.39 to 0.54 (Table 3). Therefore, many of these extreme indices are non-Gaussian and serially dependent. Under such a circumstance, the proposed significance test for the EEMD nonlinear trend seems necessary. Moreover, the WS2001 method is preferred to the original Sen–Theil estimator, which assumes that the sample data are serially independent but the seven indices here are serially correlated.

Fig. 5
figure 5

As in Fig. 4, but for three T min-related hot-extreme indices for Xujiahui station

Table 3 Distribution characteristics, AR(1) values, and various trends in the seven hot extremes indices for Xujiahui station (58367) and Pinghu station (58464) (units: decade−1)

The OLS-M trends for the TXx, TNx, TX35, TN28, JJA TX90p, JJA TN90p, and HW indices at Xujiahui station for the period 1961–2013 are 0.68 °C decade−1, 0.55 °C decade−1, 4.54 days decade−1, 3.40 days decade−1, 5.69 % decade−1, 6.76 % decade−1, and 4.50 days decade−1, respectively (Table 3). All of these increasing trends are statistically significant at the 1 % level, although the regression residual for some of the indices is not serially independent and has an r 1 value of up to 0.28 for the HW index. It is noted that the estimated OLS-M trend and its statistical significance are not appropriate for non-Gaussian distributed indices, such as TN28 (Fig. 5d). The non-parametric WS2001 trend is only 2.73 days decade−1. That is to say, the OLS-M trend overestimates about a quarter of the trend estimated using the WS2001 method. It is noted that the value for the TN28 index in 2013 is also an outlier (Fig. 5e, f), to which the OLS regression is also sensitive. Therefore, for non-Gaussian data or data with outliers, the WS2001 method is more reliable than the OLS-M method. Table 3 suggests that all the WS2001 trends for the seven extreme indices at Xuijiahui station are statistically significant at the 1 % level.

The estimated EEMD trends for Xuijiahui station are also shown in Figs. 4 and 5. All the EEMD trends for the seven indices are apparently nonlinear, but fit the sample data better than the corresponding linear trends. The standard deviation of the residual after removing the EEMD trend from any of the seven indices is always smaller than that after removing the corresponding linear trend (figures not shown). Moreover, the EEMD nonlinear trend can provide useful information on the structure of change; namely, from slightly decreasing to rapidly increasing in most cases (Figs. 4, 5). To facilitate comparison of the EEMD trend rate with the corresponding linear trend rate, the averaged increment value of the EEMD trend during 1961–2013 is calculated as TrendEEMD = [R n (2013) − R n (1961)]/(2013 − 1961), similar to in Ji et al. (2014). Table 3 gives these results for the seven extreme indices, which are: 0.64 °C decade−1, 0.49 °C decade−1, 5.12 days decade−1, 3.66 days decade−1, 6.32 % decade−1, 5.95 % decade−1, and 5.03 days decade−1 for the TXx, TNx, TX35, TN28, JJA TX90p, JJA TN90p, and HW indices, respectively. Some of these values are close to those estimated from the linear trend, but some are not. Note that for non-Gaussian indices, the WS2001 results should be adopted when comparing the rates of the linear trends with those of the EEMD trends. The statistical significances of these TrendEEMD results are estimated using the method proposed in Sect. 2.5. Take the TNx index for example (Fig. 6): the maximum value of the 99 corresponding surrogates (the upper 99 % confidence interval) is 0.50 °C decade−1, and the maximum value of the first 19 corresponding surrogates (the upper 95 % confidence interval) is 0.36 °C decade−1. The TrendEEMD in the observation lies between these two values and thus is statistically significant at the 5 % level. Table 3 shows that the TrendEEMD in the other six extreme indices are statistically significant at the 1 % level. It should be noted that the uncertainty range for the EEMD trend is very narrow after 10,000 ensembles in the EEMD calculation and is hardly noticeable in Figs. 4 and 5. The estimated magnitude of the uncertainty range is no more than 1 % of the TrendEEMD for any of the seven extreme indices at Xuijiahui station, with that for TN28 as small as about 0.5 % (Fig. 7). If the ensemble size in the EEMD calculation is chosen to be larger, the uncertainty range for the EEMD trend will be even smaller. Nevertheless, to save computational time, this uncertainty range is already acceptable.

Fig. 6
figure 6

Significance test for the EEMD trend in the TNx index at Xujiahui station. The red cross indicates the TrendEEMD in the observation. The blue and black dashed lines indicate the 99 and 95 % confidence interval of the significance test, respectively. Units:  °C decade−1

Fig. 7
figure 7

Uncertainties due to ensemble size for the EEMD trend rates from 100 calculations for seven hot-extreme indices at Xujiahui station (red bars) and Pinghu station (blue bars), along with those for the corresponding urban minus rural (UMR) difference (black bars). The error bars indicate the 5th and the 95th percentile, both of which are divided by the 50th percentile and are all shown as anomalies relative to the 50th percentile. Units:  %

3.2.2 Trends at Pinghu station

For Pinghu station, five out of seven extreme indices are non-Gaussian distributed (Table 3; figures not shown, but similar to Figs. 4 and 5). Only the TXx and TNx indices follow a Gaussian distribution at the 5 % level. Compared with Xujiahui station, the AR(1) values of the seven indices at Pinghu station are much smaller and no more than 0.3; however, only three indices have AR(1) values smaller than 0.1. This means that even for a rural station, many of the hot-extreme indices are not serially independent.

As for the trend, the OLS-M trend slopes of the seven indices at Pinghu station are much smaller than those at Xujiahui station, with some slopes for the former even less than half of those for the latter (Table 3). Although six out of the seven OLS-M trends are statistically significant at the 5 % level, only the trends in the TXx, TX35, JJA TX90p, and HW indices are statistically significant at the 1 % level. The trends in the other indices related to the minimum temperature change are either barely significant or not significant at all, e.g., the TN28 index. However, the results from the WS2001 method are quite different from those from the OLS-M method, in terms of either the trend slope or the corresponding statistical significance. For example, the WS2001 trend for TX35 is only two-thirds of the corresponding OLS-M trend due to the non-Gaussian characteristics and an outlier in the year 2013 (figure not shown, but similar to Fig. 5d, e, f), suggesting about a 40 % overestimation if using the OLS-M trend. However, there are also cases in which the WS2001 trends are substantially larger than the OLS-M trend, such as for the TN28 index. The WS2001 trends in the seven indices are all statistically significant at the 1 % level, except for that in the TNx index, which is not statistically significant. In terms of the EEMD trends, only those in the TXx, TX35, JJA TX90p, and HW indices are statistically significant at the 5 % level, whereas those in the other three extreme indices related to the minimum temperature change are not statistically significant. Differences between the rates estimated from the EEMD nonlinear trend and those from the WS2001 linear trend are quite large for the TX35 and HW index, both of which have an EEMD trend rate approximately 80 % larger than the WS2001 trend rate. The uncertainty ranges of the EEMD trends in the seven indices at Pinghu station are no more than 2.5 %, although they are larger than those at Xuijiahui station (Fig. 7). The uncertainty range for the HW index at Pinghu station is as small as 0.8 % (Fig. 7).

3.2.3 Urbanization effect and contribution for Xujiahui station

Table 4 shows that five out of seven UMR difference time series are non-Gaussian distributed and only TXx and TNx follow a Gaussian distribution at the 5 % level (only the TN28 case is shown in Fig. 8). The AR(1) values of the seven UMR difference time series range from 0.33 to 0.77, suggesting that these seven time series are all serially dependent. Thus, five out of the seven UMR difference time series are both non-Gaussian and serially dependent.

Table 4 As in Table 3, but for the UMR difference between Xujiahui station and Pinghu station, whose trend is considered as the urbanization effect for Xujiahui station
Fig. 8
figure 8

As in Fig. 4, but for UMR difference time series of TN28. The time series is shown as the anomaly relative to the first 10 years (1961–1970)

The OLS-M trends in these seven UMR time series are 0.35 °C decade−1, 0.39 °C decade−1, 2.90 days decade−1, 1.99 days decade−1, 3.04 % decade−1, 4.68 % decade−1, and 2.59 days decade−1, respectively, all of which are statistically significant at the 1 % level, although the regression residual can have an r 1 value of up to 0.57 for the JJA TX90p index. However, the WS2001 trends in the five non-Gaussian time series are much smaller than the corresponding OLS-M trends, suggesting substantial overestimations of the urbanization effects by the OLS-M method. This overestimation can be as large as approximately 100 % for the TN28 index. For the Gaussian distributed indices, TXx and TNx, the estimated trends from the WS2001 and OLS-M methods are close. Except for the WS2001 trend in the TN28 UMR difference time series, which is statistically significant at the 5 % level, the other six WS2001 trends are all statistically significant at the 1 % level.

The EEMD trends suggest the urbanization effects on the seven extreme indices are 0.33 °C decade−1, 0.31 °C decade−1, 3.08 days decade−1, 2.89 days decade−1, 2.75 % decade−1, 4.51 % decade−1, and 2.18 days decade−1, respectively, all of which are statistically significant at the 5 % level, except for those on TX35, TN28 and JJA TN90p at the 1 % level. Some of these nonlinear estimations are substantially different from those linear estimations by the WS2001 method. For example, the EEMD trend in the TN28 UMR difference time series is more than double the WS2001 trend. It is noted that the uncertainty ranges for the seven EEMD trends are only approximately 1 % (Fig. 7).

According to formula (3), the urbanization contributions for the seven extreme indices at Xujiahui station are estimated using the three different trend estimators (Table 5). Based on the distribution characteristics (Tables 3, 4) reported previously, the OLS-M method is only appropriate for the estimation of the urbanization contribution of TXx in Table 5; and for the other six indices, the results from the WS2001 method should be adopted if we assess the urbanization contribution from a linear perspective. The estimated urbanization contributions for the TXx index is 51.5 % based on the OLS-M method; and those for the TNx, TX35, TN28, JJA TX90p, JJA TN90p and HW indices are 76.0, 61.1, 37.0, 40.0, 56.9 and 34.4 %, respectively, based on the WS2001 method (Table 5). The urbanization contributions for the latter five non-Gaussian indices are all overestimated based on the OLS-M method, especially for the HW index with an overestimation of more than 20 %. For the quasi-Gaussian distributed TXx and TNx, the estimations from the WS2001 method and those from the OLS-M method are comparable. From a nonlinear perspective, the estimated urbanization contributions for the seven indices at Xujiahui station based on the EEMD method are 51.6, 63.3, 60.2, 79.0, 43.5, 75.8 and 43.3 %, respectively. The differences between these results and those from the WS2001 method are noticeable (Table 5). Since the EEMD method is adaptive, which does not pre-determine the shape of a trend (the WS2001 method has pre-determined a trend is linear), the resultant trend rate therefore could be either larger or smaller than that of the linear trend. It depends on the data themselves. Note that the same is true for Gaussian distributed index, although the EEMD results of the TXx index shown in this study happened to be close to those estimated from the OLS-M method. The largest difference in the estimation between the WS2001 linear trend and the EEMD nonlinear trend is for the TN28 index, with the latter almost doubling the former. From a fitting perspective, the EEMD nonlinear trend provides a better fitting than the WS2001 linear trend for both the TN28 index (Fig. 5d) and the corresponding UMR difference time series (Fig. 8a). The physical interpretation of the difference may be from a different perspective in terms of the evolution of the urbanization effect. The linear trend suggests that the urbanization effect is stable over time, whereas the EEMD nonlinear trend suggests a rapid urbanization effect at the beginning and then a slowed down effect in recent years (Fig. 8a).

Table 5 Urbanization contributions (units: %) of the seven temperature extreme indices for Xujiahui station using Pinghu station as a reference site

4 Discussion

It should be acknowledged that only Pinghu station was used as a rural reference series, which is different from previous studies in which several suburban stations in the territory of the megacity municipality were averaged as a rural reference (e.g., Ren et al. 2007; Gaffin et al. 2008). This is because the urbanization process in Shanghai was so rapid that many former suburbs are now urban areas, following subways having been built to connect these suburbs and facilitate the ever growing population living there. When using these stations’ average to quantify the urbanization effect in downtown Shanghai, the contributions are only 11.0 % for annual-mean temperature and 8.8 % for JJA-mean temperature by the OLS-M method, which reveals the urbanization effects for both are not significant. Figure 9 shows that after the early 1990s, the UMR differences between the Xujiahui station and Shanghai suburban-average tended to decrease, implying that suburban areas had been developing even faster than the downtown area since that time. In addition, since there is an urban heat dome, the bigger a city is the wider the size of the urban heat dome will be and the larger the area affected by the urban heat island effect (Ren et al. 2015). Therefore, Pinghu station was chosen as a rural site. This station is not too close to downtown Shanghai (Xujiahui station) but is still near (approximately 73 km away from Xuijiahui station). Besides, it does not belong to Shanghai municipality and therefore the urbanization process is much slower than in Shanghai’s suburbs. The selection of this station was based as much as possible on existing observational networks, and it also meets most of the five criteria listed in Ren et al. (2015) for selecting a reference station: (1) The record of Pinghu station started before 1961 and was continuous, except for individual missing values on 12 September and 18 November 2013 for T mean, 24 October 1974 for T max and 30 July 1968 for T min; (2) The overall population of Zhapu town, where Pinghu station is located, was 57,159 in 2000, within the 70,000 limit for eastern plain regions where economies were relatively more developed; (3) This station experienced no relocations during the analysis period. Nevertheless, the urbanization contributions estimated using Pinghu station were similar to those using a relatively less urbanized suburban station (Chongming station) of Shanghai municipality: 38.6 % for annual-mean temperature and 53.1 % for JJA-mean temperature by the OLS-M method. Admittedly, Zhapu town was also developing during several of the more recent years. As a real “rural” site adjacent to Shanghai that is completely free from the effects of urbanization is difficult to find, the urbanization contributions of downtown Shanghai reported in this study, based on the compromise of selecting a reference station, should be regarded as conservative estimations.

Fig. 9
figure 9

As in Fig. 3c and d, but for the differences (black) of Xuijiahui station minus the Shanghai suburban-average for a annual-mean T mean and b JJA-mean T mean, along with the corresponding OLS trend (red) and 11-year running mean (blue)

Nevertheless, the trend estimation and significance testing methods discussed in this paper using these two stations have broad implications for regional climate change research in China, not limited only to detection and attribution of the urbanization effect in rapidly developing locations, since trend analysis is a common practice in these kinds of studies.

Firstly, many previous studies on the estimation of the linear trend in temperature extremes in China used OLS regression to estimate the spatial pattern of linear trends at individual stations (e.g., Qian and Lin 2004; Ding et al. 2010; Huang et al. 2010; Jiang et al. 2012; Ye et al. 2014, among many others). Caution should be applied in that, for a single station in China, there are quite a few stations other than Shanghai whose temperature extreme indices are non-Gaussian, which will be reported in detail in another paper. Compared to individual stations, the data in a latitude/longitude grid box or an area-average are more likely to be Gaussian, especially for a large area average, according to the central limit theorem. Therefore, previous studies using OLS regression to quantify the urbanization effect and contribution to temperature extremes for a large region average with quite a few stations in China (Zhou and Ren 2011; Ren and Zhou 2014; Sun et al. 2014) are appropriate. However, caution should also be noted that even for a latitude/longitude grid box or an area average, Gaussian characteristics cannot necessarily be guaranteed for climate extremes. This is especially true for data-sparse regions such as western China, where a grid box or a small area may contain only one or several stations. Such a grid box may not be Gaussian. For other regions, a grid box may also be non-Gaussian. For example, many of the 3.75° × 2.5° grid boxes for JJA TX90p indices in the China domain are non-Gaussian when analyzing HadEX2 gridded data (Donat et al. 2013) for the period 1961–2010, details of which will be reported elsewhere. For some climate extremes, even an area average cannot guarantee Gaussian characteristics; therefore, new methods have been adopted, such as to fit extreme temperatures with a generalized extreme value distribution in individual grid boxes (Zwiers et al. 2011). Since the number of stations required for an average to meet the central limit theorem is not yet foreseen, it is therefore useful to carry out the normality test before using OLS regression for climate extremes. For non-Gaussian cases, the WS2001 method is valuable both for linear trend analysis and for quantifying the urbanization effect, as discussed in the present paper. Even for Gaussian cases, the serial correlation should also be taken into account, which suggests that the effective degrees of freedom in the significance testing of the Student’s t-test are not necessarily N − 2. For these cases, the OLS-M method is preferred.

Secondly, besides the implications for linear trend analysis, the nonlinear urban warming discussed in the present paper also deserves attention. Even if the index under investigation for a single station or an area average is Gaussian, or its linear and nonlinear trend rates may not be significantly different from each other when sampling uncertainties are considered, the shape of the trend (the structure of change) may not necessarily be linear (e.g., Fig. 4a). The same is true for the urbanization effect. The fact that urban warming may not be stable was also noted in Ren et al. (2007), who reported an enhanced urban warming at the megacity stations of Beijing and Wuhan for the latter 20 years of the period 1961–2000. Although linear trend analysis is common practice in the literature, use of the EEMD trend to quantify the nonlinear urban warming rate and contribution, introduced here, and the results from the nonlinear trend perspective, can serve as a supplement to help understand the different trends or urban warming rates and urbanization contributions during different analysis periods and among different studies.

5 Summary and concluding remarks

To illustrate the importance of trend estimation and significance testing methods on the estimation of trends and urbanization effects (contributions) in climate extremes, the megacity of Shanghai, whose level of urbanization is the highest in China, was used as an example. The urbanization effects on the trends in seven hot-extreme indices for downtown Shanghai were quantified based on the UMR approach and homogenized daily temperature datasets during 1961–2013. Both linear and nonlinear trend estimation methods were used and the statistical significances for these trends were assessed by taking into account potential non-Gaussian and serial dependence in the extreme indices. The main conclusions can be summarized as follows:

  1. 1.

    The urbanization effect (contribution) for downtown Shanghai was 0.14 °C decade−1 (38.3 %) for the linear trend in annual mean temperature, and 0.14 °C decade−1 (50.4 %) for the linear trend in JJA mean temperature.

  2. 2.

    Among the seven hot-extreme indices for downtown Shanghai, the TX35, TN28, JJA TX90p, JJA TN90p and HW indices were found to be non-Gaussian and serially dependent in terms of both the time series and the UMR difference. Only TXx and TNx were found to be quasi-Gaussian but serially dependent in terms of both the time series and the UMR difference.

  3. 3.

    The urbanization effect on the trends in the TXx, TNx, TX35, TN28, JJA TX90p, JJA TN90p and HW indices for downtown Shanghai based on non-parametric WS2001 linear trend (EEMD nonlinear trend) estimation was 0.35 (0.33) °C decade−1, 0.38 (0.31) °C decade−1, 2.73 (3.08) days decade−1, 1.01 (2.89) days decade−1, 2.51 (2.75)  % decade−1, 3.80 (4.51)  % decade−1 and 1.62 (2.18) days decade−1, respectively; the corresponding urbanization contribution was 48.6 % (51.6 %), 76.0 % (63.3 %), 61.1 % (60.2 %), 37.0 % (79.0 %), 40.0 % (43.5 %), 56.9 % (75.8 %) and 34.4 % (43.3 %), respectively. For the linear trend estimation itself, the OLS trend substantially overestimated the urbanization effect and contribution for some of the non-Gaussian extreme indices.

  4. 4.

    The urbanization effects (contributions) of downtown Shanghai reported in this study, based on the compromise of selecting a reference station, should be regarded as conservative estimations, as a real “rural” site adjacent to Shanghai that is completely free from the effects of urbanization is difficult to find.

The above results suggest that the trend estimation and significance test method is important for the detection and attribution of the influence of urbanization on changes in temperature extremes, especially when they are non-Gaussian and/or serially dependent. For analyzing the urbanization effect on changes in other climate extremes or even precipitation, this point needs to be considered. It is also noted that the substantial urbanization forcing in the mean temperature and extreme temperature indices for cities in eastern China like Shanghai needs to be taken into account when performing detection and attribution analyses of other human influences on climate change, or event attribution, especially at local scales.

The proposed method for the significance test for the EEMD nonlinear trend can also be applied in hydrology and many other geophysical or non-geophysical research settings when the sample data are non-Gaussian and serially dependent. Even if the sample data are Gaussian and/or independent, this method can still be applied since the method is adaptive, which generates surrogate data with the same distribution and the same autocorrelation structure as the sample data.