1 Introduction

Climate change mitigation seeks the progressive substitution of fossil energy sources by cleaner renewable ones (e.g. Edenhofer et al. 2014). In this framework wind-power production is one of the most rapidly evolving fields (e.g. Chen 2011; Higgins and Foley 2013; Firestone et al. 2015). However, wind is subject to strong variability at multiple time scales (Pryor et al. 2006). The critical effects that the irregular alternation of calm and gusty periods has on both wind-farm operation and electricity distribution make the prediction of wind speed anomalies an engaging research topic for climate services (Hewitt et al. 2012; Torralba et al. 2017). Yet, while short-term wind speed forecasting is already consolidated (Costa et al. 2008; Zhu and Genton 2012), the use of seasonal forecasts in operational long-range planning is rather limited (Jung and Broadwater 2014).

Moreover, in the framework of seasonal forecast verification it is important to know whether the characteristics of the simulated distribution of a variable are similar to those of its reference counterpart. This is essential to guide post-processing techniques such as bias correction or statistical downscaling (e.g. Benestad et al. 2008; Ruffault et al. 2013). In this regard, choosing ERA-Interim (ERA-Int; Dee et al. 2011) as the reference is an opportunity to analyse the distributional characteristics of a dataset that has been extensively used in a wide range of research fields such as climatology (e.g. Škerlak et al. 2014), climate change (e.g. Andres et al. 2014) and the characterization of extremes (e.g. Cornes and Jones 2013). More specifically, it has also been widely applied, with positive results, in wind-power forecasting, production and verification (e.g. Kiss et al. 2009; Rose and Apt 2015; Lorenz and Barstad 2016).

Concerning the seasonal forecast system, we have selected the European Centre for Medium-Range Weather Forecasts (ECMWF) System 4 (S4; Molteni et al. 2011), which is the evolution of the well-regarded ECMWF System 3 (Stockdale et al. 2011) and whose full potential is still being explored (e.g. Tompkins and di Giuseppe 2015; Marcos et al. 2015; Ogutu et al. 2016; Marcos et al. 2017; Torralba et al. 2017). Nevertheless, there are still great uncertainties in the results obtained by seasonal forecast systems (Parker 2016), due to atmospheric and oceanic uncertainty along with the need for several parametrizations and computational approximations during calculations (Palmer et al. 2005; Delsole and Shukla 2010). This makes post-processing methods such as bias correction and downscaling necessary for end-user applications. These methodologies, however, require a thorough knowledge of the properties of the variable distributions depicted by both the seasonal forecast systems and the reference datasets (e.g. Amengual et al. 2012).

Consequently, this paper aims to fill the existing gap in the characterization of the probability distribution of 10 m wind speed at a global scale by discussing and comparing the most relevant features of this variable in DJF (December–January–February) and JJA (June–July–August) for both the S4 and ERA-Int datasets. These seasons are the most important for stakeholders because they hold the greatest potential regarding seasonal predictability and wind-power production in both hemispheres (e.g. Trenberth and Olson 1988; Lu et al. 2009). Our evaluation provides valuable information on the statistical parameters that should be considered in seasonal post-processing methods as well as when comparing predictions with reanalysis. This is relevant for the development of wind speed applications and services.

The manuscript is organised as follows. Section 2 describes the ERA-Int and S4 datasets and defines the statistical parameters used to characterize the probability distribution of 10 m wind speed. Section 3 characterizes and compares the ERA-Int and S4 10 m wind speed distributions in different seasons and time frames. Section 4 summarises the main outcomes of the study.

2 Material and methods

2.1 Datasets

We have used 10 m wind speed monthly means from the ERA-Int reanalysis and the S4 seasonal prediction system. Although wind speed at higher altitudes would be desirable for wind industry applications (e.g. 100 m; Drechsel et al. 2012), state-of-the-art seasonal prediction systems only provide wind speed values at 10 m height. For that reason, and taking into account that 10 m wind speed is a common proxy for higher-altitude winds (e.g. Gryning et al. 2007), we focus our analysis on the near-surface (10 m) wind speed.

The ERA-Int reanalysis is a global atmospheric reanalysis issued by the European Centre for Medium-Range Weather Forecasts (ECMWF). It spans from 1979 to the present at a \(0.7^{\circ }\) resolution, and it is updated on a real-time basis (Dee et al. 2011). In comparison to the previous system, ERA-40 (Uppala et al. 2005), it shows multiple improvements such as the incorporation of four-dimensional variational data assimilation (4D-Var), the increase in system resolution (\(\sim\)80 km) and the enhancement of the forecast system physics.

The S4 seasonal prediction system (Molteni et al. 2011) is a fully coupled general circulation system that provides operational multi-variable seasonal forecasts on a real-time basis at \(0.7^{\circ }\) resolution. In this study we focus on a 35-year period, 1981–2015, obtained by combining the 30-year hindcast with the 5-year pool of contemporary forecasts (from 2011 onwards). The forecasts are initialised on the first day of every month and span 7 months into the future. Although there are 51 members for the start dates of February, May, August and November, and for every month since May 2011, we retained only 15 members to stay coherent with the remaining months. Predictions starting in August, September, October, November and December have been selected for the DJF season, whereas those starting in February, March, April, May and June have been selected for JJA.
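The correspondence between target season, lead time and start date can be made explicit with a short sketch. It assumes the convention of the Fig. 2 caption, in which lead 0 means a forecast initialised in the first month of the season; the names `MONTHS`, `SEASON_FIRST_MONTH` and `start_month` are hypothetical helpers introduced only for illustration.

```python
# Hypothetical helper: map a target season and a lead time (in months)
# to the calendar month in which the forecast is initialised.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# First target month of each season (1 = January, ..., 12 = December).
SEASON_FIRST_MONTH = {"DJF": 12, "JJA": 6}


def start_month(season: str, lead: int) -> str:
    """Return the initialisation month for `season` at `lead` months.

    Assumption: lead 0 is a forecast started in the first month of the season.
    """
    first = SEASON_FIRST_MONTH[season]
    index = (first - 1 - lead) % 12  # zero-based index into MONTHS
    return MONTHS[index]


# Leads 0 to 4 reproduce the five start dates listed for each season.
djf_starts = [start_month("DJF", lead) for lead in range(5)]
jja_starts = [start_month("JJA", lead) for lead in range(5)]
print("DJF:", djf_starts)
print("JJA:", jja_starts)
```

Under this convention a forecast started in August needs all 7 forecast months (August to February) to cover DJF, which is consistent with lead 4 being the longest lead considered.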

2.2 Methods

In this work we have characterized the main properties of the wind speed probability density function of both ERA-Int and S4 by computing five statistical parameters: mean, standard deviation, coefficient of variation, skewness and excess kurtosis. We have also assessed the goodness of fit to a Gaussian distribution through the Shapiro–Wilks test (Wilks 2006). The statistical parameters of the distribution have been analysed on both an interannual and an intraseasonal basis to distinguish the contribution of each variability source (e.g. Achuthavarier and Krishnamurthy 2010; Luo et al. 2011). Interannual statistics reflect the properties of the seasonal mean distribution, i.e., how values vary from year to year for a particular season, whereas intraseasonal statistics reflect the average properties of the distribution inside a season, i.e., how values vary among the months of the season. The difference between the two time frames is that at the intraseasonal scale we do not average the wind speed of the 3 months of the season; instead, we concatenate each group of three monthly values in the time series. Finally, it is worth noting that S4 has 15 times more elements per year and grid point than ERA-Int, because the 15 ensemble members are included in the computation. To assess the statistical significance of the model-reanalysis differences we have used a bootstrapping method. This has also been applied to assess the significance of the excess kurtosis and skewness statistics in ERA-Int. Our approach consists of the following steps.

  1. Compute the statistic for both ERA-Int and S4.

  2. Repeat step 1 a total of 1000 times, resampling the data with replacement.

  3. Compute the difference of each pair of the 1000 values (this step only applies when working with differences).

  4. Obtain the 97.5th and 2.5th percentiles of the distribution of each statistic.

  5. If zero falls above the 97.5th or below the 2.5th percentile, the statistic (or the difference) is regarded as significantly different from 0 at the 95% confidence level.
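The five steps above can be sketched for a single grid point as follows. This is a minimal illustration rather than the code used in the study: the data are synthetic, the function names (`percentile`, `bootstrap_difference`, `mean`) are hypothetical, and pooling all ensemble members into one sample before resampling is an assumption, since the text does not specify how the S4 members are resampled.

```python
import random


def mean(values):
    """Arithmetic mean (Eq. 1)."""
    return sum(values) / len(values)


def percentile(sorted_values, q):
    """Percentile q (0-100) of a sorted list, with linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def bootstrap_difference(model, reference, statistic, n_boot=1000, seed=0):
    """Bootstrap the difference statistic(model) - statistic(reference).

    Returns the 2.5th and 97.5th percentiles of the resampled differences
    and a flag telling whether zero lies outside that interval.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(n_boot):
        # Steps 1-2: recompute the statistic on samples drawn with replacement.
        model_sample = rng.choices(model, k=len(model))
        reference_sample = rng.choices(reference, k=len(reference))
        # Step 3: difference of each pair of resampled statistics.
        differences.append(statistic(model_sample) - statistic(reference_sample))
    differences.sort()
    # Step 4: percentiles of the bootstrap distribution.
    low = percentile(differences, 2.5)
    high = percentile(differences, 97.5)
    # Step 5: significant if zero falls outside the interval.
    return low, high, not (low <= 0.0 <= high)


# Synthetic data (not from the paper): 35 reference values, 35 x 15 model values.
data_rng = random.Random(42)
reference = [data_rng.gauss(5.0, 1.0) for _ in range(35)]
biased_model = [data_rng.gauss(7.0, 1.0) for _ in range(35 * 15)]
unbiased_model = reference * 15  # same empirical distribution as the reference

low_b, high_b, sig_biased = bootstrap_difference(biased_model, reference, mean)
low_u, high_u, sig_unbiased = bootstrap_difference(unbiased_model, reference, mean)
print(f"biased model:   [{low_b:.2f}, {high_b:.2f}] significant={sig_biased}")
print(f"unbiased model: [{low_u:.2f}, {high_u:.2f}] significant={sig_unbiased}")
```

Any function of a sample (standard deviation, skewness, excess kurtosis) can be passed as `statistic`. To test a single statistic against zero, as done for the ERA-Int skewness and excess kurtosis, step 3 would be skipped and the percentiles taken directly from the resampled statistic.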

Next we introduce each of the statistical metrics used as well as the reasons for choosing it. All these statistics are obtained for each grid point and lead time separately. The mean, \({{\bar{x}}}\), is the first moment of the distribution and a useful measure to characterize the usual conditions at a particular location or grid point.

$$\begin{aligned} {\bar{x}}=\frac{1}{N}\sum \limits _{i=1}^{N}{{{x}_{i}}} \end{aligned}$$
(1)

From now on \(x_{i}\) refers to each value of the distribution and N to the total number of elements in the distribution. The difference in the mean value (climatology) between the seasonal predictions and the reference dataset is a measure of the systematic error. Having computed the distribution’s central value, we have characterized its variability around the mean with the standard deviation (\(\sigma\)), which is the square root of the variance (the second central moment).

$$\begin{aligned} \sigma = \sqrt{\frac{1}{N-1} \sum _{i=1}^N (x_i - {\overline{x}})^2} \end{aligned}$$
(2)

This is helpful to understand whether wind speed values spread far from the mean or, conversely, if they are clustered around it. From an end-user perspective this information is important because it is directly related to the capacity factor curves (e.g. Sinden 2007).

The coefficient of variation (cv) is a standardized measure of dispersion. It is often expressed as a percentage, and is defined as the ratio of the standard deviation to the mean:

$$\begin{aligned} cv=\frac{\sigma }{\bar{x}} \end{aligned}$$
(3)

The smaller the coefficient of variation, the less dispersion there is in the distribution relative to its mean. This is an interesting parameter because it relates the magnitude of the variability to the mean wind value. Thus, it not only allows distinct wind regime regions to be compared (e.g. Bett et al. 2017), but also provides information on the areas with lower variability compared to the mean, which is important when studying appropriate sites for wind-farm installation (the production is easier to plan).

Skewness is a measure of the asymmetry of a distribution. It indicates whether a sample has a longer tail to the left of its mean (left-skewed, negative skewness), to the right (right-skewed, positive skewness) or is symmetrically distributed around the mean (zero skewness, as in the normal distribution). It is defined as:

$$\begin{aligned} skw=\frac{1}{N}\sum \limits _{i=1}^{N}{{{\left[ \frac{{{x}_{i}}-\bar{x}}{\sigma } \right] }^{3}}} \end{aligned}$$
(4)

Kurtosis is based on the fourth standardized moment of the distribution and measures the weight of the tails of a sample distribution. Its excess over the Gaussian value can be defined as:

$$\begin{aligned} kurt=\left\{ \frac{1}{N}\sum \limits _{i=1}^{N}{{{\left[ \frac{{{x}_{i}}-\bar{x}}{\sigma } \right] }^{4}}} \right\} -3 \end{aligned}$$
(5)

Kurtosis is often compared to that of a Standard Normal distribution, which equals 3, by subtracting 3 from the kurtosis value, yielding what is known as ‘excess kurtosis’. For a positive (negative) excess kurtosis the weight of the tails is higher (lower) than for the Gaussian distribution. That is, the probability of an extreme located in the tails of the distribution is also higher (lower) than for the Gaussian distribution.
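As an illustration of Eqs. 1 to 5, the following sketch computes the five parameters for a single series. The function name `distribution_parameters` and the two small samples are hypothetical and chosen only so that the results can be checked by hand; note that, following Eq. 2, the standard deviation uses N − 1 in the denominator, so the values may differ slightly from library routines based on population moments.

```python
import math


def distribution_parameters(values):
    """Return mean, standard deviation, cv, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n                                           # Eq. 1
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))  # Eq. 2
    cv = std / mean                                                  # Eq. 3
    standardized = [(x - mean) / std for x in values]
    skewness = sum(z ** 3 for z in standardized) / n                 # Eq. 4
    excess_kurtosis = sum(z ** 4 for z in standardized) / n - 3.0    # Eq. 5
    return {
        "mean": mean,
        "std": std,
        "cv": cv,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


# Synthetic wind speeds in m/s (illustrative values, not taken from the datasets).
symmetric = [2.0, 4.0, 6.0, 8.0, 10.0]      # symmetric around 6 m/s
right_tailed = [1.0, 1.0, 1.0, 1.0, 10.0]   # one large value: long right tail

sym = distribution_parameters(symmetric)
tail = distribution_parameters(right_tailed)
print({name: round(value, 3) for name, value in sym.items()})
print({name: round(value, 3) for name, value in tail.items()})
```

The symmetric sample yields a skewness of zero (up to rounding), whereas the sample with a single large value yields a positive skewness, in line with the sign convention of Eq. 4.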

The goodness of fit assesses how closely a sample resembles a specific distribution. In our case we have assessed how close our sample data are to a normal distribution. Although instantaneous values or daily accumulations of many climatic variables do not have Gaussian distributions, at interannual or intraseasonal time scales the distributions tend to be ‘near-normal’ in accordance with the central limit theorem (Wilks 2006).

The evaluation of the goodness of fit is based on the Shapiro–Wilks test (Shapiro and Wilk 1965), which quantifies how similar the distribution is to a normal one. This is a statistical hypothesis test whose null hypothesis is that the N values of a variable z (\(z_{1}\), ...\(z_{N}\)) come from a normally distributed population. The test statistic is calculated as follows:

$$\begin{aligned} W=\frac{{{\left( \sum \nolimits _{i=1}^{N}{{{a}_{i}}{{z}_{i}}} \right) }^{2}}}{\sum \nolimits _{i=1}^{N}{{{\left( {{z}_{i}}-\bar{z} \right) }^{2}}}} \end{aligned}$$
(6)

Here, \(z_{i}\) are the ordered sample values (\(z_{1}\) is the smallest) and the \(a_{i}\) are constants derived from the expected values and the covariance matrix of the order statistics of a normally distributed sample of size N. Since the null hypothesis is that the population is normally distributed, if the p-value computed with the statistic (6) is lower than a pre-set significance level (normally 0.05, i.e. a 95% confidence level), the null hypothesis is rejected and the distribution is considered not normal. Since skewness and kurtosis offer another way of assessing the normality of the distribution, the outcomes and discussion linked to the goodness of fit are provided as Supplementary Material. All these statistical parameters are important to characterize the statistical distributions from both the model and the reanalysis. Besides, their difference is useful to identify which specific features of the statistical distributions differ between the prediction and reference datasets.

3 Results

In this section we present the results for the DJF and JJA seasons at interannual and intraseasonal time scales. We start by analysing the results from ERA-Int and, afterwards, we assess the differences between S4 and this reanalysis. To avoid clutter we only provide figures for DJF (results for the JJA season are in the Supplementary Material).

3.1 ERA-Int

The spatial patterns of the mean wind speed (climatology) for interannual and intraseasonal time scales are identical because the mean does not depend on the number of steps (or the order) in which the averaging is applied (e.g. Fig. 1a, b). For DJF, the climatology shows a widespread area of high wind speeds in the northern and southern extra-tropical oceans, with secondary maxima around 20\(^\circ\) north and south (Fig. 1a, b). The maximum wind speeds are over the oceans while the minimum values lie inland, with generally stronger winds in the Northern Hemisphere. In fact, the wind speed over land is generally less than half that observed over the oceans. This behaviour is a consequence of the increased roughness over the continents due to relief. For JJA, the maximum wind speeds are restricted to the Southern Hemisphere, around the extra-tropics and in the Indian Ocean, where the monsoon structure is clear (Fig. 6a, b in the Supplementary Material). It is worth noting that the difference in maximum wind speed between the two hemispheres is more noticeable in JJA than in DJF, probably as a result of the stability of the southern circumpolar storm-track structure (Chang et al. 2002). Nevertheless, in both seasons the maximum wind speeds are observed over the oceans, with the minimum values over land.

The spatial patterns of the standard deviation inform us about the variability of the wind speed. Both DJF (Fig. 1c, d) and JJA (Fig. 6c, d in the Supplementary Material) show interannual values much lower than the intraseasonal ones. These differences are expected since month-to-month variations are normally larger than interannual changes. For DJF the strongest variability is over the oceans and, more specifically, in the Northern Hemisphere, where there is a region of high variability in the north-eastern Atlantic (Fig. 1c, d). This particular area contains the tail of the North Atlantic storm track, where the storms enter the continent (Chang et al. 2002). Moving to the southern oceans, there is a region of strong variability in the maritime continent (Indonesia) that propagates northwards. In contrast to DJF, JJA shows less variability (Fig. 6c, d in the Supplementary Material). Its maximum values are centred in the northern tropical regions and the southern Antarctic Ocean. It is worth noting that the differences between DJF and JJA are more remarkable in the Northern Hemisphere than in its Southern counterpart, probably due to the larger oceanic area in the latter.
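The two properties discussed above, identical means but different standard deviations at the two time scales, can be checked on a toy series. The sketch below uses synthetic monthly values for one grid point (not taken from either dataset) and builds the interannual series by averaging the three months of each season and the intraseasonal series by concatenating them; the variable names are illustrative.

```python
import statistics

# Synthetic monthly-mean wind speeds (m/s) for one grid point:
# each inner list holds December, January and February of one winter.
monthly = [
    [6.0, 8.0, 7.0],
    [5.0, 9.0, 7.0],
    [7.0, 7.0, 10.0],
    [4.0, 8.0, 6.0],
]

# Interannual series: one seasonal mean per year.
interannual = [statistics.mean(year) for year in monthly]
# Intraseasonal series: the three monthly values of every year, concatenated.
intraseasonal = [value for year in monthly for value in year]

# Every season contributes the same number of months, so the mean of the
# seasonal means equals the mean of all the monthly values.
print(statistics.mean(interannual), statistics.mean(intraseasonal))

# Averaging within each season removes the month-to-month spread, so the
# interannual standard deviation is smaller than the intraseasonal one here.
print(statistics.stdev(interannual), statistics.stdev(intraseasonal))
```

In this toy case both means are 7 m/s, while the standard deviation is roughly twice as large for the concatenated series as for the seasonal means.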

Fig. 1

Global 10 m wind speed probability distribution parameters from ERA-Int. They are computed for DJF at interannual and intraseasonal time-scales spanning the period 1981–2015. In skewness and excess kurtosis the hatching denotes regions where the values are different from 0 at 95% confidence level computed with a bootstrapping method. a Interannual climatology, b intraseasonal climatology, c interannual standard deviation, d intraseasonal standard deviation, e interannual coefficient of variation, f intraseasonal coefficient of variation, g interannual skewness, h intraseasonal skewness, i interannual excess kurtosis, j intraseasonal excess kurtosis

The coefficient of variation is the parameter displaying the greatest difference between the spatial patterns at interannual and intraseasonal time scales (DJF, Fig. 1e, f; and JJA, Fig. 6e, f in the Supplementary Material). This is to be expected since it includes the standard deviation, which tends to be higher at the intraseasonal than at the interannual level. In DJF the greatest values are seen in the tropical Pacific and Indian Oceans and, also, the Greenland coast and western European regions (Fig. 1e, f). In JJA, in contrast to DJF, the largest values are found only in the inter-tropical oceans, generally to the north of the Equator (Fig. 6e, f in the Supplementary Material).

The spatial patterns of the third moment (skewness) are noisy at both the interannual and intraseasonal levels and for both seasons, DJF (Fig. 1g, h) and JJA (Fig. 6g, h in the Supplementary Material). However, there are still some structures that can be outlined. Intraseasonal DJF and JJA show some prevalence of positive skewness over the continents (Fig. 1h and Supplementary Material Fig. 6h) and positive values along the Equator that are surrounded by negative ones. Additionally, around 30\(^\circ\) north and south of the Equator there is a global strip of positive values which is more evident in DJF (Fig. 1h) than in JJA (Fig. 6h in the Supplementary Material). Finally, in DJF, we find positive values in the Arctic Ocean and negative skewness in the Antarctic Ocean. These structures are interesting because they highlight the influence of intraseasonal circulation processes on the 10 m wind speed distribution. At interannual time scales the only remarkable structure is the dipole of the Eastern Pacific, which can be found in both DJF and JJA and could be linked to ENSO (El Niño Southern Oscillation; Fig. 1g and Supplementary Material Fig. 6g).

The excess kurtosis, derived from the fourth moment compared to the normal, is interesting because it depicts some predominance of slightly negative values worldwide for both seasons, especially at interannual time scales (Fig. 1i and Supplementary Material Fig. 6i). This indicates that the wind speed distribution has lighter tails than a normal distribution and, consequently, a lower than normal frequency of occurrence of wind speed extreme events in those regions. Besides, the most interesting structure lies in the Eastern Pacific, where a patch of positive values can be identified at both time frames and seasons (DJF, Fig. 1i, j; and JJA, Supplementary Material Fig. 6i, j); there, outliers are more frequent than for a Gaussian-distributed variable (Figs. 4a, c, 5a, c).

3.2 Differences between S4 and ERA-Int

In this section we evaluate the S4 predictions, with November (DJF) and May (JJA) start dates, against the reference data from ERA-Int using the parameters of the wind speed distribution (the wind speed distribution obtained with S4 is shown in Supplementary Material Figs. 7, 8). We limit the study to the first lead time because it retains some degree of predictability at extra-tropical latitudes (e.g. Doblas-Reyes et al. 2013) and because the differences between start dates are rather small, as illustrated in Fig. 2. This figure shows the S4 biases for DJF and JJA computed for three different start dates.

Fig. 2

Global 10 m wind speed S4 biases relative to ERA-Int. They are computed for DJF and JJA considering the period 1981–2015 for 0, 2 and 4 months lead time. a DJF Lead 0, December initialization. b JJA Lead 0, June initialization. c DJF Lead 2, October initialization. d JJA Lead 2, April initialization. e DJF Lead 4, August initialization. f JJA Lead 4, February initialization

These results show that the seasonal bias is largely independent of the start date of the prediction. This might be due to the mixing of different lead times when building the season (e.g. DJF built from the November start date combines a 1-month lead time for December, a 2-month lead time for January and a 3-month lead time for February). However, this hypothesis requires further study, which is beyond the scope of this research.

Differences in climatology for the DJF (Fig. 3a, b) and JJA (Fig. 9a, b in the Supplementary Material) seasons show that S4 has a systematic positive bias worldwide. More specifically, in DJF (Fig. 3a, b) the positive discrepancies lie in the inter-tropical oceans, eastern Siberia and western Canada. The areas where S4 underestimates wind speeds are most notable in the Himalayan region, the Atlantic Ocean close to the western coast of Equatorial Africa and the central and eastern Pacific Ocean. In JJA (Fig. 9a, b in the Supplementary Material), S4 overestimates the values over the oceans, especially in the inter-tropics and the oceanic region south of Australia, and underestimates wind speeds in the eastern Pacific (near the Colombian coast), in the northern tropical Atlantic, in a strip of land between 20\(^\circ\) and 40\(^\circ\) north covering African and Asian territories and, also, in the Arctic Ocean.

The differences in the standard deviation patterns at interannual and intraseasonal time scales have a similar magnitude at global scale (Fig. 3c, d and Supplementary Material Fig. 9c, d), although the latter displays slightly higher absolute values. There is also no dominant overall sign in either DJF or JJA. In DJF, at intraseasonal time scales (Fig. 3d), there is higher variability over Indonesia and the Indian Ocean along a broader region than at interannual time scales (Fig. 3c). At both time scales, there are positive values over Indonesia, the central equatorial Pacific and the northwestern Pacific. Negative values, on the other hand, are present in northern Europe, Siberia and the eastern equatorial Pacific. Positive (negative) values of the differences indicate that S4 overestimates (underestimates) the variability displayed by ERA-Int. In JJA, the differences at the interannual level are less pronounced than at the intraseasonal level, although they hold a similar spatial structure (Fig. 9c, d in the Supplementary Material). In this case the interesting areas lie in the inter-tropics, especially in the maritime continent (where S4 underestimates ERA-Int), the Bay of Bengal (where S4 overestimates) and the equatorial Atlantic and Pacific Oceans (S4 underestimates variability in the Pacific and overestimates it in the Atlantic).

Fig. 3

Differences in wind speed probability distribution parameters computed for DJF at interannual and intraseasonal time-scales (S4 minus ERA-Int) for the period 1981–2015. The seasonal predictions for DJF have been initialized on the 1st of November. The hatching denotes regions where the values are different from 0 at 95% confidence level computed with a bootstrapping method. a Interannual climatology, b intraseasonal climatology, c interannual standard deviation, d intraseasonal standard deviation, e interannual coefficient of variation, f intraseasonal coefficient of variation, g interannual skewness, h intraseasonal skewness, i interannual excess kurtosis, j intraseasonal excess kurtosis

The coefficient of variation differences show localised larger values at intraseasonal than at interannual time frames, both for DJF and JJA (e.g. in the Indian Ocean, North America or Siberia; Fig. 3e, f and Supplementary Material Fig. 9e, f). An overestimation of the coefficient means that either the variability is too large or the mean is too small, always relative to the ERA-Int reference. The increased value of the differences in several areas at the intraseasonal scale arises because the variability between months within the same season tends to be higher than the interannual one. More specifically, in DJF the regions with the most intense values are found in the intertropical areas and in a strip of land around 60\(^\circ\) north, from Scandinavia to Siberia (Fig. 3e, f). Intertropical regions present discrepancies at both time scales, with an underestimation of the cv by S4 over most of the Pacific Ocean, whereas an underestimation by S4 is found mainly over the Atlantic and Indian Oceans. In the strip of land from Scandinavia to Siberia S4 underestimates the cv due to the intense underestimation of the S4 standard deviation over this area (Fig. 3c, d). It is also possible to find negative cv differences over North America, which are more intense at intraseasonal than at interannual time scales. In JJA (Fig. 9e, f in the Supplementary Material), the inter-tropics are again the region where the magnitude of the differences is greatest, particularly at intraseasonal time scales, where negative (positive) values of the differences are mostly found over the central and western Pacific Ocean, South America, Africa and the Indian Ocean (the central Atlantic and the Indian region). Outside tropical zones, over land, there is an intensification of the negative values over the southern part of North America, northern Europe, Asia and South America. As with the standard deviation, the spatial amplitude of the cv values is higher in the winter season than in summer and, in the case of JJA, higher at intraseasonal time scales.

When studying skewness and excess kurtosis it is important to note that S4 (Figs. 7g–j and 8g–j, both in the Supplementary Material) is not capable of reproducing the structure of ERA-Int (Fig. 1g–j, Supplementary Material Fig. 6g–j), especially in the case of kurtosis, where it only has non-zero values around the inter-tropics. Besides, these differences are mostly non-significant at the 95% confidence level, especially at interannual scales (except in winter intraseasonal excess kurtosis, where the differences are significant almost everywhere, Fig. 3h). This behaviour is also observed when assessing the statistical significance at the 95% confidence level in ERA-Int (Fig. 1g–j). However, if we focus on the differences in skewness, they are more intense at interannual than at intraseasonal time scales (Fig. 3g, h and Supplementary Material Fig. 9g, h). This means that the magnitude of the third moment is better depicted by S4 when considering months within a season than when gathering seasons on a year-to-year basis. It is worth remembering that regions with S4 underestimation (blue colours in Fig. 3g, h and Supplementary Material Fig. 9g, h) imply that the S4 distribution has a lower skewness than the ERA-Int reference, i.e. relative to the reference its bulk is displaced towards higher values and its longer tail towards lower values. This indicates that wind speeds above the mean are more probable for S4 than for ERA-Int. In case of overestimation, the behaviour is the opposite (red colours in Fig. 3g, h and Supplementary Material Fig. 9g, h). Besides, the spatial difference patterns are very noisy, and they barely resemble each other even within the same season. More specifically, the areas holding the greatest differences are located in the inter-tropics, especially in the central and eastern Pacific at interannual scales (Fig. 3g and Supplementary Material Fig. 9g). Moreover, at the DJF interannual scale, the Arctic shows some consistent S4 overestimation (Fig. 3g). Finally, regarding JJA, S4 underestimates skewness at interannual scales in the Antarctic Ocean.

Finally, the differences in excess kurtosis are also more intense at the interannual than at the intraseasonal level (Fig. 3i, j and Supplementary Material Fig. 9i, j). That said, we have to consider that S4 is not able to produce sufficiently large excess kurtosis values outside the inter-tropical region. Areas with S4 overestimation indicate greater values of excess kurtosis than in ERA-Int and, thus, heavier tails than the reference. Conversely, regions with S4 underestimation indicate smaller values of excess kurtosis than in ERA-Int and, hence, lighter tails. Having lighter (heavier) tails means that S4 shows fewer (more) 10 m wind speed extreme values than ERA-Int. In DJF, interannually (Fig. 3i), the most interesting regions lie in the central inter-tropical Pacific, where there is a region of S4 underestimation surrounded by overestimated areas; the northern extra-tropical Atlantic, with patches of S4 underestimation; and northern Europe, which shows an area of S4 overestimation. Intraseasonally (Fig. 3j), we can highlight the Southern Hemisphere, with patchy areas of more intense S4 overestimation and underestimation, and the Arctic Ocean (with areas of intense S4 underestimation). Regarding JJA, there seems to be a predominance of positive values, mainly at the interannual level (Fig. 9i in the Supplementary Material). In this case, although the behaviour is even noisier, we can identify some patterns in the inter-tropics (mainly S4 overestimation) and in the southern Pacific Ocean (S4 underestimation).

3.3 The Shapiro–Wilks test

In this section we summarise the application of the Shapiro–Wilks test to evaluate whether the distribution of the 10 m wind speed from both ERA-Int and S4 fits a Gaussian distribution. The evaluation is performed for DJF and JJA at interannual and intraseasonal time scales (Figs. 4, 5), as for the rest of the parameters of this study. In this test the null hypothesis is that the underlying distribution is Gaussian. We have set the confidence level at 95%, which means that the null hypothesis is rejected when the p-value is equal to or below 0.05.

At intraseasonal time scales the regions that cannot be regarded as Gaussian for ERA-Int lie basically in the inter-tropical region for JJA (Fig. 5c), and extend to some other areas in extra-tropical regions for DJF (Fig. 5a). Interannually these areas are scarcer (Fig. 4a, c), although it is worth noting that the tropical Pacific still shows some deviations from a Gaussian distribution. In S4 the area with p-values under 0.05 (i.e. where we can reject the null hypothesis) is larger than that found for ERA-Int (Figs. 4b, d, 5b, d). At the intraseasonal time scale we find some areas in the extra-tropics where the null hypothesis is not rejected (Fig. 5b, d), although these areas are narrower than those at the interannual level (Fig. 4b, d).

Fig. 4

Shapiro–Wilks goodness of fit test for ERA-Int and S4 interannual 10 m wind speed, considering the period 1981–2015. The seasonal predictions for DJF have been initialized on the 1st of November; for JJA, they have been initialized on the 1st of May. a DJF, ERA-Int, sample size 35. b DJF, S4, sample size 525. c JJA, ERA-Int, sample size 35. d JJA, S4, sample size 525

The number of values entering the Shapiro–Wilks test can explain the observed differences between ERA-Int and S4 and between the interannual and intraseasonal time scales. The sample size for each grid point is 35 in Fig. 4a, c; 105 (\(35\times 3\)) in Fig. 5a, c; 525 (\(35\times 15\)) in Fig. 4b, d; and 1575 (\(35\times 3\times 15\)) in Fig. 5b, d. That is, in the intraseasonal configuration three times as much data enter the test as in the interannual case. Furthermore, S4 contributes 15 times more values than ERA-Int (one for each ensemble member).

Therefore, the rate at which the Shapiro–Wilk goodness-of-fit test rejects the hypothesis of normality is notably dependent on the sample size. This is not a unique feature of the Shapiro–Wilk test, but a general characteristic of any hypothesis test relying on the p-value approach, because the power of a test depends on both the effect size and the sample size (e.g. Marsh and McDonald 1988; Sullivan and Feinn 2012).
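The general point can be illustrated with a deliberately simpler test than Shapiro–Wilk. The sketch below uses a two-sided one-sample z-test (not the test applied in this study) and a hypothetical standardised effect size of 0.1, and computes the p-value that would result if exactly that deviation were observed at each of the four sample sizes considered here.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05
EFFECT_SIZE = 0.1  # hypothetical standardised mean shift, for illustration only
SAMPLE_SIZES = [35, 105, 525, 1575]  # sample sizes quoted for Figs. 4 and 5


def two_sided_p_value(effect_size, n):
    """p-value of a one-sample z-test when the observed standardised
    mean shift equals effect_size and the sample has n values."""
    z = effect_size * sqrt(n)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


p_values = {n: two_sided_p_value(EFFECT_SIZE, n) for n in SAMPLE_SIZES}
for n, p in p_values.items():
    verdict = "reject" if p <= ALPHA else "do not reject"
    print(f"n = {n:4d}  p = {p:.4f}  -> {verdict} H0")
```

With the deviation held fixed, the p-value decreases monotonically as the sample grows, so the same small departure from the null hypothesis goes undetected at the two smaller sizes and is flagged as significant at the two larger ones.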

Fig. 5

Shapiro–Wilk goodness-of-fit test for ERA-Int and S4 intraseasonal 10 m wind speed, considering the period 1981–2015. The seasonal predictions for DJF were initialized on the 1st of November; those for JJA were initialized on the 1st of May. a DJF, ERA-Int, sample size 105. b DJF, S4, sample size 1575. c JJA, ERA-Int, sample size 105. d JJA, S4, sample size 1575

4 Discussion and conclusions

In this work we have outlined the main characteristics of the global 10 m wind speed distribution of the ERA-Int reanalysis and used this information to validate the corresponding S4 distribution. We have applied a set of six statistical diagnostics (mean value, standard deviation, coefficient of variation, kurtosis, skewness and the Shapiro–Wilk goodness-of-fit test) and studied them for the DJF and JJA seasons at interannual and intraseasonal time scales and for all possible start dates. This diagnostic work helps to locate the hot-spot regions for the study of wind speed from a bias adjustment standpoint, which differ depending on the parameter considered.

In the ERA-Int climatology (mean), DJF shows a widespread area of higher wind speeds in the northern and southern extra-tropical oceans, with secondary maxima around 20\(^\circ\) north and south. In JJA, on the other hand, the maximum wind speeds are restricted to the Southern Hemisphere, around the extra-tropics and in the Indian Ocean. In both seasons the maximum wind speeds are observed over the oceans, with the minimum values over land. The spatial standard deviation patterns for DJF are similar at interannual and intraseasonal time scales. In DJF the greatest standard deviation values are over the oceans, and in JJA, in the northern tropical regions and the Antarctic Ocean. The coefficient of variation, on the other hand, is the parameter displaying the greatest difference between the interannual and intraseasonal frameworks. In DJF the most interesting areas are the southern tropical Pacific and Indian Oceans and, in and around Europe, the Greenland coast and western European regions. In JJA, contrary to DJF, the largest values are found only in the inter-tropical oceans, generally to the north of the Equator. In skewness and excess kurtosis, noisiness is the prevalent characteristic for both DJF and JJA, at interannual as well as intraseasonal time scales. In skewness, around 30\(^\circ\) north and south of the Equator there is a global strip of positive values which is more evident in DJF than in JJA. Additionally, in DJF, we find positive values in the Arctic Ocean and negative skewness in the Antarctic. For the excess kurtosis, the most interesting structure lies in the Eastern Pacific, where a patch of positive values can be identified in both seasons and time scales. Regarding the seasonal S4 bias, it is almost independent of the lead time.

The differences in the first statistical parameter (mean) reveal that S4 systematically overestimates wind speed at global scales, except for some specific regions in which S4 underestimates it. For standard deviation, the largest differences for JJA are found in the inter-tropical areas at the intraseasonal time scale, whereas for DJF the largest differences are found in the tropical areas and also in Northern Europe. The disparity of values between ERA-Int and S4 is larger in DJF than in JJA, and over the oceans than over the continents. The coefficient of variation differences show larger values at intraseasonal than at interannual time scales, both for DJF and JJA. More specifically, in DJF the inter-tropics is the area where the greatest discrepancies can be observed. The other significant area is a strip of land around 60\(^\circ\) north, from Scandinavia to Siberia, where S4 underestimates the coefficient of variation due to the intense underestimation of the S4 standard deviation over this area. Regarding JJA, the inter-tropics is again the region where the magnitude of the differences is greatest. When studying skewness and excess kurtosis, S4 is not capable of reproducing the finer structure of ERA-Int, especially in the excess kurtosis case outside the inter-tropics (near-zero values). Differences in skewness are more intense at interannual than at intraseasonal time scales. More specifically, the areas holding greater differences are located in the inter-tropics, especially in the central and eastern Pacific at interannual scales. Moreover, in DJF at the interannual scale, the Arctic shows some consistent S4 overestimation. Finally, regarding JJA, S4 underestimates skewness at interannual scales in the Antarctic Ocean. Regarding the excess kurtosis, the difficulty of S4 in producing values substantially different from zero outside the inter-tropics might be due to the effect of the sample size on the fourth moment. However, since the effect of the S4 ensemble dimension on the kurtosis seems to differ from its effect on the other moments, a more thorough analysis of the reasons behind this behaviour might be needed to confirm or refute this hypothesis.

One conclusion we can draw from the results obtained is that while S4 is able to approximately reproduce the structure of the first two moments of the distribution, it has much more difficulty when dealing with the third and fourth moment patterns or with a combination of the first two (e.g. the coefficient of variation). This outcome should be considered when assessing the performance and suitability of any post-processing method, and it pushes towards the use of bias adjustment methods that take into consideration not only the first and second moments but the complete distribution, such as the calibration method used in Torralba et al. (2017).

Finally, regarding the goodness of fit to a normal distribution, the 10 m wind speed cannot be regarded as normally distributed in the intertropical areas, whereas in the extratropics this is only the case at intraseasonal scales. However, the way in which normality is violated differs depending on the season, the time scale and the dataset. It is possible that some of these differences are a consequence of the characteristics of S4 or of the number of values entering the Shapiro–Wilk goodness-of-fit test. In fact, in the intraseasonal configuration three times as many values enter the Shapiro–Wilk test as in the interannual case, and S4 provides 15 times as many values as ERA-Int due to the ensemble size. Therefore, we have found that the rate at which the Shapiro–Wilk goodness-of-fit test rejects the hypothesis of normality is notably dependent on the sample size and, thus, the results obtained should be regarded with caution.

These results encourage us to proceed further in our research by validating the S4 with other reanalyses and comparing it with other seasonal prediction systems (e.g. ERA5 reanalysis or ECMWF System 5 seasonal forecast system). Furthermore, we will also seek the collaboration of wind industry end-users to tailor future experimental suites to test the usability of commonly used post-processing techniques (either downscaling or bias correction).