Using the Twentieth Century Reanalysis to assess climate variability for the European wind industry

We characterise the long-term variability of European near-surface wind speeds using 142 years of data from the Twentieth Century Reanalysis (20CR), and consider the potential of such long-baseline climate data sets for wind energy applications. The low resolution of the 20CR would severely restrict its use on its own for wind farm site-screening. We therefore perform a simple statistical calibration to link it to the higher-resolution ERA-Interim data set (ERAI), such that the adjusted 20CR data has the same wind speed distribution at each location as ERAI during their common period. Using this corrected 20CR data set, wind speeds and variability are characterised in terms of the long-term mean, standard deviation, and corresponding trends. Many regions of interest show extremely weak trends on century timescales, but contain large multidecadal variability. Since reanalyses such as ERAI are often used to provide the background climatology for wind farm site assessments, but contain only a few decades of data, our results can be used as a way of incorporating decadal-scale wind climate variability into such studies, allowing investment risks for wind farms to be reduced.


Introduction
Wind is a highly variable phenomenon over all time scales, from gusts lasting seconds, to long-period variations spanning decades (e.g. Watson 2013). Harnessing the wind resource for electricity production is a rapidly-developing field, with many challenges for engineering, energy systems design, national-scale energy policy, and meteorological forecast systems (e.g. Wiser et al 2011). Short term wind variability is critically important to the day-to-day management of a wind farm, and efficient running depends on having high quality wind speed forecasts (e.g. Foley et al 2012;Jung and Broadwater 2014). However, the impact of long term, decadal-scale variations in the wind climate is less well understood. This is partly due to a historical lack of data. Typically, when a site is considered for wind farm development, developers are often restricted to using statistical techniques to relate observational records from nearby stations to the site in question. Homogeneous data from any single station will usually only span a few years to a decade, but can be supplemented by data from a dedicated meteorological mast positioned on-site for a limited period of time such as 1-3 years (Petersen and Troen 2012;Liléo et al 2013;Carta et al 2013). In the absence of long term data sets of wind speed itself, studies of long term wind variability typically use pressure-based metrics as proxies for the wind (e.g. Palutikof et al 1992), often combined with complex statistical procedures to relate to the wind speed at a site (e.g. Kirchner-Bossi et al 2013). Around Europe, indices based on the North Atlantic Oscillation (NAO) have often been used (e.g. Boccard 2009). Standard NAO indices correlate well with winter wind speeds in northern/western parts of Europe. However, this is not true more generally, such as at other times of the year or in other locations (Hurrell et al 2003), and alternative indices must be used in these cases (e.g. Folland et al 2009). Regardless of definition, the NAO does not capture the full variability seen in wind speeds. Thus, there is scope for improvement over all these techniques.
Within the past decade, reanalysis data products have been able to extend such site assessment studies, allowing a description of a reasonable climatological period of around 30 years. The two main global reanalysis data sets used for this are the ECMWF 1 Re-Analysis Interim product (ERA-Interim, hereafter ERAI; Dee et al 2011), and NASA's Modern Era Retrospective-analysis for Research and Applications (MERRA, Rienecker et al 2011), which both cover the 'satellite era ' (1979 onwards). Such data sets are necessarily produced at relatively low spatial resolution (e.g. grid sizes ∼ 0.7 • ), and cannot, on their own, be used to determine the likely wind speeds at a site. In combination with other techniques however, from simple rescaling, detailed statistical modelling or even full dynamical downscaling, reanalysis data can be a key source for obtaining a representative wind climatology for a specific site (Kiss et al 2009;Petersen and Troen 2012;Kubik et al 2013;Badger et al 2014).
Most recently, attempts at producing century-scale reanalyses have yielded results: the NOAA Twentieth Century Reanalysis (hereafter 20CR, Compo et al 2011) and ECMWF's ERA-20C (Poli et al 2013; see also Dee et al 2013) data sets provide ensemble realisations of the atmospheric state spanning over 100 years. However, as they are at even lower resolution (e.g. 1-2 • ), and their early data is subject to substantial uncertainty, care must be taken when considering how to interpret their results in the context of wind farms.
Concerns within the wind industry about the possible impacts of future climate change, along with greater availability of larger data sets, have motivated various studies resulting in a greater awareness of the risks of climate variability (whether anthropogenic or natural). In fact, unlike the situation for temperature, there is little evidence of any long-term trend in globally-averaged wind speeds -see e.g. the Fourth and Fifth Assessment Reports (AR4/AR5 respectively) of the IPCC's 2 Working Group I, Trenberth et al (2007) and Hartmann et al (2013). The low confidence in such assessments is due in part to difficulties with the historical observational record, coupled with the highly-variable nature of winds in both space and time. For example, various data sets have suggested a positive trend in wind speeds over the oceans, with significant regional variability (Tokinaga and Xie 2011; Young et al 2011a,b;Wentz and Ricciardulli 2011;Young et al 2012). Over land however, the situation is different: an apparent reduction in surface wind speeds (nicknamed "global stilling") has been seen in recent decades in some data sets (McVicar et al 2012(McVicar et al , 2013, with studies suggesting that it could be due in part to anthropogenic factors, such as changes in land-use increasing the surface roughness Wever 2012), or aerosol emissions locally changing the thermal structure of 1 European Centre for Medium-range Weather Forecasting 2 Intergovernmental Panel on Climate Change the atmosphere (Bichet et al 2012). It is important to note that stilling is not seen in reanalysis data, which use climatological aerosol levels and do not include land-use change. Over both the land and oceans, opposing trends in different regions and times of year will act to reduce any globallyaveraged trend signal. While further and better data is required to settle questions on the true scale, causes and interrelationships of changes in wind speeds over oceans and land, it is important to note that these observed trends are always much smaller than interannual variability.
Given the uncertainties in trends in the historical wind climate, it is not surprising that projections of future wind climates should also be treated with caution. The review of Pryor and Barthelmie (2010) concluded that wind speeds over Europe would continue to be dominated by natural variability, although by the end of the century some differences could have emerged -although even the sign of the change was uncertain. The IPCC's Special Report on Renewable Energy Sources and Climate Change Mitigation (SRREN) came to a similar conclusion (Wiser et al 2011), and the IPCC's AR4 (Meehl et al 2007;Christensen et al 2007) and AR5 (Collins et al 2013;Christensen et al 2013) noted that there is low confidence in any projected changes. Consequently, Pryor and Schoof (2010) and Dobrynin et al (2012) found that the choice of emission scenario or concentration pathway has relatively little impact overall on the resulting wind climate. It is important to note that simulations of the historical climate over the 20th Century do not reproduce the observed variability in atmospheric circulation (Scaife et al 2005(Scaife et al , 2009, so the uncertainties in these climate projections do not preclude large multi-decadal variations in the future. Overall, the effect of climate change on the annuallyaveraged wind resource is thought to be small, although the increased seasonality seen in some studies by 2100 could have a challenging impact on wind-dominated electricity networks (Hueging et al 2012;Cradden et al 2012).
Thus, when seeking to improve assessments of future wind speeds over the lifetime of a turbine, there is more to be gained from an increased understanding of historical long-term wind variability than through climate change model runs. Given this context, we show in this paper how the new class of century-scale reanalyses can be linked to the more widely-used satellite-era reanalyses, thus allowing for information on the long-term decadal-scale variability in wind speeds to be propagated through the model chain when performing a wind site assessment. In section 2 we describe the two main data sets we use, including their limitations. We compare them in detail in section 3, and describe our procedure for relating the two. Section 4 shows results for the wind speed distribution over Europe, including longterm averages, variabilities, and changes in the shape of the distribution over time for selected regions. We discuss our conclusions in section 5.

Data sources
Reanalyses represent the most convenient data sets for assessing the long-term historical wind climate, in the sense that they aim to provide an optimal combination of observations and numerical model: the data provided in a reanalysis aims to give the best estimate of the "true" situation at any given point, as well as being homogeneous in time (e.g. free of systematic shifts), and complete in both space and time. However, in reality, biases and uncertainties inherent in both raw observations (due to location, frequency, instrumentation, etc.) and models (due to resolution, parametrisation schemes, etc.) mean that such data sets must be used with caution.
This study primarily uses data from the Twentieth Century Reanalysis project (20CR), in conjunction with wind speeds from the ERA-Interim data set (ERAI) for validation and calibration of the 20CR data. We describe key aspects of these data sets in the following sections.

Data from the 20CR Ensemble System
A full description of the ensemble reanalysis system used in the 20CR project is given in Compo et al (2011). Here, we describe some key features that have important impacts on our analysis methods and results. The 20CR assimilates sea-level pressure and surface pressure observations alone (from the International Surface Pressure Databank, incorporating the ACRE 3 project, Allan et al 2011), using observational fields of sea-surface temperature and sea-ice concentration (HadISST1.1, Rayner et al 2003) as boundary conditions. It uses the April 2008 experimental version of the NCEP 4 Global Forecast System (GFS), a coupled atmosphere-land model produced by the NOAA 5 NCEP Environmental Modelling Centre.
The 20CR data assimilation system is based on an Ensemble Kalman Filter. The data are produced in a series of 5-year 6 'streams' (independent runs, to simplify parallelisation), with 56 members in each stream. A consequence of this system is that ensemble members only remain temporally continuous for the 5-year duration of each stream.
As highlighted in Compo et al (2011), when considering variability it is important to use the ensemble members directly, rather than using the daily ensemble-mean time series alone. The increased uncertainty in the early period of 3 Atmospheric Circulation Reconstructions over the Earth, http: //www.met-acre.org/ 4 National Centres for Environmental Prediction 5 National Oceanic and Atmosphere Administration 6 Streams 16 & 17 actually last 6 and 4 years respectively (see Table  III in Compo et al 2011). For simplicity, we assume 5-year streams throughout.
the data leads to greater disagreement between the ensemble members, such that a time series of their mean will have much less variability than the members individually. This would lead to a spurious strong reduction in variability appearing at earlier times in the ensemble mean.
The 20CR data spans 140 years from 1st Jan 1871 to 31st Dec 2010, on a T62 spectral grid with 28 vertical levels. We use the output data provided on a regular latitude-longitude grid with cell size 2 • , at the the near-surface pressure level at σ := P/P surface = 0.995 (around 40 m height). The σ = 0.995 level is a reasonable choice for turbines whose rotor hubs are expected to be some tens of metres above the surface; typical hub heights are between 40 & 100 m, but vary greatly (Wiser et al 2011); we do not expect our conclusions to be qualitatively affected by the precise height above ground. More details on our choice of levels can be found in Appendix A. We use daily-mean data, and calculate the wind speed U = √ u 2 + v 2 for each ensemble member, from the orthogonal components u (zonal, i.e. westerly) and v (meridional, i.e. southerly).
Some recent studies have highlighted potential problems with the 20CR data set. Villarini (2012, 2014) have performed a detailed analysis of change points in the 20CR data, finding that, while these are in fact common in the data set overall, there are many areas, especially in the northern hemisphere, where the 20CR remains largely homogeneous for many decades. Their results emphasize that users of the 20CR data must be aware of possible -indeed, probable -inhomogeneities in the data, and the potential impact this could have on their analyses. Stickler and Brönnimann (2011) found very significant differences between 20CR winds and pilot balloon measurements in the West African Monsoon region over 1940-1957, and Liléo et al (2013, using the 20CR to study interannual wind variability over Scandinavia, had to discard 20CR data prior to 1920 due to suspicious behaviour in some grid cells. Finally, there has been some debate on the consistency of longterm trends in storminess and extreme winds found in 20CR compared to observations (Donat et al 2011, Brönnimann et al 2012,b, and Krueger et al 2013a. These studies serve to emphasize the importance of being extremely careful with methodology when comparing reanalysis data with observations, and when identifying trends.

Data from ERA-Interim
The second source of data we use is the 60 m wind speed fields from the ERAI data set . This uses the ECMWF Integrated Forecasting System model (IFS), and assimilates observational data of many types, mostly coming from satellites. The atmospheric fields of ERA-Interim were calculated on a T255 spectral grid, with surface fields calculated on a reduced Gaussian grid. We use the daily-mean wind speed data available on the regular latitude-longitude grid of cell-size 0.75 • . A comparison of ERAI data at 60 m and 10 m with the 20CR levels can be found in Appendix A. The reanalysis starts in 1979 and continues to the present; we use the data up to the end of 2013. Further details are available in Dee et al (2011) and references therein, and the ERA-Interim Archive report . Stopa and Cheung (2014) compared ERAI wind speeds with those measured from buoys and satellite data, finding that the reanalysis performs very well in terms of homogeneity, but with a small negative bias and reduced variability compared to the observations. Szczypta et al (2011) found that ERA-Interim tended to overestimate wind speeds over most of France, but underestimated it in mountainous areas, compared to the SAFRAN high resolution (8 km) reanalysis data set -although the authors note that the SAFRAN wind speed data is known to be biased low.
As already discussed, it is known that reanalysis data sets including ERA-Interim do not exhibit the observed largescale trends in wind speeds (see e.g. McVicar et al 2013; Mears 2013 and references therein), and the relatively low resolution of ERAI (and similar data sets) prevents it from being used directly as a proxy for observations at the scale of a wind farm (Kiss et al 2009;Kubik et al 2013). We will instead be using ERAI as an example of the kind of data currently used for providing a climatological basis for wind farm site assessments, the first link in the 'model chain' of dynamical and statistical downscaling for such studies: reanalyses are connected to mesoscale dynamical models, then in turn to microscale models and computational fluid dynamics (CFD) at the scale of a wind farm itself (Petersen and Troen 2012).

Linking the Reanalyses
While the strength of the 20CR is its characterisation of realworld variability on long time scales, the ERA-Interim data set provides wind speeds that are at much higher spatial resolution, and are more tightly-constrained by observations. ERA-Interim is therefore much better suited for developing a climatology of wind speeds over small (sub-national) regions, or, in conjunction with additional dynamical or statistical downscaling techniques, at a point location. However, as it only spans ∼ 30 years it cannot give a good indication of climate variability on multi-decadal timescales. In this section we describe how we calibrate the 20CR wind speed data to produce a data set that has the same distribution of wind speeds in time as ERA-Interim (over their overlapping period), but with the long-term variability of 20CR.

Comparison of the Reanalyses
We focus our study on Europe, and consider several small sub-regions for more detailed examination. To aid comparison, we regrid the ERAI data by area-averaging onto the 20CR's native 2 • grid.
The 20CR and ERAI data do not exhibit the same climatology in wind speeds over their period of intersection (1979-2010, 32 years). This is due to a number of factors. These include the structural differences (NWP model, data assimilation and reanalysis procedure); spatial resolution and the amount of orographic complexity resolved; the amount and type of observational data assimilated; and the mismatch between vertical levels available for comparison.
In this section we denote ensemble-mean daily-mean wind speeds from 20CR (at its σ = 0.995 vertical level) and from ERAI (at its 60 m model level on the 20CR grid), by U 20CR and U ERAI respectively. As we are focusing on the later period of the 20CR data set for our calibration procedure, the ensemble spread is small, so it is acceptable to use the ensemble mean series in this case. We consider the 'bias' between the 20CR and ERAI data in terms of the simple difference in wind speeds, and the day-to-day relative difference compared to ERAI, In Figure 1, we show 7 the 32-year mean bias β and the mean of the day-to-day relative bias β rel . The bias maps are all rather noisy, but over most of the domain the bias is negative (i.e. U 20CR < U ERAI ), with differences of up to ∼ 30% of the ERAI wind speeds over much of Europe. There are some notable exceptions to this however, with small positive biases (i.e. U 20CR > U ERAI ): for example over Britain, wind speeds are up to 10% higher in the 20CR data. Some areas have particularly strong negative bias, such as around the Czech Republic. The Strait of Gibraltar is particularly affected by the low spatial resolution, resulting in the lowest 20CR wind speeds compared to ERAI. We have used a t-test to assess whether the data is consistent with β = 0 (i.e. no bias) at the 1% level. When it is not consistent with zero, we say there is a significant bias; this is the case for most areas according to this test.
In addition to the spatial variability, it is important to bear in mind that the difference between 20CR and ERAI does not have to be constant in time. Figure 2 shows the dayto day variability of β rel in terms of its standard deviation σ . There is a suggestion in the data of increased β rel around coastal regions, such as in large parts of the Mediterranean, Fig. 1 Maps of the difference between wind speeds from 20CR and ERA-Interim; details as given in the panels. Crosshatched areas in the top panel are not significantly different from zero at the 1% level, according to a t-test.
as well as Norway and Britain. The relative-bias variability is generally around 15-30%, which is a similar magnitude to the mean relative bias β rel shown in Figure 1.
Finally, we show the correlation between the daily wind speeds of the 20CR and ERAI in Figure 3. The data are wellcorrelated in most places, but the correlation is particularly strong (between 0.9 and 0.95) in the Atlantic, North Sea, and associated coastal regions including the British Isles.

Procedure for calibration
The goal of our calibration procedure is to generate a wind speed data set that retains the fluctuation patterns of the 20CR data over time, and between ensemble members, but whose probability density functions (PDFs) of the ensemble-mean wind speed in each grid cell match those of the ERA-Interim data during their overlapping time period. In particular, the PDFs do not have to match over other periods (e.g. if comparing the distribution over 140 years from 20CR to the 35 years from ERAI), the time series do not have to match in detail (although we have shown that they do tend to be wellcorrelated), and individual ensemble members do not need to match ERAI data -thus retaining the 20CR's important measure of uncertainty. We illustrate our procedure for the case of a particular grid cell in Figure 4.
Our method proceeds in two stages, and is performed on each grid cell independently. Firstly a transfer matrix is obtained as the conditional probability density of ERAI wind speeds given bins of 20CR wind speeds for the overlapping period: where i and j index bins in wind speed for the data sets indicated. We use bins of 0.5 m s −1 covering the range 0-40 m s −1 . This transfer matrix is applied to the full 140-year 20CR PDF, to obtain a calibrated PDF of 20CR wind speeds spanning 1871-2010: Secondly, the calibrated daily time series of wind speeds, U 20CRc (t), is obtained by quantile matching (e.g. Panofsky and Brier 1968): we interpolate the cumulative distribution function (CDF) of the calibrated 20CR wind speeds at the quantile of the original 20CR wind speed. This step can be performed using both the daily ensemble mean wind speeds and the individual ensemble members, allowing the ensemble spread to be transferred to the calibrated climatology.
In some cases, there can be wind speeds present in the 140-year 20CR data that were greater than any in the 32year period common with ERAI. This means that such wind speeds have no corresponding frequency in ERAI that we can calibrate to: the CDF corresponding to P(U 20CRc i ) reaches its maximum 8 below that wind speed, so quantile matching by interpolating the CDF will fail. In this case, we instead perform a linear regression on the relationship between original and corrected winds up to this point (i.e. U 20CRc (U 20CR ) = aU 20CR + b). We then extrapolate this model to obtain corrected wind speeds for the final few high wind days.
We demonstrate our procedure for the case of a grid cell in north-western Germany in Figure 4. This shows the different PDFs in question, the transfer matrix, and the quantile matching. It is clear that in this case the ERAI wind climate largely represents a shift to higher wind speeds compared to 20CR (i.e. the 20CR winds are low compared to ERAI), and the calibrated 20CR reproduces this well. In both the PDFs and the transfer matrix, a truncation of ERAI winds is apparent at lower wind speeds. We have tested that this , such that each row j is a PDF of ERAI wind speeds given a 20CR wind speed U 20CR j (i.e. the values along each row sum to unity). Darker colours indicate higher frequencies in each ERAI PDF. Bottom-left: Wind speed distributions over the full 140 years. Bottom-right: The cumulative distribution function derived from the PDF histograms shown to the left. The dotted lines and arrows illustrate the interpolation in the quantile-matching procedure used to convert wind speeds from the original 20CR data to their calibrated counterparts. Here, a high wind speed for this cell from the original 20CR is transformed to its higher counterpart at the same level in the calibrated distribution. truncation is also present in the original ERAI data on its native grid, i.e. that it is not solely due to the area-averaging of the ERAI data onto the 20CR grid. This has implications when attempting to fit Weibull functions to the wind speed distribution (section 4.3); we discuss this issue in detail in the Supplementary Information.
It is important to note that the method we describe here is not unique. Many other techniques for calibrating one data set with another have been developed and used in climatological studies. These are usually designed to compare reanalysis or model data with observations, or climate model data at different spatial scales, such as a global run with regional model output; see Teutschbein and Seibert (2012), Watanabe et al (2012), Lafon et al (2013) and references therein for recent reviews of methods. Compromises are reached between statistical complexity, data volumes, direct numerical simulation, and time available. In our case, we have chosen a relatively simple statistical procedure.

Results of calibration procedure
Time series of annual mean wind speeds from both the original and calibrated 20CR data, and from ERA-Interim, are shown in Figure 5 for a region covering Denmark and Northern Germany (using area-weighted averaging over the region). The calibrated data retains the interannual variability of the original 20CR wind speeds, but with a climatology matching that of ERA-Interim over 1979-2010.
We map the bias remaining after our procedure in Figure 6. This can be compared to the original bias maps in Figure 1 -note that here the values are much smaller. The mean bias β = U 20CRc −U ERAI is consistent with zero almost everywhere (using a t-test at a 1% significance level, as before). An exception is a residual positive bias east of Gibraltar: we expect this area to be heavily affected by differences in how well the complex orography here is resolved between ERAI and 20CR. Two further exceptions occur in the central and eastern Mediterranean, which correspond to anomalies seen in other aspects of the 20CR data (see later sections), and which we discuss in more detail in Appendix C.
The mean of the relative bias β rel (not shown) is ≤ 5% almost everywhere. Finally, we note that the correlations between 20CR and ERAI after calibration (not shown) remain almost identical to those shown previously in Figure 3.

Analysis and results
In this section we use the 20CRc data to analyse the distribution of wind speeds over Europe in various complementary ways.

The European context: Maps of the long-term average, variability and trends
The map of the 140-year mean wind speed in Figure 7 gives an overview of the geographic distribution of wind speeds over Europe. There is a noticeable land-sea contrast, although it is the mountainous regions that have the lowest mean wind speed, just as is seen in the uncorrected 20CR data (Bett et al 2013), and is inconsistent with observations. This erroneous behaviour is a known consequence of the orographic drag schemes in atmospheric models (Howard and Clark 2007), and is particularly apparent when (as here) the orographic variability is on a much smaller horizontal scale than the model grid cells. The spatial pattern in fact agrees very well with that derived by Kiss and Jánosi (2008) from the 10 m wind speeds covering 1958-2002 in the ERA-40 reanalysis (Uppala et al 2005), although since they used winds at a lower level their mean values are correspondingly smaller.
It is important to note that the wind speeds shown here apply to the particular spatial scale of this data set, which implies a certain amount of smoothing compared to values measured at a specific site. For example, Kirchner-Bossi et al (2013) use a complex statistical procedure to relate sealevel pressure from 20CR to wind speed observations at a range of meteorological stations in Spain. Because they are statistically downscaling to this local scale, the mean wind speed they find is 2-3 m s −1 higher than we show in Figure 7.
We map the wind variability in terms of its standard deviation. The structure of the data set makes the calculation of the long-term standard deviation non-trivial: simply considering the ensemble-mean daily time series would result in a standard deviation that was negatively biased. Furthermore, the ensemble members' time series are only continuous in 5-year chunks, and using them as if they were continuous throughout could potentially inflate the apparent variability at the discontinuities (although in practice the impact of this is likely to be very small). To avoid such spurious signals and trends, we calculate the mean and standard deviation of daily wind speeds in each 5-year stream for each ensemble member, then take ensemble means for each period. We then combine these 5-yearly ensemble-mean standard deviations into single aggregate values for the full 140-year period, for each grid cell; see Appendix B for details.
Since the standard deviation of wind speeds tends to correlate with the mean, we show in Figure 8 the wind variability in terms of the coefficient of variation, the ratio of the standard deviation to the mean. This shows that, in most areas, the wind speed standard deviation is ∼ 40% of the mean. The central Mediterranean has proportionally higher variability, with Greece, Turkey and the Alps (whose orography will be extremely poorly represented) showing lower variability.
The presence of any long-term trends in the mean or variability of wind speeds could have important consequences for wind farms, in terms of their future deployment, energy yield, and maintenance requirements. Figure 9 maps the trends in both the ensemble-mean annual mean wind speed and the ensemble-mean annual standard deviation of daily wind speeds. The trends are found from the ensemblemean annual time series using the Theil-Sen estimator (Theil 1950;Sen 1968). This is the median of the slopes between all pairs of points in the data set, and is more robust against outliers than simple linear regression, making it more suited to skew-distributed data such as wind speed.
We test the significance of these trends at the 0.1% level, using a Mann-Kendall test (Mann 1945;Kendall 1975) modified using the method of Hamed and Rao (1998) to account for autocorrelation in the data (following Sousa et al 2011); as is the case with much meteorological data, we expect adjacent timesteps to be correlated. As with all significance tests, the result says whether the measured trend was un-   Fig. 1; note the colour scale covers much smaller values here. Crosshatched areas are not significantly different from zero at the 1% level, using a t-test.
likely, given the assumption of there being no true underlying physical trend. If the probability of measuring the trend we did was below 0.1%, then we describe the trend as 'significant', otherwise we regard it as consistent with zero. We chose the particularly stringent threshold of 0.1% to guard against detection of spurious trends; we only want to highlight trends we are very sure are present in the data.
Some key points about long-term trends in European winds are immediately apparent from Figure 9. Firstly, they are only on the order of a few centimetres per second per decade; and secondly, that in most areas of the continent, the trend is not significantly different from zero. The trends in standard deviation show a similar spatial pattern, although at an even lower magnitude. There are three areas of apparently significant trend in annual wind speed that merit looking at in more detail: the areas of positive trend in the Atlantic Ocean to the north and west of the British Isles, and the eastern Mediterranean around Crete; and the negative trend in an area of the central Mediterranean around the Italian peninsula and Sicily. The Mediterranean regions were also anomalous in terms of their bias with respect to ERA-Interim (see previous section). We look at the behaviour of wind speeds in these regions in more detail in Appendix C. Bett et al (2013) used the same significance threshold for analysing trends in the uncorrected 20CR data, but measured trends using simple linear regression and t-tests to establish significance. While we consider the present technique to be more robust, the magnitude and spatial patterns of the trends are similar, and similar regions are highlighted as signifi- cant, pointing to genuine features in the underlying 20CR data.
As already discussed in the context of the mean wind speed, it is important to realise that these trends are those seen at the large scales of the 20CR data, and detailed physical or statistical modelling is required to downscale to a specific location. Considering again the example of Kirchner-Bossi et al (2013), they find that the site in Spain they describe has a statistically significant negative trend in wind speed of around −0.01 m s −1 decade −1 . In our results, the corresponding grid cell has a trend of around +0.01 m s −1 decade −1 , and is consistent with zero according to our test.

Wind distribution time series
We use a region covering England & Wales to to give an example of how wind speed distributions can vary with time. The time series of the area-averaged data from this region are shown in Figure 10. The annual mean wind speed (panel a) shows both large interannual variability, and (when smoothed with a 5-year boxcar window) strong decadal-scale variation. For example, the smoothed series shows a clear increasing trend from around 1970 to a peak in the mid-1990s, followed by a return to near-average values after 2000. When seen in the 140-year context however, these recent variations are not exceptional, and the year-to-year variability is always much greater. Note that, for this region, the year 2010 is the extreme low-wind year. This is linked to exceptionally cold months at the start and end of that calendar year, and a strongly negative NAO index in the 2009-2010 win- The peak in wind speeds that occurs in the 1990s is another important feature in this region, and is also seen clearly in the observational record of wind speeds (Earl et al 2013), in studies using geostrophic winds derived from pressure observations (Palutikof et al 1992;Alexandersson et al 2000;Wang et al 2009), and is consistent with the large positive NAO in these years (e.g. Scaife et al 2005, and references therein). Indeed, much of the variability of wind speeds in this region is likely to be related to modes of climate variability such as the NAO and Atlantic Multidecadal Oscillation (AMO, e.g. Knight et al 2006); further consideration of this requires careful seasonal breakdowns of both wind speed and these climate indices however, and is beyond the scope of this paper.
Our results bear a remarkable qualitative resemblance to those produced over 20 years ago by Palutikof et al (1992) using geostrophic wind speeds (1881-1989) adjusted to match wind speed observations from a station in England over 1975-1984. A key purpose of that study was to illustrate the longterm variability present in wind speeds, as it could have important implications for wind power production. With the advent of larger datasets and greater computational capacity, we are able to re-emphasize their conclusions and consider the decadal-scale behaviour of the wind more robustly and in greater detail.
Considering the time series of the distribution as a whole (panel b), we can see that it follows the same decadal trends as the mean (panel a). The distribution width (panel c) highlights that while the outer reaches of the distribution are subject to much variability, with the distribution width growing and shrinking over decades, the inner parts of the distribution are much more constant. The standard deviation shown in that panel has a small but statistically significant positive trend, of 0.014 m s −1 decade −1 .
Finally, the bottom panel shows the relative uncertainty in the data, in terms of the annual mean of the day-to-day ensemble spread. As one looks further back, fewer observations are assimilated, and the ensemble members have more freedom to disagree with each other, resulting in increases in this measure of uncertainty. Two peaks are present that are related to the reduction in data from Atlantic shipping during the World Wars; these spikes in uncertainty are ubiquitous for near-Atlantic regions.
In Appendix C, we show similar plots for other regions that show particular features of interest, as already discussed.

The wind speed distribution and Weibull parameters
The Weibull (1951) distribution is often used in wind analyses as a concise way of describing the wind speed distribution. Extending the previous work in Bett et al (2013), we have considered the Weibull scale and shape parameters in our calibrated 20CR data, looking at long-term mean values, trends, and behaviour in select regions. However, while the original 20CR data does tend to be well-described by Weibull distribution, the wind speeds from ERAI often are not; this results in the Weibull distribution often being a poor description of the calibrated 20CR wind speeds.
However, the Weibull scale parameter, which is proportional to the mean of the distribution, does tend to behave in the same way as the mean wind speeds in terms of variability and trends, as described in the previous subsections. In particular, trends are of a similar magnitude and spatial pattern, and 'anomalous' regions in the central and eastern Mediterranean, noted in previous sections, are also present.
Due to the inability of the Weibull distribution to provide a useful description of the calibrated 20CR data, we do not consider it further. Additional details and discussion are presented in the Supplementary Information.

Discussion and summary
In this paper, we have demonstrated how century-scale reanalyses -in particular, the Twentieth Century Reanalysis, 20CR -can be used for assessing the long-term trends and variability of near-surface wind speeds over Europe, through a calibration procedure to relate it to a higher-resolution satelliteera climatology (such as ERA-Interim), and subsequent careful analysis.
The long baseline of the 20CR means that it has great potential to inform wind speed assessments for the wind energy industry. In general, reanalysis data is used in conjunction with dynamical and/or statistical downscaling techniques in order to reach the spatial scale of wind farms, as part of the 'model chain' in such assessments. Often, it is the observation-rich and relatively high-resolution data sets of ERA-Interim and MERRA that provide that first reanalysis step. This limits any assessment of long-term variability, since they only cover ∼ 3 decades. By calibrating the 20CR data to match the climatology of ERA-Interim over their period of overlap , this 140-year data set can be used in their place, providing a much more robust assessment of historic interannual and decadal variability in regions of Europe, and allowing the "short-term" trends of the past 10-30 years to be put into the longer-term context.
To emphasise this point, we show in Figure 11 the distribution of the 109 32-consecutive-year trends 9 in annual mean wind speed for the England & Wales region described in the previous section. The full 1871-2010 trend is indicated and, as already shown, is near zero. The trend from ERA-Interim for the 32 years of overlap is also marked, with a negative trend driven by the general reduction in wind speeds since the early-1990s. It is clear that the strong multidecadal variability in wind speeds means that attempting to estimate the long-term trend from a ∼ 30-year sample can lead to misleading results.
The 20CR data is a rich source of information on the large decadal-scale variability of wind speeds. However, it is not without limitations, and hence it does need to be analysed with care. For example, in areas of complex orography, 9 i.e. the Thiel-Sen trend for 1871-1902 inclusive, and 1872-1903, 1873-1904, . . . , and 1979-2010. near-surface wind speeds are strongly reduced at the spatial scale of the 20CR, making their variability more difficult to interpret. We have also found that calibration to ERA-Interim severely reduces the goodness-of-fit to the commonlyused Weibull distribution, making it a poor description of the wind speed data itself.
As has been noted in other studies Brönnimann et al 2012), the ensemble nature of the 20CR needs to be taken into account when assessing long-term variability. Disagreement between ensemble members can be large, especially in the early period of the data. This leads to the daily ensemble-mean time series having less variability than the individual members, and can cause apparent trends in variability over time. Therefore the daily ensemblemean time series has little use in determining wind variability on long timescales, and the ensemble members should be used.
Assessment of trends over the full 140 years of the 20CR is complicated by the fact that the mid-point of the time series, and hence of a simple linear trend, is the 1940s. The reduction in ocean-based measurements during both the First and Second World Wars causes spikes in uncertainty, and in some cases systematic spikes in the wind speeds themselves (see Appendix C). Furthermore, the period after the Second World War corresponds to a large increase in national and international programmes collecting greater amounts of weather data. Taken together, the pre-1950s period is much more susceptible to greater random and systematic uncer-tainties. Measured trends in the 20CR data should therefore be treated with caution.
We have shown in fact that all trends in 20CR surface wind speeds over Europe are either consistent with zero (in most locations), so small to be of little practical relevance (e.g. possibly in the North Atlantic), or due to systematic problems with the data (e.g. in the central and eastern Mediterranean and possibly the North Atlantic; see Appendix C).
It is clear that, for most wind energy applications, interannual variability and the large decadal-scale variability are more important than the very small long-term trends in historical European wind speeds. Using century-scale reanalyses such as the 20CR allows wind resource assessment studies to incorporate more information on the historical decadal-scale variability at a site, which can reduce the uncertainties in the financial planning central to wind energy development.

A Choice of vertical level
In Figure 12 we compare the wind speeds from 20CR at the σ := P/P surface = 0.995 level with those at the other available near-surface levels of P = 1000 hPa and 10 m, over an arbitrary period. They have very similar variability behaviour, with the 1000 hPa winds tending to be slightly higher, and the 10 m winds around 10% lower. Figure 12 also includes 10 m and 60 m winds from ERA-Interim. The 60 m vertical level was chosen as it is roughly similar to the height expected at P = 0.995P surface . A similar alternative would have been the 30 m model level, but we chose the higher level as it would be (marginally) less impacted by surface roughness; 60 m is also closer to wind turbine hub heights and thus more likely to be used for siteselection studies for the wind power industry. Figure 12 also suggests that the 60 m winds provide a fairly good match to the 20CR σ = 0.995 winds by eye.

B Procedure for combining variances
To avoid bias, we calculate variances of daily-mean wind speeds for each ensemble member separately, in consecutive 5-year periods (see section 4.1). These are combined into an aggregate population 10 variance for the whole 140-year period over all ensemble members, using the following procedure. If we consider a single time series of daily-mean wind speeds U(t j ), at discrete timesteps labelled j, then we can divide it into a series of discrete 5-year chunks labelled i, each containing N i days (leap years mean that not all N i are equal).
For each 5-year period i, we can calculate the mean U i = N −1 i ∑ j U(t j ), the mean of squares U 2 i = N −1 i ∑ j U 2 j , and the variance σ 2 i = U 2 i −U 2 i . We store the mean and variance for each 5-year period, for each ensemble member.
The aggregate means over all 5-year periods (i.e. the 140-year means in our case) are simply We can use these to write the aggregate population variance in terms of the mean and variance in each period: In practice, since we have stored the 5-year means and variances for each ensemble member m, U i,m and σ 2 i,m , we take ensemble means to obtain U i and σ 2 i for each period. These are then used to calculate U and σ 2 using equation 9.

C Additional regional time series
In this section we demonstrate the wind speed time series for some additional regions of interest, in the same manner as for the England & Wales results discussed in section 4.2 ( Figure 10). The regions are defined in Table 1 and shown in Figure 13, and were selected as areas of apparently 'significant' trends in wind speed (see Figure 9). As elsewhere in this paper, trends are calculated using the Theil-Sen estimator, and their significance is tested using the modified Mann-Kendall test (see section 4.1). Figure 14 shows the results for the North Atlantic region. As well as having much stronger and more variable wind speeds overall compared to England & Wales, there are also significant positive trends in the annual mean wind speed and annual standard deviation of daily winds. The increase in the uncertainty prior to the 1940s is much more striking than for the England & Wales region, and casts a degree of suspicion on the trend in the annual mean wind speed. It is plausible that the apparent trend is simply due the winds prior to the 1940s in this location being systematically slightly lower than in the subsequent period, rather than being due to any true underlying physical mechanism.   Table 1 (on a regular lat-lon grid) are marked with boxes.
A possible cause -at least in part -could be a difference between the variance in the observations ingested by the reanalysis, and the preferred variance of the underlying NWP model. For example, if the observations are more variable than the model (e.g. if left running without assimilating data), then we might imagine that the 20CR data would have less variance at early times when there are much fewer observations. The skewed nature of wind speed distributions means that a trend in variance could lead to a trend in mean wind speeds too. However, the 20CR employs a covariance inflation process (see Compo et al 2011 and references therein for details), which will act in the opposite direction. Without further detailed study of the model behaviour, these ideas remain at the level of speculation.
The WASWind data set produced by Tokinaga and Xie (2011), based on ship-based measurements of wind and wave heights, has a negative trend in winds for the North Atlantic over 1950-2008. In our data, the trend over the 1950-2010 period is positive, but not significantly different from zero. The weakness of both trends, and difficulties with the observations in both cases, means that it is hard to be conclusive about the 'true' situation.
However, the negative trend we see between around 1990 to around 2005 is seen in the WASwind data, and Vautard et al (2010) have shown that it is also present in the ERA-Interim data. Finally, Vautard et al (2010) found a negligible trend in the North Atlantic in the NCEP/NCAR Reanalysis (Kalnay et al 1996(Kalnay et al ) over 1979(Kalnay et al -2008 is also be consistent with our results. Long-term trends in extreme wind speeds and storminess in the North Atlantic have been discussed in Wang et al (2009Wang et al ( , 2011Wang et al ( , 2013a, Krueger et al (2013b,a), and Wang et al (2013b). These studies relate extreme winds derived from long-term pressure records with those derived from the 20CR data set, and demonstrate both the decadal-scale variability that we see here, and the difficulty of drawing definitive conclusions from trend analysis with this data: different analysis methods can produce very different results, and the 20CR data prior to the 1950s should be treated both carefully and sceptically.
The Sicily & Central Mediterranean region appeared to have a significant negative trend in wind speeds in Figure 9; the time series for the annual mean wind speeds in that region is shown in Figure 15. We can see again the high levels of uncertainty prior to the 1950s, and a particularly anomalous spike in wind speeds around 1940-1942. If we take that spike to be indicative of the kind of systematic errors that might be present in the early half of the data, but not captured by the ensemble spread, then it is not unreasonable to suppose that the entire period prior to the 1940s could be showing higher wind speeds than it should, and thus accentuating a negative trend.
However, there does appear to be a more genuine negative trend in the data from the 1950s onwards, where the uncertainties are much more reasonable. We find that the Theil-Sen slope for the 1950-2010 period is very similar to that of the full 140 years, although in this case it is not significantly different from zero at the 0.1% level. However, as there are so few decades available from the 1950s, it is difficult to know how such an apparent trend relates to the decadal-scale oscillations that we see here, and in other regions.
Overall, the uncertainties in the data make it extremely difficult to separate decadal climate variability, systematic errors, and genuine long-term trend. Pirazzoli and Tomasin (2003) looked at trends in the observed wind speeds over a similar region using station data mostly covering 1951-2000. They found a mixture of trend behaviours: most stations showed a negative trend prior to the 1970s that then became positive; some stations showed no trend, or trends which became negative from the 1970s onwards. In our data, which will not be able to resolve the complex coastal and orographic features of the region, we can see that the 5-year running mean appears to be increasing from the 1950s, changing to a negative trend after the 1970s. While this clearly disagrees with the Pirazzoli and Tomasin (2003) results from some stations, it is unclear how the variety of different observed behaviours in this complex terrain should combine to produce an aggregate trend on the large scales of the 20CR. In any case, the trends in the 20CR data are extremely slight; the main conclusion from our data should be that interannual variability is vastly more important than any trend for this region over a period as short as 50 years. Finally, we show the time series for the Crete & Eastern Mediterranean region in Figure 16. In this case, the apparent overall trend is positive. There is again a spike in wind speeds in the early 1940s, and a suggestion that the data prior to the 1950s could be systematically shifted relative to the latter period. Another interesting feature is that the early period until around the 1920s shows a slight decrease over time; if we exclude the 1940s spike, this then appears to be followed by a long generally-increasing period until the 1980s, after which the wind speeds have been relatively constant.
As before, the uncertainties in the data, both systematic and as seen in the ensemble spread, coupled with the expectation of decadal-scale variability and a time series that is "only" 14 decades long, mean that it is impossible to know from this data alone how "real" such a very long oscillation might be. If we allow for systematic shifts in the 1940s and before, the data is consistent with there being no long-term trend, but with decadal-scale variations underlying large interannual variability, as in other regions. What we can say with some certainty however is that the wind speeds in this region have been higher since the 1970s than they were in the 1950s and 1960s -with the caveat of there being strong interannual variability.

D Supplementary Information: Weibull distribution fits
Concise measures of the wind speed distribution are convenient when comparing different locations and periods. Wind speed distributions are frequently analysed in terms of the Weibull (1951) distribution. The probability density function (PDF) for a Weibull-distributed random variable U is given by which has two free parameters: the scale λ , proportional to the mean of the distribution, and the dimensionless shape (k), determining the skewness. The distribution width and peak location are related to both parameters. The choice of the Weibull distribution for wind speed analysis has a partly theoretical and partly pragmatic basis (Hennessey 1977). From a theoretical standpoint, the simplest case is the (one-parameter) Rayleigh distribution, which describes the magnitudes of two-component vectors when the components are uncorrelated and Gaussian-distributed, with equal variance and zero mean. Describing real wind speeds requires the more general case of correlated components with unequal variances. While this does not itself have a closed-form mathematical expression, the Weibull distribution -of which the Rayleigh distribution is the k = 2 special case -in fact provides a very good approximation to this more general distribution (Harris and Cook 2014). Empirically, the Weibull distribution has been found to fit reasonably well with observed distributions of wind speeds for many decades, and it aids comparison with previous work to continue using it (although not uncritically). 11 Using the calibrated 20CR data, we calculate maximum-likelihood estimates of the Weibull parameters (Carta et al 2009) 12 for each ensemble member in each of the 28 consecutive 5-year periods in each grid cell, using the distribution of daily-mean wind speeds. We do the same for the area-averaged wind speed ensemble member time series in the standard analyses regions used in the main paper.
The goodness of fit of each Weibull distribution is assessed using a chi-square test: in each case we compare the histogram of the dailymean wind speed data with that derived from the best-fitting Weibull PDF, using wind speed bins 13 of 1 m s −1 and a significance level of 1%. Passing the goodness-of-fit test implies that the data are consistent with being drawn from the given Weibull distribution, at the 1% significance level; it is a viable Weibull. Figure 17 shows a map of the mean, over the 28 5-year periods, of the percentage of ensemble members whose Weibull fits pass the χ 2 goodness-of-fit test at the 1% significance level. This shows a clear distinction between land regions (and the Mediterranean), over which very few of the fits are 'good', and the Atlantic ocean regions, where most of the fits are 'good'. This contrasts strongly with the results obtained from the uncorrected 20CR data, i.e. without applying the calibration procedure described in the main paper. The goodness of fit map in this case is shown in Figure 18: Most regions exhibit very good fits, with the higher proportions of poor fits tending to be reserved for areas of more complex orography. Note that an analysis of the means, trends Fig. 17 Goodness of fit map. Each point shows the percentage of ensemble members whose Weibull fits pass the χ 2 test (see text), averaged over the 28 5-year time periods. Fig. 18 Goodness of fit map (as Figure 17), but produced using the uncorrected 20CR data analysed in Bett et al (2013). and time series of the Weibull parameters in the uncorrected 20CR data was performed in Bett et al (2013).
To consider the quality of fits further, we show in Figure 19 the PDFs of daily wind speeds at a particular grid cell location (western England) for a single 5-year period, comparing the different data sets we use. It is important to see that the original, uncorrected 20CR data is reasonably well-described by a Weibull distribution, whereas the original ERA-Interim data is not. As seen in the main paper, the ERA-Interim wind speed distribution is truncated at the low-wind end, with a correspondingly broader peak. Area-averaging the ERAI data onto the 20CR grid exacerbates this problem, moving data in the tails towards the mean, and this is then carried forward to the calibrated 20CR results. When considering data from our standard analysis regions, the PDFs of wind speeds are somewhat 'worse' again (not shown), as they are averages over many grid cells.
The impact of the distribution taking on this shape is that, while we expect our Weibull parameters to still respond systematically to variations in the wind speed distribution, the distributions that one would derive from the parameters are not accurate representations of the wind speeds. For example, we expect that quantiles of the wind speed distribution estimated using the Weibull fits would be highly inaccurate; however, the relative movements over time of the scale parameter at least will broadly describe shifts in the peak location of the wind speed distribution. As an example, the time series of Weibull parameters for the England & Wales region are shown in Figure 20. The scale parameter λ behaves very similarly to the time series of the mean and standard deviation of wind speeds shown earlier (although note that we are looking at discrete 5-year periods here, rather than the 5-year rolling means shown in the main paper). The shape parameter k does not vary a great deal, staying between 2.6 and 3, although there is noticeable spread at early decades between the values estimated for different ensemble members. For this region, neither the scale nor shape parameters exhibit trends in their 5-yearly ensemble-mean time series that are significantly different from zero. Note also that the estimate of the shape parameter from the fits to the daily ensemble mean time series is biased, as expected, compared to the estimates from the fits to the ensemble members or the ensemble means of those fits.
We show maps of the long-term mean values and trends of the Weibull scale and shape parameters in Figures 21 and 22. The longterm means are calculated as the averages over the 28 5-year periods of the ensemble mean parameter values based on the fits to each member. The trends are calculated using the Theil-Sen trend estimator on the 5-yearly ensemble mean values, and tested for significance using modified Mann-Kendall test at the 0.1% level, as for the mean wind speed time series. Figure 21 shows the mean and trend for the Weibull scale parameter. As expected, the mean scale map shows a very similar spatial pattern to the long-term mean wind speed map shown in the main paper. The areas with statistically significant trends are smaller in this case than for the mean wind speed, although the trends themselves have a similar magnitude and spatial pattern.
The results for the shape parameter are shown in Figure 22. The long-term value of the shape parameter is much less spatially variable than the scale, but in most areas is above the value of k = 2, i.e. the distribution is less skewed than a Rayleigh distribution. The only areas with statistically significant trends are in the central Mediterraneanwe have already flagged this region as being subject to data quality issues, and it is discussed further in the Appendix of the main paper. Zhou and Smith (2013) studied the Weibull parameters derived from the 10 m winds over land only, using the NCEP Climate Forecast System Reanalysis (CFSR, Saha et al 2010Saha et al ) covering 1980Saha et al -2009. The values they derive for λ and k are noticeably smaller than those we have shown here. This is to be expected, for the scale parameter at least, because of the difference in height compared to our winds at 60 m. However, the CFSR is also at much higher resolution than the 20CR (around 0.3 • compared to 2 • ), which makes direct comparison difficult. However, their results are similar to those of Kiss and Jánosi (2008), who used the 10 m winds from the ERA-40 reanalysis (Uppala et al 2005), covering 1958-2002 at a resolution of 1 • . Kiss and Jánosi (2008), like us, also noted that Weibull fits tend to be much better over the ocean than the land; the data used by Zhou and Smith (2013) only covers land area. Bett et al (2013) considered the uncorrected 20CR data, obtaining slightly lower values for Weibull scale than the present study (as expected from our calibration usually acting to increase wind speeds), and slightly smaller Weibull shapes (i.e. less symmetric distri-butions before calibration, as expected from Figure 19). In that paper, the trend analysis was performed using simple linear regression and a t-test at 0.1% significance; the magnitudes and spatial patterns of the trends are very similar to our present results, although we find smaller areas of significance.
Finally, it is important to note that being well-described by a Weibull distribution should not be taken as a necessary indicator of physical realism. The work of Harris and Cook (2014) suggests that in regions with strongly varying wind speeds (such as in different seasons), or between regions with different behaviour (such as coastal vs. inland), then a combination of Weibull distributions for the different climatological scenarios (regions/seasons) would be a better fit -although one then starts to lose the benefit of simplicity and conciseness that led to the Weibull distribution in the first place. Considering the 20CR and ERAI data, the difference in frequencies of low wind speeds might suggest that the models underlying those reanalyses are treating the drag from unresolved orography differently. It is not clear which might be more 'correct' however: it is not implausible that on these large scales, very low wind speeds should indeed be rare (and un-Weibull-like), and it would be the 20CR data in that case that has an unrealistically high numbers of low-wind days. While there is significant disagreement between 20CR and ERAI on the shape of the wind speed distribution in the domain we have studied, resolving it would involve detailed comparison work between the two different modelling centres in question, as well as further validation against observations.