1 Introduction

The climatology of the marine surface wind field is of considerable commercial interest, e.g., for harnessing wind energy at favorable offshore sites, for shipping, and for the offshore oil and gas industry. In view of a strong increase in the industrial exploitation of the North Sea, this study addresses four questions related to the climatology of the daily mean wind speed at 10-m height (WS10): (1) Are there important differences in the mean and extreme WS10 between different low-resolution reanalysis datasets, especially in their long-term trends? (2) How well is WS10 represented in the low-resolution global reanalysis datasets (horizontal resolution of roughly 100 × 100 km) compared to high-resolution simulations (horizontal resolution of 3 × 3 km); that is, are there important processes that remain unresolved at low resolution but are resolved in higher-resolution simulations, and do they have a long-term impact? (3) Can the relationship between the monthly mean WS10 and the large-scale circulation patterns over the European–Atlantic region, and their interannual variability in the last few decades, be quantified? (4) Can the extreme WS10 over the German Bight be related to extreme circulation patterns, and do the long-term trends of extreme WS10 and circulation patterns agree?

The WS10 from different reanalysis datasets can be used to study the near-surface wind climatology and its relation to the circulation patterns. Reanalysis data have been widely used for climate studies since the data quality is mostly consistent over a number of decades, which is achieved by using the same model throughout the reanalysis period (Kalnay et al. 1996). However, with the development of the observation system, the quality of reanalysis data has improved over the last two decades. The most pronounced change affecting WS10 is probably the assimilation of satellite-measured near-surface wind speeds, which started in the late 1970s (Kistler et al. 2001). This can lead to discontinuities in the WS10 time series and to artificial trends. It was confirmed that some basic parameters, such as the sea level pressure (SLP), from different reanalyses generally agree well over Europe (e.g., Greatbatch and Rong 2006), which increases confidence in using the reanalysis data. The WS10 from a reanalysis may, however, be sensitive to model details, such as the selected surface and boundary layer parameterizations (Kalnay et al. 1996). As far as we are aware, there has been no systematic comparison of the WS10 from different reanalyses over northern Europe.

Most studies concerned with the wind climatology over northern Europe use low-resolution wind speed or a proxy dataset (e.g., derived from the horizontal pressure gradients). For example, Pryor and Barthelmie (2003), Pryor et al. (2006), and Yan et al. (2006) use reanalysis data to infer the climatology of near-surface mean and extreme winds over northern Europe, their interdecadal variability, and linear trends. Weisse et al. (2005) investigated the changes of cyclone intensity and frequency over the North Atlantic and the North Sea from the downscaled National Centers for Environmental Prediction (NCEP) reanalysis data of around 50-km horizontal grid size. Studying extreme winds, some authors prefer to use proxy data, such as the mean SLP or geostrophic wind speed inferred from the SLP gradients, arguing that the SLP data are less prone to systematic errors which may lead to artificial trends (e.g., Schmidt and von Storch 1993; Alexandersson et al. 2000; Matulla et al. 2007). The proxy data derived from SLP are usually able to capture the large-scale forcing on the wind field but not the local effects. The ability of the low-resolution wind speed datasets or the proxy data to represent realistic surface wind conditions over the North Sea remains unclear. The largest differences between low- and high-resolution wind speed are expected close to the coastline or orographic features, where the wind speed may deviate due to strong gradients in forcing and formation of local circulation patterns or in case of strong horizontal gradients (e.g., fronts).

The influence of the North Atlantic oscillation (NAO), which is described by the anomaly of pressure gradient between northwestern (usually Iceland) and southwestern North Atlantic (usually Azores), on the wind speed over northern Europe is well documented (e.g., Marshall et al. 2001; Hurrell and van Loon 1997; Trigo et al. 2002). The NAO index (NAOI) is well correlated with the wind speed over northern Europe, especially in the winter season where the large-scale circulation has a more dominant influence on the weather. In the summer season, the surface heating produces disturbances weakening the large-scale circulation. An increase of the NAOI from the 1960s to the mid-1990s was observed. It is known that the increase in NAOI leads to higher wind speeds over northern Europe (e.g., Hurrell and van Loon 1997). Here, we investigate whether a regional spatial pattern of SLP exists, which has a more dominant influence on the WS10 variability over northern Europe than the NAO. It is also not clear if there are other patterns which are able to describe an additional fraction of WS10 variability. For example, Barnston and Livezey (1987) have observed that the second dominant mode of the North Atlantic pressure variability is the East Atlantic (EA) pattern, which can explain a considerable part of the wind speed variability in northern Europe. Kaas et al. (1996) found that a pressure structure similar to the NAO is related to the wind speed in northern Europe, although the northern center of their pattern is more elongated and shifted eastwards. Rogers (1997) identified a pattern similar to the one from Kaas et al. (1996) and observed that it was related to the position of the storm tracks. Another alternative is to relate the surface wind speeds to other large-scale fields such as the surface temperature. The study by Deser and Blackmon (1993) suggests that the Sea Surface Temperature (SST) is well correlated with the wind speed. 
However, they found that the relationship is local and suggest that the surface wind influences the SST by mixing the upper ocean waters.

High-resolution simulations can resolve a number of physical processes such as the land–sea breeze in the coastal areas which is additional information not obtained by statistical downscaling of low-resolution reanalysis data. The climate signature of these processes may be vital for some applications such as for wind energy resource studies. The long-term fluctuations and trends extracted from the reanalysis can be combined with the detailed regional information from high-resolution simulations. This information is vital for risk management, resource assessment, and long-term planning to compensate for the inherent fluctuations of the climate system in the development of renewable energy resources and by other stakeholders. Here, we use 3-year data from a mesoscale model (Weather Research and Forecasting, WRF) simulation with 3-km resolution to determine the significance of key regional physical processes.

As opposed to the well-studied variability of the mean wind speed over northern Europe, the variability of extreme winds over the past few decades is less well known. On the one hand, with an increase in the NAOI and the mean wind speed from the 1960s to the mid-1990s, the extreme wind speeds are expected to rise. On the other hand, there are studies that show a northward shift of the North Atlantic storm track in the past decades, which causes a decreasing trend in the number of storms in the southern part of northern Europe (Schneidereit et al. 2005; Wang et al. 2006). The extreme winds can be characterized by different criteria, such as the occurrence of wind speeds exceeding a threshold (e.g., Weisse et al. 2005) or wind speeds at certain high percentiles, such as the 90th percentile (Pryor and Barthelmie 2003; Matulla et al. 2007). With the first approach, the number of storm events exceeding a predefined threshold level is investigated, while, with the second approach, the intensity of storms is studied. Different and sometimes contradictory results regarding the trends of the extreme winds or storms have been presented. Based on SLP measurements, Schmidt and von Storch (1993) found no increase and very little interannual variability of the high wind speed percentiles over the last century in the German Bight. From an SLP analysis, Matulla et al. (2007) concluded that the extreme wind speeds in the northeastern Atlantic region were high in the late nineteenth and the beginning of the twentieth century, lower in the middle of the twentieth century, and at the end of the twentieth century reached a level similar to that at its beginning. Their analysis is focused on the wind speed north and east of the North Sea. Pryor and Barthelmie (2003) observed an increase in the high wind speeds in the last few decades over the Baltic Sea and parts of the North Sea, derived from the NCEP/National Center for Atmospheric Research (NCR) reanalysis. 
An increase in the extreme wind speeds over the Baltic is related to an increase in the frequency of the western types of “Grosswetterlagen” (GWL) in the last quarter compared to the middle of the twentieth century. The extreme weather conditions are usually not well linked to the large-scale circulation patterns such as the NAO since the NAO explains well the variability on monthly to interdecadal timescales, while the extreme values by definition occur only for short periods. Therefore, some authors link the extreme values to different circulation patterns (Yiou and Nogaj 2004; Panja and Selten 2007).

The paper is structured as follows. The data are described in “Section 2.” In “Section 3,” the WS10s over the North Sea from different reanalysis datasets are compared to each other and to the high-resolution WS10. The relation of the mean WS10 to the circulation patterns and their time evolution and the relation of the extreme WS10 to the extreme circulation patterns are then derived. Finally, the discussions and conclusions are presented in “Section 4.”

2 Data description

The present study uses the daily SLP, 2-m temperature (T2M), WS10, and wind speed at the 700-hPa level (WS700hPa) from three reanalysis datasets: NCR (Kalnay et al. 1996; Kistler et al. 2001); NCEP-Department of Energy (NCR2; Kanamitsu et al. 2002), which is an updated version of the NCR reanalysis; and ECMWF ERA40 (Uppala et al. 2005). The NCR and NCR2 reanalyses are based on the same model with a global T62 horizontal grid and 28 vertical sigma levels. In NCR2, errors found in NCR were corrected, additional data were assimilated, and some of the physical parameterizations were changed. For the present analysis, the most relevant modification is the change from a simple boundary layer parameterization based on the local Richardson number in NCR to a nonlocal diffusion scheme in NCR2 (Kanamitsu et al. 2002). The ERA40 model is defined on the global T159 grid with 60 hybrid vertical levels.

In addition, the higher-resolution (3-km grid size) but shorter (3 years) WS10 data were obtained by integrating the WRF (Skamarock et al. 2005) model (Sood et al. 2007). The duration of the short time series of high-resolution data is not adequate to derive climate variability characteristics of WS10. Rather, it was used to investigate how well daily WS10 over the North Sea is represented in the low-resolution reanalysis dataset and to detect additional high-resolution information. WRF was initialized at 18 UTC every second day with a spin up time of 6 h and a forecast horizon of 53 h. A continuous four-time daily time series is constructed by combining the 6-hourly data from the 6th to 53rd forecast hours. The initial and boundary conditions are prescribed from the NCEP Final Analysis. Three (two-way) nested domains are defined in WRF, where the third domain with a horizontal resolution of 3 km spans the innermost region (lat. 52.5–55.25 N, lon. 4.75–9.5 E). The WRF model includes parameterizations of all essential physical processes. The surface and the boundary layer parameterizations were based on the Mellor–Yamada–Janjić model (Janjić 2002). Here, only WS10 from the third WRF domain (WRF3) was analyzed.

To relate the dominant WS10 patterns to the main modes of atmospheric variability, the NAOI and the East Atlantic Index (EAI) from the Climate Prediction Center (CPC) were used. These indices were constructed based on the Rotated Principal Components analysis of the northern hemispheric 500-hPa geopotential height. In addition, the corrected and preprocessed measurements of SLP from the stations at Göteborg (Sweden) and Oksoy (Norway), which were a part of the European project EMULATE (European and North Atlantic daily to multidecadal climate variability; Ansell et al. 2006), were used to relate the extreme WS10 to the circulation indices.

All datasets used in this study are shown in Table 1, where the time period, the horizontal resolution, and the parameters used (where applicable) are listed.

Table 1 Datasets used in this study (see text for abbreviations), with their horizontal resolution, time period, and parameters

The focus of the WS10 analysis is mainly over the North Sea region (NS; lat. 50–60 N, lon. 3 W–20 E), though, for relating the WS10 patterns to the large-scale patterns, a larger North Sea region (lNS) is defined (lat. 45–65 N, lon. 15 W–30 E). For the case where only WS10 patterns were investigated, the North European (NE) region is additionally used (lat. 40–70 N, lon. 20 W–30 E). The SLP and T2M data, used as predictors of WS10, are defined on the Atlantic–European domain (lat. 30–90 N, lon. 80 W–40 E). The extreme WS10 was studied over the German Bight (lat. 52.5–55.5 N, lon. 4.7–9.5 E).

The analysis is based on the daily mean values defined as an average of the four-time daily data (00 UTC, 06 UTC, 12 UTC, and 18 UTC). In case of WS10, the magnitude of WS10 was calculated from the wind speed components (U10 and V10) from 6-hourly data and averaged over the day.
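The order of operations described above (magnitude first, then daily average) can be illustrated with a minimal Python sketch. The function name and the array layout (four consecutive 6-hourly values per day along the first axis) are assumptions made for illustration only and are not taken from the datasets themselves.

```python
import numpy as np

def daily_mean_ws10(u10, v10):
    """Daily mean 10-m wind speed from 6-hourly wind components.

    u10, v10: arrays of shape (n_days * 4, ...) holding the 00, 06, 12,
    and 18 UTC values in time order (assumed layout).
    """
    u10 = np.asarray(u10, dtype=float)
    v10 = np.asarray(v10, dtype=float)
    speed = np.hypot(u10, v10)            # magnitude at each 6-hourly step
    n_days = speed.shape[0] // 4
    speed = speed[: n_days * 4]           # drop an incomplete final day
    # Average the four magnitudes of each day
    return speed.reshape((n_days, 4) + speed.shape[1:]).mean(axis=1)

# Two days at a single grid point (made-up numbers)
u = np.array([3.0, 0.0, -3.0, 0.0, 6.0, 6.0, 6.0, 6.0])
v = np.array([4.0, 5.0, 4.0, -5.0, 8.0, 8.0, 8.0, 8.0])
ws10_daily = daily_mean_ws10(u, v)
```

Averaging the magnitudes is not the same as taking the magnitude of the averaged components: on the first day of the example, the wind direction changes during the day, so the mean wind vector is much weaker than the mean wind speed.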

A part of the analysis is performed on the anomalies from the annual cycle. The anomalies were computed for each grid point of the data field separately. First, the monthly average value is computed as an average value at each grid point within a calendar month. The mean annual cycle was constructed by assuming that the monthly average corresponds to the 15th day of the calendar month, while the daily values of mean annual cycle are calculated as a linear interpolation of the values at the 15th day. The anomalies, i.e., deviations from the mean annual cycle, are obtained as a difference between the original field and the mean annual cycle. Part of the analysis is performed for the winter (December–February) and the summer (June–August) seasons separately. Two periods of 21 years were defined, the past presatellite period (1960–1980) and the recent period (1980–2000) where the satellite data (in particular, the marine surface wind) are assimilated in the reanalysis.
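The construction of the mean annual cycle and of the anomalies can be sketched as follows for a single grid point. This is one possible reading of the procedure, not the code used in the study: it assumes that the monthly average is the climatological mean of each calendar month over all years, that leap days need no special treatment (a 365-day cycle), and that the interpolation wraps around from December to January. The function and variable names are hypothetical.

```python
import numpy as np

# Day of year of the 15th of each month in a 365-day calendar
# (assumption: leap days are not treated specially in this sketch).
MONTH_LENGTHS = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
MID_MONTH_DOY = np.cumsum(MONTH_LENGTHS) - MONTH_LENGTHS + 15

def annual_cycle_anomaly(dates, values):
    """Return (anomaly, mean annual cycle) for one daily time series."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    values = np.asarray(values, dtype=float)
    # Calendar month (0 = January) and day of year (1 = 1 January)
    month = dates.astype("datetime64[M]").astype(int) % 12
    doy = (dates - dates.astype("datetime64[Y]")).astype(int) + 1
    # Climatological mean of each calendar month
    clim = np.array([values[month == m].mean() for m in range(12)])
    # Assign the monthly means to the 15th and interpolate linearly,
    # treating the year as periodic
    cycle = np.interp(doy, MID_MONTH_DOY, clim, period=365)
    return values - cycle, cycle

# Synthetic example: two years of daily data with a seasonal signal
dates = np.arange("1999-01-01", "2001-01-01", dtype="datetime64[D]")
day = np.arange(dates.size)
ws10 = 8.0 + 2.0 * np.cos(2.0 * np.pi * day / 365.25)
anomaly, cycle = annual_cycle_anomaly(dates, ws10)
```

For a gridded field, the same function would be applied to each grid point separately, as stated above.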

3 Analysis methods and results

The surface wind conditions over the North Sea derived from the datasets described in “Section 2” are investigated for a range of temporal and spatial scales to detect resolution-dependent characteristics and to test the robustness of consequent assertions. The modes of wind speed variability over the North Sea are examined and related to the large-scale circulation patterns to identify and analyze relevant indicators of regional climate trend. The time evolution of the extreme wind speed and its relation to the circulation patterns are studied.

3.1 Description of surface wind speed in different datasets

To estimate the reliability of the reanalyzed daily wind speed in terms of mean values and trends, the surface wind speed (WS10) and the WS700hPa from different reanalyses are compared. Since all reanalyses are based on virtually the same measurement data, the wind speed, especially close to the surface (WS10), may share similar problems, such as artificial trends and discontinuities due to advances in the observation system (e.g., the assimilation of near-surface satellite measurements). To assess the representation of WS10 by the low-resolution data, the low-resolution WS10 was interpolated to the high-resolution grid and compared to the high-resolution data.

In the recent period, the pattern of the mean WS10 across the NS domain is mainly influenced by the land–sea interface, with the highest WS10 values over the sea reaching a maximum of around 8 to 10 ms−1 (depending on the reanalysis model) and the lowest values over land of around 3 to 6 ms−1 in the southeastern part of the NS domain (Fig. 1, upper panel). The mean WS10 from NCR and ERA40 agree well over the studied area, while the mean WS10 from NCR2 is around 2 ms−1 higher. The difference between the mean WS10 from NCR and NCR2 is strikingly high even though both reanalyses are based on the same model core and assimilation setup. Possible reasons for the difference are the different atmospheric boundary and surface layer parameterizations or the description of the land surface used in the NCR and NCR2 models. The mean WS700hPa, which is used as a measure of the wind speed at the top of the boundary layer or in the free atmosphere and represents the upper kinematic forcing on the boundary layer, differs only slightly among the reanalysis models (by less than 0.5 ms−1; lower panel of Fig. 1). This indicates that the differences in mean WS10 between the models are primarily due to different representations of the boundary layer or land surface processes.

Fig. 1

Mean WS10 (ms−1; upper panel) and WS700hPa (ms−1; lower panel) in the recent period (1980–2000) over NS domain from NCR, NCR2, and ERA40 reanalysis

The variability of WS10, characterized by the standard deviation of the daily mean WS10 time series, is roughly half the mean value and varies very little across the domain (not shown). A significant fraction of the WS10 variability is due to the mean annual cycle, which is examined through the probability density distributions of the wind speed P(WS10) and the wind direction P(θ) averaged over the NS domain for the recent period (Fig. 2). The data plotted in Fig. 2 are binned at monthly resolution to present the intra-annual variability. The magnitude of the spatially averaged WS10 is sensitive to the domain definition and the reanalysis data used, while the time evolution of the mean annual cycle of WS10 and of the wind direction is relatively insensitive. The strongest winds are observed in the winter months (November–February), when the storm activity over northern Europe peaks. During this period, the wind direction is mostly from west to southwest (200–270°). In the spring season (March–June), the mean wind speeds decrease with the weakening of the storm activity over northern Europe. The wind directions remain predominantly from west to southwest, but other directions are also observed. The weakening of the storm activity allows for more frequent outbreaks in this region, especially from the north and northeast. Approaching summer, the wind speeds decrease further, reaching their minimum in late summer, and then increase again from August onwards.

Fig. 2

Probability density distributions of WS10 ((ms−1)−1; left) and wind direction (deg−1; right) for the annual cycle from the NCR averaged over NS domain in the recent period (1980–2000)

The temporal variability of WS10 agrees well between the datasets. The linear correlation coefficient between the time series of daily WS10 from NCR and ERA40 (bilinearly interpolated to the NCR grid points) is high, reaching values between 0.9 and 0.95 everywhere within the NS domain, except in the southern part of Scandinavia, where the values are as low as 0.8 (not shown). The reason for the lower correlation coefficient over Scandinavia may be the different model resolutions, which also lead to different representations of the terrain. Since the NCR and NCR2 models are based on the same underlying orography, the correlation coefficient between WS10 from those two reanalyses is above 0.95 over the NS domain.

Even though the WS10 from different reanalyses are well correlated, their linear trends differ substantially. Figure 3 shows the linear trends in the annually averaged WS10, calculated for each grid point separately and plotted where they are significant at the 5% level. To estimate the statistical significance of the WS10 trends, a Student's t test was used with an adjusted number of degrees of freedom to account for serial autocorrelation in the data. The effective sample size is calculated as:

$$ n_e = n_t \frac{1 - r_1}{1 + r_1} $$
(1)

where $n_t$ is the number of sample members, $n_e$ the effective sample size, and $r_1$ the lag-one autocorrelation of the regression residuals. The effective sample size was used to correct the estimate of the variance of the residuals about the regression line and the number of degrees of freedom in the Student's t test. Further details can be found in Santer et al. (2000). In the past period, the ERA40 data show a high positive trend of WS10 over the North Sea, while the trend in NCR is lower and seems to be centered in the northwestern part of the domain and over the coastal areas (Fig. 3, upper panel). In the recent period, the trend from the three reanalysis models is positive everywhere but quite different (Fig. 3, lower panel). The WS10 trend from the NCR data is the lowest and is concentrated over the southwestern part of the domain. In the ERA40 data, the trend is mainly over northern Germany, whereas in the NCR2 data it is over the southwestern part of the domain including the North Sea. The linear trends in the WS700hPa agree well among all reanalysis models and are positive and significant over the northern part of the domain. As with the mean daily WS10, the different trends in the WS10 reanalysis data appear to result mainly from different representations of surface and boundary layer processes in the reanalysis models. The WS10 trends in the different datasets differ substantially while the correlation coefficients of WS10 are high. This seems contradictory but is understandable since the linear trends explain only a small proportion of the WS10 variance.
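The adjusted trend test can be sketched in Python as follows. This is an illustrative implementation following our reading of the procedure in Santer et al. (2000), not the code used for Fig. 3; the function name and the synthetic series are hypothetical.

```python
import numpy as np

def adjusted_trend_test(y):
    """Least-squares trend with a lag-one autocorrelation adjustment.

    Returns the slope per time step, the t ratio (slope divided by its
    adjusted standard error), the effective sample size n_e (Eq. 1),
    and the lag-one autocorrelation r_1 of the regression residuals.
    The result is only meaningful if n_e > 2.
    """
    y = np.asarray(y, dtype=float)
    n_t = y.size
    t = np.arange(n_t, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (intercept + slope * t)
    # Lag-one autocorrelation of the residuals
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    n_e = n_t * (1.0 - r1) / (1.0 + r1)
    # Residual variance and standard error of the slope, using n_e
    s_e2 = np.sum(resid ** 2) / (n_e - 2.0)
    s_b = np.sqrt(s_e2 / np.sum((t - t.mean()) ** 2))
    return slope, slope / s_b, n_e, r1

# Synthetic annual series: weak trend plus autocorrelated noise
rng = np.random.default_rng(0)
noise = np.zeros(21)
for i in range(1, noise.size):
    noise[i] = 0.5 * noise[i - 1] + 0.3 * rng.standard_normal()
series = 8.0 + 0.02 * np.arange(21) + noise
slope, t_ratio, n_e, r1 = adjusted_trend_test(series)
```

The t ratio is then compared with the critical value of the Student's t distribution with $n_e - 2$ degrees of freedom. For positively autocorrelated residuals, $n_e < n_t$, so a trend must be stronger to be judged significant than under the assumption of independent samples.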

Fig. 3

Linear trends in the annual averaged WS10 (ms−1 year−1) for the past (upper panel) and recent (lower panel) period. From left to right: NCR, NCR2, and ERA40. Only the trends significant on the 5% level (estimated by Student t test) are plotted

The question addressed next is whether it is possible to represent the daily mean WS10 over the North Sea by low-resolution datasets, such as the reanalysis data. This depends on the spatial scales of the governing physical processes responsible for the daily regional WS10 variability compared to the resolution of the reanalysis data. It is well known that WS10 over the North Sea is partly influenced by the North Atlantic storm activity with typical horizontal scales of a few hundred kilometers. It is not clear if there are other local circulation patterns that distinctly influence the daily WS10. To estimate the typical length scale of WS10, an isotropic first-order autoregressive, AR(1), model is fitted to the WS10 data. In the AR(1) model, the correlation coefficient between the time series at locations $r_i$ and $r_j$, separated by the distance $d(r_i, r_j)$, decreases exponentially with distance (e.g., von Storch and Zwiers 1999) and does not depend on direction:

$$ \text{corr}\left( r_i, r_j \right) = e^{ - d\left( r_i, r_j \right) / d_0 } $$
(2)

where $d_0$ is the decorrelation length, interpreted as the characteristic length beyond which points in space are no longer correlated. In the WRF3 data, the decorrelation length of the daily WS10 anomaly from the annual cycle is around 800 km over most of the domain and the North Sea, with smaller values over the southern and northern parts of the domain as well as along the coastal regions (Fig. 4). The absolute value of the decorrelation length is not very reliable since the horizontal domain of WRF3 is smaller than the decorrelation length. However, the result indicates that the daily WS10 variability over the North Sea can be well represented by the reanalysis models since the decorrelation length of the daily WS10 is larger than the grid spacing of the reanalysis models. Exceptions are probably the regions along the North Sea coast, especially the Weser-Ems area, where the decorrelation length is lower, probably due to local circulation patterns. The spatial pattern may result from the local land–sea orientation with respect to the prevailing southwest to west winds. The smaller decorrelation length in the northern and southern parts of the domain may reflect a real signal rather than an artifact of the domain boundaries in the calculation of the decorrelation length: the second Empirical Orthogonal Function (EOF) of WS10 shows that the WS10 over these two areas is negatively correlated (“Section 3.2”).
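One simple way to estimate $d_0$ at a reference grid point is sketched below in Python. The text does not state how the AR(1) model was fitted, so the log-linear least-squares fit through the origin used here is an assumption, as are the function names and the flat (x, y) coordinates in kilometers.

```python
import numpy as np

def fit_decorrelation_length(dist_km, corr):
    """Fit corr = exp(-d / d0) by least squares on ln(corr) = -d / d0.

    Only pairs with positive distance and positive correlation are used
    (assumed fitting choice). Returns d0 in kilometers.
    """
    dist_km = np.asarray(dist_km, dtype=float)
    corr = np.asarray(corr, dtype=float)
    ok = (dist_km > 0) & (corr > 0)
    d, log_c = dist_km[ok], np.log(corr[ok])
    slope = np.sum(d * log_c) / np.sum(d ** 2)   # line through the origin
    return -1.0 / slope

def decorrelation_length_at(anom, xy_km, ref):
    """d0 at grid point `ref`.

    anom:  (n_time, n_points) WS10 anomalies
    xy_km: (n_points, 2) point coordinates in km (assumed flat geometry)
    """
    corr = np.corrcoef(anom, rowvar=False)[ref]
    dist = np.hypot(*(xy_km - xy_km[ref]).T)
    return fit_decorrelation_length(dist, corr)

# Correlations that follow Eq. 2 exactly are recovered by the fit
d = np.array([10.0, 50.0, 100.0, 300.0])
d0 = fit_decorrelation_length(d, np.exp(-d / 800.0))
```

Repeating the estimate for every grid point gives a map of $d_0$. Because the log transform weights small correlations strongly, a direct nonlinear fit may be preferable for noisy data.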

Fig. 4

Decorrelation length ($d_0$; km) for the WS10 anomaly from the annual cycle estimated from the WRF3 data

To quantify the relationship between WS10 from the low-resolution reanalysis data (NCR) and the high-resolution WRF3, WS10 from NCR was interpolated to the WRF3 grid and compared over the common 3-year time frame. Two different interpolation techniques were used: (1) bilinear interpolation of WS10 from NCR ignoring any information on the land–sea interface (we refer to this interpolation as BILIN) and (2) separate bilinear interpolation for the WRF3 land and sea areas, where, at the interface, where bilinear interpolation is not possible, the WS10 from the nearest point was taken (we refer to this interpolation as BILIN-COAST). These simple interpolations do not take any specific knowledge of the atmospheric dynamics into account. As expected, the 3-year mean WS10 interpolated with BILIN-COAST shows an excessively high gradient of WS10 along the coasts, while the WS10 interpolated with the BILIN method shows too weak a gradient at the coast compared to WS10 from WRF3 (Fig. 5). Therefore, it appears that, in the coastal regions, the high-resolution WS10 adds information related to the local circulation patterns that is not contained in the low-resolution data. The upper panel of Fig. 6 shows more clearly that, with the BILIN interpolation, the too weak gradient of WS10 at the coast leads to an overestimation of WS10 at the land points close to the coast and an underestimation at the sea points in the coastal region, while, with the BILIN-COAST method and its sharp land–sea discontinuity, just the opposite is true. The correlation coefficient between WS10 from WRF3 and WS10 interpolated from NCR is high over the whole domain, with the lowest values at the coast of the Weser-Ems area and Holland. 
The lower correlation is probably due to the prevailing southwestern wind which mixes the continental and oceanic air masses at the land–sea interface, weakening the local circulation patterns at the windward side but retaining some part towards the leeward side, which may leave an imprint on the long-term daily WS10 variability.
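The two interpolation variants can be sketched in Python as follows. This is an illustrative reading of the description above, not the code used in the study: the coarse grid is assumed to be regular with ascending coordinates, target points are assumed to lie inside it, the nearest same-type coarse point is found with a plain squared coordinate difference rather than a true distance on the sphere, and all function names are hypothetical.

```python
import numpy as np

def _cell(xc, yc, xt, yt):
    """Index of the enclosing coarse cell and the weights inside it."""
    ix = np.clip(np.searchsorted(xc, xt) - 1, 0, xc.size - 2)
    iy = np.clip(np.searchsorted(yc, yt) - 1, 0, yc.size - 2)
    wx = (xt - xc[ix]) / (xc[ix + 1] - xc[ix])
    wy = (yt - yc[iy]) / (yc[iy + 1] - yc[iy])
    return ix, iy, wx, wy

def bilin(xc, yc, field, xt, yt):
    """BILIN: bilinear interpolation of field[(y, x)] to the target points."""
    ix, iy, wx, wy = _cell(xc, yc, xt, yt)
    south = (1 - wx) * field[iy, ix] + wx * field[iy, ix + 1]
    north = (1 - wx) * field[iy + 1, ix] + wx * field[iy + 1, ix + 1]
    return (1 - wy) * south + wy * north

def bilin_coast(xc, yc, field, sea_c, xt, yt, sea_t):
    """BILIN-COAST: bilinear where all four corners share the target's
    surface type, otherwise the nearest coarse point of that type."""
    out = bilin(xc, yc, field, xt, yt)
    ix, iy, _, _ = _cell(xc, yc, xt, yt)
    same = np.ones(xt.shape, dtype=bool)
    for dy in (0, 1):
        for dx in (0, 1):
            same &= sea_c[iy + dy, ix + dx] == sea_t
    gx, gy = np.meshgrid(xc, yc)
    for k in np.flatnonzero(~same):
        d2 = (gx - xt[k]) ** 2 + (gy - yt[k]) ** 2
        d2[sea_c != sea_t[k]] = np.inf        # exclude the other surface type
        out[k] = field[np.unravel_index(np.argmin(d2), d2.shape)]
    return out

# Tiny coarse grid: one sea column (10 m/s) next to two land columns (4 m/s)
xc = np.array([0.0, 1.0, 2.0])
yc = np.array([0.0, 1.0])
sea_c = np.array([[True, False, False], [True, False, False]])
ws_c = np.where(sea_c, 10.0, 4.0)
xt, yt = np.array([0.5, 1.5]), np.array([0.5, 0.5])
sea_t = np.array([True, False])               # a sea point and a land point
ws_bilin = bilin(xc, yc, ws_c, xt, yt)
ws_coast = bilin_coast(xc, yc, ws_c, sea_c, xt, yt, sea_t)
```

In this toy case, BILIN smooths the land–sea contrast at the coastal sea point, while BILIN-COAST retains the full sea value there, which mirrors the too weak and too sharp coastal gradients discussed above.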

Fig. 5

The 3-year averaged WS10 (ms−1) from WRF (left), NCEP interpolated with BILIN (middle), and with BILIN-COAST (right) interpolation on the WRF3 grid

Fig. 6

Mean difference (upper panel) and correlation coefficient (lower panel) between 3 years of WS10 (ms−1) from WRF3 and NCEP interpolated on WRF3 grid. Left panel is for BILIN-COAST interpolation and right panel for BILIN interpolation (see text for details)

3.2 Modes of wind speed variability and their relation to the large-scale patterns

Since WS10 over the North Sea is spatially highly correlated, it is sufficient to characterize it by a few well-chosen modes, for example the EOFs. EOF analysis (e.g., Preisendorfer 1988; von Storch and Zwiers 1999) decomposes the data into a linear combination of modes, each consisting of a spatial pattern (EOF) and its time evolution (principal component, PC), such that the modes successively explain the largest possible part of the variance of the data under the orthogonality constraint on the EOFs. The resultant EOFs represent the spatial distribution of the magnitude of the field together with its relative phase, while the time series (PCs) are uncorrelated and describe the time evolution of the corresponding EOF. In this analysis, each pair of PC and EOF is normalized so that the variance of the PC equals unity and the EOF represents the magnitude of a typical variation of the field; thus, the physical units of the field are attributed to the EOF. The EOF analysis is performed on the monthly mean WS10 anomalies from the mean annual cycle to account for the seasonality. Since only the monthly WS10 anomalies from the mean annual cycle are considered in this section, we simply refer to them as WS10. The EOF modes of WS10 and the related SLP and T2M patterns from the NCR dataset for the complete NCR period (1948–2007) are shown in the first part of this section, while, in the second part, the differences between NCR and the other two reanalysis datasets in the recent (1980–2000) and the past (1960–1980) periods are discussed.
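The decomposition and the normalization described above can be sketched with a singular value decomposition in Python. This is a minimal illustration, not the implementation used in the study; in particular, no area weighting of the grid points is applied, and the function name and the synthetic input are hypothetical.

```python
import numpy as np

def eof_analysis(anom, n_modes=3):
    """EOFs and PCs of a (n_time, n_space) anomaly matrix.

    Each PC is scaled to unit variance, so the matching EOF carries the
    physical units of the field. Also returns the fraction of variance
    explained by each mode.
    """
    anom = np.asarray(anom, dtype=float)
    anom = anom - anom.mean(axis=0)              # remove the time mean
    n_time = anom.shape[0]
    u, s, vt = np.linalg.svd(anom, full_matrices=False)
    scale = np.sqrt(n_time - 1)
    pcs = u[:, :n_modes] * scale                 # unit-variance time series
    eofs = vt[:n_modes] * (s[:n_modes, None] / scale)
    explained = s ** 2 / np.sum(s ** 2)
    return eofs, pcs, explained[:n_modes]

# Synthetic stand-in for monthly anomalies: 120 months, 30 grid points
rng = np.random.default_rng(1)
field = rng.standard_normal((120, 1)) * np.linspace(1.0, 2.0, 30)
field = field + 0.5 * rng.standard_normal((120, 30))
eofs, pcs, explained = eof_analysis(field, n_modes=3)
```

With this scaling, the sum over modes of each PC multiplied by its EOF reconstructs the centered anomaly field, and the PCs are mutually uncorrelated.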

The EOF modes of WS10 from the NCR reanalysis are calculated on three domains (NS, lNS, and NE, as defined in the previous section) and compared to investigate the sensitivity of the modes to the domain size, where the measure of similarity is the correlation coefficient between the corresponding PCs (Table 2). While the first EOF mode is almost identical on all three domains, the second mode is very similar only on the two largest domains (lNS and NE), while on the smallest domain it deviates considerably. The third (and also higher) EOF modes are sensitive to the domain selection. In addition, a rotation of the EOFs has been performed with the VARIMAX method (e.g., Richman 1986) using the first five or ten EOF patterns. In both cases, the first rotated pattern was almost identical to the first unrotated pattern, while the second and third rotated patterns were mainly linear combinations of the second and third unrotated patterns (not shown). Since the rotated EOF modes do not seem to give additional information or to better capture the relevant WS10 variability, only the unrotated EOF modes are used in the further analysis.

Table 2 Correlation coefficients for the first four PCs of monthly WS10 anomaly from NCR data for the time period 1948–2007, calculated over the NS and lNS domain (corr(NS, lNS)) and over the lNS and NE domain (corr(lNS, NE))

To investigate the relevance of the EOF modes of WS10, the explained variance of each mode is compared to the explained variance of the simple isotropic diffusive model modes projected on the EOF modes of WS10. The diffusive model is defined in Dommenget (2007) as AR(1) model in space and time:

$$ \frac{d}{dt}\Phi = c_{\text{damp}} \Phi + c_{\text{diff}} \nabla^2 \Phi + f $$
(3)

where $\Phi$ is the synthetic field forced by spatial and temporal white noise ($f$); the damping ($c_{\text{damp}}$) and diffusion ($c_{\text{diff}}$) coefficients are calculated from the WS10 data. As shown by Dommenget (2007), comparing the explained variance of the data modes with that of an AR(1) process projected on the data modes helps to separate the data modes that represent the spatial covariance of the data from the modes that represent teleconnection patterns.

The first three EOFs of the monthly WS10 for the lNS domain are plotted and the EOFs on the smaller NS domain are superimposed (Fig. 7). The explained variance of the first five EOF modes of WS10 from the lNS domain is shown in Fig. 8, where the confidence limits are obtained from North’s rule of thumb (North et al. 1982). The explained variance of the AR(1) process modes projected on the EOF modes is also shown in this plot.
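North's rule of thumb estimates the sampling error of an eigenvalue $\lambda$ as roughly $\lambda \sqrt{2/N}$, where $N$ is the number of independent samples. A minimal Python sketch is given below; the separation criterion (non-overlapping error bars of neighboring eigenvalues) is one common reading of the rule, and the sample size in the example is an assumption that treats all months of 1948–2007 as independent.

```python
import numpy as np

def north_rule(eigvals, n_samples):
    """Sampling errors of EOF eigenvalues after North et al. (1982).

    eigvals:   eigenvalues (or explained variances) in descending order
    n_samples: number of independent samples N (assumed known)
    Returns the error of each eigenvalue and, for each neighboring pair,
    whether their error bars do not overlap.
    """
    lam = np.asarray(eigvals, dtype=float)
    err = lam * np.sqrt(2.0 / n_samples)
    separated = (lam[:-1] - err[:-1]) > (lam[1:] + err[1:])
    return err, separated

# Explained variances of the first two modes quoted in the text (%),
# with N = 60 years x 12 months (assumes independent monthly samples)
err, separated = north_rule([38.0, 18.0], n_samples=720)
```

Serial correlation in the monthly anomalies would reduce the number of independent samples and widen the error bars accordingly.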

Fig. 7 First three EOF patterns of monthly WS10 anomaly (ms−1) for lNS domain. EOF patterns for NS domain are also shown with the black contours

Fig. 8 Explained variance (%) of the first five EOF modes of the WS10 anomaly with confidence interval (solid lines) and explained variance of the AR(1) modes projected on the data modes (dashed lines)

The first EOF mode of WS10, which explains around 38% of its variance, is hardly separated from the stochastic AR(1) model, suggesting that this mode reflects that WS10 above the area studied is highly correlated in space (consistent with Fig. 4) and does not represent any specific teleconnection pattern. The higher magnitude of the WS10 in the first EOF above the sea is due to the higher variability of the WS10 over sea compared to land.

The second mode, which shows the negative correlation of WS10 above the North Atlantic north and south of Great Britain, is statistically different from the AR(1) model. This mode explains around 18% of the WS10 variability, while the same mode would explain only around 10% of the WS10 variability under the assumption of an isotropic AR(1) process. The large separation of the explained variance between the WS10 mode and the same mode from the AR(1) process suggests that this mode represents a teleconnection pattern of the WS10 anomaly rather than just being a higher-order mode of an AR(1) process. The EOF pattern from the smaller (NS) domain is very similar, although the corresponding PCs are not highly correlated. The negative correlation between the northern and southern parts of the domain is consistent with the lower decorrelation lengths in those two areas (Fig. 4).

The third and higher modes cannot be distinguished from the EOF modes of the AR(1) process and seem to be partly determined by the domain geometry and the requirement of the orthogonality of the EOF modes. The third EOF spatial pattern explains the dipole in the WS10 anomaly between the western (west of England) and eastern (above the North Sea and the Baltic Sea) parts of the domain.

For further analysis, the EOF modes from the lNS domain are used to ensure an adequate regional representation of WS10 over the North Sea. The first three modes of WS10 anomaly over the lNS domain are related to the monthly SLP and T2M anomalies from the annual cycle. The relation between the WS10 and the SLP patterns describes the dynamical forcing of WS10, while relating the WS10 modes to T2M is an attempt to describe the baroclinic contribution to the WS10 anomalies. Even though the third EOF mode seems to represent only the orthogonality requirement of the first EOF mode of an AR(1) process, it was initially included in the following analysis. The justification for neglecting the third mode in the subsequent analysis was established with the cross-validation procedure.

For each of the first three PCs of the WS10, two maps of temporally averaged SLP and T2M anomalies are plotted after filtering days for which the PCs of WS10 are higher or lower than the standard deviation separately (Fig. 9). The filtered fields plotted in Fig. 9 are in the units of their local standard deviation, i.e., each grid point is divided by its standard deviation. For the first three PC modes of the WS10 anomalies, the SLP and T2M anomaly patterns show a bimodal structure. The sign of the extrema changes with a change in the sign of the PC, which suggests an approximately linear relationship between the PCs of WS10 and the large-scale circulation. Therefore, a large-scale index is defined as a time series of the difference between spatially averaged predictors within a defined center of activity at extrema (shown in Fig. 9) in SLP and T2M separately and is used to predict the PC of WS10 anomalies.
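The compositing step and the construction of a dipole index can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for Fig. 9: the threshold is taken as plus or minus one standard deviation of the PC, the centers of activity are given as rectangular index boxes, and no area weighting is applied. The names `composite_maps` and `dipole_index` are hypothetical.

```python
import numpy as np

def composite_maps(pc, field):
    """Mean anomaly maps for high and low values of a PC.

    pc: (time,) principal component; field: (time, ny, nx) anomalies.
    Returns the (high, low) composites in units of the local standard
    deviation. Assumption: 'high' and 'low' mean beyond +/- 1 std of the PC.
    """
    pc_std = (pc - pc.mean()) / pc.std()
    local_std = field.std(axis=0)
    high = field[pc_std > 1.0].mean(axis=0) / local_std
    low = field[pc_std < -1.0].mean(axis=0) / local_std
    return high, low

def dipole_index(field, box_pos, box_neg):
    """Difference between the box-averaged field over two centers of activity.

    box_pos and box_neg are (slice_y, slice_x) tuples selecting the boxes
    around the positive and negative extrema of the composite.
    """
    pos = field[(slice(None),) + tuple(box_pos)].mean(axis=(1, 2))
    neg = field[(slice(None),) + tuple(box_neg)].mean(axis=(1, 2))
    return pos - neg
```

If the high and low composites are close to mirror images of each other, a single linear index of this kind is a reasonable predictor of the PC.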

Fig. 9 First and second row: the mean SLP anomaly for the days of high (first row) and low (second row) first three PCs of wind speed (left to right). Third and fourth row: same as first and second row, only for T2M instead of SLP. The fields are in units of their local standard deviation

As expected, the first SLP anomaly pattern has a similar spatial structure as the NAOI, although the northern center is moved slightly eastwards. The correlation between the first SLP index and NAOI from CPC is fairly high only in the winter season, while it is low in the summer season (Table 3). The first T2M anomaly pattern appears to be almost identical to the surface temperature related to the NAOI (e.g., Trigo et al. 2002), which suggests that the T2M anomaly pattern describes a relationship between SLP and T2M anomalies but adds no additional information for predicting the PC of WS10 anomaly. The second SLP anomaly pattern is a dipole between the SLP over Great Britain/North Sea and Greenland and is not much related to the EA pattern, which is also indicated by a low correlation coefficient between EAI and the second SLP index (Table 3). The second T2M anomaly pattern is the dipole with minimum above Great Britain and maximum above the southeastern part of Europe and may be explained by the advection related to the SLP pattern. In the positive phase of the anomaly pattern, the SLP is lower over Great Britain and the cyclonic circulation advects warmer air to Eastern Europe and the colder air west of Great Britain. Therefore, the T2M anomaly maxima are at these locations. In the negative phase, the situation is approximately inverse. The third SLP anomaly pattern is a dipole between Scandinavia and the North Atlantic west of Great Britain. It is seen that the third pattern has a much lower amplitude compared to the first two patterns, implying a lower association between the third SLP pattern and the third EOF pattern of WS10. The related T2M pattern is a dipole between west of Scandinavia and central North Atlantic. The relationship between the SLP and the T2M anomaly pattern may be explained as with the second pattern.

Table 3 Correlation coefficient between NAOI (EAI) from CPC and the first (second) circulation index derived from SLP/T2M

To test the relationship between the EOF modes of WS10 and the corresponding large-scale patterns, a cross-validation is performed as follows. The large-scale indices are linearly fitted to the corresponding PCs of WS10 anomaly on four fifths of the time series, and the fitted relationship is used to predict the PCs of WS10 anomaly for the remaining one fifth. This procedure is repeated five times; each time, a different consecutive fifth of the data is omitted from the fitting procedure and predicted. From the predicted PCs of WS10 anomalies, the whole WS10 anomaly field is reconstructed, assuming that the EOF spatial patterns are constant in time. The ability of the indices to predict the WS10 anomaly is estimated by computing the correlation coefficient between the original and the predicted WS10 anomaly field. In all cases, the SLP indices were better predictors than the T2M indices (not shown). Also, using both the SLP and T2M indices at the same time in a multiple regression did not improve the correlation coefficient between the original and the predicted WS10 time series. The highest correlation coefficients were obtained when the first two PCs of WS10 were predicted and used for reconstructing the WS10 anomaly. Including the third EOF mode of WS10 decreases the correlation coefficient, because the third EOF mode does not represent significant variability of WS10 and adds mostly noise. Therefore, in the further analysis, only the first two EOF modes of WS10 and the related circulation patterns from the SLP are considered.
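The cross-validation described above can be sketched in a few lines. The sketch assumes one index per PC, a simple least-squares line, and a skill score computed as a single correlation pooled over all grid points and time steps (the text does not specify whether the correlation is pooled or computed per grid point). The function names are hypothetical.

```python
import numpy as np

def blocked_cv_predict(index, pc, n_folds=5):
    """Predict a PC from a circulation index with blocked cross-validation.

    Each fold withholds one consecutive block of the series, fits a straight
    line on the remaining data, and predicts the withheld block.
    """
    n = len(pc)
    edges = np.linspace(0, n, n_folds + 1).astype(int)
    pred = np.empty(n)
    for k in range(n_folds):
        test = np.arange(edges[k], edges[k + 1])
        train = np.setdiff1d(np.arange(n), test)
        slope, intercept = np.polyfit(index[train], pc[train], 1)
        pred[test] = slope * index[test] + intercept
    return pred

def reconstruction_skill(pred_pcs, eofs, anom):
    """Correlation between the original anomaly field (time, space) and the
    field reconstructed from predicted PCs (time, mode) and fixed EOF
    patterns (mode, space). Assumption: one correlation pooled over space
    and time."""
    recon = pred_pcs @ eofs
    return np.corrcoef(recon.ravel(), anom.ravel())[0, 1]
```

Calling `reconstruction_skill` with one, two, and three predicted PCs reproduces the kind of comparison made in the text, where adding a mode is only justified if the skill increases.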

The PCs of WS10 and the large-scale circulation indices of SLP and T2M as well as NAOI are shown in Fig. 10. The SLP and T2M indices and NAOI are linearly fitted to the PCs of WS10. In the winter, the PCs of WS10 are more variable than in the summer. The first PC of WS10 over the NCR period shows a strong interannual variability with an increasing trend. The increasing trend is most pronounced in winter, while only a slight increase is observed in the summer. The positive trend is consistent with the NAOI trend (Marshall et al. 2001). In the last decade, a relatively strong decrease of the first PC of WS10 anomaly is observed, almost down to its mean value over the NCR period. It is not clear whether this decrease is due to a change in trend or is a manifestation of the interannual variability. The first SLP and T2M indices as well as NAOI closely follow the first winter PC of WS10 on interannual timescales. There is some disagreement between the time series at the beginning of the period (from 1950 to 1960). The reason for this may be the lower quality of WS10 from NCR before 1960 (Kistler et al. 2001). In the summer, the first SLP index captures the interannual variability of the first PC of WS10 anomaly relatively well, while the NAOI and the T2M index are not able to describe it.

Fig. 10 Time series of the 5-year running mean PCs of WS10 anomalies (solid black), linearly fitted corresponding SLP index (dashed black), T2M index (solid gray), and NAOI (dashed gray). Upper panel first PC, lower panel second PC. Left panels for winter, right for summer. All data except NAOI are from NCR

The interannual variability of the second PC of WS10 is poorly described by either the SLP or the T2M indices. The agreement is especially poor in the winter, where even the trends do not match: the second SLP and T2M indices show a decrease that is not observed in the PC of WS10 anomaly. During the summer period, the interannual variability of the second SLP index follows the second PC of WS10 anomaly well. However, since only a small improvement in the correlation coefficient of WS10 is achieved when the second PC of WS10 anomaly is related to the indices, it may be inferred that predicting only the first PC of WS10 anomaly is sufficient.

The EOF patterns of WS10 are not very sensitive to the reanalysis dataset used. The correlation coefficients between corresponding PCs of WS10 from NCR and the other two reanalysis datasets are very high (up to 0.99) but decrease with increasing order of the PC (Table 4). The correlation coefficient between PCs of WS10 anomalies is slightly higher in the recent period than in the past period (in brackets).

Table 4 Correlation coefficient between PCs of WS10 anomalies from NCR and other datasets

The correlation coefficients between PCs of WS10 anomaly and the corresponding derived SLP circulation indices are shown in Table 5. In the winter, the surface winds are much better described by the large-scale circulation patterns than in the summer, which is indicated by high correlation coefficients between the PCs of WS10 and the circulation indices. In this season, the correlation coefficients between PCs of WS10 and circulation indices do not depend significantly on the reanalysis dataset used. The correlation coefficients in the recent period are considerably higher than in the past period for both NCR and ERA40 data. The lower correlation coefficients in the past period may be due either to the better quality of WS10 data in the recent period, with the assimilation of the satellite-derived near-surface wind speeds, or to a change in the climate dynamics within the last decades. Since the PCs of WS10 anomalies from different reanalyses differ more in the past period than in the recent period, we argue that at least part of the lower correlation between PCs of WS10 anomalies and circulation patterns in the past period may be related to the lower quality of the reanalysis data in that period.

Table 5 Correlation coefficients between PCs of WS10 anomaly and the circulation patterns for the recent period (1980–2000, without brackets) and for the past period (1960–1980, in brackets)

3.3 Extreme wind conditions

In this paper, the extreme wind speeds are defined, as for example in Pryor and Barthelmie (2003), by predefined high percentiles of the daily WS10 distribution. The WS10 percentiles are calculated for each winter season. The winter season was selected because the highest wind speeds, which occur during storms, are predominant in this season. For example, the 90th percentile of WS10 is the value of WS10 that is exceeded on 10% of the days during the winter season of a given year. The extreme WS10 and its relation to the circulation patterns can be assessed from the high PCs of WS10 or the high intensity of the circulation indices, as shown in the previous section. However, EOFs are not optimally designed to capture extreme values since the extreme values occur rarely and thus do not necessarily contribute much to the total variance of the data, which is the criterion for extracting EOF patterns. Moreover, there is also a scale mismatch between the circulation patterns that describe the variability of WS10 and the typical duration of extreme WS10. The typical duration of extreme wind speeds by definition cannot exceed a few days, while the first EOF pattern of WS10 is related to the NAO-like pattern, which is known to describe well the monthly to interdecadal variability of the SLP (e.g., Hurrell and van Loon 1997). More scale-appropriate circulation patterns describing the extreme WS10 may be obtained, for example, from circulation types such as those in the GWL catalog, which usually persist for a few days (e.g., James 2007).
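The percentile definition can be made concrete with a short sketch. The months that constitute the winter season are not stated here, so the example assumes December to February and assigns each December to the winter of the following year; both choices, and the function name `winter_percentile`, are assumptions made for illustration.

```python
import numpy as np

def winter_percentile(ws, year, month, q=90.0):
    """q-th percentile of daily wind speed for each winter season.

    ws, year, month: 1-D arrays of equal length with one entry per day.
    Assumption: winter is December-February, and December is counted
    towards the winter of the following calendar year.
    """
    season_year = np.where(month == 12, year + 1, year)
    is_winter = np.isin(month, (12, 1, 2))
    years = np.unique(season_year[is_winter])
    values = np.array([
        np.percentile(ws[is_winter & (season_year == y)], q) for y in years
    ])
    return years, values
```

For q = 90, the returned value for a given winter is exceeded on roughly 10% of the days of that winter, which matches the definition in the text.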

Therefore, to find a relationship between the extreme WS10 over the North Sea and the corresponding circulation patterns, a slightly different approach is used, as described below. For each day, a maximum WS10 (MWS10) over the German Bight region is first calculated. The circulation pattern related to the extreme WS10 is then found in the following manner. The map of time-averaged SLP anomalies for the days when the MWS10 exceeds its mean value by more than twice its standard deviation is plotted in Fig. 11. It is observed that the high WS10 is related to an anomalously low SLP over Scandinavia. This SLP pattern is related to the Cyclonic-Westerly and Cyclonic-Northwesterly GWL, which have already been identified and related to the wind storms over the North Sea. The resultant SLP pattern is very robust with respect to the choice of the mean or the maximum WS10 over the German Bight and to the reanalysis dataset used. Also, almost the same pattern is obtained by using the more complicated method of extreme associated functions described in Panja and Selten (2007), which finds the combination of EOF patterns that is optimally related to the extreme WS10. The SLP index related to the extreme WS10 can be derived as the negative of the spatially averaged SLP anomaly over the area of the lowest SLP anomaly (lat. 55–70° N, lon. 10–25° E) as in Fig. 11. The index is also constructed from the SLP measurements at stations located in the southern part of the Scandinavian peninsula. The advantage of deriving the index from measurements is that the pressure-measuring techniques did not change significantly in the last century, and thus the measurements cause fewer problems regarding temporal inconsistencies than the SLP from the reanalysis data. Here, the measured index is constructed as the negative of the mean daily SLP measurement at the synoptic stations in Göteborg (Sweden) and Oksoy (Norway). The results do not change if SLP from only one of the stations is taken instead of the average of both stations.
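A minimal sketch of the compositing and of the box-averaged SLP index is given below. The latitude and longitude limits are those quoted in the text; the cosine-of-latitude area weighting is an assumption added here (the text only speaks of a spatial average), and the function names are hypothetical.

```python
import numpy as np

def extreme_composite(mws, slp_anom, n_std=2.0):
    """Mean SLP anomaly map for the days on which the daily maximum wind
    speed exceeds its mean by more than n_std standard deviations.

    mws: (time,) daily maximum wind speed; slp_anom: (time, nlat, nlon).
    """
    threshold = mws.mean() + n_std * mws.std()
    return slp_anom[mws > threshold].mean(axis=0)

def box_slp_index(slp_anom, lat, lon,
                  lat_range=(55.0, 70.0), lon_range=(10.0, 25.0)):
    """Negative of the area-averaged SLP anomaly over a lat/lon box.

    Assumption: grid points are weighted by cos(latitude).
    """
    in_lat = (lat >= lat_range[0]) & (lat <= lat_range[1])
    in_lon = (lon >= lon_range[0]) & (lon <= lon_range[1])
    box = slp_anom[:, in_lat][:, :, in_lon]
    w = np.cos(np.deg2rad(lat[in_lat]))[None, :, None]
    return -(box * w).sum(axis=(1, 2)) / (w.sum() * in_lon.sum())
```

With this sign convention, a deep low over Scandinavia gives a large positive index, so high percentiles of the index correspond to the synoptic situations associated with high wind speeds.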

Fig. 11 The SLP anomaly [Pa] from NCR related to high WS over the NS domain

The relationship between the SLP index and the extreme WS10 is almost linear (not shown); the high percentiles of the SLP index are therefore linearly fitted to the high percentiles of the MWS10. For example, the annual time series of the 90th percentile of MWS10 and of the 90th percentile of the indices (from both reanalysis and measurements), linearly fitted to the 90th percentile of MWS10, are shown in Fig. 12. The correlation coefficient between the time series of the 90th percentiles of MWS10 and the 90th percentiles of the related indices is relatively high (exceeding 0.7 for all datasets). However, the time series of the 90th percentiles of MWS10 and the related indices show consistently different interannual variability. The time series of the 90th percentiles of MWS10 are also sensitive to the reanalysis dataset. The 90th percentile of the MWS10 from the NCR shows almost no interannual variability until the middle of the 1970s and subsequently a strong increase, which reaches a maximum in the mid-1990s, and a decrease thereafter with a slight increase towards the end. The results from NCR2 are similar to NCR for the common data period. The 90th percentile of MWS10 from ERA40 decreases slightly until the beginning of the 1970s and then shows a sudden jump over a few years, followed by a constant period and an increase until the mid-1990s with a decrease thereafter. The 90th percentiles of the SLP index from the reanalyses agree in being relatively constant until the mid-1980s, followed by a slight increase until the mid-1990s and a decrease thereafter. The indices show significantly smaller trends than the WS10. The index derived from the measured SLP is remarkably constant in time and does not show an increase from the late 1970s until the mid-1990s.

Fig. 12 The 90th percentile of winter WS10 (ms−1; black lines) and related circulation indices from reanalysis models (red lines) and from measured SLP (green lines). Thin lines represent yearly data, the thick lines 9-year moving average. Left for NCR, middle for NCR2, and right for ERA40 data

The disagreement in the interannual variability between the high percentiles of MWS10 and the related indices fitted to the MWS10 is also observed for other high percentiles. In Fig. 13, the linear trends are shown for the recent and the past period. The high percentile indices derived from the measurements show no significant trend. The index derived from the reanalysis SLP shows a slight trend in the recent period, which in most cases is not significant, while the high percentiles of WS10 show significant trends towards higher values. We investigated whether the trend in MWS10 may have resulted from a change in thermal forcing. The T2M and the thickness between 700 and 1,000 hPa were used to calculate the extreme indices, in the same way as with the SLP, and were related to the MWS10. Neither of these two indices was able to explain the interannual variability of MWS10.
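The trend comparison can be illustrated with an ordinary least-squares fit of the yearly percentile series over a chosen period. How significance was assessed is not stated here, so the t-statistic below, which assumes independent residuals and ignores serial correlation, is only one possible choice; the function name `linear_trend` is hypothetical.

```python
import numpy as np

def linear_trend(years, values):
    """Least-squares linear trend of a yearly series.

    Returns the slope (units of `values` per year) and its t-statistic.
    Assumption: residuals are independent; serial correlation is ignored.
    """
    x = np.asarray(years, dtype=float)
    y = np.asarray(values, dtype=float)
    xc = x - x.mean()                    # centering improves conditioning
    slope, intercept = np.polyfit(xc, y, 1)
    resid = y - (slope * xc + intercept)
    n = len(x)
    se = np.sqrt(resid @ resid / (n - 2) / (xc ** 2).sum())
    return slope, slope / se
```

Applying the function separately to the past (1960–1980) and recent (1980–2000) periods, for both the wind speed percentiles and the fitted indices, yields the kind of comparison summarized in Fig. 13.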

Fig. 13 Linear trends of the high percentiles (from 70 to 90) of winter MWS (solid lines) and high percentiles of related circulation patterns (dashed lines) for past (left) and recent (right) period. Green line is for measured SLP, black for NCR, blue for NCR2, and red for ERA40

4 Discussions and conclusions

The representation of daily mean WS10 over the North Sea between the NCR, NCR2, and ERA40 reanalysis data is compared with the main goal to assess the reliability and deficiencies of the datasets for applications such as the long-term wind resource assessment. The WS10 over the North Sea from different reanalysis sources is highly correlated, as shown by the high linear correlation coefficient of the data as well as the correlation coefficient between the first three PCs from different reanalysis datasets (Table 4). The agreement between PCs of WS10 is slightly better in the recent period (1980–2000) compared to the past period (1960–1980) which can partially be attributed to the improved WS10 dataset in the recent period probably due to assimilation of satellite-measured wind data.

Some important differences in the WS10, such as its mean magnitude and the linear trends of time-averaged values, are model dependent, which can be at least partially attributed to the choice of parameterizations of boundary layer, surface layer, and land surface processes and to the model resolutions. The WS10 from reanalysis is interpolated from the first model level to 10-m height using surface layer parameterizations and is also sensitive to the vertical model resolution. As a result, in the recent period, the mean magnitude of NCR WS10 is around 2 ms−1 lower than that of NCR2 WS10. Also, the linear trends of annual mean WS10 are significantly different among the three reanalysis datasets. The fact that the trends and the mean values of WS700hPa from all models are nearly the same further increases the confidence in this reasoning.

Next, the imprint of local circulation patterns on the WS10 climatology over the NS domain resolved in the high-resolution model (WRF) simulations is investigated. Some of the processes, such as the land–sea breezes, are to some degree smoothed by averaging the WS10 over the daily time step. It is expected that the daily WS10 is well represented in the low-resolution dataset since the decorrelation length of daily WS10 is around 800 km, which is considerably larger than the horizontal resolution of the reanalysis models. The only exceptions are at the coast of the Weser-Ems area and Holland, which probably deviate due to the influence of the local circulation patterns. The 3-year (2003–2006) period covered by the high-resolution dataset (WRF3) is sufficiently long to warrant a closer examination of the influence of local processes on the WS10. Comparing the NCR reanalysis data to the high-resolution WRF3, we again confirmed that the variability of daily WS10 over the North Sea is mostly well described by the low-resolution reanalysis dataset. The highest deviations between NCR and WRF3 data were found mostly along the coast, which indicates the influence of the local circulation patterns that cannot be resolved in the low-resolution dataset. As with the different reanalysis datasets, the mean magnitude of the WS10 from the NCR and high-resolution datasets differs. The exact location of the coastline is not well resolved in the NCR, resulting in larger deviations between the NCR and high-resolution WS10 along the coast.

The dominant modes of the mean monthly WS10 described by the EOF modes are related to the SLP and T2M patterns. The SLP and T2M patterns were obtained empirically to represent the forcing on WS10 such that the circulation indices (corresponding time series) are well correlated to the PCs of WS10. The circulation indices were fitted to the corresponding PCs of WS10 in order to hindcast the WS10. There are alternative approaches for searching for the dominant modes of WS10 variability and their relationship to the circulation patterns. For example, Fil and Dubus (2005) preferred cluster analysis over EOFs of the northern European SLP in order to isolate the climate regimes since the cluster analysis does not require an assumption of linearity. However, for the analysis presented here, the cluster analysis is not well suited since it would be harder to obtain a continuous relationship between the large-scale indices and the clusters of WS10 anomaly. The other alternative would be to use one of the multivariate methods for finding the relationship, such as canonical correlation analysis (CCA; von Storch and Zwiers 1999) of the EOF prefiltered fields. However, the disadvantage of the CCA is that its patterns do not necessarily represent a large proportion of the data variability. The T2M pattern does not bring a significant improvement in explaining the WS10 variability. Most of the interannual variability of the WS10 is explained by the first EOF pattern, which describes a coherent increase of WS over the North Sea region and is related to an SLP pattern similar to the NAO. However, the northern center of the pattern found in this study is shifted eastwards and is very similar to the pattern found by Kaas et al. (1996). In the winter, the correlation coefficient between the first SLP index and NAOI is high. The first SLP index is highly correlated to the first PC of WS10 only in the winter. The higher correlation between the first PC of WS10 and the SLP index in the winter reflects that the empirically found pattern is better correlated to the WS10 variability than the NAOI. The second SLP pattern is the SLP anomaly between Scandinavia and Greenland and is related to the WS10 dipole south and north of Great Britain. The index from the second circulation pattern is well related to the WS10 in the recent period, whereas the relation was rather poor in the past period. Again, it is not self-evident whether this indicates the better quality of the WS10 data in the recent period compared to the past period or reflects real changes between the recent and past periods. The second SLP and WS10 patterns seem to reflect the monthly mean position of the storm tracks and increase toward the end of the reanalysis period, which is consistent with the observed northward shift of storm tracks in the recent period (Schneidereit et al. 2005; Wang et al. 2006).

The extreme wind speeds are described as the WS10 exceeding a predefined high percentile (e.g., 90th) of daily WS10 for each year in the winter season separately. The extreme circulation patterns related to the high daily WS10 were found by filtering the SLP anomalies for the days when WS10 exceeds a predefined limit. A search for the SLP patterns related to the extreme wind speeds based on EOFs may not be suitable since EOF patterns are not designed to find rare extreme values. The extreme wind speeds in the German Bight are related to the low SLP over Scandinavia. The time series of the extreme WS10 are sensitive to the reanalysis dataset used in the analysis. In the past period, there are important differences between the extreme WS10 from different reanalyses, while, in the recent period, the correspondence is better. The circulation indices, defined as the same percentiles of the negative SLP over Scandinavia and regressed on the extreme percentiles of the WS10, show considerably less interannual variability; in particular, the increase from the 1960s to the mid-1990s is underestimated. A similar SLP index was constructed from the daily SLP measurements at two stations in the south of the Scandinavian peninsula (Göteborg and Oksoy). The circulation index from these two stations shows very little interannual variability. These results show a high sensitivity to the methodology used for estimating extreme wind speeds. It has been noted in the literature that the time series of extreme near-surface wind speeds are very different when they are estimated from the wind speeds and from proxy data like the SLP. We showed here that, even if data from the same reanalysis are used, which are expected to be highly consistent, deviations in the interannual variability of extreme wind speeds exist. The extreme wind speeds are probably only partially captured in the SLP patterns.

The analysis of different resolution datasets was undertaken in this study to comprehensively describe the current state of surface wind climate (WS10) over the North Sea. In particular, a methodology to detect and examine a regional indicator of climate change influencing primarily the surface marine wind field is developed and applied. This methodology is applied in a subsequent study to compare and investigate the surface wind climate states in the regional climate hindcasts and future climate scenarios.