The Role of Atmospheric Stability and Turbulence in Offshore Wind-Farm Wakes in the German Bight

Airborne meteorological in situ measurements as well as stationary measurements at the offshore masts FINO1 and FINO3 in the German Bight are evaluated in order to examine the hypothesis that the wake dissipation downstream of large offshore wind farms depends on atmospheric stability. A long-term study of the mast data for the years 2016 and 2017 demonstrates a clear dependence of stability on the wind direction. Stable conditions are predominantly expected during southerly winds coming from the land. The analysis of various stability and turbulence criteria shows that the lapse rate is the most robust parameter for stability classification in the German Bight, but further implies that stability depends on the measurement height. A near-surface (0 to 30 m), predominantly convective, layer is present and more stable conditions are found aloft (55 to 95 m). Combining the stability data with the airborne measurements of the offshore wind-farm wakes reveals a trend towards longer wake lengths and a larger initial wind-speed deficit downwind of a wind farm with stronger thermal stability. However, the correlations between the stability criteria and the wake length downstream of the four investigated wind farms, Godewind, Amrumbank West, Meerwind Süd/Ost, and Nordsee Ost, show large variance. It is assumed that the observed scatter is due to the influence of the wind-farm architecture and temperature inversions around hub height. These, however, are crucial for the classification of stability and illustrate the complexity of a clear stability metric.

from the FINO1 and FINO3 masts in order to investigate systematically the relationship between stability and wake.
Three major questions are addressed. First, in Sect. 4.1, a long-term analysis of the stability in the German Bight is carried out using stationary FINO data. Second, in Sect. 4.2, stability measures from FINO data and aircraft measurements are compared, and the strengths and weaknesses of the parameters and measurement techniques are presented. We discuss which parameters are most representative of stability in the German Bight, aiming to identify which measure is most useful for describing stability at an offshore wind farm. Third, Sect. 4.3 covers the analysis of the stability and its influence on the wake extension. Stability is related to the wake length and the initial wind-speed deficit to test the hypothesis of a correlation between stability and wake intensity.

Stability and Turbulence Measures
Atmospheric stability describes the potential for vertical motion of air. A layer is considered stable when vertical motion is suppressed and unstable or convective when vertical motion is enhanced (Stull 1988). The two concepts of stability are static stability, which takes solely buoyancy into account, and dynamic stability, which also considers shear-generated turbulence.
Static stability is measured by the gradient of the virtual potential temperature $\theta_v$ with the height $z$, named the lapse rate γ. For the lapse rate, no turbulence measurements are necessary since only mean temperature differences over a height difference are required:
$$\gamma = \frac{\Delta\overline{\theta_v}}{\Delta z}. \tag{1}$$
The lapse rate is negative during convective conditions, zero for neutral, and positive for stable cases. Dynamic stability is commonly described by the bulk Richardson number $Ri_B$, which is based on mean measures only. It is calculated according to
$$Ri_B = \frac{g}{\overline{\theta_v}}\,\frac{\Delta\overline{\theta_v}\,\Delta z}{\left(\Delta\overline{U}\right)^2}, \tag{2}$$
where $g$ is the acceleration due to gravity, and $U$ is the wind speed. The overbar represents a mean value. Stable conditions are expected when $Ri_B > 1$ and convective or turbulent conditions when $Ri_B < 0.25$ (Stull 1988), see Table 1. However, those thresholds are valid for local gradients only and deviate with larger height intervals $\Delta z$ (e.g., Stull 1988; Balsley et al. 2008). An alternative method for dynamic stability that does require turbulence measurements is described by the Obukhov length
$$L = -\frac{\overline{\theta_v}\,u_*^3}{\kappa\,g\,\overline{w'\theta_v'}}, \tag{3}$$
with
$$u_* = \left(\overline{u'w'}^2 + \overline{v'w'}^2\right)^{1/4}, \tag{4}$$
where κ ≈ 0.4 is the von Kármán constant (Stull 1988), and $u_*$ is the friction velocity, which can be derived from the turbulent parts of the wind components $u$, $v$, and $w$, marked by a prime (Eq. 4); $u$ is defined northwards, $v$ eastwards, and $w$ is the vertical component, positive upwards. The flux $\overline{w'\theta_v'}$ describes the turbulent kinematic heat flux. Since the airborne measurements used for the derivation of the heat flux and $u_*$ were performed at hub height, in our dataset, $L$ is considered as a local metric at this height. Often the Monin-Obukhov stability parameter
$$\zeta = \frac{z}{L} \tag{5}$$
with the measurement height $z$ is evaluated, with negative ζ indicating convective conditions, whereas positive ζ defines stable conditions. However, the calculation requires high-frequency measurements of the turbulent wind components and virtual potential temperature differences. Moreover, there are several parameters that quantify turbulence.
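The stability measures introduced in this section can be sketched in a few lines of Python. This is a minimal illustration only, assuming the standard definitions after Stull (1988); the function names, the use of the layer-mean $\theta_v$ in the bulk Richardson number, and all numerical values are hypothetical and are not taken from the FINO or WIPAFF data.

```python
G = 9.81      # acceleration due to gravity in m s^-2
KAPPA = 0.4   # von Karman constant


def lapse_rate(theta_v_low, theta_v_high, z_low, z_high):
    """Lapse rate gamma = d(theta_v)/dz in K m^-1 between two heights."""
    return (theta_v_high - theta_v_low) / (z_high - z_low)


def bulk_richardson(theta_v_low, theta_v_high, u_low, u_high, z_low, z_high):
    """Bulk Richardson number from mean values at two heights."""
    d_theta = theta_v_high - theta_v_low
    d_u = u_high - u_low
    d_z = z_high - z_low
    # Assumption: the reference temperature is the mean of the two levels.
    theta_v_mean = 0.5 * (theta_v_low + theta_v_high)
    return (G / theta_v_mean) * d_theta * d_z / d_u ** 2


def friction_velocity(uw, vw):
    """Friction velocity from the kinematic momentum fluxes u'w' and v'w'."""
    return (uw ** 2 + vw ** 2) ** 0.25


def obukhov_length(theta_v_mean, u_star, heat_flux):
    """Obukhov length in m; heat_flux is the kinematic flux w'theta_v'."""
    return -theta_v_mean * u_star ** 3 / (KAPPA * G * heat_flux)


# Hypothetical, weakly stable example between 50 and 100 m.
gamma = lapse_rate(285.0, 285.5, 50.0, 100.0)
ri_b = bulk_richardson(285.0, 285.5, 8.0, 10.0, 50.0, 100.0)
u_star = friction_velocity(-0.04, 0.0)
obukhov = obukhov_length(285.25, u_star, -0.01)  # downward heat flux
zeta = 100.0 / obukhov                           # evaluated at z = 100 m
print(gamma, ri_b, u_star, obukhov, zeta)
```

In this made-up case, γ and ζ are both positive, so the static and the dynamic criterion agree on stable stratification; the sections below show that this is not always the case in the measurements.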
Most prominent is the turbulence kinetic energy ($e$) per unit mass
$$e = \frac{1}{2}\left(\overline{u'^2} + \overline{v'^2} + \overline{w'^2}\right). \tag{6}$$
Offshore, Archer et al. (2016) observed that $e$ varies between 0.1 and 10 m² s⁻² and increases with the wind speed at 20-m height. Lidar observations by Bodini et al. (2019) reveal that the dissipation rate, which can be derived from the value of $e$, is on average smaller compared with onshore conditions. However, a relationship between increasing atmospheric stability and decreasing $e$, as determined in Wharton and Lundquist (2012) over land, has not been documented in the literature as far as we know.
In some studies the variance of the vertical velocity component $\overline{w'w'}$ alone is also considered as a turbulence criterion. Türk and Emeis (2010) observed that this quantity depends mainly on wind speed and is only influenced to a small extent by atmospheric stability.
The turbulence intensity $I$ [%] is defined as the standard deviation of the wind speed, $\sigma_U$, normalized by the mean wind speed,
$$I = \frac{\sigma_U}{\overline{U}}. \tag{7}$$
Offshore, the value of $I$ is observed around 4 to 5 % during stable conditions, and increases to about 7 % as the atmosphere becomes more unstable (Hansen et al. 2012). However, a significant correlation with wind speed was also identified, and a dependence on the wind direction indicates that the thresholds are site specific. An overview of the stability classification for the prevalent criteria is provided in Table 1.
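As a rough illustration of how the turbulence measures of this section follow from high-frequency wind data, the sketch below computes $e$, $\overline{w'w'}$, and $I$ from short time series. It assumes that fluctuations are obtained by removing the simple arithmetic mean of the averaging window (no detrending or filtering); the function names and the sample values are hypothetical.

```python
from math import hypot
from statistics import fmean, pstdev


def covariance(a, b):
    """Mean product of the fluctuations a' and b' about their window means."""
    mean_a, mean_b = fmean(a), fmean(b)
    return fmean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])


def turbulence_kinetic_energy(u, v, w):
    """Turbulence kinetic energy per unit mass in m^2 s^-2."""
    return 0.5 * (covariance(u, u) + covariance(v, v) + covariance(w, w))


def turbulence_intensity(u, v):
    """Turbulence intensity in percent from the horizontal wind speed."""
    speed = [hypot(ui, vi) for ui, vi in zip(u, v)]
    return 100.0 * pstdev(speed) / fmean(speed)


# Hypothetical samples in m s^-1 (a real 10-min window at 20 Hz has 12000).
u = [9.0, 11.0, 9.0, 11.0]
v = [0.0, 0.0, 0.0, 0.0]
w = [0.5, -0.5, 0.5, -0.5]

tke = turbulence_kinetic_energy(u, v, w)
var_w = covariance(w, w)
intensity = turbulence_intensity(u, v)
print(tke, var_w, intensity)
```

The same `covariance` helper would also yield the kinematic heat flux if `w` were paired with a virtual potential temperature series.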

Data Acquisition
The study is based on two datasets. First, meteorological long-term data recorded by the permanent measurement masts FINO1 and FINO3 are used (Fig. 1). The masts are equipped with various meteorological sensors for recording the wind speed, temperature, and humidity at heights of 30, 40, 50, 70, and 90 m for FINO1 (Table 2) and at 29, 55, and 95 m for FINO3 (Table 2). Routine meteorological data are stored as 10-min averages, and the sea-surface temperature is stored every 30 min. The virtual potential temperature $\theta_v$ is calculated for all the sensor heights. To ensure an accurate lapse-rate calculation, the temperature sensors at both masts have undergone a relative calibration process (Frühmann 2016). The quantities of pressure $p$, relative humidity $RH$, wind speed, and direction are linearly interpolated to the height of interest if not available. Moreover, $RH$ is assumed to be 100 % at the water surface (0 m). For $\theta_v$ at the water surface, the temperature is derived from the water temperature measurement of the buoy, similar to the method by Foreman et al. (2017), who used the sea surface as a proxy for the lapse-rate calculation. Data are averaged over a 10-min period, which is identical to the airborne measurements. Sonic anemometers at FINO1 at heights of 40 and 80 m provide 20-Hz data of the wind velocity components as well as the sonic temperature $T_s$. More details can be found in Foreman et al. (2017).
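The text does not state which formulation is used to obtain $\theta_v$ from temperature, pressure, and relative humidity. The following sketch therefore relies on common textbook approximations (a Magnus-type saturation vapour pressure, a reference pressure of 1000 hPa, and the factor 0.61 for the moisture correction); all of these, as well as the function names and sample values, are assumptions made for illustration.

```python
from math import exp


def saturation_vapour_pressure_hpa(t_kelvin):
    """Magnus-type approximation over water (assumed coefficients)."""
    t_celsius = t_kelvin - 273.15
    return 6.112 * exp(17.62 * t_celsius / (243.12 + t_celsius))


def virtual_potential_temperature(t_kelvin, p_hpa, rh_percent):
    """Virtual potential temperature in K from T (K), p (hPa), and RH (%)."""
    e = rh_percent / 100.0 * saturation_vapour_pressure_hpa(t_kelvin)
    mixing_ratio = 0.622 * e / (p_hpa - e)           # kg kg^-1
    theta = t_kelvin * (1000.0 / p_hpa) ** 0.286     # potential temperature
    return theta * (1.0 + 0.61 * mixing_ratio)


def interpolate_linear(z, z0, value0, z1, value1):
    """Linear interpolation of a quantity to the height z of interest."""
    return value0 + (value1 - value0) * (z - z0) / (z1 - z0)


# Hypothetical example: saturated air at the water surface (RH = 100 %).
theta_v_surface = virtual_potential_temperature(288.15, 1000.0, 100.0)
# Hypothetical relative humidity interpolated from 40 and 70 m to 50 m.
rh_50 = interpolate_linear(50.0, 40.0, 80.0, 70.0, 74.0)
print(theta_v_surface, rh_50)
```

Because moist air is less dense than dry air at the same temperature, the saturated surface value is roughly 2 K higher than the dry potential temperature in this example, which illustrates why the humidity assumption at the surface matters for the lapse rate over the lowest interval.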
Also analyzed are airborne meteorological measurements conducted in the research project WIPAFF (WInd PArk Far Field; Platis et al. 2020), comprising four field campaigns from September 2016 to October 2017 with a total of 41 measurement flights. The measurements were made from the research aircraft Dornier DO-128 operated by the Technische Universität Braunschweig. The aircraft is equipped with sensors for temperature, humidity, pressure, and wind components, sampling at a frequency of 100 Hz. Details about the measurement devices can be found in Corsmeier et al. (2001), Platis et al. (2018), and Lampert et al. (2020). The airspeed of the research aircraft is 66 m s⁻¹. The airborne flight data are publicly available.
All flights were conducted in the German Bight, in the vicinity of an operating offshore wind farm: either Godewind (GO) or the Amrumbank West cluster, comprising the wind farms Amrumbank West (AW), Nordsee Ost (NO), and Meerwind Süd-Ost (MSO). A typical flight pattern (Fig. 1) comprises the following features: upwind of the wind farm, at a distance of approximately 5 km from the turbines, there is one leg perpendicular to the flow (red) to measure the undisturbed flow. Downwind of the wind farm, the legs are arranged in a meander pattern. Whereas the first leg is about 1 km from the farm, the legs farther downstream follow with a spacing of 10 km. The legs usually have a length of 40 km, which corresponds to a flight time of 10 min.
The pattern is flown at a height of about 100 m, adjusted to the turbine hub height. Furthermore, vertical profiles between 30 and 1000 m are flown in the vicinity of the wind farm (numbered in Fig. 1) and used for the vertical probing of the lower atmosphere.

Stability Characterization
Stability is analyzed according to the parameters presented in Sect. 2. The gradient-based criteria γ and $Ri_B$ are calculated over a number of different height intervals.
For the long-term analysis of γ at the FINO measurement masts, the gradient is calculated from the measurements at two heights each. For the FINO3 mast, the gradients are 0-95 m, 29-95 m, and 55-95 m, and for the FINO1 mast 0-90 m, 33-90 m, and 50-90 m; this is to calculate the stability in relation to the wind-turbine hub height at about 100 m. For the aircraft measurements the gradients are 50-100 m, 30-150 m, and 60-120 m, extracted from the vertical profiles. For each case, the gradients of $\theta_v$ and $U$ are calculated using the vertical flight sections for the aircraft measurements and a linear regression within the height range for the FINO mast data. If the vertical flight section does not cover the entire altitude range, the flight is ignored. The vertical probing by the aircraft took about 10-15 min. The stability parameters observed at the FINO masts are evaluated at the same time as the flight profiles, using the 10-min time-average mast measurement closest to the flight time, so that tower and aircraft measurements always overlap. For one measurement flight, the lapse rate γ is calculated using the arithmetic mean, and $Ri_B$ using the median because of outliers. For the FINO masts, the regression is calculated over all available measurement heights in the height interval.
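The regression-based lapse rate for the mast data can be sketched as an ordinary least-squares slope of $\theta_v$ against height, restricted to the sensors inside the chosen interval. The snippet is an illustration under that assumption; the helper names and the profile values are hypothetical. It also shows why a median is less sensitive to outliers than an arithmetic mean when several profile values are combined.

```python
from statistics import mean, median


def regression_slope(z, values):
    """Least-squares slope of values against height z."""
    if len(z) < 2:
        raise ValueError("at least two heights are required")
    z_mean = sum(z) / len(z)
    v_mean = sum(values) / len(values)
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_zv = sum((zi - z_mean) * (vi - v_mean) for zi, vi in zip(z, values))
    return s_zv / s_zz


def lapse_rate_in_interval(heights, theta_v, z_min, z_max):
    """Lapse rate in K m^-1 from all sensors with z_min <= z <= z_max."""
    pairs = [(z, t) for z, t in zip(heights, theta_v) if z_min <= z <= z_max]
    return regression_slope([z for z, _ in pairs], [t for _, t in pairs])


# Hypothetical mast profile: sensor heights in m and theta_v in K.
heights = [30.0, 40.0, 50.0, 70.0, 90.0]
theta_v = [285.0 + 0.005 * z for z in heights]
gamma_50_90 = lapse_rate_in_interval(heights, theta_v, 50.0, 90.0)

# Hypothetical Richardson numbers from three profiles, one being an outlier.
ri_values = [0.1, 0.2, 45.0]
print(gamma_50_90, mean(ri_values), median(ri_values))
```

With the single outlier in `ri_values`, the arithmetic mean is dominated by the value 45, whereas the median remains at 0.2.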
The parameters ζ, $I$, $e$, $\overline{w'\theta_v'}$, and $\overline{w'w'}$ are calculated as described in Sect. 2 using high-frequency data of the first undisturbed flight leg and over a time span of 10 min at FINO1 at a height of 80 m. Due to distortion from the measurement tower as well as the adjacent wind farm Alpha Ventus, FINO1 data are only evaluated for a wind direction φ > 180° (Foreman

Wake Characterization
For processing the wake measurements, the flight pattern is projected to Cartesian coordinates with the flow coming from negative $x$ (Fig. 2) and its origin at the farthest downstream-located wind turbine of the farm. We use the method introduced by Platis et al. (2020). The wake region is determined manually by choosing a left and a right boundary of the region of significantly reduced wind speed, and is extended linearly along the mean wind direction, as visualized for flight 7 in Fig. 2. Moreover, this example shows that in some cases the background wind speed is inhomogeneous (higher wind speeds left of the wake in the direction of flow). Such inhomogeneities are mainly attributed to mesoscale wind patterns. Their effect is minimized by averaging over an adequately wide undisturbed area on both sides of the wake, which is why the upstream leg is not used as a reference. Therefore, the undisturbed zone is defined left and right of the wake using a distance of 4 km as a buffer zone between the wake and the undisturbed area and a width of 10 km on both sides. However, these default distances are adapted if required, for example due to wind directions not perpendicular to the flight pattern, asymmetric patterns, or strong spatial inhomogeneities. In case distinct wakes build up behind the different wind farms of the cluster, several wake areas are defined.
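The zone definition on a single flight leg can be illustrated as follows. The sketch assumes that the samples of a leg have already been projected onto a cross-wind coordinate $y$ (in km) and that the manually chosen wake boundaries are given; the function name, the coordinate convention, and the synthetic leg are hypothetical, and only the default buffer of 4 km and width of 10 km are taken from the text.

```python
from statistics import fmean


def zone_means(y_km, speed, wake_left, wake_right,
               buffer_km=4.0, width_km=10.0):
    """Mean wind speed in the wake zone and in the undisturbed zones.

    The undisturbed zones lie on both sides of the wake, separated from it
    by buffer_km and extending over width_km each.
    """
    wake, free = [], []
    for y, s in zip(y_km, speed):
        if wake_left <= y <= wake_right:
            wake.append(s)
        elif wake_left - buffer_km - width_km <= y <= wake_left - buffer_km:
            free.append(s)
        elif wake_right + buffer_km <= y <= wake_right + buffer_km + width_km:
            free.append(s)
    if not wake or not free:
        raise ValueError("leg does not cover the wake and undisturbed zones")
    return fmean(wake), fmean(free)


# Synthetic 40-km leg with one sample per km and a wake between -3 and 3 km.
y_km = [float(y) for y in range(-20, 21)]
speed = [8.0 if -3.0 <= y <= 3.0 else 10.0 for y in y_km]
u_wake, u_free = zone_means(y_km, speed, wake_left=-3.0, wake_right=3.0)
print(u_wake, u_free)
```

Samples inside the buffer zones are deliberately discarded, so the transition region at the wake edge does not bias either mean.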
The extent of the wake is defined by the relative wind-speed deficit $D_r$ in the mean wind direction,
$$D_r(x) = \frac{u_f(x) - u_r(x)}{u_f(x)}, \tag{8}$$
with $u_r$ the averaged wind speed in the wake zone, and $u_f$ the averaged wind speed in the undisturbed zone, both as a function of the downwind coordinate $x$, as in some cases significant gradients within the background wind speed are observed for flow from the land. Similar observations are made in previous studies (Christiansen and Hasager 2005; Djath et al. 2018). The end of a wake is defined as the downstream distance $x_{95}$ at which $D_r(x) < 5$ %. Furthermore, the wake can be characterized in time $t$ by a wake duration $t_{95}$, with $D_r(t_{95}) < 5$ %, when the expression is transferred into the time domain,
$$t = \frac{x}{\overline{u_r(x)}}. \tag{9}$$
Here, $\overline{u_r(x)}$ is the average wind speed within the wake over all downstream legs. This is assumed to be the best approximation of the speed of a travelling air parcel within the wake and is therefore taken as the characteristic speed. The third wake parameter is the initial velocity deficit $D_{r1}$, which is obtained as $D_r$ of the first flight leg downwind of the wind farm at $x = 1$ km.
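The three wake parameters can be illustrated with a short sketch. How the 5 % crossing is located between two flight legs is not specified in the text, so the linear interpolation used here is an assumption, as are the function names and the leg values.

```python
from statistics import fmean


def relative_deficit(u_wake, u_free):
    """Relative wind-speed deficit of the wake with respect to the free flow."""
    return (u_free - u_wake) / u_free


def wake_length(x_km, deficits, threshold=0.05):
    """Downstream distance in km at which the deficit falls below threshold.

    Returns None if the wake exceeds the flight pattern. If the first leg is
    already below the threshold, its distance is returned as an upper bound.
    Assumption: linear interpolation between neighbouring legs.
    """
    for i, (x, d) in enumerate(zip(x_km, deficits)):
        if d < threshold:
            if i == 0:
                return x
            x_prev, d_prev = x_km[i - 1], deficits[i - 1]
            return x_prev + (d_prev - threshold) * (x - x_prev) / (d_prev - d)
    return None


# Hypothetical legs at 1, 11, 21, and 31 km downstream of the wind farm.
x_km = [1.0, 11.0, 21.0, 31.0]
u_wake = [7.0, 8.5, 9.3, 9.8]       # m s^-1, averaged in the wake zone
u_free = [10.0, 10.0, 10.0, 10.0]   # m s^-1, averaged in the undisturbed zone

deficits = [relative_deficit(r, f) for r, f in zip(u_wake, u_free)]
initial_deficit = deficits[0]                     # deficit at x = 1 km
x_95 = wake_length(x_km, deficits)                # wake length in km
t_95_min = x_95 * 1000.0 / fmean(u_wake) / 60.0   # wake duration in min
print(initial_deficit, x_95, t_95_min)
```

For these made-up values the deficit drops from 30 % at the first leg to below 5 % between the third and fourth legs, giving a wake length of about 25 km and a wake duration of roughly 48 min.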

Long-Term-Stability Analysis in the German Bight
To assess the stability conditions at offshore wind farms in the German Bight, a long-term study of the lapse rate γ is conducted, comprising the years 2016 and 2017. This choice is based on the high data availability during this period. We analyzed all situations with wind speeds between 5 and 25 m s⁻¹ at 90 m. The lapse rate γ is calculated between different measurement heights with respect to the maximum height at the FINO masts, which corresponds to the hub height at about 100 m. The lower heights are defined by the altitude of the thermometers at the masts (Table 2). Starting with the FINO3 mast, which is located farther offshore and farther away from wind farms, the height intervals are 55-95 m, 29-95 m, and 0-95 m. Data availability decreases from 95 % to 79 % with the height range (Table 3).
The histogram of the probability distribution of γ (Fig. 3a) shows that the gradient for the 29-95 m and 55-95 m intervals ranges from about −0.01 to 0.02 K m⁻¹. The distribution is narrow and has a distinct peak around zero, which is slightly shifted towards negative (unstable) values for 29-95 m (yellow) and towards positive (stable) values for 55-95 m (blue). Moreover, both distributions are skewed to the right. The sea-surface temperature varies slowly compared The higher the height range over which the gradient is calculated, the more the distribution is shifted towards larger γ values, and therefore more stable conditions, similar to the findings by Argyle and Watson (2014). As listed in Table 3, with the use of the sea-surface temperature only a minority of 29 % of the cases are classified as stable at the FINO3 mast. Using the highest altitude interval (55 to 95 m), the majority of 88 % is stable. In Fig. 3a and b, the distribution of γ according to the wind direction is shown for γ calculated over 0-95 m. However, the key features of the stability rose are revealed even more clearly if two alternative vertical separation heights (30-90/95 m and 50-90/95 m) are considered (Fig. 3c and d). According to the stability rose, the distribution is asymmetric. For flow from north-west to north-east, the percentage of stable cases is very low. During winds with a southern component, stable cases become more relevant. The variation of the vertical separations additionally reveals how sensitive the stability determination is to the choice of the height interval.
The analysis of the FINO1 data largely supports the findings (Fig. 4). In general, the data availability is lower than for the FINO3 data, with 75 % on average. As above, three different altitude intervals are evaluated: 0-90 m, 33-90 m, and 50-90 m. The same dependence on the altitude interval is visible as for the FINO3 mast: the higher the lower measurement point, the larger the percentage of stable situations (Table 3). The influence of the height intervals on the lapse rate measured by the FINO masts is further discussed in Sect. 4.3 for the times of the WIPAFF flights, where the results confirm the observed trend.
Moreover, γ shows a dependence on the wind direction (Fig. 4f-h), resembling the findings at the FINO3 mast. Most of the stable situations are observed during winds with a southern component. Especially for south-west winds, a percentage of stable cases of up to 47 % is found.
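The idea behind a stability rose, namely the share of stable cases per wind-direction sector, can be sketched as below. The sketch classifies a case as stable simply when γ > 0, which is a simplification of the classification in Table 1; the number of sectors, the function name, and the sample values are hypothetical.

```python
def stable_fraction_by_sector(directions_deg, gammas, n_sectors=8):
    """Fraction of stable cases (gamma > 0) per wind-direction sector.

    Sector 0 is centred on north (0 deg); sectors proceed clockwise.
    Sectors without data are returned as None.
    """
    width = 360.0 / n_sectors
    counts = [0] * n_sectors
    stable = [0] * n_sectors
    for phi, gamma in zip(directions_deg, gammas):
        index = int(((phi + width / 2.0) % 360.0) // width)
        counts[index] += 1
        if gamma > 0.0:
            stable[index] += 1
    return [s / c if c else None for s, c in zip(stable, counts)]


# Hypothetical 10-min records: wind direction in deg and lapse rate in K m^-1.
directions = [0.0, 10.0, 350.0, 180.0, 190.0, 225.0]
gammas = [-0.002, -0.001, -0.003, 0.004, 0.006, -0.001]
fractions = stable_fraction_by_sector(directions, gammas)
print(fractions)
```

In this toy example all northerly records are unstable and all southerly records are stable, mimicking the asymmetry described above; with real data, each sector would contain many records and the fractions would lie between these extremes.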

Comparison and Discussion
According to the analysis in Sect. 4.1, the highest percentage of stable conditions in the German Bight is expected for flow from the land in the south. In contrast, for flow from the open sea in the north, the conditions are predominantly convective. The presented results fit well in the framework of existing studies of the dynamic stability parameters in the German Bight (Westerhellweg et al. 2010; Sathe 2010; Emeis et al. 2016; Platis et al. 2021). As outlined in Emeis et al. (2016), this behaviour is typical over the ocean for the northern hemisphere in the temperate westerly wind belts: warm sector winds most frequently come from the south-west and are thus accompanied by rather stable conditions, whereas cold sector winds come from the north-west and predominantly bring convective conditions. However, compared to the theory proposed by Emeis et al. (2016), the sector of most stable conditions in our dataset is shifted towards the south or even south-east. An explanation is that the land mass in the south and south-east has an impact on the stability conditions, as warm air masses over the land are advected over the sea around noon and in the afternoon. Moreover, as mentioned already in Sect. 3.2, the wind farms adjacent to the measurement masts (Alpha Ventus next to FINO1 and Dan Tysk east of FINO3, see Fig. 1) also alter the analysis (Foreman et al. 2017). However, it is difficult to assess whether the adjacent wind farms alter the stability observations on a long-term perspective. According to , temperature changes in the order of 0.5 to 1 K have been observed in the wake during stable conditions at hub height. This warming may lead to an even stronger gradient between sea surface temperature and hub height, and thus to stronger stability during thermally stable cases.
The stability classification according to γ depends crucially on the measurement altitude. The trend is observed that the lower the lowest measurement point for the gradient calculation, the more convective cases are expected. One reason might be frequent cases with unstable stratification close to the water surface. The hypothesis is supported by Argyle and Watson (2014) for example, concluding from an analysis of L that the marine air below 50 m is mostly very unstable compared to the stronger stability above, suggesting the presence of an internal marine boundary layer.
Concerning the measurement location, the FINO1 mast is located closer to the shore than the FINO3 mast and is expected to be more influenced by the land mass. The FINO1 data reveal a higher percentage of stable cases for the intervals 0 to 90 m and 33 to 90 m compared with the two corresponding intervals at the FINO3 mast. In contrast, the estimated percentage is lower for the interval 50 to 90 m compared to the respective interval at FINO3. The finding indicates that, across the different height intervals, the FINO1 data yield more homogeneous values of stability than the data of the FINO3 mast.

Evaluation of Stability and Turbulence Criteria Using Flight and FINO Data
The comparison of the criteria γ, $Ri_B$, ζ, $I$, $e$, $\overline{w'w'}$, and $\overline{w'\theta_v'}$ by means of the Spearman rank correlation coefficient $r$ (e.g., Sachs 2013) is provided in Table 4 (full list in Tables 6 and 7). With this coefficient, the strength and direction of the association between two variables is measured without assuming a linear relationship, as is the case with the Pearson correlation coefficient. Gradient-based parameters are evaluated in the interval of 50 to 100 m for aircraft and FINO measurements to ensure comparability. A crucial feature that must be taken into consideration is that parameters such as γ and $Ri_B$ are obtained within a layer, whereas others, such as ζ and its input variable $\overline{w'\theta_v'}$ or the turbulent quantities, are obtained only for a certain height. We further discuss the consequences of this issue at the end of Sect. 4.2.2. The following explains Table 4 with visualizations and discussions of the most promising as well as unexpected results.
Table 4 Spearman rank correlation coefficients in % calculated between the criteria in the column and the row header
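The difference between the Spearman and the Pearson coefficient can be made concrete with a small pure-Python sketch: the Spearman coefficient is the Pearson coefficient of the ranks, with tied values receiving their average rank. The function names and the sample data are hypothetical and only serve to illustrate the technique.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    s_xy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    s_xx = sum((a - x_mean) ** 2 for a in x)
    s_yy = sum((b - y_mean) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but non-linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
r_spearman = spearman(x, y)
r_pearson = pearson(x, y)
# Hypothetical decreasing relationship, e.g. a turbulence measure vs. gamma.
r_decreasing = spearman(x, [5.0, 3.0, 2.5, 1.0, 0.2])
print(r_spearman, r_pearson, r_decreasing)
```

For the cubic relationship, the rank-based coefficient is exactly one although the Pearson coefficient is below one, which is why a rank correlation is a natural choice when a monotonic but not necessarily linear dependence between stability and turbulence criteria is expected.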

Comparison of Different Measurement Locations
For the correlation between the same criteria evaluated at different locations, as shown in bold in Table 4, the results of the FINO1 and FINO3 masts are condensed using only the measurement mast closest to the flight measurement (FINO1 for flights at Godewind and FINO3 at the Amrumbank West cluster). No comparison is conducted of FINO1 and flights for the turbulence-based measures due to deviating measurement heights. The lapse rate γ shows the highest correlation coefficient between aircraft and FINO data, 78 %, and is shown as a scatter plot in Fig. 5. The error bar indicates the standard deviation between the airborne-measured profiles if several consecutive profiles were flown. The gradient ranges from −0.004 to about 0.06 K m⁻¹ with an accumulation of cases between 0 and 0.01 K m⁻¹. There is adequate agreement with the dashed bisecting line. The root-mean-square error (r.m.s.e.) between the two datasets is 0.008 K m⁻¹, which is attributed to the spatial separation of the measurements by several tens of kilometres. Furthermore, it should be noted that the uncertainty is larger especially during strong stability, as seen by the larger error bars, mainly caused by a fluctuating height of the inversion top that crucially influences the slope of the regression. The error in γ evaluated at the FINO masts tends to be smaller than for the flights, indicating that the spatial and temporal variability between the single flight profiles contributes to the uncertainty.
To estimate the effect of the spatial variability of the temperature gradient in the German Bight, the simultaneous results of the two measurement masts FINO1 and FINO3, which are separated by a distance of about 130 km, are compared (Fig. 6). With $r$ = 55 %, the correlation is lower and a significant deviation from the bisecting line is observed, especially at high stability.
Fig. 6 Scatter plot of the lapse rate measured at the FINO1 and FINO3 masts, with the standard deviation between the profile measurements as the error bars
The Richardson number $Ri_B$ is likewise evaluated with a gradient-based method (not shown) but results in a far smaller correlation coefficient of 30 % between the flight and FINO measurements. In both datasets there is a high variability with several outliers, and no functional relationship could be observed.

Comparison of Different Stability Criteria at the Same Location
This section is dedicated to a comparison of the different stability criteria during the same atmospheric conditions at the same location. The expected relationship between the stability criteria and turbulence measurements is a negative correlation: when the stability increases (increasing γ, $Ri_B$, or ζ), the turbulent quantity decreases (decreasing $e$, $I$, or $\overline{w'w'}$).
Different measurement heights are considered by the different stability parameters. Whereas γ and $Ri_B$ are obtained within a layer with vertical separation from 50-100 m above sea level (a.s.l.), ζ and its input variable $\overline{w'\theta_v'}$ are recorded only at one distinct height at 100 m a.s.l.
One of the highest correlations between a gradient-based stability criterion and a turbulence criterion of the flight measurement, with $r$ = −68 %, is found for the lapse rate γ with the variance of the vertical velocity component (Fig. 7a). The plot reveals that during thermal instability, indicated by a negative or near-zero temperature gradient, the amplitude of $\overline{w'w'}$ is high. With increasing γ, $\overline{w'w'}$ decreases to nearly zero, which is expected as vertical mixing is suppressed. For strongly stable conditions with γ > 0.01 K m⁻¹, $\overline{w'w'}$ stays very small, except for the two outliers at the FINO1 mast. A similar relationship is found for the comparison of γ and the turbulence intensity $I$. For the parameters $e$ and γ, the correlation is significantly weaker ($r$ = −22 %, see Table 4). Also, the correlation of $\overline{w'w'}$, $I$, and $e$ with $Ri_B$ is lower (Table 4), even though $Ri_B$ depends on wind-speed measurements as well.
The correlation within the turbulence criteria $e$, $I$, and $\overline{w'w'}$ is comparably high, which is not surprising, as the variables partially depend on the same input quantities. Especially high is the correlation between $e$ and $\overline{w'w'}$ with $r$ = 93 % for FINO1 (Fig. 7b). The horizontal wind components measured by the aircraft have much larger variances compared with the tower data (Fig. 7b), which is expected during (strong) stable conditions. In contrast, the tower data indicate almost isotropic turbulence by a linear correlation, which seems unrealistic for (very) stable cases.
Fig. 7 Scatter plots of different criteria for flight and FINO1 measurements. The turbulence criteria are calculated for the flight leg upstream of the wind farm, which is expected to be in the undisturbed flow, and the simultaneously measured 10-min FINO1 data. When the aircraft height deviates from the standard pattern and lies above 100 m, the data point is only indicated as a non-filled marker in the plots that compare aircraft with FINO1 measurements. a Variance of vertical velocity component over lapse rate. b Variance of vertical velocity component over $e$. c Turbulent kinematic heat flux over lapse rate
Finally, the stability criteria γ, $Ri_B$, and ζ are compared with each other: there is a weak correlation between γ and $Ri_B$. This finding is mainly attributed to outliers in $Ri_B$ and to the fact that $Ri_B$ includes wind shear in addition to the thermal stratification expressed by the lapse rate; ζ even shows contradicting correlations with γ and $Ri_B$ in Table 4.
To understand the behaviour, the scatter plot of the turbulent kinematic heat flux $\overline{w'\theta_v'}$ (one of the input variables for calculating ζ) is provided in Fig. 7c. For the FINO1 mast, no clear statement about the correlation between $\overline{w'\theta_v'}$ and γ can be made as data are very limited. Stable cases especially, which predominantly occur for a south-westerly wind (see Sect. 4.1), have been excluded from our dataset, since turbulence measurements (and therefore the turbulent heat flux measurements) by the tower for wind directions of φ > 180° are not considered. For neutral conditions a strong variation in the heat flux is observable. For the few stable cases, the heat flux approaches zero, which is expected, apart from the one outlier at γ = 0.01 K m⁻¹. For the aircraft data the expected trend is visible: the highest $\overline{w'\theta_v'}$ value coincides with the lowest γ value, and $\overline{w'\theta_v'}$ tends to decrease with more positive γ ($r$ = 69 %). However, whereas the vast majority of the samples are classified as stable according to γ, the outcome is reversed for $\overline{w'\theta_v'}$. It is assumed that this result is a consequence of the following:
- As in the case of $Ri_B$, ζ includes wind shear in addition to the thermal stratification that alone determines γ, which may cause a major deviation.
- Different measurement heights are considered by the different stability parameters. Whereas γ and $Ri_B$ are obtained within a layer with a vertical separation from 50 to 100 m a.s.l., ζ and one of its input variables, $\overline{w'\theta_v'}$, are recorded only at one distinct height at 100 m. In case there are temperature inversions present over the relevant height interval, γ averages over several layers and does not represent the actual temperature gradient, whereas the measure ζ represents the stability at the measurement height (probably) better, but ignores layers below an inversion. An example of this is observed for flight 31 and is discussed in Platis et al. (2020). For some of the presented case studies, this implies that stable inversions closer to the surface are present, leading to the observed majority of positive γ but negative ζ detected at a height aloft.
- When γ is positive, only a negative sensible heat flux is possible, unless a convection layer with an unstable stratification is located directly above the water and the heat is transported upwards against the stratification (counter-gradient). This effect could also partially explain the findings in Sect. 4.1.1.

Evaluation of Criteria and Implications
In the present dataset, the lapse rate is considered to be the most robust parameter, as there is good agreement between the airborne data and the FINO tower data, especially for near-neutral and slightly stable conditions. This suggests that FINO measurements can be used as a reliable data source for the temperature gradient in the German Bight at the wind farm of interest. However, the disadvantages of this criterion are a high uncertainty when it comes to very stable conditions and when inversions or stable surface layers are present. Moreover, a dry adiabatic lapse rate does not automatically mean neutral conditions, as excess buoyancy and free convection can still be present (Stull 1988). Also in the dataset, an accumulation of the results for near-neutral conditions (γ ≈ 0) is observed, which makes it difficult to distinguish convective cases here. There are two stability criteria taking the effect of shear-generated turbulence into account: first, $Ri_B$, whose agreement with the other criteria is weak as outliers lead to an unreliable result. Second, ζ deviates from the static stability, which results in a significantly higher portion of convective cases, probably originating in the presence of stable layers below the measurement height of ζ and the additional shear-generated turbulence term. The most promising turbulence parameter is $\overline{w'w'}$, showing a functional relationship with the temperature gradient γ. Used as an additional parameter, this variable could support a further classification during situations with a lapse rate close to zero (neutral stratification).

Airborne Wake Observations
The determination of wakes behind wind farms was possible for 28 out of 41 measurement flights. Those have a meandering pattern of flight legs downstream of the wind farm at hub height (Figs. 1 and 2). The remaining 13 flights were excluded due to a different flight strategy. For 12 of the analyzed flights the wake length, defined as the distance to 95 % recovery, exceeds the length of the meander pattern, and for two flights the wake is observed to be shorter than the first measured leg at hub height. An overview of wake lengths, wake durations, and the initial wind-speed deficit is provided in Table 5.
Fig. 8 Comparison of the lapse rate derived from the aircraft measurements across the total rotor area (30 to 150 m) with the calculation over alternative height intervals: 50 to 100 m is the interval considered in Sect. 4.2. The interval 60 to 120 m is also centred within the rotor area, but is smaller to improve data availability
The wake lengths observed within the campaign range from nearly zero to over 65 km, and the corresponding wake durations reach more than 160 min. Initial wind-speed deficits D r 1 , approximated at the first downstream leg, of up to 43 % of the undisturbed flow are observed.
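The wake parameters described above can be illustrated with a minimal sketch. It assumes that the hub-height legs are reduced to one wake wind speed per downstream distance, that the 95 % recovery point is found by linear interpolation between legs, and that the wake duration is the advection time of the undisturbed flow over the wake length. The function name `wake_metrics`, its arguments, and these processing choices are hypothetical and are not taken from the campaign processing.

```python
import numpy as np


def wake_metrics(distance_km, wake_speed, free_speed, recovery=0.95):
    """Estimate wake length, duration and initial deficit from hub-height legs.

    distance_km : downstream distance of each flight leg (ascending), in km
    wake_speed  : wind speed inside the wake at each leg, in m/s
    free_speed  : undisturbed wind speed, in m/s
    """
    x = np.asarray(distance_km, dtype=float)
    ratio = np.asarray(wake_speed, dtype=float) / float(free_speed)

    # Initial deficit at the first downstream leg, in percent of the free flow.
    deficit_pct = 100.0 * (1.0 - ratio[0])

    recovered = np.nonzero(ratio >= recovery)[0]
    if recovered.size == 0:
        # Wake exceeds the meander pattern: only a minimum length is known.
        x95, status = x[-1], "lower_bound"
    elif recovered[0] == 0:
        # Already recovered at the first leg: the wake ends before that leg.
        x95, status = x[0], "upper_bound"
    else:
        i = recovered[0]
        # Linear interpolation between the last unrecovered and first recovered leg.
        frac = (recovery - ratio[i - 1]) / (ratio[i] - ratio[i - 1])
        x95, status = x[i - 1] + frac * (x[i] - x[i - 1]), "captured"

    # Assumed definition: advection time of the free flow over the wake length.
    t95_min = x95 * 1000.0 / free_speed / 60.0
    return {"x95_km": float(x95), "t95_min": float(t95_min),
            "deficit_pct": float(deficit_pct), "status": status}


# Synthetic example values (not campaign data).
result = wake_metrics([5, 15, 25, 35], [7.0, 8.5, 9.25, 9.75], free_speed=10.0)
print(result)
```

The `status` field separates fully captured wakes from the two limiting cases mentioned in the text, so that minimum wake extensions can be treated separately in a later regression.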
To examine the relationship between wake and stability, the three wake parameters (length x 95 , wake duration t 95 , and initial wind-speed deficit D r 1 ) are compared with stability indicators. Based on Sect. 4.2, these are the lapse rate γ and the variance of the vertical velocity component w w .

Height Interval of the Lapse Rate
Since γ depends on the applied height interval (see also Sect. 4.1), the question arises as to which interval is most representative for the wake evolution. For an offshore wind farm, it is likely to be the interval across the total rotor area, which is 30-150 m for the Amrumbank West wind farm. A comparison of the flight lapse rate evaluated over different intervals shows good agreement between γ calculated across the rotor area and γ calculated over a smaller interval of 60-120 m that is also centred within the rotor area (Fig. 8). As the results agree well for the given case studies, the smaller interval of 60-120 m is used in the following analysis, because it offers higher data availability.
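The interval comparison can be sketched as follows. The sketch assumes that γ is evaluated as the slope of a least-squares line through a potential temperature profile, so that γ > 0 indicates stable and γ ≈ 0 neutral stratification, in line with the values quoted in the text. The function name `gradient_over_interval` and the profile values are hypothetical, and the actual evaluation procedure of the study may differ.

```python
import numpy as np


def gradient_over_interval(z, theta, z_min, z_max):
    """Least-squares vertical gradient of theta (K per m) for z_min <= z <= z_max."""
    z = np.asarray(z, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # Keep only samples inside the interval that carry a valid temperature.
    mask = (z >= z_min) & (z <= z_max) & np.isfinite(theta)
    if mask.sum() < 2:
        raise ValueError("need at least two valid samples in the interval")
    slope, _intercept = np.polyfit(z[mask], theta[mask], 1)
    return float(slope)


# Synthetic, slightly stable profile (illustrative values, not measurements).
z = np.arange(30.0, 151.0, 10.0)
theta = 285.0 + 0.01 * z
theta[0] = np.nan  # a data gap near the lower edge of the rotor area

gamma_rotor = gradient_over_interval(z, theta, 30.0, 150.0)
gamma_centre = gradient_over_interval(z, theta, 60.0, 120.0)
print(gamma_rotor, gamma_centre)
```

In this sketch, gaps near the edges of the rotor area do not affect the narrower, centred interval, which mirrors the data-availability argument given above.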
For the FINO1 and FINO3 masts, no calculation of an interval centred within the rotor area is possible due to the limited height of the towers. However, the towers have the advantage of measuring the sea-surface temperature. The comparison of γ evaluated over different height intervals (Fig. 9) shows a good linear correlation, especially between the interval from the sea surface (0 m) to 30 m and the 0-100 m interval. It also shows that γ measured over 0-100 m tends to be more neutral than γ measured over 0-30 m, and this tendency is even stronger for the 50-100 m interval. The lapse rate over the interval closest to the surface can be interpreted as an indication of the surface forcing. Hence, the lowest interval gives the best representation of convective cases. This is important to account for any layer at higher altitudes that is neutrally stratified but may still be convective due to strong surface forcing by a large surface sensible heat flux (Stull 1988). Therefore, in the further analysis of the correlation between wake length and stability, the mast lapse rate is calculated from the sea surface to the lowest temperature reading on the mast to explicitly include the surface forcing, in contrast to the aircraft measurements (with an interval between 60 and 120 m).

Table 5 Overview of wake parameters

Fig. 9 Comparison of the lapse rate γ measured at the FINO masts during the flight times evaluated over different height intervals

Comparison of Wake and Stability
The comparison of the wake parameters (wake length x 95 , wake duration t 95 , and initial wind-speed deficit D r 1 ) with the lapse rate γ evaluated from the aircraft measurements is provided in Fig. 10. A common trend across all wind-farm clusters is not clearly visible. When looking at individual wind-farm clusters, a trend towards longer wakes with stronger stability is observed for the Godewind and Amrumbank West wind farms. This trend is described by a function of the form γ = a × (x 95 )^b + c, with the fitted coefficients given in the caption of Fig. 10. In terms of wake duration (Fig. 10b), a trend towards longer wake duration is only visible for the Godewind observations (blue filled dots). For the other wind farms there is no clear correlation in our data. Comparing the initial wind-speed deficit with γ (Fig. 10c) indicates that small deficits occur during more convective conditions, assuming the Godewind measurement point of flight 9 with a gradient of 0.03 K m −1 to be an outlier. In contrast, with higher initial deficits, the range of stability widens. During strong stability, large D r 1 values are observed, supporting the hypothesis that wake effects within the wind farm are more pronounced during stable conditions. Excluding the outlier of flight 9, a linear regression with a slope of 5 × 10 −4 K m −1 approximates the behaviour.
Comparing the wake length with the values of γ evaluated at the FINO1 and FINO3 masts (from 0 to 30 m) reveals a correlation similar to that of the aircraft observations (Fig. 11a).
The overall trend is that the wake length increases with stability and, in particular, that no short wakes are present during very stable conditions. For visualization, a regression with an r.m.s.e. of 0.05 K m −1 and the form γ = 3.2 × 10 −5 × (x 95 ) 2.8 − 0.036, for γ in K m −1 and x 95 in metres, is fitted to all wake measurements that captured the whole wake dissipation (filled markers).
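A regression of this power-law form with an offset can be obtained in several ways; the fitting procedure used in the study is not stated. The following minimal sketch assumes a profile least-squares approach: for each trial exponent b the remaining coefficients a and c follow from a linear least-squares problem, and the exponent with the smallest r.m.s.e. is kept. The function name `fit_power_law`, the exponent grid, and the data are hypothetical; the data are synthetic and only serve to show that a known law is recovered.

```python
import numpy as np


def fit_power_law(x, y, exponents):
    """Fit y = a * x**b + c by scanning b and solving for a, c linearly."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for b in exponents:
        # For a fixed exponent the model is linear in a and c.
        design = np.column_stack([x ** b, np.ones_like(x)])
        coef = np.linalg.lstsq(design, y, rcond=None)[0]
        rmse = float(np.sqrt(np.mean((design @ coef - y) ** 2)))
        if best is None or rmse < best["rmse"]:
            best = {"a": float(coef[0]), "b": float(b),
                    "c": float(coef[1]), "rmse": rmse}
    return best


# Synthetic wake lengths and stability values generated from a known law
# (illustrative numbers only, not the campaign data).
x95 = np.linspace(5.0, 60.0, 12)
gamma = 2.0e-5 * x95 ** 2.0 - 1.0e-3

fit = fit_power_law(x95, gamma, exponents=np.linspace(0.5, 4.0, 36))
print(fit)
```

Restricting the input to fully captured wakes (filled markers) before calling such a routine would correspond to the selection described in the text.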

Fig. 10
Scatter plots of wake length x 95 , duration t 95 , initial wind-speed deficit D r 1 and static stability obtained from the flight measurements. The respective wind farm is colour coded. Observations of one common wake downwind of Amrumbank West cluster or Nordsee Ost and Meerwind Süd-Ost are assigned as "Cluster". An empty marker indicates the minimum wake extension if the wake exceeds the meander pattern. a Lapse rate versus x 95 and regression for Godewind (blue line) and AW (green line) data. Figure adapted from Platis et al. (2021). The regressions have the form for Godewind: γ = 1.6 × 10 −5 × (x 95 ) 2.0 − 4.6 × 10 −4 and AW: γ = 7.4 × 10 −5 × (x 95 ) 3.6 − 1.9 × 10 −4 with γ in K m −1 and x 95 in metres. b Same as in Fig. 10a but for the wake duration t 95 . c D r 1 and linear regression with γ = 5 × 10 −4 K m −1 y 0 − 0.003 K m −1

A comparison of wake length and vertical velocity variance w w shows no clear relationship for all wind farms (Fig. 11b). However, considering only wake measurements downwind of Godewind, there is an indication that the value of w w decreases with increasing wake length. The same trend is observed for the wake duration (not shown here). We also tested the correlation with other parameters such as e, I , or ζ . For these stability and turbulence criteria, no clear correlation is found, either for the entire dataset or for individual wind farms (not shown).
In summary, a distinct correlation between wakes and stability is difficult to define:
- A correlation is seen between the initial wind-speed deficit and the thermal stability γ . Higher wind-speed deficits and longer wakes are observed during more stable conditions.
- Strong stability coincides with the absence of short wakes.

We conclude that the relationship cannot be expressed simply as a clear correlation between stability and wake. Furthermore, superimposed effects, which are not assessed here, blur the picture. Besides uncertainties in the method for a representative stability and wake determination, the farm architecture could be an additional factor that affects the result. Platis et al. (2020) showed that wake lengths downstream of the wind farm Amrumbank West, which has a very dense turbine spacing, are much longer than wakes from the neighbouring wind farms Nordsee Ost or Meerwind Süd-Ost with much sparser turbine spacing, despite identical atmospheric conditions.

Conclusion
The lapse rates γ calculated from the FINO tower data and from the airborne measurements are in good agreement for simultaneous measurements in the German Bight. A statistical error of 0.008 K m −1 is found between the FINO data and the flight measurements close to the wind farm of interest, although the measurements are spatially separated by several tens of kilometres. Therefore, γ is considered a robust parameter for describing stability in a region such as the German Bight.
In contrast, the comparison of turbulence measurements at FINO1 and aboard the aircraft is less consistent, probably due to the influence of adjacent wind farms on FINO1, as noted by Foreman et al. (2017), and the different altitudes at which turbulence measurements were available. In particular, comparing gradient-based approaches over a layer of several tens of metres with turbulence measurements at a single height leads to differences. A main reason is temperature inversions inside these layers or below the measurement heights, as discussed in more detail in Platis et al. (2020).
The study supports the hypothesis that static stability in the German Bight depends on the wind direction, and stable conditions are predominantly expected for flow from the land, that is, from the south, south-west, and south-east. In addition, the evaluation of the stability depends on the measurement height. In the German Bight, a near-surface (0 to 30 m), predominantly convective layer tends to be present, and more stable conditions are found aloft. Using the highest altitude interval (55 to 95 m), stable conditions are found for 88 % of the investigated cases, whereas extending the interval down to the sea surface results in only 29 % of cases being stable for the FINO3 mast. Therefore, it is crucial to consider the height of the boundary layer and possible inversions in the temperature profile when evaluating stability.
The comprehensive analysis of 28 airborne in situ measurement flights downwind of the wind farms Godewind and the Amrumbank West cluster, mainly under thermally stable conditions, reveals wake lengths of more than 65 km. The distance at which the wind speed inside the wake has recovered to 95 % of the free-stream value characterizes the maximum length of the wake. The initial wind-speed deficit D r 1 is found to range up to 43 % compared with the undisturbed flow, which supports the findings of the previous analyses by Platis et al. (2018), Cañadillas et al. (2019), Platis et al. (2020), and Platis et al. (2021). In contrast to those analyses, all flights of the WIPAFF campaign are considered in this study, leading to more representative results.
Our data analysis reveals a trend towards longer wake lengths with stronger atmospheric stability, which is expected from the results of analytical models such as that proposed by Emeis (2010). However, the correlation is not clear-cut, since wakes are observed not only during strongly stable cases, but also during near-neutral conditions. Apart from the impact of possible temperature inversions, this variance in the data indicates that under the investigated thermally stable conditions, the wake intensity is influenced by further factors. These could be, for example, the operating state of the wind farms, the wind-turbine dimensions, or the farm layout (its width, the angle of incident flow, or the turbine spacing). The latter is supported by the findings of recent studies such as Kühn and Schneemann (2017) and Platis et al. (2020). If the wind-farm layout is indeed one of the main drivers controlling the wake length under statically stable conditions, farm architecture becomes highly relevant in the planning of new farms, since, as is the case in the German Bight, stability varies with wind direction.

The wind farm is either classified as Godewind, meaning the flight is conducted at Godewind, or Amrumbank West when it is conducted at the Amrumbank West cluster. AC stands for aircraft measurement, and F1 and F3 abbreviate FINO1 and FINO3. The height interval for the lapse-rate calculation for AC and FINO is between 50 and 100 m. The parameters U and φ are evaluated over the vertical flight profiles as the median of all values with a height 99 m < z < 101 m

Table 7 Overview of flight and stability criteria, part 2; AC abbreviates the aircraft measurements and F1 and F3 abbreviate FINO1 and FINO3. The height interval for the Ri B calculation for the AC and FINO data is between 50 and 100 m. For the parameter ζ , the measurement height z in m during the flight measurement is added for completeness