A ﬁrst order geometric auto regressive process for boundary layer wind speed simulation

. Under certain conditions the ﬁrst order geometric auto regressive (AR) process has statistical properties similar to atmospheric boundary layer wind speed. In this contribution, we investigate this similarity and analyse the extent to which this stochastic process is a suitable model for wind speed simulation. In particular, we focus on the ﬂuctuation of the process around its moving average over a given number of time steps. We show that the ﬂuctuation conditioned on the value V of the moving average are of a symmetric normal distribution with a proportionality between its standard deviation and V . This proportionality is empirically observed in wind speed data. Furthermore, we show that the increment distribution of the geometric AR(1) process is in good agreement with the symmetric Castaing distribution which is empirically found in wind speed data.


Introduction
In several applications, in particular in view of the cost efficient use of wind power, more realistic input wind fields than just laminar flows are desired for numerical simulations of the flow around an obstacle. Solving the Navier-Stokes equations, see e.g. [1,2], by direct numerical simulation gives a realistic wind field. But for three-dimensional volumes of a realistic size the computational costs are immense. Large eddy simulation is an approximated solution to the Navier-Stokes equations by not directly solving the motions of the small scales. Rather, they are modelled. Still, it requires enormous computational power.
Another approach are stochastic models simulating turbulent wind fields based on empirical findings. Notable are spectral models [3][4][5] aiming to describe the process by its spectral density and models based on continuous time random walks [6,7]. These models put their focus on the cross-correlation or the intermittency, respectively. It is very difficult (if at all possible) to consider all kinds of statistical properties and generate a series which perfectly reflects the atmospheric wind velocity.
In this contribution, we put the focus on another statistical property: the proportionality between the strength of wind speed variation and the mean wind speed. We analyse the extent to which a stochastic process, which received yet little consideration in this context, is a suitable model for simulating the modulus of the horizontal a e-mail: laubrich@pks.mpg.de components of atmospheric boundary layer (ABL) wind speed: the first order geometric auto regressive (AR) process. It is defined by the recursion formula for n ≥ 1 where ξ n stands for an independent normally distributed random variable with zero mean, unit variance, and an auto correlation function given by a Kronecker delta: E [ξ n ξ m ] = δ n,m . As {u n } N n=1 shall reflect a wind speed series, it must have a dimension. The parameter U acts as a scale for the process and has the dimension of velocity. The logarithm of the dimensionless speed u n /U is an AR(1) process which is stationary if a is chosen to be less than one in magnitude. From this equation we can read off the meaning of the two parameters a and b. The first one correlates ln(u n /U ) and ln(u n+1 /U ) and the positive constant b reflects the noise strength.
The purpose of this contribution is to show that the process (1) with suitably chosen a and b has similar fluctuation statistics to ABL wind speed and therefore is a suitable candidate for the numerical generation of artificial wind speed sequences. This is done in Section 4.
On the other hand, in order for (1) to reflect wind speed, we compare its increment statistics with those found in wind speed data. This is subject of Section 5.
But first, we want to state some basic statistical properties of the geometric AR(1) process and discuss the properties of ABL wind speed data, which serve as reference and which shall be simulated.

The process
This section is intended to state basic statistical properties of u n which we will need later for deriving the fluctuation and increment statistics in Sections 4 and 5, respectively.
We only consider |a| < 1 so that the process (1) is stationary. It is obvious from equation (2) that the AR(1) process ln(u n /U ) is of a Gaussian distribution. Its expectation value E [ln u n /U ] equals zero and its variance is equal to Therefore, u n is of a log-normal distribution with the density The reader is referred to reference [9] for a detailed description of the log-normal distribution.
Here, we summarize shortly some points which are important for the scope of this paper. The position parameter U of the log-normal distribution (4) matches the median of u n and has the dimension of velocity. In contrast, the shape parameter λ 2 is dimensionless and reflects the ratio between the expectation value μ u and the median of u n : As a consequence of the expectation value being larger than the median, the majority of points generated by (1) are smaller than the expectation value. In fact, the probability of u n to be smaller than μ n is 1/2 + erf(λ √ 2/4)/2. The coefficient of variation, i.e. the standard deviation of u n in units of the expectation μ u , is given by The distribution is positively skewed with Skew [u n ] = (e λ 2 + 2) √ e λ 2 − 1 > 0 and leptokurtic. That is, the kurtosis is larger than three. Note that E u 4 n /σ 4 u where σ 2 u = Var [u n ] is used for the definition for the kurtosis. If u n were of a normal distribution, its kurtosis would be equal to three. However, the kurtosis of the log-normal distribution (4) is only three if the shape parameter λ 2 equals to zero. At last, we give the expression for the higher moments of u n reading for q = 0, 1, 2, . . .
Another quantity of the process (1), which is of importance for the subject of this contribution, is the auto correlation function. Reference [9] states that the auto correlation function is given by For small lag |τ | this function decays faster than exponentially whereas for sufficient large |τ | it decays exponentially because the auto correlation function approaches for λ 2 a |τ | 1. In any case, it can be noted that if the exponent a tends to one, the degree of correlation becomes larger.

Wind data
Recordings, which were made at the Lammefjord site [10] (55 • 47 41 N, 11 • 26 52 E), of the modulus of the hor- , of ABL wind velocity serve as reference. The data was acquired using cup-anemometers with a measurement frequency ν = 8 Hz.
The fluctuation analysis is introduced in reference [11] and applied to ABL wind velocity v n . The fluctuation is defined by wherev (T ) (t) denotes the moving average over the time interval [t−T /2, t+T /2]. We investigate the statistics of the fluctuation conditioned on the mean wind speed V . That is, we estimate the distribution of f (T ) (t) conditioned on |v (T ) (t) − V | < ΔV/2. ΔV should be small. But due to the time discretisation of the data (ν = 8 Hz) it must be large enough to get reasonable statistics. Figure 1 depicts some of the estimated conditioned distributions for a variety of V with binning ΔV = 0.25 m s −1 . The time scale T is chosen to be 5 s. The histograms in Figure 1 suggest that the fluctuation is roughly of a symmetric normal distribution, see also reference [12]. The estimated standard deviation σ ing a proportionality between σ (5 s) f (V ) and V , which is emphasized in Figure 2. The proportionality factor is estimated to be 0.0694 by estimating the standard deviation of f (5 s) (t)/v (5 s) (t). The proportionality factor can be interpreted as the turbulence intensity [1,12] at a 5 s scale. The latter states the percentage of the mean flow which is represented by the velocity fluctuation. Its value depends on the time scale T over which the mean wind velocity is defined.
Regarding the ABL wind speed increments with fixed increment length τ , several studies, such as [13][14][15][16], give empirical evidence that it is in good approximation of the symmetric Castaing distribution. The latter is developed by Castaing et al. [17] under the assumption of the Taylor hypothesis [18] stating that spatial correlations can be translated into temporal correlations. The distribution is given by  and described by two parameters: the position and shape parameter β τ and λ 2 τ , respectively. The position parameter β τ has the dimension of velocity −2 and is in terms of variance and kurtosis 2 given by The dimensionless shape parameter λ 2 τ is is directly related to the kurtosis of the increments by Additionally, for very large τ the increment distribution approaches a Gaussian, which is shown in Figure 3 and in agreement with references [13][14][15][16]. The figure depicts the estimated increment distribution in a semi-logarithmic plot. The increment length τ is varied between 1/8 s and 4:16 min. That is, for large τ , the shape parameter becomes zero because the increment kurtosis becomes three.

578
The European Physical Journal B

Analysis of fluctuations
We want to show that the geometric AR(1) process (1) In other words, the expected value of u n conditioned on u (2m+1) n = V must be equal to V . We express the moving average in terms of u n and write it as Iterating equation (1), the values u n±s are given by We can take advantage that the parameter a is a little less than one, i.e. a = 1 − ε ∼ 1. Equation (18) contains a in terms of a k where the integer k according to equation (17) In order to get an expression for the expectation of u n conditioned onū (2m+1) n = V , we apply the conditioned expectation operator E · |ū (2m+1) n = V to this equation.
Using that ± s is independent from u n with E [ ± s ] ∼ e b 2 s/2 , expanding the resulting expression up to the second order in b 1 leads to Consequently, the expectation value of the fluctuation reads That is, the deviation from zero of the conditioned expectation value of the fluctuation is merely second order in noise strength b so that it can be neglected.
We can now calculate the distribution of the fluctuation up to the first order in b. Inverting equation (21) leads to u n ∼ū (2m+1) Expanding the right hand side up to first order in b and inserting u n into the definition of the fluctuation yields Consequently, it has zero expectation and its variance is given by so that the symmetric Gaussian is the distribution of f (2m+1) n conditioned onū (2m+1) n = V . The proportionality factor between the standard deviation and V is given by Note that α(m) does not depend on the parameters U or a. Hence, the parameter b can be chosen such that the generated series has α(m) being equal to the ABL turbulence intensity obtained at scale T with T ν = 2m+1. Regarding the simulation of the Lammefjord data with ν = 8 Hz, we fix the scale T to be 5 s. As mentioned before the turbulence intensity is estimated to be 6.94%. The corresponding geometric AR(1) process has therefore b = 0.0376. We generate a series consisting of 5 × 10 6 data points with a = 0.993 and U = 8.00 m s −1 . The reason why these values are chosen is explained in the next section. The conditioned fluctuation distributions are estimated using bin-counting and displayed in Figure 4 for a variety of V . The figure demonstrates good agreement with the day 191 data (plotted with black dots) and symmetric Gaussian (plotted with dashed lines).

Increment statistics
We check whether the increments of the geometric AR(1) process are of a symmetric Castaing distribution (at least approximately) and whether the distribution possesses a cross-over from being intermittent to being Gaussian.
The increment u s;n = u n+s − u n of the process (1) corresponds to the difference between two (coupled) lognormally distributed random variables u n+s and u n . The increment distribution is therefore given by an integral of the form p s (u s;n ) = du n du n+s ρ s (u n , u n+s ) δ(u s;n + u n − u n+s ) (29) where δ(·) stands for the Dirac δ-function and ρ s (u n , u n+s ) denotes the joint distribution of u n and u n+s . The latter can be written as ρ s (u n , u n+s ) = ρ s (u n+s |u n )p(u n ) with ρ s (u n+s |u n ) denoting the distribution of u n+s conditioned on u n . According to equations (18) and (19) u n+s can be written as so that its logarithm for fixed u n reads It is of a Gaussian distribution with mean a s ln(u n /U ) and variance λ 2 (1 − a 2s ) with λ 2 = b 2 / (1 − a 2 ). Therefore, u n+s conditioned on u n is of a log-normal distribution with density ρ s (u n+s |u n ) = 1 Knowing that p(u n ) corresponds to a log-normal distribution as given in equation (4), we have an expression for the joint distribution of u n+s and u n which can be inserted in equation (29). Renaming u s;n to Δu and integrating u n+s over the δ-function as well as substituting y = (2u n + Δu)/U yields with y ± = (y ± Δu/U )/2. Its variance is obtained by expressing the variance in terms of the process variance σ with z(s) := 1 − a s . The kurtosis of the increment is given by Inserting equation (30) and using the binomial theorem results in which is obtained after applying a polynomial division. Consequently, we can set the parameters of the Castaing distribution to such that the symmetric Castaing distribution has same variance and kurtosis. Unfortunately, as s becomes very large, i.e., z(s) tends to one, the shape parameter approaches lim s→∞ λ 2 s = 4λ 2 + ln For small λ 2 this limit is in good approximation 8λ 2 /3, whereas for large λ 2 it is in good approximation 4λ 2 −ln 6. In any case, lim s→∞ λ 2 s ≥ 0 so that it can be concluded that the increments of a geometric AR(1) process do not approach a Gaussian distribution for large s.
Still, we can show that the increment distribution p s (Δu) approaches a symmetric Castaing distribution equation 13 for small increment length s when assuming that a = 1− ε with ε 1, b 1, and s 1/ε. The increment is approximated by u s;n ∼ u n s−1 l=0 exp[bξ n+l ] − u n due to a s ≈ 1 − εs ∼ 1 and equation (30). Expanding the exponential up to first order in b yields an expression for the increment u s;n ∼ u n b √ sη s;n . The random variable η s;n = s−1 l=0 ξ n+l / √ s is of a normal distribution with zero mean and unit variance. Its auto correlation function reads γ η (t) = 1 − |t|/s for |τ | < s and γ η (t) = 0 for |τ | ≥ s. Therefore, the increment u s;n with small s corresponds to a product of a standard Gaussian random variable η s;n with an independent log-normally distributed random variable u n b √ s. Identifying β by 1/(u 2 n b 2 s) implies that u s;n is of a Castaing distribution with position parameter and shape parameter As a cross check, equations (37) and (38) approach β −1 s = U 2 b 2 s and λ 2 s = 4λ 2 , respectively, for small s, and thus z(s) ≈ εs 1. As a consequence, equations (28), (40), and (41) can be used to compute the parameters a, b, and U of the geometric AR(1) process (1) such that the generated series has similar turbulence intensity and similar increment distribution (increment length τ such that s = τν ∈ N) to ABL wind velocity.
Regarding the simulation of the Lammefjord data with ν = 8 Hz, b is set as 0.0376 in order to get the turbulence intensity at scale T = 5 s correct. Fixing the increment length τ to be 1 s, the remaining parameters U and a can be computed: a = 0.993 and U = 8.00 m s −1 . The corresponding increment distribution is plotted in Figure 5. It can be inferred that the increment distribution of the geometric AR(1) is of a symmetric Castaing distribution. The corresponding ABL wind velocity increment distribution is plotted as well with black dots. As it also approximately follows the shape of the symmetric Castaing distribution, it can be concluded that the generated series has similar increment statistics to ABL wind speed.

Summary
This contribution analysed the extent to which a geometric AR(1) process is a suitable discrete stochastic process for simulating ABL wind speed with frequency ν. Due to its iterative definition, the process is simple to implement. The process is defined by three parameters: a 1, b 1, and U . We showed that they can be chosen such that the process has similar fluctuation and increment statistics to wind speed data by applying (28), (40), and (41). That is, three parameter as set to fit the wind speed turbulence intensity as well as position and shape parameter of the wind speed increment distribution. However, in order to estimate the turbulence intensity and the increment distribution, the time scale T and the increment length τ need to be fixed. The process only reproduces the fluctuation and increment statistics for time scale 2m + 1 = T ν ∈ N and increment length s = τν ∈ N, respectively. The model only provides an intermittent increment statistics so that the increment length τ needs to be chosen such that the wind speed increment distribution is intermittent.
We demonstrated the agreement between fluctuation and increment statistics of the process with fluctuation and increment statistics of ABL wind speed data (Lammefjord day 191 data). The frequency of the model was set to measurement frequency of the wind data (8 Hz). The three parameters of the process can be chosen to fit the turbulence intensity at time scale T = 5 s and the increment distribution with increment length τ = 1 s.