Modeling directional distributions of wind data in the United Arab Emirates at different elevations

Modeling wind speed and direction is crucial in several applications, such as the estimation of wind energy potential and the study of long-term effects on engineering structures. While there have been several studies on modeling wind speed, studies on modeling wind direction are limited. In this work, finite mixtures of von Mises (FMVM) distributions are used to model wind direction at two sites in the United Arab Emirates. The parameters of the FMVM distribution are estimated using the least squares method. The results show that the FMVM is the best-suited model for wind direction at these two sites, compared to other distributions commonly used to model wind direction.


Introduction
Wind energy is an important source of renewable energy, and the global installed capacity has been growing every year. Understanding the wind energy potential is very important for selecting sites for wind turbine installations. The key to analyzing this potential is to estimate the probability density function of wind speed. Several statistical models can be used to estimate this distribution, such as the Weibull, Rayleigh, lognormal, normal, and gamma distributions (Ouarda et al., 2015; Qin et al., 2011; Ouarda et al., n.d.; Shin et al., 2016; Wang et al., 2016; Safari, 2011). Wind energy is affected not only by wind speed but also by many other factors such as terrain, climate, temperature, friction, wind direction, and air density (Zhang et al., 2014; Zhang et al., 2013; Naizghi & Ouarda, 2017). While modeling wind speed is enough for turbines with a rotating nacelle, modeling wind direction becomes important for turbines with a fixed nacelle. Moreover, modeling wind direction has several other applications in aerosol transport, power transmission, and engineering structure design (Jammalamadaka & Lund, 2006; Petersen et al., 2005; McLean Sloughter et al., 2013).
Wind direction plays a vital role in transporting pollutants, which can have serious implications for human health (Jammalamadaka & Lund, 2006). Wind parameters have a significant effect on the thermal rating of transmission lines, which influences their transmission capacity (Davis, 1977). In mountainous areas, atmospheric flow in the vicinity of a large elliptical mountain is affected by the upstream wind direction and orographic features, so it is important to understand the interaction between the Coriolis force and the upstream wind direction (Petersen et al., 2005).
Wind direction modeling is also important in applications related to navigation and safety (McLean Sloughter et al., 2013) and in civil engineering projects such as towers and bridges (Masseran et al., 2013; Abohela et al., 2013). The analysis of circular data, such as wind direction, is also common in several other fields. For example, in biology, circular data are present in animal navigation and movement directions (Mardia & Jupp, 2009). Spherical data are evident in the earth sciences in the assessment of the relative rotation of tectonic plates, and directional data arise in studying the orientations of cross-bedding structures (Mardia & Jupp, 2009).
A circular random variable is measured in degrees from 0° to 360° or in radians from 0 to 2π. Such variables cannot be modeled using standard statistical distributions. Several periodic and directional distribution models are used to describe circular data, such as the circular uniform distribution, von Mises distribution, wrapped normal distribution, wrapped Cauchy distribution, and the cardioid distribution (Jammalamadaka & Sengupta, 2001; Fisher, 1995). Several earlier studies have modeled wind direction using different methods. A model was developed for the prediction of surface wind direction over the Pacific Northwest using the von Mises distribution (Bao et al., 2010). Another study forecasted wind directions over the North American Pacific Northwest using the Bayesian model averaging (BMA) method (McLean Sloughter et al., 2013). Masseran et al. (2013) used a finite mixture of von Mises distributions to fit wind direction at nine stations in Malaysia. Carta et al. (2008a) used the least squares method to estimate the parameters of a finite mixture of von Mises distributions to model wind direction in the Canary Islands. A comparison of four types of circular distribution, i.e., the circular uniform, von Mises, wrapped normal, and wrapped Cauchy distributions, was carried out by Kamisan et al. (2010). That study used graphical and numerical approaches to find the best fit for southwesterly monsoon wind direction data recorded in 2005 at four locations in Malaysia. The results were compared graphically using the Q-Q plot. In addition, the mean circular distance and the chord length were used to find the most adequate distribution. It was concluded that the von Mises distribution is the most appropriate model for the data at all four stations.
In the present work, the focus is on modeling the wind direction over the United Arab Emirates (UAE) using a finite mixture of von Mises distributions. The parameters are estimated at two different stations and for different heights to examine the consistency of the distribution models. The paper is organized as follows: Section 2.1 presents the theoretical background on circular statistics. Section 2.2 details the mathematical formulation of the von Mises distribution. Section 2.3 discusses the modeling of the mixture of two von Mises distributions. Hypothesis testing is discussed in section 3 of the paper. The case study is presented in section 4, and the results are illustrated in section 5. Finally, the discussion and conclusions are given in section 6.

Circular statistics
Circular data are normally presented as points on the circumference of the unit circle, each representing a direction. Any point in the plane can be denoted using rectangular coordinates (x, y) or polar coordinates (r, α), where r is the distance from the origin to the point and α is the direction (angle). The sine and cosine functions are used to convert between polar and rectangular coordinates (Jammalamadaka & Sengupta, 2001):
$$x = r\cos\alpha, \qquad y = r\sin\alpha$$
The features of directional data are identified by measures of central tendency and dispersion such as the mean direction, mean resultant length, circular variance, and circular deviation (Fisher, 1995; Trauth et al., 2007; Jones, 2006a). Finding the directional mean is different from finding the arithmetic mean. The proper way to combine directions is to treat them as unit vectors and use vector addition:
$$\bar{C} = \frac{1}{n}\sum_{i=1}^{n}\cos\theta_i, \qquad \bar{S} = \frac{1}{n}\sum_{i=1}^{n}\sin\theta_i$$
The mean direction $\bar{\theta}$ of the vector resultant of $\theta_1, \ldots, \theta_n$ can be computed from
$$\bar{\theta} = \operatorname{atan2}(\bar{S}, \bar{C})$$
The mean resultant length R is the length of the mean vector and lies in the range [0, 1]. It is defined in (Fisher, 1995; Bowers et al., 2000) as:
$$R = \sqrt{\bar{C}^2 + \bar{S}^2}$$
The distribution is more concentrated when the circular variance is smaller. The sample circular variance V lies between 0 and 1 and is computed as:
$$V = 1 - R$$
The corresponding circular deviation, S, is calculated from the mean resultant length.
The von Mises distribution was introduced by von Mises in 1918 and has some similarities with the normal distribution, as highlighted by (Masseran et al., 2013; Gumbel et al., 1953; Masseran, 2015). The von Mises distribution VM(μ, k) is a common symmetric unimodal model for circular data. Its probability density function is given as follows:
$$f(\theta; \mu, k) = \frac{1}{2\pi I_0(k)}\, e^{k\cos(\theta - \mu)}, \qquad 0 \le \theta < 2\pi$$
where μ is the mean direction, k ∈ [0, ∞) is the concentration parameter around the mean direction (Fisher, 1995; Costa et al., 2014), and
$$I_0(k) = \frac{1}{2\pi}\int_0^{2\pi} e^{k\cos(\phi - \mu)}\, d\phi$$
is the modified Bessel function of order zero.
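As a minimal sketch of the summary statistics described above, the following Python snippet computes the mean direction, the mean resultant length R, and the circular variance, taken here as V = 1 − R. The function name `circular_summary` and the sample angles are illustrative assumptions and are not taken from the study, which processed its data in MATLAB.

```python
import math

def circular_summary(angles_deg):
    """Return (mean direction in degrees, mean resultant length R, circular variance V)."""
    thetas = [math.radians(a) for a in angles_deg]
    n = len(thetas)
    # Average the unit vectors (cos, sin) rather than the angles themselves.
    c_bar = sum(math.cos(t) for t in thetas) / n
    s_bar = sum(math.sin(t) for t in thetas) / n
    mean_dir = math.degrees(math.atan2(s_bar, c_bar)) % 360.0
    r = math.hypot(c_bar, s_bar)   # mean resultant length, between 0 and 1
    v = 1.0 - r                    # circular variance
    return mean_dir, r, v

# Two directions on either side of north: the arithmetic mean is 180 degrees,
# whereas the vector mean points close to 0 (equivalently 360) degrees.
mean_dir, r, v = circular_summary([350.0, 10.0])
print(mean_dir, r, v)
```

The example shows why the arithmetic mean is unsuitable for directions: averaging 350° and 10° arithmetically gives 180°, the opposite of the direction in which both observations point.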
The function $A_p(k)$ of order p is the ratio of the modified Bessel function of order p to that of order zero:
$$A_p(k) = \frac{I_p(k)}{I_0(k)}$$
The distribution function from (Fisher, 1995) is:
$$F(\theta; \mu, k) = \frac{1}{2\pi I_0(k)}\int_0^{\theta} e^{k\cos(\phi - \mu)}\, d\phi$$
The concentration parameter can be estimated using the maximum likelihood method, as the solution of the following equation [18]:
$$A_1(\hat{k}_{ML}) = R$$
where R is the mean resultant length (Fisher, 1995; Bowers et al., 2000). $\hat{k}_{ML}$ can be used to estimate the true value of k when the sample size is small and R < 0.7 (in other words, if R < 0.7, and specifically R < 0.45, then $\hat{k}_{ML}$ is taken as the estimate and there is no need to go to the next step). If the sample size n ≤ 15, a corrected estimate computed from $\hat{k}_{ML}$ and n is preferred (Fisher, 1995; Hakim et al., 2013).
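The maximum likelihood step can be illustrated with a short, standard-library-only Python sketch. It evaluates the modified Bessel functions by their power series, defines the von Mises density, and solves $A_1(k) = R$ for k by bisection. The helper names (`bessel_i`, `a_ratio`, `estimate_k_ml`), the bisection bound `k_max`, and the example value R = 0.6 are assumptions made for illustration; the small-sample correction mentioned above is not included.

```python
import math

def bessel_i(order, k, tol=1e-16):
    """Modified Bessel function of the first kind I_order(k), k >= 0, via its power series."""
    term = (k / 2.0) ** order / math.factorial(order)
    total = term
    m = 0
    while term > tol * total:
        m += 1
        term *= (k / 2.0) ** 2 / (m * (m + order))
        total += term
    return total

def von_mises_pdf(theta, mu, k):
    """Density of VM(mu, k) at angle theta (radians)."""
    return math.exp(k * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i(0, k))

def a_ratio(order, k):
    """A_order(k) = I_order(k) / I_0(k)."""
    return bessel_i(order, k) / bessel_i(0, k)

def estimate_k_ml(r, k_max=500.0, iterations=100):
    """Solve A_1(k) = r by bisection on [0, k_max]; A_1 increases from 0 towards 1.

    The result is capped near k_max when r is extremely close to 1.
    """
    if r <= 0.0:
        return 0.0
    lo, hi = 0.0, k_max
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if a_ratio(1, mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical mean resultant length; not a value from the study.
k_hat = estimate_k_ml(0.6)
print(k_hat, a_ratio(1, k_hat))
```

Bisection is used here only because $A_1(k)$ is monotone in k; any one-dimensional root finder would serve equally well.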
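
Modeling mixture of two von Mises distributions
Generally, a mixture of von Mises distributions is derived from the following equation [29]:
$$f(\theta) = \sum_{j} p_j\, \frac{1}{2\pi I_0(k_j)}\, e^{k_j\cos(\theta - \mu_j)}$$
where the sum runs over the components of the mixture. The probability density function of a mixture of two von Mises distributions is written as (Fisher, 1995; Jones, 2006a; Mooney et al., 2003; Grimshaw et al., 2001):
$$f(\theta) = \frac{p}{2\pi I_0(k_1)}\, e^{k_1\cos(\theta - \mu_1)} + \frac{1 - p}{2\pi I_0(k_2)}\, e^{k_2\cos(\theta - \mu_2)}$$
where $\mu_1$ and $k_1$ represent the vector mean and concentration parameters of the first component; $\mu_2$ and $k_2$ represent the vector mean and concentration parameters of the second component; p is the mixing proportion of the first component; and the proportions satisfy $\sum_{j} p_j = 1$ (Fisher, 1995; Carta et al., 2008a). The five parameters are estimated using the least squares method with numerical optimization. The least squares method has been used in the literature to derive equations (20) to (25), which are used to compute the parameters as described in (Carta et al., 2008a). The initial estimate of the parameters plays a significant role in convergence to the correct estimate. In (Fisher, 1995; Carta et al., 2008a), the method of moments is described; it uses the following equations for the estimation of the five parameters (Jones, 2006a):
$$p\,A_m(k_1)\cos(m\mu_1) + (1 - p)\,A_m(k_2)\cos(m\mu_2) = C_m, \qquad m = 1, 2, 3$$
$$p\,A_m(k_1)\sin(m\mu_1) + (1 - p)\,A_m(k_2)\sin(m\mu_2) = S_m, \qquad m = 1, 2, 3$$
where $A_m(k) = I_m(k)/I_0(k)$ is the ratio of modified Bessel functions (the order is written m here to avoid confusion with the mixing proportion p), and $C_m = \frac{1}{n}\sum_{i=1}^{n}\cos m\theta_i$ and $S_m = \frac{1}{n}\sum_{i=1}^{n}\sin m\theta_i$ are the sample trigonometric moments computed from the n observations. If we denote the set of trial parameters as $\tilde{\mu}_1$, $\tilde{k}_1$, $\tilde{\mu}_2$, $\tilde{k}_2$, and $\tilde{p}$, the aim is to find the residual differences $\Delta C_1$, $\Delta C_2$, $\Delta C_3$, $\Delta S_1$, $\Delta S_2$, and $\Delta S_3$ between the sample moments and the moments implied by the trial parameters, which are used in the least squares criterion to find the parameters of the two von Mises components. $\Delta C_1$ is calculated from the following equation, and all other residual differences are obtained in the same manner (Jones, 2006a):
$$\Delta C_1 = C_1 - \left[\tilde{p}\,A_1(\tilde{k}_1)\cos\tilde{\mu}_1 + (1 - \tilde{p})\,A_1(\tilde{k}_2)\cos\tilde{\mu}_2\right]$$
Then, the sum of squares criterion is:
$$SS = \sum_{m=1}^{3}\left(\Delta C_m^2 + \Delta S_m^2\right)$$
This method can be extended to more than two components of the von Mises distribution.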
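The least squares criterion can be sketched in Python as follows. The snippet computes the first three sample trigonometric moments and evaluates the sum of squared residuals for a trial parameter set of a two-component mixture. It assumes that the moment of order m of a von Mises component equals $I_m(k)/I_0(k)$ times $\cos(m\mu)$ or $\sin(m\mu)$. The synthetic data, the parameter values, and the function names are hypothetical, and no optimizer is included: in practice a numerical minimizer would be started from an initial estimate.

```python
import math
import random

def bessel_i(order, k, tol=1e-16):
    """Modified Bessel function of the first kind I_order(k), k >= 0, via its power series."""
    term = (k / 2.0) ** order / math.factorial(order)
    total = term
    m = 0
    while term > tol * total:
        m += 1
        term *= (k / 2.0) ** 2 / (m * (m + order))
        total += term
    return total

def sample_moments(thetas, max_order=3):
    """Return ([C_1..C_max], [S_1..S_max]), the sample trigonometric moments."""
    n = len(thetas)
    c = [sum(math.cos(m * t) for t in thetas) / n for m in range(1, max_order + 1)]
    s = [sum(math.sin(m * t) for t in thetas) / n for m in range(1, max_order + 1)]
    return c, s

def sum_of_squares(params, c, s):
    """Least squares criterion for trial parameters (mu1, k1, mu2, k2, p), angles in radians."""
    mu1, k1, mu2, k2, p = params
    total = 0.0
    for m in range(1, len(c) + 1):
        a1 = bessel_i(m, k1) / bessel_i(0, k1)
        a2 = bessel_i(m, k2) / bessel_i(0, k2)
        model_c = p * a1 * math.cos(m * mu1) + (1 - p) * a2 * math.cos(m * mu2)
        model_s = p * a1 * math.sin(m * mu1) + (1 - p) * a2 * math.sin(m * mu2)
        total += (c[m - 1] - model_c) ** 2 + (s[m - 1] - model_s) ** 2
    return total

# Synthetic directions drawn from a known two-component mixture (illustrative values only).
random.seed(0)
true_params = (math.radians(100.0), 2.0, math.radians(320.0), 4.0, 0.4)
mu1, k1, mu2, k2, p = true_params
thetas = [random.vonmisesvariate(mu1, k1) if random.random() < p
          else random.vonmisesvariate(mu2, k2) for _ in range(5000)]

c, s = sample_moments(thetas)
wrong_params = (mu1, k1, math.radians(230.0), k2, p)
print(sum_of_squares(true_params, c, s), sum_of_squares(wrong_params, c, s))
```

The criterion is close to zero at the generating parameters and much larger when one mean direction is shifted by 90°, which is the behaviour a numerical optimizer exploits.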

Hypothesis testing
Three types of distributions, i.e., the circular uniform distribution, the von Mises distribution, and the mixture of von Mises distributions, are tested to find the best-fitting model for the data, as illustrated in the flow chart in Fig. 1.
Graphical assessment and formal assessment using 95% confidence intervals for the distributions are used to verify the hypotheses and results. First, the uniformity of the data is checked, and then a graphical assessment is carried out as in (Mardia & Jupp, 2009). Various formal tests are used to check uniformity, such as the test of randomness against any alternative (Kuiper's test for ungrouped data) (Fisher, 1995). The Rayleigh test is used to test the hypothesis of uniformity against a unimodal alternative (Mardia & Jupp, 2009; Jones, 2006a). Watson's $U^2$ test is also employed. It uses a corrected mean square deviation, and the statistic $U^2$ is invariant under rotation and reflection. $U^2$ is based on comparing the theoretical cumulative distribution function with the empirical distribution function (Jones, 2006a):
$$U^2 = \sum_{i=1}^{N}\left[U_i - \bar{U} - \frac{2i - 1}{2N} + \frac{1}{2}\right]^2 + \frac{1}{12N}$$
where $U_i$ is the theoretical cumulative distribution function evaluated at the ith sorted observation, $\bar{U}$ is the average of the $U_i$, and N is the total number of records.
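The two uniformity statistics can be sketched in Python as follows, under stated assumptions: Watson's $U^2$ is computed with the uniform distribution function θ/(2π) as the theoretical CDF, and the Rayleigh statistic is taken in one common form, Z = nR², where R is the mean resultant length. Function names and the sample data are hypothetical, and no critical values or p-values are computed.

```python
import math

def watson_u2_uniform(thetas):
    """Watson's U^2 statistic for the null hypothesis of circular uniformity (angles in radians)."""
    n = len(thetas)
    # Under uniformity, the theoretical CDF at angle t is t / (2*pi).
    u = sorted((t % (2.0 * math.pi)) / (2.0 * math.pi) for t in thetas)
    u_bar = sum(u) / n
    total = sum((u_i - u_bar - (2 * i - 1) / (2.0 * n) + 0.5) ** 2
                for i, u_i in enumerate(u, start=1))
    return total + 1.0 / (12.0 * n)

def rayleigh_statistic(thetas):
    """Rayleigh statistic Z = n * R^2, with R the mean resultant length."""
    n = len(thetas)
    c_bar = sum(math.cos(t) for t in thetas) / n
    s_bar = sum(math.sin(t) for t in thetas) / n
    return n * (c_bar ** 2 + s_bar ** 2)

# Evenly spaced directions (as uniform as possible) versus tightly clustered ones.
even = [2.0 * math.pi * (i - 0.5) / 12 for i in range(1, 13)]
clustered = [0.01 * i for i in range(20)]
print(watson_u2_uniform(even), rayleigh_statistic(even))
print(watson_u2_uniform(clustered), rayleigh_statistic(clustered))
```

Both statistics are small for the evenly spaced sample and large for the clustered one; large values lead to rejection of uniformity once compared with the appropriate critical value.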
The second step is to test the hypothesis of a von Mises distribution. The goodness of fit is measured using the $R^2$ coefficient, as in (Masseran et al., 2013; Carta et al., 2008a; Carta et al., 2008b). Watson's $U^2$ test is also used to verify whether the distribution is a von Mises distribution. In this case, Watson's $U^2$ test may take different forms depending on whether the vector mean and concentration parameters are known (specified in advance) or not (both known, one of them known, or both unknown) (Jones, 2006a). The last step is to check the goodness of fit of the finite mixture of two von Mises distributions using the $R^2$ coefficient, as in (Masseran et al., 2013; Carta et al., 2008a; Carta et al., 2008b). The mixture of von Mises distributions is also tested using the chi-square test to check whether the mixture of two von Mises distributions is the best fit to the data (Heckenbergerová et al., 2013).
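A goodness-of-fit check of this kind can be sketched as below. The snippet assumes the usual coefficient of determination, $R^2 = 1 - SS_{res}/SS_{tot}$, computed between the empirical cumulative probabilities and the cumulative probabilities of a fitted model on a common grid of angles; the cited works may define the coefficient slightly differently. The helper names and the toy data are hypothetical.

```python
import math

def empirical_cdf(thetas, grid):
    """Fraction of observations at or below each grid angle (grid must be ascending)."""
    data = sorted(thetas)
    n = len(data)
    out, j = [], 0
    for g in grid:
        while j < n and data[j] <= g:
            j += 1
        out.append(j / n)
    return out

def r_squared(empirical, fitted):
    """Coefficient of determination between empirical and fitted cumulative probabilities."""
    mean_emp = sum(empirical) / len(empirical)
    ss_res = sum((e - f) ** 2 for e, f in zip(empirical, fitted))
    ss_tot = sum((e - mean_emp) ** 2 for e in empirical)
    return 1.0 - ss_res / ss_tot

# Toy check: evenly spread directions compared with the circular uniform CDF.
thetas = [2.0 * math.pi * (i + 0.5) / 360 for i in range(360)]
grid = [2.0 * math.pi * j / 36 for j in range(1, 37)]
emp = empirical_cdf(thetas, grid)
uniform_cdf = [g / (2.0 * math.pi) for g in grid]
print(r_squared(emp, uniform_cdf))
```

For a fitted von Mises mixture, `uniform_cdf` would be replaced by the mixture's cumulative distribution evaluated on the same grid, for example by numerical integration of its density.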

Case study
In this study, we use two data sets from two different locations in the UAE. The UAE is located in the southeastern part of the Arabian Peninsula (Ouarda et al., 2014). Both stations studied in this paper are located in the coastal region of the UAE. For the first site, the data are recorded at Abu Dhabi International Airport (ADIA), which is located at 24°26′N and 54°39′E. The data are measured at 27 m height from 1982 to 2010. The second site is located in the Al Badiyah region, which is near the coast and the mountains, at 25°24′N and 56°20′E. The wind direction and speed were simulated 24 h daily from 2003 to 2012 at five different elevations (10 m, 50 m, 80 m, 100 m, and 120 m) (Mooney et al., 2003). The locations of the two selected sites are illustrated in the map of Fig. 2. ADIA is located near the coast bordering the Arabian Gulf, while Al Badiyah is located near the coast bordering the Gulf of Oman and is surrounded by a mountainous region. Figure 3 presents the wind rose diagrams at 10 m height for the Al Badiyah and ADIA stations. The wind rose diagrams corresponding to other elevations at Al Badiyah are very similar to the one corresponding to the 10 m height. Table 1 provides an overview of the descriptive statistics of the data for the different elevations at Al Badiyah and at ADIA.

Results
The objective of this work is to model wind direction distributions at two different locations in the UAE. For this, we explore three distribution models, i.e., the circular uniform distribution, the von Mises distribution, and the mixture of two von Mises distributions. An appropriate model is selected based on the tests described in section 3. The data were processed using the CircStat toolbox (Jones, 2006a; Jones, 2006b) in MATLAB R2014.

Test of uniformity
The uniformity of the Al Badiyah and ADIA data was first verified based on graphical assessment using wind rose diagrams to represent wind direction as shown in Fig. 3. Wind directions are pointing to the east and to the west/northwest for ADIA and Al Badiyah, respectively. It is evident from the plots that the uniform distribution is not a good fit for the data.
The hypothesis is also formally tested using the Rayleigh test under two assumptions, i.e., when the vector mean is known and when it is unknown. The hypothesis is also tested using Watson's $U^2$ test. The results of the formal assessment of the hypothesis at the 0.05 significance level are shown in Tables 2, 3, and 4 for the Rayleigh test when the vector means are unknown, the Rayleigh test when the vector means are known, and Watson's $U^2$ test, respectively. It can be seen that the circular uniform distribution is not a good fit for the data at either location. The test results indicate that there is a significant variation in the wind direction.

Test of von Mises distribution
The graphical assessment of the Al Badiyah data is shown in Fig. 4, which indicates a shift in direction to the right (north) with increasing height. The von Mises distribution seems to be a good fit at the 10 m height but deviates at the other heights. This indicates that the direction changes as the height increases; the highest speed is recorded at the altitude of 120 m (Fig. 5). The graphical assessment of ADIA is shown in Fig. 6. The hypothesis was tested using Watson's $U^2$ test (Tables 5 and 6). The goodness of fit was measured using the $R^2$ coefficient, as shown in Fig. 6 for Al Badiyah and in Fig. 7 for ADIA (Table 7). The tests show that the von Mises distribution is not a good fit for either data set.

Test of mixture of von Mises
The least squares estimates of the parameters of the two-component von Mises mixture are shown in Table 8. For the Al Badiyah data set, the table shows two concentration parameters and two directional means. The concentration parameter $k_2$ is higher than $k_1$, which means that the wind direction is highly concentrated in the western direction. The concentration parameter around the directional mean is higher at the 10 m elevation than at the other elevations. It can be observed that as the elevation increases, the concentration parameters decrease. This may be due to the overlap of wind directions with height. The ADIA data set also shows that the concentration parameter $k_2$ is higher than $k_1$, with the wind direction concentrated in the northwest. The results in Table 8 are confirmed by the separation of the two von Mises components in Fig. 8 (Al Badiyah) and Fig. 9 (ADIA). The plots at the four other elevations are approximately similar to Fig. 8. The goodness of fit of the FMVM was then measured using the $R^2$ coefficient, as shown in Fig. 10, Fig. 11, and Table 9. These results illustrate that the FMVM and the empirical cumulative probability are approximately identical, and the $R^2$ coefficient is equal to one for the heights 10 m, 50 m, and 80 m and at the ADIA station. For the directions at the 100 m and 120 m heights, the $R^2$ coefficient takes the value 0.9999. The result is comparable with (Carta et al., 2008a), as presented in Fig. 12.
The chi-square test was carried out to verify that the best fitting model is the FMVM as presented in Table 10. It is clear from the results corresponding to both stations that the finite mixture of two von Mises distributions is the best fitting model for wind direction.

Discussion and conclusions
Since wind speed and direction vary from one place to another, data should be gathered at various locations. If the circular uniform model is accepted, this means that the wind blows equally in all directions (360 degrees), so little effort is needed to align the wind turbine, since it will gather the same amount of energy in any orientation. If instead the wind blows in a particular direction, this should be considered when planning a wind farm. If the wind blows in two or more main directions, this must be taken into account so that the turbines are placed to capture the most energy. This means that the best-fitting model and the number of von Mises components depend on the nature of the data. To conclude, the first case study, which deals with modeling wind directions at the Al Badiyah station at five different heights, shows that the mixture of two von Mises distributions fits the data better than the circular uniform and von Mises distributions. The mixture of von Mises model indicates that there  The second case study models wind direction at Abu Dhabi International Airport during the period 1982 to 2010. Wind direction at ADIA points to two main directions. The dominant wind direction is concentrated in the northwest (324.1°), and the second direction points to the southeast (107.7°). This shows that wind direction characteristics at the ADIA station are significantly different from those at the Al Badiyah station. This points to the spatial variability of wind direction in the UAE and the need for a more detailed assessment of wind directional distributions at any specific site of interest. The chi-square value is lower than the critical value of 3.841 at the α = 0.05 significance level, so the hypothesis that the wind has two main directions cannot be rejected.
Future research efforts may focus on the association between wind direction and speed, e.g., using linear-circular association methods (Fisher, 1995) in order to accurately estimate the wind power and better plan the development of wind farms. Future research may also explore the use of bivariate (wind speed and direction) and multivariate (wind speed and direction and air density) models using multivariate kernel density estimates (Zhang et al., 2013).

Declarations
Conflict of interest The author(s) declare that they have no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.