1 Introduction

Dwindling fossil-fuel resources and the need for climate-friendly energy impose the utilization of renewable energy resources (RES) in diverse fields of daily life [1]. Towards decarbonisation, several energy sectors are being electrified in a global attempt to increase the integration of renewable electricity into existing power networks. However, the intermittency and uncertainty of the RES potential pose serious challenges to modern power systems, their planners and operators. By lowering the contribution of controllable generation, the capability of adjusting production to consumption (via frequency regulation) is decreased unevenly. Hence, the addition of renewable portions in total generation calls for enhanced flexibility levels (translated as spinning reserve margins) in order to retain system stability and reliability, which in turn raises the total generation cost [2].

Meanwhile, sustainability targets have already been set and, in order to ease their expensive spinning-reserve requirements, some efforts have been made towards determining the optimal RES share per case. Up to a certain share, renewable resources provide viable and advantageous solutions, while at considerably higher levels they become infeasible. To this end, various electricity storage (ES) technologies are examined, taking into account the annual income, emission avoidance and RES increase against the cost [3]. Different scientific opinions on this aspect do exist, stating that any attempt to determine the optimal ES system size that does not explore the sustainability of a probable application and future raw-material reserves cannot be taken into account [4]. The uncertainty worsens in 100% RES paradigms, where hydrogen from renewable production processes has to replace hydrocarbon fuels. Issues like feedstock cost and content variation, security of supply, transportation, and domestic storage and utilization need to be studied in depth [5].

On a radically different axis, the stochastic and variable availability of RES can be handled far better through prediction. With the use of artificial intelligence tools in recent years, forecasting has reached a relevant accuracy level in the field of RES as well. The general mechanism finds ready application in the prediction of solar irradiation, wind speed, river stage and flood level, sea-wave height, tides and currents, water gushing from geothermal wells and biomass yield. Based on the multiple stages followed to result in electricity and its global potential, wind lies among the most promising renewable resources and is worth investigating [6]. According to [7], traditional renewables including hydro, wind, solar, biomass and geothermal represent ~ 2000 GW (or 30% of worldwide installed capacity), supplying nearly 25% of total electricity during 2017. Wind power, split into 90% onshore and 10% offshore, accounted for one quarter of RES capacity, following hydro which constitutes ~ 16% of renewable power or 4% of total electricity production. In this context, wind can be regarded as a major resource towards a carbon-free and sustainable economy, with massive expansion expectations as high as 977 GW in 2030 (905 GW onshore and 72 GW offshore wind power).

Numerical assessments on wind potential and installation arrangements in urbanized areas reveal that a considerably high share will be installed in distribution networks [8]. As a result, the turbulence intensity caused by existing obstacles, ground geometry and the presence of upstream objects must be taken into account. The intermittency issues are also examined in Cai and Bréon [9], where the great temporal variability of wind power at higher penetration levels is proposed to be mitigated through spatial aggregation techniques. These techniques must consider multi-input prediction tools with the consolidation of both frequent and infrequent events at regional and national scale. In general, wind power forecasting models can be divided per applied methodology into persistence, physical, statistical and hybrid [10]. Persistence and physical methods do not need to be trained with historical data. Instead, they depend on highly correlated or physical data and, thus, they become favourable for short-term forecasting [11]. On the contrary, statistical approaches train a forecast model by developing the linear and non-linear relationships between historical data (such as wind speed, wind direction and temperature) and the generated power. A comprehensive review of the multi-objective optimization technologies in wind prediction is made in Liu et al. [12], with reference to four critical data pre-processing techniques.

The development of artificial intelligence and machine learning techniques certainly benefits the advancement of energy prediction. By hybridizing artificial neural networks and regression techniques, the non-linearity in the wind-speed time series can be addressed [13]. Auto-regressive, moving average, least absolute shrinkage and selection operator, k-nearest neighbour, gradient boosting decision tree, random forest, Gaussian process, quantile and support vector are some of the most relevant regression algorithms available for combination with the family of artificial neural networks (ANNs) (including perceptron, feed-forward, multi-layer perceptron, radial basis, convolutional, recurrent, long short-term memory, deep belief and auto-encoder) [13,14,15,16].

A further, very common family that performs satisfactorily in forecasting tasks is known as fuzzy systems. Fuzzy logic models are suitable for load prognosis when the historical data are expressed by linguistic terms [17]. They are introduced to accelerate convergence and lower the computational times usually encountered in ANNs. Specifically, rather than making use of individual variants and targets, they can exploit single-valued ensembles and evaluate their membership under the concept of multiple linear regression to construct a final prediction [18]. Otherwise, the non-linear stochastic relationship problem can be handled by utilizing weighted fuzzy time series and combined linear and non-parametric algorithms [19]. Exploiting the merits of ANNs and fuzzy systems, the hybrid approaches offer impressive trade-offs between accuracy and convergence time. However, in multi-predictor prognosis tasks they are comparable with another family named regression trees.

Regression trees constitute a method based on the use of a decision tree as a predictive model, particularly used in data mining and machine learning. The authors in [20] presented four regression-tree methods, namely normal, pruned, boosted and bagged, for 1–6 h global horizontal irradiation prediction. A 30-min ahead wave-height forecasting for electricity generation was carried out in [21], by making use of hybridized multiple regression algorithms including regression trees. The employed regression trees were constructed on a binary decision structure to establish the association between predictors and response variables. An innovative home-energy consumption predictive model is demonstrated in Nie et al. [22]. Based on experimental evaluations, the gradient boosting regression tree algorithm predicts the energy consumption with greater accuracy than alternative approaches, in terms of performance indices like Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE).

Table 1 lists some representative models for wind power prediction. Although a large body of research has been devoted to forecasting, regression trees have been under-explored in the direction of wind power prediction. Some limited studies investigated the performance of regression trees in wind-speed forecasting for very short-term problems [23] and in probabilistic wind-direction forecasting [24]. Based on the existing literature, most studies focus on the prognosis of wind based on weather conditions, with the aim of providing the wind power output relying only on the manufacturer’s power curve. This may offer acceptable results when only one turbine is considered at a time. The spatial correlation is not taken into account, though it is among the critical parameters in wind farms and urban generating complexes. For example, the impact of probable turbulence at specific wind directions must be incorporated if the prognosis target concerns an aggregated wind-power output. A relevant work by Barber and Nordborg [25] studied these incoherent phenomena and concluded that both wind turbulence intensity and shear, along with site-specific tip deflection, are becoming quite important as wind-turbine blades are getting longer and more flexible. Although the total wind energy production of peninsular Spain is predicted in Torres-Barrán et al. [26], the forecasts depend on input patterns given for several weather variables at each of the points that cover the areas under study. However, given the increasing occurrence of wind turbines in the forthcoming distribution networks, RES producers must be facilitated in making day-ahead forecasts using open-access data available online.

Table 1 Representative wind-power prediction models

Towards this direction, in this work a new formulation is developed to extract the most critical parameters affecting the aggregated contribution of wind parks. To associate the actual aggregated wind penetration with the predictors, the curtailment of RES is also considered. The rest of the paper is organized as follows. In Sect. 2, emphasis is placed on the formulation of the problem to determine the most critical parameters for model training. In Sect. 3, the minimum required input variables for forecasting are indicated and explained based on the mathematical framework. The extensive experimental evaluations and obtained results are analysed and discussed in Sect. 4, providing the input/output characteristics of different geographical areas. Finally, the conclusions are drawn in Sect. 5.

2 Mathematical framework

The prospects of increasing the contribution of wind in distributed generation have been growing continuously in recent years. Moving from centralized generation schemes to dispersed production patterns, there is a need for developing expert systems able to accurately predict and properly manage the power flows in a bi-directional manner. In this smart-grid procedure, forecasting holds a major role which is often decisive for the maximum penetration of wind in the energy mixture.

It is well known that the power extracted by a specific wind turbine (Pwt) is proportional to the cube of the wind speed (v), the area swept by the rotor blades (A) and the air density (ρair). Since the kinetic energy of the air cannot be entirely captured, the power coefficient CP, bounded by the Betz limit of ~ 59.26%, is applied to form the following equation [27]:

$$P_{wt} = \frac{1}{2}C_{P} \rho_{air} Av^{3}$$
(1)

According to the type and scale of the wind turbine, most manufacturers define the kinetic energy converted into electricity as a function of the rated power (Pr) and rated wind speed (vr) with respect to the cut-in (vi) and cut-out (vo) speed. In this regard, the following formulation can be used [28]:

$$P_{wt} \left( t \right) = \left\{ {\begin{array}{*{20}c} {P_{r} ,} & { v_{r} \le v\left( t \right) \le v_{o} } \\ {P_{r} \frac{{v\left( t \right) - v_{i} }}{{v_{r} - v_{i} }},} & {v_{i} < v\left( t \right) < v_{r} } \\ {0,} & {v\left( t \right) \le v_{i} \;or\;v\left( t \right) > v_{o} } \\ \end{array} } \right.$$
(2)
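The piecewise power curve of Eq. 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `turbine_power` and the turbine ratings used in the example (2 MW, 3/12/25 m/s) are hypothetical and are not taken from the studied wind farms.

```python
def turbine_power(v, p_rated, v_in, v_rated, v_out):
    """Piecewise-linear power curve of Eq. 2 (result in the units of p_rated)."""
    if v <= v_in or v > v_out:
        return 0.0                      # below cut-in or above cut-out
    if v < v_rated:
        # linear ramp between cut-in and rated wind speed
        return p_rated * (v - v_in) / (v_rated - v_in)
    return p_rated                      # rated region: v_rated <= v <= v_out


# Hypothetical 2 MW turbine: cut-in 3 m/s, rated 12 m/s, cut-out 25 m/s.
for speed in (2.0, 7.5, 12.0, 26.0):
    print(speed, turbine_power(speed, 2.0, 3.0, 12.0, 25.0))
```

The boundary cases follow the inequalities of Eq. 2 literally, so a speed exactly equal to the cut-out value still yields the rated power.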

This way, the total wind power (Pw) obtained by a wind park or a complex of wind parks consisting of N wind turbines can be calculated by Eq. 3, provided that the wind speed v(t) is known (or predicted) for each individual wind turbine.

$$P_{w} \left( t \right) = \mathop \sum \limits_{i = 1}^{N} P_{wt} \left( t \right)$$
(3)

At the desired height, air density plays a critical role in the estimation of wind power in aggregated generation plants. To examine how the latter is affected by weather and site-specific conditions, the following analysis takes place [29].

Air density is defined as the mass per unit volume of the Earth's atmosphere. Like pressure (p), it decreases with increasing altitude. It also changes with fluctuating temperature (T) or relative humidity (RH). At sea level and 15 °C, air has a density approximately equal to 1.225 kg/m³. At the point of connection between the wind turbine shaft and blades, its value decreases according to the prevailing weather conditions. The density of dry air can be evaluated considering the ideal gas law, expressed as a function of T and p.

$$\rho = \frac{p}{{R_{s} T}}$$
(4)

Measuring the absolute pressure in pascal (Pa) and temperature in kelvin (K), the specific gas constant of dry air (Rs) is 287.052874 J/(kg K), assuming a mean molar mass for dry air of 28.964917 g/mol. This quantity can vary slightly depending on the molecular composition of air at a particular location.

Adding water vapour to the air (making the air moist) decreases its density, which may seem counterintuitive at first. This is because the molecular mass of water (18 g/mol) is less than the molecular mass of dry air (~ 29 g/mol). For any gas, at a given temperature and pressure, the number of molecules present is constant for a given volume according to Avogadro’s Law. Thus, when water molecules (water vapour) are added to a given volume of air, the dry air molecules must decrease by the same number, in order to prevent the pressure or temperature from rising. Hence, the mass per unit volume of the gas decreases. In other words, the density of moist air can be calculated as a mixture of ideal gases. In this case, the partial pressure of water vapour (pv) is known as vapour pressure and, using this method, the deviation error in density becomes less than 0.2% in the range from − 10 °C to 50 °C. The density of humid air (ρha) is found by Eq. 5.

$$\rho_{ha} = \frac{{p_{d} }}{{R_{d} T}} + \frac{{p_{v} }}{{R_{v} T}} = \frac{{p_{d} M_{d} + p_{v} M_{v} }}{RT},$$
(5)

where pd is the partial pressure of dry air, T is the ambient temperature in K, Rd and Rv are the specific gas constants equal to 287.053 J/(kg K) and 461.495 J/(kg K) for dry air and water vapour, respectively, Md and Mv correspond to the molar masses of dry air (28.964 g/mol) and water vapour (18.016 g/mol), and R constitutes the universal gas constant of 8.314 J/(K mol).

The vapor pressure of water can be estimated from the saturation vapor pressure (psat) and relative humidity as \(p_{v} = RH \cdot p_{sat}\). In turn, the saturation vapor pressure at any given temperature is the vapor pressure when RH equals 100% [30]. The formula used to define psat as a function of the temperature in °C (Tc) is:

$$p_{sat} = 6.1078 \cdot 10^{{\frac{{7.5T_{c} }}{{237.3 + T_{c} }}}}$$
(6)

Note that the result is given in hPa, where 1 hPa equals 100 Pa or 1 mbar. Then, the partial pressure of dry air can be estimated from the vapour and absolute observed pressures as \(p_{d} = p - p_{v}\). At this point, the altitude (h) can be incorporated, introducing additional parameters and constants with reference to sea level. These account for the sea-level atmospheric pressure po (101.325 kPa), the sea-level standard temperature To (288.15 K), the earth-surface gravitational acceleration g (9.81 m/s²) and the temperature lapse rate L (0.0065 K/m). Consequently, the temperature and pressure can be determined based on the following equations.

$$T = T_{o} - L \cdot h$$
(7)
$$p = p_{o} \left( {1 - \frac{Lh}{{T_{o} }}} \right)^{{\frac{{gM_{d} }}{RL}}}$$
(8)
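As a quick numerical check of Eqs. 7 and 8, the following Python sketch evaluates temperature and pressure at a given altitude using the constants listed above. The function names are illustrative, and the molar mass of dry air is converted to kg/mol here (an assumption needed for the exponent to be dimensionless).

```python
T0 = 288.15       # sea-level standard temperature, K
P0 = 101325.0     # sea-level atmospheric pressure, Pa
LAPSE = 0.0065    # temperature lapse rate, K/m
G = 9.81          # gravitational acceleration, m/s^2
M_D = 0.028964    # molar mass of dry air, kg/mol (28.964 g/mol converted)
R = 8.314         # universal gas constant, J/(K mol)


def temperature_at(h_m):
    """Eq. 7: temperature in K at altitude h_m (metres)."""
    return T0 - LAPSE * h_m


def pressure_at(h_m):
    """Eq. 8: pressure in Pa at altitude h_m (metres)."""
    exponent = G * M_D / (R * LAPSE)   # roughly 5.26, dimensionless
    return P0 * (1.0 - LAPSE * h_m / T0) ** exponent


print(temperature_at(1000.0), pressure_at(1000.0))
```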

The air density can be calculated by the molar form according to the ideal gas law as:

$$\rho_{air} = \frac{pM}{{RT}},$$
(9)

where M is the actual molar mass of air (found by R/Rs) and p is measured in Pa. Combining the aforementioned formulation, the final form of the air density as a function of ambient temperature in °C (Tc), relative humidity (RH, expressed as a fraction) and altitude (h, in km) is expressed as follows:

$$\rho_{air} = \frac{1}{{287.05\left( {T_{c} + 273.15} \right)}}\left[ {101325\left( {1 - 0.0226h} \right)^{5.256} - 230.87484RH \cdot 10^{{\frac{{7.5T_{c} }}{{237.3 + T_{c} }}}} } \right]$$
(10)
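To illustrate how Eq. 10 follows from the mixture-of-ideal-gases view, the sketch below evaluates the closed form next to a two-gas computation built from the partial pressures of dry air and water vapour. Two points are assumptions inferred from the constants rather than stated in the text: the altitude is taken in km (since L/To ≈ 0.0226 per km) and RH is taken as a fraction (since 230.87484 ≈ 0.378 × 610.78 Pa, with 0.378 ≈ 1 − Rd/Rv). Function names are illustrative.

```python
R_D = 287.05     # specific gas constant of dry air, J/(kg K)
R_V = 461.495    # specific gas constant of water vapour, J/(kg K)


def pressure_pa(h_km):
    """Barometric term of Eq. 10; h_km is assumed to be in kilometres."""
    return 101325.0 * (1.0 - 0.0226 * h_km) ** 5.256


def saturation_pressure_pa(t_c):
    """Saturation vapour pressure in Pa, using the exponent that appears in Eq. 10."""
    return 610.78 * 10.0 ** (7.5 * t_c / (237.3 + t_c))


def air_density_eq10(t_c, rh, h_km):
    """Closed form of Eq. 10; rh is assumed to be a fraction in [0, 1]."""
    vapour_term = 230.87484 * rh * 10.0 ** (7.5 * t_c / (237.3 + t_c))
    return (pressure_pa(h_km) - vapour_term) / (R_D * (t_c + 273.15))


def air_density_mixture(t_c, rh, h_km):
    """Same quantity as the sum of dry-air and water-vapour densities."""
    t_k = t_c + 273.15
    p_v = rh * saturation_pressure_pa(t_c)   # vapour partial pressure
    p_d = pressure_pa(h_km) - p_v            # dry-air partial pressure
    return p_d / (R_D * t_k) + p_v / (R_V * t_k)


print(air_density_eq10(15.0, 0.0, 0.0))      # dry air at sea level and 15 °C
print(air_density_eq10(25.0, 0.6, 0.5), air_density_mixture(25.0, 0.6, 0.5))
```

Under these assumptions the two routes agree closely, and the dry sea-level case reproduces the familiar value of about 1.225 kg/m³.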

Referring back to the definition of wind power, one can observe that the wind direction does not affect the power output from a turbine. Nonetheless, it is very important for the spatial association between the allocation of turbines and the aggregated power. In the wake of a wind turbine, each region is independently affected based on the wind inflow. This generates the requirement to study the power generation considering distinct turbulent locations based on the varying wind direction [31]. Finally, the total contribution of wind is limited by another factor, imposed by the distribution system operators (DSOs). This factor stems from the overall system inertia for frequency regulation and depends on the optimal unit commitment schedule. It constitutes a national percentage for RES curtailment, giving a set-point to producers for the maximum accepted power output, regardless of whether a unit (or a whole park) is fully available for electricity provision or not.

3 Data harvesting and processing

Up to this point, the most critical parameters that can change the production rate of wind were taken into account, excluding the planned outages and unexpected interruptions of any kind (e.g. for maintenance, upgrade or failure) at a single wind turbine or a group of wind turbines. To construct an appropriate list of the minimum predictors needed to make forecasts, we return to the formulation of wind production (Eq. 1). From this definition, the required predictors are all parameters that cannot be controlled by the wind turbine, its manufacturer or the owner.

Although the exploited energy from wind can be regulated via the pitch angle (β), the yaw angle (γ) and the gear-box ratio (λ), these parameters are controlled by the wind turbine mechanisms. The area swept by the blades is a constant feature based on the manufacturer’s blade length, while the height and exact location of installation are up to the plant owner. To this end, the uncontrollable variables that need to be collected are the wind speed, wind direction and air pressure. Wind direction is measured in degrees (°) in the range of 0–360°, defining a wind blowing from the north as 0° (or 360°) and one blowing from the east as 90°. Turning back to Eq. 10, one can see that air density depends on the controllable parameter of altitude as an extension of the turbine’s height, and on the uncontrollable variables of temperature and humidity, which form the remaining two predictors.

Since a reduction in aggregated generation, as a consequence of a partial outage of a single unit or a group of units, is very difficult to determine, a time-varying system-capacity factor (csys) can be used as a predictor towards accurate forecasts. Together with the RES curtailment factor (fRES), it defines the total available capacity of the wind generation under consideration. These factors are defined as:

$$c_{sys} \left( t \right) = \frac{{P_{wind}^{actual} \left( t \right)}}{{P_{wind}^{cap} \left( t \right)}}$$
(11)
$$f_{RES} \left( t \right) = \frac{{P_{RES}^{cut} \left( t \right)}}{{P_{RES}^{cap} \left( t \right)}},$$
(12)

where \(P_{wind}^{actual}\) denotes the current availability of wind systems, \(P_{wind}^{cap}\) the capacity of wind systems in total, \(P_{RES}^{cut}\) the curtailed RES and \(P_{RES}^{cap}\) the total capability of RES.

To gain a broader overview with respect to the magnitude of RES curtailment, the term of spinning reserve (SR) has to be defined. Spinning reserve refers to the available power capacity that can be directly attributed to the system from the committed (online) conventional generating units. Due to some operational constraints, such as minimum elapsed times between consecutive generator start-ups or shut-downs, rate of change (upward and downward) of their power output, maximum achievable number of status change at a time and uninterruptible operation of some must-run units, this amount of power must be guaranteed in order to retain system stability and reliability levels. Thus, the excess power from RES is curtailed in an attempt to sustain the required SR, without violating the imposed constraints [32, 33].

SR consists of a constant, a bi-basic and a variable part. The constant value of SR accounts for an unexpected failure of a sizeable generator that is online. The bi-basic quantity takes a high value during peak hours and a lower one during off-peak hours, whereas a time-dependent portion is used to include the uncertainty due to RES contribution. In any case, this contribution is restricted by the minimum capacity (Pmin) of the must-run units M and hence, the excess RES is curtailed to avoid their de-commitment. This can be formulated so that [34]:

$$P_{RES}^{cut} \left( t \right) + SR\left( t \right) \ge P_{load} \left( t \right) - \left[ {\mathop \sum \limits_{m = 1}^{M} P_{m}^{\min } \left( t \right) + \mathop \sum \limits_{i = 1}^{G} P_{i} \left( t \right) + P_{RES}^{cap} \left( t \right)} \right],\;\forall t \in T,$$
(13)

where Pi expresses the actual power by each generator i and must-run generator m, while Pload represents the actual load demand during the interval t. Identifying that the aggregated wind power is expressed by Eq. 14, the last predictor needed is the actual deviation between real generating power and minimum capacity of the must-run units, when the rest of conventional generators are off-line.

$$P_{w} \left( t \right) = c_{sys} f_{RES} \mathop \sum \limits_{i = 1}^{N} P_{wt} \left( t \right)$$
(14)
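Eq. 14, together with the ratios of Eqs. 11 and 12, amounts to scaling the summed turbine output by two factors. A minimal Python sketch is given below; the helper names and the numbers in the example are hypothetical and serve only to show the arithmetic.

```python
def capacity_ratio(numerator, capacity):
    """Generic ratio used in Eqs. 11 and 12 (e.g. c_sys or f_RES)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return numerator / capacity


def aggregated_wind_power(turbine_powers, c_sys, f_res):
    """Eq. 14: summed turbine output scaled by c_sys and f_RES."""
    return c_sys * f_res * sum(turbine_powers)


# Hypothetical snapshot: three turbines, half of the park available,
# and a curtailment factor of 0.5 supplied by the operator.
c_sys = capacity_ratio(41.0, 82.0)
print(aggregated_wind_power([1.0, 2.0, 1.0], c_sys, 0.5))
```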

4 Experimental evaluation

In order to evaluate the performance of the proposed approach, the implementation of regression trees is carried out considering three case studies for the paradigm of Cyprus. The case studies take place under three geographical paradigms to represent the distributed-generation diversity of the island. The input/output characteristics are explained along with the forecaster development. The obtained models are compared based on the metrics of MAE, RMSE and MARNE.

To justify the selection of the most appropriate input features, the Pearson method and mutual information are utilized. The Pearson method estimates the correlation coefficients between the input features and the wind-power output, whereas mutual information quantifies their mutual dependency. Based on Eq. 15, the correlation coefficient \(\rho \in \left[ { - 1, + 1} \right]\) considering the predictors xi and output Yi, and their mean values \(\overline{x}\) and \(\overline{Y}\), respectively, is defined as follows [35]:

$$\rho = \frac{{\sum \left( {x_{i} - \overline{x}} \right)\left( {Y_{i} - \overline{Y}} \right)}}{{\sqrt {\sum \left( {x_{i} - \overline{x}} \right)^{2} \sum \left( {Y_{i} - \overline{Y}} \right)^{2} } }}.$$
(15)
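Eq. 15 translates directly into code. The sketch below is a plain-Python illustration (the function name `pearson` and the sample values are hypothetical); it raises an error when one of the series is constant, since the coefficient is undefined in that case.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of Eq. 15 for two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical wind speeds (m/s) against power output (MW).
print(pearson([3.0, 5.0, 7.0, 9.0], [0.1, 0.6, 1.2, 1.9]))
```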

Treating the predictor/target pairs as random variables (X,Y), their mutual dependencies can be obtained based on Eq. 16, taking into account the number of their respective states Sn and Sm, their joint P(xn,Ym) and marginal probabilities P(xn) and P(Ym). The mutual information \(I \in \left[ {0, + \infty } \right)\) is determined as [36]:

$$I\left( {X;Y} \right) = \mathop \sum \limits_{n = 1}^{{S_{n} }} \mathop \sum \limits_{m = 1}^{{S_{m} }} P\left( {x_{n} ,Y_{m} } \right)\log \frac{{P\left( {x_{n} ,Y_{m} } \right)}}{{P\left( {x_{n} } \right)P\left( {Y_{m} } \right)}}$$
(16)
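For discrete states, Eq. 16 can be estimated from empirical frequencies. The following sketch assumes that the predictor and target have already been binned into state labels (the binning itself is not specified in the text) and uses the natural logarithm, which is also an assumption; the function name is illustrative.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (Eq. 16) of two paired label sequences, in nats."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (x_n, Y_m) pair
    count_x = Counter(xs)          # marginal counts of x states
    count_y = Counter(ys)          # marginal counts of Y states
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


# Identical labels share all their information; crossed labels share none.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

Only state pairs that actually occur contribute to the sum, which avoids evaluating the logarithm at zero probability.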

Among the input features of wind speed, wind direction, generator rotor speed, pitch angle, yaw angle, ambient temperature, relative humidity, air pressure, gear ratio and turbine-hub height, only wind speed and air pressure exhibit both a high correlation (\(\left| \rho \right| \ge 0.20\)) and a high dependency (\(I \ge 0.25\)) with the wind power.

4.1 Input/output variables

Actual data for the wind power generated from three different wind farms were obtained from the Cyprus Energy Regulatory Authority (CERA). The wind farms are located in the east, in the west and in the mountains in the centre of the country. Hence, they are connected to independent distribution networks which are under the administrative and operational ownership of different districts. Distinguishing the plants by location into eastern, western and central mountainous, their respective power output during 2020 can be seen in Fig. 1. As can be observed, no clear relationship exists between the produced power and the season, month, day-type or time.

Fig. 1 Aggregated wind power output pertaining to the (a) western coast, (b) eastern coast and (c) central mountainous areas

To depict the inputs, Fig. 2 shows the respective histograms of wind direction at the mentioned sites, while the average wind speeds are included in Fig. 3. It is obvious that they vary according to the altitude and morphology of the area. In a similar way, the ambient temperature varies with location and season, as illustrated in Fig. 4. The inherent association between temperature, wind speed and therefore the wind power cannot be graphically represented with ease. The same holds for humidity, which has an extremely non-linear relationship with air density based on Eq. 10. However, as in the case of temperature, it follows a fairly steady seasonal pattern that is represented in Fig. 5 for spring, autumn, winter and summer.

Fig. 2 Wind direction histograms relating to the assessed coastal and mountainous regions

Fig. 3 Hourly averaged wind speed pertaining to the (a) western coast, (b) eastern coast and (c) central mountainous areas per season

Fig. 4 Seasonal ambient temperature at the respective (a) western coast, (b) eastern coast and (c) central mountainous areas

Fig. 5 Seasonal average relative humidity at the areas of (a) western coast, (b) eastern coast and (c) central mountains

While one can state that wind speed and, in turn, wind power demonstrate a stochastic behaviour that is difficult to predict, RES curtailment behaves similarly. To verify this randomness, the definition of Eq. 13 can be used. The clear dependence of curtailment on load demand is essentially the main reason for the added uncertainty. Load demand is strongly correlated with the day-type, day length, season, electricity price and fuel price, daily activities and weather. This way, it hinders the satisfaction of conventional generating constraints and the consequent RES contribution. To offer a graphical explanation of this phenomenon, Fig. 6 contains the minimum capacity limits of the must-run units, violated by the actual demand during a week in April 2020. As a result, the integrated RES against the curtailed RES is included in Fig. 7, for the same period.

Fig. 6 Spinning reserve (SR) graphical violation for the week of 20–26 April 2020

Fig. 7 Total renewable contribution and imposed curtailment during the assessed week

4.2 Training approach

In general, analytical models are computationally intensive, leading to infeasible exploration in most cases. Therefore, more generic and simple training approaches are required in order to forecast the key targets, namely the output wind power, based on non-exhaustive data-driven models. By identifying the trends in data, these techniques are capable of capturing the underlying physical behaviour of the system, avoiding the need for pre-defined, detailed information about its specific features. As a modern machine learning algorithm, regression trees offer excellent solutions in predictive modelling problems.

To generate a trained model able to predict new datasets in a repetitive manner, the regression tree makes use of logical rules to split a complex problem into several simpler sub-problems, easier to be interpreted. Specifically, it makes decisions based on conditions that are hierarchically applied from “root” to “leaf” of a tree [37]. A simple configuration of a decision tree considering two predictors (i.e. wind speed and wind direction) and one target (wind power) is provided in Fig. 8. From the root node, a splitting process is repeated until the maximum depth of the tree is achieved (restricted to 3 in this case). Each leaf induces a simple regression model which can be pruned to reduce complexity and improve the capability of the tree model.

Fig. 8 Demonstration of the proposed regression trees paradigm (where WP denotes wind power, WS the wind speed and WD the wind direction)
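The splitting procedure described above can be illustrated with a small, self-contained Python sketch. It is not the implementation used in this work: it grows a binary tree with constant (mean) leaves by minimising the summed squared error, restricts the depth to 3 as in the demonstration, and runs on hypothetical toy data with two predictors (wind speed and wind direction). All function names and the `min_leaf` parameter are assumptions made for illustration.

```python
def _sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_tree(rows, targets, depth=0, max_depth=3, min_leaf=2):
    """Recursively grow a binary regression tree with x_j <= c / x_j > c splits."""
    leaf = {"value": sum(targets) / len(targets)}
    if depth >= max_depth or len(targets) < 2 * min_leaf:
        return leaf
    best = None  # (summed squared error, feature index, threshold)
    for j in range(len(rows[0])):
        # candidate thresholds are taken from the observed values
        for c in sorted({r[j] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[j] <= c]
            right = [t for r, t in zip(rows, targets) if r[j] > c]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            sse = _sse(left) + _sse(right)
            if best is None or sse < best[0]:
                best = (sse, j, c)
    if best is None:
        return leaf
    _, j, c = best
    lo = [i for i, r in enumerate(rows) if r[j] <= c]
    hi = [i for i, r in enumerate(rows) if r[j] > c]
    return {
        "feature": j,
        "threshold": c,
        "left": fit_tree([rows[i] for i in lo], [targets[i] for i in lo],
                         depth + 1, max_depth, min_leaf),
        "right": fit_tree([rows[i] for i in hi], [targets[i] for i in hi],
                          depth + 1, max_depth, min_leaf),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


# Hypothetical toy data: (wind speed in m/s, wind direction in degrees) -> power in MW.
rows = [(2, 90), (3, 180), (4, 270), (6, 90), (7, 180), (8, 270)]
targets = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
tree = fit_tree(rows, targets, max_depth=3)
print(predict(tree, (3.5, 45)), predict(tree, (6.5, 45)))
```

On this toy set the root split falls on wind speed, since wind direction carries no information about the target; pruning and richer leaf models are left out of the sketch.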

Denoting the recursively grown tree with \({\mathcal{T}}\), the multivariate \({\varvec{x}} \in {\mathcal{X}}\) can be mapped to the target variable \({\mathcal{Y}}\) based on a tree-node membership m. \({\varvec{x}}\) expresses a d-dimensional vector with real-valued variables (predictors) so that \({\varvec{x}} = \left( {x_{1} , \ldots ,x_{d} } \right) \in {\mathbb{R}}^{d}\). Considering the learning data \({\mathcal{L}} = \left\{ {\left( {Y_{i} ,x_{i} } \right):i = 1, \ldots ,n} \right\}\), the regression tree is grown from \({\mathcal{L}}\) according to recursive, conditional splits (in the form of xi ≤ c and xi > c for binary decisions) chosen from the observed values of xi [38]. Hence, for M terminal nodes the appropriate mapping function \({\mathcal{T}}:{\mathcal{X}} \to \left\{ {1, \ldots ,M} \right\}\) can be defined as:

$${\mathcal{T}}\left( x \right) = \mathop \sum \limits_{m = 1}^{M} mB_{m} \left( x \right),$$
(17)

where Bm denotes a product spline as a basis function of the number of splits (length) Lm, the splitting variable xl,m and the splitting value cl,m so that:

$$B_{m} \left( x \right) = \mathop \prod \limits_{l = 1}^{{L_{m} }} \left( {x_{l,m} - c_{l,m} } \right)$$
(18)

In our realization, a more complex and deeper regression tree is actually used, able to interpret up to six predicting features, namely wind speed, wind direction, temperature, humidity, system capacity factor and RES curtailment factor, over one response, the target of wind power production. To avert overfitting and facilitate the learning of dependencies and dataset outliers, some hyper-parameters like turbine-hub height, turbine-blade radius and spatial distribution could be included and optimized (e.g. with a gradient boosting optimization method) towards improved accuracy and algorithm generalization during the model validation stage.

To compensate for the uncertainties in observed values due to steep fluctuations of the wind speed, the dissimilarities between different time-series datasets are measured by means of clusters with common features. As a result, the final stage of forecasting consists of a post-process correction procedure explained as follows. First, the daily observed wind power is categorized into clusters so as to minimize the variance, which is defined as the within-cluster sum of squares of Eq. 19.

$$\mathop {{\text{argmin}}}\limits_{S} \mathop \sum \limits_{i = 1}^{k} \mathop \sum \limits_{{Y \in S_{i} }} \left\| {Y - \mu_{i} } \right\|^{2}$$
(19)

Note that Y represents a dataset of n observed d-dimensional (d = 24) vectors (e.g. Y1, Y2,…, Yn) that are to be partitioned into k ≤ n clusters such that S = {S1, S2,…, Sk}, where μi is the mean of the points in Si. Then, a linear regression approach is applied to each cluster, in order to determine the fitting (or corrective) coefficients b and a so that [39]:

$$P_{predicted} = bP_{observed} + a$$
(20)
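The two post-processing steps can be sketched in plain Python as follows. The text does not name the clustering algorithm, so Lloyd's k-means iteration is assumed here as a common heuristic for the within-cluster sum-of-squares objective of Eq. 19; the coefficients of Eq. 20 are obtained by ordinary least squares. Function names, the seed and the iteration count are illustrative choices.

```python
import random


def _sqdist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iters=50, seed=0):
    """Lloyd's algorithm (assumed here) for the clustering objective of Eq. 19."""
    rng = random.Random(seed)
    centres = [list(v) for v in rng.sample(vectors, k)]
    labels = []
    for _ in range(iters):
        # assignment step: nearest centre for every daily profile
        labels = [min(range(k), key=lambda i: _sqdist(v, centres[i])) for v in vectors]
        # update step: each centre becomes the mean of its members
        for i in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:  # keep the old centre if a cluster empties
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


def fit_correction(p_observed, p_predicted):
    """Least-squares b and a in P_predicted = b * P_observed + a (Eq. 20)."""
    n = len(p_observed)
    mean_o = sum(p_observed) / n
    mean_p = sum(p_predicted) / n
    sxx = sum((o - mean_o) ** 2 for o in p_observed)
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(p_observed, p_predicted))
    b = sxy / sxx
    return b, mean_p - b * mean_o


# Hypothetical 2-dimensional profiles (the paper uses d = 24 hourly values per day).
labels, centres = kmeans([[0, 0], [0, 1], [10, 10], [10, 11]], k=2)
print(labels, fit_correction([1.0, 2.0, 3.0], [3.0, 5.0, 7.0]))
```

In practice the correction would be fitted separately on the days belonging to each cluster.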

4.3 Model performance

To properly determine the model performance, three scenarios are utilized by increasing the dimensionality in terms of included variables. Specifically, the first scenario makes use of historical data with respect to wind speed and wind direction at the western coast, eastern coast and central mountainous terrains. The data were obtained from CERA and cover the years 2020 and 2021. For training purposes, the data span 2 years at a 60-min sampling rate. For validating the proposed model, hourly granularity is considered over 6 months. In the second scenario, the variables of ambient temperature and relative humidity are also included, while the third scenario additionally includes the predictors of RES capacity and RES curtailment.

To assess the scenarios at each location, the metrics of mean absolute error (MAE), root mean squared error (RMSE) and mean absolute range normalized error (MARNE) are taken into account, using Eqs. 21–23.

$$MAE = \frac{1}{\tau }\mathop \sum \limits_{t = 1}^{\tau } \left| {P_{a} \left( t \right) - P_{p} \left( t \right)} \right|$$
(21)
$$RMSE = \frac{1}{\tau }\sqrt {\mathop \sum \limits_{t = 1}^{\tau } \left( {P_{a} \left( t \right) - P_{p} \left( t \right)} \right)^{2} }$$
(22)
$$MARNE = \frac{1}{\tau }\mathop \sum \limits_{t = 1}^{\tau } \frac{{\left| {P_{a} \left( t \right) - P_{p} \left( t \right)} \right|}}{{\mathop {\max }\limits_{t} P_{a} \left( t \right)}} \times 100$$
(23)
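As a minimal sketch, the three metrics can be computed as below, where `actual` and `predicted` stand for the series P_a(t) and P_p(t) over τ samples. The function names and sample values are illustrative assumptions. The function `rmse_eq22` follows Eq. 22 exactly as printed, with the factor 1/τ outside the square root; the more common convention, which places 1/τ under the root, is included as `rmse_conventional` for comparison only.

```python
from math import sqrt

def mae(actual, predicted):
    """Mean absolute error, Eq. 21."""
    tau = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / tau

def rmse_eq22(actual, predicted):
    """Eq. 22 as printed: square root of the summed squares, divided by tau."""
    tau = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))) / tau

def rmse_conventional(actual, predicted):
    """Common convention (not Eq. 22): the mean is taken under the root."""
    tau = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / tau)

def marne(actual, predicted):
    """Mean absolute range normalized error in percent, Eq. 23."""
    tau = len(actual)
    peak = max(actual)  # maximum of the actual power over the horizon
    return 100.0 * sum(abs(a - p) for a, p in zip(actual, predicted)) / (tau * peak)

# Illustrative values only (not measurements from the case study).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

results = {
    "MAE": mae(actual, predicted),
    "RMSE (Eq. 22)": rmse_eq22(actual, predicted),
    "RMSE (conventional)": rmse_conventional(actual, predicted),
    "MARNE": marne(actual, predicted),
}
```

Since the normalizing peak in Eq. 23 is a constant of the actual series, MARNE equals the MAE divided by that peak and multiplied by 100.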

An example of wind power forecasting in the western regions is depicted in Fig. 9. According to the simulations performed in MATLAB (MathWorks) R2020b, the performance of the underlying regression trees improves as predictors are added to the model training process. In scenario 1, the wind power accuracy in the western areas is rated at 6.1306, 0.0734 and 7.6357 in terms of MAE, RMSE and MARNE, respectively. These values improve in scenario 2, where they amount to 5.1920, 0.0649 and 6.4667, respectively. For the aggregated 41 wind turbines of 82 MW total capacity, the insertion of RES availability and RES curtailment does not add considerable further improvement.
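The relative improvement between the two scenarios can be derived directly from the values quoted above. The short calculation below is a sketch of that arithmetic; the dictionary and function names are illustrative.

```python
# Western-region metrics as quoted in the text.
scenario_1 = {"MAE": 6.1306, "RMSE": 0.0734, "MARNE": 7.6357}
scenario_2 = {"MAE": 5.1920, "RMSE": 0.0649, "MARNE": 6.4667}

def relative_improvement(before, after):
    """Percentage reduction of an error metric from `before` to `after`."""
    return 100.0 * (before - after) / before

improvement = {
    metric: relative_improvement(scenario_1[metric], scenario_2[metric])
    for metric in scenario_1
}

for metric, value in improvement.items():
    print(f"{metric}: {value:.2f}%")
# MAE: 15.31%
# RMSE: 11.58%
# MARNE: 15.31%
```

The MAE and MARNE improve by the same percentage, which is consistent with Eq. 23 being the MAE normalized by the peak actual power when that peak is the same in both scenarios.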

Fig. 9 Obtained results for the forecasting in the western regions

On the eastern coast, a wind farm with 21 wind turbines and 32.5 MW capacity shows improved predictive accuracy. Considering only the speed and direction of wind, the power was forecasted with 2.4175 MAE, 0.0292 RMSE and 7.2084 MARNE. It is worth noting that further improvements were observed when the predictors of ambient temperature and relative humidity were taken into account: the three metrics improved by 29.95%, 20.66% and 29.94%, respectively. In this case too, scenario 3 produced no appreciable change in the results. The forecasting performance is shown in Fig. 10, where the actual wind power is plotted against the predicted wind power.

Fig. 10 Obtained results for the forecasting in the eastern regions

The central mountainous area comprises three wind farms with a total installed capacity rated at 40 MW. A daily forecast utilizing the proposed model configuration is shown in Fig. 11. The MAE, RMSE and MARNE metrics amount to 4.1147, 0.0466 and 9.8884 in scenario 1, against the respective values of 4.4821, 0.0416 and 8.3681 in scenarios 2 and 3. For completeness, the experimental evaluation of the proposed model is listed by comparative metric and location in Table 2. Based on the comprehensive analysis, the predictive accuracy improves when the predictors of ambient temperature and humidity are added to the vectorised wind speed and wind direction matrix. In the current form of the isolated power network, the available RES capacity and RES curtailment do not appear to have a significant impact on predictions. However, at higher levels of wind penetration, their impact can impose extreme deviations from actual profiles, so that regression trees would gain larger improvements from their consolidation as predictors.

Fig. 11 Obtained results for the forecasting in the central mountainous regions

Table 2 Comparative performance metrics by location

5 Conclusions

In this work, the performance of regression trees has been assessed for wind power forecasting in distribution networks. The configuration was tested for the case of Cyprus, dividing the system by region into the western coast, the eastern coast and the central mountainous terrain. Five wind farms in total, accounting for around 82 MW, 31.5 MW and 44 MW per region, respectively, were taken into consideration.

Following a non-parametric mathematical formulation of air density, the minimum required predictors were identified. The performance of the proposed configuration was investigated in terms of mean absolute error, root mean square error and mean absolute range normalized error. Specifically, experimental evaluations were performed considering three scenarios per location. The first scenario takes into account only the wind speed and wind direction data as predictors for the years 2020 and 2021. In the second scenario, the variables of ambient temperature and relative humidity are also included. The final scenario (scenario 3) additionally accounts for the availability of renewable energy systems and renewable energy curtailment, in an attempt to increase the fitting performance.

According to the experimental simulations, consolidating the predictors of temperature and humidity can improve the forecasts, independent of the location. The improvements in predictive error are on the order of 29.95%, 20.66% and 29.94% for the respective performance metrics. When the wind power forecast was applied with six predictors (wind speed, wind direction, ambient temperature, relative humidity, renewable capacity availability and renewable energy curtailment), no appreciable further improvement was observed. In the case of the eastern coast in particular, no improvement was indicated, since the aggregated power concerned only one wind farm with a total installed capacity of 31.5 MW. Nevertheless, the results for the mountainous region, where wind power is aggregated from three different wind farms with a total capacity of 44 MW, indicate that the increasing contribution of wind in forthcoming distribution networks will require more predictors. In that case, the renewable capacity availability and renewable curtailment, based on spinning reserve provision, play a critical role in wind power forecast modelling and performance.