Spatio-temporal estimation of wind speed and wind power using extreme learning machines: predictions, uncertainty and technical potential

With wind power providing an increasing amount of electricity worldwide, the quantification of its spatio-temporal variations and the related uncertainty is crucial for energy planners and policy-makers. Here, we propose a methodological framework which (1) uses machine learning to reconstruct a spatio-temporal field of wind speed on a regular grid from spatially irregularly distributed measurements and (2) transforms the wind speed to wind power estimates. Estimates of both model and prediction uncertainties, and of their propagation after transforming wind speed to power, are provided without any assumptions on data distributions. The methodology is applied to study hourly wind power potential on a grid of 250 × 250 m² for turbines of 100 m hub height in Switzerland, generating the first dataset of its type for the country. We show that the average annual power generation per turbine is 4.4 GWh. Results suggest that around 12,000 wind turbines could be installed on all 19,617 km² of available area in Switzerland, resulting in a maximum technical wind potential of 53 TWh.
To achieve the Swiss expansion goals of wind power for 2050, around 1000 turbines would be sufficient, corresponding to only 8% of the maximum estimated potential.
Supplementary Information: The online version contains supplementary material available at 10.1007/s00477-022-02219-w.


Appendix A
A.1 Exploratory Data Analysis
This section shows the main outputs of the exploratory data analysis performed on the wind speed data. Spatial plots and time series are used to highlight the presence of spatio-temporal structures and dependencies in the data. Temporal correlations are further explored for a set of stations located at different altitudes via autocorrelation functions (ACF). Distributional properties are explored using kernel density estimation (KDE).
[Table A1; surviving caption fragment: "is estimated with KDE."]
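As a rough illustration of the ACF analysis mentioned above, the following minimal sketch computes a sample autocorrelation function for an hourly series. The function name `sample_acf` and the synthetic series are assumptions made for illustration only; they are not taken from the paper or its data.

```python
import numpy as np

def sample_acf(series, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                      # remove the empirical mean
    denom = np.dot(x, x)                  # lag-0 sum of squares
    n = x.size
    return np.array([np.dot(x[: n - k], x[k:]) / denom for k in range(max_lag + 1)])

# Synthetic hourly series with a daily cycle plus noise (illustrative data only).
rng = np.random.default_rng(0)
hours = np.arange(24 * 60)
wind = 5.0 + 2.0 * np.sin(2 * np.pi * hours / 24) + rng.normal(0.0, 0.5, hours.size)

acf = sample_acf(wind, max_lag=48)
print(acf[[0, 12, 24]])  # lag 0, half a day, one day
```

For a series dominated by a daily cycle, the ACF is expected to be strongly positive at a lag of 24 hours and strongly negative at 12 hours; real station data would typically show weaker and altitude-dependent patterns.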

A.2 Input Features
This section provides details on the features included in the input space and the corresponding bandwidth parameters used to extract them from the digital elevation model. A correlation matrix showing relationships among pairs of features is also reported.

A.3 Roughness estimation
Wind speed transformation in the Earth's boundary layer under neutrally stable conditions and dynamic equilibrium is modelled here using the generalized log-law presented in equation (23), which is based on the assumption that the mean velocity profile is a function of height, surface roughness, friction velocity, zero-plane displacement and von Karman's constant, usually assumed equal to 0.4 [2]. In this paper, roughness is estimated from the CLC by associating a roughness value with each land cover class, as reported in Table A3. Such values have been widely used and validated, see e.g. [3]. As a result, two roughness maps have been used, one corresponding to the land covers mapped in the 2012 CLC and one to those mapped in the 2018 CLC. The roughness map corresponding to the 2018 CLC is shown in Figure A6.
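The height transformation can be sketched as follows. Equation (23) is not reproduced in this appendix, so the code assumes the textbook neutral log-law u(z) = (u*/κ) ln((z − d)/z0), which may differ in detail from the paper's generalized form. The function names, the 10 m reference height and the roughness value of 0.1 m are illustrative assumptions, not values from Table A3.

```python
import math

VON_KARMAN = 0.4  # von Karman's constant, as quoted in the text

def log_law_speed(friction_velocity, height, roughness, displacement=0.0):
    """Mean wind speed at `height` under the textbook neutral log-law (assumed form)."""
    if height - displacement <= roughness:
        raise ValueError("height minus displacement must exceed the roughness length")
    return friction_velocity / VON_KARMAN * math.log((height - displacement) / roughness)

def extrapolate_speed(speed_ref, height_ref, height_target, roughness, displacement=0.0):
    """Scale a speed from height_ref to height_target.

    The friction velocity cancels in the ratio of two log-law profiles,
    so only roughness and displacement are needed.
    """
    if min(height_ref, height_target) - displacement <= roughness:
        raise ValueError("heights minus displacement must exceed the roughness length")
    num = math.log((height_target - displacement) / roughness)
    den = math.log((height_ref - displacement) / roughness)
    return speed_ref * num / den

# Illustrative values only: 5 m/s measured at 10 m, roughness length 0.1 m.
hub_speed = extrapolate_speed(5.0, height_ref=10.0, height_target=100.0, roughness=0.1)
print(round(hub_speed, 3))
```

With these illustrative inputs the ratio of logarithms is ln(1000)/ln(100) = 1.5, so the speed at 100 m is 1.5 times the speed at 10 m. A larger roughness length increases this ratio, which is why the land-cover-based roughness map matters for hub-height estimates.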

B.1 ELM Tikhonov factor analysis
Regularised ELM can be understood as a ridge regression [4,5] in a random feature space. This suggests that the Tikhonov factors can be used to increase the explainability of the spatio-temporal models. Tikhonov factors for the first 25 EOF components of each model are presented in Figure B7 as matrices with a linear colour scale depending on the order of magnitude of α. In the Figure, a factor in yellow indicates the selection of a large regularisation parameter, which shrinks the output weights of the ELM towards zero, hence suggesting the absence of structure in the corresponding modelled spatial coefficient map. It is worth noting that the corresponding model variance will also tend to zero. Conversely, dark tones in the Figure correspond to regularisation parameters closer to zero, hence to a behaviour closer to that of the classical LS procedure. This suggests that the regularised ELM tends to retain a spatial structure for the corresponding EOF coefficients.
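The link between the Tikhonov factor α and the size of the output weights can be illustrated with a minimal sketch of a single regularised ELM. This is not the paper's implementation: the tanh activation, the hidden-layer size, the fixed α values and the synthetic data are all assumptions, and the ensembles and per-component selection of α discussed here are not reproduced.

```python
import numpy as np

def fit_regularised_elm(X, y, n_hidden, alpha, seed=0):
    """Fit a single-hidden-layer ELM with a ridge (Tikhonov) penalty alpha."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn at random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)  # random feature matrix
    # Ridge solution for the output weights: (H^T H + alpha I) beta = H^T y
    beta = np.linalg.solve(H.T @ H + alpha * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_elm(X, model):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Synthetic 2-D regression problem (illustrative data only).
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=200)

# Same random hidden layer (same seed), three regularisation strengths.
models = {alpha: fit_regularised_elm(X, y, n_hidden=50, alpha=alpha)
          for alpha in (1e-3, 1.0, 1e6)}
weight_norms = {alpha: float(np.linalg.norm(m[2])) for alpha, m in models.items()}
print(weight_norms)
```

For a fixed hidden layer, the norm of the output weights decreases as α grows, and for very large α the predictions collapse towards zero. This is the behaviour the yellow cells of Figure B7 are described as indicating, while small α approaches the ordinary least-squares fit.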
The matrix of the first model in Figure B7 tends to be less sparse with more recent data. This may be related to the increasing number of training stations over the different datasets. Reading the matrices vertically, hence looking at the columns, it is easy to verify that the first components provide the major contribution in terms of data variability and of spatially structured information. Reading the matrices horizontally, hence looking at the rows, reveals the variability within ensembles for some components of the first component group.
Other patterns can be identified for the individual datasets. The MSWind 08-12 data in Figure B7a mainly takes advantage of the first five components, which from the EOF decomposition are known to account together for 65% of the data variability. Except for some isolated members in components 7 and 13, no other information is used by the model. Figure B7b shows that, for the MSWind 13-16 data, the first three components and the sixth one are chosen by the model for all members. Again, they represent about 65% of the variability. For the MSWind 17 data in Figure B7c, components 1 to 6, 9, 11 and 17 fully contribute to the model. They correspond to 73% of the variability. Components 8, 13, 19, 20 and 24 contribute only partly. All other components are automatically discarded by the regularisation mechanism. For the three datasets, the very first component of the original data model (which contains seasonal cycles that depend only weakly on space) is variable, denoting some fluctuations of the model. For the second model, based on log-squared residuals, the differences between the regularisation parameters are more clear-cut among the components.

B.2 Residuals Analysis
A careful analysis of the residuals is carried out. Figure B8 displays histograms of the training and test sets for the raw data, modelled data and residuals, for the three periods of study. Note that the plots are zoomed in for visualisation purposes; the actual ranges are reported in Table B4, together with the empirical means. The models predict negative values for the three datasets. Although a negative wind speed has no physical meaning, the histograms show that this happens quite rarely. These predicted values have been set to zero for the purpose of power estimation. The training residual means are all zero and the test ones are close to zero. However, each residual distribution has its mode below zero and is slightly skewed.
Spatio-temporal variography is performed on the training data for selected months. Figure B9 shows the semivariograms for the raw data, the model predictions and their residuals. The model reproduces well the spatio-temporal dependencies detected by variography for the selected months. The sill is lower for the modelled data, suggesting a substantial loss of variability, possibly due to the noisy nature of the raw data. The semivariograms of the residuals are close to flat, with a residual temporal structure. Along the spatial axis, an almost pure nugget effect is observed, although a residual structure could subsist for January 2017. Globally, the patterns observed in the raw data semivariograms are reproduced in the corresponding semivariograms of the modelled data, modulo the sill shift. This is even more striking at longer temporal lags, e.g. for January 2017, where a high similarity is observed between the semivariogram shapes, see Figure B10.
Spatial variography [6] is also performed on some spatial coefficients of the EOF components before and after ELM modelling. Figure B11 displays this analysis for the first three components of the MSWind 17 dataset, which contain most of the variability. The (spatial) omnidirectional semivariogram is computed on the spatial coefficients obtained directly from the EOF decomposition of the original spatio-temporal data (solid line). It shows the presence of spatial structures in the first three components, although they are less pronounced for the very first component, which is consistent with what was observed in Figure B7. Figure B11 also shows the semivariograms of the residuals obtained from the spatial modelling with ELM ensembles (dashed line), which are close to pure nugget effects. This has two consequences. First, the component-by-component spatial modelling seems to correctly extract the spatial structures. Second, it suggests that the heteroskedastic variance estimate of the ELM ensembles satisfies its independence assumption (in fact, a vanishing covariance is sufficient for this estimation, see [7]) and hence is appropriate. Finally, the cross-variograms between the residuals of component pairs fluctuate near zero (dash-dotted line). Compared with the corresponding semivariograms, this indicates that the correlation between component residual pairs is very weak [8] and thus satisfies the additional assumption for spatio-temporal model variance estimation, which is necessary to ensure that no variability comes from spatial model interactions. The other components, which contain far less variability, can be exempted from such analysis without much risk.
Table B5: APVs of the spatio-temporal semivariograms of Figure B9. The sample variance (or a priori variance, APV) is computed on the training data.
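To make the notion of a pure nugget effect concrete, the sketch below computes an empirical omnidirectional semivariogram with the classical estimator γ(h) = (1/2N(h)) Σ (z_i − z_j)² over pairs separated by roughly h. The function name, the binning scheme and the synthetic uncorrelated field are assumptions for illustration; the paper's variography tools and its spatio-temporal extension are not reproduced.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Classical omnidirectional semivariogram, averaged within distance bins."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)        # all unordered pairs
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq = 0.5 * (values[i] - values[j]) ** 2    # semivariance contribution per pair
    gamma = np.full(len(bin_edges) - 1, np.nan)     # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        mask = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if mask.any():
            gamma[b] = half_sq[mask].mean()
    return gamma

# Spatially uncorrelated synthetic "residuals" at random locations (illustrative only).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(400, 2))
noise = rng.normal(size=400)
edges = np.linspace(0.0, 50.0, 6)

gamma_noise = empirical_semivariogram(coords, noise, edges)
apv = noise.var()  # sample (a priori) variance
print(gamma_noise / apv)
```

For uncorrelated values, the semivariogram is flat and fluctuates around the sample variance at every lag, which is the pure nugget pattern described for the ELM residuals. A spatially structured field would instead show semivariances that start low at short lags and rise towards the sill.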

C.1 Definition of restriction zones
As explained in Section 3.3, the restriction zones shown in Table 3 are based on the framework for wind energy planning in Switzerland developed by the Swiss Federal Office of Spatial Development (ARE) [9]. It distinguishes, at a resolution of 500 × 500 m², between buffered building zones, protected areas, areas to be excluded in principle (considered prohibited), areas with a potential balancing of interests in case of national interest (considered restricted), areas subject to inter-authority coordination and other areas (no restriction assumed).
Forests are considered here as a separate category, as wind turbine installation may be possible (unless other restrictions apply), but these zones are considered to be more vulnerable than other areas. Furthermore, all areas at altitudes above 3,000 meters are considered prohibited (based on a digital terrain model [10]), as they mark high-alpine terrain that is typically difficult to access. Areas above 2,500 meters of altitude are categorised as restricted, as currently no installations are found at higher altitudes and wind turbines in these regions may be difficult to install and maintain.
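The altitude rules above can be sketched as a small classification function. Only the two thresholds come from the text; the label strings, the treatment of values exactly at a threshold, and the `stricter` helper for combining the altitude rule with another zone label are hypothetical and not described in the paper.

```python
# Hypothetical severity ranking used only to combine two restriction labels.
SEVERITY = {"none": 0, "restricted": 1, "prohibited": 2}

def altitude_restriction(altitude_m):
    """Restriction level implied by altitude alone (thresholds from the text)."""
    if altitude_m > 3000.0:
        return "prohibited"
    if altitude_m > 2500.0:
        return "restricted"
    return "none"

def stricter(level_a, level_b):
    """Return the more restrictive of two labels (assumed combination rule)."""
    return level_a if SEVERITY[level_a] >= SEVERITY[level_b] else level_b

# Illustrative altitudes in meters.
examples = {alt: altitude_restriction(alt) for alt in (1200.0, 2700.0, 3100.0)}
print(examples)
```

In practice such a rule would be applied to every cell of the digital terrain model and then combined with the ARE zones and the forest category, for which the spatial layers themselves are required.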