Abstract
Renewable energy systems depend on the weather, so weather information plays a crucial role in forecasting time series within such systems. However, while weather data are commonly used to improve forecast accuracy, it remains to be determined in which input shape these data benefit forecasting models the most. In the present paper, we investigate how transformations of weather data inputs, i. e., station-based and grid-based weather data, influence the accuracy of energy time series forecasts. The selected weather data transformations are based on statistical features, dimensionality reduction, clustering, autoencoders, and interpolation. We evaluate the performance of these weather data transformations when forecasting three energy time series: electrical demand, solar power, and wind power. Additionally, we compare the best-performing weather data transformations for station-based and grid-based weather data. We show that transforming station-based or grid-based weather data improves the forecast accuracy by between 3.7 and 5.2% compared to using the raw weather data, depending on the target energy time series, with statistical and dimensionality reduction transformations among the best.
Introduction
Renewable energy sources are essential to mitigate climate change (Clarke et al. 2022), and being able to forecast their supply is necessary to balance any energy system that includes them (González Ordiano et al. 2018). However, both renewable energy generation and demand are weather dependent; thus, weather information plays a crucial role in renewable energy-related forecasting (Bloomfield et al. 2021; Harish et al. 2020; Vanting et al. 2021). This weather information can come from various sources such as weather stations, e. g. (Vladislavleva et al. 2013; Hong et al. 2014; Gutierrez-Corea et al. 2016), maps from Numerical Weather Prediction (NWP) models, e. g. (Al-Yahyai et al. 2010), or both, e. g. (Du 2019).
Recent advances in machine learning offer tools to better forecast renewable energy sources and demand given various weather input data. These approaches use weather data in different formats, ranging from single time series, e. g. (Dahl et al. 2017; Hu et al. 2021; Ren et al. 2022; Elizabeth Michael et al. 2022; Beichter et al. 2022), to grid-based data, e. g. (Feng et al. 2022; Kong et al. 2020; Si et al. 2021), or graphs, e. g. (Hu et al. 2022; Simeunović et al. 2022).
While many research studies apply data transformations to energy time series, for example in wind power forecasting (Liu and Chen 2019), and it is known that using weather data improves the accuracy of short-term renewable energy forecasts (Ludwig et al. 2023), it is still unclear which weather data input shape is most beneficial. There are many options for how weather data can be used; for example, we can preprocess and transform the data or take it as it is. Additionally, some of the input data is often redundant, and, thus, some forecasting methods may perform better when using a reduced set of input features. In theory, a deep neural network learns the most important features on its own, assuming a sufficient amount of data and an appropriate network architecture. In practice, however, the amount of data is often limited (Do and Cetin 2018) and the best network architecture is hard to find (He et al. 2021; Wang et al. 2019). Hence, transforming the input data into a reduced-size data set can be beneficial. Furthermore, forecasts for energy consumption or production in large areas require weather data that cover the corresponding area, which increases the complexity of the problem in terms of both data size and network architecture. Previous research already investigates which weather variables enhance the forecast accuracy, e. g. (Cococcioni et al. 2012; Lei et al. 2009), how specific weather transformations improve the forecast accuracy, e. g. (Andrade and Bessa 2017), and how transformations of input data generally impact the accuracy of neural networks (Neumann et al. 2021). Nevertheless, there is currently no comprehensive understanding of whether transforming weather data improves forecast accuracy.
In the present paper, we investigate the role of transformations for station-based and grid-based weather data when forecasting renewable energy time series. We evaluate different weather data input transformations on historical energy data, including electrical demand, solar, and wind time series data from one transmission system operator in Germany.
The remainder of the paper is structured as follows. The “Weather data transformations” section introduces different weather data transformations. Given these transformations, we use the experimental setup described in the “Experimental setup” section to evaluate the forecast performance in the “Results” section. We finally discuss the results and our approach in the “Discussion” section and conclude the paper in the “Conclusion” section.
Weather data transformations
In the following, we introduce different transformations for weather data that we later evaluate with respect to their influence on the accuracy of energy time series forecasts. We choose the transformations based on existing transformations for energy time series (Liu and Chen 2019), which we adapt to weather data, and on promising transformations from other domains, such as autoencoders for images. We consider eight transformations for station-based weather data, typically available as a vector, and grid-based weather data, typically available as a matrix. While the vector only consists of temporal information, the matrix also contains explicit spatial information. All transformations reduce the dimensionality of the data and thereby the complexity of the problem. Reducing the complexity, of course, also reduces the information within the data (e. g. spatial resolution). However, this information is often redundant or not needed for accurate forecasts and can make it harder for the model to learn the underlying function. Since there are numerous possible transformations, we only choose a subset that we consider common (e. g. statistics) or promising (e. g. autoencoder), or that ensures comparability between spatial and non-spatial weather data (selection and spatial statistics). Five of the transformations can be used for both station-based and grid-based data, namely the statistics, dimensionality reduction, clustering, autoencoder, and variational autoencoder transformations. We additionally consider the interpolation transformation for the station-based weather data and the selection and spatial statistics transformations for the grid-based weather data. For each weather data transformation, we describe its underlying concept before introducing the specific method we use for the transformation.
We start with the transformations that can be used for both station-based and grid-based weather data before describing the three transformations that apply only to station-based or only to grid-based weather data. Each weather data transformation \(f_t\) is applied separately to every time step j of the forecast horizon H. For a forecasting time step i, the weather input is thus \(W_i = \{f_t(w_{ij})\}\) for \(i \in T\) and \(j \in \{1, \dots , H\}\), where \(w_{ij}\) denotes the weather data for horizon step j of forecasting time step i.
Statistics
To reduce the size of the input weather data features, we can apply the statistics transformation. It describes the available weather data by a set of statistical measures such as mean or minimum. We evaluate three statistics transformations that differ in the included number of statistical measures. The statistics small transformation comprises just the mean, while the statistics medium transformation additionally consists of the standard deviation. The statistics large transformation also includes the minimum and maximum of the given weather data. Each statistical feature is applied to the whole station-based or grid-based weather data.
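The three statistics variants, applied to each step of the forecast horizon, can be sketched as follows. This is a minimal illustration and not the implementation used in the evaluation: the function names, the example values, and the use of the population standard deviation are assumptions.

```python
import statistics

def statistics_transform(values, size="small"):
    """Summarise all weather values of one horizon step by a few statistics."""
    features = [statistics.fmean(values)]            # small: mean only
    if size in ("medium", "large"):
        # Assumption: population standard deviation (the text does not say which).
        features.append(statistics.pstdev(values))
    if size == "large":
        features.extend([min(values), max(values)])  # large: plus minimum and maximum
    return features

def transform_horizon(weather_steps, size="small"):
    """Apply the transformation to every horizon step j, i.e. W_i = {f_t(w_ij)}."""
    return [statistics_transform(step, size) for step in weather_steps]

# Hypothetical temperatures (degrees Celsius) at four stations for a two-step horizon.
horizon = [[10.0, 12.0, 14.0, 16.0], [11.0, 11.0, 13.0, 13.0]]
features = transform_horizon(horizon, size="large")
```

Regardless of the number of stations or grid cells, each horizon step is reduced to one, two, or four values per weather variable.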
Dimensionality reduction
Another way to reduce the size of the input weather data features is the Dimensionality Reduction (DR) transformation. It transforms the weather data feature space into a smaller latent feature space, thereby helping to remove redundant and irrelevant information. We evaluate three DR transformations using the Principal Component Analysis (PCA) for dimensionality reduction. The PCA transforms the station-based or grid-based weather data by changing the basis to the eigenspace of the covariance matrix and omits eigenvectors with low variance (Pearson 2022). We use all eigenvectors up to a cumulative explained variance of 90% for the DR 90 transformation, 95% for the DR 95 transformation, and 98% for the DR 98 transformation.
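How many eigenvectors such a variance threshold keeps can be illustrated with a short sketch. It covers only the selection step, not the eigendecomposition itself, and it assumes that the threshold refers to the cumulative share of explained variance; the function name and the eigenvalues are hypothetical.

```python
def n_components_for_variance(eigenvalues, threshold):
    """Smallest number of leading eigenvectors whose cumulative variance share reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)

# Hypothetical eigenvalues of a weather covariance matrix.
eigenvalues = [5.0, 3.0, 1.2, 0.5, 0.3]
kept = {t: n_components_for_variance(eigenvalues, t) for t in (0.90, 0.95, 0.98)}
```

With these example values, the stricter thresholds keep more components, so DR 98 yields the largest latent feature space of the three variants.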
Clustering
Energy forecasts are weather-driven, weather conditions can be complex, and similar weather conditions lead to similar amounts of energy consumed or generated. The clustering transformation can therefore be used to find similar weather data points. Clustering groups data points such that data points in one cluster are more similar to each other than to data points in other clusters. As clustering high-dimensional data is challenging due to the curse of dimensionality (Zimek et al. 2012), we reduce the dimensionality of the input weather data using PCA before we cluster the weather data points. We evaluate three clustering transformations that differ in the number of considered clusters. The clustering 5 transformation uses five, the clustering 8 transformation eight, and the clustering 12 transformation twelve clusters. To determine the clusters, we apply the k-Means clustering (Lloyd 1982), which is a centroid-based clustering algorithm that iteratively moves centroids based on their corresponding data points. If the centroids do not change after an iteration, the clusters are found. We apply the clustering approach to the station-based or grid-based weather data that results from the previously introduced PCA using all eigenvectors up to a variance of 98%.
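The iteration of k-Means described above can be sketched in plain Python. The sketch assumes random initial centroids drawn from the data and Euclidean distance, and the example points stand in for PCA-reduced weather data; how the resulting cluster label is encoded for the forecasting model is not covered here.

```python
import math
import random

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c])) for p in points]

def k_means(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initial centroids are random data points
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:  # centroids unchanged: the clusters are found
            break
        centroids = updated
    return assign(points, centroids), centroids

# Hypothetical PCA-reduced weather data points forming two groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
labels, centroids = k_means(points, k=2)
```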
Autoencoder and variational autoencoder
Another way to reduce the input weather data feature space is an Autoencoder transformation. It transforms a given large input feature space into a latent space using two neural networks: an encoder and a decoder. The encoder transforms the input data points into a latent space from which the decoder reconstructs the original data points. The latent space is optimized such that the difference between the original data points and their reconstructions is as small as possible (Rumelhart et al. 1986). If the latent space corresponds to parameters of a variational distribution (e. g. a Gaussian distribution), it is called a Variational Autoencoder (VAE) (Kingma and Welling 2014). We refer to the related transformation as Variational Autoencoder transformation. We evaluate both the Autoencoder (AE) transformation and the VAE transformation with two different architectures for the encoder and decoder and latent space sizes of 16 and 32. The Fully Connected Network (FCN) AE 16, FCN AE 32, FCN VAE 16, and FCN VAE 32 transformations use a network with fully connected layers, whose input is a vector of station-based or grid-based weather data. The Convolutional Neural Network (CNN) AE 16, CNN AE 32, CNN VAE 16, and CNN VAE 32 transformations apply a network with convolutional layers and a fully connected layer in the latent space.
While the above-introduced transformations are applicable to both station-based and grid-based weather data, it may even be advantageous to convert station-based weather data into grid-based weather data or vice versa. Introducing transformations between station-based and grid-based weather data also increases comparability.
Interpolation
To convert station-based into grid-based weather data, we can use the interpolation transformation. It derives grid-based data from the locations of the station-based weather data by interpolation between the related weather data points. We evaluate the following three interpolation transformations that differ in the interpolation method applied to the non-spatial weather data: The bicubic interpolation transformation applies a cubic function to determine the interpolated weather data point of the 16 nearest data points (Keys 1981). The bilinear interpolation transformation uses the distance-weighted average of the four nearest data points as interpolated weather data point (Bredies and Lorenz 2018). Finally, the nearest interpolation transformation selects the value of the nearest weather data point (Bredies and Lorenz 2018). While the first two transformations can only interpolate within the convex hull of the locations, we can use the nearest interpolation transformation to extrapolate from the available locations.
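As an illustration of the simplest of the three variants, the nearest interpolation can be sketched as follows. The station coordinates, values, and grid are hypothetical, and measuring distance as plain Euclidean distance in degrees is a simplifying assumption.

```python
import math

def nearest_interpolation(stations, lats, lons):
    """Fill every grid point with the value of the closest station.

    stations: list of (lat, lon, value); lats and lons: grid coordinates.
    """
    grid = []
    for lat in lats:
        row = []
        for lon in lons:
            # Assumption: Euclidean distance in degrees instead of a geodesic distance.
            _, nearest_value = min(
                (math.hypot(lat - s_lat, lon - s_lon), s_val)
                for s_lat, s_lon, s_val in stations
            )
            row.append(nearest_value)
        grid.append(row)
    return grid

# Two hypothetical stations and a coarse grid over the study area.
stations = [(47.5, 7.5, 10.0), (49.5, 10.5, 20.0)]
grid = nearest_interpolation(stations, lats=[47.0, 48.0, 49.0, 50.0], lons=[7.0, 9.0, 11.0])
```

Because every grid point has a nearest station, this variant also fills grid points outside the convex hull of the station locations.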
Selection
To transform grid-based into station-based weather data, we can apply the selection transformation. It uses the locations of the weather stations in the station-based weather data and selects the corresponding nearest grid cell in the grid-based weather data. So, instead of the point measurements of the station-based weather data, we get the values of the corresponding grid cells, each of which represents the smallest spatial unit of the grid-based weather data. We evaluate the selection transformation using the weather stations available for the energy time series to be forecast.
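Selecting the nearest grid cell for a station location amounts to rounding its coordinates to the grid. The following sketch assumes a regular 0.25° grid starting at 47° latitude and 7° longitude that is stored with ascending coordinates; the storage order, the station locations, and the function names are assumptions.

```python
def nearest_grid_index(coord, start, step, size):
    """Index of the grid coordinate closest to coord, clipped to the grid."""
    index = round((coord - start) / step)
    return min(max(index, 0), size - 1)

def select_cells(grid, stations, lat0=47.0, lon0=7.0, step=0.25):
    """grid[i][j] holds the value at latitude lat0 + i * step and longitude lon0 + j * step."""
    n_lat, n_lon = len(grid), len(grid[0])
    return [
        grid[nearest_grid_index(lat, lat0, step, n_lat)][nearest_grid_index(lon, lon0, step, n_lon)]
        for lat, lon in stations
    ]

# 13 x 17 grid (47 to 50 degrees latitude, 7 to 11 degrees longitude); values encode their index.
grid = [[100 * i + j for j in range(17)] for i in range(13)]
stations = [(48.7, 9.2), (47.6, 7.6)]  # hypothetical station locations (lat, lon)
selected = select_cells(grid, stations)
```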
Spatial statistics
Another way to transform grid-based data into station-based weather data is the spatial statistics transformation. It determines statistical features for grid-based weather data over subareas of the input space. The subareas are equally distributed in all spatial dimensions without overlap. This transformation keeps some of the spatial information while lowering the spatial resolution and, thus, the computational complexity of the input weather data. We evaluate three spatial statistics transformations with the same numbers of statistical measures as for the statistics transformation, resulting in the spatial statistics small, the spatial statistics medium, and the spatial statistics large transformations. We calculate the related statistical measures on nine subareas for all three transformations by splitting the grid-based weather data into a \(3\times 3\) grid.
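The split into nine subareas and the statistics per subarea can be sketched as follows. How a grid whose side lengths are not divisible by three is split is not stated in the text, so the near-equal contiguous split used here is an assumption, as are the function names and the population standard deviation.

```python
import statistics

def split_bounds(n, parts):
    """Boundaries of contiguous, near-equal, non-overlapping index ranges."""
    return [round(k * n / parts) for k in range(parts + 1)]

def spatial_statistics(grid, parts=3, size="small"):
    """Statistics per subarea of a parts x parts split, concatenated into one vector."""
    rows = split_bounds(len(grid), parts)
    cols = split_bounds(len(grid[0]), parts)
    features = []
    for r in range(parts):
        for c in range(parts):
            cells = [v for row in grid[rows[r]:rows[r + 1]] for v in row[cols[c]:cols[c + 1]]]
            features.append(statistics.fmean(cells))
            if size in ("medium", "large"):
                features.append(statistics.pstdev(cells))
            if size == "large":
                features.extend([min(cells), max(cells)])
    return features

# Hypothetical 6 x 6 grid whose values encode their position.
grid = [[6 * r + c for c in range(6)] for r in range(6)]
small = spatial_statistics(grid, size="small")
large = spatial_statistics(grid, size="large")
```

The output has 9, 18, or 36 values per weather variable and time step for the small, medium, and large variants, respectively.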
Experimental setup
Before we present the results, we introduce the respectively used energy and weather data sets, the evaluated weather data transformations, the applied forecasting methods, and the metric to assess the forecasting results.
Energy data
The introduced weather data transformations are evaluated with respect to their influence on forecasting renewable energy time series. More specifically, we select electrical demand, solar power, and wind power generation as target forecast time series for our evaluation. As data, we use the corresponding energy time series for the German state of Baden-Württemberg provided by the corresponding transmission system operator TransnetBW and available via the Open Power System Data (OPSD) platform (footnote 1) (Wiese et al. 2019). We choose the hourly resolution of this data from the beginning of 2015 to the end of 2019 for the evaluation (see Fig. 1 for example weeks). In the following, we refer to these energy time series as electrical demand, solar power, and wind power. We shift these time series by 1 h to align them with the selected weather data such that the value at \(t_i\) describes the mean over the interval between \(t_{i-1}\) and \(t_{i}\).
In addition to the power measurements, we derive calendar information that the forecasting models can use as features. Using the timestamps of the energy time series, we calculate different sets of calendar features for electrical demand, solar, and wind power generation because useful calendar features vary between the three target energy time series. For example, holidays are helpful for electrical demand forecasting but not for solar and wind power generation forecasting. For electrical demand forecasting, we calculate the summertime, public holidays, and weekends as one-hot encoded vectors as well as the hour of the day, the day of the week, and the month of the year as sine-cosine encoded variables. In addition, we encode the month between 2015 and 2019 as a linearly increasing variable that ranges between 0 and 1, enabling a forecasting method to learn concept drifts. Regarding public holidays and weekends, we use the previous and following two days as well as the current day. Regarding solar and wind power generation forecasting, we calculate sine and cosine encodings for the hour of the day and the day of the year, as well as the same linear feature as for the electrical demand forecasts.
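A subset of these calendar features can be sketched with the standard library. The sketch omits summertime and public holidays, reduces the weekend information to a single flag for the current day, and assumes that the linear feature counts months from January 2015 (0) to December 2019 (1); the function and key names are hypothetical.

```python
import math
from datetime import datetime

def cyclic(value, period):
    """Map a periodic value onto the unit circle so that the end of a period meets its start."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def calendar_features(ts):
    hour_sin, hour_cos = cyclic(ts.hour, 24)
    dow_sin, dow_cos = cyclic(ts.weekday(), 7)
    month_sin, month_cos = cyclic(ts.month - 1, 12)
    # Assumption: month index from January 2015 (0) to December 2019 (59), scaled to [0, 1].
    trend = ((ts.year - 2015) * 12 + ts.month - 1) / 59
    return {
        "hour_sin": hour_sin, "hour_cos": hour_cos,
        "dow_sin": dow_sin, "dow_cos": dow_cos,
        "month_sin": month_sin, "month_cos": month_cos,
        "is_weekend": 1 if ts.weekday() >= 5 else 0,
        "trend": trend,
    }

first = calendar_features(datetime(2015, 1, 1, 0))
last = calendar_features(datetime(2019, 12, 28, 6))
```

The sine-cosine pair keeps hour 23 close to hour 0, which a single integer feature would not.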
Weather data
Since the presented transformations use station-based and grid-based weather data, we compare them on both types of data when forecasting the selected energy time series for the German state of Baden-Württemberg. In the following, we introduce the historical weather data sets we use. An exemplary station-based and grid-based weather input for electrical demand using temperature weather data can be seen in Fig. 2.
The grid-based weather data we consider comes from NWP models as produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). More specifically, we select ECMWF’s ERA5 Re-analysis data set (footnote 2) from the beginning of 2015 to the end of 2019 over a geographical area covering Baden-Württemberg, i. e. 47\(^{\circ }\) to 50\(^{\circ }\) latitude and 7\(^{\circ }\) to 11\(^{\circ }\) longitude with a 0.25\(^{\circ }\) resolution. This data set offers a broad range of weather variables (footnote 3), of which we use temperature two meters above the ground, surface net solar radiation (which includes both direct and diffuse solar radiation accumulated hourly), and the u and v components of wind ten meters above the ground. We then calculate the wind speed as the magnitude of the vector of the two wind components, i. e. its L2 norm \(\Vert (u, v)\Vert _2 = \sqrt{u^2 + v^2}\).
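The wind speed calculation is a one-liner; the following sketch uses hypothetical component values in meters per second.

```python
import math

def wind_speed(u, v):
    """Magnitude (L2 norm) of the horizontal wind vector; the wind direction is discarded."""
    return math.hypot(u, v)

# Hypothetical (u, v) wind components in m/s.
speeds = [wind_speed(u, v) for u, v in [(3.0, 4.0), (-6.0, 8.0), (0.0, 0.0)]]
```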
The station-based weather data used for the evaluation is the publicly available weather station data (footnote 4) from the German meteorological service, the Deutscher Wetterdienst (DWD). This weather station data includes the weather variables temperature two meters above ground, the hourly sum of global solar radiation (covering direct and diffuse solar radiation), and mean wind speed. Not all weather stations offer the same weather variables. Therefore, the number of available weather stations varies between the weather variables. From the available weather station data, we select weather stations for our evaluation based on three requirements. First, we only use weather stations with a maximum of 24 consecutive missing values. For these weather stations, we interpolate missing values using linear interpolation. Second, we only consider weather stations within the same area as the grid-based data described above. For comparison, Fig. 3 shows the locations of the overall used station-based weather data and the used area of the selected grid-based weather data. Third, we only consider the weather stations for each target energy time series with relevant weather variables available from the beginning of 2015 to the end of 2019. As a result, the number of considered weather stations varies across the three target energy time series (see the shape-encoded weather stations in Fig. 3a).
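The first requirement and the subsequent gap filling can be sketched as follows. Missing values are represented as `None`, the function names are hypothetical, and leaving leading or trailing gaps untouched is an assumption of this sketch.

```python
def longest_gap(series):
    """Length of the longest run of consecutive missing values (None)."""
    longest = current = 0
    for value in series:
        current = current + 1 if value is None else 0
        longest = max(longest, current)
    return longest

def fill_linear(series):
    """Linearly interpolate interior gaps; leading and trailing gaps stay untouched."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled

# Hypothetical hourly measurements of one station.
series = [1.0, None, None, 4.0, None, 6.0]
keep_station = longest_gap(series) <= 24  # requirement: at most 24 consecutive missing values
filled = fill_linear(series) if keep_station else None
```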
Architectures for processing weather data
To evaluate the introduced weather data transformations, we use a neural network as a subnetwork to process the selected weather data (weather subnet). Since we consider the use of the weather data as it is as a baseline, which we call raw data baseline in the following, we examine three different architectures, namely a 2D CNN, a 3D CNN, and an FCN for this raw data baseline (see Fig. 4). The 2D CNN is applied along the spatial dimensions for each time step and has more parameters than the 3D CNN. The 3D CNN is applied along spatial and temporal dimensions and has the fewest parameters. The FCN is a weighted sum of all weather data points and has the most parameters. All architectures are based on typical network architectures, and the hyperparameters (e. g. number of filters, number of neurons, or number of layers) are evaluated in advance to get the best forecasting performance for each architecture. Since all other weather data transformations have a vector as output, we use the FCN for them. We briefly describe all three architectures in the following.
The 2D CNN processes the selected weather data using two 2D convolutional layers with ReLU activation and max pooling for each weather time step with two and four feature maps (see Fig. 4a). Instead of processing each weather time step separately, the 3D CNN uses 3D convolutional layers with ReLU activation and max pooling that process the weather time series for the forecast horizon as a whole using four and eight feature maps (see Fig. 4b). Both CNNs have a fully connected layer with 128 neurons after the convolutional layer to achieve a latent representation size of 128. In addition, the FCN includes 128 neurons, leading to a latent space representation of size 128 (see Fig. 4c).
Forecasting methods
We apply two forecasting methods to the selected energy time series to evaluate the introduced transformations on the chosen station-based and grid-based weather data. For this, we use PyTorch to implement a Deep Neural Network (DNN) as a state-of-the-art method, Scikit-Learn to implement a Linear Regression (LR) as a baseline method, and pyWATTS (footnote 5) (Heidrich et al. 2021), a framework for building graph-based pipelines that we use to automate training and evaluation (footnote 6). Both forecasting methods get three inputs: electrical demand, solar power generation, or wind power generation time series, station-based or grid-based weather data, and calendar information (see Tables 1 and 2 in “Appendix” for more details on the inputs). An overview of the proposed pipeline is shown in Fig. 5. We use this data from 2015 to 2017 for training, 2018 for validation, and 2019 for testing. We apply both forecasting methods to obtain a multi-step 24 h-ahead forecast with a multiple output strategy. We introduce both forecasting methods concerning their architecture and training in the following.
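The year-based split and the multiple output strategy, in which one sample carries all 24 target values at once, can be sketched as follows. The history length of 24 h and the assignment of a sample to a split by its first forecast hour are assumptions of this sketch; the actual inputs are listed in the appendix tables.

```python
from datetime import datetime, timedelta

def split_name(ts):
    """2015 to 2017 for training, 2018 for validation, 2019 for testing."""
    if ts.year <= 2017:
        return "train"
    return "validation" if ts.year == 2018 else "test"

def make_samples(timestamps, values, history=24, horizon=24):
    """Build (past, future) pairs; each future holds all horizon steps (multiple output)."""
    samples = {"train": [], "validation": [], "test": []}
    for t in range(history, len(values) - horizon + 1):
        past = values[t - history:t]
        future = values[t:t + horizon]
        # Assumption: the first forecast hour decides the split of a sample.
        samples[split_name(timestamps[t])].append((past, future))
    return samples

# Hypothetical hourly series across the 2017/2018 boundary.
timestamps = [datetime(2017, 12, 30) + timedelta(hours=k) for k in range(96)]
values = list(range(96))
samples = make_samples(timestamps, values)
```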
The Deep Neural Network (DNN) processes the three inputs with three parts (see Fig. 6). The first part processes the energy time series with two 1D convolutional layers with eight and four feature maps. A subsequent fully connected layer results in a latent representation with a size of 64. The second part processes the weather data using a neural network architecture suitable for the respective weather data transformation described above and leads to a latent space representation of size 128. The third part directly passes the calendar information to the latent space representation of the energy time series. All three latent space representations are finally concatenated and processed with two fully connected layers with 128 and 64 neurons and ReLU activation. We train the networks using the Adam optimizer (Kingma and Ba 2015) with a learning rate of \(1\textrm{e}{-4}\) and the mean absolute error as the loss function. We also apply early stopping, which interrupts the training process if the validation loss has not decreased in the last 20 epochs.
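The early stopping rule can be sketched independently of the network itself. The sketch assumes that an epoch counts as an improvement only if its validation loss is strictly lower than the best loss seen so far; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=20):
    """Return the 1-based epoch at which training stops, or None if it never stops early."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Hypothetical validation losses: the best value occurs in epoch 3, then no improvement.
val_losses = [1.0, 0.8, 0.7] + [0.75] * 25
stop_epoch = early_stopping_epoch(val_losses, patience=20)
```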
In this study, we choose Linear Regression (LR) as the baseline method, which is commonly used to validate whether more complex models are well parameterized. It gets the same input data as the DNN. For each forecast step, we train a separate LR. All energy, weather, and calendar features are concatenated and passed to the LR method. It is defined as
where \(e \in E\) is the energy time series, \(w \in W\) are the weather features with N weather features for a single time step, \(c \in C\) the calendar features with M features for a single time step, and \(h \in H\) the forecast step.
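As an illustrative sketch only, a separate linear model per forecast step on the concatenated inputs can be written with a stacked feature vector. The vector \(x\), the intercept \(\beta _{h,0}\), and the coefficient vector \(\beta _h\) are notation assumed here and are not taken from the original definition:

\[{\hat{y}}_{h} = \beta _{h,0} + \beta _{h}^{\top } x, \qquad x = (e, w, c), \qquad h \in H,\]

where \(x\) stacks all energy, weather, and calendar features, and the coefficients \(\beta _{h,0}\) and \(\beta _h\) are fitted independently for each forecast step \(h\).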
Evaluation metrics
In the evaluation, we make use of two metrics. First, to compare the forecasts of one transformation between station-based and grid-based weather data, we calculate the Mean Absolute Error (MAE). Second, to compare the forecasts of the different introduced transformations, we use a skill score based on the MAE.
The hourly MAE is defined as
\[MAE_{h} = \frac{1}{|T|} \sum _{t \in T} \left| y_{t+h} - {\hat{y}}_{t,h} \right| ,\]
where \(h \in H\) is the forecast step, \(t \in T\) the time step, \(y_{t+h}\) the true value, and \({\hat{y}}_{t,h}\) the forecast value. For the evaluation, we consider the average of the hourly MAEs.
Given the hourly MAE, the skill score then describes the improvement in the MAE relative to a baseline forecast in percent, so that a positive skill score indicates a more accurate forecast than the baseline. We calculate the hourly skill score with
\[Skill_{h} = \left( 1 - \frac{MAE_{h}}{MAE_{h}^{baseline}} \right) \cdot 100\% ,\]
where \(h \in H\) is the forecast step, \(MAE_{h}\) is the MAE of the forecasting method, and \(MAE_{h}^{baseline}\) is the MAE of the baseline forecast. For the evaluation, we consider the average of the hourly skill scores.
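Both metrics can be computed per forecast step in a few lines. The sketch assumes the common convention that the skill score is the relative MAE reduction over the baseline in percent, so that positive values mean an improvement, which matches how the skill scores are read in the results; the data values are hypothetical.

```python
def hourly_mae(y_true, y_pred, horizon):
    """MAE per forecast step; y_true[t][h] and y_pred[t][h] belong to origin t and step h."""
    n = len(y_true)
    return [sum(abs(y_true[t][h] - y_pred[t][h]) for t in range(n)) / n for h in range(horizon)]

def hourly_skill(mae, mae_baseline):
    """Relative MAE reduction over the baseline in percent (assumed sign convention)."""
    return [100.0 * (1.0 - m / b) for m, b in zip(mae, mae_baseline)]

# Hypothetical values for two forecast origins and a two-step horizon.
y_true = [[10.0, 20.0], [30.0, 40.0]]
y_model = [[11.0, 18.0], [29.0, 43.0]]
y_baseline = [[12.0, 16.0], [28.0, 45.0]]

mae_model = hourly_mae(y_true, y_model, horizon=2)
mae_baseline = hourly_mae(y_true, y_baseline, horizon=2)
skill = hourly_skill(mae_model, mae_baseline)
mean_skill = sum(skill) / len(skill)
```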
Results
In this section, we present the evaluation of the different introduced weather data transformations for the three target energy time series electrical demand, solar power, and wind power. We start with preliminary analyses to streamline the subsequent comparison of the weather data transformations, which includes three steps. First, we compare the forecasting performance of the weather data transformations on station-based and grid-based weather data. For each target energy time series, we compare all weather data transformations based on the mean skill score of their multi-step 24 h-ahead forecasts with a multiple output strategy. For the sake of clarity, we only show the best-performing version of each weather data transformation. Second, we select the three best-performing weather data transformations for station- and grid-based weather data and compare each on a 24-h forecast horizon. Third, we compare the best-performing weather data transformation for station-based and the best-performing weather data transformation for grid-based weather data to determine which data are most beneficial for a forecast. An illustration of the evaluation process can be seen in Fig. 7.
Preliminary analyses
Before we compare the different weather data transformations, we perform three preliminary analyses that reduce the complexity of the final forecast comparisons. We first determine baseline models by selecting the best-performing weather processing architectures for both station-based and grid-based weather data based on the raw data. Second, we determine which interpolation transformation performs best on the station-based weather data for each target energy time series. Lastly, we compare the DNN, which we use as the forecasting method in the evaluation, to the LR to verify its competitiveness.
Raw data baselines
The first preliminary analysis identifies the best weather data processing architectures for the raw data of the station-based and grid-based weather data, which we want to use as baselines. To identify these baselines, we compare the forecast accuracy of the 2D CNN, the 3D CNN, and the FCN on both the station-based and the grid-based weather data for all three target energy time series. The MAE for this analysis can be seen in Fig. 8.
Generally, the accuracy differences between the architectures are minor across the target energy time series. Concerning the station-based weather data, the 3D CNN performs best for electrical demand and the 2D CNN for solar and wind power. For the grid-based weather data, the FCN performs best for electrical demand and wind power, whereas the 3D CNN achieves the lowest MAE for solar power. We use these best-performing architectures as baselines for the respective weather data and target energy time series.
Best-performing interpolation for station-based weather data
The second preliminary analysis determines which interpolation transformation of the station-based weather data performs best for each target energy time series and thus should be used in the subsequent comparisons. For each of the three target energy time series, we compare the forecast accuracy of the bicubic, bilinear, and nearest interpolation using the raw data of the station-based weather data.
As we can see in Fig. 9, no interpolation transformation outperforms all others for the three target energy time series. For electrical demand and wind power, the nearest interpolation has the lowest MAE, while the bilinear interpolation performs best for solar power. We select these best-performing interpolation transformations for the three target energy time series and refer to each of them as interpolation transformation in the following.
Competitiveness of DNN
In the third preliminary analysis, we verify the competitiveness of the DNN. For each target energy time series, we compare the forecast accuracy of the best-performing DNN to that of the LR as the benchmark forecasting method using the raw data of both station-based and grid-based weather data.
Figure 10 shows the MAEs of the DNN and the LR on the station-based and grid-based weather data for the three target energy time series. We can see that the DNN provides a lower MAE than the LR for all three target energy time series, thus outperforming the LR.
Electrical demand forecast
Our preliminary analysis allows comparing the different weather data transformations for each target time series in the following. We start with the electrical demand. For the electrical demand forecast, the skill scores of all weather data transformations using the station-based and grid-based weather data are shown in Fig. 11. The skill is compared to using the raw data baseline, and the three best-performing weather data transformations are highlighted in orange.
For the station-based data, we observe that six of the weather data transformations, namely the statistics small, PCA 90%, PCA clustering 8, interpolation statistics medium, interpolation PCA 90%, and interpolation PCA clustering 8 transformations, provide a considerable improvement in the forecast accuracy. The interpolation and interpolation CNN VAE transformations show only a slight improvement, and the interpolation spatial statistics large and interpolation FCN AE transformations show no improvement over the baseline. The best-performing transformations are the interpolation PCA 90%, interpolation PCA clustering 8, and statistics small transformations.
For the grid-based weather data, all weather data transformations except for the spatial statistics small, FCN VAE, and CNN AE transformations show a clear positive skill score. While the spatial statistics small transformation achieves only a slight improvement and the CNN AE transformation no improvement, the FCN VAE transformation results in worse accuracy. The three best-performing weather data transformations are the selection, statistics medium, and selection statistics medium transformations.
Given these results using the mean scores, we now also compare the three best-performing weather data transformations for each type of weather data on the individual forecast steps from 1 to 24 h. The corresponding skill scores are shown in Fig. 12. The weather data transformations for the station-based weather data show the largest improvements of up to 7% between 1 and 2 h and between 19 and 24 h, and improvements between 1 and 4% in the remaining hours. The considered weather data transformations for the grid-based weather data provide a more constant improvement over the forecast horizon of about 3 to 7%. Only the improvement provided by the selection transformation decreases below 3% starting at hour 19.
Solar power forecast
Concerning the solar power forecast, Fig. 13 shows the skill scores of all weather data transformations using the station-based and grid-based weather data compared to the raw data baseline. The three best-performing weather data transformations are again highlighted in orange.
For the station-based weather data, only the interpolation spatial statistics small transformation achieves a positive skill score and thus improves the forecast accuracy. The interpolation, interpolation FCN AE 16, and interpolation CNN AE 16 transformations have a skill score close to zero, and all other weather data transformations result in negative skill scores. Overall, the three best-performing weather data transformations are the interpolation spatial statistics small, interpolation, and interpolation FCN AE 16 transformations.
For the grid-based weather data, four weather data transformations have a positive skill score, namely the statistics large, spatial statistics small, PCA 98%, and PCA clustering 8 transformations. The CNN AE 16 and selection transformations achieve skill scores close to zero, whereas the remaining weather data transformations have negative skill scores. For the grid-based weather data, the PCA 98%, statistics large, and spatial statistics small transformations are, therefore, the best weather data transformations.
Figure 14 shows the skill scores of the three best-performing weather data transformations over each step of the 24-h forecast horizon. For the station-based weather data, the performance is very mixed, with all transformations having a negative trend starting from hour 18 and only the spatial statistics small transformation having a mostly positive skill score. For the grid-based weather data, the transformations provide relatively constant improvements up to around 6% over the whole forecast horizon and even show a positive trend at the end of the forecast horizon.
Wind power forecast
The last target energy time series is wind power, where Fig. 15 shows the skill scores of all weather data transformations using the station-based and grid-based weather data. The three best-performing weather data transformations are again highlighted in orange.
For the station-based weather data, we observe that only the interpolation and interpolation CNN VAE 32 transformations have a positive skill score and thus improve the forecast accuracy. All other weather data transformations provide negative skill scores, a few even below − 7.5%.
For the grid-based weather data, all evaluated weather data representations provide negative skill scores, and most are below or well below − 10%. Thus on average, over the forecast horizon, they do not improve the forecast accuracy over the raw data baseline.
These mostly negative results are slightly improved when looking at the errors per forecast step. Focusing on the three best-performing weather data transformations, Fig. 16 shows their skill scores for the station-based and grid-based weather data for the 24-h forecast horizon. Regarding the station-based data, the interpolation and interpolation spatial statistics large transformations start with highly negative skill scores at hour 1 but achieve positive skill scores starting from hour 4 or 8. They continue with positive skill scores until the end of the forecast horizon. In contrast to that, the interpolation CNN VAE 32 transformation shows positive skill scores over the complete forecast horizon. The observation is reversed for the grid-based weather data: The weather data representations have very high positive skill scores at hour 0, but the skill scores constantly decrease until hour 8. Starting from hour 3 or 5, they become negative and stay negative until the end of the forecast horizon.
Station-based vs. grid-based weather data
Finally, we compare the forecast accuracy of the best-performing weather data transformation for station-based weather data with that of the best-performing weather data transformation for grid-based weather data to determine which data are most beneficial for a forecast. We first compare them with their average MAE before we consider the MAE for the 24-h forecast horizon. Figure 17 shows the average MAE, with interpolation DR 90 (electrical demand), interpolation spatial statistics small (solar power), interpolation raw (wind power) as the best-performing station-based weather data transformation for demand, solar and wind respectively, and selection raw (electrical demand and wind power) and DR 98 (solar power) as the best-performing grid-based weather data transformation.
We observe that transforming the weather data for electrical demand yields a lower MAE, with the station-based transformation providing the lowest error. For solar power, transforming the weather data achieves a slightly better accuracy than the raw data. The station-based weather data transformation results in noticeably lower errors than the grid-based weather data transformation. For wind power, the best-performing weather data transformations have a lower MAE than the raw data for the station-based weather data but not for the grid-based weather data. Overall, the raw data for the grid-based weather data shows the best forecast accuracy for wind power.
This analysis of the average errors remains consistent for the individual steps of the 24-h forecast horizon, as can be seen in Fig. 18. For electrical demand, the best-performing station-based weather data transformation achieves a lower MAE than the raw data, especially for hours 1 to 3 and 16 to 24. The best-performing grid-based weather data transformation yields a consistently lower MAE over the forecast horizon. With respect to solar power, the best-performing weather data transformation generally provides a lower MAE than the raw data over the forecast horizon. However, the difference is rather small for hours 1 to 4, and the raw data has a lower MAE at hour 22 for the station-based weather data. Concerning wind power, the best-performing station-based weather data transformation achieves remarkably lower MAEs than the raw data at hours 5 to 24 but higher MAEs at hours 1 to 4. Conversely, the best-performing grid-based weather data transformation has a lower MAE than the raw data at hours 1 to 4 but higher MAEs at all remaining hours.
Discussion
In this section, we interpret the results, discuss limitations, and point out the benefits of weather data transformations for station-based and grid-based weather data in energy time series forecasting.
Overall, we can see that the introduced weather data transformations can improve the forecast accuracy for all target energy time series compared to their respective raw data baselines. However, no single weather data transformation outperforms all others, and the impact of the transformation differs for the target energy time series. Nonetheless, we show that evaluating data transformations is beneficial, especially for electrical demand and solar power, where statistical and data reduction transformations are among the best. While several station-based and grid-based weather data transformations improve the forecasting accuracy of the electrical demand between 3.4 and 4.3%, the forecast accuracy of solar power is more consistently improved by grid-based weather data transformations with accuracy improvements between 1.2 and 3.7%. Lastly, for wind power, only one station-based weather data transformation improves the forecast accuracy by 5.2%, but no grid-based transformation achieves an improvement. The reason why weather data transformations mostly reduce the wind power forecast accuracy could be the characteristics of the data set. While Baden-Württemberg’s population is dense (footnote 7) and many houses have solar power plants on the roof, wind power plants are mainly located in the northeast (footnote 8). So, reducing the spatial resolution hinders accurate wind power forecasts. In addition, wind speeds vary more than solar radiation or temperature.
While we find that the most beneficial weather data transformation depends on the target energy time series and on the weather data, we also observe that the interpolation transformation or combinations with it represent the top three weather data transformations for all three target energy time series when using station-based weather data. Analogously, the statistics transformation is part of the top three weather data transformations when using grid-based weather data.
This analysis is subject to a few limitations, which we discuss in the following. Besides the difference in the weather variables, we only use one weather variable to forecast each target energy time series. We select the temperature for electrical demand, solar radiation for solar power, and wind speed for wind power. While these are the most influential weather variables for these energy time series, further investigations could evaluate whether using more weather variables in combination with weather data transformations changes the impact of transformations on the forecast accuracy.
Additionally, our evaluation uses actual historical station-based and grid-based weather data, thus providing an upper bound for the forecast accuracy that can be achieved as there is no forecast error noise in the weather variables. From a practical point of view, it could be helpful to extend the evaluation to forecast weather data such as ECMWF’s ERA5 High-Resolution NWP forecast data set.
Also, the presented evaluation covers a broad range of weather data transformations and their parameters. However, other weather data transformations and parameters are possible. To give an example, instead of using k-means for the clustering transformation, hierarchical-based or density-based clustering algorithms are also suitable. Similarly, the presented results are based on the selected neural networks for processing the weather data and forecasting the energy time series. Therefore, it could be of interest to extend the evaluation accordingly.
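As an illustration of such an alternative, the following minimal sketch replaces k-means with a simple average-linkage agglomerative (hierarchical) clustering and then summarizes each cluster by its mean. It is a sketch under assumptions and not the implementation used in the evaluation: clustering weather stations by their readings, using the Euclidean distance, and averaging within clusters are choices made here for illustration, and all names and values are hypothetical.

```python
import math


def agglomerative_clusters(points, n_clusters):
    """Average-linkage agglomerative clustering; returns a list of index lists."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between the members of two clusters
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge it
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def cluster_means(points, clusters):
    """Summarize each cluster by the element-wise mean of its members."""
    dim = len(points[0])
    return [
        [sum(points[i][d] for i in c) / len(c) for d in range(dim)]
        for c in clusters
    ]


# Hypothetical temperature readings (°C) of five stations at three time steps
stations = [
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 3.1],
    [8.0, 9.0, 10.0],
    [8.2, 9.1, 10.3],
    [15.0, 16.0, 17.0],
]

clusters = agglomerative_clusters(stations, n_clusters=3)
reduced = cluster_means(stations, clusters)
print(clusters)
print(reduced)
```

Unlike k-means, this procedure needs no random initialization, but the naive pairwise search shown here scales poorly with the number of points, so a library implementation would be preferable for larger data sets.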
Lastly, we evaluate three different network architectures for the weather data subnetwork and only use one architecture for the energy time series. Other architectures are possible, such as Recurrent Neural Networks (RNNs) or Long Short-Term Memory networks (LSTMs). Nonetheless, more complex architectures are harder to train, need more time for training, and often require more data. We think that data transformations also have a positive effect on more complex architectures. However, more research is needed.
Overall, the results indicate that smartly selecting weather data transformations for station-based or grid-based weather data improves the accuracy of energy time series forecasts where statistical or data reduction transformations are among the best.
Conclusion
In the present paper, we investigate how weather data transformations for station-based and grid-based weather data affect the accuracy of energy time series forecasts. We first identify the best-performing neural network architectures to process the weather data and determine the baselines for our evaluation. We then evaluate several weather data transformations in forecasting three energy time series: electrical demand, solar power, and wind power. Finally, we use a skill score for each energy time series to examine the effect of transforming weather data on forecast accuracy.
The results show that choosing a proper weather data transformation together with either station-based or grid-based weather data is beneficial for the forecast accuracy. For the electrical demand, using the best-performing weather data transformation improves the forecast accuracy for both station-based and grid-based weather data compared to the baseline transformation. At the same time, neither station-based nor grid-based weather data has a clear advantage over the other. For solar power, the best-performing weather data transformation yields better accuracy for both weather data sets; however, using grid-based weather data results in higher accuracy. For wind power, the best-performing weather data transformation can improve the forecast using station-based weather data. However, the best-performing weather data transformation has no positive effect when using grid-based weather data, where the raw data leads to the comparatively highest accuracy. While the results differ for the considered energy time series, the interpolation transformation and combinations with it are beneficial for station-based weather data and the statistics transformation for grid-based weather data.
In future work, we want to evaluate weather data transformations for energy time series forecasting using multiple weather variables and forecast weather data. Additionally, we want to examine further weather data transformations, for example, combinations of evaluated weather data transformations or more advanced autoencoders.
Availability of data and materials
The implementations will be made available after the acceptance of the paper. The energy data used for the evaluation is openly available via the Open Power System Data (OPSD) platform at https://open-power-system-data.org/. The grid-based weather data used for the evaluation is the ERA5 Re-analysis data set from the European Centre for Medium-Range Weather Forecasts (ECMWF), which is available at https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5 depending on the user’s status. The station-based weather data used for the evaluation is the weather station data from the Deutscher Wetterdienst (DWD), which is openly available at https://opendata.dwd.de/.
Notes
The implementations are available at https://github.com/KIT-IAI/Weather-Data-Transformations-for-Energy-Time-Series-Forecasting.
https://www.deutschlandatlas.bund.de/DE/Karten/Wo-wir-leben/006-Bevoelkerungsdichte.html.
https://www.lubw.baden-wuerttemberg.de/erneuerbare-energien/karten.
References
Al-Yahyai S, Charabi Y, Gastli A (2010) Review of the use of Numerical Weather Prediction (NWP) Models for wind energy assessment. Renew Sustain Energy Rev 14(9):3192–3198. https://doi.org/10.1016/j.rser.2010.07.001
Andrade JR, Bessa RJ (2017) Improving renewable energy forecasting with a grid of numerical weather predictions. IEEE Trans Sustain Energy 8(4):1571–1580. https://doi.org/10.1109/TSTE.2017.2694340
Beichter M, Phipps K, Frysztacki MM, Mikut R, Hagenmeyer V, Ludwig N (2022) Net load forecasting using different aggregation levels. Energy Inform 5:19. https://doi.org/10.1186/s42162-022-00213-8
Bloomfield HC, Gonzalez PLM, Lundquist JK, Stoop LP, Browell J, Dargaville R, Felice M, Gruber K, Hilbers A, Kies A, Panteli M, Thornton HE, Wohland J, Zeyringer M, Brayshaw DJ (2021) The importance of weather and climate to energy systems: a workshop on next generation challenges in energy-climate modeling. Bull Am Meteorol Soc 102(1):159–167. https://doi.org/10.1175/BAMS-D-20-0256.1
Bredies K, Lorenz D (2018) Mathematical image processing. Birkhäuser, Cham. https://doi.org/10.1007/978-3-030-01458-2
Clarke L, Wei YM, Navarro ADLV, Garg A, Hahmann AN, Khennas S, Azevedo IML, Löschel A, Singh AK, Steg L, Strbac G, Wada K (2022) Energy systems. In: Shukla PR, Skea J, Slade R, Khourdajie AA, Diemen R, McCollum D, Pathak M, Some S, Vyas P, Fradera R, Belkacemi M, Hasija A, Lisboa G, Luz S, Malley J (eds) Climate change 2022: mitigation of climate change. Contribution of working group III to the sixth assessment report of the intergovernmental panel on climate change. Cambridge University Press, Cambridge, UK and New York, NY, USA. Chap. 6. https://doi.org/10.1017/9781009157926.008
Cococcioni M, D’Andrea E, Lazzerini B (2012) One day-ahead forecasting of energy production in solar photovoltaic installations: an empirical study. Intell Dec Technol 6(3):197–210. https://doi.org/10.3233/IDT-2012-0136
Dahl M, Brun A, Andresen GB (2017) Using ensemble weather predictions in district heating operation and load forecasting. Appl Energy 193:455–465. https://doi.org/10.1016/j.apenergy.2017.02.066
Do H, Cetin KS (2018) Residential building energy consumption: a review of energy data availability, characteristics, and energy performance prediction methods. Curr Sustain Renew Energy Rep 5(1):76–85. https://doi.org/10.1007/s40518-018-0099-3
Du P (2019) Ensemble machine learning-based wind forecasting to combine NWP output with data from weather station. IEEE Trans Sustain Energy 10(4):2133–2141. https://doi.org/10.1109/TSTE.2018.2880615
Elizabeth Michael N, Hasan S, Al-Durra A, Mishra M (2022) Short-term solar irradiance forecasting based on a novel Bayesian optimized deep Long Short-Term Memory neural network. Appl Energy 324:119727. https://doi.org/10.1016/j.apenergy.2022.119727
Feng C, Zhang J, Zhang W, Hodge B-M (2022) Convolutional neural networks for intra-hour solar forecasting based on sky image sequences. Appl Energy 310:118438. https://doi.org/10.1016/j.apenergy.2021.118438
González Ordiano JÁ, Waczowicz S, Hagenmeyer V, Mikut R (2018) Energy forecasting tools and services. Wiley Interdiscip Rev Data Min Knowl Discov 8(2):1235. https://doi.org/10.1002/widm.1235
Gutierrez-Corea F-V, Manso-Callejo M-A, Moreno-Regidor M-P, Manrique-Sancho M-T (2016) Forecasting short-term solar irradiance based on artificial neural networks and data from neighboring meteorological stations. Solar Energy 134:119–131. https://doi.org/10.1016/j.solener.2016.04.020
Harish S, Singh N, Tongia R (2020) Impact of temperature on electricity demand: evidence from Delhi and Indian states. Energy Policy 140:111445. https://doi.org/10.1016/j.enpol.2020.111445
He X, Zhao K, Chu X (2021) AutoML: a survey of the state-of-the-art. Knowl Based Syst 212:106622. https://doi.org/10.1016/j.knosys.2020.106622
Heidrich B, Bartschat A, Turowski M, Neumann O, Phipps K, Meisenbacher S, Schmieder K, Ludwig N, Mikut R, Hagenmeyer V (2021) pyWATTS: python workflow automation tool for time series. https://doi.org/10.48550/arXiv.2106.10157
Hong T, Pinson P, Fan S (2014) Global energy forecasting competition 2012. Int J Forecast 30(2):357–363. https://doi.org/10.1016/j.ijforecast.2013.07.001
Hu S, Xiang Y, Zhang H, Xie S, Li J, Gu C, Sun W, Liu J (2021) Hybrid forecasting method for wind power integrating spatial correlation and corrected numerical weather prediction. Appl Energy 293:116951. https://doi.org/10.1016/j.apenergy.2021.116951
Hu Y, Cheng X, Wang S, Chen J, Zhao T, Dai E (2022) Times series forecasting for urban building energy consumption based on graph convolutional network. Appl Energy 307:118231. https://doi.org/10.1016/j.apenergy.2021.118231
Keys R (1981) Cubic convolution interpolation for digital image processing. IEEE Trans Acoust Speech Signal Process 29:1153–1160. https://doi.org/10.1109/TASSP.1981.1163711
Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. In: 3rd international conference on learning representations (ICLR 2015). https://doi.org/10.48550/arXiv.1412.6980
Kingma DP, Welling M (2014) Auto-encoding variational Bayes https://doi.org/10.48550/arXiv.1312.6114
Kong W, Jia Y, Dong ZY, Meng K, Chai S (2020) Hybrid approaches based on deep whole-sky-image learning to photovoltaic generation forecasting. Appl Energy 280:115875. https://doi.org/10.1016/j.apenergy.2020.115875
Lei M, Shiyan L, Chuanwen J, Hongling L, Yan Z (2009) A review on the forecasting of wind speed and generated power. Renew Sustain Energy Rev 13(4):915–920. https://doi.org/10.1016/j.rser.2008.02.002
Liu H, Chen C (2019) Data processing strategies in wind energy forecasting models and applications: A comprehensive review. Appl Energy 249:392–408. https://doi.org/10.1016/j.apenergy.2019.04.188
Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28:129–137. https://doi.org/10.1109/TIT.1982.1056489
Ludwig N, Arora S, Taylor JW (2023) Probabilistic load forecasting using post-processed weather ensemble predictions. J Oper Res Soc 74(3):1008–1020. https://doi.org/10.1080/01605682.2022.2115411
Neumann O, Ludwig N, Turowski M, Heidrich B, Hagenmeyer V, Mikut R (2021) Smart data representations: impact on the accuracy of deep neural networks. In: Schulte H, Hoffmann F, Mikut R (eds) Proceedings 31 workshop computational intelligence, pp 113–130. KIT Scientific Publishing, Karlsruhe. https://doi.org/10.5445/KSP/1000138532
Pearson K (1901) LIII. On lines and planes of closest fit to systems of points in space. Lond Edinburgh Dublin Philos Mag J Sci 2(11):559–572. https://doi.org/10.1080/14786440109462720
Ren X, Zhang F, Zhu H, Liu Y (2022) Quad-kernel deep convolutional neural network for intra-hour photovoltaic power forecasting. Appl Energy 323:119682. https://doi.org/10.1016/j.apenergy.2022.119682
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. https://doi.org/10.1038/323533a0
Si Z, Yang M, Yu Y, Ding T (2021) Photovoltaic power forecast based on satellite images considering effects of solar position. Appl Energy 302:117514. https://doi.org/10.1016/j.apenergy.2021.117514
Simeunović J, Schubnel B, Alet P-J, Carrillo RE, Frossard P (2022) Interpretable temporal-spatial graph attention network for multi-site PV power forecasting. Appl Energy 327:120127. https://doi.org/10.1016/j.apenergy.2022.120127
Vanting NB, Ma Z, Jørgensen BN (2021) A scoping review of deep neural networks for electric load forecasting. Energy Inform 4:49. https://doi.org/10.1186/s42162-021-00148-6
Vladislavleva E, Friedrich T, Neumann F, Wagner M (2013) Predicting the energy output of wind farms based on weather data: Important variables and their correlation. Renew Energy 50:236–243. https://doi.org/10.1016/j.renene.2012.06.036
Wang H, Lei Z, Zhang X, Zhou B, Peng J (2019) A review of deep learning for renewable energy forecasting. Energy Convers Manag 198:111799. https://doi.org/10.1016/j.enconman.2019.111799
Wiese F, Schlecht I, Bunke W-D, Gerbaulet C, Hirth L, Jahn M, Kunz F, Lorenz C, Mühlenpfordt J, Reimann J, Schill W-P (2019) Open power system data—frictionless data for electricity system modelling. Appl Energy 236:401–409. https://doi.org/10.1016/j.apenergy.2018.11.097
Zimek A, Schubert E, Kriegel H-P (2012) A survey on unsupervised outlier detection in high-dimensional numerical data. Stat Anal Data Min ASA Data Sci J 5:363–387. https://doi.org/10.1002/sam.11161
Acknowledgements
Not applicable.
Funding
Open Access funding enabled and organized by Projekt DEAL. This project is funded by the Helmholtz Association’s Initiative and Networking Fund through Helmholtz AI, the Helmholtz Association under the Program “Energy System Design”, and the German Research Foundation (DFG) under Germany’s Excellence Strategy - EXC number 2064/1 - Project number 390727645.
Author information
Contributions
ON: Conceptualization, methodology, software, validation, investigation, writing—original draft, visualization. MT: Conceptualization, methodology, investigation, writing—original draft, writing—review & editing. RM: Resources, writing—review & editing, supervision, funding acquisition. VH: Resources, writing—review & editing, supervision, funding acquisition. NL: Conceptualization, methodology, investigation, writing—original draft, writing—review & editing, supervision. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Neumann, O., Turowski, M., Mikut, R. et al. Using weather data in energy time series forecasting: the benefit of input data transformations. Energy Inform 6, 44 (2023). https://doi.org/10.1186/s42162-023-00299-8