Abstract
Energy system models require a large amount of technical and economic data, and the quality of these data significantly affects the reliability of the results. However, some publicly available data sets, such as the transmission system operators’ day-ahead load forecasts, are known to be biased and inaccurate, leading to lower energy system model performance. We propose a time series model that enhances the accuracy of transmission system operators’ load forecast data in real-time, using only the load forecast error’s history as input. We further present an energy system model developed specifically for price forecasts of the short-term day-ahead market. We demonstrate the effectiveness of the improved load data as input by applying it to this model, which shows a strong reduction in pricing errors, particularly during periods of high prices and tight markets. Our results highlight the potential of our method to enhance the accuracy of energy system models using improved input data.
1 Introduction
Energy markets are complex and exhibit non-trivial interdependencies, so decisions from policy and industry stakeholders rely on theoretical models and other methodological support. Techno-economic energy system models are widely used in academia, policy-making and industry. Typically, they determine market equilibria, minimising production costs or maximising social welfare. A market’s supply and demand sides are equally essential to derive equilibria. Various models have been developed using time series of load data as an essential input on the demand side. On the supply side, models focus on power plants (electricity system models) or gas production (gas systems). Transmission and distribution infrastructure, i.e., connecting supply and demand, can also be included and analysed with energy system models. A strength of these models is that they can provide valuable insights into both causes and effects of current and planned developments, as well as into “what-if” types of analyses. They are capable of reflecting structural breaks better than most other model types. Thus, energy system models are among the most essential methodologies for a successful energy transition.
However, they rely on the quality of input data to provide accurate results. Preparing and collecting data for energy system models is a challenge, and tremendous efforts have been made to generate techno-economic data [see, e.g., 33, 56] or forecast data [see, e.g., 34], among others. Moreover, the literature has shown that widely used input data sets for energy system models, in particular load data and wind or solar forecasts from official sources, often have significant systematic errors [28, 39]. In our paper, we refer to these results and elaborate on how these errors can be reduced by real-time time series filters. Considering the errors as an econometric time series, serial structures in these errors can be used to predict future errors, which, in effect, significantly reduces the errors themselves. We then analyse whether using these improved input data in an energy system model will improve model quality.
The contribution of this paper is threefold. First, we develop and provide a simple time-series model reducing forecast errors of hourly day-ahead load predictions of transmission system operators (TSOs) in real-time. We focus on load forecasts because they are the most correlated with the prices of the day-ahead electricity market and have the most potential for improvement compared with wind and PV forecasts [see, e.g., 39]. One advantage of our approach is that we take publicly available TSO-based load forecasts as given and model their prediction error directly, so we do not need to develop a complex load forecast model. At the country level, load forecasts are often used to represent the demand in the day-ahead market clearing.Footnote 1 Thus, load forecasts are central variables for determining equilibria of demand and supply in energy system models.
Second, we present a fundamental energy system dispatch model called the em.power dispatch model, developed and calibrated precisely for short-term use in the day-ahead market. A primary objective of this model is to predict wholesale electricity prices. Using a rolling window, it consecutively determines the optimal power plant operation for three consecutive days. Moreover, the model considers hourly net transfer capacities to limit electricity transmission across countries and a formulation for medium- and long-term energy storage. We describe these steps in detail in Sect. 4.
Third, we demonstrate the value of sequentially and continuously improving the quality of input variables in fundamental energy system models in the empirical part of the paper. We consider TSO day-ahead load forecasts provided by one of the most used data sources [21] and day-ahead prices forecasted with the energy system model for Germany, one of the largest and most liquid electricity markets in the world. By capturing and reflecting systematic biases and autoregressive structures, we reduce the mean squared error by 26% compared to the TSO-based load forecast. Therefore, market participants’ expectations of the day-ahead market clearing can be better reflected. As a result, the mean squared error of the em.power dispatch model’s price forecast is reduced by nearly 15% in hours with high prices using the improved load forecast compared to using the TSO load forecast. By demonstrating that energy system models with the improved load data perform significantly better compared to the TSO data, we provide valuable insights for many stakeholders in the power sector, particularly energy system model developers seeking to improve the validity of their models. Based on these results, we encourage energy system modelers and all users of fundamental input data to be aware of the predictable structure of their errors. In particular, stochastic modelling of the errors significantly reduces the forecast error of input data. It thus improves the quality of input data as part of sequential data pre-processing in real-time and offers the possibility to enhance the output of fundamental energy system models.
The remainder of the paper is organised as follows. First, we examine the literature on energy system modelling, data quality and time series modelling in Sect. 2. Section 3 presents the data used in this application. In Sect. 4, we provide and explain the methodology for the model improving the load forecasts and the energy system model used to evaluate the impact of the improved load data. The results are presented in Sect. 5. Finally, a conclusion is drawn in Sect. 6.
2 Literature
With our paper, we address energy system modelers who model energy systems with a high degree of detail and therefore require large data sets that are as accurate as possible. Out of a wide range of modelling applications, examples include the determination and assessment of long-term investment decisions for generation and storage capacities [e.g., 45, 54] or implications on short-term operational decisions [e.g., 55], transmission expansion planning [e.g., 13, 53], the evaluation of carbon reduction paths [e.g., 61] and support schemes for renewable energy systems [e.g., 31] and the evaluation of interdependencies between energy sectors (e.g., [36] for electricity and gas markets, [25] for transport, electricity and district heating, [32] for electricity, hydrogen and methane). Moreover, scholars developed stochastic models to assess the impact of uncertainty on a power system [50], for example, to quantify the expected costs of ignoring uncertainty of critical parameters in the electricity and gas sector. An overview and classification of stochastic models dealing with uncertainty in the power sector is provided by [42]. With regard to uncertainty, scholars analyse the effect of risk preferences as well [e.g., 2, 41].
In particular, our paper analyses the impact of better load forecasts on the day-ahead forecast of wholesale electricity prices using a fundamental energy system model. Estimating wholesale electricity prices is essential for making optimal economic decisions (e.g., investment and dispatch of various technologies) and policy decisions (e.g., calculating the implications of a coal phase-out). Wholesale electricity prices can be forecasted with multiple methodologies, each with its own advantages and disadvantages. Energy system models have several advantages: they perform exceptionally well at structural breaks, are based on a broad economic theoretical foundation explaining causality, and provide additional information beyond the forecast. Consequently, much attention has been paid in the literature to the simulation or prediction of electricity prices in energy system models. Scholars simulate electricity prices to quantify, for example, the drop in the market value of variable renewables [e.g., 27]. Additionally, [14] quantify market values for renewables by generating electricity prices in a future power system with the help of an energy system model. To quantify weather-specific market values for a comprehensive database of onshore wind capacities in Germany, [15] derive market prices assuming different weather years. An agent-based model with rule-based bidding strategies to reproduce spot prices for the German bidding zone is used in [49].
The analysis of market power and strategic behaviour is another application of wholesale price forecasts with energy system models. By modelling competitive market prices and comparing them with actual prices, several studies were able to point to serious problems (e.g., [44, 63] for Germany, and [5] for the United States).
These and many other model applications have a dedicated empirical focus. Thus, the high quality of input data is vital. For the European electricity sector, data is conveniently gathered and made publicly available by transmission system operators (TSOs) via the transparency platform of the ENTSO-E. The platform is a very ambitious and unique project to provide an extensive data set for electricity markets and is thus both well-known and widely used. Nevertheless, the data presented on the platform is not without its shortcomings regarding completeness and quality [see 28]. Furthermore, [39] analyse the quality of load data for the Germany-Luxembourg bidding zone. They detect a bias in TSO load forecasts and develop an alternative load prediction model that incorporates information from these forecasts to remove the bias and thus achieve an enhanced load prediction. For the Spanish market, [8] analyse the forecast errors of the TSOs’ day-ahead load forecasts for serial structures and influences of special days such as Christmas holidays or New Year’s Eve. Hence, researchers using such empirical data should raise awareness of these shortcomings and aim to improve data quality.
With our paper, we aim to provide energy system modelers with a methodology to improve the quality of their results by improving the input data. Concerning load data, a comprehensive literature review of various methods and models for energy demand forecasting is given by Refs. [57, 58]. Among others, approaches for standalone load forecasting models are presented by [1, 9, 37, 46, 51, 59, 62, 64, 65, 66, 67].
Load forecasts are publicly available. However, they can be improved with a simple and straightforward approach. Given a series of load forecasts with forecasting errors that still show a predictable structure, the method proposed in this paper offers a possibility to enhance existing forecasts. We improve the forecasts by modelling and removing predictable parts of the errors needing no other information than the forecast error itself. Implicitly, [67] use a similar step since they remove a structure from their forecasting model (first stage) in a second stage by a time series approach. However, they rely on neural networks, while we propose a simple time series model.
In energy system modelling, activities to improve input data can be described as data pre-processing or, more precisely, continuous data processing and enhancement with subsequent use. Such continuous data processing is typically not performed for energy system models and the value it provides has not yet been researched. We believe this is a methodological gap in the literature and aim to bridge it by providing an approach to sequentially improving input data and sequentially using these continuously improved datasets in an energy system model. We demonstrate the effectiveness in an empirical application, focusing on the effect of better load forecasts for electricity price forecasts derived from energy system models.
3 Data
Energy system models require extensive input data to model market equilibria on both the demand and the supply sides. Since this paper focuses on a day-ahead time horizon, TSO-based load forecasts published by ENTSO-E may be used as predictors for the demand side. However, as was pointed out in the literature section, the quality of these load forecasts is debated and will be improved in this paper. In Sect. 3.1, we first provide a detailed overview of the TSO-based load forecast data and forecast errors. Moving to the supply side of the energy system model, data on techno-economic parameters for conventional generation, renewables, storage and electricity transmission are of the utmost importance and are presented in Sect. 3.2.
3.1 TSO-based load forecast data
The load data set we use for our analysis contains hourly day-ahead load forecast data and hourly actual load data from January 1st, 2016, until December 31st, 2019, for Germany and Luxembourg. It was downloaded from the ENTSO-E transparency platform [21] in MWh. Missing values were replaced by the average of the value of the previous week and the week after.Footnote 2 An illustration of the time series of the actual load, TSO load forecast and the resulting error, computed as the difference between actual load and load forecast, is shown in Fig. 1.
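The gap-filling rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `fill_missing`, the use of `None` as the missing-value marker, and the fallback to a single neighbour when only one of the two weekly neighbours exists are all assumptions made here.

```python
HOURS_PER_WEEK = 7 * 24


def fill_missing(values):
    """Replace None entries by the mean of the values one week before and after.

    Assumption: if only one of the two weekly neighbours is available, that
    neighbour alone is used; if neither is available, the gap is left as None.
    """
    filled = list(values)
    for t, value in enumerate(values):
        if value is not None:
            continue
        neighbours = []
        for s in (t - HOURS_PER_WEEK, t + HOURS_PER_WEEK):
            if 0 <= s < len(values) and values[s] is not None:
                neighbours.append(values[s])
        if neighbours:
            filled[t] = sum(neighbours) / len(neighbours)
    return filled


# Hypothetical three-week hourly series with one missing value in week two.
series = [50.0] * (3 * HOURS_PER_WEEK)
series[5] = 100.0          # same hour, one week before the gap
series[341] = 120.0        # same hour, one week after the gap
series[173] = None         # the gap
repaired = fill_missing(series)
```

Only original observations are used as neighbours, so a previously filled value never feeds into another imputation.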
For the considered years, Table 1 contains descriptive statistics of the TSO load forecast errors defined as \(\epsilon _t {:}{=}L_t - {\hat{L}}_t\), meaning actual load minus TSO load forecast. Thus, a positive error indicates an underprediction of load.
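As a small illustration of this sign convention and of the error measures used throughout the paper (mean error, mean absolute error, mean squared error), consider the following sketch. The function names and the three-hour toy data are hypothetical.

```python
def forecast_errors(actual, forecast):
    """Error as defined in the text: actual load minus forecast."""
    return [a - f for a, f in zip(actual, forecast)]


def mean_error(errors):
    """Bias; a positive value means the forecast underpredicts on average."""
    return sum(errors) / len(errors)


def mean_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)


def mean_squared_error(errors):
    return sum(e * e for e in errors) / len(errors)


# Hypothetical toy data in MWh.
actual = [100.0, 110.0, 90.0]
forecast = [95.0, 112.0, 85.0]
errors = forecast_errors(actual, forecast)   # [5.0, -2.0, 5.0]
```

In this toy case the mean error is positive, so the forecast underpredicts on average even though one of the three hours is overpredicted.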
The TSO forecast data is mean-biased, as discussed in [39]. In our analysis, we find systematic underpredictions with a mean error of 881.3 MWh across all years and positive mean errors for every year.
However, the absolute level of the error and whether the TSO under- or over-predicts in its forecasts depend on the day of the week and the hour of the day. Figure 2 shows the averaged hourly forecast errors in a week. Broadly, we can observe underprediction during weekdays and overprediction on the weekends, especially on Saturdays. During the day, in the morning and the evening hours, the error of the TSO day-ahead load forecast is generally positive and higher than in the other hours of the day. With an average error of 943.53 MWh at 6 a.m. and 1180.48 MWh at 7 p.m., the prediction error in these hours is higher by 7% (34%) than the mean error of the entire time period considered (compare with Table 1). These are the hours when the workday begins or ends and where production ramps up or down. Although the standard deviation of the forecast in these hours is not significantly larger than in the other hours, it appears that the load in these hours is still more challenging to forecast on average than in the other hours of the day (see weekday-wise descriptive measurements in Table 5 for more details).
Finally, we perform Ljung-Box (LB) tests to check the TSO load prediction errors for auto-correlation. The null hypothesis of no auto-correlation is rejected at the 5% significance level for all years, which indicates a strong auto-correlation of the errors. Comparing the errors with those one hour before (see Fig. 3), we can see a strong linear dependence.
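For readers unfamiliar with the test, the Ljung-Box statistic is \(Q = n(n+2)\sum _{k=1}^{m} {\hat{\rho }}_k^2/(n-k)\), where \({\hat{\rho }}_k\) is the sample auto-correlation at lag k; under the null hypothesis, Q is approximately chi-squared distributed with m degrees of freedom. The sketch below computes Q in plain Python. The number of lags used in the paper is not stated in this section, so the single-lag example and the toy series are assumptions for illustration only.

```python
def autocorrelation(x, lag):
    """Sample auto-correlation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denominator = sum(d * d for d in dev)
    numerator = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    return numerator / denominator


def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# 5% critical value of the chi-squared distribution with one degree of freedom.
CHI2_CRITICAL_1DF_5PCT = 3.841

# Hypothetical, strongly trending toy series.
toy = [float(v) for v in range(1, 11)]
q_stat = ljung_box_q(toy, max_lag=1)
reject_no_autocorrelation = q_stat > CHI2_CRITICAL_1DF_5PCT
```

For more than one lag, the critical value has to be taken from the chi-squared distribution with the matching number of degrees of freedom.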
In summary, the TSO forecast errors average 1.56% of the total load’s mean, and the mean absolute error of the TSO load forecast is 1776 MWh (3.14% of the total load’s mean). The errors are biased, with some seasonal structure in the bias, and are highly auto-correlated. Hence, autoregressive-type models could improve the TSO load forecast.
3.2 Input data for an energy system model
Aiming to analyse the impact of improved day-ahead load forecasts on the accuracy of electricity price forecasts, which are derived using an electricity system model, we develop and parameterise a European electricity market model with data from January 1st, 2017, until December 31st, 2019. A meaningful empirical parameterisation of such models requires extensive input data derived from various sources. To model the demand side, the load data presented in the previous Sect. 3.1 is essential. Furthermore, there is typically an option to shed load during supply scarcities. In our application, we assume the costs for load shedding to be 3000 €/MWh.
On the supply side, several technologies are available for electricity generation and storage. Our energy system model distinguishes ten conventional thermal generation technologies, which form 30 capacity clusters according to a power plant’s commissioning year. We provide each of the capacity clusters with different efficiencies, minimum outputs and efficiency losses in part-load operations, which are derived from [47, 56]. The capacity, fuel type, generation technology and commissioning date are derived from [11, 20, 47]. For power plants on the German market, we additionally use data from [4, 60]. Fuel costs, costs for CO\(_2\) emissions and the power plant efficiency determine the variable generation costs of conventional thermal technologies. For fuel costs, we use daily gas prices that are provided by Ref. [12], monthly coal prices are taken from [7], and monthly oil prices from [7]. Fuel costs for nuclear, lignite and waste are derived from [23]. These are assumed to be constant over time. Prices for CO\(_2\) certificates are implemented as weekly values from [52].
The process of starting up power plants requires the use of fuel, emits CO\(_2\) and leads to material wear in the plant. Data for start-up times, secondary fuel usage and depreciation are derived from [56].
The ability to generate electricity depends not only on the installed capacity but also on the technical availability of the plants. Therefore, we consider all scheduled and non-scheduled power plant outages known before the day-ahead market’s closure. Hourly outages are derived from [22].
Since combined heat and power (CHP) plants are used in most electricity markets, electricity and heat supplies are linked. To account for this dependency, we provide these units with a must-run condition that ensures their operation at certain minimum output levels. These output levels are derived in two steps. First, we determine an hourly heat-demand factor consisting of a temperature-dependent (spatial heating) and temperature-independent (warm water and process heat) part. The temperature-dependent heat demand is generated with heating degree days using mean temperature data from [48]. We derive the temperature-independent heat demand using the hourly and daily consumption patterns from [26]. Second, we use the heat-demand factor to allocate annual electricity generation volumes by CHP plants to single hours. The annual technology-specific electricity generation by CHP units is taken from [24].
In addition to conventional thermal technologies, we consider renewable energy sources (RES), energy storage, hydro-reservoirs and run-of-river. Intermittent RES such as onshore wind, offshore wind and photovoltaics (PV) are implemented by hourly availability factors that are derived from feed-in forecasts from [19]. We deliberately do not improve these feed-in forecasts by sequentially modelling their forecast errors. This allows us to measure clearly how the quality of the price forecast changes when we improve only the load forecast, i.e., the variable that both offers the greatest potential for improvement and is most strongly correlated with day-ahead electricity spot market prices. Biomass is implemented as base-load as the historic operation is at a constant level [compare 16].
We exclusively consider pumped storage plants (PSP) for energy storage that actively charge and discharge. The overall turbine capacity of PSPs is made available by Ref. [20], and the efficiency of a storage cycle is around 75% [56]. For PSPs, the energy storage capacity and the turbine capacity are linked. Assuming an energy-power factor (epf) of nine, the plant can generate electricity at full load for nine hours until the storage is empty.
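As a worked illustration of the energy-power factor, take a hypothetical plant with a turbine capacity of \(P = 1\) GW (this value is an assumption, not taken from the data). The storage capacity E and, under the additional simplifying assumption that the entire cycle loss is attributed to the charging step, the electricity \(E_{in}\) needed to fill the empty storage are

\[E = epf \cdot P = 9\,\text {h} \cdot 1\,\text {GW} = 9\,\text {GWh}, \qquad E_{in} = \frac{E}{0.75} = 12\,\text {GWh}.\]

How the losses are split between charging and discharging in the model itself is not specified in this section.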
Long-term PSP, as well as hydro-reservoirs, are assigned a variable generation cost, i.e., the value for water consumption. Using historical electricity prices from [17] and the observed generation and pumping activities in the respective hour from [16], a step-wise merit-order for long-term PSP and hydro-reservoirs is constructed. Run-of-river and mid-term PSPFootnote 3 are subject to seasonal variations, which we acknowledge by a monthly availability factor derived from historical generation data from [16].
The German electricity market is highly integrated into the European system. Total interconnector capacity amounts to 27 GW, which is more than 30% of the German peak load.Footnote 4 Both annual aggregated exports (around 13% of annual German consumption in 2019) and imports (around 7% in 2019) are significant. Hence, we parameterise a Pan-European electricity market model, which includes the bidding zones of most EU-27 member states,Footnote 5 Norway, Switzerland and the United Kingdom.Footnote 6 Within Germany, day-ahead electricity prices are derived following the bid-based economic dispatch principle, neglecting the market zone’s physical transmission constraints. Since the energy system model focuses on analysing day-ahead prices, we follow this approach and treat all of Germany, plus Luxembourg, as one bidding zone.Footnote 7 Thus, we include 23 different markets in the analysis, which will be referred to as ‘nodes’ in the formal model, connected by net transfer capacities (NTCs). We implement hourly day-ahead forecasts for NTCs made available by Refs. [18, 30].
As the data parameterisation may be interesting for numerous stakeholders but is difficult and time-consuming to replicate, we publish our input data in the supplementary material: https://github.com/ProKoMoProject/Enhancing-Energy-System-Models-Using-Better-Load-Forecasts.
4 Methodology
In the following, we present our two components to analyse the value of improved day-ahead load forecasts for electricity price forecasts derived by an electricity system model: a time series model for the sequential load data pre-processing and improvement in Sect. 4.1 and the dispatch market model that is used to generate price estimators in Sect. 4.2.
4.1 Model for load forecast error
To improve load forecasts, we use a well-known time series approach that achieves a trade-off between performance and complexity. The approach is based on the idea of forecasting the TSO load forecast error and using this forecast to enhance the load prediction. Thus, we model the time series of forecast errors. For this reason, and to obtain a low-parameter model, we do not use exogenous variables such as feed-in of renewable energy or weather in our model for forecasting the load forecast error, in contrast to the main load forecasting methods in the literature, which include temperature and weather data in particular, e.g., [1, 3, 8, 35, 66]. We propose a purely endogenous time series approach that can be applied using the TSO load forecast error alone as input data. It is detached from the original forecasting model, which generally already includes exogenous variables. By forecasting the forecast error, the resulting load prediction \({\hat{L}}_t^*\) at time t is then given by
\[{\hat{L}}_t^* = {\hat{L}}_t + {\hat{\epsilon }}_t, \tag{1}\]
where \({\hat{L}}_t\) is the original TSO load prediction and \({\hat{\epsilon }}_t\) is our forecasted TSO load prediction error. Thus, \({\hat{L}}_t^*\) is an improved load forecast in which we adjust the original forecast for predictable structure in its error.
For the overall setup, the subindex t denotes consecutive hours. So, \({\hat{L}}_1\), for instance, is the load forecast for the first hour of the considered time period, and \({\hat{L}}_{123}\) is the forecast for hour 123. This fits best with the observation process of the actual load data. For example, in contrast to electricity prices, for which we observe the 24 hourly prices of a day simultaneously, load data can theoretically be observed hour by hour. For day-ahead electricity prices, alternative parameterizations, such as modelling every day as a 24-dimensional vector, or using 24 time series, one for each hour of the day, would be more appropriate [see, e.g., 69].
Furthermore, we decompose the time series into the sum of a seasonal component and a remaining stochastic component. As we do not observe any trend in the forecast error data in Sect. 3.1, we do not use the usual trend component of such decomposition models (see, e.g., [6, 29, 38] for comprehensive introductions into time series models). Together, the model is
\[\epsilon _t = SC_t + RC_t, \tag{2}\]
where \(\epsilon _t\) is the TSO load forecast error, \(SC_t\) is a seasonal and \(RC_t\) is the remaining component at time t.
The forecast errors’ average sizes depend on the specific hour of the week (see Sect. 3.1), so the seasonal component \(SC_t\) captures a weekly season, consisting of an average value for each of the 24\(\times\)7 h of a week. This means addressing the hour of the day and the day of the week with a total of 168 dummy variables, as given by
Here \(h = 1,..., 24\) denote the hours of a day and \(d = 1\) (Monday), ..., 7 (Sunday) the weekdays of a week.
The seasonal component \(SC_t\) for time t is now defined by Eq. 3 with 4 being the average of TSO forecast errors from the hours of a week from the time period used to estimate the model (e.g., the last \(l_w\) hours).
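In a notation assumed here for illustration, the seasonal component can be read as \(SC_t = {\bar{\epsilon }}_{h(t),d(t)}\), the mean of all errors in the estimation window that fall on the same hour h and weekday d as hour t. The following sketch computes these 168 hour-of-week means and the remaining component \(RC_t = \epsilon _t - SC_t\). The function names, the 0-based hour counter and the `first_weekday` parameter are assumptions, not part of the published code.

```python
HOURS_PER_WEEK = 7 * 24


def hour_of_week(t, first_weekday=0):
    """Map a 0-based hour counter to 0..167 (0 = Monday, first hour).

    first_weekday is the weekday (0 = Monday) on which hour 0 falls.
    """
    return (t + 24 * first_weekday) % HOURS_PER_WEEK


def seasonal_means(errors, first_weekday=0):
    """Average error for each of the 168 hours of a week."""
    sums = [0.0] * HOURS_PER_WEEK
    counts = [0] * HOURS_PER_WEEK
    for t, e in enumerate(errors):
        k = hour_of_week(t, first_weekday)
        sums[k] += e
        counts[k] += 1
    # Hours of the week never observed in the window get a mean of zero.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def remaining_component(errors, means, first_weekday=0):
    """Subtract the hour-of-week mean from every error."""
    return [e - means[hour_of_week(t, first_weekday)] for t, e in enumerate(errors)]


# Hypothetical two-week window: week two lies 2 MWh above week one in every hour.
window = [float(k) for k in range(168)] + [float(k + 2) for k in range(168)]
sc = seasonal_means(window)
rc = remaining_component(window, sc)
```

Estimating 168 means is equivalent to regressing the errors on the 168 hour-of-week dummy variables without an intercept.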
The rest of the time series \(RC_t = \epsilon _t - SC_t\) is modelled by the econometric SARMA\((1,1)\times (1,1)_{24}\) model given in Eq. (5), i.e., a (S)easonal (A)uto(R)egressive (M)oving (A)verage model. Here, the value \(RC_t\) at hour t depends on its previous value at \(t-1\) as well as the previous model error \(\psi _{t-1}.\) Additionally, the model contains a 24-h seasonal part which captures stochastic seasonal behaviour in contrast to the more deterministic seasonal structure filtered by \(SC_t.\) Formally, the seasonal part leads to direct effects of all variables lagged by another 24 h on \(RC_t\) as given in detail in Eq. (5).
where the innovations are assumed to be homoscedastic and normally distributed, which means \(\psi _t \sim N(0,\sigma ^2_\epsilon )\). Assuming a normal distribution for the innovations is a simplification and idealisation.
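For orientation, a multiplicative SARMA\((1,1)\times (1,1)_{24}\) model is commonly written with the lag operator B as \((1-\phi B)(1-\Phi B^{24})RC_t = (1+\theta B)(1+\Theta B^{24})\psi _t\). Expanding the products gives the recursion below. The coefficient symbols \(\phi , \Phi , \theta , \Theta\) and the sign convention of the moving-average terms are assumptions made here and may differ from the authors' notation.

\[RC_t = \phi RC_{t-1} + \Phi RC_{t-24} - \phi \Phi RC_{t-25} + \psi _t + \theta \psi _{t-1} + \Theta \psi _{t-24} + \theta \Theta \psi _{t-25}\]

The terms at lags 24 and 25 are the direct effects of the variables lagged by another 24 h mentioned in the text.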
We calibrate and estimate the model on a rolling window. The window length, denoted by \(l_w\), is an integer multiple of 24 and thus contains full days only. The window is also rolled over full days in each step to further reflect the daily availability of load data and, thus, the error of the TSO’s load forecast. In this work, we decide on one window length \(l_w\) to estimate the model. Alternatively, one could average multiple models calibrated on different window lengths, e.g., as proposed in [39, 40, 69]. However, in this paper, where the simplicity and usability of the model are important considerations, we believe such an increase in complexity would not be justified.
The estimated model is used to recursively (i.e., on an hour-by-hour basis) predict the hours of the next day. Since we rely on an autoregressive time series model, we need load data from the last hours for prediction, which enter the model as explanatory variables. Although the load can theoretically be observed hourly, in practice, the load values of the previous hours are available with a time lag, meaning they may not be available as explanatory variables when forecasting the following hours. A solution is to replace unavailable variables with recursively forecasted variables based on the last available observations.
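The recursive prediction can be sketched as follows, assuming the multiplicative SARMA\((1,1)\times (1,1)_{24}\) recursion in its common textbook form and already estimated coefficients. Future innovations are replaced by their expected value of zero, and each predicted value is fed back as an input for the next hour. The function name, the argument names and the toy coefficient values are hypothetical; parameter estimation itself is not shown.

```python
def sarma_forecast(rc_hist, psi_hist, phi, phi_s, theta, theta_s, steps, season=24):
    """Recursive multi-step forecast of a multiplicative SARMA(1,1)x(1,1)_season.

    rc_hist  : past values of the remaining component (oldest first)
    psi_hist : past model errors (innovations), same length as rc_hist
    phi, phi_s     : non-seasonal and seasonal AR coefficients (assumed names)
    theta, theta_s : non-seasonal and seasonal MA coefficients (assumed names)
    """
    if len(rc_hist) != len(psi_hist) or len(rc_hist) < season + 1:
        raise ValueError("need at least season + 1 aligned past values")
    rc = list(rc_hist)
    psi = list(psi_hist)
    predictions = []
    for _ in range(steps):
        t = len(rc)
        value = (
            phi * rc[t - 1]
            + phi_s * rc[t - season]
            - phi * phi_s * rc[t - season - 1]
            + theta * psi[t - 1]
            + theta_s * psi[t - season]
            + theta * theta_s * psi[t - season - 1]
        )
        rc.append(value)    # the forecast replaces the unavailable observation
        psi.append(0.0)     # expected value of a future innovation
        predictions.append(value)
    return predictions


# Toy check: a pure AR(1) with coefficient 0.5 halves the last value each hour.
ar_only = sarma_forecast([0.0] * 24 + [8.0], [0.0] * 25, 0.5, 0.0, 0.0, 0.0, steps=3)
# Toy check: a pure seasonal AR with coefficient 1 repeats the value 24 h earlier.
seasonal_only = sarma_forecast(
    [float(i) for i in range(25)], [0.0] * 25, 0.0, 1.0, 0.0, 0.0, steps=2
)
```

Adding the predicted remaining component to the seasonal component of the respective hour yields the predicted forecast error.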
To ensure data availability in the sense of a day-ahead forecast at all times, we only use load observations up to the last hour of yesterday as inputs when we make predictions today for tomorrow. Today’s hours must be replaced by forecasts based on yesterday’s data. More precisely, let \(t=8785\) be the first hour of January 1st, 2017, and let x be the hour of January 1st in which we forecast the next day’s hours. In the following, we assume \(x=12\), so we forecast the next day’s hours between 11:00 a.m. and 12:00 noon today. Depending on availability, either observed or forecasted TSO load forecast errors enter our model: for hours \(t \le x-12\), we use the observed errors \({\epsilon }_t\), and for \(t > x-12\) the forecasted ones \({\hat{\epsilon }}_t\). We want to predict the load for the 24 h of the next day, i.e., hours \(x+13\) to \(x+36\). Due to the information delay, and to ensure data availability, we do not use the actual load of hours \(x-11\) to \(x-1\). We also have no information about the hours x to \(x+12\), which lie in the future. For this reason, we first estimate the model based on the last \(l_w\) available observations, i.e., those of hours \(x-12-(l_w-1)\) to \(x-12\). From that, we predict the errors of the TSO load forecast for the next 48 hours \(x-11\) to \(x+36\), i.e., for the hours of January 1st and 2nd, and use the last 24 predicted values, i.e., those for hours \(x+13\) to \(x+36\), to improve the original load forecasts of the following day. Note that by rolling over the estimation window daily, we ensure that the prediction of TSO forecast errors for all load periods of one day is based on the same estimated model.
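The timing can be made concrete with a small index calculation. The sketch counts hours relative to the forecasting day (hour 1 is the first hour of today, hour 0 the last hour of yesterday) and treats all ranges as inclusive, so that the target day spans exactly 24 hours starting at hour \(x+13\); both conventions are assumptions made for this illustration. The function name and the example window length of two days are hypothetical, since the value of \(l_w\) is not given in this section.

```python
def day_ahead_hours(x=12, window_length=48):
    """Return the hour ranges used in one daily forecasting step.

    Hours are counted relative to the forecasting day (assumed convention):
    hour 1 is the first hour of today, hour 0 the last hour of yesterday.
    """
    if window_length % 24 != 0:
        raise ValueError("the window must contain full days only")
    last_observed = x - 12                       # last hour of yesterday for x = 12
    estimation = range(last_observed - window_length + 1, last_observed + 1)
    predicted = range(last_observed + 1, last_observed + 49)   # today and tomorrow
    target = predicted[-24:]                     # only tomorrow's hours are used
    return estimation, predicted, target


estimation, predicted, target = day_ahead_hours(x=12, window_length=48)
```

With \(x=12\), the estimation window ends at hour 0, the 48 predicted hours are 1 to 48, and the 24 hours actually used are 25 to 48.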
The proposed model is implemented in MATLAB®. For this paper, the code is run with MATLAB® version R2020b. The code, the data used and the generated results are provided on GitHub: https://github.com/ProKoMoProject/Enhancing-Energy-System-Models-Using-Better-Load-Forecasts.
4.2 Energy system model
We develop a new energy system model, the em.power dispatch model, to derive wholesale day-ahead price forecasts. The model is formulated as a linear optimisation problem minimising total system costs and includes a detailed representation of central techno-economic aspects of the European electricity sector. In particular, the model dispatches various generation technologies to satisfy electricity demand. In addition to power plant dispatch in Germany, the model considers international trade between the markets described in Sect. 3.2, electricity production by combined heat and power plants, energy storage and control power provision. To ensure a linear formulation of such a highly complex system, we form capacity clusters, parameterised as described in Sect. 3.2. Within each technology cluster, capacity can be started up and electricity can be produced in marginal increments [see, e.g., 44]. The advantage of this approach is twofold. First, computational efforts are reduced. Second, the marginal of the demand restriction is differentiable at each point and can thus be interpreted as a wholesale market price estimator. Additionally, the accuracy remains reasonably high, in particular when modelling large energy systems [see 43].
Considering all economic and technical restrictions, the model solves the cost minimisation problem and determines (i) the optimal dispatch decision for all considered infrastructure elements, such as generation technologies, energy storage and cross-border transmission capacities, and (ii) the short-run marginal system cost that determines the price estimator for the day-ahead market in hourly resolution.
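To illustrate why the marginal system cost can serve as a price estimator, and why load forecast errors matter most in tight markets, the following toy example clears a single hour in a single node by merit order. It is a deliberately simplified sketch and not the em.power dispatch model: start-ups, storage, trade and must-run conditions are ignored, and all unit names, capacities and costs are hypothetical. Only the load-shedding cost of 3000 €/MWh is taken from Sect. 3.2.

```python
def clear_market(units, demand, shedding_cost=3000.0):
    """Dispatch units in order of marginal cost and return (dispatch, price).

    units  : list of (name, capacity in MW, marginal cost in EUR/MWh)
    demand : load to be served in MW
    The price is the marginal cost of the most expensive unit that produces,
    or the load-shedding cost if the available capacity is insufficient.
    """
    dispatch = {}
    remaining = demand
    price = 0.0
    for name, capacity, cost in sorted(units, key=lambda unit: unit[2]):
        produced = min(capacity, remaining)
        dispatch[name] = produced
        if produced > 0:
            price = cost
        remaining -= produced
    if remaining > 0:
        dispatch["load_shedding"] = remaining
        price = shedding_cost
    return dispatch, price


# Hypothetical supply stack.
units = [("wind", 50.0, 0.0), ("gas", 40.0, 60.0), ("lignite", 30.0, 20.0)]

_, price_low = clear_market(units, demand=78.0)      # lignite is marginal
_, price_high = clear_market(units, demand=82.0)     # gas becomes marginal
_, price_scarce = clear_market(units, demand=130.0)  # capacity is insufficient
```

In this toy stack, a load difference of only 4 MW around the capacity limit of the cheaper units moves the price from 20 to 60 €/MWh, which mirrors the sensitivity of prices to load errors when the market is tight.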
Furthermore, as our research analyses the impacts on day-ahead price forecasts, we set up the model to reflect the information available to market participants on the day before delivery. We thus consider that market participants do not have perfect foresight for the upcoming days. We achieve this with a rolling window model that is repeatedly solved and provides information for 24 day-ahead hours of one “target day” in each model run. To reduce the problem of starting and ending values, in particular for power plant start-ups and pumped storage plants, each model run includes 3 days, as shown in 4. In this setting, the 24 h of the respective target day are represented by the second day of the horizon (d+1). This follows the EPEX spot market organisation, where 24 hourly day-ahead prices are determined at 12:00 noon on the day before delivery (d). In addition to the target day d+1, we also include the day before (d) and the day after (d+2). Note that we include a water value to increase the accuracy of seasonal hydro-storage modelling.
As with the improvement of the load forecast, this approach is repeated continuously (“rolling window”), once for each day of the observation period. At each iteration, the input data for d+1 and d+2 are limited to the values available on day d (i.e. forecasts), so that the incoming day-ahead load forecast is successively improved and processed in our approach. Correctly parameterised, our model uses the same data as market participants (e.g., energy suppliers, direct marketers, investment banks) when forecasting the day-ahead prices to optimise their portfolio. Given this day-ahead focus of our analysis, installed and available capacities are exogenous. The model endogenously optimises power plant dispatch only.
Our rolling window approach to forecasting hourly prices implies that we forecast three years with 365 daily model runs each year. As each model run comprises 72 hourly dispatch decisions with numerous variables in 23 model regions, the total number of variables is 340 million. In the following, we present the mathematical formulation of our model. The model is coded in GAMS (see Notes). The entire code is provided on GitHub: https://github.com/ProKoMoProject/Enhancing-Energy-System-Models-Using-Better-Load-Forecasts. A nomenclature containing all indices, parameters and variables of the energy system model formulation is provided in 1.
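The structure of the rolling window can be sketched in a few lines of Python. This is an illustration only, not the GAMS code: the function name `model_runs` is hypothetical, and the observation period 2017 to 2019 is taken from the results section.

```python
from datetime import date, timedelta


def model_runs(first_target, last_target):
    """Yield the horizon (d, d+1, d+2) for every target day d+1."""
    target = first_target
    while target <= last_target:
        # Each run covers the day before, the target day and the day after.
        yield (target - timedelta(days=1), target, target + timedelta(days=1))
        target += timedelta(days=1)


runs = list(model_runs(date(2017, 1, 1), date(2019, 12, 31)))
hours_per_run = 3 * 24  # 72 hourly dispatch decisions per run
```

Under these assumptions, the list contains 3 x 365 = 1095 runs. If the stated 340 million variables refer to the sum over all runs, 72 hours and 23 regions, this would correspond to roughly 190 variables per region and hour.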
The objective function in Eq. 6 minimises total system costs and accounts for all costs that generation units face in the short-term. We include costs at full load operation (\(vc_{i,n,t}^{FL}\)), additional costs for units that operate at partial load (\(vc_{i,n,t}^{ML}-vc_{i,n,t}^{FL}\)), and start-up costs (\(sc_{i,n,t}\)). Note that we apply a linear formulation of the unit commitment, and all units have to produce at least a minimum output level. Additionally, we account for load shedding costs (voll) and penalty payments for curtailing renewables (curtc).
Since we apply our model with a rolling window, we consider three days in each model run. Modelling an additional day before and after the target day seems appropriate for storages with large energy-to-power ratios, which are essentially operated on a daily cycle (e.g., the largest German pumped storage facility, Goldisthal, can store enough energy for nine hours of full load operation). However, other storages (both PSP and seasonal storages without pumps) have a storage cycle longer than 3 days. Therefore, we model two types of PSP, first as mid-term storage that operates a storage cycle within a 3-day horizon, and second as long-term storage that operates a storage cycle longer than 3 days. The dispatch of mid-term storage is determined endogenously, with the exogenous restriction that they both start and end the cycle with reservoir levels at 30%. The approach is different for long-term PSPs, which are assigned a water value (\(wv_{stl,n,t}\)) that is implemented as a variable cost factor for electricity generation (\(G_{stl,n,t}\)) and consumption (\(CL_{stl,n,t}\)). We assume that 70% of the pumped storage capacity is optimised in the medium-term. The remaining 30% are long-term PSPs.
Compared to pumped storage plants, hydro-reservoirs have a natural water feed-in and do not perform a pumping process. However, the water budget for electricity generation is limited according to seasonal inflow volumes. Therefore, we also apply a water value for electricity generation by hydro-reservoirs.
Market clearing is ensured by Eq. 7: for all T hours of the given rolling window, demand (\(d_{n,t}\)) must equal the sum of generation (\(G_{i,n,t}\)), load shedding (\(SHED_{n,t}\)) and electricity imports (\(FLOW_{nn,n,t}\)), reduced by electricity consumption of mid-term energy storage (\(CM_{stm,n,t}\)) and long-term energy storage (\(CL_{stl,n,t}\)) and electricity exports (\(FLOW_{n,nn,t}\)).
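The balance described above can be written as a small Python check for a single region and hour. The function and argument names are hypothetical and only mirror the verbal description. In the model itself, the balance is a constraint of the linear programme rather than a check applied afterwards.

```python
def balance_residual(demand, generation, shed, imports, exports,
                     charge_mid, charge_long):
    """Supply minus withdrawals minus demand (zero when the market clears).

    generation: dict technology -> output
    imports / exports: dict neighbouring region -> flow
    charge_mid / charge_long: consumption of mid- and long-term storage
    """
    supply = sum(generation.values()) + shed + sum(imports.values())
    withdrawals = charge_mid + charge_long + sum(exports.values())
    return supply - withdrawals - demand
```

A residual of zero means that generation, load shedding and imports exactly cover demand, storage consumption and exports.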
The dual variable of the demand constraint Eq. 7 is used as an hourly day-ahead wholesale electricity price estimator. As we want to analyse how well these price estimators based on different demand forecasts fit real-world day-ahead prices, we compare them and compute error measures.
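Why the dual variable of the demand constraint can serve as a price estimator is easiest to see in a stripped-down merit order. The following Python sketch is a strong simplification and not the em.power dispatch model: it ignores start-ups, storage, trade and reserves, and the value of lost load of 3000 €/MWh is a hypothetical placeholder.

```python
def merit_order_price(clusters, demand, voll=3000.0):
    """Return the variable cost of the last cluster needed to cover demand.

    clusters: list of (capacity_mw, variable_cost_eur_per_mwh)
    voll: hypothetical value of lost load, used if capacity is insufficient
    """
    remaining = demand
    for capacity, cost in sorted(clusters, key=lambda c: c[1]):
        remaining -= capacity
        if remaining <= 0:
            return cost  # this cluster is marginal and sets the price
    return voll  # demand exceeds total capacity: load shedding sets the price
```

In this simplified setting, the cost of serving one additional unit of demand equals the variable cost of the marginal cluster, which is what the dual variable measures in the linear programme. The sketch also indicates why a given load error moves the price more at the steep end of the merit order than in its flat part.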
Electricity generation by capacity cluster is limited by an upper and a lower bound. The upper bound is formalised in Eq. 8 and ensures that electricity generation does not exceed the running capacity (\(P_{i,n,t}^{on}\)) in the cluster. The possible electricity generation by running capacity is further limited by the reserve for positive control power provision (\(PCR_{i,n,bp},SCR_{i,n,bs}^{pos}\)). The lower bound is presented in Eq. 9 and states that running capacities must operate at least at a minimum power level, including the capacity reserved for negative control power provision (\(PCR_{i,n,bp},SCR_{i,n,bs}^{neg}\)). Note that primary control power (\(PCR_{i,n,bp}\)) in Germany is provided synchronously, i.e., a unit has to provide both positive and negative primary control power. Different products for positive and negative control power were introduced for secondary control power. Since fast-reacting units (e.g., hydro- and open-cycle gas turbines) can be started-up to provide a positive-minute reserve, the effect on the running capacities is neglected. In addition, we assume that a negative-minute reserve is provided by multiple market players, not necessarily by power plants. The hours that belong to bidding blocks are mapped for primary control power by bp and secondary control power by bs.
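A minimal sketch of the two bounds for one capacity cluster and hour, assuming that the minimum power level is expressed as a share of the running capacity. The parameter `min_load_share` and the function names are hypothetical, since the exact form of Eqs. 8 and 9 is not reproduced here.

```python
def generation_bounds(p_on, min_load_share, pcr, scr_pos, scr_neg):
    """Return (lower, upper) bound on generation of one cluster in one hour.

    p_on: running capacity of the cluster
    min_load_share: assumed minimum output as a share of running capacity
    pcr: primary control power (reserved in both directions)
    scr_pos / scr_neg: positive / negative secondary control power
    """
    lower = min_load_share * p_on + pcr + scr_neg  # room to reduce output
    upper = p_on - pcr - scr_pos                   # room to increase output
    return lower, upper


def is_feasible(generation, p_on, min_load_share, pcr, scr_pos, scr_neg):
    lower, upper = generation_bounds(p_on, min_load_share, pcr, scr_pos, scr_neg)
    return lower <= generation <= upper
```

Because primary control power is provided symmetrically, it narrows the feasible range from both sides, whereas each secondary product affects only one bound.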
The running capacity of a power system is limited by the installed capacity (\(cap_{i,n,t}\)) in combination with either the availability factor (\(af_{i,n,t}\)) or power plant outages (\(out_{i,n,t}\)), as shown in Eq. 10. For thermal generation capacities, we use hourly power plant outages. Renewables are provided with an hourly availability factor and hydroelectric units with a monthly availability factor.
Equation 11 tracks start-up activities (\(SU_{i,n,t}\)) that increase the running capacity from one hour to another. Due to the non-negativity condition, start-ups are either positive or zero.
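The start-up logic can be mimicked outside the optimisation as follows. This is a sketch under the assumption that start-up costs are positive, so that the optimiser never chooses a start-up larger than the increase in running capacity. The function name and the `initial` argument are hypothetical.

```python
def start_ups(running_capacity, initial=0.0):
    """Start-up per hour: the positive part of the change in running capacity."""
    previous = initial
    result = []
    for p_on in running_capacity:
        # Reductions in running capacity do not count as (negative) start-ups.
        result.append(max(0.0, p_on - previous))
        previous = p_on
    return result
```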
The delta between available feed-in from intermittent renewables and their actual generation defines the curtailment of renewables (\(CURT_{res,n,t}\)), as shown in Eq. 12.
Some power plants are active in the heat market in addition to the electricity market. The model thus implements a must-run condition for such units on the electricity market, which varies over time (e.g., higher in the winter season due to space heating). Depending on hourly heat demand, Eq. 13 states that the output of a combined heat and power unit is at least equal to the electricity generation linked to the heat production (\(chp_{i,n,t}\)).
Equation 14 constrains the cross-border electricity transfer (\(FLOW_{n,nn,t}\)) by the net transfer capacity (\(ntc_{n,nn,t}\)).
Equation 15 describes the state of the storage level of a mid-term storage. The storage level is decreased by the generation (\(G_{stm,n,t}\)) and increased by the consumption while charging (\(ST_{stm,n,t}^{in}\)). The efficiency of an entire storage cycle (\(\eta _{stm}\)) is assigned to the charging process.
The maximum energy storage capacity (\(SL_{stm,n,t}\)) of a mid-term storage is defined by the maximum installed turbine capacity times an energy-power factor (epf), as shown in Eq. 16.
Equation 17 restricts the turbine and pumping capacity, where the pumping capacity is assumed to be lower than the turbine capacity.
At the beginning and end of each model run, all mid-term storages must be filled to 30% of their maximum energy storage capacity (Eqs. 18 and 19).
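The mid-term storage logic can be traced with a short Python sketch. It assumes the physical convention that charging fills and generation empties the reservoir, with the cycle efficiency applied to charging. The function names and all numerical values in the usage note are hypothetical, apart from the 30% boundary condition.

```python
def storage_trajectory(generation, charging, turbine_cap, epf, eta,
                       start_share=0.3):
    """Return hourly storage levels and the maximum energy storage capacity.

    Level update (assumed form): level_t = level_{t-1} - G_t + eta * C_t
    Capacity: installed turbine capacity times the energy-power factor.
    """
    capacity = turbine_cap * epf
    level = start_share * capacity
    levels = []
    for g, c in zip(generation, charging):
        level = level - g + eta * c
        if not 0.0 <= level <= capacity:
            raise ValueError("storage level outside [0, capacity]")
        levels.append(level)
    return levels, capacity


def is_cycle_closed(levels, capacity, end_share=0.3, tol=1e-9):
    """Check the end-of-horizon condition (30% of capacity by default)."""
    return abs(levels[-1] - end_share * capacity) <= tol
```

For example, a unit with 100 MW turbine capacity, an energy-power factor of 9 h and a cycle efficiency of 0.75 that charges 100 MW for two hours and then generates 75 MW for two hours returns exactly to its starting level of 270 MWh.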
Long-term storage is not subject to a storage mechanism. However, the electricity generation and consumption of long-term storage units are also restricted by the installed capacity of long-term storage by Eq. 20.
Equations 21, 22 and 23 ensure the control power provision for primary, positive secondary and negative secondary control power.
The non-negativity constraint is presented in Eq. 24.
We use both models presented alternately. To predict the next day, we first forecast the load forecast error with the load forecast improvement model and thus enhance the day-ahead load forecast. The enhanced forecast then enters the power system model as one of its inputs, and the model estimates the next day’s prices using the presented approach. This sequence is repeated continuously day by day over the rolling window for all points in time in our observation period.
5 Results
Our paper explores two different methodologies that are combined. It presents a forecast error improvement model for load forecasts based on data from ENTSO-E, and it develops the energy system model em.power dispatch, which is built for day-ahead wholesale price forecasts. We present the results accordingly. First, we show the performance of our approach to model the load forecast error. To this end, we use statistical data and different error measures for various time periods of the enhanced load forecast. With our approach, we are able to reduce the RMSE of the load forecast error by 22.5%. Second, we analyse the impact of the improved forecast on the resulting price estimates of the em.power dispatch model. To this end, we compare the price estimators generated with the original TSO load forecast \({\hat{L}}\) and the enhanced load forecast \({\hat{L}}^*\) with the actual price observed at the day-ahead market using several error measures: mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE). We find that during hours with relatively high prices, the use of improved load forecasts reduces the mean squared error of the price forecast by nearly 15%.
5.1 Improved load data and achieved error reduction
In the following, we quantify the TSO forecast error improvement model described in Sect. 4.1. Therefore, we compare the improved load forecast \({\hat{L}}^*\) and the TSO load forecast \({\hat{L}}\) with actual load data L. For the error improvement model, we use a rolling window width of 1 year (i.e., \(l_w = 8760\)), which yields the lowest (out of sample) error measures compared with a width of three months and six months. For this reason, the prediction of the forecast error, and thus the out-of-sample period, begins on January 1st, 2017. Table 2 shows the mean and standard deviation of the TSO load forecast error and the enhanced load forecast error, the error measures MSE, RMSE and MAE of the TSO load forecast and of the enhanced load forecast, as well as the percentage improvement.
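For reference, the three error measures and the percentage improvement can be computed as in the following Python sketch. The function names are hypothetical, and the definitions are the standard ones, which we assume to be those underlying Table 2.

```python
from math import sqrt


def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return sqrt(mse(actual, forecast))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def improvement_pct(base, improved):
    """Percentage reduction of an error measure relative to the base value."""
    return 100.0 * (base - improved) / base
```

Applied to this paper’s setting, `base` would be an error measure of the TSO forecast and `improved` the same measure of the enhanced forecast, both evaluated against the actual load.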
While the load was severely underestimated in the TSO forecast with a mean of 656.0 MWh, it is slightly overestimated in the improved model with \(-\)98.9 MWh. Looking at the individual annual mean values, the high negative value in 2017 is particularly striking. The reason for this is the very strong underestimation of the TSO load forecast in 2016, with an average deviation of 1555.4 MWh (see Sect. 3.1). The influence of errors from the year 2016 has a large impact due to the rolling window period of 365 days, especially on the model estimates of the first days and months of 2017. A shorter window period of three months lowers the annual mean value of 2017 but yields a smaller improvement in the error measures (see 6).
The standard deviation of the improved load forecast is lower than the standard deviation of the TSO load forecast across all years.
The error measures MSE, RMSE and MAE given in Table 2 show a significant improvement of the load forecast. With an RMSE of 2224.6 MWh, we achieve a 22.5% improvement over the TSO load forecast for the period from January 1st, 2017, to December 31st, 2019. The most considerable improvement can be observed in 2019 with 32.14%. A breakdown of the improvement among the components (seasonal and remaining) of the model shows that both the seasonal and remaining components account for a large share of the improvement, and neither component dominates.
The authors of [39] also improve the TSO load forecast. From October 1st, 2016, to September 30th, 2019, they achieve an improvement in RMSE over 365 days ranging from a minimum of 23.71% to a maximum of 34.38%. However, this slightly higher improvement requires a multivariate modelling framework with six different rolling window widths and, consequently, six model estimates and six point forecasts for each hour of the forecast period. Our approach is intended to allow a user with less modelling expertise and computational capacity to enhance the commonly used TSO load forecast. With a less complex, univariate model, we still achieve a substantial improvement and thus error measures that are comparable with those in the literature [e.g., in 10, 68, 35].
To better attribute and understand the effect of load improvement on price, we also determine the percentage improvement in MSE for the hours of a day, and the days of the week, as shown in Fig. 5. The observed daytime and weekday structures in the TSO load forecast error are also evident in the improvement. During the day, hours 2 through 5 and 16 through 20 achieve the most considerable percentage improvement. Weekdays can be improved more than weekends; Tuesdays and Wednesdays show an especially strong improvement. In the TSO load forecast, these are the hours and days that have the largest mean error. Therefore, hours and days that have a sizeable mean error are the ones that have the most potential for improvement. Enhancing the load forecast by reducing this error is the primary goal of modelling and predicting the error of the TSO load forecast.
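The disaggregation by hour of the day or day of the week amounts to a grouped error calculation, sketched below in Python. The function name and the choice of grouping keys are hypothetical. A group in which the base forecast has zero error would lead to a division by zero and would need separate handling.

```python
from collections import defaultdict


def mse_improvement_by_group(keys, actual, base, improved):
    """Percentage MSE improvement per group key (e.g., hour of day, weekday)."""
    sq_base = defaultdict(list)
    sq_improved = defaultdict(list)
    for k, a, b, i in zip(keys, actual, base, improved):
        sq_base[k].append((a - b) ** 2)
        sq_improved[k].append((a - i) ** 2)
    # Both lists of a group have the same length, so the ratio of sums
    # equals the ratio of the group MSEs.
    return {
        k: 100.0 * (1.0 - sum(sq_improved[k]) / sum(sq_base[k]))
        for k in sq_base
    }
```

Passing the hour of the day (1 to 24) or the weekday of each observation as `keys` yields one improvement figure per hour or per weekday.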
5.2 Impact of improved load data on an energy system model
In the previous Sect. 5.1, we showed that with a relatively straightforward approach, the ENTSO-E load data can be significantly improved. Thus, this approach is particularly suitable for energy system modellers to enhance critical input data. In the following, we quantify the impact of the improved load forecast on day-ahead wholesale price forecasts based on the em.power dispatch model. To do this, we run the model twice, first using the original TSO-based load forecasts \({\hat{L}}\) and second, using the improved load forecasts \({\hat{L}}^*\) presented in Sect. 5.1. For both cases, we derive estimates of the day-ahead wholesale prices and calculate error measures comparing the results to actual observed day-ahead prices.
Using the improved load data set, we see an overall reduction in the error of the price estimator. For the entire time horizon, Table 3 states a reduction of the MSE by 1.75%, the RMSE by 0.88% and the MAE by 0.42%.
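As a quick consistency check on these figures (an illustrative calculation, not part of Table 3): because the MSE is the square of the RMSE, a relative RMSE reduction of r corresponds to a relative MSE reduction of \(1-(1-r)^2\). With \(r = 0.0088\):

```latex
\[
1-(1-0.0088)^2 = 1 - 0.9912^2 = 1 - 0.98247744 \approx 0.0175
\]
```

This agrees with the reported MSE reduction of 1.75%. The MAE has no such fixed relation to the MSE, so its reduction of 0.42% cannot be checked in this way.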
Comparing our results with those of other models in the same modelling class, we find that our model generates very good price estimates. The authors of [49], for example, report an MAE of 9.44 €/MWh for 2017, 8.88 €/MWh for 2018 and 6.69 €/MWh for 2019.
Table 3 further shows disaggregated error measures by year. It can be seen that an improvement in the error measure is achieved in all three years. However, the magnitude of this improvement varies; the relative error reduction is largest in 2019 and smallest in 2018. This observation correlates with the magnitude of the annual improvement in the load forecast, shown in Table 2.
Furthermore, we analysed whether the improvement of the load estimator and the price estimator correlate with the hour of the day. Figure 6 shows the average percentage improvement of the MSE of the day-ahead load prediction per hour of the day (left) and of the day-ahead price estimators (right). It can be seen that an hour’s load and hour’s price improvement do not correlate. Depending on the respective hour of the day, improvement of load prediction seems to have a different impact on the resulting price estimator.
The reason for this discrepancy is two-fold: i) the model is more sensitive in one hour than in another, depending on the respective position in the merit order, and ii) an improvement in the load forecast in one hour may affect another hour due to temporal interdependencies such as storage operation and unit commitment decisions.
Having shown that the impact of better load forecasts on price forecasts derived in an energy system model is positive on average but varies between hours, we now examine the extent of error reduction at different points in time, starting with differentiation between high (Peak) and low demand (Off-Peak) periods. Figure 7 states the error reduction of the price estimator and that of the load forecast for the entire time period and time categories peak, off-peak, weekdays and weekend days. The most considerable error reduction of the price estimator is observed in peak hours and on weekdays in general. In the hours between 8 p.m. and 8 a.m. as well as on weekends, the effect on the price estimator is relatively low. On weekends, this observation correlates with the improvement of the load data, both of which are at their minimum. However, in off-peak hours, the impact on the price estimator is negligible, despite the great improvement in the load forecast.
As such, the model benefits significantly from improved load input data during peak hours and on weekdays in general, when demand and price levels are generally higher than in off-peak hours and especially on weekends.
Based on the observation that price forecasts improve more during peak periods than in off-peak periods, we analyse the relation between wholesale price and forecast improvement. Figure 8 shows the improvement of the price estimator for five different price segments where electricity prices are equally separated in 20% quantiles based on their level. The first quantile (q1) represents the lowest 20% quantile and the last quantile (q5) the highest 20% quantile of electricity prices of the respective year between 2017 and 2019.
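The segmentation into five equally sized price segments can be sketched as a rank-based labelling in Python. The function name is hypothetical, and the treatment of ties (equal prices are split by their order in the sample) is an assumption. To follow the description above, the function would be applied separately to the prices of each year.

```python
def quintile_labels(prices):
    """Label each price q1..q5 by its rank within the sample."""
    n = len(prices)
    # Indices of the observations, sorted from lowest to highest price.
    order = sorted(range(n), key=lambda i: prices[i])
    labels = [None] * n
    for rank, idx in enumerate(order):
        labels[idx] = "q" + str(rank * 5 // n + 1)
    return labels
```

The resulting labels can then serve as grouping keys for computing the MSE of the price estimator within each segment.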
It can be seen that the error reduction of the price estimator is most relevant in hours with high and medium prices. Overall, the largest improvement can be observed in 2018 and 2019 with an MSE reduction of nearly 15%, here at times with the 60–80% highest prices. In contrast, the improved load forecast data does not lead to a better price estimator in low-price periods. In all years, we even observe an increasing error in these price ranges. In summary, the improved load forecast is most beneficial for the model in the hours when the market equilibrium is found on the right side in the merit order, i.e., where changes or errors in the demand have the highest price impact.
Hence, our analysis shows that the price forecasts generally improve more when (a) demand is high and (b) prices are high. As traded volumes (in monetary terms) are the product of prices and volumes, it is interesting to note that price forecast improvement is highest when it matters the most.
6 Conclusion
We confirm the results from previous studies that input data for energy system models, especially day-ahead load forecast data, are biased and inaccurate. Nevertheless, many modellers use them unfiltered. Therefore, we show to what extent load forecasts can be improved. We also show how improved input data affect the quality of energy system models’ results. Our paper is thus aimed at energy system modellers who want to provide empirically meaningful results and therefore need large and accurate data sets.
We present a simple time series model to improve the TSO-based load forecast data provided by ENTSO-E. The model captures and removes systematic biases and autoregressive structures present in the load forecast errors. Answering the research question, we find that it can be straightforward to improve input data. Using the example of German day-ahead load forecasts we were able to reduce the RMSE of the error by 22.5%. Since the model is applied to observed forecast errors rather than to the load data itself and does not include load-specific external variables, it can be easily transferred to the pre-processing of other quantities of interest.
To analyse the effect of enhanced load forecasts on electricity system models, we feed the improved load forecast data into the em.power dispatch model. The model is used to generate price estimates for the German day-ahead electricity market, and we present the structure, assumptions, and optimisation equations of the model in detail. Concerning the effect of sequentially preprocessed inputs, we find that the benefits of sequentially improved load forecasts strongly depend on the respective price level, with larger benefits at higher price levels. This is a universal result in line with fundamental theory since, in merit order markets, the impact of load changes on price changes increases with the overall level. We find that in phases of relatively high prices, as in 2018 and 2019, the continuous and sequential, i.e., day-by-day, load data pre-processing leads to an average reduction of the mean squared error of em.power dispatch’s price forecast by nearly 15%. Hence, as the value of traded energy is the product of prices and volumes, our analysis shows that forecasts generally improve more when (a) demand is high and (b) prices are high, i.e., when it matters the most. With this analysis, we showed that the quality of the model results benefits from better data input.
Based on these findings, we recommend that energy system modellers carefully analyse not only the structure and equations of their models but also the quality of input data. This paper demonstrated in the empirical setting of the German wholesale electricity market that input data can be improved significantly and that these improvements can be achieved with straightforward time-series models. Furthermore, we demonstrate that the results of the energy system model benefit from the improved input data.
Although our results are generalisable, further research should extend our analysis by evaluating the impact of better load forecasting using different energy system models and models that focus on markets other than Germany. Furthermore, scholars may investigate the quality of other input parameters, such as generation forecasts from wind and photovoltaics. Modellers focusing on, for example, CO2 emissions or the use of power plants and energy storage can also use our approach and analyse how the quality of their results can be improved.
Data availability
Datasets related to this article and the source code for the entire project are available in a public GitHub repository. The repository contains code and data for the time series model improving the load forecasts as well as code and data for the energy system model. The code reproduces the benchmarks from the paper.
Notes
In all main European markets, wholesale electricity prices are determined in the day-ahead market clearing one day before the actual delivery.
There are 1105 missing values in the hourly TSO load forecast and 38 missing values in the hourly actual load data.
Note that we call it mid-term because we focus on the day-ahead market with an hourly granularity, as opposed to short-term storage with an intra-hourly resolution closer to time of delivery.
Note that the availability of the interconnectors depends on various factors (e.g., congestion within a market zone).
Bulgaria, Cyprus, Greece, Iceland, Ireland, Malta and Romania are not included.
Note that we aggregate the bidding zones of Spain and Portugal to one market, ‘Iberian peninsula’, and the bidding zones of Lithuania, Estonia and Latvia to one market, ‘Baltic’. Also, note that we consider the different bidding zones within countries. However, we aggregate the following zones: in Norway NO1-NO5, in Sweden SE1-SE3, and in Italy all zones except IT-North.
Note that the market area, Germany-Luxembourg-Austria, was split into two market zones (Germany-Luxembourg and Austria) in 2018. Our model accounts for this fact.
GAMS: General Algebraic Modeling System, Version 41 (https://www.gams.com/). The specifications of the computer on which the code was run are as follows: 199 GB RAM, 2.8–3.2 GHz processor.
References
Al-Hamadi, H., Soliman, S.: Short-term electric load forecasting based on Kalman filtering algorithm with moving window weather and load model. Elect. Power Syst. Res. 68, 47–59 (2004). https://doi.org/10.1016/S0378-7796(03)00150-0
Ambrosius, M., Egerer, J., Grimm, V., van der Weijde, A.H.: Risk aversion in multilevel electricity market models with different congestion pricing regimes. Energy Econ. 105, 105701 (2022). https://doi.org/10.1016/j.eneco.2021.105701
Amjady, N.: Short-term hourly load forecasting using time-series modeling with peak load estimation capability. IEEE Trans. Power Syst. 16, 798–805 (2001). https://doi.org/10.1109/59.962429
BNetzA.: Kraftwerksliste der Bundesnetzagentur (2021). https://www.bundesnetzagentur.de/DE/Sachgebiete/ElektrizitaetundGas/Unternehmen_Institutionen/Versorgungssicherheit/Erzeugungskapazitaeten/Kraftwerksliste/start.html. Accessed on 15-05-2021
Borenstein, S., Bushnell, J.B., Wolak, F.A.: Measuring market inefficiencies in California’s restructured wholesale electricity market. Am. Econ. Rev. 92, 1376–1405 (2002). https://doi.org/10.1257/000282802762024557
Box, G.E.P., Jenkins, G.M., Reinsel, G.C., Ljung, G.M.: Time series analysis: forecasting and control. Wiley series in probability and statistics. 5th edn. John Wiley and Sons Inc., Hoboken, New Jersey (2015). https://doi.org/10.1111/jtsa.12194
Destatis Statistisches Bundesamt: Erzeugerpreise gewerblicher Produkte (Inlandsabsatz). Preise für leichtes Heizöl, Motorenbenzin und Diesel (2021). https://www.destatis.de/DE/Themen/Wirtschaft/Preise/Erzeugerpreisindex-gewerbliche-Produkte/_inhalt.html. Accessed on 25-01-2021
Cancelo, J.R., Espasa, A., Grafe, R.: Forecasting the electricity load from one day to one week ahead for the Spanish system operator. Int. J. Forecast. 24, 588–602 (2008). https://doi.org/10.1016/j.ijforecast.2008.07.005
Chen, J.F., Wang, W.M., Huang, C.M.: Analysis of an adaptive time-series autoregressive moving-average (arma) model for short-term load forecasting. Elect. Power Syst. Res. 34, 187–196 (1995). https://doi.org/10.1016/0378-7796(95)00977-1
Do, L.P.C., Lin, K.H., Molnár, P.: Electricity consumption modelling: a case of Germany. Econ. Model. 55, 92–101 (2016). https://doi.org/10.1016/j.econmod.2016.02.010
EBC: Europe Beyond Coal: European Coal Plant Database (2021). https://beyond-coal.eu/database/. Accessed on 25-01-2021
EEX: European Energy Exchange: Historic gas price data (2021). Accessed on 15-05-2021
Egerer, J., Grimm, V., Kleinert, T., Schmidt, M., Zöttl, G.: The impact of neighboring markets on renewable locations, transmission expansion, and generation investment. Eur. J. Oper. Res. 292, 696–713 (2021). https://doi.org/10.1016/j.ejor.2020.10.055
Eising, M., Hobbie, H., Möst, D.: Future wind and solar power market values in Germany—evidence of spatial and technological dependencies? Energy Econ. 86, 104638 (2020). https://doi.org/10.1016/j.eneco.2019.104638
Engelhorn, T., Möbius, T.: On the development of wind market values and the influence of technology and weather: a German case study. Zeitschrift für Energiewirtschaft, 1–23 (2022). https://doi.org/10.1007/s12398-022-00319-2
ENTSO-E Transparency Platform: Actual Generation per Production Type (2021). https://transparency.entsoe.eu/. Accessed on 15-05-2021
ENTSO-E Transparency Platform: Day-ahead prices (2021). https://transparency.entsoe.eu/. Accessed on 15-05-2021
ENTSO-E Transparency Platform: Forecasted Transfer Capacities - Day Ahead (2021). https://transparency.entsoe.eu/. Accessed on 15-05-2021
ENTSO-E Transparency Platform: Generation Forecast - Day ahead (2021). https://transparency.entsoe.eu/. Accessed on 15-05-2021
ENTSO-E Transparency Platform: Installed Capacities per Production Type (2021). https://transparency.entsoe.eu/. Accessed on 15-05-2021
ENTSO-E Transparency Platform: Total Load - Day Ahead / Actual (2021). https://transparency.entsoe.eu/. Accessed on 15-05-2021
ENTSO-E Transparency Platform: Unavailability of Production and Generation Units (2021). https://transparency.entsoe.eu/. Accessed on 15-05-2021
ENTSO-E: TYNDP 2018 Scenario Report (2018). https://tyndp.entsoe.eu/tyndp2018/scenario-report. Accessed on 23-02-2022
European Commission: Eurostat Statistics Database (2021). https://ec.europa.eu/eurostat/data/database. Accessed on 15-05-2021
Heinisch, V., Göransson, L., Erlandsson, R., Hodel, H., Johnsson, F., Odenberger, M.: Smart electric vehicle charging strategies for sectoral coupling in a city energy system. Appl. Energy 288, 116640 (2021). https://doi.org/10.1016/j.apenergy.2021.116640
Hellwig, M.: Entwicklung und Anwendung parametrisierter Standard-Lastprofile. Dissertation, Technische Universität München (2003)
Hirth, L.: The market value of variable renewables: the effect of solar wind power variability on their relative price. Energy Econ. 38, 218–236 (2013). https://doi.org/10.1016/j.eneco.2013.02.004
Hirth, L., Mühlenpfordt, J., Bulkeley, M.: The ENTSO-E Transparency Platform—a review of Europe’s most ambitious electricity data platform. Appl. Energy 225, 1054–1067 (2018). https://doi.org/10.1016/j.apenergy.2018.04.048
Hyndman, R.J., Athanasopoulos, G.: Forecasting : principles and practice. Otexts: Melbourne, Australia, Lexington, Ky (2021). https://otexts.com/fpp3/. Accessed on 04-02-2022
JAO Joint Allocation Office: ATC for Shadow Auction (2021). https://www.jao.eu/implict-allocation. Accessed on 15-05-2021
Kitzing, L., Juul, N., Drud, M., Boomsma, T.K.: A real options approach to analyse wind energy investments under different support schemes. Appl. Energy 188, 83–96 (2017). https://doi.org/10.1016/j.apenergy.2016.11.104
Koirala, B., Hers, S., Morales-España, G., Özdemir, Ö., Sijm, J., Weeda, M.: Integrated electricity, hydrogen and methane system modelling framework: application to the Dutch infrastructure outlook 2050. Appl. Energy 289, 116713 (2021). https://doi.org/10.1016/j.apenergy.2021.116713
Kunz, F., Weibezahn, J., Hauser, P., Heidari, S., Schill, W.P., Felten, B., Kendziorski, M., Zech, M., Zepter, J., von Hirschhausen, C., Möst, D., Weber, C.: Reference Data Set: Electricity, Heat, and Gas Sector Data for Modeling the German System (2017). https://doi.org/10.5281/zenodo.1044463
Li, Y., Wang, R., Li, Y., Zhang, M., Long, C.: Wind power forecasting considering data privacy protection: a federated deep reinforcement learning approach. Appl. Energy 329, 120291 (2023). https://www.sciencedirect.com/science/article/pii/S0306261922015483, https://doi.org/10.1016/j.apenergy.2022.120291
Li, Z., Li, Y., Liu, Y., Wang, P., Lu, R., Gooi, H.B.: Deep learning based densely connected network for load forecasting. IEEE Trans. Power Syst. 36, 2829–2840 (2021). https://doi.org/10.1109/TPWRS.2020.3048359
Lienert, M., Lochner, S.: The importance of market interdependencies in modeling energy systems - the case of the European electricity generation market. Int. J. Elect. Power Energy Syst. 34, 99–113 (2012). https://doi.org/10.1016/j.ijepes.2011.09.010
Lin, L., Xue, L., Hu, Z., Huang, N.: Modular predictor for day-ahead load forecasting and feature selection for different hours. Energies (2018). https://doi.org/10.3390/en11071899
Lütkepohl, H.: New introduction to multiple time series analysis. Springer, Berlin (2005)
Maciejowska, K., Nitka, W., Weron, T.: Enhancing load, wind and solar generation for day-ahead forecasting of electricity prices. Energy Econ. 99, 105273 (2021). https://doi.org/10.1016/j.eneco.2021.105273
Marcjasz, G., Serafin, T., Weron, R.: Selection of calibration windows for day-ahead electricity price forecasting. Energies 11, 2364 (2018). https://doi.org/10.3390/EN11092364
Möbius, T., Riepin, I., Müsgens, F., van der Weijde, A.H.: Risk aversion in flexible electricity markets (2021). https://doi.org/10.48550/ARXIV.2110.04088
Möst, D., Keles, D.: A survey of stochastic modelling approaches for liberalised electricity markets. Eur. J. Oper. Res. 207, 543–556 (2010). https://doi.org/10.1016/j.ejor.2009.11.007
Müsgens, F., Neuhoff, K.: Modelling dynamic constraints in electricity markets and the costs of uncertain wind output (2006)
Müsgens, F.: Quantifying market power in the German wholesale electricity market using a dynamic multi-regional dispatch model. J. Ind. Econ. 54, 471–498 (2006). https://doi.org/10.1111/j.1467-6451.2006.00297.x
Nahmmacher, P., Schmid, E., Pahle, M., Knopf, B.: Strategies against shocks in power systems—an analysis for the case of Europe. Energy Econ. 59, 455–465 (2016). https://doi.org/10.1016/j.eneco.2016.09.002
Naz, A., Javed, M.U., Javaid, N., Saba, T., Alhussein, M., Aurangzeb, K.: Short-term electric load and price forecasting using enhanced extreme learning machine optimization in smart grids. Energies (2019). https://doi.org/10.3390/en12050866
Open Power System Data: Data Package National Generation Capacity (2020). Version 2019-12-02. https://doi.org/10.25832/national_generation_capacity/2019-12-02. Accessed on 20-12-2020
Open Power System Data: Data Package Weather Data (2020). Version 2020-09-16. https://doi.org/10.25832/weather_data/2020-09-16. Accessed on 20-12-2020
Qussous, R., Harder, N., Weidlich, A.: Understanding power market dynamics by reflecting market interrelations and flexibility-oriented bidding strategies. Energies (2022). https://doi.org/10.3390/en15020494
Riepin, I., Möbius, T., Müsgens, F.: Modelling uncertainty in coupled electricity and gas systems - is it worth the effort? Appl. Energy 285, 116363 (2021). https://doi.org/10.1016/j.apenergy.2020.116363
Rodrigues, F., Trindade, A.: Load forecasting through functional clustering and ensemble learning. Knowl. Inf. Syst. 57, 229–244 (2018). https://doi.org/10.1007/S10115-018-1169-Y
Sandbag: CO2 emission allowance (2020). https://sandbag.be/index.php/carbon-price-viewer/. Accessed on 20-02-2020
Sauma, E.E., Oren, S.S.: Proactive planning and valuation of transmission investments in restructured electricity markets. J. Regul. Econ. 30, 261–290 (2006). https://doi.org/10.1007/S11149-006-9003-Y
Schill, W.P., Zerrahn, A.: Long-run power storage requirements for high shares of renewables: results and sensitivities. Renew. Sustain. Energy Rev. 83, 156–171 (2018). https://doi.org/10.1016/j.rser.2017.05.205
Schill, W.P., Pahle, M., Gambardella, C.: Start-up costs of thermal power plants in markets with increasing shares of variable renewable generation. Nat. Energy 2, 1–6 (2017). https://doi.org/10.1038/nenergy.2017.50
Schröder, A., Kunz, F., Meiss, J., Mendelevitch, R., von Hirschhausen, C.: Current and Prospective Costs of Electricity Generation until 2050. DIW Data Documentation 68 (2013)
Singh, A.K., Ibraheem, Khatoon, S., Muazzam, M., Chaturvedi, D.K.: Load forecasting techniques and methodologies: a review. In: 2012 2nd International Conference on Power, Control and Embedded Systems, pp. 1–10 (2012). https://doi.org/10.1109/ICPCES.2012.6508132
Suganthi, L., Samuel, A.A.: Energy models for demand forecasting - a review. Renew. Sustain. Energy Rev. 16, 1223–1240 (2012). https://doi.org/10.1016/j.rser.2011.08.014
Tan, Z., Zhang, J., Wang, J., Xu, J.: Day-ahead electricity price forecasting using wavelet transform combined with ARIMA and GARCH models. Appl. Energy 87, 3606–3610 (2010). https://doi.org/10.1016/j.apenergy.2010.05.012
UBA: Umweltbundesamt: Datenbank “kraftwerke in deutschland” (2020). https://www.umweltbundesamt.de/dokument/datenbank-kraftwerke-in-deutschland. Accessed on 20-02-2020
Vaillancourt, K., Bahn, O., Frenette, E., Sigvaldason, O.: Exploring deep decarbonization pathways to 2050 for Canada using an optimization energy model framework. Appl. Energy 195, 774–785 (2017). https://doi.org/10.1016/j.apenergy.2017.03.104
Wang, D., Gan, J., Mao, J., Chen, F., Yu, L.: Forecasting power demand in China with a CNN-LSTM model including multimodal information. Energy 263, 126012 (2023). https://doi.org/10.1016/j.energy.2022.126012
Weigt, H., von Hirschhausen, C.: Price formation and market power in the German wholesale electricity market in 2006. Energy Policy 36, 4227–4234 (2008). https://doi.org/10.1016/j.enpol.2008.07.020 (transition towards Sustainable Energy Systems)
Weron, R., Misiorek, A.: Modeling and forecasting electricity loads: a comparison. Proceedings of the European Electricity Market EEM-04 (2005)
Weron, R.: Modeling and forecasting electricity loads and prices: a statistical approach. Wiley finance series. Wiley & Sons, Chichester (2006)
Wu, Z., Zhao, X., Ma, Y., Zhao, X.: A hybrid model based on modified multi-objective cuckoo search algorithm for short-term load forecasting. Appl. Energy 237, 896–909 (2019). https://doi.org/10.1016/j.apenergy.2019.01.046
Yang, Y., Wu, J., Chen, Y., Li, C.: A new strategy for short-term load forecasting. Abstract Appl. Anal. (2013). https://doi.org/10.1155/2013/208964
Ziel, F.: Modeling public holidays in load forecasting: a German case study. J. Modern Power Syst. Clean Energy 6, 191–207 (2018). https://doi.org/10.1007/s40565-018-0385-5
Ziel, F., Weron, R.: Day-ahead electricity price forecasting with high-dimensional structures: univariate vs. multivariate modeling frameworks. Energy Econ. 70, 396–420 (2018). https://doi.org/10.1016/j.eneco.2017.12.016
Acknowledgements
The work was supported by the German Federal Ministry of Economic Affairs and Climate Action through the research project “ProKoMo - Better price forecasts in the energy sector by combining fundamental and stochastic models” within the Systems Analysis Research Network of the 6th energy research program.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Descriptive statistics for various time resolutions
Appendix B: Nomenclature
Sets and indices
- \(BP\): Time blocks for primary control power
- \(BS\): Time blocks for secondary control power
- \(RES (I)\): Intermittent renewables [subset of \(I\)]
- \(STL (I)\): Long-term storage [subset of \(I\)]
- \(STM (I)\): Mid-term storage [subset of \(I\)]
- \(tfirst (T)\): First hour of a rolling window
- \(tlast (T)\): Last hour of a rolling window
- \(I\): Electricity generation capacity cluster
- \(N, NN\): Node
- \(T\): Time steps
Parameters
- \(\eta _i\): Efficiency rate of a generation technology
- \(af _{i,n,t}\): Availability factor for generation capacities
- \(cap _{i,n,t}\): Installed generation capacity [\(MW_{el}\)]
- \(chp _{i,n,t}\): Minimum electricity output of combined heat power units [\(MWh_{el}/h\)]
- \(curtc\): Costs for RES curtailment [€\(/MWh_{el}\)]
- \(epf\): Energy-power factor for mid-term storage plants [\(MWh_{el}/MW_{el}\)]
- \(g _{i,n,t}^{min}\): Minimum generation of a running unit
- \(ntc _{n,nn,t}\): Net transfer capacities [\(MWh_{el}/h\)]
- \(out _{i,n,t}\): Power plant outages [\(MW_{el}\)]
- \(sc _{i,n,t}\): Start-up costs [€\(/MW_{el}\)]
- \(vc _{i,n,t}^{FL}\): Variable generation costs at full load [€\(/MWh_{el}\)]
- \(vc _{i,n,t}^{ML}\): Variable generation costs at minimum load [€\(/MWh_{el}\)]
- \(voll\): Value of lost load [€\(/MWh_{el}\)]
- \(wv _{i,n,t}\): Water value for hydro reservoirs and long-term storage [€\(/MWh_{el}\)]
- \(d_{n,t}\): Electricity demand [\(MWh_{el}/h\)]
Variables
- \(CL _{i,n,t}\): Charging activity for long-term storage [\(MWh_{el}/h\)]
- \(CM _{i,n,t}\): Charging activity for mid-term storage [\(MWh_{el}/h\)]
- \(CURT _{i,n,t}\): RES curtailment [\(MWh_{el}/h\)]
- \(FLOW _{n,nn,t}\): Electricity flow from node \(n\) to node \(nn\) [\(MWh_{el}/h\)]
- \(PCR _{i,n,t}\): Primary control reserve [\(MW_{el}\)]
- \(SCR _{i,n,t}^{neg}\): Negative secondary control reserve [\(MW_{el}\)]
- \(SCR _{i,n,t}^{pos}\): Positive secondary control reserve [\(MW_{el}\)]
- \(SHED _{n,t}\): Load shedding [\(MWh_{el}/h\)]
- \(SL _{i,n,t}\): Storage level of PSP [\(MWh_{el}\)]
- \(SU _{i,n,t}\): Start-up activity of a generation unit [\(MW_{el}\)]
- \(TC\): Total system costs [€]
- \(G_{i,n,t}\): Electricity generation [\(MWh_{el}/h\)]
- \(P_{i,n,t}^{on}\): Running capacity [\(MW_{el}\)]
Appendix C: Error measures for different rolling window lengths
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Möbius, T., Watermeyer, M., Grothe, O. et al. Enhancing energy system models using better load forecasts. Energy Syst (2023). https://doi.org/10.1007/s12667-023-00590-3
DOI: https://doi.org/10.1007/s12667-023-00590-3