1 Introduction

Over the last three decades, electricity markets have been undergoing significant structural changes. At the same time, the price risk faced by wholesale electricity market participants has increased significantly. Limited storage possibilities, technical constraints of the transmission grid and the importance of electricity supply lead to much higher price variability than in other commodity markets. In recent years we have observed a rapid transformation of the overall electricity production profile in the European electricity markets, with a growing share of renewable energy sources (RES). This makes not only electricity demand but also supply highly weather dependent and, as a consequence, electricity prices can be even more volatile. The distributed generation caused by the growth of RES has changed not only energy production but also the profiles of the market participants. A number of small producers and traders have joined the market. They face a significant risk associated with electricity price volatility, but at the same time electricity markets give them a range of trading opportunities. Since electricity prices are not known in advance, any trade planning needs to be based on price forecasts. In this context, several trading strategies have recently been proposed in the literature (Maciejowska et al. 2019; Serafin et al. 2022; Agakishiev et al. 2023). They utilize point as well as probabilistic forecasts of electricity prices. We believe that improving the accuracy of the forecasts would also improve the efficiency of trading strategies, see e.g. Nitka and Weron (2023) for an empirical analysis of this issue.

There are many articles dedicated to point forecasting of day-ahead electricity prices, see Weron (2014) or Lago et al. (2021) for extensive reviews. The extension from point to probabilistic forecasting methods has also gained much attention in recent years, see Nowotarski and Weron (2018) for a review. The latter takes into account not only the best estimate of a future value but also the uncertainty of the prediction. As a consequence, it brings much more information to a decision maker and allows, e.g., for direct risk management. One of the methods that have been successfully applied in probabilistic electricity price forecasting is the Quantile Regression Averaging (QRA) proposed by Nowotarski and Weron (2015). It is built on the quantile regression method (Koenker and Basset 1978) combined with different point forecasts of electricity prices. In this paper we follow this direction and introduce the Expectile Regression Averaging (ERA) method. It uses the notion of expectiles introduced originally by Newey and Powell (1987). Expectiles can be viewed as a description of the distribution analogous to quantiles (Gneiting 2009) and there exists a unique mapping between expectiles and quantiles (Yao and Tong 1996). However, the calculation of expectiles differs from the calculation of quantiles in some important respects. First, the estimation of the expectile regression is based on the asymmetric least squares method, in contrast to the quantile regression, which is based on the asymmetric least absolute deviations. Second, a \(\tau\)-expectile accounts for the shape of the whole distribution, while an \(\alpha\)-quantile accounts only for the value of the cumulative distribution function (CDF) at the \(\alpha\) level. The expectile function can be used as an alternative description of a distribution but, as shown in an empirical study by Waltrup et al. (2015), deriving quantiles from estimated expectiles can also be more efficient than the direct empirical calculation of the former.

Due to their numerical and statistical properties, expectiles have seen increasing interest in recent years. They have been used, among others, in regression analysis (Waltrup et al. 2015), functional factor modelling (Burdejová and Härdle 2019), estimation of extremes (Girard and Stupfler 2022) and multivariate data analysis (Cascos and Ochoa 2021). Expectiles have also gained a lot of attention in finance, since Kuan et al. (2008) adopted them as a risk measure, called the expectile Value at Risk (EVaR). Although the interpretation of EVaR is less straightforward than for the classical risk measures, using expectiles allows one to overcome the known drawbacks of the latter. The commonly used Value at Risk measure is not coherent, while the Expected Shortfall is non-elicitable. In contrast, EVaR is coherent and at the same time elicitable (Ziegel 2016; Bellini and Di Bernardino 2015). Expectiles were also recently used as a risk measure for the electricity market by Syuhada et al. (2021) and Janczura and Wójcik (2022). For other applications concerning electricity markets see also the work of Taylor (2021) or Melzer et al. (2019). However, to the best of our knowledge, expectiles have not yet been used in the context of forecast averaging for electricity prices.

The aim of a probabilistic forecast is to estimate the future distribution of a considered variable. It yields possible scenarios of the future values and assigns probabilities to them. In practice it can be used for different purposes, like controlling the uncertainty of a point prediction, risk assessment or decision making. Depending on the context, predicting the whole distribution or only some of its values can be desired. The ERA method proposed in this paper is suitable for both approaches. On the one hand, it yields an estimate of an expectile at a given level, which can be used directly, e.g. for risk assessment with the EVaR measure (see Janczura and Wójcik (2022) for an example of using EVaR in diversification strategies for electricity markets). On the other hand, calculating a grid of expectile predictions yields an approximation of the whole forecasted distribution. It can be used directly for sampling scenarios (like e.g. in Agakishiev et al. (2023) or Janczura and Puć (2023)), or it can be transformed into quantile predictions. For electricity, the latter were used e.g. by Nitka and Weron (2023) for controlling risk appetite in different bidding strategies. In this paper, we treat the QRA method as a benchmark for probabilistic electricity price forecasting. Hence, in the empirical study we transform expectiles into quantiles to allow for a reliable comparison. However, the ERA method can also be used directly and in that case the transformation is not needed.

The rest of the paper is structured as follows. In Sect. 2 we briefly describe the notion of expectiles and show their analogies as well as differences from quantiles. Section 3 is devoted to the construction of probabilistic forecasts of electricity prices. In particular, in Sect. 3.2 we introduce the Expectile Regression Averaging method. Next, in Sect. 4 we apply the proposed technique to the day-ahead electricity prices from the German market and compare its performance with some benchmark probabilistic forecasts. Finally, in Sect. 5 we conclude.

2 Expectiles and quantiles

A standard way of describing the probability distribution of a random variable is in terms of the CDF and its inverse, i.e. quantiles. Another notion that can be used in such a context is the expectile. An expectile at level \(\tau\), \(e_{\tau }\) (\(0<\tau <1\)), is defined as the unique solution of (Newey and Powell 1987)

$$\begin{aligned} \tau \mathbb {E}[(Y-e_{\tau }(Y))_{+}] = (1-\tau )\mathbb {E}[(Y-e_{\tau }(Y))_{-}], \end{aligned}$$
(1)

where \((x)_+=\max (x,0)\) and \((x)_-=\max (-x,0)\) denote the positive and negative part of a variable x. For \(\tau =\frac{1}{2}\) the expectile is equal to the mean of the distribution, so expectiles are often seen as an asymmetric generalization of the mean (Gneiting 2009). On the other hand, if the expected value in (1) is replaced with the probability of nonzero values, then the formula defines the quantile, yielding the median for \(\tau =\frac{1}{2}\). Hence, expectiles generalize the mean in a similar way as quantiles generalize the median (Nolde and Ziegel 2017), but are based on the mean distance instead of the probability. As a consequence, they include information on the size of exceedances, in contrast to quantiles, which are based only on their frequency.

Expectiles can also be defined as the minimizers of the asymmetric quadratic loss function (Bellini and Di Bernardino 2015)

$$\begin{aligned} e_{\tau }(Y)= \arg \min _{x\in \mathbb {R}} \tau \mathbb {E}[(Y -x)^2_{+}] + (1-\tau )\mathbb {E}[(Y -x)^2_{-}]. \end{aligned}$$
(2)

Note that for \(\tau =\frac{1}{2}\) this loss function is just the standard mean square error (MSE). For quantiles we have an analogous absolute loss function

$$\begin{aligned} q_{\alpha }(Y)= \arg \min _{x\in \mathbb {R}} \alpha \mathbb {E}[|Y -x|_{+}] + (1-\alpha )\mathbb {E}[|Y -x|_{-}], \end{aligned}$$
(3)

which for \(\alpha =\frac{1}{2}\) is the mean absolute error (MAE). The loss functions (2) and (3) are also a basis for the expectile and quantile regression, generalizing the classical linear regression model in terms of the predicted variable distribution. These methods will be further used in the paper for forecast construction.
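As an illustration of definition (2), the following minimal sketch computes an empirical expectile of a sample. It is not the implementation used in the paper: the function names `sample_expectile` and `asymmetric_squared_loss` are hypothetical, and the fixed-point iteration (a weighted mean with weights \(\tau\) above and \(1-\tau\) below the current value) is one of several possible ways of minimizing the empirical counterpart of (2).

```python
def sample_expectile(sample, tau, tol=1e-12, max_iter=1000):
    """Empirical tau-expectile: minimizer of the empirical version of loss (2)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    e = sum(sample) / len(sample)  # start from the mean, i.e. the 0.5-expectile
    for _ in range(max_iter):
        # Weight tau for observations above e and (1 - tau) for the others.
        weights = [tau if y > e else 1.0 - tau for y in sample]
        new_e = sum(w * y for w, y in zip(weights, sample)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e


def asymmetric_squared_loss(sample, x, tau):
    """Empirical counterpart of the objective in (2), evaluated at x."""
    return sum(
        (tau if y >= x else 1.0 - tau) * (y - x) ** 2 for y in sample
    ) / len(sample)
```

For \(\tau =\frac{1}{2}\) all weights are equal and the function returns the sample mean, in line with the remark on the MSE above.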

Both quantiles and expectiles describe the distribution of a random variable. Naturally, they are related to each other. As shown by Yao and Tong (1996), there exists a unique function h such that

$$\begin{aligned} q_{\alpha }(Y)=e_{h(\alpha )}(Y). \end{aligned}$$

It is given by

$$\begin{aligned} h(\alpha )=\frac{-\alpha q_{\alpha }(Y)+G(q_{\alpha }(Y))}{-e_{0.5}(Y)+2G(q_{\alpha }(Y))+(1-2\alpha )q_{\alpha }(Y)}, \end{aligned}$$

where \(G(x)=\int _{-\infty }^x ydF(y)\) is the partial moment function and F is the CDF of Y. The inverse relation also exists. Expectiles are linked with the CDF F by (Waltrup et al. 2015)

$$\begin{aligned} e_{\tau }(Y)=\frac{(1-\tau )G(e_{\tau }(Y))+\tau (e_{0.5}(Y)-G(e_{\tau }(Y)))}{(1-\tau )F(e_{\tau }(Y))+\tau (1-F(e_{\tau }(Y)))}. \end{aligned}$$
(4)

Hence, quantiles can be calculated from expectiles, and expectiles can be calculated from quantiles, but it usually requires some numerical approximations.
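The two relations can be checked on a distribution with closed-form ingredients. The sketch below is only an illustration under our own choice of example: it uses the uniform distribution on [0, 1], for which \(q_{\alpha }=\alpha\), \(G(x)=x^2/2\) and \(e_{0.5}=\frac{1}{2}\); the function names are hypothetical.

```python
def h(alpha, quantile, partial_moment, mean):
    """Expectile level h(alpha) such that q_alpha = e_{h(alpha)}."""
    q = quantile(alpha)
    g = partial_moment(q)
    return (-alpha * q + g) / (-mean + 2.0 * g + (1.0 - 2.0 * alpha) * q)


def expectile_rhs(tau, e, cdf, partial_moment, mean):
    """Right-hand side of (4) evaluated at a candidate expectile e."""
    g = partial_moment(e)
    f = cdf(e)
    return ((1.0 - tau) * g + tau * (mean - g)) / ((1.0 - tau) * f + tau * (1.0 - f))


# Ingredients for the uniform distribution on [0, 1] (illustrative choice).
def uniform_quantile(alpha):
    return alpha


def uniform_cdf(x):
    return x


def uniform_partial_moment(x):
    return 0.5 * x * x


UNIFORM_MEAN = 0.5

tau_for_q90 = h(0.9, uniform_quantile, uniform_partial_moment, UNIFORM_MEAN)
# Plugging the 0.9-quantile into (4) with tau = h(0.9) should return 0.9 again.
fixed_point = expectile_rhs(tau_for_q90, 0.9, uniform_cdf, uniform_partial_moment, UNIFORM_MEAN)
```

In this example \(h(\alpha )=\alpha ^2/(\alpha ^2+(1-\alpha )^2)\), so the 0.9-quantile corresponds to an expectile at a level of roughly 0.988, which shows how far the two scales can differ in the tails.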

3 Probabilistic forecasts of electricity prices

3.1 Variance stabilizing transformation

Since electricity prices are known to be highly volatile, we apply a variance stabilizing transformation prior to calibration of the models (see Uniejewski et al. 2018 for a discussion of the use of different transformations in this context). Here, we apply the inverse hyperbolic sine (asinh) function, which can be viewed as a generalization of the logarithmic transformation that is also suitable for negative prices. Denote the electricity price for delivery during hour h on day t by \(p_{h,t}\). The prices are transformed in the following way:

$$\begin{aligned} P_{h,t} = \text {asinh}(y_{h,t}) = \log \left( y_{h,t} + \sqrt{(y_{h,t})^2 + 1}\right) , \end{aligned}$$

where \(y_{h,t}\) is the normalized price, \(y_{h,t} = {\left(p_{h,t} - \mu _h \right) /\sigma _{h}}\), with \(\sigma _h\) being here the standard deviation of prices \(p_{h,t}\) in the calibration window and \(\mu _h\) the corresponding mean.

For practical applications one is usually interested in predictions of the original prices. Hence, in this paper, predictions calculated for transformed prices are in the end inverted back. Since inverting the asinh transformation of random variables is not straightforward (see Narajewski and Ziel 2020 for a discussion on this issue), we use the Monte Carlo approach. Namely, first we simulate n day-ahead price scenarios, \(\hat{P}^j_{h,t}, j=1,2, \ldots ,n\), using the predicted distribution. Next, we invert each of them using the hyperbolic sine function

$$\begin{aligned} {\hat{p}}^j_{h,t}= \sigma _h \cdot \text {sinh}\left( \hat{P}^j_{h,t} \right) + \mu _h , \quad j=1,2,\ldots ,n. \end{aligned}$$
(5)

Finally, the empirical distribution of the inverted day-ahead price scenarios \(\hat{p}^1_{h,t},\hat{p}^2_{h,t},\ldots ,\hat{p}^n_{h,t}\) yields the probabilistic forecast of the price for day t and hour h.
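A minimal sketch of the transformation and of the scenario inversion (5) is given below. The function names are hypothetical, the use of the sample standard deviation is our assumption (the text does not specify the estimator), and the scenarios \(\hat{P}^j_{h,t}\) are assumed to have been simulated beforehand.

```python
import math
import statistics


def asinh_transform(prices):
    """Normalize prices from one calibration window and apply asinh."""
    mu = statistics.fmean(prices)
    sigma = statistics.stdev(prices)  # assumption: sample standard deviation
    transformed = [math.asinh((p - mu) / sigma) for p in prices]
    return transformed, mu, sigma


def invert_scenarios(scenarios, mu, sigma):
    """Map simulated transformed-price scenarios back to prices, as in (5)."""
    return [sigma * math.sinh(s) + mu for s in scenarios]
```

Applying `invert_scenarios` to the output of `asinh_transform` with the same \(\mu _h\) and \(\sigma _h\) recovers the original prices, including negative ones.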

3.2 Quantile and expectile regression averaging

One of the commonly used methods for probabilistic forecasting of electricity prices is the Quantile Regression Averaging (QRA) proposed by Nowotarski and Weron (2015). It is based on applying the quantile regression (Koenker and Basset 1978) to a pool of point forecasts of different individual models. Namely, probabilistic forecasts of \(P_{h,t}\) are determined as the following linear combination (Nowotarski and Weron 2015)

$$\begin{aligned} \hat{q}_{P_{h,t}}\left( \alpha \right) =\hat{\textbf{P}}_{h,t}\textbf{w}_{\alpha }, \end{aligned}$$
(6)

where \(\hat{q}_{P_{h,t}}(\alpha )\) is an \(\alpha\)-quantile of the forecasted distribution, \(\hat{\textbf{P}}_{h,t}\) is a vector of K corresponding individual point forecasts, while \(\textbf{w}_{\alpha }\) is a column vector of weights for the \(\alpha\)-quantile. The weights \(\textbf{w}_{\alpha }\) are estimated by minimizing the quantile loss function

$$\begin{aligned} \min _{\textbf{w}_{\alpha }} \left[ \sum _{t=1}^T\left( \alpha \left| P_{h,t} - \hat{\textbf{P}}_{h,t}\textbf{w}_{\alpha }\right| \mathbbm {1}_{\{P_{h,t} \ge \hat{\textbf{P}}_{h,t}\textbf{w}_{\alpha }\}}+ (1-\alpha ) \left| P_{h,t} -\hat{\textbf{P}}_{h,t}\textbf{w}_{\alpha }\right| \mathbbm {1}_{\{P_{h,t} < \hat{\textbf{P}}_{h,t}\textbf{w}_{\alpha }\}}\right) \right] . \end{aligned}$$
(7)

Note that the weights are calculated independently for each of the considered quantiles. This accounts for the fact that different point forecasts might carry distinct information on different parts of the predicted distribution. Nowotarski and Weron (2015) showed that such an averaging scheme outperforms the empirical distribution of the errors of combined point forecasts in terms of forecast accuracy.

In this paper we follow the forecast averaging approach, but we propose to combine it with the expectile regression. It is similar to the quantile regression, but the absolute loss function (7) is replaced with the quadratic one (2). Hence, in the Expectile Regression Averaging (ERA) method the \(\tau\)-expectile of the predicted distribution, \(\hat{e}_{P_{h,t}}\left( \tau \right)\), is calculated as

$$\begin{aligned} \hat{e}_{P_{h,t}}\left( \tau \right) =\hat{\textbf{P}}_{h,t}\textbf{w}_{\tau }, \end{aligned}$$
(8)

where \(\hat{\textbf{P}}_{h,t}\) is the vector of point forecasts from the individual models and \(\textbf{w}_{\tau }\) are the weights estimated from

$$\begin{aligned} \min _{\textbf{w}_{\tau }} \left\{ \sum _{t=1}^T\left[ \tau \left( P_{h,t} - \hat{\textbf{P}}_{h,t}\textbf{w}_{\tau }\right) ^2 \mathbbm {1}_{\{P_{h,t} \ge \hat{\textbf{P}}_{h,t}\textbf{w}_{\tau }\}}+ (1-\tau ) \left( P_{h,t} - \hat{\textbf{P}}_{h,t}\textbf{w}_{\tau }\right) ^2\mathbbm {1}_{\{P_{h,t} < \hat{\textbf{P}}_{h,t}\textbf{w}_{\tau }\}}\right] \right\} . \end{aligned}$$

Note that the expectile regression is based on the \(L_2\) norm, yielding here an asymmetric least squares (ALS) method, while the quantile regression is based on the \(L_1\) norm. The latter is more robust to outliers, but on the other hand the least squares method possesses better numerical properties.
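To make the ALS estimation of the ERA weights concrete, a minimal sketch based on iteratively reweighted least squares is given below. It is one possible way of minimizing the asymmetric quadratic loss and not necessarily the solver used in the paper; the function name `era_weights` and the NumPy-based implementation are our assumptions.

```python
import numpy as np


def era_weights(point_forecasts, actuals, tau, max_iter=100):
    """Asymmetric least squares weights for the tau-expectile.

    point_forecasts: array of shape (T, K), one column per individual model.
    actuals: array of shape (T,) with the realized (transformed) prices.
    """
    X = np.asarray(point_forecasts, dtype=float)
    y = np.asarray(actuals, dtype=float)
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    for _ in range(max_iter):
        above = y >= X @ w
        # Weight tau for observations on or above the fit, (1 - tau) below it.
        sqrt_weights = np.sqrt(np.where(above, tau, 1.0 - tau))
        new_w = np.linalg.lstsq(X * sqrt_weights[:, None], y * sqrt_weights, rcond=None)[0]
        # Once the split of observations no longer changes, new_w is the minimizer.
        if np.array_equal(y >= X @ new_w, above):
            return new_w
        w = new_w
    return w


def era_forecast(new_point_forecasts, weights):
    """Expectile forecast (8) for a new vector of K point forecasts."""
    return np.asarray(new_point_forecasts, dtype=float) @ weights
```

Each step is a weighted least squares problem with a closed-form solution, which illustrates the numerical convenience of the \(L_2\)-based loss mentioned above. For \(\tau =\frac{1}{2}\) the weights coincide with those of an ordinary least squares combination.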

3.3 Individual models

The ERA and QRA methods use a linear combination of individual forecasts, so they first require deriving a set of point forecasts. To this end, we consider five expert models, which are standard, frequently used approaches in electricity price modelling (see e.g. Misiorek et al. 2006; Kristiansen 2012; Maciejowska 2020). All are built on autoregressive models with exogenous variables (ARX), in which one assumes that electricity prices can be explained by market fundamentals of a technical or economic nature, like e.g. load, generation or weather conditions. Since the forecasts of physical system variables are often publicly available, the construction of price predictions with the ARX models is straightforward.

In the first model considered in this paper we assume that the transformed electricity price for delivery during hour h of day t, \(P_{h,t}\), is given by

$$\begin{aligned} \text {Model 1:}\quad \quad P_{h,t} = \theta _{1} P_{h,t-1}+\theta _{2} P_{h,t-2}+\theta _{7} P_{h,t-7} + \sum _{i=1}^k \psi _i Z_{h,t}^i + \sum _{i=1}^4\alpha _i D^i_t + \epsilon _{h,t}, \end{aligned}$$
(9)

where \(P_{h,t-i}\) are the autoregressive terms, \(Z_{h,t}^i, i=1,2,\ldots ,k\) are the exogenous variables and \(\epsilon _{h,t}\) is the noise term. In order to account for the weekly seasonality of electricity prices, we also use four dummy variables \(D^i_t=\mathbbm {1}_{\{t\in S_i\}}\) related to different days of the week, here Mondays (\(S_1\) set of time indices), Saturdays (\(S_2\)), Sundays/Holidays (\(S_3\)), and the other days of the week (\(S_4\)).
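The construction of the dummy variables can be sketched as follows. The function name `weekday_dummies` is hypothetical, the set of holidays has to be supplied by the user, and letting a holiday take precedence over the weekday it falls on is our assumption.

```python
from datetime import date


def weekday_dummies(day, holidays=frozenset()):
    """Return (D1, D2, D3, D4): Monday, Saturday, Sunday/holiday, other days."""
    if day in holidays or day.weekday() == 6:  # assumption: holidays override weekdays
        return (0, 0, 1, 0)
    if day.weekday() == 0:
        return (1, 0, 0, 0)
    if day.weekday() == 5:
        return (0, 1, 0, 0)
    return (0, 0, 0, 1)
```

Exactly one dummy equals one for every day, so the four variables jointly play the role of a day-type specific intercept.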

The second model differs from the first one by the number of regressors, for which we consider all transformed prices from a given hour during the past week, i.e.

$$\begin{aligned} \text {Model 2:}\quad \quad P_{h,t} = \sum _{i=1}^7\theta _{i} P_{h,t-i} + \sum _{i=1}^k \psi _i Z_{h,t}^i + \sum _{i=1}^4\alpha _i D^i_t + \epsilon _{h,t}. \end{aligned}$$
(10)

The third model also uses the minimum and maximum of the previous day's transformed prices, so it allows for taking into account nonlinear intraday effects:

$$\begin{aligned} \text {Model 3:}\quad \quad P_{h,t} = \sum _{i=1}^7\theta _{i} P_{h,t-i} + \sum _{i=1}^k \psi _i Z_{h,t}^i + \sum _{i=1}^4\alpha _i D^i_t + \delta \min _h(P_{h,t-1}) + \eta \max _h(P_{h,t-1}) + \epsilon _{h,t}. \end{aligned}$$
(11)

The structure of the fourth model, called the p-ARX (Misiorek et al. 2006), is similar to Model 1, but it is applied to transformed prices with pre-processed spikes. Precisely, the transformed prices that deviate from the mean level in the calibration window by more than three standard deviations are substituted with

$$\begin{aligned} P^p_{h,t} = {\left\{ \begin{array}{ll} L_U+L_U\log _{10}\left( \left| \frac{P_{h,t}}{L_U}\right| \right) &{} \text {if} \quad P_{h,t}>L_U,\\ L_L-|L_L|\log _{10}\left( \left| \frac{P_{h,t}}{L_L}\right| \right) &{} \text {if} \quad P_{h,t}<L_L, \end{array}\right. } \end{aligned}$$
(12)

where the upper level is set to \(L_U= \mu _{P_{h,t}} + 3 \sigma _{P_{h,t}}\), while the lower level to \(L_L= \mu _{P_{h,t}} - 3 \sigma _{P_{h,t}}\).
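The spike pre-processing (12) can be sketched as below. The function names are hypothetical, the sample standard deviation is our assumption, and the sketch presumes that both thresholds are nonzero so that the logarithms are well defined.

```python
import math
import statistics


def damp_price(p, lower, upper):
    """Apply the logarithmic damping (12) to a single transformed price."""
    if p > upper:
        return upper + upper * math.log10(abs(p / upper))
    if p < lower:
        return lower - abs(lower) * math.log10(abs(p / lower))
    return p  # prices within the band are left unchanged


def damp_window(prices, n_sigma=3.0):
    """Damp all spikes in a calibration window using mean +/- n_sigma * std."""
    mu = statistics.fmean(prices)
    sigma = statistics.stdev(prices)  # assumption: sample standard deviation
    lower, upper = mu - n_sigma * sigma, mu + n_sigma * sigma
    return [damp_price(p, lower, upper) for p in prices]
```

A price equal to twice the upper level is, for example, pulled back to \(L_U(1+\log _{10}2)\approx 1.3L_U\), so spikes remain visible but their impact on the least squares estimates is reduced.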

The fifth model specification, m-ARX proposed by Ziel and Weron (2018), is a modification of Model 2, including the weekly mean of the transformed prices \(\overline{P}^W_{h,t}=\frac{1}{7}\sum _{i=1}^7 P_{h,t-i}\) in the following way

$$\begin{aligned} \text {Model 5:}\quad \quad P_{h,t} = \overline{P}^W_{h,t} + \sum _{i=1}^7\theta _{i}( P_{h,t-i} - \overline{P}^W_{h,t}) + \sum _{i=1}^k \psi _i Z_{h,t}^i + \sum _{i=1}^4\alpha _i D^i_t + \epsilon _{h,t}. \end{aligned}$$
(13)

The parameters of the ARX models, \(\theta _i, \alpha _i, \psi _i,\delta ,\eta\), can be estimated using the least squares method. Then, the day-ahead point forecasts for each hour are given by the corresponding linear combination of explanatory variables. The set of these forecasts, \({\hat{\textbf{P}}}_{h,t}\), is then used in the ERA (8) and QRA (6) methods.

As a benchmark we also calculate the probabilistic forecasts using the standard historical simulation method. For each of the individual models we derive the out-of-sample point prediction errors

$$\begin{aligned} \epsilon _{h,t}= \hat{P}_{h,t} - P_{h,t}, \end{aligned}$$
(14)

where \(\hat{P}_{h,t}\) are the point forecasts, while \(P_{h,t}\) are the actual transformed prices. In order to derive the sample of \(\epsilon _{h,t}\)’s, we divide the training data set into two separate periods. First, each of the individual models is calibrated in the first window and the forecasts for the first day of the second window are calculated. Next, we apply the moving window scheme, i.e. the first window is shifted by one day, the models are calibrated again and the forecasts for the next day are calculated. The procedure is repeated until forecasts for the last day of the training data set are derived. Then, the out-of-sample errors (14) are calculated for each day in the second part of the training window, see also Sect. 4.2 for details on the datasets used for the forecast construction. Finally, the probabilistic forecast is calculated as the sum of the point forecast and the empirical distribution of the errors. Here, this forecast is considered in terms of quantiles as well as expectiles. Precisely, the empirical \(\alpha\)-quantile is calculated as the point that divides the sample into two subsets with \(\alpha\) and \(1-\alpha\) frequencies, while the empirical \(\tau\)-expectile is derived from the sample by minimizing the empirical quadratic loss function (2).

Overall, we consider 12 methods for deriving probabilistic forecasts: QRA, ERA as well as historical simulation of expectiles and quantiles from the five individual models. The models are fitted for each hour separately, so in total we consider 24 one-dimensional time series. This is a common approach in electricity price modelling since electricity delivered during different hours is in fact traded as separate products.

3.4 Quantile predictions from expectiles

The considered probabilistic forecasts are given either in terms of quantiles or of expectiles. Both properly describe the predicted distribution, but their accuracy should be evaluated using different scoring functions. Hence, in order to compare the quantile- and expectile-based methods, we transform expectiles into the corresponding quantiles. To this end, we use the procedure proposed by Waltrup et al. (2015). It is based on finding a CDF that minimizes the distance between the derived expectiles and their theoretical values resulting from that CDF (4). Let \(\{\hat{e}({\tau _1}),\hat{e}({\tau _2}),\ldots ,\hat{e}({\tau _L})\}\), \(0<\tau _1<\tau _2<\ldots <\tau _{L}<1\), be a grid of estimated expectiles. Further in the paper we use \(\tau _i \in \{0.001,0.0025,0.005,0.0075,0.01,0.02,0.04,\ldots ,0.98,0.99,0.9925,0.995,0.9975,0.999\}\). Let also \(\hat{e}({\tau _0})=\hat{e}({\tau _1}) + [\hat{e}({\tau _1}) - \hat{e}({\tau _2})]\) and \(\hat{e}({\tau _{L+1}}) = \hat{e}({\tau _{L}})+ [\hat{e}({\tau _{L}}) - \hat{e}({\tau _{L-1}})]\) be the tuning bounds in the tails of the distribution. Assume that the CDF at the estimated expectile \(\hat{e}({\tau _i})\) is given by

$$\begin{aligned} {F}(\hat{e}({\tau _i}))=\sum _{j=1}^i {\gamma }_j \end{aligned}$$
(15)

with non-negative steps \({\gamma }_j\ge 0\), for \(j=1,2,\ldots ,L\), and set \({\gamma }_{L+1}=1-\sum _{j=1}^L {\gamma }_j\ge 0\) to ensure that the CDF is properly defined. Now, observe that the partial moment function can be approximated by \(\hat{G}(\hat{e}({\tau _i}))=\sum _{j=1}^i \hat{c}_j {\gamma }_j\), where \(\hat{c}_j=\frac{\hat{e}({\tau _j})+\hat{e}({\tau _{j-1}})}{2}\) is the midpoint of the interval between two consecutive expectiles. Finally, substituting the expression for the CDF (15) and \(\hat{G}\) into formula (4) and minimizing the distance from the estimated expectiles \(\hat{e}(\tau _i)\) yields estimators \( {\hat{{\gamma}}}=(\hat{\gamma }_1,\hat{\gamma }_2,\ldots , \hat{\gamma }_L)\):

$$\begin{aligned} {\hat{{\gamma }}}=\arg \min _{{\gamma _1},{\gamma _2},\ldots , {\gamma _L}} \sum _{i=1}^L \left[ \hat{e}({\tau _i})-\frac{(1-\tau _i)\sum _{j=1}^i \hat{c}_j {\gamma }_j+\tau _i (\hat{e}({0.5})-\sum _{j=1}^i \hat{c}_j {\gamma }_j)}{(1-\tau _i)\sum _{j=1}^i {\gamma }_j+\tau _i (1-\sum _{j=1}^i {\gamma }_j)}\right] ^2. \end{aligned}$$
(16)

The estimator of the CDF is then simply \(\hat{F}(\hat{e}({\tau _i}))=\sum _{j=1}^i \hat{\gamma }_j\). In the numerical minimization of (16), following Waltrup et al. (2015), we add a penalty \(\lambda \sum _{l=1}^{L-1} \left( \frac{\hat{\gamma }_l}{\hat{e}(\tau _l)-\hat{e}(\tau _{l-1})} - \frac{\hat{\gamma }_{l+1}}{\hat{e}(\tau _{l+1})-\hat{e}(\tau _{l})} \right) ^2\) to ensure stability and smoothness of the CDF, where \(\lambda\) is the standard deviation of transformed prices in the calibration window.

Next, the values of the CDF at the desired quantiles are approximated using linear interpolation, \(\hat{F}(y)=\sum _{j=1}^i \hat{\gamma }_j + \hat{\gamma }_{i+1}\frac{y - \hat{e}(\tau _i)}{ \hat{e}(\tau _{i+1})- \hat{e}(\tau _i)}\) for \(y\in [\hat{e}(\tau _i),\hat{e}(\tau _{i+1}))\), and finally inverted, yielding quantile forecasts. For a more detailed description of this procedure see Waltrup et al. (2015).
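The last step, inverting the piecewise linear CDF, can be sketched as follows. The sketch assumes that the steps \(\hat{\gamma }_j\) have already been estimated from (16), which is not shown here, and that the CDF equals 0 at \(\hat{e}(\tau _0)\) and 1 at \(\hat{e}(\tau _{L+1})\); the function name `quantile_from_steps` is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def quantile_from_steps(nodes, gammas, alpha):
    """Invert a piecewise linear CDF built from expectile nodes.

    nodes:  e(tau_0) < e(tau_1) < ... < e(tau_{L+1})  (L + 2 values)
    gammas: gamma_1, ..., gamma_{L+1}, non-negative and summing to one
    """
    cdf = [0.0] + list(accumulate(gammas))
    cdf[-1] = 1.0  # guard against rounding in the cumulative sum
    # Largest index i with cdf[i] <= alpha, capped at the last interval.
    i = min(bisect_right(cdf, alpha) - 1, len(gammas) - 1)
    span = cdf[i + 1] - cdf[i]
    if span == 0.0:
        return nodes[i]
    return nodes[i] + (alpha - cdf[i]) / span * (nodes[i + 1] - nodes[i])
```

For example, with nodes (0, 1, 2, 4) and steps (0.25, 0.5, 0.25) the median falls in the middle of the second interval and equals 1.5.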

4 German day-ahead electricity market case study

4.1 Datasets

We apply the ERA, QRA as well as the historical simulation methods from individual models (9)–(13) to the hourly electricity prices from the German EPEX day-ahead market spanning the period 1.01.2017\(-\)31.12.2020. The considered prices are plotted in Fig. 1. For the calculation of the point forecasts we use the set of exogenous variables \(Z^i_{h,t}\) consisting of: i) day-ahead forecasts of generation; ii) day-ahead forecasts of wind generation; iii) day-ahead forecasts of solar generation and iv) day-ahead forecasts of load. All these values are published by the Transmission System Operator (TSO) and are freely available from the ENTSO-E platform (https://transparency.entsoe.eu/). The values of the considered variables are plotted in Fig. 2.

Fig. 1 Hourly electricity prices from the German EPEX day-ahead market from the period 1.01.2017\(-\)31.12.2020

Fig. 2 Values of the exogenous variables: forecasted generation, forecasted wind generation, forecasted load and forecasted solar generation for the German market from the period 1.01.2017\(-\)31.12.2020 (source: ENTSO-E)

4.2 Forecasts construction

Electricity price predictions are calculated in a moving window scheme. For each day of the validation window we calculate the day-ahead probabilistic forecast based on the parameters estimated from the preceding calibration window. The derivation of the probabilistic forecasts for all considered methods first requires calculating the point forecasts. Hence, we divide the calibration window into two yearly parts. The first one is used for the estimation of the parameters of the individual models (9)–(13). Next, the resulting point forecasts are derived for the second part of the calibration window. Finally, these point forecasts are used to calculate probabilistic forecasts for the validation window. Here, the forecasts are evaluated in a two-year window spanning the years 2019–2020. Note that all predictions are derived for the asinh transformed prices (see Sect. 3.1 for details). The asinh transformation is also performed in a moving window scheme with parameters \(\mu _h\) and \(\sigma _h\) calculated based on the corresponding calibration window. Finally, all predictions are transformed back into electricity prices using the inverse asinh transformation (5) and the Monte Carlo scheme.

The comparison of forecasts is done in terms of quantiles. In order to transform the expectile-based predictions into the corresponding quantiles (see Sect. 3.4 for details) we use a grid of 59 expectiles calculated at the following levels \(\tau = 0.001,0.0025,0.005,0.0075,0.01,0.02,0.04,\ldots ,0.98,0.99,0.9925,0.995,0.9975,0.999\). This yields an approximation of the forecasted CDF, which in the end is inverted for each considered quantile, yielding the quantile forecasts. We further evaluate the forecast accuracy for a grid of all percentiles, \(\alpha \in \{0.01,0.02,\ldots ,0.99\}\), yielding an approximation of the CDF.
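The two grids can be generated as in the short sketch below (the function names are hypothetical). It makes explicit that the expectile grid is denser in the tails and consists of 4 + 51 + 4 = 59 levels.

```python
def expectile_levels():
    """Grid of 59 expectile levels, refined in both tails."""
    lower_tail = [0.001, 0.0025, 0.005, 0.0075]
    # 0.01, then 0.02, 0.04, ..., 0.98, then 0.99
    body = [0.01] + [round(0.02 * k, 2) for k in range(1, 50)] + [0.99]
    upper_tail = [0.9925, 0.995, 0.9975, 0.999]
    return lower_tail + body + upper_tail


def percentile_levels():
    """Grid of the 99 percentiles used for the evaluation."""
    return [round(0.01 * k, 2) for k in range(1, 100)]
```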

4.3 Forecasts evaluation

The accuracy of quantile forecasts is compared using the pinball loss (PL), which is a consistent scoring function for quantiles (Gneiting and Katzfuss 2014)

$$\begin{aligned} PL\left( \hat{q}_{p_{t,h}}(\alpha ),p_{t,h},\alpha \right) =\left\{ \begin{array}{lcr} (1-\alpha )\left( \hat{q}_{p_{t,h}}(\alpha ) - p_{t,h}\right) &{} \text{ if } &{} p_{t,h}<\hat{q}_{p_{t,h}}(\alpha ),\\ \alpha \left( p_{t,h} - \hat{q}_{p_{t,h}}(\alpha )\right) &{} \text{ if } &{} p_{t,h}\ge \hat{q}_{p_{t,h}}(\alpha ),\\ \end{array}\right. \end{aligned}$$

where \(\hat{q}_{p_{t,h}}(\alpha )\) is the \(\alpha\)-quantile of the forecasted price distribution and \(p_{t,h}\) is the actually observed value. We calculate the averaged pinball score for each hour and percentile in the validation window. The values are then averaged over all percentiles, yielding the mean pinball score for each hour, or over all hours, yielding the mean pinball score for each percentile. A standard measure for evaluating probabilistic forecasts is the continuous ranked probability score (CRPS), which is a consistent scoring function for the CDF. It can be written as an integral of the pinball score over all quantiles, \(\alpha \in (0, 1)\) (Gneiting and Ranjan 2011). Hence, the mean pinball score for each hour can be viewed as a discrete approximation of the CRPS. On the other hand, mean pinball scores for each percentile reflect differences in the forecast accuracy for different parts of the predicted distribution. The obtained values of the mean pinball scores are plotted in Fig. 3, while in Fig. 4 we also plot the corresponding skill scores. For the calculation of the skill scores, the QRA method is treated as a benchmark, i.e. \(\text {skill score(forecast)}=1 - \frac{{\text {mean pinball score}(\text {forecast}})}{\text {mean pinball score}(\text {QRA})}\). As can be observed, the ERA and QRA averaging schemes yield lower pinball scores than the historical simulation method for most percentiles, especially in the middle of the distribution. The differences in the tails of the distribution are much smaller, but looking at the corresponding skill scores, one can see that the historical simulation outperforms the QRA method and, in the lower tail, also the ERA approach. Comparing the averaging schemes, the QRA method yields higher forecast accuracy only in the first quartile of the distribution, which for the analyzed prices is characterized by more extreme observations than the rest of the distribution. Looking at the mean pinball scores for each hour, in most cases the best results were obtained with the ERA method. The best performing historical simulation approaches, with Models 2 and 5, yield skill scores similar to the ERA method in the middle of the day, outperforming at the same time the QRA scheme. For the other hours the averaging schemes lead to more accurate forecasts.
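The evaluation measures used above can be sketched in a few lines. The function names are hypothetical; the pinball loss follows the definition given above and the skill score uses an arbitrary benchmark score in place of the QRA one.

```python
def pinball_loss(q_hat, price, alpha):
    """Pinball loss of an alpha-quantile forecast q_hat for an observed price."""
    if price < q_hat:
        return (1.0 - alpha) * (q_hat - price)
    return alpha * (price - q_hat)


def mean_pinball(q_hats, prices, alpha):
    """Average pinball loss over a validation window."""
    losses = [pinball_loss(q, p, alpha) for q, p in zip(q_hats, prices)]
    return sum(losses) / len(losses)


def skill_score(score, benchmark_score):
    """Relative improvement over the benchmark; positive means better."""
    return 1.0 - score / benchmark_score
```

A positive skill score thus indicates that a method attains a lower mean pinball score than the benchmark, while a negative one indicates that it is outperformed.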

Fig. 3 Mean pinball scores for each hour (top panel) or percentile (bottom panel) obtained for each of the considered models applied to the asinh transformed prices. ‘Q-hist’ denotes the historical simulation in terms of quantiles, while ‘EX-hist’ the historical simulation in terms of expectiles. Numbers are related to the individual Models

Fig. 4 Mean skill scores of the pinball loss for each hour (top panel) or percentile (bottom panel) obtained for each of the considered models applied to the asinh transformed prices. Here, the results of the QRA method are treated as a benchmark. ‘Q-hist’ denotes the historical simulation in terms of quantiles, while ‘EX-hist’ the historical simulation in terms of expectiles. Numbers are related to the individual Models

The significance of the pinball score differences is further verified using the one-sided Diebold and Mariano (1995) test. The test is applied for pairwise comparisons of the forecasts. In Fig. 5 we show the number of hours as well as percentiles for which each of the considered models was significantly outperformed by each of its competitors. The obtained results confirm the conclusions drawn from Fig. 3. The forecasts from the ERA and QRA averaging schemes significantly outperform each of the historical simulation methods. There are no hours for which the accuracy of ERA or QRA was significantly lower, while they outperformed the historical simulation for 4 up to 23 hours, depending on the model specification. Similarly for percentiles, we can see a significant improvement in the forecast accuracy if averaging methods are used. This is especially apparent for the ERA method, which outperforms the other approaches for 33 up to 86 percentiles. It also yields significantly better results in comparison with the QRA method, outperforming the latter for 8 hours and 67 percentiles. Among the historical simulation approaches, Model 5 performs best. However, for 4–7 hours it still yields significantly worse predictions than the averaging schemes, and for 33–55 percentiles worse than the ERA method, which is significantly outperformed by Model 5 only for 4–5 percentiles. Looking at the differences between the quantile- and expectile-based historical simulation within a given model specification, we do not observe a clear pattern and the overall accuracy is at a similar level.
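A simplified sketch of a one-sided Diebold and Mariano type test on two series of pinball losses is given below. It assumes that the loss differentials are serially uncorrelated, so that their sample variance can be used without any autocovariance correction, and it relies on the asymptotic normal distribution of the statistic; the exact variant applied in the study may differ. The function name is hypothetical.

```python
import math
from statistics import NormalDist, fmean, stdev


def diebold_mariano_one_sided(losses_a, losses_b):
    """Test whether forecast A has a higher expected loss than forecast B.

    Returns the test statistic and the p-value of the one-sided test;
    a small p-value indicates that A is significantly worse than B.
    Assumption: the loss differentials are not all identical (nonzero variance).
    """
    d = [a - b for a, b in zip(losses_a, losses_b)]
    n = len(d)
    statistic = fmean(d) / (stdev(d) / math.sqrt(n))
    p_value = 1.0 - NormalDist().cdf(statistic)
    return statistic, p_value
```

Repeating such a test for every pair of methods, and for each hour or percentile separately, gives counts of the kind reported in Fig. 5.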

Fig. 5

Number of hours (left panel) or percentiles (right panel) for which the prediction from the model in the row is significantly worse than the prediction from the model in the column according to the Diebold and Mariano (1995) test. ‘Q-hist’ denotes the historical simulation in terms of quantiles, while ‘EX-hist’ the historical simulation in terms of expectiles. Numbers are related to the individual Models. The test was performed at the \(5\%\) significance level

In order to further evaluate the predictions, we calculate the coverage probability \(\text {P}(p_{t,h}< \hat{q}_{p_{t,h}}(\alpha ))\) at the \(5\%\) and \(95\%\) \(\alpha\)-levels. Note that these are in fact accuracies of the Value at Risk forecasts at the \(95\%\) level, i.e. VaR\(_{95\%}\), for a seller and buyer, respectively. Results obtained for each of the hours are plotted in Fig. 6. Here, we can see a clear difference between the results obtained with the quantile- and expectile-based methods. The coverage probabilities obtained for the latter are closer to the expected \(5\%\) and \(95\%\) levels. This is visible for both approaches: the ERA method and the expectile-based historical simulation. The coverage probabilities obtained with the QRA as well as the quantile-based historical simulation methods are too high for the \(5\%\) quantile and at the same time too low for the \(95\%\) one, yielding too narrow prediction intervals. The coverage probabilities obtained with the expectile-based methods are close to the expected \(5\%\) and \(95\%\) levels, the only exception being the higher quantile in the night hours, where the coverage is lower by approximately \(1\%\). The significance of the differences from the expected \(5\%\) and \(95\%\) levels is verified using the Kupiec (1995) test. The number of hours for which the obtained coverage probabilities were significantly different from \(5\%\) and \(95\%\) is given in Table 1. In the case of the expectile-based methods, the obtained values are significantly different from the expected ones only for a few hours, mainly during the night and for the \(95\%\) level. For the quantile-based methods the accuracy is much worse, as the number of hours with significant differences ranges from 10 up to 22. In Table 1 we also give the aggregated results for the Conditional Coverage test, which combines the unconditional coverage of the Kupiec (1995) test with the independence test of Christoffersen (1998). The independence of exceedances is violated in most cases. However, here also better results are obtained with the expectile-based methods.
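The unconditional coverage check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `kupiec_test` and its arguments are hypothetical, the test is written in its standard likelihood-ratio form with a \(\chi^2(1)\) limiting distribution, and the independence component of the Conditional Coverage test is not included.

```python
import math


def _xlogy(x, y):
    """Return x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def kupiec_test(prices, quantile_forecasts, alpha):
    """Kupiec unconditional coverage test (illustrative sketch).

    Returns the empirical coverage P(price < forecast), the likelihood
    ratio statistic and its p-value under the chi-squared(1) distribution.
    """
    n = len(prices)
    # Number of days on which the price fell below the quantile forecast.
    hits = sum(p < q for p, q in zip(prices, quantile_forecasts))
    coverage = hits / n
    # Binomial log-likelihoods under the nominal and the empirical coverage.
    log_lik_null = _xlogy(hits, alpha) + _xlogy(n - hits, 1.0 - alpha)
    log_lik_alt = _xlogy(hits, coverage) + _xlogy(n - hits, 1.0 - coverage)
    lr = -2.0 * (log_lik_null - log_lik_alt)
    # For one degree of freedom: P(chi2 > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return coverage, lr, p_value


# Made-up data: 5 of 100 prices fall below a constant forecast of 1.0.
prices_ok = [0.0] * 5 + [2.0] * 95
cov_ok, lr_ok, p_ok = kupiec_test(prices_ok, [1.0] * 100, alpha=0.05)

# Made-up data: 20 of 100 prices fall below the forecast (coverage too high).
prices_bad = [0.0] * 20 + [2.0] * 80
cov_bad, lr_bad, p_bad = kupiec_test(prices_bad, [1.0] * 100, alpha=0.05)
```

Applying such a test separately for each hour and counting the rejections at the \(5\%\) significance level would yield counts of the kind reported in Table 1.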

Fig. 6

Coverage probabilities for each hour at the \(5\%\) (top panel) and \(95\%\) (bottom panel) level obtained for each of the considered models applied to asinh transformed prices. ‘Q-hist’ denotes the historical simulation in terms of quantiles, while ‘EX-hist’ the historical simulation in terms of expectiles. Numbers are related to the individual Models. The \(5\%\) and \(95\%\) levels are marked with horizontal blue lines

Table 1 Number of hours for which the coverage probability obtained for the considered methods applied to asinh transformed prices was significantly different from the expected \(5\%\) or \(95\%\) value according to the Kupiec (1995) as well as the Conditional Coverage test performed at the \(5\%\) significance level.

The obtained results are summarized in Table 2. As a reference, we also provide the results obtained when no transformation was applied to electricity prices prior to modelling. The coverage probabilities and the pinball scores are averaged over all hours and, in the latter case, also over all percentiles. The coverage probabilities are additionally evaluated with the Kupiec (1995) test at the \(5\%\) significance level.

Table 2 The values of the average pinball score (PS) as well as the coverage probabilities P\(_{5\%}\) and P\(_{95\%}\) at the \(5\%\) and \(95\%\) levels obtained for the considered methods

The best averaged pinball score was obtained for the ERA method applied to asinh transformed prices. In this case the averaged coverage probabilities are also closer to \(5\%\) or \(95\%\) for the expectile-based methods. We observe a substantial improvement in forecast accuracy if the asinh transformation is applied to electricity prices, especially in the lower tails of the predicted distribution and in the overall pinball score. Interestingly, if no transformation is applied, then the quantile-based methods yield higher accuracy than their expectile analogues. The quantile-based methods rely on absolute deviations rather than least squares, and are therefore more robust to outliers. Nevertheless, the accuracy of the forecasts obtained without transformation is lower than that of their transformed versions for all of the considered methods.

5 Conclusions

In this paper we proposed a new method for probabilistic forecasting of electricity prices. It is based on combining forecast averaging with expectile regression. More precisely, it yields forecasts of expectiles, given as linear combinations of a pool of point forecasts. The predicted distribution is then given in terms of expectiles, so it can be directly used e.g. for risk management purposes and EVaR calculation. On the other hand, from a grid of expectiles one can also calculate quantiles of the same distribution. Since the QRA method can be treated as a benchmark in probabilistic forecasting of electricity prices, in this paper we followed the second approach and transformed the calculated expectiles into quantiles for comparability.
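The correspondence between expectiles and quantiles can be illustrated on an empirical sample. The sketch below is not the transformation procedure used in this paper; it only shows, under simplifying assumptions, that each value of a sample is the \(\tau\)-expectile for exactly one \(\tau\). The function names `sample_expectile` and `tau_for_value` are hypothetical. The first solves the asymmetric least squares problem by iteratively reweighted averaging, the second uses the first-order condition of that problem.

```python
def sample_expectile(y, tau, tol=1e-12, max_iter=1000):
    """tau-expectile of a sample via iteratively reweighted averaging."""
    e = sum(y) / len(y)  # the 0.5-expectile (mean) as a starting point
    for _ in range(max_iter):
        # Observations above the current estimate get weight tau,
        # the remaining ones get weight 1 - tau.
        w = [tau if v > e else 1.0 - tau for v in y]
        e_new = sum(wi * v for wi, v in zip(w, y)) / sum(w)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e


def tau_for_value(y, q):
    """Level tau at which the value q is the tau-expectile of the sample.

    Follows from the first-order condition of asymmetric least squares:
    tau * sum_{y > q}(y - q) = (1 - tau) * sum_{y <= q}(q - y).
    Assumes that not all observations are equal to q.
    """
    below = sum(q - v for v in y if v <= q)
    total = sum(abs(v - q) for v in y)
    return below / total


# Made-up sample, for illustration only; its median is 3.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
tau_median = tau_for_value(sample, 3.0)
expectile_at_tau = sample_expectile(sample, tau_median)
```

For this sample the median \(3\) corresponds to \(\tau = 3/11\), which reflects the right skew of the data: the \(0.5\)-expectile is the mean, \(4\), and lies above the median.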

The proposed ERA approach was applied to the German electricity market data. Its accuracy for hourly, day-ahead electricity prices was compared with the QRA as well as the historical simulation methods. In terms of the pinball score, both considered forecast averaging methods, ERA and QRA, significantly outperformed historical simulation. Results of the expectile- and quantile-based historical simulation methods were in this case similar. We also calculated coverage probabilities at the \(5\%\) and \(95\%\) levels. For this accuracy measure all expectile-based approaches significantly outperformed the quantile-based ones, especially the QRA method. A possible explanation of these results might lie in the shape of the loss function used in the estimation. Both the asymmetric least squares and the asymmetric least absolute deviations take into account the sharpness of the predictions, i.e. they put a higher penalty on observations below the \(\alpha\)-quantile or \(\tau\)-expectile if \(\alpha ,\tau <0.5\), and analogously on observations above the \(\alpha\)-quantile or \(\tau\)-expectile if \(\alpha ,\tau >0.5\). However, due to the convexity of the quadratic function, this penalty is relatively lower for the asymmetric least squares provided that the distance from the expectile is smaller than 1. Note that, here, this effect is additionally strengthened by the variance stabilizing transformation, which makes the scale of deviations smaller in general.
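The comparison of the two penalties can be made explicit. The notation below is introduced here for illustration and may differ from the definitions used earlier in the paper: \(u\) denotes the deviation of an observation from the predicted quantile or expectile, and the two loss functions are assumed to take their standard forms with equal levels \(\tau = \alpha\).

```latex
\[
\rho^{Q}_{\alpha}(u) = \left| \alpha - \mathbb{1}_{\{u<0\}} \right| \, |u|,
\qquad
\rho^{E}_{\tau}(u) = \left| \tau - \mathbb{1}_{\{u<0\}} \right| \, u^{2},
\]
\[
\frac{\rho^{E}_{\tau}(u)}{\rho^{Q}_{\alpha}(u)} = |u| < 1
\quad \text{for } \tau = \alpha \text{ and } 0 < |u| < 1 .
\]
```

Under these assumptions, a deviation of \(u = 0.5\) receives half the penalty under asymmetric least squares that it receives under asymmetric least absolute deviations, whereas a deviation of \(u = 2\) receives twice the penalty.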

Overall, the best results were obtained for the ERA method applied to prices after the variance stabilizing transformation. Such a transformation improved forecast accuracy for all considered methods. The ERA method led to higher forecast accuracy than the QRA method in terms of both considered measures, the pinball score as well as the coverage probability. It yielded a significantly lower pinball score for 8 hours and 67 percentiles, and more accurate coverage probabilities for all hours.

We believe that utilizing the notion of expectiles in probabilistic forecasting of electricity prices might improve forecast accuracy. Since expectile regression leads to a least squares optimization, it naturally inherits the good numerical properties of the latter. However, for data as volatile as electricity prices it should be applied with caution, as the asymmetric least squares method is not robust to outliers. Hence, a variance stabilizing transformation or an outlier treatment method might be necessary to apply it efficiently.