Forecasts are rarely produced for their own sake; they are vital inputs to a number of applications. This chapter summarises a few such applications, as well as some adjacent areas which use similar techniques and methods to those presented in this book. The main focus is battery storage control, which is presented in detail in the following section.

15.1 Battery Storage Control

In a low carbon economy, storage devices are going to be essential for maximising the utilisation of renewable generation. Most renewable sources, such as solar and wind, are dependent on weather conditions and may not generate at the times when energy is most needed. Storage devices can be used to ‘shift’ renewable energy by charging when generation is high and then discharging when demand is high but renewable generation is low.

Fig. 15.1 Illustration of how forecasts (black line) are used in storage control. The final demand on the network (shaded) is created by charging the battery when the forecast estimates low demand and discharging when the forecast estimates high demand

One popular form of storage is the battery energy storage system (BESS). Although BESS have traditionally been an expensive solution, rapid cost reductions in recent years are beginning to make them competitive with traditional reinforcement, such as replacing existing assets, or with demand side response (Sect. 15.2). BESS can be deployed for a wide variety of network solutions including demand smoothing, voltage control, and phase balancing. In this chapter, the primary focus is the application of peak demand reduction for the purposes of increasing network headroom and keeping within the thermal capacity of the network (Sect. 2.2).

Forecasts are used to estimate the distribution of demand throughout the day, in particular the magnitude and timing of the peaks and troughs (the maximums and minimums). These estimates can be incorporated into a control algorithm so that the storage device knows the most appropriate times to discharge (at the peaks) and charge (during the troughs). An illustration of the charging and discharging of a storage device is shown in Fig. 15.1. The ‘Final demand’ in the plot is the resulting demand on the grid and is a superposition of the original demand and the charging/discharging of the device at periods of relatively low/high demand.

This process is not as easy as it appears. The demand being considered here is at the low voltage (LV) level, which is much more volatile than higher voltage (HV) and system level (e.g. country-wide) demand. Low voltage feeders typically consist of around 40 households and hence have higher degrees of uncertainty than HV systems (see the Case Study in Chap. 14 for a detailed analysis of 100 LV feeders). In particular, an inaccurate forecast may cause a storage device to charge during a high demand period, causing an increase in the peak.

In this section, the peak demand application is based on the results from a previous publication by one of the authors, which can be found in [1]. Note that some elements have been simplified to maintain the focus on the impact of forecasts rather than diving into the details of control theory! In the following sections several forecasting methods are developed and then incorporated into a control algorithm which simulates a BESS.

Although only a basic control method is considered in this book, more advanced control methods such as Optimal Control, Model Predictive Control (MPC), and Stochastic Model Predictive Control (SMPC) could also be considered, and can solve the peak reduction problem closer to optimality.

15.1.1 Data

This section focuses on a selection of nine of the same low voltage feeders used in Chap. 14 to demonstrate how forecasts can support the storage control of a BESS. The feeders have been selected essentially at random, but so as to include a range of different average demands and volatilities (defined by standard deviation). A few of the attributes of the chosen feeders are shown in Table 15.1. The chosen feeders are labelled according to their approximate size: the smaller feeders are labelled S1, S2, S3, the medium sized feeders M1, M2, M3, and the larger feeders L1, L2, L3.

Table 15.1 Summary features for the feeders considered for the analysis in this chapter. This includes average half hourly demand, standard deviation of demand, and the maximum recorded half-hourly demand (measured over a year). All values are in kWh. Reprinted from [1] with permission from Springer

The monitored data for the selected feeders consists of half hourly energy demand (kWh) for the period from 10th March 2014 to 15th November 2015. The two-week period, from 1st to 14th of November 2015, is used as a test period for assessing the storage control algorithms with the remaining data used for parameter selection (via cross-validation) and training of the forecast models.

15.1.2 Forecast Methods

This section describes several methods used for the load forecasts made up of a mix of machine learning (Chap. 10) and statistical methods (Chap. 9). Throughout this application \(L_1, L_2, \ldots , \) will denote the monitored demand time series with \(L_t\) the demand at time step t, with the predicted value at the same time step denoted \(\hat{L}_t\). Let \(t=N\) be the last observed time step, and hence the forecast value at the forecast horizon h will be given by \(\hat{L}_{N+h}\).

Each method generates half hourly forecasts for the next two days, starting at midnight prior to the first day. Since the data is the same as in the Case Study in Chap. 14, the same data analysis is relevant here and some of the same models are included. This also helps identify some of the inputs to include in the models, such as daily and weekly seasonalities, and autoregressive features.

Method 1: Linear Seasonal Trend Model (ST) The first forecast method is based on the simple seasonal model (ST) presented in Sect. 14.2.3, consisting of annual seasonal terms and a linear trend for each half hour of the day. This generates a mean model forecast, \(\mu _t\). This method is labelled ST.

Method 2: ST with Autoregressive Correction (STAR) The ST method does not utilise the most recent available information. As mentioned in Sect. 7.5, a standard method for improving these forecasts is to include autoregressive terms. Once the mean equation is found for the ST model, a residual time series can be created and modelled by

$$\begin{aligned} r_t=\sum _{m=1}^{M_{\max }}\phi _{m}r_{t-m}+\epsilon _t \end{aligned}$$
(15.1)

where \(r_t=L_t- \mu _t\), \(\epsilon _t\) is the error term, and \(\mu _t\) is the estimate from the ST method. The optimal order \(M_{\max }\) is found by minimising the Akaike information criterion (AIC), searched over a maximum order of \(15H\) (i.e. an order of up to 15 days is considered). The autoregressive terms define an additional model, labelled STAR.
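As a concrete illustration, the residual correction could be fitted with standard time series tooling. The sketch below is only a minimal outline, not the original implementation from [1]: it uses statsmodels to select the AR order by AIC, and assumes arrays load, mu (the ST fit over the training period) and mu_future (the ST mean forecast over the horizon) already exist.

```python
# A minimal sketch of the STAR residual correction (not the original
# implementation). Assumes `load` and `mu` are numpy arrays of observed
# demand and the fitted ST mean over the training period, and `mu_future`
# is the ST mean forecast over the two-day horizon.
import numpy as np
from statsmodels.tsa.ar_model import AutoReg, ar_select_order

H = 48                          # half-hour periods per day
residuals = load - mu           # r_t = L_t - mu_t

# Search AR orders up to 15 days of lags, minimising the AIC.
selection = ar_select_order(residuals, maxlag=15 * H, ic="aic")
ar_fit = AutoReg(residuals, lags=selection.ar_lags).fit()

# Forecast the residuals over the next two days and add back the ST mean.
r_hat = ar_fit.predict(start=len(residuals), end=len(residuals) + 2 * H - 1)
star_forecast = mu_future + r_hat
```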

Method 3: Random Forest Regression As presented in Sect. 10.3.2, Random Forest regression (RFR) is a popular machine learning method, based on combining an ensemble of decision trees. To forecast the load at time \(N+h\) the following features are used as input:

  • Load from the past day. The past \(H=48\) available demand values (a day) \(L_N,L_{N-1}, \ldots , L_{N-47}\). This means that for further ahead forecasts the historical input data is less recent than for those for shorter horizons.

  • Load from past weeks. The load at the same time of the week for the past four weeks, i.e. the inputs \(L_{N+h-n_w}, L_{N+h-2n_w}, L_{N+h-3n_w}, L_{N+h-4n_w}\) where \(n_w =336\), the number of timesteps in a week for half-hourly resolution data.

  • Time of day. This is defined as an integer between 1 and H, i.e. the half-hour period of the day.

In total there are therefore \(48+4+1 = 53\) input features per half hour in the horizon. Note that this means there is a different forecast model for each period in the horizon, and a different model for each feeder, to be trained. That equates to \(96\times 9=864\) models to train for a two day ahead forecast for nine feeders. Thus it is desirable to keep the computational cost of each model low if possible.

The number of trees in the ensemble is an important parameter and must be tuned via cross-validation. To select the optimal number of trees in the random forest, a validation period of one week prior to the test period is used. Ensembles with the number of trees varying from 5 to 100, in increments of 5, are considered. A value of 30 trees was found to be a sufficient trade-off between forecasting accuracy and computational cost, since smaller ensembles are less expensive to train.

To train the final forecasts, the Random Forest is trained using the one year prior to the test period, 1st November 2014 to 31st October 2015.
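To make the setup concrete, the sketch below shows how one of these horizon-specific models might be assembled. It is an illustrative outline only: `demand` is assumed to be a numpy array of half-hourly kWh values for one feeder, and `train_origins` is a hypothetical list of forecast-origin indices from the training year (chosen so all lagged indices are valid).

```python
# A sketch of one of the 96 horizon-specific Random Forest models.
# Feature layout follows the text: 48 recent lags, 4 weekly lags,
# and the time of day, giving 53 features in total.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

H, n_w = 48, 336  # half-hours per day and per week

def make_features(demand, N, h):
    """Build the 53 input features for forecasting time step N + h."""
    past_day = demand[N - 47:N + 1][::-1]                 # L_N, ..., L_{N-47}
    past_weeks = [demand[N + h - k * n_w] for k in (1, 2, 3, 4)]
    time_of_day = [(N + h) % H + 1]                       # integer in 1..H
    return np.concatenate([past_day, past_weeks, time_of_day])

# Assemble a training set for one fixed horizon h, one row per origin.
h = 1
X = np.array([make_features(demand, N, h) for N in train_origins])
y = np.array([demand[N + h] for N in train_origins])

rf = RandomForestRegressor(n_estimators=30, random_state=0)
rf.fit(X, y)
```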

Method 4: Support Vector Regression Recall from Sect. 10.2 that a Support Vector Regression (SVR) can solve a nonlinear regression of the form

$$\begin{aligned} \hat{L}_{N+h} = \boldsymbol{\beta }^T \Phi (\textbf{X}_{N+h}) + b, \end{aligned}$$
(15.2)

for inputs \(\textbf{X}_{t} = (X_{1,t}, X_{2, t}, \ldots , X_{n, t})^T\). This is solved by using kernel functions \(K(\textbf{X}_i, \textbf{X}_j) = \langle \Phi (\textbf{X}_{i}), \Phi (\textbf{X}_{j})\rangle \), where \(\langle \cdot , \cdot \rangle \) denotes an inner product, allowing the model to be written in the dual form

$$\begin{aligned} \hat{L}_{N+h}=f(\textbf{X}) = \sum _{t=1}^N \alpha _t K(\textbf{X}_{t}, \textbf{X}) + b. \end{aligned}$$
(15.3)

The optimal fit for the weights \(\alpha \) and the intercept b can be found by minimising a regularised cost function.

As shown in Sect. 10.2, SVR has two hyperparameters, C and \(\epsilon \), that require tuning. The regularisation constant C controls the flatness (or complexity) of the model, a trade-off between empirical error and model flatness, and \(\epsilon \) determines the amount of error allowed by the model (values far from the model expectation line). In addition, the choice of kernel is also important for the final model. The hyperparameters can be found via many different methods, but in this example grid search is considered (see Sect. 8.2.3). To simplify the task the error allowance term is set to \(\epsilon =0.1\), and three kernels are considered for the regression: a linear, a radial basis function (RBF), and a polynomial kernel (Sect. 10.2). As an example, the Gaussian radial basis function (RBF) kernel is given by

$$\begin{aligned} K(\textbf{X}_i, \textbf{X}_j) = \exp \left( -\gamma ||\textbf{X}_i -\textbf{X}_j||^2 \right) . \end{aligned}$$
(15.4)

The regularisation constant C is restricted to vary from 0.1 to 100. The RBF kernel has an extra free parameter, \(\gamma \), which controls the width of the kernel, and varies from 0.01 to 100. Finally, the degree of the polynomial kernel requires tuning too, changing from 2 to 5.

As with the RFR model, hyperparameter selection is performed using the week prior to the test period as a validation period. The linear kernel is chosen since it outperforms both the RBF and the polynomial kernels for all values of the C parameter. With the linear kernel, large values of \(C > 20\) seem to reduce the model accuracy (in terms of MAPE), and so the regularisation constant is fixed at \(C = 1\) for all feeders.

Since the Support Vector Regression forecast is more computationally intensive than the Random Forest Regression, a shorter training period of eight weeks prior to the test period, i.e. 5th September 2015 to 31st October 2015, is used.
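A possible implementation of this tuning procedure is sketched below, assuming scikit-learn and the feature arrays X, y from the RFR sketch; the `validation_fold` array marking the final validation week is a hypothetical input. This is one way to realise the grid search described above, not the exact setup used in [1].

```python
# A sketch of the SVR grid search (epsilon fixed at 0.1; linear, RBF and
# polynomial kernels). `validation_fold` is a hypothetical array with -1
# for training rows and 0 for rows in the final validation week.
from sklearn.model_selection import GridSearchCV, PredefinedSplit
from sklearn.svm import SVR

param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10, 20, 100]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10, 100],
     "gamma": [0.01, 0.1, 1, 10, 100]},
    {"kernel": ["poly"], "C": [0.1, 1, 10, 100], "degree": [2, 3, 4, 5]},
]

search = GridSearchCV(
    SVR(epsilon=0.1),
    param_grid,
    cv=PredefinedSplit(validation_fold),   # single train/validation split
    scoring="neg_mean_absolute_percentage_error",
)
search.fit(X, y)
print(search.best_params_)  # e.g. a linear kernel with a small C
```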

Benchmark Methods Informative benchmarks are also included to compare with the more sophisticated models (Sect. 9.1). They can also help to understand the factors which best drive the forecast accuracy and the storage control performance. Further, because they are computationally inexpensive, if they perform well, they will scale up well and there is no need to implement more intensive methods.

The first benchmark is a simple seasonal average model, given as

$$\begin{aligned} \hat{L}_{N+h}= \frac{1}{p} \sum _{k=1}^{p} L_{N+h -k n_w} \end{aligned}$$
(15.5)

where \(n_w=336\) is the number of time steps in a weekly period. Testing shows that using \(p = 5\) weeks of data is optimal, and the model is denoted 7SAV. This model is motivated by the fact that demand over recent weeks is usually representative of current behaviour.

The other benchmark considered is the seasonal persistence model, which is technically a special case of the average model (with \(p=1\)) and simply uses the last week as the estimate for the current week:

$$\begin{aligned} \hat{L}_{N+h}= L_{N+h - n_w} \end{aligned}$$
(15.6)

This special case is denoted SALW.
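Both benchmarks are only a few lines of code, as the following sketch illustrates (assuming `demand` is a numpy array of half-hourly values whose last entry is the forecast origin N):

```python
# Minimal sketches of the two benchmarks. `demand` is assumed to be a
# numpy array of half-hourly kWh values whose last entry is time step N.
import numpy as np

n_w = 336  # half-hourly time steps in a week

def seasonal_average(demand, h, p=5):
    """7SAV: mean of the same time of week over the past p weeks, Eq. (15.5)."""
    N = len(demand) - 1
    return np.mean([demand[N + h - k * n_w] for k in range(1, p + 1)])

def seasonal_persistence(demand, h):
    """SALW: the value at the same time of week, one week ago, Eq. (15.6)."""
    N = len(demand) - 1
    return demand[N + h - n_w]
```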

15.1.3 Analysis of Forecasts

In this section the accuracy of the forecasts is analysed. A standard error measure, the mean absolute percentage error (MAPE), is used, given by

$$\begin{aligned} MAPE\left( a,f \right) =\frac{1}{n}\sum _{k=1}^{n}\frac{\left| a_{k}-f_{k} \right| }{a_{k}} \end{aligned}$$
(15.7)

where \(a=(a_1,\ldots ,a_n )^T \in \mathbb {R}^n\) is the actual/observation and \(f=(f_1,\ldots ,f_n )^T\in \mathbb {R}^n\) is the estimate/forecast. MAPE isn't ideal or advisable for low voltage feeders; however, this work replicates a previous piece of work, and in fact the MAPE is strongly correlated with other relative errors such as rMAE (see Sect. 14.2). The MAPE does give a simple way of comparing and combining the errors of feeders of different sizes, since the differences are normalised by the size of the demand. The results presented here are for the entire two-week test period.
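For reference, Eq. (15.7) is a one-liner in practice; the sketch below reports it as a percentage over an evaluation window.

```python
# MAPE as in Eq. (15.7). `a` and `f` are numpy arrays of actuals and
# forecasts over the evaluation window; the result is a fraction, so it
# is multiplied by 100 when quoted as a percentage.
import numpy as np

def mape(a, f):
    return np.mean(np.abs(a - f) / a)

# e.g. print(f"MAPE: {100 * mape(actuals, forecasts):.2f}%")
```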

Table 15.2 MAPE for day ahead forecasts for each of the methods described in the main text. The best score for each feeder is highlighted in bold. Also shown is the average score for each method. Reprinted [1] with permission from Springer

Table 15.2 shows the MAPE scores for day ahead forecasts for each method and each feeder considered in this trial. The best method tends to be STAR, with an average MAPE of \(16.40\%\), although the ST, SVR and 7SAV methods each produce the best forecast for one feeder. This is consistent with the results in Sect. 14.2, where the best methods utilised autoregressive components.

Fig. 15.2 Plot of forecast errors for each feeder against (a) average daily demand for the STAR and ST methods, and (b) scaled standard deviation for the STAR method. Also highlighted are feeder M2 (red square) and feeder S2 (blue square)

The inclusion of the autoregressive component in STAR produces, on average, a \(10\%\) improvement over the seasonal trend model, ST, although this improvement varies depending on the feeder. This is highlighted in Fig. 15.2a, which shows the average error versus the average daily demand of each feeder, for both the ST and STAR methods. The plot shows that there is a negative correlation between the size of a feeder and the error of the forecast. This makes sense since larger feeders are generally smoother and more regular and can therefore be more accurately estimated. However, there are two feeders, S2 and M2, which are estimated quite inaccurately compared to the others. One explanation for this is suggested in Fig. 15.2b, which shows the errors for the STAR method against the standard deviation (STD) scaled by the average daily demand. This gives a measure of the relative volatility of a feeder and shows that these feeders also have high variability, especially M2.

Fig. 15.3 Day ahead forecasts (orange) using the STAR method for the small feeders for the first four days of test data. Also shown are the actuals (black)

To better understand the errors (and how easy it will be for a storage device to reduce the peak), a few plots are shown of the STAR day ahead forecasts for the first four days of the test set. Figure 15.3 shows the forecasts and the actuals for the small feeders. These feeders seem to have been forecast accurately in all cases, although S2 has much more volatility around the forecast value, as already suggested by its high scaled STD value. S1 and S2 seem to have larger demands in the morning and evening periods compared to S3, which generally only has a peak demand in the evening. This means that reducing the daily peak may be more difficult for S1 and S2, as the battery will have to charge and discharge appropriately in both periods (while also avoiding charging too much too early prior to the main peak and risking creating a larger morning peak). Hence it is expected that, of the smaller feeders, the greatest peak reduction will be for S3.

Fig. 15.4 Day ahead forecasts (orange) using the STAR method for the medium feeders for the first four days of test data. Also shown are the actuals (black)

Figure 15.4 shows the same data but for the medium sized feeders. The forecasts again seem to do a good job of estimating the general shape of the daily profiles. As with S2, the high standard deviation in M2 is clear. This suggests that it may be difficult to maximally reduce the peaks for this feeder. It also appears that it will be difficult to reduce the peaks on M1 by a large percentage. The demand is relatively large throughout the day suggesting that the feeder is connected to businesses which operate during this period. Since the storage device will have to reduce demand over several time steps there will not be enough capacity in the battery to produce large reductions. Storage applied to M3 will probably produce the largest peak reduction because the main peak is in the evening and it appears to be accurately forecast.

Fig. 15.5 Day ahead forecasts (orange) using the STAR method for the large feeders for the first four days of test data. Also shown are the actuals (black)

Finally, Fig. 15.5 shows the day ahead STAR forecasts and actual demand for the largest feeders. The forecasts once again seem to produce good estimates of the daily demand of the feeders. Feeder L2 looks like it is connected to one large business or several businesses. In addition to the large daytime demand (and no demand during mornings or evenings), the first day (a Sunday) has zero demand, suggesting this is a single commercial consumer with no operation on the weekend. This large demand means that the storage device will likely not be able to significantly reduce the peak. L1 looks very accurately estimated and in fact has the smallest MAPE on average (Table 15.2). This feeder has a single major peak which seems to regularly occur in the evening, and therefore there could be significant peak reduction on this feeder. Finally, L3 does not have one prominent peak but instead seems to have two peaks on most days, one in the morning and one in the evening. A relatively large peak reduction may not be possible for this feeder if the battery cannot recharge quickly enough to reduce the second peak after reducing the first.

15.1.4 Application of Forecasts in Energy Storage Control

The effectiveness of a storage device at reducing daily peak demands depends on the specifications of the battery, namely its capacity (how much energy it can store), its ratings (how fast it can charge and discharge), and its age (batteries become less effective the more they are used, but also depending on how they are used, e.g. performing many cycles of charging to full and emptying completely). For other applications, other criteria such as location on a feeder, whether there is real-time control, etc., can also be important. The BESS will be sized so that theoretically a peak demand reduction of \(20\%\) can be achieved. The aim is to see what effect future demand uncertainty has on the performance, and what part forecast accuracy plays. The ratings and capacity ranges used in this experiment will change for each day in the test set, but obviously in a real world example they would be fixed and could be sized by analysing historical behaviour. The required rating depends on how high the peak is over any half hour; the higher it is, the faster a battery needs to discharge to be able to reduce it. The required capacity is also related to the size of the peak, but if there are large demands during periods adjacent to the peak then these will also have to be reduced in order to decrease the overall daily peak (this is the case with feeders M1 and L2, as seen in Figs. 15.4 and 15.5 respectively). This requires a larger capacity to reach a particular percentage peak reduction. In general, the bigger the demand on a feeder, the bigger the capacity required.

Forecasts are used as estimates of the future demand, and the battery control algorithm assumes they are accurate in order to develop a schedule which maximises the peak reduction. The schedule is developed at midnight prior to the day of interest. A further aim is to have the BESS at \(50\%\) state-of-charge at midnight so it is prepared for any early peaks the following day.

Assume that \(p_t\) is the power output (in kW) from the battery at time step t; then the following constraints can be defined. First, the ratings are bounded by a maximum charging rate, \(P_{\max }>0\), and a minimum, \(P_{\min } <0\), i.e.

$$\begin{aligned} P_{\min } \le p_t \le P_{\max }. \end{aligned}$$
(15.8)

Next it is assumed that the capacity \(c_t\) of the battery (in kWh) is also constrained between a maximum \(C_{\max } >0\) and a minimum \(C_{\min } \ge 0\), i.e.

$$\begin{aligned} C_{\min } \le c_t \le C_{\max }. \end{aligned}$$
(15.9)

The absolute lower bound on the capacity is obviously zero (an empty battery), but fully emptying can cause deterioration of the battery and so it may be desirable to set the minimum to some strictly positive value. There is also a constraint in terms of how the capacity and the charging/discharging relate from one time step to the next, with the capacity changing depending on how much energy was added to or removed from the battery, i.e.

$$\begin{aligned} c_{t+1}=c_t + 0.5(\eta _t p_t -\lambda ) \end{aligned}$$
(15.10)

with

$$\begin{aligned} \eta _t = \left\{ \begin{array}{ c l } \mu , &{} \quad \text {if } p_t \ge 0, \\ \frac{1}{\mu }, &{} \quad \text {if } p_t < 0, \end{array} \right. \end{aligned}$$
(15.11)

where \(\mu \) is the one-way efficiency of the battery (assumed to be 96% in each direction) and \(\lambda \) represents the continuous losses within the BESS (assumed to be 100 W). In other words, not all of the energy is transferred, due to natural battery limitations. Note that the 0.5 in Eq. (15.10) converts the average power (in kW) into average energy (in kWh) because the data being considered is half hourly.
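These dynamics are straightforward to simulate. The sketch below is a minimal outline (with hypothetical ratings and capacity bounds) that applies Eqs. (15.8)-(15.11) to a candidate schedule and reports the resulting state of charge, or rejects the schedule as infeasible.

```python
# A minimal simulation of the battery dynamics, Eqs. (15.8)-(15.11).
# The ratings and capacity bounds below are hypothetical placeholders;
# the efficiency and standing loss follow the assumptions in the text.
import numpy as np

P_MIN, P_MAX = -6.0, 6.0    # hypothetical ratings (kW)
C_MIN, C_MAX = 0.5, 13.5    # hypothetical capacity bounds (kWh)
MU = 0.96                   # one-way efficiency
LAM = 0.1                   # continuous losses (kW, i.e. 100 W)

def simulate_soc(p, c0):
    """Return the half-hourly state of charge, or None if infeasible."""
    c, soc = c0, []
    for p_t in p:
        if not (P_MIN <= p_t <= P_MAX):          # rating constraint (15.8)
            return None
        eta = MU if p_t >= 0 else 1.0 / MU       # efficiency factor (15.11)
        c = c + 0.5 * (eta * p_t - LAM)          # capacity update (15.10)
        if not (C_MIN <= c <= C_MAX):            # capacity constraint (15.9)
            return None
        soc.append(c)
    return np.array(soc)
```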

The control method presented here uses fixed day-ahead scheduling. As input it uses the forecasts to decide on the periods of charging and discharging subject to the constraints above. Let \(\textbf{P} = (P_1, P_2, \ldots , P_{48})^T\) be the charging schedule and let \(\hat{\textbf{L}} =(\hat{L}_{N+1}, \hat{L}_{N+2}, \ldots , \hat{L}_{N+48})^T\) be the predicted demand (both in kW) for the day ahead.

To create the schedule the following cost function is minimised with respect to \(\textbf{P}\), subject to the forecast \(\hat{\textbf{L}}\) and the battery constraints (15.8)–(15.11):

$$\begin{aligned} F\left( \textbf{P}, \hat{\textbf{L}} \right) = \xi _p\left( \textbf{P}, \hat{\textbf{L}}\right) + \xi _{cd}\left( \textbf{P} \right) \end{aligned}$$
(15.12)

The first component represents the new peak size and is the peak-to-average cost component for peak reduction, self-normalised to the initial conditions, defined as

$$\begin{aligned} \xi _p(\textbf{P}, \hat{\textbf{L}})= \frac{1}{R}\left( \frac{\max _{t=1, \ldots ,48} (P_t+\hat{L}_{N+t})}{\frac{1}{48}\sum _{k=1}^{48}(P_k+\hat{L}_{N+k}) } \right) ^{2} \end{aligned}$$
(15.13)

where R is the same squared peak-to-average ratio evaluated for an initial schedule \(\textbf{P}^{initial}\):

$$\begin{aligned} R = \left( \frac{\max _{t=1, \ldots ,48} (P^{initial}_t+\hat{L}_{N+t})}{\frac{1}{48}\sum _{k=1}^{48}(P^{initial}_k+\hat{L}_{N+k}) } \right) ^{2} \end{aligned}$$
(15.14)

The second component aims to achieve a 50% State-of-Charge (SoC) at the end of the day and is defined as:

$$\begin{aligned} \xi _{cd} \left( \textbf{P} \right) = \frac{( c_{48} - 0.5 C_{max} )^2}{( c^{initial}_{48} - 0.5 C_{max})^2}, \end{aligned}$$
(15.15)

where \(c_t\) is the charge in the battery at time t. The initial end charge based on the initial schedule is given by \(c^{initial}_{48}\).
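Putting the pieces together, the day-ahead schedule could be obtained with an off-the-shelf optimiser. The following sketch, under the same assumptions as the simulation above, minimises Eq. (15.12) with scipy, handling infeasible schedules with a simple penalty; this is a crude stand-in for the optimisation used in [1], not a reproduction of it.

```python
# A sketch of the day-ahead scheduling step: minimise the cost function
# (15.12) over the 48 half-hourly power outputs. `L_hat` is the day-ahead
# forecast (kW) and `simulate_soc` is the dynamics sketch above.
import numpy as np
from scipy.optimize import minimize

def cost(p, L_hat, R, c0, cd_norm):
    soc = simulate_soc(p, c0)
    if soc is None:
        return 1e6                                    # infeasible schedule
    net = p + L_hat
    peak = (net.max() / net.mean()) ** 2 / R          # Eq. (15.13)
    soc_end = (soc[-1] - 0.5 * C_MAX) ** 2 / cd_norm  # Eq. (15.15)
    return peak + soc_end

c0 = 0.5 * C_MAX                       # start the day at 50% state of charge
p0 = np.zeros(48)                      # initial (do-nothing) schedule
R = (L_hat.max() / L_hat.mean()) ** 2  # Eq. (15.14) for the initial schedule
cd_norm = (simulate_soc(p0, c0)[-1] - 0.5 * C_MAX) ** 2

res = minimize(cost, p0, args=(L_hat, R, c0, cd_norm),
               bounds=[(P_MIN, P_MAX)] * 48, method="Powell")
schedule = res.x
```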

15.1.5 Results

The average peak reduction performance for each feeder for each day-ahead forecast is given in Table 15.3. The best possible values, assuming perfect foresight, are also shown. Perfect foresight almost achieves the maximum possible peak reduction of \(20\%\) for all feeders except L2. The question that could be asked is: why isn't it exactly \(20\%\), since the future is exactly known? To answer this, recall that the cost function, Eq. (15.12), isn't just focused on peak reduction; it also has to ensure the battery is charged to about \(50\%\) at the end of the day. The closer the peak is to the end of the day, the more this criterion will affect the final peak reduction. Further, if there are multiple peaks in close proximity to each other then the battery may not have sufficient time to recharge between peaks and reduce the subsequent peaks.

Table 15.3 The overall peak reduction by applying a storage control to each feeder for each day-ahead forecast. The best results for each feeder are highlighted in bold. Reprinted from [1] with permission from Springer

The table shows that the STAR method produces the largest percentage peak reduction on average, 4.5% larger than the next biggest (ST). However, it only produces the biggest peak reduction for three feeders; the simpler model ST actually has the biggest peak reduction for five feeders (tying with STAR for feeder S1). Recall that STAR was the best forecasting method for most feeders, so this highlights an important point: the most accurate forecast doesn't necessarily produce the greatest performance in the application. This can be an easy point to miss. Forecast error measures are usually simple and easy to calculate, in contrast to training directly via the application cost function. It would often be impractical to assess the forecasts using the cost function itself, due to computational costs, and so a compromise is to utilise a related but simpler measure which hopefully still indicates the application performance.

Fig. 15.6 Plot of percentage peak reduction for each feeder against (a) average daily demand for the STAR and ST methods, and (b) MAPE for the STAR method. Also highlighted are feeder S2 (blue square) and feeder M2 (red square)

The last column of the table shows the average peak reduction across all forecasts (excluding the "Best" column). It shows that although there is a trend of better peak reduction for larger feeders, it isn't straightforward. This is despite the correlation between accuracy and feeder size (see Fig. 15.2). Figure 15.6a shows the percentage peak reduction against feeder size for the ST and STAR methods. There is a general trend of greater peak reduction with larger feeder size, but there is at least one outlier with large demand but small peak reduction. This is feeder L2 which, as shown in Fig. 15.5, appears to be a single commercial load with no operation on Saturday and Sunday. Not only does this mean peak reduction is not possible on the two weekend days each week, but the other five days have large continuous daytime demands which make peak reductions difficult. Reducing the daily peak demand on this network would require a much larger storage device which can discharge a lot of energy over a larger portion of the day.

Figure 15.6b shows the peak reduction for each feeder against the MAPE for the STAR method. In general, the more accurate the forecast (the smaller the MAPE), the bigger the percentage peak reduction. However, there are three feeders which do not fit the trend. One of these is L2, which has already been discussed. The others are S1 and M1. M1 (Fig. 15.4) has lower peak reduction due to demand features similar to those of L2: the demand is relatively large throughout the day, possibly due to several commercial consumers connected to this feeder. The low peak reduction for S1 is more difficult to explain, but there are large demands in the morning on some days (Fig. 15.3) which may reduce the energy available in the battery for reducing the evening peak.

Highlighted in Fig. 15.6b are the feeders S2 and M2 which, recall, have relatively large standard deviations (Fig. 15.2). These have relatively large MAPEs and also small peak reductions. The volatility of these feeders means that the data is relatively spiky, which makes it difficult to produce an accurate forecast. A storage control schedule based on such a forecast may inadvertently charge during higher demand periods or discharge in relatively lower ones. Therefore these feeders only achieve small peak reductions.

It should be noted that there are only nine points in these plots. Thus some caution should be taken against over-interpreting the results, and they may not generalise more widely.

A take home message from these results is that there is no one-size-fits-all method. A next step may be to consider taking simple averages of the forecasts to see if this improves things (Sect. 13.1). In addition, since the data is quite volatile, probabilistic forecasts may also be a good option (Chap. 11); they may help to improve the results for the more volatile feeders. Finally, there are more advanced control techniques, such as model predictive control, which could also be considered.

15.2 Estimating Effects of Interventions and Demand Side Response

Demand side response (DSR) is deployed by turning demands on or off to react to possible strains on the network or to ensure energy supply matches energy demand. DSR could be as simple as turning on a load to increase the demand or, more commonly, turning off devices to reduce the demand on the network. For example, heat pumps could be turned off to reduce the demand during peak hours. Over a short period of time such interventions may not have a significant impact on the heating comfort within a home since, provided the home is well-insulated, the temperature should not drop too rapidly.

An important question for these applications is: how much energy was saved by the demand side response intervention? This saving is also known as "turn-down". Forecasts can help answer this question.

Fig. 15.7 Illustration of DSR turn-down. The measured demand (shaded) is compared to an estimate without intervention (bold line). The comparison can be used to estimate the energy saved by turning off devices

Figure 15.7 shows both the actual demand after demand side response (shaded), and what the demand would have been had no intervention been applied (bold line). The shaded area is the adjusted demand created by the ‘turn-down’ event, for example by turning off the controllable appliance. The energy saved at 6 PM is the difference in area between the line and the shaded part. Of course, there is no way to measure what the demand would have been had there been no DSR, which means the consumer has no direct way of knowing how much energy they saved. In particular, if they are participating in any energy market schemes, they will not know how much payback they should have received.

Forecasting is an effective way to estimate the amount of turn-down, since a model trained on "typical" demand can estimate what the demand would have been in the absence of an intervention (as long as the historical data used for training does not include interventions either). Thus the turn-down is simply the difference between the recorded demand (where the intervention has been applied) and the estimated demand from the forecast model.
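As a simple sketch of this calculation (assuming half-hourly energy values in kWh, a baseline model already trained on intervention-free data, and a hypothetical 5-8 PM event window):

```python
# A sketch of turn-down estimation: baseline (counterfactual) forecast
# minus metered demand, summed over the DSR event window. Assumes both
# series are half-hourly energy values (kWh) aligned on the same day.
import numpy as np

def estimate_turn_down(baseline, metered, window):
    """Energy saved (kWh) over the half-hour indices in `window`."""
    idx = np.asarray(window)
    return np.sum(baseline[idx] - metered[idx])

# Hypothetical example: a 5-8 PM event covers half-hour periods 34-39
# (0-indexed, where period 0 is 00:00-00:30).
# saved = estimate_turn_down(l_hat, l_actual, range(34, 40))
```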

The estimate in Fig. 15.7 could be produced using the time series methods introduced in this book. Of course there will be natural variation in the demand, but if the forecast is reasonably accurate (and this should be tested, of course) then accurate estimates of the average turn-down can be produced. Notice that the example in the figure has a much larger demand than expected after the turn-down period. This can occur in some applications and is known as a ‘rebound’ effect, caused by adjusted behaviour or extra demand in order to recover from the turn-down. For example, this could be extra heating applied by the occupants to recover the comfort levels in the home after the DSR event.

Notice that the model can be trained using demand data from after the DSR event since the application is in fact a backcast rather than a forecast and the aim is to estimate, not predict, the demand. This means the estimates may in fact be more accurate than a normal forecast since more information is available.

15.3 Anomaly Detection

Chapter 6 already discussed ways to identify anomalous data. However, similar techniques can be used to identify anomalous behaviour rather than errors in the recorded data. This is important for identifying things like energy theft, or whether the security of supply to vulnerable customers, for example those with medical needs, is at risk (although privacy concerns would have to be considered for such applications).

Such anomalies can be detected if they deviate from the expected demand, and of course the models used to create forecasts can be used to estimate typical demand, or to model the uncertainty. One example would be to develop point forecast models to estimate the daily demand, which can be used to identify unusually large demands or appliances. Load forecasts can also be used to identify unusually small demands. This could indicate that the monitoring is broken, or that someone is rerouting their usage to artificially lower their bills! Another example would be to use probabilistic forecasts (Chap. 11), e.g. quantile forecasts, to identify observations which lie in the extreme tails. Large numbers of such observations can suggest something unusual is occurring.
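One possible realisation of the quantile-based idea is sketched below; the quantile forecasts q_lo and q_hi (say, the 1% and 99% quantiles) and the tolerance are assumed inputs, not values prescribed by the text.

```python
# A sketch of quantile-based anomaly flagging: weeks where too many
# observations fall outside a central prediction interval are flagged.
# `q_lo` and `q_hi` are assumed quantile forecasts aligned with `observed`.
import numpy as np

def flag_anomalies(observed, q_lo, q_hi, tol=0.05, window=336):
    """Flag weeks where more than `tol` of points fall outside [q_lo, q_hi]."""
    outside = (observed < q_lo) | (observed > q_hi)
    n_weeks = len(observed) // window
    flags = []
    for w in range(n_weeks):
        rate = outside[w * window:(w + 1) * window].mean()
        flags.append(rate > tol)
    return np.array(flags)
```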

Sudden large increases in the forecast errors may also suggest sudden changes in behaviour. This could indicate new occupants, new technologies, or simply large behavioural changes (for example, the COVID-19 pandemic led to many individuals working from home). This information could, in turn, lead to new solutions to support the network or help network operators plan their infrastructure upgrades.

15.4 Other Applications

There are many other topics which haven't been explored in the use cases above; forecasts can support many further applications, which are briefly outlined below:

  • Further Battery Applications: Forecasts can also be used to optimise solar PV connected batteries, minimise curtailment loss, control multiple batteries in electric vehicles, and regulate voltage.

  • Network Design and Planning: Forecasts can be used to size assets on the network (capacitors, substations etc.), and also plan the networks themselves (topology, location of batteries, sectionalising switches, etc.).

  • Electricity Price Forecasts: Energy markets rely on estimates of future demand, and therefore demand forecasts can be valuable inputs to price forecasting algorithms.

  • Simulating Inputs and Missing Data: Instead of the simple imputation models given in Sect. 6.1.2, more sophisticated load forecast models could be used. Forecasts can also be used to simulate inputs for other applications, for example power-flow analysis.

There are many other low voltage applications for load forecasts and these can be found in the review paper [2].

15.5 How to Use Forecasts in Applications

Below are a few guidelines for experimenting with utilising forecasts in real world applications:

  1. Try to understand what features may be most important for the performance of the application and try to design the error measure so it represents or aligns with this.

  2. Remember: whatever error measure is used, it will not exactly correlate with the performance of the application (unless you use the associated application cost function for the assessment, which is not often practical).

  3. Design the forecasts with the application and practicality in mind. If there are high levels of volatility then perhaps probabilistic forecasts are more appropriate. However, if there is limited data, or limited computational resources, this may not be possible and point forecasts may be more appropriate.

  4. In the case where probabilistic forecasts seem appropriate, it may be worth considering point forecasts anyway, since the performance difference may be minimal and the savings in resources may be worth the drop in optimality.

  5. Use at least one benchmark forecast, but preferably several, to help investigate the performance of the main models.

  6. Try to understand how forecast accuracy relates to performance within the application. If there is a trend, is it dramatic or small? Possibly drops in accuracy do not correspond to a large drop in performance, in which case simpler methods may be appropriate and there is not much point in spending time perfecting the forecast models. In contrast, if small improvements in forecast accuracy create large performance changes (or large monetary savings) then a focus on small improvements to the forecast may be worth the effort (at least until there are diminishing returns).

  7. It is worth remembering that in-silico tests are limited, as there will often be a whole host of other complications and challenges when applying the methods in practice. For example, controlling a storage device will require reliable communications equipment, properly functioning power-electronics, and may involve lags and delays in processing, etc. Ideally many of these considerations should be included in the design of the algorithms, but there will always be some simplifications.