A Dynamic Game Approach for Demand-Side Management: Scheduling Energy Storage with Forecasting Errors

Smart metering infrastructure allows for two-way communication and power transfer. Based on this promising technology, we propose a demand-side management (DSM) scheme for a residential neighbourhood of prosumers. Its core is a discrete time dynamic game to schedule individually owned home energy storage. The system model includes an advanced battery model, local generation of renewable energy, and forecasting errors for demand and generation. We derive a closed-form solution for the best-response problem of a player and construct an iterative algorithm to solve the game. Empirical analysis shows exponential convergence towards the Nash equilibrium. A comparison to a DSM scheme with a static game reveals the advantages of the dynamic game approach. We provide an extensive analysis of the influence of the forecasting error on the outcome of the game. A key result demonstrates that our approach is robust even in the worst-case scenario. This grants considerable gains for the utility company organising the DSM scheme and its participants.

and environment friendly access to electricity by means of energy storage and renewable energy generation.
The concept rests upon the implementation of a technologically advanced power grid. In contrast to the current power grid, this smart grid features two-way communication and power transfer between the utility company (UC) and individual households [4]. Its decentralised nature is expressed through distributed generation and storage of energy, with individual households capable of doing both. These households are called prosumers (a combination of producer and consumer). Moreover, the deployment of smart meters allows households to accurately measure electricity demands in real-time. This permits the implementation of demand-side management (DSM) schemes. Within such schemes, the UC incentivises users to avoid consumption during peak hours by means of dynamic pricing tariffs. These tariffs determine the price per energy unit based on the aggregated load of all users (cf. [18,6,2]). This will eventually allow the UC to reduce investments into fast-ramping technologies that would otherwise be needed.
In [18,2,7,22,5] consumers react to these price incentives by rescheduling their appliances. Among them, [18,2,5] additionally model the usage of energy storage systems. All of these users are aiming at a reduction of the peak-to-average ratio (PAR) of the aggregated load, since achieving this eventually translates into financial benefits for the participants. The methods of choice to obtain the desired schedules are almost always based on game-theoretic concepts. Only [5] deviates by using convex optimisation. Since the DSM scheme directly influences the routines of the users, their comfort levels play an important role. For instance, Yaagoubi et al. [22] found that when acceptable comfort levels are preserved, the amount of savings from the energy bill reduces by more than half of the optimum. Note that all these studies have the common idea of scheduling the usage of appliances and batteries in a day-ahead manner.
Day-ahead scheduling that does not interfere with the users can solely be realised through energy storage systems. [8,13] followed this approach and showed that considerable gains are achievable without interrupting the habits of the consumers. Nguyen et al. [8] put their focus on developing a distributed algorithm, while Pilz et al. [13] implemented an advanced battery model, providing insight into how specific battery characteristics influence the participation behaviour and thus the outcome of the game.
This work builds on these previous results and extends the approach of [13] in two directions. Firstly, we introduce a more sophisticated underlying game structure for the DSM scheme, namely a discrete time dynamic game. Secondly, we analyse the influence of the forecasting error for demand and energy generation on the scheduling outcome. A real-world application requires the mechanism to be resilient against eventual errors in the predictions, as they will undoubtedly occur.
Our contributions are as follows: (1) We introduce a novel discrete time dynamic game for energy storage scheduling among prosumers in the smart grid. The closed-form solution to the best-response problem is derived by means of a dynamic-programming approach. The ensuing iterative algorithm converges quickly towards the Nash equilibrium. A direct comparison to a static game for DSM reveals the superiority of this approach both in terms of computational costs and achieved PAR reduction.
(2) A complete day-ahead DSM scheme, consisting of prosumers with realistically modelled batteries, local renewable energy sources, and forecasting errors for demand and generation is simulated. In contrast to previous works which merely simulate individual days, our scheduling period covers a full year. The length of the simulation allows for an in-depth analysis of the influence of the forecasting errors as well as the impact of the number of participants in the DSM scheme.
(3) We show that the proposed dynamic game approach is robust with respect to the forecasting errors, even in the worst-case scenario. The respective results exhibit only small deviations in the PAR reduction outcomes compared to runs with accurate predictions, and hardly any influence on the financial benefits for the DSM participants.
(4) For the first time, a comparison of how different compositions of neighbourhoods perform in the DSM scheme is presented. We find that a community consisting of a mix of consumer types can achieve best results.
This paper is organised as follows. In Section 2, we give an overview of the system, provide details of the DSM protocol, introduce the battery and the renewable energy model, and explain the pricing tariff. Section 3 contains detailed information about the dynamic game. Furthermore, it includes the derivation of the best-response solution and the description of the iterative algorithm. The simulation parameters and the data sets for demand and generation are presented at the beginning of Section 4. Then, we compare our approach to the static game approach of [13], and show the influence of the forecasting errors. This section ends with detailed discussions of all the presented results. Section 5 concludes the paper and points out future research aspects.

System Model - A Smart Grid Neighbourhood
In this section, we build the basis for the formulation of the battery scheduling game presented in Section 3. We introduce the concept of a smart grid neighbourhood that participates in a demand-side management (DSM) program to reduce their electricity bills. Each of the participants is equipped with an individually owned lithium-ion battery in addition to a photovoltaic (PV) cell which generates electricity. Models for both the battery and the PV cell are stated in detail. Moreover, we clarify the specific smart meter infrastructure that is necessary to implement the DSM program, as well as the role of the single utility company (UC) running this program.

Neighbourhood and Demand-Side Management Program
Consider a residential neighbourhood comprised of $M$ houses. Each of these is equipped with a smart meter. Smart meters are capable of measuring electricity consumption accurately and at a higher frequency than the usual monthly or quarterly readings. Furthermore, these devices can communicate directly with the utility company. This eventually allows for the implementation of the DSM program, and also eliminates the need for on-site readings. For our proposed model, we assume that we are able to obtain readings in regular intervals. Based on the reading frequency, we split each day into $T$ discrete intervals and denote the set of all intervals by $\mathcal{T}$. We assume that the $M$ houses are served by the same UC. In order to incentivise consumers to participate in the DSM scheme, the UC offers them a specific pricing scheme, which eventually reduces their electricity bills. Details can be found in Section 2.3. Let us denote the set of households who participate in the DSM program by $\mathcal{N} \subset \mathcal{M}$, where $\mathcal{M}$ is the set of all households in the neighbourhood. The total number of participants is $N = |\mathcal{N}|$. Besides the different pricing scheme, the participants of the DSM possess their own battery storage system and have solar panels installed. An overview of the neighbourhood is given in Figure 1.

Figure 1: A participating household is equipped with a lithium-ion battery and a solar panel. The solar panel can directly charge the battery, but to run any appliance its direct current needs to be converted to alternating current by the inverter. The smart meter collects data and executes the schedule obtained from the scheduling software selma. A non-participating household is depicted in the bottom left corner. It is also equipped with a smart meter, collecting data and communicating with the utility company. The complete neighbourhood consists of a number of households (cf. bottom right) that belong to either one of the shown categories. All of them are served by the same utility company.
The DSM scheme can be seen as a protocol that is executed repeatedly. In our study, the protocol is run once per day. Note that this is a completely automated process run by our scheduling software selma (short for: Scheduler for Electricity in Local MArkets), which needs to be installed on a consumer access device given to each participant of the scheme.
Before the start of each scheduling period, selma forecasts the demand of the respective household for each interval $t \in \mathcal{T}$ of the upcoming day. This information is sent to the UC. The smart meters of non-participants are not able to forecast their own demand. Thus, the UC performs the forecasting step for these households, based on historically collected data. Eventually, forecasted demand curves are aggregated and the information is sent to each DSM participant. Note that no information about individual neighbours is shared, but only aggregated information. This provides anonymity to all consumers.
Based on this input, the households play a dynamic non-cooperative game (cf. Section 3). The outcome of the game is a set of schedules, one for each household, which specify how they can make best use of their battery system. The households will follow these schedules throughout the day, even if their actual demand differs from the forecasted one. In Section 4, we investigate the influence of the forecasting error and show the robustness of the approach even in the worst-case scenario. At the end of the scheduling period, the electricity costs for each consumer are calculated based on the agreed pricing terms and the protocol starts over again.

Individual Households
Households that participate in the DSM scheme are equipped with a lithium-ion battery and PV cells. In this subsection, we introduce the battery model and clarify how the battery can be used. Moreover, details on the PV system are provided. Finally, we clarify the terminology of demand, net-demand and load of a household based on the usage of their battery and PV cells.

Battery Model and Decision Variables
In this paper, we employ the same battery model as used in [13]. This includes charging, discharging, and self-discharging characteristics of a lithium-ion battery. In fact, the same model may also be applied for lead-acid battery systems (but not nickel-based batteries due to their different charging behaviour). As all our simulations are based on a real-world lithium-ion battery system, in the following we will only refer to them as such.
Charging: Lithium-ion batteries are charged in a two-stage process [16]. In the first stage, the state-of-charge (SOC) increases linearly. This stage is called the 'constant current' (CC) stage, with a charging rate limited by $\rho^+ > 0$. In the second stage, i.e. the 'constant voltage' (CV) stage, the effective charging rate levels off exponentially towards the point where the SOC reaches the nominal maximum capacity $s_{\max}$ of the battery. The point of transition from the first stage to the second is indicated by a SOC $s^*$ and an associated time $t^*$, which need to be specified for the respective battery. During both stages, we additionally consider losses due to the specific charging efficiency $\eta^+$ with $0 \le \eta^+ \le 1$. Additionally, certain losses occur from the hybrid inverter (cf. Figure 1), modelled by $\eta_{inv}$ with $0 \le \eta_{inv} \le 1$. The hybrid inverter transforms the direct current from either the battery or PV into alternating current at usable voltage and frequency for the household appliances. It also works in the reverse direction to charge the battery.
To obtain an insight into how the households can make use of their battery system, let us look at a specific example (cf. Figure 2(a)). Given a certain value for the SOC, e.g. $s$, we can associate a time $t$, and thus specify a point on the charging curve.
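The two-stage charging curve and the lookup from a SOC value to its position on the curve can be sketched as follows. This is a minimal illustration under stated assumptions, not the battery model of [13]: the exponential form of the CV stage, the constants `GAMMA_1` and `GAMMA_2` derived from it, and the charging rate `RHO_PLUS` are hypothetical choices, and charging and inverter losses are left out. Only the capacity and the transition SOC are taken from Table 1.

```python
import math

S_MAX = 13.5    # kWh, nominal maximum capacity (Table 1)
S_STAR = 9.46   # kWh, SOC at the CC -> CV transition (Table 1)
RHO_PLUS = 3.3  # kW, hypothetical constant-current charging rate
T_STAR = S_STAR / RHO_PLUS  # h, time at which an empty battery reaches S_STAR

# Assumed CV-stage form: s(t) = S_MAX - GAMMA_1 * exp(-GAMMA_2 * t).
# Matching value and slope at (T_STAR, S_STAR) fixes both constants.
GAMMA_2 = RHO_PLUS / (S_MAX - S_STAR)
GAMMA_1 = (S_MAX - S_STAR) * math.exp(GAMMA_2 * T_STAR)


def soc_at(t: float) -> float:
    """SOC reached after charging an empty battery for t hours."""
    if t <= T_STAR:
        return RHO_PLUS * t
    return S_MAX - GAMMA_1 * math.exp(-GAMMA_2 * t)


def time_at(s: float) -> float:
    """Inverse of soc_at for 0 <= s < S_MAX: the time associated with SOC s."""
    if s <= S_STAR:
        return s / RHO_PLUS
    return -math.log((S_MAX - s) / GAMMA_1) / GAMMA_2


def max_soc_gain(s: float, dt: float = 1.0) -> float:
    """Largest SOC increase available within one interval of length dt."""
    return soc_at(time_at(s) + dt) - s
```

In the linear stage the gain per interval equals the charging rate times the interval length, whereas close to the maximum capacity it shrinks and always stays below the remaining headroom.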
Within the next interval of length $\Delta t$, the decision variable $a^+$ of how much to charge the battery will lie in $H^+(s) = \{a^+ \mid h^+(s, a^+) \le 0\}$, with the constraint function $h^+$ encoding the charging limits. In other words, $a^+$ is limited by $0 < a^+ \le \varphi^+(s) < s_{\max} - s$. We use the notation above to comply with the one shown in [9]. The upper limit $\varphi^+(s)$ is described by the charging curve introduced above, whose parameters $\gamma_1, \gamma_2$ are defined such that the charging curve is smooth at the transition point $(t^*, s^*)$. The discrepancy between the grey-shaded area and the charging curve in Figure 2(a) results from an imperfect charging efficiency. In fact, based on the decision variable $a^+$, the SOC of the battery changes according to the charging transition equation.

Discharging and Self-Discharging: We model the discharging behaviour of lithium-ion batteries by a linear decrease in the SOC. Here, the slope is given by the discharging rate $\rho^- < 0$. In order to account for the usual sharp drop-off of the discharging rate at low capacities, discharging is prohibited below a minimum SOC $s_{\min}$. Again, we also consider losses due to the specific discharging efficiency $\eta^-$ with $0 \le \eta^- \le 1$ and the hybrid inverter.
In Figure 2(b), a specific example is given to clarify how the user can discharge its battery. Within the respective interval, the decision variable $a^-$ of how much to discharge the battery will lie in the analogously defined set $H^-(s)$. In other words, $a^-$ is limited by $s_{\min} - s < \varphi^-(s) \le a^- < 0$, where the lower limit $\varphi^-(s)$ is specified by (5). The dependency on $s$ in (5) is implicitly given by the fact that we cannot go lower than $s_{\min}$. Note that $\varphi^-$ also depends on the efficiency parameter, such that the actual amount taken from the battery in correspondence with the decision variable $a^-$ (grey-shaded area in Figure 2(b)) is given by the discharging transition equation. In the following subsection, we will see that $\varphi^-$ is additionally limited by the demand of the specific household, i.e. one can only discharge as much as is needed to run all appliances.

Whenever the battery is neither charging nor discharging, it will be subject to self-discharging. We model this type of behaviour with an exponential decline. This case corresponds to the decision variable $a = 0$. The respective self-discharging transition equation is governed by the self-discharging rate $\bar{\rho} < 0$.
For later usage (cf. Section 3.1), we summarise the transition equations for charging, discharging and self-discharging into a single transition equation $f$, referred to as (8) in the following.
Furthermore, we combine the restrictions of the decision variable due to the battery restrictions for charging and discharging into a single constraint function $h$, referred to as (9).
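To make the three cases of the combined transition concrete, the sketch below implements one possible reading of it. The functional forms are assumptions rather than the equations of [13]: charging is taken to add the fraction `ETA_PLUS` of the decision to the SOC, discharging to remove the decision divided by `ETA_MINUS`, and idling to apply an exponential decay with a hypothetical rate `RHO_BAR`. Inverter losses and the rate limits are left out.

```python
import math

ETA_PLUS = math.sqrt(0.918)   # charging efficiency (value from Section 4.1)
ETA_MINUS = math.sqrt(0.918)  # discharging efficiency (value from Section 4.1)
RHO_BAR = -0.001              # 1/h, hypothetical self-discharging rate
S_MIN, S_MAX = 0.0, 13.5      # kWh, SOC limits (Table 1)


def transition(s: float, a: float, dt: float = 1.0) -> float:
    """Next SOC for current SOC s and decision a (kWh, household side)."""
    if a > 0:    # charging: only the fraction ETA_PLUS of a reaches the cells
        s_next = s + ETA_PLUS * a
    elif a < 0:  # discharging: more than |a| is taken from the cells
        s_next = s + a / ETA_MINUS
    else:        # idle: exponential self-discharge
        s_next = s * math.exp(RHO_BAR * dt)
    if not S_MIN <= s_next <= S_MAX:
        raise ValueError("decision violates the battery restrictions")
    return s_next
```

Raising an error for inadmissible decisions plays the role of the constraint function here; a scheduler would instead restrict the decision to the admissible set before calling the transition.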

PV Model
We model the solar panel as an additional source of electricity besides the grid connection. The output of the nth household's PV system during interval $t$ is denoted by $w^t_n$. It can serve two purposes: (i) direct usage by household appliances, and (ii) charging the battery. Whereas direct usage is influenced by the efficiency of the hybrid inverter, charging the battery does not require any inversion and thus only depends on the charging efficiency of the battery.
An important parameter of the PV installation is the nominal kilowatt peak kWp of the system. It is a measure of the size of the system and denotes the maximum output that can be expected under standardised conditions. A PV system which operates at its maximum capacity, e.g. kWp = 3 kW, for one hour will produce 3 kWh. Note that identifying the optimal size of the PV installation does not fall within the scope of this article. An approximated scale is obtained from [23,11] (cf. Section 4.2).

Demand, Net-Demand and Load
We define the demand $\tilde{d}^t_m \ge 0$ of a household $m \in \mathcal{M}$ as the amount of electricity that is needed to run all its appliances during the time interval $t \in \mathcal{T}$. Thus, the total daily demand-schedule can be written as $\tilde{d}_m = (\tilde{d}^0_m, \dots, \tilde{d}^{T-1}_m)$. Throughout the paper, we assume that the demand cannot be shifted. Thus, our approach is fully non-intrusive and does not influence the behaviour of the user.
Combining the demand $\tilde{d}^t_n$ of a household $n \in \mathcal{N}$ with the generated electricity $w^t_n$ from the solar panel gives the net-demand $d^t_n = \tilde{d}^t_n - \eta_{inv} w^t_n$ (10), where $\eta_{inv}$ is the efficiency of the inverter (cf. Figure 1). Theoretically, this value can be smaller than zero, i.e. when the effective generation is larger than the demand in the specific interval. Practically, we ensure $d^t_n \ge 0$ by storing all excess energy directly in the battery. For households $m \notin \mathcal{N}$ that do not participate in the DSM scheme, the net-demand is identical to the demand.
Let $l^t_m$ denote the load, i.e. the amount of energy drawn from the grid by household $m \in \mathcal{M}$ during interval $t \in \mathcal{T}$. For households which do not participate in the DSM scheme, the load equals their demand. For the others, the load depends on the decision $a^t_n$ taken at the specific interval. In other words, it combines the net-demand with the amount of energy that is charged or discharged by the battery, $l^t_n = d^t_n + a^t_n$ (11), where $\max\{-d^t_n, \varphi^-\} \le a^t_n \le \varphi^+$. The lower boundary expresses the fact that one cannot discharge more than is actually needed to fulfil the net-demand, while at the same time all battery restrictions remain valid. Due to this condition and (10), we ensure that $l^t_m \ge 0$ for all $m \in \mathcal{M}$ and all intervals $t \in \mathcal{T}$. We write $l_m = (l^0_m, \dots, l^{T-1}_m)$ for the schedule of loads of a specific household. Furthermore, we can calculate the total load on the grid for interval $t$ by $L^t = \sum_{m \in \mathcal{M}} l^t_m$ (12). Similarly, we define the average aggregated load of all households other than $n$ during time interval $t$ by $L^t_{-n} = \frac{1}{M-1} \sum_{m \in \mathcal{M} \setminus \{n\}} l^t_m$.
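The chain from demand to net-demand, load and aggregated load can be illustrated with a few list operations. This is a sketch under assumptions: the function names and the inverter efficiency `ETA_INV` are hypothetical, and clipping the net-demand at zero stands in for routing surplus PV energy into the battery.

```python
ETA_INV = 0.96  # hypothetical inverter efficiency


def net_demand(demand, generation):
    """Net-demand per interval; surplus PV is assumed to go to the battery."""
    return [max(d - ETA_INV * w, 0.0) for d, w in zip(demand, generation)]


def load(net, decisions):
    """Load drawn from the grid: net-demand plus battery decision."""
    return [d + a for d, a in zip(net, decisions)]


def aggregated_load(all_loads):
    """Total load per interval; all_loads holds one schedule per household."""
    return [sum(column) for column in zip(*all_loads)]


def average_load_of_others(all_loads, n):
    """Average load per interval of all households except household n."""
    others = [schedule for m, schedule in enumerate(all_loads) if m != n]
    return [sum(column) / len(others) for column in zip(*others)]


# Three households, two intervals (made-up numbers in kWh).
example = [[1.0, 2.0], [3.0, 1.0], [2.0, 3.0]]
total = aggregated_load(example)               # [6.0, 6.0]
others_0 = average_load_of_others(example, 0)  # [2.5, 2.0]
```

A negative decision (discharging) lowers the load and a positive one (charging) raises it, which is how a household can move grid consumption between intervals without changing its demand.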

Forecasting Errors
The DSM protocol states that households send a forecast of their net-demand to the UC. This depends on the demand as well as the electricity generated by the solar panel. Both variables will introduce errors that need to be accounted for. In this paper, we consider the worst-case scenario. A comprehensive overview of current techniques for short-term demand forecasting is given in [1]. From [1], we obtain an upper limit for the forecasting error $\epsilon_d$, expressed as a percentage of the actual demand. Similarly, [3] gives an insight into 24-hour PV power output prediction. The forecasting error $\epsilon_w$ is also given as a percentage of the actual generation. The worst-case scenario arises when these two errors carry opposing signs. This becomes clear from (10), since both contributions to the net-demand enter with different signs. Intuitively, it makes sense that in the worst case the forecasted net-demand is smaller than the actual one. This is because a forecasted net-demand that is too small disguises the incentive to make use of the battery system. By the same argument, the worst-case solar forecast is higher than the actual one. It might imply a sufficient SOC of the battery, when in reality more charging would have been necessary.
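The worst-case construction can be sketched as follows: demand is under-forecast and PV generation is over-forecast, each by its relative error bound. The function name and the error values used in the example are hypothetical; the bounds actually used in the study are taken from [1] and [3].

```python
def worst_case_forecast(demand, generation, eps_d, eps_w):
    """Under-forecast demand and over-forecast PV by relative errors."""
    demand_forecast = [(1.0 - eps_d) * d for d in demand]
    generation_forecast = [(1.0 + eps_w) * w for w in generation]
    return demand_forecast, generation_forecast


# Made-up example: 50 % demand error and 25 % generation error.
d_hat, w_hat = worst_case_forecast([2.0], [1.0], eps_d=0.5, eps_w=0.25)
```

With these numbers the forecasted net-demand before inverter losses is 1.0 - 1.25 < 0, although the actual value is 2.0 - 1.0 = 1.0, so the schedule would be computed as if no grid energy were needed in that interval.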

The Utility Company
Throughout the paper, we assume a single utility company (UC) serves all the consumers in the neighbourhood. The UC runs a DSM scheme in order to reshape the load profile. To be more precise, it wants to achieve a flatter profile such that investments into fast-ramping technology, which is needed to deliver peak demand, can be reduced. The incentive for the users to limit consumption during peak hours is given by a dynamic pricing tariff: the cost per energy unit is calculated separately for each interval and depends on the aggregated load of all users in the neighbourhood. Following [13,7,22,8], we employ a quadratic cost function $g^t(y) = c_2 y^2 + c_1 y + c_0$ (13), where $y$ is the aggregated load at time $t$ given by $L^t$, and the coefficients satisfy $c_2 > 0$, $c_1 \ge 0$ and $c_0 \ge 0$. Similar to [13,18,7], we employ a proportional billing scheme, where each participant of the DSM scheme pays for their share of the consumption, which determines the electricity bill $B_n$. For households that do not participate in the DSM scheme, a standard fixed-price tariff is employed.
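The incentive created by the quadratic tariff can be checked with a short calculation. The sketch below uses the coefficients quoted in Section 4.1; the billing rule in `bills` is an assumption (each interval's cost is split in proportion to each household's load in that interval) and is only one possible reading of the proportional billing scheme.

```python
C2, C1, C0 = 0.03125, 1.0, 0.0  # coefficients quoted in Section 4.1


def cost(y):
    """Quadratic cost of serving the aggregated load y in one interval."""
    return C2 * y * y + C1 * y + C0


def bills(all_loads):
    """Per-household bill under an assumed per-interval proportional split."""
    totals = [sum(column) for column in zip(*all_loads)]
    result = []
    for schedule in all_loads:
        bill = 0.0
        for l, total in zip(schedule, totals):
            if total > 0:
                bill += l / total * cost(total)
        result.append(bill)
    return result


flat = cost(4.0) + cost(4.0)   # same energy, flat profile: 9.0
peaky = cost(6.0) + cost(2.0)  # same energy, peaky profile: 9.25
```

Because the cost is convex, the same total energy is cheaper when spread evenly, which is the mechanism that rewards households for flattening the aggregated load.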

Dynamic Battery Scheduling Game
In this section, we formulate the non-cooperative dynamic game between the households that possess individual energy storage and photovoltaic (PV) installations. To do so, we introduce the relevant notation and relate it to their respective 'real-world' meaning according to our system (cf. Section 2). Furthermore, the notion of a Nash equilibrium (NE) is defined and an important result concerning the link between the NE for the whole game and the NE for a subgame is provided. Subsequently a dynamic programming algorithm is presented from which we derive a closed form expression of the best response, i.e. the best decision a player can make in response to fixed decisions of other players. Eventually we use this result to construct an iterative algorithm that computes the NE of the game.

Definitions and Game Formulation
Formally, the game belongs to the category of discrete time dynamic games (cf. [9]), where players make their decisions sequentially in stages. These stages directly correspond to the daily intervals introduced in Section 2.1. For each stage we define a state of the game, i.e. the current state-of-charge (SOC) of all batteries, representing the configuration of the overall system. Furthermore, we define a transition equation that models the evolution of this state based on the decisions of the players. In other words, the players will choose actions that are directly related to their battery usage, which in turn depends on the state of the game. We consider a game with open-loop information structure, which means that the initial state of the game is known by all players. In this game, players want to minimise their energy bill, i.e. their utility function, which depends not only on their own but also on the decisions of all other players. In a nutshell, we have:
(3) Scalar state variables $s^t_n \in S_n \subset \mathbb{R}$ denoting the SOC of the nth player's battery at stage $t \in \mathcal{T} \cup \{T\}$. Collectively, we denote the state variables of all players at stage $t$ by $s^t := (s^t_1, s^t_2, \dots, s^t_N) \in S := S_1 \times S_2 \times \dots \times S_N \subset \mathbb{R}^N$. In the open-loop information structure it is assumed that the initial state $s^0$ is known to all players $n \in \mathcal{N}$.
(4) Scalar decision variables $a^t_n \in H^t_n(s^t_n) \subset A_n \subset \mathbb{R}$ (for the definition of $H^t_n$ see item (5)) denoting the usage of the battery of the nth player at time $t \in \mathcal{T}$. Collectively, we denote the decision variables of all players at stage $t$ by $a^t := (a^t_1, a^t_2, \dots, a^t_N) \in A := A_1 \times A_2 \times \dots \times A_N \subset \mathbb{R}^N$. Furthermore, we define the schedule of battery usage of an individual player $n \in \mathcal{N}$ as the collection of all its decisions in the stages of the game, $a_n := (a^0_n, a^1_n, \dots, a^{T-1}_n)$. A strategy profile is denoted by $a := (a_1, a_2, \dots, a_N)$.
(5) A set of admissible decisions $H_n(s^0_n) := \{a_n \mid h^t_n(s^t_n, a^t_n) \le 0,\ t \in \mathcal{T}\} \subset \mathbb{R}^T$ for the nth player. The function $h^t_n(s^t_n, a^t_n)$ has been defined in (9), Section 2.2.1, capturing the restrictions posed on the battery. We denote $H^t_n(s^t_n) := \{a^t_n \mid h^t_n(s^t_n, a^t_n) \le 0\} \subset \mathbb{R}$.
(6) A state transition equation $s^{t+1}_n = f^t_n(s^t_n, a^t_n)$, $t \in \mathcal{T}$ (17), governing the state variables $\{s^t\}_{t=0}^{T}$. The function $f^t_n(s^t_n, a^t_n)$ is the discretised version of the transition equation (8).
(7) A utility function $U_n(s^0_n, (a_n, a_{-n})) = \sum_{t=0}^{T-1} g^t_n(s^t_n, a^t_n, a^t_{-n}) + g^T_n(s^T_n)$ (18) for the nth player, where $a_{-n} := (a_1, a_2, \dots, a_{n-1}, a_{n+1}, \dots, a_N)$ denotes the decisions of all other players. The function $g^t_n(s^t_n, a^t_n, a^t_{-n})$ has been defined in (13), Section 2.3, capturing the costs to the nth player at the tth stage. Note that the utility function depends only on the initial state variable $s^0_n$, since the subsequent states $s^t_n$ are determined by (17). The function $g^T_n(s^T_n) = s^T_n$ (19) expresses a penalty for the nth player that is incurred by ending up in state $s^T_n$, i.e. its SOC, at the end of the scheduling period.
We represent the decision problem $G_n(a_{-n})$ of the nth player as the following optimisation problem (20): given $s^0 \in S$, minimise $U_n(s^0_n, (a_n, a_{-n}))$ over $a_n$, subject to $a^t_n \in H^t_n(s^t_n)$ and $s^{t+1}_n = f^t_n(s^t_n, a^t_n)$ for all $t \in \mathcal{T}$. Moreover, the game is referred to as $\{G_1, G_2, \dots, G_N\}$.
Definition 2. A strategy profile $\hat{a} = (\hat{a}_1, \dots, \hat{a}_N)$ is a Nash equilibrium for the game $\{G_1, \dots, G_N\}$ if and only if for all players $n \in \mathcal{N}$ we have $U_n(s^0_n, (\hat{a}_n, \hat{a}_{-n})) \le U_n(s^0_n, (a_n, \hat{a}_{-n}))$ for all $a_n \in H_n(s^0_n)$.
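In an open-loop game, the cost of a schedule can be evaluated by rolling the state forward from the initial SOC, which is why the utility depends on the initial state only. The sketch below assumes that the utility is the sum of the stage costs plus the terminal penalty (19); all function names and the toy cost, transition and numbers are hypothetical.

```python
def utility(s0, schedule, others, transition, stage_cost, terminal_cost):
    """Total cost of one player's open-loop schedule.

    schedule[t] is the player's decision and others[t] summarises the
    decisions of all other players at stage t.
    """
    s, total = s0, 0.0
    for a, a_others in zip(schedule, others):
        total += stage_cost(s, a, a_others)
        s = transition(s, a)  # the next state follows from the decision
    return total + terminal_cost(s)


# Toy instance: lossless battery, quadratic stage cost, penalty s^T.
value = utility(
    s0=1.0,
    schedule=[1.0, -1.0],
    others=[0.0, 2.0],
    transition=lambda s, a: s + a,
    stage_cost=lambda s, a, o: (a + o) ** 2,
    terminal_cost=lambda s: s,
)
```

Here the two stage costs are 1.0 each and the final SOC is 1.0, so the total is 3.0. A Nash equilibrium check would compare this value against the cost of every admissible alternative schedule with the other players' decisions held fixed.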

Analysis of the Game
In order to analyse the game $\{G_1, \dots, G_N\}$, we follow the dynamic programming (DP) idea by Nie et al. [9]. To do so, we introduce notation for subproblems of (20). Furthermore, we show an important result about Nash equilibria for these subproblems, which constitutes the basis for the DP-algorithm. Applying the general algorithm eventually leads us to an analytic formulation of the nth player's best response $\hat{a}_n$, given the strategies $a_{-n}$ of other players at stage $t$ of a $T$-stage game.

Subgame Formulation
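For subproblems that are only interested in decisions taken from stage $t$ onwards, we write $a^{t,T-1}_n$ for the decisions of the nth player in the stages $t, \dots, T-1$. For $t \in \mathcal{T}$ we define a subproblem of the nth player as the optimisation problem (20) restricted to these stages and started from the state $s^t$. The corresponding subgame is referred to as $\{G^{T-t}_1, \dots, G^{T-t}_N\}$.
The important result is that if $\hat{a}$ is a Nash equilibrium for the game $\{G_1, \dots, G_N\}$, then its restriction $\hat{a}^{t,T-1}$ is a Nash equilibrium for the subgame. The argument is by contradiction. Assume that $\hat{a}^{t,T-1}$ is not a Nash equilibrium of the subgame. Then there exists a player who can lower its cost from stage $t$ onwards by a unilateral deviation, and since the costs incurred before stage $t$ are unaffected, also its total cost. That is in contradiction to our assumption that $\hat{a}$ is a Nash equilibrium for the game $\{G_1, \dots, G_N\}$. Consequently, our assumption about $\hat{a}^{t,T-1}$ is proved to be false. Thus $\hat{a}^{t,T-1}$ indeed comprises a Nash equilibrium of the subgame.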

The DP-Algorithm and Derivation of the Best Response Solution
Based on the results of the previous subsection, we can formulate the following DP-algorithm to find the solution to the decision problem G n (a −n ) (20), i.e. the optimal decision for the nth player given the decisions a −n of the other players. Let us apply Algorithm 1 to obtain the result to the decision problem G n (a −n ) (20) in closed form. Note that both for-loops (line 1 and line 3) are treated implicitly by keeping s T and s t unspecified throughout the computations.
Given the total scheduling length $T$, the aggregated decisions $a_{-n}$ of all other players, and the initial SOC $s^0$ of the batteries, at the first step ($t = T$) we set $V^0_n(s^T_n) = s^T_n$ according to (19). With this we enter the while-loop (line 2), which overwrites $t$ to now represent $t = T - 1$. We solve for the best decision $\hat{a}^{T-1}_n$ at this stage and obtain the associated value function $V^1_n$. With this, the first step is done and we again overwrite $t$ to now represent $t = T - 2$.
In this stage we solve the analogous problem, now with $V^1_n$ as the cost-to-go. The solution $\hat{a}^{T-2}_n$ is computed in the same way, from which we obtain $V^2_n$, finalising the second step. This procedure can be repeated for all subsequent steps. As the equations increase quickly in size, it becomes infeasible to quote them here. Fortunately, our calculations provided insight into recurring patterns, which all the solutions seem to follow. Eventually, the solution $\hat{a}^t_n$ for an arbitrary stage $t$ of the $T$-stage dynamic game can be written in closed form (23).
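The backward procedure described above can be mimicked numerically. The sketch below is a generic backward induction on a small grid of SOC values, not the closed-form derivation of the paper: the state grid, the integer decisions, the lossless transition and the single-household stage cost are all hypothetical simplifications. The terminal value follows (19), i.e. the value at the final stage equals the final SOC.

```python
def backward_induction(T, states, actions, transition, stage_cost, terminal_cost):
    """Return the value table V and the optimal policy.

    V[t][s] is the minimal cost-to-go from state s at stage t.
    """
    V = [dict() for _ in range(T + 1)]
    policy = [dict() for _ in range(T)]
    for s in states:
        V[T][s] = terminal_cost(s)
    for t in range(T - 1, -1, -1):  # stages T-1, T-2, ..., 0
        for s in states:
            best_a, best_v = None, float("inf")
            for a in actions:
                s_next = transition(s, a)
                if s_next not in V[t + 1]:
                    continue  # decision leaves the admissible state set
                v = stage_cost(t, s, a) + V[t + 1][s_next]
                if v < best_v:
                    best_a, best_v = a, v
            V[t][s], policy[t][s] = best_v, best_a
    return V, policy


demand = [1, 3]  # made-up net-demand for two intervals
V, policy = backward_induction(
    T=2,
    states=[0, 1, 2],
    actions=[-1, 0, 1],
    transition=lambda s, a: s + a,
    stage_cost=lambda t, s, a: (demand[t] + a) ** 2,
    terminal_cost=lambda s: s,
)
```

Starting from an empty battery, the optimal plan charges one unit in the first interval and discharges it in the second, which turns the load profile (1, 3) into (2, 2) at a total cost of 8 instead of 10.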

The Algorithm
Similar to [13], we make use of a best-response algorithm (cf. Algorithm 2) to find the solution to the game. Whereas in [13] an extensive search for optimal schedules $\hat{a}_n$ was performed, here we can compute the best response for each stage (line 3) analytically by means of (23) and concatenate the results to obtain the optimal schedule $\hat{a}_n$ in response to $a_{-n}$. Performing this computation for each player $n \in \mathcal{N}$ (line 2) results in a new strategy profile $a$. We iterate this (line 1) as long as "there exists a player n for whom a_n is not a best response to a_{-n}". In the actual implementation, this check is done by comparing the current strategy profile with the one obtained from the previous iteration. If it did not change, up to machine precision, an equilibrium is reached and $\hat{a} = (\hat{a}_n, \hat{a}_{-n})$ constitutes the Nash equilibrium.

Algorithm 2: Best-response algorithm for finding a pure NE based on [17]
Input: T, s^0
  initialise random strategy profile a = (a_n, a_{-n})
1 while there exists a player n for whom a_n is not a best response to a_{-n} do
2   for each n ∈ N do
3     for each t ∈ T do
        â^t_n ← best response to a_{-n} based on (23)
      end
      a_n ← (â^0_n, ..., â^{T-1}_n)
    end
  end
Output: â
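The loop structure of the best-response algorithm, including the termination check against the previous iteration, can be sketched as follows. For brevity each player has a single scalar decision instead of a full schedule, and the best response comes from a toy two-player game rather than from (23); the function names, the tolerance and the toy game are hypothetical.

```python
def best_response_iteration(profile, best_response, tol=1e-12, max_iter=1000):
    """Update players in turn until no decision changes by more than tol."""
    profile = list(profile)
    for sweep in range(1, max_iter + 1):
        largest_change = 0.0
        for n in range(len(profile)):
            new = best_response(n, profile)
            largest_change = max(largest_change, abs(new - profile[n]))
            profile[n] = new
        if largest_change <= tol:
            return profile, sweep
    raise RuntimeError("no equilibrium found within max_iter sweeps")


# Toy two-player game (not the battery game): player n minimises
# a_n**2 + a_n * a_other - a_n, so its best response is (1 - a_other) / 2.
def toy_best_response(n, profile):
    return (1.0 - profile[1 - n]) / 2.0


equilibrium, sweeps = best_response_iteration([0.0, 1.0], toy_best_response)
```

In this toy game both decisions approach 1/3, the point at which neither player can improve by deviating alone, and the distance to it shrinks geometrically from sweep to sweep.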
Based on the definition of the NE, no household can benefit from unilaterally deviating from its respective schedule. Nonetheless, we have to keep in mind that the schedule is based on forecasted demand and renewable generation. Whenever either the demand or the generation does not match the forecasted value, it might no longer be possible to strictly follow the NE schedule. In the analysis in the subsequent sections, we assume that each household always seeks to stay as close as possible to the forecasted schedule. To illustrate the idea: imagine that the NE schedule of household $n$ prescribes discharging an amount $x$ in a certain interval. Due to a forecasting error for the renewable generation, the battery has not been charged as far as planned. In this case, the household will discharge as much as possible during this interval. The deviation from the NE will decrease the benefit in terms of PAR reductions and achieved savings for the consumer. Nevertheless, in the following section we show that the solution is robust with respect to these deviations.
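The rule "discharge as much as possible" amounts to clamping the scheduled decision to what the battery and the actual net-demand allow. The sketch below covers the discharging case only and ignores the rate limit; the function name, its parameters and the default values are hypothetical.

```python
def realised_decision(scheduled, soc, net_demand_actual, s_min=0.0, eta_minus=1.0):
    """Closest feasible decision to the scheduled one (discharging only)."""
    if scheduled >= 0:
        return scheduled
    available = (soc - s_min) * eta_minus       # energy deliverable to the house
    limit = -min(available, net_demand_actual)  # most negative feasible decision
    return max(scheduled, limit)
```

For example, a scheduled discharge of 2 kWh with only 1 kWh stored is reduced to 1 kWh, while the same schedule is followed exactly when 5 kWh are stored and the actual net-demand is at least 2 kWh.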

Results and Discussion
In this section, we firstly summarise important simulation parameters and introduce the specific data sets for electricity demand and generation from the photovoltaic (PV) installation. Secondly, all results are shown with detailed explanations of the individual parameters under investigation. Thirdly, the results are discussed and compared to the literature. The correctness of the implementation of Algorithm 2 is demonstrated in Appendix A.2.

The Simulation Setup
In the real-world application, the smart meter of individual households collects data about electricity demand and generation from the available PV installation. As specified in Section 2.1, the demand-side management (DSM) protocol requires participants to send forecasts of the demand and generation to the utility company. These forecasts are based on historically collected data. In order to run our simulations, we omit this forecasting step and rather make use of two publicly available data sets.
Demand data: The demand data stem from the openei dataset [21]. It contains 365 days of simulated hourly data for households in TMY3-locations in the USA [10]. The building models used for this simulation can be found in [20]. Based on an additional survey, all buildings are put into one of three different categories. They differ with respect to their overall consumption. Following [21], we refer to them as LOW, BASE and HIGH consumers. For all simulation runs, we picked the same M = 25 households, in close vicinity to each other, to represent our neighbourhood. With respect to their consumption categories, we have seven LOW, nine BASE and nine HIGH users.
PV data: Data for the PV generation are based on real-world measurements [14] in the UK. They contain hourly values for days between September 2013 and October 2014. Note that latitude and climate zone of the measurement location are similar to the ones of the demand data. Assuming that the weather for all households in the neighbourhood is the same, we use data from the same site for each of them. An estimate for the kWp value is obtained from looking at the highest hourly output in the course of a whole year. Its value is $w_{\max} = 3.7$ kWh, which is why we assume kWp ≈ 4 kW. We account for different sizes of PV installations by scaling the data set with a household-specific factor $p_n$. About 6% of the collected data was corrupted. We set all these values to $w = 0.0$ kWh. This does not pose any problem for our simulation results, but can be seen as realistic failures of the installation.
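The preprocessing of the PV series can be sketched in a few lines. How corrupted readings are marked in the data set is not stated, so treating `None`, NaN and negative values as corrupted is an assumption, as is rounding the highest hourly output up to a whole kW; all function names are hypothetical.

```python
import math


def clean_pv(series):
    """Replace corrupted readings (None, NaN or negative) by 0.0 kWh."""
    cleaned = []
    for w in series:
        bad = w is None or (isinstance(w, float) and math.isnan(w)) or w < 0
        cleaned.append(0.0 if bad else float(w))
    return cleaned


def estimate_kwp(series):
    """Highest hourly output, rounded up to a whole kW."""
    return math.ceil(max(series))


def scale_pv(series, p_n):
    """Scale the shared PV series by the household-specific factor p_n."""
    return [p_n * w for w in series]


raw = [1.2, None, float("nan"), -0.3, 3.7]  # made-up hourly readings in kWh
cleaned = clean_pv(raw)
```

With a highest hourly output of 3.7 kWh, the estimate is 4 kW, matching the value assumed in the text.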

Table 1 Battery parameters (the symbol for the first value is missing in the source).
Variable            Value
(symbol missing)    13.5 kWh
s min               0.0 kWh
s *                 9.46 kWh

Battery and pricing parameters: The parameters of the battery are based on the Tesla Powerwall 2 [19] data sheet. The choice to employ this battery system is motivated by two reasons: (i) The same battery was used in [13], allowing for a direct comparison of the results. (ii) A non-exhaustive analysis of different battery systems showed that the Tesla Powerwall 2 qualifies as a representative of state-of-the-art technology. Please see Appendix A.1 for more details. A summary of the battery parameters can be found in Table 1. The data sheet only specifies the round-trip efficiency η = η + · η − of the battery. We assume that charging and discharging contribute equally, yielding η + = η − = √ 0.918. For the parameters in the cost function (13) we use c 2 = 0.03125 $/MW 2 , c 1 = 1.0 $/MW, and c 0 = 0, following other studies [13,15]. This allows a direct comparison of our results with theirs.
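The parameter choices above can be illustrated with a short sketch. It assumes that cost function (13), which is not reproduced in this section, is a quadratic polynomial in the aggregated load; the function name `generation_cost` and the sample loads are hypothetical.

```python
import math

ETA_ROUND_TRIP = 0.918                     # round-trip efficiency from the text
eta_plus = eta_minus = math.sqrt(ETA_ROUND_TRIP)   # equal split, about 0.958 each


def generation_cost(load, c2=0.03125, c1=1.0, c0=0.0):
    """Assumed quadratic form c2*L^2 + c1*L + c0 of cost function (13)."""
    return c2 * load ** 2 + c1 * load + c0


# Serving the same energy with a flat profile is cheaper than with a peaky one.
peaky_cost = generation_cost(2.0) + generation_cost(6.0)
flat_cost = 2 * generation_cost(4.0)
```

Under this assumed form, the convexity of the cost is what makes shifting load away from peak intervals attractive to the participants.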

Results
We compare the game-theoretic approach introduced in this manuscript (cf. Section 3.1) with a simpler non-cooperative static game, revealing the advantages of the dynamic treatment. Subsequently, we analyse how the participation rate of the DSM scheme and the forecasting errors influence the scheduling outcome. Finally, we consider the influence of the composition of the neighbourhood on the peak-to-average ratio (PAR) reduction, which is an important measure of the effectiveness of the DSM scheme. We consider the PAR of the aggregated electricity load (12) over the respective scheduling period. It is defined in (24).
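Equation (24) is not reproduced in this text. As a hedged reminder, the PAR of a load profile is conventionally written as follows, where the symbol $L^{t}$ for the aggregated load in interval $t$ and the number of intervals $T$ are notational assumptions that may differ from the notation of (12):

```latex
\mathrm{PAR} \;=\; \frac{\max_{t \in \{1,\dots,T\}} L^{t}}{\dfrac{1}{T}\sum_{t=1}^{T} L^{t}}
```

A perfectly flat profile therefore has a PAR of 1, and any peak raises the value above 1.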

Comparison Between a Static and a Dynamic DSM scheme
In [13], a similar DSM scheme to the one described in Section 2.1 was examined. Both are based on a battery scheduling game for households of a neighbourhood served by the same utility company (UC). Their main difference is the underlying game that determines the schedules for the upcoming day. Whereas in this paper we employ a discrete time dynamic game, [13] made use of a simpler non-cooperative static game in which players were only able to choose between four discrete options for each interval. For a more thorough description please see [13]. For the sake of comparison, none of the households is equipped with PV cells.

Fig. 3 (caption, partly recovered) The aggregated demand is given as a reference. The orange curve results from a DSM scheme employing a static scheduling game [13], while the green one stems from a DSM scheme employing a dynamic game. Other than the underlying game structure, all parameters are identical.
In this subsection, we compare the two approaches with respect to their success in reducing the PAR of the aggregated load. To this end, the same parameters for each household and the same demand data are used. Households do not have the capability of on-site generation, but are equipped with the same batteries (cf. Table 1). The upcoming day is divided into T = 12 intervals and we assume N = M = 25, i.e. every household takes part in the DSM scheme. As in [13], we simulate full weeks by using the state-of-charge (SOC) values of the batteries at the end of one scheduling period as the initial configuration for the following one. Figure 3(a) and Figure 3(b) show the aggregated load curves achieved by the DSM schemes for forecasts given by week 12 and week 38 of the demand data set [21], respectively. For completeness, we also simulated week 25 and week 51, as done in [13]. A summary of the achieved results can be seen in Table 2. On average, the static and the dynamic games achieved a 14% and a 32% decrease of the PAR value, respectively. To understand the differences between the outcomes, we explicitly look at the schedules that are obtained in the Nash equilibrium (NE) of the respective games. As an example, Figure 4 shows these schedules for day 5 of week 38 (Figure 3(b), cf. Figure 3 in [13]) together with the aggregated load and aggregated SOC above it. Each row illustrates the equilibrium schedule of one household.
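The week-long simulation with carried-over SOC values can be sketched as follows. This is only an illustration of the chaining of scheduling periods: `schedule_day` is a hypothetical stand-in for the game solver, `toy_scheduler` is a made-up placeholder that merely smooths the load, and the demand values are invented.

```python
def par(load):
    """Peak-to-average ratio of a load profile."""
    return max(load) / (sum(load) / len(load))


def simulate_week(daily_forecasts, initial_soc, schedule_day):
    """Chain scheduling periods: the final SOC of one day seeds the next.

    schedule_day is a hypothetical solver taking (forecast, soc) and
    returning (aggregated_load, final_soc); it stands in for the game.
    """
    soc, results = list(initial_soc), []
    for forecast in daily_forecasts:
        load, soc = schedule_day(forecast, soc)
        results.append((par(forecast), par(load)))
    return results, soc


def toy_scheduler(forecast, soc):
    """Placeholder, not the game: moves each interval halfway to the mean."""
    mean = sum(forecast) / len(forecast)
    return [0.5 * (x + mean) for x in forecast], soc


day = [10, 12, 20, 30, 18, 14, 10, 9, 15, 25, 22, 11]   # T = 12 intervals, invented
results, final_soc = simulate_week([day] * 7, [0.0] * 25, toy_scheduler)
```

Each entry of `results` holds the PAR before and after scheduling for one day, from which a relative PAR reduction can be computed.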

Influence of Participation Rate and Forecasting Errors
The question of how many participants are needed to obtain considerable gains in terms of PAR reduction and savings is important. Moreover, within this subsection the robustness with respect to the forecasting errors (cf. Section 2.2.4) is shown. To do so, we assume the forecasting error for the demand to be d = 8% for every household [1]. This is independent of whether the household participates in the DSM scheme or not. The forecasting error for the solar generation is set to w = 10% in accordance with [3]. Note that only participants of the DSM scheme are equipped with PV cells and thus subject to this forecasting error. The values are taken to represent a worst-case scenario. Consequently, any real-world scheduling result should fall in the interval between the worst-case outcome and the respective outcome without any forecasting error.
We simulate a full year and average over the obtained PAR values for the individual days. All participants are equipped with a lithium-ion battery (cf. Table 1) and a solar cell. The size of the PV installation depends on the user's category. For LOW, BASE, and HIGH consumers, we use p n = 0.3, p n = 0.5, and p n = 0.7, respectively. Starting with all 25 households taking part in the DSM scheme, we eliminated three users, i.e. one randomly selected from each consumer category, in each subsequent run. Non-participants still exhibit the specified forecasting error for their demand. Figure 5 shows the reduction of the PAR value over the rate of participating consumers for the scenarios with and without forecasting errors. It includes not only the mean values, but also the standard deviation. Note that we slightly shifted the results for both runs along the abscissa to increase readability. An additional axis on the left indicates the absolute PAR values. Whereas the PAR reduction is the interest of the UC, the financial rewards, i.e. savings off the energy bill, are the interests of the participants of the DSM scheme. Figure 6 shows the average saving per day for all participants both with and without forecasting error. For further insight, it also illustrates the difference between the two curves.

Fig. 4 (caption, partly recovered) (a) The underlying game structure is a static non-cooperative game from [13]. Within each interval, players can choose between four discrete decisions. (b) Here, the game structure is the dynamic game introduced in Section 3.1. Note that the schedules employ the same scaling.
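The elimination procedure for the participation-rate study can be sketched as follows. The household identifiers, the fixed random seed, and the handling of a category that has run empty are assumptions of this sketch; the text does not specify them.

```python
import random


def participation_runs(categories, seed=0):
    """Yield the list of participants for each simulation run.

    categories maps a consumer type to its household ids. Starting from
    full participation, one randomly chosen household per non-empty
    category is removed before each subsequent run.
    """
    rng = random.Random(seed)
    pools = {name: list(ids) for name, ids in categories.items()}
    while any(pools.values()):
        yield [h for ids in pools.values() for h in ids]
        for ids in pools.values():
            if ids:
                ids.pop(rng.randrange(len(ids)))


neighbourhood = {
    "LOW": [f"L{i}" for i in range(7)],
    "BASE": [f"B{i}" for i in range(9)],
    "HIGH": [f"H{i}" for i in range(9)],
}
M = 25
rates = [100 * len(p) // M for p in participation_runs(neighbourhood)]
```

With 25 households and three removals per run, the first participation rates are 100%, 88%, 76%, 64% and 52%, which matches the rates of 76% and 52% quoted in the discussion.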

Consumer Type Dependency
The results in Section 4.2.1 and Section 4.2.2 are all based on a neighbourhood consisting of a mix of the three different consumer types (LOW, BASE, HIGH). Figure 7 shows the possible PAR reductions for mono-type neighbourhoods. To allow for comparison, M = 25 is kept constant. Furthermore, we use the same forecasting errors of d = 8% and w = 10% for the demand and renewable energy generation, respectively (cf. Section 4.2.2). All the simulations consider a scheduling period of a full year. We also calculated the average savings that are achieved by the participants of the DSM scheme. These results are presented in Figure 8 together with the reference of a mixed neighbourhood (cf. Figure 6) with forecasting errors.

Fig. 6 Savings dependency on the participation rate. The mean bill reduction in per cent for participants of the demand-side management scheme is plotted over the participation rate in per cent. In addition to the average over 365 days and participants, the standard deviation between different participants is shown for each data point. The simulations were run for a scenario with forecasting errors and one without forecasting errors. The difference between the two curves is plotted against the right-hand axis.

Discussion
For each subsection of the results (cf. Section 4.2), there is a corresponding subsection discussing these outcomes.

Fig. 7 Peak-to-average ratio (PAR) reduction for different neighbourhoods. The mean PAR reduction in per cent is plotted over the participation rate in per cent for different mono-type consumer neighbourhoods. In addition to the average over 365 days, the standard deviation is shown for each data point. For comparison, the results of a mixed neighbourhood (cf. Figure 5) are also presented. Note that the data points are slightly shifted along the abscissa to increase readability.

Fig. 8 Savings for different neighbourhoods. The mean bill reduction in per cent for participants of the demand-side management scheme is plotted over the participation rate in per cent for different mono-type consumer neighbourhoods. In addition to the average over 365 days and participants, the standard deviation between different participants is shown for each data point. For comparison, the results of a mixed neighbourhood (cf. Figure 6) are also presented.

Comparison Between a Static and a Dynamic DSM scheme
Comparing the aggregated load curves (cf. Figure 3) shows that a DSM scheme based on a dynamic game can achieve an almost flat profile. Nevertheless, depending on the given data, the outcome of the scheduling is subject to a finite-horizon effect. Empirically, we observe peaks and troughs at the end of the scheduling period if the demand for the final interval is lower than the average demand of the whole day. This indicates that the starting time of the DSM scheme has an influence on the achievable outcome. Nonetheless, this parameter is fixed through the DSM scheme protocol, thus asking for alternative solutions to the finite-horizon effect. Future work will aim to eliminate the influence of the starting time altogether.
In Table 2, we observe that on average the dynamic game reduces the PAR value more than twice as much as the static game. However, with respect to the individual weeks the static game shows a smaller standard deviation of 0.04 and thus seems to be more consistent. Its achieved reductions all lie between 10.4% and 15.3%, while the reductions achieved by the DSM scheme with the dynamic game range from 23.9% to 40.9%. The difference between the standard deviations is again owed to the finite-horizon effect. The effect is also present in the case of the static game but, because the outcome is generally worse, it does not alter the results as much as it does for the dynamic scheduling game.
We can further understand the differences between the static and dynamic game from Figure 4. The restriction to four discrete options for each interval in the static case, i.e. (i) remain idle, (ii) charge half interval, (iii) charge full interval, and (iv) use battery, results in a majority of intervals where the battery remains idle. This is because of a lack of incentive to charge the battery by the two given amounts. In the dynamic game, players can choose to charge their battery from a continuous spectrum of decisions in a given interval. This difference becomes most apparent when looking at the aggregated SOC of all participants. Whereas the maximal SOC in the static case is approximately 64 kWh, almost twice as much (120 kWh) is charged in the dynamic case. In summary, it shows that the increased flexibility of the dynamic game is better suited to minimise the PAR of the aggregated load.

Influence of Participation Rate and Forecasting Errors
Although a worst-case scenario is simulated, the outcomes with respect to PAR reduction (cf. Figure 5) and electricity bill reduction (cf. Figure 6) show considerable gains for the UC and the participants of the DSM scheme.
Without forecasting error, the PAR reduction improves monotonically with the proportion of participants. This stands in contrast to the results shown in [18], where a minimum is reached at a medium-range participation rate. In comparison to other studies, such as [8,5], we conclude that our dynamic game performs as well as their respective scheduling approaches. At 100% participation rate, a reduction of −33.3% (5.8%) is achieved, in agreement with the results shown in Section 4.2.1. It should be noted that a perfectly flat load profile corresponds to an approximately −40% reduction of the PAR. Thus the outcome is close to the theoretical optimum. When looking at the standard deviation, we observe that it is lowest for the simulation run with 52% participation rate and increases towards both ends of the spectrum. At the lower end of the participation rate, the fluctuations of the PAR value between different days are just an artefact of the data set in use. Small numbers of participants do not have enough influence on the overall neighbourhood to change this. At large participation rates, the PAR value is considerably reduced. The increase of the standard deviation for these runs stems directly from the finite-horizon effect already discussed in Section 4.3.1.
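The link between a flat profile and the quoted value of about −40% can be made explicit. A perfectly flat load has a PAR of 1, so the relative change with respect to the unscheduled value, written here as $\mathrm{PAR}_0$ (a symbol introduced only for this illustration), follows from simple arithmetic. Since the reported reductions are averages over many days, the numbers below are indicative only:

```latex
\Delta_{\mathrm{flat}} \;=\; \frac{1 - \mathrm{PAR}_0}{\mathrm{PAR}_0} \;=\; \frac{1}{\mathrm{PAR}_0} - 1,
\qquad
\Delta_{\mathrm{flat}} \approx -0.40
\;\Longrightarrow\;
\mathrm{PAR}_0 \approx \frac{1}{0.60} \approx 1.67 .
```

Under this reading, the achieved reduction of −33.3% corresponds to a PAR of roughly $0.667 \times 1.67 \approx 1.11$, compared with the ideal value of 1.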
The results for runs with forecasting errors follow the results without errors closely. For low participation rates the difference is negligible, but it starts to increase when more households participate in the DSM scheme. Nevertheless, even in the worst-case scenario, a reduction of −27.8% (8.9%) is achieved at 100% participation rate. With respect to the standard deviation, we again recognise similarities to the runs without forecasting errors. The smallest variations in the PAR reduction are obtained for participation rates around 50%, while we again see increasing variations at high participation rates. Here, the increase is distinctly larger than in the runs without errors. This difference is directly explained by the forecasting error: as more participants join the DSM scheme, the absolute deviation from the actual demand and production increases.
It is worth noting that the result for a participation rate of 76%, i.e. a reduction of −27.7% (4.4%), is very promising from a practical point of view. The UC might not be able to convince everybody to participate in the DSM scheme, but can still gain reductions of the PAR value close to what is achievable at maximum participation.
The savings that participants of the DSM scheme can gain increase monotonically with the share of participants. Furthermore, we observe that the variations between different participants are negligible. This is due to the particular proportional billing scheme employed (cf. Section 2.3). It ensures fairness in the sense that LOW and HIGH consumers can gain equally by signing up for the DSM scheme. The difference between runs with and without forecasting errors reveals that the forecasting error does not influence the bill reduction to a great extent. Since the two curves are almost indistinguishable to the naked eye, the difference is shown in the same plot (cf. Figure 6). It becomes clear that the difference actually decreases for larger numbers of participants.
This highlights that the dynamic scheduling game ensures robust and beneficial results for the participants of the DSM scheme, even in the worst-case scenario.

Consumer Type Dependency
When comparing different compositions of neighbourhoods, we can gain further insight into the conditions for which the DSM scheme works most efficiently. At first glance, Figure 7 reveals that given a low rate of participants in the scheme, the actual type of consumer is not crucial. Figure 9 shows the difference between the respective results for mono-type neighbourhoods and a mixed neighbourhood. A closer look shows that mono-LOW communities are always worse in reducing the PAR value of the aggregated load than any of the other ones. The results in terms of both the mean PAR reduction and the standard deviation get even worse with more than two thirds of households participating in the scheme. Similar observations for mono-type neighbourhoods are found in [18].
For both mono-BASE and mono-HIGH neighbourhoods it can be observed that they perform slightly better (< 1%) than the mixed neighbourhood in an interval of medium participation rates. Nevertheless, at N = M the obtained PAR reduction is smaller by 1.8% and 4.5%, respectively. Considering the variation of these mean PAR reduction values, it becomes clear that it is most beneficial to have a mixed-consumer neighbourhood. Figure 8 shows the average bill reduction for the participants of the DSM scheme for different participation rates. Generally, the curves show the same behaviour already observed in Figure 6. The influence of the proportionality factor in the billing scheme (14) is clearly visible. Although the mixed-consumer neighbourhood achieves a better PAR reduction, its average savings are almost identical to those of a mono-BASE neighbourhood. A neighbourhood that purely consists of HIGH consumers can save about 11% off the energy bill and is consistently the most rewarding for the participants, independent of the participation rate.

Conclusion
In this paper, we proposed a demand-side management (DSM) scheme based on a discrete time dynamic game. Its purpose is to reduce the peak-to-average ratio (PAR) of the aggregated electricity load by scheduling the usage of individually owned (lithium-ion) energy storage systems. The utility company running the scheme incentivises users to take part by offering fair financial benefits. To ensure realistic outcomes, an advanced battery model is employed. Furthermore, the integration of local energy generation in the form of photovoltaic cells is taken into account.
The DSM scheme is suitable for real-world implementation for four reasons: Firstly, it is based on a complete model of the neighbourhood including storage systems, local energy generation, and, crucially, forecasting errors of both demand and generation. Secondly, the computational costs to obtain schedules for the upcoming period are small and only little memory is required. This was achieved by deriving a closed-form solution for the best-response problem of an individual player. The ensuing iterative algorithm seems to converge exponentially towards the Nash equilibrium and thus obtains the strategy profiles for one scheduling period in a fraction of a second. Thirdly, the resulting schedules are robust with respect to the worst-case forecasting errors. Whereas the error weakens the PAR reduction by at most 5.5%, the corresponding savings off the energy bill for the participants of the scheme are hardly changed. Fourthly, we provide evidence that a neighbourhood that consists of various types of consumers performs best in such a DSM scheme. Since a mixed community is more probable than a mono-type community, this is a promising result.
A direct comparison to a DSM scheme with an underlying static game revealed the advantages of the dynamic game approach. Players are overall more active and thus able to achieve distinctly better results.
In future work, we plan to corroborate our results with a full probabilistic analysis of the forecasting errors. Moreover, we will extend the game-theoretic model to be able to deal with the finite-horizon effect. This will eventually lead to a scheduling mechanism which is insensitive to the starting time of the protocol.

[...] state-of-the-art technology. In this particular simulation run, it achieves a peak-to-average ratio reduction similar to the best in the field. The savings off the energy bill are also close to those of the best competitors.

A.2 Convergence Behaviour of the Algorithm
Let us provide an insight into the convergence behaviour of Algorithm 2.
Results: The condition that needs to be fulfilled to declare equilibrium is stated as 'there exists no player n for whom his current action a n is not a best response to the actions a −n of the other players' (cf. Algorithm 2, line 1). The action profile of each iteration also determines the energy bills of the participants. In Figure 11, the absolute change of the average bill B = (1/N) Σ_n B_n (cf. (14)) is shown for a randomly selected day of the simulation shown in Section 4.2.2. To cover the large scale of different changes, a logarithmic representation is chosen. The respective sign of the change is then expressed in the colour of the bar. Figure 12 shows how the average number of iterations per day depends on the number of participants in the DSM scheme. Furthermore, it reveals the influence of the error on the iteration statistics. The values are again taken from the simulations in Section 4.2.2.
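The equilibrium condition quoted above can be illustrated with a generic best-response loop. This sketch does not reproduce Algorithm 2: actions are single numbers rather than schedules, `toy_best_response` minimises a made-up quadratic cost rather than the bill resulting from (13) and (14), and the tolerance `tol` is an assumption.

```python
def best_response_iteration(actions, best_response, tol=1e-9, max_iter=1000):
    """Sequentially update players until nobody can improve by more than tol."""
    actions = list(actions)
    for iteration in range(1, max_iter + 1):
        changed = False
        for n in range(len(actions)):
            new_action = best_response(n, actions)
            if abs(new_action - actions[n]) > tol:
                actions[n] = new_action
                changed = True
        if not changed:          # every action is a best response: equilibrium
            return actions, iteration
    raise RuntimeError("no equilibrium reached within max_iter iterations")


def toy_best_response(n, actions, target=4.0):
    """Minimiser of (x + others - target)^2 + x^2 over x (toy cost only)."""
    others = sum(actions) - actions[n]
    return (target - others) / 2.0


equilibrium, iterations = best_response_iteration([0.0, 0.0, 0.0], toy_best_response)
```

For this toy cost with three players, the fixed point is `target / 4 = 1.0` for every player, so the loop stops once no action changes by more than the tolerance.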

Discussion: The results give evidence of a correctly working iteration algorithm (cf. Algorithm 2). From Figure 11 we see that the absolute change of the average electricity bill between two consecutive iterations decreases monotonically. Furthermore, we observe that this decrease is almost linear in the semi-logarithmic plot, hinting towards an exponential relationship. Due to the exponential convergence towards the Nash equilibrium, only a few iterations are needed to obtain the equilibrium schedules. The specific number of iterations depends on the number of participants taking part in the DSM scheme and is comparable to the numbers shown in [8]. Figure 12 shows that the average number of iterations increases monotonically with the number of participants. Moreover, the variation of the number of iterations across individual scheduling periods is small, as shown by the standard deviation. This is a strong result, as it shows that the convergence properties are insensitive to different demand data of the individual participants. Additionally, when comparing the number of iterations for the two scenarios, (i) with forecasting error for demand and generation and (ii) without any forecasting error, it becomes clear that the convergence behaviour is hardly influenced by the error. The biggest difference is observed for 100% participation rate, where it amounts to approximately 0.5%.
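The reasoning from a straight line in the semi-logarithmic plot to exponential convergence can be checked numerically with a least-squares fit. The sketch below uses synthetic values, not the data behind Figure 11; the function name `convergence_rate` is hypothetical, and the inputs are assumed to be strictly positive with at least two entries.

```python
import math


def convergence_rate(abs_changes):
    """Least-squares slope of ln|change| against the iteration index.

    A roughly constant negative slope means |change_k| ~ C * exp(slope * k),
    i.e. a straight line in a semi-logarithmic plot.
    """
    ks = range(len(abs_changes))
    ys = [math.log(c) for c in abs_changes]
    k_mean = sum(ks) / len(ys)
    y_mean = sum(ys) / len(ys)
    num = sum((k - k_mean) * (y - y_mean) for k, y in zip(ks, ys))
    den = sum((k - k_mean) ** 2 for k in ks)
    return num / den


# Synthetic changes that shrink by a factor of 10 per iteration (not paper data).
changes = [10.0 ** (-k) for k in range(6)]
rate = convergence_rate(changes)
```

For the synthetic series the fitted slope equals `-ln(10)`, i.e. one decade per iteration; applied to measured bill changes, a similarly stable slope would support the exponential reading.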
The small number of iterations directly translates to short computation times and thus does not hinder a real-world application. Typical 365-day simulation runs take about 30 s on a single core of an i7-3770S CPU and require less than 1 GB of memory. Note that in the real-life scenario, the scheduling process is initiated once before the scheduling period and only needs to calculate the equilibrium schedules for the upcoming day. In summary, we expect no difficulties in implementing a DSM scheme based on our scheduling software selma.