1 Introduction

Temporal and spatial aggregation are both important tools for modeling large-scale energy systems [1,2,3,4]. This is also the case for hydropower [3, 5, 6]. The aggregation of hydropower poses additional challenges, as river systems naturally create linked temporal and spatial connections between the hydropower stations [7]. Hydropower also plays an important role in many energy systems worldwide [8]. Thanks to its flexibility across several time scales, and thus its power balancing capabilities, hydropower can help substantially with the integration of more variable renewable energy (VRE), such as solar and wind [8, 9]. This means that extra care is warranted when aggregating the hydropower within a geographical area. Figure 1 illustrates how different areas can be defined within the context of an energy system model. This is, e.g., the set-up in Europe, where the trading and bidding system is divided into several geographical areas.

Fig. 1

Illustration of model context with multiple areas connected by transmission lines and connections to e.g. the heat sector

Various simplification and aggregation methods for the power production within the different areas are used today, both in industry [10,11,12] and academia [13,14,15,16,17]. Examples of aggregated hydropower specifically are found in [3, 6, 18,19,20]. Most commonly, the aggregations in energy system models are rather straightforward and based on historical data and directly measurable characteristics of the hydropower system in question. These can work well for the current system in the climate of today, but care must be taken to ensure that the flexibility is not overestimated [3]. However, it is reasonable to question whether these Direct aggregations can be used to reflect a future system with a different energy mix and more VREs, or whether they can capture how the hydrosystem would operate in a changing climate. Recent research such as [21], concerning the EMPS model, has included modifications to the feasibility set of the aggregated model based on a more detailed description of the real hydropower system. Others, such as [20], utilize a genetic algorithm to aggregate the stations within an area in a smarter way based on some selected properties. Moreover, in e.g. [22, 23], operating rule curves for aggregated versions of cascading hydropower systems are used.

To ensure that the simplified hydropower model mimics the real hydropower within the area also in a changing climate, or in a changed system with more VREs, one can use so-called hydropower Equivalent models. In this approach, instead of aggregating the existing hydropower stations within the area, one computes new Equivalent stations with the aim of matching the power production of a more detailed model of the Original system [5, 24,25,26]. In contrast to the Direct aggregation, this means that not all directly measurable characteristics of the system are necessarily the same in the Equivalent station(s) as in the real-world Original system. The main focus lies instead on the power production simulation results. The Equivalent models can be computed via a bilevel optimization problem with the aim of minimizing the difference in power production between the Equivalent and Original models [5].

The definition of an Equivalent model considered in this paper is twofold. One part is the mathematical optimization model, and the other part is the set of parameters that make up the characteristics of the Equivalent hydrosystem. These parameters include limits on maximum and minimum reservoir content, discharge etc. For a more detailed description of the Equivalent model, see Sect. 2.2.

1.1 Peak hour matching challenges

Previous research has shown that Equivalent models with good accuracy in average power production are possible to compute; see, for example, [26, 27]. However, hours with peak power production can be particularly challenging to capture accurately. Direct aggregations can at times overestimate the production peaks, as their maximum peaks are based on historical data and not average behaviors; see, for example, the baseline aggregation from [27]. In contrast, previous Equivalent models have underestimated the highest production peaks, as this seems to increase the average power production performance [24]. In Fig. 2, this problem is illustrated using example power curves.

Fig. 2

Illustration of the challenge with accuracy during hours of peak power production

Accurately capturing the power production during peak hours is important for many reasons. For example, in investment studies, if the peak production in the hydropower systems is underestimated, sub-optimal investment decisions might be taken, or an over-investment might be required to cover the highest peaks. Similarly, the balancing capabilities of the hydrosystem might be underestimated, resulting in lower investments in VREs such as wind and solar than are actually possible. Some additional problems that can arise from inaccurate aggregations and/or misrepresentations of the peak capacity within an area are discussed in [9, 28,29,30]. In other words, it is vital for the Equivalent models to better capture these peaks.

1.2 Contributions

In this paper, the focus lies on performance of the system area Equivalents during peak hours. To increase this performance and raise the accuracy in power production simulation during these hours, a new method is developed. In order to evaluate this new method it is also compared with two older methods to compute the segmentation and two alternative new methods also designed to increase accuracy during peak hours. Thus, the main contributions can be summarized as:

  • Novel method to increase accuracy during peak hours

  • Comparison with two older methods and two alternative methods

  • Analysis of the trade-off between peak hour performance and average hourly performance

Here it is shown that the novel proposed method significantly increases the accuracy of the Equivalent model during peak hours, and that the new method clearly outperforms both the old and the alternative methods in this aspect.

1.3 Organization

Next, the Equivalent hydropower optimization model used in this paper is presented together with a list of the mathematical symbols used. In Sect. 3, the proposed new method along with the two older versions and the new alternative methods are described. Section 4 includes a description of the case-study used to evaluate the methods from Sect. 3. The results and discussion are presented in Sect. 5. Finally, the conclusions are included in Sect. 6.

2 Hydropower models

Both the Original detailed model of the real hydropower system and the simpler Equivalent model are formulated as optimization problems. The aim of the modeling presented here is that the hydropower Equivalent should mimic the operation of the real Original system as closely as possible. This means that for the same price profile (obtained from the rest of the simulated whole system), the hydropower Equivalent and the real Original hydropower system should have, as closely as possible, the same total energy production per hour.

This implies optimization formulations where the goal is income maximization for the hydrosystem and where prices are inputs. The optimization formulation is either for the Equivalent or for the real Original hydropower system, and the aim is that the resulting production curves should be as close as possible. Therefore, both models are mathematically similar and differ in only a few constraints. The main differences are found in the complexity of the models. First, the Original model considers water flow times between stations and water court decisions regulating the operation of individual hydropower stations and reservoirs. The Equivalent model instead includes two specific power ramping constraints. Further, the topology of the Equivalent model is simpler and considers only one or a few stations. As the models are similar and the main focus of this paper is on the Equivalent model computation, only the full mathematical problem formulation of the Equivalent is included here, see model (1) in Sect. 2.2, while the Original model is included in Appendix 1. Moreover, the nomenclature used is shown in Sect. 2.1.

2.1 List of symbols

Here, the nomenclature used in this paper is described. The superscript O denotes the detailed Original hydromodel and the superscript E denotes the Equivalent. An underline symbolizes a minimum value, while an overhead bar symbolizes a maximum value. A capital letter corresponding to the lower-case letter of a variable represents a parametric maximum or minimum value related to that variable. Moreover, note that parameters in bold are considered to be variables in the upper level of the bilevel problem presented in Sect. 2.3.

2.1.1 Indices and sets

  • \(\mathbb {I^{O/E}}\): Set of stations from 1 to \(I^{O/E}\), index i and j

  • \(\mathbb {K^{O/E}}\): Set of discharge segments, 1 to \(K^{O/E}\), index k

  • \({\mathbb {W}}\): Set of scenarios from 1 to W, index w

  • \({\mathbb {T}}\): Set of time periods from 1 to T, index t

  • \({\mathbb {X}}^E_P\): Set of unknown parameters in the Equivalent problem formulation

  • \({\mathbb {X}}^E_v\): Set of variables in the Equivalent problem formulation

  • \(A_{i}^{d,E}\): Set of all downstream stations for station i including itself

  • \(A_{i}^{u,E}\): Set of directly upstream stations for station i

2.1.2 Parameters

  • \(\pi _w\): Probability of scenario w

  • \(\lambda _{w,t}\): Expected price in scenario w and time t

  • \(E_{0,w}^O\): Total initial energy of the system in scenario w

  • \(E_{T,w}\): Minimum total energy of the system in scenario w

  • \(\varvec{\alpha _i^E}\): Share of total initial energy in reservoir i

  • \(\varvec{\mu _{i,k}^{O/E}}\): Marginal production function for station i, segment k

  • \(\varvec{V_{i,w,t}^{O/E}}\): Inflow to reservoir i, time t and scenario w

  • \(\varvec{\overline{P}_{\tau }}\): Maximum ramping between \(\tau = \{1,4\}\) hours

  • \(P^O_{w,t}\): Simulated power production of the detailed Original model in scenario w and time t

  • \(\varvec{\overline{M}_{i}^E}\): Maximum content in reservoir i

  • \(\varvec{\underline{M}_{i}^E}\): Minimum content in reservoir i

  • \(\varvec{\overline{Q}_{i,k}^E}\): Maximum discharge in station i, segment k

  • \(\varvec{\underline{Q}_{i}^E}\): Minimum discharge in station i

  • \(\varvec{\overline{S}_{i}^E}\): Maximum spill in station i

  • \(\varvec{\underline{S}_{i}^E}\): Minimum spill in station i

  • \(\underline{U}/\underline{D}\): Minimum up- and down-regulating capacity required in the system

2.1.3 Variables

  • \(p_{i,w,t}^{E}\): Production in station i, scenario w and time t

  • \(m_{i,w,t}^{E}\): Content in reservoir i, scenario w and time t

  • \(q_{i,k,w,t}^{E}\): Discharge in station i, segment k, scenario w, time t

  • \(s_{i,w,t}^{E}\): Spill in station i, scenario w and time t

  • \(d_{i,w,t}^{E}\): Down-regulating capacity provided in station i, time t and scenario w

  • \(u_{i,w,t}^{E}\): Up-regulating capacity provided in station i at time t and scenario w

  • \(f^{low}/f^{up}\): Objective function of lower-/upper-level problem

  • \(\varvec{\gamma _i}\): Inflow multiplier, to modify Equivalent local inflow

2.2 Equivalent optimization model

The objective function (1a) of the Equivalent optimization model maximizes the income from sold electricity across all scenarios. The power production for each station, hour and scenario is calculated in constraint (1b). Note that \(\mu ^E_{i,k}\) decreases with each segment. This ensures that each discharge segment is fully utilized before the next one is used. Next, the hydrological balance is described in (1c). Constraint (1d) limits the allocated FCR up-regulating capacity by the installed capacity of the station. Similarly, in constraint (1e), the FCR down-regulating capacity is limited by the minimum discharge of the station. Constraint (1f) then sets the minimum requirement on the FCR up- and down-regulating capacities.

$$\begin{aligned} \max _{{\mathbb {X}}^E_v} &f^{low} = \sum _{w=1}^{W} \pi _w \sum _{t=1}^{T} \lambda _{w,t} \sum _{i=1}^{I^E} p_{i,w,t}^E \end{aligned}$$
(1a)
$$\begin{aligned} \text {s.t. }&p_{i,w,t}^E = \sum _{k=1}^{K^E}\varvec{\mu _{i,k}^E} q_{i,k,w,t}^E, \; \forall i, w, t \in {\mathbb {I}}^E, {\mathbb {W}}, {\mathbb {T}}, \end{aligned}$$
(1b)
$$\begin{aligned}&m_{i,w,t}^E = m_{i,w,t-1}^E + \varvec{V_{i,w,t}^E} - \sum _{k=1}^{K^E} q_{i,k,w,t}^E - s_{i,w,t}^E + \nonumber \\&\quad + \sum _{j \in A_i^{u,E}} (\sum _{k=1}^{K^E} q_{j,k,w,t}^E + s_{j,w,t}^E), \; \forall i,w,t \in {\mathbb {I}}^E,{\mathbb {W}},{\mathbb {T}}, \end{aligned}$$
(1c)
$$\begin{aligned}&u_{i,w,t}^E + p_{i,w,t}^E \le \sum _{k=1}^{K^E} \varvec{\mu _{i,k}^E} \varvec{\overline{Q}_{i,k}^E} \; \forall i,w,t \in {\mathbb {I}}^E,{\mathbb {W}},{\mathbb {T}}, \end{aligned}$$
(1d)
$$\begin{aligned}&p_{i,w,t}^E - d_{i,w,t}^E \ge \varvec{\mu _{i,1}^E\underline{Q}^E_{i}} \; \forall i,w,t \in {\mathbb {I}}^E,{\mathbb {W}},{\mathbb {T}}, \end{aligned}$$
(1e)
$$\begin{aligned}&\sum _{i=1}^{I^E} u_{i,w,t}^E \ge \underline{U}, \; \sum _{i=1}^{I^E} d_{i,w,t}^E \ge \underline{D}, \; \forall t \in {\mathbb {T}}, \ \end{aligned}$$
(1f)
$$\begin{aligned}&\sum _{j\in A_i^{d,E}} m_{i,w,0}^E \varvec{\mu _{j,1}^E} = \varvec{\alpha _i^E}E_{0,w}^O, \; \forall i \in {\mathbb {I}}^E, w \in {\mathbb {W}}, \end{aligned}$$
(1g)
$$\begin{aligned}&\sum _{i=1}^{I^E}\sum _{j\in A_i^{d,E}} m_{i,w,T}^E\mu _{j,1}^E \ge E_{T,w}, \; \forall w \in {\mathbb {W}}, \end{aligned}$$
(1h)
$$\begin{aligned}&-\varvec{\overline{P}_\tau } \le \sum _{i=1}^{I^E} p_{i,w,t}^E - \sum _{i=1}^{I^E} p_{i,w,t-\tau }^E \le \varvec{\overline{P}_\tau }, \; \forall w, t \in {\mathbb {W}}, {\mathbb {T}}, \end{aligned}$$
(1i)
$$\begin{aligned}&0 \le q_{i,k,w,t}^E \le \varvec{\overline{Q}_{i,k}^E}, \; \forall i,k,w,t \in {\mathbb {I}}^E, {\mathbb {K}}^E, {\mathbb {W}}, {\mathbb {T}}, \end{aligned}$$
(1j)
$$\begin{aligned}&\varvec{\underline{Q}_{i}^E} \le \sum _{k=1}^{K^E} q_{i,k,w,t}^E \; \forall i,w,t \in {\mathbb {I}}^E, {\mathbb {W}}, {\mathbb {T}}, \end{aligned}$$
(1k)
$$\begin{aligned}&\varvec{\underline{M}_{i}^E} \le m_{i,w,t}^E \le \varvec{\overline{M}_{i}^E}, \; \forall i, w, t \in {\mathbb {I}}^E, {\mathbb {W}}, {\mathbb {T}}, \end{aligned}$$
(1l)
$$\begin{aligned}&\varvec{\underline{S}_{i}^E} \le s_{i,w,t}^E \le \varvec{\overline{S}_{i}^E}, \; \forall i, w, t \in {\mathbb {I}}^E, {\mathbb {W}}, {\mathbb {T}}. \end{aligned}$$
(1m)

In constraint (1g), the start content of each Equivalent reservoir is calculated as a share \(\varvec{\alpha }_i^E\) of the total initial energy content in the Original system. The end reservoir content is limited by constraint (1h) to be at least a specified value. As these constraints limit the total energy content in the reservoirs, the water content \(m_{i,w,T}^E\) needs to be multiplied with \(\varvec{\mu _{j,1}^E}\). The ramping limits for the Equivalent hydrosystem are shown in constraint (1i) and cover both an increase and a decrease in production over \(\tau = \{1,4\}\) hours. This constraint is included to account for real limitations in the Original system due to its more complex topology. Finally, the minimum and maximum limits for the variables are included in constraints (1j)–(1m).
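To make the role of the discharge segments and the ramping limits more concrete, the following minimal Python sketch evaluates the production of one station for a given total discharge, in the spirit of (1b) and (1j), and checks a production series against ramping limits in the spirit of (1i). It is not the implementation used in this paper: the function names `segment_production` and `ramping_ok` and all numerical values are hypothetical, the segments are filled in order under the assumption that \(\mu ^E_{i,k}\) is non-increasing, and hours with \(t \le \tau\) are simply skipped in the ramping check.

```python
from typing import Dict, Sequence


def segment_production(total_discharge: float,
                       mu: Sequence[float],
                       q_max: Sequence[float]) -> float:
    """Power for a total discharge, filling the segments in order (cf. (1b), (1j))."""
    # Filling in order is only production-maximizing if mu is non-increasing.
    if any(later > earlier for earlier, later in zip(mu, mu[1:])):
        raise ValueError("mu must be non-increasing across segments")
    remaining = total_discharge
    power = 0.0
    for mu_k, q_k in zip(mu, q_max):
        q_used = min(remaining, q_k)  # discharge placed in segment k
        power += mu_k * q_used
        remaining -= q_used
    if remaining > 1e-9:
        raise ValueError("total discharge exceeds the sum of the segment limits")
    return power


def ramping_ok(total_power: Sequence[float], p_ramp: Dict[int, float]) -> bool:
    """Check |P_t - P_{t-tau}| <= p_ramp[tau] for each tau and each t >= tau (cf. (1i))."""
    for tau, limit in p_ramp.items():
        for t in range(tau, len(total_power)):
            if abs(total_power[t] - total_power[t - tau]) > limit:
                return False
    return True


# Hypothetical two-segment station: the second segment is less efficient.
print(segment_production(120.0, mu=[1.0, 0.5], q_max=[100.0, 50.0]))
# Hypothetical hourly production series checked against 1 h and 4 h limits.
print(ramping_ok([0.0, 50.0, 100.0, 150.0, 200.0], {1: 60.0, 4: 180.0}))
```

In the first call, 100 units of discharge fall in the first segment and the remaining 20 units in the less efficient second segment. In the second call, every hourly step is within the 1 h limit, but the 4 h change exceeds its limit, so the check fails.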

The variables of the Equivalent optimization problem are included in Eq. (2).

$$\begin{aligned} {\mathbb {X}}^{E}_v = \{ p_{i,w,t}^E, q_{i,k,w,t}^E, m_{i,w,t}^E, s_{i,w,t}^E, u_{i,w,t}^E, d_{i,w,t}^E \} \end{aligned}$$
(2)

For comparison, the detailed Original model that the area Equivalent aims to mimic is included in Appendix 1. The main difference between the detailed Original model used here and the Equivalent model is the complexity of the topology. The Original model has a more complex topology with more stations and more considerations such as water flow times and water court decisions.

2.3 Bilevel optimization problem

Recall that the parameter values included in the Equivalent optimization model are unknown and must be computed. The aim is to estimate these parameters so the resulting production per hour of the Equivalent optimization problem becomes as close as possible to the result of the Original optimization problem. These Equivalent parameters that need to be computed are shown in Eqs. (3a, 3b).

$$\begin{aligned}&{\mathbb {X}}^{E}_P = \{ V_{i,w,t}^{E,\text {init}}, \varvec{\mu _{i,k}^E}, \varvec{\underline{M}_i^E, \overline{M}_i^E, \underline{Q}_{i}^E, \overline{Q}_{i,k}^E}, \varvec{\underline{S}_i^E, \overline{P}_1, \overline{P}_4, \gamma _{i}^E} \}, \end{aligned}$$
(3a)
$$\begin{aligned}&\text {with } \varvec{V_{i,w,t}^E} = V_{i,w,t}^{E,\text {init}} \cdot \varvec{\gamma _i^E}. \end{aligned}$$
(3b)

Note that the parameter \(V_{i,w,t}^{E,\text {init}}\) (often together with \(\varvec{\mu _{i,k}^E}\)) is computed separately from the others. The initial guess for the Equivalent inflow is calculated based on the assumption that the energy in the total inflows to the Equivalent model and to the Original model should be the same. For more details, see [25]. The rest of the parameters in \({\mathbb {X}}^{E}_P\) are computed via a bilevel problem formulation with the aim of minimizing power production differences between the Original and Equivalent models. This means that these Equivalent parameters are determined with one main goal in mind: accuracy w.r.t. simulated power. The bilevel problem formulation used is shown in Eqs. (4a)–(4f).

$$\begin{aligned} \min _{{\mathbb {X}}^E_P \cup {\mathbb {X}}^E_v}&f^{up} = \sum _{w,t} \left( P^O_{w,t} - \sum _{i=1}^{I^E}p_{i,w,t}^E \right) ^2 \end{aligned}$$
(4a)
$$\begin{aligned} \text {s.t. }&\sum _k \varvec{\overline{Q}_{i,k}^E} \ge \varvec{\underline{Q}_{i}^E} \end{aligned}$$
(4b)
$$\begin{aligned}&\varvec{\overline{M}_{i}^E} \ge \varvec{\underline{M}_{i}^E} \end{aligned}$$
(4c)
$$\begin{aligned}&\varvec{\overline{S}_{i}^E} \ge \varvec{\underline{S}_{i}^E} \end{aligned}$$
(4d)
$$\begin{aligned}&\max _{{\mathbb {X}}^E_v} f^{low} = \sum _{w=1}^{W} \pi _w \sum _{t=1}^{T} \lambda _{w,t} \sum _{i=1}^{I^E} p_{i,w,t}^E \end{aligned}$$
(4e)
$$\begin{aligned}&\text {s.t. } \text {constraints}\, (1\text {b})-(1\text {m}). \end{aligned}$$
(4f)

Here, the upper-level objective function in (4a) is to minimize the squared difference in hourly simulated power production results from the Original detailed optimization model and from the Equivalent optimization model. The upper-level constraints (4b)–(4d) include requirements on the Equivalent parameter values to ensure that the maximum limits are greater than (or equal to) the minimum values. Finally, the lower-level problem (4e)–(4f) is the optimization problem (1) of the Equivalent.
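As a small illustration of the upper-level objective (4a), the sketch below sums the squared hourly differences between the Original production and the total Equivalent production. The function name `upper_level_objective`, the nested-list data layout (`p_orig[w][t]` and `p_equiv[i][w][t]`) and the numbers are assumptions made for this example only.

```python
from typing import Sequence


def upper_level_objective(p_orig: Sequence[Sequence[float]],
                          p_equiv: Sequence[Sequence[Sequence[float]]]) -> float:
    """Sum over scenarios w and hours t of (P^O_{w,t} - sum_i p^E_{i,w,t})^2, cf. (4a)."""
    total = 0.0
    for w, row in enumerate(p_orig):
        for t, p_o in enumerate(row):
            # Total Equivalent production over all Equivalent stations i.
            p_e = sum(station[w][t] for station in p_equiv)
            total += (p_o - p_e) ** 2
    return total


# One scenario, two hours, one Equivalent station (hypothetical values).
print(upper_level_objective([[10.0, 20.0]], [[[9.0, 22.0]]]))
```

In the bilevel setting, the values \(p^E_{i,w,t}\) would come from solving the lower-level problem (1) for a candidate parameter set, which this sketch does not attempt.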

By solving the bilevel problem (4a)–(4f), the optimal parameter values in \({\mathbb {X}}^E_P\) are computed. In this paper, the bilevel problem is mainly solved using a Particle Swarm Optimization (PSO) algorithm [31], based on modifications by [32] and adapted to the hydropower area Equivalent bilevel problem in [24]. Some further modifications have also been made for this paper to adapt the algorithm to the methods described in Sect. 3. This PSO algorithm has previously been shown to be successful at solving this kind of bilevel problem and can be used directly on the problem without any additional considerations or reformulations to ensure convexity [24].
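For readers unfamiliar with PSO, the following is a minimal textbook-style sketch in Python. It is not the modified algorithm of [24, 32] used in this paper: the function name `pso_minimize`, the parameter values (inertia 0.7, acceleration coefficients 1.5) and the toy objective are assumptions chosen for illustration. In the bilevel setting, `objective` would correspond to solving the lower-level problem for a candidate parameter vector and returning \(f^{up}\).

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(objective: Callable[[List[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_particles: int = 30,
                 n_iter: int = 100,
                 inertia: float = 0.7,
                 c_personal: float = 1.5,
                 c_global: float = 1.5,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Basic global-best PSO with positions clipped to the box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(n_iter):
        for n in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[n][d] = (inertia * vel[n][d]
                             + c_personal * r1 * (best_pos[n][d] - pos[n][d])
                             + c_global * r2 * (g_pos[d] - pos[n][d]))
                pos[n][d] = min(max(pos[n][d] + vel[n][d], lo), hi)
            val = objective(pos[n])
            if val < best_val[n]:
                best_pos[n], best_val[n] = pos[n][:], val
                if val < g_val:
                    g_pos, g_val = pos[n][:], val
    return g_pos, g_val


# Toy objective standing in for f^up: a shifted quadratic with minimum at (1, -2).
x_best, f_best = pso_minimize(lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2,
                              bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(x_best, f_best)
```

Because each objective evaluation only requires a function value, no gradient or convexity information about the lower-level problem is needed, which is in line with the remark above that the method can be applied directly to the bilevel problem.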

3 Proposed method

The main idea behind the proposed solution to the peak hour problem described in Sect. 1.1 is to add one (or more) additional segments to the marginal production function and maximum discharge parameters of an already computed Equivalent. By adding the additional segment(s) to an existing Equivalent, the peak hours can be better simulated, even if the average accuracy decreases. The main new method (N1) is described in Sect. 3.1. In order to evaluate the proposed method, it is also compared to two older methods based on [25] and [5]. These older methods are described in Sects. 3.2 and 3.3. Finally, two additional methods are evaluated, which can be seen as a mix of the new method (N1) and each of the two older methods; see Sects. 3.4 and 3.5 for more details.

3.1 New method (N1)

In short, the conceptual idea of the new method is described in Fig. 3. The method can be divided into two main steps. First, all Equivalent parameters in \({\mathbb {X}}^E_P\) are computed, based on the assumption that there is only one segment in \(\varvec{\mu _{i,1}^E}\) and \(\overline{Q}_{i,1}^E\). This means that the bilevel problem (4) is solved with \(k=1\), using a PSO algorithm. From this first step a complete Equivalent model is found. Then, in the second step, this existing Equivalent model is modified with the addition of a second segment. Moreover, in the second step, a new bilevel problem is formulated. There are three main differences between this new bilevel problem and the bilevel problem (4a)–(4f) from the first step:

  a. The upper-level variables (Equivalent parameters) in the new bilevel problem are \({\mathbb {X}}^{E,N1}_P = \{\varvec{\mu _{i,2}^E}, \varvec{\overline{Q}_{i,2}^E}, \varvec{\gamma _{i}^E}\}\); the rest of \({\mathbb {X}}^{E}_P\) are assumed to be parameters from step one.

  b. The upper-level objective function is modified according to (5).

  c. An additional upper-level constraint is added in the form of (6) to ensure that the efficiency of the second segment is lower than that of the first.

$$\begin{aligned} f^{up,N1} =&f^{up} +\sum _{w,t \in \mathbb {T^*}} \Big (\sum _{i=1}^{I^O}P_{i,w,t}^O - \sum _{i=1}^{I^E} p_{i,w,t}^E\Big )^2, \nonumber \\&\text {with } \mathbb {T^*} = \{t \in {\mathbb {T}} : P_{w,t}^O \ge x_k \cdot \overline{P}_{w,t}^O\}. \end{aligned}$$
(5)
$$\begin{aligned} \mu _{i,k}^E&\le \mu _{i,k-1}^E, \; \forall k=2,\ldots ,K^E, \forall i\in {\mathbb {I}}^E \end{aligned}$$
(6)

The new method (N1) allows for an optimized partition and efficiency of the k segments in the marginal production function \(\mu\) with respect to peak performance. The hypothesis is that the second segment will have a significantly decreased efficiency, so that this segment is not used too liberally, thereby avoiding overestimations. The second step of (N1) can also be repeated for a higher number of segments, adding one new segment at a time. The same basic idea is used, where a new bilevel problem with \(f^{up, N1}\) is solved for each additional segment k. In this paper, \(k=1,2,3\) are investigated. For the second segment, \(x_2=\{0.8, 0.85, 0.9\}\) are used, and for the third segment, \(x_3=\{0.95, 0.98\}\).
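The effect of the modified objective (5) can be sketched as follows: the squared error in hours belonging to \(\mathbb {T^*}\) is counted twice, once in \(f^{up}\) and once in the additional peak term. The Python sketch below is illustrative only. The function names and data layout are hypothetical, the Equivalent production is assumed to be already summed over stations, and the reference peak level is passed as a single scalar `p_max`, which is one possible reading of \(\overline{P}^O_{w,t}\).

```python
from typing import List, Sequence


def peak_hours(p_orig_w: Sequence[float], x_k: float, p_max: float) -> List[int]:
    """Indices t with P^O_{w,t} >= x_k * p_max for one scenario w."""
    return [t for t, p in enumerate(p_orig_w) if p >= x_k * p_max]


def upper_level_objective_n1(p_orig: Sequence[Sequence[float]],
                             p_equiv_total: Sequence[Sequence[float]],
                             x_k: float,
                             p_max: float) -> float:
    """f^up plus the additional squared-error term over peak hours, cf. (5)."""
    base = 0.0  # f^up: all hours
    peak = 0.0  # additional term: peak hours only
    for w, row in enumerate(p_orig):
        peaks = set(peak_hours(row, x_k, p_max))
        for t, p_o in enumerate(row):
            err = (p_o - p_equiv_total[w][t]) ** 2
            base += err
            if t in peaks:
                peak += err
    return base + peak


# One hypothetical scenario with three hours; the last two are peak hours for x_k = 0.85.
p_o = [[50.0, 90.0, 100.0]]
p_e = [[48.0, 85.0, 90.0]]
print(peak_hours(p_o[0], 0.85, 100.0))
print(upper_level_objective_n1(p_o, p_e, 0.85, 100.0))
```

A lower \(x_k\) enlarges the set of hours that receive the extra weight, which is why several threshold values are compared in the case study.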

Fig. 3

Conceptual idea of new method (N1) with multiple segments in the marginal production function \(\mu\)

3.2 Old method (O1)

In this old method (O1), also investigated in [5], both segments of the Equivalent marginal production function are computed at the same time within the bilevel problem formulation in (4a)–(4f), with the only requirement that the second segment has a lower or equal efficiency compared to the first segment. The additional requirement is included as an upper-level constraint, the same as in Eq. (6).

If an Equivalent with only one segment would be computed using this method it would be the same as the Equivalent from step 1 of (N1).

3.3 Old method (O2)

This method is also illustrated in Fig. 4.

Fig. 4

Idea behind old method (O2) to compute the marginal production function \(\mu\)

The second old method (O2) is based on the method used in [25]. In [25], the marginal production function \(\varvec{\mu _{i,k}^E}\) was computed outside of the bilevel problem formulation. This is also the case here; however, an additional assumption is included, namely that \(\mu ^E_{1}\) is equal to the average of \(\mu ^O_{i,1}\). The following segments of the marginal production function are then computed using a fixed method. For \(k=2\) segments in total, the second segment is calculated by following these steps:

  1. Calculate \(\mu _{i,1}^E\) and \(\overline{Q}_{i}\),

  2. Set \(\mu _{i,2}^E\) = \(0.99 \cdot \mu _{i,1}^E\) or \(\mu _{i,2}^E\) = \(0.95 \cdot \mu _{i,1}^E\),

  3. Adjust \(\overline{Q}_{i}\) so that \(\overline{Q}_{i,1}\) = \(0.75\cdot \overline{Q}_{i}\) and \(\overline{Q}_{i,2}\) = \(0.25\cdot \overline{Q}_{i}\). Thus \(\overline{Q}_{i,1} + \overline{Q}_{i,2} = \overline{Q}_{i}\).

In [25], the second segment of the marginal production function was assumed to be 99% of the first segment. In this paper, the assumption that the second segment of \(\varvec{\mu _{i,k}^E}\) is 95% of the first segment is also investigated. These two versions are hereafter called ‘(O2) 95%’ and ‘(O2) 99%’.
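The fixed segmentation of steps 2 and 3 above can be written out as a short Python sketch. The function name `o2_second_segment` and the input values are hypothetical; only the ratios 0.99 or 0.95 and the 75%/25% split are taken from the description of (O2).

```python
import math
from typing import List, Tuple


def o2_second_segment(mu_1: float,
                      q_max_total: float,
                      efficiency_ratio: float = 0.99,
                      first_share: float = 0.75) -> Tuple[List[float], List[float]]:
    """Fixed two-segment split: mu_2 = ratio * mu_1, and Q split 75% / 25%."""
    mu = [mu_1, efficiency_ratio * mu_1]
    q_max = [first_share * q_max_total, (1.0 - first_share) * q_max_total]
    return mu, q_max


# '(O2) 95%' variant with hypothetical values for mu_1 and the total maximum discharge.
mu, q_max = o2_second_segment(mu_1=0.8, q_max_total=400.0, efficiency_ratio=0.95)
print(mu, q_max)
print(math.isclose(sum(q_max), 400.0))
```

Since both the efficiency ratio and the discharge split are fixed in advance, only the first-segment efficiency and the total maximum discharge remain to be determined in this method.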

3.4 Alternative method (N2)

In this method, the first segment \(\mu ^E_{1}\) is determined using the old method (O2). In other words, it is computed as the average of \(\mu ^O_{i,1}\) for the Original detailed model. Then, the second segment is computed following the new method (N1), meaning that the second segment is optimized, again considering the new upper-level objective \(f^{up, N1}\) from (5).

3.5 Alternative method (N3)

Similarly to method (O1), in (N3) both segments of the Equivalent marginal production function are computed at the same time. The main difference is that the upper-level objective used in the bilevel problem formulation is the same as in method (N1), \(f^{up,N1}\) from (5). Again, the only requirement on \(\mu ^E_{k}\) is that the efficiency in the second segment is not higher than in the first, as shown in constraint (6). Note that, in contrast to methods (N1) and (N2), here the upper-level objective \(f^{up,N1}\) is considered when computing both segments.

3.6 Main differences

The main difference between (N1) and (N2) compared to the other three methods is that in (N1) and (N2) the second (and following) segment is added, using a new bilevel problem, to an already computed Equivalent model with only one segment, thus forming a new Equivalent with two segments. The other three methods compute all segments at the same time and thus only compute one Equivalent, which then already has the full number of segments \(K^E\).

In (O1) and (N3), the segmentation of \(\overline{Q}^E_{i,k}\) and \(\mu _{i,k}^E\) is solved for in the bilevel problem, while in (O2) the segmentation is fixed by assumption and only the total \(\sum _k\overline{Q}^E_{i,k}\) is solved for in the bilevel problem. For (N3), a different upper-level objective function is used, i.e. (5), whereas (O1) and (O2) use (4a).

4 Case-study

To evaluate the proposed solution and methods from Sect. 3, a case study of the northernmost electricity trading area in Sweden, SE1, is carried out. SE1 contains two large rivers: Luleälven and Skellefteälven. These two rivers have a combined total of 30 hydropower stations and an installed capacity of almost 5.5 GW. The topology of these two rivers in SE1 is shown in Fig. 5, which also includes the topology of the system area Equivalent investigated in this case study. For this paper, the method is evaluated on a system area Equivalent with one station and corresponding reservoir (and several discharge segments).

Fig. 5

The original rivers Luleälven and Skellefteälven located in the electricity trading area SE1 in northern Sweden compared to the system area equivalent

In this case study, the temporal resolution in the optimization models for the Original and Equivalent systems is 1 h. The input data is divided into a training data set (TR) and a data set used for out-of-sample testing (TS). The training is done using a total simulation horizon of 720 h in 12 different scenarios, while the testing is done using 6 out-of-sample scenarios of the same length. The input data consists of hourly electricity prices from Nordpool [33], daily inflow values to each station in the Original system based on calculations from [34], and historical start and end contents in the reservoirs of the Original system. Here, the input data are from the months Aug-Oct for the years 2013–2018. The training data (TR) are from the years 2013–2016, and the test data (TS) from the years 2017–2018. This means that this particular model is expected to perform well for these months of the year. However, extending the training data would also extend the period in which the Equivalent model is expected to perform well. For more information about the Equivalent model performance depending on the input training periods (e.g. seasons, quarters of the year, dependent on local inflow), see [35, 36].

For the one-station area Equivalent with three discharge segments, 13 model parameters \({\mathbb {X}}_P^E\) are computed. As always when calibrating a model, there is a risk of overfitting the parameter values. To evaluate whether the model has been overfitted, the Equivalent is also evaluated on the out-of-sample test set (TS). However, since the maximum number of parameters for this Equivalent model is 13 and the number of data points in the training data (TR) is \(720\times 12=8640\), the risk of overfitting the parameters is relatively low.
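As a rough consistency check, one reading of (3a) that reproduces the stated number of parameters is given below. It assumes one station (\(I^E=1\)), \(K^E=3\) segments, and that the separately computed \(V_{i,w,t}^{E,\text {init}}\) is not counted; this breakdown is an interpretation for illustration, not a statement taken from the case study.

$$\begin{aligned}&\underbrace{3}_{\mu _{1,k}^E} + \underbrace{3}_{\overline{Q}_{1,k}^E} + \underbrace{2}_{\underline{M}_1^E,\, \overline{M}_1^E} + \underbrace{1}_{\underline{Q}_1^E} + \underbrace{1}_{\underline{S}_1^E} + \underbrace{2}_{\overline{P}_1,\, \overline{P}_4} + \underbrace{1}_{\gamma _1^E} = 13, \\&\frac{720 \times 12}{13} = \frac{8640}{13} \approx 665 \text { training data points per parameter.} \end{aligned}$$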

To evaluate the performance of the area Equivalent models, three main different performance metrics are used:

  1. Relative Mean Absolute Error (MAE) in hourly power production:

    $$\begin{aligned} \Delta p_{t} = \frac{1}{W\cdot T\cdot \overline{P}^O} \sum _{w,t} \bigg |P^O_{w,t} - \sum _i p^E_{i,w,t} \bigg |. \end{aligned}$$

  2. Relative Mean Absolute Error in total power production per scenario:

    $$\begin{aligned} \Delta p_{\text {tot}} = \frac{1}{W} \sum _w \frac{1}{\sum _t P^O_{w,t}}\bigg |\sum _t P^O_{w,t} - \sum _{i,t} p^E_{i,w,t} \bigg |. \end{aligned}$$

  3. Accuracy during peak hours: \(\Delta ^{\text {peak}}_{80/90/95\%}\), measured as the share of the Original peak hours \({\mathbb {T}}^*\) in which the simulated power production of the Equivalent model also fulfills the Original peak hour requirement, \({\mathbb {T}}^{E*}\):

    $$\begin{aligned}&{\mathbb {T}}^* = \{t\in {\mathbb {T}}: P_{w,t}^O \ge x_k \cdot \overline{P}_{w,t}^O\}, \quad {\mathbb {T}}^{E*} = \{t\in {\mathbb {T}}^*: \sum _i p_{i,w,t}^E \ge x_k \cdot \overline{P}_{w,t}^O\}, \\&\rightarrow \Delta ^\text {peak}_{80/90/95\%} = |{\mathbb {T}}^{E*}|\big /|{\mathbb {T}}^*|. \end{aligned}$$

    For this evaluation, peak hours are considered in three steps: \(x_k=\{0.80, 0.90, 0.95\}\). The 80% level of maximum production is included since some methods have \(\Delta ^{\text {peak}}_{90\%} = 0\), so a lower peak hour definition is also included for a more helpful comparison.
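The three metrics above can be computed as in the following minimal Python sketch. The function names, the nested-list layout (`p_orig[w][t]`, with the Equivalent production already summed over stations) and the sample numbers are assumptions for illustration. The maximum production is passed as a single scalar `p_max`, and all scenarios are assumed to have the same number of hours.

```python
from typing import Sequence

Series = Sequence[Sequence[float]]  # indexed as [scenario w][hour t]


def relative_mae_hourly(p_orig: Series, p_equiv_total: Series, p_max: float) -> float:
    """Mean absolute hourly difference, relative to the maximum production p_max."""
    n_hours = sum(len(row) for row in p_orig)  # W * T for equally long scenarios
    abs_err = sum(abs(o - e)
                  for row_o, row_e in zip(p_orig, p_equiv_total)
                  for o, e in zip(row_o, row_e))
    return abs_err / (n_hours * p_max)


def relative_error_total(p_orig: Series, p_equiv_total: Series) -> float:
    """Mean over scenarios of the relative difference in total production."""
    shares = [abs(sum(row_o) - sum(row_e)) / sum(row_o)
              for row_o, row_e in zip(p_orig, p_equiv_total)]
    return sum(shares) / len(shares)


def peak_accuracy(p_orig: Series, p_equiv_total: Series,
                  x_k: float, p_max: float) -> float:
    """Share of Original peak hours in which the Equivalent also reaches the peak level."""
    n_peak = n_hit = 0
    for row_o, row_e in zip(p_orig, p_equiv_total):
        for o, e in zip(row_o, row_e):
            if o >= x_k * p_max:
                n_peak += 1
                if e >= x_k * p_max:
                    n_hit += 1
    return n_hit / n_peak if n_peak else float("nan")


# Two hypothetical scenarios with three hours each.
p_o = [[100.0, 80.0, 20.0], [95.0, 50.0, 60.0]]
p_e = [[88.0, 85.0, 25.0], [96.0, 40.0, 58.0]]
print(relative_mae_hourly(p_o, p_e, p_max=100.0))
print(relative_error_total(p_o, p_e))
print(peak_accuracy(p_o, p_e, x_k=0.9, p_max=100.0))
```

In this sample, two Original hours exceed 90% of the maximum production and the Equivalent reaches that level in one of them, so the peak accuracy at the 90% level is one half.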

Besides these numerical performance metrics, it is also important to consider the visual similarity in the power simulation curves of the Original and Equivalent models. In addition to the comparison of the methods presented in this paper, the area Equivalents are also compared to a Direct aggregation from [27].

The models are implemented using the programming language Julia [37], the optimization package [38] and solved with Gurobi [39]. All computations have been done using a 2.90 GHz Intel Core i7-7600U CPU with 16 GB RAM.

5 Results and discussion

The results from the case study along with a discussion of their implications are included below.

First, in Table 1 some relevant computational times for the area Equivalent model and PSO are included. The computational time of the PSO algorithm is shown as \(C^{PSO}\) while the solution time of the area Equivalent model is \(C^E\). These range from 1.8 to 3.9 s depending on the number of discharge segments. Similarly, the solution time of the detailed Original model is \(C^O = 7.45\) min. In Table 1, the relative solution time of each area Equivalent is reported as a share of \(C^O\).

Table 1 Relevant computational times for the area Equivalent models on the training set (TR), including the number of iterations used in the PSO

5.1 Results on training data

A summary of the performance metrics on the training data (TR) for all methods from Sect. 3 is included in Table 2. Recall that for one segment, or \(k=1\), (N1) is the same as (O1) and (N2) is the same as (O2). Moreover, note that the new proposed method (N1) with 2 segments is investigated considering \(x_2=\{0.80, 0.85, 0.90\}\). For (N1) with 3 segments, the pairs (\(x_2, x_3\)) = \(\{(0.85, 0.95), (0.80, 0.98)\}\) are considered, as these gave the Equivalents with the best performance among all possible combinations of \(x_2\) and \(x_3\) from Sect. 3.1.

Table 2 Performance metrics on training data (TR)

In Table 2 the best result for each metric is highlighted in bold. Note that for the first two metrics a lower value is better, while for the last three a higher value is better. All the best values are found for the new proposed methods (N1)–(N3), which clearly shows the improvement that can be gained not only in peak accuracy but also in average hourly performance. However, the trade-off between good accuracy during peak hours and good average hourly accuracy is visible here as well. The models from (N1) and (N2) have a slightly higher difference in hourly power production \(\Delta p_{wt}\) compared to (O1), (O2) and especially (N3), which has the best average hourly performance. However, the difference in \(\Delta p_{\text {tot}}\) is somewhat lower and the accuracy during peak hours for \(\Delta ^{\text {peak}}_{90\%}\) and \(\Delta ^{\text {peak}}_{95\%}\) is significantly improved. The accuracy for \(\Delta ^{\text {peak}}_{80\%}\) is comparable between all models, except for ‘(N2)/(O2) \(k=1\)’.

Even though (N1) with 2 segments performed best with \(x_2=0.90\), (N1) with 3 segments and \(x_3=0.95\) performed better overall when its second segment was based on \(x_2=0.85\) instead. This was also the model that showed the best peak accuracy of all on the training data (TR).

5.2 Results on test data

Table 3 shows whether the behavior of the Equivalent models and the benefit of models (N1)–(N3) carry over to new data. The same patterns as in Table 2, based on the training data (TR), are also present for the test data (TS). The main difference is that the overall performance is marginally lower on the test set (TS) than on the training set (TR). However, for method (N1) the metric \(\Delta p_{\text {tot}}\) has a better value on (TS) than on (TR) in this case study. Since the performance is still good on this out-of-sample test set (TS), this indicates that the area Equivalent model was not overfitted during the training phase.

Table 3 Performance metrics on test data (TS)

An example of the simulated power curves is shown in Fig. 6. It clearly shows that the methods (N1) and (N2) are the only ones that manage to follow the higher peaks of the Original model. However, even for these methods, not all peaks are covered and the absolute highest peaks are not matched perfectly. This is further illustrated in the duration curve in Fig. 7.

Fig. 6
figure 6

Simulated power curve for one scenario in the test set (TS). Area equivalents with \(k=2\) based on the proposed methods compared to the original

Fig. 7
figure 7

Simulated duration curve for one scenario in the test set (TS). Area equivalents with \(k=2\) based on the proposed methods compared to the original

The method (N1) gives the Equivalent model that comes closest to matching the peak hours of the Original model, and (N2) is second best. In Fig. 7 the number of hours that the second segment is fully utilized can be discerned. For (N1) and (N2) the second segment is used at maximum capacity for about 7–12.5% of the simulation period. The other methods all run at maximum capacity for over 35% of the period. Thus it is clear that the second segments in (N1) and (N2) fulfill their intended purpose to a higher degree. What these two methods have in common, and what separates them from the others, is that the Equivalent is computed in multiple steps. It appears that this division into different steps with different bilevel problems is critical for peak performance. The addition to the objective function of the second bilevel problem (5) is also required, but is not enough to capture the peaks on its own. This is illustrated by the results from method (N3), which utilizes \(f^{up,N1}\) from (5) but does not employ the different steps for calculating the Equivalent.
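
How the share of hours at maximum capacity can be read from a duration curve is sketched below. The helper names, the tolerance and the production values are hypothetical; the sketch only assumes that a duration curve is the hourly production sorted from highest to lowest.

```python
def duration_curve(power):
    """Return the hourly values sorted from highest to lowest."""
    return sorted(power, reverse=True)


def share_at_capacity(power, capacity, tol=1e-6):
    """Fraction of hours in which production reaches the maximum capacity."""
    hours_at_max = sum(1 for p in power if p >= capacity - tol)
    return hours_at_max / len(power)


# Hypothetical hourly production in MW and a hypothetical maximum capacity.
production = [80, 100, 60, 100, 90, 40, 70, 100, 50, 30]
capacity = 100

curve = duration_curve(production)
share = share_at_capacity(production, capacity)
print(curve)
print(f"{100 * share:.0f}% of the hours at maximum capacity")
```

A long flat plateau at the top of the duration curve corresponds to a large share of hours at maximum capacity, which is the pattern described above for the methods other than (N1) and (N2).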

5.3 Impact of the additional segments

In this study, only the method (N1) considers \(k=3\) segments in the discharge curve and marginal production function. Tables 2 and 3 show that a small improvement in accuracy during peak hours is gained by adding a third segment to the model with two segments. This benefit is largest for \(\Delta ^\text {peak}_{95\%}\). However, studying Fig. 8, there is only a very small difference in the duration curve between (N1) with \(k=2\) and \(k=3\). Looking closely, one can see that (N1) with \(k=3\) utilizes the third segment for about 75 h (in this scenario), and the increase in maximum capacity from adding a third segment is less than 180 MW. Compared to a total installed capacity of about 5000 MW, this is an increase of only 3.6%. In contrast, the second segment increases the installed capacity by about 735 MW, or 17.4%, as seen in Fig. 8. The method allows for more segments to be added; however, the benefit of adding even more segments is not clear. It is also important to consider that more segments increase the simulation time of the model.
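
The two relative capacity increases can be reproduced with a short calculation. The base for the 17.4% figure is not stated explicitly here; the sketch assumes it is the single-segment maximum of 4217 MW reported for ‘(N1) \(k=1\)’ in this section, which reproduces the quoted value. The function name is hypothetical.

```python
def relative_increase(added_mw, base_mw):
    """Capacity added by a segment as a percentage of the base capacity."""
    return 100.0 * added_mw / base_mw


# Third segment: less than 180 MW on top of about 5000 MW (from the text).
third_segment = relative_increase(180, 5000)

# Second segment: about 735 MW; the 4217 MW base is an assumption (see lead-in).
second_segment = relative_increase(735, 4217)

print(f"third segment:  {third_segment:.1f}%")
print(f"second segment: {second_segment:.1f}%")
```

Note that 4217 MW + 735 MW = 4952 MW, which is consistent with the total installed capacity of about 5000 MW used as the base for the third segment.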

Fig. 8
figure 8

Simulated duration curve for one scenario in the test set (TS). Area equivalents based on (N1) with k = 1, 2, 3, compared to the original

Fig. 9
figure 9

Simulated duration curve for one scenario in the test set (TS). Area equivalents based on (O1) and (O2) with k = 1 and k = 2, compared to the original

For the rest of the models only \(k \le 2\) is considered. For the older methods (O1) and (O2) this can be motivated by studying Fig. 9, which shows the effect of adding a second segment compared to only one segment for these methods. Here, there is no clear increase in installed capacity from adding a second segment to the model (compare this to Fig. 8). Instead, the main change between one and two segments is that the models with \(k=2\) utilize their maximum installed capacity for fewer hours. In Fig. 9 a slightly lower level of installed capacity is seen for ‘(O1) \(k=1\)’ compared to ‘(O1) \(k=2\)’. However, it is telling that the maximum levels are very similar regardless of whether \(k=1\) or \(k=2\) is used with these methods. Also note that for the models with \(k=2\), the first segment has a significantly lower maximum level than the first (and only) segment in the model with \(k=1\). From Fig. 9, the maximum levels are:

  • ‘(O2) 95% \(k=2\)’: 3 233 MW + 1 024 MW = 4 257 MW,

  • ‘(O1) \(k=2\)’: 3 413 MW + 991 MW = 4 404 MW,

  • ‘(O1) \(k=1\)’: 4 217 MW (same as ‘(N1) \(k=1\)’).

Considering that (O1) for \(k=1\) and (N1) for \(k=1\) are the same, the results from method (N1) make it clearer that the benefit of the second segment for methods (O1) and (O2) is insignificant in comparison.

Another reason why \(k=3\) was not investigated for (O1) and (O2) is shown in Tables 2 and 3. Here we see that already at \(\Delta ^\text {peak}_{90\%}\) the share of peak hours covered is 0%. Combining this with the effect of the second segment on installed capacity, it is reasonable to assume that the installed capacity would not increase significantly by adding a third segment either. If the capacity is not increased, a higher level of peak hours (e.g. 90% instead of 80% of max) cannot be covered.

For the alternative methods (N2) and (N3) only \(k\le 2\) is investigated. For \(k=1\), (N2) has the same model as (O2) and (N3) has the same model as (O1). Based on the results w.r.t. \(\Delta ^\text {peak}_{90\%}\), \(\Delta ^\text {peak}_{95\%}\) and a visual examination of the power and duration curves (see Figs. 6 and 7), it was concluded that there would be little benefit in investigating these methods for \(k=3\). The motivation is essentially the same for (N3) as for the older methods (O1) and (O2). However, for (N2) the motivation is different. This difference is mainly due to the noticeable benefit of the second segment in (N2), as seen in Fig. 7 and in Tables 2 and 3 for \(\Delta ^\text {peak}_{90\%}\). Similar to method (N1), the second segment clearly increases the installed capacity above the versions with only one segment. Moreover, \(\Delta ^\text {peak}_{90\%}\) is non-zero only for (N1) and (N2). However, also in (N2) \(\Delta ^\text {peak}_{95\%} = 0\%\), and the increase in installed capacity is not as pronounced as in (N1).

All in all, no clear benefit w.r.t. peak accuracy is gained by including the second segment for the methods (O1), (O2) and (N3). For the method (N2) there is a noticeable benefit; however, this benefit is significantly smaller than the benefit in method (N1). The major benefit of the new method (N1), and to a lesser extent (N2), is that dividing the computation of the full Equivalent into several bilevel problems allows for an increased installed maximum capacity in the additional segments, even if this slightly reduces the average hourly performance \(\Delta p_{w,t}\).

5.4 Trade-off

As shown in Tables 2 and 3, adding a second segment after the first improves both peak and average performance. However, depending on how this second segment is added, either the average performance or the peak performance is improved more. Recall that ‘(N1) \(k=1\)’ is the same as ‘(O1) \(k=1\)’. Comparing the performance of ‘(N1) \(k=2\)’ and ‘(O1) \(k=2\)’ shows that adding the second segment directly in the first bilevel problem (4a, 4b, 4c, 4d, 4e, 4f) increases the average hourly performance \(\Delta p_{w,t}\), while adding the second segment in a separate bilevel problem using (5) increases the performance in both total power production and during peak hours.

Moreover, when considering the different values of \(x_k\) for (N1), a clear trend is shown in which better peak performance results in a slightly lower average hourly performance. The same pattern is visible for the methods (N2) and (N3), where (N3) has a better value for \(\Delta p_{w,t}\) but lower peak performance. In future studies, a different weighting between the two terms in the upper-level objective function (5), representing the difference during all hours and the difference during peak hours, should be explored to see if this can help increase the performance of (N3).

When selecting a method for computing the area Equivalents, it is important to consider this trade-off and to be aware of the advantages and limitations of each method.

5.5 Comparison with direct aggregation

As mentioned in Sect. 1.1, Direct aggregations of the hydropower within a geographical area can overestimate the simulated peak power production of the Original system. This is clearly shown for this case study in Fig. 10, which displays the simulated power and duration curves of the area Equivalent model calculated with (N1), the Direct aggregation and the Original model. The Direct aggregation overestimates every peak shown in the figure and seems to have a higher maximum power production over the whole period. This is because the Direct aggregation is based on historical data. A single occurrence of a higher peak power production in one hour in the Original system therefore also results in a higher installed capacity of the Direct aggregation, even if the Original system typically has a slightly lower peak production.

Fig. 10
figure 10

Simulated power and duration curves for one scenario in the test set (TS). Area equivalents based on (N1), compared to direct aggregation and the original

The \(\Delta p_{wt}\) of the Direct aggregation is 15.6% on (TR) and 15.1% on (TS). This is about twice the difference in average hourly power production obtained with the area Equivalents. \(\Delta p_{\text {tot}}\) is equal to 2.70% on (TR) and 0.50% on (TS). These numerical values seem very good. However, they are somewhat misleading, as large differences are visible in the duration curve of Fig. 10. The values are low only because the overestimations of the power production are compensated over the period by the underestimations. Since all peaks are overestimated by the Direct aggregation, \(\Delta _{80/90/95\%}^\text {peak}\) is not a good metric for the peak accuracy in this case.
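
The compensation effect can be made concrete with a small sketch. The two functions below are simplified stand-ins and not the exact metric definitions used in this paper: the hourly metric is assumed to be the summed absolute hourly difference relative to total Original production, and the total metric the relative difference in total production. All names and values are hypothetical.

```python
def hourly_difference(p_orig, p_model):
    """Summed absolute hourly difference as a percentage of total Original production."""
    abs_diff = sum(abs(m - o) for o, m in zip(p_orig, p_model))
    return 100.0 * abs_diff / sum(p_orig)


def total_difference(p_orig, p_model):
    """Difference in total production as a percentage of total Original production."""
    return 100.0 * abs(sum(p_model) - sum(p_orig)) / sum(p_orig)


# Hypothetical hourly production in MW: the aggregated model overshoots the
# peaks and undershoots the remaining hours by the same total amount.
original = [3000, 4000, 5000, 4000, 3000, 2000]
aggregated = [2500, 3600, 5500, 4400, 3100, 1900]

print(f"hourly difference: {hourly_difference(original, aggregated):.1f}%")
print(f"total difference:  {total_difference(original, aggregated):.1f}%")
```

In this constructed case the total difference vanishes although every hour deviates, which is why a low \(\Delta p_{\text {tot}}\) alone does not imply a good match of the power curve.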

6 Conclusions

Modeling large energy systems requires different forms of simplifications and aggregations. This is especially true for large hydropower systems. One way to simplify the modeling of hydropower systems is to utilize so-called Equivalent models. Hours with peak power production can be particularly challenging for the area Equivalent models to capture accurately. Direct aggregations can easily overestimate the production peaks, as their maximum peaks are based on historical data and not on average behavior. In contrast, the bilevel computed area Equivalent models have often underestimated the highest production peaks. In this paper, a new method to increase the accuracy during peak hours is presented. The new method is also compared to two older methods and two alternative new methods. The results show a clear improvement in accuracy during peak hours using the novel method presented in this paper. Here, the Equivalents computed with the new method cover 17–31% of the highest peaks, compared to 0% using the other methods. Thus, it becomes apparent that dividing the computation of the additional segments in the Equivalent into separate steps is critical for peak accuracy. Still, a trade-off between best average performance and best peak performance is discernible. However, with the proposed methods the average performance is very close to the best average performance shown in this paper, with a maximum difference of 0.9 percentage points.