1 Introduction

Optimization under uncertainty is a very active and growing subfield of mathematical optimization, since in many real-world applications the input data is either not known in advance or subject to perturbations. In the current literature, two major approaches have been developed to address this challenge: stochastic optimization (SO), see e.g. Birge and Louveaux (2011) for further details, and robust optimization (RO), see e.g. Bertsimas et al. (2011). In stochastic optimization it is generally assumed that the underlying probability distribution of the uncertain parameters is known, whereas robust optimization methods do not require any knowledge of the distribution but assume that the uncertain data is contained in a predefined set of scenarios. In addition, robustly optimal solutions tend to be rather conservative, as one considers the worst-case scenario within the predefined uncertainty set.

In order to address the conservativeness inherent in robust optimization, adjustable robust optimization (ARO) was introduced in Ben-Tal et al. (2004) and has grown into a very active subarea of robust optimization. It serves as a natural extension of the concept of RO, where uncertainties are not only addressed a priori by making first-stage or here-and-now decisions, but the decision maker is also allowed to react to the realization of the uncertainty by second-stage or wait-and-see decisions. Hence, ARO is often used to provide more competitive solutions without sacrificing the uncertainty protection that is provided by RO. This is particularly useful in applications such as gas networks, see e.g. Aßmann et al. (2018), and the seminal unit commitment problem in power systems, see e.g. Lorca and Sun (2017).

Methodologically, AROs are approached either by approximation schemes or by exact reformulations. In Ben-Tal et al. (2004) an approximation scheme was presented in which the second-stage decisions are restricted to depend affinely on the uncertain parameters. This so-called affine decision rule significantly simplifies the underlying trilevel problem and often leads to tractable optimization problems such as linear or semidefinite programs. Additionally, more elaborate decision rules such as piecewise-linear decision rules, see Bertsimas and Georghiou (2015), can also address mixed-integer AROs. For further details, we refer to the excellent survey (Yanıkoğlu et al. 2019).

Exact reformulations of AROs are usually computationally intractable, particularly if non-convexities such as binary variables are involved. Consequently, the literature turns to decomposition approaches, see Bienstock and Özbay (2008), Zeng and Zhao (2013) and Bertsimas et al. (2013). Furthermore, in Avraamidou and Pistikopoulos (2019) a parametric programming approach is given that solves AROs with a small number of binary variables to global optimality. For larger instances, however, parametric programming currently seems to fall behind decomposition approaches in terms of computational runtime. Hence, the present work aims to contribute to closing this computational gap. To this end, we introduce a single-level MIP that approximates a significant class of AROs and can be proven to be exact under some further assumptions.

As mentioned above, adjustable robust optimization has been an active research topic in recent years and has been found useful in a large variety of areas. Applications dealing with adjustable robustness include model predictive control (Tejeda-Iglesias et al. 2019) for dealing with process-model mismatch, process scheduling under uncertainty (Lappas and Gounaris 2016; Li and Ierapetritou 2008; Lin et al. 2004; Grossmann et al. 2016), multi-task scheduling with imperfect tasks (Lappas et al. 2019), the analysis of resiliency and flexibility of chemical processes (Grossmann et al. 2014; Zhang et al. 2016), the handling of uncertainty in supply chains (Ben-Tal et al. 2005; Buhayenko and den 2017), and portfolio optimization (Takeda et al. 2008). Here, we will not go into more detail on these applications, but will elaborate on the application of AROs to find the optimal operating points of smart converters in power systems.

Renewable energy, particularly wind and photovoltaics, has seen rapid growth in recent years (Bhandari et al. 2015). These energy sources are widely connected to the power grid, which can greatly reduce the use of fossil fuels and electricity costs. However, this integration also introduces power uncertainty that can threaten the stability and safety of the grid (Wei et al. 2014).

Smart inverters (Zhao et al. 2018) and energy storage technology (Weitemeyer et al. 2015) provide a solution to this problem. The former controls the power output of renewables, while the latter could smooth power fluctuations through charging and discharging. One of the goals of power system optimization is to balance the use of renewable energy and system safety by setting the operating state and set points of inverters and storages.

Historically, power system operation-related optimization involves deterministic optimization, see e.g. Anjos and Conejo (2017) for a broader overview on the seminal unit commitment problem and Wang et al. (1995) for an application to security-constrained unit commitment or Sanders and Monroe (1987) for security-constrained economic dispatch, assuming accurate renewable energy forecasting. However, this stream of research does not account for potential uncertainties in the system, which may lead to infeasible operating states. Addressing these challenges has gained increased attention due to the incorporation of renewables in modern power systems.

Robust optimization has gained attention for its potential to address this issue and has been extensively studied in power system operation. We refer to Conejo and Wu (2022) for a brief survey. Moreover, multi-level robust optimization has also been applied to various challenges that arise in power system operation, such as robust unit commitment (Zhao and Guan 2013; Lorca and Sun 2016) or robust optimal power flow (Zhang and Giannakis 2013), to name a few. It is also applied to the energy management of power systems with smart inverters and energy storage (Yu et al. 2020; Lekvan et al. 2021; Yang and Su 2021). While these studies provide excellent use cases of multi-level optimization models in power systems, they tend to oversimplify the discrete variables in inverter and storage control models by converting them into continuous variables for increased tractability. However, this may result in infeasible solutions in practical applications, since it overestimates the possibilities the grid operator has at its disposal to readjust the grid. In contrast, the method proposed in this work preserves discrete variables and thus guarantees the feasibility of the computed operating points. Moreover, as our only approximation comes from relaxing the adversarial problem posed by the uncertainties in the system, we provide (potentially overly conservative) operating points whose feasibility can be guaranteed.

This work is structured as follows. In Sect. 2, the adjustable robust problem formulation is introduced. Moreover, we discuss the key assumption that these problems are weakly connected and present our main results. Section 3 introduces the application to smart converters in networks and demonstrates the consequences of the approximation results from Sect. 2 in this setting. Subsequently, Sect. 4 illustrates the results and presents numerical evidence that the presented method outperforms existing algorithms on weakly connected trilevel problems.

2 A MIP approach to robust optimization with MIP adjustments

Adjustable robust programming consists of three separate types or levels of variables. The first-level variables, denoted in the present paper by \(x\in \mathcal {X}\), describe an initial planning approach, which has to be decided first. Then, an uncertainty affects the outcome of this initial planning. Here, we model this uncertainty by a random vector \(h\in \Omega \subseteq \mathbb {R}^I\), which is distributed according to a probability distribution \(\mathbb {P}\in \mathcal {P}(\Omega )\) on a compact domain \(\Omega \subseteq \mathbb {R}^I\). The set \(\mathcal {P}(\Omega )\) is called the ambiguity set of probability measures and the vector h is referred to as the second-level variable. Lastly, the initial planning x can be adjusted to the uncertainty h by choosing the third-level variables \(y\in \mathcal {Y}(x,h)\), where in the present article we suppose that \(\mathcal {Y}(x,h)\subseteq \mathbb {R}^m\times \{0,1\}^l\) is defined through linear constraints. Thus, in total the basic distributionally robust optimization (DRO) setting can be defined as follows:

$$\begin{aligned} \min _{x\in \mathcal {X}} G(x) + \max _{\mathbb {P}\in \mathcal {P}(\Omega )}\left( \mathbb {E}_\mathbb {P}\left( \min _{y\in \mathcal {Y}(x,h)} c^\top y\right) \right) , \end{aligned}$$
(1)

where \(\mathcal {X}\) and \(\Omega \) are assumed to be polytopes and \(\mathcal {Y}(x,h)\) a polytope intersected with an integer lattice. However, in the present article, we restrict ourselves to a standard robust ambiguity set, i.e., for the polytope \(\Omega \) the ambiguity set is defined as a set of Dirac measures \(\mathcal {P}(\Omega ){:}{=}\{\delta _{\{h\}}: h\in \Omega \}\) and aim to solve instances of

$$\begin{aligned} \min _{x\in \mathcal {X}} G(x) + \max _{h \in \Omega }\min _{y\in \mathcal {Y}(x,h)} c^\top y. \end{aligned}$$
(2)
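
The reduction from (1) to (2) can be spelled out in one line. As a sketch, write \(Q(x,h){:}{=}\min _{y\in \mathcal {Y}(x,h)} c^\top y\) for the third-level value; this shorthand is introduced here for illustration only and is not part of the original notation, and we assume that \(Q(x,\cdot )\) is finite on \(\Omega \). For a Dirac measure, the expectation collapses to a point evaluation:

$$\begin{aligned} \max _{\mathbb {P}\in \mathcal {P}(\Omega )} \mathbb {E}_\mathbb {P}\left( Q(x,h)\right) = \max _{h'\in \Omega } \int _\Omega Q(x,h)\, \textrm{d}\delta _{\{h'\}}(h) = \max _{h'\in \Omega } Q(x,h'). \end{aligned}$$

Substituting this identity into (1) yields the max-min structure of (2).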

Note that (2) is still considered to be a very challenging problem as it contains the \(\textrm{NP}\)-complete MIP \(\min _{y\in \mathcal {Y}(x,h)} c^\top y\) as a subproblem, see Section 6 in Yanıkoğlu et al. (2019) for further details. In the present article, we focus on turning the above trilevel problem into a single-level MIP. To this end, we fix the integral part of y and dualize the resulting LP in order to obtain a bilevel problem with a bilinear objective. This objective is then relaxed with McCormick envelopes and dualized again in order to obtain a single-level LP for every fixed integral assignment of y. However, instead of evaluating the resulting LP relaxations for every integer combination, we can simply reincorporate the integrality of y as an additional constraint and thus obtain a single-level MIP. This approach may lead to overly pessimistic outcomes, as using the McCormick relaxations strengthens the adversary in the multilevel problem, but it can be shown to be exact under some assumptions on h and the dual variables of \(\min _{y\in \mathcal {Y}(x,h)} c^\top y\).

However, this method may not work on strongly connected multilevel problems. Since, broadly speaking, the hardness of multilevel optimization problems comes from strong connections between the levels, we consider the following subclass of adjustable robust problems:

We call a problem of type (2) weakly connected if the only relation between the first-, second- and third-level variables is a linear relation in \(\mathcal {Y}(x,h)\):

$$\begin{aligned} \mathcal {Y}(x,h) =\{y\in \mathbb {R}^m\times \{0,1\}^l: A'y\ge b', B y \ge B_x x + B_h h + b_0\}, \end{aligned}$$

i.e., the first-level variables x and the second-level random vector h only affect the right-hand side of the constraints defining \(\mathcal {Y}(x,h)\). Weakly connected adjustable robust problems include, among others, problems with affine decision rules (\(B^\top = (I_n,-I_n), B_x=0, B_h^\top = (Q^\top ,-Q^\top ), b_0 = (q^\top ,-q^\top )\), where Q and q can be chosen arbitrarily).

Moreover, in order to work with the levels separately, we denote by \(A'y\ge b'\) the constraints that solely deal with third-level variables, i.e., that are affected neither by the first-level variables x nor by the second-level variables h, and by \(By \ge B_x x + B_h h + b_0\) the constraints that are affected by the upper levels. Furthermore, instances that are particularly well suited for our approach should have relatively few non-zero entries in the matrix \(B_h\). Having few non-zero entries in \(B_h\) is not a strict requirement, but both the approximation quality and the computational complexity can scale with the number of non-zero elements.
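
To make the role of \(B_h\) concrete, the following minimal sketch evaluates the right-hand side \(B_x x + B_h h + b_0\) of the coupling constraints and counts the non-zero entries of \(B_h\). The function names and the toy data are hypothetical and chosen only for illustration; matrices are stored as lists of rows, with one row per coupling constraint.

```python
def matvec(M, v):
    """Plain matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def coupling_rhs(B_x, B_h, b_0, x, h):
    """Right-hand side B_x x + B_h h + b_0 of the coupling constraints."""
    return [rx + rh + r0
            for rx, rh, r0 in zip(matvec(B_x, x), matvec(B_h, h), b_0)]


def count_nonzeros(B_h):
    """Number of non-zero entries of B_h (each one couples an h_i to a row)."""
    return sum(1 for row in B_h for entry in row if entry != 0)


# Hypothetical toy instance: three coupling constraints, two x and two h.
B_x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B_h = [[0.0, 0.0], [2.0, 0.0], [0.0, -1.0]]
b_0 = [0.5, 0.0, -1.0]

rhs = coupling_rhs(B_x, B_h, b_0, x=[1.0, 2.0], h=[3.0, 4.0])
nnz = count_nonzeros(B_h)
print(rhs, nnz)
```

In this toy instance only two entries of \(B_h\) are non-zero, so the uncertainty enters only two of the three coupling constraints.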

In addition, let us consider an LP inner-approximation of the third-level program \(\min _{y\in \mathcal {Y}(x,h)} c^\top y\), i.e. we fix the integer variables \(y_{m+1},\ldots , y_{m+l}\in \{0,1\}\) by a set of linear constraints denoted by \(A_f y \ge b_f\). Consequently, we consider the following LP:

$$\begin{aligned} \min \ {}&c^\top y \end{aligned}$$
(3a)
$$\begin{aligned} \text {s.t.}\ {}&A'y \ge b', \end{aligned}$$
(3b)
$$\begin{aligned}&A_f y \ge b_f \end{aligned}$$
(3c)
$$\begin{aligned}&By\ge B_x x + B_h h + b_0 \end{aligned}$$
(3d)

Let \(A^\top = ((A')^\top ,A_f^\top ), b^\top = ((b')^\top ,b_f^\top )\) and \(\alpha \) denote the dual variables that correspond to Constraints (3b) and (3c), i.e. to \(Ay\ge b\). Additionally, we denote by \(\beta \) the dual variables corresponding to (3d) and obtain as the dual program of (3):

$$\begin{aligned} \max \ {}&b^\top \alpha + (B_x x + B_h h + b_0)^\top \beta \end{aligned}$$
(4a)
$$\begin{aligned} \text {s.t.}\ {}&\begin{pmatrix}A^\top&B^\top \end{pmatrix} \begin{pmatrix}\alpha \\ \beta \ \end{pmatrix} = c, \end{aligned}$$
(4b)
$$\begin{aligned}&\alpha ,\beta \ge 0, \end{aligned}$$
(4c)

We observe that on the second level the objective decomposes into a bilinear part \((B_h h)^\top \beta = h^\top B_h \beta \) and a linear one \(b^\top \alpha + (B_x x + b_0)^\top \beta \). Moreover, the number of nonzero entries of \(B_h\) crucially determines the number of bilinear terms and thereby the difficulty of computing (4).
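
The primal-dual pair (3)/(4) and the split of the dual objective can be checked by hand on a very small instance. The sketch below is a hypothetical toy example, not taken from the application: it uses \(y\in \mathbb {R}^2\), \(c=(1,2)^\top \), the constraints \(y\ge 0\) as \(Ay\ge b\) (so \(b=0\)), and a single coupling row \(y_1+y_2\ge B_x x + B_h h + b_0\) with scalar data. The dual feasible set is then \(\beta \in [0,1]\) with \(\alpha = c - B^\top \beta \), so it suffices to evaluate the two vertices.

```python
def rhs(x, h, B_x=1.0, B_h=1.0, b_0=0.0):
    """Scalar right-hand side B_x x + B_h h + b_0 of the coupling row."""
    return B_x * x + B_h * h + b_0


def primal_value(x, h):
    """min y1 + 2*y2  s.t.  y >= 0,  y1 + y2 >= rhs(x, h).

    The cheaper variable y1 covers the whole (nonnegative part of the) demand.
    """
    y1, y2 = max(rhs(x, h), 0.0), 0.0
    return 1.0 * y1 + 2.0 * y2


def dual_value(x, h, B_x=1.0, B_h=1.0, b_0=0.0):
    """max b^T alpha + rhs * beta  s.t.  alpha + (1, 1)^T beta = (1, 2), alpha, beta >= 0."""
    best = float("-inf")
    for beta in (0.0, 1.0):                 # vertices of the dual feasible interval
        alpha = (1.0 - beta, 2.0 - beta)    # alpha = c - B^T beta
        assert min(alpha) >= 0.0            # dual feasibility
        linear = (B_x * x + b_0) * beta     # b^T alpha vanishes because b = 0
        bilinear = h * B_h * beta           # the term h^T B_h beta
        best = max(best, linear + bilinear)
    return best


print(primal_value(1.0, 2.0), dual_value(1.0, 2.0))
```

Both values coincide on this instance, as strong LP duality predicts; the only term in which h and \(\beta \) interact is the product `h * B_h * beta`.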

Let I and J denote the index sets of h and \(\beta \) respectively. Then, in order to relax the bilinear term \(h^\top B_h\beta \), we make the following key assumption:

Assumption 1

Both, the dual variables \(\beta \in \mathbb {R}^J\) as well as the second-level variables \(h\in \mathbb {R}^I\) are bounded, i.e., there are \(\beta ^-,\beta ^+\) and \(h^-,h^+\) such that adding

$$\begin{aligned} h^- \le h\le h^+ \text { and } \beta ^-\le \beta \le \beta ^+ \end{aligned}$$

to \(\Omega \) or (4) does not affect the outcome of (1).

Note that a compact \(\Omega \) already implies \(h^-\le h \le h^+\). Hence, in those cases it suffices to check Assumption 1 for \(\beta \). Later, we show that finite and meaningful bounds \(\beta ^-\) and \(\beta ^+\) can be easily determined for the considered application of finding optimal operating points in power systems. In addition, we mention that every sharpening of the bounds in Assumption 1 improves the quality of the upcoming results. This is due to the fact that the boundedness of both \(\beta \) and h enables us to relax the second-level problem \(\max _{h\in \Omega } \min _{y\in \mathcal {Y}(x,h)} c^\top y\) further using McCormick envelopes for the bilinear terms. In particular, this idea gives rise to the following theorem:

Theorem 1

Let the ambiguity set \(\Omega \) be a polytope defined by \(\Omega =\{h\in \mathbb {R}^I: A_\Omega ^\top h +B_\Omega ^\top \eta =b_\Omega ,\ \eta \ge 0\}\) with \(b_\Omega \in \mathbb {R}^k\), i.e. \(\Omega \) is compact and \(\eta \ge 0\) denotes potential nonnegative slack variables in the rows. Let further \(\beta ^-,\beta ^+\) be lower and upper bounds for \(\beta \). Then, the following linear program provides an upper bound to (2):

$$\begin{aligned} \min ~&G(x) + (\beta ^+)^\top u_\beta ^+ + (\beta ^-)^\top u_\beta ^- + b_\Omega ^\top u_\Omega + c^\top y \end{aligned}$$
(5a)
$$\begin{aligned}&- \sum _{i\in I, j\in J} \left( h_i^- \beta _j^- (u_\text {env})_{ij}^1 + h_i^+\beta _j^+ (u_\text {env})_{ij}^2 + h_i^+\beta _j^- (u_\text {env})_{ij}^3 + h_i^-\beta _j^+ (u_\text {env})_{ij}^4\right) \nonumber \\ \text {s.t.}&Ay \ge b, \end{aligned}$$
(5b)
$$\begin{aligned}&By + u_\beta ^+ + u_\beta ^- -h_i^-(u_\text {env})_{i}^1 - h_i^+(u_\text {env})_{i}^2 - h_i^+ (u_\text {env})_{i}^3 -h_i^-(u_\text {env})_{i}^4 \ge B_x x + b_0 \quad \text {for every } i\in I \end{aligned}$$
(5c)
$$\begin{aligned}&u_\beta ^+ \ge 0,\ -u_\beta ^- \ge 0, \end{aligned}$$
(5d)
$$\begin{aligned}&(A_\Omega u_\Omega )_i - \sum _{j\in J} \left( \beta _j^-(u_\text {env})_{ij}^1 + \beta _j^+(u_\text {env})_{ij}^2 + \beta _j^-(u_\text {env})_{ij}^3+\beta _j^+(u_\text {env})_{ij}^4\right) \ge 0 \qquad \quad \;\text {for every } i\in I, \end{aligned}$$
(5e)
$$\begin{aligned}&B_\Omega u_\Omega \ge 0, \end{aligned}$$
(5f)
$$\begin{aligned}&(u_\text {env})_{ij}^1 + (u_\text {env})_{ij}^2 + (u_\text {env})_{ij}^3 + (u_\text {env})_{ij}^4 \ge (B_h)_{ij} \qquad \text {for every } i\in I,\ j\in J, \end{aligned}$$
(5g)
$$\begin{aligned}&-(u_\text {env})_{ij}^1, -(u_\text {env})_{ij}^2 \ge 0 \qquad \text {for every } i\in I,\ j\in J, \end{aligned}$$
(5h)
$$\begin{aligned}&(u_\text {env})_{ij}^3, (u_\text {env})_{ij}^4 \ge 0 \qquad \text {for every } i\in I,\ j\in J, \end{aligned}$$
(5i)
$$\begin{aligned}&x\in \mathcal {X}. \end{aligned}$$
(5j)

Proof

We observe that with Assumption 1 the second-level \(\max _{h\in \Omega } \min \{c^\top y: Ay\ge b, By \ge B_x x + B_h h + b_0\}\) can be written as

$$\begin{aligned} \max \ {}&b^\top \alpha + (B_x x + b_0)^\top \beta + \sum _{i\in I, j\in J} (B_h)_{ij} h_i\beta _j \end{aligned}$$
(6a)
$$\begin{aligned} \text {s.t.}\ {}&A^\top \alpha + B^\top \beta = c, \end{aligned}$$
(6b)
$$\begin{aligned}&\beta + \gamma = \beta ^+, \end{aligned}$$
(6c)
$$\begin{aligned}&\beta - \delta = \beta ^-, \end{aligned}$$
(6d)
$$\begin{aligned}&A_\Omega ^\top h + B_\Omega ^\top \eta = b_\Omega , \end{aligned}$$
(6e)
$$\begin{aligned}&\alpha ,\beta ,\gamma ,\delta ,h,\eta \ge 0, \end{aligned}$$
(6f)

where we assumed w.l.o.g. that \(h^-= 0\) since otherwise, we could substitute h by \(h-h^-\) and adjust \(b_0\) and \(b_\Omega \) accordingly. Next, we substitute \(\kappa _{ij}{:}{=}h_i\beta _j\) in the objective and relax the resulting constraint \(\kappa _{ij}= h_i\beta _j\) by a McCormick envelope. Note, that w.l.o.g. \(\kappa \ge 0\). If we further introduce suitable nonnegative slack variables \(\rho \ge 0\), we obtain the following LP:

$$\begin{aligned} \max \ {}&b^\top \alpha + (B_x x + b_0)^\top \beta + \sum _{i\in I, j\in J} (B_h)_{ij} \kappa _{ij} \end{aligned}$$
(7a)
$$\begin{aligned} \text {s.t.}\ {}&A^\top \alpha + B^\top \beta = c, \end{aligned}$$
(7b)
$$\begin{aligned}&\beta + \gamma = \beta ^+, \end{aligned}$$
(7c)
$$\begin{aligned}&\beta - \delta = \beta ^-, \end{aligned}$$
(7d)
$$\begin{aligned}&A_\Omega ^\top h + B_\Omega ^\top \eta = b_\Omega , \end{aligned}$$
(7e)
$$\begin{aligned}&\kappa _{ij} = h_i^-\beta _j + h_i\beta _j^- - h_i^-\beta _j^- +\rho _{ij}^1{} & {} \text {for every } i\in I, j\in J \end{aligned}$$
(7f)
$$\begin{aligned}&\kappa _{ij} = h_i^+\beta _j + h_i\beta _j^+ - h_i^+\beta _j^+ +\rho _{ij}^2{} & {} \text {for every } i\in I, j\in J, \end{aligned}$$
(7g)
$$\begin{aligned}&\kappa _{ij} = h_i^+\beta _j + h_i\beta _j^- - h_i^+\beta _j^- -\rho _{ij}^3{} & {} \text {for every } i\in I, j\in J, \end{aligned}$$
(7h)
$$\begin{aligned}&\kappa _{ij} = h_i\beta _j^+ + h_i^-\beta _j - h_i^-\beta _j^+ -\rho _{ij}^4{} & {} \text {for every } i\in I, j\in J, \end{aligned}$$
(7i)
$$\begin{aligned}&\alpha ,\beta ,\gamma , \delta ,h,\eta ,\kappa ,\rho \ge 0. \end{aligned}$$
(7j)

If we denote the dual variables of (7) by \(y, u_\beta ^+, u_\beta ^-, u_\Omega , (u_\text {env})_{ij}^1,(u_\text {env})_{ij}^2,(u_\text {env})_{ij}^3,(u_\text {env})_{ij}^4\) respectively, then the result follows by strong duality and by including the first-level variables and objectives. \(\square \)
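
The McCormick envelope used in (7f)-(7i) can be illustrated numerically. The following minimal sketch computes, for a single pair \((h_i,\beta _j)\), the interval that the four envelope inequalities allow for \(\kappa _{ij}\); the function name and the test values are hypothetical. The two lower bounds correspond to (7f) and (7g), the two upper bounds to (7h) and (7i).

```python
def mccormick_bounds(h, beta, h_lo, h_hi, b_lo, b_hi):
    """Interval allowed for kappa = h * beta by the McCormick envelope.

    h_lo <= h <= h_hi and b_lo <= beta <= b_hi are the bounds of Assumption 1.
    """
    lower = max(h_lo * beta + h * b_lo - h_lo * b_lo,   # (7f)
                h_hi * beta + h * b_hi - h_hi * b_hi)   # (7g)
    upper = min(h_hi * beta + h * b_lo - h_hi * b_lo,   # (7h)
                h * b_hi + h_lo * beta - h_lo * b_hi)   # (7i)
    return lower, upper


# Interior point: the envelope is a genuine relaxation of kappa = h * beta = 1.
print(mccormick_bounds(1.0, 1.0, 0.0, 2.0, 0.0, 3.0))  # → (0.0, 2.0)

# Corner of the box: the envelope is tight and recovers the product exactly.
print(mccormick_bounds(2.0, 3.0, 0.0, 2.0, 0.0, 3.0))  # → (6.0, 6.0)
```

The interval shrinks as the bounds on h and \(\beta \) are tightened, which is why sharper bounds in Assumption 1 improve the approximation.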

We observe that Theorem 1 provides an LP inner approximation of (2) in the sense that the adversary is overestimated, whereas the space of decision variables is underestimated, potentially leading to a more conservative solution. In particular, we simply fixed the discrete or, w.l.o.g., binary decisions in \(\mathcal {Y}(x,h)\) denoted by \(y_{m+1},\ldots , y_{m+l}\in \{0,1\}\). Before stating Theorem 2, which addresses the integrality constraints, we briefly illustrate the key idea behind it:

Observe that solving (2) is equivalent to solving exponentially many trilevel LPs, one for every fixed choice of \(y_{m+1},\ldots , y_{m+l}\). Moreover, the proof of Theorem 1 does not depend on the particular vector \(y'\in \{0,1\}^l\) to which \((y_{m+1},\ldots , y_{m+l})^\top \) is fixed. Consequently, we obtain \(2^l\) single-level LPs, where each of these LPs yields an inner approximation of the trilevel LP given by the corresponding fixing of \(y_{m+1},\ldots , y_{m+l}\). Finally, instead of solving each of those exponentially many LPs separately, the following MIP gives the same result.

Theorem 2

Let the ambiguity set \(\Omega \) be a polytope defined by \(\Omega =\{h\in \mathbb {R}^I: A_\Omega ^\top h +B_\Omega ^\top \eta =b_\Omega ,\ \eta \ge 0\}\) with \(b_\Omega \in \mathbb {R}^k\), i.e. \(\Omega \subseteq \mathbb {R}^I\) is compact and \(\eta \ge 0\) denotes potential nonnegative slack variables in the rows. Let further \(\beta ^-,\beta ^+\) be lower and upper bounds for \(\beta \). Then, the following MIP provides an upper bound to the trilevel MIP (2):

$$\begin{aligned} \min \ {}&G(x) + (\beta ^+)^\top u_\beta ^+ + (\beta ^-)^\top u_\beta ^- + b_\Omega ^\top u_\Omega + c^\top y \end{aligned}$$
(8a)
$$\begin{aligned}&- \sum _{i\in I, j\in J} \left( h_i^- \beta _j^- (u_\text {env})_{ij}^1 + h_i^+\beta _j^+ (u_\text {env})_{ij}^2 + h_i^+\beta _j^- (u_\text {env})_{ij}^3 + h_i^-\beta _j^+ (u_\text {env})_{ij}^4\right) \nonumber \\ \text {s.t.}\ {}&Ay \ge b, \end{aligned}$$
(8b)
$$\begin{aligned}&By + u_\beta ^+ + u_\beta ^- -h_i^-(u_\text {env})_{i}^1 - h_i^+(u_\text {env})_{i}^2 - h_i^+ (u_\text {env})_{i}^3 -h_i^-(u_\text {env})_{i}^4 \ge B_x x + b_0 \quad \text {for every } i\in I, \end{aligned}$$
(8c)
$$\begin{aligned}&u_\beta ^+ \ge 0,\ -u_\beta ^- \ge 0, \end{aligned}$$
(8d)
$$\begin{aligned}&(A_\Omega u_\Omega )_i - \sum _{j\in J} \left( \beta _j^-(u_\text {env})_{ij}^1 + \beta _j^+(u_\text {env})_{ij}^2 + \beta _j^-(u_\text {env})_{ij}^3 + \beta _j^+(u_\text {env})_{ij}^4\right) \ge 0 \quad \text {for every } i\in I, \end{aligned}$$
(8e)
$$\begin{aligned}&B_\Omega u_\Omega \ge 0, \end{aligned}$$
(8f)
$$\begin{aligned}&(u_\text {env})_{ij}^1 + (u_\text {env})_{ij}^2 + (u_\text {env})_{ij}^3 + (u_\text {env})_{ij}^4 \ge (B_h)_{ij} \qquad \text {for every } i\in I,\ j\in J, \end{aligned}$$
(8g)
$$\begin{aligned}&-(u_\text {env})_{ij}^1, -(u_\text {env})_{ij}^2 \ge 0 \qquad \text {for every } i\in I,\ j\in J, \end{aligned}$$
(8h)
$$\begin{aligned}&(u_\text {env})_{ij}^3, (u_\text {env})_{ij}^4 \ge 0 \qquad \text {for every } i\in I,\ j\in J, \end{aligned}$$
(8i)
$$\begin{aligned}&x\in \mathcal {X}, u_\beta ^+, u_\beta ^- \in \mathbb {R}^J, u_\Omega \in \mathbb {R}^k, (u_\text {env})^1, (u_\text {env})^2, (u_\text {env})^3, (u_\text {env})^4 \in \mathbb {R}^{I\times J}, y\in \mathbb {R}^{m+l}, \end{aligned}$$
(8j)
$$\begin{aligned}&y_{m+1},\ldots , y_{m+l}\in \{0,1\}. \end{aligned}$$
(8k)

Proof

First, we fix \((y_{m+1},\ldots , y_{m+l})^\top =y'\) with a set of linear (in-)equalities \(A_fy \ge b_f\). Here, \(y'\in \{0,1\}^l\) denotes an arbitrary integer assignment. Suppose we replace (8k) by this system of inequalities and denote the resulting feasible set by \(\mathcal {X}(y')\), i.e.

$$\begin{aligned} \mathcal {X}(y') = \left\{ x, u_\beta ^+, u_\beta ^-, u_\Omega , (u_\text {env})^1, (u_\text {env})^2, (u_\text {env})^3, (u_\text {env})^4, y:\ \text {(8b)--(8j)},\ (y_{m+1},\ldots , y_{m+l})^\top = y'\right\} . \end{aligned}$$

We further denote the corresponding objective formed by (8a) by c(x), i.e. the LP \(\min _{x\in \mathcal {X}(y')} c(x)\) describes (8) with fixed values of \((y_{m+1},\ldots , y_{m+l})^\top \). Hence, Theorem 1 implies, that \(\min _{x\in \mathcal {X}(y')} c(x)\) inner-approximates the following trilevel LP

$$\begin{aligned} \min _{x\in \mathcal {X}} G(x) + \max _{h\in \Omega } \min _{y\in \mathcal {Y}(x,h,y')} c^\top y, \end{aligned}$$

where

$$\begin{aligned} \mathcal {Y}(x,h,y')=\left\{ y\in \mathcal {Y}(x,h): (y_{m+1},\ldots , y_{m+l})^\top =y'\right\} . \end{aligned}$$

In particular, we obtain by Theorem 1 that \(\min _{x\in \mathcal {X}(y')} c(x) \ge \min _{x\in \mathcal {X}} G(x) + \max _{h\in \Omega } \min _{y\in \mathcal {Y}(x,h,y')} c^\top y\) for every \(y'\in \{0,1\}^l\). Subsequently, we conclude

$$\begin{aligned} (8)&= \min _{y'\in \{0,1\}^l} \min _{x\in \mathcal {X}(y')} c(x) \\&\ge \min _{y'\in \{0,1\}^l} \min _{x\in \mathcal {X}} G(x) + \max _{h\in \Omega } \min _{y\in \mathcal {Y}(x,h,y')} c^\top y\\&= \min _{x\in \mathcal {X}} G(x) + \min _{y'\in \{0,1\}^l}\max _{h\in \Omega } \min _{y\in \mathcal {Y}(x,h,y')} c^\top y \\&\overset{(*)}{\ge }\ \min _{x\in \mathcal {X}} G(x) + \max _{h\in \Omega } \min _{y' \in \{0,1\}^l,y\in \mathcal {Y}(x,h,y')} c^\top y = (2). \end{aligned}$$

As stated above, the first inequality is based on Theorem 1, whereas the second one is a consequence of the max-min inequality. In the remainder of this proof, we argue that \((*)\) is even sharp and thus any potential difference between the relaxation (8) and the original problem (2) solely originates from the McCormick envelopes. To this end, let us consider the Lagrangian relaxation with penalty terms instead of hard constraints:

$$\begin{aligned} f_x(h,y'){:}{=}\min _{y\in \mathcal {Y}(x)} c^\top y + u_0^\top (A_\Omega ^\top h +B_\Omega ^\top \eta -b_\Omega ) + u_1^\top ((y_{m+1},\ldots , y_{m+l})^\top -y'), \end{aligned}$$

where \(\mathcal {Y}(x)\supseteq \mathcal {Y}(x,h)\) is defined as the relaxation of \(\mathcal {Y}(x,h)\) that occurs if the constraint \(A_\Omega ^\top h +B_\Omega ^\top \eta = b_\Omega \) is dropped. Observe that since \(\mathcal {Y}(x)\) is bounded, for a given \(h\in \Omega \) and \(y'\in \{0,1\}^l\), according to Theorem 21 in Lemaréchal (2001), there exist \(u_0\in \mathbb {R}^k,u_1\in \mathbb {R}^l\) that satisfy

$$\begin{aligned} \min _{y\in \mathcal {Y}(x,h,y')} c^\top y = f_x(h,y'). \end{aligned}$$

Then, since \(f_x(h,y')\) is concave in h on the nonempty set \(\Omega \) and convex in \(y'\) on the compact nonempty set \(\{0,1\}^l\), we apply the Ky Fan theorem (Fan 1953) and conclude equality in \((*)\). \(\square \)
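
The generic max-min inequality behind \((*)\) can be checked on a small finite instance. The sketch below enumerates all \(2^l\) binary fixings for \(l=2\) and compares both orders of optimization. The cost function `toy_value` and the finite scenario list are hypothetical stand-ins for \(\min _{y\in \mathcal {Y}(x,h,y')} c^\top y\) and \(\Omega \); the sketch only illustrates the inequality and does not reproduce the Lagrangian argument of the proof.

```python
from itertools import product


def min_max(value, fixings, scenarios):
    """min over fixings y' of the worst case over scenarios h."""
    return min(max(value(yp, h) for h in scenarios) for yp in fixings)


def max_min(value, fixings, scenarios):
    """max over scenarios h of the best fixing y' chosen after observing h."""
    return max(min(value(yp, h) for yp in fixings) for h in scenarios)


def toy_value(y_prime, h):
    """Hypothetical third-level cost: switching costs plus unmet demand h."""
    return 3 * y_prime[0] + 2 * y_prime[1] + abs(h - y_prime[0] - y_prime[1])


fixings = list(product((0, 1), repeat=2))   # all 2^l fixings for l = 2
scenarios = [0.0, 1.0, 2.0]                 # finite stand-in for Omega

upper = min_max(toy_value, fixings, scenarios)
lower = max_min(toy_value, fixings, scenarios)
print(upper, lower)
```

The enumeration over `fixings` grows as \(2^l\), which is the cost that the single MIP (8) avoids by handing the binary variables to the solver.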

We note that the approximation quality given by Theorem 2 relies solely on the strength of the relaxation due to the McCormick envelopes. Moreover, as the number of nonzero entries in \(B_h\) determines the number of bilinear terms in the second level, it crucially affects the approximation quality given by the McCormick relaxation. In particular, a large number of bilinear terms may lead to an overly conservative solution. Thus, exploiting problem-specific information that improves the bounds \(\beta ^-,\beta ^+,h^-,h^+\) directly increases the solution quality. Moreover, more elaborate approximations of \(h^\top B_h \beta \) may lead to a direct improvement of Theorem 2 and are subject to future research.

3 Application to smart converters in power systems networks

In the present section, we apply our results to the question of how to operate a power grid integrated with smart inverters and storage.

Usually, the optimal operation problem of the power system can be divided into two phases.

  • The first phase is about day-ahead scheduling. In this phase, the power demand of the power grid needs to be determined and reported to the day-ahead electricity market. In addition, internal conventional generators need to make a power generation plan for the next day to reasonably allocate diesel consumption.

  • The second phase is about intra-day system operation. Given the day-ahead decisions, operators need to further set operating points and working status of smart inverters and energy storage systems to realize a real-time power balance and reduce the waste of renewables.

As can be seen from our analysis, the day-ahead decisions influence the intra-day decisions. The renewable-led uncertainties will further affect the quality of decision-making. In order to realize the safe and economical operation of the power grid, we need to overcome the following three difficulties:

  1. Sequential: The formulation of the day-ahead strategy is subject to uncertainty realizations and intra-day operations. A multi-layer model is needed to describe the decision sequence as well as uncertainty realizations in the real world.

  2. Uncertain: The uncertainties threaten the safe operation of power systems. Power fluctuations induced by renewables may deteriorate power quality and increase electricity costs. Hence, the operation strategy should be robust enough to handle different situations.

  3. Discrete: The intra-day operation in the second phase often involves many state-switching operations. The states can be modeled as integer variables in the optimization problem. However, these variables destroy the convexity of the model. A new solution method should be developed in such a case.

The power system operation problem can be solved using the robust optimization approach with MIP adjustments proposed in Sect. 2. A multi-layer model is built to describe the sequential decision-making process; adjustable robust programming is employed to account for worst-case scenarios; the MIP adjustment is then applied to capture the state switching in the system.

First, we formulate the mathematical model of the power system operation. We mainly follow the notation of Yang and Wu (2019) but use the simpler DC approximation of Kirchhoff's laws in order to model the power flow when operating the grid.

Let \(\mathcal {B}\) denote a set of buses and \(\mathcal {L}\) denote a set of lines/branches in a power grid. Additionally, power is generated within the grid either by a set of conventional (fossil fuel) generators denoted by \(\mathcal {N}_G\) or by a set of distributed (renewable) generators denoted by \(\mathcal {N}_{DG}\). Moreover, the transmission system operator (TSO) may decide to store or release power via a set \(\mathcal {N}_S\) of storages, e.g. batteries. The last potential source or sink of power is a trading node with other connected regional transmission grids. Here, the TSO may purchase power on the day-ahead market or intra-day. As the day-ahead market is often called the first-level market, we denote the amount of power traded day-ahead by \(P_{fl}\) and the corresponding market price by \(p_{fl}\). Similarly, the amount of power traded intra-day is denoted by \(P_{sl}\) and its market price by \(p_{sl}\). Note that positive values of \(P_{fl},P_{sl}\) are interpreted as a purchase and negative values of \(P_{fl},P_{sl}\) correspond to a sale of energy.

For the TSO’s initial planning, one considers the first-level variables \(x=(P_G^\top ,P_{fl})^\top \in \mathbb {R}^{\mathcal {N}_G}\times \mathbb {R}\) combined with the estimated renewable energy production \(P_{DG, forecast}\in \mathbb {R}^{\mathcal {N}_{DG}}\) and ensures that a given total demand \(\sum _{i\in \mathcal {B}} P_{d_i}\) is met. Hence,

$$ \begin{aligned} \mathcal {X}=\{x\in \mathbb {R}^n:\ (9a) \& \ (9b) \}, \end{aligned}$$

where

$$\begin{aligned}&P_{fl}^t = \sum _{i\in \mathcal {B}} P_{d_i}^t - \sum _{i\in \mathcal {N}_G}P_{G_i}^t - \sum _{i\in \mathcal {N}_{DG}} P_{DG_i, forecast}^t \qquad \,\, \text {for every } t\in T, \end{aligned}$$
(9a)
$$\begin{aligned}&P_{G_i,\text {min}}^t \le P_{G_i}^t \le P_{G_i,\text {max}}^t \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \text {for every } i\in \mathcal {N}_G, t\in T. \end{aligned}$$
(9b)

with given parameters \(P_{G,\min }, P_{G,\max }\in \mathbb {R}^{\mathcal {N}_G \times T}\). We would like to emphasize that the market-clearing condition (9a) ensures the active power balance in the whole system. Moreover, the objective of the first level is given by

$$\begin{aligned} G(x)=\sum _{i\in \mathcal {N}_G,t\in T} c_{G_i^t,2} (P_{G_i}^t)^2 + c_{G_i^t,1} P_{G_i}^t + c_{G_i^t,0} + \sum _{t\in T} p_{fl}^t P_{fl}^t, \end{aligned}$$

where \(c_{G_i^t,2},c_{G_i^t,1},c_{G_i^t,0}\in \mathbb {R}\) are given generator cost parameters.
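To make the first-level model concrete, the following minimal Python sketch evaluates the market-clearing condition (9a), the generator limits (9b) and the objective \(G(x)\) for a single time step. All numerical values and the function names (`day_ahead_trade`, `within_limits`, `first_level_cost`) are illustrative assumptions and are not taken from the case studies.

```python
def day_ahead_trade(demand, p_gen, p_dg_forecast):
    """Day-ahead trade P_fl from the market-clearing condition (9a), one time step."""
    return sum(demand) - sum(p_gen) - sum(p_dg_forecast)


def within_limits(p_gen, p_min, p_max):
    """Generator limits (9b): P_min <= P_G <= P_max for every generator."""
    return all(lo <= p <= hi for p, lo, hi in zip(p_gen, p_min, p_max))


def first_level_cost(p_gen, cost_coeffs, price_fl, p_fl):
    """First-level objective G(x) for one time step.

    cost_coeffs holds one tuple (c2, c1, c0) per conventional generator.
    """
    gen_cost = sum(c2 * p * p + c1 * p + c0
                   for p, (c2, c1, c0) in zip(p_gen, cost_coeffs))
    return gen_cost + price_fl * p_fl


# Hypothetical data: three buses with demand, two generators, two renewables.
demand = [3.0, 3.0, 4.0]
p_gen = [2.0, 4.0]
p_dg_forecast = [1.5, 1.0]

p_fl = day_ahead_trade(demand, p_gen, p_dg_forecast)   # positive value = purchase
cost = first_level_cost(p_gen, [(0.5, 14.0, 0.0), (0.0, 40.0, 0.0)], 30.0, p_fl)
print(p_fl, cost)
```

A positive `p_fl` means that the forecast generation does not cover the demand, so the remainder is purchased day-ahead, which matches the sign convention introduced above.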

Since the uncertainties will impact the initial planning, we will consider the uncertain capacity of the renewable generators, i.e. we denote the second-level variable by \(h=P_{DG, \text {max}}\). As these generators depend on weather conditions, which are highly uncertain, this is one of the most common uncertainties faced by modern power grids with a high proportion of renewable energies (Pfenninger et al. 2014; Impram et al. 2020). As the grid stability is crucial, we will address this uncertainty in a robust manner and set the domain of the second-level variables \(P_{DG,\max }\) to

$$\begin{aligned} \Omega&=\left\{ P_{DG,\text {max}}\in \mathbb {R}^{\mathcal {N}_{DG}\times T}:\ 0\le P_{DG_i, \text {max}}^t \le P_i^+ \forall \ i\in \mathcal {N}_{DG}, t\in T\right. ,\nonumber \\&\qquad \left. \sum _{i\in \mathcal {N}_{DG}} P_{DG_i, \text {max}}^t \ge R\sum _{i\in \mathcal {N}_{DG}} P_{DG_i, forecast}^t \right\} , \end{aligned}$$
(10)

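where \(P_i^+\) denotes the technical limit of the renewable generator, i.e. its capacity under optimal conditions, and \(R\in [0,1]\) controls the maximal forecast error: the total available renewable capacity is guaranteed to reach at least the fraction R of its forecast.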
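As a small illustration of (10), the following Python sketch checks whether a candidate capacity vector belongs to \(\Omega \) for a single time step. The function name `in_uncertainty_set`, the tolerance and the numbers are assumptions made for this example only.

```python
def in_uncertainty_set(p_dg_max, p_plus, p_dg_forecast, R, tol=1e-9):
    """Membership test for the uncertainty set (10), restricted to one time step.

    p_dg_max      candidate realization of the renewable capacities
    p_plus        technical limits P_i^+ of the renewable generators
    p_dg_forecast forecast renewable production
    R             guaranteed fraction of the total forecast, R in [0, 1]
    """
    # Box constraints: 0 <= P_DG_i,max <= P_i^+
    box_ok = all(-tol <= p <= cap + tol for p, cap in zip(p_dg_max, p_plus))
    # Aggregated lower bound: total capacity is at least R times the total forecast
    budget_ok = sum(p_dg_max) >= R * sum(p_dg_forecast) - tol
    return box_ok and budget_ok


forecast = [1.5, 1.0]   # hypothetical forecast, total 2.5
limits = [2.0, 2.0]     # hypothetical technical limits

print(in_uncertainty_set([1.0, 1.0], limits, forecast, R=0.8))  # total 2.0 >= 0.8 * 2.5
print(in_uncertainty_set([0.5, 1.0], limits, forecast, R=0.8))  # total 1.5 <  0.8 * 2.5
```

Smaller values of R enlarge \(\Omega \) and therefore admit more pessimistic realizations of the renewable capacity.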

On the third level, the TSO is able to react to this uncertainty and adjust the initial planning accordingly. In particular, instead of producing \(P_G\), the TSO might regulate the energy output to \(P_{G,\text {reg}}\) by either increasing the production by adding \(P_G^+\ge 0\) or decreasing the production by adding \(P_G^-\le 0\). However, this can only be done at a cost \(r^+\) or \(r^-\), respectively. Similarly, \(P_{DG}\) is the adjusted energy production that deviates from its forecast by \(P_{DG}^+\) or \(P_{DG}^-\), where deviations are penalized by \(f^+,f^-\), respectively. Besides these regulations, the TSO might trade power intra-day (\(P_{sl}\)) or decide whether (\(\mu _{ch},\mu _{dch}\in \{0,1\}^{\mathcal {N}_S}\)) and by how much (\(P_{ch},P_{dch}\)) to charge or discharge the storages in order to balance a potential power deficiency or surplus. The state of charge of a storage is denoted by \(\text {soc}\). Lastly, the power on a line \((k,l)\in \mathcal {L}\) is denoted by \(p_{kl}\) and the phase angles of the system by \(\theta \). Hence, the TSO can adjust the vector \(y=(P_{G_i,\text {reg}}, P_{G_i}^+, P_{G_i}^-, P_{DG_i}, P_{DG_i}^+, P_{DG_i}^-, P_{sl}, P_{ch_i}, P_{dch_i},p_{kl}, \theta , \text {soc}, \mu _{ch}, \mu _{dch} )^\top \) in order to satisfy

$$\begin{aligned} y\in \mathcal {Y}(x,h){:}{=}\left\{ y\in \mathbb {R}^m:\ (11a)-(15g)\right\} , \end{aligned}$$

where the constraints (11a)–(15g) are given below:

  1. (a)

    First, we consider the following Generator and DG output constraints:

    $$\begin{aligned}&P_{G_i,\text {reg}}^t = P_{G_i}^t + P_{G_i}^{t,+}+ P_{G_i}^{t,-} \qquad \qquad \qquad \text {for every } i\in \mathcal {N}_G, t\in T, \end{aligned}$$
    (11a)
    $$\begin{aligned}&P_{G_i,\text {min}}^t \le P_{G_i,\text {reg}}^t \le P_{G_i,\text {max}}^t \!\!\qquad \qquad \qquad \quad \text {for every } i\in \mathcal {N}_G, t\in T, \end{aligned}$$
    (11b)
    $$\begin{aligned}&P_{DG_i}^t = P_{DG_i,\text {forecast}}^t + P_{DG_i}^{t,+}+P_{DG_i}^{t,-} \!\qquad \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
    (11c)
    $$\begin{aligned}&P_{DG_i,\text {min}}^t \le P_{DG_i}^t \le P_{DG_i, \text {max}}^t \!\qquad \qquad \qquad \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
    (11d)
    $$\begin{aligned}&P_{sl}^t-P_{fl}^t=-\mathbbm {1}^\top (P_{G}^{t,+} +P_G^{t,-}) -\mathbbm {1}^\top (P_{DG}^{t,+}+P_{DG}^{t,-}) -\mathbbm {1}^\top (P_{dch}^t-P_{ch}^t) \nonumber \\&\!\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \text {for every } t\in T. \end{aligned}$$
    (11e)

    Constraints (11a)–(11d) describe the output range of the conventional and renewable generators and the potential impact of uncertainties. In particular, (11d) shows that the third-level variables \(P_{DG_i}^t\) are restricted by the uncertainties realized in the second level (\(P_{DG_i, \text {max}}^{t}\)). Constraint (11e) describes the market clearing on the intra-day market, which reflects the actual power demand-supply relations.

  2. (b)

    Second, we consider the operation constraints:

    $$\begin{aligned}&0 \le P_{G_i}^{t,+} \le P_{G_i, \text {max}}^{t,+} \qquad \qquad \qquad \,\,\,\ \text {for every } i\in \mathcal {N}_G, t\in T, \end{aligned}$$
    (12a)
    $$\begin{aligned}&P_{G_i, \text {min}}^{t,-} \le P_{G_i}^{t,-} \le 0 \qquad \qquad \qquad \,\,\,\ \text {for every } i\in \mathcal {N}_G, t\in T, \end{aligned}$$
    (12b)
    $$\begin{aligned}&P_{DG_i}^{t,+}\ge 0 \qquad \qquad \qquad \qquad \qquad \qquad \ \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
    (12c)
    $$\begin{aligned}&P_{DG_i}^{t,-} \le 0 \qquad \qquad \qquad \qquad \qquad \qquad \ \text {for every } i\in \mathcal {N}_{DG}, t\in T. \end{aligned}$$
    (12d)

    Constraints (12a)–(12d) limit the real output deviations of conventional and renewable generators.

  3. (c)

    Third, we consider the power flow constraints with a DC approximation for the given, constant line reactance (\(x_{ij}>0\)) and demand in active power (\(P_k^{d,t}\)) at every time step t and bus k. The nodal power flow balance is established in (13a) and separately in (13b) for the root node. The branch power flow is established in (13c). The constraints are listed as follows:

    $$\begin{aligned}&\sum _{i\in \mathcal {N}_G: i\sim k} P_{G_i,\text {reg}}^t +\sum _{i\in \mathcal {N}_{DG}: i\sim k} P_{DG_i}^t + \sum _{i\in \mathcal {N}_S: i\sim k} (P_{dch_i}^t-P_{ch_i}^t) - P_k^{d,t} \nonumber \\&\quad = \sum _{l \in \delta (k)} p_{kl}^t \qquad \qquad \qquad \forall k\in \mathcal {B}\setminus \{0\}, t\in T, \end{aligned}$$
    (13a)
    $$\begin{aligned}&\sum _{i\in \mathcal {N}_G: i\sim 0} P_{G_i,\text {reg}}^t +\sum _{i\in \mathcal {N}_{DG}: i\sim 0} P_{DG_i}^t + \sum _{i\in \mathcal {N}_S: i\sim 0} (P_{dch_i}^t-P_{ch_i}^t) +P_{sl}^t - P_0^{d,t} \nonumber \\&\quad = \sum _{l \in \delta (0)} p_{0l}^t \qquad \qquad \qquad \qquad \qquad \qquad \ \forall t\in T, \end{aligned}$$
    (13b)
    $$\begin{aligned}&p_{ij}^t = \frac{1}{x_{ij}} (\theta _i^t-\theta _j^t) \qquad \qquad \forall \{i,j\}\in \mathcal {L}, t\in T \end{aligned}$$
    (13c)
  4. (d)

    Fourth, to guarantee the safe operation of the branch, the power flow should not exceed the branch’s capacities. Hence, we consider the branch thermal constraints:

    $$\begin{aligned}&-s_{ij, \max } \le p_{ij}^t \le s_{ij, \max }&\text {for every } \{i,j\}\in \mathcal {L}, t\in T. \end{aligned}$$
    (14a)
  5. (e)

    Fifth, we consider the storage constraints. Noticing that the storage operation involves two actions, the action shifts need to be considered. Two binary variables \(\mu _{ch}, \mu _{dch}\) are defined to represent the storage action, where \(\mu _{ch}, \mu _{dch}\in \{0,1\}^{\mathcal {N}_S\times T}\). Moreover, only one of those variables can be 1 at time t. Thus, we have the following constraints:

    $$\begin{aligned}&\text {soc}_{i,\text {min}}^t \le \text {soc}_i^t \le \text {soc}_{i,\text {max}}^t&\text {for every } i\in \mathcal {N}_S, t\in T, \end{aligned}$$
    (15a)
    $$\begin{aligned}&\text {soc}_i^t = \text {soc}_i^{t-1} + \frac{(P_{ch_i}^t-P_{dch_i}^t)}{E_i} \Delta T&\text {for every } i\in \mathcal {N}_S, t\in T, \end{aligned}$$
    (15b)
    $$\begin{aligned}&P_{ch_i}^t, P_{dch_i}^t \ge 0&\text {for every } i\in \mathcal {N}_S, t\in T, \end{aligned}$$
    (15c)
    $$\begin{aligned}&\mu _{ch_i}^t, \mu _{dch_i}^t \in \{0,1\}&\text {for every } i\in \mathcal {N}_S, t\in T, \end{aligned}$$
    (15d)
    $$\begin{aligned}&\mu _{ch_i}^t P_{ch_i,\text {min}}^t \le P_{ch_i}^t \le \mu _{ch_i}^t P_{ch_i,\text {max}}^t&\text {for every } i\in \mathcal {N}_S, t\in T, \end{aligned}$$
    (15e)
    $$\begin{aligned}&\mu _{dch_i}^t P_{dch_i,\text {min}}^t \le P_{dch_i}^t \le \mu _{dch_i}^t P_{dch_i,\text {max}}^t&\text {for every } i\in \mathcal {N}_S, t\in T, \end{aligned}$$
    (15f)
    $$\begin{aligned}&\mu _{ch_i}^t+\mu _{dch_i}^t \le 1&\text {for every } i\in \mathcal {N}_S, t\in T. \end{aligned}$$
    (15g)

Constraints (15a) and (15b) set the upper/lower bounds of soc and give the relationships between soc and charging/discharging actions. Constraints (15c)–(15g) depict the connection between storage actions \(\mu _{ch}\) and \(\mu _{dch}\) and their corresponding real power output \(P_{ch_i}^t,P_{dch_i}^t\).
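The storage constraints (15a)–(15g) can be checked for a given charging schedule of a single storage as in the following Python sketch. It is only meant as an illustration: the function name `storage_feasible`, the use of time-invariant limits and all numbers are assumptions, whereas the model above allows time-dependent bounds.

```python
def storage_feasible(soc0, p_ch, p_dch, mu_ch, mu_dch, energy, dt,
                     soc_lim, p_ch_lim, p_dch_lim, tol=1e-9):
    """Check (15a)-(15g) for one storage; returns (feasible, soc trajectory).

    soc_lim, p_ch_lim, p_dch_lim are (min, max) pairs, assumed constant in time.
    """
    soc, trajectory = soc0, []
    for t in range(len(p_ch)):
        # (15d) binary indicators and (15g) no simultaneous charging/discharging
        if mu_ch[t] not in (0, 1) or mu_dch[t] not in (0, 1):
            return False, trajectory
        if mu_ch[t] + mu_dch[t] > 1:
            return False, trajectory
        # (15c) nonnegative charging and discharging power
        if p_ch[t] < -tol or p_dch[t] < -tol:
            return False, trajectory
        # (15e), (15f) power is only allowed when the matching indicator is 1
        if not (mu_ch[t] * p_ch_lim[0] - tol <= p_ch[t] <= mu_ch[t] * p_ch_lim[1] + tol):
            return False, trajectory
        if not (mu_dch[t] * p_dch_lim[0] - tol <= p_dch[t] <= mu_dch[t] * p_dch_lim[1] + tol):
            return False, trajectory
        # (15b) state-of-charge update, (15a) state-of-charge bounds
        soc = soc + (p_ch[t] - p_dch[t]) / energy * dt
        if not (soc_lim[0] - tol <= soc <= soc_lim[1] + tol):
            return False, trajectory
        trajectory.append(soc)
    return True, trajectory


# Hypothetical schedule: charge in the first period, discharge in the second.
ok, soc_path = storage_feasible(
    soc0=0.5, p_ch=[1.0, 0.0], p_dch=[0.0, 2.0], mu_ch=[1, 0], mu_dch=[0, 1],
    energy=4.0, dt=1.0, soc_lim=(0.2, 0.9), p_ch_lim=(0.0, 2.0), p_dch_lim=(0.0, 2.0))
print(ok, soc_path)
```

In the optimization model these conditions are imposed as constraints on the decision variables, so the check above merely mirrors them for fixed values.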

Lastly, the TSO’s cost function is given by

$$\begin{aligned} c^\top y{} & {} {:}{=}\sum _{t\in T} \sum _{i\in \mathcal {N}_G}(r_i^{+,t}P_{G_i}^{+,t} +r_i^{-,t}P_{G_i}^{-,t}) + p_{sl}^t(P_{sl}^t-P_{fl}^t) \\{} & {} \qquad + \sum _{i\in \mathcal {N}_{DG}} (f_i^+ P_{DG_i}^{t,+}+f_i^- P_{DG_i}^{t,-}). \end{aligned}$$

The cost function aims to minimize the electricity cost and reduce the deviation between the intra-day system operation strategy and the day-ahead planning. Thus, the whole adjustment can be summarized as solving

$$\begin{aligned} \min _{y\in \mathcal {Y}(x,h)} c^\top y. \end{aligned}$$

Following the structure from Sect. 2, we denote by \(Ay\ge b\) the constraints that solely involve third-level variables, i.e., every constraint of (11a)–(15g) except (11d). We observe that the only remaining constraint is the upper bound in (11d), which involves the second-level (adversarial) variables \(h=P_{DG,\max }\), implying \(B = \begin{pmatrix} 0&-I_{DG} \end{pmatrix}, B_x=0, B_h = -I_{DG,\max }, b_0=0\). Consequently, the third-level program in this particular case reads

$$\begin{aligned} \min \ {}&c^\top y \end{aligned}$$
(16a)
$$\begin{aligned} \text {s.t.}\ {}&Ay \ge b, \end{aligned}$$
(16b)
$$\begin{aligned}&-P_{DG_i}^t \ge - P_{DG_i,\max }^t{} & {} \text { for every } i \in \mathcal {N}_{DG}, t\in T. \end{aligned}$$
(16c)

In addition, we denote by \(a_{DG}\) the columns of A corresponding to \(P_{DG}\) and the remaining columns by \(A'\), i.e., \(A=\begin{bmatrix}A'&a_{DG}\end{bmatrix}\). Let further \(\alpha \) denote the dual variables that correspond to Constraints (11a)–(15g) except (11d), i.e. to \(Ay\ge b\), and let \(\beta \in \mathbb {R}^{\mathcal {N}_{DG}\times T}\) denote the dual variables corresponding to (16c). Then the dual program of (16) is

$$\begin{aligned} \max \ {}&b^\top \alpha - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{DG_i,\max }^t \beta _{DG_i,t} \end{aligned}$$
(17a)
$$\begin{aligned} \text {s.t.}\ {}&(A')^\top \alpha = c, \end{aligned}$$
(17b)
$$\begin{aligned}&a_{DG_i}^\top \alpha - \beta _{DG_i,t} = 0 \quad \quad \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(17c)
$$\begin{aligned}&\alpha ,\beta \ge 0. \end{aligned}$$
(17d)

We observe that, as in Sect. 2, on the second level the objective decomposes into a bilinear part and a linear one. However, one can argue that the dual variables \(\beta _{DG}\) as well as the maximal capacity \(P_{DG,\max }\) of the distributed generators are bounded, i.e., Assumption 1 holds. In particular, we will show that there are \(P_{i,t}^+,\beta _{i,t}^+\) such that

$$\begin{aligned} P_{DG_i,\min }^t \le P_{DG_i,\max }^t\le P_{i,t}^+ \text { and } \beta _{i,t}^-\le \beta _{DG_i,t} \le \beta _{i,t}^+. \end{aligned}$$

On the one hand, this is because distributed generators have technical limits. For instance, the power outputs of wind turbines are restricted by the cut-out wind speed. The outputs will not exceed the power corresponding to this wind speed. As for the solar panels, their outputs are also restricted by the rated power of the devices themselves. Hence, \(P_{DG_i,\max }^t\) is always bounded.

On the other hand, we may prove the existence of an upper bound for \(\beta _{DG_i,t}\) and thereby verify Assumption 1 in our application as follows:

Lemma 1

For every optimal solution \((\alpha ^*,\beta ^*)\) to (17), we have that

$$\begin{aligned} p_{sl}^t-f_i^+\le \beta _{DG_i,t}^*\le \max \{r_i^{+,t},r_i^{-,t}, p_{sl}^t\}-\min \{f_i^+,f_i^-\} \text { for every }i\in \mathcal {N}_{DG}, t\in T. \end{aligned}$$

Proof

Consider the following relaxed version of (16), where we penalize violations of (16c) instead of incorporating (16c) as a hard constraint:

$$\begin{aligned} \min \ {}&c^\top y + \sum _{i\in \mathcal {N}_{DG},t\in T} \beta _{i,t}^+ \lambda _{i,t} \end{aligned}$$
(18a)
$$\begin{aligned} \text {s.t.}\ {}&Ay\ge b, \end{aligned}$$
(18b)
$$\begin{aligned}&-P_{DG_i}^t + \lambda _{i,t} \ge -P_{DG_i,\max }^t{} & {} \text {for every } i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(18c)
$$\begin{aligned}&\lambda _{i,t} \ge 0{} & {} \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(18d)

where \(\beta _{i,t}^+{:}{=}\max \{r_i^{+,t},r_i^{-,t}, p_{sl}^t\}-\min \{f_i^+,f_i^-\}\). Suppose \(\lambda _{i,t}>0\). Then we can increase the value of \(P_{DG_i}^t\) by at most \(\lambda _{i,t}\), thereby decreasing the objective value by at least \(\min \{f_i^+,f_i^-\}\lambda _{i,t}\). Due to the power balance equations (13a) and (13b), we have to compensate this either by selling the energy on the (second-level) market, i.e., by decreasing \(P_{sl}^t\), which results in a benefit of \(p_{sl}^t\lambda _{i,t}\), or by adjusting \(P_{ch_i}^t,P_{dch_i}^t\) or \(P_{G_i,\text {reg}}^t\), which results in a benefit of 0 or of at most \(\max \{r_i^{+,t},r_i^{-,t}\}\lambda _{i,t}\), respectively. Given this \(\beta _{i,t}^+\), we obtain by strong duality:

$$\begin{aligned} c^\top y^*=\max \ {}&b^\top \alpha - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{DG_i,\max }^t \beta _{DG_i,t} \end{aligned}$$
(19a)
$$\begin{aligned} \text {s.t.}\ {}&(A')^\top \alpha = c, \end{aligned}$$
(19b)
$$\begin{aligned}&a_{DG_i}^\top \alpha - \beta _{DG_i,t} = 0 \quad \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(19c)
$$\begin{aligned}&\beta _{DG_i,t} + \gamma _{i,t} = \beta _{i,t}^+ \quad \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(19d)
$$\begin{aligned}&\alpha ,\beta ,\gamma \ge 0. \end{aligned}$$
(19e)

Here, the last two constraints imply

$$\begin{aligned} \beta _{DG_i,t}\le \beta _{i,t}^+ \text{ for } \text{ every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$

i.e. adding a sufficiently large upper bound on \(\beta _{DG}\) does not change the outcome of the dual program and hence, we can safely assume \(0 \le \beta _{DG} \le \beta ^+\).

For the inequality \(p_{sl}^t-f_i^+\le \beta _{DG_i,t}^*\), we observe that (18) is unbounded whenever \(\beta _{i,t}^+<p_{sl}^t-f_i^+\): Consider an optimal solution \((y^*,P_{DG}^*)\) of (16), which is feasible for (18) with \(\lambda _{i,t}^*=0\) for every \(i\in \mathcal {N}_{DG}, t\in T\). If we fix \(k\in \mathcal {N}_{DG}, t'\in T\), then we observe that \(P_{DG_k}^{t'} (\mu ){:}{=}(P_{DG_k}^{t'})^* + \mu \), \(P_{DG_k}^{+,t'} (\mu ){:}{=}(P_{DG_k}^{+,t'})^* + \mu \), \(\lambda _{k,t'}(\mu ) {:}{=}\lambda _{k,t'}^* +\mu \), \(P_{sl}^{t'}(\mu ){:}{=}(P_{sl}^{t'})^* -\mu \) is also feasible for (18) for every \(\mu >0\). Its objective value is

$$\begin{aligned} c^\top y^* - p_{sl}^{t'} \mu + f_k^+ \mu + \sum _{i\in \mathcal {N}_{DG}, t\in T} \beta _{i,t}^+ \lambda _{i,t}^* + \beta _{k,t'}^+ \mu , \end{aligned}$$

which tends to \(-\infty \) for \(\mu \rightarrow \infty \) if \(\beta _{k,t'}^+<p_{sl}^{t'}-f_k^+\). By strong duality, we obtain that (19) does not have a feasible solution for \(\beta _{k,t'}^+<p_{sl}^{t'}-f_k^+\). Since \(k,t'\) have been chosen arbitrarily, the following problem is equivalent to (19):

$$\begin{aligned} c^\top y^*=\max \ {}&b^\top \alpha - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{DG_i,\max }^t \beta _{DG_i,t} \end{aligned}$$
(20a)
$$\begin{aligned} \text {s.t.}\ {}&(A')^\top \alpha = c, \end{aligned}$$
(20b)
$$\begin{aligned}&a_{DG_i}^\top \alpha - \beta _{DG_i,t} = 0 \qquad \qquad \qquad \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(20c)
$$\begin{aligned}&\beta _{DG_i,t} + \gamma _{i,t} = \max \{r_i^{+,t},r_i^{-,t}, p_{sl}^t\}-\min \{f_i^+,f_i^-\} \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \! \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(20d)
$$\begin{aligned}&\beta _{DG_i,t} - \delta _{i,t} = p_{sl}^t-f_i^+ \qquad \qquad \!\! \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(20e)
$$\begin{aligned}&\alpha ,\beta ,\gamma , \delta \ge 0. \end{aligned}$$
(20f)

\(\square \)
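The interval from Lemma 1 is easy to evaluate numerically. The following Python sketch computes both bounds for a single pair \((i,t)\) from scalar prices and penalties; the function name `beta_bounds` and the input values are hypothetical and only chosen for illustration.

```python
def beta_bounds(r_plus, r_minus, p_sl, f_plus, f_minus):
    """Lower and upper bound on the dual variable beta from Lemma 1, one pair (i, t)."""
    lower = p_sl - f_plus
    upper = max(r_plus, r_minus, p_sl) - min(f_plus, f_minus)
    return lower, upper


# Regulation costs below the intra-day price and f_plus <= f_minus.
print(beta_bounds(r_plus=20.0, r_minus=20.0, p_sl=30.0, f_plus=1.0, f_minus=2.0))

# Regulation costs above the intra-day price: the interval has positive width.
print(beta_bounds(r_plus=40.0, r_minus=40.0, p_sl=30.0, f_plus=1.0, f_minus=2.0))
```

In the first call both bounds equal 29, whereas in the second call the interval is [29, 39]; the width of this interval determines how loose a relaxation built on these bounds can become.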

Moreover, this bound might even be sharp due to the following observation:

Remark 1

Depending on the market operations that are considered, we may have that both \(r_i^{+,t},r_i^{-,t} \le p_{sl}^t\) and \(f_i^+\le f_i^-\) hold. In this case the two bounds in Lemma 1 coincide and

$$\begin{aligned} \beta _{DG_i,t}^*=p_{sl}^t-f_i^+. \end{aligned}$$

Thus, in those cases the McCormick envelopes are exact and the upcoming corollaries provide an exact reformulation of (2). However, even if the McCormick envelope is not exact, the boundedness of both \(\beta _{DG}\) and \(P_{DG,\max }\) enables us to relax the second-level problem \(\max _{h\in \Omega } D(x,h)\) with the McCormick envelope. This gives rise to the following corollary of Theorem 1:

Corollary 1

Let \(\Omega \) be a robust ambiguity set as in (10), where \(\Omega \) denotes a polytope with potential nonnegative slack variables \(\eta \) in the rows \(i\in I\), i.e. \(\Omega =\left\{ P_{DG,\max }\in \mathbb {R}^{\mathcal {N}_{DG}\times T}, \eta \in \mathbb {R}^I_{\ge 0}: A_\Omega ^\top P_{DG,\max } + \eta =b_\Omega \right\} \). Let further \(\beta _{i,t}^-,\beta _{i,t}^+\) be a lower/upper bound for \(\beta _{DG_i,t}\). Then, the following linear program provides an upper bound to (2):

$$\begin{aligned} \min \ {}&\sum _{i\in \mathcal {N}_G,t\in T} c_{G_i^t,2} (P_{G_i}^t)^2 + c_{G_i^t,1} P_{G_i}^t + c_{G_i^t,0} + \sum _{t\in T} p_{fl}^t P_{fl}^t + c^\top y\nonumber \\&+ \sum _{i\in \mathcal {N}_{DG}, t\in T}\beta _{i,t}^+u_{i,t}^{\beta ,+} + \sum _{i\in \mathcal {N}_{DG}, t\in T}\beta _{i,t}^-u_{i,t}^{\beta ,-} + b_\Omega ^\top u_\Omega \nonumber \\&- \sum _{i\in \mathcal {N}_{DG}, t\in T}P_{DG_i,\min }^t \beta _{i,t}^- (u_\text {env})_{i,t}^1 - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{i,t}^+\beta _{i,t}^+ (u_\text {env})_{i,t}^2 \nonumber \\&- \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{i,t}^+\beta _{i,t}^- (u_\text {env})_{i,t}^3 - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{DG_i,\min }^t\beta _{i,t}^+ (u_\text {env})_{i,t}^4 \end{aligned}$$
(21a)
$$\begin{aligned} \text {s.t.}\ {}&Ay \ge b, \end{aligned}$$
(21b)
$$\begin{aligned}&-P_{DG_i}^t + u_{i,t}^{\beta ,+} + u_{i,t}^{\beta ,-} -P_{DG_i,\min }^t(u_\text {env})_{i,t}^1 - P_{i,t}^+(u_\text {env})_{i,t}^2\nonumber \\&\qquad - P_{i,t}^+ (u_\text {env})_{i,t}^3 -P_{DG_i,\min }^t(u_\text {env})_{i,t}^4 \ge 0 \qquad \qquad \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(21c)
$$\begin{aligned}&u_{i,t}^{\beta ,+} \ge 0,\ -u_{i,t}^{\beta ,-} \ge 0 \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \,\, \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(21d)
$$\begin{aligned}&(A_\Omega u_\Omega )_i -\beta _{i,t}^- ((u_\text {env})_{i,t}^1 + (u_\text {env})_{i,t}^3) -\beta _{i,t}^+ ((u_\text {env})_{i,t}^2 + (u_\text {env})_{i,t}^4) \ge 0 \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \quad \qquad \qquad \qquad \qquad \qquad \qquad \quad \,\,\,\,~~~ \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(21e)
$$\begin{aligned}&u_\Omega \ge 0 \end{aligned}$$
(21f)
$$\begin{aligned}&(u_\text {env})_{i,t}^1 + (u_\text {env})_{i,t}^2 + (u_\text {env})_{i,t}^3 + (u_\text {env})_{i,t}^4 \ge -1\nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \ \qquad \qquad \quad \qquad ~~ \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(21g)
$$\begin{aligned}&-(u_\text {env})_{i,t}^1, -(u_\text {env})_{i,t}^2 \ge 0 \qquad \qquad \qquad \qquad \qquad \quad \qquad \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(21h)
$$\begin{aligned}&(u_\text {env})_{i,t}^3, (u_\text {env})_{i,t}^4 \ge 0 \qquad \qquad \qquad \qquad \ \qquad \qquad \quad \qquad ~~ \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(21i)
$$ \begin{aligned}&(9a), \ \& \ (9b) \end{aligned}$$
(21j)

We note that Corollary 1 is a direct consequence of Theorem 1 and thus its proof follows the same lines. However, we include the full proof here, as the notation varies slightly and the proof illustrates the impact of Lemma 1.

Proof

We observe that with (17) and Lemma 1 the second-level \(\max _{P_{DG,\max }\in \Omega } \min _{y\in \mathcal {Y}(x,h)} c^\top y\) can be written as

$$\begin{aligned} \max \ {}&b^\top \alpha - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{DG_i,\max }^t \beta _{DG_i,t} \end{aligned}$$
(22a)
$$\begin{aligned} \text {s.t.}\ {}&(A')^\top \alpha = c, \end{aligned}$$
(22b)
$$\begin{aligned}&a_{DG_i}^\top \alpha - \beta _{DG_i,t} = 0&\text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(22c)
$$\begin{aligned}&\beta _{DG_i,t} + \gamma _{i,t} = \beta _{i,t}^+&\text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(22d)
$$\begin{aligned}&\beta _{DG_i,t} - \delta _{i,t} = \beta _{i,t}^-&\text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(22e)
$$\begin{aligned}&A_\Omega ^\top P_{DG,\max } + \eta = b_\Omega , \end{aligned}$$
(22f)
$$\begin{aligned}&\alpha ,\beta ,\gamma ,P_{DG,\max },\delta , \eta \ge 0, \end{aligned}$$
(22g)

where \(\beta ^-,\beta ^+\) are chosen as in Lemma 1. Next, we substitute \(\kappa _{i,t}{:}{=}P_{DG_i,\max }^t \beta _{DG_i,t}\) in the objective and relax the resulting constraint \(\kappa _{i,t}=P_{DG_i,\max }^t \beta _{DG_i,t}\) by a McCormick envelope. Note that since \(P_{DG_i,\max }^t, \beta _{DG_i,t}\ge 0\), we can immediately conclude \(\kappa \ge 0\), which simplifies our notation slightly. If we further introduce suitable nonnegative slack variables, we obtain the following dual LP:

$$\begin{aligned} \max \ {}&b^\top \alpha - \sum _{i\in \mathcal {N}_{DG}, t\in T} \kappa _{i,t} \end{aligned}$$
(23a)
$$\begin{aligned} \text {s.t.}\ {}&(A')^\top \alpha = c, \end{aligned}$$
(23b)
$$\begin{aligned}&a_{DG_i}^\top \alpha - \beta _{DG_i,t} = 0 \qquad \qquad \qquad \qquad \qquad \!\!\!\! \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(23c)
$$\begin{aligned}&\beta _{DG_i,t} + \gamma _{i,t} = \beta _{i,t}^+ \qquad \qquad \qquad \qquad \qquad \! \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(23d)
$$\begin{aligned}&\beta _{DG_i,t} - \delta _{i,t} = \beta _{i,t}^- \qquad \qquad \qquad \qquad \qquad \! \text { for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(23e)
$$\begin{aligned}&A_\Omega ^\top P_{DG,\max } + \eta = b_\Omega , \end{aligned}$$
(23f)
$$\begin{aligned}&\kappa _{i,t} = P_{DG_i,\min }^t \beta _{DG_i,t} + P_{DG_i,\max }^t\beta _{i,t}^- - P_{DG_i,\min }^t \beta _{i,t}^- + \eta _{i,t}^1 \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(23g)
$$\begin{aligned}&\kappa _{i,t} = P_{i,t}^+\beta _{DG_i,t} + P_{DG_i,\max }^t \beta _{i,t}^+ - P_{i,t}^+\beta _{i,t}^+ + \eta _{i,t}^2 \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(23h)
$$\begin{aligned}&\kappa _{i,t} = P_{i,t}^+ \beta _{DG_i,t} + P_{DG_i,\max }^t \beta _{i,t}^- -P_{i,t}^+\beta _{i,t}^- -\eta _{i,t}^3 \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(23i)
$$\begin{aligned}&\kappa _{i,t} = P_{DG_i,\max }^t \beta _{i,t}^+ + P_{DG_i,\min }^t \beta _{DG_i,t} - P_{DG_i,\min }^t\beta _{i,t}^+ -\eta _{i,t}^4 \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \text {for every } i\in \mathcal {N}_{DG}, t\in T, \end{aligned}$$
(23j)
$$\begin{aligned}&\alpha ,\beta ,\gamma , \delta ,P_{DG,\max },\rho ,\eta ,\kappa \ge 0. \end{aligned}$$
(23k)

If we denote the dual variables of (23b)–(23j) by \(y, u_{i,t}^{\beta ,+}, u_{i,t}^{\beta ,-}, u_\Omega , (u_\text {env})_{i,t}^1,(u_\text {env})_{i,t}^2,(u_\text {env})_{i,t}^3,(u_\text {env})_{i,t}^4\) respectively, then the result follows by strong duality and by including the first-level variables and objectives. \(\square \)
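The McCormick envelope used in the proof replaces the bilinear term \(P_{DG_i,\max }^t \beta _{DG_i,t}\) by two linear under-estimators and two linear over-estimators. The following Python sketch evaluates them at a given point; the function name `mccormick_bounds` and the numerical ranges are assumptions for illustration, with `p` playing the role of \(P_{DG_i,\max }^t\) and `beta` the role of \(\beta _{DG_i,t}\).

```python
def mccormick_bounds(p, beta, p_lo, p_hi, beta_lo, beta_hi):
    """Tightest McCormick under- and over-estimate of the product p * beta.

    Valid for p in [p_lo, p_hi] and beta in [beta_lo, beta_hi].
    """
    under = max(p_lo * beta + p * beta_lo - p_lo * beta_lo,
                p_hi * beta + p * beta_hi - p_hi * beta_hi)
    over = min(p_hi * beta + p * beta_lo - p_hi * beta_lo,
               p * beta_hi + p_lo * beta - p_lo * beta_hi)
    return under, over


# Hypothetical ranges: capacity in [0, 2], dual variable in [29, 39].
under, over = mccormick_bounds(1.0, 34.0, 0.0, 2.0, 29.0, 39.0)
print(under, 1.0 * 34.0, over)   # the true product lies between both estimates

# If the bounds on beta coincide, the envelope collapses to the exact product.
exact_under, exact_over = mccormick_bounds(1.5, 29.0, 0.0, 2.0, 29.0, 29.0)
print(exact_under, exact_over)
```

The second call illustrates why coinciding bounds on \(\beta _{DG_i,t}\) make the relaxation exact: with a fixed factor the product is linear, and all four estimators agree with it.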

Again, we observe that Corollary 1 provides an LP inner approximation of the linear relaxation of (1). Now, the following direct corollary of Theorem 2 incorporates the discrete (binary) decisions \(\mu _{ch}\) and \(\mu _{dch}\).

Corollary 2

Let \(\Omega \) be a robust ambiguity set as in (10), where \(\Omega \) denotes a polytope with potential nonnegative slack variables \(\eta \) in the rows \(i\in I\), i.e. \(\Omega =\left\{ P_{DG,\max }\in \mathbb {R}^{\mathcal {N}_{DG}\times T}, \eta \in \mathbb {R}^I_{\ge 0}: A_\Omega ^\top P_{DG,\max } + \eta =b_\Omega \right\} \). Let further \(\beta _{i,t}^-,\beta _{i,t}^+\) be a lower/upper bound for \(\beta _{DG_i,t}\). Then, the following MIP provides an upper bound to the tri-level MIP (1) with the given parameters from Sect. 3:

$$\begin{aligned} \min \ {}&\sum _{i\in \mathcal {N}_G,t\in T} c_{G_i^t,2} (P_{G_i}^t)^2 + c_{G_i^t,1} P_{G_i}^t + c_{G_i^t,0} + \sum _{t\in T} p_{fl}^t P_{fl}^t + c^\top y\nonumber \\&+ \sum _{i\in \mathcal {N}_{DG}, t\in T}\beta _{i,t}^+u_{i,t}^{\beta ,+} + \sum _{i\in \mathcal {N}_{DG}, t\in T}\beta _{i,t}^-u_{i,t}^{\beta ,-} + b_\Omega ^\top u_\Omega \nonumber \\&- \sum _{i\in \mathcal {N}_{DG}, t\in T}P_{DG_i,\min }^t \beta _{i,t}^- (u_\text {env})_{i,t}^1 - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{i,t}^+\beta _{i,t}^+ (u_\text {env})_{i,t}^2 \nonumber \\&- \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{i,t}^+\beta _{i,t}^- (u_\text {env})_{i,t}^3 - \sum _{i\in \mathcal {N}_{DG}, t\in T} P_{DG_i,\min }^t\beta _{i,t}^+ (u_\text {env})_{i,t}^4 \end{aligned}$$
(24a)
$$\begin{aligned} \text {s.t.}\ {}&Ay \ge b, \end{aligned}$$
(24b)
$$\begin{aligned}&-P_{DG_i}^t + u_{i,t}^{\beta ,+} + u_{i,t}^{\beta ,-} -P_{DG_i,\min }^t(u_\text {env})_{i,t}^1 - P_{i,t}^+(u_\text {env})_{i,t}^2 \nonumber \\&\qquad - P_{i,t}^+ (u_\text {env})_{i,t}^3 -P_{DG_i,\min }^t(u_\text {env})_{i,t}^4 \ge 0 \qquad \qquad \quad \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(24c)
$$\begin{aligned}&u_{i,t}^{\beta ,+} \ge 0,\ -u_{i,t}^{\beta ,-} \ge 0 \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \,\, \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(24d)
$$\begin{aligned}&(A_\Omega u_\Omega )_i -\beta _{i,t}^- ((u_\text {env})_{i,t}^1 + (u_\text {env})_{i,t}^3) -\beta _{i,t}^+ ((u_\text {env})_{i,t}^2 + (u_\text {env})_{i,t}^4) \ge 0 \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \!\forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(24e)
$$\begin{aligned}&u_\Omega \ge 0 \end{aligned}$$
(24f)
$$\begin{aligned}&(u_\text {env})_{i,t}^1 + (u_\text {env})_{i,t}^2 + (u_\text {env})_{i,t}^3 + (u_\text {env})_{i,t}^4 \ge -1 \qquad \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(24g)
$$\begin{aligned}&-(u_\text {env})_{i,t}^1, -(u_\text {env})_{i,t}^2 \ge 0 \qquad \qquad \qquad \qquad \qquad \qquad \qquad \!\forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(24h)
$$\begin{aligned}&(u_\text {env})_{i,t}^3, (u_\text {env})_{i,t}^4 \ge 0 \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \,\, \forall i\in \mathcal {N}_{DG}, t\in T \end{aligned}$$
(24i)
$$ \begin{aligned}&(9a), \ \& \ (9b) \end{aligned}$$
(24j)
$$\begin{aligned}&\mu _{ch}, \mu _{dch} \in \{0,1\}^{\mathcal {N}_S\times T} \end{aligned}$$
(24k)
$$\begin{aligned}&P_{G_i}^t, P_{fl}^t, y', P_{DG_i}^t, u^{\beta ,+}_{i,t}, u^{\beta ,-}_{i,t}, u^P_{i,t}, (u_\text {env})^1_{i,t}, (u_\text {env})^2_{i,t}, (u_\text {env})^3_{i,t}, (u_\text {env})^4_{i,t} \in \mathbb {R}. \end{aligned}$$
(24l)

Thus, our approximation technique is applicable to the optimization of smart converters in power grids. Moreover, the only strict relaxation comes from approximating the uncertainty set \(\Omega \) by McCormick envelopes of the bilinear terms and may, depending on the second-level (intra-day) market price \(p_{sl}\) and the penalizations for adjustments \(f^+,f^-,r^+,r^-\), even be sharp. Hence, it seems natural to test our approximations numerically.

4 Computational results

We present the results of three case studies in this section. All of these instances are considered on a daily basis divided into hourly (24 periods) or 15-minute (96 periods) time intervals. The first benchmark is a 5-bus instance, based on the “case5.m” instance from the MATPOWER library (Zimmerman et al. 2011). Second, we consider a 30-bus instance, based on the “case_ieee30.m” instance of the MATPOWER library, and finally a modified version of the IEEE 118-bus system similar to the one in Cobos et al. (2018). The computations were executed via Gurobi 10.0.0 under Python 3.7 on a MacBook Pro (2019) notebook with a 2.8 GHz quad-core Intel Core i7 and 16 GB of RAM.

4.1 5-bus example

The topology of the test system is shown in Fig. 1, where we consider bus 1 to be the root node. We observe that the conventional generators are connected to buses 1 and 4, i.e. \(\mathcal {N}_G=\{1,4\}\), two distributed generators are connected to buses 1 and 5, i.e. \(\mathcal {N}_{DG}=\{1,5\}\), and an energy storage unit is connected to bus 3, i.e. \(\mathcal {N}_S=\{3\}\).

Fig. 1 The case5.m network with its corresponding generators and flows at 3am. \(P^d=0\) at buses 1 and 5

Whether a generator in “case5.m” is a conventional/renewable one or a storage was decided by the authors. Both the day-ahead and intra-day market prices \(p_{fl}, p_{sl}\) were taken as averages from the “miso” data set of the Pecan Street database (pecanstreet 2022) for October 3rd, 2022. The daily deviations in \(P^d\), denoted by \(\Delta _d^t\), and the daily deviations in \(P_{DG,\max }^t\), denoted by \(\Delta _{DG}^t\), were similarly taken from the “california_iso” data set of the Pecan Street database for October 3rd, 2022.

Then, the demand varying over the day was modeled as \(P^{d,t}{:}{=}P^d\cdot \Delta _d^t\) and the varying potential renewable energy production was modeled by \(P_{DG,\text {forecast}}^t{:}{=} \min \{\frac{P_{DG,\min }+P_{DG,\max } }{2}\cdot \Delta _{DG}^t, P_{i,t}^+\}\).
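The construction of the time series can be sketched in a few lines of Python. The helper names `scale_demand` and `dg_forecast` as well as the deviation profiles are hypothetical; in particular, the numbers below are not the Pecan Street data used in the experiments.

```python
def scale_demand(p_d_base, delta_d):
    """Time-dependent demand: base demand times the daily deviation profile."""
    return [p_d_base * d for d in delta_d]


def dg_forecast(p_dg_min, p_dg_max, delta_dg, p_plus):
    """Renewable forecast: scaled midpoint of the output range, capped by P^+."""
    midpoint = 0.5 * (p_dg_min + p_dg_max)
    return [min(midpoint * d, p_plus) for d in delta_dg]


# Hypothetical three-period profiles.
demand_series = scale_demand(3.0, [0.5, 1.0, 1.5])
forecast_series = dg_forecast(0.0, 2.0, [0.5, 1.0, 2.5], p_plus=2.0)
print(demand_series)     # demand per period
print(forecast_series)   # the last entry is capped by the technical limit
```

The cap by the technical limit ensures that the forecast never exceeds what the renewable generator can deliver under optimal conditions.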

Solving Problem (24) takes less than 1 s. However, its objective value highly depends on the uncertainty imposed on the system. We illustrate in Fig. 2 how sensitively the optimal value reacts to changes in the maximal forecast error R, which crucially determines \(\Omega \) through (10). In particular, since in “case5.m” we have negligible upward and downward regulation costs and \(f^-\ge f^+\), Lemma 1 implies that the McCormick relaxation is sharp if and only if \(r^-,r^+ \le p_{sl}\). As realistic penalties with \(r^+ = r^-\), we assume \(r_1^+=r_1^-= 14\)$ p.u., which are the costs of operating the first generator at bus 1. For the sake of a better analysis, we replace the natural choice \(r_4^+=r_4^-= 40\)$ p.u., which are the costs of operating the generator at bus 4, by \(r_4^+=r_4^-= 20\)$ p.u., as then \(r^-,r^+ \le p_{sl}\) and we can compare variations in the penalties to an optimal robust solution.

In particular, for this instance, \(R=1\) yields the nominal optimal solution with an objective value of 727,082.

Fig. 2 Objective value of (24) on “case5.m” under varying R

Moreover, the blue line in Fig. 2 illustrates the perfectly linear relation between the lower bound of the uncertainty set \(\Omega \) and the objective value of (24). We conclude that the DSO may cut its worst-case costs by almost \(50\%\) with a perfectly accurate weather forecast, i.e. \(\Omega =\{P_{DG,\text {forecast}}\}\). As this is unrealistic with present forecasting methods, we would like to highlight that one may already gain significant worst-case cost reductions by incorporating more information on \(\Omega \).

Moreover, the red line in Fig. 2 shows an upper bound to (2) given by Corollary 2 in case \(r^{+,t}_4 > p_{sl}^t\) for some t, i.e. in case the condition of Lemma 1 is violated. Since the objective value with respect to this penalty lies between the red and the blue line, Fig. 2 shows a rather strong approximation quality for this particular instance. However, we want to stress that this only holds for this particular example and may not be a general pattern.

4.2 30-bus example

As in the 5-bus example, the topology of the 30-bus test system is taken from a standard case file, here “case_ieee30.m”, a system with 41 transmission lines and, after modification, 4 dispatchable generators as well as 4 energy storage units. The only renewable generator is placed at bus 2, i.e. \(\mathcal {N}_{DG}=\{2\}\). In addition, we choose \(\mathcal {N}_{G}=\{5,8,11,13\}\) and the four energy storage units to be connected to buses 1, 2, 8, 13, i.e. \(\mathcal {N}_S=\{1,2,8,13\}\). The estimation procedure for market prices and demands is kept from the “case5.m” example. Since “case_ieee30.m” also does not include upward or downward regulation costs for the generators, the McCormick envelope is sharp and for \(R=1\), i.e. \(\Omega =\{P_{DG,\text {forecast}}\}\), the nominal value of 104,088 is attained.

Moreover, Fig. 3 illustrates the dependency of the worst-case revenue on the choice of R. We want to highlight that, also in this more elaborate example, the runtime was \(<1\) s.

Fig. 3 Objective values of (24) with \(|\mathcal {N}_S|\in \{0,2,4\}\) on “case_ieee30.m”

Note that different slopes may occur due to the different capabilities of the storage units. In particular, the improved performance from \(\mathcal {N}_S=\emptyset \) to \(\mathcal {N}_S\ne \emptyset \) indicates that the capability of storing all renewable energy produced within the transmission system is more valuable than storing energy from prior purchases, i.e. externally produced energy.

4.3 Analyzing the runtime on larger instances

After illustrating the behavior of the objective value under different uncertainties and storage configurations, we focus on the main advantage of the proposed MIP approach, namely its speed. As the scaling of the runtime is crucial in industrial applications, we demonstrate the applicability of our algorithm to larger power systems, in particular a 118-bus (“case118.m”), a 200-bus (“case200.m”) and a 300-bus (“case300.m”) test system.

To this end, we aim to keep the considered instances as comparable as possible. In particular, we again keep the estimation procedure for market prices and demands from the “case5.m” example. Additionally, none of the considered instances contains upward or downward regulation costs for the generators and thus, due to Lemma 1, we always compute exact solutions to (2). Lastly, we note that the number of storage units \(|\mathcal {N}_S|\) determines the number of binary variables in (24) and consequently should crucially impact the runtime. Thus, we equipped our test systems with \(|\mathcal {N}_S|=6\), \(|\mathcal {N}_S|=10\) and \(|\mathcal {N}_S|=15\) storage units, respectively, in order to achieve an approximately constant ratio of buses to storage units, i.e. \(|V|/|\mathcal {N}_S|\approx 20\). To further improve comparability, we also recomputed the 30-bus system with 2 storage units instead of 4. The following figure illustrates the achieved runtime:
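The bus-to-storage ratios behind this choice can be checked with a short sketch. The bus and storage counts are those stated in the text, under the assumption that \(|V|\) equals the number of buses in the case name; the dictionary layout and variable names are merely illustrative.

```python
# (number of buses |V|, number of storage units |N_S|) per test system,
# as stated in the text; the data layout itself is an illustrative choice.
instances = {
    "case_ieee30.m": (30, 2),
    "case118.m": (118, 6),
    "case200.m": (200, 10),
    "case300.m": (300, 15),
}

# Ratio |V| / |N_S| for each instance.
ratios = {name: buses / storages for name, (buses, storages) in instances.items()}

for name, ratio in ratios.items():
    print(f"{name}: |V|/|N_S| = {ratio:.1f}")
```

The three larger systems give ratios of roughly 19.7, 20 and 20, while the recomputed 30-bus system gives 15, which is closer to the target than the 7.5 obtained with 4 storage units.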

Fig. 4 Runtime comparison: “case5.m” with \(|\mathcal {N}_S|=1\), “case_ieee30.m” with \(|\mathcal {N}_S|=2\), “case118.m” with \(|\mathcal {N}_S|=6\), “case200.m” with \(|\mathcal {N}_S|=10\), “case300.m” with \(|\mathcal {N}_S|=15\)

As (2) is a notoriously challenging problem, see Question in Conejo and Wu (2022) and Section 6 in Yanıkoğlu et al. (2019), benchmarks on the exact problem setting are, to the best of our knowledge, rare. However, Cobos et al. (2018) applied a nested column generation approach in order to solve a strongly related problem on “case118.m” and achieved runtimes between 200 s and 800 s. The instances considered in Cobos et al. (2018) contain \(|\mathcal {N}_G||T| + 3|\mathcal {N}_S| |T|\) first-level, \(2|\mathcal {N}_{DG}| |T|\) second-level and \(|\mathcal {N}_S| |T|\) third-level binary variables, which is a significantly larger number of binary variables than in our instances, since in Cobos et al. (2018) we have \(|\mathcal {N}_G|= 54, |\mathcal {N}_{DG}|=10, |\mathcal {N}_S| = 6\) and \(|T|=24\). In summary, the nested column generation in Cobos et al. (2018) addresses \(54\cdot 24 + 3\cdot 6 \cdot 24 + 2 \cdot 10 \cdot 24 + 6 \cdot 24 = 98 \cdot 24 = 2352\) binary decisions on a comparable instance. As Fig. 4 illustrates, the proposed algorithm solves its instance in \(\approx 3\) s, but with significantly fewer, namely \(2\cdot 6\cdot 24 = 12 \cdot 24 = 288\), binary decisions. Thus, a direct comparison with Cobos et al. (2018) seems rather inappropriate.
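The counting argument above can be reproduced with a small sketch. The per-level formulas for Cobos et al. (2018) are the ones quoted in the text; the general expression \(2|\mathcal {N}_S||T|\) for the proposed approach is an assumption inferred from the arithmetic \(2\cdot 6\cdot 24\), and the function names are hypothetical.

```python
def cobos_binaries(n_g, n_dg, n_s, horizon):
    """Binary variables per level for the instances in Cobos et al. (2018),
    using the per-level formulas quoted in the text."""
    first = (n_g + 3 * n_s) * horizon   # |N_G||T| + 3|N_S||T|
    second = 2 * n_dg * horizon         # 2|N_DG||T|
    third = n_s * horizon               # |N_S||T|
    return first, second, third


def proposed_binaries(n_s, horizon):
    """Binary variables of the proposed MIP, assuming the count 2|N_S||T|
    (inferred from 2 * 6 * 24 in the text, not stated as a general formula)."""
    return 2 * n_s * horizon


levels = cobos_binaries(n_g=54, n_dg=10, n_s=6, horizon=24)
total_cobos = sum(levels)
total_proposed = proposed_binaries(n_s=6, horizon=24)

print(levels, total_cobos, total_proposed)
```

Under these assumptions the instance in Cobos et al. (2018) carries roughly eight times as many binary decisions as the “case118.m” instance considered here, which is why the runtimes are not directly comparable.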

Furthermore, the parametric programming approach presented in Avraamidou and Pistikopoulos (2019) can be used to solve the ARO (2), even if (2) is not weakly-connected. The same authors demonstrate that an instance with 60 binary variables on various levels can be solved within 15 s, see Table 7, Problem P5 in Avraamidou and Pistikopoulos (2020). As the number of binary variables of this instance is still significantly smaller than in “case_ieee30.m”, which the presented MIP framework can solve within \(<1\) s, it seems natural to conjecture that the MIP approach outperforms the parametric programming approach on weakly-connected AROs in terms of runtime. However, we would like to highlight that the work in Avraamidou and Pistikopoulos (2019) rather aims at wide applicability, as the authors present a significantly more general approach.

5 Conclusion

This article presents a new MIP framework to approximate adjustable robust programs with integer variables in the innermost (adjustment) stage. It is based on a weak connection between the separate stages and uses a McCormick envelope to strengthen the adversary, thereby relaxing the ARO. We have proven that the resulting MIP provides feasible first-stage solutions that can be adjusted to a solution whose objective value is at least as good as the ARO objective, regardless of the realization of the uncertainty. In addition, we have provided a sufficient criterion for the exactness of our approximation.

Moreover, we applied our results to model discrete adjustments of smart converters in a power system and provided numerical evidence that our approach is competitive with previous methods such as nested column generation or parametric programming in terms of runtime.