1 Introduction

The proportion of renewable energy, such as solar and wind energy, in electrical distribution networks is constantly increasing. Due to these difficult to predict and highly fluctuating energy sources, the operational management of electrical networks becomes very challenging. Transmission system operators (TSO) have to control the feed-in and the power distribution in the network and have to meet safety requirements at the same time. If the network risks a system overload, the feed-in from renewables must be curtailed. However, the curtailed energy has to be minimized for financial and ecological reasons. Therefore, there is a high demand for the combination of advanced forecasting and optimization models. In this work, we show how these models can be applied and combined for the optimal curtailment of solar feed-in in an electrical distribution network.

The predominantly used model for optimizing the production and distribution of power in an electrical network is the Optimal Power Flow (OPF) model. In its classic version, originally introduced in Carpentier (1962), this is a non-linear non-convex optimization problem which is hard to solve. For a broad overview of the literature on OPF, we refer to Frank et al. (2012a) and Frank et al. (2012b). Due to the computational difficulty of the OPF problem, several approximation approaches have been proposed in the literature. One of the most frequently used approximations is the DC Optimal Power Flow (DC OPF), see Christie et al. (2000). It results in a power flow model that includes only linear constraints and can be solved efficiently with standard software. The DC OPF model is also used in this work for the optimization of power grids under uncertainty.

In applications to power grids, it is important to ensure that there is a sufficiently high probability (chosen beforehand) that all safety constraints like transmission limits are satisfied. This can be modeled with a two-stage stochastic optimization model incorporating joint chance constraints that enforce the simultaneous satisfaction of several constraints with a predefined probability. In the first stage, the nominal network operating solution, including generator output, (discrete) curtailment, power flows and voltage angles, is decided before the realization of uncertainty is revealed (here-and-now). After the uncertain parameters manifest themselves, the second-stage variables react to them. In the second stage, the network response to fluctuation ensures that there is a high probability of transmission limits being maintained. From a practical perspective, protection through probabilistic constraints is suitable because short-term overloads in the electrical network are acceptable. In the event of larger or longer lasting overloads, countermeasures will need to be taken, where a TSO will need to (re-)optimize interventions in order to stabilize the network. In our model, curtailment limits the output of renewable power production to a specific percentage of the installed power.

We approximate the probabilistic constraints in the optimization problem using robust constraints within a robust safe approximation, see Nemirovski (2012). By a suitable choice of the uncertainty set we can ensure that all robust feasible solutions are also feasible for the stochastic optimization problem. The constraints of the robust approximation thus lead to sufficient conditions for the chance constraints being satisfied. In particular, we use a mixed-integer linear reformulation for the approximation introduced in Aigner et al. (2021). Hence, by solving only one mixed-integer optimization model to global optimality, a robust solution is computed that is feasible for the chance constrained problem. The respective uncertainty sets are computed with the procedure proposed in Margellos et al. (2014) based on the scenario approach (see Calafiore and Campi (2005)) of stochastic optimization, which uses samples from a suitably chosen probability distribution. The present paper proposes several enhancements of our previous work, which consist in the utilization of R-vine copulas (see e.g. Joe (2015)), a flexible parametric model to construct multivariate probability densities by decomposing them into several bivariate conditional (and univariate) densities to fit distributions to available data. Note that R-vine copulas contain the family of D-vine copulas as a special case, which we have used so far to model data from meteorology and solar power supply, see Schinke-Nendza et al. (2021); von Loeper et al. (2021). From the fitted R-vine copula model we draw samples in order to obtain the uncertainty sets with the help of the scenario approach mentioned above. Then, in a second step, we modify the R-vine copula model such that we can draw samples from conditional distributions. This allows us to determine uncertainty sets depending on weather forecasts provided by DWD (German Meteorological Service) which are significantly smaller and lead to a drastic reduction of conservatism and less costly curtailment with the same probabilistic guarantees.

There are many research activities regarding OPF under uncertainty. The goal is to determine an optimal network configuration that remains feasible under uncertainty where the approach considered in this paper uses methods and models from stochastic and robust optimization. We refer to Ben-Tal et al. (2009) and Prékopa (1995) for a broad overview of these two paradigms regarding optimization under uncertainty. Note that due to the non-convexity of the nominal AC OPF, only solutions that are approximately protected against uncertainty can be computed as in (Dall’Anese et al. 2017; Roald and Andersson 2018; Zhang and Li 2011) with robust or probabilistic constraints.

Essential for an algorithmically tractable treatment of uncertainty in optimization problems is the possibility to solve the underlying deterministic problems (without uncertainty) efficiently. This is why the linear DC OPF model is suitable and of great interest. Such uncertain optimization problems are usually solved by reformulating them under specific assumptions on the underlying probability distribution or by using approximation techniques from stochastic programming. Most chance constrained OPF problems considered in the literature have separate chance constraints for each engineering limit, including both generation and transmission limits. For example, the authors of Bienstock et al. (2014); Lubin et al. (2016) focus on OPF with individual probabilistic constraints under Gaussian distributions. Uncertainty probabilities for specific classes of probability distributions are considered robustly in Roald et al. (2015); Xie and Ahmed (2018). Furthermore, there is a limited number of papers that deal with joint chance constrained OPF models. They allow much stronger system security guarantees, but are much harder to solve, see Geng and Xie (2019). The most common solution methods are based on the Boolean or Bonferroni approximation (see e.g. Jia et al. (2021)) and on scenario approximations (see e.g. Peña-Ordieres et al. (2021)).

In addition, the curtailment of renewable power is used in practice to reduce the feed-in of renewable energy sources, maintaining network stability and avoiding overloads of transmission lines. The curtailment of uncertain feed-in from renewables has also been considered in several OPF models. Examples can be found in (Aigner et al. 2021; Roald et al. 2016; Qiu and Wang 2014; Wang et al. 2011; Dall’Anese et al. 2017). Note that there are two principal types of curtailment strategies, which are usually modeled by additional discrete or continuous decision variables or fixed parameters. The first and more common type of curtailment uses output capacities, which restrict the maximum possible power input. This limit cannot be exceeded and any potential power production above the limit is cut off. The second type of curtailment reduces the produced energy by a fixed value regardless of how high the feed-in amount is. Chance constraints in combination with curtailment are usually tackled by sampling techniques from stochastic optimization already mentioned above. In the present paper we use discrete curtailment levels as it is common practice in many industrial applications and set by law in Germany.

To construct parametric models for multivariate distributions, vine copulas are a versatile tool which has been used in the literature for similar problems. For example, in Guo et al. (2021); Khuntia et al. (2019); Xiao et al. (2020), copulas are applied for dependency modeling of wind power in conjunction with OPF. Furthermore, in Xu et al. (2021), Gaussian copulas are used to determine uncertainty sets for an OPF problem with chance constraints.

The main contribution of the present paper is an extension of the safe approximation of the joint chance constrained DC OPF model introduced in Aigner et al. (2021), by combining it with a model-based prediction of solar power supply via copulas. Furthermore, additional information gained from weather data can be integrated into the copula approach and thus conditional distributions of solar power supply can be modeled. However, with regard to conditional sampling, vine copulas have some restrictions as described in Cooke et al. (2015), i.e., when drawing conditional samples from a given vine copula model, only some components of the underlying vector data can be taken into account in the conditioning set. To resolve this issue, various algorithms for conditional sampling from D- and C-vine copulas have been considered in the literature, see Bevacqua et al. (2017). In the present paper we propose a modification of the fitting procedure for the more general class of R-vine copulas. This modification allows us to obtain a suitable R-vine copula for any set of components on which we want to condition. To the best of our knowledge, this modification has not been considered before.

The rest of this paper is structured as follows. Section 2 recalls the joint chance constrained DC OPF model considered in Aigner et al. (2021), together with its robust approximation using box uncertainty sets. Then, in Sect. 3, the modeling of the underlying multivariate probability distribution with the help of R-vine copulas is introduced, where suitable uncertainty sets are constructed via the novel combination of the scenario approach and the fitted R-vine copulas. The numerical results of case studies based on real-world data for the distribution network of N-ENERGIE GmbH are presented in Sect. 4. They demonstrate the benefit of combining stochastic programming with a model-based prediction of uncertainty via copulas. The computed solutions are robust and lead to a relatively small cost increase compared to the nominal optimization model that ignores uncertainty. The consideration of conditional probability distributions further improves the solution quality. Finally, Sect. 5 concludes.

2 Chance constrained DC optimal power flow model

In this section, we recall the chance constrained DC optimal power flow model with the possibility to curtail feed-in proposed in Aigner et al. (2021), which is based on Bienstock et al. (2014).

2.1 Nominal DC optimal power flow with curtailment

We model the electrical distribution network as an undirected graph \(\mathcal {G}=(\mathcal {N},\mathcal {L})\) where \(\mathcal {N}=\{1,\ldots ,n\}\) for some integer \(n>1\) represents the set of vertices and \(\mathcal {L}\subseteq \mathcal {N}\times \mathcal {N}\) denotes the set of edges. In the context of power system optimization, vertices are also called nodes or buses, and edges are called (transmission) lines. The set of those nodes that are connected with (continuously controllable) slack generators of higher network hierarchies is denoted by \(\mathcal {N}_\text {G}\subseteq \mathcal {N}\). Furthermore, for each \(k\in \mathcal {N}\) we denote the set of adjacent nodes with \(\mathcal {N}(k)\subseteq \mathcal {N}\). For notational ease, we assume that every node is connected to (discretely) controllable solar power generation units. The energy production on a bus without solar feed-in is set equal to zero.

In order to control the solar feed-in, discrete regulation decisions can be made at each node. Curtailment is realized by restricting the maximum feed-in to a certain fraction vector \(\beta =(\beta _1,\ldots ,\beta _{n})\in \mathcal {S}= \mathcal {S}_1\times \ldots \times \mathcal {S}_{n}\subset [0,1]^{n}\) of the installed capacity vector

$$\begin{aligned} P^{\text {I}}=(P^{\text {I}}_1,\ldots ,P^{\text {I}}_{n}) \in [0,\infty )^{n}. \end{aligned}$$

Note that the installed capacity is the intended full-load sustained solar energy production at each node. In practice, sets of curtailment factors with a small number of levels are common. Typical sets of curtailment factors for single nodes are \(\{0,\,0.3,\,0.6,\,1.0\}\) or \(\{0,\,0.1,\,0.2,\,\ldots ,\,1.0\}\).

Thus, at a node \(k \in \mathcal {N}\), the power fed into the network cannot exceed \(\beta _k P^{\text {I}}_k\). Any potential feed-in above this value is cut off. We model the curtailed uncertain solar feed-in \(P^{\text {in}}_k\) based on a given solar power production \(P^{\text {PV}}_k\ge 0\) via

$$\begin{aligned} P^{\text {in}}_k&= {\left\{ \begin{array}{ll} P^{\text {PV}}_k, &{} \hbox {if} \quad P^{\text {PV}}_k \le \beta _k P^{\text {I}}_k, \\ \beta _k P^{\text {I}}_k, &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

i.e., \(P^{\text {in}}_k =\min \{P^{\text {PV}}_k, \beta _k P^{\text {I}}_k\}\).
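As a small illustration of the cut-off rule above, the following Python sketch evaluates \(P^{\text {in}}_k =\min \{P^{\text {PV}}_k, \beta _k P^{\text {I}}_k\}\) for a few nodes. The function name `curtailed_feed_in` and all numerical values are made up for illustration and are not taken from the case study.

```python
def curtailed_feed_in(p_pv, beta, p_installed):
    """Return min{P^PV_k, beta_k * P^I_k} for every node k."""
    return [min(pv, b * cap) for pv, b, cap in zip(p_pv, beta, p_installed)]


# Hypothetical data for three nodes (arbitrary power units).
p_pv = [4.0, 1.5, 0.0]          # solar power production P^PV
p_installed = [5.0, 5.0, 2.0]   # installed capacity P^I
beta = [0.6, 0.6, 1.0]          # curtailment factors taken from {0, 0.3, 0.6, 1.0}

p_in = curtailed_feed_in(p_pv, beta, p_installed)
```

In this example only the first node is actually curtailed: its production exceeds 60% of the installed capacity, whereas the second node stays below the same limit and feeds in its full production.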

In the following, we briefly recall the DC optimal power flow model with discrete curtailment of solar feed-in proposed in Aigner et al. (2021), where Table 1 summarizes the notation used for decision variables and input parameters.

Table 1 Notation for decision variables and input parameters

Decision variables are the vectors of generator outputs \(P^{\text {G}}= (P^{\text {G}}_k)_{k\in \mathcal {N}_\text {G}}\in [0,\infty )^{|\mathcal {N}_\text {G}|}\), voltage angles \(\theta = (\theta _1,\ldots ,\theta _n) \in [-\pi ,\pi ]^{n}\), power flows \(p = (p_{kl})_{(k,l)\in \mathcal {L}} \in \mathbbm {R}^{|\mathcal {L}|}\) and curtailment factors \(\beta \in \mathcal {S}\), where \(|\mathcal {N}_\text {G}|\), \(|\mathcal {L}|\) denote the cardinalities of the sets \(\mathcal {N}_\text {G}\) and \(\mathcal {L}\), respectively. The model reads as follows:

$$\begin{aligned} {} \underset{P^{\text {G}}, \theta , p, \beta }{\text {min}}&\quad \sum \nolimits _{k \in \mathcal {N}_\text {G}} f_k(P^{\text {G}}_k) + \sum \nolimits _{k \in \mathcal {N}} c_k(\beta _k) \end{aligned}$$
$$\begin{aligned} \text {such that}&\quad P^{\text {G}}_k + \min \{P^{\text {PV}}_k, \beta _k P^{\text {I}}_k \} - P^{\text {D}}_k= \sum \nolimits _{l \in \mathcal {N}(k)} p_{kl}&\quad \text {for all } k \in \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad \min \{P^{\text {PV}}_k, \beta _k P^{\text {I}}_k \} - P^{\text {D}}_k = \sum \nolimits _{l \in \mathcal {N}(k)} p_{kl}&\quad \text {for all } k \in \mathcal {N}\setminus \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad p_{kl}=b_{kl}(\theta _k-\theta _l)&\quad \text {for all } (k,l) \in \mathcal {L}, \end{aligned}$$
$$\begin{aligned}&\quad -d^{+}_{kl} \le p_{kl}\le d^{+}_{kl}&\quad \text {for all } (k,l) \in \mathcal {L}, \end{aligned}$$
$$\begin{aligned}&\quad P^{\text {G},-}_k \le P^{\text {G}}_k \le P^{\text {G},+}_k&\quad \text {for all } k \in \mathcal {N}_\text {G}, \end{aligned}$$

where the functions \(f_k:[0,\infty )\rightarrow [0,\infty )\) and \(c_k:[0,1]\rightarrow [0,\infty )\) model generator and curtailment costs, respectively.

The equality constraints (1b)–(1d) model the active power flow, which is determined by the power flow equations (1d) and Kirchhoff’s first law where we distinguish the two cases with and without generators, see (1b) and (1c) respectively. Note that the power at each node has to be balanced. This means that at each node \(k \in \mathcal {N}\) the active power production \(P^{\text {G}}_k + P^{\text {in}}_k \in [0,\infty )\) from generators and renewables equals the demand \(P^{\text {D}}_k \ge 0\) plus the active power sent to adjacent nodes \(\sum \nolimits _{l \in \mathcal {N}(k)} p_{kl} \in \mathbbm {R}\). The active power flow on transmission line \((k,l)\in \mathcal {L}\) is the product of voltage angle difference \(\theta _k-\theta _l\in [-2\pi ,2\pi ]\) and susceptance \(b_{kl}>0\). At the same time, the transmission limits considered in (1e) must not be exceeded. The vector of generator outputs \(P^{\text {G}}\) can be continuously controlled within the generator bounds considered in (1f). Furthermore, we assume that there is a bus \(k_0\in \mathcal {N}\) with a reference angle \(\theta _{k_0}=0\).

The optimization task consists in minimizing the objective function given in (1a) which is the sum of power generation costs (\(f_k\)) and curtailment costs (\(c_k\)) subject to the constraints mentioned above. Note that the functions \(f_k\) for all \(k\in \mathcal {N}_\text {G}\) and \(c_k\) for all \(k\in \mathcal {N}\) can be assumed to be linear or convex quadratic in their respective arguments. Since the minimum expressions in (1b) and (1c) can be linearized by introducing auxiliary variables and additional linear constraints (see e.g. Sherali and Adams (2013)), the optimization problem considered in (1) is a mixed-integer linear or convex quadratic program and can be solved efficiently to global optimality with standard techniques and software using, e.g., the Gurobi optimizer [23].
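To indicate why the minimum expressions do not destroy linearity in the nominal model, one possible linearization is sketched below. It exploits that each \(\mathcal {S}_k\) is finite and that \(P^{\text {PV}}_k\) is a given parameter. The enumeration \(\mathcal {S}_k=\{s_{k,1},\ldots ,s_{k,m_k}\}\) and the binary variables \(y_{k,j}\) are notation introduced here for illustration only, and this is not necessarily the reformulation used in Sherali and Adams (2013) or Aigner et al. (2021).

$$\begin{aligned} \beta _k&= \sum \nolimits _{j=1}^{m_k} s_{k,j}\, y_{k,j}, \qquad \sum \nolimits _{j=1}^{m_k} y_{k,j} = 1, \qquad y_{k,j}\in \{0,1\}, \\ \min \{P^{\text {PV}}_k, \beta _k P^{\text {I}}_k\}&= \sum \nolimits _{j=1}^{m_k} \min \{P^{\text {PV}}_k, s_{k,j} P^{\text {I}}_k\}\, y_{k,j}. \end{aligned}$$

Since exactly one \(y_{k,j}\) equals one and the coefficients \(\min \{P^{\text {PV}}_k, s_{k,j} P^{\text {I}}_k\}\) are constants, the right-hand side is linear in the binary variables.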

2.2 Uncertainty modeling

In practice, the vector of solar power production \(P^{\text {PV}}=(P^{\text {PV}}_1,\ldots ,P^{\text {PV}}_{n}) \in [0,\infty )^{n}\) is not known in advance. In addition, the production of renewable power can be subject to high fluctuations and is therefore an uncertain quantity. Using a network operating strategy that is computed by ignoring such uncertainties, a sudden fluctuation of renewable energy can lead to overloads in the electrical network. In the worst case, this can lead to failure of network elements owing to cascade effects. To prevent this, the optimization model explained in Sect. 2.1 has to be extended in order to take such fluctuations into account, and individual feed-in units may have to be regulated. In particular, we model the vector of produced solar power \(P^{\text {PV}}\) as the sum of a vector \(P^\mathrm{F}=(P^\mathrm{F}_1,\ldots ,P^\mathrm{F}_{n}) \in [0,\infty )^{n}\) of forecasted solar power and a random fluctuation vector \(X=(X_1,\ldots ,X_{n}):\Omega \rightarrow \mathbbm {R}^{n}\) defined on some probability space \((\Omega , \mathcal{F}, \mathbb {P})\), i.e.,

$$\begin{aligned} P^{\text {PV}}_k=P^\mathrm{F}_k + X_k \qquad \hbox { for all }\;k\in \mathcal {N}. \end{aligned}$$

However, in a first step, we need to determine a nominal operating solution \((P^{\text {G}}, \theta , p)\) together with a curtailment decision \(\beta\) that is feasible for the nominal feed-in vector \(P^\mathrm{F}\) (corresponding to \(X=0\)), i.e., the decision variables \(P^{\text {G}}, \theta , p,\beta\) have to fulfill the constraints (1b)–(1e), where \(P^{\text {PV}}\) is given in (2) with \(X=0\). In addition, we require that, with high probability, the network reaction to fluctuating feed-in remains feasible, see the chance constraint given in (6g) below. To model this kind of network reaction, we consider randomized duplicates \(P^{\mathrm{G}, X}:\Omega \rightarrow [0,\infty )^{|\mathcal {N}_\text {G}|},\theta ^X:\Omega \rightarrow [-\pi ,\pi ]^{n}\) and \(p^X:\Omega \rightarrow \mathbbm {R}^{|\mathcal {L}|}\) of the decision variables \(P^{\text {G}}, \theta , p\) introduced in Sect. 2.1, which depend on the realizations \(X(\omega )\) for \(\omega \in \Omega\) of the random fluctuation vector X. Note that realizations \(X(\omega )\not =0\) of X may lead to a changed distribution of power in the network and, therefore, to an imbalanced network. The generators then change their output to \(P^{\mathrm{G},X(\omega )}\) in order to balance the total active network power. Furthermore, the decision variables \(\theta ^X\) and \(p^X\) are adjusted correspondingly to ensure feasibility.

Thus, in the setting of the two-stage stochastic optimization problem described above (see also Sects. 2.3 and 2.4), the variables \(P^{\text {G}}, \theta , p\) refer to first-stage (or here-and-now) decisions. They must be decided for the nominal feed-in vector \(P^\mathrm{F}\) (corresponding to \(X=0\)), before uncertainty is revealed. Moreover, for fixed first-stage variables \(P^{\text {G}}, \theta , p\), any realization \(X(\omega )\not =0\) of X leads to a reaction of the network by choosing optimal second-stage (or wait-and-see) variables \(P^{\mathrm{G},X(\omega )},\theta ^{X(\omega )},p^{X(\omega )}\), where we assume that the power generation is balanced by the Automatic Generation Control, see Borkowska (1974). This means that the total power generation mismatch \(\Delta _X=\sum \nolimits _{k \in \mathcal {N}} (\min \{P^\mathrm{F}_k+X_k,\beta _kP^{\text {I}}_k\}-\min \{P^\mathrm{F}_k,\beta _kP^{\text {I}}_k\})\) is shared among all generators according to given participation factors \(\alpha _k \in [0,1]\) for every \(k\in \mathcal {N}_\text {G}\) such that \(\sum \nolimits _{k \in \mathcal {N}_\text {G}} \alpha _k =1\). More precisely, for each \(\omega \in \Omega\) we put

$$\begin{aligned} P^{\mathrm{G},X(\omega )}_k = P^{\text {G}}_k - \alpha _k \Delta _{X(\omega )} \quad \text {for all } k\in \mathcal {N}_\text {G}. \end{aligned}$$
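The generator response in (3) can be made concrete with a short Python sketch. It computes the mismatch \(\Delta _X\) for one realization of X and distributes it over the generators via the participation factors. The function names and all numbers are hypothetical and serve only as an illustration.

```python
def total_mismatch(p_forecast, x, beta, p_installed):
    """Delta_X: change of the total curtailed feed-in caused by the fluctuation x."""
    return sum(
        min(f + xk, b * cap) - min(f, b * cap)
        for f, xk, b, cap in zip(p_forecast, x, beta, p_installed)
    )


def generator_response(p_gen, alpha, delta):
    """Second-stage outputs P^{G,X}_k = P^G_k - alpha_k * Delta_X."""
    return [pg - a * delta for pg, a in zip(p_gen, alpha)]


# Hypothetical data: three nodes with solar feed-in, two slack generators.
p_forecast = [2.0, 1.0, 3.0]   # forecasted solar power P^F
p_installed = [5.0, 5.0, 5.0]  # installed capacity P^I
beta = [0.6, 1.0, 0.6]         # curtailment factors
x = [0.5, 0.25, 1.0]           # one realization of the fluctuation vector X

p_gen = [4.0, 2.0]             # first-stage generator outputs P^G
alpha = [0.75, 0.25]           # participation factors, summing to one

delta = total_mismatch(p_forecast, x, beta, p_installed)
p_gen_x = generator_response(p_gen, alpha, delta)
```

In this example the upward fluctuation at the third node does not contribute to \(\Delta _X\), because the forecast already reaches the curtailment limit there and any additional production is cut off.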

The vector of decision variables \(\theta ^X\) is adjusted in a way that the power balance equations

$$\begin{aligned} P^{\mathrm{G},X(\omega )}_k+ \min \{P^\mathrm{F}_k+X_k(\omega ), \beta _k P^{\text {I}}_k\} - P^{\text {D}}_k= & {} \sum \nolimits _{l \in \mathcal {N}(k)} b_{kl} (\theta ^{X(\omega )}_k - \theta ^{X(\omega )}_l) \text { for all } k\in \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned} \min \{P^\mathrm{F}_k +X_k(\omega ), \beta _k P^{\text {I}}_k\} - P^{\text {D}}_k= & {} \sum \nolimits _{l \in \mathcal {N}(k)} b_{kl} (\theta ^{X(\omega )}_k - \theta ^{X(\omega )}_l) \text { for all } k\in \mathcal {N}\setminus \mathcal {N}_\text {G} \end{aligned}$$

are fulfilled for each \(\omega \in \Omega\). Furthermore, for each \(\omega \in \Omega\) we put

$$\begin{aligned} p^{X(\omega )}_{kl} = b_{kl} (\theta ^{X(\omega )}_k - \theta ^{X(\omega )}_l) \quad \text {for all } (k,l)\in \mathcal {L}. \end{aligned}$$

It can be shown, see Aigner et al. (2021), that for each realization \(X(\omega )\) of X the equation system given in (4a)–(4b) has a uniquely determined solution \(\theta ^{X(\omega )}\), i.e., the wait-and-see variables \(P^{\mathrm{G},X(\omega )},\theta ^{X(\omega )}\), and \(p^{X(\omega )}\) are uniquely determined by (3), (4a)–(4b), and (5).
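For intuition, the following Python sketch solves the power balance equations and evaluates the resulting flows for a hypothetical three-bus triangle network with reference bus 0. The network, the susceptances and the injections are assumptions made for this example. The 2x2 system is solved by Cramer's rule, which is merely a convenient choice for such a tiny instance.

```python
def three_bus_response(injection, b01, b12, b02):
    """Solve the DC power balance on a triangle network with theta_0 = 0.

    injection[k] is the net injection (generation + feed-in - demand) at bus k;
    the entries are assumed to sum to zero. Returns (theta, flows).
    """
    # Balance equations at buses 1 and 2 (bus 0 is the reference bus):
    #   (b01 + b12) * t1 -         b12 * t2 = injection[1]
    #         -b12  * t1 + (b02 + b12) * t2 = injection[2]
    a11, a12 = b01 + b12, -b12
    a21, a22 = -b12, b02 + b12
    det = a11 * a22 - a12 * a21  # positive for positive susceptances
    t1 = (injection[1] * a22 - a12 * injection[2]) / det
    t2 = (a11 * injection[2] - a21 * injection[1]) / det
    theta = [0.0, t1, t2]
    # Flow on line (k, l) is b_kl * (theta_k - theta_l).
    flows = {
        (0, 1): b01 * (theta[0] - theta[1]),
        (1, 2): b12 * (theta[1] - theta[2]),
        (0, 2): b02 * (theta[0] - theta[2]),
    }
    return theta, flows


# Hypothetical realization: bus 0 hosts the generator, buses 1 and 2 are net loads.
injection = [1.0, -0.4, -0.6]
theta, flows = three_bus_response(injection, b01=1.0, b12=1.0, b02=1.0)
```

Once the reference angle is fixed, the remaining angles and all line flows follow from the injections alone, which reflects the uniqueness statement above on a very small scale.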

2.3 Chance constrained DC optimal power flow

By construction, the vectors \(p^X\) and \(P^{\text {G},X}\) of power flows and generator outputs are random variables that depend on the realization \(X(\omega )\) of the random fluctuation vector X and on the values of first-stage decision variables \(P^{\text {G}}, \theta , p,\beta\). Thus, we are searching for solutions \((P^{\text {G}}, \theta , p,\beta )\) which satisfy the limits of type (1e) and (1f) for power flows and generator outputs, respectively, with a probability of at least \(1-\varepsilon\) for some small number \(\varepsilon \in [0,1]\).

We model this requirement by a joint chance constraint in order to guarantee network stability. This means that the desired compliance probabilities for all power flows and generator outputs are simultaneously met. Thus, combining all modeling elements considered in the previous sections, we formulate the joint chance constrained DC optimal power flow problem with discrete curtailment as follows:

$$\begin{aligned} \underset{P^{\text {G}},\theta ,p,\beta }{\text {min}}&\quad \sum \nolimits _{k \in \mathcal {N}_\text {G}} f_k(P^{\text {G}}_k) + \sum \nolimits _{k \in \mathcal {N}} c_k(\beta _k) \end{aligned}$$
$$\begin{aligned} \text {such that}&\quad P^{\text {G}}_k + \min \{P^\mathrm{F}_k, \beta _k P^{\text {I}}_k\} - P^{\text {D}}_k = \sum \nolimits _{l \in \mathcal {N}(k)} p_{kl} \nonumber \\&\qquad \text {for all } k \in \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad \min \{P^\mathrm{F}_k, \beta _k P^{\text {I}}_k\} - P^{\text {D}}_k = \sum \nolimits _{l \in \mathcal {N}(k)} p_{kl} \nonumber \\&\qquad \text {for all } k \in \mathcal {N}\setminus \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad p_{kl}=b_{kl}(\theta _k-\theta _l) \nonumber \\&\qquad \text {for all } (k,l) \in \mathcal {L}, \end{aligned}$$
$$\begin{aligned}&\quad -d^{+}_{kl} \le p_{kl}\le d^{+}_{kl} \nonumber \\&\qquad \text {for all } (k,l) \in \mathcal {L}, \end{aligned}$$
$$\begin{aligned}&\quad P^{\text {G},-}_k \le P^{\text {G}}_k \le P^{\text {G},+}_k \nonumber \\&\qquad \text {for all } k \in \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad {\mathbb {P} \begin{pmatrix} \bigl \{\omega \in \Omega : -d^{+}_{kl} \le p^{X(\omega )}_{kl} \le d^{+}_{kl} \text { for all } (k,l)\in \mathcal {L}\bigr \}\,\cap \\ \bigl \{\omega \in \Omega :P^{\text {G},-}_k \le P^{\mathrm{G},X(\omega )}_k \le P^{\text {G},+}_k \text { for all } k \in \mathcal {N}_\text {G}\bigr \} \end{pmatrix} \ge 1- \varepsilon ,} \end{aligned}$$

where the wait-and-see variables \(P^{\mathrm{G},X(\omega )}_k, p^{X(\omega )}_{kl}\) are defined in (3) and (5), respectively.
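The difference between individual and joint chance constraints can be illustrated empirically. The Python sketch below draws samples of two correlated quantities, which act as hypothetical stand-ins for two line loadings, and compares the individual satisfaction rates with the joint one. The Gaussian model, the limit and the sample size are assumptions for this example and are unrelated to the copula model of this paper.

```python
import random


def satisfaction_rates(samples, constraints):
    """Empirical individual and joint satisfaction rates of the constraints."""
    n = len(samples)
    individual = [sum(1 for s in samples if c(s)) / n for c in constraints]
    joint = sum(1 for s in samples if all(c(s) for c in constraints)) / n
    return individual, joint


rng = random.Random(0)
# Two hypothetical line loadings driven by a common fluctuation plus local noise.
samples = []
for _ in range(10_000):
    common = rng.gauss(0.0, 1.0)
    samples.append((common + rng.gauss(0.0, 0.5), common + rng.gauss(0.0, 0.5)))

limit = 2.0
constraints = [
    lambda s: abs(s[0]) <= limit,   # limit on the first stand-in line
    lambda s: abs(s[1]) <= limit,   # limit on the second stand-in line
]
individual, joint = satisfaction_rates(samples, constraints)
```

The joint rate can never exceed the smallest individual rate. Requiring the joint probability to be at least \(1-\varepsilon\) is therefore the stronger condition.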

2.4 Safe approximation of the chance constraints

Chance constrained optimization problems like (6) are in general hard to solve and may not be algorithmically tractable. Therefore, a large number of approximation techniques can be found in the literature, see Prékopa (1995) for a broad overview of the paradigm of stochastic optimization.

Thus, following Nemirovski (2012), we will replace the chance constraint considered in (6g) by a strictly robust protection against a suitably chosen uncertainty set \(B\in {\mathcal {B}}(\mathbbm {R}^{n})\) that fulfills

$$\begin{aligned} \mathbb {P}(\{\omega \in \Omega :X(\omega ) \in B\}) \ge 1-\varepsilon , \end{aligned}$$

where \({{\mathcal {B}}}(\mathbbm {R}^{n})\) denotes the \(\sigma\)-algebra of Borel sets in the n-dimensional Euclidean space \(\mathbbm {R}^{n}\).

The robust approximation of (6) is then given by

$$\begin{aligned} {} \underset{P^{\text {G}},\theta ,p,\beta }{\text {min}}&\quad \sum \nolimits _{k \in \mathcal {N}_\text {G}} f_k(P^{\text {G}}_k) + \sum \nolimits _{k \in \mathcal {N}} c_k(\beta _k) \end{aligned}$$
$$\begin{aligned} \text {such that } &\quad P^{\text {G}}_k + \min \{P^\mathrm{F}_k, \beta _k P^{\text {I}}_k\} - P^{\text {D}}_k = \sum \nolimits _{l \in \mathcal {N}(k)} p_{kl}&\quad \text {for all } k \in \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad \min \{P^\mathrm{F}_k, \beta _k P^{\text {I}}_k\} - P^{\text {D}}_k = \sum \nolimits _{l \in \mathcal {N}(k)} p_{kl}&\quad \text {for all } k \in \mathcal {N}\setminus \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad p_{kl}=b_{kl}(\theta _k-\theta _l)&\quad \text {for all } (k,l) \in \mathcal {L}, \end{aligned}$$
$$\begin{aligned}&\quad -d^{+}_{kl} \le p_{kl}\le d^{+}_{kl}&\quad \text {for all } (k,l) \in \mathcal {L}, \end{aligned}$$
$$\begin{aligned}&\quad P^{\text {G},-}_k \le P^{\text {G}}_k \le P^{\text {G},+}_k&\quad \text {for all } k \in \mathcal {N}_\text {G}, \end{aligned}$$
$$\begin{aligned}&\quad \max _{u\in B} p^u_{kl} \le d^{+}_{kl}, \quad \min _{u\in B} p^u_{kl} \ge -d^{+}_{kl}&\quad \text {for all } (k,l)\in \mathcal {L}, \end{aligned}$$
$$\begin{aligned}&\quad \max _{u \in B} P^{\mathrm{G}, u}_k \le P^{\text {G},+}_k, \quad \min _{u \in B} P^{\mathrm{G}, u}_k \ge P^{\text {G},-}_k&\quad \text {for all } k \in \mathcal {N}_\text {G}, \end{aligned}$$

where \(P^{\mathrm{G},u}_k,p^u_{kl}\) are determined as in (3) and (5) replacing \(X(\omega )\) by u.

One can show that every feasible solution of the safe approximation (8) is feasible for (6), see Gorissen et al. (2015). To ensure that the safe approximation does not generate overly conservative solutions, the uncertainty set \(B\) should be chosen as small as possible, but as large as necessary.

Assuming that

$$\begin{aligned} B=[\ell _1,u_1]\times \ldots \times [\ell _{n},u_{n}] \subset \mathbbm {R}^{n} \end{aligned}$$

for some \(\ell =(\ell _1,\ldots ,\ell _{{n}}),u=(u_1,\ldots ,u_{n}) \in \mathbbm {R}^{n}\) such that \(\ell _k<u_k\) for all \(k\in \mathcal {N}\), it has been shown in Aigner et al. (2021) that the optimization problem (8) possesses an equivalent mixed-integer linear reformulation which, although NP-hard in general, can be solved within reasonable time even for large instances, e.g. with the Gurobi optimizer [23].
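For a box of the form (9), the share of samples of the fluctuation vector that fall into \(B\) is easy to evaluate. The Python sketch below does this for hypothetical samples. The independent Gaussian components are a placeholder only, and the in-sample coverage computed here is a sanity check that does not by itself establish condition (7).

```python
import random


def in_box(x, lower, upper):
    """True if lower_k <= x_k <= upper_k holds for every component k."""
    return all(lo <= xk <= up for xk, lo, up in zip(x, lower, upper))


def empirical_coverage(samples, lower, upper):
    """Fraction of samples lying in [lower_1, upper_1] x ... x [lower_n, upper_n]."""
    return sum(1 for x in samples if in_box(x, lower, upper)) / len(samples)


rng = random.Random(1)
n = 3
# Hypothetical fluctuation samples (placeholder distribution).
samples = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(5_000)]

# A simple candidate box: componentwise sample minimum and maximum.
lower = [min(x[k] for x in samples) for k in range(n)]
upper = [max(x[k] for x in samples) for k in range(n)]

coverage = empirical_coverage(samples, lower, upper)
tight_coverage = empirical_coverage(samples, [-1.0] * n, [1.0] * n)
```

Shrinking the box lowers the coverage, which mirrors the trade-off between conservatism and the probability level \(1-\varepsilon\).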

3 Modeling the distribution of the random forecasting error

In order to solve the safe approximation (8) of the stochastic optimization problem (6) described in Sect. 2.3, a suitable uncertainty set \(B\subset \mathbbm {R}^n\) has to be determined such that (7) holds. For the novel construction of uncertainty sets with the help of copulas, we propose a method for modeling the multivariate probability distribution of the n-dimensional power forecasting error \(X = P^{\text {PV}}- P^\mathrm{F}\) introduced in (2). The model for the distribution of X is based on R-vine copulas, which are fitted to empirical data.

To make the paper self-contained, we first give a brief overview of some fundamentals of copula theory in Sect. 3.1. In Sects. 3.2 and 3.3 we explain how R-vine copulas are structured and how they can be fitted to empirical data. Once an R-vine copula is fitted for the distribution of the random fluctuation vector X, in Sect. 3.4 we explain how samples can be drawn from it, in order to determine an uncertainty set \(B\subset \mathbbm {R}^{n}\) of the form given in (9) which satisfies a slightly modified version of condition (7), see Sect. 3.6. Furthermore, in Sect. 3.5 we propose a modification of the fitting procedure for R-vine copulas in order to fit the distribution of the (2n)-dimensional random vector \((S,X)\) to empirical data, where \(S:\Omega \rightarrow [0,\infty )^n\) models the forecasted solar radiation at the n nodes of the electrical network. This allows for an enhanced modeling of uncertainty sets \(B_s\in {{\mathcal {B}}}(\mathbbm {R}^n)\) conditioned on \(S=s\) for any given radiation forecast \(s\in [0,\infty )^n\).

3.1 Copulas: definition and Sklar’s representation formula

A bivariate copula \(C:[0,1]^2\rightarrow [0,1]\) is the cumulative distribution function (CDF) of a two-dimensional random vector \(U=(U_1,U_2):\Omega \rightarrow [0,1]^2\), where both marginal distributions (of \(U_1\) and \(U_2\)) are the standard uniform distribution on the unit interval [0, 1], i.e., it holds that \(C(u_1,u_2)=\mathbb {P}(U_1\le u_1, U_2\le u_2)\) with \(C(u_1,1)=u_1\) and \(C(1,u_2)=u_2\) for any \(u_1,u_2\in [0,1]\). Moreover, by the choice of the copula \(C:[0,1]^2\rightarrow [0,1]\) the mutual interdependence of the components \(U_1\) and \(U_2\) can be described. For example, the product copula, where

$$\begin{aligned} C(u_1,u_2)=u_1\,u_2 \qquad \hbox { for all}\,u_1,u_2\in [0,1], \end{aligned}$$

models the case that \(U_1\) and \(U_2\) are independent random variables. On the other hand, if \(C(u_1,u_2)=\min \{u_1,u_2\}\) for all \(u_1,u_2\in [0,1]\), then \(\mathbb {P}(U_1=U_2)=1\), i.e., the components \(U_1\) and \(U_2\) are identical almost surely. Besides these two extreme cases, many further (parametric) families of bivariate copulas \(C:[0,1]^2\rightarrow [0,1]\) can be found in the literature, which model the case that \(U_1\) and \(U_2\) are neither independent nor identical. In particular, for the purposes of the present paper, the following bivariate copula families will be considered: Gaussian, Student t, Clayton, Gumbel, Frank, Joe, BB1, BB6, BB7, BB8 and their rotations, see e.g. Joe (2015); Nelsen (2006) for details.
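The two extreme cases and one parametric family from the list above can be written down in a few lines of Python. The sketch below implements the product copula, the minimum copula and the Clayton copula in its standard form with parameter \(\theta >0\), and evaluates them on a small grid. The grid and the value \(\theta =2\) are arbitrary choices for illustration.

```python
def product_copula(u1, u2):
    """Copula of two independent standard uniform random variables."""
    return u1 * u2


def min_copula(u1, u2):
    """Copula of two almost surely identical standard uniform random variables."""
    return min(u1, u2)


def clayton_copula(u1, u2, theta=2.0):
    """Clayton copula (u1^-theta + u2^-theta - 1)^(-1/theta) for theta > 0."""
    if u1 == 0.0 or u2 == 0.0:
        return 0.0  # boundary case, avoids division by zero
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
copulas = (product_copula, min_copula, clayton_copula)

# Values C(u, 1) and C(1, u) on the grid; uniform margins require both to equal u.
margins = {c.__name__: [(c(u, 1.0), c(1.0, u)) for u in grid] for c in copulas}
center = {c.__name__: c(0.5, 0.5) for c in copulas}
```

At the point (0.5, 0.5), the Clayton value lies between the product copula and the minimum copula, which is consistent with a positive dependence that is weaker than almost sure equality.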

Note that the notion of a copula is not restricted to the bivariate case. For any integer \(m\ge 2\), the function \(C:[0,1]^m\rightarrow [0,1]\) is called a copula if it is the CDF of an m-dimensional random vector \(U=(U_1,\ldots ,U_m):\Omega \rightarrow [0,1]^m\) such that the (marginal) distributions of \(U_1, \ldots ,U_m\) are the standard uniform distribution on the unit interval [0, 1]. The importance of copulas results from Sklar’s representation formula, see Joe (2015); Nelsen (2006), which states that the CDF of any random vector \(Y=(Y_1,\ldots ,Y_m):\Omega \rightarrow \mathbbm {R}^m\) with arbitrary (not necessarily uniform) marginal distributions can be written as the superposition of the univariate CDFs of \(Y_1, \ldots ,Y_m\) and a certain copula \(C:[0,1]^m\rightarrow [0,1]\). More precisely, it holds that

$$\begin{aligned} F_{1,\ldots ,m}(y_1,\ldots ,y_m)=C(F_{1}(y_1),\ldots ,F_{m}(y_m))\qquad \hbox { for all }\;y_1,\ldots ,y_m\in \mathbbm {R}, \end{aligned}$$

where \(F_{1,\ldots ,m}:\mathbbm {R}^m\rightarrow [0,1]\) with \(F_{1,\ldots ,m}(y_1,\ldots ,y_m)=\mathbb {P}(Y_1\le y_1,\ldots ,Y_m\le y_m)\) is the CDF of the m-dimensional random vector Y and \(F_{i}:\mathbbm {R}\rightarrow [0,1]\) with \(F_{i}(y_i)=\mathbb {P}(Y_i\le y_i)\) is the CDF of its ith component \(Y_i\) for each \(i\in \{1,\ldots ,m\}\). Vice versa, for any sequence \(F_{1},\ldots ,F_{m}\) of univariate CDFs and for any copula C, the superposition of \(F_{1},\ldots ,F_{m}\) and C considered on the right-hand side of (11) is the CDF of an m-dimensional random vector.
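As a minimal illustration of Sklar’s representation formula, the following Python sketch combines two univariate CDFs with a copula to obtain a joint CDF. The exponential marginals, their rates and all function names are assumptions chosen for illustration only; they are not related to the data considered in this paper.

```python
import math

def product_copula(u1, u2):
    """Independence copula C(u1, u2) = u1 * u2."""
    return u1 * u2

def min_copula(u1, u2):
    """Copula C(u1, u2) = min(u1, u2) of almost surely identical components."""
    return min(u1, u2)

def exp_cdf(rate):
    """Univariate CDF of an exponential distribution (illustrative marginal)."""
    return lambda y: 1.0 - math.exp(-rate * y) if y > 0 else 0.0

def joint_cdf(copula, F1, F2):
    """Sklar's formula in two dimensions: F(y1, y2) = C(F1(y1), F2(y2))."""
    return lambda y1, y2: copula(F1(y1), F2(y2))

# Two assumed marginals, coupled once independently and once comonotonically.
F1, F2 = exp_cdf(1.0), exp_cdf(2.0)
F_indep = joint_cdf(product_copula, F1, F2)
F_comon = joint_cdf(min_copula, F1, F2)
```

With the product copula the joint CDF factorizes into the marginal CDFs, whereas the same marginals combined with the minimum copula yield a different joint distribution. This separation of marginals and dependence structure is what the fitting procedure in the following sections relies on.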

3.2 R-vine copulas

Note that the representation formula given in (11) cannot directly be used in order to fit multivariate probability distributions to data. For this, sufficiently simple and, simultaneously, flexible parametric families of multivariate copulas \(C:[0,1]^m\rightarrow [0,1]\) are needed. One possible way to construct such parametric copula families is given by so-called R-vine copulas (regular vines), which are a generalization of D-vine copulas recently applied, e.g. in Schinke-Nendza et al. (2021); von Loeper et al. (2021), to model data from meteorology and solar power supply.

Fig. 1 Example of the structure \(\mathcal {R} = (\mathcal {T}_1, \ldots , \mathcal {T}_4)\) for an R-vine copula consisting of four trees with \(\mathcal {T}_1\) at the bottom and \(\mathcal {T}_4\) at the top

The structure of R-vine copulas offers the advantage that the probability distribution of the m-dimensional random vector \(Y=(Y_1,\ldots ,Y_m)\) to be modelled can be expressed in terms of a number of bivariate copulas. Hereby the structure of an R-vine copula is given by a vector of trees \(\mathcal {R} = (\mathcal {T}_1, \ldots , \mathcal {T}_{m-1})\) with the following properties, see also Fig. 1:

  1. \(\mathcal {T}_1=(\mathcal {V}_1,\mathcal {E}_1)\) consists of the set of vertices \(\mathcal {V}_1 = \{1, \ldots , m\}\) and some set of edges \(\mathcal {E}_1\subset \mathcal {V}_1\times \mathcal {V}_1\).

  2. For the remaining trees \(\mathcal {T}_2=(\mathcal {V}_2,\mathcal {E}_2), \ldots , \mathcal {T}_{m-1}=(\mathcal {V}_{m-1},\mathcal {E}_{m-1})\), it holds that \(\mathcal {V}_i = \mathcal {E}_{i-1}\) for each \(i\in \{2,\ldots ,m-1\}\), i.e., the set of vertices \(\mathcal {V}_i\) of \(\mathcal {T}_i\) consists of the edge set of the previous tree \(\mathcal {T}_{i-1}\).

  3. For each \(i \in \{ 1, \ldots , m-2\}\), two edges in tree \(\mathcal {T}_i\) are joined by an edge in tree \(\mathcal {T}_{i+1}\) only if these edges share one common vertex.

Let \(\mathcal {E}(\mathcal {R})\) denote the set of all edges in \(\mathcal {R}\), meaning that \(\mathcal {E}(\mathcal {R})=\mathcal {E}_1\cup \ldots \cup \mathcal {E}_{m-1}\). Furthermore, we need the following notation. First, for each \(e=\{v_1,v_2\}\in \mathcal {E}_1\) we define \(\mathcal {S}(e) = \emptyset\) and \(\mathcal {O}(e) =\{v_1,v_2\}\). Next, we iterate over \(i\in \{2,\ldots ,m-1\}\) and, for each \(e=\{v_1,v_2\}\in \mathcal {E}_i\), we define \(\mathcal {S}(e)=\mathcal {S}(v_1)\cup \mathcal {S}(v_2)\cup (\mathcal {O}(v_1)\cap \mathcal {O}(v_2))\) and \(\mathcal {O}(e)=(\mathcal {O}(v_1)\cup \mathcal {O}(v_2))\setminus \mathcal {S}(e)\). We call \(\mathcal {S}(e)\) the conditioning set and \(\mathcal {O}(e)\) the conditioned set of edge e. According to Kurowicka and Joe (2010), it holds that \(|\mathcal {O}(e)|=2\) for each \(e\in \mathcal {E}(\mathcal {R})\) and, for each pair of indices \(i,j\in \{1,\ldots ,m\}\) with \(i\not = j\), there is exactly one edge \(e\in \mathcal {E}(\mathcal {R})\) such that \(\mathcal {O}(e)=\{i,j\}\). Thus, for each \(e\in \mathcal {E}(\mathcal {R})\), there are indices \(o_1,o_2\in \{1,\ldots ,m\}\) such that \(\{o_1,o_2\}=\mathcal {O}(e)\) and \(o_1< o_2\).
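The recursive definition of the conditioning and conditioned sets can be sketched in a few lines of Python. The structure used below is a hypothetical D-vine on four variables (a path 1, 2, 3, 4 in the first tree), chosen only because it is small; it is not the structure shown in Fig. 1, and the function and variable names are illustrative assumptions.

```python
def conditioning_and_conditioned_sets(trees):
    """Compute S(e) and O(e) for all edges of a vine structure.

    trees[i] lists the edges of tree T_{i+1}. Each edge is a frozenset of
    two vertices; vertices of T_1 are integers, vertices of T_i (i > 1)
    are edges of T_{i-1}.
    """
    S, O = {}, {}
    for e in trees[0]:
        S[e] = frozenset()          # no conditioning in the first tree
        O[e] = frozenset(e)         # conditioned set is the edge itself
    for edges in trees[1:]:
        for e in edges:
            v1, v2 = tuple(e)       # both are edges of the previous tree
            S[e] = S[v1] | S[v2] | (O[v1] & O[v2])
            O[e] = (O[v1] | O[v2]) - S[e]
    return S, O

# Hypothetical D-vine structure on the variables 1, 2, 3, 4.
e12, e23, e34 = frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})
e13_2 = frozenset({e12, e23})       # edge of T_2 joining {1,2} and {2,3}
e24_3 = frozenset({e23, e34})       # edge of T_2 joining {2,3} and {3,4}
e14_23 = frozenset({e13_2, e24_3})  # single edge of T_3
trees = [[e12, e23, e34], [e13_2, e24_3], [e14_23]]
S, O = conditioning_and_conditioned_sets(trees)
```

For this structure, every conditioned set has exactly two elements and every pair of indices occurs as the conditioned set of exactly one edge, in line with the property cited from Kurowicka and Joe (2010).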

Suppose now that \(Y=(Y_1,\ldots ,Y_m)\) is a random vector with continuously differentiable CDF \(F_{1,\ldots ,m}:\mathbbm {R}^m\rightarrow [0,1]\), where the joint probability density of Y is denoted by \(f_{1,\ldots ,m}:\mathbbm {R}^m\rightarrow [0,\infty )\), and \(f_1,\ldots , f_m:\mathbbm {R}\rightarrow [0,\infty )\) are the marginal (univariate) densities of the components \(Y_1,\ldots ,Y_m\). Furthermore, let \(\mathcal {R} = (\mathcal {T}_1, \ldots , \mathcal {T}_{m-1})\) be a vector of trees with the properties mentioned above. Then, the following representation formula is true, see Czado (2019); Bedford and Cooke (2001); Joe (2015): For any \(y=(y_1,\ldots ,y_m)\in \mathbbm {R}^m\) such that \(f_{1,\ldots ,m}(y)>0\) it holds that

$$\begin{aligned} f_{1,\ldots ,m}(y)=\prod _{e=(o_1,o_2)\in \mathcal {E}(\mathcal {R})} c_{o_1,o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}} \left(F_{o_1\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_1}),F_{o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_2}) \right) \prod _{i=1}^m f_i(y_i), \end{aligned}$$

where \(Y_{\mathcal {S}(e)}\) denotes the random vector consisting of those components of \(Y=(Y_1,\ldots ,Y_m)\) the indices of which belong to the set \(\mathcal {S}(e)\subset \{1,\ldots ,m\}\), and, analogously, \(y_{\mathcal {S}(e)}\) is the corresponding subvector of \((y_1,\ldots ,y_m)\). Furthermore, \(c_{o_1,o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}:\mathbbm {R}^2\rightarrow [0,\infty )\) denotes the bivariate copula density of the conditional probability distribution of the two-dimensional random vector \((Y_{o_1},Y_{o_2})\) given that \(Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}\), and \(F_{o_j\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}:\mathbbm {R}\rightarrow [0,1]\) is the conditional CDF of \(Y_{o_j}\) given that \(Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}\), where \(j=1,2\).

Note that the right-hand side of (12) is the product of uni- and bivariate functions. Thus, in order to determine the multivariate probability density \(f_{1,\ldots ,m}\), we just have to determine the univariate (marginal) densities \(f_1,\ldots , f_m\), the (conditional) univariate CDFs \(F_{o_j\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}\), and the (conditional) bivariate copula densities \(c_{o_1,o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}\) for all \(e=(o_1,o_2)\in \mathcal {E}(\mathcal {R})\), where the recursion formulas (see Aas et al. (2009))

$$\begin{aligned} F_{o_1 \mid Y_{\mathcal {S}(e) \cup \{o_2\}} = y_{\mathcal {S}(e) \cup \{o_2\}}}(y_{o_1}) = \frac{\frac{d}{dy_{o_2}} C_{o_1,o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}} \bigl (F_{o_1\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_1}),F_{o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_2}) \bigr ) }{\frac{d}{dy_{o_2}} F_{o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_2}) } \end{aligned}$$


$$\begin{aligned} F_{o_2 \mid Y_{\mathcal {S}(e) \cup \{o_1\}} = y_{\mathcal {S}(e) \cup \{o_1\}}}(y_{o_2}) = \frac{\frac{d}{dy_{o_1}} C_{o_1,o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}} \bigl (F_{o_1\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_1}),F_{o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_2}) \bigr ) }{\frac{d}{dy_{o_1}} F_{o_1\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}(y_{o_1}) } \end{aligned}$$

are used in order to determine the univariate CDFs \(F_{o_j\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}\) for \(j=1,2\).
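By the chain rule, the quotients in the recursion formulas above reduce to a partial derivative of the bivariate copula with respect to one of its arguments, evaluated at the (conditional) CDF values. The following Python sketch checks this for the Clayton family, one of the families listed in Sect. 3.1, by comparing the closed-form partial derivative with a central finite difference. The parameter value and the function names are assumptions made for illustration.

```python
def clayton_cdf(u1, u2, theta):
    """Clayton copula C(u1, u2) for theta > 0."""
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)

def clayton_h(u1, u2, theta):
    """Conditional CDF of U1 given U2 = u2, i.e. the partial derivative dC/du2."""
    inner = u1 ** -theta + u2 ** -theta - 1.0
    return u2 ** (-theta - 1.0) * inner ** (-1.0 / theta - 1.0)

def numerical_h(u1, u2, theta, eps=1e-6):
    """Central finite-difference approximation of dC/du2."""
    upper = clayton_cdf(u1, u2 + eps, theta)
    lower = clayton_cdf(u1, u2 - eps, theta)
    return (upper - lower) / (2.0 * eps)

theta = 2.0  # assumed dependence parameter
closed_form = clayton_h(0.3, 0.6, theta)
approximation = numerical_h(0.3, 0.6, theta)
```

In a vine copula, the values returned by such a conditional CDF serve as the arguments of the pair copulas in the next tree, which is how the recursion propagates from tree to tree.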

3.3 Fitting R-vine copulas to empirical data

In this section we outline how the representation formula given in (12) can be utilized in order to fit an m-dimensional probability density \(f_{1,\ldots ,m}\) to empirical data, i.e., for a given sample of k realizations \(y^{(1)}=(y^{(1)}_1,\ldots ,y_m^{(1)}), \ldots , y^{(k)}=(y^{(k)}_1,\ldots ,y_m^{(k)}) \in \mathbbm {R}^m\) of the random vector \(Y=(Y_1,\ldots ,Y_m)\), where we use the sequential algorithm proposed in Dissmann et al. (2013). First, for each \(i\in \{1,\ldots ,m\}\), we use the sample \(y_i=(y^{(1)}_i,\ldots ,y_i^{(k)})\) to determine a kernel density estimator (KDE) \({\widehat{f}}_i:\mathbbm {R}\rightarrow (0,\infty )\), see Silverman (1986), for the marginal density \(f_i\) of the i-th component \(Y_i\) of Y, which is numerically integrated in order to obtain the univariate CDF \({\widehat{F}}_i:\mathbbm {R}\rightarrow [0,1]\). Then, in the next step, a valid tree \(\mathcal {T}_1=(\mathcal {V}_1,\mathcal {E}_1)\) with \(\mathcal {V}_1 = \{1, \ldots , m\}\) is chosen such that the expression

$$\begin{aligned} I(\mathcal {E}_1)= \sum _{e=(o_1,o_2) \in \mathcal {E}_1} \left| {\widehat{\tau }}\left( \left( {\widehat{F}}_{o_1}\big (y^{(1)}_{o_1}\big ),\ldots ,{\widehat{F}}_{o_1}\big (y^{(k)}_{o_1}\big )\right) , \left( {\widehat{F}}_{o_2}\big (y^{(1)}_{o_2}\big ),\ldots ,{\widehat{F}}_{o_2}\big (y^{(k)}_{o_2}\big )\right) \right) \right| \end{aligned}$$

is maximized with respect to \(\mathcal {E}_1\), where \({\widehat{\tau }}\) denotes an empirical version of Kendall’s tau, which is defined for pairs of realizations \(\{(x_1, y_1),\ldots ,(x_n, y_n)\}\) of two random variables X and Y as

$$\begin{aligned} \widehat{\tau }(x, y)=\frac{2}{n(n-1)}\sum _{i<j}\text {sgn}(x_i-x_j)\; \text {sgn}(y_i-y_j), \end{aligned}$$

where \(x=(x_1,\ldots ,x_n)\) and \(y=(y_1,\ldots ,y_n)\).

In other words, the edge set \(\mathcal {E}_1\) is chosen such that the sum of the absolute pairwise empirical rank correlations between \(Y_{o_1}\) and \(Y_{o_2}\) is maximized, where the sum extends over all edges \(e=(o_1,o_2) \in \mathcal {E}_1\). Subsequently, for each \(e=(o_1,o_2) \in \mathcal {E}_1\), a bivariate copula \({\widehat{C}}_e\) is fitted. For this, the independence of \(Y_{o_1}\) and \(Y_{o_2}\) is checked via a statistical test, see Dissmann et al. (2013). If the null hypothesis (stating that \(Y_{o_1}\) and \(Y_{o_2}\) are independent) is not rejected, then the product copula given in (10) is chosen for \({\widehat{C}}_e\). Otherwise, an (unconditional) bivariate copula \({\widehat{C}}_e\) and its parameters are fitted to the data vectors \(({\widehat{F}}_{o_1}(y^{(1)}_{o_1}),\ldots ,{\widehat{F}}_{o_1}(y^{(k)}_{o_1}))\) and \(({\widehat{F}}_{o_2}(y^{(1)}_{o_2}),\ldots ,{\widehat{F}}_{o_2}(y^{(k)}_{o_2}))\) with the help of a maximum likelihood method, see Joe (2015).
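Since every spanning tree on \(\mathcal {V}_1\) is a valid first tree, selecting \(\mathcal {E}_1\) amounts to a maximum spanning tree problem with edge weights \(|{\widehat{\tau }}|\). The following Python sketch implements the empirical Kendall’s tau as defined above and a simple Prim-type tree search. It is an illustrative sketch, not the pyvinecopulib implementation; the function names and the toy data are assumptions.

```python
import numpy as np

def kendall_tau(x, y):
    """Empirical Kendall's tau: 2 / (n (n - 1)) * sum_{i<j} sgn(x_i - x_j) sgn(y_i - y_j)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    upper = np.triu_indices(n, k=1)              # index pairs with i < j
    return 2.0 * (sx * sy)[upper].sum() / (n * (n - 1))

def first_tree(data):
    """Maximum spanning tree on |tau| (Prim's algorithm); returns 1-based edges."""
    m = data.shape[1]
    w = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            w[i, j] = w[j, i] = abs(kendall_tau(data[:, i], data[:, j]))
    in_tree, edges = {0}, []
    while len(in_tree) < m:
        # heaviest edge leaving the current tree
        i, j = max(((i, j) for i in in_tree for j in range(m) if j not in in_tree),
                   key=lambda e: w[e])
        edges.append((min(i, j) + 1, max(i, j) + 1))
        in_tree.add(j)
    return edges

# Toy data with k = 5 realizations of m = 3 components (purely illustrative).
toy = np.array([[1, 1, 2], [2, 2, 1], [3, 3, 3], [4, 5, 5], [5, 4, 4]], float)
toy_edges = first_tree(toy)
```

In the procedure described above, the inputs would be the transformed samples \({\widehat{F}}_{i}(y^{(\ell )}_{i})\) rather than raw data; since Kendall’s tau depends only on ranks, a strictly increasing transformation does not change the weights. For the higher trees, the third structural property from Sect. 3.2 restricts the admissible edges, so the search has to be limited to pairs of edges sharing a common vertex.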

Now, analogously to (15), a valid tree \(\mathcal {T}_2=(\mathcal {V}_2,\mathcal {E}_2)\) with \(\mathcal {V}_2=\mathcal {E}_1\) is selected such that the following expression is maximized:

$$\begin{aligned} I(\mathcal {E}_2)= \sum _{e \in \mathcal {E}_2} \Big \vert {\widehat{\tau }}\Big (&\big ({\widehat{F}}_{o_1 \mid Y_{\mathcal {S}(e)}=y^{(1)}_{\mathcal {S}(e)}}\big (y^{(1)}_{o_1}\big ),\ldots ,{\widehat{F}}_{o_1 \mid Y_{\mathcal {S}(e)}=y^{(k)}_{\mathcal {S}(e)}}\big (y^{(k)}_{o_1}\big )\big ),\\&\big ({\widehat{F}}_{o_2 \mid Y_{\mathcal {S}(e)}=y^{(1)}_{\mathcal {S}(e)}}\big (y^{(1)}_{o_2}\big ),\ldots ,{\widehat{F}}_{o_2 \mid Y_{\mathcal {S}(e)}=y^{(k)}_{\mathcal {S}(e)}}\big (y^{(k)}_{o_2}\big )\big )\Big ) \Big \vert . \end{aligned}$$

Note that \(|\mathcal {S}(e)|=1\) for all \(e \in \mathcal {E}_2\). Thus, using (13) and (14), the conditional CDFs \({\widehat{F}}_{o_1 \mid Y_{\mathcal {S}(e)}=y^{(\ell )}_{\mathcal {S}(e)}}\) and \({\widehat{F}}_{o_2 \mid Y_{\mathcal {S}(e)}=y^{(\ell )}_{\mathcal {S}(e)}}\) for \(\ell \in \{1,\ldots ,k\}\), can directly be obtained from the (unconditional) bivariate copula \({\widehat{C}}_{o_1,o_2}\) and the (unconditional) CDFs \({\widehat{F}}_{o_1}\) and \({\widehat{F}}_{o_2}\), which are determined as described above. Then, for each \(e\in \mathcal {E}_2\) and \(o_1,o_2\in \mathcal {O}(e)\), a bivariate copula \({\widehat{C}}_{o_1,o_2\mid \mathcal {S}(e)}\) and its parameters are fitted to the data vectors \(({\widehat{F}}_{o_j \mid Y_{\mathcal {S}(e)}=y^{(1)}_{\mathcal {S}(e)}}(y^{(1)}_{o_j}),\ldots ,{\widehat{F}}_{Y_{o_j} \mid Y_{\mathcal {S}(e)}=y^{(k)}_{\mathcal {S}(e)}}(y^{(k)}_{o_j}))\) for \(j=1,2\), where the simplifying assumption is made that the copula \({\widehat{C}}_{o_1,o_2\mid \mathcal {S}(e)}={\widehat{C}}_{o_1,o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}\) does not depend on the given realization \(y_{\mathcal {S}(e)}\) of \(Y_{\mathcal {S}(e)}\), see e.g. Haff et al. (2010).

Finally, in the same way as described above, the trees \(\mathcal {T}_i=(\mathcal {V}_i,\mathcal {E}_i)\), the conditional CDFs \({\widehat{F}}_{o_j \mid Y_{\mathcal {S}(e)}=y^{(\ell )}_{\mathcal {S}(e)}}\) for \(j=1,2\) and \(\ell =1,\ldots ,k\), and the bivariate copulas \({\widehat{C}}_{o_1,o_2\mid \mathcal {S}(e)}\) are determined for all \(e\in \mathcal {E}_i\) and \(i=3,\ldots ,m-1\).
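The first step of the fitting procedure in this section, i.e., the kernel density estimate of a marginal density and its numerical integration to a CDF, can be sketched as follows. The Gaussian kernel, the normal-reference bandwidth rule, the grid size and the function name are assumptions of this sketch; the text above only specifies that a KDE is integrated numerically.

```python
import numpy as np

def kde_cdf(sample, grid_size=2001):
    """Return an estimated CDF obtained by integrating a Gaussian KDE on a grid."""
    sample = np.asarray(sample, float)
    k = sample.size
    h = 1.06 * sample.std(ddof=1) * k ** (-0.2)   # assumed rule-of-thumb bandwidth
    grid = np.linspace(sample.min() - 5 * h, sample.max() + 5 * h, grid_size)
    z = (grid[:, None] - sample[None, :]) / h
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (k * h * np.sqrt(2 * np.pi))
    # cumulative trapezoidal rule, then normalization so that the CDF ends at 1
    steps = 0.5 * (density[1:] + density[:-1]) * np.diff(grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    return lambda y: np.interp(y, grid, cdf)

rng = np.random.default_rng(0)
F_hat = kde_cdf(rng.normal(size=200))             # synthetic sample for illustration
pseudo_observations = F_hat(rng.normal(size=5))   # values in [0, 1]
```

Applying the estimated CDF to each component of the sample yields the data vectors on the unit interval to which the bivariate copulas of the first tree are fitted.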

3.4 Sampling from multivariate probability densities

In Sect. 3.3 we showed how the multivariate probability density \({\widehat{f}}:\mathbbm {R}^m\rightarrow [0,\infty )\) given by the representation formula

$$\begin{aligned} {\widehat{f}}_{1,\ldots ,m}(y_1,\ldots ,y_m)=\prod _{e\in \mathcal {E}(\mathcal {R})} {\widehat{c}}_{o_1,o_2\mid \mathcal {S}(e)} \left( {\widehat{F}}_{o_1\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}\big (y_{o_1}\big ),{\widehat{F}}_{o_2\mid Y_{\mathcal {S}(e)}=y_{\mathcal {S}(e)}}\big (y_{o_2}\big )\right) \prod _{i=1}^m {\widehat{f}}_i(y_i) \end{aligned}$$

for \((y_1,\ldots ,y_m)\in \mathbbm {R}^m\) can be fitted to empirical data. We now explain how samples can be drawn from the probability density given in (17).

Recall that the Rosenblatt transform, see Joe (2015), maps a sample \(y=(y_1, \ldots , y_m)\) of a random vector \(Y = (Y_1, \ldots , Y_m)\) with joint probability density \(f_{1,\ldots ,m}:\mathbbm {R}^m\rightarrow (0,\infty )\) onto a sample \(u=(u_1, \ldots , u_m)\) of a vector of independent and uniformly distributed random variables \(U = (U_1, \ldots , U_m):\Omega \rightarrow [0,1]^m\) such that

$$\begin{aligned} u_1&= F_{Y_1}(y_1),\\ u_2&= F_{Y_2 \mid Y_1=y_1}(y_2),\\ u_3&= F_{Y_3 \mid Y_1=y_1, Y_2=y_2}(y_3),\\&\;\;\vdots \\ u_m&= F_{Y_m \mid Y_1=y_1, \ldots , Y_{m-1}=y_{m-1}}(y_m), \end{aligned}$$

where \(F_{Y_i \mid Y_1=y_1, \ldots , Y_{i-1}=y_{i-1}}:\mathbbm {R}\rightarrow [0,1]\) denotes the (conditional) CDF corresponding to the conditional density \(f_{Y_i \mid Y_1=y_1, \ldots , Y_{i-1}=y_{i-1}}:\mathbbm {R}\rightarrow (0,\infty )\) for \(i=1,\ldots ,m-1\). Assuming that the densities \(f_{Y_i \mid Y_1=y_1, \ldots , Y_{i-1}=y_{i-1}}\) for \(i=1,\ldots ,m-1\) are positive, the CDFs \(F_{Y_i \mid Y_1=y_1, \ldots , Y_{i-1}=y_{i-1}}\) are bijective for \(i=1,\ldots ,m-1\) and thus, by applying the inverse CDFs to both sides of the above equations, we obtain the inverse Rosenblatt transform:

$$\begin{aligned} F^{-1}_{Y_1}(u_1)&= y_1,\\ F^{-1}_{Y_2 \mid Y_1=y_1}(u_2)&= y_2,\\ F^{-1}_{Y_3 \mid Y_1=y_1, Y_2=y_2}(u_3)&= y_3,\\&\;\;\vdots \\ F^{-1}_{Y_m \mid Y_1=y_1, \ldots , Y_{m-1}=y_{m-1}}(u_m)&= y_m, \end{aligned}$$

which maps a sample \(u=(u_1, \ldots , u_m)\) of U onto a sample \(y=(y_1, \ldots , y_m)\) of Y. Note that the (inverse) Rosenblatt transform works for any permutation of the indices \(1, \ldots , m\).
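For \(m=2\) and uniform marginals, the inverse Rosenblatt transform consists of only two equations, which makes it easy to sketch. The Python example below draws samples from a bivariate Clayton copula, for which the inverse conditional CDF is available in closed form. The choice of the Clayton family, the parameter value and all function names are assumptions made for illustration.

```python
import numpy as np

def clayton_h(u2, u1, theta):
    """Conditional CDF of U2 given U1 = u1 for the Clayton copula (dC/du1)."""
    inner = u1 ** -theta + u2 ** -theta - 1.0
    return u1 ** (-theta - 1.0) * inner ** (-1.0 / theta - 1.0)

def clayton_h_inv(w, u1, theta):
    """Inverse of clayton_h with respect to its first argument."""
    return (u1 ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)

def sample_clayton(n, theta, seed=0):
    """Inverse Rosenblatt transform: map independent uniforms to dependent ones."""
    rng = np.random.default_rng(seed)
    # stay strictly inside (0, 1) to avoid division by zero
    w = rng.uniform(1e-12, 1.0 - 1e-12, size=(n, 2))
    u1 = w[:, 0]                                  # first equation: marginal step
    u2 = clayton_h_inv(w[:, 1], u1, theta)        # second equation: conditional step
    return np.column_stack([u1, u2])

samples = sample_clayton(1000, theta=2.0)
```

To obtain samples of a random vector with non-uniform marginals, one would additionally apply the inverse marginal CDFs to each column, which corresponds to the first equation of the inverse Rosenblatt transform for general Y.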

Now, consider some sequence of edges \(e^{(1)}, \ldots , e^{(m-1)}\) with \(e^{(i)} \in \mathcal {E}_i\) for \(i=1,\ldots ,m-1\) such that \(e^{(i)} \in e^{(i+1)}\) for \(i=1,\ldots ,m-2\). For the given edges, it follows from the third property of the trees \(\mathcal {T}_1,\ldots ,\mathcal {T}_{m-1}\) introduced in Sect. 3.2 that there is a permutation \((o_1, \ldots , o_m)\) of \((1,\ldots ,m)\) such that \(o_1 \in \mathcal {O}(e^{(1)})\) and \(o_{i+1} \in \mathcal {O}(e^{(i)})\) for \(i=1,\ldots ,m-1\). Thus, the inverse Rosenblatt transform can be used as follows, in order to draw a sample \((y_1,\ldots ,y_m)\) from the probability density \({\widehat{f}}_{1,\ldots ,m}\) given in (17):

$$\begin{aligned} {\widehat{F}}^{-1}_{o_1}(u_{o_1})&= y_{o_1},\\&\;\;\vdots \\ {\widehat{F}}^{-1}_{o_i \mid Y_{\mathcal {S}(e^{(i-1)}) \cup \{o_{i-1}\}} = y_{\mathcal {S}(e^{(i-1)}) \cup \{o_{i-1}\}}}(u_{o_i}) = {\widehat{F}}^{-1}_{o_i \mid Y_{\{o_1, \ldots , o_{i-1}\}} = y_{\{o_1, \ldots , o_{i-1}\}}}(u_{o_i})&= y_{o_i},\\&\;\;\vdots \\ {\widehat{F}}^{-1}_{o_m \mid Y_{\mathcal {S}(e^{(m-1)}) \cup \{o_{m-1}\}} = y_{\mathcal {S}(e^{(m-1)}) \cup \{o_{m-1}\}}}(u_{o_m}) = {\widehat{F}}^{-1}_{o_m \mid Y_{\{o_1, \ldots , o_{m-1}\}} = y_{\{o_1, \ldots , o_{m-1}\}}}(u_{o_m})&= y_{o_m}, \end{aligned}$$

where \(u=(u_1, \ldots , u_m)\) is a sample of a vector of independent and uniformly distributed random variables \(U = (U_1, \ldots , U_m):\Omega \rightarrow [0,1]^m\), the (unconditional) CDF \({\widehat{F}}_{o_1}\) is given by an integrated kernel density estimator (KDE), and the (conditional) CDFs \({\widehat{F}}_{o_i \mid Y_{\mathcal {S}(e^{(i-1)}) \cup \{o_{i-1}\}}}\) for \(i=2,\ldots ,m\) are determined as described in Sect. 3.3.

Later on, in Sect. 4, the algorithms stated in Sects. 3.3 and 3.4 are applied to derive the numerical results presented in this paper, where the implementation provided by the Python library pyvinecopulib, see Nagler and Vatter (2021), is used.

3.5 Conditional sampling

In the previous section we described a method for sampling from a multivariate distribution with the help of the Rosenblatt transform. This method is used in Sect. 4 below in order to draw samples from the (unconditional) distribution of the forecasting error \(X = P^{\text {PV}}- P^\mathrm{F}\). Furthermore, to model the distribution of the random fluctuation vector X more accurately, we modify the approach considered in Sects. 3.3 and 3.4 such that we can draw samples from the conditional distribution of X for any given radiation forecast \(S=s\). For D-vine copulas, a similar conditional sampling algorithm can be found in Aas et al. (2021) and Bevacqua et al. (2017).

Let \(m,m'\ge 1\) be some integers with \(m'<m\). We first explain the reasons why the fitting and (unconditional) sampling approach considered in Sects. 3.3 and 3.4 has to be modified such that we can draw samples from arbitrary conditional distributions of a random vector \(Y = (Y_1, \ldots , Y_m)\), i.e., to draw samples \(y=(y_1, \ldots , y_m)\) from the conditional distribution of \(Y = (Y_1, \ldots , Y_m)\), given that \(Y_{i_1}=y_{i_1},\ldots ,Y_{i_{m'}}=y_{i_{m'}}\) for some subset of indices \(D=\{i_1,\ldots ,i_{m'}\} \subset \{1, \ldots , m \}\) and some vector \((y_{i_1},\ldots ,y_{i_{m'}})\in \mathbbm {R}^{m'}\).

Recall that the (direct and inverse) Rosenblatt transform considered in Sect. 3.4 works for arbitrary permutations of the sampling order provided that all conditional CDFs required for this transformation are known. Here, the sampling order refers to the order of the marginal dimensions from which samples are drawn. However, if we want to obtain these CDFs with the help of (13) and (14), the structure of the underlying R-vine copula restricts the choice of possible sampling orders. To understand why this is the case, note that sampling in any given order would require the construction of arbitrary (conditional) CDFs, the total number of which is equal to \(m 2^{m-1}\). However, an R-vine copula of dimension m consists of only \(\frac{m (m-1)}{2}\) bivariate copulas. With the help of (13) and (14), two (conditional) CDFs can be obtained from each bivariate copula, i.e., we can obtain \(m (m-1)\) (conditional) CDFs in total from a given R-vine copula, which limits the number of possible sampling orders.

Consider the R-vine copula in Fig. 1 which has (1, 2, 3, 5, 4) as a possible sampling order. To sample in this order with the inverse Rosenblatt transform, we obtain the required inverse CDFs \(F^{-1}_{Y_1}\), \(F^{-1}_{Y_2 \mid Y_1=y_1}\), \(F^{-1}_{Y_3 \mid Y_1=y_1, Y_2=y_2}\), \(F^{-1}_{Y_5 \mid Y_1=y_1, Y_2=y_2, Y_3=y_3}\) and \(F^{-1}_{Y_4 \mid Y_1=y_1, Y_2=y_2, Y_3=y_3, Y_5=y_5}\) from the marginal distribution \(\boxed {1}\) and the copulas \(\boxed {1,2}\), \(\boxed {1,3 \mid 2}\), \(\boxed {1,5 \mid 2,3}\) and \(\boxed {1,4 \mid 2,3,5}\) respectively. Note that this sampling order is possible because each copula corresponds to an edge connected to the previous copula or marginal distribution, e.g., \(\boxed {1,3 \mid 2}\) corresponds to an edge connected to \(\boxed {1,2}\) while \(\boxed {1,2}\) corresponds to the edge connected to \(\boxed {1}\). This ensures that a suitable copula for the next dimension in the sampling order exists.

Now consider the sampling order (1, 2, 3, 4, 5), which is impossible. Analogously to the previous sampling order the inverse CDFs \(F^{-1}_{Y_1}\), \(F^{-1}_{Y_2 \mid Y_1=y_1}\) and \(F^{-1}_{Y_3 \mid Y_1=y_1, Y_2=y_2}\) can be obtained. However, to obtain the \(4^{\hbox {th}}\) necessary inverse CDF \(F^{-1}_{Y_4 \mid Y_1=y_1, Y_2=y_2, Y_3=y_3}\) for the inverse Rosenblatt transform, the copulas \(\boxed {1,4 \mid 2, 3}\) or \(\boxed {3,4 \mid 1, 2}\) are required which do not exist within the considered R-vine copula.

As shown in Theorem 5.1 in Cooke et al. (2015), an R-vine copula of dimension m has only \(2^{m-1}\) possible sampling orders. This is due to the fact that every possible sampling order corresponds to a vector \(\lambda = (\lambda _1,\ldots ,\lambda _m) = (v_1, \ldots , v_{m-1}, e)\) with \(v_i \in \mathcal {T}_i\) and \(e \in \mathcal {E}_{m-1}\), i.e., the CDF used in the first equation of the Rosenblatt transform is the marginal CDF \(F_{v_1}\) whereas the conditional CDFs of the equations thereafter are given by the copulas corresponding to \(\lambda _2,\ldots ,\lambda _m\). Recall that for each equation of the Rosenblatt transformation the dimension of the condition of the corresponding conditional CDF grows by one. This restricts the choice of \(\lambda _{i+1}\) to copulas for which it holds that \(\lambda _i \in \lambda _{i+1}\) for all \(i \in \{1, \ldots , m-2\}\), i.e., \(\lambda _i \in \mathcal {T}_i\) must be a vertex of the edge \(\lambda _{i+1} \in \mathcal {T}_{i+1}\), because only then the copula corresponding to \(\lambda _{i+1}\) can be used to construct a conditional CDF with a valid condition for the \((i+1)\)-th equation of the Rosenblatt transform.

We thus modify the fitting process for vine copulas presented in Sect. 3.3 such that a vector \(\lambda =(\lambda _1,\ldots ,\lambda _m)\) as described above exists for a given set of indices \(D=\{i_1,\ldots ,i_{m'}\} \subset \{1, \ldots , m \}\). For this, we consider \(\mathcal {T}_1^D = (\mathcal {V}_1^D, \mathcal {E}_1^D) = (D, \{e \in \mathcal {E}_1: e \subseteq D\})\), i.e., \(\mathcal {T}_1^D\) is a graph with vertex set D and edges \(e \in \mathcal {E}_1\) which connect two vertices in D. For \(i > 1\), we recursively define \(\mathcal {T}_i^D = (\mathcal {V}_i^D, \mathcal {E}_i^D) = (\mathcal {E}_{i-1}^D, \{e \in \mathcal {E}_i: e \subseteq \mathcal {E}_{i-1}^D\})\). Note that in general \(\mathcal {T}_i^D\) is not a tree but a forest. However, only if all \(\mathcal {T}_i^D\) are trees is the vector \((\mathcal {T}_1^D, \ldots , \mathcal {T}_{m'}^D)\) a valid R-vine copula. This is necessary to construct an inverse Rosenblatt transform for the dimensions in D, or more generally speaking, it is necessary for the construction of an inverse Rosenblatt transform for all dimensions \(\{1, \ldots , m \}\) where the dimensions in D occur at the beginning.

To ensure that there is a sampling order in which all indices in D are in successive order, we choose the graphs \(\mathcal {T}_i^D\) in the fitting process of the R-vine copula such that \(I(\mathcal {E}_i^D)\) in (15) is maximized (as in the unmodified fitting process considered in Sect. 3.3), where additionally it must hold that \(\mathcal {T}_i^D\) is a tree for all \(i \in \{1, \ldots , m'\}\) because only then can we choose a sampling order where \(\lambda _i \in \lambda _{i+1}\) holds for all \(i \in \{1, \ldots , m-1\}\).

Without loss of generality, we now assume that \(D=\{1,\ldots ,m'\}\). Thus, we omit the first \(m'\) equations of the inverse Rosenblatt transform and sample values \(y_{m'+1},\ldots ,y_m\) for the remaining \(m-m'\) components via

$$\begin{aligned} F^{-1}_{Y_{m'+1} \mid Y_1=y_1, \ldots , Y_{m'}=y_{m'}}(u_{m'+1})&= y_{m'+1}, \\&\;\;\vdots \\ F^{-1}_{Y_m \mid Y_1=y_1, \ldots , Y_{m-1}=y_{m-1}}(u_m)&= y_m. \end{aligned}$$

As an example, consider again the R-vine copula in Fig. 1 and the set \(D = \{1, 2, 3\}\) to sample from the conditional distribution of \((Y_4, Y_5)\mid _{Y_1=y_1, Y_2=y_2, Y_3=y_3}\). The graphs \(\mathcal {T}_1^D\), \(\mathcal {T}_2^D\) and \(\mathcal {T}_3^D\) with the sets of vertices \(\{\boxed {1}, \boxed {2}, \boxed {3}\}\), \(\{\boxed {1,2}, \boxed {2,3}\}\), \(\{\boxed {1,3 \mid 2}\}\) and the corresponding edges correspond to the lower left part of the diagram. Since the graphs \(\mathcal {T}_1^D\), \(\mathcal {T}_2^D\) and \(\mathcal {T}_3^D\) are trees and \((\mathcal {T}_1^D, \mathcal {T}_2^D, \mathcal {T}_3^D)\) is a valid R-vine copula, sampling orders with 1, 2 and 3 at the beginning are possible.

Now consider \(D=\{1,2,4\}\) for which \(\mathcal {T}_1^D = (\{\boxed {1}, \boxed {2}, \boxed {4}\}, \{\{\boxed {1}, \boxed {2}\}\})\) is not a tree and the vector \((\mathcal {T}_1^D, \mathcal {T}_2^D, \mathcal {T}_3^D)\) is not an R-vine copula. Since \(\boxed {4}\) is not connected to \(\boxed {1}\) or \(\boxed {2}\) in \(\mathcal {T}_1^D\) there can be neither \(\boxed {1,4}\) nor \(\boxed {2,4}\) in \(\mathcal {T}_2^D\) and in turn there can be neither \(\boxed {2,4 \mid 1}\) nor \(\boxed {1,4 \mid 2}\) in \(\mathcal {T}_3^D\). Therefore it is not possible to obtain the required inverse CDFs for an inverse Rosenblatt transform for which the sampling order begins with the elements of D.
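The tree condition on \(\mathcal {T}_1^D\) used in the two examples above can be checked mechanically. The Python sketch below builds the subgraph induced by D in the first tree and tests whether it is a tree. The edge list is a hypothetical first tree that is merely consistent with the two examples (1 and 2 as well as 2 and 3 are adjacent, 4 is adjacent to neither 1 nor 2); it is not claimed to be the first tree of Fig. 1, and the function names are illustrative.

```python
def induced_edges(edges, D):
    """Edges of the first tree that connect two vertices in D."""
    D = set(D)
    return [e for e in edges if set(e) <= D]

def is_tree(vertices, edges):
    """A graph is a tree iff it has |V| - 1 edges and is connected."""
    vertices = set(vertices)
    if len(edges) != len(vertices) - 1:
        return False
    adjacent = {v: set() for v in vertices}
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:                       # depth-first search for connectivity
        for w in adjacent[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

E1 = [(1, 2), (2, 3), (3, 5), (4, 5)]  # hypothetical first tree on {1, ..., 5}
ok_123 = is_tree({1, 2, 3}, induced_edges(E1, {1, 2, 3}))
ok_124 = is_tree({1, 2, 4}, induced_edges(E1, {1, 2, 4}))
```

This sketch covers only the first tree. In the modified fitting process, the same condition has to hold for every \(\mathcal {T}_i^D\) with \(i \in \{1, \ldots , m'\}\), where the vertices of the higher trees are the edges of the preceding induced graph.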

In the following section, we explain how the construction of uncertainty sets is performed with the scenario approach from stochastic optimization. We then use the copula-based modeling from this section in order to construct high-quality uncertainty sets for given weather situations.

3.6 Scenario approach to determine a suitable uncertainty set

In order to determine a suitable uncertainty set of the form given in (9) which satisfies (a slightly modified version of) condition (7), we apply, as in Aigner et al. (2021), an idea described in Margellos et al. (2014) and formulate the estimation of the uncertainty set \(B=[\ell _1,u_1]\times \ldots \times [\ell _{n},u_{n}] \subset \mathbbm {R}^{n}\) as an auxiliary probabilistic optimization problem. Then, for this problem with chance constraints, we apply the scenario approach proposed in Campi and Garatti (2008), i.e., the chance constraints considered in (7) are replaced by constraints based on a sufficiently large number of samples drawn from the probability distribution of the random forecasting error \(X = P^{\text {PV}}- P^\mathrm{F}\). In this work this distribution is fitted to empirical data, using the algorithm described in Sect. 3.3, and simulation is performed with the technique described in Sect. 3.4.

The auxiliary optimization problem in its general form consists of a chance constraint model for the enclosure \(B\in {\mathcal B}(\mathbbm {R}^{n})\) of the probability mass of \(X=(X_1,\ldots ,X_n)\) satisfying the condition \(\mathbb {P} (\{\omega : \ X(\omega ) \in B\}) \ge 1-\varepsilon\) for some \(\varepsilon \in (0,1)\), see (7). At the same time, this problem aims for an uncertainty set B such that its size is as small as possible. Thus, in order to apply the scenario approach proposed in Campi and Garatti (2008) to determine an uncertainty box \(B=[\ell _1,u_1]\times \ldots \times [\ell _{n},u_{n}] \subset \mathbbm {R}^{n}\), we consider the probabilistic optimization problem

$$\begin{aligned} \min _{\ell ,u \in \mathbbm {R}^{n}} \quad&\sum \nolimits _{k \in \mathcal {N}} (u_k-\ell _k) \end{aligned}$$
$$\begin{aligned} \text {such that} \quad&\mathbb {P} (\{\omega : \ell _k \le X_k(\omega ) \le u_k \hbox { for all } k=1,\ldots ,n\}) \ge 1-\varepsilon , \end{aligned}$$

where the minimum in (18a) extends over all \(\ell =(\ell _1,\ldots ,\ell _n),u=(u_1,\ldots ,u_n)\in \mathbbm {R}^n\) with \(\ell _k<u_k\) for all \(k=1,\ldots ,n\).

Thus, to control the size of the set B, we minimize the sum of interval lengths \(u_k-\ell _k\). In contrast, if minimization of the box volume were used instead, this would lead to a non-convex objective. In this case, the scenario approach proposed in Campi and Garatti (2008) is no longer applicable. Although the solution of (18) does not necessarily minimize the box volume, the solution of the following scenario program does. This is why this choice of objective is suitable. We further explain this after introducing our scenario program.

Suppose that \(N>0\) samples \(x^1,\ldots ,x^N\) are independently drawn from the probability distribution of X. Instead of (18b), in our scenario approach we want to ensure that the samples \(x^1,\ldots ,x^N\) are included in the uncertainty set B. The resulting scenario program for computing \(B=[\ell ,u]\) is thus given by

$$\begin{aligned} \min _{\ell ,u\in \mathbbm {R}^{n}} \quad&\sum \nolimits _{k \in \mathcal {N}} (u_k-\ell _k) \end{aligned}$$
$$\begin{aligned} \text {such that} \quad&\ell \le x^i \le u\quad \text {for all } i=1,\dots ,N. \end{aligned}$$

The solution of this optimization problem can be written explicitly as \([\ell ^*,u^*]\), where \(\ell ^*_k=\min _{i=1,...,N} \{x_k^i\}\) and \(u_k^*=\max _{i=1,...,N}\{x_k^i\}\) for every vector component k. Moreover, the set \(B^*=[\ell ^*,u^*]\) also minimizes the volume over all sets \([\ell ,u]\) containing the samples \(x^1,\ldots ,x^N\). Although, in general, the solution of problem (18) does not calculate boxes with minimal volume, this is the case for the optimization problem given in (19).
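Because the scenario program has this explicit solution, no solver is needed to compute the box. A minimal Python sketch, with a purely illustrative sample array and an assumed function name, reads:

```python
import numpy as np

def scenario_box(samples):
    """Componentwise minimum and maximum of the samples x^1, ..., x^N.

    samples has shape (N, n); the result is the pair (l*, u*).
    """
    samples = np.asarray(samples, float)
    return samples.min(axis=0), samples.max(axis=0)

# Three illustrative samples of a two-dimensional forecasting error.
x = np.array([[0.1, -0.2], [-0.3, 0.4], [0.2, 0.0]])
lower, upper = scenario_box(x)
total_width = float((upper - lower).sum())   # objective value of the scenario program
```

Any box containing all samples must satisfy \(\ell _k\le \min _i x_k^i\) and \(u_k\ge \max _i x_k^i\) in every component, so the box computed here is contained in every feasible box. This is why it minimizes both the sum of interval lengths and the volume.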

From the results presented in Campi and Garatti (2008), we know that the optimal solution \(B^*=[\ell ^*,u^*]\) of (19) fulfills condition (18b) with a confidence probability of at least \(1-\delta\) for some small \(\delta \in (0,1)\) if \(N>0\) is chosen such that

$$\begin{aligned} \sum _{j=0}^{2n-1} \left( {\begin{array}{c}N\\ j\end{array}}\right) \varepsilon ^j (1-\varepsilon )^{N-j} \le \delta . \end{aligned}$$

Note that in the latter inequality, the necessary number of samples \(N>0\) for a predefined confidence level \(1-\delta \in (0,1)\) is given implicitly. However, an explicit sufficient condition has been derived in Alamo et al. (2010), which reads as

$$\begin{aligned} N \ge \left\lceil \frac{1}{\varepsilon }\frac{e}{e-1}\left( 2n - 1 + \ln \frac{1}{\delta } \right) \right\rceil . \end{aligned}$$
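Both conditions on the sample size can be evaluated directly. The following Python sketch computes the explicit bound stated above and, for comparison, the smallest N satisfying the implicit binomial condition. The function names and the parameter values in the usage lines are assumptions made for illustration; they are not the settings used in Sect. 4.

```python
import math

def binomial_tail(N, n, eps):
    """Left-hand side of the implicit condition: sum over j = 0, ..., 2n - 1."""
    return sum(math.comb(N, j) * eps ** j * (1.0 - eps) ** (N - j)
               for j in range(2 * n))

def explicit_sample_size(n, eps, delta):
    """Explicit sufficient sample size (Alamo et al. 2010) for 2n decision variables."""
    factor = math.e / (math.e - 1.0)
    return math.ceil(factor * (2 * n - 1 + math.log(1.0 / delta)) / eps)

def minimal_sample_size(n, eps, delta):
    """Smallest N fulfilling the implicit condition, found by linear search."""
    N = 2 * n
    while binomial_tail(N, n, eps) > delta:
        N += 1
    return N

# Illustrative values: n = 2 nodes, eps = 0.1, delta = 0.01.
N_explicit = explicit_sample_size(2, 0.1, 0.01)
N_minimal = minimal_sample_size(2, 0.1, 0.01)
```

The explicit bound is only sufficient, so the linear search can return a smaller N. For large n or very small \(\varepsilon\), the linear search and the floating-point evaluation of the binomial sum become slow or inaccurate, in which case the explicit bound is the more practical choice.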

Furthermore, we determine the optimal solution \(B^*_s=[\ell ^*_s,u^*_s]\) of (19) based on samples drawn, as described in Sect. 3.5, from the conditional distribution of X for given radiation forecasts \(S=s\).

4 Numerical results

In order to derive the results presented in this section we used the library pyvinecopulib, see Nagler and Vatter (2021). Furthermore, we utilized Gurobi 9.1.2 [23] as solver for mixed-integer linear programs. The computations were carried out by means of a Python implementation on a cluster using 4 cores of a machine with two Xeon E3-1240 v6 “Kaby Lake” chips (4 cores, HT disabled) running at 3.7 GHz with 32 GB of RAM.

4.1 Data description

Data regarding power measurements as well as weather forecasts were provided by the distribution network operator N-ERGIE Netz GmbH (NNG) and the German weather service Deutscher Wetterdienst (DWD). In particular, NNG provided data of solar power supply at more than 150 feed-in points and corresponding active power measurements at 13 network nodes (buses) measured in 15 min intervals. Moreover, NNG provided data regarding the positions of network nodes (buses) and their connections through lines (branches) which include resistance values and transmission limits of each line in the distribution network. A fragment of the NNG distribution network with 34 nodes and 37 lines is visualized in Fig. 2. The solar power forecast \(P^\mathrm{F}\) is provided by a model proposed in Schinke-Nendza et al. (2021).

Fig. 2 Sketch of NNG subnetwork, where (slack-)generator nodes are denoted by \({{\textbf {+}}}\), solar feed-in points by \(\star\) and load buses by •

DWD provided hourly forecasts of global horizontal irradiation, which were generated by the ensemble system of the numerical weather prediction model COSMO-DE, called COSMO-DE-EPS, and statistically interpreted based on synoptic observations at weather stations by Ensemble-MOS of DWD, see Hess (2020). The weather forecasts are issued on a 20 km \(\times\) 20 km grid covering Germany and parts of the neighboring countries at every third hour. The forecasts of global horizontal irradiation were provided with forecast lead times up to 19 h, where the measurements and forecasts range over the months May, June and July of the years 2015–2017.

We split the data into a training set and a validation set. The training set is used to fit model parameters and consists of data from the years 2015 and 2016. Based on the validation set from 2017 the accuracy of the predictions generated by the fitted model is evaluated.

4.2 Fitting unconditional and conditional distributions of forecasting errors

In this section we discuss the fitting of R-vine copulas, as outlined in Sects. 3.3 and 3.5, in order to determine uncertainty sets \(B^*\) of the form introduced in Sect. 3.6. First we explain how to model the (unconditional) distribution of the n-dimensional random vector \(X = P^{\text {PV}}-P^\mathrm{F}\) of power forecasting errors at the n nodes of the electricity network considered in the present paper, where \(n=13\). Besides this, we additionally consider the random vector \(S=(S_1,\ldots ,S_n):\Omega \rightarrow [0,\infty )^n\), which describes the forecasted solar radiation at the n nodes of the electricity network, and we model the conditional distribution of X given that \(S=s\) for some \(s\in [0,\infty )^n\). Moreover, we consider two further types of conditional distributions of X under the condition that \({\overline{S}}={\overline{s}}\) and \(S_k=s_k\), respectively, for some \({\overline{s}}\ge 0\), \(s_k\ge 0\) and \(k\in \{1,\ldots ,n\}\), where

$$\begin{aligned} {\overline{S}}=\frac{1}{n}\sum _{k=1}^n S_k. \end{aligned}$$
Fig. 3 Histograms and fitted KDEs of forecasting errors \(X_k,X_{k'}\) (left, in MW) and forecasted radiations \(S_k,S_{k'}\) (right, in \(\frac{kWh}{m^2}\)) for two examples of solar feed-in points \(k,k'\in \{1,\ldots ,n\}\)

As outlined in Sect. 3, copula theory allows for the modeling of the multivariate distribution of random vectors like the random power forecasting error \(X:\Omega \rightarrow \mathbbm {R}^{n}\). In order to estimate the univariate (marginal) CDFs \(F_{X_1},\ldots ,F_{X_n}\) we use numerically integrated KDEs, with a Gaussian kernel and a bandwidth equal to the estimated standard deviation \(\sigma _k\) of \(X_k\) for \(k=1,\ldots ,n\), see the left column of Fig. 3. Once an R-vine copula is fitted to the distribution of X, as described in Sect. 3.3, we are able to draw realizations from the fitted distribution of X, with which the uncertainty set \(B^*\) can be determined as described in Sect. 3.6. This method results in one single uncertainty set \(B^*\) for all considered hours, since the fitted R-vine copula models the (unconditional) distribution of X, irrespective of other variables which are possibly correlated with X. Thus, it is sensible to investigate whether and to what extent the random vector X of power forecasting errors depends on various other variables, like the random vector S of forecasted solar radiations at the n nodes. For this reason, we also model various conditional distributions of X.
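The estimation of a marginal CDF from a numerically integrated Gaussian KDE can be sketched as follows. This is a minimal illustration under stated assumptions: the grid resolution, the padding of six bandwidths, the trapezoidal rule and the toy data are choices made here for the example and are not specified in the paper.

```python
from math import exp, pi, sqrt
from statistics import stdev


def gaussian_kde(data, bandwidth):
    """Return the Gaussian kernel density estimate of `data` as a function of x."""
    norm = 1.0 / (len(data) * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


def kde_cdf_on_grid(data, bandwidth, num_points=2001, pad=6.0):
    """Integrate the KDE with the trapezoidal rule on an equidistant grid."""
    lo = min(data) - pad * bandwidth
    hi = max(data) + pad * bandwidth
    step = (hi - lo) / (num_points - 1)
    grid = [lo + i * step for i in range(num_points)]
    density = gaussian_kde(data, bandwidth)
    values = [density(x) for x in grid]
    cdf = [0.0]
    for i in range(1, num_points):
        cdf.append(cdf[-1] + 0.5 * (values[i - 1] + values[i]) * step)
    return grid, cdf


# Hypothetical forecasting errors at one node (in MW), for illustration only.
errors = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
grid, cdf = kde_cdf_on_grid(errors, bandwidth=stdev(errors))
```

Evaluating the resulting CDF at the observed errors yields values in the unit interval, which is the form of input a copula is fitted to.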

To condition on the forecasted solar radiation vector S, we consider the three cases mentioned above, i.e., \(S=s\), \({\overline{S}} = {\overline{s}}\), and \(S_k=s_k\) for some \(k\in \{1,\ldots ,n\}\). From a meteorological perspective, the network nodes in \(\mathcal {N}\) are in close geographical proximity and, therefore, the forecasted solar radiations \(S_{1}, \ldots , S_n\) at the n network nodes are highly correlated. Thus, it might be sufficient to consider either the average solar radiation \({\overline{S}}\) or the solar radiation \(S_k\) for one single node, instead of the random vector S, which reduces the complexity of the copula model without much loss of information.

As can be seen in Fig. 3, the power forecasting errors \(X_k,X_{k'}\) have unimodal distributions which are well approximated by KDEs. For the forecasted solar radiations, \(S_k,S_{k'}\), however, the values of the densities are significantly larger than zero at the distribution limits. Since the kernel of the KDE would cross the bounds of the distribution for data points close to those bounds, we first transform the components of S, as well as \({\overline{S}}\) and \(S_k\), using the mapping \(T:[a,b]\rightarrow [-\infty ,\infty ]\) with \(T(x) = F_{N(0,1)}^{-1}(F_{U(a,b)}(x))\) for each \(x\in [a,b]\), where \(F_{N(0,1)}\) is the CDF of the standard normal distribution and \(F_{U(a,b)}\) is the CDF of \(U(a,b)\), the uniform distribution on the interval \([a,b]\) for some \(a,b\in \mathbbm {R}\) with \(a<b\). Thus, T maps the bounded interval \([a,b]\) onto \(\mathbbm {R}\). Since the endpoints a and b are mapped to \(-\infty\) and \(\infty\), respectively, we choose them to be slightly outside the bounds of the solar radiation distribution such that T does not map any data point to \(\pm \infty\). The ranges of values of the transformed random variables T(S), \(T({\overline{S}})\) and \(T(S_k)\) are unbounded and we can apply kernel density estimators to their transformed data points T(s), \(T({\overline{s}})\) and \(T(s_k)\), where \(T(s)=(T(s_1), \ldots , T(s_n))\). Finally, we transform the density functions \({\hat{f}}_{T(S_i)}\) back to the interval \([a,b]\) with \({\hat{f}}_{S}(x) = \frac{1}{c} {\hat{f}}_{T(S)}(T(x))\) for each \(x\in [a,b]\), where \(c>0\) is a normalizing constant.
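The mapping T and its inverse can be sketched with the Python standard library. The helper name `make_transform`, the margin of 0.01 and the radiation values are assumptions made for this illustration; the paper does not state how far outside the data range a and b are chosen.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist(0.0, 1.0)


def make_transform(a, b):
    """Return T(x) = F_N(0,1)^{-1}(F_U(a,b)(x)) and its inverse for a < b."""

    def T(x):
        # F_U(a,b)(x) = (x - a) / (b - a) for x in [a, b]
        return _STD_NORMAL.inv_cdf((x - a) / (b - a))

    def T_inv(z):
        return a + (b - a) * _STD_NORMAL.cdf(z)

    return T, T_inv


# Hypothetical radiation forecasts in kWh/m^2, for illustration only.
radiation = [0.0, 0.12, 0.45, 0.80]
margin = 0.01  # assumed; keeps all data strictly inside (a, b)
a, b = min(radiation) - margin, max(radiation) + margin
T, T_inv = make_transform(a, b)
transformed = [T(s) for s in radiation]
```

Because every data point lies strictly between a and b, all transformed values are finite, so a KDE with an unbounded kernel can be applied to them.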

Fig. 4 Histograms of 10,000 simulated conditional forecasting errors \(X_k,X_{k'}\) (in MW), given that \(S=s\) (left), \({\overline{S}}={{\overline{s}}}\) (middle), and \(S_k=s_k, S_{k'}=s_{k'}\) (right), for two examples of solar feed-in points \(k,k'\in \{1,\ldots ,n\}\) and for three different quantile values of \({\overline{s}}, s_k,s_{k'}\) or a vector of quantile values \(s \in [0,1]^{13}\). The colors indicate the quantiles of forecasted solar radiation on which the samples are conditioned, i.e., blue, green and red correspond to low, medium and high solar radiation, respectively

Once the densities of the marginal distributions of X and S, as well as the densities of \({\overline{S}}\) and \(S_k\) are determined, they are numerically integrated to obtain the corresponding CDFs with which an R-vine copula is fitted, as described in Sect. 3.3. Now we can draw samples from the (unconditional and conditional) R-vine copula model with which we construct uncertainty sets \(B^*\), as described in Sect. 3.6. Figure 4 shows the histograms of samples drawn from conditional R-vine copula models for different solar radiation forecasts and, in particular, how the conditional error distribution changes for different forecasted solar radiations.

To check how well the R-vine copula model captures the correlations of the dataset of forecasted radiations and power forecasting errors, we compare the values of empirical Kendall’s tau (see (16)) for all pairs of components of the vector \((S_1,\ldots ,S_{n},X_1,\ldots ,X_{n})\). It can be seen in Fig. 5 that the R-vine copula model manages to capture the correlation within the underlying dataset quite well, since the values of empirical Kendall’s tau computed from the dataset of forecasted radiations and power forecasting errors (left) and from simulated realizations of the R-vine copula model (right), respectively, show very similar correlation structures.
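For reference, an empirical Kendall's tau for one pair of components can be computed as sketched below. Equation (16) is not reproduced in this section, so the sketch assumes the standard definition without tie correction, i.e., the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Empirical Kendall's tau: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
        score += (product > 0) - (product < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)


print(kendall_tau([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0
```

Applying this function to all pairs of components, once for the dataset and once for simulated realizations, gives two matrices of the kind compared in Fig. 5.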

Note that we consider copulas with up to 26 dimensions while the available dataset contains only 180 data points. This makes it difficult to reliably assess the goodness of fit of the copula model. However, in the following we evaluate the entire model chain with various validation scores in order to assess the additional benefit of the copula model.

Fig. 5 Empirical Kendall’s tau computed from the dataset of forecasted radiations and power forecasting errors (left) and from simulated realizations of the R-vine copula model (right) for all pairs of components of the vector \((S_1,\ldots ,S_{n},X_1,\ldots ,X_{n})\). Top left: \(\tau (X_i,X_j)\). Bottom right: \(\tau (S_i,S_j)\). Top right and bottom left: \(\tau (X_i,S_j)\)

4.3 Analyzing the size of uncertainty sets

We now analyze the size of uncertainty sets for the robust approximation of chance constraints using the scenario approach described in Sect. 3.6. The resulting sets depend on the samples drawn from the unconditional probability distribution and the three conditional distributions of power forecasting errors, respectively, considered in Sect. 4.2. Note that the minimum number N of samples required for the scenario approach, determined by means of (20), ranges from \(N=48\) (for \(1-\varepsilon =0.01\)) through \(N=469\) (for \(1-\varepsilon =0.9\)) to \(N=4684\) (for \(1-\varepsilon =0.99\)). In practice, a coverage probability \(1-\varepsilon\) of about 0.9 is often the relevant choice, and therefore \(N=469\) samples are sufficient for the scenario approach with a confidence of \(1-\delta =0.99\).

For the numerical results discussed in the present section, we use an average uncertainty set which is obtained by applying the scenario approach 500 times. In this way, our numerical results become reproducible because the average uncertainty set does not change significantly when the procedure described above is repeated.

Fig. 6 Average size of uncertainty sets for varying coverage probabilities \(1-\varepsilon\) and with a confidence of \(1-\delta =0.99\), using the unconditional distribution of X (no), and the conditional distribution given that \(S=s\) (all), \({\overline{S}}={\overline{s}}\) (avg) and \(S_{19}=s_{19}\) (one), respectively

Figure 6 shows values of the size measure given in (18a), i.e., the sum of interval lengths, of uncertainty sets computed, as an example, for a typical summer day at noon with an average hourly global horizontal irradiation of 0.63 \(\frac{kWh}{m^2}\), as a function of the coverage probability \(1-\varepsilon\) with a confidence of \(1-\delta =0.99\). Note that smaller confidence levels would lead to smaller uncertainty sets, but the quality of these sets would also decrease. In particular, there would no longer be a confidence probability of 0.99 that the computed uncertainty set covers the chosen probability mass of \(1-\varepsilon\).

The values displayed in Fig. 6 are normalized by the size of the largest uncertainty set, namely the unconditional uncertainty set for a coverage probability of 0.99. It can be seen that the sizes of the uncertainty sets increase with increasing probabilities \(1-\varepsilon\) as the confidence regions cover a larger set of realizations of the random vector X of power forecasting errors. In comparison to the uncertainty sets constructed with conditional probability distributions of X, the unconditional distribution of X leads to larger uncertainty sets for all coverage probabilities \(1-\varepsilon\). Thus, with knowledge of the forecasted solar radiation, it is possible to adapt the uncertainty sets to the current weather situation, which leads to smaller sizes. Not surprisingly, the conditional distribution of X with given solar radiation at all n solar feed-in nodes yields the smallest uncertainty sets for all coverage probabilities \(1-\varepsilon\). However, the differences between these sizes and those obtained for the other two conditional settings with less complete information on the forecasted solar radiation, i.e., knowledge of the average solar radiation (avg) or of the solar radiation at one single node (one), are not too large. Furthermore, the size differences between the conditional settings ’avg’ and ’one’ are negligible.

Table 2 Average empirical coverage probability and average reduction of size/volume of uncertainty sets for the four (unconditional/conditional) settings ’no’, ’avg’, ’one’ and ’all’ of the distribution of the power forecasting error X, based on data for each day in the validation dataset

The numerical results presented in the remaining part of this section concern the case \(1-\varepsilon = 0.9\), i.e. the practically most relevant value of the coverage probability \(1-\varepsilon\). For this safety margin, we analyze the uncertainty sets obtained for the four (unconditional and conditional) distributions of X described above and for each day in the validation dataset. In particular, we determine the empirical coverage probability by counting how often the realizations drawn from the respective distribution of the random vector X belong to the corresponding uncertainty set. Furthermore, we compute and compare the average size of the uncertainty sets, i.e. the sum of interval lengths, and their average volume, i.e. the product of interval lengths. The results are displayed in Table 2, where it can be seen that the four different settings lead to similar empirical coverage probabilities around the given level of 0.9. On the other hand, the reductions of size and volume of uncertainty sets implied by considering conditional distributions of the power forecasting error X are clearly visible. Again, the case with given solar radiation at all n solar feed-in nodes yields the smallest uncertainty sets, whereas the size differences between the conditional settings ’avg’ and ’one’ are negligible.
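The empirical coverage probability described above amounts to counting the realizations that fall inside the box. A minimal sketch, with hypothetical function names and toy data chosen only for illustration:

```python
def contains(lower, upper, x):
    """True if the vector x lies in the box [lower, upper] componentwise."""
    return all(l <= xk <= u for l, xk, u in zip(lower, x, upper))


def empirical_coverage(lower, upper, realizations):
    """Fraction of realizations that belong to the uncertainty set [lower, upper]."""
    hits = sum(contains(lower, upper, x) for x in realizations)
    return hits / len(realizations)


# Hypothetical two-dimensional box and realizations, for illustration only.
lower, upper = [0.0, 0.0], [1.0, 1.0]
realizations = [(0.5, 0.5), (1.5, 0.5), (0.0, 1.0), (0.2, -0.1)]
coverage = empirical_coverage(lower, upper, realizations)
print(coverage)  # 0.5
```

A realization counts as covered only if all of its components lie within their intervals, which mirrors the joint nature of the chance constraint.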

Fig. 7 Uncertainty sets in MW (\(\varepsilon =0.1, \delta =0.01\)) for average solar radiation forecast of 0.76 \(\frac{kWh}{m^2}\) (left) and 0.18 \(\frac{kWh}{m^2}\) (right), using the unconditional distribution of X (no), and the conditional distribution given that \(S=s\) (all), \({\overline{S}}={\overline{s}}\) (avg) and \(S_{19}=s_{19}\) (one), respectively

To further analyze the impact of additional knowledge regarding solar radiation forecast on size and location of uncertainty sets, we determined uncertainty sets for a rather sunny day at noon with a high average solar radiation forecast of 0.76 \(\frac{kWh}{m^2}\) and a less sunny day at noon with a low average solar radiation forecast of \(0.18~\frac{kWh}{m^2}\). The results are shown in Fig. 7, where the uncertainty sets are plotted via their confidence intervals (in MW) for each solar feed-in point.

It turned out that the lengths of the confidence intervals shrink significantly when conditional distributions of the power forecasting error X are considered, given a high average solar radiation forecast. More precisely, the lower endpoints of the confidence intervals are shifted upwards, i.e., negative power forecasting errors are less likely, whereas the upper endpoints remain almost unchanged, see Fig. 7 (left). On the other hand, for a low average radiation forecast, the confidence intervals are shifted downwards when conditional distributions of the power forecasting error are considered, but their lengths remain almost unchanged, see Fig. 7 (right).

Finally, we note that the results of the numerical experiments presented in Aigner et al. (2021) are also based on (measured) power feed-in data from NNG and forecasted radiation data from DWD. However, the database used there differs from that of the present paper, which additionally exploits solar power forecast data provided by the forecasting model of Schinke-Nendza et al. (2021). In this way, by modeling the multivariate probability distribution of solar power forecast data via R-vine copulas, it is possible to determine conditional uncertainty sets which meet the desired coverage probability of 0.9. They are significantly smaller than the corresponding unconditional uncertainty sets from Aigner et al. (2021), which led to a larger empirical coverage of 0.98 although \(1-\varepsilon =0.9\) was required.

4.4 Robust curtailment

As important as the size of the computed uncertainty sets is the quality of solutions obtained by solving the robust approximation (8) of the chance constrained optimization problem described in (6). In order to solve (8), we use the network parameters given by the power network operator NNG. The curtailment options for the feed-in nodes in the electrical power network of NNG are \(\beta _k \in \{0,\,0.1,\,0.2,\,\ldots ,\,1.0\}\). Moreover, the participation factors of the generators are fixed values given by NNG (\(\alpha _{31} = \alpha _{34} = 0.05\), \(\alpha _{32} = \alpha _{33} = 0.45\)). There are no costs associated with the power transfer at the (slack-) generators on the boundary nodes. Hence, there are no generator production costs and the corresponding term in the objective function is given as \(\sum \nolimits _{k \in \mathcal {N}_\text {G}} f_k(P^{\text {G}}_k)\) with \(f_k(P^{\text {G}}_k)=0\) for each \(k \in \mathcal {N}_\text {G}\). The curtailment costs are modeled as \(\sum \nolimits _{k \in \mathcal {N}} c_k(\beta _k)\) with \(c_k(\beta _k)=P^{\text {I}}_k(1-\beta _k)\) for each \(k\in \mathcal {N}\). The minimization of this objective function leads to a minimum curtailment of solar feed-in.
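To make the objective concrete, the following sketch evaluates the curtailment cost \(\sum _k P^{\text {I}}_k(1-\beta _k)\) for given curtailment decisions on the discrete grid of options. The node labels, the power values and the interpretation of \(P^{\text {I}}_k\) as installed solar power in MW are assumptions made for this example; the sketch only evaluates the objective and does not solve the optimization problem (8).

```python
# Admissible curtailment options 0.0, 0.1, ..., 1.0
CURTAILMENT_LEVELS = [k / 10 for k in range(11)]


def curtailment_cost(installed_power, beta):
    """Evaluate sum_k P^I_k * (1 - beta_k) for curtailment factors beta_k."""
    for node, b in beta.items():
        if not any(abs(b - level) < 1e-9 for level in CURTAILMENT_LEVELS):
            raise ValueError(f"beta for node {node} is not an admissible option: {b}")
    return sum(installed_power[k] * (1.0 - beta[k]) for k in installed_power)


# Hypothetical installed power (assumed unit: MW) at two feed-in nodes.
installed_power = {1: 2.0, 2: 4.0}
beta = {1: 1.0, 2: 0.5}  # node 1 is not curtailed, node 2 is limited to 50%
cost = curtailment_cost(installed_power, beta)
print(cost)  # 2.0
```

With \(\beta _k=1\) at every node the cost is zero, which corresponds to the trivial solutions without curtailment.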

Due to the balanced network situations in the historical data, there is no need to curtail the solar feed-in in the instances from the validation set. There is also no danger of overload and the optimization leads to trivial solutions with a curtailed solar power equal to 0. Thus, in order to generate test cases with critical network situations (and non-trivial solutions), we artificially increased the solar power feed-in, whereas the network topology, transmission line parameters and the power demand remained unchanged. More precisely, based on the data of the validation set, we increased the installed solar power and the feed-in up to the total solar power capacities planned by NNG for the year 2022 and the total solar power increase planned for the year 2025. The corresponding scaling of power generation forecast and uncertainty sets creates an oversupply of renewable energy, and therefore it is more likely in these instances that curtailment will be required. Furthermore, in addition to the up-scaled solar power, we simulated the impact of transmission line failure on the solution of our optimization problem.

Thus, we now discuss further details for the following experimental setups:

  A: Installed solar power as planned in 2025,

  B: Installed solar power as planned in 2022 with a failure of lines (6, 19) and (9, 30).

To obtain the results, a mixed-integer optimization problem was solved for each instance and each (unconditional and conditional) uncertainty set. The computing times are very low and, thus, solutions can be generated efficiently. Indeed, the average computing times for the two settings are 2.8 s (setting A) and 1.1 s (setting B), with maximal run times of 8.2 s (setting A) and 4.0 s (setting B).

Table 3 Number of instances where a nominal solution of (1) without probabilistic constraints leads to overload in the network compared to robust solutions (with security of \(1-\varepsilon =0.9\)) using the four (unconditional/conditional) types ’no’, ’avg’, ’one’ and ’all’ of uncertainty sets

The robustness of a solution of (8) can be validated by checking whether the computed network configuration leads to an overload after the realization of uncertainty. The corresponding entries in Table 3 show that nominal solutions generated without probabilistic constraints (or, in other words, for \(1-\varepsilon =0\)) lead to overload in a large number of test instances. In contrast, at most three robust solutions lead to constraint violations in each setting for the different probabilistic models. The relative frequency of violations is therefore below the given threshold of \(\varepsilon =0.1\), which indicates that the robust solutions are feasible for the chance constraints. This shows that robust protection against uncertainties is necessary and reasonable, since the number of technical constraint violations could be strongly reduced in the numerical experiments.

Fig. 8 Box plots of relative cost increase by the robust protection using the four (unconditional/conditional) types ’no’, ’avg’, ’one’ and ’all’ of uncertainty sets. The box extends from the lower to upper quartile values with a line at the median and a marker at the arithmetic mean. The whiskers extending from the boxes show the maximum ranges of relative cost increase

To further investigate the quality of solutions of (8), we computed the amount of curtailed solar power of the robust solution in comparison to the solution of the nominal problem (1) without protection against uncertainty. The increase in curtailed energy of the robust solutions in comparison to the nominal ones can be interpreted as the cost of robust protection, i.e., how much the curtailment costs increase due to the protection against uncertainties. Figure 8 shows box plots of the relative increase in curtailment costs using the four (unconditional/conditional) types ’no’, ’avg’, ’one’ and ’all’ of uncertainty sets. One can see that, again, the addition of further knowledge about the solar radiation improves the performance in both settings. This corresponds to the size reduction of the uncertainty sets observed in Sect. 4.3. Overall, the relative cost increase in all experiments is relatively small. However, using the samples drawn from the three conditional distributions of power forecasting errors enables us to further reduce the amount of wasted energy under the same solution guarantees, where, again, the conditional settings ’avg’ and ’one’ have a similar impact. In comparison with the preliminary results obtained in Aigner et al. (2021), the amount of curtailed energy could be drastically reduced on average from about 13 to \(5\%\) under the same solution quality guarantees. This coincides with the reduction of uncertainty set size discussed at the end of Sect. 4.3.

In summary, the obtained results show that the scenario approach for the considered instances in combination with the copula-based stochastic modeling of power forecasting errors leads to high-quality solutions. The addition of further knowledge about the current weather situation allows us to construct more precise uncertainty sets. We are able to produce robust solutions with a relatively small increase of curtailment costs, while maintaining the same level of protection.

5 Conclusion

In this paper, we combine the robust approximation of chance constrained DC Optimal Power Flow with a probabilistic uncertainty model based on R-vine copulas to reduce the curtailment of solar power while keeping the power grid stable. The chance constrained DC Optimal Power Flow determines appropriate levels of curtailment based on a deterministic forecast for the expected solar power feed-in and uncertainty sets, i.e., multidimensional cuboids which contain the forecasting error with a given probability. These uncertainty sets are approximated with the help of the multivariate probability distribution of the forecasting error at all considered power grid nodes. This results in less curtailment and a more stable power grid compared to the results of a model without uncertainty sets.

To further improve upon these results, we incorporate knowledge about solar radiation in the solution process by considering the conditional forecasting error distribution for a given solar radiation forecast. This leads to sharper distributions, i.e., the forecasting error can be predicted with higher accuracy, which results in smaller uncertainty sets. Compared to the unconditional case, this leads to even less curtailment and improved stability of the power grid.

Our numerical results demonstrate the applicability of our procedure and the positive effects of incorporating a probabilistic model for the distribution of random solar radiation vectors. Our solution framework can also be transferred to other applications under uncertainty, for example in energy network optimization.

Future research could add further features and investigate questions arising from the application, for example adding optimal transmission switching under uncertainty or including storage elements and unit commitment constraints over time. From a mathematical point of view, it would be interesting to study different geometries for uncertainty sets to further reduce the conservatism of the robust approximation. The major challenge is to find assumptions under which an equivalent reformulation of the resulting problems is possible. In order to improve the copula-based sampling from conditional probability distributions, it might be promising to add more information (e.g. temperature, solar altitude, time) to the model.