1 Introduction

Since the deregulation of the energy market, the question of how to determine the value of a power plant has become relevant. The traditional approach of valuing it within a given portfolio of other assets, operated in a coordinated way against one's customer load, is one possibility. A second approach is to adopt the ideas of real option pricing in finance. In the first case one ends up with models resembling unit commitment (e.g., van Ackooij et al. 2018), but on a long time scale. Although the actual operation of the power plant can be represented in great detail, it becomes harder to incorporate other features in the model. This is typically the case for uncertainty, where one ends up with multi-stage mixed-integer programs which are not easily solved. One can also argue that it is unreasonable to model the system as fully coordinated. In contrast, when modelling the power plant as a real option, thus operating it in the face of a set of market signals, the setting becomes that of perfect competition. Uncertainty is also naturally modelled, but this comes at the expense of modelling the plant as an independent production unit and thus with less realism in that sense.

However, the price of the real option may well serve as a financial reference between two parties, for example between the power plant owner and a trading entity actually operating on the market. Taking the option pricing perspective, it must be emphasized that energy markets are by far not as “granular” as equity markets. For instance, on the electricity market one cannot buy a contract for delivery in a given hour 6 months from now. The classical pricing-hedging duality argument is thus not applicable. Moreover, when operating a power plant, generation is locally bound to a given power output level, either as a result of ramping conditions or of minimum up/down times. It is therefore reasonable to model the power plant with sufficient realism for the above discrepancies to be minimal. This is the stance that we have taken in the current work. It leads us to consider a multiscale stochastic program in the sense of Glanzer and Pflug (2019), i.e., a multistage stochastic optimization problem where each stage itself is subdivided into a given set of time instants.

To account for uncertainty, we start out with a set of typical stochastic models for the underlying prices, based on multi-factor models (e.g., Clewlow and Strickland 2000) driven by Brownian motions. Clearly, such (commonly used) idealized modelling assumptions are rather unrealistic. It is thus the aim and the core part of the present paper to relax such strong assumptions by computing distributionally robust solutions to the studied operational problem and to investigate how the resulting valuation deviates when model ambiguity is taken into account. Distributionally robust optimization is a field which has recently gained a lot of popularity in the literature (see (Pflug and Pichler 2014, pp. 232–233) for a review of different approaches). In particular, ambiguity sets based on distance concepts between probability measures (such as the Wasserstein distance) are well-supported by theory and frequently applied (e.g., Pflug and Wozabal 2007; Esfahani and Kuhn 2018; Glanzer et al. 2019; Duan et al. 2018; Gao and Kleywegt 2016). However, to the best of our knowledge, the effects of distributional robustness in (especially multistage) real-world applications have not been investigated yet.

In order to solve the formulated problem numerically, the given uncertainty model will be discretized on a scenario lattice. The multiscale structure could then simply mean that uncertainty is neglected within a given stage (cf., e.g., Moriggia et al. 2018). More advanced approaches do consider some intra-stage uncertainty [e.g., the so-called multi-horizon approach originally suggested in Kaut et al. (2014) and subsequently studied and applied in Seljom and Tomasgard (2017), Skar et al. (2016), Werner et al. (2013), Zhonghua et al. (2015), Maggioni et al. (2019)], but the resulting paths do not necessarily connect with subsequent elements in the scenario tree/lattice. Hence, the multi-horizon approach is not appropriate for the present problem, as its key requirement of two time scales that may be assumed to run completely independently of each other is not satisfied. Indeed, we deal with two different granularities associated with one and the same stochastic process reflecting the evolution of the underlying market prices. A framework for such situations, where sub-stage paths in the lattice are carefully connected, has recently been proposed in Glanzer and Pflug (2019). We test the multiscale stochastic programming approach suggested in Glanzer and Pflug (2019) in the context of the present real-world application.

Although the resulting ideas will be illustrated through the power plant real option framework, their potential usage readily extends beyond this specific application. In terms of contributions we can state:

  • For the application of real option pricing, we investigate more realistic exercise patterns. In order to keep the computational burden low, this naturally leads to multiscale stochastic programs. We also consider model ambiguity to mitigate the effect of the fairly idealized models for market prices. From a high-level perspective, we thus extend the literature on real-world applications dealing with two fundamental problems in stochastic programming, namely the problem of time scales with multiple granularities and the problem of model ambiguity.

  • With respect to multistage model ambiguity, we propose a new concept based on the Wasserstein distance. It is tailored with computational tractability in mind, namely such that (on a discrete scenario tree/lattice) the applicability of a classical backward dynamic programming recursion is maintained. In particular, the suggested framework leads to solutions that are robust w.r.t. model misspecification in a ball around each conditional transition probability distribution. The size of these balls may be controlled uniformly by a single input parameter. We also link the concept to the nested distance in such a way that it inherits a favourable stability property of the latter.

  • In the context of Wasserstein ambiguity sets, we propose a state-dependent metric as a basis for the Wasserstein distance. Thereby we account for more realistic worst-case scenarios. We discuss that the well-appreciated statistical motivation for using Wasserstein balls is not invalidated by doing so.

The paper is organized as follows. Section 2 describes the valuation model and the uncertainty model. As is typical for real-world energy applications, a sound mathematical framework reflecting all peculiarities of the problem requires careful attention to detail. The underlying uncertainty factors are modelled by a continuous time stochastic process. However, in the light of the nature of the decision problem, we will eventually apply a stochastic dynamic programming algorithm which operates backwards in (discrete) time. To prepare for the computational solution, we therefore discuss all discretization steps required by the multiscale stochastic programming framework that we adopt. Section 3 is dedicated to model ambiguity. We introduce and discuss a new concept which is tailor-made for incorporating model ambiguity into dynamic stochastic optimization models on discrete structures. All numerical experiments and aspects of the computational solution algorithm are given in Sect. 4. Section 5 concludes. Some technical details and examples are deferred to the “Appendix”.

2 The model

Our valuation problem belongs to the class of discrete time sequential decision problems with finite horizon T, decisions \(u_t\), state variables \(z_t\), and a Markovian driving process \(\xi _t\):

$$\begin{aligned} \begin{aligned} \underset{\{u_t\}_{t=0}^{T-1}}{\text {max}}\quad &{\mathbb {E}}\left[ \displaystyle \sum _{t=0}^{T-1}\beta _t h_t(z_t, u_t, \xi _t) \right] \\ \text{ s.t. }\quad & z_{t+1} = g_t(z_t, u_t, \xi _{t+1}) \quad \forall t = 0,\ldots ,T-1,\\&u_t \in {\mathcal {U}}_t(z_t) \;\text{ a.s. } \forall t=0,\ldots ,T-1,\; \, u_t = u_t(z_t, \xi _{t}), \\&z_t \in {\mathcal {Z}}_t \;\text{ a.s. } \forall t=0,\ldots ,T-1\; \end{aligned} \end{aligned}$$
(1)

Here T is the number of decision stages and \(g_t(z_t,u_t,\xi _{t+1})\) is the state transition function. The driving stochastic process \(\xi _t\) is assumed to belong to \(L_1(\Omega _t,\mathcal {F}_t ; \mathbb {R}^{d_2})\,\) and the feasible decision variables at stage t are defined by the set \({\mathcal {U}}_t(z_t) \subseteq {\mathbb {R}}^m\). The set of all reachable state variables is denoted by \(\mathcal {Z}_t \subseteq \mathbb {R}^{d_1}\). The stage-wise profit function \(h_t : \mathbb {R}^{d_1} \times \mathbb {R}^m \times \mathbb {R}^{d_2} \rightarrow \mathbb {R}\) is continuous and satisfies the following growth condition:

$$\begin{aligned} \left| h_t( z, u, x ) \right| \le K \cdot (1 + \left\| z\right\| + \left\| u\right\| + \left\| x\right\| ), \end{aligned}$$

for all \((z,u,x) \in \mathbb {R}^{d_1 + m + d_2}\) and some constant K. We choose the discount factor \(\beta _t = \beta ^t\) for some constant \(\beta \in (0,1]\) throughout the paper. Any decision \(u_t\) to be made at time t may only depend on the current state \(z_t\) and the most recent observation of exogenous information \(\xi _{t}\). This is the non-anticipativity condition. The initial values \(\xi _0\) and \(z_0\) of the random process \(\xi \) and the state vector z are assumed to be constant.

In our application, the decisions \(u_t\) represent the weekly electricity production plan for a thermal power plant. The latter is characterized by many technical constraints, such as minimum up/down times or ramping constraints. Fine-grained constraints can be incorporated into the model by increasing the dimension of the state vector and accounting for the number of hours the plant has been offline/online. Such state-representations of constraints on generation assets have received attention in the literature (see, e.g., Martinéz et al. 2008; Frangioni and Gentile 2006; Frangioni et al. 2008 and the references therein). Finer granularity of the time dimension and/or the state variable would result in a significant increase of time steps T (and reduction of the time step size \(\Delta t\)) as well as an increase in the dimension of \(z_t\). For this reason, we introduce here the idea of a multiscale model: While the production decisions \(u_t\) are made on a weekly scale, the production costs and revenues are calculated on a finer time scale. To make the dynamic optimization algorithm tractable, we make the assumption that the decisions, i.e. the production profiles, must be chosen from a pre-specified set with finite cardinality. The profiles are set up such that they reflect realistic operating conditions and key choices, such as generating at minimal stable generation (MSG) at off-peak hours.

Just prior to presenting the specific instantiation of (1), let us emphasize once more that the idea of subdividing a “stage” to mitigate (the curse of) dimensionality goes largely beyond the presented application. Typical other energy problems with similar mechanisms are cascaded reservoir management problems [e.g., see the extensive discussion in van Ackooij et al. (2014) as well as Escudero et al. (1996, 1999), Zéphyr et al. (2015), Cervellera et al. (2006), Aasgård et al. (2014), Séguin et al. (2017), Fleten et al. (2011)].

2.1 Instantiation of the problem: the valuation model

In our instantiation of problem (1) the time horizon is spanned by T weeks. Each week \(t=1,\dots,T\) is subdivided into S equally sized blocks of hours. With respect to the notation introduced earlier, we now specify the following components:

  • the price process \(\xi _{t,s}=(\xi ^e_{t,s}, \xi ^f_{t,s} , \xi ^c_{t,s}) \in {\mathbb {R}}^3 \) represents the electricity price in GBP per megawatt hour (£/MWh), the fuel price in USD per tonne ($/tonne(fuel)) and the \(\text{ CO }_2\) allowances price in EUR per tonne (€/tonne(carbon)), for each block s within week t, for \(s=0, \dots , S\). The values within week t are \(\xi _{t,s}\) for \(s=1,\dots , S\), with the convention that \(\xi _{t,S} =\xi _{t+1,0} \) coincides with the initial prices of the next stage. This convention ensures continuity of prices between weeks. The information up to stage t is to be understood as the information up to the value \(\xi _{t,0}\).

  • the control \(u_t = \{u_{t,s}\}_{s=0}^{S-1} \in {\mathcal {U}} \subseteq {\mathbb {R}}_+^S\) represents the production profile vector for week t, where \(u_{t,s} \) is given in megawatt (MW) and denotes the production at block s. The decision \(u_t\) is made before any of the intermediate values of week t are observed; hence, \(u_t\) is \(\xi _{t,0}\)-measurable.

  • the state vector \(z_t\) is two-dimensional, i.e., \(z_t = (x_t,y_t)\) with

    • \(x_t\in {\mathbb {R}}_+\) representing the amount of \(\text{ CO }_2\) allowances (measured in tonnes of carbon), that are left for week t.

    • \(y_t \in {\mathbb {Z}}_+\) representing the number of hours the power plant was offline before the beginning of week t.

The objective function \(h_t : \mathbb {R}\times {\mathbb {Z}}_+ \times \mathbb {R}_+^S \times \mathbb {R}_+^3 \rightarrow \mathbb {R}\) is given by

$$\begin{aligned} h_t\left( [x_t,y_t],u_t,\xi _{t,0}\right) = {\mathbb {E}}\left[ \displaystyle \sum _{s=0}^{S-1} f_s(x_t,y_t,u_t,\xi _{t,s} ) \biggr | \xi _{t,0} \right] . \end{aligned}$$
(2)

The profit at each block s within week t is defined as follows:

$$\begin{aligned} f_s(x_t,y_t,u_t,\xi _{t,s}) = \left\{ \begin{array}{ll} \big (\xi ^e_{t,0} - H_1\,\xi ^f_{t,0}\big ) \,u_{t,0} \, \Delta s- f^{\text{ CO }_2}\big (x_t,{\bar{u}}_t,\xi ^c_{t,0}\big ) &{} \\ \quad - f^{\text {start}}\big (0,y_t,u_t,\xi ^e_{t,0}, \xi _{t,0}^f\big ) - f^{\text {tr}}(u_t) &{} \;\text{ if }\; s = 0 \\ \big (\xi ^e_{t,s} - H_1\,\xi ^f_{t,s}\big ) \,u_{t,s} \, \Delta s - f^{\text {start}}\big (s,y_t,u_t,\xi ^e_{t,s}, \xi _{t,s}^f\big ) &{} \;\text{ if }\; s > 0, \end{array} \right. \end{aligned}$$

where \({\bar{u}}_t := \sum _{s=0}^{S-1} u_{t,s}\). Costs incurred are based on the following component functions:

  • \(f^{\text{ CO }_2} : \mathbb {R}\times \mathbb {R}_+\times \mathbb {R}_+ \rightarrow \mathbb {R}_+\) gives the cost of buying more \(\text{ CO }_2\) allowances at the beginning of week \(t\,\) (before the values within week t are known);

  • \(f^{\text{ start }} : {\mathbb {Z}}_+^2 \times \mathbb {R}_+^S \times \mathbb {R}_+^2 \rightarrow \mathbb {R}_+\) gives the start-up cost if the power plant has been offline prior to block s (one of its arguments);

  • \(f^{\text{ tr }} : \mathbb {R}^S_+ \rightarrow \mathbb {R}_+\) represents fuel transportation costs linked to a selected production profile at the beginning of week t.

Table 1 summarizes constants used above or in the sequel.

Table 1 Description of the constants

We now describe how each state variable is updated. First we focus on the variables regarding the \(\text{ CO }_2\) allowances. Although in the past a given set of allowances was allocated for free, they are now, in principle, obtained from an auction process which we do not model. Within our model, the variable \(I_t\) represents the number of additional \(\text{ CO }_2\) allowances received from the regulator at the beginning of week t (measured in tonnes of carbon). This variable will typically be equal to zero, but occasionally it takes a relatively high value, namely at the rare events when new allowances are obtained.

Now, \(H_4\,{\bar{u}}_t\,\Delta s\) is the amount of \(\text{ CO }_2\) generated during stage t. Hence, with \(x_t\) being the remaining stock level and \(I_t\) the “inflows”, the amount of allowances one needs to buy at stage t is \(\alpha _t = [ x_t +I_{t} - H_4 \,{\bar{u}}_t\,\Delta s ]_{-}\). In the case where \(\alpha _t\) is positive, we follow a procurement strategy based on a low/middle/high price range partition resulting from some pre-market analysis. Prices in the interval \([{\underline{b}}, {\overline{b}}]\), with \(0 \le {\underline{b}} < {\overline{b}}\), are considered middle range. This is formalized as follows:

$$\begin{aligned} A(x_t,{\bar{u}}_t,\xi _{t,0}^c) = \alpha _t\cdot ( 1+ C_\alpha \cdot \min \left\{ \max \{ {\overline{b}} - \xi _{t,0}^c,0 \}/( {\overline{b}} - {\underline{b}}) , 1 \right\} ) , \end{aligned}$$
(3)

where \(C_\alpha \) is a constant that determines the size of the extra amount to be bought.

Recall that our implicit assumption is that new allowances are always bought before the prices within week t are known. The cost of buying additional certificates for week t is then given by

$$\begin{aligned} f^{\text{ CO }_2} (x_t, {\bar{u}}_t, \xi _{t,0}^c) = A(x_t,{\bar{u}}_t,\xi _{t,0}^c) \cdot H_3\, \xi _{t,0}^c\, . \end{aligned}$$

The amount \(x_{t+1}\) of remaining allowances after this purchase is updated as follows:

$$\begin{aligned} g^{(1)}(x_t, u_t, \xi _{t,0}^c)= A(x_t,{\bar{u}}_t,\xi _{t,0}^c) + x_t + I_t- H_4\,{\bar{u}}_t\,\Delta s . \end{aligned}$$
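For illustration, the procurement rule (3), the purchase cost \(f^{\text{ CO }_2}\) and the stock update \(g^{(1)}\) can be combined in a single routine. The following Python sketch assumes all constants of Table 1 are passed explicitly; the function name and signature are illustrative only.

```python
def co2_procurement(x_t, u_bar, xi_c, I_t, b_lo, b_hi, C_alpha, H3, H4, delta_s):
    """CO2 allowance handling for one week: shortfall alpha_t, purchased amount
    A of Eq. (3), purchase cost f^CO2, and the stock update g^(1). Argument
    names mirror the notation of the text; the interface itself is illustrative."""
    emitted = H4 * u_bar * delta_s                      # CO2 generated during week t
    alpha_t = max(emitted - (x_t + I_t), 0.0)           # shortfall [x_t + I_t - emitted]_-
    # buy an extra amount when the current price lies in the low/middle range
    extra = min(max(b_hi - xi_c, 0.0) / (b_hi - b_lo), 1.0)
    A = alpha_t * (1.0 + C_alpha * extra)               # Eq. (3)
    cost = A * H3 * xi_c                                # f^CO2
    x_next = A + x_t + I_t - emitted                    # g^(1)
    return A, cost, x_next
```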

The second state variable accounts for start-ups and the related costs. The latter depend on the amount of time the power plant has been offline. In our model this time frame is partitioned into C intervals of hours, denoted by \((c_j, c_{j+1}]\) for \(j=1, \dots , C-1\) with \(c_1=0\), and \((c_{C}, \infty )\), over which the start-up costs are assumed to be constant. The associated costs comprise works power, fuel burnt and extra costs. Depending on \(y_t\) and the chosen profile \(u_t\), one can readily determine in which interval each start-up of \(u_t\) falls.

The induced start-up costs at block s within week t are given by:

$$\begin{aligned} f^{\text {start}}(s,y_t,u_{t},\xi ^e_{t,s}, \xi _{t,s}^f) = {\overline{W}}_s(y_t,u_t) \,\xi ^e_{t,s} + {\overline{B}}_s(y_t,u_t)\, H_2\,H_6\,\xi ^f_{t,s} + {\overline{E}}_s(y_t,u_t), \end{aligned}$$
(4)

where

  • \({\overline{W}}_s(y_t,u_t)\) is the amount of works power (MWh) for a start-up at s;

  • \({\overline{B}}_s(y_t,u_t)\) is the amount of solid fuel burnt (GJ) during a start-up at s;

  • \({\overline{E}}_s(y_t,u_t)\) denotes engineering and imbalance costs (£) during a start-up at s.

The updated state \(y_{t+1}\) is given by:

$$\begin{aligned} g^{(2)}(u_t) = \left( S - \max \{s:\, u_{t,s} \ne 0\}\right) \cdot \Delta s, \end{aligned}$$

where \(\max \{ \emptyset \}=0\).
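A minimal sketch of this update, keeping the convention \(\max \emptyset = 0\) from above (the interface is illustrative):

```python
def update_offline_hours(u_t, delta_s):
    """State update g^(2): offline hours at the beginning of week t+1, given
    the production profile u_t of week t (delta_s hours per block); the
    convention max(emptyset) = 0 from the text is kept."""
    S = len(u_t)
    on_blocks = [s for s, p in enumerate(u_t) if p != 0]
    last_on = max(on_blocks) if on_blocks else 0
    return (S - last_on) * delta_s
```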

As a further cost factor, we account for the fuel transportation costs associated with each profile:

$$\begin{aligned} f^{\text{ tr }}(u_t) = C^{tr}\,H_7\, {\bar{u}}_t\, \Delta s, \end{aligned}$$
(5)

where the unit transportation cost (in £  per tonne of fuel) is given by the constant factor \(C^{tr}\).

Fig. 1 Typical structure of electricity forward prices. Each curve represents one week of EFA block power prices (in £/MWh, observed every 4 h), from Saturday, 3 am, until Friday, 11 pm. Five weeks of forward price data from the beginning of February to the first week of March, 2017

2.2 Underlying price processes

To model the underlying uncertainties, i.e., the stochastic price evolution of electricity, fuel and \(\text{ CO }_2\) allowances, we postulate a version of a classical two-factor model. Such models are commonly used for the modelling of commodity markets (cf. Clewlow and Strickland 2000; Ewald et al. 2018; Ribeiro and Hodges 2004; Farkas et al. 2017). More specifically, in our model the electricity price behaviour is governed by a long-term and a short-term factor, whereas fuel and \(\text{ CO }_2\) allowances prices evolve according to a one-factor model. In summary, we get a three-dimensional geometric Brownian motion model driven by four correlated one-dimensional Brownian components \(B^{e,\text{ sh }},B^{e,\text{ lo }},B^{f},B^{c}\). In particular, the dynamics of the underlying stochastic process F are described by the SDE

$$\begin{aligned} \left( \begin{array}{l} dF_{t,t}^{e} / F_{t,t}^{e} \\ dF_{t,t}^{f} / F_{t,t}^{f} \\ dF_{t,t}^{c} / F_{t,t}^{c} \end{array}\right) = \left( \begin{array}{llll} \sigma ^{\scriptscriptstyle {\mathrm{e,sh}}}_{\scriptscriptstyle {\mathrm{t}}}&{}\quad 0 &{}\quad 0 &{}\quad \sigma ^{\scriptscriptstyle {\mathrm{e,lo}}}_{\scriptscriptstyle {\mathrm{t}}}\\ 0 &{}\quad \sigma ^{\scriptscriptstyle {\mathrm{f}}}_{\scriptscriptstyle {\mathrm{t}}}&{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad \sigma ^{\scriptscriptstyle {\mathrm{c}}}_{\scriptscriptstyle {\mathrm{t}}}&{}\quad 0 \end{array}\right) \left( \begin{array}{l} dB_t^{e,\text {sh}} \\ dB_t^{f} \\ dB_t^{c} \\ dB_t^{e,\text {lo}} \end{array}\right) , \end{aligned}$$
(6)

where the superscripts sh and lo refer to short-term and long-term, respectively. Volatility is allowed to be time-dependent but deterministic. The double-index notation \(F_{t,t}\) expresses the fact that we model the spot price as a special case of the forward price. In particular, the forward price \(F_{0,t}\) (as observed in the market at time 0) enters the solution of (6) at time t. In this way, we account for the well-known seasonality (peak hours and off-peak hours) inherent in electricity prices. Figure 1 visualizes this typical effect. To avoid notational clutter, we henceforth write \(\xi _t\) with one index for the spot price as a shorthand for \(F_{t,t}\). Note that we are dealing with a continuous time stochastic model here. The index notation should not be confused with the discrete-time multiscale indices used in the valuation model; the context will always make clear what is meant.

Regarding the dependence structure between the underlying assets, we allow for a time dependent correlation matrix

$$\begin{aligned} \rho _t= \left( \begin{array}{llll} 1 &{}\quad \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{sh}},\mathrm{f}}}_\mathrm{t}&{}\quad \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{sh}},\mathrm{c}}}_\mathrm{t}&{}\quad \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{sh}},\mathrm{e}^{\mathrm{lo}}}}_\mathrm{t}\\ \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{sh}},\mathrm{f}}}_\mathrm{t}&{}\quad 1 &{}\quad \varrho ^{\scriptscriptstyle {\mathrm{f,c}}}_\mathrm{t}&{}\quad \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{lo}},\mathrm{f}}}_\mathrm{t}\\ \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{sh}},\mathrm{c}}}_\mathrm{t}&{}\quad \varrho ^{\scriptscriptstyle {\mathrm{f,c}}}_\mathrm{t}&{}\quad 1 &{}\quad \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{lo}},\mathrm{c}}}_\mathrm{t}\\ \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{sh}},\mathrm{e}^{\mathrm{lo}}}}_\mathrm{t}&{}\quad \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{lo}},\mathrm{f}}}_\mathrm{t}&{}\quad \varrho ^{\scriptscriptstyle {\mathrm{e}^{\mathrm{lo}},\mathrm{c}}}_\mathrm{t}&{}\quad 1 \end{array}\right) . \end{aligned}$$

Using the (lower triangular) matrix \(L_t\) resulting from a Cholesky decomposition of \(\rho _t\), we may replace the Brownian factors \([dB_t^{e,\text{ sh }}, dB_t^{f}, dB_t^{c}, dB_t^{e,\text{ lo }}]^\top \) by the matrix-vector product \(L_t \times [dW_t^{(1)}, dW_t^{(2)}, dW_t^{(3)}, dW_t^{(4)}]^\top \), such that the underlying prices are driven by the independent Wiener processes \(W^{(1)}, W^{(2)}, W^{(3)}, W^{(4)}\). Multiplying the volatility matrix in (6) with \(L_t\), we can write the model in the form

$$\begin{aligned} \left( \begin{array}{l} d\xi _t^{e} / \xi _t^{e} \\ d\xi _t^{f} / \xi _t^{f} \\ d\xi _t^{c} / \xi _t^{c} \end{array}\right) = \left( \begin{array}{llll} a_{11}(t) &{}\quad a_{12}(t) &{}\quad a_{13}(t) &{}\quad a_{14}(t) \\ a_{21}(t) &{}\quad a_{22}(t) &{}\quad 0 &{}\quad 0 \\ a_{31}(t) &{}\quad a_{32}(t) &{}\quad a_{33}(t) &{}\quad 0 \end{array}\right) \left( \begin{array}{l} dW_t^{(1)} \\ dW_t^{(2)} \\ dW_t^{(3)} \\ dW_t^{(4)} \end{array}\right) . \end{aligned}$$
(7)

The non-zero components of the above coefficient matrix involve nasty terms with combinations of the various correlations. The precise parameters can be found in the “Appendix”.

The solution of SDEs of such a form as in (7) is well known to be of the geometric Brownian motion type (e.g., see (Oksendal 2000, p. 62)). In particular, the random vector \(\xi _t = [\xi _t^{e},\xi _t^{f},\xi _t^{c}]\) follows a three-dimensional log-normal distribution. The corresponding parameters can again be found in the “Appendix”.
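As a simple illustration of the dynamics (6)–(7), the following Python sketch simulates the three price processes by an Euler scheme, assuming (for simplicity) constant volatilities and correlations and omitting the seasonal adjustment through the initial forward curve:

```python
import numpy as np

def simulate_prices(xi0, sigma, rho, n_steps, dt, seed=0):
    """Euler simulation of the correlated price dynamics (6)-(7), sketched under
    the simplifying assumptions of constant volatilities/correlations; the
    seasonal shaping by the forward curve F_{0,t} (cf. the text) is omitted.
    xi0   : initial prices (electricity, fuel, CO2)
    sigma : volatilities (sigma^{e,sh}, sigma^f, sigma^c, sigma^{e,lo})
    rho   : 4x4 correlation matrix of the Brownian factors."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(rho)                 # decorrelate the four factors
    vol = np.array([[sigma[0], 0.0, 0.0, sigma[3]],
                    [0.0, sigma[1], 0.0, 0.0],
                    [0.0, 0.0, sigma[2], 0.0]])
    A = vol @ L                                 # coefficient matrix of (7)
    xi = np.empty((n_steps + 1, 3))
    xi[0] = xi0
    for k in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal(4)
        xi[k + 1] = xi[k] * (1.0 + A @ dW)      # Euler step of dxi/xi = A dW
    return xi
```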

2.2.1 Discretization and the associated bridge process

For our numerical solution framework, which is discussed in detail in Sect. 4, the process \(\xi \) will first be discretized in all decision stages. Then, an approximate solution of the problem will be computed by stochastic dynamic programming with a backward recursion. In each decision stage, the algorithm relies on the expected profit/loss associated with any decision to be made for the upcoming observation blocks of the following week. To compute such values, we will exploit the structure of the valuation model, the uncertainty model and the backwards recursion. In particular, we are able to compute the expected profits by an analytical formula.

Let us start with the discretization step. To account for the two different time scales explained in Sect. 2.1 above, namely the weekly decision scale and the much finer intra-week observation scale, we use the notation \(\xi _{t,s}\), where \(t=0,\ldots ,T\,\) runs in weeks and \(s=0,\ldots ,S\) in hour-blocks of equal size \(\Delta s\) (such that \(t+S\cdot \Delta s = t+1)\). Considering the fact that intra-week data of fuel and \(\text{ CO }_2\) allowances prices typically show – if available at all – a rather stable evolution with low fluctuations, we assume those prices to be constant from Monday to Sunday. In contrast, the electricity price dimension is truly stochastic even on a fine time scale. Looking at the expected profit function (2) at some block s during a week t, it turns out that (on the basis of our assumptions) the problem boils down to computing the expected value of the electricity price \(\xi _{t,s}^e\,\) given both the initial value \(\xi _{t,0}^e\) and the final value \(\xi _{t,S}^e\) of week t. This is due to the fact that the function \(h_t(\cdot )\) is linear in \(\xi _{t,s}^e\). Mathematically speaking, we are left with the computation of the conditional expected value at time \(t+s\cdot \Delta s\) of the stochastic bridge process linking the values \(\xi _{t,0}^e\) and \(\xi _{t,S}^e\), for all \(s=1,\ldots ,S\). All other parts can be computed in a straightforward way.

The one-dimensional process \(\xi _{t,s}^e\) follows a univariate lognormal distribution. Thus, its transition density \(\delta \) is available in analytical terms and the transition density of the associated bridge process can be computed explicitly. Let an initial value \(\eta _1\) of the process at the beginning of some week t and a final value \(\eta _2\) at the end of that week be given (i.e., \(\xi _{t,0}^{e}=\eta _1\) and \(\xi _{t,S}^{e}=\eta _2\)). Then, the bridge process transition density, at time \(s \in [0,S]\), is given by

$$\begin{aligned} \delta \left( x,t+s\cdot \Delta s \vert \eta _1,t,\eta _2,t+1\right)&= \frac{\delta \left( \eta _2, t+1 \vert x,t+s\cdot \Delta s\right) \cdot \delta (x,t+s\cdot \Delta s \vert \eta _1,t)}{\delta (\eta _2,t+1\vert \eta _1,t)} \\&= \frac{1}{\sqrt{2\pi {\hat{\sigma }}_{s\vert t}^2}x}\exp \left( -\frac{\left( \log (x)-\log (\eta _1)-{\hat{\mu }}_{s\vert t} \right) ^2}{2{\hat{\sigma }}_{s\vert t}^2} \right) , \end{aligned}$$

where

$$\begin{aligned} {\hat{\mu }}_{s\vert t}&= \frac{\int _{t}^{t+s\cdot \Delta s}\sigma ^2(u) \;du}{\int _{t}^{t+1}\sigma ^2(u) \;du} \log \left( \left( \frac{F_{0,t}^{e}}{F_{0,t+1}^{e}}\right) \cdot \frac{\eta _2}{\eta _1} \right) ,\\ {\hat{\sigma }}_{s\vert t}^2&= \frac{\left( \int _{t}^{t+s\cdot \Delta s}\sigma ^2(u) \;du \right) \left( \int _{t+s\cdot \Delta s}^{t+1}\sigma ^2(u) \;du \right) }{\int _{t}^{t+1}\sigma ^2(u) \;du}. \end{aligned}$$

In particular, we get for the conditional expectation

$$\begin{aligned} {\mathbb {E}}\left[ \xi _{t,s}^{e} \big \vert \xi _{t,0}^{e}=\eta _1, \xi _{t,S}^{e}=\eta _2\right] = \eta _1 \cdot \left( \frac{F_{0,t+s\cdot \Delta s}^{e}}{F_{0,t}^{e}}\right) \cdot \exp \left( {{\hat{\mu }}_{s\vert t}}\right) . \end{aligned}$$
(8)

Let us emphasize that the above analytical tractability is not due to our restriction of the intra-week stochasticity to one dimension (see Glanzer and Pflug 2019 for a treatment of the more general multi-dimensional case). This restriction is purely motivated by data.
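The conditional expectation (8) is cheap to evaluate numerically. The following Python sketch computes it for a generic (deterministic) volatility function and forward curve; both callables are assumptions of the sketch, with time measured in weeks:

```python
import numpy as np
from scipy.integrate import quad

def bridge_conditional_expectation(eta1, eta2, t, s, S, sigma2, F0e):
    """Conditional expectation (8) of the intra-week electricity price bridge,
    E[xi^e_{t,s} | xi^e_{t,0} = eta1, xi^e_{t,S} = eta2], for block s of week t.
    sigma2 : callable u -> sigma^2(u), deterministic squared volatility
    F0e    : callable tau -> F^e_{0,tau}, the initial electricity forward curve
    (both interfaces are assumptions of this sketch)."""
    tau = t + s / S                               # time point t + s * Delta_s
    num, _ = quad(sigma2, t, tau)                 # int_t^{t+s*Ds} sigma^2(u) du
    den, _ = quad(sigma2, t, t + 1)               # int_t^{t+1}    sigma^2(u) du
    mu_hat = (num / den) * np.log((F0e(t) / F0e(t + 1)) * eta2 / eta1)
    return eta1 * (F0e(tau) / F0e(t)) * np.exp(mu_hat)
```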

Figure 2 illustrates a set of sample paths from the bridge process, which starts and ends in the forward prices corresponding to two consecutive weeks. The intermediate forward prices are shown for comparison of the seasonal behaviour.

Fig. 2
figure 2

Electricity forward price data (solid line) versus 5 simulated trajectories (dashed lines). Simulation based on the bridge process dynamics

3 Ambiguity for dynamic stochastic optimization models

Problem (1) can be solved backwards in time by means of classical stochastic dynamic programming theory, on the basis of the following recursion scheme:

$$\begin{aligned} \begin{aligned} V_t(z_t, \xi _t)&= \underset{ u_t \in {\mathcal {U}}_t(z_t)}{\text {max}} h_t(z_t, u_t, \xi _t) + \beta \, {\mathbb {E}}[ V_{t+1}(z_{t+1},\xi _{t+1}) \vert \xi _t ] \\&\text {s.t. } z_{t+1}=g_t(z_t, u_t,\xi _{t+1}), \end{aligned} \end{aligned}$$
(9)

where \(V_T (z_{T}, \xi _{T}) \equiv 0\), \(z_0\) and \(\xi _0\) are given.

Let \(\xi \) be a Markovian process defined on a finite state space \(\Xi _0 \times \dots \times \Xi _{T}\), where each \(\Xi _t\) is equipped with a distance \(d_t\). Let the cardinality of \(\Xi _t\) be \(N_t\) with \(N_0=1\) (typically nondecreasing in t). Then the transition matrices \(P_t\), \(t=0, \dots , T-1\), are of dimension \(N_t \times N_{t+1}\), where the \(i\)-th row of the matrix \(P_t\) is denoted by \(p_{t}(i)\, \), for all \( i=1, \dots , N_t\). Notice that each row \(p_{t}(i)\) describes a probability measure on the metric space \((\Xi _{t+1},d_{t+1})\).

Let \(\xi _{t}^i \in \Xi _t\,\) be given. Then, the conditional probability to transition to \(\xi _{t+1}^j \in \Xi _{t+1}\) is given by the jth element of the row vector \(p_{t}(i)\), denoted as \(p_{t}(i,j)\) for \(j=1,\dots , N_{t+1}\) and \(i=1, \dots , N_t\). In this discrete case, the objective of the recursion in (9) can be written as

$$\begin{aligned} \begin{aligned} V_t(z_t, \xi _t^i ) = \underset{ u_t \in \mathcal {U}_t(z_t)}{\max }&h_t(z_t, u_t, \xi ^i_t) + \beta \sum _{j=1}^{N_{t+1}} p_t(i,j)\cdot V_{t+1}\left( g_t(z_t, u_t,\xi _{t+1}^j), \xi _{t+1}^j\right) . \end{aligned} \end{aligned}$$
(10)
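For completeness, a generic (non-robust) implementation of the recursion (10) on a lattice may be sketched as follows; the signatures of the profit and transition functions as well as the nearest-grid-point treatment of the state variable are simplifying assumptions (cf. also Sect. 4.3.2):

```python
import numpy as np

def backward_recursion(P, Xi, Z, U, h, g, beta):
    """Backward dynamic programming recursion (10) on a scenario lattice.
    P[t]  : transition matrix of shape (N_t, N_{t+1})
    Xi[t] : list of node values at stage t (t = 0, ..., T)
    Z     : finite grid of reachable states, U : finite set of decisions
    h, g  : stagewise profit h(t, z, u, xi) and state transition g(t, z, u, xi);
            these signatures are assumptions of this sketch."""
    T = len(P)
    V = [np.zeros((len(Z), len(Xi[t]))) for t in range(T + 1)]   # V_T = 0
    policy = [np.empty((len(Z), len(Xi[t])), dtype=int) for t in range(T)]

    def nearest(z):                      # index of the closest grid state
        return int(np.argmin([np.linalg.norm(np.asarray(zk) - np.asarray(z), 1)
                              for zk in Z]))

    for t in reversed(range(T)):
        for iz, z in enumerate(Z):
            for i, xi in enumerate(Xi[t]):
                values = []
                for u in U:
                    val = h(t, z, u, xi)
                    for j, xi_next in enumerate(Xi[t + 1]):
                        if P[t][i, j] > 0:
                            val += beta * P[t][i, j] * V[t + 1][nearest(g(t, z, u, xi_next)), j]
                    values.append(val)
                k = int(np.argmax(values))
                V[t][iz, i], policy[t][iz, i] = values[k], k
    return V, policy
```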

3.1 A new concept: uniform Wasserstein distances

In order to consider model ambiguity, we look for alternative transition matrices \(Q_t\), which are close to a given matrix \(P_t\). Let us first recall the general definition of the Wasserstein distance for discrete models.

Definition 3.1

Let \(P = \sum _{i=1}^n P_i\,\delta _{\xi ^i}\) and \(Q = \sum _{j=1}^{{\tilde{n}}} Q_j\,\delta _{{\tilde{\xi }}^j}\) be two discrete measures sitting on the points \(\{\xi ^1,\dots , \xi ^n \}\subset \Xi \) and \(\{{\tilde{\xi }}^1,\dots , {\tilde{\xi }}^{{\tilde{n}}} \}\subset {\tilde{\Xi }} \), respectively. Then, the Wasserstein distance between P and Q is defined as

$$\begin{aligned} {{\,\mathrm{{\mathfrak {W}}}\,}}(P,Q) := \underset{\pi _{ij}}{\min } \sum _{i,\, j} \pi _{ij}\cdot D_{ij}, \end{aligned}$$

where the minimum is taken over all probability measures \(\pi = \sum _{i,j} \pi _{i,j} \cdot \delta _{\xi ^i, {\tilde{\xi }}^j}\) on \( \Xi \times {\tilde{\Xi }}\) with marginals P and Q, and where \(D_{ij}\) denotes the distance between the respective atoms \(\xi ^i\) and \({\tilde{\xi }}^j\).
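On discrete measures, \({{\,\mathrm{{\mathfrak {W}}}\,}}(P,Q)\) is the optimal value of a transportation linear program. A minimal Python sketch (using scipy's linear programming routine) reads:

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_discrete(P, Q, D):
    """Wasserstein distance between two discrete measures (Definition 3.1),
    solved as a transportation linear program.
    P : probabilities of the n atoms of the first measure
    Q : probabilities of the n~ atoms of the second measure
    D : (n x n~) matrix of pairwise distances between the atoms."""
    n, m = D.shape
    c = D.reshape(-1)                       # objective: sum_ij pi_ij * D_ij
    A_eq, b_eq = [], []
    for i in range(n):                      # row marginals: sum_j pi_ij = P_i
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_eq.append(row); b_eq.append(P[i])
    for j in range(m):                      # column marginals: sum_i pi_ij = Q_j
        col = np.zeros(n * m)
        col[j::m] = 1.0
        A_eq.append(col); b_eq.append(Q[j])
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, None))
    return res.fun
```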

We will now measure the closeness between (discrete) multistage models \({\mathbb {P}}\) and \({\mathbb {Q}}\) by a uniform Wasserstein distance concept. The rows of alternative matrices \(Q_t\) are denoted by \(q_{t}(i), \, i=1,\dots , N_t\). The measure \(q_t(i)\) is supported on at most \(N_{t+1}\) points in \(\Xi _{t+1}\) and is such that \(\mathrm{supp}(q_t(i)) \subseteq \mathrm{supp}(p_t(i))\). Then, we define the distance

$$\begin{aligned} {{\,\mathrm{{\mathfrak {W}}^{\infty }}\,}}({\mathbb {P}}, {\mathbb {Q}}) = \max _{0\le t \le T-1} \max _{1\le i \le N_t} {{\,\mathrm{{\mathfrak {W}}}\,}}(p_{t}(i),q_{t}(i)) , \end{aligned}$$
(11)

which can be interpreted as a uniform version of scenariowise Wasserstein distances. An \(\varepsilon \) ball around \({\mathbb {P}}\) is characterized by the fact that all members \({\mathbb {Q}}\) satisfy

$$\begin{aligned} {{\,\mathrm{{\mathfrak {W}}}\,}}(p_{t}(i),q_{t}(i))\le \varepsilon , \end{aligned}$$

for all \(t=0,\ldots ,T-1\,\) and all \(i=1,\ldots , N_t\).
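Building on the previous sketch, the uniform distance (11) is simply the maximum of such transportation problems over all stages and nodes; here wdist stands for any routine computing the Wasserstein distance between two transition rows, e.g. the one sketched above:

```python
def uniform_wasserstein(P_list, Q_list, D_list, wdist):
    """Uniform Wasserstein distance (11): the maximum, over all stages t and
    all nodes i, of the Wasserstein distance between the conditional rows
    p_t(i) and q_t(i). D_list[t] holds the pairwise distances between the
    stage-(t+1) nodes; wdist(p, q, D) computes the distance between two rows."""
    return max(wdist(P_t[i], Q_t[i], D_t)
               for P_t, Q_t, D_t in zip(P_list, Q_list, D_list)
               for i in range(P_t.shape[0]))
```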

When introducing ambiguity into the model, we would like to solve problem (1) wherein the objective function is replaced with:

$$\begin{aligned} \underset{\{u_t\}_{t=0}^{T-1}}{\text {max}} \min _{\mathbb {Q}\; : {{\,\mathrm{{\mathfrak {W}}^{\infty }}\,}}(\mathbb {P},\mathbb {Q})\le \, \varepsilon } {\mathbb {E}}_{\mathbb {Q}}\left[ \displaystyle \sum _{t=0}^{T-1}\beta _t h_t(z_t, u_t, \xi _t) \right] , \end{aligned}$$

where \({\mathbb {E}}_{\mathbb {Q}}\) denotes the expectation with respect to the measure \(\mathbb {Q}\).

Our choice of the multistage distance makes it possible to keep the decomposed structure of the backward recursion, which reads:

$$\begin{aligned} \begin{aligned} V_t(z_t, \xi _t^i )&= \underset{ u_t }{\max }\underset{{{\,\mathrm{{\mathfrak {W}}}\,}}(p_t(i), q_t(i)) \le \,\varepsilon }{\min } h_t(z_t, u_t, \xi ^i_t) + \beta \, \sum _{j=1}^{N_{t+1}} q_t(i,j)\;V_{t+1}(z_{t+1},\xi ^j_{t+1} ) \\ \text {s.t. } z_{t+1}&=g_t(z_t, u_t,\xi _{t+1}) . \end{aligned} \end{aligned}$$
(12)

Hence, the ambiguity approach just extends the max model to a maximin model. The ambiguous model can also be seen as a risk-averse model, in contrast to the basic risk-neutral model. If the distance is not of this decomposable form, then the backward recursion does not decompose scenariowise and one has to find all optimal decisions in one very large algorithm that decomposes stagewise but not scenariowise. However, decomposability is the key feature of successful methods for dynamic decision problems. Hence, our concept is strongly motivated by its favourable computational properties. Moreover, as we will discuss now, under a mild regularity condition (which always holds for discrete models, the basis of the whole computational framework) it can still be shown that optimal solutions are close if the underlying models are close w.r.t. the uniform Wasserstein distance.

The general distance concept for stochastic processes (including their discrete representation in the form of scenario trees) is the nested distance, introduced in Pflug (2010), Pflug and Pichler (2012) as a multistage generalization of the classical Wasserstein distance. In our case we have a Markov process which can be seen as a lattice process. Notice that a lattice can be interpreted as a compressed form of a tree. It can always be “unfolded” to a tree representing the same filtration structure, by splitting each node according to the number of incoming arcs. Thus, all results applying for trees hold for lattices as well. The uniform Wasserstein distance introduced above is given by the maximum Wasserstein distance over all conditional transitions. The subsequent stability result holds.

Proposition 3.1

Let \({\mathbb {P}}\) and \(\tilde{{\mathbb {P}}}\) be two discrete Markovian probability models defined on the filtered space \((\Omega , \sigma (\xi ))\). Assume the following Lipschitz condition regarding \({\mathbb {P}}\,\) to hold for all \(t=0, \dots , T-1\) and all values \(\xi _t^{i},\, \xi _t^{j}\), where \(i,j = 1,\dots , N_t\):

$$\begin{aligned} {{\,\mathrm{{\mathfrak {W}}}\,}}\left( P_t(\cdot \vert \xi _{t}^{i}), P_t(\cdot \vert \xi _{t}^{j})\right) \le \; K_t \cdot \Vert \xi _{t}^{i} - \xi _{t}^{j} \Vert , \end{aligned}$$

for \(K_t\in {\mathbb {R}}\). Consider the generic multistage stochastic optimization problem

$$\begin{aligned} v({\mathbb {Q}}) := \inf {\mathbb {E}}^{{\mathbb {Q}}}[c(x, \xi )], \end{aligned}$$

where the (nonanticipative) decisions x lie in some convex set and where the function \(c(\cdot ,\cdot )\) is convex in x and 1-Lipschitz w.r.t. \(\xi \). Then the relation

$$\begin{aligned} \left| v({\mathbb {P}})-v(\tilde{{\mathbb {P}}})\right| \le \; {{\,\mathrm{\mathsf{{d}\mathsf{{I}}}}\,}}({\mathbb {P}}, \tilde{{\mathbb {P}}}) \le \; K\cdot {{\,\mathrm{{\mathfrak {W}}^{\infty }}\,}}({\mathbb {P}}, \tilde{{\mathbb {P}}})\, \end{aligned}$$

holds, where \({{\,\mathrm{\mathsf{{d}\mathsf{{I}}}}\,}}(\cdot ,\cdot )\) denotes the nested distance, and where the constant K is given by

$$\begin{aligned} K:= \sum _{t=0}^{T-1} \prod _{j=t+1}^{T} (1+K_j). \end{aligned}$$

Proof

The first inequality is a well-known result from (Pflug and Pichler 2012, Th. 11). The statement then follows readily from (Pflug and Pichler 2014, Lem. 4.27), by using \({{\,\mathrm{{\mathfrak {W}}^{\infty }}\,}}({\mathbb {P}}, \tilde{{\mathbb {P}}})\) as a uniform bound for \({{\,\mathrm{{\mathfrak {W}}}\,}}(P_t(\cdot \vert \xi _t), {\tilde{P}}_t(\cdot \vert \xi _t))\), over all t. \(\square \)

Remark

Notice that for discrete Markov chain models the assumption in Proposition 3.1 always holds, as one can simply choose the ergodic coefficient

$$\begin{aligned} K_t = \max _{\xi _t^{i}\ne \xi _t^{j}} \frac{{{\,\mathrm{{\mathfrak {W}}}\,}}\left( P_t(\cdot \vert \xi _{t}^{i}), P_t(\cdot \vert \xi _{t}^{j})\right) }{\Vert \xi _{t}^{i} - \xi _{t}^{j} \Vert }. \end{aligned}$$
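The ergodic coefficient can be computed directly from the transition matrices; the following sketch delegates the Wasserstein computation between two rows to a routine wdist such as the one in Sect. 3.1 (the \(L_1\) norm between nodes is an assumption of the sketch):

```python
import numpy as np

def ergodic_coefficient(P_t, Xi_t, D_next, wdist):
    """Lipschitz constant K_t of the Remark: the largest ratio between the
    Wasserstein distance of two conditional transition rows and the distance
    of the corresponding stage-t nodes; wdist(p, q, D) computes the Wasserstein
    distance between two rows given the pairwise node distances D_next."""
    K = 0.0
    for i in range(P_t.shape[0]):
        for j in range(P_t.shape[0]):
            if i != j:
                ratio = wdist(P_t[i], P_t[j], D_next) / np.linalg.norm(
                    np.asarray(Xi_t[i]) - np.asarray(Xi_t[j]), 1)
                K = max(K, ratio)
    return K
```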

Remark

In the above construction, all models contained in the ambiguity set share exactly the same tree structure and node values. Thus, one might conjecture at first glance that it would be possible to bound the nested distance by a simple sum of the stagewise maxima of conditional Wasserstein distances, weighted by the number of subtrees at the respective stage. A simple example in the “Appendix” shows that such a construction does not work in general.

3.2 State-dependent distances

In practice, the worst-case model for an upcoming period may often depend on the current state. In the model considered in the present paper (cf. Sect. 2.1), we decide only at the beginning of each stage about the procurement of additional \(\text{ CO }_2\) allowances. In particular, we do not buy any if the current stock is sufficient for whatever we may do during the subsequent week, regardless of the market price. If we neglect this when searching for the optimal distributionally robust production profile, the worst case may reflect a variation in the \(\text{ CO }_2\) allowances price dimension which in fact has no impact on our optimal decision. Thus, we modify the distance on the underlying three-dimensional space by projecting onto the electricity price and fuel price dimensions only, whenever our stock of \(\text{ CO }_2\) allowances is sufficient. Otherwise, we keep the usual \(L_1\) norm. More formally, we define

$$\begin{aligned}&D\left( [\xi _t^{e}, \xi _t^{f}, \xi _t^{c}], [{\tilde{\xi }}_t^{e}, {\tilde{\xi }}_t^{f}, {\tilde{\xi }}_t^{c}] \right) \nonumber \\&\quad := {\left\{ \begin{array}{ll} w^e\vert \xi _t^{e} - {\tilde{\xi }}_t^{e}\vert + w^f\vert \xi _t^{f} - {\tilde{\xi }}_t^{f}\vert ~ &{}\text{ if } \alpha _t=0\\ w^e\vert \xi _t^{e} - {\tilde{\xi }}_t^{e}\vert + w^f\vert \xi _t^{f} - {\tilde{\xi }}_t^{f}\vert + w^c\vert \xi _t^{c} - {\tilde{\xi }}_t^{c}\vert ~ &{}\text{ if } \alpha _t>0 , \end{array}\right. } \end{aligned}$$
(13)

with \(\alpha _t\) defined in Sect. 2.1 and positive weights \(w^e, w^f, w^c\). Notice that D is not a distance, as it does not separate points. However, this fact does not entail any restrictions for our considerations.
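A direct implementation of (13) is straightforward; the following sketch treats the three price components as a tuple and takes the weights as arguments:

```python
def state_dependent_distance(xi, xi_tilde, alpha_t, w_e=1.0, w_f=1.0, w_c=1.0):
    """State-dependent (pseudo-)distance of Eq. (13) between two price nodes
    xi = (electricity, fuel, CO2) and xi_tilde; the CO2 dimension is ignored
    whenever no additional allowances need to be bought (alpha_t == 0)."""
    d = w_e * abs(xi[0] - xi_tilde[0]) + w_f * abs(xi[1] - xi_tilde[1])
    if alpha_t > 0:
        d += w_c * abs(xi[2] - xi_tilde[2])
    return d
```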

When basing the uncertainty model on historical observations, there is a strong statistical argument for using balls w.r.t. the Wasserstein distance as ambiguity sets (cf. Esfahani and Kuhn 2018). In particular, large deviations results are available (see Bolley et al. 2007; Fournier and Guillin 2015 for the case of the Wasserstein distance and Glanzer et al. (2019) for the case of the nested distance) which provide probabilistic confidence bounds for the true model being contained in the ambiguity set around the (smoothed) empirical measure. Observe that such results are not invalidated by the state-dependency that we introduce: it is evident that a given confidence bound is directly inherited if one neglects some dimension. Notice however that a general state-dependent weighting of the dimensions would require a more careful treatment.

4 A case study

In the following, we test the framework elaborated in Sects. 2.1 and 3 for a specific power plant. For the present application, each week t is subdivided into \(S = 42\) blocks of 4 hours each. We solve the problem one quarter ahead; thus we take the horizon to be \(T=13\) weeks.

In this case, the control variables are vectors of dimension \(S=42\). The set of production profiles used for this case study consists of 10 different production schedules. This set is denoted by \({\mathcal {U}} = \{u^{(i)}\}_{i=1}^{10} \) (see Fig. 3) and remains the same for every stage.

Fig. 3 All different profiles in \({\mathcal {U}}\). Production is given in MW

Fig. 4 If we choose profile \(u^{(10)}\), the procurement strategy A to follow is illustrated for different prices \(\xi _{t,0}^{c}\) and all possible remaining allowances \(x_t\) in the partition. The horizontal lines indicate the lower and upper price bounds \({\underline{b}}\) and \({\overline{b}}\), respectively

The discrete evolution of prices is given as a lattice process \((\xi )\) defined on \(\Xi _0 \times \dots \times \Xi _T\), where each space \(\Xi _t\) has \(N_t\) elements corresponding to the number of nodes, i.e., \(\Xi _t = \{\xi _t^1, \dots , \xi _t^{N_t}\}\), and each node \(\xi _t^i = (\xi _t ^{e,i}, \xi _t ^{f,i}, \xi _t ^{c,i})\in {\mathbb {R}}^3 \) for all \(i=1, \dots , N_t\) and all \(t = 0, \dots , T\). As explained in Sect. 3, \(P_t\) denotes the probability transition matrix from stage t to stage \(t+1\), of dimension \(N_t \times N_{t+1}\). The lattice construction is described in Sect. 4.1.1.

The profit function \(h_t\) at stage t is defined by the expected profit during the upcoming week (see (2)). Profits at every block s in the week are quantified by the functions \(f_s\), for \(s=0,\dots , S-1\). Costs of buying additional allowances and transportation costs are quantified only at the beginning of each week, while start-up costs need to be assigned at each block s. We proceed to the description of the state variables. The costs of buying new allowances depend on the strategy A defined in (3). For its computation we consider \([{\underline{b}}, {\overline{b}}] = [4.4, \, 9.6]\), where the latter values were obtained by applying a simple quantile rule to the available data set. Moreover, we set \(C_\alpha = 2\).

For the partition of the amount of available \(\text{ CO }_2\) allowances \(x_t\), we consider different possible values from 0 to \(10^5\) tonnes of carbon, and we also take into account the allowances needed for each profile. All in all, the partition of state \(x_t\) has 16 different elements.

For every state in the partition, an example of the procurement strategy is shown in Fig. 4 for different prices. Note that we illustrate this example for the case of full production, i.e., \(u^{(10)}\). The strategy when choosing a different profile is similar; the only change is that we do not need to buy as many allowances as with \(u^{(10)}\).

The second state variable \(y_t\) describes the number of hours the power plant has been offline since it was last online. The costs associated with restarting production depend on \(y_t\) and the chosen profile \(u_t\). The cost function is a step function given in Table 2.

Table 2 Classification of offline hours and associated start-up costs

Once a profile is chosen for a week, initial start-up costs are incurred at the first block in which the profile is non-zero. For these initial start-up costs we consider the hours the power plant was offline before week t starts (i.e., \(y_t\)) in addition to the hours the chosen profile is off before it starts producing. For the remaining blocks the costs depend only on the profile. Regarding the notation of the elements in \(f^{\text{ start }}\) (see (4)), \({\overline{E}}_s\) is the sum of the last two columns of Table 2. Figure 5 illustrates the initial start-up costs for a profile that is on at the beginning of the week, for all possible values of \(y_t\).

Given that we have a finite set of profiles, we can calculate the value of \(y_t\) for each profile. The partition of \(y_t\) consists of these values together with the class boundaries of Table 2.

Fig. 5 Initial start-up costs (\(\pounds \)) for all possible initial states \(y_t\in \{0,1, \dots, 168\}\), when the profile is on in the beginning of the week. The costs are calculated for a specific node with electricity price 40 €/MWh and fuel price 90 $/tonne(fuel)

Regarding the transportation cost function in (5), we assume \(C^{tr} = 40\) (£/tonne(fuel)).

Finally, the values of the constants in Table 1 are \(H_2 = 0.78\) (£/$), \(H_3 = 0.9\) (£/€), \(H_5 = 0.0975\) (MWh/GJ), \(H_7= 0.45\) (tonne(fuel)/MWh) and \(J = 2.31\) (tonne(carbon)/tonne(fuel)).

4.1 The solution algorithm

We numerically solve the power plant valuation problem by a stochastic dynamic programming algorithm. A lattice structure is used as a discrete representation of the uncertainty model in all decision stages.

4.1.1 On the lattice construction

The state-of-the-art approach for the construction of scenario lattices is based on optimal quantization techniques (cf. Bally and Pagès 2003; Löhndorf and Wozabal 2018). For a given number of discretization points, such methods select the optimal locations as well as the associated probabilities in such a way that the Wasserstein distance (or some other distance concept for probability measures) with respect to a (continuous) target distribution is minimized. For the present study, we have implemented a stochastic approximation algorithm for the quantization task, following (Pflug and Pichler 2014, Algorithm 4.5). Referring to the latter algorithm, we first applied the iteration step (ii) in order to find the atoms of all marginal distributions, separately for each stage. We then formed a lattice out of these sets of points by fixing the structure of allowed transitions, and applied step (iv) to determine all conditional transition probabilities. Eventually, this also determines the absolute probabilities of each node in the lattice. The Wasserstein distance of order two has been used as the target measure for the minimization. We use a ternary lattice, i.e., each node has (at most) three successors with a positive transition probability.
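The following is a much simplified, one-dimensional sketch of the two ingredients (placement of the atoms by stochastic approximation and estimation of the conditional transition probabilities); the actual implementation follows (Pflug and Pichler 2014, Algorithm 4.5) in three dimensions, so the sampler interfaces and the Monte Carlo estimation step below are illustrative assumptions:

```python
import numpy as np

def quantize_marginal(sampler, n_atoms, n_iter=200_000, seed=0):
    """Simplified 1-D sketch of the stochastic approximation step used to place
    the lattice nodes: atoms are pulled towards random samples of the target
    marginal with a decaying step size. sampler(rng) draws one sample."""
    rng = np.random.default_rng(seed)
    atoms = np.sort([sampler(rng) for _ in range(n_atoms)])
    for k in range(1, n_iter + 1):
        x = sampler(rng)
        i = np.argmin(np.abs(atoms - x))          # nearest atom
        atoms[i] += (x - atoms[i]) / k            # Robbins-Monro step
    return np.sort(atoms)

def estimate_transitions(cond_sampler, atoms_next, allowed, n_samples=50_000, seed=1):
    """Monte Carlo estimate of the conditional transition probabilities of one
    lattice node: samples from the conditional distribution are assigned to the
    nearest admissible successor node (allowed encodes the ternary structure)."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(len(atoms_next))
    for _ in range(n_samples):
        x = cond_sampler(rng)
        j = min(allowed, key=lambda a: abs(atoms_next[a] - x))
        counts[j] += 1
    return counts / counts.sum()
```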

As for the notation, recall that we distinguish between stages (weeks) and intra-week blocks. Decisions are taken only at each stage, but the profit is calculated taking into account the random evolution of the prices during the entire week. The discretized process in each stage t and node i is denoted as \(\xi _t^i = (\xi _t^{e,i} ,\xi _t^{f,i}, \xi _t^{c,i})\), and the values of the process within week t starting in node i are denoted by \(\xi _{t,s}^i = (\xi _{t,s}^{e,i} ,\xi _{t,s}^{f,i}, \xi _{t,s}^{c,i})\), for \(s=0,\dots, S\). At \(s=0\), \(\xi _{t,0}^i = \xi _t ^i\) takes the value of node i; at \(s=S\), \(\xi _{t,S}^i = \xi _{t+1}^j\) takes the value of node j in the next stage with probability \(p_t(i,j)\), for \(j=1, \dots , N_{t+1}\). Note that if the lattice structure does not link two nodes in consecutive stages, then the probability of such a transition is zero.

4.2 Computing the expected profit between two decisions

The valuation of the power plant is obtained by solving (10) backwards in time from \(t=T\) to \(t=0\). In this section, we specify how to solve (10) and its robust version (12) at any stage t. We specifically concentrate on the calculation of the expected profit within each week given the current node and a successor node.

As discussed in Sect. 2.2.1, electricity prices within weeks will be modeled by the bridge process that we described. Fuel and \(\text{ CO }_2\) allowances prices are assumed to remain constant between stages.

We start now with the computation of the expected profit, as defined in (2). In classical stochastic dynamic programming problems, \(h_t\) exclusively depends on the values observed at time t. In contrast, in Sect. 2.1 we instantiated (1) in such a way that the function \(h_t\) is defined as an expected value of the random profits within week t. Hence, given the values of node i at stage t (at block \(s=0\)), as well as initial states \((x_t, y_t)\), the weekly profit \(h_t\) is calculated as follows

$$\begin{aligned}&h_t\big ([x_t, y_t], u_t, \xi _{t,0}^i\big )\\ {}&\quad = {\mathbb {E}}\left[ \sum _{s=0}^{S-1} f_s(x_t, y_t, u_t,\xi _{t,s}^i) \biggr | \xi _t^i = \xi _{t,0}^i \right] \\&\quad = \sum _{j=1}^{N_{t+1}} \left( \sum _{s=0}^{S-1} {\mathbb {E}}\left[ f_s(x_t, y_t, u_t,\xi _{t,s}^i) \biggr | \xi _{t}^i = \xi _{t,0}^i,\, \xi _{t+1}^j = \xi _{t+1,0}^j \right] \right) \cdot p_t(i,j). \end{aligned}$$

We define

$$\begin{aligned} h_t^j(x_t, y_t, u_t, \xi _{t,0}^i) =\sum _{s=0}^{S-1}{\mathbb {E}} \left[ f_s(x_t, y_t, u_t,\xi _{t,s}^i) \biggr | \xi _{t}^i = \xi _{t,0}^i,\, \xi _{t+1}^j = \xi _{t+1,0}^j \right] , \end{aligned}$$

for all \(j=1, \dots , N_{t+1}\). Then, \(h_t([x_t, y_t], u_t, \xi _{t,0}^i) = \sum _{j=1}^{N_{t+1}} h_t^j(x_t, y_t, u_t, \xi _{t,0}^i) \cdot p_t(i,j) \). We now compute \(h_t^j\) as follows

$$\begin{aligned} h_t^j (x_t, y_t, u_t, \xi _{t,0}^i)&=\sum _{s=0}^{S-1} (u_{t,s}\, \Delta s - {\overline{W}}_s(y_t, u_t))\cdot {\mathbb {E}} [\xi _{t,s}^{e,i} | \xi _{t}^e= \xi _{t,0}^{e,i},\, \xi _{t+1}^{e} = \xi _{t+1,0}^{e,j} ] \\&\quad - \xi _{t,0}^{f,i}\sum _{s=0}^{S-1} (H_1 u_{t,s} \Delta s +H_2\, H_6\, {\overline{B}}_s(y_t, u_t)) \\&\quad - A(x_t, {\bar{u}}_t , \xi _{t,0}^{c,i})\, H_3 \, \xi _{t,0}^{c,i} - \sum _{s=0}^{S-1} {\overline{E}}_s(y_t, u_t) - H_7 \, C^{tr}\, {\bar{u}}_t\, \Delta s . \end{aligned}$$

At stage T we set the terminal condition \(V_T=0\). Given an initial state \((x_0, y_0)\) and going backwards in time from \(t=T-1\) to \(t=0\), we obtain the power plant value at \(t=0\). The latter is calculated with respect to the baseline multistage model \({\mathbb {P}}\) and will be denoted as \(\nu _0({\mathbb {P}})\). The policy associated with \(\nu _0({\mathbb {P}})\) is denoted by \(u_{{\mathbb {P}}}^*\). It can be represented as a probabilistic tree of profiles.

If we incorporate ambiguity in the lattice process using the uniform Wasserstein distance, for all \(0\le t\le T-1\) we solve:

$$\begin{aligned} V_t([x_t, y_t], \xi _t^i )&= \max _{u_t} \min _{{\mathfrak {W}}(p_t(i), q_t(i))\le \varepsilon } \left\{ \sum _{j=1}^{N_{t+1}} \left[ h_t^j(x_t, y_t, u_t, \xi _{t,0}^i)\right. \right. \\&\quad \left. \left. + \beta V_{t+1} ([g^{(1)} (x_t, u_t, \xi _{t,1}^c), g^{(2)}(u_t)], \xi _{t+1}^j)\right] q_t(i,j) \right\} . \end{aligned}$$

The optimal value is reached at \(t=0\) and it will be denoted as \(\nu _0({\mathcal {Q}}^\varepsilon )\), where \({\mathcal {Q}}^\varepsilon \) denotes the ambiguity set defined as

$$\begin{aligned} {\mathcal {Q}}^\varepsilon = \{ {\mathbb {Q}}\, : \, {{\,\mathrm{{\mathfrak {W}}^{\infty }}\,}}({\mathbb {P}}, {\mathbb {Q}}) \le \varepsilon \} . \end{aligned}$$

A worst-case model \({{\mathbb {Q}}^\varepsilon }^*\) is any multistage probability model contained in \({\mathcal {Q}}^\varepsilon \) such that \(\nu _0({\mathcal {Q}}^\varepsilon ) = \nu _0({{\mathbb {Q}}^\varepsilon }^*)\). More concretely, the optimal value is attained at a saddle point \((u_{{\mathbb {Q}}^{\varepsilon *}} ^*, {{\mathbb {Q}}^\varepsilon }^*)\), where \(u_{{\mathbb {Q}}^{\varepsilon *}} ^*\) is the policy associated with the worst-case model.

At each node i, the objective function of the minimization problem is linear in \(q_t(i)\) under linear constraints. Define

$$\begin{aligned} c_j =h_t^j(x_t, y_t, u_t, \xi _{t,0}^i) + \beta V_{t+1} ([g^{(1)} (x_t, u_t, \xi _{t,1}^c), g^{(2)}(u_t)], \xi _{t+1}^j),\end{aligned}$$

for \(j=1, \dots , N_{t+1}\). Then, the minimization problem can be written as

$$\begin{aligned} \begin{aligned} \underset{q_t(i,j),\, \pi _{k,\,l}}{\text {min}}&\sum _{j=1}^{N_{t+1}} c_j \cdot q_t(i,j)\\ \text {s.t. }&\sum _{k = 1}^{N_{t+1}} \pi _{k,\, l} = q_t(i, l) \quad \forall l = 1, \dots , N_{t+1}\\&\sum _{l = 1}^{N_{t+1}} \pi _{k,\, l} = p_t(i, k) \quad \forall k = 1, \dots , N_{t+1}\\&\sum _{k, \, l} D_{k\, l}\, \pi _{k,\, l}\le \varepsilon \\&\sum _{k, \, l} \pi _{k,\, l} = 1 \\&\pi _{k, \, l} \ge 0,\, \forall k,\, l = 1, \dots , N_{t+1} , \end{aligned} \end{aligned}$$

where \(D_{k\, l} = D\left( [\xi _{t+1}^{e,k}, \, \xi _{t+1}^{f,k},\, \xi _{t+1}^{c,k}], [ \xi ^{e,l}_{t+1}, \, \xi _{t+1}^{f,l},\, \xi _{t+1}^{c,l}]\right) \) is the distance between nodes k and l at stage \(t+1\), as defined in (13) with specific weights \(w^e = 1\), \(w^f = H_1\) and \(w^c = H_3\cdot H_4\).
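This inner minimization can be passed directly to a linear programming solver. A Python sketch using scipy, with the successor values \(c_j\), the baseline row \(p_t(i)\) and the distance matrix \(D\) as inputs, reads:

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_transition(c, p, D, eps):
    """Inner minimization of the robust recursion: the transition probabilities
    q (one row of Q_t) minimizing the expected continuation value, subject to
    the Wasserstein constraint W(p, q) <= eps.
    c : continuation values c_j, p : baseline row p_t(i, .),
    D : pairwise node distances at stage t+1 (state-dependent, Eq. (13))."""
    N = len(p)
    # variables: q (N entries) followed by the transport plan pi (N*N entries)
    obj = np.concatenate([np.asarray(c, dtype=float), np.zeros(N * N)])
    A_eq, b_eq = [], []
    for l in range(N):                       # sum_k pi_{k,l} = q_l
        row = np.zeros(N + N * N)
        row[l] = -1.0
        row[N + l::N] = 1.0
        A_eq.append(row); b_eq.append(0.0)
    for k in range(N):                       # sum_l pi_{k,l} = p_k
        row = np.zeros(N + N * N)
        row[N + k * N:N + (k + 1) * N] = 1.0
        A_eq.append(row); b_eq.append(p[k])
    A_ub = np.zeros((1, N + N * N))          # sum_{k,l} D_{k,l} pi_{k,l} <= eps
    A_ub[0, N:] = D.reshape(-1)
    res = linprog(obj, A_ub=A_ub, b_ub=[eps], A_eq=np.array(A_eq),
                  b_eq=np.array(b_eq), bounds=(0, None))
    return res.x[:N], res.fun                # worst-case row q and its value
```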

Fig. 6 The impact of the ambiguity radius \(\varepsilon \) on the optimal profit over 13 weeks

Table 3 Percentage of change in electricity prices for every stage and different ambiguity radii

4.3 Impact of model ambiguity

4.3.1 The value of the power plant

We now describe the optimal valuation of the power plant obtained by computing the backward recursions (10) and (12). We assume that the initial state \(x_0\) provides enough allowances to execute any of the profiles and that the power plant was not offline before the start, i.e., \(y_0 =0\). Moreover, the terminal condition for both problems is set to \(V_T = 0\). With the baseline model we obtain an expected profit of approximately \(\nu _0 ({\mathbb {P}})=2.3\cdot 10^6\) (\(\pounds \)). The optimal decision at \(t = 0\) is to turn off the power plant by choosing \(u^{(1)}\). The valuation of the power plant including ambiguity is obtained for different radii \(\varepsilon \in [0,2]\). The resulting optimal values \(\nu _0({{\mathcal {Q}}^\varepsilon })\) are plotted against the ambiguity level in Fig. 6. We observe that the valuation of the power plant decreases as the ambiguity radius increases. For \(\varepsilon = 2\), the valuation of the power plant decreases to \(\nu _0({{\mathcal {Q}}^\varepsilon }) = 6.8\cdot 10^5\).

In order to gain insight into the change of prices under the ambiguity model, we report the changes in electricity prices for the worst-case models \({{\mathbb {Q}}^\varepsilon }^*\) in Table 3. Let \(B_0\) be a vector containing the stagewise expectations of electricity prices with respect to the baseline model and let \(B_\varepsilon \) be the vector containing the expectations with respect to each worst-case model \({{\mathbb {Q}}^\varepsilon }^*\). The percentage change in prices at each stage t is denoted by the parameter \(\theta _t\), such that \(B_\varepsilon (t) = (1-\theta _t) B_0(t)\). We observe that the largest change in prices occurs in stage 12, where electricity prices decrease by up to 30% for \(\varepsilon = 2\).

Fig. 7 Stagewise distribution of the optimal profiles \(u_{\mathbb {P}}^*\)

Fig. 8 Stagewise distribution of the optimal profiles with respect to the ambiguity radius. For each \(\varepsilon \) we plot the distribution of \(u_{{{\mathbb {Q}}^\varepsilon }^*}^*\)

4.3.2 Forward in time

With the iterative solution of the backward equations we eventually obtain an initial optimal profile at \(t=0\), namely \(u_0^* = u^{(1)}\). With this initial decision we go forward in time and create a probabilistic tree \(u^*_{\mathbb {P}}\) of the optimal decisions together with their profits. Starting with the given states and the optimal profile at \(t=0\), the updated states at stage \(t=1\) are completely determined by the knowledge of \(x_0,\, y_0\) and \(u_0^*\). The optimal profile at \( t= 1\), for each node \(i = 1, \dots , N_1\), is chosen by looking up the nearest location of the updated states in the grid and taking the corresponding profile chosen in the backward algorithm. We proceed in this way until we obtain all the optimal profiles at stage \(T-1\). Eventually, we obtain a probabilistic tree with \(3^{T}\) possible paths. Following the same procedure, we calculate the probabilistic tree of optimal profiles \(u_{{\mathbb {Q}}^{\varepsilon *}} ^*\) for each worst-case model \({{\mathbb {Q}}^\varepsilon }^*\), and the corresponding profits under the worst-case models for different radii \(\varepsilon \).

Starting with \(u^{(1)}\) is optimal for all models at \(t=0\). For the subsequent stages the choices of optimal profiles differ. Figure 7 shows the stagewise distribution of the optimal profiles chosen under the baseline model \({\mathbb {P}}\). Figure 8 shows how the profile choices change when we incorporate ambiguity in the model.

Fig. 9 Left: the profit tree obtained by following the optimal profiles. Right: the distribution of the final profits obtained by following every path

Fig. 10 Stagewise accumulated profit distribution for each solution \(u_{{{\mathbb {Q}}^\varepsilon }^*}^*\)

Without ambiguity there is a positive probability of choosing full production in stages 6, 7, 8, 10 and 12. As the ambiguity radius increases, these probabilities drop to zero. The larger the ambiguity radius, the more often we choose to stay offline or not to produce on weekends, selecting profiles such as \(u^{(2)},\, u^{(4)}, \, u^{(6)}\). A less likely alternative is not to produce during peak hours, by choosing \(u^{(3)}\) or \(u^{(5)}\).

Given the optimal profiles for the baseline model \({\mathbb {P}}\) and the alternative models \({{\mathbb {Q}}^\varepsilon }^*\), we can calculate the profits made along the decision tree. To be precise, we denote by \(i_\tau \in \{1, \dots, N_\tau\}\) any node index at stage \(\tau = 0, \dots , T\). Since \(N_0 = 1 \), a possible path to follow forward in time up to stage t goes through a sequence of nodes \((1, i_1, \dots , i_{t-1}, i_t)\). The profit from \(i_\tau \) to \(i_{\tau + 1}\) is the profit made in stage \(\tau \) and is written as \(h_\tau ^ {i_\tau , i_{\tau + 1} }\). This profit is obtained with probability \(p_\tau (i_\tau , i_{\tau + 1 })\). Therefore, the accumulated profit until stage \(t-1\), when ending in node \(i_t\), is \( (h_0^{1, i_1} + \cdots + h_{t-1}^{i_{t-1}, i_t}) \) with probability \(p_0(1,i_1) \cdots p_{t-1}(i_{t-1}, i_t)\). Figure 9 shows the accumulated profits following the tree of optimal profiles \(u_{\mathbb {P}}^*\) as well as the distribution of the final profits.
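Since the lattice is ternary and the horizon short, the distribution of accumulated profits can be obtained by full path enumeration. A small Python sketch (with hypothetical containers P and H for the transition matrices and the stagewise node-to-node profits, and 0-based node indices) reads:

```python
def accumulated_profit_distribution(P, H):
    """Enumerate all lattice paths (1, i_1, ..., i_T) and return pairs
    (path probability, accumulated profit), cf. the forward pass above.
    P[t]       : transition matrix of stage t (shape N_t x N_{t+1})
    H[t][i][j] : profit h_t^{i,j} made in stage t when moving from node i to j."""
    T = len(P)
    paths = [(1.0, 0.0, 0)]                   # (probability, profit, current node)
    for t in range(T):
        new_paths = []
        for prob, acc, i in paths:
            for j in range(P[t].shape[1]):
                if P[t][i, j] > 0:
                    new_paths.append((prob * P[t][i, j], acc + H[t][i][j], j))
        paths = new_paths
    return [(prob, acc) for prob, acc, _ in paths]
```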

If we include ambiguity, then the optimal profits change, as do their probabilities. Figure 10 shows the profit trees together with the final distributions with respect to the corresponding alternative models. We observe that for larger \(\varepsilon \) the alternative models put more weight on lower profits.

5 Conclusion

In this paper, we have shown how a realistic valuation of a power plant can be obtained by solving a multistage Markovian decision problem. The value is defined as the (discounted) expected net profit that one can obtain from operating the plant if an optimal production plan is implemented. In this valuation process, all relevant purchasing costs and selling prices are included in the model. The number of feasible production plans is finite and thus a discrete multistage optimization problem has to be solved. We use the classical backward algorithm for the Markovian control problem and a forward algorithm for determining an estimate of the achievable profit and its distribution. The novelty of the paper is twofold. First, we adopt a multiscale approach, where decisions are made on a coarser scale than costs are calculated. This allows us to keep the computational effort tractable. Second, we do not only consider the baseline model for the random factors, but rather a set of models (the ambiguity set) which are close to the baseline model. This allows us to incorporate the fact that probability distributions for future costs and revenues are not known precisely. The more models, and especially the more unfavourable models, that are included in the ambiguity set, the smaller the robust value of the plant. We demonstrate how the final value under model ambiguity depends on the degree of uncertainty about the correct price and cost model. Our distance model for the ambiguity set depends on the state of the system, taking into account that how close two price vectors are also depends on whether these prices are relevant for the state at hand. We also observed that the optimal production strategy not only depends on the degree of ambiguity, but also becomes more diversified for larger ambiguity, in contrast to the bang-bang type solutions of unambiguous models.