Abstract
We consider an optimal switching problem with random lag and possibility of component failure. The random lag is modeled by letting the operation mode follow a regime switching Markov-model with transition intensities that depend on the switching mode. The possibility of failures is modeled by having absorbing components. We show existence of an optimal control for the problem by applying a probabilistic technique based on the concept of Snell envelopes.
1 Introduction
The standard optimal switching problem (sometimes referred to as the starting and stopping problem) is a stochastic optimal control problem of impulse type that arises when an operator controls a dynamical system by switching between the members of a set of switching modes \(\mathcal {I}=\{\mathbf{b}_1,\ldots ,\mathbf{b}_m\}\). In the two-modes case (\(m=2\)) the modes may represent, for example, “operating” and “closed” when maximizing the revenue from mineral extraction in a mine as in [8]. In the multi-modes case the operating modes may represent different levels of power production in a power plant when the owner seeks to maximize her total revenue from producing electricity [10], or the states “operating” and “closed” of single units in a multi-unit production facility as in [7].
In optimal switching the control takes the form \(u=(\tau _1,\ldots ,\tau _N;\beta _1,\ldots ,\beta _N)\), where \(\tau _1\le \tau _2\le \cdots \le \tau _N\) is a sequence of (random) times when the operator intervenes on the system and \(\beta _j\in \mathcal {I}\) is the switching mode that the operator switches to at time \(\tau _j\). The standard multi-modes optimal switching problem in finite horizon (\(T<\infty \)) can then be formulated as finding the control that maximizes
$$\begin{aligned} J(u):=\mathbb {E}\left[ \int _0^T\psi _{\alpha _t}(t)dt+\Upsilon _{\alpha _T}-\sum _{j=1}^N c_{\beta _{j-1},\beta _j}(\tau _j)\right] , \end{aligned}$$
where \(\alpha _t=\beta _0\mathbb {1}_{[0,\tau _{1})}(t)+\sum _{j=1}^N \beta _j\mathbb {1}_{[\tau _{j},\tau _{j+1})}(t)\) is the operation mode (when starting in a predefined mode \(\beta _0\in \mathcal {I}\)), \(\psi _\mathbf{b}\) and \(\Upsilon _\mathbf{b}\) are the running and terminal revenue in mode \(\mathbf{b}\in \mathcal {I}\), respectively and \(c_{\mathbf{b},\mathbf{b}'}(t)\) is the cost of switching from mode \(\mathbf{b}\) to mode \(\mathbf{b}'\) at time \(t\in [0,T]\).
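The operation mode \(\alpha _t\) defined above is a piecewise-constant path determined by the intervention times and the chosen modes. A minimal Python sketch of its evaluation (the function and variable names are ours, not from the text):

```python
from bisect import bisect_right

def operation_mode(t, tau, beta, beta0):
    """Evaluate alpha_t = beta0 on [0, tau_1) and beta_j on [tau_j, tau_{j+1})."""
    j = bisect_right(tau, t)   # number of intervention times <= t
    return beta0 if j == 0 else beta[j - 1]

# Two-modes example: start "closed" (0), open at t=1.0, close again at t=2.5
tau, beta = [1.0, 2.5], [1, 0]
print([operation_mode(t, tau, beta, 0) for t in (0.5, 1.0, 2.0, 3.0)])
# → [0, 1, 1, 0]
```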
The standard optimal switching problem has been thoroughly investigated over the last decades, after being popularised in [8]. In [19] a solution to the two-modes problem was found by rewriting the problem as an existence and uniqueness problem for a doubly reflected backward stochastic differential equation. In [13] existence of an optimal control for the multi-modes optimal switching problem was shown by a probabilistic method based on the concept of Snell envelopes. Furthermore, existence and uniqueness of viscosity solutions to the related Bellman equation was shown for the case when the switching costs are constant and the underlying uncertainty is modeled by a stochastic differential equation (SDE) driven by a Brownian motion. In [14] the existence and uniqueness results for viscosity solutions were extended to the case when the switching costs depend on the state variable. Since then, results have been extended to Knightian uncertainty [11, 20, 21] and to non-Brownian filtrations and signed switching costs [32]. For the situation when the underlying uncertainty can be modeled by a diffusion process, the generalization to the case when the control enters the drift and volatility terms was treated in [17]. This was further developed to include state constraints in [26]. Another important generalization is to the case when the operator only has partial information about the present state of the diffusion process, as treated in [30].
As many physical systems do not respond immediately to changes in the control variables, including delays is an important aspect when seeking to derive applicable results in optimal control. General impulse control problems with deterministic lag have been considered in a variety of different settings, including [3], where an explicit solution to an inventory problem with uniform delivery lag is found by taking the current stock plus pending orders as one of the states. Similar approaches are taken in [2], where explicit optimal solutions of impulse control problems with uniform delivery lags are derived for a large set of different problems, and in [9], where an iterative algorithm is proposed. In [33] the authors propose a solution to general impulse control problems with lag by defining an operator that circumvents the delay period. The optimal switching problem with non-uniform (but deterministic) lag and ramping was solved in [34] by state space augmentation in combination with the probabilistic approach initially developed in [13].
The aim of the present article is to extend the applicability of optimal switching further by considering the case of random lag and component failure during startup. As in [34] we consider the problem of operating \(n>0\) different production units, that can be either in operation or turned off, and thus let the switching modes be the set of all n-dimensional vectors of zeroes and ones, i.e. \(\mathcal {I}:=\{0,1\}^n\). To model the random lags and failures we let the operation mode, \(\alpha ^u_t\), be a continuous-time, finite-state, observable Markov-process taking values in \(\mathcal {A}:=\{-1,0,1\}^n\), where \(-1\) represents “malfunction”, 0 represents “off” and 1 represents “operating”. We assume that the transition intensities of \(\alpha ^u_t\) depend on the control both through the present switching mode, \(\xi _t:=\sum _{j=1}^N\beta _j\mathbb {1}_{[\tau _j,\tau _{j+1})}(t)\), and through the time of the last switch from off to operating in each of the different production units. As opposed to the situation in the standard optimal switching problem, the switching mode and the operation mode may thus differ due to the lag.
We will consider the problem of finding a strategy u that maximizes
$$\begin{aligned} J(u):=\mathbb {E}\left[ \int _0^T\psi _{\alpha ^u_t}\left( t,\theta ^u_t\right) dt+\Upsilon _{\alpha ^u_T}\left( \theta ^u_T\right) -\sum _{j=1}^N c_{\beta _{j-1},\beta _j}(\tau _j)\right] , \end{aligned}$$
where the process \(\theta ^u\) is such that the ith component gives the elapsed time in the present “on”-cycle for Plant i. The process \(\theta ^u\) will allow us to model increased production costs during startup or lower production during ramp-up periods (see e.g. [35] for a situation where ramping is important). The results presented will be derived under the assumption that \(\psi _\mathbf{a}\), \(\Upsilon _\mathbf{a}\) and \(c_{\mathbf{b},\mathbf{b}'}\) are adapted to a filtration generated by a Brownian motion. However, these results readily extend to more general (quasi-left continuous) filtrations, e.g. a filtration generated by a Brownian motion and an independent Poisson random measure.
The remainder of the article is organized as follows. In the next section we state the problem, set the notation used throughout the article and detail the set of assumptions that are made. Then, in Sect. 3 a verification theorem is derived. This verification theorem is an extension of the original verification theorem for the multi-modes optimal switching problem developed in [13]. In Sect. 4 we show that there exists a family of processes that satisfies the requirements of the verification theorem, thus proving existence of an optimal control for the optimal switching problem with random lag. Then, in Sect. 5 we focus on the case when the underlying uncertainty in the processes \(\psi _\mathbf{a}\) and \(\Upsilon _\mathbf{a}\) can be modeled by an SDE and derive a dynamic programming relation for the corresponding value functions.
2 Preliminaries
We consider the finite horizon problem and thus assume that the terminal time T is fixed with \(T<\infty \). We will assume that turning off a unit has an immediate effect on the operation mode, so that \(\alpha _t\le \xi _t\) for all \(t\in [0,T]\). The state space for \((\alpha ,\xi )\) is then \(\mathcal {J}:=\{(\mathbf{a},\mathbf{b})\in \mathcal {A}\times \mathcal {I}:\mathbf{a}\le \mathbf{b}\}\). Furthermore, we define the following sets:
- For each \(\mathbf{b}\in \mathcal {I}\), we let \(\mathcal {A}_{\mathbf{b}}:=\{\mathbf{a}\in \mathcal {A}:\mathbf{a}\le \mathbf{b}\}\) and for each \(\mathbf{a}\in \mathcal {A}\) we let \(\mathcal {I}_{\mathbf{a}}:=\{\mathbf{b}\in \mathcal {I}:\mathbf{b}\ge \mathbf{a}\}\).
- For each \((\mathbf{a},\mathbf{b})\in \mathcal {J}\) we let \(\mathcal {A}_{\mathbf{a},\mathbf{b}}:=\{\mathbf{a}'\in \mathcal {A}_\mathbf{b}: |a_i'|\ge |a_i|\,\,\mathrm{and}\,\,a'_i = a_i\,\,\mathrm{when}\,\, a_i\in \{-1,-b_i\}\}\).
- For each \(\mathbf{b}\in \mathcal {I}\) we let \(\mathcal {I}^{-\mathbf{b}}:=\mathcal {I}\setminus \{\mathbf{b}\}\) and for each \(\mathbf{a}'\in \mathcal {A}_{\mathbf{a},\mathbf{b}}\) we let \(\mathcal {A}^{-\mathbf{a}'}_{\mathbf{a},\mathbf{b}}:=\mathcal {A}_{\mathbf{a},\mathbf{b}}\setminus \{\mathbf{a}'\}\).
- For each \(\mathbf{b}\in \mathcal {I}\) we let \(\mathcal {A}_{\mathbf{b}}^{\mathrm{abs}}:=\Pi _{i=1}^{n}\{\{-b_i\}\cup \{-1\}\}\).
- We introduce the set \(\mathcal D:=[0,T] \times \cup _{(\mathbf{a},\mathbf{b})\in \mathcal {J}}[0,T]^{\mathbf{b}} \times [0,T]^{\mathbf{a}^+}\times (\mathbf{a},\mathbf{b})\) and let \(\mathcal D_A:=[0,T] \times \cup _{(\mathbf{a},\mathbf{b})\in \mathcal {J}}[0,T]^{\mathbf{b}}\times (\mathbf{a},\mathbf{b})\) and \(\mathcal D_\lambda := \cup _{\mathbf{b}\in \mathcal {I}}[0,T]^{\mathbf{b}}\times \{\mathbf{b}\}\). Furthermore, for each \((\mathbf{a},\mathbf{b})\in \mathcal {J}\) we let \(\mathcal D_{(\mathbf{a},\mathbf{b})}:=[0,T]\times [0,T]^\mathbf{b}\times [0,T]^{\mathbf{a}^+}\).
Note here that \(\mathcal {A}_{\mathbf{a},\mathbf{b}}\) is the set of all \(\mathbf{a}'\in \mathcal {A}_\mathbf{b}\) that the operation mode may transition to from \(\mathbf{a}\) when \(\xi =\mathbf{b}\) and \(\mathcal {A}_{\mathbf{b}}^\mathrm{abs}\) is the set of all states in \(\mathcal {A}\) that are absorbing for \(\alpha \) when \(\xi =\mathbf{b}\).
We let \((\Omega ,\mathcal {G},\mathbb {G},\mathbb {P})\) be a probability space endowed with a d-dimensional Brownian motion \((B_t:0\le t\le T)\) whose augmented natural filtration is \(\mathbb {F}:=(\mathcal {F}_t)_{0\le t\le T}\). For all \((t,\nu ,\mathbf{a},\mathbf{b})\in \mathcal D_A\) let \((A^{t,\nu ,\mathbf{a},\mathbf{b}}_s:0\le s\le T)\) be a mixed Markov chain (sometimes also referred to as a stochastic hybrid system [31], see [5, 6, 23] for applications in credit-risk models and [4, 22] for continuous-time conditionally-Markov chains) with càdlàg sample paths and state-space \(\mathcal {A}_{\mathbf{a},\mathbf{b}}\). We assume that \(A^{t,\nu ,\mathbf{a},\mathbf{b}}_s=\mathbf{a}\) for \(s\in [0,t\vee \max _i \nu _i]\) and that on \((t\vee \max _i \nu _i,T]\), the transition rate from \(\mathbf{a}\) to \(\mathbf{a}'\), \(\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}(s)\), is \(\mathbb {F}\)-progressively measurable for all \(\mathbf{a}'\in \mathcal {A}_{\mathbf{a},\mathbf{b}}\). For \(\mathbf{a}'\notin \mathcal {A}_{\mathbf{a},\mathbf{b}}\) we let \(\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}(s)\equiv 0\). We assume that \(\mathbb {G}:=(\mathcal {G}_t)_{0\le t\le T}\) is the augmented natural filtration generated by B and the family \(((A^{t,\nu ,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}:(t,\nu ,\mathbf{a},\mathbf{b})\in \mathcal D_A)\), satisfying the usual conditions in addition to being quasi-left continuous (more information about enlargement of filtrations can be found in e.g. Chapter 6 of [36]).
Recall here the concept of left continuity in expectation: A process \((X_t:0\le t\le T)\) is strongly left continuous in expectation (SLCE) if for each stopping time \(\gamma \) and each sequence of stopping times \(\gamma _k\nearrow \gamma \) we have \(\lim \limits _{k\rightarrow \infty }\mathbb {E}\left[ X_{\gamma _k}\right] = \mathbb {E}\left[ X_\gamma \right] \).
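A simple deterministic example (our own illustration) shows how SLCE can fail for a càdlàg process with an upward jump at a predictable time:

```latex
% X jumps up at the deterministic time t_0 \in (0,T):
X_t = \mathbb{1}_{[t_0,T]}(t).
% For \gamma = t_0 and any sequence of times \gamma_k \nearrow t_0 with \gamma_k < t_0,
\lim_{k\to\infty}\mathbb{E}\left[X_{\gamma_k}\right] = 0 \neq 1 = \mathbb{E}\left[X_{\gamma}\right],
% so X is right continuous but not SLCE.
```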
Throughout we will use the following notation:
- \(\mathcal {P}_{\mathbb {F}}\) (resp. \(\mathcal {P}_{\mathbb {G}}\)) is the \(\sigma \)-algebra of \(\mathbb {F}\)-progressively measurable (\(\mathbb {G}\)-progressively measurable) subsets of \([0,T]\times \Omega \).
- We let \(\mathcal {S}^{2}\) be the set of all \(\mathbb {R}\)-valued, \(\mathcal {P}_{\mathbb {G}}\)-measurable, càdlàg processes \((Z_t: 0\le t\le T)\) such that \(\mathbb {E}\left[ \sup _{t\in [0,T]} |Z_t|^2\right] <\infty \). We let \(\mathcal {S}_{e}^{2}\) (resp. \(\mathcal {S}_{c}^{2}\)) be the subset of processes that are non-negative and SLCE (resp. continuous).
- We let \(\mathcal {S}_\mathbb {F}^{2}\), \(\mathcal {S}_{\mathbb {F},e}^{2}\) and \(\mathcal {S}_{\mathbb {F},c}^{2}\) be the subset of \(\mathcal {S}^{2}\), \(\mathcal {S}_{e}^{2}\) and \(\mathcal {S}_{c}^{2}\), respectively, of processes that are \(\mathcal {P}_{\mathbb {F}}\)-measurable.
- We let \(\mathcal {H}_\mathbb {F}^{2}\) denote the set of all \(\mathbb {R}\)-valued \(\mathcal {P}_{\mathbb {F}}\)-measurable processes \((Z_t: 0\le t\le T)\) such that \(\mathbb {E}\left[ \int _0^T |Z_t|^2 dt\right] <\infty \).
- We let \(\mathcal {H}^{\infty }_\mathbb {F}\) denote the set of all \(\mathbb {R}\)-valued \(\mathcal {P}_{\mathbb {F}}\)-measurable processes \((Z_t: 0\le t\le T)\) such that \(|Z_t|<\infty \), \(d\mathbb {P}\otimes dt\)-a.e.
- We let \(\mathcal {T}\) (\(\mathcal {T}^{\mathbb {F}}\)) be the set of all \(\mathbb {G}\)-(\(\mathbb {F}\)-)stopping times and for each \(\gamma \in \mathcal {T}\) (\(\mathcal {T}^{\mathbb {F}}\)) we let \(\mathcal {T}_\gamma \) (\(\mathcal {T}^{\mathbb {F}}_\gamma \)) be the subset of stopping times \(\tau \) such that \(\tau \ge \gamma \), \(\mathbb {P}\)-a.s.
- We let \(\mathcal {U}\) be the set of all \(u=(\tau _1,\ldots ,\tau _N;\beta _1,\ldots ,\beta _N)\), where \((\tau _j)_{j=1}^N\) is an increasing sequence of \(\mathbb {G}\)-stopping times and \(\beta _j\in \mathcal {I}^{-\beta _{j-1}}\) is \(\mathcal {G}_{\tau _j}\)-measurable.
- We let \(\mathcal {U}^f\) be the subset of controls \(u\in \mathcal {U}\) for which N is finite \(\mathbb {P}\)-a.s. (i.e. \(\mathcal {U}^f:=\{u\in \mathcal {U}:\, \mathbb {P}\left[ \{\omega \in \Omega : N(\omega )>k, \,\forall k>0\}\right] =0\}\)) and \(\mathcal {U}^k\) the subset of controls for which \(N\le k\), \(\mathbb {P}\)-a.s. For \(\gamma \in \mathcal {T}\) let \(\mathcal {U}_\gamma \) (resp. \(\mathcal {U}_\gamma ^f\) and \(\mathcal {U}_\gamma ^k\)) be the subset of \(\mathcal {U}\) (resp. \(\mathcal {U}^f\) and \(\mathcal {U}^k\)) with \(\tau _1\in \mathcal {T}_\gamma \).
Our problem will be characterized by four objects:
- A collection \((\Upsilon _{\mathbf{a}}:\Omega \times [0,T]^{\mathbf{a}^+}\rightarrow \mathbb {R})_{\mathbf{a}\in \mathcal {A}}\) of \(\mathcal {F}_T\otimes \mathcal {B}([0,T]^{\mathbf{a}^+})\)-measurable maps.
- A collection \((\psi _{\mathbf{a}}:\Omega \times [0,T]\times [0,T]^{\mathbf{a}^+}\rightarrow \mathbb {R})_{\mathbf{a}\in \mathcal {A}}\) where \(\psi _{\mathbf{a}}\) is a \(\mathcal {P}_{\mathbb {F}}\otimes \mathcal {B}([0,T]^{\mathbf{a}^+})\)-measurable map.
- A cost process \(C^u_t:=\sum _{\tau _j\le t}c_{\beta _{j-1},\beta _j}(\tau _j)\), where \((c_{\mathbf{b},\mathbf{b}'}:\Omega \times [0,T]\rightarrow \mathbb {R})_{(\mathbf{b},\mathbf{b}')\in \mathcal {I}^2}\) is a collection of \(\mathcal {P}_{\mathbb {F}}\)-measurable processes.
- A family \((((\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}(s):0\le s\le T)_{(\mathbf{a},\mathbf{a}')\in \mathcal {A}_{\mathbf{b}}\times \mathcal {A}_{\mathbf{b}}})_{\nu \in [0,T]^\mathbf{b}})_{\mathbf{b}\in \mathcal {I}}\) of \(\mathbb {R}\)-valued, \(\mathcal {P}_{\mathbb {F}}\)-measurable transition intensities (sometimes referred to as \(\mathbb {F}\)-conditional intensities), i.e. \(\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}\ge 0\) for \(\mathbf{a}'\ne \mathbf{a}\) and \(\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}}=-\sum _{\mathbf{a}'\in \mathcal {A}_{\mathbf{a},\mathbf{b}}^{-\mathbf{a}}} \lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}\).
We make the following assumptions:
Assumption 2.1
(i) For each \(\mathbf{a}\in \mathcal {A}\) and \(z\in [0,T]^{\mathbf{a}^+}\), \(\psi _\mathbf{a}(\cdot ,z)\in \mathcal {S}_{\mathbb {F}}^{2}\) and \(\mathbb {E}[\Upsilon _\mathbf{a}^2]<\infty \). Furthermore, we assume that there are constants \(k_\psi >0\) and \(k_\Upsilon >0\) such that, for each \((z,z')\in [0,T]^{\mathbf{a}^+}\times [0,T]^{\mathbf{a}^+}\),
$$\begin{aligned} |\psi _\mathbf{a}(t,z)-\psi _\mathbf{a}(t,z')|\le k_\psi |z-z'|, \end{aligned}$$
for all \(t\in [0,T]\) and
$$\begin{aligned} |\Upsilon _{\mathbf{a}}(z)-\Upsilon _{\mathbf{a}}(z')|\le k_\Upsilon |z-z'|, \end{aligned}$$
\(\mathbb {P}\)-a.s. (where the exception set does not depend on the tuple \((t,z,z')\)).
(ii) The switching costs \((c_{\mathbf{b},\mathbf{b}'})_{\mathbf{b},\mathbf{b}'\in \mathcal {I}}\in (\mathcal {S}_{\mathbb {F},c}^2)^{m\times m}\) are such that, \(\mathbb {P}\)-a.s.,
(a) \(\inf _{t\in [0,T]} c_{\mathbf{b},\mathbf{b}'}(t)\ge 0\), for all \((\mathbf{b},\mathbf{b}')\in \mathcal {I}\times \mathcal {I}\);
(b) \(c_{\mathbf{b}_{1},\mathbf{b}_2}(t_1)+c_{\mathbf{b}_2,\mathbf{b}_3}(t_2)+\cdots +c_{\mathbf{b}_{k-1}, \mathbf{b}_k}(t_{k-1})+c_{\mathbf{b}_{k},\mathbf{b}_1}(t_k)\ge \epsilon >0\), for all \(0\le t_1\le t_2\le \cdots \le t_k\le T\) and \((\mathbf{b}_1,\ldots ,\mathbf{b}_k)\in \mathcal {I}^k\).
(iii) For all \(\mathbf{a}\in \mathcal {A}\) and \(z\in [0,T]^{\mathbf{a}^+}\), \(\Upsilon _{\mathbf{a}}(z)>\max _{(\mathbf{b},\mathbf{b}')\in \mathcal {I}_{\mathbf{a}}\times \mathcal {I}} \{\Upsilon _{\mathbf{a}\wedge \mathbf{b}'}(z\wedge T\mathbf{b}')-c_{\mathbf{b},\mathbf{b}'}(T)\}\), \(\mathbb {P}\)-a.s.
(iv) For each \((\nu ,\mathbf{b})\in \mathcal D_\lambda \) and all \(\mathbf{a},\mathbf{a}'\in \mathcal {A}_{\mathbf{b}}\) the process \(\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}(\cdot )\in \mathcal {H}_\mathbb {F}^{\infty }\) and we assume that there is a constant \(K_\lambda >0\), such that
$$\begin{aligned} |\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}(s)|\le K_\lambda , \end{aligned}$$
\(d\mathbb {P}\otimes ds\)-a.e. Furthermore, we assume that each element of \(\lambda ^{\nu ,\mathbf{b}}\) is Lipschitz continuous in \(\nu \):
$$\begin{aligned} |\lambda ^{\nu ,\mathbf{b}}(s)-\lambda ^{\nu ',\mathbf{b}}(s)|\le k_\lambda |\nu -\nu '|, \end{aligned}$$
for all \(s\in [0,T]\), \(\mathbb {P}\)-a.s. (where the exception set does not depend on the tuple \((s,\nu ,\nu ')\)).
The above assumptions are mainly standard assumptions for optimal switching problems. Assumptions (i) and (ii.a) together imply that the expected maximal reward is finite. Assumption (ii.b) implies that there is always a strictly positive switching cost associated with making a loop of switches, and (iii) implies that it is never optimal to switch at time T.
Each control \(u=(\tau _1,\ldots ,\tau _N;\beta _1,\ldots ,\beta _N)\) defines the switching mode starting in \(\mathbf{b}\in \mathcal {I}\), which is a process \((\xi ^{\mathbf{b}}_t: 0\le t\le T)\) given by
$$\begin{aligned} \xi ^{\mathbf{b}}_t=\mathbf{b}\mathbb {1}_{[0,\tau _1)}(t)+\sum _{j=1}^N\beta _j\mathbb {1}_{[\tau _j,\tau _{j+1})}(t), \end{aligned}$$
with \(\tau _{N+1}=\infty \) (for notational simplicity we will write \(\xi \) for \(\xi ^{0}\)). The switching mode thus, in some sense, tells us the preferred operation state. For each initial mode \(\mathbf{b}\) and each vector \(\nu \in [0,T]^{\mathbf{b}}\), we let the control u define the sequence \((\vartheta ^{\nu ,\mathbf{b}}_0,\ldots ,\vartheta ^{\nu ,\mathbf{b}}_N)\) with \(\vartheta ^{\nu ,\mathbf{b}}_0:=\nu \), \(\vartheta ^{\nu ,\mathbf{b}}_1:=\nu \beta _1+\tau _1(\beta _1-\mathbf{b})^+\) and then recursively \(\vartheta ^{\nu ,\mathbf{b}}_{j}:=\vartheta ^{\nu ,\mathbf{b}}_{j-1}\beta _{j}+\tau _{j}(\beta _{j}-\beta _{j-1})^+\) for \(j=2,\ldots ,N\).
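The recursion for \(\vartheta ^{\nu ,\mathbf{b}}_j\) acts componentwise: switching unit i on records the switching time \(\tau _j\) in component i, while switching it off resets the component to zero. A Python sketch of this bookkeeping (function and variable names are ours, not from the text):

```python
def activation_times(nu, b, tau, beta):
    """Componentwise recursion
    theta_0 = nu,  theta_j = theta_{j-1}*beta_j + tau_j*(beta_j - beta_{j-1})^+,
    so entry i holds the time of the last off->on switch of unit i
    (and is reset to 0 while the unit is off)."""
    seq = [list(nu)]
    prev_mode = list(b)
    for t_j, b_j in zip(tau, beta):
        prev = seq[-1]
        seq.append([p * m + t_j * max(m - pm, 0)
                    for p, m, pm in zip(prev, b_j, prev_mode)])
        prev_mode = b_j
    return seq

# Two units, both initially off: unit 1 on at t=1, unit 2 on at t=2,
# then unit 1 off at t=3.
seq = activation_times([0, 0], [0, 0],
                       [1.0, 2.0, 3.0],
                       [[1, 0], [1, 1], [0, 1]])
print(seq)  # → [[0, 0], [1.0, 0.0], [1.0, 2.0], [0.0, 2.0]]
```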
Given that the operating mode at time t is \(\mathbf{a}\), the switching mode is \(\mathbf{b}\in \mathcal {I}\) and given the vector of activation times \(\nu \), such that \((t,\nu ,\mathbf{a},\mathbf{b})\in \mathcal D_A\), the family of mixed Markov-chains \((A^{\cdot ,\cdot ,\cdot ,\cdot }_s:0\le s\le T)\) defines the sequence of operating modes, \(\bar{\alpha }_0,\ldots ,\bar{\alpha }_N\), at the intervention times as \(\bar{\alpha }_0:=\mathbf{a}\) and then recursively
for \(j=1,\ldots ,N\), with \(\tau _0=t\) and \(\beta _0=\mathbf{b}\). This leads us to define the operating mode \((\alpha ^{t,\nu ,\mathbf{a},\mathbf{b},u}(s):0\le s\le T)\) as
For notational simplicity we use the same shorthand as above and write \(\alpha ^u\) for \(\alpha ^{0,0,0,0,u}\).
Example 2.2
Consider the case when transitions for different plants are independent and Plant i has a failure rate \(r^\mathrm{fail}_i\ge 0\) if operating (and 0 otherwise) and a startup rate \(r^\mathrm{start}_i:[0,T]\rightarrow \mathbb {R}_ +\), where the input is the time that has elapsed since the unit was turned on. Then,
when \(\mathbf{a}'\in \mathcal {A}_{\mathbf{a},\mathbf{b}}^{-\mathbf{a}}\) and \(\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}}(s)=-\sum _{\mathbf{a}'\in \mathcal {A}_{\mathbf{a},\mathbf{b}}^{-\mathbf{a}}}\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}(s)\). \(\square \)
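The verbal description in Example 2.2 can be sketched for a single unit as follows. This is an illustrative Python reconstruction of the structure only; the example's displayed formula for \(\lambda ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}\) is not reproduced here, and the rate function used below is a made-up choice:

```python
def single_unit_rate(a_i, a_i_new, b_i, s, nu_i, r_fail, r_start):
    """Transition rate of one unit (states: -1 = malfunction, absorbing;
    0 = off; 1 = operating).  A unit fails at rate r_fail while operating,
    and starts at rate r_start(s - nu_i) while off but commanded on,
    where nu_i is the time the unit was last switched on."""
    if a_i == 1 and a_i_new == -1:
        return r_fail                    # failure while operating
    if a_i == 0 and a_i_new == 1 and b_i == 1:
        return r_start(s - nu_i)         # startup, depends on elapsed time
    return 0.0                           # all other single-unit moves

# Illustrative startup rate, increasing in the time since switch-on
r_start = lambda d: 2.0 * max(d, 0.0)
print(single_unit_rate(0, 1, 1, s=1.5, nu_i=1.0, r_fail=0.1,
                       r_start=r_start))   # → 1.0
print(single_unit_rate(1, -1, 1, s=1.5, nu_i=1.0, r_fail=0.1,
                       r_start=r_start))   # → 0.1
```

Because transitions of different plants are independent, only moves changing a single component have positive rate, and the diagonal entry is minus the sum of the off-diagonal rates, as stated above.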
We let \(((\theta ^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}:(t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\) be given by
To define the time in present on-mode for a control \(u=(\tau _1,\ldots ,\tau _N;\beta _1,\ldots ,\beta _N)\in \mathcal {U}_t\) starting in z at time t we let \(\theta ^0_s:=\theta ^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s\) and then recursively define
This allows us to define
and again we let \(\theta ^u:=\theta ^{0,0,0,0,0,u}\). In addition, to simplify notation in some of the proofs, we let
Remark 2.3
Note that, with \(\vartheta _t:=\sum _{j=1}^N \vartheta _j\mathbb {1}_{[\tau _{j},\tau _{j+1})}\) the set \(\mathcal D\) is the state-space for \((s,\vartheta _s,\theta _s,\alpha _s,\xi _s)_{0\le s\le T}\), \(\mathcal D_A\) is the state-space for \((s,\vartheta _s,\alpha _s,\xi _s)_{0\le s\le T}\) and \(\mathcal D_\lambda \) is the state-space for \((\vartheta _s,\xi _s)_{0\le s\le T}\).
We are now ready to state the optimal switching problem with random lag:
Problem 1
Find \(u^*\in \mathcal {U}\), such that
$$\begin{aligned} J(u^*)=\sup _{u\in \mathcal {U}}J(u). \end{aligned}$$
\(\square \)
Remark 2.4
Note that we have
Hence, we can without loss of generality assume that for each \(\mathbf{a}\in \mathcal {A}\), \(\Upsilon _{\mathbf{a}}\) and \(\psi _{\mathbf{a}}\) are both non-negative.
The following proposition is a standard result for optimal switching problems and is due to the “no-free-loop” condition.
Proposition 2.5
Suppose that there is a \(u^*\in \mathcal {U}\) such that \(J(u^*)\ge J(u)\) for all \(u\in \mathcal {U}\). Then \(u^*\in \mathcal {U}^f\).
Proof
Assume that \(u\in \mathcal {U}\setminus \mathcal {U}^f\) and let \(B:=\{\omega \in \Omega : N(\omega )>k, \,\forall k>0\}\); then \(\mathbb {P}[B]>0\). Furthermore, on B the switching mode must make an infinite number of loops and we have
by Assumption 2.1 (i) and (ii). Now, by the above non-negativity assumption on \(\Upsilon \) and \(\psi \) we have \(J(u)\ge 0\) for \(u=\emptyset \) and the assertion follows. \(\square \)
We end this section with two useful lemmas:
Lemma 2.6
Let \((\gamma _m)_{m\ge 1}\) be a sequence of \(\mathbb {G}\)-stopping times such that \(\gamma _m\nearrow \gamma \) \(\mathbb {P}\)-a.s. for some \(\gamma \in \mathcal {T}\), with \(\gamma \le T\), \(\mathbb {P}\)-a.s. Then, for any \((t,\nu ,\mathbf{a},\mathbf{b})\in \mathcal D_A\)
for all \(\mathcal {G}_\gamma \)-measurable functions g such that \(\sum _{\mathbf{a}\in \mathcal {A}}\mathbb {E}[|g(\mathbf{a})|^2]<\infty \).
Proof
We have
and the last part goes to zero as \(m\rightarrow \infty \). \(\square \)
Lemma 2.7
For any \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\) and any \(s\in [t,T]\) we have
\(\mathbb {P}\)-a.s.
Proof
First note that by definition we have \(\theta ^{t,\nu ,z',\mathbf{a},\mathbf{b}}_r-\theta ^{t,\nu ,z,\mathbf{a},\mathbf{b}}_r= z'-z\). Assumption 2.1.i now gives
\(\mathbb {P}\)-a.s. for all \(r\in [0,T]\). If \(\mathbf{a}\in \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\) then \(A^{t,\nu ,\mathbf{a},\mathbf{b}}_r=A^{t,\nu ',\mathbf{a},\mathbf{b}}_r =\mathbf{a}\), \(\mathbb {P}\)-a.s. for all \(r\in [0,T]\) and the result follows. Assume instead that \(\mathbf{a}\notin \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\) and let \(\eta \) and \(\eta '\) be the first transition times of \(A^{t,\nu ,\mathbf{a},\mathbf{b}}\) and \(A^{t,\nu ',\mathbf{a},\mathbf{b}}\), respectively. For all \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\) and all \(r\in [t,s]\) we let
We note that \((\Gamma ^{t,\nu ,z,\mathbf{a},\mathbf{b}}_t:0\le t\le T)\in \mathcal {S}^2_{\mathbb {F}}\) (in particular it is \(\mathcal {F}_t\)-adapted, since \(\eta >t\), \(\mathbb {P}\)-a.s.) and get that (recall that \(\theta _r^{t,z,\mathbf{a}}=z+(r-t)^+\mathbf{a}^+\))
where we have used that the transition-rate of \(A^{t,\nu ,\mathbf{a},\mathbf{b}}_{\cdot }\) from \(\mathbf{a}\) to \(\mathbf{a}'\) is \(\lambda _{\mathbf{a},\mathbf{a}'}^{\nu ,\mathbf{b}}(\cdot )\), that \(\lambda _{\mathbf{a},\mathbf{a}}^{\nu ,\mathbf{b}}=-\sum _{\mathbf{a}'\in \mathcal {A}^{-\mathbf{a}}_{\mathbf{a},\mathbf{b}}} \lambda _{\mathbf{a},\mathbf{a}'}^{\nu ,\mathbf{b}}\) and the fact that \(\mathbb {E}[Z_{\eta }\mathbb {1}_{[\eta \le s]}|\mathcal {G}_t]=-\mathbb {E}[\int _{t}^sZ_re^{\int _t^r\lambda _{\mathbf{a},\mathbf{a}}^{\nu ,\mathbf{b}}(v)dv}\lambda _{\mathbf{a},\mathbf{a}}^{\nu ,\mathbf{b}}(r)dr|\mathcal {F}_t]\), \(\mathbb {P}\)-a.s. for any \(Z\in \mathcal {S}_{\mathbb {F}}^2\) (see e.g. Corollary 5.1.3 and the preceding comment on p. 148 in [5]). This implies that
Using the identity \(ab-a'b'=1/2((a-a')(b+b')+(a+a')(b-b'))\) we find that
Now, as \(\Gamma ^{r,\nu ',z,\mathbf{a},\mathbf{b}}_r-\Gamma ^{r,\nu ,z,\mathbf{a},\mathbf{b}}_r=0\), \(\mathbb {P}\)-a.s., whenever \(\mathbf{a}\in \mathcal {A}^{\mathrm{abs}}_{\mathbf{b}}\) we can use an induction argument to deduce that
and the assertion follows as the last term is \(\mathbb {P}\)-a.s. bounded by Assumption 2.1.i and Doob’s maximal inequality. \(\square \)
2.1 The Snell Envelope
In this section we gather the main results concerning the Snell envelope that will be useful later on. When presenting the theory we introduce an auxiliary probability space \((\Omega ,\tilde{\mathcal {F}},(\tilde{\mathcal {F}}_t)_{0\le t\le T},\mathbb {P})\) that we assume satisfies the usual conditions, in addition to the filtration \(\tilde{\mathbb {F}}:=(\tilde{\mathcal {F}}_t)_{0\le t\le T}\) being quasi-left continuous. For any \(\tilde{\mathbb {F}}\)-stopping time \(\eta \), we let \(\tilde{\mathcal {T}}_\eta \) be the set of \(\tilde{\mathbb {F}}\)-stopping times \(\tau \) such that \(\eta \le \tau \le T\), \(\mathbb {P}\)-a.s., and recall that a progressively measurable process \((U_t)_{0\le t\le T}\) is of class [D] if the set of random variables \(\{U_\tau :\tau \in \tilde{\mathcal {T}}_0\}\) is uniformly integrable.
Theorem 2.8
(The Snell envelope) Let \(U=(U_t)_{0\le t\le T}\) be an \(\tilde{\mathbb {F}}\)-adapted, \(\mathbb {R}\)-valued, càdlàg process of class [D]. Then there exists a unique (up to indistinguishability) \(\mathbb {R}\)-valued càdlàg process \(Z=(Z_t)_{0\le t\le T}\), called the Snell envelope of U, which is the smallest supermartingale that dominates U. Furthermore, the following holds:
(i) For any stopping time \(\gamma \),
$$\begin{aligned} Z_{\gamma }=\mathop {\mathrm{ess}\,\sup }_{\tau \in \tilde{\mathcal {T}}_{\gamma }}\mathbb {E}\left[ U_\tau \big |\tilde{\mathcal {F}}_\gamma \right] . \end{aligned}$$(2.2)
(ii) The Doob-Meyer decomposition of the supermartingale Z implies the existence of a triple \((M,K^c,K^d)\) where \((M_t:0\le t\le T)\) is a uniformly integrable right-continuous martingale, \((K^c_t:0\le t\le T)\) is a non-decreasing, predictable, continuous process with \(K^c_0=0\) and \((K^d_t:0\le t\le T)\) is a non-decreasing, purely discontinuous, predictable process with \(K^d_0=0\), such that
$$\begin{aligned} Z_t=M_t-K^c_t-K^d_t. \end{aligned}$$(2.3)
Furthermore, \(\{\Delta _t K^d>0\}\subset \{\Delta _t U<0\}\cap \{Z_{t^-}=U_{t^-}\}\) for all \(t\in [0,T]\).
(iii) Let \(\eta \in \tilde{\mathcal {T}}_0\) and assume that U is non-negative, \(L^2\)-bounded and that \(\limsup _{m\rightarrow \infty }\mathbb {E}[U_{\gamma _m}]\le \mathbb {E}[U_{\gamma }]\) whenever \(\gamma _m\nearrow \gamma \in \tilde{\mathcal {T}}_\eta \). Then, the stopping time \(\tau ^*_{\eta }\) defined by \(\tau ^*_{\eta }:=\inf \{s\ge \eta :Z_s=U_s\}\wedge T\) is optimal after \(\eta \), i.e.
$$\begin{aligned} Z_{\eta }=\mathbb {E}\left[ U_{\tau ^*_\eta }\big |\tilde{\mathcal {F}}_\eta \right] . \end{aligned}$$
Furthermore, in this setting the Snell envelope, Z, is left continuous in expectation, i.e. \(K^d\equiv 0\), and Z is a martingale on \([\eta ,\tau ^*_\eta ]\).
(iv) Let \(U^k\) be a sequence of càdlàg processes converging increasingly and pointwise to the càdlàg process U and let \(Z^k\) be the Snell envelope of \(U^k\). Then the sequence \(Z^k\) converges increasingly and pointwise to a process Z, and Z is the Snell envelope of U.
(v) We have the following dynamic programming relation: for any \(\gamma \in \tilde{\mathcal {T}}_0\) and any \(\tilde{\mathbb {F}}\)-stopping time \(\eta \) with \(\eta \ge \gamma \), \(\mathbb {P}\)-a.s., we have
$$\begin{aligned} Z_\gamma =\mathop {\mathrm{ess}\,\sup }_{\tau \in \tilde{\mathcal {T}}_{\gamma }}\mathbb {E}\left[ \mathbb {1}_{[\eta \le \tau ]}Z_\eta +\mathbb {1}_{[\tau <\eta ]}U_\tau \big |\tilde{\mathcal {F}}_\gamma \right] . \end{aligned}$$
In the above theorem (i)–(iii) are standard. Proofs can be found in [15] (see [29] for an English version), Appendix D in [18, 25, 27] and in the appendix of [12]. Statement (iv) was proved in [13]. The last statement follows by noting that
To get the second equality above we note that if \((\tau _j)_{j\ge 0}\) is an increasing maximizing sequence for the outer supremum and \((\tau '_j)_{j\ge 0}\) an increasing maximizing sequence for the inner supremum in the second expression on the first row, then \(\tilde{\tau }_j=\mathbb {1}_{[\tau _j<\eta ]}\tau _j+\mathbb {1}_{[\tau _j\ge \eta ]}\tau '_j\) is a maximizing sequence for the expression on the second row, and the two values must be equal.
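In discrete time, statement (i) together with the dynamic programming relation (v) reduces to the familiar backward induction \(Z_N=U_N\), \(Z_t=\max (U_t,\mathbb {E}[Z_{t+1}\,|\,\mathcal {F}_t])\). A minimal Python sketch on a recombining binomial tree (the tree model and its probabilities are our illustrative assumptions, not part of the paper's setting):

```python
def snell_envelope(U, p=0.5):
    """Backward induction Z_N = U_N, Z_t = max(U_t, E[Z_{t+1} | F_t]) on a
    recombining binomial tree: node k at time t moves to node k (prob. 1-p)
    or node k+1 (prob. p) at time t+1.  U[t][k] is the stopping reward."""
    Z = [list(U[-1])]                      # Z at the terminal time
    for t in range(len(U) - 2, -1, -1):
        nxt = Z[0]
        cont = [(1 - p) * nxt[k] + p * nxt[k + 1] for k in range(t + 1)]
        Z.insert(0, [max(u, c) for u, c in zip(U[t], cont)])
    return Z

# Reward of stopping immediately in each node
U = [[1.0],
     [2.0, 0.0],
     [0.0, 3.0, 0.0]]
Z = snell_envelope(U)
print(Z[0][0])  # → 1.75: waiting beats the immediate reward 1.0 at the root
```

The optimal stopping time of statement (iii) is then read off as the first node where Z equals U.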
The Snell envelope will be the main tool in showing that Problem 1 has a solution.
3 A Verification Theorem
The method for solving Problem 1 will be based on deriving an optimal control under the assumption that a specific family of processes exists, and then (in the next section) showing that the family indeed does exist. We will refer to any such family of processes as a verification family.
Definition 3.1
We define a verification family to be a family of càdlàg supermartingales \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\) such that:
(a) For every \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\) we have \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}\in \mathcal {S}_e^2\) and \((Y^{s,\nu ,z,\mathbf{a},\mathbf{b}}_s:0\le s\le T)\in \mathcal {S}_{\mathbb {F},e}^2\).
(b) The family is bounded in the sense that \(\mathbb {E}[\sup \limits _{(t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D}\sup \limits _{s\in [0,T]} |Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s|^2]<\infty \).
(c) The family is continuous in \((\nu ,z)\) in the sense that
$$\begin{aligned} \lim _{(p,q)\rightarrow (0,0)}\mathbb {E}\Big [\sup _{(t,\nu ,z)\in \mathcal D_{(\mathbf{a},\mathbf{b})}}|Y^{t, (\nu +p)^+\wedge \mathbf{b}T,(z+q)^+\wedge T\mathbf{a}^+,\mathbf{a},\mathbf{b}}_{t}-Y^{t,\nu ,z, \mathbf{a},\mathbf{b}}_t|^2\Big ]= 0, \end{aligned}$$
for every \((\mathbf{a},\mathbf{b})\in \mathcal {J}\).
(d) The family satisfies the recursion
$$\begin{aligned} Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s&=\mathop {\mathrm{ess}\,\sup }_{\tau \in \mathcal {T}_{s}} \mathbb {E}\bigg [ \int _s^{\tau \wedge T}\psi _{A^{t,\nu ,\mathbf{a},\mathbf{b}}_r}\left( r,\theta _r^{t, \nu ,z,\mathbf{a},\mathbf{b}}\right) dr+\mathbb {1}_{[\tau \ge T]}\Upsilon _{A^{t,\nu ,\mathbf{a},\mathbf{b}}_T}\left( \theta _T^{t,\nu ,z,\mathbf{a},\mathbf{b}}\right) \nonumber \\&\quad +\mathbb {1}_{[\tau < T]}\max _{\beta \in \mathcal {I}}\left\{ -c_{\mathbf{b},\beta } (\tau )+Y^{\tau ,\beta \nu + \tau (\beta -\mathbf{b})^+,\theta _\tau ^{t,\nu ,z,\mathbf{a},\mathbf{b}}\wedge T\beta ,A^{t,\nu ,\mathbf{a},\mathbf{b}}_\tau \wedge \beta ,\beta }_\tau \right\} \Big | \mathcal {G}_s\bigg ]. \end{aligned}$$(3.1)
The purpose of the present section is to reduce the solution of Problem 1 to showing existence of a verification family. This is done in the verification theorem below. First we give a lemma that will be used in the proof of the verification theorem:
Lemma 3.2
Let \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\) be a verification family, then
Proof
For \(s\in [0,t]\) the result follows immediately from property (c) and Doob’s maximal inequality. We thus let \(s\in (t,T]\) and note that if \(\mathbf{a}\in \mathcal {A}^{\mathrm{abs}}\) then
Now, since \(\theta _s^{t,(z+q)^+\wedge T\mathbf{a}^+,\mathbf{a}}-\theta _s^{t, z,\mathbf{a}}=(z+q)^+\wedge T\mathbf{a}^+-z\) we get that
\(\mathbb {P}\)-a.s. Put together, this implies that
which tends to 0 as \((p,q)\rightarrow (0,0)\) by property (c), where to get the last inequality we have applied Doob’s maximal inequality. For \(\mathbf{a}\in \mathcal {A}_{\mathbf{b}}\setminus \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\) we find, by arguing as in the proof of Lemma 2.7, that for \((t,\nu ,z)\) and \((t,\nu ',z')\) in \(\mathcal D_{(\mathbf{a},\mathbf{b})}\),
where \(\eta \) and \(\eta '\) are the first transition times of \(A^{t,\nu ,\mathbf{a},\mathbf{b}}\) and \(A^{t,\nu ',\mathbf{a},\mathbf{b}}\), respectively. Using the relation \(ab-a'b'=1/2((a-a')(b+b')+(a+a')(b-b'))\) we get that
By again applying Doob’s maximal inequality and an induction argument in the \(\mathbf{a}\)-component, combined with the square integrability property in (b), the assertion follows. \(\square \)
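The product identity invoked in the proof can be checked by direct expansion:

```latex
\tfrac{1}{2}\big((a-a')(b+b')+(a+a')(b-b')\big)
  =\tfrac{1}{2}\big(ab+ab'-a'b-a'b'+ab-ab'+a'b-a'b'\big)
  =ab-a'b'.
```

It splits a difference of products into one factor controlled by the continuity estimates and one controlled by the \(L^2\)-bound in property (b).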
We have the following verification theorem:
Theorem 3.3
Assume that there exists a verification family \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\). Then \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\) is unique (i.e. there is at most one verification family, up to indistinguishability) and:
(i)
Satisfies \(Y_0^{0,0,0,0,0}=\sup _{u\in \mathcal {U}} J(u)\).
(ii)
Defines the optimal strategy, \(u^*=(\tau _1^*,\ldots ,\tau _{N^*}^*;\beta _1^*,\ldots ,\beta _{N^*}^*)\), for Problem 1, where \((\tau _j^*)_{1\le j\le {N^*}}\) is a sequence of \(\mathbb {G}\)-stopping times given by
$$\begin{aligned} \tau ^*_j&:=\inf \Big \{s \ge \tau ^*_{j-1}:\,Y_s^{\tau ^*_{j-1},\vartheta ^*_{j-1},z^*_{j-1},\mathbf{a}^*_{j-1}, \beta ^*_{j-1}}=\max _{\beta \in \mathcal {I}}\Big \{-c_{\beta ^*_{j-1},\beta }(s)\nonumber \\&\qquad +Y^{s,\beta \vartheta ^*_{j-1}+s(\beta -\beta ^*_{j-1})^+, \theta _s^{\tau _{j-1}^*,\vartheta ^*_{j-1},z_{j-1}^*,\mathbf{a}^*_{j-1},\beta _{j-1}^*}\wedge T\beta ,A^{\tau ^*_{j-1},\vartheta ^*_{j-1},\mathbf{a}^*_{j-1},\beta ^*_{j-1}}_{s} \wedge \beta ,\beta }_s\Big \}\Big \}, \end{aligned}$$(3.2)\((\beta _j^*)_{1\le j\le {N^*}}\) is defined as a measurable selection of
$$\begin{aligned}&\beta ^*_j\in \mathop {\arg \max }_{\beta \in \mathcal {I}}\Big \{-c_{\beta ^*_{j-1},\beta }(\tau ^*_j) \\&\qquad + Y^{\tau ^*_j,\beta \vartheta ^*_{j-1}+\tau ^*_j(\beta -\beta ^*_{j-1})^+, \theta _{\tau ^*_j}^{\tau _{j-1}^*,\vartheta ^*_{j-1},z_{j-1}^*,\mathbf{a}^*_{j-1}, \beta _{j-1}^*}\wedge T\beta ,A^{\tau ^*_{j-1},\vartheta ^*_{j-1},\mathbf{a}^*_{j-1},\beta ^*_{j-1}}_{\tau ^*_j} \wedge \beta ,\beta }_{\tau ^*_j}\Big \}, \end{aligned}$$where \(\vartheta ^*_{j}=\beta ^*_j\vartheta ^*_{j-1}+\tau ^*_j(\beta ^*_j-\beta ^*_{j-1})^+\), \(z^*_{j}:=\theta _{\tau _{j}^*}^{\tau ^*_{j-1},\vartheta ^*_{j-1},z_{j-1}^*, \mathbf{a}^*_{j-1},\beta ^*_{j-1}}\wedge T\beta ^*_{j}\) and \(\mathbf{a}^*_{j}:=A^{\tau ^*_{j-1},\vartheta ^*_{j-1},\mathbf{a}^*_{j-1}, \beta _{j-1}^*}_{\tau ^*_{j}}\wedge \beta _j^*\), with \((\tau ^*_0,\vartheta ^*_0,z^*_0,\mathbf{a}^*_0,\beta ^*_{0}):=(0,0,0,0,0)\) and \(N^*:=\max \{j:\tau _j^*<T\}\).
Proof
Note that the proof amounts to showing that for all \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\), we have
for all \(s\in [t,T]\), where \(\beta _0=\mathbf{b}\). Then uniqueness is immediate, (i) follows from Proposition 2.5 and (ii) follows from repeated use of Theorem 2.8.iii.
Step 1 We start by showing that for all \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\) the recursion (3.1) can be written in terms of stopping times. From (3.1) we have that, for each \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\),
is the smallest supermartingale that dominates the process
We will show that under assumptions (a)–(c) the dominated process satisfies the assumptions of Theorem 2.8.iii after which the assertion follows. Fix \(\mathbf{b}'\in \mathcal {I}^{-\mathbf{b}}\) and note that for all \(s\le s'\le T\) we have the following trivial relation
If \(\gamma _m\) is a sequence of stopping times such that \(\gamma _m\nearrow \gamma \in \mathcal {T}\), \(\mathbb {P}\)-a.s., we thus have
where the first equality follows by (a) and (c) and the second equality follows from the \(L^2\)-boundedness assumed in (b) in combination with Lemma 2.6.
The dominated process is thus \(L^2\)-bounded by Assumption 2.1 and (b), positive by Remark 2.4 and SLCE on [0, T). At time T it may have a jump but the jump has to be positive by Assumption 2.1.iii. Theorem 2.8.iii now implies that, for each \(\gamma \in \mathcal {T}\), there is a stopping time, \(\tau _{\gamma }\in \mathcal {T}_\gamma \), such that:
Step 2 We now show that \(Y^{0,0,0,0,0}_0=J(u^*)\). First, define
Then by Theorem 2.8, \(Z_s\) is the smallest supermartingale that dominates
and by step 1
Now suppose that, for some \(j'>0\) we have, for all \(j\le j'\),
\(\mathbb {P}\)-a.s., for each \(\tau ^*_{j-1}\le s\le T\). We now show that the same equality holds for \(j'+1\). This will be done over three sub-steps:
Sub-step (a) \(Y_\cdot ^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}} + \int _{\tau ^*_{j'}}^\cdot \psi _{A^{\tau ^*_{j'},\vartheta ^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}}_r} (r,\theta _r^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}})dr\) is a càdlàg supermartingale. For all \(M\ge 1\), let \((G_k^M)_{1\le k \le M}\) be a partition of \([0,T]^n\) into sets of the type \(G_k^M=[z_{M,k,1},z_{M,k,1}')\times \cdots \times [z_{M,k,n},z_{M,k,n}')\) where \(z_{M,k,i}< z_{M,k,i}'\) and \(\max _{k,i}|z_{M,k,i}-z_{M,k,i}'|\rightarrow 0\) as \(M\rightarrow \infty \) and let \((\kappa _{k}^M)_{1\le k\le M}\) be the sequence of points in \(\mathbb {R}^n\) given by \([\kappa _k^M]_i:=z_{M,k,i}\). For \(M\ge 1\) and \(t\ge \tau ^*_{j'}\), we define the process
for all \(s\in [t,T]\). Now, \(\mathbb {1}_{[A_{t}^{\tau ^*_{j'},\vartheta ^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}}=\mathbf{a}]}\mathbb {1}_{[\beta ^*_{j'}=\mathbf{b}]}\mathbb {1}_{[\vartheta ^*_{j'}\in G_k^M]}\mathbb {1}_{[z^*_{j'}\in G_l^M]}\mathbb {1}_{[\theta _t^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}}\in G_l^M]}\) \(\cdot \Big (Y_s^{t,\kappa _{k}^M,\kappa _{l}^M,\mathbf{a},\mathbf{b}} + \int _{t}^s\psi _{A^{t,\kappa _{k}^M,\mathbf{a},\mathbf{b}}_r}(r,\theta _r^{t,\kappa _{k}^M,\kappa _{l}^M,\mathbf{a},\mathbf{b}})dr\Big )\) is the product of a \(\mathcal {G}_{t}\)–measurable positive r.v. and a supermartingale, thus, it is a supermartingale for \(s\ge t\). Hence, as
is the sum of a finite number of supermartingales it is also a supermartingale. By Lemma 3.2 we have that
as \(M\rightarrow \infty \). This implies that there is a subsequence \((M_\iota )_{\iota \ge 1}\) such that
\(\mathbb {P}\)-a.s. as \(\iota \rightarrow \infty \). In particular, we note that \((Y^{t,M_\iota }_t:\tau ^*_{j'}\le t\le T)\) is a sequence of càdlàg processes that converges \(\mathbb {P}\)-a.s. to \((Y_t^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}}:\tau ^*_{j'}\le t\le T)\) uniformly in t and we conclude that \((Y_t^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}}:\tau ^*_{j'}\le t\le T)\) is a càdlàg process.
Furthermore, by Lemma 2.7 and dominated convergence we get
\(\mathbb {P}\)-a.s., for all \(s\in [t,T]\). This implies that for all \(\tau ^*_{j'}\le t\le s \) we have
\(\mathbb {P}\)-a.s. where we have used the supermartingale property to reach the inequality. Hence, \(\Big (Y_s^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}} + \int _{\tau ^*_{j'}}^s\psi _{A^{\tau ^*_{j'},\vartheta ^*_{j'},\mathbf{a}^*_{j'}, \beta ^*_{j'}}_r}(r,\theta _r^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'}, \mathbf{a}^*_{j'}, \beta ^*_{j'}})dr:\, \tau ^*_{j'}\le s\le T\Big )\) is a càdlàg supermartingale.
Sub-step (b) \(Y_\cdot ^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}} + \int _{\tau ^*_{j'}}^\cdot \psi _{A^{\tau ^*_{j'},\vartheta ^*_{j'},\mathbf{a}^*_{j'}, \beta ^*_{j'}}_r} (r,\theta _r^{\tau ^*_{j'},\vartheta ^*_{j'},z^*_{j'},\mathbf{a}^*_{j'},\beta ^*_{j'}})dr\) is the Snell envelope of
We note that \(Y^{t,M}_{s}\) dominates the càdlàg process
Now, if Z is another càdlàg supermartingale that dominates \(U^{t,M}_s\), then by (3.1) we have
and summing over all \((\mathbf{a},\mathbf{b},k,l)\in \mathcal {J}\times \{1,\ldots ,M\}^2\) we conclude that \(Y^{t,M}\) is the Snell envelope of \(U^{t,M}\).
Using Lemma 2.7 and property (c) we find that
as \(\iota \rightarrow \infty \). Hence, there is a subsequence \((\tilde{M}_\iota )_{\iota \ge 0}\subset (M_\iota )_{\iota \ge 0}\) such that
uniformly in t as \(\iota \rightarrow \infty \) and we conclude that U is a càdlàg process. Appealing once again to Lemma 2.7 and property (c) the statement follows.
Sub-step (c) \(U\in \mathcal {S}^2_e\). We note that the results obtained in Step 1 imply that for any sequence \((\gamma _l)_{l\ge 0}\subset \mathcal {T}_{\tau ^*_{j'}}\) with \(\gamma _l\nearrow \gamma \in \mathcal {T}\), where \(\gamma \le T\), \(\mathbb {P}\)-a.s., we have \(\lim _{l\rightarrow \infty }\mathbb {E}[U^{\tau ^*_{j'},M}_{\gamma _l}]\le \mathbb {E}[U^{\tau ^*_{j'},M}_{\gamma }]\) for all \(M\ge 1\). Now, for all \(\iota \ge 0\) this gives
where the last term can be made arbitrarily small and we, thus, have that \(\lim _{l\rightarrow \infty }\mathbb {E}[U_{\gamma _l}]\le \mathbb {E}[U_{\gamma }]\).
By Theorem 2.8.iii we get
\(\mathbb {P}\)-a.s. By induction we get that for each \(K\ge 0\)
where \(\tau ^*_{N^*+1}=\tau ^*_{N^*+2}=\cdots =\infty \). Now, arguing as in the proof of Proposition 2.5 we find by property (b) that \(u^*\in \mathcal {U}^f\). Letting \(K\rightarrow \infty \) we conclude that \(Y^{0,0,0,0,0}_0=J(u^*)\).
Step 3 It remains to show that the strategy \(u^*\) is optimal. To do this we pick any other strategy \(\hat{u}:=(\hat{\tau }_1,\ldots ,\hat{\tau }_{\hat{N}};\hat{\beta }_1,\ldots ,\hat{\beta }_{\hat{N}})\in \mathcal {U}^f\) and let the triple \((\hat{\vartheta }_j,\hat{z}_j,\hat{\mathbf{a}}_j)_{1\le j\le \hat{N}}\) be defined by the recursions \(\hat{\vartheta }_j:=\hat{\vartheta }_{j-1}\hat{\beta }_j+(\hat{\beta }_j-\hat{\beta }_{j-1})^+\hat{\tau }_j\), \(\hat{z}_{j}:=\theta _{\hat{\tau }_{j}}^{\hat{\tau }_{j-1},\hat{\vartheta }_{j-1},\hat{z}_{j-1},\hat{\mathbf{a}}_{j-1},\hat{\beta }_{j-1}}\wedge T\hat{\beta }_{j}\) and \(\hat{\mathbf{a}}_{j}:=A^{\hat{\tau }_{j-1},\hat{\vartheta }_{j-1},\hat{\mathbf{a}}_{j-1},\hat{\beta }_{j-1}}_{\hat{\tau }_{j}}\wedge \hat{\beta }_j\), with \((\hat{\tau }_0,\hat{\vartheta }_0,\hat{z}_0,\hat{\mathbf{a}}_0,\hat{\beta }_0):=(0,0,0,0,0)\). By the definition of \(Y^{0,0,0,0,0}_0\) in (3.1) we have
but in the same way
\(\mathbb {P}\)–a.s. By repeating this argument and using the dominated convergence theorem we find that \(J(u^*)\ge J(\hat{u})\) which proves that \(u^*\) is in fact optimal. \(\square \)
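As a toy illustration of how Theorem 3.3 reads a strategy off the value processes, the deterministic two-mode sketch below (hypothetical payoffs and switching cost; none of the paper's randomness, lag or failure features) computes the values by backward induction and then extracts the pairs \((\tau ^*_j,\beta ^*_j)\) forward, as the first times the value touches the intervention obstacle, in the spirit of (3.2).

```python
import math

# Hypothetical two-mode toy: mode 0 pays nothing, mode 1 pays cos(pi*t);
# switching in either direction costs c. Deterministic stand-in for (3.1)/(3.2).
n = 200
dt = 1.0 / n
t = [i * dt for i in range(n + 1)]
psi = {0: [0.0] * (n + 1),
       1: [math.cos(math.pi * s) for s in t]}
c = 0.05

# Backward induction for the value of being in mode b at time t[i]; the
# switch branch uses the other mode's continuation value, since an immediate
# double switch is never optimal when c > 0. Terminal reward is taken to be 0.
Y = {0: [0.0] * (n + 1), 1: [0.0] * (n + 1)}
for i in range(n - 1, -1, -1):
    for b in (0, 1):
        cont = psi[b][i] * dt + Y[b][i + 1]
        obstacle = -c + psi[1 - b][i] * dt + Y[1 - b][i + 1]
        Y[b][i] = max(cont, obstacle)

# Forward extraction in the spirit of (3.2): intervene at the first time the
# value equals the intervention obstacle, then repeat from the new mode.
b, switches = 0, []
for i in range(n):
    obstacle = -c + psi[1 - b][i] * dt + Y[1 - b][i + 1]
    if Y[b][i] <= obstacle + 1e-12:
        switches.append((t[i], 1 - b))
        b = 1 - b
```

In this toy the extracted strategy switches into mode 1 immediately and back to mode 0 once the running payoff turns negative, near \(t=1/2\).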
Remark 3.4
Note that the above proof can be trivially extended to arbitrary initial conditions \((z^*_0,\mathbf{a}^*_0,\beta ^*_0)\in \cup _{(\mathbf{a},\mathbf{b})\in \mathcal {J}}[0,T]^{\mathbf{a}^+}\times (\mathbf{a},\mathbf{b})\).
4 Existence
Theorem 3.3 presumes existence of the family \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\). To obtain a satisfactory solution to Problem 1, we thus need to establish that there exists a family of processes satisfying properties (a)–(d) in the definition of a verification family. We will follow the standard existence proof, which proceeds by Picard iteration (see [10, 13, 20]). We thus define a sequence \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)_{k\ge 0}\) of families of processes as
and
for \(k\ge 1\).
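To make the iteration concrete, here is a minimal deterministic sketch (hypothetical payoffs and costs; no randomness, lag or component failure): \(Y^0\) is the never-switch value, and each Picard step allows one further switch, so the iterates increase in k and stabilise after finitely many steps, mirroring the monotone convergence established below.

```python
import math

# Hypothetical two-mode example: mode 0 pays 0, mode 1 pays sin(2*pi*t)+0.3;
# strictly positive switching costs (no free loop).
n = 100
dt = 1.0 / n
t = [i * dt for i in range(n + 1)]
psi = {0: [0.0] * (n + 1),
       1: [math.sin(2 * math.pi * s) + 0.3 for s in t]}
Ups = {0: 0.0, 1: 0.0}           # terminal rewards
c = {(0, 1): 0.1, (1, 0): 0.1}   # switching costs

def picard_step(Yprev):
    """Value with at most one more switch than Yprev, by backward induction."""
    Y = {b: [0.0] * (n + 1) for b in (0, 1)}
    for b in (0, 1):
        Y[b][n] = Ups[b]
    for i in range(n - 1, -1, -1):
        for b in (0, 1):
            cont = psi[b][i] * dt + Y[b][i + 1]          # stay in mode b
            switch = max(-c[(b, bp)] + Yprev[bp][i]      # switch now, follow Yprev
                         for bp in (0, 1) if bp != b)
            Y[b][i] = max(cont, switch)
    return Y

# Y^0: the never-switch value in each mode.
Y = {b: [sum(psi[b][i:n]) * dt + Ups[b] for i in range(n + 1)]
     for b in (0, 1)}
values = [Y[0][0]]
for _ in range(10):
    Y = picard_step(Y)
    values.append(Y[0][0])
```

Here `values` is nondecreasing and constant after a few iterations, since each additional switch beyond the second costs more than it earns in this example.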
In this section we will show that the limiting family, \(((\tilde{Y}^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\), obtained by letting \(k\rightarrow \infty \) is a verification family, thus proving existence of an optimal control for Problem 1. This will be done in a number of steps: we first show that for each k the family defined by the above recursions satisfies properties (a)–(c), and then that property (d) follows from Theorem 2.8.iv. First, however, we show that the family defined above is uniformly \(L^2\)-bounded. We let \(\bar{\psi }:=\max _{\mathbf{a}'\in \mathcal {A}}\max _{z\in [0,T]^{\mathbf{a}'}}|\psi _{\mathbf{a}'}(\cdot ,z)|\), \(\bar{\Upsilon }:=\max _{\mathbf{a}'\in \mathcal {A}}\max _{z\in [0,T]^{\mathbf{a}'}}|\Upsilon _{\mathbf{a}'}(z)|\) and define
We have the following:
Proposition 4.1
For each \(k\ge 0\), the family of processes \(\big (Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\big )\) is \(L^2\)-bounded in the sense that there is a constant \(K_Y>0\) such that
for all \(k\ge 0\).
Proof
Let \(\bar{Y}^k=\sup _{(t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D}|Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}|\). Since \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}\ge 0\), applying an induction argument gives
Hence, by Doob’s maximal inequality we get
where the right hand side is bounded by Assumption 2.1. \(\square \)
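For reference, the maximal inequality used above is Doob's \(L^2\)-inequality: for a right-continuous martingale (or nonnegative submartingale) \((M_s)_{0\le s\le T}\),

```latex
\mathbb{E}\Big[\sup_{0\le s\le T}|M_s|^2\Big]\le 4\,\mathbb{E}\big[|M_T|^2\big].
```

In the proof it converts the pointwise bound on \(\bar{Y}^k\) into a bound on the running supremum.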
It should be noted that the above bound is uniform in k which implies that the limit family (if it exists) satisfies the same inequality. In particular, we conclude that property (b) holds for all k. Properties (a) and (c) will be shown by induction and we make the following induction hypothesis:
H.k
The family \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\) is such that:
(i)
For every \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\) we have \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}\in \mathcal {S}_e^2\) and \((Y^{s,\nu ,z,\mathbf{a},\mathbf{b},k}_s:0\le s\le T)\in \mathcal {S}_{\mathbb {F},c}^2\).
(ii)
For every \((\mathbf{a},\mathbf{b})\in \mathcal {J}\), we have the following continuity property
$$\begin{aligned} \lim _{(p,q)\rightarrow (0,0)}\mathbb {E}\Big [\sup _{(t,\nu ,z)\in \mathcal D_{(\mathbf{a},\mathbf{b})}}|Y^{t,(\nu +p)^+\wedge \mathbf{b}T,(z+q)^+\wedge T\mathbf{a}^+,\mathbf{a},\mathbf{b},k}_{t}-Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_t|^2\Big ]= 0. \end{aligned}$$
We note that, under the induction hypotheses H.0–H.k, arguing as in the proof of Theorem 3.3 we have
where the supremum is attained by a control \(u^{*,k+1}\in \mathcal {U}_s^{k+1}\). Furthermore, the characterisation of \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_t\) in terms of the recursions (4.1) and (4.2) can be further simplified by noting that the possible transitions of \(A^{t,\nu ,\mathbf{a},\mathbf{b}}_\cdot \) form paths in a directed acyclic graph whose leaves are the members of the set \(\mathcal {A}^\mathrm{abs}_{\mathbf{b}}\). Letting \(\eta \in \mathcal {T}_t\) denote the first transition time of \(A^{t,\nu ,\mathbf{a},\mathbf{b}}_\cdot \), Theorem 2.8.v allows us to write (recall that \(\theta _s^{t,z,\mathbf{a}}=z+\mathbf{a}^+(s-t)^+\))
and
where in both equations the fact that \(\eta > t\), \(\mathbb {P}\)-a.s., allows us to take conditional expectation with respect to \(\mathcal {F}_t\) instead of \(\mathcal {G}_t\). Actually, since \(A^{t,\nu ,\mathbf{a},\mathbf{b}}\) is a pure jump Markov process, we have that \(\sigma (A^{t,\nu ,\mathbf{a},\mathbf{b}}_s:0\le s\le \cdot )\) is generated by sets of the type \(\{s<\eta \}\) on \([0,\eta )\). Hence, for each \(\tau \in \mathcal {T}_t\) there is a \(\tilde{\tau }\in \mathcal {T}^{\mathbb {F}}_{t}\) such that \(\tilde{\tau }\wedge \eta =\tau \wedge \eta \), \(\mathbb {P}\)-a.s. and we only need to take the essential supremum over \(\mathcal {T}^{\mathbb {F}}_t\) (the set of \(\mathbb {F}\)-stopping times \(\tau \ge t\)) in (4.5).
Furthermore, we note that when \(\mathbf{a}\in \mathcal {A}^\mathrm{abs}_\mathbf{b}\), then by the definition of \(A^{t,\nu ,\mathbf{a},\mathbf{b}}\) we have \(\eta =\infty \) and thus
and
We start by showing that the induction hypothesis holds for \(k=0\):
Proposition 4.2
The induction hypothesis H.0 holds.
Proof
We first show continuity in z. This follows by the Lipschitz assumptions on \(\psi \) and \(\Upsilon \) together with the definition of \(\theta ^{t,\nu ,z,\mathbf{a},\mathbf{b}}\). Indeed we have
for all \(s\in [0,T]\). It thus follows immediately from (4.1) that
\(\mathbb {P}\)-a.s. and in particular we have
From this it is immediate that H.0.ii holds whenever \(\mathbf{a}\in \mathcal {A}^\mathrm{abs}_\mathbf{b}\). Concerning H.0.i, we note that for the general case
is the sum of a continuous process and a martingale in a quasi-left continuous filtration, hence, it has a version which is càdlàg and SLCE and, thus, belongs to \(\mathcal {S}^2_e\) by Proposition 4.1.
Let \((t,\nu ,z)\in \mathcal D_{(\mathbf{a},\mathbf{b})}\) and assume that \(t' \in [t,T]\). To evaluate \(|Y^{t,\nu ,z,\mathbf{a},\mathbf{b},0}_{t}-Y^{t',\nu ,z,\mathbf{a},\mathbf{b},0}_{t'}|\) we note that during \([t,t']\) we may have a transition in \(A^{t,\nu ,\mathbf{a},\mathbf{b}}\) (the probability of which is bounded by \(1-e^{-K_\lambda (t'-t)}\)) and thus have
Note that the first two terms on the right-hand side of the above equation satisfy:
\(\mathbb {P}\)-a.s. as \(|t'-t|\rightarrow 0\) by \(\mathbb {P}\)-a.s. boundedness of \(\bar{\psi }\) and \(\bar{Y}\).
Concerning the third term let us again consider the case when \(\mathbf{a}\in \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\). Define \(\mathcal {K}_{t,t'}(X):=|\mathbb {E}\Big [X\big |\mathcal {F}_t\Big ]-\mathbb {E}\Big [X\big |\mathcal {F}_{t'}\Big ]|\), then \(\mathcal {K}_{t,t'}\) is a subadditive operator and
where \(\delta (M)\) is the diameter of a partition \((G^M_l)_{l=1}^M\) of \([0,T]^{\mathbf{a}^+}\) and \(z^M_l\in G^M_l\) and we note that the integrals are well defined by Assumption 2.1.i. The last of the above inequalities follows by noting that for all \(z,z'\in [0,T]^{\mathbf{a}^+}\)
Now, by the martingale representation theorem there is, for each \((s,l)\in [0,T]\times \{1,\ldots ,M\}\), a process \(Z^{\psi ,\mathbf{a},s,l,M}\in \mathcal {H}^2_{\mathbb {F}}\) such that \(\mathbb {E}\Big [\psi _{\mathbf{a}}(s,z^M_l)\big |\mathcal {F}_t\Big ]=\mathbb {E}\Big [\psi _{\mathbf{a}}(s,z^M_l)\Big ] +\int _0^t Z^{\psi ,\mathbf{a},s,l,M}_rdW_r\) for all \(t\in [0,T]\), \(\mathbb {P}\)-a.s. (where the exception set can be chosen independent of s since \((\psi _\mathbf{a}(s,z_l^M):0\le s\le T)\) is càdlàg) and a process \(Z^{\Upsilon ,\mathbf{a},l,M}\in \mathcal {H}^2_{\mathbb {F}}\) such that \(\mathbb {E}\Big [\Upsilon _{\mathbf{a}}(z^M_l)\big |\mathcal {F}_t\Big ]=\mathbb {E}\Big [\Upsilon _{\mathbf{a}}(z^M_l) \Big ]+\int _0^t Z^{\Upsilon ,\mathbf{a},l,M}_rdW_r\) for all \(t\in [0,T]\), \(\mathbb {P}\)-a.s. We thus have
where the integral w.r.t. ds is well defined since the integrand is \(\mathbb {P}\)-a.s. equal to that on the last row of (4.9). Noting that \(\int _{t}^{t'}Z^{\psi ,\mathbf{a},s,l,M}_rdW_r\rightarrow 0\) and \(\int _{t}^{t'}Z^{\Upsilon ,\mathbf{a},l,M}_rdW_r\rightarrow 0\) as \(|t'-t|\rightarrow 0\), \(\mathbb {P}\)-a.s. (exception set independent of s, t and \(t'\)) and using dominated convergence on the first term we thus get that
\(\mathbb {P}\)-a.s., where \(\delta (M)\) can be made arbitrarily small. We conclude that H.0 holds for all \(\mathbf{a}\in \mathcal {A}^\mathrm{abs}_\mathbf{b}\).
We now apply an induction scheme and assume that for some \(\mathbf{a}\in \mathcal {A}_\mathbf{b}\) hypothesis H.0 holds and
for all \(\mathbf{a}'\in \mathcal {A}^{-\mathbf{a}}_{\mathbf{a},\mathbf{b}}\).
We consider continuity in \(\nu \). We have, with \(\eta \) the first transition time of \(A^{t,\nu ,\mathbf{a},\mathbf{b}}\) and \(\eta '\) the first transition time of \(A^{t,\nu ',\mathbf{a},\mathbf{b}}\),
Using the identity \(ab-a'b'=1/2((a-a')(b+b')+(a+a')(b-b'))\) gives
Noting that the same equality holds for \(Y^{t,\nu ',z,\mathbf{a},\mathbf{b},0}_{t}-Y^{t,\nu ,z,\mathbf{a},\mathbf{b},0}_{t}\), the induction argument combined with the fact that
and \(\mathbb {E}[|Y^{s,\nu ,\theta _{s}^{t,z,\mathbf{a}}\wedge T\mathbf{a}',\mathbf{a}',\mathbf{b},0}_s+Y^{s,\nu ',\theta _{s}^{t,z,\mathbf{a}}\wedge T\mathbf{a}',\mathbf{a}',\mathbf{b},0}_s||\mathcal {F}_t]\le 2\bar{Y}_t\) (see the proof of Proposition 4.1) gives that
\(\mathbb {P}\)-a.s. Applying Doob’s maximal inequality, we find that, for \(p\in \mathbb {R}^{n}\)
Put together, this implies that H.0.ii holds for \(\mathbf{a}\).
Now when \(\mathbf{a}\notin \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\) we let \((G_l^M)_{l=1}^M\) be a partition of \(\mathcal D_{(\mathbf{a},\mathbf{b})}\) with \(\max _l \mathrm{diam}(G_{l}^M)=\delta (M)\rightarrow 0\) as \(M\rightarrow \infty \) and let \((t_l^M,\nu ^M_l,z^M_l)\in G_l^M\) then
where, to arrive at the last inequality, we have used that for \((r,\nu ,z),(r',\nu ',z')\in \mathcal D_{(\mathbf{a},\mathbf{b})}\), with \(r\le r'\),
and
while noting that, by the above results, \(|Y^{s,\nu ,z,\mathbf{a}',\mathbf{b},0}_{s} -Y^{s,\nu ',z',\mathbf{a}',\mathbf{b},0}_{s}|\le C(|z'-z|+|\nu '-\nu |\bar{Y}_s)\). Arguing as in the case when \(\mathbf{a}\in \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\) we conclude that
\(\mathbb {P}\)-a.s. for all \((\mathbf{a},\mathbf{b})\in \mathcal {J}\). This implies that \((Y^{t,\nu ,z,\mathbf{a},\mathbf{b},0}_t:0\le t\le T)\) is in fact continuous and Proposition 4.1 guarantees that it belongs to \(\mathcal {S}^2_c\). By the induction argument we conclude that hypothesis H.0 holds. \(\square \)
Proposition 4.3
The induction hypothesis H.k holds for all \(k\ge 0\).
Proof
Since we have already shown that the induction hypothesis holds for \(k=0\) we will assume that H.0–H.k hold for some \(k\ge 0\). Now, as noted above, this implies the existence of a control \(u^\diamond :=(\tau ^\diamond _1,\ldots ,\tau ^\diamond _{N^\diamond };\beta _1^\diamond , \ldots ,\beta _{N^\diamond }^\diamond )\), with \({N^\diamond }\le k+1\), \(\mathbb {P}\)-a.s. such that
Fix \((\mathbf{a},\mathbf{b})\in \mathcal {J}\). We will prove the proposition in three steps:
Step 1 We first show continuity in z. Whenever \((t,\nu ',z')\in \mathcal D_{(\mathbf{a},\mathbf{b})}\) we have
Now, by definition we have \(|\theta ^{t,\nu ,z',\mathbf{a},\mathbf{b}}_s-\theta ^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s|\le |z'-z|\), for all \(s\in [0,T]\) which gives that
and in particular \(|\theta ^{t,\nu ,z',\mathbf{a},\mathbf{b},(\tau _1^\diamond ,\beta _1^\diamond )}_{s} -\theta ^{t,\nu ,z,\mathbf{a},\mathbf{b},(\tau _1^\diamond ,\beta _1^\diamond )}_{s}|\le |z'-z|\), for all \(s\in [0,T]\). Repeating this argument and noting that k is finite we conclude that
for all \(s\in [0,T]\). By the above we find that, \(\mathbb {P}\)-a.s. for all \(t\in [0,T]\),
Furthermore, since the reversed argument applies to \(Y^{t,\nu ,(z+q)^+\wedge T,\mathbf{a},\mathbf{b},k+1}_{t}-Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k+1}_t\) we find that
\(\mathbb {P}\)-a.s. and
where we note that the constant C does not depend on k.
Step 2 Next we show continuity in \(\nu \). Again we apply an induction argument and assume that for some \(\mathbf{a}\in \mathcal {A}_\mathbf{b}\) hypothesis H.k+1 holds for all \(\mathbf{a}'\in \mathcal {A}^{-\mathbf{a}}_{\mathbf{a},\mathbf{b}}\). We have
Repeating the argument in the proof of Proposition 4.2 yields that \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t}-Y^{t,\nu ',z,\mathbf{a},\mathbf{b},k+1}_{t}\le \Xi _t^{\nu ,\nu ',z,\mathbf{a},\mathbf{b},k+1}\), where
for \(\mathbf{a}\in \mathcal {A}_\mathbf{b}\setminus \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\) and
when \(\mathbf{a}\in \mathcal {A}^{\mathrm{abs}}_\mathbf{b}\). Relying on the induction argument and the fact that the control \(u^\diamond \in \mathcal {U}^f\) for all k, we conclude by symmetry that
for all \(t\in [0,T]\), \(\mathbb {P}\)-a.s. Applying Doob’s maximal inequality, we find that, for \(p\in \mathbb {R}^{n}\)
where again the constant C does not depend on k. Put together, this implies that H.k+1.ii holds for \(\mathbf{a}\).
Step 3 To show that H.k+1.i holds we note that
is the Snell envelope of a process in \(\mathcal {S}^2_e\) and thus itself belongs to \(\mathcal {S}^2_e\). Subtracting the continuous process \(\int _0^\cdot \psi _{A_r^{t,\nu ,\mathbf{a},\mathbf{b}}}(r,\theta _r^{t,\nu ,z,\mathbf{a},\mathbf{b}})dr\) we conclude that \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k+1}\in \mathcal {S}^2_e\).
Let \((t,\nu ,z)\in \mathcal D_{(\mathbf{a},\mathbf{b})}\) and assume that \(t' \in [t,T]\). To evaluate \(|Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t}-Y^{t',\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t'}|\) we note that during \([t,t']\) we may either have a transition in \(A^{t,\nu ,\mathbf{a},\mathbf{b}}\) (the probability of which is bounded by \(1-e^{-K_\lambda (t'-t)}\)), it may be optimal to switch to another mode, or neither of the above may occur. We conclude that
where the first three terms tend to 0 as \(|t'- t|\rightarrow 0\), \(\mathbb {P}\)-a.s. (the first term by Assumption 2.1.i and the third term by the induction hypothesis and continuity of the switching costs). Moving to the fourth term, we let \(\tau ^{r,\nu ,z,\mathbf{a},\mathbf{b},k+1}_s\in \mathcal {T}^{\mathbb {F}}_s\) be the first intervention time in an optimal control for \(\breve{Y}^{r,\nu ,z,\mathbf{a},\mathbf{b},k+1}_s\), with
By sublinearity of the operator \(\mathcal {K}_{t,t'}\) introduced in the proof of Proposition 4.2 we have, for \((\zeta ,\nu ,z)\in \mathcal D_{(\mathbf{a},\mathbf{b})}\) with \(\zeta \le t'\),
\(\mathbb {P}\)-a.s., where the last term represents the possibility of \(A^{\zeta ,\nu ,\mathbf{a},\mathbf{b}}\) having a transition during \([\zeta ,t')\). Furthermore, arguing as in Step 1 and Step 2 we find that
\(\mathbb {P}\)-a.s. which implies that
\(\mathbb {P}\)-a.s. (where the exception set does not depend on \((t,t',\zeta ,\nu ,\nu ',z,z')\)). For each \(M\ge 1\) we again partition the set \(\mathcal D_{(\mathbf{a},\mathbf{b})}\) into a partition with diameter \(\delta (M)\) (now into a rectangular partition with a constant step-size \(\Delta t\) in the time variable t) and note that when \(t'\in [t_l^M,t_{l}^M+\Delta t]\) then \(|\tau ^{t_l^M,\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t'}-\tau ^{t_l^M,\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t_l^M}|\wedge |\tau ^{t_l^M,\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t'}-\tau ^{t_l^M,\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t_{l}^M+\Delta t}|\le \Delta t\), \(\mathbb {P}\)-a.s. With \(\tau ^{M,k+1}_{l,1}:=\tau ^{t_l^M,\nu _l^M,z_l^M,\mathbf{a},\mathbf{b},k+1}_{t_l^M}\) and \(\tau ^{M,k+1}_{l,2}:=\tau ^{t_l^M,\nu _l^M,z_l^M,\mathbf{a},\mathbf{b},k+1}_{t_{l}^M+\Delta t}\) we have
where we have used that
In the above inequality for \(|\mathbb {E}\Big [Y^{t',\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t'}\big | \mathcal {F}_t\Big ] -Y^{t',\nu ,z,\mathbf{a},\mathbf{b},k+1}_{t'}|\) each of the terms of type \(\mathcal {K}_{t,t'}\) can be represented by an integral of type \(\int _{t}^{t'}Z_rdW_r\). Now, since M was arbitrary and \(\delta (M)\rightarrow 0\) as \(M\rightarrow \infty \), we conclude from (4.10) that \((Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k+1}_t:0\le t\le T)\) has a version such that
\(\mathbb {P}\)-a.s. as \(h\rightarrow 0\). By the induction argument in \(\mathbf{a}\) this extends to all \((\mathbf{a},\mathbf{b})\in \mathcal {J}\).
Finally, by the induction argument in k and using Proposition 4.2 we conclude that H.k holds for all \(k\ge 0\). \(\square \)
Remark 4.4
For the more general case when \(\mathbb {F}\) is generated by a Brownian motion and an independent Poisson random measure, the process \(\big (Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_t: 0\le t\le T\big )\) is no longer necessarily continuous. However, continuity in \((\nu ,z)\) follows immediately from the above proof. The main difference is that the martingale representation theorem cannot be applied to show continuity in t. However, the terms \(\mathcal {K}_{t,t+h}(\cdots )\) go to zero \(\mathbb {P}\)-a.s. as \(h\searrow 0\) by right continuity of the filtration, and right continuity of \(\big (Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_t: 0\le t\le T\big )\) follows. Furthermore, strong left continuity in expectation is immediate from (4.10) and quasi-left continuity of the filtration, and we conclude that H.k holds if we replace the condition that \((Y^{s,\nu ,z,\mathbf{a},\mathbf{b},k}_s:0\le s\le T)\in \mathcal {S}_{\mathbb {F},c}^2\) with \((Y^{s,\nu ,z,\mathbf{a},\mathbf{b},k}_s:0\le s\le T)\in \mathcal {S}_{\mathbb {F},e}^2\).
We are now ready to show that the limit family, \(\lim _{k\rightarrow \infty }((Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_s)_{0\le s\le T}: ({t,\nu ,z,\mathbf{a},\mathbf{b},k})\in \mathcal D)\), exists and satisfies the properties of a verification family. We start with existence:
Proposition 4.5
For each \((t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D\), the limit \(\tilde{Y}^{t,\nu ,z,\mathbf{a},\mathbf{b}}:=\lim _{k\rightarrow \infty }Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}\), exists as an increasing pointwise limit, \(\mathbb {P}\)-a.s. Furthermore, the process \((\tilde{Y}^{t,\nu ,z,\mathbf{a},\mathbf{b}}_t:0\le t\le T)\) is continuous.
Proof
Since \(\mathcal {U}^k_t\subset \mathcal {U}^{k+1}_t\) we have by (4.4) that, \(\mathbb {P}\)-a.s.,
where the right hand side is bounded in \(L^2\). Hence, the sequence \(((Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\) converges \(\mathbb {P}\)–a.s. for all \(s\in [0,T]\).
Concerning the second claim, note that by Proposition 4.1 there is for each \(\delta >0\) and \(p\in (1,2)\) a constant \(K>0\), such that the set
has probability \(\mathbb {P}(B)\ge 1-\delta \). By the “no-free-loop” condition (Assumption 2.1.(ii)) and the finiteness of \(\mathcal {I}\) we get that for any control \((\tau _1,\ldots ,\tau _N;\beta _1,\ldots ,\beta _N)\),
\(\mathbb {P}\)-a.s. Hence, there is a \(\mathbb {P}\)-null set \(\mathcal {N}\subset \Omega \) such that for all \(\omega \in B\setminus \mathcal {N}\),
where \((\tau ^k_1,\ldots ,\tau ^k_{N^k};\beta ^k_1,\ldots ,\beta ^k_{N^k})\) is a control corresponding to \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_t\). This implies that for \(k'>0\) we have,
Now, for \(0\le k'\le k\) we let \(u^k_{k'}\) be the truncated control \(u^k_{k'}:=(\tau ^k_1,\ldots ,\tau ^k_{N^k\wedge k'};\beta ^k_1,\ldots ,\beta ^k_{N^k\wedge k'})\). Then, clearly \(u^k_{k'}\in \mathcal {U}^{k'}\) and we have
Furthermore,
where \(\frac{1}{p}+\frac{1}{q}=1\). There is thus a constant \(C>0\) such that
for all \(t\in [0,T]\) and all \(0\le k'\le k\). We conclude that for all \(\omega \in B\setminus \mathcal {N}\), the sequence \((Y^{t,\nu ,z,\mathbf{a},\mathbf{b},k}_t(\omega ):0\le t\le T)_{k\ge 0}\) is a sequence of continuous functions that converges uniformly in t which implies that the limit is continuous. Since \(\delta >0\) was arbitrary we conclude that \(\mathbb {P}\)-almost all trajectories of \((\tilde{Y}^{t,\nu ,z,\mathbf{a},\mathbf{b}}_t:0\le t\le T)\) are continuous. \(\square \)
Theorem 4.6
The limit family \(((\tilde{Y}^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}: (t,\nu ,z,\mathbf{a},\mathbf{b})\in \mathcal D)\) is a verification family.
Proof
As noted above, property (b) of a verification family follows immediately from Proposition 4.1. We now show that the limit satisfies the remaining properties of a verification family as well, starting with the recursion.
d) Limit satisfies (3.1). From Proposition 4.5 and the proof of Proposition 4.3 it follows that \(\tilde{Y}^{t,\nu ,z,\mathbf{a},\mathbf{b}}_t\) is \(\mathbb {P}\)-a.s. jointly continuous in \((t,\nu ,z)\), and in particular we note that \((Y^{s,\beta \nu +s(\beta -\mathbf{b})^ + ,\theta _s^{t,z,\mathbf{a},\mathbf{b}}\wedge T\beta ,A^{t,\nu ,\mathbf{a},\mathbf{b}}_s\wedge \beta ,\beta ,k}_s:0\le s\le T)\) is an increasing sequence of càdlàg processes that converges \(\mathbb {P}\)-a.s. pointwise to the càdlàg process \((\tilde{Y}^{s,\beta \nu +s(\beta -\mathbf{b})^ + ,\theta _s^{t,z,\mathbf{a},\mathbf{b}}\wedge T\beta ,A^{t,\nu ,\mathbf{a},\mathbf{b}}_s\wedge \beta ,\beta }_s:0\le s\le T)\). We can thus use (iv) of Theorem 2.8 and find that
a) Limit in \(\mathcal {S}^2_e\). As \(\tilde{Y}^{t,\nu ,z,\mathbf{a},\mathbf{b}}_s+\int _0^s\psi _{A^{t,\nu ,\mathbf{a},\mathbf{b}}_r}(r,\theta ^{t,\nu ,z,\mathbf{a},\mathbf{b}}_r)dr\) is the limit of an increasing sequence of càdlàg supermartingales it is also a càdlàg supermartingale (see e.g. [24]). It remains to show that the limit is SLCE. Rather than appealing to a uniform convergence argument as in the proof of Proposition 4.5, we give a direct, more intuitive proof along the lines of [13] and [32]. Seeking a contradiction, we let \((\gamma _j)_{j\ge 1}\) be a sequence of \(\mathbb {G}\)-stopping times such that \(\gamma _j\nearrow \gamma \in \mathcal {T}\) and assume that
Then, the previous step and the Doob–Meyer decomposition of the Snell envelope imply that
on some measurable set \(M_1\subset \Omega \) with \(\mathbb {P}(M_1)>0\) and
Since c and \(\tilde{Y}\) are both left-limited, there is an \(\mathcal {F}_{\gamma }\)-measurable random variable \(\hat{\beta }_1\in \mathcal {I}^{-b_n}\) such that
Now, by (4.12) and continuity of the switching costs we must have
Repeating this argument l times we find that there is a sequence of measurable sets \(M_1\supset M_2 \supset \cdots \supset M_l\) with \(\mathbb {P}(M_l)>0\) and a sequence \(\hat{\beta }_2,\hat{\beta }_3,\ldots ,\hat{\beta }_l\) of \(\mathcal {F}_{\gamma }\)-measurable random variables such that \(\hat{\beta }_{i+1}\in \mathcal {I}^{\hat{\beta }_i}\) for \(i=1,\ldots ,l-1\) and
on \(M_{i+1}\), with \(\underline{\hat{\beta }}_{i}:=\hat{\beta }_1\wedge \hat{\beta }_2\wedge \cdots \wedge \hat{\beta }_i\). Now, since \(\mathcal {I}\) is finite we can always find (possibly random) l and \(l'\) such that \(l<l'\), \(\hat{\beta }_{l}=\hat{\beta }_{l'}\) and \(\underline{\hat{\beta }}_{l}=\underline{\hat{\beta }}_{l'}\), implying that on \(M_{l'}\) we have
contradicting the “no-free-loop” condition of Assumption 2.1.(ii).
c) Limit family continuous in \((\nu ,z)\). In the proof of Proposition 4.3 we showed that there is a \(C>0\) such that for all \((p,q)\in \mathbb {R}^{2n}\)
for all k. Letting \(k\rightarrow \infty \), it follows that the limit family satisfies the same inequality.
This finishes the proof. \(\square \)
Remark 4.7
Note that the results of this section naturally generalize to the case when \(\mathbb {F}\) is a more general filtration, e.g. generated by a Brownian motion and an independent Poisson random measure.
We can now apply (v) of Theorem 2.8 to the recursion (3.1) and get
5 Value Function Representation
We now turn to the case when uncertainties in \(\psi \) and \(\Upsilon \) are modeled by a stochastic differential equation (SDE) as follows
where \(a:[0,T]\times \mathbb {R}^m \rightarrow \mathbb {R}^m\) and \(\sigma :[0,T]\times \mathbb {R}^m \rightarrow \mathbb {R}^{m\times m}\) are two deterministic, continuous functions that satisfy
and
To be able to consider feedback-control formulations we will, for all \(t\in [0,T]\) and \(x\in \mathbb {R}^m\), define the process \((X_s^{t,x};0\le s\le T)\) as the strong solution to
A standard result (see e.g. [38, Theorem 6.16 in Ch. 1]) is that, for any \(p\ge 1\), there exists a constant \(C>0\) such that
and for all \(t'\in [0,T]\) and all \(x'\in \mathbb {R}^m\)
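For simulation-based methods one first needs approximate paths of the state process \(X^{t,x}\). The following Euler–Maruyama sketch is an illustration and not part of the paper: the arguments `a` and `sigma` stand in for the drift and diffusion coefficients above and are assumed to satisfy the stated Lipschitz and linear-growth conditions, under which the scheme is the standard strong approximation of the SDE.

```python
import numpy as np

def euler_maruyama(a, sigma, t, x, T, n_steps, rng):
    """Approximate one path of dX_s = a(s, X_s) ds + sigma(s, X_s) dW_s,
    X_t = x, on [t, T] by the Euler-Maruyama scheme."""
    dt = (T - t) / n_steps
    X = np.asarray(x, dtype=float)
    path = [X.copy()]
    s = t
    for _ in range(n_steps):
        # Brownian increment with variance dt in each component
        dW = rng.normal(scale=np.sqrt(dt), size=X.shape)
        # sigma(s, X) is the m x m diffusion matrix acting on dW
        X = X + a(s, X) * dt + sigma(s, X) @ dW
        path.append(X.copy())
        s += dt
    return np.stack(path)  # shape (n_steps + 1, m)
```

Under the Lipschitz conditions this converges strongly with order 1/2; for the Monte Carlo schemes discussed below one simulates many such paths from the same initial condition.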
We will consider the problem of finding a feedback strategy \(u\in \mathcal {U}_t\) that, for each \((t,x,\nu ,z,\mathbf{a},\mathbf{b})\in \bar{\mathcal D}:=[0,T]\times \mathbb {R}^m\times \cup _{(\mathbf{a},\mathbf{b})\in \mathcal {J}}([0,T]^{\mathbf{b}}\times [0,T]^{\mathbf{a}^+}\times (\mathbf{a},\mathbf{b}))\), maximizes
where, for each \(\mathbf{a}\in \mathcal {A}\), the deterministic function \(\varphi _\mathbf{a}:[0,T] \times \mathbb {R}^m\times [0,T]^{\mathbf{a}^+}\rightarrow \mathbb {R}_+\) is locally Lipschitz and of polynomial growth in x, Lipschitz in z, and such that \(\varphi _\mathbf{a}(\cdot ,x,z)\) is a càdlàg function, while the deterministic function \(h_{\mathbf{a}}:\mathbb {R}^m \times [0,T]^{\mathbf{a}^+}\rightarrow \mathbb {R}_+\) is locally Lipschitz and of polynomial growth in x and Lipschitz in z.
Furthermore, we assume that the switching costs are deterministic and that, for each \((\mathbf{a},\mathbf{b})\in \mathcal {J}\) and \(\nu \in [0,T]^{\mathbf{b}}\) and each \(\mathbf{a}'\in \mathcal {A}_{\mathbf{a},\mathbf{b}}\), the transition rates
where \(\rho ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}:[0,T]\times \mathbb {R}^m\rightarrow \mathbb {R}\) is Lipschitz in \(\nu \), bounded and locally Lipschitz in x, and \(\rho ^{\nu ,\mathbf{b}}_{\mathbf{a},\mathbf{a}'}(\cdot ,x)\) is càdlàg.
We note that, using the results in Sects. 3 and 4, there exists a control \(u^*\in \mathcal {U}_t\) such that
for all \(u\in \mathcal {U}_t\) and a unique family of processes \(((Y^{t,x,\nu ,z,\mathbf{a},\mathbf{b}}_s)_{0\le s\le T}:(t,x,\nu ,z,\mathbf{a},\mathbf{b})\in \bar{\mathcal D})\) such that
where \((\alpha _s^{t,x,\nu ,\mathbf{a},\mathbf{b},u}:0\le s\le T)\) is the process \((\alpha _s^{t,\nu ,\mathbf{a},\mathbf{b},u}:0\le s\le T)\) with transition intensities given by (5.4). Furthermore, the following estimates hold:
Proposition 5.1
There exist \(p\ge 1/2\) and \(K_Y>0\) such that, for each \((\mathbf{a},\mathbf{b}) \in \mathcal {J}\), we have
Furthermore, \(Y^{t,x,\nu ,z,\mathbf{a},\mathbf{b}}_t\) is (indistinguishable from) a deterministic process and
Proof
For the first part we note that, using Doob’s maximal inequality and polynomial growth, there is a \(p\ge 1/2\) such that
For the second part we argue as in the proof of Proposition 4.3 and let \(\hat{u}=(\hat{\tau }_1,\ldots ,\hat{\tau }_{N};\hat{\beta }_1,\ldots ,\hat{\beta }_N)\in \mathcal {U}_t^f\) be an optimal control corresponding to the reward \(Y^{t,x,\nu ,z,\mathbf{a},\mathbf{b}}_t\). Then,
\(\mathbb {P}\)–a.s. Now, letting \(\Psi _t^{x,x',\nu ,z,\mathbf{a},\mathbf{b},\hat{u}}\) denote the right hand side of the inequality we have
For \(K>0\) we let \(B:=\{\omega \in \Omega :\sup _{s\in [0,T]}(|X^{t,x}_s|+|X^{t,x'}_s|)\le K\}\). Using the identity \(ab-a'b'=1/2((a-a')(b+b')+(a+a')(b-b'))\) in combination with local Lipschitz continuity and polynomial growth gives the recursion
where \(\hat{u}^{-1}:=(\hat{\tau }_2,\ldots ,\hat{\tau }_{N};\hat{\beta }_2,\ldots , \hat{\beta }_N)\) and the constant C(K) does not depend on the parameters \((t,x,x',z)\). Using an induction argument and finiteness of the strategy \(\hat{u}\) together with Hölder’s inequality we conclude that
We note that, for all \(K>0\) the first term approaches zero as \(x'\rightarrow x\), by (5.2). Furthermore, by (5.1), \(\mathbb {P}(B^c)\le C\frac{1+|x|+|x'|}{K}\) and by symmetry we conclude that \(\lim _{x'\rightarrow x}|Y^{t,x,\nu ,z,\mathbf{a},\mathbf{b}}_t-Y^{t,x',\nu ,z,\mathbf{a},\mathbf{b}}_{t}|\le C\frac{1+|x|^{2p}+|x'|^{2p}}{K^{1/2}}\). Since \(K>0\) was arbitrary, we conclude that \(|Y^{t,x,\nu ,z,\mathbf{a},\mathbf{b}}_t-Y^{t,x',\nu ,z,\mathbf{a},\mathbf{b}}_{t}|\rightarrow 0\) as \(x'\rightarrow x\). The second statement now follows by the \(\mathbb {P}\)-a.s. continuity of \(Y^{t,\nu ,z,\mathbf{a},\mathbf{b}}_t\) in \((t,\nu ,z)\) shown in Sect. 4. \(\square \)
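The algebraic identity invoked in the proof above can be checked by direct expansion:

```latex
\tfrac{1}{2}\big((a-a')(b+b') + (a+a')(b-b')\big)
  = \tfrac{1}{2}\big(ab + ab' - a'b - a'b' + ab - ab' + a'b - a'b'\big)
  = ab - a'b'.
```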
Now, (4.13) and repeated use of Theorem 8.5 in [16] shows that for \(k\ge 0\), there exist functions \((v_{(\mathbf{a},\mathbf{b})}^{k})_{(\mathbf{a},\mathbf{b})\in \mathcal {J}}\) of polynomial growth, with \(v_{(\mathbf{a},\mathbf{b})}^{k}:[0,T]\times \mathbb {R}^m \times [0,T]^{\mathbf{b}}\times [0,T]^{\mathbf{a}^+}\rightarrow \mathbb {R}\) such that
Furthermore, by Proposition 5.1 the functions \(v_{(\mathbf{a},\mathbf{b})}^{k}\) are continuous. Repeating the steps in the proof of Theorem 4.6 we find that the sequence \(((v_{(\mathbf{a},\mathbf{b})}^{k})_{(\mathbf{a},\mathbf{b})\in \mathcal {J}})_{k\ge 0}\) converges pointwise to functions \(v_{(\mathbf{a},\mathbf{b})}:[0,T]\times \mathbb {R}^m \times [0,T]^{\mathbf{b}}\times [0,T]^{\mathbf{a}^+}\rightarrow \mathbb {R}\) and that
By Proposition 5.1 we find that the functions \(v_{(\mathbf{a},\mathbf{b})}\) are continuous and of polynomial growth. Finally, the verification theorem implies that the functions \(v_{(\mathbf{a},\mathbf{b})}\) are value functions for the stochastic control problem posed above and satisfy the following dynamic programming relation:
Using this formulation we can approximate the value function using either a Markov-chain approximation [28] or reinforcement learning techniques [37] (see e.g. [1, 10, 30, 35] for previous applications to optimal switching).
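As a concrete illustration of the Monte Carlo route, the following sketch implements a regression-based backward recursion in the spirit of the least-squares Monte Carlo schemes applied to optimal switching in [1, 10]. Everything here is a simplifying assumption and not the paper's scheme: the state is one-dimensional, switching takes effect immediately (no lag or component failures), switching costs are deterministic constants, and conditional expectations are replaced by polynomial least-squares fits; the function names are illustrative.

```python
import numpy as np

def switching_lsmc(paths, dt, psi, cost, terminal, deg=2):
    """Least-squares Monte Carlo for a finite-mode optimal switching problem.

    paths:    array (n_paths, n_steps + 1) of simulated state values
    psi:      list of running-reward functions, one per mode
    cost:     cost[i][j], deterministic cost of switching from mode i to j
    terminal: list of terminal-reward functions, one per mode
    Returns the estimated value per starting mode at the initial time.
    """
    n_paths, n = paths.shape
    m = len(psi)
    # terminal condition: V_j(T) = terminal reward in mode j
    V = np.array([g(paths[:, -1]) for g in terminal])  # shape (m, n_paths)
    for k in range(n - 2, -1, -1):
        x = paths[:, k]
        basis = np.vander(x, deg + 1)
        # conditional expectations E[V_j(t_{k+1}) | X_{t_k}] via regression
        cond = np.array([basis @ np.linalg.lstsq(basis, V[j], rcond=None)[0]
                         for j in range(m)])
        V_new = np.empty_like(V)
        for i in range(m):
            # Bellman step: remain in i (cost[i][i] = 0) or switch to j
            candidates = [psi[j](x) * dt + cond[j] - cost[i][j]
                          for j in range(m)]
            V_new[i] = np.max(candidates, axis=0)
        V = V_new
    return V.mean(axis=1)
```

With m modes this costs m regressions per time step; in higher-dimensional problems the monomial basis `np.vander` is typically replaced by richer function classes, as in the schemes cited above.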
Notes
Throughout, we assume that \(\le \), \(\wedge \) and \(^+\) are defined componentwise so that, for any two vectors \(x,y\in \mathbb {R}^n\), \(x\le y\) implies that \(x_i\le y_i\), \([x\wedge y]_i=\min (x_i,y_i)\) and \([x^+]_i=\max (0,x_i)\), for \(i=1,\ldots ,n\).
For \(\mathbf{b}=(b_1,\ldots ,b_n)\in \mathcal {I}\) we let \([0,T]^{\mathbf{b}}:=[0,b_1T]\times \cdots \times [0,b_nT]\).
For two vectors \(x,y\in \mathbb {R}^n\) we define the product xy as \([xy]_i=x_iy_i\), where \([x]_i\) denotes the ith component of the vector x.
Here \(\mathbf{e}_i:=(0,\ldots ,0,1,0,\ldots ,0)\), with a 1 in the ith position.
Throughout, \(C\in (0,\infty )\) is a constant that may change value from line to line.
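The componentwise conventions in these notes map directly onto array operations; this short numpy sketch (an illustration, not part of the paper) shows \(x\wedge y\), \(x^+\), the componentwise product xy and the partial order \(x\le y\):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])
y = np.array([1.0, 0.0, 1.5])

meet = np.minimum(x, y)      # [x ∧ y]_i = min(x_i, y_i)
plus = np.maximum(x, 0.0)    # [x^+]_i = max(0, x_i)
prod = x * y                 # [xy]_i = x_i * y_i (componentwise product)
leq = bool(np.all(x <= y))   # x ≤ y iff x_i ≤ y_i for every i
```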
References
Aïd, R., Campi, L., Langrené, N., Pham, H.: A probabilistic numerical method for optimal multiple switching problems in high dimension. SIAM J. Financ. Math. 5(1), 191–231 (2014)
Aïd, R., Federico, S., Pham, H., Villeneuve, B.: Explicit investment rules with time-to-build and uncertainty. J. Econ. Dyn. Control 51, 240–256 (2015)
Bar-Ilan, A., Sulem, A.: Explicit solution of inventory problems with delivery lags. Math. Oper. Res. 20(3), 709–720 (1995)
Bielecki, T.R., Jakubowski, J., Nieweglowski, M.: Conditional Markov chains: properties, construction and structured dependence. Stoch. Process. Appl. 127(4), 1125–1170 (2017)
Bielecki, T.R., Rutkowski, M.: Credit Risk: Modelling, Valuation and Hedging. Springer Finance. Springer-Verlag, Berlin (2002)
Bielecki, T.R., Crépey, S., Jeanblanc, M., Rutkowski, M.: Defaultable game options in a hazard process model. J. Appl. Math. Stoch. Anal. https://doi.org/10.1155/2009/695798 (2009)
Brekke, K.A., Øksendal, B.: Optimal switching in an economic activity under uncertainty. SIAM J. Control Optim. 32(4), 1021–1036 (1994)
Brennan, M.J., Schwartz, E.S.: Evaluating natural resource investments. J. Bus. 58, 135–157 (1985)
Bruder, B., Pham, H.: Impulse control problem on finite horizon with execution delay. Stoch. Process. Appl. 119, 1436–1469 (2009)
Carmona, R., Ludkovski, M.: Pricing asset scheduling flexibility using optimal switching. Appl. Math. Financ. 15, 405–447 (2008)
Chassagneux, J.F., Elie, R., Kharroubi, I.: A note on existence and uniqueness for solutions of multidimensional reflected BSDEs. Electron. Commun. Probab. 16, 120–128 (2011)
Cvitanic, J., Karatzas, I.: Backwards stochastic differential equations and Dynkin games. Ann. Probab. 24(4), 2024–2056 (1996)
Djehiche, B., Hamadène, S., Popier, A.: A finite horizon optimal multiple switching problem. SIAM J. Control Optim. 47(4), 2751–2770 (2009)
El Asri, B., Hamadène, S.: The finite horizon optimal multi-modes switching problem: the viscosity solution approach. Appl. Math. Optim. 60, 213–235 (2009)
El Karoui, N.: Les aspects probabilistes du contrôle stochastique. Ecole d’Eté de Saint-Flour IX. Lecture Notes in Math. Springer, Berlin (1979)
El Karoui, N., Kapoudjian, C., Pardoux, E., Peng, S., Quenez, M.C.: Reflected solutions of backward SDEs and related obstacle problems for PDEs. Ann. Probab. 25(2), 702–737 (1997)
Elie, R., Kharroubi, I.: BSDE representations for optimal switching problems with controlled volatility. Stoch. Dyn. 14(03), 1450003 (2014)
Hamadène, S.: Reflected BSDE’s with discontinuous barrier and application. Stoch. Int. J. Probab. Stoch. Process. 74(3–4), 571–596 (2002)
Hamadène, S., Jeanblanc, M.: On the starting and stopping problem: application in reversible investments. Math. Oper. Res. 32(1), 182–192 (2007)
Hamadène, S., Zhang, J.: Switching problem and related system of reflected backward SDEs. Stoch. Process. Appl. 120(4), 403–426 (2010)
Hu, Y., Tang, S.: Multi-dimensional BSDE with oblique reflection and optimal switching. Probab. Theory Relat. Fields 147(1–2), 89–121 (2010)
Jakubowski, J., Nieweglowski, M.: A class of F-doubly stochastic Markov chains. Electron. J. Probab. 15(56), 1743–1771 (2010)
Jeanblanc, M., Yor, M., Chesney, M.: Mathematical Methods for Financial Markets. Springer Finance. Springer-Verlag, London Ltd, London (2009)
Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Springer-Verlag, New York (1991)
Karatzas, I., Shreve, S.E.: Methods of Mathematical Finance. Springer-Verlag, New York (1998)
Kharroubi, I.: Optimal switching in finite horizon under state constraints. SIAM J. Control Optim. 54(4), 2202–2233 (2016)
Kobylanski, M., Quenez, M.C.: Optimal stopping time problem in a general framework. Electron. J. Probab. 17(72), 1–28 (2012)
Kushner, H.J., Dupuis, P.: Numerical Methods for Stochastic Control Problems in Continuous Time, 2nd edn. Springer, New York (2001)
Latifa, I.B., Bonnans, J.F., Mnif, M.: A general optimal multiple stopping problem with an application to swing options. Stoch. Anal. Appl. 33(4), 715–739 (2015)
Li, K., Nyström, K., Olofsson, M.: Optimal switching problems under partial information. Monte Carlo Methods Appl. 21(2), 91–120 (2015)
Lygeros, J., Prandini, M.: Stochastic hybrid systems: a powerful framework for complex, large scale applications. Eur. J. Control 16(6), 583–594 (2010)
Martyr, R.: Finite-horizon optimal multiple switching with signed switching costs. Math. Oper. Res. 41(4), 1432–1447 (2016)
Øksendal, B., Sulem, A.: Optimal stochastic impulse control with delayed reaction. Appl. Math. Optim. 58, 243–255 (2008)
Perninge, M.: A limited-feedback approximation scheme for optimal switching problems with execution delays. Math. Meth. Oper. Res. arXiv:1605.00606 (2017)
Perninge, M., Söder, L.: Irreversible investments with delayed reaction: an application to generation re-dispatch in power system operation. Math. Meth. Oper. Res. 79, 195–224 (2014)
Protter, P.: Stochastic Integration and Differential Equations, 2nd edn. Springer, Berlin (2004)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT Press, Cambridge, MA (2017)
Yong, J., Zhou, X.Y.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer, New York (1999)
Acknowledgements
Open access funding provided by Linnaeus University.
This work was supported by the Swedish Energy Agency through Grants Numbers 42982-1 and 48405-1.
Cite this article
Perninge, M. On the Finite Horizon Optimal Switching Problem with Random Lag. Appl Math Optim 84, 355–397 (2021). https://doi.org/10.1007/s00245-019-09648-0