1 Introduction

The economic literature has shown that the behaviour of economic quantities varies with the state of the world; see Hamilton [27]. Such a state of the world is usually referred to as a regime. The most common example is the distinction between a bull and a bear market, in which stock prices are believed to follow different dynamics. Models incorporating such states are called regime switching models.

In a few cases, there are analytically tractable solutions to selected derivatives with regime switching underlying processes. These include the zero-coupon bond price in a Vasiček short rate model with a regime switching mean reversion level and volatility, see Landén [37], or a Cox–Ingersoll–Ross short rate model with regime switching mean reversion level as in Elliott and Siu [19], the price of European options where the underlying stock price process is the exponential of a specific regime switching pure jump process, see Elliott and Osakwe [18], and derivatives pricing with regime switching in a Heston-type stochastic volatility setting; see Elliott et al. [20], Elliott and Lian [16] and Elliott et al. [17].

Apart from these specific cases, there is no overarching theory that allows us to introduce regime switching to more general processes and derivatives, including multivariate extensions. There is also no explanation as to why analytically tractable solutions exist in these and not in other cases. The scope of this paper is to tackle these two points by introducing the class of multivariate regime switching affine processes.

This paper is related to some earlier ideas discussed briefly in van Beek et al. [43] on introducing regime switching to affine processes. However, we take a far more general approach here: we adopt a generator-based perspective to define the class of regime switching processes, we do not limit ourselves to pure diffusions on the canonical state space, and we extend the results beyond regimes driven by classical Markov chains.

Regime switching affine processes encapsulate all special cases from the literature mentioned above, but include many more possible extensions. We give a non-exhaustive list of the classes that can now be extended to include regime switching. A set of applications that already includes regime switching follows later.

(1) Multivariate affine term structure models are used throughout finance as the gold standard for yield curve analysis and the valuation of interest rate derivatives such as bills, notes, bonds, strips, caps, floors, swaps and swaptions. See Dai and Singleton [9] and Christensen et al. [4] for examples.

(2) The Heston model is a bivariate affine process that is used ubiquitously in finance for the pricing of European options on equity under the assumption that volatility is stochastic; see Heston [30], Van der Stoep et al. [44] and Fouque and Saporito [23]. The theory of regime switching affine processes also applies to Heston extensions that include an affine model for rates, such as described in Grzelak and Oosterlee [26].

(3) Next, credit risky securities including corporate bonds are often modelled using an affine model for rates and a CIR-type process for the instantaneous hazard rate of default; see Duffie and Singleton [15], Maboulou and Mashele [39] and White [46]. Together these processes form a multivariate affine process.

(4) Finally, affine processes on generalised state spaces such as the space of positive semidefinite matrices can include stochastic covariances between variables. The processes can be useful in the valuation of options on multiple underlying assets and fixed income products with stochastic correlation between factors and default intensities; see Cuchiero et al. [5]. The Wishart process is an example of such a process that lives on the cone of symmetric positive semidefinite (covariance) matrices. It is used for pricing options with multiple underlying assets in Muhle-Karbe et al. [40], range notes in Chiarella et al. [3] and volatility-equity options in Da Fonseca et al. [8].

Many practical applications are conceivable with our generalisation of the existing literature. Affine processes are generally multivariate so that we can immediately generalise the literature to multivariate processes with regime switching parameters, as suggested above. One example is in pricing a credit default swap (CDS) with a credit valuation adjustment (CVA). This case has multiple hazard rates for the different involved parties that need to be modelled jointly, possibly together with a short rate. Regime switching affine processes can capture this multivariate aspect, and at the same time introduce regime switching. Giesecke et al. [25] and Frey and Backhaus [24] show that default events exhibit clustering in regimes, supporting this feature. Second, regime switching affine processes enable the pricing of derivatives where the payoff function is regime dependent. This happens when the regimes are the credit rating of some entity, as proposed by Jarrow et al. [31], and when the payoff deviates in the case of a default rating. Third, in the portfolio optimisation space, the mean returns on assets can be modelled as an unobservable Markov chain. Rieder and Bäuerle [41] and Bäuerle and Rieder [2] derive optimal investment strategies in such a setup. Also, Vostrikova and Dong [45] analyse utility maximisation on regime switching Lévy processes. In Sect. 9, we derive the analytical formula for the characteristic function that they consider. Fourth, we can think of specific calibrations of yield curve models as regimes, as in Harms et al. [29]. This allows pricing with future recalibration embedded. A perspective on regime switching different from ours (and the above cited literature) is the Bayesian one in Cuchiero et al. [6] and Duembgen and Rogers [12]. These regimes have the interpretation of multiple competing (nested) models, and the probability estimate of each regime represents a Bayesian model averaging weight.

Regime switching affine processes are defined in terms of a continuous-time Markov chain on a specific state space that switches between states, the regimes, and a part with different affine dynamics conditionally on each regime. For example, let the Markov chain switch between the bull and bear market states. Let the short rate follow a Vasiček model reverting to one mean in the bull market, and to another mean in the bear market. Then the joint process of the Markov chain and the short rate is a regime switching affine process.

The literature allows those parameters to vary across regimes that do not multiply with the process. For example, in the drift \(\theta ( \mu -r_{t})\) of the Vasiček model of the short rate \(r_{t}\), \(\mu \) may vary across regimes whereas \(\theta \) may not, because it multiplies with the process \((r_{t})\) in the sense that in the expanded affine form \(\theta \mu -\theta r_{t}\), \(\theta \) and \(r_{t}\) are multiplied. Our definition of regime switching affine processes also allows exactly those parameters to vary across regimes that do not multiply with the process, but in a more general and multivariate setup.
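The parameter restriction can be made concrete with a small simulation. The following is a minimal sketch, assuming a two-regime Vasiček short rate discretised with an Euler scheme; the function name, the argument names (`kappa` for the mean reversion speed, `q` for the exit intensities of the chain) and the discretisation are illustrative assumptions, not part of the theory developed in this paper. Only the level `mu` and the volatility `sigma` depend on the regime, whereas `kappa` multiplies the process and is therefore kept regime invariant.

```python
import math
import random


def simulate_rs_vasicek(r0, regime0, kappa, mu, sigma, q, horizon, n_steps, seed=0):
    """Euler scheme for a Vasicek short rate switched by a two-state Markov chain.

    kappa : mean reversion speed (regime invariant, since it multiplies r_t)
    mu    : pair of mean reversion levels, one per regime
    sigma : pair of volatilities, one per regime
    q     : pair of exit intensities, q[k] is the rate of leaving regime k
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    r, k = r0, regime0
    rates, regimes = [r], [k]
    for _ in range(n_steps):
        # The chain leaves regime k within one step with probability 1 - exp(-q_k dt).
        if rng.random() < 1.0 - math.exp(-q[k] * dt):
            k = 1 - k
        # Euler step: mu and sigma are read off the current regime, kappa is not.
        shock = sigma[k] * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        r += kappa * (mu[k] - r) * dt + shock
        rates.append(r)
        regimes.append(k)
    return rates, regimes


# Hypothetical parameter values, chosen only for illustration.
path, chain = simulate_rs_vasicek(
    r0=0.02, regime0=0, kappa=1.0, mu=(0.05, 0.01),
    sigma=(0.01, 0.02), q=(0.5, 1.0), horizon=5.0, n_steps=500,
)
```

The pair `(path, chain)` is a discretised sample of the joint process of short rate and regime; the time step must be small relative to the inverse switching intensities for the switching approximation to be reasonable.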

The primary result of this paper is that the characteristic function of our regime switching affine process can be expressed as the solution to ordinary differential equations (ODEs), just like in the purely affine case. Because of the similar (but regime-dependent) exponential affine form of the characteristic function, they are conditionally affine. This result extends the techniques available for the class of affine processes to regime switching affine processes. Hence pricing equations based on affine processes can now be extended to include regime switching on the previously mentioned subset of parameters. This also explains the tractability of the special cases dealt with in the literature, because these can now be seen as specific regime switching affine processes.

The power of this result lies in part in the diversity of affine processes, as explained above. In general, affine processes are used to (jointly) model log-stock prices (with jumps), stochastic volatility, short rates and short rate factors, default intensities (see Duffie et al. [13]), and stochastic covariance matrices. Affine processes also include all Lévy processes. Informally, a process \(X\) on a state space \(D\) is affine if its characteristic function can be written as

$$ \mathbb{E}_{x}\big[\exp {(\langle u,X_{t}\rangle )}\big]=\exp {\big(\phi (t,u)+\langle \psi (t,u),x\rangle \big)} $$

for some complex-valued functions \(\phi \) and \(\psi \). The functions \(\phi \) and \(\psi \) solve a system of ODEs. Typical examples of the quantities that are (jointly) enumerated in \(X\) are short rates, hazard rates of default, log-stock prices and stochastic volatility. In our regime switching extension, \(\phi \) may depend on the current regime, whereas \(\psi \) may not. This restriction directly dictates that the process parameters that determine \(\psi \) cannot be different across regimes. Because \(\phi \) and \(\psi \) depend on mutually exclusive parameters, all parameters in \(\phi \) may vary across regimes. Naturally, the state space \(D\) is not allowed to depend on the regime, to prevent jumping from a regime with a larger state space to a regime with a smaller state space that does not include the current value of the process. For example, if we jump from a negative short rate in a Vasiček model regime with state space \(\mathbb{R}\) to a Cox–Ingersoll–Ross short rate model regime with a nonnegative state space \(\mathbb{R}_{+}\), we no longer have a well-defined process.

The result that regime switching affine processes are conditionally affine makes the powerful pricing techniques for affine processes available to regime switching affine processes. The most common technique is transform pricing. Using this technique, we can solve complex expectations in terms of \(\phi \) and \(\psi \), such as

$$ \mathbb{E}_{x}\Big[\exp {\Big(-\int _{0}^{t}{\big(\ell +\langle L,X_{s}\rangle \big)\,\mathrm{d}s}\Big)}f(X_{t})\Big] $$

with \(\ell \) scalar and \(L\) a vector representing discounting, and \(f\) a payoff function on \(D\) that satisfies some requirements related to Fourier transformation. For an overview of examples, see Duffie et al. [14]. In a handful of cases, \(\phi \) and \(\psi \) can be solved from the ODEs analytically. If not, then they result from numerically solving the ODEs. Solving such equations is often preferred over numerically solving the partial differential equations (PDEs) that more general processes entail. The ODEs in transform pricing are generally faster to solve than PDEs, and suffer less from numerical instability.

This paper is organised as follows. After the notational setup in Sect. 2 and a brief overview of results on affine processes in Sect. 3, Sect. 4 introduces the (general) regime switching process in terms of its infinitesimal generator. Section 5 defines regime switching affine processes and gives the most important result of the paper, namely a system of ODEs for the characteristic function that is an analogue of the Riccati equations for affine processes. Section 6 extends the validity of this result beyond the standard domain of the characteristic function. Section 7 shows how our analysis in terms of semigroups relates to stochastic differential equations (SDEs). Section 8 discusses discounting and how this affects the ODEs in the previous sections. Finally, Sect. 9 links these techniques to the existing literature on regime switching by viewing a number of special instances from the literature as special cases of the theory developed here.

2 General notation

This section introduces notation and the mathematical context. We consider processes on a general state space \(S\). Throughout this paper, we work with various forms of \(S\), either \(S=D\), \(S=E\) or \(S=D\times E\), where \(D\) and \(E\) are specified as follows. We consider processes on \(D\) in the setup of Keller-Ressel et al. [34]. That is, these processes live on the state space \(D\subseteq \mathbb{R}^{d}\) such that the affine hull of \(D\) is \(\mathbb{R}^{d}\). We also consider the one-point compactification \(D_{\infty }:=D\cup \{\infty \}\) and extend all functions \(f\) on \(D\) by assuming \(f(\infty )=0\) unless stated otherwise. This compactification is used to describe a point at infinity, i.e., a cemetery where killed or exploded processes go. The state space \(E\) can be chosen almost arbitrarily as long as it is locally compact. See Remark 5.2, 1) for a practically relevant setting. Write \(\mathcal{C}_{\mathrm{b}}(S)\) for the class of bounded continuous complex-valued functions on \(S\), and \(\mathcal{C}_{0}(S)\) for the subclass of functions that are real-valued and vanishing at infinity; \(\mathcal{C}_{\mathrm{b}}(S)\) and \(\mathcal{C}_{0}(S)\) are Banach spaces equipped with the norm \(\|f\|=\sup _{x\in S}{|f(x)|}\).

Central in this paper are time-homogeneous continuous-time Markov processes, denoted by \((X,(\mathbb{P}_{x})_{x\in S})\), where \(\mathbb{P}_{x}\) denotes the law of the process starting in \(x\in S\) so that \(\mathbb{P}_{x}[X_{0}=x]=1\). This context implies the existence of an associated filtered probability space \((\Omega ,\mathcal{F},\mathbb{F})\), where \(X\) is Markov with respect to \(\mathbb{F}\). To further aid notation, let \((P_{t})\) be the associated semigroup, \(P_{t}f(x):=\mathbb{E}_{x}[f(X_{t})]\) for \(f\in \mathcal{C}_{\mathrm{b}}(S)\), with \(\mathbb{E}_{x}\) the expectation under \(\mathbb{P}_{x}\). For general and possibly unbounded functions, we use the same notation, although the expectation need not be defined for all \(t\geq 0\).

When we use the term process in the remainder of this paper, we mean a process with the properties defined in this section unless stated otherwise.

3 Affine processes

To build the definition of regime switching affine processes, some knowledge of affine processes is necessary. This section gives the relevant setup of affine semigroups; it largely follows Keller-Ressel et al. [34].

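We begin with a definition of affine processes. Consider for \(u\in \mathcal{U}\) the functions \(f_{u}(x):=\exp {(\langle u,x \rangle )}\) on \(D\), with \(\mathcal{U}\subseteq \mathbb{C}^{d}\) being the set of complex vectors such that \(f_{u}\) is bounded, i.e., \(f_{u}\in \mathcal{C}_{\mathrm{b}}(D)\). Also introduce \(\mathcal{U}_{k}:=\{u\in \mathbb{C}^{d}:\sup _{x\in D}\operatorname{Re}\langle u,x\rangle \leq k\}\) for \(k\in \mathbb{N}\), so that \(\bigcup _{k\in \mathbb{N}}\mathcal{U}_{k}=\mathcal{U}\).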

Definition 3.1

A semigroup \((P_{t})\) is an affine semigroup if it is stochastically continuous (in the sense of Duffie et al. [13, Definition 2.4]) and there exist two functions \(\Phi :\mathbb{R}_{+}\times \mathcal{U}\to \mathbb{C}\) and \(\psi :\mathbb{R}_{+}\times \mathcal{U}\to \mathbb{C}^{d}\) such that for all \(t\geq 0\), \(x\in D\) and \(u\in \mathcal{U}\),

$$\begin{aligned} P_{t}f_{u}(x)=\Phi (t,u)\exp {\big(\langle \psi (t,u),x\rangle \big)}. \end{aligned}$$

Remark 3.2

1) Since \(\mathrm{i}\mathbb{R}^{d}\subseteq \mathcal{U}\), the definition implies a specification of the form of the characteristic function of \(X_{t}\) with respect to the law \(\mathbb{P}_{x}\).

2) Cuchiero and Teichmann [7] show that every affine process has an affine càdlàg version. We assume throughout the rest of this paper that this is the version we are working with.

3) Since there is a one-to-one correspondence between semigroups and processes in the present setup, we call the associated process of affine semigroups affine as well. In the remainder of this paper, we do the same with other properties.

As long as \(\Phi (t,u)\neq 0\), we can write \(\Phi (t,u)=\exp (\phi (t,u))\); so when we write \(P_{t}f_{u}(x)= \exp (\phi (t,u)+\langle x,\psi (t,u)\rangle )\), we implicitly assume that \(\Phi (t,u)\neq 0\). To explain the deeper relation between \(\Phi \) and \(\phi \), introduce for any \(u\in \mathcal{U}\) the quantity \(\sigma (u):=\inf \{t\geq 0:\Phi (t,u)=0\}\) and define \(\mathcal{Q}_{k}:=\{(t,u)\in \mathbb{R}_{+}\times \mathcal{U}_{k}:t<\sigma (u)\}\). Then \(\phi \) is a function on \(\mathcal{Q}:=\bigcup _{k\in \mathbb{N}}\mathcal{Q}_{k}\). In general, there are multiple choices of \(\phi \) and \(\psi \) that describe the same semigroup. Throughout this paper, we assume that \(\phi \) and \(\psi \) are jointly continuous on \(\mathcal{Q}_{k}\) for \(k\in \mathbb{N}\), that \(\phi (0,0)=0\) and that \(\psi (0,0)=0\). Keller-Ressel et al. [34, Proposition 2.4 (ii)] prove that these assumptions make \(\psi \) and \(\phi \) unique on \(\mathcal{Q}\).

An important property of affine semigroups is regularity.

Definition 3.3

An affine semigroup is called regular if the derivatives

$$ F(u):=\left .\partial _{t}\Phi (t,u)\right |_{t=0+}, \qquad R(u):=\left .\partial _{t}\psi (t,u)\right |_{t=0+} $$

exist for all \(u\in \mathcal{U}\) and are continuous on \(\mathcal{U} _{k}\) for each \(k\in \mathbb{N}\).

Note that \(F\) is scalar whereas \(R\), whose elements are denoted \(R^{i}\), is vector-valued. Keller-Ressel et al. [34] prove that every affine semigroup is regular and that these derivatives can be written as

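$$\begin{aligned} F(u) &=\frac{1}{2}\langle u,au\rangle +\langle b,u\rangle -c+\int _{\mathbb{R}^{d}\backslash \{0\}}{\big(f_{u}(\xi )-1-\langle u,h(\xi )\rangle \big)m(\mathrm{d}\xi )}, \\ R^{i}(u) &=\frac{1}{2}\langle u,\alpha ^{i}u\rangle +\langle \beta ^{i},u\rangle -\gamma ^{i}+\int _{\mathbb{R}^{d}\backslash \{0\}}{\big(f_{u}(\xi )-1-\langle h(\xi ),u\rangle \big)\mu ^{i}(\mathrm{d}\xi )}, \end{aligned}$$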

where \(a,\alpha ^{1},\ldots ,\alpha ^{d}\) are \(\mathbb{R}^{d\times d}\) matrices, \(b,\beta ^{1},\ldots ,\beta ^{d}\) are \(\mathbb{R}^{d}\) vectors, \(c,\gamma ^{1},\ldots ,\gamma ^{d}\) are scalars and \(m,\mu ^{1},\ldots , \mu ^{d}\) are signed Borel measures, and \(h(x)=x1_{\{|x|\leq 1\}}\) is a truncation function based on the Euclidean norm.

Fast-forwarding to the application of regime switching, we note that \(F\) may be regime dependent, whereas \(R\) may not. This implies that only the \(F\)-specific parameters \(a\), \(b\), \(c\) and \(m(\mathrm{d}\xi )\) may vary across regimes. Without this restriction, the characteristic function of such a regime switching affine process loses its conditionally affine structure.

Additionally, for any \(u\in \mathcal{U}\), \(t,s\geq 0\) with \((t+s,u) \in \mathcal{Q}\) and \((s,\psi (t,u))\in \mathcal{Q}\), the functions \(\phi \) and \(\psi \) satisfy the flow properties

$$\begin{aligned} \Phi (t+s,u) &=\Phi (t,u)\Phi \big(s,\psi (t,u)\big), \qquad \Phi (0,u)=1, \\ \psi (t+s,u) &=\psi \big(s,\psi (t,u)\big), \qquad \psi (0,u)=u. \end{aligned}$$
(3.1)

These can be differentiated into a system of ordinary differential equations for \(u\in \mathcal{U}\) and \(t\in [0,\sigma (u))\), called the generalised Riccati equations and given by

$$\begin{aligned} \partial _{t}\Phi (t,u) &=\Phi (t,u)F\big(\psi (t,u)\big), \qquad \Phi (0,u)=1, \\ \partial _{t}\psi (t,u) &=R\big(\psi (t,u)\big), \qquad \psi (0,u)=u. \end{aligned}$$
(3.2)

This system is used to find \(P_{t}f_{u}(x)\), although there are boundary cases, when the solutions to this system are not unique, that require extra care, as described by Keller-Ressel and Mayerhofer [32].
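As a worked illustration of (3.2), consider the univariate Vasiček model \(\mathrm{d}r_{t}=\kappa (\mu -r_{t})\,\mathrm{d}t+\eta \,\mathrm{d}W_{t}\) on \(D=\mathbb{R}\). The symbols \(\kappa \) (mean reversion speed), \(\eta \) (volatility) and \(W\) (a Brownian motion) are assumptions introduced only for this example, and we assume that there are no jumps and no killing. Under these assumptions, \(F(u)=\kappa \mu u+\frac{1}{2}\eta ^{2}u^{2}\) and \(R(u)=-\kappa u\), so that the second equation in (3.2) is linear and the first can be integrated directly:

$$\begin{aligned} \psi (t,u) &=u\mathrm{e}^{-\kappa t}, \\ \Phi (t,u) &=\exp {\Big(\int _{0}^{t}{F\big(\psi (s,u)\big)\,\mathrm{d}s}\Big)} =\exp {\Big(\mu u\big(1-\mathrm{e}^{-\kappa t}\big)+\frac{\eta ^{2}u^{2}}{4\kappa }\big(1-\mathrm{e}^{-2\kappa t}\big)\Big)}. \end{aligned}$$

This agrees with the known Gaussian law of \(r_{t}\) started in \(x\), which has mean \(x\mathrm{e}^{-\kappa t}+\mu (1-\mathrm{e}^{-\kappa t})\) and variance \(\eta ^{2}(1-\mathrm{e}^{-2\kappa t})/(2\kappa )\). Note that \(\mu \) and \(\eta \) enter only through \(F\), whereas \(\kappa \) also enters \(R\).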

Finally, if the affine semigroup is Feller, then its generator \(A\) is given by

$$\begin{aligned} Af(x) & =\frac{1}{2}\operatorname{tr}\big(a(x)\partial _{xx}f(x)\big)+\langle b(x), \partial _{x}f(x)\rangle +c(x)f(x) \\ & \phantom{=:} +\int _{D\backslash \{0\}}{\big(f(x+\xi )-f(x)-\langle h(\xi ),\partial _{x}f(x)\rangle \big)m(x,\mathrm{d}\xi )}, \end{aligned}$$

for all functions \(f\in \mathcal{D}(A)\), the domain of \(A\), and with

$$\begin{aligned} a(x) &=a+\alpha ^{1}x_{1}+\cdots +\alpha ^{d}x_{d}, \\ b(x) &=b+\beta ^{1}x_{1}+\cdots +\beta ^{d}x_{d}, \\ c(x) &=c+\gamma ^{1}x_{1}+\cdots +\gamma ^{d}x_{d}, \\ m(x,\mathrm{d}\xi ) &=m(\mathrm{d}\xi )+\mu ^{1}(\mathrm{d}\xi )x_{1}+ \cdots +\mu ^{d}(\mathrm{d}\xi )x_{d}. \end{aligned}$$

4 Regime switching processes

This section introduces regime switching processes, which are defined in terms of generators of Feller semigroups. Although these regime switching processes are not the focus of the present paper, they are essential to the definition of regime switching affine processes. Informally, regime switching affine processes are regime switching processes defined in terms of generators of affine Feller semigroups. We start with some notation and continue with a proof that regime switching processes also satisfy the Feller property. This is important in later sections.

Consider two state spaces \(D\) and \(E\) as defined in Sect. 2. Consider a function \(f:D\times E\to \mathbb{R}\). We define \(f^{y}:D\to \mathbb{R}\) by \(f^{y}(x)=f(x,y)\) for all \((x,y)\in D\times E\). We define \(f^{x}\) similarly.

Definition 4.1

For all \(y\in E\), let \(A^{y}\) and \(B\) be generators of Feller semigroups on \(\mathcal{C}_{0}(D)\) and \(\mathcal{C}_{0}(E)\), respectively, and also assume that \(B\) is bounded. Introduce the linear operator \(A^{*}\) on \(\mathcal{C}_{0}(D\times E)\) through

$$ A^{*}f(x,y):=A^{y}f^{y}(x)+Bf^{x}(y). $$

Then \(A^{*}\) is called a regime switching linear operator.

Remark 4.2

1) In this definition, we say that the linear operator \(A^{*}\) is switched by the driving generator \(B\) and that \(E\) contains the regimes. We call the generators \(A^{y}\) the regime specific generators and \(D\) the state space of the switched process. The driving process is denoted by \(Y\), or \((Y_{t})\) if the time index is relevant. The switched process is denoted by \(X\), or \((X_{t})\) with time index.

2) The restriction to Feller semigroups is weak. For many specific state spaces including the canonical state space, it is known that affine semigroups are Feller, and for the general case it is hypothesised; see Cuchiero and Teichmann [7].

3) The requirement that \(B\) is bounded restricts the associated process to pure jumps; see Kolokoltsov [36, Proposition 3.7.1]. This corresponds to the intuition that a process must remain in a state at least for some time to be naturally interpreted as a regime. Examples of pure jump processes are Markov chains and compound Poisson processes. The states that these processes take can be interpreted as regimes because they last for some time.

4) In the above setup, the dynamics of the driving process are independent of the starting point \(x\) of the switched process. Indeed, if \(f(x,y)=g(y)\) for all \(x\in D\), then \(A^{*}f(x,y)=Bg(y)\) because \(f^{y}\) is here a constant. Therefore, \(A^{*}\) is bounded on these functions and \(P_{t}^{*}f(x,y)= \exp (tB)g(y)\), which is independent of \(x\).
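For a finite-state driving chain, the expression \(\exp (tB)g(y)\) in item 4) is a matrix exponential applied to a vector. The following minimal sketch assumes a two-state chain with a generator matrix; the helper names and the truncated Taylor series are illustrative choices made here (adequate for small bounded generators), not a method taken from this paper.

```python
import math


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) with a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def expm_action(q, g, t, terms=60):
    """Approximate exp(t*q) g by a truncated Taylor series."""
    result = list(g)
    term = list(g)
    for n in range(1, terms):
        # term_n = (t/n) * q * term_{n-1}, hence term_n = (t q)^n g / n!
        term = [t / n * x for x in mat_vec(q, term)]
        result = [r + x for r, x in zip(result, term)]
    return result


def two_state_transition(a, b, t):
    """Closed-form exp(t*Q) for Q = [[-a, a], [b, -b]] with a + b > 0."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]


# Hypothetical intensities and a regime-dependent function g, for illustration only.
q = [[-0.5, 0.5], [2.0, -2.0]]
g = [1.0, 3.0]
value = expm_action(q, g, t=1.5)
```

Here `value[k]` approximates the expectation of `g` evaluated at the regime at time `t` when the chain starts in regime `k`, and it can be compared with `mat_vec(two_state_transition(0.5, 2.0, 1.5), g)`.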

For the sake of brevity and readability, in the remainder of this paper, we often do not introduce superscript functions \(f^{y}\) and \(f^{x}\) every time we consider an operator on a function with additional variables. So when we write \(Bf(x,y)\), it is implicit that the operator acts only on the variable \(y\), i.e., this is equivalent to \((Bf(x,\cdot ))(y)\).

Although it is not directly implied by the definition, \(A^{*}\) is the generator of a Feller semigroup. The following theorem states this and thereby shows that we can think of \(A^{*}\) as a generator with an associated Feller semigroup and process. The proof relies mostly on results from perturbation theory.

Theorem 4.3

Every regime switching linear operator \(A^{*}\) on \(\mathcal{C}_{0}(D \times E)\) is the generator of a Feller semigroup.

Proof

Construct the operator \(A'\) on \(\mathcal{C}_{0}(D\times E)\) through \(A'f(x,y)=A^{y}f^{y}(x)\) for all \(x\in D\) and \(y\in E\). This is possible since \(f\in \mathcal{C}_{0}(D\times E)\) implies that \(f^{y}\in \mathcal{C}_{0}(D)\) for all \(y\in E\). Also extend \(B\) in this way to \(B'\) on \(\mathcal{C}_{0}(D\times E)\). Clearly, \(A'\) and \(B'\) are both generators of Feller semigroups and \(A^{*}=A'+B'\). Also, \(B'\) is bounded so that its semigroup can be written as \(\exp (tB')\) and this implies that \(\mathcal{D}(B')=\mathcal{C}_{0}(D\times E)\).

Denote by \(R_{\lambda }\) the resolvent on the space of linear operators and choose \(\lambda >\|B'\|\). Then \(\lambda \) belongs to the resolvent set of \(A'\), called \(\rho (A')\supseteq \mathbb{R}_{++}\) (see Ethier and Kurtz [22, Proposition 1.2.1]), where \(\mathbb{R}_{++}\) is the set of strictly positive real numbers. Then \(\|R_{\lambda }(A')\| \leq \lambda ^{-1}\) yields \(\|B'R_{\lambda }(A')\|<1\), and thus \((B'R_{\lambda }(A'))^{n}\rightarrow 0\) as \(n\rightarrow \infty \). This limit implies the same limit in Cesàro means, and so \(B'R_{\lambda }(A')\) is uniformly ergodic. By Tyran-Kaminska [42, Theorem 1.1], \(A^{*}\) is a closed operator that generates a strongly continuous contraction semigroup. By [22, Theorem 1.2.6], \(\mathcal{D}(A^{*})\) is dense in \(\mathcal{C}_{0}(D\times E)\) and the range of \(\lambda -A^{*}\) equals \(\mathcal{C}_{0}(D\times E)\). Also, by [22, Theorem 2.2.2], \(A'\) and \(B'\) satisfy the positive maximum principle. It follows easily that \(A^{*}\) does so, too. Again by [22, Theorem 2.2.2], the closure of \(A^{*}\) is a generator. The result follows from the fact that \(A^{*}\) was already proved to be closed. □

Remark 4.4

The result in Theorem 4.3 may be generalised for definitions that relax the bounded nature of \(B\). Related results exist for strongly continuous semigroups, see e.g. Engel and Nagel [21, Chap. III], but it is unclear whether the same can be done under the more restrictive Feller property. In the present paper, we do not attempt such a generalisation as doing so would move us away from the present, finance-oriented scope (involving regime switching processes) for which \(B\) is naturally bounded.

5 Regime switching affine processes and properties

This section introduces the regime switching affine processes that are central to this paper. These are regime switching processes driven by a bounded operator such as a Markov chain, and with affine regime specific generators subject to specific parameter restrictions. For this process, we derive a system of ODEs analogous to the Riccati equations for the affine process.

Definition 5.1

Consider a regime switching generator \(A^{*}\) with regimes \(E\) and state space \(D\) for the switched process. The generator \(A^{*}\) is regime switching affine if

1) the regime specific generators \(A^{y}\) are generators of affine Feller semigroups, and

2) the regime specific semigroups have \(P_{t}^{y}f_{u}(x)=\Phi ^{y}(t,u)\exp {(\langle \psi (t,u),x\rangle )}\) for some functions \(\Phi ^{y}\) and \(\psi \) for all \(t\geq 0\), \(x\in D\) and \(u\in \mathcal{U}\).

Remark 5.2

1) In the practically important special case when \(B\) is the generator of a Markov chain on a state space of standard basis vectors, i.e., \(E=\{e_{1},\ldots ,e_{p}\}\subseteq \mathbb{R}^{p}\), a \(p\times p\) generator matrix \(Q\) exists. With \(q_{k\ell }\) for \(k,\ell =1,\ldots ,p\) denoting the entries of \(Q\), it holds that

$$ Bf^{x}(e_{k})=\sum _{\ell =1}^{p}{q_{k\ell }f^{x}(e_{\ell })}, \qquad k=1,\ldots ,p. $$

2) Requiring \(A^{y}\) to generate Feller semigroups is a minor restriction, as explained in Remark 4.2, 2).

An important aspect of the definition of regime switching affine processes is that \(\psi \) does not depend on the regime \(y\). We call this independence regime invariance, and we use it throughout this paper to ensure parameters or functions do not vary with the regimes.

The associated derivatives of \(\Phi ^{y}\) and \(\psi \) are \(F^{y}\) and \(R\) (see Definition 3.3). The function \(R\) is regime invariant as a consequence of the regime invariance of \(\psi \). The functional forms of \(F^{y}\) and \(R\) (see directly below Definition 3.3) imply that the parameters \(a^{y}\), \(b^{y}\), \(c^{y}\) and \(m^{y}\) may be different per regime, whereas \(\alpha ^{i}\), \(\beta ^{i}\), \(\gamma ^{i}\) and \(\mu ^{i}\) for \(i=1,\ldots ,d\) must be regime invariant.

The primary result of this paper is the functional form of the characteristic function of regime switching affine processes. We consider dual variables \(u\in \mathcal{U}\) and \(v\in \mathcal{V}\), the set of continuous mappings \(v:E\to \mathbb{C}\) such that \(y\mapsto \exp {(\langle v,y\rangle )}\) is bounded, where we write \(\langle v,y\rangle \) for \(v(y)\) for homogeneity of notation. When \(E\) is a general Banach space, we usually take \(v\in E^{*}\), and when \(E\) is finite-dimensional, we take \(v\in \mathbb{C}^{p}\) and \(\langle v,y\rangle \) is the usual inner product. In what follows, let \(\tilde{X}:=(X,Y)\), \(\tilde{x}:=(x,y)\), \(\tilde{\xi }:=(\xi ,\zeta )\), \(\tilde{u}:=(u,v)\) and \(F^{*}(u,y):=F^{y}(u)\), and extend the definition of \(f_{u}\) before Definition 3.1 from \(S=D\) to \(S=D\times E\) via \(f_{\tilde{u}}(\tilde{x})=\exp {(\langle u,x\rangle +\langle v,y\rangle )}\).

Theorem 5.3

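Consider a regime switching affine semigroup \((P_{t}^{*})\) and \(\tilde{\mathcal{U}}=\mathcal{U}\times \mathbb{C}^{p}\). Then there exists a function \(\Phi ^{*}:\mathbb{R}_{+}\times \tilde{\mathcal{U}}\times E\to \mathbb{C}\) such that for \(\tilde{u}\in \tilde{\mathcal{U}}\),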

$$\begin{aligned} P_{t}^{*}f_{\tilde{u}}(\tilde{x}) =\Phi ^{*}(t,\tilde{u},y)\exp {\big( \langle \psi (t,u),x\rangle \big)} \end{aligned}$$

for all \(x\in D\), \(y\in E\) and \(t\geq 0\), where \(\Phi ^{*}\) solves the PDE

$$\begin{aligned} \partial _{t}\Phi ^{*}(t,\tilde{u},y) &=F^{*}\big(\psi (t,u),y\big)\Phi ^{*}(t,\tilde{u},y)+B\Phi ^{*}(t,\tilde{u},y), \\ \Phi ^{*}(0,\tilde{u},y) &=\exp {(\langle v,y\rangle )}, \end{aligned}$$

where \(v\in \mathbb{C}^{p}\) and \(\psi \) is as in Definition 5.1.

Proof

The proof can be found in Appendix A. □

Remark 5.2, 1) introduced the special case that \(B\) generates a Markov chain. Theorem 5.3 simplifies if this is the case, as the following corollary shows. We adopt the notation that \(\operatorname*{diag}(a)\) or \(\operatorname*{diag}(a_{1},\ldots ,a_{p})\) is the \(p\times p\) diagonal matrix with \(p\)-vector \(a\) on its diagonal. In the remainder of this paper, it often holds that \(y=e_{k}\) for some \(k=1,\ldots ,p\). In such cases, we simply write \(F^{k}\) instead of \(F^{y}\), and similarly for other semigroups and functions with superscript \(y\).

Corollary 5.4

Suppose that \(B\) is the generator of a Markov chain with the state space \(E=\{e_{1},\ldots ,e_{p}\}\) and the generator matrix \(Q\). Then there exists a function \(\theta :\mathbb{R}_{+}\times \tilde{\mathcal{U}}\to \mathbb{C}^{p}\) such that

$$\begin{aligned} P_{t}^{*}f_{\tilde{u}}(\tilde{x})=\langle \theta (t,\tilde{u}),y \rangle \exp {\big(\langle \psi (t,u),x\rangle \big)} \end{aligned}$$

for all \(x\in D\), \(y=e_{k}\in E\) and \(t\geq 0\), where \(\theta \) is the unique solution to the linear ODE

$$\begin{aligned} \partial _{t}\theta (t,\tilde{u}) =\operatorname*{diag}\Big(F^{1}\big(\psi (t,u) \big),\ldots ,F^{p}\big(\psi (t,u)\big)\Big)\theta (t,\tilde{u})+Q \theta (t,\tilde{u}). \end{aligned}$$

Proof

Due to the nature of the state space, it is always possible to define the vector-valued function \(\theta \) from \(\Phi ^{*}\) such that \(\langle \theta (t,\tilde{u}),y\rangle =\Phi ^{*}(t,\tilde{u},y)\). Similarly, we can write \(F^{*}(u,y)=\langle (F^{1}(u),\ldots ,F^{p}(u)),y \rangle \). Then the ODE for \(\langle \theta (t,\tilde{u}),y\rangle \) is known from Theorem 5.3 as

$$\begin{aligned} \langle \partial _{t}\theta (t,\tilde{u}),y\rangle &=\partial _{t} \langle \theta (t,\tilde{u}),y\rangle \\ &=\Big\langle \Big(F^{1}\big(\psi (t,u)\big),\ldots ,F^{p}\big( \psi (t,u)\big)\Big),y\Big\rangle \langle \theta (t,\tilde{u}),y \rangle +B\langle \theta (t,\tilde{u}),y\rangle \\ &=\Big\langle \operatorname*{diag}\Big(F^{1}\big(\psi (t,u)\big),\ldots ,F^{p} \big(\psi (t,u)\big)\Big)\theta (t,\tilde{u}),y\Big\rangle +\langle Q\theta (t,\tilde{u}),y\rangle . \end{aligned}$$

Since this holds for every standard basis vector \(y\), the time derivative of \(\theta \) exists and the ODE follows.

Finally, we need to prove that the ODE has a unique solution. This is true if \(F^{*}(\psi (t,u))\) is continuous in \(t\); see Knobloch and Kappel [35, Satz II.1.1 and Satz II.6.1]. Observe that \(\psi (t,u)\) is continuous in \(t\) by virtue of its generalised Riccati equation, and has domain \(\mathcal{U}\) by the contraction property of Feller semigroups. Moreover, \(F(u)\) is continuous on \(\mathcal{U}_{k}\) for all \(k\in \mathbb{N}\), hence continuous on \(\mathcal{U}\), and so the composition \(F^{*}(\psi (t,u))\) is continuous in \(t\). □
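The linear ODE in Corollary 5.4 is straightforward to solve numerically. The following minimal sketch assumes a Vasiček short rate switched by a finite Markov chain, for which \(\psi (t,u)=u\mathrm{e}^{-\kappa t}\) and \(F^{k}(w)=\kappa \mu _{k}w+\frac{1}{2}\eta _{k}^{2}w^{2}\). The function names, the parameter names (`kappa` for the regime invariant mean reversion speed, `mu` and `sigma` for the regime-dependent level and volatility) and the classical Runge–Kutta scheme are illustrative assumptions made here. The initial condition \(\theta _{k}(0,\tilde{u})=\exp (v_{k})\) follows from the initial condition in Theorem 5.3.

```python
import cmath


def rs_vasicek_theta(u, v, t, kappa, mu, sigma, q, n_steps=2000):
    """Solve d/dt theta = diag(F^k(psi(t,u))) theta + Q theta with RK4.

    u     : complex argument for the short rate (purely imaginary for a
            characteristic function)
    v     : sequence of complex arguments, one per regime
    kappa : mean reversion speed, regime invariant
    mu    : mean reversion levels per regime
    sigma : volatilities per regime
    q     : generator matrix of the chain as a list of rows
    """
    p = len(mu)

    def rhs(s, theta):
        w = u * cmath.exp(-kappa * s)  # psi(s, u) in closed form
        out = []
        for k in range(p):
            f_k = kappa * mu[k] * w + 0.5 * sigma[k] ** 2 * w * w
            out.append(f_k * theta[k] + sum(q[k][l] * theta[l] for l in range(p)))
        return out

    theta = [cmath.exp(v_k) for v_k in v]  # theta_k(0) = exp(v_k)
    h = t / n_steps
    for i in range(n_steps):
        s = i * h
        k1 = rhs(s, theta)
        k2 = rhs(s + h / 2, [x + h / 2 * d for x, d in zip(theta, k1)])
        k3 = rhs(s + h / 2, [x + h / 2 * d for x, d in zip(theta, k2)])
        k4 = rhs(s + h, [x + h * d for x, d in zip(theta, k3)])
        theta = [x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(theta, k1, k2, k3, k4)]
    return theta


def rs_vasicek_cf(u, v, t, x, k, kappa, mu, sigma, q):
    """Evaluate theta_k(t) * exp(psi(t,u) * x) for starting rate x and regime k."""
    theta = rs_vasicek_theta(u, v, t, kappa, mu, sigma, q)
    return theta[k] * cmath.exp(u * cmath.exp(-kappa * t) * x)
```

Two sanity checks follow from the structure of the ODE. For \(u=0\) and \(v=0\), the solution stays equal to one in every component because the rows of \(Q\) sum to zero. When all regimes share the same parameters and \(v=0\), every component reduces to the characteristic function coefficient of the ordinary Vasiček model.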

The remainder of this paper focuses on the setup as in Corollary 5.4. We refer to this setup as a regime switching affine process driven by a Markov chain. One might intuitively expect that this case gives analytically tractable results because \(X\) and \(Y\) can be viewed as jointly affine on a state space \(D\times E\). However, it turns out that this is not possible as Markov chains are not affine processes to begin with.

The question arises whether the regime invariance restrictions of \(\psi \) (or, equivalently, \(R\) or \(\alpha ^{i}\), \(\beta ^{i}\), \(\gamma ^{i}\) and \(\mu ^{i}\)) in Theorem 5.3 can be relaxed. It appears that this is not straightforward and perhaps not possible. We conjecture that the restriction here is that \(\partial _{t}P_{t}^{*}f_{\tilde{u}}( \tilde{x})|_{t=0+}=(G(\tilde{u})+\langle H(\tilde{u}),\tilde{x}\rangle )f_{\tilde{u}}(\tilde{x})\) for some functions \(G\) and \(H\), as is the case for affine processes; see Duffie et al. [13, Remark 2.8, where \(G(u)=F(u)\) and \(H(u)=R(u)\)]. It is easy to show that this equation holds for regime switching affine processes driven by a Markov chain by taking derivatives, to get

$$\begin{aligned} \partial _{t}P_{t}^{*}f_{\tilde{u}}(\tilde{x})|_{t=0+} &=\langle \partial _{t}^{+}\theta (t,\tilde{u})|_{t=0},y\rangle \exp {(\langle u,x \rangle )} \\ & \phantom{=:}+\big\langle \big(\exp {(v_{1})},\ldots ,\exp {(v_{p})} \big),y\big\rangle \langle R(u),x\rangle \exp {(\langle u,x\rangle )} \\ &=\big\langle \operatorname*{diag}\big(\exp {(-v_{1})},\ldots ,\exp {(-v_{p})} \big)\partial _{t}^{+}\theta (t,\tilde{u})|_{t=0},y\big\rangle f_{ \tilde{u}}(\tilde{x}) \\ & \phantom{=:}+\langle R(u),x\rangle f_{\tilde{u}}(\tilde{x}), \end{aligned}$$

and by setting \(H(\tilde{u})=(R(u),\operatorname*{diag}(\exp {(-v_{1})},\ldots , \exp {(-v_{p})})\partial _{t}^{+}\theta (t,\tilde{u})|_{t=0})\) and \(G( \tilde{u})=0\). In the second equality, the above derivation depends on the property that \(\exp {(\langle v,y\rangle )}=\langle (\exp {(v_{1})}, \ldots ,\exp {(v_{p})}),y\rangle \), which is only possible because the elements of \(E\) can be written as \(y=e_{k}\) for some \(k\), regardless of the form of the ODE for \(\theta \). This step also clearly breaks down when \(R\) is regime dependent.
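The role played by the basis-vector structure of \(E\) in this step can be checked numerically. The following is a minimal sketch; the helper names `exp_inner` and `inner_exp` and the values of \(v\) are our own illustrative choices. It evaluates both sides of \(\exp {(\langle v,y\rangle )}=\langle (\exp {(v_{1})},\ldots ,\exp {(v_{p})}),y\rangle \) on the standard basis vectors and on a point that is not a basis vector.

```python
import math


def exp_inner(v, y):
    """Left-hand side: exp(<v, y>)."""
    return math.exp(sum(vi * yi for vi, yi in zip(v, y)))


def inner_exp(v, y):
    """Right-hand side: <(exp(v_1), ..., exp(v_p)), y>."""
    return sum(math.exp(vi) * yi for vi, yi in zip(v, y))


v = [0.3, -1.2, 2.0]  # arbitrary illustrative values
p = len(v)

# Standard basis vectors e_1, ..., e_p, i.e. the elements of E.
basis = [[1.0 if i == k else 0.0 for i in range(p)] for k in range(p)]
on_basis = [abs(exp_inner(v, e) - inner_exp(v, e)) for e in basis]

# A point outside E (a mixture of two basis vectors): the identity fails.
mixed = [0.5, 0.5, 0.0]
off_basis = abs(exp_inner(v, mixed) - inner_exp(v, mixed))
```

The gaps in `on_basis` vanish up to rounding, whereas `off_basis` is clearly positive, which illustrates why the argument is tied to \(y=e_{k}\).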

6 Enlarging the domain of the characteristic function

So far, we have generally required that \(u\in \mathcal{U}\). In financial applications of affine processes, option prices are usually expressed as functions of \(P_{t}f_{u}\), where \((P_{t})\) is an affine semigroup and \(u\in \mathbb{C}^{d}\setminus \mathcal{U}\). For most applications, \(\mathcal{U}\) is too small a set to work with, and this carries over to the application of regime switching affine processes. Even for relatively standard univariate diffusion processes, specific parameters lead to explosions in the exponential moments, i.e., \(P_{t}^{y}f_{u}(x)\) does not exist. However, for any \(u\in \mathbb{C}\), \(\Phi ^{y}\) and \(\psi \) exist up to some time \(T\). This time naturally depends on \(u\). This section extends the validity of Theorem 5.3 to \(u\) beyond \(\mathcal{U}\), yet up to a time \(T\leq \infty \).
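To illustrate how the lifetime \(T\) depends on \(u\), the following sketch considers a univariate square-root diffusion \(\mathrm{d}X_{t}=(b+\beta X_{t})\mathrm{d}t+\sqrt{\alpha X_{t}}\mathrm{d}W_{t}\) with \(\beta <0\) and real \(u\). We assume here that the Riccati function is \(R(u)=\beta u+\frac{1}{2}\alpha u^{2}\), so that \(\partial _{t}\psi =R(\psi )\), \(\psi (0,u)=u\), blows up in finite time exactly when \(u>-2\beta /\alpha \). The function names and parameter values are hypothetical, and the fixed-step Runge–Kutta scheme is only a rough check of the closed-form lifetime.

```python
import math


def riccati_rhs(psi, alpha, beta):
    """Assumed Riccati function R(psi) = beta*psi + alpha*psi^2/2."""
    return beta * psi + 0.5 * alpha * psi * psi


def blow_up_time_closed_form(u, alpha, beta):
    """Lifetime of psi' = R(psi), psi(0) = u, for real u and beta < 0."""
    c = -2.0 * beta / alpha  # positive root of R
    if u <= c:
        return math.inf  # psi stays bounded for all t
    return math.log(u / (u - c)) / (-beta)


def blow_up_time_numeric(u, alpha, beta, dt=1e-4, cap=1e8, t_max=10.0):
    """Integrate with classical RK4 until |psi| leaves [0, cap)."""
    psi, t = u, 0.0
    while t < t_max:
        if not abs(psi) < cap:  # also triggers on inf and nan
            return t
        k1 = riccati_rhs(psi, alpha, beta)
        k2 = riccati_rhs(psi + 0.5 * dt * k1, alpha, beta)
        k3 = riccati_rhs(psi + 0.5 * dt * k2, alpha, beta)
        k4 = riccati_rhs(psi + dt * k3, alpha, beta)
        psi += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return math.inf  # no blow-up detected before t_max
```

For example, with \(\alpha =1\), \(\beta =-1\) and \(u=3\), the closed form gives \(T=\ln 3\), while \(u=1\) yields a solution that exists for all times.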

In the world of affine processes, Duffie et al. [13] provide a toolset to determine if a solution \(\Phi (t,u)\), \(\psi (t,u)\) to the generalised Riccati equations up to some \(T>0\) for \(u\in \mathbb{C}^{d}\) satisfies Definition 3.1. The steps usually involve checking that \(F\) and \(R^{i}\), \(i=1,\ldots ,d\), are analytic on some open domain, concluding from this that \(\Phi \) and \(\psi \) must therefore also be analytic, and finally using regularity properties of characteristic functions to expand the domain on which the solutions of the generalised Riccati equations describe the exponential moments of the affine process. Our case is roughly equivalent so that we can extend the most important results for affine processes to regime switching affine processes. We start with the analytic nature of the Riccati solutions. Lemma 6.1 can be viewed as the regime switching affine generalisation of [13, Lemma 6.5 ii] for the setup in Corollary 5.4, i.e., driven by a Markov chain. Its proof closely follows [13].

Lemma 6.1

Consider a regime switching affine semigroup \((P_{t}^{*})\) driven by a Markov chain. Suppose that all \(F^{k}\) and \(R\) are analytic on some open set \(U\) in \(\mathbb{C}^{d}\). Let \(T\leq \infty \) be such that a local \(U\)-valued solution \(\psi \) to the generalised Riccati equations exists on \((0,T)\) for all \(u\in U\). Then:

(i) For all \(v\in \mathbb{C}^{p}\), there also exists a solution \(\theta \) to the ODE in Theorem 5.3 on \((0,T)\) that is unique given \(\psi \).

(ii) \(\psi \) and \(\theta \) have unique analytic extensions on \((0,T)\times U\) and \((0,T)\times U\times \mathbb{C}^{p}\), respectively.

Proof

Consider first the existence of \(\theta \). That \(\psi \) has a unique analytic extension on \((0,T)\times U\) follows directly from Dieudonné [11, Theorem 10.8.2]. The function \(\psi \) is \(U\)-valued; so since each \(F^{k}\) is analytic on \(U\), the vector-valued composition \((F^{1}(\psi (t,u)),\ldots ,F^{p}(\psi (t,u)))\) is analytic on \((0,T)\) for every \(u\in U\) and therefore continuous. Using the linearity of the ODE in Theorem 5.3, it follows from Knobloch and Kappel [35, Satz II.1.1 and Satz II.6.1] that this ODE has a unique solution \(\chi ^{u}\) on \((0,T)\) for every starting point \(w\in \mathbb{C}^{p}\), not only those that can be written as the componentwise exponential of some \(v\). Defining \(\theta (t,\tilde{u})=\chi ^{u}(t,w)\) for \(w\in (\mathbb{C}\setminus \{0\})^{p}\), where \(w_{k}=\exp {(v_{k})}\), gives the existence of a unique solution \(\theta \) given \(\psi \).

To prove the final statement of the lemma, consider the composite system

$$\begin{aligned} \partial _{t} \begin{pmatrix} \psi (t,u) \\ \chi (t,u,w) \end{pmatrix} &=G\big(\psi (t,u),\chi (t,u,w)\big), \qquad \psi (0,u)=u, \quad \chi (0,u,w)=w, \\ G(u,w)&= \begin{pmatrix} R(u) \\ (\operatorname*{diag}(F^{*}(u))+Q)w \end{pmatrix} . \end{aligned}$$

Then \(G\) is analytic on the open domain \(U\times \mathbb{C}^{p}\), and there exists a \(U\times \mathbb{C}^{p}\)-valued local solution on \((0,T)\) for every \((u,w)\in U\times \mathbb{C}^{p}\). Using Dieudonné [11, Theorem 10.8.2] again, \(\psi \) and \(\chi \) have a unique analytic extension on \((0,T)\times U\) and \((0,T)\times U\times \mathbb{C}^{p}\), respectively. The result for \(\theta \) follows by changing the domain as above. □

With the analytic nature of \(\psi \) and \(\theta \) established, Theorem 6.2 below provides the necessary conditions to conclude that the ODE solutions describe the exponential moments of regime switching affine processes. This theorem is the regime switching affine generalisation of Duffie et al. [13, Theorem 2.16 ii].

Theorem 6.2

Consider a regime switching affine semigroup \((P_{t}^{*})\) driven by a Markov chain. Let \(t\geq 0\) and let \(U\) be an open convex neighbourhood of 0 in \(\mathbb{C}^{d}\). Suppose that \(\psi (t,\cdot )\) and \(\theta (t,\cdot )\) have analytic extensions on \(U\) and \(U\times \mathbb{C}^{p}\), respectively. Then:

(i) The exponential moments \(P_{t}^{*}f_{\tilde{u}}(\tilde{x})\) are finite for all \(u\in U\cap \mathbb{R}^{d}\), \(v\in \mathbb{R}^{p}\), \(x\in D\) and \(y\in E\).

(ii) \(P_{t}^{*}f_{\tilde{u}}(\tilde{x})=\langle \theta (t,\tilde{u}),y\rangle \exp {(\langle \psi (t,u),x\rangle )}\) holds for all \(u\in U\) such that \(\operatorname{Re}(u)\in U\cap \mathbb{R}^{d}\), \(v\in \mathbb{C}^{p}\), \(x\in D\) and \(y\in E\).

Proof

The proof of [13, Theorem 2.16 ii] is sufficient here as it relies in no way on the form of the characteristic function or the state space, only on its analytic nature, which is satisfied by assumption. The only difference is that we do not require the process to be conservative. This is allowed because the proof works for any bounded measure (\(\nu \) in [13, Appendix A]), not only probability measures. For a killed process, we have \(f_{u}(\infty )=0\) by assumption; so for the purpose of computing exponential moments, we may equally well assume that the measure assigns zero mass to jumps to this point at infinity. This means that whenever a process includes killing, the exponential moments deflate due to strict subadditivity of the measure rather than the definition \(f_{u}(\infty )=0\), with the exact same effect. □

A deeper analysis of analytical extensions in the context of affine processes can be found in Keller-Ressel and Mayerhofer [32].

7 SDE characterisation

We have given a definition of regime switching affine processes driven by a Markov chain in terms of the generator. For many applications, SDEs are more intuitive. Also, practitioners more often think in terms of SDEs than semigroups. This section gives an SDE formulation using the existing link for affine processes. For completeness, we state the result for affine processes without regime switching first.

For any Markov process, consider the sequence \(\tau _{n}=\inf \{t \geq 0:\|X_{t}-X_{0}\|>n\}\) and define the explosion time \(\tau _{ \infty }:=\lim _{n\rightarrow \infty }\tau _{n}\). Recall the definitions of \(a\), \(b\) and \(m\) from the end of Sect. 3. The following proposition is adapted from Keller-Ressel et al. [33, Corollary 3.11].

Proposition 7.1

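Let \(X\) be an affine process and suppose that the killing terms vanish, i.e., \(c=0\) and \(\gamma =0\). Then under every \(\mathbb{P}^{x}\), \(x\in D\), the process \(X\) is a \(D\)-valued semimartingale on \([0,\tau _{\infty })\) with canonical semimartingale representation

$$\begin{aligned} \mathrm{d}X_{t}&=b(X_{t})\mathrm{d}t+\sqrt{a(X_{t})}\mathrm{d}W_{t} \\ & \phantom{=:}+\int _{\mathbb{R}^{d}}h(\xi )\big(\nu _{X}(\omega ;\mathrm{d}t,\mathrm{d}\xi )-m(X_{t},\mathrm{d}\xi )\mathrm{d}t\big)+\int _{\mathbb{R}^{d}}\big(\xi -h(\xi )\big)\nu _{X}(\omega ;\mathrm{d}t,\mathrm{d}\xi ), \end{aligned}$$

where \(\sqrt{\cdot }\) indicates the Cholesky decomposition, \((W_{t})\) is a \(d\)-dimensional Brownian motion, \(\nu _{X}\) is the random measure associated with the jumps of \(X\), \(h\) is a truncation function and \(a\), \(b\) and \(m\) are as defined in Sect. 3.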

The following proposition is the regime switching counterpart to Proposition 7.1 for the Markov chain case. Recall that we can write \(Y=e_{K}\) for \(K \in \{1,\ldots ,p\}\) in this particular case.

Proposition 7.2

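Let \(\tilde{X}=(X,Y)=(X,e_{K})\) be a regime switching affine process and suppose that the killing terms vanish, i.e., \(c^{k}=0\) for \(k=1,\ldots ,p\) and \(\gamma =0\). Then under every \(\mathbb{P}^{(x,y)}\), \(x\in D\), \(y\in E\), the process \((X,Y)\) is a \(D\times E\)-valued semimartingale on \([0,\tau _{\infty })\) with canonical semimartingale representation

$$\begin{aligned} \mathrm{d}X_{t}&=b^{K_{t}}(X_{t})\mathrm{d}t+\sqrt{a^{K_{t}}(X_{t})}\mathrm{d}W_{t} \\ & \phantom{=:}+\int _{\mathbb{R}^{d}}h(\xi )\big(\nu _{X}(\omega ;\mathrm{d}t,\mathrm{d}\xi )-m^{K_{t}}(X_{t},\mathrm{d}\xi )\mathrm{d}t\big)+\int _{\mathbb{R}^{d}}\big(\xi -h(\xi )\big)\nu _{X}(\omega ;\mathrm{d}t,\mathrm{d}\xi ), \\ \mathrm{d}Y_{t}&=\sum _{k=1}^{p}Y_{k,t-}\sum _{\ell \neq k}(e_{\ell }-e_{k})\mathrm{d}N_{t}^{q_{k\ell }}, \end{aligned}$$

where \(a^{k}\), \(b^{k}\) and \(m^{k}\) are the regime specific parameters corresponding to \(A^{k}\) and \((N_{t}^{q_{k\ell }})\) are independent Poisson processes with intensity \(q_{k\ell }\) that count the transitions from \(e_{k}\) to \(e_{\ell }\) up to time \(t\).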

Proof

We prove the statement by letting \((X,Y)\) follow the canonical representation in the statement, and showing that its semigroup has the correct parameters.

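Let \(u\in \mathcal{U}\) be such that \(\psi \) and \(\theta \) exist for all \(t\geq 0\) by Theorem 5.3. Based on this, let \(T\geq 0\) and define \(M_{t}^{X}:=\exp {(\langle \psi (T-t,u),X_{t} \rangle )}\) and \(M_{t}^{Y}:=\langle \theta (T-t,\tilde{u}),Y_{t} \rangle \). Then let \(M_{t}:=M_{t}^{X}M_{t}^{Y}\). Since \(\psi (0,u)=u\) and \(\theta _{k}(0,\tilde{u})=\exp {(v_{k})}\), we have \(M_{T}=f_{\tilde{u}}(X_{T},Y_{T})\). With the ansatz that \(M\) is a martingale, we find

$$\begin{aligned} P_{T}^{*}f_{\tilde{u}}(\tilde{x})=\mathbb{E}[M_{T}\,|\,X_{0}=x,Y_{0}=y]=M_{0}=\langle \theta (T,\tilde{u}),y\rangle \exp {(\langle \psi (T,u),x\rangle )}. \end{aligned}$$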

This means that if \(M\) is a martingale, then the semigroup \((P_{t} ^{*})\) of \((X,Y)\) has the correct form. We now show that \(M\) is indeed a martingale.

By Itô’s lemma and the definitions of \(F^{k}\) and \(R\), we have that

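$$\begin{aligned} \mathrm{d}M_{t}^{X} &=M_{t}^{X}\bigg(-\langle \partial _{T}\psi (T-t,u),X_{t}\rangle +\langle \psi (T-t,u),b^{K_{t}}(X_{t})\rangle \\ & \phantom{=:M_{t}^{X}\bigg(}+\frac{1}{2}\langle a^{K_{t}}(X_{t})\psi (T-t,u),\psi (T-t,u)\rangle \\ & \phantom{=:M_{t}^{X}\bigg(}+\int _{\mathbb{R}^{d}}\big(\exp {(\langle \psi (T-t,u),\xi \rangle )}-1-\langle \psi (T-t,u),h(\xi )\rangle \big)m^{K_{t}}(X_{t},\mathrm{d}\xi )\bigg)\mathrm{d}t+\mathrm{d}\tilde{M}_{t}^{X} \\ &=M_{t}^{X}F^{K_{t}}\big(\psi (T-t,u)\big)\mathrm{d}t+\mathrm{d}\tilde{M}_{t}^{X}, \end{aligned}$$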

where \(\tilde{M}^{X}\) stands for the local martingale part of \(M^{X}\). By Elliott and Osakwe [18, (7)], we can write \(\mathrm{d}Y_{t}=Q ^{\top }Y_{t-}\mathrm{d}t+\mathrm{d}\tilde{Y}_{t}\), where \(\tilde{Y}\) is a \(p\)-dimensional martingale. So with the martingale \(\tilde{M} _{t}^{Y}=\int _{0}^{t}\langle \theta (T-s,\tilde{u}),\mathrm{d} \tilde{Y}_{s}\rangle \), we get

$$\begin{aligned} \mathrm{d}M_{t}^{Y} & =\big(-\langle \partial _{T}\theta (T-t, \tilde{u}),Y_{t-}\rangle +\langle \theta (T-t,\tilde{u}),Q^{\top }Y _{t-}\rangle \big)\mathrm{d}t+\mathrm{d}\tilde{M}^{Y}_{t} \\ & =\bigg(-\bigg\langle \bigg(\operatorname*{diag}\Big(F^{*}\big(\psi (T-t,u)\big) \Big)+Q\bigg)\theta (T-t,\tilde{u}),Y_{t-}\bigg\rangle \\ & \phantom{=:\bigg(} +\langle \theta (T-t,\tilde{u}),Q^{\top }Y_{t-}\rangle \bigg) \mathrm{d}t+\mathrm{d}\tilde{M}^{Y}_{t} \\ & =-\Big\langle \operatorname*{diag}\Big(F^{*}\big(\psi (T-t,u)\big)\Big)\theta (T-t, \tilde{u}),Y_{t-}\Big\rangle \mathrm{d}t+\mathrm{d}\tilde{M}^{Y}_{t} \\ & =-M_{t}^{Y}F^{K_{t-}}\big(\psi (T-t,u)\big)\mathrm{d}t+\mathrm{d} \tilde{M}_{t}^{Y}, \end{aligned}$$

and by independence of the jump measures of \(Y\) and \(X\),

$$\begin{aligned} \mathrm{d}M_{t} & =M_{t-}^{X}\mathrm{d}M_{t}^{Y}+M_{t-}^{Y}\mathrm{d}M _{t}^{X} \\ & =-M_{t-}^{X}M_{t-}^{Y}F^{K_{t-}}\big(\psi (T-t,u)\big)\mathrm{d}t +M _{t-}^{X}\mathrm{d}\tilde{M}^{Y}_{t} \\ & \phantom{=:} +M_{t-}^{Y}M_{t-}^{X}F^{K_{t-}}\big(\psi (T-t,u)\big)\mathrm{d}t +M _{t-}^{Y}\mathrm{d}\tilde{M}_{t}^{X} \\ & =M_{t-}^{X}\mathrm{d}\tilde{M}^{Y}_{t} +M_{t-}^{Y}\mathrm{d} \tilde{M}_{t}^{X}. \end{aligned}$$

Hence \(M\) is a local martingale. But \(M^{X}\) and \(Y\) are uniformly bounded, and so \(M\) is a true martingale. □
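The representation in Proposition 7.2 lends itself to simulation. As a rough illustration, the sketch below applies an Euler scheme to a one-dimensional diffusion of Vasiček type, \(\mathrm{d}X_{t}=(b^{K_{t}}+\beta X_{t})\mathrm{d}t+\sqrt{a^{K_{t}}}\mathrm{d}W_{t}\), and replaces the Poisson clocks by their first-order approximation, i.e., a jump from regime \(k\) to \(\ell \) occurs in a step of length \(\Delta t\) with probability approximately \(q_{k\ell }\Delta t\). The function name, the discretisation and the zero-based regime index are our own assumptions, not part of the proposition.

```python
import math
import random


def simulate_rs_vasicek(x0, k0, b, a, beta, Q, T, n_steps, rng):
    """Euler path of (X, K); regimes are indexed 0, ..., p-1 in this sketch."""
    dt = T / n_steps
    x, k = x0, k0
    xs, ks = [x], [k]
    for _ in range(n_steps):
        # Euler step for X with the parameters of the current regime k.
        x += (b[k] + beta * x) * dt + math.sqrt(a[k] * dt) * rng.gauss(0.0, 1.0)
        # Regime step: move from k to l with probability roughly q_kl * dt.
        draw, acc = rng.random(), 0.0
        for l, q in enumerate(Q[k]):
            if l == k:
                continue
            acc += q * dt
            if draw < acc:
                k = l
                break
        xs.append(x)
        ks.append(k)
    return xs, ks


# Example call with two regimes and a seeded generator (illustrative values).
path_x, path_k = simulate_rs_vasicek(
    0.03, 0, [0.01, 0.02], [1e-4, 4e-4], -0.5,
    [[-1.0, 1.0], [2.0, -2.0]], 5.0, 500, random.Random(1),
)
```

The step size must be small relative to \(1/\max _{k}|q_{kk}|\) for the regime approximation to be reasonable; an exact alternative would draw exponential holding times instead.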

8 Discounting

Derivatives pricing often requires some form of discounting, either with an interest rate or a hazard rate of default, or both; see Lando [38]. This section explains discounting, and that discounting for regime switching affine processes driven by a Markov chain boils down to the same small modification of the generalised Riccati equations as discounting for affine processes.

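Consider an affine process \(X\) with semigroup \((P_{t})\). Discounting means that instead of \(P_{t}f(x)\), we are interested in

$$\begin{aligned} Q_{t}f(x)&:=\mathbb{E}\bigg[\exp {\bigg(\int _{0}^{t}L(X_{s})\mathrm{d}s\bigg)}f(X_{t})\,\bigg|\,X_{0}=x\bigg], \end{aligned}$$

where \(L:D\rightarrow \mathbb{R}\), given by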

$$\begin{aligned} L(x)&:=\ell +\lambda ^{1}x_{1}+\cdots +\lambda ^{d}x_{d} \end{aligned}$$

is the discounting rate. This is usually a short rate or a hazard rate, or a combination. For \((Q_{t})\), \(Q_{t}f_{u}(x)\) still has the exponential affine form from Definition 3.1. Naturally, unless \(\ell =\lambda ^{1}=\cdots =\lambda ^{d}=0\), \(\Phi \) and \(\psi \) are not the same as for \((P_{t})\). However, it is well known that a small adaptation of \(F\) and \(R\) solves this problem. Effectively \(F(u)\) becomes \(F(u)+\ell \) and \(R^{i}(u)\) becomes \(R^{i}(u)+\lambda ^{i}\) for all \(i=1,\ldots ,d\).

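For a regime switching affine process \((X,Y)\), the natural equivalent is to see it as a conditionally affine process and use the exact same notation to obtain

$$\begin{aligned} Q_{t}^{*}f(x,y)&:=\mathbb{E}\bigg[\exp {\bigg(\int _{0}^{t}L(X_{s},Y_{s})\mathrm{d}s\bigg)}f(X_{t},Y_{t})\,\bigg|\,X_{0}=x,Y_{0}=y\bigg], \end{aligned}$$

where \(L:D\times E\rightarrow \mathbb{R}\) is given by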

$$\begin{aligned} L(x,y)&:=\ell +\lambda ^{1}x_{1}+\cdots +\lambda ^{d}x_{d}+\lambda ^{d+1}y _{1}+\cdots +\lambda ^{d+p}y_{p}. \end{aligned}$$

The same modifications apply to \(F^{k}\) and \(R\), giving \(F^{k}(u)+ \ell +\lambda ^{d+k}\) and \(R^{i}(u)+\lambda ^{i}\).

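Alternatively, we can assume without loss of generality that \(\ell =0\) since the constant \(\ell \) can be added to all the \(\lambda ^{d+1},\ldots ,\lambda ^{d+p}\), keeping \(L\) the same. Then, using a slightly different notation, we can write

$$\begin{aligned} Q_{t}^{*}f(x,y)&:=\mathbb{E}\bigg[\exp {\bigg(\int _{0}^{t}\langle L^{*}(X_{s}),Y_{s}\rangle \mathrm{d}s\bigg)}f(X_{t},Y_{t})\,\bigg|\,X_{0}=x,Y_{0}=y\bigg], \end{aligned}$$
(8.1)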

where \(L^{*}:D\rightarrow \mathbb{R}^{p}\) is now given by

$$\begin{aligned} L^{*}_{k}(x)&:=\ell ^{k}+\lambda ^{1}x_{1}+\cdots +\lambda ^{d}x_{d}. \end{aligned}$$
(8.2)

The modifications to \(F^{k}\) and \(R\) are now \(F^{k}(u)+\ell ^{k}\) for \(k=1,\ldots ,p\) and \(R^{i}(u)+\lambda ^{i}\) for \(i=1,\ldots ,d\).

9 Applications

This section explains why the most important cases of regime switching processes from the literature are in fact regime switching affine processes. First, we consider the regime switching short rate models in Landén [37] and Elliott and Siu [19], and show how the price of a zero-coupon bond is expressed in both models. Second, we derive the characteristic function of the regime switching univariate Lévy processes analysed by Elliott and Osakwe [18]. Third, we derive the characteristic function of a regime switching Heston model of the type in Elliott and Lian [16], Elliott et al. [17,20]. In the final example, we explore a credit derivative with different interest rate and credit regimes. For this application, we do not base the model directly on the literature. The section concludes with a few high-level notes on hedging in the context of regime switching affine processes.

9.1 Riskless bond

Landén [37] and Elliott and Siu [19] consider the Vasiček and Cox–Ingersoll–Ross short rate models (with translated notation such that \(X_{t}\) is the short rate and \(K_{t}\) is the regime index as before)

$$\begin{aligned} \mathrm{d}X_{t} &=(b^{K_{t}}+\beta X_{t})\mathrm{d}t+ \sqrt{a^{K_{t}}}\mathrm{d}W_{t} \end{aligned}$$

and

$$\begin{aligned} \mathrm{d}X_{t}&=(b^{K_{t}}+\beta X_{t})\mathrm{d}t+\sqrt{\alpha X _{t}}\mathrm{d}W_{t}. \end{aligned}$$

Using Proposition 7.2, it is easy to see how these are special cases of regime switching affine processes. The authors find that the price of a zero-coupon bond at \(t\) with maturity \(T>t\) in these short rate models is

$$\begin{aligned} \mathbb{E}\bigg[\exp {\bigg(-\int _{t}^{T}X_{s}\mathrm{d}s\bigg)}\,\bigg|\,X_{t},K_{t}\bigg]. \end{aligned}$$

In our notation in (8.1), this is

$$\begin{aligned} Q_{T-t}^{*}f_{0}(X_{t},Y_{t}), \end{aligned}$$

where \(L^{*}_{k}(x)=\ell ^{k}+\lambda ^{1}x=-x\) is the discount rate as in (8.2) and \(f_{0}\) is simply \(f_{\tilde{u}}\) with \(\tilde{u}=0\). Note that due to \(\ell ^{k}=0\), the discounting formula is regime invariant and the regime only influences the price indirectly through the dynamics of the short rate \((X_{t})\). The authors use a system of differential equations that is a specific case of the ODE in Theorem 5.3, and use \(A(t,T,r,k)=\ln {\theta _{k}(T-t,0)}\).
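The bond price in the regime switching Vasiček model can be sketched numerically as follows. We assume here that \(F^{k}(u)=b^{k}u+\frac{1}{2}a^{k}u^{2}\) and \(R(u)=\beta u\), so that discounting with \(\lambda ^{1}=-1\) gives \(\partial _{t}\psi =\beta \psi -1\), \(\psi (0)=0\), with explicit solution \(\psi (t)=(1-\mathrm{e}^{\beta t})/\beta \). The ODE for \(\theta \) from Theorem 5.3 is then integrated with a classical Runge–Kutta scheme. All function names, parameter values and the zero-based regime index are illustrative assumptions; the single-regime closed form is the textbook Vasiček bond price, used only as a sanity check.

```python
import math


def psi(t, beta):
    """Solution of psi' = beta*psi - 1, psi(0) = 0 (discount rate L(x) = -x)."""
    return (1.0 - math.exp(beta * t)) / beta


def theta(T, b, a, beta, Q, n_steps=2000):
    """RK4 for theta' = (diag(F^k(psi(t))) + Q) theta, theta(0) = (1, ..., 1)."""
    p = len(b)

    def rhs(t, th):
        s = psi(t, beta)
        F = [b[k] * s + 0.5 * a[k] * s * s for k in range(p)]
        return [F[k] * th[k] + sum(Q[k][l] * th[l] for l in range(p))
                for k in range(p)]

    def shift(th, c, k):
        return [x + c * y for x, y in zip(th, k)]

    h = T / n_steps
    th = [1.0] * p
    for i in range(n_steps):
        t = i * h
        k1 = rhs(t, th)
        k2 = rhs(t + 0.5 * h, shift(th, 0.5 * h, k1))
        k3 = rhs(t + 0.5 * h, shift(th, 0.5 * h, k2))
        k4 = rhs(t + h, shift(th, h, k3))
        th = [x + h * (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0
              for x, c1, c2, c3, c4 in zip(th, k1, k2, k3, k4)]
    return th


def bond_prices(x, T, b, a, beta, Q):
    """Zero-coupon bond price per starting regime: theta_k(T) * exp(psi(T) * x)."""
    s = psi(T, beta)
    return [th_k * math.exp(s * x) for th_k in theta(T, b, a, beta, Q)]


def vasicek_closed_form(x, T, b, a, beta):
    """Single-regime Vasicek bond price with kappa = -beta and sigma^2 = a."""
    kappa = -beta
    B = (1.0 - math.exp(-kappa * T)) / kappa
    A = (b / kappa - a / (2.0 * kappa ** 2)) * (B - T) - a * B * B / (4.0 * kappa)
    return math.exp(A - B * x)
```

When all regimes share the same parameters, the output of `bond_prices` should coincide with `vasicek_closed_form` in every regime, irrespective of \(Q\).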

9.2 European option price

In the next example, we look at option pricing in a Lévy setting. Elliott and Osakwe [18] choose a regime switching univariate Lévy process with the Lévy kernel of the variance gamma process. Their arguments for this specific process are empirical, but theoretically, their derivation checks out for any other univariate Lévy kernel. Translated to our setup, they choose \(\alpha ^{i}=0\), \(\beta ^{i}=0\), \(\gamma ^{i}=0\), \(\mu ^{i}=0\) for \(i=1\) since there is only one dimension, and \(a^{k}=0\), \(b^{k}\) free, \(c^{k}=0\) and \(m^{k}(\mathrm{d}\xi ) \mathrm{d}t\) are the compensators of variance gamma processes for \(k=1,\ldots ,p\).

With these parameters, it follows that \(R^{i}(u)=0\) so that \(\psi (t,u)=u\). The function \(F^{k}(u)\) is the regime specific log-characteristic function

$$ F^{k}(u)=ub^{k}+\int _{\mathbb{R}\setminus \{0\}}\big(\mathrm{e}^{u\xi }-1-uh(\xi )\big)m^{k}(\mathrm{d}\xi ). $$

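Using Theorem 5.3, we get the characteristic function

$$ P_{t}^{*}f_{\tilde{u}}(\tilde{x})=\langle \theta (t,\tilde{u}),y\rangle \exp {(ux)}, $$

where \(\tilde{u}=(u,0)\), \(u\in \mathrm{i}\mathbb{R}\) and \(\theta \) is \(p\times 1\) and the solution to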

$$ \partial _{t}\theta (t,\tilde{u})=\Big(\operatorname*{diag}\big(F^{*}(u)\big)+Q \Big)\theta (t,\tilde{u}), \qquad \theta (0,\tilde{u})=1, $$

where 1 is a vector of ones, and with straightforward closed-form solution

$$ \theta (t,\tilde{u})=\exp {\bigg(\Big(\operatorname*{diag}\big(F^{*}(u)\big)+Q \Big)t\bigg)}1. $$

This result is obtained in an entirely different fashion by Elliott and Osakwe [18, Proposition 2]. By Lemma 6.1 and Theorem 6.2, we may expand \(u\) beyond \(\mathrm{i}\mathbb{R}\) to a convex open neighbourhood of 0 in \(\mathbb{C}\), as long as \(F\) is analytic on this neighbourhood. Note how in our setup, we can trivially generalise to any multivariate process \((X_{t})\), as in the setup of Vostrikova and Dong [45]. Variants of this result have been established in various contexts related to Markov additive processes; see e.g. Asmussen [1, Proposition XI.2.2].
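The closed-form solution for \(\theta \) is straightforward to evaluate. The sketch below does so for regime specific variance gamma exponents. We assume the common parametrisation in which the exponent is \(F(u)=bu-\nu ^{-1}\ln (1-\vartheta \nu u-\frac{1}{2}\sigma ^{2}\nu u^{2})\) with truncation function \(h=0\); the parameter names `sigma`, `nu` and `skew` (for \(\vartheta \)) are ours and not taken from [18]. The matrix exponential is computed by scaling and squaring with a truncated Taylor series, which is adequate for the small matrices considered here.

```python
import cmath


def vg_exponent(u, b, sigma, nu, skew):
    """Assumed variance gamma log-characteristic function F(u)."""
    return b * u - cmath.log(1.0 - skew * nu * u - 0.5 * sigma ** 2 * nu * u * u) / nu


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=20, squarings=8):
    """Matrix exponential via scaling and squaring with a Taylor series."""
    n = len(A)
    scale = 2.0 ** squarings
    S = [[A[i][j] / scale for j in range(n)] for i in range(n)]
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms + 1):
        term = [[x / m for x in row] for row in mat_mul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def theta(t, u, params, Q):
    """theta(t, (u, 0)) = exp((diag(F^*(u)) + Q) t) applied to a vector of ones."""
    p = len(params)
    M = [[(Q[k][l] + (vg_exponent(u, *params[k]) if k == l else 0.0)) * t
          for l in range(p)] for k in range(p)]
    return [sum(row) for row in mat_exp(M)]


# Two regimes with different volatility and skew (illustrative values).
Q = [[-0.5, 0.5], [1.5, -1.5]]
params = [(0.05, 0.15, 0.3, -0.05), (0.05, 0.35, 0.3, -0.20)]
theta_2y = theta(2.0, 1.0j, params, Q)
```

The characteristic function in regime \(k\) is then `theta_2y[k] * cmath.exp(u * x)` for \(X_{0}=x\). If all regimes share the same exponent \(F\), the result reduces to \(\exp {(tF(u))}\) in every regime because the rows of \(Q\) sum to zero.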

9.3 Stochastic volatility

Next, we cast the regime switching Heston model of Elliott et al. [17] in the regime switching affine process format. The interpretation of volatility in terms of regimes is not new, but has become more important in the low volatility period of 2017; see Hamilton and Susmel [28]. Let \(\ln S_{t}\) be the log-stock price, \(V_{t}\) the (instantaneous) variance and \(K_{t}\) the regime index. The regime switching Heston model has

$$\begin{aligned} \mathrm{d}\ln S_{t} &=\bigg(r-\frac{1}{2}V_{t}\bigg)\mathrm{d}t +\sqrt{V _{t}}\mathrm{d}W_{t}^{1}, \\ \mathrm{d}V_{t} &=\kappa (\theta _{K_{t}}-V_{t})\mathrm{d}t+\nu \sqrt{V _{t}}\big(\rho \mathrm{d}W_{t}^{1}+\sqrt{1-\rho ^{2}}\mathrm{d}W_{t} ^{2}\big). \end{aligned}$$

When \(X_{t}=(\ln S_{t},V_{t})\), it is easy to see this in the light of Proposition 7.2 as

$$\begin{aligned} \begin{pmatrix} \mathrm{d}\ln S_{t} \\ \mathrm{d}V_{t} \end{pmatrix} &= \bigg( \begin{pmatrix} r \\ \kappa \theta _{K_{t}} \end{pmatrix} + \begin{pmatrix} 0&-\frac{1}{2} \\ 0&-\kappa \end{pmatrix} \begin{pmatrix} \ln S_{t} \\ V_{t} \end{pmatrix} \bigg)\mathrm{d}t \\ & \phantom{=:}+\sqrt{ \begin{pmatrix} 0&0 \\ 0&0 \end{pmatrix} \ln S_{t}+ \begin{pmatrix} 1&\nu \rho \\ \nu \rho &\nu ^{2} \end{pmatrix} V_{t}} \begin{pmatrix} \mathrm{d}W_{t}^{1} \\ \mathrm{d}W_{t}^{2} \end{pmatrix} . \end{aligned}$$

From this formulation, we can read off all parameters required for \(R^{i}\) and \(F^{k}\) as

$$\begin{aligned} a^{k} &= \begin{pmatrix} 0&0 \\ 0&0 \end{pmatrix} , \qquad \alpha ^{1}= \begin{pmatrix} 0&0 \\ 0&0 \end{pmatrix} , \qquad \alpha ^{2}= \begin{pmatrix} 1&\nu \rho \\ \nu \rho &\nu ^{2} \end{pmatrix} , \\ b^{k}&= \begin{pmatrix} r \\ \kappa \theta _{k} \end{pmatrix} , \qquad \beta ^{1}= \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \qquad \beta ^{2}= \begin{pmatrix} -\frac{1}{2} \\ -\kappa \end{pmatrix} . \end{aligned}$$

The killing and jump parameters \(c^{k}\), \(\gamma ^{i}\), \(m^{k}\) and \(\mu ^{i}\) are all zero, and we note that \(a^{k}\) is zero and thus regime invariant. From here on, it is straightforward to numerically solve the Riccati equations for \(\psi (t,u)\) and \(\theta (t,\tilde{u})\) using (3.2) and Theorem 5.3 to get the characteristic function. With this, we can price European options through standard Fourier transformation as in Elliott et al. [17], or we can price the variance swap in Elliott and Lian [16], Elliott et al. [20].
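A minimal numerical sketch of this step is given below. It assumes the convention \(F^{k}(u)=\langle b^{k},u\rangle +\frac{1}{2}\langle a^{k}u,u\rangle \) and \(R^{i}(u)=\langle \beta ^{i},u\rangle +\frac{1}{2}\langle \alpha ^{i}u,u\rangle \), in line with the Itô computation in Sect. 7, and a mean reversion drift \(\kappa (\theta _{K_{t}}-V_{t})\) for the variance. Under these assumptions \(R^{1}=0\), so \(\psi _{1}(t,u)=u_{1}\), and only \(\psi _{2}\) and \(\theta \) need to be integrated. Function names, the fixed-step Runge–Kutta scheme and the zero-based regime index are our own choices.

```python
import cmath
import math


def rk4_step(rhs, state, h):
    """One classical Runge-Kutta step for an autonomous system."""
    def shifted(c, k):
        return [s + c * ki for s, ki in zip(state, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(0.5 * h, k1))
    k3 = rhs(shifted(0.5 * h, k2))
    k4 = rhs(shifted(h, k3))
    return [s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def rs_heston_riccati(u1, u2, T, r, kappa, nu, rho, levels, Q, n_steps=2000):
    """Return psi_2(T, u) and theta(T, (u, 0)); levels[k] is theta_k."""
    p = len(levels)

    def rhs(state):
        psi2, th = state[0], state[1:]
        # R^2(psi) with psi_1 = u1 fixed.
        d_psi2 = (-0.5 * u1 - kappa * psi2
                  + 0.5 * (u1 * u1 + 2.0 * nu * rho * u1 * psi2
                           + nu * nu * psi2 * psi2))
        # F^k(psi) = r * psi_1 + kappa * theta_k * psi_2.
        F = [r * u1 + kappa * levels[k] * psi2 for k in range(p)]
        d_th = [F[k] * th[k] + sum(Q[k][l] * th[l] for l in range(p))
                for k in range(p)]
        return [d_psi2] + d_th

    state = [complex(u2)] + [1.0 + 0.0j] * p
    h = T / n_steps
    for _ in range(n_steps):
        state = rk4_step(rhs, state, h)
    return state[0], state[1:]


def rs_heston_cf(u1, u2, T, log_s0, v0, regime, **params):
    """E[exp(u1 * ln S_T + u2 * V_T)] when starting in the given regime."""
    psi2, th = rs_heston_riccati(u1, u2, T, **params)
    return th[regime] * cmath.exp(u1 * log_s0 + psi2 * v0)
```

A quick consistency check is \(u=(1,0)\): then \(R^{2}(u)=0\) and \(F^{k}(u)=r\), so the function returns \(S_{0}\mathrm{e}^{rT}\) in every regime, as expected for a discounted stock price that is a martingale.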

9.4 Defaultable bond

Lastly, we consider an example from credit risk, namely the price of a defaultable zero-coupon bond of a bank, with no recovery value. As far as we know, this example does not correspond to a model found in the literature. This simplified case can be extended to include recovery value and can be used to price a credit default swap, possibly with counterparty credit risk, using the principles in Duffie et al. [14]. We consider a situation in which banks, in search of yield, tend to take on riskier loans in a low interest rate environment. We model this by constructing two regimes indexed by a process \((K_{t})\): one stressed regime with a relatively low mean reversion level for the interest rate \((r_{t})\) and a correspondingly high mean reversion level for the bank’s default hazard rate \((h_{t})\), and one regime in which these roles are reversed. Both processes are correlated CIR processes under the risk neutral measure,

$$\begin{aligned} \mathrm{d}r_{t} &=\theta _{r}(\mu ^{r}_{K_{t}}-r_{t})\mathrm{d}t+\sigma _{r}\sqrt{r_{t}}\mathrm{d}W_{t}^{1}, \\ \mathrm{d}h_{t} &=\theta _{h}(\mu ^{h}_{K_{t}}-h_{t})\mathrm{d}t+\sigma _{h}\sqrt{h_{t}}\big(\rho \mathrm{d}W_{t}^{1}+ \sqrt{1-\rho ^{2}}\mathrm{d}W_{t}^{2}\big), \end{aligned}$$

where \(\mu _{1}^{r}<\mu _{2}^{r}\) and \(\mu ^{h}_{1}>\mu ^{h}_{2}\). It is well known from Lando [38, Proposition 3.1] that by the Markov property of the involved processes, the price of such a bond is given by

$$\begin{aligned} \mathbf{1}_{\{\tau >t\}}\mathbb{E}\bigg[\exp {\bigg(-\int _{t}^{T}(r_{s}+h_{s})\mathrm{d}s\bigg)}\,\bigg|\,r_{t},h_{t},K_{t}\bigg], \end{aligned}$$

where \(\tau \) is the default time corresponding to the intensity process \(h\) and \(T\) is the maturity. To translate this to our setup, write \(X=(r,h)\). It is straightforward to derive the parameters of the regime switching process for this case as it closely resembles the Heston model in the previous example. Using (8.1), we get

$$\begin{aligned} \mathbf{1}_{\{\tau >t\}}Q_{T-t}^{*}f_{0}(X_{t},Y_{t}), \end{aligned}$$

where \(L^{*}_{k}(x)=\ell ^{k}+\lambda ^{1}x_{1}+\lambda ^{2}x_{2}=-r-h\) is the discount rate as in (8.2). Note that due to \(\ell ^{k}=0\), we have regime invariance in the discounting, meaning that \({\langle L^{*}(X_{t}),Y_{t}\rangle =-r_{t}-h_{t}}\). This pricing problem can be solved numerically using the ODE in Theorem 5.3.
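The numerical solution can be sketched along the following lines. As a simplifying assumption made only for this illustration, we set \(\rho =0\), so that the two Riccati equations decouple: \(\partial _{t}\psi _{i}=-\kappa _{i}\psi _{i}+\frac{1}{2}\sigma _{i}^{2}\psi _{i}^{2}-1\), \(\psi _{i}(0)=0\), for \(i\in \{r,h\}\), and \(F^{k}(\psi )=\kappa _{r}\mu ^{r}_{k}\psi _{r}+\kappa _{h}\mu ^{h}_{k}\psi _{h}\). To avoid a clash with the function \(\theta \), the mean reversion speeds \(\theta _{r}\), \(\theta _{h}\) are called `kap_r`, `kap_h` in the code. All names, the Runge–Kutta scheme and the parameter values are hypothetical; the single-regime CIR bond formula is the standard textbook one and serves as a check.

```python
import math


def rk4_step(rhs, state, h):
    """One classical Runge-Kutta step for an autonomous system."""
    def shifted(c, k):
        return [s + c * ki for s, ki in zip(state, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(0.5 * h, k1))
    k3 = rhs(shifted(0.5 * h, k2))
    k4 = rhs(shifted(h, k3))
    return [s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def defaultable_bond(r0, h0, T, kap_r, kap_h, sig_r, sig_h, mu_r, mu_h, Q,
                     n_steps=2000):
    """Pre-default bond price per starting regime (rho = 0 assumed)."""
    p = len(mu_r)

    def rhs(state):
        psi_r, psi_h, th = state[0], state[1], state[2:]
        # Riccati equations with discounting lambda^1 = lambda^2 = -1.
        d_r = -kap_r * psi_r + 0.5 * sig_r ** 2 * psi_r ** 2 - 1.0
        d_h = -kap_h * psi_h + 0.5 * sig_h ** 2 * psi_h ** 2 - 1.0
        F = [kap_r * mu_r[k] * psi_r + kap_h * mu_h[k] * psi_h for k in range(p)]
        d_th = [F[k] * th[k] + sum(Q[k][l] * th[l] for l in range(p))
                for k in range(p)]
        return [d_r, d_h] + d_th

    state = [0.0, 0.0] + [1.0] * p
    step = T / n_steps
    for _ in range(n_steps):
        state = rk4_step(rhs, state, step)
    psi_r, psi_h, th = state[0], state[1], state[2:]
    return [th_k * math.exp(psi_r * r0 + psi_h * h0) for th_k in th]


def cir_bond(x0, T, kappa, mu, sigma):
    """Single-regime CIR zero-coupon bond price E[exp(-int_0^T X_s ds)]."""
    g = math.sqrt(kappa ** 2 + 2.0 * sigma ** 2)
    denom = (g + kappa) * (math.exp(g * T) - 1.0) + 2.0 * g
    B = 2.0 * (math.exp(g * T) - 1.0) / denom
    A = (2.0 * kappa * mu / sigma ** 2) * math.log(
        2.0 * g * math.exp((g + kappa) * T / 2.0) / denom)
    return math.exp(A - B * x0)
```

With identical regimes, the price should factor into the product of two single-regime CIR bond prices, one for \(r\) and one for \(h\); with distinct regimes, the two entries of the output give the price conditional on the current regime.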

9.5 Notes on hedging

In the Black–Scholes world (and several generalisations to other diffusion processes), hedging a contingent claim is well understood and standard. After switching to an equivalent martingale measure, we can use a unique self-financing strategy to replicate the payoff of the contingent claim almost surely. With jumps involved, market incompleteness generally (but not always) arises, i.e., the contingent claim cannot be exactly replicated. It is beyond the scope of this paper to go into detail, but worthwhile to highlight the implications when using regime switching processes for hedging purposes.

In such applications, the set of generators \(A^{y}\) may cause hedging errors due to incompleteness, for example because these are the generators of processes that include certain jump components. These hedging errors may be regime specific, i.e., in some regimes it may be harder to hedge than in others. In addition, the perturbation by the generator \(B\) may lead to a further source of hedging error that is caused by non-tradable regime changes. This is clearest in the case when we try to hedge a European option in a regime switching Black–Scholes world where only the volatility is different in each regime. In the trivial case that the regime is observed and constant (i.e., each state is absorbing), the contingent claim can be replicated as in the Black–Scholes case. If the regime is observed and can be traded, then the risk of jumping to another regime is priced, and the claim can be replicated again with appropriate modifications to the absorbing case. However, if the regime cannot be traded, then in general a hedging error emerges. In this specific Black–Scholes case, the hedging error can be fully attributed to the regime switching nature of the process, since the generators \(A^{y}\) correspond to different Black–Scholes diffusions. The hedging error can be minimised using for instance mean–variance hedging and locally risk minimising trading strategies. Di Masi et al. [10] show explicitly how this can be done for the regime switching volatility case described above. It is conceivable that their approach can be mimicked for our regime switching affine processes, but as said above, this is beyond the scope of the present paper.