1 Introduction

1.1 Motivation

Affine processes are highly valued in the field of mathematical finance thanks to their exceptional analytical tractability, making them a popular choice for stochastic modelling. In the last two decades, numerous authors have studied affine processes and their applications on various state spaces, including the canonical state space \(\mathbb{R}_{+}^{d} \times \mathbb{R}^{n}\) studied in Duffie et al. [25], Keller-Ressel [48, Part I], Dawson and Li [23], Filipović and Mayerhofer [28], Keller-Ressel and Mayerhofer [50], Kallsen and Muhle-Karbe [44], and the cone of symmetric positive semi-definite \((d \times d)\)-matrices \(\mathbb{S}_{+}^{d}\) studied in Cuchiero et al. [21], Cuchiero [20, Chap. 5]. Affine diffusions on canonical state spaces in infinite-dimensional Hilbert spaces have been studied in Schmidt et al. [64], Yu [68, Chap. 2] with applications to interest rate modelling. Moreover, infinite-dimensional affine processes also emerge in Cuchiero and Teichmann [22] as Markovian lifts of affine Volterra processes, which are used to model rough volatility. Besides the general framework discussed therein, examples of Markovian lifts are either constructed on a space of signed measures or on the Filipović space which also plays a central role for our applications; see also Sect. 4.

In this work, we focus on infinite-dimensional affine stochastic covariance models that are used to describe the forward curve dynamics in fixed-income or commodity markets in Benth et al. [9], Benth and Simonsen [11], Benth and Sgarra [10], Benth et al. [8], Cox et al. [18], Karbach [46, Part I]. These models are developed within the Heath, Jarrow, Morton and Musiela (HJMM) framework in Carmona and Tehranchi [13] and Benth and Krühner [7], and consist of a hyperbolic stochastic partial differential equation that describes the evolution of the forward curve, an inherently infinite-dimensional object modelled in the Filipović space, and a separate operator-valued instantaneous covariance process that captures the covariance structure of the forward curve dynamics.

The main challenge in these models lies in specifying the instantaneous covariance process in a way that is both accurate and feasible for tasks in statistical inference, simulation or the pricing of financial derivatives. To address the challenges associated with these models, the analytical tractability of affine processes plays a crucial role. In fact, in finite dimension, the instantaneous covariance process has been modelled as an affine process on symmetric positive semi-definite matrices in Cuchiero et al. [21], Leippold and Trojani [54], Barndorff-Nielsen and Stelzer [4], Gouriéroux and Sufana [36]. Motivated by the need to capture high-dimensional noise, for example to model maturity-specific risk (see Carmona and Tehranchi [13, Chap. 2], Benth and Krühner [7] and Koekebakker and Ollmar [52] for the case of electricity markets), it was proposed in Benth et al. [9] and Cox et al. [17,18] to model the instantaneous covariance process of the forward curves by affine processes on positive Hilbert–Schmidt operators. These new affine stochastic covariance models provide great flexibility in modelling high-dimensional noise as they allow infinite-rank operator-valued instantaneous covariance while maintaining sufficient tractability due to their affine structure.

Empirical data suggests that stochastic covariance, similarly to stochastic volatility, exhibits the stylised fact of mean-reversion. The latter allows us to define e.g. long-term averages and the speed of mean-reversion for stochastic covariance; see e.g. Teng et al. [65]. Mathematically, mean-reversion is captured by the long-time behaviour, stationarity and ergodicity of the process. In this article, we study these properties for the class of affine processes on positive Hilbert–Schmidt operators and hence provide a mathematical justification of mean-reversion. Beyond the study of mean-reversion, stationarity also plays an important role in parameter estimation and model calibration of the instantaneous covariance process as noted in Alfonsi et al. [1]. Ergodicity, on the other hand, is critical for the \(L^{1}\)-convergence of the generalised method of moments in stochastic covariance models, as discussed in Pigorsch and Stelzer [62]. In the broader picture, a better understanding of the long-time behaviour of forward curve models is also critical for managing the risk associated with portfolios of forward contracts and evaluating long-term investment decisions. In the following, we introduce our setting and present our main results more formally.

1.2 Main result and applications

Following the literature, we realise the evolution of the forward curves as a function-valued process with values in some Filipović space; see Sect. 4 below. Thus let us consider a real and separable Hilbert space \((H,\langle \,\cdot \,,\,\cdot \,\rangle _{H})\). Accordingly, we model the instantaneous covariance process of the forward curve dynamics on \(\mathcal {H}_{+}\), the set of all positive self-adjoint Hilbert–Schmidt operators defined on \(H\). Note that \(\mathcal {H}_{+}\) is a convex cone in the Hilbert space \(\mathcal {H}\) of all self-adjoint Hilbert–Schmidt operators on \(H\), equipped with the trace inner product \(\langle x,y\rangle :=\mathrm {Tr}(yx)\) for \(x,y\in \mathcal {H}\). Given a time-homogeneous Markov process \((X_{t})_{t\geq 0}\) with values in \(\mathcal {H}_{+}\), we denote its transition kernels by \((p_{t}(x,\,\cdot \,))_{t\geq 0}\), and following the general terminology of affine processes, we call \((X_{t})_{t\geq 0}\) affine whenever the Laplace transform of \(X_{t}\), for every \(t\geq 0\), is of an exponential-affine form in the initial value \(x\in \mathcal {H}_{+}\), i.e.,

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle \xi ,u\rangle}p_{t}(x,\mathrm {d}\xi )=\mathrm {e}^{- \phi (t,u)-\langle x,\psi (t,u)\rangle}, \qquad t\geq 0, u\in \mathcal {H}_{+}, \end{aligned}$$
(1.1)

for some functions \(\phi \colon \mathbb{R}_{+}\times \mathcal {H}_{+}\to \mathbb{R}_{+}\) and \(\psi \colon \mathbb{R}_{+}\times \mathcal {H}_{+}\to \mathcal {H}_{+}\). Typically, the functions \(\phi \) and \(\psi \) are solutions of an associated pair of generalised Riccati equations that can be solved numerically. Affine processes on the state-space \(\mathcal {H}_{+}\), as considered in this work, have been first introduced and studied in Cox et al. [17]. Their construction is summarised in Theorem 2.2 below; see also [17, Theorem 2.8]. We want to emphasise that such processes can be considered as the natural infinite-dimensional analogues of affine processes on symmetric positive semi-definite \((d\times d)\)-matrices \(\mathbb{S}_{+}^{d}\) studied in Cuchiero et al. [21]. Indeed, for \(H=\mathbb{R}^{d}\), the self-adjoint Hilbert–Schmidt operators on \(H\) are precisely the symmetric matrices equipped with the Frobenius norm, and \(\mathcal {H}_{+}=\mathbb{S}_{+}^{d}\) holds in this case.

To carry out our study of long-time behaviour for affine stochastic covariance models, we first prove that the transition kernels \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) of the affine instantaneous covariance process converge weakly to some probability measure \(\pi \) on \(\mathcal {H}_{+}\) as \(t \to \infty \). This limit distribution is then shown to be the unique invariant measure of the process. Afterwards, we provide an exponential rate of convergence in the Wasserstein distance of order \(p \in [1,2]\) and finally construct the corresponding stationary process that has the invariant measure \(\pi \) as its time-marginals. Below, we provide a short version of our main result. We call an affine process on \(\mathcal{H}_{+}\) subcritical if its state-dependent drift is negative (i.e., all eigenvalues have strictly negative real parts).

Theorem 1.1

Let \((X_{t})_{t\geq 0}\) be a subcritical affine process on the cone \(\mathcal {H}_{+}\) with transition kernels \((p_{t}(x,\,\cdot \,))_{t\geq 0}\), the existence of which is guaranteed by Theorem 2.2 below. Then the following hold true:

i) There exists a unique invariant measure \(\pi \) of \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) for all \(x\in \mathcal {H}_{+}\).

ii) For every \(x\in \mathcal {H}_{+}\) and \(p\in [1,2]\), the family \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) converges exponentially fast to \(\pi \) in the Wasserstein distance of order \(p\) as \(t \to \infty \).

iii) There exists a Markov process \((X_{t}^{\pi})_{t\geq 0}\) with transition kernels \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) such that the distribution of \(X_{t}^{\pi}\) is equal to \(\pi \) for all \(t\geq 0\).
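To make Theorem 1.1 concrete, the following Python sketch treats the simplest scalar case \(H=\mathbb{R}\), so that \(\mathcal {H}_{+}=\mathbb{R}_{+}\). All parameter values and function names are illustrative assumptions and are not taken from the article: we assume \(\mu =0\), a linear drift \(B=-\beta \) with \(\beta >0\), a jump measure \(m(\mathrm {d}\xi )=\lambda \eta \mathrm {e}^{-\eta \xi}\,\mathrm {d}\xi \) on \((0,\infty )\), and a constant \(c\geq 0\) standing for the drift \(b\) minus the compensator of the small jumps. Under these assumptions, the generalised Riccati equations recalled in Theorem 2.2 have the closed-form solutions used in the code, and the limit of the Laplace transform in (1.1) is that of a shifted gamma distribution. The sketch only checks the invariance identity and the decay of the Laplace-transform gap; it does not address the Wasserstein rates in part ii).

```python
import math

# Hypothetical scalar parameters (illustration only, not from the article).
BETA = 0.8   # B = -BETA, so the linear drift is negative
C = 0.1      # b minus the integral of chi against m, assumed >= 0
LAM = 2.0    # total mass of the jump measure m
ETA = 1.5    # m(dxi) = LAM * ETA * exp(-ETA * xi) dxi

def psi(t, u):
    """Solution of psi' = -BETA * psi with psi(0) = u."""
    return u * math.exp(-BETA * t)

def phi(t, u):
    """Integral over [0, t] of F(psi(s, u)) with F(v) = C*v + LAM*v/(ETA + v)."""
    p = psi(t, u)
    return C * (u - p) / BETA + (LAM / BETA) * math.log((ETA + u) / (ETA + p))

def laplace_pt(t, x, u):
    """Laplace transform of p_t(x, .) at u via the affine formula (1.1)."""
    return math.exp(-phi(t, u) - x * psi(t, u))

def laplace_pi(u):
    """Laplace transform of the limit law: C/BETA plus a Gamma(LAM/BETA, rate ETA)."""
    return math.exp(-C * u / BETA) * (1.0 + u / ETA) ** (-LAM / BETA)

def invariance_gap(t, u):
    """Integrating laplace_pt(t, x, u) against pi(dx) yields
    exp(-phi(t, u)) * laplace_pi(psi(t, u)); invariance means this is laplace_pi(u)."""
    return abs(math.exp(-phi(t, u)) * laplace_pi(psi(t, u)) - laplace_pi(u))

def convergence_gap(t, x, u):
    """Distance between the time-t and the limiting Laplace transform at u."""
    return abs(laplace_pt(t, x, u) - laplace_pi(u))

if __name__ == "__main__":
    for t in (1.0, 5.0, 10.0, 20.0):
        print(t, convergence_gap(t, x=3.0, u=1.0), invariance_gap(t, u=1.0))
```

In this toy model the gap shrinks roughly like \(\mathrm {e}^{-\beta t}\), since \(\psi (t,u)=u\,\mathrm {e}^{-\beta t}\) enters both exponents linearly to first order.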

In Sect. 1.3 below, we give an in-depth overview of our methodology and the relevance of our work in the context of the existing literature and methods. In the following, we emphasise the relevance of our results for applications in stochastic covariance modelling of forward curve dynamics.

Indeed, having understood the long-time behaviour of affine instantaneous covariance processes of Hilbert-valued stochastic covariance models, we can introduce and study these models in the stationary covariance regime, where we replace the affine instantaneous covariance process by its corresponding stationary affine process having the same transition probabilities and whose existence is guaranteed by part iii) of the result above (which is Proposition 3.5 below). The notion of the stationary covariance regime is inspired by the univariate case, i.e., the stationary variance regime in Keller-Ressel [49]. In Proposition 4.2 below, we first show that the affine stochastic covariance model in the stationary covariance regime satisfies an affine transform formula, making it a tractable model for e.g. the pricing of options on forwards; see Sect. 4.3 below. Since the operator Barndorff-Nielsen–Shephard (BNS) model of Benth et al. [9] is known to be included in the affine class, we derive the characteristic function of the operator BNS model in the stationary covariance regime. This complements the literature on infinite-dimensional BNS-type models by their long-time behaviour, extending the finite-dimensional matrix-valued case in Barndorff-Nielsen and Stelzer [4], Pigorsch and Stelzer [62].

As an application of such stochastic covariance models in the stationary regime, we study in Sect. 4.3 the implied forward volatility for forward-start options written on commodity forwards. In particular, we examine the implied forward volatility in the geometric affine stochastic covariance model and show in Proposition 4.4 below that it converges to the implied spot volatility of a European call option written on the forward, but this time modelled in the stationary covariance regime. This extends a result in Keller-Ressel [49, Proposition 5.2] for forward-start options in (univariate) affine stochastic volatility models to an infinite-dimensional setting.

1.3 Related literature and methodology

The long-time behaviour of affine processes on the finite-dimensional state spaces \(\mathbb{R}_{+}^{d}\times \mathbb{R}^{n}\) and \(\mathbb{S}_{+}^{d}\) for \(d,n\in \mathbb{N}\) is now mostly well understood. More precisely, based on the representation by strong solutions of stochastic differential equations, ergodicity was studied for different Wasserstein distances in Friesen et al. [34]. By using the regularity of transition densities with respect to Lebesgue measure combined with the Meyn–Tweedie stability theory, the ergodicity in total variation distances has been studied in Barczy et al. [3], Friesen and Jin [31], Jin et al. [41, 42], Mayerhofer et al. [60], Friesen et al. [33]. Finally, coupling techniques for affine processes are studied in Wang [67], Li and Ma [58]. Unfortunately, all these methods implicitly either use the dimension of the state space or are closely related to the particular structure of the state space and hence do not allow an extension to our infinite-dimensional setting. Indeed, for general affine processes on \(\mathcal{H}_{+}\), a pathwise construction is still absent from the literature which rules out the methods developed in Friesen et al. [34]. The absence of an infinite-dimensional Lebesgue measure prevents us from effectively using the Meyn–Tweedie stability theory (in terms of estimates of the density). Finally, although there exist some extensions of the coupling techniques to infinite-dimensional state-spaces (see Li [57] for measure-valued branching processes), these methods seem to be closely related to the measure-valued structure of the process and hence are not suitable for our Hilbert space framework.

The most promising tool for the study of the long-time behaviour for affine processes in infinite-dimensional settings is thus based on the convergence of Fourier–Laplace transforms. This requires, in view of the affine transform formula (1.1), to study stability for the solutions \(\phi \), \(\psi \) of the generalised Riccati equations. For finite-dimensional state spaces, these ideas have been developed in Glasserman and Kim [35], Keller-Ressel [49], Keller-Ressel and Mijatović [51], Jin et al. [43], Pigorsch and Stelzer [62], Friesen et al. [32]. In these works, the existence of an invariant distribution (as well as weak convergence of transition probabilities) is obtained from Lévy’s continuity theorem. More recently, such techniques also have been applied in Friesen [30] to Dawson–Watanabe superprocesses with immigration, which form a specific class of affine processes on the state space of (possibly tempered) measures (so-called measure-valued Markov processes).

Unfortunately, in infinite-dimensional settings, there exists no direct analogue of Lévy’s continuity theorem since additional technical tightness conditions need to be verified. For Ornstein–Uhlenbeck processes on Hilbert spaces, such a problem can be avoided by taking advantage of their infinite divisibility; see Chojnowska-Michalik [15]. Note that such processes form a subclass of affine processes; see also Example 3.7 below. Apart from this, the long-time behaviour of affine processes in infinite dimensions has not been investigated in a systematic way. The present work provides a first general treatment of this topic, while our methodology also allows the treatment of other infinite-dimensional models that have so far been out of reach.

Namely, as a first step, we construct a candidate for the invariant measure \(\pi \) as a weak limit of the transition probabilities when \(t \to \infty \), and show that this limit is indeed the unique invariant measure of the process. To this end, we follow ideas taken from Friesen et al. [32] where the state-space \(\mathbb{S}_{+}^{d}\) was studied and show that for subcritical affine processes, the limits \(\lim _{t\to \infty}\phi (t,u)\) and \(\lim _{t\to \infty}\psi (t,u)\) exist for every \(u\in \mathcal {H}_{+}\). Consequently, the Fourier–Laplace transform of the process (see (1.1)) converges when \(t \to \infty \). In contrast to the finite-dimensional case, this alone does not suffice to obtain the existence of a limit distribution \(\pi \). As a key tool, we propose to utilise the generalised Feller semigroup approach for the process (see Appendix A for a definition). More precisely, we provide uniform bounds on the operator norm of the transition semigroup \((P_{t})_{t \geq 0}\) which allows us to prove that \(\lim _{t \to \infty}P_{t} f =: \ell (f)\) exists for a sufficiently large class of functions \(f\). By showing that the limit \(\ell \) is a continuous linear functional, we can apply a variant of Riesz’ representation theorem for generalised Feller semigroups to show that \(\ell \) has a representation \(\ell (f) = \int _{\mathcal{H}_{+}}f(y)\pi (\mathrm {d}y)\). The measure \(\pi \) obtained in this way is then shown to be the desired unique invariant probability measure. As a by-product, we also obtain weak convergence of the transition probabilities. Due to the infinite-dimensional nature of the process and the use of generalised Feller semigroups, such a convergence a priori only holds in the weak topology on \(\mathcal{H}_{+}\).

In the second step, we strengthen the above convergence from the weak topology to the norm topology by establishing bounds on the Wasserstein distance of order \(p\in [1,2]\) for the transition probabilities with respect to the invariant measure \(\pi \). In contrast to the finite-dimensional results in Friesen et al. [34,32], our new bounds are dimension-free while the convergence rate is shown to be still exponential. Finally, the corresponding stationary process is obtained via Kolmogorov’s consistency theorem which is applicable since \(\pi \) is shown to be inner regular (a condition that is always satisfied when working in finite-dimensional cases). As a side product of our considerations, we also obtain that the first two moments of the process converge to the first two moments of the invariant measure. Using generalised Feller semigroups, we provide explicit formulas for these.

We should like to mention that our proposed method for the study of the long-time behaviour of stochastic processes on Hilbert spaces via the generalised Feller semigroup can also be applied to other settings. For instance, in the recent work Jacquier et al. [38] that appeared at the same time as this article, the authors established a Krylov–Bogoliubov-type theorem for generalised Feller semigroups and subsequently applied it to multi-dimensional rough Heston models. Their key method, a uniform bound on the operator norm of \(P_{t}\), also appears as an auxiliary step in our result (see Lemma 5.5). However, our approach does not require such a Krylov–Bogoliubov-type result and also allows us to prove the uniqueness of limit distributions. Following Cuchiero and Teichmann [22] and Benth et al. [6], general (affine) Volterra processes can be realised as Markovian lifts on infinite-dimensional spaces including the Filipović space. Therefore, the methods of this work will be also useful for the study of such Markovian lifts associated with affine Volterra processes.

1.4 Layout of the article

In Sect. 2, we introduce the class of affine processes on \(\mathcal {H}_{+}\) established in Cox et al. [17] and recall some preliminary results. Subsequently, in Sect. 3, we present and discuss our main results in full detail. Afterwards, in Sect. 4, we discuss applications of our results in the context of affine stochastic covariance models for forward curves. Finally, the proofs are contained in Sect. 5 which is subdivided into several subsections: We first consider the long-time behaviour of the solutions of the generalised Riccati equations in Sect. 5.1, then prove the existence of a unique invariant measure in Sect. 5.2, derive the convergence rates in Sect. 5.3, show the existence of stationary affine processes in Sect. 5.4 and finally prove the moment formulas of the invariant measure in Sect. 5.5. We end the article with a brief conclusion and outlook in Sect. 6. For the reader’s convenience, we added some background information on generalised Feller semigroups in Appendix A, where we give in particular a version of Kolmogorov’s extension theorem that is tailored to our needs. In Appendix B, we give a convolution property for the Wasserstein distance of order 2.

2 Preliminaries: affine processes on \(\mathcal {H}_{+}\)

We set \(\mathbb{N}=\left \{ 1,2,\ldots\right \}\) and \(\mathbb{N}_{0}=\mathbb{N}\cup \{0\}\). For a complex number \(z\in \mathbb{C}\), we denote its real and imaginary parts by \(\Re (z)\) and \(\Im (z)\). For a vector space \(X\) and a subset \(U\subseteq X\), we denote the linear span of \(U\) in \(X\) by \(\mathrm {lin}(U)\). For a topological vector space \((X,\tau )\) and a subset \(S\subseteq X\), we denote the Borel-\(\sigma \)-algebra generated by the relative topology on \(S\) by \(\mathcal {B}(S)\). We write \(C_{b}(S)\) for the space of real-valued bounded functions on \(S\) that are continuous with respect to the relative topology; then \(C_{b}(S)\) is a Banach space when endowed with the supremum norm \(\| \cdot \|_{C_{b}(S)}\). For a Banach space \(X\) with norm \(\|\cdot\|_{X}\), we denote by \(\mathcal {L}(X)\) the space of all bounded linear operators on \(X\), which becomes a Banach space when equipped with the operator norm \(\| \cdot \|_{\mathcal {L}(X)}\).

Throughout this article, we let \((H, \langle \,\cdot \,,\,\cdot \,\rangle _{H})\) be a separable real Hilbert space and denote by \(\mathcal {H}\) the set of all self-adjoint Hilbert–Schmidt operators from \(H\) to \(H\). This is a Hilbert space when endowed with the trace inner product

$$ \langle A, B \rangle = \sum _{n=1}^{\infty} \langle A f_{n}, B f_{n} \rangle _{H} $$

for \(A,B\in \mathcal {H}\), where \((f_{n})_{n\in \mathbb{N}}\) is an orthonormal basis for \(H\). Note that \(\langle \,\cdot \,, \,\cdot \, \rangle \) is independent of the choice of the orthonormal basis. We denote by \(\| \cdot \|\) the norm on \(\mathcal {H}\) induced by \(\langle \,\cdot \,, \,\cdot \, \rangle \). In addition, we denote by \(\mathcal {H}_{+}\) the set of all positive operators in \(\mathcal {H}\), i.e., \(\mathcal {H}_{+} :=\{ A \in \mathcal {H}\colon \langle Ah, h\rangle _{H} \geq 0 \text{ for all } h\in H \}\). Note that \(\mathcal {H}_{+}\) is a closed subset of \(\mathcal {H}\). Moreover, it is a convex cone in \(\mathcal {H}\), i.e., \(\mathcal {H}_{+}+\mathcal {H}_{+}\subseteq \mathcal {H}_{+}\), \(\lambda \mathcal {H}_{+}\subseteq \mathcal {H}_{+}\) for all \(\lambda \geq 0\), and \(\mathcal {H}_{+}\cap (-\mathcal {H}_{+}) = \{ 0\}\). The cone \(\mathcal {H}_{+}\) induces a partial ordering \(\leq _{ \mathcal {H}_{+}}\) on \(\mathcal {H}\), which is defined by \(x\leq _{ \mathcal {H}_{+}} y\) whenever \(y-x\in \mathcal {H}_{+}\). The cone \(\mathcal {H}_{+}\) is also generating for \(\mathcal {H}\), i.e., \(\mathcal {H}=\mathcal {H}_{+}-\mathcal {H}_{+}\). For a Hilbert space \((V,\langle \,\cdot \,,\,\cdot \,\rangle _{V})\), in this article either \((H, \langle \,\cdot \,,\,\cdot \,\rangle _{H})\) or \((\mathcal {H},\langle \, \cdot \,,\,\cdot \,\rangle )\), we denote the adjoint of \(A\in \mathcal {L}(V)\) by \(A^{*}\). For two elements \(x\) and \(y\) in \(V\), we define the operator \(x\otimes y\in \mathcal {L}(V)\) by \((x\otimes y)(h)=\langle x,h \rangle _{V} \,y\) for every \(h\in V\) and write \(x^{\otimes 2}:=x\otimes x\).
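In the finite-dimensional case \(H=\mathbb{R}^{d}\), these objects reduce to symmetric matrices with the Frobenius inner product, and the notation can be checked numerically. The following Python sketch is an illustration under that assumption only; the matrices, the rotation angle and all function names are hypothetical. It evaluates the trace inner product in two different orthonormal bases of \(\mathbb{R}^{2}\) and verifies that \(x^{\otimes 2}\) has a non-negative quadratic form.

```python
import math

def matvec(A, v):
    """Apply the matrix A (given as a list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(v, w):
    """Euclidean inner product on H = R^d."""
    return sum(a * b for a, b in zip(v, w))

def trace_inner(A, B, basis):
    """<A, B> = sum_n <A f_n, B f_n>_H over an orthonormal basis (f_n)."""
    return sum(dot(matvec(A, f), matvec(B, f)) for f in basis)

def tensor(x, y):
    """Matrix of the operator x (tensor) y, i.e. of the map h -> <x, h> y."""
    return [[yi * xj for xj in x] for yi in y]

# Hypothetical symmetric 2x2 matrices and two orthonormal bases of R^2.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, -1.0], [-1.0, 4.0]]
standard = [[1.0, 0.0], [0.0, 1.0]]
theta = 0.7
rotated = [[math.cos(theta), math.sin(theta)],
           [-math.sin(theta), math.cos(theta)]]

ip_standard = trace_inner(A, B, standard)  # entrywise sum of A_ij * B_ij
ip_rotated = trace_inner(A, B, rotated)    # same value up to rounding

x = [1.0, 2.0]
P = tensor(x, x)                           # rank-one element of the cone
h = [3.0, -1.0]
quad = dot(matvec(P, h), h)                # equals <x, h>^2, hence >= 0
```

Both basis choices return the same inner product up to rounding, and the quadratic form of \(x^{\otimes 2}\) at \(h\) equals \(\langle x,h\rangle _{H}^{2}\).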

2.1 Admissible parameters and discussion

Let \(\chi \colon \mathcal {H}\to \mathcal {H}\) be given by \(\chi (\xi )=\xi \mathbf {1}_{\|\xi\|\leq 1}(\xi )\). The following definition of an admissible parameter set stems from Cox et al. [17, Definition 2.3] and is closely related to the finite-dimensional version, i.e., matrix-valued admissible parameter sets known from Cuchiero et al. [21, Definition 2.3].

Definition 2.1

An admissible parameter set \((b,B,m,\mu )\) consists of

i) a measure \(m\colon \mathcal {B}(\mathcal {H}_{+}\setminus \{0\})\to [0,\infty ]\) such that

(a) \(\int _{ \mathcal {H}_{+}\setminus \{0\}} \| \xi \|^{2}m(\mathrm {d}\xi )< \infty \),

(b) \(\int _{ \mathcal {H}_{+}\setminus \{0\}}|\langle \chi (\xi ),h\rangle |m(\mathrm {d}\xi )<\infty \) for all \(h\in \mathcal {H}\) and there exists an element \(I_{m}\in \mathcal {H}\) such that \(\langle I_{m},h\rangle =\int _{ \mathcal {H}_{+}\setminus \{0\}}\langle \chi (\xi ),h \rangle m(\mathrm {d}\xi )\) for every \(h\in \mathcal {H}\);

ii) a vector \(b\in \mathcal {H}\) such that

$$\begin{aligned} \langle b, v\rangle - \int _{ \mathcal {H}_{+}\setminus \{0\}} \langle \chi (\xi ), v\rangle m( \mathrm {d}\xi ) \geq 0 \qquad \text{for all }v\in \mathcal {H}_{+}; \end{aligned}$$

iii) an \(\mathcal {H}_{+}\)-valued measure \(\mu \colon \mathcal{B}(\mathcal {H}_{+}\setminus \{0\}) \rightarrow \mathcal {H}_{+}\) such that the kernel \(M(x,\mathrm {d}\xi )\), defined for every \(x\in \mathcal {H}_{+}\) on \(\mathcal{B}(\mathcal {H}_{+}\setminus \{0\})\) by

$$\begin{aligned} M(x,\mathrm {d}\xi ):=\frac{\langle x, \mu (\mathrm {d}\xi )\rangle }{\|\xi\|^{2}}, \end{aligned}$$
(2.1)

satisfies for all \(u,x\in \mathcal {H}_{+}\) with \(\langle u,x \rangle = 0\) that

$$\begin{aligned} \int _{\mathcal {H}_{+}\setminus \{0\}} \langle \chi (\xi ), u\rangle M(x,\mathrm {d}\xi )< \infty ; \end{aligned}$$

iv) an operator \(B\in \mathcal{L}(\mathcal{H})\) with adjoint \(B^{*}\) satisfying for all \(x,u \in \mathcal {H}_{+}\) with \(\langle u,x\rangle =0\) that

$$\begin{aligned} \langle B^{*}(u) , x \rangle - \int _{ \mathcal {H}_{+}\setminus \{0\}} \langle \chi (\xi ),u \rangle \frac{\langle \mu (\mathrm {d}\xi ), x \rangle}{\| \xi \|^{2} } \geq 0. \end{aligned}$$
(2.2)

In comparison to the matrix-valued affine processes in Cuchiero et al. [21], the class of affine processes studied in this article is of pure-jump type and lacks a diffusion coefficient in the admissible parameter set in the sense of Definition 2.1. This is motivated by the fact that general affine diffusions on the cone of positive Hilbert–Schmidt operators are necessarily finite-rank-valued, as can be conjectured from the matrix-valued case. Indeed, for the matrix-valued case, the diffusion coefficient \(\alpha \) must satisfy the inequality \(b\succeq _{\mathbb{S}_{+}^{d}}(d-1)\alpha \) with \(b\) being the drift coefficient and \(d\in \mathbb{N}\) the dimension of the matrix, i.e., its maximal rank. When the rank approaches infinity, the diffusion coefficient must vanish to meet this requirement. In future work, this argument will be made precise and discussed in the context of Wishart processes on positive Hilbert–Schmidt operators. Thus affine processes with nonzero diffusion components on \(\mathcal {H}_{+}\) are finite-rank-valued and on a fixed orthonormal basis behave very similarly to their matrix-valued counterparts. For this reason, we restrict our considerations to the case of vanishing diffusion components with general jumps so that no rank conditions need to be imposed.

Consequently, when assuming that the diffusion component is absent, randomness solely enters through the parameters \(m\) and \(\mu \). Here \(m\) is called the constant jump coefficient and \(\mu \) the linear jump coefficient. Note that \(m\) is a standard measure defined on the Borel-\(\sigma \)-algebra on \(\mathcal {H}_{+}\setminus \left \{ 0\right \}\), whereas \(\mu \) is a vector-valued measure. For a comprehensive introduction to the integration theory of vector-valued measures, we refer to Bartle et al. [5] and Lewis [56].

Note that by part (a) of Definition 2.1 i), we have

$$ \int _{\mathcal {H}_{+}\cap \{\|\xi\|>1\}}\|\xi\|m(\mathrm {d}\xi )\leq \int _{ \mathcal {H}_{+}\cap \{\|\xi\|>1\}}\|\xi\|^{2}m(\mathrm {d}\xi )< \infty , $$

which means that the integral \(\int _{\mathcal {H}_{+}\cap \{\|\xi\|>1\}}\xi m(\mathrm {d}\xi )\) is defined in the Bochner sense. Similarly, it can be seen that the map \(u\mapsto \int _{\mathcal {H}_{+}\cap \left \{ \|\xi\|>1\right \}}\langle \xi ,u \rangle \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\) is a bounded linear operator on \(\mathcal {H}\). Indeed, for \(v\in \mathcal {H}\) such that \(v=v^{+}-v^{-}\) for \(v^{+},v^{-}\in \mathcal {H}_{+}\), we write \(|\langle \mu (\mathrm {d}\xi ),v\rangle |:=\langle \mu (\mathrm {d}\xi ),v^{+} \rangle +\langle \mu (\mathrm {d}\xi ),v^{-}\rangle \) and see that \(|\langle \mu (\mathrm {d}\xi ),v\rangle |\) is a positive measure for all \(v\in \mathcal {H}\). We thus have

$$\begin{aligned} \bigg\langle \int _{\mathcal {H}_{+}\cap \left \{ \|\xi\|>1\right \}}\langle \xi ,u \rangle \frac{ \mu (\mathrm {d}\xi )}{\|\xi\|^{2}},v\bigg\rangle &\leq \|u\|\bigg( \int _{\mathcal {H}_{+}\cap \left \{ \|\xi\|>1\right \}}\|\xi\|^{-1}|\langle{\mu ( \mathrm {d}\xi )},v\rangle |\bigg) \\ &\leq \|u\| | \langle \mu (\mathcal {H}_{+}\cap \left \{ \|\xi\|>1\right \}),v \rangle |, \end{aligned}$$

and taking the supremum over all \(v\in \mathcal {H}\) with \(\|v\|=1\) on both sides proves that the map is bounded. Note that \(\|\mu (\mathcal {H}_{+}\cap \left \{ \|\xi\|>1\right \})\|<\infty \) since by Definition 2.1 iii), we have \(\mu (A)\in \mathcal {H}_{+}\) and hence \(\|\mu (A)\|<\infty \) for every \(A\in \mathcal {B}(\mathcal {H}_{+}\setminus \{0\})\).

One noticeable difference to the matrix-valued case in Cuchiero et al. [21] is that while the diffusion component is absent, the jump measure \(M(x, \mathrm {d}\xi )\) in (2.1) can have infinite variation. The latter is actually impossible in the matrix-valued case as shown in Mayerhofer [59]. Thus the jump components are more general than their finite-dimensional counterparts; see Karbach [47, Sect. 3.4] for a concrete example of an affine process of infinite variation.

2.2 Existence and moment formulas

We recall the main result in Cox et al. [17, Theorem 2.8] which ensures the existence of a broad class of affine processes on \(\mathcal {H}_{+}\) associated with the admissible parameter set \((b,B,m,\mu )\) satisfying the conditions in Definition 2.1.

Theorem 2.2

Let \((b, B, m, \mu )\) be an admissible parameter set according to Definition 2.1. Then there exist a conservative time-homogeneous \(\mathcal{H}_{+}\)-valued Markov process \(X\) with transition kernels \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) and constants \(K,\omega \in [1,\infty )\) such that

$$\begin{aligned} \int _{ \mathcal {H}_{+}} \| \xi \|^{2}p_{t}(x,\mathrm {d}\xi )\leq K e^{\omega t} ( \|x\|^{2}+1) \end{aligned}$$

and for all \(t\geq 0\) and \(u,x\in \mathcal {H}_{+}\), we have

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle \xi , u\rangle}p_{t}(x,\mathrm {d}\xi )=\mathrm {e}^{- \phi (t,u)-\langle x,\psi (t,u)\rangle}, \end{aligned}$$
(2.3)

where \(\phi (\,\cdot \,,u)\), \(\psi (\,\cdot \,,u)\) are the unique solutions to the generalised Riccati equations

$$\begin{aligned} \frac{\partial \phi (t,u)}{\partial t}&=F\big(\psi (t,u)\big),\quad t>0, \qquad \phi (0,u)=0, \end{aligned}$$
(2.4)
$$\begin{aligned} \frac{\partial \psi (t,u)}{\partial t}&=R\big(\psi (t,u)\big),\quad t>0, \qquad \psi (0,u)=u, \end{aligned}$$
(2.5)

where \(F\colon \mathcal {H}_{+}\to \mathbb{R}\) and \(R\colon \mathcal {H}_{+}\to \mathcal {H}\) are given by

$$\begin{aligned} F(u)&= \langle b,u\rangle -\int _{ \mathcal {H}_{+}\setminus \{0\}}\big(\mathrm {e}^{-\langle \xi ,u \rangle}-1+\langle \chi (\xi ) ,u\rangle \big)m(\mathrm {d}\xi ), \end{aligned}$$
(2.6)
$$\begin{aligned} R(u)&= B^{*}(u)-\int _{ \mathcal {H}_{+}\setminus \{0\}}\big(\mathrm {e}^{-\langle \xi ,u\rangle}-1+ \langle \chi (\xi ) , u\rangle \big)\frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}. \end{aligned}$$
(2.7)
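The generalised Riccati equations (2.4) and (2.5) can be integrated with any standard ODE scheme. The following Python sketch is a minimal illustration in the scalar case \(H=\mathbb{R}\) and uses the classical fourth-order Runge–Kutta method. All parameters are hypothetical choices made for this example: \(B=-\beta \), \(m(\mathrm {d}\xi )=\lambda \eta \mathrm {e}^{-\eta \xi}\,\mathrm {d}\xi \), a constant \(c\geq 0\) standing for \(b-\int \chi (\xi )\,m(\mathrm {d}\xi )\), and \(\mu =\sigma \delta _{1}\) with a single atom at \(\xi =1\). With these choices, (2.6) and (2.7) reduce to the elementary functions `F` and `R` in the code. For \(\sigma =0\) the system is linear in \(\psi \) and can be compared with its closed-form solution.

```python
import math

# Hypothetical scalar parameters (illustration only, not from the article).
BETA, C, LAM, ETA = 0.8, 0.1, 2.0, 1.5

def F(u):
    """(2.6) for m(dxi) = LAM*ETA*exp(-ETA*xi) dxi and C = b - int chi dm >= 0."""
    return C * u + LAM * u / (ETA + u)

def make_R(sigma):
    """(2.7) for B = -BETA and mu = sigma * delta_1 (one atom at xi = 1)."""
    def R(u):
        return -BETA * u - sigma * (math.exp(-u) - 1.0 + u)
    return R

def solve_riccati(F, R, u, t, steps=2000):
    """Classical RK4 for phi' = F(psi), psi' = R(psi), phi(0) = 0, psi(0) = u."""
    h = t / steps
    phi, psi = 0.0, u
    for _ in range(steps):
        # The right-hand side depends on psi only, so the RK4 stages of the
        # coupled system are determined by the stages of psi.
        k1 = R(psi)
        k2 = R(psi + 0.5 * h * k1)
        k3 = R(psi + 0.5 * h * k2)
        k4 = R(psi + h * k3)
        l1 = F(psi)
        l2 = F(psi + 0.5 * h * k1)
        l3 = F(psi + 0.5 * h * k2)
        l4 = F(psi + h * k3)
        phi += h * (l1 + 2.0 * l2 + 2.0 * l3 + l4) / 6.0
        psi += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return phi, psi

def closed_form(u, t):
    """Exact solution for sigma = 0, where psi(t, u) = u * exp(-BETA * t)."""
    psi = u * math.exp(-BETA * t)
    phi = C * (u - psi) / BETA + (LAM / BETA) * math.log((ETA + u) / (ETA + psi))
    return phi, psi

if __name__ == "__main__":
    print(solve_riccati(F, make_R(0.0), 1.0, 5.0), closed_form(1.0, 5.0))
    print(solve_riccati(F, make_R(0.5), 1.0, 5.0))
```

For \(\sigma >0\), the extra term in `R` is non-positive on \(\mathbb{R}_{+}\), so in this toy model the numerical \(\psi \) stays below the linear solution \(u\,\mathrm {e}^{-\beta t}\).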

Define the transition semigroup \((P_{t})_{t\geq 0}\) by

$$ P_{t} f(x) = \int _{\mathcal{H}_{+}}f(\xi )p_{t}(x,\mathrm{d}\xi ) $$

for bounded measurable functions \(f: \mathcal{H}_{+} \to \mathbb{R}\). Then \((P_{t})_{t \geq 0}\) is a positive semigroup. Let \(\rho (x) = 1 + \|x\|^{2}\) and define

$$ \| f\|_{\rho} = \sup _{x \in \mathcal{H}_{+}} \frac{|f(x)|}{\rho (x)}. $$

Denote by \(B_{\rho}(\mathcal{H}_{+})\) the Banach space of all measurable functions \(f: \mathcal{H}_{+} \to \mathbb{R}\) for which \(\|f \|_{\rho}\) is finite. Clearly, \((P_{t})_{t \geq 0}\) extends onto \(B_{\rho}(\mathcal{H}_{+})\) and satisfies for each \(f \in B_{\rho}(\mathcal{H}_{+})\) that

$$\begin{aligned} |P_{t} f(x)| \leq \| f\|_{\rho}\int _{\mathcal{H}_{+}} \rho (y)p_{t}(x, \mathrm {d}y) \leq \| f\|_{\rho}(1+ K)e^{\omega t}\rho (x), \qquad x \in \mathcal{H}_{+}, \quad \end{aligned}$$
(2.8)

i.e., \((P_{t})_{t \geq 0}\) leaves \(B_{\rho}(\mathcal{H}_{+})\) invariant. Let \(\mathcal{H}_{+,{\mathrm{w}}}\) be the space \(\mathcal{H}_{+}\) equipped with the weak topology and denote by \(C_{b}(\mathcal{H}_{+,{\mathrm{w}}})\) the space of all bounded and weakly continuous functions \(f: \mathcal{H}_{+} \to \mathbb{R}\). Finally, let \(\mathcal{B}_{\rho}(\mathcal{H}_{+,{\mathrm{w}}})\) be the closure of \(C_{b}(\mathcal{H}_{+,{\mathrm{w}}})\) in \(B_{\rho}(\mathcal{H}_{+})\). It follows from Cox et al. [17] that \((P_{t})_{t \geq 0}\) leaves \(\mathcal{B}_{\rho}(\mathcal{H}_{+,{\mathrm{w}}})\) invariant and satisfies \(\lim _{t\to 0+}P_{t}f(x)=f(x)\) for all \(f\in \mathcal {B}_{\rho}(\mathcal{H}_{+,{\mathrm{w}}})\) and \(x\in \mathcal{H}_{+,{\mathrm{w}}}\). From (2.8), we then obtain

$$ \| P_{t} \|_{\mathcal{L}(\mathcal{B}_{\rho}(\mathcal{H}_{+,{\mathrm{w}}}))} \leq (1+K)e^{\omega t}, \qquad t \geq 0. $$

Hence \((P_{t})_{t \geq 0}\) is a generalised Feller semigroup on \(\mathcal{B}_{\rho}(\mathcal{H}_{+,{\mathrm{w}}})\) (see Appendix A for the definition and additional details).

Remark 2.3

Given the transition kernels \((p_{t}(x,\,\cdot \,))_{t\geq 0}\), the process \((X_{t})_{t\geq 0}\) with initial value \(X_{0}=x\in \mathcal {H}_{+}\) can be constructed by a version of Kolmogorov’s extension theorem in Cuchiero and Teichmann [22, Theorem 2.11]. Indeed, for every \(x\in \mathcal {H}_{+}\), one can show the existence of a unique measure \(\mathbb{P}_{x}\) on \(\Omega :=(\mathcal {H}_{+})^{ \mathbb{R}_{+}}\), equipped with the \(\sigma \)-algebra generated by the canonical projections \(X_{t}\colon \Omega \to \mathcal {H}\) given by \(X_{t}(\omega )=\omega (t)\) for \(\omega \in \Omega \). For \(x\in \mathcal {H}_{+}\), the probability measure \(\mathbb{P}_{x}\) is the distribution of \(X\) with \(\mathbb{P}_{x}[X_{0}=x]=1\). We denote the expectation with respect to \(\mathbb{P}_{x}\) by \(\mathbb{E}_{x}\left [\,\cdot \,\right ]\).

Let \((b,B,m,\mu )\) be an admissible parameter set according to Definition 2.1 and denote by \((X_{t})_{t\geq 0}\) the associated affine process on \(\mathcal {H}_{+}\). Let us define, for \(u\in \mathcal {H}\), the constant and linear effective drift terms \(\hat{b}\) and \(\hat{B}\) via

$$\begin{aligned} \hat{b}&:=b+\int _{\mathcal {H}_{+}\cap \{\|\xi\|>1\}}\xi m(\mathrm {d}\xi ), \\ \hat{B}(u)&:=B^{*}(u) + \int _{\mathcal {H}_{+}\cap \left \{ \|\xi\|>1\right \}} \langle \xi ,u\rangle \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}. \end{aligned}$$
(2.9)

Then \(\hat{b}\in \mathcal {H}\) and \(\hat{B}\in \mathcal {L}(\mathcal {H})\) are well defined. Below we state explicit formulas for the first two moments of the process \((X_{t})_{t\geq 0}\), which can be derived from differentiating the Laplace transform of \((X_{t})_{t\geq 0}\) using its affine form; see Cox et al. [17, Proposition 4.7].

Proposition 2.4

Let \((X_{t})_{t\geq 0}\) be the affine process associated with the admissible parameter set \((b,B,m,\mu )\). Then for all \(v,w\in \mathcal {H}_{+}\), we have the formulas

$$\begin{aligned} \mathbb{E}_{x}\left [\langle X_{t},v\rangle\right ]&=\int _{0}^{t}\langle \hat{b}, \mathrm {e}^{s \hat{B}}v\rangle \,\mathrm {d}s+\langle x, \mathrm {e}^{t\hat{B}}v\rangle \end{aligned}$$
(2.10)

and

$$\begin{aligned} & \mathbb{E}_{x}\left [\langle X_{t},v\rangle \langle X_{t},w\rangle\right ] \\ &= \bigg( \int _{0}^{t} \langle \hat{b},\mathrm {e}^{s \hat{B}} v\rangle \,\mathrm {d}s+ \langle x, \mathrm {e}^{t \hat{B}} v \rangle \bigg) \bigg( \int _{0}^{t} \langle \hat{b},\mathrm {e}^{s \hat{B}} w\rangle \,\mathrm {d}s+ \langle x, \mathrm {e}^{t \hat{B}} w \rangle \bigg) \\ & \hphantom{=:} + \int _{0}^{t}\int _{ \mathcal {H}_{+}\setminus \{0\}}\langle \xi , \mathrm {e}^{s \hat{B}} v \rangle \langle \xi , \mathrm {e}^{s \hat{B}} w\rangle m(\mathrm {d}\xi )\,\mathrm {d}s \\ & \hphantom{=:} + \int _{0}^{t} \int _{0}^{s} \bigg\langle \hat{b},\mathrm {e}^{(s-u)\hat{B}} \int _{ \mathcal {H}_{+}\setminus \{0\}} \langle \xi ,\mathrm {e}^{u \hat{B}}v\rangle \langle \xi , \mathrm {e}^{u \hat{B}}w\rangle \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}} \bigg\rangle \,\mathrm {d}u\,\mathrm {d}s \\ & \hphantom{=:} + \int _{0}^{t} \bigg\langle x,\mathrm {e}^{(t-s)\hat{B}}\int _{ \mathcal {H}_{+}\setminus \{0\}} \langle \xi ,\mathrm {e}^{s \hat{B}}w\rangle \langle \xi ,\mathrm {e}^{s \hat{B}}v \rangle \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}} \bigg\rangle \,\mathrm {d}s. \end{aligned}$$
(2.11)
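
As a sanity check of (2.10), the following Python sketch treats the one-dimensional analogue, in which \(\hat{b}\) and \(\hat{B}\) are real numbers and \(v=1\). The function names `first_moment` and `euler_mean` and all parameter values are illustrative assumptions, not part of the paper. In this scalar case, the right-hand side of (2.10) solves \(m'(t)=\hat{b}+\hat{B}m(t)\) with \(m(0)=x\), so the closed form can be compared with a simple Euler scheme.

```python
import math

def first_moment(t, x, b_hat, B_hat):
    """Scalar version of (2.10): int_0^t b_hat*e^{s*B_hat} ds + x*e^{t*B_hat}."""
    if B_hat == 0.0:
        return b_hat * t + x
    return b_hat * math.expm1(t * B_hat) / B_hat + x * math.exp(t * B_hat)

def euler_mean(t, x, b_hat, B_hat, steps=100_000):
    """Explicit Euler scheme for m'(s) = b_hat + B_hat*m(s), m(0) = x."""
    h = t / steps
    m = x
    for _ in range(steps):
        m += h * (b_hat + B_hat * m)
    return m

# Arbitrary toy parameters (assumptions): x = 3, b_hat = 0.5, B_hat = -1.5.
closed = first_moment(2.0, 3.0, 0.5, -1.5)
approx = euler_mean(2.0, 3.0, 0.5, -1.5)
long_run = first_moment(50.0, 3.0, 0.5, -1.5)  # approaches -b_hat / B_hat
```

For \(\hat{B}<0\), the value `long_run` approaches \(-\hat{b}/\hat{B}\), which is the scalar counterpart of the limiting first moment discussed in Sect. 3.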

3 Main results

Let \(V_{\tau} :=(V,\tau )\) be a topological vector space and denote by \(\mathcal {M}(V_{\tau})\) the set of all probability measures defined on the Borel-\(\sigma \)-algebra \(\mathcal {B}(V_{\tau})\). For the vector space \(\mathcal {H}\) equipped with its weak topology \(\tau _{\mathrm{w}}\), we write \(\mathcal {H}_{\mathrm{w}}=(\mathcal {H},\tau _{\mathrm{w}})\). Note that the positive cone \(\mathcal {H}_{+,\mathrm{w}}\) is also closed in the weak topology and, moreover, the Borel-\(\sigma \)-algebras of the strong and weak topology coincide, i.e., \(\mathcal {B}(\mathcal {H}_{+})=\mathcal {B}(\mathcal {H}_{+,\mathrm{w}})\). We say that a measure \(\nu \in \mathcal {M}(\mathcal {H}_{+,\tau})\) is inner regular (with respect to the topology \(\tau \)) whenever

$$\begin{aligned} \nu (A)=\sup \left \{ \nu (K)\colon K\subseteq A, K\text{ is $\tau $-compact}\right \}. \end{aligned}$$

For a sequence \((\nu _{n})_{n\in \mathbb{N}}\subseteq \mathcal {M}(\mathcal {H}_{+})\), we write \(\nu _{n}\Rightarrow \nu \) as \(n\to \infty \) for the weak convergence of \((\nu _{n})_{n\in \mathbb{N}}\) to \(\nu \) in the strong topology, i.e.,

$$\begin{aligned} \lim _{n\to \infty}\int _{ \mathcal {H}_{+}}f(\xi )\nu _{n}(\mathrm {d}\xi )=\int _{ \mathcal {H}_{+}}f(\xi )\nu (\mathrm {d}\xi )\qquad \text{for all }f\in C_{b}(\mathcal {H}). \end{aligned}$$

For a family \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) of transition kernels, we call the probability measure \(\pi \) an invariant measure whenever

$$\begin{aligned} \int _{ \mathcal {H}_{+}}p_{t}(x,\mathrm {d}\xi )\pi (\mathrm {d}x)=\pi (\mathrm {d}\xi ) \qquad \text{for all }t\geq 0. \end{aligned}$$

For \(\nu _{1}\), \(\nu _{2}\in \mathcal {M}(\mathcal {H}_{+})\), we call a probability measure \(G\) on the product Borel-\(\sigma \)-algebra \(\mathcal {B}(\mathcal {H}_{+})\times \mathcal {B}(\mathcal {H}_{+})\) a coupling of \((\nu _{1},\nu _{2})\) whenever its marginal distributions are given by \(\nu _{1}\) and \(\nu _{2}\), respectively. We denote the set of all possible couplings of \((\nu _{1},\nu _{2})\) by \(\mathcal {C}(\nu _{1},\nu _{2})\). For \(p\in [1,\infty )\), the Wasserstein distance of order \(p\) between \(\nu _{1}\in \mathcal {M}(\mathcal {H}_{+})\) and \(\nu _{2}\in \mathcal {M}(\mathcal {H}_{+})\) is defined as

$$\begin{aligned} W_{p}(\nu _{1},\nu _{2})&=\bigg(\inf \left \{ \int _{\mathcal {H}_{+}\times \mathcal {H}_{+}}\|x-y\|^{p}G(\mathrm {d}x,\mathrm {d}y)\colon G\in \mathcal {C}(\nu _{1},\nu _{2})\right \} \bigg)^{1/p}. \end{aligned}$$

For an introduction to Wasserstein distances, we refer to Villani [66, Sect. 6].
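
To make the definition of \(W_{p}\) concrete, the following Python sketch computes it in a deliberately simple toy setting: two empirical measures on the real line with equally many, equally weighted atoms. This is an assumption made for illustration only and is not the Hilbert space setting of this paper. On the real line with cost \(|x-y|^{p}\), \(p\geq 1\), pairing the sorted atoms yields an optimal coupling, so no optimisation over \(\mathcal {C}(\nu _{1},\nu _{2})\) is needed. The function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """W_p between two empirical measures on R with equally many,
    equally weighted atoms; sorting both samples gives an optimal coupling."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Pair the i-th smallest atom of one measure with that of the other.
    cost = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return (cost / len(xs)) ** (1.0 / p)

# Toy samples (arbitrary illustrative values).
nu1 = [0.0, 1.0, 3.0]
nu2 = [2.0, 0.5, 5.0]
w1 = wasserstein_1d(nu1, nu2, p=1)
w2 = wasserstein_1d(nu1, nu2, p=2)
```

By Jensen's inequality, \(W_{1}\leq W_{2}\), which the two values `w1` and `w2` reflect.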

Now let \((b,B,m,\mu )\) be an admissible parameter set and denote the spectrum of \(\hat{B}\) defined in (2.9) by \(\sigma (\hat{B})\). We introduce the following central assumption:

Assumption 3.1

The spectral bound \(s(\hat{B}):=\sup \{ \Re (\lambda )\colon \lambda \in \sigma (\hat{B}) \}\) of \(\hat{B}\) is strictly negative, i.e., \(s(\hat{B})<0\).

Definition 3.2

We call an affine process \((X_{t})_{t\geq 0}\) on \(\mathcal {H}_{+}\) associated with an admissible parameter set \((b,B,m,\mu )\) satisfying Assumption 3.1 a subcritical affine process on \(\mathcal {H}_{+}\).

Recall that \(\hat{B}\) is bounded and generates the operator semigroup \((\mathrm {e}^{t\hat{B}})_{t\geq 0}\) given by \(\mathrm {e}^{t\hat{B}}:=\sum _{n=0}^{\infty}\frac{(t\hat{B})^{n}}{n!}\), where the convergence of the series is understood in the \(\mathcal {L}(\mathcal {H})\)-norm. It is well known that \((\mathrm {e}^{t\hat{B}})_{t\geq 0}\) is a uniformly continuous semigroup, see Engel and Nagel [26, Sect. I.3], which implies that the spectral bound \(s(\hat{B})\) coincides with the growth bound of \((\mathrm {e}^{t\hat{B}})_{t\geq 0}\), see [26, Corollary 4.2.4], i.e.,

$$\begin{aligned} s(\hat{B})=\inf \{ w\in \mathbb{R}\colon \exists M_{w}\geq 1 \text{ such that }\|\mathrm {e}^{t\hat{B}}\|_{\mathcal {L}(\mathcal {H})}\leq M_{w}\mathrm {e}^{w t}\text{ for all } t\geq 0 \}. \end{aligned}$$

Thus whenever Assumption 3.1 is satisfied, there exist \(M\geq 1\) and \(\delta >0\) such that

$$\begin{aligned} \|\mathrm {e}^{t\hat{B}}\|_{\mathcal {L}(\mathcal {H})}\leq M \mathrm {e}^{-\delta t}, \qquad t\geq 0. \end{aligned}$$
(3.1)
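
The role of the constant \(M\) in (3.1) can be illustrated in finite dimensions. The following Python sketch is a toy example under stated assumptions: \(\hat{B}\) is replaced by a hypothetical non-normal \(2\times 2\) matrix `B_HAT` with spectral bound \(-1\), the exponential is computed from the power series (applied to a rescaled matrix and squared back for numerical stability), and \(M\) is estimated on a finite time grid only. All names and values are illustrative.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(b, t, terms=20):
    """e^{tB} via the power series, applied to tB / 2^k and squared k times."""
    norm = math.sqrt(sum((t * x) ** 2 for row in b for x in row))
    k = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[t * x / 2 ** k for x in row] for row in b]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        # term becomes scaled^n / n!
        term = [[x / n for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    for _ in range(k):
        result = mat_mul(result, result)
    return result

def op_norm(a):
    """Operator norm of a 2x2 matrix, i.e. its largest singular value."""
    frob_sq = sum(x * x for row in a for x in row)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = max(frob_sq ** 2 - 4.0 * det ** 2, 0.0)
    return math.sqrt((frob_sq + math.sqrt(disc)) / 2.0)

B_HAT = [[-1.0, 5.0], [0.0, -2.0]]  # toy matrix, eigenvalues -1 and -2
DELTA = 0.5                          # a decay rate below 1 = -s(B_HAT)
GRID = [0.01 * i for i in range(2001)]
# Grid-based estimate of the smallest M with ||e^{tB}|| <= M e^{-DELTA t}.
M = max(op_norm(mat_exp(B_HAT, t)) * math.exp(DELTA * t) for t in GRID)
```

In this example, the norm of `mat_exp(B_HAT, 1.0)` exceeds one although both eigenvalues are negative, so a constant \(M>1\) is genuinely needed.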

The following theorem is a detailed version of our main result concerning the long-time behaviour of affine processes on the state space \(\mathcal {H}_{+}\).

Theorem 3.3

Let \((b,B,m,\mu )\) be an admissible parameter set satisfying Assumption 3.1. Denote the associated subcritical affine process on \(\mathcal {H}_{+}\) by \((X_{t})_{t\geq 0}\) and its transition kernels by \((p_{t}(x,\,\cdot \,))_{t\geq 0}\). Then the following hold true:

  1. i)

    There exists a unique invariant measure \(\pi \) for \((p_{t}(x,\,\cdot \,))_{t\geq 0}\), and the Laplace transform of \(\pi \) is given by

    $$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle u,x\rangle}\pi (\mathrm {d}x)=\exp \bigg(-\int _{0}^{ \infty}F\big(\psi (s,u)\big)\,\mathrm {d}s\bigg),\qquad u\in \mathcal {H}_{+}, \end{aligned}$$
    (3.2)

    where \(F\) and \(\psi (s,u)\) are as in (2.6) and (2.5). Moreover, \(\pi \) is an inner regular measure on \(\mathcal {B}(\mathcal {H}_{+,\mathrm{w}})\).

  2. ii)

    For \(p\in [1,2]\), \(t \geq 0\) and \(x\in \mathcal {H}_{+}\), we have

    $$\begin{aligned} W_{p}\big(p_{t}(x,\,\cdot \,),\pi \big)&\leq C_{1}\mathrm {e}^{-\delta t} \bigg(\|x\|+\Big(\int _{ \mathcal {H}_{+}}\|y\|^{p}\pi (\mathrm {d}y)\Big)^{1/p} \bigg) \end{aligned}$$
    (3.3)
    $$\begin{aligned} & \hphantom{=:} +C_{2}\mathrm {e}^{-\delta t/2}\bigg(\|x\|^{1/2}+\Big(\int _{ \mathcal {H}_{+}} \|y\|^{p/2}\pi (\mathrm {d}y)\Big)^{1/p}\bigg), \end{aligned}$$
    (3.4)

    where \(C_{1}=2M\) and \(C_{2}=2^{1/2}M^{3/2}\delta ^{-1/2}\|\mu (\mathcal {H}_{+}\setminus \{0\})\|^{1/2}\) for \(M\geq 1\) and \(\delta >0\) as in (3.1). In particular, we have \(p_{t}(x,\,\cdot \,)\Rightarrow \pi \) as \(t\to \infty \).
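
For orientation, the right-hand side of (3.3), (3.4) can be evaluated directly once the constants are known. The following Python sketch does only this arithmetic. The function name `wasserstein_bound` and its arguments are hypothetical, and the two moments of \(\pi \) must be supplied by the user, since the sketch does not compute them.

```python
import math

def wasserstein_bound(t, x_norm, p, M, delta, mu_norm, moment_p, moment_half):
    """Right-hand side of (3.3)-(3.4).

    moment_p    stands for the integral of ||y||^p     with respect to pi,
    moment_half stands for the integral of ||y||^(p/2) with respect to pi,
    mu_norm     stands for the norm of mu(H_+ without 0).
    """
    if not 1 <= p <= 2:
        raise ValueError("the estimate is stated for p in [1, 2]")
    c1 = 2.0 * M
    c2 = math.sqrt(2.0) * M ** 1.5 * math.sqrt(mu_norm / delta)
    first = c1 * math.exp(-delta * t) * (x_norm + moment_p ** (1.0 / p))
    second = c2 * math.exp(-0.5 * delta * t) * (
        math.sqrt(x_norm) + moment_half ** (1.0 / p))
    return first + second

# Arbitrary illustrative inputs (assumptions, not values from the paper).
bound_now = wasserstein_bound(t=1.0, x_norm=1.0, p=2, M=1.0, delta=0.5,
                              mu_norm=1.0, moment_p=4.0, moment_half=1.5)
```

For large \(t\), the second summand dominates because it decays at the slower rate \(\delta /2\).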

Remark 3.4

For locally compact and second countable Hausdorff spaces, in particular for finite-dimensional normed spaces, every probability measure defined on the Borel-\(\sigma \)-algebra is regular. The last assertion of part i) in Theorem 3.3 states that the invariant measure \(\pi \) is an inner regular measure on \(\mathcal {B}(\mathcal {H}_{+,\mathrm{w}})\), i.e., inner regular in the weak topology, although \(\mathcal {H}_{+,\mathrm{w}}\) in the infinite-dimensional case is not locally compact. We see that the inner regularity is a nontrivial property of the invariant measure that is used in the proof of Proposition 3.5 below.

Setting \(p = 1\), it is worthwhile to compare our result with the estimates given in Friesen et al. [32, Theorem 2.9, Equation (2.12)] for the state space \(\mathbb{S}_{+}^{d}\) \((d\in \mathbb{N})\), i.e., \(H=\mathbb{R}^{d}\) in our setting. The convergence rate there is given by

$$\begin{aligned} \sqrt{d}M \mathrm {e}^{-\delta t}\bigg(\|x\|+\int _{\mathbb{S}_{+}^{d}}\|y\| \pi (\mathrm {d}y)\bigg),\qquad t\geq 0, x\in \mathbb{S}_{+}^{d}, \end{aligned}$$

which is dimension-dependent. In contrast, the estimate obtained here in (3.3), (3.4) is independent of the dimension and hence holds for arbitrary \(\mathcal {H}_{+}\) at the expense that an additional term (3.4) appears.

As a consequence of Theorem 3.3 which ensures the existence of an invariant inner regular measure \(\pi \) on \(\mathcal {B}(\mathcal {H}_{+,\mathrm{w}})\), we obtain the existence of a stationary process with invariant measure \(\pi \) as formulated in the next result.

Proposition 3.5

There exists a process \((X_{t}^{\pi})_{t\geq 0}\) on \(\mathcal {H}_{+}\) with the transition kernels \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) such that the distribution of \(X^{\pi}_{t}\) equals \(\pi \) for all \(t\geq 0\).

Coming back to Theorem 3.3, we note that the \(p\)th moment of \(\pi \) shows up in the convergence rate in (3.3), where we implicitly assumed that these terms are finite. That this is indeed the case is part of the next result, where we also obtain explicit formulas for the first two moments of the invariant measure \(\pi \).

Proposition 3.6

Under the same conditions as in Theorem 3.3 and denoting the unique invariant measure of \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) by \(\pi \), we have \(\int _{\mathcal{H}_{+}}\|y\|^{2} \pi (\mathrm {d}y) < \infty \),

$$\begin{aligned} \lim _{t\to \infty} \mathbb{E}_{x}\left [X_{t}\right ]=\int _{ \mathcal {H}_{+}}y \pi (\mathrm {d}y)= \int _{0}^{\infty} \mathrm {e}^{s\hat{B}^{*}}\bigg(b+\int _{\mathcal {H}_{+}\cap \left \{ \|\xi\|>1\right \}}\xi m(\mathrm {d}\xi )\bigg) \,\mathrm {d}s \end{aligned}$$
(3.5)

and

$$\begin{aligned} \lim _{t\to \infty} \mathbb{E}_{x}\left [X_{t}\otimes X_{t}\right ] =&\int _{ \mathcal {H}_{+}}(y \otimes y) \pi (\mathrm {d}y) \\ =&\int _{0}^{\infty} (\mathrm {e}^{s\hat{B}^{*}}\hat{b} )^{\otimes 2}\,\mathrm {d}s+ \int _{0}^{\infty}\int _{ \mathcal {H}_{+}\setminus \{0\}} (\mathrm {e}^{s\hat{B}^{*}}\xi )^{\otimes 2} m(\mathrm {d}\xi )\,\mathrm {d}s \\ &{} + \int _{0}^{\infty}\int _{0}^{s}\int _{ \mathcal {H}_{+}\setminus \{0\}} (\mathrm {e}^{u\hat{B}^{*}} \xi )^{\otimes 2}\bigg\langle \hat{b},\mathrm {e}^{(s-u)\hat{B}} \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}u\,\mathrm {d}s.\qquad \end{aligned}$$
(3.6)

Example 3.7

As an example, we consider Lévy-driven Ornstein–Uhlenbeck processes on \(\mathcal {H}_{+}\). For this, let \(m\) be a Lévy measure on \(\mathcal {B}(\mathcal {H}_{+}\setminus \{0\})\) with finite second moment and \(b\in \mathcal {H}_{+}\) be such that part ii) of Definition 2.1 is satisfied. Let \(\mu =0\) and \(B\in \mathcal {L}(\mathcal {H})\) be of the form \(B(u)=Gu+uG^{*}\) for some \(G\in \mathcal {L}(H)\); then part iv) of Definition 2.1 is satisfied, which can be seen from the fact that for every \(u\in \mathcal {H}_{+}\), we have \(\mathrm {e}^{tB}u=\mathrm {e}^{tG} u \mathrm {e}^{tG^{*}}\geq _{ \mathcal {H}_{+}} 0\) for all \(t\geq 0\) and hence Lemmert and Volkmann [55, Theorem 1] implies that \(B\) satisfies (2.2). Thus the tuple \((b,B,m,0)\) is an admissible parameter set according to Definition 2.1, and the associated affine process \((X_{t})_{t\geq 0}\) becomes an Ornstein–Uhlenbeck process driven by an \(\mathcal {H}_{+}\)-valued Lévy process \((L_{t})_{t\geq 0}\) with characteristics \((b,0,m)\); see Cox et al. [18, Lemma 2.3], i.e.,

$$\begin{aligned} X_{t}=\mathrm {e}^{t G}x\mathrm {e}^{tG^{*}}+\int _{0}^{t}\mathrm {e}^{(t-s)G}\mathrm {d}L_{s}\mathrm {e}^{(t-s)G^{*}}, \qquad t\geq 0. \end{aligned}$$

Since \(\sigma (B)=\sigma (G)+\sigma (G)\), see Rosenblum [63], and hence \(s(B)\leq 2s(G)\), we see that whenever the spectral bound \(s(G)\) of the operator \(G\) is negative, the same holds for \(s(B)\). As \(\mu =0\), we have \(\hat{B}=B^{*}\) and \(s(B^{*})=s(B)\), and hence Assumption 3.1 is satisfied. This provides an explicit and simple sufficient criterion for the Ornstein–Uhlenbeck process \((X_{t})_{t\geq 0}\) to be subcritical. By Theorem 3.3, there exists a unique invariant measure \(\pi \) with Laplace transform

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle u,x\rangle}\pi (\mathrm {d}x)=\exp \bigg(-\int _{0}^{ \infty}\varphi _{L} ( \mathrm {e}^{sG}u\mathrm {e}^{sG^{*}} )\,\mathrm {d}s\bigg), \end{aligned}$$

where \(\varphi _{L}\colon \mathcal {H}\to \mathbb{C}\) denotes the Laplace exponent of the Lévy process \(L\), given by

$$ \varphi _{L}(u)= \langle b, u\rangle -\int _{ \mathcal {H}_{+}\setminus \{0\}}\big(\mathrm {e}^{- \langle \xi , u\rangle}-1+\langle \chi (\xi ), u\rangle \big) m(\mathrm {d}\xi ) ,\qquad u\in \mathcal {H}_{+}. \qquad $$
(3.7)

Existence and uniqueness of invariant measures for Ornstein–Uhlenbeck processes were studied in Chojnowska-Michalik [15], where a similar result follows under the assumption of a finite log-moment of the Lévy measure \(m\). In Proposition 3.6, the process \((X_{t})_{t\geq 0}\) is assumed to have a finite second moment, which is a stronger assumption than that of finite log-moments. However, this stronger assumption also allows the derivation of explicit formulas for the first and second moments of \(\pi \). Indeed, setting \(\mu =0\) and \(B(u)=Gu+uG^{*}\) in (3.5) and (3.6) gives

$$\begin{aligned} \lim _{t\to \infty} \mathbb{E}_{x}\left [X_{t}\right ]=\int _{0}^{\infty} \mathrm {e}^{sG}\bigg(b+ \int _{\mathcal {H}_{+}\setminus \{0\}\cap \left \{ \|\xi\|>1\right \}}\xi m(\mathrm {d}\xi )\bigg)\mathrm {e}^{sG^{*}} \,\mathrm {d}s \end{aligned}$$

and

$$\begin{aligned} \lim _{t\to \infty} \mathbb{E}_{x}\left [X_{t}\otimes X_{t}\right ]&=\int _{0}^{\infty} ( \mathrm {e}^{sG}\hat{b}\mathrm {e}^{sG^{*}} )^{\otimes 2}\mathrm {d}s +\int _{0}^{\infty} \int _{ \mathcal {H}_{+}\setminus \{0\}} (\mathrm {e}^{sG}\xi \mathrm {e}^{sG^{*}} )^{\otimes 2}m(\mathrm {d}\xi ) \,\mathrm {d}s. \end{aligned}$$
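
The first-moment formula above can be checked by hand in the scalar case. The following Python sketch assumes the one-dimensional toy setting in which \(G\) is a real number \(g<0\), so that \(B(u)=2gu\), and in which the Lévy measure \(m\) is a finite sum of point masses given as `(size, rate)` pairs. These choices, the function names and the numerical values are assumptions for illustration. In this setting, the stationary mean reduces to \(\hat{b}/(-2g)\).

```python
import math

def effective_drift(b, jumps):
    """Scalar b_hat: b plus the sum of rate * size over jump sizes above 1."""
    return b + sum(rate * size for size, rate in jumps if size > 1.0)

def mean_at(t, x, g, b_hat):
    """E_x[X_t] = e^{2gt} x + int_0^t e^{2gs} b_hat ds for B(u) = 2*g*u, g != 0."""
    return math.exp(2.0 * g * t) * x + b_hat * math.expm1(2.0 * g * t) / (2.0 * g)

def stationary_mean(g, b_hat):
    """int_0^infty e^{2gs} b_hat ds, finite only in the subcritical case g < 0."""
    if g >= 0:
        raise ValueError("subcriticality requires g < 0")
    return b_hat / (-2.0 * g)

# Toy parameters (assumptions): jump sizes 0.5 and 2.0 with rates 2.0 and 0.3.
JUMPS = [(0.5, 2.0), (2.0, 0.3)]
b_hat = effective_drift(1.2, JUMPS)
limit_mean = stationary_mean(-0.75, b_hat)
late_mean = mean_at(40.0, 5.0, -0.75, b_hat)
```

The value `late_mean` is numerically indistinguishable from `limit_mean`, in line with the exponential decay of the influence of the initial value \(x\).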

4 The stationary covariance regime for forward curve dynamics

In this section, we investigate applications of our findings regarding the long-time behaviour of affine processes on positive Hilbert–Schmidt operators within the context of an affine stochastic covariance model for commodity forward curves. This model characterises the evolution of the logarithmic forward curve over time through a stochastic partial differential equation, with the instantaneous volatility determined by a stationary affine process on positive Hilbert–Schmidt operators. In what follows, we introduce the model, recall some auxiliary results and then study the implied volatility for forward-start options written on forward contracts.

4.1 A geometric affine forward curve model

Let \(0\leq T\leq \hat{T}\) and denote by \(F(T,\hat{T})\) the forward price at time \(T\) with the maturity date \(\hat{T}\), i.e., \(F(T,\hat{T})\) represents the price at time \(T\) for one unit of an underlying spot commodity with delivery date \(\hat{T}\). We follow the HJMM approach and model the dynamics of \((F(T,\hat{T}))_{T\leq \hat{T}}\) directly using a stochastic partial differential equation. To begin, we employ the Musiela parametrisation \(x=\hat{T}-T\) and denote by \(f_{t}\) the forward curve at time \(t\geq 0\) in terms of time-to-maturity, i.e., \(f_{t}(x) = F(t, t+x)\) for \(x \geq 0\). The logarithmic forward prices \(Y_{t}(x) = \ln f_{t}(x)\) are then described by

$$\begin{aligned} \mathrm {d}Y_{t}(x)= \bigg(\frac{\partial}{\partial x}Y_{t}(x)+G(t,x) \bigg) \mathrm {d}t+\sigma (t,x)\mathrm {d}W_{t}(x),\qquad t\geq 0, \end{aligned}$$
(4.1)

where for all \(x\geq 0\), \((\sigma (t,x))_{t\geq 0}\) represents the instantaneous volatility process of the forward price dynamics, \((W_{t}(x))_{t\geq 0}\) is a collection of one-dimensional Brownian motions parametrised by \(x\) and \((G(t,x))_{t\geq 0}\) represents the drift component. To guarantee arbitrage-free pricing, this drift needs to have a particular functional form based on \(\sigma \) (the so-called HJM condition); see Benth and Krühner [7].

Mathematically, (4.1) is a stochastic partial differential equation, and hence we regard \((Y_{t})_{t\geq 0}\) as a process taking values in a Hilbert space of functions containing viable forward curves. Below, we introduce a weighted Sobolev space that serves as a model for such forward curves. An economic reasoning for this particular choice was given in Filipović [27, Chap. 5] for forward rate curves in fixed-income markets (thus it is henceforth called Filipović space), and in Clewlow and Strickland [16, Chap. 4] for commodity forward spaces. Namely, for \(\beta >0\), we let \(H_{\beta}\) be the space of all absolutely continuous functions \(f\colon \mathbb{R}_{+}\to \mathbb{R}\) such that

$$ \|f\|_{\beta} :=\bigg(|f(0)|^{2}+\int _{ \mathbb{R}_{+}}\mathrm {e}^{\beta x}|f'(x)|^{2} \mathrm {d}x \bigg)^{1/2}< \infty , $$

where \(f'\) denotes the weak derivative of \(f\). We equip the space \(H_{\beta}\) with the inner product

$$\begin{aligned} \langle f,g \rangle _{\beta} :=f(0)g(0)+\int _{ \mathbb{R}_{+}}\mathrm {e}^{\beta x}f'(x)g'(x) \mathrm {d}x, \end{aligned}$$

which makes \((H_{\beta},\langle \, \cdot \,,\,\cdot \, \rangle _{\beta})\) a separable Hilbert space.
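
To get a feeling for the norm \(\|\cdot \|_{\beta}\), the following Python sketch approximates it by the trapezoidal rule and compares the result with a closed form for the test curve \(f(x)=c\,\mathrm{e}^{-kx}\), for which \(\|f\|_{\beta}^{2}=c^{2}+c^{2}k^{2}/(2k-\beta )\) whenever \(2k>\beta \). The function name `filipovic_norm`, the truncation of the integral at a finite upper limit and the parameter values are assumptions made for illustration.

```python
import math

def filipovic_norm(f0, f_prime, beta, upper=60.0, steps=200_000):
    """Approximate ||f||_beta from f(0) and f' by the trapezoidal rule on [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        x = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(beta * x) * f_prime(x) ** 2
    return math.sqrt(f0 ** 2 + h * total)

# Test curve f(x) = c * exp(-k x) with arbitrary toy values c = 2, k = 1.5, beta = 1.
c, k_rate, beta = 2.0, 1.5, 1.0
numeric = filipovic_norm(c, lambda x: -c * k_rate * math.exp(-k_rate * x), beta)
exact = math.sqrt(c ** 2 + (c * k_rate) ** 2 / (2.0 * k_rate - beta))
```

The condition \(2k>\beta \) reflects that membership in \(H_{\beta}\) forces \(f'\) to decay faster than \(\mathrm{e}^{-\beta x/2}\), so that curves in \(H_{\beta}\) flatten out for long times to maturity.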

It is well known that the left-shift semigroup, denoted by \((S(t))_{t\geq 0}\), is strongly continuous on \(H_{\beta}\) with the infinitesimal generator being the operator of differentiation in time-to-maturity. We denote it by \((\mathcal {A},\mathrm {dom}(\mathcal {A}))\), i.e., \(\mathcal {A}f = f'\), where \(f\) belongs to its maximal domain \(\mathrm {dom}(\mathcal {A}) = \{ f \in H_{\beta} \ : \ f' \in H_{\beta}\}\) (compare with Benth et al. [6, Sect. 4]). We further note that for every \(t\geq 0\), the point evaluation maps \(\delta _{t}\colon H_{\beta}\to \mathbb{R}\) are continuous linear functionals; see Filipović et al. [29, Theorem 2.1]. Throughout this section, we identify the map \(\delta _{t}\) for any \(t\geq 0\) with an element \(u_{t}\in H_{\beta}\) such that \(\langle f,u_{t}\rangle _{\beta}=\delta _{t}(f)\) holds. In this way, we can interpret the process \((Y_{t})_{t\geq 0}\) as the (unique strong) solution to the stochastic differential equation on \(H_{\beta}\) given by

$$\begin{aligned} \mathrm {d}Y_{t} = \big(\mathcal{A} Y_{t} + G(\sigma _{t}) \big) \mathrm {d}t + \sigma _{t} \mathrm {d}W_{t},\qquad Y_{0}=y\in H_{\beta}, \end{aligned}$$
(4.2)

where \(W\) is a cylindrical Brownian motion on \(H_{\beta}\), \(G\colon \mathcal {H}\to H_{\beta}\) a continuous affine function whose dependence on \(\sigma \) stems from the HJM condition, and \((\sigma _{t})_{t\geq 0}\) is a positive trace-class-operator-valued process that determines the stochastic covariance structure of forwards with different times to maturity.

An affine stochastic covariance model, as introduced by Cox et al. [18], is characterised by the instantaneous volatility \(\sigma _{t}=D^{1/2}X^{1/2}_{t}\), where \(D\) is a trace-class operator on \(H_{\beta}\) and \((X_{t})_{t\geq 0}\) is an affine process on the cone of positive Hilbert–Schmidt operators. Here \(D\) corresponds to a constant covariance term used to ensure that the covariance process \(\sigma \) is integrable with respect to \(W\); see Cox et al. [18] for a discussion. Consequently, for such a choice of \(\sigma _{t}\), \(G\) takes the form

$$\begin{aligned} G(X_{t})=-\frac{1}{2}D^{1/2}X_{t}D^{1/2}S^{*}(\,\cdot \,)u_{0} \end{aligned}$$
(4.3)

with \(u_{0}\in H_{\beta}\) such that \(\langle f,u_{0}\rangle _{\beta}=\delta _{0}(f)\), in order to satisfy the HJM condition; see Karbach [46, Lemma 4.6]. The joint process \((Y^{y}_{t},X_{t})_{t\geq 0}\) is then called an affine stochastic covariance model on \(H_{\beta}\). It determines the forward curve dynamics via

$$\begin{aligned} F(T,\hat{T}):=\delta _{\hat{T}-T}\big(\exp (Y^{y}_{T})\big)=\exp ( \langle Y^{y}_{T},u_{\hat{T}-T}\rangle _{\beta}). \end{aligned}$$
(4.4)

This justifies the name geometric affine forward curve model for \((F(T,\hat{T}))_{T\leq \hat{T}}\) on \(H_{\beta}\). Note that geometric models of this form ensure the positivity of forward prices, which is crucial for many forward markets. A similar geometric model was proposed in Benth and Sgarra [10]. In their case, the underlying stochastic covariance model \((Y^{y}_{t},X_{t})_{t\geq 0}\) is the Hilbert-valued Barndorff-Nielsen–Shephard (BNS) model (see (4.6) below) with an additional leverage term in the dynamics (4.2). However, in our model (4.2), we omit the leverage term (compared to [10]) for brevity, although leverage is a stylised fact in many commodity forward markets. The advantage of the more general affine geometric model class \((Y^{y}_{t},X_{t})_{t\geq 0}\) is its flexible jump structure; in particular, it allows self-exciting jumps, which have the potential to model volatility clustering, a key feature in many commodity markets.

Example 4.1

We give an example of an admissible parameter set \((b,B,m,\mu )\). For brevity, let the constant parameters \(b\) and \(m\) be zero. Let \(g,z\in \mathcal {H}_{+}\) be fixed and let \(\eta \colon \mathcal {B}((0,\infty ))\to [0,\infty )\) be such that \(\int _{0}^{\infty}\lambda ^{-2}\eta (\mathrm {d}\lambda )<\infty \). Then define the vector-valued measure \(\mu \colon \mathcal {B}(\mathcal {H}_{+}\setminus \left \{ 0\right \})\to \mathcal {H}_{+}\) by

$$\begin{aligned} \mu (A):=g\eta (\left \{ \lambda \in \mathbb{R}_{+}\colon \lambda z\in A\right \}), \end{aligned}$$

which implies that jumps only occur in the direction of \(z\in \mathcal {H}_{+}\), with the intensity determined by the measure \(\eta \) on \(\mathbb{R}_{+}\), and the structure of the spillover effects between the volatility of forwards with different times to maturity determined by the operator \(g\). If we define for \(\gamma \in \mathbb{R}\),

$$\begin{aligned} B(u) &:=\gamma u+\int _{\mathcal {H}_{+}\setminus \left \{ 0\right \}}\chi (\xi ) \frac{\langle u, \mu (\mathrm {d}\xi )\rangle}{ \|\xi\|^{2}} \\ & \hphantom{:} = \gamma u + \frac{z}{\|z\|^{2}}\langle g,u\rangle \int _{0}^{ \frac{1}{\|z\|}}\lambda ^{-1}\eta (\mathrm {d}\lambda ),\qquad u\in \mathcal {H}_{ \beta}, \end{aligned}$$

then \(B\) and \(\mu \) satisfy Condition (2.2), i.e., \((0,B,0,\mu )\) is an admissible parameter set in the sense of Definition 2.1. The associated affine process \((X_{t})_{t\geq 0}\) is concentrated on the set \(\left \{ \lambda z\colon \lambda \geq 0\right \}\). In applications, we could choose \(z\) to be a Hilbert–Schmidt integral operator on \(H_{\beta}\), i.e., given by some kernel function \(k\colon \mathbb{R}_{+}\times \mathbb{R}_{+}\to \mathbb{R}\).

Let us verify that the corresponding affine process is subcritical for appropriate \(\gamma \), so that our results can be applied. Indeed, it is easy to see that

$$ B^{*}(u) = \gamma u+g \left \langle \frac{z}{\|z\|^{2}}, u\right \rangle \int _{0}^{\frac{1}{\|z\|}} \lambda ^{-1} \eta (\mathrm {d}\lambda ). $$

Hence we find that the effective drift is given by

$$ \widehat{B}(u) = \gamma u+g \left \langle \frac{z}{\|z\|^{2}}, u \right \rangle \int _{0}^{\infty} \lambda ^{-1} \eta (\mathrm {d}\lambda ), $$

which is well defined provided that \(\int _{0}^{\infty}\lambda ^{-1}\eta (\mathrm {d}\lambda ) < \infty \). To verify that the affine process is subcritical, let us first note that the semigroup \((\mathrm{e}^{t\widehat{B}})_{t \geq 0}\) on \(\mathcal{H}_{\beta}\) is given by

$$ \mathrm{e}^{t\widehat{B}}u = \mathrm{e}^{\gamma t}\bigg(u + \frac{a(u)}{a(g)}\big(\mathrm{e}^{t a(g)}-1\big)g\bigg), \qquad t \geq 0, $$

where \(a(u) = \langle \frac{z}{\|z\|^{2}}, u \rangle \int _{0}^{\infty} \lambda ^{-1} \eta (\mathrm {d}\lambda ) \geq 0\). Now suppose that

$$ \langle z,g \rangle > 0 \qquad \text{and} \qquad \gamma < -\left \langle \frac{z}{\|z\|^{2}}, g\right \rangle \int _{0}^{\infty} \lambda ^{-1} \eta (\mathrm {d}\lambda ). $$

Then \(a(g) > 0\) and hence \((\mathrm{e}^{t\widehat{B}})_{t \geq 0}\) is uniformly exponentially stable. This shows that the process is subcritical.
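
The subcriticality condition of this example can be checked numerically in a finite-dimensional toy version. The following Python sketch is an illustration under stated assumptions: \(g\) and \(z\) are replaced by vectors in \(\mathbb{R}^{2}\), and the number `c` stands for \(\int _{0}^{\infty}\lambda ^{-1}\eta (\mathrm {d}\lambda )\). Since \(\widehat{B}\) is a rank-one perturbation of \(\gamma \) times the identity, its eigenvalues are \(\gamma \) and \(\gamma +a(g)\). All names and values are hypothetical.

```python
import math

def effective_drift_matrix(gamma, g, z, c):
    """Matrix of u -> gamma*u + g*a(u) on R^2, where a(u) = c*<z,u>/||z||^2."""
    z_sq = z[0] ** 2 + z[1] ** 2
    a = [c * z[0] / z_sq, c * z[1] / z_sq]
    return [[gamma * (i == j) + g[i] * a[j] for j in range(2)] for i in range(2)]

def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return ((tr - root) / 2.0, (tr + root) / 2.0)

# Toy data (assumptions): here a(g) = 0.5 * <z,g> / ||z||^2 = 0.4.
g, z, c = (1.0, 2.0), (2.0, 1.0), 0.5
stable = eigenvalues_2x2(effective_drift_matrix(-1.0, g, z, c))    # gamma < -a(g)
unstable = eigenvalues_2x2(effective_drift_matrix(-0.3, g, z, c))  # gamma > -a(g)
```

In the first case both eigenvalues are negative, so the spectral bound is negative, while in the second case the eigenvalue \(\gamma +a(g)\) is positive and Assumption 3.1 fails.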

4.2 The stationary covariance regime for affine models

Next we proceed to construct the model introduced in (4.2) and (4.3) in the stationary covariance regime. For this purpose, let \((X_{t})_{t\geq 0}\) be an affine process on \(\mathcal {H}_{+}\) with initial value \(X_{0}=x\), associated with an admissible parameter set \((b,B,m,\mu )\). Let \((Y^{y}_{t})_{t\geq 0}\) be the unique (mild) solution to (4.2) given by the variation-of-constants formula

$$ Y^{y}_{t}=S(t)y+\int _{0}^{t}S(t-s)G(X_{s})\mathrm {d}s+\int _{0}^{t}S(t-s)D^{1/2}X_{s}^{1/2} \mathrm {d}W_{s},\qquad t\geq 0. $$
(4.5)

In other words, let \((Y^{y}_{t},X_{t})_{t\geq 0}\) be an affine stochastic covariance model on \(H_{\beta}\). Note that \((Y^{y}_{t},X_{t})_{t\geq 0}\) can be considered as a stochastic process on the probability basis \((\Omega , \mathcal {F}, \mathbb{F}, \mathbb{Q}_{x}):=(\Omega ^{1}\times \Omega ^{2}, \mathcal {F}^{1}\otimes \mathcal {F}^{2}, (\mathcal {F}^{1}_{t}\otimes \mathcal {F}_{t}^{2})_{t\geq 0}, \mathbb{Q}\otimes \mathbb{P}_{x})\), where we denote by \((\Omega ^{2},\mathcal {F}^{2},(\mathcal{F}_{t}^{2})_{t\ge 0}, \mathbb{P}_{x})\) the filtered probability space accommodating the affine process \((X_{t})_{t\geq 0}\) (see also Remark 2.3) and \((\Omega ^{1},\mathcal {F}^{1}, (\mathcal{F}_{t}^{1})_{t\ge 0}, \mathbb{Q})\) is another filtered probability space carrying the cylindrical Wiener process \(W\colon [0,\infty )\times \Omega \rightarrow H_{\beta}\) and the solution process \((Y_{t})_{t\geq 0}\) with \(\mathbb{Q}[Y_{0}=y]=1\). From now on, we denote the expectation with respect to the product measure \(\mathbb{Q}_{x}\) by \(\mathbb{E}_{x}\left [\,\cdot \,\right ]\). It was shown in Karbach [47] that affine processes \((X_{t})_{t\geq 0}\) on \(\mathcal {H}_{+}\) have versions with càdlàg paths, and in the following, we always consider a càdlàg version of \((X_{t})_{t\geq 0}\). It follows from Cox et al. [18] that under these conditions, the stochastic covariance model \((Y^{y}_{t},X_{t})_{t\geq 0}\) is well defined for every initial value \((y,x)\in H_{\beta} \times \mathcal {H}_{+}\). Heuristically, the process \((Y^{y}_{t},X_{t})_{t\geq 0}\) satisfies a similar transform formula for its mixed Fourier–Laplace transform as the process \((X_{t})_{t\geq 0}\) does for the Laplace transform in (1.1); see Benth et al. [9], Cox et al. [18].

If moreover Assumption 3.1 is satisfied, then by Theorem 3.3, there exists a unique invariant measure \(\pi \) for \((p_{t}(x,\,\cdot \,))_{t\geq 0}\), and Proposition 3.5 ensures the existence of the stationary process \((X_{t}^{\pi})_{t\geq 0}\). Now if there exists a mild solution \((\tilde{Y}_{t})_{t\geq 0}\) of (4.5) for \(y=0\) and where the process \((X_{t})_{t\geq 0}\) is replaced by \((X_{t}^{\pi})_{t\geq 0}\), then we call the joint process \((\tilde{Y}_{t},X^{\pi}_{t})_{t\geq 0}\) defined on \((\Omega , \mathcal {F}, \mathbb{F}, \mathbb{Q}_{\pi})\) (with \(\mathbb{Q}_{\pi}=\mathbb{Q}\otimes \mathbb{P}_{\pi}\) and where the expectation with respect to \(\mathbb{Q}_{\pi}\) is denoted by \(\mathbb{E}_{\pi}\left [\,\cdot \,\right ]\)) an affine stochastic covariance model on \(H_{\beta}\) in the stationary covariance regime. This terminology is inspired by the univariate setting in Keller-Ressel [49, Sect. 3]. In the following result, we give an affine transform formula for the affine stochastic covariance model \((Y^{y}_{t},X_{t})_{t\geq 0}\) in the stationary covariance regime, where for brevity we let \(G=0\).

Proposition 4.2

Assume that \((Y^{y}_{t},X_{t})_{t\geq 0}\) is an affine stochastic covariance model, where the admissible parameters of the process \((X_{t})_{t\geq 0}\) satisfy Assumption 3.1, and let \((\tilde{Y}_{t},X_{t}^{\pi})_{t\geq 0}\) be the model in the stationary covariance regime. Then for every \(T\geq 0\) and \(u=(u_{1},u_{2})\in \mathrm {i}H_{\beta}\times \mathcal {H}_{+}\), we have

$$\begin{aligned} \mathbb{E}_{\pi} [\mathrm {e}^{ \langle \tilde{Y}_{t},u_{1}\rangle _{\beta}-\langle X^{\pi}_{t},u_{2}\rangle _{\beta}} ]= \mathrm {e}^{-\Phi (t,u_{1},u_{2})-\int _{0}^{\infty}F(\psi _{2}(s,0,\psi _{2}(t,u_{1},u_{2}))) \mathrm {d}s}, \qquad t\in [0,T], \end{aligned}$$

where \(\Phi (\,\cdot \,,u_{1},u_{2})\), \(\psi _{1}(\,\cdot \,,u_{1},u_{2})\) and \(\psi _{2}(\,\cdot \,,u_{1},u_{2})\) are the unique solutions on \([0,T]\) of the differential equations

$$\begin{aligned} \textstyle\begin{cases} \displaystyle \frac{\partial \Phi}{\partial t}(t,u)=F\big(\psi _{2}(t,u) \big), &\quad \Phi (0,u)=0,\\ \displaystyle \psi _{1}(t,u)=u_{1}-\mathrm {i}\mathcal {A}^{*}\bigg(\mathrm {i}\int _{0}^{t} \psi _{1}(s,u)\mathrm {d}s\bigg), &\quad \psi _{1}(0,u)=u_{1}, \\ \displaystyle \frac{\partial \psi _{2}}{\partial t}(t,u)=\mathcal{R} \big(\psi _{1}(t,u), \psi _{2}(t,u)\big), & \quad \psi _{2}(0,u)=u_{2},\end{cases}\displaystyle \end{aligned}$$

where \(F\) is as in (2.6), \(\mathcal{R}(h,u):=R(u)-\frac{1}{2}(D^{1/2}h)\otimes (D^{1/2}h)\) with \(R\) as in (2.7), and \((\mathcal{A}^{*}, \mathrm {dom}(\mathcal{A}^{*}))\) denotes the adjoint operator of the generator \((\mathcal{A}, \mathrm {dom}(\mathcal{A}))\).

The proof of Proposition 4.2 is analogous to the finite-dimensional version in Keller-Ressel [49] and is thus omitted. As a particular case, we obtain an operator-valued extension of the Barndorff-Nielsen–Shephard stochastic covariance model as introduced below.

Example 4.3

The operator Barndorff-Nielsen–Shephard (BNS) stochastic covariance model \((Y^{y}_{t},X_{t})_{t\geq 0}\) of Benth et al. [9] is given by the pair of stochastic differential equations

$$\begin{aligned} \textstyle\begin{cases} \mathrm {d}Y_{t} = \mathcal {A}(Y_{t})\mathrm {d}t+D^{1/2}X_{t}^{1/2}\mathrm {d}W_{t},&\qquad Y_{0}=y \in H_{\beta}, \\ \mathrm {d}X_{t}= B(X_{t})\mathrm {d}t+ \mathrm {d}L_{t},&\qquad X_{0}=x\in \mathcal {H}_{+}, \end{cases}\displaystyle \end{aligned}$$
(4.6)

where \(B\) and \((L_{t})_{t\geq 0}\) are as in Example 3.7. Note that the equation for \((Y_{t})_{t\geq 0}\) in (4.6) is the same as (4.5) written in differential form with \(G=0\) and \((S(t))_{t\geq 0}\) denoting the strongly continuous semigroup generated by \((\mathcal{A},\mathrm{dom}(\mathcal{A}))\). For the \(\mathcal {H}_{+}\)-valued Ornstein–Uhlenbeck process \((X_{t})_{t\geq 0}\), we already showed the existence of a unique invariant measure \(\pi \) of \((X_{t})_{t\geq 0}\) in Example 3.7. Hence we may consider the operator BNS model in the stationary covariance regime and denote it by \((\tilde{Y}_{t}, X^{\pi}_{t})_{t\geq 0}\). In Cox et al. [18], it was shown that the operator-valued BNS model is a particular case of the SV models with affine pure-jump variance as given by (4.2) with \(\sigma _{t}=D^{1/2}X^{1/2}_{t}\), where \((X_{t})_{t\geq 0}\) is an affine process on \(\mathcal{H}_{+}\). Hence we obtain from Proposition 4.2 applied to this particular case that for every \((u_{1},u_{2})\in \mathrm {i}H_{\beta}\times \mathcal {H}_{+}\),

$$\begin{aligned} \mathbb{E}_{\pi} [\mathrm {e}^{\langle \tilde{Y}_{t},u_{1}\rangle _{\beta}-\langle X^{\pi}_{t}, u_{2}\rangle _{\beta}} ]=\exp \bigg( - \int _{0}^{t} \varphi _{L}\big( \psi (s,u_{1},u_{2})\big)\,\mathrm {d}s- \int _{0}^{\infty} \varphi _{L} (e^{s B^{*}}u_{2} )\,\mathrm {d}s\bigg), \end{aligned}$$

where \(\varphi _{L}\) is given by (3.7) and \(\psi (t,u_{1},u_{2})\) is explicitly known as

$$\begin{aligned} \psi (t,u_{1},u_{2})=\mathrm {e}^{tB^{*}}u_{2}+\frac{1}{2}\int _{0}^{t}\mathrm {e}^{(t- \tau )B^{*}}\big(D^{1/2}S^{*}(\tau ) u_{1}\big)^{\otimes 2}\mathrm {d}\tau . \end{aligned}$$

4.3 Pricing forward-start options on forwards

In this section, we study the geometric affine forward curve model in (4.4) determined by an affine stochastic covariance model \((Y^{y}_{t},X_{t})_{t\geq 0}\). In particular, we examine the long-time behaviour of the forward implied volatility in this model. Let us assume our model is already formulated under the risk-neutral measure \(\tilde{\mathbb{Q}}\), i.e., we assume \((F(T,\hat{T}))_{T\leq \hat{T}}\) is a \(\tilde{\mathbb{Q}}\)-martingale for all \(\hat{T}\geq 0\). A forward-start option with forward-start date \(\tau \geq 0\), forward maturity \(T\) and strike \(\mathrm {e}^{K}\) written on a forward with maturity date \(\hat{T}\) is a European option with payoff at time \(\tau +T\) given by

$$\begin{aligned} \bigg(\frac{F(\tau +T,\tau +\hat{T})}{F(\tau ,\tau +\hat{T})}-\mathrm {e}^{K} \bigg)^{+}. \end{aligned}$$
(4.7)

A forward-start option is a contract on the relative price difference of a forward contract at two times \(\tau \) and \(\tau +T\) in the future. In practice, it is used to price the future volatility of the underlying asset. Forward-start options are very common in commodity forward markets, and more complex derivatives such as cliquet options build on them; see e.g. Crosby [19]. Forward-start options on stocks are discussed in e.g. Jacquier and Roome [39,40], Kruse and Nögel [53], Keller-Ressel [49]. Here we restrict ourselves to the ratio-type payoff functions in (4.7), but similar results can be obtained for difference-type payoffs, i.e., \((F(\tau +T,\tau +\hat{T})-K F(\tau ,\tau +\hat{T}))^{+}\); see also Kruse and Nögel [53].

We now define the implied forward volatility of the model (4.4). First, let us denote the price of a forward-start option with payoff (4.7) by \(C_{\mathrm{fwd}}(\tau ,T,\hat{T},K)\). Then, as a reference model for the forward prices \(F(T,\hat{T})\), we take Black’s model, see Black [12], and denote the forward prices within this model by \(F^{\mathrm{B}}(T,\hat{T})\). We assume that we have the spot–forward relation

$$\begin{aligned} F^{\mathrm{B}}(T,\hat{T})=s_{T}\mathrm {e}^{r(\hat{T}-T)},\qquad 0\leq T \leq \hat{T}, \end{aligned}$$
(4.8)

where \(r\geq 0\) denotes the risk-free interest rate and \((s_{t})_{t\geq 0}\) the spot price process of the underlying commodity, which is given by a geometric Brownian motion with volatility parameter \(\sigma \). We denote by \(C^{\mathrm{B}}_{\mathrm{fws}}(\tau ,T,\hat{T},K,\sigma )\) the price in Black’s model of a forward-start option with an identical payoff function as in (4.7) and define the implied forward volatility \(\sigma (\tau ,T,\hat{T},K)\) as the unique solution to

$$ C^{\mathrm{B}}_{\mathrm{fws}}\big(\tau ,T,\hat{T},K,\sigma (\tau ,T, \hat{T},K)\big)=C_{\mathrm{fwd}}(\tau ,T,\hat{T},K). $$

In the following result we show that \(\sigma (\tau ,T,\hat{T},K)\) exists for all \(\tau ,K \geq 0\) and study its long-time behaviour as \(\tau \to \infty \).

Proposition 4.4

Let \(0\leq T\leq \hat{T}\) and denote by \(F(T,\hat{T})\) the forward price at time \(T\) with maturity date \(\hat{T}\) given by (4.4), where \((Y^{y}_{t},X_{t})_{t\geq 0}\) is an affine stochastic covariance model on \(H_{\beta}\) as defined in Sect. 4.2 with \((S(t))_{t\geq 0}\) the left-shift semigroup on \(H_{\beta}\). Moreover, let \((\tilde{Y}_{t},X_{t}^{\pi})_{t\geq 0}\) be the model in the stationary covariance regime and define \(\tilde{F}(T,\hat{T}):=\exp (\langle \tilde{Y}_{T},u_{\hat{T}-T} \rangle _{\beta })\). Suppose we model directly under the pricing measure \(\tilde{\mathbb{Q}}\) so that \((F(T,\hat{T}))_{T\leq \hat{T}}\) is a \(\tilde{\mathbb{Q}}\)-martingale for all \(\hat{T}\geq 0\). Then for all \(\tau ,K\geq 0\), the implied forward volatility \(\sigma (\tau ,T,\hat{T},K)\) exists and we have

$$\begin{aligned} \lim _{\tau \to \infty}\sigma (\tau ,T,\hat{T},K)=\tilde{\sigma}(T, \hat{T},K), \end{aligned}$$
(4.9)

where \(\tilde{\sigma}(T,\hat{T},K)\) denotes the implied volatility of a European call option with payoff function \((\tilde{F}(T,\hat{T})-\mathrm {e}^{K} )^{+}\).

Proof

First, we show for \(0\leq T\leq \hat{T}\) and \(K\geq 0\) a relation between the prices \(C^{\mathrm{B}} (T,\hat{T},K, \sigma )\) of a European call option and \(C_{\mathrm{fws}}^{\mathrm{B}} (\tau ,T,\hat{T},K, \sigma )\) of the forward-start call option with forward-start date \(\tau \geq 0\), both in Black’s model. Namely, let ℚ denote the unique risk-neutral measure in Black’s model and recall the price \(C_{\mathrm{fws}}^{\mathrm{B}} (\tau ,T,\hat{T},K, \sigma )\) of a forward-start option at time zero with forward-start date \(\tau \) written on the forward with maturity \(\hat{T}\). Inserting (4.8) into the payoff function and by risk-neutral pricing, we have

$$ C_{\mathrm{fws}}^{\mathrm{B}} (\tau ,T,\hat{T},K, \sigma )=\mathrm {e}^{-r( \tau +T)} \mathbb{E}_{\mathbb{Q}}\bigg[\bigg(\frac{s_{\tau +T}\mathrm {e}^{r(\hat{T}-T)}}{s_{\tau} \mathrm {e}^{r\hat{T}}}-\mathrm {e}^{K}\bigg)^{+}\bigg]. $$

It is known that in the Black–Scholes model, the forward-start call option and the European call option satisfy

$$ \mathrm {e}^{-r(\tau +T)} \mathbb{E}_{\mathbb{Q}}\bigg[\bigg(\frac{s_{\tau +T}}{s_{\tau}}-K\bigg)^{+}\bigg]=\mathrm {e}^{-r( \tau +T)}\mathbb{E}_{\mathbb{Q}} [ (s_{T}-K )^{+} ], $$

see also Keller-Ressel [49], and hence

$$\begin{aligned} C_{\mathrm{fws}}^{\mathrm{B}} (\tau ,T,\hat{T},K,\sigma )&=\mathrm {e}^{-r( \tau +T)}\mathrm {e}^{-rT} \mathbb{E}_{\mathbb{Q}}\bigg[\bigg(\frac{s_{\tau +T}}{s_{\tau}}-\mathrm {e}^{K'}\bigg)^{+}\bigg] \\ &=\mathrm {e}^{-r(\tau +T)}\mathrm {e}^{-rT}\mathbb{E}_{\mathbb{Q}} [ (s_{T}-\mathrm {e}^{K'} )^{+} ] \\ &=\mathrm {e}^{-r(\tau +T)}C^{\mathrm{BS}} (T,K',\sigma ), \end{aligned}$$

where \(K'=K+rT\) and the superscript \(\mathrm{BS}\) indicates that the underlying model for \((s_{t})_{t\geq 0}\) is the Black–Scholes model. From this and the definition of the implied forward volatility \(\sigma (\tau ,T,\hat{T},K)\), we have

$$\begin{aligned} C^{\mathrm{BS}}\big(T,K',\sigma (\tau ,T,\hat{T},K)\big)&=\mathrm {e}^{r( \tau +T)}C_{\mathrm{fws}}^{\mathrm{B}}\big(\tau ,T,\hat{T},K,\sigma ( \tau ,T,\hat{T},K)\big) \\ &=\mathrm {e}^{r(\tau +T)}C_{\mathrm{fwd}} (\tau ,T,\hat{T},K ). \end{aligned}$$
(4.10)
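The inversion behind (4.10) can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes \(s_{0}=1\), uses the relation \(C_{\mathrm{fws}}^{\mathrm{B}}=\mathrm {e}^{-r(\tau +T)}C^{\mathrm{BS}}(T,K',\sigma )\) with \(K'=K+rT\) derived above, and recovers the implied forward volatility by bisection. The function names and the bracketing interval are illustrative assumptions; \(\hat{T}\) does not appear because it cancels in Black's model.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(T: float, log_strike: float, sigma: float, r: float) -> float:
    """Black-Scholes price e^{-rT} E[(s_T - e^{log_strike})^+] with s_0 = 1.

    Requires sigma > 0 and T > 0.
    """
    vol = sigma * math.sqrt(T)
    d1 = (-log_strike + (r + 0.5 * sigma ** 2) * T) / vol
    d2 = d1 - vol
    return norm_cdf(d1) - math.exp(log_strike - r * T) * norm_cdf(d2)


def black_forward_start(tau: float, T: float, K: float, sigma: float, r: float) -> float:
    """Forward-start price in Black's model: e^{-r(tau+T)} C^BS(T, K + rT, sigma)."""
    return math.exp(-r * (tau + T)) * bs_call(T, K + r * T, sigma, r)


def implied_forward_vol(price: float, tau: float, T: float, K: float, r: float,
                        lo: float = 1e-8, hi: float = 10.0, tol: float = 1e-12) -> float:
    """Solve black_forward_start(..., sigma, ...) = price for sigma by bisection.

    The bracket [lo, hi] is an assumption; the price is increasing in sigma.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_forward_start(tau, T, K, mid, r) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

In this sketch the rescaled price \(\mathrm {e}^{r(\tau +T)}C_{\mathrm{fws}}^{\mathrm{B}}\) does not depend on \(\tau \), which is the property of the reference model that the proof exploits.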

Next, we compute the right-hand side of (4.10). Recall that for every \(t\in \mathbb{R}\) and \(f\in H_{\beta}\), we use the identification \(\delta _{t}(f)=\langle f, u_{t}\rangle \) for the evaluation functional with \(u_{t}\in H_{\beta}\). Moreover, we denote the expectation with respect to the pricing measure \(\tilde{\mathbb{Q}}\) by \(\mathbb{E}_{\tilde{\mathbb{Q}}}\left [\,\cdot \,\right ]\) (here we suppress the initial value \(x\) compared to \(\mathbb{Q}_{x}\) above). The payoff function of the forward-start option is given by (4.7); hence by risk-neutral pricing and inserting our model (4.4), we have

$$\begin{aligned} C_{\mathrm{fwd}} (\tau ,T,\hat{T},K ) &=\mathrm {e}^{-r(\tau +T)} \mathbb{E}_{\tilde{\mathbb{Q}}}\bigg[\bigg(\frac{F(\tau +T,\tau +\hat{T})}{F(\tau ,\tau +\hat{T})}-\mathrm {e}^{K}\bigg)^{+}\bigg] \\ &=\mathrm {e}^{-r(\tau +T)} \mathbb{E}_{\tilde{\mathbb{Q}}} [ (\mathrm {e}^{\langle Y^{y}_{\tau +T},u_{\hat{T}-T}\rangle _{\beta}-\langle Y^{y}_{\tau},u_{\hat{T}}\rangle _{\beta}}-\mathrm {e}^{K} )^{+} ]. \end{aligned}$$
(4.11)

Note that by the definition of \(Y_{\tau}\), we have

$$\begin{aligned} \langle S(T)Y_{\tau},u_{\hat{T}-T}\rangle _{\beta}&=\langle S(T+\tau )y,u_{ \hat{T}-T}\rangle _{\beta}+\bigg\langle \int _{0}^{\tau} S(T+\tau -s)G(X_{s}) \mathrm {d}s,u_{\hat{T}-T}\bigg\rangle _{\beta} \\ & \hphantom{=:} +\bigg\langle \int _{0}^{\tau} S(T+\tau -s)X_{s}^{1/2} \mathrm {d}W_{s},u_{ \hat{T}-T}\bigg\rangle _{\beta}, \end{aligned}$$

and the left-shift \(S(T)\) satisfies \(\langle S(T)Y^{y}_{\tau},u_{\hat{T}-T}\rangle _{\beta }=\langle Y_{ \tau}^{y}, u_{\hat{T}}\rangle _{\beta}\). Thus we obtain

$$\begin{aligned} \langle Y^{y}_{\tau +T},u_{\hat{T}-T}\rangle _{\beta}-\langle Y^{y}_{ \tau},u_{\hat{T}}\rangle _{\beta} =&\bigg\langle \int _{\tau}^{T+\tau}S(T+ \tau -s)G(X_{s})\,\mathrm {d}s,u_{\hat{T}-T}\bigg\rangle _{\beta} \\ &{} +\bigg\langle \int _{\tau}^{T+\tau}S(T+\tau -s)X_{s}^{1/2}\mathrm {d}W_{s},u_{ \hat{T}-T}\bigg\rangle _{\beta}. \qquad \quad \end{aligned}$$
(4.12)

By the independent increments property and the Markov property of \((X_{t})_{t\geq 0}\), the sum of the integrals inside the inner products on the right-hand side of (4.12) has the same distribution as \(Y^{0}_{T}=\int _{0}^{T}S(T-s)G(X_{\tau +s}) \mathrm {d}s+\int _{0}^{T}S(T-s)X_{ \tau +s}^{1/2}\mathrm {d}W_{s}\). Hence for the expectation on the right-hand side in (4.11), we obtain

$$\begin{aligned} \mathbb{E}_{\tilde{\mathbb{Q}}} [ (\mathrm {e}^{\langle Y^{y}_{\tau +T},u_{\hat{T}-T}\rangle _{\beta}-\langle Y^{y}_{\tau},u_{\hat{T}}\rangle _{\beta}}-\mathrm {e}^{K} )^{+} ]= \mathbb{E}_{\tilde{\mathbb{Q}}}\big[\mathbb{E} [ (\mathrm {e}^{\langle Y^{0}_{T},u_{\hat{T}-T}\rangle _{\beta}}-\mathrm {e}^{K} )^{+}\lvert \,X_{\tau} ]\big], \end{aligned}$$

and thus we conclude that the left-hand side of (4.10) is given by

$$\begin{aligned} C^{\mathrm{BS}}\big(T,K',\sigma (\tau ,T,\hat{T},K)\big)= \mathbb{E}_{\tilde{\mathbb{Q}}}\big[\mathbb{E} [(\mathrm {e}^{\langle Y^{0}_{T},u_{\hat{T}-T}\rangle _{\beta}}-\mathrm {e}^{K})^{+}\lvert \,X_{\tau} ]\big]. \end{aligned}$$

Now taking the limit \(\tau \to \infty \) and since \(\tilde{F}(T,\hat{T})=\mathrm {e}^{\langle \tilde{Y}_{T},u_{\hat{T}-T} \rangle _{\beta}}\), we obtain

$$\begin{aligned} \lim _{\tau \to \infty}C^{\mathrm{BS}}\big(T,K',\sigma (\tau ,T, \hat{T},K)\big) &= \mathbb{E}_{\tilde{\mathbb{Q}}} [(\mathrm {e}^{\langle \tilde{Y}_{T},u_{\hat{T}-T}\rangle _{\beta}}-\mathrm {e}^{K})^{+} ] \\ &=\mathbb{E}_{\tilde{\mathbb{Q}}}\big[\big(\tilde{F}(T,\hat{T})-\mathrm {e}^{K}\big)^{+}\big]. \end{aligned}$$
(4.13)

The term \(\mathbb{E}_{\tilde{\mathbb{Q}}} [ (\tilde{F}(T,\hat{T})-\mathrm {e}^{K} )^{+} ]\) on the right-hand side of (4.13) is simply \(\mathrm {e}^{rT}\) times the price of a European call option, and the continuity of \(\sigma \mapsto C^{\mathrm{BS}} (T,K',\sigma ) \) gives

$$\begin{aligned} \lim _{\tau \to \infty}C^{\mathrm{BS}}\big(T,K',\sigma (\tau ,T, \hat{T},K)\big)=C^{\mathrm{BS}}\Big(T,K',\lim _{\tau \to \infty} \sigma (\tau ,T,\hat{T},K)\Big), \end{aligned}$$

from which we conclude (4.9), since (4.13) has a unique solution in terms of the Black–Scholes implied volatility. □

According to Proposition 4.4, the prices of forward-start options for large forward dates \(\tau \) can be accurately approximated by those of European plain vanilla options modelled in the stationary covariance regime. This result is significant because pricing options in affine stochastic covariance models, particularly in the stationary covariance regime, can be done with relatively low computational effort, as demonstrated in Karbach [46, Chap. 4]. This is due to the affine transform formula provided by Proposition 4.2, which allows computing option prices using Fourier inversion, as discussed in references such as Carr and Madan [14], Kallsen et al. [45], Hubalek et al. [37]. Our ongoing work focuses on the numerical aspects of option pricing in infinite-dimensional affine stochastic covariance models.
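To indicate how Fourier inversion turns a transform formula into a call price, here is a minimal sketch under explicit assumptions. It uses a Lewis-type single-integral representation for the undiscounted price \(\mathbb{E}[(\mathrm {e}^{X}-\mathrm {e}^{K})^{+}]\) of a martingale \(\mathrm {e}^{X}\) with \(\mathbb{E}[\mathrm {e}^{X}]=1\), which is one of several possible inversion formulas and not necessarily the one used in the cited references. The characteristic function is a Gaussian stand-in chosen so that the result can be compared with Black's formula; in the model it would be replaced by the transform from Proposition 4.2. The names `call_by_fourier` and `gaussian_cf` and the truncation parameters are hypothetical.

```python
import cmath
import math


def call_by_fourier(K, char_fn, u_max=60.0, n=6000):
    """Undiscounted call price E[(e^X - e^K)^+] for a martingale e^X, E[e^X] = 1.

    Lewis-type formula:
        C = 1 - e^{K/2}/pi * int_0^inf Re[e^{-iuK} phi(u - i/2)] / (u^2 + 1/4) du,
    where phi(z) = E[e^{izX}]. The integral is truncated at u_max (an assumption)
    and evaluated with the trapezoidal rule on n subintervals.
    """
    h = u_max / n
    total = 0.0
    for j in range(n + 1):
        u = j * h
        val = (cmath.exp(-1j * u * K) * char_fn(u - 0.5j)).real / (u * u + 0.25)
        total += val * (0.5 if j in (0, n) else 1.0)
    return 1.0 - math.exp(0.5 * K) / math.pi * h * total


def gaussian_cf(v):
    """Characteristic function z -> E[e^{izX}] for X ~ N(-v/2, v) (stand-in model)."""
    return lambda z: cmath.exp(-0.5j * z * v - 0.5 * z * z * v)


def black_call(K, v):
    """Black's formula for E[(e^X - e^K)^+] with X ~ N(-v/2, v)."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (-K + 0.5 * v) / math.sqrt(v)
    return cdf(d1) - math.exp(K) * cdf(d1 - math.sqrt(v))
```

For the Gaussian stand-in, `call_by_fourier(K, gaussian_cf(v))` should agree with `black_call(K, v)` up to quadrature error, which gives a simple sanity check before a model-specific transform is plugged in.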

5 Proofs of the main results

Throughout this section, we assume that \((b,B,m,\mu )\) is an admissible parameter set according to Definition 2.1. We denote the unique subcritical affine process associated with \((b,B,m,\mu )\) through Theorem 2.2 by \((X_{t})_{t\geq 0}\) and its family of transition kernels by \((p_{t}(x,\,\cdot \,))_{t\geq 0}\). We set \(P_{t}f:=\int _{\mathcal {H}}f(\xi )p_{t}(\,\cdot \,,\mathrm {d}\xi )\) for all measurable functions \(f\) such that the integral exists. Recall from Cox et al. [17] that the transition semigroup \((P_{t})_{t\geq 0}\) is a generalised Feller semigroup on the space \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\); see also Appendix A.

5.1 Some properties of the generalised Riccati equations (2.4), (2.5)

In this section, we consider the long-time behaviour of the solutions \(\phi (\,\cdot \,,u)\) and \(\psi (\,\cdot \,,u)\) of the generalised Riccati equations in (2.4), (2.5). The arguments in this section are analogous to the finite-dimensional setting in Friesen et al. [32] and are thus left to the reader. We recall from Cox et al. [17, Sect. 3] that for every \(u\in \mathcal {H}_{+}\), there exists a unique and global solution \(\psi (\,\cdot \,,u)\in C^{1}(\mathbb{R}_{+},\mathcal {H}_{+})\) to (2.5). Given \(\psi (\,\cdot \,,u)\), we solve (2.4) by mere integration and obtain \(\phi (\,\cdot \,,u)\in C^{1}(\mathbb{R}_{+},\mathbb{R}_{+})\) given by \(\phi (t,u)=\int _{0}^{t}F(\psi (s,u))\,\mathrm {d}s\). This means that we can write the affine transform formula (2.3) as

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle \xi ,u\rangle}p_{t}(x,\mathrm {d}\xi )=\exp \bigg( -\int _{0}^{t}F\big(\psi (s,u)\big)\,\mathrm {d}s-\langle x, \psi (t,u) \rangle \bigg). \end{aligned}$$

Moreover, we recall that the unique solution \(\psi (\,\cdot \,,u)\) to (2.5) satisfies the flow equation

$$\begin{aligned} \psi (t+s,u)=\psi \big(t,\psi (s,u)\big). \end{aligned}$$
(5.1)

In the next result, we show that \(F\) and \(R\) are continuous functions on \(\mathcal {H}_{+}\) and grow at most quadratically.

Lemma 5.1

Let \((b,B,m,\mu )\) be an admissible parameter set according to Definition 2.1 and let \(F\) and \(R\) be given by (2.6) and (2.7), respectively. Then \(F\) and \(R\) are continuous on \(\mathcal {H}_{+}\) and for all \(u\in \mathcal {H}_{+}\), we have

$$\begin{aligned} | F(u)|&\leq \bigg( \|b\|+ \int _{\mathcal {H}_{+}\setminus \{0\}} \|\xi\|^{2}m(\mathrm {d}\xi )\bigg) (\|u\|+\|u\|^{2} ), \\ \|R(u)\|&\leq \big( \|B\|_{\mathcal {L}(\mathcal {H})}+ \|\mu (\mathcal {H}_{+}\setminus \{0\})\|\big) (\|u\| + \|u\|^{2} ) . \end{aligned}$$

Proof

See Karbach [47, Lemma 5.13]. □

Assumption 3.1 implies that the semigroup \((\mathrm {e}^{t\hat{B}})_{t\geq 0}\) satisfies (3.1), i.e., \((\mathrm {e}^{t\hat{B}})_{t\geq 0}\) is uniformly exponentially stable. This has the following consequence on the solution \(\psi (\,\cdot \,,u)\) of the generalised Riccati equation (2.5).

Lemma 5.2

Let \((b,B,m,\mu )\) be an admissible parameter set according to Definition 2.1 and for \(u\in \mathcal {H}_{+}\), let \(\psi (\,\cdot \,,u)\) be the unique solution to (2.5). Then

$$\begin{aligned} \|\psi (t,u)\|\leq \|\mathrm {e}^{t\hat{B}}\|_{\mathcal {L}(\mathcal {H})}\|u\|, \qquad \forall t\geq 0. \end{aligned}$$

If moreover Assumption 3.1 is satisfied, then \(\lim _{t\to \infty}\psi (t,u)=0\).
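The flow equation (5.1) and the bound in Lemma 5.2 can be illustrated in a one-dimensional toy setting. The sketch below is only an illustration under assumptions: the scalar function `R` with linear part \(-\beta u\) and a nonpositive jump-type nonlinearity is a hypothetical stand-in for (2.7), the parameter values are arbitrary, and the Riccati equation is integrated with a classical Runge-Kutta scheme.

```python
import math

# Hypothetical scalar parameters: BETA plays the role of the decay rate of
# e^{t \hat{B}}, KAPPA scales a jump-type nonlinearity.
BETA, KAPPA = 0.8, 0.5


def R(u: float) -> float:
    """Scalar stand-in for the Riccati vector field: linear part minus a
    nonnegative term, so that R(u) <= -BETA * u for u >= 0."""
    return -BETA * u - KAPPA * (math.exp(-u) - 1.0 + u)


def psi(t: float, u: float, steps: int = 2000) -> float:
    """Solve d/dt psi = R(psi), psi(0) = u, on [0, t] with classical RK4."""
    h = t / steps
    y = u
    for _ in range(steps):
        k1 = R(y)
        k2 = R(y + 0.5 * h * k1)
        k3 = R(y + 0.5 * h * k2)
        k4 = R(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y
```

In this toy example one can compare `psi(1.2, u)` with `psi(0.7, psi(0.5, u))` to see the flow property, and `psi(t, u)` with `math.exp(-BETA * t) * u` to see the exponential bound and the decay to zero.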

5.2 Invariant measures for affine processes on \(\mathcal {H}_{+}\)

For two measures \(\nu _{1},\nu _{2}\in \mathcal {M}(\mathcal {H}_{+})\), we denote the convolution of \(\nu _{1}\) and \(\nu _{2}\) by \(\nu _{1}*\nu _{2}\). In the following result, we give an important convolution property of the transition kernels \(p_{t}(x,\,\cdot \,)\).

Lemma 5.3

Let \((Y_{t})_{t\geq 0}\) be the unique affine process associated with the admissible parameter set \((0,B,0,\mu )\) and transition kernels \((q_{t}(x,\,\cdot \,))_{t\geq 0}\). Then for every \(t\geq 0\) and \(x\in \mathcal {H}_{+}\), we have

$$\begin{aligned} p_{t}(x,\,\cdot \,)=p_{t}(0,\,\cdot \,)*q_{t}(x,\,\cdot \,). \end{aligned}$$

Proof

Since \(b=0\) and \(m=0\), the function \(F\) in (2.4) vanishes, see also (2.6), and thus \(\phi (t,u)=0\) for all \(t\geq 0\). Hence for every \(t\geq 0\), the affine transform formula (2.3) for \(Y_{t}\) takes the form

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle u, \xi \rangle}q_{t}(x,\mathrm {d}\xi )=\exp \big(-\langle \psi (t,u),x\rangle \big)\qquad \text{for }u\in \mathcal {H}_{+}. \end{aligned}$$

Now let \((X_{t})_{t\geq 0}\) denote the unique affine process associated with the admissible parameter set \((b,B,m,\mu )\) and denote its transition kernels by \(p_{t}(x,\,\cdot \,)\). Let \(t\geq 0\) be arbitrary and \(u\in \mathcal {H}_{+}\); then

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\!\mathrm {e}^{-\langle u,\xi \rangle}\big(p_{t}(0,\,\cdot \,)*q_{t}(x, \,\cdot \,)\big)(\mathrm {d}\xi )&=\int _{ \mathcal {H}_{+}}\bigg(\int _{ \mathcal {H}_{+}}\!\mathrm {e}^{- \langle u,\xi _{1}+\xi _{2}\rangle}p_{t}(0,\mathrm {d}\xi _{1})\bigg)q_{t}(x, \mathrm {d}\xi _{2}) \\ &=\mathrm {e}^{-\phi (t,u)}\int _{ \mathcal {H}_{+}}\!\mathrm {e}^{-\langle u,\xi _{2}\rangle}q_{t}(x, \mathrm {d}\xi _{2}) \\ &=\mathrm {e}^{-\phi (t,u)}\mathrm {e}^{-\langle \psi (t,u),x\rangle}, \end{aligned}$$

which completes the proof thanks to (2.3) and the fact that the functions \(x\mapsto \mathrm {e}^{-\langle u,x\rangle}\) characterise measures; see Cox et al. [18, Lemma A.1]. □

In the next result, we show that the Laplace transform of a subcritical affine process converges pointwise as the time \(t\) tends to infinity.

Proposition 5.4

Let \((X_{t})_{t\geq 0}\) be an affine process associated with the admissible parameter set \((b,B,m,\mu )\) satisfying Assumption 3.1. Then for all \(u,x\in \mathcal {H}_{+}\), we have

$$\begin{aligned} \lim _{t\to \infty} \mathbb{E}_{x} [\mathrm {e}^{-\langle u, X_{t}\rangle} ]= \exp \bigg( -\int _{0}^{\infty}F\big(\psi (s,u)\big)\,\mathrm {d}s\bigg)\in [0, \infty ). \end{aligned}$$
(5.2)

Proof

Let \(u,x \in \mathcal {H}_{+}\). Then by Lemma 5.2 and (3.1), we have

$$\begin{aligned} |\langle \psi (t,u),x\rangle |\leq \|\psi (t,u)\|\|x\|\leq \|\mathrm {e}^{t\hat{B}}\|_{\mathcal {L}(\mathcal {H})}\|x\|\|u\|\leq M\mathrm {e}^{-\delta t} \|x\|\|u\| . \end{aligned}$$

Lemma 5.1 gives

$$\begin{aligned} |F(\psi (t,u))|\leq C \big(\|\psi (t,u)\|+\|\psi (t,u)\|^{2} \big)\leq C M^{2}\mathrm {e}^{-\delta t}(\|u\|+\|u\|^{2}) \end{aligned}$$
(5.3)

with \(C=\|b\|+ \int _{\mathcal {H}_{+}\setminus \{0\}}\|\xi\|^{2}m(\mathrm {d}\xi )\). For every \(u\in \mathcal {H}_{+}\), this implies

$$\begin{aligned} \int _{0}^{\infty}\big|F\big(\psi (s,u)\big)\big|\,\mathrm {d}s\leq \frac{C M^{2}}{\delta}(\|u\|+\|u\|^{2})< \infty , \end{aligned}$$

and hence the limit \(\lim _{t\to \infty}\phi (t,u)=\int _{0}^{\infty}F(\psi (s,u))\,\mathrm {d}s\) exists for every \(u\in \mathcal {H}_{+}\). This, continuity of the exponential function and the fact that \(\langle \psi (t,u),x\rangle \to 0\) for all \(x,u\in \mathcal {H}_{+}\) as \(t\to \infty \) by Lemma 5.2 imply (5.2). □
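A scalar example makes the limit in (5.2) concrete. The sketch below assumes a one-dimensional Ornstein–Uhlenbeck-type process with mean-reversion rate \(\lambda \), driven by a compound Poisson subordinator with intensity \(c\) and exponential jump sizes with rate \(\eta \); these choices and all names are assumptions made for illustration. In this case \(\psi (t,u)=\mathrm {e}^{-\lambda t}u\) and \(F(u)=cu/(\eta +u)\), and the limit of the Laplace transform is \((1+u/\eta )^{-c/\lambda }\), the Laplace transform of a Gamma distribution.

```python
import math

# Hypothetical scalar parameters: mean-reversion rate, jump intensity,
# and rate of the exponential jump-size distribution.
LAM, C, ETA = 1.5, 2.0, 4.0


def psi(t: float, u: float) -> float:
    """Solution of the linear equation d/dt psi = -LAM * psi, psi(0) = u."""
    return math.exp(-LAM * t) * u


def F(u: float) -> float:
    """Laplace exponent of the compound Poisson subordinator with Exp(ETA) jumps."""
    return C * u / (ETA + u)


def phi(t: float, u: float, n: int = 4000) -> float:
    """phi(t, u) = int_0^t F(psi(s, u)) ds by the composite Simpson rule (n even)."""
    h = t / n
    total = F(psi(0.0, u)) + F(psi(t, u))
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * F(psi(j * h, u))
    return total * h / 3.0


def laplace_transform(t: float, x: float, u: float) -> float:
    """E_x[exp(-u X_t)] = exp(-phi(t, u) - x * psi(t, u))."""
    return math.exp(-phi(t, u) - x * psi(t, u))


def stationary_laplace(u: float) -> float:
    """Closed-form limit exp(-int_0^inf F(psi(s, u)) ds) = (1 + u/ETA)^(-C/LAM)."""
    return (1.0 + u / ETA) ** (-C / LAM)
```

Evaluating `laplace_transform(t, x, u)` for increasing `t` and different starting points `x` shows the convergence to `stationary_laplace(u)` independently of `x`, as asserted by (5.2).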

The next result asserts uniform boundedness in time of the transition semigroup \((P_{t})_{t\geq 0}\) in the operator norm on \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\).

Lemma 5.5

Let \((X_{t})_{t\geq 0}\) be an affine process associated with the admissible parameter set \((b,B,m,\mu )\) satisfying Assumption 3.1 and denote its transition semigroup by \((P_{t})_{t\geq 0}\). Then we have

$$\begin{aligned} \sup _{t\geq 0}\|P_{t}\|_{\mathcal {L}(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}))}< \infty . \end{aligned}$$
(5.4)

Proof

Recall that \(\rho (x)=1+\|x\|^{2}\) and note that for every \(f\in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\), we have \(|f(y)| \leq \rho (y) \| f\|_{\mathcal{B}_{\rho}}\) and hence

$$ \|P_{t}f\|_{ \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})} = \sup _{x\in \mathcal {H}_{+}}\! \rho (x)^{-1}\bigg| \int _{\mathcal{H}_{+}} \!\!f(y)p_{t}(x,\mathrm {d}y)\bigg| \leq \|f\|_{ \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})}\|P_{t}\rho\|_{ \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})}, $$

which yields \(\sup _{t\geq 0}\|P_{t}\|_{\mathcal {L}(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}))}\leq \sup _{t\geq 0} \|P_{t}\rho\|_{ \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})}\). Let \((e_{i})_{i\in \mathbb{N}}\) be an orthonormal basis of ℋ and recall that by Cox et al. [17, Remark 4.6], we have \(P_{t}\rho (x)=\mathbb{E}_{x}\left [\rho (X_{t})\right ]\) for all \(t\geq 0\). Hence by Parseval’s identity, we conclude that

$$\begin{aligned} 0\leq P_{t}\rho (x)=1+\mathbb{E}_{x} [\|X_{t}\|^{2} ]=1+\sum _{i=1}^{ \infty} \mathbb{E}_{x} [\langle X_{t},e_{i}\rangle ^{2} ]. \end{aligned}$$

Using (2.11) with \(v=w=e_{i}\) for \(i\in \mathbb{N}\), we find

$$\begin{aligned} \mathbb{E}_{x} [\langle X_{t}, e_{i}\rangle ^{2} ]&= \bigg( \int _{0}^{t} \langle \hat{b},\mathrm {e}^{s \hat{B}} e_{i}\rangle \,\mathrm {d}s+ \langle x, \mathrm {e}^{t \hat{B}}e_{i} \rangle \bigg)^{2} \\ & \hphantom{=:} + \int _{0}^{t}\int _{ \mathcal {H}_{+}\setminus \{0\}}\langle \xi , \mathrm {e}^{s \hat{B}} e_{i} \rangle ^{2}m(\mathrm {d}\xi )\,\mathrm {d}s \\ & \hphantom{=:} + \int _{0}^{t} \int _{0}^{s} \bigg\langle \hat{b},\mathrm {e}^{(s-u)\hat{B}} \int _{ \mathcal {H}_{+}\setminus \{0\}}\langle \xi ,\mathrm {e}^{u \hat{B}}e_{i}\rangle ^{2} \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}u\,\mathrm {d}s \\ & \hphantom{=:} + \int _{0}^{t} \bigg\langle x,\mathrm {e}^{(t-s)\hat{B}}\int _{ \mathcal {H}_{+}\setminus \{0\}} \langle \xi ,\mathrm {e}^{s \hat{B}}e_{i}\rangle ^{2} \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}} \bigg\rangle \,\mathrm {d}s \end{aligned}$$

and hence

$$\begin{aligned} \sum _{i=1}^{\infty} \mathbb{E}_{x} [\langle X_{t},e_{i}\rangle ^{2} ]& \leq 2\bigg\| \int _{0}^{t} \mathrm {e}^{s\hat{B}^{*}}\hat{b}\,\mathrm {d}s\bigg\| ^{2}+2\|\mathrm {e}^{t\hat{B}^{*}}x\|^{2} \end{aligned}$$
(5.5)
$$\begin{aligned} & \hphantom{=:} +\int _{0}^{t}\int _{ \mathcal {H}_{+}\setminus \{0\}}\|\mathrm {e}^{s\hat{B}^{*}}\xi\|^{2}m(\mathrm {d}\xi )\,\mathrm {d}s \end{aligned}$$
(5.6)
$$\begin{aligned} & \hphantom{=:} +\int _{0}^{t}\int _{0}^{s}\int _{ \mathcal {H}_{+}\setminus \{0\}} \|\mathrm {e}^{u\hat{B}^{*}}\xi\|^{2}\bigg\langle \hat{b},\mathrm {e}^{(s-u) \hat{B}}\frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}u\,\mathrm {d}s \end{aligned}$$
(5.7)
$$\begin{aligned} & \hphantom{=:} +\int _{0}^{t}\int _{ \mathcal {H}_{+}\setminus \{0\}} \|\mathrm {e}^{s\hat{B}^{*}}\xi\|^{2} \bigg\langle x, \mathrm {e}^{(t-s)\hat{B}} \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}s. \end{aligned}$$
(5.8)

In the following, we show that all four terms in (5.5)–(5.8) converge as \(t\to \infty \) uniformly in \(x\), which then yields (5.4).

Note first that the adjoint semigroup \((\mathrm {e}^{t\hat{B}^{*}})_{t\geq 0}\) generated by \(\hat{B}^{*}\), the adjoint of \(\hat{B}\), is also uniformly exponentially stable as \(\|\mathrm {e}^{t\hat{B}}\|_{\mathcal {L}(\mathcal {H})}=\|\mathrm {e}^{t\hat{B}^{*}}\|_{\mathcal {L}(\mathcal {H})}\) for all \(t\geq 0\). For the first term on the right-hand side of (5.5), we have \(\int _{0}^{t} \| \mathrm {e}^{s\hat{B}^{*}}\hat{b} \| \,\mathrm {d}s\leq \frac{M}{\delta} \|\hat{b}\|\). The second term in (5.5) vanishes as \(t\to \infty \) since \((\mathrm {e}^{t\hat{B}^{*}})_{t\geq 0}\) is uniformly exponentially stable. Note that \(s\mapsto M^{2} \mathrm {e}^{-2\delta s}\int _{ \mathcal {H}_{+}\setminus \{0\}}\|\xi\|^{2}m(\mathrm {d}\xi )\) is an integrable majorant for the term in (5.6) and thus the integral converges for \(t\to \infty \). For (5.7), note that \(\langle \hat{b},\mathrm {e}^{(s-u)\hat{B}} \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}} \rangle \geq 0\) for every \(s,u\in \mathbb{R}_{+}\) with \(s\geq u\). This follows from the admissible parameter conditions, which imply that \(\hat{b}\in \mathcal {H}_{+}\) and \(\mathrm {e}^{(s-u)\hat{B}}(\mathcal {H}_{+})\subseteq \mathcal {H}_{+}\) whenever \(s\geq u\). Hence we have

$$\begin{aligned} &\int _{0}^{\infty}\int _{0}^{s}\int _{ \mathcal {H}_{+}\setminus \{0\}} \|\mathrm {e}^{u\hat{B}^{*}}\xi\|^{2}\bigg\langle \hat{b},\mathrm {e}^{(s-u) \hat{B}}\frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}u\,\mathrm {d}s\\ & \leq \frac{1}{2}\frac{M^{3}}{\delta ^{2}}\|\hat{b}\| \|\mu (\mathcal {H}_{+}\setminus \{0\})\|. \end{aligned}$$

Finally, note that \(\int _{0}^{t}\mathrm {e}^{-2\delta s}\mathrm {e}^{-\delta (t-s)}\mathrm {d}s= \frac{1}{\delta}(\mathrm {e}^{-\delta t}-\mathrm {e}^{-2\delta t})\) and hence the last term in (5.8) vanishes as \(t\to \infty \), which can be seen from

$$\begin{aligned} &\int _{0}^{t}\int _{ \mathcal {H}_{+}\setminus \{0\}}\|\mathrm {e}^{s\hat{B}^{*}}\xi\|^{2} \bigg\langle x, \mathrm {e}^{(t-s)\hat{B}} \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}s \\ & \leq M^{3}\|\mu (\mathcal {H}_{+}\setminus \{0\})\|\|x\| \int _{0}^{t}\mathrm {e}^{-2 \delta s}\mathrm {e}^{-\delta (t-s)}\,\mathrm {d}s \\ & \leq \frac{M^{3}}{\delta} \|\mu (\mathcal {H}_{+}\setminus \{0\})\|\|x\| (\mathrm {e}^{- \delta t}-\mathrm {e}^{-2\delta t}). \end{aligned}$$
(5.9)

Thus we obtain \(\sup _{t\geq 0}\sup _{x\in \mathcal {H}_{+}}\rho (x)^{-1}\mathbb{E}_{x}\left [\rho (X_{t})\right ]<\infty \), which proves the statement. □
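The elementary exponential integrals behind the estimates for (5.7) and (5.8) can be checked numerically. The sketch below is an illustration only: the value of \(\delta \), the truncation of the outer integral and the function names are assumptions. It verifies \(\int _{0}^{t}\mathrm {e}^{-2\delta s}\mathrm {e}^{-\delta (t-s)}\,\mathrm {d}s=\frac{1}{\delta}(\mathrm {e}^{-\delta t}-\mathrm {e}^{-2\delta t})\) and that integrating this expression over \(t\in [0,\infty )\) gives \(1/(2\delta ^{2})\), which is the constant multiplying \(M^{3}\|\hat{b}\|\|\mu (\mathcal {H}_{+}\setminus \{0\})\|\) when the exponential majorants are inserted into (5.7).

```python
import math


def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * f(a + j * h)
    return total * h / 3.0


def inner_numeric(t: float, delta: float) -> float:
    """Numerical value of int_0^t e^{-2 delta s} e^{-delta (t - s)} ds."""
    return simpson(lambda s: math.exp(-2.0 * delta * s) * math.exp(-delta * (t - s)), 0.0, t)


def inner_closed(t: float, delta: float) -> float:
    """Closed form (e^{-delta t} - e^{-2 delta t}) / delta of the inner integral."""
    return (math.exp(-delta * t) - math.exp(-2.0 * delta * t)) / delta


def outer_numeric(delta: float, s_max: float = 80.0) -> float:
    """Integral of inner_closed over [0, s_max]; s_max truncates [0, inf)."""
    return simpson(lambda s: inner_closed(s, delta), 0.0, s_max, n=8000)
```

For example, with `delta = 0.7` the value of `outer_numeric(delta)` is close to `1 / (2 * delta ** 2)`, and `inner_closed(t, delta)` tends to zero as `t` grows, in line with the vanishing of the last term in (5.9).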

In the next result, we show first that for every \(f\in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\), the transition semigroup \((P_{t})_{t\geq 0}\) converges in \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) as \(t\to \infty \), and subsequently, we use this to define a continuous linear functional on \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) given by the limits.

Proposition 5.6

For all \(f\in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\), the limit \(\lim _{t\to \infty}P_{t}f\) in \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) exists, and \(\pi (f):=\lim _{t\to \infty}P_{t}f(x)\) defines a continuous linear functional on \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\).

Proof

By Proposition 5.4, we know that for every \(u\in \mathcal {H}_{+}\),

$$\begin{aligned} \lim _{t\to \infty}(P_{t}\mathrm {e}^{-\langle u,\,\cdot \,\rangle})(x)=\mathrm {e}^{- \int _{0}^{\infty}F(\psi (s,u))\,\mathrm {d}s}, \qquad \forall x\in \mathcal {H}_{+}. \end{aligned}$$

For \(u\in \mathcal {H}_{+}\), define \(\pi _{u}=\mathrm {e}^{-\int _{0}^{\infty}F(\psi (s,u))\,\mathrm {d}s}\mathbf {1}\), where \(\mathbf {1}\) denotes the constant function one. We claim that the sequence \((P_{t}\mathrm {e}^{-\langle u, \,\cdot \,\rangle})_{t\geq 0}\) converges in \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) to the constant function \(\pi _{u} \in \mathcal{B}_{\rho}(\mathcal{H}_{+,{\mathrm{w}}})\). Indeed, we have

$$\begin{aligned} \|P_{t}\mathrm {e}^{-\langle u,\,\cdot \,\rangle}-\pi _{u}\|_{\rho} &= \sup _{x\in \mathcal {H}_{+}} \frac{ \lvert (\mathrm {e}^{-\int _{0}^{t}F(\psi (s,u))\mathrm {d}s-\langle \psi (t,u),x\rangle}-\mathrm {e}^{-\int _{0}^{\infty}F(\psi (s,u))\mathrm {d}s} ) \rvert}{\rho (x)} \\ &\leq \sup _{x\in \mathcal {H}_{+}} \frac{ \lvert \int _{t}^{\infty}F(\psi (s,u))\,\mathrm {d}s-\langle \psi (t,u),x\rangle \rvert}{\rho (x)} \\ &\leq \int _{t}^{\infty}\big|F\big(\psi (s,u)\big)\big|\mathrm {d}s + \| \psi (t,u)\|\sup _{x \in \mathcal {H}_{+}} \frac{\|x\|}{\rho (x)}, \end{aligned}$$

where we have used \(\rho (x) = 1 + \|x\|^{2}\). The first term converges to zero due to (5.3), while the second tends to zero by Lemma 5.2.

Let \(\mathcal {D}:=\mathrm {lin}( \{ \mathrm {e}^{-\langle u,\,\cdot \, \rangle}\colon u\in \mathcal {H}_{+}\})\) and define \(\pi \) as the linear extension of \(\pi _{u}\) onto \(\mathcal {D}\). In particular, we have \(\lim _{t \to \infty}P_{t}f = \pi (f)\) in \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) for every \(f\in \mathcal {D}\). In view of Lemma 5.5, we know that \(\sup _{t\geq 0}\|P_{t}\|_{\mathcal {L}(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}))}<\infty \) and therefore \(|\pi (f)| \leq \sup _{t\geq 0}\|P_{t}\|_{\mathcal {L}(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}))} \|f\|_{ \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})}\), i.e., \(\pi \) is bounded on \(\mathcal {D}\). Since \(\mathcal {D}\) is dense in \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\), see Cox et al. [17, Lemma 4.7], this means that there exists a unique extension of \(\pi \) to a continuous linear functional on \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\), which we also denote by \(\pi \). We have thus proved the existence of \(\pi \in \mathcal {L}(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}),\mathbb{R})\) and it only remains to show that \(P_{t}f\to \pi (f)\) as \(t\to \infty \) for all \(f\in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\). But this is an immediate consequence of an \(\varepsilon /3\)-argument using \(\sup _{t\geq 0}\|P_{t}\|_{\mathcal {L}(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}))}<\infty \) and \(\overline{\mathcal {D}} = \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\). Thus we obtain the assertion. □

In the following result, we prove that the functional \(\pi \) is represented by a unique probability measure on \(\mathcal {B}(\mathcal {H}_{+})\).

Lemma 5.7

Let \(\pi \) denote the continuous linear functional in Proposition 5.6. Then there exists a unique probability measure \(\nu \) on \(\mathcal {B}(\mathcal {H}_{+})\) such that

$$\begin{aligned} \pi (f)=\int _{ \mathcal {H}_{+}}f(\xi )\nu (\mathrm {d}\xi )\qquad \textit{for all } f \in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}), \end{aligned}$$
(5.10)

and \(\nu \) is inner regular on \(\mathcal {B}(\mathcal {H}_{+})\) when \(\mathcal {H}_{+}\) is equipped with the weak topology.

Proof

By an application of the Riesz representation theorem in Cuchiero and Teichmann [22, Theorem 2.4], there exists a unique finite signed Radon measure \(\nu \) on \(\mathcal {B}(\mathcal {H}_{+})\) such that (5.10) and

$$\begin{aligned} \int _{ \mathcal {H}_{+}} (1+\|\xi\|^{2} )|\nu |(\mathrm {d}\xi )=\|\pi\|_{\mathcal {L}( \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}),\mathbb{R})} \end{aligned}$$
(5.11)

hold. Here \(|\nu |\) denotes the total variation measure of \(\nu \). Note that \(\nu \) is a Radon measure with respect to the weak topology on \(\mathcal {H}_{+}\), which implies the statement on the inner regularity. It remains to prove that \(\nu \) is a probability measure. Note that since \(\lim _{t\to \infty}P_{t}\mathbf {1}(x)=1\), we have \(\pi (\mathbf {1})=1\) and hence \(\nu (\mathcal {H}_{+})=1\). Moreover, as \(P_{t}f\geq 0\) for all nonnegative \(f\in C_{b}(\mathcal {H}_{+,\mathrm{w}})\) and all \(t\geq 0\), we have \(\lim _{t\to \infty}P_{t}f(x)\geq 0\) for all \(x\in \mathcal {H}_{+}\) and hence \(\int _{ \mathcal {H}_{+}}f(\xi )\nu (\mathrm {d}\xi )\geq 0\) for all nonnegative \(f\in C_{b}(\mathcal {H}_{+,\mathrm{w}})\), which implies that the measure \(\nu \) is also nonnegative and hence a probability measure on \(\mathcal {B}(\mathcal {H}_{+})\). □

In the following, we identify the linear functional \(\pi \) with the measure \(\nu \) given by Lemma 5.7 and write \(\pi \) instead of \(\nu \). Finally, we show that \(\pi \) is indeed the unique invariant measure of \((p_{t}(x,\,\cdot \,))_{t\geq 0}\).

Proposition 5.8

Let \((b,B,m,\mu )\) be an admissible parameter set such that Assumption 3.1 is satisfied and denote the associated subcritical affine Markov process on \(\mathcal {H}_{+}\) by \((X_{t})_{t\geq 0}\) and its transition kernels by \((p_{t}(x,\,\cdot \,))_{t\geq 0}\). Then there exists a unique invariant measure \(\pi \) for \((p_{t}(x,\,\cdot \,))_{t\geq 0}\). Moreover, for every \(x\in \mathcal {H}_{+}\), we have

$$\begin{aligned} \lim _{t\to \infty}\int _{ \mathcal {H}_{+}}f(\xi )p_{t}(x,\mathrm {d}\xi ) = \int _{ \mathcal {H}_{+}}f(\xi )\pi (\mathrm {d}\xi ),\qquad \forall f\in C_{b}(\mathcal {H}_{+, \mathrm{w}}), \end{aligned}$$
(5.12)

and the Laplace transform of \(\pi \) is given by (3.2).

Proof

In Proposition 5.6 and the subsequent arguments, we have already shown the existence of the Borel measure \(\pi \) such that (5.12) holds. It is left to show that \(\pi \) is the unique invariant measure. We have

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle u,\xi \rangle}\bigg(\int _{ \mathcal {H}_{+}}p_{t}(x, \mathrm {d}\xi )\pi (\mathrm {d}x)\bigg)&=\int _{ \mathcal {H}_{+}}\bigg(\int _{ \mathcal {H}_{+}}\mathrm {e}^{- \langle u,\xi \rangle}p_{t}(x,\mathrm {d}\xi )\bigg)\pi (\mathrm {d}x) \\ &= \mathrm {e}^{-\phi (t,u)}\int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle x,\psi (t,u)\rangle} \pi (\mathrm {d}x). \end{aligned}$$

Note that (5.1) gives \(\psi (t+s,u)=\psi (t,\psi (s,u))\) and so for every \(u\in \mathcal {H}_{+}\), we have

$$\begin{aligned} \mathrm {e}^{-\phi (t,u)}\int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle x,\psi (t,u)\rangle} \pi (\mathrm {d}x) &=\mathrm {e}^{-\phi (t,u)}\mathrm {e}^{ -\int _{0}^{\infty}F(\psi (s, \psi (t,u)))\,\mathrm {d}s} \\ &=\mathrm {e}^{-\phi (t,u)}\mathrm {e}^{-\int _{0}^{\infty}F(\psi (t+s,u))\mathrm {d}s} \\ &=\mathrm {e}^{-\phi (t,u)}\mathrm {e}^{-\int _{t}^{\infty}F(\psi (s,u))\mathrm {d}s} \\ &= \mathrm {e}^{-\int _{0}^{\infty}F(\psi (s,u))\mathrm {d}s} \\ &= \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle x,u\rangle}\pi (\mathrm {d}x). \end{aligned}$$

This proves the invariance of \(\pi \). To prove that \(\pi \) is the unique invariant measure, suppose \(\pi '\in \mathcal {M}(\mathcal {H}_{+})\) is invariant for \((p_{t}(x,\,\cdot \,))_{t \geq 0}\). Then for every \(u\in \mathcal {H}_{+}\) and \(t\geq 0\), we have

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle x, u\rangle}\pi '(\mathrm {d}x)&=\int _{ \mathcal {H}_{+}} \mathrm {e}^{-\langle u,\xi \rangle}\bigg(\int _{ \mathcal {H}_{+}}p_{t}(x,\mathrm {d}\xi ) \pi '(\mathrm {d}x)\bigg) \\ &=\int _{ \mathcal {H}_{+}}\mathrm {e}^{-\phi (t,u)-\langle x,\psi (t,u)\rangle}\pi '( \mathrm {d}x). \end{aligned}$$

By letting \(t\to \infty \), we find that

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\mathrm {e}^{-\langle u,x\rangle}\pi '(\mathrm {d}x)=\exp \bigg(- \int _{0}^{\infty}F\big(\psi (s,u)\big)\,\mathrm {d}s\bigg). \end{aligned}$$

As the Laplace transform is measure-determining for measures on \(\mathcal {B}(\mathcal {H}_{+})\), we get \(\pi '=\pi \). □
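
The invariance computation above can be checked numerically in a one-dimensional toy model. The following is a minimal sketch under assumptions that are not taken from the paper: the state space is \(\mathbb{R}_{+}\), there are no state-dependent jumps, \(\psi (t,u)=u\mathrm {e}^{-\beta t}\), and \(F(u)=bu+\lambda u/(\theta +u)\) (constant drift plus compound Poisson jumps with exponentially distributed sizes). All parameter names and values are hypothetical. The script verifies the identity \(\phi (t,u)+\int _{0}^{\infty}F(\psi (s,\psi (t,u)))\,\mathrm {d}s=\int _{0}^{\infty}F(\psi (s,u))\,\mathrm {d}s\) with \(\phi (t,u)=\int _{0}^{t}F(\psi (s,u))\,\mathrm {d}s\), which is the chain of equalities used in the proof.

```python
import math

# Hypothetical one-dimensional toy parameters, chosen only for illustration.
B_DRIFT = 0.5   # constant drift b >= 0
BETA = 1.5      # mean-reversion speed: the linear part is u -> -BETA * u
LAM = 2.0       # intensity of the state-independent jumps
THETA = 3.0     # jump sizes are Exp(THETA)-distributed


def F(u):
    """Toy F(u) = b*u + lam*u/(theta+u): drift plus compound Poisson part."""
    return B_DRIFT * u + LAM * u / (THETA + u)


def psi(t, u):
    """Solution of d/dt psi = -BETA * psi with psi(0, u) = u."""
    return u * math.exp(-BETA * t)


def phi(t, u, n=2000):
    """phi(t, u) = int_0^t F(psi(s, u)) ds by the composite Simpson rule (n even)."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = F(psi(0.0, u)) + F(psi(t, u))
    for k in range(1, n):
        total += (4 if k % 2 else 2) * F(psi(k * h, u))
    return total * h / 3.0


def stationary_exponent(u):
    """Closed form of int_0^infty F(psi(s, u)) ds for this toy model."""
    return B_DRIFT * u / BETA + (LAM / BETA) * math.log(1.0 + u / THETA)


def invariance_gap(t, u):
    """|phi(t,u) + Phi(psi(t,u)) - Phi(u)|, which should vanish up to quadrature error."""
    return abs(phi(t, u) + stationary_exponent(psi(t, u)) - stationary_exponent(u))


if __name__ == "__main__":
    for t in (0.1, 1.0, 5.0):
        for u in (0.2, 1.0, 4.0):
            print(f"t={t}, u={u}, gap={invariance_gap(t, u):.2e}")
```

In this toy model the stationary Laplace exponent is available in closed form, so the only numerical error comes from the Simpson rule used for \(\phi \).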

Remark 5.9

The convergence in (5.12) is the weak convergence of \(p_{t}(x,\,\cdot \,)\) to \(\pi \) in the weak topology on \(\mathcal {H}\) as \(t\to \infty \). Even though the Borel \(\sigma \)-algebras of \(\mathcal {H}\) equipped with the norm topology and with the weak topology coincide, the two notions of weak convergence differ in general. We say that \(p_{t}(x,\,\cdot \,)\to \pi \) as \(t\to \infty \) weakly in the weak topology on \(\mathcal {H}_{+}\) whenever \(P_{t}f(x)\to \int _{ \mathcal {H}_{+}}f(\xi )\pi (\mathrm {d}\xi )\) for all \(f\in C_{b}(\mathcal {H}_{+,\mathrm{w}})\).

If the stronger assumption \(P_{t}f(x)\to \int _{ \mathcal {H}_{+}}f(\xi )\pi (\mathrm {d}\xi )\) for all \(f\in C_{b}(\mathcal {H}_{+})\) holds, we speak of the usual weak convergence, i.e., \(p_{t}(x,\,\cdot \,)\Rightarrow \pi \) as \(t\to \infty \). By Merkle [61, Theorems 1 and 2], we know that weak convergence in the weak topology together with

$$\begin{aligned} \lim _{N\to \infty}\sup _{t\geq 0}p_{t}(x, A_{N})=0 \qquad \text{for all }\epsilon >0, \end{aligned}$$

where \(A_{N}:= \{ \xi \in \mathcal {H}_{+}\colon \sum _{i=N}^{\infty}\langle \xi , e_{i}\rangle ^{2}\geq \epsilon \}\) for \(N\in \mathbb{N}\) and an orthonormal basis \((e_{i})_{i\in \mathbb{N}}\) of \(\mathcal {H}\), implies \(p_{t}(x,\,\cdot \,)\Rightarrow \pi \) as \(t\to \infty \). Note that in our main result in Theorem 3.3, we assert weak convergence in the strong topology, which will be shown below.
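
A standard example shows that the tail condition cannot be dropped. It is given here only as an illustration and is formulated in an infinite-dimensional \(\mathcal {H}\) without the cone constraint. Let \((e_{n})_{n\in \mathbb{N}}\) be an orthonormal basis of \(\mathcal {H}\) and consider the point masses \(\delta _{e_{n}}\). Since \(e_{n}\to 0\) weakly, we have \(f(e_{n})\to f(0)\) for every \(f\in C_{b}(\mathcal {H}_{\mathrm{w}})\), so \(\delta _{e_{n}}\to \delta _{0}\) weakly in the weak topology. For the bounded and norm-continuous function \(f(\xi )=\min (\|\xi \|,1)\), however,

$$\begin{aligned} \int _{\mathcal {H}}f(\xi )\,\delta _{e_{n}}(\mathrm {d}\xi )=1\neq 0=\int _{\mathcal {H}}f(\xi )\,\delta _{0}(\mathrm {d}\xi )\qquad \text{for all } n\in \mathbb{N}, \end{aligned}$$

so \(\delta _{e_{n}}\) does not converge to \(\delta _{0}\) in the usual sense. Consistently, for \(\epsilon \in (0,1]\) we have \(\delta _{e_{n}}(A_{N})=1\) whenever \(n\geq N\), hence \(\sup _{n\in \mathbb{N}}\delta _{e_{n}}(A_{N})=1\) for every \(N\in \mathbb{N}\) and the tail condition fails.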

5.3 Proof of Theorem 3.3

Proposition 5.8 ensures the existence and uniqueness of an invariant measure \(\pi \) of \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) with Laplace transform (3.2). We also proved weak convergence of \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) to \(\pi \) in the weak topology as \(t\to \infty \). It remains to show the convergence rates in the Wasserstein distance of order \(p\) for \(p\in [1,2]\) as in (3.3). Then convergence in the Wasserstein distance of some order \(p\in [1,\infty )\) implies weak convergence (in the strong topology) and convergence of the \(p\)th absolute moment; see Villani [66, Theorem 6.9]. This implies the last assertion of Theorem 3.3. In the remainder, we prove the convergence rate in (3.3) and (3.4).

Let \(p\in [1,2]\) and as before denote by \((q_{t}(x,\,\cdot \,))_{t \geq 0}\) the transition kernels of an affine process associated with the admissible parameter set \((0,B,0,\mu )\). Let \(t\geq 0\), \(x\in \mathcal {H}_{+}\) and \(G\in \mathcal {C}(\delta _{x},\pi )\), i.e., \(G\) is a coupling with marginals \(\delta _{x}\) and \(\pi \). Note that

$$\begin{aligned} p_{t}(x,\mathrm {d}y)=\int _{ \mathcal {H}_{+}}p_{t}(z,\mathrm {d}y)\delta _{x}(\mathrm {d}z)=\int _{ \mathcal {H}_{+}\times \mathcal {H}_{+}}p_{t}(z,\mathrm {d}y)\mathbf {1}(z') G(\mathrm {d}z, \mathrm {d}z'), \end{aligned}$$

and by the invariance of \(\pi \), we also have

$$\begin{aligned} \pi (\mathrm {d}y)=\int _{ \mathcal {H}_{+}}p_{t}(z',\mathrm {d}y)\pi (\mathrm {d}z')=\int _{\mathcal {H}_{+}\times \mathcal {H}_{+}}p_{t}(z',\mathrm {d}y)\mathbf {1}(z) G(\mathrm {d}z, \mathrm {d}z'). \end{aligned}$$

Thus by the convexity property in [66, Theorem 4.8] and since \(W_{p}\leq W_{2}\) for \(p\in [1,2]\), we have

$$\begin{aligned} W_{p}\big(p_{t}(x,\,\cdot \,),\pi \big)&=W_{p}\bigg(\int _{ \mathcal {H}_{+}}p_{t}(z, \,\cdot \,)\delta _{x}(\mathrm {d}z),\int _{ \mathcal {H}_{+}}p_{t}(y,\,\cdot \,)\pi ( \mathrm {d}y)\bigg) \\ &\leq \bigg(\int _{\mathcal {H}_{+}\times \mathcal {H}_{+}}W_{2}\big(p_{t}(z,\,\cdot \,),p_{t}(y,\,\cdot \,)\big)^{p}G(\mathrm {d}z, \mathrm {d}y) \bigg)^{1/p}.\qquad \end{aligned}$$
(5.13)

By Lemma 5.3, we have \(p_{t}(z,\,\cdot \,)=q_{t}(z,\,\cdot \,)*p_{t}(0,\,\cdot \,)\) for every \(t\geq 0\). Thus for \(H\in \mathcal {C}(q_{t}(z,\,\cdot \,),q_{t}(y,\, \cdot \,))\), we obtain by Lemma B.1 below that

$$\begin{aligned} &W_{2}\big(p_{t}(z,\,\cdot \,),p_{t}(y,\,\cdot \,)\big)^{p} \\ &= W_{2}\big(q_{t}(z,\,\cdot \,)*p_{t}(0,\,\cdot \,),q_{t}(y,\,\cdot \,)*p_{t}(0,\,\cdot \,)\big)^{p} \\ &\leq W_{2}\big(q_{t}(z,\,\cdot \,),q_{t}(y,\,\cdot \,)\big)^{p} \\ &\leq \bigg(\int _{\mathcal {H}_{+}\times \mathcal {H}_{+}}\|\tilde{x}-\tilde{y}\|^{2}H( \mathrm {d}\tilde{x}, \mathrm {d}\tilde{y})\bigg)^{\frac{p}{2}} \\ &\leq \bigg(2\int _{\mathcal {H}_{+}\times \mathcal {H}_{+}} (\|\tilde{x}\|^{2}+ \|\tilde{y}\|^{2} )H(\mathrm {d}\tilde{x}, \mathrm {d}\tilde{y})\bigg)^{ \frac{p}{2}} \\ &= \bigg(2\int _{\mathcal {H}_{+}}\|\tilde{x}\|^{2}q_{t}(z, \mathrm {d}\tilde{x}) + 2\int _{\mathcal {H}_{+}}\|\tilde{y}\|^{2}q_{t}(y, \mathrm {d}\tilde{y})\bigg)^{\frac{p}{2}}. \end{aligned}$$
(5.14)

Now recall from (5.5) that

$$\begin{aligned} \int _{\mathcal {H}_{+}}\|\tilde{x}\|^{2}q_{t}(z, \mathrm {d}\tilde{x}) \leq 2\|\mathrm {e}^{t\hat{B}^{*}}z\|^{2} + \int _{0}^{t}\int _{ \mathcal {H}_{+}\setminus \{0\}} \|\mathrm {e}^{s\hat{B}^{*}}\xi\|^{2}\bigg\langle z,\mathrm {e}^{(t-s) \hat{B}}\frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}s, \end{aligned}$$

while all the other terms vanish as \(\hat{b}=0\) and \(m = 0\). By the same estimations as in (5.9), we conclude that

$$\begin{aligned} \int _{\mathcal {H}_{+}}\|\tilde{x}\|^{2}q_{t}(z, \mathrm {d}\tilde{x})&\leq 2\|\mathrm {e}^{t\hat{B}^{*}}z\|^{2} + \int _{0}^{t}\int _{ \mathcal {H}_{+}\setminus \{0\}} \|\mathrm {e}^{s\hat{B}^{*}}\xi\|^{2}\bigg\langle z,\mathrm {e}^{(t-s) \hat{B}}\frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}s\\ &\leq 2 M^{2}\mathrm {e}^{-2\delta t}\|z\|^{2}+\frac{M^{3}}{\delta} \|\mu (\mathcal {H}_{+}\setminus \{0\})\|\mathrm {e}^{-\delta t}\|z\|. \end{aligned}$$

Inserting this back into (5.14) and using the subadditivity of \(x\mapsto x^{p/2}\) for \(p\in [1,2]\), we obtain

$$\begin{aligned} W_{2}\big(p_{t}(z,\,\cdot \,),p_{t}(y,\,\cdot \,)\big)^{p}&\leq (C_{1} \mathrm {e}^{-\delta t}\|z\| )^{p}+ (C_{2}\mathrm {e}^{-\delta t/2}\|z\|^{1/2} )^{p} \\ & \hphantom{=:} + (C_{1}\mathrm {e}^{-\delta t}\|y\| )^{p}+ (C_{2}\mathrm {e}^{-\delta t/2}\|y\|^{1/2} )^{p} \end{aligned}$$
(5.15)

for \(C_{1}=2M\) and \(C_{2}=2^{1/2}M^{3/2}\delta ^{-1/2}\|\mu (\mathcal {H}_{+}\setminus \{0\})\|^{1/2}\). Now plugging (5.15) back into (5.13) and again by the subadditivity of \(x\mapsto x^{1/p}\), we obtain the desired estimates in (3.3) and (3.4).
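
The passage from (5.14) to the four-term bound with the constants \(C_{1}\) and \(C_{2}\) can be sanity-checked numerically. The sketch below is illustrative only: the values of \(M\), \(\delta \), \(\|\mu (\mathcal {H}_{+}\setminus \{0\})\|\) and of the norms \(\|z\|\), \(\|y\|\) are hypothetical inputs, and the function names are not from the paper. It inserts the second-moment estimate into the last line of (5.14) and compares the result with the four-term bound, where the factor \(\mathrm {e}^{-\delta t/2}\) arises from taking the square root of \(\mathrm {e}^{-\delta t}\).

```python
import math


def second_moment_bound(t, z_norm, M, delta, mu_norm):
    """Estimate 2 M^2 e^{-2 delta t} |z|^2 + (M^3/delta) |mu| e^{-delta t} |z|."""
    return (2.0 * M**2 * math.exp(-2.0 * delta * t) * z_norm**2
            + (M**3 / delta) * mu_norm * math.exp(-delta * t) * z_norm)


def constants(M, delta, mu_norm):
    """C_1 = 2M and C_2 = 2^{1/2} M^{3/2} delta^{-1/2} |mu|^{1/2}."""
    return 2.0 * M, math.sqrt(2.0 * M**3 * mu_norm / delta)


def unsplit_bound(t, z_norm, y_norm, p, M, delta, mu_norm):
    """(2*bound(z) + 2*bound(y))^{p/2}: last line of (5.14) after inserting the estimate."""
    s = (2.0 * second_moment_bound(t, z_norm, M, delta, mu_norm)
         + 2.0 * second_moment_bound(t, y_norm, M, delta, mu_norm))
    return s ** (p / 2.0)


def split_bound(t, z_norm, y_norm, p, M, delta, mu_norm):
    """Four-term bound obtained from the subadditivity of x -> x^{p/2}, p in [1, 2]."""
    c1, c2 = constants(M, delta, mu_norm)
    total = 0.0
    for r in (z_norm, y_norm):
        total += (c1 * math.exp(-delta * t) * r) ** p
        total += (c2 * math.exp(-delta * t / 2.0) * math.sqrt(r)) ** p
    return total


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    M, delta, mu_norm = 1.2, 0.8, 0.5
    for p in (1.0, 1.5, 2.0):
        for t in (0.0, 1.0, 5.0):
            lhs = unsplit_bound(t, 2.0, 0.3, p, M, delta, mu_norm)
            rhs = split_bound(t, 2.0, 0.3, p, M, delta, mu_norm)
            print(f"p={p}, t={t}: {lhs:.6f} <= {rhs:.6f}")
```

For \(p=2\) the two quantities coincide, since no subadditivity is needed in that case; for \(p<2\) the split bound is the larger one.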

5.4 Proof of Proposition 3.5

For every \(x\in \mathcal {H}_{+}\), let \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) be the transition kernels associated to the admissible parameter set \((b,B,m,\mu )\) by Theorem 2.2. Moreover, let \(\pi \) be the unique invariant measure of \((p_{t}(x,\,\cdot \,))_{t\geq 0}\) (which is independent of \(x\in \mathcal {H}_{+}\)). From part (i) in Theorem 3.3, we know that \(\pi \) is inner regular. Thus from Proposition A.5 below, we obtain the existence of a unique Markov process \((X^{\pi}_{t})_{t\geq 0}\) such that for all \(f\in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\), we have \(\mathbb{E}_{\pi}\left [f(X_{t})\right ]=\int _{ \mathcal {H}_{+}}P_{t}f(x)\pi (\mathrm {d}x)\). Moreover, since \(\pi \) is the invariant measure, we have for each \(t \geq 0\) that

$$\begin{aligned} \int _{ \mathcal {H}_{+}}P_{t}f(x)\pi (\mathrm {d}x)=\int _{ \mathcal {H}_{+}}\bigg(\int _{ \mathcal {H}_{+}}f(\xi ) p_{t}(x,\mathrm {d}\xi )\bigg)\pi (\mathrm {d}x) =\int _{ \mathcal {H}_{+}}f(\xi ) \pi (\mathrm {d}\xi ), \end{aligned}$$

which implies that for all \(t\geq 0\), the random variable \(X_{t}^{\pi}\) has the distribution \(\pi \).

5.5 Proof of Proposition 3.6

Let us denote the space of all Hilbert–Schmidt operators on \(\mathcal {H}\) by \(\mathcal {L}_{2}(\mathcal {H})\) and note that \((e_{i}\otimes e_{j})_{i,j\in \mathbb{N}}\) is an orthonormal basis of \(\mathcal {L}_{2}(\mathcal {H})\). For every \(y\in \mathcal {H}\), the operator \(y\otimes y\colon \mathcal {H}\to \mathcal {H}\) defined by \((y\otimes y) (x)=\langle x,y\rangle y\) for every \(x\in \mathcal {H}\) is a Hilbert–Schmidt operator on \(\mathcal {H}\) and we can write \(y\otimes y= \sum _{i,j=1}^{\infty}\langle y, e_{i} \rangle \langle y, e_{j}\rangle ( e_{i}\otimes e_{j})\). Note that by (5.11), we have \(\int _{ \mathcal {H}_{+}}\rho (\xi )\pi (\mathrm {d}\xi )=\|\pi\|_{\mathcal {L}(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}}),\mathbb{R})}< \infty \) and hence the absolute second moment of \(\pi \) is finite. This implies that

$$\begin{aligned} \int _{ \mathcal {H}_{+}}\|y\otimes y\|_{\mathcal {L}_{2}(\mathcal {H})}\pi (\mathrm {d}y)\leq \int _{ \mathcal {H}_{+}}\mathrm {Tr}(y\otimes y)\pi (\mathrm {d}y)\leq \int _{ \mathcal {H}_{+}}\|y\|^{2} \pi (\mathrm {d}y)< \infty , \end{aligned}$$

and hence the integral \(\int _{ \mathcal {H}_{+}}(y\otimes y)\pi (\mathrm {d}y)\) is well defined in the Bochner sense. Thus it remains to compute the first two moments of the invariant measure.

Note that for every \(u\in \mathcal {H}\), the linear functional \(\langle u,\,\cdot \,\rangle \colon \mathcal {H}\to \mathbb{R}\) satisfies the following two properties:

i) for every \(R>0\), we have \(\langle u,\,\cdot \,\rangle \in C_{b}(K_{\mathrm{w}}^{R})\), where the set

$$\begin{aligned} K^{R}_{\mathrm{w}}:= \{ x\in \mathcal {H}_{+}\colon \|x\|^{2}+1\leq R \} \end{aligned}$$

is compact in \(\mathcal {H}_{+}\) equipped with the weak topology;

ii) \(\lim _{R\to \infty}\sup _{x\in \mathcal {H}_{+}\setminus K^{R}_{\mathrm{w}}}| \langle u,x\rangle |(1+\|x\|^{2})^{-1}=0\).

By Dörsek and Teichmann [24, Theorem 2.7], this implies that \(\langle u,\,\cdot \,\rangle \in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) for all \(u\in \mathcal {H}\). By Proposition 5.6, we have \(P_{t}f\to \pi (f)\) as \(t\to \infty \) for all \(f\in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) and hence also \(P_{t}\langle u, \,\cdot \,\rangle \to \int _{ \mathcal {H}_{+}}\langle u,\xi \rangle \pi (\mathrm {d}\xi )\) as \(t\to \infty \). Let \((e_{i})_{i\in \mathbb{N}}\) be an orthonormal basis of \(\mathcal {H}\). Then by (2.10) with \(u=e_{i}\), \(i\in \mathbb{N}\), we have

$$\begin{aligned} \lim _{t\to \infty}P_{t}\langle e_{i}, \, \cdot \,\rangle (x) &=\lim _{t \to \infty}\bigg(\int _{0}^{t}\langle \hat{b},\mathrm {e}^{s\hat{B}}e_{i} \rangle \,\mathrm {d}s+\langle x, \mathrm {e}^{t\hat{B}}e_{i}\rangle \bigg)=\int _{0}^{ \infty}\langle \hat{b},\mathrm {e}^{s\hat{B}}e_{i}\rangle \,\mathrm {d}s, \end{aligned}$$

and since \(\xi =\sum _{i=1}^{\infty}\langle \xi , e_{i}\rangle \, e_{i}\), it follows that

$$\begin{aligned} \lim _{t\to \infty}\int _{ \mathcal {H}_{+}}\xi p_{t}(x,\mathrm {d}\xi ) &= \lim _{t \to \infty}\sum _{i=1}^{\infty}\int _{ \mathcal {H}_{+}}\langle \xi , e_{i} \rangle \, e_{i}p_{t}(x,\mathrm {d}\xi ) \\ &=\sum _{i=1}^{\infty }\bigg(\int _{0}^{\infty}\langle \hat{b},\mathrm {e}^{s \hat{B}}e_{i}\rangle \mathrm {d}s\bigg) e_{i} \\ &=\int _{0}^{\infty} \mathrm {e}^{s\hat{B}^{*}}\hat{b}\,\mathrm {d}s, \end{aligned}$$

where we have used

$$ \lim _{t \to \infty}\sum _{i=1}^{\infty}\bigg(\int _{0}^{t}\langle \hat{b},\mathrm {e}^{s\hat{B}}e_{i}\rangle \,\mathrm {d}s\bigg) e_{i} = \sum _{i=1}^{ \infty}\bigg(\int _{0}^{\infty}\langle \hat{b},\mathrm {e}^{s\hat{B}}e_{i} \rangle \,\mathrm {d}s\bigg) e_{i} $$

which is justified if \(\lim _{N \to \infty}\sup _{t \geq 0} \sum _{i=N}^{\infty} \| \int _{0}^{t} \langle \mathrm {e}^{s \hat{B}^{*}}\hat{b}, e_{i} \rangle e_{i} \,\mathrm {d}s\| = 0\). The latter follows from \(\sup _{t \geq 0} \sum _{i=N}^{\infty} \| \int _{0}^{t} \langle \mathrm {e}^{s \hat{B}^{*}}\hat{b}, e_{i} \rangle e_{i} \,\mathrm {d}s\| \leq \int _{0}^{ \infty} \sum _{i=1}^{\infty }| \langle \mathrm {e}^{s \hat{B}^{*}} \hat{b}, e_{i} \rangle | \,\mathrm {d}s\) and

$$ \int _{0}^{\infty} \sum _{i=1}^{\infty} | \langle \mathrm {e}^{s \hat{B}^{*}} \hat{b}, e_{i} \rangle | \,\mathrm {d}s\leq \int _{0}^{\infty} \| \mathrm {e}^{s \hat{B}^{*}}\hat{b} \|\,\mathrm {d}s\leq M \| \hat{b}\| \delta ^{-1} < \infty . $$

Recalling that \(\hat{b}=b+\int _{\mathcal {H}_{+}\cap \{\|\xi\|>1\}}\xi m(\mathrm {d}\xi )\) yields (3.5).
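
The first-moment formula can be illustrated in a finite-dimensional surrogate. The sketch below makes assumptions that go beyond the text: \(\mathcal {H}\) is replaced by \(\mathbb{R}^{2}\), a hypothetical stable matrix `A` plays the role of \(\hat{B}^{*}\), and a hypothetical vector `b` plays the role of \(\hat{b}\). The mean \(m(t)=\int _{0}^{t}\mathrm {e}^{sA}b\,\mathrm {d}s+\mathrm {e}^{tA}x\) solves \(m'(t)=b+Am(t)\) with \(m(0)=x\), and its limit \(\int _{0}^{\infty}\mathrm {e}^{sA}b\,\mathrm {d}s=-A^{-1}b\) does not depend on \(x\).

```python
def mat_vec(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mean_rhs(A, b, m):
    """Right-hand side of the mean equation m'(t) = b + A m(t)."""
    return [bi + ami for bi, ami in zip(b, mat_vec(A, m))]


def mean_at(A, b, x, T, steps=4000):
    """Classical fourth-order Runge-Kutta scheme for m' = b + A m, m(0) = x."""
    h = T / steps
    m = list(x)
    for _ in range(steps):
        k1 = mean_rhs(A, b, m)
        k2 = mean_rhs(A, b, [mi + 0.5 * h * ki for mi, ki in zip(m, k1)])
        k3 = mean_rhs(A, b, [mi + 0.5 * h * ki for mi, ki in zip(m, k2)])
        k4 = mean_rhs(A, b, [mi + h * ki for mi, ki in zip(m, k3)])
        m = [mi + h / 6.0 * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
             for mi, s1, s2, s3, s4 in zip(m, k1, k2, k3, k4)]
    return m


def stationary_mean_2x2(A, b):
    """-A^{-1} b for a 2x2 matrix via Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    return [-(a22 * b[0] - a12 * b[1]) / det,
            -(-a21 * b[0] + a11 * b[1]) / det]


if __name__ == "__main__":
    # Hypothetical stable matrix (both eigenvalues negative) and drift vector.
    A = [[-1.0, 0.3], [0.2, -2.0]]
    b = [0.5, 1.0]
    print(mean_at(A, b, [2.0, 0.1], T=40.0))
    print(mean_at(A, b, [0.0, 5.0], T=40.0))
    print(stationary_mean_2x2(A, b))
```

Both starting points lead to the same long-time mean, which agrees with \(-A^{-1}b\) up to numerical error.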

Next we prove the desired formula for the second moments of \(\pi \). For \(i,j\in \mathbb{N}\), we set \(g^{i,j}:=\langle \,\cdot \,, e_{i}\rangle \langle \,\cdot \,,e_{j} \rangle \). From (2.11) and arguments analogous to those used in Lemma 5.5 (to show that the integrals on the right-hand side of (5.16) below exist and are finite), we find that

$$\begin{aligned} & \lim _{t\to \infty}P_{t}g^{i,j}(x) \\ &=\bigg( \int _{0}^{\infty }\langle \hat{b},\mathrm {e}^{s \hat{B}} e_{i} \rangle \,\mathrm {d}s\bigg)\bigg( \int _{0}^{\infty }\langle \hat{b},\mathrm {e}^{s \hat{B}} e_{j}\rangle \,\mathrm {d}s\bigg) \\ & \hphantom{=:} + \int _{0}^{\infty}\int _{ \mathcal {H}_{+}\setminus \{0\}}\langle \xi , \mathrm {e}^{s \hat{B}} e_{i} \rangle \langle \xi , \mathrm {e}^{s \hat{B}} e_{j}\rangle m(\mathrm {d}\xi )\,\mathrm {d}s \\ & \hphantom{=:} + \int _{0}^{\infty }\int _{0}^{s} \bigg\langle \hat{b},\mathrm {e}^{(s-u) \hat{B}}\int _{ \mathcal {H}_{+}\setminus \{0\}} \langle \xi ,\mathrm {e}^{u \hat{B}}e_{i}\rangle \langle \xi ,\mathrm {e}^{u \hat{B}}e_{j}\rangle \frac{\mu (\mathrm {d}\xi )}{\|\xi\|^{2}}\bigg\rangle \,\mathrm {d}u\,\mathrm {d}s \end{aligned}$$
(5.16)

holds for all \(i,j\in \mathbb{N}\). The second moment formula (3.6) then follows from this and \(y\otimes y= \sum _{i,j=1}^{\infty}\langle y, e_{i} \rangle \langle y, e_{j}\rangle (e_{i}\otimes e_{j})\) once we have shown that

$$\begin{aligned} \lim _{t\to \infty}P_{t}g^{i,j}(x) = \int _{ \mathcal {H}_{+}} g^{i,j}(y)\pi ( \mathrm {d}y), \qquad i,j \in \mathbb{N}. \end{aligned}$$
(5.17)

Since the function \(g^{i,j}\) does not belong to \(\mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\), we cannot obtain (5.17) directly from Proposition 5.6. However, since we have \(P_{t}\langle \,\cdot \,,e_{i}\rangle \langle \,\cdot \,,e_{j} \rangle (x) \leq P_{t}\rho (x)<\infty \) for all \(t\geq 0\) and \(x\in \mathcal {H}_{+}\), we see that the function is in the larger space \(B_{\rho}(\mathcal {H}_{+,\mathrm{w}})\) and we deduce the assertion by an additional approximation argument. Namely, define \(g_{n}^{i,j} :=g^{i,j} \wedge n\) for \(n\in \mathbb{N}\). Then \(g_{n}^{i,j}\in \mathcal {B}_{\rho }(\mathcal {H}_{+,\mathrm {w}})\) and we find that

$$\begin{aligned} \bigg|P_{t} g^{i,j}(x) - \int _{ \mathcal {H}_{+}}g^{i,j}(y)\pi (\mathrm {d}y)\bigg| & \leq |P_{t} g^{i,j}(x) - P_{t} g_{n}^{i,j}(x)| \\ & \hphantom{=:} + \bigg|P_{t} g_{n}^{i,j}(x) - \int _{ \mathcal {H}_{+}}g^{i,j}_{n}(y)\pi (\mathrm {d}y) \bigg| \\ & \hphantom{=:} + \bigg| \int _{ \mathcal {H}_{+}}g^{i,j}_{n}(y)\pi (\mathrm {d}y) - \int _{ \mathcal {H}_{+}}g^{i,j}(y) \pi (\mathrm {d}y) \bigg|. \end{aligned}$$

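Let \(\varepsilon > 0\). Take \(n \in \mathbb{N}\) large enough so that \(| \int _{ \mathcal {H}_{+}}(g^{i,j}_{n}(y) - g^{i,j}(y))\pi (\mathrm {d}y) | < \varepsilon \); this is possible by dominated convergence, because \(|g^{i,j}_{n}-g^{i,j}|\leq \rho \) and \(\rho \) is \(\pi \)-integrable by (5.11). Next note that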

$$ \lim _{n \to \infty}\sup _{t \geq 0}|P_{t} g^{i,j}(x) - P_{t} g_{n}^{i,j}(x)| \leq \lim _{n \to \infty}\sup _{t \geq 0}\mathbb{E} [ \| X_{t}\|^{2} \mathbf {1}_{\{ \| X_{t}\|^{2} > n\}} ] = 0, $$

where the equality follows from the characterisation of convergence in the Wasserstein distance (see Villani [66, Sect. 6]). Hence we find \(n\) large enough such that \(|P_{t} g^{i,j}(x) - P_{t} g_{n}^{i,j}(x)| < \varepsilon \) holds uniformly in \(t \geq 0\). Finally, for this fixed choice of \(n\), we may use Proposition 5.6 to choose \(t\) large enough so that we have

$$ \bigg|P_{t} g_{n}^{i,j}(x) - \int _{ \mathcal {H}_{+}}g_{n}^{i,j}(y)\pi (\mathrm {d}y) \bigg| < \varepsilon . $$

Combining all these estimates proves (5.17). This completes the proof of Proposition 3.6.

6 Conclusion and outlook

In this article, we studied the long-time behaviour of affine processes on \(\mathcal {H}_{+}\). In particular, we proved the existence and uniqueness of an invariant measure, constructed the corresponding stationary affine process and provided explicit formulas for the first and second moment of the invariant measure. Moreover, we proved ergodicity of the affine processes and established explicit and dimension-free convergence rates for the convergence in the Wasserstein distance of order \(p\in [1,2]\) of the transition kernels to the invariant measure. From a theoretical viewpoint, this article provides the first systematic study of the long-time behaviour of affine processes in a Hilbert space setting, in particular for affine processes admitting state-dependent jumps. We believe that our techniques, e.g. the use of the generalised Feller framework, can be used effectively to study the long-time behaviour of affine processes in general Hilbert spaces.

From an application point of view, we used affine processes on \(\mathcal {H}_{+}\) to model the instantaneous covariance process in infinite-dimensional stochastic covariance models. We defined Hilbert-valued affine stochastic covariance models in the stationary covariance regime by using the stationary affine process to model the instantaneous covariance. In this context, we defined a geometric affine stochastic covariance model for forward curve dynamics in commodity markets and studied the long-time behaviour of the implied forward volatility for large forward-start dates.