1 Introduction

A Wishart process is a matrix-valued continuous-time stochastic process with a marginal Wishart distribution, i.e., a generalization to multiple dimensions of the chi-squared distribution, or, in the case of non-integer degrees of freedom, of the gamma distribution. While the introduction of Wishart-based Stochastic Volatility models in finance is fully motivated by the need to describe the multidimensional structure of asset variances (see La Bua and Marazzina (2019) and references therein), in this article our goal is to exploit the Wishart process in a multi-asset framework. More precisely, we deal with the Wishart Affine Stochastic Correlation model (WASC), introduced in Da Fonseca et al. (2007) with the purpose of reproducing well-known multi-asset stylized facts in a tractable way. The WASC model makes use of the Wishart process to describe the stochastic variance-covariance matrix of asset returns.

In our analysis we focus on the so-called Wishart processes introduced in Bru (1991) as a matrix generalization of square-root processes. A remarkable feature is that the analytical tractability is fully preserved since these processes belong to the class of affine processes. Given the strict connection with the well-known CIR processes, Wishart processes have been used to define multi-factor (Da Fonseca et al. 2008; La Bua and Marazzina 2019) and multi-asset (Da Fonseca et al. 2007) extensions of the classic Heston model, which is one of the best-known and most widely used models in finance; see, for example, Goudenege et al. (2019) and Yolcu-Okur et al. (2018).

Despite the analytical tractability (La Bua and Marazzina 2019), the implementation of Wishart-based models poses non-trivial challenges from a numerical point of view. In this article we extend the results presented in La Bua and Marazzina (2019) for the single-asset Wishart Multidimensional Stochastic Volatility model (WMSV) to the WASC one. More precisely, we deal with the model calibration problem, presenting an innovative and efficient methodology to calibrate WASC parameters to market data. The algorithm exploits the close link existing between the Heston model (Heston 1993) and marginal WASC dynamics. Considering the single-asset case, in La Bua and Marazzina (2019) we show that, for an appropriate choice of parameters, both the Heston (Heston 1993) and the Bi-Heston (Christoffersen et al. 2009) models may provide a reliable approximation of WMSV. In this article, for the multi-asset case, we extend this approximating technique, making use of the distributional law of the diagonal elements of the Wishart process to connect the WASC calibration problem to the Heston one. We provide the analytical form of the gradient of the calibration objective function with respect to the Wishart-based parameters, allowing for a further reduction in the computational burden.

Additionally, we propose model approximations that permit us to introduce two numerical schemes for Monte Carlo simulations. It is well known that a standard discretization (e.g. the Euler scheme) is not feasible in the Wishart-based framework, since we also need to take into account the evolution of the off-diagonal elements of the Wishart matrix \(\Sigma (t)\) to determine the dependence structure and satisfy the positive semi-definiteness constraint for the Wishart process. Therefore, as a first algorithm, we propose an adapted version of the scalar full truncated Euler scheme. Secondly, we extend the Gaussian variable approximation scheme presented in La Bua and Marazzina (2019) to the WASC model. Extensive numerical results comparing the two simulation schemes are provided.

The article is organized as follows. In Sect. 2 we present the Wishart process and its basic properties, while in Sect. 3 we deal with the WASC model. Finally the calibration procedure is described in Sect. 4, while the Monte Carlo simulation schemes are presented in Sect. 5. In both cases, numerical results are presented.

2 Definition of Wishart process and basic properties

In this section, we introduce the Wishart process. We refer to La Bua and Marazzina (2019) for details.

Definition 1

(Wishart process) Let W(t) be a \(d \times d\) Brownian motion (i.e. a matrix of \(d \times d\) independent scalar Brownian motions) and \({\mathcal {S}}^{+}_{d}({\mathbb {R}})\) the set of real \(d \times d\) positive semidefinite matrices. We define the Wishart process as the solution on \({\mathcal {S}}^{+}_{d}({\mathbb {R}})\) of the following stochastic differential equation (SDE):

$$\begin{aligned} d\Sigma (t)&= (\Omega \Omega ^{\top } + M \Sigma (t) + \Sigma (t) M^{\top }) dt + \sqrt{\Sigma (t)}\,dW(t)\,Q + Q^{\top }\,dW^{\top }(t)\,\sqrt{\Sigma (t)}, \nonumber \\ \Sigma (0)&= \Sigma _{0} \in {\mathcal {S}}^{+}_{d}({\mathbb {R}}) \end{aligned}$$
(1)

with \(\Omega \), Q, \(M \in {\mathcal {M}}_{d}({\mathbb {R}})\) (the set of real \(d \times d\) square matrices).

As in La Bua and Marazzina (2019), in order to embed mean-reversion and stationarity, we consider matrix M to have only eigenvalues with negative real part. Moreover, we relate the deterministic part of the drift in (1), \(\Omega \Omega ^{\top }\), to the expected long-term value of the process, \(\Sigma _{\infty }\), by the equation

$$\begin{aligned} -\Omega \Omega ^{\top } = M \Sigma _{\infty } + \Sigma _{\infty } M^{\top }. \end{aligned}$$
(2)
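Relation (2) is a continuous Lyapunov equation in \(\Sigma _{\infty }\). As a minimal numerical sketch, it can be solved by vectorising the equation with Kronecker products. The function name `long_term_matrix` and the numerical values of M and \(\Omega \) below are hypothetical and serve only as an illustration.

```python
import numpy as np

def long_term_matrix(M, omega_omega_t):
    """Solve M S + S M^T = -Omega Omega^T for S = Sigma_inf, cf. (2)."""
    d = M.shape[0]
    eye = np.eye(d)
    # Row-major vectorisation: vec(M S) = (M kron I) vec(S),
    # vec(S M^T) = (I kron M) vec(S).
    lhs = np.kron(M, eye) + np.kron(eye, M)
    rhs = -omega_omega_t.reshape(-1)
    return np.linalg.solve(lhs, rhs).reshape(d, d)

# Hypothetical inputs: M has eigenvalues -0.7 and -1.2 (negative real part).
M = np.array([[-0.7, 0.1], [0.0, -1.2]])
Omega = np.array([[0.3, 0.1], [0.0, 0.4]])
drift = Omega @ Omega.T

sigma_inf = long_term_matrix(M, drift)
residual = M @ sigma_inf + sigma_inf @ M.T + drift   # should vanish
```

The linear system is well posed here because all eigenvalues of M have negative real part, so no two of them sum to zero.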

As shown in La Bua and Marazzina (2019), if we set \(d = 1\) in (1), we end up with a scalar CIR process defined by the SDE

$$\begin{aligned} d v(t) = \kappa (\theta - v(t)) dt + \eta \sqrt{v(t)} dw_{v}(t),\,\,\,\,\, v(0) = v_{0}, \end{aligned}$$
(3)

with \(\kappa \), \(\theta \), and \(\eta \) strictly positive parameters, \(v_{0} \ge 0\) and \(w_{v}(t)\) a scalar Brownian motion. Therefore Wishart processes can be considered as a multidimensional extension of the scalar CIR process. Moreover, a Wishart process entails a non-trivial dependence structure among its elements, since we have

$$\begin{aligned} d\left[ \Sigma _{ij}(t),\Sigma _{kl}(t) \right] = \left( \Sigma _{ik}(t) Q^{*}_{jl} + \Sigma _{il}(t) Q^{*}_{jk} + \Sigma _{jk}(t) Q^{*}_{il} + \Sigma _{jl}(t) Q^{*}_{ik}\right) dt, \end{aligned}$$
(4)

where the notation \(\left[ \cdot ,\cdot \right] \) refers as usual to the quadratic covariation of two stochastic processes, \(\Sigma _{ij}(t)\) indicates the element in the i-th row and j-th column of \(\Sigma (t)\), and \(Q^{*} = Q^{\top }Q\).
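To make (4) concrete, the following sketch evaluates the instantaneous covariation rate between two entries of \(\Sigma (t)\). The function name `covariation_rate` and the state matrix are hypothetical; indices are zero-based in the code.

```python
import numpy as np

def covariation_rate(sigma, Q, i, j, k, l):
    """Return d[Sigma_ij, Sigma_kl]/dt as given by (4) (zero-based indices)."""
    q_star = Q.T @ Q
    return (sigma[i, k] * q_star[j, l] + sigma[i, l] * q_star[j, k]
            + sigma[j, k] * q_star[i, l] + sigma[j, l] * q_star[i, k])

sigma = np.array([[0.04, 0.01], [0.01, 0.09]])   # hypothetical state Sigma(t)
Q = np.array([[0.3, 0.3], [0.2, 0.3]])

# Variance rate of Sigma_11: reduces to 4 * Sigma_11 * Q*_11.
var_rate_11 = covariation_rate(sigma, Q, 0, 0, 0, 0)
# Covariation rate of Sigma_11 and Sigma_22: reduces to 4 * Sigma_12 * Q*_12.
cov_rate_11_22 = covariation_rate(sigma, Q, 0, 0, 1, 1)
```

The two special cases in the comments follow by substituting equal indices in (4).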

Existence and uniqueness conditions of the solution of the Wishart process SDE are given by the following results.

Proposition 1

(Proposition 2.1 in La Bua and Marazzina (2019)) Let X(t) be a generic affine process with continuous trajectories defined in \({\mathcal {S}}^{+}_{d}({\mathbb {R}})\) by the following SDE

$$\begin{aligned} X(t)= & {} X(0) + \int ^{t}_{0} \left( D_{X} + {{\mathcal {L}}}\left[ X(s)\right] \right) ds\nonumber \\&+ \int ^{t}_{0} \left( \sqrt{X(s)}\,dW(s)\,C_{X} + C_{X}^{\top }\,dW^{\top }(s)\,\sqrt{X(s)}\right) , \end{aligned}$$
(5)

where X(0), \(D_{X} \in {\mathcal {S}}^{+}_{d}({\mathbb {R}})\), \(C_{X} \in {\mathcal {M}}_{d}({\mathbb {R}})\), \({{\mathcal {L}}}:{\mathcal {S}}^{+}_{d}({\mathbb {R}}) \rightarrow {\mathcal {S}}^{+}_{d}({\mathbb {R}})\) is a linear transformation. Such process admits a unique weak solution in \({\mathcal {S}}^{+}_{d}({\mathbb {R}})\) if

  (a) \(D_{X} - (d - 1)C_{X}^{\top }C_{X} \in {\mathcal {S}}^{+}_{d}({\mathbb {R}})\),

  (b) \(\forall \, P_{1}, P_{2} \in {\mathcal {S}}^{+}_{d}({\mathbb {R}})\,\,\,\text {s.t.}\,\,\,{\text {Tr}}\left[ P_{1}P_{2}\right] = 0 \,\,\Rightarrow \,\, {\text {Tr}}\left[ {{\mathcal {L}}}(P_{1})P_{2}\right] \ge 0 \), where \({\text {Tr}}\left[ \cdot \right] \) is the trace of a square matrix (i.e., the sum of the elements on the main diagonal).

If X(0) is in the set of real positive definite matrices \({\mathcal {S}}^{++}_{d}({\mathbb {R}})\) and condition (a) is replaced by the stronger requirement

  (c) \(D_{X} - (d + 1)C_{X}^{\top }C_{X} \in {\mathcal {S}}^{+}_{d}({\mathbb {R}})\),

then there exists a unique strong solution to (5) in \({\mathcal {S}}^{++}_{d}({\mathbb {R}})\).

As stated in La Bua and Marazzina (2019), we can obtain the Wishart SDE (1) from (5) by setting \(D_{X} = \Omega \Omega ^{\top }\), \(C_{X} = Q\), and \({{\mathcal {L}}}\left[ P_{0}\right] = M P_{0} + P_{0} M^{\top }\). Moreover, if we assume a more restrictive parametrization for the deterministic part of the drift

$$\begin{aligned} \Omega \Omega ^{\top } = \beta Q^{\top } Q, \end{aligned}$$
(6)

conditions (a) and (c) of Proposition 1 are satisfied as soon as

$$\begin{aligned} \beta\ge & {} d - 1, \end{aligned}$$
(7)
$$\begin{aligned} \beta\ge & {} d + 1, \end{aligned}$$
(8)

respectively, where the constraint on the real positive parameter \(\beta \) plays the role of Feller’s condition in the univariate case. Additionally, if condition (a) is not met, the whole process is not well defined. For the rest of the paper we consider a Wishart process defined by (1) and (6), as is usually done in the financial literature. Notice that a significant constraint has thus to be imposed on parameter \(\beta \). In the case \(d = 2\), for example, we must require \(\beta \ge 1\). As shown in La Bua and Marazzina (2019), this condition is not usually met when we perform a direct calibration of Wishart-based pricing models to market prices of plain vanilla options, while it does not seem to be a real limitation, to the best of our knowledge, when a maximum likelihood or a moments estimation is considered (Alfonsi et al. 2016; Boloorforoosh et al. 2020; Da Fonseca et al. 2014; Gourieroux and Sufana 2010). Given our interest in exploiting the model for pricing purposes, we rely on calibration to plain vanilla options, and therefore to the implied volatility surface, analyzing the impact of conditions (7) and (8) in our numerical results. In La Bua and Marazzina (2020) we show how this impact can be smoothed by adding a local volatility component, i.e., by exploiting a stochastic-local volatility hybrid model.
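Conditions (a) and (c) of Proposition 1 can be checked numerically under the parametrization (6). The sketch below does so through the eigenvalues of the relevant matrices; the helper names `is_psd` and `existence_conditions`, the tolerance, and the test values of \(\beta \) are hypothetical choices.

```python
import numpy as np

def is_psd(A, tol=1e-12):
    """Check positive semi-definiteness of a symmetric matrix via its eigenvalues."""
    return bool(np.all(np.linalg.eigvalsh(A) >= -tol))

def existence_conditions(beta, Q):
    """Evaluate conditions (a) and (c) of Proposition 1 with D_X = beta Q^T Q, C_X = Q."""
    d = Q.shape[0]
    q_star = Q.T @ Q
    d_x = beta * q_star
    cond_a = is_psd(d_x - (d - 1) * q_star)   # amounts to (7) when Q^T Q is nonzero
    cond_c = is_psd(d_x - (d + 1) * q_star)   # amounts to (8) when Q^T Q is nonzero
    return cond_a, cond_c

Q = np.array([[0.3, 0.3], [0.2, 0.3]])
weak_only = existence_conditions(1.1, Q)   # d - 1 <= beta < d + 1
both = existence_conditions(3.5, Q)        # beta >= d + 1
neither = existence_conditions(0.5, Q)     # beta < d - 1
```

Note that condition (c) alone does not give a strong solution: the initial state X(0) must also be positive definite, which this sketch does not check.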

2.1 Distribution of Wishart process and related results

In this section we deal with the analogy between Wishart and CIR processes. In fact, as shown in La Bua and Marazzina (2019), the affine nature of the Wishart process implies that its characteristic function is an exponential affine transformation of the initial state, as stated in the following proposition:

Proposition 2

(Proposition 2.2 in La Bua and Marazzina (2019)) Let \(\Lambda \) be a real symmetric \(d \times d\) matrix, \(t\ge 0\) and \(T - t = \tau > 0\). The (conditional) characteristic function of the Wishart process defined by (1) and (6) is

$$\begin{aligned} \phi _{\Sigma }(\Lambda , \tau )= & {} {\mathbb {E}}\left[ \exp \left( \iota {\text {Tr}} \left[ \Lambda \Sigma (T) \right] \right) \vert \,\Sigma (t) \right] \nonumber \\= & {} \exp \left( {\text {Tr}}\left[ A_{\Sigma }(\Lambda , \tau ) \Sigma (t) \right] + b_{\Sigma }(\Lambda ,\tau ) \right) \end{aligned}$$
(9)

where \(\iota \) is the imaginary unit (i.e. \(\iota = \sqrt{-1}\)) and matrix \(A_{\Sigma }(\Lambda , \tau )\) and scalar function \(b_{\Sigma }(\Lambda , \tau )\) are such that

$$\begin{aligned} {\text {Tr}}\left[ A_{\Sigma }(\Lambda , \tau ) \Sigma (t) \right]&= {\text {Tr}}\left[ \iota \Lambda \left( {\mathbb {I}}_{d} - 2 \iota \Theta (\tau ) \Lambda \right) ^{-1} \Gamma (\tau ) \right] ,\\ b_{\Sigma }(\Lambda , \tau )&= -\frac{\beta }{2} {\text {Tr}}\left[ \log \left( \left( {\mathbb {I}}_{d} - 2 \iota \Theta (\tau ) \Lambda \right) \exp \left( \tau M^{\top } \right) \right) - \tau M \right] , \end{aligned}$$

\({\mathbb {I}}_{d}\) being the d-dimensional identity matrix. The additional matrix functions appearing in the above equations are given by

$$\begin{aligned} \Gamma (\tau )&= \exp (\tau M) \Sigma (t) \exp (\tau M^{\top }),\\ \Theta (\tau )&= \int \limits _{0}^{\tau } \exp \left( u M \right) Q^{\top }Q \exp \left( u M^{\top } \right) du. \end{aligned}$$
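The matrices \(\Gamma (\tau )\) and \(\Theta (\tau )\) can be evaluated without numerical quadrature when M is diagonalisable. The following sketch relies on that assumption (it is not part of the original statement) and integrates the eigenbasis representation of the integrand in closed form. The function name `gamma_theta` and the input values are hypothetical.

```python
import numpy as np

def gamma_theta(M, Q, sigma_t, tau):
    """Return Gamma(tau) and Theta(tau), assuming M is diagonalisable."""
    eigval, V = np.linalg.eig(M)
    V_inv = np.linalg.inv(V)
    exp_m = (V * np.exp(tau * eigval)) @ V_inv           # exp(tau * M)
    gamma = exp_m @ sigma_t @ exp_m.T
    # In the eigenbasis the integrand has entries B_kl * exp(u * (d_k + d_l)),
    # so the integral over [0, tau] is available entry by entry.
    B = V_inv @ (Q.T @ Q) @ V_inv.T
    s = eigval[:, None] + eigval[None, :]
    theta = V @ (B * np.expm1(tau * s) / s) @ V.T
    return np.real(gamma), np.real(theta)

M = np.array([[-0.7, 0.2], [0.0, -1.2]])      # hypothetical, eigenvalues -0.7 and -1.2
Q = np.array([[0.3, 0.3], [0.2, 0.3]])
sigma_t = np.array([[0.04, 0.0], [0.0, 0.04]])
gamma, theta = gamma_theta(M, Q, sigma_t, tau=1.0)
```

The division by `s` is safe as long as all eigenvalues of M have negative real part, since then no pairwise sum of eigenvalues vanishes. For a defective M a general-purpose matrix exponential would be needed instead.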

As a consequence of the analytical tractability of Wishart process and of the knowledge of its characteristic function, we are able to present an additional result regarding the distribution of diagonal elements of the Wishart process (a similar result for the trace of the Wishart process has been obtained in La Bua and Marazzina 2019, Corollary 2.3).

Corollary 1

(Distribution of elements on the main diagonal of Wishart process) Let \(\Sigma _{i}(t) = \Sigma _{ii}(t)\) be the i-th element on the main diagonal of \(\Sigma (t)\) and \(F_{\chi ^{2}}(x; \nu , \delta )\) the cumulative distribution function of a non-central chi-square random variable with \(\nu \) degrees of freedom and non-centrality parameter \(\delta \). Then, for a fixed \(T>t\), we have:

$$\begin{aligned} \Pr \left[ \Sigma _{i}(T) \le \upsilon \vert \Sigma (t) \right] = F_{\chi ^{2}}\left( \frac{\upsilon }{\vartheta _{i}}, \beta , \delta _{i}\right) \end{aligned}$$
(10)

where \(\delta _{i} = \gamma _{i}/\vartheta _{i}\) with \(\Gamma (\tau ) = (\gamma _{ij})_{1\le i,j \le d}\) and \(\Theta (\tau ) = (\vartheta _{ij})_{1\le i,j \le d}\). For ease of notation, we also set \(\gamma _{i} = \gamma _{ii}\) and \(\vartheta _{i} = \vartheta _{ii}\).

Proof

We use the fact that \(\exp \left( {\text {Tr}}\left[ \log (G) \right] \right) = {\text {det}}\left[ G \right] \) for any invertible matrix G to write

$$\begin{aligned} \exp \left( b_{\Sigma }(\Lambda ,\tau ) \right)&= {\text {det}}\left[ \left( {\mathbb {I}}_{d} - 2 \iota \Theta (\tau ) \Lambda \right) \exp \left( \tau M^{\top } \right) \right] ^{-\frac{\beta }{2}}\, \exp \left( \frac{\beta }{2} {\text {Tr}}\left[ M \right] \tau \right) \\&= {\text {det}}\left[ {\mathbb {I}}_{d} - 2 \iota \Theta (\tau ) \Lambda \right] ^{-\frac{\beta }{2}}, \end{aligned}$$

and (9) becomes

$$\begin{aligned} \phi _{\Sigma }(\Lambda , \tau ) = {\text {det}}\left[ {\mathbb {I}}_{d} - 2 \iota \Theta (\tau ) \Lambda \right] ^{-\frac{\beta }{2}}\, \exp \left( {\text {Tr}}\left[ \iota \Lambda \left( {\mathbb {I}}_{d} - 2 \iota \Theta (\tau ) \Lambda \right) ^{-1} \Gamma (\tau ) \right] \right) . \qquad \end{aligned}$$
(11)

Let \(\lambda \) be a real variable and \(e_{i}^{d} = \left( \mathbb {1}_{k = \ell = i } \right) _{1\le k,\ell \le d}\), then by setting \(\Lambda _{i} = \lambda e_{i}^{d}\), the characteristic function of \(\Sigma _{i}(T)\) is

$$\begin{aligned} \phi _{\Sigma _{i}}(\lambda ,\tau ) = \phi _{\Sigma }\left( \Lambda _{i},\tau \right) = \left( 1-2\iota \lambda \vartheta _{i} \right) ^{-\frac{\beta }{2}} \,\exp \left( \frac{\iota \lambda \gamma _{i}}{1-2\iota \lambda \vartheta _{i}} \right) \end{aligned}$$
(12)

from which (10) follows by the definition of non-central chi-square distribution. \(\square \)
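Corollary 1 suggests a direct way to sample \(\Sigma _{i}(T)\): scale a non-central chi-square variate by \(\vartheta _{i}\). The sketch below does this with NumPy's generator and compares the empirical characteristic function with (12). The numerical values of \(\beta \), \(\gamma _{i}\) and \(\vartheta _{i}\), the sample size and the seed are hypothetical.

```python
import numpy as np

def diagonal_cf(lam, beta, gamma_i, vartheta_i):
    """Characteristic function (12) of Sigma_i(T) conditional on Sigma(t)."""
    denom = 1.0 - 2j * lam * vartheta_i
    return denom ** (-beta / 2.0) * np.exp(1j * lam * gamma_i / denom)

def sample_diagonal(beta, gamma_i, vartheta_i, size, rng):
    """Draw Sigma_i(T) as vartheta_i times a non-central chi-square variate, cf. (10)."""
    return vartheta_i * rng.noncentral_chisquare(beta, gamma_i / vartheta_i, size=size)

beta, gamma_i, vartheta_i = 1.1, 0.0099, 0.07   # hypothetical values
rng = np.random.default_rng(0)
draws = sample_diagonal(beta, gamma_i, vartheta_i, 200_000, rng)

lam = 3.0
empirical_cf = np.mean(np.exp(1j * lam * draws))
exact_cf = diagonal_cf(lam, beta, gamma_i, vartheta_i)
expected_mean = beta * vartheta_i + gamma_i     # mean of the scaled variate
```

The complex power in `diagonal_cf` uses the principal branch, which is adequate here because the real part of \(1-2\iota \lambda \vartheta _{i}\) is always equal to one.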

An alternative way to obtain (10) is to recognize that (11) is the characteristic function associated to the non-central Wishart distribution with scale matrix \(\Theta (\tau )\) and non-centrality matrix \(\Gamma (\tau )\) and apply results in Kourouklis and Moschopoulos (1985). For the sake of completeness, we point out that an analogous claim is shown in Da Fonseca and Grasselli (2011). Our formulation, however, gives a direct interpretation of parameters involved in the distribution of \(\Sigma _{i}(T)\) in terms of matrices describing the Wishart process. As we will see, this turns out to be particularly useful for computational purposes.

An important consequence of (10) is that we can define an exact mapping between \(\Sigma _{i}(T)\) and a CIR process:

Proposition 3

(CIR process mapping \(\Sigma _{i}(T)\)) Let v(t) be a CIR process defined by (3). For a fixed \(T>t\), it holds that (conditionally on v(t) and \(\Sigma (t)\) respectively) v(T) and \(\Sigma _{i}(T)\) share the same distribution provided that

$$\begin{aligned}&v(t) = \Sigma _{i}(t), \end{aligned}$$
(13)
$$\begin{aligned}&\kappa = -\frac{1}{\tau } \log \left( \frac{\gamma _{i}}{v(t)} \right) , \end{aligned}$$
(14)
$$\begin{aligned}&\eta = 2 \sqrt{\frac{\vartheta _{i} \kappa }{\left( 1 - e^{-\kappa \tau } \right) }}, \end{aligned}$$
(15)
$$\begin{aligned}&\theta = \frac{\beta \eta ^{2}}{4 \kappa }, \end{aligned}$$
(16)

where \(\tau = T - t\) is the time horizon, and \(\gamma _{i}\) and \(\vartheta _{i}\) have been introduced in Corollary 1.

Proof

The correspondence of the distributions relies on the properties of the CIR process. We refer to Cox et al. (1985) for details. \(\square \)
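As an illustration of Proposition 3, the sketch below maps \(\gamma _{i}\), \(\vartheta _{i}\) and \(\beta \) to CIR parameters. It takes the time argument in (14) and (15) to be the horizon \(\tau = T - t\), which is an assumption of this sketch, and the function name `cir_parameters` is hypothetical. The inputs are generated from a scalar case with \(M_{ii} = -0.7\) and \(Q^{*}_{ii} = 0.13\), for which the mapping can be checked by hand.

```python
import math

def cir_parameters(sigma_ii, gamma_i, vartheta_i, beta, tau):
    """Map (gamma_i, vartheta_i, beta) to CIR parameters via (13)-(16)."""
    kappa = -math.log(gamma_i / sigma_ii) / tau                               # (14)
    eta = 2.0 * math.sqrt(vartheta_i * kappa / (1.0 - math.exp(-kappa * tau)))  # (15)
    theta = beta * eta ** 2 / (4.0 * kappa)                                   # (16)
    return kappa, eta, theta

# Hypothetical inputs built from M_ii = -0.7, Q*_ii = 0.13, Sigma_ii(t) = 0.04.
tau, beta, sigma_ii = 1.0, 1.1, 0.04
gamma_i = sigma_ii * math.exp(-1.4 * tau)
vartheta_i = 0.13 * (1.0 - math.exp(-1.4 * tau)) / 1.4

kappa, eta, theta = cir_parameters(sigma_ii, gamma_i, vartheta_i, beta, tau)
```

The mapping requires \(0< \gamma _{i} < \Sigma _{i}(t)\), otherwise the logarithm in (14) does not yield a positive mean-reversion speed.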

3 The Wishart affine stochastic correlation model

With the purpose of reproducing well-known multi-asset stylized facts in a tractable way, in Da Fonseca et al. (2007) the authors introduce the Wishart Affine Stochastic Correlation model (WASC), which makes use of the Wishart process to describe the stochastic variance-covariance matrix of asset returns. The model proposes the following joint dynamics for a vector of forward asset prices:

$$\begin{aligned} d{\mathbf {f}}(t) = {\text {diag}}\left[ {\mathbf {f}}(t)\right] \sqrt{\Sigma (t)}\,d{\mathbf {b}}(t), \,\,\,\,\,\, {\mathbf {f}}(0) \in {\mathbb {R}}_{+}^{d}, \end{aligned}$$
(17)

where \({\text {diag}}\left[ \cdot \right] \) is the operator that transforms a d-dimensional column vector into a \(d \times d\) diagonal matrix. In (17) \({\mathbf {b}}(t)\) is a d-vector Brownian motion such that

$$\begin{aligned} {\mathbf {b}}(t) = \sqrt{1-{\mathbf {r}}^{\top }{\mathbf {r}}}\,\, {\mathbf {z}}(t) + W(t)\, {\mathbf {r}} \end{aligned}$$
(18)

with \({\mathbf {z}}(t)\) another d-vector Brownian motion independent of W(t) and \({\mathbf {r}}\in [-1,1]^{d}\) such that \({\mathbf {r}}^{\top }{\mathbf {r}} \le 1\). Here \({\mathbf {r}}\) can be interpreted as the vector of coefficients meant to drive the linear correlation between the shocks to asset returns and the shocks to the variance-covariance matrix \(\Sigma (t)\). The choice of the correlation structure (18) represents the major improvement with respect to the model in Gourieroux and Sufana (2004) and aims at accommodating realistic single-asset volatility skews while still preserving the affinity of the model. Remarkably, the resulting WASC dynamics (17) allows for stochastic correlation among asset returns in a tractable framework where each asset is enriched with a stochastic volatility behavior consistent with the effects observed on plain vanilla markets. Let the i-th forward asset price at time t be denoted by \(f_i(t)=f_i(0)e^{y_i(t)},\, t\ge 0\). The peculiarities of the model can be fully appreciated by referring to the individual, or scalar, dynamics of asset returns \(y_{i}\):

$$\begin{aligned} d y_{i}(t)= & {} -\frac{1}{2} \sum \limits _{j=1}^{d} s_{ij}^{2}(t) dt + \sum \limits _{j=1}^{d} s_{ij}(t) db_{j} (t) \nonumber \\= & {} -\frac{1}{2} \Sigma _{i}(t) dt + \sum \limits _{j=1}^{d} s_{ij}(t) db_{j} (t),\,\,\,\,\,\, i = 1,...,d \end{aligned}$$
(19)

where \({\widehat{S}}(t) = {\widehat{S}}(t)^{\top } = \left( s_{ij}\right) _{1\le i,j \le d}\) is the unique positive semi-definite square root of \(\Sigma (t)\). We also use the notation \(\Sigma _{i}(t) = \Sigma _{ii}(t)\) to denote the i-th diagonal element of Wishart process. By straightforward computations, from (19) we can compute the quadratic covariation of two given assets:

$$\begin{aligned} d\left[ y_{k}(t),y_{\ell }(t)\right] = d\left[ \sum \limits _{j=1}^{d} s_{kj}(t) db_{j} (t),\sum \limits _{j=1}^{d} s_{\ell j}(t) db_{j} (t)\right] = \Sigma _{k\ell }(t) dt \end{aligned}$$
(20)

that highlights the role of Wishart process, used to describe the stochastic evolution of the asset returns variance covariance matrix. Furthermore, we can explicitly define the cross-asset correlation matrix \(C_{{\mathbf {y}}}(t)\) as

$$\begin{aligned} C_{{\mathbf {y}}}(t) = \left( \rho _{ij}(t) \right) _{1\le i,j \le d} = \left( \frac{\Sigma _{ij}(t)}{\sqrt{\Sigma _{i}(t) \Sigma _{j}(t)}} \right) _{1\le i,j \le d}. \end{aligned}$$
(21)

By exploiting the properties of Wishart process, it can be shown that \(C_{{\mathbf {y}}}(t)\) is a well-defined correlation matrix (i.e. \(C_{{\mathbf {y}}}(t)\) is positive semi-definite and each \(\rho _{ij}(t) \in [-1,1]\)) as soon as condition (7) is satisfied and provided that \(\Sigma _{ij}(t) \ne 0\) for \(i,j = 1,...,d\). Indeed, if \(\beta \ge d - 1\), \(\Sigma (t)\) is positive semi-definite and \(C_{{\mathbf {y}}}(t)\) admits the decomposition

$$\begin{aligned} C_{{\mathbf {y}}}(t) = D^{-1} \Sigma (t) D^{-1} = D^{-1} {\widehat{S}}(t) {\widehat{S}}(t)D^{-1} = LL^{\top }, \end{aligned}$$

where \(D = \sqrt{{\text {diag}}\left[ \Sigma (t) \right] }\). To show that each element \(\rho _{ij}(t)\) is bounded in \([-1,1]\) we use the following theorem that applies for Wishart distributed matrices:

Theorem 1

(Theorem 2.4.2 in Kollo and von Rosen (2006)) Let \(X_{\mathcal {W}} \in {\mathcal {S}}_{d}^{+}({\mathbb {R}}) \sim {\mathcal {W}}_{d}(\beta , \Theta , \Gamma )\), i.e. \(X_{\mathcal {W}}\) is a \(d \times d\) symmetric matrix that follows a non-central Wishart distribution with degrees of freedom \(\beta \), scale \(\Theta \) and non-centrality matrix \(\Gamma \). Then for a \(n \times d\) matrix B, we have that \(Y = B X_{\mathcal {W}} B^{\top } \sim {\mathcal {W}}_{n}(\beta , B\Theta B^{\top }, B \Gamma B^{\top })\).

If we set \(X_{\mathcal {W}} = \Sigma (t)\) and \(B = \left[ {\mathbf {e}}_{i}^{d}, {\mathbf {e}}_{j}^{d} \right] ^{\top }\) for some admissible i and j (with \({\mathbf {e}}_{i}^{d}\) the i-th element of the standard basis of \({\mathbb {R}}^{d}\)), the previous theorem shows that the resulting matrix \( {B} \Sigma (t) {B}^{\top } = \left[ \begin{array}{cc} \Sigma _{i}(t)&{} \Sigma _{ij}(t) \\ \Sigma _{ij}(t)&{}\Sigma _{j}(t)\end{array}\right] \) is a well-defined \(2 \times 2\) Wishart process. It is therefore positive semi-definite, so its determinant is non-negative, that is, \(\Sigma _{ij}^{2}(t) \le \Sigma _{i}(t) \Sigma _{j}(t)\). This proves that each element \(\rho _{ij}(t)\) defined in (21) is bounded in \([-1,1]\).
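The normalisation (21) and the resulting bounds can be illustrated numerically. In the sketch below the function name `cross_asset_correlation` and the matrix S used to build a positive semi-definite state are hypothetical.

```python
import numpy as np

def cross_asset_correlation(sigma):
    """Return C_y = D^{-1} Sigma D^{-1} with D = sqrt(diag(Sigma)), cf. (21)."""
    scale = np.sqrt(np.diag(sigma))
    return sigma / np.outer(scale, scale)

# Hypothetical state: Sigma = S S^T is positive semi-definite by construction.
S = np.array([[0.20, 0.00, 0.00],
              [0.09, 0.28, 0.00],
              [-0.05, 0.08, 0.23]])
sigma = S @ S.T

corr = cross_asset_correlation(sigma)
min_eigenvalue = np.linalg.eigvalsh(corr).min()   # non-negative up to rounding
max_abs_entry = np.abs(corr).max()                # at most one up to rounding
```

The division requires strictly positive diagonal elements of \(\Sigma (t)\).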

By combining (19) and (20) with Proposition 3, and fixing a time horizon T, we can represent the (T-specific) WASC dynamics as the following 2d system of scalar SDEs (for \(i = 1,...,d\))

$$\begin{aligned} dy_{i}(t)&= -\frac{1}{2} \Sigma _{i}(t) dt + \sqrt{\Sigma _{i}(t)}dw^{y}_{i}(t), \end{aligned}$$
(22)
$$\begin{aligned} d\Sigma _{i}(t)&= \kappa _{i} (\theta _{i} - \Sigma _{i}(t))dt + \eta _{i} \sqrt{\Sigma _{i}(t)}dw^{\Sigma }_{i}(t), \end{aligned}$$
(23)

where the parameters \(\kappa _{i}\), \(\eta _{i}\) and \(\theta _{i}\) are given, respectively, in (14), (15) and (16). Here the correlation structure among Brownian motions \({\mathbf {w}}= \left[ w^{\Sigma }_{1},w^{\Sigma }_{2},...,w^{\Sigma }_{d},w^{y}_{1},w^{y}_{2},...,w^{y}_{d} \right] ^{\top }\) is described by means of the stochastic block matrix

$$\begin{aligned} C(t) = \left[ \begin{array}{ c@{\quad }c} C_{\Sigma }(t) &{}\quad C_{\Sigma {\mathbf {y}}}^{\top }(t) \\ C_{\Sigma {\mathbf {y}}}(t) &{}\quad C_{{\mathbf {y}}}(t) \end{array}\right] , \end{aligned}$$
(24)

where the submatrices, other than \(C_{{\mathbf {y}}}\) already introduced in (21), will be described in the following. Interestingly, from (22) and (23) we have that, for any \(T>0\), the scalar dynamics of each asset is consistent with a standard Heston model driven by the i-th diagonal element of \(\Sigma (t)\). This assures that the behaviour induced by WASC in terms of reconstructed implied volatility surfaces is in line with the documented findings of traditional one factor stochastic volatility models. Consistently with Heston model, the asset specific returns-volatility correlation is constant: following Da Fonseca et al. (2007), we have

$$\begin{aligned} {\text {Corr}}_{t}\left[ dy_{i}(t), d\Sigma _{i}(t) \right] = \rho _{i} dt = \frac{{\text {Tr}}\left[ Q R_{i} \right] }{\sqrt{Q^{*}_{ii}}}dt, \end{aligned}$$
(25)

where \(R_{i}\) is the matrix with \({\mathbf {r}}\) on the i-th row and zero elsewhere. We can even generalize the previous result by explicitly deriving the correlation between the i-th log-asset and a generic diagonal element of \(\Sigma (t)\): we, indeed, have that (as shown in Da Fonseca et al. (2007)) the covariation between asset returns and volatility terms is given by

$$\begin{aligned} d \left[ y_{i}(t), \Sigma _{j}(t)\right] = 2 \Sigma _{ij}(t) {\text {Tr}}\left[ R_{j}Q\right] dt. \end{aligned}$$
(26)

By combining (26) and (20) and exploiting (4), we get the generic element of matrix \(C_{\Sigma {\mathbf {y}}}(t)\) as

$$\begin{aligned} {\text {Corr}}_{t}\left[ dy_{i}(t), d\Sigma _{j}(t) \right]= & {} \frac{2 \Sigma _{ij}(t) {\text {Tr}}\left[ R_{j}Q\right] }{\sqrt{\Sigma _{i}(t)} \sqrt{4 \Sigma _{j}(t) Q^{*}_{jj}}}dt \nonumber \\= & {} \frac{{\text {Tr}}\left[ R_{j}Q\right] }{\sqrt{Q^{*}_{jj}}}\frac{\Sigma _{ij}(t)}{\sqrt{\Sigma _{i}(t)}\sqrt{\Sigma _{j}(t)}}dt = \rho _{j} \rho _{ij}(t)dt. \end{aligned}$$
(27)

In the last equality of (27), we introduce a new representation of such correlations that can be seen, quite fascinatingly, as the product of the (constant) proper, or scalar, j-th asset-volatility correlation and the cross-asset correlation between i-th and j-th assets. This result highlights the peculiar dependence structure inherent in the WASC model and could help in gaining more insights on parameters impact on correlation surfaces in the spirit of the study carried out in Da Fonseca et al. (2007).

Further, using (4), it follows that the elements of \(C_{\Sigma }(t)\) have the form

$$\begin{aligned} {\text {Corr}}_{t}\left[ d\Sigma _{i}(t), d\Sigma _{j}(t) \right] = \frac{4 \Sigma _{ij}(t) Q^{*}_{ij} }{4 \sqrt{\Sigma _{i}(t) \Sigma _{j}(t)}\sqrt{ Q^{*}_{ii} Q^{*}_{jj}}} dt = \frac{Q^{*}_{ij}}{\sqrt{Q^{*}_{ii} Q^{*}_{jj}}} \rho _{ij}(t) dt. \qquad \end{aligned}$$
(28)

From (21), (27) and (28), we have that the stochastic evolution of (24) is fully described by processes \(\rho _{ij}(t)\). Let us consider, for example, the case \(d = 2\): matrix C(t) then reads as

$$\begin{aligned} C(t) = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1 &{}\quad q_{12}\rho _{12}(t) &{}\quad \rho _{1} &{}\quad \rho _{1} \rho _{12}(t) \\ q_{12}\rho _{12}(t) &{}\quad 1 &{}\quad \rho _{2} \rho _{12}(t) &{}\quad \rho _{2} \\ \rho _{1} &{}\quad \rho _{2} \rho _{12}(t) &{}\quad 1 &{}\quad \rho _{12}(t) \\ \rho _{1} \rho _{12}(t) &{}\quad \rho _{2} &{}\quad \rho _{12}(t) &{}\quad 1 \end{array}\right] \end{aligned}$$

where \(q_{ij} = \frac{Q^{*}_{ij}}{\sqrt{Q^{*}_{ii}Q^{*}_{jj}}}\). This representation highlights the peculiar dependence structure induced in the WASC model and could provide useful insights on the role of Q and \({\mathbf {r}}\) in determining the relation among state variables.
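The block matrix (24) can be assembled for any d from (21), (25), (27) and (28). The following sketch uses the identity \({\text {Tr}}\left[ R_{i}Q\right] = (Q^{\top }{\mathbf {r}})_{i}\), which follows from the definition of \(R_{i}\). The function name `wasc_correlation_matrix` and the off-diagonal value of the state matrix are hypothetical.

```python
import numpy as np

def wasc_correlation_matrix(sigma, Q, r):
    """Assemble C(t) in (24) for the ordering [w^Sigma_1..d, w^y_1..d]."""
    q_star = Q.T @ Q
    q_scale = np.sqrt(np.diag(q_star))
    s_scale = np.sqrt(np.diag(sigma))
    rho_cross = sigma / np.outer(s_scale, s_scale)   # (21): rho_ij(t)
    rho_lev = (Q.T @ r) / q_scale                    # (25): rho_i
    q_norm = q_star / np.outer(q_scale, q_scale)     # q_ij
    c_sigma = q_norm * rho_cross                     # (28)
    # (27): entry (i, j) is rho_j * rho_ij(t), rows y_i and columns Sigma_j.
    c_sigma_y = rho_cross * rho_lev[None, :]
    return np.block([[c_sigma, c_sigma_y.T], [c_sigma_y, rho_cross]])

Q = np.array([[0.3, 0.3], [0.2, 0.3]])
r = np.array([-0.6, -0.1])
sigma = np.array([[0.04, 0.02], [0.02, 0.04]])   # hypothetical, rho_12 = 0.5
C = wasc_correlation_matrix(sigma, Q, r)
```

For \(d = 2\) the output reproduces the \(4 \times 4\) layout displayed above, with \(\rho _{1}\) and \(\rho _{2}\) constant and every other off-diagonal entry driven by \(\rho _{12}(t)\).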

3.1 A restricted version of the model

In this section we consider a restricted, more intuitive, specification of WASC model: we assume matrix M to be diagonal and with negative entries. This setting leads to a very interesting dynamics for the diagonal elements of Wishart process. From Proposition 3, if M is diagonal, direct computation gives

$$\begin{aligned} \kappa _{i} = -2 M_{ii},\ \ \ \ \ \eta _{i} = 2 \sqrt{Q^{*}_{ii}}, \end{aligned}$$

for all \(T > 0\). An immediate consequence is that the asset instantaneous variances are now described by time-independent CIR processes, in the sense that the parameters involved are no longer function of the time horizon considered.
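In the restricted model the Heston parameters of each asset follow directly from M, Q, \({\mathbf {r}}\) and \(\beta \) through the expressions above together with (16) and (25). A minimal sketch, with a hypothetical function name `heston_parameters` and the values of the parameter set \(\varvec{\pi }_{WA}\) used later in this section, is:

```python
import numpy as np

def heston_parameters(m_diag, Q, r, beta):
    """Per-asset Heston parameters of the restricted WASC model (diagonal M)."""
    q_star_diag = np.sum(Q * Q, axis=0)          # diagonal of Q^T Q
    kappa = -2.0 * m_diag
    eta = 2.0 * np.sqrt(q_star_diag)
    theta = beta * eta ** 2 / (4.0 * kappa)      # (16)
    rho = (Q.T @ r) / np.sqrt(q_star_diag)       # (25), Tr[R_i Q] = (Q^T r)_i
    return kappa, eta, theta, rho

m_diag = np.array([-0.7, -1.2])
Q = np.array([[0.3, 0.3], [0.2, 0.3]])
r = np.array([-0.6, -0.1])
beta = 1.1

kappa, eta, theta, rho = heston_parameters(m_diag, Q, r, beta)
feller_ratio = 2.0 * kappa * theta / eta ** 2    # equals beta / 2 by (16)
```

By (16) the scalar Feller ratio \(2\kappa _{i}\theta _{i}/\eta _{i}^{2}\) equals \(\beta /2\) for every asset, so the scalar Feller condition holds only if \(\beta \ge 2\).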

In our opinion the resulting parametrization turns out to be the most genuine multi-asset extension of the Heston model: each asset is exactly described by a single instance of the Heston dynamics while the joint behaviour is enriched by cross-asset and cross-variance stochastic correlation, all wrapped in an affine framework. As far as we know, there are no alternative settings that can reach a comparable degree of flexibility. The exact Heston representation of asset dynamics also helps in understanding the role and the impact of WASC parameters, which in the general formulation appear somewhat unclear. In particular, it is worthwhile to point out that the pricing of single-asset European claims is only affected by the corresponding diagonal element of \(\Sigma _{0}\). To see this, it suffices to notice that for \(\varvec{\lambda } = \lambda _{i}{\mathbf {e}}_{i}^{d}\), the matrix \(A_{{\mathbf {y}}}(\tau )\) in (31) has a non-null i-th diagonal element and zeros elsewhere. This peculiarity is in line with the asymptotic analysis provided in Da Fonseca and Grasselli (2011) where the i-th implied volatility approximation for short time to maturity is found to be

$$\begin{aligned} \sigma _{imp,i}^{2} = \Sigma _{ii} + (r_{1} Q_{1i} + r_{2} Q_{2i}) m_f + \frac{1}{2} \frac{4 \left( Q_{1i}^{2} + Q_{2i}^{2}\right) - 7 (r_{1} Q_{1i} + r_{2} Q_{2i}) ^{2}}{6 \Sigma _{ii}} m_f^{2}, \end{aligned}$$

with \(m_f = \log \left( \frac{K}{f_{i}(0)} \right) \) denoting the log-forward moneyness. Consequently, the off-diagonal entries of \(\Sigma _{0}\) can be used to match multi-asset stylized facts without compromising the shape of individual volatility surfaces. This represents an additional degree of freedom that in the general WASC model we would not have. A possible calibration strategy could be to set the off-diagonal entries of \(\Sigma _{0}\) in order to match a predefined initial cross-asset correlation matrix. Alternatively, provided that liquid multi-asset derivatives are traded, we could try to fit the market evidence on implied correlation. In order to further develop this point, we now study the impact of \(\Sigma _{12}\) on the price of Best-Of put options, whose payoff is

$$\begin{aligned} \left( K - {\text {max}}\left[ S_{1}(T),S_{2}(T) \right] \right) ^{+}, \end{aligned}$$
(29)

\(S_i(t)\) being the asset price at time \(t,\,t\ge 0\). Let \(\varvec{\pi }_{WA}\) be the following set of WASC parameters:

$$\begin{aligned} \beta&= 1.1, \,\,\Sigma _{0} = \begin{bmatrix} 0.04 &{}\quad 0 \\ 0 &{}\quad 0.04 \end{bmatrix},\,\,M = \begin{bmatrix} -0.7 &{}\quad 0 \\ 0 &{}\quad -1.2 \end{bmatrix},\,\,Q = \begin{bmatrix} 0.3 &{}\quad 0.3\\ 0.2 &{}\quad 0.3 \end{bmatrix},\\ {\mathbf {r}}&= \begin{bmatrix} -0.6 \\ -0.1 \end{bmatrix}, \end{aligned}$$

which are meant to describe a realistic market scenario and allow for a well-defined Wishart process. Figure 1 shows the implied correlation profiles corresponding to different values of \(\Sigma _{12}\), where the implied correlation is defined to be the value of parameter \(\rho \) such that the WASC price equals the one obtained in a two-asset Black-Scholes setting, i.e.

$$\begin{aligned} P_{Best-Of}^{BS}(\sigma _{imp,1}^{WA}(K,T), \sigma ^{WA}_{imp,2}(K,T), \rho , K, T) = P_{Best-Of}^{WA}(\varvec{\pi }_{WA},K,T).\quad \end{aligned}$$
(30)

Here \(\sigma _{imp,i}^{WA}(K,T)\) is the Black-Scholes implied volatility corresponding to the option written on the i-th asset with strike K and maturity T whose price is computed with the WASC model. By exploiting the affinity of the model, Best-Of put options are priced by numerically computing a bi-dimensional inverse Fourier transform. We refer the reader to Da Fonseca et al. (2007) where this pricing methodology is developed for Best-Of contracts. From the numerical results, it is evident that the off-diagonal Wishart element plays a significant role in modelling the implied correlation skew: we observe an increase in implied correlation levels for higher values of \(\Sigma _{12}\). This, in turn, induces an increase in option prices, consistently with the fact that Best-Of put options are long correlation products that benefit from lower dispersion of asset returns.

Fig. 1

Best-Of put options implied correlation skew for different values of \(\Sigma _{12}\). Other parameters used: \(f_{1}(0) = f_{2}(0) = 100\), \(T = 1\) and \(r = 0\%\)

3.2 WASC characteristic function

The chosen correlation structure (18) assures the affinity of WASC model. This means, once more, that we can express the (joint) characteristic function of the asset returns vector \({\mathbf {y}}(T)\) as an exponential affine transformation of state variables \({\mathbf {y}}(t)\) and \(\Sigma (t)\) as recalled in the following Proposition:

Proposition 4

(Joint characteristic function of log-prices in WASC) Let the log-forward prices vector \({\mathbf {y}}(t)\) be described by (19) and \(\varvec{\lambda }\) be an auxiliary vector-valued variable \(\varvec{\lambda } = \left[ \lambda _{1},...,\lambda _{d} \right] ^{\top }\). Then for \(T>t\) and \(\tau = T - t\), the WASC (conditional) characteristic function of \({\mathbf {y}}(T)\) admits the following closed-form representation

$$\begin{aligned} \phi ^{WA}_{{\mathbf {y}}}(\varvec{\lambda },\tau )= & {} {\mathbb {E}}\left[ \exp \left( \iota \left\langle \varvec{\lambda }, {\mathbf {y}}(T) \right\rangle \right) \vert {\mathbf {y}}(t) \right] \nonumber \\= & {} \exp \left( \iota \left\langle \varvec{\lambda }, {\mathbf {y}}(t) \right\rangle + {\text {Tr}}\left[ A_{{\mathbf {y}}}(\tau )\Sigma (t)\right] + b_{{\mathbf {y}}}(\tau )\right) , \end{aligned}$$
(31)

with the deterministic matrix \(A_{{\mathbf {y}}}(\tau )\) and the scalar function \(b_{{\mathbf {y}}}(\tau )\) given by

$$\begin{aligned} A_{{\mathbf {y}}}(\tau )&= A_{22}(\tau )^{-1}\,A_{21}(\tau ),\\ b_{{\mathbf {y}}}(\tau )&= -\frac{\beta }{2} {\text {Tr}}\left[ \log (A_{22}(\tau )) + \tau (M + \iota Q^{\top }{\mathbf {r}}\varvec{\lambda }^{\top })\right] , \end{aligned}$$

and

$$\begin{aligned} \begin{bmatrix} A_{11}(\tau ) &{}\quad A_{12}(\tau ) \\ A_{21}(\tau ) &{}\quad A_{22}(\tau ) \end{bmatrix} = \exp \left( \tau \begin{bmatrix} M+\iota Q^{\top }{\mathbf {r}}\varvec{\lambda }^{\top } &{}\quad -2Q^{\top }Q \\ -\frac{1}{2}\left( \varvec{\lambda }\varvec{\lambda }^{\top } + \iota {\text {diag}}\left[ \varvec{\lambda } \right] \right) &{}\quad -(M+\iota Q^{\top }{\mathbf {r}}\varvec{\lambda }^{\top })^{\top } \end{bmatrix}\right) . \end{aligned}$$

Proof

See Da Fonseca et al. (2007). \(\square \)
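
To make the structure of (31) concrete, the following is a minimal NumPy sketch of how the characteristic function could be evaluated. It is an illustration under stated assumptions rather than the implementation used for the results in this article: the function names `expm` and `wasc_cf` are hypothetical, the matrix exponential is a plain scaling-and-squaring Taylor scheme, and \({\text {Tr}}\left[ \log A_{22}(\tau )\right] \) is computed as the sum of the principal logarithms of the eigenvalues of \(A_{22}(\tau )\), which is assumed to stay on the correct branch for the moderate values of \(\tau \) and \(\varvec{\lambda }\) used here. The parameters are those of Sect. 3.1.

```python
import numpy as np

def expm(A, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    A = np.asarray(A, dtype=complex)
    norm = np.linalg.norm(A, 1)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A = A / 2.0**s                       # scaled matrix has 1-norm <= 1/2
    E = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        term = term @ A / k
        E = E + term
    for _ in range(s):                   # undo the scaling by repeated squaring
        E = E @ E
    return E

def wasc_cf(lam, tau, y, Sigma, beta, M, Q, r):
    """Evaluate the exponential-affine formula (31) for a (possibly complex) vector lam."""
    lam = np.asarray(lam, dtype=complex)
    d = len(lam)
    G = M + 1j * (Q.T @ np.outer(r, lam))            # M + i Q^T r lam^T
    H = np.block([
        [G, -2.0 * Q.T @ Q],
        [-0.5 * (np.outer(lam, lam) + 1j * np.diag(lam)), -G.T],
    ])
    E = expm(tau * H)
    A21, A22 = E[d:, :d], E[d:, d:]
    A = np.linalg.solve(A22, A21)                    # A22^{-1} A21
    # Assumption: principal logs of the eigenvalues give the right branch of Tr[log A22]
    tr_log = np.sum(np.log(np.linalg.eigvals(A22)))
    b = -0.5 * beta * (tr_log + tau * np.trace(G))
    return np.exp(1j * lam @ y + np.trace(A @ Sigma) + b)

# Parameters of Sect. 3.1 (off-diagonal element of Sigma_0 set to zero)
beta = 1.1
Sigma0 = np.array([[0.04, 0.0], [0.0, 0.04]])
M = np.array([[-0.7, 0.0], [0.0, -1.2]])
Q = np.array([[0.3, 0.3], [0.2, 0.3]])
r = np.array([-0.6, -0.1])
y0 = np.zeros(2)

phi = wasc_cf([1.0, 0.5], 1.0, y0, Sigma0, beta, M, Q, r)
```

Two sanity checks follow directly from the formula: for \(\varvec{\lambda } = {\mathbf {0}}\) the function returns 1, and for \(\varvec{\lambda } = -\iota \,\mathbf{e }_{i}^{d}\) the lower-left block of the exponent vanishes, so that the sketch returns \(\exp (y_{i}(t))\), i.e. the martingale property of the forward price.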

Thanks to Proposition 4, we are able to price both plain vanilla and multi-asset options (if transform-based techniques are applicable; see Footnote 3) in a comprehensive framework. In particular, we price options on the i-th asset as a basket option with degenerate weights vector \(\mathbf{e }_{i}^{d}\), such that \(\varvec{\lambda } = \lambda _{i} \mathbf{e }_{i}^{d}\). Furthermore, the knowledge of the joint characteristic function of the asset returns vector allows us to make use of bounding techniques such as those developed in Caldana et al. (2016) for basket options. Despite the analytical tractability, several numerical issues arise when we try to calibrate the WASC model to market data by exploiting (31). Indeed, not only do we have to evaluate functions of matrix argument for each computation of the characteristic function (as in the WMSV case), but, even worse, we are required to perform d different plain vanilla pricings (one for each asset) for a single parameter set. This is due to the lack of liquid multi-asset derivatives, which forces us to calibrate model parameters to the individual market implied volatility surfaces. As reported in Da Fonseca and Grasselli (2011), such a naive algorithm can take up to 15 minutes in the simplest case \(d=2\). This is clearly not feasible for real market applications. Therefore, in the next section we propose an accurate and fast calibration procedure.

4 A new calibration procedure

In this section we present an innovative and efficient methodology to calibrate WASC parameters that exploits the close link existing between the Heston model and the marginal WASC dynamics. The proposed algorithm is first tested in a simplified framework and then applied to market data. The results obtained also highlight the impact of the parameter \(\beta \) on model accuracy in reproducing market volatility smiles.

Let us consider a WASC parameter set \(\varvec{\pi }_{WA}\) (with cardinality \(N_{WA}\)) and fix a maturity T. For the generic i-th asset described by (19) we can define a function \(g^{H-WA}_{i}\) that maps WASC parameters to those of a scalar Heston dynamics. In other words, we set \(g^{H-WA}_{i}: {\mathbb {R}}^{N_{WA}} \times {\mathbb {R}}_{>0} \rightarrow {\mathbb {R}}^{5}\) such that \(g^{H-WA}_{i}(\varvec{\pi }_{WA},T) = \varvec{\pi }_{i,H} = [v_{0,i},\kappa _{i},\theta _{i},\eta _{i},\rho _{i}]^{\top }\) as defined in (13)–(16) along with the asset-volatility correlation (25). Consequently, for calibration purposes, we can replace the cumbersome WASC characteristic function with the simpler Heston one (see La Bua and Marazzina 2019, Appendix A.1). Furthermore, we can compute analytically the gradient of the objective function with respect to WASC parameters. In Appendix A we show how to compute explicitly the matrix \(J^{H-WA}_{i,T}(\varvec{\pi }_{WA}) = \nabla g^{H-WA}_{i} \in {\mathbb {R}}^{N_{WA} \times 5}\), i.e. the Jacobian matrix of the function \(g^{H-WA}_{i}\), with elements:

$$\begin{aligned} j_{q,r}^{H-WA} = \frac{\partial g^{H-WA}_{i,r}(\varvec{\pi }_{WA}, T)}{\partial \varvec{\pi }_{WA,q}}. \end{aligned}$$
(32)

Then, the Jacobian matrix of \({\tilde{\mathbf {r}}}_{i,T}(\varvec{\pi }_{WA})\) (the residuals vector composed of options with maturity T written on the i-th asset) can be written as

$$\begin{aligned} J_{i,T}^{WA}(\varvec{\pi }_{WA}) = J_{i,T}^{H-WA}(\varvec{\pi }_{WA}) J^{H}_{i}(g^{H-WA}_{i}(\varvec{\pi }_{WA},T)) \end{aligned}$$
(33)

where the second matrix in the right-hand side of (33) is known thanks to Cui et al. (2017) (see also La Bua and Marazzina 2019, Appendix A.2). To obtain the overall Jacobian matrix \(J^{WA}\), we simply need to compute (33) for each maturity and asset taken into account and aggregate the resulting matrices. Notice the relationship between (La Bua and Marazzina 2019, Equation (43)) and Equation (32). Finally, similarly to La Bua and Marazzina (2019), the gradient of the calibration problem objective function is given by \(\nabla f_{obj} = J^{WA} {\tilde{\mathbf {r}}} (\varvec{\pi }_{WA})\).
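
The chain rule in (33) and the resulting gradient can be sketched in a few lines of NumPy. The snippet below is only an illustration: `toy_map` and `toy_residuals` are hypothetical stand-ins for \(g^{H-WA}_{i}\) and for the Heston residuals (they are not the maps of Appendix A or of Cui et al. (2017)), and the objective is assumed to be \(f_{obj} = \frac{1}{2}\Vert {\tilde{\mathbf {r}}}\Vert ^{2}\), which is the form consistent with \(\nabla f_{obj} = J^{WA} {\tilde{\mathbf {r}}}\). A central finite-difference gradient is included as an independent check.

```python
import numpy as np

def objective_gradient(J_map, J_heston, residuals):
    """Gradient of 0.5 * ||residuals||^2 through the chain rule (33).

    J_map    : (N_WA, 5) Jacobian of the WASC-to-Heston map, entries as in (32)
    J_heston : (5, n_opt) Jacobian of the residuals w.r.t. the Heston parameters
    """
    J_wa = J_map @ J_heston              # (N_WA, n_opt), Eq. (33)
    return J_wa @ residuals              # (N_WA,)

# --- hypothetical toy functions, for illustration only ---
B = np.arange(20.0).reshape(4, 5) / 10.0

def toy_map(p):
    """Stand-in for g_i^{H-WA}: 3 'WASC' parameters -> 5 'Heston' parameters."""
    return np.array([p[0]**2, p[0] * p[1], p[1] + p[2], np.sin(p[2]), p[0] - p[2]])

def toy_map_jac(p):
    """Rows: derivative w.r.t. p_q; columns: component r of toy_map."""
    return np.array([
        [2 * p[0], p[1], 0.0, 0.0, 1.0],
        [0.0, p[0], 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, np.cos(p[2]), -1.0],
    ])

def toy_residuals(h):
    """Stand-in for model-minus-market residuals of 4 options."""
    return B @ h - 1.0

def toy_residuals_jac(h):
    """(5, 4) matrix with entries d residual_k / d h_r."""
    return B.T

def finite_difference_gradient(p, eps=1e-6):
    """Central differences of f(p) = 0.5 * ||toy_residuals(toy_map(p))||^2."""
    f = lambda x: 0.5 * np.sum(toy_residuals(toy_map(x))**2)
    grad = np.zeros_like(p)
    for q in range(len(p)):
        e = np.zeros_like(p)
        e[q] = eps
        grad[q] = (f(p + e) - f(p - e)) / (2 * eps)
    return grad

p = np.array([0.5, -0.3, 0.8])
h = toy_map(p)
grad = objective_gradient(toy_map_jac(p), toy_residuals_jac(h), toy_residuals(h))
```

In an actual calibration the blocks \(J_{i,T}^{WA}\) would be computed for every asset and maturity and stacked column-wise before multiplying by the full residual vector.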

The calibration algorithm so defined avoids the computation of the WASC characteristic function and significantly reduces the issues due to the possible presence of multiple minima. Given that we rely on the law identity in Proposition 3 rather than on some approximation, the routine does not require any further step.

4.1 A simplified calibration exercise

The accuracy of the proposed algorithm is illustrated by considering the following numerical experiment: let us suppose that a fictitious two-asset market is perfectly described by the WASC parameters reported in the first column of Table 1. Even if simplified, the data outline realistic market environments: they represent the calibrated parameter set (truncated at the first significant decimal digit) found in Da Fonseca and Grasselli (2011) for the pair of indices EuroStoxx50-DAX. We construct a full implied volatility surface for each asset, assuming options with maturities \(T = \left[ 0.25,\;0.5,\,1,\,3 \right] \) and 41 equally spaced strikes ranging from 0.5 to 1.5 (initial asset values are set for simplicity equal to 1). Each of the resulting surfaces consists of 164 options. The goal is to implement the proposed algorithm in order to find a suitable parameter set that reproduces the supposed market data. We expect the calibrated parameters to be reasonably close to the original ones (accuracy) and to exhibit a limited dependency on the initial guess (robustness). For this test we set the starting values of the optimization routine as shown in the second column of Table 1. The choice is meant to assess the robustness of the algorithm when the initial guess is very far from the optimal set. Indeed, not only is the discrepancy mixed (some values are overestimated, others underestimated), but the distance between the initial guess and the optimal values is substantial: the smallest gap, defined as percentage difference, is equal to \(35.29\%\). The poor initialization of the problem and the high dimensionality of the parameter space make the calibration task more challenging and could potentially lead to suboptimal outcomes. Nevertheless, the proposed algorithm is able to produce results very close to the original values: the norm of the errors between true prices and calibrated ones is \(2.2069 \times 10^{-7}\).
Most remarkably, the procedure takes only 3.56 seconds using Matlab Mex files, on a laptop PC with an Intel Core i7 CPU and 8 GB RAM. By considering parallelization and porting to more efficient languages we can obtain a further speedup. It is also worth noting that in realistic applications the calibration problem is somewhat facilitated by the availability of previous optimal sets that act as efficient guesses. In light of all this evidence, we believe that the proposed methodology represents a highly efficient tool for the calibration of the WASC model. This is particularly true if we intend to increase the number of assets involved, with the subsequent growth of dimensionality.

Table 1 Results for the calibration exercise described in Sect. 4.1

4.2 Calibration to market data

We now want to validate the procedure with realistic market data. Despite the general applicability of the algorithm, we focus our attention on the restricted specification of the model introduced in Sect. 3.1. With this in mind, we select a basket of market quoted instruments composed of 201 European call options written on the EuroStoxx50 index and 182 on the DAX index. The set of derivatives on the DAX is the same set used in La Bua and Marazzina (2019). We further set, for simplicity, interest rates and dividends to zero. Thanks to the efficiency of the new calibration algorithm, we are able to calibrate model parameters in less than 3 seconds. The outputs of the optimization routine are shown in the leftmost column of Table 2. Given that, as illustrated above, the off-diagonal element of \(\Sigma _{0}\) does not impact the pricing of univariate call options, we set its value such that the initial correlation between the two indices equals the one-year historical one (which is found to be 0.9715; see Footnote 4).
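
The last step can be illustrated with a short sketch. It assumes that the initial cross-asset correlation is \(\rho _{12}(0) = \Sigma _{12}(0)/\sqrt{\Sigma _{11}(0)\Sigma _{22}(0)}\), i.e. that \(\Sigma \) plays the role of the instantaneous covariance matrix of returns; the function name is hypothetical, and the diagonal values are the illustrative ones of Sect. 3.1 rather than the calibrated values of Table 2.

```python
import numpy as np

def sigma12_from_correlation(sigma11, sigma22, rho):
    """Off-diagonal element of Sigma_0 implied by a target initial correlation."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return rho * np.sqrt(sigma11 * sigma22)

# Illustrative diagonal values (Sect. 3.1), target correlation from the text
Sigma0 = np.array([[0.04, 0.0], [0.0, 0.04]])
s12 = sigma12_from_correlation(Sigma0[0, 0], Sigma0[1, 1], 0.9715)
Sigma0[0, 1] = Sigma0[1, 0] = s12

# The resulting matrix should remain positive definite
eigenvalues = np.linalg.eigvalsh(Sigma0)
```

Since the diagonal of \(\Sigma _{0}\) is fixed by the plain vanilla calibration, this choice leaves the fit to the single-asset smiles unchanged.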

Table 2 Calibration on February 3, 2016 with the WASC over a full set of EuroStoxx50 and DAX indices European call options

The most interesting result is that \(\beta \) is lower than 1. This is consistent with the evidence in Da Fonseca and Grasselli (2011), where similar results are found. Figure 2 shows the calibrated model implied volatility skews for the two indices with respect to maturities of one month, one year and three years. The model succeeds in reproducing the shape of market volatility surfaces, but the mispricing is not negligible for short-term far-from-the-money options. This is particularly true for the EuroStoxx50 index, as highlighted by the fact that the error in volatility terms is roughly 3 times higher than the error made for the DAX.

Additionally, we can compare the evidence from the calibration of the WASC model against the Wishart Multidimensional Stochastic Volatility model (WMSV, La Bua and Marazzina 2019), as well as the Heston (1993) and Christoffersen et al. (2009) models, calibrated to the same basket of DAX options. Table 3 shows the calibrated initial variance of asset returns along with the Mean Squared Error with respect to both price and implied volatility for the four models. The estimates of the initial variance are in close agreement across all the models. Moreover, regarding the accuracy of the calibration, the two multi-factor models, i.e., the WMSV and the Bi-Heston, tend to perform quite similarly (although errors for WMSV are slightly smaller) and substantially outperform the simpler Heston and WASC dynamics. In particular, by comparing the errors of the WMSV and WASC models, the outperformance of the former is evident. This is not surprising, since we contrast a multi-factor volatility setting (WMSV) with the WASC single-asset dynamics that, as developed in Sect. 3, is equivalent to a one-factor parametrization. It is important to remark, however, that the models are meant to address rather different tasks (i.e. single-asset and multi-asset modelling).

Fig. 2 Calibration results for WASC. Resulting value for parameter \(\beta \) is 0.8577. Comparison with market implied volatility for EuroStoxx50 (left) and DAX (right) indices for selected tenors

Table 3 Comparison of calibration outputs on DAX index
Fig. 3 Simulated trajectories of cross-asset correlation \(\rho _{12}(t)\) for \(t \in \left[ 0, 1\right] \) generated with calibrated parameters obtained imposing \(\beta \ge 1\). Left panel: \(\Sigma _{12} = 0.0654\). Right panel: \(\Sigma _{12} = 0\)

Fig. 4 Simulated trajectories of cross-asset correlation \(\rho _{12}(t)\) for \(t \in \left[ 0, 1\right] \) generated with calibrated parameters obtained imposing \(\beta \ge 3\). Left panel: \(\Sigma _{12} = 0.0643\). Right panel: \(\Sigma _{12} = 0\)

Moving back to the multi-asset calibration in Table 2, some fix is required in order to enforce the existence and uniqueness condition for the matrix process \(\Sigma (t)\). First of all, we tackle the calibration problem imposing \(\beta \ge 1\). Results are exhibited in the second column of Table 2. Even if the loss in accuracy seems to be limited (the error measures are just slightly higher than in the unconstrained setting), a relevant issue arises: in Fig. 3 we report simulated trajectories of the WASC cross-asset correlation obtained with the resulting parameter set. The fact that \(\beta \) satisfies condition (7) effectively ensures that \(\rho _{12}(t)\) lies in the range \(\left[ -1, 1 \right] \). However, the fact that the parameter is just slightly above the threshold (\(\beta = 1\)) makes the boundary of \({\mathcal {S}}_{2}^{+}({\mathbb {R}})\) very likely to be attained. Very often, then, the absolute value of the correlation is stuck at 1. We can also experience sudden changes in correlation from \(+1\) to \(-1\) (or vice versa) in a very short time frame (even on a daily basis). In order to study the dependence of the observed phenomenon on the initial value of the correlation, in the right panel of Fig. 3 we also consider the case \(\Sigma _{12} = 0\), which produces similarly erratic correlation dynamics.

Given the intent to apply the WASC model to describe the joint behaviour of asset prices, this represents a major issue. To tackle the problem, we decide to enforce the positive definiteness condition for \(\Sigma (t)\), given by (8), which in our setting amounts to setting \(\beta \ge 3\). Calibrated parameters are collected in the rightmost column of Table 2. The corresponding cross-asset correlation dynamics is depicted in Fig. 4: the trajectories are now much more meaningful. Further, as a consequence of the fact that \(\Sigma (t)\) is defined on the interior of \({\mathcal {S}}_{2}^{+}({\mathbb {R}})\), \(\rho _{12}(t)\) is bounded in \((-1, +1)\). Nonetheless, the stronger condition enforced has a severe impact on the ability of the model to reproduce single-asset market evidence. Reconstructed volatility skews are shown in Fig. 5. Significant discrepancies now emerge for far-from-the-money options. In particular, at the very short end of the volatility term structure the error with respect to market volatilities can be as high as \(11.87\%\) (in-the-money options on EuroStoxx50) and \(10.80\%\) (out-of-the-money options on DAX). Disappointingly, we face a non-trivial trade-off between plain vanilla pricing accuracy and realistic modelling of cross-asset correlation. A possible solution to mitigate the problem could be to set \(\beta \) equal to some value in the range (1, 3]. This alternative, however, would require coupling the plain vanilla analysis with adequate market evidence on multi-asset derivatives.

Fig. 5 Calibration results for WASC. Resulting value for parameter \(\beta \) is 3.011. Comparison with market implied volatility for EuroStoxx50 (left) and DAX (right) indices for selected tenors

5 Simulation schemes for the WASC

This section is devoted to presenting simulation algorithms specifically devised for the WASC model. As far as we know, indeed, there are no previous attempts in the literature to deal with the discretization of price trajectories (17). In particular, our task is to develop an efficient, yet accurate, scheme to discretize the system of SDEs (22)–(23). It is evident that a standard discretization (e.g. via the Euler scheme) is infeasible, since we also need to take into account the evolution of the off-diagonal elements of \(\Sigma (t)\) to determine the dependence structure and satisfy the positive semi-definiteness constraint for the Wishart process.

As a first algorithm, we implement an adapted version of the scalar full truncated Euler (TE) scheme that reads as

$$\begin{aligned} \widehat{{\mathbf {y}}}(t + \Delta )&= \widehat{{\mathbf {y}}}(t) -\frac{1}{2}{\text {Vec}}\left[ {\widehat{\Sigma }}^{+}(t)\right] \Delta + \sqrt{{\widehat{\Sigma }}^{+}(t)} \left( \sqrt{1-{\mathbf {r}}^{\top }{\mathbf {r}}}\,\, \widetilde{{\mathbf {z}}} + {\widetilde{W}}\, {\mathbf {r}}\right) \sqrt{\Delta }\\ {\widehat{\Sigma }}(t + \Delta )&= {\widehat{\Sigma }}(t) + \left( \beta Q^{\top }Q + M {\widehat{\Sigma }}^{+}(t) + {\widehat{\Sigma }}^{+}(t) M^{\top } \right) \Delta + \sqrt{{\widehat{\Sigma }}^{+}(t)} {\widetilde{W}} Q \sqrt{\Delta } \\&\quad + Q^{\top } {\widetilde{W}}^{\top } \sqrt{{\widehat{\Sigma }}^{+}(t)} \sqrt{\Delta } \end{aligned}$$

with \(\widetilde{{\mathbf {z}}}\) and \({\widetilde{W}}\) being, respectively, a d-dimensional vector and a \(d \times d\) matrix of independent standard Gaussian random variables. Here \({\text {Vec}}\left[ \cdot \right] \) is the operator that extracts the elements on the main diagonal of a square matrix into a column vector, while \({\widehat{\Sigma }}^{+}\) is the positive part of the matrix \({\widehat{\Sigma }}\).
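
A single step of the TE scheme can be sketched as follows. This is a minimal, single-path NumPy illustration and not the code used for the numerical results: the helper names `positive_part` and `te_step` are hypothetical, the positive part and its square root are obtained from a symmetric eigendecomposition, and the log-price update is accumulated onto the previous value \(\widehat{{\mathbf {y}}}(t)\). The parameters are those of Sect. 3.1.

```python
import numpy as np

def positive_part(S):
    """Return (S^+, sqrt(S^+)): negative eigenvalues of the symmetrized S set to zero."""
    w, V = np.linalg.eigh(0.5 * (S + S.T))
    w = np.clip(w, 0.0, None)
    return (V * w) @ V.T, (V * np.sqrt(w)) @ V.T

def te_step(y, Sigma, dt, beta, M, Q, r, rng):
    """One full truncated Euler step for the pair (y, Sigma)."""
    d = len(y)
    S_plus, S_sqrt = positive_part(Sigma)
    z = rng.standard_normal(d)            # independent noise for the log-prices
    W = rng.standard_normal((d, d))       # matrix noise shared with the Wishart process
    dZ = np.sqrt(1.0 - r @ r) * z + W @ r
    y_new = y - 0.5 * np.diag(S_plus) * dt + S_sqrt @ dZ * np.sqrt(dt)
    noise = S_sqrt @ W @ Q                # its transpose is Q^T W^T sqrt(S^+)
    drift = beta * Q.T @ Q + M @ S_plus + S_plus @ M.T
    Sigma_new = Sigma + drift * dt + (noise + noise.T) * np.sqrt(dt)
    return y_new, Sigma_new

# Parameters of Sect. 3.1
beta = 1.1
Sigma0 = np.array([[0.04, 0.0], [0.0, 0.04]])
M = np.array([[-0.7, 0.0], [0.0, -1.2]])
Q = np.array([[0.3, 0.3], [0.2, 0.3]])
r = np.array([-0.6, -0.1])

rng = np.random.default_rng(0)            # fixed seed, for reproducibility only
y, Sigma = np.zeros(2), Sigma0.copy()
n_steps = 50
for _ in range(n_steps):
    y, Sigma = te_step(y, Sigma, 1.0 / n_steps, beta, M, Q, r, rng)
```

Note that the truncation is applied only where \({\widehat{\Sigma }}\) enters the coefficients, so the stored matrix \({\widehat{\Sigma }}(t)\) itself may temporarily leave the cone of positive semi-definite matrices; a production implementation would vectorize the step across paths.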

As a second algorithm, we deal with the Wishart process sampling scheme developed in Ahdida and Alfonsi (2013): that is, we consider as given an entire discretized path of \(\Sigma (t)\) over the time grid \(0 = t_{0}< t_{1}< ... < t_{M_{T}} = T\) with time step \(\Delta \). In this way we are only left with the problem of sampling the log-prices trajectories. The most challenging task here is to embed the correlation structure (24) in the discretization of (22). In standard cases, we would compute the Cholesky decomposition of matrix C(t) such that

$$\begin{aligned}&\left[ w^{\Sigma }_{1},w^{\Sigma }_{2},...,w^{\Sigma }_{d},w^{y}_{1},w^{y}_{2},...,w^{y}_{d} \right] ^{\top } \nonumber \\&\quad = L_{C}(t) \left[ w^{*}_{1},w^{*}_{2},...,w^{*}_{d},w^{*}_{d+1},w^{*}_{d+2},...,w^{*}_{2d} \right] ^{\top }, \end{aligned}$$
(34)

where \(L_{C}(t) = \left( \ell _{i,j}(t) \right) _{1\le i\le 2d, 1 \le j \le i}\) is the lower triangular matrix that satisfies \(C(t) = L_{C}(t) L_{C}^{\top }(t)\) and \(\mathbf {w^{*}}=\left[ w^{*}_{1},w^{*}_{2},...,w^{*}_{d},w^{*}_{d+1},w^{*}_{d+2},...,w^{*}_{2d} \right] ^{\top }\) is a vector of independent Brownian motions. By exploiting (34), we can rewrite (22) as

$$\begin{aligned} dy_{i}(t) = -\frac{1}{2} \Sigma _{i}(t) dt + \sqrt{\Sigma _{i}(t)} \sum \limits _{j = 1}^{d+i} \ell _{d+i,j}(t) dw^{*}_{j}(t), \end{aligned}$$
(35)

which can be discretized by generating 2d independent Gaussian random variables. Unfortunately, in our setting this is not readily doable as a consequence of the mechanics of the Wishart sampling algorithm. In other words, we do not have a direct “access” to the discretized paths of \(w^{\Sigma }_{1}\), \(w^{\Sigma }_{2}\), ..., \(w^{\Sigma }_{d}\).

The simple idea underlying the new simulation scheme is to exploit the auxiliary scalar dynamics of \(\Sigma _{i}(t)\) to get an approximation of \(w^{\Sigma }_{i}(t + \Delta ) - w^{\Sigma }_{i}(t)\). Let \({\hat{\Sigma }}(t)\) and \({\hat{\Sigma }}(t + \Delta )\) be the realizations of the trajectory of the Wishart process for two adjacent points on the time grid computed by means of the exact scheme in Ahdida and Alfonsi (2013). The discretized version of (1), which reads

$$\begin{aligned} {\hat{\Sigma }}_{i}(t + \Delta ) - {\hat{\Sigma }}_{i}(t) \approx \kappa _{i} \left( \theta _{i} - {\hat{\Sigma }}_{i}(t) \right) \Delta + \eta _{i} \sqrt{{\hat{\Sigma }}_{i}(t)} {\widetilde{w}}_{i}, \end{aligned}$$
(36)

can now be used to approximate the Gaussian variable \({\widetilde{w}}_{i}\). Let \({\widetilde{w}}_{\Sigma _{i}}\) be the value obtained by solving (36) for \({\widetilde{w}}_{i}\). For a sufficiently small time interval, \(\widetilde{{\mathbf {w}}}_{\Sigma } = \left[ {\widetilde{w}}_{\Sigma _{1}}, {\widetilde{w}}_{\Sigma _{2}},..., {\widetilde{w}}_{\Sigma _{d}} \right] ^{\top }\) represents an approximation of a vector of Gaussian variables with correlation matrix \({\hat{C}}_{\Sigma }(t)\) (i.e. the realization at time t of the matrix \(C_{\Sigma }\)). Further, let \({\hat{L}}_{\Sigma }(t)\) be the lower triangular matrix obtained from the Cholesky decomposition of \({\hat{C}}_{\Sigma }(t)\); then

$$\begin{aligned} \widetilde{{\mathbf {w}}}^{*}_{\Sigma } = {\hat{L}}^{-1}_{\Sigma }(t) \widetilde{{\mathbf {w}}}_{\Sigma } \end{aligned}$$

is composed of d approximated independent gaussian random variables. By sampling an additional random vector \(\hat{{\mathbf {w}}}^{*}_{{\mathbf {y}}}\) from \({\mathcal {N}}\left( {\mathbf {0}}_{d}, \Delta {\mathbb {I}}_{d} \right) \) (the d-variate gaussian distribution) and setting

$$\begin{aligned} \hat{{\mathbf {w}}}^{*} = \begin{bmatrix} \widetilde{{\mathbf {w}}}^{*}_{\Sigma } \\ \hat{{\mathbf {w}}}^{*}_{{\mathbf {y}}} \end{bmatrix} \end{aligned}$$

we can finally approximate (35) as

$$\begin{aligned} {\hat{y}}_{i}(t + \Delta ) = {\hat{y}}_{i}(t) - \frac{1}{2} {\hat{\Sigma }}_{i}(t) \Delta + \sqrt{{\hat{\Sigma }}_{i}(t)} \sum \limits _{j = 1}^{d+i} {\hat{\ell }}_{d+i,j}(t) {\hat{w}}^{*}_{j}(t) \end{aligned}$$
(37)

where \({\hat{L}}_{C}(t)\) results from the factorization of \({\hat{C}}(t)\). If \({\hat{C}}(t)\) turns out not to be positive definite, we take its positive part, \({\hat{C}}^{+}(t)\) (defined as the matrix obtained from the spectral decomposition of \({\hat{C}}(t)\) with negative eigenvalues replaced by zeros), and apply the extended Cholesky decomposition described in Golub and Van Loan (2012). The complete algorithm is exhibited in Algorithm 1, and we refer to it as the Gaussian Variables Approximation (GVA) scheme for the WASC model. Notice that this algorithm can be considered as an extension of the GVA algorithm for the WMSV case developed in La Bua and Marazzina (2019).

Algorithm 1 Gaussian Variables Approximation (GVA) scheme for the WASC model
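
The core of the GVA scheme, namely recovering the approximate Gaussian increments from two adjacent Wishart samples via (36) and decorrelating them, can be sketched as follows. The snippet is illustrative only: the function names are hypothetical, the exact Wishart sampler of Ahdida and Alfonsi (2013) is not reproduced (two adjacent values of the diagonal elements are taken as given), the correlation matrix \({\hat{C}}_{\Sigma }(t)\) is passed as an input, and a standard Cholesky factorization is used, which presupposes a positive definite matrix.

```python
import numpy as np

def recover_increments(S_t, S_next, kappa, theta, eta, dt):
    """Solve (36) for w_i, given the diagonal Wishart elements at t and t + dt.

    All arguments except dt are arrays of length d; S_t is assumed strictly positive.
    """
    drift = kappa * (theta - S_t) * dt
    return (S_next - S_t - drift) / (eta * np.sqrt(S_t))

def decorrelate(w_tilde, C_sigma):
    """Map correlated increments to approximately independent ones: L^{-1} w."""
    L = np.linalg.cholesky(C_sigma)       # lower triangular, C = L L^T
    return np.linalg.solve(L, w_tilde)

# Synthetic check with hypothetical scalar parameters (not calibrated values)
dt = 1.0 / 250
kappa = np.array([1.4, 2.4])
theta = np.array([0.05, 0.04])
eta = np.array([0.8, 0.7])
S_t = np.array([0.04, 0.04])
C_sigma = np.array([[1.0, 0.6], [0.6, 1.0]])

rng = np.random.default_rng(1)
w_true = np.linalg.cholesky(C_sigma) @ rng.standard_normal(2) * np.sqrt(dt)
S_next = S_t + kappa * (theta - S_t) * dt + eta * np.sqrt(S_t) * w_true

w_rec = recover_increments(S_t, S_next, kappa, theta, eta, dt)
w_star = decorrelate(w_rec, C_sigma)
```

In this synthetic check the next value is generated by the Euler step itself, so the increments are recovered exactly; with exact Wishart samples the recovery is only approximate and, as discussed in Sect. 5.1, deteriorates for coarse time grids.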

5.1 Numerical results

Even if these new schemes apply to the general specification of the model, here we restrict our attention to the reduced model presented in Sect. 3.1 (i.e. we consider the matrix M to be diagonal). In particular, we develop an extensive numerical investigation based on the parameter set calibrated to market data enforcing the condition \(\beta \ge 3\) and shown in the rightmost column of Table 2. Considering \(5 \times 10^{5}\) simulation paths, we price European call options written on either of the two assets with maturity \(T = 1\) and moneyness in the set \(\lbrace 70\%, 100\%, 130\%\rbrace \). For the sake of simplicity, we assume interest rates and dividends equal to zero and \(f_{1}(0) = f_{2}(0) = 100\). The asterisk in the following tables means that the corresponding reference value lies outside of the \(95\%\) confidence interval.
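
The asterisk convention can be made explicit with a short sketch. It assumes a normal approximation for the Monte Carlo mean (critical value 1.96 for the \(95\%\) level) and zero interest rates, so that no discounting is applied; the function name is hypothetical.

```python
import numpy as np

def mc_estimate(payoffs, reference, z=1.96):
    """Monte Carlo price, 95% confidence interval, and 'asterisk' flag.

    The flag is True when the reference value lies outside the interval.
    """
    payoffs = np.asarray(payoffs, dtype=float)
    mean = payoffs.mean()
    half_width = z * payoffs.std(ddof=1) / np.sqrt(payoffs.size)
    lower, upper = mean - half_width, mean + half_width
    outside = not (lower <= reference <= upper)
    return mean, (lower, upper), outside

# Tiny deterministic example
price, ci, flag = mc_estimate([1.0, 2.0, 3.0, 4.0, 5.0], reference=3.5)
```

With \(5 \times 10^{5}\) paths the interval is narrow, so a flagged entry signals a discretization bias rather than sampling noise.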

In the context of plain vanilla options, results in Tables 4, 5, 6, 7 show that both schemes allow for accurate price estimates provided that the time step is sufficiently small. In more detail, the TE scheme is found to outperform the GVA scheme when the time grid is coarse (10, 20 and 50 steps per year), while for smaller mesh widths the two approaches tend to perform similarly. With this setting and taking into account option prices for both assets, the absolute mean percentage error is equal to \(0.232\%\) for the GVA scheme and to \(0.275\%\) for the TE scheme. It is worth remarking, though, that the GVA scheme systematically requires a finer time discretization to produce reliable estimates (true prices lying in the \(95\%\) confidence interval) compared to the simpler TE scheme. This is due to the fact that the approximation exploited in (36) seems to be adequate only for very small time intervals. From a computational point of view, the TE scheme greatly outperforms the GVA scheme, the latter being \(110\%-130\%\) slower than the former, as exhibited in Table 8. Indeed, when implementing the GVA scheme, at each time step we have to perform the Cholesky factorization of the 2d-dimensional matrix \({\widehat{C}}(t)\), with a considerable increase in the computational burden.

Table 4 Plain vanilla option written on Asset 1, TE scheme
Table 5 Plain vanilla option written on Asset 1, GVA scheme
Table 6 Plain vanilla option written on Asset 2, TE scheme
Table 7 Plain vanilla option written on Asset 2, GVA scheme
Table 8 Computational time as function of the number of time steps. The number of simulated paths is fixed to \(5 \times 10^{5}\)

We now compare the two simulation schemes by pricing an Asian option with payoff at \(T=1\) equal to

$$\begin{aligned} \left( \frac{1}{12}\sum _{i=1}^{12}f_1(0)e^{y_1(i\Delta _M)}+\frac{1}{12}\sum _{i=1}^{12}f_2(0)e^{y_2(i\Delta _M)}-K\right) ^+, \end{aligned}$$

with \(\Delta _M=\frac{1}{12},\) i.e., we are considering an Asian option with monthly monitoring, and \(K=200\) (all the other parameters as above). In Table 9 we compare the two simulation schemes: results confirm that the TE scheme outperforms the GVA when the time grid is coarse.
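
Given simulated log-price paths sampled at the twelve monitoring dates, the payoff above can be evaluated as in the following sketch. The function name and the array layout (one row per path, one column per monitoring date) are assumptions made for illustration.

```python
import numpy as np

def asian_basket_payoff(y1, y2, f1_0=100.0, f2_0=100.0, K=200.0):
    """Payoff of the monthly-monitored Asian option on the two assets.

    y1, y2 : arrays of shape (n_paths, 12) with y_k(i * Delta_M), i = 1, ..., 12
    """
    avg1 = f1_0 * np.exp(y1).mean(axis=1)   # arithmetic average of asset 1
    avg2 = f2_0 * np.exp(y2).mean(axis=1)   # arithmetic average of asset 2
    return np.maximum(avg1 + avg2 - K, 0.0)

# Deterministic examples: flat paths, and paths constantly 10% above the start
flat = np.zeros((3, 12))
up = np.full((3, 12), np.log(1.1))
payoff_flat = asian_basket_payoff(flat, flat)
payoff_up = asian_basket_payoff(up, up)
```

The Monte Carlo price is then the sample mean of the payoffs, since interest rates are set to zero.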

Table 9 Asian option with monthly monitoring
Table 10 Absolute value of the real part of the joint characteristic function (38): \(5\times 10^5\) simulations, parameters as in Eq.(39)

Nonetheless, the proposed GVA scheme has the inherent advantage of implementing the exact sampling of the Wishart process thanks to the algorithm in Ahdida and Alfonsi (2013). This feature is of great importance in all the cases in which we need to estimate the (conditional) moments of the distribution of the elements of \(\Sigma (t)\). The ability to consistently deal with the discretization of the Wishart process is, indeed, the main reason that led us to develop the new scheme. To better highlight this advantage, in Table 10 we exploit the two simulation algorithms to compute the absolute value of the real part of the joint characteristic function

$$\begin{aligned} \left| \text {Re}\phi (\varvec{\Lambda _X},\Lambda _V,T)\right| =\left| \text {Re} {\mathbb {E}}\left[ \exp \left( \iota \left\langle \varvec{\Lambda _X}, {\mathbf {X}}(T) \right\rangle + \iota {\text {Tr}}\left[ \Lambda _V\Sigma (T)\right] \right) \right] \right| , \end{aligned}$$
(38)

setting

$$\begin{aligned} {\mathbf {X}}(T) = \begin{bmatrix} \log (f_1(T)) \\ \log (f_2(T)) \end{bmatrix},\,\,\varvec{\Lambda _X} = \begin{bmatrix} 0 \\ a \end{bmatrix},\,\,\Lambda _V = \begin{bmatrix} 1 &{} 0.5\\ 0.5 &{} 1 \end{bmatrix}, \end{aligned}$$
(39)

varying the parameter a. Notice that a closed-form solution for this characteristic function is provided in Da Fonseca and Grasselli (2011). For large values of a, e.g., \(a=1\), the TE scheme outperforms the GVA one, exactly as in the option pricing cases described above. However, the opposite happens for small values of a, i.e., when the contribution of the simulation of the Wishart process is more important relative to that of the log-asset value. Moreover, when \(a=0\), i.e., when we only need to simulate the Wishart process \(\Sigma (T)\), the performance of the GVA scheme is independent of the number of time steps, since it is an exact simulation scheme, as shown in Ahdida and Alfonsi (2013).
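
For completeness, the Monte Carlo estimator of (38) can be sketched as below, given terminal samples of the log-prices and of the Wishart matrix produced by either scheme. The function name and array layout are assumptions made for illustration.

```python
import numpy as np

def abs_real_joint_cf(X, Sigma, Lambda_X, Lambda_V):
    """Monte Carlo estimate of |Re E[exp(i <Lambda_X, X> + i Tr[Lambda_V Sigma])]|.

    X     : (n_paths, d) terminal log-prices
    Sigma : (n_paths, d, d) terminal Wishart matrices
    """
    # Tr[Lambda_V Sigma_n] for every path n
    trace_term = np.einsum("ij,nji->n", Lambda_V, Sigma)
    phase = X @ Lambda_X + trace_term
    return abs(np.mean(np.exp(1j * phase)).real)

# Setting of Eq. (39) with a = 0, evaluated on degenerate samples
Lambda_V = np.array([[1.0, 0.5], [0.5, 1.0]])
Lambda_X = np.array([0.0, 0.0])
X = np.zeros((4, 2))
Sigma_zero = np.zeros((4, 2, 2))
Sigma_half_pi = np.tile(0.5 * np.pi * np.eye(2), (4, 1, 1))

value_zero = abs_real_joint_cf(X, Sigma_zero, Lambda_X, Lambda_V)
value_pi = abs_real_joint_cf(X, Sigma_half_pi, Lambda_X, Lambda_V)
```

In the second degenerate example the trace equals \(\pi \) on every path, so the exponential is \(-1\) and the estimator again returns 1.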

6 Concluding remarks

The matrix structure of Wishart-based stochastic volatility models provides a remarkable degree of flexibility in describing the evolution of asset volatilities. Realistic implementations, though, require the development of specific numerical techniques in order to deal with the inherent level of complexity. In this article we have shown, leveraging a thorough analysis of the distributional properties of the Wishart process, some possible solutions intended to make this class of models more suitable for real market applications. Accordingly, we hope that our contribution will increase the interest of researchers and practitioners in matrix-variate stochastic volatility dynamics.