1 Introduction

This paper aims to draw attention to a more general modeling approach than the one available under the classical no-arbitrage paradigm in finance. Historically, Long (1990) was the first to observe that one can rewrite the risk neutral pricing formula as a conditional expectation under the real-world probability measure where the so-called numéraire portfolio (NP) acts as numéraire. The benchmark approach postulates only the existence of the NP and no longer relies on the rather restrictive classical no-arbitrage assumptions, which are equivalent to the existence of an equivalent risk neutral probability measure. Under this much weaker assumption, one can still perform all essential tasks of valuation and risk management. The only condition imposed is that the NP, which is in the long run the pathwise best performing portfolio, remains finite in finite time. When this assumption is violated for a model, some economically meaningful arbitrage must exist, causing the candidate for the NP to explode. In this case, the respective model makes little theoretical or practical sense. Note that, when a finite NP exists, various forms of classical arbitrage may still be present in the market; see, e.g., Loewenstein and Willard (2000) and Heston et al. (2007) for various examples in the literature on bubbles.

The current paper illustrates the divergence of the benchmark approach from the classical approach by focusing on the currency market, which is one of the most active markets. We present and calibrate a hybrid model describing the dynamics of a vector of foreign exchange (FX) rates and the associated interest rates. We extend and unify the FX multifactor stochastic volatility models of De Col et al. (2013) and Baldeaux et al. (2015) by means of the general transform formula presented in Grasselli (2017). The resulting general model that we develop allows for the simultaneous presence of multiple stochastic volatility factors both of square root (see Heston 1993) and 3/2 type (see Heston 1997; Platen 1997). More explicitly, the square root of each CIR factor appears in both the numerator and the denominator of the diffusion terms. Based on Grasselli (2017), we refer to this model as the 4/2 model. Our specification for the volatility process spans a large class of dynamics ranging from the 3/2 to the Heston model. This means that we can let market data dictate the relative importance of the two stochastic volatility effects that we consider. While the 4/2 model might appear as an involved choice, we will show in Sect. 3.3 that it naturally emerges, e.g., in a simple Heston setting for a suitable choice of the risk premium. Moreover, such CIR factors can be freely combined in order to drive stochastic interest rates. Therefore, the model is suitable for the valuation of long-dated FX products, for which interest rate risk becomes a relevant risk factor, see the discussions in Gnoatto and Grasselli (2014).

The framework we propose is general to the extent that, for suitable parameter combinations, our model may not admit an equivalent risk neutral probability measure for some economies. In spite of this feature, the problem of pricing and hedging contingent claims can always be solved under the more general benchmark approach of Platen and Heath (2010), with the additional possibility of applying benchmarked risk minimization for hedging, as developed, e.g., in Du and Platen (2016).

Despite the richness of our framework, it is possible to efficiently solve and implement the pricing of plain vanilla instruments via Fourier-based techniques, see Carr and Madan (1999) and Lewis (2001). Semi-analytical closed form solutions for products such as European FX options can be computed thanks to the availability of the exact formula for the joint Fourier transform of the model’s state variables, see Grasselli (2017). The flexibility of the 4/2 model has also been exploited in recent contributions of Detemple and Kitapbayev (2018) and Cheng et al. (2019).

We test our model on real datasets of vanilla FX options by performing several calibration experiments. Our empirical results are twofold. First, we confirm the empirical findings of Baldeaux et al. (2015) on the violation of the risk neutral pricing paradigm. Second, we find that the appearance of such violations may change over time and across currencies. In fact, our multiple calibration experiments seem to suggest the presence of regime switches in traded FX-option prices between the standard risk neutral and the real-world pricing approach. Such a feature calls for a modeling framework that is able to span both valuation principles, which is what the benchmark approach provides.

The paper is structured as follows: In Sect. 2, we introduce the general multi-currency modeling framework and recall some notions from the benchmark approach. Section 3 motivates and formally introduces the 4/2 model as a unifying framework for stochastic volatility models driven by the CIR process, such as the Heston-based model of De Col et al. (2013) and the 3/2-based model of Baldeaux et al. (2015). The 4/2 model extends the Heston model and allows for the possibility of a failure of the risk neutral paradigm. The analytical tractability of the 4/2 model is demonstrated in Sect. 4, which constitutes a prerequisite for an efficient model calibration, presented in Sect. 5. Section 6 concludes, and the proofs are gathered in the appendix.

2 General setup

In this section, we present the general modeling framework of the benchmark approach for a foreign exchange (FX) market. Section 2.1 provides a general setup driven by a multi-dimensional diffusion process.

2.1 Specification of the currency market

We use superscripts to reference different currencies and employ bold letters for vectors and subscripts for elements thereof. Unless specified by a suitable superscript, all expectations are considered with respect to the real-world probability measure \(\mathbb {P}\). We model the currency market on a probability space \(\left( \varOmega ,\mathcal {F}_{{\bar{T}}},\mathbb {P}\right) \), where \({\bar{T}}<\infty \) is a finite time horizon. On this space, we introduce a filtration \((\mathcal {F}_t)_{0\le t\le {\bar{T}}}\) to model the evolution of available information, satisfying the usual assumptions. The above filtered probability space supports a standard d-dimensional \(\mathbb {P}\)-Brownian motion \(\mathbf {Z}=\{\mathbf {Z}(t)=(Z_1(t),\ldots , Z_d(t)),\ 0\le t\le \bar{T}\}\) for modeling the traded uncertainty. The constant N denotes the number of currencies in the model, whereas d is the number of risk factors we employ.

In each economy, we postulate the existence of a money market account, i.e., the ith money market account, when denominated in units of the ith currency, evolves according to the relation

$$\begin{aligned} \mathrm {d}B^i(t) =B^i(t)r^i(t)\mathrm {d}t,\,\, \ B^i(0)=1, \ 0\le t \le {\bar{T}}; \end{aligned}$$
(2.1)

with the \(\mathbb {R}\)-valued, adapted ith short rate process \(r^i=\left\{ r^i(t), \ 0 \le t\le {\bar{T}} \right\} \). We denote by \(S^{i,j}=\left\{ S^{i,j}(t), \ 0 \le t\le {\bar{T}} \right\} \) the continuous exchange rate process between currency i and j. Here, \(S^{i,j}(t)\) denotes the price of one unit of currency i in units of currency j, meaning that, e.g., for \(i=\mathrm{USD}\) and \(j=\mathrm{EUR}\) and \(S^{i,j}(t)=0.92\) we have, in line with the standard FORDOM convention, that the price of one USD is 0.92 EUR at time t.

Let us follow (Platen and Heath 2010; Heath and Platen 2006) and introduce a family of primary security account processes via

$$\begin{aligned} B^{i,j}(t)=S^{i,j}(t)B^j(t),\,\, \ 0\le t\le {\bar{T}} \end{aligned}$$

for \(i\ne j\). Obviously, for \(i=j\) we have \(B^{i,i}(t)=B^{i}(t)\). We take the perspective of a generic currency referenced with superscript i and introduce the vector of money market accounts of the form \(\varvec{B}^i(t)=\left( B^{i,1}(t),\dots ,B^{i,N}(t)\right) \), \(i=1,\dots ,N\). Given this vector of primary security accounts, an investor may trade on them. This is represented by introducing a family of predictable \(\varvec{B}^i\)-integrable stochastic processes \(\varvec{\delta }=\left\{ \varvec{\delta }(t)=\left( \delta _1(t),\dots ,\delta _N(t)\right) , \ 0 \le t\le {\bar{T}} \right\} \) for \(i=1,\dots ,N\), called strategies. Each \(\delta _j(t)\in \mathbb {R}\) denotes the number of units that an agent holds in the jth primary security account at time t. Let us introduce the process \({\mathtt {V}}^{i,\delta }=\left\{ \mathtt {V}^{i,\delta }(t), \ 0 \le t\le {\bar{T}} \right\} \), which describes the value process in ith currency denomination corresponding to the portfolio strategy \(\varvec{\delta }\), i.e.,

$$\begin{aligned} \mathtt {V}^{i,\delta }(t)=\sum _{j=1}^N\delta _j(t)B^{i,j}(t). \end{aligned}$$
(2.2)

The strategy \(\delta \) is said to be self-financing if

$$\begin{aligned} \mathrm {d}\mathtt {V}^{i,\delta }(t)=\sum _{j=1}^N\delta _j(t)\mathrm {d}B^{i,j}(t). \end{aligned}$$
(2.3)

In line with Platen and Heath (2010), we assume limited liability for all investors. For this purpose, we introduce \(\mathcal {V}^+\) as the set of all self-financing strategies forming strictly positive portfolios. For our purposes, we will be interested in a particular strategy \(\delta ^\star \in \mathcal {V}^+\), which yields the growth optimal portfolio (GOP), which can be shown to be equivalent to the numéraire portfolio (NP) and is defined as follows:

Definition 2.1

A solution \(\delta ^{\star }\) of the maximization problem

$$\begin{aligned} \sup _{\delta \in \mathcal {V}^+}\mathbb {E}\left[ \log \left( \frac{\mathtt {V}^{i,\delta }(T)}{\mathtt {V}^{i,\delta }(0)}\right) \right] , \end{aligned}$$

for all \(i = 1,\ldots ,N\) and \(0\le T\le {{\bar{T}}}\) is called a growth optimal portfolio strategy.

It has been shown in Platen and Heath (2010) that the GOP value process is unique in an incomplete jump-diffusion market setting. We summarize the discussion above in the following assumptions.

Assumption 2.1

We assume the existence of the growth optimal portfolio (GOP) and denote by \(D^i=\left\{ D^i(t)\in (0,+\infty ), \ 0 \le t\le {\bar{T}} \right\} , \ i=1,\dots ,N\) the value of the GOP denominated in the ith currency. The dynamics of the GOP are given by

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}D^i(t)}{D^i(t)}= r^i(t)\mathrm {d}t+\langle \varvec{\pi }^i(t),\varvec{\pi }^i(t)\mathrm {d}t+\mathrm {d}\varvec{Z}(t)\rangle , \ D^i(0)>0 \end{aligned} \end{aligned}$$
(2.4)

for \(t\in [0, {\bar{T}}]\) and \(i = 1,\ldots N\), where for \(N,d\in \mathbb {N}\), the N-dimensional family of predictable, \(\mathbb {R}^d\)-valued stochastic processes \(\varvec{\pi }=\big \{\varvec{\pi }^i(t)=(\pi ^i_1(t),\ldots ,\pi ^i_d(t)), \ 0 \le t\le {\bar{T}} \big \}\) represent the market prices of risk with respect to the ith currency denomination. The processes \(\varvec{\pi }^i\) are assumed to be integrable with respect to the d-dimensional standard Brownian motion \(\mathbf {Z}\).

The GOP can be shown to be in many ways the best performing portfolio. In particular, in the long run its value almost surely outperforms that of any other strictly positive portfolio. Here, we assume that it remains finite in finite time in all currency denominations. If we were to consider a model where the GOP explodes in any of the currency denominations, then the model would allow an obvious form of economically meaningful arbitrage, since one could generate, in that currency denomination, in finite time, unbounded wealth from finite initial capital. Given the uniqueness of the GOP, all exchange rates \(S^{i,j}(t)\) can be uniquely determined as ratios of different denominations of the GOP in the respective currencies.

Assumption 2.2

The family of exchange rate processes \(S^{i,j}=\left\{ S^{i,j}(t), \ 0 \le t\le {\bar{T}}\right\} \) is determined by the ratios

$$\begin{aligned} S^{i,j}(t)=\frac{D^i(t)}{D^j(t)}, \end{aligned}$$
(2.5)

for \(0 \le t \le {\bar{T}}\) and \(i,j=1,\ldots ,N\).
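As a minimal illustration of Assumption 2.2, the following sketch builds exchange rates from GOP values and checks the triangular relation \(S^{i,k}(t)=S^{i,j}(t)S^{j,k}(t)\), which follows directly from (2.5). The three GOP values and the helper name `fx_rate` are hypothetical and chosen only for illustration; they are not market data.

```python
# Hypothetical GOP values D^1(t), D^2(t), D^3(t) at a fixed time t
# (illustrative numbers only, not taken from any data set).
gop = {1: 1.30, 2: 1.20, 3: 150.0}


def fx_rate(gop_values, i, j):
    """Exchange rate S^{i,j}(t) = D^i(t) / D^j(t), as in (2.5)."""
    return gop_values[i] / gop_values[j]


# Triangular consistency: S^{1,3} = S^{1,2} * S^{2,3}.
direct = fx_rate(gop, 1, 3)
via_2 = fx_rate(gop, 1, 2) * fx_rate(gop, 2, 3)
print(direct, via_2)

# Inversion: S^{i,j} * S^{j,i} = 1.
print(fx_rate(gop, 1, 2) * fx_rate(gop, 2, 1))
```

Because every rate is a ratio of the same family of GOP denominations, triangular and inversion relations hold by construction and need not be imposed separately.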

Given Assumption 2.2, it is immediate to compute via a direct application of the Itô formula the dynamics of all exchange rates and all primary security accounts in all currency denominations.

Lemma 2.1

The exchange rate \(S^{i,j}(t)\) under the real-world probability measure \(\mathbb {P}\) evolves according to the dynamics

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}S^{i,j}(t)}{S^{i,j}(t)}&= ( r^i(t)-r^j(t))\mathrm {d}t+\langle \varvec{\pi }^i(t)-\varvec{\pi }^j(t),\varvec{\pi }^i(t)\mathrm {d}t+\mathrm {d}\varvec{Z}(t)\rangle ,\\ S^{i,j}(0)&=s^{i,j}>0, \end{aligned} \end{aligned}$$
(2.6)

and the generic jth primary security account \(B^{i,j}\), in ith currency denomination and under the real-world probability measure \(\mathbb {P}\), evolves according to the dynamics

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}B^{i,j}(t)}{B^{i,j}(t)}&= r^i(t)\mathrm {d}t+\langle \varvec{\pi }^i(t)-\varvec{\pi }^j(t),\varvec{\pi }^i(t)\mathrm {d}t+\mathrm {d}\varvec{Z}(t)\rangle , \\ B^{i,j}(0)&=b^{i,j}, \end{aligned} \end{aligned}$$
(2.7)

for \(\ i,j = 1,\ldots ,N\) and \(t\in [0,{\bar{T}}]\).
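As a brief sketch of the computation behind (2.6), using only (2.4) and (2.5): the Itô formula applied to \(1/D^j\) and the product rule applied to \(S^{i,j}=D^i\cdot (1/D^j)\) give

$$\begin{aligned} \frac{\mathrm {d}\left( 1/D^j(t)\right) }{1/D^j(t)}&= -r^j(t)\mathrm {d}t-\langle \varvec{\pi }^j(t),\mathrm {d}\varvec{Z}(t)\rangle ,\\ \frac{\mathrm {d}S^{i,j}(t)}{S^{i,j}(t)}&= \frac{\mathrm {d}D^i(t)}{D^i(t)}+\frac{\mathrm {d}\left( 1/D^j(t)\right) }{1/D^j(t)}-\langle \varvec{\pi }^i(t),\varvec{\pi }^j(t)\rangle \mathrm {d}t\\&= ( r^i(t)-r^j(t))\mathrm {d}t+\langle \varvec{\pi }^i(t)-\varvec{\pi }^j(t),\varvec{\pi }^i(t)\mathrm {d}t+\mathrm {d}\varvec{Z}(t)\rangle . \end{aligned}$$

The dynamics (2.7) then follow from \(B^{i,j}(t)=S^{i,j}(t)B^j(t)\) and (2.1), since \(B^j\) is of finite variation and contributes only the drift term \(r^j(t)\mathrm {d}t\).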

2.2 The benchmark approach

In the present paper, we evaluate contingent claims under the benchmark approach of Platen and Heath (2010). Under this approach, price processes denominated in terms of the GOP are called benchmarked price processes. More precisely, for \(i,j=1,\ldots N\), let us introduce the benchmarked price process

$$\begin{aligned} \hat{B}^{j}=\left\{ \hat{B}^{j}(t):=\frac{B^{i,j}(t)}{D^i(t)},\ 0\le t\le \bar{T}\right\} . \end{aligned}$$

We call \(\hat{B}^{j}\) the benchmarked jth primary security account. Note that \(\hat{B}^{j}\) does not depend on the index i of the currency denomination we started from. Given (2.7) and (2.4), upon an application of the Itô formula, it is immediate to conclude that all benchmarked price processes \(\hat{B}^{j}\) form \(\mathbb {P}\)-local martingales. Moreover, they are nonnegative and hence, due to Fatou’s lemma, also \(\mathbb {P}\)-supermartingales. Analogously, benchmarked nonnegative portfolio values \(\hat{\mathtt {V}}^{\delta }(t):={\mathtt {V}}^{i,\delta }(t)/D^i(t)\) form \(\mathbb {P}\)-supermartingales. While forms of economically meaningful arbitrage, which are equivalent to the explosion of the GOP, are excluded, forms of classical arbitrage that are excluded under classical no-arbitrage assumptions may exist in our model, see, e.g., Loewenstein and Willard (2000).
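The local martingale property can be made explicit in one line. Combining \(B^{i,j}(t)=S^{i,j}(t)B^j(t)\) with (2.5) gives \(\hat{B}^{j}(t)=B^j(t)/D^j(t)\), which shows the independence from i, and the Itô formula together with (2.1) and (2.4) yields the driftless dynamics

$$\begin{aligned} \frac{\mathrm {d}\hat{B}^{j}(t)}{\hat{B}^{j}(t)}= r^j(t)\mathrm {d}t-r^j(t)\mathrm {d}t-\langle \varvec{\pi }^j(t),\mathrm {d}\varvec{Z}(t)\rangle = -\langle \varvec{\pi }^j(t),\mathrm {d}\varvec{Z}(t)\rangle . \end{aligned}$$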

Let us now introduce for the ith currency denomination the Radon–Nikodym derivative process, denoted by \(\varLambda ^i=\left\{ \varLambda ^i(t), \ 0\le t \le \bar{T}\right\} \), by setting

$$\begin{aligned} \varLambda ^i(t)=\frac{\hat{B}^{i}(t)}{\hat{B}^{i}(0)},\quad i=1,\ldots , N. \end{aligned}$$
(2.8)

This is the risk neutral density for the putative risk neutral measure \(\mathbb {Q}^i\) of the ith currency denomination. It arises, e.g., when we consider replicable claims and assume the existence of an equivalent risk neutral probability measure \(\mathbb {Q}^i\). As each \(\varLambda ^i\) equals the corresponding benchmarked savings account \(\hat{B}^{i}\) (up to a constant factor), it is clear that \(\varLambda ^i\) is a \(\mathbb {P}\)-local martingale, for \(i=1,\ldots , N\). The classical assumption in the foreign exchange literature that there exists an equivalent risk neutral probability measure for each currency denomination corresponds to the requirement that each process \(\varLambda ^i\) is a true martingale for \(i=1,\ldots , N\). Such a requirement is rather strong and may be empirically rejected, see, e.g., Heath and Platen (2006), the findings in Baldeaux et al. (2015) and the necessary and sufficient conditions of Hulley and Ruf (2019). Hence, in the present paper, we shall allow each \(\varLambda ^i\) to be either a true martingale or a strict local martingale. To work in such a generalized setting requires a more general pricing concept than the one provided under the classical risk neutral paradigm. In the following, we will employ the notion of real-world pricing: a price process \(\mathfrak {V}^i=\left\{ \mathfrak {V}^i(t), 0\le t \le \bar{T}\right\} \), here denominated in ith currency, is said to be fair if its benchmarked value, i.e., its value expressed in units of the GOP \(D^i\), forms a true \(\mathbb {P}\)-martingale, see Definition 9.1.2 in Platen and Heath (2010). For a fixed maturity \(T\in [0, \bar{T}]\), we let \(\mathcal {H}^i(T)=\mathfrak {V}^i(T)\) be an \(\mathcal {F}_T\)-measurable nonnegative contingent claim, expressed in units of the ith currency denomination such that

$$\begin{aligned} \mathbb {E}\left[ \left. \hat{\mathcal {H}}^i(T)\right| \mathcal {F}_t\right] =\mathbb {E}\left[ \left. \frac{\mathcal {H}^i(T)}{D^i(T)}\right| \mathcal {F}_t\right] <\infty , \end{aligned}$$

for all \(0\le t\le T\le \bar{T}\), \(i=1,\dots ,N\). The benchmarked fair price \({{\hat{\mathfrak {V}}}^i}(t)=\mathfrak {V}^i(t)/D^i(t)\) of this contingent claim is the minimal possible price and is given by the following conditional expectation under the real-world probability measure \(\mathbb {P}\):

$$\begin{aligned} {\hat{\mathfrak {V}}^i(t)}=\mathbb {E}\left[ \left. {{\hat{\mathcal {H}}}^i(T)}\right| \mathcal {F}_t\right] , \end{aligned}$$
(2.9)

which is known in the literature as real-world pricing formula, see Corollary 9.1.3 in Platen and Heath (2010). Note that benchmarked risk minimization, described in Du and Platen (2016), gives (2.9) generally. In case \(\varLambda ^i\) is a true martingale we obtain, by changing in (2.9) from the real-world probability measure \(\mathbb {P}\) to the equivalent risk neutral probability measure \(\mathbb {Q}^i\), the risk neutral pricing formula

$$\begin{aligned} \mathfrak {V}^i(t)&=\mathbb {E}\left[ \left. \frac{B^i(t)}{B^i(T)}\frac{B^i(T)}{B^i(t)}\frac{D^i(t)}{D^i(T)}\mathcal {H}^i(T)\right| \mathcal {F}_t\right] \nonumber \\&=\mathbb {E}\left[ \left. \frac{\varLambda ^i(T)}{\varLambda ^i(t)}\frac{B^i(t)}{B^i(T)}\mathcal {H}^i(T)\right| \mathcal {F}_t\right] =\mathbb {E}^{\mathbb {Q}^i}\left[ \left. \frac{B^i(t)}{B^i(T)}\mathcal {H}^i(T)\right| \mathcal {F}_t\right] . \end{aligned}$$
(2.10)

This shows that the real-world pricing formula generalizes the classical risk neutral valuation formula and the Radon–Nikodym derivative for the respective risk neutral probability measure is given by (2.8). In general, due to the supermartingale property of benchmarked price processes in the case when \(\varLambda ^i\) is a strict supermartingale, a formally obtained risk neutral price is greater than or equal to the real-world price, see Du and Platen (2016).

3 The 4/2 model

To demonstrate the fact that in reality there may exist hedgeable securities that are less expensive than their associated formally obtained risk neutral price processes, we need some model that can potentially capture this phenomenon when it is present in the market. One such model is the one introduced by Grasselli (2017), called the 4/2 model, that unifies several well-known models. In Sect. 3.2, we state the precise conditions under which the crucial martingale property of the benchmarked savings account fails for the 4/2 model.

3.1 Formal presentation of the 4/2 model

To provide a concrete specification of the market prices of risk, we proceed to introduce an \(\mathbb {R}^d\)-valued nonnegative stochastic process \(\mathbf {V}=\big \{\mathbf {V}(t)= (V_1(t),\dots ,V_d(t)), 0\le t \le \bar{T}\big \}\), called the volatility factor process. The kth component \(V_k\) of the vector process \(\mathbf {V}\) is assumed to solve the SDE

$$\begin{aligned} \begin{aligned} \mathrm {d}V_k(t)&= \kappa _k ( \theta _k - V_k(t))\mathrm {d}t + \sigma _k V_k(t)^{1/2} \mathrm {d}W_k(t),\\ V_k(0)&= v_k>0, \end{aligned} \end{aligned}$$
(3.1)

for \(t\in [0, {\bar{T}}]\), where the parameters \(\kappa _k>0,\theta _k>0,\sigma _k>0, \) are admissible in the sense of Duffie et al. (2003), \(k=1,\dots ,d\). In addition, to avoid zero volatility factors, we impose the following assumption.

Assumption 3.1

For every \(k=1,\ldots , d,\) the parameters in (3.1) satisfy the relation

$$\begin{aligned} 2\kappa _k\theta _k-\sigma ^2_k\ge 0. \end{aligned}$$
(3.2)

We also allow for nonzero correlation between assets and their volatilities via the following condition:

Assumption 3.2

The Brownian motions \(\mathbf {Z}\) and \(\mathbf {W}\) have a covariation satisfying

$$\begin{aligned} \frac{\mathrm {d}\langle W_k,Z_l\rangle (t)}{\mathrm {d}t}=\delta _{kl}\rho _k, \ k,l=1,\dots , d, \end{aligned}$$
(3.3)

where \(\delta _{kl}\) denotes the Kronecker delta for the indices k and l, i.e., \(\delta _{kl}=1\) if \(k=l\) and \(\delta _{kl}=0\) otherwise.

We then proceed to provide a general specification for the family of market prices of risk.

Assumption 3.3

We assume that the ith market price of risk vector \(\varvec{\pi }^i(t)\) is a projection of the common volatility factor \(\mathbf {V}\), along a direction parametrized by a constant vector \(\mathbf {a}^i \in \mathbb {R}^d\) and a projection of the inverted elements of \(\mathbf {V}\) along another direction parametrized by \(\mathbf {b}^i\in \mathbb {R}^d\), according to the following relations

$$\begin{aligned} \varvec{\pi }^i(t)=\mathrm{Diag}^{1/2}(\varvec{V}(t))\varvec{a}^i +\mathrm{Diag}^{-1/2}(\varvec{V}(t))\varvec{b}^i ,\ \ i=1,\dots ,N, \end{aligned}$$
(3.4)

where \(\mathrm{Diag}^{1/2}(\varvec{u})\) denotes the diagonal matrix whose diagonal entries are the respective square roots of the components of the vector \(\varvec{u}\in {\mathbb {R}}^d\). The family of short-rate processes \(r^i, i=1,\dots , N\) is assumed to be given in the form

$$\begin{aligned} r^i(t)=h^i+\langle \varvec{H}^i,\mathbf {V}(t) \rangle +\langle \varvec{G}^i,\mathbf {V}^{-1} (t) \rangle , \end{aligned}$$
(3.5)

where \(\mathbf {V}^{-1} \) is a vector whose components are the inverses of those of \(\mathbf {V}\).
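To make Assumption 3.3 concrete, the following minimal sketch evaluates (3.4) and (3.5) componentwise for a given state of the volatility factors. The function names and all numerical values are hypothetical and serve only as an illustration.

```python
import math


def market_price_of_risk(v, a, b):
    """pi^i(t) from (3.4): component k equals a_k*sqrt(V_k) + b_k/sqrt(V_k)."""
    return [a_k * math.sqrt(v_k) + b_k / math.sqrt(v_k)
            for v_k, a_k, b_k in zip(v, a, b)]


def short_rate(v, h, big_h, big_g):
    """r^i(t) from (3.5): h + <H, V> + <G, V^{-1}>."""
    return (h
            + sum(h_k * v_k for h_k, v_k in zip(big_h, v))
            + sum(g_k / v_k for g_k, v_k in zip(big_g, v)))


# Hypothetical state and parameters for d = 2 factors.
v = [4.0, 1.0]
print(market_price_of_risk(v, a=[1.0, 2.0], b=[2.0, 3.0]))
print(short_rate(v, h=0.01, big_h=[0.001, 0.002], big_g=[0.004, 0.0]))
```

Setting all entries of `b` to zero leaves only the square root terms, and setting all entries of `a` to zero leaves only the inverse square root terms.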

Under Assumption 3.3, we can express the dynamics of the GOP as

$$\begin{aligned} \frac{\mathrm {d}D^i(t)}{D^i(t)}&= \left( r^i(t)+(\varvec{a}^i)^\top \mathrm{Diag}(\varvec{V}(t))\varvec{a}^i+(\varvec{b}^i)^\top \mathrm{Diag}^{-1}(\varvec{V}(t))\varvec{b}^i +2(\varvec{a}^i)^\top \varvec{b}^i\right) \mathrm {d}t\\&\quad + (\varvec{a}^i)^\top \mathrm{Diag}^{1/2}(\varvec{V}(t))\mathrm{d}\varvec{Z}(t)+(\varvec{b}^i)^\top \mathrm{Diag}^{-1/2}(\varvec{V}(t))\mathrm{d}\varvec{Z}(t). \end{aligned}$$

Here, we suppress the explicit formulation of the dependence of \(r^i\) on \(\varvec{V}\). Consequently, the dynamics of the exchange rate \(S^{i,j}\) is given by the SDE

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}S^{i,j}(t)}{S^{i,j}(t)}&= \Big ((r^i(t)-r^j(t))+2(\varvec{a}^i)^\top \varvec{b}^i -(\varvec{a}^i)^\top \varvec{b}^j-(\varvec{a}^j)^\top \varvec{b}^i\Big .\\&\quad \Big .+(\varvec{a}^i)^\top \mathrm{Diag}(\varvec{V}(t))(\varvec{a}^i-\varvec{a}^j) +(\varvec{b}^i)^\top \mathrm{Diag}^{-1}(\varvec{V}(t))(\varvec{b}^i-\varvec{b}^j)\Big )\mathrm {d}t\\&\quad + ((\varvec{a}^i-\varvec{a}^j)^\top \mathrm{Diag}^{1/2}(\varvec{V}(t))+(\varvec{b}^i-\varvec{b}^j)^\top \mathrm{Diag}^{-1/2}(\varvec{V}(t)))\mathrm {d}\varvec{Z}(t). \end{aligned} \end{aligned}$$
(3.6)

Notice that the dynamics of the exchange rates are fully functionally symmetric w.r.t. the construction of product/ratios thereof, see Gnoatto (2017).
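The drift in (3.6) is the term \(\langle \varvec{\pi }^i-\varvec{\pi }^j,\varvec{\pi }^i\rangle \) of (2.6) written out under (3.4). The following sketch checks this identity numerically for one arbitrary parameter set; the function names and numbers are hypothetical, and the interest rate differential \(r^i-r^j\) is left out since it appears identically on both sides.

```python
import math


def pi_vec(v, a, b):
    """Market price of risk (3.4), componentwise."""
    return [a_k * math.sqrt(v_k) + b_k / math.sqrt(v_k)
            for v_k, a_k, b_k in zip(v, a, b)]


def dot(x, y):
    return sum(x_k * y_k for x_k, y_k in zip(x, y))


def drift_inner_product(v, a_i, b_i, a_j, b_j):
    """<pi^i - pi^j, pi^i>, the risk premium part of the drift in (2.6)."""
    p_i, p_j = pi_vec(v, a_i, b_i), pi_vec(v, a_j, b_j)
    return dot([x - y for x, y in zip(p_i, p_j)], p_i)


def drift_expanded(v, a_i, b_i, a_j, b_j):
    """The same quantity in the expanded form of (3.6)."""
    quad_a = sum(a_i[k] * v[k] * (a_i[k] - a_j[k]) for k in range(len(v)))
    quad_b = sum(b_i[k] / v[k] * (b_i[k] - b_j[k]) for k in range(len(v)))
    cross = 2.0 * dot(a_i, b_i) - dot(a_i, b_j) - dot(a_j, b_i)
    return cross + quad_a + quad_b


# Hypothetical values for d = 2 factors.
args = ([0.5, 2.0], [0.3, -0.1], [0.2, 0.4], [0.1, 0.25], [-0.05, 0.3])
print(drift_inner_product(*args), drift_expanded(*args))
```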

3.2 Strict local martingality

In this subsection, we investigate the conditions under which the ith benchmarked savings account, \(\hat{B}^i(t)=\frac{B^i(t)}{D^i(t)}\), is a strict \({\mathbb {P}}\)-local martingale, \(i=1,\dots ,N\). As observed in Sect. 2.2, \(\hat{B}^i(t)\), after normalization to one at the initial time, corresponds to the Radon–Nikodym derivative for the putative risk neutral measure of the ith currency denomination. Should \(\hat{B}^i(t)\) be a strict \({\mathbb {P}}\)-local martingale, classical risk neutral pricing is not applicable. However, real-world pricing in line with (2.9) is still applicable and provides the minimal possible price, see Platen and Heath (2010).

Given (2.7) and (2.4), the dynamics of \(\hat{B}^i(t)\) are given by the SDE

$$\begin{aligned} \mathrm {d}\hat{B}^i (t)&= - \hat{B}^i(t)((\varvec{a}^i)^\top (\mathrm{Diag}(\varvec{V}(t)))^{1/2}\mathrm {d}\varvec{Z}(t)\nonumber \\&\quad +(\varvec{b}^i)^\top (\mathrm{Diag}(\varvec{V}(t)))^{-1/2}\mathrm {d}\varvec{Z}(t)). \end{aligned}$$
(3.7)

Upon integration of the above SDE, we obtain

$$\begin{aligned} \mathbb {E} \left[ \hat{B}^i (t) \right]&=\hat{B}^i(0) \prod ^d_{k=1} \mathbb {E} \left[ \xi _k^i (t)\right] , \end{aligned}$$

where we define the exponential local martingale process \(\xi _k^i = \left\{ \xi _k^i(t) \, , \, t \ge 0 \right\} \) via

$$\begin{aligned} \xi _k^i (t)&:= \exp \left\{ - \rho _k \int ^t_0 \left( a^i_kV_k(s)^{1/2}+b^i_kV_k(s)^{-1/2}\right) \mathrm {d}W_k(s) \right. \nonumber \\&\quad \left. - \frac{1}{2} \rho _k^2 \int ^t_0 \left( a^i_kV_k(s)^{1/2}+b^i_kV_k(s)^{-1/2}\right) ^2 \mathrm {d}s \right\} . \end{aligned}$$
(3.8)

The putative change of measure with respect to the ith currency denomination involves

$$\begin{aligned} \mathrm {d}\tilde{W}_k(t) = \mathrm{d}W_k(t)+ \rho _k \left( a^i_kV_k(t)^{1/2}+b^i_kV_k(t)^{-1/2}\right) \mathrm {d}t, \end{aligned}$$

where under classical assumptions \(\tilde{W}_k\) should be a Wiener process under the putative risk neutral measure \(\mathbb {Q}^i\). Under this measure, the process \(V_k\) would then solve the SDE

$$\begin{aligned} \mathrm {d}V_k(t)&= \kappa _k ( \theta _k - V_k(t)) \mathrm {d}t - \rho _k \sigma _k \left( a^i_kV_k(t)+b^i_k\right) \mathrm {d}t+ \sigma _k V_k(t)^{\frac{1}{2}} \mathrm {d}\tilde{W}_k(t)\nonumber \\&= \left( \kappa _k \theta _k- \rho _k \sigma _k b^i_k \right) \mathrm {d}t -\kappa _k \left( 1+\frac{\rho _k \sigma _k a^i_k}{\kappa _k}\right) V_k(t)\mathrm {d}t+ \sigma _k V_k(t)^{\frac{1}{2}} \mathrm {d}\tilde{W}_k(t). \end{aligned}$$
(3.9)

Under \(\mathbb {P}\), the process \(V_k\) does not reach 0 if the Feller condition is satisfied, i.e.,

$$\begin{aligned} 2\kappa _k \theta _k \ge \sigma _k^2, \end{aligned}$$

while under the putative risk neutral measure, the process \(V_k\) would not reach 0 if the corresponding Feller condition would be satisfied, that is

$$\begin{aligned} 2\kappa _k \theta _k \ge \sigma _k^2+2\rho _k \sigma _k b^i_k. \end{aligned}$$

Therefore, the process \(V_k\) would have a different behavior at 0 under the two measures, provided that

$$\begin{aligned} \sigma _k^2\le 2\kappa _k \theta _k <\sigma _k^2+2\rho _k \sigma _k b^i_k. \end{aligned}$$
(3.10)

In this case, the putative risk neutral measure would not be an equivalent probability measure and classical risk neutral pricing would not be well-founded.
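Condition (3.10) is straightforward to check for a given factor. The sketch below does so with hypothetical parameter values and function names; note that only \(b^i_k\), and not \(a^i_k\), enters the condition.

```python
def feller_under_p(kappa, theta, sigma):
    """Feller condition under P: 2*kappa*theta >= sigma^2."""
    return 2.0 * kappa * theta >= sigma ** 2


def feller_under_q(kappa, theta, sigma, rho, b):
    """Feller condition under the putative risk neutral measure:
    2*kappa*theta >= sigma^2 + 2*rho*sigma*b."""
    return 2.0 * kappa * theta >= sigma ** 2 + 2.0 * rho * sigma * b


def boundary_behavior_differs(kappa, theta, sigma, rho, b):
    """True when (3.10) holds: V_k stays away from 0 under P but the
    Feller condition fails under the putative risk neutral measure."""
    return (feller_under_p(kappa, theta, sigma)
            and not feller_under_q(kappa, theta, sigma, rho, b))


# Hypothetical single-factor parameters with negative correlation.
factor = dict(kappa=0.5, theta=0.5, sigma=0.6, rho=-0.9)
for b in (-0.4, 0.0, 0.4):
    print(b, boundary_behavior_differs(b=b, **factor))
```

With negative \(\rho _k\), the term \(2\rho _k\sigma _k b^i_k\) is positive only for negative \(b^i_k\), so in this example (3.10) can hold only when \(b^i_k\) is sufficiently negative.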

Remark 3.1

Let us comment on the failure of the martingale property. There is a significant body of literature that studies the failure of the martingale property in the context of stochastic volatility models. A first result in this direction is provided in Andersen and Piterbarg (2007) in their Proposition 2.5, that investigates the cases in which the discounted asset price is a strict local martingale.

Recently, Desmettre et al. (2021) studied changes of the drift in one-dimensional diffusions. Their results are specialized to the Heston model in their Sect. 3. However, it is important to notice that their results start from a different perspective than ours: they state that, in the one-dimensional Heston model, if the Feller condition is satisfied under the given historical measure \(\mathbb {P}\), then the model always admits an equivalent local martingale measure (ELMM). In their analysis, the situation becomes problematic for the Heston model when the Feller condition under the underlying historical measure \(\mathbb {P}\) is violated: their Theorem 3.4 states that, unless the condition in their Eq. (3.3) is satisfied, the Heston model admits no pricing measure. Our Assumption 3.1 states that the Feller condition is satisfied under the underlying measure \(\mathbb {P}\). We can instead obtain a violation of the Feller condition under the candidate martingale measure \(\mathbb {Q}^i\), i.e., the putative risk neutral measure of the ith currency denomination. If this happens, then we conclude that there is no ELMM.

The most general answer to the question whether a local martingale is a uniformly integrable martingale has been provided so far by Hulley and Ruf (2019) in their Theorem 1.1 in terms of sufficient and necessary conditions.

In order to get an intuition of what is the typical path behavior when dealing with true and strict local martingales, we simulate some paths of the Radon–Nikodym derivative for the putative risk neutral measure of the ith currency denomination \(\hat{B}^i(t)=\frac{B^i(t)}{D^i(t)}\) according to the corresponding SDE (3.7), together with the respective quadratic variation processes, for time horizon \(t=10\) years. In this illustration, we consider a one-factor specification of the 4/2 model (i.e., \(d=1\)) and fix the parameters as follows: \(\kappa = 0.49523; \theta = 0.53561;\sigma = 0.67128; V(0) = 1.4338;\rho = -0.89728; a = 0.047360.\) These parameter values were obtained via a calibration to market data as of 22 April 2015. We let the parameter b range in the interval \([-0.4,0.4]\) in order to generate situations in which the process \(\hat{B}^i\) is a true martingale (b positive) or a strict local martingale (b negative). We see in Fig. 1 that the quadratic variation of the strict local martingale process almost explodes from time to time and increases through these upward jumps visually much faster than in the case corresponding to the true martingale process, in line with the well-known unbounded expected quadratic variation process for square integrable strict local martingales, see, e.g., Lemma 5.5.2 in Platen and Heath (2010).

Fig. 1

Simulation of the Radon–Nikodym derivative for the putative risk neutral measure of the ith currency denomination \(\hat{B}^i(t)=\frac{B^i(t)}{D^i(t)}\) given by the SDE (3.7), together with the respective quadratic variation process. The time horizon is \(t=10\) years. We consider a one-factor specification of the model (i.e., \(d=1\)), and we fix the parameters as \(\kappa = 0.49523; \theta = 0.53561;\sigma = 0.67128; V(0) = 1.4338;\rho = -0.89728; a = 0.047360.\) Parameter b ranges over the interval \([-0.4, 0.4]\). For b positive, the process is a true martingale, while for b negative the process is a strict local martingale
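The discretization used to produce Fig. 1 is not specified in the text. The following is a minimal sketch of one way to run such an experiment for \(d=1\), assuming an Euler step for \(V\) with a small positive floor and an exponential (log-Euler) step for \(\hat{B}\) normalized to \(\hat{B}(0)=1\); the function name, step count, seed and floor are hypothetical choices, not the authors' implementation.

```python
import math
import random


def simulate_benchmarked_account(kappa, theta, sigma, v0, rho, a, b,
                                 horizon=10.0, n_steps=2000, seed=0,
                                 v_floor=1e-8):
    """Simulate one path of B_hat from (3.7) with d = 1 and its
    quadratic variation, starting from B_hat(0) = 1."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    v, log_b, qv = v0, 0.0, 0.0
    path, qv_path = [1.0], [0.0]
    for _ in range(n_steps):
        # Correlated increments: d<W, Z> = rho dt, as in (3.3).
        dz = rng.gauss(0.0, math.sqrt(dt))
        dz_perp = rng.gauss(0.0, math.sqrt(dt))
        dw = rho * dz + math.sqrt(1.0 - rho * rho) * dz_perp
        # Diffusion coefficient of (3.7): a*sqrt(V) + b/sqrt(V).
        g = a * math.sqrt(v) + b / math.sqrt(v)
        # Quadratic variation increment: (B_hat * g)^2 dt.
        qv += (math.exp(log_b) * g) ** 2 * dt
        # Exponential step keeps B_hat nonnegative.
        log_b += -g * dz - 0.5 * g * g * dt
        # Euler step for the CIR factor (3.1), floored away from zero
        # (an assumption of this sketch).
        v = max(v + kappa * (theta - v) * dt + sigma * math.sqrt(v) * dw,
                v_floor)
        path.append(math.exp(log_b))
        qv_path.append(qv)
    return path, qv_path


params = dict(kappa=0.49523, theta=0.53561, sigma=0.67128,
              v0=1.4338, rho=-0.89728, a=0.047360)
for b in (-0.4, 0.4):
    path, qv = simulate_benchmarked_account(b=b, **params)
    print(f"b={b:+.1f}: B_hat(10)={path[-1]:.4f}, QV(10)={qv[-1]:.4f}")
```

A single simulated path cannot by itself distinguish a true martingale from a strict local martingale; the sketch only reproduces the kind of pathwise comparison described above.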

Finally, let us observe that we can compute the prices of zero coupon bonds for all currency denominations, meaning that it is a priori possible to devise a model for long-dated FX products, in the spirit of Gnoatto and Grasselli (2014), where a joint calibration to FX surfaces and yield curves is performed. Depending on the parameter values, our general framework may be interpreted both from the point of view of real-world pricing and classical risk neutral valuation, respectively:

  • Should market data imply the existence of a risk neutral probability measure for the ith currency denomination, then it would be possible to equivalently employ the ith money market account as numéraire.

  • In the other case, i.e., when risk neutral pricing is not possible for the ith currency denomination due to the strict local martingale property of the ith benchmarked money market account, then discounting should be performed via the GOP.

3.3 The 4/2 model as a unifying framework

In this subsection, we show how the 4/2 model unifies the 3/2 and the Heston model. To achieve this, we consider, for simplicity, two currencies with \(D^1(t)\) denoting the GOP in domestic currency and \(D^2(t)\) denoting the GOP in foreign currency. For example, \(S^{1,2}(t)=D^1(t)/D^2(t)\) can follow a stochastic volatility model of Heston type (see Heston 1993), where

$$\begin{aligned} \frac{\mathrm {d}S^{1,2}(t)}{S^{1,2}(t)}&= \left( r^1(t) - r^2(t)\right) \mathrm {d}t + \sqrt{V(t)}\left( \mathrm {d}Z(t) + \lambda (t)\mathrm {d}t\right) ,\nonumber \\ S^{1,2}(0)&= s^{1,2}>0,\nonumber \\ \mathrm {d}V(t)&= \kappa ( \theta - V(t))\mathrm {d}t + \sigma V(t)^{1/2} \left( \rho \mathrm {d}Z(t)+\sqrt{1-\rho ^2}\mathrm{d}Z^{\bot }(t)\right) ,\nonumber \\ V(0)&= v>0. \end{aligned}$$
(3.11)

Here, \(Z^{\bot }=\left\{ Z^{\bot }(t), \ 0\le t \le {\bar{T}}\right\} \) is a \(\mathbb {P}\)-Brownian motion independent of Z, the parameters satisfy \(\kappa>0,\theta >0,\rho \in [-1,1]\), and \(r^1,r^2\) and \(\lambda \) are predictable processes.
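To make the dynamics (3.11) concrete, the following is a minimal simulation sketch under \(\mathbb {P}\). It is not part of the paper's methodology: it assumes constant interest rates, the particular choice \(\lambda (t)=a\sqrt{V(t)}\), and a full-truncation Euler discretization, and the function name `simulate_heston_fx` and all argument names are hypothetical.

```python
import numpy as np


def simulate_heston_fx(s0, v0, kappa, theta, sigma, rho, r_dom, r_for, a,
                       horizon, n_steps, n_paths, seed=0):
    """Full-truncation Euler scheme for (3.11) under P.

    Assumptions (not from the paper): constant rates r_dom, r_for and
    market price of risk lambda(t) = a * sqrt(V(t)).
    Returns terminal values (S(T), V(T)) for each path.
    """
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    log_s = np.full(n_paths, np.log(s0))
    v = np.full(n_paths, float(v0))
    for _ in range(n_steps):
        v_pos = np.maximum(v, 0.0)          # truncate negative excursions
        sqrt_v = np.sqrt(v_pos)
        dz = rng.standard_normal(n_paths) * np.sqrt(dt)
        dz_perp = rng.standard_normal(n_paths) * np.sqrt(dt)
        lam = a * sqrt_v
        # Ito: d log S = (r1 - r2 + sqrt(V) * lambda - V / 2) dt + sqrt(V) dZ
        log_s += (r_dom - r_for + sqrt_v * lam - 0.5 * v_pos) * dt + sqrt_v * dz
        v += kappa * (theta - v_pos) * dt + sigma * sqrt_v * (
            rho * dz + np.sqrt(1.0 - rho**2) * dz_perp)
    return np.exp(log_s), np.maximum(v, 0.0)
```

Simulating the logarithm keeps the exchange rate positive by construction; the truncation of the variance at zero is a discretization device only and is not needed for the continuous-time process when the Feller condition holds.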

In the remainder of the section, we will repeatedly employ the following terminology: Heston (type) model, 3/2 (type) model, 4/2 (type) model. Let us now clarify the respective models. In the following, we consider \(a,b\in \mathbb {R}\), and let \(D^i=\left\{ D^i(t),\ 0\le t\le {\bar{T}}\right\} \) denote a generic place-holder for a GOP process satisfying a scalar diffusive stochastic differential equation (SDE). Moreover, let \(V=\left\{ V(t),\ 0\le t\le {\bar{T}}\right\} \) be a square root process as given in (3.11). A model is said to be of Heston type (resp. of 3/2 type, resp. of 4/2 type) if the diffusion coefficient in the dynamics of the GOP \(D^i\) is proportional to \(a\sqrt{V(t)}\) (resp. \(b/\sqrt{V(t)}\), resp. \((a\sqrt{V(t)} + b/\sqrt{V(t)})\)).

The question we would like to address is the following: Given a specification of the market price of risk process \(\lambda \) for the domestic currency denomination of securities, what are the associated dynamics of the domestic and foreign specifications of the GOP?

Lemma 3.1

Consider a two-currency model, where the exchange rate model for \(S^{1,2}\) is of Heston type (3.11). The following statements hold true:

  1. If \(\lambda (t)=a\sqrt{V(t)}\), \(a\in \mathbb {R}\), then the GOP denominations \(D^1\) and \(D^2\) both follow Heston-type models.

  2. If \(\lambda (t) = \frac{b}{\sqrt{V(t)}}\), \(b\in \mathbb {R}\), then the GOP denomination \(D^1\) follows a 3/2 model, whereas \(D^2\) follows a 4/2 model.

  3. If \(\lambda (t) = a\sqrt{V(t)}+ \frac{b}{\sqrt{V(t)}}\), \(a,b\in \mathbb {R}\), then the GOP denominations \(D^1\) and \(D^2\) both follow 4/2 type models.

The proof for this result is given in “Appendix A.”

Note that if we had started in (3.11) with the volatility \(1/\sqrt{V(t)}\), then for \(\lambda (t) = b/\sqrt{V(t)}\) we would have always fallen into the class of 4/2 type models. Lemma 3.1 highlights an interesting interplay between several well-known financial models. It shows that the 4/2 model arises naturally from a standard Heston model when the market price of risk belongs to the essentially affine class (see Duffee 2002). Furthermore, it demonstrates that the 4/2 model provides a general framework that nests other popular model choices.
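The mechanism behind Lemma 3.1 can be sketched in a few lines. The sketch assumes, as is standard under the benchmark approach but not restated in this section, that the volatility of the GOP with respect to Z coincides with the market price of risk \(\lambda \) and that \(Z^{\bot }\) carries no risk premium; the complete argument is the one given in Appendix A. Under these assumptions, and using \(D^2(t)=D^1(t)/S^{1,2}(t)\) together with Itô's formula applied to (3.11), one obtains

$$\begin{aligned} \frac{\mathrm {d}D^1(t)}{D^1(t)}&= r^1(t)\mathrm {d}t + \lambda (t)\left( \lambda (t)\mathrm {d}t + \mathrm {d}Z(t)\right) ,\\ \frac{\mathrm {d}D^2(t)}{D^2(t)}&= r^2(t)\mathrm {d}t + \left( \lambda (t)-\sqrt{V(t)}\right) \left( \left( \lambda (t)-\sqrt{V(t)}\right) \mathrm {d}t + \mathrm {d}Z(t)\right) . \end{aligned}$$

Hence the diffusion coefficient of \(D^1\) is \(\lambda (t)\) and that of \(D^2\) is \(\lambda (t)-\sqrt{V(t)}\). For \(\lambda (t)=a\sqrt{V(t)}\) these are \(a\sqrt{V(t)}\) and \((a-1)\sqrt{V(t)}\), both of Heston type. For \(\lambda (t)=b/\sqrt{V(t)}\) they are \(b/\sqrt{V(t)}\) (3/2 type) and \(-\sqrt{V(t)}+b/\sqrt{V(t)}\) (4/2 type). For \(\lambda (t)=a\sqrt{V(t)}+b/\sqrt{V(t)}\) both coefficients are of 4/2 type. The same shift \(\lambda -\sqrt{V}\) appears in the definition of \(Z^{\mathbb {Q}^{2}}\) in (3.12).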

Different specifications of the market price of risk affect not only the shape of the GOP dynamics. In fact, depending on the calibrated values of the model parameters, we may encounter situations where classical risk neutral pricing is no longer possible because an equivalent risk neutral probability measure does not exist. To see this, we observe that from a direct inspection of the dynamics in (3.11) in the Heston model setting, it is tempting to define the following two continuous processes

$$\begin{aligned} Z^{\mathbb {Q}^{1}}(t)&:=Z(t)+\int _0^t\lambda (s)\mathrm {d}s\nonumber \\ Z^{\mathbb {Q}^{2}}(t)&:=Z(t)+\int _0^t\left( \lambda (s)-\sqrt{V(s)}\right) \mathrm {d}s, \end{aligned}$$
(3.12)

which, if the assumptions of the Girsanov theorem were in both cases fulfilled, would then be \(\mathbb {Q}^1\)- (resp. \(\mathbb {Q}^2\)-) Brownian motions. Let us assume that under the real-world probability measure \(\mathbb {P}\), the Feller condition (see Karatzas and Shreve 1991, Section 5.5) is fulfilled by the parameters of the volatility process V, i.e., we have \(2\kappa \theta -\sigma ^2\ge 0\), so that the square root process V remains strictly positive \(\mathbb {P}\)-a.s. for all \( t\in [0,{\bar{T}}]\). The following lemma shows that, depending on the specification of \(\lambda \), it is possible to obtain a variance process V under the putative risk neutral measure that may not satisfy the Feller condition, implying that the putative risk neutral measure may fail to be equivalent to the real-world probability measure.

Lemma 3.2

Consider a two-currency model, where the dynamics of the exchange rate \(S^{1,2}\) is of the Heston type (3.11), such that the variance process V fulfills the Feller condition, i.e., \(2\kappa \theta -\sigma ^2\ge 0\). Let \(Z^{\mathbb {Q}^{1}},Z^{\mathbb {Q}^{2}}\), as in (3.12), be the candidate Brownian motions under the putative risk neutral measures \(\mathbb {Q}^1\) and \(\mathbb {Q}^2\), respectively. The following holds:

  1. For the putative risk neutral measure \(\mathbb {Q}^1\), we get:

    (a) If \(\lambda (t)=a\sqrt{V(t)}\), \(a\in \mathbb {R}\), then the drift of the variance process V under \(\mathbb {Q}^1\) is

      $$\begin{aligned} \kappa \left( \theta -V(t)\right) -\sigma \rho aV(t) \end{aligned}$$

      and the Feller condition is always satisfied under \(\mathbb {Q}^1\), which is then a true equivalent martingale measure.

    (b) If \(\lambda (t) = a\sqrt{V(t)}+ \frac{b}{\sqrt{V(t)}}\), \(a,b\in \mathbb {R}\), then the drift of the variance process V under \(\mathbb {Q}^1\) equals

      $$\begin{aligned} \kappa \left( \theta -V(t)\right) -\sigma \sqrt{V(t)}\rho \left( a\sqrt{V(t)}+ \frac{b}{\sqrt{V(t)}}\right) \end{aligned}$$

      and the Feller condition may be violated, implying that \(\mathbb {Q}^1\) may not be equivalent to \(\mathbb {P}\).

    (c) If \(\lambda (t) = \frac{b}{\sqrt{V(t)}}\), \(b\in \mathbb {R}\), then the drift of the variance process V under \(\mathbb {Q}^1\) is

      $$\begin{aligned} \kappa \left( \theta -V(t)\right) -\sigma \rho b \end{aligned}$$

      and the Feller condition may be violated, implying that \(\mathbb {Q}^1\) may not be equivalent to \(\mathbb {P}\).

  2. For the putative risk neutral measure \(\mathbb {Q}^2\), we have:

    (a) If \(\lambda (t)=a\sqrt{V(t)}\), \(a\in \mathbb {R}\), then the drift of the variance process V under \(\mathbb {Q}^2\) equals

      $$\begin{aligned} \kappa \left( \theta -V(t)\right) -\sigma \rho (a-1)V(t) \end{aligned}$$

      and the Feller condition is always satisfied under \(\mathbb {Q}^2\), which is then a true equivalent martingale measure.

    (b) If \(\lambda (t) = a\sqrt{V(t)}+ \frac{b}{\sqrt{V(t)}}\), \(a,b\in \mathbb {R}\), then the drift of the variance process V under \(\mathbb {Q}^2\) equals

      $$\begin{aligned} \kappa \left( \theta -V(t)\right) -\sigma \sqrt{V(t)}\rho \left( (a-1)\sqrt{V(t)}+ \frac{b}{\sqrt{V(t)}}\right) \end{aligned}$$

      and the Feller condition may be violated, in which case \(\mathbb {Q}^2\) would not be equivalent to \(\mathbb {P}\).

    (c) If \(\lambda (t) = \frac{b}{\sqrt{V(t)}}\), \(b\in \mathbb {R}\), then the drift of the variance process V under \(\mathbb {Q}^2\) is

      $$\begin{aligned} \kappa \left( \theta -V(t)\right) -\sigma \rho b +\sigma \rho V(t) \end{aligned}$$

      and the Feller condition may be violated, in which case \(\mathbb {Q}^2\) would not be equivalent to \(\mathbb {P}\).

The proof of these statements is straightforward using our previous notation and relationships and is therefore omitted.
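The drifts listed in Lemma 3.2 all have the form \(c - sV(t)\) with \(c=\kappa \theta -\sigma \rho b\), so the Feller test under a putative measure reduces to checking \(2c\ge \sigma ^2\). The sketch below spells this out. The function names are hypothetical, and the numerical values are those quoted in the caption of Fig. 1, used here purely as illustrative inputs: that caption refers to the specification (3.7), not to the Heston setting of Lemma 3.2, so the output should not be read as a reproduction of the figure.

```python
def variance_drift_under_q(kappa, theta, sigma, rho, a, b, measure):
    """Drift of V under Q^1 (measure=1) or Q^2 (measure=2) as in Lemma 3.2,
    written as const - speed * V for lambda = a*sqrt(V) + b/sqrt(V).

    Returns the pair (const, speed).
    """
    if measure not in (1, 2):
        raise ValueError("measure must be 1 or 2")
    shift = 0.0 if measure == 1 else 1.0      # Q^2 uses lambda - sqrt(V)
    const = kappa * theta - sigma * rho * b
    speed = kappa + sigma * rho * (a - shift)
    return const, speed


def feller_holds(const, sigma):
    """Feller condition 2 * const >= sigma^2 for dV = (const - speed*V)dt + sigma*sqrt(V)dW."""
    return 2.0 * const >= sigma**2


# Illustrative inputs (values quoted in the caption of Fig. 1).
params = dict(kappa=0.49523, theta=0.53561, sigma=0.67128, rho=-0.89728, a=0.04736)
for b in (-0.4, 0.0, 0.4):
    for measure in (1, 2):
        const, speed = variance_drift_under_q(b=b, measure=measure, **params)
        print(f"b={b:+.1f}  Q^{measure}: 2c={2 * const:.4f}  "
              f"sigma^2={params['sigma']**2:.4f}  Feller={feller_holds(const, params['sigma'])}")
```

Only the constant part of the drift enters the test, which is why the cases with \(b=0\) always inherit the Feller condition from \(\mathbb {P}\), whereas a nonzero b can break it depending on the sign of \(\rho b\).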

4 Valuation of derivatives

In the present section, we solve the valuation problem for various contingent claims. The general valuation tool will be given by the real-world pricing formula (2.9). We concentrate on plain vanilla European FX options, for which a semi-closed form valuation is available by means of Fourier techniques. These require the knowledge of the characteristic function of the log-underlying, given below in Theorem 4.1, which provides as a by-product a closed form valuation formula for benchmarked zero-coupon bonds.

We first provide the calculation of the discounted conditional Fourier/Laplace transform of \(x^{i,j}(t):=\ln (S^{i,j}(t))\), which will be useful for option pricing purposes. Let us consider a European call option \(C(S^{i,j}(t),K^{i,j},\tau )\) at time t, \( i,j=1,\dots ,N,i\not =j,\) on a generic exchange rate process \(S^{i,j}\) with strike \(K^{i,j}\), maturity \(T=t+\tau \) and face value equal to one unit of the foreign currency. We denote by \(Y^i(t)=\log (D^i(t))\) the logarithm of the GOP in ith currency denomination. Hence, the log-exchange rate may be written as \(x^{i,j}(t)=\log (S^{i,j}(t))=Y^i(t)-Y^j(t)\). Let us introduce the following conditional expectation

$$\begin{aligned} \begin{aligned} \phi ^{i,j}_{t,T}( z)&= D^{i}(t) \mathbb {E}\left[ \left. \frac{1}{D^{i}(T)}e^{\mathtt {i} z x^{i,j}(T)}\right| \mathcal {F}_t \right] \\&=e^{Y^i(t)}\mathbb {E}\left[ \left. e^{-Y^i(T)+\mathtt {i} z(Y^i(T)-Y^j(T))}\right| \mathcal {F}_t \right] , \end{aligned} \end{aligned}$$
(4.1)

for \(\mathtt {i}=\sqrt{-1}\). For \( z=u\in \mathbb {R}\), we will use the terminology of a discounted characteristic function, whereas for \( z\in \mathbb {C}\) when the expectation exists, the function \( \phi ^{i,j}_{t,T}\) will be called a generalized discounted characteristic function. If we denote by \(\varPsi _{t,T}(\zeta )\) the joint conditional (generalized) characteristic function of the vector of log-GOP denominations \(Y(T)=(Y^1(T),\dots ,Y^N(T))\), that is

$$\begin{aligned} \varPsi _{t,T}(\zeta ):=\mathbb {E}^{}\left[ \left. e^{\mathtt {i}\langle \zeta ,Y(T)\rangle }\right| \mathcal {F}_t\right] , \ \ \zeta \in \mathbb {C}^N, \end{aligned}$$
(4.2)

then we have

$$\begin{aligned} \phi ^{i,j}_{t,T}( z)=D^i(t)\varPsi _{t,T}(\zeta ), \end{aligned}$$
(4.3)

for \(\zeta \) being a vector with \(\zeta _i=z+\mathtt {i}\), \(\zeta _j=-z\) and all other entries being equal to zero. Now, from the real-world pricing formula (2.9), the time t price of a call option can be written as the following expected value:

$$\begin{aligned} C(S^{i,j}(t),K^{i,j},\tau )=D^i(t)\mathbb {E}\left[ \left. \frac{1}{D^i(T)}\left( S^{i,j}(T)-K^{i,j}\right) ^+\right| \mathcal {F}_t\right] . \end{aligned}$$
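Returning to relation (4.3), the choice of \(\zeta \) can be checked mechanically. The sketch below builds the vector \(\zeta \) for a given pair of currencies and verifies on an arbitrary sample of \(Y(T)\) that the two sides of (4.3) coincide. The sample is synthetic Gaussian noise and has nothing to do with the model; indices are 0-based, and all function names are hypothetical.

```python
import numpy as np


def zeta_vector(z, i, j, n_currencies):
    """Vector zeta of (4.3): entry i equals z + i, entry j equals -z, others 0.
    Indices i, j are 0-based here."""
    zeta = np.zeros(n_currencies, dtype=complex)
    zeta[i] = z + 1j
    zeta[j] = -z
    return zeta


def joint_cf(zeta, y_samples):
    """Sample analogue of Psi in (4.2): mean of exp(i <zeta, Y(T)>)."""
    return np.mean(np.exp(1j * (y_samples @ zeta)))


def discounted_cf(z, i, j, y_t_i, y_samples):
    """Sample analogue of phi^{i,j} in (4.1), with Y^i(t) = y_t_i."""
    x_T = y_samples[:, i] - y_samples[:, j]
    return np.exp(y_t_i) * np.mean(np.exp(-y_samples[:, i] + 1j * z * x_T))


rng = np.random.default_rng(0)
y_samples = 0.2 * rng.standard_normal((1000, 3))   # synthetic Y(T), N = 3
z = 0.7 - 0.5j
lhs = discounted_cf(z, i=0, j=2, y_t_i=0.1, y_samples=y_samples)
rhs = np.exp(0.1) * joint_cf(zeta_vector(z, 0, 2, 3), y_samples)
print(abs(lhs - rhs))
```

The identity is purely algebraic, since \(\mathtt {i}(z+\mathtt {i})Y^i-\mathtt {i}zY^j=-Y^i+\mathtt {i}z(Y^i-Y^j)\), so the two numbers agree up to floating-point rounding for any sample.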

Following Lewis (2001), we know that option prices may be interpreted as a convolution of the payoff and the probability density function of the (log)-underlying. As a consequence, the pricing of a derivative may be solved in Fourier space by relying on the Plancherel/Parseval identity, see Lewis (2001), where we have for \(f,g\in L^2(\mathbb {R},\mathbb {C})\)

$$\begin{aligned} \int _{-\infty }^{\infty }\overline{f(x)}g(x)\mathrm{d}x=\frac{1}{2\pi }\int _{-\infty }^{\infty }\overline{\hat{f}(u)}\hat{g}(u)\mathrm{d}u \end{aligned}$$

for \(u\in \mathbb {R}\) and \(\hat{f},\hat{g}\) denoting the Fourier transforms of \(f\) and \(g\), respectively. Applying the reasoning above in an option pricing setting requires some additional care. In fact, most payoff functions do not admit a Fourier transform in the classical sense. For example, it is well-known that for the call option one has

$$\begin{aligned} \varPhi ( z)=\int _{\mathbb {R}}e^{\mathtt {i} z x}\left( e^{x}-K^{i,j}\right) ^+\mathrm {d}x=-\frac{\left( K^{i,j}\right) ^{\mathtt {i} z+1}}{ z( z-\mathtt {i})}, \end{aligned}$$

provided we let \( z\in \mathbb {C}\) with \(\mathrm {Im}( z)>1\), meaning that \(\varPhi ( z)\) is the Fourier transform of the payoff function in the generalized sense. Such restrictions must be coupled with those that identify the domain where the generalized characteristic function of the log-price is well defined. The reasoning we just reported is developed in Theorem 3.2 in Lewis (2001), where the following general formula is presented (here, we write \(\phi ^{i,j}\) for \(\phi ^{i,j}_{t,T}\) in order to simplify notation):

$$\begin{aligned} C(S^{i,j}(t),K^{i,j},\tau )&=\frac{1}{ 2\pi }\int _{\mathcal {Z}}\phi ^{i,j}(- z)\varPhi ( z)\mathrm {d}z, \end{aligned}$$
(4.4)

with \(\mathcal {Z}\) denoting the line in the complex plane, parallel to the real axis, where the integration is performed. Carr and Madan (1999) followed a different procedure by introducing the concept of a dampened option price. However, as Lewis (2001) and Lee (2004) point out, this alternative approach is just a particular case of the first one. In Lee (2004), the Fourier representation of option prices is extended to the case where interest rates are stochastic. Moreover, the shifting of contours, pioneered by Lewis (2001), is employed to prove Theorem 5.1 in Lee (2004). There the following general option pricing formula is presented:

$$\begin{aligned}&C(S^{i,j}(t), K^{i,j},\tau )\nonumber \\&\quad =R\left( S^{i,j}(t),K^{i,j},\alpha \right) +\frac{1}{\pi }\int _{0-\mathtt {i}\alpha }^{\infty -\mathtt {i}\alpha }\mathrm {Re}\left( e^{-\mathtt {i} z k^{i,j}}\frac{\phi ^{i,j}( z-\mathtt {i})}{- z( z-\mathtt {i})}\right) \mathrm{d} z. \end{aligned}$$
(4.5)

Here, \(k^{i,j}=\log (K^{i,j})\), the parameter \(\alpha \) fixes the contour of integration (the horizontal line \(\mathrm {Im}(z)=-\alpha \)) and the term coming from the application of the residue theorem is given by

$$\begin{aligned} R\left( S^{i,j}(t),K^{i,j},\alpha \right) = {\left\{ \begin{array}{ll} \phi ^{i,j}(-\mathtt {i})-K^{i,j}\phi ^{i,j}(0), &{} \text{ if } \alpha<-1 \\ \phi ^{i,j}(-\mathtt {i})-\frac{K^{i,j}}{2}\phi ^{i,j}(0), &{} \text{ if } \alpha =-1 \\ \phi ^{i,j}(-\mathtt {i}) &{} \text{ if } -1<\alpha <0\\ \frac{1}{2}\phi ^{i,j}(-\mathtt {i})&{}\text{ if } \alpha = 0\\ 0&{}\text{ if } \alpha >0. \end{array}\right. } \end{aligned}$$
(4.6)
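Formulas (4.5) and (4.6) can be sanity-checked on a case where the answer is known. The sketch below evaluates (4.5) with \(-1<\alpha <0\), so that \(R=\phi ^{i,j}(-\mathtt {i})\), using a plain trapezoidal rule and a lognormal (Garman-Kohlhagen) discounted characteristic function with constant rates as a stand-in for \(\phi ^{i,j}\), and compares the result with the closed-form price. The stand-in model, the function names and the numerical inputs are assumptions made for illustration only; they are not the 4/2 transform of this paper.

```python
import math
import numpy as np


def gk_discounted_cf(z, spot, r_dom, r_for, vol, tau):
    """Discounted characteristic function of log S(T) in a lognormal model
    with constant rates (illustrative stand-in for phi^{i,j})."""
    mean = math.log(spot) + (r_dom - r_for - 0.5 * vol**2) * tau
    return np.exp(-r_dom * tau + 1j * z * mean - 0.5 * vol**2 * tau * z**2)


def lee_call_price(phi, strike, alpha=-0.5, u_max=200.0, n_grid=20001):
    """Call price via (4.5) for -1 < alpha < 0, where R = phi(-i) by (4.6).
    The integral is truncated at u_max and computed by the trapezoidal rule."""
    if not -1.0 < alpha < 0.0:
        raise ValueError("this sketch only covers -1 < alpha < 0")
    k = math.log(strike)
    u = np.linspace(0.0, u_max, n_grid)
    z = u - 1j * alpha                       # contour from 0 - i*alpha
    integrand = np.real(np.exp(-1j * z * k) * phi(z - 1j) / (-z * (z - 1j)))
    du = u[1] - u[0]
    integral = du * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return float(np.real(phi(-1j))) + integral / math.pi


def gk_call_price(spot, strike, r_dom, r_for, vol, tau):
    """Closed-form Garman-Kohlhagen call price, used as the benchmark."""
    d1 = (math.log(spot / strike) + (r_dom - r_for + 0.5 * vol**2) * tau) / (vol * math.sqrt(tau))
    d2 = d1 - vol * math.sqrt(tau)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return spot * math.exp(-r_for * tau) * cdf(d1) - strike * math.exp(-r_dom * tau) * cdf(d2)


# Illustrative inputs, not taken from the paper.
spot, strike, r_dom, r_for, vol, tau = 1.29, 1.30, 0.01, 0.005, 0.12, 1.0
phi = lambda z: gk_discounted_cf(z, spot, r_dom, r_for, vol, tau)
price_fourier = lee_call_price(phi, strike)
price_closed = gk_call_price(spot, strike, r_dom, r_for, vol, tau)
print(price_fourier, price_closed)
```

Replacing `phi` by an implementation of the discounted characteristic function of another model leaves `lee_call_price` unchanged, provided the chosen \(\alpha \) lies in the region where that function is well defined.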

The following theorem provides the explicit computation of the generalized discounted characteristic function.

Theorem 4.1

The joint conditional generalized characteristic function \(\varPsi _{t,T}(\cdot )\) in (4.2) is given by

$$\begin{aligned} \varPsi _{t,T}(\zeta )&=\exp \left\{ \sum _{i=1}^N \mathtt {i}\zeta ^i \left( Y^i(t)+h^i (T-t) \right) \right\} \nonumber \\&\quad \times \prod _{k=1}^d\exp \left\{ \sum _{i=1}^N \Bigg [(T-t)\Bigg (\frac{1-\rho ^2_k}{2}\sum _{j=1}^N \mathtt {i}\zeta ^i\mathtt {i}\zeta ^j\left( a^i_kb^j_k+a^j_kb^i_k\right) \Bigg .\Bigg .\right. \nonumber \\&\quad \Bigg .+\mathtt {i}\zeta ^ia^i_kb^i_k+\mathtt {i}\zeta ^i\frac{\kappa _k\rho _k}{\sigma _k}\left( b^i_k-\theta _ka^i_k\right) \Bigg )\nonumber \\&\quad \Bigg .\Bigg .-V_k(t)\mathtt {i}\zeta ^i\frac{\rho _k a^i_k}{\sigma _k}-\mathtt {i}\zeta ^i\frac{\rho _kb^i_k}{\sigma _k}\log (V_k(t))\Bigg ]\Bigg \}\nonumber \\&\quad \times \left( \frac{\beta _k(t,V_k)}{2}\right) ^{m_k+1} V_k(t)^{-\frac{\kappa _k\theta _k}{\sigma _k^2}} (\lambda _k+K_k(t))^{-\left( \frac{1}{2}+\frac{m_k}{2}-\alpha _k +\frac{\kappa _k\theta _k}{\sigma _k^2}\right) }\nonumber \\&\quad \times e^{\frac{1}{\sigma _k^2}\left( \kappa _k^2\theta _k(T-t) - \sqrt{A_k}V_k(t)\coth \left( \frac{\sqrt{A_k}(T-t)}{2}\right) +\kappa _k V_k(t)\right) } \frac{\Gamma \left( \frac{1}{2}+\frac{m_k}{2}-\alpha _k +\frac{\kappa _k\theta _k}{\sigma _k^2}\right) }{\Gamma (m_k+1)}\nonumber \\&\quad \times {}_1F_1 \left( \frac{1}{2}+\frac{m_k}{2}-\alpha _k +\frac{\kappa _k\theta _k}{\sigma _k^2}, m_k+1, \frac{\beta _k^2(t,V_k)}{4(\lambda _k +K_k(t))}\right) , \end{aligned}$$
(4.7)

where

$$\begin{aligned} m_k= & {} \frac{2}{\sigma _k^2}\sqrt{\left( \kappa _k\theta _k-\frac{\sigma _k^2}{2}\right) ^2+2\sigma _k^2 \nu _k},\\ A_k= & {} \kappa _k^2 +2\mu _k\sigma _k^2,\\ \beta _k (t, x)= & {} \frac{2\sqrt{A_k x}}{\sigma _k^2 \sinh \left( \frac{\sqrt{A_k}(T-t)}{2}\right) },\\ K_k(t)= & {} \frac{1}{\sigma _k^2}\left( \sqrt{A_k}\coth \left( \frac{\sqrt{A_k}(T-t)}{2}\right) +\kappa _k\right) , \end{aligned}$$

and

$$\begin{aligned} \alpha _k&=-\frac{\rho _k}{\sigma _k}\sum _{i=1}^N\mathtt {i}\zeta ^ib^i_k \end{aligned}$$
(4.8)
$$\begin{aligned} \lambda _k&=-\frac{\rho _k}{\sigma _k}\sum _{i=1}^N \mathtt {i}\zeta ^ia_k^i \end{aligned}$$
(4.9)
$$\begin{aligned} \mu _k&=-\sum _{i=1}^N\left( \mathtt {i}\zeta ^iH^i_k+\frac{\mathtt {i}\zeta ^i}{2}(a^i_k)^2+\frac{1-\rho ^2_k}{2}\sum _{j=1}^N\mathtt {i}\zeta ^i\mathtt {i}\zeta ^ja^i_ka^j_k+\mathtt {i}\zeta ^i\rho _ka^i_k\frac{\kappa _k}{\sigma _k}\right) \end{aligned}$$
(4.10)
$$\begin{aligned} \nu _k&=-\sum _{i=1}^N\left( \mathtt {i}\zeta ^iG^i_k+\frac{\mathtt {i}\zeta ^i}{2}(b^i_k)^2+\frac{1-\rho ^2_k}{2}\sum _{j=1}^N\mathtt {i}\zeta ^i\mathtt {i}\zeta ^jb^i_kb^j_k\right. \nonumber \\&\quad \left. -\frac{\mathtt {i}\zeta ^i\rho _kb^i_k}{\sigma _k}\left( \kappa _k\theta _k-\frac{\sigma ^2_k}{2}\right) \right) . \end{aligned}$$
(4.11)

Consider the functions

$$\begin{aligned} f^1_k(-\mathrm {Im}(\varvec{\zeta }))&:=\kappa _k^2+2\sigma _k^2\left( -\sum _{i=1}^N\left[ -\mathrm {Im}(\zeta ^i)H^i_k \right. \right. \\&\quad \left. \left. -\frac{\mathrm {Im}(\zeta ^i)}{2}(a^i_k)^2+\frac{1-\rho ^2_k}{2}\sum _{j=1}^N\mathrm {Im}(\zeta ^i)\mathrm {Im}(\zeta ^j)a^i_ka^j_k-\mathrm {Im}(\zeta ^i)\rho _ka^i_k\frac{\kappa _k}{\sigma _k}\right] \right) \\ f^2_k(-\mathrm {Im}(\varvec{\zeta }))&:=\left( \kappa _k\theta _k-\frac{\sigma _k^2}{2}\right) ^2+2\sigma _k^2 \left( -\sum _{i=1}^N\left[ -\mathrm {Im}(\zeta ^i)G^i_k-\frac{\mathrm {Im}(\zeta ^i)}{2}(b^i_k)^2\right. \right. \\&\quad \left. \left. +\frac{1-\rho ^2_k}{2}\sum _{j=1}^N\mathrm {Im}(\zeta ^i)\mathrm {Im}(\zeta ^j)b^i_kb^j_k+\frac{\mathrm {Im}(\zeta ^i)\rho _kb^i_k}{\sigma _k}\left( \kappa _k\theta _k-\frac{\sigma ^2_k}{2}\right) \right] \right) \\ f^3_k(-\mathrm {Im}(\varvec{\zeta }))&:=\frac{\kappa _k\theta _k+\frac{\sigma _k^2}{2}+\sqrt{f^2_k(-\mathrm {Im}(\varvec{\zeta }))}}{\sigma _k^2}\\ f^4_k(-\mathrm {Im}(\varvec{\zeta }))&:=\frac{\rho _k}{\sigma _k}\sum _{i=1}^N \mathrm {Im}(\zeta ^i)a^i_k+\frac{\sqrt{f^2_k(-\mathrm {Im}(\varvec{\zeta }))}+\kappa _k}{\sigma _k^2}, \end{aligned}$$

in conjunction with the following conditions

  (i) \(f^1_k(-\mathrm {Im}(\varvec{\zeta }))>0, \ \forall k=1,\ldots , d\);

  (ii) \(f^2_k(-\mathrm {Im}(\varvec{\zeta }))\ge 0, \ \forall k=1,\ldots , d\);

  (iii) \(f^3_k(-\mathrm {Im}(\varvec{\zeta }))>0, \ \forall k=1,\ldots , d\);

  (iv) \(f^4_k(-\mathrm {Im}(\varvec{\zeta }))\ge 0, \ \forall k=1,\ldots , d\).

The transform formula (4.7) is well defined for all \(t\in [0,T]\) when the complex vector \(\mathtt {i}\varvec{\zeta }\) belongs to the strip \(\mathcal {D}_{t,+\infty }=\mathcal {A}_{t,+\infty }\times \mathtt {i}\mathbb {R}^N\subset \mathbb {C}^N\), where the convergence set \(\mathcal {A}_{t,+\infty }\subset \mathbb {R}^N\) is given by

$$\begin{aligned} \mathcal {A}_{t,+\infty }:=\left\{ \left. -\mathrm {Im}(\varvec{\zeta })\in \mathbb {R}^N\right| \ f^l_k(-\mathrm {Im}(\varvec{\zeta })), \ l=1,\ldots ,4\ \text {satisfying (i)}-{\text {(iv)}}\right\} \end{aligned}$$

Moreover, for \(\mathtt {i}\varvec{\zeta }\in \mathcal {D}_{t,t^\star }=\mathcal {A}_{t,t^\star }\times \mathtt {i}\mathbb {R}^N\) with

$$\begin{aligned} \mathcal {A}_{t,t^\star }&:=\left\{ \left. -\mathrm {Im}(\varvec{\zeta })\in \mathbb {R}^N\right| \ f^l_k(-\mathrm {Im}(\varvec{\zeta })), \ l=1,\ldots ,3\ \text {satisfying (i)}-\text {(iii) and }\right. \\&\quad \left. f^4_k(-\mathrm {Im}(\varvec{\zeta }))<0\ \text {for some }k\right\} \supset \mathcal {A}_{t,+\infty } \end{aligned}$$

the transform is well defined until the maximal time \(t^\star \) given by

$$\begin{aligned} t^\star =\min _{k\text { s.t. }f^4_k(-\mathrm {Im}(\varvec{\zeta }))<0}\frac{1}{\sqrt{A_k}}\log \left( 1-\frac{2\sqrt{A_k}}{\kappa _k+\sigma _k\rho _k\sum _{i=1}^N \mathrm {Im}(\zeta ^i)a_k^i+\sqrt{A_k}}\right) . \end{aligned}$$
(4.12)

Proof

See “Appendix B.” \(\square \)
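When implementing (4.7), the auxiliary quantities \(m_k\), \(A_k\), \(\beta _k\) and \(K_k\) are evaluated once per factor. A minimal sketch of that step is given below; it takes \(\mu _k\) and \(\nu _k\) from (4.10) and (4.11) as given inputs, uses the principal branch of the complex square root (an implementation choice not discussed in the text), and the function name is hypothetical.

```python
import numpy as np


def transform_auxiliaries(kappa, theta, sigma, mu, nu, v, tau):
    """Auxiliary quantities of Theorem 4.1 for one factor k.

    mu, nu : (possibly complex) values of mu_k and nu_k
    v      : current variance V_k(t) > 0
    tau    : time to maturity T - t > 0
    Returns (m, A, beta, K) with beta evaluated at x = v.
    """
    mu, nu = complex(mu), complex(nu)
    A = kappa**2 + 2.0 * mu * sigma**2
    sqrt_A = np.sqrt(A)                                  # principal branch
    m = (2.0 / sigma**2) * np.sqrt((kappa * theta - 0.5 * sigma**2) ** 2
                                   + 2.0 * sigma**2 * nu)
    half = 0.5 * sqrt_A * tau
    beta = 2.0 * sqrt_A * np.sqrt(v) / (sigma**2 * np.sinh(half))
    K = (sqrt_A / np.tanh(half) + kappa) / sigma**2      # coth = 1 / tanh
    return m, A, beta, K
```

For \(\mu _k=\nu _k=0\) the expressions collapse to \(A_k=\kappa _k^2\) and \(m_k=2|\kappa _k\theta _k-\sigma _k^2/2|/\sigma _k^2\), which gives a quick consistency check for an implementation.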

The general transform formula above is a powerful tool. However, checking the validity of (4.7) may not be very practical in a calibration setting. For this reason, we provide a simple, yet handy, criterion. The price \(P^i(t,T)\) at time \(t\in [0, T], \ 0 \le T\le {\bar{T}}\) of a zero coupon bond for one unit of the ith currency to be paid at T, \(i =1,\ldots ,N\), is given by the following conditional expectation

$$\begin{aligned} P^i(t,T):=D^i(t)\mathbb {E}\left[ \left. \frac{1}{D^i(T)}\right| \mathcal {F}_t\right] =\phi _{t,T}^{i,j}(0). \end{aligned}$$
(4.13)

The criterion is provided by the next lemma.

Lemma 4.1

Let \(-1<\alpha <0\) and \(z\in \mathbb {C}\) with \(z=u+\mathtt {i}\alpha \). Assume

$$\begin{aligned} P^i(t,T)\vee P^j(t,T)<\infty , \end{aligned}$$

then

$$\begin{aligned} D^i(t)\mathbb {E}\left[ \left. \frac{1}{D^i(T)}\left( S^{i,j}(T)\right) ^{-\alpha }\right| \mathcal {F}_t\right] <\infty , \end{aligned}$$

moreover, the discounted characteristic function \(\phi ^{i,j}(z)\) admits an analytic extension to the strip

$$\begin{aligned} \mathcal {Z}=\left\{ \left. z\in \mathbb {C}\right| z = u+\mathtt {i}\alpha , \ \alpha \in (-1,0)\right\} . \end{aligned}$$

Proof

See “Appendix C.” \(\square \)
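The moment bound in Lemma 4.1 can be motivated by an elementary estimate, sketched here using only the relation \(S^{i,j}=D^i/D^j\); the complete proof, including the analytic extension, is the one in Appendix C. Since \(x^{p}\le 1+x\) for \(x>0\) and \(p\in [0,1]\), taking \(p=-\alpha \in (0,1)\) gives

$$\begin{aligned} D^i(t)\mathbb {E}\left[ \left. \frac{\left( S^{i,j}(T)\right) ^{-\alpha }}{D^i(T)}\right| \mathcal {F}_t\right]&\le D^i(t)\mathbb {E}\left[ \left. \frac{1}{D^i(T)}\right| \mathcal {F}_t\right] + D^i(t)\mathbb {E}\left[ \left. \frac{1}{D^j(T)}\right| \mathcal {F}_t\right] \\&= P^i(t,T)+S^{i,j}(t)P^j(t,T)<\infty . \end{aligned}$$

The second term uses \(S^{i,j}(T)/D^i(T)=1/D^j(T)\) and \(D^i(t)=S^{i,j}(t)D^j(t)\), so finiteness of the two zero coupon bond prices is all that is needed.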

Given the result in Lemma 4.1, we shall proceed to calibrate the model by employing the generalized Carr–Madan formula of Lee, i.e., (4.5), by setting

$$\begin{aligned} R\left( S^{i,j}(t),K^{i,j},\alpha \right) = \phi ^{i,j}(-\mathtt {i}). \end{aligned}$$

5 Model calibration to FX triangles

In line with De Col et al. (2013), Gnoatto and Grasselli (2014) and Baldeaux et al. (2015), we perform a joint calibration to a triangle of FX implied volatility surfaces. More specifically, we consider the data set employed in Gnoatto and Grasselli (2014), featuring implied volatility surfaces for EURUSD, USDJPY and EURJPY as of July 22nd 2010. We choose such a date so as to obtain a calibration that can be approximately compared with the one of De Col et al. (2013), based on data as of July 23rd 2010. We perform our calibration to options with expiry dates ranging from one up to 18 months and moneyness ranging from 15 delta put up to 15 delta call, and we consider a total of 126 contracts. The model we consider for the calibration is the full 4/2 stochastic volatility model, i.e., both the Heston and the 3/2 effects are simultaneously considered. As we calibrate options with maturity up to 18 months, we do not consider stochastic interest rates due to the limited interest rate risk, see Gnoatto and Grasselli (2014). In line with the references above, we choose the following penalty function

$$\begin{aligned} \sum _i\left( \sigma ^{imp}_{i,\mathrm{mkt}}-\sigma ^{imp}_{i,\mathrm{model}}\right) ^2, \end{aligned}$$

where \(\sigma ^{imp}_{i,\mathrm{mkt}}\) is the ith observed market volatility and \(\sigma ^{imp}_{i,\mathrm{model}}\) is the ith model-derived implied volatility. For each option contract, \(\sigma ^{imp}_{i,\mathrm{model}}\) is constructed in two steps: first, given a set of model parameters, (4.5) with \(-1<\alpha <0\) is employed to obtain the corresponding model-derived price; second, the obtained price is converted into \(\sigma ^{imp}_{i,\mathrm{model}}\) via a standard implied volatility solver. As far as the implementation of (4.5) is concerned, we approximated the integral via a 4096-point FFT routine, with grid spacing equal to 0.1, so that the improper integral is truncated at the point \(4096\times 0.1=409.6\). The corresponding strike range is then given by \(\left[ e^{-31.4159},e^{31.4159}\right] \) and Simpson’s rule weights are introduced for increased accuracy, see Carr and Madan (1999). The FFT then returns a vector of option prices for a fixed grid of strikes. Option prices for the strikes of interest are obtained via linear interpolation. We assume that the model is driven by two square root factors. The parameters we need to calibrate are those appearing in the dynamics of each square root process, i.e., \(\kappa _k,\theta _k,\sigma _k,V_k(0), \ k=1,2\), coupled with a two-dimensional vector of correlations and six two-dimensional vectors of projections, two for each currency area, i.e., \(a^i,b^i, \ i=1,2,3\), meaning that we proceed to estimate a total of 22 parameters. Clearly, in order to prevent instability and over-parametrization issues, simplified versions of the model may be considered.
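The grid quantities quoted above follow from the usual FFT bookkeeping of Carr and Madan (1999). The sketch below recomputes them from the two stated inputs (4096 points, spacing 0.1); the variable names are hypothetical and the snippet only builds the grids and Simpson weights, it does not price anything.

```python
import numpy as np

n_points = 4096          # number of FFT points (from the text)
eta = 0.1                # spacing of the integration grid (from the text)

# Upper truncation point of the integration variable: N * eta.
upper_limit = n_points * eta

# FFT reciprocity: log-strike spacing times eta equals 2 * pi / N.
log_strike_step = 2.0 * np.pi / (n_points * eta)
log_strike_bound = 0.5 * n_points * log_strike_step        # equals pi / eta
log_strikes = -log_strike_bound + log_strike_step * np.arange(n_points)

# Simpson weights as in Carr and Madan (1999):
# (eta / 3) * (3 + (-1)^j - delta_{j-1}), j = 1, ..., N.
j = np.arange(1, n_points + 1)
simpson_weights = (eta / 3.0) * (3.0 + (-1.0) ** j - (j == 1))

# Parameter count of the two-factor, three-currency specification.
n_factors, n_currencies = 2, 3
n_parameters = 4 * n_factors + n_factors + 2 * n_currencies * n_factors

print(upper_limit, log_strike_bound, log_strike_step, n_parameters)
```

The log-strike grid is very fine (step of about 0.0153) and far wider than any quoted strike, which is why a linear interpolation between grid points is adequate for the strikes of interest.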

The result of the calibration is presented in Fig. 2a for July 22nd, 2010, while the corresponding parameters are reported in Table 1. We obtain a good fit over all three surfaces we consider, in line with De Col et al. (2013). This shows that a satisfactory calibration of the model can be achieved. It allows us to perform the following analysis, which constitutes an interesting empirical result of the current paper. Given the set of parameters we obtain from the calibration, we can analyze whether market data of FX options support the common use of classical risk neutral pricing. Our approach is flexible enough that, within a single model, we can span both risk neutral valuation and pricing under the real-world measure. Such an analysis is summarized in Table 2. We consider different measures for pricing: the real-world probability measure \(\mathbb {P}\), and the putative risk neutral measures \(\mathbb {Q}^\mathrm{usd}\), \(\mathbb {Q}^\mathrm{eur}\) and \(\mathbb {Q}^\mathrm{jpy}\). For each measure, we compute the corresponding Feller condition for each square root process.

Table 1 Parameter values resulting from the calibration procedure of a two factor specification with deterministic interest rate
Table 2 This table reports the Feller test under different (putative) measures as introduced in Sect. 3.2
Fig. 2

Simultaneous FX calibration. Market data as of July 22nd, 2010. Market volatilities are denoted by crosses, model volatilities are denoted by circles. Moneyness levels follow the standard Delta quoting convention in the FX option market. DC and DP stand for “delta call” and “delta put,” respectively

Under the real-world probability measure \(\mathbb {P}\), we observe that the Feller condition \(2\kappa _k\theta _k\ge \sigma ^2_k\) is satisfied by both \(V_1\) and \(V_2\). We next perform the same analysis under the two putative risk neutral measures \(\mathbb {Q}^\mathrm{usd}\) and \(\mathbb {Q}^\mathrm{jpy}\), respectively. We observe that for the first one neither process ever reaches zero, whereas for \(\mathbb {Q}^\mathrm{jpy}\) the Feller condition is not satisfied by the second component. As discussed in Sect. 3.2, if at least one of the square root processes has a different behavior under the putative risk neutral measure, then classical risk neutral pricing is not well founded. In summary, we have a situation where market data suggest that for the USD currency denomination risk neutral pricing is potentially applicable, while in the JPY denomination it is not theoretically founded.

We also perform a second calibration experiment. The structure of the dataset is the same as in the previous case and market data were provided as of Feb 23rd, March 23rd, April 22nd, May 22nd and June 22nd 2015. Essentially, we are taking the perspective of a derivative desk following the market practice that involves a periodic model re-calibration across different trading dates. Such an analysis allows us to provide some first evidence regarding the stability of the parameter estimates we obtain. By looking at Table 3, we observe a satisfactory stability of the calibrated parameters. A relevant change in the estimates is observed only between the February and March calibration. The quality of the fit is comparable with the one obtained in our first calibration and in the above-mentioned papers of Baldeaux et al. (2015) and De Col et al. (2013).

Table 3 Parameter values resulting from the repeated calibration procedure of a two factor specification with constant interest rates
Table 4 This table reports the Feller test under different (putative) measures as introduced in Sect. 3.2

Calibrated parameter values are listed in Table 3, whereas the Feller condition under all measures is reported in Table 4. We observe in this case a violation of the Feller condition for the second factor under the \(\mathbb {Q}^\mathrm{usd}\) putative risk neutral measure, whereas for the \(\mathbb {Q}^\mathrm{jpy}\) measure the condition is passed. For \(\mathbb {Q}^\mathrm{eur}\), instead, we observe that the condition is initially passed and then, starting from April 22nd, we have repeated violations. The overall results of our analysis suggest that markets are subject to what we may term regime switches in pricing between the classical risk neutral and the more general real-world pricing approach. Such a feature would clearly provide a strong motivation for the introduction of models that are able to accommodate both valuation frameworks, like the 4/2 type specification that we propose.

6 Conclusion

In this paper, we introduced a more general modeling approach than available under the classical no-arbitrage paradigm in finance and insurance. In the context of a flexible model for exchange rates calibrated to market data, we showed that the classical risk neutral paradigm can fail. The main mathematical phenomenon underlying these surprising effects is the potential strict supermartingale property of benchmarked savings accounts under the real-world probability measure, as suggested in several cases by our calibration exercise on the foreign exchange option market. The presented results are only one example of the new phenomena that can be captured under the benchmark approach.

There is ample room for many new research questions and interesting related studies in insurance and finance. First, our stochastic volatility model allows for a nonlinear market price of volatility risk, a desirable property in order to explain nonlinear effects under the real-world probability measure for the risk factors. Second, we performed our calibrations at some particular trading dates. Recently, Deep Neural Networks (DNN)-based algorithms have been introduced in finance in order to deal with robust calibration and hedging of large portfolios, see, e.g., Stone (2019), Horvath et al. (2021), Bayer and Stemper (2019) and references therein. These DNN algorithms are flexible and fast, but still require a learning phase, which typically takes a lot of time and needs the pricing technology to feed the network. In this sense, our results should not be seen as competing with DNN methods; on the contrary, they are a useful ingredient for them. Finally, much more information could be extracted from a statistical estimation on a time series of option prices or the underlying GOP. The proposed 4/2 model allows for the possibility of a failure of the classical risk neutral pricing assumption and deserves more theoretical and numerical investigation. In view of this, our paper aims to stimulate further studies.