Abstract
We develop a stochastic volatility framework for modeling multiple currencies based on CBI-time-changed Lévy processes. The proposed framework captures the typical risk characteristics of FX markets and is coherent with the symmetries of FX rates. Moreover, due to the self-exciting behavior of CBI processes, the volatilities of FX rates exhibit self-exciting dynamics. By relying on the theory of affine processes, we show that our approach is analytically tractable and that the model structure is invariant under a suitable class of risk-neutral measures. A semi-closed pricing formula for currency options is obtained by Fourier methods. We propose two calibration methods, also by relying on deep-learning techniques, and show that a simple specification of the model can achieve a good fit to market data on a currency triangle.
1 Introduction
The Foreign Exchange (FX) market is one of the largest in the world (see, e.g., Bank for International Settlements, 2019; Wooldridge, 2019). From the perspective of quantitative finance, modeling the FX market poses several challenges. First, multi-currency models must respect the symmetric structure of FX rates. To illustrate this aspect, let \(S^{d,f}\) represent the value of one unit of a foreign currency f measured in units of the domestic currency d. In a multi-currency model, the following symmetric relations must hold:

\(S^{f,d}=1/S^{d,f}\): the reciprocal of \(S^{d,f}\) must coincide with \(S^{f,d}\), representing the value of one unit of currency d measured in units of currency f. This is referred to as inversion.

\(S^{d,f} = S^{d,e} \times S^{e,f}\), for any other foreign currency e. In other words, the FX rate \(S^{d,f}\) must be inferred from \(S^{d,e}\) and \(S^{e,f}\) through multiplication. This is referred to as triangulation.
Besides these symmetric relations, the FX market presents some specific risk characteristics that should be properly reflected in a multi-currency model. First, FX markets are affected by stochastic volatility and jump risk, similarly to the case of equity markets. This has led to the application to FX markets of well-known models initially conceived for stock returns, such as the Heston model (see also Sect. 1.1 below). Second, the dependence between FX rates is typically stochastic and, in particular, shows evidence of unpredictable changes over time, thus generating correlation risk. Third, the skew of the FX volatility smile exhibits a stochastic behavior. This fact has been documented in Carr and Wu (2007) by analyzing the time series of risk-reversals^{Footnote 1}, showing that their values vary significantly over time and exhibit repeated sign changes. Finally, FX rates are affected by volatility clustering effects, similarly to many other asset classes (see, e.g., Cont and Tankov, 2004, Chapter 7). This phenomenon can be observed in Fig. 1, which displays the time series of the USD-JPY exchange rate over the period 01/01/2012 - 31/12/2015.
In this paper, we develop a modeling framework for multiple currencies that is consistent with the symmetric structure of FX rates and captures all risk characteristics described above, including stochastic dependence among FX rates as well as between FX rates and their volatilities. We consider models driven by CBI-time-changed Lévy processes (CBI-TCL processes), a broad and flexible class of processes that allows for self-exciting jump dynamics, with stochastic volatility and mean-reversion (we refer to Fontana et al., 2022 and Szulda, 2021, for a thorough analysis of CBI-TCL processes). The proposed approach is fully analytically tractable, due to the fact that CBI-TCL processes are affine processes and, therefore, their characteristic function can be explicitly characterized. Moreover, we will show that CBI-TCL processes are coherent in the sense of Gnoatto (2017), meaning that if an FX rate is modeled by a CBI-TCL process, then its reciprocal also belongs to the same model class.
We construct our modeling framework by adopting an artificial currency approach (see also Sect. 1.1 below), which consists in modeling each FX rate as the ratio of two primitive processes, with one primitive process associated to each currency. FX rates then satisfy the inversion and triangulation symmetries by construction and the model formulation reduces to modeling all primitive processes by means of a common family of CBI-TCL processes. By relying on a Girsanov-type result for CBI-TCL processes, we characterize a class of risk-neutral measures that leave invariant the structure of the model. In particular, this allows preserving the CBI-TCL property under equivalent changes of probability, which enables us to derive an efficient pricing formula for currency options by means of Fourier techniques.
We analyze the empirical performance of a two-dimensional specification of our framework, driven by tempered \(\alpha \)-stable CBI processes (as recently introduced in Fontana et al., 2021) and CGMY processes (see Carr et al., 2002). We perform a calibration of the model with respect to an FX triangle consisting of three major currency pairs (USD-JPY, EUR-JPY, EUR-USD). We propose two calibration methods: a standard calibration algorithm and a deep calibration algorithm, inspired by the deep learning techniques recently developed in Horvath et al. (2021) and applied here for the first time in a multi-currency setting. We assess the importance of jumps by showing that our CBI-TCL model achieves a superior calibration performance with respect to an analogous continuous-path model.
1.1 Related literature
As mentioned above, our framework is based on an artificial currency approach, which goes back to the works of Flesaker and Hughston (2000) and Doust (2007) (intrinsic currency approach, in the terminology of Doust, 2007). Relying on this approach, several stochastic volatility models for multiple currencies have been developed in a Brownian setting, see, e.g., Doust (2012), De Col et al. (2013), Gnoatto and Grasselli (2014), Baldeaux et al. (2015) and Gnoatto et al. (2021). In the latter works, the symmetries of FX rates are respected and several sources of risk can be adequately represented, with the important exceptions of jump risk, volatility clustering and self-excitation effects, which will play an important role in our framework.
Most of the works mentioned in the previous paragraph can be regarded as multi-currency extensions of the Heston model (to this effect, see also Janek et al., 2011). This is motivated by the fact that the Heston model is known to be stable under inversion (see, e.g., Rollin, 2008). The property that a certain model class is invariant under inversion has been termed coherence in Gnoatto (2017), where it has been shown that general affine stochastic volatility models are coherent. As will be shown below, our modeling framework is coherent in this sense. The coherence property has been recently studied by Graceffa et al. (2020) (under the name of consistency) in the class of jump-diffusion models.
Beyond the Brownian setting, some important contributions to the modeling of FX rates include the work of Eberlein and Koval (2006) based on time-inhomogeneous Lévy processes and the work of Carr and Wu (2007) based on time-changed Lévy processes with CIR-type activity rates. More recently, Ballotta et al. (2017) have developed a multi-currency framework driven by a multi-dimensional Lévy process with dependent components. In this work, only standard Lévy processes are considered and, therefore, stochastic volatility is not explicitly modeled. Ballotta and Morico (2017) suggest however the possible use of time change methods. This line of research is pursued in Ballotta and Morico (2018) and further expanded in our work. In particular, Ballotta and Morico (2018) propose a model based on time-changed Lévy processes where the activity rate can have jumps and self-excitation. The presence of common jumps between the Lévy process and the activity rate induces a non-trivial dependence between the FX rate and its volatility. The model of Ballotta and Morico (2018) is however limited to a single FX rate, in which case the FX symmetries do not play any role. In contrast, we propose a coherent multi-currency framework that satisfies the FX symmetries and exhibits rich and flexible stochastic dynamics for all FX rates and their volatilities, while preserving an analytical tractability comparable to the model of Ballotta and Morico (2018).
1.2 Outline of the paper
The paper is structured as follows. In Sect. 2 we recall some basic results on CBI-TCL processes. In Sect. 3 we develop the multi-currency modeling framework, while in Sect. 4 we present two calibration methods and analyze the empirical fit to market data on an FX triangle. Finally, Sect. 5 concludes the paper.
2 CBI-time-changed Lévy processes
In this section, we present some fundamental results on CBI-time-changed Lévy (CBI-TCL) processes, referring to Fontana et al. (2022) for a complete theoretical analysis of this class of processes, detailed proofs and additional results (see also Szulda 2021, Chapter 4). We work on a filtered stochastic basis \((\Omega ,\mathcal {F},\mathbb {F},\mathbb {Q})\), where \(\mathbb {F}\) is a filtration satisfying the usual conditions.
2.1 Definition and characterization of CBI-TCL processes
Let us start by recalling the definition of a Continuous-state Branching process with Immigration (see Li 2020, Section 5). To this end, we first introduce the following two functions:

Let the function \(\Psi :\mathbb {R}_-\rightarrow \mathbb {R}\) be given by
$$\begin{aligned} \Psi (x) := \beta \,x + \int _0^{+\infty } {(e^{xz} - 1)\,\nu (\mathrm {d}z)}, \qquad \forall x \in \mathbb {R}_-, \end{aligned} \tag{2.1}$$
where \(\beta \ge 0\) and \(\nu \) is a Lévy measure on \((0,+\infty )\) such that \(\int _0^1 {z\,\nu (\mathrm {d}z)}<+\infty \);

Let the function \(\Phi :\mathbb {R}_-\rightarrow \mathbb {R}\) be given by
$$\begin{aligned} \Phi (x) := -\,b\,x + \frac{\sigma ^2}{2}\,x^2 + \int _0^{+\infty } {(e^{xz} - 1 - xz)\,\pi (\mathrm {d}z)}, \qquad \forall x \in \mathbb {R}_-, \end{aligned} \tag{2.2}$$
where \(b \in \mathbb {R}\), \(\sigma \in \mathbb {R}\) and \(\pi \) is a Lévy measure on \((0,+\infty )\) such that \(\int _1^{+\infty } {z\,\pi (\mathrm {d}z)}<+\infty \).
Definition 2.1
A Markov process \(X=(X_t)_{t \ge 0}\) with initial value \(X_0\) and state space \(\mathbb {R}_+\) is said to be a Continuous-state Branching process with Immigration (CBI) with immigration mechanism \(\Psi \) and branching mechanism \(\Phi \) if its Laplace transform is given by
for all \(u \in \mathbb {R}_-\) and \(T \in \mathbb {R}_+\), where the function \(\mathcal {V}(\cdot ,u):\mathbb {R}_+\rightarrow \mathbb {R}_-\) is the unique solution to
Definition 2.1 corresponds to a conservative stochastically continuous CBI process in the sense of Kawazu and Watanabe (1971). Note that CBI processes are non-negative, strongly Markov (Feller) and with càdlàg trajectories. As a consequence, the path integral \(Y:=\int _0^{\cdot }X_s\,\mathrm {d}s\) of a CBI process X is always well defined as a non-decreasing process. It can therefore be used as a finite continuous time-change. This motivates the following definition.
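As a sanity check on Definition 2.1, consider the pure-diffusion case \(\nu = \pi = 0\), in which X reduces to a square-root (CIR-type) diffusion. The sketch below assumes the standard form of the defining relations, namely \(\mathbb {E}[e^{uX_T}] = \exp (\int _0^T \Psi (\mathcal {V}(s,u))\,\mathrm {d}s + \mathcal {V}(T,u)\,X_0)\) with \(\partial _t\mathcal {V}(t,u) = \Phi (\mathcal {V}(t,u))\) and \(\mathcal {V}(0,u)=u\), together with the sign convention \(\Phi (x) = -b\,x + \tfrac{\sigma ^2}{2}x^2\); these are assumptions made for illustration. Under them the ODE is of Bernoulli type, with closed-form solution \(\mathcal {V}(t,u) = u\,e^{-bt}/\big (1 - \tfrac{\sigma ^2 u}{2b}(1-e^{-bt})\big )\), which the code compares against a Runge-Kutta integration. Function names and parameter values are hypothetical.

```python
import math

def phi(x, b, sigma):
    """Branching mechanism without jumps (assumed sign convention)."""
    return -b * x + 0.5 * sigma**2 * x**2

def solve_v_rk4(u, T, b, sigma, n_steps=2000):
    """Integrate dV/dt = phi(V), V(0) = u, with the classical RK4 scheme."""
    h = T / n_steps
    v = u
    for _ in range(n_steps):
        k1 = phi(v, b, sigma)
        k2 = phi(v + 0.5 * h * k1, b, sigma)
        k3 = phi(v + 0.5 * h * k2, b, sigma)
        k4 = phi(v + h * k3, b, sigma)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return v

def v_closed_form(u, T, b, sigma):
    """Closed-form solution of the Bernoulli ODE (requires b != 0)."""
    decay = math.exp(-b * T)
    return u * decay / (1.0 - sigma**2 * u * (1.0 - decay) / (2.0 * b))

# Hypothetical parameters; u must be non-positive.
u, T, b, sigma = -1.5, 2.0, 0.8, 0.4
v_num = solve_v_rk4(u, T, b, sigma)
v_exact = v_closed_form(u, T, b, sigma)
print(v_num, v_exact)
```

The two values agree up to the discretization error of the scheme, and both remain in \(\mathbb {R}_-\), as required of \(\mathcal {V}(\cdot ,u)\).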
Definition 2.2
A process \((X,Z)=((X_t, Z_t))_{t \ge 0}\) is said to be a CBI-time-changed Lévy process (CBI-TCL process) if

(i)
X is a CBI process, and

(ii)
\(Z=L_Y\), where \(L=(L_t)_{t\ge 0}\) is a Lévy process independent of X and \(Y=(Y_t)_{t\ge 0}\) denotes the process defined by \(Y_t:=\int _0^tX_s\,\mathrm {d}s\), for all \(t\in \mathbb {R}_+\).
The Lévy exponent \(\Xi \) of the Lévy process L admits the Lévy-Khintchine representation
where \((b_Z, \sigma _Z, \gamma _Z)\) is the Lévy triplet of L, with \(b_Z \in \mathbb {R}\), \(\sigma _Z \in \mathbb {R}\) and \(\gamma _Z\) a Lévy measure on \(\mathbb {R}\). In the following, we shall write \(\mathrm {CBITCL}(X_0, \Psi , \Phi , \Xi )\) to denote that a process (X, Z) is a CBI-time-changed Lévy process in the sense of Definition 2.2, where \(\Psi \) and \(\Phi \) denote respectively the immigration and branching mechanisms of the CBI process X and \(\Xi \) the Lévy exponent of L.
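For orientation, a Lévy-Khintchine representation compatible with the triplet \((b_Z, \sigma _Z, \gamma _Z)\) can be written as follows. The truncation function \(x\,\mathbf {1}_{\{|x|<1\}}\) and the Laplace-type convention (no explicit factor \(\mathsf {i}\), with the exponent evaluated on the imaginary axis, in line with \(u_3\in \mathsf {i}\mathbb {R}\) in Proposition 2.5) are assumptions made here for illustration:

$$\begin{aligned} \Xi (u) = b_Z\,u + \frac{\sigma _Z^2}{2}\,u^2 + \int _{\mathbb {R}} \big (e^{ux} - 1 - ux\,\mathbf {1}_{\{|x|<1\}}\big )\,\gamma _Z(\mathrm {d}x), \qquad u \in \mathsf {i}\mathbb {R}, \end{aligned}$$

so that \(\mathbb {E}[e^{uL_t}] = e^{t\,\Xi (u)}\) for all \(t \ge 0\) under this convention.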
CBI processes admit a characterization in terms of a Lamperti representation (see Caballero et al., 2013, and Szulda, 2021, Chapter 2). However, it turns out that there exists an equivalent representation that is better suited to our purposes, in terms of solutions to certain stochastic integral equations of the Dawson-Li type (see Dawson and Li, 2006). To this effect, let us introduce the following objects:

two Brownian motions \(B^1=(B_t^1)_{t \ge 0}\) and \(B^2=(B_t^2)_{t \ge 0}\);

a Poisson random measure \(N_0(\mathrm {d}t, \mathrm {d}x)\) on \((0,+\infty )^2\) with compensator \(\mathrm {d}t\,\nu (\mathrm {d}x)\) and compensated measure \(\widetilde{N}_0(\mathrm {d}t, \mathrm {d}x) := N_0(\mathrm {d}t, \mathrm {d}x) - \mathrm {d}t\,\nu (\mathrm {d}x)\);

a Poisson random measure \(N_1(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x)\) on \((0,+\infty )^3\) with compensator \(\mathrm {d}t\,\mathrm {d}u\,\pi (\mathrm {d}x)\) and compensated measure \(\widetilde{N}_1(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x) := N_1(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x) - \mathrm {d}t\,\mathrm {d}u\,\pi (\mathrm {d}x)\);

a Poisson random measure \(N_2(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x)\) on \((0,+\infty )^2\times \mathbb {R}\) with compensator \(\mathrm {d}t\,\mathrm {d}u\,\gamma _Z(\mathrm {d}x)\) and compensated measure \(\widetilde{N}_2(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x) := N_2(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x) - \mathrm {d}t\,\mathrm {d}u\,\gamma _Z(\mathrm {d}x)\).
We furthermore assume that \(B^1\), \(B^2\), \(N_0\), \(N_1\) and \(N_2\) are mutually independent. For any \(X_0\in \mathbb {R}_+\), let us consider the following stochastic integral equations:
The connection between Definition 2.2 and the stochastic integral equations (2.3)-(2.4) is given in the following proposition (see Fontana et al., 2022, Theorem 2.3).
Proposition 2.3
A process (X, Z) with initial value \((X_0, 0)\) is a \(\mathrm {CBITCL}(X_0, \Psi , \Phi , \Xi )\) process if and only if it is a weak solution to the stochastic integral equations (2.3)-(2.4).
On a given stochastic basis \((\Omega ,\mathcal {F},\mathbb {F},\mathbb {Q})\), there exists a unique strong solution to (2.3)-(2.4). Indeed, there is a unique strong solution \(X=(X_t)_{t \ge 0}\) to (2.3), which corresponds to the Dawson-Li representation of a CBI process (see Dawson and Li, 2006, Theorems 5.1 and 5.2). In turn, since the right-hand side of (2.4) depends only on the process \(X=(X_t)_{t \ge 0}\) and not on \(Z=(Z_t)_{t\ge 0}\), this implies the existence of a unique strong solution \(Z=(Z_t)_{t\ge 0}\) to (2.4) as well. In the following, if a CBI-TCL process (X, Z) is directly defined as the unique strong solution to (2.3)-(2.4), we will say that (X, Z) is defined through its extended Dawson-Li representation (see Fontana et al., 2022, Section 2.1).
Remark 2.4
The system of stochastic integral equations (2.3)-(2.4) makes evident the self-exciting behavior of a CBI-TCL process. More specifically, we can observe the following:

For the CBI process X, the domain of integration of the integral with respect to \(\widetilde{N}_1\) depends on the value of the process itself. This generates a self-exciting effect since, when a jump occurs, the jump intensity of X increases. In turn, this increases the likelihood of subsequent jumps of X, thereby generating jump clustering phenomena.

The volatility components of Z depend on the value of the process X. Therefore, large values of X increase the volatility of the process Z, thereby increasing the likelihood of volatility clusters in the dynamics of Z as well as joint clusters between X and Z.
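The second effect can be illustrated in the simplest special case without jumps (\(\nu =\pi =0\) and \(\gamma _Z=0\)), where X is a square-root diffusion and L is a Brownian motion with drift, so that the increments of \(Z=L_Y\) are conditionally Gaussian with variance \(\sigma _Z^2\,(Y_{t+\Delta }-Y_t)\). The sketch below is a hypothetical Euler scheme with full truncation, not the simulation method of the paper; the drift convention \((\beta - b\,X_t)\,\mathrm {d}t\) and all parameter values are assumptions.

```python
import math
import random

def simulate_cbitcl_diffusion(x0, beta, b, sigma, b_z, sigma_z,
                              horizon=1.0, n_steps=1000, seed=0):
    """Euler scheme (full truncation) for a jump-free CBI-TCL pair (X, Z)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    x, y, z = x0, 0.0, 0.0
    xs, ys, zs = [x], [y], [z]
    for _ in range(n_steps):
        x_pos = max(x, 0.0)          # truncate before taking square roots
        dy = x_pos * dt              # increment of the time change Y
        # Z = L_Y: Gaussian increment with mean b_z*dy and variance sigma_z^2*dy
        z += b_z * dy + sigma_z * math.sqrt(dy) * rng.gauss(0.0, 1.0)
        # X: square-root diffusion driven by an independent Gaussian draw
        x += (beta - b * x_pos) * dt \
            + sigma * math.sqrt(x_pos * dt) * rng.gauss(0.0, 1.0)
        y += dy
        xs.append(max(x, 0.0))       # store the truncated (non-negative) level
        ys.append(y)
        zs.append(z)
    return xs, ys, zs

# Hypothetical parameter values, chosen only for illustration.
xs, ys, zs = simulate_cbitcl_diffusion(x0=0.04, beta=0.02, b=0.5, sigma=0.3,
                                       b_z=-0.02, sigma_z=1.0)
```

In this scheme the local variance of each increment of Z is proportional to the current level of X, which is the mechanism behind the volatility clusters described above.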
2.2 Affine property and changes of probability
The next proposition shows that CBI-TCL processes are affine and provides an explicit characterization of the Laplace-Fourier transform.
Proposition 2.5
Let (X, Z) be a \(\mathrm {CBITCL}(X_0, \Psi , \Phi , \Xi )\) process and consider the joint process (X, Y, Z), where \(Y:= \int _0^{\cdot }{X_s\,\mathrm {d}s}\). Then, the process (X, Y, Z) is an affine process on the state space \(\mathbb {R}_+^2\times \mathbb {R}\) with conditional Laplace-Fourier transform given by
for all \((u_1, u_2, u_3) \in \mathbb {C}_-^2\times \mathsf {i}\mathbb {R}\) and \(0 \le t \le T < +\infty \), where the functions \(\mathcal {U}(\cdot , u_1, u_2, u_3):\mathbb {R}_+\rightarrow \mathbb {C}\) and \(\mathcal {V}(\cdot , u_1, u_2, u_3):\mathbb {R}_+\rightarrow \mathbb {C}_-\) are solutions to
where \(\Psi :\mathbb {C}_-\rightarrow \mathbb {C}\) and \(\Phi :\mathbb {C}_-\rightarrow \mathbb {C}\) denote the analytic extensions to \(\mathbb {C}_-\) of the corresponding functions defined in (2.1) and (2.2), respectively.
Proof
By Duffie et al. (2003), Corollary 2.10, the process X is an affine process. The affine property of (X, Y, Z) and the characterization of its conditional Laplace-Fourier transform then follow by an application of Keller-Ressel (2009), Theorems 4.10 and 4.16 (see Fontana et al., 2022, Section 2.2 for additional details). \(\square \)
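A heuristic way to see the structure of the transform in Proposition 2.5 is to apply the generator of (X, Y, Z) to an exponential-affine function of the state and to require that the drift vanishes. Under the sign conventions of (2.1)-(2.2), and assuming that \(\Xi \) is the Laplace-type Lévy exponent of L, this leads to a system of the following form. It is a sketch of the expected structure, stated as an assumption rather than as a quotation of the proposition:

$$\begin{aligned} \mathbb {E}\big [e^{u_1X_T+u_2Y_T+u_3Z_T}\,\big |\,\mathcal {F}_t\big ]&= \exp \big (\mathcal {U}(T-t,u_1,u_2,u_3)+\mathcal {V}(T-t,u_1,u_2,u_3)\,X_t+u_2\,Y_t+u_3\,Z_t\big ),\\ \mathcal {U}(t,u_1,u_2,u_3)&= \int _0^t \Psi \big (\mathcal {V}(s,u_1,u_2,u_3)\big )\,\mathrm {d}s,\\ \frac{\partial \mathcal {V}}{\partial t}(t,u_1,u_2,u_3)&= \Phi \big (\mathcal {V}(t,u_1,u_2,u_3)\big )+u_2+\Xi (u_3), \qquad \mathcal {V}(0,u_1,u_2,u_3)=u_1. \end{aligned}$$

Setting \(u_2=u_3=0\) and \(t=0\) recovers the Laplace transform of the CBI process X alone.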
Remark 2.6
In view of Proposition 2.5, CBI-TCL processes can be viewed as affine stochastic volatility models in the sense of Keller-Ressel (2009), Chapter 5. In the context of FX modeling, such models have been proved to be coherent by Gnoatto (2017), as mentioned in the Introduction. This fact will play an important role in the construction of our multi-currency framework in Sect. 3. We refer to Fontana et al. (2022), Section 2.2, for a detailed analysis of the relation between CBI-TCL processes and affine stochastic volatility models.
We close this section by describing a class of equivalent changes of probability of Esscher type that leave invariant the CBI-TCL structure. Let us first define the convex set \(\mathcal {D}_X\) as follows:
The set \(\mathcal {D}_X\) is the effective domain of the functions \(\Psi \) and \(\Phi \), which can be extended as finite-valued convex functions on \(\mathcal {D}_X\). Note that \(\mathcal {D}_X\) also represents the extended domain of the Laplace transform of the CBI process X (see Fontana et al. 2021, Theorem 2.6). Let us also introduce the convex set
which represents the effective domain of the Lévy exponent \(\Xi \) when restricted to real arguments. Let us fix \(\zeta \in \mathbb {R}\) and \(\lambda \in \mathbb {R}\) and consider the process \(\mathcal {W}=(\mathcal {W}_t)_{t \ge 0}\) defined by
By Jacod and Shiryaev (2003), Proposition II.8.26, it can be checked that \(\mathcal {W}\) is an exponentially special semimartingale if and only if \(\zeta \in \mathcal {D}_X\) and \(\lambda \in \mathcal {D}_Z\). In this case, \(\mathcal {W}\) admits a unique exponential compensator, i.e., a predictable process of finite variation, denoted by \(\mathcal {K}=(\mathcal {K}_t)_{t \ge 0}\), such that \(\exp (\mathcal {W} - \mathcal {K})\) is a local martingale (see Kallsen and Shiryaev, 2002). The following lemma provides the explicit expression of \(\mathcal {K}\).
Lemma 2.7
Let (X, Z) be a \(\mathrm {CBITCL}(X_0, \Psi , \Phi , \Xi )\) process. Consider the process \(\mathcal {W}\) defined in (2.9), with \(\zeta \in \mathcal {D}_X\) and \(\lambda \in \mathcal {D}_Z\). Then, the exponential compensator \(\mathcal {K}\) of \(\mathcal {W}\) is given by
Proof
For brevity of presentation, we only give a sketch of the proof, referring to Fontana et al. (2022), Lemma 4.1 for full details. Since CBI-TCL processes are quasi-left-continuous, the exponential compensator \(\mathcal {K}\) can be explicitly computed in terms of the semimartingale differential characteristics of (X, Z) (see Kallsen and Shiryaev, 2002). In view of Proposition 2.3, the differential semimartingale characteristics of (X, Z) can be easily obtained from (2.3)-(2.4). Representation (2.10) then follows by standard computations. \(\square \)
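The form of the compensator can be anticipated by a generator argument: applying the generator of (X, Z) to the function \((x,z)\mapsto e^{\zeta x+\lambda z}\) produces the multiplicative factor \(\Psi (\zeta )+x\,(\Phi (\zeta )+\Xi (\lambda ))\). Assuming that \(\mathcal {W}\) is built from \(\zeta X+\lambda Z\) (possibly up to the constant \(\zeta X_0\), which does not affect the compensator), this heuristic suggests

$$\begin{aligned} \mathcal {K}_t = t\,\Psi (\zeta ) + Y_t\,\big (\Phi (\zeta )+\Xi (\lambda )\big ), \qquad Y_t=\int _0^tX_s\,\mathrm {d}s, \end{aligned}$$

where \(\Psi \), \(\Phi \) and \(\Xi \) are evaluated at the real arguments \(\zeta \in \mathcal {D}_X\) and \(\lambda \in \mathcal {D}_Z\). This is a sketch consistent with the affine structure of Proposition 2.5 and is not meant to replace the precise statement of Lemma 2.7.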
Fixing a time horizon \(\mathcal {T}< +\infty \), we can state the following Girsanov-type result for CBI-TCL processes, which will play a central role in the modeling framework developed in the next section. In the following statement, we denote by \(\mathcal {D}^{\circ }_X\) and \(\mathcal {D}_Z^{\circ }\) the interior of the sets \(\mathcal {D}_X\) and \(\mathcal {D}_Z\), respectively.
Theorem 2.8
Let (X, Z) be a \(\mathrm {CBITCL}(X_0, \Psi , \Phi , \Xi )\) process. Consider the process \(\mathcal {W}\) defined in (2.9), with \(\zeta \in \mathcal {D}^{\circ }_X\) and \(\lambda \in \mathcal {D}^{\circ }_Z\), and its exponential compensator \(\mathcal {K}\) given by (2.10). Then, the process \((\exp (\mathcal {W}_t - \mathcal {K}_t))_{t\in [0,\mathcal {T}]}\) is a martingale. Moreover, setting
defines a probability measure \(\mathbb {Q}^{\prime }\sim \mathbb {Q}\) under which (X, Z) remains a CBI-TCL process up to time \(\mathcal {T}\) with parameters \(\beta '\), \(\nu '\), \(b'\), \(\sigma '\), \(\pi '\), \(b'_Z\), \(\sigma '_Z\) and \(\gamma _Z'\) given in Table 1.
Proof
By definition of the exponential compensator, the process \(\exp (\mathcal {W}-\mathcal {K})\) is a strictly positive local martingale. The true martingale property of the process \((\exp (\mathcal {W}_t-\mathcal {K}_t))_{t\in [0,\mathcal {T}]}\) follows similarly as in Keller-Ressel and Mayerhofer (2015), Theorem 3.2, since \(\zeta \in \mathcal {D}^{\circ }_X\) and \(\lambda \in \mathcal {D}^{\circ }_Z\). The fact that (X, Z) is a CBI-TCL process up to time \(\mathcal {T}\) with parameters given in Table 1 under \(\mathbb {Q}^{\prime }\) is a consequence of Girsanov’s theorem together with Proposition 2.3 (see Fontana et al., 2022, Theorem 4.2 for full details). \(\square \)
3 Modeling of multiple currencies via CBI-TCL processes
In this section, we present our modeling framework for a multi-currency market. In Sect. 3.1, we introduce the main quantities to be modeled together with the specific requirements induced by absence of arbitrage and by the FX symmetries discussed in the Introduction. Sect. 3.2 contains the construction of the framework and the description of its most relevant features. In Sect. 3.3, Fourier techniques are applied to the pricing of currency options. We work on a filtered stochastic basis \((\Omega ,\mathcal {F},\mathbb {F},\mathbb {Q})\) and consider models defined up to a time horizon \(\mathcal {T}<+\infty \).
3.1 Definition of the multiple currency market
The FX market involves different economies, each of them associated to a specific currency. The \(i^\text {th}\) and \(j^\text {th}\) currencies are related by the spot FX rate process \(S^{i,j}\), representing the value of one unit of currency j measured in units of currency i. Our definition of the multiple currency market will make use of the following ingredients, denoting by \(N\in \mathbb {N}\), with \(N\ge 2\), the number of economies (i.e., currencies) considered:

(i)
\({\varvec{D}} = \{D^i;i=1,\ldots ,N\}\) is an \(\mathbb {R}^N_{>0}\)-valued process, with \(D^i\) representing the bank account of the \(i^\text {th}\) economy, for \(i=1,\ldots ,N\);

(ii)
\({\varvec{S}} = \{S^{i,j};i,j=1,\ldots ,N\}\) is an \(\mathbb {R}^{N \times N}_{>0}\)-valued process representing the spot FX rates between the N different currencies and such that \(S^{i,i}\equiv 1\), for all \(i=1,\ldots ,N\).
Definition 3.1
We say that the pair \(({\varvec{D}},{\varvec{S}})\) represents a multiple currency market if, for every \(i=1,\ldots ,N\), the following assets are traded in the \(i^\text {th}\) economy:

the bank account \(D^i\);

for every \(j=1,\ldots ,N\) with \(j \ne i\), the bank account of the \(j^\text {th}\) economy denominated in units of the \(i^\text {th}\) currency, namely \(S^{i,j}D^j\).
We aim at constructing models for multiple currency markets that respect the FX symmetries mentioned in the Introduction and satisfy absence of arbitrage in the sense of no free lunch with vanishing risk (NFLVR). Since \(({\varvec{D}},{\varvec{S}})\) are strictly positive processes, Delbaen and Schachermayer (1998), Theorem 1.1, implies that, for each \(i=1,\ldots ,N\), NFLVR holds in the \(i^\text {th}\) economy if and only if there exists a risk-neutral measure \(\mathbb {Q}^i\), i.e., a probability measure \(\mathbb {Q}^i\) equivalent to \(\mathbb {Q}\) such that \(S^{i,j}D^j/D^i\) is a local martingale under \(\mathbb {Q}^i\), for all \(j=1,\ldots ,N\). We then formulate the following definition, which extends Definition 1 of Escobar and Gschnaidtner (2018) to an FX market consisting of an arbitrary number N of currencies.
Definition 3.2
The multiple currency market \(({\varvec{D}}, {\varvec{S}})\) is said to be well-posed if the following hold:

(i)
no direct arbitrage: \(S^{j,i}=1/S^{i,j}\), for all \(i,j=1,\ldots ,N\);

(ii)
no triangular arbitrage: \(S^{i,j}=S^{i,k} \times S^{k,j}\), for all \(i,k,j=1,\ldots ,N\);

(iii)
there exists a risk-neutral measure \(\mathbb {Q}^i\) for the \(i^\text {th}\) economy, for all \(i=1,\ldots ,N\).
Besides the requirement of well-posedness, we are interested in multi-currency models that are coherent in the sense of Gnoatto (2017). This means that, if the FX rate process \(S^{i,j}\) belongs to a certain model class under \(\mathbb {Q}^i\), then also its reciprocal \(S^{j,i}\) belongs to the same model class under \(\mathbb {Q}^j\), where \(\mathbb {Q}^i\) and \(\mathbb {Q}^j\) are risk-neutral measures for the \(i^\text {th}\) and \(j^\text {th}\) economy, respectively. Coherence is a desirable property from the modeling perspective, since it ensures that the model retains its analytical tractability in all N different economies.
To achieve a well-posed as well as coherent multiple currency market, we will proceed as follows:

(1)
Adopting the artificial currency approach, we express each currency with respect to an artificial currency indexed by 0 and construct the artificial FX rates \(S^{0,i}\), for \(i=1,\ldots ,N\).

(2)
We define the FX rates \(S^{i,j}\), for all \(i,j=1,\ldots ,N\), by taking suitable ratios of the artificial FX rates \(S^{0,i}\), \(i=1,\ldots ,N\). Parts (i)-(ii) of Definition 3.2 are then satisfied by construction.

(3)
By relying on Theorem 2.8, we construct a risk-neutral measure \(\mathbb {Q}^i\), for every \(i=1,\ldots ,N\), under which the driving processes remain CBI-TCL processes, thus ensuring part (iii) of Definition 3.2 as well as the stability of the structure of the model (coherence).
3.2 Construction of the modeling framework
The construction of our modeling framework starts by modeling the N artificial FX rates \(S^{0,i}\), for \(i=1,\ldots ,N\), by means of a common family of CBI-TCL processes. To this effect, we assume that the stochastic basis \((\Omega ,\mathcal {F},\mathbb {F},\mathbb {Q})\) supports d mutually independent CBI-TCL processes \((X^k,Z^k)\), \(k=1,\ldots ,d\), defined through the corresponding extended Dawson-Li representations (2.3)-(2.4).^{Footnote 2} For each \(k=1,\ldots ,d\), we denote by \(\mathcal {D}_{X^k}\) and \(\mathcal {D}_{Z^k}\) the sets (2.7) and (2.8), respectively, associated to the CBI-TCL process \((X^k,Z^k)\).
For each \(i=1,\ldots ,N\), we introduce the following parameters:

\(r^i\in \mathbb {R}\), representing the risk-free short rate in the \(i^\text {th}\) economy and generating the bank account \(D^i_t:=\exp (r^i\,t)\), for all \(t\in [0,\mathcal {T}]\);

\(\zeta ^i=(\zeta ^i_1,\ldots ,\zeta ^i_d)\in \mathbb {R}^d\) such that \(\zeta ^i_k\in \mathcal {D}^{\circ }_{X^k}\), for all \(k=1,\ldots ,d\);

\(\lambda ^i=(\lambda ^i_1,\ldots ,\lambda ^i_d)\in \mathbb {R}^d\) such that \(\lambda ^i_k\in \mathcal {D}^{\circ }_{Z^k}\), for all \(k=1,\ldots ,d\).
In addition, for each \(i=1,\ldots ,N\) and \(k=1,\ldots ,d\), we denote by \(\mathcal {K}^{i,k}=(\mathcal {K}^{i,k}_t)_{t\in [0,\mathcal {T}]}\) the exponential compensator of the process \((\zeta ^i_k\,X^k + \lambda ^i_k\,Z^k)\), as characterized in Lemma 2.7.
Remark 3.3
We point out that the modeling framework developed in this section can be easily generalized to the case of stochastic interest rates. In particular, by allowing the interest rates \(r^i\), for \(i=1,\ldots ,N\), to be driven by the common family of CBI-TCL processes \((X^k,Z^k)\), \(k=1,\ldots ,d\), one can introduce dependence between the interest rates, the FX rates and their volatilities.
For each \(i=1,\ldots ,N\), we specify the artificial FX rate \(S^{0,i}\) as follows:
The artificial FX rates are modeling quantities that cannot be observed in reality. However, the parameters \(\zeta ^i_k\) and \(\lambda ^i_k\) will have a specific role in the dynamics of the actual FX rates. Indeed, \(\lambda ^i_k\) will measure the relative importance of the risk arising from the \(k^\text {th}\) time-changed Lévy process \(Z^k\), while \(\zeta ^i_k\) will measure the dependence between the \(k^\text {th}\) CBI process \(X^k\) and the \(i^\text {th}\) FX rate, as will become clear from Lemma 3.8 below and the following discussion.
Lemma 3.4
For each \(i=1,\ldots ,N\), the process \(S^{0,i}=(S^{0,i}_t)_{t\in [0,\mathcal {T}]}\) satisfies the following dynamics:
Moreover, the process \(S^{0,i}D^i=(S^{0,i}_tD^i_t)_{t\in [0,\mathcal {T}]}\) is a martingale on \((\Omega ,\mathcal {F},\mathbb {F},\mathbb {Q})\), for all \(i=1,\ldots ,N\).
Proof
Using specification (3.1) of \(S^{0,i}\), equation (3.2) follows from an application of Itô’s formula together with the extended Dawson-Li representation (2.3)-(2.4) of the process \((X^k,Z^k)\), for \(k=1,\ldots ,d\). The martingale property of \(S^{0,i}D^i\) follows from the independence of the CBI-TCL processes \((X^k,Z^k)\), for \(k=1,\ldots ,d\), together with the martingale property stated in Theorem 2.8. \(\square \)
We define the FX rate process \({\varvec{S}}=\{S^{i,j};i,j=1,\ldots ,N\}\) as follows:
This specification of \({\varvec{S}}\) ensures that the inversion and triangulation symmetries of FX rates (corresponding respectively to parts (i) and (ii) of Definition 3.2) are satisfied by construction, thereby completing steps (1) and (2) of the model construction outlined at the end of Sect. 3.1.
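The mechanism behind this statement can be checked numerically. The sketch below assumes the ratio convention \(S^{i,j} = S^{0,j}/S^{0,i}\), which is the one compatible with the local martingale argument in the proof of Corollary 3.5; the artificial rates are arbitrary positive numbers chosen for illustration, and the function name is hypothetical.

```python
import itertools
import math

def fx_matrix(artificial_rates):
    """Build S[i][j] = S0[j] / S0[i] from positive artificial FX rates."""
    return [[s_j / s_i for s_j in artificial_rates] for s_i in artificial_rates]

# Hypothetical values of S^{0,1}, S^{0,2}, S^{0,3} at a fixed time.
s0 = [1.37, 0.0092, 1.52]
S = fx_matrix(s0)
n = len(s0)

# Inversion: S^{j,i} = 1 / S^{i,j}
inversion_ok = all(math.isclose(S[j][i], 1.0 / S[i][j])
                   for i, j in itertools.product(range(n), repeat=2))

# Triangulation: S^{i,j} = S^{i,k} * S^{k,j}
triangulation_ok = all(math.isclose(S[i][j], S[i][k] * S[k][j])
                       for i, j, k in itertools.product(range(n), repeat=3))

print(inversion_ok, triangulation_ok)
```

Both checks hold for any choice of strictly positive artificial rates, up to floating-point rounding, since the artificial rate of the intermediate currency cancels in each product.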
The next corollary describes a class of probability measures that leave invariant the structure of our multi-currency model driven by CBI-TCL processes. This result plays a crucial role in ensuring absence of arbitrage and coherence of our framework.
Corollary 3.5
For each \(i=1,\ldots ,N\), setting
defines a probability measure \(\mathbb {Q}^i\sim \mathbb {Q}\) under which \((X^k,Z^k)\), for \(k=1,\ldots ,d\), remain mutually independent CBI-TCL processes (up to time \(\mathcal {T}\)) with parameters given in Table 2. Moreover, for each \(i=1,\ldots ,N\), the probability measure \(\mathbb {Q}^i\) is a risk-neutral measure for the \(i^\text {th}\) economy.
Proof
In view of Lemma 3.4, \(\mathbb {Q}^i\) is well-defined by (3.4) as a probability measure equivalent to \(\mathbb {Q}\), for each \(i=1,\ldots ,N\). The fact that \((X^k,Z^k)\), for all \(k=1,\ldots ,d\), remains a CBI-TCL process under \(\mathbb {Q}^i\) with parameters given in Table 2 follows from Theorem 2.8 together with the independence of the processes \((X^k,Z^k)\), for \(k=1,\ldots ,d\). Moreover, the independence of the processes \((X^k,Z^k)\), for \(k=1,\ldots ,d\), under \(\mathbb {Q}\) together with the structure of the probability \(\mathbb {Q}^i\) defined in (3.4) implies that the mutual independence is preserved under \(\mathbb {Q}^i\). Finally, for all \(i,j=1,\ldots ,N\), in view of (3.3) and (3.4), the process \(S^{i,j}D^j/D^i\) is a local martingale under \(\mathbb {Q}^i\) if and only if \(S^{0,j}D^j\) is a local martingale under \(\mathbb {Q}\). Since the latter property holds by Lemma 3.4, the proof is complete. \(\square \)
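The equivalence used in the last step of the proof is a Bayes-rule computation. As a sketch, assume that (3.3) is the ratio \(S^{i,j}=S^{0,j}/S^{0,i}\) and that the density process of \(\mathbb {Q}^i\) with respect to \(\mathbb {Q}\) is \(S^{0,i}_tD^i_t/S^{0,i}_0\), which is a martingale by Lemma 3.4; both forms are assumptions made here for illustration. Then

$$\begin{aligned} \frac{S^{i,j}_t\,D^j_t}{D^i_t}\cdot \frac{S^{0,i}_t\,D^i_t}{S^{0,i}_0} = \frac{S^{0,j}_t\,D^j_t}{S^{0,i}_0}, \qquad t\in [0,\mathcal {T}], \end{aligned}$$

so that \(S^{i,j}D^j/D^i\) is a local martingale under \(\mathbb {Q}^i\) exactly when its product with the density process, which is proportional to \(S^{0,j}D^j\), is a local martingale under \(\mathbb {Q}\).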
Remark 3.6
Financial models driven by CBITCL processes are inherently incomplete and, therefore, there exist infinitely many riskneutral measures beyond the probability measures considered in Corollary 3.5. Our approach is motivated by the preservation of the structure of the model under each riskneutral measure \(\mathbb {Q}^i\), for \(i=1,\ldots ,N\). In line with the martingale approach to financial modeling, the parameters characterizing the family \(\{\mathbb {Q}^i; i=1,\ldots ,N\}\) are determined by calibration to market data (see Sect. 4 for a specific application). We refer to Eberlein and Kallsen (2020) for an overview of several wellknown hedging approaches in incomplete markets driven by jump processes.
By combining (3.1) and (3.3), we obtain the following representation of FX rates:
Note that all FX rates \(S^{i,j}\), for \(i,j=1,\ldots ,N\), share the same modeling structure and that the driving processes \((X^k,Z^k)\), for \(k=1,\ldots ,d\), remain mutually independent CBI-TCL processes under each risk-neutral measure \(\mathbb {Q}^i\), as a consequence of Corollary 3.5. In particular, the functional form of the process \(S^{i,j}\) under \(\mathbb {Q}^i\) is identical to that of its reciprocal \(S^{j,i}\) under \(\mathbb {Q}^j\).
We have thus proved the next theorem, which shows that we have constructed a well-posed and coherent multiple currency market, in line with the modeling objectives set in Sect. 3.1.
Theorem 3.7
The multiple currency market \(({\varvec{D}},{\varvec{S}})\) is well-posed and coherent.
For each \(i=1,\ldots ,N\), the Radon-Nikodym density \(\mathrm {d}\mathbb {Q}^i/\mathrm {d}\mathbb {Q}\) defined in (3.4) admits the following representation in terms of the sources of randomness driving the extended Dawson-Li representation (2.3)-(2.4) of the CBI-TCL processes \((X^k,Z^k)\), for \(k=1,\ldots ,d\):
By Girsanov’s theorem, the processes \(B^{i,k,1}=(B^{i,k,1}_t)_{t\in [0,\mathcal {T}]}\) and \(B^{i,k,2}=(B^{i,k,2}_t)_{t\in [0,\mathcal {T}]}\) defined by
are independent Brownian motions under \(\mathbb {Q}^i\). Moreover, \(N_0^k(\mathrm {d}t, \mathrm {d}x)\), \(N_1^k(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x)\), \(N_2^k(\mathrm {d}t, \mathrm {d}u, \mathrm {d}x)\) are Poisson random measures under \(\mathbb {Q}^i\) with compensated measures
Lemma 3.8
For each \(i,j=1,\ldots ,N\), the process \(S^{i,j}\) satisfies the following dynamics under \(\mathbb {Q}^i\):
with the processes \(B^{i,k,1}\), \(B^{i,k,2}\) and the random measures \(\widetilde{N}^{i,k}_0\), \(\widetilde{N}^{i,k}_1\), \(\widetilde{N}^{i,k}_2\) defined in (3.6)-(3.7).
Proof
The claim follows from (3.3) by applying Itô’s product rule together with the dynamics (3.2) of the artificial FX rates \(S^{0,i}\) and \(S^{0,j}\), making use of the notation introduced above. \(\square \)
In particular, the dynamics of \(S^{i,j}\) under \(\mathbb {Q}^i\) are functionally symmetric with respect to the dynamics of \(S^{j,i}\) under \(\mathbb {Q}^j\), for all \(i,j=1,\ldots ,N\). This is further evidence of the coherence of our modeling framework, in the sense of Gnoatto (2017).
As can be seen from equation (3.8), in our framework FX rates possess stochastic dynamics that can capture the most relevant risk characteristics of the FX market, as discussed in the Introduction. More specifically, we highlight the following features (for simplicity of presentation, in the following discussion we consider a one-dimensional CBI process X):

Stochastic volatility: for all FX rates, both the diffusive volatility and the jump volatility are stochastic and depend on the current level of the CBI process X. In particular, since X is a self-exciting process, this induces volatility clustering effects in the FX rates.

Jump risk: all FX rates are affected by three different jump terms, corresponding to the three integrals appearing on the right-hand side of (3.8): the first integral results from the immigration of the CBI process X, the second integral is related to the branching property of X and the third integral is generated by the Lévy process defining the process Z, with a jump intensity proportional to X. In particular, the last two integrals are affected by the self-exciting property of X and can generate jump clusters in the dynamics of FX rates. The parameters \(\zeta ^j-\zeta ^i\) and \(\lambda ^j-\lambda ^i\) control the magnitude of these effects.

Stochastic dependence of FX rates: the quadratic covariation among different FX rates exhibits a rich stochastic structure, with the presence of common jumps and with a jump intensity which is related to the current level of the self-exciting CBI process X.

Stochastic skewness: the quadratic covariation \([S^{i,j},X]\) also has a rich stochastic structure. In turn, this induces a stochastic instantaneous correlation between each FX rate and its stochastic volatility. As explained in Christoffersen et al. (2009) and Da Fonseca and Grasselli (2011), the presence of stochastic correlation is responsible for the existence of stochastic variations in the skew of FX implied volatilities.
3.3 Currency option pricing
The modeling framework constructed in Sect. 3.2 is coherent and, therefore, retains full analytical tractability under all risk-neutral measures considered in Corollary 3.5. In particular, we can derive an explicit representation of the characteristic function of each FX rate. In the following, we denote by \(\mathbb {E}^i\) the expectation under \(\mathbb {Q}^i\), for each \(i=1,\ldots ,N\).
Lemma 3.9
For all \(i,j=1,\ldots ,N\), the characteristic function of the process \((\log S_t^{i,j})_{t\in [0,\mathcal {T}]}\) under \(\mathbb {Q}^i\) is given by
for all \((u,t) \in \mathbb {R}\times [0,\mathcal {T}]\), where \((\mathcal {U}^{i,k}(\cdot , u_1^k, u_2^k, u_3^k), \mathcal {V}^{i,k}(\cdot , u_1^k, u_2^k, u_3^k))\) is the unique solution to system (2.5)-(2.6) associated to \((X^k, Z^k)\) under \(\mathbb {Q}^i\) with
Proof
In view of (3.5) and (2.10), we have that
where we have used the independence of the CBI-TCL processes \((X^k,Z^k)\), for \(k=1,\ldots ,d\), under \(\mathbb {Q}^i\) (see Corollary 3.5) and \(Y_t^k = \int _0^t {X_s^k\,\mathrm {d}s}\), for all \(k=1,\ldots ,d\). Formula (3.9) then follows from an application of the affine transform formula given in Proposition 2.5. \(\square \)
The availability of an explicit description of the characteristic function of each FX rate allows for currency option pricing via Fourier techniques. We adopt the COS method of Fang and Oosterlee (2009), which has the advantage of using only the characteristic function of the process, without requiring any domain extension as in other Fourier pricing methods. In our setting, such domain extensions would necessitate additional constraints on the parameters, potentially affecting the calibration results. We consider a European Call option in the \(i^\text {th}\) economy written on the FX rate \(S^{i,j}\), with maturity \(T \le \mathcal {T}\) and strike \(K > 0\). We assume that the distribution of \(\log S^{i,j}_T\) under \(\mathbb {Q}^i\) admits a density^{Footnote 3}. Since the multiple currency market is well-posed, we can apply risk-neutral valuation under \(\mathbb {Q}^i\) to compute the arbitrage-free price C(T, K) at \(t=0\) of the option:
where \(f_T^{i,j}\) represents the density function of \(\log (S_T^{i,j}/K)\) under \(\mathbb {Q}^i\). To compute the integral in (3.10), we introduce a suitably chosen truncation range \([a, b] \subset \mathbb {R}\) such that C(T, K) can be approximated with good accuracy by
The resulting pricing formula is stated in the next proposition, which follows by the same arguments presented in Section 2.1 of Fang and Oosterlee (2009).
Proposition 3.10
The arbitrage-free price C(T, K) of a European call option written on the FX rate \(S^{i,j}\), with maturity \(T \le \mathcal {T}\) and strike \(K > 0\), can be approximated by
where \(\delta _0\) denotes the Kronecker delta at 0, \(M \in \mathbb {N}\), \(B_0 = (e^{\,b} - 1 - b)/(b-a)\), and where
Remark 3.11
(1) In order to ensure the accuracy of formula (3.11), one needs to specify the truncation range [a, b] properly. Following Section 5.1 of Fang and Oosterlee (2009), a suitable specification is the following:
with \(L = 10\) and where \(c_n\), for \(n = 1, 2, 4\), represents the nth cumulant of \(\log (S_T^{i,j}/K)\). In our framework, the cumulants are not available in closed form. However, they can be approximated by using finite differences since, by definition, they are given by the derivatives at zero of the cumulantgenerating function of \(\log (S_T^{i,j}/K)\) (see Fang and Oosterlee 2009, Appendix A for further details).
(2) As explained in Section 3.3 of Fang and Oosterlee (2009), formula (3.11) can be readily extended to a multi-strike setting, which is practically important when one needs to price several options with the same maturity but associated to different strikes (e.g., for model calibration). We refer to Szulda (2021), Remark 5.13, for a description of the multi-strike implementation of formula (3.11).
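As an illustration of the pricing procedure described above, the following Python sketch implements the generic COS approximation of Fang and Oosterlee (2009) for a call payoff, together with finite-difference cumulants and the truncation range with \(L=10\). It is not the authors' implementation: the function names are hypothetical, a constant domestic rate `r_dom` is assumed, and the argument `phi` stands for the characteristic function of \(\log (S_T^{i,j}/K)\) under \(\mathbb {Q}^i\), which in the model would be obtained from Lemma 3.9. A Gaussian characteristic function is used here only as a stand-in.

```python
import cmath
import math


def cumulants_fd(phi, h=0.05):
    """Approximate c1, c2, c4 by central finite differences of log(phi) at 0."""
    f = lambda u: cmath.log(phi(u))
    f0, fp, fm, fpp, fmm = f(0.0), f(h), f(-h), f(2 * h), f(-2 * h)
    # log phi(u) = sum_n c_n (iu)^n / n!, so derivatives carry powers of i
    c1 = ((fp - fm) / (2 * h)).imag
    c2 = -((fp - 2 * f0 + fm) / h**2).real
    c4 = ((fpp - 4 * fp + 6 * f0 - 4 * fm + fmm) / h**4).real
    return c1, c2, c4


def truncation_range(c1, c2, c4, L=10.0):
    """Range [a, b] centered at c1; abs() guards against rounding noise in c4."""
    half = L * math.sqrt(c2 + math.sqrt(abs(c4)))
    return c1 - half, c1 + half


def cos_call_price(phi, K, T, r_dom, a, b, M=256):
    """COS approximation of a call price; requires a < 0 < b."""
    total = 0.0
    for k in range(M):
        w = k * math.pi / (b - a)
        # Cosine coefficients of the payoff K (e^y - 1)^+ on [0, b]
        chi = (math.cos(w * (b - a)) * math.exp(b) - math.cos(-w * a)
               + w * math.sin(w * (b - a)) * math.exp(b)
               - w * math.sin(-w * a)) / (1.0 + w * w)
        psi = b if k == 0 else (math.sin(w * (b - a)) - math.sin(-w * a)) / w
        V = 2.0 / (b - a) * K * (chi - psi)
        term = (phi(w) * cmath.exp(-1j * w * a)).real * V
        total += 0.5 * term if k == 0 else term  # first term is halved
    return math.exp(-r_dom * T) * total


# Stand-in example: log(S_T/K) Gaussian with mean m and variance s2
m, s2 = 0.02, 0.01
gauss_phi = lambda u: cmath.exp(1j * u * m - 0.5 * s2 * u * u)
a, b = truncation_range(*cumulants_fd(gauss_phi))
example_price = cos_call_price(gauss_phi, K=1.0, T=1.0, r_dom=0.01, a=a, b=b)
```

For \(k=0\) the halved coefficient reduces to \(K(e^{\,b}-1-b)/(b-a)\), which is the term entering through \(B_0\). In a multi-strike setting, the values of `phi` on the frequency grid can be computed once and reused for all strikes.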
4 Model calibration
In this section, we calibrate a simple specification of our modeling framework to market data on a currency triangle. We consider a model driven by tempered \(\alpha \)-stable CBI processes and CGMY Lévy processes (see Fontana et al., 2021) and propose two different calibration methods, one based on standard techniques and one relying on a deep learning algorithm. The market data are described in Sect. 4.1, while the two calibration methods are presented in Sect. 4.2. Sect. 4.3 contains a description of the model specification and in Sect. 4.4 we report the calibration results.
4.1 FX market data
We consider market data on three FX implied volatility surfaces: EURUSD, EURJPY and USDJPY (according to the FOR-DOM convention, the second currency of each pair represents the domestic currency). The quoting convention for FX implied volatilities differs from the case of equity markets, since implied volatilities are quoted in terms of deltas and maturities instead of strikes and maturities. Moreover, excluding ATM options, individual volatilities are not directly quoted: the market practice consists in quoting certain combinations of contracts (risk-reversals and butterflies), from which implied volatilities for single contracts in terms of maturities and deltas have to be recovered.
For the three volatility surfaces, we consider a common set of maturities, ranging from one week to one year (1, 2 weeks, 1, 3, 6 months, and 1 year, representing the most liquid part of the implied volatility surface). We retrieved from Bloomberg the following market quotes as of April 15, 2020: ATM implied volatility, \(10\Delta \) and \(25\Delta \) risk-reversals^{Footnote 4} and butterflies. For \(25\Delta \), we have
from which we deduce
and similarly for \(10\Delta \). For each currency pair and for each maturity, we have the implied volatilities of 5 contracts at our disposal. Market data not corresponding to the 5 points is interpolated.
In order to reconstruct observed market prices, we also retrieved from Bloomberg FX spots and FX forward points, which enable us to build FX forward curves by adding the spot and the forward points. Equipped with such data, we have all the information needed to convert deltas into strikes and recover implied volatilities for single contracts in terms of maturities and strikes.^{Footnote 5}
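The data preparation described in this subsection can be sketched in Python as follows. The sketch assumes the common market convention in which the risk-reversal is the difference between the call and put volatilities and the butterfly is their average minus the ATM volatility, and it converts deltas into strikes under the assumption of forward deltas without premium adjustment. Both conventions, as well as the function names, are assumptions made for illustration; the actual conversion in the paper is performed with the Strata library and may follow different delta conventions.

```python
import math
from statistics import NormalDist


def smile_vols(atm, rr, bf):
    """Call and put volatilities at one delta pillar from ATM, RR and BF quotes.

    Assumed convention: rr = vol_call - vol_put, bf = (vol_call + vol_put)/2 - atm.
    """
    vol_call = atm + bf + 0.5 * rr
    vol_put = atm + bf - 0.5 * rr
    return vol_call, vol_put


def strike_from_forward_delta(delta, is_call, forward, vol, T):
    """Strike matching a forward delta (no premium adjustment assumed).

    Forward delta: delta = phi * N(phi * d1), with phi = +1 (call) or -1 (put)
    and d1 = (log(F / K) + 0.5 * vol^2 * T) / (vol * sqrt(T)).
    """
    phi = 1.0 if is_call else -1.0
    d1 = phi * NormalDist().inv_cdf(phi * delta)
    return forward * math.exp(-d1 * vol * math.sqrt(T) + 0.5 * vol * vol * T)


# Hypothetical quotes for one maturity (not market data from the paper)
vol_25c, vol_25p = smile_vols(atm=0.10, rr=0.02, bf=0.005)
k_25c = strike_from_forward_delta(0.25, True, forward=1.09, vol=vol_25c, T=0.5)
k_25p = strike_from_forward_delta(-0.25, False, forward=1.09, vol=vol_25p, T=0.5)
```

Applying the same two steps to the \(10\Delta \) quotes and adding the ATM point yields the five strikes per maturity used in the calibration.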
4.2 Two calibration methods
Let p denote a vector of model parameters, belonging to some set of admissible parameters \(\mathcal {P}\). Let \(\# T\) be the number of maturities and \(\# K\) be the number of strikes that we consider. For simplicity of presentation, we assume that all volatility surfaces have the same strike range and the same number of strikes. In general, a calibration to the implied volatilities on a set of N currencies consists in solving the following minimization problem:
where \(\sigma _{imp}^{mkt}(u,T_i,K_j)\) denotes the implied volatility observed on the market for currency u, maturity \(T_i\), and strike \(K_j\), while \(\sigma _{imp}^{mod(p)}(u,T_i,K_j)\) denotes its modelimplied counterpart for a given vector of parameters \(p \in \mathcal {P}\).
We now present two calibration methods. The first one, to which we refer as standard calibration, utilizes pricing formula (3.11) to compute model prices for a given choice of model parameters. Such prices are then converted into model-implied volatilities and inserted into (4.1). This gives rise to a multidimensional function \(\Sigma :\mathcal {P}\rightarrow \mathbb {R}^{N\times \# T\times \# K}\) such that, for all \(p\in \mathcal {P}\), we have \(\Sigma (p)_{(u,i,j)} = \sigma _{imp}^{mod(p)}(u,T_i,K_j)\), for every \(u=1,\ldots ,N\), \(i=1,\ldots ,\# T\), and \(j=1,\ldots ,\# K\).
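The conversion of a model price into a model-implied volatility can be illustrated by the following Python sketch, which inverts the Garman-Kohlhagen call price by bisection. This is one possible way of performing the inversion, not necessarily the one used by the authors; the function names and the assumption of constant domestic and foreign rates `r_dom` and `r_for` are introduced here for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gk_call(spot, strike, T, r_dom, r_for, vol):
    """Garman-Kohlhagen price of a European call on an FX rate."""
    sd = vol * math.sqrt(T)
    d1 = (math.log(spot / strike) + (r_dom - r_for) * T) / sd + 0.5 * sd
    d2 = d1 - sd
    return (spot * math.exp(-r_for * T) * norm_cdf(d1)
            - strike * math.exp(-r_dom * T) * norm_cdf(d2))


def implied_vol(price, spot, strike, T, r_dom, r_for,
                lo=1e-6, hi=5.0, tol=1e-12):
    """Implied volatility by bisection (the call price is increasing in vol)."""
    p_lo = gk_call(spot, strike, T, r_dom, r_for, lo)
    p_hi = gk_call(spot, strike, T, r_dom, r_for, hi)
    if not p_lo <= price <= p_hi:
        raise ValueError("price outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gk_call(spot, strike, T, r_dom, r_for, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to each model price on the grid \((u, T_i, K_j)\) produces the entries \(\Sigma (p)_{(u,i,j)}\) that enter the least-squares objective.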
The second calibration method, to which we refer as deep calibration, adopts the two-step approach developed by Horvath et al. (2021) for the solution of (4.1). We proceed as follows.

Grid-based implicit training: the purpose of this step is to approximate the nonlinear function \(\Sigma \) by a fully connected feed-forward neural network \(\mathcal {N}^w:\mathcal {P}\rightarrow \mathbb {R}^{N\times \# T\times \# K}\) (see Horvath et al. 2021, Definition 1), where w denotes a vector of network parameters (typically weights and biases). We divide this step into two substeps:

(1)
We generate a training set \(\{(p_n, \Sigma (p_n))\}_{n=1,\ldots ,N_{train}}\) of size \(N_{train}\), where each vector of parameters \(p_n\) is generated randomly by means of a standard random generator (suitable adjustments can be made to guarantee that parameter restrictions are satisfied), and where we have fixed the grid \((u, T_i, K_j)\), \(u=1,\ldots ,N\), \(i=1,\ldots ,\# T\), and \(j=1,\ldots ,\# K\), throughout the generation (hence the term “grid-based”).

(2)
We solve the following minimization problem called “training” of the neural network:
$$\begin{aligned} \min _{w}\sum _{n=1}^{N_{train}}\sum _{u=1}^N\sum _{i = 1}^{\# T}\sum _{j =1}^{\# K}\Bigl (\Sigma (p_n)_{(u,i,j)}-\mathcal {N}^w(p_n)_{(u,i,j)}\Bigr )^2, \end{aligned}\tag{4.2}$$
whose solution is an optimal vector of network parameters \(\widehat{w}\) such that the neural network \(\mathcal {N}:= \mathcal {N}^{\widehat{w}}\) best approximates the observations \(\{\Sigma (p_n)\}_{n=1,\ldots , N_{train}}\). Notice that \(\widehat{w}\) depends on the grid that we have fixed, thus explaining the term “implicit”.
Deterministic calibration: we rewrite (4.1) with the trained neural network \(\mathcal {N}\) as follows:
$$\begin{aligned} \min _{p\in \mathcal {P}}\sum _{u=1}^N\sum _{i = 1}^{\# T}\sum _{j =1}^{\# K}\left( \sigma _{imp}^{mkt}(u,T_i,K_j)-\mathcal {N}(p)_{(u,i,j)}\right) ^2. \end{aligned}\tag{4.3}$$
Following Horvath et al. (2021), we adopt the following neural network architecture:

3 hidden layers with 30 nodes on each;

\(N =3\) surfaces, all sharing the same maturity range of size \(\# T = 6\) and the same number of strikes \(\# K = 5\). This yields an output layer of \(3 \times 6 \times 5 = 90\) nodes. The size of the input layer is simply the number of model parameters;

On the input and hidden layers, we employ the Exponential Linear Unit (ELU) activation function. The output layer is in turn equipped with the Sigmoid function.
Figure 2 provides a visualization of the neural network architecture.
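The architecture listed above can be sketched as a plain forward pass in Python. This is an illustrative sketch only and not the Deeplearning4j network used for the reported results: the weights are random rather than trained, the number of inputs `n_in` is a hypothetical placeholder for the number of model parameters, and the ELU activation is applied to the three hidden layers.

```python
import math
import random


def elu(x):
    """Exponential Linear Unit with unit scale."""
    return x if x > 0.0 else math.exp(x) - 1.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, hidden=(30, 30, 30), n_out=90, seed=0):
    """Random (untrained) weights for a fully connected feed-forward network."""
    rng = random.Random(seed)
    sizes = [n_in, *hidden, n_out]
    layers = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(m)
        weights = [[rng.gauss(0.0, scale) for _ in range(m)] for _ in range(n)]
        biases = [0.0] * n
        layers.append((weights, biases))
    return layers


def forward(layers, p):
    """Map a parameter vector p to 90 normalized implied volatilities."""
    x = list(p)
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        act = sigmoid if idx == len(layers) - 1 else elu
        x = [act(v) for v in z]
    return x


def reshape_output(flat, n_surf=3, n_mat=6, n_strike=5):
    """Arrange the flat output as surfaces x maturities x strikes."""
    return [[[flat[(u * n_mat + i) * n_strike + j] for j in range(n_strike)]
             for i in range(n_mat)] for u in range(n_surf)]
```

Since the Sigmoid output lies in (0, 1), the target implied volatilities have to be normalized to this range before training and mapped back afterwards.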
For the training step, we start with the random generation of a training set of size \(N_{train} = 10{,}000\). After normalization, we proceed with the training of the neural network by solving (4.2). The common practice is to use a stochastic optimization algorithm based on “mini-batch” gradient descent (see Goodfellow et al., 2016), whose updater can be specified following the Adam scheme (see Kingma and Ba, 2017). We set the mini-batch size to 32 and the number of epochs to 150 with potential early stopping^{Footnote 6}.
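To make the training step concrete, the following Python sketch applies mini-batch gradient descent with the Adam update of Kingma and Ba to a toy least-squares problem, namely fitting a slope and an intercept. The toy problem replaces the neural network of (4.2) purely for brevity; the function name, the learning rate and the data are assumptions for illustration, while the mini-batch size of 32 and the 150 epochs mirror the values quoted above. Early stopping is omitted.

```python
import math
import random


def adam_minibatch_fit(xs, ys, epochs=150, batch_size=32, lr=0.05,
                       beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Fit y = w[0] * x + w[1] by mini-batch least squares with Adam updates."""
    rng = random.Random(seed)
    w = [0.0, 0.0]          # slope and intercept
    m = [0.0, 0.0]          # first-moment estimates
    v = [0.0, 0.0]          # second-moment estimates
    t = 0                   # update counter used for bias correction
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)    # new random mini-batches at every epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            g = [0.0, 0.0]  # gradient of the mean squared error on the batch
            for n in batch:
                err = w[0] * xs[n] + w[1] - ys[n]
                g[0] += 2.0 * err * xs[n] / len(batch)
                g[1] += 2.0 * err / len(batch)
            t += 1
            for k in range(2):
                m[k] = beta1 * m[k] + (1.0 - beta1) * g[k]
                v[k] = beta2 * v[k] + (1.0 - beta2) * g[k] ** 2
                m_hat = m[k] / (1.0 - beta1 ** t)
                v_hat = v[k] / (1.0 - beta2 ** t)
                w[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy data on a fixed grid (hypothetical, noise-free)
xs = [-1.0 + 2.0 * n / 255.0 for n in range(256)]
ys = [2.0 * x + 1.0 for x in xs]
slope, intercept = adam_minibatch_fit(xs, ys)
```

In the actual training problem the two-dimensional vector `w` is replaced by the full vector of network weights and biases, and the gradient is obtained by backpropagation.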
4.3 CBITCL specification
We consider a specification of the modeling framework described in Sect. 3.2 driven by two independent CBI-TCL processes \((X^k,Z^k)\), \(k=1,2\), where

(i)
\(X^1\) and \(X^2\) are tempered \(\alpha \)-stable CBI processes, as defined in Fontana et al. (2021);

(ii)
the Lévy processes \(L^1\) and \(L^2\) generating the processes \(Z^1\) and \(Z^2\) (see Definition 2.2), respectively, are CGMY processes, as introduced in Carr et al. (2002).
We recall from Fontana et al. (2021) that a CBI process \(X=(X_t)_{t\in [0,\mathcal {T}]}\) is said to be tempered \(\alpha \)-stable if its immigration mechanism (2.1) reduces to \(\Psi (x)=\beta x\) and the measure \(\pi \) in the branching mechanism (2.2) corresponds to the Lévy measure of a spectrally positive tempered \(\alpha \)-stable compensated Lévy process. More specifically, we set
where \(\eta >0\), \(\theta \ge 0\), \(\alpha \in (-\infty ,2)\) (restricted to \(\alpha \in (1,2)\) if \(\theta =0\)) and \(C_{\alpha }\) is a normalization constant. This family of processes represents the tempered version of \(\alpha \)-stable CBI processes, which have been successfully applied in finance in recent years (see, e.g., Jiao et al., 2017; 2019; 2021). The parameter \(\alpha \) is referred to as the stability index and determines the jump behavior of the process X (see Fontana et al. 2021, Sect. 3.1):

if \(\alpha <0\), then X has jumps of finite activity and finite variation;

if \(\alpha \in [0,1)\), then X has jumps of infinite activity and finite variation;

if \(\alpha \in [1,2)\), then X has jumps of infinite activity and infinite variation.
In the present setting, we shall consider the case \(\alpha \in (1,2)\) and specify the normalization constant as \(C_{\alpha }=1/\Gamma (-\alpha )\). The jumps of the process X are tempered exponentially depending on the value of the parameter \(\theta \), while the parameter \(\eta \) serves as a volatility coefficient controlling the jump volatility of the process X. The following lemma provides the explicit representation of the branching mechanism \(\Phi \) of a tempered \(\alpha \)-stable CBI process X (we refer to Szulda, 2021, for a proof). We denote by \(\Gamma \) the Gamma function extended to \(\mathbb {R}\setminus \mathbb {Z}_-\) (see Lebedev, 1972).
Lemma 4.1
For a tempered \(\alpha \)-stable CBI process X with \(\eta > 0\), \(\theta \ge 0\), \(C_{\alpha }=1/\Gamma (-\alpha )\) and \(\alpha \in (1,2)\), the set \(\mathcal {D}_X\) defined in (2.7) is given by \(\mathcal {D}_X=(-\infty ,\theta /\eta ]\). Moreover, the branching mechanism \(\Phi \) is given by
Let us also recall from Carr et al. (2002) that a Lévy process \(L=(L_t)_{t\in [0,\mathcal {T}]}\) is of CGMY type if its Lévy measure \(\gamma \) is given by
where we fix \(C_L=1/\Gamma (-Y)\). The parameter \(G > 0\) tempers the downward jumps of L, while \(M > 0\) tempers the upward jumps, and \(Y \in (1,2)\) controls the local behavior of L in a similar way to the parameter \(\alpha \) above. We recall that the Lévy exponent \(\Xi \) of a CGMY process is of the form
for \(\beta \in \mathbb {R}\). It can be easily checked that, in the case of a CGMY process, the set \(\mathcal {D}_Z\) defined in (2.8) is given by \(\mathcal {D}_Z=[-G,M]\). The Lévy exponent (4.4) then takes the following explicit form:
As discussed in Sect. 3.3, the pricing of currency options requires the transformation of the model under \(\mathbb {Q}^i\), for \(i=1,2,3\), where each \(\mathbb {Q}^i\) represents the risk-neutral measure associated to the \(i^\text {th}\) economy and is given by (3.4). In the present model specification, Corollary 3.5 directly implies the following result, which shows that not only the general CBI-TCL structure, but also the tempered \(\alpha \)-stable property of \(X^k\) and the CGMY structure of \(L^k\), for \(k=1,2\), are preserved.
Corollary 4.2
Under the model specification considered in this section, let \(\mathbb {Q}^i\) be the probability measure defined in (3.4), for each \(i=1,\ldots ,N\). Then, the processes \((X^k,Z^k)\), \(k=1,2\), remain independent CBI-TCL processes under \(\mathbb {Q}^i\) such that

(i)
\(X^k\) is a tempered \(\alpha \)-stable CBI process with tempering parameter \(\theta ^{i,k}=\theta ^k-\zeta ^i_k\eta ^k\);

(ii)
\(Z^k=L^k_{Y^k}\), where \(L^k\) is a CGMY process with tempering parameters \(G^{i,k}=G^k+\lambda ^i_k\) and \(M^{i,k}=M^k-\lambda ^i_k\).
Moreover, the drift terms \(b^{i,k}\) and \(b_Z^{i,k}\) given in Table 2 can be explicitly computed as follows (see Szulda, 2021, Chapter 2 for further details), for \(i=1,\ldots ,N\) and \(k=1,2\):
4.4 Calibration results
To solve (4.1) and (4.3), we use the Levenberg–Marquardt optimizer of the open-source Java library Finmath^{Footnote 7}. We perform standard and deep calibrations. For the standard one, we obtain a root-mean-square error of 0.07557 in 709.977 seconds. Fig. 3 shows a satisfactory fit that slightly worsens for longer maturities. The deep calibration outperforms the standard one, achieving a root-mean-square error of 0.04092 in 0.269 seconds. The better quality of the fit can be seen from Fig. 4, where we can observe an improvement for longer maturities. Moreover, the execution time of the deep calibration is much smaller. However, one should take into account the time required for the training step, which may last up to several hours.
The calibrated values of the parameters are reported in Table 3. In particular, we can notice that the calibrated values of \(\alpha _1\) and \(\alpha _2\) are rather close to 1, indicating the potential presence of jump clustering phenomena (see Fontana et al. 2021, Section 3.1 for a detailed discussion of this aspect and for analogous empirical evidence in multi-curve interest rate markets). We can also observe that the differences \(\zeta _{\mathrm {EUR}}-\zeta _{\mathrm {USD}}\), \(\zeta _{\mathrm {EUR}}-\zeta _{\mathrm {JPY}}\), \(\zeta _{\mathrm {USD}}-\zeta _{\mathrm {JPY}}\) show evidence of moderate dependence between the FX rates considered and their volatility, in line with the findings of Ballotta and Morico (2018). By inspecting the calibrated values of the parameters of the tempered \(\alpha \)-stable CBI processes, we can also notice a non-trivial contribution from the self-exciting jumps.
It is interesting to remark that the calibrated values of the parameters are stable across the two types of calibration. This is in accordance with the order of the two calibration exercises: first, we performed the deep calibration, where the initial guess was generated randomly by employing the same random generator that we used for the generation of the training set. We then used the optimal set of parameters obtained from this first calibration as the initial guess of the standard calibration. The fact that the output of the standard calibration is in line with that of the deep calibration provides us with a way of validating, in the present context, the deep calibration.
In order to assess the importance of allowing for jumps in the CBI-TCL specification, we compared the specification described in Sect. 4.3 with a continuous-path model where the Lévy processes \(L^1\) and \(L^2\) are simply given by two independent Brownian motions and the CBI processes \(X^1\) and \(X^2\) are standard square-root diffusions. This simplified model results in a Heston-type model. For the comparison, we calibrated both models to the same set of market implied volatilities, employing the standard calibration technique described in Sect. 4.2. Due to the much simpler structure of the model (in particular, of the associated Riccati equations), the calibration of the Heston-type model required 311 seconds, while the calibration of the CBI-TCL model described in Sect. 4.3 required 703 seconds. However, the Heston-type model exhibits a worse fit to market data, achieving an RMSE of 0.1236. We observe that, in spite of the exclusion of the jump components, the calibrated values of the remaining parameters are quite similar across the two different specifications. These findings suggest that the jump components of our CBI-TCL specification capture some features of market data that cannot be adequately reproduced by continuous-path models.
5 Conclusions
We have proposed a stochastic volatility modeling framework for multiple currencies based on CBI-time-changed Lévy processes (CBI-TCL processes). The proposed approach combines full analytical tractability with consistency with the symmetric structure and the most relevant risk characteristics of FX markets. In particular, the self-exciting behavior of CBI processes allows capturing jump and volatility clustering effects. We have characterized a class of risk-neutral measures that leave invariant the structure of the model and allow for the derivation of a semi-closed pricing formula for currency options. Considering a specification driven by tempered \(\alpha \)-stable CBI processes and CGMY Lévy processes, we have successfully calibrated the model to an FX triangle, using standard as well as deep learning techniques. The calibrated values of the parameters support the relevance of self-excitation and clustering phenomena.
Among the possible directions for further research, the modeling framework can be extended by considering stochastic interest rates in the different economies, possibly stochastically correlated with the FX rates. Moreover, we believe that CBI-TCL processes represent a flexible tool that can be successfully applied to other asset classes where stochastic volatility plays a relevant role.
Notes
We recall that a risk-reversal, in the context of FX options, measures the difference in implied volatility between an OTM call option and a put option with the same characteristics and a symmetric delta.
By Williams (1991, Theorem 16.6), the random variable \(\log S^{i,j}_T\) admits a density under \(\mathbb {Q}^i\) if \(\int _{\mathbb {R}}\bigl |\mathbb {E}^i[e^{\mathsf {i}u\log S_T^{i,j} }]\bigr |\mathrm {d}u<+\infty \). Under this assumption, the density can be recovered by Fourier inversion from the characteristic function of \(\log S^{i,j}_T\) given in Lemma 3.9.
By \(25\Delta \) risk-reversal, we mean an OTM Call option with a delta of \(25\%\) and a Put option with a delta of \(-25\%\).
We performed these tasks by using the open-source Java library Strata by OpenGamma, available at https://github.com/OpenGamma/Strata.
We rely on the open-source Java library Eclipse Deeplearning4j, available at http://deeplearning4j.org.
Available at https://www.finmath.net/finmathlib.
References
Ballotta, L., Deelstra, G., & Rayée, G. (2017). Multivariate FX models with jumps: Triangles, quantos and implied correlation. European Journal of Operational Research, 260(3), 1181–1199.
Baldeaux, J., Grasselli, M., & Platen, E. (2015). Pricing currency derivatives under the benchmark approach. Journal of Banking and Finance, 53, 34–48.
Ballotta, L., & Morico, A. (2018). Hidden correlations: a self-exciting tale from the FX world. Working paper (available at https://ssrn.com/abstract=3245149).
Carr, P., Geman, H., Madan, D. B., & Yor, M. (2002). The fine structure of asset returns: An empirical investigation. The Journal of Business, 75(2), 305–332.
Christoffersen, P., Heston, S., & Jacobs, K. (2009). The shape and term structure of the index option smirk: Why multifactor stochastic volatility models work so well. Management Science, 55(12), 1914–1932.
Caballero, M. E., Pérez Garmendia, J. L., & Uribe Bravo, G. (2013). A Lamperti-type representation of continuous-state branching processes with immigration. Annals of Probability, 41(3), 1585–1627.
Cont, R., & Tankov, P. (2004). Financial modelling with jump processes. London: Chapman and Hall/CRC.
Carr, P., & Wu, L. (2007). Stochastic skew in currency options. Journal of Financial Economics, 86(1), 213–247.
Del Baño Rollin, S. (2008). Spot inversion in the Heston model. Working paper (available at https://core.ac.uk/display/13283041).
De Col, A., Gnoatto, A., & Grasselli, M. (2013). Smiles all around: FX joint calibration in a multi-Heston model. Journal of Banking and Finance, 37(10), 3799–3818.
Da Fonseca, J., & Grasselli, M. (2011). Riding on the smiles. Quantitative Finance, 11(11), 1609–1632.
Duffie, D., Filipović, D., & Schachermayer, W. (2003). Affine processes and applications in finance. Annals of Applied Probability, 13(3), 984–1053.
Dawson, D. A., & Li, Z. (2006). Skew convolution semigroups and affine Markov processes. Annals of Probability, 34(3), 1103–1142.
Doust, P. (2007). The intrinsic currency valuation framework. Risk Magazine, March, 76–81.
Doust, P. (2012). The stochastic intrinsic currency volatility model: A consistent framework for multiple FX rates and their volatilities. Applied Mathematical Finance, 19(5), 381–445.
Delbaen, F., & Schachermayer, W. (1998). The fundamental theorem of asset pricing for unbounded stochastic processes. Mathematische Annalen, 312, 215–250.
Escobar, M., & Gschnaidtner, C. (2018). A multivariate stochastic volatility model with applications in the foreign exchange market. Review of Derivatives Research, 21(1), 1–43.
Eberlein, E., & Koval, N. (2006). A cross-currency Lévy market model. Quantitative Finance, 6(6), 465–480.
Eberlein, E., & Kallsen, J. (2020). Mathematical Finance. Springer Finance. Springer.
Fontana, C., Gnoatto, A., & Szulda, G. (2021). Multiple yield curve modeling with CBI processes. Mathematics and Financial Economics, 15(2), 579–610.
Fontana, C., Gnoatto, A., & Szulda, G. (2022). CBI-time-changed Lévy processes. Working paper (available at arXiv:2205.12355).
Flesaker, B., & Hughston, L. P. (1997). International models for interest rates and foreign exchange. Net Exposure, 3, 55–79. Reprinted as Chapter 13 in: Hughston, L.P. (ed.) The New Interest Rate Models. Risk Publications, pp. 217–235.
Bank for International Settlements. (2019). BIS Triennial Central Bank Survey: foreign exchange turnover in 2019. BIS, Monetary and Economic Department: Technical report.
Fang, F., & Oosterlee, C. W. (2009). A novel pricing method for European options based on Fourier-cosine series expansions. SIAM Journal on Scientific Computing, 31(2), 826–848.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
Graceffa, F., Brigo, D., & Pallavicini, A. (2020). On the consistency of jump-diffusion dynamics for FX rates under inversion. International Journal of Financial Engineering, 7(4), 2050046.
Gnoatto, A., & Grasselli, M. (2014). An affine multicurrency model with stochastic volatility and stochastic interest rates. SIAM Journal on Financial Mathematics, 5(1), 493–531.
Gnoatto, A., Grasselli, M., & Platen, E. (2021). Calibration to FX triangles of the 4/2 model under the benchmark approach. Decisions in Economics and Finance, forthcoming, 1–34.
Gnoatto, A. (2017). Coherent foreign exchange market models. International Journal of Theoretical and Applied Finance, 20(1), 1750007.
Horvath, B., Muguruza, A., & Tomas, M. (2021). Deep learning volatility. Quantitative Finance, 21(1), 11–27.
Janek, A., Kluge, T., Weron, R., & Wystup, U. (2011). FX smile in the Heston model. In P. Cizek, W. K. Härdle, & R. Weron (Eds.), Statistical Tools for Finance and Insurance (pp. 133–162). Berlin-Heidelberg: Springer.
Jiao, Y., Ma, C., & Scotti, S. (2017). Alpha-CIR model with branching processes in sovereign interest rate modeling. Finance and Stochastics, 21(3), 789–813.
Jiao, Y., Ma, C., Scotti, S., & Sgarra, C. (2019). A branching process approach to power markets. Energy Economics, 79, 144–156.
Jiao, Y., Ma, C., Scotti, S., & Zhou, C. (2021). The Alpha-Heston stochastic volatility model. Mathematical Finance, 31(3), 943–978.
Jacod, J., & Shiryaev, A. (2003). Limit Theorems for Stochastic Processes (2nd ed.). Berlin-Heidelberg-New York: Springer.
Kingma, D. P., & Ba, J. (2017). Adam: a method for stochastic optimization. Working paper (available at arxiv:1412.6980).
Keller-Ressel, M. (2009). Affine Processes - Theory and Applications in Finance. PhD thesis, Vienna University of Technology.
Keller-Ressel, M., & Mayerhofer, E. (2015). Exponential moments of affine processes. Annals of Applied Probability, 25(2), 714–752.
Kallsen, J., & Shiryaev, A. (2002). The cumulant process and Esscher’s change of measure. Finance and Stochastics, 6, 397–428.
Kawazu, K., & Watanabe, S. (1971). Branching processes with immigration and related limit theorems. Theory of Probability and its Applications, 16(1), 36–54.
Lebedev, N. N. (1972). Special Functions and their Applications. Prentice-Hall, Englewood Cliffs (N.J.).
LeNail, A. (2019). NN-SVG: publication-ready neural network architecture schematics. Journal of Open Source Software, 4(33), 747.
Li, Z. (2020). Continuous-state branching processes with immigration. In Y. Jiao (Ed.), From probability to finance - lecture notes of BICMR summer school on financial mathematics (pp. 1–70). Singapore: Springer.
Szulda, G. (2021). Branching Processes and Multiple Term Structure Modeling. PhD thesis, Université de Paris.
Williams, D. (1991). Probability with Martingales. Cambridge: Cambridge University Press.
Wooldridge, P. (December 2019). FX and OTC derivatives markets through the lens of the Triennial Survey. BIS Quarterly Review.
Acknowledgements
C.F. is grateful to the Europlace Institute of Finance for financial support to this work. G.S. acknowledges hospitality and financial support from the University of Verona, where part of this work has been conducted. This work is part of the project “Term structure dynamics in interest rate and energy markets: modeling and numerics” (BIRD190200/19) funded by the University of Padova. We are thankful to two anonymous Reviewers for useful comments that helped to improve the paper.
Funding
Open access funding provided by Università degli Studi di Padova within the CRUI-CARE Agreement.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fontana, C., Gnoatto, A. & Szulda, G. CBI-time-changed Lévy processes for multi-currency modeling. Ann Oper Res 336, 127–152 (2024). https://doi.org/10.1007/s10479-022-04982-z
DOI: https://doi.org/10.1007/s10479-022-04982-z
Keywords
 FX market
 Multi-currency market
 Branching process
 Self-exciting process
 Time-change
 Stochastic volatility
 Deep calibration
 Affine process