Nonlinear Monte Carlo schemes for counterparty risk on credit derivatives

Two nonlinear Monte Carlo schemes, namely, the linear Monte Carlo expansion with randomization of Fujii and Takahashi (2012a,2012b) and the marked branching diffusion scheme of Henry-Labordère (2012), are compared in terms of applicability and numerical behavior regarding counterparty risk computations on credit derivatives. This is done in two dynamic copula models of portfolio credit risk: the dynamic Gaussian copula model and the model in which default dependence stems from joint defaults. For such high-dimensional and nonlinear pricing problems, more standard deterministic or simulation/regression schemes are ruled out by Bellman’s “curse of dimensionality” and only purely forward Monte Carlo schemes can be used.


Introduction
Counterparty risk is a major issue since the global credit crisis and the ongoing European sovereign debt crisis. In a bilateral counterparty risk setup, counterparty risk is valued as the so-called credit valuation adjustment (CVA), for the risk of default of the counterparty, and debt valuation adjustment (DVA), for own default risk. In such a setup, the classical assumption of a locally risk-free funding asset used for both investing and unsecured borrowing is no longer sustainable. The proper accounting of the funding costs of a position leads to the funding valuation adjustment (FVA). Moreover, these adjustments are interdependent and must be computed jointly through a global correction dubbed total valuation adjustment (TVA). The Stéphane  pricing equation for the TVA is nonlinear due to the funding costs. It is posed over a random time interval determined by the first default time of the two counterparties. To deal with the corresponding backward stochastic differential equation (BSDE), a first reduced-form modeling approach has been proposed in Crépey (2012b), under a rather standard immersion hypothesis between a reference (or market) filtration and the full model filtration progressively enlarged by the default times of the counterparties. This basic immersion setup is fine for standard applications, such as counterparty risk on interest rate derivatives. But it is too restrictive for situations of strong dependence between the underlying exposure and the default risk of the two counterparties, such as counterparty risk on credit derivatives, which involves strong adverse dependence, called wrong-way risk (for some insights of related financial contexts, see Fujii and Takahashi (2012b), Brigo, Capponi, and Pallavicini (2014)). For this reason, an extended reduced-form modeling approach has been recently developed in Crépey and Song (2014a, 2014b, 2015. With credit derivatives, the problem is also very high-dimensional. From a numerical point of view, for high-dimensional nonlinear problems, only purely forward simulation schemes can be used. In Crépey and Song (2015), the problem is addressed by the linear Monte Carlo expansion with randomization of Takahashi (2012a,2012b).
In the present work, we assess another scheme, namely the marked branching diffusion approach of Henry-Labordère (2012), which we compare with the previous one in terms of applicability and numerical behavior. This is done in two dynamic copula models of portfolio credit risk: the dynamic Gaussian copula model and the dynamic Marshall-Olkin model in which default dependence stems from joint defaults.
The paper is organized as follows. Sect. 2 and 3 provide a summary of the main pricing and TVA BSDEs that are derived in Crépey and Song (2014a, 2014b, 2015. Sect. 4 exposes two nonlinear Monte Carlo schemes that can be considered for solving these in high-dimensional models, such as the portfolio credit models of Sect. 5. Comparative numerics in these models are presented in Sect. 6. Sect. 7 concludes.

Setup
We consider a netted portfolio of OTC derivatives between two defaultable counterparties, generally referred to as the contract between a bank, the perspective of which is taken, and its counterparty. After having bought the contract from its counterparty at time 0, the bank sets-up a hedging, collateralization (or margining) and funding portfolio. We call the funder of the bank a third party, possibly composed in practice of several entities or devices, insuring funding of the bank's strategy. The funder, assumed default-free for simplicity, plays the role of lender/borrower of last resort after the exhaustion of the internal sources of funding provided to the bank through its hedge and collateral.
For notational simplicity we assume no collateralization. All the numerical considerations, our main focus in this work, can be readily extended to the case of collateralized portfolios using the corresponding developments in Crépey and Song (2015). Likewise, we assume hedging in the simplest sense of replication by the bank and we consider the case of a fully securely funded hedge, so that the the cost of the hedge of the bank is exactly reflected by the wealth of its hedging and funding portfolio.
We consider a stochastic basis (Ω , G T , G , Q), where G = (G t ) t∈[0,T ] is interpreted as a risk-neutral pricing model on the primary market of the instruments that are used by the bank for hedging its TVA. The reference filtration F is a subfiltration of G representing the counterparty risk free filtration, not carrying any direct information about the defaults of the two counterparties. The relation between these two filtrations will be pointed out in the condition (C) introduced later. We denote by: • E t , the conditional expectation under Q given G t , • r, the risk-free short rate process, with related discount factor β t = e − t 0 r s ds , • T, the maturity of the contract, • τ b and τ c , the default time of the bank and of the counterparty, modeled as G stopping times with (G , Q) intensities γ b and γ c ,

Clean price
We denote by P be the reference (or clean) price of the contract ignoring counterparty risk and assuming the position of the bank financed at the risk-free rate r, i.e. the G conditional expectation of the future contractual cash-flows discounted at the risk-free rate r. In particular, We also define Q t = P t + 1 {t=τ<T } ∆ τ , so that Q τ represents the clean value of the contract inclusive of the promised dividend at default (if any) ∆ τ , which also belongs to the "debt" of the counterparty to the bank (or vice versa depending on the sign of Q τ ) in case of default of a party. Accordingly, at time τ (if < T ), the close-out cash-flow of the counterparty to the bank is modeled as where R b and R c are the recovery rates of the bank and of the counterparty to each other.

All-inclusive price
Let Π be the all-inclusive price of the contract for the bank, including the cost of counterparty risk and funding costs. Since we assume a securely funded hedge (in the sense of replication) and no collateralization, the amounts invested and funded by the bank at time t are respectively given by Π − t and Π + t . The all-inclusive price Π is the discounted conditional expectation of all effective future cash flows including the contractual dividends before τ, the cost of funding the position prior to time τ and the terminal cash flow at time τ. Hence, whereλ is the funding spread over r of the bank toward the external funder, i.e. the bank borrows cash from its funder at rate r +λ (and invests cash at the risk-free rate r). Since the right hand side in (3) depends also on Π , (3) is in fact a backward stochastic differential equation (BSDE). Consistent with the no arbitrage principle, the gain process on the hedge is a Q martingale, which explains why it does not appear in (3).

TVA BSDEs
The total valuation adjustment (TVA) process Θ is defined as In this section we review the main TVA BSDEs that are derived in Crépey and Song (2014a, 2014b, 2015. Three BSDEs are presented. These three equations are essentially equivalent mathematically. However, depending on the underlying model, they are not always amenable to the same numerical schemes or the numerical performance of a given scheme may differ between them.

Full TVA BSDE
By taking the difference between (1) and (3), we obtain where f va t (ϑ ) =λ t (P t − ϑ ) + is the funding coefficient and where is the exposure at default of the bank. Equivalent to (5), the "full TVA BSDE" is written as for the coefficient f t (ϑ ) = f va t (ϑ ) − r t ϑ .

Fully reduced TVA BSDE
. Assume the following conditions, which are studied in Crépey and Song (2014a, 2014b, 2015: Condition (C). There exist: (C.1) a subfiltration F of G satisfying the usual conditions and such that F semimartingales stopped at τ are G semimartingales, (C.2) a probability measure P equivalent to Q on F T such that any (F , P) local Let E t denote the conditional expectation under P given F t . It is shown in Crépey and Song (2014a, 2014b, 2015 that the full TVA BSDE (I) is equivalent to the following "fully reduced BSDE": equivalent in the sense that if Θ solves (I), then the "F optional reduction" Θ of Θ (F optional process that coincides with Θ before τ) solves (III), whilst if Θ solves (III), then Θ = Θ 1 [0,τ) + 1 [τ] 1 τ<T ξ solves (I).

Marked default time setup
In order to be able to compute γξ inf , we assume that τ is endowed with a mark e in a finite set E, in the sense that where each τ e is a stopping time with intensity γ e t such that Q(τ e = τ e ) = 1, e = e , and where ε = argmin e∈E τ e yields the "identity" of the mark. The role of the mark is to convey some additional information about the default, e.g. to encode wrong-way and gap risk features. The assumption of a finite set E in (8) ensures tractability of the setup. In fact, by Lemma 5.1 in Crépey and Song (2015), there exists G -predictable processes P e t and ∆ e t such that P τ = P e τ and ∆ τ = ∆ e τ on the event {τ = τ e }.
Assuming further that τ b = min e∈E b τ e and τ c = min e∈E c τ e , where E = E b ∪ E c (not necessarily a disjoint union), one can then take on [0,τ]: where the two terms have clear respective CVA and DVA interpretation. Hence, (7) is rewritten, on [0,τ], as If the functions P e t and ∆ e t above not only exist, but can be computed explicitly (as will be the case in the concrete models of 5.1 and 5.2), once stated in a Markov setup wheref for some (G , Q) jump diffusion X, then the partially reduced TVA BSDE (II) can be tackled numerically. Similarly, once stated in a Markov setup where for some (F , P) jump diffusion X, then the fully reduced TVA BSDE (III) can be tackled numerically.

Linear approximation
Our first TVA approximation is obtained replacing Θ s by 0 in the right hand side of (I), i.e.
We then approximate the TVA by standard Monte-Carlo, with randomization of the integral to reduce the computation time (at the cost of a small increase in the variance). Hence, introducing an exponential time ζ of parameter µ, i.e. a random variable with density φ (s) = 1 s≥0 µ e −µs , we have We can use the same technic for (II) and (III), which yields: 4.2 Linear expansion and interacting particle implementation Following Fujii andTakahashi (2012a,2012b), we can introduce a perturbation parameter ε and the following perturbed form of the fully reduced BSDE (III): where ε = 1 corresponds to the original BSDE (III). Suppose that the solution of (16) can be expanded in a power series of ε: The Taylor expansion of f at Θ (0) reads Collecting the terms of the same order with respect to ε in (16), we obtain Θ (0) t = 0, due to the null terminal condition of the fully reduced BSDE (III), and where the third order term should contain another component based on ∂ 2 ϑ 2 f . But, in our case, ∂ 2 ϑ 2 f involves a Dirac measure via the terms (P t − ϑ ) + in f va t (ϑ ), so that we truncate the expansion to the term Θ (3) t as above. If the non-linearity in (III) is sub-dominant, one can expect to obtain a reasonable approximation of the original equation by setting ε = 1 at the end of the calculation, i.e.
Carrying out a Monte Carlo simulation by an Euler scheme for every time s in a time grid and integrating to obtain Θ (1) 0 would be quite heavy. Moreover, this would become completely unpractical for the higher order terms that involve iterated (multivariate) time integrals. For these reasons, Fujii and Takahashi (2012b) have introduced a particle interpretation to randomize and compute numerically the integrals in (18), which we call the FT scheme. Let η 1 be the interaction time of a particle drawn independently as the first jump time of a Poisson process with an arbitrary intensity µ > 0 starting from time t ≥ 0, i.e., η 1 is a random variable with density From the first line in (18), we have Similarly, the particle representation is available for the higher order. By applying the same procedure as above, we obtain where Θ (1) η 1 can be computed by (20). Therefore, by using the tower property of conditional expectations, we obtain where η 1 , η 2 are the two consecutive interaction times of a particle randomly drawn with intensity µ starting from t. Similarly, for the third order, we get where η 1 , η 2 , η 3 are consecutive interaction times of a particle randomly drawn with intensity µ starting from t. In case t = 0, (20), (21) and (22) can be simplified as Θ (1) (23) where ζ 1 , ζ 2 , ζ 3 are the elapsed time from the last interaction until the next interaction, which are independent exponential random variables with parameter µ.
Note that the pricing model is originally defined with respect to the full stochastic basis (G , Q). Even in the case where there exists a stochastic basis (F , Q) satisfying the condition (C), (F , Q) simulation may be nontrivial. Lemma 8.1 in Crépey and Song (2015) allows us to reformulate the Q expectations in (23) as the following Q expectations, withΘ (0) = 0: which is nothing but the FT scheme applied to the partially reduced BSDE (II). The tractability of the FT schemes (23) and (24) relies on the nullity of the terminal condition of the related BSDEs (III) and (II), which implies thatΘ (0) = Θ (0) = 0. By contrast, an FT scheme would not be practical for the full TVA BSDE (5) with terminal condition ξ = 0. Also note that the first order in the FT scheme (23) (resp. (24)) is nothing but the linear approximation (15) (resp. (14)).

Marked branching diffusion approach
Based on an old idea of McKean (1975) where L is the infinitesimal generator of a strong Markov process X and F(y) = ∑ d k=0 a k y k is a polynomial of order d, admits a probabilistic representation in terms of a random tree T (branching diffusion). The tree starts from a single particle ("trunk") born from (t 0 , x 0 ). Subsequently, every particle born from a node (t, x) evolves independently according to the generator L of X until it dies at time t = (t + ζ ) in a state x , where ζ is an independent µ-exponential time (one for each particle). Moreover, in dying, a particle gives birth to an independent number of k new particles starting from the node (t , x ), where k is drawn in the finite set {0, 1, · · · , d} with some fixed probabilities p 0 , p 1 , · · · , p d . The marked branching diffusion probabilistic representation reads where n k is the number of branching with k descendants up on (0, T ) and ν is the number of particles alive at T, with corresponding locations x 1 , . . . , x ν . The marked branching diffusion method of Henry-Labordère (2012) for CVA computations, dubbed PHL scheme henceforth, is based on the idea that, by approximating y + by a well-chosen polynomial F(y), the solution to the PDE can be approximated by the solution to the PDE (25), hence by (26). We want to apply this approach to solve the TVA BSDEs (I), (II) or (III) for which, instead of fixing the approximating polynomial F(y) once for all in the simulations, we need a state dependent polynomial approximation to g t (y) = (P t − y) + (cf. (7)) in a suitable range for y. Moreover, (I) and (II) are BSDEs with random terminal timeτ, equivalently written in a Markov setup as Cauchy-Dirichlet PDE problems, as opposed to the pure Cauchy problem (27). Hence, some adaptation of the method is required. We show how to do it for (II), after which we directly give the algorithm in the similar case of (I) and in the more classical (pure Cauchy) case of (III). Assuming τ given in terms of a (G , Q) Markov factor process X as τ = inf{t > 0 : X t / ∈ D} for some domain D, the Cauchy-Dirichlet PDE used for approximating the partially reduced BSDE (II) reads: Specifically, in view of (9), one can set where pol(r) is a d-order polynomial approximation of r + in a suitable range for r. The marked branching diffusion probabilistic representation ofū(t 0 , x 0 ) ∈ D involves a random tree T made of nodes and "particles" between consecutive nodes as follows. The tree starts from a single particle (trunk) born from the root (t 0 , x 0 ). Subsequently, every particle born from a node (t, x) evolves independently according to the generator L of X until it dies at time t = (t + ζ ) in a state x , where ζ is an independent µ-exponential time. Moreover, in dying, if its position x at time t lies in D, the particle gives birth to an independent number of k new particles starting from the node (t , x ), where k is drawn in the finite set {0, 1, · · · , d} with some fixed probabilities p 0 , p 1 , · · · , p d . Figure 1 describes such a random tree in case d = 2. The first particle starts from the root (t 0 , x 0 ) and dies at time t 1 , generating two new particles. The first one dies at time t 11 and generates a new particle, who (31) Note that (31) is unformal at that stage, where we did not justify whether the PDE (28) has a solutionū and in which sense. In fact, the following result could be used for proving that the functionū defined in the first line is a viscosity solution to (28).
Proposition 1. Denoting byū the function defined by the right hand side in (31) (assuming integrability of the integrand on the domain [0, T ] × D), the process Y t = u(t, X t ), 0 ≤ t ≤τ, solves the BSDE associated with the Cauchy-Dirichlet PDE (28), namely (which, in view of (29), approximates the partially reduced BSDE (II), so that Y ≈Θ provided Y is square integrable).
Proof. Let (t 1 , x 1 , k 1 ) be the first branching point in the tree rooted at (0, X 0 ) and let T j denote k 1 independent trees of the same kind rooted at (t 1 , x 1 ). By using the independence and the strong Markov property postulated for X, we obtain i.e. Y t =ū(t, X t ) solves (32).
If 1 τ<T ξ is given as a deterministic function Ψ (τ, X τ ), then a similar approach (using the same tree T ) can be applied to the full BSDE (I) in terms of the Cauchy-Dirichlet PDE This yields the approximation formula alternative to (31): where an exit point of T means a point where a branch of the tree leaves for the first time the time-space domain [0, T ] × D. Last, regarding the (F , Q) reduced BSDE (III), assuming an (F , Q) Markov factor process X with generator A and domain D, we can apply a similar approach in terms of the Cauchy PDE We obtain where T is the branching tree associated with the Cauchy PDE (35) (similar to T but for the generator A .

TVA models for credit derivatives
Our goal is to apply the above approaches to TVA computations on credit derivatives referencing the names in N = {1, . . . , n}, for some positive integer n, traded between the bank and the counterparty respectively labeled as −1 and 0. In this section we briefly survey two models of the default times τ i , i ∈ N = {−1, 0, 1, . . . , n}, that will be used for that purpose with τ b = τ −1 and τ c = τ 0 , namely the dynamic Gaussian copula (DGC) model and the dynamic Marshall-Olkin copula (DMO) model. For more details the reader is referred to (Crépey, Bielecki and Brigo 2014, Chapters 7 and 8) and (Crépey and Song 2015, Sections 6 and 7).

Model of default times
Let there be given a function ς (·) with unit L 2 norm on R + and a multivariate Brownian motion B = (B i ) i∈N with pairwise constant correlation ρ ≥ 0 in its own completed filtration B = (B t ) t≥0 . For each i ∈ N, let h i be a continuously differentiable increasing function from R * + to R, with lim 0 h i (s) = −∞ and lim +∞ h i (s) = +∞, and let Thus the (τ i ) i∈N follow the standard Gaussian copula model of Li (2000), with correlation parameter ρ and with marginal survival function Φ • h i of τ i , where Φ is the standard normal survival function. In particular, these τ i do not intersect each other.
In order to make the model dynamic as required by counterparty risk applications, the model filtration G is given as the Brownian filtration B progressively enlarged by the τ i , i.e.
and the reference filtration F is given as B progressively enlarged by the default times of the reference names, i.e.
As shown in Section 6.2 of Crépey and Song (2015), for the filtrations G and F as above, there exists a (unique) probability measure P equivalent to Q such that the condition (C) holds. For every i ∈ N, let and let m t = (m i t ) i∈N , k t = (k i t ) i∈N , k t = (1 i∈N k i t ) i∈N . The couple X t = (m t , k t ) (resp. X t = (m t , k t )) plays the role of a (G , Q) (resp. (F , P)) Markov factor process in the dynamic Gaussian copula (DGC) model.

TVA model
A DGC setup can be used as a TVA model for credit derivatives, with mark i = −1, 0 and E b = {−1}, E c = {0}. Since there are no joint defaults in this model, it is harmless to assume that the contract promises no cash-flow at τ, i.e., ∆ τ = 0, so that Q τ = P τ . By (Crépey, Bielecki and Brigo 2014, Propositions 7.3.1 page 178 and 7.3.3 page 181), in the case of vanilla credit derivatives on the reference names, namely CDS contracts and CDO tranches (cf. (47)), there exists a continuous, ex-plicit function P i such that or P i τ in a shorthand notation, on the event {τ = τ i }. Hence, (9) yields Assume that the processes r andλ are given before τ as continuous functions of (t, X t ), which also holds for P in the case of vanilla credit derivatives on names in N. Then the coefficientsf and in turn f are deterministically given in terms of the corresponding factor processes as so that we are in the Markovian setup where the FT and the PHL schemes are valid and, in principle, applicable.

Dynamic Marshall-Olkin copula TVA model
The above dynamic Gaussian copula model allows dealing with TVA on CDS contracts. But a Gaussian copula dependence structure is not rich enough for ensuring a proper calibration to CDS and CDO quotes at the same time. If CDO tranches are also present in a portfolio, a possible alternative is the following dynamic Marshall-Olkin (DMO) copula model, also known as the "common shock" model.

Model of default times
We define a family Y of "shocks", i.e. subsets Y ⊆ N of obligors, usually consisting of the singletons {−1}, {0}, {1}, . . . , {n}, and a few "common shocks" I 1 , I 2 , · · · , I m representing simultaneous defaults. For Y ∈ Y , the shock time η Y is defined as an i.i.d. exponential random variable with parameter γ Y . The default time of obligor i in the common shock model is then defined as Example 1.
As shown in Section 7.2 of Crépey and Song (2015), in the DMO model with G and F as above, the condition (C) holds for P = Q. Let J Y = 1 [0,η Y ) . Similar to (m, k) (resp. (m, k)) in the DGC model, the process plays the role of a (G , Q) (resp. (F , Q)) Markov factor in the DMO model.

TVA model
A DMO setup can be used as a TVA model for credit derivatives, with By (Crépey, Bielecki and Brigo 2014, Proposition 8.3.1 page 205), in the case of CDS contracts and CDO tranches, for every shock Y ∈ Y and process U = P or ∆ , there exists a continuous, explicit function U Y such that or U Y τ in a shorthand notation, on the event {τ = η Y }. The coefficientf t (ϑ ) in (9) is then given, for t ∈ [0,τ], bȳ Assuming that the processes r andλ are given before τ as continuous functions of (t, X t ), which also holds for P in case of vanilla credit derivatives on the reference names, thenf

Strong versus weak dynamic copula model
However, one peculiarity of the TVA BSDEs in our credit portfolio models is that, even though full and reduced Markov structures have been identified, which is required for justifying the validity of the FT and/or PHL numerical schemes, and the corresponding generators A or A can be written explicitly , the Markov structures are too heavy for being of any practical use in the numerics. Instead, fast and exact simulation and clean pricing schemes are available based on the dynamic copula structures. Moreover, in the case of the DGC model, we lose the Gaussian copula structure after a branching point in the PHL scheme. In fact, as visible in (Crépey, Bielecki and Brigo 2014, Formula (7.7) p. 175), the DGC conditional multivariate survival probability function is stated in terms of a ratio of Gaussian survival probability functions, which is explicit but does not simplify into a single Gaussian survival probability function. It's only in the DMO model that the conditional multivariate survival probability function, which arises as a ratio of exponential survival probability functions (see (Crépey, Bielecki and Brigo 2014, Formula (8.11) p. 197 and Section 8.2.1.1)), simplifies into a genuine exponential survival probability function. Hence, the PHL scheme is not applicable in the DGC model.
The FT scheme based on (III) is not practical either because the Gaussian copula structure is only under Q and, again, the (full or reduced) Markov structures are not practical. In the end, the only practical scheme in the DGC model is the FT scheme based on the partially reduced BSDE (II). Eventually, it's only in the DMO model that the FT and the PHL schemes are both practical and can be compared numerically.

Numerics
For the numerical implementation, we consider stylized CDS contracts and protection legs of CDO tranches corresponding to dividend processes of the respective form, for 0 ≤ t ≤ T : where all the recoveries R i and R (resp. nominals Nom i and Nom) are set to 40% (resp. to 100). The contractual spreads S i of the CDS contracts are set such that the corresponding prices are equal to 0 at time 0. Protection legs of CDO tranches, where the attachment and detachment points a and b are such that 0 ≤ a ≤ b ≤ 100%, can also be seen as CDO tranches with upfront payment. Note that credit derivatives traded as swaps or with upfront payment coexist since the crisis. Unless stated otherwise, the following numerical values are used: Note. Only the PHL numerical results are new in what follows. The FT results, presented for comparison with the PHL ones, have been retrieved from Crépey and Song (2015).

Numerical results in the DGC model
First we consider DGC random times τ i defined by (37), where the function h i is chosen so that τ i follows an exponential distribution with parameter γ i (which in practice can be calibrated to a related CDS spread or a suitable proxy). More precisely, let Φ and Ψ i be the survival functions of a standard normal distribution and an exponential distribution with intensity γ i . We choose for Φ (ε i ) has a standard uniform distribution. Moreover, we use a function ς (·) in (37) constant before a time horizonT > T and null afterT , so that ς (0) = 1 √T (given the constraint that ν 2 (0) = ∞ 0 ς 2 (s)ds = 1) and, for t ≤T , In the case of the DGC model, the only practical TVA numerical scheme is the FT scheme (24) based on the partially reduced BSDE (II), which can be described by the following steps: 1. Draw an time ζ 1 following an exponential law of parameter µ. If ζ 1 < T , then ρ)), where I n (1, ρ) is a n × n matrix with diagonal equal to 1 and all off-diagonal entries equal to ρ, and go to Step 2. Otherwise, go to Step 4. 2. Draw a second time ζ 2 , independent from ζ 1 , following an exponential law of parameter µ. If ζ 1 + ζ 2 < T , then obtain the vector m ζ 1 +ζ 2 as m ζ ρ)), and go to Step 3. Otherwise, go to Step 4. 3. Draw a third time ζ 3 , independent from ζ 1 and ζ 2 , following an exponential law of parameter µ. If ζ 1 + ζ 2 + ζ 3 < T , then obtain the vector m ζ 1 +ζ 2 +ζ 3 as (1, ρ)). Go to Step 4. 4. Simulate the vector mT from the last simulated vector m t (t = 0 by default) √T B iT , i ∈ N, and in turn the vectors k ζ 1 (if ζ 1 + ζ 2 + ζ 3 < T ), k ζ 1 +ζ 2 (if ζ 1 + ζ 2 < T ) and k ζ 1 +ζ 2 +ζ 3 (if ζ 1 + ζ 2 + ζ 3 < T ). 5. Computef ζ 1 ,f ζ 1 +ζ 2 , andf ζ 1 +ζ 2 +ζ 3 for the three orders of the FT scheme.
We perform TVA computations on CDS contracts with maturity T = 10 years, choosing for that matterT = T + 1 = 11 years, hence ς = 1 [0,11] √ 11 , for ρ = 0.6 unless otherwise stated. Table 1 displays the contractual spreads of the CDS contracts used in these experiments. In Figure 3, the left graph shows the TVA on a CDS on name Table 1 Time-0 bp CDS spreads of names −1 (the bank), 0 (the counterparty) and of the reference names 1 to n used when n = 1 (left) and n = 10 (right). λ . The right graph shows similar results regarding a portfolio comprising one CDS contract per name i = 1, . . . , 10. The time-0 clean value of the default leg of the CDS in case n = 1, respectively the sum of the ten default legs in case n = 10, is 4.52, respectively 40.78 (of course P 0 = 0 in both cases by definition of fair contractual spreads). Hence, in relative terms, the TVA numbers visible in Figure 3 are quite high, much greater for instance than in the cases of counterparty risk on interest rate derivatives considered in Crépey, Gerboud, Grbac, and Ngor (2013). This is explained by the wrong-way risk feature of the DGC model, namely, the default intensities of the surviving names and the value of the CDS protection spike at defaults in this model. Whenλ increases (forλ = 0 that's a case of linear TVA where FT higher order terms equal 0), the second (resp. third) FT term may represent in each case up to 5% to 10% of the first (resp. second) FT term, from which we conclude that the first FT term can be used as a first order linear estimate of the TVA, with a nonlinear correction that can be estimated by the second FT term. In Figure 4, the left graph shows the TVA on one CDS computed by FT scheme of order 3 as a function of the DGC correlation parameter ρ, with other parameters set as before. The right graph shows the analogous results regarding the portfolio of ten CDS contracts. In both cases, the TVA numbers increase (roughly linearly) with ρ, including for high values of ρ, as desirable from the financial interpretation point of view, whereas it has been noted in Brigo and Chourdakis (2008) (see the blue curve in Figure 1 of the ssrn version of the paper) that for high levels of the correlation between names, other models may show some pathological behaviours.
In Figure 5, the left graph shows that the errors, in the sense of the relative standard errors (% rel. SE), of the different orders of the FT scheme do not explode with the dimension (number of credit names that underlie the CDS contracts). The middle graph, produced with n = 1, shows that the errors do not explode with the level of nonlinearity represented by the unsecured borrowing spreadλ . Consistent with the fact that the successive FT terms are computed by purely forward Monte Carlo schemes, their computation times are essentially linear in the number of names, as visible in the right graph. To conclude this section, we compare the linear approximation (14) corresponding to the first FT term in (24) (FT1 in Table 2) with the linear approximations (12)-(13) (LA in Table 2). One can see from Table 2 that the LA and FT1 estimates are consistent (at least in the sense of their 95% confidence intervals, which always intersect each other). But the LA standard errors are larger than the FT1 ones. In fact, using the formula for the intensity γ of τ in FT1 can be viewed as a form of variance reduction with respect to LA, where τ is simulated. Of course, forλ = 0 (case of the right tables whereλ = 3%), both linear approximations are biased as compared with the complete FT estimate (with nonlinear correction, also shown in Table 2), particularly in the high dimensional case with 10 CDS contracts (see the bottom panels in Table 2). Figure 6 completes these results by showing the LA, FT1 and FT standard errors computed for different levels of nonlinearity and different dimensions.
Summarizing, in the DGC model, the PHL is not practical. The FT scheme based on the partially reduced TVA BSDE (II) gives an efficient way of estimating the TVA. The nonlinear correction with respect to the linear approximations (14) or (15) amounts up to 5% in relative terms, depending on the unsecured borrowing spreadλ .

Numerical results in the DMO model
In the DMO model, the FT scheme (18) for the fully reduced BSDE (23) can be implemented through following steps: 1. Simulate the time η Y of each (individual or joint) shock following an independent exponential law of parameter γ Y , Y ∈ Y , then retrieve the τ i through the formula (41). 2. Draw a time ζ 1 following an exponential law of parameter µ. If ζ 1 < T , compare the default time of each name with ζ 1 to obtain the reduced Markov factor X ζ 1 as of (42) and in turn f ζ 1 as of (45)-(46), then go to Step 3. Otherwise stop. 3. Draw a second time ζ 2 following an independent exponential law of parameter µ. If ζ 1 + ζ 2 < T , compare the default time τ i of each name with ζ 1 + ζ 2 to obtain the Markov factor X ζ 1 +ζ 2 and f ζ 1 +ζ 2 then go to Step 4. Otherwise stop. 4. Draw a third time ζ 3 following an independent exponential law of parameter µ.
We can also consider the PHL scheme (31) based on the partially reduced BSDE (II) with To simulate the random tree T in (31), we follow the approach sketched before (31) where, in order to evolve X according to the DMO generator A during a time interval ζ , a particle born from a node x = ( j Y ) Y ∈Y ∈ {0, 1} Y at time t, all one needs is, for each Y such that j Y = 1, draw an independent exponential random variable η Y of parameter γ Y and then set x = ( j Y 1 [0,η Y ) (ζ )) Y ∈Y . Rephrasing in more algorithmic terms: 1. To simulate the random tree T under the expectation in (31), we repeat the following step (generation of particles, or segments between consecutive nodes of the tree) until a generation of particles dies without children: For each node (t, x = ( j Y ) Y ∈Y , k) issued from the previous generation of particles (starting with the root-node (0, X 0 , k = 1)), for each of the k new particles, indexed by l, issued from that node, simulate an independent exponential random variable ζ l and set where, for each l, the η l Y are independent exponential-γ Y random draws and ν l is an independent draw in the finite set {0, 1, · · · , d} with some fixed probabilities p 0 , p 1 , . . . , p d .
2. To compute the random variable Φ under the expectation in (31), we loop over the nodes of the tree T thus constructed (if T ⊂ [0, T ] × D, otherwise Φ = 0 in the first place) and we form the product in (31), where theā k (t, x) are retrieved as in (30).
The PHL schemes (34) based on the full BSDE (I) or (36) based on the fully reduced BSDE (III) can be implemented along similar lines. We perform TVA computations in a DMO model with n = 120, for individual shock intensities taken as γ {i} = 10 −4 × (100 + i) (increasing from ∼ 100 bps to 220 bps as i increases from 1 to 120) and four nested groups of common shocks I 1 ⊂ I 2 ⊂ I 3 ⊂ I 4 , respectively consisting of the riskiest 3%, 9%, 21% and 100% (i.e. all) names, with respective shock intensities γ I 1 = 20 bp, γ I 2 = 10 bp, γ I 3 = 6.67 bp and γ I 4 = 5 bp. The counterparty (resp. the bank) is taken as the eleventh (resp. tenth) safest name in the portfolio. In the model thus specified, we consider CDO tranches with upfront payment, i.e. credit protection bought by the bank from the counterparty at time 0, with nominal 100 for each obligor, maturity T = 2 years and attachment (resp. detachment) points are 0%, 3% and 14% (resp. 3%, 14% and 100%). The respective value of P 0 (upfront payment) for the equity, mezzanine and senior tranche is 229.65, 5.68 and 2.99. Accordingly, the ranges of approximation chosen for pol(y) ≈ y + in the respective PHL schemes are 250, 200 and 10. We use polynomial approximation of order d = 4 with (p 0 , p 1 , p 2 , p 3 , p 4 ) = (0.5, 0.3, 0.1, 0.09, 0.01). We set µ = 0.1 in all PHL schemes and µ = 2/T = 0.2 in all FT schemes. Figure 7 shows the TVA computed by the FT scheme (23) based on the fully reduced BSDE (III), for different levels of nonlinearity (unsecured borrowing basis λ ). We observe that, in all cases, the third order term is negligible. Hence, in further FT computations, we only compute the orders 1 (linear part) and 2 (nonlinear correction).  (II), exhibit much less variance than the third one, based on the full BSDE with terminal condition ξ . This is also visible in Figure 9 (note the different scales of the y axes going from left to right in the picture), which also shows that, for any of these schemes, the relative standard errors do not explode with the level of nonlinearity or the number of reference names in the CDO (the results for the PHL scheme are not shown on the figure as very similar to those of the PHL scheme). In comparing the TVA values on the left and the right hand side of Table 3, we see that the intensities of the common shocks, which play a role similar to the correlation ρ in the DGC model, have a more important impact on the higher tranches (mezzanine and senior tranche), whereas the equity tranche is more sensitive to the level of the unsecured borrowing spreadλ .

Conclusion
Under mild assumptions, three equivalent TVA BSDEs are available. The original "full" BSDE (I) is stated with respect to the full model filtration G and the original pricing measure Q. It does not involve the intensity γ of the counterparty first-todefault time τ. The partially reduced BSDE (II) is also stated with respect to (G , Q) but it involves both τ and γ. The fully reduced BSDE (III) is stated with respect to a smaller "reference filtration" F and it only involves γ. Hence, in principle, the full BSDE (I) should be preferred for models with a "simple" τ whereas the fully reduced BSDE (III) should be preferred for models with a "simple" γ. But, in nonimmersive setups, the fully reduced BSDE (III) is stated with respect to a modified probability measure P. Even though switching from (G , Q) to (F , P) is transparent in terms of the generator of related Markov factor processes, this can be an issue in situations where the Markov structure is important in the theory to guarantee the validity of the numerical schemes, but is not really practical from an implementation point of view. This is for instance the case with the credit portfolio models that we use for illustrative purposes in our numerics, where the Markov  structure in the sense of simulation and forward pricing by copula means is sufficient for the FT scheme, a dynamic copula in the stronger sense that the copula structure is preserved in the future is required in the case of the PHL scheme. This strong dynamic copula property is satisfied by our common-shock model but not in the Gaussian copula model. In conclusion, the FT schemes applied to the partially or fully reduced BSDEs (II) or (III) (a null terminal condition is required so that the full BSDE (I) is not eligible for this scheme) appears as the method of choice on these problems.
An important message of the numerics is that, even for realistically high levels of nonlinearity, i.e. an unsecured borrowing spreadλ = 3%, the third order FT correction was always found negligible and the second order FT correction less than 5% to 10% of the first order, linear FT term. In conclusion, a first order FT term can be used for obtaining "the best linear approximation" to our problem, whereas a nonlinear correction, if wished, can be computed by a second order FT term.