Tight Semi-model-free Bounds on (Bilateral) CVA

Over the last decade, counterparty default risk has attracted increasing interest from both academics and practitioners. This interest was motivated in particular by the market turbulence and the financial crises of that period, which highlighted the importance of counterparty default risk for uncollateralized derivatives. After a succinct introduction to the topic, it is demonstrated that standard models can be combined to derive semi-model-free tight lower and upper bounds on bilateral CVA (BCVA). It is shown in detail how these bounds can be calculated easily and efficiently by solving two corresponding linear optimization problems.

Over the past years, several authors have been investigating the pricing of derivatives based on a variety of models which take into account these default risks. Most of these results are covered by a variety of excellent books, for example Pykhtin [16], Gregory [12], or Brigo et al. [7] just to name a few. For a profound discussion on the pros and cons of unilateral versus bilateral counterparty risk let us refer to the two articles by Gregory [11,13].
In the following exposition, we are concerned with the quantification of the smallest and largest BCVA which can be obtained by any given model with predetermined marginal laws. This takes considerations of Turnbull [21] much further, who first derived weak bounds on CVA for certain types of products. Our approach extends first ideas from Hull and White [15], where the hazard rate determining defaults is coupled to the exposure or other risk factors in either a deterministic or a stochastic way. Still, Hull and White rely on an explicit choice of the default model and on an explicit coupling. More closely related is the work by Rosen and Saunders et al. [8,17], on which we comment later in Remark 8. As the most closely related work we note the paper by Cherubini [9], which provided the basis for this semi-model-free approach. There, only one particular two-dimensional copula was used to couple each individual forward swap par rate with the default time. Obviously, a more general approach couples each forward swap par rate with each other and with the default time, which is in gist similar to Hull and White [15]. From there, the final step to our approach is to observe that the most general approach directly links the whole stochastic evolution of the exposure with both random default times. We will illustrate in the following that these couplings can be readily derived by linear programming. For this purpose the BCVA will be decomposed into three main components: the first component is represented by the loss process, the second component consists of the default indicators of the two counterparties, and the third component comprises the exposure-at-default of the OTC derivative, i.e. the risk-free present value of the outstanding amount at time of default. This approach takes further early considerations of Haase and Werner [14], where comparable results were obtained from the point of view of generalized stopping problems.
In a very recent working paper by Scherer and Schulz [18], the above idea was analyzed in more detail. It was shown that the computational complexity of the problem is the same, no matter if only marginal distributions of defaults or the joint distribution of defaults are known.
After submission of this paper we became aware of related results by Glasserman and Yang, see [10]. Although the main idea of their exposition is similar in gist, Glasserman and Yang focus on the unilateral CVA instead of bilateral CVA. Besides an analysis of the convergence of finite samples to the continuous setup, their exposition is mainly focused on the penalization of deviation from some base distribution. In contrast, our focus is on bilateral CVA, with special attention to numerical solution and to the case that payoffs also depend on the credit quality.
In summary, this exposition makes the following main contributions:
• First, the three main building blocks of such an adjustment are clearly identified and separated, and it is shown how any coupling of these blocks leads to a feasible adjustment. Unlike Cherubini, who only considered the very specific case of an interest rate swap, all kinds of derivatives (interest rate, FX, commodity, and even credit derivatives) are covered in a unified way, even if the payoff, and thus the present value of the derivative, explicitly depends on the credit quality of either of the two counterparties.
• Second, by generalizing Cherubini's approach, upper and lower bounds on unilateral and bilateral counterparty value adjustments are derived. It will be demonstrated that these bounds can be efficiently obtained by the solution of linear optimization problems, more specifically, by the solution of balanced transportation problems. In contrast to the approaches of Turnbull [21] or Cherubini [9], both the upper and lower bound derived here are tight bounds, i.e. there exists some stochastic model which is consistent with all given market prices and in which these bounds are attained.
The rest of the paper is organized as follows. In Sect. 2 a succinct introduction to bilateral counterparty risk is given, before the decomposition of the BCVA into its building blocks is carried out in Sect. 3. In Sect. 4 the two main approaches for the calculation of counterparty valuation adjustments are briefly reviewed. Finally, the tight bounds on CVA are derived in Sect. 5, before the paper concludes.

Counterparty Default Risk
As usual, to model financial transactions with default risk, let $(\Omega, \mathcal{G}, \mathcal{G}_t, Q)$ be a probability space where $\mathcal{G}_t$ models the flow of information and $Q$ denotes the risk-neutral measure for a given risk-free numéraire process $N_t > 0$, see e.g. Bielecki and Rutkowski [2] for more details. Further, let the space be endowed with a right-continuous and complete sub-filtration $\mathcal{F}_t$ modeling the flow of information except default, such that $\mathcal{F}_t \subseteq \mathcal{G}_t := \mathcal{F}_t \vee \mathcal{H}_t$, with $\mathcal{H}_t$ being the right-continuous filtration generated by the default events. Subsequently, we consider a transaction with maturity $T$ between a client A and a counterparty B, where both are subject to default. The respective random default times are denoted by $\tau_A$ and $\tau_B$. In order to take into account counterparty default risk we distinguish three cases:
• A defaults before B and before $T$,
• B defaults before A and before $T$,
• neither A nor B defaults before $T$.
For simplicity of presentation, we assume in the following that simultaneous defaults occur with probability zero, i.e. $Q(\tau_A = \tau_B) = 0$. Under this assumption the indicator functions of these three events yield a decomposition of one, i.e. they sum to one almost surely.
In the following, let us consider a transaction consisting of cash flows $C(B, A, T_i)$ paid by the counterparty B at times $T_i$, $i = 1, \ldots, m_B$, and cash flows $C(A, B, T_j)$ paid by the client A at times $T_j$, $j = 1, \ldots, m_A$. Taking into account default risk of both counterparties, the quantification of the bilateral CVA is summarized in the following well-known theorem, which in essence goes back to Sorensen and Bollier [19].
where the risk-free present value of the transaction is given as and where the bilateral counterparty value adjustment $\mathrm{CVA}_A(t, T)$ is defined as Here $L^i_t$ denotes the random loss (between 0 and 1) of counterparty $i$ at time $t$.
Proof A proof of Theorem 1 can be found in Bielecki and Rutkowski [2], Formula (14.25), or Brigo and Capponi [4], Proposition 2.1 and Appendix A, respectively.
Based on Theorem 1, the general approach for the calculation of the counterparty risk adjusted value $V^D_A(t, T)$ is to determine first the risk-free value $V_A(t, T)$ of the transaction. This can be done by any common valuation method for this kind of transaction. In a second step the counterparty value adjustment $\mathrm{CVA}_A(t, T)$ needs to be determined. So far, two main approaches have emerged in the academic literature, which will be briefly reviewed in Sect. 4.

The Main Building Blocks of CVA
Subsequently, let us assume that the default times $\tau_i$ with $i \in \{A, B\}$ can only take a finite number of values $\{t_1, \ldots, t_K\}$ in the interval $]0, T[$. For continuous time models this assumption can be justified by the default bucketing approach, which can, for example, be found in Brigo and Chourdakis [5], if $K$ is chosen sufficiently large. To be able to separate the default dynamics from the market value dynamics, let us introduce the auxiliary time $s$, $s \in [t, T]$, and the discounted market value Then we can rewrite Eq. (1) as: Here, $1_M$ is the indicator function of the set $M$; if $M = \{m\}$ we simply write $1_m$ instead. Now, collecting all terms relating to the default in the default indicator process $\delta$, we can rewrite the BCVA in a more compact manner as From Eq. (3) we immediately see that the BCVA at time $t$ is composed of six discrete time processes:
• two default indicator processes $\delta^A_s$ and $\delta^B_s$,
• two loss processes $L^A_s$ and $L^B_s$, and
• two discounted exposure processes $V^+_A(t, s, T)$ and $V^+_B(t, s, T)$.
In this way, we are able to separate the default dynamics $\delta$ from the loss process $L$ and the exposure process $V$. From this decomposition, it becomes obvious that the BCVA is completely determined by the joint distribution of these six processes.

Remark 1
We note that in general it is even sufficient to model four processes (loss dynamics and market value dynamics) plus a two-dimensional random variable $(\tau_A, \tau_B)$. However, in the case of finitely many default times, it is more convenient to work with the default indicator process instead.
Remark 2 For simplicity of the subsequent exposition, we assume that the loss process is actually constant and equals 1: $L^A_s = L^B_s = 1$. Dropping this simplifying assumption does not affect the theory of the remainder of this exposition, with one notable exception: the resulting two-dimensional transportation problems become multi-dimensional transportation problems, which renders their numerical solution more complex, but still feasible.

Remark 3
As we have noted, the default indicator process can only take a finite number of values in the bucketing approach. More exactly, it holds that the joint (i.e. two-dimensional) default indicator process $\delta = (\delta_k)_{k=1,\ldots,K} \in \mathbb{R}^{2 \times K}$, defined by takes only values in the finite set which has exactly $2K + 1$ elements. Therefore, the discrete time default indicator process is also a process with a finite state space.
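The count of $2K + 1$ states can be illustrated by a short Python sketch. It rests on an assumption about the encoding: each state is taken to be a $2 \times K$ matrix of zeros and ones with at most one nonzero entry, whose row says which party defaults first and whose column says in which time bucket. The function name `default_indicator_states` and the ordering of the states are hypothetical choices made for this illustration only.

```python
import numpy as np

def default_indicator_states(K):
    """Enumerate the assumed state space of the joint default indicator.

    State 0: no default in any of the K buckets (zero matrix).
    States 1..K: party A (row 0) defaults first, in bucket k.
    States K+1..2K: party B (row 1) defaults first, in bucket k.
    """
    states = [np.zeros((2, K), dtype=int)]
    for party in (0, 1):          # 0 = A, 1 = B
        for k in range(K):
            s = np.zeros((2, K), dtype=int)
            s[party, k] = 1
            states.append(s)
    return states

states = default_indicator_states(K=8)
print(len(states))  # 2*K + 1 states
```

With this encoding the no-default state, the $K$ states for a first default of A, and the $K$ states for a first default of B together give $1 + K + K = 2K + 1$ elements.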
Let us further introduce the joint exposure process in analogy to the above, Then it holds To avoid technical considerations for brevity of presentation, we prefer to work with discrete processes (i.e. discrete state space) in discrete time. Thus, it may be necessary to discretize the state space of the remaining discounted exposure process. In general, there exist (at least) two different approaches by which a suitable discrete state space version of the process $X$ can be obtained:
• In the first approach, completely analogous to the default bucketing approach, the state space $\mathbb{R}^{2 \times K}$ of the joint exposure process $X$ is divided into $N$ disjoint components. On each component, $X$ is replaced by some representative value (usually an average value), and the probabilities of the discretized process are set in accordance with the original probabilities of each component (cf. the default bucketing approach).
• From a computational and practical point of view, a much more convenient approach relies on Monte Carlo simulation: $N$ different scenarios (i.e. realizations) of the process $X$ are used instead of the original process. Each realization is assumed to have probability $1/N$.
For both approaches it is known that they converge at least in distribution to the original process, which is sufficient for our purposes. For more details on the convergence, let us refer to the recent working paper by Glasserman and Yang [10].

Models for Counterparty Risk
In the last decade two main approaches have emerged in the literature for modeling the individual, resp. joint, distribution of the processes $\delta$ and $X$:
• The most popular approach is based on the rather strong assumption of independence between exposure and default. Based on this independence assumption, only individual models for $\delta$ and $X$ need to be specified for the CVA calculation. This kind of independence assumption is quite standard in the market, see for example the Bloomberg CVA function (for more details on the Bloomberg model let us refer to Stein and Lee [20]).
• Alternatively, and more recently, a more general approach is based on a joint model (also called hybrid model) for the building blocks $\delta$ and $X$ of the CVA calculation, see Sect. 4.3.

Independence of CVA Components
Let us assume that the exposure process $X$ is independent of the default process $\delta$. Then the expectation inside the summation can be split into two parts: It is well known that the expected value matches exactly the price of a call option on the basis transaction at time $t$ with strike 0 and exercise time $t_k$. The CVA equation can hence be rewritten as and thus the BCVA can be calculated without any further problems, as the corresponding default probabilities can be easily computed from any given credit risk model: in order to calculate these probabilities, the default times $\tau_A$ and $\tau_B$ together with their dependence structure have to be modeled. One of the most popular classes of models for default times is that of intensity models, as for example described in Bielecki and Rutkowski [2], Part III.
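Under independence, the calculation reduces to multiplying default probabilities with expected exposures and summing over the time buckets. The following Python sketch illustrates this under several assumptions: the loss is set to 1, exposures are given as Monte Carlo scenarios with weight $1/N$ each, and the sign convention is that a first default of B contributes positively (weighted by A's expected positive exposure) while a first default of A contributes negatively. The function name `independent_bcva` and the array layout are hypothetical.

```python
import numpy as np

def independent_bcva(p_B_first, p_A_first, XA, XB):
    """BCVA of party A when default and exposure are independent (loss = 1).

    p_B_first[k], p_A_first[k]: probability that B (resp. A) is the first
        to default and does so in bucket k.
    XA[n, k], XB[n, k]: discounted positive exposure of A (resp. B) in
        scenario n at bucket k; each scenario has probability 1/N.
    """
    EA = XA.mean(axis=0)  # expected exposure of A per bucket
    EB = XB.mean(axis=0)  # expected exposure of B per bucket
    return float(p_B_first @ EA - p_A_first @ EB)

# Toy data (purely illustrative numbers).
XA = np.array([[1.0, 2.0], [3.0, 0.0]])
XB = np.array([[0.0, 0.0], [0.0, 4.0]])
print(independent_bcva(np.array([0.1, 0.2]), np.array([0.05, 0.05]), XA, XB))
```

The per-bucket averages `EA` and `EB` play the role of the option prices with strike 0 mentioned above; in practice they could equally be taken from an analytic pricing formula instead of a simulation.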
Remark 4 It has to be noted that a model with deterministic default intensities plus a suitable copula is sufficient for the arbitrary specification of the joint distribution of default times. Stochastic intensities do not add any value in this context. This is true as long as the default risk-free discounted present value is independent of the credit quality of each counterpart. This means that the payoff itself is not allowed to be linked explicitly to the credit quality of any counterparty.
Remark 5 Let us point out that the intensity model is just one specific example how default times could be modeled. The big advantage of our approach is that any arbitrary credit risk model can be used instead, as only the distribution of the default indicator δ finally matters. In case only marginal default models are available, we can also take into account the remaining unknown dependence between the default times, however, at the price of a higher dimensional transportation problem.

Modeling Options on the Basis Transaction
Since it could be observed in Eq. (6) that options on the basis transaction need to be priced, a suitable model for this option pricing task needs to be available. Depending on the type of derivative, any model which can be reasonably well calibrated to the market data is sufficient. For instance, for interest rate derivatives, any model ranging from a simple Vasicek or CIR model to sophisticated Libor market models or two-factor Hull-White models could be applied. In case of a credit default swap, any model which allows to price CDS options, i.e. any model with stochastic credit spread would be feasible. However, for CVA calculations, usually a trade-off between accuracy of the model and efficiency of calculations needs to be made. For this reason, usually simpler models are applied for CVA calculations than for other pricing applications. It needs to be noted that since the financial market usually provides sufficiently many prices of liquid derivatives, any reasonable model can be calibrated to these market prices, and therefore, we can assume in the following that the market implied distribution of the discounted exposure process is fully known and available.

Hybrid Models-An Example
Another way to calculate the CVA is to use a so-called hybrid approach which models all the involved underlying risk factors. Instances of such models can for example be found in Brigo and Capponi [4] for the case of a credit default swap, or Brigo et al. [6] for interest rate derivatives. In Brigo et al. [6], an integrated framework is introduced, where a two-factor Gaussian interest-rate model is set up for a variety of interest rate derivatives in order to deal with the option inherent in the CVA. Further, to model the possible default of the client and its counterparty, their stochastic default intensities are given as CIR processes with exponentially distributed positive jumps. The Brownian motions driving those risk factors are assumed to be correlated. Additionally, the defaults of the client and the counterparty are linked by a Gaussian copula.
In summary, the amount of wrong-way risk which can be modeled within such a framework strongly depends on the model choice. If solely correlations between default intensities (i.e. credit spreads) and interest rates are taken into account, only a rather weak relation will emerge between default and the exposure of interest rate derivatives, cf. Brigo et al. [6]. Figure 5 in Scherer and Schulz [18] provides an overview of potential CVA values for different models which illustrates that models can differ quite significantly.

Tight Bounds on CVA
From the previous section it becomes obvious that hybrid models yield different CVAs depending on the (model and parameter implied) degree of dependence between default and exposure. However, it remains unclear how large the impact of this dependence can be. In other words: Is it possible to quantify how small or large the CVA can get for any model, given that the marginal distributions for exposure and default are already fixed? In the following, we want to address this question based on our initially given decomposition of the CVA into building blocks.
As mentioned in Sect. 4.2, we can reasonably assume that the distribution of the exposure process X is already completely determined by the available market information. In a similar manner, we have argued that also the distribution of the default indicator process δ can be assumed to be given by the market. Nevertheless, let us point out that the following ideas and concepts could indeed be generalized to the case that only the marginal distributions of the default times are known. Further, we can even consider the case that the dependence structure between different market risk factors is not known but remains uncertain. However, all these generalizations come at the price that the resulting two-dimensional transportation problem will become multi-dimensional.
For the above reasons, we argue that the following approach is indeed semi-model-free in the sense that no model needs to be specified which links the default indicator process with the discounted exposure processes.

Tight Bounds on CVA by Mass Transportation
Let us reconsider Eq. (4) and let us highlight the dependence of the BCVA on the measure P.
With some abuse of notation, the measure $P$ denotes the joint distribution of the default process $\delta$ and the exposure process $X$. Since both processes have finite support, $P$ can be represented as a $(2K + 1) \times N$ matrix with entries in $[0, 1]$. We note that the marginals of $P$, i.e. the distributions of $\delta$ and $X$ (denoted by the probability vectors $p^{(X)} \in \mathbb{R}^N$ and $p^{(\delta)} \in \mathbb{R}^{2K+1}$), are already predetermined from the market. Therefore, $P$ has to satisfy $\mathbf{1}^\top P = (p^{(X)})^\top$ and $P \mathbf{1} = p^{(\delta)}$.

Remark 6
In case of independence between $\delta$ and $X$, $P$ is given by the product distribution of $\delta$ and $X$, whereas in hybrid models the joint distribution $P$ is determined by the specification and parametrization of the hybrid model. In the independent case, $P$ is hence given by the dyadic product $P = p^{(\delta)} (p^{(X)})^\top$. Obviously, the smallest and largest CVA which can be obtained by any $P$ consistent with the given marginals are given by the minimum and the maximum of the CVA over the set $\mathcal{P}$ of all such $P$, i.e. of all $(2K + 1) \times N$ matrices with nonnegative entries which satisfy the two marginal constraints. It can be easily noted that the set $\mathcal{P}$ is a convex polytope. Thus, the computation of $\mathrm{CVA}^l_A(t, T)$ and $\mathrm{CVA}^u_A(t, T)$ essentially requires the solution of a linear program, as the objective functions are linear in $P$.

Remark 7
The structure of the above LPs coincides with the structure of so-called balanced linear transportation problems. Transportation problems constitute a very important subclass of linear programming problems, see for example Bazaraa et al. [1], Chap. 10, for more details. There exist several very efficient algorithms for the numerical solution of such transportation problems, see also Bazaraa et al. [1], Chaps. 10, 11 and 12.
Let us summarize our results in the following theorem: Theorem 2 Under the given prerequisites, it holds:

These bounds are tight, i.e. they represent the lowest and the highest CVA which can be obtained by any (hybrid) model which is consistent with the market data and there exists at least one model which reaches these bounds.
The tightness of our bounds is in contrast to Turnbull [21], where only weak bounds were derived. Of course, bounds always represent a best-case and a worst-case estimate only, which may strongly under- and overestimate the true CVA.
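The two transportation problems can be sketched in a few lines of Python with `scipy.optimize.linprog`. This is an illustration under stated assumptions, not the implementation used for the numerical results: the loss is set to 1, the states of $\delta$ are ordered as (no default, A defaults first in bucket $1, \ldots, K$, B defaults first in bucket $1, \ldots, K$), and the cost of pairing a default state with an exposure scenario is $-X^B_k$ if A defaults first in bucket $k$ and $+X^A_k$ if B does. The helper names `cost_matrix` and `cva_bounds` are hypothetical, and a general-purpose LP solver is used in place of a dedicated transportation algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def cost_matrix(XA, XB):
    """Cost of pairing each default state with each exposure scenario.

    XA, XB: arrays of shape (N, K) with discounted positive exposures.
    Returns C of shape (2K+1, N); row 0 is the no-default state.
    """
    N, K = XA.shape
    C = np.zeros((2 * K + 1, N))
    C[1:K + 1, :] = -XB.T   # A defaults first in bucket k
    C[K + 1:, :] = XA.T     # B defaults first in bucket k
    return C

def cva_bounds(C, p_delta, p_X):
    """Minimize and maximize sum_ij C_ij P_ij over all couplings P >= 0
    with row sums p_delta and column sums p_X."""
    m, n = C.shape
    A_rows = np.kron(np.eye(m), np.ones((1, n)))  # row-sum constraints
    A_cols = np.kron(np.ones((1, m)), np.eye(n))  # column-sum constraints
    A_eq = np.vstack([A_rows, A_cols])
    b_eq = np.concatenate([p_delta, p_X])
    c = C.ravel()                                 # P stored row by row
    lo = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    hi = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return float(lo.fun), float(-hi.fun)

# Toy example: K = 2 buckets, N = 50 simulated exposure scenarios.
rng = np.random.default_rng(0)
V = rng.standard_normal((50, 2))
XA, XB = np.maximum(V, 0.0), np.maximum(-V, 0.0)
p_delta = np.array([0.8, 0.05, 0.05, 0.05, 0.05])
p_X = np.full(50, 1.0 / 50)
C = cost_matrix(XA, XB)
lower, upper = cva_bounds(C, p_delta, p_X)
independent = float(p_delta @ C @ p_X)  # product coupling
print(lower, independent, upper)
```

Since the product coupling is itself feasible, the independent value always lies between the two optimal values. For large $N$ the dense constraint matrix built here becomes wasteful, which is one reason to prefer specialized transportation or network flow solvers.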

Remark 8
We note that a related approach of coupling default and exposure via copulas was presented by Rosen and Saunders [17] and Cespedes et al. [8]. However, their approach differs from ours in some significant aspects. First, exposure scenarios are sorted by a single number (e.g. effective exposure) to be able to couple exposure scenarios with risk factors of defaults by copulas. Second, risk factors of some credit risk model are employed instead of working with the default indicator directly. Third, their approach is restricted to the real-world setting and does not consider restrictions on the marginal distributions in the coupling process, which is e.g. necessary if stochastic credit spreads should be considered.

An Alternative Formulation as Assignment Problem
For the above setup we have assumed that the probabilities for all possible realizations of the default indicator process could be precomputed from a suitable default model. If for some default model this should not be the case, but only scenarios (with repeated outcomes for the default indicator) could be obtained by a simulation, an alternative LP formulation could be obtained. In such a scenario setting, it is advisable that for both Monte Carlo simulations the same number $N$ of scenarios is chosen. Then for both given marginal distributions we have $p^{(\delta)}_j = p^{(X)}_i = 1/N$. If we apply the same arguments as above we obtain again a transportation problem, however, with probabilities $1/N$ each. If we have a closer look at this problem, we see that the optimization actually runs over all $N \times N$ permutation matrices, since each default scenario is mapped onto exactly one exposure scenario. This means that this problem eventually belongs to the class of assignment problems, for which very efficient algorithms are available, cf. Bazaraa et al. [1]. Nevertheless, please note that although assignment problems can be solved more efficiently than transportation problems, it is still advisable to solve the transportation problem due to its lower dimensionality, as usually $2K + 1 \ll N$ (i.e. time discretization is usually much coarser than exposure discretization). However, if stochastic credit spreads have to be considered, they have to be part of the default simulation and thus assignment problems (with additional linear constraints to guarantee consistency of exposure paths and spreads) become unavoidable.
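The scenario-based variant can be sketched with `scipy.optimize.linear_sum_assignment`. The sketch assumes $N$ simulated default scenarios, encoded as 0/1 arrays `dA` and `dB` marking a first default of A or B per bucket, and $N$ simulated exposure scenarios `XA` and `XB`, with loss equal to 1 and the sign convention that a first default of B contributes positively. The names and the toy data are hypothetical, and the additional linear constraints needed for stochastic credit spreads are not covered.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assignment_bounds(dA, dB, XA, XB):
    """Lower and upper CVA when each default scenario is paired with
    exactly one exposure scenario (each pair has probability 1/N).

    dA, dB: (N, K) 0/1 arrays, first default of A resp. B in bucket k.
    XA, XB: (N, K) discounted positive exposures of A resp. B.
    """
    N = dA.shape[0]
    # cost[i, j]: contribution of default scenario i with exposure scenario j
    cost = dB @ XA.T - dA @ XB.T
    r, c = linear_sum_assignment(cost)                 # minimizing pairing
    lower = cost[r, c].sum() / N
    r, c = linear_sum_assignment(cost, maximize=True)  # maximizing pairing
    upper = cost[r, c].sum() / N
    return float(lower), float(upper)

# Toy data: N = 3 scenarios, K = 2 buckets (illustrative numbers only).
dA = np.array([[0, 0], [1, 0], [0, 0]])
dB = np.array([[0, 0], [0, 0], [0, 1]])
XA = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 0.0]])
XB = np.array([[0.0, 0.0], [2.0, 1.0], [0.0, 4.0]])
print(assignment_bounds(dA, dB, XA, XB))
```

The cost matrix here is $N \times N$, compared with $(2K + 1) \times N$ in the transportation formulation, which makes the difference in dimensionality discussed above directly visible.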

Setup
To illustrate these semi-model-free CVA bounds let us give a brief example. For this purpose let us consider a standard payer swap with a remaining lifetime of T = 4 years analyzed within a Cox-Ingersoll-Ross (CIR) model at time t = 0. The time interval ]0, 4[ is split up into K = 8 disjoint time intervals each covering half a year. For simplicity, the loss process is again assumed to be 1.

Counterparty's Default Modeling
To model the defaults we have chosen the well-known copula approach with constant intensities using the Gaussian copula. For further analyses in this example we will focus on the case of uncorrelated counterparties ($\rho = 0$) and highly correlated counterparties ($\rho = 0.9$). Furthermore, the counterparties' default intensities are assumed to be deterministic. We will distinguish between symmetric counterparties with identical default intensities and asymmetric counterparties. Thus, four different settings are considered. In Fig. 1, the left plots show identical counterparties (cases 1 and 2) and the right ones the cases where counterparty B has a higher default intensity (cases 3 and 4). Furthermore, the upper plots correspond to uncorrelated defaults, and the lower ones to $\rho = 0.9$.
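A default model of this type can be sketched in Python as follows. The sketch estimates the distribution of the default indicator by Monte Carlo simulation: two correlated standard normal variables are mapped to uniforms (Gaussian copula) and then to exponential default times with constant intensities. The intensities used in the call below are hypothetical placeholders, as are the function name and the ordering of the states (no default, A first in bucket $1, \ldots, K$, B first in bucket $1, \ldots, K$). The sketch requires $|\rho| < 1$, so that simultaneous defaults have probability zero.

```python
import numpy as np
from math import erf, sqrt

def default_state_probabilities(lam_A, lam_B, rho, T, K, n_sims=100_000, seed=0):
    """Monte Carlo estimate of the distribution of the default indicator.

    Returns p of length 2K+1: p[0] = no default before T,
    p[1..K] = A defaults first in bucket k, p[K+1..2K] = B defaults first.
    """
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n_sims)
    z2 = rho * z1 + sqrt(1.0 - rho**2) * rng.standard_normal(n_sims)
    Phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))
    tau_A = -np.log1p(-Phi(z1)) / lam_A   # exponential default time of A
    tau_B = -np.log1p(-Phi(z2)) / lam_B   # exponential default time of B

    first = np.minimum(tau_A, tau_B)
    edges = np.linspace(0.0, T, K + 1)
    # bucket k collects first defaults in ]edges[k], edges[k+1]]
    bucket = np.clip(np.searchsorted(edges, first, side="left") - 1, 0, K - 1)

    no_default = first > T
    A_first = ~no_default & (tau_A < tau_B)
    B_first = ~no_default & (tau_B < tau_A)

    p = np.zeros(2 * K + 1)
    p[0] = no_default.mean()
    p[1:K + 1] = np.bincount(bucket[A_first], minlength=K) / n_sims
    p[K + 1:] = np.bincount(bucket[B_first], minlength=K) / n_sims
    return p

# Hypothetical intensities of 2% and 3%, uncorrelated defaults.
p_delta = default_state_probabilities(0.02, 0.03, rho=0.0, T=4.0, K=8)
print(p_delta.round(4))
```

With deterministic intensities the same probabilities could also be obtained without simulation from the bivariate normal distribution function; the Monte Carlo route is chosen here only to keep the example short.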

Counterparty Exposure Modeling
As already mentioned, a simple CIR model is applied for the valuation of the payer swap. Since our focus is on the coupling of the default and the exposure model, we have opted for such a simple model for ease of presentation. In the CIR model, the short rate $r_t$ follows the stochastic differential equation $dr_t = \kappa(\theta - r_t)\,dt + \sigma \sqrt{r_t}\,dW_t$, where $(W_t)_{t \ge 0}$ denotes a standard Brownian motion. Instead of calibrating the parameters to market data (yield curve plus selected swaption prices) on one specific day, we have set the parameters to $\kappa = 0.0156$, $\theta = 0.0311$, $\sigma = 0.0313$, $r_0 = 0.030$ to obtain an interest rate market which is typical for recent years. Considering now the discounted exposure of each counterparty within the discrete time framework of our example, we can easily compute $E^Q[X^i_k]$ as the average of all generated scenarios from a Monte Carlo simulation. Figure 2 illustrates the results of a simulation, which are also given in Table 1. Positive bars correspond to $E^Q[X^A_k]$, negative bars to $E^Q[X^B_k]$, and the small bars correspond to $E^Q[\tilde{V}_A(t_k, T)]$. Since payer and receiver swap are not completely symmetric instruments, there remains a residual expectation, as can be observed from Fig. 2.
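The short-rate scenarios underlying such an exposure simulation can be generated as in the following Python sketch. It simulates the CIR dynamics $dr_t = \kappa(\theta - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t$ with the parameter values quoted above. The discretization (an Euler scheme with full truncation, 12 steps per year) is an assumption made for this illustration; the text does not state which scheme was used. The swap valuation on each path is omitted, and the function name `simulate_cir` is hypothetical.

```python
import numpy as np

def simulate_cir(kappa, theta, sigma, r0, T, n_steps, n_paths, seed=0):
    """Euler scheme with full truncation for the CIR short rate.

    Returns an array of shape (n_paths, n_steps + 1) with r[:, 0] = r0.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    r = np.empty((n_paths, n_steps + 1))
    r[:, 0] = r0
    for i in range(n_steps):
        r_pos = np.maximum(r[:, i], 0.0)  # truncate before taking the root
        dW = rng.standard_normal(n_paths) * np.sqrt(dt)
        r[:, i + 1] = (r[:, i] + kappa * (theta - r_pos) * dt
                       + sigma * np.sqrt(r_pos) * dW)
    return r

rates = simulate_cir(kappa=0.0156, theta=0.0311, sigma=0.0313, r0=0.030,
                     T=4.0, n_steps=48, n_paths=20_000)
# Pathwise discount factors N_0 / N_t on the simulation grid (left Riemann sum).
dt = 4.0 / 48
discount = np.exp(-np.cumsum(rates[:, :-1] * dt, axis=1))
print(rates[:, -1].mean(), discount[:, -1].mean())
```

From such paths, the swap value at each bucket date could be computed with the analytic CIR bond price formula, and its discounted positive and negative parts would then serve as exposure scenarios with probability $1/N$ each.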

Results
In case of independence between default and exposure, the bilateral CVA is easily obtained by multiplying the default probabilities (as shown in Fig. 1) with the corresponding exposures (as shown in Fig. 2) and summation. Besides the independent $\mathrm{CVA}^i$, the minimal and maximal $\mathrm{CVA}^l$ and $\mathrm{CVA}^u$ have been calculated as well.
The results of these calculations are illustrated in Fig. 3 and Table 2 for each time interval $\Delta_k$. Analogously to Fig. 1, each of the four cases has a separate subplot, and the left plots belong again to cases 1 and 2. The positive bars now correspond to $E^{Q_i}[\delta^B_k \cdot X^A_k]$ and the negative ones to $E^{Q_i}[\delta^A_k \cdot X^B_k]$. In the case of the minimal CVA, $E^{Q_l}[\delta^B_k \cdot X^A_k]$ vanishes, meaning that for counterparty A the exposure is zero in case of a default of counterparty B, as the present value of the swap at that time is negative from counterparty A's point of view. Conversely, for the maximal CVA, $E^{Q_u}[\delta^A_k \cdot X^B_k]$ is zero. Here, $Q_u$, $Q_l$, and $Q_i$ denote the optimal measures for the maximal, the minimal, and the independent CVA, respectively. As expected, there are large gaps between the lower bound and the independent CVA, as well as between the independent CVA and the upper bound. This means that wrong-way risk (i.e. higher exposure comes with higher default rates) can have a significant impact on the bilateral CVA. Interestingly, this observation holds true for all four cases, of course with different significance depending on the specific setup. Although it is clear that our analysis naturally shows more extreme gaps than any hybrid model, it has to be mentioned that these bounds are indeed tight.

Computation Time, Choice of Algorithm, and Impact of Assumptions
Theoretically, the computation of the bounds boils down to the solution of a linear programming problem. From this it can be expected that state-of-the-art solvers like CPLEX or Gurobi will yield the optimal solution within reasonable computation time. Using CPLEX, we have obtained the following computation times on a standard workstation (Table 3). It can be observed that the problem can be solved for reasonable discretization levels within decent time. Rather similar computation times have been obtained with an individual implementation of the standard network simplex based on Fibonacci heaps. However, for larger sizes, the performance of standard solvers begins to deteriorate. To dampen the explosion of computation time, we have resorted to a special purpose solver for min cost network flows (which generalize the transportation problem) for highly asymmetric problems, as in our case $2K + 1 \ll N$. Based on Brenner's min cost flow algorithm, see Brenner [3], we could still solve problems with $K = 40$ and $N = 8192$ in under a minute.
If one has to resort to the assignment formulation (to consider credit spreads accordingly), computation times increase due to the fact that now assignment problems have to be solved. Here, a factor of 100 compared to the above computation times cannot be avoided.
If the coupling of the two default times is left flexible, the problem becomes a transportation problem with three margins, i.e. of size $(K + 1) \times (K + 1) \times N$. For these types of problems, no special purpose solver is available and one has to resort to CPLEX. Scherer and Schulz [18] have exploited the structure of this three-dimensional transportation problem to reduce computational complexity. They were able to reduce the problem to a standard two-dimensional transportation problem, hence rendering the computation of bounds similarly easy, no matter if default times are already coupled or not.

Conclusion and Outlook
In this paper we have shown how tight bounds on unilateral and bilateral counterparty valuation adjustment can be derived by a linear programming approach. This approach has the advantage that simulations of the uncertain loss, of the default times, and of the uncertain value of a transaction during its remaining life can be completely separated. Although we have restricted the exposition to the case of two counterparties and one derivative transaction, the model can easily be extended to more counterparties and a whole netting node of trades. Further, as exposure is simulated separately from default, all risk-mitigating components like CSAs, rating triggers, and netting agreements can be easily included in such a framework.
Interesting open questions for future research include the analogous treatment in continuous time, which requires much more technically involved arguments. Further, this approach yields a new motivation to consider efficient algorithms for transportation or assignment problems with more than two marginals, which have not received much attention so far.