An elementary proof of the dual representation of Expected Shortfall

We provide an elementary proof of the dual representation of Expected Shortfall on the space of integrable random variables over a general probability space. Unlike the results in the extant literature, our proof only exploits basic properties of quantile functions and can thus be easily implemented in any graduate course on risk measures. As a byproduct, we obtain a new proof of the subadditivity of Expected Shortfall.


Introduction
The debate on capital adequacy and solvency regulation in the past thirty years has been dominated by two competing risk measures: Value at Risk and Expected Shortfall. As is well known, Value at Risk became popular as part of the RiskMetrics package developed by J.P. Morgan in the 1990s with the aim of providing market participants with a set of techniques and data to measure market risks in their portfolios; see [20]. Shortly after, Value at Risk was chosen by the Basel Committee on Banking Supervision as the reference market risk measure in the Basel II framework and was later to become the reference credit risk measure in the Basel III framework as well as the reference metric used by European insurance regulators for the computation of solvency capital requirements within the Solvency II framework; see [6,8,16]. The introduction of Value at Risk raised a number of concerns about its ability to properly capture (tail) risks and to create the right incentives towards portfolio diversification, which eventually led to the definition of Expected Shortfall in the early 2000s; see [2,4,9,27,31,32]. Currently, Expected Shortfall has replaced Value at Risk as the reference market risk measure in the Basel III framework and is employed to compute solvency capital requirements for insurance and reinsurance companies in the Swiss Solvency Test; see [7,17]. Both risk measures have been the subject of an intense research program aimed at uncovering their relative merits and drawbacks from both a theoretical and an empirical point of view; see, e.g., [1,5,10,11,13,21,22,25,26,34,35,36,38].
A key difference at a theoretical level is that Expected Shortfall is a coherent risk measure in the sense of [4], whereas Value at Risk is not, as it fails to satisfy the important property of subadditivity. Being coherent, Expected Shortfall can be equivalently described as a "robust expectation", i.e., as a supremum of expectations over a suitable family of probability measures. More precisely, let $(\Omega, \mathcal{F}, P)$ be a probability space and denote by $L^1$ and $L^\infty$ the spaces of $P$-integrable and $P$-almost surely bounded random variables, respectively. Following [18], we define the Expected Shortfall, also called Average Value at Risk, of $X \in L^1$ at level $\alpha \in (0,1)$ by
$$\mathrm{ES}_\alpha(X) := \frac{1}{\alpha} \int_0^\alpha \mathrm{VaR}_\beta(X) \, d\beta,$$
where $\mathrm{VaR}_\beta(X) := -\inf\{x \in \mathbb{R} \,;\, P(X \le x) > \beta\}$ is the Value at Risk of $X$ at level $\beta \in (0,1)$, which coincides, up to a sign, with the upper quantile of $X$ at level $\beta$. We denote by $\mathcal{P}$ the collection of all probability measures on $\mathcal{F}$, and for $\alpha \in (0,1)$ we set
$$\mathcal{P}_\alpha := \left\{ Q \in \mathcal{P} \,;\, Q \ll P, \ \frac{dQ}{dP} \le \frac{1}{\alpha} \ P\text{-a.s.} \right\}. \tag{1.1}$$
We can then express the Expected Shortfall of $X \in L^1$ as a "robust expectation" over $\mathcal{P}_\alpha$, namely
$$\mathrm{ES}_\alpha(X) = \sup_{Q \in \mathcal{P}_\alpha} E_Q(-X). \tag{1.2}$$
This representation corresponds to the classical "Fenchel-Moreau-Rockafellar" dual representation from convex analysis applied to Expected Shortfall; see [37, Theorem 2.3.3] for a general formulation. The representation is informative per se and becomes a useful tool in a variety of applications featuring Expected Shortfall, e.g., to pricing, hedging, portfolio selection; see [3,12,19,24,29,30,31,32,33]. Historically, the intuition behind the dual representation of Expected Shortfall can be traced back to the link between Expected Shortfall and the Worst Conditional Expectation at level $\alpha \in (0,1)$, which, following [4], is defined for every $X \in L^1$ by
$$\mathrm{WCE}_\alpha(X) := \sup\{ -E_P(X \mid A) \,;\, A \in \mathcal{F}, \ P(A) > \alpha \}.$$
In a nonatomic setting, the Worst Conditional Expectation admits a dual representation in the form of the right-hand side of (1.2); this was established in [14, Example 4.2]. This result then automatically delivers the dual representation of Expected Shortfall for atomless probability spaces because the two risk measures coincide in this setting; see, e.g., [14, Theorem 6.10]. However, as shown in [2], the two risk measures (and their dual representations) do not coincide on general probability spaces, thereby requiring an independent study of Expected Shortfall in a general setting.
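To make the definition and the dual representation (1.2) concrete, the following is a minimal Python sketch for a random variable taking finitely many values. It is not part of the original argument: the function names `upper_quantile`, `es_quantile` and `es_dual` and the numerical example are illustrative assumptions. The sketch evaluates Expected Shortfall once from the quantile integral and once as the largest value of $E_Q(-X)$ over probabilities with $Q(X = x_k) \le P(X = x_k)/\alpha$, using the fact that this maximum is attained by placing as much mass as allowed on the smallest outcomes.

```python
from fractions import Fraction


def upper_quantile(values, probs, beta):
    """Upper quantile inf{x : P(X <= x) > beta} of a finitely valued X."""
    cum = Fraction(0)
    for x, p in sorted(zip(values, probs)):
        cum += p
        if cum > beta:
            return x
    raise ValueError("beta must lie in (0, 1) and probs must sum to 1")


def es_quantile(values, probs, alpha):
    """ES_alpha(X) = -(1/alpha) * integral over (0, alpha) of the upper quantile.

    The quantile function is a step function in beta, so the integral is an
    exact finite sum over the intervals between consecutive breakpoints.
    """
    cuts = {Fraction(0), alpha}
    cum = Fraction(0)
    for _, p in sorted(zip(values, probs)):
        cum += p
        if cum < alpha:
            cuts.add(cum)
    cuts = sorted(cuts)
    integral = sum(
        (b - a) * upper_quantile(values, probs, (a + b) / 2)
        for a, b in zip(cuts, cuts[1:])
    )
    return -integral / alpha


def es_dual(values, probs, alpha):
    """Maximum of E_Q(-X) over Q with Q(X = x_k) <= p_k / alpha (greedy choice)."""
    remaining, total = Fraction(1), Fraction(0)
    for x, p in sorted(zip(values, probs)):  # smallest outcomes first
        q = min(p / alpha, remaining)        # largest admissible mass
        total -= q * x
        remaining -= q
    return total


# Hypothetical example: a loss of 10 with probability 1/20.
values = [-10, 0, 5]
probs = [Fraction(1, 20), Fraction(1, 4), Fraction(7, 10)]
alpha = Fraction(1, 10)

es_from_quantiles = es_quantile(values, probs, alpha)
es_from_dual = es_dual(values, probs, alpha)
```

In this example both computations return 5 at level $\alpha = 1/10$; exact rational arithmetic is used so that the two values can be compared without rounding issues.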
To the best of our knowledge, the first complete derivation of the dual representation of Expected Shortfall was obtained in [18, Theorem 4.39] for bounded random variables. The proof is based on two steps. First, the equivalent formulation of Expected Shortfall as a tail conditional expectation, which was originally proved in [2,27], is used to show that Expected Shortfall dominates from above each expectation in (1.2). Second, a Neyman-Pearson type argument is employed to show that Expected Shortfall coincides with one of those expectations for a suitable choice of the underlying probability, so that one actually has equality in (1.2). A similar two-step argument is used in [28], but the representation is stated for integrable random variables defined on a nonatomic probability space. In the aforementioned references the dual representation was also used to derive, as a direct byproduct, another important property of Expected Shortfall: subadditivity. This duality-based proof of subadditivity is included in the survey article [15], where it is said that "[this proof] is probably the most mathematically advanced among all proofs in this paper". In fact, the authors recommend it "in an advanced course where the axiomatic theory of coherent risk measures is a point of interest".
The goal of this short note is to provide an elementary proof of the dual representation (1.2) of Expected Shortfall and, as a byproduct, a new proof of its subadditivity, for integrable random variables over a general probability space. Our approach only relies on basic properties of quantile functions and standard results from measure theory. In particular, it does not require the equivalent formulation of Expected Shortfall as a tail conditional expectation. We first obtain the desired representation for simple random variables. A straightforward limiting argument allows us to extend it to bounded random variables. Finally, the continuity from above of quantile functions makes it possible to further extend it to all integrable random variables. The advantage of this multi-layer approach is that one can tailor the choice of the model space and, hence, the overall mathematical complexity of the argument to the reference audience (finite/general probability space, simple/bounded/integrable random variables). We believe that the proof in the present note is considerably simpler than the ones in the extant literature and can thus be successfully implemented in any graduate course on risk measures. In this pedagogical spirit, we collect in the appendix all the basic properties of quantile functions that are used in the note and provide a full proof.

Dual representation
In this section we establish the dual representation (1.2) of Expected Shortfall. As a preliminary step, we collect some elementary properties of Expected Shortfall that are used in the proof. They are direct consequences of elementary properties of quantile functions recorded in Lemma 5 in the appendix. In particular, note that Expected Shortfall is well defined by Lemma 5(a) and is continuous from above by combining Lemma 5(d) with monotone convergence.

Proposition 1. For every $\alpha \in (0,1)$ the following statements hold:

In order to prove (1.2), we first establish a link between the sign of Expected Shortfall and the sign of expectations taken under probabilities in the dual set $\mathcal{P}_\alpha$ from (1.1). In the language of risk measures, this is equivalent to establishing a dual representation of the acceptance set associated with Expected Shortfall.

Proposition 2. Let $\alpha \in (0,1)$. For every $X \in L^1$ the following statements hold:
(a) If $\mathrm{ES}_\alpha(X) \le 0$, then $E_Q(X) \ge 0$ for every $Q \in \mathcal{P}_\alpha$.
(b) If $\mathrm{ES}_\alpha(X) > 0$, then $E_Q(X) < 0$ for some $Q \in \mathcal{P}_\alpha$.

Step 1: Discrete random variables. Let $X$ be a discrete random variable taking the values $x_1 < \cdots < x_N$ with probabilities $p_1, \dots, p_N > 0$. For $Q \in \mathcal{P}_\alpha$ set $q_k := Q(X = x_k)$ and note that $\sum_{k=1}^N q_k = 1$. Let $K = \min\{h \in \{1, \dots, N\} \,;$

(a) Suppose that $\mathrm{ES}_\alpha(X) \le 0$ and take $Q \in \mathcal{P}_\alpha$. As $x_K - x_k > 0$ and $q_k \le \frac{p_k}{\alpha}$ for every $k \in \{1, \dots, K-1\}$ and $x_k - x_K > 0$ for every $k \in \{K+1, \dots, N\}$, (2.1) and (2.2) give

(b) Suppose that $\mathrm{ES}_\alpha(X) > 0$. We always find $Q \in \mathcal{P}_\alpha$ such that $q_k = \frac{1}{\alpha} p_k$ for $k \in \{1, \dots, K-1\}$ and $q_k = 0$ for $k \in \{K+1, \dots, N\}$. Then, (2.2) together with (2.1) give

Step 2: Bounded random variables. Let $X \in L^\infty$. Below we use the elementary fact that $X$ can be approximated uniformly from above by discrete random variables.
(a) Suppose that $\mathrm{ES}_\alpha(X) \le 0$ but assume there is

Step 3: Integrable random variables. Let $X \in L^1$ and for all $m, n \in \mathbb{N}$ define $X_{m,n} := \max\{\min\{X, m\}, -n\} \in L^\infty$. It follows from dominated convergence that $E_P(|X - X_{m,n}|) \to 0$. As for every $Q \in \mathcal{P}_\alpha$ we have $\frac{dQ}{dP} \le \frac{1}{\alpha}$ $P$-a.s., this gives $E_Q(|X - X_{m,n}|) \to 0$. Below we additionally use that, by Proposition 1(e), there exists $k \in \mathbb{N}$ such that $\mathrm{ES}_\alpha(\min\{X, m\}) = \mathrm{ES}_\alpha(X)$ for every $m \in \mathbb{N}$ with $m \ge k$.
(a) Assume that $\mathrm{ES}_\alpha(X) \le 0$ and take any Step 2. This yields $E_Q(X) \ge 0$ as well.
(b) Suppose that $\mathrm{ES}_\alpha(X) > 0$ and take any $\varepsilon \in (0, \mathrm{ES}_\alpha(X))$. Note that, for every $m \in \mathbb{N}$, we have In particular, for every $m \in \mathbb{N}$ with $m \ge k$, we have $\mathrm{ES}_\alpha(X_{m,n} + \varepsilon) \to \mathrm{ES}_\alpha(X + \varepsilon) > 0$ by cash invariance of $\mathrm{ES}_\alpha$. Hence, we can take $m, n \in \mathbb{N}$ large enough to obtain both $\mathrm{ES}_\alpha(X_{m,n} + \varepsilon) > 0$ and E

The preceding result delivers at once the desired representation of Expected Shortfall.
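Theorem 3. Let $\alpha \in (0,1)$. For every $X \in L^1$ the following representation holds:
$$\mathrm{ES}_\alpha(X) = \sup_{Q \in \mathcal{P}_\alpha} E_Q(-X). \tag{2.3}$$
In particular, $\mathrm{ES}_\alpha$ is subadditive, i.e., for all $X, Y \in L^1$,
$$\mathrm{ES}_\alpha(X + Y) \le \mathrm{ES}_\alpha(X) + \mathrm{ES}_\alpha(Y).$$

Proof. Let $X \in L^1$. For $m \in \mathbb{R}$, by cash invariance of $\mathrm{ES}_\alpha$ and Proposition 2(a) and (b),
$$\mathrm{ES}_\alpha(X) \le m \implies \mathrm{ES}_\alpha(X + m) \le 0 \implies \sup_{Q \in \mathcal{P}_\alpha} E_Q(-X) \le m,$$
$$\mathrm{ES}_\alpha(X) > m \implies \mathrm{ES}_\alpha(X + m) > 0 \implies \sup_{Q \in \mathcal{P}_\alpha} E_Q(-X) > m.$$
Combining the inequalities above yields (2.3). Finally, subadditivity follows directly from the right-hand side of (2.3) and subadditivity of the supremum.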
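The subadditivity statement can be checked numerically on a small finite probability space. The sketch below is illustrative only: the helper names `expected_shortfall` and `value_at_risk`, the three-state space and the positions `X` and `Y` are assumptions made for the example. Expected Shortfall is computed through the dual formula by placing the maximal admissible mass on the worst states. For comparison, the same example shows that Value at Risk violates subadditivity.

```python
from fractions import Fraction
from itertools import product


def value_at_risk(outcomes, probs, beta):
    """VaR_beta(X) = -inf{x : P(X <= x) > beta} on a finite probability space."""
    cum = Fraction(0)
    for x, p in sorted(zip(outcomes, probs)):
        cum += p
        if cum > beta:
            return -x
    raise ValueError("beta must lie in (0, 1) and probs must sum to 1")


def expected_shortfall(outcomes, probs, alpha):
    """ES_alpha(X) as the maximum of E_Q(-X) over Q with Q({w}) <= P({w}) / alpha."""
    remaining, total = Fraction(1), Fraction(0)
    for x, p in sorted(zip(outcomes, probs)):  # worst states first
        q = min(p / alpha, remaining)          # largest admissible mass
        total -= q * x
        remaining -= q
    return total


# Hypothetical three-state example: two positions with losses in different states.
probs = [Fraction(1, 10), Fraction(1, 10), Fraction(4, 5)]
alpha = Fraction(3, 20)
X = [-10, 0, 0]
Y = [0, -10, 0]
XY = [x + y for x, y in zip(X, Y)]

es_x, es_y, es_xy = (expected_shortfall(Z, probs, alpha) for Z in (X, Y, XY))
var_x, var_y, var_xy = (value_at_risk(Z, probs, alpha) for Z in (X, Y, XY))

# Brute-force check of ES(U + V) <= ES(U) + ES(V) over a small grid of positions.
grid = list(product([-2, 0, 3], repeat=3))
subadditive_on_grid = all(
    expected_shortfall([a + b for a, b in zip(U, V)], probs, alpha)
    <= expected_shortfall(U, probs, alpha) + expected_shortfall(V, probs, alpha)
    for U in grid
    for V in grid
)
```

Here Expected Shortfall equals 20/3 for each of `X` and `Y` and 10 for their sum, in line with subadditivity, whereas Value at Risk equals 0 for each position and 10 for the sum.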
Remark 4. We have opted to divide the proof of Proposition 2 into three steps to enhance versatility. If one is interested only in the dual representation of Expected Shortfall on a finite probability space, or on a general probability space for bounded random variables, then one has to read only up to the end of Step 1 or Step 2, respectively. The proof of Theorem 3 is identical in these cases.

A Some properties of quantile functions
Fix a probability space $(\Omega, \mathcal{F}, P)$. For a random variable $X \in L^1$ we define the cumulative probability at $x \in \mathbb{R}$ by $F_X(x) := P(X \le x)$ and the upper quantile at level $\alpha \in (0,1)$ by
$$q^+_\alpha(X) := \inf\{x \in \mathbb{R} \,;\, F_X(x) > \alpha\}.$$
For a sequence $(X_n) \subset L^1$ we write $X_n \downarrow X$ whenever $X_n \ge X_{n+1}$ for every $n \in \mathbb{N}$ and $X_n \to X$ $P$-a.s. Moreover, we set $X^+ := \max\{X, 0\}$ and $X^- := \max\{-X, 0\}$. The next lemma collects some basic properties of quantile functions. We include a proof for completeness.
Lemma 5. For all $X \in L^1$ and $\alpha \in (0,1)$ the following statements hold:
(a) $q^+_\alpha(X) \in \mathbb{R}$ and
(e) There is $k \in \mathbb{N}$ such that $q^+_\beta(X) = q^+_\beta(\min\{X, m\})$ for all $\beta \in (0, \alpha)$ and $m \in \mathbb{N}$ with $m \ge k$.

Proof. We only prove integrability in (a) and the assertions in (d) and (e) as the other statements are straightforward to verify by definition.
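Statement (e) can be illustrated on a random variable with finitely many values. The following Python sketch is an illustration under stated assumptions, not the argument of the lemma: the names `upper_quantile` and `truncate`, the distribution and the grid of levels are hypothetical. In this discrete example, capping $X$ from above at any integer $m$ no smaller than the upper quantile at level $\alpha$ leaves the upper quantiles at all tested levels below $\alpha$ unchanged, while a smaller cap can change them.

```python
from fractions import Fraction


def upper_quantile(values, probs, beta):
    """Upper quantile inf{x : P(X <= x) > beta} of a finitely valued X."""
    cum = Fraction(0)
    for x, p in sorted(zip(values, probs)):
        cum += p
        if cum > beta:
            return x
    raise ValueError("beta must lie in (0, 1) and probs must sum to 1")


def truncate(values, m):
    """Values of min{X, m}, listed state by state."""
    return [min(x, m) for x in values]


# Hypothetical distribution with one large outcome in the right tail.
values = [-3, 1, 4, 100]
probs = [Fraction(1, 5), Fraction(3, 10), Fraction(3, 10), Fraction(1, 5)]
alpha = Fraction(1, 2)
betas = [Fraction(j, 20) for j in range(1, 10)]  # test levels in (0, alpha)

# Candidate threshold for this example: the upper quantile at level alpha.
k = upper_quantile(values, probs, alpha)

unchanged_for_large_m = all(
    upper_quantile(truncate(values, m), probs, b) == upper_quantile(values, probs, b)
    for m in range(k, k + 50)
    for b in betas
)

# A cap below the threshold can change the quantile at a level below alpha.
changed_for_small_m = (
    upper_quantile(truncate(values, 0), probs, Fraction(3, 10))
    != upper_quantile(values, probs, Fraction(3, 10))
)
```

The threshold used here relies on the observation that, for a finitely valued $X$, the upper quantile of $\min\{X, m\}$ at level $\beta$ is the minimum of $m$ and the upper quantile of $X$ at level $\beta$; the lemma itself only asserts that some threshold $k$ exists.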