Tail-dependence, exceedance sets, and metric embeddings

There are many ways of measuring and modeling tail-dependence in random vectors: from the general framework of multivariate regular variation and the flexible class of max-stable vectors down to simple and concise summary measures like the matrix of bivariate tail-dependence coefficients. This paper starts by providing a review of existing results from a unifying perspective, which highlights connections between extreme value theory and the theory of cuts and metrics. Our approach leads to some new findings in both areas with some applications to current topics in risk management. We begin by using the framework of multivariate regular variation to show that extremal coefficients, or equivalently, the higher-order tail-dependence coefficients of a random vector can simply be understood in terms of random exceedance sets, which allows us to extend the notion of Bernoulli compatibility. In the special but important case of bivariate tail-dependence, we establish a correspondence between tail-dependence matrices and L^1- and ℓ_1-embeddable finite metric spaces via the spectral distance, which is a metric on the space of jointly 1-Fréchet random variables. Namely, the coefficients of the cut-decomposition of the spectral distance and of the Tawn-Molchanov max-stable model realizing the corresponding bivariate extremal dependence coincide.
We show that line metrics are rigid and that if the spectral distance corresponds to a line metric, the higher-order tail-dependence is determined by the bivariate tail-dependence matrix. Finally, the correspondence between ℓ_1-embeddable metric spaces and tail-dependence matrices allows us to revisit the realizability problem, i.e. checking whether a given matrix is a valid tail-dependence matrix. We confirm a conjecture of Shyamalkumar and Tao (2020) that this problem is NP-complete.


Introduction
Extreme events such as large portfolio losses in insurance and finance, spatial and environmental extremes such as heat-waves, floods, electric grid outages, and many other complex system failures are associated with tail-events. That is, the simultaneous occurrence of extreme values in the components of a possibly very high-dimensional vector X = (X_i)_{1≤i≤p} of covariates. Such simultaneous extremes occur due to dependence among the extremes of the X_i's. This has motivated a large body of literature on modeling and quantifying tail-dependence, see, e.g. (Coles 2001, Finkenstädt & Rootzén 2003, Rachev 2003, Beirlant et al. 2004, Castillo 1988, Resnick 2007, de Haan & Ferreira 2007). One basic and popular measure is the bivariate (upper) tail-dependence coefficient

λ_X(i, j) := lim_{u↑1} P[X_i > F_i^{-1}(u) | X_j > F_j^{-1}(u)],   (1.1)

where F_i^{-1}(u) := inf{x : P[X_i ≤ x] ≥ u} is the generalized inverse of the cumulative distribution function F_i of X_i. Under weak conditions the above limit exists and is independent of the choice of the (continuous) marginal distributions of (X_i, X_j). The matrix Λ := (λ_X(i, j))_{p×p} of bivariate tail-dependence coefficients is necessarily positive semi-definite and in fact, since λ_X(i, i) = 1, it is a correlation matrix of a random vector, see Schlather & Tawn (2003). We call Λ as defined in (1.1) a tail-dependence matrix or TD matrix for short.
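In practice, the limit in (1.1) is replaced by a finite threshold: among the k largest observations of one component, one counts how often the paired observation of the other component is also among its k largest. The following is a minimal rank-based sketch; the function name and the choice of k are illustrative, not from the paper.

```python
import numpy as np

def emp_tail_dep(x, y, k):
    """Empirical version of (1.1): fraction of the k largest y-observations
    whose paired x-observation is also among the k largest x-observations."""
    n = len(x)
    rank_x = np.argsort(np.argsort(x))  # ranks 0 .. n-1
    rank_y = np.argsort(np.argsort(y))
    joint = np.sum((rank_x >= n - k) & (rank_y >= n - k))
    return joint / k

# sanity checks: comonotone samples give 1, countermonotone samples give 0
x = np.linspace(0.0, 1.0, 200)
print(emp_tail_dep(x, x, 20))   # 1.0
print(emp_tail_dep(x, -x, 20))  # 0.0
```

The choice of k trades off bias (k too large) against variance (k too small), as usual in extreme value statistics.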
The general theme of our paper is that we review and contribute to the unified treatment of tail-dependence using the powerful framework of multivariate regular variation. This leads to deep connections to existing results in the theory of cut (semi)metrics and ℓ_1-embeddable metrics (Deza & Laurent 1997), as well as to extensions of the Bernoulli compatibility characterization of tail-dependence matrices established in Embrechts et al. (2016) and Krause et al. (2018). What follows is an overview of our key ideas and contributions.
Since the marginal distributions of X are not important in quantifying tail-dependence, one may transform its marginals to be heavy-tailed. In fact, we make the additional and often very mild assumption that the vector X is regularly varying, i.e., that there exists a Radon measure µ on R^p \ {0} and a suitable positive sequence a_n ↑ ∞ such that nP[X ∈ a_n A] → µ(A), as n → ∞, for all Borel sets A ⊂ R^p that are bounded away from 0 and such that µ(∂A) = 0 (see Definition 2.1, below). This allows us to conclude that nP[h(X) > a_n] → µ{h > 1} for a large class of continuous and 1-homogeneous functions h : R^p → [0, ∞) (Proposition 2.5). Therefore, if h is a certain risk functional, we readily obtain an asymptotic approximation of the probability of an extreme loss P[h(X) > a_n] ≈ n^{-1} µ{h > 1}. By varying the risk functional h, one obtains different measures of tail-dependence, which may be of particular interest to practitioners. For example, if L ⊂ [p] := {1, . . . , p} and taking h_L(X) = (min_{i∈L} X_i)_+ := max{0, min_{i∈L} X_i}, the risk functional quantifies the joint exceedance probability that all components of X with index in the set L are simultaneously extreme, an event with potentially devastating consequences. In practice, due to the limited horizon of historical data, such extreme events, especially for large sets L, are rarely (if ever) observed. Thus, quantifying their probabilities is very challenging. Yet, as Emil Gumbel had eloquently put it, "It is not possible that the improbable will never occur." This underscores the importance of the theoretical understanding, modeling, and inference of such functionals. Namely, one naturally arrives at the higher-order tail-dependence coefficients

λ_X(L) := lim_{n→∞} nP[X_i > a_n for all i ∈ L] = µ{h_L > 1}.

It can be seen that if the marginals of the X_i's are identical and a_n is such that n^{-1} ∼ P[X_i > a_n] (i.e. lim_{n→∞} nP[X_i > a_n] = 1), then λ_X({i, j}) = lim_{n→∞} P[X_i > a_n | X_j > a_n] recovers the classic bivariate tail-dependence coefficients λ_X(i, j) in (1.1). Using the functionals
h(X) := max_{j∈K} X_j for some K ⊂ [p], one arrives at the popular extremal coefficients arising in the study of max-stable processes:

θ_X(K) := lim_{n→∞} nP[max_{j∈K} X_j > a_n] = µ{x : max_{j∈K} x_j > 1}.

Starting from the seminal works of Schlather & Tawn (2002, 2003), the structure of the extremal coefficients {θ_X(K), K ⊂ [p]} has been studied extensively, see Strokorb & Schlather (2015), Strokorb et al. (2015), Molchanov & Strokorb (2016), Fiebig et al. (2017), which address fundamental theoretical problems and develop stochastic process extensions. Our goal here is more modest. We want to study both the tail-dependence and extremal coefficients as risk functionals from the unifying perspective of regular variation. Interestingly, they can be succinctly understood in terms of exceedance sets. Namely, defining the random set Θ_n := {i ∈ [p] : X_i > a_n}, we show (Proposition 3.1 below) that the conditional distribution of Θ_n given Θ_n ≠ ∅ converges, where the limit Θ is a non-empty random subset of [p] such that

λ_X(L) = a P[L ⊂ Θ] and θ_X(K) = a P[Θ ∩ K ≠ ∅],   (1.2)

where a = θ_X([p]). Thus, λ_X and θ_X (up to rescaling by a) are precisely the inclusion and hitting functionals characterizing the distribution of Θ (Molchanov 2017). Interestingly, the probability mass function of the random set Θ recovers (up to rescaling) the coefficients in a (generalized) Tawn-Molchanov max-stable model associated with X (see (3.6)).
The above probabilistic representation in (1.2) of the tail-dependence functionals leads to transparent proofs of seminal results from Embrechts et al. (2016) and Krause et al. (2018) on the characterization of TD matrices in terms of so-called Bernoulli-compatible matrices. In fact, we readily obtain a more general result on the characterization of higher-order tail-dependence coefficients via Bernoulli-compatible tensors (Proposition 3.4).
Associated to the bivariate tail-dependence coefficients λ_X({i, j}) we introduce and discuss the so-called spectral distance d_X given by

d_X(i, j) := 2 ||X_i ∨ X_j||_1 − ||X_i||_1 − ||X_j||_1.

This spectral distance defines a metric on the space of 1-Fréchet random variables (i.e. random variables with distribution function F(x) = exp{−c/x}, x ≥ 0, for some non-negative scale coefficient c, where we speak of a standard 1-Fréchet distribution if c = 1) living on a joint probability space, which metricizes convergence in probability and was considered in Davis & Resnick (1993), Stoev & Taqqu (2005), Fiebig et al. (2017). In Section 4 we will establish the L^1-embeddability of this metric, which allows us to apply the rich theory of metric embeddings in the context of analyzing the tail-dependence coefficients. In Section 4.2, utilizing the exceedance set representation of the bivariate tail-dependence coefficients and the L^1-embeddability of the spectral distance, we recover the equivalence of L^1- and ℓ_1-embeddability as well as a probabilistic proof of the so-called cut-decomposition of ℓ_1-embeddable finite metric spaces. In this case, this decomposition turns out to be closely related to the Tawn-Molchanov model of an associated max-stable vector X (Proposition 4.5). When a given ℓ_1-embeddable metric has a unique cut-decomposition, it is called rigid (Deza & Laurent 1997).
Rigidity of the spectral distance basically means that the bivariate tail-dependence coefficients Λ determine all higher-order tail-dependence coefficients. In Theorem 4.11, we show that line metrics are rigid, which to the best of our knowledge is a new finding. In particular, we obtain that the bivariate tail-dependence coefficient matrices corresponding to line metrics determine the complete set of tail-dependence or, equivalently, extremal coefficients of X. Interestingly, the random set Θ corresponding to such line-metric tail-dependence is (after a suitable reordering of marginals) a random segment, more precisely a random set of the form {i, i + 1, . . . , j − 1, j} for 1 ≤ i ≤ j ≤ p with i = 1 or j = p. In general, the characterization of rigidity is computationally hard as it is equivalent to the characterization of the simplex faces of the cone of cut metrics (Deza & Laurent 1997).
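The cut-decomposition underlying this rigidity statement can be made concrete for a line metric: with sorted points x_1 ≤ . . . ≤ x_p, the metric d(i, j) = |x_i − x_j| decomposes over the "interval" cuts {1, . . . , k} with weights given by the consecutive gaps. A small numerical sketch (the points x are a made-up example):

```python
import numpy as np

# sorted points on the line define the metric d(i, j) = |x_i - x_j|
x = np.array([0.0, 0.3, 1.0, 1.4])
p = len(x)
d = np.abs(x[:, None] - x[None, :])

# cut semimetric of S: delta_S(i, j) = 1 iff exactly one of i, j lies in S
recon = np.zeros((p, p))
for k in range(p - 1):
    in_S = np.arange(p) <= k                       # cut S = {0, ..., k}
    delta = (in_S[:, None] != in_S[None, :]).astype(float)
    recon += (x[k + 1] - x[k]) * delta             # weight = consecutive gap

print(np.allclose(d, recon))  # True: the gaps are the cut weights
```

Uniqueness of these weights is exactly the rigidity of line metrics established in Theorem 4.11.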
The bivariate TD matrix Λ is a correlation matrix of a random vector. It is well-known, however, that not every correlation matrix with non-negative entries is a matrix of tail-dependence coefficients. The recent works of Fiebig et al. (2017), Embrechts et al. (2016), Krause et al. (2018), and Shyamalkumar & Tao (2020), among others, have studied extensively various aspects of the class of TD matrices. One surprisingly difficult problem, referred to as the realizability problem, is checking whether a given matrix Λ is a valid TD matrix. The extensive study of Shyamalkumar & Tao (2020) proposed several practical and efficient algorithms for realizability. Moreover, Shyamalkumar & Tao (2020) conjectured that the realizability problem is NP-complete. In Section 5, we confirm their conjecture. We do so by exploiting the established connection to ℓ_1-embeddability, which allows us to utilize the rich theory on cuts and metrics outlined in the monograph of Deza & Laurent (1997). It is known that checking whether any given p-point metric space is ℓ_1-embeddable is a computationally hard problem in the NP-complete class.
The paper is structured as follows: In Section 2 we give an overview of several ways of modeling and measuring the tail-dependence of a random vector, presented in a hierarchical fashion: First, multivariate regular variation allows for the most complete asymptotic description of the tail behavior of (heavy-tailed) random vectors in terms of the tail measure, with a direct correspondence to the class of max-stable models as the natural representatives for each given tail measure. A more condensed description of tail-dependence is given by the values of special extremal dependence functionals like the extremal coefficients and tail-dependence coefficients. Finally, a rather coarse but popular description of the tail-dependence is given in the form of those functionals evaluated only at bivariate marginals, where the bivariate tail-dependence coefficients form the most prominent example.
In Section 3 we first discuss exceedance sets, as introduced above, and Bernoulli compatibility.
Based on this interpretation we give a short introduction into generalized Tawn-Molchanov models.
In Section 4 we explore the relationship between bivariate tail-dependence coefficients and the spectral distance on the space of 1-Fréchet random variables. After a brief introduction to the concepts of metric embeddings of finite metric spaces, we will show that the spectral distance is both L^1- and ℓ_1-embeddable, some consequences of which will be explored in Section 4.2 and Section 5. In Section 4.2 we introduce the concept of rigid metrics and prove that the building blocks of ℓ_1-embeddability, i.e. the line metrics, correspond to Tawn-Molchanov models with a special structure which is completely determined by this line metric. Finally, in Section 5 we use known results about the computational complexity of embedding problems to show that the realization problem of a tail-dependence matrix is NP-complete. Some proofs are deferred to Appendix A.
2 Regular variation, max-stability, and extremal dependence

In this section, we provide a concise overview of fundamental notions on multivariate regular variation and max-stable distributions, which underpin the study of tail-dependence.

Multivariate regular variation
The concept of multivariate regular variation is key to the unified treatment of the various tail-dependence notions we will consider. Much of this material is classic but we provide here a self-contained review tailored to our purposes. Many more details and insights can be found in Resnick (1987, 2007), Hult & Lindskog (2006), Basrak & Planinić (2019), Kulik & Soulier (2020) among other sources.
We start with a few notations. A set A ⊂ R^p is said to be bounded away from 0 if 0 ∉ A^cl, i.e., A ∩ B(0, ε) = ∅ for some ε > 0. Here A^cl is the closure of A and B(x, r) := {y ∈ R^p : ||x − y|| < r} is the ball of radius r centered at x in a given fixed norm ||·||. Furthermore, denote the Borel σ-algebra on R^p by B(R^p).
Consider the class M_0(R^p) of all Borel measures µ on B(R^p) that are finite on sets bounded away from 0, i.e., such that µ(B(0, ε)^c) < ∞ for all ε > 0. Such measures will be referred to as boundedly finite. For µ_n, µ ∈ M_0(R^p), we write µ_n → µ in M_0(R^p) if

∫ f dµ_n → ∫ f dµ, as n → ∞,

for all bounded and continuous f vanishing in a neighborhood of 0. The latter is equivalent to having µ_n(A) → µ(A) for all µ-continuity Borel sets A that are bounded away from 0 (Hult & Lindskog 2006, Theorems 2.1 and 2.4).
Definition 2.1. A random vector X in R^p is said to be regularly varying if there is a positive sequence a_n ↑ ∞ and a non-zero measure µ ∈ M_0(R^p) such that

nP[a_n^{−1} X ∈ ·] → µ(·) in M_0(R^p), as n → ∞.   (2.1)

In this case, we write X ∈ RV({a_n}, µ) and call µ the tail measure of X. The tail measure necessarily satisfies the scaling property

µ(tA) = t^{−α} µ(A), t > 0, A ∈ B(R^p),   (2.2)

for some α > 0, referred to as the index of regular variation; we write index(X) = α and also X ∈ RV_α({a_n}, µ).
The measure µ is unique up to a multiplicative constant and the scaling property (2.2) implies that µ factors into a radial and an angular component. Namely, fix any norm ||·|| on R^p and define the polar coordinates r := ||x|| and u := x/||x||, x ≠ 0. Then,

µ{x : ||x|| > r, x/||x|| ∈ A} = r^{−α} σ(A), r > 0, A ∈ B(S),

where S := {x : ||x|| = 1} is the unit sphere and σ is a finite Borel measure on S referred to as the angular or spectral measure associated with µ, see, e.g., Kulik & Soulier (2020), Section 2.2. Given the norm ||·||, the measure σ is uniquely determined as

σ(A) = µ{x : ||x|| > 1, x/||x|| ∈ A}, A ∈ B(S),

where B(A) for A ⊂ R^d denotes the d-dimensional Borel sets which are also subsets of A. The following is a useful characterization of regular variation sometimes taken as an equivalent definition, see again, e.g., Kulik & Soulier (2020), Section 2.2.
Proposition 2.2. We have X ∈ RV_α({a_n}, µ) if and only if, as n → ∞,

nP[||X|| > t a_n] → t^{−α} σ(S), for all t > 0, and P[X/||X|| ∈ · | ||X|| > a_n] ⇒ σ(·)/σ(S),

where ⇒ denotes the weak convergence of probability distributions.
Proposition 2.2 characterizes regularly varying random vectors in terms of exceedances over a threshold. An equivalent characterization is also possible in terms of maxima, see, e.g., Kulik & Soulier (2020), Section 2.1.
Proposition 2.3. For a random vector Y in [0, ∞)^p we have Y ∈ RV_α({a_n}, µ) if and only if there exists a non-degenerate random vector X such that

P[a_n^{−1} (Y^(1) ∨ · · · ∨ Y^(n)) ≤ x] → P[X ≤ x], as n → ∞, for all x ∈ [0, ∞)^p,

where Y^(t), t = 1, . . . , n are independent copies of Y and the operation ∨ denotes taking the component-wise maximum.
Multivariate regular variation provides an asymptotic framework: for given α, {a_n} and µ there exist several distributions of random vectors Y such that Y ∈ RV_α({a_n}, µ), but according to Proposition 2.3 their maxima are all attracted to the same random vector X, whose distribution depends only on µ. The class of limiting random vectors in Proposition 2.3 will be inspected more closely in the next section.

Max-stable vectors
The homogeneity property (2.2) of µ implies that the limiting random vector in Proposition 2.3 has a certain stability property, namely that

X^(1) ∨ · · · ∨ X^(n) =_d a_n X, for all n ∈ N,

with the same notation as in Proposition 2.3 and where =_d stands for equality in distribution, see Kulik & Soulier (2020), Section 2.1. We call such a random vector X max-stable and we call X non-degenerate max-stable if in addition P[X = (0, . . . , 0)] < 1. For α = 1 this simplifies to

X^(1) ∨ · · · ∨ X^(n) =_d nX, for all n ∈ N,   (2.5)

and we speak of a simple max-stable random vector X, which we will further analyze in the following.
The marginal distributions of simple max-stable distributions are necessarily 1-Fréchet, that is,

P[X_i ≤ x] = exp{−σ_i/x}, x > 0,

for some non-negative scale coefficient σ_i. We shall write ||X_i||_1 := σ_i for the scale coefficient of the 1-Fréchet variable X_i. The next result characterizes all multivariate simple max-stable distributions.
Here, we recall the so-called de Haan construction of a simple max-stable vector.
Proposition 2.4. Let (E, E, ν) be a measure space and let L^1_+(E, ν) denote the set of all non-negative ν-integrable functions on E. For every collection f_i ∈ L^1_+(E, ν), 1 ≤ i ≤ p, there is a random vector X = (X_i)_{1≤i≤p} such that, for all x_i > 0, 1 ≤ i ≤ p,

P[X_i ≤ x_i, 1 ≤ i ≤ p] = exp{ − ∫_E max_{1≤i≤p} (f_i(u)/x_i) ν(du) }.   (2.6)

The random vector X is simple max-stable. Conversely, for every simple max-stable vector X, Equation (2.6) holds and (E, E, ν) can be chosen as ([0, 1], B[0, 1], Leb). In fact, we have the stochastic representation

X_i = I(f_i) := ∫^e_E f_i dM_ν, 1 ≤ i ≤ p,   (2.7)

where M_ν is a 1-Fréchet random sup-measure on (E, E) with control measure ν. For a proof and more details, see e.g. de Haan (1984), Stoev & Taqqu (2005). The functions f_i in (2.6) and (2.7) are referred to as spectral functions associated with the vector X. From (2.6) and (2.7), one can readily see that for all f ∈ L^1_+(E, ν), the so-called extremal integral I(f) in (2.7) is a well-defined 1-Fréchet random variable. More precisely, its cumulative distribution function is:

P[I(f) ≤ x] = exp{ − x^{−1} ∫_E f dν }, x > 0.

Moreover, the extremal integral functional I(·) is max-linear in the sense that

I(max_{1≤i≤n} a_i f_i) = max_{1≤i≤n} a_i I(f_i), almost surely,

for all a_i ≥ 0 and f_i ∈ L^1_+(E, ν). Thus, every max-linear combination ∨_{i=1}^n a_i X_i of X as above with coefficients a_i ≥ 0 is a 1-Fréchet random variable with scale coefficient:

|| ∨_{i=1}^n a_i X_i ||_1 = ∫_E max_{1≤i≤n} a_i f_i dν.

We will further explore the asymptotic properties of simple max-stable random vectors and how they fit into the framework of multivariate regular variation in the following section.
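When ν is a counting measure on finitely many atoms, the distribution (2.6) is an explicit finite sum, and the simple max-stability property (2.5), F(nx)^n = F(x), can be verified numerically. A minimal sketch with hypothetical spectral functions f:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.uniform(0.1, 1.0, size=(3, 5))  # hypothetical spectral functions f_i(u), 5 atoms

def max_stable_cdf(x, f):
    """P[X_i <= x_i, all i] = exp(-sum_u max_i f_i(u)/x_i), cf. (2.6) with counting nu."""
    x = np.asarray(x, dtype=float)
    return np.exp(-np.max(f / x[:, None], axis=0).sum())

x = np.array([1.0, 2.0, 0.5])
for n in (2, 5, 10):
    # simple max-stability (2.5) at the level of distribution functions
    assert np.isclose(max_stable_cdf(n * x, f) ** n, max_stable_cdf(x, f))
```

The check works because the exponent Σ_u max_i f_i(u)/x_i is (−1)-homogeneous in x, so evaluating at nx divides it by n, exactly cancelling the n-th power.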

Extremal dependence functionals and tail-dependence coefficients
The tail measure µ and the normalizing sequence {a_n} from Section 2.1 provide a comprehensive description of the asymptotic behavior of a random vector X and allow us to approximate probabilities of the form P[X ∈ a_n A] for all sets A bounded away from 0. Sometimes, however, one may be interested in those probabilities for certain simple sets A only and describe the asymptotic behavior of X by certain extremal dependence functionals instead. In this section, we first derive a general result for such extremal dependence functionals and then introduce two particularly popular families of them.
Proposition 2.5. Let X ∈ RV({a_n}, µ) and let h : R^p → [0, ∞) be a continuous, 1-homogeneous function with µ(∂{h > 1}) = 0. Then

nP[h(X) > a_n] → µ{h > 1}, as n → ∞.   (2.8)

Though this result is similar to Yuen et al. (2020), Lemma A.7, and also a special case of Dyszewski & Mikosch (2020), Theorem 2.1, its proof is given in Appendix A.
We will apply the formula in (2.8) for homogeneous functionals of the form h(x) = (min_{i∈K} x_i)_+ and h(x) = (max_{i∈K} x_i)_+ for some subset K ⊂ [p] = {1, . . . , p}.
The next result shows that simple max-stable vectors are regularly varying and provides means to express their extremal dependence functionals both in terms of spectral functions and tail measures.
Proposition 2.6. Let X be a non-degenerate simple max-stable vector with spectral functions f_i, 1 ≤ i ≤ p, as in (2.6). Then, X ∈ RV_1({n}, µ), where µ is supported on [0, ∞)^p and for all x ∈ (0, ∞)^p,

µ([0, x]^c) = ∫_E max_{1≤i≤p} (f_i(u)/x_i) ν(du).   (2.9)

Moreover, for every non-negative, continuous 1-homogeneous function h : R^p → [0, ∞), we have

lim_{n→∞} nP[h(X) > n] = µ{h > 1} = ∫_E h(f_1(u), . . . , f_p(u)) ν(du).

In particular, the spectral measure σ has the representation

σ(A) = ∫_E 1{f(u)/||f(u)|| ∈ A} ||f(u)|| ν(du), A ∈ B(S), where f := (f_1, . . . , f_p).   (2.10)

Again, this result is standard but we sketch its proof for the sake of completeness in Appendix A.
The classic representation of the simple max-stable cumulative distribution functions is a simple corollary of Proposition 2.6.
Corollary 2.7. In the situation of Proposition 2.6, by taking h(u) = max_{1≤i≤p}(u_i/x_i) for fixed x ∈ (0, ∞)^p, we obtain the classic representation

P[X ≤ x] = exp{−µ([0, x]^c)}, x ∈ (0, ∞)^p.   (2.11)

For more details on the characterization of the max-domain of attraction of multivariate max-stable laws in terms of multivariate regular variation, see e.g., Proposition 5.17 in Resnick (1987).
We are now ready to recall the general definitions of the extremal and tail-dependence coefficients of a regularly varying random vector, which have briefly been introduced in Section 1, now with additional notation for the normalizing sequence {a_n}.
Definition 2.8. Let X ∈ RV({a_n}, µ). For non-empty subsets K, L ⊂ [p], define

θ_X(K; {a_n}) := lim_{n→∞} nP[max_{i∈K} X_i > a_n] and λ_X(L; {a_n}) := lim_{n→∞} nP[min_{i∈L} X_i > a_n].

The θ_X(K; {a_n})'s and λ_X(L; {a_n})'s are referred to as the extremal and tail-dependence coefficients relative to {a_n} of the vector X, respectively.
If it is clear to which random vector we refer, or if it does not matter for the argument, we may drop the index X and just write θ(K; {a_n}) and λ(K; {a_n}). Sometimes we will view θ and λ as functions of k-tuples and write, for example, λ_X(i_1, . . . , i_k; {a_n}) (where some of the arguments i_1, . . . , i_k may repeat), which corresponds to λ_X(L; {a_n}) where L is the set of all distinct values in {i_1, . . . , i_k}.
Remark 2.9. Note that the definitions of θ_X(K; {a_n}) and λ_X(L; {a_n}) depend on the choice of the sequence {a_n}. They are unique, however, up to a multiplicative constant. More precisely, if index(X) = α and a_n ∼ c a'_n for some c > 0, then

θ_X(K; {a'_n}) = c^α θ_X(K; {a_n}) and λ_X(L; {a'_n}) = c^α λ_X(L; {a_n}).

Remark 2.10. In the following we will focus on extremal and tail-dependence coefficients of max-stable random vectors, which exist by Definition 2.8 in combination with Proposition 2.6 as long as X is non-degenerate. Observe that if X is non-degenerate simple max-stable, then

lim_{n→∞} nP[X_i > n] = ||X_i||_1.

Thus, if all marginals of X are standard 1-Fréchet, i.e., ||X_i||_1 = 1, then setting a_n = n ensures that lim_{n→∞} nP[X_i > a_n] = 1 and one recovers the upper tail-dependence coefficient λ_X(i, j) from (1.1), i, j ∈ [p]. More generally, if X is non-degenerate simple max-stable, then we can choose a_n = n as a normalizing sequence and in this case (or if the sequence {a_n} does not matter for the argument), we will also write θ_X(K) := θ_X(K; {n}) and λ_X(L) := λ_X(L; {n}). In the case that P[X = (0, . . . , 0)] = 1, we set θ_X(K) := λ_X(L) := 0. The following result expresses these functionals in terms of both the tail measure µ and the spectral functions of the vector X. Again, the proof is given in Appendix A.

Corollary 2.11. Let X be a non-degenerate simple max-stable vector with spectral functions f_i, 1 ≤ i ≤ p. Then, for all non-empty K, L ⊂ [p],

θ_X(K) = µ{x : max_{i∈K} x_i > 1} = ∫_E max_{i∈K} f_i dν,   (2.12)

λ_X(L) = µ{x : min_{i∈L} x_i > 1} = ∫_E min_{i∈L} f_i dν.   (2.13)
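For a discrete spectral measure, the spectral-function expressions ∫_E max_{i∈K} f_i dν and ∫_E min_{i∈L} f_i dν become finite sums, and the inclusion-exclusion relation between the extremal and tail-dependence coefficients can be checked directly. A sketch with hypothetical spectral functions (ν a counting measure):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
f = rng.uniform(size=(4, 6))  # hypothetical spectral functions, counting measure nu

def theta(K):   # extremal coefficient: integral of max_{i in K} f_i
    return f[list(K)].max(axis=0).sum()

def lam(L):     # tail-dependence coefficient: integral of min_{i in L} f_i
    return f[list(L)].min(axis=0).sum()

# pointwise max-min inclusion-exclusion yields
# theta(K) = sum over nonempty L subset K of (-1)^{|L|+1} lam(L)
K = (0, 1, 2, 3)
ie = sum((-1) ** (r + 1) * lam(L)
         for r in range(1, len(K) + 1) for L in combinations(K, r))
assert np.isclose(theta(K), ie)
```

The identity holds atom by atom, since max of a finite set equals the alternating sum of the minima over its non-empty subsets.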

Bivariate tail-dependence measures and spectral distance
In Definition 2.8 we introduced general extremal and tail-dependence coefficients for arbitrary non-empty subsets K, L ⊂ [p], i.e. for 2^p − 1 different sets. Often these are too many coefficients for a handy description of the dependence structure. Therefore, one may consider only the pairwise dependence in a simple max-stable vector X, which corresponds to the consideration of sets K and L with at most two entries. The set of tail-dependence coefficients with sets containing at most two elements can be written in the so-called matrix of bivariate tail-dependence coefficients, which we denote by Λ := (λ_X(i, j))_{1≤i,j≤p}. For the bivariate tail-dependence we have the alternative representation

λ_X(i, j) = ||X_i||_1 + ||X_j||_1 − ||X_i ∨ X_j||_1,   (2.14)

since nP[X_i > n, X_j > n] = nP[X_i > n] + nP[X_j > n] − nP[X_i ∨ X_j > n] → ||X_i||_1 + ||X_j||_1 − ||X_i ∨ X_j||_1, as n → ∞, where ||X_i ∨ X_j||_1 denotes the scale coefficient of the 1-Fréchet distribution of X_i ∨ X_j. Thus, for standardized marginals ||X_i||_1 = 1, 1 ≤ i ≤ p, the bivariate tail-dependence coefficients also have the following representation for all 1 ≤ i, j ≤ p:

λ_X(i, j) = 2 − ||X_i ∨ X_j||_1.   (2.15)

In this form, the bivariate tail-dependence matrix is a popular measure for the extremal dependence in the random vector X. First appearing around the 60's (e.g. de Oliveira (1962)), the bivariate tail-dependence coefficients are frequently considered in the literature, see e.g. Coles et al. (1999), Beirlant et al. (2004), Frahm et al. (2005), Fiebig et al. (2017), Shyamalkumar & Tao (2020) for different considerations (sometimes under other names such as coefficient of (upper) tail-dependence or χ-measure). In the context of finance and insurance, but also in an environmental context, this measure is used to describe the extremal risk in the random vector X. Moreover, the characterization of whether X_i and X_j are extremally dependent is usually formulated via these bivariate tail-dependence coefficients: If λ_X(i, j) = 0, then X_i and X_j are extremally independent; otherwise the two random variables are extremally dependent.
Note that for standardized marginals the relation θ_X(i, j) = 2 − λ_X(i, j) holds. The extremal dependence coefficient in this form has often been used in the literature as a measure for extremal dependence, see e.g. Smith (1990), Schlather & Tawn (2003), Strokorb & Schlather (2015).
In all these references, the tail-dependence coefficient was defined as in (2.15) and standardized (or at least identically distributed) marginal distributions were assumed, as is common for the analysis of dependence. However, we allow for unequal scales and therefore use the more general form (2.14).
Remark 2.12. The matrix of bivariate tail-dependence coefficients Λ of a simple max-stable vector is necessarily positive semi-definite. Indeed, this follows from the observation that, by Corollary 2.11,

λ_X(i, j) = ∫_E f_i(u) ∧ f_j(u) ν(du) = ∫_E Cov(B(f_i(u)), B(f_j(u))) ν(du),

where B = {B(t), t ≥ 0} is a standard Brownian motion, and since non-negative mixtures of covariance matrices are again covariance matrices. Another way to see this is from the observation that, for each n, the matrix with entries nP[X_i > n, X_j > n] is positive semi-definite, which is related to the fact that (i, j) → λ(i, j) is, up to a multiplicative constant, the covariance function of a certain random exceedance set (see Remark 3.6, below).
The matrix Λ is thus positive semi-definite, has non-negative entries, and for standardized marginals of X it holds that λ({i}) = 1, i.e. Λ is a correlation matrix. However, not every correlation matrix with non-negative entries is necessarily a matrix of bivariate tail-dependence coefficients. The realization problem (i.e. the question whether a given matrix is the tail-dependence matrix of some random vector) is a recent topic in the literature (Fiebig et al. 2017, Krause et al. 2018, Shyamalkumar & Tao 2020). We will further discuss this problem in Section 5. Related to the bivariate dependence coefficients we define an associated function, which will turn out to be a semi-metric on [p].
Definition 2.13. Let X = (X_i)_{1≤i≤p} be a simple max-stable vector. Then, for i, j ∈ [p], the spectral distance d_X is defined by

d_X(i, j) := 2 ||X_i ∨ X_j||_1 − ||X_i||_1 − ||X_j||_1.   (2.17)

If the scales of the marginals of the simple max-stable vector (X_i)_{1≤i≤p} are the same, i.e. ||X_i||_1 = c for some c > 0 and all 1 ≤ i ≤ p, then (2.17) simplifies to

d_X(i, j) = 2(c − λ_X(i, j)).

For standard 1-Fréchet marginals this further reduces to d(i, j) = 2(1 − λ_X(i, j)).
The spectral distance for max-stable vectors was already considered in Stoev & Taqqu (2005), equation (2.11). There it was shown that this distance is indeed a semi-metric on [p] (Stoev & Taqqu 2005, Proposition 2.6) and that it metricizes convergence in probability in 1-Fréchet spaces (Stoev & Taqqu 2005, Proposition 2.4). In the form of (2.17), the spectral distance also appears in Fiebig et al. (2017), where it was defined in two steps (Fiebig et al. 2017, Propositions 34 and 37). There, the use of the spectral distance is based on the fundamental work of (Deza & Laurent 1997, Section 5.2), where it is used in a different context.
In Section 4 we will prove that the spectral distance of a simple max-stable vector X is L^1-embeddable, with representation d_X(i, j) = ||f_i − f_j||_{L^1}, where f_i, f_j are the spectral functions of X.
In this form, the spectral distance was already used in Davis & Resnick (1989, 1993), where it was mainly applied in a projection method for prediction of max-stable processes. Davis & Resnick (1993) also gave a connection to the bivariate tail-dependence coefficients λ(i, j) as considered in de Oliveira (1962), but only in the case of equally scaled marginals.
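The two expressions for the spectral distance, the scale form 2||X_i ∨ X_j||_1 − ||X_i||_1 − ||X_j||_1 of (2.17) and the L^1 form ||f_i − f_j||_{L^1}, agree because |a − b| = 2 max(a, b) − a − b pointwise. For a discrete spectral measure this is a one-line numerical check (the spectral functions below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.uniform(size=(3, 8))      # hypothetical spectral functions, counting measure

def scale(g):                     # ||.||_1: integral under the counting measure
    return g.sum()

for i in range(3):
    for j in range(3):
        l1_form = np.abs(f[i] - f[j]).sum()                         # ||f_i - f_j||_{L^1}
        scale_form = 2 * scale(np.maximum(f[i], f[j])) - scale(f[i]) - scale(f[j])
        assert np.isclose(l1_form, scale_form)
```

In particular, the triangle inequality for d_X is inherited from the L^1 norm, which is the essence of the L^1-embeddability established in Section 4.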
3 Tail-dependence via exceedance sets

In this section we develop a unified approach to representing tail-dependence via random exceedance sets, which explains and extends the notion of Bernoulli compatibility discovered in Embrechts et al. (2016) to higher order tail-dependence. Moreover, we introduce a slight extension of the so-called Tawn-Molchanov models and explore their connections to extremal and tail-dependence coefficients.

Bernoulli compatibility
We will first demonstrate that tail-dependence can be succinctly characterized via a random set obtained as the limit of exceedance sets. Let X ∈ RV_α({a_n}, µ) and consider the exceedance set:

Θ_n := {i ∈ [p] : X_i > a_n}.

The asymptotic distribution of this random set, conditioned on it being non-empty, can be directly characterized in terms of the extremal or tail-dependence coefficients of X. Specifically, these dependence coefficients can be seen as the hitting and inclusion functionals of a limiting random set Θ, respectively. For the precise definitions and related notions from the theory of random sets, we will always refer to the monograph of Molchanov (2017).
Before proceeding with the analysis of Θ we will introduce some appropriate coefficients. Let

β(J) := µ(B_J), for non-empty J ⊂ [p],   (3.1)

where again µ is the tail measure of X and B_J := {x ∈ R^p : x_i > 1 for all i ∈ J and x_i ≤ 1 for all i ∉ J}. Then, in view of (2.12), since the B_J's are all pairwise disjoint in J,

θ_X(K) = Σ_{J : J∩K ≠ ∅} β(J).   (3.3)

This, in view of the so-called Möbius inversion formula, see, e.g., Molchanov (2017), Theorem 1.1.61, yields the inversion formula

β(J) = Σ_{I ⊂ J} (−1)^{|J\I|+1} θ_X(I^c), with the convention θ_X(∅) := 0,

which is Equation (7) in Schlather & Tawn (2003), Theorem 1. We also have

λ_X(L) = Σ_{J : J ⊇ L} β(J).   (3.4)

Finally, the usual inclusion-exclusion type relationships hold between θ and λ:

θ_X(K) = Σ_{∅≠L⊂K} (−1)^{|L|+1} λ_X(L) and λ_X(L) = Σ_{∅≠K⊂L} (−1)^{|K|+1} θ_X(K).   (3.5)

Although some of the Relations (3.3), (3.4), and (3.5) are available in the literature, we prove them in Appendix A independently with elementary arguments in Lemma A.2.
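For small p, the coefficients β(J) and the two families θ_X, λ_X can be tabulated over all non-empty subsets and the relations above checked by brute force. A sketch with an arbitrary choice of β (the numbers are illustrative only):

```python
from itertools import combinations

p = 3
subsets = [frozenset(s) for r in range(1, p + 1)
           for s in combinations(range(p), r)]          # all 7 non-empty J
beta = dict(zip(subsets, [0.2, 0.1, 0.1, 0.15, 0.05, 0.1, 0.3]))

def theta(K):   # hitting-type sum (3.3): over J intersecting K
    return sum(b for J, b in beta.items() if J & set(K))

def lam(L):     # inclusion-type sum (3.4): over J containing L
    return sum(b for J, b in beta.items() if set(L) <= J)

# inclusion-exclusion (3.5): theta({i, j}) = lam({i}) + lam({j}) - lam({i, j})
assert abs(theta((0, 1)) - (lam((0,)) + lam((1,)) - lam((0, 1)))) < 1e-9
# total mass: theta([p]) equals the sum of all beta(J)
assert abs(theta(range(p)) - sum(beta.values())) < 1e-9
```

This is exactly the bookkeeping that becomes exponential in p, since all 2^p − 1 subsets must be tracked.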
Observe that the event {Θ_n ∩ K ≠ ∅} coincides with {max_{i∈K} X_i > a_n}. This immediately implies that

nP[Θ_n ∩ K ≠ ∅] → θ_X(K), as n → ∞, for all non-empty K ⊂ [p].

The functionals T_n(K) := P[Θ_n ∩ K ≠ ∅ | Θ_n ≠ ∅] are known as the hitting functionals of the conditional distribution of the random set Θ_n. They are completely alternating capacities and their limit yields the hitting functionals

T(K) := lim_{n→∞} T_n(K) = θ_X(K)/θ_X([p])

of a random set Θ. This random set Θ may be viewed as the "typical" exceedance set for a regularly varying vector as the threshold a_n approaches infinity.
It is immediate from (3.3) and Molchanov (2017), Corollary 1.1.31, that we have thus established the following result.
Proposition 3.1. Let X ∈ RV_α({a_n}, µ) with a := θ_X([p]) > 0. Then, as n → ∞, we have

P[Θ_n = J | Θ_n ≠ ∅] → P[Θ = J] = β(J)/a, for all non-empty J ⊂ [p],   (3.6)

where the probability mass function of Θ is as in (3.6) and the β(J)'s are as in (3.1). We have moreover that

θ_X(K) = a P[Θ ∩ K ≠ ∅] and λ_X(L) = a P[L ⊂ Θ].

Remark 3.2. Molchanov & Strokorb (2016) introduced the important class of Choquet random sup-measures whose distribution is characterized by the extremal coefficient functional θ(·). This is closely related but not identical to our perspective here, which emphasizes threshold-exceedance rather than max-stability.
The above result shows that all tail-dependence coefficients can be succinctly represented (up to a constant) via the random set Θ. This finding allows us to connect the tail-dependence coefficients to so-called Bernoulli-compatible tensors.
Definition 3.3. A k-tensor (b(i_1, . . . , i_k))_{i_1,...,i_k ∈ [p]} is said to be Bernoulli compatible if

b(i_1, . . . , i_k) = E[ξ(i_1) · · · ξ(i_k)], for all i_1, . . . , i_k ∈ [p],   (3.8)

for some random vector ξ = (ξ(i))_{i∈[p]} with values in {0, 1}^p. In the case k = 2, this definition recovers the notion of Bernoulli compatibility in Embrechts et al. (2016). Proposition 3.1 implies the following result.
Proposition 3.4. (i) For every Bernoulli-compatible k-tensor (b(i_1, . . . , i_k)) as in (3.8), there exists a simple max-stable random vector X such that

λ_X(i_1, . . . , i_k) = b(i_1, . . . , i_k), for all i_1, . . . , i_k ∈ [p].

(ii) Conversely, for every simple max-stable random vector X = (X_i)_{1≤i≤p} and every c ≥ θ_X([p]), the k-tensor (c^{−1} λ_X(i_1, . . . , i_k))_{i_1,...,i_k ∈ [p]} is a Bernoulli-compatible k-tensor.
Proof. (i): Assume (3.8) holds and introduce the random set Θ := {i : ξ(i) = 1}. Let β(J) := P[Θ = J] and define the simple max-stable vector

X := ∨_{∅≠J⊂[p]} β(J) 1_J Z_J,   (3.10)

where 1_J = (1_J(i))_{1≤i≤p} contains 1 in the coordinates in J and 0 otherwise and the Z_J's are iid standard 1-Fréchet. In view of Lemma A.1, and since

λ_X(i_1, . . . , i_k) = Σ_{J ⊇ {i_1,...,i_k}} β(J) = P[{i_1, . . . , i_k} ⊂ Θ] = E[ξ(i_1) · · · ξ(i_k)] = b(i_1, . . . , i_k),

this completes the proof of (i).
Remark 3.5. As can be seen from the proof, the lower bound on the constant c in Proposition 3.4 (ii) cannot be improved. Observe that θ( , where the inequality is strict unless all X i 's are independent. Thus, the above result, even in the case k = 2, improves upon Theorem 3 in Krause et al. (2018), where the range for the constant c is c ≥ Σ i∈[p] λ X (i).
Remark 3.6. In the case of two-point sets, we have that the bivariate tail-dependence coefficient is proportional to the so-called covariance function (i, j) of the random set Θ. This shows again that the bivariate tail-dependence function (i, j) → λ(i, j) is positive semidefinite.
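The Bernoulli-compatible tensors of Proposition 3.4 can be made concrete with a small sketch. Assuming Θ has the hypothetical probability mass function `pmf` below (an invented example), the k-tensor with entries P[{i 1 , …, i k } ⊆ Θ] = E[ξ i1 ⋯ ξ ik ], for ξ the indicator vector of Θ, is Bernoulli-compatible by construction; for k = 2 its entries are the covariances-plus-products P[i ∈ Θ, j ∈ Θ] of Remark 3.6.

```python
from itertools import product

# Hypothetical pmf of the exceedance set Theta on subsets of {0, 1, 2}
pmf = {frozenset({0}): 0.2, frozenset({0, 1}): 0.3, frozenset({0, 1, 2}): 0.5}

def bernoulli_tensor(pmf, p, k):
    """k-tensor T[i1,...,ik] = E[xi_{i1} * ... * xi_{ik}] = P[{i1,...,ik} subset Theta],
    where xi is the Bernoulli indicator vector of the random set Theta."""
    T = {}
    for idx in product(range(p), repeat=k):
        T[idx] = sum(b for J, b in pmf.items() if all(i in J for i in idx))
    return T

T2 = bernoulli_tensor(pmf, 3, 2)
# T2[(i, j)] = P[i in Theta and j in Theta]; symmetric with P[i in Theta] on the diagonal
```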

Generalized Tawn-Molchanov models
In the previous section we defined in (3.1) the coefficients β(J) characterizing the distribution of the limiting exceedance set Θ. These coefficients were then used in (3.10) to construct a max-stable random vector in order to prove Proposition 3.4. This special random vector is in fact nothing but a generalized version of the so-called Tawn-Molchanov model, which we introduce formally in this section.
The following result is a slight extension and re-formulation of existing results in the literature, which first appeared in Schlather & Tawn (2002, 2003) (see also Strokorb & Schlather (2015) and Molchanov & Strokorb (2016) for extensions) in the context of finding necessary and sufficient conditions for a set of 2^p − 1 numbers {θ(K) | ∅ ≠ K ⊆ [p]} to be the extremal coefficients of a max-stable vector X. The novelty here is that we consider max-stable vectors with possibly non-identical marginals and treat simultaneously the cases of extremal as well as tail-dependence coefficients.
The proof is given in Appendix A. The vector X * defined in (3.12) is referred to as the Tawn-Molchanov or simply TM-model associated with the extremal (tail-dependence) coefficients {θ(K)} ({λ(L)}, respectively).
Remark 3.9. The distribution of the random set Θ introduced in Section 3.1 can be understood in terms of the Tawn-Molchanov model (3.12) using the single large jump heuristic. Given that Θ n = {i : X * i > n} ≠ ∅, for large n, only one of the Z J 's is extreme enough to contribute to the exceedance set. Thus, with high probability, Θ n equals the corresponding J in (3.12). The probability of the set J to occur is asymptotically proportional to the weight β(J), which explains formula (3.6).
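A TM-type model is easy to simulate, which makes the single-large-jump heuristic tangible. The following sketch assumes the construction of the proof of Proposition 3.4, X i = max over J containing i of β(J) Z J , with iid standard 1-Fréchet Z J ; the weights `beta` are a hypothetical example. A standard 1-Fréchet variable is drawn as 1/E with E ~ Exp(1), since P[1/E ≤ z] = P[E ≥ 1/z] = exp(−1/z).

```python
import random

# Hypothetical TM weights beta(J) over subsets of {0, 1, 2}
beta = {frozenset({0}): 0.2, frozenset({0, 1}): 0.3, frozenset({0, 1, 2}): 0.5}

def sample_tm(beta, p, rng):
    """One draw of X_i = max over J containing i of beta(J) * Z_J,
    with the Z_J iid standard 1-Frechet (simulated as 1 / Exp(1))."""
    z = {J: 1.0 / rng.expovariate(1.0) for J in beta}
    return [max((beta[J] * z[J] for J in beta if i in J), default=0.0)
            for i in range(p)]

x = sample_tm(beta, 3, random.Random(42))
```

In this nested example the sets containing coordinate 0 include all sets containing 1, which in turn include all sets containing 2, so every draw satisfies x[0] ≥ x[1] ≥ x[2].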
We have seen in Section 2.4 that extremal dependence can also be measured in terms of the spectral distance. In the following section we explore further the connections between the spectral distance and the Tawn-Molchanov models just introduced, and see how the latter naturally lead to a decomposition of the former which is equivalent to ℓ 1 -embeddability.

Embeddability and rigidity of the spectral distance
So far, we have mainly considered the overall tail-dependence of X or the tail-dependence function λ(L) for arbitrary L ⊂ [p]. In this section we focus on the bivariate dependence as in Section 2.4. Specifically, we look at the spectral distance and prove that it is both L 1 - and, equivalently, ℓ 1 -embeddable. For special spectral distances, namely those corresponding to line metrics, we prove that they are rigid and completely determine the tail-dependence of a TM-model.

L 1 -embeddability of the spectral distance
Recall that a function d : Definition 4.1. A semi-metric d on a set T is said to be L 1 (E, ν)-embeddable (or L 1 -embeddable for short, when the measure space is understood) if there exists a collection of functions The concept of L 1 -embeddability is extensively discussed in Deza & Laurent (1997). An overview can also be found in Matoušek (2013). Our first theorem in this section shows that the spectral distance matrix d X of a max-stable vector X, as defined in (2.16), is L 1 -embeddable.
(ii) Conversely, for every L 1 -embeddable semi-metric d on [p], there exists a simple max-stable vector X such that (4.1) holds with λ i,j := λ X (i, j), 1 ≤ i, j ≤ p. Moreover, there exists a c ≥ 0 such that X may be chosen to have equal marginal distributions with (iii) The semi-metric d in parts (i) and (ii) is a metric if and only if P[X i = X j ] < 1 for all i ≠ j.
Proof. Part (i): Suppose that X = (X i ) 1≤i≤p is simple max-stable and let f i ∈ L 1 + ([0, 1]) be as in (2.6), where for simplicity and without loss of generality we choose ν = Leb. In view of Relation (2.13), we obtain This shows that the semi-metric in (4.1) is L 1 -embeddable. Note that d fails to be a metric precisely when f i (•) = f j (•) almost everywhere, or equivalently X i = X j a.s., for some i ≠ j.
Part (ii): Suppose now that d(i, j) = ‖g i − g j ‖ L 1 for some For simplicity and without loss of generality, we can assume that (E, E, ν) = ([0, 1], B[0, 1], Leb). Define the function g * (x) := max i∈[p] |g i (x)| and let This way, we clearly have that the f i 's are non-negative elements of L 1 ([0, 1]) and Letting X i := I(f i ) be the extremal integrals defined in (2.7), we obtain as in (4.2) that This proves the first claim in part (ii). It remains to argue that (with this particular choice of f i 's) the scales of the X i 's are all equal. Note that ‖X i ‖ 1 = ‖f i ‖ L 1 and since , which completes the proof of part (ii).
Part (iii): The claim follows from the observation that X i := I(f i ) = I(f j ) =: X j almost surely if and only if f i = f j a.e., or equivalently,

Remark 4.3. The construction in the proof of part (ii) of Theorem 4.2 still works with f i replaced by f i + c for any c > 0. Thus, the constant c can be chosen equal to or larger than

ℓ 1 -embeddability of the spectral distance
In Theorem 4.2 we have shown the equivalence between L 1 -embeddable metrics and spectral distances of simple max-stable vectors. In this section, we additionally state an explicit formula for the ℓ 1 -embedding of the spectral distance. Thereby we show that L 1 - and ℓ 1 -embeddability are equivalent and, in passing, we recover and provide novel probabilistic interpretations of the so-called cut-decomposition of ℓ 1 -embeddable metrics (Deza & Laurent 1997).
for some non-negative β(J)'s. This means that

Proof. By Theorem 4.2, d is L 1 -embeddable if and only if (4.1) holds, where λ i,j = λ X ({i, j}) for some simple max-stable random vector. In view of (3.7) for the special case of J = {i}, using that Taking X * to be the (generalized) TM-model with extremal coefficients matching those of X, by Relations (3.6) and (4.4) we obtain (4.3).
Remark 4.6. Equation (4.4) shows that the spectral distance d is proportional to the probability that the limiting exceedance set Θ covers one and only one of the points i and j.
Proposition 4.5 also provides a probabilistic interpretation of the so-called cut-decomposition of ℓ 1 -embeddable metrics. To connect to the rich literature on the subject, we introduce some terminology following Chapter 4 of the monograph of Deza & Laurent (1997). Let J ⊂ [p] be a non-empty set and define the so-called cut semi-metric: The positive cone CUT p := { Σ J⊂[p] c J δ(J), c J ≥ 0} is referred to as the cut cone of non-negative functions defined on [p]. Notice that CUT p consists of semi-metrics. Therefore, Proposition 4.5 entails that the cut cone CUT p comprises all ℓ 1 -embeddable metrics on p points (Proposition 4.2.2 in Deza & Laurent 1997). Relation (4.3), moreover, provides a decomposition of any such metric as a positive linear combination of cut semi-metrics. The coefficients of this decomposition are precisely the coefficients of some Tawn-Molchanov model. Finally, in view of (4.4), the random exceedance set Θ of this TM-model is such that

Remark 4.8. For a given spectral distance d, Proposition 4.5 provides a decomposition and thereby shows the Without further knowledge about the number of J such that β(J) > 0, we can always choose m = 2^p − 2, since we may set β([p]) = 0 as it does not affect d. However, by Caratheodory's theorem, each ℓ 1 -embeddable metric on [p] is in fact known to be ℓ 1 -embeddable in dimension p(p − 1)/2, see (Matoušek 2013, Proposition 1.4.2). We would like to mention that finding the corresponding "minimal" TM-model (i.e. the one with the minimal number of sets J with β(J) > 0) and analyzing the properties of such representations could be an interesting topic for further research.
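The cut-decomposition is simple enough to verify numerically. The following sketch, with hypothetical TM weights `beta`, expresses the spectral distance as the positive combination Σ J β(J) δ(J)(i, j) and cross-checks it against the identity d(i, j) = λ({i}) + λ({j}) − 2λ({i, j}), which is assumed here from the L 1 -representation d(i, j) = ‖f i − f j ‖ with λ(L) the mass of the pointwise minimum.

```python
# Hypothetical TM weights beta(J) over subsets of {0, 1, 2}
beta = {frozenset({0}): 0.2, frozenset({0, 1}): 0.3, frozenset({0, 1, 2}): 0.5}

def delta(J, i, j):
    """Cut semi-metric delta(J): equals 1 iff J separates i and j."""
    return int((i in J) != (j in J))

def d_cutform(beta, i, j):
    """Spectral distance as the positive combination sum_J beta(J) * delta(J)(i, j)."""
    return sum(b * delta(J, i, j) for J, b in beta.items())

def lam(L, beta):
    """lambda(L): total mass of sets J covering L."""
    return sum(b for J, b in beta.items() if L <= J)

# Consistency check: d(i, j) = lambda({i}) + lambda({j}) - 2 * lambda({i, j})
d01 = d_cutform(beta, 0, 1)
alt = (lam(frozenset({0}), beta) + lam(frozenset({1}), beta)
       - 2 * lam(frozenset({0, 1}), beta))
```

Here only J = {0} separates the points 0 and 1, so d(0, 1) = β({0}) = 0.2, in agreement with the λ-based formula.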
Observe that where J c = [p] \ J, which implies that, in general, the decomposition of d in Proposition 4.5 is not unique. Furthermore, β([p]) ≥ 0 does not affect d in (4.3), since The next definition guarantees that, apart from those unavoidable ambiguities, the representation in (4.3) is essentially unique.

Definition 4.9. An ℓ 1 -embeddable metric d is said to be rigid if for any two representations and with non-negative β(J), β̃(J), ∅ ≠ J ⊆ [p], the equality β(J) + β(J c ) = β̃(J) + β̃(J c ) holds for all ∅ ≠ J ⊆ [p].

Observe that each semi-metric d on p points can be identified with a vector d = (d(i, j), 1 ≤ i < j ≤ p) in R N , where N := p(p − 1)/2. Thus, sets of such semi-metrics can be treated as subsets of the Euclidean space R N . By Corollary 4.3.3 in Deza & Laurent (1997), the metric d is rigid if and only if it lies on a simplex face of the cut cone CUT p , that is, if and only if the cut semi-metrics δ(J) appearing in its decomposition are affinely independent. Recall that the points δ i ∈ R N , i = 1, …, m are affinely independent if and only if {δ i − δ 1 , i = 2, …, m} are linearly independent. In general, the description of the faces of the cut cone is challenging, but the next section deals with a special class of metrics which are always rigid.

Rigidity of line metrics
In this section we show that so-called line metrics are rigid (cf. Definition 4.9) and that, for spectral distances corresponding to line metrics, the bivariate tail-dependence coefficients, in combination with the marginal distributions, fully determine the higher order tail-dependence coefficients of the underlying random vector and thus the coefficients of the corresponding Tawn-Molchanov model.

Definition 4.10. A metric d on [p] is said to be a line metric if there exist a permutation π = (π i ) 1≤i≤p of [p] and some weights In other words, d is a line metric if all points of [p] can be placed in some order on a line and the distance between any two points equals the distance along that line.
Theorem 4.11. Let d be a line metric, where without loss of generality the indices are ordered in such a way that d(i, j) = w i + … + w j−1 for all 1 ≤ i < j ≤ p and some w k ≥ 0. (i) The line metric d is ℓ 1 -embeddable and rigid.
(iii) For the coefficients β(J) of the (generalized) TM-model, we have that for all 1 ≤ k ≤ p − 1, where and β(J) = 0 for all other J ⊂ [p].
Thus, d is ℓ 1 -embeddable by Proposition 4.5. Let now β(J), ∅ ≠ J ⊆ [p], be the coefficients of a representation (4.3) of d. We will show that To this end, note that (4.6) implies, for any i ≤ j ∈ [p], that d(i, j) = Σ k=i,…,j−1 d(k, k + 1), and thus or, equivalently, Since and all β(J) are non-negative, (4.10) implies that for those J with β(J) > 0 and all i ≤ j ∈ [p]. This leads to the following four possible cases: We have thus shown (4.9), and in order to show that d is rigid, we only need to consider sets of the form For those sets we get (4.11), and thus the sum β(J) + β(J c ) = w k is invariant across all representations (4.3) of d, and d is rigid.
This completes the proof of (ii).
Remark 4.12. Consider a max-stable vector X with standard 1-Fréchet marginals, i.e., and for all other ∅ ≠ J ⊆ [p], β(J) = 0. In particular, all higher order extremal coefficients of X are then completely determined by the bivariate tail-dependence coefficients and given from (3.2) by

Remark 4.13. The random set Θ corresponding to such line-metric tail-dependence is a random segment with one of its endpoints anchored at 1 or p. This is a direct consequence of the characterisation of β(J) in Theorem 4.11 (iii) and (3.6).
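One concrete cut representation of a line metric uses prefix sets: assigning weight w k to the set {1, …, k} reproduces d, since {1, …, k} separates i and j exactly when i ≤ k < j. The sketch below (with invented edge weights, 0-based indices) builds a 4-point line metric this way and verifies the reconstruction; it is an illustration of the decomposition, not the specific β(J) values of Theorem 4.11 (iii).

```python
def line_metric(w):
    """d(i, j) = w[i] + ... + w[j-1] along the line 0 - 1 - ... - len(w)."""
    p = len(w) + 1
    return [[sum(w[min(i, j):max(i, j)]) for j in range(p)] for i in range(p)]

def prefix_cut_decomposition(w):
    """One valid cut representation: beta({0, ..., k}) = w[k]; all other beta vanish."""
    return {frozenset(range(k + 1)): w[k] for k in range(len(w))}

def metric_from_cuts(beta, p):
    """Rebuild the metric: d(i, j) = total beta-mass of sets separating i and j."""
    return [[sum(b for J, b in beta.items() if (i in J) != (j in J))
             for j in range(p)] for i in range(p)]

w = [1.0, 2.0, 0.5]          # hypothetical edge weights of a 4-point line metric
d = line_metric(w)
beta = prefix_cut_decomposition(w)
```

The complementary suffix sets with the same weights give another valid representation, since δ(J) = δ(J c ); rigidity says precisely that the sums β(J) + β(J c ) = w k are pinned down across all representations.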
Remark 4.14. In practical applications, non-parametric inference on higher-order tail-dependence coefficients can be very challenging or virtually impossible. Only, say, the bivariate tail-dependence coefficients Λ = (λ X (i, j)) p×p of the vector X may be estimated well. Given such constraints, one may be interested in providing upper and lower bounds on λ X ({1, …, p}), which provide the worst- and best-case scenarios for the probability of simultaneous extremes.
If the spectral distance turns out to be a line metric and the marginal distributions are known, then Theorem 4.11 provides a way to calculate λ X ({1, …, p}) precisely. However, in general this problem falls in the framework of computational risk management (see, e.g., Embrechts & Puccetti 2010) as well as the distributionally robust inference perspective (see, e.g., Yuen et al. 2020, and the references therein). The problem can be stated as a linear optimization problem in dimension 2^p − 1, similar to the approach in Yuen et al. (2020). Unfortunately, the exponential growth of the complexity of the problem makes it computationally intractable for p ≥ 15. In fact, the exact solution to such types of optimization problems may be NP-hard. This underscores the importance of the line of research initiated by Shyamalkumar & Tao (2020), where new approximate solutions or model-regularized approaches to distributionally robust inference in high-dimensional extremes are of great interest.

Computational complexity of decision problems
In this section we will use known results about the algorithmic complexity of ℓ 1 -embeddings to derive that the so-called tail dependence realization problem is NP-complete, thereby confirming a conjecture from Shyamalkumar & Tao (2020). While a formal introduction to the theory of algorithmic complexity is beyond the scope of this paper, we shall informally recall the basic notions needed in our context, following the treatment in (Deza & Laurent 1997, Section 2.3). Consider a class of computational problems D, where each instance I of D can be encoded with a finite number of bits |I|. D is said to be a decision problem if for any input instance I there is a correct answer, which is either "yes" or "no". The goal is to determine this answer based on any input I by using a computer (i.e., a deterministic Turing machine).
The decision problem D is said to belong to: • The class P (for polynomial complexity), if there is an algorithm (i.e., a deterministic Turing machine) that can produce the correct answer in polynomial time, i.e. its running time is of the order O(|I|^k) for some k ∈ N.
• The class NP (nondeterministic polynomial time), if the problem admits a polynomially-verifiable positive certificate. More precisely, this means that for each instance I of D with positive ("yes") answer, there exists a finite-bit certificate C of size |C| that can be verified by an algorithm (deterministic Turing machine) with running time O(|C|^l) for some l ∈ N. (The certificate need not be constructed in polynomial time.) • The class NP-hard, if every problem in NP reduces to D in polynomial time. This means that for every problem D′ in NP, the correct answer for any instance I′ of D′ can be found by first applying an algorithm that runs in time polynomial in |I′| to transform I′ into an instance I of D and then solving the decision problem D for this instance I. Note that this definition does not require that D itself is in NP.
• The class NP-complete if D is both in NP and is NP-hard.
A decision problem which has received some attention recently (see Fiebig et al. (2017), Embrechts et al. (2016), Krause et al. (2018), and Shyamalkumar & Tao (2020)) is the realization problem of a TD matrix with standardized entries on the diagonal, namely finding an algorithm with the following input and output.

Tail dependence realization (TDR) problem
• Input: A non-negative, symmetric p × p-matrix L = (L i,j ) p×p with 1's on the diagonal.
• Output: An answer to the question: Does there exist a simple max-stable vector X = (X 1 , . . ., X p ) such that λ X (i, j) = L i,j , i, j ∈ [p], i.e. L is the matrix of bivariate tail-dependence coefficients of a max-stable vector with standard 1-Fréchet margins?
This problem may at first glance look similar to deciding whether a given matrix is a valid covariance matrix. Indeed, as a strengthening of Remark 3.7, it can be shown that there exists a bijection between TD matrices as in the above problem and a subset of the so-called Bernoulli-compatible random matrices, i.e. expected outer products E(Y Y t ) of random (column) vectors Y with Bernoulli margins, see Embrechts et al. (2016) and Fiebig et al. (2017). But while it is a simple task to check if a matrix is the covariance matrix of some random vector, for example by finding the eigenvalues of this matrix, it can become more difficult to check whether a matrix is the covariance matrix or outer product of a restricted space of random variables. Practical and numerical aspects of deciding whether a given matrix is a TD matrix have been studied in Krause et al. (2018) and Shyamalkumar & Tao (2020), including a discussion of the computational complexity of the problem. Indeed, they point out that, due to results by Pitowsky (1991), checking whether a matrix is Bernoulli-compatible is an NP-complete problem. However, some subtlety arises: in order to check whether a p × p-matrix L is a so-called tail coefficient matrix, i.e. a TD matrix with 1's on the diagonal, it needs to be checked that p^−1 L is Bernoulli-compatible, see Shyamalkumar & Tao (2020). Thus, the problem narrows down to checking Bernoulli compatibility of the subclass of matrices with 1/p on their diagonal, and this may have a different complexity than the general membership problem. Due to the similarity of the above mentioned problems, Shyamalkumar & Tao (2020) conjecture that the TDR problem is NP-complete as well.
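Note that a "yes" instance of the TDR problem comes with a natural certificate: a list of non-negative TM-type weights β(J). Verifying such a certificate only requires summing weights entry by entry, which the following hedged Python sketch illustrates (the weights and matrices below are a made-up example, not data from the paper); it is this cheap verifiability that places the problem in NP, while actually finding the weights is the hard part.

```python
# Hypothetical certificate: TM weights beta(J) over subsets of {0, 1, 2},
# chosen so that each coordinate carries total mass 1 (standard margins).
beta = {frozenset({0}): 0.5, frozenset({1}): 0.5,
        frozenset({2}): 0.5, frozenset({0, 1, 2}): 0.5}

def realizes(beta, Lam, tol=1e-9):
    """Verify, in time polynomial in the certificate size, that the weights
    beta reproduce the candidate matrix Lam: Lam[i][j] = mass of J containing i and j."""
    p = len(Lam)
    return all(
        abs(sum(b for J, b in beta.items() if i in J and j in J) - Lam[i][j]) <= tol
        for i in range(p) for j in range(p))

Lam_yes = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
Lam_no = [[1.0, 0.9, 0.5], [0.9, 1.0, 0.5], [0.5, 0.5, 1.0]]
```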
We add to the discussion by using results about the computational complexity of problems related to cut metrics and metric embeddings; see Section 4.4 in Deza & Laurent (1997) for a brief overview of some relevant results. To this end, let us first introduce a problem which is related to the TDR problem but easier to handle for the subsequent complexity analysis.
Spectral distance realization (SDR) problem with unconstrained, identical margins
• Input: A non-negative, symmetric p × p-matrix d = (d(i, j)) p×p .
• Output: An answer to the question: Does there exist a simple max-stable vector X = (X 1 , . . ., X p ) and some c > 0 such that ‖X i ‖ 1 = c, i ∈ [p], and d(i, j) = 2(c − λ X (i, j)), i, j ∈ [p], (5.1) i.e. d is the spectral distance of X?
With the help of our previous results and the known computational complexity of ℓ 1 -embeddings it is simple to establish the computational complexity of the above problem.
Theorem 5.1.The SDR problem with unconstrained, identical margins is NP-complete.
Proof. Due to Theorem 4.2 (i)-(ii), the spectral distance d(i, j) = 2(c − λ X (i, j)) of a simple max-stable random vector with ‖X i ‖ 1 = c, i ∈ [p], is L 1 -embeddable, and for each L 1 -embeddable semi-metric d there exists a simple max-stable vector X with ‖X i ‖ 1 = c, i ∈ [p], for some c > 0, such that d is the spectral distance of X. Thus, the question is equivalent to checking that d is L 1 -embeddable, which is in turn equivalent to checking that d is ℓ 1 -embeddable, see Remark 4.7. The latter problem is NP-complete by Avis & Deza (1991), see also (P5) in Deza & Laurent (1997).
Remark 5.2. In the SDR problem one could add more assumptions about d under "Input", for example that the entries on the diagonal of d are equal to 0 or that d is a distance matrix. Alternatively, one could also just assume under "Input" that d is a p × p-matrix. Since a positive answer to the question would always ensure that d is a distance matrix, and all mentioned properties (non-negativity, symmetry, triangle inequality) can be checked in a number of steps polynomial in p, these additional assumptions do not change the NP-completeness of the problem.
Unfortunately, the constant c in (5.1) is not part of the input of the algorithm and thus cannot be fixed a priori. If we could, for example, set c = 1 and thus ask if for a given d a simple max-stable vector X with standard 1-Fréchet margins exists such that d(i, j) = 2(1 − λ X (i, j)), then this would be equivalent to checking that λ i,j := 1 − d(i, j)/2 is a TD matrix. But while such an arbitrary choice of c may change the nature of the problem, the following statement points out an a posteriori feasible range for c.
Lemma 5.3. If the outcome of the SDR problem with unconstrained, identical margins is a positive answer to the question, then (5.1) holds for a suitably chosen max-stable vector X and every c ≥ (2^p − 2) max i,j∈[p] d(i, j).
The proof is given in Appendix A. From the previous lemma we see that the SDR problem with unconstrained, identical margins is equivalent to the following.

Spectral distance realization (SDR) problem with d-constrained margins
• Input: A non-negative, symmetric p × p-matrix d = (d(i, j)) p×p .
• Output: An answer to the question: Does there exist a simple max-stable vector X = (X 1 , . . ., X p ) such that ‖X i ‖ 1 = (2^p − 2) max i,j∈[p] d(i, j) and d(i, j) = 2(‖X i ‖ 1 − λ X (i, j)), i, j ∈ [p], i.e. d is the spectral distance of X?
Finally, by rescaling X to X̃ := X/((2^p − 2) max i,j∈[p] d(i, j)), the spectral distance d X̃ of X̃ and the bivariate tail-dependence coefficients λ X̃ (i, j) scale accordingly by Lemma A.1, and we see that the latter problem is actually equivalent to the following.

Spectral distance realization (SDR) problem with constrained, standard margins
• Input: A non-negative, symmetric p × p-matrix d = (d(i, j)) p×p .
• Output: An answer to the question: Does there exist a simple max-stable vector X = (X 1 , . . ., X p ) with standard 1-Fréchet margins such that λ X (i, j) = 1 − d(i, j)/(2(2^p − 2) max i,j∈[p] d(i, j)), i, j ∈ [p]?
From the last line in the above problem we can see that the SDR problem with constrained, standard margins can be solved if we have an algorithm to check that a λ of the given form is a TD matrix. But since we know, by the stated equivalence of all three SDR problems in combination with Theorem 5.1, that all of them are NP-complete, the problem of checking that such a λ is a TD matrix must be NP-hard as well. This leads to the following result.
Theorem 5.4. The TDR problem is NP-complete.
Proof. We need to show that the TDR problem is both in NP and NP-hard. That the TDR problem is in NP has been shown in (Shyamalkumar & Tao 2020, p. 255) with the help of Caratheodory's theorem. For NP-hardness, we follow the typical route of reducing a known NP-complete problem to TDR. Indeed, any input matrix d(i, j) to any of the three equivalent, and by Theorem 5.1 NP-complete, SDR problems can be transformed in polynomial time into the matrix λ(i, j) := 1 − d(i, j)/(2(2^p − 2) max i,j∈[p] d(i, j)). By the statement of the third SDR problem, the question with input d can be answered by using λ as an input to the TDR problem. Thus, an NP-complete problem reduces in polynomial time to the TDR problem, so the TDR problem is NP-hard, and thus NP-complete.
The latter relation shows that the sequence of measures {µ n } is relatively compact in M 0 (R p ), equipped with the M 0 -convergence topology. Indeed, by (Hult & Lindskog 2006, Theorem 2.7), it suffices to show that for all ε > 0 and η > 0, there exists an M = M (ε, η) > 0 such that sup n µ n (B(0, ε) c ) < ∞ and sup The first condition follows from (A.1) and since P[X ≤ x] > 0. The second condition follows from the fact that − log(P[X ≤ x]) ↓ 0 as x ↑ ∞, which is true since X has a valid probability distribution.
The relative compactness of the measures {µ n } entails that µ n′ ⇒ µ in M 0 , for some µ ∈ M 0 and a sub-sequence n′ → ∞. However, by (A.1) and Proposition 2.4 we have for all x ∈ [0, ∞)^p \ {0}, and the limit measure is uniquely determined by its values on all the complements of rectangles containing the origin. Furthermore, we see from (A.2) that for non-degenerate X, the limit measure µ is non-degenerate as well. This proves that X ∈ RV({n}, µ), where P[X ≤ x] = exp{−µ([0, x] c )}.
Having established regular variation, the first equality in Relation (2.9) follows from the µ-continuity of the set {h > 1}, as argued in the proof of Proposition 2.5. The rest of Relation (2.9) follows from (A.2). Finally, the representation in (2.10) follows from the fact that σ is determined by ∫ S g(u)σ(du) for all continuous functions g : S → R + . Indeed, for every such g, the function h(x) := g(x/‖x‖)‖x‖ 1 {x ≠ 0} is continuous, non-negative and 1-homogeneous, and hence by (2.9) ∫ S g(u)σ(du) = S g f (z) This, since g is arbitrary, proves (2.10).
On the other hand, {x : h min,L (x) > 1} = ∩ i∈L A i . The formula for λ(L) in Relation (2.13) follows from the above and Equation (2.9), since h min,L (f (x)) = min i∈L f i (x). The derivations of the formulae for θ(K) are similar.
We conclude this section with the auxiliary result that the spectral distance and the tail-dependence coefficients are linear under max-linear combinations, in the sense of the following lemma.