Centroids of the core of exact capacities: a comparative study

Capacities are a common tool in decision making. Each capacity determines a core, which is a polytope formed by additive measures. The problem of eliciting a single probability from the core is interesting in a number of fields: in coalitional game theory for selecting a fair way of splitting the wealth between the players, in the transferable belief model from evidence theory or for transforming a second order into a first order model. In this paper, we study this problem when the goal is to determine the centroid of the core of a capacity, and we compare four approaches: the Shapley value, the average of the extreme points, the incenter with respect to the total variation distance and the limit of a procedure of uniform contraction. We show that these four centroids do not coincide in general, we give some sufficient conditions for their equality, and we analyse their axiomatic properties. We also discuss how to define a notion of centrality measure indicating the degree of centrality of an additive measure in the core. Finally, we also analyse these four centroids in the more general context of imprecise probabilities.


Introduction
A problem that naturally arises in many branches of operations research, such as decision making (Huntley & Troffaes, 2012; Keith & Ahner, 2021; Troffaes, 2007) or expected utility theory (Gilboa & Schmeidler, 1989; Klibanoff et al., 2005; Sarin & Wakker, 1992), is that of determining the probability measure modelling the underlying uncertainty. Due to a number of factors (missing data, conflicting sources of information, etc.), it is sometimes difficult, or even impossible, to elicit such a probability measure with minimal guarantees. A possible approach in those cases is to consider instead a capacity (Grabisch, 2016), which in turn determines a polytope of associated additive measures: its core. When the capacity is normalised, the core is formed by probability measures.
Capacities appear in many different contexts and with different interpretations: (i) in decision making, the values of the capacity measure the beliefs supporting the occurrence of each event; (ii) in coalitional game theory, each event represents a coalition of players, the value of the capacity is interpreted as the minimum reward guaranteed by the coalition, and the core contains the distributions of the rewards compatible with these constraints; and (iii) in imprecise probability theory (Augustin et al., 2014), a capacity gives lower bounds for the real but unknown values of the probability measure underlying the experiment.¹ The core, also called credal set (Levi, 1980), is then formed by the probability measures that are candidates for being the unknown probability measure. Among the many applications of capacities, we refer for instance to Grabisch (2013) and Shapley (1953) for some applications in game theory, and to Angilella et al. (2016) and Destercke (2017) in ordinal classification.
In any of these contexts, the problem of selecting a probability measure from the core of the normalised capacity is quite common. For instance, within coalitional game theory we may consider the solution of a game as a way to fairly divide the wealth it represents among the players; within imprecise probability theory, we may also consider transformations between imprecise and precise probability models (Klir & Parviz, 1992;Smets, 2005); in addition, we may also look for the element of the core maximising the entropy (Abellán & Moral, 2003;Jaffray, 1995) or establish procedures for assigning relevance degrees to the different features in machine learning problems (Kumar et al., 2020;Lundberg & Lee, 2017).
Our goal here is also to select a probability measure from the core of a normalised capacity, but the interpretation of the output of the process is different from the cases mentioned above: we seek to determine the center of the core of the capacity, similarly to the point with greatest data depth (Cascos, 2009; Tukey, 1975) in a data cloud; thus, our final result should be an element in the interior of the core, whenever the latter is non-empty. This already rules out methods based on maximising the entropy or minimising the Kullback-Leibler divergence. Pursuing this objective, in this paper we analyse four centers: (i) the Shapley value, which first appeared in coalitional game theory (Shapley, 1953); (ii) the average of the extreme points of the core; (iii) the incenter with respect to a distance (in our case, the total variation distance, due to its good properties within the imprecise probability framework (Montes et al., 2020b)); and (iv) a center that is obtained following a procedure of uniform contraction of the core.
The investigation of the notion of center of a core leads naturally to that of a measure of centrality with respect to a given set of probability measures. We propose an axiomatic definition and analyse several examples, both based on a choice of a centroid or not.
The remainder of the paper is organised as follows. After introducing some preliminary notions in Sect. 2, in Sect. 3 we discuss the four possible notions of centroids mentioned above and study the relationships between them. In Sect. 4, we make a further comparison in terms of the axiomatic properties they satisfy and in Sect. 5 we discuss the notion of centrality measure. Finally, in Sect. 6 we show that the centroids can be defined in the more general context of coherent lower previsions. Some additional comments are given in Sect. 7. To ease the reading, proofs have been relegated to the Appendix.
A preliminary version of this paper was presented at ECSQARU 2021 Conference (Miranda & Montes, 2021). This extended version contains a deeper discussion of the four centroids, additional results, proofs and other comments stemming from the discussions carried out at the conference.

Preliminary concepts
Let us introduce the main concepts we shall use in this paper. We refer to Grabisch (2016) for more details.
Consider a finite possibility space X = {x 1 , . . . , x n }, let P(X ) be the set of all the probability measures on X and let P * (X ) be the probability measures assigning strictly positive probabilities to all the non-empty events.
A capacity is a function μ : 2^X → ℝ that is monotone (A ⊆ B implies μ(A) ≤ μ(B)) and satisfies μ(∅) = 0. A capacity is normalised when in addition μ(X) = 1. Throughout this paper, we will always consider normalised capacities. The conjugate of a capacity μ, denoted by μ̄, is defined by μ̄(A) = 1 − μ(A^c) for any A ⊆ X. Also, a capacity μ determines its core, defined by:

core(μ) = {P ∈ P(X) | P(A) ≥ μ(A) ∀A ⊆ X}.

When the core is non-empty, the capacity is called balanced, and it holds that μ ≤ μ̄. A balanced capacity is called exact when μ(A) = min_{P ∈ core(μ)} P(A) for every A ⊆ X.
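For a small finite space, membership in the core can be checked directly from the definition above. The sketch below does this for a hypothetical capacity on a 3-element space; the capacity values and function names are our own illustrative choices, not taken from the paper.

```python
from itertools import combinations

X = (0, 1, 2)  # a 3-element possibility space {x1, x2, x3}

def events(space):
    """All non-empty subsets of the space."""
    return [frozenset(c) for r in range(1, len(space) + 1)
            for c in combinations(space, r)]

# A hypothetical normalised capacity (not one of the paper's examples).
mu = {frozenset({0}): 0.1, frozenset({1}): 0.2, frozenset({2}): 0.3,
      frozenset({0, 1}): 0.4, frozenset({0, 2}): 0.5, frozenset({1, 2}): 0.6,
      frozenset({0, 1, 2}): 1.0}

def in_core(p, mu, tol=1e-12):
    """p is a probability mass function (tuple); check P(A) >= mu(A) for all A."""
    return all(sum(p[x] for x in A) >= mu[A] - tol for A in events(X))

print(in_core((0.2, 0.3, 0.5), mu))   # dominates mu on every event
print(in_core((0.6, 0.2, 0.2), mu))   # violates P({x2, x3}) >= 0.6
```

The same predicate, applied to the minimum of P(A) over the core, is what the exactness condition asks to be attained for every event.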
As we mentioned in the introduction, capacities can be interpreted in many different ways. In this paper, we will focus mostly on the decision making or game theoretic interpretations (cases (i) and (ii) from Sect. 1). The interpretation as imprecise probability models (case (iii)) will be explored in more detail in Sect. 6.
The core of a capacity is a closed and convex subset of P(X), and since it is determined by a finite number of restrictions, it is a polytope that can be characterised by a finite number of extreme points. Recall that P ∈ core(μ) is an extreme point of the core if, whenever P = λP₁ + (1 − λ)P₂ for some λ ∈ (0, 1) and P₁, P₂ ∈ core(μ), it follows that P₁ = P₂ = P. The set of extreme points shall be denoted by ext(core(μ)).
A property that an exact capacity may satisfy is that of supermodularity, also called convexity in coalitional game theory, which means that:

μ(A ∪ B) + μ(A ∩ B) ≥ μ(A) + μ(B) for every A, B ⊆ X.

When the capacity is supermodular, the set of extreme points of the core is given by {P_σ | σ ∈ S_n} (Shapley, 1971), where S_n denotes the set of permutations of {1, ..., n}, and given σ ∈ S_n, P_σ is determined by:

P_σ({x_σ(1), ..., x_σ(k)}) = μ({x_σ(1), ..., x_σ(k)}) for k = 1, ..., n.    (1)

A capacity can be equivalently represented in terms of its Möbius inverse, which is the function m : 2^X → ℝ given by:

m(A) = Σ_{B ⊆ A} (−1)^{|A∖B|} μ(B) for every A ⊆ X.

From the Möbius inverse we can retrieve the capacity by means of:

μ(A) = Σ_{B ⊆ A} m(B) for every A ⊆ X.

The Möbius inverse m takes values in ℝ. When it is non-negative for every event, the capacity μ is called a belief function and its conjugate μ̄ is called a plausibility function, creating a bridge with Evidence Theory (Shafer, 1976). Any belief function is a supermodular capacity.
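The Möbius inverse and the reconstruction formula above are straightforward to implement as sums over subsets. The capacity below is a hypothetical example of ours; since its Möbius inverse turns out to be non-negative, it is in fact a belief function.

```python
from itertools import combinations

X = (0, 1, 2)
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

# Hypothetical normalised capacity on {x1, x2, x3} (our example).
mu = {A: 0.0 for A in subsets}
mu.update({frozenset({0}): 0.1, frozenset({1}): 0.2, frozenset({2}): 0.3,
           frozenset({0, 1}): 0.4, frozenset({0, 2}): 0.5,
           frozenset({1, 2}): 0.6, frozenset(X): 1.0})

def mobius(mu):
    """m(A) = sum over B subseteq A of (-1)^{|A minus B|} mu(B)."""
    return {A: sum((-1) ** len(A - B) * mu[B]
                   for B in subsets if B <= A) for A in subsets}

def from_mobius(m):
    """mu(A) = sum over B subseteq A of m(B)."""
    return {A: sum(m[B] for B in subsets if B <= A) for A in subsets}

m = mobius(mu)
# Round trip: the two transforms are inverse to each other.
assert all(abs(from_mobius(m)[A] - mu[A]) < 1e-12 for A in subsets)
print(all(m[A] >= -1e-12 for A in subsets))  # non-negative m: a belief function
```

The round-trip assertion checks the pair of formulas against each other for every event.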

Center points of an exact capacity
Next, we introduce the different notions of centroid of the core of a capacity we shall compare in this paper. We shall consider four possibilities: the Shapley value, the average of the extreme points, the incenter with respect to the total variation distance and the contraction centroid.
In general, we shall use the notation μ̂ to denote a generic centroid of a capacity μ; the four centroids above will be denoted μ₁, μ₂, μ₃ and μ₄, respectively. Moreover, we shall assume throughout that the capacity μ is exact.

The Shapley value
One of the most popular notions of centroid of a capacity is the Shapley value. It was introduced by Shapley (1953, 1971) in the framework of coalitional game theory, as a "fair" procedure to distribute some wealth between the players. Later on, it was rediscovered in the context of non-additive measures (Dubois & Prade, 1980) and popularised by Smets as the pignistic transformation of a belief function (Smets & Kennes, 1994).
Definition 1 Given an exact capacity μ, its Shapley value is defined as the probability measure μ₁ associated with the following distribution:

μ₁({x_i}) = Σ_{A ⊆ X∖{x_i}} [|A|!(n − |A| − 1)!/n!] (μ(A ∪ {x_i}) − μ(A)).    (2)

When μ is a belief function with Möbius inverse m, it was proven in Smets (2005) that μ₁ can be equivalently computed as

μ₁({x_i}) = Σ_{A: x_i ∈ A} m(A)/|A|.    (3)

More generally, when μ is supermodular (and in particular when it is a belief function), the extreme points of core(μ) are given by Eq. (1), and the Shapley value can be computed as

μ₁ = (1/n!) Σ_{σ ∈ S_n} P_σ.    (4)

Even if the expressions in Eqs. (3) and (4) were only established for belief functions and supermodular capacities, respectively, it follows from basic combinatorial analysis that they can be extended to arbitrary capacities:²

Proposition 1 Let μ be an exact capacity with Möbius inverse m, and let μ₁ be given by Eq. (2). Then

μ₁({x_i}) = Σ_{A: x_i ∈ A} m(A)/|A| for every x_i ∈ X.    (5)
It is worth remarking that, even if m(A) may be negative on some events A, Proposition 1 implies that the aggregation by means of Eq. (5) always produces a non-negative value.
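Proposition 1 can be checked numerically: the permutation average of Eq. (4) and the Möbius-based expression agree for a given capacity. The capacity below is our own hypothetical example, not one taken from the paper.

```python
import math
from itertools import combinations, permutations

X = (0, 1, 2)
n = len(X)
subsets = [frozenset(c) for r in range(n + 1) for c in combinations(X, r)]

# Hypothetical normalised capacity (our example).
mu = {A: 0.0 for A in subsets}
mu.update({frozenset({0}): 0.1, frozenset({1}): 0.2, frozenset({2}): 0.3,
           frozenset({0, 1}): 0.4, frozenset({0, 2}): 0.5,
           frozenset({1, 2}): 0.6, frozenset(X): 1.0})

def shapley_permutations(mu):
    """Average of the vectors P_sigma over all permutations (Eq. (4))."""
    val = [0.0] * n
    for sigma in permutations(X):
        prev = frozenset()
        for x in sigma:
            cur = prev | {x}
            val[x] += mu[cur] - mu[prev]  # P_sigma({x}) = mu(cur) - mu(prev)
            prev = cur
    return [v / math.factorial(n) for v in val]

def shapley_mobius(mu):
    """mu1({x}) = sum over A containing x of m(A)/|A| (Eq. (5))."""
    m = {A: sum((-1) ** len(A - B) * mu[B] for B in subsets if B <= A)
         for A in subsets}
    return [sum(m[A] / len(A) for A in subsets if x in A) for x in X]

s1, s2 = shapley_permutations(mu), shapley_mobius(mu)
assert all(abs(a - b) < 1e-12 for a, b in zip(s1, s2))  # Proposition 1 in action
print([round(v, 4) for v in s1])
```

For this capacity both routes give (7/30, 1/3, 13/30) ≈ (0.2333, 0.3333, 0.4333).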
While the Shapley value seems like a reasonable choice as a central point, it has one important drawback: it is only guaranteed to belong to the core of μ (i.e., we can only assure that μ₁ ≥ μ) when the capacity μ is supermodular. In those cases, the membership follows immediately from Eq. (4) and the fact that P_σ dominates μ for every permutation σ ∈ S_n.
More generally, when the exact capacity μ is not supermodular, we can only assure that μ₁ dominates μ for small cardinalities, as shown by Baroni and Vicig (2005, Prop. 5). We refer to Miranda and Montes (2018) for a study of the connection between the Shapley value and the core of the capacity.

Average of the extreme points
The second possibility we consider in this paper is the average of the extreme points of the core of the capacity:

Definition 2 Let μ be an exact capacity, and denote by P₁, …, P_k the extreme points of its core. The average of the extreme points, also called vertex centroid (Elbassioni & Tiwary, 2012), is defined as

μ₂ = (1/k) Σ_{i=1}^{k} P_i.    (6)

It follows from Definition 2 that μ₂ always belongs to core(μ); considering the comments in the previous section, this implies that it need not coincide with the Shapley value when supermodularity is not satisfied. As we shall see later on, they need not coincide even if the capacity is supermodular: while the set of extreme points of the core is {P_σ | σ ∈ S_n}, a key difference is that the computation of the Shapley value in Eq. (4) allows for repetitions of the same extreme point, while Definition 2 does not.
It is also important to clarify that the average of the extreme points does not generally coincide with the center of gravity of the core, defined as the expectation with respect to the uniform distribution over the set.
Example 1 Consider a 3-element possibility space and the exact capacity μ given by: This capacity is supermodular³ and the extreme points of core(μ) are given by:

        x₁   x₂   x₃
P₁      0    0    1
P₂      0    1    0

The average of the extreme points is given by: while the expectation over the core with respect to the uniform distribution E is: Therefore, the two concepts do not coincide in general. Intuitively, if we assume that the mass is uniformly distributed over core(μ), we can see in Fig. 1 that there is more mass for values of x₁ closer to 0 than to 1/2, whence E({x₁}) must be smaller than μ₂({x₁}) = 1/4.

³ It is known that any exact capacity on a 3-element possibility space is supermodular.

In fact, this example also shows that the center of gravity does not coincide with the Shapley value either, even in this case where μ is supermodular, because: While the center of gravity has the advantage of being applicable to any closed and convex set in P(X), and not only to polytopes, it also has the drawback of being computationally more expensive (see for example Elbassioni & Tiwary, 2012). For this reason, we have left this approach out of our study.

Incenter
Next we consider the incenter, which corresponds to the center (or centers) of the largest balls included in the interior of the core of the capacity.⁴ To make this notion precise, we must specify the distance under which the balls are defined. In this respect, there are several possibilities, such as the Euclidean or the L₁ distances, or even the Kullback-Leibler divergence. We have considered in this paper the total variation distance (Levin et al., 2009), which is the one associated with the supremum norm:

d(P, Q) = max_{A ⊆ X} |P(A) − Q(A)|.

Our choice of this distance is due to the fact that the closed balls it induces are always polytopes, unlike the case of the Euclidean distance or the Kullback-Leibler divergence, and those closed balls correspond to exact capacities satisfying supermodularity (Montes et al., 2020b, Sec. 2). Although the L₁ distance also induces a polytope, it does not correspond to the core of an exact capacity, and our analysis in Destercke et al. (2022) and Montes et al. (2020b) shows that its use is rather complex. In what follows, for the sake of notational simplicity, the total variation distance will be simply denoted by d. Moreover, we shall denote by

B^α_c(P₀) = {P ∈ P(X) | d(P, P₀) ≤ α}  and  B^α_o(P₀) = {P ∈ P(X) | d(P, P₀) < α}

the closed and open balls centered on P₀ and with radius α, respectively. This leads us to the following definition:

Definition 3 Let μ be an exact capacity. The incenter radius of core(μ) is defined as

α_I = sup{α > 0 | ∃P₀ ∈ core(μ) such that B^α_o(P₀) ⊆ core(μ) ∩ P*(X)}.    (7)

Any P₀ ∈ core(μ) such that B^{α_I}_o(P₀) ⊆ core(μ) ∩ P*(X) is called an incenter of μ.
It may be surprising that in Eq. (7) we require the inclusion of the open ball B^α_o(P₀) in the intersection core(μ) ∩ P*(X). The reason is that if we simply require B^α_o(P₀) ⊆ core(μ), then we may obtain centers that lie on the boundary of the core, something that is, in our view, counterintuitive. This is illustrated in the following example.
Example 2 Let X = {x₁, x₂, x₃} and consider the exact capacity μ given by: The value α_I determined by Eq. (7) is α_I = 0.125, and any convex combination of the probability measures Q₁ and Q₂ given by: is an incenter of core(μ). Besides, if we do not require the open ball to be included in P*(X), we obtain the value 0.2, and the only P₀ ∈ core(μ) satisfying B^{0.2}_o(P₀) ⊆ core(μ) is given by: This probability measure belongs to the boundary of core(μ), which leads us to believe that it does not adequately represent the idea underlying the notion of incenter.
This example also shows the necessity of taking the open ball rather than the closed one in the definition of the incenter radius in Eq. (7). The graphical representation of P₀, Q₁ and Q₂ can be seen in Fig. 2.
On the other hand, when the core of the capacity is included in the interior of the simplex, we immediately have that B^α_o(P) ⊆ core(μ) ∩ P*(X) if and only if B^α_c(P) ⊆ core(μ), whence Eq. (7) can be rewritten as

α_I = max{α ≥ 0 | ∃P ∈ core(μ) such that B^α_c(P) ⊆ core(μ)}.    (8)

A first question arising naturally from the definition of the incenter is whether it always exists. Our next result shows that this is indeed the case.
Proposition 2 Consider an exact capacity μ such that core(μ) has a non-empty interior. Then the value α I given by Eq. (7) is a maximum. As a consequence, the incenter of μ always exists.
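On a small space, the maximum in Proposition 2 can be approximated numerically. Assuming the characterisation that, when the core lies in the interior of the simplex, B^α_c(P₀) ⊆ core(μ) holds exactly when P₀(A) ≥ μ(A) + α for every non-trivial A (our reading of the total-variation-ball results cited above, so treat it as an assumption), α_I is a maximin slack. The sketch below brute-forces it on a grid for a hypothetical capacity of ours; an LP solver would compute it exactly.

```python
from itertools import combinations

X = (0, 1, 2)
EVENTS = [frozenset(c) for r in (1, 2) for c in combinations(X, r)]  # non-trivial

# Hypothetical capacity; its core lies in the interior of the simplex.
mu = {frozenset({0}): 0.1, frozenset({1}): 0.2, frozenset({2}): 0.3,
      frozenset({0, 1}): 0.4, frozenset({0, 2}): 0.5, frozenset({1, 2}): 0.6}

def slack(p):
    """Radius of the largest TV ball around p inside core(mu) (assumed formula)."""
    return min(sum(p[x] for x in A) - mu[A] for A in EVENTS)

N = 150  # grid resolution on the simplex
best, incenter = max(
    (slack((i / N, j / N, (N - i - j) / N)), (i / N, j / N, (N - i - j) / N))
    for i in range(N + 1) for j in range(N + 1 - i))
print(round(best, 6), incenter)  # approximately 2/15 and (7/30, 1/3, 13/30)
```

For this capacity the three singleton slacks are forced to coincide at the optimum, so the incenter is unique; the grid happens to contain it exactly.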
On the other hand, a capacity may have more than one incenter; while this was already shown in Example 2, we next give another example where the core of the capacity is included in P*(X).

Fig. 2 Graphical representation of core(μ) in Example 2, the incenters Q₁ and Q₂ (in blue) and the probability P₀ (in red)

Fig. 3 Graphical representation of the core of μ, the incenters Q₁ and Q₂ with the ball they induce (left-hand figure) and the incenter Q_β for β = 0.5 with the ball it induces (right-hand figure)

Example 3 Let X = {x₁, x₂, x₃}, and consider the capacity μ given by: The value α_I determined in Eq. (7) is given by: To see that there is more than one P₀ ∈ core(μ) such that B^{α_I}_o(P₀) ⊆ core(μ) ∩ P*(X), note that the set of such P₀ is given by the probability measures Q₁, Q₂ defined below, as well as any convex combination Q_β = βQ₁ + (1 − β)Q₂ for β ∈ [0, 1]: Figure 3 gives the graphical representation of the core of μ as well as the balls B^{α_I}_c(Q₁), B^{α_I}_c(Q₂) and B^{α_I}_c(Q_β) for β = 0.5.
Due to the lack of uniqueness, we shall denote by μ₃ the set of incenters of μ.

Fig. 4 Graphical representation of core(μ) (in gray) in Example 4, as well as the ball B^{α_C}_c(P₀)

A dual approach to the one above is to consider the circumcenter of the capacity, which is the center (or centers) of the smallest balls that include the core. Formally, we may consider

α_C = inf{α > 0 | ∃P ∈ P(X) such that B^α_c(P) ⊇ core(μ)},

and then consider the set of those P such that B^{α_C}_c(P) ⊇ core(μ). However, this approach has the two drawbacks we have discussed so far: not only does it not necessarily lead to a unique solution, but it may also produce values outside the core, as shown in Bader et al. (2012); this last issue may be overcome by considering instead

α_C = min{α > 0 | ∃P ∈ core(μ) such that B^α_c(P) ⊇ core(μ)},

and by calling circumcenters those probabilities P ∈ core(μ) such that B^{α_C}_c(P) ⊇ core(μ). However, this second approach does not prevent us from obtaining circumcenters that lie on the boundary of the core, leading to the same counterintuitive situation we discussed in Example 2.
Example 4 Consider X = {x₁, x₂, x₃} and the capacity μ given by: This capacity is exact, and the extreme points of its core are given by:

        x₁    x₂    x₃
P₁      0.3   0.4   0.3
P₂      0.5   0.2   0.3
P₃      0.4   0.4   0.2
P₄      0.5   0.3   0.2

It follows that α_C = 0.1 and that the only circumcenter is P₀, given by P₀({x₁}) = 0.4, P₀({x₂}) = P₀({x₃}) = 0.3. The graphical representation of the core of μ and the ball B^{α_C}_c(P₀) can be seen in Fig. 4.

Contraction centroid
Our fourth and last approach is motivated by the lack of uniqueness that has been illustrated in the case of the incenter.
Given an exact capacity μ with conjugate μ̄, we can split the events of X into the classes L_= and L_> such that:

μ(A) = μ̄(A) ∀A ∈ L_=  and  μ(A) < μ̄(A) ∀A ∈ L_>.
Using this notation, the core of μ can be expressed as:

core(μ) = {P ∈ P(X) | P(A) = μ(A) ∀A ∈ L_=, P(A) ≥ μ(A) ∀A ∈ L_>}.    (9)

Note that when L_> is empty we obtain μ(A) = μ̄(A) for every A ⊆ X, meaning that μ is additive and that its core contains a single element. The idea in this approach is to contract the core in a uniform manner for as long as we can, and then proceed in the same way with a reduced number of constraints. More specifically, we increase the value of the capacity by a constant amount α on all the events A ∈ L_>. In this respect, we wonder whether there is some value α small enough for this approach to give rise to a non-empty core, and also what is the maximum/supremum value we can consider. Our next result answers both questions.
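The split into L_= and L_> is computed directly from the conjugate. With the hypothetical capacity below (ours, not the paper's running example), every non-trivial event satisfies μ(A) < μ̄(A), so no equality constraints appear besides the trivial ones.

```python
from itertools import combinations

X = (0, 1, 2)
EVENTS = [frozenset(c) for r in (1, 2) for c in combinations(X, r)]  # non-trivial

# Hypothetical exact capacity on {x1, x2, x3} (our example).
mu = {frozenset({0}): 0.1, frozenset({1}): 0.3, frozenset({2}): 0.1,
      frozenset({0, 1}): 0.5, frozenset({0, 2}): 0.4, frozenset({1, 2}): 0.5}

def conjugate(A):
    """bar-mu(A) = 1 - mu(A^c)."""
    return 1 - mu[frozenset(X) - A]

L_eq = [A for A in EVENTS if abs(mu[A] - conjugate(A)) < 1e-12]
L_gt = [A for A in EVENTS if mu[A] < conjugate(A) - 1e-12]
print(len(L_eq), len(L_gt))  # 0 and 6: every non-trivial event is imprecise
```

A capacity with empty L_= among the non-trivial events is what Definition 4 later calls maximally imprecise.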

Proposition 3
Consider an exact capacity μ, and let us express its core as in Eq. (9) using the sets L_= and L_>. For a given α > 0, let us define:

core(μ)^α = {P ∈ P(X) | P(A) = μ(A) ∀A ∈ L_=, P(A) ≥ μ(A) + α ∀A ∈ L_>},    (10)

and let Λ = {α > 0 | core(μ)^α ≠ ∅}. It holds that Λ is non-empty whenever L_> is, and that it attains its maximum α_S = max Λ.

This result assures that we can uniformly increase the capacity on the events whose value is imprecise (i.e., such that μ(A) < μ̄(A)), and that when the process stops the size of L_> has decreased, in the sense that the exact capacity μ^{α_S} obtained as the lower envelope of the set core(μ)^{α_S} has a strictly larger class of fixed events. This means that we may apply the same procedure to the capacity μ^{α_S}, and after iterating it a finite number of times we obtain a set formed by a single element, which we shall call the contraction centroid. In other words, the procedure leads to the values α_S¹ = max Λ₁, ..., α_Sˡ = max Λ_l and a chain of nested sets⁵ whose last element, core(μ)^{α_Sˡ} = {μ₄}, determines the contraction centroid μ₄ of the capacity μ. Let us illustrate this procedure with an example.
Example 5 Consider again the exact capacity from Example 3. There, the set L_= only contains the trivial events ∅ and X, while L_> contains the six non-trivial events. Let us see that α_S¹ = max Λ = 0.075. On the one hand, core(μ)^{α_S¹} is non-empty because it includes for instance the probability measure P given by: To see on the other hand that this is the maximum value of Λ, note that if we increase μ by α > 0, then to keep exactness it should be that: whence α ≤ 0.075. Therefore, α_S¹ = 0.075, and this gives rise to the following core: The exact capacity μ¹ determined as the lower envelope of core(μ)^{α_S¹} and its conjugate μ̄¹ are given by: Let us now apply the procedure to this capacity. The sets L_>¹ and L_=¹ are given by: i.e., there are two non-trivial events whose value is now fixed, {x₂} and {x₁, x₃}.
Repeating the same steps, we obtain α_S² = 0.0375, and in this case core(μ)^{α_S²} is formed by a single probability measure, which is therefore the contraction centroid μ₄. It is given by: In Fig. 5 we have depicted the sets core(μ)^{α_S¹} (in blue) and core(μ)^{α_S²} (in red), as well as the initial core of μ.
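The whole procedure can be sketched numerically. The following is a grid-based illustration on a hypothetical capacity of ours that takes two contraction steps; it replaces the exact linear programs by a brute-force search over a simplex grid that happens to contain the relevant vertices, so it is an illustration under that assumption rather than a general-purpose implementation.

```python
from itertools import combinations

X = (0, 1, 2)
EVENTS = [frozenset(c) for r in (1, 2) for c in combinations(X, r)]
TOL, N = 1e-9, 200

# Hypothetical exact capacity (ours); its contraction takes two steps.
low = {frozenset({0}): 0.1, frozenset({1}): 0.3, frozenset({2}): 0.1,
       frozenset({0, 1}): 0.5, frozenset({0, 2}): 0.4, frozenset({1, 2}): 0.5}

def feasible(bounds):
    """Grid points of the simplex satisfying P(A) >= bounds[A] for all A."""
    pts = []
    for i in range(N + 1):
        for j in range(N + 1 - i):
            p = (i / N, j / N, (N - i - j) / N)
            if all(sum(p[x] for x in A) >= bounds[A] - TOL for A in EVENTS):
                pts.append(p)
    return pts

alphas = []
while True:
    # L_> is the class of events whose value is still imprecise.
    free = [A for A in EVENTS if low[A] + low[frozenset(X) - A] < 1 - TOL]
    if not free:
        break
    pts = feasible(low)
    # Largest uniform increase keeping the contracted core non-empty.
    alpha = max(min(sum(p[x] for x in A) - low[A] for A in free) for p in pts)
    for A in free:
        low[A] += alpha
    pts = feasible(low)  # replace bounds by the lower envelope of the new set
    low = {A: min(sum(p[x] for x in A) for p in pts) for A in EVENTS}
    alphas.append(alpha)

centroid = tuple(low[frozenset({x})] for x in X)
print([round(a, 4) for a in alphas], [round(c, 4) for c in centroid])
```

For this capacity the loop stops after two steps, with contraction coefficients approximately (0.15, 0.025) and contraction centroid approximately (0.275, 0.45, 0.275); note how taking the lower envelope between steps, rather than just adding α, is what enlarges the class of fixed events.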
In the above procedure, it is worth mentioning that the lower envelope of the set core(μ)^α does not necessarily coincide with the capacity μ^α given by

μ^α(A) = μ(A) + α for A ∈ L_>,  μ^α(A) = μ(A) for A ∈ L_=.

While by construction the lower envelope dominates this capacity, they may not agree on some events, because μ^α need not be exact. This can be seen in Example 5.

Proposition 3 assures that there exists a maximum value α_S = max Λ giving rise to a non-empty set. This naturally leads us to the problem of computing the value of α_S more efficiently. Our next result gives an explicit formula for α_S for a particular class of exact capacities: those which only coincide with their conjugate on the trivial events ∅ and X.
Definition 4 Let μ be an exact capacity with conjugate μ̄. We shall call μ maximally imprecise when μ(A) < μ̄(A) for every A ≠ ∅, X.

Let us define

A(X) = {A = {A₁, ..., A_k} | ∃β_A ∈ ℕ such that every x ∈ X belongs to exactly β_A elements of A}.

In other words, A(X) is the class of all finite families of subsets of X such that every x ∈ X belongs to the same number β_A of elements in the family. Note that in each of these families A there may be repeated elements. We consider on A(X) the partial order determined by inclusion, i.e., we say that A₁ ⊆ A₂ when each element of the family A₁ also belongs to the family A₂.

Theorem 4 Let μ be a maximally imprecise exact capacity. Then

α_S = min_{A ∈ A(X)} h_A,  where  h_A = (β_A − Σ_{A ∈ A} μ(A)) / |A|.    (12)
Let us return to the running Example 5. In that case, α_S satisfies Eq. (12) for the family A = {{x₂}, {x₁, x₃}} and β_A = 1, giving precisely the value 0.075 that we obtained in Example 5. The computation of α_S in Eq. (12) requires computing the value h_A for all the families A. Our next result gives a more tractable expression when the capacity is supermodular. To this aim, we denote by A*(X) the subclass of A(X) formed by the partitions of X.
Theorem 5 Let μ be a maximally imprecise supermodular capacity with conjugate μ̄. Then:

α_S = min_{A ∈ A*(X)} min{ (1 − Σ_{A ∈ A} μ(A)) / |A| , (Σ_{A ∈ A} μ̄(A) − 1) / |A| }.    (13)

This means that, under supermodularity, it suffices to focus on partitions of X, which considerably simplifies the computation in Eq. (12).
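Under our reading of the partition bounds (the exact form of Eq. (13) in the paper may differ, so treat the formula as an assumption), α_S for a supermodular maximally imprecise capacity is obtained by scanning the partitions of X into non-trivial blocks. We reuse the hypothetical capacity of the previous sketch; the minimising partition there is {{x₂}, {x₁, x₃}}.

```python
X = (0, 1, 2)

# Hypothetical maximally imprecise supermodular capacity (our example).
mu = {frozenset({0}): 0.1, frozenset({1}): 0.3, frozenset({2}): 0.1,
      frozenset({0, 1}): 0.5, frozenset({0, 2}): 0.4, frozenset({1, 2}): 0.5}

def conj(A):
    return 1 - mu[frozenset(X) - A]

def partitions(xs):
    """All set partitions of the list xs (standard recursive enumeration)."""
    if not xs:
        yield []
        return
    first, rest = xs[0], xs[1:]
    for sub in partitions(rest):
        for k in range(len(sub)):
            yield sub[:k] + [sub[k] | {first}] + sub[k + 1:]
        yield [frozenset({first})] + sub

def alpha_S(mu):
    best = float("inf")
    for part in partitions(list(X)):
        if len(part) < 2:  # skip the trivial partition {X}
            continue
        k = len(part)
        best = min(best,
                   (1 - sum(mu[A] for A in part)) / k,    # lower-bound constraint
                   (sum(conj(A) for A in part) - 1) / k)  # conjugate constraint
    return best

print(round(alpha_S(mu), 6))  # 0.15 for this capacity
```

The two terms per partition come from summing the raised lower bounds (resp. the lowered conjugate upper bounds) over the blocks, which must stay below (resp. above) total mass 1.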
Example 6 Consider again our running Example 3. The capacity in that example is supermodular, because any exact capacity on a 3-element space is, so Theorem 5 is applicable. The next table summarises the values associated with each partition in Eq. (13): The minimum of these values is α_S = 0.075, and it is attained at the partition A formed by {x₁, x₃} and {x₂}, both for the capacity and its conjugate. This is in line with our comments in Example 5.
Let us show that the result in Theorem 5 does not generalise to the case where μ is exact but not supermodular:

Example 7 Consider X = {x₁, x₂, x₃, x₄} and the capacity μ, with conjugate μ̄, given by: The extreme points of its core are given by:

        x₁    x₂    x₃    x₄
P₁      0.1   0.2   0.5   0.2
P₂      0     0.2   0.6   0.2

It follows that μ is an exact capacity, and that it is maximally imprecise too. Note that μ is not supermodular, since: Let us show that α_S = 0.025. To see this, note that any P ∈ core(μ)^{0.025} satisfies: which implies that P(A) = μ(A) + 0.025 for every A ∈ A, and also that core(μ)^α = ∅ for every α > 0.025. Moreover, the above restrictions imply that: Thus, core(μ)^{0.025} only includes the probability mass function (0.05, 0.175, 0.525, 0.25). It is now easy to verify that this probability measure satisfies P(A) ≥ μ(A) + 0.025 for every non-trivial event A.
Moreover, the bounds determined by the partitions in A*(X) are the following: The minimum of these values is 0.1/3 ≈ 0.033, strictly greater than α_S = 0.025. We conclude that Theorem 5 does not hold without the hypothesis of supermodularity.

Relationships between the centroids
So far, we have introduced four different notions of center of an exact capacity. Let us begin by showing that these four notions are indeed different:

Example 8 Consider the capacity defined in Example 3; there, we gave the set of incenters μ₃, while the contraction centroid μ₄ was given in Example 5. The extreme points of the core of μ are given by:

        x₁     x₂     x₃
P₁      0.55   0.1    0.35
P₂      0.5    0.15   0.35
P₃      0.5    0.25   0.25

From this, we conclude that the average of the extreme points of core(μ) is the probability measure μ₂ with mass function: On the other hand, the Shapley value is given by: This can also be derived using Proposition 1, noting that the permutations lead to the following extreme points: Here, P_{σ₄} = P_{σ₆}, so this extreme point gets twice the weight of the others in the computation of the Shapley value, hence the difference with μ₂. These centroids are represented in Fig. 6.

Even if the four approaches do not lead to the same solution in general, in Examples 5 and 8 we have seen an example where the contraction centroid μ₄ belongs to the set of incenters μ₃. This leads us to investigate whether there is a connection between these two approaches. Our next result shows that, under some conditions, the set we obtain in the first step of the contraction approach coincides with the set of incenters.
Proposition 6 Let μ be a maximally imprecise exact capacity with conjugate μ̄ satisfying μ(A) > 0 for every A ≠ ∅, and let α_S be the coefficient defined in Proposition 3. Given P₀ ∈ core(μ) and α ≤ α_S, it holds that B^α_o(P₀) ⊆ core(μ) ∩ P*(X) if and only if P₀ ∈ core(μ)^α.

On the other hand, an advantage of using the contraction procedure is that the existence of μ₄ is guaranteed even if the interior of core(μ) is empty.

Properties of the centroids
Next we compare the different centroids in terms of the axiomatic properties they satisfy. There exist several axiomatic characterisations of the Shapley value in the context of coalitional game theory; arguably the most important one characterises it as the unique additive measure μ̂ satisfying the axioms of efficiency, symmetry, linearity and the null player property. Throughout this paper, we restrict ourselves to normalised capacities, which implies that (i) the efficiency axiom simplifies to Σ_{i=1}^{n} μ̂({x_i}) = 1; and (ii) in order to guarantee that λ₁μ₁ + λ₂μ₂ is again a normalised capacity we must have λ₂ = 1 − λ₁, and therefore the linearity axiom requires that the centroid of λμ₁ + (1 − λ)μ₂ equals λμ̂₁ + (1 − λ)μ̂₂ for any λ ∈ ℝ and any normalised capacities μ₁, μ₂ such that λμ₁ + (1 − λ)μ₂ is a normalised capacity too.
Next we investigate to what extent these properties are satisfied by the other centroids proposed in this paper. In this respect, note that in the framework of this paper any center of a capacity is a probability measure, whence the efficiency property is trivially satisfied. Note also that, when analysing the behaviour of the set of incenters, we shall say that it satisfies a property if and only if every one of its elements does.
As we have already mentioned, the Shapley centroid does not necessarily satisfy feasibility, meaning that μ₁ may not belong to the core of μ. By construction, the vertex, incenter and contraction centroids do satisfy feasibility. With respect to the other properties, it is not difficult to establish that the three of them satisfy symmetry. To see that they do not satisfy linearity in general, whence their difference with the Shapley value, consider the following example:

Example 9 Consider X = {x₁, x₂, x₃}, the exact capacities μ₁, μ₂ and their average μ := 0.5μ₁ + 0.5μ₂, given in the following table: Because of the symmetry property, it is easy to determine the centroids of μ₁ and μ₂. On the other hand, the convex combination of these centroids differs from the corresponding centroid of μ: for instance, the set of incenters μ₃ of μ includes the probability measure (0.3375, 0.325, 0.3375). As a consequence, none of the three centroids satisfies linearity.
Next we consider other desirable properties of a centroid.
Definition 5 Let μ̂ be a centroid of an exact capacity μ. We say that it satisfies:
- ignorance preservation if core(μ) = P(X) implies that μ̂ is the uniform distribution;
- continuity if the mapping μ → μ̂ is continuous with respect to the total variation distance.

When dealing with the incenter, the previous properties should be slightly rewritten due to its lack of uniqueness: the incenter satisfies ignorance preservation if core(μ) = P(X) implies that μ₃ only contains the uniform distribution; and it satisfies continuity when for any ε > 0 there exists some δ > 0 such that d(μ₁, μ₂) < δ implies that d(ν₁, ν₂) < ε, where ν₁ and ν₂ are the lower envelopes of the corresponding sets of incenters.

To see that the average of the extreme points does not satisfy continuity, consider the following example:

Example 10 Consider X = {x₁, x₂, x₃}, ε ∈ (0, 0.05) and the exact capacity μ given by: Then μ is supermodular, and the extreme points of core(μ) are given by:

                    P = (0.15, 0.2, 0.65)
σ₄ = (2, 3, 1):     P_{σ₄} = (0.25 − ε, 0.2, 0.55 + ε)
σ₅ = (3, 1, 2):     P_{σ₅} = (0.25 − 2ε, 0.45 + 2ε, 0.3)
σ₆ = (3, 2, 1):     P_{σ₆} = (0.25 − ε, 0.45 + ε, 0.3)

All these extreme points are different, and their average is ((1.1 − 4ε)/6, (2 + 5ε)/6, (2.9 − ε)/6). On the other hand, when ε = 0 we obtain that P_{σ₅} = P_{σ₆}, and, as we have seen in Example 9, the average of the extreme points becomes μ₂ = (0.17, 0.31, 0.52) ≠ lim_{ε→0} ((1.1 − 4ε)/6, (2 + 5ε)/6, (2.9 − ε)/6).

Another desirable property would be that the centroid preserves the same preferences as μ, in the sense that μ(A) ≥ μ(B) ⇒ μ̂(A) ≥ μ̂(B). Since μ̂ is an additive model, we shall only require this property on the singletons: otherwise the capacity itself would have to satisfy an additivity-type condition, which need not hold. Unfortunately, none of the centroids considered in this paper satisfies this property, as the following example shows:

Example 11 Consider X = {x₁, x₂, x₃} and let μ be the exact capacity, with associated Möbius inverse m, given by: With respect to the incenter, it can be checked that the largest value α such that core(μ)^α ≠ ∅ is α = 0.11, from which we deduce the corresponding set of incenters.

Centrality measures
More generally, instead of determining which element of the core can be considered its center, we may define a centrality measure, which allows us to quantify how deep inside the core an element is.
Consider for instance the same capacity as in Example 3, whose core is depicted in Fig. 7. Intuitively, given the probability measures Q₁ and Q₂ defined as: and emphasised in red in Fig. 7, Q₂ should have a greater centrality degree than Q₁. This simple example suggests the following definition of centrality measure.

Definition 6 Let μ be an exact capacity whose core is not a singleton. A centrality measure is a function ϕ : P(X) → [0, 1] satisfying:
CM1. ϕ(P) = 0 for every P ∉ core(μ);
CM2. ϕ(P) = 0 for every P ∈ ext(core(μ));
CM3. there is a unique P₀ ∈ core(μ) such that ϕ(P₀) = 1;
CM4. ϕ(λP + (1 − λ)P₀) ≥ ϕ(P) for every P ∈ core(μ) and λ ∈ [0, 1].
The idea underlying these properties is the following: CM1 tells us that an element outside the core should have centrality degree zero; by CM2, the same should hold for the extreme points of the core; CM3 means that there is a unique probability P₀ with centrality degree 1; finally, property CM4 represents the idea that the closer a probability is to P₀, the greater its centrality degree. We should mention that in Definition 6, and also in the remainder of this section, we are not considering the case where the core is a singleton, core(μ) = {P₀}, because in that case we can trivially assign centrality degree 1 to P₀ and 0 to any other probability measure.
We next discuss two possible strategies for defining a centrality measure. The first one consists in considering a centroid in the interior of the core and measuring the distance of any element with respect to it. This requires specifying both the centroid and the distance. Out of the options considered in the previous section, we would reject μ_1 because it may not belong to the core, and μ_3 because of its non-uniqueness. With respect to the distance, we consider here the total variation although, as argued before, it would also be possible to consider other options such as the L_1 or the Euclidean distances, or even the Kullback-Leibler divergence.
In this sense, if we let μ̃ be our centroid of choice and take

β := min_{P ∈ ext(core(μ))} d(P, μ̃),     (16)

then we can define

ϕ_1(P) := 1 − min{ d(P, μ̃)/β, 1 } for every P ∈ core(μ),     (17)

and ϕ_1(P) = 0 otherwise. A second approach would consist in defining directly a chain {core(μ)_α}_{α∈[0,1]} of sets such that core(μ)_0 = core(μ), core(μ)_1 is a singleton determining the centroid and core(μ)_α is included in the interior of core(μ) for any α > 0, and letting

ϕ_2(P) := max{ α ∈ [0,1] | P ∈ core(μ)_α }.     (18)

The chain {core(μ)_α}_{α∈[0,1]} of sets could be defined, for example, as:

core(μ)_α := CH{ αμ̃ + (1 − α)P | P ∈ ext(core(μ)) },     (19)

where CH denotes the convex hull. Let us show that both approaches lead to a centrality measure.
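The first strategy can be sketched directly for the total variation distance, assuming the centroid and the extreme points are given as tuples (the function names are ours):

```python
def tv(p, q):
    """Total variation distance between two pmfs on a finite space."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2

def phi1(p, centroid, extreme_points):
    """phi_1(P) = 1 - min{d(P, centroid)/beta, 1}, with beta the minimum
    distance from the centroid to an extreme point of the core."""
    beta = min(tv(e, centroid) for e in extreme_points)
    return 1 - min(tv(p, centroid) / beta, 1)

# Toy core: the whole simplex on two outcomes, centroid at the uniform pmf.
ext = [(1.0, 0.0), (0.0, 1.0)]
print(phi1((0.5, 0.5), (0.5, 0.5), ext))  # 1.0: the centroid itself
print(phi1((1.0, 0.0), (0.5, 0.5), ext))  # 0.0: an extreme point
```

Degrees thus interpolate linearly between 1 at the centroid and 0 at (and beyond) the nearest extreme point.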
Example 12 Consider again our running Example 3. The extreme points of core(μ) were given in Example 8. Taking as centroid the average of the extreme points μ_2, given in Eq. (14), we obtain the following distances:

d(P_1, μ_2) = 0.12, d(P_2, μ_2) = 0.12, d(P_3, μ_2) = 0.1, d(P_4, μ_2) = 0.13, d(P_5, μ_2) = 0.2.

Eq. (16) gives β = 0.1, whence ϕ_1(P) = 1 − min{10·d(P, μ_2), 1}. Considering the probabilities Q_1 and Q_2 in Eq. (15), Eq. (17) produces the following centrality degrees:

If we choose instead as centroid μ_4, given in Eq. (11), we obtain the following distances to the extreme points:

whence ϕ_1(P) = 1 − min{d(P, μ_4)/0.1125, 1}. Considering again the probability measures Q_1 and Q_2 from Eq. (15) we obtain:

We can see that the centrality degree of Q_1 is zero in both cases (for the centroids μ_2 and μ_4), but for Q_2 the centrality degree is slightly greater when considering the contraction centroid μ_4. Figure 8 shows the curves with centrality degree 0, 0.2, 0.5 and 0.8 for ϕ_1 when considering as centroid μ_2 (left hand side figure) and μ_4 (right hand side figure). We can also consider the centrality measure ϕ_2 defined using the chain of sets in Eq. (19), based on the extreme points. In that case, taking the average of the extreme points μ_2 as centroid, the centrality degrees of the probability measures Q_1 and Q_2 are ϕ_2(Q_1) ≈ 0.1923 and ϕ_2(Q_2) ≈ 0.7142. In contrast, if we use the contraction centroid, we obtain ϕ_2(Q_1) = 2/9 and ϕ_2(Q_2) = 2/3.
It is worth mentioning here that, for this second approach, any P ∈ int(core(μ)) has a strictly positive centrality degree. This is in contrast with the centrality measure ϕ_1, which assigns zero centrality degree to some probability measures in the interior of the core of μ, such as Q_1. Figure 9 shows the curves of centrality degree 0.2, 0.5 and 0.8 for ϕ_2 when considering as centroid μ_2 (left hand side) and μ_4 (right hand side). It is also possible to define a centrality measure by considering the chain of sets from Eq. (10). For this, note that for each P ∈ core(μ) there is j ∈ {1, ..., l} such that P ∈ core(μ)_{α_S}^{j−1} \ core(μ)_{α_S}^{j}, where core(μ)_{α_S}^{0} = core(μ). Also, there is some α such that P ∈ core(μ)_{α}^{j−1} but P ∉ core(μ)_{α+ε}^{j−1} for any ε > 0. Then we let:

and ϕ_3(P) = 0 if P ∉ core(μ).

Centroids from the perspective of imprecise probabilities
We conclude this paper by considering the centroid problem within the framework of imprecise probabilities, which includes capacities as a particular case and which is capable of modelling more general scenarios. Before we do this, we make a number of clarifications about the terminology. Within imprecise probability theory, an (exact) capacity μ is usually denoted by P, and it is called a (coherent) lower probability, while its conjugate function, called a (coherent) upper probability, is denoted by P̄. P and P̄ may be understood as functions giving lower and upper bounds to a real but unknown probability P_0, meaning that all we know about P_0 is that P(A) ≤ P_0(A) ≤ P̄(A) for any event A ⊆ X. Following this interpretation, the core of a lower probability P is called a credal set (Levi, 1980) and it is denoted by M(P); it may be interpreted as the set of probability measures that are compatible with the information given by the lower probability. Finally, in this context the property of supermodularity is usually called 2-monotonicity.
The correspondence between the terminology used in decision making, game theory and imprecise probabilities can be seen in Table 1.
In this section, we shall recall the basics from the more general theory of (coherent) lower previsions, and show that the four centroids analysed before can be also considered in this context.

(Coherent) lower previsions
In the theory of imprecise probabilities from Walley (1991), rather than giving lower bounds for the values taken by some unknown probability measure on events, we give lower bounds for the values taken by its expectation operator. This is done by means of a lower prevision, a function P : L(X) → R, where L(X) denotes the set of random variables, or gambles, defined on X. Its conjugate upper prevision is defined by P̄(f) = −P(−f) for any f ∈ L(X). One underlying interpretation is that there exists a probability measure P_0 modelling our uncertainty, and all we know about it is that P(f) ≤ E_{P_0}(f) ≤ P̄(f) for any f ∈ L(X). Associated with a lower prevision we can define a credal set by:

M(P) := { P ∈ P(X) | P(f) ≥ P(f) for every f ∈ L(X) },     (21)

using for simplicity the same symbol P to denote a probability measure and its associated expectation operator: P(f) = E_P(f). The lower prevision P is called coherent when P(f) = min_{P∈M(P)} P(f) for any f ∈ L(X). The credal set associated with a coherent lower prevision is a closed and convex subset of P(X), but it may not be a polytope. In fact, there is a one-to-one correspondence between coherent lower previsions and closed and convex subsets of P(X). This allows us to understand the extent of the generality of this theory: while coherent lower probabilities (or exact capacities) give rise to credal sets (or cores) that are polytopes, coherent lower previsions induce closed and convex sets of probabilities that need not be polytopes.
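When the credal set is generated by finitely many probability measures, the lower prevision is simply the lower envelope of their expectations; a minimal sketch (the credal set and gamble are invented for illustration):

```python
def lower_prevision(f, credal_set):
    """Lower envelope P(f) = min over the credal set of the expectation E_P(f).
    f maps outcomes to reals; each P maps outcomes to probabilities."""
    return min(sum(p[x] * f[x] for x in f) for p in credal_set)

# Two probability measures generating a credal set (their convex hull):
credal = [{'a': 0.2, 'b': 0.8}, {'a': 0.5, 'b': 0.5}]
f = {'a': 0.0, 'b': 1.0}
print(lower_prevision(f, credal))                                # 0.5
print(-lower_prevision({x: -v for x, v in f.items()}, credal))   # conjugate: 0.8
```

The second line illustrates the conjugacy relation P̄(f) = −P(−f) stated above.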
One particular situation where the credal set in Eq. (21) is a polytope is when the coherent lower prevision P satisfies 2-monotonicity (Walley, 1981):

P(f ∨ g) + P(f ∧ g) ≥ P(f) + P(g) for every f, g ∈ L(X),

where ∨ and ∧ denote the pointwise maximum and minimum. When this property is satisfied, the 2-monotone lower prevision determines the same credal set as its restriction to events, which is a 2-monotone lower probability P defined by P(A) := P(I_A); and the latter determines the values of the coherent lower prevision P by means of the Choquet integral (Choquet, 1953).
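The Choquet integral that recovers the lower prevision of a non-negative gamble from the restriction to events can be sketched as follows (the dict encoding of the capacity is our choice):

```python
def choquet(f, mu):
    """Choquet integral of a non-negative gamble f (dict: outcome -> value)
    w.r.t. a capacity mu (dict: frozenset -> value): sort the outcomes by
    increasing value and weight each increment by the upper level set."""
    xs = sorted(f, key=f.get)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        total += (f[x] - prev) * mu[frozenset(xs[i:])]
        prev = f[x]
    return total

# Sanity check: for an additive capacity the Choquet integral is the expectation.
mu = {frozenset('ab'): 1.0, frozenset('a'): 0.3, frozenset('b'): 0.7,
      frozenset(): 0.0}
print(choquet({'a': 1.0, 'b': 2.0}, mu))  # 1*0.3 + 2*0.7 = 1.7
```

For a genuinely 2-monotone (non-additive) capacity the same routine returns the lower expectation over the core.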

Centroids for coherent lower previsions
Let us discuss how the different notions of centroids we have analysed in this paper may be applied on arbitrary credal sets or, equivalently, on coherent lower previsions.

Shapley value
The Shapley value can be straightforwardly defined by considering the coherent lower probability associated to the lower prevision (its restriction to indicators of events) and applying any of the equivalent representations of the Shapley value given in Sect. 3.1.
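The combinatorial representation of the Shapley value of the induced lower probability can be sketched as follows (capacity encoded as a dict over frozensets; our encoding):

```python
from itertools import combinations
from math import factorial

def shapley(mu, space):
    """Shapley value of a capacity mu on a finite space:
    phi_i = sum over A not containing x_i of
            |A|!(n-|A|-1)!/n! * (mu(A u {x_i}) - mu(A))."""
    n = len(space)
    phi = {}
    for x in space:
        rest = [y for y in space if y != x]
        val = 0.0
        for k in range(n):
            for A in combinations(rest, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                val += w * (mu[frozenset(A) | {x}] - mu[frozenset(A)])
        phi[x] = val
    return phi

# Small supermodular capacity on two outcomes (invented numbers):
mu = {frozenset(): 0.0, frozenset({1}): 0.2, frozenset({2}): 0.3,
      frozenset({1, 2}): 1.0}
print(shapley(mu, [1, 2]))  # {1: 0.45, 2: 0.55}
```

As the discussion above notes, this computation only sees the restriction to events, so it cannot distinguish two lower previsions with the same induced lower probability.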
Nevertheless, an important drawback here is that two different coherent lower previsions with the same restriction to events will have the same Shapley value, and so will be indistinguishable in this respect. This is illustrated in our next example:

Example 14 Consider X = {x_1, x_2, x_3} and the coherent lower previsions P_1 and P_2 inducing the following credal sets:

By taking lower envelopes, we obtain that both induce the same coherent lower probability P:

Hence, the Shapley value is the same for both P_1 and P_2, and it is given by (0.5, 0.25, 0.25). Note that the Shapley value does not belong to the interior of the credal set of either P_1 or P_2.

Average of the extreme points
Whenever the credal set M(P) is a polytope (i.e., it has a finite number of extreme points), the average of the extreme points can be computed using Eq. (6). While this definition imposes a restriction on the credal set, it is applicable for those associated with a lower probability P that is coherent (Wallner, 2007), and therefore also in the particular cases of 2-monotonicity, belief functions or p-boxes (Montes & Destercke, 2017). In addition, it is also applicable to some models of coherent lower previsions that are not determined by their restrictions to events, such as those associated with comparative probabilities (Miranda & Destercke, 2015). 9
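Under supermodularity (2-monotonicity), the extreme points are the permutation measures of Eq. (1), so the vertex centroid can be sketched as follows (capacity encoded as a dict over frozensets; duplicates are removed before averaging, as in Example 9):

```python
from itertools import permutations

def vertex_centroid(mu, space):
    """Average of the distinct extreme points of the core of a supermodular
    capacity mu, obtained from Eq. (1): for each permutation sigma, P_sigma
    assigns to the k-th element the marginal mu(first k) - mu(first k-1)."""
    pts = set()
    for sigma in permutations(space):
        p = {sigma[k]: mu[frozenset(sigma[:k + 1])] - mu[frozenset(sigma[:k])]
             for k in range(len(sigma))}
        pts.add(tuple(p[x] for x in space))
    return tuple(sum(c) / len(pts) for c in zip(*sorted(pts)))

# Small supermodular capacity on two outcomes (invented numbers):
mu = {frozenset(): 0.0, frozenset({1}): 0.2, frozenset({2}): 0.3,
      frozenset({1, 2}): 1.0}
print(vertex_centroid(mu, [1, 2]))  # approximately (0.45, 0.55)
```

For non-supermodular capacities this enumeration no longer yields the extreme points, which is precisely the computational difficulty discussed in the conclusions.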

Incenter
As we did in Sect. 3.3, we can find the (set of) incenters. In this case, the definition can be straightforwardly given: the incenter radius of M(P) is given by

α_I := sup{ α > 0 | ∃P_0 ∈ M(P) such that B_α(P_0) ⊆ M(P) },

and any P_0 ∈ M(P) such that B_{α_I}(P_0) ⊆ M(P) is called an incenter of P.

Proposition 11 Let P be a coherent lower prevision whose credal set has non-empty interior. Then the value α_I is a maximum. As a consequence, the incenter of P always exists.
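For an exact capacity (lower probability), the characterisation in Eq. (8) identifies the incenter radius with the maximin slack max_P min_A (P(A) − μ(A)) over proper events. A brute-force sketch on a three-element space (the capacity is invented; exact rational arithmetic on a grid of the simplex):

```python
from fractions import Fraction as F
from itertools import product

# Exact capacity on {1, 2, 3}: 0.1 on singletons, 0.4 on pairs (supermodular).
mu = {frozenset(s): F(1, 10) for s in ([1], [2], [3])}
mu.update({frozenset(s): F(2, 5) for s in ([1, 2], [1, 3], [2, 3])})
proper = list(mu)

def incenter_radius(mu, grid=60):
    """Maximise, over pmfs P on a grid of the simplex, the minimum slack
    P(A) - mu(A) over proper events A; cf. Eq. (8)."""
    best = None
    for k1, k2 in product(range(grid + 1), repeat=2):
        if k1 + k2 > grid:
            continue
        p = {1: F(k1, grid), 2: F(k2, grid), 3: F(grid - k1 - k2, grid)}
        slack = min(sum(p[x] for x in A) - mu[A] for A in proper)
        best = slack if best is None else max(best, slack)
    return best

print(incenter_radius(mu))  # 7/30, attained at the uniform distribution
```

The grid contains the uniform distribution (60 is divisible by 3), so for this symmetric capacity the exact maximum 1/3 − 1/10 = 7/30 is recovered; in general a grid search is only an approximation.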

Contraction centroid
The only centroid whose extension to coherent lower previsions is not straightforward is the contraction centroid. Assuming again that M(P) is a polytope, we know that it is determined by a finite number of constraints. This means that there are two (disjoint) sets of gambles L_> and L_= such that the coherent lower prevision P and its conjugate P̄ satisfy P(f) < P̄(f) for every f ∈ L_> and P(f) = P̄(f) for every f ∈ L_=, and that the credal set can be expressed as:

M(P) = { P ∈ P(X) | P(f) ≥ P(f) ∀ f ∈ L_> ∪ L_= }.     (22)

Note that we may assume without loss of generality that these constraints include the indicator functions of the proper events. In that case, when L_> is empty we obtain that P(I_A) = P̄(I_A) for any A ⊆ X (or P(A) = P̄(A), if we use this abuse of notation), meaning that M(P) contains one single probability measure. Moreover, using the properties of coherent lower and upper previsions, we can also assume without loss of generality that 0 ≤ min f < max f = 1 for every f ∈ L_> ∪ L_=.
The idea in this approach is the same explained in Sect. 3.4: we contract the credal set in a uniform manner as long as possible, increasing the value of the lower prevision in a constant amount α in all the gambles f ∈ L > , and then proceed in the same way by reducing the number of constraints. As we showed in Proposition 3, there exists a value α small enough such that this approach produces a non-empty credal set and there is a maximum value satisfying this property.
Proposition 12 Let P be a coherent lower prevision whose credal set is a polytope that can be expressed as in Eq. (22). For a given α > 0, let:

M(P)_α := { P ∈ M(P) | P(f) ≥ P(f) + α ∀ f ∈ L_> }.     (23)

Consider also the set {α > 0 | M(P)_α ≠ ∅}. It holds that:

Moreover, as we explained before, when the coherent lower prevision satisfies 2-monotonicity, it is determined by its restriction to events. Hence, Theorem 4 also applies in this context, where we simply need to understand P(A) and P̄(A) as the lower and upper previsions of the indicator I_A.
Finally, the connection between the set of incenters and the first step of the process determining the contraction centroid also holds for coherent lower previsions.
Proposition 13 Let P be a coherent lower prevision whose associated credal set M(P) is included in P * (X ) and such that L = = ∅. If α S is the incenter radius, then for any P 0 ∈ M(P) and any α ≤ α S :

Particular cases
We have seen that the four centroids can be extended to coherent lower previsions. In this subsection we analyse them for some particular families of imprecise models. We start with probability intervals. A probability interval (de Campos et al., 1994) is an uncertainty model I that gives lower and upper bounds to the probability of the singletons, with associated credal set

M(I) = { P ∈ P(X) | l_i ≤ P({x_i}) ≤ u_i for every i = 1, ..., n }.

This credal set is non-empty if and only if Σ_{i=1}^n l_i ≤ 1 ≤ Σ_{i=1}^n u_i, and in that case we say that the probability interval avoids sure loss. Then, taking lower and upper envelopes of M(I) we obtain a lower and an upper probability, and we say that the probability interval is coherent when P({x_i}) = l_i and P̄({x_i}) = u_i for every i = 1, ..., n. In that case, P is 2-monotone and the values of P and P̄ for any event A ⊆ X can be computed as (de Campos et al., 1994):

P(A) = max{ Σ_{x_i ∈ A} l_i, 1 − Σ_{x_i ∉ A} u_i },  P̄(A) = min{ Σ_{x_i ∈ A} u_i, 1 − Σ_{x_i ∉ A} l_i }.

For the particular case of coherent probability intervals, we can give an explicit formula for the value α_S in the contraction method.
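The event-wise formulas of de Campos et al. (1994) can be sketched directly (the interval below is invented for illustration):

```python
def interval_bounds(l, u, A):
    """Lower/upper probability of event A (a set of indices) induced by a
    probability interval [l_i, u_i]:
    P(A)    = max{sum_{i in A} l_i, 1 - sum_{i not in A} u_i}
    Pbar(A) = min{sum_{i in A} u_i, 1 - sum_{i not in A} l_i}."""
    rest = [i for i in range(len(l)) if i not in A]
    lower = max(sum(l[i] for i in A), 1 - sum(u[i] for i in rest))
    upper = min(sum(u[i] for i in A), 1 - sum(l[i] for i in rest))
    return lower, upper

# A coherent probability interval on three outcomes (invented numbers):
l, u = [0.1, 0.2, 0.3], [0.4, 0.5, 0.6]
print(interval_bounds(l, u, {0, 1}))  # approximately (0.4, 0.7)
```

Note that the bound 1 − Σ u_i over the complement can dominate the naive sum of the l_i, as happens for the event {x_1, x_2} here.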

Proposition 14 Let P and P̄ be the coherent lower and upper probability determined by a coherent probability interval I = {[l_i, u_i] | i = 1, ..., n}. Consider I_> = {i | l_i < u_i}. Then:

(a) The value α_S is given by:

α_S = min{ (1 − Σ_{i=1}^n l_i)/|I_>|, (Σ_{i=1}^n u_i − 1)/|I_>|, min_{i ∈ I_>} (u_i − l_i)/2 }.     (24)
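The three candidate values in the case analysis of the proof of Proposition 14 (the two normalisation gaps divided by |I_>|, and the smallest half-width) can be computed directly; a sketch with invented numbers:

```python
def alpha_S(l, u):
    """Contraction value alpha_S for a coherent probability interval, as the
    minimum of the three candidates in the proof of Proposition 14:
    (1 - sum l_i)/|I_>|, (sum u_i - 1)/|I_>|, min_{i in I_>} (u_i - l_i)/2."""
    I = [i for i in range(len(l)) if l[i] < u[i]]
    return min((1 - sum(l)) / len(I),
               (sum(u) - 1) / len(I),
               min((u[i] - l[i]) / 2 for i in I))

l, u = [0.1, 0.2, 0.3], [0.4, 0.5, 0.6]
print(alpha_S(l, u))  # (1 - 0.6)/3 = 0.1333..., the binding candidate here
```

Restricting the minimum to I_> avoids the degenerate candidate 0 that would arise from indices with l_i = u_i.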

(b) The credal set M(P) α S determined by means of Eq. (23) is a probability interval avoiding sure loss. (c)
The value α_S obtained in Eq. (24) is consistent with that from Theorem 5 for the particular case of maximally imprecise probability intervals. This also shows that in that case Eq. (13) can be simplified, in the sense that we do not need to consider all partitions of X, but only the partitions {{x_1}, ..., {x_n}} and {{x_i}, X \ {x_i}} for every i = 1, ..., n.
From Proposition 14 we can also deduce an explicit formula for the value α_S for the Linear Vacuous (LV) model and the Pari Mutuel Model (PMM), which constitute particular instances of distortion models (Montes et al., 2020a, b) or nearly linear models (Corsato et al., 2019). The PMM (Montes et al., 2019; Pelessoni et al., 2010; Walley, 1991) is determined by the coherent lower probability:

P_PMM(A) = max{ (1 + δ)P_0(A) − δ, 0 } for every A ⊆ X,

where P_0 ∈ P(X) is a given probability measure and δ > 0. Similarly, the LV (Walley, 1991) is defined by the coherent lower probability

P_LV(A) = (1 − δ)P_0(A) for every A ⊊ X, P_LV(X) = 1,

where P_0 ∈ P(X) and δ ∈ (0, 1).
Both the PMM and the LV are instances of probability intervals, where:

This means that we can apply Proposition 14 for computing the value α_S. In fact, when both P_0 and the lower probability take the values 0 and 1 only at the trivial events (the impossible and sure events), the formula for α_S can be simplified and the procedure of contracting the credal set finishes in only one step.
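The singleton bounds induced by the two models are, with p_i = P_0({x_i}), l_i = max{(1+δ)p_i − δ, 0} and u_i = min{(1+δ)p_i, 1} for the PMM, and l_i = (1−δ)p_i, u_i = (1−δ)p_i + δ for the LV; these are the standard expressions in the distortion-model literature (e.g. Walley, 1991). A sketch with invented numbers:

```python
def pmm_interval(p0, delta):
    """Probability interval induced by the Pari Mutuel Model:
    l_i = max{(1+delta) p_i - delta, 0},  u_i = min{(1+delta) p_i, 1}."""
    l = [max((1 + delta) * p - delta, 0.0) for p in p0]
    u = [min((1 + delta) * p, 1.0) for p in p0]
    return l, u

def lv_interval(p0, delta):
    """Probability interval induced by the Linear Vacuous mixture:
    l_i = (1-delta) p_i,  u_i = (1-delta) p_i + delta."""
    l = [(1 - delta) * p for p in p0]
    u = [(1 - delta) * p + delta for p in p0]
    return l, u

p0, delta = [0.5, 0.3, 0.2], 0.1
print(pmm_interval(p0, delta))  # l ~ [0.45, 0.23, 0.12], u ~ [0.55, 0.33, 0.22]
print(lv_interval(p0, delta))   # l ~ [0.45, 0.27, 0.18], u ~ [0.55, 0.37, 0.28]
```

Both outputs satisfy Σ l_i ≤ 1 ≤ Σ u_i, so Proposition 14 applies to the resulting intervals.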
Corollary 15 Consider a PMM P_PMM or an LV P_LV determined by a probability measure P_0 ∈ P(X) and a distortion parameter δ. Assume that P_PMM(A) and P_LV(A) belong to (0, 1) for every A ≠ ∅, X. Then:

In this respect, it is worth remarking that (i) the good behaviour of these two distortion models is in line with other desirable properties they possess, as discussed in Destercke et al. (2022), Montes et al. (2020a) and Montes et al. (2020b); and (ii) the centroid of a distortion model originated by a probability measure P_0 does not coincide with P_0, because the distortion is not done uniformly in all directions of the simplex. This was already observed in Miranda and Montes (2018) for the particular case of the Shapley value.

Properties
We consider now the properties of the centroids considered in Sect. 4.

Proposition 16
Let P be a coherent lower prevision. Then, the properties in Propositions 7 and 8 still hold.
We have already mentioned that two different lower previsions may have the same restriction to events. For this reason, in addition to the aforementioned properties, it would be desirable that the centroid of a coherent lower prevision P does not necessarily coincide with the centroid of its restriction to events. In this respect, it is not difficult to show that P_3 and P_4 are capable of distinguishing between lower previsions and lower probabilities, and so is P_2 (when M(P) is a polytope). On the other hand, the Shapley value is defined only via the lower probability, so it does not distinguish between lower probabilities and lower previsions, as we showed in Example 14.

Centrality measures
Finally, we may try to define centrality measures for the credal set determined by a coherent lower prevision. For this aim, we can consider exactly the definition of centrality measure as in Definition 6, and the centrality measures ϕ 1 , ϕ 2 and ϕ 3 defined in Eqs. (17), (18) and (20), respectively. Note that in the case of ϕ 1 and ϕ 3 we need to restrict ourselves to coherent lower previsions whose credal set is a polytope to assure that the minimum in Eq. (16) is strictly positive and that the contraction approach finishes in a finite number of steps, respectively.
Proposition 17 Given a coherent lower prevision P, the function ϕ 2 is a centrality measure. Moreover, if M(P) is a polytope, ϕ 1 and ϕ 3 are centrality measures too.

Conclusions
We have performed a comparative analysis of four alternatives for defining a center of an exact capacity: the Shapley value, the average of the extreme points, the incenter with respect to the total variation distance and the limit of uniform contractions. Our results show that these four approaches may lead to different results, and also illustrate some of the properties each of them satisfies; a summary can be seen in Table 2. Note that our goal with this paper is not to take the stance that one of them is better than the others. Instead, we intend to provide some assistance to the practitioner in her choice of a centroid: if, for instance, she considers that linearity is an essential property, she must pick the Shapley value; if she wants to ensure that the center belongs to the interior of the core of the capacity, she must select one of the others; and so on. Also, we have seen that these centroids can be applied in the more general framework of coherent lower previsions. Since coherent lower previsions are in one-to-one correspondence with closed and convex sets of probabilities, they include the particular case of polytopes. We have seen that the results we have proved for exact capacities can also be extended to coherent lower previsions. Let us recall that the center in this paper is understood as a probability measure that is in the interior of the core of the capacity and that can play the role of its representative. For this reason, we have left out of our study other approaches based, for example, on maximising the entropy or minimising the Kullback-Leibler divergence, which in our context may produce counterintuitive results.
In addition to the comparison performed between the four centroids, some comments must be made regarding their computation in practice:
• First of all, the computation of the Shapley value is known to be a hard problem that increases exponentially with |X|, since it requires the computation of μ(A) for the 2^n − 1 non-trivial events.
• Secondly, the average of the extreme points is simple to compute as long as these are known. Under the assumption of supermodularity, the extreme points coincide with the probability measures P_σ defined in Eq. (1). Moreover, there are particular situations where their computation is even simpler: for example, when μ is minitive there are at most 2^{n−1} extreme points (Miranda et al., 2003), and when μ corresponds to a p-box their maximal number is the n-th Pell number (Montes & Destercke, 2017). The problem is more challenging when the capacity is not supermodular: even if the number of extreme points is at most n! (Wallner, 2007), there is no simple procedure for their computation.
• Thirdly, computing the contraction centroid may be an extremely complicated problem. Even if Theorem 4 gives a formula for computing the value α_S under fairly general conditions, it requires the computation of all the families A in A(X), whose number is extremely large. The complexity is significantly reduced for supermodular capacities, because Theorem 5 gives a simpler expression for α_S that depends on the partitions of X. Still, the problem is rather complex, because the number of partitions of an n-element possibility space coincides with the n-th Bell number. Nevertheless, we have seen that for particular models that may arise in practice, such as probability intervals or some distortion models, the computation of the four centroids is much simpler.
• Finally, under fairly general conditions the set of incenters coincides with the first step of the contraction approach (Proposition 6), hence both approaches are equivalent from the computational viewpoint.
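The growth rates quoted above (Pell numbers for the extreme points of p-boxes, Bell numbers for the partitions entering Theorem 5) can be tabulated with two small recurrences:

```python
def bell(n):
    """n-th Bell number (partitions of an n-element set), via the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        new = [row[-1]]
        for v in row:
            new.append(new[-1] + v)
        row = new
    return row[-1]

def pell(n):
    """n-th Pell number: P_0 = 0, P_1 = 1, P_n = 2 P_{n-1} + P_{n-2}."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, 2 * b + a
    return a

print([bell(n) for n in range(1, 7)])  # [1, 2, 5, 15, 52, 203]
print([pell(n) for n in range(1, 7)])  # [1, 2, 5, 12, 29, 70]
```

Both sequences grow quickly, which illustrates why the general formulas become impractical beyond small possibility spaces.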
While our results give some overview of the properties of the centroids of a capacity, there is still much work to be done in order to have a full picture of this problem. First and foremost, it would be interesting to extend our approaches to infinite possibility spaces. While this seems immediate in the case of the incenter or the vertex centroid (see also footnote 4), in the case of the Shapley value we should consider the generalisations carried out in Neyman (2002), and in the case of the contraction centroid we should verify that the process stabilises in a finite number of steps. In addition, we could consider other possibilities in the context of game solutions, such as the Banzhaf value (Banzhaf, 1965) or more generally probabilistic solutions (Weber, 1988), or other alternatives to the total variation distance, such as the Euclidean distance or the L 1 distance. It would also be interesting to obtain further conditions for the equality between some of these centroids. And finally, a deeper study of centrality measures and their axiomatic properties would be of interest.
This set is non-empty because int(M(P)) ≠ ∅ ⇔ int(M(P)) ∩ P*(X) ≠ ∅, and given P ∈ int(M(P)) ∩ P*(X) there exists some α > 0 such that B^o_α(P) ⊆ M(P) ∩ P*(X). Besides being non-empty, this set, say Δ_1, is directed (α ∈ Δ_1 ⇒ α' ∈ Δ_1 for all α' ∈ [0, α]). Let α_1 := sup Δ_1. This means that for every n ∈ N there exists some P_n ∈ M(P) such that B^o_{α_1 − 1/n}(P_n) ⊆ M(P). The sequence (P_n)_n is included in the compact set M(P), which is as a consequence also sequentially compact. Therefore, there exists a subsequence, again denoted (P_n)_n, that converges to some P. Note that for this subsequence we also have that B^o_{α_1 − 1/n}(P_n) ⊆ M(P). Since M(P) is closed, P ∈ M(P). Moreover, for any ε > 0 there is some n_ε such that d(P_m, P) < ε for every m ≥ n_ε. Take ε = 1/n for a fixed n. Then for any Q ∈ B^o_{α_1 − 2/n}(P) and any sufficiently large m ≥ n it holds that d(Q, P_m) ≤ d(Q, P) + d(P, P_m) < (α_1 − 2/n) + 1/n = α_1 − 1/n ≤ α_1 − 1/m, whence Q ∈ B^o_{α_1 − 1/m}(P_m) ⊆ M(P). Since we can take m arbitrarily large, we deduce that B^o_{α_1 − 2/n}(P) ⊆ M(P). Since this holds for every n, we deduce that B^o_{α_1}(P) ⊆ M(P). Therefore, α_1 belongs to Δ_1.

Proof of Propositions 3 and 12
Again, we prove Proposition 12 and, since exact capacities are particular cases of coherent lower previsions, Proposition 3 immediately follows.
(a) Let us denote L_> = {f_1, ..., f_k}. For each i = 1, ..., k, it follows by coherence that there is some P_i ∈ M(P) such that P_i(f_i) = P̄(f_i) > P(f_i). If we now let P := (P_1 + ··· + P_k)/k, it belongs to the convex set M(P) and by construction it satisfies P(f_i) > P(f_i) for all i = 1, ..., k. Given α = min_{i=1,...,k} (P(f_i) − P(f_i)) > 0, we conclude that P ∈ M(P)_α, and therefore this set is non-empty. (b) Since X is finite, the topology of M(P) is equivalent to the topology associated with the Euclidean distance. The definition of M(P)_α implies then that, for every admissible α, the set M(P)_α is a closed subset of M(P), and it is therefore compact. Thus, (M(P)_α)_α is a decreasing family of compact subsets of M(P), and as a consequence their intersection M* is non-empty. But this intersection M* must coincide with M(P)_{α_S} for α_S the supremum of the admissible values, and this implies that this supremum is a maximum. (c) Let P_{α_S}, P̄_{α_S} denote the lower and upper envelopes of M(P)_{α_S}. Assume ex absurdo that this is not the case. Then, reasoning as in the first statement, we can find some P ∈ M(P)_{α_S} such that P(f_i) > P_{α_S}(f_i), and this contradicts that α_S is the maximum value.

Proof of Theorem 4
We are looking for the maximum α such that the set:

is non-empty. This is equivalent to requiring that the capacity ν given by ν(A) := μ(A) + α for every proper subset A of X, with ν(∅) = 0 and ν(X) = 1, is balanced. Since the possibility space X is finite, this means (Walley 1991, Lemma 2.4.4) that for any l ∈ N and A_1, ..., A_l we should have:

Note that we can assume that the events A_1, ..., A_l are proper subsets of X, given that I_X − ν(X) = 1 − 1 = 0 and I_∅ − ν(∅) = 0 − 0 = 0. This allows us to rewrite Eq. (A1) as:

This means that:

Now, instead of considering the events A_1, ..., A_l, consider A_1, ..., A_l, C_1, C_2. We get:

and the proof is complete.
In order to prove Theorem 5, we must establish first a couple of auxiliary lemmas.

Lemma 18 Let μ be a maximally imprecise exact capacity, and consider
(c) From the condition, given two different elements A, B ∈ A either they are disjoint or one of them is included in the other. If we then consider the partial order on A given by set inclusion, then we can find a subfamily A 1 ⊂ A of maximal elements in the order that are pairwise disjoint. Since any element of X must be included in some maximal element, it follows that the subfamily A 1 is a partition of X .

Proof of Theorem 5 Let A = {A_i}_{i=1,...,k} be the element in A(X) where the minimum in Eq. (12) is attained. According to Lemma 19, if β_A = 1, then A is a partition, so A ∈ A*(X) and hence Eq. (13) holds. If β_A = |A| − 1, then {A_i}_{i=1,...,k} is a partition, so it belongs to A*(X). Also:

hence Eq. (13) holds.
Assume now that 1 < β_A < |A| − 1, and let us prove that it is possible to find another A* ∈ A(X) such that β_{A*} < β_A and where α_S is attained.
From item (c) in Lemma 19 we deduce that either there is

In this second case, applying 2-monotonicity with the sets A_i and A_j above, we deduce that:

Iterating the procedure, we find after a finite number of steps that there are no different events C and D in the family A_k such that C ∩ D ≠ ∅, C \ D ≠ ∅ and D \ C ≠ ∅. But then, applying Lemma 19, we deduce that there is

Since both β_{A*} and β_{A_k \ A*} are strictly smaller than β_{A_k}, we deduce that we can find another element of A(X) where the value α_S is attained and with a smaller value of β_A. If we repeat this process, we end up with a family A ∈ A(X) such that β_A = 1 and where α_S is attained, at which point we apply the first part of the proof.

Proof of Propositions 6 and 13
We start by proving Proposition 13. From it, Proposition 6 trivially follows by noting that, by hypothesis, L_= would be empty and L_> would be formed by the indicators of all the proper events of X (because μ is by hypothesis maximally imprecise), which allows us to apply Proposition 13 to a lower prevision where L_> contains the indicator functions of the proper events.
(⇒) Assume that P_0(f) ≥ P(f) + α for every f ∈ L_>, and consider Q ∈ B^c_α(P_0). Since we can assume without loss of generality that 0 = min f < max f = 1 for every f ∈ L_>, it follows that

Since by assumption L_= = ∅, this implies that Q ∈ M(P). (⇐) Consider a probability measure P_0 such that B^c_α(P_0) ⊆ M(P), and let us prove that P_0(f) ≥ P(f) + α for every f ∈ L_>. Assume ex absurdo that P_0(f) − α < P(f) for some gamble f ∈ L_>, and let us show that there exists some Q ∈ B^c_α(P_0) such that Q(f) < P(f). To see that this is indeed the case, note that since B^c_α(P_0) is included in P*(X), it must be that P_0({x_i}) > α for every x_i ∈ X. If we now take x_m, x_M ∈ X such that:

and define Q by means of the mass function

However:

hence Q ∉ M(P), and we obtain a contradiction.
The second part of Proposition 6 is an immediate consequence of the first once we realise that, since M(P) ⊆ P*(X), we can compute α_I by means of Eq. (8).

Proof of Propositions 7, 8 and 16
We start by proving the properties of the centroids for coherent lower previsions (Proposition 16). Since exact capacities are particular cases of coherent lower previsions, Propositions 7 and 8 can be regarded as a corollary. Throughout this proof, for simplicity we use the notation P(A) := P(I_A) and P̄(A) := P̄(I_A). If x_i is a null player, P(X) = 1 = P(X \ {x_i}), whence P̄({x_i}) = 0. Since they belong to the credal set, any of these centroids gives probability zero to x_i. With respect to symmetry, let σ_{i,j} denote the permutation of X that exchanges x_i and x_j and leaves the other elements fixed, and let P^{σ_{i,j}} be given by

If P is an extreme point of M(P), then P^{σ_{i,j}} is an extreme point of M(P^{σ_{i,j}}). If we now assume that by symmetry P^{σ_{i,j}} = P, it follows that

Finally, for the contraction centroid note that, for any gamble f ∈ L_> and any α > 0, it holds that

With respect to ignorance preservation, in the case of P_1 it follows immediately from Eq. (3), because the Möbius inverse of P satisfies m(X) = 1 and m(A) = 0 for any other A ≠ X. In the case of P_2, it suffices to note that the extreme points of M(P) are the degenerate distributions; and for P_3, P_4 it suffices to use that the uniform distribution P_0 is the only one in M(P) for which

Finally, with respect to continuity, in the case of P_1 it follows directly from Eq. (2), while in the case of P_3, P_4 it follows trivially because we are using in their definition the topology of the total variation.

Proof of Propositions 9, 10 and 17
We first prove that ϕ_1, ϕ_2 and ϕ_3 are centrality measures for a coherent lower prevision P (assuming that M(P) is a polytope for ϕ_1 and ϕ_3), hence proving Proposition 17. Since any exact capacity is a particular type of coherent lower prevision whose core is a polytope, Propositions 9 and 10 trivially follow as a corollary.
Let us begin by showing that ϕ_1 is a centrality measure: CM1 This holds by definition. CM2 By construction, if P ∈ ext(M(P)) then d(P, P̃) ≥ β, hence ϕ_1(P) = 1 − 1 = 0. CM3 By definition, ϕ_1(P) = 1 iff d(P, P̃) = 0, and since d is a distance this holds iff P = P̃. CM4 This follows if and only if for any P ∈ ext(M(P)) and any λ, β ∈ [0, 1] such that λ ≥ β it holds that

If we denote P_1 := λP + (1 − λ)P̃ and P_2 := βP + (1 − β)P̃, there is some a ∈ (0, 1) such that P_2 = aP_1 + (1 − a)P̃. As a consequence, for any event A it holds that

Let us now prove that ϕ_2 is a centrality measure.

CM1,CM3
These follow immediately from the definition of M(P)_0 and M(P)_1. CM2 This holds because any extreme point P of M(P) does not belong to M(P)_α for any α > 0. CM4 Consider P_1 = λP + (1 − λ)P_0 and P_2 = βP + (1 − β)P_0 for λ ≥ β, where P is an extreme point of M(P) and P_0 is the unique element of M(P)_1. Then (CM4) holds if and only if for any γ ∈ (0, 1) such that P_1 ∈ M(P)_γ, also P_2 ∈ M(P)_γ. This is a consequence of the convexity of the set M(P)_γ, given that P_2 = aP_1 + (1 − a)P_0 for some a ∈ [0, 1].
Finally, let us see that ϕ 3 satisfies the four properties in Definition 6.

Proof of Proposition 14
Since P is associated with a probability interval, M(P) = {P | l_i ≤ P({x_i}) ≤ u_i ∀i = 1, ..., n}, meaning that, while we can assume without loss of generality that L_= ∪ L_> includes the indicators of the proper events, in this case only the indicators of the singletons are necessary, and L_= and L_> reduce to I_= and I_>, respectively.
Let α be the minimum in Eq. (24), and let us see that α = α_S. We consider the following cases:

1. Assume that α = (1 − Σ_{i=1}^n l_i)/|I_>|. Define P_0 by: P_0({x_i}) = l_i for i ∈ I_= and P_0({x_i}) = l_i + α for i ∈ I_>. P_0 satisfies the following properties: (i) It is a probability measure, because it is non-negative and Σ_{i=1}^n P_0({x_i}) = Σ_{i=1}^n l_i + |I_>|α = 1. (ii) P_0({x_i}) ∈ [l_i, u_i]. On the one hand, if i ∈ I_=, P_0({x_i}) = l_i. On the other hand, if i ∈ I_>, P_0({x_i}) = l_i + α > l_i. Also, by definition of α it holds that α ≤ (u_i − l_i)/2, hence P_0({x_i}) = l_i + α ≤ u_i. (iii) To see that M(P)_α = {P_0}, and as a consequence that α = α_S, note that, by construction, P_0 ∈ M(P)_α. Since moreover Σ_{i=1}^n l_i + |I_>|α = 1, we deduce that any Q ∈ M(P)_α must coincide with P_0 and that M(P)_{α'} = ∅ for any α' > α. This implies that M(P)_α = {P_0} is formed by a unique probability measure P_0, which coincides with P_4.

2. Assume that α = (Σ_{i=1}^n u_i − 1)/|I_>|. Define P_0 by: P_0({x_i}) = u_i for i ∈ I_= and P_0({x_i}) = u_i − α for i ∈ I_>. Following the same steps as in the previous case, we obtain that M(P)_α = {P_0}, where P_0 = P_4.

3. Assume that α = (u_i − l_i)/2 for some i ∈ {1, ..., n}. We define a new probability interval I* given by:

Let us prove some interesting properties of this probability interval: (i) M(I*) ≠ ∅: on the one hand, Σ_{i=1}^n l*_i ≤ 1, since by hypothesis α ≤ (1/|I_>|)(1 − Σ_{i=1}^n l_i) and, since I avoids sure loss, Σ_{i=1}^n l_i ≤ 1. Similarly, Σ_{i=1}^n u*_i ≥ 1, since by hypothesis α ≤ (1/|I_>|)(Σ_{i=1}^n u_i − 1) and, since I avoids sure loss, Σ_{i=1}^n u_i ≥ 1. We therefore conclude that I* avoids sure loss.
It only remains to see that α = α S , but this straightforwardly follows from the fact that the chosen α saturates the probability of at least one of the events.

Proof of Corollary 15
Consider first the PMM. Since we are assuming that P_PMM(A) ∈ (0, 1) for non-trivial events A, it corresponds to the probability interval I_PMM given by:

Applying Eq. (24) to this probability interval, we obtain:

Following the steps in the previous proposition, we deduce that

Similarly, for the LV we are assuming that P_LV(A) ∈ (0, 1) for non-trivial events A, so the LV corresponds to the probability interval I_LV given by:

It follows from the results in Miranda and Montes (2018) that the values obtained for the PMM and the LV coincide with the Shapley value. Also, we have seen that M(P)_{α_S} coincides with the set of incenters with respect to the total variation distance (see Proposition 13). Since in this case M(P)_{α_S} is a singleton (for both the PMM and the LV), the incenter is unique and coincides with the contraction centroid. Finally, to see that these centroids also coincide with the average of the extreme points, we just need to note (see Montes et al., 2019, Sec. 3.1 and Montes et al., 2020a, Sec. 5.1) that, using the approach based on permutations, each extreme point appears in the same number of permutations, so the average of the extreme points coincides with the Shapley value.