Tropical diagrams of probability spaces

After endowing the space of diagrams of probability spaces with an entropy distance, we study its large-scale geometry by identifying the asymptotic cone as a closed convex cone in a Banach space. We call this cone the tropical cone, and its elements tropical diagrams of probability spaces. Given that the tropical cone has a rich structure, while tropical diagrams are rather flexible objects, we expect the theory of tropical diagrams to be useful for information optimization problems in information theory and artificial intelligence. In a companion article, we give a first application to derive a statement about the entropic cone.


Introduction
With [9] we started a research program aiming for a systematic approach to a class of information optimization problems in information theory and artificial intelligence. A prototypical example of such a problem, still wide open, is the characterization of the entropic cone: For an $N$-tuple of random variables, one may evaluate their entropies and the entropies of the joint variables and obtain a vector in $\mathbb{R}^{2^N - 1}$. A vector obtained in this way is called an entropy vector of an $N$-tuple of random variables. The closure of the set of all entropy vectors of $N$-tuples is what we call the entropic cone, see also [8]. Besides the characterization of the entropic cone, other information optimization problems arise for instance in causal inference [13], artificial intelligence [14], information decomposition [3], robotics [1], neuroscience [5] and in variational autoencoders [7].
The global strategy of our program is roughly based on the following way of thinking. The entropic cone is clearly a very complicated object: to date, there is no explicit description of the entropic cone for four or more random variables, while it is known that it is not polyhedral [8]. Yet, perhaps, much of its complexity may be explained by it being the closure of an image under a linear map of another, simpler, higher-dimensional cone.
The purpose of this article is to construct such a higher-dimensional (in fact, infinite-dimensional) object, which we call the tropical cone, and to derive some of its properties, which testify to its simple structure and help in the study of information optimization problems. As an example of its use, in [11] we apply the theory to derive a statement about the entropic cone.
Before outlining the construction of the tropical cone, let us mention that for our purposes the language of random variables proved inconvenient, which is why we work with diagrams of probability spaces instead.
Diagrams of probability spaces are commutative diagrams in the category of probability spaces, with (equivalence classes of) measure-preserving maps as morphisms; three examples are displayed in (1.1).
Collections of $n$ random variables give rise to a special type of diagram that includes, besides the target spaces of the random variables themselves, the target space of every joint variable. Such diagrams have a particular combinatorial type. The first and the last diagrams in (1.1) are examples of such special types of diagrams in the case of two and three random variables, respectively. The description of other diagrams, such as the diagram in the middle of (1.1), using the language of random variables is less transparent.
We will construct the tropical cone and derive its properties over several sections. In Sect. 2 we describe the construction of the asymptotic cone in the abstract setting of a metric Abelian monoid $(\Gamma, +, d)$. We believe that this abstract setting makes the construction more transparent and easier to follow. The results we present in that section are probably quite standard, but we find it beneficial to gather them under one roof. Such an asymptotic cone consists of equivalence classes of quasi-linear sequences in the monoid. Whereas linear sequences have the form $(n \cdot a)_{n \in \mathbb{N}_0}$, where $a$ is an element of the monoid, quasi-linear sequences may deviate from linearity in a controlled fashion, measured by a sublinear function $\varphi$ satisfying some additional conditions that we will specify later. A sequence $\gamma \in \Gamma^{\mathbb{N}_0}$ is called $\varphi$-quasi-linear if for all $m, n \in \mathbb{N}$ it satisfies
$$d\big(\gamma(m+n),\, \gamma(m) + \gamma(n)\big) \le \varphi(m+n),$$
and two sequences $\gamma$ and $\gamma'$ are equivalent if
$$\lim_{n \to \infty} \frac{1}{n}\, d\big(\gamma(n),\, \gamma'(n)\big) = 0.$$
The asymptotic cone is itself again a metric Abelian monoid, but it admits additional structure: it carries a distributive action of $\mathbb{R}_{\ge 0}$, and the metric becomes homogeneous and translation invariant.
As an example of this construction, A'Campo [2] constructed the real numbers as the asymptotic cone in the monoid of integers.
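To make this example concrete, the following is a minimal sketch, assuming the monoid is $(\mathbb{Z}, +)$ with the distance $|a - b|$. The helper names `gamma` and `defect` are hypothetical and chosen only for illustration. The integer sequence $\gamma(n) = \lfloor n x \rfloor$ deviates from linearity by at most 1, and its slope $\gamma(n)/n$ recovers the real number $x$.

```python
import math

def gamma(x, n):
    """n-th term of the integer sequence floor(n * x) (an illustrative choice)."""
    return math.floor(n * x)

def defect(x, m, n):
    """Deviation from linearity |gamma(m+n) - gamma(m) - gamma(n)| in (Z, +, |.|)."""
    return abs(gamma(x, m + n) - gamma(x, m) - gamma(x, n))

x = math.sqrt(2)

# The defect stays bounded (here by 1), so the sequence is quasi-linear
# with respect to any admissible function bounded below by 1.
max_defect = max(defect(x, m, n) for m in range(1, 200) for n in range(1, 200))

# The asymptotic slope of the integer sequence approximates the real number x.
slope = gamma(x, 10**6) / 10**6

print(max_defect, slope)
```

Two such sequences built from the same $x$ but with different rounding conventions differ by a bounded amount, so they are equivalent in the sense of the limit above.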
In Sect. 3 we show that, under certain conditions, the asymptotic cone is a complete metric space and it can be realized as a closed convex cone in a Banach space.
In Sect. 4 we apply the general construction of Sects. 2 and 3 to the monoid of diagrams of probability spaces endowed with the intrinsic entropy distance [6,9,15] and with the tensor product as the binary operation. We call the resulting space the tropical cone and its elements tropical diagrams (see footnote 1). In Sect. 6, we give a simple characterization of the tropical cone for special types of diagrams.
For more complicated diagrams, we currently do not have an explicit description of the tropical cone, but we do show that it possesses a rich algebraic structure. In particular, one can take convex combinations of tropical diagrams. Other useful operations and constructions can be carried through for tropical diagrams, whereas they do not have an equivalent in the classical context of probability spaces, see [10]. All in all, from some perspective, tropical diagrams are easier to deal with than diagrams of probability spaces, since only rough, asymptotic relations between probability spaces are preserved under tropicalization, similar to how all complicated features of the landscape disappear when looking at the Earth from outer space.
In order to study information optimization problems, we may as well study the more malleable tropical cone. This is because the entropic cone is the closure of the image of the bounded linear map defined on the tropical cone that evaluates entropies of the individual spaces in a tropical diagram. More generally, we call any non-negative bounded linear functional on the tropical cone an entropic quantity. These include entropies of individual spaces, but also some other quantities, such as optima of some linear combinations of entropies of an extended diagram, where some extra spaces are added to the original diagram. Study of such entropic quantities is the subject of our future research.
One of the main tools in the study of entropic quantities through the tropical cone is the Asymptotic Equipartition Property for diagrams. Originally derived in [9], we cast it here into a density statement of simpler, so-called homogeneous tropical diagrams in the tropical cone, in terms of Theorems 5.1 and 5.2. Therefore, to prove statements about entropic quantities, it suffices to study the much simpler homogeneous tropical diagrams.
Footnote 1. The reason for the name tropical cone is the following. For instance in algebraic geometry, tropical varieties are, roughly speaking, divergent sequences of classical varieties, renormalized on a log scale with an increasing base. The adjective 'tropical' carries little semantics, but was introduced in honor of the Brazilian mathematician and computer scientist Imre Simon who worked on the subject of tropical mathematics. Analogously, we construct the asymptotic cone from certain divergent sequences with respect to the intrinsic entropy distance. As the intrinsic entropy distance is entropy-based, we achieve a similar type of renormalization as in algebraic geometry.

Asymptotic cones of metric abelian monoids
In this section we define the asymptotic cone in the setting of an abstract metric Abelian monoid. In a later section, we will specialize to the case of diagrams of probability spaces.

Metric and pseudo-metric spaces
A pseudo-metric space $(X, d)$ is a set $X$ equipped with a pseudo-distance $d$, a bivariate function satisfying all the axioms of a distance function, except that it is allowed to vanish on pairs of non-identical points. An isometry of such spaces is a distance-preserving map such that for any point in the target space there is a point in the image at zero distance from it. Given such a pseudo-metric space $(X, d)$, one can always construct an isometric metric space $(X/_{d=0}, d)$, the metric quotient, by identifying all pairs of points that are distance zero apart.
Any property formulated in terms of the pseudo-metric holds simultaneously for a pseudo-metric space and its metric quotient. It will be convenient for us to construct pseudo-metrics on spaces instead of passing to the quotient spaces.
For a pair of points $x, y \in X$ in a pseudo-metric space $(X, d)$ we will write $x \overset{d}{=} y$ if $d(x, y) = 0$. We call such a pair of points ($d$-)metrically equivalent.
Many metric-topological notions such as (Lipschitz-)continuity, compactness, $\varepsilon$-nets, dense subsets, etc., extend to the setting of pseudo-metric spaces, and with a certain care one may switch between a pseudo-metric space and its metric quotient, replacing the $\overset{d}{=}$-sign with equality.

Metric abelian monoids
A monoid is a set equipped with a bivariate associative operation and a neutral element. The operation is usually called multiplication, or addition if it is commutative. We call a monoid with pseudo-distance $(\Gamma, +, d)$ a metric Abelian monoid if the binary operation is 1-Lipschitz with respect to each argument: for all $\gamma, \gamma', \eta \in \Gamma$,
$$d(\gamma + \eta,\, \gamma' + \eta) \le d(\gamma, \gamma').$$
In other words, the translation maps $\gamma \mapsto \gamma + \eta$ are non-expanding for every $\eta \in \Gamma$.
The following proposition is an elementary consequence of the triangle inequality.
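Proposition 2.1 Let $(\Gamma, +, d)$ be a metric Abelian monoid. Then:
1. For any quadruple $\gamma_1, \gamma_2, \gamma_3, \gamma_4 \in \Gamma$,
$$d(\gamma_1 + \gamma_2,\, \gamma_3 + \gamma_4) \le d(\gamma_1, \gamma_3) + d(\gamma_2, \gamma_4).$$
2. For every $n \in \mathbb{N}$ and $\gamma_1, \gamma_2 \in \Gamma$,
$$d(n \cdot \gamma_1,\, n \cdot \gamma_2) \le n \cdot d(\gamma_1, \gamma_2).$$
A metric Abelian monoid $(\Gamma, +, \delta)$ will be called homogeneous if for all $n \in \mathbb{N}_0$ and $\gamma_1, \gamma_2 \in \Gamma$
$$\delta(n \cdot \gamma_1,\, n \cdot \gamma_2) = n \cdot \delta(\gamma_1, \gamma_2).$$
A homogeneous metric Abelian monoid is called an $\mathbb{R}_{\ge 0}$-semi-module $(\Gamma, +, \cdot, \delta)$ if in addition there is a doubly distributive $\mathbb{R}_{\ge 0}$-action, that is, an action that distributes both over sums of scalars $\lambda_1 + \lambda_2$ with $\lambda_1, \lambda_2 \in \mathbb{R}_{\ge 0}$ and over sums of elements $\gamma_1 + \gamma_2$ with $\gamma_1, \gamma_2 \in \Gamma$.
A convex cone in a normed vector space is a typical example of an $\mathbb{R}_{\ge 0}$-semi-module. The intersection of a convex cone in $\mathbb{R}^n$ with the integer lattice is an example of a monoid that does not admit a semi-module structure.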
The following proposition asserts that if a metric Abelian monoid is homogeneous, then the pseudo-distance is translation invariant and, in particular, satisfies a cancellation property. This result was communicated to us by Tobias Fritz.
Proposition 2.2 Let $(\Gamma, +, \delta)$ be a homogeneous metric Abelian monoid. Then the pseudo-distance function $\delta$ is translation invariant, that is, for any $\gamma_1, \gamma_2, \eta \in \Gamma$
$$\delta(\gamma_1 + \eta,\, \gamma_2 + \eta) = \delta(\gamma_1, \gamma_2).$$
In particular, the following cancellation property holds in $\Gamma$: if $\gamma_1 + \eta \overset{\delta}{=} \gamma_2 + \eta$, then $\gamma_1 \overset{\delta}{=} \gamma_2$.
The proof of this proposition is essentially the same as the proof of [9, Proposition 3.7]. Even though the latter proposition is formulated for a specific homogeneous metric Abelian monoid, its proof does not use any specific properties of that monoid, only the defining properties of a generic homogeneous metric Abelian monoid.
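The notions of this subsection can be checked by brute force on a toy example. The sketch below assumes the monoid $(\mathbb{N}_0, +)$ with two illustrative distances that are not taken from the text: the standard distance $|a - b|$ and the truncated distance $\min(|a - b|, 1)$. Both make the addition 1-Lipschitz and satisfy the four-point inequality $d(a_1 + a_2, a_3 + a_4) \le d(a_1, a_3) + d(a_2, a_4)$, which follows from the 1-Lipschitz property and the triangle inequality, but only the first one is homogeneous.

```python
from itertools import product

def d_abs(a, b):
    """Standard distance on the monoid (N_0, +)."""
    return abs(a - b)

def d_trunc(a, b):
    """Truncated distance min(|a - b|, 1): 1-Lipschitz addition, but not homogeneous."""
    return min(abs(a - b), 1)

def is_one_lipschitz(d, pts):
    """Check d(a + c, b + c) <= d(a, b) on all sampled triples."""
    return all(d(a + c, b + c) <= d(a, b) for a, b, c in product(pts, repeat=3))

def quadruple_bound_holds(d, pts):
    """Check d(a1 + a2, a3 + a4) <= d(a1, a3) + d(a2, a4) on all sampled quadruples."""
    return all(d(a1 + a2, a3 + a4) <= d(a1, a3) + d(a2, a4)
               for a1, a2, a3, a4 in product(pts, repeat=4))

def is_homogeneous(d, pts, max_n=5):
    """Check d(n*a, n*b) == n * d(a, b) for n = 0..max_n on all sampled pairs."""
    return all(d(n * a, n * b) == n * d(a, b)
               for a, b in product(pts, repeat=2) for n in range(max_n + 1))

pts = range(8)
for name, dist in (("abs", d_abs), ("trunc", d_trunc)):
    print(name, is_one_lipschitz(dist, pts),
          quadruple_bound_holds(dist, pts), is_homogeneous(dist, pts))
```

The truncated distance illustrates why one passes to an asymptotic distance: for it, $d(n a, n b)/n$ tends to zero, so all information is lost in the limit, whereas the standard distance is already homogeneous.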

Asymptotic cones (tropicalization) of monoids
In our construction, points of the asymptotic cone of $(\Gamma, +, d)$ will be sequences of points in $\Gamma$ that grow almost linearly in a certain sense described below.

Admissible functions
Admissible functions will be used to measure the deviation of a sequence from being linear. We call a function ϕ : In particular the function ϕ is summable against dt/t 2 .
For example, the function $\varphi(t) := t^{\alpha}$ is admissible for any $0 \le \alpha < 1$. Any admissible function is necessarily sub-linear, that is, $\varphi(t)/t \to 0$ as $t \to \infty$. A linear combination of admissible functions with non-negative coefficients is also admissible.
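As a quick numerical illustration of the power-function example, the sketch below checks, for the sample exponent $\alpha = 1/2$ (our choice), that $\varphi(t)/t$ decreases to zero and that the integral of $\varphi(t)/t^2$ over $[1, T]$ stays bounded as $T$ grows. The closed form $(1 - T^{\alpha - 1})/(1 - \alpha)$ used here is elementary calculus and not a formula from the text; the function names are hypothetical.

```python
def phi(t, alpha):
    """The candidate admissible function t**alpha."""
    return t ** alpha

def integral_closed_form(alpha, T):
    """Exact value of the integral of t**alpha / t**2 over [1, T], valid for alpha < 1."""
    return (1.0 - T ** (alpha - 1.0)) / (1.0 - alpha)

alpha = 0.5

# Sub-linearity: phi(t)/t decreases towards zero.
ratios = [phi(t, alpha) / t for t in (1.0, 10.0, 100.0, 1e4, 1e6)]

# Summability against dt/t^2: the partial integrals increase to 1/(1 - alpha).
partial = [integral_closed_form(alpha, T) for T in (10.0, 1e3, 1e6, 1e12)]
limit = 1.0 / (1.0 - alpha)

print(ratios)
print(partial, limit)
```

For $\alpha = 1$ the same integral diverges logarithmically, which is consistent with the restriction $\alpha < 1$.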

Lemma 2.3
Let ϕ be a positive admissible function. Then for any α ≥ 0 and λ ≥ 1 there is C > 1 such that for any t ≥ 1 Proof From positivity and monotonicity of ϕ we have On the other hand from monotonicity of the function ϕ(t)/t it follows that for any λ, t ≥ 1 Adding the two inequalities above we obtain the conclusion of the lemma.

Quasi-linear sequences
Let $(\Gamma, +, d)$ be a metric Abelian monoid and $\varphi$ be an admissible function. A sequence $\bar\gamma = \{\gamma(i)\} \in \Gamma^{\mathbb{N}_0}$ will be called quasi-linear with defect bounded by $\varphi$ if for every $m, n \in \mathbb{N}$ the following bound is satisfied:
$$d\big(\gamma(m+n),\, \gamma(m) + \gamma(n)\big) \le \varphi(m+n). \tag{2.2}$$
For technical reasons we also require $\gamma(0) = 0$. Sequences that are quasi-linear with defect bounded by $\varphi \equiv 0$ will be called linear sequences. We will often need the following corollary of quasi-linearity, which follows from applying the bound (2.2) twice and using the monotonicity of $\varphi$: for all $m, n, k \in \mathbb{N}$
$$d\big(\gamma(m+n+k),\, \gamma(m) + \gamma(n) + \gamma(k)\big) \le 2\varphi(m+n+k). \tag{2.3}$$
For an admissible function $\varphi$ we will write $\mathrm{QL}_\varphi(\Gamma, d)$ for the space of all quasi-linear sequences with defect bounded by $C \cdot \varphi$ for some constant $C \ge 0$ (depending on the sequence). We will also use the notation $L(\Gamma, d) := \mathrm{QL}_0(\Gamma, d)$ for the space of linear sequences.
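The definition can be tested on a simple example. The sketch below assumes the monoid $(\mathbb{R}_{\ge 0}, +)$ with distance $|a - b|$, the sequence $\gamma(n) = n + \sqrt{n}$ and the admissible function $\varphi(t) = \sqrt{t}$; all three are illustrative choices of ours. It checks the two-term defect bound and the three-term bound obtained by applying the two-term bound twice, where the constant 2 comes from our own application of the triangle inequality and the monotonicity of $\varphi$.

```python
import math

def gamma(n):
    """Illustrative sequence n + sqrt(n); note that gamma(0) = 0."""
    return n + math.sqrt(n)

def phi(t):
    """Illustrative admissible function t**(1/2)."""
    return math.sqrt(t)

def d(a, b):
    """Distance on the monoid (R_{>=0}, +)."""
    return abs(a - b)

R = range(1, 40)

# Two-term defect bound: d(gamma(m+n), gamma(m) + gamma(n)) <= phi(m+n).
two_term_ok = all(d(gamma(m + n), gamma(m) + gamma(n)) <= phi(m + n)
                  for m in R for n in R)

# Three-term bound with constant 2, a consequence of using the two-term bound twice.
three_term_ok = all(
    d(gamma(m + n + k), gamma(m) + gamma(n) + gamma(k)) <= 2 * phi(m + n + k)
    for m in R for n in R for k in R)

print(two_term_ok, three_term_ok)
```

The sequence is not linear, since its defect $\sqrt{m} + \sqrt{n} - \sqrt{m+n}$ is positive, yet it is asymptotically equivalent to the linear sequence $n \mapsto n$ because $\sqrt{n}/n \to 0$.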

Quasi-homogeneity
We will show that quasi-linear sequences are also quasi-homogeneous in the sense of the following lemma.

Lemma 2.4 Let $\bar\gamma \in \Gamma^{\mathbb{N}_0}$ be a sequence with $\varphi$-bounded defect. Then for any $m, n$
Proof Define the function ψ : R ≥0 → R related to ϕ as follows The conclusion of the lemma in terms of ψ then reads and it is in that form it will be proven below.
Due to the monotonicity properties of $\varphi$, the function $\psi$ satisfies, for all $0 \le s_0 \le s$
We proceed by induction with respect to $m$, keeping $n$ fixed. The conclusion of the lemma is obvious for $m = 1$. For the induction step let $m = 2m' + \varepsilon \ge 2$, where $m' = \lfloor m/2 \rfloor$ and $\varepsilon \in \{0, 1\}$. We first use the bound in (2.3) to estimate
Next, we continue the estimate using bound (2.4)
Applying bound (3) in the definition of admissible functions, we obtain the following corollary.

The semi-module structure
The binary operation $+$ on $\Gamma$ induces a $\hat d$-continuous (in fact, 1-Lipschitz) binary operation on $\mathrm{QL}_\varphi(\Gamma, d)$ by adding sequences element-wise. Thus $(\mathrm{QL}_\varphi(\Gamma, d), +, \hat d)$ is also a metric Abelian monoid. In addition, if $\varphi$ is positive, it carries the structure of an $\mathbb{R}_{\ge 0}$-semi-module, as explained below.
If $\varphi > 0$ is a positive admissible function, the set $\mathrm{QL}_\varphi(\Gamma, d)$ admits an action of the multiplicative semigroup $(\mathbb{R}_{\ge 0}, \cdot)$ defined in the following way. Let $\lambda \in \mathbb{R}_{\ge 0}$ and $\bar\gamma = \{\gamma(n)\} \in \mathrm{QL}_\varphi(\Gamma, d)$. Then define the action of $\lambda$ on $\bar\gamma$ by
To show that $\bar\gamma' := \lambda \cdot \bar\gamma$ belongs to $\mathrm{QL}_\varphi(\Gamma, d)$ we bound its defect as follows. Let $m, n \in \mathbb{N}_0$, and define
In the computation below we assume that $\lambda \ge 1$. For $\lambda \in [0, 1]$ the computation is similar, but simpler. We estimate
The first inequality above is the bound (2.3) and the last inequality is obtained by applying Lemma 2.3.
The action defined above is only an action up to asymptotic equivalence. Similarly, in the constructions that follow we are tacitly assuming they are valid up to asymptotic equivalence.
The action and therefore it is continuous with respect to d.

Completeness
Here, we introduce additional conditions on a metric Abelian monoid $(\Gamma, +, d)$ that guarantee that $(\mathrm{QL}_\varphi(\Gamma, d), \hat d)$ is a complete metric space. Suppose $\varphi$ is an admissible function and $(\Gamma, +, d)$ is a metric Abelian monoid satisfying the following additional property: there exists a constant $C > 0$ such that for any quasi-linear sequence $\bar\gamma \in \mathrm{QL}_\varphi(\Gamma, d)$ there exists an asymptotically equivalent quasi-linear sequence $\bar\gamma'$ with defect bounded by $C\varphi$. Note that, contrary to the situation in the definition of $\mathrm{QL}_\varphi(\Gamma, d)$, the constant $C$ is now not allowed to depend on the sequence. If this is the case, we say that $\mathrm{QL}_\varphi(\Gamma, d)$ has the ($C$-)uniformly bounded defect property.
Proposition 2.6 Suppose a metric Abelian monoid $(\Gamma, +, \delta)$ and an admissible function $\varphi > 0$ are such that $(\mathrm{QL}_\varphi(\Gamma, \delta), \hat\delta)$ has the uniformly bounded defect property and the distance function $\delta$ is homogeneous. Then the space $(\mathrm{QL}_\varphi(\Gamma, \delta), \hat\delta)$ is complete.
Proof Given a Cauchy sequence $\{\bar\gamma_i\}$ of elements in $(\mathrm{QL}_\varphi(\Gamma, \delta), \hat\delta)$ we need to find a limit element $\bar\eta \in \mathrm{QL}_\varphi(\Gamma, \delta)$. We will construct $\bar\eta$ by a diagonal argument. First we replace each element of the sequence $\{\bar\gamma_i\}$ by an asymptotically equivalent element with defect bounded by $C\varphi$, according to the assumption of the proposition. We will still call the new sequence $\{\bar\gamma_i\}$. In fact, we may without loss of generality assume that $C = 1$.
We begin by establishing a bound on the divergence of the tails of the sequences $\bar\gamma_i$ and $\bar\gamma_j$. By homogeneity of $\delta$, the triangle inequality and Corollary 2.5, it holds for any $n, k \in \mathbb{N}$ that
Dividing by $k$ and passing to the limit $k \to \infty$, while keeping $n$ fixed, we obtain
Since the sequence $(\bar\gamma_i)_{i \in \mathbb{N}_0}$ is Cauchy, it follows that for any $n \in \mathbb{N}$ there is a number $i(n) \in \mathbb{N}$ such that for any $i, j \ge i(n)$ holds
Then for any $i, j, n \in \mathbb{N}$ with $i, j \ge i(n)$ we have the following bound
Now we are ready to define the limiting sequence $\bar\eta$ by setting
First we verify that $\bar\eta$ is quasi-linear. For $m, n \in \mathbb{N}$, we have
for some constant $C > 0$. The convergence of $\bar\gamma_i$ to $\bar\eta$ is shown as follows. For $n, k \in \mathbb{N}$ let $q_n, r_n \in \mathbb{N}_0$ be the quotient and the remainder of the division of $n$ by $k$, that is, $n = q_n \cdot k + r_n$ and $0 \le r_n < k$. Fix $k \in \mathbb{N}$ and let $i \ge i(k)$, then
Since $k \in \mathbb{N}$ is arbitrary and $\varphi$ is sub-linear we have
$$\lim_{i \to \infty} \hat\delta(\bar\gamma_i, \bar\eta) = 0.$$

On the density of linear sequences
For a metric Abelian monoid $(\Gamma, +, d)$ together with an admissible function $\varphi$ we say that $\mathrm{QL}_\varphi(\Gamma, d)$ has the vanishing defect property if for every $\varepsilon > 0$ and for every $\bar\gamma \in \mathrm{QL}_\varphi(\Gamma, d)$ there exists an asymptotically equivalent quasi-linear sequence $\bar\gamma'$ with defect bounded by another admissible function $\psi$ such that
Proof Let $\bar\gamma = \{\gamma(n)\}$ be a quasi-linear sequence. For any $i \in \mathbb{N}$ select a sequence $\bar\gamma_i$ asymptotically equivalent to $\bar\gamma$ with defect bounded by an admissible function $\varphi_i$ such that
where we used Lemma 2.4 in the first inequality. Thus, any quasi-linear sequence can be approximated by linear sequences.

Asymptotic distance on original monoid
Starting with an element $\gamma \in \Gamma$ one can construct a linear sequence $\{i \cdot \gamma\}_{i \in \mathbb{N}_0}$. In view of Proposition 2.1, the resulting map from $\Gamma$ to the space of linear sequences is a contraction.
By virtue of the bound $\delta \le d$, sequences that are quasi-linear with respect to $d$ are also quasi-linear with respect to $\delta$. Since $\delta$ is scale-invariant, the associated asymptotic distance $\hat\delta$ coincides with $\delta$ on $\Gamma$. We will show (in Lemma 2.8 below) that $\delta$ also coincides with $\hat d$ on $d$-quasi-linear sequences.
Let $\varphi$ be an admissible function. In order to organize all these statements, and to be more precise, let us include the spaces in the commutative diagram (2.10). The maps $f$, $f'$ and $\imath_1$ are isometries. The maps $j_1$ and $j_2$ are isometric embeddings. The next lemmas show that $\imath_2$ is also an isometric embedding, and that it has dense image.

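Lemma 2.8 Let $\varphi$ be a positive, admissible function. Then the natural inclusion
$$\imath_2 : \big(\mathrm{QL}_\varphi(\Gamma, d), \hat d\big) \to \big(\mathrm{QL}_\varphi(\Gamma, \delta), \hat\delta\big)$$
is an isometric embedding with dense image.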
Proof First we show that the map $\imath_2$ is an isometric embedding. Let $\bar\gamma_1, \bar\gamma_2 \in \mathrm{QL}_\varphi(\Gamma, d)$ be two $\varphi$-quasi-linear sequences with respect to the distance function $d$. We have to show that the two numbers $\hat d(\bar\gamma_1, \bar\gamma_2)$ and $\hat\delta(\bar\gamma_1, \bar\gamma_2)$ are equal. Since shifts are non-expanding maps, we have $\delta \le d$ and it follows immediately that $\hat\delta(\bar\gamma_1, \bar\gamma_2) \le \hat d(\bar\gamma_1, \bar\gamma_2)$, so we are left to show the opposite inequality. We will do it as follows. Fix $n > 0$, then
Passing to the limit with respect to $n$ gives the required inequality $\hat d(\bar\gamma_1, \bar\gamma_2) \le \hat\delta(\bar\gamma_1, \bar\gamma_2)$.
Now we will show that the image of $\imath_2$ is dense. Given an element $\bar\gamma = \{\gamma(n)\}$ in $\mathrm{QL}_\varphi(\Gamma, \delta)$ we have to find a $\hat\delta$-approximating sequence $\bar\gamma_i$
We have to show that each $\bar\gamma_i$ is $d$-quasi-linear and that $\hat\delta(\bar\gamma_i, \bar\gamma) \to 0$ as $i \to \infty$. These statements follow from
for some $C_i > 0$. It is worth noting that the defect of $\bar\gamma_i$ may not be bounded uniformly with respect to $i$. Finally, it holds that
The difference between the distance functions $\hat d$, $\delta$ and $\hat\delta$ is very small: $\hat d$ and $\delta$ are defined on a dense subset of the domain of definition of $\hat\delta$, and they coincide whenever both are defined. From now on we will write $d$ for the original distance function and $\delta$ for the asymptotic metric on both the monoid and its tropicalization.

Grothendieck construction
Given an Abelian monoid with the cancellation property, there is a minimal Abelian group (called the Grothendieck Group of the monoid), into which it isomorphically embeds. Similarly, an R ≥0 -semi-module naturally embeds into a normed vector space.
A nice example of this construction applied to the semi-module of convex sets in R n (with the Minkowski sum and the Hausdorff distance) can be found in [12]. If δ is a proper pseudo-metric (not a metric), then the map f is not injective.
Proof By Proposition 2.2 the pseudo-metric $\delta$ is translation invariant. We can therefore apply the Grothendieck construction to define a normed vector space $B_0$: Define
Define also addition, multiplication by a scalar and a norm on $B_0$ by setting for all $x, y, x', y' \in \Gamma$ and $\lambda \in \mathbb{R}$
$$\|(x, y)\| := \delta(x, y).$$
These operations respect the equivalence relation and turn $(B_0, +, \cdot, \|\cdot\|)$ into a normed vector space. The map $f$ defined by $x \mapsto (x, 0)$ is a well-defined distance-preserving homomorphism.
That $f(\Gamma)$ is closed follows immediately, as $\Gamma$ is complete and $f$ is distance-preserving.
In general, the space B 0 is not complete. We define the Banach space B as the completion of the normed vector space B 0 .
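The following minimal sketch illustrates the Grothendieck construction on a toy example. It assumes the monoid $\mathbb{N}_0^2$ with the $\ell^1$-distance and uses the standard equivalence relation on formal differences, $(x, y) \sim (x', y')$ if $x + y'$ and $x' + y$ are at distance zero; this relation and all function names are our assumptions, chosen to match the usual textbook construction rather than taken from the text.

```python
def delta(x, y):
    """l1 distance on the monoid N_0^2 (an illustrative homogeneous metric Abelian monoid)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def add(x, y):
    """Monoid operation: component-wise addition."""
    return tuple(a + b for a, b in zip(x, y))

def equivalent(p, q):
    """(x, y) ~ (x', y') iff x + y' and x' + y are at distance zero (assumed relation)."""
    (x, y), (x2, y2) = p, q
    return delta(add(x, y2), add(x2, y)) == 0

def pair_add(p, q):
    """Addition of formal differences."""
    return (add(p[0], q[0]), add(p[1], q[1]))

def neg(p):
    """Negation swaps the two components of a formal difference."""
    return (p[1], p[0])

def norm(p):
    """Norm of the formal difference x - y, defined as delta(x, y)."""
    return delta(p[0], p[1])

def embed(x):
    """The map f: x -> (x, 0)."""
    return (x, tuple(0 for _ in x))

p = ((3, 1), (0, 2))
q = ((4, 1), (1, 2))   # represents the same formal difference as p
x, y = (5, 2), (1, 7)

print(equivalent(p, q), norm(p), norm(q))
print(norm(pair_add(embed(x), neg(embed(y)))), delta(x, y))
```

The last line checks that the embedding preserves distances: the norm of $f(x) - f(y)$ equals $\delta(x, y)$. Translation invariance of $\delta$ is what makes the norm independent of the chosen representative.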

Diagrams of probability spaces
We will now briefly describe the construction of diagrams of probability spaces, see [9] for a more detailed discussion. By a finite probability space we will mean a set (not necessarily finite) with a probability measure, such that the support of the measure is finite. For such a probability space $X$ we denote by $|X|$ the cardinality of the support of the probability measure, and the expression $x \in X$ will mean that $x$ is an atom in $X$, that is, a point of positive weight in the underlying set.
We will consider commutative diagrams of finite probability spaces, where arrows are equivalence classes of measure-preserving maps. Two maps are considered equivalent if they coincide on a set of full measure and such equivalence classes will be called reductions.
Three examples of diagrams of probability spaces are pictured in (1.1). The combinatorial structure of such a commutative diagram can be recorded by an object G, which could be equivalently considered as a special type of category, a finite poset, or a directed acyclic graph (DAG) with additional properties. We will call such objects simply indexing categories. Below we briefly recall the definition.
An indexing category is a finite category such that for any pair of objects there exists at most one morphism between them in either direction, and such that it satisfies the following property. For any pair of objects i, j in an indexing category G there exists a least common ancestor, i.e. an object k such that there are morphisms k → i and k → j in G and such that for any other object l admitting morphisms l → i and l → j, there is also a morphism l → k.
By $[\![G]\!]$ we denote the number of objects in the indexing category, or equivalently the number of vertices in the DAG or the number of points in the poset $G$. An important class of examples of indexing categories is formed by so-called full categories $\Lambda_n$, which correspond to the poset of non-empty subsets of a set $\{1, \dots, n\}$ ordered by coinclusion. If $n = 2$, we call the category a fan. We refer to the objects $O_1$ and $O_2$ as the feet of the fan and to $O_{12}$ as the initial object. We use the same terminology for the spaces in a diagram indexed by $\Lambda_2$.
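The defining property of an indexing category can be verified mechanically for the full categories. The sketch below assumes the convention that a morphism $I \to J$ exists exactly when $J \subseteq I$, which we infer from the fact that $O_{12}$ is the initial object of the fan; the function names are hypothetical. It enumerates the objects and checks that the least common ancestor of $I$ and $J$ is the union $I \cup J$.

```python
from itertools import combinations

def full_category(n):
    """Objects of the full category: all non-empty subsets of {1, ..., n}."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(1, n + 1) for c in combinations(items, r)]

def has_morphism(I, J):
    """Assumed convention: a morphism I -> J exists iff J is a subset of I."""
    return J <= I

def least_common_ancestor(objects, I, J):
    """Object K with K -> I and K -> J such that every common ancestor L has L -> K."""
    ancestors = [K for K in objects if has_morphism(K, I) and has_morphism(K, J)]
    for K in ancestors:
        if all(has_morphism(L, K) for L in ancestors):
            return K
    return None

objects = full_category(3)
print(len(objects))  # number of objects of the full category on three elements
print(all(least_common_ancestor(objects, I, J) == I | J
          for I in objects for J in objects))
```

The number of objects is $2^n - 1$, which matches the dimension of the space of entropy vectors mentioned in the introduction.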
The space of all commutative diagrams of a fixed combinatorial type $G$ will be denoted $\mathrm{Prob}\langle G \rangle$. A morphism between two diagrams $X, Y \in \mathrm{Prob}\langle G \rangle$ is, by definition, a natural transformation between the functors $X$ and $Y$. Essentially, it is a collection of morphisms between corresponding individual spaces in $X$ and $Y$ that commute with the morphisms within the diagrams $X$ and $Y$. We call such morphisms reductions of diagrams.
The construction of forming commutative diagrams could be iterated, producing diagrams of diagrams. Especially important will be two-fans of G-diagrams, the space of which will be denoted Prob G 2 . A two-fan X will be called minimal, if for any morphism of X to another two-fan Y, the following holds: if the induced morphisms on the feet are isomorphisms, then the top morphism is also an isomorphism. Any G-diagram will be called minimal if for any sub-diagram, which is a two-fan, it contains a minimal two-fan with the same feet.
Given an $n$-tuple $(X_1, \dots, X_n)$ of finite-valued random variables, one can construct a minimal $\Lambda_n$-diagram $X = \{X_I; \chi_{IJ}\}$ by setting for any $\emptyset \ne I \subset \{1, \dots, n\}$
where $X_i$ is the target space of the random variable $X_i$, and the probabilities are the induced distributions. For the diagram constructed in such a way we will write $X = \langle X_1, \dots, X_n \rangle$. On the other hand, any $\Lambda_n$-diagram gives rise to an $n$-tuple of random variables, with the domain of definition being the initial space and the targets being the spaces indexed by one-point sets.
The constant diagram $X^G$ is the $G$-diagram in which all the spaces are isomorphic to a single probability space $X$ and all the morphisms are identity maps. In particular, we denote by $\{\bullet\}^G$ the $G$-diagram consisting entirely of one-point spaces.
The tensor product $X \otimes Y$ of two $G$-diagrams is defined by taking the tensor product of corresponding probability spaces and the Cartesian product of maps. The diagram $\{\bullet\}^G$ is a unit with respect to the tensor product. Certain care should be exercised here, since the associativity, commutativity and unity of $\{\bullet\}^G$ for the tensor product only hold up to isomorphism.
For a diagram $X \in \mathrm{Prob}\langle G \rangle$ one can evaluate entropies of the individual spaces. The corresponding map will be denoted
$$\mathrm{Ent}_* : \mathrm{Prob}\langle G \rangle \to \mathbb{R}^G,$$
where the target space is the space of $\mathbb{R}$-valued functions on objects in $G$, equipped with the $\ell^1$-norm.
For a two-fan
We interpret $\mathrm{kd}(F)$ as a measure of deviation of $F$ from being an isomorphism between the diagrams $X$ and $Y$. Indeed, $\mathrm{kd}(F) = 0$ if and only if the two morphisms in $F$ are isomorphisms, see [9]. We define the intrinsic entropy distance $k$ on the space $\mathrm{Prob}\langle G \rangle$ by
Note that according to the definitions used in this article, any indexing category must have an initial object. In [9] such indexing categories and diagrams indexed by such categories were called complete. Therefore, purely by a change of names, results that in [9] were said to hold for complete indexing categories hold for the indexing categories of this article.
The tensor product is 1-Lipschitz with respect to $k$, thus $(\mathrm{Prob}\langle G \rangle, \otimes, k)$ is a metric Abelian monoid and $\mathrm{Ent}_* : (\mathrm{Prob}\langle G \rangle, \otimes, k) \to (\mathbb{R}^G, \|\cdot\|_1)$ is a 1-Lipschitz homomorphism. For proofs and a more detailed discussion the reader is referred to [9].
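The homomorphism property of the entropy map can be illustrated numerically for fans. The sketch below is a minimal example under our own assumptions: a fan $X \leftarrow XY \rightarrow Y$ is represented by a joint distribution stored as a dictionary, entropies are computed with the natural logarithm, and the names `ent_star` and `tensor` are hypothetical. It checks that the entropies of the three spaces add up under the tensor product.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy (natural logarithm) of a finite distribution given as a dict."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def marginal(joint, axis):
    """Push the joint distribution forward to one of its two coordinates."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)

def ent_star(joint):
    """Entropies of the three spaces of the fan X <- XY -> Y."""
    return (entropy(joint), entropy(marginal(joint, 0)), entropy(marginal(joint, 1)))

def tensor(j1, j2):
    """Tensor product of two fans: independent product of the joint distributions."""
    return {((x1, x2), (y1, y2)): p * q
            for (x1, y1), p in j1.items() for (x2, y2), q in j2.items()}

A = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}
B = {(0, 0): 0.1, (1, 0): 0.2, (1, 1): 0.7}

lhs = ent_star(tensor(A, B))
rhs = tuple(a + b for a, b in zip(ent_star(A), ent_star(B)))
print(lhs)
print(rhs)
```

The vector returned by `ent_star` is an entropy vector of a pair of random variables in the sense of the introduction, so additivity shows in particular that the set of such vectors is closed under addition.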

Tropical diagrams
In this section we apply the general construction in Sects. 2 and 3 to the metric Abelian monoid (Prob G , ⊗, · , κ).
We define the asymptotic distance on Prob G by One of the main tools for the estimation of the (asymptotic) distance is the so-called Slicing Lemma, [9, Proposition 3.9]. We will only need its corollary that we formulate below in Proposition 4.1. For a diagram X, a space U in it and an atom u ∈ U , we may form a conditioned diagram X|u by conditioning all the spaces in X on u. (1) Let X → U G be a reduction, then The statements and the proofs of the Slicing Lemma and its consequences can be found in [9].
We will show below that (Prob G , ⊗, κ) has the uniformly bounded and vanishing defect properties. For this purpose we need to develop some technical tools.

Mixtures
The input data for the mixture operation is a family of G-diagrams, parameterized by a probability space. As a result one obtains another G-diagram with pre-specified conditionals. One particular instance of a mixture is when one mixes two diagrams X and {•} G , the latter being a constant G-diagram of one-point probability spaces. This operation will be used as a substitute for taking radicals "X

Definition of mixtures
Let $G$ be an indexing category and $\Theta$ be a probability space. By $\Theta^G$ we denote the constant $G$-diagram, that is, the diagram such that all spaces in it are $\Theta$ and all morphisms are identity morphisms. Let $\{X_\theta\}_{\theta \in \Theta}$ be a family of $G$-diagrams parameterized by $\Theta$. The mixture of the family $\{X_\theta\}$ is the reduction
The mixture exists and is uniquely defined by property (4.1) up to an isomorphism which is the identity on $\Theta^G$.
We denote the top diagram Y of the mixture by is a binary space we write simply for the mixture. The diagram subindexed by the will always be the first summand. The entropy of the mixture can be evaluated by the following formula Mixtures satisfy the distributive law with respect to the tensor product

The distance estimates for the mixtures
The mixture of a $G$-diagram with the constant diagram of one-point spaces $\{\bullet\}^G$ may serve as a substitute for taking radicals of the diagram. The following lemma justifies this with some distance estimates related to mixtures; it will be used below.
Lemma 4.2 Let G be a complete indexing category and X, Y ∈ Prob G . Then Note that the distance estimates in the lemma above are with respect to the asymptotic distance. This is essential, since from the perspective of the intrinsic distance mixtures are very badly behaved.
Proof For λ ∈ B N 1/n , define q(λ) to be the number of black squares in the sequence λ. It is a binomially distributed random variable with mean N /n and variance N n (1 − 1 n ). The first claim is then proven by the following calculation where we used Proposition 4.1(1) for the inequality on the third line above, and the following estimate: for any diagram A and integers 0 ≤ m ≤ n The second claim is proven similarly and the third follows from the second and the 1-Lipschitz property of the tensor product: Finally, the fourth follows from Proposition 4.1(2), by slicing both arguments along B 1/n . We estimate the asymptotic distance between individual members of sequencesX and Y using Lemma 4.2 and Corollary 2.5 as follows

Vanishing defect property and completeness of the tropical cone
Thusκ(X,Ȳ) = 0 and the two sequences are asymptotically equivalent. Next we show that the sequenceȲ is κ-quasi-linear and evaluate its defect, also using Lemma 4.2.
Corollary 4.4 For any indexing category G and for the admissible function ϕ given by , κ) has the uniformly bounded and vanishing defect properties.
Proof LetX ∈ QL ϕ (Prob G , κ). By Lemma 4.3 there exists an asymptotically equivalent sequenceȲ with defect bounded by ϕ k defined by Hence there exists a sequence c k → 0 such that for all t ≥ 1, showing the uniformly bounded and vanishing defect property.

Diagrams of tropical probability spaces
By applying the general setup of the previous section to the metric Abelian monoids $(\mathrm{Prob}\langle G \rangle, \otimes, k)$ and $(\mathrm{Prob}\langle G \rangle, \otimes, \kappa)$ and using Corollary 4.4, we obtain the following theorem.

Theorem 4.5 Fix an admissible function ϕ and consider the commutative diagram
Then the following statements hold:
1. The maps $f$, $f'$, $\imath_1$ are isometries.
2. The maps $\imath_2$, $j_1$, $j_2$ are isometric embeddings and each map has a dense image in the corresponding target space.
3. The space in the lower-right corner, $\big(\mathrm{QL}_\varphi(\mathrm{Prob}\langle G \rangle, \kappa), \hat\kappa\big)$, is complete.
We would like to conjecture that all maps in the diagram above are isometries.
Since $\mathrm{QL}_\varphi(\mathrm{Prob}\langle G \rangle, \kappa)$ is complete and has $L(\mathrm{Prob}\langle G \rangle, \kappa)$ as a dense subset for any $\varphi > 0$, it follows that $\mathrm{QL}_\varphi(\mathrm{Prob}\langle G \rangle, \kappa)$ does not depend (up to isometry of pseudo-metric spaces) on the choice of admissible $\varphi > 0$. From now on we will choose the particular function $\varphi(t) := t^{3/4}$. The reason for this choice will become clear when we formulate the Asymptotic Equipartition Property for diagrams. We may finally define the space of tropical $G$-diagrams as the space in the lower-right corner of the diagram,
$$\mathrm{Prob}[G] := \big(\mathrm{QL}_\varphi(\mathrm{Prob}\langle G \rangle, \kappa), \otimes, \cdot, \hat\kappa\big).$$
By Theorem 4.5 above, this space is complete.
The entropy function Ent * : Prob G → R G extends to a linear functional

Homogeneous diagrams
A $G$-diagram $X$ is called homogeneous if the automorphism group $\mathrm{Aut}(X)$ acts transitively on every space in $X$. Homogeneous probability spaces are uniform. For more complex indexing categories this simple description is not sufficient. The subcategory of all homogeneous $G$-diagrams will be denoted $\mathrm{Prob}\langle G \rangle_h$. This space is invariant under the tensor product, thus it is a metric Abelian monoid.

Universal construction of homogeneous diagrams
Examples of homogeneous diagrams can be constructed in the following manner. Fix a finite group $G$ and consider a $\mathbf{G}$-diagram $\{G_i; \alpha_{ij}\}_{i \in \mathbf{G}}$ of (not necessarily normal) subgroups of $G$, where the morphisms $\alpha_{ij}$ are inclusions. The $\mathbf{G}$-diagram of probability spaces $\{X_i; f_{ij}\}$ is constructed by setting $X_i = (G/G_i, \mathrm{unif})$, where $G/G_i$ denotes the set of left cosets and unif is the uniform measure, and taking $f_{ij}$ to be the natural projection $G/G_i \to G/G_j$ whenever $G_i \subset G_j$. The resulting diagram $X$ is minimal if and only if for any $i, j \in \mathbf{G}$ there is $k \in \mathbf{G}$ such that $G_k = G_i \cap G_j$. In fact, any complete homogeneous diagram arises this way, according to the following argument from [9], although the representation of homogeneous diagrams by diagrams of subgroups is highly non-unique.
Indeed, let $X = \{X_i; \chi_{ij}\}$ be a homogeneous $\mathbf{G}$-diagram of probability spaces such that $X_0$ is the initial space in $X$. Then $\operatorname{Aut}(X)$ acts transitively on every space $X_i$ in $X$. Let $x_0 \in X_0$ be an atom and set $x_i := \chi_{0i}\, x_0$. Define $G := \operatorname{Aut}(X)$ to be the full automorphism group and $G_i := \operatorname{Stab}(x_i)$ to be the stabilizer of the point $x_i$ under the action of $\operatorname{Aut}(X)$ on $X_i$. The spaces $X_i$ can be naturally identified with $G/G_i$. Note that $x_i$ is the image of $x_j$ under the equivariant map $\chi_{ji}$ whenever this map is present in the diagram $X$. Thus, if the morphism $\chi_{ji}$ is present in $X$, any automorphism fixing $x_j$ also fixes $x_i$, so that $G_j \subset G_i$ and there is a natural surjection $G/G_j \to G/G_i$. Under the identifications $G/G_j \cong X_j$ and $G/G_i \cong X_i$, this surjection coincides with $\chi_{ji}$ due to the equivariance of $\chi_{ji}$.
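The coset construction can be checked by hand on a small group. The following is a minimal sketch, assuming the symmetric group on three letters and the chain of subgroups trivial ⊂ H ⊂ S3 with H generated by a transposition; the helper names (`compose`, `left_cosets`, `projection`, `pushforward_uniform`, `is_transitive`) are hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_cosets(group, subgroup):
    """Return the set of left cosets g·H, each stored as a frozenset."""
    return {frozenset(compose(g, h) for h in subgroup) for g in group}

def projection(cosets_small, cosets_big):
    """Natural map G/G_i -> G/G_j for G_i ⊂ G_j: each coset goes to the coset containing it."""
    return {c: next(d for d in cosets_big if c <= d) for c in cosets_small}

def pushforward_uniform(proj):
    """Push the uniform measure on the domain of proj forward to its target."""
    weight = Fraction(1, len(proj))
    image = {}
    for target in proj.values():
        image[target] = image.get(target, 0) + weight
    return image

def is_transitive(group, cosets):
    """Check that left multiplication by the group moves one coset onto every other."""
    start = next(iter(cosets))
    orbit = {frozenset(compose(g, x) for x in start) for g in group}
    return orbit == cosets

S3 = list(permutations(range(3)))
identity = (0, 1, 2)
trivial = [identity]
H = [identity, (1, 0, 2)]        # generated by a transposition, not normal in S3

X0 = left_cosets(S3, trivial)    # initial space, 6 points
X1 = left_cosets(S3, H)          # 3 points
X2 = left_cosets(S3, S3)         # 1 point
f01 = projection(X0, X1)
f12 = projection(X1, X2)
```

Each projection sends the uniform measure to the uniform measure because all fibers have the same cardinality, which is what makes the maps reductions of probability spaces in this example.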

Asymptotic equipartition property
In [9] the following theorem is proven. […] where $C(|X_0|, [\![\mathbf{G}]\!])$ is a constant depending only on the cardinality $|X_0|$ of the initial space $X_0$ of $X$ and the number $[\![\mathbf{G}]\!]$ of objects in $\mathbf{G}$.
The Asymptotic Equipartition Property of Theorem 5.1 is a direct generalization of the classical Asymptotic Equipartition Property, which states that if $(X_i)$ is a sequence of independent, identically distributed random variables, then the random variables $-\frac{1}{n}\log p(X_1, \dots, X_n)$ converge in probability to the entropy of $X_1$ as $n \to \infty$. Indeed, in that case the approximating sequence $(H_n)$ corresponds to a sequence of uniform random variables $H_n$ with $\operatorname{Ent}(H_n)/n \to \operatorname{Ent}(X_1)$. Denote by $p(x, h)$ the optimal coupling achieving the distance on the left-hand side of (5.1). Then
$$\frac{1}{n}\int_{X^n \times H_n} |\log p(h|x)|\, dp(x, h) + \frac{1}{n}\int_{X^n \times H_n} |\log p(x|h)|\, dp(x, h) \le C(|X_0|, [\![\mathbf{G}]\!]) \cdot \sqrt{\frac{\ln^3 n}{n}},$$
which implies the classical Asymptotic Equipartition Property.
We define the space of tropical homogeneous diagrams by […] Then the Asymptotic Equipartition Property can be reformulated as follows.
Proof By Theorem 5.1, every linear sequence can be approximated by a homogeneous sequence. It follows from the bound (5.1) that the defect of the approximating homogeneous sequence is bounded by a constant times $\phi$, where $\phi(t) = t^{3/4}$. Moreover, the linear sequences are dense by Theorem 4.5. This finishes the proof.
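The classical Asymptotic Equipartition Property recalled above can be illustrated numerically. The following is a minimal sketch, assuming i.i.d. Bernoulli variables with parameter q; the function names and the tolerance `eps` are hypothetical choices made for illustration. Since the probability of a binary sequence depends only on its number of ones, the probability of the atypical set can be computed exactly from the binomial distribution (in log-space, to avoid overflow).

```python
from math import exp, lgamma, log

def entropy(q):
    """Entropy (in nats) of a Bernoulli(q) random variable, 0 < q < 1."""
    return -(q * log(q) + (1 - q) * log(1 - q))

def prob_atypical(q, n, eps):
    """P(| -(1/n) log p(X_1,...,X_n) - Ent(X_1) | > eps) for i.i.d. Bernoulli(q)."""
    h = entropy(q)
    total = 0.0
    for k in range(n + 1):
        # Every sequence with k ones has probability q^k (1-q)^(n-k).
        log_seq_prob = k * log(q) + (n - k) * log(1 - q)
        rate = -log_seq_prob / n
        if abs(rate - h) > eps:
            # log of the binomial coefficient C(n, k)
            log_count = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            total += exp(log_count + log_seq_prob)
    return total

# Probability of the atypical set for growing n (q = 0.3, eps = 0.05).
tail = {n: prob_atypical(0.3, n, 0.05) for n in (10, 100, 1000)}
```

The values in `tail` decrease as n grows, which is the convergence in probability asserted by the classical statement; the sketch says nothing about the rate in the bound (5.1).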

The tropical cone for probability spaces and chains
Although for general indexing categories $\mathbf{G}$ the space of tropical $\mathbf{G}$-diagrams will typically be infinite-dimensional, it has a very simple, finite-dimensional description if $\mathbf{G}$ consists of a single object, or if it is a special type of indexing category called a chain. The chain of length $k$, denoted by $C_k$, is the indexing category with $k$ objects $O_1, \dots, O_k$ and a morphism from $O_i$ to $O_j$ whenever $i \ge j$. A $C_k$-diagram of probability spaces is then a chain of reductions
$$X_k \to X_{k-1} \to \cdots \to X_1.$$
For chains we can describe the tropical cone explicitly.
Theorem 6.1 For $k \in \mathbb{N}$, the tropical cone $\operatorname{Prob}[C_k]$ is isomorphic to the following cone in $(\mathbb{R}^k, |\cdot|_1)$: […] In particular, the algebraic structure and the pseudo-distance are preserved under the isomorphism.
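To make the notion of a chain of reductions concrete, here is a minimal sketch that builds a chain $X_3 \to X_2 \to X_1$ by successive coarsening and computes its entropy vector. The distribution, the maps `f32` and `f21`, and the helper names are hypothetical. The sketch only illustrates the standard fact that entropy cannot increase under a reduction, so the entropies are non-negative and non-decreasing along the chain; it does not attempt to reproduce the cone of Theorem 6.1.

```python
from math import log

def pushforward(dist, mapping):
    """Push a finite probability distribution forward along a map given as a dict."""
    out = {}
    for point, mass in dist.items():
        out[mapping[point]] = out.get(mapping[point], 0.0) + mass
    return out

def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * log(p) for p in dist.values() if p > 0)

# Initial space X_3 and two reductions X_3 -> X_2 -> X_1.
X3 = {0: 0.4, 1: 0.1, 2: 0.2, 3: 0.1, 4: 0.1, 5: 0.1}
f32 = {0: "a", 1: "a", 2: "b", 3: "b", 4: "c", 5: "c"}
f21 = {"a": "u", "b": "u", "c": "v"}

X2 = pushforward(X3, f32)
X1 = pushforward(X2, f21)

# Entropy vector (Ent X_1, Ent X_2, Ent X_3) of the chain.
entropy_vector = [entropy(X1), entropy(X2), entropy(X3)]
```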
Recall that a homogeneous probability space is (isomorphic to) a probability space with a uniform distribution, and therefore its isomorphism class is completely determined by its cardinality, or equivalently by its entropy. A homogeneous chain has a very simple description as well: a chain is homogeneous if and only if the individual probability spaces are homogeneous, i.e. if and only if they are (isomorphic to) probability spaces with a uniform measure. Similarly, the isomorphism class of a homogeneous chain is completely determined by the cardinalities of the spaces contained in it. This allows us to construct a canonical model for any homogeneous chain.
We denote by $H_n$ the homogeneous probability space with the underlying set […]
Also note that the minimization does not depend on $N$, since we assumed it is a multiple of $\operatorname{lcm}(m, n)$. Now we can estimate
$$\operatorname{kd}(Z_{n,m}) \le 2\ln(n + m) - \ln n - \ln m \le 2\ln 2 + |\ln n - \ln m| = 2\ln 2 + |\operatorname{Ent} H_n - \operatorname{Ent} H_m|. \tag{6.3}$$
Lemma 6.2 Let $X, Y \in \operatorname{Prob}\langle C_k\rangle$ be two homogeneous chains of length $k$. Then
$$\mathbf{k}(X, Y) \le 2k \cdot \ln 2 + |\operatorname{Ent}_* X - \operatorname{Ent}_* Y|_1.$$
Proof Let $(n_i)$ and $(m_i)$ be the sequences of cardinalities of the spaces in $X$ and $Y$ respectively. Without loss of generality we may assume that both chains have the canonical form provided by (6.2). Let $N := \operatorname{lcm}(n_k, m_k)$. Then $n_i$ and $m_i$ are divisors of $N$ for all $1 \le i \le k$. Consider the two-fan of chains […] where $H = H^{C_k}_N$ and $l_i = f_{N,n_i}$, $r_i = f_{N,m_i}$ for $1 \le i \le k$. Due to transitivity (6.1) this is indeed a two-fan of chains. Its minimization is a chain of minimal fans […] Thus we can estimate
$$\mathbf{k}(X, Y) \le \sum_{i=1}^{k} \operatorname{kd}(Z_i) \le 2k \cdot \ln 2 + |\operatorname{Ent}_* X - \operatorname{Ent}_* Y|_1.$$
Corollary 6.3 […] is an isometric embedding.
To prove the surjectivity one constructs chains of probability spaces with prescribed entropies satisfying the inequalities defining the cone in the theorem. This is left to the reader.
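The estimate (6.3) can be checked numerically on small cases. The following is a rough sketch under two assumptions that are not visible in the text above: the reduction $f_{N,n}$ is taken to collapse $\{0, \dots, N-1\}$ into $n$ consecutive blocks of equal size, and $\operatorname{kd}$ of a two-fan $X \leftarrow Z \rightarrow Y$ is taken to be $2\operatorname{Ent} Z - \operatorname{Ent} X - \operatorname{Ent} Y$. All function names are hypothetical.

```python
from math import gcd, log

def lcm(a, b):
    return a * b // gcd(a, b)

def block_map(N, n):
    """Assumed model of f_{N,n}: collapse {0..N-1} into n consecutive blocks (needs n | N)."""
    assert N % n == 0
    return lambda x: x * n // N

def minimal_fan_top(N, n, m):
    """Distribution of the pair (l(x), r(x)) for x uniform on {0..N-1}."""
    l, r = block_map(N, n), block_map(N, m)
    counts = {}
    for x in range(N):
        key = (l(x), r(x))
        counts[key] = counts.get(key, 0) + 1
    return {key: c / N for key, c in counts.items()}

def kd(n, m, multiple=1):
    """Assumed formula 2 Ent(Z) - Ent(H_n) - Ent(H_m) for the minimized fan H_n <- Z -> H_m."""
    N = multiple * lcm(n, m)
    top = minimal_fan_top(N, n, m)
    ent_top = -sum(p * log(p) for p in top.values())
    return 2 * ent_top - log(n) - log(m)

def bounds(n, m):
    """The two upper bounds appearing in (6.3)."""
    first = 2 * log(n + m) - log(n) - log(m)
    second = 2 * log(2) + abs(log(n) - log(m))
    return first, second
```

Under these assumptions the top space of the minimized fan has at most $n + m - 1$ atoms, which is where the term $\ln(n + m)$ comes from, and the second inequality reduces to $(n + m)^2 \le 4 \max(n, m)^2$.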
On behalf of all authors, the corresponding author states that there is no conflict of interest.