The Entropy-Limit (Conjecture) for Σ2-Premisses

The application of the maximum entropy principle to determine probabilities on finite domains is well-understood. Its application to infinite domains still lacks a well-studied comprehensive approach. There are two different strategies for applying the maximum entropy principle on first-order predicate languages: (i) applying it to finite sublanguages and taking a limit; (ii) comparing finite entropies of probability functions defined on the language as a whole. The entropy-limit conjecture roughly says that these two strategies result in the same probabilities. While the conjecture is known to hold for monadic languages as well as for premiss sentences containing only existential or only universal quantifiers, its status for premiss sentences of greater quantifier complexity is, in general, unknown. I here show that the first approach fails to provide a sensible answer for some Σ2-premiss sentences. I discuss implications of this failure for the first strategy and consequences for the entropy-limit conjecture.

Inductive logic was popularised by Carnap [2-4]; his approach lives on today in the Pure Inductive Logic approach [9,10,22]. In this paper, I however pursue an alternative approach, due to Jaynes, which applies the Maximum Entropy Principle. The Maximum Entropy Principle compels rational agents to adopt a probability function which satisfies all the premisses. While this first step is rather uncontroversial, the principle has a second part which applies if there is more than one probability function satisfying all the premisses. In this case, the principle compels rational agents to adopt one of these probability functions with maximum entropy (hence the name of the principle). In this paper, I investigate consequences of accepting this principle.
The application of the Maximum Entropy Principle to finite propositional languages is "merely" a matter of computational complexity; it is well-understood and well-behaved [13,15,16,19-21]. Applications to infinite domains are another kettle of fish altogether, because it is not clear how to explicate the notion of entropy.
I here consider a first-order predicate language L with infinitely many constant symbols. Intriguing questions arise for the application of the Maximum Entropy Principle. Two, possibly conflicting, ways to apply the Maximum Entropy Principle to predicate languages have been put forward. It is not clear whether the resulting inductive logics agree. Furthermore, it is not clear which approach to prefer, if they do indeed differ.
The entropy-limit approach, due to Jeff Paris and his co-workers [1,17,18,23-26], proceeds as follows: (i) reinterpret the premisses as constraints on the probabilities of sentences of a finite predicate language L_n containing only the first n constants; (ii) determine the function P^n that maximises entropy on this finite language, subject to the constraints imposed by the reinterpreted premisses; (iii) draw inductive inferences using the function P∞ defined by P∞(θ) := lim_{n→∞} P^n(θ) for sentences θ of L, in case the limit exists.
The maximal-entropy approach, due to the objective Bayesian Jon Williamson [27-30], avoids a reinterpretation of premisses: (i) consider probability functions defined on the language L as a whole; (ii) deem one probability function P to have greater entropy than another function Q, if and only if P has greater entropy than Q on all but finitely many finite sublanguages L_n; (iii) draw inductive inferences using those functions P†, from all the probability functions on L that satisfy the premisses, that have maximal entropy (i.e., no other function satisfying the premisses has [in this sense] greater entropy).
The entropy-limit approach is constructive and allows for an (often much) simpler calculation of probabilities. The maximal-entropy approach determines probabilities in some cases in which the entropy-limit approach fails to determine probabilities. If these two approaches led to different inductive probabilities, then this would provide ammunition to opponents of the Maximum Entropy Principle. On the other hand, if it can be shown that both approaches agree (where they are both defined), then this would point towards a unique consistent interpretation of the Maximum Entropy Principle on infinite domains, providing further support for its application.
The consistency of these applications of the Maximum Entropy Principle to predicate languages with infinitely many constants has recently been conjectured:

Entropy-limit Conjecture. If P∞ exists and satisfies the constraints imposed by the premisses, then it is the unique function with maximal entropy from all those that satisfy the premisses, i.e., P† = P∞. [30, p. 191]

This paper is concerned with the case that all premisses are known with certainty, X_1 = ... = X_k = 1. This amounts to having a single premiss ϕ known with certainty, ϕ^1 = ϕ_1^1 ∧ ... ∧ ϕ_k^1. To simplify the notation, the superscripted 1 will be dropped.
There are some results which show that the conjecture is true for relatively simple premisses. For a quantifier-free premiss sentence, entropy maximisation reduces to entropy maximisation on a finite domain, where both approaches agree. Two recent results show that the conjecture holds for all satisfiable premiss sentences in Σ1 ∪ Π1 [14,25]. The conjecture is also known to hold for monadic languages [1].
The situation concerning premiss sentences containing existential and universal quantifiers is less well-understood. We know of satisfiable premiss sentences ϕ ∈ Σ2 such that the entropy maximiser P†_ϕ does not exist [24]. We also know of satisfiable premiss sentences ϕ ∈ Π2 such that the entropy limit P∞_ϕ does not exist [23]. As far as I am aware, this is all that is known; see Table 1 located at the end of the paper for an overview.
If there exists a premiss sentence ϕ ∈ Σ2 for which the entropy maximiser P†_ϕ does not exist but the entropy limit P∞_ϕ does exist, then the Entropy-limit Conjecture would be false.
In this paper, I show in Theorem 1 that for every polyadic language L there exists a satisfiable sentence ϕ ∈ Σ2 of L such that P∞_ϕ = P=, where P= is the (uniform) equivocator function defined in Example 2. In Theorem 2, I generalise this result to the existence of such sentences of all quantifier complexities greater than Σ2. In these cases, the maximal entropy function P†_ϕ does not exist [24, Section 2.2]. While this looks prima facie like conclusive evidence against the Entropy-limit Conjecture, I conclude that these cases are, despite appearances to the contrary, cases in which the entropy limit is not defined.
Since the Entropy-limit Conjecture only applies to cases in which the entropy limit is defined, the conjecture is not troubled by my results. The Entropy-limit Conjecture even emerges strengthened, since it applies to fewer cases, and can hence also fail in fewer cases.

The Formal Framework
The formal framework and the notation are taken from [14]. I now reproduce this framework with the simplification that the premiss here is a single sentence known with certainty rather than multiple premisses which may be uncertain.
Predicate languages. Throughout this paper, I consider first-order predicate languages L with countably many constant symbols t_1, t_2, ... and finitely many relation symbols U_1, ..., U_n. The atomic sentences, i.e., sentences of the form U_i t_{i_1} ... t_{i_k} where k is the arity of the relation symbol U_i, will be denoted by a_1, a_2, ..., ordered in such a way that atomic sentences involving only constants among t_1, ..., t_n occur before those atomic sentences that also involve t_{n+1}. The set of sentences of L is denoted by SL.
I also consider the finite sublanguages L_n of L, where L_n has only the first n constant symbols t_1, ..., t_n but the same relation symbols as L. L_n has finitely many atomic sentences a_1, ..., a_{r_n}. We call the state descriptions of L_n (i.e., the sentences of the form ±a_1 ∧ ... ∧ ±a_{r_n}) n-states. We let Ω_n be the set of n-states for each n. Note that |Ω_n| = 2^(r_n), and every n-state ω_n ∈ Ω_n has |Ω_{n+1}|/|Ω_n| = 2^(r_{n+1} − r_n)-many (n+1)-states ω_{n+1} which extend it (i.e., ω_{n+1} ⊨ ω_n). Denote the sentences of L_n by SL_n.
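To make these cardinalities concrete, here is a small Python sketch (an illustration of mine, not part of the formal framework) that enumerates the n-states of a language with a single binary relation symbol U and checks the counts just stated:

```python
from itertools import product

# n-states for a language with one binary relation symbol U: an n-state
# fixes a truth value for every atomic sentence U t_i t_j with
# 1 <= i, j <= n, so r_n = n^2 and |Omega_n| = 2^(n^2).
def n_states(n):
    atoms = [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)]
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

omega_2 = list(n_states(2))
omega_3 = list(n_states(3))
assert len(omega_2) == 2 ** 4   # |Omega_2| = 2^(r_2) with r_2 = 4
assert len(omega_3) == 2 ** 9   # |Omega_3| = 2^(r_3) with r_3 = 9

# every 2-state is extended by exactly 2^(r_3 - r_2) = 32 of the 3-states
w2 = omega_2[0]
extensions = [w3 for w3 in omega_3 if all(w3[a] == w2[a] for a in w2)]
assert len(extensions) == 2 ** (9 - 4)
```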
Reinterpretation. I use N to refer to the largest number n such that the constant t_n appears in ϕ ∈ SL. For a sentence ϕ ∈ SL and fixed n ≥ N, one can reinterpret ϕ as a sentence of L_n by substituting every subformula of the form ∃xθ(x) by θ(t_1) ∨ ... ∨ θ(t_n) and substituting every subformula of the form ∀xθ(x) by θ(t_1) ∧ ... ∧ θ(t_n). To deal with multiple quantifications over the same variable, substitutions begin at the innermost quantifiers and proceed outwards.
I write (ϕ)_n or, if there is no ambiguity, simply ϕ_n to denote this reinterpretation of ϕ, which thus becomes a quantifier-free sentence of L_n. For any sentence ϕ ∈ SL, I denote by [ϕ]_n the set of n-states that satisfy the reinterpretation ϕ_n. The number of n-states in [ϕ]_n is denoted by |ϕ|_n.
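As an illustration of the reinterpretation (a Python sketch of mine, with names of my own choosing), take ϕ = ∃x∀y Uxy: its reinterpretation ϕ_n is the quantifier-free disjunction, over candidate witnesses t_c, of the conjunctions U t_c t_1 ∧ ... ∧ U t_c t_n, and |ϕ|_n can be counted by brute force for small n:

```python
from itertools import product

# phi = ∃x∀y Uxy reinterpreted on L_n: the quantifier-free sentence
# phi_n = OR_{c=1..n} AND_{l=1..n} U t_c t_l.
def phi_n_holds(state, n):
    return any(all(state[(c, l)] for l in range(1, n + 1))
               for c in range(1, n + 1))

# |phi|_n: the number of n-states satisfying phi_n.
def count_phi_n(n):
    atoms = [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)]
    return sum(phi_n_holds(dict(zip(atoms, values)), n)
               for values in product([False, True], repeat=len(atoms)))

assert count_phi_n(2) == 7      # 2^4 - (2^2 - 1)^2
assert count_phi_n(3) == 169    # 2^9 - (2^3 - 1)^3
```

These brute-force counts agree with the closed-form count derived in the proof of Proposition 1 below.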

Probability.
A probability function P on L is a function P : SL → ℝ≥0 such that:

P1. If ⊨ θ, then P(θ) = 1.
P2. If ⊨ ¬(θ ∧ ψ), then P(θ ∨ ψ) = P(θ) + P(ψ).
P3. P(∃xθ(x)) = sup_m P(θ(t_1) ∨ ... ∨ θ(t_m)).

A probability function on L_n is defined similarly (the supremum in P3 is dropped and m is equal to n). We shall use the notation ℙ and ℙ_n to denote the set of all probability functions on L and L_n, respectively.
A probability function is determined by the values it gives to the n-states, for each n. Consequently, a probability function is determined by the values it gives to the quantifier-free sentences, a result known as Gaifman's Theorem [5].
The equivocator function P= is a probability function on L. It is called the equivocator function because it equivocates between n-states: it is the unique probability function that gives each n-state the same probability, P=(ω_n) = 1/|Ω_n| = 2^(−r_n), for each n. The restriction P=↾L_n of P= to L_n is a probability function on L_n, for any n. To simplify notation, I use P= to refer to these restrictions, as well as to the function on L itself.
Entropy. The n-entropy of a probability function P (which is defined on either L or L_n) is defined as

H_n(P) := − ∑_{ω ∈ Ω_n} P(ω) log P(ω).

The usual conventions apply: 0 log 0 := 0 and the logarithm is the natural logarithm. The second convention is inconsequential.

The two key approaches are the entropy-limit approach and the maximal-entropy approach.
The entropy-limit approach. For fixed n ≥ N, reinterpret ϕ as a statement of L_n. Let E_n be the set of probability functions on L_n such that P(ϕ_n) = 1. If E_n ≠ ∅, consider the entropy maximiser

P^n := arg max_{P ∈ E_n} H_n(P).

E_n is closed and convex, so P^n is uniquely determined, if ϕ_n is satisfiable. However, the premiss is intended as a statement on L, not L_n, and the question arises as to what would be the most appropriate probability function for drawing inferences from this premiss when it is interpreted as a statement about an infinite domain. If it exists, one can consider the function P∞ defined on L as a pointwise limit of maximum entropy functions [1],

P∞(θ) := lim_{n→∞} P^n(θ).

The entropy-limit approach takes P∞ for inference, attaching probability P∞(ψ) to a sentence ψ of interest.

There is one complication about the definition of P∞. While [1] define P∞ in terms of a pointwise limit where the limit is taken independently for each sentence of L, [18,23,25] define P∞ in a slightly different way: take the pointwise limit on quantifier-free sentences and extend this to the (unique) probability function on L as a whole which agrees with the values obtained on the quantifier-free sentences, assuming that the pointwise limit exists and satisfies the axioms of probability on quantifier-free sentences of L [5]. The Rad-Paris definition [1] circumvents a problem that can arise with the Barnett-Paris definition [18,23,25], namely that the pointwise limit on L as a whole may exist but may fail to be a probability function. This detail will be crucial for interpreting the main result of this paper (Theorem 1).
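To see the entropy-limit recipe at work on an unproblematic premiss (an illustration of mine, not an example from the cited literature), take the unary Σ1 premiss ϕ = ∃x Ux. Here ϕ_n = U t_1 ∨ ... ∨ U t_n, the maximiser P^n equivocates over the 2^n − 1 n-states satisfying ϕ_n, and the limit probabilities exist:

```python
from fractions import Fraction

# Entropy-limit recipe for the Sigma_1 premiss phi = ∃x Ux on a unary
# language: P^n equivocates over the 2^n - 1 n-states satisfying
# phi_n = U t_1 ∨ ... ∨ U t_n.
def P_n(n):
    satisfying = 2 ** n - 1     # n-states with at least one U t_i true
    favourable = 2 ** (n - 1)   # of these, those making U t_1 true
    return Fraction(favourable, satisfying)

assert P_n(1) == 1                         # on L_1, phi_1 = U t_1
assert P_n(3) == Fraction(4, 7)
# P^n(U t_1) = 2^(n-1)/(2^n - 1) -> 1/2, so P_infty(U t_1) = 1/2
assert abs(float(P_n(50)) - 0.5) < 1e-12
```

For such premisses P∞ behaves as one would expect; the trouble discussed in this paper only arises at greater quantifier complexity.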
Note that P^n and E_n are defined on L_n, not on L. To simplify notation, when P is defined on L, I write P ∈ E_n to mean that the restriction of P to L_n is in E_n, P↾L_n ∈ E_n.
The maximal-entropy approach. This alternative approach avoids appealing to the finite sublanguages. For probability functions P and Q defined on L, P is deemed to have greater entropy than Q, if and only if P has greater n-entropy for all sufficiently large n, i.e., if and only if there is some natural number N such that H_n(P) > H_n(Q) for all n ≥ N. One can then consider the set of probability functions in E, the set of probability functions on L with P(ϕ) = 1, with maximal entropy:

maxent E := {P ∈ E : there is no Q ∈ E that has greater entropy than P}.
If maxent E ≠ ∅, one can draw inferences using the maximal entropy functions P†. Thus, the maximal-entropy approach attaches the set of probabilities Y = {P†(ψ) : P† ∈ maxent E} to ψ. If maxent E contains only a single element, then so does Y, and the maximal entropy function is said to exist and be unique; it is then unambiguously denoted by P†.
See [11,12,30] for one kind of justification of this approach.
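The dominance ordering underlying the maximal-entropy approach can be illustrated numerically (a sketch of mine, on a unary language with one predicate, where r_n = n): the equivocator has greater n-entropy than a biased product function for every n, and hence greater entropy in the above sense.

```python
from itertools import product
from math import log

# n-entropy H_n(P) = - sum over n-states of P(omega) log P(omega),
# with the convention 0 log 0 = 0.
def n_entropy(probs):
    return -sum(p * log(p) for p in probs if p > 0)

# Unary language with one predicate: r_n = n. A product function that
# makes each atom true independently with probability p assigns an
# n-state with k positive literals the probability p^k (1-p)^(n-k).
def state_probs(p, n):
    return [p ** sum(signs) * (1 - p) ** (n - sum(signs))
            for signs in product([0, 1], repeat=n)]

for n in range(1, 8):
    H_equiv = n_entropy(state_probs(0.5, n))    # equivocator, = n log 2
    H_biased = n_entropy(state_probs(0.25, n))
    assert abs(H_equiv - n * log(2)) < 1e-9
    assert H_equiv > H_biased   # P= has greater n-entropy for every n
```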
Arithmetic Hierarchy of Formulae. Δ0 is the set of sentences of SL which are logically equivalent to a quantifier-free sentence. The level Σ_n of the hierarchy consists of those sentences which are logically equivalent to a sentence of the form ∃x̄_1∀x̄_2∃x̄_3 ... θ(x̄_1, ..., x̄_n) with n-many alternating quantifier blocks beginning with an existential block, where θ is quantifier-free; Π_n consists of those sentences which are logically equivalent to a sentence of the same form beginning with a universal block, ∀x̄_1∃x̄_2∀x̄_3 ... θ(x̄_1, ..., x̄_n). The level Δ_n is the set of sentences which are in both Σ_n and Π_n. Let Ψ be one of these classes; I then denote by Ψ* the set of sentences which are in Ψ but not in a class of lower complexity. In other words, sentences are classified according to their minimal quantifier complexity.

The Σ * 2 Case
I now prove and explain the main technical result of this paper.

Proof of the Σ * 2 Case
This result concerns the simplest Σ*2 sentence of a polyadic language there is, ϕ = ∃x∀y Uxy. The sentence ϕ mentions only two variables and one binary relation symbol, and no connective and no constant.
Proposition 1. Let L contain only one binary relation symbol U and let ϕ = ∃x∀y Uxy. Then

P∞_ϕ = P=. (1)

Proof. The proof is a rather simple but not short exercise in counting states and their extensions. There are 2^(n^2)-many n-states, |Ω_n| = 2^(n^2). For a fixed constant t, there are 2^n-many conjunctions of the form ⋀_{i=1}^n ±U t t_i, and only one of them, ⋀_{i=1}^n U t t_i, makes t a witness for ϕ_n (i.e., satisfies ϕ_n). So, there are (2^n − 1)^n-many n-states which are not in [ϕ_n]_n. Hence, there are 2^(n^2) − (2^n − 1)^n-many n-states in [ϕ_n]_n. P^n equivocates on these n-states, to each of which it assigns probability 1/(2^(n^2) − (2^n − 1)^n).

Let us now consider an arbitrary k-state ω_k ∉ [ϕ_k]_k and count the number of n-states ω_n with n > k which extend ω_k such that ω_n ∈ [ϕ_n]_n. An n-state extending ω_k is in [ϕ_n]_n, if and only if there exists some c ∈ {k+1, k+2, ..., n} such that ⋀_{l=1}^n U t_c t_l holds. Since all those k-states are assigned the same probability by P^n, we may consider ω_k = ⋀_{i,j=1}^k ¬U t_i t_j to make things more concrete.
There are k·(n−k)-many atomic sentences of the form U t_i t_l with 1 ≤ i ≤ k and k+1 ≤ l ≤ n which do not affect whether ω_n ∈ [ϕ_n]_n or not.
There are n·(n−k)-many atomic sentences of the form U t_i t_l with k+1 ≤ i ≤ n and 1 ≤ l ≤ n which do affect whether ω_n ∈ [ϕ_n]_n or not. There are hence 2^(n·(n−k))-many possible conjunctions which decide whether the extension ω_n of ω_k is in [ϕ_n]_n.
For an extending n-state ω_n not to be in [ϕ_n]_n, it must hold that for each k+1 ≤ i ≤ n at least one of the literals in the conjunction ⋀_{l=1}^n ±U t_i t_l is negated.
For every fixed i, there are 2^n-many conjunctions of the form ⋀_{l=1}^n ±U t_i t_l, of which (2^n − 1)-many mention at least one negation symbol. There are hence (2^n − 1)^(n−k)-many of the 2^(n·(n−k))-many conjunctions which entail that ω_n is not in [ϕ_n]_n. Hence, for all 1 ≤ k < n and all ω_k ∉ [ϕ_k]_k,

P^n(ω_k) = 2^(k·(n−k)) · (2^(n·(n−k)) − (2^n − 1)^(n−k)) / (2^(n^2) − (2^n − 1)^n). (2)

I now turn to computing the limit as n goes to infinity. Upon dividing through by 2^(n^2), the numerator and the denominator both converge to zero, so the limit as n approaches infinity cannot simply be read off. Let s be the greatest natural number such that k·s ≤ n, s := ⌊n/k⌋. Define a sequence f_n := 2^n/(2^n − 1) and put δ_n := (f_n)^k − 1 > 0. One then obtains

P^n(⋀_{i,j=1}^k ¬U t_i t_j) − 1/2^(k·k) ≥ −1/s. (3)

But since ω_k = ⋀_{i,j=1}^k ¬U t_i t_j has the fewest n-extensions in [ϕ_n]_n, it follows that P^n(⋀_{i,j=1}^k ¬U t_i t_j) ≤ 1/2^(k·k). Thus, the sequence on the left of (3) is never positive. But since this sequence is greater than or equal to −1/s and s grows with growing n, this sequence is a null-sequence.
This in turn entails that lim_{n→∞} P^n(⋀_{i,j=1}^k ¬U t_i t_j) = 1/2^(k·k).

Consider an extension ω_n ∈ [ϕ_n]_n of ⋀_{i,j=1}^k ¬U t_i t_j and denote by λ the conjunction of those literals of ω_n which do not appear in ⋀_{i,j=1}^k ¬U t_i t_j, so that ω_n = λ ∧ ⋀_{i,j=1}^k ¬U t_i t_j. Note that for every other k-state ν_k it holds that ν_k ∧ λ ∈ [ϕ_n]_n. Hence, the set of n-extensions in [ϕ_n]_n of every ν_k is either equal to or a proper superset of the set of n-extensions of ⋀_{i,j=1}^k ¬U t_i t_j.

This means that

P^n(ν_k) ≥ P^n(⋀_{i,j=1}^k ¬U t_i t_j)

for all k, all ν_k ∈ Ω_k \ {⋀_{i,j=1}^k ¬U t_i t_j} and all n > k. Since lim_{n→∞} P^n(⋀_{i,j=1}^k ¬U t_i t_j) = 1/2^(k·k) for all k, it holds for all other k-states ν_k and all δ > 0 that 1/2^(k·k) − δ ≤ P^n(ν_k) ≤ 1 for all large enough n; and since ∑_{ω_n ∈ Ω_n} P^n(ω_n) = 1 for all n, it must be the case that lim_{n→∞} P^n(ν_k) ∈ [1/2^(k·k), 1), if the limit exists. If a single such limit does not exist, then there must exist an infinite set of natural numbers I ⊆ ℕ and an ε > 0 such that P^n(ν_k) ≥ 1/2^(k·k) + ε for all n ∈ I. This, however, contradicts the fact that the least probable k-state has a probability arbitrarily close to 1/2^(k·k), which can only happen if all other k-states have probabilities very close to 1/2^(k·k), too (probabilities add up to 1). Hence, all these other limits exist, too.
Since probabilities add up to 1, it must be the case that lim_{n→∞} P^n(ω_k) = 1/2^(k·k) = P=(ω_k) for all ω_k ∈ Ω_k. This means that lim_{n→∞} P^n and P= agree on all k-states, for all k ∈ ℕ. And so, P∞_ϕ = P=.

Mutatis mutandis, this proof goes through for more expressive languages and more complicated premiss sentences; see the Appendix for details:

Theorem 2. [Beyond a Σ2 Premiss] For every polyadic language L containing at least two relation symbols and every level Λ* ∈ Ψ* of quantifier complexity greater than Σ2, there exists a contingent sentence χ = ψ ∧ ∃x∀y U_1 x@r−1 y in Λ* such that P∞_χ = P=.

Explanation of the Σ * 2 Case
Since Proposition 1 is the main result of this paper and its proof does not convey much understanding, I now explain why Proposition 1 holds.
To aid our understanding, I find it helpful to think of a k-state as the present and of its extensions as futures in which currently contingent facts (literals mentioning constants t_c with c ≥ k+1) have been decided. A future is called possible, if and only if it is in [ϕ_n]_n. The name is inspired by the observation that only n-states in [ϕ_n]_n are assigned a strictly positive probability by P^n.
Let us consider the probability of the present, P^n(ω_k), for a fixed k-state ω_k and some fixed point in time (k ∈ ℕ).
This probability is given as the ratio of the number of n-states which extend ω_k and in which ϕ_n is true, divided by the number of n-states in which ϕ_n is true (see (2)). That is just the ratio of the number of all possible futures over the number of all possible and all alternative possible futures (possible futures of alternative presents).
Let n be very large. Then the probability (measured as a ratio of such conjunctions) that there exists some k < c ≤ n such that ⋀_{j=1}^k U t_c t_j holds is close to 1. In fact, for large enough n, the probability that there exist many more such c than k is close to 1. But since every constant t such that ⋀_{j=1}^k U t t_j holds has the same probability of satisfying ⋀_{j=1}^n U t t_j, it follows that a witness for ϕ is most likely to be found among the constants t_{k+1}, ..., t_n. In almost all possible futures there will be a witness which is not yet present.
Denote by λ a conjunction of literals such that λ ∧ ω_k ∈ [ϕ_n]_n, that is, a completion of the k-world to a ϕ_n-world. This may be understood as time evolving from now into the future. With probability approximating 1 (as n goes to infinity), a witness for ϕ is to be found in the completion λ; the witness is among t_{k+1}, ..., t_n. For every way the present could be, in almost all possible futures there will be a witness which is not yet present.
Since the probabilities assigned by P∞, if P∞ exists, are determined by the numbers of n-extensions in [ϕ_n]_n, it follows that any two k-states ω_k and ν_k have the same probability. The number k and the two k-states ω_k and ν_k were chosen arbitrarily, so for every fixed k all k-states are assigned the same probability. This means that P∞ = P=. Since the probability of a present is approximated by the ratio of the number of its possible futures over the number of all ways the future could have been, all presents are equally likely.
To make a long story very short: all presents have [approximately] the same number of possible futures, which are all equally likely. Hence, all presents are [approximately] equally likely.
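The quantitative content of this explanation can be checked against the closed form (2) for P^n(ω_k) obtained in the proof (a numerical sketch of mine, using exact rational arithmetic): for the all-negative k-state, the probabilities approach 1/2^(k·k) from below, though slowly, at rate roughly k/n.

```python
from fractions import Fraction

# Closed form from the proof of Proposition 1: for the k-state
# omega_k = AND_{i,j<=k} ¬U t_i t_j, the number of its n-extensions in
# [phi_n]_n is 2^(k(n-k)) * (2^(n(n-k)) - (2^n - 1)^(n-k)), and P^n
# equivocates over the 2^(n^2) - (2^n - 1)^n states in [phi_n]_n.
def P_n_omega_k(n, k):
    ext = 2 ** (k * (n - k)) * (2 ** (n * (n - k)) - (2 ** n - 1) ** (n - k))
    sat = 2 ** (n * n) - (2 ** n - 1) ** n
    return Fraction(ext, sat)

k = 2
target = Fraction(1, 2 ** (k * k))                 # P=(omega_k) = 1/16
values = [P_n_omega_k(n, k) for n in (5, 10, 20, 40)]
assert all(v < target for v in values)             # approach from below ...
assert values[0] < values[1] < values[2] < values[3]
assert target - values[-1] < target / 10           # ... and get close
```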

The Entropy-Limit Conjecture
Recall that Theorem 1 shows that for every polyadic language L there exists a satisfiable sentence ϕ = ∃x∀y U x@r−1 y ∈ Σ2 of L such that P∞_ϕ = P=. Prima facie, this seems like evidence against the Entropy-limit Conjecture, because it provides a plethora of cases in which the entropy limit is seemingly well-defined and the maximal entropy function does not exist (an easy corollary from [24, Section 2.2]).
Let us take a closer look. P∞_ϕ = P= entails that P∞_ϕ(ϕ) = P=(ϕ), but observe that P=(ϕ) = 0, since P=(∀y U t_c y) ≤ P=(⋀_{l=1}^n U t_c t_l) = 2^(−n) for every candidate witness t_c and every n. So, the limit of the P^n_ϕ assigns the certain premiss ϕ probability 0, lim_{n→∞} P^n_ϕ(ϕ) = P∞_ϕ(ϕ) = 0. This is absurd. The entropy limit was designed to inform our inductive inferences on the basis that we are sure about the premiss ϕ. How can the rational probability of ϕ be zero? It cannot!

It was, presumably, for these reasons that the entropy limit was originally only defined if P∞_ϕ(ϕ) = 1. The definition evolved in [18,23,25] into defining the entropy limit by first considering the limits of P^n(ω_k) for all k and all ω_k ∈ Ω_k and then defining the probability of sentences containing quantifiers via P3 (Gaifman's Theorem). In previous work, both definitions of the entropy limit agreed. Hence, the entropy limit always assigned probability 1 to certain premisses. I speculate that this is because previous work (with the exception of [23, Section 3.2]) did not consider premiss sentences mentioning both a universal and an existential quantifier.
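The value P=(ϕ) = 0 can be made vivid numerically (a small sketch of mine): on L_n the equivocator assigns the reinterpreted premiss probability 1 − (1 − 2^(−n))^n, which tends to 0.

```python
# Equivocator probability of the reinterpreted premiss phi = ∃x∀y Uxy:
# P=((phi)_n) = |phi|_n / 2^(n^2) = 1 - (1 - 2^(-n))^n  ->  0.
def P_equiv_phi_n(n):
    return 1 - (1 - 2.0 ** -n) ** n

assert P_equiv_phi_n(2) == 7 / 16
probs = [P_equiv_phi_n(n) for n in (2, 5, 10, 20, 50)]
assert all(a > b for a, b in zip(probs, probs[1:]))   # strictly decreasing
assert P_equiv_phi_n(50) < 1e-13                      # essentially zero
```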
It might somehow be argued that it is in (exceptional) cases acceptable to have an entropy limit which assigns an uncertain premiss ϕ^X a non-extreme probability outside of X in the open unit interval. For example, if X is an open interval, it might not appear criminal for P∞ to assign ϕ the infimum or supremum of X. The case here is different. We are given a certain premiss ϕ and yet the "entropy limit" assigns it probability 0.
So, according to the original definition of the entropy limit, the entropy limit is not defined for the cases in Theorems 1 and 2. A sensible interpretation of the later definition of the entropy limit ought to come to the same conclusion, and I hope you agree with my assessment.
This then entails that these cases do not trouble the Entropy-limit Conjecture (which was stated for the original definition of the entropy limit). Rather than troubling the Entropy-limit Conjecture, discovering cases in which the entropy limit is not defined strengthens the case for the Entropy-limit Conjecture.
But what about the maximal entropy functions P†_ϕ and P†_χ? If they are well-defined, then P†_ϕ(ϕ) = 1 = P†_χ(χ) by the definition of a maximum entropy function, which must satisfy the constraints imposed by the evidence. The absurd situation of the entropy limit (P∞_ϕ(ϕ) = 0 = P∞_χ(χ)) cannot arise in the maximal-entropy approach. Unfortunately, they are not well-defined in these cases [24, Section 2.2], and so neither entropy maximisation method provides sensible beliefs for these premisses.
Developing a comprehensive approach to the Maximum Entropy Principle yielding sensible beliefs for these premisses, too, is pressing future work.

The Entropy Limit
What have we learned about the entropy limit? Its definition only makes sense as long as the probability of the premiss sentence is computed from a single limit of probabilities of n-states. This is, obviously, the case for Π1 and Σ1 sentences (Gaifman) and for unary languages (every unary sentence is logically equivalent to a disjunction of mutually exclusive Π1 and Σ1 sentences ([6, Theorem 35, p. 68], restated in [1, Lemma 2.2, p. 89]), and so the probability is equal to the sum of the probabilities of the disjuncts). Furthermore, the recipe of using the entropy limit to (more simply) compute the maximum entropy function only seems to produce palatable results in these cases.
In case the premiss is of greater quantifier complexity, the entropy limit may fail to provide a (sensible) answer: for ϕ ∈ Π*2 and beyond, the entropy limit may fail to exist [see (6) below]; for ϕ ∈ Σ*2 ∪ Ψ*, the entropy limit may fail to provide a sensible answer (Theorems 1 and 2).
5. For every polyadic language L there exists a contingent sentence ϕ = ∃x∀y U_1 x@r−1 y ∈ Σ2 of L such that P†_ϕ does not exist. This is a simple corollary from [24, Section 2.2]. Similarly to Theorem 2, this fact generalises to Ψ* by considering a suitably contingent χ = ψ ∧ ϕ with P†_ψ = P= (details omitted, since they are beyond the scope of the current interests), where ψ does not mention the relation symbols in ϕ.
6. For every polyadic language L with at least three relation symbols U_1, U_2, U_3 such that the arity of U_i is greater than or equal to i, there exists a contingent sentence ϕ ∈ Π*2 of L such that P∞_ϕ does not exist. This is a simple corollary from [23, Section 3.2]. Again, this fact generalises to Ψ*.
Acknowledgements. Open Access funding provided by Projekt DEAL. I would like to thank Erik Curiel, Soroush Rafiee Rad, Jon Williamson and an anonymous reviewer for their helpful comments. I also gratefully acknowledge funding from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) 405961989 and 432308570.
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The above focussed on a language with a single binary relation symbol. I now show how the above generalises to sentences of greater quantifier complexity and to all other languages containing at least one relation symbol that is not unary.

Proposition 2. [Greater Arities]
Let L contain only one relation symbol U, which has an arity r ≥ 2, and denote by x@r−1 the (r−1)-tuple with all elements equal to x. For ϕ = ∃x∀y U x@r−1 y it holds that P∞_ϕ = P=.

Proof. This follows immediately from the proof of Proposition 1 by only counting extensions of the form ⋀_{i,j=1}^n ±U t_i@r−1 t_j.

The above results generalise to languages containing further relation symbols via the following well-known fact, which follows from the Irrelevance Principle and the Independence Principle [20, Principles 3 and 6, p. 186]:

Proposition 3. [Multiple Relation Symbols]
Let ψ be a sentence of the language that does not mention U, where the arity of U is at least two. Furthermore, for a state ω of the language, let ω_{−U} be the conjunction of the literals of ω which do not mention U and let ω_U be the conjunction of the literals that do mention U, ω = ω_{−U} ∧ ω_U. Then P∞_{ψ∧∃x∀y U x@r−1 y}(ω) = P∞_ψ(ω_{−U}) · P∞_{∃x∀y U x@r−1 y}(ω_U), if P∞_ψ exists.
We can now formulate the results into a global lesson. Let us call a language L polyadic, if and only if L contains at least one non-unary relation symbol. Let us call the other languages purely unary.
Theorem 1. For every polyadic language L there exists a satisfiable sentence ϕ = ∃x∀y U x@r−1 y ∈ Σ2 of L such that P∞_ϕ = P=.

Proof. If L contains multiple relation symbols, simply apply Proposition 3, where ψ is any tautology that does not mention U.
If L only contains one relation symbol, then this has already been shown in Proposition 2.
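The counting behind Proposition 2 can be spot-checked by brute force for r = 3 and n = 2 (a sketch of mine): writing x@2 for the pair (x, x), the premiss ∃x∀y U x@2 y is decided by the four "diagonal" atoms U t_c t_c t_l alone, so |ϕ|_2 = 2^8 − 2^4 · (2^2 − 1)^2 = 112.

```python
from itertools import product

# Brute force for arity r = 3, n = 2: atoms are U t_i t_j t_l with
# i, j, l in {1, 2}, so r_2 = 8. An n-state satisfies phi_2 iff some
# c has both diagonal atoms U t_c t_c t_1 and U t_c t_c t_2 true.
def count_phi_2():
    atoms = list(product((1, 2), repeat=3))
    count = 0
    for values in product([False, True], repeat=len(atoms)):
        state = dict(zip(atoms, values))
        if any(all(state[(c, c, l)] for l in (1, 2)) for c in (1, 2)):
            count += 1
    return count

assert count_phi_2() == 2 ** 8 - 2 ** 4 * 3 ** 2   # 256 - 144 = 112
```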
Proof of Theorem 2. First recall that for the premiss ∃x U_2 x in Σ*1 the entropy limit (and the entropy maximiser) is P=.
For every fixed level Λ* ∈ Ψ* of the hierarchy, pick a contingent sentence ξ such that: (i) ξ only mentions the relation symbol U_2, and (ii) ξ ∨ ∃x U_2 x is contingent and is in Λ*. Clearly, such sentences ξ exist for all Λ* ∈ Ψ*.
Next note that P∞_{ξ∨∃xU_2x} = P=, since P^n_{ξ∨∃xU_2x} assigns at most one n-state probability 0. All other n-states, for fixed n, are assigned the same probability by P^n_{ξ∨∃xU_2x}. Let us put ψ := ξ ∨ ∃x U_2 x. Let us next consider the sentence χ := (ξ ∨ ∃x U_2 x) ∧ ∃x∀y U_1 x@r−1 y and note that it is in the same level of the hierarchy as ξ. We can simply absorb the two last quantifiers in χ into the quantifier blocks of ξ, due to the assumption that ξ contains at least three alternating quantifier blocks, and hence at least once an existential quantifier is followed by a universal quantifier in ξ.
In our case we have P∞_ψ = P= = P∞_ϕ, where ϕ := ∃x∀y U_1 x@r−1 y; the entropy limits thus exist. By Proposition 3, this entails for all n and all n-states ω ∈ Ω_n that

P∞_χ(ω) = P∞_ψ(ω_{−U_1}) · P∞_ϕ(ω_{U_1}) = P=(ω_{−U_1}) · P=(ω_{U_1}) = P=(ω).

Since P∞_χ and P= agree on all states, it follows that these two probability functions also agree on all sentences containing quantifiers (Gaifman's Theorem). So, P∞_χ = P=.