Uniquely Determined Uniform Probability on the Natural Numbers

In this paper, we address the problem of constructing a uniform probability measure on N. Of course, this is not possible within the bounds of the Kolmogorov axioms, and we have to violate at least one axiom. We define a probability measure as a finitely additive measure assigning probability 1 to the whole space, on a domain which is closed under complements and finite disjoint unions. We introduce and motivate a notion of uniformity which we call weak thinnability, which is strictly stronger than extension of natural density. We construct a weakly thinnable probability measure, and we show that on its domain, which contains sets without natural density, probability is uniquely determined by weak thinnability. In this sense, we can assign uniform probabilities in a canonical way. We generalize this result to uniform probability measures on other metric spaces, including R^n.


Introduction and Main Results
In this paper, we introduce and study a notion of uniformity which is stronger than the extension of natural density. A uniform probability measure on [0, 1] or on a finite space is characterized by the property that if we condition on any suitable subset, the resulting conditional probability measure is again uniform on that subset. It is this property that we will generalize, and the generalized notion will be called weak thinnability. (The actual definition of weak thinnability is given later, and will also involve two technical conditions.) We allow probability measures to be defined on collections of sets that are closed under complements and finite disjoint unions. This is because we think there is no principled reason to insist that all sets are measured, just as not all subsets of R are Lebesgue measurable. We should, however, be cautious when allowing domains that are not necessarily algebras, for the following reason. De Finetti [3] uses a Dutch Book argument to conclude that, under the Bayesian interpretation of probability, a probability measure has to be coherent. He shows that if the domain of the probability measure is an algebra, the finite additivity of the probability measure implies coherence. On domains only closed under complements and finite disjoint unions, however, this implication no longer holds. Therefore, someone sharing de Finetti's view of probability would like to add coherence as an additional constraint. For completeness, we study both the case with and the case without coherence as an additional constraint on the probability measure.

Definition 1.1 Let X be a space and write P(X) for the power set of X. An f-system on X is a nonempty collection F ⊆ P(X) that is closed under complements and finite disjoint unions, i.e., X \ A ∈ F for every A ∈ F, and A ∪ B ∈ F for all disjoint A, B ∈ F. A probability measure on an f-system F is a map μ : F → [0, 1] such that μ(X) = 1 and μ(A ∪ B) = μ(A) + μ(B) for all disjoint A, B ∈ F. A coherent probability measure is a probability measure μ : F → [0, 1] such that for all n ∈ N, α_1, . . . , α_n ∈ R and A_1, . . . , A_n ∈ F,

sup_{x ∈ X} Σ_{i=1}^{n} α_i (1_{A_i}(x) − μ(A_i)) ≥ 0. (1.2)
A probability pair on X is a pair (F, μ) such that F is an f-system on X and μ is a probability measure on F.

Remark 1.2 Schurz and Leitgeb [9, p. 261] call an f-system a pre-Dynkin system, since in case of closure under countable unions of mutually disjoint sets, such a collection is called a Dynkin system.

Remark 1.3 Expression (1.2) has the following interpretation. If α_i ≥ 0, we buy a bet on A_i that pays out α_i for the price α_i μ(A_i). If α_i < 0, we sell a bet on A_i that pays out |α_i| for the price |α_i| μ(A_i). Then (1.2) expresses that there is no guaranteed amount of net loss.

We aim at uniquely determining the probability of as many sets as possible. In particular, we are interested in probability pairs with an f-system consisting only of sets with a uniquely determined probability. So we are not only interested in probability pairs satisfying our stronger notion of uniformity, but in the canonical ones, where "canonical" is to be understood in the following way.

Definition 1.4 Let P be some collection of probability pairs. A pair (F, μ) ∈ P is canonical with respect to P if for every A ∈ F and every pair (F′, μ′) ∈ P with A ∈ F′ we have μ(A) = μ′(A).
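The gap between finite additivity and coherence on domains that are merely closed under complements and finite disjoint unions can be checked mechanically. The sketch below is our own toy instance, not an object from the paper: on X = {1, 2, 3, 4}, the collection generated by A = {1, 2}, B = {1, 3}, C = {1, 4} and their complements is an f-system, and assigning each of A, B, C probability 0 is finitely additive, yet selling a unit bet on each of A, B, C guarantees a net loss, violating (1.2).

```python
# Toy example (not from the paper): a finitely additive probability
# measure on an f-system that violates the coherence condition (1.2).
X = {1, 2, 3, 4}
A, B, C = frozenset({1, 2}), frozenset({1, 3}), frozenset({1, 4})
comp = lambda S: frozenset(X) - S
F = {frozenset(), frozenset(X), A, comp(A), B, comp(B), C, comp(C)}
mu = {S: 0 if S in (frozenset(), A, B, C) else 1 for S in F}

# F is closed under complements and finite disjoint unions, and mu is
# finitely additive: the only disjoint pairs with union in F are
# complement pairs and pairs involving the empty set.
for S in F:
    assert comp(S) in F
    for T in F:
        if not (S & T) and (S | T) in F:
            assert mu[S] + mu[T] == mu[S | T]

# Worst-case net gain sup_x sum_i alpha_i (1_{A_i}(x) - mu(A_i)).
def worst_case_gain(alphas, events):
    return max(sum(a * ((x in E) - mu[E]) for a, E in zip(alphas, events))
               for x in X)

# Selling a unit bet on each of A, B, C (alpha_i = -1) loses at least 1
# at every x, since every point of X lies in at least one of A, B, C:
assert worst_case_gain([-1, -1, -1], [A, B, C]) == -1  # < 0: incoherent
```

On an algebra this cannot happen: there, finite additivity forces the worst-case gain to be nonnegative for every choice of stakes, which is exactly de Finetti's implication cited above.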
Before we give a more detailed outline of our paper, we need the following definition. Set (1.3). Note that M is an algebra on [0, ∞). We write L* for the collection of probability pairs extending natural density. Our earlier observation about the indeterminacy of probability under L gets the following formulation in terms of L*: a pair (F, μ) ∈ L* is canonical with respect to L* if and only if F = C. We write WT for the collection of probability pairs that are a weakly thinnable pair (WTP), that is, a probability pair that satisfies the condition of weak thinnability. The collection WT is a proper subset of L* and contains pairs (F, μ) canonical with respect to WT such that F \ C ≠ ∅. In other words, by restricting L* to WT we are able to assign a uniquely determined probability to some sets without natural density. Finally, we write WTC ⊆ WT ⊂ L* for the elements (F, μ) ∈ WT such that μ is coherent.
The structure of this paper is as follows. In Sect. 2, we discuss weak thinnability and motivate why it is a natural notion of uniformity. In Sect. 3, we introduce the probability pair (A_uni, α), where A_uni and α are given by (1.8) and (1.9). In Sect. 4, we derive from (A_uni, α) analogous probability pairs on certain metric spaces, including Euclidean space. The proofs of the results in Sects. 2–4 are given in Sect. 5. We write N_0 := {0, 1, 2, . . .}. For real-valued sequences x, y or real-valued functions x, y on [0, ∞), we write x ∼ y or x_i ∼ y_i if lim_{i→∞} (x_i − y_i) = 0. Since we work only on [0, ∞) in Sects. 2 and 3, every time we speak of an f-system, probability pair or probability measure, it is understood that this is on [0, ∞).
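As a concrete warm-up (our own illustration, not a set used in the paper), the following sketch estimates the counting densities |A ∩ [1, n]|/n of A = ⋃_k [4^k, 2·4^k) ∩ N along two subsequences; the values approach 2/3 and 1/3 respectively, so A has no natural density and lies outside C:

```python
# Hypothetical example set: A = union over k of [4^k, 2*4^k), on N.
# Its partial counting densities oscillate, so A has no natural density.
def in_A(m: int) -> bool:
    while m >= 4:          # divide out factors of 4 until m lands in [1, 4)
        m /= 4
    return 1 <= m < 2      # m was in [4^k, 2*4^k) iff m / 4^k is in [1, 2)

def density(n: int) -> float:
    return sum(1 for m in range(1, n + 1) if in_A(m)) / n

k = 6
d_high = density(2 * 4 ** k)   # sampled right after a full block of A
d_low = density(4 ** (k + 1))  # sampled after a long gap
assert abs(d_high - 2 / 3) < 0.01
assert abs(d_low - 1 / 3) < 0.01
```

Any finitely additive extension of natural density may assign such a set any value between its lower and upper density, which is the indeterminacy that weak thinnability is designed to remove.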

Weak Thinnability
Let m be the Lebesgue measure on R. For Lebesgue measurable Y ⊆ R with 0 < m(Y) < ∞, the uniform probability measure on Y is given by μ_Y(B) = m(B ∩ Y)/m(Y) for Lebesgue measurable B ⊆ R. We want to generalize this property to a property of probability pairs on [0, ∞).
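For completeness, the conditioning property that is being generalized can be written out in one line (a routine verification with the formula μ_Y(B) = m(B ∩ Y)/m(Y)): for Lebesgue measurable Z ⊆ Y with m(Z) > 0, conditioning the uniform measure on Y to Z returns the uniform measure on Z.

```latex
% Uniform probability measure on Y and its stability under conditioning
% (using Z \subseteq Y in the last two equalities):
\mu_Y(B) = \frac{m(B \cap Y)}{m(Y)},
\qquad
\mu_Y(B \mid Z)
  = \frac{\mu_Y(B \cap Z)}{\mu_Y(Z)}
  = \frac{m(B \cap Z \cap Y)/m(Y)}{m(Z \cap Y)/m(Y)}
  = \frac{m(B \cap Z)}{m(Z)}
  = \mu_Z(B).
```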
The map f_A gives a one-to-one correspondence between A and [0, ∞). If A ∈ M* and B ∈ M, we want to introduce notation for the subset of A that corresponds to B under f_A. Inspired by van Douwen [12], we introduce the following operation.
We can view this operation as thinning A by B, because we create a subset of A, where B "decides" which parts of A are removed. We can also view the operation A • B as thinning out B over A, since we "spread out" the set B over A. Taking for example … we get … (2.12) is a natural generalization of (2.2). Using (2.12), this translates into (2.14). We now have the restriction that A ∈ F ∩ M*. However, if A ∈ F \ M*, then any uniform probability measure should assign 0 to A, and since A • B ⊆ A, (2.14) still holds. In Sect. 6.2, we show that the condition that (2.14) holds for all A, B ∈ F is so strong that only probability pairs with relatively small f-systems satisfy it. Since it is our goal to find a notion of uniformity that allows for a canonical pair with a large f-system, we choose to use a weakened version of this property. Weak thinnability also involves two technical conditions. Let (F, μ) be a probability pair, let A, B ∈ F, and suppose it is true for every x ∈ [0, ∞) that … Since this inequality is true for every x, the set B is "sparser" than A. Therefore, it is natural to ask that μ(A) ≥ μ(B). We call this property "preserving ordering by S." Since we have C ⊆ F, it seems natural to also ask that μ restricted to C equals λ, but it turns out to be sufficient to ask the weaker property that μ([c, ∞)) = 1 for every c ∈ [0, ∞). So, to reduce redundancy, we require the latter and then prove that μ restricted to C equals λ. Putting everything together, we obtain the following definition. That every WTP extends natural density is implied by the following result. (2.16)
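A discrete sketch may help fix intuition. Below is the analogous thinning on N with an indexing convention of our own (the paper's operation acts on subsets of [0, ∞) via f_A): enumerating A in increasing order as a_0 < a_1 < … and keeping a_i exactly when i ∈ B, the density of the thinned set comes out as the product of the two densities, which is the idea behind (2.14).

```python
# Discrete analogue (not the paper's definition) of thinning A by B on N:
# keep the i-th element of A exactly when the index i lies in B.
def thin(in_A, in_B, limit):
    """Return {a_i : i in B} where a_0 < a_1 < ... enumerates A up to limit."""
    enum = [n for n in range(limit) if in_A(n)]
    return {a for i, a in enumerate(enum) if in_B(i)}

in_evens = lambda n: n % 2 == 0      # density 1/2
in_mult3 = lambda n: n % 3 == 0      # density 1/3

T = thin(in_evens, in_mult3, 60_000)  # = multiples of 6 below 60 000
N = 50_000
density_T = sum(1 for n in range(N) if n in T) / N
assert abs(density_T - 1 / 6) < 1e-3  # product of the two densities
```

Here every third even number survives, so the thinned set is exactly the multiples of 6; a uniform measure should likewise satisfy μ(A • B) = μ(A) μ(B).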

The Pair (A uni , α)
For A ∈ M, set σ_A : (0, ∞)^2 → [0, 1] given by … and … It is easy to check that (W_uni, λ_{W_uni}) is a probability pair. For any A ∈ M, we set … Notice that Definition 3.1 gives a definition of (A_uni, α) that is slightly different from (1.8) and (1.9). For a justification of Eqs. (1.8) and (1.9), see the proof of Lemma 5.2. Our first concern is to verify that α coincides with natural density.

Proposition 3.2 We have C ⊆ A_uni and α(A) = λ(A) for every A ∈ C.
It is easy to check that A ∉ C. That (A_uni, α) is a probability pair follows directly from the fact that (W_uni, λ_{W_uni}) is a probability pair. The pair (A_uni, α) is also a WTP.

Theorem 3.3 We have (A_uni, α) ∈ WTC ⊆ WT, and we can extend (A_uni, α) to a WTP with M as f-system.

Remark 3.4 We use free ultrafilters in the proof of Theorem 3.3 to show there exists an extension to a WTP with M as f-system. The existence of free ultrafilters is guaranteed by the Boolean Prime Ideal Theorem, which cannot be proven in ZF set theory, but is weaker than the axiom of choice [4]. The existence of an atomfree or nonprincipal (i.e., every singleton has measure zero) finitely additive measure defined on the power set of N cannot be established in ZF alone [10]. Consequently, a version of the axiom of choice is always necessary to construct a probability measure on M that assigns measure zero to all bounded intervals.
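To make the behavior of α on sets without natural density concrete, the following numerical sketch (our own illustration; the logarithmic-scale average is a stand-in for the multiplicative averaging behind α) looks at the set ⋃_{n≥0} [e^{2n}, e^{2n+1}) from example (6.4): its ordinary density ρ oscillates between roughly 1/(e+1) ≈ 0.27 and about 0.73, while the logarithmic average settles at 1/2, matching the value α(A) = 1/2 reported for this set.

```python
import math

# A = union over n >= 0 of [e^{2n}, e^{2n+1}) -- no natural density.
def measure_A(x):
    """Lebesgue measure of A intersected with [0, x]."""
    total, n = 0.0, 0
    while math.exp(2 * n) < x:
        lo, hi = math.exp(2 * n), math.exp(2 * n + 1)
        total += max(0.0, min(hi, x) - lo)
        n += 1
    return total

rho = lambda x: measure_A(x) / x          # ordinary density, oscillates
assert abs(rho(math.exp(20)) - 1 / (math.e + 1)) < 0.01   # low point
assert rho(math.exp(21)) > 0.7                            # high point

# On a logarithmic scale (t = e^s), A becomes the union of [2n, 2n+1),
# whose average occupation of [0, T] tends to 1/2 as T grows.
def log_density(T):
    total = sum(min(2 * n + 1, T) - 2 * n
                for n in range(int(T // 2) + 1) if 2 * n < T)
    return total / T

assert abs(log_density(1001.0) - 0.5) < 0.01
```

The stabilization of the logarithmic average, against the persistent oscillation of ρ, is what lets α assign this set a uniquely determined probability.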
We do not only want an element of W T C, but a canonical one. This is guaranteed by the following theorem.

Theorem 3.5 The pair (A uni , α) is canonical with respect to both W T and W T C.
The pair (A_uni, α) is maximal in the sense that it extends every pair that is canonical with respect to WT or WTC.

Generalization to Metric Spaces
In this section we derive probability pairs on a class of metric spaces that are analogous to (A_uni, α). Of course, one could also try to construct such a probability measure by working more directly on these metric spaces, instead of constructing a derivative of (A_uni, α). Since probability pairs on [0, ∞), motivated by the problem of a uniform probability measure on N, are the priority of this paper, we do not make such an effort here.
Let us first sketch the idea of the generalization. Let A ∈ M. Whether A is in A_uni depends completely on the asymptotic behavior of ρ_A (Lemma 5.2). If A ∈ A_uni, then α(A) also depends only on the asymptotic behavior of ρ_A (Lemma 5.2). Now suppose that on a space X, we can somehow define a density function ρ̃_B : [0, ∞) → [0, 1] for (some) subsets B ⊆ X in a canonical way. Then, by replacing ρ by ρ̃, we get the analogue of (A_uni, α) in X. The goal of this section is to make this idea precise.
Let (X, d) be a metric space. For x ∈ X and r ≥ 0, write B(x, r) := {y ∈ X : d(x, y) < r} for the open ball of radius r around x. Write B(X) for the Borel σ-algebra of X. We need a "uniform" measure on this space to measure the density of subsets in open balls. It is clear that the measure of an open ball should at least be independent of where in the space we look, i.e., it should only depend on the radius of the ball. This leads to the following definition.
Definition 4.1 We say that a Borel measure ν on X is uniform if for all r > 0 and x, y ∈ X we have 0 < ν(B(x, r)) = ν(B(y, r)) < ∞. On R^n with the Euclidean metric, the standard Borel measure, obtained by assigning to a product of intervals the product of the lengths of those intervals, is a uniform measure. In general, on normed locally compact vector spaces, the measure invariant with respect to vector addition, as given by the Haar measure, is a uniform measure.
A result by Christensen [1] tells us that on locally compact metric spaces, uniform measures that are Radon measures are unique up to multiplicative constants. This, however, does not cover all cases. The set of irrational numbers, for example, is not locally compact, but the Lebesgue measure restricted to Borel sets of irrational numbers is a uniform measure and unique up to a multiplicative constant. We give a slightly more general version of the result of Christensen. Proposition 4.2 gives us uniqueness, but not existence. To see that there are metric spaces without a uniform measure, consider the following example. Let X be the set of vertices of a connected graph that is not regular, and let d be the graph distance on X. If we suppose that ν is a uniform measure on X, then from (4.2) with r < 1 it follows that ν({x}) = c for some c > 0 and every x ∈ X, which implies that (4.2) cannot hold for r = 2, since the graph is not regular. A characterization of the metric spaces on which a uniform measure exists does not seem to be present in the literature.
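The graph counterexample can be checked mechanically. The sketch below uses our own minimal instance of a non-regular connected graph, a star with three leaves: balls of radius r < 1 force every vertex to carry the same mass, but balls of radius 2 already have different vertex counts, so no uniform measure can exist.

```python
# Star graph K_{1,3}: center 0 joined to leaves 1, 2, 3 (our own minimal
# non-regular connected graph); d is the graph distance.
verts = [0, 1, 2, 3]

def dist(u, v):
    if u == v:
        return 0
    if 0 in (u, v):
        return 1        # the center is adjacent to every leaf
    return 2            # two leaves are joined via the center

def ball(x, r):
    """Open ball {v : d(x, v) < r}."""
    return {v for v in verts if dist(x, v) < r}

# r < 1: every ball is a singleton, so a uniform measure must give each
# vertex the same mass c > 0 ...
assert all(ball(x, 0.5) == {x} for x in verts)

# ... but then balls of radius 2 (a vertex plus its neighbors) would
# need equal mass too, which fails because the degrees differ:
assert len(ball(0, 2)) == 4 and len(ball(1, 2)) == 2
```

The same two-radius argument applies verbatim to any connected graph with two vertices of different degree.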
We now assume that X has a uniform measure ν and that ν(X) = ∞. In addition, we write h(r) := ν(B(x, r)) for r ≥ 0 and assume (4.3), which is equivalent to amenability in case (X, d) is a normed locally compact vector space [7]. For the importance of this assumption, see Remark 4.4 below. Set (4.5). We also show in Proposition 4.3 that the asymptotic behavior of ρ̃_A is not affected if we replace r^−(u) by r^+(u) in (4.5).

Proposition 4.3 Fix x, y ∈ X and A ∈ L(X ).
Then …

Remark 4.4 Proposition 4.3 is not necessarily true if we do not assume (4.3), as illustrated by the following example. Suppose X is the set of vertices of a 3-regular tree graph and d is the graph distance. Let ν be the counting measure, which is a uniform measure on this metric space. Then clearly (4.3) is not satisfied. Now pick any x ∈ X and let y be a neighbor of x. Let A ⊆ X be the connected component containing y in the graph where the edge between x and y is removed. Then …

Proposition 4.3 justifies the use of ρ̃ to determine the density, since its asymptotic behavior is canonical. So, we define for A ∈ L(X) the map ξ̃_A : … Then we set … and α_X : … The pair (A_uni(X), α_X) gives us the analogue of (A_uni, α) in X. In particular, for X = N it gives the corresponding uniform probability measure on N that we initially searched for. In the case of Euclidean space, we have the following expression for (A_uni(X), α_X), which in the special case of X = R gives us an extension of (A_uni, α).

Proofs
First we show that every f-system of a WTP is closed under translation and that every probability measure of a WTP is invariant under translation. If u = 1 there is nothing to prove, so assume u < 1. Let ε > 0 be given. First we observe that there is a K > 0 such that for all x ≥ K we have ρ_A(x) ≤ u. We can write u as u = p/q for some p, q ∈ N_0 with p ≤ q. Now we introduce the set Y given by … Note that Y ∈ C ⊆ F. Lemma 5.1 and the fact that μ is a probability measure give us that μ(Y) = u. Further, observe that for each … By applying this to A^c we find …

Before we prove Proposition 3.2 and Theorem 3.3, we present the following alternative representation of (A_uni, α). We define for A ∈ M the map ξ_A : (1, ∞)^2 → [0, 1] given by … Since the U-limit is multiplicative, it follows completely analogously that (M, μ) is a WTP. Hence every (A_{s,f}, α_{s,f}) can be extended to a WTP with M as its f-system. In particular, by Lemma 5.2, this means that (A_uni, α) can be extended to a WTP with M as its f-system. From de Finetti [2] it follows that if α can be extended to a finitely additive probability measure on an algebra, then α is coherent. Since we have shown that α can be extended to M, which is an algebra, it follows that (A_uni, α) ∈ WTC. Notice that we showed that α_{s,f} can be extended to M for every (s, f) ∈ P, so we also have …

For our proof of Theorem 3.5, we need an alternate expression for U(log(A)) (Lemma 5.3).

Proof Let A ∈ M and fix C > 1.
Step 1 We show that … Define … This implies …

Step 2 We give an upper and lower bound for … We now observe that … (5.42). The fact that log(1 + y) ≤ y for every y ≥ 0, combined with (5.40), (5.41) and (5.42), gives …

Step 3 We combine Step 1 and Step 2 to finish the proof. Observe that … Analogously, we find the corresponding bound for L(log(A)).

We also need the following lemma.

Lemma 5.4 Let (F, μ) be a WTP. Then for any A ∈ F and C > 1, …
Proof Let (F, μ) be a WTP with A ∈ F. Fix C > 1 and write … The idea is to introduce a set B ∈ M for which we have lim …

We are ready to give the proof of Theorem 3.5.

L(log(A)) ≤ μ(A) ≤ U(log(A)). (5.54)
We give the following example to give an idea of the proof that follows. Set … Note that Z_1, Z_2, Z_3 ∈ C are pairwise disjoint. Now, we set … Observe that for j ≥ 3, … So we constructed a set A′ that on each interval [2^{j−1}, 2^j) with j ≥ 3 has an average that equals the average of the averages of A on two consecutive intervals. By weak thinnability we find that μ(A′) = (1/2)μ(A) + (1/4)μ(A) + (1/4)μ(A) = μ(A). If τ_A(2, j) is convergent or only oscillates a little, we can give a good upper bound for μ(A) using Lemma 5.4. Applying this strategy not only for C = 2 but for any C > 1, and with averages of not only two but arbitrarily many averages on consecutive intervals, is what happens in the proof.
Step 1 We construct a set Â ∈ F. Fix C > 1 and n ∈ N. We split up [C^{j−1}, C^j) into intervals of length 1 plus a remainder interval for every j. Set for j ∈ N … so that for every j ∈ N we have … Choose u ∈ N such that for every j ∈ N we have …, which can be done since N_j is asymptotically equivalent to C^{j−1}(C − 1). For p ∈ {0, . . . , n}, k ∈ {1, . . . , C^p} and j ∈ N we set … that "evenly" distributes mass l over the interval [0, T). Note that (5.60) guarantees that … is well defined. Note that by construction Z(p, k) ∈ C and m(Z(p, k) ∩ I_{p,k}(u + j)) = C^{n−p+j−1}(C − 1). Observe that all the Z(p, k) are disjoint. So P1 and the fact that F is an f-system imply that Â ∈ F.
Step 2 We give an upper bound for μ(A) by first giving an upper bound for μ(Â) and then relating μ(A) and μ(Â).
A crucial property of Â is that for … Hence … (5.72)

Step 3 We take limits in (5.72).
Unfix n and C. We first take the limit superior as n → ∞ in (5.72), giving (5.73). Then we take the limit superior as C ↓ 1 and find by Lemma 5.3 that … The lower bound we can now easily obtain by applying our upper bound to the complement of A. Doing this, we see that …, giving μ(A) ≥ L(log(A)).
Proof of Theorem 3.6 We prove the contrapositive. Let (F, μ) be a WTP with F \ A_uni ≠ ∅. Let A ∈ F \ A_uni. By Lemma 5.2, this means that there is an (s, f) ∈ P such that … Clearly, we can find m, l ∈ N^∞ such that ξ_A(s_{m_n}, f_{m_n}) tends to I and ξ_A(s_{l_n}, f_{l_n}) tends to S. Now set s′_n := s_{m_n}, f′_n := f_{m_n}, s″_n := s_{l_n} and f″_n := f_{l_n}. Then we see that A ∈ A_{s′,f′} and A ∈ A_{s″,f″} with …

Proof of Proposition 4.2 We give a proof along the lines of Mattila [6, p. 45], with small adaptations for completeness and more generality.
First let A be an open set of (X, d) with ν_1(A) < ∞ and ν_2(A) < ∞. Suppose that r > 0 is such that h_2 is continuous at r. Then … is a continuous mapping from X to [0, ∞). Since h_2 is nondecreasing, it can have at most countably many discontinuities. So we can choose r_1, r_2, r_3, . . . such that lim_{n→∞} r_n = 0 and h_2 is continuous at every r_n. For n ∈ N, let f_n : X → [0, 1] be given by … Notice that by our previous observation f_n is continuous on A, hence f_n is measurable.
Because A is open, we have lim_{n→∞} f_n(x) = 1 for every x ∈ A. With Fatou's Lemma we find … Note that any uniform measure is σ-finite. Applying Fubini's theorem we obtain … By interchanging ν_1 and ν_2 we get … Now let A be any open set of (X, d). Let x ∈ X and set A_n := A ∩ B(x, n) for n ∈ N. Note that A_n is open with ν_1(A_n) ≤ ν_1(B(x, n)) < ∞ and ν_2(A_n) ≤ ν_2(B(x, n)) < ∞. Hence, by the first part of the proof, we find ν_1(A_n) = c ν_2(A_n). But then …

Proof of Proposition 4.3 Fix A ∈ L(X) and x, y ∈ X. By (4.3) we have … Observe that for any r ∈ [0, ∞) we have … Let ν be the Borel measure on R^n. Note that h(r) = n^{−1} δ_n r^n. If we set u = (n δ_n^{−1} y)^{1/n}, then … Since |ζ_A(D, x)| ≤ 1/log(D), the desired result follows.

Algebra Versus f -System
The natural analogue of a σ-algebra in finitely additive probability theory is an algebra. It has been remarked [9, 13] that the restriction of M to C is problematic, since C is not an algebra. However, any collection extending C, other than M itself, is not an algebra, since a(C) = M. This can be seen as follows. Let A ∈ M and set … Hence A ∈ a(C), and since A ∈ M was arbitrary, we have a(C) = M. This observation brings us to the conclusion that the requirement of an algebra, despite the fact that an algebra is the natural analogue of a σ-algebra, is too restrictive. Furthermore, finite additivity only dictates how a probability measure behaves when taking disjoint unions, and thus only suggests closure under disjoint unions. Coherence is a concern since, as remarked before, it is not guaranteed on f-systems, whereas it is guaranteed on algebras. Coherence, however, can also be achieved on f-systems, as α shows, and therefore the fact that coherence is not guaranteed is in itself not an argument against f-systems. Therefore, we think the requirement of an f-system rather than an algebra in Definition 1.1 is justified.
It should be noted that even if one prefers M as domain, by Theorem 3.3 (A_uni, α) can be extended to a WTP with M as f-system. Such a pair is not canonical with respect to WT or WTC (Theorem 3.6), but its domain still includes A_uni as an f-system on which probability is uniquely determined.

Thinnability
Suppose that in the definition of weak thinnability (Definition 2.…) we instead require the full condition that (2.14) holds for all A, B ∈ F, and call a pair satisfying this stronger notion thinnable. Then … So (A_uni, α) is not a thinnable pair. Since every thinnable pair is also a WTP, by Theorem 3.5 we see that a thinnable probability measure on A_uni does not exist. Notice that we are not necessarily looking for the strongest notion of uniformity, but for a notion that allows for a canonical probability pair with a "big" f-system. This is why we are interested in weak thinnability rather than thinnability. There may, of course, be other notions of uniformity that lead to canonical pairs with bigger f-systems than A_uni. At this point, however, we cannot see any convincing motivation for such notions.

Weak Thinnability
In this paper, we only studied the notion of weak thinnability from the interest in canonical probability pairs. There are, however, interesting open questions about the property of weak thinnability itself that we did not address in this paper. Some examples are:

- Is every probability pair that extends (A_uni, α) a WTP?
- Is every WTP coherent?
- Can every WTP be extended to a WTP with M as f-system?
- What do the sets {μ(A) : (F, μ) ∈ WT and A ∈ F} and {μ(A) : (F, μ) ∈ WTC and A ∈ F} look like for A ∉ A_uni?
- Is P2 redundant? If not, which probability pairs are not a WTP, but do satisfy P1 and P3?
- How does weak thinnability relate to the property μ(cA) = μ(A), where cA := {ca : a ∈ A} and c > 1?

Size of A uni
A typical example of a set in M that does not have natural density, but is assigned a probability by α, is the union over n ∈ N_0 of the intervals [e^{2n}, e^{2n+1}), (6.4) for which we have α(A) = 1/2. It is, however, unclear how "many" such sets there are, i.e., how much "bigger" the f-system A_uni is than C and how much "smaller" it is than M.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.