The Modal Logic of Bayesian Belief Revision

In Bayesian belief revision a Bayesian agent revises his prior belief by conditionalizing the prior on some evidence using Bayes’ rule. We define a hierarchy of modal logics that capture the logical features of Bayesian belief revision. Elements in the hierarchy are distinguished by the cardinality of the set of elementary propositions on which the agent’s prior is defined. Inclusions among the modal logics in the hierarchy are determined. By linking the modal logics in the hierarchy to the strongest modal companion of Medvedev’s logic of finite problems it is shown that the modal logic of belief revision determined by probabilities on a finite set of elementary propositions is not finitely axiomatizable.


Introduction and Overview
Let (X, B, p) be a classical probability measure space with B a Boolean algebra of subsets of a set X and p a probability measure on B. In Bayesian belief revision elements in B stand for the propositions that an agent regards as possible statements about the world, and the probability measure p represents the agent's prior degree of belief in the truth of these propositions. Learning proposition A in B to be true, the agent revises his prior p on the basis of this evidence and replaces p with q(•) = p(• | A), where p(• | A) is the conditional probability given by Bayes' rule:

p(B | A) = p(B ∩ A) / p(A) for all B in B, provided p(A) ≠ 0. (1)

This new probability measure q can be regarded as the probability measure that the agent infers from p on the basis of the information (evidence) that A is true. The aim of this paper is to study the logical aspects of this type of inference from the perspective of modal logic.
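The revision step just described can be illustrated on a finite set X with the powerset algebra. The following is only a minimal sketch under that assumption; the function name `conditionalize` and the dictionary representation of a measure are illustrative choices and are not part of the paper's formalism.

```python
from fractions import Fraction

def conditionalize(p, A):
    """Return q(.) = p(. | A) for a prior p given as {point: probability}.

    Assumes a finite X with the powerset algebra; A is any collection of points.
    """
    A = set(A)
    p_A = sum(p[x] for x in p if x in A)
    if p_A == 0:
        # Bayes' rule is undefined for evidence of prior probability zero.
        raise ValueError("cannot conditionalize on evidence of probability zero")
    return {x: (p[x] / p_A if x in A else Fraction(0)) for x in p}

# A prior on three elementary propositions and the evidence A = {a, c}.
prior = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
posterior = conditionalize(prior, {"a", "c"})
print(posterior)
```

Exact rational arithmetic is used so that equality of measures can later be tested without rounding issues.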
Why modal logic? We will see in Section 2 that it is very natural to regard the move from p to q in terms of modal logic: the core idea is to view A in Bayes' rule (1) as a variable and say that a probability measure q can be inferred from p if there exists an A in B such that q(•) = p(• | A). Equivalently, we will say in this situation that "q can be (Bayes) learned from p". That "it is possible to obtain/learn q from p" is clearly modal talk and calls for logical modeling in terms of the concepts of modal logic.
Bayesian belief revision is just a particular type of belief revision: various rules replacing Bayes' rule have been considered in the context of belief change (e.g. Jeffrey conditionalization, the maxent principle; see [20] and [5]), and there is a huge literature on other types of belief revision as well. Without aiming at completeness we mention: the AGM postulates in the seminal work of Alchourrón-Gärdenfors-Makinson [1]; dynamic epistemic logic [19]; van Benthem's dynamic logic for belief revision [18]; probabilistic logics, e.g. Nilsson [16]; and probabilistic belief logics [2]. For an overview we also refer to Gärdenfors [6]. Typically, in this literature beliefs are modeled by sets of formulas defined by the syntax of a given logic, and axioms about modalities are intended to prescribe how a belief represented by a formula should be modified when new information and evidence are provided.
Viewed from the perspective of such theories of belief revision our intention in this paper is very different: rather than trying to give a plausible set of axioms intended to capture desired features of statistical inference, we take the standard Bayes model and aim at an in-depth study of this model from a purely logical perspective. Our investigation is motivated by two observations. First, the logical properties of this type of belief change do not seem to have been studied in terms of the modal logic that we see emerging naturally in connection with Bayesian belief revision. Second, Bayesian probabilistic inference is relevant not only for belief change: Bayesian conditionalization is the typical and widely applied inference rule also in situations where probability is interpreted not as subjective degree of belief but as representing objective matters of fact. Finding out the logical properties of this type of probabilistic inference is thus of wide interest, going well beyond the confines of belief revision.
The structure of the paper is the following. After some motivation, in Section 2 the modal logic of Bayesian probabilistic inference (we call it "Bayes logic") is defined in terms of possible world semantics. The set of possible worlds will be the set of all probability measures on a measurable space (X, B). The accessibility relation among probability measures will be the "Bayes accessibility" relation, which expresses that the probability measure q is accessible from p if q(•) = p(• | A) for some A (Definition 2.1). We will see that probability measures on (X, B) with X having different cardinalities determine different Bayes logics. The inclusion relation of these Bayes logics is clarified by Theorem 4.1 in Section 4: the different Bayes logics are all comparable, and the larger the cardinality of X, the smaller the logic. The standard modal logical features of the Bayes logics are determined in Section 3 (see Proposition 3.1). In Section 5 we establish a connection between Bayes logics and the modal counterpart of Medvedev's logic of finite problems [14,15]. We will prove (Theorem 5.2) that the Bayes logic determined by the set of probability measures over (X, B) with a finite or countable X coincides with the strongest modal companion of Medvedev's logic. This entails that the Bayes logic determined by probability spaces on finite X (hence with finite Boolean algebras B) is not finitely axiomatizable (Proposition 5.9). This result is significant because it indicates that axiomatic approaches to belief revision might be severely limited. The paper [8] deals with finite nonaxiomatizability of Bayes logics over standard Borel spaces; however, it remains an open question whether general Bayes logics are finitely axiomatizable (Problem 5.11). Section 6 indicates future directions of research.

Motivation and Basic Definitions

χ = "it can be learned that the probability of A is at least 1/4 and at most 1/2" (6)
  = "it can be learned that φ" (7)

In view of the interpretation of Bayes' rule formulated in the Introduction, it is very natural to define χ to be true at probability measure p if there is a B in B such that the conditional probability measure q(•) = p(• | B) makes true the proposition

φ = "the probability of A is at least 1/4 and at most 1/2" (8)

where "true" is understood in the sense of Eq. 4; i.e. if for some B ∈ B we have

1/4 ≤ p(A | B) ≤ 1/2. (9)

Propositions such as χ in Eqs. 6-7 are obviously of modal character and it is thus very natural to express this modality formally using the modal operator ♦ by writing the sentence χ as ♦φ. In view of Eq. 7 the reading of ♦φ is "φ can be learned in a Bayesian manner". Thus we model Bayesian learning by specifying a standard unimodal language given by the grammar

a | ⊥ | ¬ϕ | ϕ ∧ ψ | ♦ϕ (10)

defining formulas ϕ, where a belongs to a nonempty countable set Var of propositional letters. As usual □ abbreviates ¬♦¬. (We refer to the books [3,4] concerning basic notions in modal logic.) Models of such a language are tuples M = ⟨W, R, [|·|]⟩, where W is a non-empty set, R a binary relation on W, and [|·|] : Var → ℘(W) is an evaluation of propositional letters. Truth of a formula ϕ at world w is defined in the usual way. By definition a formula ϕ is valid over a frame F, F ⊩ ϕ in symbols, if and only if it is true at every point in every model based on the frame. For a class C of frames the modal logic of C is the set of all modal formulas that are valid on every frame in C:

Λ(C) = {ϕ : (∀F ∈ C) F ⊩ ϕ}. (11)

We denote by M(X, B) the set of all probability measures over ⟨X, B⟩. M(X, B) is non-empty as the Dirac measures δ_x for x ∈ X always belong to M(X, B). We assume, without loss of generality, that elementary events {x} for x ∈ X always belong to the algebra B. It follows that for a finite or countably infinite X, B must be the powerset algebra ℘(X).
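To make the standard truth definition concrete, the following sketch evaluates formulas of a unimodal language on a finite Kripke model. It assumes the usual connectives (propositional letters, ⊥, ¬, ∧, ♦) with □ as the abbreviation ¬♦¬; the tuple encoding of formulas and the names `holds` and `box` are hypothetical conventions chosen only for this illustration. Bayes frames themselves are uncountable in general, so the example uses a toy three-point frame.

```python
def box(phi):
    """Box as the usual abbreviation: not-diamond-not."""
    return ("not", ("dia", ("not", phi)))

def holds(worlds, R, val, w, phi):
    """Truth of formula phi at world w in the model (worlds, R, val)."""
    op = phi[0]
    if op == "var":
        return w in val[phi[1]]
    if op == "bot":
        return False
    if op == "not":
        return not holds(worlds, R, val, w, phi[1])
    if op == "and":
        return (holds(worlds, R, val, w, phi[1])
                and holds(worlds, R, val, w, phi[2]))
    if op == "dia":
        # Diamond: phi[1] holds at some R-successor of w.
        return any((w, v) in R and holds(worlds, R, val, v, phi[1])
                   for v in worlds)
    raise ValueError(f"unknown connective: {op!r}")

# Toy model: a reflexive and transitive three-element chain 0 -> 1 -> 2,
# with the letter "a" true only at the last point.
worlds = [0, 1, 2]
R = {(u, v) for u in worlds for v in worlds if u <= v}
val = {"a": {2}}
a = ("var", "a")

dia_a_at_0 = holds(worlds, R, val, 0, ("dia", a))
box_a_at_0 = holds(worlds, R, val, 0, box(a))
box_a_at_2 = holds(worlds, R, val, 2, box(a))
print(dia_a_at_0, box_a_at_0, box_a_at_2)
```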
For a fixed ⟨X, B⟩ the set of possible worlds W is defined to be the set of probability measures M(X, B). Consider again the sentences

φ = "the probability of A is at least 1/4 and at most 1/2" (12)

The core idea of the semantics of the introduced modal language describing Bayesian statistical inference is the following:
• The intended interpretations of φ and ψ are the sets of probability measures at which they are true; for φ this is the set {p ∈ M(X, B) : 1/4 ≤ p(A) ≤ 1/2}.
• The intended interpretation of ♦φ is that "φ can be learned in a Bayesian manner": ♦φ is true at p if and only if there is a B ∈ B with p(B) ≠ 0 such that φ is true at p(• | B).

This intended interpretation suggests the following definition of the accessibility relation R on W = M(X, B):

Definition 2.1 (Bayes accessibility) For v, w ∈ M(X, B) we say that w is Bayes accessible from v if there is an A ∈ B with v(A) ≠ 0 such that w(•) = v(• | A). We denote the Bayes accessibility relation on M(X, B) by R(X, B).

We are now in a position to give the definition of one of the central concepts of this paper.

Definition 2.2 (Bayes frames) Let ⟨X, B⟩ be a measurable space. The structure

F(X, B) = ⟨M(X, B), R(X, B)⟩

is called a Bayes frame. In case B = ℘(X), we use the notation F_X = ⟨M_X, R_X⟩.

We define a Bayes model as a model M = ⟨M(X, B), R(X, B), [|·|]⟩ based on a Bayes frame F(X, B). The modal logic Λ(F(X, B)) corresponds then to the set of laws of Bayesian learning based on the frame F(X, B). The general laws of Bayesian learning independent of the particular representation ⟨X, B⟩ of the events is then the modal logic

BL = {φ : (∀ Bayes frames F) F ⊩ φ}. (19)

From the point of view of applications the most important classes of Bayes frames are the frames F_X with X = n (a finite ordinal) or X = ω (the smallest infinite ordinal). We will see that finiteness of X serves as a dividing line when defining the logic of Bayes frames. To indicate these frames we write F_n and F_ω.

Definition 2.3 (Bayes logics) We define a family of normal modal logics based on finite, countably infinite, countable, or all Bayes frames as follows:

BL_n = Λ(F_n) for n > 0,
BL_{<ω} = Λ({F_n : 0 < n < ω}),
BL_ω = Λ(F_ω),
BL_{≤ω} = Λ({F_n : 0 < n < ω} ∪ {F_ω}),
BL = Λ({F : F is a Bayes frame}).
We call BL_{<ω} (resp. BL_{≤ω}) the logic of finite (resp. countable) Bayes frames; however, observe that the set of possible worlds M(X, B) of a Bayes frame F(X, B) is finite if and only if X is a one-element set; otherwise it is at least of cardinality continuum.
One can easily check the following inclusions using the very definition of Bayes logics: BL ⊆ BL_{≤ω} ⊆ BL_{<ω} ⊆ BL_n for every n > 0, and BL_{≤ω} ⊆ BL_ω.
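For a finite X with the powerset algebra, Bayes accessibility between two given measures can be decided by brute force, trying every possible piece of evidence. The sketch below is only an illustration under that finiteness assumption; the names `conditionalize` and `bayes_accessible` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction
from itertools import combinations

def conditionalize(p, A):
    """Return p(. | A); assumes p(A) > 0."""
    A = set(A)
    p_A = sum(p[x] for x in p if x in A)
    return {x: (p[x] / p_A if x in A else Fraction(0)) for x in p}

def bayes_accessible(p, q):
    """True if q = p(. | A) for some A with p(A) != 0 (finite X, powerset algebra)."""
    points = list(p)
    for size in range(1, len(points) + 1):
        for A in combinations(points, size):
            if sum(p[x] for x in A) > 0 and conditionalize(p, A) == q:
                return True
    return False

u = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
v = conditionalize(u, {"a", "b"})                       # learned: "a or b"
t = {"a": Fraction(1, 3), "b": Fraction(2, 3), "c": Fraction(0)}

print(bayes_accessible(u, v))  # v arises from u by conditionalizing
print(bayes_accessible(v, u))  # probability of c cannot be restored
print(bayes_accessible(u, t))  # t changes the ratio of a to b
```

The last check reflects the fact that conditionalization preserves the ratios of probabilities inside the evidence.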

Modal Principles of Bayes Learning
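In this section we discuss the connections of Bayes logic to a list of modal axioms that are often considered in the literature. The axioms relevant below are

T: □φ → φ
4: □φ → □□φ
M: □♦φ → ♦□φ
Grz: □(□(φ → □φ) → φ) → φ

Let us recall some of the standard frame properties corresponding to these axioms (cf. [3] and [4]).

Logic and corresponding frame property:
S4 = K + T + 4: the accessibility relation is a preorder (reflexive and transitive).
S4.1 = S4 + M: preorders in which every point can access an end-point, i.e. a point that accesses only itself.
S4.Grz = S4 + Grz: preorders that contain no infinite path of pairwise distinct points.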
As Bayes logics were defined to be the modal logics of certain frames, these logics are normal modal logics (that is, they extend K).The next proposition establishes the connection between the Bayes logics and the usual frame properties.

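Proposition 3.1
The following statements hold:
(1) BL ⊇ S4, but BL ⊉ S4.1.
(2) BL_{≤ω} ⊇ S4.1.
(3) BL_{<ω} ⊇ S4.Grz, but BL_ω ⊉ S4.Grz.

Proof (1) Let F(X, B) = ⟨M(X, B), R(X, B)⟩ be an arbitrary Bayes frame. To check BL ⊇ S4 we need to show that R(X, B) is a preorder (reflexive and transitive). To simplify notation, we frequently write R instead of the longer R(X, B).
• Reflexivity: for every w ∈ M(X, B) we have w(•) = w(• | X), hence wRw.
• Transitivity: suppose u, v, w ∈ M(X, B) with uRv and vRw, i.e. there are A, B ∈ B such that v(•) = u(• | A) and w(•) = v(• | B). Then w(•) = u(• | A ∩ B), hence uRw.
We note that the accessibility relation is also antisymmetric: uRv and vRu imply u = v.

To see that BL ⊉ S4.1 it is enough to give an example of a Bayes frame that does not validate the axiom M. Consider the frame F([0, 1], B), where [0, 1] is the unit interval and B is the Borel σ-algebra. Let w be the Lebesgue measure. We claim that no probability measure accessible from w is an end-point, that is, every u with wRu can access some measure different from u; over preorders this contradicts the frame condition corresponding to M. For, suppose for some probability u we have wRu. Then u(•) = w(• | A) for some Borel set A with w(A) ≠ 0. Each Borel set A with non-zero Lebesgue measure contains a Borel subset B ⊂ A with a strictly smaller but non-zero Lebesgue measure: 0 < w(B) < w(A). It is easy to see that from u = w(• | A) we can R-access w(• | B), and since w(B) < w(A) we also have w(• | A) ≠ w(• | B).

(2) In order to show BL_{≤ω} ⊇ S4.1 it is enough to verify that for a countable measurable space ⟨X, B⟩, the frame F(X, B) has end-points in the following sense: for every w there is a u with wRu such that uRv implies u = v for all v.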
Pick an arbitrary w and let x ∈ X be such that w({x}) ≠ 0. Such an x must exist because X is countable. We claim that u = w(• | {x}) will be suitable. For H ∈ ℘(X) we have

w(H | {x}) = w(H ∩ {x}) / w({x}) = 1 if x ∈ H, and 0 otherwise.

Thus w(• | {x}) is the Dirac measure δ_x. If a measure is Bayes accessible from δ_x, then it must be absolutely continuous with respect to δ_x, and it is clear that δ_x is the only such probability measure.
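The end-point argument can be checked mechanically on a small example. The sketch below assumes a three-element X with the powerset algebra; `conditionalize` and `accessible_from` are hypothetical helper names introduced only for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def conditionalize(p, A):
    """Return p(. | A); assumes p(A) > 0."""
    A = set(A)
    p_A = sum(p[x] for x in p if x in A)
    return {x: (p[x] / p_A if x in A else Fraction(0)) for x in p}

def accessible_from(p):
    """All distinct measures Bayes accessible from p (finite X, powerset algebra)."""
    points = list(p)
    found = []
    for size in range(1, len(points) + 1):
        for A in combinations(points, size):
            if sum(p[x] for x in A) > 0:
                q = conditionalize(p, A)
                if q not in found:
                    found.append(q)
    return found

w = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
u = conditionalize(w, {1})      # conditionalizing on an elementary event
print(u)                        # the Dirac measure concentrated at 1
print(accessible_from(u) == [u])
```

Conditionalizing on a single point of non-zero probability yields a Dirac measure, and the only measure accessible from it is the Dirac measure itself, as the proof states.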
(3) Next, let us verify BL <ω ⊇ S4.Grz.To this end it is enough to show that no Bayes frame F(X, B) with a finite X can contain an infinite R(X, B)-path.But this follows from the fact that finiteness of X implies finiteness of B = ℘ (X), whence there are only finitely many elements in B that can serve as possible evidence for conditionalizing a probability.
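The absence of infinite paths over a finite X can also be observed computationally. The following sketch, which assumes the powerset algebra and uses the hypothetical helper names `successors` and `longest_path`, computes the longest path of pairwise distinct measures starting from a given prior. Each proper step strictly shrinks the support, so the recursion terminates.

```python
from fractions import Fraction
from itertools import combinations

def conditionalize(p, A):
    """Return p(. | A); assumes p(A) > 0."""
    A = set(A)
    p_A = sum(p[x] for x in p if x in A)
    return {x: (p[x] / p_A if x in A else Fraction(0)) for x in p}

def successors(p):
    """Measures Bayes accessible from p and different from p."""
    support = [x for x in p if p[x] > 0]
    # Only proper non-empty subsets of the support give a new measure.
    return [conditionalize(p, A)
            for size in range(1, len(support))
            for A in combinations(support, size)]

def longest_path(p):
    """Number of points on a longest path of distinct measures starting at p."""
    return 1 + max((longest_path(q) for q in successors(p)), default=0)

uniform3 = {x: Fraction(1, 3) for x in range(3)}
uniform4 = {x: Fraction(1, 4) for x in range(4)}
print(longest_path(uniform3), longest_path(uniform4))
```

In this sketch a longest path from a prior of full support visits as many measures as X has elements, which is in line with the finiteness argument above.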
Finally, we prove F_ω ⊮ Grz (thus BL_ω ⊉ S4.Grz). Let w ∈ M(N, ℘(N)) be a measure such that for all x ∈ N we have w({x}) ≠ 0. Fix a sequence A_i = N − {0, ..., i} for i ∈ N. Then the infinite chain

w R w(• | A_0) R w(• | A_1) R w(• | A_2) R ...

of pairwise distinct measures shows the failure of the Grzegorczyk axiom Grz in F_ω.

Inclusions Between Bayes Logics
Recall the inclusions that follow directly from the definition of Bayes logics: BL ⊆ BL_{≤ω} ⊆ BL_{<ω} ⊆ BL_n for every n > 0, and BL_{≤ω} ⊆ BL_ω. In this section we prove the following theorem:

Theorem 4.1 BL ⊊ BL_{≤ω} = BL_ω ⊊ BL_{<ω} ⊊ BL_{n+1} ⊊ BL_n for all n > 0.

Some of the inclusions in the above theorem follow from Proposition 3.1. For instance BL ⊊ BL_{≤ω} is witnessed by S4.1 ⊆ BL_{≤ω} but S4.1 ⊈ BL. To prove the other inclusions, we establish several lemmas first.
For two frames F = ⟨W, R⟩ and G = ⟨W′, R′⟩ we write F ⊑ G if F is (isomorphic as a frame to) a generated subframe of G. We recall that if F ⊑ G, then G ⊩ φ implies F ⊩ φ, whence Λ(G) ⊆ Λ(F) (see Theorem 3.14 in [3], where the symbol ↣ was used instead of ⊑).

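Lemma 4.2 F_n ⊑ F_{n+k} and F_n ⊑ F_ω for all n, k > 0.

Proof Let α : M_n → M_{n+k} be the mapping that regards a probability measure on n as a probability measure on n + k vanishing outside n, i.e. α(w)(H) = w(H ∩ n) for H ⊆ n + k. It can be checked that α establishes F_n ⊑ F_{n+k}. The case F_n ⊑ F_ω is similar.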
To see why the proper inclusion BL_{n+k} ⊊ BL_n holds we need some preparation. In a frame F = ⟨W, R⟩ a sequence x_0, x_1, ..., x_k is called a path if x_i R x_{i+1} for i < k and x_i ≠ x_j for i ≠ j. The length of a path is the number of the x_i's in the sequence. Define by recursion the following formulas π_n: […]

Let F = ⟨W, R⟩ be a frame, M = ⟨F, [|·|]⟩ be a model, and x ∈ W.
• M, x ⊩ π_n only if there is in F a path of length n starting from x.
• If there is in F a path of length n starting from x, then there is an evaluation [|·|] such that M, x ⊩ π_n.

Proof The proof is not hard and is left to the reader; we only visualize the idea of the proof using Fig. 1.

It is clear that if F_X is a Bayes frame with a finite X, then there are only finitely many elements in B that can serve as possible evidence for conditionalizing a probability. From this it follows that in these finite cases the maximal length of a path in F_X is smaller than the cardinality of the power set ℘(X). Therefore, for every n < m there exists k such that ¬π_k is valid on F_n but not on F_m. This proves BL_m ≠ BL_n.

Connection to the Modal Logic of Medvedev Frames
We start by recalling the notion of Medvedev frames. Such frames originate in intuitionistic logic; for an overview of the history we refer to the book [4] and to Shehtman [17]. The main purpose of this section is to establish a correspondence between Bayes logics and the modal logics of Medvedev frames.

Definition 5.1 (Medvedev frames) A Medvedev frame is a frame that is isomorphic (as a directed graph) to ⟨℘(X) \ {∅}, ⊇⟩ for a non-empty finite set X.
For convenience, as a slight abuse of notation, we will call every frame of the form ⟨℘(X) \ {∅}, ⊇⟩ (X being finite or infinite) a Medvedev frame and we will use the notation

P^0_X = ⟨℘(X) \ {∅}, ⊇⟩.

A hierarchy of normal modal logics that correspond to the frames P^0_X can be given:

ML_α = Λ(P^0_α) for a non-zero cardinal α,
ML_{<ω} = Λ({P^0_n : 0 < n < ω}),
ML = Λ({P^0_α : α a non-zero cardinal}).

Observe that for cardinals α < β we have P^0_α ⊑ P^0_β, consequently ML_β ⊆ ML_α. Since there are countably many modal formulas and proper class many cardinals, there must exist a cardinal α_0 such that the sequence ML_α stabilizes, i.e. ML = ML_{α_0}, or equivalently for all β ≥ α_0 we have ML_β = ML_{α_0}.
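A finite Medvedev frame is easy to build explicitly. The sketch below constructs the non-empty subsets of a finite X ordered by ⊇ and checks a few of the properties used in this section; the function name `medvedev_frame` and the representation of points as frozensets are assumptions made for the illustration.

```python
from itertools import combinations

def medvedev_frame(X):
    """Return (points, R): the non-empty subsets of X with H R K iff H is a superset of K."""
    elements = sorted(X)
    points = [frozenset(c)
              for size in range(1, len(elements) + 1)
              for c in combinations(elements, size)]
    R = {(H, K) for H in points for K in points if H >= K}
    return points, R

points, R = medvedev_frame({0, 1, 2})

# End-points are the points that access only themselves: the singletons.
end_points = [H for H in points
              if all(K == H for K in points if (H, K) in R)]

# The non-empty subsets of {0, 1} are closed under R-successors, so the
# smaller Medvedev frame sits inside the larger one as a generated subframe.
small = frozenset({0, 1})
closed = all(K <= small for (H, K) in R if H <= small)

print(len(points), sorted(len(H) for H in end_points), closed)
```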
The main result of this section is the following theorem:

Theorem 5.2 Countable Bayes logics and the modal logics of countable Medvedev frames coincide.
We prove Theorem 5.2 through a series of lemmas.
Lemma 5.3 P^0_X ⊑ F_X for every finite or countably infinite set X.

Proof Take any w ∈ M_X with full support supp(w) = X, and consider the subframe F_w = ⟨W, R⟩ of F_X generated by w. Every element of W is of the form w(• | H) for some H ⊆ X with w(H) ≠ 0, i.e. for some non-empty H. Therefore each element v ∈ W can be identified with a non-empty subset H ⊆ supp(w). It is fairly easy to check that the mapping H ↦ w(• | H) establishes an isomorphism between F_w and P^0_X, which completes the proof.

Lemma 5.3 implies BL_ω ⊆ ML_ω and BL_n ⊆ ML_n for all n > 0 and therefore BL_{<ω} ⊆ ML_{<ω}. Next, we want to establish the converse inclusions.
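The identification of the generated subframe with a Medvedev frame can be verified by brute force for a small X. The sketch below assumes a three-element X, the powerset algebra, and a prior of full support; all function and variable names are hypothetical and serve only the illustration.

```python
from fractions import Fraction
from itertools import combinations

def conditionalize(p, A):
    """Return p(. | A); assumes p(A) > 0."""
    A = set(A)
    p_A = sum(p[x] for x in p if x in A)
    return {x: (p[x] / p_A if x in A else Fraction(0)) for x in p}

def nonempty_subsets(X):
    elements = sorted(X)
    return [frozenset(c)
            for size in range(1, len(elements) + 1)
            for c in combinations(elements, size)]

def bayes_accessible(p, q):
    """True if q = p(. | A) for some A with p(A) != 0."""
    return any(sum(p[x] for x in A) > 0 and conditionalize(p, A) == q
               for A in nonempty_subsets(p))

# A prior with full support on X = {0, 1, 2}.
w = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
subsets = nonempty_subsets(w)
image = {H: conditionalize(w, H) for H in subsets}   # the map H -> w(. | H)

# The map is injective ...
injective = len({tuple(sorted(q.items())) for q in image.values()}) == len(subsets)
# ... and H is a superset of K exactly when w(. | K) is accessible from w(. | H).
order_preserved = all((H >= K) == bayes_accessible(image[H], image[K])
                      for H in subsets for K in subsets)
print(injective, order_preserved)
```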
Let F ↠ G denote a surjective, bounded morphism between frames F and G. Recall that if F ↠ G, then F ⊩ φ implies G ⊩ φ, whence Λ(F) ⊆ Λ(G) (see Theorem 3.14 in [3]). We also recall that (∀i) F_i ⊩ φ implies ⊎_i F_i ⊩ φ (for the definition of the disjoint union of frames see Definition 3.13 in [3]). In the special case when F_i = F it follows that Λ(F) ⊆ Λ(⊎ F) (Theorem 3.14 in [3]).
Note that neither F_X ⊑ P^0_X nor P^0_X ↠ F_X can hold if X is finite because the underlying set M_X of F_X has the cardinality of the continuum (for |X| > 1) while ℘(X) is finite.
So far we have proved BL_ω = ML_ω, BL_{<ω} = ML_{<ω} and BL_n = ML_n for all n > 0. To complete the proof of Theorem 5.2 it remains to show BL ⊊ ML.

As for the inclusion BL ⊆ ML recall that there is a cardinal α_0 such that ML = ML_{α_0}. It is enough to find a Bayes frame F such that P^0_{α_0} ⊑ F because in such a case we obtain

BL ⊆ Λ(F) ⊆ Λ(P^0_{α_0}) = ML_{α_0} = ML.

The construction of such a frame is interesting on its own and for this reason is postponed to Proposition 5.6.

Putting together all the previous lemmas we arrive at Theorem 5.2: BL_n = ML_n for all n > 0, BL_{<ω} = ML_{<ω}, BL_ω = ML_ω, and BL ⊊ ML.

Though we established BL ≠ ML, the two logics are "close" to each other in the sense of the following proposition.

Proposition 5.6 The logic of each Bayes frame can be dominated by the logic of a Medvedev frame, and vice versa.

Proof #1: Proving that for all F = F(X, B) there exists P^0 = P^0_Y such that Λ(F) ⊆ Λ(P^0): Take any F(X, B) and let Y ⊆ X be a finite, non-empty subset. Let v ∈ M(X, B) be a probability measure such that supp(v) = Y. Then the subframe F_v generated by v is isomorphic (as a directed graph) to P^0_Y (cf. the proof of Lemma 5.3), whence P^0_Y ⊑ F(X, B). This implies Λ(F(X, B)) ⊆ Λ(P^0_Y), as desired.

#2: Proving that for all P^0 = P^0_Y there exists F = F(X, B) such that Λ(P^0) ⊆ Λ(F): The proof is similar to that of Lemma 5.4. Take any P^0_Y and let X ⊆ Y be a finite, non-empty subset. We need the following lemma:

Lemma 5.7 […]

Proof ML_{<ω} is complete with respect to the set of (finite) Medvedev frames by definition, and BL_{<ω} = ML_{<ω} by Theorem 5.2.
An immediate consequence is that BL <ω is complete with respect to a recursive set of finite frames.Therefore, non-validities can be witnessed by finite counterexamples.
The most remarkable consequence of the identification of Bayes logics with the modal logics of Medvedev frames concerns the (non-)axiomatizability properties of Bayes logics:

Proposition 5.9 The modal logics BL_{<ω} and BL_ω of Bayes frames over, respectively, finite or countably infinite probability spaces are not finitely axiomatizable.
The previous proposition is philosophically significant: it tells us that there is no finite set of formulas from which all general laws of Bayesian belief revision and Bayesian learning based on probability spaces with a finite set of propositions can be deduced. Bayesian learning and belief revision based on such simple probability spaces are among the most important instances of probabilistic updating because they are widely used in applications. Proposition 5.9 says that the logic of such very basic belief revisions cannot be captured by a finite set of axioms. If the axiomatic approach to belief revision is not capable of characterizing the logic of the simplest, paradigm form of belief revision, then this casts doubt on the general enterprise that aims at axiomatizations of belief revision systems.

Countable Bayes logics can be characterized not only by the modal logic of Medvedev frames but also by that of Kubiński frames: Łazarz [12] proved that Medvedev's and Kubiński's logics coincide. Taking into account Theorem 5.2, Łazarz's result provides a lattice characterization of countable Bayes logics. For the necessary definitions we refer to [12].
As mentioned above, ML is not finitely axiomatizable. The proper inclusion BL ⊊ ML in Theorem 5.2 raises the following problem, which also remains open.

Problem 5.11 Is BL finitely axiomatizable?
We noted at the beginning of Section 5 that there exists a least cardinal α_0 such that ML = ML_{α_0}. The exact value of α_0 is not known.

Problem 5.12 What is the exact value of α_0?

Closing Words and Further Research Directions
In addition to standard Bayes conditionalization there are other Bayesian methods of updating a probability measure that extend the standard one: Jeffrey's conditionalization and conditionalization based on the concept of conditional expectations (cf. [5,9,10]).
Let us first recall Jeffrey's conditionalization. Suppose p ∈ M(X, B) is a prior probability, {E_i}_{i<n} is a finite partition of X with p(E_i) ≠ 0 for all i, and we are given a probability measure r : A → [0, 1], called the uncertain evidence, on the subalgebra A of B generated by this partition. The Bayesian agent updates his prior probability p using the evidence r to get the posterior probability q defined by the "Jeffrey rule":

q(H) = Σ_{i<n} p(H | E_i) r(E_i) for all H ∈ B. (49)

Given two measures p, q ∈ M(X, B) one can define Jeffrey accessibility in a manner similar to Bayes accessibility: q is Jeffrey accessible from p if there is a partition {E_i}_{i<n} and uncertain evidence r such that Eq. 49 holds.

Jeffrey's conditionalization is just a special case of the general conditionalization based on the concept of conditional expectation, introduced already by Kolmogorov [11] (see [9] as well). Let S be the Borel σ-algebra of R. Recall that for p ∈ M(X, B) and a σ-subalgebra A ⊆ B the conditional expectation E_p(f | A) : X → R is any (A, S)-measurable function that satisfies (50) below for all p-integrable (B, S)-measurable f:

∫_Z E_p(f | A) dp = ∫_Z f dp for all Z ∈ A. (50)

Such a function exists and is unique p-almost everywhere. Let dq/dp : X → R denote the Radon-Nikodym derivative of q with respect to p. We say that q can be inferred from p using general conditionalization if q is absolutely continuous with respect to p and there is a σ-subalgebra A of B such that […] for all H ∈ B. If q can be inferred from p in this way, we say that q is generally Bayes accessible from p.
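The Jeffrey rule recalled above can be illustrated on a finite X with the powerset algebra. The sketch below takes the rule in its standard form, q(H) = Σ_i p(H | E_i) r(E_i), evaluated pointwise; the function name `jeffrey_update` and the list-based encoding of the partition and of the uncertain evidence are assumptions made for the illustration.

```python
from fractions import Fraction

def jeffrey_update(p, partition, r):
    """Jeffrey conditionalization on a finite X.

    p         : prior as {point: probability}
    partition : list of disjoint sets covering X, each of non-zero prior probability
    r         : list with the new probabilities r(E_i) of the cells, summing to 1
    """
    if sum(r) != 1:
        raise ValueError("the uncertain evidence must be a probability measure")
    q = {}
    for cell, weight in zip(partition, r):
        p_cell = sum(p[x] for x in cell)
        if p_cell == 0:
            raise ValueError("every cell must have non-zero prior probability")
        for x in cell:
            # q({x}) = p({x} | E_i) * r(E_i) for the cell E_i containing x
            q[x] = p[x] / p_cell * weight
    return q

prior = {x: Fraction(1, 4) for x in (1, 2, 3, 4)}
partition = [{1, 2}, {3, 4}]

q = jeffrey_update(prior, partition, [Fraction(3, 4), Fraction(1, 4)])
print(q)

# Certain evidence r(E_0) = 1 reduces the rule to conditionalizing on E_0.
certain = jeffrey_update(prior, partition, [Fraction(1), Fraction(0)])
print(certain)
```

The second call shows in which sense standard Bayes conditionalization is the special case of certain evidence.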
One can now define the modal logics based on Bayes frames F(X, B), where the accessibility relation is replaced with either Jeffrey accessibility or with the more general accessibility using conditional expectations. The basic logical properties of Jeffrey accessibility have been studied in the paper [7], and frame properties of accessibility using conditional expectations have been investigated in [9]. It has been proven in [7] that the modal logic of Jeffrey accessibility is also not finitely axiomatizable in the finite or countably infinite case; however, we do not yet have results about decidability questions.

2 Motivation and Basic Definitions
deals with finite nonaxiomatizability of Bayes logics over standard Borel spaces; however, it remains an open question whether general Bayes logics are finitely axiomatizable (Problem 5.11).Section 6 indicates future directions of research.
If X ⊇ Y, then P^0_X ↠ P^0_Y. Any surjection f : X → Y can be lifted up to a surjection f⁺ : ℘(X) \ {∅} → ℘(Y) \ {∅} by taking images, f⁺(H) = {f(x) : x ∈ H}. It can be checked that f⁺ is a bounded morphism P^0_X ↠ P^0_Y.

With F_X = ⟨M_X, R_X⟩, X being finite, following the proof of Lemma 5.4 one obtains P^0 […]

Consequences

Recall that if ⟨X, B⟩ is a finite probability space (with |X| > 1), then the set of probability measures M(X, B) has cardinality continuum. Therefore Bayes frames F(X, B) over finite probability spaces are uncountable. Thus it is surprising that despite the uncountability of Bayes frames the corresponding logic has the finite frame property:

Proposition 5.8 The modal logic BL_{<ω} of Bayes frames over a finite probability space has the finite frame property.
It is a longstanding hard open problem whether there are recursive axiomatizations for any of the logics ML_α (α infinite) or ML_{<ω} (and thus BL_{<ω}); cf. [4], Chapter 2. New logical systems did not shed light on this problem. Since the class of (finite) Medvedev frames is a recursive class of finite frames, BL_{<ω} is co-recursively enumerable. It follows that if ML_{<ω} is recursively axiomatizable, then BL_{<ω} is decidable. In the light of Theorem 5.2 we raise the following open problem: are BL_{<ω} or BL recursively axiomatizable?