Truth-Tracking by Belief Revision

We study the learning power of iterated belief revision methods. Successful learning is understood as convergence to correct, i.e., true, beliefs. We focus on the issue of universality: whether or not a particular belief revision method is able to learn everything that in principle is learnable. We provide a general framework for interpreting belief revision policies as learning methods. We focus on three popular cases: conditioning, lexicographic revision, and minimal revision. Our main result is that conditioning and lexicographic revision can drive a universal learning mechanism, provided that the observations include only and all true data, and provided that a non-standard, i.e., non-well-founded prior plausibility relation is allowed. We show that a standard, i.e., well-founded belief revision setting is in general too narrow to guarantee universality of any learning method based on belief revision. We also show that minimal revision is not universal. Finally, we consider situations in which observational errors (false observations) may occur. Given a fairness condition, which says that only finitely many errors occur, and that every error is eventually corrected, we show that lexicographic revision is still universal in this setting, while the other two methods are not.


Introduction
At the basis of the modeling of intelligent behavior lies the idea that agents integrate new information into their prior beliefs and knowledge. Intelligent agents are assumed to be endowed with some learning methods, which allow them to change their beliefs on the basis of assessing new information. But how effective is an agent's learning method in eventually finding the truth? To make this question precise and to answer it, we borrow concepts from formal learning theory and adapt them to the commonly used model of beliefs, knowledge, and belief change, namely that of possible worlds.
A set S of possible worlds, which we will call the state space, together with a family O of observable propositions, represents the agent's epistemic space, or knowledge state. Note that the sets S and O do not have to be finite, but are assumed to be at most countable. Intuitively, the epistemic space represents the uncertainty range of the agent. She can consider some possible worlds to be more plausible than others. This is represented by a total preorder on possible worlds, called a plausibility preorder. It captures the agent's assessments concerning which of any two worlds s, s′ is more likely to be the actual one. Such an assessment can be based on many different factors, in particular on the assessed level of simplicity, or on consistency with previous observations. An epistemic space together with a plausibility preorder is called a plausibility space.
In order to represent the dynamic aspects of knowledge and belief we will use belief revision methods, which, triggered by incoming information, change the epistemic (plausibility) space. The change can occur through removal of the states incompatible with the new information, or through a revision of the plausibility relation. Many belief revision policies proposed in the literature are formulated, or can be reconstructed, within our setting. In this paper we investigate three basic policies: conditioning, lexicographic revision [38,39], and minimal revision [18]. The goal is to see how they can be viewed as learning methods, and to investigate their learning power, i.e., the ability to identify the real world from the incoming information.
We obtain our results by defining learning methods based on belief revision policies. We show that learning from positive data via conditioning and lexicographic revision is universal, i.e., that those learning methods can uniformly identify the real world in the limit, when starting in any epistemic space in which the real world is identifiable (via any learning method). However, that happens only if the agent's prior plans/dispositions for belief revision are suitably chosen; and not every such prior is suitable. Furthermore, we show that the most conservative belief revision method, minimal revision, is not universal.
Our approach brings together methods of formal learning theory [FLT, see, e.g., 37] and Dynamic Epistemic Logic [DEL, see 7,8,14,22]. The interest in bringing together learning theory and belief revision theory has appeared before within at least two lines of research. Firstly, in [30-34] some classical belief revision policies were treated as learning strategies. Secondly, in [35,36] the connection has been rooted in the classical AGM framework [1]. Finally, in [23-27] the set learning paradigm (also called language (or set) learning) has been connected with epistemic and doxastic logics of belief revision [3,9-11,13,21]. The present paper is a continuation of the latter line of research, and is in fact a thorough presentation of results announced in a previously published extended abstract [5].
We are chiefly concerned with the counterpart of one of the central notions in formal learning theory, namely identifiability in the limit [28]. Hence, we focus on stabilizing to a correct belief. We thus investigate the reliability of mind-changing strategies, i.e., the possibility of converging to an accurate hypothesis after a finite number of mind-changes.

Notation and Basic Definitions
The agent is represented by her epistemic space, i.e., a range of possible worlds that satisfy relevant observables.
Definition 1. Let S be a set of possible worlds and let O ⊆ P(S) be a set of observable propositions (also called 'observables'). We will assume that both sets are possibly infinite, but at most countable, and that there are no two possible worlds that make exactly the same propositions true. The pair S = (S, O) is then called an epistemic space.
The epistemic space represents an agent who does not favor any possibility over any other. We break that symmetry by introducing a plausibility relation, ⪯.
Definition 2. Let S = (S, O) be an epistemic space, and let ⪯ ⊆ S × S be a total preorder. The structure B_S = (S, O, ⪯) is called a plausibility space.
Since we allow for the epistemic space to be infinite, the question of well-foundedness of the plausibility space becomes very relevant. We do not restrict our considerations to well-founded spaces, which we call standard plausibility spaces due to their popularity in the belief revision literature.
Definition 3. A standard plausibility space B_S = (S, O, ⪯) is one whose plausibility relation is well-founded (i.e., there is no infinite descending chain s_0 ≻ s_1 ≻ ... ≻ s_n ≻ ..., where ≺ is the strict plausibility relation, given by: s ≺ t iff s ⪯ t and not t ⪯ s).

Knowledge and Belief in Epistemic Spaces
Let us briefly discuss the interpretation of knowledge and belief in our setting. Statements about knowledge and belief are not taken to be observables, but are meaningful within the epistemic spaces. Let us take a plausibility space B_S = (S, O, ⪯). Given a proposition p ⊆ S, we say that the agent knows that p, Kp, if and only if S ⊆ p. In other words, knowledge is a global modality: we say that the agent knows p iff p is true in all possible worlds of the plausibility space B_S. In such case we will write B_S |= Kp.
As we mentioned before, most of the epistemic doxastic logic and belief revision literature deals with standard, i.e., well-founded plausibility structures. The well-foundedness assumption has at least two advantages. Firstly, it allows us to canonically assign ordinal numbers to states [so-called Spohn ordinals or 'degrees of implausibility', see 39]. Secondly, it leads to a simple definition of belief, which can then be understood as 'truth in all the most plausible worlds'. In any standard plausibility space B_S = (S, O, ⪯), the agent believes p, Bp, if and only if min_⪯ S ⊆ p, where for any set X ⊆ S, min_⪯ X is the set of all most plausible worlds in X, defined as {t ∈ X | t ⪯ s for all s ∈ X}.⁴ For simplicity, given any B_S = (S, O, ⪯), we will write min B_S, meaning min_⪯ S. Obviously, this definition is equivalent to min B_S ⊆ p in well-founded plausibility spaces.
We do not assume well-foundedness because, as we will show, it prevents one from learning what one could have learned with a non-well-founded plausibility relation. To make the belief operator meaningful also in the non-well-founded cases, we will say that the agent believes that p, Bp, if and only if ∃w ∀u ⪯ w (u ∈ p), i.e., there is a world w such that p holds in every world that is at least as plausible as w. In such case we will write B_S |= Bp.
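For finite (hence well-founded) spaces the two notions above can be computed directly. The following is a minimal sketch under our own assumptions: the plausibility preorder is encoded by integer ranks (lower rank means more plausible), propositions are Python sets of world names, and the function names are illustrative rather than part of the formal framework.

```python
def most_plausible(rank, worlds):
    """min X: the worlds of X that are at least as plausible as every world of X."""
    worlds = set(worlds)
    if not worlds:
        return set()
    best = min(rank[w] for w in worlds)
    return {w for w in worlds if rank[w] == best}


def knows(rank, p):
    """Kp: p holds in every world of the space (the keys of `rank`)."""
    return set(rank) <= set(p)


def believes(rank, p):
    """Bp in the well-founded case: p holds in all most plausible worlds."""
    return most_plausible(rank, rank) <= set(p)


# Hypothetical three-world space with s < t < u in plausibility.
prior = {"s": 0, "t": 1, "u": 2}
p = {"s", "u"}
print(believes(prior, p), knows(prior, p))
```

In this toy space the agent believes p, because the single most plausible world s satisfies it, but does not know p, because t does not. The rank encoding cannot represent non-well-founded relations, which is exactly the case where the general ∃∀ definition of belief is needed.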

Observable Propositions
An observable proposition is identified with the set of those possible worlds which make it true. These propositions can be empirically encountered (observed) by an agent and hence can be viewed as data or evidence for learning: an agent can witness them. This does not mean that they are all observable at the same time. We will assume that each step of learning consists of only one observation.⁵
⁴ It is easy to see that, if ⪯ is well-founded, then min_⪯ X ≠ ∅ whenever X ≠ ∅.
⁵ In general, the set of observables can be closed under certain operations, e.g., under negation (if 'negative data' are observed), or under finite intersection (if 'conjunctions' are observed). Under the usual possible-world interpretation, O can be viewed as the set of (atomic) propositions, or, if the stress is put on closure under certain operations (e.g., negation or conjunction), as a set encoding the valuation for a relevant logical language.
The agent revises her beliefs (plausibility space) in response to the observations. We assume that the agent is inductively presented with an endless stream of observables.
The intuition behind the streams of data is that at stage i, the agent observes the information in O i . A data stream captures a possible future history of observations in its entirety, while a data sequence captures only a finite part of such a history. A data stream is sound iff it presents only true observables. A data stream is complete iff it presents every true observable property. In most of this paper we assume the data streams to be sound and complete with respect to the actual world, i.e., all observed data is true, and all true data will eventually be observed. 6 Those are the most favorable conditions for learning (in the limit) the whole truth about the identity of the actual world. Nonetheless, learning from such data is not trivial, as will be seen below.
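Soundness can be checked on any finite data sequence, whereas completeness is a property of the whole infinite stream. As a minimal sketch, assuming observables are given as a dictionary from (hypothetical) names to sets of worlds, one can test soundness of a prefix and whether the prefix has already presented every observable true in a world:

```python
def true_observables(world, observables):
    """Names of all observables that hold in `world`."""
    return {name for name, extension in observables.items() if world in extension}


def is_sound(sequence, world, observables):
    """Every observed proposition is true in `world`."""
    return all(world in observables[name] for name in sequence)


def covers_all_true(sequence, world, observables):
    """The finite sequence already contains every observable true in `world`."""
    return true_observables(world, observables) <= set(sequence)


# Illustrative space: p = {s1, s3}, q = {s2, s3}.
observables = {"p": {"s1", "s3"}, "q": {"s2", "s3"}}
print(is_sound(("p", "q"), "s3", observables))      # sound for s3
print(is_sound(("p", "q"), "s1", observables))      # q is false in s1
print(covers_all_true(("p",), "s3", observables))   # q not yet observed
```

A stream is complete for a world exactly when, for every true observable, some finite prefix contains it; with finitely many observables this amounts to some prefix passing `covers_all_true`.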

Learning and Belief Revision Methods
In this section, we introduce a formal framework for belief change, learning functions, and belief revision operators. Depending on the epistemic space, a learning function responds to a given data sequence with a new conjecture.
Definition 8. Let S = (S, O) be an epistemic space and let σ be a data sequence. A learning method is a function L that on the input of S and data sequence σ outputs some set of worlds L(S, σ) ⊆ S, called a conjecture.
Learning methods can have various properties; for instance, the learner can be forgetful or conservative in revising her conjectures. Below we list several properties of this type.
For the reader familiar with belief revision theory, let us now briefly compare the above properties with the well-known AGM postulates [1]. Weak data retention means that the current conjecture always entails the most recently observed data. If we interpret conjectures as beliefs, this intuitively corresponds to the AGM Success Postulate. Strong data retention says that the current conjecture always accounts for all data that have been encountered till now. Between those two extremes, a learner with bounded memory could retain some fixed finite amount of data. Conservativity requires that the agent sustains the same conjecture whenever the new piece of data is already entailed by her old conjecture. A learning method is memory-free if, at each stage, the new conjecture depends only on the previous conjecture and the new datum. 7 As we will see later, this assumption poses severe problems for iterated belief revision. In fact, some standard belief revision methods implementing the AGM postulates are not memory-free: the new belief depends in addition on some hidden parameter, namely the old plausibility relation.
We now turn to belief revision methods. In our logical setting, they are transformations of plausibility spaces triggered by the incoming data.
Definition 10. A one-step revision method is a function R 1 that, for any plausibility space B S and any observable p ∈ O, outputs a new plausibility space R 1 (B S , p).
An (iterated) belief revision method R is obtained by iterating a one-step revision method R_1: R(B_S, λ) = B_S for the empty data sequence λ, and R(B_S, σ * p) = R_1(R(B_S, σ), p).
Now we show how to define learning functions in terms of belief revision. First, one must assign an initial plausibility order to each epistemic space to turn it into a plausibility space.
Definition 11. Let S = (S, O) be any epistemic space. A prior plausibility assignment plaus is a map that assigns to S some plausibility relation ⪯ on S, thus converting it into a plausibility space plaus(S) = (S, O, ⪯).
Then one can apply the belief revision operation to each new datum and recover the resulting belief state as the set of all worlds that are minimal (most plausible) in the revised plausibility space.
Definition 12. Every belief revision method R, together with a prior plausibility assignment plaus, generates a learning method L^plaus_R, called a belief revision-based learning method, given by: L^plaus_R(S, σ) := min R(plaus(S), σ).
The previously defined properties of learning functions (Definition 9) can now be applied to belief revision methods:
Definition 13. A belief revision method R is called weakly data-retentive (strongly data-retentive, conservative, or data-driven) iff for any prior plausibility assignment plaus, the induced learning method L^plaus_R is weakly data-retentive (strongly data-retentive, conservative, or data-driven).
The properties that belief revision methods inherit from their corresponding learning methods can be characterised in terms of belief.
For (3), let R be a conservative belief revision method, i.e., L^plaus_R is conservative, for any plaus. Let us take a plaus, S, σ, and [...] For the left to right direction, assume that R(plaus(S), σ) |= Bq. Since [...]
We also define two additional properties that are specific to belief revision: strong conservativity and history independence.
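Definition 14. A belief revision method R is:
(a) strongly conservative iff for any B_S and p ∈ O, B_S |= Bp implies R_1(B_S, p) = B_S, i.e., if the new piece of data was already believed, R does not change the plausibility space at all.
(b) history-independent iff for any B_S, p ∈ O, and any data sequences σ, π: R(B_S, σ) = R(B_S, π) implies R(B_S, σ * p) = R(B_S, π * p), i.e., R's output at any stage depends only on the previous output and the most recently observed data.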
Strongly conservative belief revision methods not only keep the old conjecture the same (as in the case of conservative learning methods), but, when receiving truthful information, they do not change anything within the plausibility space. As we will see below, the classical belief revision methods are not necessarily strongly conservative. However, every iterated one-step belief revision method must be history-independent. History-independent methods do not require that the agent retains all past events: only the last plausibility space and the new datum are enough to determine the next plausibility space. However, as we will show in the next section, the corresponding learning is not necessarily memory-free.

Some Iterated Belief Revision Methods
Below we consider three basic belief revision methods that received considerable attention in dynamic epistemic logic and belief revision theory.
Conditioning. First we focus on the revision by conditioning [38,39], also called update in DEL [11,13]. This method operates by deleting those worlds that do not satisfy the newly observed data.
Definition 15. Conditioning, Cond, is an (iterated) belief revision method based on the one-step belief revision method Cond_1 that takes B_S = (S, O, ⪯) and p ∈ O, and outputs a restriction of B_S to p, i.e., Cond_1(B_S, p) = (S^p, O, ⪯^p), where S^p = S ∩ p and ⪯^p = ⪯ ∩ (S^p × S^p).
Since conditioning throws out all worlds inconsistent with current information, it is easy to see that:
Proposition 2. Cond is strongly data-retentive on sound data streams.
Proof. Let us take any B_S = (S, O, ⪯) and a σ = (σ_0, ..., σ_n) sound with respect to an s ∈ S. By Proposition 1, it suffices to show that Cond(B_S, σ) |= Bσ_i for i ∈ {0, ..., n}. Observe that indeed, for all w in Cond(B_S, σ) such that w ⪯ s, we have w ∈ σ_i. This is the case because σ is sound with respect to s, and since all worlds inconsistent with elements of σ have already been eliminated by Cond.
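The way conditioning drives a learning method can be illustrated on a finite space. The sketch below is only an illustration under our own assumptions: the prior is encoded by integer ranks (lower is more plausible), propositions are sets of worlds, and the names `cond_step`, `revise` and `conjecture` are hypothetical.

```python
from functools import reduce


def cond_step(rank, p):
    """One step of conditioning: keep only the worlds that satisfy p."""
    return {w: r for w, r in rank.items() if w in p}


def revise(step, rank, sigma):
    """Iterate a one-step revision method along the data sequence sigma."""
    return reduce(step, sigma, rank)


def conjecture(rank):
    """The induced learner's output: the most plausible remaining worlds."""
    if not rank:
        return set()
    best = min(rank.values())
    return {w for w, r in rank.items() if r == best}


# Illustrative space with prior s1 < s2 < s3, p = {s1, s3}, q = {s2, s3}.
prior = {"s1": 0, "s2": 1, "s3": 2}
p, q = {"s1", "s3"}, {"s2", "s3"}
print(conjecture(revise(cond_step, prior, [p])))     # after observing p
print(conjecture(revise(cond_step, prior, [p, q])))  # after observing p, then q
```

After p the conjecture is {s1}; after p and q only s3 survives. Every surviving world satisfies all data seen so far, which is the strong data retention of Proposition 2 in miniature.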
While conditioning is obviously conservative, it is not strongly conservative, since new information can rule out some worlds without ruling out all of the most plausible worlds:
Proposition 3. Cond is not strongly conservative.
Lexicographic Revision. Lexicographic revision [38,39], also known as radical upgrade in Dynamic Epistemic Logic [11,13], does not delete any worlds. Instead, it 'promotes' all the worlds satisfying the new piece of data, making them more plausible than all the worlds that do not satisfy it; while within the two zones, the old order is kept the same.
Definition 16. Lexicographic revision, Lex, is a belief revision method based on the one-step belief revision method Lex_1 that takes B_S = (S, O, ⪯) and p ∈ O, and outputs a revised plausibility space Lex_1(B_S, p) = (S, O, ⪯′), where for all t, w ∈ S: t ⪯′ w iff either t, w ∈ p and t ⪯ w, or t, w ∉ p and t ⪯ w, or t ∈ p and w ∉ p.
It is easy to see that Lex is weakly data-retentive and conservative. However, it does not satisfy the strong versions of these properties:
Proposition 4. Lex is not strongly data-retentive on arbitrary streams.
Proof. The proof is analogous to that of Proposition 2.
Proposition 5. Lex is strongly data-retentive on sound streams.
Proof. Let us take B_S = (S, O, ⪯). Let us also fix s ∈ S and assume that σ = (σ_0, ..., σ_n) is sound with respect to s, i.e., B_S, s |= set(σ). It is easy to see that min [...]
Proposition 6. Lexicographic revision is not strongly conservative.
Proof. Let us consider the plausibility space in Figure 2. Assume that the plausibility relation gives the following order: s ≺ t ≺ u, and that σ = (p). Clearly, B_S |= Bp. However, after receiving σ_0 = p, the revision method will make world u more plausible than t, and therefore B_S ≠ Lex(B_S, p).
Proposition 7. A learning method generated from a history-independent belief revision method does not have to be memory-free.
Proof. We prove this proposition by giving an example of a belief revision method R that is history-independent, while the learning method that it induces is not memory-free (see Figure 3). Let R be Lex. Lex is clearly history-independent, because it is an iterated one-step revision method. To see that L_Lex is not memory-free consider the following plausibility space [...]: (1) Lex(B_S, σ) gives the plausibility order: s ≺ u ≺ t; (2) Lex(B_S, σ′) gives the plausibility order: s ≺ t ≺ u.
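Lexicographic revision can be sketched on finite, rank-encoded spaces as follows. The encoding by integer ranks and the name `lex_step` are our own assumptions. The example replays the order s ≺ t ≺ u from the proof of Proposition 6, assuming p = {s, u}; since Figure 2 is not reproduced here, this extension of p is an assumption made for illustration.

```python
def lex_step(rank, p):
    """One step of lexicographic revision on a finite, rank-encoded space.

    All p-worlds become more plausible than all non-p-worlds, and the old
    order is preserved inside each of the two zones.
    """
    inside = sorted({r for w, r in rank.items() if w in p})
    outside = sorted({r for w, r in rank.items() if w not in p})
    revised = {}
    for w, r in rank.items():
        if w in p:
            revised[w] = inside.index(r)
        else:
            revised[w] = len(inside) + outside.index(r)
    return revised


prior = {"s": 0, "t": 1, "u": 2}    # s < t < u
p = {"s", "u"}                      # assumed extension of p
after = lex_step(prior, p)
print(sorted(after, key=after.get))
```

Here p is already believed in the prior, because the most plausible world s satisfies it, yet the revised order is s ≺ u ≺ t. The space changes although the belief does not, which is what the failure of strong conservativity amounts to.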
Minimal Revision. The minimal revision method [18,38], known as conservative upgrade in DEL [11,13], is 'conservative' in the sense that it keeps as much as possible of the old structure. More precisely, the most plausible states satisfying the new piece of data become the most plausible overall; while in the rest of the space, the old order is kept the same.
Definition 17. Minimal revision, Mini, is a belief revision method based on the one-step belief revision method Mini_1 that takes a plausibility space B_S = (S, O, ⪯) and a proposition p ∈ O, and outputs a new plausibility space in the following way: Mini_1(B_S, p) = (S, O, ⪯′), where for all t, w ∈ S: if t ∈ min_⪯ p and w ∉ min_⪯ p, then t ≺′ w; otherwise, t ⪯′ w iff t ⪯ w.
Minimal revision is obviously weakly data-retentive: it leads to a belief that accounts for the last datum. However, it does not retain more than that. Consider the plausibility space in Figure 4, and the sequence σ = (p, q) (which is sound with respect to world u). After receiving p the plausibility order becomes s ≺ t ≺ u. Then q comes in and now our method gives the order t ≺ s ≺ u. So p is no longer believed after q was given. Therefore:
Proposition 8. Minimal revision is not strongly data-retentive, even on sound data streams.
Moreover, minimal revision is conservative: as long as the incoming information is already believed, beliefs do not change, since the minimal worlds do not change. In this case we can say even more: nothing about the plausibility space changes, and so:
Proposition 9. Minimal revision is strongly conservative.
The properties introduced in this section capture some interesting distinctions between belief revision methods. While conditioning and lexicographic revision are quite similar, differing only with respect to their strong retention capacity, minimal revision is different in two respects: it is strongly conservative, and it is not strongly data-retentive, even on sound data streams. In the next section we will see that this combination of properties negatively affects learning [similar observations about the interaction of those properties were given in 30,31].
[Figure 4. Example of a plausibility space]
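The loss of earlier data under minimal revision can be replayed on a small example. The sketch below is only an illustration under our own assumptions: the prior is encoded by integer ranks (lower is more plausible), the name `mini_step` is hypothetical, and we take p = {s, u} and q = {t, u} with initial order s ≺ t ≺ u. Figure 4 is not reproduced here, so these choices are merely one configuration consistent with the orders reported for the sequence (p, q).

```python
def mini_step(rank, p):
    """One step of minimal revision on a finite, rank-encoded space.

    The most plausible p-worlds become the most plausible overall; the
    relative order of all remaining worlds is left untouched.
    """
    inside = [r for w, r in rank.items() if w in p]
    if not inside:
        return dict(rank)
    top = min(inside)
    return {w: 0 if (w in p and r == top) else r + 1 for w, r in rank.items()}


def believes(rank, p):
    best = min(rank.values())
    return all(w in p for w, r in rank.items() if r == best)


prior = {"s": 0, "t": 1, "u": 2}
p, q = {"s", "u"}, {"t", "u"}           # assumed extensions; both true in u
after_p = mini_step(prior, p)
after_pq = mini_step(after_p, q)
print(sorted(after_p, key=after_p.get))    # order after p
print(sorted(after_pq, key=after_pq.get))  # order after p, then q
print(believes(after_pq, p))               # is p still believed?
```

The two orders are s ≺ t ≺ u and then t ≺ s ≺ u, so p is believed after the first step and lost after the second, although both observations are true in u.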

Convergence to Truth
Formal learning theory is concerned with reliable learning methods, i.e., those that can be relied upon (when observing a sound and complete data stream) to find the complete truth in the limit no matter what the actual world is, as long as it is among the possibilities allowed by the initial epistemic space S. 8 Following the learning-theoretic terminology, we say in this case that the real world has been identified in the limit.
Definition 18. Given an epistemic space S = (S, O), a world s ∈ S is identified in the limit by a learning method L if, for every sound and complete data stream for s, there exists a finite stage after which L outputs the singleton {s} from then on.
We say that the epistemic space S is identified in the limit by L iff all its worlds are identified in the limit by L.
An epistemic space S is identifiable in the limit (learnable) if there exists a learning method L that can identify it in the limit.
Learning methods differ in their learning power. We are interested in the most powerful among them, those that are universal -they can learn any epistemic space that is learnable.
Definition 19. A learning method L is universal on a class C of epistemic spaces if it can identify in the limit every epistemic space in C that is identifiable in the limit. A universal learning method is one that is universal on the class of all epistemic spaces.
In the remainder of this paper we focus on learning methods that are generated by iterated belief revision methods. For brevity, we will attribute the ability of identification in the limit also to belief revision policies.
Definition 20. An epistemic space S is identified in the limit by a belief revision method R if there exists a prior plausibility assignment plaus such that the generated learning method L plaus R identifies S in the limit. The epistemic space S is standardly identified in the limit by R if there exists a well-founded prior plausibility assignment plaus such that L plaus R identifies S in the limit.
Definition 21. A revision method R is universal on a class C of epistemic spaces if it can identify in the limit every epistemic space in C that is identifiable in the limit. R is standardly universal on a class C if it can standardly identify in the limit every epistemic space in C that is identifiable in the limit.
Our main result is the existence of AGM-like universal belief revision methods. The main technical difficulty of this part is the construction of an appropriate prior plausibility order. To define it, we use the concept of locking sequences introduced in [15] and that of finite tell-tale sets proposed in [2]. We adjust the classical notion of finite tell-tales, and use it in the construction of a suitable prior plausibility assignment that, together with conditioning and lexicographic revision, generates universal learning methods.
The first observation is that if convergence occurs, then there is a finite sequence of data that 'locks' the corresponding sequence of conjectures on a correct answer. This finite sequence is called a locking sequence [15].
Proof. Assume L identifies s without there being a locking sequence for L and s. We construct in stages a data stream O for s on which L does not converge. Let x_1, x_2, x_3, ... enumerate the elements of (the countable) O that are true in s. Stage 1. The string (x_1) is not a locking sequence, so for some τ, a sound data sequence for s, L((x_1) * τ) ≠ L((x_1)). Take (x_1) * τ as the initial segment σ_1 of O.
Because each x_i occurs in O, O is a sound and complete data stream for s. But since the learner L keeps changing its value on O, L does not converge.
The characterization of identifiability in the limit requires the existence of finite sets that allow drawing a conclusion without the risk of overgeneralization. The characterization theorem is adapted from [2] and [29].

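Lemma 2. Let S = (S, O) be an epistemic space. S is identifiable in the limit iff there exists a total map D : S → P(O), given by s → D_s, such that D_s is a finite tell-tale for s, i.e.,
(1) D_s is finite,
(2) s ∈ ∩D_s,
(3) for any t ∈ S, if t ∈ ∩D_s and O_t ⊆ O_s, then t = s (where O_t denotes the set of observables true in t).
Proof.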
[⇒] Let S = (S, O) be an epistemic space. Recall that S and O are at most countable. Assume that S is identifiable in the limit by learning method L, i.e., for every world s ∈ S and every sound and complete positive data stream for s, there exists a finite stage after which L outputs the singleton {s} from then on. By Lemma 1, for every s ∈ S we can take a locking sequence σ s for L and s. For any s ∈ S we define D s := set(σ s ).
(1) D s is finite because locking sequences are finite.
(3) Assume that there are s, t ∈ S, such that s ≠ t and for all p ∈ O such that t ∈ p it is the case that s ∈ p. Take a sound and complete data stream O for t, such that for some n ∈ N [...]
This concludes the proof.
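In a finite epistemic space the tell-tale conditions can be checked mechanically. The sketch below assumes the usual reading of a finite tell-tale: every member of D_s is true in s, and no other world t that satisfies all of D_s has its true observables included in those of s. The function names and the three-world space are illustrative assumptions.

```python
def true_observables(world, observables):
    """Names of the observables that hold in `world`."""
    return {name for name, ext in observables.items() if world in ext}


def is_tell_tale(d_s, s, worlds, observables):
    """Check the tell-tale conditions for the finite set of names d_s."""
    # Every observable in d_s must be true in s.
    if not all(s in observables[name] for name in d_s):
        return False
    o_s = true_observables(s, observables)
    for t in worlds:
        satisfies_d_s = all(t in observables[name] for name in d_s)
        # A distinct world compatible with d_s whose data is included in
        # the data of s could never be told apart from s: not a tell-tale.
        if t != s and satisfies_d_s and true_observables(t, observables) <= o_s:
            return False
    return True


worlds = ["s1", "s2", "s3"]
observables = {"p": {"s1", "s3"}, "q": {"s2", "s3"}}
D = {"s1": {"p"}, "s2": {"q"}, "s3": {"p", "q"}}
print(all(is_tell_tale(D[s], s, worlds, observables) for s in worlds))
print(is_tell_tale({"p"}, "s3", worlds, observables))
```

The map D passes the check for every world, whereas {p} alone is not a tell-tale for s3: the world s1 also satisfies p and everything true in s1 is true in s3, so conjecturing s3 after seeing only p would overgeneralize.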
Our aim is to show that belief revision can learn every learnable epistemic space. The next step is to construct a suitable plausibility order from tell-tales, but we introduce one additional condition (see (2) in Definition 23, below).
Definition 23. Let S = (S, O) be an epistemic space. A total map D : S → P(O) is an ordering tell-tale map if there is an injective map i : S → N such that, for all s, t ∈ S:
(1) D_s is a finite tell-tale for s;
(2) if t ∈ ∩D_s and O_s ⊈ O_t, then i(s) < i(t).
We call D an ordering tell-tale map, and D_s an ordering tell-tale set of s.
Proof. First assume that S = (S, O) is an epistemic space that is identifiable in the limit. Let i : S → N and j : O → N be injective maps. By Lemma 2, there exists a map D that assigns a tell-tale to each s ∈ S. On the basis of D, we construct a new map D′ : S → P(O). We proceed step by step according to the enumeration of S given by i and the enumeration of O given by j (when i(s) = n we will simply write s_n, and similarly for j and p ∈ O). The general idea is to add data to D_s until the at most finitely many counterexamples to condition (2) in Definition 23 are eliminated.
(2) For s_n proceed in the following way. First, for every k < n define P^n_k [...] Finally, set D′_{s_n} := D_{s_n} ∪ (P^n_1 ∪ ... ∪ P^n_{n−1}). We now verify that D′ satisfies Definition 23.
(1) D′_s is finite, because D_s is finite, i(s) = n for some n ∈ N, and there are only finitely many P^n_k such that k < n, each of them being either a singleton or the empty set.
(2) s_n ∈ ∩D′_{s_n}, because s_n ∈ ∩D_{s_n} and s_n ∈ ∩(P^n_1 ∪ ... ∪ P^n_{n−1}).
(3) [...] by the definition of the finite tell-tale set, t = s.

It remains to check condition 2: if t ∈ ∩D′_s and O_s ⊈ O_t, then i(s) < i(t). Towards contradiction, assume that t ∈ ∩D′_s, O_s ⊈ O_t, and i(s) ≥ i(t). If i(s) = i(t), then O_s = O_t, contradiction. If i(s) > i(t), then by the construction of D′_s, there is a p ∈ D′_s such that s ∈ p and t ∉ p (if D_s ⊆ O_t we added such p in the process of obtaining D′_s; otherwise it had been already there to start with). But then t ∉ ∩D′_s. Contradiction.
The next step is to use the ordering tell-tales to define a preorder on an epistemic space.
Definition 24. For s, t ∈ S, define: s ⪯¹_D t iff t ∈ ∩D_s. We take ⪯_D to be the transitive closure of the relation ⪯¹_D.
We want to show that, indeed, the above construction generates an order, i.e., that ⪯_D is reflexive, transitive, and antisymmetric. The latter involves proving that ⪯_D includes no proper cycles (see Figure 5).
Definition 25. A proper cycle in ⪯_D is a sequence of distinct worlds s_1, ..., s_n, with n ≥ 2, and such that s_{k+1} ∈ ∩D_{s_k} for all k < n, and s_1 ∈ ∩D_{s_n}.
Lemma 4. For any ordering tell-tale map D, the relation ⪯_D is a (partial) order on S.
Proof. The fact that ⪯_D is a preorder is trivial: reflexivity follows from the fact that s is always in ∩D_s, and transitivity is imposed by construction (by taking the transitive closure).
We need to prove that ⪯_D is antisymmetric. In order to do that we will show (by induction on n) that ⪯_D does not contain proper cycles of any length n ≥ 2.
(1) For the initial step (n = 2): Suppose we have a proper cycle of length 2. As we saw, this means that there exist two states s_1, s_2 such that s_1 ≠ s_2, s_2 ∈ ∩D_{s_1}, and s_1 ∈ ∩D_{s_2}.
From the assumption that s_2 ∈ ∩D_{s_1}, and that O_{s_1} ⊈ O_{s_2}, we can infer (by Condition 2 of Definition 23) that i(s_1) < i(s_2). But, in the same way (from s_1 ∈ ∩D_{s_2}, and O_{s_2} ⊈ O_{s_1}), we can also infer that i(s_2) < i(s_1). Putting these together, we get i(s_1) < i(s_2) < i(s_1). Contradiction.
(2) For the inductive step (n + 1): Suppose that there is no proper cycle of length n, and, towards contradiction, that s_1, s_2, ..., s_{n+1} is a proper cycle of length n + 1. We consider two cases.
Case 1: O_{s_k} ⊆ O_{s_{k+1}} for some k. Since s_k ∈ ∩D_{s_{k−1}}, it must be that s_{k+1} ∈ ∩D_{s_{k−1}}. Therefore, the sequence s_1, ..., s_{k−1}, s_{k+1}, ..., s_{n+1} (obtained by deleting s_k from the above proper cycle of length n + 1) is also a (shorter) proper cycle (of length n). Contradiction.
Case 2: O_{s_k} ⊈ O_{s_{k+1}} for all k = 1, ..., n. By Condition 2 of Definition 23, it follows that i(s_k) < i(s_{k+1}), for all k = 1, ..., n, and hence i(s_1) < i(s_{n+1}). But s_1 ∈ ∩D_{s_{n+1}} and O_{s_{n+1}} ⊈ O_{s_1} (since otherwise s_1 ∈ ∩D_{s_n} and s_1, ..., s_n would give a proper cycle of length n), hence i(s_1) > i(s_{n+1}). Contradiction.
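For a finite space, the passage from tell-tales to a prior can be sketched with a topological sort. The sketch assumes the reading of the one-step relation on which s is placed before t whenever t satisfies every observable in D_s; this reading, the helper names, and the three-world example are assumptions made for illustration. `graphlib.TopologicalSorter` (Python 3.9 or later) raises `CycleError` exactly when the relation contains a proper cycle, and otherwise returns one linear extension, playing the role of a total order that respects the tell-tale order.

```python
from graphlib import TopologicalSorter


def one_step_pairs(D, worlds, observables):
    """Pairs (s, t), s != t, such that t satisfies every observable in D[s]."""
    return {
        (s, t)
        for s in worlds
        for t in worlds
        if s != t and all(t in observables[name] for name in D[s])
    }


def prior_from_tell_tales(D, worlds, observables):
    """Rank the worlds by a linear extension of the tell-tale order."""
    predecessors = {w: set() for w in worlds}
    for s, t in one_step_pairs(D, worlds, observables):
        predecessors[t].add(s)          # s must come before t
    order = list(TopologicalSorter(predecessors).static_order())
    return {w: position for position, w in enumerate(order)}


worlds = ["s1", "s2", "s3"]
observables = {"p": {"s1", "s3"}, "q": {"s2", "s3"}}
D = {"s1": {"p"}, "s2": {"q"}, "s3": {"p", "q"}}
rank = prior_from_tell_tales(D, worlds, observables)
print(rank)
```

In this example both s1 and s2 are ranked before s3, while their mutual order is left to the sorter. This agrees with the intuition that the world making more observables true should be conjectured only after the data has ruled out its more specific competitors.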
Let us now show that D , when used by the conditioning revision method, guarantees convergence to the right belief whenever the underlying epistemic space is identifiable in the limit.

Theorem 1. The conditioning belief revision method (Cond) is universal.
Proof. Obviously, if S is identifiable in the limit by conditioning, then S is identifiable in the limit. We therefore focus on the other direction, i.e., we show that if S is identifiable in the limit by any learning method, then it is identifiable in the limit by conditioning.
By Lemma 3 we know there exists an ordering tell-tale map for S and by Lemma 4, the corresponding ⪯_D is a (partial) order on S. By the Order-Extension Principle, there exists a total order ⪯ on S such that, for all s, t ∈ S, it is the case that s ⪯_D t implies s ⪯ t. It remains to show that S is identifiable in the limit by the learning method generated from the conditioning belief revision method and the prior plausibility assignment ⪯. Let B_S = (S, O, ⪯) and let us take any s ∈ S and the corresponding D_s. Since s ∈ ∩D_s and D_s is finite, it follows that for every sound and complete positive data stream O for s, there exists n ∈ N such that D_s ⊆ set(O[n]). Let Cond(B_S, O[n]) = (S′, O, ⪯′). Our aim is now to demonstrate that min_⪯′ S′ = {s}. By the antisymmetry of the order relation ⪯, and hence also of ⪯′, the minimal element of S′ is unique, so it is sufficient to show that s ∈ min_⪯′ S′. For this, let t ∈ S′ be arbitrary. We need to show that s ⪯ t. Since t ∈ S′, we get that D_s ⊆ set(O[n]) ⊆ O_t, so t ∈ ∩D_s; hence s ⪯_D t, and therefore s ⪯ t.
Theorem 2. The lexicographic belief revision method (Lex) is universal.
Proof. The proof is analogous to the proof of Theorem 1. Within our learning setting lexicographic revision with true information does exactly what conditioning does. The only difference is that the rest of the doxastic structure might not stabilize; only the minimal elements stabilize.
Theorem 3. The minimal revision method (Mini) is not universal.
Proof. Let us give a counter-example, an epistemic space that is identifiable in the limit, but is not identifiable by the minimal revision method (see Figure 6). Let S = (S, O), where S = {s_1, s_2, s_3}, O = {p, q}, and p = {s_1, s_3}, q = {s_2, s_3}. The epistemic space S is identifiable in the limit by the conditioning revision method: just assume the ordering s_1 ≺ s_2 ≺ s_3. However, there is no ordering that allows identification in the limit of S by the minimal revision method. If s_3 occurs in the ordering before s_1 (or before s_2), then the minimal revision method fails to identify s_1 (s_2, respectively).
If both s_1 and s_2 precede s_3 in the ordering then the minimal revision method fails to identify s_3 on any data stream consisting of singletons of propositions from s_3. On all such data streams for s_3 the minimal state will alternate between s_1 and s_2, or stabilize on one of them. The last case is that at least one of s_1 and s_2 is equi-plausible to s_3. In such case s_3 is not identifiable because for any single proposition from s_3 there is more than one possible world consistent with it.
Proof. There is an epistemic space S that is identifiable in the limit by a learning method, but is not standardly identified in the limit by any conservative belief revision method. The following epistemic space constitutes such a counter-example (see Figure 7). S is identifiable in the limit by the following learning method L: L(S, σ) = {s_n} iff n is the largest such that s_n ∈ set(σ).
Let us now assume (towards contradiction) that S is standardly identifiable in the limit by a conservative belief revision method R, i.e., there exists a well-founded total preorder ⪯ on S such that the learning method L_R generated from R and ⪯ identifies S in the limit and is conservative.
If ⪯ is well-founded, we can choose some minimal s_k ∈ min S and set L_R(S, λ) = {s_k}, where λ is the empty data sequence. Take now some m > k, and notice that O_{s_m} ⊂ O_{s_k} (by our construction of S). Let O be a sound and complete data stream for s_m. By assumption, L_R identifies s_m in the limit, hence there must exist some k such that
The above result concerns the type of preorders that facilitate identifiability in the limit. We show that for universality results our non-standard setting (involving non-well-founded plausibility orders) is essential: assuming that AGM-like belief revision must be conservative, no such method is universal with respect to well-founded plausibility spaces.
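The three-world counter-example for minimal revision given earlier (p = {s_1, s_3}, q = {s_2, s_3}) can be explored with a small simulation. The sketch below is illustrative only and rests on several assumptions that are not part of the paper: plausibility orders are represented as Python lists with the most plausible world first, only strict orders (no ties) are tried, and convergence is approximated by checking the conjectures over the last cycle of a finite periodic stream. All function names are hypothetical.

```python
from itertools import permutations

WORLDS = ("s1", "s2", "s3")
P = frozenset({"s1", "s3"})
Q = frozenset({"s2", "s3"})
# Observables true at each world; cycling through them gives a finite
# prefix of a sound and complete positive data stream for that world.
TRUE_AT = {"s1": [P], "s2": [Q], "s3": [P, Q]}


def conditioning(order, prop):
    """Delete every world that falsifies the observed proposition."""
    return [w for w in order if w in prop]


def minimal_revision(order, prop):
    """Promote only the most plausible prop-world; keep the rest as is."""
    best = next((w for w in order if w in prop), None)
    if best is None:
        return list(order)
    return [best] + [w for w in order if w != best]


def identifies(revise, prior, world, rounds=20):
    """Finite stand-in for convergence: the conjecture equals `world`
    throughout the last full cycle of the periodic stream."""
    cycle = TRUE_AT[world]
    order, conjectures = list(prior), []
    for prop in cycle * rounds:
        order = revise(order, prop)
        conjectures.append(order[0])
    return all(c == world for c in conjectures[-len(cycle):])


# Conditioning with the prior s1 < s2 < s3 settles on every world.
cond_ok = all(identifies(conditioning, WORLDS, w) for w in WORLDS)

# Strict priors under which minimal revision settles on every world.
minimal_ok = [
    prior for prior in permutations(WORLDS)
    if all(identifies(minimal_revision, prior, w) for w in WORLDS)
]
print(cond_ok, minimal_ok)
```

Under these assumptions no strict prior survives the check for minimal revision, which agrees with the case analysis in the proof; priors with equiplausible worlds are covered by the proof's last case and are not modeled here.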

Learning from Positive and Negative Data
One may wonder what would happen if the revision process were governed not by arbitrary sets of observables, but by sets of observables that are closed under certain logical operations. One simple adjustment of the set O is to assume its closure under negation. First, let us extend our framework to account for situations in which both positive and negative data can be observed.
Proof. We prove that every negation-closed epistemic space S (with O and S countable) is identifiable in the limit by conditioning and by lexicographic revision.
Let us assume that S is countable and negation-closed. In fact, any ω-type order ⪯ on S gives a suitable (well-founded) prior plausibility assignment. Let us take an s ∈ S. Since ⪯ is of order type ω, it is well-founded and there are only finitely many worlds that are more plausible than s. For each such world t ≺ s, let O_t be an observable that is true at s and false at t. For every data stream O that is sound and complete with respect to s, there must exist a stage n ∈ N by which all data in {O_t | t ≺ s} have been observed. After this stage, all worlds that are more plausible than s will have been deleted (in the case of conditioning) or will have become less plausible than s (in the case of lexicographic revision), so from then on the (only) most plausible state is s. Hence conditioning and lexicographic revision identify any world s ∈ S in the limit.
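The argument can be traced on a toy instance. The sketch below is a hedged illustration, not the paper's construction: it assumes a finite fragment {0, ..., 4} of an ω-type prior, a hypothetical negation-closed family of observables up(n) = {m | m ≥ n} together with their complements down(n), and a list representation of plausibility orders (most plausible world first).

```python
N = 5  # finite fragment of an omega-type prior 0 < 1 < 2 < ...
WORLDS = list(range(N))


def up(n):
    """Hypothetical observable: 'the actual world is at least n'."""
    return frozenset(m for m in WORLDS if m >= n)


def down(n):
    """Negation of up(n): 'the actual world is below n'."""
    return frozenset(m for m in WORLDS if m < n)


def true_data(s):
    """Every observable of the family that holds at world s, listed once."""
    return [up(n) for n in range(1, s + 1)] + [down(n) for n in range(s + 1, N)]


def conditioning(order, prop):
    """Delete the worlds that falsify the observation."""
    return [w for w in order if w in prop]


def lexicographic(order, prop):
    """Move all prop-worlds above all others, keeping the old order inside
    each of the two groups."""
    return [w for w in order if w in prop] + [w for w in order if w not in prop]


def learn(revise, s):
    """Revise the prior 0 < 1 < ... with all true data for s and return
    the most plausible world."""
    order = list(WORLDS)
    for prop in true_data(s):
        order = revise(order, prop)
    return order[0]


print([learn(conditioning, s) for s in WORLDS])
print([learn(lexicographic, s) for s in WORLDS])
```

In this instance both methods end with the actual world on top once the finitely many more plausible worlds have been ruled out (conditioning) or demoted (lexicographic revision), as the proof describes.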
Proposition 11. Minimal revision is not universal on negation-closed spaces.
Proof. We show a negation-closed epistemic space that is identifiable in the limit but is not identifiable in the limit by the minimal revision method. The epistemic space S is identifiable in the limit by Cond: just assume the plausibility order s_1 ≺ s_2 ≺ s_3 ≺ s_4. However, no plausibility order allows identification in the limit of S by the minimal revision method. Whichever ordering is assumed, the least plausible element will not be identifiable. This is because each piece of data consistent with that element is also consistent with one of the ≺-smaller elements.

Erroneous Information
With the introduction of negative information, we can now allow for occasional observational errors and for their corrections. To consider erroneous data we give up the soundness of data streams, i.e., we allow the learner to observe data that are false in the real world. To still give the agent a chance to learn the real world, we need to impose some limitation on errors. We do this by requiring the data streams to be 'fair'. A data stream O is fair with respect to a world s if (1) O is complete with respect to s, (2) there is n ∈ N such that for all k ≥ n, s ∈ O_k, and (3) every error is eventually corrected.
Unsurprisingly, conditioning (which assumes absolute veracity of the new observations) is no longer a good strategy. If erroneous observations are possible, then eliminating worlds that do not satisfy these observations is risky and irrational.
Proof. Conditioning does not tolerate errors at all. On any O_i such that s ∉ O_i, conditioning will remove s, and there is no way to revive it. Minimal revision, as has been shown, is not universal on negation-closed epistemic spaces even with respect to sound and complete data streams, which are a special case of fair streams.
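The failure of conditioning under a single error can be seen on a two-world example. This is a minimal sketch under assumptions of our own: the worlds "s" and "t", the observable P and its negation are hypothetical, orders are lists with the most plausible world first, and the stream is a short finite prefix containing one error followed by its correction.

```python
def conditioning(order, prop):
    """Delete the worlds that falsify the observation."""
    return [w for w in order if w in prop]


def lexicographic(order, prop):
    """Move all prop-worlds above all others, keeping the old order inside
    each of the two groups."""
    return [w for w in order if w in prop] + [w for w in order if w not in prop]


def run(revise, order, stream):
    """Apply the revision method to each observation in turn."""
    for prop in stream:
        order = revise(order, prop)
    return order


WORLDS = ["s", "t"]          # "s" is the actual world
P = frozenset({"s"})         # true at the actual world
NOT_P = frozenset({"t"})     # false at the actual world
stream = [NOT_P, P, P]       # one error, later corrected

print(run(conditioning, WORLDS, stream))    # actual world is gone for good
print(run(lexicographic, WORLDS, stream))   # actual world is back on top
```

After the erroneous observation, conditioning has already discarded "s", and the later correction only empties the space further; lexicographic revision merely demotes "s" and restores it when the correction arrives.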
We will demonstrate that lexicographic revision deals with errors in a skillful manner. Before we get to that, we introduce and discuss the notion of propositional upgrade (a special case of generalized upgrade; see [12]). Such an upgrade is a transformation of a plausibility space that can be given by any finite sequence of mutually disjoint propositional sentences x_1, ..., x_n. The corresponding propositional upgrade (x_1, ..., x_n) acts on a plausibility space B_S = (S, O, ⪯) by changing the preorder ⪯ as follows: all worlds that satisfy x_1 become less plausible than all worlds satisfying x_2, all worlds satisfying x_2 become less plausible than all x_3-worlds, etc., up to the worlds that satisfy x_n. Moreover, for any k such that 1 ≤ k ≤ n, among the worlds satisfying x_k the old order is kept the same. In particular, lexicographic revision with p is a special case of such a propositional upgrade, namely (¬p, p).
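A propositional upgrade is easy to state operationally. The following sketch assumes, for illustration only, that a plausibility order is a strict order stored as a list (most plausible world first), that each x_k is given as the set of worlds satisfying it, and that the cells together cover all worlds; the function names are hypothetical.

```python
def upgrade(order, cells):
    """Apply the propositional upgrade (x_1, ..., x_n).

    `cells` lists x_1, ..., x_n as disjoint sets of worlds. Worlds in x_n
    end up most plausible and worlds in x_1 least plausible; inside each
    cell the old order is kept. Assumes the cells cover all worlds.
    """
    new_order = []
    for cell in reversed(cells):
        new_order.extend(w for w in order if w in cell)
    return new_order


def lexicographic(order, prop):
    """Lexicographic revision with p, written as the upgrade (not-p, p)."""
    not_prop = frozenset(order) - prop
    return upgrade(order, [not_prop, prop])


order = ["a", "b", "c", "d"]
p = frozenset({"b", "d"})
print(lexicographic(order, p))  # ['b', 'd', 'a', 'c']
```

The p-worlds "b" and "d" move above "a" and "c", and the relative order inside each group is unchanged.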
Lemma 5. The class of propositional upgrades is closed under sequential composition.
Proof. We need to show that the sequential composition of any two propositional upgrades gives a propositional upgrade. Let us take X := (x_1, ..., x_n) and Y := (y_1, ..., y_m). The sequential composition X * Y is equivalent to the following propositional upgrade: (x_1 ∧ y_1, ..., x_n ∧ y_1, x_1 ∧ y_2, ..., x_n ∧ y_2, ..., x_1 ∧ y_m, ..., x_n ∧ y_m). To show this, take an arbitrary plausibility space B_S = (S, O, ⪯) and upgrade on X and Y successively. First, we upgrade on X to obtain the new preorder ⪯_X, in which all worlds satisfying x_1 are less plausible than all x_2-worlds, etc., and within each such cell the old order is kept the same. Now, to this new plausibility space we apply the second upgrade, Y, obtaining the new preorder ⪯_XY, in which all y_1-worlds are less plausible than all y_2-worlds, etc. However, since the upgrade Y has been applied to the preorder ⪯_X, we also know that the new preorder ⪯_XY has the following property: for each j such that 1 ≤ j ≤ m, within the cell given by y_j, all x_1-worlds are less plausible than all x_2-worlds, etc. At the same time, for each j and k such that 1 ≤ j ≤ m and 1 ≤ k ≤ n, the preorder is maintained in the cell (y_j ∧ x_k). Thus, ⪯_XY has the following structure, listed from the least to the most plausible cell: x_1 ∧ y_1, ..., x_n ∧ y_1, x_1 ∧ y_2, ..., x_n ∧ y_2, ..., x_1 ∧ y_m, ..., x_n ∧ y_m. Moreover, within each such cell, the old preorder is kept the same.
The final observation is that the above setting can be obtained directly by the propositional upgrade of the following form: (x_1 ∧ y_1, ..., x_n ∧ y_1, x_1 ∧ y_2, ..., x_n ∧ y_2, ..., x_1 ∧ y_m, ..., x_n ∧ y_m).
Now we are ready to show that lexicographic revision is well-behaved on fair streams.
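Lemma 5 can be sanity-checked by brute force on a small space. The sketch below is an illustration under our own assumptions: six hypothetical worlds 0, ..., 5, strict orders stored as lists (most plausible first), cells given as sets of worlds that cover the space, and two arbitrarily chosen upgrades X and Y.

```python
from itertools import permutations


def upgrade(order, cells):
    """Propositional upgrade (x_1, ..., x_n): the last cell becomes most
    plausible, the first least plausible; old order kept inside cells."""
    new_order = []
    for cell in reversed(cells):
        new_order.extend(w for w in order if w in cell)
    return new_order


def compose(xs, ys):
    """Single upgrade claimed equivalent to applying xs and then ys:
    (x_1 & y_1, ..., x_n & y_1, x_1 & y_2, ..., x_n & y_m)."""
    return [x & y for y in ys for x in xs]


X = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
Y = [frozenset({0, 2, 4}), frozenset({1, 3, 5})]

# Compare "X then Y" with the composed upgrade on every strict prior.
lemma_holds = all(
    upgrade(upgrade(list(prior), X), Y) == upgrade(list(prior), compose(X, Y))
    for prior in permutations(range(6))
)
print(lemma_holds)
```

The check covers only this one pair of upgrades on strict priors, so it supports the lemma without replacing its proof.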
Proposition 13. Lexicographic revision generates a standardly universal belief revision-based learning method for fair streams on the class of negation-closed epistemic spaces.
Proof. First, recall that lexicographic revision, Lex, is standardly universal for sound and complete streams on negation-closed spaces. It remains to show that Lex retains its power on fair streams. It is sufficient to show that lexicographic revision is 'error-correcting', i.e., that the effect of revising with the stream (p * σ * ¬p), in which the observation p is later corrected by ¬p, is exactly the same as with the stream (σ * ¬p), where σ is any sequence of observables. The proof uses the properties of sequential composition for propositional upgrades.
Let us observe that some of the terms in the above upgrade are inconsistent. We can eliminate them since they correspond to empty subsets of the plausibility space. We obtain: (x_1 ∧ p, ..., x_n ∧ p, x_1 ∧ ¬p, ..., x_n ∧ ¬p).
The observation that the two propositional upgrades turn out to be the same concludes the proof.
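The error-correcting behavior can also be checked exhaustively on a small space. The sketch below makes assumptions that are ours rather than the paper's: four hypothetical worlds, strict priors stored as lists (most plausible first), an observation P that is later corrected by its negation NOT_P, and intermediate sequences σ of length at most two drawn from three arbitrary observables.

```python
from itertools import permutations, product


def lexicographic(order, prop):
    """Move all prop-worlds above all others, keeping the old order inside
    each of the two groups."""
    return [w for w in order if w in prop] + [w for w in order if w not in prop]


def run(prior, stream):
    """Lexicographically revise the prior with each observation in turn."""
    order = list(prior)
    for prop in stream:
        order = lexicographic(order, prop)
    return order


WORLDS = range(4)
P = frozenset({0, 1})
NOT_P = frozenset({2, 3})
SIGMA_PROPS = [frozenset({0, 2}), frozenset({1, 2, 3}), frozenset({3})]

# Revising with (P, sigma, NOT_P) should give the same order as (sigma, NOT_P).
error_corrected = all(
    run(prior, (P,) + sigma + (NOT_P,)) == run(prior, sigma + (NOT_P,))
    for prior in permutations(WORLDS)
    for length in range(3)
    for sigma in product(SIGMA_PROPS, repeat=length)
)
print(error_corrected)
```

Intuitively, the final revision with NOT_P already separates the P-worlds from the rest, so the earlier revision with P leaves no trace in the resulting order.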

Conclusions and Perspectives
We have considered the iterated belief revision policies of conditioning, lexicographic revision, and minimal revision. We have identified certain features of those methods relevant in the context of iterated revision: data-retention and conservativity turn out to be especially important. We defined learning methods based on those revision policies and have shown how the aforementioned properties influence the learning process. Throughout the paper we have been mainly interested in convergence to the actual world on the basis of infinite data streams. In the setting of positive, sound and complete data streams we have shown that conditioning and lexicographic revision generate universal learning methods. Minimal revision fails to be universal, and the crucial property that makes it weaker is its strong conservativity. Moreover, we have shown that the full power of learning cannot be achieved when the underlying prior plausibility assignment is assumed to be well-founded.
In the case of positive and negative information, both conditioning and lexicographic revision are universal. Minimal revision again is not. Finally, in the setting of fair streams (containing a finite number of errors that all get corrected later in the stream) lexicographic revision again turns out to be universal. Both conditioning and minimal revision lack the 'error-correcting' property.
Future and ongoing work consists of a multi-level investigation of the relationship between formal learning theory, belief revision theory, and DEL. There surely are many links still to be found. What seems to be especially interesting is the multi-agent extension of our results. The interactive aspect would probably be appreciated in formal learning theory, where the single-agent perspective dominates. Another way to extend the framework is to allow revision with more complex formulae. This would link to the AGM approach, and to the philosophical investigation into the process of scientific inquiry, where possible realities have a more 'theoretical' character.