Probabilities with Gaps and Gluts

Belnap-Dunn logic (BD), sometimes also known as First Degree Entailment, is a four-valued propositional logic that complements the classical truth values of True and False with two non-classical truth values Neither and Both. The latter two are to account for the possibility of the available information being incomplete or providing contradictory evidence. In this paper, we present a probabilistic extension of BD that permits agents to have probabilistic beliefs about the truth and falsity of a proposition. We provide a sound and complete axiomatization for the framework defined and also identify policies for conditionalization and aggregation. Concretely, we introduce four-valued equivalents of Bayes' and Jeffrey updating and also suggest mechanisms for aggregating information from different sources.


Introduction
In learning about a classical system that adheres to the laws of propositional logic, we may be faced with information that does not. Naturally, if information is scarce, our evidence may contain truth value gaps, neither indicating certain propositions to be true nor false. But we may also be faced with contradictory information, especially when our insights are gained by combining various bodies of evidence. This may lead to truth value gluts, i.e. propositions that are labelled as both true and false.
There have been many attempts in the literature to develop formal systems for capturing and analyzing such non-classical situations. These are generally divided into two camps. The first is motivated by adopting the philosophical position of dialetheism as defended by Priest (2006, 2007), advocating the thesis that there are true contradictions, i.e. sentences which are both true and false. Corresponding formal systems should thus allow for assigning both truth values to a sentence simultaneously. Probably the most well-known example of such logical systems is the logic LP (Priest, 1979, 2002). The second camp takes the existence of gaps and gluts as a pathological consequence of imperfect information. Crucially, one may hope that even imperfect information would allow for at least some reliable inferences. In general, there are two ways to go here. One could either make the set of premises consistent or develop non-trivial inference rules that work on inconsistent sets of premises. Consistency of premises can be obtained by focusing on maximal consistent subsets, cf. Rescher and Manor (1970); Klein and Marra (2020), or by employing belief revision, as in AGM systems (Alchourrón et al., 1985). Mechanisms for dealing with inconsistent information, on the other hand, are developed in a variety of frameworks such as discussive logic (Jaskowski, 1948), adaptive logic (Batens, 2001), Da Costa's logics of formal inconsistency (1974; 1989), the relevant logic of Anderson and Belnap (1975) and their variants.
Another well-known logical framework that falls in this last category is Belnap-Dunn logic (BD, cf. Belnap, 1977, 2019; Dunn, 1976), sometimes also going by the name of First Degree Entailment. Briefly, this system rests on two assumptions. The first is that gaps and gluts may occur even for boundedly rational agents, as information may be limited (gaps) and the question of whether a given belief set is consistent (i.e. checking for the absence of gluts) is known to be NP-hard. Building on the latter claim, BD's second assumption is that the logic of information should not validate the principle of explosion. Quite to the contrary, BD stipulates that a body of information may afford us substantial insights about some matter q, even if it contains contradictory information about some other p that is completely unrelated to q. Belnap-Dunn logic, in short, is a substructural logic that invalidates explosion and tracks which insights can be inferred from an information base that may contain gaps and gluts.
But of course, the problem of insufficient or contradictory information does not apply to categorical true-false information only. Rather, probabilistic information is affected by similar arguments about gaps and gluts as those outlined above. In his 1997 paper, Jøsang puts forward a framework for three-valued probabilities, incorporating uncertainty as a third value that may occur naturally when evidence is ambiguous or insufficient. Notably, this framework circumvents the debated principle of insufficient reason by distinguishing situations of insufficient information from those where equally strong evidence is available for and against some proposition. Later approaches extend this to four-valued probabilities, where the fourth value represents conflicting information, or gluts. The necessity of gluts is often argued for by considering a Bayesian agent who receives two pieces of mutually contradictory information from sources she judges highly reliable, cf. the firefighter example in Dunn and Kiefer (2019).
In short, these arguments call for a four-valued probabilistic generalization of Belnap-Dunn logic in a similar way as classical probability theory generalizes propositional logic. In a first approach to this project Michael Dunn (2010) has defined a four-valued probabilistic framework and has studied logical properties of the resulting probabilistic entailment. In a similar vein, Childers, Majer and Milne (2019) have put forward a single-valued approach to non-standard probabilities motivated by a frequentist interpretation where probability gaps and gluts may occur naturally if probabilities are derived from sampling two independent sources. They further substantiate the approach by providing a subjectivist interpretation of non-standard probabilities and the corresponding Dutch Book Argument.
In the present paper, we offer a novel framework for non-standard probabilities that reconciles Dunn's and Childers et al.'s lines of work (Section 3). In doing so, we pursue four major goals. The first is to provide a translation mechanism between four-valued and single-valued non-standard probabilities, showing that these are different but equivalent perspectives on the same phenomenon (Section 5). The second aim relates to an axiomatization of the system defined. While Dunn (2010) analyzes logical properties of the probabilistic inference relation ensuing from his approach, no axiomatization of the probabilistic system itself has been put forward so far. To fill this gap, we provide an axiomatization of the non-standard probabilities defined here (Section 4) and show this axiomatization to be sound and complete with respect to a certain class of probabilistic models (Section 6). While building on Dunn's approach, our framework slightly deviates from his in order to avoid certain conceptual problems. As we will show, both Jøsang's three-valued probabilities and Dunn's four-valued probabilities implicitly assume all events to be mutually probabilistically independent. Under this assumption, the question of conditionalization trivializes, as no proposition bears any information about any other proposition. In the present framework, we abandon this independence assumption. Consequently, the question of conditional probabilities becomes meaningful. Defining and studying an adequate notion of conditionalization is our third goal, pursued in Section 7. The fourth goal, finally, is related to aggregation, i.e. the question of how to combine probabilistic information from various sources. Here, we will introduce various policies and study their respective properties (Section 8).

Logical Preliminaries
We start by giving a brief recollection of Belnap-Dunn four-valued logic before proceeding to introduce its probabilistic extensions. Belnap-Dunn four-valued logic is defined over a propositional language that is built over a set $Prop$ of propositional variables. Formally, the logical language $\mathcal{L}_{Prop}$ is given by the Backus-Naur form:
$$\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi \qquad (p \in Prop)$$
Disjunction ($\vee$) is defined in the standard way. The main difference to classical propositional logic consists in the way that formulas are evaluated. In classical propositional logic, evaluations are defined as functions $v : \mathcal{L}_{Prop} \to \{0, 1\}$ that are derived from a valuation on the set of atoms $Prop$. For Belnap-Dunn logic there are two ways to define evaluations. One approach is to define evaluations as functions $v : \mathcal{L}_{Prop} \to \mathcal{P}(\{0, 1\})$. In other words, instead of evaluating formulas on the two-element lattice with bottom $\{0\}$ and top $\{1\}$, they are interpreted on the four-element lattice $BD_4$ with bottom $\{0\}$, top $\{1\}$ and the two incomparable middle elements $\{\}$ and $\{1, 0\}$. Evaluating formulas in the four-element lattice $\mathcal{P}(\{0, 1\})$ allows for the assignment of two new truth values $\{\}$ and $\{0, 1\}$. These represent so-called truth-value gaps and gluts, i.e. situations where formulas obtain neither resp. both of the classic truth values. Formally, the evaluation is defined inductively, starting from an atomic valuation $v : Prop \to \mathcal{P}(\{0, 1\})$, by:
$1 \in v(\neg\varphi)$ iff $0 \in v(\varphi)$
$0 \in v(\neg\varphi)$ iff $1 \in v(\varphi)$
$1 \in v(\varphi \wedge \psi)$ iff $1 \in v(\varphi)$ and $1 \in v(\psi)$
$0 \in v(\varphi \wedge \psi)$ iff $0 \in v(\varphi)$ or $0 \in v(\psi)$
An alternative approach is to use two separate classical valuations, called the positive valuation $v^+$ and the negative valuation $v^-$. Building on atomic valuations $v^+ : Prop \to \{0, 1\}$ and $v^- : Prop \to \{0, 1\}$, these are defined for $\varphi, \psi \in \mathcal{L}_{Prop}$ as:
$v^+(\neg\varphi) = 1$ iff $v^-(\varphi) = 1$
$v^-(\neg\varphi) = 1$ iff $v^+(\varphi) = 1$
$v^+(\varphi \wedge \psi) = 1$ iff $v^+(\varphi) = 1$ and $v^+(\psi) = 1$
$v^-(\varphi \wedge \psi) = 1$ iff $v^-(\varphi) = 1$ or $v^-(\psi) = 1$
Both approaches yield equivalent semantics for Belnap-Dunn logic, as is easily seen. For reasons of notational convenience, we will employ the double valuation approach. Within this approach, we can define an entailment relation as $\varphi \models_L \psi$ if and only if $v^+(\varphi) = 1 \Rightarrow v^+(\psi) = 1$ for all double valuations $(v^+, v^-)$.
This entailment relation goes by the name of first degree entailment. An important property of Belnap-Dunn logic, that we will make heavy use of later, is that it admits disjunctive (as well as conjunctive) normal forms. Just as in classical logic, a formula in disjunctive normal form is written as a disjunction of conjunctions of literals. However, unlike in classical logic, an atom might appear both positively and negatively within a conjunctive clause.
Theorem 1. (Theorem 3.9 in Font (1997)) Every formula of Belnap-Dunn logic is equivalent to a formula in a conjunctive (disjunctive) normal form.
Moreover, up to permutation of conjuncts and disjuncts, formulas in conjunctive (disjunctive) normal form may be identified with finite families of finite sets of literals (Theorem 3.15 in Přenosil (2018)).
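The double valuation semantics and the entailment relation of this section can be illustrated by a brute-force check. The following is a minimal sketch, not part of the original framework: the tuple encoding of formulas and the function names `evaluate`, `atoms` and `entails` are assumptions made for illustration only.

```python
from itertools import product

# Formulas as nested tuples (a hypothetical encoding chosen for this sketch):
# ("var", name), ("not", f), ("and", f, g), ("or", f, g).

def evaluate(f, pos, neg):
    """Return (v+(f), v-(f)) for atomic valuations pos, neg (dicts name -> bool)."""
    tag = f[0]
    if tag == "var":
        return pos[f[1]], neg[f[1]]
    if tag == "not":
        plus, minus = evaluate(f[1], pos, neg)
        return minus, plus                      # negation swaps the two valuations
    lp, lm = evaluate(f[1], pos, neg)
    rp, rm = evaluate(f[2], pos, neg)
    if tag == "and":
        return lp and rp, lm or rm
    if tag == "or":                             # disjunction read as not(not f and not g)
        return lp or rp, lm and rm
    raise ValueError(f"unknown connective: {tag}")

def atoms(f):
    """Collect the propositional variables occurring in f."""
    if f[0] == "var":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def entails(f, g):
    """First degree entailment: v+(f) = 1 implies v+(g) = 1 for all double valuations."""
    names = sorted(atoms(f) | atoms(g))
    n = len(names)
    for bits in product([False, True], repeat=2 * n):
        pos = dict(zip(names, bits[:n]))
        neg = dict(zip(names, bits[n:]))
        if evaluate(f, pos, neg)[0] and not evaluate(g, pos, neg)[0]:
            return False
    return True

p, q = ("var", "p"), ("var", "q")
print(entails(("and", p, ("not", p)), q))   # False: explosion is not valid
print(entails(("and", p, q), p))            # True: conjunction elimination is valid
```

The check enumerates all pairs of atomic valuations, so it is exponential in the number of variables and only meant for small examples.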

Probabilistic Models
The double valuation approach's starting assumption is that positive and negative evidence are distinct. That is, the absence of positive evidence for some $p$ is not the same as negative evidence against $p$ (or positive evidence for $\neg p$, if you will). In particular, there may be gaps, where neither evidence for $p$ nor against $p$ is available, and gluts, where evidence of both types is present. Within our models, we must hence treat positive and negative evidence separately. In the following, we assume $Prop$ finite and constant. Also, we will denote the set of literals over $Prop$ by $Lit$, i.e. $Lit := Prop \cup \{\neg p \mid p \in Prop\}$.
Definition 1. A non-standard model is a triple $M = \langle \Sigma, v^+, v^- \rangle$ where $\Sigma$ is a finite or countably infinite set of states and $v^+, v^- : \Sigma \times Prop \to \{0, 1\}$ are called the positive (negative) valuation function respectively. For $p \in Prop$ we let $v^\pm(p) = \{s \in \Sigma \mid v^\pm(s, p) = 1\}$.
Hence, a state $s$ of a model $M$ might be assigned an inconsistent set of propositions (i.e., $s \in v^+(p) \cap v^-(p)$ for some $p \in Prop$), and may remain undecided about some propositions ($s \notin v^+(q) \cup v^-(q)$ for some $q \in Prop$).
Non-standard models provide a semantics for BD. More specifically, logical formulas of $\mathcal{L}_{Prop}$ are evaluated on model-state pairs, using relations $\models^+$ and $\models^-$. From this, we then obtain the notions of a positive and negative extension.
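Definition 2. Let $M = \langle \Sigma, v^+, v^- \rangle$ be a non-standard model, $s \in \Sigma$ a state and $\varphi, \psi \in \mathcal{L}_{Prop}$ be formulas. Then
i) The semantics of $\mathcal{L}_{Prop}$ on $(M, s)$ is given by:
$M, s \models^+ p$ iff $s \in v^+(p)$
$M, s \models^- p$ iff $s \in v^-(p)$
$M, s \models^+ \neg\varphi$ iff $M, s \models^- \varphi$
$M, s \models^- \neg\varphi$ iff $M, s \models^+ \varphi$
$M, s \models^+ \varphi \wedge \psi$ iff $M, s \models^+ \varphi$ and $M, s \models^+ \psi$
$M, s \models^- \varphi \wedge \psi$ iff $M, s \models^- \varphi$ or $M, s \models^- \psi$
ii) The positive and negative extensions of $\varphi \in \mathcal{L}_{Prop}$ are
$|\varphi|^+_M = \{s \in \Sigma \mid M, s \models^+ \varphi\}$
$|\varphi|^-_M = \{s \in \Sigma \mid M, s \models^- \varphi\}$ $(= \{s \in \Sigma \mid M, s \models^+ \neg\varphi\})$
We define the entailment relation between sentences in the usual way: $\varphi \models^\pm \psi$ if and only if for all models $M$ and states $s$, if $M, s \models^\pm \varphi$ then $M, s \models^\pm \psi$. Observe the obvious connection between positive and negative extension: $|\neg\varphi|^+_M = |\varphi|^-_M$. Moreover, we define the sets of pure belief, pure disbelief, conflict and uncertainty about $\varphi$ as
$|\varphi|^b_M = |\varphi|^+_M \setminus |\varphi|^-_M$
$|\varphi|^d_M = |\varphi|^-_M \setminus |\varphi|^+_M$
$|\varphi|^c_M = |\varphi|^+_M \cap |\varphi|^-_M$
$|\varphi|^u_M = \Sigma \setminus (|\varphi|^+_M \cup |\varphi|^-_M)$
The terms belief and disbelief, of course, refer to the intended interpretation as doxastic states. Whenever clear from context, we omit the subscript $M$. Towards a semantics of non-standard probability theory, we expand the non-standard model defined above with a probability measure that is classic. Non-classicality of the ensuing probability assignments, then, will be derived from the underlying valuations only, i.e. from the fact that non-standard models allow for gaps and gluts of truth values.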
Definition 3. A probabilistic model is a tuple $M = \langle \Sigma, \mu, v^+, v^- \rangle$ where $\langle \Sigma, v^+, v^- \rangle$ is a non-standard model and $\mu$ is a probability measure on the full subset algebra of $\Sigma$.
Building on probabilistic models, we can derive two different probability assignments from $M$, one four-valued, the other single-valued. These are:
Definition 4. For a probabilistic model $M = \langle \Sigma, \mu, v^+, v^- \rangle$,
i) the induced non-standard probability function $p_\mu : \mathcal{L}_{Prop} \to \mathbb{R}$ is:
$$p_\mu(\varphi) = \mu(|\varphi|^+)$$
ii) the induced four-valued probability function $\hat{p}_\mu : \mathcal{L}_{Prop} \to \mathbb{R}^4$ is:
$$\hat{p}_\mu(\varphi) = \big(\mu(|\varphi|^b), \mu(|\varphi|^d), \mu(|\varphi|^u), \mu(|\varphi|^c)\big)$$
To end this section, we'd like to highlight a strong similarity to classic probabilistic models. Classic probability assignments can be derived from possible worlds models equipped with a probability function, i.e. finite classical models akin to those in Definition 3. More explicitly, for a classical model of the form $M = \langle W, v, \mu \rangle$ with $W$ a set of possible worlds, $v : W \times Prop \to \{0, 1\}$ a valuation, and $\mu : \mathcal{P}(W) \to [0; 1]$ a probability measure, the probability of some $\varphi$ is given as $\mu([\varphi])$, with $[\varphi] = \{w \in W : M, w \models \varphi\}$. In fact, if $Prop$ is finite, every probability assignment to $\mathcal{L}_{Prop}$ can be obtained in this way.
Moreover, every world $w$ of a possible worlds model naturally corresponds to its atomic valuation, which can be represented by the subset $V \subseteq Prop$ given by $p \in V$ iff $v(w, p) = 1$ for $p \in Prop$. In the same vein, each state $s$ of a probabilistic model corresponds to a non-standard possible assignment $n_s \in \mathcal{P}(Lit)$ defined by $p \in n_s$ iff $v^+(s, p) = 1$ and $\neg p \in n_s$ iff $v^-(s, p) = 1$ for $p \in Prop$. Hence, non-standard probabilistic models are obtained from possible world models by replacing classical worlds, i.e. atomic valuations, with BD-possible worlds, that is, elements of $\mathcal{P}(Lit)$.
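To make the two induced assignments concrete, the following sketch computes them on a small model whose states are sets of literals. It is an illustration under stated assumptions rather than a definitive implementation: the string `"~p"` for the literal not-p, the tuple encoding of formulas, the function names and the example masses are all hypothetical choices.

```python
from fractions import Fraction

# States are frozensets of literals; "~p" is this sketch's spelling of not-p.
# Formulas are nested tuples: ("var", name), ("not", f), ("and", f, g), ("or", f, g).

def support(f, state):
    """Return (state |=+ f, state |=- f)."""
    tag = f[0]
    if tag == "var":
        return f[1] in state, ("~" + f[1]) in state
    if tag == "not":
        plus, minus = support(f[1], state)
        return minus, plus
    lp, lm = support(f[1], state)
    rp, rm = support(f[2], state)
    if tag == "and":
        return lp and rp, lm or rm
    if tag == "or":
        return lp or rp, lm and rm
    raise ValueError(f"unknown connective: {tag}")

def p_ns(f, mu):
    """Non-standard probability: total mass of the positive extension of f."""
    return sum(m for s, m in mu.items() if support(f, s)[0])

def p_four(f, mu):
    """Four-valued probability (belief, disbelief, uncertainty, conflict)."""
    b = d = u = c = Fraction(0)
    for s, m in mu.items():
        plus, minus = support(f, s)
        if plus and minus:
            c += m
        elif plus:
            b += m
        elif minus:
            d += m
        else:
            u += m
    return (b, d, u, c)

# Illustrative masses on the four BD-possible worlds over a single variable p.
mu = {
    frozenset({"p"}): Fraction(2, 5),
    frozenset({"~p"}): Fraction(3, 10),
    frozenset(): Fraction(1, 10),           # gap: no evidence either way
    frozenset({"p", "~p"}): Fraction(1, 5)  # glut: evidence both ways
}
P = ("var", "p")
print(p_ns(P, mu))     # 3/5
print(p_four(P, mu))
```

In this example the probabilities of p and not-p are 3/5 and 1/2, so they sum to more than 1 even though the underlying measure is classical.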

Axioms of Non-standard probability
In the following, we present a number of axioms for non-standard and four-valued probabilities. The two sets of axioms given here are easily seen to be sound w.r.t. the semantics just presented. That they are also complete will be shown in Section 6. We can hence use these axioms for a purely syntactic definition of non-standard and four-valued probabilities.

Non-standard probabilities
We begin with axioms for single-valued non-standard probabilities, i.e. probability measures assigning each $\varphi \in \mathcal{L}_{Prop}$ a unique real number.
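Definition 5. A non-standard probability assignment is a function $p : \mathcal{L}_{Prop} \to \mathbb{R}$ satisfying
(A1) $0 \leq p(\varphi) \leq 1$ (normalization)
(A2) if $\varphi \models_L \psi$ then $p(\varphi) \leq p(\psi)$ (monotonicity)
(A3) $p(\varphi \wedge \psi) + p(\varphi \vee \psi) = p(\varphi) + p(\psi)$ (import-export)
for all $\varphi, \psi \in \mathcal{L}_{Prop}$.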
These axioms are strictly weaker than the classic Kolmogorov axioms (Kolmogorov, 2018). Axioms (A1)-(A3) can be derived from the Kolmogorov axioms, using that first degree entailment is a sub-relation of classical entailment. In the converse direction, however, only the non-negativity axiom ($p(\varphi) \geq 0$ for all $\varphi$) is derivable from (A1). Neither Kolmogorov's unit axiom ($p(\top) = 1$) nor the ($\sigma$)-additivity axioms are derivable from (A1)-(A3), as is illustrated by the fact that assigning probability .5 to every formula satisfies (A1)-(A3). In fact, the import-export axiom is a weak counterpart to additivity, stating that a general rule for adding probabilities that is derivable from the Kolmogorov axioms, $p(\varphi \vee \psi) = p(\varphi) + p(\psi) - p(\varphi \wedge \psi)$, continues to hold. Within the above axiomatization, the import-export axiom (A3) is the only condition regulating the relation between the probability of a formula and its negation. As a result, the probabilities of $\varphi$ and $\neg\varphi$ need not sum up to 1. The constraint $p(\varphi \vee \neg\varphi) + p(\varphi \wedge \neg\varphi) = p(\varphi) + p(\neg\varphi)$ allows for probabilistic gaps ($p(\varphi \vee \neg\varphi) < 1$) and gluts ($p(\varphi \wedge \neg\varphi) > 0$) to occur simultaneously. This squares with our original motivation of establishing independence between positive and negative evidence.
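As a worked instance of the constraint relating a formula and its negation, consider the following purely illustrative values (they are assumptions chosen for the example, not taken from the text). They show a gap and a glut occurring at the same time:

```latex
\begin{align*}
p(\varphi) &= 0.6, \qquad p(\neg\varphi) = 0.5, \qquad p(\varphi \wedge \neg\varphi) = 0.2 \\
p(\varphi \vee \neg\varphi) &= p(\varphi) + p(\neg\varphi) - p(\varphi \wedge \neg\varphi)
  = 0.6 + 0.5 - 0.2 = 0.9
\end{align*}
```

Here $p(\varphi \wedge \neg\varphi) = 0.2 > 0$ is a glut and $p(\varphi \vee \neg\varphi) = 0.9 < 1$ is a gap, while $p(\varphi) + p(\neg\varphi) = 1.1$ exceeds 1.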

Four-Valued probabilities
We now turn to four-valued probability assignments. These are characterized by a total of six axioms.
Definition 6. A four-valued probability assignment is a function $\hat{p} : \mathcal{L}_{Prop} \to \mathbb{R}^4$. Writing $\hat{p}(\varphi)$ as $(b_\varphi, d_\varphi, u_\varphi, c_\varphi)$, this function must satisfy the axioms (D1)-(D6) for all $\varphi, \psi \in \mathcal{L}_{Prop}$, where $\models_L$ is first-degree entailment. The four entries of $\hat{p}$ stand for pure belief (i.e. $\varphi$ is true and $\neg\varphi$ is not), pure disbelief, uncertainty and conflict respectively. Let us briefly explain the axioms. The first two axioms (D1) and (D2) are classicality axioms, stating that probabilities are non-negative and that the probabilistic masses of pure belief, pure disbelief, conflict and uncertainty must add up to 1. This reflects the intuition that the four cases are mutually exclusive and jointly exhaustive, i.e. that the metatheory of gaps and gluts is classical.
Axioms (D3)-(D6) then represent structural relations between the four-valued assignments. (D3) emphasizes the strong relation between $\varphi$ and $\neg\varphi$: belief in one is the same as disbelief in the other, while both share the same conflict and uncertainty. (D4) is a direct counterpart of axiom (A2) above, stating that the total belief in $\varphi$ (i.e. the sum of pure belief in $\varphi$ and belief in $\varphi$ and $\neg\varphi$ together) must be monotone under first degree entailment. (D5) expresses that an agent cannot have pure belief in contradictory formulas of the form $\varphi \wedge \neg\varphi$. A fortiori, the conflict about $\varphi \wedge \neg\varphi$ must be derived from (and equal to) conflict about $\varphi$ alone. (D6), finally, is a counterpart to the import-export axiom (A3). Briefly, it states that the total beliefs (i.e. the sum of pure belief and conflict together) of $\varphi$, $\psi$, $\varphi \vee \psi$ and $\varphi \wedge \psi$ must satisfy the import-export rule.
We should note that the axioms presented here are weaker than those put forward in Dunn (2010). There, the probability of a conjunction $\varphi \wedge \psi$ is determined by its conjuncts through an explicit formula. A similar axiom for three-valued probabilities (true/false/uncertain) can be found in Jøsang (1997). Notably, such a definition makes conjunctions truth functional, i.e. the probability of $\varphi \wedge \psi$ is fully determined by the probabilities of $\varphi$ and $\psi$. We take this to be too strong, especially given that no such functional dependence holds in classic probability theory. Moreover, this truth functional approach implies that all propositions are mutually probabilistically independent, precluding any interesting notions of conditionalization. To see this, assume that $\varphi$ and $\psi$ are classical, i.e. $\hat{p}(\varphi) = (b_\varphi, d_\varphi, 0, 0)$ and $\hat{p}(\psi) = (b_\psi, d_\psi, 0, 0)$. Then Dunn's definition simplifies to $\hat{p}(\varphi \wedge \psi) = (b_\varphi b_\psi,\ d_\varphi + d_\psi - d_\varphi d_\psi,\ 0,\ 0)$. In other words, the probability (belief) in $\varphi \wedge \psi$ is the product of the probabilities of $\varphi$ and $\psi$, which is exactly the definition of probabilistic independence.
In the following section we will show a strong correspondence between non-standard and four-valued probability assignments. Thereafter, we show axiom systems (A1)-(A3) and (D1)-(D6) to be sound and complete with respect to the class of probabilistic models defined above (Section 6). In Section 7 we then discuss approaches to conditionalization in either setting.

Correspondence between non-standard and four-valued probabilities
We have so far presented two different frameworks for non-standard probability, one real-valued, the other with values in $\mathbb{R}^4$. As we show now, both are different but equivalent perspectives on the same phenomenon. To this end, let $P_{ns}$ and $P_4$ be the sets of non-standard and four-valued probability assignments respectively. That is, $P_{ns}$ is the set of functions $\mathcal{L}_{Prop} \to \mathbb{R}$ satisfying (A1)-(A3) while $P_4$ consists of all mappings $\mathcal{L}_{Prop} \to \mathbb{R}^4$ satisfying (D1)-(D6). We will show the translation map $tr^{4}_{ns} : P_4 \to P_{ns}$ defined by
$$tr^{4}_{ns}(\hat{p})(\varphi) := b_\varphi + c_\varphi$$
to be a bijection. In the opposite direction, the map $tr^{ns}_{4} : P_{ns} \to P_4$ is given by
$$tr^{ns}_{4}(p)(\varphi) := \big(p(\varphi) - p(\varphi \wedge \neg\varphi),\ p(\neg\varphi) - p(\varphi \wedge \neg\varphi),\ 1 - p(\varphi) - p(\neg\varphi) + p(\varphi \wedge \neg\varphi),\ p(\varphi \wedge \neg\varphi)\big)$$
As expected, the maps $tr^{4}_{ns}$ and $tr^{ns}_{4}$ are inverse to each other:
Theorem 2. $tr^{4}_{ns}$ and $tr^{ns}_{4}$ are well-defined. Moreover, $tr^{ns}_{4} \circ tr^{4}_{ns} = id_{P_4}$ and $tr^{4}_{ns} \circ tr^{ns}_{4} = id_{P_{ns}}$.
Moreover, the translation maps $tr^{4}_{ns}$ and $tr^{ns}_{4}$ cohere with the way we defined non-standard and four-valued assignments on a given probabilistic model.
Theorem 3. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model and $p_\mu$ and $\hat{p}_\mu$ the induced non-standard and four-valued probability functions. Then $tr^{4}_{ns} \circ \hat{p}_\mu = p_\mu$ and $tr^{ns}_{4} \circ p_\mu = \hat{p}_\mu$.
The remainder of this section is devoted to showing these two results.
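Finally, we show that $tr^{ns}_{4} \circ tr^{4}_{ns} = id_{P_4}$ and $tr^{4}_{ns} \circ tr^{ns}_{4} = id_{P_{ns}}$, i.e. that $tr^{ns}_{4}$ and $tr^{4}_{ns}$ are left and right inverses of each other. We begin by showing that $tr^{4}_{ns}(tr^{ns}_{4}(p)) = p$ for any $p \in P_{ns}$. For $\varphi \in \mathcal{L}_{Prop}$, we have that $tr^{ns}_{4}(p)(\varphi)$ equals
$$\big(p(\varphi) - p(\varphi \wedge \neg\varphi),\ p(\neg\varphi) - p(\varphi \wedge \neg\varphi),\ 1 - p(\varphi) - p(\neg\varphi) + p(\varphi \wedge \neg\varphi),\ p(\varphi \wedge \neg\varphi)\big).$$
Since $tr^{4}_{ns}$ adds the belief and conflict entries of this vector, the result is $p(\varphi) - p(\varphi \wedge \neg\varphi) + p(\varphi \wedge \neg\varphi) = p(\varphi)$.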
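Proof of Theorem 3. For $\varphi \in \mathcal{L}_{Prop}$ denote $\hat{p}_\mu(\varphi)$ by $(b_\varphi, d_\varphi, u_\varphi, c_\varphi)$. By Definition 4, we have
$$b_\varphi = \mu(|\varphi|^b) \quad \text{and} \quad c_\varphi = \mu(|\varphi|^c).$$
Hence, as $|\varphi|^b$ and $|\varphi|^c$ are disjoint with union $|\varphi|^+$,
$$b_\varphi + c_\varphi = \mu(|\varphi|^b) + \mu(|\varphi|^c) = \mu(|\varphi|^+).$$
By definition, the latter term is exactly $p_\mu(\varphi)$. Thus $tr^{4}_{ns} \circ \hat{p}_\mu = p_\mu$, as desired. Moreover, the latter formula implies that $tr^{ns}_{4} \circ tr^{4}_{ns} \circ \hat{p}_\mu = tr^{ns}_{4} \circ p_\mu$. By Theorem 2, we have $tr^{ns}_{4} \circ tr^{4}_{ns} = id_{P_4}$. Hence, the last equation reduces to $\hat{p}_\mu = tr^{ns}_{4} \circ p_\mu$, proving the second part of the theorem.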
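The round trip between the two representations can be checked numerically at a single formula. The sketch below assumes that the translation from four-valued to non-standard probabilities adds the belief and conflict entries, as the proof of Theorem 3 suggests; the function names and the example values are hypothetical.

```python
from fractions import Fraction

def tr_ns_to_4(p_phi, p_not_phi, p_glut):
    """Four-valued vector (b, d, u, c) at phi, computed from p(phi), p(not phi)
    and p(phi and not phi)."""
    return (
        p_phi - p_glut,                  # pure belief
        p_not_phi - p_glut,              # pure disbelief
        1 - p_phi - p_not_phi + p_glut,  # uncertainty
        p_glut,                          # conflict
    )

def tr_4_to_ns(vec):
    """Non-standard probability of phi: total belief, i.e. belief plus conflict
    (an assumption of this sketch, see the lead-in)."""
    b, d, u, c = vec
    return b + c

# Illustrative values: p(phi) = 3/5, p(not phi) = 1/2, p(phi and not phi) = 1/5.
vec = tr_ns_to_4(Fraction(3, 5), Fraction(1, 2), Fraction(1, 5))
print(vec)
print(tr_4_to_ns(vec))   # 3/5, the value we started from
```

Swapping the first two entries of the vector, as (D3) describes for negation, and applying `tr_4_to_ns` again returns the probability of the negated formula.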

A Completeness Result.
Having shown that non-standard and four-valued probability assignments are equivalent, as witnessed by the bijection $tr^{4}_{ns} : P_4 \to P_{ns}$, we now turn our attention to the class of probability functions that are induced by probabilistic models. As it turns out, these are fully characterized by our axioms (A1)-(A3). More specifically, we will show that axioms (A1)-(A3) are a sound and complete characterization of the induced non-standard probability functions of probabilistic models. Of course, by Theorems 2 and 3, this implies that also (D1)-(D6) are a sound and complete characterization of the induced four-valued probability functions of probabilistic models. In fact, the soundness part is easy to check:
Lemma 1. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model and $p_\mu$ the induced non-standard probability function. Then $p_\mu$ satisfies (A1)-(A3).
Towards completeness, we will show a stronger result. Recall that completeness expresses that every $p \in P_{ns}$ is the induced non-standard probability function of some probabilistic model $M$. This $M$ may, however, not be unique, as $p$ may not be expressive enough to completely determine all properties of $M$. As we will show, $M$ is almost unique. More specifically, we determine a class $M_{can}$ of canonical models such that every $p \in P_{ns}$ is the induced non-standard probability function of exactly one $M \in M_{can}$.
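Definition 7. i) We call a probabilistic model $M = \langle \Sigma, \mu, v^+, v^- \rangle$ canonical iff $\Sigma = \mathcal{P}(Lit)$ and $v^+, v^-$ satisfy
$v^+(p) = \{\sigma \in \mathcal{P}(Lit) \mid p \in \sigma\}$
$v^-(p) = \{\sigma \in \mathcal{P}(Lit) \mid \neg p \in \sigma\}$
ii) $M_{can}$ is the set of canonical probabilistic models.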

Remark:
The set $M_{can}$ is representative of the set of all models in the following sense: For any probabilistic model $M = \langle \Sigma, \mu, v^+, v^- \rangle$, there is a unique canonical model $M_c = \langle \mathcal{P}(Lit), \mu_c, v^+_c, v^-_c \rangle$ and a unique function $f : M \to M_c$ such that $x \in v^\pm(p) \Leftrightarrow f(x) \in v^\pm_c(p)$ and $\mu_c(\sigma_c) = \mu(f^{-1}(\sigma_c))$ for all $\sigma_c \in \mathcal{P}(Lit)$. In particular, $p_\mu(\varphi) = p_{\mu_c}(\varphi)$ for all $\varphi \in \mathcal{L}_{Prop}$. The main theorem of this section is:
Theorem 4. For any $p \in P_{ns}$ there is a unique canonical model $M_p = \langle \mathcal{P}(Lit), \mu, v^+, v^- \rangle$ with induced non-standard probability function $p_\mu$ such that $p = p_\mu$.
Corollary 1. Axioms (A1)-(A3) are sound and complete with respect to the class of induced non-standard probability functions of probabilistic models.
By Theorems 2 and 3, the previous result readily translates to the level of four-valued probability functions.
Theorem 5. For any $\hat{p} \in P_4$ there is a unique canonical model $M_{\hat{p}} = \langle \mathcal{P}(Lit), \mu, v^+, v^- \rangle$ with induced four-valued probability function $\hat{p}_\mu$ such that $\hat{p} = \hat{p}_\mu$.
Corollary 2. Axioms (D1)-(D6) are sound and complete with respect to the class of induced four-valued probability functions of probabilistic models.
Proof of Theorem 4. Fix $p \in P_{ns}$. Let $\Sigma = \mathcal{P}(Lit)$ and let $v^\pm : Prop \to \mathcal{P}(\Sigma)$ be defined as $v^+(q) = \{\sigma \in \Sigma \mid q \in \sigma\}$ and $v^-(q) = \{\sigma \in \Sigma \mid \neg q \in \sigma\}$ respectively. We will construct a classic probability function $\mu : \mathcal{P}(\Sigma) \to [0; 1]$ such that the canonical model $M = \langle \Sigma, \mu, v^+, v^- \rangle$ satisfies $p_\mu = p$. It suffices to construct the underlying probability mass function $W : \Sigma \to [0; 1]$, i.e. the function satisfying $W(x) = \mu(\{x\})$ for $x \in \Sigma$. We will do so by induction on $|x|$ for $x \in \Sigma = \mathcal{P}(Lit)$. The construction proceeds in three steps. As an induction base, we define $\mu(x_{max})$, with $x_{max}$ the unique element in $\Sigma$ with $|x_{max}| = |Lit|$. In the induction step, we define $\mu(x)$ for all $x$ with $|x| = k \geq 1$, assuming that $\mu(y)$ has already been defined for all $y$ with $|y| > k$. In the last step, finally, we define $\mu(\emptyset)$, where $\emptyset$ is the unique element of $\Sigma$ of cardinality 0.
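We will need to ensure that $\mu([\varphi]) = p(\varphi)$ for all $\varphi \in \mathcal{L}_{Prop}$, where $[\varphi]$ denotes the truth set of $\varphi$ in the non-standard model $\langle \Sigma, v^+, v^- \rangle$, i.e. $[\varphi] = \{x \subseteq Lit \mid \bigwedge_{q \in x} q \models_L \varphi\}$. Note that by the normal form theorem (Theorem 1) and axiom (A2), it suffices to show this property for all $\varphi \in \mathcal{L}_{Prop}$ that are in disjunctive normal form. Moreover, note that for any $\varphi, \psi \in \mathcal{L}_{Prop}$ in disjunctive normal form, we have that $\mu([\varphi \vee \psi]) = \mu([\varphi]) + \mu([\psi]) - \mu([\varphi \wedge \psi])$, as witnessed by
$$[\varphi \vee \psi] = [\varphi] \cup [\psi] \quad \text{and} \quad [\varphi \wedge \psi] = [\varphi] \cap [\psi].$$
By (A3), hence, knowing that $\mu([*]) = p(*)$ for $* \in \{\varphi, \psi, \varphi \wedge \psi\}$ guarantees that $\mu([\varphi \vee \psi]) = p(\varphi \vee \psi)$. It thus suffices to show that $\mu([\varphi]) = p(\varphi)$ whenever $\varphi$ is a conjunction of literals, i.e. of the form $\bigwedge_{q \in x} q$ with $x \subseteq Lit$. We will show this property to hold alongside our inductive construction.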
For the first step, let $x_{max} = Lit$, the unique element in $\Sigma$ of maximal cardinality. Note that $\{x_{max}\}$ is the truth set of the formula $\bigwedge_{q \in Lit} q$. We thus set $W(x_{max}) := p(\bigwedge_{q \in Lit} q)$. By axiom (A1) we have that $0 \leq W(x_{max}) \leq 1$.
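For the inductive step let $k \geq 1$ and assume that $W(y)$ has already been defined for all $y$ with $|y| > k$. We simultaneously define $W(x)$ for all $x \subseteq Lit$ with $|x| = k$. Let such $x$ be given. Note that the truth set of $\bigwedge_{q \in x} q$ is $\{y \subseteq Lit \mid x \subseteq y\}$. By induction assumption $W(y)$ is already defined for all $y \in \{y \subseteq Lit \mid x \subset y\}$. We can hence define
$$W(x) := p\Big(\bigwedge_{q \in x} q\Big) - \sum_{y \supset x} W(y).$$
We have that $W(x) \leq p(\bigwedge_{q \in x} q)$ and thus $W(x) \leq 1$. On the other hand, note that $\{y \subseteq Lit \mid x \subset y\}$ is the truth set of $\bigvee_{y \supset x} \bigwedge_{q \in y} q$. Hence, by induction assumption and (A2),
$$\sum_{y \supset x} W(y) = p\Big(\bigvee_{y \supset x} \bigwedge_{q \in y} q\Big) \leq p\Big(\bigwedge_{q \in x} q\Big).$$
Combining this inequality with the definition of $W(x)$ yields $W(x) \geq 0$.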
For the last step, finally, assume that $W(x)$ is already defined for all $x \neq \emptyset$. We then set $W(\emptyset) = 1 - \sum_{x \neq \emptyset} W(x)$. It follows immediately that $\sum_{x \in \Sigma} W(x) = 1$. Moreover, by our induction, $W(x) \geq 0$ for all $x \neq \emptyset$, hence $W(\emptyset) \leq 1$. On the other hand, note that $\{x \subseteq Lit \mid x \neq \emptyset\}$ is the truth set of $\bigvee_{q \in Lit} q$. By induction assumption, $\sum_{x \neq \emptyset} W(x) = p(\bigvee_{q \in Lit} q)$. By axiom (A1), hence, $W(\emptyset) \geq 0$.
Along the lines of the proof, we have ensured that $\mu([\varphi]) = p(\varphi)$ for all $\varphi$ of the form $\bigwedge_{q \in x} q$ for some $x \subseteq Lit$. By the above remark, this ensures that $\mu([\varphi]) = p(\varphi)$ for all $\varphi$, i.e. that $p_\mu = p$.
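The three-step construction of the mass function can be sketched computationally. The code below is an illustration under assumptions, not the authors' implementation: states are frozensets of literal names, the input is a table of the values of p on conjunctions of literals, and the helper names `powerset`, `p_conj_from_mass` and `canonical_mass` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def powerset(lits):
    """All subsets of the literal set, as frozensets."""
    lits = sorted(lits)
    return [frozenset(c) for k in range(len(lits) + 1) for c in combinations(lits, k)]

def p_conj_from_mass(mass):
    """p of the conjunction over x: total mass of all states y with x a subset of y."""
    return {x: sum(m for y, m in mass.items() if x <= y) for x in mass}

def canonical_mass(p_conj, lits):
    """Recover W from the values of p on conjunctions of literals.

    States are processed from largest to smallest, so every proper superset
    of the current state already has its mass assigned."""
    W = {}
    for x in sorted(powerset(lits), key=len, reverse=True):
        if x:
            # base case (no proper supersets) and induction step
            W[x] = p_conj[x] - sum(W[y] for y in W if x < y)
        else:
            # last step: the empty state receives the remaining mass
            W[x] = 1 - sum(W.values())
    return W

# Illustrative mass function over the literals p and ~p (spelling is an assumption).
lits = {"p", "~p"}
mass = {
    frozenset({"p"}): Fraction(2, 5),
    frozenset({"~p"}): Fraction(3, 10),
    frozenset(): Fraction(1, 10),
    frozenset({"p", "~p"}): Fraction(1, 5),
}
recovered = canonical_mass(p_conj_from_mass(mass), lits)
print(recovered == mass)   # True: the construction inverts p_conj_from_mass
```

Note that the value of the table at the empty state is never read, mirroring the fact that the proof fixes the mass of the empty state by normalization.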
To end the static parts of this paper, we provide a graphical overview of the relationships identified so far. By Theorems 2 to 5, the diagram in Figure 1 commutes. Moreover, each pair of opposite arrows in the upper half of the diagram, i.e. the pairs $(tr^{4}_{ns}, tr^{ns}_{4})$, $(p \mapsto M_p, \mu \mapsto p_\mu)$ and $(\hat{p} \mapsto M_{\hat{p}}, \mu \mapsto \hat{p}_\mu)$, are left and right inverses to each other.

Conditioning
In a classic setting, Bayesian conditioning on a formula $\varphi$ describes a situation where $\varphi$ is learned to be true with probability 1, and hence $\neg\varphi$ true with probability 0. A generalization of this rule is Jeffrey conditioning, where an agent may learn the probability of $\varphi$ to be any value $q \in [0; 1]$, rather than only the extremal value of 1 (or 0, when $\neg\varphi$ is learned) permitted in Bayes' conditioning.
[Figure 1: The relationships identified so far. By Theorems 2 to 5, this diagram commutes.]
[Figure 2: Classic Conditioning]
Either method is best illustrated semantically. Within a classical setting, any formula $\varphi$ defines a binary partition $\{[\varphi], [\neg\varphi]\}$ on the state space, cf. Figure 2. Jeffrey conditioning is then executed by linearly expanding or contracting the original measure $\mu$ on $[\varphi]$ and $[\neg\varphi]$ to some new $\mu'$ in such a way that $\mu'([\varphi]) = q$ and $\mu'([\neg\varphi]) = 1 - q$. We hence get for any $\psi \in \mathcal{L}_{Prop}$ that
$$\mu'([\psi]) = \mu([\psi \wedge \varphi]) \frac{q}{\mu([\varphi])} + \mu([\psi \wedge \neg\varphi]) \frac{1 - q}{\mu([\neg\varphi])}$$
which, in the case of Bayesian conditioning (i.e. $q = 1$), reduces to the well-known formula
$$\mu'([\psi]) = \frac{\mu([\psi \wedge \varphi])}{\mu([\varphi])}.$$
Conditionalization in our extended setting follows a similar idea. However, note that both Bayes' and Jeffrey conditioning implicitly rest on the facts that $p(\varphi) + p(\neg\varphi) = 1$ and that $p(\varphi \wedge \neg\varphi) = 0$, i.e. that there are no gaps and gluts. As these facts no longer hold, conditioning will behave differently in a non-standard setting. In fact, we will show that non-standard probabilities allow for two different notions of Jeffrey updating, one where a new value for the probability of $\varphi$, i.e. $p(\varphi)$, is learned, the other where a new value of the four-valued vector $\hat{p}(\varphi)$ is acquired. The former version of Jeffrey updating is best described on the level of non-standard probability assignments, the latter on the level of four-valued assignments. Yet, using the maps $tr^{4}_{ns}$ and $tr^{ns}_{4}$, both versions of updating can naturally be applied to either non-standard or four-valued probability assignments.
Just as in the standard case, non-normal Bayes conditioning can be defined as an extremal case of Jeffrey updates. In fact, non-normal Bayes conditioning has been studied independently, for instance in Mares (1997). The current framework generalizes the latter's approach by also incorporating Jeffrey updating and by identifying a number of different Bayes-like updates, including the one put forward by Mares.

Updating on non-standard information
In our first notion of updating, the agent's update prescribes her to set the probability of $\varphi$ to some $q \in [0; 1]$. Notably, within a non-standard setting, this does not carry any information about the value of $\neg\varphi$: the agent may or may not leave $p(\neg\varphi)$ unchanged in her update. In line with classic Jeffrey updating, non-standard Jeffrey updating is best illustrated semantically. For any formula $\varphi \in \mathcal{L}_{Prop}$, we can dissect the state space of a probabilistic model $M = \langle \Sigma, \mu, v^+, v^- \rangle$ into two sets: the truth set $[\varphi]$ of $\varphi$ and its complement $\Sigma \setminus [\varphi]$. Unlike in the classic case, however, $\Sigma \setminus [\varphi]$ is not the truth set of $\neg\varphi$, nor of any other $\psi \in \mathcal{L}_{Prop}$. Yet, we can define Jeffrey updating as in the classic case.
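Definition 8. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model. Let $q \in [0; 1]$ and $\varphi \in \mathcal{L}_{Prop}$ such that $\mu([\varphi]) \in (0; 1)$. Then the semantic non-standard Jeffrey update for updating the probability of $\varphi$ to be $q$ on $M$ is the probabilistic model $M^{\varphi,q} = \langle \Sigma, \mu^{\varphi,q}, v^+, v^- \rangle$ determined by:
$$\mu^{\varphi,q}(x) = \begin{cases} \mu(x) \cdot \dfrac{q}{\mu([\varphi])} & \text{if } x \in [\varphi] \\[2ex] \mu(x) \cdot \dfrac{1 - q}{1 - \mu([\varphi])} & \text{if } x \notin [\varphi] \end{cases}$$
Fact 1. Non-standard Jeffrey updating is successful, i.e. for any probabilistic model $M = \langle \Sigma, \mu, v^+, v^- \rangle$, any $q \in [0; 1]$ and $\varphi \in \mathcal{L}_{Prop}$ such that $\mu([\varphi]) \in (0; 1)$, the non-standard Jeffrey update on $M$ updating the probability of $\varphi$ to $q$ satisfies $\mu^{\varphi,q}([\varphi]) = q$.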
Despite the fact that the set $\Sigma \setminus [\varphi]$ is not definable, we can give a syntactic characterization of non-standard Jeffrey updating. The following is a non-standard equivalent to classic Jeffrey updating, cf. Formula (3).
Lemma 2. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model. Let $q \in [0; 1]$ and $\varphi \in \mathcal{L}_{Prop}$ such that $\mu([\varphi]) \in (0; 1)$. Then for any $\psi \in \mathcal{L}_{Prop}$, the non-standard Jeffrey update $M^{\varphi,q} = \langle \Sigma, \mu^{\varphi,q}, v^+, v^- \rangle$ of $M$ satisfies:
$$\mu^{\varphi,q}([\psi]) = \mu([\psi \wedge \varphi]) \cdot \frac{q}{\mu([\varphi])} + \big(\mu([\psi]) - \mu([\psi \wedge \varphi])\big) \cdot \frac{1 - q}{1 - \mu([\varphi])}$$
Notably, after translating the previous lemma into its induced non-standard probability assignments $p_\mu$ and $p_{\mu^{\varphi,q}}$, we obtain a fully syntactic characterization of non-standard Jeffrey updating.
Definition 9. Let $p : \mathcal{L}_{Prop} \to \mathbb{R}$ be a non-standard probability assignment, let $q \in [0, 1]$ and $\phi \in \mathcal{L}_{Prop}$ with $p(\phi) \in (0, 1)$. Then the syntactic non-standard Jeffrey update setting the probability of $\phi$ to $q$ is the probability function $p^{\phi,q} : \mathcal{L}_{Prop} \to \mathbb{R}$ defined by
$$p^{\phi,q}(\psi) = p(\psi \wedge \phi) \cdot \frac{q}{p(\phi)} + \big(p(\psi) - p(\psi \wedge \phi)\big) \cdot \frac{1-q}{1-p(\phi)}$$
By construction, semantic and syntactic non-standard Jeffrey updating coincide in the following sense.
Fact 2. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model, let $q \in [0, 1]$ and $\phi \in \mathcal{L}_{Prop}$ with $p_\mu(\phi) \in (0, 1)$. Then $p_{\mu^{\phi,q}} = p_\mu^{\phi,q}$.
We will hence omit the labels and only speak of non-standard Jeffrey updating. We end this section with three facts about non-standard Jeffrey updating.
Fact 3. Assume that the non-standard probability function $p : \mathcal{L}_{Prop} \to \mathbb{R}$ is classic, i.e. satisfies the Kolmogorov axioms. Moreover, let $\phi \in \mathcal{L}_{Prop}$ with $p(\phi) \in (0, 1)$ and $q \in [0, 1]$. Then the non-standard and the classic Jeffrey update for setting the probability of $\phi$ to $q$ coincide, i.e. for all $\psi \in \mathcal{L}_{Prop}$
$$p^{\phi,q}(\psi) = p(\psi \wedge \phi) \cdot \frac{q}{p(\phi)} + p(\psi \wedge \neg\phi) \cdot \frac{1-q}{p(\neg\phi)}.$$
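The coincidence stated in Fact 3 can be illustrated with a short Python sketch. It implements the formula of Definition 9 next to the classic Jeffrey rule and compares the two on a classic prior, for which $p(\psi) - p(\psi \wedge \phi) = p(\psi \wedge \neg\phi)$ and $1 - p(\phi) = p(\neg\phi)$. The function names and the numerical prior are hypothetical.

```python
from fractions import Fraction

def ns_jeffrey(p_psi_and_phi, p_psi, p_phi, q):
    """Syntactic non-standard Jeffrey update of psi, following Definition 9."""
    if not 0 < p_phi < 1:
        raise ValueError("p(phi) must lie strictly between 0 and 1")
    return (p_psi_and_phi * q / p_phi
            + (p_psi - p_psi_and_phi) * (1 - q) / (1 - p_phi))

def classic_jeffrey(p_psi_and_phi, p_psi_and_not_phi, p_phi, q):
    """Classic Jeffrey update: p(psi | phi) * q + p(psi | not-phi) * (1 - q)."""
    return (p_psi_and_phi * q / p_phi
            + p_psi_and_not_phi * (1 - q) / (1 - p_phi))

# Hypothetical classic prior: p(phi) = 1/2, p(psi & phi) = 1/8, p(psi & ~phi) = 1/4.
p_phi, p_both, p_psi_not_phi = Fraction(1, 2), Fraction(1, 8), Fraction(1, 4)
p_psi = p_both + p_psi_not_phi  # additivity, valid because the prior is classic
q = Fraction(3, 4)

new_ns = ns_jeffrey(p_both, p_psi, p_phi, q)
new_classic = classic_jeffrey(p_both, p_psi_not_phi, p_phi, q)
print(new_ns, new_classic)
```

For a non-classic prior the quantity `p_psi - p_both` need not equal the probability of $\psi \wedge \neg\phi$, which is why the two rules can come apart outside the classic case.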

From this, it follows directly that
Fact 4. Non-standard Jeffrey updating is not commutative. That is, there is a non-standard probability function $p : \mathcal{L}_{Prop} \to \mathbb{R}$ and $\phi, \psi \in \mathcal{L}_{Prop}$ and $q, r \in [0, 1]$ with $p(\phi), p(\psi), p^{\phi,q}(\psi), p^{\psi,r}(\phi) \in (0, 1)$ such that $(p^{\phi,q})^{\psi,r} \neq (p^{\psi,r})^{\phi,q}$.
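A concrete witness for Fact 4 can be computed directly. The sketch below uses a hypothetical four-state classic model (which is in particular a non-standard one) with correlated $\phi$ and $\psi$, and applies the two updates in both orders. The helper names and all numbers are illustrative assumptions, and the rescaling is the pointwise Jeffrey rule on a finite state space.

```python
from fractions import Fraction

def jeffrey_update(mu, truth_set, q):
    """Rescale mu so that truth_set receives mass q (prior mass must be in (0, 1))."""
    mass = sum(mu[x] for x in truth_set)
    return {x: m * q / mass if x in truth_set else m * (1 - q) / (1 - mass)
            for x, m in mu.items()}

def prob(mu, truth_set):
    """Probability of a formula, given as the measure of its truth set."""
    return sum(mu[x] for x in truth_set)

# Hypothetical model: phi holds at {a, b}, psi holds at {a, c}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 6),
      "c": Fraction(1, 6), "d": Fraction(1, 6)}
phi, psi = {"a", "b"}, {"a", "c"}
half = Fraction(1, 2)

phi_then_psi = jeffrey_update(jeffrey_update(mu, phi, half), psi, half)
psi_then_phi = jeffrey_update(jeffrey_update(mu, psi, half), phi, half)

# The later update always succeeds, but it may undo the earlier one.
print(prob(phi_then_psi, phi), prob(psi_then_phi, phi))
```

In this example the probability of $\phi$ ends at 7/15 when $\phi$ is updated first and at 1/2 when it is updated last, so the two orders yield different posteriors.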

Non-standard Bayesian updating
Just as in the classic case, we will define non-standard Bayesian updating as a special case of non-standard Jeffrey updating where the probability of $\phi$ is set to 1. In this case, the formula of Definition 9 simplifies to the same formula as in the classical case. Note that this is also the first of two approaches to Bayes updating proposed by Mares (1997). The second proposal by Mares, in contrast, is not related to any version of Bayes updating presented here, as it strives to actively minimize conflict.
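Definition 10. Let $p : \mathcal{L}_{Prop} \to \mathbb{R}$ be a non-standard probability function and let $\phi \in \mathcal{L}_{Prop}$ with $p(\phi) > 0$. Then the (positive) non-standard Bayesian update on $\phi$ is the function $p^{\phi,pos}$:
$$p^{\phi,pos}(\psi) = \frac{p(\psi \wedge \phi)}{p(\phi)} \quad \text{for } \psi \in \mathcal{L}_{Prop}.$$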
Unlike in the classical setting, however, non-standard Bayesian updating does not cover all extremal cases. Setting the probability of $\phi$ to 0 is not the same as setting the probability of $\neg\phi$ to 1, hence this case needs to be treated separately.
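Definition 11. Let $p : \mathcal{L}_{Prop} \to [0, 1]$ be a non-standard probability function and let $\phi \in \mathcal{L}_{Prop}$ with $p(\phi) < 1$. Then the negative non-standard Bayesian update on $\phi$ is the function $p^{\phi,neg}$:
$$p^{\phi,neg}(\psi) = \frac{p(\psi) - p(\psi \wedge \phi)}{1 - p(\phi)} \quad \text{for } \psi \in \mathcal{L}_{Prop}.$$
Like their classic counterpart, positive and negative non-standard Bayesian conditioning are order independent:
Lemma 3. Let $p : \mathcal{L}_{Prop} \to \mathbb{R}$ and let $\phi, \psi \in \mathcal{L}_{Prop}$ with $p(\phi), p(\psi), p^{\phi}(\psi), p^{\psi}(\phi) \in (0, 1)$.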
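The difference between the two extremal updates shows up as soon as a model contains a glut or a gap. The following Python sketch assumes that the negative update is the $q = 0$ instance of the Jeffrey formula in Definition 9, and that the truth set of a conjunction is the intersection of the truth sets of its conjuncts. The model, the state names and the helper functions are hypothetical.

```python
from fractions import Fraction

def mass(mu, states):
    """Measure of a set of states."""
    return sum(mu[x] for x in states)

def positive_update(mu, truth_phi, truth_psi):
    """p^{phi,pos}(psi) = p(psi & phi) / p(phi)."""
    return mass(mu, truth_psi & truth_phi) / mass(mu, truth_phi)

def negative_update(mu, truth_phi, truth_psi):
    """Assumed q = 0 case: (p(psi) - p(psi & phi)) / (1 - p(phi))."""
    return ((mass(mu, truth_psi) - mass(mu, truth_psi & truth_phi))
            / (1 - mass(mu, truth_phi)))

# Hypothetical model: phi is true at {t, g} and false at {f, g};
# g is a glut (both true and false), n is a gap (neither).
mu = {"t": Fraction(2, 5), "f": Fraction(3, 10),
      "g": Fraction(1, 5), "n": Fraction(1, 10)}
true_phi, false_phi = {"t", "g"}, {"f", "g"}

pos_phi = positive_update(mu, true_phi, true_phi)       # belief in phi
pos_not_phi = positive_update(mu, true_phi, false_phi)  # belief in not-phi
neg_phi = negative_update(mu, true_phi, true_phi)
neg_not_phi = negative_update(mu, true_phi, false_phi)
print(pos_phi, pos_not_phi, neg_phi, neg_not_phi)
```

After the positive update the agent is certain of $\phi$ yet retains belief 1/3 in $\neg\phi$ because of the glut, and after the negative update her belief in $\neg\phi$ is 3/4 rather than 1 because of the gap.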

Updating on four-valued information
Within non-standard probability, knowing the probability of $\phi$ does not provide any information about the probability of $\neg\phi$. Hence, in learning about $\phi$, two cases are to be distinguished. In the first case, the agent only receives information about $\phi$, without learning anything about $\neg\phi$ or $\phi \wedge \neg\phi$. In the second case, the agent learns the full probabilistic information about $\phi$, that is, the probabilities of $\phi$ and $\neg\phi$, but also the size of the corresponding gap and glut. As discussed above, this information can be encoded in a vector $(b, d, u, c) \in \mathbb{R}^4$ specifying the new pure belief (i.e. belief without conflict), pure disbelief (belief in $\neg\phi$ without conflict), uncertainty and conflict about $\phi$.
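Again, the notion of four-valued Jeffrey updating is best illustrated semantically. As shown in Figure 3, for any $\phi \in \mathcal{L}_{Prop}$, the sets of pure belief, pure disbelief, uncertainty and conflict about $\phi$ jointly form a partition $([\phi] \setminus [\phi \wedge \neg\phi],\ [\neg\phi] \setminus [\phi \wedge \neg\phi],\ \Sigma \setminus [\phi \vee \neg\phi],\ [\phi \wedge \neg\phi])$ of a probabilistic model $M$. Hence, a similar idea as in classic Jeffrey updating can be applied, linearly expanding or shrinking the measure on each of these four cells to its appropriate size. Notably, linear expansion (to a larger size) is only well defined if the cell to be expanded has a strictly positive measure. We capture this with the notion of admissibility of a vector $(b, d, u, c)$:
Definition 12. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$, let $\phi \in \mathcal{L}_{Prop}$ and denote $\hat{p}_\mu(\phi)$ by $(b_\phi, d_\phi, u_\phi, c_\phi)$. We call a vector $(b, d, u, c) \in [0, 1]^4$ admissible for $\phi$ [...]
The semantic four-valued Jeffrey update on $M$ setting the probability of $\phi$ to $(b, d, u, c)$ is then given by the measure
$$\mu^{\phi,(b,d,u,c)}(x) = \begin{cases} \mu(x) \cdot \dfrac{b}{\mu([\phi]) - \mu([\phi \wedge \neg\phi])} & \text{iff } x \in [\phi] \setminus [\phi \wedge \neg\phi] \\[1ex] \mu(x) \cdot \dfrac{d}{\mu([\neg\phi]) - \mu([\phi \wedge \neg\phi])} & \text{iff } x \in [\neg\phi] \setminus [\phi \wedge \neg\phi] \\[1ex] \mu(x) \cdot \dfrac{c}{\mu([\phi \wedge \neg\phi])} & \text{iff } x \in [\phi \wedge \neg\phi] \\[1ex] \mu(x) \cdot \dfrac{u}{1 - \mu([\phi \vee \neg\phi])} & \text{else} \end{cases}$$
Fact 5. Four-valued Jeffrey updating is successful, i.e. for any probabilistic model $M = \langle \Sigma, \mu, v^+, v^- \rangle$, any $\phi \in \mathcal{L}_{Prop}$ and any $(b, d, u, c) \in [0, 1]^4$ that is admissible for $\phi$, the four-valued Jeffrey update on $M$ setting the probability of $\phi$ to $(b, d, u, c)$ satisfies $\hat{p}_{\mu^{\phi,(b,d,u,c)}}(\phi) = (b, d, u, c)$.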
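The cell-wise rescaling and the success property of Fact 5 can be sketched in Python on a finite model. The admissibility check below is an assumption made for this illustration only: it requires the target vector to sum to 1 and forbids expanding a cell of prior measure 0, in line with the remark on linear expansion above. All names and numbers are hypothetical.

```python
from fractions import Fraction

def cells(mu, true_phi, false_phi):
    """Partition the states into pure belief, pure disbelief, uncertainty, conflict."""
    return {
        "b": true_phi - false_phi,
        "d": false_phi - true_phi,
        "u": set(mu) - (true_phi | false_phi),
        "c": true_phi & false_phi,
    }

def four_valued(mu, true_phi, false_phi):
    """Return the vector (b, d, u, c) that mu assigns to phi."""
    part = cells(mu, true_phi, false_phi)
    return tuple(sum(mu[x] for x in part[key]) for key in "bduc")

def four_valued_jeffrey(mu, true_phi, false_phi, target):
    """Rescale each of the four cells linearly to its target size."""
    if sum(target) != 1:
        raise ValueError("the target vector must sum to 1")
    part = cells(mu, true_phi, false_phi)
    updated = {}
    for key, new_size in zip("bduc", target):
        old_size = sum(mu[x] for x in part[key])
        if old_size == 0 and new_size != 0:
            # Assumed admissibility condition: an empty cell cannot be expanded.
            raise ValueError(f"cell {key!r} has measure 0 and cannot be expanded")
        for x in part[key]:
            updated[x] = mu[x] * new_size / old_size if old_size else mu[x]
    return updated

# Hypothetical model: phi true at {t1, t2, g}, false at {f, g}; n is a gap.
mu = {"t1": Fraction(1, 5), "t2": Fraction(1, 5), "f": Fraction(3, 10),
      "g": Fraction(1, 5), "n": Fraction(1, 10)}
true_phi, false_phi = {"t1", "t2", "g"}, {"f", "g"}
target = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8))

updated = four_valued_jeffrey(mu, true_phi, false_phi, target)
print(four_valued(updated, true_phi, false_phi))
```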
Just as in the case of non-standard Jeffrey conditioning, we obtain a purely syntactic characterization of four-valued Jeffrey updating. Unfortunately, the drop in elegance with respect to standard Jeffrey updating is significant.
To check correctness of the above equations, it then suffices to verify that the formulas pick out the respective fields, i.e. that b ϕ,ψ is the size of field 1, b ϕ,ψ is the size of field 5, d ϕ,ϕ,ψ,ψ´d ϕ,ϕ,ψ´cϕ,ϕ,ψ`c ϕ,ϕ,ψ,ψ is the size of field 9 and so on. That this is the case follows from the pictures in Figure 4, showing the belief and disbelief sets for certain composites of ϕ and ψ.
Again, the latter set of equations can be read purely syntactically. Thus, we get a syntactic counterpart to semantic four-valued Jeffrey updates.
Fact 6. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$ be a probabilistic model, let $\phi \in \mathcal{L}_{Prop}$ and let $(b, d, u, c) \in [0, 1]^4$ be admissible for $\phi$. Then $\hat{p}_{\mu^{\phi,(b,d,u,c)}} = \hat{p}_\mu^{\phi,(b,d,u,c)}$.
We will hence omit the distinction between semantic and syntactic and only speak of four-valued Jeffrey updating. We end this section with three facts about this updating.

Four-valued Bayesian updating
Just as in the classical case, we can define four-valued Bayesian updating as a special instance of Jeffrey updating where the information acquired is extremal.

Interaction Principles
Using the translation functions $\mathrm{tr}^{4}_{ns}$ and $\mathrm{tr}^{ns}_{4}$, both notions of Jeffrey conditioning, non-standard and four-valued, work on both types of probability functions defined, non-standard and four-valued. However, the notions of updating do not correspond to each other. While non-standard Jeffrey conditioning applies to situations where only the probability of $\phi$ is set, without any mention of the probabilities of $\neg\phi$ or $\phi \wedge \neg\phi$, four-valued Jeffrey conditioning covers cases where new probabilities of $\phi$, $\neg\phi$ and the corresponding gap and glut are all prescribed simultaneously. Hence, even after appropriate transformations of their domains with $\mathrm{tr}^{4}_{ns}$ and $\mathrm{tr}^{ns}_{4}$, the two types of Jeffrey updates are not interdefinable. This, however, changes if we move to non-standard and four-valued Bayesian updating. Each of the three types of four-valued Bayesian updating is equivalent to a composition of two steps of non-standard Bayesian updating. Moreover, the order of these two steps does not matter.
Lemma 6. Let $\hat{p} : \mathcal{L}_{Prop} \to \mathbb{R}^4$ be a four-valued probability assignment and let $\phi \in \mathcal{L}_{Prop}$.

Conditioning on Partial Information
In the previous sections we investigated updating a probability function with a generalized Jeffrey rule by learning either only a new value for the belief in $\phi$ (Section 7.1) or the entire four-valued probability vector assigned to $\phi$ (Section 7.2). However, there may be other contexts where the agent acquires partial information about the (four-valued) probability of $\phi$, e.g. only a new value for pure belief or pure disbelief in $\phi$.
The idea for conditioning on partial information proceeds along the same lines as for complete information, i.e. by a modified version of Jeffrey conditioning. The only difference is that the partiality of information, say about $\phi$, does not permit working with the full partition induced by $\phi$ on a model $M$, i.e. the partition into $\{[\phi] \setminus [\neg\phi],\ [\neg\phi] \setminus [\phi],\ \Sigma \setminus [\phi \vee \neg\phi],\ [\phi \wedge \neg\phi]\}$, cf. Figure 3, but only with a coarsening thereof.
By obtaining partial information we mean that the agent learns the values of a partial assignment $a : \{b, d, u, c\} \rightharpoonup [0, 1]$, i.e. an assignment prescribing new values for some of the agent's pure belief, pure disbelief, uncertainty and conflict, but not necessarily for all. Let us denote the domain of $a$, i.e. those $x \in \{b, d, u, c\}$ for which $a(x)$ is defined, by $dom(a)$. For simplicity, we assume that $\emptyset \subset dom(a) \subset \{b, d, u, c\}$ with both inclusions strict. Following the same intuitions as in the four-valued case, we can define conditioning on the partial information $a$ by setting the new pure belief, disbelief, uncertainty and conflict in $\phi$ to be $a(b)$, $a(d)$, $a(u)$ and $a(c)$ respectively whenever this is defined, and afterwards rescaling the probabilistic mass on the remaining area appropriately.
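Formally, to ensure that the corresponding operation is well-defined, we need to assume that $\sum_{y \in dom(a)} a(y) \leq 1$. Denoting the prior four-valued probability vector of $\phi$ with $(b_\phi, d_\phi, u_\phi, c_\phi)$, the Jeffrey updating sketched above will lead to the posterior four-valued probability vector $(\bar{b}_\phi, \bar{d}_\phi, \bar{u}_\phi, \bar{c}_\phi)$ with:
$$\bar{x}_\phi = \begin{cases} a(x) & \text{if } x \in dom(a) \\[1ex] x_\phi \cdot \dfrac{1 - \sum_{y \in dom(a)} a(y)}{\sum_{y \notin dom(a)} y_\phi} & \text{else} \end{cases}$$
for $x \in \{b, d, u, c\}$. With this, we can formally define partial Jeffrey updating.
Definition 16. Let $a : \{b, d, u, c\} \rightharpoonup [0, 1]$ be a partial assignment such that $\sum_{y \in dom(a)} a(y) \leq 1$. Let $M = \langle \Sigma, \mu, v^+, v^- \rangle$ be a model, let $\phi \in \mathcal{L}_{Prop}$ and let the vector $(\bar{b}_\phi, \bar{d}_\phi, \bar{u}_\phi, \bar{c}_\phi)$ defined above be admissible for $\phi$. Then the four-valued Jeffrey update of $\phi$ on the partial information $a$ is defined as the four-valued Jeffrey update on $\phi$ to $(\bar{b}_\phi, \bar{d}_\phi, \bar{u}_\phi, \bar{c}_\phi)$.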
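The posterior vector used in partial Jeffrey updating can be computed in a few lines. The Python sketch below assumes that the components outside $dom(a)$ are rescaled proportionally to their prior sizes so that the whole vector again sums to 1; this reading of "rescaling the probabilistic mass on the remaining area" is an assumption of the example. The function name `partial_target` and the numbers are hypothetical.

```python
from fractions import Fraction

def partial_target(prior, a):
    """Posterior vector for partial information.

    prior -- dict with keys 'b', 'd', 'u', 'c' holding the prior vector of phi
    a     -- dict on a non-empty proper subset of those keys with prescribed values
    """
    fixed = sum(a.values())
    if fixed > 1:
        raise ValueError("the prescribed values must sum to at most 1")
    rest = sum(value for key, value in prior.items() if key not in a)
    if rest == 0 and fixed != 1:
        raise ValueError("no prior mass left that could be rescaled")
    # Assumed rule: unspecified components share the remaining mass proportionally.
    scale = (1 - fixed) / rest if rest else 0
    return {key: a[key] if key in a else prior[key] * scale for key in prior}

prior = {"b": Fraction(2, 5), "d": Fraction(3, 10),
         "u": Fraction(1, 10), "c": Fraction(1, 5)}
posterior = partial_target(prior, {"b": Fraction(1, 2)})
print(posterior)
```

The resulting vector can then be passed to a four-valued Jeffrey update, provided it is admissible for $\phi$.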

Aggregation
Assume two agents informed you about their credences in $\phi$. You take both agents as similarly competent and equally informed. Yet, they equip you with different assessments of $\phi$. How, then, should you combine these judgments towards forming your own belief about $\phi$? Within standard probability theory, your options are fairly limited. You may, for instance, decide to follow one of the agents, or build a weighted average between the two. A broad number of approaches in the literature on peer disagreement, for instance, advocate splitting the difference equally; see for instance Elga (2007) and Christensen (2007) on conciliationism, but also Kelly (2010) for an opposing opinion.

Aggregating non-standard probabilities
Within the non-standard probabilities studied here, further options open up. First, note that within classic probability theory, learning about the agent's credence in $\psi$ also informs us about her degree of belief in $\neg\psi$. This does not hold true within the current non-standard setting. Hence, let us assume for the current analysis that agents inform us about both their positive and negative attitude towards $\phi$, that is about $p(\phi)$ and $p(\neg\phi)$, or even about their four-valued vector $\hat{p}(\phi)$. Of course, we may follow the previous strategies and form weighted averages between the agents' assessments of $\phi$. If needed, this policy could be extended to also take a weighted average of the agents' conflict and uncertainty and, more generally, their remaining belief set.
Definition 17. Let $k \in [0, 1]$. i) Assume agents $A$ and $E$ provide their non-standard assessments of $\phi$, i.e. $p_A(\phi)$, $p_E(\phi)$, $p_A(\neg\phi)$ and $p_E(\neg\phi)$. Then their $k$-weighted non-standard aggregate belief $p^k_{\{A,E\}}$ is defined by $p^k_{\{A,E\}}(\phi) = k\,p_A(\phi) + (1-k)\,p_E(\phi)$ and $p^k_{\{A,E\}}(\neg\phi) = k\,p_A(\neg\phi) + (1-k)\,p_E(\neg\phi)$.
ii) For agent $A$'s and $E$'s four-valued probability assessments $(b, d, u, c)_A$ and $(b, d, u, c)_E$ for $\phi$, i.e. $\hat{p}_A(\phi)$ and $\hat{p}_E(\phi)$, their $k$-weighted four-valued aggregate belief $\hat{p}^k_{\{A,E\}}$ is: $\hat{p}^k_{\{A,E\}}(\phi) = k\,\hat{p}_A(\phi) + (1-k)\,\hat{p}_E(\phi)$.
Lemma 7. Weighted averaging can be applied to an entire belief base simultaneously. That is, when agents $A$ and $E$ both provide their full subjective non-standard probability functions $p_A, p_E : \mathcal{L}_{Prop} \to \mathbb{R}$ (resp. $\hat{p}_A, \hat{p}_E : \mathcal{L}_{Prop} \to \mathbb{R}^4$), their $k$-weighted average $k\,p_A + (1-k)\,p_E$ (resp. $k\,\hat{p}_A + (1-k)\,\hat{p}_E$) is again a non-standard (resp. four-valued) probability function.
Non-standard beliefs, however, allow for further aggregation policies that do not have classic counterparts. Credulous agents, for instance, could opt for the maximal values of their input in terms of belief and disbelief simultaneously. That is, they could set their updated belief and disbelief in $\phi$ to be $\max(p_A(\phi), p_E(\phi))$ and $\max(p_A(\neg\phi), p_E(\neg\phi))$ respectively. Likewise, cautious agents may rather choose to believe and disbelieve $\phi$ only to an amount supported by all input information. Such agents would set their belief and disbelief in $\phi$ to $\min(p_A(\phi), p_E(\phi))$ and $\min(p_A(\neg\phi), p_E(\neg\phi))$ respectively.
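Weighted averaging as in Definition 17 can be illustrated with a short Python sketch. It averages two hypothetical four-valued vectors $(b, d, u, c)$ componentwise and confirms that the result is again a vector summing to 1, and that the induced belief $b + c$ is the weighted average of the agents' beliefs. The function name and all numbers are assumptions made for illustration.

```python
from fractions import Fraction

def weighted_average(k, vec_a, vec_e):
    """k-weighted average of two four-valued vectors (b, d, u, c)."""
    if not 0 <= k <= 1:
        raise ValueError("the weight k must lie in [0, 1]")
    return tuple(k * x + (1 - k) * y for x, y in zip(vec_a, vec_e))

# Hypothetical assessments of phi by agents A and E, in the order (b, d, u, c).
vec_a = (Fraction(3, 5), Fraction(1, 5), Fraction(1, 10), Fraction(1, 10))
vec_e = (Fraction(1, 5), Fraction(2, 5), Fraction(1, 5), Fraction(1, 5))
k = Fraction(1, 4)

aggregate = weighted_average(k, vec_a, vec_e)

# Belief in phi is pure belief plus conflict, b + c.
belief_a = vec_a[0] + vec_a[3]
belief_e = vec_e[0] + vec_e[3]
aggregate_belief = aggregate[0] + aggregate[3]
print(aggregate, aggregate_belief)
```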

Updating rule
In special situations, further policies are conceivable. When testing the safety of a new drug, for example, agents may be extremely wary of false positives while being much less concerned with false negatives. Such an agent might decide to set her new belief in $\phi$ to $\min(p_A(\phi), p_E(\phi))$ while adopting $\max(p_A(\neg\phi), p_E(\neg\phi))$ as new disbelief in $\phi$. Likewise, the combination of $\max(p_A(\phi), p_E(\phi))$ with $\min(p_A(\neg\phi), p_E(\neg\phi))$ is conceivable. In some sense, the latter two policies are aggregation functions that minimize type I and type II errors. For lack of a better name we call these pessimist and optimist updating rules respectively. See Table 1 for an overview.
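The four policies differ only in whether the maximum or the minimum is taken on belief and on disbelief. A minimal Python sketch, with a hypothetical function name and hypothetical input values, makes this explicit:

```python
# Each policy is a pair: (operation on belief in phi, operation on disbelief in phi).
POLICIES = {
    "credulous": (max, max),
    "cautious": (min, min),
    "pessimist": (min, max),
    "optimist": (max, min),
}

def aggregate(policy, assessment_a, assessment_e):
    """Combine two assessments given as pairs (p(phi), p(not-phi))."""
    on_belief, on_disbelief = POLICIES[policy]
    return (on_belief(assessment_a[0], assessment_e[0]),
            on_disbelief(assessment_a[1], assessment_e[1]))

# Hypothetical inputs: agent A reports (0.7, 0.2), agent E reports (0.4, 0.5).
agent_a, agent_e = (0.7, 0.2), (0.4, 0.5)
results = {name: aggregate(name, agent_a, agent_e) for name in POLICIES}
for name, value in results.items():
    print(name, value)
```

Since `max` and `min` return one of their arguments unchanged, the aggregated values are always values that one of the agents actually reported.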
Unlike weighted average, none of these four policies can be applied to an entire belief set simultaneously.
Fact 9. Let $p_A$ and $p_E$ be such that $p_A(\phi) = 1$ and $p_A(\neg\phi) = p_A(\phi \wedge \neg\phi) = 0$, while $p_E(\neg\phi) = 1$ and $p_E(\phi) = p_E(\phi \wedge \neg\phi) = 0$. Then $p_A$ and $p_E$ are consistent, but the function $p_{\{A,E\}}$ defined by $p_{\{A,E\}}(*) = \max(p_A(*), p_E(*))$ for $* \in \{\phi, \neg\phi, \phi \wedge \neg\phi\}$ is not.
Proof. To see that $p_A$ and $p_E$ are consistent, consider a non-standard model with three worlds $x, y, z$ and $v^+(p) = \{x, y\}$, $v^-(p) = \{x, z\}$. The measure $\mu_A$ putting all weight on $y$ is such that $p_{\mu_A}(*) = p_A(*)$ for $* \in \{\phi, \neg\phi, \phi \wedge \neg\phi\}$, showing $p_A$ consistent by Lemma 1. Likewise, the measure $\mu_E$ putting all weight on $z$ shows $p_E$ consistent. For the inconsistency of $p_{\{A,E\}}$, finally, note that $p_{\{A,E\}}(\phi \wedge \neg\phi) = 0$ and $p_{\{A,E\}}(\phi) = p_{\{A,E\}}(\neg\phi) = 1$. Plugging these three values into (A3) yields $0 + p_{\{A,E\}}(\phi \vee \neg\phi) = 2$, contradicting (A1).
Likewise, the missing conditions for cautious updates cannot be retrieved by extending the policy of taking minima to the agents' assessments of $\phi \vee \neg\phi$, as can be seen from the previous Fact. In particular, there is no counterpart to Lemma 7 for credulous or cautious update. Neither can be performed for all $\phi \in \mathcal{L}_{Prop}$ simultaneously.
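The arithmetic behind this proof can be replayed in a few lines of Python. The sketch only tests one necessary condition for consistency, namely that the value of $\phi \vee \neg\phi$ forced by inclusion-exclusion (A3) stays within $[0, 1]$ as required by (A1). The dictionary keys and the function name are hypothetical.

```python
def forced_disjunction(p_phi, p_not_phi, p_both):
    """Value of p(phi or not-phi) forced by inclusion-exclusion (A3)."""
    return p_phi + p_not_phi - p_both

# The two extremal assessments from Fact 9.
p_a = {"phi": 1, "not_phi": 0, "both": 0}
p_e = {"phi": 0, "not_phi": 1, "both": 0}

# Credulous aggregation applied to all three formulas at once.
credulous = {key: max(p_a[key], p_e[key]) for key in p_a}

forced = forced_disjunction(credulous["phi"], credulous["not_phi"], credulous["both"])
within_bounds = 0 <= forced <= 1
print(forced, within_bounds)
```

The forced value is 2, so the componentwise maximum cannot be extended to a non-standard probability function.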
Before proceeding to four-valued updating, we compare the above policies to operations in non-probabilistic Belnap-Dunn logic. For this, recall the classic Belnap-Dunn bi-lattice of truth values $\mathrm{BD}_4$. This bi-lattice can be interpreted in two directions relating to truth values and the available information. We denote meet and join of the truth lattice by $\wedge$ and $\vee$, while meet and join of the information lattice are $\sqcap$ and $\sqcup$. Note that we can identify an assignment of $\mathrm{BD}_4$-values to some formula $\phi$ with a non-standard probability assignment of $p(\phi)$ and $p(\neg\phi)$ into $\{0, 1\}$. More specifically, assigning $\{1, 0\}$ to some $\phi$ corresponds to $p(\phi) = p(\neg\phi) = 1$, while assigning $\{1\}$, resp. $\{0\}$, to $\phi$ corresponds to $p(\phi) = 1, p(\neg\phi) = 0$ and $p(\phi) = 0, p(\neg\phi) = 1$ respectively. Value $\{\}$, finally, corresponds to $p(\phi) = p(\neg\phi) = 0$. For a probability assignment $p(\phi), p(\neg\phi) \in \{0, 1\}$, we denote the corresponding $\mathrm{BD}_4$ value by $t_p(\phi)$. Applying this correspondence, we obtain the following characterization of the four updating policies introduced above:
Lemma 8. Assume when asked about their credences in $\phi$, agents $A$ and $E$ provide extremal assignments, i.e. $p_A(\phi), p_A(\neg\phi), p_E(\phi), p_E(\neg\phi) \in \{0, 1\}$. Then
credulous update yields beliefs in $\phi$ and $\neg\phi$ that are equal to $t_{p_A}(\phi) \sqcup t_{p_E}(\phi)$,
cautious update yields beliefs in $\phi$ and $\neg\phi$ that are equal to $t_{p_A}(\phi) \sqcap t_{p_E}(\phi)$,
optimistic update yields beliefs in $\phi$ and $\neg\phi$ that are equal to $t_{p_A}(\phi) \vee t_{p_E}(\phi)$,
pessimistic update yields beliefs in $\phi$ and $\neg\phi$ that are equal to $t_{p_A}(\phi) \wedge t_{p_E}(\phi)$.
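Since there are only sixteen combinations of extremal inputs, the correspondence of Lemma 8 can be checked exhaustively. The Python sketch below encodes a $\mathrm{BD}_4$ value as the set of classical values it contains and uses the usual bi-lattice operations; the encoding and all function names are choices made for this illustration.

```python
from itertools import product

def to_bd4(belief, disbelief):
    """Map an extremal pair (p(phi), p(not-phi)) to a subset of {1, 0}."""
    return frozenset(([1] if belief else []) + ([0] if disbelief else []))

def info_join(s, t):
    return s | t

def info_meet(s, t):
    return s & t

def truth_join(s, t):
    # True if either is true; false only if both are false.
    return frozenset(([1] if 1 in s or 1 in t else [])
                     + ([0] if 0 in s and 0 in t else []))

def truth_meet(s, t):
    # True only if both are true; false if either is false.
    return frozenset(([1] if 1 in s and 1 in t else [])
                     + ([0] if 0 in s or 0 in t else []))

# Policy name -> (operation on belief, operation on disbelief, lattice operation).
POLICIES = {
    "credulous": (max, max, info_join),
    "cautious": (min, min, info_meet),
    "optimist": (max, min, truth_join),
    "pessimist": (min, max, truth_meet),
}

def policies_match_lattice():
    """Check the claimed correspondence for all sixteen extremal input pairs."""
    for pa, na, pe, ne in product((0, 1), repeat=4):
        s, t = to_bd4(pa, na), to_bd4(pe, ne)
        for on_belief, on_disbelief, lattice_op in POLICIES.values():
            aggregated = to_bd4(on_belief(pa, pe), on_disbelief(na, ne))
            if aggregated != lattice_op(s, t):
                return False
    return True

all_match = policies_match_lattice()
print(all_match)
```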
Finally, we consider the special case where both agents input classic probability values, i.e. values such that $p(\phi) + p(\neg\phi) = 1$.
Fact 10. When $p_A$ and $p_E$ are classic, i.e. $p_A(\phi) + p_A(\neg\phi) = p_E(\phi) + p_E(\neg\phi) = 1$, then the same holds for the aggregated belief when aggregation follows weighted averaging, optimistic or pessimistic updates. That is, these three rules preserve classicality. This does not hold for credulous and cautious updating. The latter two rules turn classic input beliefs of agents $A$ and $E$ into non-classic aggregate values as soon as $A$ and $E$ disagree about $p(\phi)$.
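Fact 10 can be observed on any pair of disagreeing classic inputs. The following Python sketch uses hypothetical values $p_A(\phi) = 7/10$ and $p_E(\phi) = 2/5$, a hypothetical weight $k = 1/2$, and checks for each rule whether belief and disbelief still add up to 1.

```python
from fractions import Fraction

# Classic inputs as pairs (p(phi), p(not-phi)), each summing to 1.
agent_a = (Fraction(7, 10), Fraction(3, 10))
agent_e = (Fraction(2, 5), Fraction(3, 5))
k = Fraction(1, 2)

aggregates = {
    "weighted": tuple(k * x + (1 - k) * y for x, y in zip(agent_a, agent_e)),
    "optimist": (max(agent_a[0], agent_e[0]), min(agent_a[1], agent_e[1])),
    "pessimist": (min(agent_a[0], agent_e[0]), max(agent_a[1], agent_e[1])),
    "credulous": (max(agent_a[0], agent_e[0]), max(agent_a[1], agent_e[1])),
    "cautious": (min(agent_a[0], agent_e[0]), min(agent_a[1], agent_e[1])),
}

# A pair is classic when belief and disbelief add up to exactly 1.
is_classic = {name: sum(pair) == 1 for name, pair in aggregates.items()}
print(is_classic)
```

In this example the credulous aggregate sums to 13/10 (a glut) and the cautious aggregate to 7/10 (a gap), while the other three rules return classic values.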

Aggregating four-valued probabilities.
So far, we have assumed aggregation to operate on non-standard probability assignments. Within the above framework, agents provide their subjective non-standard beliefs in both $\phi$ and $\neg\phi$, which the various aggregative mechanisms described above then merge into aggregate belief values for $\phi$ and $\neg\phi$. But of course, our agents might also provide their subjective four-valued probabilities $\hat{p}(\phi)$. Note that by the map $\mathrm{tr}^{4}_{ns}$, the non-standard probabilities $p(\phi)$ and $p(\neg\phi)$ can be calculated from the four-valued probability $\hat{p}(\phi)$. Hence, if $\hat{p}_{\{A,E\}}(\phi)$ is defined, a corresponding two-valued aggregation mechanism for $p_{\{A,E\}}(\phi)$ and $p_{\{A,E\}}(\neg\phi)$ follows immediately. However, the opposite does not hold: $p_{\{A,E\}}(\phi)$ and $p_{\{A,E\}}(\neg\phi)$ do not fully determine $\hat{p}_{\{A,E\}}(\phi)$, and hence the various policies defined in the last section do not readily translate into four-valued aggregation procedures. In fact, when employing the map $\mathrm{tr}^{ns}_{4}$, the three values $p(\phi)$, $p(\neg\phi)$ and $p(\phi \wedge \neg\phi)$ are required to determine $\hat{p}(\phi)$. In the case of weighted averaging, this is not a problem. By Lemma 7, setting $p^k_{\{A,E\}}(\phi \wedge \neg\phi) = k\,p_A(\phi \wedge \neg\phi) + (1-k)\,p_E(\phi \wedge \neg\phi)$ yields a consistent set of requirements and the corresponding four-valued aggregation rule is exactly $\hat{p}^k_{\{A,E\}}(\phi) = k\,\hat{p}_A(\phi) + (1-k)\,\hat{p}_E(\phi)$. However, the situation is different in the case of credulous or cautious updating. As shown in Fact 9, requiring that $p_{\{A,E\}}(\phi) = \max(p_A(\phi), p_E(\phi))$, $p_{\{A,E\}}(\neg\phi) = \max(p_A(\neg\phi), p_E(\neg\phi))$ and $p_{\{A,E\}}(\phi \wedge \neg\phi) = \max(p_A(\phi \wedge \neg\phi), p_E(\phi \wedge \neg\phi))$ may yield an inconsistent set of requirements. Hence, other choices are needed.
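The information flow between the two representations can be made concrete with a short Python sketch. It assumes that belief in $\phi$ is pure belief plus conflict, that disbelief is pure disbelief plus conflict, and that the probability of $\phi \wedge \neg\phi$ is the conflict, as suggested by the partition into the four cells. The function names `four_to_ns` and `ns_to_four` are hypothetical stand-ins for the two translation maps.

```python
from fractions import Fraction

def four_to_ns(vector):
    """From (b, d, u, c) to the triple (p(phi), p(not-phi), p(phi and not-phi))."""
    b, d, u, c = vector
    return (b + c, d + c, c)

def ns_to_four(p_phi, p_not_phi, p_both):
    """From the three non-standard values back to the vector (b, d, u, c)."""
    c = p_both
    b = p_phi - c
    d = p_not_phi - c
    u = 1 - b - d - c
    return (b, d, u, c)

# Hypothetical non-standard values for phi, not-phi and their conjunction.
vector = ns_to_four(Fraction(7, 10), Fraction(1, 2), Fraction(3, 10))
round_trip = four_to_ns(vector)
print(vector, round_trip)
```

Without the third argument `p_both`, the vector is underdetermined, which is the reason why policies stated only for $p(\phi)$ and $p(\neg\phi)$ do not immediately yield four-valued aggregation rules.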
The vector $\hat{p}_{\{A,E\}}(\phi)$ is determined by four choices. With two of them given by $p_{\{A,E\}}(\phi) = \max(p_A(\phi), p_E(\phi))$ and $p_{\{A,E\}}(\neg\phi) = \max(p_A(\neg\phi), p_E(\neg\phi))$, and a third by axiom (D2), one last condition is missing. In the case of credulous update, we would arguably expect that $c^{\{A,E\}}_\phi \geq \max(c^A_\phi, c^E_\phi)$: if an agent opts to be credulous about both $\phi$ and $\neg\phi$, she could not expect her conflict to fall below any of the input conflicts. Within this restriction, the definition of credulous update below assumes $c^{\{A,E\}}_\phi$ to be as close to $\max(c^A_\phi, c^E_\phi)$ as possible while maintaining consistency.

The algebraic structure of credulous and cautious aggregation.
Proposition 2. The subjective four-valued probability assignment $(0, 0, 0, 1)$, i.e. the element of maximal conflict, is a neutral element with respect to cautious updating. Likewise, the subjective four-valued probability assignment $(0, 0, 1, 0)$, representing maximal uncertainty, is neutral with respect to credulous updating.
Proof. Let $\hat{p}_A(\phi) = (0, 0, 0, 1)$, let $\hat{p}_E(\phi) = (b, d, u, c)$ be arbitrary and denote the result of cautious updating by $(B, D, U, C)$. Then by definition $B + C = b + c$ and $D + C = d + c$, which implies
$$B + D + 2C = b + d + 2c. \quad (4)$$
Using this, the last condition of cautious updating yields $U = \max(0, u, 1 - b - d - 2c)$. Since $u = 1 - b - d - c$, this implies $U = u$. Together with $1 = U + B + C + D = u + b + c + d$, it follows that $B + C + D = b + c + d$. In combination with equation (4), this implies $C = c$. With this, $B + C = b + c$ and $D + C = d + c$ imply that $B = b$ and $D = d$. The proof for the second claim follows from a similar argument.

Conclusions
Many classical approaches to reasoning address idealized situations, where the agents' information is consistent, closed under logical implication, and possibly even complete. These assumptions, of course, are at odds with many realistic reasoning scenarios, where the available evidence may be scarce and memory or observation faulty. In short, there is no guarantee for our available information to be consistent, nor complete. Yet, we would arguably hold that some valid inferences can be drawn from such imperfect information, as partial incompleteness or local contradictions may not preclude us from drawing conclusions about other parts of the data. As automated reasoning systems are becoming increasingly important, there is a need for a rigorous formal treatment of inferences from non-ideal information. To this end, a wealth of non-classical logical systems for dealing with uncertainty or conflict has been put forward, with Belnap-Dunn logic (BD) arguably the most prominent such framework.
However, the reasons for moving to non-normal, BD-like frameworks apply equally well to probabilistic settings. Agents may, for instance, have inconclusive, probabilistic evidence for the truth or falsity of various statements. Just as in the classic case, if such information comes from different sources or different experiments, it need not add up to 1, nor be mutually exclusive. It hence seems natural to investigate probabilistic extensions of BD. This was the focus of the current paper.
Paralleling recent work by Dunn (cf. Dunn, 2010; Dunn and Kiefer, 2019), we have investigated four-valued probability assignments that permit agents to have probabilistic beliefs about the truth and falsity of a statement, and about its gaps and gluts. More specifically, we have provided a theory of four-valued probabilities that slightly departs from Dunn's in its treatment of conjunctions. Yet, both are generalizations of Belnap-Dunn logic in that they coincide with BD whenever all probabilities are extremal, i.e. only assume the values of 0 and 1.
In this paper, we have clarified the connection between our four-valued probabilities and single valued non-standard probabilities as introduced by Childers, Majer and Milne (2019). By providing a translation function between the two approaches, we have shown these to be equivalent. Moreover, we have introduced probabilistic models as semantics for four-valued probabilities, and have provided a sound and complete axiomatization with respect to the class of all such models. Lastly, we have enriched our frameworks with dynamical operations for updating and aggregation. As for the former, we have provided versions of Jeffrey and Bayes' conditioning that work in non-standard and four-valued settings and have clarified the relation between these. For aggregation, finally, we have studied a host of different aggregation policies, some of which go beyond what is available in classic probabilistic settings.
Of course, there are other approaches to weakening classic probability theory, not all of which have a corresponding logic as starting point. Many such approaches take probability or weights as central notion, but consider various cases where no exact probabilistic information is available. A typical example is inner measures intended to approximate probability from below (Fagin and Halpern, 1991). Their underlying idea, briefly, is that an agent might lack probabilistic evidence about some proposition $\phi$, for instance when $\phi$ is not in the algebra of (possible) observations. The agent may, though, estimate a lower bound for the probability of $\phi$ by building on her available information about other propositions. Formally, this gives rise to inner measures that only satisfy super-additivity instead of the classic additivity, i.e. $\mu_*(\phi \vee \psi) \geq \mu_*(\phi) + \mu_*(\psi)$, where $\phi \wedge \psi$ is a classical contradiction.
A related weakening of classic probability theory is the Dempster-Shafer (DS) theory of belief (Shafer, 1976; Halpern, 2017). The starting point of this theory is an agent's evidence about some state of affairs, usually represented as a normalized measure on a boolean algebra of possible observations. This evidence then gives rise to a belief function, where $Bel(\phi)$, the belief in some $\phi$, is derived from all pieces of evidence that entail $\phi$. As the agent might have strong evidence for a compound event, say $\psi \vee \phi$, without having much evidence that entails either of its components alone, this belief function is super-additive in the sense defined above. More specifically, the degree of support for some $A$ need not be complementary to the support of $\neg A$. That is, $Bel(A)$ may be less than $1 - Bel(\neg A)$, just as in our framework. While $Bel(A)$ can be seen as a lower bound for the classical probability of $A$, the term $1 - Bel(\neg A)$, sometimes called the plausibility of $A$, is its upper bound. The interval between both is then interpreted as the agent's uncertainty about $A$. As our presentation suggests, there is a tight connection between DS theory and inner measure approaches: both are equivalent, at least on a syntactic level where probabilities are associated to formulas, rather than states (Fagin and Halpern, 1991; Zhou, 2013).
Both inner probability approaches and DS theory differ in two ways from our framework. In one dimension, our framework is more general than DS belief functions or inner probabilities, as it admits not only uncertainty but also conflict in probability assignments. By allowing for gluts, non-standard and four-valued probability assignments can represent contradictory information in ways that DS theory and inner measure frameworks cannot.
For a second difference consider a classic tautology such as $p \vee \neg p$. Working on a classical meta-theory, DS theory associates a probability of 1 to this tautology. Yet, when evidence is scarce, the belief values assigned to $p$ and $\neg p$ need not add up to one, exemplifying the above super-additivity. In fact, it is compatible with DS theory that both $p$ and $\neg p$ are even assigned a belief of zero. In our framework, in contrast, uncertainty or conflict derive straight from the information available about $p$ and $\neg p$, rather than from evidence about some larger proposition. Working with a non-classic BD meta-theory, non-classic information about literals extends to complex formulas such as $p \vee \neg p$, as witnessed in the inclusion-exclusion axiom (A3). This axiom, in fact, can be seen to stand in direct opposition to the theory of inner measures. Our axiom (A3) implies a subadditivity property (i.e. $\mu_*(\phi \vee \psi) \leq \mu_*(\phi) + \mu_*(\psi)$ when $\phi \wedge \psi$ is a classical contradiction), in contrast to the superadditivity of DS theory and inner measures. A detailed comparison between DS belief functions and our approach would require a more careful analysis that exceeds the scope of this article. We leave this for future work.
Finally, another open line of inquiry concerns practical implications of the present framework. One may, for instance, ask how an ideally rational agent is to act if she has only imperfect information at her disposal. In future work, we hope to sketch the contours of a non-standard decision theory, that rests on four-valued probabilities in the same manner as traditional decision theory employs classic probability. Doing so, we hope, can help to fill a gap between current frameworks for decisions under risk and under uncertainty.