1 Introduction

This article gives an extensive treatment of reasoning with ambiguity, more precisely with ambiguous propositions. We approach the problem from an algebraic and a logical perspective and establish surprising results on both ends, which lead up to philosophical questions that we address in a preliminary fashion. The term linguistic ambiguity roughly designates cases where expressions of natural language give rise to two or more, though finitely many, sharply distinguished meanings. We leave it for now at this brief and intuitive definition,Footnote 1 since we will be rather explicit about its properties later on, and we rely on the fact that even non-linguists have very stable intuitions about what ambiguity is (though distinguishing it from vagueness and polysemy probably requires some additional knowledge). In linguistics, ambiguity is usually considered a very heterogeneous phenomenon, and this is certainly true insofar as it can arise from many different sources: from the lexicon, from syntactic derivations, from semantic composition as in quantifier scope ambiguity (this is sometimes reduced to syntax), from literal versus collocational meanings, and probably more sources still, such as metaphor. Nonetheless, there is something common to all these phenomena, hence it makes sense to think of ambiguity as one single phenomenon.

We have recently argued (see Wurm and Lichte 2016) that the best solution is to treat ambiguity consistently as part of semantics, because there are some properties which are consistently present regardless of its source. The advantage of this unified treatment is that having ambiguity in semantics, we can use all semantic resources in order to resolve it and draw inferences from it (we will be more explicit below). It is a remarkable fact that even though ambiguity is a pervasive phenomenon in natural language, it usually does not seem to pose any problems for speakers: in some cases, we do not even notice ambiguity (as in (1)), whereas in other cases, we can also perfectly reason with and draw inferences from ambiguous information (as in (2)):

figure a

In (1), uttered in an appropriate situation to a non-linguist, hardly any listener would think about the mysterious time flies. Conversely, in (2) everyone notices the ambiguity, but still, without any explicit reasoning, the conclusion that in New York there is at least one big car (and probably many more) seems immediate to us. Hence we can easily draw inferences from ambiguous statements. This is in line with psycholinguistic findings that “inference is easy, articulation is costly” (see Piantadosi et al. 2011), and hence ambiguity is to be expected in a language shaped by convenience. This entails two things for us:

  1.

    We should rather not disambiguate (i.e. decide on a reading) before we start constructing semantics, as otherwise at least one reading remains unavailable, and soundness of inferences cannot be verified.

  2.

    Hence we have to be able to construct something like “ambiguous meanings”, and we have to be able to reason with them.

As regards 1., we have to add that from a psychological point of view, it is often plausible to assume that we disambiguate before we interpret a statement, in the sense that even though a sentence is ambiguous between \(m_1\) and \(m_2\), only one of the meanings is constructed or even perceived (see (1)). However, from a logical point of view this prevents sound reasoning, and our goal here is to provide a theory for sound (and complete) reasoning with ambiguity, not a psychological theory. We will investigate the matter of reasoning with ambiguity thoroughly, which will lead to many results that are surprising from a mathematical point of view and interesting from a philosophical one.

In Sect. 2, we will lay the conceptual foundations and explain what we mean by ambiguity and what, in our view, its main properties are. The rest of the paper is devoted to a formal approach to reasoning with ambiguity; we let the ambiguity between two meanings \(m_1,m_2\) be denoted by \(m_1\Vert m_2\).

In Sect. 3, we will try to tackle the problem algebraically, by introducing three classes of algebras, all extensions of Boolean algebras with a binary operator \(\Vert \). These are strong and weak ambiguous algebras and universal distribution algebras (denoted by \(\mathbf {SAA}\), \(\mathbf {WAA}\) and \(\mathbf {UDA}\)). \(\mathbf {WAA}\) was introduced in Wurm and Lichte (2016), and \(\mathbf {UDA}\) in Wurm (2017). These algebras have, at first glance, innocuous axioms which implement unquestionable properties of ambiguity. However, we will show that strongly counterintuitive properties hold in all of them, and moreover that the equational theories of these three classes actually coincide. These results are surprising and interesting from an algebraic point of view, and leave us with the main paradox we have already pointed out in earlier publications (Wurm and Lichte 2016): how can axioms which are intuitively correct beyond doubt lead to properties which are intuitively incorrect beyond doubt? There is one obvious way out, which consists in using partial algebras. This, however, does not provide us with satisfying results either, hence we only mention this possibility and show some rather negative results. Our solution is to say: algebra itself is the problem, more precisely, the fact that we use a congruence which disregards the syntactic form of terms. This problem obviously cannot be solved in an algebraic fashion, hence we use logic to approach it.

In Sect. 4, we introduce the logic \(\mathsf {AL}\), a logic which extends classical logic with an additional connective \(\Vert \) corresponding to ambiguity. We provide a Gentzen-style calculus for this logic and prove it sound and complete for \(\mathbf {UDA}\) [and hence as well for \(\mathbf {SAA}\); the former has already been proved in Wurm (2017)]. Here, the rule (cut) ensures we have congruence as in algebra.

In Sect. 5 we present elementary results on the proof theory of \(\mathsf {AL}\) and its cut-free version \(\mathsf {AL}^\textit{cf}\), the key results being the following: many important rules, like the logical rules corresponding to universal distribution, are admissible in the cut-free calculus; but the cut rule itself is not admissible. Whereas this is usually considered a negative result, for us it is positive: \(\mathsf {AL}\) (with cut), being complete for \(\mathbf {UDA}\), is too strong for our purposes.

In Sect. 6 we put forward our main hypothesis: the cut-free logic \(\mathsf {AL}^{\textit{cf}}\) (arguably with or without commutativity of \(\Vert \)) is the correct tool for reasoning with ambiguity, that is, it covers all and only the correct inferences. We present some evidence for this hypothesis, though of course it is impossible to formally prove it. Cut-free \(\mathsf {AL}^\textit{cf}\) is incongruent, that is, there is a difference between (1) being logically equivalent (\(\alpha \) entails \(\beta \), \(\beta \) entails \(\alpha \)), and (2) being substitutable in all contexts while preserving truth of implications. We provide cut-free \(\mathsf {AL}^\textit{cf}\) with a semantics which is also incongruent in the above sense. We then prove soundness and completeness for this semantics. The semantics is based on strings, hence our completeness proof also provides a sort of representation theorem for ambiguous meanings, where, roughly speaking, concatenation represents ambiguity, and a string represents an “ambiguous normal form”, that is, a list of unambiguous meanings.

Finally, we will discuss the meaning of our results for the nature of ambiguity. Assuming our main hypothesis is correct, reasoning with ambiguity presupposes incongruence, that is, logical equivalence does not entail substitutability. In other words: the syntactic form of formulas matters beyond equivalence. Even if we treat ambiguity semantically, there remains something syntactic to it. This is the final insight provided by the quest for the proper tool for reasoning with ambiguity, and we think it opens some philosophical questions on the nature of meaning which go beyond what we can address in this article.

2 Logic and the Nature of Ambiguity

2.1 Background and Some History

From a philosophical point of view, ambiguity is often considered a kind of “nemesis” of logical reasoning; for Frege, for example, the main reason to introduce his logical calculus was that it was, contrary to natural language, unambiguous. The discussion about the detrimental effect of ambiguity in philosophy can be traced back to the ancient world, see Sennet (2016), and is still going on, see Atlas (1989).Footnote 2 On the other hand, in natural language semantics, there is a long tradition of dealing with both ambiguity and logic; we will discuss three main approaches here.

In the first approach, a natural language utterance is translated into an unambiguous formal language such as predicate logic, and ambiguity becomes visible by the fact that there are several translations. To consider a famous example:Footnote 3

$$\begin{aligned}&\texttt {Every\,boy\,loves\,a\,movie.}&\quad (3)\\&\exists x.\forall y.movie(x)\wedge (boy(y)\rightarrow loves(y,x))&\quad (4)\\&\forall y.\exists x.movie(x)\wedge (boy(y)\rightarrow loves(y,x))&\quad (5) \end{aligned}$$

So ambiguity does not enter into the logic itself, but is “represented” by the fact that there are two (or more) different logical representations for one sentence. Consequently, we cannot simply translate natural language into logical representations (predicate logic or other) as a function, as there is no way to represent ambiguity in these languages. The standard way around this lack of a functional interpretation is not to interpret natural language sentences as strings, but rather their derivations: one string has several syntactic derivations, and derivations in turn are functionally mapped to semantic representations (e.g. see Montague 1973). The problem with this approach is that we basically ban ambiguity from semantics: we first make an (informed or arbitrary) choice, and then we construct an unambiguous semantics. Now this is a problem, as we have seen above:

  1.

    If we simply pick one reading, we cannot know whether a conclusion is generally valid or not, because we necessarily discard some information.

  2.

    To decide on a reading, we usually use semantic information; but if we choose a reading before constructing a semantic representation, how are we supposed to decide?

This becomes even more problematic if we have an ambiguous statement as a constituent of a larger statement. These reasons indicate that we should not prevent ambiguity from entering semantics, because semantics is where we need it, if only to get rid of it. But once ambiguity enters into semantics, we have to reason about its combinatorial, denotational and inferential properties.

A second possibility, of which authors make use (though often implicitly), is to treat ambiguity as the disjunction of meanings (see Saka 2007). However, the above example gives a good argument why this is necessarily inadequate: if we take the disjunction of (4) and (5), the resulting formula would be logically equivalent to (5) (because (4) entails (5)), hence there would not even exist an ambiguity in (3) in any reasonable sense! Apart from this, disjunction behaves differently from ambiguity when, for example, negated: disjunction obeys the De Morgan laws, whereas ambiguity remains invariant (see (6-a), (6-b); we will explain this in more detail below). Hence, importantly, ambiguity is not disjunction, though there is a relation between the two. Identifying the two is in fact a long-standing misunderstanding among many scholars, even though the difference was recognized many years ago (see for example Poesio 1994).

A third approach for representing ambiguity (as e.g. in the quantifier case) is to use a sort of meta-semantics,Footnote 4 whose expressions underspecify logical representations (see Egg 2010); famous cases in point would be Cooper storage and Hole Semantics. Assume our “unambiguous” language is the formal language of logic \({\mathcal {L}}\) (say some extension of predicate logic); in addition to this, we assume we have a meta-language \({\mathcal {M}}\), by which we can underspecify terms of \({\mathcal {L}}\). For example, let \(\chi \) be a formula of \({\mathcal {M}}\) underspecifying the two formulas \(\alpha ,\beta \) of \({\mathcal {L}}\) (for example (4) and (5)). But now that we have this meta-language \({\mathcal {M}}\) of our logic \({\mathcal {L}}\), there are new questions:

  1.

    How do we interpret formulas of \({\mathcal {M}}\)?

  2.

    How do we provide the connectives of \({\mathcal {M}}\) with a compositional semantics?

  3.

    What are the inferences both in \({\mathcal {L}}\) and \({\mathcal {M}}\) we can draw from formulas in \({\mathcal {M}}\)?

Once we start seriously addressing these questions, we see that moving to a meta-language does not solve any of our problems—at best, it removes them from our sight. We usually do have a compositional semantics and consequence relation for \({\mathcal {L}}\); for \({\mathcal {M}}\) we do not. Hence \({\mathcal {M}}\) fails to have the most basic features of a semantics, unless, of course, \({\mathcal {M}}\) itself is a logic with consequence relation and compositional semantics. But in this case, considering that \({\mathcal {M}}\) should conservatively extend \({\mathcal {L}}\), it seems to be much more reasonable to include the new operator for ambiguity directly into our object language \({\mathcal {L}}\). And this is exactly what we do here. From this example it becomes clear once more that ambiguity cannot be reasonably interpreted the same way as disjunction: because \({\mathcal {L}}\) in any normal case already has disjunction, there would be no need at all for \({\mathcal {M}}\) [this problem is discussed in more detail in van Eijck and Jaspars (1996)].

This is but a short outline of the main problems of the three usual treatments of ambiguity, namely i. moving ambiguity to syntax, ii. treating ambiguity as disjunction, and iii. using meta-(meta-)languages. In our view, none of them substantially contributes to solving the problem of reasoning with ambiguity. We will now lay out what for us are the key features of ambiguity, which at the same time are the main challenges in developing a logic of ambiguity. For a more extensive treatment of some aspects, we refer the reader to Wurm and Lichte (2016).

2.2 Key Aspects of Ambiguity

We think that the crucial point in distinguishing ambiguity from related phenomena like vagueness or sense generality lies in considering the combinatorial, denotational and inferential properties of ambiguity separately. Whereas the latter two are closely related, the combinatorial properties are rather distinct.

One important distinction has to be made from the outset, namely the one between what we might call local and global ambiguity. For example, the word can is ambiguous between a noun and an auxiliary; however, it will probably not contribute to the ambiguity of any sentence, because the correct syntactic category can be inferred from its context, and hence the ambiguity remains local. Local ambiguity is thus ambiguity which can be definitely discarded at some level by syntactic or combinatoric properties alone, and therefore can never enter semantics. What is interesting for us is global ambiguity, which cannot be disambiguated on the basis of morpho-syntactic combinatorics. Note that even in the context of financial transactions, the word bank remains globally ambiguous. This article only covers global ambiguity in this sense.

Recall that we let ambiguity be denoted by \(\Vert \); we use this symbol both as an algebraic operator and a logical connective, both binary. Hence \(a\Vert b\) can be a term in an appropriate algebra, \(\alpha \Vert \beta \) a logical formula. We use the symbol also to combine meanings, on the precise nature of which we are agnostic. We now list the main features of ambiguity.

Discreteness This is a main intuitive feature of ambiguity, in particular distinguishing it from vagueness: in ambiguity, we have a finite (usually rather small) list of meanings between which an expression is ambiguous, and these are clearly distinct. This feature is the most basic in the sense that it allows us to treat ambiguity as a binary algebraic operator or logical connective \(\Vert \). To take our typical example of the word bank: we have the two clearly distinct meanings “financial institute” and “strip of land along a river”. Note that this intuitively obvious feature of discreteness is by no means trivial, as the two clearly distinct meanings of bank are themselves vague, as are most common noun meanings (for example, how broad can a piece of land along a river be and still qualify as a river bank?).

Universal distribution For the combinatorics of \(\Vert \), the most prominent, though only recently highlighted (see Wurm and Lichte 2016), feature of ambiguity is the fact that it distributes equally over all other connectives. To see this, consider the following examples:

figure b

(6-a) is ambiguous between \(m_1=\) “there is a financial institute” and \(m_2=\) “there is a strip of land along a river”. When we negate this, the ambiguity remains, with negated content: (6-b) is ambiguous between \(n_1=\) “there is no financial institute” and \(n_2=\) “there is no strip of land along a river”, and importantly, the relation between the two meanings \(n_1\) and \(n_2\) is intuitively exactly the same as the one between \(m_1\) and \(m_2\). This distinguishes an ambiguous expression such as bank from a hypernym such as vehicle, which is just more general than the meanings “car” and “bike”:

figure c

(7-a) means (arguably): “there was a car or there was a bike or ...”; but (7-b) rather means: “there was no car and there was no bike and ...”. Hence when negated, the relation between the meanings changes from a disjunction to a conjunction (as we expect from a classical logical point of view); but for ambiguity, nothing like this happens: the relation remains invariant. The same holds for distribution over all other connectives/operations (see Wurm and Lichte 2016). This invariance is the first point where we see a clear difference between ambiguity and disjunction, and we consider this property of universal distribution to be most characteristic of ambiguity. Universal distribution seems to be strongly related to another observation: we can treat ambiguity as something which happens in semantics (as we do here), or we can treat it as a “syntactic” phenomenon, where “syntactic” is to be conceived in a very broad sense. In our example, the syntactic approach would be to say: there is not one word (as form-meaning pair) bank, but rather two words bank\(_1\) and bank\(_2\), bearing different meanings.Footnote 5 The same holds for genuine syntactic ambiguity: one does not assume that the sentence I have seen a man with a telescope has, strictly speaking, two meanings; rather, one assumes it has two derivations, where each derivation comes with a single meaning. Universal distribution is what makes sure that the semantic and the syntactic treatment are completely parallel: every operation f on an ambiguous meaning \(m_1\Vert m_2\) equals an ambiguity between two (identical) operations on two distinct meanings, hence

$$\begin{aligned} f(m_1\Vert m_2)=f(m_1)\Vert f(m_2)\qquad (8) \end{aligned}$$

Note that in cases where we combine ambiguous meanings with ambiguous meanings, this leads to an exponential growth of ambiguity, as is expected. Hence universal distribution is what creates the parallelism between semantic and syntactic treatment of ambiguity. This means: strictly speaking, we do not even need to argue whether ambiguity is a syntactic or semantic phenomenon—because the result in the end should be the same, it is of no relevance where ambiguity comes from. However, as soon as we start to reason with ambiguity, a unified semantic treatment will only have advantages, as all information is in one place. If we consider propositional logic, (8) reduces to

$$\begin{aligned} \lnot (\alpha \Vert \beta )&\equiv \lnot \alpha \Vert \lnot \beta \end{aligned}$$
$$\begin{aligned} (\alpha \Vert \beta )\vee \gamma&\equiv (\alpha \vee \gamma )\Vert (\beta \vee \gamma ) \end{aligned}$$
$$\begin{aligned} (\alpha \Vert \beta )\wedge \gamma&\equiv (\alpha \wedge \gamma )\Vert (\beta \wedge \gamma ) \end{aligned}$$
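The distribution laws can be made concrete in a small sketch. The representation and all names below are our own illustration, not part of the paper's formal apparatus: an ambiguous meaning is modelled as an ordered tuple of unambiguous readings, negation maps over the readings, and conjoining two ambiguous meanings multiplies them, which also shows the expected exponential growth.

```python
# Sketch: an ambiguous meaning as an ordered tuple of readings
# ("ambiguous normal form"); readings are plain strings here.
from itertools import product

def amb(*readings):
    """m1 || m2 || ... as an ordered tuple of readings."""
    return tuple(readings)

def neg(m):
    # ~(a || b) = ~a || ~b : negation distributes over the readings
    return tuple(f"~{r}" for r in m)

def conj(m, n):
    # (a || b) /\ (c || d) = (a/\c) || (a/\d) || (b/\c) || (b/\d)
    return tuple(f"({r} & {s})" for r, s in product(m, n))

bank = amb("financial_institute", "river_bank")
print(neg(bank))   # ('~financial_institute', '~river_bank')
print(len(conj(bank, amb("cheap", "nearby"))))  # 4
```

Note that the relation between the readings is left untouched by `neg`, mirroring the invariance under negation discussed above, while `conj` exhibits the multiplicative growth of readings.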

By convention, we use symbols such as \(m_1,m_2\) when we speak about (propositional) linguistic meanings, symbols like a, b, c when we speak about arbitrary algebraic objects; Greek letters \(\alpha ,\beta \) etc. are reserved for logical formulas. Logically speaking, this means that \(\Vert \) is self-dual: \(\Vert \) is preserved under negative contexts such as negation, similar to fusion in Lambek (1995) (that logic is however used for a very different purpose, namely the analysis of natural language syntax).

Entailments An ambiguity \(m_1\Vert m_2\) is generally characterized by the fact that the speaker intends one of \(m_1\) or \(m_2\). The point is: we do not know which one of the two, as for example in

figure d

From this simple fact, we can already deduce that for arbitrary formulas \(\phi ,\alpha ,\beta ,\chi \) in the logic of ambiguity, if \(\phi \vdash \alpha \vdash \chi \) and \(\phi \vdash \beta \vdash \chi \) hold, then \(\phi \vdash \alpha \Vert \beta \vdash \chi \) holds; hence in particular, \(\alpha \wedge \beta \vdash \alpha \Vert \beta \vdash \alpha \vee \beta \). But we cannot reduce \(\alpha \Vert \beta \) to either \(\alpha \) or \(\beta \): we have \(\alpha \not \vdash \alpha \Vert \beta \) and \(\beta \not \vdash \alpha \Vert \beta \), and also \(\alpha \Vert \beta \not \vdash \alpha \) and \(\alpha \Vert \beta \not \vdash \beta \). This is because our logic is supposed to model the inferences which are sound in every case (i.e. under every intention), not merely in some cases, and the latter entailments are all unsound in some cases. Hence \(\Vert \) does not coincide with any classical connective and is not definable in classical logic. It is actually a substructural connective (see Restall 2008, for an introduction), behaving similarly to fusion in linear logic: in particular, it does not allow for weakening (we will make this precise below). Note that this also illustrates how ambiguity behaves rather differently from disjunction:

figure e

Anyone who utters (13) should be satisfied if he gets handed the pastry, and also if he gets handed the money. If a speaker utters (12), he either means “pastry” or “money”, but he might complain either if you give him the money or if you give him the pastry. The conditions for satisfying (12) are thus clearly different from those for (13): in the former case, whichever of the two you give, you might end up with an angry interlocutor.
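The two special cases of the entailment pattern stated above, namely that \(\phi \vdash \alpha \Vert \beta \) requires \(\phi \) to entail every reading, and that \(\alpha \Vert \beta \vdash \chi \) requires every reading to entail \(\chi \), can be checked mechanically over truth tables. The following sketch (function names and representation are our own illustration) verifies \(\alpha \wedge \beta \vdash \alpha \Vert \beta \vdash \alpha \vee \beta \) as well as the failing directions:

```python
# Sketch: classical entailment by brute force over all valuations, plus
# the two derived rules for || from the text:
#   phi |- a||b  iff  phi |- a  and  phi |- b   (phi unambiguous)
#   a||b |- chi  iff  a |- chi  and  b |- chi   (chi unambiguous)
from itertools import product

def entails(f, g, n=2):
    # f |- g classically: g holds in every valuation that makes f true
    return all(g(*v) for v in product([False, True], repeat=n) if f(*v))

def entails_amb(phi, readings):   # phi |- r1 || r2 || ...
    return all(entails(phi, r) for r in readings)

def amb_entails(readings, chi):   # r1 || r2 || ... |- chi
    return all(entails(r, chi) for r in readings)

a = lambda x, y: x
b = lambda x, y: y

print(entails_amb(lambda x, y: x and y, [a, b]))  # True:  a/\b |- a||b
print(amb_entails([a, b], lambda x, y: x or y))   # True:  a||b |- a\/b
print(entails_amb(a, [a, b]))                     # False: a does not entail a||b
print(amb_entails([a, b], a))                     # False: a||b does not entail a
```

The two failing checks reflect that an inference from or to an ambiguous formula must be sound under every intention, not just some.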

Conservative extension In particular in connection with logic, it should be clear that our logical calculus of ambiguity should be a conservative extension of the classical calculus, meaning that for formulas not involving ambiguity, the same consequences should be valid as before. The reason is that even if we include ambiguous propositions, unambiguous propositions should behave as they did before; if there are new entailments, they should only concern ambiguous propositions. The algebraic notion corresponding to a fragment in logic is that of a reduct, hence the notion makes sense in an algebraic setting as well.

There are some further properties of ambiguity which are relevant in this paper, but which are more technical. They are the following:

Associativity This property states that given an ambiguity between more than two meanings, their grouping is irrelevant, formally \(a\Vert (b\Vert c)=(a\Vert b)\Vert c\). This seems natural to us, and there seems to be little to object to here. It is very important in connection with commutativity.

Commutativity This property states that for meaning, the order of ambiguities does not play a role, hence \(a\Vert b=b\Vert a\). This is not intuitively obvious on our conception of meaning: on the one hand, there does not seem to be a natural order between ambiguous meanings in general; on the other hand, we often have a clear intuition as to which meaning is primary, secondary etc. Regardless, from a mathematical point of view this property will turn out to be the most critical in this article, and will serve as a probe into the adequacy of a formal theory of ambiguity. The reason is as follows: in all algebraic approaches we present, including commutativity will result in having only trivial (i.e. one-element) algebras. This, among other things, is obviously a knock-out criterion, because even if we do not necessarily want to include commutativity, we definitely want to be able to include it in our axiom set. We thus use this property to definitively reject approaches to ambiguity. Given its pivotal role, we will in the very end also use it in a positive fashion: the fact that our logic \(\mathsf {AL}^{\textit{cf}}\)—and its incongruent semantics—can be extended with commutativity without any apparent problems is strong evidence that it is adequate.

Non-productivity or partiality This is a very peculiar feature of ambiguity, which distinguishes it fundamentally from other propositional connectives: for connectives like \(\wedge ,\vee ,\lnot \) etc., we find the natural language counterparts and, or, not; in this sense, they are productive. This even holds for definable connectives which do not have a simple counterpart, such as XOR (the exclusive or), which we can express in some way or other. For ambiguity, this does not hold: we simply cannot create arbitrary ambiguities in natural language. There is no English phrase expressing the ambiguity between “squirrel” and “table”. We conjecture that this holds in all natural languages (though there does not seem to be any research on this). One might argue that ambiguity simply serves no function, but this is definitely not true. Assume we have a shy man who wants to ask out his office-mate, but is afraid to commit himself. It would be extremely useful for him to have a sentence ambiguous between “would you go out with me” and “do you mind if I open the window”—but this sentence does (presumably) not exist. It is easy to find many other examples—just think of what people might want to say (and not say) in court or in politics.

This leads to an important question: why is this the case, and should we seek the motivation in formal properties of ambiguity, or rather in linguistic considerations? We conjecture the latter, and give the argument in a nutshell: assume there were an (English) ambiguity connective am. The problem with this connective is: if we say something like x am y, we give less information than by saying just x or y, yet we say (quantitatively) more. This contradicts fundamental Gricean principles, because we say more, yet are deliberately less informative. Hence an ambiguity connective would already be an atrocity from the point of view of pragmatics.

And still, reconsidering the case of our shy office worker, this connective would not be particularly useful: one of the features of ambiguity is that a speaker, being ambiguous, does not even commit to being ambiguous on purpose—this is what makes it so attractive in our example. By being obviously ambiguous on purpose—say by uttering Would you open the window am go out with me—one already loses a core feature of “full” ambiguity. Put differently, “full” ambiguity includes the possibility of not being aware of it, and if there were an explicit connective, this possibility would be excluded. This is not the place to dwell on these linguistic arguments; we only want to conclude: in our view, the partiality of ambiguity is due to linguistic and pragmatic principles, not to its semantic properties themselves. Hence this is not an argument for making ambiguity a partial operation in our logics/algebras. We will still consider the possibility of making \(\Vert \) a partial operation in our algebras, and check whether this helps avoid some negative results. As we will see, it does not.

Monotonicity Basically, monotonicity states that every ambiguous term entails itself, and this entailment is closed under weakening in the logical sense:

figure f

This is not entirely straightforward, since under this assumption plants and animals entails plants, but the word animals might provide evidence for one specific reading of plants. But we are interested in logical soundness, not plausibility, hence we put this concern aside. Algebraically, this means: if you increase the arguments of \(\Vert \), you increase the value. Logically, its counterpart is the following inference rule (monotonicity):Footnote 6

figure g

Consistency of usage and trust We adopt this feature, but underline that it is actually the only one from the list here which is not mandatory for ambiguity. This feature actually distinguishes our work from the approach of van Eijck and Jaspars (1996). Imagine someone telling you something about banks, and as he goes on, you discover that what he says does not make any sense to you. In the end, you notice that he has been using the term bank with different meanings in different utterances. At this point, you obviously have to consider most of the discourse meaningless: how can you possibly reconstruct what meaning was intended in which utterance? Trustful reasoning with ambiguity makes the following assumption:

figure h

This is of course arguable, not only because the notion of “context” remains vague, but also because we can use the same word with different meanings in the same sentence, as in I spring over a spring in spring.Footnote 7 However, the classic work by Yarowsky (1995) gives strong evidence for consistent usage in empirical data.

We underline that reasoning with ambiguity in a situation of distrust is also possible and has been described, though not as such, by van Eijck and Jaspars (1996). To illustrate the difference from a formal point of view: \(p\Vert q\vdash p\Vert q\) is a valid inference in both cases (monotonicity), whereas \((p\Vert q)\wedge \lnot (p\Vert q)\) is a contradiction in the trustful case, but not in the distrustful case, since we might intend different propositions in \(p\Vert q\) and \(\lnot (p\Vert q)\). Linguistically: a sentence like

figure i

is a contradiction in the trustful approach; in the distrustful approach not necessarily: dead could be used in two different senses, say medical and spiritual.

Hence in the distrustful approach, classical theorems are no longer valid if constituted by ambiguous propositions, and classical inferences (like Modus Ponens) usually fail when applied to ambiguous propositions (see also the conclusion of Sect. 3). There is a lot more to say on this issue, but we plan to compare the trustful and distrustful approaches in a separate publication. In this article, we want to describe reasoning with ambiguity in a situation of trust in consistent usage.
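The formal contrast between trust and distrust can be illustrated with a small sketch (the representation and names are our own illustration, not the paper's apparatus): under trust, both occurrences of \(p\Vert q\) in \((p\Vert q)\wedge \lnot (p\Vert q)\) must resolve to the same reading, so every resulting reading is contradictory; under distrust, the occurrences resolve independently, and a satisfiable reading such as \(p\wedge \lnot q\) appears.

```python
# Sketch: trustful vs. distrustful readings of (p||q) /\ ~(p||q),
# resolving the ambiguity per occurrence.
from itertools import product

def satisfiable(f, n=2):
    return any(f(*v) for v in product([False, True], repeat=n))

p = lambda x, y: x
q = lambda x, y: y
readings = [p, q]

# trustful: the same reading r is intended at both occurrences
trustful = [lambda x, y, r=r: r(x, y) and not r(x, y) for r in readings]

# distrustful: the two occurrences may pick different readings r, s
distrustful = [lambda x, y, r=r, s=s: r(x, y) and not s(x, y)
               for r, s in product(readings, repeat=2)]

# a formula is contradictory iff no intended reading is satisfiable
print(any(satisfiable(f) for f in trustful))     # False: contradiction
print(any(satisfiable(f) for f in distrustful))  # True: e.g. p /\ ~q
```

The default arguments `r=r`, `s=s` pin down each reading at lambda creation time, avoiding Python's late binding of closure variables in loops.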

3 Algebras of Ambiguity

3.1 Preliminaries and Boolean Algebras

In this section, we will present an algebraic approach to the problem of reasoning with ambiguity. We will sketch the preliminaries, then present three relevant classes of algebras and prove the equivalence of their equational theories (i.e. the sets of all equations holding in all algebras of the respective class), which will ultimately lead us to discard this approach. The results of this section are thus mostly negative. If the reader is mainly interested in how ambiguity can be adequately treated, she can safely skip this section. The interesting result can be summarized as follows: algebra, or at least extensions of Boolean algebras, will not do the job. We will discuss the reasons for this at the end of this section. In Sect. 6 we will see which general insights can be drawn from this.

The general setting we will use here is that of Boolean algebras, which are structures of the form \({\mathbf {B}}=(B,\wedge ,\vee ,{\sim },0,1)\). As these are well known, we do not introduce them [the reader interested in background might consider Kracht (2003) and Maddux (2006), or many other sources]. We denote the class of Boolean algebras by \(\mathbf{BA} \). In this section, we will use only elementary properties of Boolean algebras, frequently and without proof or explicit reference. Many results we present here depend on specific properties of Boolean algebras such as the law of double complementation; hence the results do depend on this very particular choice. However, there is a very good justification for it, namely that in the semantics of natural language, which is by far the largest field of research where ambiguity arises and has to be handled, there is (comparatively) very little work on approaches using non-classical logic [but see Barwise and Etchemendy (1990), which is very interesting since it also includes the information-theoretic aspect that is important for ambiguity].

In the algebraic approach to ambiguity, we think of the objects of algebras as propositional meanings; the operations of the algebra (in our case, the Boolean operators and ‘\(\Vert \)’) correspond to ways to combine these meanings. Here, the Boolean operations of course (loosely) correspond to their counterparts in natural language; for ‘\(\Vert \)’, there is no corresponding connective. Importantly, there is no straightforward sense in which some meanings are more “basic” than others: all terms denote simple objects, that is, propositional meanings.

We now discuss what properties the connective \(\Vert \) should satisfy on a conceptual level; put differently, we ask: what kind of object is \(a\Vert b\), and which rules does the operator \(\Vert \) obey? We distinguish three different ways in which we can conceive of the operation \(\Vert \):

  1. \(a\Vert b\) denotes the “correct” meaning, that is, the one intended by the speaker (but which is unknown to any interpreter)

  2. \(a\Vert b\) is entailed by the “correct” meaning, that is, the one intended by the speaker

  3. \(a\Vert b\) is a “genuinely ambiguous” object, a sort of underspecification, which behaves in a certain combinatorial and inferential fashion

The operator \(\Vert \) thus introduces an epistemic aspect into our algebra: in cases 1. and 2., we refer to the intention of the speaker, which is invisible to any outsider. The same holds in case 3., since there we have a genuinely underspecified meaning, that is, one whose true content we cannot reconstruct.

3.2 Three Classes of Algebras

We now introduce three classes of algebras corresponding to the three conceptions mentioned above. All of them will have the same signature \((A,\wedge ,\vee ,{\sim },\Vert ,\) 0, 1), where \((A,\wedge ,\vee ,{\sim },0,1)\) is a Boolean algebra, and \(\Vert \) is a binary operator. We adopt the following general conventions: boldface letters like \(\mathbf{A }\) designate algebras, corresponding plain letters like A denote their carrier sets. We define \(a\le b\) as an abbreviation for \(a\wedge b=a\) (equivalently, \(a\vee b=b\)). Another general convention is the following: let \({\mathfrak {C}}\) be a class of algebras, \(t,t'\) be terms over their signature. We write \({\mathfrak {C}}\models t=t'\) if for all \(\mathbf{C }\in {\mathfrak {C}}\) and all instantiations of \(t,t'\) with terms denoting objects in C, the equality holds in \(\mathbf{C }\). Hence we write \(\mathbf{BA} \models a\vee {\sim }a=1\) etc. The following algebras are ordered from strong to weak.

Strong ambiguous algebras In this class, we have the following axioms for \(\Vert \):

figure j

We denote the class of all algebras satisfying these axioms by \(\mathbf {SAA}\). (\(\Vert \)1) and (\(\Vert \)2) will hold in all classes, and it is these axioms which ensure universal distribution (9)–(11), which thus hold in all algebras we consider. (\(\Vert \)3) is the axiom peculiar to \(\mathbf {SAA}\), and all it states is that \(a\Vert b\) either denotes a or it denotes b.

Weak ambiguous algebras

figure k

We denote the class by \(\mathbf {WAA}\). As we see, it is only the slightly weaker equality in (\(\Vert 3w\)) which distinguishes it from the strong form. Still, the two do not coincide. However, we will show that every weak ambiguous algebra is actually a universal distribution algebra. We need the additional axiom (assoc) to ensure associativity, which is actually derivable in \(\mathbf {SAA}\), but does not seem to be derivable from the other axioms in \(\mathbf {WAA}\).

Universal distribution algebras

figure l

We denote the class by \(\mathbf {UDA}\). This is the weakest algebraic class we present here. As is easy to see, this class is a variety, being axiomatized by a set of (in)equalities. \(\mathbf {SAA}\) and \(\mathbf {WAA}\) are not varieties: every variety contains the free algebra generated by an arbitrary set; however, the free ambiguous algebra (whether weak or strong) over some non-trivial set is not an ambiguous algebra, because of the disjunctive axiom: since in general \(a\ne a\Vert b\ne b\), neither of the two disjuncts holds in the free algebra. We will now consider the three classes one after the other.

3.3 Strong Ambiguous Algebras

We now present the most important results on the class of strong ambiguous algebras, which has been introduced and thoroughly investigated in Wurm and Lichte (2016).Footnote 8 Intuitively, this is a model where all ambiguous meanings exist, but every ambiguity is resolved to an underlying intention (this makes the implicit presupposition that ambiguous meanings are used consistently in one sense). This is a strong commitment, and the mathematical results show that it is actually too strong. First note that (\(\Vert \)1),(\(\Vert \)2) are sufficient for universal distribution: they entail all the equations (9)–(11) (for details see Wurm and Lichte 2016), as \(\vee \) is redundant (being definable from \({\sim }\) and \(\wedge \)). The axiom (id) \(a\Vert a=a\) is obviously derivable. We now prove the main result on \(\mathbf {SAA}\), namely uniformity.

Lemma 1

Let \({\mathbf {A}}\) be a strong ambiguous algebra, and \(a\in A\). If \(a\Vert {\sim }a=a\), then

figure m


Proof 1. follows by negation distribution; 2. holds because \(1\Vert a=({\sim }a\Vert a)\vee a={\sim }a\vee a=1\). Results 3.–9. follow in a similar fashion from the distributive laws. To see why 10. holds, assume for contradiction that \(0\Vert 1=1\). Then we have

$$\begin{aligned} a=1\wedge a=(0\Vert 1)\wedge a=(0\wedge a)\Vert (1\wedge a)=0\Vert a \end{aligned}$$

– a contradiction to 3. Finally, 11. follows by distribution of \({\sim }\). \(\square \)

Obviously, this lemma has a dual where \(a\Vert {\sim }a={\sim }a\), and where all results are parallel.

Lemma 2

Let \({\mathbf {A}}\) be a strong ambiguous algebra, \(a\in A\).

  1. If \(a\Vert {\sim }a=a\), then for all \(b,c\in A\), \(b\Vert c=b\);

  2. if \(a\Vert {\sim }a={\sim }a\), then for all \(b,c\in A\), \(b\Vert c=c\).


Proof We only prove 1.; 2. is dual. Assume \(a\Vert {\sim }a=a\), and assume \(b\Vert c=c\). By the previous lemma, we know that \(1\Vert 0=1\), \(0\Vert 1=0\), hence


Hence \(b\Vert c=c\) entails \(b=c\), which proves the claim. \(\square \)

Now we can prove the strongest result on \(\mathbf {SAA}\), the Uniformity Lemma.

Lemma 3

Assume we have a strong ambiguous algebra \({\mathbf {A}}\), \(a,b\in A\) such that \(a\ne b\).

  1. If \(a\Vert b=a\), then for all \(c,c'\in A\), we have \(c\Vert c'=c\);

  2. if \(a\Vert b=b\), then for all \(c,c'\in A\), we have \(c\Vert c'=c'\).


Proof We only prove 1., as 2. is completely parallel. Assume there are \(a,b\in A\) with \(a\ne b\) and \(a\Vert b=a\), and assume that there are pairs \(c,c'\in A\) such that \(c\Vert c'\ne c\). There are two cases:

  (i) Among these pairs, there is a pair \(c,c'\) such that \(c'={\sim }c\). Then we have \(c\Vert {\sim }c={\sim }c\), and by Lemma 2, it follows that \(a\Vert b=b\), which is wrong by assumption—contradiction.

  (ii) Among these pairs, there is no pair \(c,c'\) such that \(c'={\sim }c\) and \(c\Vert c'\ne c\). Then we necessarily have (among others) \(a\Vert {\sim }a=a\), and by Lemma 2, this entails \(c\Vert c'=c\)—contradiction. \(\square \)

Put differently: let \(\pi _l\) be left projection, a binary function where \(\pi _l(a,b)=a\); \(\pi _r\) is then right projection, with \(\pi _r(a,b)=b\).

Lemma 4

(Uniformity of strong ambiguous algebras) Every strong ambiguous algebra has the form \((B,\wedge ,\vee ,{\sim },\pi _l,0,1)\) or \((B,\wedge ,\vee ,{\sim },\pi _r,0,1)\), where \((B,\wedge ,\vee ,{\sim },\) 0, 1) is a Boolean algebra.

Hence for every Boolean algebra, there exist exactly two strong ambiguous algebras: one where \(\Vert \) uniformly computes \(\pi _l\), and one where \(\Vert \) uniformly computes \(\pi _r\). We say an ambiguous algebra \((B,\wedge ,\vee ,{\sim },\pi _l,0,1)\) is left-sided, and accordingly \((B,\wedge ,\vee ,{\sim },\pi _r,0,1)\) right-sided; we denote the left-sided algebra extending a Boolean algebra \(\mathbf{B }\) by \(C_l(\mathbf{B })\), the right-sided extension by \(C_r(\mathbf{B })\). This entails that strong ambiguous algebras are rather uninteresting: extending a Boolean algebra with a left/right projection operator adds little of substance. It also entails that strong ambiguous algebras with a commutative \(\Vert \) operation are trivial (i.e. one-element), but we will see that this even holds for more general classes. Hence even though the axiomatization seems unproblematic, it is too strong. We will therefore next consider algebras with weaker axioms for ambiguity.
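The Uniformity Lemma can also be confirmed by brute force on a small case. The sketch below (the encoding of the four-element Boolean algebra as 2-bit masks is ours, and we assume, following Wurm and Lichte 2016, that (\(\Vert \)1) and (\(\Vert \)2) are the distribution laws of \({\sim }\), \(\wedge \) and \(\vee \) over \(\Vert \)) enumerates every operation satisfying the disjunctive axiom (\(\Vert \)3) and keeps those satisfying distribution:

```python
# Brute-force check of uniformity on the four-element Boolean algebra,
# encoded as bitmasks 0..3 (meet = &, join = |, complement = 3 ^ x).
# We enumerate every binary operation f with f(x, y) in {x, y}
# (the disjunctive axiom) and keep those distributing over ~, meet, join.
from itertools import product

ELEMS = range(4)
pairs = [(x, y) for x in ELEMS for y in ELEMS]
free = [(x, y) for (x, y) in pairs if x != y]

def survivors():
    found = []
    for bits in product((0, 1), repeat=len(free)):
        f = {(x, x): x for x in ELEMS}          # f(x, x) = x is forced
        for (x, y), b in zip(free, bits):
            f[(x, y)] = (x, y)[b]               # choose left or right
        if all(3 ^ f[(x, y)] == f[(3 ^ x, 3 ^ y)] for (x, y) in pairs) \
           and all(f[(x, y)] & z == f[(x & z, y & z)]
                   and f[(x, y)] | z == f[(x | z, y | z)]
                   for (x, y) in pairs for z in ELEMS):
            found.append(f)
    return found

result = survivors()
proj_l = {(x, y): x for (x, y) in pairs}
proj_r = {(x, y): y for (x, y) in pairs}
assert len(result) == 2 and proj_l in result and proj_r in result
print("only the left and right projections satisfy the axioms")
```

Only \(\pi _l\) and \(\pi _r\) survive the search, as the lemma predicts.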

3.4 Weak Ambiguous Algebras

It is obvious that \(\mathbf {SAA}\subseteq \mathbf {WAA}\), that is, every strong ambiguous algebra is a weak ambiguous algebra, since it easily follows from uniformity (simply by case distinction) that strong ambiguous algebras satisfy (assoc). What is less obvious is that \(\mathbf {WAA}\subseteq \mathbf {UDA}\); this is the first thing we will show. We begin by proving that (id) holds in \(\mathbf {WAA}\).

Lemma 5

\(\mathbf {WAA}\models a=a\Vert a\).


Proof \(a\ge a\Vert a\): \(a\wedge (a\Vert a)=(a\wedge a)\Vert (a\wedge a)=a\Vert a\).

\(a\le a\Vert a\): \(a\vee (a\Vert a)=(a\vee a)\Vert (a\vee a)=a\Vert a\). \(\square \)

Now this has an important corollary:

Corollary 6

\(\mathbf {WAA}\models a\wedge b\le a\Vert b\le a\vee b\).


Proof We have \((a\Vert b)\wedge (a\wedge b)=(a\wedge (a\wedge b))\Vert (b\wedge (a\wedge b))=(a\wedge b)\Vert (a\wedge b)=a\wedge b\) (by idempotence); hence \(a\wedge b\le a\Vert b\) by definition of \(\le \). Parallel for \(\vee \). \(\square \)

We now need two auxiliary properties which hold in all Boolean algebras:

  B1. If \(b\vee {\sim }a=1\), then \(a\le b\).

  B2. If \(a\vee c=1\) and \(a\vee {\sim }c=1\), then \(a=1\).

To see B1, consider that if \(b\vee {\sim }a=1\), then \(a=(b\vee {\sim }a)\wedge a=(b\wedge a)\vee ({\sim }a\wedge a)=(b\wedge a)\vee 0=b\wedge a\), hence by definition of \(\le \), \(a\le b\). To see B2, consider that if the premise holds, we have \(1=(a\vee c)\wedge (a\vee {\sim }c)=a\vee (c\wedge {\sim }c)=a\vee 0=a\).
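Since B1 and B2 are quasi-identities, and every Boolean algebra embeds into a power of the two-element algebra, it suffices to check them on \(\{0,1\}\). A minimal sketch:

```python
# Check the quasi-identities B1 and B2 on the two-element Boolean
# algebra; quasi-identities are preserved under subalgebras and
# products, so they then hold in every Boolean algebra.
from itertools import product

NOT = lambda x: 1 - x
for a, b, c in product((0, 1), repeat=3):
    if b | NOT(a) == 1:                  # premise of B1
        assert a <= b                    # conclusion of B1
    if a | c == 1 and a | NOT(c) == 1:   # premises of B2
        assert a == 1                    # conclusion of B2
print("B1 and B2 hold")
```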

Lemma 7

In \(\mathbf {WAA}\), the following equalities hold:

  1. \(1=1\Vert ({\sim }a\vee b\vee {\sim }c)\Vert 1\)

  2. \(1=(1\Vert (a\vee c\vee {\sim }b)\Vert ({\sim }a\vee b)\Vert 1)\)

  3. \(((a\vee c)\Vert b)\vee ({\sim }a\Vert {\sim }b)=1\)

  4. \(a\Vert b\le (a\vee c)\Vert b\)

  5. \(a\Vert b\le a\Vert (b\vee c)\)

  6. \(a\Vert b\le (a\vee c)\Vert (b\vee d)\)


Proof 1. We have

figure n

2. We use B2:

figure o


figure p

and the claim follows from 1.

3. is obtained from 2. by applying distributive laws.

4. Since \({\sim }a\Vert {\sim }b={\sim }(a\Vert b)\), this is obtained from 3 and B1.

5. Parallel to 4 (several steps have to be repeated).

6. The relation \(\le \), defined by \(\wedge \) or \(\vee \), is obviously transitive, hence this follows from the previous two items. \(\square \)

This is already sufficient for the following result:

Corollary 8

Every weak ambiguous algebra is a universal distribution algebra.

Are there weak ambiguous algebras which are not strong ambiguous algebras? This question can be answered positively: just take the algebra with the four elements \(\{0,1,0\Vert 1,1\Vert 0\}\), with the obvious order (the square). There is actually just one algebra with these elements, where necessarily we have \(1\Vert 0\Vert 1=1\), \(1\Vert 0\Vert 0=1\Vert 0\) etc., that is, we have \(a_1\Vert a_2\Vert a_3=a_1\Vert a_3\). This can be easily proved to be a weak ambiguous algebra, but it is not a strong ambiguous algebra, as \(0\ne 0\Vert 1\ne 1\).

The next question is: are there universal distribution algebras which are not weak ambiguous algebras? Again, the answer is positive: take the UDA which extends the four-element Boolean algebra over \(\{0,a,b,1\}\) by ambiguous objects, and take the object \(a\Vert b\). Then \((a\Vert b)\vee a=a\Vert (a\vee b)=a\Vert 1\); \((a\Vert b)\vee b=1\Vert b\). If this were a weak ambiguous algebra, we would have either \(a\le a\Vert b\) or \(b\le a\Vert b\). If \(a\le a\Vert b\), then \(a\Vert b=a\vee (a\Vert b)=a\Vert 1\), and if \(b\le a\Vert b\), then \(a\Vert b=(a\Vert b)\vee b= 1\Vert b\). However, every UDA can be completed to a strong ambiguous algebra in two ways (see Lemma 21) without collapsing any elements of the underlying Boolean algebra; hence in the algebra over \(\{0,a,b,1\}\), we would either have \(a=1\) or \(a=b\) or \(b=1\), which by assumption do not hold. This proves the following:

Lemma 9

\(\mathbf {SAA}\subsetneq \mathbf {WAA}\subsetneq \mathbf {UDA}\).

This is already all we have to say about this class: Being located between \(\mathbf {SAA}\) and \(\mathbf {UDA}\), it does not seem to be particularly interesting.

3.5 Universal Distribution Algebras

In a sense, \((\Vert 3)\) and \((\Vert \)3w) state that we use an ambiguous term with a given intention. This might be true if we think of sentences uttered by speakers. It is no longer true if we just think (for example) of the lexicon, where terms exist regardless of any intention. \(\mathbf {UDA}\) models ambiguity without any underlying intention. Regarding the axioms, (\(\Vert \)1), (\(\Vert \)2) ensure that universal distribution holds, see (9)–(11). (inf) regulates the relation \(\le \) between ambiguous and unambiguous objects; (mon) the relation \(\le \) between ambiguous objects. As we will later see, (mon) is derivable from the other axioms; we include it nonetheless, since it makes the properties of the algebra easier to grasp. It is easy to see that (mon) amounts to a form of monotonicity: increasing the arguments of \(\Vert \) increases the value of the function:

Lemma 10

(Monotonicity for \(\mathbf {UDA}\)) For every \(\mathbf{U }\in \mathbf {UDA}\), \(a,b,c,d\in U\), if \(a\le c\),\(b\le d\), then \(a\Vert b\le c\Vert d\).


Proof Assume \(a\le c\), \(b\le d\). Then \(c=a\vee c\), \(d=b\vee d\), and \(a\Vert b\le (a\vee c)\Vert (b\vee d)=c\Vert d\). \(\square \)

The formulation we choose immediately entails that \(\mathbf {UDA}\) is a variety, contrary to \(\mathbf {SAA}\) and \(\mathbf {WAA}\). Note that in the presence of the distributive laws, (inf) is equivalent to (id): (id) entails (inf), because \((a\Vert b)\wedge (a\wedge b)=(a\wedge b)\Vert (a\wedge b)=a\wedge b\), and hence by definition of \(\le \), \(a\wedge b\le a\Vert b\); parallel for \(a\vee b\). Conversely, (inf) entails idempotence, because then \(a=a\wedge a\le a\Vert a\le a\vee a=a\). Hence \(\mathbf {UDA}\) splits (\(\Vert \)3) into two weaker axioms. Note also that since (inf) is correct beyond doubt and equivalent to (id), the perhaps more questionable (id) is inevitable. Indeed, (id) might look questionable, as it already entails things like

$$\begin{aligned} a\Vert b&=a\Vert a\Vert b\\&=a\Vert b\Vert b\\&=(a\Vert b\Vert b)\vee (a\Vert a\Vert b)\\&=a\Vert (a\vee b)\Vert b \end{aligned}$$

(We skipped some straightforward intermediate steps.) In \(\mathbf {UDA}\), also inequations such as the following law of disambiguation are satisfied:

$$\begin{aligned} (a\Vert b\Vert c)\wedge {\sim }b&=(a\wedge {\sim }b)\Vert (b\wedge {\sim }b)\Vert (c\wedge {\sim }b)\\&\le (a\wedge {\sim }b)\Vert 0\Vert (c\wedge {\sim }b)\\&\le a\Vert c \end{aligned}$$

To get a better intuition on the structure of universal distribution algebras, we present some first results. We say a term t is in ambiguous normal form iff \(t=t_1\Vert ...\Vert t_i\), where \(t_1,...,t_i\) are Boolean terms. The following is not difficult:

Lemma 11

For every term t, there is a term \(t'\) in ambiguous normal form such that \(\mathbf {UDA}\models t=t'\).

To see this, just iterate the application of the distributive laws. When we have a Boolean combination of ambiguous terms, the procedure of forming ambiguous normal forms leads to an exponential blow-up in the size of terms. This “problem” (if we want to consider it as such) will, however, turn out to be immaterial for \(\mathbf {UDA}\), once we have the Margin Lemma, which is the central result on \(\mathbf {UDA}\).
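The procedure can be sketched as a small term rewriter. The tuple encoding and the particular left-to-right pairing order are our own choices (recall that \(\Vert \) is not commutative, so some fixed order must be chosen):

```python
# Sketch of conversion to ambiguous normal form: ~, /\, \/ are pushed
# through || ('amb') by the distributive laws, until the result is a
# list [t1, ..., ti] of purely Boolean terms standing for t1||...||ti.

def anf(t):
    """Return the Boolean alternatives of t in ambiguous normal form."""
    op = t[0]
    if op == 'var':
        return [t]
    if op == 'not':
        # ~ distributes over || componentwise, preserving order
        return [('not', s) for s in anf(t[1])]
    if op in ('and', 'or'):
        left, right = anf(t[1]), anf(t[2])
        # e.g. (a||b) /\ (c||d) = (a/\c) || (a/\d) || (b/\c) || (b/\d)
        return [(op, l, r) for l in left for r in right]
    if op == 'amb':
        return anf(t[1]) + anf(t[2])
    raise ValueError(op)

a, b, c = ('var', 'a'), ('var', 'b'), ('var', 'c')
assert anf(('and', ('amb', a, b), c)) == [('and', a, c), ('and', b, c)]
# the blow-up: conjoining two 2-fold ambiguities yields 4 alternatives
assert len(anf(('and', ('amb', a, b), ('amb', a, b)))) == 4
```

The second assertion illustrates the exponential blow-up mentioned above.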

An interesting property is the following: let \(t=t_1\Vert ...\Vert t_i\) be a term in ambiguous normal form. One might conjecture that \(\mathbf {UDA}\models 1=t\) iff \(\mathbf{BA} \models 1=t_1,...,1=t_i\). This is however not correct, as can be seen from the following:

$$\begin{aligned} 1&=(a\Vert b)\vee {\sim }(a\Vert b)\\&=(a\vee ({\sim }a\Vert {\sim }b))\Vert (b\vee ({\sim }a\Vert {\sim }b))\\&=(a\vee {\sim }a)\Vert (a\vee {\sim }b)\Vert (b\vee {\sim }a)\Vert ( b\vee {\sim }b) \end{aligned}$$

Here, a and b are arbitrary. This can be strengthened further: first, as a special case, put \(b\equiv {\sim }a\); then we have:

$$\begin{aligned} 1&=(a\Vert {\sim }a)\vee {\sim }(a\Vert {\sim }a)\\&=1\Vert (a\vee a)\Vert ({\sim }a\vee {\sim }a)\Vert 1\\&=1\Vert a\Vert {\sim }a \Vert 1 \end{aligned}$$

where a is arbitrary. So far, we have only used Boolean algebra axioms and universal distribution. With (id) and (assoc) we can derive:

$$\begin{aligned} 1&=(1\Vert a)\Vert ({\sim }a \Vert 1)&\\&=((1\Vert a)\Vert (1\Vert a))\Vert ({\sim }a \Vert 1)&\text {(id)}\\&=(1\Vert a)\Vert (1\Vert a\Vert {\sim }a \Vert 1)&\text {(assoc)}\\&=(1\Vert a)\Vert 1&\text {substitution of line 1} \end{aligned}$$

There is a parallel derivation (using \(\wedge \) instead of \(\vee \)) for \(0=0\Vert a\Vert 0\), hence the following equalities are valid in \(\mathbf {UDA}\):

$$\begin{aligned} 0&=0\Vert a\Vert 0\\ 1&=1\Vert a\Vert 1 \end{aligned}$$

where a is arbitrary. From here we can prove the following:

Lemma 12

\(\mathbf {UDA}\models a=a\Vert b\Vert a\).


Proof By cases:

Case 1 Assume \(b\le a\). Then \(a=1\wedge a=(1\Vert b\Vert 1)\wedge a=a\Vert (b\wedge a)\Vert a=a\Vert b\Vert a\).

Case 2 Assume \(a\le b\). Then \(a=a\vee 0=a\vee (0\Vert b\Vert 0)=a\Vert (a\vee b)\Vert a=a\Vert b\Vert a\).

Case 3 Assume \(a\not \le b\), \(b\not \le a\). Then \((a\Vert b\Vert a)\wedge a=a\Vert (b\wedge a)\Vert a=a\) (by case 1), hence \(a\le a\Vert b\Vert a\); similarly, \((a\Vert b\Vert a)\vee a=a\Vert (b\vee a)\Vert a=a\) (by case 2), hence \(a\Vert b\Vert a\le a\), hence the claim follows. \(\square \)

Hence in particular, \(0\Vert 1\Vert 0=0\), \(1\Vert 0\Vert 1=1\). Hence we have again a very strong result, definitely stronger than what our intuition tells us about ambiguity. In particular, this makes it problematic to include commutativity:

Lemma 13

Let \(\mathbf{U }\) be a universal distribution algebra, with \(a,b\in U\). Then if \(a\Vert b=b\Vert a\), we have \(b=a\).


Proof Assume \(a\Vert b=b\Vert a\). Then \(b=b\Vert a\Vert b=b\Vert b\Vert a=b\Vert a=b\Vert a\Vert a=a\Vert b\Vert a=a\). \(\square \)

Corollary 14

Every commutative universal distribution algebra \({\mathbf {U}}\) has at most one element.

We can now show the following result, which characterizes \(\mathbf {UDA}\) very neatly:

Lemma 15

(Margin Lemma) Let \(\mathbf{U }\) be a universal distribution algebra. Then for all \(a,b,c\in U\), we have \(a\Vert b\Vert c=a\Vert c\); put differently, \(\mathbf {UDA}\models a\Vert b\Vert c=a\Vert c\).


Proof (For simplicity, we now omit brackets, as justified by associativity.)

$$\begin{aligned} a\Vert b\Vert c&=a\Vert c\Vert a\Vert b\Vert c&(a=a\Vert c\Vert a)\\&=a\Vert c&(c=c\Vert a\Vert b\Vert c) \end{aligned}$$

\(\square \)

Note that in order to derive the Margin Lemma, we have used (inf) via its equivalent (id), but we have not used (mon), as can easily be checked. Hence we can derive the following:Footnote 9

Lemma 16

Every algebra \(\mathbf{U }\) satisfying (\(\Vert \)1),(\(\Vert \)2),(assoc),(inf) is a universal distribution algebra.


We can use the Margin Lemma, since it follows from the four axioms already. Hence

$$\begin{aligned}&(a\Vert b)\vee ((a\vee c)\Vert (b\vee d))\\ =&(a\vee a\vee c)\Vert (b\vee a\vee c)\Vert (a\vee b\vee d)\Vert (b\vee b\vee d)&\text {distributive laws}\\ =&(a\vee c)\Vert (b\vee a\vee c)\Vert (a\vee b\vee d)\Vert (b\vee d)&\text {Boolean laws}\\ =&(a\vee c)\Vert (b\vee d)&\text {Margin Lemma} \end{aligned}$$

Hence we have (mon) \(a\Vert b\le (a\vee c)\Vert (b\vee d)\), and the claim follows. \(\square \)

In the end, in \(\mathbf {UDA}\) arbitrary ambiguities “boil down” to the margins of ambiguous terms: commutativity is excluded, and more than 2-fold ambiguity is meaningless in this class of algebras. This is obviously a problem, which basically excludes \(\mathbf {UDA}\) as a realistic model for ambiguity. We will discuss a way out of this predicament later; but before that, we will prove a useful representation theorem for \(\mathbf {UDA}\).

Definition 17

We define the canonical \(\mathbf {UDA}\) over two given Boolean algebras \({{\varvec{B}}}_1,{{\varvec{B}}}_2\) as the direct product algebra \({{\varvec{B}}}_1\times {{\varvec{B}}}_2\), where Boolean operations are defined pointwise as usual, and the operation \(\Vert \) is defined by \((a,b)\Vert (c,d)=(a,d)\) for all \(a,c\in B_1\), \(b,d\in B_2\).
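Definition 17 is easy to instantiate and check mechanically. The sketch below (the bit-pair encoding is ours) builds the canonical \(\mathbf {UDA}\) over two copies of the two-element Boolean algebra and verifies the distribution axioms, (id), (inf), associativity, the Margin Lemma and Lemma 12 by brute force:

```python
# The canonical UDA over two two-element Boolean algebras: elements
# are pairs (x, y), Boolean operations act pointwise, and
# (a, b) || (c, d) = (a, d).
from itertools import product

U = list(product((0, 1), repeat=2))
meet = lambda p, q: (p[0] & q[0], p[1] & q[1])
join = lambda p, q: (p[0] | q[0], p[1] | q[1])
neg  = lambda p: (1 - p[0], 1 - p[1])
amb  = lambda p, q: (p[0], q[1])             # the canonical ||
leq  = lambda p, q: meet(p, q) == p

for p in U:
    assert amb(p, p) == p                                             # (id)
for p, q in product(U, repeat=2):
    assert neg(amb(p, q)) == amb(neg(p), neg(q))                      # (||1)
    assert leq(meet(p, q), amb(p, q)) and leq(amb(p, q), join(p, q))  # (inf)
for p, q, r in product(U, repeat=3):
    assert meet(amb(p, q), r) == amb(meet(p, r), meet(q, r))          # (||2)
    assert join(amb(p, q), r) == amb(join(p, r), join(q, r))
    assert amb(amb(p, q), r) == amb(p, amb(q, r))                     # (assoc)
    assert amb(amb(p, q), r) == amb(p, r)                             # Margin Lemma
    assert amb(amb(p, q), p) == p                                     # Lemma 12
print("all checks passed")
```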

It is straightforward to check that this satisfies all \(\mathbf {UDA}\)-axioms. Canonical \(\mathbf {UDA}\) have a very simple structure, in that they only slightly extend product algebras. Using the Margin Lemma, we will prove that every UDA is isomorphic to a canonical universal distribution algebra. Given \(\mathbf{U }\in \mathbf {UDA}\), we define the relations \(\theta _l,\theta _r\subseteq U^2\) by

figure q

These are equivalence relations for every carrier set U, and in fact they are congruences for all universal distribution algebras, that is:

Lemma 18

Assume that for \(a,b,c,d\in U\), \(a \theta _l b\) and \(c\theta _l d\). Then

  1. \({\sim }a\theta _l {\sim }b\),

  2. \((a\wedge c)\theta _l(b\wedge d)\),

  3. \((a\vee c)\theta _l(b\vee d)\), and

  4. \((a\Vert c)\theta _l(b\Vert d)\).

The same holds for \(\theta _r\).



Proof

  1. Assume \(a\Vert x=b\Vert x\) for all x, hence also \(a\Vert {\sim }x=b\Vert {\sim }x\). Then in particular, \(({\sim }a)\Vert x={\sim }(a\Vert ({\sim }x))={\sim }(b\Vert ({\sim }x))=({\sim }b)\Vert x\).

  2. Assume \(a\Vert x=b\Vert x\) and \(c\Vert x=d\Vert x\) for all x. We have \((a\wedge c)\Vert x=(a\Vert x)\wedge (c\Vert x)\) by the Margin Lemma, and by assumption and the Margin Lemma \((a\Vert x)\wedge (c\Vert x)=(b\Vert x)\wedge (d\Vert x)=(b\wedge d)\Vert x\).

  3. Parallel to 2.

  4. Assume \(a\Vert x=b\Vert x\) for all x. Hence by associativity, for all x, \((a\Vert c)\Vert x=a\Vert (c\Vert x)=b\Vert (d\Vert x)=(b\Vert d)\Vert x\), and hence \((a\Vert c)\theta _l (b\Vert d)\) (we only need one of the premises here). \(\square \)

We define maps \(h_l,h_r:U\rightarrow \wp (U)\) by \(h_l(x)=\{a:a\theta _lx\}\), \(h_r(x)=\{a:a\theta _rx\}\) (that is, elements are mapped onto their congruence classes). By standard results of general algebra, these are homomorphisms for arbitrary universal distribution algebras. Hence we can construct the two homomorphic images \(h_l(\mathbf{U })=(U_{\theta _l},\wedge ,\vee ,{\sim },0,1)\) (with the congruence classes as carrier set), and \(h_r(\mathbf{U })\). We now define the map \(\phi \) by \(\phi (x)=(h_l(x),h_r(x))\). This is still a homomorphism for the Boolean operations, if we define all operations pointwise in the image algebra. The image \(\phi [U]\) is a set of pairs (of congruence classes), so we can define \(\Vert \) canonically by \((a,b)\Vert (c,d)=(a,d)\). Hence we obtain a canonical universal distribution algebra which we denote by \(\phi (\mathbf{U })\). The crucial lemma is the following (here \(\cong \) denotes isomorphism of algebras):

Lemma 19

\(\phi (\mathbf{U })\cong \mathbf{U }\).


Proof We show two things: 1. \(\phi (a\Vert b)=\phi (a)\Vert \phi (b)\) (that this holds for all other connectives already follows by general algebra), and 2. \(\phi \) is a bijection.

  1. Note that \(\phi (a)\Vert \phi (b)=(h_l(a),h_r(a))\Vert (h_l(b),h_r(b))=(h_l(a),h_r(b))\). By the Margin Lemma, we have \(h_l(a\Vert b)=h_l(a)\) and \(h_r(a\Vert b)=h_r(b)\); hence \(\phi (a\Vert b)=(h_l(a\Vert b),h_r(a\Vert b))=(h_l(a),h_r(b))\).

  2. \(\phi \) is surjective by definition. Now assume we have \(\phi (a)=\phi (b)\), hence \(h_l(a)=h_l(b)\), \(h_r(a)=h_r(b)\). Consequently, we have \(a\Vert 0=b\Vert 0\), and hence \(a\Vert 0\Vert a=b\Vert 0\Vert b\). As \(a=a\Vert 0\Vert a\), \(b=b\Vert 0\Vert b\), it follows that \(a=b\). \(\square \)

Now as \(\phi (\mathbf{U })\) is a canonical algebra for every \(\mathbf{U }\), this proves the following theorem:

Theorem 20

(Product representation theorem for \(\mathbf {UDA}\)) Every UDA is isomorphic to a canonical UDA.

This result shows that all \(\mathbf {UDA}\) are very simple and well-behaved extensions of Boolean algebras. But it also shows, as did our results on \(\mathbf {SAA}\), that they are too simple to be really interesting!
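The decomposition behind Theorem 20 can be carried out mechanically for a finite algebra given only its operation table. The sketch below starts from a relabelled copy of a canonical \(\mathbf {UDA}\) (element names and encoding are ours), recovers \(\theta _l\) and \(\theta _r\) from the \(\Vert \)-table alone, and checks that \(\phi \) is injective and maps \(\Vert \) to the canonical operation:

```python
# Computing the product decomposition of Theorem 20 for a finite UDA:
# recover theta_l, theta_r from the ||-table and check that
# phi(x) = ([x]_l, [x]_r) is injective and canonical on ||.
from itertools import product

# a relabelled copy of the canonical UDA over two two-element algebras
enc = {'p': (0, 0), 'q': (0, 1), 'r': (1, 0), 's': (1, 1)}
U = list(enc)
amb = {(x, y): next(z for z in U if enc[z] == (enc[x][0], enc[y][1]))
       for x, y in product(U, repeat=2)}

# theta_l: a ~ b iff a||x = b||x for all x; theta_r dually
cls_l = {x: frozenset(y for y in U
                      if all(amb[(y, z)] == amb[(x, z)] for z in U))
         for x in U}
cls_r = {x: frozenset(y for y in U
                      if all(amb[(z, y)] == amb[(z, x)] for z in U))
         for x in U}
phi = {x: (cls_l[x], cls_r[x]) for x in U}

assert len(set(phi.values())) == len(U)      # phi is injective
for x, y in product(U, repeat=2):            # phi(x||y) = (left of x, right of y)
    assert phi[amb[(x, y)]] == (cls_l[x], cls_r[y])
print("phi decomposes the algebra into a canonical UDA")
```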

3.6 Equivalence of Equational Theories

We now prove that the equational theories of the three classes coincide.

Lemma 21

For every term t in the signature of \(\mathbf {UDA}\) and every interpretation \(\sigma \) of t into a canonical universal distribution algebra \(\mathbf{U }=\mathbf{B }_1\times \mathbf{B }_2\), there exist two strong ambiguous algebras \(\mathbf{A }_1\), \(\mathbf{A }_2\), with interpretations \(\sigma _1,\sigma _2\) into \(A_1,A_2\), such that \((\sigma _1(t),\sigma _2(t))=\sigma (t)\).


Proof Assume we have the interpretation \(\sigma :X\rightarrow B_1\times B_2\). We know that for every \(\mathbf{B }\in \mathbf {BA}\) there are exactly two strong ambiguous algebras with the same carrier set. We take the two completions \(C_l(\mathbf{B }_1)\) and \(C_r(\mathbf{B }_2)\), and put \(\sigma _1(x)=\pi _l(\sigma (x))\) and \(\sigma _2(x)=\pi _r(\sigma (x))\).

We prove by induction on the complexity of t that these algebras and assignments do the job as required. For atomic terms, the claim is straightforward, as \((\pi _l(\sigma (x)),\pi _r(\sigma (x)))=\sigma (x)\) by definition. Now assume the claim holds for some arbitrary terms \(t,t'\).

  1. We have \(\sigma _1(t\wedge t')=\sigma _1(t)\wedge \sigma _1(t')=\pi _l(\sigma (t))\wedge \pi _l(\sigma (t'))=\pi _l(\sigma (t\wedge t'))\) (by pointwise definition of \(\wedge \)); same for \(\pi _r\) and \(\sigma _2\), hence \((\sigma _1(t\wedge t'),\sigma _2(t\wedge t'))=(\pi _l(\sigma (t\wedge t')),\pi _r(\sigma (t\wedge t')))=\sigma (t\wedge t')\).

  2. \(\vee \): parallel.

  3. \({\sim }\): similar.

  4. \(\Vert \): \(\sigma _1(t\Vert t')=\pi _l(\sigma (t\Vert t'))=\pi _l(\sigma (t))\). Same for \(\sigma _2\), so \((\sigma _1(t\Vert t'),\sigma _2(t\Vert t'))=(\pi _l(\sigma (t)),\pi _r(\sigma (t')))\), which by canonicity entails the claim. \(\square \)

Theorem 22

For all terms \(t,t'\), the following three are equivalent:

  1. \(\mathbf {UDA}\models t=t'\)

  2. \(\mathbf {SAA}\models t=t'\)

  3. \(\mathbf {WAA}\models t=t'\)


Proof \(1.\Rightarrow 2.\): \(\mathbf {SAA}\subseteq \mathbf {UDA}\), hence the claim is obvious.

2.\(\Rightarrow 1.\): Contraposition: assume \(\mathbf {UDA}\not \models t=t'\); hence there is \(\mathbf{U },\sigma \) for which the equality is false: \(\sigma (t)\ne \sigma (t')\). Now we take an isomorphic canonical UDA, which we denote by \(can(\mathbf{U })\), and which has the form \(\mathbf{B }\times \mathbf{B }'\), where \(\mathbf{B },\mathbf{B }'\in \mathbf{BA} \). By the isomorphism \(\phi \), we have \(can(\mathbf{U }),\phi \circ \sigma \not \models t=t'\). Hence \(\phi \circ \sigma (t)=(a,b)\ne (a',b') =\phi \circ \sigma (t')\). Now use the previous lemma: we have two strong ambiguous algebras \(\mathbf{A }_1\) and \(\mathbf{A }_2\), interpretations \(\sigma _1\) and \(\sigma _2\), where \(\sigma _1(t)=a\), \(\sigma _2(t)=b\), \(\sigma _1(t')=a'\), \(\sigma _2(t')=b'\). Now as by assumption, either \(a\ne a'\) or \(b\ne b'\), we either have \(\mathbf{A }_1,\sigma _1\not \models t=t'\) or \(\mathbf{A }_2,\sigma _2\not \models t=t'\). Either way, \(\mathbf {SAA}\not \models t=t'\), hence the claim follows.

\(1.\Rightarrow 3.\): \(\mathbf {WAA}\subseteq \mathbf {UDA}\), hence the claim is obvious.

\(3.\Rightarrow 2.\): \(\mathbf {SAA}\subseteq \mathbf {WAA}\), hence the claim is obvious. \(\square \)

Hence we have three algebraic models, and all of them have the same equational theory, that is, the same set of valid equations. This is, given the difference in axiomatization, rather astonishing and shows an interesting convergence. Unfortunately, we cannot consider this convergence as evidence for the “correct” model of ambiguity—because all these algebras have strongly unintuitive properties. On the other hand, we do not see any algebraic alternatives either, because it seems impossible to weaken the axioms of \(\mathbf {UDA}\) without losing essential properties of ambiguity. Before we sketch the way out of this dilemma, we will quickly present the (rather simple) corollaries on the decidability of the equational theories:

Corollary 23

The equational theories of \(\mathbf {UDA}\), \(\mathbf {WAA}\), \(\mathbf {SAA}\) are decidable; more precisely, their decision problem is co-NP-complete.


Proof We show the claim for \(\mathbf {SAA}\), from which all the others follow. To check whether \(\mathbf {SAA}\models t=t'\), we just have to reduce the equation by interpreting \(\Vert \) as \(\pi _l\) and as \(\pi _r\) respectively, by which the equality reduces to two Boolean equalities \(t_l=t_l'\), \(t_r=t_r'\). Then the question is equivalent to checking whether \(\mathbf {BA}\models t_l=t_l'\) and \(\mathbf {BA}\models t_r=t_r'\), which is well known to be co-NP-complete. \(\square \)
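The decision procedure from the proof above is easy to sketch: interpret \(\Vert \) once as \(\pi _l\) and once as \(\pi _r\), and check the two resulting Boolean identities by truth tables (checking valuations over \(\{0,1\}\) suffices for Boolean validity). The term encoding is ours:

```python
# Deciding SAA |= t = t': evaluate both terms under every 0/1 valuation,
# once with || read as left projection and once as right projection.
from itertools import product

def ev(t, val, proj):
    op = t[0]
    if op == 'var': return val[t[1]]
    if op == 'not': return 1 - ev(t[1], val, proj)
    if op == 'and': return ev(t[1], val, proj) & ev(t[2], val, proj)
    if op == 'or':  return ev(t[1], val, proj) | ev(t[2], val, proj)
    if op == 'amb': return ev(t[1 + proj], val, proj)  # proj 0 = left, 1 = right
    raise ValueError(op)

def valid(t, u, variables):
    """Does t = u hold under both projections and all 0/1 valuations?"""
    return all(ev(t, dict(zip(variables, vs)), p)
               == ev(u, dict(zip(variables, vs)), p)
               for p in (0, 1)
               for vs in product((0, 1), repeat=len(variables)))

a, b, c = ('var', 'a'), ('var', 'b'), ('var', 'c')
assert valid(('amb', ('amb', a, b), c), ('amb', a, c), 'abc')  # Margin Lemma
assert not valid(('amb', a, b), ('amb', b, a), 'ab')           # no commutativity
```

The first assertion confirms the Margin Lemma equationally; the second shows that commutativity of \(\Vert \) is not valid.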

We will now quickly review one possible solution to the problem of the axioms being at the same time correct properties of ambiguity and “too strong”. This solution is to use partial algebras and looks promising at first sight, but does not really lead out of our predicament.

3.7 Partiality

As we have mentioned above, a peculiar property of ambiguity in natural language is that it is—to our knowledge—never productive: ambiguities are in the lexicon, arise in syntactic derivations and from many other sources, but we cannot construct them ad libitum; there is no productive mechanism for ambiguity. This nicely motivates the idea of algebras where \(\Vert \) is a partial operation. Apart from this intuitive motivation of partiality, there is also a mathematical one: uniformity for \(\mathbf {SAA}\) was derived from the existence of objects such as \(a\Vert {\sim }a\), \(0\Vert 1\), which in natural language generally do not arise (leaving aside irony as part of pragmatics). The same holds for \(\mathbf {UDA}\), where proofs proceed via peculiar objects like \(0\Vert a\Vert 0\) which need not necessarily exist. As \(\mathbf {UDA}\) is the largest class of algebras we have presented and the only variety, we will present the results on partiality only for this class.

A partial universal distribution algebra is an algebra \((U,\wedge ,\vee ,{\sim },\Vert ,0,1)\), where \(\Vert \) is a partial function \(U\times U\rightarrow U\), which satisfies the usual equalities:

figure r

Here equations have to be read in the following fashion: if one side of the equality is defined, so is the other, and both denote the same element. Moreover, as the operations \({\sim },\wedge ,\vee \) are total, it follows that if \(a \Vert b\) is defined, so are \({\sim }a\Vert {\sim }b\), \((a\wedge c)\Vert (b\wedge c)\) for all c, etc. Moreover, if \(a\Vert (b\Vert c)\) is defined, so is \(b\Vert c\), because undefined terms are absorbing for all operations. We now show that this extension does not really help.

Assume we have a partial UDA \(\mathbf{U }\), where \(a,b\in U\), and \(a\Vert b\ne \perp \) (we use \(\perp \) as an abbreviation for undefined, not to be confused with 0!). Then we have the defined terms \(1\Vert (a\vee {\sim }b)\Vert ({\sim }a\vee b)\Vert 1=1\), \(0=0\Vert (a\wedge {\sim }b)\Vert ({\sim }a\wedge b)\Vert 0\). Here, \(0\Vert 1\) need not be defined, nor need \({\sim }a\Vert a\) be. Still, we can conclude a number of things (the arguments are similar to the ones above). Firstly, note that if \(a\ne b\), then in all Boolean algebras we have either \(a\vee {\sim }b<1 \) or \({\sim }a\vee b<1\). For assume \(a\vee {\sim }b=1\). Then \(b< a\) (since \(a\ne b\)), hence \({\sim }a<{\sim }b\). Now as \({\sim }b\) is the smallest element such that \(b\vee {\sim }b=1\), we have \({\sim }a\vee b<1\). Hence we conclude: if \(a\Vert b\ne \perp \), then \(1=1\Vert ( a\vee {\sim }b) \Vert ( {\sim }a\vee b)\Vert 1\), where one of \(a\vee {\sim }b\) and \({\sim }a\vee b\) is not equal to 1. Hence for some \(c< 1\), \(1\Vert c\Vert 1=1\) and \(0\Vert c\Vert 0=0\) (by a parallel argument).

By the fact that these terms are defined and Boolean operations remain total in partial \(\mathbf {UDA}\), it follows that for all \(a\in U\), we have a c such that \(a=a\Vert (a\vee c)\Vert a\), \(a=a\Vert (a\wedge c)\Vert a\), where c relates to a defined ambiguity. Moreover, if \(a\Vert b\) is defined, we have

$$\begin{aligned} a&=a\wedge 1\\&=a\wedge (1\Vert ( a\vee {\sim }b)\Vert ({\sim }a\vee b )\Vert 1)\\&=(a\wedge 1)\Vert (a\wedge (a\vee {\sim }b))\Vert (a\wedge ({\sim }a\vee b))\Vert (a\wedge 1)\\&=a\Vert a\Vert (a\wedge b)\Vert a\\&=a\Vert a\wedge b\Vert a \end{aligned}$$

Similarly for \(\vee \), where we get \(a=a\Vert a\vee b\Vert a\); likewise, \(a\wedge b=a\wedge b\Vert b\Vert a\wedge b\). By the same argument, we get \(b=b\Vert a\vee b\Vert b\) etc. This is devastating for a possible commutativity: assume we have \(a\Vert b=b\Vert a\ne \perp \). Then it easily follows that \((a\vee b)\Vert b=b\Vert (a\vee b)\) and \(a\Vert (a\vee b)=(a\vee b)\Vert a\). And from these we conclude:

$$\begin{aligned} a&=a\Vert (a\vee b)\Vert a\\&= (a\vee b)\Vert a\Vert (a\vee b)\\&=a\vee b\\&=(a\vee b)\Vert b\Vert (a\vee b)\\&=b\Vert (a\vee b)\Vert b\\&=b \end{aligned}$$

This shows the following:

Lemma 24

Assume \(\mathbf{U }\) is a commutative partial universal distribution algebra, \(a,b\in U\). If \(a\Vert b\ne \perp \), then \(a=b\).

Hence ambiguous elements collapse, provided we have commutativity! Here we can again make use of commutativity as a probe: it cannot reasonably be included in partial \(\mathbf {UDA}\). This in turn means for us that the theory is inadequate. There would still be more to say about this class, and there are further results showing that it is an inadequate model of our intuition, but we omit them, as none of them are as neat and general as the ones presented for \(\mathbf {UDA}\).
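The identities driving this collapse, such as \(a=a\Vert (a\wedge b)\Vert a\) and \(a=a\Vert (a\vee b)\Vert a\), can be sanity-checked mechanically in the (total) pair model behind \(\mathbf {SAA}\), assuming the product-of-Boolean-algebras reading of the \(\pi _l,\pi _r\) interpretation, where \(a\Vert b\) takes the left component of a and the right component of b. A minimal sketch (our own encoding, purely illustrative):

```python
from itertools import product

# Pair model: elements are pairs of Booleans; a||b = (left of a, right of b),
# meet and join are componentwise. We check, left-associated as in the text,
# that a = a || (a AND b) || a and a = a || (a OR b) || a hold for all pairs.
def amb(x, y): return (x[0], y[1])
def meet(x, y): return (x[0] and y[0], x[1] and y[1])
def join(x, y): return (x[0] or y[0], x[1] or y[1])

pairs = [(p, q) for p in (False, True) for q in (False, True)]
for a, b in product(pairs, repeat=2):
    assert amb(amb(a, meet(a, b)), a) == a   # a = a || (a AND b) || a
    assert amb(amb(a, join(a, b)), a) == a   # a = a || (a OR b) || a
```

The check makes the mechanism transparent: in this model the outer occurrences of a simply overwrite the middle term, so the identities hold regardless of what stands between them.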

3.8 (Intermediate) Conclusion

The main results of this section suggest that the algebraic approach offers some interesting insights, but is of little help to adequately address our original problem of reasoning with ambiguity. Even the weakest axioms result in consequences which are strongly counterintuitive. We will take the following way out: we think that the algebraic approach as such is unsuitable. Put differently: the problem is not the particular axioms (we have chosen the weakest ones implementing the requirements for ambiguity); algebra itself is the problem. There are two main features of algebra which can be abandoned while preserving the desiderata of ambiguity:

  1. 1.

    Uniform substitution of atoms by arbitrary terms preserves the truth of equalities. More formally: let term(X) denote the terms over variables \(X=\{x_1,x_2,...\}\); assume \(t,t'\in term(X)\), and \(\sigma :X\rightarrow term(X)\) is a function which is canonically extended to terms. Then if \(t=t'\) is valid in an algebra, then so is \(\sigma (t)=\sigma (t')\).

  2. 2.

    Substitution of arbitrary equivalent terms preserves the truth of equalities, i.e. equivalence entails congruence. More formally: if \(t_1[t_2],t_2',t_3\in term(X)\), where \(t_1[t_2]\) is a term with subterm \(t_2\), and \(t_1[t_2]=t_3\), \(t_2=t_2'\) are valid in an algebra, then so is \(t_1[t_2']=t_3\).
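Feature 1 can be made concrete with a few lines of code. The following is a minimal sketch (terms as nested tuples, a hypothetical encoding of our own) of uniform substitution, which replaces every occurrence of a variable at once; feature 2, by contrast, concerns replacing a single subterm occurrence by an equivalent term.

```python
# Uniform substitution: sigma maps variable names to terms; every
# occurrence of a variable in t is replaced simultaneously.
def subst(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)   # variables not in sigma stay as they are
    op, *args = t
    return (op, *(subst(a, sigma) for a in args))
```

For example, substituting \(\lnot y\) for x in \(x\wedge x\) replaces both occurrences, whereas substitution of equivalents would license replacing just one of them by any term provably equivalent to x.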

Actually, both features can be separately abandoned, each time resulting in a logic. Moreover, the two resulting logics exactly correspond to the two modes of reasoning with ambiguity we have sketched above (see Sect. 2.2, on consistent usage), depending on whether they assume consistent usage of ambiguous terms or not:

  1. 1.

    Lack of closure under substitution corresponds to the distrustful mode [no consistent usage, see van Eijck and Jaspars (1996)]

  2. 2.

    Lack of closure under substitution of equivalents corresponds to the trustful mode (consistent usage of ambiguous terms), which we will consider here.

Obviously, the former is a fragment of the latter, that is, it has fewer valid inferences. In this article, we will only consider the second approach, and we will provide a comparison of the two modes in further work. Hence we assume trustful reasoning, which means we preserve closure under uniform substitution, but we will not have closure under substitution of equivalents. Logically speaking, substitution of equivalents corresponds to the rule (cut), which should not be admissible in our logic.

4 The Ambiguity Logic \(\mathsf {AL}\)

4.1 Preliminaries

4.2 Multi-sequents and Contexts

The logic \(\mathsf {AL}\) is an extension of classical (propositional) logic (we denote the classical sequent calculus by \(\mathsf {CL}\)), that is, it derives the valid sequents of classical logic in the language restricted to \(\mathsf {CL}\), but it has an additional connective \(\Vert \), by which we can derive additional valid sequents. We will show that this extension is indeed conservative, if we do not include commutativity for \(\Vert \). The connective \(\Vert \) is not very exotic from the point of view of substructural logic: it is a fusion-style operator, which allows for contraction and expansion (its inverse), but not for weakening. We present it both in a commutative and a non-commutative version. Our approach differs from the usual approach to substructural logic in that we extend classical logic with a substructural connective, whereas usually, one considers logics which are proper fragments of classical logic. In order to make this possible, we have to go beyond the normal sequent calculus: we still have sequents, but we have different types of contexts: one of them we denote by \(\natural (...)\), which basically embeds classical logic; the other we denote by \(\lozenge (...)\), which allows us to introduce the new connective \(\Vert \). The contexts thus differ in what kind of connectives we can introduce in them, and what kind of structural rules are allowed in them. Different contexts can be arbitrarily embedded within each other. We refer to the symbols \(\natural ,\lozenge \) as modalities (but they do not immediately relate to modal logic). We have found this idea briefly mentioned as a way to approach substructural logic in Restall (2008), and structures similar to multi-contexts are found in Dyckhoff et al. (2012). They are also used in the context of linear logic, see for example de Groote (1996).

We call the resulting structures multi-contexts. For given multi-contexts \(\varDelta ,\varGamma \), we call a pair \(\varDelta \vdash \varGamma \) a multi-sequent. The calculus can accordingly be called a multi-sequent calculus. Our approach is particular in that we actually extend classical propositional contexts; thus \(\mathsf {AL}\) is but one particular instance of multi-sequent logics. In our view, this field definitely deserves further study, but that no longer relates to ambiguity.

In order to increase readability, we distinguish contexts both by the symbols \(\natural ,\lozenge \) and by the separator symbol we use between formulas/contexts. This will be the symbol ‘,’ in the classical context, so \(\natural (\alpha ,\beta )\) is a well-formed (classical) context. Here ‘,’ corresponds to \(\wedge \) on the left side of \(\vdash \) and to \(\vee \) on the right side of \(\vdash \), and allows for all structural rules. In the ambiguous context, we use ‘;’, hence \(\lozenge (\alpha ;\beta )\) is a well-formed (ambiguous) context. The symbol ‘;’ corresponds to \(\Vert \), is self-dual, and allows for some structural rules, such as contraction, but not for others, such as weakening (or commutativity, depending on whether we include it or not). Formulas are defined as usual: we have a set \(\textit{Var}\) of propositional variables, and define the set of well-formed formulas \(\texttt {WFF}\) by

  • if \(p\in \textit{Var}\), then \(p\in \texttt {WFF}\);

  • if \(\phi ,\chi \in \texttt {WFF}\), then \((\phi \wedge \chi ),(\phi \vee \chi ),(\phi \Vert \chi ),(\lnot \phi )\in \texttt {WFF}\);

  • nothing else is in \(\texttt {WFF}\).
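This inductive definition translates directly into a recursive well-formedness check. A minimal sketch, with formulas encoded as nested tuples over a tag set of our own choosing:

```python
# Formulas as nested tuples: variables are strings; ("not", t) is unary;
# ("and", t1, t2), ("or", t1, t2), ("amb", t1, t2) are binary, with "amb"
# standing for the ambiguity connective ||.
def is_wff(t, Var):
    if isinstance(t, str):
        return t in Var                      # clause 1: variables
    if not isinstance(t, tuple) or not t:
        return False
    op, *args = t
    if op == "not":                          # clause 2: unary negation
        return len(args) == 1 and is_wff(args[0], Var)
    if op in ("and", "or", "amb"):           # clause 2: binary connectives
        return len(args) == 2 and all(is_wff(a, Var) for a in args)
    return False                             # clause 3: nothing else
```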

As usual, we will omit outermost parentheses of formulas. Next, we define multi-contexts; for sake of brevity, we refer to them simply as contexts.

  1. 1.

    \(\natural (\epsilon )\), where \(\epsilon \) is the empty sequence, is a well-formed, classical context, which we also call the empty context.

  2. 2.

    If \(\gamma \in \texttt {WFF}\), then \(\natural (\gamma )\) is a well-formed, classical context.

  3. 3.

    If \(\varGamma _1,...,\varGamma _i\) are well-formed contexts, then \(\natural (\varGamma _1,...,\varGamma _i)\) is a well-formed, classical context.

  4. 4.

If \(\varGamma _1,\varGamma _2\) are well-formed, non-empty contexts, then \(\lozenge (\varGamma _1;\varGamma _2)\) is a well-formed, ambiguous context.
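The four clauses can likewise be sketched as a recursive check (tags "n" and "d" for \(\natural \) and \(\lozenge \) are our own choice, formulas are abbreviated as plain strings; the shorthand \(\lozenge (\alpha ;\beta )\) for \(\lozenge (\natural (\alpha );\natural (\beta ))\) is not implemented):

```python
def is_context(c):
    # "n" = classical modality, "d" = ambiguous modality.
    if not isinstance(c, tuple) or len(c) == 0:
        return False
    tag, *args = c
    if tag == "n":      # clauses 1-3: any arity, including the empty context
        return all(isinstance(a, str) or is_context(a) for a in args)
    if tag == "d":      # clause 4: strictly binary, non-empty arguments
        return (len(args) == 2 and
                all(is_context(a) and len(a) > 1 for a in args))
    return False
```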

Note that \(\lozenge \) is strictly binary. This choice is somewhat arbitrary, but seems to be the most elegant way to prevent some technical problems. \(\natural \) has no restriction in this sense. \(\varGamma \vdash \varDelta \) is a well-formed multi-sequent, if both \(\varGamma ,\varDelta \) are well-formed, classical contexts. We write \(\varGamma [\alpha ]\) to refer to a subformula \(\alpha \) (actually a unary context \(\natural (\alpha )\), see conventions below) of a context \(\varGamma \); same for \(\varGamma [\varDelta ]\), where \(\varDelta \) is a sub-context. More formally, \(\varGamma [-]\) can be thought of as a function from contexts to contexts. These context functions are inductively defined by

  1. 1.

    \([-]:\varDelta \mapsto \varDelta \) is a context function (the identity function).

  2. 2.

    If \(\varGamma [-]\) is a context function, \(\varTheta _1,\varTheta _2\) are contexts, then \((\natural (\varTheta _1,\varGamma [-],\varTheta _2))\) is a context function, where \((\natural (\varTheta _1,\varGamma [-],\varTheta _2))(\varDelta )=\natural (\varTheta _1,\varGamma [\varDelta ],\varTheta _2)\).

  3. 3.

    If \(\varGamma [-]\) is a context function, \(\varTheta \) is a context, then \((\lozenge (\varGamma [-];\varTheta ))\) is a context function, where \((\lozenge (\varGamma [-];\varTheta ))(\varDelta )=(\lozenge (\varGamma [\varDelta ];\varTheta ))\). Parallel for \((\lozenge (\varTheta ;\varGamma [-]))\).

  4. 4.

    Nothing else is a context function.
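Context functions can be sketched as closures mirroring the inductive clauses above (same illustrative encoding as before, tags "n"/"d" of our own choosing):

```python
# [-] is the identity; the other constructors wrap a context function
# in a classical or ambiguous context.
def hole():
    return lambda delta: delta

def in_natural(theta1, gamma, theta2):
    # clause 2: (natural(Theta_1, Gamma[-], Theta_2))(Delta)
    return lambda delta: ("n", *theta1, gamma(delta), *theta2)

def in_lozenge_left(gamma, theta):
    # clause 3: (lozenge(Gamma[-]; Theta))(Delta)
    return lambda delta: ("d", gamma(delta), theta)

# Gamma[-] = natural(a, lozenge([-]; b)):
f = in_natural([("n", "a")], in_lozenge_left(hole(), ("n", "b")), [])
```

Applying `f` to a context plugs it into the hole, e.g. `f(("n", "c"))` yields the context \(\natural (a,\lozenge (c;b))\) in this encoding.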

The calculus with all modalities is somewhat clumsy to write, so we have a number of conventions for multi-sequents, to increase readability. These are important, as we make full use of them already in presenting the calculus.

  • We generally omit unary classical contexts; hence \(\lozenge (\alpha ;\beta )\) is short for \(\lozenge (\natural (\alpha );\natural (\beta ))\).

  • We omit the outermost context in multi-sequents. We can do this because it always is \(\natural (...)\), otherwise the sequent would not be well-formed. As a special case, we omit the empty context \(\natural ()\). Hence \(\vdash \alpha \) is a shorthand for \(\natural ()\vdash \natural (\alpha )\) etc.

  • We write \(\varGamma \) to refer to arbitrary contexts, so \(\alpha ,\varGamma \) is a shorthand for \(\natural (\natural (\alpha ),\varGamma )\).

  • We write \(\varGamma [_\natural \alpha ,\beta ]\) etc. in order to indicate that \(\alpha ,\beta \) occur in the scope of \(\natural \), that is, the smallest sub-context containing them is classical.

  • If \(i>2\), then \(\lozenge (\varGamma _1;...;\varGamma _i)\) is an abbreviation both for \(\lozenge (\varGamma _1;\lozenge (\varGamma _2;...;\varGamma _i))\) and \(\lozenge (\lozenge (\varGamma _1;\varGamma _{2};...);\varGamma _i)\) (meaning that we can use an arbitrary one of them). This abbreviation is unproblematic due to rules ensuring associativity of bracketing. If \(i=1\), then it is an abbreviation for \(\natural (\varGamma _1)\) (hence for the classical context!). If \(i=0\), then it is an abbreviation for \(\natural ()\) (the empty context). The latter two conventions are useful to formulate rules with more generality.

We let \(\natural (\varGamma ,\natural (\varDelta _1,...,\varDelta _i))\) and \(\natural (\natural (\varDelta _1,...,\varDelta _i),\varGamma )\) just be an alternative notation for \(\natural (\varGamma ,\varDelta _1,...,\varDelta _i)\). Hence classical contexts do not really embed into each other. This again will allow us to formulate rules in greater generality.

We urge the reader to be careful: we will make full use of these conventions already in the presentation of the sequent calculus. The reason is that only in this way does it become plainly obvious that our calculus is a neat extension of classical logic. Moreover, we aim to formulate the calculus in a way that makes the structural rules, as far as they are desired, admissible [see Negri and Plato (2001), for background]. Arguably, some rules could be formulated in an intuitively simpler way, but at the price of not having admissible structural rules, which is problematic for proof search. We skip the proof of basic properties such as the fact that all rules preserve well-formedness of multi-sequents, which in fact is not entirely trivial.

4.3 The Classical Context and Its Rules

The modality \(\natural \) (partly) embeds the classical calculus; hence we have the following well-known rules:

figure s

(\(\wedge \)I) and (I\(\vee \)) show how \(\wedge ,\vee \) correspond to ‘,’, depending on the side of \(\vdash \). For negation, we have slightly generalized standard rules:

figure t

We let negation introduction pertain to the classical context, though it is somewhat intermediate. Note that the rules slightly generalize the classical rules; if \(i=1\), we have the classical rule. This extension is sound by universal distribution. In the following, we have the three structural rules of classical logic; these rules are of course restricted to the classical context. We will later show that weakening and contraction are admissible in the calculus (usual argument of reducing the degree of the rule), so the only rule we really need is commutativity.

figure u

This notation means that the rules can be equally applied on both sides of \(\vdash \). Note that we have all these rules not for formulas, but for contexts (recall that in our notation, a formula is just a shorthand for an atomic context anyway). Also keep in mind that classical contexts do not embed into each other; this is important for reading (\(\natural \)weak),(\(\natural \)contr) properly. Hence by our conventions, the classical \(\natural \) is really ubiquitous in the calculus.

4.4 The Ambiguous Context and Its Rules

\(\lozenge \) is a binary modality, and hence there should be no way to introduce single formulas in this context (recall that in the unary case, \(\lozenge (\varGamma )\) is an abbreviation for \(\natural (\varGamma )\)). The introduction rule for \(\lozenge \) is as follows:

figure v

Note that this rule implements and generalizes both (inf) and (mon) from the \(\mathbf {UDA}\)-axioms: it models (inf) if either both \(\varDelta ,\varPhi \) are empty (which is possible) or both \(\varGamma ,\varTheta \) are empty, and it models (mon) if both \(\varLambda ,\varPsi \) are empty. Here our conventions allow us to formulate all these instances in one rule. By this, we can also see that these rules are in a sense a generalization of \(\bullet \)-introduction in the Lambek-calculus. We have two more rules introducing \(\lozenge \), which are admissible in the calculus with cut, but necessary to provide for proper distribution and invertibility of negation in the cut-free case. At first glance, they have nothing to do with negation; however, they solve problems of distribution of negation in a surprising fashion:

figure w

Firstly, note that they are sound due to negation properties: assume \(\varGamma \vdash \varDelta _1,\psi _1,\varTheta \) and \(\varGamma \vdash \varDelta _2,\psi _2,\varTheta \) are sound. Then so are \(\varGamma ,\lnot \psi _1\vdash \varDelta _1,\varTheta \) and \(\varGamma ,\lnot \psi _2\vdash \varDelta _2,\varTheta \), hence \(\varGamma ,\lozenge (\lnot \psi _1;\lnot \psi _2)\vdash \lozenge (\varDelta _1;\varDelta _2),\varTheta \). Now by distribution, we should have \(\varGamma ,\lnot (\psi _1\Vert \psi _2)\vdash \lozenge (\varDelta _1;\varDelta _2),\varTheta \), and by invertibility (aka double negation elimination) we should have: \(\varGamma \vdash \lozenge (\varDelta _1;\varDelta _2),\varTheta ,\psi _1\Vert \psi _2\). It is easy to see that (I\(\lozenge \)),(\(\lozenge \)I) allow for this kind of inference without any problematic steps such as deleting connectives.Footnote 10 There are two (parallel) introduction rules for \(\Vert \):

figure x

These rules eliminate the \(\lozenge \)-context and create a classical one. There are two structural rules in the \(\lozenge \)-context, namely associativity and contraction (for now, we do not allow commutativity). (\(\lozenge \)contr) is obviously admissible with cut, and even without cut we will prove it to be admissible, so it is not part of the calculus.

figure y

Here double lines indicate that the rule works in both directions, and absence of \(\vdash \) means rules work equally on both sides. Together with (cut), these rules would be sufficient. However, we add two more rules which ensure that we will satisfy the universal distribution in the cut-free case.

figure z

This looks like a law for eliminating contexts, but it is rather a distributive law for \(\wedge \) on the left and \(\vee \) on the right. Note that if we have a context \(\varGamma [\natural (\lozenge (\varDelta ;\varPsi ),\varDelta ')]\), we can always derive \(\varGamma [\natural (\lozenge (\natural (\varDelta ,\varDelta ');\varPsi ),\natural (\varDelta ,\varDelta '))]\) via (admissible) (\(\natural \)weak). We call the rule (inter1) since the resulting context \(\varGamma [\lozenge (\varDelta ;\varPsi )]\) might be called an interpolant for the two premises, containing only the material common to the two. This formulation has two advantages: firstly, (\(\natural \)contr) is admissible with (inter1) (as we will show below), and more importantly, (inter1) is invertible, hence if the conclusion is correct, so are the premises, which is advantageous for proof search.Footnote 11 Hence (inter1) slightly generalizes normal distribution: it ensures we can properly distribute \(\wedge \) on the left and \(\vee \) on the right; for the dual distribution of \(\wedge \) on the right and \(\vee \) on the left we need a more problematic rule:

figure aa

Here again the consequence can be thought of as an interpolant of the two premises, containing only the common material. To understand its meaning, consider that in terms of formulas, it means as much as

$$\begin{aligned}&(\beta \wedge (\alpha \Vert \beta \Vert \alpha '))\vee (\alpha \Vert (\beta \wedge \beta ')\Vert \alpha ')\equiv \alpha \Vert \beta \Vert \alpha '\\&\quad \equiv (\beta \vee (\alpha \Vert \beta \Vert \alpha '))\wedge (\alpha \Vert (\beta \vee \beta ')\Vert \alpha ') \end{aligned}$$

We will motivate this rule more explicitly in Sect. 5.2.2. In particular, without this rule the rules (I\(\wedge \)) and (\(\vee \)I) do not seem to be invertible, which would be very problematic. This rule has the problematic property that it eliminates material: the \(\beta \) of the right premise does not occur in the conclusion. However, this seems to be inevitable, and the drawback is made up for by two properties: firstly, (inter2) makes structural rules admissible, and secondly, it is fully invertible: truth of the conclusion entails truth of the (weaker) premises. We will see that invertibility is actually of central importance for reasoning with ambiguity (see proof of Lemma 56 for an example), and also crucial for the matrix semantics. We will also quickly provide alternative, simpler, but less favorable equivalent versions of these two rules in Sect. 5.

4.5 Cut

We now present the cut rule. Its adaptation to multi-sequents is straightforward, as unary contexts are always classical (\(\lozenge \) is a strictly binary modality).

figure ab

Note that (cut) does not substitute formulas, but atomic contexts. It ensures transitivity and congruence without any special cases to consider. Importantly, as every context has a particular modality, also the context inserted by cut comes with a modality—but it does not need to be the same as the one of the cut-formula. We define the notion of a derivation as usual by labelled proof-trees. A proof is a labelled tree where 1. all leaves are instantiations of (ax), and 2. every subtree of depth 1 is an instantiation of one of the other rules of the calculus. A multi-sequent \(\varGamma \vdash \varDelta \) is derivable if it is the root of such a proof-tree. In this case, we write \(\Vdash _{\mathsf {AL}}\varGamma \vdash \varDelta \), meaning the sequent is derivable in \(\mathsf {AL}\). Also the cut-free calculus will play an important role in the sequel; we denote this calculus by \(\mathsf {AL}^{\textit{cf}}\), and write \(\Vdash _{\mathsf {AL}^{\textit{cf}}}\varGamma \vdash \varDelta \) if the sequent is derivable in \(\mathsf {AL}\) without using the cut-rule. We will, in the sequel, mostly write \(\Vdash \) for \(\Vdash _\mathsf {AL}\), and \(\Vdash _\textit{cf}\) for \(\Vdash _{\mathsf {AL}^\textit{cf}}\). We first consider the full calculus \(\mathsf {AL}\), which is the less interesting of the two.

4.6 Algebraic Interpretations of \(\mathsf {AL}\)

Because of the equivalence of the equational theories, we will only consider interpretations into \(\mathbf {UDA}\); by Theorem 22, all soundness and completeness results will hold for \(\mathbf {SAA}\) and \(\mathbf {WAA}\) as well. The interpretation of \(\mathsf {AL}\) into \(\mathbf {UDA}\) is straightforward, but we have to spell it out nonetheless. We define interpretations for contexts; this is necessary for the usual inductive soundness proof. Assume \(\mathbf{U }\in \mathbf {UDA}\) and \(\sigma :\textit{Var}\rightarrow U\) is an (atomic) interpretation. We define two interpretation functions \({\overline{\sigma }},{\underline{\sigma }}\) by:

figure ac

As is easy to see, \({\overline{\sigma }}\) and \({\underline{\sigma }}\) coincide on formulas, and hence in the formula case there is no reason to distinguish them. They also coincide in their interpretation of ‘;’, but as there might be a classical context embedded, it is important to keep them distinct.

We define truth in an algebra as usual: \(\mathbf{U },\sigma \models \varGamma \vdash \varDelta \) iff \({\underline{\sigma }}(\varGamma )\le _U{\overline{\sigma }}(\varDelta )\); as a special case, we write \(\mathbf{U },\sigma \models \varDelta \) iff \(1_U\le _U{\overline{\sigma }}(\varDelta )\). Moreover, we define the notion of validity in a class as usual by \(\mathbf {UDA}\models \varGamma \vdash \varDelta \) (stating that \(\varGamma \vdash \varDelta \) is valid) iff for all \(\mathbf{U }\in \mathbf {UDA}\), \(\sigma :\textit{Var}\rightarrow U\), we have \(\mathbf{U },\sigma \models \varGamma \vdash \varDelta \). We now prove soundness and completeness of \(\mathbf {UDA}\)-semantics for \(\mathsf {AL}\), that is, \(\mathbf {UDA}\models \varGamma \vdash \varDelta \) iff \(\Vdash _\mathsf {AL}\varGamma \vdash \varDelta \). We start with soundness.
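As a purely illustrative aside (not part of the formal development), this truth definition can be sketched in the product-of-Boolean-algebras pair model, with atoms pre-interpreted as pairs and, following the reading of the separators in Sect. 4.2, ‘,’ interpreted as meet on the left of \(\vdash \) and as join on the right, and ‘;’ as the ambiguity operation for both. The encoding (tags "n"/"d") is our own assumption:

```python
# Pair model: elements are pairs of Booleans, meet/join componentwise,
# a||b = (left of a, right of b). Atoms are pre-interpreted by `val`.
def meet(x, y): return (x[0] and y[0], x[1] and y[1])
def join(x, y): return (x[0] or y[0], x[1] or y[1])
def amb(x, y): return (x[0], y[1])

TOP, BOT = (True, True), (False, False)

def sig(c, val, side):
    """side="low": antecedent reading (',' = meet); "high": succedent (',' = join)."""
    if isinstance(c, str):
        return val[c]                    # atoms only, for brevity
    tag, *args = c
    unit, comb = (TOP, meet) if side == "low" else (BOT, join)
    if tag == "n":                       # classical context
        out = unit
        for a in args:
            out = comb(out, sig(a, val, side))
        return out
    if tag == "d":                       # ambiguous context
        return amb(sig(args[0], val, side), sig(args[1], val, side))

def leq(x, y):                           # the order of the pair algebra
    return meet(x, y) == x

def holds(gamma, delta, val):            # Gamma |- Delta true under val
    return leq(sig(gamma, val, "low"), sig(delta, val, "high"))
```

For instance, `holds(("n", "p"), ("n", "p"), val)` is true for any interpretation of p, while \(\vdash p\) holds only when p is interpreted as the top element.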

4.7 Soundness for \(\mathsf {AL}\)

Recall that \(\varGamma [-]\) is a function which is inductively defined; this definition allows us to perform inductions over the complexity of \(\varGamma [-]\), as in the following proof:

Lemma 25

For arbitrary \(\varGamma [-],\varDelta ,\varXi \) and interpretation \(\sigma \), we have \({\underline{\sigma }}(\varGamma [\varDelta ],\varXi )\le {\underline{\sigma }}(\varGamma [\natural (\varDelta ,\varXi )])\).


Proof An easy induction over \(\varGamma [-]\). If \(\varGamma [-]\) is the identity function, then the claim is obvious. Assume the claim holds for some \(\varGamma [-]\). Then it obviously holds for the function \(\natural (\varTheta _1,\varGamma [-],\varTheta _2)\), since the result is identical up to \(\natural \)-commutation. Now take the function \(\lozenge (\varGamma [-];\varTheta )\). Obviously, \(\mathbf {UDA}\models (a\Vert b)\wedge c=(a\wedge c)\Vert (b\wedge c)\le (a\Vert (b\wedge c))\). This entails that \({\underline{\sigma }}(\lozenge (\varGamma [\varDelta ];\varTheta ),\varXi )\le {\underline{\sigma }}(\lozenge (\varGamma [\natural (\varDelta ,\varXi )];\varTheta ))\). Same for the function \(\lozenge (\varTheta ;\varGamma [-])\). \(\square \)

Lemma 26

(Soundness) If \(\Vdash _\mathsf {AL} \varGamma \vdash \varDelta \), then \(\mathbf {UDA}\models \varGamma \vdash \varDelta \).


Proof We make the usual induction over proof rules, showing that they preserve correctness. We omit this for some of the classical rules, for which the standard proofs can be taken over with minor modifications.

\(\blacktriangleright \) (I\(\lozenge \)I) Assume \(\varGamma ,\varLambda \vdash \varDelta ,\varPsi \) and \(\varTheta ,\varLambda \vdash \varPhi ,\varPsi \) are true in a model. Then by Lemma 10, \(\lozenge (\natural (\varGamma ,\varLambda );\natural (\varTheta ,\varLambda ))\vdash \lozenge (\natural (\varDelta ,\varPsi );\natural (\varPhi ,\varPsi ))\) is true, too. It is now easy to check that by distributive laws (ensured by (\(\Vert \)1),(\(\Vert \)2)),

$$\begin{aligned} \qquad \qquad {\underline{\sigma }}(\lozenge (\natural (\varGamma ,\varLambda );\natural (\varTheta ,\varLambda )))= & {} {\underline{\sigma }}(\natural (\lozenge (\varGamma ;\varTheta ),\varLambda ))\\ \qquad \qquad {\overline{\sigma }}(\lozenge (\natural (\varDelta ,\varPsi );\natural (\varPhi ,\varPsi )))= & {} {\overline{\sigma }}(\natural (\lozenge (\varDelta ;\varPhi ),\varPsi )) \end{aligned}$$

\(\blacktriangleright \) (I\(\lozenge \)) Assume we have \({\underline{\sigma }}(\varDelta )\le {\overline{\sigma }}(\varGamma _1,\varTheta _1,\varXi )\) and \({\underline{\sigma }}(\varDelta )\le {\overline{\sigma }}(\varGamma _2,\varTheta _2,\varXi )\). There are formulas \(\gamma _1,\gamma _2\) such that for \(i\in \{1,2\}\), \({\overline{\sigma }}(\gamma _i)={\overline{\sigma }}(\varGamma _i)\). Then we have \({\underline{\sigma }}(\varDelta ,\lnot \gamma _1)\le {\overline{\sigma }}(\varTheta _1,\varXi )\) and \({\underline{\sigma }}(\varDelta ,\lnot \gamma _2)\le {\overline{\sigma }}(\varTheta _2,\varXi )\) by soundness of (\(\lnot \)I) (see below), and by soundness of (I\(\lozenge \)I), we have \({\underline{\sigma }}(\varDelta ,\lozenge (\lnot \gamma _1;\lnot \gamma _2))\le {\overline{\sigma }}(\lozenge (\varTheta _1;\varTheta _2),\varXi )\). By soundness of (I\(\lnot \)), we obtain in turn that \({\underline{\sigma }}(\varDelta )\le {\overline{\sigma }}(\lnot (\lnot \gamma _1\Vert \lnot \gamma _2),\) \(\lozenge (\varTheta _1;\varTheta _2),\varXi )\). By universal distribution and double complementation elimination in BA, this is equivalent to \({\underline{\sigma }}(\varDelta )\le {\overline{\sigma }}(\lozenge (\varGamma _1;\varGamma _2),\lozenge (\varTheta _1;\varTheta _2),\varXi )\).

\(\blacktriangleright \) (\(\lozenge \)I) Parallel.

\(\blacktriangleright \) (\(\Vert \)I),(I\(\Vert \)),(assoc): the former are sound, because antecedent and consequent have actually the same interpretation; the latter is obvious.

\(\blacktriangleright \) (\(\lnot \)I) is sound by soundness of the classical negation rule in Boolean algebras and negation distribution ensured by (\(\Vert \)I).

\(\blacktriangleright \) (I\(\lnot \)) same.

\(\blacktriangleright \) (\(\wedge \)I),(I\(\vee \)) Straightforward, since interpretation remains identical.

\(\blacktriangleright \) (\(\vee \)I) We prove that \({\underline{\sigma }}(\varGamma [\alpha ])\vee {\underline{\sigma }}(\varGamma [\beta ])={\underline{\sigma }}(\varGamma [\alpha \vee \beta ])\) by induction over the complexity of \(\varGamma [-]\). The claim is obvious for \(\varGamma [\alpha ]=\alpha \). Assume it holds for some \(\varGamma [-]\). Then

figure ad

Hence the claim follows for \(\natural (\varGamma [-],\varDelta )\), parallel for \(\natural (\varDelta ,\varGamma [-])\). Moreover,

figure ae

Hence the claim follows for \(\lozenge (\varGamma [-];\varDelta )\) (parallel for \(\lozenge (\varDelta ;\varGamma [-])\)), and hence for arbitrary contexts. Since \({\underline{\sigma }}(\varGamma [\alpha ])\le {\overline{\sigma }}(\varDelta )\) and \({\underline{\sigma }}(\varGamma [\beta ])\le {\overline{\sigma }}(\varDelta )\) entail \({\underline{\sigma }}(\varGamma [\alpha ])\vee {\underline{\sigma }}(\varGamma [\beta ])\le {\overline{\sigma }}(\varDelta )\), it follows that \({\underline{\sigma }}(\varGamma [\alpha \vee \beta ])\le {\overline{\sigma }}(\varDelta )\).

\(\blacktriangleright \) (I\(\wedge \)) A parallel argument to (\(\vee \)I).

\(\blacktriangleright \) (inter1) We just consider the case on the left of \(\vdash \); the other case is parallel. Assume \(\varGamma [\lozenge (\varDelta ;\varPsi ),\varDelta ]\vdash \varXi \) and \( \varGamma [\lozenge (\varDelta ;\varPsi ),\varPsi ]\vdash \varXi \) are true in a model. Assume moreover that \(\delta ,\psi \) are formulas such that \({\underline{\sigma }}(\delta )={\underline{\sigma }}(\varDelta )\) and \({\underline{\sigma }}(\psi )={\underline{\sigma }}(\varPsi )\), which obviously exist. Because of soundness of \(\vee \)-rules we know that \(\varGamma [\lozenge (\varDelta ;\varPsi ),\delta \vee \psi ]\vdash \varXi \) is also true, and since in general \(a\Vert b\le a\vee b\), it follows that \({\underline{\sigma }}(\lozenge (\varDelta ;\varPsi ),\delta \vee \psi )={\underline{\sigma }}((\delta \Vert \psi )\wedge (\delta \vee \psi ))= {\underline{\sigma }}(\lozenge (\varDelta ;\varPsi ))\), hence \(\varGamma [\lozenge (\varDelta ;\varPsi )]\vdash \varXi \) is true as well.

\(\blacktriangleright \) (inter2) We just consider the case on the left of \(\vdash \); the other case is parallel. Assume \({\underline{\sigma }}(\varGamma [\varPsi ,\lozenge (\varDelta ;\varPsi ;\varDelta ')])\le a\) and \({\underline{\sigma }}(\varGamma [\lozenge (\varDelta ;\natural (\beta ,\varPsi );\varDelta ')])\le a\); moreover assume \({\underline{\sigma }}(\delta )={\underline{\sigma }}(\varDelta )=d\), \({\underline{\sigma }}(\delta ')={\underline{\sigma }}(\varDelta ')=d'\), \({\underline{\sigma }}(\psi )={\underline{\sigma }}(\varPsi )=p\), \({\underline{\sigma }}(\beta )=b\). By soundness of (\(\vee \)I) and other simple rules, it follows that \({\underline{\sigma }}(\varGamma [(\psi \wedge (\delta \Vert \psi \Vert \delta '))\vee (\delta \Vert (\beta \wedge \psi )\Vert \delta ')])\le a\), and

$$\begin{aligned}&{\underline{\sigma }}((\psi \wedge (\delta \Vert \psi \Vert \delta '))\vee (\delta \Vert (\beta \wedge \psi )\Vert \delta ')) \\&\quad =(p\wedge (d\Vert p\Vert d'))\vee (d\Vert (b\wedge p)\Vert d') \\&\quad =(p\vee (d\Vert (b\wedge p)\Vert d'))\wedge ((d\Vert p\Vert d')\vee (d\Vert (b\wedge p)\Vert d')) \end{aligned}$$

where obviously \(d\Vert p\Vert d'\le (p\vee (d\Vert (b\wedge p)\Vert d'))\) and \(d\Vert p\Vert d'\le (d\Vert p\Vert d')\vee (d\Vert (b\wedge p)\Vert d')\). Hence, as contexts cannot be negated (only formulas can), we have \({\underline{\sigma }}(\varGamma [\delta \Vert \psi \Vert \delta '])={\underline{\sigma }}(\varGamma [\lozenge (\varDelta ;\varPsi ;\varDelta ')])\le a\).

\(\blacktriangleright \) (cut) We use the well-known fact that in Boolean algebras, we have \(a\wedge \lnot b\le c\) iff \(a\le c\vee b\). Assume \(\varGamma [\alpha ]\vdash \varPsi \) and \(\varDelta \vdash \alpha ,\varTheta \) are true in a model, and let \(\theta \in \texttt {WFF}\) be a formula such that \({\overline{\sigma }}(\theta )={\overline{\sigma }}(\varTheta )\). Then \(\varDelta ,\lnot \theta \vdash \natural \alpha \) is true, and since contexts cannot be negated, so is \(\varGamma [\natural (\varDelta ,\lnot \theta )]\vdash \varPsi \) (by monotonicity). By Lemma 25, \(\varGamma [\varDelta ],\lnot \theta \vdash \varPsi \) remains true, and by Boolean laws, so is \(\varGamma [\varDelta ]\vdash \varPsi ,\theta \), where \(\theta \) can be again replaced by \(\varTheta \). \(\square \)

4.8 Completeness for \(\mathsf {AL}\)

We now present a standard algebraic completeness proof for \(\mathsf {AL}\) and \(\mathbf {UDA}\) via the Lindenbaum algebra for \(\mathsf {AL}\), denoted by \(\mathbf{Linda} \). Its carrier set M is the set of \(\mathsf {AL}\)-formulas modulo logical equivalence: we write \(\alpha \dashv \vdash \beta \) iff \(\Vdash _\mathsf {AL}\alpha \vdash \beta \) and \(\Vdash _\mathsf {AL}\beta \vdash \alpha \). This relation is symmetric by definition, and reflexive and transitive (by cut). We put \(\alpha _{\dashv \vdash }=\{\beta :\beta {\dashv \vdash }\alpha \}\), and \(M=\{\alpha _{\dashv \vdash }:\alpha \in \texttt {WFF}\}\). The next step is to show that \({\dashv \vdash }\) is not merely an equivalence relation, but a congruence with respect to the connectives.

Lemma 27

Assume \(\alpha _1{\dashv \vdash }\beta _1\), \(\alpha _2{\dashv \vdash }\beta _2\). Then for \(\star \in \{\wedge ,\vee ,\Vert \}\), \(\alpha _1\star \alpha _2{\dashv \vdash }\beta _1\star \beta _2\), and \(\lnot \alpha _1{\dashv \vdash }\lnot \beta _1\).


By cases; for the classical connectives, the standard proof applies; for \(\Vert \), the argument is no less straightforward. \(\square \)

Hence we can use the equivalence classes irrespective of representatives and define, for \(m,n\in M\):

  • \(m\wedge n=(\alpha \wedge \beta )_{\dashv \vdash }\), where \(\alpha \in m,\beta \in n\)

  • \(m\vee n=(\alpha \vee \beta )_{\dashv \vdash }\), where \(\alpha \in m,\beta \in n\)

  • \(m\Vert n=(\alpha \Vert \beta )_{\dashv \vdash }\), where \(\alpha \in m,\beta \in n\)

  • \({\sim }m=(\lnot \alpha )_{\dashv \vdash }\), where \(\alpha \in m\)

  • \(1=(p\vee \lnot p)_{\dashv \vdash }\), where \(p\in \textit{Var}\)

  • \(0=(p\wedge \lnot p)_{\dashv \vdash }\), where \(p\in \textit{Var}\)

Since our calculus subsumes the classical propositional calculus, the algebra \((M,\wedge ,\vee ,{\sim },0,1)\) is a Boolean algebra, where the relation \(\le \) coincides with \(\vdash \) (modulo equivalence). We prove this extension is a universal distribution algebra:

Lemma 28

\(\mathbf{Linda} =(M,\wedge ,\vee ,{\sim },\Vert ,0,1)\) is a universal distribution algebra.


\(\vdash \) corresponds to \(\le \), and \(=\) corresponds to \({\dashv \vdash }\). Hence each equality falls into two inequality subclaims, which we sometimes treat separately.

\((\Vert 1)\) i. \((a\Vert b)\wedge c\le (a\wedge c)\Vert (b\wedge c)\).

figure af

ii. \((a\wedge c)\Vert (b\wedge c)\le (a\Vert b)\wedge c\).

figure ag

(\(\Vert \)2) i. \(\lnot (a\Vert b)\le \lnot a\Vert \lnot b\). Straightforward; we abbreviate the proof:

figure ah

ii. \(\lnot a\Vert \lnot b\le \lnot (a\Vert b)\) is parallel.

(assoc) Straightforward.

(inf) Consider the following (abbreviated) proof for \(a\wedge b\le a\Vert b\):

figure ai

\(a\Vert b\le a\vee b\) can be proved similarly.

(mon) \(a\Vert b\le (a\vee c)\Vert (b\vee d)\) is easy to derive from \(a\vdash a\vee c\), \(b\vdash b\vee d\) and (I\(\lozenge \)I). \(\square \)
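The laws just established proof-theoretically can also be checked mechanically in a small concrete model. The following is a minimal sketch, assuming (as the Margin Lemma and the canonical construction of Definition 17 suggest) that the product \(\mathbf{B}\times \mathbf{B}\) of the two-element Boolean algebra, with pointwise Boolean operations and \((a_1,a_2)\Vert (b_1,b_2)=(a_1,b_2)\), is a universal distribution algebra; the script verifies (\(\Vert \)1), (\(\Vert \)2), (assoc), (inf) and (mon) by brute force:

```python
from itertools import product

B = (0, 1)                       # two-element Boolean algebra
M = list(product(B, B))          # carrier of the product algebra B x B

meet = lambda x, y: (x[0] & y[0], x[1] & y[1])   # pointwise conjunction
join = lambda x, y: (x[0] | y[0], x[1] | y[1])   # pointwise disjunction
neg  = lambda x: (1 - x[0], 1 - x[1])            # pointwise complement
par  = lambda x, y: (x[0], y[1])  # ||: left margin of x, right margin of y
leq  = lambda x, y: meet(x, y) == x              # the induced order

for a, b, c, d in product(M, repeat=4):
    assert meet(par(a, b), c) == par(meet(a, c), meet(b, c))          # (||1)
    assert neg(par(a, b)) == par(neg(a), neg(b))                      # (||2)
    assert par(par(a, b), c) == par(a, par(b, c))                     # (assoc)
    assert leq(meet(a, b), par(a, b)) and leq(par(a, b), join(a, b))  # (inf)
    assert leq(par(a, b), par(join(a, c), join(b, d)))                # (mon)

print("all UDA laws hold in B x B")
```

All quadruples pass, in line with Lemma 28; of course this checks only one concrete model, not validity in all of \(\mathbf {UDA}\).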

So we obtain a completeness result following the standard argument: if a sequent is valid in all universal distribution algebras, it is in particular valid in \(\mathbf{Linda} \), the quotient of the term algebra; hence it is derivable in the calculus. This proves part 1 of the following theorem; parts 2 and 3 follow by the equivalence of the equational theories.

Theorem 29

(Soundness and Completeness) 

  1. \(\mathbf {UDA}\models \varGamma \vdash \varDelta \) if and only if \(\Vdash _\mathsf {AL} \varGamma \vdash \varDelta \).

  2. \(\mathbf {SAA}\models \varGamma \vdash \varDelta \) if and only if \(\Vdash _\mathsf {AL} \varGamma \vdash \varDelta \).

  3. \(\mathbf {WAA}\models \varGamma \vdash \varDelta \) if and only if \(\Vdash _\mathsf {AL} \varGamma \vdash \varDelta \).

Note, by the way, that for completeness we need none of (inter1),(inter2),(\(\lozenge \)I),(I\(\lozenge \)); hence these rules are admissible, provided we have (cut). This shows that for a complete logic for \(\mathbf {UDA}\), we only need a slight extension of the classical calculus (denoted by \(\mathsf {CL}\)) with (I\(\lozenge \)I) and (\(\Vert \)I),(I\(\Vert \)).

Corollary 30

In \(\mathsf {AL} \), the rules (inter1),(inter2),(\(\lozenge \)I),(I\(\lozenge \)) are admissible.

Hence \(\mathsf {CL}\) with three additional rules is enough to be sound and complete for \(\mathbf {UDA}\). In the cut-free calculus, however, these admissible rules will be of crucial importance for ensuring congruence results, especially for the distributive laws.

Given the negative results we have obtained for our algebras, Theorem 29 is not a positive result for \(\mathsf {AL}\); on the contrary, it entails that \(\mathsf {AL}\) is inadequate for reasoning with ambiguity. The crucial property here is what we call congruence: if \(\varGamma [\varTheta ]\dashv \vdash \varDelta \) and \(\varTheta \dashv \vdash \varTheta '\), then \(\varGamma [\varTheta ']\dashv \vdash \varDelta \). This is the logical counterpart of congruence in algebra, and it is ensured by the rule (cut). If we omit this rule, we lose congruence, and more importantly, we can no longer derive the undesirable results which follow from Theorem 29. This is why the cut-free calculus \(\mathsf {AL}^\textit{cf}\) will be the focus of what follows.

5 Elementary Proof-Theory for \(\mathsf {AL}\) and \(\mathsf {AL}^{\textit{cf}}\)

5.1 Ambiguous and Classical Theorems

Our results on \(\mathbf {UDA}\) already entail some strong results for \(\mathsf {AL}\) (with cut). We start with the following result:

Lemma 31

Assume \(\alpha =\alpha _1\Vert \beta \Vert \alpha _2\), \(\gamma =\gamma _1\Vert \delta \Vert \gamma _2\), where \(\alpha _1,\alpha _2,\gamma _1,\gamma _2\) are formulas of classical logic. Then \(\Vdash _\mathsf {AL} \alpha \vdash \gamma \) if and only if \(\Vdash _\mathsf {CL} \alpha _1\vdash \gamma _1\) and \(\Vdash _\mathsf {CL} \alpha _2\vdash \gamma _2\).


If: Assume \(\Vdash _\mathsf {CL}\alpha _1\vdash \gamma _1\) and \(\Vdash _\mathsf {CL}\alpha _2\vdash \gamma _2\). Then in all Boolean algebras, under all interpretations, we have \({\underline{\sigma }}(\alpha _1)\le {\overline{\sigma }}(\gamma _1)\) etc. By the Margin Lemma, \({\underline{\sigma }}(\alpha )={\underline{\sigma }}(\alpha _1)\Vert {\underline{\sigma }}(\alpha _2)\), \({\overline{\sigma }}(\gamma )={\overline{\sigma }}(\gamma _1)\Vert {\overline{\sigma }}(\gamma _2)\). Hence by (mon), we have \({\underline{\sigma }}(\alpha )\le {\overline{\sigma }}(\gamma )\) for all \(\sigma \) and all universal distribution algebras. By completeness, the claim follows.

Only if: Contraposition: assume without loss of generality \(\not \Vdash _\mathsf {CL}\alpha _1\vdash \gamma _1\). Then there is a Boolean algebra B and interpretation \(\sigma \) with \({\underline{\sigma }}(\alpha _1)\not \le {\overline{\sigma }}(\gamma _1)\). Then we can construct the canonical UDA (see Definition 17) \(\mathbf{B }\times \mathbf{B }\), and by the definition of \(\le \) in canonical UDA, \({\underline{\sigma }}(\alpha _1\Vert \alpha _2)\not \le {\overline{\sigma }}(\gamma _1\Vert \gamma _2)\). By the Margin Lemma, \({\underline{\sigma }}(\alpha _1\Vert \alpha _2)={\underline{\sigma }}(\alpha )\), \({\overline{\sigma }}(\gamma )={\overline{\sigma }}(\gamma _1\Vert \gamma _2)\), hence \({\underline{\sigma }}(\alpha )\not \le {\overline{\sigma }}(\gamma )\), and by soundness, \(\not \Vdash _\mathsf {AL}\alpha \vdash \gamma \). \(\square \)
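Lemma 31 yields a concrete decision procedure for sequents between \(\Vert \)-formulas with classical margins: check the two classical entailments between the margins, for instance by truth tables. A minimal sketch in Python (the tuple encoding of formulas is our own, purely for illustration):

```python
from itertools import product

# classical formulas as nested tuples:
# ('var', name), ('not', f), ('and', f, g), ('or', f, g)
def ev(f, v):
    if f[0] == 'var': return v[f[1]]
    if f[0] == 'not': return not ev(f[1], v)
    if f[0] == 'and': return ev(f[1], v) and ev(f[2], v)
    if f[0] == 'or':  return ev(f[1], v) or ev(f[2], v)

def variables(f):
    return {f[1]} if f[0] == 'var' else set().union(*(variables(g) for g in f[1:]))

def cl_entails(a, g):
    # decides classical a |- g by checking every valuation
    vs = sorted(variables(a) | variables(g))
    return all(not ev(a, dict(zip(vs, bits))) or ev(g, dict(zip(vs, bits)))
               for bits in product([False, True], repeat=len(vs)))

def al_entails_margins(a1, a2, g1, g2):
    # Lemma 31: a1 || beta || a2 |- g1 || delta || g2 holds in AL
    # iff both margins entail classically
    return cl_entails(a1, g1) and cl_entails(a2, g2)

p, q, r = ('var', 'p'), ('var', 'q'), ('var', 'r')
print(al_entails_margins(('and', p, q), r, p, ('or', r, q)))  # prints True
print(al_entails_margins(p, p, q, p))                         # prints False
```

Note that the middle parts \(\beta ,\delta \) play no role, exactly as the Margin Lemma predicts.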

As a special case, we obtain the following, which is not trivial because of the presence of the cut rule!

Corollary 32

Let \(\varDelta ,\varGamma \) be multi-sequents which (1) do not contain any occurrence of \(\Vert \), (2) nor any occurrence of \(\lozenge \). Then \(\Vdash _\mathsf {AL} \varDelta \vdash \varGamma \) iff \(\Vdash _\mathsf {CL} \varDelta \vdash \varGamma \).


Combine the fact that \(\lozenge (\varDelta ;\varDelta )\dashv \vdash \varDelta \) with the previous lemma. \(\square \)

Hence, \(\mathsf {AL}\) is a conservative extension of classical logic. This tells us something about commutativity as well (these considerations are actually just the logical counterpart to what we already said about \(\mathbf {UDA}\)).

Convention We use 1 in proofs as a placeholder for an arbitrary theorem of classical logic, and 0 as a placeholder for an arbitrary contradiction of classical logic. It is important that 1 does not stand for one particular classical theorem, since in \(\mathsf {AL}^\textit{cf}\), not all classical theorems are interchangeable in proofs; the same goes for 0!

Now take the following rule:

figure aj

Lemma 33

Let \(\mathsf {AL} ^{\textit{comm}}\) be the calculus \(\mathsf {AL} \) (with cut) with the additional rule (\(\lozenge \)comm). For every \(\alpha \in \texttt {WFF}\), we have \(\Vdash _{\mathsf {AL} ^{\textit{comm}}}\vdash \alpha \); put differently: \(\mathsf {AL} ^{\textit{comm}}\) is inconsistent.


By completeness, we know that \(\Vdash _\mathsf {AL}\ \vdash \lozenge (1;0;0;1)\), and \(\Vdash _\mathsf {AL}\lozenge (0;1;1;0)\vdash \alpha \) for arbitrary \(\alpha \) (since \(\mathbf {UDA}\models 1=1\Vert 0\Vert 1\) etc.). With (\(\lozenge \)comm), we know that \(\Vdash _\mathsf {AL}\lozenge (1;0;0;1)\vdash \lozenge (0;1;1;0)\). Hence, by two applications of (cut), we derive \(\Vdash _\mathsf {AL}\lozenge (1;0;0;1)\vdash \alpha \) and \(\Vdash _\mathsf {AL}\ \vdash \alpha \). \(\square \)
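Spelled out, writing the structures \(\lozenge (\ldots )\) for their formula counterparts (e.g. \(1\Vert 0\Vert 0\Vert 1\)), the two cut applications are:

$$\begin{aligned} \frac{\lozenge (1;0;0;1)\vdash \lozenge (0;1;1;0)\qquad \lozenge (0;1;1;0)\vdash \alpha }{\lozenge (1;0;0;1)\vdash \alpha }\,(\text {cut}) \qquad \frac{\vdash \lozenge (1;0;0;1)\qquad \lozenge (1;0;0;1)\vdash \alpha }{\vdash \alpha }\,(\text {cut}) \end{aligned}$$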

Note that this result, as we explained above, disqualifies the calculus for our purposes. However, as we will argue more explicitly below, this only concerns the calculus \(\mathsf {AL}^\textit{comm}\) with (cut); without (cut), the rules of \(\mathsf {AL}^\textit{cf}\) do not seem to derive any counterintuitive sequents (this of course remains an open problem; it seems too early to make a definite claim), and \(\mathsf {AL}^{\textit{cf}+\textit{comm}}\) is provably consistent.

For the remainder of this article, we will therefore only consider the cut-free calculus \(\mathsf {AL}^\textit{cf}\).

5.2 Admissible Rules of \(\mathsf {AL}^\textit{cf}\) I: Structural Rules

5.2.1 Weakening in Classical Context

It is well-known that in the classical calculus with shared contexts, weakening is admissible (see Negri and Plato 2001). We slightly extend this result to our new calculus and multisequents. Recall that we write \(\Vdash \) for \(\Vdash _\mathsf {AL}\), \(\Vdash _\textit{cf}\) for \(\Vdash _{\mathsf {AL}^{\textit{cf}}}\).

Definition 34

We write \(\Vdash ^{n}\varGamma \vdash \varDelta \) if the longest branch from root to leaf in the shortest \(\mathsf {AL}\) proof tree of \(\varGamma \vdash \varDelta \) has length \(\le n\); same for \(\Vdash ^{n}_{\textit{cf}}\varGamma \vdash \varDelta \) [hence \(n\ge 1\); we skip the inductive definition for reasons of space, for background see (Negri and Plato 2001)].

The following is a standard lemma:

Lemma 35

Assume \(\Vdash _\textit{cf}^{n}\varGamma [\varDelta ]\vdash \varTheta [\varPsi ]\). Then \(\Vdash _\textit{cf}^{n}\varGamma [\natural (\varDelta ,\varXi )]\vdash \varTheta [\natural (\varPsi ,\varLambda )]\) for arbitrary \(\varXi ,\varLambda \).


Induction over n; the induction base is clear, given the way we formulated the axiom. So assume the claim holds for some \(n\in {\mathbb {N}}\). We can now make the usual case distinction as to the last rule applied in the derivation of the sequent. By the induction hypothesis, it is sufficient to show that the following rule (\(\natural \)weak) can be permuted with the rule applied immediately before it, thereby moving upward in the tree. As the argument is entirely standard (and takes pages if spelled out), we just illustrate it with one example:

figure ak

We can move the rule upward by re-arranging the derivation as follows:

figure al

Similar (and mostly much easier) arguments can be applied in all cases where weakening is applied in other positions, and the same holds for all other rules of the calculus. \(\square \)

5.2.2 More Distribution Rules of \(\mathsf {AL}^\textit{cf}\)

We prove the admissibility of some additional distribution rules, which in turn will be important to prove general invertibility and congruence results. Firstly, consider the following rules:

figure am

Lemma 36

(distr) and (subst) are admissible in \(\mathsf {AL} ^\textit{cf}\).


For (distr), we just use (\(\natural \)weak) to transform \(\varDelta \) and \(\varTheta _1\) to \(\natural (\varDelta ,\varTheta _1)\); same for \(\varPsi \) and \(\varTheta _2\). We can then use (\(\natural \)contr) to eliminate double occurrences of formulas in \(\varDelta ,\varTheta _1\) and \(\varPsi ,\varTheta _2\) respectively.

For (subst), consider the following proof:

figure an

\(\square \)

In fact, in the presence of (\(\natural \)contr) the two rules (distr) and (subst) are easily shown to be equivalent to (inter1),(inter2). They have the advantage of being conceptually slightly simpler, and we now see what their main use is: assume we have proofs \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [\lozenge (\alpha ;\beta ;\gamma )]\), \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [\delta ]\). Then obviously we can prove \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [(\alpha \Vert \beta \Vert \gamma )\wedge \delta ]\); to satisfy the distributive laws, we have to be able to prove \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [(\alpha \wedge \delta )\Vert (\beta \wedge \delta )\Vert (\gamma \wedge \delta )]\). Finally, to ensure invertibility, we have to be able to prove \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [\alpha \Vert (\beta \wedge \delta )\Vert \gamma ]\) etc. To prove this sequent from our premises, we need (inter2) (or alternatively, (subst)).

The main problem with (distr) and (subst) is that neither is invertible itself. In particular, the rule (subst) is very problematic for proof search, as the set of possible premises is infinite, but (contrary to (inter2)) the derivability of the conclusion does not guarantee the derivability of any particular premise. Moreover, we conjecture that the calculus with (subst) and (distr) instead of (inter1),(inter2) does not allow for the admissibility of (\(\natural \)contr), hence our formulation seems preferable. We will however use the rules (distr),(subst) from time to time if it makes proofs more perspicuous.

The following rules are slightly stronger inversions of the rule (distr).

figure ao

In these rules, we move a context out of the scope of an ambiguous context, for which we have to distinguish two cases (as ambiguity is—for now—not commutative). The following lemma shows that these rules are admissible in our calculus.

Lemma 37


  1. Assume \(\Vdash _\textit{cf}^{n} \varGamma [\lozenge (\natural (\varDelta ,\varTheta );\varPsi )]\vdash \varXi \). Then \(\Vdash _\textit{cf}^n \varGamma [\natural (\lozenge (\varDelta ;\varPsi ),\varTheta )]\vdash \varXi \).

  2. Assume \(\Vdash _\textit{cf}^{n} \varXi \vdash \varGamma [\lozenge (\natural (\varDelta ,\varTheta );\varPsi )]\). Then \(\Vdash _\textit{cf}^n \varXi \vdash \varGamma [\natural (\lozenge (\varDelta ;\varPsi ),\varTheta )]\).

This states the claim only for (distr1), but the (distr2) version is so closely parallel that we omit even its statement.


We only prove 1., as 2. is completely parallel. We make an induction over n: the induction base is clear, because (ax) is based on a single formula not within the scope of \(\lozenge \), hence this formula is not affected by the re-arrangement.

Now assume the claim holds for some \(n\in {\mathbb {N}}\). We prove it holds for \(n+1\) by case distinction as to which was the last rule applied in the derivation, followed by (distr1) or (distr2). For rules introducing connectives, this is a plain standard argument. Now assume we have a derivation

figure ap

Then we can also derive

figure aq

and by the n-admissibility of weakening, the derivation length does not increase, hence the claim follows in this case. (\(\lozenge \)I),(I\(\lozenge \)) are similar. (inter1) is straightforward: since (distr1) and (distr2) are weaker inverses of the rules (modulo weakening and contraction), they can easily be commuted. Exchanging (inter2) with a subsequent instance of (distr1) or (distr2) is likewise an easy exercise. \(\square \)

Corollary 38

The rules (distr1), (distr2) are admissible in \(\mathsf {AL} ^{\textit{cf}}\).

5.2.3 Contraction in Classical Context

We now consider contraction in classical context.

Lemma 39

If \(\Vdash ^{n}_\textit{cf}\varGamma [\natural (\varDelta ,\varDelta )]\vdash \varTheta \), then \(\Vdash _\textit{cf}^{n}\varGamma [\natural (\varDelta )]\vdash \varTheta \).


We make the usual induction over n, distinguishing cases according to the last rule applied in the proof. The classical rules do not pose problems, as the reductions are well known, and the rules (I\(\lozenge \)I),(\(\lozenge \)I),(I\(\lozenge \)) are obviously formulated in a way that makes contraction admissible. So we only consider some critical rules; moreover, we omit the symbol \(\vdash \) in proofs if the proof works equally on both sides.

figure ar

By the n-admissibility of weakening, this shortens the proof, hence the claim follows.

(inter1),(inter2) are also obviously formulated in a way such that any instance of them, followed by \((\natural \text {contr})\), can be easily commuted.

\(\Vert \)-introduction rules are unproblematic, because we use contraction of contexts rather than formulas: hence instead of contracting \(\alpha \Vert \beta \), we can equally well contract \(\lozenge (\alpha ;\beta )\). This finishes the proof, though of course we omit many unproblematic cases. \(\square \)

We omit the corresponding lemma for the right-hand side, as everything is completely parallel.

Corollary 40

(\(\natural \)contr) is admissible in \(\mathsf {AL} ^\textit{cf}\).

5.2.4 Expansion in Ambiguous Context

There is a dual rule to (\(\lozenge \)contr), expansion in ambiguous context:

figure as

This is again a shorthand for two rules, and in a sense a special case of weakening (which is not admissible in ambiguous context). It obviously corresponds, together with (\(\lozenge \)contr), to the idempotence of ambiguity.

Lemma 41

(\(\lozenge \)exp) is admissible in \(\mathsf {AL} ^{\textit{cf}}\).


We prove this once more by induction over derivations, distinguishing cases as to the previous rule. We have to take care of the induction hypothesis:

figure at

Consider \(\lnot \)-introduction rules:

figure au

This can be re-arranged to

figure av

This way, (\(\lozenge \)exp) can be discarded altogether. For all other rules, it is easy to see how (\(\lozenge \)exp) can be moved upwards in the proof. \(\square \)

5.2.5 Contraction in Ambiguous Context

The rule (\(\lozenge \)contr) is now easy to show admissible; in fact, it is even derivable by a sequence of (distr1) and (\(\natural \)contr); this is because we always have the empty context \(\natural ()\) at our disposal.

figure aw

Hence only one rule of \(\mathsf {AL}^{\textit{cf}}\) remains problematic for proof search, namely (inter2).

Lemma 42

(\(\lozenge \)contr) is admissible in \(\mathsf {AL} ^{\textit{cf}}\).

5.2.6 Invertibility

A crucial property of proof systems is the invertibility of their rules: if a rule derives \(\varGamma '\vdash \varDelta '\) from \(\varGamma \vdash \varDelta \), then derivability of the conclusion \(\varGamma '\vdash \varDelta '\) entails derivability of the premise \(\varGamma \vdash \varDelta \) (similarly for rules with several premises). Invertibility of rules is often straightforward to prove; for us, invertibility is the main reason we keep the problematic rule (inter2). We now present the results on invertibility:

Lemma 43

  (Invertibility Lemma)

  1. If \(\Vdash _{\textit{cf}}\varGamma [\alpha \Vert \beta ]\vdash \varDelta \), then \(\Vdash _{\textit{cf}}\varGamma [\lozenge (\alpha ;\beta )]\vdash \varDelta \).

  2. If \(\Vdash _{\textit{cf}}\varDelta \vdash \varGamma [\alpha \Vert \beta ]\), then \(\Vdash _{\textit{cf}}\varDelta \vdash \varGamma [\lozenge (\alpha ;\beta )]\).

  3. If \(\Vdash _{\textit{cf}}\varDelta \vdash \varGamma [\alpha \wedge \beta ]\), then \(\Vdash _{\textit{cf}}\varDelta \vdash \varGamma [\alpha ]\) and \(\Vdash _{\textit{cf}}\varDelta \vdash \varGamma [\beta ]\).

  4. If \(\Vdash _{\textit{cf}}\varGamma [\alpha \vee \beta ]\vdash \varDelta \), then \(\Vdash _{\textit{cf}}\varGamma [\alpha ]\vdash \varDelta \) and \(\Vdash _{\textit{cf}}\varGamma [\beta ]\vdash \varDelta \).

  5. If \(\Vdash _{\textit{cf}}\varDelta ,\lnot \alpha \vdash \varGamma \), then \(\Vdash _{\textit{cf}}\varDelta \vdash \varGamma ,\alpha \).

  6. If \(\Vdash _{\textit{cf}}\varDelta \vdash \varGamma ,\lnot \alpha \), then \(\Vdash _{\textit{cf}}\varDelta ,\alpha \vdash \varGamma \).


All claims are straightforward given the formulation of the rules. Formally, they can be proved by induction over proof length, exchanging the critical rule with the preceding ones, which works fine in all cases. \(\square \)

5.3 Admissible Rules II: Cut and Restricted Cut Rules

The following important result is actually straightforward to prove now:

Theorem 44

The rule (cut) is not admissible in \(\mathsf {AL} ^{\textit{cf}}\); put differently, there are sequents which are derivable in \(\mathsf {AL} \), but not in \(\mathsf {AL} ^{\textit{cf}}\).

Recall that we let 1 stand for an arbitrary classical tautology, and 0 for an arbitrary classical contradiction.


By completeness, we know that \(\Vdash _\mathsf {AL}1\vdash \lozenge (1; 0;1)\). This sequent is not derivable in \(\mathsf {AL}^{\textit{cf}}\): \(\mathsf {AL}^{\textit{cf}+\textit{comm}}\), which is \(\mathsf {AL}^\textit{cf}\) with (\(\lozenge \)comm), is a conservative extension of classical logic (see Lemma 49), and if the sequent were derivable, we would also be able to derive \(1\vdash \lozenge (0;1;0)\) and hence (using the usual methods) \(1\vdash 0\), a contradiction. \(\square \)

There is however a weaker cut rule which is admissible, namely cut where the cut-formula is unambiguous and in unambiguous context. This means we have a sort of “Boolean transitivity”. The argument for this is standard (reduction of cut-degree). The cases where this reduction procedure fails are (\(\Vert \)I),(I\(\Vert \)) applied inside the cut-formula, and a cut-formula in ambiguous context. But if we exclude these by definition, cut remains admissible:

figure ax

Lemma 45

In \(\mathsf {AL} ^{\textit{cf}}\), the rule (classic cut) is admissible.


Essentially, one can reproduce the classical proof of cut-elimination. \(\lozenge \) and \(\Vert \) are unproblematic in side formulas, the only place where they need to be considered. \(\square \)

This result is not as uninteresting as it might seem, given the importance \(\mathsf {AL}^{\textit{cf}}\) will have for us. There is another restricted cut rule which restricts not the cut-formula, but the context, namely to the identity context. This rule corresponds to transitivity of consequence and will accordingly be called (trans).

figure ay

Lemma 46

In \(\mathsf {AL} ^{\textit{cf}}\), the rule (trans) is not admissible.



figure az

Given this proof, we can assume as a special case that \(a=0\), \(b=1\):

figure ba

The sequent \(1\vdash 1\Vert 0\Vert 1\) is however not derivable in \(\mathsf {AL}^{\textit{cf}}\), for then the calculus with \((\lozenge \)comm) would be inconsistent (see Lemma 33). \(\square \)

Hence in general, \(\mathsf {AL}^{\textit{cf}}\) does not even allow for transitivity of inference; but for unambiguous formulas in unambiguous context (that is, not embedded within the scope of \(\lozenge \)), we can allow cut. Transitivity of inference thus holds in the following special case: if \(\Vdash _{\mathsf {AL}^\textit{cf}} \varGamma \vdash \alpha \), \(\Vdash _{\mathsf {AL}^\textit{cf}} \alpha \vdash \varDelta \), and \(\alpha \) is a formula of \(\mathsf {CL}\), then \(\Vdash _{\mathsf {AL}^\textit{cf}} \varGamma \vdash \varDelta \).

5.4 Decidability

Lemma 47

\(\mathsf {AL} \) is decidable, that is, we can decide whether \(\Vdash _\mathsf {AL} \varGamma \vdash \varDelta \) for arbitrary \(\varGamma ,\varDelta \).


This follows from completeness; in fact, Corollary 23 entails the stronger claim that the problem is NP-complete. \(\square \)

For \(\mathsf {AL}^{\textit{cf}}\), we leave this problem open:

Conjecture 1

\(\mathsf {AL} ^{\textit{cf}}\) is decidable, that is, we can decide whether \(\Vdash _{\mathsf {AL} ^{\textit{cf}}}\varGamma \vdash \varDelta \) for arbitrary \(\varGamma ,\varDelta \).

There is only one rule which remains problematic for proof search in the calculus, namely (inter2). However, we do not see how it could be dispensed with without losing invertibility, which is a crucial feature both for the proof theory and for the semantics of the cut-free calculus.

6 Cut-Free \(\mathsf {AL}^{\textit{cf}}\) and the Main Hypothesis

6.1 The Main Hypothesis

The results of the last sections strongly indicate that \(\mathsf {AL}\) with cut is not a good model for reasoning with ambiguity, despite the fact that all axioms of \(\mathbf {UDA}\) and all inference rules of \(\mathsf {AL}\) agree with our intuitions. How do these observations go together? As we said, logically speaking, the problem lies in the cut rule; algebraically speaking, the problem is that our semantics is congruent, that is, we can always substitute equivalents while preserving the truth of equalities. Being congruent is actually the core of being algebraic, so if we abandon this feature, we should be careful to motivate the move, explain what it means, and formalize the intuition. Firstly, we formulate our main hypothesis:

Conjecture 2

(Main hypothesis) Under the assumption of consistent usage, if a sequent \(\alpha \vdash \beta \) is derivable in \(\mathsf {AL} ^{\textit{cf}}\), then the inference is intuitively sound. Moreover, every intuitively sound inference with ambiguous propositions can be derived by cut-free \(\mathsf {AL} ^{\textit{cf}}\).

This is basically the main conceptual claim we make in this article, but we moderate it immediately: the hypothesis cannot be proved; it can only be falsified, by deriving some intuitively unsound sequent or by showing that some intuitively sound sequent is not derivable. So we use it as a benchmark, hoping that attempts to falsify it will further our understanding of reasoning with ambiguity. Our motivation for the hypothesis is mostly empirical, given the previous results and the fact that for all the counterintuitive results we considered, we actually need the cut rule to derive them.

How can we best explain the fact that the calculus closest to our intuition should be one without cut and without algebraic semantics (which obviously subsumes truth-functional semantics)? In our view, the main point is that ambiguity is something on the border between syntax and semantics. We have pointed out the parallelism between syntax and semantics, which is accounted for by universal distribution. Having the laws of universal distribution allows us to transfer ambiguity from syntax to semantics. Incongruence, on the other hand, is maybe the price we have to pay for this, as even semantically, there remains something syntactic to ambiguity: the syntactic form of formulas matters beyond mutual derivability, hence the same must hold for terms in semantics. This is exactly the core of incongruence: the fact that two formulas are inferentially equivalent (usually written \(\alpha \dashv \vdash \beta \)) does not entail that we can substitute one for the other in all contexts (which we will write \(\alpha \equiv \beta \)).

Incongruence is something which cannot be captured algebraically, hence we will have to look for an alternative semantics for \(\mathsf {AL}^{\textit{cf}}\). We will present a matrix-style semantics for the cut-free calculus, which is based on strings, where each string can be thought of as a sort of ambiguous normal form. This section is structured as follows: firstly, we explore the main hypothesis and sketch why we consider it plausible. Then we present the matrix semantics of \(\mathsf {AL}^{\textit{cf}}\) and prove its soundness and completeness. Finally, we consider what it means and what we can learn about reasoning with ambiguity.

6.2 Cut-Free \(\mathsf {AL}\): Evidence for the Main Hypothesis

The main hypothesis cannot be mathematically proved, but we can gather some support for it. The hypothesis falls into two parts, which we might call soundness and completeness. The soundness part states: if a sequent is derivable in \(\mathsf {AL}^{\textit{cf}}\), then it is intuitively correct. This part is easier to grasp: once we have an intuition of what multi-sequents and inference rules mean, we can just use the usual induction over rules. Since this might be considered unsatisfying in light of the counterintuitive results we obtained before, we will establish the following result. Recall that \(\mathsf {AL}^{\textit{comm}}\) is the calculus \(\mathsf {AL}\) enriched with the rule (\(\lozenge \)comm); we let \(\mathsf {AL}^{\textit{cf}+\textit{comm}}\) be the corresponding calculus without cut. As we showed before, \(\mathsf {AL}^{\textit{comm}}\) is inconsistent, that is, every sequent is derivable. Recall that we let 0, in the context of our logic, stand for an arbitrary classical contradiction. We now show the following:

Lemma 48

\(\mathsf {AL} ^{\textit{cf}+\textit{comm}}\) is consistent, that is, \(\not \Vdash _{\mathsf {AL} ^{\textit{cf}+\textit{comm}}}\ \vdash 0\).

The proof is straightforward: in the cut-free calculus, to derive 0 we can only use classical rules, since there is no way to eliminate ambiguity once it has been introduced. By the same argument, we can easily conclude the following:

Lemma 49

\(\mathsf {AL} ^{\textit{cf}+\textit{comm}}\) is a conservative extension of \(\mathsf {CL} \), that is, it derives the same sequents in the classical language.

It is hard to gather evidence for the “completeness direction” of the main hypothesis, namely that all valid inferences with ambiguous terms are derivable in \(\mathsf {AL}^\textit{cf}\). We can however show that a number of properties hold which we would like to hold; these mostly concern the congruence of formulas. Recall that formulas, as terms, have ambiguous normal forms, which however are not unique. We let \(\textit{anf}(\phi )\) denote the set of ambiguous normal forms of \(\phi \). In the following we consider formulas only up to bracketing for \(\Vert \); hence we treat all formulas of the form \(\alpha _1\Vert ...\Vert \alpha _i\), with arbitrary bracketing, as equivalent.

Definition 50

We define \(\textit{anf}(\phi )\) syntactically by

  • \(\textit{anf}(p)=\{p\}\), for \(p\in \textit{Var}\)

  • \(\textit{anf}(\lnot \phi )=\{(\lnot \alpha _1)\Vert ...\Vert (\lnot \alpha _i):\ \) \(\alpha _1\Vert ...\Vert \alpha _i\in \textit{anf}(\phi )\}\)

  • \(\textit{anf}(\phi \wedge \psi )=\)

    \(\quad \{\gamma _1\Vert ...\Vert \gamma _i :\ \) \((\exists ( \alpha _1\Vert ...\Vert \alpha _i)\in \textit{anf}(\phi )).(\forall j\in \{1,...,i\}).\gamma _j\in \textit{anf}(\alpha _j\wedge \psi )\} \)

       \(\cup \)\( \{\gamma _1\Vert ...\Vert \gamma _i :\ (\exists (\beta _1\Vert ...\Vert \beta _i)\in \textit{anf}(\psi )).(\forall j\in \{1,...,i\}).\gamma _j\in \textit{anf}(\phi \wedge \beta _j)\}\)

  • \(\textit{anf}(\phi \vee \psi )=\)

       \( \{\gamma _1\Vert ...\Vert \gamma _i :\ \) \((\exists ( \alpha _1\Vert ...\Vert \alpha _i)\in \textit{anf}(\phi )).(\forall j\in \{1,...,i\}).\gamma _j\in \textit{anf}(\alpha _j\vee \psi )\} \)

       \(\cup \)\( \{\gamma _1\Vert ...\Vert \gamma _i :\ (\exists (\beta _1\Vert ...\Vert \beta _i)\in \textit{anf}(\psi )).(\forall j\in \{1,...,i\}).\gamma _j\in \textit{anf}(\phi \vee \beta _j)\}\)

  • \(\textit{anf}(\phi \Vert \psi )=\{\alpha _1\Vert ...\Vert \alpha _i\Vert \beta _1\Vert ...\Vert \beta _j:\alpha _1\Vert ...\Vert \alpha _i\in \textit{anf}(\phi )\), \(\beta _1\Vert ...\Vert \beta _j\in \textit{anf}(\psi )\}\)

It is easy to see that \(\alpha \in \textit{anf}(\beta )\) for some \(\beta \) if and only if \(\alpha =\alpha _1\Vert ...\Vert \alpha _i\), where \(\alpha _1,...,\alpha _i\) are classical. Note that we define this concept by iterating distribution rules on formulas, hence in particular without reference to any proof theory or semantics. Hence \((p\wedge r)\Vert (q\wedge r)\in \textit{anf}((p\Vert q)\wedge r)\), but \((r\wedge p)\Vert (q\wedge r)\notin \textit{anf}((p\Vert q)\wedge r)\), since \(\wedge \)-commutation is not part of the definition!
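Since Definition 50 is purely syntactic, it can be implemented directly. The following sketch is our own illustration, not part of the original text: formulas are encoded as nested tuples (a hypothetical convention), and each ambiguous normal form is represented as a tuple of classical formulas, i.e. already flattened with respect to \(\Vert \):

```python
from itertools import product

def is_classical(phi):
    """True iff phi contains no ambiguity connective ('amb' encodes ||)."""
    if phi[0] == 'var':
        return True
    if phi[0] == 'amb':
        return False
    return all(is_classical(sub) for sub in phi[1:])

def anf(phi):
    """Set of ambiguous normal forms of phi, each given as a tuple
    of classical formulas (the ||-disjuncts, flattened)."""
    if is_classical(phi):
        return {(phi,)}
    op = phi[0]
    if op == 'amb':                       # anf(phi||psi): concatenate disjuncts
        return {a + b for a in anf(phi[1]) for b in anf(phi[2])}
    if op == 'not':                       # push negation into each disjunct
        return {tuple(('not', a) for a in alts) for alts in anf(phi[1])}
    # op is 'and' or 'or': distribute over whichever operand is ambiguous
    left, right = phi[1], phi[2]
    result = set()
    if not is_classical(left):
        for alts in anf(left):
            for combo in product(*(anf((op, a, right)) for a in alts)):
                result.add(tuple(g for part in combo for g in part))
    if not is_classical(right):
        for alts in anf(right):
            for combo in product(*(anf((op, left, b)) for b in alts)):
                result.add(tuple(g for part in combo for g in part))
    return result

p, q, r = ('var', 'p'), ('var', 'q'), ('var', 'r')
print(anf(('and', ('amb', p, q), r)))   # single normal form: (p&r)||(q&r)
```

Running it on \((p\Vert q)\wedge r\) yields exactly one normal form, \((p\wedge r)\Vert (q\wedge r)\); in particular, \((r\wedge p)\Vert (q\wedge r)\) is not among the results, reflecting that \(\wedge \)-commutation is not part of the definition.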

We now come to another crucial concept. In a cut-free calculus, we have to distinguish two concepts: one is \(\vdash \), together with mutual derivability \(\dashv \vdash \). The other (and probably more important) concept is the following:

Definition 51

We write \(\alpha \leqq \beta \) iff i. \(\Vdash _\textit{cf}\varGamma [\beta ]\vdash \varDelta \) entails \(\Vdash _\textit{cf}\varGamma [\alpha ]\vdash \varDelta \) and ii. \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\alpha ]\) entails \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\beta ]\). We write \(\alpha \equiv \beta \) iff \(\alpha \leqq \beta \) and \(\beta \leqq \alpha \).

In a cut-free calculus, \(\equiv \) is the largest congruence on formulas which respects derivable sequents. Note that both \(\leqq \) and \(\equiv \) are transitive relations, and \(\equiv \) is an equivalence relation. What is particularly interesting for our “completeness direction” of the main hypothesis is not derivability but rather congruence of certain formulas; hence \(\leqq \) is more interesting than \(\vdash \).Footnote 12 We will also need the following: we write \(\varTheta \equiv _l\varTheta '\) if \(\Vdash _\textit{cf}\varGamma [\varTheta ]\vdash \varDelta \) iff \(\Vdash _\textit{cf}\varGamma [\varTheta ']\vdash \varDelta \), and \(\varTheta \equiv _r\varTheta '\) if \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\varTheta ]\) iff \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\varTheta ']\). The following is obvious yet important:

Lemma 52

For every multi-sequent \(\varGamma \), there are formulas \(\gamma _l,\gamma _r\), such that \(\gamma _l\equiv _l\varGamma \) and \(\gamma _r\equiv _r\varGamma \).


For one direction, use the introduction rules; for the other, their invertibility. \(\square \)

The following lemmas are particularly important:

Lemma 53

In \(\mathsf {AL} ^{\textit{cf}}\), it holds that \(\phi \wedge \chi \leqq \phi \Vert \chi \leqq \phi \vee \chi \).


We only prove \(\phi \wedge \chi \leqq \phi \Vert \chi \); the proof for \(\phi \Vert \chi \leqq \phi \vee \chi \) is exactly parallel. The proof has two subparts.

i. \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\phi \wedge \chi ]\) implies \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\phi \Vert \chi ]\).

By \((\lozenge \)exp), \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\phi \wedge \chi ]\) implies \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\lozenge (\phi \wedge \chi ;\phi \wedge \chi )]\). By the invertibility of (I\(\wedge \)), it follows that \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\lozenge (\phi ;\chi )]\), hence \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\phi \Vert \chi ]\).

ii. \(\Vdash _\textit{cf}\varGamma [\phi \Vert \chi ]\vdash \varDelta \) implies \(\Vdash _\textit{cf}\varGamma [\phi \wedge \chi ]\vdash \varDelta \).

\(\Vdash _\textit{cf}\varGamma [\phi \Vert \chi ]\vdash \varDelta \) implies \(\Vdash _\textit{cf}\varGamma [\lozenge (\phi ;\chi )]\vdash \varDelta \) by Lemma 43. Now consider the following proof:

figure bb

\(\square \)

Hence we preserve the properties of inference between ambiguous and unambiguous formulas (as stated by (inf) in \(\mathbf {UDA}\)).

Lemma 54

For all formulas \(\alpha \), we have \(\alpha \equiv \lnot \lnot \alpha \).


This is easiest to prove by induction over proof length; the exact claim is: there is a proof of length n of \(\varGamma [\alpha ]\vdash \varDelta \) if and only if there is a proof of length \(\le n+2\) of \(\varGamma [\lnot \lnot \alpha ]\vdash \varDelta \). The induction base is clear, and the induction step, distinguishing the last rule applied, is straightforward given the invertibility lemma. \(\square \)

Lemma 55

In \(\mathsf {AL} ^\textit{cf}\), for arbitrary formulas \(\alpha ,\beta ,\gamma \), we have

  1.

    \(\lnot (\alpha \Vert \beta )\equiv \lnot \alpha \Vert \lnot \beta \)

  2.

    \(\alpha \wedge (\beta \Vert \gamma )\equiv (\alpha \wedge \beta )\Vert (\alpha \wedge \gamma )\)

  3.

    \(\alpha \vee (\beta \Vert \gamma )\equiv (\alpha \vee \beta )\Vert (\alpha \vee \gamma ) \)


1. \(\lnot \)

\(\geqq \) Assume we have a proof of \(\varGamma [\lnot (\alpha \Vert \beta )]\vdash \varDelta \). Then at some point, the negation was introduced by \((\lnot \)I) in the subproof(s). Hence we had one (or several) proofs

figure bc

By invertibility, we also have a proof for \(\varPsi \vdash \varLambda ,\lozenge (\gamma _1;...;\gamma _i;\alpha ;\beta ;\gamma _{i+1};...;\gamma _n)\); hence we can prove

figure bd

From here, the proof can proceed as before, as we do not make any reference to formula structure in the calculus. Same on the right-hand side.

\(\leqq \) Assume we have \(\varGamma [\lnot \alpha \Vert \lnot \beta ]\vdash \varDelta \). This case is more complicated, since we have to distinguish cases according to the rule by which the formula was introduced.

Case 1: We had a proof

figure be

This case is easy: we rearrange the proof to

figure bf

Same on the right-hand side.

Case 2: We had a proof

figure bg

Then by invertibility, we have \(\Vdash _\textit{cf}\varGamma \vdash \varDelta ',\alpha \) and \(\Vdash _\textit{cf}\varGamma \vdash \varDelta '',\beta \). Now we can use (I\(\lozenge \)) to get a proof

figure bh

Here the claim follows, and parallel on the right hand side.

Case 3: We had a proof

figure bi

Then by invertibility we have \(\Vdash _\textit{cf}\varGamma '\vdash \varDelta ,\alpha \) and \(\Vdash _\textit{cf}\varGamma ''\vdash \varDelta ,\beta \). Now we can apply (I\(\lozenge \)) to obtain \(\Vdash _\textit{cf}\varGamma \vdash \varDelta ,\lozenge (\alpha ;\beta )\), and we proceed in the obvious way.

2. \(\wedge \)

\(\leqq \) Assume we have a proof of \(\varGamma [\alpha \wedge (\beta \Vert \gamma )]\vdash \varDelta \). Then we have \(\Vdash _\textit{cf}\varGamma [\natural (\alpha ,\lozenge (\beta ;\gamma ))]\vdash \varDelta \). Now we can use the (admissible) rule (distr) to derive

$$\begin{aligned} \Vdash _\textit{cf}\varGamma [\lozenge (\natural (\alpha ,\beta );\natural (\alpha ,\gamma ))]\vdash \varDelta \end{aligned}$$

and the claim follows.

For the right hand side, assume \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [(\alpha \wedge \beta )\Vert (\alpha \wedge \gamma )]\), whence \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\lozenge (\alpha ;\alpha )]\) and \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\lozenge (\beta ;\gamma )]\). By (\(\lozenge \)contr) we obtain \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\alpha ]\), by (\(\Vert \)I) \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\beta \Vert \gamma ]\), and with (I\(\wedge \)) we derive \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\alpha \wedge (\beta \Vert \gamma )]\).

\(\geqq \) Assume \(\Vdash _\textit{cf}\varGamma [(\alpha \wedge \beta )\Vert (\alpha \wedge \gamma )]\vdash \varDelta \). Then by invertibility, we have \(\Vdash _\textit{cf}\varGamma [\lozenge (\natural (\alpha ,\beta );\natural (\alpha ,\gamma ))]\vdash \varDelta \). We use the admissibility of (distr1) and (distr2): we derive \(\Vdash _\textit{cf}\varGamma [\lozenge (\beta ;\natural (\alpha ,\gamma )),\alpha ]\vdash \varDelta \) and \(\Vdash _\textit{cf}\varGamma [\lozenge (\beta ;\gamma ),\alpha ,\alpha ]\vdash \varDelta \). With (\(\natural \)contr), we can derive \(\Vdash _\textit{cf}\varGamma [\lozenge (\beta ;\gamma ),\alpha ]\vdash \varDelta \), hence the claim follows.

For the right hand side, assume \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\alpha \wedge (\beta \Vert \gamma )]\). Then we have \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\alpha ]\), \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\lozenge (\beta ;\gamma )]\). Then we can use (subst) and (I\(\wedge \)) to derive \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\lozenge (\alpha \wedge \beta ;\gamma )]\) and then \(\Vdash _\textit{cf}\varDelta \vdash \varGamma [\lozenge (\alpha \wedge \beta ;\alpha \wedge \gamma )]\).

3. \(\vee \) Parallel to \(\wedge \). \(\square \)

To prove the following crucial lemma, we need to define a slightly odd measure of formula complexity \(c:\texttt {WFF}\rightarrow {\mathbb {N}}\).

  • \(c(p)=3\), for \(p\in \textit{Var}\)

  • \(c(\alpha \Vert \beta )=c(\alpha )+c(\beta )+1\)

  • \(c(\lnot \alpha )=2\cdot c(\alpha )+1\)

  • \(c(\alpha \wedge \beta )=c(\alpha )\cdot c(\beta )+1\)

  • \(c(\alpha \vee \beta )=c(\alpha )\cdot c(\beta )+1\)

This accounts for the fact that distributing \(\lnot ,\wedge ,\vee \) over \(\Vert \) can significantly increase formula complexity in terms of the number both of connectives and of variables. Nonetheless, simple arithmetic tells us that \(c((\alpha \wedge \gamma )\Vert (\beta \wedge \gamma ))< c((\alpha \Vert \beta )\wedge \gamma )\) etc. An easy induction then yields that

$$\begin{aligned} \text {for all }\alpha '\in \textit{anf}(\alpha )\text {, we have }c(\alpha ')\le c(\alpha ) \end{aligned}$$

This is important in the following proof, where we often substitute formulas by their ambiguous normal forms.
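The measure c can be checked mechanically. The following sketch is our own illustration (formulas again encoded as nested tuples, a hypothetical convention); it verifies the strict inequality for a concrete instance, and also that \(\lnot \)-distribution preserves c exactly, which is why the displayed claim is stated with \(\le \):

```python
def c(phi):
    """The complexity measure c; variables get base value 3 (assumed here,
    so that distributing 'and'/'or' over || strictly decreases c)."""
    op = phi[0]
    if op == 'var':
        return 3
    if op == 'amb':                            # ||
        return c(phi[1]) + c(phi[2]) + 1
    if op == 'not':
        return 2 * c(phi[1]) + 1
    return c(phi[1]) * c(phi[2]) + 1           # 'and' / 'or'

p, q, r = ('var', 'p'), ('var', 'q'), ('var', 'r')
left  = c(('amb', ('and', p, r), ('and', q, r)))   # c((p&r)||(q&r)) = 21
right = c(('and', ('amb', p, q), r))               # c((p||q)&r)     = 22
assert left < right
# distributing negation over || leaves c unchanged:
assert c(('not', ('amb', p, q))) == c(('amb', ('not', p), ('not', q)))
```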

Lemma 56

(ANF Lemma) In \(\mathsf {AL} ^{\textit{cf}}\), for all \(\phi \in \texttt {WFF}\) and all \(\chi \in \textit{anf}(\phi )\) we have \(\phi \equiv \chi \).


Induction over \(c(\phi )\). The base case is clear, since formulas of minimal complexity are exactly the variables. Assume the claim holds for all formulas with complexity \(\le n\), and the complexity of \(\phi \) is \(n+1\).

  1.

    \(\phi =\alpha \Vert \beta \) Assume \(\varGamma [\alpha \Vert \beta ]\vdash \varDelta \). Then \(\varGamma [\lozenge (\alpha ;\beta )]\vdash \varDelta \). By induction hypothesis, \(\varGamma [\lozenge (\alpha ';\beta ')]\vdash \varDelta \) for all \(\alpha '\in \textit{anf}(\alpha ),\beta '\in \textit{anf}(\beta )\). By definition, every \(\gamma \in \textit{anf}(\alpha \Vert \beta )\) has the form \(\alpha '\Vert \beta '\), with \(\alpha '\in \textit{anf}(\alpha )\), \(\beta '\in \textit{anf}(\beta )\). Invert the argument for the other direction, and the same on the right-hand side.

  2.

    \(\phi =\lnot \alpha \) Assume \(\varGamma [\lnot \alpha ]\vdash \varDelta \). Case 1: \(\alpha \) does not contain \(\Vert \). Then \(\lnot \alpha \) is its only ambiguous normal form and the claim follows. Case 2: \(\alpha \) contains \(\Vert \). Then we trace back in the proof to the point where the negation was introduced; there, by induction hypothesis, we can replace \(\alpha \) by \(\alpha '\in \textit{anf}(\alpha )\), where \(\alpha '=\alpha _1\Vert \alpha _2\). Hence for arbitrary \(\alpha _1\Vert \alpha _2\in \textit{anf}(\alpha )\), we have \(\varGamma [\lnot (\alpha _1\Vert \alpha _2)]\vdash \varDelta \). By the previous lemma, we then have \(\varGamma [\lnot \alpha _1\Vert \lnot \alpha _2]\vdash \varDelta \); by invertibility, we have \(\varGamma [\lozenge (\lnot \alpha _1;\lnot \alpha _2)]\vdash \varDelta \). We can apply the induction hypothesis to \(\lnot \alpha _1,\lnot \alpha _2\) because of (25). Invert the argument for the other direction, and similarly on the right-hand side.

  3.

    \(\phi =\alpha \wedge \beta \) Assume \(\varGamma [\alpha \wedge \beta ]\vdash \varDelta \). Then \(\varGamma [\natural (\alpha ,\beta )]\vdash \varDelta \). By induction hypothesis, we can substitute and obtain \(\varGamma [\natural (\alpha ',\beta ')]\vdash \varDelta \) for arbitrary \(\alpha '\in \textit{anf}(\alpha )\), \(\beta '\in \textit{anf}(\beta )\).

Case 1 Assume \(\alpha '=\alpha '_1\Vert \alpha _2'\). Then by the previous lemma, \(\varGamma [\lozenge (\alpha '_1\wedge \beta ';\alpha _2'\wedge \beta ')]\vdash \varDelta \). (25) then entails that \(c(\alpha '_1\wedge \beta ')\le n\) and \(c(\alpha _2'\wedge \beta ')\le n\), hence by induction hypothesis they can be replaced by any of their ambiguous normal forms.

Case 2 Assume \(\beta '=\beta '_1\Vert \beta '_2\). This is parallel to Case 1; of course, the two cases can overlap (they are not exclusive).

Case 3 \(\alpha ,\beta \) are both classical. Then \(\textit{anf}(\alpha \wedge \beta )=\{\alpha \wedge \beta \}\), hence this case is trivial.

This covers all ambiguous normal forms; and the argument works equally in the other direction.

For the right hand side, assume \(\varDelta \vdash \varGamma [\alpha \wedge \beta ]\). Then we have \(\varDelta \vdash \varGamma [\alpha ]\), \(\varDelta \vdash \varGamma [\beta ]\), hence \(\varDelta \vdash \varGamma [\alpha ']\), \(\varDelta \vdash \varGamma [\beta ']\) for arbitrary \(\alpha '\in \textit{anf}(\alpha )\), \(\beta '\in \textit{anf}(\beta )\). Then we use Lemma 55 and the claim follows easily from the induction hypothesis. The argument works in the other direction as well.

4. \(\phi =\alpha \vee \beta \) Parallel. \(\square \)

Hence every formula is congruent to all of its ambiguous normal forms, as it should be by universal distribution. This means among other things that if \(\textit{anf}(\alpha )\cap \textit{anf}(\beta )\ne \emptyset \ne \textit{anf}(\beta )\cap \textit{anf}(\gamma )\), then \(\alpha \equiv \gamma \), as congruence is an equivalence relation (hence transitive).

With the rules we have established, it is easy to derive the following law of disambiguation: for all i, j with \(1\le j\le i\), \(\Vdash _{\mathsf {AL}^{\textit{cf}}}(\phi _1\Vert ...\Vert \phi _i)\wedge \lnot \phi _j\vdash \phi _1\Vert ...\Vert \phi _{j-1}\Vert \phi _{j+1}\Vert ...\Vert \phi _i\). We would like to show a stronger result, namely that this can be strengthened to \(\leqq \), which is to say that disambiguation can be applied in arbitrary contexts. This corresponds to the following rules of left/right disambiguation, which we will prove to be admissible:

figure bj

These rules are not dual to each other: (disamb I) introduces arbitrary contexts, while (I disamb) eliminates classical formulas. Each of the two has a separate dual, which we omit as it is less immediately related to disambiguation. To prove their admissibility we need two auxiliary lemmas.

Lemma 57

Assume \(\xi \) is a formula of classical logic.

  1.

    If \(\Vdash _\textit{cf}\xi \vdash \) and \(\Vdash _\textit{cf}\varGamma \vdash \varDelta ,\xi \), then \(\Vdash _\textit{cf}\varGamma \vdash \varDelta \).

  2.

    If \(\Vdash _\textit{cf}\vdash \xi \) and \(\Vdash _\textit{cf}\varGamma ,\xi \vdash \varDelta \), then \(\Vdash _\textit{cf}\varGamma \vdash \varDelta \).

Hence we can eliminate classical contradictions on the right, classical theorems on the left of \(\vdash \). This will also have some relevance in matrix semantics.


Just consider that (26) and (27) are both instances of the admissible rule (classic cut), with the order of premises exchanged.

figure bk

\(\square \)

Lemma 58


  1.

    Assume \(\Vdash _\textit{cf}\varGamma [\lozenge (\varDelta ;\varDelta ')]\vdash \varTheta \) and \(\Vdash _\textit{cf}\varXi \vdash \). Then \(\Vdash _\textit{cf}\varGamma [\lozenge (\varDelta ;\varXi ;\varDelta ')]\vdash \varTheta \)

  2.

    Assume \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [\lozenge (\varDelta ;\varDelta ')]\) and \(\Vdash _\textit{cf}\vdash \varXi \). Then \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [\lozenge (\varDelta ;\varXi ;\varDelta ')]\)


We make an induction over n in \(\Vdash ^n_\textit{cf}\), simultaneously for both 1. and 2., so we cover negation rules. The most difficult case is the base case: assume \(\Vdash _\textit{cf}^1\alpha ,\varGamma \vdash \alpha ,\varTheta \).

1.: Case 1 Assume \(\lozenge (\varDelta ;\varDelta ')\) is a subterm of \(\varGamma \). As \(\varGamma \) is arbitrary, the claim follows immediately.

Case 2 Assume \(\varDelta :=\alpha \) (hence \(\varDelta '=\natural ()\)). Then we prove

figure bl

Case 3 \(\alpha =\varDelta '\) is parallel.

2. is parallel.

Now we come to the induction step. Assume the claim holds for some n; we make a case distinction as to the last rule applied. Basically all cases are straightforward except for (\(\lnot \)I) and (I\(\lnot \)), if they are applied to the context \(\lozenge (\varDelta ;\varDelta ')\). However, here the claim follows from the fact that we perform the induction simultaneously for both claims, and a contradiction is the negation of a theorem and vice versa. \(\square \)

Corollary 59

(disamb I) is admissible in \(\mathsf {AL} ^\textit{cf}\).


If \(\Vdash _\textit{cf}\varGamma [\lozenge (\varDelta _1;\varDelta _2)]\vdash \varTheta \) and \(\Vdash _\textit{cf}\varDelta ,{\overline{\varDelta }}\vdash \) hold, then \(\Vdash _\textit{cf}\varGamma [\lozenge (\varDelta _1;\natural (\varDelta ,{\overline{\varDelta }});\varDelta _2)]\vdash \varTheta \) by Lemma 58, and by iterated (distr1),(distr2), we obtain \(\Vdash _\textit{cf}\varGamma [\lozenge (\varDelta _1;\varDelta ;\varDelta _2),{\overline{\varDelta }}]\vdash \varTheta \). \(\square \)

Corollary 60

(I disamb) is admissible in \(\mathsf {AL} ^\textit{cf}\).


Consider the following proofs:

figure bm

We can then prove \(\varTheta \vdash \varGamma [\lozenge (\varDelta _1;\varDelta _2)],\alpha \wedge {\overline{\alpha }}\); since \(\Vdash _\textit{cf}\alpha \wedge {\overline{\alpha }}\vdash \) by one premise, and \(\alpha \wedge {\overline{\alpha }}\) is classical, it follows by Lemma 57 that \(\Vdash _\textit{cf}\varTheta \vdash \varGamma [\lozenge (\varDelta _1;\varDelta _2)]\). \(\square \)

Corollary 61

For all i, j with \(1\le j\le i\), where \(\phi _j\) is classical, \((\phi _1\Vert ...\Vert \phi _i)\wedge \lnot \phi _j\leqq \phi _1\Vert ...\Vert \phi _{j-1}\Vert \phi _{j+1}\Vert ...\Vert \phi _i\)


To prove that \(\varGamma [\phi _1\Vert ...\Vert \phi _{j-1}\Vert \phi _{j+1}\Vert ...\Vert \phi _i]\vdash \varTheta \) entails \(\varGamma [(\phi _1\Vert ...\Vert \phi _i)\wedge \lnot \phi _j]\vdash \varTheta \), use invertibility of (\(\Vert \)I) and (disamb I).

To prove that \(\varTheta \vdash \varGamma [(\phi _1\Vert ...\Vert \phi _i)\wedge \lnot \phi _j]\) entails \(\varTheta \vdash \varGamma [\phi _1\Vert ...\Vert \phi _{j-1}\Vert \phi _{j+1}\Vert ...\Vert \phi _i]\), use invertibility of (I\(\Vert \)) and (I disamb). \(\square \)

There is a dual result stating that \(\phi _1\Vert ...\Vert \phi _{j-1}\Vert \phi _{j+1}\Vert ...\Vert \phi _i\leqq (\phi _1\Vert ...\Vert \phi _i)\vee \lnot \phi _j\), provable by the dual rules of the ones we presented. This would be co-disambiguation. Of course, there are many more rules one could consider interesting and important, but we think that beyond these crucial ones, the choice would become arbitrary. What we consider most important is that for many critical properties of ambiguity, mostly regarding distributive laws, we actually have congruence of formulas representing (arguably) congruent meanings.

We hope that these results will have convinced the reader that \(\mathsf {AL}^{\textit{cf}}\) is a powerful logic for reasoning with ambiguity. Having established these results, we will now provide it with a rather simple and natural semantics. This will of course not be algebraic or set-theoretic, as these are excluded by incongruence. It could rather be qualified as language-theoretic, as we interpret formulas and sequents as (sets of) strings, where every string corresponds to an ambiguous normal form.

6.3 Matrix Semantics for \(\mathsf {AL}^\textit{cf}\)

We now present a semantics for cut-free \(\mathsf {AL}^{\textit{cf}}\), which is based on matrix semantics; we adapt the definition of a Gentzen matrix from Galatos et al. (2007, chapter 7). We define an ambiguity matrix as a structure \((\mathbf{A },\preceq )\), where \(\mathbf{A }=(A,\wedge ,\vee ,{\sim },0,1)\) is an arbitrary algebra (hence not necessarily a Boolean algebra!) of the signature of Boolean algebras, and \(\preceq \subseteq A^{*}\times A^{*}\), where \(A^{*}\) denotes the set of finite strings over A, including the empty string \(\epsilon \); \(A^+\) denotes the set of non-empty strings. In this section, we will use the convention that the letters \(a,b,c,d,\ldots \) stand for letters in A, whereas \(u,v,w,x,y,z\) stand for strings in \(A^{*}\). Before we define \(\preceq \), we have to introduce some important shorthands: for \(w,v\in A^{+}\), \(a\in A\), the expressions \(w\wedge a\), \(w\vee a\), \(w\wedge v\), \(w\vee v\), \({\sim }w\) are not defined: importantly, the pseudo-Boolean operations are only defined for letters in A, not for strings! But we still use terms over strings as shorthand, via the following string abbreviations:

$$\begin{aligned}&(b_1...b_i)\wedge a:=(b_1\wedge a)...(b_i\wedge a) \qquad a\wedge (b_1...b_i):=(a\wedge b_1)...(a\wedge b_i) \\&(b_1...b_i)\vee a:=(b_1\vee a)...(b_i\vee a) \qquad a\vee (b_1...b_i):=(a\vee b_1)...(a\vee b_i) \\&w\wedge v:=\underset{w=w_1w_2}{\bigcup }\{(w_1\wedge v)(w_2\wedge v)\}\cup \underset{v=v_1v_2}{\bigcup }\{(w\wedge v_1)(w\wedge v_2)\} \\&w\vee v:=\underset{w=w_1w_2}{\bigcup }\{(w_1\vee v)(w_2\vee v)\}\cup \underset{v=v_1v_2}{\bigcup }\{(w\vee v_1)(w\vee v_2)\} \\&{\sim }(a_1...a_i):=({\sim }a_1)...({\sim }a_i) \end{aligned}$$

The definitions of \(w\wedge v\) and \(w\vee v\) require that \(w_1,w_2,v_1,v_2\in A^+\); this ensures that at some point we will have a term which is well-defined by lines one and two, like \(w\wedge a\) etc. Hence \(w\vee v\) and \(w\wedge v\) are defined inductively. The representation axioms below will ensure that all members of these sets are congruent, that is, exchangeable in all contexts. As a result, we can use all pseudo-Boolean operations on arbitrary words in \(A^{+}\); but it is important to keep in mind that they are abbreviations for operations which are defined only for letters in A itself, and operations are not necessarily Boolean. The relation \(\preceq \) has to satisfy a number of conditions, which are as follows:
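The inductive character of these abbreviations can be made concrete. The following sketch is our own illustration (letters are arbitrary Python values, and a letter-level meet \(a\wedge b\) is encoded as a tuple, a hypothetical convention); it enumerates the set of strings abbreviated by \(w\wedge v\):

```python
def wedge(w, v):
    """All strings abbreviated by the term w 'wedge' v, for nonempty
    tuples w, v of letters; a∧b on letters is encoded as ('and', a, b)."""
    if len(w) == 1:                      # a ∧ (b1...bi) := (a∧b1)...(a∧bi)
        return {tuple(('and', w[0], b) for b in v)}
    if len(v) == 1:                      # (b1...bi) ∧ a := (b1∧a)...(bi∧a)
        return {tuple(('and', a, v[0]) for a in w)}
    out = set()
    for i in range(1, len(w)):           # splits w = w1 w2
        out |= {x + y for x in wedge(w[:i], v) for y in wedge(w[i:], v)}
    for j in range(1, len(v)):           # splits v = v1 v2
        out |= {x + y for x in wedge(w, v[:j]) for y in wedge(w, v[j:])}
    return out

print(wedge(('a', 'b'), ('c', 'd')))     # two congruent strings of length 4
```

For w = ab and v = cd this yields exactly two strings, \((a\wedge c)(a\wedge d)(b\wedge c)(b\wedge d)\) and \((a\wedge c)(b\wedge c)(a\wedge d)(b\wedge d)\), depending on whether w or v is split first; the representation axioms below make all of them congruent.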


  • M1. For all \(a\in A\), \(a\preceq a\)

  • M2. \(x\preceq 1\), \(0\preceq x\)

  • M3. If \(x\preceq z\) and \(y\preceq u\), then \(xy\preceq zu\)

  • \(\vee \)

  • M4. \(xwy\preceq z\) and \(xvy\preceq z\) if and only if \(x(w\vee v)y\preceq z\)

  • M5. If \(z\preceq xwy\), then \(z\preceq x(w\vee v)y\)

  • M6. If \(z\preceq xvy\), then \(z\preceq x(w\vee v)y\)

  • assoc\(\vee \). \(z\preceq x(v_1\vee (v_2\vee v_3))y\) iff \(z\preceq x((v_1\vee v_2)\vee v_3 )y\)

  • comm\(\vee \). \(z\preceq w(x\vee y)v\) iff \(z\preceq w(y\vee x)v\)

  • id\(\vee \). If \(x\preceq y(w\vee w)z\), then \(x\preceq ywz\)

  • \(\wedge \)

  • M7. \(z\preceq xwy\) and \(z\preceq xvy\) if and only if \(z\preceq x(w\wedge v)y\)

  • M8. If \(xwy\preceq z\), then \(x(w\wedge v)y\preceq z\)

  • M9. If \(xwy\preceq z\), then \(x(v\wedge w)y\preceq z\)

  • \(\wedge \) assoc. \(x(v_1\wedge (v_2\wedge v_3))y\preceq z\) iff \(x((v_1\wedge v_2)\wedge v_3 )y\preceq z\)

  • \(\wedge \) comm. \(w(x\wedge y)v\preceq z\) iff \(w(y\wedge x)v\preceq z\)

  • \(\wedge \) id. If \(x(w\wedge w)y\preceq z\), then \(xwy\preceq z\)

  • (\({\sim }\))

  • M10. If \(w\wedge {\sim }v\preceq u\vee v\), then \(w\preceq u\vee v\) and \(w\wedge {\sim }v\preceq u\)

  • M11. If \(w\wedge v\preceq u\vee {\sim }v\), then \(w\wedge v\preceq u \) and \(w\preceq u\vee {\sim }v\)

  • (Representation)

figure bn
  • (Boolean distribution)

  • BD1. \(w(x\wedge (y\vee z))v\preceq u\) iff \(w((x\wedge y)\vee (x\wedge z))v\preceq u\)

  • BD2. \(u\preceq w(x\wedge (y\vee z))v\) iff \(u\preceq w((x\wedge y)\vee (x\wedge z))v\)

  • (Double negation)

  • DN1. \(wxv\preceq y\) iff \(w({\sim }{\sim }x)v\preceq y\)

  • DN2. \(y\preceq wxv\) iff \(y\preceq w({\sim }{\sim }x)v\)

  • 1l. If \(w(a\wedge 1)v\preceq z\), then \(wav\preceq z\)

  • 0r. If \(z\preceq w(a\vee 0)v\) then \(z\preceq wav\)

We denote the class of ambiguity matrices which satisfy the above requirements by \(\mathbf{AM }\). It is important to underline that an ambiguity matrix \((\mathbf{A },\preceq )\) is not an algebra; \(\mathbf{A }\) is an algebra, and \(\preceq \subseteq A^*\times A^*\) is a relation between strings of terms. It is of course easy to see that concatenation corresponds to ambiguity. A comment on the representation axioms may be in order: as we have said, a term like \(w\wedge v\) is just an abbreviation for a set of strings. The representation axioms are needed to ensure that all strings abbreviated by a term are exchangeable in all contexts (we will come to this below). Note that in all ambiguity matrices,

id. \(xwwy\preceq z\) iff \(xwy\preceq z\)

is derivable from these axioms (same on the right): assume \(xwwy\preceq z\). Then \(x(w\vee w)(w\vee w)y\preceq z\) (by M4.), hence \(x((ww)\vee w)y\preceq z\) (by notation); hence (again by M4.) \(xwy\preceq z\). For the other direction, assume \(xwy\preceq z\), hence \(x(w\wedge (ww))y\preceq z\), where \(w\wedge (ww)\) is an abbreviation for \((w\wedge w)(w\wedge w)\); hence \(x(w\wedge w)(w\wedge w)y\preceq z\), and so \(xwwy\preceq z\). The main use of 1l. is that it ensures that

empty left. \(w\preceq v\) iff \(1\preceq v\vee {\sim }w\)

The left-to-right direction is clear, since \(w\preceq v\) entails \(1\wedge w\preceq v\vee {\sim }w\) (M5., M8.), which entails \(1\preceq v\vee {\sim }w\) (M11.). Conversely, \(1\preceq v\vee {\sim }w\) entails \(1 \wedge w\preceq v\) (M8., M11.), and by 1l., \(w\preceq v\). Hence 1l. allows us to simulate an empty left-hand side; the same holds for 0r. on the right. Keep in mind that the terms we write are just abbreviations for strings, and operations involving the empty string \(\epsilon \) are undefined! We have a (somewhat sloppy) correspondence of strings with ambiguous normal forms on the one side, and of proof rules with conditions on \(\preceq \) on the other. The only rule we are missing is (cut), and in fact it would correspond to a very peculiar property of the ambiguity matrix which does not hold in general:

(strong transitivity) If \(y'\preceq y\), \(xyz\preceq u\), then \(xy'z\preceq u\).

We will refer to this property below in Theorem 75. But we first come to an important definition.

Definition 62

(Congruence in \({{\varvec{AM}}}\)) We say wv are congruent and write \(w\asymp v\) iff for all \(x,y,z\in A^*\), \([xwy\preceq z\) \(\Leftrightarrow \) \(xvy\preceq z]\) and \([z\preceq xwy\) \(\Leftrightarrow \) \(z\preceq xvy]\).

This is an important relation, as it does not coincide with the symmetric closure of \(\preceq \), which we denote by \(\simeq \); in (incongruent) ambiguity matrices, \(\simeq \) is not a congruence on \(A^{*}\)! Note that matrices are more general than \(\mathbf {UDA}\) for exactly this reason. In matrices, strings are less interesting than their congruence classes; we define \(w_\asymp =\{x:x\asymp w\}\), and \(A^*_\asymp =\{w_\asymp :w\in A^*\}\). We will show that the pseudo-Boolean operations can be applied to congruence classes; hence \(A^*_\asymp \) forms an algebra, which we will describe in Sect. 6.6. We now provide some examples of ambiguity matrices.

Example 1

Let \(\mathbf{A }_1\) be an arbitrary algebra of the signature of Boolean algebras, and put \(\preceq _1=A^*\times A^*\). Then \((\mathbf{A }_1,\preceq _1)\) is a matrix: it clearly satisfies the base conditions M1.-M3., and moreover all other conditions, which have the form of implications whose consequents are always true in our example. This is the matrix in which every sequent is valid and none is falsified. If we take the set of \(\asymp _1\)-congruence classes (there is just one), this matrix becomes the trivial commutative universal distribution algebra.

Example 2

Let \(\mathbf{A }_2\) be the two element Boolean algebra, hence \(A_2=\{0,1\}\). The smallest matrix relation \(\preceq _2\) for this algebra is the following: we have \(awb\preceq _2cvd\) iff \(a\le c\), \(b\le d\) (read \(\le \) as on natural numbers). The reason is that \(1001\asymp _2 (10)\vee {\sim }(10)\) and \(0110\asymp _2 (10)\wedge {\sim }(10)\), and we have idempotence. Hence the Margin Lemma applies to this matrix \((\mathbf{A }_2,\preceq _2)\), and it is easy to show that the algebra of its \(\asymp _2\)-congruence classes is a universal distribution algebra. This example shows that making the underlying algebra \(\mathbf{A }\) a Boolean algebra already heavily restricts the possibilities of the matrix relation; this is the reason we allow for arbitrary algebras. A more general result on the effect of the underlying algebra will be presented in Theorem 75.
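The margin relation of Example 2 lends itself to a brute-force check. The following sketch is our own illustration, restricted to condition M4 with empty outer context and strings of length at most 3; letters are the integers 0, 1, and the letter-level join is Boolean or:

```python
from itertools import product

def preceq(w, v):
    """The margin relation of Example 2: awb ⪯ cvd iff a<=c and b<=d."""
    return w[0] <= v[0] and w[-1] <= v[-1]

def vee(w, v):
    """All strings abbreviated by w 'vee' v over the two-element algebra,
    with the join of letters computed as int bitwise or."""
    if len(w) == 1:
        return {tuple(w[0] | b for b in v)}
    if len(v) == 1:
        return {tuple(a | v[0] for a in w)}
    out = set()
    for i in range(1, len(w)):
        out |= {x + y for x in vee(w[:i], v) for y in vee(w[i:], v)}
    for j in range(1, len(v)):
        out |= {x + y for x in vee(w, v[:j]) for y in vee(w, v[j:])}
    return out

strings = [s for n in (1, 2, 3) for s in product((0, 1), repeat=n)]
# M4 with empty outer context: w ⪯ z and v ⪯ z iff u ⪯ z holds
# uniformly for every string u abbreviated by w ∨ v:
for w, v, z in product(strings, repeat=3):
    assert (preceq(w, z) and preceq(v, z)) == all(preceq(u, z) for u in vee(w, v))
print("M4 holds on all strings up to length 3")
```

The check succeeds because every string abbreviated by \(w\vee v\) has first letter \(w_1\vee v_1\) and last letter \(w_n\vee v_m\), so only the margins matter, exactly as the relation \(\preceq _2\) prescribes.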

Example 3

Let G be an arbitrary set and \(A=term(G)\) the set of all Boolean terms over G. We let \(\mathbf{A }_3\) be the term algebra, that is, the algebra where every term denotes itself and nothing else. We let \(\preceq _3\) denote the smallest relation such that for every \(a\in A\), \(w\in A^*\), we have \(a\preceq _3a\), \(w\preceq _3 1\), \(0\preceq _3 w\), and all other matrix axioms (which have the form of implications) are satisfied, and nothing else. This is a well-formed inductive definition, and it defines the free matrix generated by G. Whereas Example 1 was a matrix making every sequent valid, \((\mathbf{A }_3,\preceq _3)\) is a matrix making only those sequents valid which are valid in every matrix containing G. It is actually not difficult to show that the formula-matrix, which we define for our completeness proof below, is equivalent to the matrix generated by \(G=\textit{Var}\). If we take the set of \(\asymp _3\)-congruence classes, the result is not a universal distribution algebra, provided \(|G|\ge 2\); this follows from Theorem 75.

Informally, Theorem 75 gives a number of equivalent conditions under which a matrix \((\mathbf{A },\preceq )\) is equivalent to a universal distribution algebra (in a sense to be made precise), among which are: it satisfies (strong transitivity), and the algebra of its congruence classes is Boolean. We present this result later on, since it is much easier to prove with the following completeness result.

6.4 Matrix Interpretation of \(\mathsf {AL}^\textit{cf}\)

Given a matrix \((\mathbf{A },\preceq )\), there are several possible interpretations of \(\mathsf {AL}^{\textit{cf}}\). Importantly, we will interpret variables as \(\asymp \)-congruence classes of letters, not as strings: interpreting variables as letters or strings leads to problems, since for example \(w\wedge v\) does not represent a unique string, so the interpretation would become non-functional. We thus take a map \(\sigma :\textit{Var}\rightarrow A_\asymp \) and extend it canonically to \({\underline{\sigma }},{\overline{\sigma }}:\texttt {WFF}\rightarrow A^*_\asymp \). To this end, we need to define the operations \(\vee ,\wedge ,{\sim }\) for congruence classes, and the same for concatenation, which we then denote by \(\cdot \):

$$\begin{aligned} w_\asymp \wedge v_\asymp :=(w\wedge v)_\asymp \qquad w_\asymp \vee v_\asymp :=(w\vee v)_\asymp \qquad {\sim }(w_\asymp ):=({\sim }w)_\asymp \qquad w_\asymp \cdot v_\asymp :=(wv)_\asymp \end{aligned}$$

The result is always a congruence class (see below). As usual, \(\cdot \) will often be omitted, and we often write strings as arbitrary representatives of their congruence class (justified by Lemma 65).

We consider the interpretation where atomic formulas are mapped to congruence classes of single letters, that is \(\sigma (p)=a_\asymp \) for some a, and call this interpretation unambiguous. This is however not necessary; we could also assume that a formula like p is interpreted as ambiguous (i.e. as a string). Given \(\sigma :\textit{Var}\rightarrow A\), we now define matrix interpretations:

[Figure omitted: the inductive definition of the matrix interpretations \({\underline{\sigma }}\) and \({\overline{\sigma }}\).]

Note that by the string abbreviations, it follows immediately that

$$\begin{aligned} {\underline{\sigma }}((\alpha \Vert \beta )\wedge \gamma )= & {} {\underline{\sigma }}((\alpha \wedge \gamma )\Vert (\beta \wedge \gamma )) \end{aligned}$$
$$\begin{aligned} {\underline{\sigma }}((\alpha \Vert \beta )\vee \gamma )= & {} {\underline{\sigma }}((\alpha \vee \gamma )\Vert (\beta \vee \gamma )) \end{aligned}$$
$$\begin{aligned} {\underline{\sigma }}(\lnot (\alpha \Vert \beta ))= & {} {\underline{\sigma }}((\lnot \alpha )\Vert (\lnot \beta )) \end{aligned}$$

Same for \({\overline{\sigma }}\). Hence the distributive laws are already implicit in interpretations!
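The letterwise abbreviations can be made concrete in a small executable sketch. This is our own toy model, not the paper's formalism: ambiguous interpretations are tuples of Boolean terms, \(\Vert \) is concatenation, and \(\wedge ,\vee ,{\sim }\) act letterwise; the constructor names are invented for the illustration. The three identities above then hold by construction:

```python
# Toy sketch (invented for illustration): ambiguous meanings as tuples of
# Boolean terms, || as concatenation, lattice operations taken letterwise.

def amb(*strings):
    """|| on interpretations: concatenation of tuples."""
    out = ()
    for s in strings:
        out += s
    return out

def conj(w, t):
    """(a1 ... an) AND t, letterwise: (a1 AND t) ... (an AND t)."""
    return tuple(("and", a, t) for a in w)

def disj(w, t):
    """(a1 ... an) OR t, letterwise."""
    return tuple(("or", a, t) for a in w)

def neg(w):
    """NOT (a1 ... an), letterwise: (NOT a1) ... (NOT an)."""
    return tuple(("not", a) for a in w)

# single letters as one-element strings
p, q = ("p",), ("q",)

# (p || q) AND r  ==  (p AND r) || (q AND r)
assert conj(amb(p, q), "r") == amb(conj(p, "r"), conj(q, "r"))
# (p || q) OR r   ==  (p OR r) || (q OR r)
assert disj(amb(p, q), "r") == amb(disj(p, "r"), disj(q, "r"))
# NOT (p || q)    ==  (NOT p) || (NOT q)
assert neg(amb(p, q)) == amb(neg(p), neg(q))
```

Since distribution is built into the data representation, the distributive laws cannot fail here, which is exactly the point of the remark above.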

Definition 63

(Truth and validity in ambiguity matrices) 

  • We say a sequent \(\varGamma \vdash \varDelta \) is true in a matrix \(({{\varvec{A}}},\preceq )\) under interpretation \(\sigma \), in symbols \(({{\varvec{A}}},\preceq ),\sigma \models \varGamma \vdash \varDelta \), if \({\underline{\sigma }}(\varGamma )\preceq {\overline{\sigma }}(\varDelta )\).

  • We say \(\varGamma \vdash \varDelta \) is valid in a matrix \(({{\varvec{A}}},\preceq )\), in symbols \(({{\varvec{A}}},\preceq )\models \varGamma \vdash \varDelta \), if for every \(\sigma :\textit{Var}\rightarrow A\), we have \(({{\varvec{A}}},\preceq ),\sigma \models \varGamma \vdash \varDelta \).

  • We say \(\varGamma \vdash \varDelta \) is valid, in symbols \({{\varvec{AM}}}\models \varGamma \vdash \varDelta \), if \(\varGamma \vdash \varDelta \) is valid in all ambiguity matrices.

We get a neat semantics, where all forms of ambiguity are interpreted simply as concatenation within words. We now have to prove some properties of matrices and interpretations:

Lemma 64

In all ambiguity matrices,

  1. \((wxv)\wedge (wyv)\preceq z\) iff \(w(x\wedge y)v\preceq z\)

  2. \(z\preceq (wxv)\vee (wyv)\) iff \(z\preceq w(x\vee y)v\)


We only prove 1.; 2. is clearly parallel.

If: We write the axiom we use in each line, and the line to which it applies; if no line is indicated, it refers to the previous. “Notation” means the line is just a notational variant of the premise.

[Figure omitted: the derivation for the if-direction.]

Only if:

[Figure omitted: the derivation for the only-if-direction.]

\(\square \)

We now address the problem of applying operations to \(\asymp \)-equivalence classes; recall that \(w_\asymp =\{x:x\asymp w\}\). What is important is that the result is independent of the choice of the members of the class; this is ensured by the following lemma:

Lemma 65


  1. \((w\wedge v)_\asymp =(w'\wedge v')_\asymp \) for arbitrary \(w'\asymp w\), \(v'\asymp v\)

  2. \((w\vee v)_\asymp =(w'\vee v')_\asymp \) for arbitrary \(w'\asymp w\), \(v'\asymp v\)

  3. \(({\sim }w)_\asymp =({\sim }w')_\asymp \) for arbitrary \(w'\asymp w\)

  4. \((wv)_\asymp = (w'v')_\asymp \) for arbitrary \(w'\asymp w\), \(v'\asymp v\)


1. On the left of \(\preceq \):

[Figure omitted: the derivation for claim 1, left of \(\preceq \).]

On the right of \(\preceq \): Assume \(z\preceq x(w\wedge v)y\). Then \(z\preceq xwy\) and \(z\preceq xvy\), hence \(z\preceq xw'y\) and \(z\preceq xv'y\), so \(z\preceq x(w'\wedge v')y\).

2. Parallel.

3. Assume \(x({\sim }w)y\preceq z\). Then \(1\preceq z\vee {\sim }x({\sim }{\sim }w){\sim }y\) (empty left.), hence \(1\preceq z\vee ({\sim }x)w({\sim }y)\) (DN2.), so \(1\wedge {\sim }z\preceq ({\sim }x)w({\sim }y)\) (M10.), hence \(1\wedge {\sim }z\preceq ({\sim }x)w'({\sim }y)\) (as \(w\asymp w'\)), and inverting back again, \(x({\sim }w')y\preceq z\).

Parallel on the right-hand side.

4. Assume \(xwvy\preceq z\). Then \(xw'vy\preceq z\) and so \(xw'v'y\preceq z\).

Parallel on the right-hand side. \(\square \)

This lemma is important, because it shows that for \(\asymp \)-congruence classes \(M,N\), we can choose arbitrary representatives \(w\in M\), \(v\in N\) to define classes \(M\wedge N\) etc. Algebraically speaking, operations are independent of representatives. In the sequel, we will consider strings only up to \(\asymp \)-congruence, writing things like \(w\wedge v=u\) as a shorthand for both \(w_\asymp \wedge v_\asymp =u_\asymp \) and \(w\wedge v\asymp u\).
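The content of the lemma can be illustrated with a deliberately simple toy congruence (our own invention, not the paper's relation \(\asymp \)): conjunctions of atoms up to associativity, commutativity and idempotence, with classes given by canonical forms. The operation on classes, computed from arbitrary representatives, is then well-defined:

```python
# Toy illustration of representative-independence: the congruence here is
# invented (conjunctions of atoms up to associativity, commutativity,
# idempotence), classes are represented by canonical forms.

def canon(term):
    """Canonical form: the frozenset of atoms occurring in a conjunction."""
    if isinstance(term, str):
        return frozenset([term])
    op, left, right = term
    assert op == "and"
    return canon(left) | canon(right)

def class_and(rep_w, rep_v):
    """AND on congruence classes, computed from arbitrary representatives."""
    return canon(("and", rep_w, rep_v))

w  = ("and", "p", "q")   # p AND q
w2 = ("and", "q", "p")   # q AND p -- a congruent representative
v  = "r"

# the class of the result does not depend on which representative we chose
assert class_and(w, v) == class_and(w2, v)
```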

6.5 Soundness and Completeness

Soundness First, we start with another auxiliary lemma:

Lemma 66

For all interpretations \(\sigma \), contexts \(\varGamma [-]\), formulas \(\delta ,\delta '\), there are \(w,v\) such that

$$\begin{aligned} \qquad {\overline{\sigma }}(\varGamma [\delta ])= & {} wxv_\asymp \text {, }{\overline{\sigma }}(\varGamma [\delta '])=wx'v_\asymp \text {, and }{\overline{\sigma }}(\varGamma [\delta \wedge \delta '])=w(x\wedge x')v_\asymp \end{aligned}$$
$$\begin{aligned} {\underline{\sigma }}(\varGamma [\delta ])= & {} wxv_\asymp \text {, }{\underline{\sigma }}(\varGamma [\delta '])=wx'v_\asymp \text {, and }{\underline{\sigma }}(\varGamma [\delta \vee \delta '])=w(x\vee x')v_\asymp \end{aligned}$$


We just prove the first equation; the second is parallel. By induction over \(\varGamma [-]\). The induction base (\(\varGamma [\alpha ]=\alpha \)) is clear, with \(w=v=\epsilon \).

Assume it holds for \(\varGamma [-]\), and consider \(\natural (\varGamma [-],\varDelta )\). By the distribution laws we have \({\overline{\sigma }}(\natural (\varGamma [\delta ],\varDelta ))=(wxv)\vee {\overline{\sigma }}(\varDelta )=(w\vee {\overline{\sigma }}(\varDelta ))(x\vee {\overline{\sigma }}(\varDelta ))(v\vee {\overline{\sigma }}(\varDelta ))\); same for \({\overline{\sigma }}(\natural (\varGamma [\delta '],\varDelta ))=(w\vee {\overline{\sigma }}(\varDelta ))(x'\vee {\overline{\sigma }}(\varDelta ))(v\vee {\overline{\sigma }}(\varDelta ))\). Moreover, we have \({\overline{\sigma }}(\natural (\varGamma [\delta \wedge \delta '],\varDelta ))=(w\vee {\overline{\sigma }}(\varDelta ))((x\wedge x')\vee {\overline{\sigma }}(\varDelta ))(v\vee {\overline{\sigma }}(\varDelta ))\). As we interpret into \(\asymp \)-equivalence classes and we have \((x\wedge x')\vee {\overline{\sigma }}(\varDelta )\asymp (x\vee {\overline{\sigma }}(\varDelta ))\wedge (x'\vee {\overline{\sigma }}(\varDelta ))\), we have \({\overline{\sigma }}(\natural (\varGamma [\delta \wedge \delta '],\varDelta ))=w'((x\vee {\overline{\sigma }}(\varDelta ))\wedge (x'\vee {\overline{\sigma }}(\varDelta )))v'\), where \(w'=w\vee {\overline{\sigma }}(\varDelta )\) and \(v'=v\vee {\overline{\sigma }}(\varDelta )\). Parallel for \(\natural (\varDelta ,\varGamma [-])\).

Now take \(\lozenge (\varGamma [-];\varDelta )\). This is straightforward, as \({\overline{\sigma }}(\lozenge (\varGamma [\delta ];\varDelta ))={\overline{\sigma }}(\varGamma [\delta ])v\) for some v independent of \(\delta \). Same for \(\lozenge (\varDelta ;\varGamma [-])\). \(\square \)

Lemma 67

If \(\Vdash _{\mathsf {AL} ^{\textit{cf}}}\varGamma \vdash \varDelta \), then \({\mathbf{AM }}\models \varGamma \vdash \varDelta \).


We perform the usual induction over the rules, where (ax) can be seen as the induction base. We need not consider the admissible rules, but we treat some of them anyway, since their soundness is useful in the soundness proofs for other rules.

(ax) Obviously sound by the interpretation; we always have \(a\preceq a\), and so \(w\wedge a\preceq a\vee v\) for all \(w,v\in A^{*}\).

(\(\wedge \)I),(I\(\vee \)) Interpretation of sequents remains identical.

(I\(\wedge \)) Sound by Lemma 66, and M7.

(\(\vee \)I) Parallel with M8.

(\(\natural \)comm) Application on the left-hand side is sound by \(\wedge \) comm, on the right-hand side by comm\(\vee \).

(\(\natural \)weak) (admissible) Clear by \(\vee \), \(\wedge \) introduction: we can thereby immediately weaken by single letters. Moreover, weakening by words is nothing but repeated weakening by letters, hence the claim follows.

(\(\natural \)contr) (admissible) We have \(w\wedge w\asymp w\asymp w\vee w\).

(I\(\lozenge \)I) Assume \(w\wedge v\preceq x\vee y\), and \(z\wedge v\preceq u\vee y\). Then by M3. \((w\wedge v)(z\wedge v)\preceq (x\vee y)(u\vee y)\). Now by the definition of \(\wedge ,\vee \) on strings, we have \((w\wedge v)(z\wedge v)=(wz)\wedge v\), and \((x\vee y)(u\vee y)=(xu)\vee y\).

(I\(\lozenge \)) Assume \(w\preceq x\vee y\vee z\), \(w\preceq x'\vee y'\vee z\). Then we have \(w\wedge {\sim }x \preceq y\vee z\) and \(w\wedge {\sim }x'\preceq y'\vee z\). Then by soundness of (I\(\lozenge \)I), we have \(w\wedge (({\sim }x)({\sim }x'))\preceq (yy')\vee z\). Hence, we have \(w\preceq {\sim }(({\sim }x)({\sim }x'))\vee (yy')\vee z\), where \({\sim }(({\sim }x)({\sim }x'))=xx'\) by notation and DN1.

(\(\lozenge \)I) Parallel.

(\(\Vert \)I),(I\(\Vert \)) are clear (interpretation remains identical).

(\(\lozenge \)assoc) is clear (interpretation remains identical).

(inter1) Assume we apply the rule on the left. We then have \(x((wv)\wedge w)y\preceq z\), \(x((wv)\wedge v)y\preceq z\). Hence we have \(x(((wv)\wedge w)\vee ((wv)\wedge v))y\preceq z\) by M4., and by Boolean distribution BD1., \(x((wv)\wedge (w\vee v))y=x(w\wedge (w\vee v))(v\wedge (w\vee v))y=x((w\wedge w)\vee (w\wedge v))((v\wedge w)\vee (v\wedge v))y\preceq z\), and by using M4. in the other direction, \(x(w\wedge w)(v\wedge v)y\preceq z\), which holds iff \(xwvy\preceq z\). Hence the rule is sound on the left; parallel on the right.

(subst) (admissible) Instead of contexts, we use formulas with the same interpretation. Assume \({\underline{\sigma }}(\varGamma [\alpha ])\preceq w\), \({\underline{\sigma }}(\varGamma [\lozenge (\beta _1;...;\beta _j;...;\beta _i)])\preceq w\). By soundness of (\(\vee \)I), \({\underline{\sigma }}(\varGamma [\lozenge (\beta _1\vee \alpha ;...;\beta _j\vee \alpha ;...;\beta _i\vee \alpha )])=u(x_1\vee y)...(x_j\vee y)...(x_i\vee y)z\preceq w\). By applying M7. repeatedly, we obtain \(ux_1...y...x_iz\preceq w\), where \(ux_1...y...x_iz={\underline{\sigma }}(\varGamma [\lozenge (\beta _1;...;\alpha ;...;\beta _i)])\) by Lemma 66. Parallel on the right-hand side.

(inter2) Assume \({\underline{\sigma }}(\varGamma [\beta ,\lozenge (\alpha ;\beta ;\gamma )])\preceq w\), \({\underline{\sigma }}(\varGamma [\lozenge (\alpha ;\natural (\beta ,\beta ');\gamma )])\preceq w\). We then have, by the soundness of weakening, \({\underline{\sigma }}(\varGamma [\lozenge (\alpha ;\natural (\beta ,\beta ');\gamma ),\lozenge (\alpha ;\beta ;\gamma )])\preceq w\). Now we can apply the soundness of (subst) and obtain, substituting \(\beta \) for \(\natural (\beta ,\beta ')\), \({\underline{\sigma }}(\varGamma [\lozenge (\alpha ;\beta ;\gamma ),\lozenge (\alpha ;\beta ;\gamma )])\preceq w\). Now by the soundness of (\(\natural \)contr), we have \({\underline{\sigma }}(\varGamma [\lozenge (\alpha ;\beta ;\gamma )])\preceq w\), which was to be proved. Parallel on the right-hand side.

(\(\lnot \)I) Assume \(w\preceq x\vee y\). Then we have (by M4.) \(w\wedge {\sim }y\preceq x\vee y\), hence by M10. \(w\wedge {\sim }y\preceq x\). As \({\sim }(a_1...a_n)\) is just an abbreviation for \(({\sim }a_1)...({\sim }a_n)\), this already proves the claim for arbitrary instances.

(I\(\lnot \)) Parallel. \(\square \)

Completeness We first establish the concept of the formula-matrix: we let \(\mathbf{A }_{\mathsf {AL}}=(A_\mathsf {AL},\preceq _\mathsf {AL})\) be the absolutely free term algebra of \(\mathsf {CL}\)-formulas. Hence \(A_\mathsf {AL}=\{\alpha :\alpha \text { a well-formed classical formula}\}\), where connectives are interpreted as themselves (similar to example 3), and every term is equal to itself and no other term. This means that in this matrix, arbitrary Boolean terms count as single letters.

Definition 68

We define \(\preceq _\mathsf {AL}\) as follows: for arbitrary terms \(\alpha _1,...,\alpha _i,\beta _1,...,\beta _j\) of \({{\varvec{A}}}_\mathsf {AL}\) (equivalently: formulas of classical logic),

$$\begin{aligned} \alpha _1...\alpha _i\preceq _\mathsf {AL}\beta _1...\beta _j\text { iff }\Vdash _{\mathsf {AL}^{\textit{cf}}}\lozenge (\alpha _1;...;\alpha _i)\vdash \lozenge (\beta _1;...;\beta _j) \end{aligned}$$
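As a hedged worked instance of this definition (the particular formulas are our own, and we assume that the componentwise derivations of \(p\wedge q\vdash p\) and \(r\vdash r\vee s\) can be combined by (I\(\lozenge \)I)):

```latex
(p\wedge q)\,r\preceq _\mathsf {AL} p\,(r\vee s)
\quad \text {since}\quad
\Vdash _{\mathsf {AL}^{\textit{cf}}}\lozenge (p\wedge q;r)\vdash \lozenge (p;r\vee s)
```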

The formula matrix has the following important property:

Lemma 69

Assume \(a_1,...,a_i,b_1,...,b_j\in A_\mathsf {AL} \). Then

  1. \((a_1...a_i)\wedge (b_1...b_j)=\{c_1...c_n:c_1\Vert ...\Vert c_n\in \textit{anf}((a_1\Vert ...\Vert a_i)\wedge (b_1\Vert ...\Vert b_j))\}\)

  2. \((a_1...a_i)\vee (b_1...b_j)=\{c_1...c_n:c_1\Vert ...\Vert c_n\in \textit{anf}((a_1\Vert ...\Vert a_i)\vee (b_1\Vert ...\Vert b_j))\}\)


An easy induction over \(i+j\). The base case is \(i=j=1\), which is trivial. For the induction step, just compare the string abbreviations and Definition 50 of ambiguous normal forms, which are obviously parallel.Footnote 13 \(\square \)

Lemma 70

\(\mathbf{A }_\mathsf {AL} \) is an ambiguity matrix.


We go through the rules for \(\preceq \).

  • M1. is clear.

  • M2. We let 0 stand for an arbitrary classical contradiction and 1 for an arbitrary classical theorem; then the claim follows.

  • M3. Clear by (I\(\lozenge \)I).

  • M4. We can prove this as follows: assume we have \(\lozenge (\varGamma ;\alpha _1;...;\alpha _i;\varDelta )\vdash \varTheta \) and \(\lozenge (\varGamma ;\beta _1;...;\beta _j;\varDelta )\vdash \varTheta \). Then we have \(\lozenge (\varGamma ;(\alpha _1\Vert ...\Vert \alpha _i)\vee (\beta _1\Vert ...\Vert \beta _j);\varDelta )\vdash \varTheta \). By the ANF Lemma, for every \(\gamma \in \textit{anf}((\alpha _1\Vert ...\Vert \alpha _i)\vee (\beta _1\Vert ...\Vert \beta _j))\), we have \(\lozenge (\varGamma ;\gamma ;\varDelta )\vdash \varTheta \). Hence the claim follows easily by Lemma 69 and invertibility.

  • M5., M6.: clear by repeated weakening.

  • assoc\(\vee \). Straightforward by invertibility.

  • comm\(\vee \). Straightforward by invertibility.

  • id\(\vee \). Clear by invertibility and (\(\natural \)contr).

  • M7.: Parallel to M4.

  • M8., M9.: parallel to M5.,M6.

  • \(\wedge \) assoc. Straightforward by invertibility.

  • \(\wedge \) comm. Straightforward by invertibility.

  • \(\wedge \) id. Clear by inversion and (\(\natural \)contr).

  • M10. Assume \(\Vdash _\textit{cf}\alpha ,\lnot \beta \vdash \gamma ,\beta \). Then by invertibility and (\(\natural \)contr) \(\Vdash _\textit{cf}\alpha \vdash \gamma ,\beta \), and by (\(\lnot \)I) and (\(\natural \)contr) \(\Vdash _\textit{cf}\alpha ,\lnot \beta \vdash \gamma \).

  • M11. Parallel.

  • \(\wedge \) repr. We prove this via the ANF Lemma. Assume \(x((wa)\wedge (w'cdv'))((bv)\wedge (w'cdv'))y\preceq _\mathsf {AL}z\); recall that \(((wa)\wedge (w'cdv'))((bv)\wedge (w'cdv'))\) is a shorthand for a set of strings, each of which can be seen as an ambiguous normal form. By the definition of ambiguous normal forms,

[Figure omitted: the chain of congruent ambiguous normal forms.]

Hence all these ambiguous normal forms are congruent in \(\mathsf {AL}^\textit{cf}\) (ANF Lemma, and congruence is transitive), and hence \(x((wabv)\wedge (w'c))((wabv)\wedge (dv'))y\preceq _\mathsf {AL}z\). Same for the other direction.

  • repr\(\wedge \). Similar.

  • repr\(\vee \). Similar.

  • \(\vee \) repr. Similar.

  • BD1. Assume \(w(x\wedge (y\vee z))v\preceq _\mathsf {AL}u\). Note that this is an abbreviation: for \(x=a_1...a_n\), \(w(a_1\wedge (y\vee z))...(a_n\wedge (y\vee z))v\preceq _\mathsf {AL}u\), which itself is again an abbreviation etc. If we take the word letterwise, we find that letters have the form \(a\wedge (b\vee c)\). Hence, as the \(\vee \) is not within the scope of \(\lnot \), by invertibility of (\(\wedge \)I),(\(\vee \)I), we can substitute every letter by \(a\wedge b\), same for \(a\wedge c\). Hence we can derive \(w(x\wedge y)v\preceq _\mathsf {AL}u\) and \(w(x\wedge z)v\preceq _\mathsf {AL}u\). Now we can apply the rule (\(\vee \)I) to the formulas for \(x\wedge y\) and \(x\wedge z\). Now the claim follows easily from the ANF Lemma and invertibility of (\(\Vert \)I).

  • BD2. parallel.

  • DN1. Straightforward by Lemma 54 and the ANF Lemma.

  • DN2. Parallel.

  • 1l. Assume \(\Vdash _\textit{cf}\lozenge (\alpha _1;...;\beta \wedge 1;...;\alpha _i)\vdash \varGamma \). By repeated weakening and (\(\Vert \)I), we obtain \(\Vdash _\textit{cf}\alpha _1\wedge 1\Vert ...\Vert \beta \wedge 1\Vert ...\Vert \alpha _i\wedge 1\vdash \varGamma \). Since \((\alpha _1\wedge 1)\Vert ...\Vert (\beta \wedge 1)\Vert ...\Vert (\alpha _i\wedge 1)\in \textit{anf}((\alpha _1\Vert ...\Vert \beta \Vert ...\Vert \alpha _i)\wedge 1)\), we have \(\Vdash _\textit{cf}\alpha _1\Vert ...\Vert \beta \Vert ...\Vert \alpha _i,1\vdash \varGamma \). Now the claim follows from Lemma 57.

  • 0r. Parallel. \(\square \)

We need one more lemma. We define the canonical interpretation into \(\mathbf{A }_\mathsf {AL}\) by \(\sigma (p)=p_\asymp \); the extension to arbitrary formulas is as usual. Recall that \({\underline{\sigma }}\) and \({\overline{\sigma }}\) coincide on formulas.

Lemma 71

Assume \(\sigma \) is the canonical interpretation into the formula-matrix. Then for an arbitrary formula \(\phi \), if \(\gamma _1\Vert ...\Vert \gamma _i\in \textit{anf}(\phi )\), then \(\gamma _1...\gamma _i\in {\overline{\sigma }}(\phi )={\underline{\sigma }}(\phi )\).


Induction over formula complexity. The atomic case is clear. So assume the claim holds for all formulas with complexity \(\le n\) (where complexity is the number of connectives in the formula), and \(\phi \) has complexity \(n+1\). We show it holds for \(\phi \) by case distinction:

  • \(\phi =\lnot \delta _1\): Assume \(\lnot \alpha _1\Vert ...\Vert \lnot \alpha _i\in \textit{anf}(\lnot \delta _1)\), where \(\alpha _1\Vert ...\Vert \alpha _i\in \textit{anf}(\delta _1)\). By induction hypothesis, \(\alpha _1...\alpha _i\in {\overline{\sigma }}(\delta _1)\), and hence \(\lnot \alpha _1...\lnot \alpha _i\in {\overline{\sigma }}(\lnot \delta _1)\).

  • \(\phi =\delta _1\Vert \delta _2\): Assume \(\gamma _1\Vert ...\Vert \gamma _i\in \textit{anf}(\delta _1\Vert \delta _2)\). Then there is \(j\in \{1,...,i-1\}\) such that \(\gamma _1\Vert ...\Vert \gamma _j\in \textit{anf}(\delta _1)\), \(\gamma _{j+1}\Vert ...\Vert \gamma _i\in \textit{anf}(\delta _2)\). Hence \(\gamma _1...\gamma _j\in {\overline{\sigma }}(\delta _1)\), \(\gamma _{j+1}...\gamma _i\in {\overline{\sigma }}(\delta _2)\), and hence \(\gamma _1...\gamma _i\in {\overline{\sigma }}(\delta _1)\cdot {\overline{\sigma }}(\delta _2)={\overline{\sigma }}(\delta _1\Vert \delta _2)\).

  • \(\phi =\delta _1\wedge \delta _2\): Assume \(\alpha _1\Vert ...\Vert \alpha _i\in \textit{anf}(\delta _1\wedge \delta _2)\). Then either (case 1) \(\delta _1=\gamma _1\Vert ...\Vert \gamma _i\) and for \(j\in \{1,...,i\}\), \(\alpha _j\in \textit{anf}(\gamma _j\wedge \delta _2)\). By induction hypothesis, we have \(\alpha _j\in {\overline{\sigma }}(\gamma _j\wedge \delta _2)\). We have \({\overline{\sigma }}((\gamma _1\Vert ...\Vert \gamma _i)\wedge \delta _2)={\overline{\sigma }}((\gamma _1\wedge \delta _2)\Vert ...\Vert (\gamma _i\wedge \delta _2))\) (see equation (28)). Hence \(\alpha _1...\alpha _i\in {\overline{\sigma }}((\gamma _1\Vert ...\Vert \gamma _i)\wedge \delta _2)={\overline{\sigma }}(\delta _1\wedge \delta _2)\) (by induction hypothesis). Parallel for (case 2) \(\delta _2=\gamma _1\Vert ...\Vert \gamma _i\) and \(\alpha _1\in \textit{anf}(\delta _1\wedge \gamma _1)\),..., \(\alpha _i\in \textit{anf}(\delta _1\wedge \gamma _i)\).

  • \(\phi =\delta _1\vee \delta _2\): Parallel. \(\square \)

Note that the inclusion is generally proper: for example, \(p\vee q\notin \textit{anf}(q\vee p)\), but \(p\vee q\in {\overline{\sigma }}(q\vee p)\) (\(\sigma \) the canonical interpretation), since the two are congruent in every matrix. Having established this, we can easily prove the main claim:

Theorem 72

(Soundness and Completeness) \({{\mathbf{AM }}}\models \varGamma \vdash \varDelta \) if and only if \(\Vdash _{\mathsf {AL} ^{\textit{cf}}}\varGamma \vdash \varDelta \).


If: See Lemma 67.

Only if: By contraposition: assume \(\varGamma \vdash \varDelta \) is underivable, that is, \(\not \Vdash _{\mathsf {AL}^{\textit{cf}}}\varGamma \vdash \varDelta \). Let \(\gamma ,\delta \) be the formulas congruent to \(\varGamma \) and \(\varDelta \), respectively (by Lemma 52). By the ANF Lemma, we have \(\gamma '=\gamma _1\Vert ...\Vert \gamma _n\in \textit{anf}(\gamma )\), \(\delta '=\delta _1\Vert ...\Vert \delta _m\in \textit{anf}(\delta )\), such that \(\not \Vdash _{\mathsf {AL}^{\textit{cf}}}\gamma '\vdash \delta '\). By the invertibility of the rules (\(\Vert \)I),(I\(\Vert \)), we have \(\not \Vdash _{\mathsf {AL}^{\textit{cf}}}\varGamma '\vdash \varDelta '\), where \(\varGamma '=\lozenge (\gamma _1;...;\gamma _n)\), \(\varDelta '=\lozenge (\delta _1;...;\delta _m)\), where \(\gamma _1,...,\gamma _n,\delta _1,...,\delta _m\) are classical (since they are the components of an ambiguous normal form). Now by the previous lemma it follows that \(\gamma _1...\gamma _n\in {\underline{\sigma }}(\varGamma )\), \(\delta _1...\delta _m\in {\overline{\sigma }}(\varDelta )\), where (by Definition 68) \(\gamma _1...\gamma _n\not \preceq _{\mathsf {AL}}\delta _1...\delta _m\); hence \(\mathbf{A }_\mathsf {AL},\sigma \not \models \varGamma \vdash \varDelta \). \(\square \)

Hence we have a completeness proof for our matrix semantics. Note that in some sense, this semantics corresponds to ambiguous normal forms. In particular, it is noteworthy that formulas are not interpreted as themselves even if we interpret into \(\mathbf{A }_\mathsf {AL}\). For example, we have \({\overline{\sigma }}((a\Vert b)\wedge c)=((a\wedge c)(b\wedge c))_\asymp \), as is easy to check. Hence in the interpretation, every formula automatically becomes a (kind of) ambiguous normal form. However, this correspondence is only rough, as interpretation into the formula-matrix involves more than just distribution, since we interpret into congruence classes. This is the first point where the semantics is non-trivial: some equivalent unambiguous formulas are congruent, such as \(p\vee q\) and \(q\vee p\), while others are not, such as \(p\vee \lnot p\) and \(q\vee \lnot q\). The second remarkable property of this semantics is that we actually only manipulate strings, where all operations except for concatenation are defined for letters only. We used the operations \(\wedge ,\vee ,{\sim }\) on strings, but these are only abbreviations for operations on letters! It is thus a noteworthy effect that there is a semantics of ambiguity which works on strings in this canonical sense.

6.6 Matrices and Algebras

We can now establish a slightly more precise correspondence between ambiguity matrices and universal distribution algebras (and algebras in general). For the relation of matrices and algebras, we need the algebra of congruence classes of a matrix: given \((\mathbf{A },\preceq )\), we define \(\textit{Con}(\mathbf{A },\preceq )=(A^*_\asymp ,\wedge ,\vee ,{\sim },\cdot ,0_\asymp ,1_\asymp )\). This is an algebra where pseudo-Boolean connectives are defined over congruence classes (see above), and instead of \(\Vert \) we have concatenation (denoted by \(\cdot \)) of congruence classes. The above results ensure that all operations on classes are independent of representatives. The relation between algebras and matrices is non-trivial: given a matrix \((\mathbf{A },\preceq )\), the algebra \(\textit{Con}(\mathbf{A },\preceq )\) has an order relation \(\le \), defined by

$$\begin{aligned} w_\asymp \le v_\asymp \text { iff }(w\vee v)\asymp v\text { and }(w\wedge v)\asymp w \end{aligned}$$

(Actually, the two conditions are equivalent; we skip the proof.) Note that \(\le \) is a relation between congruence classes, not strings, but since it is independent of representatives and we mostly write w instead of \(w_\asymp \) anyway, this can be neglected. It is easy to show that \(w_\asymp \le v_\asymp \) entails \(w\preceq v\), but the converse does not necessarily hold. We now show that whereas \(\preceq \) semantically corresponds to \(\vdash \), \(\le \) in \(\textit{Con}(\mathbf{A },\preceq )\) is the semantic counterpart of the relation \(\leqq \) in \(\mathsf {AL}^\textit{cf}\) which we have discussed at length in Sect. 6.2:

Lemma 73

\(w\le v\) if and only if \(xvy\preceq z\) entails \(xwy\preceq z\) and \(z\preceq xwy\) entails \(z\preceq xvy\).


Only if: assume \(w\le v\), and \(xvy\preceq z\). Then \(x(w\vee v)y\preceq z\) (definition of \(\le \)), and so \(xwy\preceq z\) (M4.). Parallel for \(z\preceq xwy\) using \(\wedge \).

If: Assume \(xvy\preceq z\) entails \(xwy\preceq z\) and \(z\preceq xwy\) entails \(z\preceq xvy\). We prove that \(v\asymp (w\vee v)\); the proof for \(\wedge \) is parallel.

On the left: \(xvy\preceq z\) entails by assumption \(xwy\preceq z\), hence \(x(w\vee v)y\preceq z\), and \(x(w\vee v)y\preceq z\) obviously entails \(xvy\preceq z\).

On the right: \(z\preceq xvy\) entails \(z\preceq x(w\vee v)y\). Conversely, \(z\preceq x(w\vee v)y\) holds iff \(z\preceq (xwy)\vee (xvy)\) (Lemma 64), which holds iff \(z\wedge {\sim }(xvy)\preceq xwy\) which entails \(z\wedge {\sim }(xvy)\preceq xvy\) (by assumption), which entails \(z\preceq (xvy)\vee (xvy)\), and hence \(z\preceq xvy\). \(\square \)

Hence we have a natural semantics for the logical relation \(\leqq \). Contrary to \(\preceq \), the relation \(\le \) is obviously transitive and antisymmetric (since it is defined on congruence classes), and it satisfies (strong transitivity); but \(\le \) is not a good model for \(\vdash \), as we can see from Theorem 75.

Lemma 74

In an ambiguity matrix \((\mathbf{A },\preceq )\) which satisfies (strong transitivity), the rule (cut) is sound.


The claim is obvious (given Lemma 66) for the following:

(matrix cut) If \(y'\preceq y\vee c\), \(xyz\preceq u\), then \(xy'z\preceq u\vee c\).

We show that (strong transitivity) entails (matrix cut) (the other direction is straightforward): assume in a strongly transitive matrix we have \(y\preceq a\vee c\) and \(xaz\preceq u\). Now \(y\preceq a\vee c\) entails \({\sim }c\wedge y\preceq a\); by (strong transitivity) we obtain \(x({\sim }c \wedge y)z\preceq u\), and by repeated M9., we obtain \({\sim }c\wedge (xyz)\preceq u\) (recall that \({\sim }c\wedge (xyz)\) is a shorthand; we obtain the term by introducing \(\wedge {\sim }c\) on all letters separately). Then by M5., we get \({\sim }c\wedge (xyz)\preceq u\vee c\). By M10., we obtain \(xyz\preceq u\vee c\). \(\square \)
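Schematically, the argument that (strong transitivity) entails (matrix cut) runs as the following chain (with premises \(y\preceq a\vee c\) and \(xaz\preceq u\)):

```latex
\begin{aligned}
y\preceq a\vee c &\Longrightarrow {\sim }c\wedge y\preceq a\\
&\Longrightarrow x({\sim }c\wedge y)z\preceq u &&\text {(strong transitivity, with }xaz\preceq u\text {)}\\
&\Longrightarrow {\sim }c\wedge (xyz)\preceq u &&\text {(M9., repeatedly)}\\
&\Longrightarrow {\sim }c\wedge (xyz)\preceq u\vee c &&\text {(M5.)}\\
&\Longrightarrow xyz\preceq u\vee c &&\text {(M10.)}
\end{aligned}
```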

Now comes the main result on ambiguity matrices and algebras:

Theorem 75

Let \((\mathbf{A },\preceq )\) be an ambiguity matrix. Then the following are equivalent:

  1. \((A_\asymp ,\wedge ,\vee ,{\sim },0_\asymp ,1_\asymp )\) is a Boolean algebra.

  2. \(\textit{Con}(\mathbf{A },\preceq )\) is a universal distribution algebra.

  3. \((\mathbf{A },\preceq )\) satisfies (strong transitivity).

  4. \(\le \) coincides with \(\preceq \).


\(1.\Rightarrow 2.\) We show that the relations \(\le \) and \(=\) of \(\textit{Con}(\mathbf{A },\preceq )\) satisfy the axioms of \(\mathbf {UDA}\): (\(\Vert \)1),(\(\Vert \)2) are clear by notation, (assoc) obvious for concatenation, (mon) follows from Lemma 73, and for (inf) consider the following: assume \(xwvy\preceq z\); then \(x(w\wedge v)(w\wedge v)y\preceq z\), hence \(x (w\wedge v)y\preceq z\). Assume \(z\preceq x(w\wedge v)y\); then \(z\preceq x(w\wedge v)(w\wedge v)y\), hence \(z\preceq xwvy\), same for \(\vee \).

\(2.\Rightarrow 3.\) Assume \(\textit{Con}(\mathbf{A },\preceq )\in \mathbf {UDA}\); hence it is a model of \(\mathsf {AL}\), and (cut) is sound. Our soundness proof states in particular that if \(\varGamma [\alpha ]\vdash \varDelta \) is true in a model and \(\varTheta \vdash \alpha \) is true, then so is \(\varGamma [\varTheta ]\vdash \varDelta \). Assume \(xyz\preceq u\), \(y'\preceq y\). We simply put \(\sigma (p)=y\), \(\sigma (p')=y'\), \(\sigma (q_1)=x\), \(\sigma (q_2)=z\), \(\sigma (r)=u\). Then \(\lozenge (q_1;p;q_2)\vdash r\) is true, \(p'\vdash p\) is true, hence so is \(\lozenge (q_1;p';q_2)\vdash r\). Note that it is irrelevant that none of these sequents need be derivable: (cut) preserves truth, and that is all we need.

\(3.\Rightarrow 4.\) In general, \(w\le v\) entails \(w\preceq v\wedge w\), which entails \(w\preceq v\). We prove that, under the assumption of 3., \(w\preceq v\) entails \(w\le v\). So assume \(w\preceq v\), where \(\preceq \) satisfies (strong transitivity).

Assume \(xvy\preceq z\). Then \(xwy\preceq z\), so \(x(w\vee v)y\preceq z\). \(x(w\vee v)y\preceq z\) obviously entails \(xvy\preceq z\), hence v and \(w\vee v\) are congruent on the left-hand side.

\(z\preceq xvy\) obviously entails \(z\preceq x(w\vee v)y\). Conversely, assume \(z\preceq x(w\vee v)y\). Hence \(z\preceq (xwy)\vee (xvy)\) (Lemma 64). By repeated negation axiom application, \({\sim }x{\sim }w{\sim }y\preceq ({\sim }z)\vee (xvy)\). Since \({\sim }v\preceq {\sim }w\), by (strong transitivity) we obtain \({\sim }x{\sim }v{\sim }y\preceq ({\sim }z)\vee (xvy)\), and again applying negation axioms, we have \(z\preceq (xvy)\vee (xvy)\), hence \(z\preceq xvy\). This proves that \(v\asymp w\vee v\). Parallel for \(\wedge \), so \(w\le v\).

\(4.\Rightarrow 1.\) It is an easy exercise to check that \(\simeq \), the symmetrization of \(\preceq \) (i.e. \(w\simeq v\) iff \(w\preceq v\) and \(v\preceq w\)), satisfies all axioms of a Boolean algebra (whatever axiomatization we choose). If \(\preceq \) coincides with \(\le \), then \(\simeq \) coincides with \(=\) (equality of \(\textit{Con}(\mathbf{A },\preceq )\)). Hence the claim follows. \(\square \)

This theorem gives us a number of insights. Firstly, note that it entails the following: if \((\mathbf{A },\preceq )\) satisfies any of the above, then for all \(a,b\in A\), \(w,v\in A^*\), \(awb\asymp avb\). This obviously follows from 2.: if \(\textit{Con}(\mathbf{A },\preceq )\in \mathbf {UDA}\), then \((\mathbf{A },\preceq )\) is a model for \(\mathsf {AL}\) (with cut), hence the Margin Lemma and other negative results apply to it. The theorem thus gives a number of equivalent conditions which disqualify a matrix as an adequate model. An important one is 1.: if the congruence algebra is Boolean, the matrix is inadequate. This means that, in order to reason adequately with ambiguity, we need to abandon Boolean reasoning as soon as ambiguity actually comes into play. (strong transitivity) has the same effect of disqualifying the model immediately. An important criterion comes with 4.: this point states that we necessarily have to distinguish the relation \(\preceq \) (which corresponds to incongruent entailment, \(\vdash \)) from \(\le \) (congruent entailment, corresponding to \(\leqq \)). As soon as the two coincide, the result becomes a rather trivial and surely inadequate model for ambiguity. We think Theorem 75 is very satisfying from a formal point of view; next we will try to shed some light on the intuitive meaning of these results.

6.7 Meaning of the Semantics

Given this semantics for a calculus, we now try to provide it with an intuitive meaning, which somehow relates it to the real world, though only in a very preliminary fashion. We think that what we mostly learn from the completeness of matrix semantics (and inadequacy of \(\mathbf {UDA}\)) is that syntactic form matters for ambiguous meanings in a stronger sense than usual, namely even beyond inferential equivalence. For example, consider the (derivable) sequent

$$\begin{aligned} \vdash (p\vee \lnot p)\Vert (p\vee \lnot q)\Vert (\lnot p\vee q)\Vert (q\vee \lnot q) \end{aligned}$$

Our results lead us to the conclusion that the different subterms, separated by \(\Vert \), are dependent on each other beyond inferential equivalence. If we substitute \(r\vee \lnot r\) for \(p\vee \lnot p\) or \(q\vee \lnot q\), then the sequent is no longer derivable, even though the subformulas are logically equivalent in \(\mathsf {AL}^{\textit{cf}}\). The point is that we have a dependence which is syntactic in nature, that is, it concerns the syntactic form of the terms, not their denotation or inferential properties. Concretely, the q in \(p\vee \lnot q\) and \(\lnot p\vee q\) is connected to the q in \(q\vee \lnot q\); we cannot change one without the other. Seen from the other direction, in classical logic q makes a contribution to the meaning of \(q\vee \lnot q\); but since the latter is inferentially equivalent to 1 (any theorem), the q is irrelevant and arbitrary for the meaning of any formula containing \(q\vee \lnot q\) as a subformula. In \(\mathsf {AL}\), on the other hand, we have to keep track of this subterm q, so in a sense we can say our semantics and calculus are less local or less context-free than \(\mathsf {CL}\), the classical calculus. We need to know more about a term than its inferential equivalence class in order to interpret it properly, and this is what we mean by saying that the syntactic form of the term matters in a stronger sense. Standard algebraic semantics is—by definition—incapable of modeling this, since the central notion of algebra is that of congruence, and this is why we have introduced matrix semantics. Maybe we can put the peculiarity of our calculus and semantics in simple terms: as soon as there is ambiguity, inferential equivalence no longer entails congruence (at least in the trustful setting).

Our matrix semantics is language-theoretic in a broad sense: its main objects are strings, and its central notion of congruence concerns exchangeability in strings, known as Nerode equivalence in formal language theory. In matrix semantics, distinct Boolean terms which are equivalent in \(\mathbf{B }\) (the class of Boolean algebras) are not in general congruent, because they are not necessarily exchangeable within the words of the matrix; this is the whole point of matrix semantics. It is achieved by keeping all terms (even equivalent ones) distinct, and expressing constraints only via the relation \(\preceq \).
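As a small illustration of Nerode equivalence in its formal-language sense (a sketch in Python, independent of our matrices; the language L below is an arbitrary toy example of our own): two strings are Nerode-equivalent with respect to a language iff they are exchangeable as prefixes, that is, extending both with the same suffix always yields the same membership verdict. For a finite language it suffices to test the suffixes actually occurring in its words.

```python
def nerode_equivalent(u, v, language):
    """u ~ v iff for every string w: u+w is in the language <=> v+w is.

    For a *finite* language it suffices to test the suffixes of its
    words: any other w yields u+w and v+w both outside the language.
    """
    suffixes = {w[i:] for w in language for i in range(len(w) + 1)}
    return all((u + w in language) == (v + w in language) for w in suffixes)

# Toy language: "a" and "b" are distinguished by the suffix "bc",
# since "abc" is in L but "bbc" is not.
L = {"ab", "bb", "abc"}
print(nerode_equivalent("a", "b", L))    # False
print(nerode_equivalent("aa", "ba", L))  # True: both are dead prefixes
```

The analogy is only partial, of course, since in our matrices the letters are themselves (equivalence-bearing) Boolean terms; but it shows the sense in which congruence is exchangeability in strings rather than sameness of denotation.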

So how does this relate to the reality of, and intuition on, ambiguity? In fact, regarding incongruence, our logic not only lacks the cut rule, it also lacks transitivity of inference. One might consider this a devastating result, as it even contradicts a basic property of consequence relations as usually defined (see Tarski 1936). To us, however, this does not seem too bad: when we think of transitivity of logical inference, we think of unambiguous statements, and for unambiguous formulas we do have transitivity of inference in \(\mathsf {AL}^\textit{cf}\). Transitivity is problematic for ambiguity because with ambiguity, syntactic form matters, and transitivity “cuts out the middle man”. It is exactly this discarding of a syntactic object which is problematic, due to the lack of locality we explained above. We can explain this by analogy with a real-world situation: if somebody makes an unambiguous statement, it is enough to remember what follows from it, that is, some very abstract meaning representation. If somebody makes an ambiguous statement, however, we had better remember its syntactic form (more or less precisely), in order to be able to reconstruct the possible intentions, and to remain aware of the ambiguity. This intuitive observation is reflected in the mathematics, and this seems to be a nice result, enlightening about the nature of ambiguity.

There is another empirical phenomenon which is interesting in this context, namely the well-known zeugma effect, where an utterance is felt to be “weird”, though not incorrect. This arises, among other cases, when we form an ellipsis with an ambiguous word, but use each occurrence in a different sense.Footnote 14

(33) [example not rendered here: an ellipsis with an ambiguous word, used in two different senses in (33-a) and in the same sense in (33-b)]

Whereas (33-b) is completely normal, in (33-a) we feel there is something strange and funny, even though we would not say it is wrong; we feel that language has been abused. In fact, these examples illustrate very well the position of ambiguity: if it were a purely syntactic phenomenon, then (33-a) would be clearly wrong, because we technically cannot form an ellipsis with two distinct lexical entries. If it were a purely semantic phenomenon, then (33-a) should be fine (putting aside matters of uniform usage for the moment). It is, however, neither of the two: we feel “cheated” by the sentence, as the left-out word is used in a different sense than its counterpart, but on the other hand, we cannot say the sentence is wrong. Of course our work will not shed a completely new light on the nature of ellipsis and the zeugma effect at this point, but the examples clarify the intermediate position we assign to ambiguity. And it is interesting that, independently and within a completely formal approach, we find the same effect: our approach of using \(\mathsf {AL}^{\textit{cf}}\) provides exactly an intermediate solution: the syntactic form of formulas matters (equivalence does not entail congruence), but only up to a certain point, as congruence goes beyond syntactic identity!

Our mathematical results actually allow us to make this a little more precise, using Lemma 45 (on classical cut) and the corresponding negative results. We can say: the syntactic form of a term is irrelevant beyond its inferential properties (that is, equivalence coincides with congruence), as long as the term 1. is not in the scope of an ambiguity operator, and 2. is not ambiguous in itself. Hence we can say two things: 1. for representations of ambiguous meanings, equivalence does not entail congruence; and 2. even for unambiguous “submeanings” of an ambiguous meaning, equivalence does not entail congruence. On the one hand, this is a nice wrap-up of our formal results. From a philosophical point of view, however, it is more of an interesting point of departure, because here we have naturally used two notions which are actually not really defined and are quite problematic: the notion of an ambiguous meaning, and the notion of a submeaning. We will briefly address these notions, though treating them adequately would probably require an article in itself.

1. It seems intuitively clear what an ambiguous meaning is; but it seems very difficult to define it without already presupposing an equivalent notion (e.g. that of an unambiguous meaning). Our semantics yields a very simple definition of ambiguity: a meaning is unambiguous if and only if, for all terms denoting it, all of its constituents can be exchanged for equivalent terms. This is of course not entirely satisfying, as it presupposes many theoretical concepts, but it is still an interesting insight.

2. From a logical point of view, it is actually unclear what a submeaning is: meanings in the logical sense are not ordered by subsumption, nor do they have an obvious structure. In Boolean algebras, it would be nonsense to say that p is a submeaning of \(q\vee (p\wedge \lnot p)\) (because then every meaning would be a submeaning of every meaning). In the case of ambiguity, at least intuitively, this problem does not arise: we have a very clear intuition about the structure of ambiguous meanings and their submeanings, namely the following: given an ambiguous utterance, every possible reading is a submeaning. We thus have a clear structure, which is reflected in our semantics by strings, where each letter corresponds to one reading (recall that in ambiguity matrices, arbitrarily complex Boolean terms are just single letters!). Actually, we suppose that this point is closely connected to 1.: the clear intuition on submeanings and their composition constitutes our intuition on what ambiguity is.
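The flat structure of submeanings can be made concrete with a toy model (our own simplification for illustration, not the matrix construction itself): an ambiguous meaning is a string of readings, each reading counting as a single “letter” no matter how complex it is as a Boolean term, and the submeanings are exactly those letters.

```python
# Toy model: an ambiguous meaning as a sequence of readings. Each
# reading, however complex as a Boolean term, is one atomic letter here.
def submeanings(ambiguous_meaning):
    """Every possible reading of an ambiguous utterance is a submeaning;
    there is no further recursion into the readings themselves."""
    return set(ambiguous_meaning)

# Hypothetical readings of an ambiguous utterance:
bank = ("financial institution", "side of a river")
print(submeanings(bank) == {"financial institution", "side of a river"})  # True
```

Note the contrast with Boolean subterms: there is exactly one level of structure, and it is this one level which our intuition on ambiguity tracks.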

Two final notes. Firstly, these points make clear that an algebraic semantics (and a congruent logical calculus with cut) should be inadequate also on conceptual grounds: in an algebra (for example \(\mathbf {UDA}\)), it is easy to show (via isomorphisms) that there is no natural definition of the ambiguous objects, and, given an algebra, it is impossible to define the notion of a constituent of an object (not to be confused with the notion of subterm, which is very different). Secondly, the notion of submeaning is actually well-established in information-theoretic semantics, which is based on feature structures and unification. This opens some interesting connections, which will however require research of their own (see Barwise and Etchemendy 1990, for some work in this direction).

7 Conclusion

We have investigated the problem of (trustful) reasoning with ambiguity, both from a logical/syntactic and from a semantic/algebraic point of view. In the beginning, we found some paradoxical results: even from seemingly most innocuous assumptions, we derived consequences which were strongly counterintuitive and almost led to triviality, as in the case of ambiguous algebras and universal distribution algebras. We have chosen a way out of the dilemma which is not really obvious: we abandoned the assumption that reasoning with ambiguity is congruent. Mathematically, this means that we use a logic in which the cut rule is not admissible; conceptually, it means that the syntactic form of a term matters beyond inferential equivalence. This seems strange at first, but there were good motivations for this move on both the formal and the conceptual side, and the results that followed were satisfying to us: in particular, we presented the cut-free calculus \(\mathsf {AL}^{\textit{cf}}\), and our main hypothesis was that this calculus is sound and complete for trustful reasoning with ambiguity. We have provided this calculus with a semantics which is in a sense very natural given the peculiar properties of ambiguity, though it is rather unusual: ambiguous meanings are represented as strings, which might best be thought of as ambiguous normal forms; that is, every string represents the ambiguity between classical, unambiguous meanings. We leave the full philosophical and practical implications of this work for further research.