Reasoning with Ambiguity

We treat the problem of reasoning with ambiguous propositions. Even though ambiguity is obviously problematic for reasoning, it is no less obvious that ambiguous propositions entail other propositions (both ambiguous and unambiguous), and are entailed by other propositions. This article gives a formal analysis of the underlying mechanisms, both from an algebraic and a logical point of view. The main result can be summarized as follows: sound (and complete) reasoning with ambiguity requires a distinction between equivalence on the one hand and congruence on the other: the fact that α entails β does not imply that β can be substituted for α in all contexts while preserving truth. Without this distinction, we will always run into paradoxical results.
We present the (cut-free) sequent calculus AL^cf, which we conjecture implements sound and complete propositional reasoning with ambiguity, and provide it with a language-theoretic semantics, where letters represent unambiguous meanings and concatenation represents ambiguity.


Christian Wurm (cwurm@phil.hhu.de), Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany

Introduction
This article gives an extensive treatment of reasoning with ambiguity, more precisely with ambiguous propositions. We approach the problem from an algebraic and a logical perspective and show some surprising results on both ends, which lead to philosophical questions that we address in a preliminary fashion. The term linguistic ambiguity roughly designates cases where expressions of natural language give rise to two or more, though finitely many, sharply distinguished meanings. We leave it for now with this brief and intuitive definition, since we will be rather explicit on its properties later on, and we rely on the fact that even non-linguists have very stable intuitions on what ambiguity is (though the distinction from vagueness and polysemy probably requires some additional knowledge). In linguistics, ambiguity is usually considered to be a very heterogeneous phenomenon, and this is certainly true as far as it can arise from many different sources: from the lexicon, from syntactic derivations, from semantic composition as in quantifier scope ambiguity (this is sometimes reduced to syntax), from literal versus collocational meanings, and probably even more sources such as metaphors etc. Nonetheless, there is something common to all these phenomena, hence it makes sense to think of ambiguity as one single phenomenon.
We have recently argued (see Wurm and Lichte 2016) that the best solution is to treat ambiguity consistently as part of semantics, because there are some properties which are consistently present regardless of its source. The advantage of this unified treatment is that having ambiguity in semantics, we can use all semantic resources in order to resolve it and draw inferences from it (we will be more explicit below). It is a remarkable fact that even though ambiguity is a pervasive phenomenon in natural language, it usually does not seem to pose any problems for speakers: in some cases, we do not even notice ambiguity (as in (1)), whereas in other cases, we can also perfectly reason with and draw inferences from ambiguous information (as in (2)):
(1) Time flies like an arrow.
(2) The first thing that strikes a stranger in New York is a big car.
In (1), uttered in an appropriate situation to a non-linguist, hardly any listener would think about the mysterious time flies. Conversely, in (2) everyone notices the ambiguity, but still without any explicit reasoning, the conclusion that in New York there is at least one big car (and probably many more) seems immediate to us. Hence we can easily draw inferences from ambiguous statements. This is in line with psycholinguistic findings that "inference is easy, articulation is costly" (see Piantadosi et al. 2011), and hence ambiguity is to be expected in a language shaped by convenience. This entails two things for us:
1. We should rather not disambiguate (i.e. decide on a reading) before we start constructing semantics, as otherwise at least one reading remains unavailable, and soundness of inferences cannot be verified.
2. Hence we have to be able to construct something like "ambiguous meanings", and we have to be able to reason with them.
As regards 1., we have to add that from a psychological point of view, it is often plausible to assume that we disambiguate before we interpret a statement, in the sense that even though a sentence is ambiguous between m₁ and m₂, only one of the meanings is constructed or even perceived (see (1)). However, from a logical point of view this prevents sound reasoning, and our goal here is to provide a theory for sound (and complete) reasoning with ambiguity, not a psychological theory. We will investigate thoroughly the matter of reasoning with ambiguity, which will lead to many results which are surprising from a mathematical and interesting from a philosophical point of view. In Sect. 2, we will lay the conceptual foundations and explain what we mean by ambiguity and what are, in our view, its main properties. The rest of the paper is devoted to a formal approach to reasoning with ambiguity; we let the ambiguity between two meanings m₁, m₂ be denoted by m₁‖m₂.
In Sect. 3, we will try to tackle the problem algebraically, by introducing three classes of algebras, all being extensions of Boolean algebras with a binary operator ‖. These are strong and weak ambiguous algebras and universal distribution algebras (denoted by SAA, WAA and UDA). WAA has been introduced in Wurm and Lichte (2016), and UDA in Wurm (2017). These algebras have, at first glance, innocuous axioms which implement unquestionable properties of ambiguity. However, we will show that in all of them strongly counterintuitive properties hold, and moreover, we show that the equational theories of these three classes actually coincide. These results are surprising and interesting from an algebraic point of view, and leave us with the main paradox we have pointed out already in earlier publications (Wurm and Lichte 2016): how can properties, which are intuitively correct beyond doubt, lead to properties which are intuitively incorrect beyond doubt? There is one obvious way out, which consists in using partial algebras. This however does not provide us with satisfying results either, hence we only mention this possibility and show some rather negative results. Our solution is to say: algebra itself is the problem, more precisely, the fact that we use a congruence which disregards the syntactic form of terms. This problem obviously cannot be solved in an algebraic fashion, hence we use logic to approach it.
In Sect. 4, we introduce the logic AL, a logic which extends classical logic with an additional connective corresponding to ambiguity. We provide a Gentzen-style calculus for this logic and prove it sound and complete for UDA [and hence as well for SAA; the former has already been proved in Wurm (2017)]. Here, the rule (cut) ensures we have congruence as in algebra.
In Sect. 5 we present elementary results on the proof theory of AL and its cut-free version AL^cf, the key results being the following: many important rules, like logical rules corresponding to universal distribution, are admissible in the cut-free calculus; but the cut rule itself is not admissible. Whereas this is usually considered a bad result, for us it is positive: AL (with cut), being complete for UDA, is too strong for our purposes.
In Sect. 6 we put forward our main hypothesis: the cut-free logic AL^cf (arguably with or without commutativity of ‖) is the correct tool for reasoning with ambiguity, that is, it covers all and only the correct inferences. We present some evidence for this hypothesis, though of course it is impossible to formally prove it. The cut-free calculus AL^cf is incongruent, that is, there is a difference between (i) being logically equivalent (α entails β, β entails α), and (ii) being substitutable in all contexts while preserving truth of implications. We provide AL^cf with a semantics, which is also incongruent in the above sense. We then prove soundness and completeness for this semantics. This semantics is based on strings, hence our completeness proof provides also a sort of representation theorem for ambiguous meanings, where roughly speaking concatenation represents ambiguity, and a string represents an "ambiguous normal form", that is, a list of unambiguous meanings.
Finally, we will discuss the meaning of our results for the nature of ambiguity. Assuming our main hypothesis is correct, reasoning with ambiguity presupposes incongruence, that is, logical equivalence does not entail substitutability. In other words, this means: syntactic form of formulas matters beyond equivalence. Even if we treat ambiguity semantically, there remains something syntactic to it. This is the final insight provided by the quest for the proper tool for reasoning with ambiguity, and we think this opens some philosophical questions on the nature of meaning, which go beyond what we can address in this article.

Background and Some History
From a philosophical point of view, one often considers ambiguity to be a kind of "nemesis" of logical reasoning; for Frege, for example, the main reason to introduce his logical calculus was that it was in fact unambiguous, contrary to natural language. The discussion about the detrimental effect of ambiguity in philosophy can be traced back even to the ancient world, see Sennet (2016), and is still going on, see Atlas (1989). On the other hand, in natural language semantics, there is a long tradition of dealing both with ambiguity and logic; we will discuss here three main approaches.
In the first approach, a natural language utterance is translated into an unambiguous formal language such as predicate logic, and ambiguity becomes visible by the fact that there are several translations. To consider a famous example:
(3) Every boy loves a movie.
(4) ∃x.∀y.movie(x) ∧ (boy(y) → loves(y, x))
(5) ∀y.∃x.movie(x) ∧ (boy(y) → loves(y, x))
So ambiguity does not enter into the logic itself, but is "represented" by the fact that there are two (or more) different logical representations for one sentence. So we cannot simply translate natural language into logical representations (predicate logic or other), as there is no way to represent ambiguity in these languages. The standard way around the lack of functional interpretation is that we do not interpret natural language sentences as strings, but rather their derivations: one string has several syntactic derivations, and derivations in turn are functionally mapped to semantic representations (e.g. see Montague 1973). The problem with this approach is that we basically ban ambiguity from semantics: we first make an (informed or arbitrary) choice, and then we construct an unambiguous semantics. Now this is a problem, as we have seen above:
1. If we simply pick one reading, we cannot know whether a conclusion is generally valid or not, because we necessarily discard some information.
2. To decide on a reading, we usually use semantic information; but if we choose a reading before constructing a semantic representation, how are we supposed to decide?
This becomes even more problematic if we have an ambiguous statement as a constituent in a larger statement. These reasons indicate that we should not prevent ambiguity from entering semantics, because semantics is where we need it, even if only to get rid of it. But once ambiguity enters into semantics, we have to reason about its combinatorial, denotational and inferential properties. A second possibility of which authors make use (though often implicitly) is to treat ambiguity as the disjunction of meanings (see Saka 2007). However, here the above example gives a good argument why this is necessarily inadequate: if we take the disjunction of (4) and (5), the formula would be logically equivalent to (5) (because (4) entails (5)); hence there would not even exist an ambiguity in (3) in any reasonable sense! Apart from this, disjunction behaves differently from ambiguity when for example negated: disjunction obeys the De Morgan laws, whereas ambiguity remains invariant (see (6-a) and (6-b); we will explain this in more detail below). Hence importantly, ambiguity is not disjunction, though there is a relation between the two. Actually, this is a long-lasting misunderstanding among many scholars, even though the difference was recognized many years ago (see for example Poesio 1994).
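The relation between the two scope readings of "Every boy loves a movie" can be checked mechanically on a small model. The following Python sketch is only an illustration under stated assumptions: the four individuals, the fixed sets BOYS and MOVIES, and all function names are hypothetical and are not taken from any of the approaches discussed here. It enumerates every possible loves relation between two boys and two movies and checks that, on these models, the ∃∀ reading entails the ∀∃ reading but not conversely, so that the disjunction of the two readings cannot be told apart from the ∀∃ reading. A finite enumeration of this kind is a sanity check, not a proof of the general entailment.

```python
from itertools import product

# Hypothetical four-element domain; which individuals are boys or movies is fixed.
DOMAIN = ("b1", "b2", "m1", "m2")
BOYS = {"b1", "b2"}
MOVIES = {"m1", "m2"}

def matrix(x, y, loves):
    # movie(x) ∧ (boy(y) → loves(y, x))
    return x in MOVIES and (y not in BOYS or (y, x) in loves)

def exists_forall(loves):
    # ∃x.∀y. movie(x) ∧ (boy(y) → loves(y, x))
    return any(all(matrix(x, y, loves) for y in DOMAIN) for x in DOMAIN)

def forall_exists(loves):
    # ∀y.∃x. movie(x) ∧ (boy(y) → loves(y, x))
    return all(any(matrix(x, y, loves) for x in DOMAIN) for y in DOMAIN)

def all_love_relations():
    # Every subset of the boy-movie pairs is one candidate model.
    pairs = [(b, m) for b in sorted(BOYS) for m in sorted(MOVIES)]
    for bits in product([False, True], repeat=len(pairs)):
        yield {pair for pair, keep in zip(pairs, bits) if keep}

models = list(all_love_relations())

# The ∃∀ reading entails the ∀∃ reading on all enumerated models ...
strong_entails_weak = all(forall_exists(r) for r in models if exists_forall(r))
# ... but not the other way round (each boy may love a different movie).
weak_entails_strong = all(exists_forall(r) for r in models if forall_exists(r))
# Hence the disjunction of both readings agrees with the ∀∃ reading everywhere.
disjunction_collapses = all(
    (exists_forall(r) or forall_exists(r)) == forall_exists(r) for r in models
)

print(strong_entails_weak, weak_entails_strong, disjunction_collapses)
```

The model in which each boy loves a different movie is the one that separates the two readings; a disjunctive representation has no way of recording that the stronger reading was ever an option.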
A third approach for representing ambiguity (as e.g. in the quantifier case) is to use a sort of meta-semantics, whose expressions underspecify logical representations (see Egg 2010); famous cases in point would be Cooper storage and Hole Semantics. Assume our "unambiguous" language is the formal language of logic L (say some extension of predicate logic); in addition to this, we assume we have a meta-language M, by which we can underspecify terms of L. For example, let χ be a formula of M underspecifying the two formulas α, β of L (for example (4) and (5)). But now that we have this meta-language M of our logic L, there are new questions:
1. How do we interpret formulas of M?
2. How do we provide the connectives of M with a compositional semantics?
3. What are the inferences both in L and M we can draw from formulas in M?
Once we start seriously addressing these questions, we see that moving to a meta-language does not solve any of our problems; at best, it removes them from our sight. We usually do have a compositional semantics and consequence relation for L; for M we do not. Hence M fails to have the most basic features of a semantics, unless, of course, M itself is a logic with consequence relation and compositional semantics. But in this case, considering that M should conservatively extend L, it seems to be much more reasonable to include the new operator for ambiguity directly into our object language L. And this is exactly what we do here. From this example it becomes clear once more that ambiguity cannot be reasonably interpreted the same way as disjunction: because L in any normal case already has disjunction, there would be no need at all for M [this problem is discussed in more detail in van Eijck and Jaspars (1996)]. This is but a short outline of the main problems of the three usual treatments of ambiguity, namely i. moving ambiguity to syntax, ii. treating ambiguity as disjunction, and iii. using meta-(meta-)languages. In our view, none of them substantially contributes to solving the problem of reasoning with ambiguity. We will now expose what for us are the key features of ambiguity, which at the same time are the main challenges in developing a logic of ambiguity. For more extensive treatment of some aspects, we refer the reader to Wurm and Lichte (2016).

Key Aspects of Ambiguity
We think that the crucial point to distinguish ambiguity from related phenomena like vagueness or sense generality lies in considering combinatorial, denotational and inferential properties of ambiguity separately. Whereas the latter two are closely related, the combinatorial properties are rather distinct.
One important distinction has to be made from the outset, namely the one between what we might call local and global ambiguity. For example the word can is ambiguous between a noun and an auxiliary; however, it will probably not contribute to the ambiguity of any sentence, because the correct syntactic category can be inferred from its context, and hence the ambiguity remains local. Local ambiguity is thus ambiguity which can be definitely discarded at some level by syntactic or combinatoric properties alone, and therefore can never enter semantics. What is interesting for us is global ambiguity, which cannot be disambiguated on the basis of morpho-syntactic combinatorics. Note that even in the context of financial transactions, the word bank remains globally ambiguous. This article only covers global ambiguity in this sense.
Recall that we let ambiguity be denoted by ‖; we use this symbol both as an algebraic operator and a logical connective, both binary. Hence a‖b can be a term in an appropriate algebra, α‖β a logical formula. We use the symbol ‖ also to combine meanings, on the precise nature of which we are agnostic. We now list the main features of ambiguity.
Discreteness
This is a main intuitive feature of ambiguity, in particular distinguishing it from vagueness: in ambiguity, we have a finite (usually rather small) list of meanings between which an expression is ambiguous, and those are clearly distinct. This feature is most basic in the sense that it allows us to treat ambiguity as a binary algebraic operator or logical connective ‖. To take our typical example of the word bank: we have the two clearly distinct meanings "financial institute" and "strip of land along a river". Note that this intuitively obvious feature of discreteness is by no means trivial, as the two clearly distinct meanings of bank are vague themselves, as are most common noun meanings (for example, how broad can a piece of land along a river be to still qualify as a river bank?).

Universal distribution
For the combinatorics of ‖, the most prominent feature of ambiguity, though one that has only recently come into focus (see Wurm and Lichte 2016), is the fact that it distributes equally over all other connectives. To see this, consider the following examples:
(6) a. There is a bank.
    b. There is no bank.
(6-a) is ambiguous between m₁ = "there is a financial institute" and m₂ = "there is a strip of land along a river". When we negate this, the ambiguity remains, with the negated content: (6-b) is ambiguous between n₁ = "there is no financial institute" and n₂ = "there is no strip of land along a river", and importantly, the relation between the two meanings n₁ and n₂ is intuitively exactly the same as the one between m₁ and m₂. This distinguishes an ambiguous expression such as bank from a hypernym such as vehicle, which is just more general than the meanings "car" and "bike":
(7) a. There was a vehicle.
    b. There was no vehicle.
(7-a) means (arguably): "there was a car or there was a bike or ..."; but (7-b) rather means: "there was no car and there was no bike and ...". Hence when negated, the relation between the meanings changes from a disjunction to a conjunction (as we expect from a classical logical point of view); but for ambiguity, nothing like this happens: the relation remains invariant. This also holds for distribution of all other connectives/operations (see Wurm and Lichte 2016). This invariance is the first point where we see a clear difference between ambiguity and disjunction, and we consider this property of universal distribution to be most characteristic of ambiguity. Universal distribution seems to be strongly related to another observation: we can treat ambiguity as something which happens in semantics (as we do here), or we can treat it as a "syntactic" phenomenon, where "syntactic" is to be conceived in a very broad sense. In our example, the syntactic approach would be to say: there is not one word (as form-meaning pair) bank, but rather two words bank₁ and bank₂, bearing different meanings. The same holds for genuine syntactic ambiguity: one does not assume that the sentence I have seen a man with a telescope has strictly speaking two meanings, one rather assumes it has two derivations, where each derivation comes with a single meaning. Universal distribution is what makes sure that semantic and syntactic treatment are completely parallel: every operation f on an ambiguous meaning m₁‖m₂ equals an ambiguity between two (identical) operations on two distinct meanings, hence
(8) f(m₁‖m₂) = f(m₁)‖f(m₂)
Note that in cases where we combine ambiguous meanings with ambiguous meanings, this leads to an exponential growth of ambiguity, as is expected. Hence universal distribution is what creates the parallelism between semantic and syntactic treatment of ambiguity.
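Universal distribution can be made concrete with a small term representation. The following Python sketch is a minimal illustration, not the formal system developed in this article: terms are nested tuples, the tag "amb" stands for the ambiguity operator, and the names amb and readings are hypothetical. The function pushes every operation inside the ambiguity and returns the resulting list of unambiguous readings. Note one assumption made purely for definiteness: when two ambiguous arguments are combined, the left argument is distributed first, so the order of the resulting list is an artifact of that choice.

```python
from itertools import product

def amb(left, right):
    """Build a term that is ambiguous between `left` and `right` (hypothetical helper)."""
    return ("amb", left, right)

def readings(term):
    """List the unambiguous readings of a term, from left to right."""
    if isinstance(term, str):
        # An atomic, unambiguous meaning is its own single reading.
        return [term]
    op, *args = term
    if op == "amb":
        # Ambiguity: concatenate the readings of both alternatives.
        return readings(args[0]) + readings(args[1])
    # Any other operation f is applied to every combination of argument readings
    # (left argument varies slowest, an assumption of this sketch).
    return [(op, *combo) for combo in product(*(readings(arg) for arg in args))]

bank = amb("institute", "riverbank")

# Negation leaves the ambiguity intact: two negated readings.
print(readings(("not", bank)))

# Combining ambiguous meanings multiplies the readings: 2 * 2 * 2 = 8.
three_way = ("and", amb("a1", "a2"), ("and", amb("b1", "b2"), amb("c1", "c2")))
print(len(readings(three_way)))
```

Because nested ambiguities are flattened into one list, both groupings of a three-way ambiguity yield the same readings, and each additional ambiguous argument doubles their number.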
This parallelism means that, strictly speaking, we do not even need to argue whether ambiguity is a syntactic or semantic phenomenon: because the result in the end should be the same, it is of no relevance where ambiguity comes from. However, as soon as we start to reason with ambiguity, a unified semantic treatment will only have advantages, as all information is in one place. If we consider propositional logic, (8) reduces to
(9) ∼(a‖b) = ∼a‖∼b
(10) (a‖b) ∧ c = (a ∧ c)‖(b ∧ c)
(11) (a‖b) ∨ c = (a ∨ c)‖(b ∨ c)
By convention, we use symbols such as m₁, m₂ when we speak about (propositional) linguistic meanings, and symbols like a, b, c when we speak about arbitrary algebraic objects; Greek letters α, β etc. are reserved for logical formulas. Logically speaking, this means that ‖ is self-dual: ‖ is preserved in negative contexts such as negation, similar to fusion in Lambek (1995) (this logic is however used for a very different purpose, namely the analysis of natural language syntax).

Entailments
An ambiguity m₁‖m₂ is generally characterized by the fact that the speaker intends one of m₁ or m₂. The point is: we do not know which one of the two, as for example in
(12) Give me the dough!
From this simple fact, we can already deduce that for arbitrary formulas φ, α, β, χ in the logic of ambiguity, if φ ⊢ α ⊢ χ and φ ⊢ β ⊢ χ hold, then φ ⊢ α‖β ⊢ χ holds, hence in particular α ∧ β ⊢ α‖β ⊢ α ∨ β. But we cannot reduce α‖β to either α or β: we have α ⊬ α‖β and β ⊬ α‖β, and also α‖β ⊬ α and α‖β ⊬ β. This is because our logic is supposed to model the inferences which are sound in every case (i.e. under every intention), not in some cases, and the latter entailments are all unsound in some cases. Hence ‖ does not coincide with any classical connective and is not definable in classical logic. It is actually a substructural connective (see Restall 2008, for an introduction), behaving similarly to fusion in linear logic: in particular, it does not allow for weakening (we will make this precise below).
Note that this also illustrates how ambiguity behaves rather differently from disjunction:
(13) Give me the pastry or give me the money!
Anyone who utters (13) should be satisfied if he gets handed the pastry, and also if he gets handed the money. If a speaker utters (12), he means either "pastry" or "money", but he might complain either if you give him the money or if you give him the pastry. The conditions for satisfying (12) are thus clearly different from those for (13): in the former, whichever of the two you give, you might be left with an angry interlocutor.
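The idea that an inference must be sound under every intention can be illustrated with a truth-table check. The following Python sketch is a deliberately naive illustration and not the calculus developed later in this article: it only covers the case where an ambiguity between two atomic senses stands on one side of the entailment and an unambiguous classical formula on the other. The atoms a and b, the tuple encoding of formulas, and all function names are assumptions of the sketch.

```python
from itertools import product

ATOMS = ("a", "b")  # hypothetical unambiguous propositions

def holds(formula, valuation):
    """Classical truth value of a formula built from atoms, not, and, or."""
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    if op == "not":
        return not holds(args[0], valuation)
    if op == "and":
        return all(holds(arg, valuation) for arg in args)
    if op == "or":
        return any(holds(arg, valuation) for arg in args)
    raise ValueError(f"unknown connective: {op}")

VALUATIONS = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=len(ATOMS))]

def entails(premise, conclusion):
    """Classical entailment between two unambiguous formulas."""
    return all(holds(conclusion, v) for v in VALUATIONS if holds(premise, v))

def entails_ambiguity(premise, senses):
    """Unambiguous premise, ambiguous conclusion: must hold for every intended sense."""
    return all(entails(premise, sense) for sense in senses)

def ambiguity_entails(senses, conclusion):
    """Ambiguous premise, unambiguous conclusion: must hold for every intended sense."""
    return all(entails(sense, conclusion) for sense in senses)

SENSES = ("a", "b")  # the two senses of the ambiguous proposition

print(entails_ambiguity(("and", "a", "b"), SENSES))  # conjunction entails the ambiguity
print(ambiguity_entails(SENSES, ("or", "a", "b")))   # the ambiguity entails the disjunction
print(entails_ambiguity("a", SENSES))                # one sense alone does not entail it
print(ambiguity_entails(SENSES, "a"))                # and it does not entail one sense
```

On this naive reading the ambiguity sits strictly between conjunction and disjunction, which matches the entailment pattern described for (12) and (13): a hearer can only rely on what follows whichever sense was intended.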

Conservative extension
In particular in connection with logic, it should be clear that our logical calculus of ambiguity should be a conservative extension of the classical calculus, meaning that for formulas not involving ambiguity, the same consequences should be valid as before. The reason is that even if we include ambiguous propositions, unambiguous propositions should behave as they did before; if there are new entailments, they should only concern ambiguous propositions. The algebraic notion corresponding to the fragment in logic is the one of a reduct, hence the notion equally makes sense in an algebraic setting.
There are some more important properties of ambiguity which have some relevance in the paper, which however are more technical. These are the following:

Associativity
This property states that given an ambiguity between more than two meanings, their grouping is irrelevant, formally a‖(b‖c) = (a‖b)‖c. This seems natural to us, and there seems little to object to it. It is very important in connection with commutativity.

Commutativity
This property states that for meaning, the order of ambiguities does not play a role, hence a‖b = b‖a. This is not intuitively clear to our conception of meaning: on the one hand, there does not seem to be in general a natural order between ambiguous meanings; on the other hand, we often have a clear intuition on which meaning is primary, secondary etc. Regardless and from a mathematical point of view, this property will turn out to be the most critical in this article, and will serve as a probe into the adequacy of a formal theory of ambiguity. The reason is as follows: in all algebraic approaches we present, including commutativity will result in having only trivial (i.e. one-element) algebras. This, among other things, is obviously a knock-out criterion, because even if we do not necessarily want to include commutativity, we definitely want to be able to include it into our axiom set. We thus use this property to definitely reject approaches to ambiguity. Given this pivotal role, we will in the very end use it also in a positive fashion: the fact that our logic AL^cf (and its incongruent semantics) can be extended with commutativity without any apparent problems is for us strong evidence that it is adequate.
Non-productivity or partiality
This is a very peculiar feature of ambiguity, which distinguishes it fundamentally from other propositional connectives: for connectives like ∧, ∨, ¬ etc., we find natural language counterparts and, or, not; in this sense, they are productive. This even holds for definable connectives which do not have a simple counterpart, such as XOR (the exclusive or), which we can express in some way or other. For ambiguity, this does not hold: we simply cannot create arbitrary ambiguities in natural language. There is no English phrase expressing the ambiguity between "squirrel" and "table". We conjecture that this will hold in all natural languages (though there does not seem to be any research on this). One might argue that there is simply no function for ambiguity, but this is definitely not true. Assume we have a shy man who wants to ask out his office-mate, but is afraid to commit himself. It would be extremely useful for him to have a sentence ambiguous between "would you go out with me" and "do you mind if I open the window", but this sentence does (presumably) not exist. It is easy to find many other examples: just think of what people might want to say (and not say) in court or politics.
This leads to an important question: why is this the case, and should we search the motivation in formal properties of ambiguity, or rather in linguistic considerations? We conjecture the latter, and we give the argument in a nutshell: assume there were an (English) ambiguity connective am. The problem with this connective is: if we say something like x am y, we give less information than by saying just x or y, yet we say (quantitatively) more. This contradicts fundamental Gricean principles, because we say more, still are deliberately less informative. Hence an ambiguity connective would already be an atrocity from the point of view of pragmatics.

And still, reconsidering the case of our shy office worker, this connective would not be particularly useful: one of the features of ambiguity is that a speaker, being ambiguous, does not even commit to being ambiguous on purpose, and this is what makes it so attractive in our example. By being obviously ambiguous on purpose, say by uttering Would you open the window am go out with me, one already loses a core feature of "full" ambiguity. Put differently, "full" ambiguity includes the possibility of not being aware of it, and if there were an explicit connective, this possibility would be excluded. This is not the place to dwell on these linguistic arguments; we only want to conclude: in our view, the partiality of ambiguity is due to linguistic and pragmatic principles, not to its semantic properties themselves. Hence this is not an argument to make ambiguity a partial operation in our logics/algebras. We will still consider the possibility of making ‖ a partial operation in our algebras, and check whether it helps avoid some negative results. As we will see, this is not the case.
Monotonicity
Basically, monotonicity states that every ambiguous term entails itself, and this entailment is closed under weakening in the logical sense:
bank and restaurant entails bank entails bank or restaurant
This is not entirely straightforward, since under this assumption plants and animals entails plants, but the word animals might provide evidence for one specific reading of plants. But we are interested in logical soundness, not plausibility, hence we put this concern aside. Algebraically, this means: if you increase the arguments of ‖, you increase the value. Logically, its counterpart is the following inference rule:
    α ⊢ γ     β ⊢ δ
    ----------------- (monotonicity)
    α‖β ⊢ γ‖δ

Consistency of usage and trust
We adopt this feature, but underline that it is actually the only one from the list here which is not mandatory for ambiguity. This feature actually distinguishes our work from the approach of van Eijck and Jaspars (1996). Imagine someone telling you something about banks, and as he goes on, you discover that what he says does not make any sense to you. In the end, you notice that he has been using the term bank with different meanings in different utterances. At this point, you obviously have to consider most of the discourse meaningless: how can you possibly reconstruct what meaning was intended in which utterance? Trustful reasoning with ambiguity makes the following assumption:
(UU) In a given context, (globally) ambiguous terms must be used consistently in only one sense.
This is of course arguable, not only because the notion of "context" remains vague, but also because we can use the same word with different meanings in the same sentence, as in I spring over a spring in spring. However the classic work by Yarowsky (1995) gives strong evidence for consistent usage in empirical data. We underline that reasoning with ambiguity in a situation of distrust is also possible and has been described, though not as such, by van Eijck and Jaspars (1996).
To illustrate the difference from a formal point of view: p‖q ⊢ p‖q is a valid inference in both cases (monotonicity), whereas (p‖q) ∧ ¬(p‖q) is a contradiction in the trustful case, but not in the distrustful case, since we might intend different propositions in p‖q and ¬(p‖q). Linguistically: a sentence like
(14) He is dead and he is not dead.
is a contradiction in the trustful approach; in the distrustful approach not necessarily: dead could be used in two different senses, say medical and spiritual. Hence in the distrustful approach, classical theorems are no longer valid if constituted by ambiguous propositions, and classical inferences (like Modus Ponens) usually fail if applied to ambiguous propositions (see also the conclusion of Sect. 3). There is a lot more to say on this issue, but we plan to compare the trustful and distrustful approach in a separate publication. In this article, we want to describe reasoning with ambiguity in a situation of trust in consistent usage.
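The difference between trustful and distrustful usage can be illustrated by counting which readings of a formula of the shape "ambiguous proposition and its negation" are satisfiable. The following Python sketch is a simplification made for illustration only: it models trust as choosing one sense for all occurrences and distrust as choosing a sense independently for each occurrence, which is an assumption of the sketch and not a rendering of the system of van Eijck and Jaspars (1996). The atoms p and q and all function names are hypothetical.

```python
from itertools import product

ATOMS = ("p", "q")   # the two senses of one ambiguous proposition (hypothetical)

def holds(formula, valuation):
    """Classical truth value of a formula built from atoms, not, and."""
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    if op == "not":
        return not holds(args[0], valuation)
    if op == "and":
        return all(holds(arg, valuation) for arg in args)
    raise ValueError(f"unknown connective: {op}")

def satisfiable(formula):
    """True if some classical valuation of the atoms makes the formula true."""
    return any(
        holds(formula, dict(zip(ATOMS, bits)))
        for bits in product([False, True], repeat=len(ATOMS))
    )

# Trustful usage: both occurrences are resolved to the same sense.
trustful_readings = [("and", s, ("not", s)) for s in ATOMS]

# Distrustful usage (simplified): each occurrence is resolved independently.
distrustful_readings = [("and", s1, ("not", s2)) for s1, s2 in product(ATOMS, repeat=2)]

contradiction_under_trust = not any(satisfiable(r) for r in trustful_readings)
contradiction_under_distrust = not any(satisfiable(r) for r in distrustful_readings)

print(contradiction_under_trust, contradiction_under_distrust)
```

Under consistent usage every reading is classically unsatisfiable, whereas independent resolution admits the mixed reading "p and not q", which corresponds to using dead once in a medical and once in a spiritual sense.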

Preliminaries and Boolean Algebras
In this section, we present an algebraic approach to the problem of reasoning with ambiguity. We sketch the preliminaries, then present three relevant classes of algebras and prove the equivalence of their equational theories (i.e. the sets of all equations holding in all algebras of the class), which will ultimately lead us to discard this approach. The results of this section are thus mostly negative. If the reader is mainly interested in how ambiguity can be adequately treated, she can safely skip this section. The main result can be summarized as follows: algebra, or at least extensions of Boolean algebras, will not do the job. We discuss the reasons for this at the end of this section. In Sect. 6 we will see which general insights can be drawn from this.
The general setting we use here is that of Boolean algebras, which are structures of the form B = (B, ∧, ∨, ∼, 0, 1). As these are well known, we do not introduce them [the reader interested in background might consider Kracht (2003) and Maddux (2006), or many other sources]. We denote the class of Boolean algebras by BA. In this section, we will only use elementary properties of Boolean algebras, frequently and without proof or explicit reference. Many results we present here depend on specific properties of Boolean algebras such as the law of double complementation; hence the results do depend on this particular choice. However, there is a very good justification for this choice: in the semantics of natural language, which is by far the largest field of research where ambiguity arises and has to be handled, there is (comparatively) very little work on approaches using non-classical logic [but see Barwise and Etchemendy (1990), which is very interesting since it also includes the information-theoretic aspect which is important for ambiguity].
In the algebraic approach to ambiguity, we think of the objects of algebras as propositional meanings; the operations of the algebra (in our case, the Boolean operators and '‖') correspond to ways to combine these meanings. Here, the Boolean operations of course (loosely) correspond to their counterparts in natural language; for '‖', there is no corresponding connective. Importantly, there is no straightforward sense in which some meanings are more "basic" than others: all terms denote simple objects, that is, propositional meanings.
We now discuss what properties the connective ‖ should satisfy on a conceptual level; put differently, we ask: what kind of object is a‖b, and which rules does the operator ‖ obey? We distinguish three different ways of conceiving of the operation ‖:
1. a‖b denotes the "correct" meaning, that is, the one intended by the speaker (but which is unknown to any interpreter).
2. a‖b is entailed by the "correct" meaning, that is, the one intended by the speaker.
3. a‖b is a "genuinely ambiguous" object, a sort of underspecification, which behaves in a certain combinatorial and inferential fashion.
In all three cases, ‖ introduces an epistemic aspect into our algebra: in cases 1 and 2, we refer to the intention of the speaker, which is invisible to any outsider; in case 3, we have a genuinely underspecified meaning, that is, one the true content of which we cannot reconstruct.

Three Classes of Algebras
We now introduce three classes of algebras corresponding to the three conceptions mentioned above. All of them will have the same signature (A, ∧, ∨, ∼, ‖, 0, 1), where (A, ∧, ∨, ∼, 0, 1) is a Boolean algebra, and ‖ is a binary operator. We adopt the following general conventions: boldface letters like A designate algebras, the corresponding letters A denote their carrier sets. We define a ≤ b as an abbreviation for a ∧ b = a (equivalently, a ∨ b = b). Another general convention we adopt is the following: let C be a class of algebras, and t, t′ terms over their signature. We write C ⊨ t = t′ if for all C ∈ C and all instantiations of t, t′ with terms denoting objects in C, the equality holds in C. Hence we write BA ⊨ a ∨ ∼a = 1 etc. The following classes of algebras are ordered from strong to weak.
Strong ambiguous algebras In this class, ‖ satisfies the axioms (‖1), (‖2) and (‖3): At least one of a = a‖b or b = a‖b holds. We denote the class of all algebras satisfying these axioms by SAA. (‖1) and (‖2) will hold in all classes, and it is these axioms which ensure universal distribution (9)-(11), which thus holds in all algebras we consider.
(‖3) is the axiom peculiar to SAA, and all it states is that a‖b either denotes a or it denotes b.
Weak ambiguous algebras We denote the class by WAA. As we see, it is only the slightly weaker condition in (‖3w) which distinguishes it from the strong form. Still, the two do not coincide. However, we will show that every weak ambiguous algebra is actually a universal distribution algebra. We need the additional axiom (assoc) to ensure associativity, which is actually derivable in SAA, but does not seem to be derivable from the other axioms in WAA.

Universal distribution algebras
We denote the class by UDA. This is the weakest algebraic class we present here. As is easy to see, this class is a variety, being axiomatized by a set of (in)equalities. SAA and WAA are not varieties: every variety contains the free algebra generated by an arbitrary set; however, the free ambiguous algebra (weak or strong) over some non-trivial set is not an ambiguous algebra, because of the disjunctive axiom: since in general a ≠ a‖b ≠ b, in the free algebra neither of the two equalities holds. We will now consider the three classes one after the other.

Strong Ambiguous Algebras
We now present the most important results on the class of strong ambiguous algebras, which was introduced and thoroughly investigated in Wurm and Lichte (2016). Intuitively, this is a model where all ambiguous meanings exist, but every ambiguity is resolved to an underlying intention (this makes the implicit presupposition that ambiguous meanings are used consistently in one sense). This is a strong commitment, and mathematical results show that it is actually too strong. First, note that (‖1), (‖2) are sufficient for universal distribution: they entail all equations (9)-(11) (for details see Wurm and Lichte 2016), as ∨ is redundant. The axiom (id) a‖a = a is obviously derivable. We now prove the main result on SAA, namely uniformity.
Obviously, this lemma has a dual where a‖∼a = ∼a, and where all results are parallel.

Lemma 2 Let A be a strong ambiguous algebra, a ∈ A.
1. If a‖∼a = a, then for all b, c ∈ A, b‖c = b.
2. If a‖∼a = ∼a, then for all b, c ∈ A, b‖c = c.
Proof We only prove 1., 2. is dual. Assume a ∼a = a, and assume b c = c. By the previous lemma, we know that 1 0 = 1, 0 1 = 0, hence Hence b c = c entails b = c, which proves the claim.
Now we can prove the strongest result on SAA, the Uniformity Lemma.
Lemma 3 Assume we have a strong ambiguous algebra A, a, b ∈ A such that a ≠ b.
1. If a‖b = a, then for all c, c′ ∈ A, c‖c′ = c.
2. If a‖b = b, then for all c, c′ ∈ A, c‖c′ = c′.
Proof We only prove 1., as 2. is completely parallel. Assume there are a, b ∈ A, a ≠ b and a‖b = a. Assume that there are pairs c, c′ ∈ A (with c ≠ c′) such that c‖c′ = c′. There are two cases: (i) Among these pairs, there is a pair c, c′ such that c′ = ∼c. Then we have c‖∼c = ∼c, and by Lemma 2, it follows that a‖b = b, which is wrong by assumption: a contradiction. (ii) Among these pairs, there is no pair c, c′ such that c′ = ∼c and c‖c′ = c′. Then we necessarily have (among others) a‖∼a = a, and by Lemma 2, this entails c‖c′ = c: a contradiction.
Put differently: let π_l be left projection, a binary function where π_l(a, b) = a; π_r is then right projection, with π_r(a, b) = b.
Hence for every Boolean algebra, there exist exactly two strong ambiguous algebras, one where ‖ uniformly computes π_l for all arguments, and one where ‖ uniformly computes π_r. We say an ambiguous algebra (B, ∧, ∨, ∼, π_l, 0, 1) is left-sided, and respectively (B, ∧, ∨, ∼, π_r, 0, 1) right-sided; we denote the left-sided algebra extending a Boolean algebra B by C_l(B), the right-sided extension by C_r(B). This entails that strong ambiguous algebras are rather uninteresting, as they merely extend a Boolean algebra with a left or right projection operator. It also entails that strong ambiguous algebras with a commutative operation ‖ are trivial (i.e. have one element), but we will see that this even holds for more general classes. Hence even though the axiomatization seems unproblematic, it is too strong. We will therefore next consider algebras with weaker axioms for ambiguity.
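Uniformity can be checked by brute force on a small example. The following sketch enumerates all binary operations on the four-element Boolean algebra that satisfy (‖3) and tests them against the other two axioms. Note the assumptions: the axioms are not spelled out in this section, so the code takes (‖1) to be ∼(a‖b) = ∼a‖∼b and (‖2) to be a ∧ (b‖c) = (a ∧ b)‖(a ∧ c); the encoding of elements as bitmasks and all function names are illustrative only.

```python
from itertools import product

# Four-element Boolean algebra: subsets of a two-element set, encoded as bitmasks 0..3.
ELEMS = range(4)
TOP = 3

def neg(a):
    """Boolean complement within the four-element algebra."""
    return TOP & ~a

def satisfies_axioms(op):
    """Check the assumed forms of (‖1) and (‖2) for an operation given as a dict."""
    for a, b in product(ELEMS, repeat=2):
        if op[(neg(a), neg(b))] != neg(op[(a, b)]):      # assumed (‖1)
            return False
    for a, b, c in product(ELEMS, repeat=3):
        if a & op[(b, c)] != op[(a & b, a & c)]:         # assumed (‖2)
            return False
    return True

def strong_operations():
    """Enumerate all operations with a‖b in {a, b} (axiom (‖3)) that pass the check."""
    pairs = [(a, b) for a in ELEMS for b in ELEMS if a != b]
    found = []
    for choice in product((0, 1), repeat=len(pairs)):
        op = {(a, a): a for a in ELEMS}                  # a‖a = a is forced by (‖3)
        for (a, b), pick in zip(pairs, choice):
            op[(a, b)] = (a, b)[pick]
        if satisfies_axioms(op):
            found.append(op)
    return found

found = strong_operations()
left = {(a, b): a for a in ELEMS for b in ELEMS}
right = {(a, b): b for a in ELEMS for b in ELEMS}
```

Of the 4096 candidate operations, only the two projections survive, in line with the Uniformity Lemma.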

Weak Ambiguous Algebras
It is obvious that SAA ⊆ WAA, that is, every strong ambiguous algebra is a weak ambiguous algebra, since it easily follows from uniformity that strong ambiguous algebras satisfy (assoc) (simply by case distinction). What is less obvious is that WAA ⊆ UDA; this is the first thing we will prove here. First we will prove that (id) holds in WAA.

C. Wurm
We now need two auxiliary properties which hold in all Boolean algebras:
Lemma 7 In WAA, the following equalities hold: and the claim follows from 1. 3. is obtained from 2. by applying distributive laws. 4. Since ∼a‖∼b = ∼(a‖b), this is obtained from 3 and B1. 5. Parallel to 4 (several steps have to be repeated). 6. The relation ≤, defined by ∧ or ∨, is obviously transitive, hence this follows from the previous. This is already sufficient for the following result:

Corollary 8 Every weak ambiguous algebra is a universal distribution algebra.
Are there weak ambiguous algebras which are not strong ambiguous algebras? This question can be answered positively: just take the algebra with the four elements {0, 1, 0‖1, 1‖0}, with the obvious order (the square). There is actually just one algebra with these elements, where necessarily we have 1‖0‖1 = 1, 1‖0‖0 = 1‖0 etc., that is, we have a_1‖a_2‖a_3 = a_1‖a_3. This can easily be proved to be a weak ambiguous algebra, but it is not a strong ambiguous algebra, as 0 ≠ 0‖1 ≠ 1.
The next question is: are there universal distribution algebras which are not weak ambiguous algebras? Again, the answer is positive: take the UDA which extends the four element Boolean algebra over {0, a, b, 1} by ambiguous objects, and take the object a‖b.
However, every UDA can be completed to a strong ambiguous algebra in two ways (see Lemma 21) without collapsing any elements of the underlying Boolean algebra; hence in the algebra over {0, a, b, 1}, we would either have a = 1 or a = b or b = 1, which by assumption do not hold. This proves the following: This is already all we have to say about this class: being located between SAA and UDA, it does not seem to be particularly interesting.

Universal Distribution Algebras
In a sense, (‖3) and (‖3w) state that we use an ambiguous term with a given intention. This might be true if we think of sentences uttered by speakers. It is no longer true if we just think (for example) of the lexicon, where terms exist regardless of any intention. UDA models ambiguity without any underlying intention. Regarding the axioms, (‖1), (‖2) ensure that universal distribution holds, see (9)-(11). (inf) regulates the relation ≤ between ambiguous and unambiguous objects; (mon) the relation ≤ between ambiguous objects. As we will see later, (mon) is derivable from the other axioms; we include it nonetheless, since it makes the properties of the algebra easier to grasp. It is easy to see that (mon) amounts to a form of monotonicity: increasing the arguments of ‖ increases the value of the function, that is, if a ≤ a′ and b ≤ b′, then a‖b ≤ a′‖b′. The formulation we choose immediately entails that UDA is a variety, contrary to SAA and WAA.
Note that in the presence of distributive laws, (inf) is equivalent to (id): given (id) and the distribution of ∧ over ‖, we have (a ∧ b) ∧ (a‖b) = (a ∧ b)‖(a ∧ b) = a ∧ b and (a ∨ b) ∧ (a‖b) = a‖b, hence a ∧ b ≤ a‖b ≤ a ∨ b. Conversely, (inf) entails idempotence, because then a = a ∧ a ≤ a‖a ≤ a ∨ a = a. Hence UDA splits (‖3) into two weaker axioms. Note also that, as (inf) is correct beyond doubt, the maybe more questionable (id) is inevitable. (id) might look questionable, as it already entails non-trivial equalities (we skip some straightforward intermediate steps). In UDA, inequations such as the law of disambiguation are also satisfied.
To get a better intuition of the structure of universal distribution algebras, we present some first results. We say a term t is in ambiguous normal form iff t = t_1‖...‖t_i, where t_1, ..., t_i are Boolean terms. The following is not difficult:

Lemma 11 For every term t, there is a term t′ in ambiguous normal form such that UDA ⊨ t = t′.
To see this, just iterate the application of distributive laws. When we have a Boolean combination of ambiguous terms, the procedure of forming ambiguous normal forms leads to an exponential blow-up in the size of terms. This "problem" (if we want to consider it as such) will however turn out immaterial for UDA, once we have the Margin Lemma, which is the central result on UDA.
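The exponential blow-up can be illustrated with a short sketch that computes ambiguous normal forms by iterating distribution. The tuple encoding of terms and the function names are hypothetical, and the order in which components are produced is one possible choice that assumes the distribution law a ∧ (b‖c) = (a ∧ b)‖(a ∧ c) together with its analogues for ∨ and ∼.

```python
# Hypothetical encoding: ("var", x), ("not", t), ("and", t, u), ("or", t, u), ("amb", t, u)

def anf(t):
    """Return the Boolean components t_1, ..., t_i of an ambiguous normal form of t."""
    tag = t[0]
    if tag == "var":
        return [t]
    if tag == "amb":
        # Associativity: components of both sides are simply concatenated.
        return anf(t[1]) + anf(t[2])
    if tag == "not":
        # Negation distributes over every component.
        return [("not", b) for b in anf(t[1])]
    # tag is "and" or "or": distribute over the right argument first, then the left.
    left, right = anf(t[1]), anf(t[2])
    return [(tag, b, c) for c in right for b in left]

def ambiguous(i):
    """The i-th two-way ambiguous term a_i‖b_i."""
    return ("amb", ("var", f"a{i}"), ("var", f"b{i}"))

def conjunction(n):
    """The conjunction (a_1‖b_1) ∧ ... ∧ (a_n‖b_n)."""
    t = ambiguous(1)
    for i in range(2, n + 1):
        t = ("and", t, ambiguous(i))
    return t

sizes = [len(anf(conjunction(n))) for n in range(1, 6)]
```

A conjunction of n two-way ambiguous terms yields 2^n components. In this ordering the first component collects all left readings and the last one all right readings.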
An interesting property is the following: let t = t_1‖...‖t_i be a term in ambiguous normal form. One might conjecture that UDA ⊨ 1 = t iff BA ⊨ 1 = t_1, ..., 1 = t_i. This is however not correct, as can be seen from the following: Here, a and b are arbitrary. This can still be strengthened: first, as a special case, put b ≡ ∼a; then we have: where a is arbitrary. So far, we have only used Boolean algebra axioms and universal distribution. With (id) and (assoc) we can derive: There is a parallel derivation (using ∧ instead of ∨) for 0 = 0‖a‖0, hence the equalities 1 = 1‖a‖1 and 0 = 0‖a‖0 are valid in UDA, where a is arbitrary. From here we can prove the following: Proof By cases: Hence in particular, 0‖1‖0 = 0, 1‖0‖1 = 1. Hence we have again a very strong result, definitely stronger than what our intuition tells us about ambiguity. In particular, this makes it problematic to include commutativity:
Corollary 14 Every commutative universal distribution algebra U has at most one element.
We can now show the following result, which characterizes UDA very neatly, the Margin Lemma: a‖b‖c = a‖c holds in all universal distribution algebras. Proof (For simplicity, we now omit associative brackets) Note that in order to derive the Margin Lemma, we have used (inf) by its equivalent (id), but we have not used (mon), as can be easily checked. Hence we can derive the following:
Lemma 16 Every algebra U satisfying (‖1), (‖2), (assoc), (inf) is a universal distribution algebra.
Proof We can use the Margin Lemma, since it follows from the four axioms already. Hence , and the claim follows.
In the end, in UDA arbitrary ambiguities "boil down" to the margins of ambiguous terms: commutativity is excluded, and more than 2-fold ambiguity is meaningless in this class of algebras. This is obviously a problem, which basically excludes UDA as a realistic model for ambiguity. We will discuss a way out of this predicament later; but before this, we will prove a useful representation theorem for UDA.

Definition 17
We define the canonical UDA over two given Boolean algebras B_1, B_2 as the direct product algebra B_1 × B_2, where Boolean operations are defined pointwise as usual, and the operation ‖ is defined by (a, b)‖(a′, b′) = (a, b′). It is straightforward to check that this satisfies all UDA axioms. Canonical UDA have a very simple structure, in that they only slightly extend product algebras. By the Margin Lemma, we will prove that every UDA is isomorphic to a canonical universal distribution algebra. Given U ∈ UDA, we define the relations θ_l, θ_r ⊆ U² by These are equivalence relations for every carrier set U, and in fact they are congruences for all universal distribution algebras, that is: The same for θ_r.
by the Margin Lemma, and by assumption and we only need one of the premises here).
We define maps h_l, h_r : U → ℘(U) by h_l(x) = {a : a θ_l x}, h_r(x) = {a : a θ_r x} (that is, elements are mapped onto congruence classes). These are, by the well-known results of general algebra, homomorphisms for arbitrary universal distribution algebras. Hence we can construct the two homomorphic images h_l(U) = (U/θ_l, ∧, ∨, ∼, 0, 1) (with the congruence classes as carrier set), and h_r(U). We now define the map φ by φ(a) = (h_l(a), h_r(a)). This is still a homomorphism for Boolean operations, if we define all operations pointwise in the image algebra. The image φ[U] is a set of pairs (of congruence classes), so we can define ‖ canonically by Hence we obtain a canonical universal distribution algebra which we denote by φ(U). The crucial lemma is the following (here ≅ denotes isomorphism of two algebras): Proof We show two things, 1. φ(a‖b) = φ(a)‖φ(b) (that this holds for all other connectives already follows by general algebra), and 2. φ is a bijection.
Now as φ(U) is a canonical algebra for every U, this proves the following theorem: Theorem 20 (Product representation theorem for UDA) Every UDA is isomorphic to a canonical UDA.
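To make the representation theorem more tangible, here is a minimal sketch of the smallest non-trivial canonical UDA, built over two copies of the two-element Boolean algebra. It assumes that the ambiguity operation of Definition 17 is (a, b)‖(a′, b′) = (a, b′); all names are illustrative. Its four elements can be read as 0 = (0, 0), 1 = (1, 1), 0‖1 = (0, 1) and 1‖0 = (1, 0).

```python
from itertools import product

B = (0, 1)                       # two-element Boolean algebra
U = list(product(B, B))          # carrier of the canonical UDA B × B

def meet(x, y):
    return (x[0] & y[0], x[1] & y[1])

def join(x, y):
    return (x[0] | y[0], x[1] | y[1])

def neg(x):
    return (1 - x[0], 1 - x[1])

def amb(x, y):
    """Assumed canonical operation: left component of x, right component of y."""
    return (x[0], y[1])

def leq(x, y):
    return meet(x, y) == x

# (inf): a ∧ b ≤ a‖b ≤ a ∨ b
inf_holds = all(leq(meet(x, y), amb(x, y)) and leq(amb(x, y), join(x, y))
                for x, y in product(U, repeat=2))

# Margin Lemma: a‖b‖c = a‖c
margin_holds = all(amb(amb(x, y), z) == amb(x, z) == amb(x, amb(y, z))
                   for x, y, z in product(U, repeat=3))

# Negation distributes over ‖
neg_distributes = all(neg(amb(x, y)) == amb(neg(x), neg(y))
                      for x, y in product(U, repeat=2))

# Commutativity fails, and (‖3) fails: 0‖1 is neither 0 nor 1
commutative = all(amb(x, y) == amb(y, x) for x, y in product(U, repeat=2))
zero, one = (0, 0), (1, 1)
strong = all(amb(x, y) in (x, y) for x, y in product(U, repeat=2))
```

The checks illustrate why this algebra is a universal distribution algebra without being a strong ambiguous algebra: the mixed pairs (0, 1) and (1, 0) are genuinely new objects.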

Equivalence of Equational Theories
We now prove the equational theories of the three classes to be equivalent.

Lemma 21 For every term t in the signature of UDA, interpretation σ of t into a canonical universal distribution algebra
Proof Assume we have the interpretation σ : X → B_1 × B_2. We know that for every B ∈ BA there are exactly two strong ambiguous algebras with the same carrier set. We take the two completions C_l(B_1) and C_r(B_2), and put σ_1(x) = π_l(σ(x)) and σ_2(x) = π_r(σ(x)). We prove by induction on the complexity of t that these algebras and assignments do the job as required. For atomic terms, the claim is straightforward, as (π_l(σ(x)), π_r(σ(x))) = σ(x) by definition. Now assume the claim holds for some arbitrary terms t, t′.
, which by canonicity entails the claim.

Theorem 22
For all terms t, t′, the following three are equivalent: 1. UDA ⊨ t = t′; 2. SAA ⊨ t = t′; 3. WAA ⊨ t = t′.
Proof 1 ⇒ 2: SAA ⊆ UDA, hence the claim is obvious. 2 ⇒ 1: By contraposition: assume UDA ⊭ t = t′; hence there are U, σ for which the equality is false: σ(t) ≠ σ(t′). Now we take an isomorphic canonical UDA, which we denote by can(U), and which has the form B_1 × B_2. Now use the previous lemma: we obtain two strong ambiguous algebras A_1 and A_2, and the equality fails in at least one of them. 1 ⇒ 3: WAA ⊆ UDA, hence the claim is obvious. 3 ⇒ 2: SAA ⊆ WAA, hence the claim is obvious.
Hence we have three algebraic models, and all of them have the same equational theory, that is, the same set of valid equations. This is, given the difference in axiomatization, rather astonishing and shows an interesting convergence. Unfortunately, we cannot consider this convergence as evidence for the "correct" model of ambiguity, because all algebras have strongly unintuitive properties. On the other hand, we do not see any algebraic alternatives either, because it seems impossible to weaken the axioms of UDA without losing essential properties of ambiguity. Before we sketch the way out of this dilemma, we will quickly present the (rather simple) corollaries on the decidability of the equational theories:

Corollary 23
The equational theories of UDA, WAA, SAA are decidable; more precisely, their decision problem is NP-complete.
Proof We show the claim for SAA, from which all others follow. To check that SAA ⊨ t = t′, we just have to reduce the equation by interpreting ‖ as π_l and π_r respectively, by which the equality reduces to two Boolean equalities t_l = t′_l, t_r = t′_r. Then the question is equivalent to checking whether BA ⊨ t_l = t′_l and BA ⊨ t_r = t′_r, which is well-known to be NP-complete.
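The reduction in this proof can be sketched as a small decision procedure: project every ‖ to its left argument, then to its right argument, and compare the resulting Boolean terms by truth tables over the two-element Boolean algebra. The tuple encoding of terms and the function names are hypothetical; the sketch is a naive exponential-time check, not an optimized implementation.

```python
from itertools import product

# Hypothetical encoding: ("var", x), ("not", t), ("and", t, u), ("or", t, u), ("amb", t, u)

def project(t, side):
    """Interpret ‖ uniformly as left (side = 0) or right (side = 1) projection."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "amb":
        return project(t[1 + side], side)
    return (tag,) + tuple(project(u, side) for u in t[1:])

def variables(t):
    """Set of variable names occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(variables(u) for u in t[1:]))

def evaluate(t, val):
    """Truth value of a Boolean term under the valuation val."""
    tag = t[0]
    if tag == "var":
        return val[t[1]]
    if tag == "not":
        return not evaluate(t[1], val)
    if tag == "and":
        return evaluate(t[1], val) and evaluate(t[2], val)
    return evaluate(t[1], val) or evaluate(t[2], val)   # tag == "or"

def boolean_equal(t, u):
    """Truth-table check that a Boolean equation holds in all Boolean algebras."""
    names = sorted(variables(t) | variables(u))
    for bits in product((False, True), repeat=len(names)):
        val = dict(zip(names, bits))
        if evaluate(t, val) != evaluate(u, val):
            return False
    return True

def valid_equation(t, u):
    """Decide t = u by checking both the left-sided and the right-sided projection."""
    return all(boolean_equal(project(t, s), project(u, s)) for s in (0, 1))

a, b, c = ("var", "a"), ("var", "b"), ("var", "c")
margin = valid_equation(("amb", ("amb", a, b), c), ("amb", a, c))
commutativity = valid_equation(("amb", a, b), ("amb", b, a))
idempotence = valid_equation(("amb", a, a), a)
```

By Theorem 22, the same procedure decides equations for WAA and UDA. In the usage lines, the margin equation and idempotence come out valid, while commutativity does not.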
We will now quickly review one possible solution to the problem of the axioms being at the same time correct properties of ambiguity and "too strong". This solution is to use partial algebras and looks promising at first sight, but does not really lead out of our predicament.

Partiality
As we have mentioned above, a peculiar property of ambiguity in natural language is that it is, to our knowledge, never productive: ambiguities are in the lexicon, arise in syntactic derivations and from many other sources, but we cannot construct them ad libitum; there is no productive mechanism for ambiguity. This nicely motivates the idea of algebras where ‖ is a partial operation. Apart from this intuitive motivation of partiality, there is also a mathematical one: uniformity for SAA was derived from the existence of objects such as a‖∼a, 0‖1, which in natural language generally do not arise (leaving irony as part of pragmatics). The same holds for UDA, where proofs proceed over peculiar objects like 0‖a‖0 which need not necessarily exist. As UDA is the largest class of algebras we have presented and the only variety, we will present the results on partiality only for this class.
A partial universal distribution algebra is an algebra (U , ∧, ∨, ∼, , 0, 1), where is a partial function U × U → U , which satisfies the usual equalities: Here equations have to be read in the following fashion: if one side of the equality is defined, so is the other, and obviously both are identical. Moreover, as the operations ∼, ∧, ∨ are total, it follows that if a b is defined, so are ∼a ∼b, are absorbing for all operations. We now show that this extension does not really help. Assume we have a partial UDA U, where a, b ∈ U , and a b =⊥ (we use ⊥ as an abbreviation for undefined, not to be confused with 0!). Then we have the defined terms Here, 0 1 need not be defined, neither ∼a a. Still, we can conclude a number of things (arguments are similar to the ones above). Firstly, note that if a = b, then in all Boolean algebras we have either a ∨ ∼b < 1 or ∼a ∨ b < 1. For assume a ∨ ∼b = 1.
By the fact that these terms are defined and Boolean operations remain total in partial UDA, it follows that for all a ∈ U , we have a c such that Similar for ∨, where we get a = a a ∨ b a, and similarly, By the same argument, we get b = b a ∨ b b etc. This is devastating for a possible commutativity: assume we have a b = b a =⊥. Then it easily follows that And from these we conclude: This shows the following: Hence ambiguous elements collapse, provided we have commutativity! Here we can again make use of commutativity as a probe: it cannot be reasonably included into partial UDA. This in turn means for us that the theory is inadequate. There would still be some things to say about this class, and there are further results which show that it is an inadequate model of our intuition, but we omit them, as we do not see other results as neat and general as the ones presented for UDA.

(Intermediate) Conclusion
The main results of this section suggest that the algebraic approach offers some interesting insights, but is of little help in adequately addressing our original problem of reasoning with ambiguity. Even the weakest axioms result in consequences which are strongly counterintuitive. We will take the following way out: we think that the algebraic approach as such is unsuitable. Put differently: the problem is not the particular axioms (we have chosen the weakest ones implementing the requirements for ambiguity); algebra itself is the problem. There are two main features of algebra which can be abandoned while preserving the desiderata of ambiguity:
1. Uniform substitution of atoms by arbitrary terms preserves the truth of equalities. More formally: let term(X) denote the terms over variables X = {x_1, x_2, ...}; assume t, t′ ∈ term(X), and σ : X → term(X) is a function which is canonically extended to terms. Then if t = t′ is valid in an algebra, so is σ(t) = σ(t′).
2. Substitution of arbitrary equivalent terms preserves the truth of equalities, i.e. equivalence entails congruence. More formally: if t_1[t_2] is a term with subterm t_2, and t_1[t_2] = t_3 and t_2 = t′_2 are valid in an algebra, then so is t_1[t′_2] = t_3.
Actually, both features can be separately abandoned, each time resulting in a logic. Moreover, the two resulting logics exactly correspond to the two modes of reasoning with ambiguity we have sketched above (see Sect. 2.2, on consistent usage), depending on whether they assume consistent usage of ambiguous terms or not:
1. Lack of closure under uniform substitution corresponds to the distrustful mode [no consistent usage, see van Eijck and Jaspars (1996)].
2. Lack of closure under substitution of equivalents corresponds to the trustful mode (consistent usage of ambiguous terms), which we will consider here.
Obviously, the former is a fragment of the latter, that is, it has fewer valid inferences. In this article, we will only consider the second approach, and we will provide a comparison of the two modes in further work. Hence we assume trustful reasoning, which means we preserve closure under uniform substitution, but we will not have closure under substitution of equivalents. Logically speaking, substitution of equivalents corresponds to the rule (cut), which should not be admissible in our logic.

Multi-sequents and Contexts
The logic AL is an extension of classical (propositional) logic (we denote the classical sequent calculus by CL), that is, it derives the valid sequents of classical logic in the language restricted to CL, but it has an additional connective ‖, by which we can derive additional valid sequents. We will show that this extension is indeed conservative, if we do not include commutativity for ‖. The connective ‖ is not very exotic from the point of view of substructural logic: it is a fusion-style operator, which allows for contraction and expansion (its inverse), but not for weakening. We present it both in a commutative and a non-commutative version. Our approach differs from the usual approach to substructural logic in that we extend classical logic with a substructural connective, whereas usually, one considers logics which are proper fragments of classical logic. In order to make this possible, we have to go beyond the normal sequent calculus: we still have sequents, but we have different types of contexts: one of them we denote by □(...), which basically embeds classical logic, the other one we denote by ♦(...), which allows us to introduce the new connective ‖. The contexts thus differ in what kind of connectives we can introduce in them, and what kind of structural rules are allowed in them. Different contexts can be arbitrarily embedded within each other. We refer to the symbols □, ♦ as modalities (but they do not immediately relate to modal logic). We have found this idea briefly mentioned as a way to approach substructural logic in Restall (2008), and structures similar to multi-contexts are found in Dyckhoff et al. (2012). They are also used in the context of linear logic, see for example de Groote (1996).
We call the resulting structures multi-contexts. For given multi-contexts Δ, Γ, we call a pair Δ ⊢ Γ a multi-sequent. The calculus accordingly can be called a multi-sequent calculus. Our approach is particular in that we actually extend classical propositional contexts; thus AL is but one particular instance of multi-sequent logics. In our view, this field definitely deserves further study, but this no longer relates to ambiguity.
In order to increase readability, we distinguish contexts both by the symbols □, ♦, and by the type of separator we use between formulas/contexts. This will be the symbol ',' in the classical context, so □(α, β) is a well-formed (classical) context. Here ',' corresponds to ∧ on the left side of ⊢, and to ∨ on the right side of ⊢, and allows for all structural rules. In the ambiguous context, we use ';', hence ♦(α; β) is a well-formed (ambiguous) context. The symbol ';' corresponds to ‖, is self-dual, and allows for some structural rules such as contraction, but not for others, such as weakening (or commutativity, depending on whether we include it or not). Formulas are defined as usual: we have a set Var of propositional variables, and define the set of well-formed formulas WFF by -nothing else is in WFF.
As usual, we will omit outermost parentheses of formulas. Next, we define multicontexts; for sake of brevity, we refer to them simply as contexts.
Note that ♦ is strictly binary. This choice is somewhat arbitrary, but seems to be the most elegant way to prevent some technical problems. □ has no restriction in this sense. Γ ⊢ Δ is a well-formed multi-sequent, if both Γ, Δ are well-formed, classical contexts. We write Γ[α] to refer to a subformula α (actually a unary context □(α), see conventions below) of a context Γ; same for Γ[Δ], where Δ is a sub-context. More formally, Γ[−] can be thought of as a function from contexts to contexts. These context functions are inductively defined by 1.
). 4. Nothing else is a context function.
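The shape of multi-contexts can be pictured as a small data structure. The following is only a sketch under assumptions: the class names Box and Diamond, the use of plain strings for formulas, and the helper functions are all hypothetical, and the sketch only encodes what is stated in the surrounding text (□ takes any number of arguments, ♦ is strictly binary, and both sides of a multi-sequent are classical contexts).

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Box:
    """Classical context □(Γ_1, ..., Γ_n); any arity, including the empty context □()."""
    parts: Tuple["Context", ...] = ()

@dataclass(frozen=True)
class Diamond:
    """Ambiguous context ♦(Γ_1; Γ_2); strictly binary."""
    left: "Context"
    right: "Context"

# In this sketch, a plain string stands for a formula.
Context = Union[str, Box, Diamond]

def well_formed_sequent(antecedent, succedent):
    """A multi-sequent is well-formed if both sides are classical contexts."""
    return isinstance(antecedent, Box) and isinstance(succedent, Box)

def show(ctx):
    """Render a context, using ',' in classical and ';' in ambiguous contexts."""
    if isinstance(ctx, str):
        return ctx
    if isinstance(ctx, Box):
        return "□(" + ", ".join(show(p) for p in ctx.parts) + ")"
    return "♦(" + show(ctx.left) + "; " + show(ctx.right) + ")"

example = Box(("α", Diamond("β", "γ")))
rendered = show(example)
```

Making Diamond a two-field record rather than a tuple of arbitrary length mirrors the restriction that ♦ is strictly binary.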
The calculus with all modalities is somewhat clumsy to write, so we have a number of conventions for multi-sequents, to increase readability. These are important, as we make full use of them already in presenting the calculus.
-We omit the outermost context in multi-sequents. We can do this because it always is □(...), otherwise the sequent would not be well-formed. As a special case, we omit the empty context □(). Hence ⊢ α is a shorthand for □() ⊢ □(α) etc.
-We write Γ to refer to arbitrary contexts, so α, Γ is a shorthand for □(□(α), Γ).
We urge the reader to be careful: we will make full use of these conventions already in the presentation of the sequent calculus. The reason is that only this way will it be plainly obvious that our calculus is a neat extension of classical logic. Moreover, we aim to formulate the calculus in a way that makes the structural rules, as far as they are desired, admissible [see Negri and Plato (2001), for background]. Arguably, some rules could be formulated in an intuitively simpler way, but at the price of not having admissible structural rules, which is problematic for proof search. We skip the proof of basic properties such as the fact that all rules preserve well-formedness of multi-sequents, which in fact is not entirely trivial.

The Classical Context and Its Rules
The modality □ (partly) embeds the classical calculus; hence we have the following well-known rules: (∧I) and (I∨) show how ∧, ∨ correspond to ',', depending on the side of ⊢. For negation, we have slightly generalized standard rules: We let negation introduction pertain to the classical context, though it is somewhat intermediate. Note that the rules slightly generalize the classical rules; if i = 1, we have the classical rule. This extension is sound by universal distribution. In the following, we have the three structural rules of classical logic; these rules are of course restricted to the classical context. We will later show that weakening and contraction are admissible in the calculus (by the usual argument of reducing the degree of the rule), so the only rule we really need is commutativity.
This notation means that the rules can be equally applied on both sides of ⊢. Note that we have all these rules not for formulas, but for contexts (recall that in our notation, a formula is just a shorthand for an atomic context anyway). Also keep in mind that the classical context □ does not embed in itself; this is important to read (□weak), (□contr) properly. Hence by our conventions, the classical □ is really ubiquitous in the calculus.

The Ambiguous Context and Its Rules
♦ is a binary modality, and hence there should be no way to introduce single formulas in this context (recall that in the unary case, ♦(Γ ) is an abbreviation for (Γ )). The introduction rule for ♦ is as follows: Note that this rule implements and generalizes both (inf) and (mon) from the UDAaxioms: it models (inf) if either both Δ, Φ are empty (which is possible) or both Γ , Θ are empty, and it models (mon) if both Λ, Ψ are empty. Here our conventions allow us to formulate all these instances in one rule. By this, we can also see that these rules are in a sense a generalization of •-introduction in the Lambek-calculus. We have two more rules introducing ♦, which are admissible in the calculus with cut, but necessary to provide for proper distribution and invertibility of negation in the cut-free case. At first glance, they have nothing to do with negation, however they solve problems of distribution of negation in a surprising fashion: Firstly, note that they are sound due to negation properties: assume Γ Δ 1 , ψ 1 , Θ and Γ Δ 2 , ψ 2 , Θ are sound. Then so are Γ , ¬ψ 1 Δ 1 , Θ and Γ , ¬ψ 2 Δ 2 , Θ, hence Γ , ♦(¬ψ 1 ; ¬ψ 2 ) ♦(Δ 1 ; Δ 2 ), Θ. Now by distribution, we should have Γ , ¬(ψ 1 ψ 2 ) ♦(Δ 1 ; Δ 2 ), Θ, and by invertibility (aka double negation elimination) we should have: Γ ♦(Δ 1 ; Δ 2 ), Θ, ψ 1 ψ 2 . It is easy to see that (I♦),(♦I) allow for this kind of inference without any problematic steps such as deleting connectives. 10 There are two (parallel) introduction rules for : These rules eliminate the ♦-context, and create a classical one. There are two structural rules in ♦-context, namely associativity and contraction (we do for now not allow commutativity). (♦contr) is obviously admissible with cut, and even without cut, we will prove it to be admissible, so it is not part of the calculus. (♦assoc) Here double lines indicate that the rule works in both directions, and absence of means rules work equally on both sides. 
Together with (cut), these rules would be sufficient. However, we add two more rules which ensure that universal distribution is satisfied in the cut-free case. (inter1) This looks like a law for eliminating contexts, but it is rather a distributive law for ∧ on the left and ∨ on the right. Note that if we have a context Γ [ (♦(Δ; Ψ ), Δ )], we can always derive Γ [ (♦( (Δ, Δ ); Ψ ), (Δ, Δ ))] via (admissible) ( weak). We call the rule (inter1) since the resulting context Γ [♦(Δ; Ψ )] might be called an interpolant for the two premises, containing only the material common to the two. This formulation has two advantages: firstly, ( contr) is admissible with (inter1) (as we will show below), and more importantly, (inter1) is invertible: if the conclusion is correct, so are the premises, which is advantageous for proof search.

Hence (inter1) slightly generalizes normal distribution: it ensures we can properly distribute ∧ on the left and ∨ on the right; for the dual distribution of ∧ on the right and ∨ on the left we need a more problematic rule: Here again the conclusion can be thought of as an interpolant of the two premises, containing only the common material. To understand its meaning, consider that in terms of formulas, it means as much as We will motivate this rule more explicitly in Sect. 5.2.2. In particular, without this rule the rules (I∧) and (∨I) do not seem to be invertible, which would be very problematic. This rule has the problematic property that it eliminates material: the β of the right premise does not occur in the conclusion. However, this seems to be inevitable, and the drawback is made up for by two properties: firstly, (inter2) makes structural rules admissible, and secondly, it is fully invertible: truth of the conclusion entails truth of the (weaker) premises.
We will see that invertibility is actually of central importance for reasoning with ambiguity (see proof of Lemma 56 for an example), and also crucial for the matrix semantics. We will also quickly provide alternative, simpler, but less favorable equivalent versions for these two rules in Sect. 5.

Cut
We now present the cut rule. Its adaptation to multi-sequents is straightforward, as unary contexts are always classical (♦ is a strictly binary modality). (cut) Note that (cut) does not substitute formulas, but atomic contexts. It ensures transitivity and congruence without any special cases to consider. Importantly, as every context has a particular modality, the context inserted by cut also comes with a modality, but it does not need to be the same as the one of the cut-formula.

We define the notion of a derivation as usual by labelled proof-trees. A proof is a labelled tree where 1. all leaves are instantiations of (ax), and 2. every subtree of depth 1 is an instantiation of one of the other rules of the calculus. A multi-sequent Γ ⊢ Δ is derivable if it is the root of such a proof-tree. In this case, we write ⊩_AL Γ ⊢ Δ, meaning the sequent is derivable in AL. The cut-free calculus will also play an important role in the sequel; we denote this calculus by AL cf, and write ⊩_{AL cf} Γ ⊢ Δ if the sequent is derivable in AL without using the cut rule. In the sequel, we will mostly write ⊩ for ⊩_AL, and ⊩_cf for ⊩_{AL cf}. We first consider the full calculus AL, which is the less interesting of the two.

Algebraic Interpretations of AL
Because of the equivalence of the equational theories, we will only consider interpretations into UDA; by Theorem 22, all soundness and completeness results will hold for SAA and WAA as well. The interpretation of AL into UDA is straightforward, but we have to spell it out nonetheless. We define interpretations for contexts; this is necessary for the usual inductive soundness proof. Assume U ∈ UDA and σ : Var → U is an (atomic) interpretation. We define two interpretation functions σ , σ by: As is easy to see, σ and σ coincide on formulas, and hence in the formula case there is no reason to distinguish them. They also coincide in their interpretation of ';', but as there might be a classical context embedded, it is important to keep them distinct. We define truth in an algebra as usual: U, σ ⊨ Γ ⊢ Δ iff σ(Γ) ≤_U σ(Δ); as a special case, we write U, σ ⊨ ⊢ Δ iff 1_U ≤_U σ(Δ). Moreover, we define the notion of validity in a class as usual by UDA ⊨ Γ ⊢ Δ (stating that Γ ⊢ Δ is valid) iff for all U ∈ UDA, σ : Var → U, we have U, σ ⊨ Γ ⊢ Δ. We now prove soundness and completeness of UDA-semantics for AL, that is, UDA ⊨ Γ ⊢ Δ iff ⊩_AL Γ ⊢ Δ. We start with soundness.
Proof We make the usual induction over proof rules, showing they preserve correctness. We omit this for some of the classical rules for which the standard proofs can be taken over with minor modifications.
(inter2) We just consider the case on the left of ⊢; the other case is parallel. Assume By soundness of (∨I) and other simple rules, it follows that

(cut) We use the well-known fact that in Boolean algebras, we have a ∧ ¬b ≤ c iff a ≤ c ∨ b. Assume Γ[α] ⊢ Ψ and Δ ⊢ α, Θ are true in a model, and let θ ∈ WFF be a formula such that σ(θ) = σ(Θ). Then Δ, ¬θ ⊢ α is true, and since contexts cannot be negated, so is Γ [ (Δ, ¬θ)] ⊢ Ψ (by monotonicity). By Lemma 25, Γ[Δ], ¬θ ⊢ Ψ remains true, and by Boolean laws, so is Γ[Δ] ⊢ Ψ, θ, where θ can again be replaced by Θ.
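The Boolean fact used in the (cut) case can be checked mechanically on small examples. The following is only a sketch and not part of the calculus: it assumes power-set algebras encoded as bitmasks, and the function names `leq` and `residuation_holds` are ours. It verifies a ∧ ¬b ≤ c iff a ≤ c ∨ b by brute force over all triples of elements.

```python
from itertools import product


def leq(x, y):
    """x ≤ y in a power-set algebra encoded as bitmasks: x is a subset of y."""
    return (x & ~y) == 0


def residuation_holds(n):
    """Check a ∧ ¬b ≤ c  iff  a ≤ c ∨ b in the power set of an n-element set."""
    full = (1 << n) - 1                      # top element 1
    for a, b, c in product(range(full + 1), repeat=3):
        not_b = full & ~b                    # complement of b relative to the top element
        if leq(a & not_b, c) != leq(a, c | b):
            return False
    return True


if __name__ == "__main__":
    # n = 1 is the two-element Boolean algebra; n = 3 has eight elements
    print(residuation_holds(1), residuation_holds(3))
```

A finite check of this kind is of course no proof for arbitrary Boolean algebras; it merely illustrates the equivalence that lets us move θ across ⊢ in the argument above.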

Completeness for AL
We now present a standard algebraic completeness proof for AL and UDA via the Lindenbaum algebra for AL, denoted by Linda. Its carrier set M is the set of AL-formulas modulo logical equivalence: we write α ⊣⊢ β iff ⊩_AL α ⊢ β and ⊩_AL β ⊢ α. This relation is symmetric by definition, and it is reflexive and transitive (the latter by cut). We put [α] = {β : β ⊣⊢ α}, and M = {[α] : α ∈ WFF}. The next step will be to show that ⊣⊢, more than an equivalence relation, is a congruence over connectives.
Proof By cases; for all classical connectives, just use the standard proof; for ‖, this is no less straightforward.
Hence we can use the equivalence classes irrespective of representatives and define, for m, n ∈ M: Since our calculus subsumes the classical propositional calculus, the algebra (M, ∧, ∨, ∼, 0, 1) is a Boolean algebra, where the relation ≤ coincides with ⊢ (modulo equivalence). We prove this extension is a universal distribution algebra: Lemma 28 Linda = (M, ∧, ∨, ∼, ‖, 0, 1) is a universal distribution algebra.
Proof ⊢ corresponds to ≤, and = corresponds to ⊣⊢. Hence equalities fall into two subclaims, which we sometimes treat separately.
( 2) i. ¬(a ‖ b) ≤ ¬a ‖ ¬b Straightforward; we abbreviate the proof: is easy to derive from a ⊢ a ∨ c, b ⊢ b ∨ d and (I♦I).
So we obtain a completeness result following the standard argument: if a sequent is valid in universal distribution algebras, it is in particular valid in Linda, the term algebra; hence it is derivable in the calculus. This proves part 1 of the following theorem; parts 2 and 3 follow by equivalence of equational theories. Note, by the way, that for completeness we need none of (inter1), (inter2), (♦I), (I♦), hence these rules are admissible, provided we have (cut). This shows that for a complete logic for UDA, we only need a slight extension of the classical calculus (denoted by CL) with (I♦I) and (‖I), (I‖).

Corollary 30 In AL, the rules (inter1),(inter2),(♦I),(I♦) are admissible.
Hence CL with three additional rules is enough to be sound and complete for UDA. In the cut-free calculus however, these admissible rules will be of crucial importance to ensure congruence results, especially for distributive laws.
Given the negative results we have obtained for our algebras, Theorem 29 is not a positive result for AL; on the contrary, it entails that AL is inadequate for reasoning with ambiguity. The crucial property here is what we call congruence: Δ. This is the logical counterpart of congruence in algebra, and it is ensured by the rule (cut). If we omit this rule, we lose congruence, and more importantly, we can no longer derive the undesirable results which follow from Theorem 29. This is why the cut-free calculus AL cf will be the focus of what follows.
As special cases, we obtain the following, which is not trivial because of the presence of the cut rule! Corollary 32 Let Δ, Γ be multi-sequents which (1) do not contain any occurrence of ‖, (2) nor any occurrence of ♦.
Hence, AL is a conservative extension of classical logic. This tells us something about commutativity as well (these considerations are actually just the logical counterpart to what we already said about UDA).

Convention
We use 1 in proofs as a placeholder for an arbitrary theorem of classical logic, and 0 as a placeholder for an arbitrary contradiction of classical logic. It is important that 1 is not equal to a particular classical theorem, since in AL cf, not all classical theorems are exchangeable in proofs; the same holds for 0! Now take the following rule: Lemma 33 Let AL comm be the calculus AL (with cut) with the additional rule (♦comm). For every α ∈ WFF, we have ⊩_{AL comm} α; put differently: AL comm is inconsistent.
Note: this result, as we explained above, disqualifies the calculus for our purposes. However, as we will argue more explicitly below, this only concerns the calculus AL comm with (cut); without (cut), the rules of AL cf do not seem to derive any counterintuitive sequents (of course this remains an open problem, and it seems too early to make this a definite claim), and AL cf+comm is provably consistent.
For the remainder of this article, we will therefore only consider the cut-free calculus AL cf .

Weakening in Classical Context
It is well known that in the classical calculus with shared contexts, weakening is admissible (see Negri and Plato 2001). We slightly extend this result to our new calculus and multi-sequents. Recall that we write ⊩ for ⊩_AL, and ⊩_cf for ⊩_{AL cf}.

Definition 34
We write ⊩^n Γ ⊢ Δ if the longest branch from root to leaf in the shortest AL proof tree of Γ ⊢ Δ has length ≤ n; the same holds for ⊩^n_cf Γ ⊢ Δ [hence n ≥ 1; we skip the inductive definition for reasons of space, for background see (Negri and Plato 2001)].
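The measure of Definition 34 can be illustrated by a small sketch. We assume a hypothetical tree representation: the class `Node` and the rule labels used as strings are ours and not part of the calculus. The sketch computes the length of the longest branch of one given proof tree, counting nodes so that a single axiom has length 1, and checks that exactly the leaves are axioms. It does not check that each step is a correct instance of its rule, and it does not search for the shortest proof.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    rule: str                                   # label of the rule applied, e.g. "ax"
    premises: List["Node"] = field(default_factory=list)


def is_well_formed(tree: Node) -> bool:
    """Leaves must be labelled (ax); inner nodes must carry another rule."""
    if not tree.premises:
        return tree.rule == "ax"
    return tree.rule != "ax" and all(is_well_formed(p) for p in tree.premises)


def height(tree: Node) -> int:
    """Number of nodes on the longest branch from root to leaf (so height >= 1)."""
    if not tree.premises:
        return 1
    return 1 + max(height(p) for p in tree.premises)


if __name__ == "__main__":
    # a hypothetical proof shape: a binary rule whose right premise needs one more step
    proof = Node("I♦I", [Node("ax"), Node("I∧", [Node("ax"), Node("ax")])])
    print(is_well_formed(proof), height(proof))
```

Under this reading, a sequent would satisfy the definition for n if some well-formed proof tree of it has `height` at most n.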
The following is a standard lemma: Proof Induction over n; the induction base is clear, for the way we formulated the axiom. So assume the claim holds for some n ∈ N. We can now make the usual case distinction as to the last rule applied in the derivation of the sequent. By induction hypothesis, it is sufficient to show that the following rule ( weak) can be exchanged with the preceding one in some way, thereby moving upward in the tree. As the argument is entirely standard (and takes pages if spelled out), we just illustrate it with one example: We can move the rule upward by re-arranging the derivation as follows: Similar (and mostly much easier) arguments can be applied in all cases where weakening is applied in other positions, and the same holds for all other rules of the calculus.

More Distribution Rules of AL cf
We prove the admissibility of some additional distribution rules, which in turn will be important to prove general invertibility and congruence results. Firstly, consider the following rules:
The main problem of (distr) and (subst) is that neither of them is invertible. In particular the rule (subst) is very problematic for proof search, as the set of possible premises is infinite, but (contrary to (inter2)) the derivability of the conclusion does not guarantee the derivability of a possible premise. Moreover, we conjecture that the calculus with (subst) and (distr) instead of (inter1), (inter2) does not allow for the admissibility of ( contr), hence our formulation seems preferable. We will however use the rules (distr), (subst) from time to time if it makes proofs more perspicuous.
The following rules are slightly stronger inversions of the rule (distr).
In these rules, we move a context out of the scope of an ambiguous context, for which we have to distinguish two cases (as ambiguity is, for now, not commutative). The following lemma shows that these rules are admissible in our calculus. This only proves the claim for (distr1), but the parallelism with (distr2) is so obvious that we assume we can omit even the statement.

Proof
We only prove 1., as 2. is completely parallel. We make an induction over n: the induction base is clear, because (ax) is based on a single formula not within the scope of ♦, hence this formula is not affected by the re-arrangement. Now assume the claim holds for some n ∈ N. We prove it holds for n + 1 by case distinction as to which was the last rule applied in the derivation, followed by (distr1) or (distr2). For rules introducing connectives, this is a plain standard argument. Now assume we have a derivation Then we can also derive and by the n-admissibility of weakening, the derivation length does not increase, hence the claim follows in this case. (♦I),(I♦) are similar. (inter1) is straightforward: since (distr1) and (distr2) are weaker inverses of the rules (modulo weakening and contraction), the rules can be easily commuted. The exchanging of the rule (inter2) followed by an arbitrary instance of (distr1), (distr2) is also an easy exercise.

Contraction in Classical Context
We now consider contraction in classical context.
Proof We make the usual induction over n, where we distinguish cases according to the last rule applied in the proof. The classical rules do not pose problems, as the reductions are well known; the rules (I♦I), (♦I), (I♦) are obviously formulated in a way to make contraction admissible. So we only consider some critical rules; moreover, we omit the symbol ⊢ in proofs if the proof works equally on both sides.
By the n-admissibility of weakening, this shortens the proof, hence the claim follows. (inter1),(inter2) are also obviously formulated in a way such that any instance of them, followed by ( contr), can be easily commuted.
‖-introduction rules are unproblematic, because we use contraction of contexts rather than formulas: hence instead of contracting α ‖ β, we can equally well contract ♦(α; β). This finishes the proof, though of course we omit many unproblematic cases.
We omit the parallel lemma for the right-hand side, as everything is completely parallel.

Expansion in Ambiguous Context
There is a dual rule to (♦contr), expansion in ambiguous context: This is again a shorthand for two rules, and in a sense a special case of weakening (which is not admissible in ambiguous context). It obviously corresponds, together with (♦contr), to the idempotence of ambiguity.

Contraction in Ambiguous Context
The rule (♦contr) is now easy to show admissible; in fact, it is even derivable by a sequence of (distr1) and ( contr); this is because we always have the empty context () at our disposal.
Hence we remain with but one rule in AL cf which is problematic for proof search, namely (inter2).

Invertibility
A crucial property of proof systems is their invertibility: if a sequent Γ′ ⊢ Δ′ can be derived from Γ ⊢ Δ by a rule, then if the former is derivable, so is the latter (same for several premises). Invertibility of rules is often straightforward to prove; for us, invertibility is one main reason we have the problematic rule (inter2). We now present the results on invertibility:

Lemma 43 (Invertibility Lemma)
Proof All claims are straightforward by rule formulation. Formally, they can be proved by induction over proof length, and exchanging the critical rule with the previous ones, which works fine in all cases.

Admissible Rules II: Cut and Restricted Cut Rules
The following important result is actually straightforward to prove now:

Theorem 44
The rule (cut) is not admissible in AL cf; put differently, there are sequents which are derivable in AL, but not in AL cf.
Recall that we let 1 stand for an arbitrary classical tautology, 0 a classical contradiction.
Proof By completeness, we know that ⊩_AL 1 ⊢ ♦(1;0;1). This sequent is not derivable in AL cf, because AL cf+comm, which is AL cf with (♦comm), is a conservative extension of classical logic (see Lemma 49), and if the sequent were derivable, we would also be able to derive 1 ⊢ ♦(0; 1; 0) ⊢ 0 (using the usual methods), hence 1 ⊢ 0, a contradiction.
There is however a weaker cut rule which is admissible, namely cut where the cut-formula is unambiguous and in unambiguous context. This means we have a sort of "Boolean transitivity". The argument for this is standard (reduction of cut-degree). The cases where this procedure of reduction does not work are actually (‖I), (I‖) inside the cut-formula, and the cut-formula in ambiguous context. But if we exclude them by definition, the cut remains admissible:

Lemma 45
In AL cf , the rule (classic cut) is admissible.
Proof Basically, one can reproduce the classical proof for cut-elimination. ♦ and ‖ will be unproblematic in side formulas, the only place where they have to be considered.
This result is not as uninteresting as it might seem, given the importance AL cf will have for us. There is another restricted cut rule which does not restrict the cut-formula, but the context to the identity context. This rule corresponds to transitivity of consequence and will be called (trans) accordingly. (trans)
Hence in general, AL cf does not even allow for transitivity of inference; but for unambiguous formulas in unambiguous context (that is, not embedded within the scope of ♦), we can allow cut. This entails that transitivity of inference does not hold in general, but in special cases: if ⊩_cf Γ ⊢ α, ⊩_cf α ⊢ Δ, and α is a formula of CL, then ⊩_cf Γ ⊢ Δ.

Decidability
Lemma 47 AL is decidable, that is, we can decide whether ⊩_AL Γ ⊢ Δ for arbitrary Γ, Δ.
Proof This follows from completeness; in fact, Corollary 23 entails the stronger claim that the problem is NP-complete.
For AL cf , we leave this problem open: There is only one rule which remains problematic for proof search in the calculus, namely (inter2). However, we do not see how this can be dispensed with without losing invertibility, which is a crucial feature both for proof-theory and semantics of the cut-free calculus.

The Main Hypothesis
The results of the last sections strongly indicate that AL with cut is not a good model for reasoning with ambiguity, despite the fact that all axioms of UDA and all inference rules of AL agree with our intuitions. How does this go together? As we said, logically speaking, the problem lies in the cut rule, and algebraically speaking, the problem is the fact that our semantics is congruent, that is, we can always substitute equivalents preserving the truth of equalities. Being congruent is actually the core of being algebraic, so if we dismiss this feature, we should be careful in motivating this, explaining what this means, and formalizing this intuition. Firstly, we formulate our main hypothesis: Conjecture 2 (Main hypothesis) Under the assumption of consistent usage, if a sequent α ⊢ β is derivable in AL cf, then the inference is intuitively sound. Moreover, every intuitively sound inference with ambiguous propositions can be derived in AL cf. This is basically the main conceptual claim we make in this article, but we qualify it immediately: firstly, the hypothesis cannot be proved, it can only be falsified by deriving some intuitively unsound sequent, or showing that some intuitively sound sequent is not derivable. So we use this as a benchmark, hoping that by trying to falsify this hypothesis we will further our understanding of reasoning with ambiguity. Our motivation for this hypothesis is mostly empirical, given the previous results and the fact that for all counter-intuitive results we considered, we actually do need the cut rule to derive them.
How can we best explain the fact that the calculus closest to our intuition should be one without cut and without algebraic semantics (which obviously subsumes truth-functional semantics)? In our view, the main point is that ambiguity is something on the border between syntax and semantics. We have pointed out the parallelism between syntax and semantics, which is accounted for by universal distribution. Having the laws of universal distribution allows us to transfer ambiguity from syntax to semantics. Incongruence, on the other hand, is maybe the price we have to pay for this, as even semantically, there remains something syntactic to ambiguity: the syntactic form of formulas matters beyond mutual derivability, hence the same must hold for terms in semantics. This is exactly the core of incongruence: the fact that two formulas are inferentially equivalent (usually written α ⊣⊢ β) does not entail that we can substitute one for the other in all contexts (which we will write α ≡ β).
Incongruence is something which cannot be captured algebraically, hence we will have to look for an alternative semantics for AL cf . We will present a matrix-style semantics for the cut-free calculus, which is based on strings, where each string can be thought of as a sort of ambiguous normal form. This section is structured as follows: Firstly, we will explore the main hypothesis and sketch why it is plausible according to us. Then we will present the matrix semantics of AL cf and prove its soundness and completeness. Finally, we will ponder about what it means and what we can learn for reasoning with ambiguity.

Cut-Free AL: Evidence for the Main Hypothesis
The main hypothesis cannot be mathematically proved, but we can gather some support for it. The hypothesis falls into two parts we might call soundness and completeness. The soundness part states: if a sequent is derivable in AL cf, then it is intuitively correct. This part is easier to grasp: once we have an intuition on what multi-sequents and inference rules mean, we can just use the usual induction over rules. Since this might be considered unsatisfying in view of the counterintuitive results we obtained before, we will establish the following result. Recall that AL comm is the calculus AL enriched with the rule (♦comm); we let AL cf+comm be the corresponding calculus without cut. As we showed before, AL comm is inconsistent, that is, every sequent is derivable. Recall that we let 0, in the context of our logic, stand for an arbitrary classical contradiction. We now show the following: Proof The claim is straightforward, as in the cut-free calculus, to derive 0, we can only use classical rules, as there is no possibility to eliminate ambiguity once it is introduced. Actually, by the same argument we can easily conclude the following: Lemma 49 AL cf+comm is a conservative extension of CL, that is, it derives the same sequents in the classical language.
It is hard to gather evidence for the "completeness direction" of the main hypothesis, namely that all valid inferences with ambiguous terms are derivable in AL cf. We can however show that a number of properties hold which we would like to hold; these mostly regard the congruence of formulas. Recall that formulas, as terms, have ambiguous normal forms, which however are not unique. We let anf(φ) denote the set of ambiguous normal forms of φ. In the following we consider formulas only up to bracketing for ‖; hence we treat all formulas of the form α₁ ‖ ... ‖ αᵢ, with arbitrary bracketing, as equivalent.

Definition 50
We define anf(φ) syntactically by It is easy to see that α ∈ anf(β) for some β if and only if α = α₁ ‖ ... ‖ αᵢ, where α₁, ..., αᵢ are classical. Note that we define this concept by iterating distribution rules on formulas, hence in particular without reference to any proof theory or semantics. Hence (p ∧ r) ‖ (q ∧ r) ∈ anf((p ‖ q) ∧ r), but (r ∧ p) ‖ (q ∧ r) ∉ anf((p ‖ q) ∧ r), since ∧-commutation is not part of the definition! We now come to another crucial concept. In a cut-free calculus, we have to distinguish two concepts: one is with reflexive closure , which is mutual derivability. The other (and probably more important) concept is the following: . We let ≡ denote the reflexive closure of .
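To make the idea of iterating distribution rules concrete, the following is a rough sketch and not the definition itself. We assume that the rules in question are the distribution of ¬, ∧ and ∨ over ‖ in both argument positions, that subformulas are normalized first, and that a normal form α₁ ‖ ... ‖ αᵢ can be represented as a flat tuple of classical formulas (bracketing of ‖ being ignored). The term encoding and the function name `anf` are ours.

```python
def anf(phi):
    """Return a set of flat normal forms of phi.

    Terms: a variable is a string; compound terms are tuples
    ("not", a), ("and", a, b), ("or", a, b), ("amb", a, b), where "amb" is ‖.
    A normal form is a tuple (c1, ..., ci) of classical terms, read c1 ‖ ... ‖ ci.
    """
    if isinstance(phi, str):                      # propositional variable
        return {(phi,)}
    op = phi[0]
    if op == "amb":                               # concatenate the readings
        return {x + y for x in anf(phi[1]) for y in anf(phi[2])}
    if op == "not":                               # ¬(a ‖ b) becomes ¬a ‖ ¬b
        return {tuple(("not", c) for c in x) for x in anf(phi[1])}
    if op in ("and", "or"):
        out = set()
        for x in anf(phi[1]):
            for y in anf(phi[2]):
                # distribute over the left argument first ...
                out.add(tuple((op, a, b) for a in x for b in y))
                # ... or over the right argument first
                out.add(tuple((op, a, b) for b in y for a in x))
        return out
    raise ValueError(f"unknown connective: {op!r}")


if __name__ == "__main__":
    phi = ("and", ("amb", "p", "q"), "r")         # (p ‖ q) ∧ r
    print(anf(phi))
```

On this encoding, the normal form (p ∧ r) ‖ (q ∧ r) is produced for (p ‖ q) ∧ r, while (r ∧ p) ‖ (q ∧ r) is not, since the sketch never commutes the arguments of ∧. When both arguments are ambiguous, the two distribution orders yield different normal forms, which illustrates why normal forms are not unique.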
In a cut-free calculus, ≡ is the largest congruence on formulas which respects derivable sequents. Note that both and ≡ are transitive relations, and ≡ is an equivalence relation. What is particularly interesting for our "completeness direction" of the main hypothesis is not derivability but rather congruence of certain formulas; hence is more interesting than . 12 We will also need the following: . The following is obvious yet important: Lemma 52 For every multi-sequent Γ , there are formulas γ l , γ r , such that γ l ≡ l Γ and γ r ≡ r Γ .
To prove the following crucial lemma, we need to define a slightly odd measure of formula complexity c : WFF → N. - · c(β) + 1 This is to account for the fact that distributing ¬, ∧, ∨ over ‖ can significantly increase formula complexity in terms of the number of both connectives and variables. On the contrary, simple arithmetic tells us that c((α ∧ γ) ‖ (β ∧ γ)) < c((α ‖ β) ∧ γ) etc. An easy induction then yields that This is important in the following proof, where we often substitute formulas by their ambiguous normal forms.
Proof Induction over c(φ). The base case is clear, since c(φ) = 0 entails φ ∈ Var. Assume the claim holds for all formulas with complexity ≤ n, and the complexity of φ is n + 1.
Hence every formula is congruent to all of its ambiguous normal forms, as it should be by universal distribution. This means among other things that if anf(α) ∩ anf(β) ≠ ∅ ≠ anf(β) ∩ anf(γ), then α ≡ γ, as congruence is an equivalence relation (hence transitive).
With the rules we have established, it is easy to derive the following law of disambiguation: We would like to show a stronger result, namely that this can be strengthened to , which is to say, disambiguation can be applied in arbitrary contexts. This corresponds to the following rules of left/right disambiguation, which we will prove to be admissible: , with α, α classical These rules are not dual to each other: (disamb I) introduces arbitrary contexts, (I disamb) eliminates classical formulas. Each of the two has a separate dual which we omit as it is less immediate to disambiguation. To prove their admissibility we need two auxiliary lemmas.
Lemma 57 Assume ξ is a formula of classical logic.
Hence we can eliminate classical contradictions on the right, classical theorems on the left of ⊢. This will also have some relevance in matrix semantics.
Proof Just consider that (26) Proof We make an induction over n in ⊩^n_cf, simultaneously for both 1. and 2., so we cover negation rules. The most difficult case is the base case: assume ⊩^1_cf α, Γ ⊢ α, Θ. 1.: Case 1 Assume ♦(Δ; Δ′) is a subterm of Γ. As Γ is arbitrary, the claim follows immediately.
Case 2 Assume Δ := α (hence Δ′ = ()) Then we prove is parallel. Now we come to the induction step. Assume the claim holds for some n; we make a case distinction as to the last rule applied. Basically all cases are straightforward except for (¬I), (I¬), if they are applied to the context ♦(Δ; Δ′). However, here the claim follows from the fact that we perform the induction simultaneously for both claims, and a contradiction is the negation of a theorem and vice versa.
There is a dual result stating that φ 1 by the dual rules of the ones we presented. This would be co-disambiguation. Of course, there are many more rules one could consider interesting and important, but we think that beyond these crucial ones, the choice would become arbitrary. What we consider most important is that for many critical properties of ambiguity, mostly regarding distributive laws, we actually have congruence of formulas representing (arguably) congruent meanings.
We hope that these results will have convinced the reader that AL cf is a powerful logic for reasoning with ambiguity. Having established these results, we will now provide it with a rather simple and natural semantics. This will of course not be algebraic or set-theoretic, as these are excluded by incongruence. It could rather be qualified as language-theoretic, as we interpret formulas and sequents as (sets of) strings, where every string corresponds to an ambiguous normal form.

Matrix Semantics for AL cf
We now present a semantics for cut-free AL cf , which is based on matrix semantics; we adapt the definition of a Gentzen matrix from Galatos et al. (2007, chapter 7). We define an ambiguity matrix as a structure (A, ), where A = (A, ∧, ∨, ∼, 0, 1) is an arbitrary algebra (hence not necessarily a Boolean algebra!) of the signature of Boolean algebras, and ⊆ A * × A * , where A * denotes the set of finite strings over A, including the empty string ; A + denotes the set of non-empty strings. In this section, we will use the convention that letters a, b, c, d, ... are used for letters in A, whereas letters u, v, w, x, y, z are used for strings in A * . Before we define , we have to introduce some important shorthands: for w, v ∈ A + , a ∈ A, w ∧ a, w ∨ a, w ∧ v, w ∨ v, ∼w are not defined: importantly, pseudo-Boolean operations are only defined for letters in A, not strings! But we still use terms over strings as a shorthand, by the following string abbreviations: The definitions of w∧v and w∨v require that w 1 , w 2 , v 1 , v 2 ∈ A + ; this ensures that at some point we will have a term which is well-defined by lines one and two, like w∧a etc. Hence w ∨ v and w ∧ v are defined inductively. The representation axioms below will ensure that all members of these sets are congruent, that is, exchangeable in all contexts. As a result, we can use all pseudo-Boolean operations on arbitrary words in A + ; but it is important to keep in mind that they are abbreviations for operations which are defined only for letters in A itself, and operations are not necessarily Boolean. The relation has to satisfy a number of conditions, which are as follows: Basic: M1. For all a ∈ A, a a M2.
We denote the class of ambiguity matrices which satisfy the above requirements by AM. It is important to underline that an ambiguity matrix (A, ) is not an algebra; A is an algebra, ⊆ A * × A * is a relation between strings of terms. It is of course easy to see that concatenation corresponds to ambiguity. Maybe a comment on the representation axioms: as we have said, a term like w ∧ v is just an abbreviation for a set of strings. The representation axioms are necessary that all strings abbreviated by a term are exchangeable in all contexts (we will come to this below). Note that in all ambiguity matrices, id. xwwy z iff xwy z is derivable from these axioms (same on the right): assume xwwy z. Then x(w ∨ w)(w ∨ w)y z (by M4.), hence x((ww) ∨ w)y z (by notation); hence (again by M4.) xwy z. For the other direction, assume xwy z, hence x(w ∧ (ww))y z, where w∧(ww) is an abbreviation for (w∧w)(w∧w), hence x(w∧w)(w∧w))y z, and so xwwy z. The main use for 1l. is that it ensures that empty left. M8.,M11.), and by 1l., w v. Hence 1l. allows us to simulate an empty left-hand side, same for 0r. on the right. Keep in mind that the terms we write are just abbreviations for strings, and operations on the empty string are undefined! We have a (somewhat sloppy) correspondence of strings with ambiguous normal forms on the one side, and proof rules with conditions on on the other. The only rule we miss is (cut), and in fact this would correspond to a very peculiar property of the ambiguity matrix which does not hold in general: (strong transitivity) If y y, x yz u, then x y z u.
We will refer to this property below in Theorem 75. But we first come to an important definition. This is an important relation, as this does not coincide with the symmetric closure of , which we denote by ; in (incongruent) ambiguity matrices, is not a congruence on A * ! Note that matrices are more general than UDA for exactly this reason. In matrices, strings are less interesting than their congruences classes; we define w = {x : x w}, and A * = {w : w ∈ A * }. We will show that pseudo-Boolean operations can be applied to congruence classes; hence A * forms an algebra, which we will describe in Sect. 6.6. We now provide some examples of ambiguity matrices.
Example 1 Let A 1 be an arbitrary algebra of the signature of Boolean algebras, and put 1 = A * × A * . Then (A 1 , 1 ) is a matrix, as it clearly satisfies the base conditions M1.-M3., and moreover all other conditions which have the form of implications, of which the consequence is always true in our example. This is the matrix in which every sequent is valid, and none is falsified. If we take the set of 1 -congruence classes (there is just one), this matrix becomes the trivial commutative universal distribution algebra.
Example 2 Let A 2 be the two-element Boolean algebra, hence A 2 = {0, 1}. The smallest matrix relation 2 for this algebra is the following: we have awb 2 cvd iff a ≤ c, b ≤ d (read ≤ as on natural numbers). The reason is that 1001 2 (10) ∨ ∼(10) and 0110 2 (10) ∧ ∼(10), and we have idempotence. Hence the Margin Lemma applies to this matrix (A 2 , 2 ), and it is easy to show that the algebra of its 2 -congruence classes is a universal distribution algebra. This example shows that making the underlying algebra A a Boolean algebra already heavily restricts the possibilities of the matrix relation; this is the reason we allow for arbitrary algebras. A more general result on the effect of the underlying algebra will be presented in Theorem 75.
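The relation of Example 2 can be made concrete in a few lines. The following is a minimal sketch, not part of the original development: it assumes that strings over {0, 1} are written as Python strings, and that a single letter a is treated like aa (which idempotence permits), so that its first and last letter coincide. The names `related`, `negate` and `distribute` are hypothetical, and the order in which `distribute` lists the letter pairs is an assumption about how the string abbreviations unfold.

```python
def related(left: str, right: str) -> bool:
    """Margin test of Example 2: compare first letters and last letters."""
    for word in (left, right):
        if not word or set(word) - {"0", "1"}:
            raise ValueError("expected a non-empty string over {0, 1}")
    # Read <= as on natural numbers, as in the text.
    return int(left[0]) <= int(right[0]) and int(left[-1]) <= int(right[-1])


def negate(word: str) -> str:
    """Letterwise complement in the two-element Boolean algebra."""
    return "".join("1" if letter == "0" else "0" for letter in word)


def distribute(op, left: str, right: str) -> str:
    """Unfold 'left op right' into a string of letterwise results.

    Assumption: every letter of `left` is combined with every letter of
    `right`, with the letters of `left` varying slowest.
    """
    return "".join(str(op(int(a), int(b))) for a in left for b in right)


w = "10"
join = distribute(max, w, negate(w))  # (10) v ~(10)
meet = distribute(min, w, negate(w))  # (10) ^ ~(10)
print(join, related("1001", join))    # prints: 1101 True
print(meet, related("0110", meet))    # prints: 0100 True
```

Under these assumptions the two instances mentioned in the example hold. Only the margins of a string matter for the test, which is the behaviour the Margin Lemma describes.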
Example 3 Let G be an arbitrary set, A = term(G), the set of all Boolean terms over G. We let A 3 be the term algebra, that is, the algebra where every term denotes itself and nothing else. We let 3 denote the smallest relation such that for every a ∈ A, w ∈ A * a 3 a, w 3 1, 0 3 w, and all other matrix axioms (which have the form of implications) are satisfied, and nothing else. This is a well-formed inductive definition, and it defines the free matrix generated by G. Whereas Example 1 was a matrix making every sequent valid, (A 3 , 3 ) is a matrix making only those sequents valid which are valid in every matrix containing G. It is actually not difficult to show that the formula-matrix, which we define for our completeness proof below, is equivalent to the matrix generated by G = Var. If we take the set of the 3 -congruence classes, the result is not a universal distribution algebra, provided |G| ≥ 2; this follows from Theorem 75.
Informally, Theorem 75 gives a number of equivalent conditions under which a matrix (A, ) is equivalent to a universal distribution algebra (in a sense to be made precise), among which are: it satisfies (strong transitivity), and the algebra of its congruence classes is Boolean. We present this result later on, since it is much easier to prove with the following completeness result.

Matrix Interpretation of AL cf
Given a matrix (A, ), there are several possible interpretations of AL cf . Importantly, we will interpret variables into -congruence classes, not strings. Interpreting variables as letters/strings leads to problems, since for example w ∧ v does not represent a unique string, so the interpretation would become non-functional. We will interpret variables as -congruence classes of letters. We thus take a map σ : Var → A and extend it canonically to σ , σ : WFF → A * . To this end, we need to define the operations ∨, ∧, ∼ for congruence classes; the same for concatenation, which we then denote by ·: The result is always a congruence class (see below). As usual, · will often be omitted, and we often write strings as arbitrary representatives of their congruence class (justified by Lemma 65).
We consider the interpretation where atomic formulas are mapped to congruence classes of single letters, that is σ ( p) = a for some a, and call this interpretation unambiguous. This is however not necessary; we could also assume that a formula like p is interpreted as ambiguous (i.e. as a string). Given σ : Var → A, we now define matrix interpretations: Note that by the string abbreviations, it follows immediately that Same for σ . Hence the distributive laws are already implicit in interpretations! Definition 63 (Truth and validity in ambiguity matrices) We get a neat semantics, where all forms of ambiguity are just interpreted as concatenation in words. We now have to prove some properties for matrices and interpretations: Lemma 64 In all ambiguity matrices, 2. Parallel. 3. Assume x(∼w)y z. Then 1 z ∨ ∼x(∼∼w)∼y (empty left.), hence 1 z ∨ (∼x)w(∼y) (DN2.), so 1 ∧ ∼z ((∼x)w ∼y) (M10.), and inverting back again, x(∼w )y z.
Parallel on the right-hand side. 4. Assume xwvy z. Then xw vy z and so xw v y z. Parallel on the right-hand side.
This lemma is important, because it shows that for -congruence classes M, N , we can choose arbitrary representatives w ∈ M, v ∈ N to define classes M ∧ N etc. Algebraically speaking, operations are independent of representatives. In the sequel, we will consider strings only up to -congruence, writing things like w ∧ v = u as a shorthand for both w ∧ v = u and w ∧ v u.
(ax) Obviously sound by the interpretation; we always have a a, and so w ∧ a a ∨ v for all w, v ∈ A * (∧I),(I∨) Interpretation of sequents remains identical. (I∧) Sound by Lemma 66, and M7.
( comm) Application on the left-hand side is sound by ∧comm, on the right-hand side by comm∨.
( weak) (admissible) Clear by ∨, ∧ introduction: we thereby can immediately ∨repr. Similar. BD1. Assume w(x ∧ (y ∨ z))v AL u. Note that this is an abbreviation: for x = a 1 ...a n , w(a 1 ∧ (y ∨ z))...(a n ∧ (y ∨ z))v AL u, which itself is again an abbreviation etc. If we take the word letterwise, we find that letters have the form a ∧ (b ∨ c). Hence, as the ∨ is not within the scope of ¬, by invertibility of (∧I),(∨I), we can substitute every letter by a ∧ b, same for a ∧ c. Hence we can derive w(x ∧ y)v AL u and w(x ∧ z)v AL u. Now we can apply the rule (∨I) to the formulas for x ∧ y and x ∧ z. Now the claim follows easily from the ANF Lemma and invertibility of ( I). BD2. parallel. DN1. Straightforward by Lemma 54 and the ANF Lemma. DN2. Parallel. 1l. Assume cf ♦(α 1 ; ...; β ∧ 1; ...; α i ) Γ . By repeated weakening and ( I), we We need one more lemma. We define the canonical interpretation into A AL by σ ( p) = p ; the extension to arbitrary formulas is as usual. Recall that σ and σ coincide on formulas.
Note that the inclusion is generally proper: for example p ∨ q ∉ anf(q ∨ p), but p ∨ q ∈ σ (q ∨ p) (σ the canonical interpretation), since the two are congruent in every matrix. Having established this, we can easily prove the main claim: Only-if By contraposition: assume we have an underivable sequent AL cf Γ Δ. Let γ, δ be the formulas congruent to these sequents (by Lemma 52). By the ANF Lemma, we have γ = γ 1 ... γ n ∈ anf(γ ), δ = δ 1 ... δ m ∈ anf(δ), such that AL cf γ δ . By the invertibility of the rules ( I),(I ), we have AL cf Γ Δ , where Γ = ♦(γ 1 ; ...; γ n ), Δ = ♦(δ 1 ; ...; δ m ), where γ 1 , ..., γ n , δ 1 , ..., δ m are classical (since they are the components of an ambiguous normal form). Now by the previous lemma it follows that γ 1 ...γ n ∈ σ (Γ ), δ 1 ...δ m ∈ σ (Δ), where (by Definition 68) Hence we have a completeness proof for our matrix semantics. Note that in some sense, this semantics corresponds to ambiguous normal forms. In particular, it is noteworthy that formulas are not interpreted as themselves even if we interpret into A AL . For example, we have σ ((a b) ∧ c) = ((a ∧ c)(b ∧ c)) , as is easy to check. Hence in the interpretation, every formula automatically becomes a (kind of) ambiguous normal form. However, this is a sloppy correspondence, as interpretation into the formula-matrix involves more than just distribution, since we interpret into congruence classes. This is the first point where the semantics is non-trivial: some equivalent unambiguous formulas are congruent, such as p ∨ q and q ∨ p, but others are not, such as p ∨ ¬p and q ∨ ¬q. The second remarkable property of this semantics is that we actually only manipulate strings, where all operations except for concatenation are defined for letters only. We used operations ∧, ∨, ∼ on strings, but these are only abbreviations for operations on letters! It is thus a noteworthy effect that there is a semantics of ambiguity which works on strings in this canonical sense.
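The way the interpretation turns a formula into a string of letters can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the construction used in the proofs: it works on plain strings of terms and ignores congruence classes entirely, it represents letters as nested tuples, and it assumes that a binary operation on two strings lists the letters of the left argument slowest. All names (`letter`, `amb`, `neg`, `conj`, `disj`) are hypothetical.

```python
from typing import Tuple, Union

# A letter is an unambiguous Boolean term: a variable name or a nested tuple.
Term = Union[str, tuple]
# A word is a non-empty tuple of letters; concatenation models ambiguity.
Word = Tuple[Term, ...]


def letter(name: str) -> Word:
    """A single unambiguous letter, viewed as a word of length one."""
    return (name,)


def amb(left: Word, right: Word) -> Word:
    """Ambiguity between two words is their concatenation."""
    return left + right


def neg(word: Word) -> Word:
    """Negation is applied letterwise."""
    return tuple(("not", a) for a in word)


def _binary(op: str, left: Word, right: Word) -> Word:
    # Assumption: letters of `left` vary slowest when the abbreviation unfolds.
    return tuple((op, a, b) for a in left for b in right)


def conj(left: Word, right: Word) -> Word:
    return _binary("and", left, right)


def disj(left: Word, right: Word) -> Word:
    return _binary("or", left, right)


a, b, c = letter("a"), letter("b"), letter("c")
result = conj(amb(a, b), c)
print(result)  # prints: (('and', 'a', 'c'), ('and', 'b', 'c'))
```

The result mirrors the example in the text: a conjunction over an ambiguous term with readings a and b becomes the two-letter string (a ∧ c)(b ∧ c), and every Boolean operation is ultimately applied to letters only.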

Matrices and Algebras
We can now establish a slightly more concise correlation between ambiguity matrices and universal distribution algebras (and algebra in general). For the relation of matrices and algebras, we need the algebra of congruence classes of a matrix: given (A, ), we define Con(A, ) = (A * , ∧, ∨, ∼, ·, 0 , 1 ). This is an algebra where pseudo-Boolean connectives are defined over congruence classes (see above), and instead of we have concatenation (denoted by ·) of congruence classes. The above results ensure all operations on classes are independent of representatives. The relation between algebras and matrices is non-trivial: given a matrix (A, ), the algebra Con(A, ) has an order relation ≤, defined by (Actually, the two conditions are equivalent, we skip the proof). Note that ≤ is a relation between congruence classes, not strings, but since it is independent of representatives and we mostly write w instead of w anyway, this can be neglected. It is easy to show that w ≤ v entails w v, but the converse does not necessarily hold. We now show that whereas semantically corresponds to , ≤ in Con(A, ) is the semantic counterpart of the relation in AL cf which we have discussed at length in Sect. 6.2: Lemma 73 w ≤ v if and only if xvy z entails xwy z and z xwy entails z xvy.
which is syntactic in nature, that is, it concerns the syntactic form of the terms, not their denotation or inferential properties. Concretely, the q in p ∨ ¬q and ¬ p ∨ q is connected to the q in q ∨ ¬q, we cannot change one without the other. Seen from the other direction, in classical logic q has a contribution to the meaning of q ∨ ¬q; but since the latter is inferentially equivalent to 1 (any theorem), the q is irrelevant and arbitrary for the meaning of any formula containing q ∨ ¬q as a subformula. In AL on the other hand, we have to keep track of this subterm q, so in a sense we can say our semantics and calculus are less local or less context-free than CL, the classical calculus. We need to know more of a term than its inferential equivalence class in order to interpret it properly, and this is what we mean by saying the syntactic form of the term matters in a stronger sense. Standard algebraic semantics is, by definition, incapable of modeling this, since the central notion of algebra is the one of congruence, and this is why we have introduced matrix semantics. Maybe we can put the peculiarity of our calculus and semantics in simple terms: as soon as there is ambiguity, inferential equivalence no longer entails congruence (at least in the trustful setting).
Our matrix-semantics is language-theoretic in a broad sense, in that the main objects are strings, and the main notion of congruence concerns exchangeability in strings, which is called Nerode-equivalence in formal language theory. In matrix semantics, different Boolean terms which are equivalent in B (the class of Boolean algebras) are not generally congruent, because they are not necessarily exchangeable within words of the matrix, and this is the whole point of matrix-semantics. This is achieved by keeping all terms (even equivalent ones) distinct, and expressing constraints only via the relation .
So how does this relate to the reality of, and intuition on, ambiguity? In fact, regarding incongruence, our logic not only lacks the cut-rule, it also lacks transitivity of inference. One might consider this a devastating result, as this even contradicts a basic property of consequence relations as usually defined (see Tarski 1936). However, to us this does not seem to be too bad: if we think of transitivity of logical inference, we think of unambiguous statements, and for unambiguous formulas, in fact we do have transitivity of inference in AL cf . Transitivity is problematic for ambiguity because in ambiguity, syntactic form matters, and transitivity "cuts out the middle man". It is exactly this discarding of a syntactic object which is problematic due to the lack of locality we explained above. We can explain this by analogy with a real-world situation: if somebody makes an unambiguous statement, it is enough to remember what follows from it, that is, some very abstract meaning representation. However, if someone makes an ambiguous statement, we better remember (more or less precisely) its syntactic form, in order to be able to reconstruct possible intentions, and to remain aware of ambiguity. This intuitive matter of fact is reflected by the mathematics, and this seems to be a nice result, enlightening about the nature of ambiguity.
There is another empirical phenomenon interesting in this context, namely the well-known zeugma effect, in which an utterance is "weird", but not incorrect. This arises, among other cases, when we use an ellipsis of an ambiguous word, but use it in a different sense on each occasion.
(33) a. She made no reply, up her mind, and a dash for the door.
b. She made no reply, made up her mind, and made a dash for the door.
Whereas (33-b) is completely normal, in (33-a) we feel there is something strange and funny, even though we would not say it is wrong. We feel like language has been abused. In fact, these examples describe very well the position of ambiguity: if it were a purely syntactic phenomenon, then (33-a) would be clearly wrong, because we technically cannot make an ellipsis with two distinct lexical entries. If it were a purely semantic phenomenon, then (33-a) should be alright (putting aside matters of uniform usage for the moment). It is however neither of the two: we feel like we are "cheated" by the sentence, as the left-out word is used in a different sense than its counterpart, but on the other hand, we cannot say the sentence is wrong. Of course our work will not shed a completely new light on the nature of ellipsis and the zeugma effect at this point, but the examples clarify the intermediate position we assign to ambiguity. And it is interesting that independently and on a completely formal approach, we find the same effect: our approach of using AL cf provides exactly an intermediate solution: the syntactic form of formulas matters (equivalence does not entail congruence), but only up to a certain point, as congruence goes beyond syntactic identity! Our mathematical results actually allow us to make this a little more precise, using Lemma 45 (on classic cut) and the corresponding negative results. We can say: the syntactic form of a term is irrelevant beyond its inferential properties (that is, equivalence coincides with congruence), as long as it 1. is not in the scope of an ambiguity operator, and 2. is not ambiguous in itself. Hence we can say two things: 1. for representations of ambiguous meanings, equivalence does not entail congruence. And 2. even for unambiguous "submeanings" of an ambiguous meaning, equivalence does not entail congruence. On the one side, this is a nice wrap-up of formal results.
However, from a philosophical point of view, it is more like an interesting point of departure: this is because here we have naturally used two notions which are actually not really defined and are quite problematic: the notion of an ambiguous meaning, and the notion of a submeaning. We will briefly address these notions, though to adequately treat them would probably require an article in itself.
1. It seems to be intuitively clear what an ambiguous meaning is; but it seems to be very difficult to define it without already presupposing an equivalent definition (e.g. the one of an unambiguous meaning). Our semantics gives a very simple definition of ambiguity: a meaning is unambiguous, if and only if, for all terms denoting it, all of its constituents can be exchanged by equivalent terms. This is of course not extremely satisfying as it presupposes many theoretical concepts, but still it is an interesting insight.
2. From a logical point of view, it is actually unclear what a submeaning is: meanings in the logical sense are not ordered by subsumption, nor do they have an obvious structure. In Boolean algebras, it would be nonsense to say that p is a submeaning of q ∨ ( p ∧ ¬p) (because then every meaning would be a submeaning of every meaning). In the case of ambiguity, this does (at least intuitively) not seem to be the case: we have a very clear intuition on the structure of ambiguous meanings and their submeanings, namely the following: given an ambiguous utterance, every possible reading is a submeaning. We thus have a clear structure, which is reflected in our semantics by strings, where each letter corresponds to one reading (recall that in ambiguity matrices, arbitrarily complex Boolean terms are just single letters!). Actually, we suppose that this point is closely connected to 1.: the clear intuition on submeanings and their composition constitutes our intuition on what ambiguity is.
Two final notes: firstly, note that these points make clear that also conceptually, an algebraic semantics (and a congruent logical calculus with cut) should be inadequate, because in algebra (for example UDA), it is easy to show (via isomorphisms) that there is no natural definition of the ambiguous objects in an algebra, and given an algebra, it is impossible to define the notion of a constituent of some object (which is not to be confused with the notion of sub-term, which is very different). Secondly, the notion of submeaning is actually well-established in information-theoretic semantics based on feature structures and unification. This opens some interesting connections which will however require some research of their own (see Barwise and Etchemendy 1990, for some work in this direction).

Conclusion
We have investigated the problem of (trustful) reasoning with ambiguity, both from a logical/syntactic and semantic/algebraic point of view. In the beginning, we found some paradoxical results: even from seemingly innocuous assumptions, we had consequences which were strongly counterintuitive, and almost led to triviality, as in the case of ambiguous algebras and universal distribution algebras. We have chosen a way out of the dilemma which is not really obvious: we abandoned the assumption that reasoning with ambiguity is congruent. Mathematically, this means that we use a logic without an admissible cut-rule; conceptually, this means that the syntactic form of a term matters beyond inferential equivalence. This seems strange at the beginning, but there were good motivations for this move both on the formal and conceptual side, and the results which follow were satisfying for us: in particular, we presented the cut-free calculus AL cf , and our main hypothesis was that this calculus is sound and complete for trustful reasoning with ambiguity. We have provided this calculus with a semantics which is in a sense very natural given the peculiar properties of ambiguity, though it is rather unusual: ambiguous meanings are represented as strings, which might be best thought of as ambiguous normal forms, that is, every string represents the ambiguity between classical, unambiguous meanings. We leave the full philosophical and practical implications of this work for further research.