Formalizing Kant’s Rules

This paper formalizes part of the cognitive architecture that Kant develops in the Critique of Pure Reason. The central Kantian notion that we formalize is the rule. As we interpret Kant, a rule is not a declarative conditional stating what would be true if such and such conditions hold. Rather, a Kantian rule is a general procedure, represented by a conditional imperative or permissive, indicating which acts must or may be performed, given certain acts that are already being performed. These acts are not propositions; they do not have truth-values. Our formalization is related to the input/output logics, a family of logics designed to capture relations between elements that need not have truth-values. In this paper, we introduce KL3 as a formalization of Kant's conception of rules as conditional imperatives and permissives. We explain how it differs from standard input/output logics, geometric logic, and first-order logic, as well as how it translates natural language sentences not well captured by first-order logic. Finally, we show how the various distinctions in Kant's much-maligned Table of Judgements emerge as the most natural way of dividing up the various types and sub-types of rule in KL3. Our analysis sheds new light on the way in which normative notions play a fundamental role in the conception of logic at the heart of Kant's theoretical philosophy.


Introduction
Judgments, insofar as they are regarded merely as the condition of the unification of given representations in one consciousness, are rules. [Prolegomena 4:305] 1

We will define a logic of conditional imperatives and permissives that was designed as part of an effort to make sense of what Kant was trying to do in the Critique of Pure Reason. There were two sources of motivation for designing this logic. The first came from our long-term project to extract from the Critique a cognitive architecture that could be realised in a computational system. 2 Consider a simple agent with various sensors, trying to make sense of its sensory perturbations. 3 It must, somehow, interpret its motley array of sensory perturbations as representations of an external world. This world consists of objects located in space and persisting through time, causally interacting with each other. What sorts of things must an agent do in order to achieve this? What must an agent do in order to represent a world at all? This is not an epistemological question: we are not asking what conditions have to hold in order for an agent who already believes something to also know something. This is a pre-epistemological question about intentionality: what conditions must hold for an agent to even think a thought that is about the world, irrespective of whether that thought is true or false?
Kant's cardinal innovation, as we read him, is that the agent makes sense of its sensory perturbations by constructing and applying rules:

We have above explained the understanding in various ways - through a spontaneity of cognition (in contrast to the receptivity of the sensibility), through a faculty of thinking, or a faculty of concepts, or also of judgements - which explanations, if one looks at them properly, come down to the same thing. Now we can characterize it as the faculty of rules. This designation is more fruitful, and comes closer to its essence. Sensibility gives us forms (of intuition), but the understanding gives us rules. It is always busy poring through the appearances with the aim of finding some sort of rule in them. [A126] 4

In making sense of its sensory perturbations, the rules an agent constructs and applies must satisfy various constraints, codified by Kant as the Categories and Principles of Pure Understanding. Only if they satisfy these constraints does the agent achieve what Kant calls "experience": it has constructed a coherent, unified representation of a coherent, unified external world. 5 According to this interpretation, self-legislation is just as critical to Kant's theoretical philosophy as it is to his practical philosophy. The agent is only able to achieve experience by constructing rules that it then applies. According to this picture, self-legislation is prior to conscious experience in the explanatory ordering.
In stark contrast to interpretations on which the Kantian agent is only able to construct and apply rules after it has already achieved a conscious representation of the world, our account views the construction and application of rules as necessary for, indeed partially constitutive of, such representation.
If rules are to play this fundamental, load-bearing role in Kant's theory of intentionality - his theory of experience - then we had better be very clear what we mean, exactly, by a rule. If a rule is seen as a conditional that relates propositions that have truth-values, then it cannot be a foundational part of his architecture. Kant's project, as we understand it, is to explain intentionality itself: he wants to explain how an agent can have world-directed thoughts that are so much as capable of being true or false. If he presupposes rules that connect propositions that are already true or false, then he has already presupposed too much. We argue that, for Kant, a rule is a general procedure relating acts, not propositions. In the case of theoretical reason, the rule's constituent acts are mental rather than physical. They include things like seeing a bruised apple, feeling a heavy hammer, and hearing a buzzing bee. Crucially, for Kant, such acts do not themselves have truth-values:

For truth and illusion are not in the object insofar as it is intuited, but in the judgment about it insofar as it is thought. Thus it is correctly said that the senses do not err; yet not because they always judge correctly, but because they do not judge at all. Hence truth, as much as error, and thus also illusion as leading to the latter, are to be found only in judgments, i.e., only in the relation of the object to our understanding... In the senses there is no judgment at all, neither a true nor a false one. [A293-4/B350]

See also [Jäsche Logic 9:53].
In this paper, we will provide a formalization of Kant's conception of rules as general procedures, using a logic of conditional imperatives and permissives over acts. We will also sketch how this logic fits into the larger picture of the Kantian self-legislating agent, which makes sense of its sensory perturbations by spontaneously constructing and applying rules.
We said that there were two sources of motivation for designing this logic. The first was to formalize the notion of a rule at the heart of Kant's cognitive architecture with a view to realising that architecture in a computational system. The second source of motivation is exegetical and defensive.
Although Kant's Table of Judgements plays an absolutely pivotal role in his Critical system, it has been roundly criticised. One common objection has been that it is based on the outdated Aristotelian term logic. Just as Kant's views on nature are based on a defunct conception of Newtonian physics and his views on mathematics are based on a defunct conception of Euclidean geometry, so his views on the mind and its acts of judgement are based on a defunct conception of Aristotelian logic. Yet if the Table of Judgements is incomplete or arbitrary, then the derivation of the Table of Categories and the subsequent Transcendental Deduction have failed before they have even started. If the Table of Judgements is based on an incomplete and outdated logic, then Kant's recurrent use throughout his work of the basic structure it provides is merely the result of an "architectonic mania", 6 rather than the persistent application of a unified template that runs through all our mental activity.
Or so the story goes. Our second motivation for formalizing Kant's conception of rules, then, was to design a logic in which the distinctions he introduces in the Table of Judgements emerge as the most natural way of dividing up the various types and sub-types of rule.
In Section 2, we briefly outline some of the key elements in our preferred interpretation of Kant. We define the essential terms and sketch how they fit together in Kant's theory of experience, focusing on his conception of rules.
In Sections 3, 4, and 5, we present a logic that formalizes Kant's rules as conditional imperatives and permissives. We explain how it handles the major deontic paradoxes and how it differs from standard input/output logics, geometric logic, and first-order logic. We also explain how it translates natural language sentences, including those involving multiple quantification and features not captured by first-order logic, such as predicate negation and disjunction.
Finally, in Section 6, we show how this logic makes sense of Kant's Table of Judgements, not only its particular "moments" and its overall structure, but also the various finer points of structure that Kant insists upon.
Our analysis sheds new light on the way in which normative notions play an absolutely fundamental role in the conception of logic at the heart of Kant's theoretical philosophy. Apart from its own intrinsic interest, historical and philosophical, this also provides a clear hint that our analysis might be extended to Kant's account of moral rules and practical agency. Consider the central role that imperatives and permissives play in Kant's moral philosophy (e.g. at [Groundwork 4:414ff., 421ff.]), as well as his claim that practical and theoretical reason share a "common principle" [Groundwork 4:391]. We cannot pursue this extension here; that is a task for future work.

Kant's Cognitive Architecture
In this section, we outline our preferred interpretation of Kant. The interpretation will be elaborated on and confirmed throughout (especially in Section 6), but we do not attempt to mount a full defence of it here. Our primary aim in this paper is to formalize an aspect of Kant's thought, as we understand it, and then apply the results in making sense of his Table of Judgements. The aspects of our interpretation that we do not defend here have been defended by ourselves or others elsewhere, which we note when relevant. 7

Mental Activity as Constituted Activity
It is a familiar idea that social activity is constituted activity. 8 In certain circumstances, if various constraints are satisfied, then pushing a horse-shaped object across a chequered board counts as moving a knight to king's bishop three; an utterance of the words "I do" counts as an acceptance of marriage vows; running away counts as desertion; writing your name counts as signing a contract. Such counts-as claims are constitutive, not merely predicative or classificatory (as when we say that a horse counts as a mammal). We are saying that doing x just consists in doing y in the right circumstances; that there is nothing more to doing x than doing y in the right circumstances. In this sense, social actions are things we can only do mediately, by doing something else in the right circumstances. The constitution might continue, so that one constituted social activity in turn constitutes another, as when a move in a chess game in turn counts as winning the game. But here, too, we do one thing by doing another, and the circumstances must be appropriate. Neither playing nor winning at chess is something we can do immediately.
One action can only count as another if the surrounding circumstances satisfy certain conditions. Just going up to a stranger in the street and saying "I do" does not count as marrying them. Saying "I do" only counts as marrying someone in the particular context of a marriage ceremony when the officiator has asked a particular question. What determines which circumstances are the right circumstances? It is the constitutive rules of the constituted activity that determine the subset of circumstances in which doing y also counts as doing x. These constitutive rules are to be distinguished from regulative rules, like the driving laws, which merely regulate a pre-existing independent activity.
So we have here a distinction between constituting and constituted activities, where constituted activities can in turn play a role in constituting further activities, as well as an attendant distinction between constitutive and regulative rules. And note finally that, when a constituting activity itself consists in the construction and application of rules, then the constitutive rules that determine when this activity counts as a further, constituted activity will be meta-rules: rules that determine how the construction and application of rules must go if it is to constitute the constituted activity in question.
As we read Kant, the guiding theme of his philosophy of mind is that mental activity is constituted activity. His primary concern is with the constituted mental activity experience. This is a complex, high-level activity, itself constituted by other constituted mental activities, themselves constituted by yet others, and so on. In each case, we can ask: what constraints must be satisfied for one activity to constitute another? Ultimately, we are asking: what are the constitutive rules of experience? It is the purpose of Kant's cognitive architecture to articulate all of this (and more). He calls it "the conditions for the possibility of experience" [e.g. at A92ff./B124ff., A158ff./B197ff.].
Two of the mental activities that play a role in constituting experience, if certain constraints are satisfied, are perception and judgement. The former includes things like seeing a bruised apple, feeling a heavy hammer, and hearing a buzzing bee. The latter includes things like forming the thought that all humans are mortal, that some humans are uneducated, or that Caius is a human. Perception and judgement are distinct but interdependent activities. They are also themselves constituted activities, and it is at this level that we come to the four elements of Kant's cognitive architecture that will be essential for our logic.

The Four Elements
An intuition is a mental object, constructed out of given sensations by the solitary individual agent. Your intuitions are different from my intuitions - each agent has its own private repository. There is no limit to how many intuitions an agent can construct. Intuiting is the constituted mental activity of constructing intuitions. 9

A mark is a symbol that can be ascribed to multiple intuitions. Unlike an intuition, which can only be had by a single agent, a mark is public and can be used by many different agents. (Marks are general in both of these senses.) A mark has no predefined meaning. Its meaning is determined entirely by its inferential role (i.e. by the rules in which it figures). 10

A subsumption is the mental activity of assigning a mark to an intuition 11 (or tuple of intuitions). As we read Kant, and this is central to everything that follows, a subsumption is an act that does not itself have a truth-value. It is not itself a judgement or a thought, still less a belief or knowledge. Although marks are shared public symbols, intuitions are private mental objects, so the act of subsuming an intuition under a mark is only performable by the particular individual who has that particular intuition.
A rule is a general procedure for generating subsumptions from subsumptions. There are two basic types of rule. As Kant describes them:

the representation of a universal condition in accordance with which a certain manifold (of whatever kind) can be posited is called a rule [Regel], and, if it must be so posited, a law [Gesetz]. [A113] 12

9 We have argued for this traditional view of intuition, and against the currently popular 'relationalist' view, elsewhere [41]. 10 See [32]. 11 See [Kant and the Capacity to Judge, p.92n] and also [45] p.264 and p.269n. 12 Kant rarely sticks to this rule/law terminology and we do not adopt it here, referring to both imperatives and permissives simply as rules. See also [B201n], where Kant uses yet other terminology.
A rule is not a sentence - it is a general procedure. But if it were to be described by a sentence, it would be described by a conditional imperative or permissive. For example: for every intuition x, if you subsume x under mark p, then also subsume x under mark q! Or: for every intuition x, if you subsume x under mark p, then feel free 13 to also subsume x under mark r! Rules are general procedures that apply to all intuitions. Unlike subsumptions, which are private to the individual, rules are things that the solitary agent can share with others. 14 Unlike subsumptions, which bring two heterogeneous elements together (the intuition and the mark), rules bring homogeneous elements (various subsumptions) together.
To reiterate, rules are not themselves conditional imperatives or permissives like those above. This is important because natural language conditional imperatives and permissives have truth-evaluable content in their antecedents, whereas this is not the case for Kantian rules (for the reasons given in Section 1). Our claim is that Kantian rules, as general procedures, can nevertheless be formalized using a logic of conditional imperatives and permissives (with a suitable formal semantics). As a convenience we will often talk as though rules just are conditional imperatives or permissives, but strictly speaking what we mean is that they can be described or formalized as such.
These, then, are the four basic elements of Kant's constitutive psychology: intuitions, marks, subsumptions, and rules. If certain constraints are satisfied, if everything comes together, then:

- an intuition counts as a representation of a particular external object
- a mark counts as a concept
- a subsuming counts as a perception
- a rule counts as a judgement (with a truth-value)

As we read Kant, these constitutive counts-as claims should not be thought of as successive or independent stages. An intuition only counts as a representation of an external particular insofar as it is subsumed under a mark that counts as a concept; a mark only counts as a concept insofar as it is involved in a subsumption that counts as a perception; and a subsumption only counts as a perception insofar as it is bound to other subsumptions in a rule that counts as a judgement; thus, in turn, an intuition only counts as a representation of an external particular and a mark only counts as a concept insofar as each figures in a rule that counts as a judgement. And a rule only counts as a judgement insofar as it is bound to other such rules in a coherent, unified representation of a coherent, unified external world; that is, only insofar as it is part of experience.

13 This informal way of putting it is not ideal. What we are trying to express here is the permissive that corresponds to the imperative as "may" corresponds to "must". If English contained a punctuation mark corresponding to "!" that represented a permissive rather than an imperative, then we would use that, but there isn't one. 14 [Kant and the Capacity to Judge, p.88]: "This is how, by virtue of its logical form alone, a judgment lays a claim to holding for any consciousness, whereas a mere coordination of representations might only hold for my subjective consciousness." See also [26] and [42].
This, in a nutshell, is Kant's constitutive theory of experience. Its constitutive (meta-) rules -the constraints that must be satisfied for the above counts-as claims to hold -are what Kant articulates in the Analytic of Principles. 15 Its basic elements are intuitions, marks, subsumptions, and rules, all of which will play a role in our logic. Here we focus on just one of the counts-as claims: that a rule counts as a judgement.

Judgements as Rules
Recall that a rule is a general procedure that we will formalize as a conditional imperative or permissive. It might seem strange to think of an imperative or permissive rule as something that can count as having a truth-value, but this, we contend, is exactly what Kant has in mind. He says:

All judgements are accordingly functions of unity among our representations, since instead of an immediate representation, a higher one, which comprehends this and other representations under itself, is used for the cognition of the object, and many possible cognitions are thereby drawn together into one. [A69/B94]

This is exactly the role of rules, on our account. A rule is a "higher representation" that binds together subsumptions, which consist of marks and intuitions, or "immediate representations". It is "the mediate cognition of an object, hence the representation of a representation of it" [A68/B93]. Thus:

Judgments, insofar as they are regarded merely as the condition of the unification of given representations in one consciousness, are rules. [Prolegomena 4:305]

All rules (judgments) contain objective unity of consciousness of the manifold of cognition, hence a condition under which one cognition belongs with another to one consciousness. [Jäsche Logic 9:121]

Consider Kant's identification of judgements with rules. This identification is easiest to see in the case of universal judgements. The judgement "All humans are mortal" just is the rule: for every intuition x, if you subsume x under "human", then also subsume x under "mortal"! But the identification applies equally to particular judgements. The difference is that particular judgements are permissive rules. "Some humans are uneducated" just is the rule: for every intuition x, if you subsume x under "human", then feel free to also subsume x under "uneducated"! And it also applies to singular judgements.
"Caius is a human" just is the rule: for every intuition x, if you subsume x under "Caius", then also subsume x under "human"! together with a constraint: for any distinct intuitions x and y, do not subsume both x and y under "Caius"! Indeed, after presenting our logic in Sections 3, 4, and 5, we will argue in Section 6 that our rule-based analysis accounts for all of the different kinds of judgement that Kant identifies in his Table of Judgements, and we will also show how it accounts for the Table's finer points of structure. Note, for instance, how singular judgements come out above as a sub-type of universal judgements; how both singular judgements and universal judgements imply particular judgements, so long as imperatives imply permissives; and how negation can be applied to a predicate within an atomic (categorical) judgement.

What Kant Meant by "logic"
We have become accustomed to thinking of logic as the study of entailment relations between sets of linguistic items. We are given a set of sentences, A, and a further sentence p and we want to find out if A entails p, written A |= p, where |= is defined in terms of truth: A |= p if any model in which A is true is also a model in which p is true.
Logic, for Kant, was not primarily about entailment relations between elements with truth-values. First and foremost, it describes how we should think. 16 And since thinking is a mental activity, logic is primarily a codification of principles describing which activities we should perform, conditional on the activities we have already performed. This project will turn out to include an account of relations between elements with truth-values. But, we contend, it is not exhausted by such an account.
Our focus in this paper is on how we should think when our goal is experience. As we read Kant, this question amounts to the following: given a collection of subsumptions that the agent is performing concurrently, and a set of rules it has adopted, what further subsumptions may/must it also perform, if it is to achieve experience? Note that this is a question about acts: given that the agent is performing these acts, and given that it has adopted these rules, what further acts may/must it perform? These mental acts (subsumptions) do not have truth-values. Kant's Logic is not only concerned with relating elements that have truth-values, but also tells us what mental acts we may/must perform.
Suppose, for example, the agent has adopted the following rules:

- If you are seeing yellow and black stripes and hearing a buzzing, then feel free to perceive a bee!
- If you are seeing yellow and black stripes and hearing a buzzing, then feel free to perceive a wasp!
- Do not count anything as both a wasp and a bee!

Having adopted the above rules, suppose that the agent now performs the subsumptions that, in the right circumstances, constitute the antecedents: seeing yellow and black stripes and hearing a buzzing. What further subsumptions may/must it make? To repeat, this is not yet a question about which judgements it should adopt. It is not yet a question about which propositions it should hold for true. Rather, it is a question about what mental acts it should perform. In the case at hand, it is a question about what it should perceive. 17 One permissible subsumption would be to perceive a bee. Another permissible subsumption would be to perceive a wasp. But it is not permissible to subsume the same intuition under both "bee" and "wasp". On our account, a fundamental question of Kant's logic concerns relations between elements that do not have truth-values.
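The bee/wasp example can be worked through mechanically. The Python sketch below is our own illustration (atom names such as "stripes" and "buzz" are hypothetical stand-ins for the subsumptions in the text): it enumerates the candidate sets of concurrent acts licensed by the two permissives and filters out any set violating the constraint.

```python
from itertools import combinations

# Hypothetical atom names for the example in the text.
perceived = {"stripes", "buzz"}     # subsumptions the agent is already performing
options = ["bee", "wasp"]           # each licensed by one of the two permissive rules

def violates_constraint(acts):
    # "Do not count anything as both a wasp and a bee!" - a rule with an empty head
    return {"bee", "wasp"} <= acts

acceptable = []
for r in range(len(options) + 1):
    for extra in combinations(options, r):
        acts = perceived | set(extra)
        if not violates_constraint(acts):
            acceptable.append(acts)
# Three acceptable sets remain: neither option, bee only, or wasp only.
```

Exactly as the text says: perceiving a bee is permissible, perceiving a wasp is permissible, but the set containing both is ruled out.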
There is, however, a further question for logic, so conceived, to answer: given a collection of judgements (i.e. rules) that have been adopted, what other judgements may/must also be adopted? Experience is constituted by perception and judgement. Now, since judgements do have truth-values, this secondary aspect of transcendental logic aligns with the contemporary, Fregean focus on truth-conditional logic. In this paper, we present a logic that addresses both aspects (see Section 3.6). But our logic begins "one level down" from first-order logic: 18 it represents (the perceptual) activities that do not themselves have truth-values but which can contribute to the constitution of (the judgemental) elements that do have truth-values, all of which together, if things go right, will constitute experience. 19

17 Note that these mental acts are at the sub-personal level - our 'agent' is a sub-personal rule-induction system. It is not as if the person consciously chooses whether to perceive a bee or a wasp, but rather that a pre-conscious process makes this "decision". It does so with the goal of achieving experience, whence the sense and force of a question about what should be perceived. But experience, on Kant's account, is the first level at which there arises anything like a person's conscious perspective on a unified, coherent external world. See Section 6 and [40] for further discussion. 18 See [1], p.236 for a related claim. 19 In the Second Analogy, Kant claims that it is only through the construction and application of causal rules that temporal succession is generated. It is not that we first perceive a temporal succession of individual events, and then subsequently posit causal laws to explain the succession. Rather, what it is to perceive the succession of a followed by b just is to posit a causal rule whose body subsumes a and whose head subsumes b. In this picture, rules and acts are prior in the order of explanation to temporality. A potential problem for our account emerges when we acknowledge that acts (the constituents of rules) are themselves temporal phenomena. If a rule has a conjunction of actions in the body, this means that both actions are performed at the same time. If an act has pre- and post-conditions, this means that certain conditions must be true before the act is performed, and certain conditions must be true after. How, then, can the notion of rule and act be prior to temporality in the direction of explanation, if the notion of an act presupposes a temporal ordering? This is a difficult question, but it is not just a question for our particular account. It is a problem for all interpretations of Kant that focus on the mental processes that underlie experience. One way to address this concern is to invoke Kant's distinction between subjective succession and objective succession [B233-4], and to argue that the subjective succession of mental acts can be used to explain the construction of the objective succession of external events.

In the Critique, Kant distinguished between general and transcendental logic. While general logic describes the forms and principles for thinking in general, transcendental logic describes the forms and principles needed for thinking to have "objective validity" [A88-9/B122], i.e., to be about the world. How does the logic of rules that we present relate to Kant's distinction between general and transcendental logic? Our logic of rules is necessary (but not sufficient) for a subsequent project of properly formalizing the transcendental logic, but there is still remaining work to do. We do not give a full account of the conditions for the possibility of experience. Nor do we give a full account of what it takes for a thought to have objective validity.

KL1
In the following three sections, we present a logic that formalizes Kant's rules as conditional imperatives and permissives. Our claim is not, of course, that Kant had this precise logic in mind. Rather, the claim is that our formalization is based on, compatible with, and helps explain part of Kant's view in the Critique of Pure Reason (and associated texts, especially Jäsche Logic).

From what has already been said, we know this logic must satisfy two constraints. First, since subsumptions are acts, we need a logic that does not assume its constituent elements have truth-values. The input/output logics [36] were designed to capture inferential relations between elements that do not necessarily have truth-values. They were conceived in response to Jørgensen's Dilemma:

1. Logical inference requires that the elements (premises and conclusions) have truth-values.
2. Imperatives do not have truth-values. For example, the command "Burn all the books!" has a satisfaction-condition, but does not have a truth-value.
3. There are valid logical inferences between imperatives. For example:
   - Burn all the books in the library!
   - The Critique of Pure Reason is a book in the library.
   - Therefore: burn The Critique of Pure Reason!

The input/output logics resolve this impasse by denying the first claim: they support inference between elements that do not have truth-values. The logic we present below is a member of the family of input/output logics (broadly conceived).

The second constraint on any logic that formalizes Kant's rules is that it must support not only conditional imperatives but also conditional permissives. For example: if you subsume intuition x under mark p, then feel free to also subsume x under q! A logic that contains explicit permissives as well as imperatives will generate multiple acceptable sets of derived subsumptions. 20

For ease of exposition, we divide the logic into three parts. The first part, KL1, is a type of input/output logic with two types of rule: conditional imperatives and permissives. The second part, KL2, extends KL1 with a negation operator. The third part, KL3, extends KL2 by adding variables and quantifiers.
All three logics, KL1, KL2, and KL3, have been implemented and tested in computer programs. In particular, soundness, completeness, and monotonicity for KL1 and KL2 have been empirically verified. 21 The code corresponds closely to the text in Sections 3, 4, and 5 below.

Syntax
Let 𝒜 be the set of all atoms. A, B, C, and X will range over sets of atoms, and a, b, c will range over individual atoms.
There are two types of rule in KL1: imperatives, written B → C, and permissives, written B ⇢ C. In both cases, B is the body of the rule; it is a set of atoms representing a conjunction. C is the head of the rule; it is a set of atoms representing a disjunction. We use sets, not sequences or multisets, to avoid various uninteresting inferences involving duplication and permutation of elements.
For readability, we write the body of a rule as a conjunction, and the head of a rule as a disjunction. The rule {p, q} → {r, s} is thus written p ∧ q → r ∨ s. If the body of a rule is empty, we write ⊤ instead of the empty set. For example, {} → {p} is written as ⊤ → p. If the head of a rule is empty, we write ⊥ for the empty set. For example, {p} → {} is written as p → ⊥. Rules with empty bodies and singleton heads are called facts, while rules with empty heads are called constraint rules.
The → rules are intended to be read as conditional imperatives between actions. For example, the rule p →q ∨ r should be read as "if you are doing p, then do q or do r!".
The → rules are intended to be read as conditional permissives between actions. For example, p →q should be read as "if you are doing p, then feel free to do q!". Some example rules are given in Table 1. Since the elements of rules are actions - not propositions with truth-values - disjunction should not be interpreted truth-functionally. To say that you must do p ∨ q is to say there are two available actions, p and q, and you must choose one of these actions (or both 22 ).

20 The standard input/output logics generate a single set of derived conclusions. The family of logics we present here is unusual (in the family of input/output logics) in generating multiple acceptable sets of conclusions. Of course, many non-monotonic formalisms generate multiple acceptable sets of conclusions. See Section 3.7 for further comparison.

21 See https://github.com/RichardEvans/kl haskell.
It is straightforward to generalise the form of rules to allow disjunctions of conjunctions of atoms in the head. We will not do that here for ease of exposition.

Semantics (Part 1)
Given a (countable but not necessarily finite) set R of rules and a (finite) set A ⊆ A of atoms, the set of consequences out 1 (R, A) is a set {X 1 , X 2 , . . . } of sets of atoms, where each X i is one of the distinct ways in which the rules can be satisfied.
There are two sources of non-determinism in KL 1 . The first is disjunction: rules with disjunctive heads can be satisfied in multiple ways. For example, given the single rule p →q ∨ r with A = {p}, the possible outcomes are {p, q}, {p, r} and {p, q, r}. The second source of non-determinism is the permissive → rules. For example, given the single rule p →q with A = {p}, the possible outcomes are {p} and {p, q}. The out 1 function is defined in terms of a set cns(R, A) of consequences, representing the various ways of applying R to A, from which are filtered out those that do not satisfy the rules in R.

Definition 1
A set X of atoms satisfies a set R of rules, written X |= R, when X satisfies every rule in R. X satisfies an imperative rule B →C, written X |= B →C, when B ⊈ X or C ∩ X ≠ ∅; X satisfies a permissive rule B →C always.

Definition 2 For all sets R of rules and sets A of atoms, out 1 (R, A) = {X ∈ cns(R, A) | X |= R}, where the set cns(R, A) of consequences is defined inductively using a step function that combines the consequences of the various rules that apply. The step function treats → and → exactly the same; the place where they are treated differently is in the satisfaction condition X |= R in out 1 .
Note that, according to this semantics, the permissives are weak in that they are overridden by the imperatives. If we have p →q and p ∧ q →⊥, then the constraint overrides the permissive rule. Similarly, if we have p →q and p →r with q and r incompatible (i.e., q ∧ r →⊥), then p →q will trump p →r. It is also worth observing that the assumptions A can be replaced equivalently by a set of corresponding unconditional → rules: for any set of assumptions A, out 1 (R, A) = out 1 (R ∪ { →a | a ∈ A}, ∅). For readability we sometimes write out 1 (R) as shorthand for out 1 (R, ∅). Note also that out 1 (R, A) is not monotonic in either R or A. It remains to confirm that cns and out 1 are well-defined and unique for any set of rules R and assumptions A. We do this in the next section by translating rules R to a simple form of logic program for which the required properties are immediate.

Semantics (Part 2)
In this section, we provide an alternative semantics that is provably equivalent to the semantics in Section 3.2 above.
A definite clause is a rule of the form c ← b 1 , . . . , b n where c and b 1 , . . . , b n are atoms. We are using standard logic programming notation: c is the head of the clause; the body b 1 , . . . , b n is to be read as a conjunction. As is usual, where the body of a clause is empty we identify a clause c ← with the atom c. A definite logic program is a set of definite clauses. The idea is that every → and → rule can be translated to a set of definite clauses, each of which represents one of the ways that the rule can be satisfied; a set of rules is translated to the set of definite programs obtained by taking all combinations of the translations of the individual rules.
Definition 3 (Definite clause encoding) Define a function def r from rules to sets of sets of definite clauses, where def r (r) contains one set of clauses for each of the ways the rule r can be satisfied. Now define a function def that translates a set of rules into a set of definite programs (a set of sets of definite clauses) by taking all combinations of the translations of the individual rules. Let M(D, A) denote the least model [15] of the definite program D extended with the atoms A; for any A ⊆ A, M(D, A) can be defined inductively in the usual way, and the standard properties of definite logic programs [15] then apply. Now we can define an alternative version of cns in terms of def and M.
In the original semantics of Section 3.2, the inductive definition of cns can be seen as the construction of a tree rooted in {A} whose leaves are the elements of cns(R, A). Here in our second, alternative semantics, cns d (R, A) can be seen as a set of linear derivations each of which is the application to A of one of the definite programs in def (R).
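The translation just described can be prototyped directly. The following is a minimal Python sketch (the published implementation is in Haskell; representing rules as (body, head, kind) triples, and modelling "one of these actions (or both)" by choosing a nonempty subset of the head, are our own illustrative choices):

```python
from itertools import product

# A rule is a triple (body, head, kind): body and head are frozensets of
# atoms, kind is 'must' (imperative) or 'may' (permissive).  A constraint
# rule is an imperative with an empty head.

def satisfies(X, rule):
    """X |= rule: permissives are always satisfied; an imperative B -> C
    needs B not contained in X, or some head atom present in X."""
    body, head, kind = rule
    if kind == 'may':
        return True
    return not body <= X or bool(head & X)

def nonempty_subsets(s):
    """All nonempty subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(x for i, x in enumerate(s) if mask >> i & 1)
            for mask in range(1, 2 ** len(s))]

def def_r(rule):
    """The ways a single rule can be satisfied, each a set of definite
    clauses (head_atom, body)."""
    body, head, kind = rule
    choices = [frozenset((c, body) for c in S) for S in nonempty_subsets(head)]
    if kind == 'may':
        return [frozenset()] + choices        # a permissive may also be ignored
    if not head:
        return [frozenset()]                  # constraints contribute no clauses
    return choices

def least_model(D, A):
    """Least model M(D, A) of a definite program D from atoms A."""
    X, changed = set(A), True
    while changed:
        changed = False
        for head, body in D:
            if body <= X and head not in X:
                X.add(head)
                changed = True
    return frozenset(X)

def out1(R, A):
    """Least models of all definite programs in def(R), keeping those
    that satisfy every rule in R (this filters out constraint violations)."""
    results = set()
    for choice in product(*(def_r(r) for r in R)):
        D = frozenset().union(*choice)
        X = least_model(D, A)
        if all(satisfies(X, r) for r in R):
            results.add(X)
    return results
```

On Example 3, for instance, out 1 of {p →q ∨ r, q →⊥} with A = {p} yields only {p, r}.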
It can be confirmed (e.g., by induction on R) that if R contains no constraint rules, i.e., no rules of the form B →⊥, then: Let R ⊥ denote the set of all constraint rules in R. Then: Putting the above together we have the following alternative characterisation of out 1 (R, A).

Proposition 1 Let R be a set of rules and A a set of atoms.
Proof In the preceding discussion.
The following corollary will be useful later: if X |= R then X ∈ out 1 (R, X). For if X |= R there is a definite program D ∈ def (R) with M(D, X) = X, and hence, by Proposition 1, X ∈ out 1 (R, X).

Entailment
We shall define entailment on KL 1 rules using a notion of strong equivalence between rule sets. It is natural to say that two rule sets R 1 and R 2 are strongly equivalent if, for all rule sets R′ and all sets A of assumptions, out 1 (R 1 ∪ R′, A) = out 1 (R 2 ∪ R′, A). However, for strong equivalence of KL 1 rule sets it is sufficient to restrict attention to sets R′ of nullary rules ('facts'), i.e., rules of the form →a. This is because out 1 (R, ∅) can be seen as being defined by a set of definite clause programs def (R). Two definite clause programs D 1 and D 2 have the same models, and in particular the same least models, if and only if D 1 ∪ A and D 2 ∪ A have the same models for all sets A of unit clauses ('facts'), i.e., clauses of the form a ←. The property generalises straightforwardly to comparing sets of definite clause programs as required here.
We therefore take the following definition of strong equivalence and of rule entailment.

Definition 5
Two rule sets R 1 and R 2 are strongly equivalent in KL 1 if out 1 (R 1 , A) = out 1 (R 2 , A) for all sets A of atoms. A rule set R semantically entails a rule r in KL 1 , written R |= KL 1 r, when R and R ∪ {r} are strongly equivalent. It is also convenient to employ a functional notation: kl 1 (R) denotes the set of rules semantically entailed by R in KL 1 , kl 1 (R) = {r | R |= KL 1 r}. Rule sets R 1 and R 2 are strongly equivalent in KL 1 when kl 1 (R 1 ) = kl 1 (R 2 ).

Fig. 1
The entailment lattice for {p, q}. If there is a line between two nodes, then the lower node entails the higher node.

A set R of rules is strongly inconsistent in KL 1 if, for every set A of atoms, out 1 (R, A) = ∅.

Proposition 3 Let R be a set of rules. out 1 (R, A) = ∅ for all sets A of atoms if and only if out 1 (R, ∅) = ∅.

Proof If out 1 (R, ∅) = ∅ then every definite program in def (R) leads to a violation of some constraint rule, and since least models only grow when assumptions are added, out 1 (R, A) = ∅ for every A. The other direction is trivial.

Note also that no set X of atoms satisfies →⊥, for any X. So we have the following.

Proposition 4 R is strongly inconsistent in KL 1 if and only if R |= KL 1 →⊥.

Inference Rules
The inference rules for KL 1 are given in Fig. 2. Note that, since the left- and right-hand sides of rules are sets of atoms, not sequences or multisets, a finite set R of rules has only a finite number of inferential consequences. Given a set R of rules, the derived rules deriv(R) are the rules generated by repeated application of the inference rules in Fig. 2. We also write R KL 1 r if r ∈ deriv(R).
Here is a derivation using the inference rules:

Example 3 It can also be confirmed that {p →q ∨ r, q →⊥} |= KL 1 p →r. Here is a derivation using the inference rules:

Proofs are in the Appendix.
We do not have a proof of completeness, although we have strong reasons to believe it is true. We have tested it empirically using a computer implementation 23 of the definitions and results in Sections 3, 4, and 5. For KL 1 , we sample randomly generated sets R of KL 1 rules, and individual rules r such that R |= KL 1 r. Then we generate all inferential consequences of R (always finite for a finite set of rules) and test whether r is among the consequences. Extensive empirical testing, with sample sets of the order of 100,000 rules, suggests that KL 1 is indeed complete. We would of course prefer the full confidence of a formal proof. 24

A Comparison Between → and → Inference Rules
The MUST-TRANS and MAY-TRANS rules are very similar, but there is one extra condition in MAY-TRANS which does not appear in MUST-TRANS. The reason for the extra condition is this: consider the simpler variant of MAY-TRANS that is exactly parallel to MUST-TRANS. That variant is unsound, since it would allow us to infer p →r from p →q and q →r.
Let p →q be "if you visit the south of France, then feel free to enter a naturist resort!". Let q →r be "if you enter a naturist resort, then feel free to disrobe entirely!". We do not want to infer "if you visit the south of France, then feel free to disrobe entirely!".
To avoid this, there is an extra condition in MAY-TRANS that constrains each c in the head of the conclusion.

23 See https://github.com/RichardEvans/kl haskell.

24 Completeness is not just of formal interest - it also has significant philosophical consequences. The contrapositive of completeness is that any consistent set of sentences has a model. If every finite subset of a set of sentences is consistent, then (by completeness) each subset has a model. Then, by compactness, the whole set has a model. Thus, completeness allows us to move from a proof-theoretic notion (consistency) to a world-directed notion (having a model), thus allowing our logic to fulfil a key role in the task of transcendental logic: explaining how it is possible for our thoughts to be about an external mind-independent world. This is related to the inverse system semantics of Achourioti and van Lambalgen [1].

The Primary and Secondary Aim of Logic
In Section 2.4 above, we claimed that, for Kant, a fundamental question of logic is: given a set of subsumptions, and a set of rules, what further subsumptions may/must I perform? This question is answered by the out 1 function defined in Section 3.2 above and further characterised by the results of the sections that follow.
A further, secondary question of logic is: given a set of rules I have adopted, what further rules may/must I adopt? This question is answered by the rule entailment (|= KL 1 ) and inference rules ( KL 1 ) defined in Sections 3.4 and 3.5 above.

Comparing KL 1 with Other Input / Output Logics
The family of input/output logics, broadly conceived, are logics that operate on elements that do not (necessarily) have truth-values, and in which inferences are not closed under contraposition. KL 1 is a member of the family of input/output logics, very broadly conceived. However, there are a number of essential differences between KL 1 and the particular input/output logics described in [36].
There is the obvious notational difference: standard input/output logics use the pair notation (p, q) to indicate implication from p to q. That in itself is a trivial difference, but KL 1 needs two distinct arrows, one for imperative rules and one for permissive rules, for the two different types of rule (see Section 2.2).
That aside, the first major difference is that, in the standard input/output logics, a rule (φ, ψ) relates arbitrarily complex expressions that are closed under the Boolean connectives. In KL 1 , by contrast, the elements are conjunctions and disjunctions of atoms only.
In KL 1 , the left-hand side (antecedent) of a rule is a set of atoms representing a conjunction, while the right-hand side (consequent) is a set of atoms representing a disjunction. In a standard input/output logic, if we apply the rule (p, q ∨ r) to the premises {p}, we get the single result {q ∨ r}. In KL 1 , by contrast, if we apply the rule p →q ∨ r to the premises {p}, we get three possible answers: {p, q}, {p, r} and {p, q, r}. The second major difference, then, is that the output out 1 of a set of rules in KL 1 is a set of sets of atoms, representing different possible ways to satisfy the rules, while the output in a standard input/output logic is a single set of propositions.
A third major difference is that KL 1 does not have an inference rule for weakening the output. Although there is a rule for strengthening the input (MUST-SI), there is no corresponding rule (WO) that would let us pass from B →C to B →C ∨ c. This rule is not valid in KL 1 because it would license activities that were arbitrary. 25 Suppose, for example, the agent is performing the action p and has the rule p →q. If we are allowed to infer, using WO, that p →q ∨ r, then there will be two possible sets of actions that are compatible with the original rule plus the derived rule: {p, q} and {p, q, r}. The trouble is that the action r that is introduced in the second answer is arbitrary in the sense that it is not itself grounded in a rule.
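The arbitrariness that WO would introduce can be seen concretely. The following Python sketch (our own illustration; imperative rules only, encoded as (body, head) pairs) computes grounded outcome sets by choosing disjuncts, forward-chaining from the assumptions, and filtering by rule satisfaction:

```python
from itertools import product

def closure(clauses, A):
    """Forward-chain the chosen definite clauses (head, body) from A."""
    X, changed = set(A), True
    while changed:
        changed = False
        for head, body in clauses:
            if set(body) <= X and head not in X:
                X.add(head)
                changed = True
    return frozenset(X)

def outcomes(rules, A):
    """Grounded outcome sets for imperative rules (body, head): choose a
    nonempty subset of each head, chain, keep sets satisfying every rule."""
    def head_choices(head):
        hs = list(head)
        return [frozenset(h for i, h in enumerate(hs) if m >> i & 1)
                for m in range(1, 2 ** len(hs))]
    results = set()
    for pick in product(*(head_choices(h) for _, h in rules)):
        clauses = [(c, b) for (b, _), S in zip(rules, pick) for c in S]
        X = closure(clauses, A)
        if all(not set(b) <= X or set(h) & X for b, h in rules):
            results.add(X)
    return results

R_orig = [({'p'}, {'q'})]                    # p -> q
R_weak = R_orig + [({'p'}, {'q', 'r'})]      # plus the WO-weakened p -> q v r
```

With only p →q the unique outcome is {p, q}; once the WO-weakened rule p →q ∨ r is added, the extra outcome {p, q, r} appears, in which r is grounded in nothing but the weakened rule.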
Throughout his mature writings, Kant assumes that all activity must be grounded in a rule in order to count as activity at all. Consider the following difference: some of the movements my body performs are mere spasms, while other movements count as actions. The difference between the two, according to Kant, is that actions are movements that are grounded in a rule I have adopted. Kant's fundamental normative step is to characterise the subset of my bodily movements that count as actions as those which are subsumed under a rule. This is as true of mental activity as it is of physical activity: "Like all our powers, the understanding in particular is bound in its actions to rules" [Jäsche Logic §1]. 26 Therefore, in any logic that tries to capture Kant's normative theory of activity, weakening output (WO) should not be valid (see Section 3.9). KL 1 has only a restricted form of weakening.

The final difference between KL 1 and the standard input/output logics is that KL 1 allows what is called "throughput" in input/output logics: for each X ∈ out 1 (R, A), A ⊆ X. The inference rule corresponding to throughput is MUST-ID, allowing us to infer p →p: "if you are doing p, then do p!", for all actions p. The reason why this inference rule is valid in KL 1 is again because of the particular intended application: KL 1 is a logic of concurrent activity, prescribing the activities that we must perform conditional on the activities we are already performing. It is for the agent to produce a package of activities that together satisfy the various rules. If it is already performing action p, then p must be part of any complete package of activity. If you are already doing p, there is no point trying to undo the performance of p - that ship has already sailed. See Section 6 for further discussion.

25 Achourioti and van Lambalgen [1, 2] make a similar point, for different reasons. They argue that B ∨ C may not constitute a "whole". See footnotes 10 and 41 of [1].
26 Or to put it another way, mental occurrences not grounded in a rule "would then belong to no experience, and would consequently be without an object, and would be nothing but a blind play of representations, i.e., less than a dream." [A112], see also [A156/B195]. See [17] for further elaboration.

Comparing KL 1 to Other Logics of Imperatives
Many logics of imperatives 27 start with a base truth-functional logic and extend it with one or more imperatival operators (e.g. "!"). For example, given a set of sentences that have a truth-functional semantics (e.g. sentences of classical propositional logic or first-order logic), with S ranging over sentences of the base language, one defines an imperative language L whose sentences include !S. Given such a framework, imperatival inference can be explained using Dubislav's trick [21]: !p entails !q whenever p entails q.
We stress, however, that KL 1 does not use this framework. We do not presuppose an existing base language in which truth-conditions have already been assigned. 28 In KL 1 , the atoms A represent actions that do not have truth-values. In this crucial respect, KL 1 is closer to the input/output logics than it is to most logics of imperatives.
Charlow [11] identifies three sets of requirements that any logic of imperatives must satisfy:

1. imperatives can stand in inconsistency relations
2. imperatives can stand in inferential relations
3. imperatives can be embedded

On requirement 1: of the example rule sets, R 1 is strongly inconsistent while R 2 is only weakly inconsistent (its output is empty when A = {}). Charlow [11] insists that an imperative requiring φ is inconsistent with a permissive allowing ∼φ. In KL 2 , where we add a form of negation to KL 1 , the corresponding set R 5 is not inconsistent (not even weakly inconsistent). This is because the permissive → rule is weak and is overridden by a → rule.
Of course, differences of intuition are to be expected here because [11] is developing a semantics for conditional imperatives in natural language, while our project is to provide a logic of conditional imperatives for describing rules of thought.
On requirement 2: inferential relations in KL 1 are defined in terms of a semantic (|= KL 1 ) and a syntactic ( KL 1 ) notion of entailment. We have commented on the relationship (soundness and completeness) between them. On requirement 3: the central motivating cases of embedding in [11] are conditional imperatives and permissives. These are precisely the types of imperative that KL 1 is designed to model. Although we agree that other forms of embedding are also important, we do not have space to do justice to a full discussion here. We have developed an extension of KL 1 that includes embedded rules (e.g. (p →q) →r), but we leave a full description to further work.

The Deontic Paradoxes in KL 1
According to our interpretation of Kant, conditional imperatives and permissives play a key role in both his theoretic and his practical philosophy. In this section, we sketch how KL 1 handles some of the standard deontic paradoxes of practical reasoning.
Ross's paradox [38] was first described for von Wright's deontic logic, but it also applies to logics of conditional imperatives. Suppose we are given the order: "Post the letter!" Now the declarative proposition "x posts the letter" entails the proposition "either x posts the letter or x burns it". But we do not want to infer, from this entailment and the original order, "Therefore: post the letter or burn it!" One way of seeing the problem with this conclusion is by inferring (via the inference that if you must do something, then you may do it) "Therefore: you may post the letter or burn it!", and then inferring (since permission distributes over disjunctions) "Therefore: you may burn it!" The absurdity in this chain of reasoning comes out even more clearly if the newly introduced disjunct is something altogether irrelevant to posting the letter, and altogether unacceptable. For example: "Post the letter! Therefore: post the letter or set fire to the school! Therefore: you may post the letter or set fire to the school! Therefore: you may set fire to the school!" In the usual input/output logics, this chain is licensed by the rule for weakening the output; KL 1 has no such rule, so the inference is blocked.

There is a closely related paradox that involves conjunction rather than disjunction. Suppose you are permitted to wipe your feet and enter the house. It does not follow that you are permitted to enter the house simpliciter. 29 This troublesome inference does not go through in KL 1 . Letting w stand for "wipe your feet", e stand for "enter the house", and c stand for the conjunction of both activities, the conjunctive permission is represented by a set R of rules 30 under which there are two acceptable packages of actions: doing nothing, or doing both w and e.

A final example: you must either leave the dinner early or stay until the end. If you leave early, then you must interrupt the conversation to tell everyone you are going early.
30 It is straightforward to extend the form of rules in KL 1 to allow disjunctions of conjunctions of atoms in the heads (consequents) of rules. In that version, the example would be represented by R′ = { →(w ∧ e)} with out 1 (R′, {}) = {{}, {w, e}}. The details are straightforward but we have omitted them here to shorten the presentation.
It would not be acceptable both to stay until the end and to interrupt the conversation to tell everyone that you were leaving. This troublesome inference does not go through in KL 1 .

KL 2 : Extending KL 1 with Negation
We define KL 2 by adding a negation operator to KL 1 . This negation operator applies to an atom a to generate a literal ∼a. It can only be applied to atoms; we do not allow expressions such as ∼∼p, ∼(p ∧ q) or ∼(p ∨ q).
Henceforth, a, b, c range over literals and A, B, C and X range over sets of literals. Where c is a literal we write c̄ for the complement of c: if c is an atom then c̄ = ∼c, and if c = ∼a then c̄ = a. When C is a set of literals, C̄ = {c̄ | c ∈ C}.
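These operations on literals are easily sketched in code (a Python illustration; encoding ∼p as the string '~p' is our own convention):

```python
def neg(lit):
    """Complement of a literal: atoms are strings, '~p' stands for ∼p."""
    return lit[1:] if lit.startswith('~') else '~' + lit

def complement_set(C):
    """The set of complements of the literals in C."""
    return frozenset(neg(c) for c in C)

def consistent(X):
    """X contains no complementary pair a, ∼a."""
    return all(neg(x) not in X for x in X)
```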
The rules of KL 2 are as for KL 1 , except that now B and C are sets of literals (representing conjunctions and disjunctions respectively). A set of literals X satisfies a set R of rules, X |= R, and satisfies an individual rule r, X |= r, just as in KL 1 , except that the elements of a rule are literals rather than atoms: a permissive rule B →C is always satisfied, and X |= B →C for an imperative rule iff B ⊈ X or C ∩ X ≠ ∅.

Minimal Requirements on a Kantian Negation Operator
Kant describes a variety of properties that negation must satisfy throughout the Jäsche Logic. 31 As minimal requirements, we pick out two fundamental properties that he insists on.
The operator ∼ from atoms to literals is our negation operator. First, p and ∼p must be incompatible. 32 Second, ∼p must be the most general proposition that is incompatible with p: 33 for any q, if p and q are incompatible, then q must entail ∼p.
To motivate the second requirement, consider the following example. Suppose Jill can support at most one of three football teams: Arsenal, Barnet, or Chelsea. She cannot support more than one: supporting Barnet is incompatible with supporting Arsenal. But "Jill supports Barnet" (b) cannot be the negation of "Jill supports Arsenal" (a) because it is too specific. The negation ∼a of "Jill supports Arsenal" is the most general claim that is incompatible with her supporting Arsenal, and "Jill supports Chelsea" (c) is also incompatible with her supporting Arsenal. All we can say about ∼a is that b entails ∼a and c entails ∼a.

33 Kant makes this precise claim in [Jäsche Logic §49]: "one of [a pair of contrary judgements] says more . . . than the mere negation of the other." In other words, a claim that is incompatible with p entails (but is not necessarily entailed by) the negation of p. See also Brandom [6], Humberstone [23], p.1170.
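The football example can be checked mechanically. In the following Python sketch (the atom names and the encoding of the incompatibilities are our own), b entails ∼a and c entails ∼a over the admissible combinations, while ∼a does not entail b:

```python
from itertools import product

ATOMS = ['a', 'b', 'c']        # supports Arsenal / Barnet / Chelsea

# Jill can support at most one team: each pair of atoms is incompatible
INCOMPATIBLE = [{'a', 'b'}, {'a', 'c'}, {'b', 'c'}]

def admissible(valuation):
    """A set of supported teams that respects the incompatibilities."""
    return not any(pair <= valuation for pair in INCOMPATIBLE)

# enumerate all subsets of ATOMS
subsets = [set(x for x, keep in zip(ATOMS, bits) if keep)
           for bits in product([0, 1], repeat=len(ATOMS))]

# b entails ~a, and c entails ~a: no admissible valuation contains both
b_entails_not_a = all('a' not in v for v in subsets
                      if admissible(v) and 'b' in v)
c_entails_not_a = all('a' not in v for v in subsets
                      if admissible(v) and 'c' in v)

# but ~a does not entail b: {'c'} is admissible, lacks a, and lacks b
not_a_entails_b = all('b' in v for v in subsets
                      if admissible(v) and 'a' not in v)
```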

Inference Rules
The extra inference rules for KL 2 are given in Fig. 3. They are chosen to capture the two requirements on the negation operator described above. Here we are assuming that the mutual incompatibility of a (non-empty) set A of literals is expressed by the rule A →⊥.
Example 5 Here we use p →q to derive ∼q →∼p.

Example 6 Here we derive →∼q from p ∧ q →⊥ and ∼p ∧ q →⊥.

It is instructive to look at some derived rules of KL 2 . Those in Fig. 4 are all derivable using only ∼-LEFT and the rules of KL 1 . TRANSPOSE is obtained from ∼-LEFT using MUST-SI and MUST-TRANS. MUST-⊥ is obtained by repeated application of TRANSPOSE. (The case of MUST-⊥ where C = ∅ is vacuous but harmless.) INCONS generalises ∼-LEFT. We do not show the derivations in detail. They are very straightforward and will be shown in detail when we present their KL 3 versions in Section 5.
Of particular interest is a further pair of derived rules; the first is a special case of the second. They will be discussed in more detail in the treatment of KL 3 . Unlike the inference rules in Fig. 4, their derivation relies on ∼-RIGHT.

Semantics
A set X of literals is consistent if it does not contain a pair of complementary literals a and ∼a for any atom a; it is inconsistent otherwise. A denotes the set of atoms. Let C A denote the set of constraint rules {a ∧ ∼a →⊥ | a ∈ A}; we omit the subscript A where it is obvious from context. Clearly a set X of literals is consistent exactly when X |= C. We write V A for the set of maximal consistent sets of literals from A, i.e., V A is the set of sets X m such that X m is consistent and, for every a ∈ A, either a ∈ X m or ∼a ∈ X m .

Definition 6 Given a set R of rules, a set X of literals is a violating set of R if there is no maximal consistent X m ∈ V A such that X m ⊇ X and X m |= R.
One can see that an inconsistent set of literals is, by definition, a violating set of every set of rules. And if X is a violating set of R then so is every X ⊇ X.
A violating set X of R cannot be extended to a maximal consistent set X m ⊇ X that satisfies every rule in R. If B →⊥ is a rule in R then X is a violating set of R when B ⊆ X. For a rule of the form B →C (C ≠ ∅), and without the consistency requirement, a set X of literals can always be extended to a set X′ ⊇ X that satisfies that rule (the set of all literals satisfies it). With the consistency requirement, X cannot be extended to a consistent X′ ⊇ X that satisfies B →C when B ∪ C̄ ⊆ X. (Indeed that condition applies to constraint rules also: X cannot be extended to satisfy B →⊥ when B ∪ ∅ ⊆ X.) It is possible that X is a violating set of a set R of rules even though X is not a violating set of any of them individually. A permissive rule B →C is satisfied by every set X of literals: only inconsistent sets of literals are violating sets of → rules.

Note the close parallel between the inference rules and the semantics. In out 2 (R, A), the check for consistency, expressed by the rules C, matches ∼-LEFT, while the set of rules aux(R) provides the additional inferences afforded by ∼-RIGHT. Indeed, we will show (below) that X is a (finite) violating set of R if, and only if, the rule X →⊥ is entailed by R in KL 2 . That will establish the connections between semantic and syntactic entailment in KL 2 .
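Definition 6 can be checked by brute force over the maximal consistent sets V A . A Python sketch (the (body, head, kind) rule encoding and the '~' convention for ∼ are our own):

```python
from itertools import product

def neg(lit):
    """'~p' is the complement of 'p' (our string encoding of literals)."""
    return lit[1:] if lit.startswith('~') else '~' + lit

def satisfies(X, rule):
    """As in KL1, but over literals: permissives always hold; an
    imperative B -> C needs B not contained in X or some head literal in X."""
    body, head, kind = rule
    if kind == 'may':
        return True
    return not body <= X or bool(head & X)

def is_violating(X, R, atoms):
    """Definition 6, brute force: X is a violating set of R iff no
    maximal consistent X_m in V_A extends X and satisfies every rule."""
    if any(neg(x) in X for x in X):
        return True                       # inconsistent sets violate everything
    for pick in product(*[(a, '~' + a) for a in atoms]):
        Xm = frozenset(pick)
        if X <= Xm and all(satisfies(Xm, r) for r in R):
            return False
    return True
```

For example, {p, q} is a violating set of {p ∧ q →⊥}, and {p, ∼q} is a violating set of {p →q}, illustrating the condition B ∪ C̄ ⊆ X.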
Rule entailment is defined as in KL 1 . We write kl 2 (R) for the set of rules semantically entailed by R in KL 2 : kl 2 (R) = {r | R |= KL 2 r}. Rule sets R 1 and R 2 are strongly equivalent in KL 2 when kl 2 (R 1 ) = kl 2 (R 2 ).
For brevity, we will write KL 1 (C) for KL 1 extended with the inference rule ∼-LEFT, and say that R KL 1 (C)-entails r when R ∪ C |= KL 1 r.

Proposition 6 (Decomposition) A set R of rules semantically entails a rule r in KL 2 if and only if R ∪ C ∪ aux(R) semantically entails r in KL 1 . That is, for all rule sets R: kl 2 (R) = kl 1 (R ∪ C ∪ aux(R)).

The following is a general property of KL 1 .

Proposition 7 Let R be a set of rules and A a (finite) set of literals:
It is a corollary of the above that R is strongly inconsistent in KL 1 if and only if R |= KL 1 →⊥. This was Proposition 4. Now we are ready to prove what we want: that X is a (finite) violating set of R precisely when X →⊥ is KL 2 -entailed by R. Informally, if X is a (finite, non-empty) violating set of R then X − {a} →ā ∈ aux(R) for every a ∈ X. And aux(R) ⊆ kl 1 (R ∪ C ∪ aux(R)). So straight away: X − {a} →ā ∈ kl 1 (R ∪ C ∪ aux(R)). Now X →⊥ ∈ kl 2 (R), because (for any non-empty finite set X of literals) X →⊥ is entailed in KL 1 (C) by X − {a} →ā, for any a ∈ X. Syntactically, that is easy to see: it is just an instance of MUST-⊥, which licenses the step from B →c to B ∧ c̄ →⊥ and is a derived rule of KL 1 (C). Semantically, we want to confirm that B ∧ c̄ →⊥ ∈ kl 1 ({B →c} ∪ C). That is very easy (see proof below).

Proposition 8
Let R be a set of rules. If X is a finite violating set of R then R |= KL 2 X →⊥.

Proposition 9 Let R be a set of rules and X a finite set of literals. R |= KL 2 X →⊥ iff X is a violating set of R.

Note that, according to the above, ∅ is a violating set of R if and only if R |= KL 2 →⊥. Two refinements are immediately available. First, any inconsistent set of literals is a violating set of any set R of rules (it has no consistent superset), but an inconsistent set of literals contributes nothing useful to aux(R). The inconsistent set {c, ∼c} produces only the pair {c →c, ∼c →∼c} in aux(R); these are merely instances of MUST-ID. More generally, an inconsistent set A ∪ {c, ∼c} contributes rules to aux(R) of which the first two are merely consequences of MUST-ID and MUST-SI, and the others are entailed by c ∧ ∼c →⊥ in KL 1 by MUST-SI and QUOD-LIBET. So if X is an inconsistent set of literals, then right ({X →⊥}) ⊆ kl 1 (C): X contributes nothing to out 2 (R, A) and can be ignored.
Second, if X is a violating set of R and X′ ⊇ X then X′ is also a violating set of R. Moreover, every rule in right ({X′ →⊥}) can be derived in KL 1 from a rule in right ({X →⊥}). For suppose X′ = X ∪ Y , with X and Y disjoint. Then the rules in right ({X′ →⊥}) are of two forms: rules in the first group are derived by MUST-SI from rules in right ({X →⊥}), and hence every rule in right ({X′ →⊥}) is derivable from right ({X →⊥}).

Proof In the preceding discussion.
Note that if R ∪ C is strongly inconsistent then ∅ is the only minimal consistent violating set of R. In that case aux m (R) = ∅ and out 2 (R, A) = out 1 (R ∪ C, A) = ∅. Finally we confirm that out 2 (R, A) is well-defined for non-empty sets A of assumptions.

Proposition 11
Let R be a set of rules and A a set of literals.

An Alternative Characterisation of out 2
It is possible to construct alternative, equivalent characterisations of the auxiliary rules aux(R). The following will be used in discussions of KL 3 to come and for completeness of KL 2 .
Observation 1 X is a violating set of R iff X is a violating set of the rules R ⊥ ∪ must ⊥ (R), where R ⊥ is the set of constraint rules of the form B →⊥ in R and must ⊥ (R) is the set of constraint rules obtained by applying MUST-⊥ to the rules in R.

Observation 2 Let R 1 and R 2 be sets of rules. If X is a violating set of R 1 ∪ R 2 then X ⊆ X 1 ∪ X 2 for some X 1 and X 2 such that X 1 is a violating set of R 1 and X 2 is a violating set of R 2 .
The following is a derived rule of KL 2 : from B ∧ c →⊥ and B ∧ ∼c →⊥, infer B →⊥. Its derivation requires ∼-RIGHT, and so it is an inference rule of KL 2 , not of KL 1 (C). We can also give a semantic justification by appeal to violating sets: if B ∪ {c} and B ∪ {∼c} are both violating sets of R then clearly so is B.
The following more general rule, RESOLVE-⊥, is also easily derived in KL 2 : from A ∧ c →⊥ and B ∧ ∼c →⊥, infer A ∧ B →⊥. Semantically, if A ∪ {c} and B ∪ {∼c} are both violating sets of R then so is A ∪ B (we cannot extend consistently by either c or ∼c). These rules will feature prominently in KL 3 and we will present their derivation there. We can reformulate RESOLVE-⊥ as a rule applying to violating sets, exactly as stated in the semantic argument; we will call that reformulation V-RESOLVE.
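V-RESOLVE can be computed as a closure operation on a collection of violating sets. A Python sketch (the string encoding of literals, with '~' for ∼, is our own):

```python
def v_resolve_closure(vsets):
    """Close a collection of violating sets under V-RESOLVE:
    if A ∪ {c} and B ∪ {~c} are violating, then so is A ∪ B."""
    def neg(lit):
        return lit[1:] if lit.startswith('~') else '~' + lit
    closure = {frozenset(v) for v in vsets}
    changed = True
    while changed:
        changed = False
        for V1 in list(closure):
            for V2 in list(closure):
                for c in V1:
                    if neg(c) in V2:
                        resolvent = (V1 - {c}) | (V2 - {neg(c)})
                        if resolvent not in closure:
                            closure.add(resolvent)
                            changed = True
    return closure
```

Starting from violating sets {p, q} and {∼q}, for instance, the closure adds {p}.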
We could reformulate RESOLVE-⊥ so that it does not generate rules with inconsistent bodies, and V-RESOLVE so that inconsistent sets are discarded. We could also make the computation of the closure v * (R) more efficient by discarding non-minimal elements as soon as they are constructed. These are details.

Example 15
Now we can define an alternative set of auxiliary rules aux e (R) to be used in out 2 (R, A) that will be useful in establishing completeness and in KL 3 .

Definition 12
For all rule sets R, let aux e (R) = right ({X →⊥ | X ∈ v * (R)}). In order to use aux e (R) in the computation of out 2 (R, A), and to preserve semantic entailment |= KL 2 , it is not necessary that aux e (R) generates all elements of aux(R) - only that it generates at least all minimal elements aux m (R) of aux(R) and nothing that is not in aux(R).

Proposition 12 Let R be a set of rules and X a set of literals. If X is a violating set of R and X ∉ v * (R), then either X is inconsistent or there exists X′ ⊂ X such that X′ ∈ v * (R).
Proof By induction on the number of rules in R, and Observation 2.
This does not say that all elements of v * (R) are consistent or minimal, but only that all minimal consistent violating sets of R are elements of v * (R), which is all we need.
Clearly aux m (R) ⊆ aux e (R). Further, since the set of all violating sets of R is closed under V-RESOLVE, aux e (R) ⊆ aux(R). We also have (Proposition 10) aux(R) ⊆ kl 1 (aux m (R)∪C). Putting these observations together gives the following.

Proposition 13 Let R be a set of rules. aux m (R) ⊆ aux e (R) ⊆ aux(R) ⊆ kl 1 (aux m (R) ∪ C)
and hence all three generate the same entailments when combined with C. Now we shall provide an alternative characterisation of aux e (R) in terms of inference rules of KL 2 .

Definition 13
Let right + (R) denote the result of applying the inference rule ∼-RIGHT to the rules in R, keeping only those rules whose bodies are consistent. Then aux e (R) = right + (resolve * ⊥ (R ∪ must ⊥ (R))).

Proof This follows from the definitions. must ⊥ (R) is the set of constraint rules implied by rules of the form B →C (C ≠ ∅) in R; R may also contain constraint rules of the form B →⊥. So (by definition) X ∈ v(R) when X →⊥ is a rule in R ∪ must ⊥ (R) and X is consistent, and X ∈ v * (R) when X →⊥ is a rule in resolve * ⊥ (R ∪ must ⊥ (R)) and X is consistent. So aux e (R) = right + (resolve * ⊥ (R ∪ must ⊥ (R))).

This characterisation will be used in establishing completeness, because all the inference rules used in the construction of aux e (R) are (derived) inference rules of KL 2 .

Soundness and Completeness
The inference rules of KL 2 are those of KL 1 together with ∼-LEFT and ∼-RIGHT. We write deriv 2 (R) to denote the set of rules that can be derived from the set R of rules by repeated application of the inference rules of KL 2 . We write R ⊢ KL 2 r if r ∈ deriv 2 (R).

Proposition 15 (Soundness of KL 2 ) For all sets R of rules: deriv 2 (R) ⊆ kl 2 (R).

We would expect that if KL 1 is complete with respect to out 1 then KL 2 is complete with respect to out 2 . That is indeed the case.

Proposition 16 (Conditional completeness of KL 2 )
If KL 1 is complete with respect to out 1 then KL 2 is complete with respect to out 2 . That is: if, for all sets R of rules kl 1 (R) ⊆ deriv 1 (R) then, for all sets R of rules kl 2 (R) ⊆ deriv 2 (R).

Conservative Extension
One can see that KL 2 is a conservative extension of KL 1 , both semantically and syntactically.

Proposition 18 (Conservative extension) KL 2 is a conservative extension of KL 1 : If
R is a set of rules containing no negative literals, and rule r also contains no negative literals, then r ∈ kl 2 (R) iff r ∈ kl 1 (R), and r ∈ deriv 2 (R) iff r ∈ deriv 1 (R).
We can see this claim is true by looking at the aux(R) construction: if R contains no negative literals, then all violating sets of R are sets of atoms. All the rules in aux(R) are therefore rules with singleton heads where the head is a negative literal and the body contains only positive atoms, i.e., rules of the form B →∼c where c is an atom and B is a set (possibly empty) of atoms. Any rule r containing only positive atoms can only be derived from R ∪ aux(R) (syntactically or semantically) if it can be derived from the rules R. The constraint rules C have no effect if neither R nor the entailed rule r contain negative literals.
Indeed, KL 1 (C) is a conservative extension of KL 1 and KL 2 is a conservative extension of KL 1 (C). kl 1 (R ∪ C) is a conservative extension of kl 1 (R) and kl 2 (R) is a conservative extension of kl 1 (R ∪ C).

KL 2 can also be seen as a conservative extension of KL 1 in the following rather different sense. Given a set X of literals, let X + be the largest subset of X containing only positive literals. In other words, let X + be the set of atoms obtained by removing all negative literals from X. If S is a set of sets of literals, let S + = {X + | X ∈ S}.

Proposition 19 Let R be a set of rules and A a set of assumptions, both containing no negative literals. Then: out 2 (R, A) + = out 1 (R, A).

In other words: out 2 neither adds solutions to nor removes solutions from out 1 -all it does is possibly add some negative literals to the existing solutions.

Entailments
Some examples of entailments and non-entailments are given in Table 2. Note that the rule corresponding to the law of excluded middle ( →p ∨∼p) is not a theorem of KL 2 .

Concluding Remarks on Negation
The treatment of negation in KL 2 derives from two starting assumptions: that complementary literals c and ∼c are mutually incompatible (∼-LEFT), and that the negation of c is the most general proposition that is incompatible with c (∼-RIGHT). These two assumptions embedded in KL 1 produce a form of negation in which the rules B ∧ c →⊥ and B →c turn out to be equivalent. 34 It is possible to devise some more elaborate technical constructions which weaken this equivalence, such that the rule B →c entails B ∧ c →⊥ but not the other way round. We have not presented any such alternatives here. The technical constructions are not difficult but we have not found support for them in Kant's writings.

KL 3 : Extending KL 2 with Variables and Quantifiers
KL 3 extends KL 2 by adding quantified rules to KL 2 , including rules in which the head may have existentially quantified variables. In KL 3 , an atom has internal structure; it is composed of a predicate and a list of terms.
Given a set P of predicate symbols with associated arities, the set P +/− of positive and negative predicate symbols is P ∪ {∼p | p ∈ P}. 35 The negation of a predicate p means "un-p". For example, ∼clear means "un-clear". The formula p(x) →∼q(x) does not mean "if p(x) then do not subsume x under q!". Rather, it means: "if p(x) then do subsume x under un-q!". Given a set P +/− of predicate symbols, a set K of constants, and a set X of variables, the set L K of ground literals is:

L K ::= {p(k 1 , . . . , k n ) | p ∈ P +/− , k i ∈ K, arity(p) = n}

The set L X of unground literals is:

L X ::= {p(x 1 , . . . , x n ) | p ∈ P +/− , x i ∈ X, arity(p) = n}

The set L of literals is:

L ::= {p(t 1 , . . . , t n ) | p ∈ P +/− , t i ∈ K ∪ X, arity(p) = n}

Note that predicates of arity 0 are allowed. A literal of arity 0 is both a ground literal and an unground literal.

34 This equivalence only holds at the propositional level. We shall see in Section 5.1 below that the two rules are not equivalent when one of them contains existentially quantified variables. 35 Some commentators believe Kant's logic only allowed monadic predicates, but [1] and [2] argue convincingly that Kant always had n-ary predicates in mind.
In what follows, constants and variables are written in lower case. Constants are a, b, c, while variables are x, y, z, possibly with subscripts. To avoid cluttering the syntax, we take it to be obvious from context whether a, b, c are to be read as constants or as ranging over literals.
Note that both L K and L X are proper subsets of L: any literal that contains a mixture of variables and constants is in neither L K nor L X . For example, p(x, k) is in neither L K nor L X .
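For illustration, this classification can be checked mechanically under an assumed representation (not from the paper): a literal as a (predicate, args) tuple, with variables as strings beginning with 'x', 'y' or 'z' and all other terms constants.

```python
# Assumed representation: a literal is a (predicate, args) tuple;
# variables are strings beginning with 'x', 'y' or 'z'.

def is_var(t):
    return t[0] in "xyz"

def in_L_K(lit):
    """Ground literals: every term is a constant."""
    return all(not is_var(a) for a in lit[1])

def in_L_X(lit):
    """Unground literals: every term is a variable."""
    return all(is_var(a) for a in lit[1])

assert in_L_K(("p", ("k1", "k2")))              # p(k1, k2) is in L_K
assert in_L_X(("p", ("x", "y")))                # p(x, y) is in L_X
mixed = ("p", ("x", "k"))                       # p(x, k) mixes the two
assert not in_L_K(mixed) and not in_L_X(mixed)  # in neither
assert in_L_K(("p", ())) and in_L_X(("p", ()))  # arity 0: in both
```

The last assertion illustrates the point about arity 0: a 0-ary literal vacuously satisfies both conditions.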
In KL 3 , rules are made up entirely of unground literals from L X . No constants are allowed in any of the literals in any of the rules. This is essential. Since rules are intended to be public and shareable between agents, while intuitions are private mental objects, rules must not contain constants (intuitions) or they would not be public (see Section 2.2).
There are two forms of rule in KL 3 , B →C and B →C, as in KL 1 and KL 2 , but with B and C ranging over sets of unground literals from L X . Variables that appear in both the body and the head of a rule are read as universally quantified. For example, p(x) →q(x) means: "for any x, if you perform p(x), then you must perform q(x)!". Variables that appear in the head but not in the body are existentially quantified. 36 For example, p(x) →q(x, y) means: "for any x, if you perform p(x), then you may construct a y and perform q(x, y)!". p(x) →q(x, y) means: "for any x, if you perform p(x), then you must construct a y and perform q(x, y)!". To emphasise this reading, we write such rules with explicit existential quantifiers in the head, as in e.g. p(x) →∃y q(x, y) and p(x) →∃y q(x, y).
A rule such as p(x) →q(x, y) ∨ r(x, y) where there is a shared variable y in the head can be read either as p(x) →∃y (q(x, y) ∨ r(x, y)) or (equivalently) as p(x) →∃y q(x, y) ∨ ∃y r(x, y). Notice that the latter is also equivalent to p(x) →∃y q(x, y) ∨ ∃z r(x, z), and therefore to the rule p(x) →q(x, y) ∨ r(x, z) without explicit quantifiers.
As explained below, however, for simplicity of presentation and for practical reasons we will restrict the language so that existential rules have only singleton heads. This does not restrict the expressive power of the language.

Preliminaries
Where θ is a substitution (an assignment of variables and/or constants to variables) and c is a literal, the expression c.θ denotes the application of θ to c. Where C is a set of literals C.θ = {c.θ | c ∈ C}. A substitution θ is ground when all variables in θ are assigned to constants. Where C is a set of unground literals and C.θ are ground literals we say that C.θ is a ground instantiation of C.
Example Suppose θ = {x/a} and θ′ = {y/b}. Then q(a, y).θ′ = q(a, b). Suppose θ = {x/a, y/b, z/c} and θ′ = {} (the identity substitution). Then p(x).θ = p(a) and q(x, y).θ.θ′ = q(a, b).θ′ = q(a, b).

Definition 14 A set X of ground literals satisfies a set of rules R, written X |= R, when X satisfies every rule in R. X satisfies a rule r, written X |= r, when:

X |= B →C when, for every substitution θ such that B.θ ⊆ X, there exists a ground instantiation C.θ.θ′ of C.θ such that C.θ.θ′ ∩ X ≠ ∅
X |= B →C always

Note in the above that if there are no existential variables in the head C of a rule B →C then C.θ is ground and θ′ is the identity substitution. If there are existentially quantified variables in C and θ instantiates all the variables in C to constants (i.e., if C.θ is already a ground instantiation of C) then θ′ is the identity substitution.
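The substitution machinery can be sketched as follows, under an assumed representation (not from the paper): a literal is a (predicate, args) tuple, a substitution is a dict mapping variables to variables or constants, and variables are strings beginning with 'x', 'y' or 'z'.

```python
# Assumed representation: literal = (predicate, args) tuple;
# substitution = dict from variables to variables/constants.

def is_var(t):
    return t[0] in "xyz"

def apply_subst(lit, theta):
    """c.theta: replace each term bound by theta, leave the rest alone."""
    pred, args = lit
    return (pred, tuple(theta.get(a, a) for a in args))

def apply_subst_set(lits, theta):
    """C.theta = {c.theta | c in C}."""
    return {apply_subst(c, theta) for c in lits}

def is_ground(lit):
    """A ground literal contains no variables."""
    return not any(is_var(a) for a in lit[1])

# q(a, y).theta' = q(a, b), as in the example above
assert apply_subst(("q", ("a", "y")), {"y": "b"}) == ("q", ("a", "b"))
# p(x).theta = p(a) with theta = {x/a, y/b, z/c}
assert apply_subst(("p", ("x",)), {"x": "a", "y": "b", "z": "c"}) == ("p", ("a",))
# composition: q(x, y).theta.theta' = q(a, b)
assert apply_subst(apply_subst(("q", ("x", "y")), {"x": "a"}),
                   {"y": "b"}) == ("q", ("a", "b"))
assert is_ground(("q", ("a", "b"))) and not is_ground(("q", ("a", "y")))
```

The assertions reproduce the worked example: q(a, y).θ′ = q(a, b), p(x).θ = p(a), and q(x, y).θ.θ′ = q(a, b).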
Rules in KL 3 are quantified and unground. Leaving aside rules with existential heads, it is clear that formally all ground instances of KL 3 rules are -syntactically and semantically -rules of KL 2 , where the positive ground literals of KL 3 are treated as positive literals (atoms) of KL 2 , and negative ground literals of KL 3 as negative literals of KL 2 , i.e., as atoms prefixed by the negation operator. Universally quantified rules without existentially quantified heads behave in KL 3 , syntactically and semantically, exactly as the set of all their ground instances in KL 2 .
Rules with existentially quantified heads, however, are a different kind of rule and have to be treated specially. Consider the very simplest example: p →∃x q(x). At first sight it might seem that this rule cannot be violated -that there is (apparently) no consistent violating set, because we can always extend a (consistent) set of ground literals by finding a new candidate q(k i ) atom.
But that is not so. The set {p, ∼q(a 0 ), ∼q(a 1 ), . . . } (with infinitely many ∼q(a i ) literals) is a violating set, as are all of its supersets. Ordinarily, in the semantics adopted for negation in KL 2 , if X is a violating set of rule set R then R entails the rule X →⊥. That does not work here: X in this example would represent an infinite conjunction, which is not well-formed. Put another way, the derived inference rule MUST-⊥, which we rely on in the construction of aux e (R) in KL 2 , would look as follows:

p →∃x q(x)
p ∧ ∼q(x) →⊥

That rule does not hold for existential rules. The quantification is wrong. To make it valid we would need

p →∃x q(x)
p ∧ ∼∃x q(x) →⊥

but the consequent is not an allowed rule form in KL 3 .
What about ⊥-RIGHT? Could the following be valid?

p ∧ q(x) →⊥
p →∃x ∼q(x)

Clearly not. "You must not perform p and q(x) for any x!" should not imply "if you perform p you must also construct an x and perform ∼q(x)!". For KL 3 we will need a restricted form of ⊥-RIGHT, as discussed in the next section. For example, the following inference is valid:

Further, notice that the following rules are strongly inconsistent (with the obvious definition):

And the pair

p →∃x q(x)
q(x) →⊥

is weakly inconsistent and has a violating set {p}. Now this is key, because we will want to construct a set aux q e (R) of auxiliary rules for KL 3 in analogy to the construction of aux e (R) in KL 2 . aux e (R) employs a combination of MUST-⊥, to derive constraint rules from non-constraint rules, and then RESOLVE-⊥ to process constraint rules. That is not available here -we do not have MUST-⊥ for existential rules. For existential rules we need (the general form of) the following inference rule, a generalisation of the example above; in the next section we will call it EXISTS-⊥. In KL 2 its propositional analogue can be derived from MUST-⊥ followed by an application of RESOLVE-⊥. In KL 3 it can be given a semantic justification in terms of violation sets, as sketched above for the example. It is also derivable from the inference rules for KL 3 to be presented in the next section -however, as we show there, the derivation imposes certain restrictions on variables that limit its applicability in KL 3 . Similarly, we are also going to need the KL 3 analogue of RESOLVE-⊥; its derivation likewise will impose certain restrictions in order to deal correctly with quantifiers.
For ease of exposition, we restrict attention to the special case of existential rules with singleton heads. Note that this restriction causes no loss of expressive power: we can express rules with existentially quantified disjunctive heads by introducing auxiliary predicates if necessary.
In summary we have:
- universally quantified rules without existentially quantified heads; they have exactly the same meaning -the same semantics and inference rules -as the sets of all their ground instances in KL 2 ;
- inference rules for converting existential rules to constraint rules, which we can justify by appeal to violation sets, and which are derivable from the inference rules for KL 3 to be presented in the next section.
The inference rule EXISTS-⊥ for existential rules with singleton heads is simple.

Inference Rules
The inference rules for KL 3 are provided in Fig. 5. As explained above, for simplicity we deal only with the case of universally quantified rules and existential rules with singleton heads. In Fig. 5, a, b, c range over unground literals, and A, B, C, A′, B′, C′ range over sets of unground literals. The inference rules are of two types: those that are valid for all rules, and those that are valid only for universally quantified rules without existential variables in the head. In the figure they are distinguished by specifying restrictions on variables. SUB-1 and SUB-2 are specific to KL 3 . They allow the uniform replacement of variables by variables, enabling, for example, the inference from p(x) →q(x) to p(y) →q(y). In SUB-1 and SUB-2, the substitution θ must be injective on the existential variables (the variables in B − A). Without this restriction, they would allow the inference from p(x) →∃y q(x, y) to p(x) →q(x, x), which is invalid. In MUST-SI and MAY-SI, the new literals in A′ − A must not bind any of the existential variables in B − A. Without this restriction, we would be able to infer from p(x) →∃y q(x, y) to p(x) ∧ r(y) →q(x, y), which is invalid.
In ∼-RIGHT, we insist that var(c) ⊆ var (B). Without this restriction, we would be able to infer wrongly from p(x) ∧ q(x, y) →⊥ to p(x) →∃y ∼q(x, y). Figure 6 shows three derived inference rules. They will be used, as in KL 2 , in the construction of auxiliary rules aux q e (R) used in the definition of the out function for KL 3 .
MUST-⊥ was used in KL 2 . It is valid for universally quantified rules without existentially quantified heads, but not for rules with existentially quantified heads. Its derivation is presented below in order to show how the restrictions on variables arise.

(The derivation proceeds via MUST-TRANS, yielding the constraint rule B ∧ c →⊥.)

EXISTS-⊥, also discussed informally in the previous section, gives the conditions under which we can derive a universally quantified constraint rule from an existential rule. We present only the version for existential rules with singleton heads. The rule can be justified semantically, by reference to violation sets, and also derived from the inference rules in Fig. 5: the derivation uses MUST-SI and MUST-TRANS, and for this reason EXISTS-⊥ inherits restrictions on variables from MUST-SI. Notice in particular that none of the existentially quantified variables in A →c may appear in the literals B of B ∧ c →⊥.

(The derivation again proceeds via MUST-TRANS, yielding A ∪ B →⊥.)

RESOLVE-⊥ was introduced in its quantifier-free form in the section on KL 2 . Although it deals with universally quantified constraint rules, its derivation relies on EXISTS-⊥ and ∼-RIGHT, from which it inherits restrictions on variables:

(The derivation proceeds via EXISTS-⊥, yielding A ∪ B →⊥, together with the symmetric form, which gives the variable restrictions quoted for RESOLVE-⊥ in Fig. 6.)

Semantics
A set of ground literals is consistent when it contains no complementary pair of literals p(k 1 , . . . , k n ) and ∼p(k 1 , . . . , k n ). Violation sets (sets of ground literals) are defined as in KL 2 .
Given a (countable but not necessarily finite) set R of rules and a (finite) set A of ground literals, the consequences out 3 (R, A) are defined, as in KL 2 , in terms of a set aux q e (R) of additional rules representing the consequences of the inference rules for negation, ∼-LEFT and ∼-RIGHT. We will have:

out 3 (R, A) = out q 1 (R ∪ aux q e (R) ∪ C P , A)

C P is the set of rules corresponding to ∼-LEFT:

{p(x 1 , . . . , x n ) ∧ ∼p(x 1 , . . . , x n ) →⊥ | p ∈ P, x i ∈ X , arity(p) = n}

As usual we omit the subscript P when it is obvious from context. out q 1 (R, A) is the set of all possible outcomes obtained by applying the rules in R to the assumptions A. Each element of out q 1 (R, A) is a set (finite if R is finite) of ground literals. The definition is essentially the same as for KL 1 but adjusted to deal with variables in rules.
Notice that since R is a set of unground rules with variables and A is a set of ground literals, it is no longer the case that assumptions A can be replaced by 'facts' (unconditional → rules with singleton heads). An expression →a where a is a ground literal is not a valid rule in KL 3 (unless a is a literal of arity 0).

Definition 15
Let R be a set of KL 3 rules and A a set of ground literals. The step q function takes a set of rules and a set of ground literals and produces all the ground literals that can be inferred in a single step using a single rule from R. step q is exactly like the step function in the definition of cns and out 1 for KL 1 , except for the need to instantiate variables in rules to constants in the ground literals of the argument X. The substitution θ can include new fresh constants that do not appear in A and that can serve as witnesses for existentially quantified variables. For example, if R = {p(x) →∃y q(x, y)} and X = {p(a)}, then step q (R, X) contains q(a, ν 0 ), q(a, ν 1 ), . . . . Here, ν 0 and ν 1 are new fresh constants. We assume we have an infinite stock of such constants ν 0 , ν 1 , . . . .

aux q e (R) is defined analogously to aux e (R) in KL 2 . For rules without existential heads this will be exactly as for KL 2 , with universal rules treated as standing for the set of their ground instances. The additional ingredient for existential rules is an application of the inference rule EXISTS-⊥, as discussed informally in the previous section.
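A hypothetical sketch of the step q idea (not the authors' implementation, and with an assumed representation of literals as (predicate, args) tuples): match a rule body against the ground literals X, then emit ground instances of the head, minting fresh witness constants for head variables left unbound by the match.

```python
from itertools import count

fresh = count()  # infinite stock of fresh constants nu0, nu1, ...

def is_var(t):
    return t[0] in "xyz"

def match(body, X, theta=None):
    """Yield all substitutions that ground every body literal into X."""
    theta = dict(theta or {})
    if not body:
        yield theta
        return
    (pred, args), rest = body[0], body[1:]
    for (p2, args2) in X:
        if p2 != pred or len(args2) != len(args):
            continue
        t2 = dict(theta)
        ok = True
        for a, k in zip(args, args2):
            if is_var(a):
                if t2.setdefault(a, k) != k:   # bound inconsistently
                    ok = False
                    break
            elif a != k:                       # constant mismatch
                ok = False
                break
        if ok:
            yield from match(rest, X, t2)

def step_q_one(rule, X):
    """Ground head instances of one rule (body, head) against X.

    Head variables not bound by the body match are existential:
    they are instantiated with fresh witness constants.
    """
    body, head = rule
    out = set()
    for theta in match(body, X):
        pred, args = head
        grounded = []
        for a in args:
            if is_var(a) and a not in theta:
                theta[a] = f"nu{next(fresh)}"  # fresh witness
            grounded.append(theta.get(a, a))
        out.add((pred, tuple(grounded)))
    return out

# p(x) -> exists y. q(x, y), applied to X = {p(a)}
rule = ([("p", ("x",))], ("q", ("x", "y")))
result = step_q_one(rule, {("p", ("a",))})
assert len(result) == 1
(pred, (k1, k2)) = next(iter(result))
assert pred == "q" and k1 == "a" and k2.startswith("nu")
```

Each call to step_q_one mints a new witness, so repeated applications produce q(a, ν 0 ), q(a, ν 1 ), . . . as in the example above.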
Definition 16 Let R be a set of KL 3 rules. must ⊥ (R), resolve * ⊥ (R) and right + (R) are defined from the three derived rules in Fig. 6, as for KL 2 (and in accordance with the relevant KL 3 variable restrictions). Let exists * ⊥ (R) denote the closure of rules R under EXISTS-⊥. Define aux q e (R) as the closure of R under ∼-RIGHT, EXISTS-⊥ and MUST-⊥. In computing aux q e (R) it is sufficient to perform a single application of MUST-⊥, which deals with non-existential rules, and then the closure under EXISTS-⊥ and RESOLVE-⊥. The latter can be done in two separate steps: first the closure under EXISTS-⊥ and then the closure under RESOLVE-⊥. This is because (as was shown earlier) RESOLVE-⊥ is derivable as ∼-RIGHT followed by EXISTS-⊥: resolve ⊥ (R) = exists ⊥ (R ∪ right (R)) for any R, and resolve ⊥ (R), for any R, is already closed under exists ⊥ .

Example 20 Suppose:
Note that the existentially quantified variable x in the rule →∃x ∼q(x) of R appears in the body of the inferred constraint rule p(x) ∧ ∼q(x) →⊥ of aux q e (R). The variable restrictions in EXISTS-⊥ however do not sanction the inference of the rule p(x) →⊥.

Equality
We add an extra binary logical operator ≠. The expression x ≠ y does not represent the act of subsuming x and y under the mark of inequality. Rather, ≠ is a testing operator that is different from the act of subsumption: to test if x ≠ y is just to see whether the denotations of x and y are distinct. Expressions of the form x ≠ y can appear only in the body of a rule; x and y must be variables appearing in the body.
One can think of a rule as having two distinct components (T , r) where r is an expression of the form B →C or B →C, and T is a set, possibly empty, of ≠ tests on variables appearing in B. However, for readability, we allow the inequality tests in T to be written in the body of a rule as if they were atoms.

Example 21 Suppose R = {p(x) →∃y q(x, y)} and A = {p(a)}.
If we add an extra rule q(x, y) ∧ q(x, z) ∧ y ≠ z →⊥ containing ≠, then we can constrain the set of witnesses.
Note that ⊥-RIGHT does not allow us to infer from the rule q(x, y) ∧ q(x, z) ∧ y ≠ z →⊥ in R to the rule q(x, z) ∧ y ≠ z →∼q(x, y). In the (T , r) representation described above, that would be an inference from ({y ≠ z}, q(x, y) ∧ q(x, z) →⊥) to ({y ≠ z}, q(x, z) →∃y ∼q(x, y)), which does not satisfy the variable restrictions of ⊥-RIGHT.
To handle inequality, we modify what it means for a set X of ground literals to satisfy the body of a rule, to take into account the possible presence of ≠ tests. Let us think of the inequality tests as belonging to the body B. We will say that X satisfies B with ground instantiation of variables θ, written X |= θ B, if for every literal b in B, b.θ ∈ X, and for every expression x ≠ y in B, the constants x.θ and y.θ are distinct. We are thereby making a unique names assumption on constants: two constants denote distinct objects when they are lexicographically distinct.
The adjustment for step q is as follows: step q (R, X) ={c.θ | B →C ∈ R or B →C ∈ R, X |= θ B, c ∈ C, c.θ is ground}
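The satisfaction relation X |= θ B with inequality tests under the unique names assumption can be sketched as follows (assumed representation, not from the paper: literals as (predicate, args) tuples, ≠ tests as pairs of variables).

```python
# X |= theta B under the unique names assumption: two constants
# denote distinct objects iff they are (lexicographically) distinct
# strings, so a test x != y passes iff x.theta and y.theta differ.

def satisfies(X, body, neq_tests, theta):
    """X |= theta B: every literal grounded into X, every != test passes."""
    for (pred, args) in body:
        grounded = (pred, tuple(theta.get(a, a) for a in args))
        if grounded not in X:
            return False
    for (x, y) in neq_tests:
        if theta.get(x, x) == theta.get(y, y):  # same constant: fails
            return False
    return True

X = {("q", ("a", "b")), ("q", ("a", "c"))}
body = [("q", ("x", "y")), ("q", ("x", "z"))]
# mapping y and z to distinct constants satisfies y != z ...
assert satisfies(X, body, [("y", "z")], {"x": "a", "y": "b", "z": "c"})
# ... but mapping them to the same constant does not
assert not satisfies(X, body, [("y", "z")], {"x": "a", "y": "b", "z": "b"})
```

This mirrors the constraint rule q(x, y) ∧ q(x, z) ∧ y ≠ z →⊥ discussed above: the rule body is only satisfied when y and z are instantiated to distinct witnesses.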

Comparing KL 3 with Geometric Logic
All rules in KL 3 are of the form ∀x̄ φ(x̄) •→ ∃ȳ ψ(x̄, ȳ), where x̄ and ȳ are tuples of variables and •→ is either → or →. These rules have the same quantifier structure as the rules of geometric logic. 37 The geometric formulae (also known as the "coherent implications") are the implications C → D where

C ::= ⊤ | C ∧ P
D ::= ⊥ | D ∨ E
E ::= ∃x C

and P ranges over L. 38 Although the rules of KL 3 have the same quantifier structure as the rules of geometric logic, there are a number of differences. First, KL 3 has two types of rule, → and →, while geometric logic has only one. Second, KL 3 has predicate negation, while geometric logic does not include any sort of negation. Third, weakening the output is valid in geometric logic, 39 but not in KL 3 .

37 The importance of geometric logic for understanding Kant's thought is stressed in the papers by Theodora Achourioti and Michiel van Lambalgen [1, 2]. For geometric logic in general, see [4, 5, 13, 20, 37].
The fourth difference between the two systems is the way in which the tree of nodes 40 is generated. In KL 3 , to generate the successors step q (R, X) of a set X of atoms, we consider all rules in R whose bodies are satisfied. When we have finished constructing the nodes, we filter them to accept only those that satisfy all the → rules. In geometric logic, the dynamical proof tree is generated by considering only violated rules: rules whose body is satisfied but whose head is unsatisfied. To see the difference, consider the rule-set R consisting of only the rule →∃x φ(x). In geometric logic, the proof tree contains one node with the single atom φ(a 0 ) for some constant a 0 . Once the rule's head has been satisfied, it is no longer available to generate further nodes. In KL 3 , by contrast, the step q function allows a rule to be applied whenever its body is satisfied, so out 3 (R, {}) contains infinitely many possible solutions. All of these differences are crucial to the intended application of our logic in understanding Kant (see Sections 2, 3.7, and 6).

Translating Natural Language into KL 3
Finally in this section, and before returning to Kant's texts, we shall spend a little time showing how natural language sentences can be translated into KL 3 . This exercise is important because the translation guidelines for KL 3 are rather different from those for translating natural language into first-order logic.

Singular Judgements
In first-order logic, a singular judgement, such as "Caius is mortal," is translated into an atom:

mortal(caius)

where mortal is a one-place predicate and caius is a constant representing the individual Caius. In KL 3 , by contrast, an atom represents a subsumption -the act of subsuming a private mental intuition under a mark. So in KL 3 , declarative sentences are never translated into atoms (subsumptions). Instead, the judgement "Caius is mortal" is rendered as a rule:

caius(x) →mortal(x)

This is a conditional imperative that relates actions. It says: for all intuitions x, if you are subsuming x under the mark "caius", then also subsume x under the mark "mortal"! Now a proper noun, such as "Caius," is normally taken to imply existence (there is at least one individual denoted by "Caius") and uniqueness (there is at most one individual denoted by "Caius"). If we wish to express existence and uniqueness in KL 3 , we write: 41

→∃x caius(x)
caius(x) ∧ caius(y) ∧ x ≠ y →⊥

Judgements involving binary predicates are represented similarly. "Jack loves Jill" is rendered as:

jack(x) ∧ jill(y) →loves(x, y)

plus existence and uniqueness constraints, as needed.

38 Note that geometric logic, unlike KL 3 , does allow constants as terms in rules. 39 See the classical evaluation rule for disjunction on page 3 of [13]: X ⊩ φ 1 ∨ φ 2 if X ◁ U and for all Y ∈ U , Y ⊩ φ 1 or Y ⊩ φ 2 . 40 Each "node" is a set of literals in out q n (R, A) at depth n.

All and Some
Universally quantified judgements, such as "All humans are mortal," are rendered directly into KL 3 as:

human(x) →mortal(x)

Recall, once more, that this rule is a conditional imperative stating what actions you must do: if you are subsuming private mental intuition x under the mark "human" then also subsume x under "mortal"! Judgements involving "some" can be translated into KL 3 in two different ways. "Some humans are fickle," for example, can be translated into:

human(x) →fickle(x)

This is a permissive rule: if you are subsuming intuition x under "human", then feel free to also subsume x under "fickle"! This way of translating the sentence has no existential import whatsoever. It is fully compatible with there actually being no humans at all. The other way of translating "Some humans are fickle," by contrast, provides existential import: 41

→∃x p 1 (x)
p 1 (x) →human(x)
p 1 (x) →fickle(x)

41 This is related to Kant's point that "It is a mere tautology to speak of universal or common concepts -a mistake that is grounded in an incorrect division of concepts into universal, particular, and singular. Concepts themselves cannot be so divided, but only their use" [Jäsche Logic p. 91]. For further discussion, see Section 6. Relatedly, note that KL 3 and inclusive logic have one thing in common in that they both avoid the presupposition that the domain is non-empty.
Here, p 1 is a new predicate mark introduced to represent the conjunction of human and fickle. These rules mean: you must construct at least one intuition x and subsume x under both "human" and under "fickle". 42 In [Jäsche Logic §46], Kant says that universal judgements ("all" judgements) imply particular judgements ("some" judgements). In KL 3 the inference from "all" to "some" is valid if we interpret "all" and "some" in terms of "must" and "may". 43 But the inference from "all" to "some" is not valid if we interpret "some" in terms of the existential quantifier.

One of the key strengths of first-order logic is its ability to handle multiply quantified sentences. We can infer, for example, from "there is some (particular) prince who has offended every delegate", that "for every delegate, there is some prince who has offended her". Aristotle's two-term logic has been rightly criticised for its inability to deal with inferences involving multiply quantified sentences. KL 3 does not suffer from the inadequacies of Aristotle's logic. A single rule in KL 3 is implicitly of the form ∀x̄ φ(x̄) •→ ∃ȳ ψ(x̄, ȳ), where x̄ and ȳ are tuples of variables, and •→ is either → or →. A single rule cannot capture a sentence of the form ∃x̄ ∀ȳ φ(x̄, ȳ). However, a set of rules in KL 3 can capture this. For example, "there is some (particular) prince who has offended every delegate" can be rendered as R 1 below, while "for every delegate, there is some prince who has offended her" can be rendered as R 2 . In this example R 2 is a conservative extension of R 1 .

42 The auxiliary predicate p 1 is necessary because we have restricted the form of rules so as not to allow conjunctive conclusions. This restriction can be removed straightforwardly; we have not presented the details to avoid lengthening the presentation. Henceforth when presenting examples we will occasionally write rules with conjunctive conclusions without further comment. Rules with conjunctive conclusions can always be translated by introducing an auxiliary predicate, as in this example. 43 An alternative way to handle the inference from "all" to "some" is to use a many-sorted logic (where the domain of each sort must be non-empty), thus legitimising the inference from ∀x:t φ(x) to ∃x:t φ(x). We are grateful to Michiel van Lambalgen for this suggestion.
More generally, [14] shows that, for each set F of first-order sentences, there is a set of sentences of geometric logic that is a conservative extension of F . 44

The "is" of Identity and the "is" of Predication
In first-order logic, the sentence "Phosphorus is bright" is translated as a predication:

bright(phosphorus)

where bright is a one-place predicate and phosphorus is a constant. The sentence "Hesperus is Phosphorus," by contrast, involves the "is" of identity and should be translated as:

hesperus = phosphorus

If we wish to infer that, therefore, Hesperus is bright, we need to use Leibniz's law. This is an (infinite) axiom schema licensing, for every sentence φ(x) with one free variable x, the inference:

LEIBNIZ LAW
φ(x)    x = y
φ(y)

In KL 3 , by contrast, the two senses of "is" do not come apart. "Phosphorus is bright" is translated as:

phosphorus(x) →bright(x)

"Hesperus is Phosphorus" is rendered as:

hesperus(x) →phosphorus(x)

together with the symmetric rule:

phosphorus(x) →hesperus(x)
The inference to hesperus(x) →bright (x) does not require any infinite axiom schema. It just involves the standard MUST-TRANS inference rule.

Two Types of Negation
Natural language distinguishes between sentence-negation (e.g. "It is not the case that Jack is tall") and predicate-negation 45 (e.g. "Jack is not tall"). First-order logic, of course, cannot capture these two distinct interpretations. The only negation in first-order logic is sentential negation. But KL 3 can capture the two distinct readings. "It is not the case that Jack is tall" is rendered as:

jack(x) ∧ tall(x) →⊥

"Jack is not tall" is rendered as:

jack(x) →∼tall(x)

Now in KL 3 these two particular claims are provably equivalent -but in general, when existentially quantified variables are involved, sentence-negation and predicate-negation are not equivalent in KL 3 . Consider "Jack is not married to anyone":

jack(x) ∧ married(x, y) →⊥

Compare with "There is someone who Jack is not married to":

jack(x) →∃y ∼married(x, y)

Neither claim entails the other.

44 Many commentators (for example, MacFarlane [35], p. 26; also [18] and [43]) assume or claim that Kant's logic does not support nested quantifiers, while our formalization presupposes that his logic does have this expressive power. Our main evidence that this common view is wrong is the systematic support our account gets from making sense of Kant's otherwise notoriously obscure and problematic Table of Judgements (Section 6). But for compelling textual evidence, see [1], pages 260-2. 45 As does Kant (in transcendental logic), e.g. at [A71-3/B97-8], [Jäsche Logic 9:103-4]. See also [45] p. 268.

Recovering the Table of Judgements
The Table of Judgements [A70/B95] is divided into four "titles": Quantity, Quality, Relation, and Modality. Kant clearly thought this division was fundamental because it appears as an organising framework throughout the critical works. 46 Each title represents one structural feature of a judgement. In Kant's table, there are three possible values for each structural feature, so there are at most 3⁴ possible types of judgement. 47 The four titles were in widespread use in the logic textbooks of the time, 48 but Kant's particular use of them was unusual.
The Quantity of a categorical subject-predicate judgement indicates whether the extension of the subject is partly or wholly contained in the extension of the predicate. If the extension of the subject S is wholly contained in the extension of the predicate P , then we say "all S are P", and the judgement has universal quantity. If the extension of S is only partly contained in the extension of P , then we say "some S are P", and the judgement has particular quantity. If the extension of S is a singleton, and this single element is a member of the extension of P , then we say "the individual S is P ", and the judgement has singular quantity.
One problem with this way of characterising Quantity is that it only applies to categorical judgements involving monadic predicates. But we shall see below how to extend this idea naturally to all other types of judgement, including hypothetical and disjunctive judgements involving binary or n-ary predicates. Kant treats singular judgements as a sub-type of universal judgements within the Aristotelian two-term logic. But note that this claim is obviously false if universal and singular judgements are interpreted in terms of first-order logic. The singular judgement p(a) is not a sub-type of the universal judgement (∀x) a(x) ⊃ p(x).

47 In practice, there will be slightly fewer, since some combinations are incompatible. For example, a judgement cannot be both negative and disjunctive. Nor can it be both negative and particular. See [2].
48 See in particular The Port-Royal Logic [3].
The Quality of a judgement indicates whether the predicate is affirmed or denied of the subject. If the predicate is affirmed of the subject, as in "All humans are mortal", then the judgement is affirmative. If the predicate is denied of the subject, as in "It is not the case that the soul is mortal", then the judgement is negative. But if the negation of the predicate is affirmed of the subject, as in "The soul is non-mortal", then the judgement is infinite. 49 The infinite judgements are, according to Kant, a sub-type of the affirmative judgements: If I had said of the soul that it is not mortal, then I would at least have avoided an error by means of a negative judgement. Now by means of the proposition "The soul is non-mortal" I have certainly made an actual affirmation as far as logical form is concerned, for I have placed the soul within the unlimited domain of undying things. [A72/B97] Note that both the distinction between negative and infinite judgements, and the claim that the infinite judgements are a sub-type of affirmative judgements, make no sense within first-order logic. In Frege's logic and its descendants, there is only one type of negation: sentence-level negation.
Kant's use of Relation is very different from its current meaning. In modern logic, a relation is an n-ary predicate where n > 1. For Kant, the Relation is a structural feature of a judgement indicating how the various subsumptions in the judgement are related to each other: All relations of thinking in judgement are either those a) of the predicate to the subject, b) of the ground to the consequence, and c) between the cognition that is to be divided and all of the members of the division. [A73/B98] In case (a), when a judgement involves just two subsumptions (e.g. "all humans are mortal"), then the judgement is categorical. In case (b), when a judgement has a condition that must be satisfied (e.g. "If there is perfect justice, then obstinate evil will be punished"), then the judgement is hypothetical. In case (c), when a judgement has a disjunctive conclusion (e.g. "The world exists either through blind chance, or through inner necessity, or through an external cause"), then the judgement is disjunctive 50 [A73-4/B98-9].
Strawson [43] criticised Kant's use of Relation for being neither exhaustive nor exclusive. The three types of Relation are not exhaustive, since some types of judgement (e.g. conjunctions) are not present at all. The three types of Relation are not exclusive, since hypotheticals and disjunctions can, in standard propositional logic, be inter-defined using negation: p ⊃ q if and only if ¬p ∨ q. However, we shall see below that in KL3 Kant's threefold division is very natural. 49 "In negative judgements the negation always affects the copula; in infinite ones it is not the copula but rather the predicate that is affected." [Jäsche Logic 9:104] 50 Disjunctions for Kant are exclusive disjunctions (see Section 3.1).
The fourth title, Modality, is a different type of feature from the others. While Quantity, Quality, and Relation are structural features of an individual judgement, Modality (as we read Kant) is a feature indicating how the judgement relates to the rest of the judgements held by an agent: The modality of judgements is a quite special function of them, which is distinctive in that it contributes nothing to the content of the judgement (for besides quantity, quality, and relation there is nothing more that constitutes the content of the judgement), but rather concerns only the value of the copula in relation to thinking in general. [A74/B100] The Modality of a judgement can be either problematic, assertoric, or apodictic. These are not the alethic modalities of possibility, actuality, and necessity. They are more like epistemic modals that relate us to the alethic modalities in particular ways: Problematic judgments are those in which one regards the assertion or denial as merely possible (arbitrary). Assertoric judgments are those in which it is considered actual (true). Apodictic judgments are those in which it is seen as necessary. [A74-5/B100] In the [Jäsche Logic 9:108-9], Kant goes on to explain his modalities of judgement in terms of the very same normative notions (may/must) that have been so central to KL 3 . The difference, we shall see, is that they function at a different level.
In each of the four titles, the third moment is defined as a sub-type of the first moment. A singular judgement is a sub-type of universal judgement; an infinite judgement is a sub-type of affirmative judgement; a disjunctive judgement is a sub-type of categorical judgement, and an apodictic judgement is a sub-type of problematic judgement. According to Kant, the third moment in each title entails a judgement of the second moment. A singular judgement entails a particular judgement; an infinite judgement entails a negative judgement; a disjunctive judgement entails a hypothetical judgement, and an apodictic judgement entails an assertoric judgement.
Kant's Table of Judgements has been roundly criticised as incomplete, confused, or based on an impoverished, expressively limited logic. In this paper we argue, by contrast, that KL3 is a powerful and expressive logic in which Kant's table emerges as the most natural way of categorising rules.

KL 3 Makes Sense of Kant's Table of Judgements
Since Kant sees a judgement as a type of rule (see Sections 1 and 2), a way of classifying rules will also be a way of classifying judgements. In this section, we shall provide four ways of classifying rules in KL 3 , and show how each classification corresponds to one of the four titles in the Table of Judgements.
Quantity In KL3, there are two types of rule: conditional imperatives and conditional permissives. An imperative of the form p → q means: "if you are performing p, then also perform q!" A permissive of the form p ⇢ q means: "if you are performing p, then feel free to also perform q!" We propose the following simple identification: a rule (judgement) has universal quantity if it is a conditional imperative, while a rule has particular quantity if it is a conditional permissive. So, for example, the universal judgement "all humans are mortal" would be rendered as:

human(x) → mortal(x)

while the particular judgement "some men are fickle" would be rendered as:

man(x) ⇢ fickle(x)

A singular judgement is a sub-type of universal judgement in which there is at most one object falling under the subject term. "Caius is mortal", for example, would be rendered by a pair of rules: an imperative caius(x) → mortal(x), together with a rule guaranteeing that at most one intuition falls under caius. This way of characterising Quantity has three appealing features. First, it shows how a singular judgement can be a type of universal judgement. In first-order logic, by contrast, a singular judgement is typically not rendered as a type of universal judgement. Second, it shows how Quantity can apply to all types of judgement. Recall that Quantity is normally defined only for affirmative categorical judgements involving monadic predicates (subject-predicate sentences of the form "S is P"), and there is a question of how to extend this definition to all types of judgement. If Quantity is based on the distinction between conditional imperatives and conditional permissives, then it applies to all types of rule. Finally, this way of defining Quantity is consistent with Kant's view 51 that the inference from universal to particular quantity is valid. Consider the inference:

All S are P
Therefore, some S are P

If these statements are translated into first-order logic, the inference is obviously invalid: we cannot infer ∃x (s(x) ∧ p(x)) from ∀x (s(x) ⊃ p(x)), since there may be no objects whatsoever satisfying s(x).
However, when we translate the universal into KL3 as the imperative s(x) → p(x) and the particular as the permissive s(x) ⇢ p(x), the inference is valid: we can infer s(x) ⇢ p(x) from s(x) → p(x) using the MUST-MAY inference rule.
In Kant's Table of Judgements, the third moment always entails a judgement of the second moment. In the case of Quantity, a singular judgement is a type of universal judgement, which itself entails a particular judgement, using the MUST-MAY inference rule.
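The MUST-MAY step can be sketched as a one-line transformation on rule objects. The encoding is illustrative (ours, not the paper's formal syntax), with "must" for imperatives and "may" for permissives:

```python
from collections import namedtuple

Rule = namedtuple("Rule", "kind body head")  # illustrative encoding

def must_may(rule):
    # MUST-MAY: whatever you must do, you may do. From an imperative
    # (universal quantity) infer the matching permissive (particular
    # quantity). No existential import is involved at any point.
    assert rule.kind == "must"
    return rule._replace(kind="may")

# "All S are P":  s(x) → p(x)   (imperative)
universal  = Rule("must", (("+", "s", ("x",)),), (("+", "p", ("x",)),))
particular = must_may(universal)   # "Some S are P" as a permissive
```

Because the permissive conclusion carries no existential commitment, the universal-to-particular inference goes through even when nothing falls under s, which is exactly where the first-order translation fails.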

Quality
In KL3, a conditional imperative has the form p₁ ∧ ... ∧ pₙ → q₁ ∨ ... ∨ qₘ. In particular, if the disjunction is empty, then the imperative acts as a constraint: p₁ ∧ ... ∧ pₙ → ⊥ says: whatever you do, do not perform all of p₁, ..., pₙ. Constraints can be used to represent negative judgements: 52

jack(x) ∧ married(x) → ⊥

represents the judgement that it is not the case that Jack is married. Recall from Section 5 that the set of predicates in KL3 contains positive and negative marks. Given a set P of predicate marks, the complete set P+/− of signed predicates is:

P+/− = P ∪ {∼p | p ∈ P}

An infinite judgement is an affirmative judgement in which the conclusion involves a negated mark. To say that Jack is unmarried, we write:

jack(x) → ∼married(x)

Note that the negation binds to the mark married and not to the subsumption married(x).
Unlike first-order logic, KL 3 is able to distinguish between negative and infinite judgements, and is able to characterise infinite judgements as a sub-type of affirmative judgements.
In Kant's Table of Judgements, the third moment always entails a judgement of the second moment. In the case of Quality, an infinite judgement (e.g. p → ∼q) entails a negative judgement (e.g. p ∧ q → ⊥) using the following inference: from p → ∼q, infer p ∧ q → ⊥.
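This Quality inference can be sketched with the same kind of illustrative rule encoding (ours, not the paper's formal syntax), with an empty head standing for the falsum consequent and variables suppressed:

```python
from collections import namedtuple

Rule = namedtuple("Rule", "kind body head")  # illustrative encoding

def infinite_to_negative(rule):
    # From an infinite judgement  p → ∼q  derive the negative
    # judgement (a constraint)  p ∧ q → ⊥: move the negated mark,
    # with positive sign, into the body, and empty the head.
    sign, mark = rule.head[0]
    assert rule.kind == "must" and sign == "~" and len(rule.head) == 1
    return Rule("must", rule.body + (("+", mark),), ())

infinite = Rule("must", (("+", "p"),), (("~", "q"),))   # p → ∼q
negative = infinite_to_negative(infinite)               # p ∧ q → ⊥
```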

Relation
In KL3, imperatives of the form p₁ ∧ ... ∧ pₙ → q₁ ∨ ... ∨ qₘ and permissives of the form p₁ ∧ ... ∧ pₙ ⇢ q₁ ∨ ... ∨ qₘ can be categorised according to the number n of conjuncts in the antecedent and the number m of disjuncts in the consequent. If there is one element in the antecedent, then the rule is categorical. 53 If there are many elements in the antecedent, then the rule is hypothetical. If there is one element in the antecedent, but many elements in the consequent, then the rule is disjunctive [A73-4/B98-9].
Note that Strawson's criticism of Kant's three moments of Relation (that they are not exhaustive) does not apply to this formalization in KL3. The first two types of rule (categorical and hypothetical) exhaust all rules with n > 0.
Recall that in Kant's Table of Judgements, the third moment of each title is a subtype of the first moment, and entails a judgement of the second moment. In the case of Relation, the disjunctive judgement (because it has one element in the antecedent) is a sub-type of the categorical and entails a hypothetical judgement (using MUST-SI or MAY-SI).
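The n/m classification just described can be sketched directly (an illustrative encoding of ours, with atoms as signed marks and variables suppressed):

```python
from collections import namedtuple

Rule = namedtuple("Rule", "kind body head")  # illustrative encoding

def relation(rule):
    # Classify by the number of conjuncts in the antecedent and
    # disjuncts in the consequent: one antecedent -> categorical;
    # many antecedents -> hypothetical; one antecedent with many
    # consequents -> disjunctive, a sub-type of categorical.
    labels = set()
    if len(rule.body) == 1:
        labels.add("categorical")
        if len(rule.head) > 1:
            labels.add("disjunctive")
    elif len(rule.body) > 1:
        labels.add("hypothetical")
    return labels

cat  = relation(Rule("must", (("+", "a"),), (("+", "b"),)))
hyp  = relation(Rule("must", (("+", "a"), ("+", "c")), (("+", "b"),)))
disj = relation(Rule("must", (("+", "a"),), (("+", "b"), ("+", "c"))))
```

On this classification the three moments are exhaustive for n > 0, and disjunctive rules come out, by construction, as a sub-type of categorical rules.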

Modality
The Kantian agent makes sense of its sensory perturbations by constructing and applying rules. These rules are conditional imperatives and permissives relating mental acts, e.g.: for all private mental intuitions x, if you are subsuming x under mark p, then also subsume x under mark q! At any moment, the Kantian agent has a set A of subsumptions that it is performing, and a set R of rules that it has adopted. Given the subsumptions A and rules R, there are various different bundles of mental activity that are compatible with A and R. These are the various sets X ∈ out(R, A): the various sets of subsumptions it may perform. There is also the one distinguished set of subsumptions it is actually performing. If we take the intersection of all the X such that X ∈ out(R, A), then we get the subsumptions it must perform.
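The may/must picture at the level of subsumptions can be sketched in a toy propositional setting. This is our simplification (atomic acts, single-atom consequents, out enumerated by brute force over the optional permissive rules), not the paper's official semantics:

```python
from itertools import combinations

def close(rules, acts):
    # Close a set of acts under imperative rules (body, head).
    acts = set(acts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= acts and head not in acts:
                acts.add(head)
                changed = True
    return frozenset(acts)

def out(must_rules, may_rules, A):
    # Each element of out(R, A) is one bundle of acts compatible with
    # the rules: the must-rules always fire; each subset of the
    # may-rules is optionally treated as taken up.
    bundles = set()
    for k in range(len(may_rules) + 1):
        for chosen in combinations(may_rules, k):
            bundles.add(close(must_rules + list(chosen), A))
    return bundles

R_must = [(frozenset({"p"}), "q")]   # imperative  p → q
R_may  = [(frozenset({"p"}), "r")]   # permissive  p ⇢ r
bundles   = out(R_must, R_may, {"p"})
must_acts = frozenset.intersection(*bundles)   # what it must perform
may_acts  = frozenset.union(*bundles)          # what it may perform
```

Here the intersection of the bundles recovers the must-subsumptions, as in the text, while their union collects everything the agent may do.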
As well as the collections of subsumption acts that it may or must perform, however, there are also the rules that it may or must adopt. These are the selfsame normative notions at work in both cases -they have their force and content relative to the agent's goal of achieving experience. But they function at different levels. Whereas the normative characterizations of subsumptions within rules were the basis of the types of Quantity, Quality, and Relation in the Table of Judgements, it is the normative characterizations of rules themselves that are the basis of the types of Modality.
Given a set A of subsumptions that the agent is performing, and a set R of rules it has adopted, there are various further rules that the agent may adopt. Of course, not every set of rules can be added. Some rules may be incompatible with one of the existing rules in R. Or some rule may be incompatible with some of the agent's current subsumptions. For example, if the agent is subsuming an intuition k under mark p, and is also subsuming k under q, then it may not adopt the rule p(x) ∧ q(x) → ⊥. We propose that the problematic judgements are the rules an agent may adopt: 54 Problematic judgments are those in which one regards the assertion or denial as merely possible. [A74/B100] Given a set R of rules that an agent has already adopted, the assertoric judgements are the rules in R that it has already committed to: The assertoric proposition ... indicates that the proposition is already bound to the understanding according to its laws. [A76/B101] Further, the apodictic judgements are the rules that it must adopt, 55 given R: Apodictic judgements are those in which it [the judgement] is seen as necessary. [A75/B100] These are the rules in deriv(R), in Section 3.5 above. A summary of our interpretation of Kant's Table of Judgements in KL3 is given in Fig. 7. 56

54 See also [Jäsche Logic 9:109]: "The soul of man may be immortal."
55 See also [Jäsche Logic 9:109]: "The soul of man must be immortal."
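On this reading, whether a rule may be adopted depends on the agent's current activity. A toy version of the compatibility check (our illustrative encoding; a full check would also consider what the adopted rules force the agent to derive) is:

```python
# A rule is (body, head); head None encodes the falsum consequent ⊥.
# An agent already performing every act in a constraint's body may not
# adopt that constraint.
def may_adopt(rule, performing):
    body, head = rule
    return not (head is None and body <= performing)

performing = {("p", "k"), ("q", "k")}   # subsuming intuition k under p and q
# p(x) ∧ q(x) → ⊥, grounded at k -- the example from the text:
constraint = (frozenset({("p", "k"), ("q", "k")}), None)
harmless   = (frozenset({("p", "k")}), ("r", "k"))   # p(k) → r(k)
```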

Apperception and Rule-Revision
There are, then, two different levels in Kant's theory at which the same normative notions play a key role: there are the subsumptions that may / must be performed, and there are the judgements (i.e. rules) that may / must be adopted. These two levels can come apart: an agent may choose to adopt a rule saying that it must perform a particular subsumption. In this case, there is a sense in which the subsumption is necessary, even though the rule that prompted it need not have been adopted and is, hence, contingent: this word [the copula "is"] designates the relation of the representations to the original apperception and its necessary unity, even if the judgement itself is empirical, hence contingent [B142] This passage brings to the fore a topic that has been latent in much of the preceding text, but which we can only discuss very briefly here insofar as it relates to an important simplification in the present account of Kant's modalities of judgement. The topic is the unity of apperception. The simplification is that we have so far avoided the question of whether, and how, the Kantian agent can reject or revise rules it has previously adopted.
For Kant, unity of apperception and experience are two sides of the same coin. On our interpretation, the Kantian agent "binds" itself in two distinct but related senses when it constructs and applies its rules in constructing experience. First, it binds itself to its rules: it commits to up-holding those rules. Second, it binds itself together: it forms itself into a unity by up-holding its rules. 57 Now, the agent can only do either of these things insofar as it also binds its subsumptions into a unity, since the rules to and by which it binds itself just are procedures for generating subsumptions from subsumptions. And as we have said, if various (meta-) constraints on this activity are satisfied, this rule-bound unity of subsumptions will constitute experience. It is in this way that a unity of consciousness arises alongside and necessarily accompanies that consciousness of unity that is our experience of a coherent, unified external world. Unity of apperception and experience are two sides of the same coin, and both are the upshot of self-legislation.
So how does this relate to rule-revision? Above we assumed that the set R of rules an agent has adopted is fixed. Our prototype computer simulations [16,17] of the Kantian cognitive architecture also make the same assumption. But what if we consider the set of rules to be changing, adding to or removing from R? Clearly some sort of revision will sometimes be required in the light of new information. Indeed, the Kantian agent will always be changing its rules to best account for the stream of sensory data. The pattern is new in every moment, constantly requiring revision in making coherent, unified sense of the on-going stream of sensory perturbations. However, if the Kantian agent can reject or revise any rule whenever it sees fit, then this makes a nonsense of the idea that the agent has previously committed to that rule (and with it, the quoted notion of necessary unity of apperception even in contingency of judgement). In what sense is the agent really bound to its rules, or together into a unity? Kant's notion of spontaneity cannot be a free-for-all: rather, it must be compatible with self-legislation.

56 Note that the formulas given in Fig. 7 are only examples. Certain combinations are proscribed (see Section 6). But, for instance, while our example of a categorical judgement is universal and affirmative, it could just as well have been, say, particular and infinite: p(x) ⇢ ∼q(x).
57 Note that there is also a sense in which the agent binds others (and itself to others) in this way, in that its rules quantify over all intuitions (see Sections 2.2 and 5).
A thorough model of rule revision must answer two questions. First, under what circumstances is the agent permitted to revise a rule? Second, when it is in one of these special circumstances, what is the proper procedure for revision? What are the constraints on acceptable revision? We do not attempt a full answer or a formal implementation here -that is a task for future work. But in brief: first, the agent is permitted to revise a rule when its current rules cannot make sense of its current sensory stimulations; second, the only acceptable revisions of a rule-set are revisions in which all previous subsumptions are still licensed. The end towards which the agent's activity is directed is experience (and apperception): a coherent, unified representation. It cannot revise a rule-set if the new rule-set no longer legitimizes one of the activities it has already performed. (See Section 3.11.)

Expressive Limitations and Open Questions
In this section, we highlight aspects of Kant's logic that are not well captured by our formalization. We have described a logic of conditional imperatives and permissives that formalizes Kant's central notion of a rule. In our interpretation, the relata of rules are acts that do not have a truth-value. In the case of cognition, the constituents of rules are mental acts (specifically, subsumptions). In the case of practical reason, the constituents are physical acts. In both cases, theoretical and practical reason alike, we use the same form of rules, the same semantics, and the same inference rules.
Because our logic is designed to be able to capture what is in common between theoretical and practical reason, inferences which are only valid in theoretical reason (but not in practical reason) are not supported in KL 3 . Consider, for example, the law of excluded middle, which is endorsed in [Jäsche Logic 9:117]. 58 In KL 2 this would be expressed as →p ∨ ∼p. This is not valid in KL 2 (see Section 4.7). Consider, next, the inference from 'All A are B' to 'Some B are A'. This is endorsed in [Jäsche Logic 9:103]. The inference from 'All A are B' to 'Some B are A' follows from two simpler inferences: (i) the inference from 'all A are B' to 'Some A are B' (endorsed in [Jäsche Logic 9:116]), and (ii) the inference from 'Some A are B' to 'Some B are A' (endorsed in [Jäsche Logic 9:118]).
In KL3, 'Some A are B' has two readings. One is permissive: a(x) ⇢ b(x). This rule has no existential import: if you subsume x under a, then feel free to also subsume x under b. The second reading is imperative and has existential import: → ∃x (a(x) ∧ b(x)). In KL3, the inference from 'All A are B' to 'Some A are B' is valid only under the first, permissive, non-existential reading of 'Some A are B'. It is not valid under the existential reading. The second inference (from 'Some A are B' to 'Some B are A') is valid in KL3 under the existential reading, but is not valid under the permissive non-existential reading: a(x) ⇢ b(x) does not entail b(x) ⇢ a(x). When considering practical actions, the inference is also invalid: the conditional "If you spill the drink then you may apologize" does not entail "If you apologize then you may spill the drink." To conclude, under either interpretation of 'some' (the permissive or the existential), the inference from 'All A are B' to 'Some B are A' is not valid in KL3.
Relatedly, Kant endorses the inference from 'All A are B' to 'Some B are not-A' (see [Jäsche Logic 9:103]). This inference is also not supported in KL 3 .
Finally, consider Kant's discussion of embedded judgements. In the first Critique, Kant says that hypothetical and disjunctive judgements "do not contain a relation of concepts but of judgements themselves" [B141] (see also [A73/B98]). However, in our formalization, rules cannot contain rules as constituents.
In our approach, a universal categorical 'all A are B' is translated to the conditional a(x) →b(x). A hypothetical 'if an A is C, then it is also B' is translated to the conditional a(x) ∧ c(x) →b(x). In general, a categorical judgement is translated into a conditional with exactly one antecedent, while a hypothetical judgement is translated into a conditional with more than one antecedent. Note that, in this approach, the translation of the categorical 'all A are B' (a(x) →b(x)) is not a syntactic constituent of the translation of the hypothetical 'if an A is C, then it is also B' (a(x) ∧ c(x) →b(x)).
This approach to categorical and hypothetical judgements is based on Longuenesse [Kant and the Capacity to Judge, p. 103n]. The major difference is that she is analysing judgements using conditionals of classical logic, where the constituents of the conditional are truth-evaluable propositions, while our conditionals relate acts that do not have truth values. 59 In both Longuenesse's approach and in ours, the constituents of hypothetical judgements are not themselves judgements, but are rather more primitive elements (in our case, subsumptions). This is a clear divergence from Kant's explicit pronouncements.
We are currently developing extensions of KL1, KL2, and KL3 to include embedded rules. A rule p → (q → r) means that if you are doing p, then you must adopt the following rule: if you are doing q, then you must do r. A rule (p → q) → r means that if the rules you have adopted jointly commit you to upholding p → q, then you must do r.
Extending the semantics of KL1 to include embedded rules is relatively straightforward: we modify cns of Section 3.2 so that each output is no longer a set of atoms, but a pair consisting of a set of atoms and a set of additional rules that we have adopted. But things get more complicated when embedded rules interact with bound variables. Consider, for example, the rule a(x) → (b(x) → c(x)). Is the x inside the embedded rule b(x) → c(x) bound to the same x in outer scope? Or is it a distinct variable, thus equivalent to a(x) → (b(y) → c(y))? If we take the first option, then the subsumption a(k) together with the rule a(x) → (b(x) → c(x)) jointly commit us to adopting the ground rule b(k) → c(k); this option requires a thorough understanding of how ground rules interact with quantified rules. If we take the second option, then the "currying" principle p ∧ q → r |= p → (q → r) is invalid, since a(x) ∧ b(x) → c(x) does not entail a(x) → (b(x) → c(x)) = a(x) → (b(y) → c(y)). There is much further work to do here.

Conclusion
The Kantian agent is a self-legislating rule-induction system. It makes sense of its sensory perturbations by spontaneously constructing and applying rules. If this activity satisfies various constraints, the agent achieves experience: it has constructed a coherent, unified representation of a coherent, unified external world. We have defined a logic of conditional imperatives and permissives that was designed as a formalization of Kant's conception of rules relating acts that do not have truth-values. At its heart are the normative notions captured by conditional imperatives and permissives, rather than the notion of truth.
In this paper, we showed how the rules formalized in our logic have structural features that correspond precisely to those displayed in Kant's Table of Judgements. We also explained how this logic handles the major deontic paradoxes, how it differs from related logics, and how it translates natural language sentences. Of course our claim has not been that Kant had this precise logic in mind, but rather that it is based on, compatible with, and helps to explain part of Kant's view in the Critique of Pure Reason and associated texts.
This paper is part of a larger project to realize Kant's cognitive architecture in the medium of computation. 60 This wider project includes a number of other key components. First, we must give a formalizable account of the meta-rules governing the agent's construction of experience. Second, as noted above (Section 6), we must give an account of rule-revision. And third, we must extend our analysis to the personal level to account for the agent's consciously guided rule-based activity in deciding what to do. In this paper, we have mostly focused on theoretical reason, where the constituents of rules are private mental acts (subsumptions), rather than practical reason, where the constituents of rules are public physical acts. In future work, we plan to extend our account to practical reason.
For left-to-right: suppose out_1(R, A) = ∅. First observe that out_1(R ∪ {A → ⊥}, A′) ⊆ out_1(R, A′) for all A′. (Because if X ∈ out_1(R ∪ {A → ⊥}, A′) then X is computed from assumptions A′ using the non-constraint rules of R and X |= R ∪ {A → ⊥}. Since X |= R, that means X ∈ out_1(R, A′) also.) It remains to show that out_1(R, A′) ⊆ out_1(R ∪ {A → ⊥}, A′) for all A′. Assume X ∈ out_1(R, A′); we shall show X ∈ out_1(R ∪ {A → ⊥}, A′). Since X ∈ out_1(R, A′), there is a D in def(R) such that X = M(D, A′). We will prove, first, that D is one of the definite programs of R ∪ {A → ⊥}, and second, that X |= R ∪ {A → ⊥}. First, since def(A → ⊥) = {∅}, D ∈ def(R ∪ {A → ⊥}). So D, as well as being one of the definite programs of R, is also one of the definite programs of R ∪ {A → ⊥}. Second, since X ∈ out_1(R, A′), X |= R. We just need to show X |= A → ⊥. Since out_1(R, A) = ∅, A is a violating set of R by Proposition 2. Now, since X |= R, we cannot have A ⊆ X, hence X |= A → ⊥. These two claims entail, using Proposition 1, that X ∈ out_1(R ∪ {A → ⊥}, A′). For the other direction: suppose out_1(R, A) ≠ ∅. We need to show that there is some A′ such that out_1(R ∪ {A → ⊥}, A′) ≠ out_1(R, A′). Take A′ = A: clearly out_1(R ∪ {A → ⊥}, A) = ∅.
set of R. We will prove that if out_1(R ∪ C ∪ aux(R), X) = ∅ then X is a violating set of R. Consider any definite logic program D_R in the encoding def(R) of R. Let def(aux(R)) = {D_aux} (all rules in aux(R) are rules with singleton heads, and so there is a single definite program encoding aux(R)). M(D_R ∪ D_aux, X) |= aux(R), so it must be that M(D_R ∪ D_aux, X) ⊭ R ∪ C, i.e., either M(D_R ∪ D_aux, X) is inconsistent or B ⊆ M(D_R ∪ D_aux, X) for some rule B → ⊥ in R. If X is inconsistent then X is a violating set of R. If X is consistent then consider any maximal consistent X_m ⊇ X. M(D_R ∪ D_aux, X_m) ⊇ M(D_R ∪ D_aux, X), and since X_m is maximal, X_m ⊇ M(D_R ∪ D_aux, X_m) ⊇ M(D_R ∪ D_aux, X). If M(D_R ∪ D_aux, X) is inconsistent then so is X_m, and that cannot be. So B ⊆ M(D_R ∪ D_aux, X) for some rule B → ⊥ in R. But then B ⊆ X_m, and X_m ⊭ R.

Proposition 11
Let R be a set of rules and A a set of literals.
Proof The result follows from the previous minimality result. It is enough to consider a singleton set of assumptions A = {a}. The general result follows by repeated application. If R ∪ {a → ⊥} is strongly inconsistent, the result holds trivially. Suppose it is not strongly inconsistent. We need to show that aux_m(R ∪ {a → ⊥}) = aux_m(R) ∪ {a → ⊥}. Clearly {a} and all violating sets of R are violating sets of R ∪ {a → ⊥}. Further, right(R) ⊆ right(R ∪ {a → ⊥}). So aux(R) ∪ {a → ⊥} ⊆ aux(R ∪ {a → ⊥}). Now (assuming R ∪ {a → ⊥} is not strongly inconsistent) {a} is a minimal consistent violating set of R ∪ {a → ⊥}. If X is a minimal consistent violating set of R and a ∈ X, then X is not a minimal consistent violating set of R ∪ {a → ⊥}; if a ∉ X, then X is a minimal consistent violating set of R ∪ {a → ⊥}. So aux_m(R ∪ {a → ⊥}) = aux_m(R) ∪ {a → ⊥}.
Proposition 15 (Soundness of KL2) For all sets R of rules: Proof We need to show soundness of the inference rules of KL1, ∼-LEFT and ∼-RIGHT, with respect to semantic entailment in KL2. Since kl_2(R) = kl_1(R ∪ C ∪ aux(R)) (Proposition 6) and KL1 is sound with respect to kl_1, the soundness of the KL1 inference rules is immediate. Soundness of ∼-LEFT is just C ⊆ kl_2(R), which also follows trivially. It remains to show that ∼-RIGHT is sound: if r ∈ right(R) then R |= KL2 r, or more generally, that right(R) ⊆ kl_2(R). We will show that right(R) ⊆ aux(R) (which implies the above). In full: if r ∈ right(R) then r is a rule of the form … where … is a rule in R (whether or not the set A of assumptions contains negative literals). If A contains no negative literals, then this least model also satisfies the constraints C.