Abstract
In this paper we study the theory Q. We prove a basic result that says that, in a sense explained in the paper, Q can be split into two parts. We prove some consequences of this result. (i) Q is not a poly-pair theory. This means that, in a strong sense, pairing cannot be defined in Q. (ii) Q does not have the Pudlák Property. This means that there are two interpretations of \(\mathsf{S}^1_2\) in Q which do not have a definably isomorphic cut. (iii) Q is not sententially equivalent with \(\mathsf{PA}^-\). This tells us that we cannot do much better than mutual faithful interpretability as a measure of sameness of Q and \(\mathsf{PA}^-\). We briefly consider the idea of characterizing Q as the minimal-in-some-sense theory of some kind modulo some equivalence relation. We show that at least one possible road towards this aim is closed.
1 Prelude
Robinson’s Arithmetic Q is an old friend. The first time I met it was when I studied the First Incompleteness Theorem in Boolos and Jeffrey’s wonderful book Computability and Logic.Footnote 1 The choice of Q and its smaller brother R to prove the First Incompleteness Theorem is beautiful, since these theories seem to be about as weak as one can get to prove this result.Footnote 2 On the other hand, it is a bit awkward to go on from there to prove the Second Incompleteness Theorem, where we need internal verification of some principles. Computability and Logic also contains nice exercises illustrating how easy it is to manufacture various counter-models to show non-provability in Q.
Later I was pleased to discover that there are—in a sense—two proofs of the Second Incompleteness Theorem for Q itself. There is a truly beautiful result by Bezboruah and Shepherdson (1976). They show that, under some very reasonable assumptions, the theory \(\mathsf{PA}^-\) does not prove the consistency of any theory. Hence, a fortiori, Q does not prove its own consistency. Bezboruah and Shepherdson’s proof does depend on rather specific assumptions about the coding. Also it does not generalize to stronger theories. What is more, it tells us nothing about the question whether Q can prove its consistency on some definable cut. Later, Pavel Pudlák proved a strong version of the Second Incompleteness Theorem (see Pudlák 1985; Hájek and Pudlák 1993) that I would formulate as follows. Let U be any consistent recursively enumerable theory and let N be an interpretation of Q in U. Then, U does not prove its own consistency relativized to N. It seems to me that one should say that Bezboruah and Shepherdson on the one hand and Pudlák on the other proved quite different results which share some consequences. I always thought that the two proofs should provide a good case study for a philosophical enquiry into the problem of theorem individuation.
Still later I read Tarski et al. (1953) which proves the essential undecidability of Q. And after that there was also Nelson (1986) where the possibilities for bootstrapping in Q are used with impressive, almost magical, results.
The present paper is my tribute to Q. As is fitting among friends, I will not only praise it, but also discuss, in a respectful way, some of its weaknesses.
2 Introduction
The theory Q was introduced by Robinson (1950). He introduced it as a simplification of an earlier finitely axiomatized, essentially undecidable theory due to Mostowski and Tarski (1949). The system became widely known via the book Undecidable Theories by Tarski et al. (1953). It is given by the following axioms:
-
Q1. \(\vdash \mathsf{S}x = \mathsf{S}y \rightarrow x=y\)
-
Q2. \(\vdash \mathsf{S}x \ne 0\)
-
Q3. \(\vdash x=0 \vee \exists y\quad x=\mathsf{S}y\)
-
Q4. \(\vdash x+0=x\)
-
Q5. \(\vdash x+\mathsf{S}y = \mathsf{S}(x+y)\)
-
Q6. \(\vdash x\cdot 0 = 0\)
-
Q7. \(\vdash x\cdot \mathsf{S}y = x\cdot y + x\)
Robinson shows that Q is essentially undecidable, which tells us that any consistent theory that interprets Q is undecidable. Since Q is finitely axiomatized, it follows that any theory that weakly interprets Q,Footnote 3 in other words any theory that is consistent with some translation of Q, is undecidable.
The theory Q is, in many senses, a natural theory. What would be the quintessential weak arithmetic? Well, we want to have at least the basic properties of zero and the successor function: zero is not a successor and the successor relation is total, functional and injective. Then, we want the recursion equations for addition and multiplication. However, the resulting theory is a subtheory of the theory of the non-negative reals. Thus, it has a decidable extension. So, the theory cannot binumerate the recursive functions. We can repair that by adding the axiom that zero and successor are jointly surjective. This final axiom seems to have a certain harmony with the other successor axioms: we have axioms Q1 and Q2 to state that zero and successor are jointly injective and we have an axiom, to wit Q3, that articulates that they are jointly surjective. Thus, from the standpoint of motivation of the axioms, Q seems to be a well-balanced theory.
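The role of Q3 can be made concrete with a small computation. The sketch below is only an illustration: it uses a finite sample of non-negative rationals as an exact stand-in for the non-negative reals, and the names `SAMPLE`, `holds_without_q3` and `q3_holds_at` are hypothetical helpers introduced here. It checks Q1, Q2 and Q4 to Q7 on the sample and shows that Q3 fails at 1/2.

```python
from fractions import Fraction
from itertools import product

# Finite sample of non-negative rationals (an assumption made for exactness;
# the text speaks of the non-negative reals).
SAMPLE = [Fraction(n, d) for n in range(7) for d in (1, 2, 3)]

def S(x):
    """Successor, interpreted as adding one."""
    return x + 1

def holds_without_q3(sample):
    """Check Q1, Q2, Q4, Q5, Q6 and Q7 on every pair from the sample."""
    for x, y in product(sample, repeat=2):
        if S(x) == S(y) and x != y:                  # Q1
            return False
        if S(x) == 0:                                # Q2
            return False
        if x + 0 != x or x + S(y) != S(x + y):       # Q4, Q5
            return False
        if x * 0 != 0 or x * S(y) != x * y + x:      # Q6, Q7
            return False
    return True

def q3_holds_at(x):
    """Q3 at x: x is zero or the successor of a non-negative number."""
    return x == 0 or x - 1 >= 0

print(holds_without_q3(SAMPLE))        # the sample satisfies Q1, Q2, Q4-Q7
print(q3_holds_at(Fraction(1, 2)))     # 1/2 is neither zero nor a successor
```

A finite sample can of course only refute, never establish, a universal axiom; here the identities hold for all non-negative rationals by elementary algebra, and the sample merely makes that visible.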
Clearly, Q is very weak. However, if we are just interested in the property of essential undecidability, we can go much weaker. There is the theory R, also introduced in Tarski et al. (1953). Here are the axioms of R, where underlining stands for the usual unary numeral function.
-
R1. \(\vdash {\underline{m}} + {\underline{n}} = {\underline{m+n}}\)
-
R2. \(\vdash {\underline{m}} \cdot {\underline{n}} = {\underline{m\cdot n}}\)
-
R3. \(\vdash {\underline{m}} \ne {\underline{n}} \), for \(m\ne n\)
-
R4. \(\vdash x\le {\underline{n}} \rightarrow \bigvee _{i\le n} x= {\underline{i}}\)
-
R5. \(\vdash x\le {\underline{n}} \vee {\underline{n}} \le x\)
This theory is not only essentially undecidable but also has the property that any theory that weakly interprets it is undecidable. See Vaught (1962). Since we build up the needed machinery anyway, we will provide a proof of this last result in Sect. 3.
We can even consider weaker theories than R. See Vaught (1962) and Jones and Shepherdson (1983). See also Visser (2014) for more information.
The theory R is not finitely axiomatizable. Moreover, every finitely axiomatized subtheory of it is finitely satisfiable and, hence, has a decidable extension. So perhaps Q is the weakest finitely axiomatized theory that is essentially undecidable and, say, extends R? Alas, no such luck. We refer the reader to Vítězslav Švejdar’s paper (Švejdar 2007), where finitely axiomatized systems weaker than Q are studied. These systems do not extend R, but with a minor modification they do: we have to set the partial functions in Švejdar’s paper to some default value where they are not defined.
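The finite satisfiability of finite fragments of R can be illustrated as follows. The sketch assumes one particular finite structure, namely \(\{0,\ldots ,N\}\) with addition and multiplication cut off at N and with the usual order; the constants `N` and `K` and the function names are illustrative choices, not notation from the literature. It checks all instances of R1 to R5 with \(m,n \le K\).

```python
N = 16   # top element of the finite structure {0, ..., N}
K = 4    # we check the axiom instances with m, n <= K (note K * K <= N)

def add(x, y):
    """Addition cut off at the top element."""
    return min(x + y, N)

def mul(x, y):
    """Multiplication cut off at the top element."""
    return min(x * y, N)

def numeral(n):
    """Value of the unary numeral S^n(0), with successor cut off at N."""
    value = 0
    for _ in range(n):
        value = add(value, 1)
    return value

def fragment_holds():
    """Check the instances of R1-R5 for all m, n <= K."""
    for m in range(K + 1):
        for n in range(K + 1):
            if add(numeral(m), numeral(n)) != numeral(m + n):   # R1
                return False
            if mul(numeral(m), numeral(n)) != numeral(m * n):   # R2
                return False
            if m != n and numeral(m) == numeral(n):             # R3
                return False
    for n in range(K + 1):
        below = [numeral(i) for i in range(n + 1)]
        for x in range(N + 1):
            if x <= numeral(n) and x not in below:              # R4
                return False
            if not (x <= numeral(n) or numeral(n) <= x):        # R5
                return False
    return True

print(fragment_holds())
print(numeral(N) == numeral(N + 1))   # R3 breaks down beyond the top element
```

The second line printed shows why no single finite structure of this kind satisfies all of R: the numerals for N and N+1 collapse, so the instance of R3 for that pair fails.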
The systems studied by Švejdar are mutually interpretable with Q, so perhaps Q is still minimal, among finitely axiomatized theories, with respect to interpretability? Not so, for any finitely axiomatized subtheory A of Q that extends R, we can find a finitely axiomatized subtheory B of A that extends R and such that B does not interpret A. This can be shown by a minor adaptation of the methods of Friedman (2007). We show how to do this in Sect. 3. So, the prospect of characterizing Q, or some closely related theory, as the weakest (in a suitable sense) finitely axiomatized theory with such and such a property, seems pretty dim.
A quite different and beautiful feature of Q is that it interprets fairly strong theories like \(I\varDelta _0+\varOmega _1\) on a definable cut. It follows from this that we have the second incompleteness theorem for all extensions of Q. This feature also holds for the still weaker theories studied by Švejdar that interpret Q on a definable cut. Regrettably, this does not seem to help us with the characterization problem.
From one perspective, Q seems rather natural, from another it does not. It lacks many desirable properties. As has been shown in Visser (2008) and Jeřábek (2012), Q is not a pair theory. In Sect. 5 of this paper we will show that it is not even a poly-pair theory. We will also show, in Sect. 6, that Q does not have the Pudlák property. The negation of the Pudlák property for Q tells us that there are two interpretations of \(\mathsf{S}^1_2\) in Q that do not verifiably have definably isomorphic cuts.
It is interesting to compare Q with its bigger brother \(\mathsf{PA}^{-}\). The theory \(\mathsf{PA}^{-}\) is the theory of commutative, discretely ordered semi-rings with a minimal element plus the subtraction axiom (\(\mathsf{PA}^{-}14\) below). It is employed as the basic arithmetic, e.g. in the textbook (Kaye 1991). The theory is given by the following axioms:
-
\(\mathsf{PA}^{-}1. \vdash x+0=x\)
-
\(\mathsf{PA}^{-}2. \vdash x+y = y+x\)
-
\(\mathsf{PA}^{-}3. \vdash (x+y)+z = x+(y+z)\)
-
\(\mathsf{PA}^{-}4. \vdash x\cdot 1 = x\)
-
\(\mathsf{PA}^{-}5. \vdash x\cdot y = y\cdot x\)
-
\(\mathsf{PA}^{-}6. \vdash (x\cdot y)\cdot z = x\cdot (y\cdot z)\)
-
\(\mathsf{PA}^{-}7. \vdash x\cdot (y+z) = x\cdot y + x\cdot z\)
-
\(\mathsf{PA}^{-}8. \vdash x \le y \vee y \le x\)
-
\(\mathsf{PA}^{-}9. \vdash (x\le y \wedge y \le z) \rightarrow x\le z\)
-
\(\mathsf{PA}^{-}10. \vdash x+1\not \le x\)
-
\(\mathsf{PA}^{-}11. \vdash x \le y \rightarrow (x=y \vee x+1\le y)\)
-
\(\mathsf{PA}^{-}12. \vdash x \le y \rightarrow x+z\le y+z\)
-
\(\mathsf{PA}^{-}13. \vdash x\le y \rightarrow x\cdot z \le y\cdot z\)
-
\(\mathsf{PA}^{-}14. \vdash x\le y \rightarrow \exists z \;\; x+z=y\)
Emil Jeřábek’s version in Jeřábek (2012) does not have the subtraction axiom \(\mathsf{PA}^{-}14\). Thus Jeřábek’s version is a universal theory. As noted by Jeřábek, his version interprets the stronger version with the subtraction axiom on a definable cut. The weak version does not extend Q but the strong version does.
Emil Jeřábek has shown (Jeřábek 2012) that \(\mathsf{PA}^-\) is sequential, even in the weaker form without the subtraction axiom.
Bezboruah and Shepherdson (1976) show that, under some very reasonable assumptions, \(\mathsf{PA}^-\) does not prove the consistency of any theory.
Victor Pambuccian studies number theoretical theorems over \(\mathsf{PA}^{-}\). See his papers (Pambuccian 2008, 2014, 2015).
The theory Q interprets much stronger theories than \(\mathsf{PA}^{-}\), like \(I\varDelta _0 + \varOmega _1\) (see, e.g. Hájek and Pudlák 1993). Hence, a fortiori, Q is mutually interpretable with \(\mathsf{PA}^-\) (both the strong and the weak version). Using ideas of Per Lindström, one may show that Q is even mutually faithfully interpretable with \(\mathsf{PA}^-\). One can also demonstrate that the Lindenbaum algebras of Q and \(\mathsf{PA}^-\) are recursively isomorphic. This means that there is a recursive function of sentences that induces an isomorphism of Lindenbaum algebras. The result is a special case of the theorem of Marian Pour-El and Saul Kripke that the Lindenbaum algebras of all recursively enumerable theories that interpret Q are recursively isomorphic. See Pour-El and Kripke (1967).
Are these the best samenesses that we can get between these theories? In Sect. 7, we will show that the two theories are not sententially congruent. So, at least in terms of traditional notions of sameness, we cannot do better than recursive isomorphism of Lindenbaum algebras on the one hand, and mutual faithful interpretability on the other.
The main technical tool of the paper is a theorem that tells us that, in a sense, Q can be split into two disjoint parts. This result is proved in Sect. 4. The proof is an adaptation of an earlier result in Visser (2014). One might say that the progress of the present paper is to provide a better understanding of what the result of Visser (2014) really means.
We end the paper with some concluding remarks in Sect. 8.
2.1 How to read the paper
In the appendices, I present basic materials needed for understanding the paper. In the main text there are references to the appendices when needed. Section 3 can be read independently of the other sections. Section 4 is the basic preliminary for Sects. 5–7. Sections 5–7 are pairwise independent.
3 Between R and Q
In this section, we endeavour to make the idea of characterizing Q using an appropriate minimality claim less plausible. Perhaps it is better to say: if there is a characterization of Q as the minimal theory such that ..., then it cannot take such and such a form. Specifically, we show that, for any finitely axiomatised consistent theory A such that \(\mathsf{R} \subseteq A\), there is a finitely axiomatised B such that \(\mathsf{R} \subseteq B \subseteq A\) and \(B \mathrel {\not \! \rhd }A\).Footnote 4 The result is just a rather direct application of ideas from Friedman (2007). So, we do not claim great originality here.
The following nice version of the theory of a number was developed by Johannes Marti, Nal Kalchbrenner, Paula Henk and Peter Fritz in Interpretability Project Report of 2011, the report of a project they did under my guidance in the Master of Logic in Amsterdam.Footnote 5 \({}^{,}\) Footnote 6 We call it the theory of a number since, in our intended applications, the fact that it is satisfied by the structure associated with a finite, non-zero ordinal is central.
-
TN1. \(\vdash x \not < 0\)
-
TN2. \(\vdash (x< y \wedge y<z ) \rightarrow x < z\)
-
TN3. \(\vdash x< y \vee x= y \vee y < x\)
-
TN4. \(\vdash x=0 \vee \exists y\quad x=\mathsf{S}y\)
-
TN5. \(\vdash \mathsf{S}x \not < x\)
-
TN6. \(\vdash x<y \rightarrow (x<\mathsf{S}x \wedge y \not < \mathsf{S}x)\)
-
TN7. \(\vdash x+0=x\)
-
TN8. \(\vdash x+\mathsf{S}y =\mathsf{S}(x+y)\)
-
TN9. \(\vdash x \cdot 0 = 0\)
-
TN10. \(\vdash x\cdot \mathsf{S}y = x\cdot y + x\)
Since \(\mathsf{TN}6\) implies \(x \not < x\), a model of TN is a linear ordering that either represents a finite ordinal or starts with a copy of \(\omega \).
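As a sanity check on the axioms, the following sketch verifies TN1 to TN10 in small finite structures. It rests on an assumption about the intended structure that is not spelled out in the text: the structure associated with a finite non-zero ordinal n is taken to have domain \(\{0,\ldots ,n-1\}\), with successor, addition and multiplication cut off at the top element. TN7 is read as the base clause \(x+0=x\) of the recursion for addition. The function name `tn_holds` is a hypothetical helper.

```python
from itertools import product

def tn_holds(n):
    """Check TN1-TN10 in {0, ..., n-1} with operations cut off at the top."""
    top = n - 1
    dom = range(n)

    def S(x):
        return min(x + 1, top)

    def add(x, y):
        return min(x + y, top)

    def mul(x, y):
        return min(x * y, top)

    for x, y, z in product(dom, repeat=3):
        checks = [
            not x < 0,                                     # TN1
            not (x < y and y < z) or x < z,                # TN2
            x < y or x == y or y < x,                      # TN3
            x == 0 or any(S(w) == x for w in dom),         # TN4
            not S(x) < x,                                  # TN5
            not x < y or (x < S(x) and not y < S(x)),      # TN6
            add(x, 0) == x,                                # TN7, read as x + 0 = x
            add(x, S(y)) == S(add(x, y)),                  # TN8
            mul(x, 0) == 0,                                # TN9
            mul(x, S(y)) == add(mul(x, y), x),             # TN10
        ]
        if not all(checks):
            return False
    return True

print(all(tn_holds(n) for n in range(1, 7)))
```

Under this reading the top element is its own successor, which is compatible with TN5 and TN6 because TN6 only constrains elements that have something strictly above them.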
We call a \(\varDelta _0\)-formula pure if (i) all bounding terms are variables and (ii) all occurrences of terms are in subformulas of the form \(\mathsf{S}x=y\), \(x+y=z\) and \(x \cdot y = z\). We call a \(\varSigma _1\)-sentence pure if it is of the form \(\exists \mathbf {x}\, S_0\mathbf {x}\), where \(S_0\) is a pure \(\varDelta _0\)-formula.
We can transform an arbitrary \(\varSigma _1\)-sentence S into a pure \(\varSigma _1\)-sentence \(S^\circ \), for example, in the following way. We start with S. We treat bounded quantifiers for the moment as if they were given with the language and not defined. First we replace all implications \((A \rightarrow B)\) by \((\lnot A \vee B)\) and all bi-implications by \(((\lnot A \vee B) \wedge (\lnot B \vee A))\). Next we push all negations inside in the usual manner. We replace:
-
\(\forall x < t\) by \(\exists z\quad (t = z \wedge \forall x< z \ldots )\),
-
\(\exists x < t\) by \(\exists z\quad (t = z \wedge \exists x< z \ldots )\),
-
\(\lnot t_0 = t_1\) by \(\exists z \exists w\quad (t_0 =z \wedge t_1 = w \wedge \lnot z = w)\),
-
\(\lnot t_0 < t_1\) by \(\exists z \exists w\quad ( t_0 = z \wedge t_1 = w \wedge \lnot z < w)\).
In this way all term occurrences are on positive places. At this point we apply the usual term-unwinding algorithm to our formula using a small scope interpretation. We note that this will translate an atomic formula to a block of existential quantifiers followed by a Boolean combination of atomic formulas. Finally, we bring all unbounded existential quantifiers to the front in the usual manner replacing, e.g. \(\forall x< y \exists z \ldots \) by \(\exists w \forall x<y \exists z <w \ldots \). The resulting formula is \(S^\circ \), which is clearly pure and equivalent to the original formula (say, over \(\mathsf{PA}^{-}\) plus \(\varSigma _1\)-collection).Footnote 7
We note that the transformation \(S \mapsto S^\circ \) that we described is clearly elementary. Hence it exists in EA. Inspecting the transformation, we see that \(S^\circ \) implies S in predicate logic.
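The term-unwinding step of the transformation can be sketched for a single equation. The code below is a minimal illustration under several assumptions: terms are encoded as nested tuples built from variable names with the operators `"S"`, `"+"` and `"*"`, the fresh variables are called `v0`, `v1`, and so on (so the input should not use such names), and neither the constant 0 nor the treatment of bounded quantifiers and negations is covered. The function names are hypothetical.

```python
def unwind(term, atoms):
    """Return a variable naming `term`; record one pure atom per operator."""
    if isinstance(term, str):          # a variable names itself
        return term
    op, *args = term
    names = [unwind(arg, atoms) for arg in args]
    fresh = f"v{len(atoms)}"           # one atom per compound subterm so far
    if op == "S":
        text = f"S{names[0]}={fresh}"
    elif op in ("+", "*"):
        text = f"{names[0]}{op}{names[1]}={fresh}"
    else:
        raise ValueError(f"unknown operator: {op}")
    atoms.append((fresh, text))
    return fresh

def purify_equation(lhs, rhs):
    """Rewrite lhs = rhs as existential quantifiers over pure atoms."""
    atoms = []
    left = unwind(lhs, atoms)
    right = unwind(rhs, atoms)
    prefix = "".join(f"exists {var} " for var, _ in atoms)
    body = " & ".join([text for _, text in atoms] + [f"{left}={right}"])
    return f"{prefix}({body})"

# The equation Sx + y = x * z
print(purify_equation(("+", ("S", "x"), "y"), ("*", "x", "z")))
# exists v0 exists v1 exists v2 (Sx=v0 & v0+y=v1 & x*z=v2 & v1=v2)
```

Every operator occurrence ends up in an atom of one of the three shapes allowed in a pure formula, and the new quantifiers are existential, which is why the step stays within \(\varSigma _1\) once the term occurrences have been made positive.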
Let \(S := \exists \mathbf {y} S_0\mathbf {y}\), where \(S_0\) is a pure \(\varDelta _0\)-formula. We define:
Using the machinery of theories of a number, we can reprove Cobham’s result that any recursively enumerable theory that weakly interprets R is undecidable. Suppose U is recursively enumerable and \(U + \mathsf{R}^\tau \) is consistent. Consider a pure \(\varSigma _1\)-sentence S. Let \(S^\star \) be the sentence that says:
Since [S] is finitely axiomatised, we can write out \(S^\star \) in the obvious way. Consider the set \({\mathcal {S}}\) of all S such that \(U+(\mathsf{R} + S^\star )^\tau \) is consistent. Clearly, \({\mathcal {S}}\) contains all true (pure) \(\varSigma _1\)-sentences. If \({\mathcal {S}}\) did not contain false (pure) \(\varSigma _1\)-sentences, then this would make \(\varSigma _1\)-truth decidable. Hence, there is a false (pure) \(\varSigma _1\)-sentence \(S_1\) such that \(U+(\mathsf{R} + S_1^\star )^\tau \) is consistent. We can use \(S_1\) to build a translation \(\tau _0\) such that \(U+[S_1]^{\tau _0}\) is consistent. Since \([S_1]\) is a finitely axiomatised extension of R, it follows that U is undecidable.
To prove our main result we need the following result that was first verified in detail in the Interpretability Project Report by Marti, Kalchbrenner, Henk and Fritz. The basic idea behind the result is present in Friedman (2007).
We remind the reader of witness comparison notation. Suppose A is of the form \(\exists x A_0(x)\) and B is of the form \(\exists y B_0(y)\). We define:
-
\(A<B := \exists x (A_0(x) \wedge \forall y \le x \lnot B_0(y))\).
-
\(A \le B := \exists x (A_0(x) \wedge \forall y < x \lnot B_0(y))\).
-
If C is \(A< B\), then \(C^\bot \) is \(B \le A\).
-
If D is \(A \le B\), then \(D^\bot \) is \(B < A\).
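Witness comparison is easy to experiment with when the matrices are decidable predicates on the natural numbers. The sketch below is an illustration only: the search is cut off at a `bound`, which is an assumption made to keep the computation finite, and the predicates `A0` and `B0` are arbitrary examples. It shows that, when at least one of the two sentences has a witness below the bound, exactly one of \(A<B\) and its dual \(B \le A\) holds.

```python
def wc_less(A0, B0, bound):
    """A < B: some x < bound satisfies A0 and no y <= x satisfies B0."""
    return any(
        A0(x) and not any(B0(y) for y in range(x + 1))
        for x in range(bound)
    )

def wc_leq(A0, B0, bound):
    """A <= B: some x < bound satisfies A0 and no y < x satisfies B0."""
    return any(
        A0(x) and not any(B0(y) for y in range(x))
        for x in range(bound)
    )

def A0(x):
    return x * x == 49       # least witness: 7

def B0(y):
    return y > 10            # least witness: 11

print(wc_less(A0, B0, 100))  # A is witnessed strictly before B
print(wc_leq(B0, A0, 100))   # the dual of A < B
print(wc_leq(A0, A0, 100), wc_less(A0, A0, 100))
```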
Theorem 1
Let S and \(S^{\prime }\) be pure \(\varSigma _1\)-sentences. We have:
-
(a)
Suppose S is true. Then, if we allow piecewise interpretations, we have \(\top \rhd [S]\).
If we do not allow piecewise interpretations, we still have \((\exists x \exists y x\ne y) \rhd [S]\).
-
(b)
If \(S \le S^{\prime }\), then \([S^{\prime }] \vdash S\).
Proof
Ad (a): We note that if S is true, then [S] has a finite model. Any theory with a finite model is interpretable with a piecewise interpretation in predicate logic. If we do not allow piecewise interpretations, we can obtain the same effect using a multidimensional interpretation, assuming that we have at least two distinct elements.
Ad (b). Consider any model \({\mathcal {M}}\) of \([S^{\prime }]\). Without loss of generality we can identify the initial elements of \({\mathcal {M}}\) with \(0, 1, 2, \ldots \) It is easily shown that for any pure \(\varDelta _0\)-formula \(A\mathbf {x}\) we have \(A\mathbf {n}\) is true iff \({\mathcal {M}} \models A \mathbf {n}\), provided the \(\mathbf {n}\) are natural numbers in \({\mathcal {M}}\) and are not the top elements.Footnote 8 Suppose m is the smallest witness of S. Then we have \({\mathcal {M}} \models \lnot S_0^{\prime } \mathbf {k}\), for all \(\mathbf {k} < m\). Since \(S^{\prime }\) is witnessed by a non-top element, m is non-top and we have \({\mathcal {M}} \models S_0m\), and hence \({\mathcal {M}} \models S\). \(\square \)
We now have the materials to prove the main theorem of the present section.
Theorem 2
Suppose \(\mathsf{R} \subseteq A\), where A is finitely axiomatized and consistent. Then, there is a finitely axiomatized B such that \(\mathsf{R} \subseteq B \subseteq A\) and \(B \mathrel {\not \! \rhd }A\).
Proof
Suppose \(\mathsf{R} \subseteq A\), where A is finitely axiomatized and consistent. By the Gödel Fixed Point Lemma, we define R with:
Suppose \([R^\circ ] \rhd A\). It follows that either R or \(R^\bot \).
In the first case, we find \(R^\circ \). By Theorem 1(a), we find that \(\top \rhd [R^\circ ]\) and, hence, that \(\top \rhd A\). Quod non: any theory interpretable in predicate logic has a finite model, whereas A extends R and R has no finite models.
In the second case, it follows that \((R^\bot )^\circ \) and \(\lnot R^\circ \). Hence, \((R^\bot )^\circ \le R^\circ \) and, hence, by Theorem 1(b), \([R^\circ ] \vdash (R^\bot )^\circ \). Since \(R^\circ \) implies R in predicate logic and similarly for \((R^\bot )^\circ \) and \(R^\bot \), we find that \([R^\circ ]\) proves both R and \(R^\bot \). So, \([R^\circ ] \vdash \bot \). From \(R^\bot \), we also have \(A \rhd [R^\circ ]\), thus it follows that \(A\rhd \bot \). Quod non, since A is consistent.
We may conclude that \([R^\circ ] \mathrel {\not \! \rhd }A\) and hence that \((\bigwedge [R^\circ ] \vee A) \mathrel {\not \! \rhd }A\). Since R implies \([R^\circ ] \rhd A\), it also follows that R is false. Thus, \([R^\circ ]\) extends R. Let \(B := (\bigwedge [R^\circ ] \vee A) \). We find: \(\mathsf{R} \subseteq B \subseteq A\) and \(B\mathrel {\not \! \rhd }A\). \(\square \)
4 Decomposition of Q
The theory Q has many unexpected extensions. We develop one here that is especially useful for proving negative results about Q. We will use it in the subsequent sections. The construction is an adaptation of a result of Visser (2014). One could say that only in the present paper the full meaning of the construction of Visser (2014) is unfolded.
The theory \(\mathsf{Q}^\#\) in the language of Q is axiomatized by the following principles.
-
Q
-
\(\forall x \forall y ((\mathsf{S}x \ne x \wedge \mathsf{S}y \ne y ) \rightarrow \mathsf{S}(x+y) \ne x+y)\)
-
\(\forall x \forall y ((\mathsf{S}x \ne x \wedge \mathsf{S}y \ne y ) \rightarrow \mathsf{S}(x \cdot y) \ne x \cdot y)\)
-
\(\exists a \forall x ( x+a = a \wedge a+x = a)\)
It is easy to see that this a is unique. We call it \(\infty \).
-
\(\exists x (\mathsf{S}x=x \wedge x \ne \infty )\)
-
\(\forall x \forall y ((x \ne \mathsf{S}x \wedge y = \mathsf{S}y ) \rightarrow (x +y = y \wedge y + x = y))\)
-
\(\forall x \forall y ((x = \mathsf{S}x \wedge y = \mathsf{S}y ) \rightarrow (x +y = y \vee x + y = \infty ))\)
-
\(\forall x \forall y (y = \mathsf{S}y \rightarrow x \cdot y = \infty )\)
-
\(\forall x \forall y ((x = \mathsf{S}x \wedge y \ne \mathsf{S}y) \rightarrow x \cdot \mathsf{SS}y = x+x)\)
Theorem 3 below will have the immediate consequence that \(\mathsf{Q}^\#\) is consistent.
Let \(\mathsf{Q}^\circ \) be Q plus the axiom \(\forall x \mathsf{S}x \ne x\). Let \(\mathsf{CQC}^2\) be predicate logic with identity and one binary predicate symbol R. Let \({\mathbbm {1}}\) be the theory in the language of identity that states that there is at most one object. The operation \(\boxplus \) is defined and discussed in the “Sums” section of Appendix 1. The notion of synonymy is defined in the “Provable equivalence of interpretations” section of Appendix 1. We have:
Theorem 3
The theory \(\mathsf{Q}^\#\) is synonymous to the theory \(\mathsf{Y} := \mathsf{Q}^\circ \boxplus \mathsf{CQC}^2 \boxplus {\mathbbm {1}}\).
Proof
We define \(K:\mathsf{Y} \rightarrow \mathsf{Q}^{\#}\) as follows. We write Z for the unary relational representation of zero, S for the binary relational representation of successor, A for the ternary relational representation of addition, and M for the ternary relational representation of multiplication.
-
\(\delta _K(x) :\leftrightarrow x=x\)
-
\(x =_K y :\leftrightarrow x=y\)
-
\(\triangle _{0K}(x) :\leftrightarrow \mathsf{S}x \ne x\)
-
\(\triangle _{1K}(x) :\leftrightarrow \mathsf{S}x = x \wedge x \ne \infty \)
-
\(\triangle _{2K}(x) : \leftrightarrow x = \infty \)
-
\(\mathsf{Z}_Kx :\leftrightarrow \mathsf{Z}x\)
-
\(\mathsf{S}_Kxy :\leftrightarrow \triangle _{0K}(x) \wedge \mathsf{S}x=y\)
-
\(\mathsf{A}_Kxyz :\leftrightarrow \triangle _{0K}(x) \wedge \triangle _{0K}(y) \wedge x+y=z\)
-
\(\mathsf{M}_Kxyz :\leftrightarrow \triangle _{0K}(x) \wedge \triangle _{0K}(y) \wedge x\cdot y = z\)
-
\(R_Kxy :\leftrightarrow \triangle _{1K}(x) \wedge \triangle _{1K}(y) \wedge x+y = \infty \)
It is easy to see that the specified translation does indeed deliver the desired interpretation. We define \(M:\mathsf{Q}^\# \rightarrow \mathsf{Y}\) as follows. To make the interpretation readable we use functional notation on the side of the interpreting theory. The reader should keep in mind that, e.g. S is only defined on \(\triangle _0\). We write \(\infty \) for the unique inhabitant of \(\triangle _2\). Below the itemized definition, we repeat the definitions of addition and multiplication in more readable tabular form.
-
\(\delta _M(x) :\leftrightarrow x=x\)
-
\(x=_M y :\leftrightarrow x=y\)
-
\(\mathsf{Z}_Mx :\leftrightarrow x=0\)
-
\(\mathsf{S}_Mxy :\leftrightarrow \mathsf{S}x= y \vee (\lnot \triangle _0(x) \wedge x=y)\)
-
\(\mathsf{A}_Mxyz :\leftrightarrow x+y=z \ \vee \)
$$\begin{aligned}&(\triangle _0(x) \wedge \triangle _1(y) \wedge z=y) \; \vee \\&(\triangle _1(x) \wedge \triangle _0(y) \wedge z=x) \; \vee \\&(\triangle _1(x) \wedge \triangle _1(y) \wedge Rxy \wedge z = \infty ) \; \vee \\&(\triangle _1(x) \wedge \triangle _1(y) \wedge \lnot \, Rxy \wedge z = y) \; \vee \\&( (x=\infty \vee y = \infty ) \wedge z = \infty ) \end{aligned}$$
-
\(\mathsf{M}_Mxyz :\leftrightarrow (y=0 \wedge z=0) \vee (y=1 \wedge \mathsf{A}_M(0,x,z))\, \vee \)
$$\begin{aligned}&\exists u (y = \mathsf{SS}u \wedge ((\triangle _0(x) \wedge x\cdot y = z) \vee \\&(\triangle _1(x) \wedge ((Rxx \wedge z= \infty ) \vee (\lnot Rxx \wedge z = x) )) \vee \\&(x = \infty \wedge z = \infty ))) \vee ((\triangle _1(y) \vee y = \infty ) \wedge z=\infty ) \end{aligned}$$
Here is the diagrammatic version of the definitions of addition and multiplication. In the diagrams, n ranges over \(\triangle _0\) and x ranges over \(\triangle _1\).
![](http://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Equ23_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Equ24_HTML.gif)
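The definitions of successor, addition and multiplication given by M can be tried out on a small concrete instance. The sketch below is an illustration under stated assumptions: the first component is represented by Python integers, the second component by a three-element set `NODES` with an arbitrarily chosen relation `R`, and the third component by the token `INF`. All of these names are hypothetical. The check runs over a finite sample only, so it can expose a mistake in the case distinctions but does not replace the verification in the proof.

```python
from itertools import product

INF = "inf"                     # the unique element of the third component
NODES = ["a", "b", "c"]         # second component (any non-empty set)
R = {("a", "b"), ("c", "c")}    # an arbitrary binary relation on NODES

def d0(x):
    """Membership in the first component (the numbers)."""
    return isinstance(x, int)

def S(x):
    return x + 1 if d0(x) else x

def add(x, y):
    if x == INF or y == INF:
        return INF
    if d0(x) and d0(y):
        return x + y
    if d0(x):
        return y                # number plus node
    if d0(y):
        return x                # node plus number
    return INF if (x, y) in R else y

def mul(x, y):
    if y == 0:
        return 0
    if y == 1:
        return add(0, x)
    if d0(y):                   # y is of the form SSu
        if d0(x):
            return x * y
        return INF if x == INF or (x, x) in R else x
    return INF                  # y is a node or INF

def fix(x):
    return S(x) == x

def q_sharp_holds(sample):
    """Check the axioms of Q and the extra axioms of Q# on the sample."""
    if not any(fix(x) and x != INF for x in sample):
        return False
    for x, y in product(sample, repeat=2):
        checks = [
            S(x) != S(y) or x == y,                            # Q1
            S(x) != 0,                                         # Q2
            x == 0 or fix(x) or S(x - 1) == x,                 # Q3
            add(x, 0) == x,                                    # Q4
            add(x, S(y)) == S(add(x, y)),                      # Q5
            mul(x, 0) == 0,                                    # Q6
            mul(x, S(y)) == add(mul(x, y), x),                 # Q7
            fix(x) or fix(y) or not fix(add(x, y)),
            fix(x) or fix(y) or not fix(mul(x, y)),
            add(x, INF) == INF == add(INF, x),
            fix(x) or not fix(y) or add(x, y) == y == add(y, x),
            not (fix(x) and fix(y)) or add(x, y) in (y, INF),
            not fix(y) or mul(x, y) == INF,
            not fix(x) or fix(y) or mul(x, S(S(y))) == add(x, x),
        ]
        if not all(checks):
            return False
    return True

SAMPLE = list(range(6)) + NODES + [INF]
print(q_sharp_holds(SAMPLE))
# Reading R back off addition, as in the clause for R under K:
print({(x, y) for x in NODES for y in NODES if add(x, y) == INF} == R)
```

The last line reflects the round trip between the two translations: the relation R of the second component is recoverable from addition as the set of pairs of non-infinite fixed points whose sum is \(\infty \).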
The verification that the translation we specified does indeed carry an interpretation of \(\mathsf{Q}^\#\) is immediate. We treat two sample cases of the verification of \(M \circ K =_0 \mathsf{ID}_{Y}\). We have:
The last step uses the fact that the \((\ldots )\) implies that \(\lnot (\triangle _0(x) \wedge \triangle _0(y))\).
![](http://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Equ25_HTML.gif)
We treat one sample case to illustrate that \(K \circ M =_0 \mathsf{ID}_{\mathsf{Q}^\#}\).
Of course, the last step is by verifying that if, e.g. \(\mathsf{S} x = x \wedge x\ne \infty \wedge \mathsf{S}y = y \wedge y \ne \infty \), then \( x+y = \infty \wedge z = \infty \) iff \(x+y=z\), and similarly for the other cases. \(\square \)
Theorem 4
\(\mathsf{Q} \rhd \mathsf{Q}^{\#}\).
Proof
The theorem is a direct consequence of the fact that \(\mathsf{Q} \rhd \mathsf{Q}^\circ \) via a definable cut, \(\mathsf{Q} \rhd \mathsf{CQC}^2\) and \(\mathsf{Q} \rhd {\mathbbm {1}}\) in combination with Theorem 3, noting that \(\boxplus \) is (an implementation of) the supremum in the degrees of interpretability. \(\square \)
5 Q is not a poly-pair theory
In this section we show that Q is not a poly-pair theory. We explain the notion of poly-pair theory in Appendix 2. In this same appendix we provide various basic facts about poly-pair theories. These basics are mainly derived from our paper (Visser 2013).
Theorem 5
Q is not a poly-pair theory.
Proof
Suppose Q were a poly-pair theory. By our result of Sect. 4, the theory Y is bi-interpretable with \(\mathsf{Q}^\#\). Since, by the results of Appendix 2, the property of being a poly-pair theory is upwards preserved under theory extension and is preserved under bi-interpretations, it is sufficient to show that Y is not a poly-pair theory.
Suppose Y is a poly-pair theory. Consider any model \({\mathcal {M}}\) of Y in which the relation R is empty and in which the domain of the second component is infinite. By the results in Appendix 2 the interpretation that witnesses the fact that Y is poly-pair can be taken to be parameter-free, but we do not need to use the result. Suppose the parameters of our interpretation are \(\mathbf {p}\) and let the dimension be m.
Consider two m-sequences \(\mathbf {a}\) and \(\mathbf {b}\) in the second component. We assume that the elements of \(\mathbf {a},\mathbf {b},\mathbf {p}\) are pairwise distinct. Let \(\mathbf {c}\) be a pair (according to the interpretation) containing \(\mathbf {a}\) and \(\mathbf {b}\). Some element d of \(\mathbf {a},\mathbf {b}\) does not occur in \(\mathbf {c}\). Let e in the second component be disjoint from \(\mathbf {a},\mathbf {b},\mathbf {c}, \mathbf {p}\). Let \(\sigma \) be the operation of interchanging d and e. Clearly, \(\sigma \) is an automorphism of our model that leaves \(\mathbf {p}\) fixed. So \(\mathbf {c}\) is also a pair of \(\sigma \mathbf {a}\) and \(\sigma \mathbf {b}\). Since either \(\sigma \mathbf {a} \ne \mathbf {a}\) or \(\sigma \mathbf {b} \ne \mathbf {b}\), this contradicts the defining property of pairing. \(\square \)
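The combinatorial core of this argument can be replayed on toy data. In the sketch below the second component is modelled by integers, which is harmless since R is empty there and so every permutation of that component is an automorphism. The concrete sequences `a`, `b`, `c`, `p` are arbitrary illustrative choices, and `c` stands for whatever m-sequence the interpretation offers as a pair.

```python
def transposition(d, e):
    """The permutation that swaps d and e and fixes everything else."""
    return lambda x: e if x == d else d if x == e else x

def move(perm, seq):
    """Apply a permutation to every entry of a sequence."""
    return tuple(perm(x) for x in seq)

m = 2
p = (0,)                  # parameters of the interpretation
a, b = (1, 2), (3, 4)     # two m-sequences with pairwise distinct entries
c = (1, 3)                # an m-sequence offered as the pair of a and b

# a and b together have 2m distinct entries, c has only m, so some entry
# of a or b is missing from c.
d = next(x for x in a + b if x not in c)
e = max(a + b + c + p) + 1          # an element occurring nowhere so far
sigma = transposition(d, e)

print(move(sigma, c) == c, move(sigma, p) == p)      # c and p are fixed
print(move(sigma, a) != a or move(sigma, b) != b)    # a or b is moved
```

So the same `c` would have to code both the original sequences and their images under the automorphism, which is what conflicts with the defining property of pairing.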
6 The Pudlák property
The Pudlák property of a theory U in its classical formulations says: (i) there is an interpretation \(N^\star :\mathsf{S}^1_2 \rightarrow U\) and (ii) whenever \(N:\mathsf{S}^1_2 \rightarrow U\) and \(N^{\prime }: \mathsf{S}^1_2 \rightarrow U\), then there is a U-definable, U-verifiable isomorphism F between certain U-definable, U-verifiable cuts I of N and J of \(N^{\prime }\). We take our cuts to be downwards closed w.r.t. < and closed under S, \(+\), \(\times \) and \(\omega _1\).
To keep our treatment simple we only consider the case of parameter-free interpretations. The case with parameters is briefly discussed in Remark 1.
For our purposes, it is nicer to view the Pudlák property in the light of the category \(\mathsf{INT}_1\) of bi-interpretability. See Appendix 1 for an introduction to bi-interpretability and \(\mathsf{INT}_1\).
We define a functor A from \(\mathsf{INT}_1\) to the category of preorders. Here we allow the empty preorder. Consider any theory U. We send U to the structure \(\mathsf{A}(U)\). The elements of \(\mathsf{A}(U)\) are interpretations \(N:\mathsf{S}^1_2 \rightarrow U\) modulo i-isomorphism. The structure \(\mathsf{A}(U)\) has a binary preordering \(\preceq \) defined by \(N \preceq N^{\prime }\) iff there is a U-definable, U-verifiable initial embedding F from N to \(N^{\prime }\). We note that the existence of such an embedding is independent of the choice of the syntactical representatives of N and \(N^{\prime }\). If \(K :U \rightarrow V\), then \(\mathsf{A}(K):\mathsf{A}(U) \rightarrow \mathsf{A}(V)\) is defined by \(\mathsf{A}(K)(N) := K \circ N\). We note \(\mathsf{A}(K)\) does indeed preserve \(\preceq \).
We remind the reader that a preorder is downward directed if for every x and y, there is a z with \(z \le x\) and \(z \le y\).
The Pudlák property for U is equivalent to: \(\mathsf{A}(U)\) is non-empty and downward directed.
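For a finite preorder the two conditions in this reformulation can be checked mechanically. The sketch below is a generic illustration, not tied to interpretations: the preorders used are divisibility on small sets of numbers, and the function names are hypothetical. It also makes visible why non-emptiness has to be required separately, since the empty preorder is vacuously downward directed.

```python
def is_downward_directed(elements, leq):
    """Every two elements have a common lower bound among the elements."""
    return all(
        any(leq(z, x) and leq(z, y) for z in elements)
        for x in elements
        for y in elements
    )

def nonempty_and_directed(elements, leq):
    """The shape of the condition: non-empty and downward directed."""
    return len(elements) > 0 and is_downward_directed(elements, leq)

def divides(z, x):
    return x % z == 0

print(nonempty_and_directed([1, 2, 3, 6], divides))   # 1 lies below everything
print(nonempty_and_directed([2, 3, 6], divides))      # 2 and 3 have no lower bound
print(is_downward_directed([], divides))              # vacuously true
print(nonempty_and_directed([], divides))
```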
We remind the reader of Pavel Pudlák’s well-known result from Pudlák (1985).
Theorem 6
(Pudlák) Sequential theories have the Pudlák property.
We show that the converse of Pudlák’s result does not hold in Appendix 3. Specifically, we show that, if U has the Pudlák Property, then so does \(U \boxplus \mathsf{EQ}\). Here EQ is the pure theory of equality in the minimal signature. It follows that, e.g. \(\mathsf{S}^1_2 \boxplus \mathsf{EQ}\) has the Pudlák Property. However, the methods of Sect. 7 show that \(\mathsf{S}^1_2 \boxplus \mathsf{EQ}\) is not sequential.
Remark 1
What is the Pudlák Property in the case with parameters? It seems that there are lots of options. Instead of systematically looking at various versions, I will just give the one that I think is most attractive. For the basic definitions and notations concerning parameters the reader is referred to the “Adding parameters” section of Appendix 1.
First we generalize \(\preceq \) to the case with parameters. We define:
- \(N \preceq N^{\prime }\) iff, for some F, we have: \(U\vdash \forall \mathbf {q} (\alpha _{N^{\prime }}(\mathbf {q}) \rightarrow \exists \mathbf {p} \; (\alpha _N(\mathbf {p}) \wedge F(\mathbf {p}, \mathbf {q}): N^{\mathbf {p}} \preceq {N^{\prime }}^{\mathbf {q}}))\).
Here \((F(\mathbf {p}, \mathbf {q}): N^{\mathbf {p}} \preceq {N^{\prime }}^{\mathbf {q}})\) means that F is a formula representing an initial embedding of \(N^{\mathbf {p}}\) in \({N^{\prime }}^{\mathbf {q}}\), where domain and range of F are cuts. We note that we could allow F to have some extra parameters of its own, so that the formula \((F(\mathbf {p}, \mathbf {q}): N^{\mathbf {p}} \preceq {N^{\prime }}^{\mathbf {q}})\) would become \(\exists \mathbf {r} (F(\mathbf {p}, \mathbf {q}, \mathbf {r}): N^{\mathbf {p}} \preceq {N^{\prime }}^{\mathbf {q}})\). However, we will refrain from doing so.
The Pudlák Property now takes a simple form: (i) There is an \(N^\star :\mathsf{S}^1_2 \rightarrow U\) and (ii) for every \(N:\mathsf{S}^1_2 \rightarrow U\), there is a parameter-free \(N_0:\mathsf{S}^1_2 \rightarrow U\), such that \(N_0 \preceq N\). This holds even in the case that the direct interpretation that witnesses the sequentiality of U itself contains parameters. We can rephrase this version of the Pudlák property as follows: \(\mathsf{A}(U)\) is not empty, and the parameter-free interpretations are coinitial in \(\mathsf{A}(U)\).
We note that the two interpretations N and \(N^{\prime }\) in the formulation of the parameter-free case can be subsumed under a single interpretation with parameters \(N{\langle x=0 \rangle } N^{\prime }\). So, the Pudlák Property with parameters includes the one without parameters.
Let us call my version of the Pudlák Property with parameters: the strong Pudlák Property. Sequential theories have the strong Pudlák Property. This result holds even in the case that U is sequential via a direct interpretation with parameters. The result also holds in the poly-sequential case. We refer the reader to Visser (2013), for details, specifically to the second proof of Theorem 5.2 of that paper.
I have not worked out the full development of the case with parameters. However, note that we show below that Q fails to have the weaker, parameter-free property, so a fortiori it fails to have the stronger one.
We collect some basic insights. Since the homomorphic image of a downward directed preorder is downward directed, we have:
Theorem 7
Suppose \(K:U \rightarrow V\) and U has the Pudlák Property and \(\mathsf{A}(K)\) is surjective. Then, V has the Pudlák property.
We show that A applied to an instance of the extension relation is surjective.
Theorem 8
Suppose V is an extension of U in the same language. Let \(\mathsf{emb}_{UV}\) be the identical embedding. Suppose further that \(\mathsf{A}(U)\) is non-empty. Then, \(\mathsf{A}(\mathsf{emb}_{UV})\) is surjective.
Proof
Suppose \(N_0 \in \mathsf{A}(U)\) and \(N \in \mathsf{A}(V)\). Let \(N_1 := N {\langle (\bigwedge \mathsf{S}^1_2)^N \rangle } N_0\).Footnote 9 Then, clearly \(N_1 \in \mathsf{A}(U)\). Moreover, \(\mathsf{emb}_{UV}(N_1) = N\). \(\square \)
We remind the reader of the following. Consider a category \({\mathcal {C}}\). Suppose \(f: x\rightarrow y\) and \(g: y \rightarrow x\) and \(g\circ f = \mathsf{id}_x\). In this case, we call f a section or split monomorphism. We call g a retraction or split epimorphism. The object x is in this situation a retract of y.
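As a toy illustration of these notions, consider the category of finite sets, with maps represented as Python dictionaries. This is only a sketch under that assumption; the helper names `compose` and `is_retraction_pair` are hypothetical. It shows the two facts used below: a section is injective and a retraction is surjective.

```python
def compose(g, f):
    """Return the map 'g after f' for maps given as dicts."""
    return {x: g[f[x]] for x in f}

def is_retraction_pair(f, g):
    """True if g after f is the identity on the domain of f."""
    return all(g[f[x]] == x for x in f)

x_obj = {"a", "b"}
y_obj = {0, 1, 2}
f = {"a": 0, "b": 1}          # section: x -> y
g = {0: "a", 1: "b", 2: "b"}  # retraction: y -> x

g_after_f = compose(g, f)     # the identity map on x_obj
```

Here `x_obj` is a retract of `y_obj`: the map `g` hits every element of `x_obj`, while `f` identifies no two elements.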
We show that A applied to a retraction is surjective.
Theorem 9
Suppose \(K:U \rightarrow V\) is a retraction. Then \(\mathsf{A}(K)\) is a retraction and hence surjective.
Proof
Any functor preserves retractions. So \(\mathsf{A}(K)\) is a retraction in the category of preorders. It follows that \(\mathsf{A}(K)\) is surjective. \(\square \)
Question 1
We note that both retractions and theory extensions are epimorphisms in \(\mathsf{INT}_1\). Does A preserve epimorphisms? (Clearly an epimorphism in the category of preorders is surjective.)
In Visser (2006), we proved that, in \(\mathsf{INT}_0\), epimorphisms can always be split in first a theory extension and then an isomorphism. So, a fortiori, in \(\mathsf{INT}_0\), epimorphisms are preserved by the \(\mathsf{INT}_0\)-analogue of A.
Theorem 10
Q does not have the Pudlák Property.
Proof
By Theorem 3, we have extensions \(\mathsf{Q}^\#\) and \(\mathsf{Q}^\circ \) of Q such that \(\mathsf{Q}^\#\) is synonymous to \(\mathsf{Y} := \mathsf{Q}^\circ \boxplus \mathsf{CQC}^2 \boxplus {\mathbbm {1}}\). We extend \(\mathsf{CQC}^2\) to AS with R in the role of \(\in \).Footnote 10 Let \(A: = \mathsf{Q}^\circ \boxplus \mathsf{AS} \boxplus {\mathbbm {1}}\).
Suppose Q has the Pudlák Property. By Theorem 7, combined with Theorems 8 and 9, the property is preserved from Q to \(\mathsf{Q}^\#\), from \(\mathsf{Q}^\#\) to Y, and from Y to A.
Consider interpretations \(N:\mathsf{S}^1_2 \rightarrow \mathsf{Q}^\circ \) and \(M:\mathsf{S}^1_2 \rightarrow \mathsf{AS}\). Let \(N^*:= \mathsf{in}_0 \circ N : \mathsf{S}^1_2 \rightarrow A\) and let \(M^*:= \mathsf{in}_1 \circ M: \mathsf{S}^1_2 \rightarrow A\). Suppose there is an embedding F in A of a cut of \(N^*\) into a cut of \(M^*\). By Theorem 15, we have:
$$\begin{aligned} A \vdash F(\mathbf {x},\mathbf {y}) \leftrightarrow \bigvee _{j<k} (D_j(\mathbf {x}) \wedge E_j(\mathbf {y})). \end{aligned}$$
Here the \(D_j\) are formulas in the range of \(\mathsf{in}_0\) and the \(E_j\) are formulas in the range of \(\mathsf{in}_1\). Suppose \(\mathbf {x}\) is in \(D_j\). In that case all \(\mathbf {y}\) in \(E_j\) are in the F-image of \(\mathbf {x}\). Hence, \(E_j\) is closed under \(=_{M^*}\). We may conclude that the range of F is standardly finite modulo \(=_{M^*}\). Quod non. \(\square \)
7 \(\mathsf{PA}^{-}\) and Q
In this section we show that \(\mathsf{PA}^{-}\) is not sententially congruent with Q.Footnote 11 In a sense, we could have written this section without even mentioning \(\mathsf{PA}^-\), since the result that \(\mathsf{PA}^-\) and Q are not sententially congruent follows from a much stronger result proven here. However, since the theories \(\mathsf{PA}^-\) and Q seem to be so close together, I feel that the specific result concerning \(\mathsf{PA}^-\) and Q speaks more to the imagination than the stronger result from which it follows.
To prove our main result, we need a few lemmas. We will be interested in retractions in the category \(\mathsf{INT}_3\) (see the “Five categories” section of Appendix 1). This takes the following form: we have interpretations \(K:U\rightarrow V\) and \(M:V \rightarrow U\) such that, for all U-sentences A, \(U \vdash A \leftrightarrow A^{KM}\). In this case K is a section or split monomorphism.
A basic insight is that the section relation has the forward or zig property w.r.t. theory-extension in \(\mathsf{INT}_3\). This is illustrated by the following diagram.
![figure a](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Figa_HTML.gif)
Theorem 11
The section relation in \(\mathsf{INT}_3\) has the forward or zig property with respect to theory extension.
Proof
Suppose \(K: U \rightarrow V\) is a section. We suppose that M is an inverse of K, so \(M: V \rightarrow U\) and \(M\circ K = \mathsf{ID}_U\). We suppose also \(U \subseteq U^{\prime }\). We define \(V^{\prime }:= \{ A \in \mathsf{sent}_V \mid U^{\prime } \vdash A^M \}\). Clearly, we have an interpretation \(M^{\prime }:V^{\prime } \rightarrow U^{\prime }\) based on the same translation as M. We have:
$$\begin{aligned} U^{\prime } \vdash B \;\Rightarrow \; U^{\prime } \vdash B^{KM} \;\Rightarrow \; V^{\prime } \vdash B^{K}. \end{aligned}$$
Hence there is an interpretation \(K^{\prime }\) based on the same translation as K such that \(K^{\prime }:U^{\prime } \rightarrow V^{\prime }\). We find that \(K^{\prime }\) is a section, since \(B^{K^{\prime }M^{\prime }}\) is only dependent on the underlying translations, and hence strictly identical to \(B^{KM}\). \(\square \)
We need a sufficient store of incomparable extensions of given finitely axiomatised theories. We remind the reader that a theory U tolerates or weakly interprets a theory V if, for some translation \(\tau \), the theory \(U+V^\tau \) is consistent. Note that we take the identity axioms for \(\varSigma _V\) including \(\exists x x=x\) to be part of V. The following theorem can probably be much improved, but it is what we need for the current application.
Theorem 12
Suppose A and B are finitely axiomatized theories that tolerate \(\mathsf{S}^1_2\). Then, there are finitely axiomatized theories \(A^\star \supseteq A\) and \(B^\star \supseteq B\), that are incomparable w.r.t. \(\lhd \), i.e. \(A^\star \mathrel {\not \! \rhd }B^\star \) and \(B^\star \mathrel {\not \! \rhd }A^\star \).
Proof
Suppose \(\tau \) witnesses that A tolerates \(\mathsf{S}^1_2\) and \(\nu \) witnesses that B tolerates \(\mathsf{S}^1_2\). We take \(A^{\prime } := A +(\mathsf{S}^1_2)^\tau \) and \(B^{\prime } := B +(\mathsf{S}^1_2)^\nu \). So there is an N based on \(\tau \) such that \(N:\mathsf{S}^1_2 \rightarrow A^{\prime }\) and there is an M based on \(\nu \) such that \(M:\mathsf{S}^1_2 \rightarrow B^{\prime }\). By the Gödel Fixed Point Lemma, we find R such that:
![](http://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Equ26_HTML.gif)
We take \(A^\star := A^{\prime } +R^N\) and \(B^\star := B^{\prime } + \lnot R^M\). Suppose \(A^\star \rhd B^\star \). It follows that R or \(R^\bot \). In case we have R, we find, by \(\varSigma _1\)-completeness, that \(A^{\prime } \rhd \bot \). Quod non. If we have \(R^\bot \), it follows that \((B^{\prime }+ \lnot R^M)\rhd (A^{\prime }+R^N)\), by the fixed point equation. By \(\varSigma _1\)-completeness, we have \(B^{\prime } \rhd \bot \). Quod non. We may conclude that \(A^\star \mathrel {\not \! \rhd }B^\star \).
The proof that \(B^\star \mathrel {\not \! \rhd }A^\star \) is similar. \(\square \)
The following theorem gives the basic simple insight concerning the unsplittability of connected theories. For the definition of connected see the “Sums” section of Appendix 1.
Theorem 13
Suppose U and V are incomparable w.r.t. local interpretability. Then \(U\boxplus V\) is not connected. It follows that no connected theory W can be mutually locally interpretable with \(U \boxplus V\).
Proof
Suppose that U and V are incomparable w.r.t. local interpretability and that \(U \boxplus V\) is connected. Since \((U \boxplus V)\rhd _\mathsf{loc} (U \boxplus V)\), it follows, by connectedness, that either U locally interprets \(U \boxplus V\) or V locally interprets \(U\boxplus V\). Suppose U locally interprets \(U\boxplus V\). Then, \(U \rhd _\mathsf{loc} (U \boxplus V) \rhd _\mathsf{loc} V\). So, \(U \rhd _\mathsf{loc} V\). Quod non. The assumption that V locally interprets \(U \boxplus V\) leads similarly to a contradiction.
\(\square \)
We are now ready to prove the main result of this section.
Theorem 14
Q cannot be an \(\mathsf{INT}_3\)-retract of a sequential theory.
Proof
We work in \(\mathsf{INT}_3\). Suppose U is sequential and \(\mathsf{Q}\) is a retract of U. We derive a contradiction.
We have \(\mathsf{Q} \subseteq \mathsf{Q}^\#\) and hence, by Theorem 11, we can find a theory \(V \supseteq U\) such that \(\mathsf{Q}^\#\) is a retract of V. Since \(\mathsf{Q}^\#\) is synonymous with Y, the theory \(\mathsf{Q}^\#\) is, a fortiori, sententially congruent to Y. It follows that Y is an \(\mathsf{INT}_3\)-retract of V.
We can interpret \(\mathsf{S}^1_2\) in \(\mathsf{Q}^\circ \), so \(\mathsf{Q}^\circ \) tolerates \(\mathsf{S}^1_2\). We can extend \(\mathsf{CQC}^2\) to the weak set theory AS which interprets \(\mathsf{S}^1_2\). So, \(\mathsf{CQC}^2\) tolerates \(\mathsf{S}^1_2\). It follows that \( \mathsf{CQC}^2 \boxplus {\mathbbm {1}}\) tolerates \(\mathsf{S}^1_2\).Footnote 12 Let \(A \supseteq \mathsf{Q}^\circ \) and \(B \supseteq \mathsf{CQC}^2 \boxplus {\mathbbm {1}}\) be the mutually incomparable theories promised by Theorem 12. By Theorem 11, we can find a theory \(W \supseteq V\) such that \(A \boxplus B\) is a retract of W.
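It follows that \(A \boxplus B\) is mutually locally interpretable with W. Since W extends the sequential theory U in the same language, W is itself sequential and hence connected. Moreover, since A and B are finitely axiomatized, their incomparability w.r.t. \(\rhd \) yields incomparability w.r.t. local interpretability. This contradicts Theorem 13.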
![figure b](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Figb_HTML.gif)
We may conclude that Q is not a retract of a sequential theory. \(\square \)
Since \(\mathsf{PA}^-\) is sequential (Jeřábek 2012), it follows immediately from our theorem that Q and \(\mathsf{PA}^-\) are not sententially congruent.
We note that the only properties of sequential theories used in our proof are that they are connected and that they are closed under theory-extension-in-the-same-language. Thus, for any class \({\mathcal {X}}\) of connected theories that is closed under theory-extension, Q cannot be an \(\mathsf{INT}_3\)-retract of any theory in \({\mathcal {X}}\).
8 Concluding remarks
The paper shows that the Pudlák Property and being a poly-pair theory are not preserved under mutual (faithful) interpretability. This provides two examples of good properties of theories that are not preserved under mutual (faithful) interpretability.Footnote 13 It illustrates the usefulness of having more refined notions of sameness of theories.
Connectedness is preserved under mutual interpretability and even under mutual local interpretability. It is a notion of non-splittability. As we have seen, the theory Q is splittable in a sense, but this splittability is under the radar of the notion of connectedness, since Q is connected. This discrepancy suggests that it might be interesting to explore more refined versions of connectedness that would exclude Q but include \(\mathsf{PA}^-\).
We note that our results in Sect. 7 imply that Q is connected but has a non-connected extension, to wit (a theory synonymous to) a theory of the form \(A \boxplus B\), where A and B are finitely axiomatized theories that are mutually incomparable w.r.t. relative interpretability. The natural class of sequential theories, however, is upwards closed under theory-extension. So, one may wonder if a more refined version of connectedness would have the property of being preserved under theory-extension.
Notes
Caveat Emptor. The version of Computability and Logic I studied was the second edition. In this book, Q is introduced with the correct axioms but without the name ‘Robinson’s Arithmetic’ and without reference to Tarski et al. (1953). The fourth edition (Boolos et al. 2002) is by Boolos, Burgess and Jeffrey. In this book, a different (but closely related) theory, also called minimal arithmetic, is presented as Q. In an optional section, we learn that “the label Q is often used to refer not to our minimal arithmetic but to another system, called Robinson’s Arithmetic for which we use the label R”. Then, the correct axioms for our Q are specified. However, now the name R is used for Q and not for Tarski, Mostowski and Robinson’s R ...
Saul Kripke noted that one can even prove the First Incompleteness Theorem for weaker theories than R, to wit a theory he calls school. His proof uses Matiyasevich’s Theorem to compensate for the fact that for school one cannot prove \(\varSigma _1\)-completeness, employing the fact that one still has completeness for purely existential formulas.
The notion of weak interpretability is introduced by Tarski, Mostowski and Robinson. Giorgi Japaridze uses tolerates for weakly interprets, which seems to me a more descriptive term.
The notation \(\subseteq \) holds between theories of the same signature. It is the extensional subset relation between the sets of theorems of the theories at hand. The notation \(\rhd \) is explained in the “Global and local interpretability” section of Appendix 1.
Unfortunately, the report is not available. We reproduce all necessary materials here.
I simplified the axioms of Marti, Kalchbrenner, Henk and Fritz a bit and also implemented three nice simplifications suggested by the referee. It seems to me that, for our intended applications, we could even work with the weaker system that omits axiom TN4. The remaining system would also be satisfied by an arbitrary ordinal.
The referee remarks that the use of \(\varSigma _1\)-collection can be eliminated from the argument. To do this one needs a careful expansion and reworking of the current argument.
Note that the \(\mathbf {n}\) in the context of \({\mathcal {M}} \models A \mathbf {n}\) are used as sequences of domain constants and not as sequences of numerals.
The notation \(K {\langle A \rangle } M\) is explained in the “Translations” and “Relative interpretations” sections of Appendix 1.
For the definition of AS, see Appendix 2.
The notion of sentential congruence is explained in the “Five categories” section of Appendix 1.
I am using here that \(\boxplus \) is associative modulo synonymy.
A trivial example of a property not preserved under mutual interpretability is decidability. However, decidability is preserved under mutual faithful interpretability.
We assume possible variable clashes resulting from the substitution of the \(\mathbf {x}\)’s for the \(\mathbf {v}\)’s to be resolved by \(\alpha \)-conversion.
There are several ways of handling such conventions. First we can work with a fixed global association between the \(x_i\) and the \(\mathbf {x}_i\). Secondly, we can make such an association local and carry it around as an extra argument of the translation. Thirdly, we can throw away the mechanism of using variable-names and work in a language that works with explicit links between places. Fourthly, we can sidestep the problem by working in many-sorted languages and, for every k, adding sequences of length k (of various sorts). This construction can be viewed as a representation of more dimensional interpretations as arrows in a Kleisli category. Regrettably, each way of proceeding needs some work and produces some awkwardness somewhere. We demand that the \(\mathbf {x}_i\) are fully disjoint when the \(x_i\) are different. In this paper, we will assume that these details are taken care of by one strategy or by another.
Of course, there is a foundational issue with this definition. Let’s say that we work in Gödel–Bernays set theory to understand the definition.
The ‘i’ in ‘i-isomorphism’ stands for interpretation.
Here we treat the variables \(x_k\) and \(y_\ell \) as parameters. They are not supposed to be universally generalized away.
This assumption becomes superfluous when we allow piece-wise interpretations.
References
Boolos GS, Burgess JP, Jeffrey RC (2002) Computability and logic, 4th edn. Cambridge University Press, Cambridge
Bezboruah A, Shepherdson JC (1976) Gödel’s second incompleteness theorem for Q. J Symb Log 41(2):503–512
de Bouvère KL (1965a) Logical synonymy. Indag Math 27:622–629
de Bouvère KL (1965b) Synonymous theories. In: Addison JW, Henkin L, Tarski A (eds) Proceedings of the 1963 international symposium at the theory of models at Berkeley. North Holland, Amsterdam, pp 402–406
Friedman H (2007) Interpretations according to Tarski. This is one of the 2007 Tarski lectures at Berkeley. The lecture is available at http://www.math.osu.edu/~friedman.8/pdf/Tarski1,052407
Hodges W (1993) Model theory, vol 42. Encyclopedia of mathematics and its applications. Cambridge University Press, Cambridge
Hájek P, Pudlák P (1993) Metamathematics of first-order arithmetic. Perspectives in mathematical logic. Springer, Berlin
Jeřábek E (2012) Sequence encoding without induction. Math Log Q 58(3):244–248
Jones JP, Shepherdson JC (1983) Variants of Robinson’s essentially undecidable theory R. Archiv Math Logik Grundl 23:61–64
Kaye R (1991) Models of Peano arithmetic. Oxford University Press, Oxford
Mycielski J, Pudlák P, Stern AS (1990) A lattice of chapters of mathematics (interpretations between theorems), vol 426. Memoirs of the American Mathematical Society (AMS), Providence
Mostowski A, Tarski A (1949) Undecidability in the arithmetic of integers and in the theory of rings. J Symb Log 14:76
Nelson E (1986) Predicative arithmetic. Princeton University Press, Princeton
Pambuccian V (2008) The sum of irreducible fractions with consecutive denominators is never an integer in \({ PA}^-\). Notre Dame J Form Log 49(4):425–429
Pambuccian V (2014) The Green–Tao theorem on primes in arithmetical progressions in the positive cone of \({\mathbb{Z}} [X]\). Elem Math 69(1):30–32
Pambuccian V (2015) Schatunowsky’s theorem, Bonse’s inequality, and Chebyshev’s theorem in weak fragments of Peano arithmetic. Math Log Q 61(3):230–235
Pour-El MB, Kripke S (1967) Deduction-preserving “recursive isomorphisms” between theories. Fundam Math 61:141–163
Pudlák P (1983) Some prime elements in the lattice of interpretability types. Trans Am Math Soc 280:255–275
Pudlák P (1985) Cuts, consistency statements and interpretations. J Symb Log 50(2):423–441
Robinson RM (1950) An essentially undecidable axiom system. Proc Int Congr Math 1:729–730
Stern AS (1989) Sequential theories and infinite distributivity in the lattice of chapters. J Symb Log 54:190–206
Švejdar V (2007) An interpretation of Robinson’s arithmetic in its Grzegorczyk’s weaker variant. Fundam Inform 81(1–3):347–354
Tarski A, Mostowski A, Robinson RM (1953) Undecidable theories. North-Holland, Amsterdam
Vaught RL (1962) On a theorem of Cobham concerning undecidable theories. In: Nagel E, Suppes P, Tarski A (eds) Proceedings of the 1960 international congress on logic, methodology and philosophy of science. Stanford University Press, Stanford, pp 14–25
Visser A (2006) Prolegomena to the categorical study of interpretations. Logic Group preprint series No. 249. Utrecht University, Utrecht. http://www.phil.uu.nl/preprints/lgps/
Visser A (2008) Pairs, sets and sequences in first order theories. Arch Math Log 47(4):299–326
Visser A (2012) Vaught’s theorem on axiomatizability by a scheme. Bull Symb Log 18(3):382–402
Visser A (2013) What is sequentiality? In: Cégielski P, Cornaros C, Dimitracopoulos C (eds) New studies in weak arithmetics, vol 211. CSLI lecture notes. CSLI Publications and Presses Universitaires du Pôle de Recherche et d’Enseignement Supérieur Paris-est, Stanford, pp 229–269
Visser A (2014) Interpretability degrees of finitely axiomatized sequential theories. Arch Math Log 53(1–2):23–42
Visser A (2014) Why the theory \({\sf R}\) is special. In: Tennant N (ed) Foundational adventures. Essays in honour of Harvey Friedman. College Publishing Ltd, London, pp 7–23 (originally published online by Templeton Press in 2012). http://foundationaladventures.com/
Acknowledgments
We thank Victor Pambuccian for his comments on the slides of a talk I gave on this subject. We are grateful to the anonymous referee for his perceptive comments and suggestions and for spotting some small errors. We thank Paul Taylor for the use of the Diagrams package.
Ethics declarations
Conflict of interest
The author declares that there are no conflicts of interest connected to publication of this article.
Additional information
Communicated by A. Di Nola, D. Mundici, C. Toffalori, A. Ursini.
This paper is dedicated to the memory of Franco Montagna. Franco was a gentle colleague and a fine human being. His insights and ideas were always an inspiration.
Appendices
Appendix 1: Basics
In this appendix, we provide detailed definitions of translations, interpretations and morphisms between interpretations.
1.1 Theories
Theories in this paper are one-sorted theories of first order predicate logic of finite relational signature. We take identity to be a logical constant. Our official signatures are relational; however, via the term-unwinding algorithm, we can also accommodate signatures with functions. For most purposes in the present paper, a theory can be identified with a deductively closed set of sentences of the given language. The exception is the few places where we use Rosser style arguments. We only do this in the context of finitely axiomatized theories. We assume that we employ the obvious axiomatizations in these cases.
1.2 Translations
Translations are the heart of our interpretations. In fact, they are often confused with interpretations, but we will not do that officially. In practice it is often convenient to conflate an interpretation and its underlying translation.
In formulating the notion of translation, we only sketch, in a hand-waving way, a number of subtleties and details concerning the choice and use of variables in the translations.
We define more-dimensional, one-piece relative translations without parameters. Let \(\varSigma \) and \(\varTheta \) be one-sorted signatures. A translation \(\tau :\varSigma \rightarrow \varTheta \) is given by a triple \({\langle m,\delta ,F \rangle }\). Here \(\delta (v_0,\ldots ,v_{m-1})\) will be the domain formula. The mapping F associates to each relation symbol R of \(\varSigma \) with arity n a formula \(A(\mathbf {v}_0,\ldots , \mathbf {v}_{n-1})\) of signature \(\varTheta \). Here the \(\mathbf {v}_i\) are sequences of variables of length m. The \(\mathbf {v}_i\) and the \(\mathbf {v}_j\), for \(i \ne j\) are disjoint.
We demand that predicate logic proves
$$\begin{aligned} F(R)(\mathbf {v}_0,\ldots ,\mathbf {v}_{n-1}) \rightarrow (\delta (\mathbf {v}_0) \wedge \ldots \wedge \delta (\mathbf {v}_{n-1})). \end{aligned}$$
Of course, given any candidate F(R) not satisfying the restriction, we can obviously modify the formula to satisfy the restriction.
We translate \(\varSigma \)-formulas to \(\varTheta \)-formulas as follows.Footnote 14
- \((R(x_0,\ldots ,x_{n-1}))^\tau := F(R)(\mathbf {x}_0,\ldots , \mathbf {x}_{n-1} )\). The single variable \(x_i\) of the source language need not have any obvious connection with the sequence of variables \(\mathbf {x}_i\) of the target language that represents it. We need some conventions to properly handle the association \(x_i\mapsto \mathbf {x}_i\).Footnote 15
- \((\cdot )^\tau \) commutes with the propositional connectives;
- \((\forall x A)^\tau := \forall \mathbf {x} (\delta (\mathbf {x})\rightarrow A^\tau )\);
- \((\exists x A)^\tau := \exists \mathbf {x} (\delta (\mathbf {x})\wedge A^\tau )\).
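The clauses above can be read as a recursive procedure on formulas. The following is a minimal sketch under several stated assumptions: formulas are encoded as nested tuples, the association \(x \mapsto \mathbf {x}\) is the fixed global one sending `x` to `x_0, ..., x_{m-1}`, and variable clashes are ignored. The names `Translation`, `block` and `translate` are hypothetical, and the sketch does not enforce the restriction on F(R) demanded above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Formulas are nested tuples, e.g. ("forall", "x", ("rel", "R", ("x", "y"))).
Formula = tuple

@dataclass
class Translation:
    m: int                                       # dimension
    delta: Callable[[Tuple[str, ...]], Formula]  # domain formula
    F: Dict[str, Callable[..., Formula]]         # symbol -> target formula

def block(x: str, m: int) -> Tuple[str, ...]:
    """Assumed fixed association of x with the sequence x_0, ..., x_{m-1}."""
    return tuple(f"{x}_{i}" for i in range(m))

def translate(phi: Formula, tau: Translation) -> Formula:
    tag = phi[0]
    if tag == "rel":                    # atomic case: use F(R)
        _, name, args = phi
        return tau.F[name](*(block(x, tau.m) for x in args))
    if tag == "not":                    # commute with the connectives
        return ("not", translate(phi[1], tau))
    if tag in ("and", "or", "imp"):
        return (tag, translate(phi[1], tau), translate(phi[2], tau))
    if tag == "forall":                 # relativize to the domain
        _, x, body = phi
        xs = block(x, tau.m)
        return ("forall", xs, ("imp", tau.delta(xs), translate(body, tau)))
    if tag == "exists":
        _, x, body = phi
        xs = block(x, tau.m)
        return ("exists", xs, ("and", tau.delta(xs), translate(body, tau)))
    raise ValueError(f"unknown connective: {tag!r}")

# A hypothetical 2-dimensional translation with domain D and F(R) = S.
tau = Translation(
    m=2,
    delta=lambda xs: ("rel", "D", xs),
    F={"R": lambda xs, ys: ("rel", "S", xs + ys)},
)
phi = ("forall", "x", ("exists", "y", ("rel", "R", ("x", "y"))))
phi_tau = translate(phi, tau)
```

In this encoding a quantifier over a tuple of variables stands for the corresponding block of m quantifiers in the target language.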
Here are some convenient conventions and notations.
- We write \(\delta _\tau \) for ‘the \(\delta \) of \(\tau \)’ and \(F_\tau \) for ‘the F of \(\tau \)’.
- We write \(R_\tau \) for \(F_\tau (R)\).
- We write \(\mathbf {x}\in \delta \) for: \(\delta (\mathbf {x})\).
There are some natural operations on translations. The identity translation \(\mathsf{id}:=\mathsf{id}_\Theta \) is one-dimensional and it is defined by:
- \(\delta _\mathsf{id}(x):= (x=x)\),
- \(R_\mathsf{id}(\mathbf {x}):= R\mathbf {x}\).
We can compose relative translations as follows. Suppose \(\tau \) is an m-dimensional translation from \(\varSigma \) to \(\varTheta \), and \(\nu \) is a k-dimensional translation from \(\varTheta \) to \(\varXi \). We define the \(m \times k\)-dimensional interpretation \(\tau \nu \) or \(\nu \circ \tau \) as follows.
- We suppose that with the variable x we associate under \(\tau \) the sequence \(x_0,\ldots , x_{m-1}\) and under \(\nu \) we send \(x_i\) to \(\mathbf {x}_i\). We take:
$$\begin{aligned}&\delta _{\tau \nu }(\mathbf {x}_{0},\ldots ,\mathbf {x}_{m-1})\\&\quad \quad = \delta _\nu (\mathbf {x}_{0}) \wedge \ldots \wedge \delta _\nu (\mathbf {x}_{m-1}) \wedge \\&\qquad \quad (\delta _\tau ( x_{0},\ldots ,x_{m-1}))^\nu . \end{aligned}$$
- Let R be n-ary. Suppose that under \(\tau \) we associate with \(x_i\) the sequence \(x_{i,0},\ldots ,x_{i,m-1}\) and that under \(\nu \) we associate with \(x_{i,j}\) the sequence \(\mathbf {x}_{i,j}\). We take:
$$\begin{aligned}&R_{\tau \nu }(\mathbf {x}_{0,0},\ldots \mathbf {x}_{n-1,m-1})\\&\quad \quad = \delta _\nu (\mathbf {x}_{0,0}) \wedge \ldots \wedge \delta _\nu (\mathbf {x}_{n-1,m-1}) \wedge \\&\qquad \quad (R_\tau ( x_{0,0},\ldots x_{n-1,m-1}))^\nu . \end{aligned}$$
We can make a disjunctive interpretation as follows. Suppose \(\tau \) and \(\nu \) are translations from \(\varSigma \) to \(\varTheta \). We assume that \(\tau \) is k-dimensional and \(\nu \) is m-dimensional. Let A be a \(\Theta \)-sentence. We introduce a \(\mathsf{max}(k,m)\)-dimensional interpretation \(\tau {\langle A \rangle } \nu \).
We first ‘lift’ one of the interpretations by padding to get the dimensions equal. Suppose, e.g. that \(k < m\). Then we define:
- \(\delta _{\tau ^{\prime }}(\mathbf {x} \mathbf {z}) :\leftrightarrow \delta _\tau (\mathbf {x})\),
- \(P_{\tau ^{\prime }}(\mathbf {x}_0 \mathbf {z}_0, \ldots , \mathbf {x}_{n-1}\mathbf {z}_{n-1}) :\leftrightarrow P_\tau (\mathbf {x}_0,\ldots ,\mathbf {x}_{n-1})\).
Here the dimension of the \(\mathbf {z}\) is \(m-k\).
Suppose the results of the padding operation are \(\tau ^{\prime }\) and \(\nu ^{\prime }\), where, of course, in case \(k < m\), \(\nu = \nu ^{\prime }\), etcetera. We define \(\tau {\langle A \rangle } \nu \) as follows:
- \(\delta _{\tau {\langle A \rangle }\nu }(\mathbf {x}) := ((A \wedge \delta _{\tau ^{\prime }}(\mathbf {x})) \vee (\lnot A \wedge \delta _{\nu ^{\prime }}(\mathbf {x})))\).
- \(R_{\tau {\langle A \rangle }\nu }(\mathbf {x}_0,\ldots ,\mathbf {x}_{n-1}) := \)
$$\begin{aligned} \left( (A \wedge R_{\tau ^{\prime }}(\mathbf {x}_0,\ldots ,\mathbf {x}_{n-1})) \vee (\lnot A \wedge R_{\nu ^{\prime }}(\mathbf {x}_0,\ldots ,\mathbf {x}_{n-1}))\right) . \end{aligned}$$
Here the \(\mathbf {x}\) are \(\mathsf{max}(k,m)\)-dimensional.
An m-dimensional translation \(\tau \) preserves identity if
$$\begin{aligned} (\mathbf {x} =_\tau \mathbf {y}) \leftrightarrow \bigwedge _{i<m} x_i = y_i. \end{aligned}$$
An m-dimensional translation \(\tau \) is unrelativized if \(\delta _\tau (\mathbf {x}) = \top \). An m-dimensional translation \(\tau \) is direct if it is unrelativized and preserves identity. Note that all these properties are preserved by composition (modulo provable equivalence in predicate logic).
Consider a model \({\mathcal {M}}\) of signature \(\varTheta \) with domain M and a k-dimensional translation \(\tau :\varSigma \rightarrow \varTheta \). Suppose the \(\tau \)-translations of the identity axioms, including \(\exists x\, x = x\), are true in \({\mathcal {M}}\). Let \(N:=\{ \mathbf {m} \in M^k\mid {\mathcal {M}} \models \delta _\tau \mathbf {m} \}\). Let E be the equivalence relation on N defined in \({\mathcal {M}}\) by \(=_\tau \). Then \(\tau \) specifies an internal model \({\mathcal {N}}\) of \({\mathcal {M}}\) with domain N / E and with \({\mathcal {N}} \models R([\mathbf {m}_0]_E,\ldots ,[\mathbf {m}_{n-1}]_E) \) iff \({\mathcal {M}} \models R_\tau (\mathbf {m}_0,\ldots ,\mathbf {m}_{n-1})\). We will write \(\widetilde{\tau }({\mathcal {M}})\) for the internal model of \({\mathcal {M}}\) given by \(\tau \).
We treat the mapping \(\tau ,{\mathcal {M}} \mapsto \widetilde{\tau }{{\mathcal {M}}}\) as a partial function that is defined precisely if the translations of the identity axioms are true in \({\mathcal {M}}\). Let \(\textsf {Mod}\) or \(\widetilde{(\cdot )}\) be the function that maps \(\tau \) to \(\widetilde{\tau }\). We have:
$$\begin{aligned} \widetilde{\tau \nu }({\mathcal {M}}) = \widetilde{\tau }(\widetilde{\nu }({\mathcal {M}})). \end{aligned}$$
So, Mod behaves contravariantly.
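For finite structures, the internal model construction described above can be carried out mechanically. The following is a minimal sketch, assuming that the ambient model is finite, that formulas are supplied as Python predicates already evaluated in the ambient model, and that \(=_\tau \) is a congruence for the translated relations. The function name `internal_model` and the example structure are hypothetical.

```python
from itertools import product

def internal_model(domain, k, delta, eq_tau, relations):
    """Build the internal model given by a k-dimensional translation.

    domain    : finite iterable of elements of the ambient model M
    delta     : predicate on k-tuples (the domain formula, evaluated in M)
    eq_tau    : predicate on pairs of k-tuples (the translated identity)
    relations : dict mapping a symbol R to (arity, predicate for R_tau)
    Returns the list of E-classes and the interpretation of each R.
    """
    n_dom = [t for t in product(domain, repeat=k) if delta(t)]

    # Partition N into the classes of the equivalence relation E.
    classes = []
    for t in n_dom:
        for c in classes:
            if eq_tau(c[0], t):
                c.append(t)
                break
        else:
            classes.append([t])
    frozen = [frozenset(c) for c in classes]
    class_of = {t: c for c in frozen for t in c}

    # N |= R([m_0], ..., [m_{n-1}]) iff M |= R_tau(m_0, ..., m_{n-1}).
    interp = {
        name: {
            tuple(class_of[t] for t in ts)
            for ts in product(n_dom, repeat=arity)
            if pred(*ts)
        }
        for name, (arity, pred) in relations.items()
    }
    return frozen, interp

# Hypothetical example: inside {0, ..., 5} we define addition modulo 3.
classes, interp = internal_model(
    domain=range(6),
    k=1,
    delta=lambda t: True,
    eq_tau=lambda s, t: (s[0] - t[0]) % 3 == 0,
    relations={"Add": (3, lambda s, t, u: (s[0] + t[0] - u[0]) % 3 == 0)},
)
```

In the example the domain formula is trivial, so all the work is done by dividing out the translated identity: the six elements collapse to three classes, and `Add` becomes the graph of addition on those classes.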
1.3 Relative interpretations
A translation \(\tau \) supports a relative interpretation of a theory U in a theory V, if, for all U-sentences A, we have \(U\vdash A \Rightarrow V\vdash A^\tau \). Note that this automatically takes care of the theory of identity and assures us that \(\delta _\tau \) is inhabited. We will write \(K={\langle U,\tau ,V \rangle }\) for the interpretation supported by \(\tau \). We write \(K:U\rightarrow V\) for: K is an interpretation of the form \({\langle U,\tau ,V \rangle }\). If M is an interpretation, \(\tau _M\) will be its second component, so \(M={\langle U,\tau _M,V \rangle }\), for some U and V.
Par abus de langage, we write ‘\(\delta _K\)’ for: \(\delta _{\tau _K}\); ‘\(R_K\)’ for: \(R_{\tau _K}\); ‘\(A^K\)’ for: \(A^{\tau _K}\), etc. Here are the definitions of three central operations on interpretations.
- Suppose U has signature \(\varSigma \). We define: \(\mathsf{ID}_U:U\rightarrow U\) is \({\langle U,\mathsf{id}_\varSigma ,U \rangle }\).
- Suppose \(K:U\rightarrow V\) and \(M:V\rightarrow W\). We define: \(M\circ K:U\rightarrow W\) is \({\langle U,\tau _M\circ \tau _K,W \rangle }\).
- Suppose \(K:U\rightarrow (V+A)\) and \(M:U\rightarrow (V+\lnot \, A)\). We define: \(K{\langle A \rangle } M:U\rightarrow V\) is \({\langle U,\tau _K{\langle A \rangle }\tau _M,V \rangle }\).
It is easy to see that we indeed correctly defined interpretations between the theories specified.
1.4 Five categories
We do not automatically get a category of theories and interpretations from the machinery we built up until now. For example, \(\mathsf{ID}_U \circ \mathsf{ID}_U\) will not be strictly speaking identical with \(\mathsf{ID}_U\). We will obtain a category, when we divide out a suitable equivalence among interpretations. Below we will consider five kinds of equivalence that will give us five different categories. One important point of the categories is that isomorphism in each of them defines a salient notion of sameness of theories.
1.4.1 Provable equivalence of interpretations
Two interpretations are provably equivalent when the target theory thinks they are the same. Specifically, two interpretations \(K,M:U\rightarrow V\) are provably equivalent if they have the same dimension, say m, and:
-
\(V\vdash \forall \mathbf {x} ( \delta _K(\mathbf {x})\leftrightarrow \delta _{M}(\mathbf {x}))\),
-
\(V \vdash \forall \mathbf {x}_0,\ldots ,\mathbf {x}_{n-1}{\in } \delta _K (R_K(\mathbf {x}_0,\ldots ,\mathbf {x}_{n-1})\leftrightarrow R_{M}(\mathbf {x}_0,\ldots ,\mathbf {x}_{n-1}))\).
Modulo this identification, the operations identity and composition give rise to a category \(\mathsf{INT}_0\), where the theories are objects and the interpretations arrows. Isomorphism in this category is synonymy or definitional equivalence. This is the strictest notion of identity between theories in the literature. It was first introduced by de Bouvère (1965a, 1965b).
Let MOD be the category with as objects classes of models and as morphisms all functions between these classes. We define \(\textsf {Mod}(U)\) as the class of all models of U. Suppose \(K:U\rightarrow V\). Then, \(\textsf {Mod}(K)\) is the function from \(\textsf {Mod}(V)\) to \(\textsf {Mod}(U)\) given by: \({\mathcal {M}} \mapsto \widetilde{K}({{\mathcal {M}}}) := \widetilde{\tau _K}({{\mathcal {M}}})\). It is clear that Mod is a contravariant functor from \(\mathsf{INT}_0\) to MOD.Footnote 16
1.4.2 Maps between interpretations
For many applications provable equivalence is too strict. A better notion is provable isomorphism or i-isomorphism.
Consider \(K,M:U\rightarrow V\). Suppose K is m-dimensional and M is k-dimensional. An i-isomorphism between interpretations \(K,M:U\rightarrow V\) is given by a formula F with \(m+k\) free variables in the language of V.Footnote 17 We demand that V verifies that “F is an isomorphism between K and M”, or, equivalently, that, for each model \({\mathcal {M}}\) of V, the function \(F^{{\mathcal {M}}}\) is an isomorphism between \(\widetilde{K}({\mathcal {M}})\) and \(\widetilde{M}({\mathcal {M}})\).
We spell out the syntactical definition of an i-isomorphism \(F: K \Rightarrow M\).
-
\(V\vdash \mathbf {x}\mathrel {F}\mathbf {y} \rightarrow (\mathbf {x} \in \delta _K \wedge \mathbf {y} \in \delta _M)\).
-
\(V\vdash (\mathbf {x}=_K \mathbf {u} \wedge \mathbf {u} \mathrel {F} \mathbf {v} \wedge \mathbf {v} =_M\mathbf {y}) \rightarrow \mathbf {x}\mathrel {F}\mathbf {y}\).
-
\(V\vdash \forall \mathbf {x} \in \delta _K \exists \mathbf {y} \in \delta _M \ \mathbf {x}\mathrel {F} \mathbf {y}\).
-
\(V\vdash \forall \mathbf {y} \in \delta _M \exists \mathbf {x} \in \delta _K \ \mathbf {x}\mathrel {F} \mathbf {y}\).
-
\(V\vdash (\mathbf {x}_0\mathrel {F}\mathbf {y}_0 \wedge \ldots \wedge \mathbf {x}_{n-1}\mathrel {F}\mathbf {y}_{n-1}) \rightarrow (R_K(\mathbf {x}_0,\ldots ,\mathbf {x}_{n-1}) \leftrightarrow R_M(\mathbf {y}_0,\ldots ,\mathbf {y}_{n-1}))\).
Here the last item includes identity in the role of R!
Two interpretations \(K,M:U\rightarrow V\), are i-isomorphic iff there is an i-isomorphism between K and M. Wilfrid Hodges calls this notion: homotopy. See Hodges (1993, p. 222).
We can also define the notion of being i-isomorphic semantically. The interpretations \(K,M:U\rightarrow V\), are i-isomorphic iff there is an F such that, for all V-models, \({\mathcal {M}}\), the relation \(F^{{\mathcal {M}}}\) is an isomorphism between \(\widetilde{K}({{\mathcal {M}}})\) and \(\widetilde{M}({{\mathcal {M}}})\).
The default in this paper is that theories have finite signature: in this case we have a third characterization. The interpretations \(K,M:U\rightarrow V\), are i-isomorphic iff, for every V-model \({\mathcal {M}}\), there is an \({\mathcal {M}}\)-definable isomorphism between \(\widetilde{K}({{\mathcal {M}}})\) and \(\widetilde{M}({{\mathcal {M}}})\). This characterization follows by a simple compactness argument.
Clearly, if K and M are provably equivalent in the sense of the previous subsubsection, they will be i-isomorphic. The notion of i-isomorphism gives rise to a category of interpretations modulo i-isomorphism. We call this category \(\mathsf{INT}_1\).
Isomorphism in \(\mathsf{INT}_1\) is bi-interpretability. Bi-interpretability is a very good notion of sameness that preserves such diverse properties as finite axiomatizability and \(\kappa \)-categoricity.
1.4.3 Isomorphism
Our third notion of sameness of the basic list is that K and M are the same if, for all models \({\mathcal {M}}\) of V, the internal models \(\widetilde{K}({{\mathcal {M}}})\) and \(\widetilde{M}({{\mathcal {M}}})\) are isomorphic. We will simply say that K and M are isomorphic. Clearly, i-isomorphism implies isomorphism. We call the associated category \(\mathsf{INT}_2\). Isomorphism in \(\mathsf{INT}_2\) is iso-congruence.
1.4.4 Elementary equivalence
The fourth notion is to say that two interpretations K and M are the same if, for each model \({\mathcal {M}}\) of V, the internal models \(\widetilde{K}({{\mathcal {M}}})\) and \(\widetilde{M}({{\mathcal {M}}})\) are elementarily equivalent. We will say that K and M are elementarily equivalent.
By the completeness theorem, we easily see that this notion can be alternatively defined by saying that K is elementarily equivalent to M iff, for all U-sentences A, we have \(V\vdash A^K\leftrightarrow A^{M}\). It is easy to see that isomorphism implies elementary equivalence. We call the associated category \(\mathsf{INT}_3\). Isomorphism in \(\mathsf{INT}_3\) is elementary congruence or sentential congruence.
1.4.5 Identity of source and target
Finally, we have the option of abstracting away from the specific identity of interpretations completely, simply counting any two interpretations \(K,M:U\rightarrow V\) the same. The associated category is \(\mathsf{INT}_4\). This is simply the structure of degrees of (global) interpretability \(\mathsf{DEG}_\mathsf{glob}\). Isomorphism in \(\mathsf{INT}_4\) is mutual interpretability.
1.5 Global and local interpretability
We can view interpretability as a generalization of provability. When we take this standpoint, we write:
-
\(U \rhd V\) (or \(V \lhd U\)) for: U interprets V (or V is interpretable in U).
-
\(U \equiv V\) for: U and V are mutually interpretable.
A closely related notion is local interpretability. We define
-
U locally interprets V or \(U \rhd _\mathsf{loc} V\) iff, for every finitely axiomatized subtheory \(V_0\) of V, we have \(U \rhd V_0\).
-
We write \(V \lhd _\mathsf{loc} U\) and \(U \equiv _\mathsf{loc} V\) with the obvious meanings.
If we want to stress the contrast between local and ordinary interpretability, we often call ordinary interpretability global interpretability. We will write \(\rhd _\mathsf{glob}\), etcetera. The degrees of local interpretability are \(\mathsf{DEG}_\mathsf{loc}\).
Example 1
Let \(\mathbb {2}\) be the theory in the language of identity that says that there are precisely two elements. Let INF be the theory in the language of identity that has for every n an axiom saying ‘there are at least n elements’. Then \(\mathbb {2} \rhd _\mathsf{loc} \mathsf{INF}\) but \(\mathbb {2} \mathrel {\not \! \rhd }_\mathsf{glob} \mathsf{INF}\).
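The counting behind Example 1 can be sketched as follows. The sketch is an illustration only and the function names are hypothetical. It rests on the observation, not spelled out in the text, that a finite subtheory of INF only demands at least n elements for some n, and that the k-tuples over a two-element domain provide \(2^k\) objects, so that a suitable dimension exists for each n, while no single dimension works for all n.

```python
from itertools import product

def dimension_for(n):
    """Smallest k >= 1 with 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k

def tuple_domain(n):
    """All k-tuples over a two-element domain, for the k chosen above."""
    return list(product((0, 1), repeat=dimension_for(n)))

# 'At least 5 elements' is witnessed by the 8 triples over {0, 1}.
triples = tuple_domain(5)
```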
1.6 Weak interpretability or tolerance
We say that a theory U weakly interprets a theory V, or that U tolerates V if, for some translation \(\tau :\varSigma _V \rightarrow \varSigma _U\), the theory \(U+V^\tau \) is consistent. Here we take V to contain the axioms of identity including \(\exists x\, x=x\). We note that:
-
U tolerates V, iff for some consistent extension (in the same language) \(U^{\prime }\) of U, we have \(U^{\prime }\rhd V\).
-
If U tolerates V and \(V\rhd W\), then U tolerates W.
1.7 Adding parameters
We can add parameters in the obvious way. An interpretation \(K:U\rightarrow V\) with parameters will have a k-dimensional parameter domain \(\alpha \) (given by a formula in k variables), where \(V\vdash \exists \mathbf {x} \alpha \mathbf {x}\). We allow the extra variables \(\mathbf {x}\) to occur in the translations of the U-formulas. We sometimes write \(K^{\mathbf {x}}\) to make the dependence on the parameters visible. We write \(A^{K,\mathbf {x}}\) for the K-translation of A for parameters \(\mathbf {x}\).
The condition for K to be an interpretation changes into: \(V \vdash \forall \mathbf {x} (\alpha \mathbf {x} \rightarrow A^{K,\mathbf {x}})\), for all sentences A such that \(U \vdash A\). Note that this automatically takes care of the axioms of identity and the non-emptiness of the domain.
Notions like direct interpretation are lifted in the obvious way to the case where we allow parameters.
We note that, in the presence of parameters, the functor \(\widetilde{K}\) associates a class of models of U to a model of V.
Similar adaptations are needed to define i-isomorphisms with parameters.
1.8 Piecewise interpretability
The idea of piecewise interpretability is that we can build up the domain from a number of pieces that may or may not be of the same dimension and that may or may not overlap. The same object of the interpreting theory may occur in different roles posing as different objects of the interpreted theory. We will not develop piecewise interpretability here. We just give an example of how it works.
Suppose we have two pieces a and b. Let’s say that a is 1-dimensional and b is 2-dimensional. Suppose we have no parameters.
We want to translate \(P(u_3,u_1)\). How are we going to do it? Well, we need to know in what pieces \(u_1\) and \(u_3\) are supposed to be. We need a function g as argument in the translation as an oracle that tells us precisely that. Suppose that, according to g, \(u_1\) is in piece a and \(u_3\) is in piece b. We note that g in combination with our formula \(P(u_3,u_1)\) places b on the first argument place and a on the second argument place. So, F must tell us the translation for P when the first argument is b and the second argument is a. Let f be a function that associates pieces to argument places. Thus, we need F(P, f), where \(f(0)=b\) and \(f(1)=a\). We may choose F(P, f) to be A(u, v, w), where A is dependent on f. We may view \(P(u_3,u_1)\) as given by P plus a function h that associates variables to argument places. Thus \((P(u_3,u_1))^{\tau ,g} := F(P, g\circ h)(u_{30},u_{31},u_{10})\).
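The clause for the universal quantifier for the variable \(u_2\) is a conjunction over the pieces j: for each piece j, we quantify universally over the variables that represent \(u_2\) in piece j, relativized to the domain of j, and we translate the body using \(g[u_2:j]\), where \(g[u_2:j]\) is the result of setting g at \(u_2\) to j.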
So we handle quantification over the different pieces simply by conjunction over the quantifications of each piece separately.
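The atomic clause and the quantifier clause of the example can be sketched in Python. The sketch is an illustration only: the encoding of formulas as tuples, the output syntax, and the names `delta_a`, `delta_b` and `P_ba` (for the translation of P when the first argument is in piece b and the second in piece a) are assumptions of the sketch and do not occur in the text.

```python
# Pieces and their dimensions, as in the running example.
PIECES = {"a": 1, "b": 2}

def coords(var, piece):
    """Variables u_{i0}, u_{i1}, ... that stand for `var` in `piece`."""
    return [f"{var}{j}" for j in range(PIECES[piece])]

def translate(formula, g):
    """Translate a formula under the piece assignment g (dict: variable -> piece)."""
    kind = formula[0]
    if kind == "atom":                      # ("atom", "P", ["u3", "u1"])
        _, pred, args = formula
        places = "".join(g[v] for v in args)   # pieces per argument place
        flat = [c for v in args for c in coords(v, g[v])]
        return f"{pred}_{places}({', '.join(flat)})"
    if kind == "forall":                    # ("forall", "u2", body)
        _, var, body = formula
        parts = []
        for piece in PIECES:
            g2 = {**g, var: piece}          # g with var reassigned to this piece
            xs = coords(var, piece)
            quants = "".join(f"forall {x}. " for x in xs)
            parts.append(
                f"{quants}(delta_{piece}({', '.join(xs)}) -> {translate(body, g2)})"
            )
        return "(" + " & ".join(parts) + ")"
    raise ValueError(f"unknown formula kind: {kind}")

atom = translate(("atom", "P", ["u3", "u1"]), {"u3": "b", "u1": "a"})
quantified = translate(("forall", "u2", ("atom", "Q", ["u2"])), {})
```

Here `atom` mentions the variables `u30, u31, u10`, in agreement with the example, and `quantified` is a conjunction with one conjunct per piece.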
We can miraculously conjure any finite number of elements out of nothing, simply by creating sufficiently many pieces containing one element.
If the target theory V proves that we have at least two elements, we can always replace a piecewise interpretation by a many-dimensional piece-less one modulo i-isomorphism.
1.9 Sums
We define the operation \(\boxplus \) on theories as follows. The signature of \(U \boxplus V\) is the disjoint union of the signatures of U and V, plus two new unary predicates \(\triangle _0\) and \(\triangle _1\). The axioms of \(U\boxplus V\) are:
-
\(P(x_0,\ldots , x_{n-1}) \rightarrow \bigwedge _{i<n} \triangle _0(x_i)\), if P is derived from the signature of U,
-
\(Q(y_0,\ldots , y_{m-1}) \rightarrow \bigwedge _{j<m} \triangle _1(y_j)\), if Q is derived from the signature of V,
-
the axioms of U relativized to \(\triangle _0\),
-
the axioms of V relativized to \(\triangle _1\),
-
\(\forall x (\triangle _0(x) \vee \triangle _1(x))\),
-
\(\forall x \lnot (\triangle _0(x) \wedge \triangle _1(x))\).
We treat identity as outside of the signature here. We have the ordinary theory of identity.
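On the semantic side, the axioms of \(U \boxplus V\) describe disjoint sums of models. The following sketch builds such a sum for finite relational models and checks the axioms that do not depend on U or V. It is an illustration only: the function names, the tagging of elements by 0 and 1, and the predicate names `tri0` and `tri1` (standing for \(\triangle _0\) and \(\triangle _1\)) are assumptions of the sketch.

```python
def sum_model(dom_u, rels_u, dom_v, rels_v):
    """Disjoint sum of two finite relational models.

    dom_*  : lists of elements
    rels_* : dict mapping a predicate name to a set of tuples over dom_*
    The predicate names of the two models are assumed to be distinct.
    """
    def tag(i, t):
        return tuple((i, x) for x in t)

    dom = [(0, x) for x in dom_u] + [(1, y) for y in dom_v]
    rels = {
        "tri0": {(e,) for e in dom if e[0] == 0},
        "tri1": {(e,) for e in dom if e[0] == 1},
    }
    rels.update({p: {tag(0, t) for t in ts} for p, ts in rels_u.items()})
    rels.update({q: {tag(1, t) for t in ts} for q, ts in rels_v.items()})
    return dom, rels

def check_sorting_axioms(dom, rels, names_u, names_v):
    """Check the partition axioms and the typing axioms for the predicates."""
    in0 = {e for (e,) in rels["tri0"]}
    in1 = {e for (e,) in rels["tri1"]}
    partition = all((e in in0) != (e in in1) for e in dom)
    u_ok = all(set(t) <= in0 for p in names_u for t in rels[p])
    v_ok = all(set(t) <= in1 for q in names_v for t in rels[q])
    return partition and u_ok and v_ok

dom, rels = sum_model([0, 1], {"P": {(0, 1)}}, ["a"], {"Q": {("a",)}})
```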
We note that \(U \boxplus V\) is synonymous with \(V\boxplus U\) and \((U \boxplus V) \boxplus W\) is synonymous with \(U \boxplus (V \boxplus W)\) and that both are synonymous with the ternary sum \(\boxplus (U,V,W)\) which is defined in the obvious way using \(\triangle _0\), \(\triangle _1\) and \(\triangle _2\).
We show that \(\boxplus \) is the sum in the categories \(\mathsf{INT}_i\) for \(1\le i \le 4\) on the assumption that one of the theories U and V proves that there are at least two elements. We remind the reader of the sum diagram.
![](http://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Equ29_HTML.gif)
The arrow \(\mathsf{in}_0\) interprets U in \(U \boxplus V\) by relativization to \(\triangle _0\). We note that, by our conventions we should take \(x =_{\mathsf{in}_0} y\) iff \(\triangle _0(x) \wedge \triangle _0(y) \wedge x=y\). The other predicate symbols do not need this addition. The definition of \(\mathsf{in}_1\) is similar.
Suppose \(K: U \rightarrow W\) and \(M: V \rightarrow W\). We suppose further that one of U and V proves that there are at least two elements. As a first step we make the domains of K and M in W disjoint and of the same dimension. We note that W proves that there are at least two elements. This follows from our assumption that at least one of U and V proves that there are at least two elements. Let the dimension of \(\delta _K\) be k and the dimension of \(\delta _M\) be m. Suppose, e.g. \(k \le m\). We replace \((x_0,\ldots , x_{k-1})\) in \(\delta _K\) by \((z,z,w_0,\ldots , w_{m-k-1}, x_0,\ldots , x_{k-1})\) in \(\delta _{K^{\prime }}\), where z and the \(w_i\) can be arbitrary. We replace \((y_0,\ldots , y_{m-1})\) in \(\delta _M\) by \((z,z^{\prime }, y_0,\ldots , y_{m-1})\) in \(\delta _{M^{\prime }}\), where z, \(z^{\prime }\) can be arbitrary under the constraint that \(z \ne z^{\prime }\). Still under the assumption that \(k \le m\), we define \((z,z,w_0,\ldots , w_{m-k-1}, x_0,\ldots , x_{k-1})\mathrel {=_{K^{\prime }}} (z^{\prime },z^{\prime },w^{\prime }_0,\ldots , w^{\prime }_{m-k-1}, x^{\prime }_0,\ldots , x^{\prime }_{k-1})\) iff \(( x_0,\ldots , x_{k-1}) \mathrel {=_K} (x^{\prime }_0,\ldots , x^{\prime }_{k-1})\). Etcetera. Let the newly obtained interpretations (considered as syntactical objects) be \(K^{\prime }\) and \(M^{\prime }\). As is easily seen \(K^{\prime }\) and \(M^{\prime }\) have disjoint domains and the same dimensions and are equivalent to K, respectively M in each of \(\mathsf{INT}_i\), where \(i=1,2,3,4\).
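The padding step can be sketched on tuples. The sketch is an illustration only, under the assumption \(k \le m\); the function names `pad_K` and `pad_M` are hypothetical. Both padded tuples have length \(m+2\), and the two kinds of tuples are told apart by whether their first two coordinates coincide.

```python
def pad_K(x, z, w, m):
    """Pad a k-tuple x to length m + 2: (z, z, w_0, ..., w_{m-k-1}, x)."""
    k = len(x)
    assert k <= m and len(w) == m - k
    return (z, z) + tuple(w) + tuple(x)

def pad_M(y, z, z2):
    """Pad an m-tuple y to length m + 2: (z, z', y), where z != z'."""
    assert z != z2
    return (z, z2) + tuple(y)

# k = 1 and m = 3: both results have length 5.
px = pad_K((7,), 0, (5, 5), 3)
py = pad_M((1, 2, 3), 0, 1)
```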
The interpretation \([K^{\prime },M^{\prime }]\) is the obvious one where the new domain is the union of the domains of \(K^{\prime }\) and \(M^{\prime }\). It is easy to see that, in each of the categories \(\mathsf{INT}_i\), for \(i=1,2,3,4\), the arrow \([K^{\prime },M^{\prime }]\) is unique with the desired property. Hence \(U \boxplus V\) is indeed the sum in all of these categories.
As we have seen, the sum construction only works when at least one of the summands has provably two elements in its domain. If we have piecewise interpretations, we would not need this assumption. In this case the construction would become much simpler, since both padding and making disjoint become superfluous. There is an alternative construction of a sum \(U \oplus V\) in Mycielski et al. (1990) or Stern (1989). This alternative construction is, for many purposes, more convenient. The reason that we use \(\boxplus \) in this paper is that in Theorem 3 we obtain synonymy for \(\boxplus \). (The theories \(U \oplus V\) and \(U \boxplus V\) are bi-interpretable but not synonymous.)
We have the following basic theorem.
Theorem 15
Consider the theory \(W := U\boxplus V\). Consider any formula \(A\mathbf {x}\mathbf {y}\) in the language of W. Then, there are formulas \(B_i\mathbf {x}\) in the language of U and formulas \(C_j\mathbf {y}\) in the language of V, such that \(A\mathbf {x}\mathbf {y}\) is equivalent to a boolean combination of \(B_i^{\mathsf{in}_0}\mathbf {x}\) and \(C_j^{\mathsf{in}_1}\mathbf {y}\) in the theory \(W + \bigwedge _k \; \triangle _0(x_k) + \bigwedge _\ell \; \triangle _1(y_\ell )\).Footnote 18
Proof
The proof of the theorem is by a simple induction on A. \(\square \)
An important notion that is defined in terms of the notion of sum is connectedness. We say that a theory W is connected if, for any theories U and V, if \((U \boxplus V) \rhd _\mathsf{loc} W\), then \(U \rhd _\mathsf{loc} W\) or \(V \rhd _\mathsf{loc} W\). The following fundamental theorem is due to Pudlák (1983). It was reproved with a markedly different proof by Stern (1989). For more context, see also Mycielski et al. (1990).
Theorem 16
Every sequential theory is connected.
The notion of sequentiality is introduced in Appendix 2.
Appendix 2: Pair theories and sequential theories
Pair theories and sequential theories are theories in which containers or data structures of a certain kind are present for all objects of a given theory. The presence of such containers provides many good properties for such theories. For example, recursively enumerable pair theories can be axiomatised by a scheme. See Visser (2012). Sequential theories are locally reflexive due to the presence of partial satisfaction predicates. We refer the reader to Visser (2013) for more information about poly-sequential theories.
1.1 Basic definitions
We consider the theory of non-surjective unordered pairing PAIR and adjunctive set theory AS. The language of both theories has just a binary predicate symbol \(\in \) (in addition to identity that is standardly available). The theory PAIR is axiomatised as follows:
-
PAIR1. \(\vdash \exists x \forall y y \not \in x\)
-
PAIR2. \(\vdash \exists z \forall u (u\in z \leftrightarrow (u=x \vee u=y))\)
The theory AS is given by:
-
AS1. \(\vdash \exists x \forall y y \not \in x\)
-
AS2. \(\vdash \exists z \forall u (u\in z \leftrightarrow (u\in x \vee u=y))\)
A theory is poly-sequential if it directly interprets AS. A theory is a poly-pair theory if it directly interprets PAIR. A theory is sequential if it directly interprets AS via a direct 1-dimensional interpretation. A theory is a pair theory if it directly interprets PAIR via a 1-dimensional interpretation.
Below we will mainly develop the poly-pair case. The proofs in the sequential case are entirely analogous.
We note that using the Kuratowski pairing we can obtain (non-extensional) ordered pairs in PAIR. Using iterated pairing we can obtain non-extensional sequences of length n, for each natural number n. Thus, we have a predicate \(\mathsf{SEQ}_n\) and projection relations \(\pi _i\) for each \(i<n\). We define
One can show that:
-
(i)
\(\vdash \exists x \forall \mathbf {y}\quad \mathbf {y} \not \in _n x\)
-
(ii)
\(\vdash \exists z \forall \mathbf {u}\quad (\mathbf {u}\in _n z \leftrightarrow (\mathbf {u}= \mathbf {x} \vee \mathbf {u}=\mathbf {y}))\)
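The Kuratowski pairing and the iterated pairing mentioned above can be illustrated in Python. The sketch is an illustration only and the function names are hypothetical. Note that Python's `frozenset` is extensional, whereas the pairs provided by PAIR need not be; the sketch therefore only shows how ordered pairs and sequences arise from unordered pairs.

```python
def kpair(x, y):
    """Kuratowski pair {{x}, {x, y}}, built from unordered pairs."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def first(p):
    """The element common to all members of p."""
    (x,) = frozenset.intersection(*p)
    return x

def second(p):
    """The other element, or the first one again when the pair is (x, x)."""
    rest = frozenset.union(*p) - {first(p)}
    return next(iter(rest)) if rest else first(p)

def seq(*xs):
    """Sequence of length n >= 1 by iterated pairing: (x_0, (x_1, (..., x_{n-1})))."""
    return xs[0] if len(xs) == 1 else kpair(xs[0], seq(*xs[1:]))

p = kpair(1, 2)
s = seq(1, 2, 3)
```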
1.2 Preservation under theory extension
Since direct interpretations are closed under composition, each theory that directly interprets a poly-pair theory is itself a poly-pair theory. Obviously, the identical embedding of a theory in an extension-in-the-same-language is direct. Ergo, being a poly-pair theory is preserved under extension-in-the-same-language. Similar remarks hold for poly-sequential theories.
1.3 Preservation under bi-interpretations
We show a slightly stronger result.
Theorem 17
Let U be a poly-pair theory and suppose that V is a retraction in \(\mathsf{INT}_1\) of U. Then, V is a poly-pair theory.
Proof
To simplify the argument inessentially we ignore parameters.
Let \(K:U \rightarrow V\) and \(M: V \rightarrow U\) and let F be an i-isomorphism from \(\mathsf{ID}_V\) to \(K\circ M\) witnessing the retraction. We assume that K is k-dimensional and that M is m-dimensional. Let \((\cdot )^\star \) be the interpretation of PAIR in U. Say \((\cdot )^\star \) is \(\ell \)-dimensional. We define a \(k\ell \)-dimensional interpretation \((\cdot )^\dag \) of PAIR in V as follows:
Here:
-
\(\mathbf {v}\) has length \(k\ell \).
-
\(\mathbf {w}\) has length \(k^2\ell m\). It is of the form \(\mathbf {w}_0,\ldots ,\mathbf {w}_{k\ell -1}\), where each \(\mathbf {w}_i\) has length km and \(\mathbf {v}_i \mathrel F \mathbf {w}_i\). The \(\mathbf {w}_i\) stand for elements of \(\delta _{K\circ M}\). Note that the definition of an i-isomorphism forces the \(\mathbf {w}_i\) to be in \(\delta _{K\circ M}\).
-
\(\mathbf {u}\) has length \(k\ell \). It stands for a sequence \(\mathbf {u}_0, \ldots , \mathbf {u}_{\ell -1}\), where the \(\mathbf {u}_j\) have length k. The \(\mathbf {u}_j\) stand for elements of \(\delta _K\). (In case they do not, we don’t care. It is sufficient for our purposes that the ‘correct’ sequences \(\mathbf {u}\) provide all the unordered pairs we want.)
-
The \(\in ^\star _{km}\) lives inside K. Here its first component is \(km\ell \) dimensional and its second component \(\ell \)-dimensional. Looking at it from the outer level of \(\mathsf{ID}_V\), the first component acquires dimension \(k^2\ell m\) and its second component acquires dimension \(k\ell \).
A moment’s reflection shows that our definition indeed gives us an interpretation of PAIR in V. We note that literally the same construction works for AS. \(\square \)
1.4 Elimination of parameters
We show that we can eliminate the parameters from a witness of the property of being a poly-pair theory.
Theorem 18
Suppose U is a poly-pair theory. Then, U is a poly-pair theory via a direct interpretation without parameters.
Proof
Consider any poly-pair theory U. Let \((\cdot )^\star \) be the witnessing interpretation of PAIR in U. Suppose \((\cdot )^\star \) is k-dimensional and that \((\cdot )^\star \) has an \(\ell \)-dimensional parameter domain \(\alpha \). Let \(ak+b = k + \ell \), where a and b are natural numbers and \(b < k\). We define a parameter-free interpretation \((\cdot )^\dag \) of dimension \(k+\ell \) as follows.
Here \(\mathbf {x}\) has length \(k+\ell \), i.e. \(ak+b\). The block \(\mathbf {z}\) is just padding and has length \( k-b\). Thus, the length of \((\mathbf {x},\mathbf {z})\) is \((a+1)k\). Finally, the length of \(\mathbf {y}\) is k and, therefore, the length of \((\mathbf {p},\mathbf {y})\) is \(k+ \ell \).
We can easily verify that \(\in ^\dag \) yields an interpretation of PAIR. The argument in the case of AS is very similar. \(\square \)
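The bookkeeping of lengths in the proof of Theorem 18 can be checked mechanically. The sketch is an illustration only and the function name `padding_lengths` is hypothetical. It computes a and b from \(ak+b = k+\ell \) with \(b<k\), and the length \(k-b\) of the padding block \(\mathbf {z}\).

```python
def padding_lengths(k, ell):
    """Return (a, b, len_z) with a*k + b == k + ell, 0 <= b < k, len_z == k - b."""
    a, b = divmod(k + ell, k)
    return a, b, k - b

# k = 3 and ell = 4: x has length 7, z has length 2, and (x, z) has length 9 = (a + 1) * k.
a, b, len_z = padding_lengths(3, 4)
```

In every case the length of \((\mathbf {x},\mathbf {z})\) is \((k+\ell ) + (k-b) = (a+1)k\), a multiple of k, as stated in the proof.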
Appendix 3: The converse of the Pudlák property fails
The following theorem is one of these utterly strange cases where the fact proven seems totally obvious, but where one still has to work to obtain the desired result. It would be interesting to prove the theorem using the quantifier elimination for EQ.
Theorem 19
Let U and V be theories of finite signature. We suppose that U proves that there are at least two elements.Footnote 19 Suppose V includes the theory of linear order for, say, <. Then any interpretation \(K: V\rightarrow (U \boxplus \mathsf{EQ})\) (with parameters) is i-isomorphic in \(U \boxplus \mathsf{EQ}\) to an interpretation with parameters \(K^\star :V \rightarrow (U\boxplus \mathsf{EQ})\), where both the parameter domain and the object domain consist of sequences from the U-domain \(\triangle _0\).
Proof
Consider any model \({\mathcal {M}}\boxplus {\mathcal {I}}\) of \(U\boxplus \mathsf{EQ}\). We work in \({\mathcal {M}}\). We have the U-domain \(\triangle _0\) and the EQ-domain \(\triangle _1\). We use \(a,b,c,\ldots \) to range over \(\triangle _0\), we use \(x,y,z,\ldots \) to range over \(\triangle _1\) and we use \(u,v,w, \ldots \) to range over the mixed domain.
Suppose that K is an m-dimensional interpretation and that K has a parameter domain that is \(\ell \)-dimensional. Let s be the smallest number such that \(2^s \ge \ell + m +2\).
We define, for any n, a formula \(G_n(\mathbf {a}, \mathbf {u})\), where the length of \(\mathbf {a}\) is sn and the length of \(\mathbf {u}\) is n. Here we intend to only consider \(G_n\) for \(n\le \ell +m\). The idea behind the formula \(G_n\) is that it represents the relation ‘\(\mathbf {a}\) mimics \(\mathbf {u}\)’.
-
\(\mathbf {a} \mathrel {G_n} \mathbf {u}\) iff
-
(i)
whenever \(u_i\) is in \(\triangle _0\), then \(a_{si}= a_{si+1} = \cdots = a_{si +s-1} = u_i\);
-
(ii)
whenever \(u_i\) is in \(\triangle _1\), then there are j and \(j^{\prime }\) such that \( j< j^{\prime } < s\) and \(a_{si+j} \ne a_{si+j^{\prime }}\);
-
(iii)
whenever \(u_i\) and \(u_j\) are in \(\triangle _1\), then:
$$\begin{aligned} u_i = u_j\quad \hbox {iff}\quad&(a_{si},\ldots , a_{si+s-1})=\\&(a_{sj},\ldots , a_{sj+s-1}). \end{aligned}$$
We define a translation \(\tau ^\star \).
-
\(\alpha _{\tau ^\star }(\mathbf {a}) :\leftrightarrow \exists \mathbf {u} (\mathbf {a} \mathrel {G_{\ell }} \mathbf {u} \wedge \alpha _{\tau _K}(\mathbf {u}))\),
-
\(\delta _{\tau ^\star }^{\mathbf {a}}(\mathbf {b}) :\leftrightarrow \exists \mathbf {u} \exists \mathbf {v} (\mathbf {a}\mathbf {b} \mathrel {G_{\ell +m}} \mathbf {u} \mathbf {v} \wedge \delta _{\tau _K}^{\mathbf {u}}(\mathbf {v}))\),
-
$$\begin{aligned}&P_{\tau ^\star }^{\mathbf {a}}(\mathbf {b}_0, \ldots , \mathbf {b}_{p-1}) :\leftrightarrow \exists \mathbf {u} \exists \mathbf {v}_0 \cdots \exists \mathbf {v}_{p-1}\\&\quad \quad \left( \mathbf {a} \mathbf {b}_0 \mathrel {G_{\ell +m}} \mathbf {u} \mathbf {v}_0 \wedge \cdots \wedge \mathbf {a} \mathbf {b}_{p-1} \mathrel {G_{\ell +m}} \mathbf {u} \mathbf {v}_{p-1}\right. \\&\qquad \left. \wedge P_{\tau _K}(\mathbf {v}_0,\cdots , \mathbf {v}_{p-1})\right) . \end{aligned}$$
We define the relation R between the parameters and the corresponding i-isomorphism F as follows:
-
\(\mathbf {a} \mathrel {R} \mathbf {u} :\leftrightarrow \mathbf {a} \mathrel {G_\ell } \mathbf {u} \wedge \alpha _{\tau _K} (\mathbf {u})\).
-
\(\mathbf {b} \mathrel {F^{\mathbf {a}, \mathbf {u}}} \mathbf {v} :\leftrightarrow \mathbf {a}\mathbf {b} \mathrel {G_{\ell + m}} \mathbf {u} \mathbf {v} \wedge \delta _{\tau _K}^{\mathbf {u}}(\mathbf {v})\).
Our first order of business is to show that R is total and surjective between \(\alpha _{\tau ^\star }\) and \(\alpha _{\tau _K}\). That it is total is immediate from the definition of \(\alpha _{\tau ^\star }\). We show that it is surjective. Consider any \(\mathbf {u}\) in \(\alpha _{\tau _K}\). To find a \(G_\ell \)-counterpart \(\mathbf {a}\), we have only to match the pattern of identity versus non-identity between the \(\triangle _1\)-elements of \(\mathbf {u}\). To do that we need at most \(\ell \) different sequences \(\mathbf {c}\) of length s that represent \(\triangle _1\)-elements: in the most demanding case the elements of \(\mathbf {u}\) are all in \(\triangle _1\) and are all different. Since \(\triangle _0\) contains at least two elements, we have at least \(2^s\) sequences of length s. (We note that we do not need to introduce extra parameters for the two elements, since we are only interested in sameness and difference of these sequences.) Since the constant sequences are reserved for representing elements from \(\triangle _0\), we need that \(2^s \ge \ell +2\). By our choice of s this is true.
Suppose we fix parameters \(\mathbf {a}\) and \(\mathbf {u}\) such that \(\mathbf {a} \mathrel {R} \mathbf {u}\). We want to show that \(F^{\mathbf {a}, \mathbf {u}}\) is a bijection between \(\delta ^{\mathbf {a}}_{\tau ^\star }\) and \(\delta ^{\mathbf {u}}_{\tau _K}\) modulo the respective identities of \(\tau ^{\star ,\mathbf {a}}\) and \(\tau _K^{\mathbf {u}}\).
We first prove that \(F^{\mathbf {a}, \mathbf {u}}\) is total. Consider \(\mathbf {b}\) in \(\delta ^{\mathbf {a}}_{\tau ^\star }\). By definition, there are \(\mathbf {u}^{\prime }\) and \(\mathbf {v}^{\prime }\) such that \(\mathbf {a} \mathbf {b} \mathrel {G_{\ell + m}} \mathbf {u}^{\prime } \mathbf {v}^{\prime }\). There is an automorphism \(\sigma \) of \({\mathcal {I}}\) that maps \(\triangle _1\)-elements of \(\mathbf {u}^{\prime }\) to the corresponding elements of \(\mathbf {u}\). Take \(\mathbf {v} := \sigma (\mathbf {v}^{\prime })\). It follows that \(\mathbf {b} \mathrel {F^{\mathbf {a},\mathbf {u}}} \mathbf {v}\).
We prove that \(F^{\mathbf {a}, \mathbf {u}}\) is surjective. Consider \(\mathbf {v} \in \delta _{\tau _K}^{\mathbf {u}}\). We need to find a \(\mathbf {b}\) such that \(\mathbf {a} \mathbf {b} \mathrel {G_{\ell + m}} \mathbf {u} \mathbf {v}\). For this, the sameness-difference pattern of the blocks of \(\mathbf {a} \mathbf {b}\) that represent \(\triangle _1\)-elements must match the given pattern of the \(\triangle _1\)-elements of \(\mathbf {u} \mathbf {v}\). Since we have chosen \(2^s \ge \ell + m +2\) we can always do this.
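The counting that underlies the choice of s can be checked as follows. The sketch is an illustration only and the function names are hypothetical. It uses just two distinct elements, written 0 and 1, and verifies that the number of non-constant blocks of length s is at least \(\ell +m\), which is the largest number of distinct \(\triangle _1\)-elements that may have to be represented.

```python
from itertools import product

def smallest_s(ell, m):
    """Smallest s with 2**s >= ell + m + 2."""
    s = 0
    while 2 ** s < ell + m + 2:
        s += 1
    return s

def nonconstant_blocks(s):
    """Blocks of length s over two distinct elements that are not constant."""
    return [t for t in product((0, 1), repeat=s) if len(set(t)) > 1]

# For ell = 3 and m = 2 we get s = 3 and 2**3 - 2 = 6 non-constant blocks.
s = smallest_s(3, 2)
blocks = nonconstant_blocks(s)
```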
We prove that \(F^{\mathbf {a}, \mathbf {u}}\) is functional. Suppose \(\mathbf {b} =_{\tau ^\star }^{\mathbf {a}} \mathbf {b}^{\prime }\) and \(\mathbf {b} \mathrel {F^{\mathbf {a},\mathbf {u}}} \mathbf {v}\) and \(\mathbf {b}^{\prime } \mathrel {F^{\mathbf {a},\mathbf {u}}} \mathbf {v}^{\prime }\). We note that, for some \(\mathbf {w}\) and \(\mathbf {w}^{\prime }\), we have \(\mathbf {a}\mathbf {b} \mathrel {G_{\ell +m} }\mathbf {u} \mathbf {w}\) and \(\mathbf {a}\mathbf {b}^{\prime } \mathrel {G_{\ell +m} }\mathbf {u} \mathbf {w}^{\prime }\) and \(\mathbf {w} =_{\tau _K}^{\mathbf {u}} \mathbf {w}^{\prime }\). We claim that \(\mathbf {v} =_{\tau _K}^{\mathbf {u}} \mathbf {w}\) and, similarly, that \(\mathbf {v}^{\prime } =_{\tau _K}^{\mathbf {u}} \mathbf {w}^{\prime }\). Assuming the claim, we have \(\mathbf {v} =_{\tau _K}^{\mathbf {u}} \mathbf {w} =_{\tau _K}^{\mathbf {u}} \mathbf {w}^{\prime } =_{\tau _K}^{\mathbf {u}}\mathbf {v}^{\prime }\) and we are done. We note that \(\mathbf {u} \mathbf {v}\) and \(\mathbf {u} \mathbf {w}\) have the same sameness-difference pattern on the \(\triangle _1\) elements.
We prove the claim. Suppose, to get a contradiction, that \(\mathbf {v} \ne _{\tau _K}^{\mathbf {u}} \mathbf {w}\). It follows that either \(\mathbf {v} <_{\tau _K}^{\mathbf {u}} \mathbf {w}\) or \(\mathbf {w} <_{\tau _K}^{\mathbf {u}} \mathbf {v}\). We assume, e.g. that \(\mathbf {v} <_{\tau _K}^{\mathbf {u}} \mathbf {w}\). The other case is analogous.
Suppose \(\triangle _1\) (i.e. the domain of \({\mathcal {I}}\)) is finite. There clearly is an automorphism \(\sigma \) of \({\mathcal {I}}\) such that \(\sigma (\mathbf {u} \mathbf {v}) = \mathbf {u} \mathbf {w}\), where we leave the \(\triangle _0\)-elements in place. We have:
Since \({\mathcal {I}}\) was supposed to be finite, this is impossible.
Suppose \({\mathcal {I}}\) is infinite. We first assume that the \(\triangle _1\)-elements in \(\mathbf {v}\) and \(\mathbf {w}\) that are not in \(\mathbf {u}\) are disjoint. In this case there is an \({\mathcal {I}}\)-automorphism \(\sigma \) such that \(\sigma (\mathbf {u} \mathbf {v}) = \mathbf {u} \mathbf {w}\) and \(\sigma (\mathbf {u} \mathbf {w}) = \mathbf {u} \mathbf {v}\). It follows that:
A contradiction. So \(\mathbf {v} =_{\tau _K}^{\mathbf {u}} \mathbf {w}\). In case the \(\triangle _1\)-elements in \(\mathbf {v}\) and \(\mathbf {w}\) that do not occur in \(\mathbf {u}\) are not fully disjoint, we find a \(\mathbf {z}\) such that \(\mathbf {a}\mathbf {b} \mathrel {G_{\ell + m}} \mathbf {u} \mathbf {z}\) and the \(\triangle _1\)-elements in \(\mathbf {z}\) that are not in \(\mathbf {u}\) are disjoint from all \(\triangle _1\)-elements in \(\mathbf {u}\), \( \mathbf {v}\) and \(\mathbf {w}\). We now use our earlier argument, to see that \(\mathbf {v} =_{\tau _K}^{\mathbf {u}} \mathbf {z} =_{\tau _K}^{\mathbf {u}} \mathbf {w}\).
We prove that \(F^{\mathbf {a}\mathbf {u}}\) is injective. Suppose \(\mathbf {b} \mathrel {F^{\mathbf {a}\mathbf {u}}} \mathbf {v}\), \(\mathbf {b}^{\prime } \mathrel {F^{\mathbf {a}\mathbf {u}}} \mathbf {v}^{\prime }\) and \(\mathbf {v} =_{\tau _K}^{\mathbf {u}} \mathbf {v}^{\prime }\). Then, by the definition of \(=^{\mathbf {a}}_{\tau ^\star }\), we find: \(\mathbf {b} =_{\tau ^\star }^{\mathbf {a}} \mathbf {b}^{\prime }\).
Finally we want to show that, if \( \mathbf {b}_0\mathrel {F^{\mathbf {a} \mathbf {u}}} \mathbf {v}_0\), \(\ldots \), and \( \mathbf {b}_{p-1}\mathrel {F^{\mathbf {a} \mathbf {u}}} \mathbf {v}_{p-1}\), then:
But this is immediate by the definition of \(P_{\tau ^\star }^{\mathbf {a}}\) and the fact that F is a bijection.
We define \(K^\star :={\langle V, \tau ^\star , U\boxplus \mathsf{EQ} \rangle }\). It is easy to see that, by the Completeness Theorem, \(K^\star \) satisfies the desiderata. \(\square \)
We are now ready to prove the Pudlák property for \(U\boxplus \mathsf{EQ}\), whenever U is sequential.
Theorem 20
Suppose U is sequential. Then, we have both the Pudlák property and the strong Pudlák property for \(U\boxplus \mathsf{EQ}\).
Proof
We just consider the strong property. Suppose \(N:\mathsf{S}^1_2 \rightarrow U\boxplus \mathsf{EQ}\). We can find \(N^\star :\mathsf{S}^1_2 \rightarrow U\boxplus \mathsf{EQ}\) that is i-isomorphic to N such that the parameter domain \(\alpha \) and the object domain \(\delta \) of \(N^\star \) are entirely in \(\triangle _0\). Note that the definitions of the corresponding translations are not necessarily entirely in the U-part of the language.
Suppose the dimension of the parameter domain of \(N^\star \) is k and the dimension of \(N^\star \) is m. We expand the signature \(\varSigma _U\) of U with new symbols \(\widetilde{\alpha }\), \(\widetilde{\delta }\), \(\widetilde{\mathsf{Z}}\), \(\widetilde{\mathsf{S}}\), \(\widetilde{\mathsf{A}}\), \(\widetilde{\mathsf{M}}\). Here \(\widetilde{\alpha }\) is k-dimensional, \(\widetilde{\delta }\) is \((k+m)\)-dimensional, \(\widetilde{\mathsf{Z}}\) is \((k+m)\)-dimensional, \(\widetilde{\mathsf{S}}\) is \((k+2m)\)-dimensional, \(\widetilde{\mathsf{A}}\) is \((k+3m)\)-dimensional, and \(\widetilde{\mathsf{M}}\) is \((k+3m)\)-dimensional. Let the resulting signature be \(\widetilde{\varSigma }_U\). We define a translation \(\widetilde{\tau }\) from the signature of arithmetic into \(\widetilde{\varSigma }_U\), by taking \(\alpha _{\widetilde{\tau }}(\mathbf {u}) := \widetilde{\alpha }(\mathbf {u})\), \(\delta _{\widetilde{\tau }}(\mathbf {u},\mathbf {v}) := \widetilde{\delta }(\mathbf {u},\mathbf {v})\), \(\mathsf{Z}_{\widetilde{\tau }}(\mathbf {u}, \mathbf {v}) := \widetilde{\mathsf{Z}}(\mathbf {u},\mathbf {v})\), etcetera.
Let \(\widetilde{U} := U + \forall \mathbf {u}\, (\widetilde{\alpha }(\mathbf {u}) \rightarrow (\mathsf{S}^1_2)^{\widetilde{\tau },\mathbf {u}})\). Here the axioms of \(\mathsf{S}^1_2\) are supposed to include the identity axioms. Let \(\widetilde{N}:\mathsf{S}^1_2 \rightarrow \widetilde{U}\) be the interpretation based on \(\widetilde{\tau }\).
We define a parameter-free interpretation \(K:\widetilde{U} \rightarrow (U \boxplus \mathsf{EQ})\) by letting \(\tau _K\) restricted to \(\varSigma _U\) be the translation corresponding to \(\mathsf{in}_0\). We take \(\widetilde{\alpha }_{\tau _K}(\mathbf {u}) := \alpha _{N^\star }(\mathbf {u})\), \(\widetilde{\delta }_{\tau _K}(\mathbf {u},\mathbf {v}) := \delta _{N^\star }(\mathbf {u},\mathbf {v})\), \(\widetilde{\mathsf{S}}_{\tau _K}(\mathbf {u},\mathbf {v},\mathbf {w}) := \mathsf{S}_{N^\star }(\mathbf {u},\mathbf {v},\mathbf {w})\), etcetera. It is easy to see that \(\tau _K\) indeed supports the promised interpretation K.
It is easy to see that the following diagram commutes in \(\mathsf{INT}_0\).
![](http://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs00500-016-2341-5/MediaObjects/500_2016_2341_Equ30_HTML.gif)
Clearly, \(\widetilde{U}\) is a sequential theory. So, we have the strong Pudlák property for \(\widetilde{N}\). Thus, we have a parameter-free \(N_0:\mathsf{S}^1_2 \rightarrow \widetilde{U}\) with \(N_0 \preceq \widetilde{N}\).
We find that \(K \circ N_0\) is parameter-free and \((K \circ N_0) \preceq (K \circ \widetilde{N})\). On the other hand, \(K \circ \widetilde{N}\) is equal in the sense of \(\mathsf{INT}_0\) to \(N^\star \). Ergo, \((K \circ N_0) \preceq N^\star \). Since \(N^\star \) is i-isomorphic to N, we find \((K \circ N_0) \preceq N\). \(\square \)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Visser, A. On Q . Soft Comput 21, 39–56 (2017). https://doi.org/10.1007/s00500-016-2341-5