Journal of Logic, Language and Information, Volume 28, Issue 4, pp 489–514

Translation Invariance and Miller’s Weather Example

J. B. Paris and A. Vencovská

Open Access


In his 1974 paper “Popper’s qualitative theory of verisimilitude”, published in the British Journal for the Philosophy of Science, David Miller gave his so-called ‘Weather Example’ to argue that the Hamming distance between constituents is flawed as a measure of proximity to truth, since the former, unlike the latter, is not translation invariant. In the present paper we generalise David Miller’s Weather Example in both the unary and polyadic cases, characterising precisely which permutations of constituents/atoms can be effected by translations. In turn this suggests a meta-principle of the rational assignment of subjective probabilities, namely that rational principles should be preserved under translations, which we formalise and of which we give a particular characterisation in the context of Unary Pure Inductive Logic.


Keywords: Miller’s weather example · Verisimilitude for relations · Translation invariance · Renaming invariance · Pure inductive logic · Uncertain reasoning

1 Introduction

In response to Pavel Tichý’s paper (Tichý 1974) (itself a reaction to Karl Popper’s formulations of verisimilitude in Popper 1972), David Miller gave, in Miller (1974) (continued in Miller 2006, Chapter 11), an example to show that the Hamming distance between constituents/atoms is not a good indicator of closeness to the truth, verisimilitude. Paraphrased, Miller’s example runs as follows.

Jones and Smith are closeted away in prison and try to guess the weather outside. Jones thinks it is cool and dry and still whilst Smith also thinks it is cool but otherwise differs in thinking it rainy and windy. They subsequently learn that actually it is hot, rainy and windy. So Smith is right on two scores (rainy and windy) while Jones is wrong on all scores. From this we might conclude that Smith’s guess is closer to the truth than Jones’. But suppose we now replace the propositions ‘hot’, ‘rainy’ and ‘windy’ by the equally expressive ‘hot’, ‘Minnesotan’ (meaning hot and wet or cool and dry) and ‘Arizonan’ (meaning hot and windy or cool and still). In that case Smith’s guess becomes cool, not Minnesotan and not Arizonan, Jones’ becomes cool, Minnesotan and Arizonan. The actual situation is hot and Minnesotan and Arizonan. So now it is Jones who is right on two scores and Smith who is right on none!
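The reversal in Miller’s example can be checked mechanically. The following sketch (the variable names are ours) encodes a weather state as a triple of truth values for hot, rainy and windy, defines ‘Minnesotan’ and ‘Arizonan’ as above, and counts the scores on which each guess agrees with the truth in the two vocabularies:

```python
# Each weather state is a triple of booleans: (hot, rainy, windy).
jones = (False, False, False)   # cool, dry, still
smith = (False, True, True)     # cool, rainy, windy
truth = (True, True, True)      # hot, rainy, windy

def minnesotan(hot, rainy, windy):
    # 'Minnesotan' = hot and wet, or cool and dry
    return hot == rainy

def arizonan(hot, rainy, windy):
    # 'Arizonan' = hot and windy, or cool and still
    return hot == windy

def translate(state):
    # re-describe a state in the vocabulary hot / Minnesotan / Arizonan
    hot, rainy, windy = state
    return (hot, minnesotan(*state), arizonan(*state))

def agreements(guess, actual):
    # number of scores on which guess and actual agree
    return sum(g == a for g, a in zip(guess, actual))

# Original vocabulary: Jones right on 0 scores, Smith on 2.
print(agreements(jones, truth), agreements(smith, truth))          # 0 2
# Translated vocabulary: the scores are exactly reversed.
print(agreements(translate(jones), translate(truth)),
      agreements(translate(smith), translate(truth)))              # 2 0
```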

The primary observation of the present paper is that there is a general result (and conclusion) behind this example. We will first show this in the case of a unary (i.e., monadic) predicate language (where it may well have no great novelty) and then extend our results to a general polyadic (relational) language. On the way we will uncover a ‘meta-principle’ of probability assignment which we will investigate in the unary case.

To make precise the context we shall work in, let \(L_{\vec {P}}\) be a predicate language with unary relation (i.e., predicate) symbols \(P_1,P_2, \ldots , P_q\) and, for later applications, constant symbols \(a_1,a_2,a_3, \ldots \). Note that by treating \(P_1(a_1), P_2(a_1), \ldots , P_q(a_1)\) as propositional variables our set-up can be considered to extend the propositional calculus in which Miller’s example is formalised.

Let \(L^-_{\vec {P}}\) be the language \(L_{\vec {P}}\) without the constants \(a_i\). The atoms of \(L_{\vec {P}}\) are the \(2^q\) formulae \(\alpha _{\epsilon }(x)\) of the form
$$\begin{aligned} \bigwedge _{i=1}^q P_i^{\epsilon (i)}(x) \end{aligned}$$
where \(\epsilon :\{1,2, \ldots , q\} \rightarrow \{0,1\}\) and for a formula \(\phi \), \(\phi ^1=\phi \), \(\phi ^0=\lnot \phi \). For shorthand let \(\Omega _{\vec {P}}\) denote the set of maps from \(\{1,2, \ldots , q\}\) to \(\{0,1\}\).

An atom is thus specified by a map \(\epsilon \), and the Hamming distance between atoms is defined to be the Hamming distance between these maps, that is, the number of arguments on which they give different values.
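Concretely (our encoding, not the paper’s), the maps \(\epsilon \) may be represented as 0/1-tuples of length q, with the Hamming distance counted coordinatewise:

```python
from itertools import product

q = 3
# the maps epsilon: {1,...,q} -> {0,1}, one per atom
atoms = list(product((0, 1), repeat=q))

def hamming(eps, delta):
    # number of arguments on which the two maps give different values
    return sum(e != d for e, d in zip(eps, delta))

assert len(atoms) == 2 ** q
assert hamming((0, 0, 0), (1, 1, 0)) == 2
```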

A translation of the predicate symbols \(P_1,P_2, \ldots , P_q\) is a family \(\phi _1(x), \phi _2(x), \ldots , \phi _q(x)\) of quantifier free formulae of \(L^-_{\vec {P}}\) such that as \(\delta \) ranges over \(\Omega _{\vec {P}}\) so the \(2^q\) formulae
$$\begin{aligned} \alpha _{\delta }^{\vec {\phi }}(x)= \bigwedge _{i=1}^q \phi _i^{\delta (i)}(x) \end{aligned}$$
range through all of the \(2^q\) atoms.

So if we imagine that the \(P_1(x), P_2(x), \ldots , P_q(x)\) describe certain features of, say, the weather at location x, then the \(\phi _1(x), \phi _2(x), \ldots , \phi _q(x)\) would, as in Miller’s example, provide an alternative, and exactly as descriptive, way to describe the weather.

Clearly then a translation \(\phi _1(x), \phi _2(x), \ldots , \phi _q(x)\) determines a permutation of \(\Omega _{\vec {P}}\), which we shall denote \(\tau _{\vec {\phi }}\), given by
$$\begin{aligned} \tau _{\vec {\phi }}(\epsilon ) = \delta \iff \bigwedge _{i=1}^q P_i^{\epsilon (i)}(x) = \bigwedge _{i=1}^q \phi _i^{\delta (i)}(x) \iff \alpha _\epsilon (x) = \alpha _{\delta }^{\vec {\phi }}(x). \end{aligned}$$
In this case we say that the permutation \(\tau = \tau _{\vec {\phi }}\) is supported by the translation \(\vec {\phi }\).
Conversely, any permutation \(\tau \) of \(\Omega _{\vec {P}}\) determines (up to logical equivalence) formulae \(\phi _i(x)\), \(i=1, \ldots , q\), such that \(\vec {\phi }\) supports \(\tau \) since if we define
$$\begin{aligned} \phi _j(x)\, = \bigvee _{\tau (\epsilon )(j)=1}~ \bigwedge _{i=1}^q P_i^{\epsilon (i)}(x) \end{aligned}$$
then, noting that
$$\begin{aligned} \lnot \phi _j(x) = \phi _j^0(x)\, = \bigvee _{\tau (\epsilon )(j)=0}~ \bigwedge _{i=1}^q P_i^{\epsilon (i)}(x), \end{aligned}$$
for any \(\epsilon \in \Omega _{\vec {P}}\) we have
$$\begin{aligned} \bigwedge _{i=1}^q P_i^{\epsilon (i)}(x) \models \bigwedge _{j=1}^q \phi _j^{\tau (\epsilon )( j)}(x) \end{aligned}$$
so since the \( \bigwedge _{j=1}^q \phi _j^{\tau (\epsilon )( j)}(x)\) (with \(\epsilon \in \Omega _{\vec {P}}\) and \(\tau \) fixed) are disjoint and exhaustive,
$$\begin{aligned} \alpha _{\epsilon }(x) = \bigwedge _{i=1}^q P_i^{\epsilon (i)}(x) = \bigwedge _{j=1}^q \phi _j^{\tau (\epsilon )( j)}(x) = \alpha _{\tau (\epsilon )}^{\vec {\phi }}(x). \end{aligned}$$
We have actually shown here that:

Theorem 1

Let \(\tau \) be a permutation of \(\Omega _{\vec {P}}\). Then there is a translation \(\vec {\phi } = \langle \phi _1(x), \phi _2(x), \ldots , \phi _q(x)\rangle \) that supports \(\tau \). Conversely, any translation \(\vec {\phi }\) supports a permutation of \(\Omega _{\vec {P}}\) (namely \(\tau _{\vec {\phi }}\)).
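The proof of Theorem 1 is constructive and easy to animate (a sketch of ours, with the maps \(\epsilon \) coded as q-tuples of bits): \(\phi _j\) is the disjunction of those atoms \(\alpha _\epsilon \) with \(\tau (\epsilon )(j)=1\), and recovering \(\tau _{\vec {\phi }}\) from \(\vec {\phi }\) returns the original permutation.

```python
from itertools import product

q = 3
atoms = list(product((0, 1), repeat=q))   # the maps epsilon as q-tuples

def make_translation(tau):
    """Given a permutation tau of the atoms (as a dict), return phi as a
    list of q 'formulae', each represented by its set of satisfying atoms:
    phi_j is the disjunction of the alpha_eps with tau(eps)(j) = 1."""
    return [frozenset(eps for eps in atoms if tau[eps][j] == 1)
            for j in range(q)]

def supported_permutation(phi):
    """Recover tau_phi from phi: eps is mapped to the delta whose j-th
    bit records whether alpha_eps entails phi_j."""
    return {eps: tuple(int(eps in phi[j]) for j in range(q))
            for eps in atoms}

# Any permutation of atoms works; here: rotate the list of atoms by one.
tau = dict(zip(atoms, atoms[1:] + atoms[:1]))
phi = make_translation(tau)
assert supported_permutation(phi) == tau   # phi supports tau, as in Theorem 1
```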

Putting this another way, and more in line with Miller’s weather example, if we define new unary predicates \(Q_1(x), Q_2(x), \ldots , Q_q(x)\) by \( Q_j(x) = \phi _j(x)\) then for each \(\epsilon , \delta \in \Omega _{\vec {P}}\), the atom
$$\begin{aligned} \alpha _{\epsilon }(x)= \bigwedge _{i=1}^q P_i^{\epsilon (i)}(x) \end{aligned}$$
of \(L_{\vec {P}}\) is logically equivalent to the atom
$$\begin{aligned} \alpha ^{\vec {Q}}_{\delta }(x) =\bigwedge _{i=1}^q Q_i^{\delta (i)}(x) \end{aligned}$$
of \(L_{\vec {Q}}\) just when \(\tau _{\vec {\phi }}(\epsilon ) = \delta \).
Thus we can change the Hamming distance between \(\alpha _{\epsilon }(x)\) and \(\alpha _{\delta }(x)\) (for \(\delta \ne \epsilon \)) by a translation in any way we want (without making it 0). Indeed, provided that it is consistent (i.e. there is some tuple of atoms with the desired Hamming distances between them), we can in fact make multiple distance changes simultaneously. In Miller’s example, writing C, D, S and M, A for ‘cool’, ‘dry’, ‘still’ and ‘Minnesotan’, ‘Arizonan’ respectively (as descriptions of the weather outside Jones and Smith’s prison), the situation may be represented as
$$\begin{aligned} \begin{array}{lccclccc} \mathrm{Jones} &{}\quad C&{}\quad D&{}\quad S&{}\quad \qquad \mathrm{Jones} &{}\quad C&{}\quad M&{}\quad A\\ \mathrm{Smith}&{}\quad C&{}\quad \lnot D&{}\quad \lnot S&{} \quad \qquad \mathrm{Smith} &{}\quad C&{}\quad \lnot M&{}\quad \lnot A\\ \mathrm{True} &{}\quad \lnot C&{}\quad \lnot D&{}\quad \lnot S&{} \quad \qquad \mathrm{True} &{}\quad \lnot C&{}\quad M&{}\quad A\\ \end{array} \end{aligned}$$
respectively. We can, for example, also arrange for Jones and Smith each to differ from the true situation and from each other on two scores although we cannot arrange that they each differ from the true situation and each other on a single score. The conclusion we may draw here, in line with Miller, is that Hamming Distance is as flawed a measure of verisimilitude for constituents (in the propositional context) or instantiated atoms (in the predicate context) as can be.
Given a permutation \(\tau \) of \(\Omega _{\vec {P}}\) and assuming \(Q_1, \ldots ,Q_q\) stand for unary predicates, we can treat \(\tau \) as a renaming of atoms which sends (2) to
$$\begin{aligned} \alpha ^{\vec {Q}}_{\tau (\epsilon )}(x) = \bigwedge _{i=1}^q Q_i^{\tau (\epsilon )(i)}(x). \end{aligned}$$
Theorem 1 has shown that in the unary context renamings of atoms and translations are the same thing, and in turn this somewhat broadens Miller’s example. This raises the question of what happens when we move to polyadic languages (i.e., containing possibly binary, ternary, etc., relation symbols). We shall consider this in the next section after which we will return again to the unary context to investigate a further issue raised by Theorem 1.
Before progressing further, note that a translation/renaming as above from \(L_{\vec {P}}\) to \(L_{\vec {Q}}\), that sends the atom \(\alpha _{\epsilon }(x)\) to the atom
$$\begin{aligned} \alpha ^{\vec {Q}}_{\tau _{\vec {\phi }}(\epsilon )}(x), \end{aligned}$$
which is logically equivalent to \(\alpha _{\epsilon }(x)\) when the \(Q_j\) are defined by \( Q_j(x) = \phi _j(x)\), extends to all formulae of \(L^-_{\vec {P}}\), and hence also to all sentences of \(L_{\vec {P}}\). This can be seen for example by noting that a formula \(\xi (x_1, \ldots ,x_n)\) of \(L^-_{\vec {P}}\) is logically equivalent to a disjunction of the form
$$\begin{aligned} \bigvee _{j=1}^t \left( \bigwedge _{k=1}^n \alpha _{\epsilon _{j,k}}(x_{k}) \wedge \bigwedge _{\epsilon \in \Omega _{\vec {P}}} (\exists x \,\alpha _{\epsilon }(x))^{h_{j,\epsilon }}\right) \end{aligned}$$
where the \(\epsilon _{j,k} \in \Omega _{\vec {P}}\) and \(h_{j, \epsilon } \in \{0,1\}\) and consequently, \(\xi (x_1, \ldots ,x_n)\) is also logically equivalent to the formula
$$\begin{aligned} \bigvee _{j=1}^t \left( \bigwedge _{k=1}^n \alpha ^{\vec {Q}}_{\tau _{\vec {\phi }}(\epsilon _{j,k})}(x_{k}) \wedge \bigwedge _{\epsilon \in \Omega _{\vec {P}}} \left( \exists x \,\alpha ^{\vec {Q}}_{\tau _{\vec {\phi }}(\epsilon )}(x)\right) ^{h_{j,\epsilon }}\right) \end{aligned}$$
of \(L^-_{\vec {Q}}\). Note that \(\phi _j(x) \) translates as \(Q_j(x)\).
In Miller (1978) Miller also considers a related question of invariance under ‘translations’ of the Hamming distance between constituents of the predicate language \(L^-_{\vec {P}}\), that is, in our notation, sentences of \(L^-_{\vec {P}}\) of the form
$$\begin{aligned} \bigwedge _{\epsilon \in \Omega _{\vec {P}}} \big ( \exists x \, \alpha _{\epsilon }(x) \big )^{\tau (\epsilon )} \end{aligned}$$
where \( \tau : \Omega _{\vec {P}} \rightarrow \{0,1\} \) is not constantly zero. Let \(\Omega _{{\Omega }_{\vec {P}}}\) be the set of such \( \tau \). Clearly, these distances are preserved under the translations we are considering, that is, generated by permutations of atoms. However as Miller demonstrates in Miller (1978) there are ‘translations’—sets of disjoint exhaustive formulae \(\varphi _\epsilon (x)\) for \(\epsilon \in \Omega _{\vec {P}}\) of \(L_{\vec {P}}^-\) with quantifiers—such that for \( \tau \in \Omega _{\Omega _{\vec {P}}} \)
$$\begin{aligned} \bigwedge _{\epsilon \in \Omega _{\vec {P}}} \big ( \exists x \, \alpha _\epsilon (x) \big )^{\tau (\epsilon )} = \bigwedge _{\epsilon \in \Omega _{\vec {P}}}\big ( \exists x \, \varphi _\epsilon (x) \big )^{\nu (\tau )(\epsilon )} \end{aligned}$$
where \(\nu : \Omega _{{\Omega }_{\vec {P}}} \rightarrow \Omega _{{\Omega }_{\vec {P}}}\) does not preserve Hamming distance, though necessarily, by considering smallest models, \(\nu \) must preserve \(|\{\epsilon \,|\,\tau (\epsilon )=1 \}|\). In fact this condition exactly characterises which \(\nu \) are possible here, as we shall now show.
Let \(\nu :\Omega _{{\Omega }_{\vec {P}}} \rightarrow \Omega _{{\Omega }_{\vec {P}}}\) be such that for each \(\tau \in \Omega _{{\Omega }_{\vec {P}}}\)
$$\begin{aligned} | \tau ^{-1}\{1\}| = |\nu (\tau )^{-1}\{1\}|. \end{aligned}$$
For each \(\tau \in \Omega _{\Omega _{\vec {P}}} \) let
$$\begin{aligned} \iota _\tau :\nu (\tau )^{-1}\{1\} \rightarrow \tau ^{-1}\{1\} \end{aligned}$$
be an injection and for \(\epsilon \in \Omega _{\vec {P}}\) set
$$\begin{aligned} \varphi _\epsilon (x) = \bigvee _{\nu (\tau )(\epsilon )=1} \left( \alpha _{\iota _{\tau }(\epsilon )}(x) \wedge \bigwedge _{\delta } \big (\exists y\, \alpha _\delta (y)\big )^{\tau (\delta )}\right) . \end{aligned}$$
Then these \(\varphi _\epsilon (x)\) are disjoint (since the disjuncts in (7) for \(\epsilon , \epsilon ' \in \Omega _{\vec {P}}\) could only meet for the same \(\tau \) but \(\iota _\tau (\epsilon ) \ne \iota _\tau (\epsilon ') \) when \(\epsilon \ne \epsilon '\)) and, as will follow from (8) below by taking the disjunction over \(\tau \), exhaustive.
From (7) it follows that
$$\begin{aligned} \exists x \, \varphi _\epsilon (x) = \bigvee _{\nu (\tau )(\epsilon )=1} \bigwedge _{\delta } \big (\exists y\, \alpha _\delta (y)\big )^{\tau (\delta )} \end{aligned}$$
and in turn, as required, that for each \(\tau \)
$$\begin{aligned} \bigwedge _\epsilon \big ( \exists x \, \varphi _\epsilon (x)\big )^{\nu (\tau )(\epsilon )} = \bigwedge _{\delta } \big (\exists x\, \alpha _{ \delta }(x)\big )^{\tau (\delta )}. \end{aligned}$$
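The construction in (7)–(8) can be checked semantically with a small simulation (our own encoding, not in the paper): identify a possible world with the set of atoms it instantiates, so a constituent is just a nonempty subset of atoms, and take any \(\nu \) preserving the number of instantiated atoms. By (7), \(\exists x\, \varphi _\epsilon (x)\) holds in a world w just when \(\nu \) applied to w’s constituent gives \(\epsilon \) the value 1 and \(\iota \) maps \(\epsilon \) to an atom realized in w, the latter being automatic.

```python
from itertools import combinations
import random

atoms = range(4)                       # the atoms alpha_eps, abstractly
worlds = [frozenset(c) for n in range(1, 5)
          for c in combinations(atoms, n)]   # constituents = nonempty subsets

# An arbitrary cardinality-preserving map nu on constituents:
random.seed(0)
by_size = {n: [w for w in worlds if len(w) == n] for n in range(1, 5)}
nu = {}
for n, ws in by_size.items():
    nu.update(zip(ws, random.sample(ws, len(ws))))

# For each constituent pick an injection iota from nu(w) into w
# (possible exactly because |nu(w)| = |w|).
iota = {w: dict(zip(sorted(nu[w]), sorted(w))) for w in worlds}

def exists_phi(eps, world):
    """Does 'exists x phi_eps(x)' hold in the world (set of realized atoms)?
    By (7): some disjunct fires iff nu(world)(eps) = 1 and iota sends eps to
    an atom realized in the world; the big conjunct pins down tau = world."""
    return eps in nu[world] and iota[world][eps] in world

# Equation (8): the constituent satisfied via the phi's is exactly nu(world).
for w in worlds:
    assert frozenset(e for e in atoms if exists_phi(e, w)) == nu[w]
print("nu realized by a translation on", len(worlds), "constituents")
```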

2 The General Polyadic Case

As a simple motivating example, consider Jones and Smith again. Speculating about weather and police behaviour, Jones thinks it is hot in London but not in Belfast and that there are London-trained police dogs being deployed in Belfast but not conversely. Smith thinks that it is not hot in London but hot in Belfast and that there are Belfast-trained police dogs being deployed in London but not conversely. In truth, it is hot both in London and Belfast and there are London-trained police dogs being deployed in Belfast but not conversely. Let H and D stand for ‘it is hot at x’ and ‘there are x-trained police dogs being deployed at y’ respectively, and let B be an alternative binary relation, equally expressive in combination with H, to which we return at the end of this section.
This means that in terms of H and D, with l and b standing for the cities, the respective positions are
$$\begin{aligned} \begin{array}{lcccc} \mathrm{Jones} &{}\quad H(l)&{}\quad \lnot H(b)&{}\quad D(l,b)&{}\quad \lnot D(b,l)\\ \mathrm{Smith}&{}\quad \lnot H(l)&{}\quad H(b)&{}\quad \lnot D(l,b) &{}\quad D(b,l)\\ \mathrm{True} &{}\quad H(l) &{}\quad H(b)&{}\quad D(l,b)&{} \quad \lnot D(b,l) \end{array} \end{aligned}$$
so Jones is ‘closer to the truth’, but in terms of H and B they are
$$\begin{aligned} \begin{array}{lcccc} \mathrm{Jones} &{}\quad H(l)&{}\quad \lnot H(b)&{}\quad B(l,b) &{}\quad \lnot B(b,l)\\ \mathrm{Smith}&{}\quad \lnot H(l)&{}\quad H(b)&{}\quad \lnot B(l,b)&{}\quad B(b,l)\\ \mathrm{True} &{}\quad H(l) &{}\quad H(b)&{}\quad \lnot B(l,b)&{}\quad B(b,l) \end{array} \end{aligned}$$
so Smith is ‘closer’. This provides an example similar to the above weather example, which however involves a binary relation rather than just unary predicates/propositional variables. By the end of this section we shall justify in some detail why the pair H, B is equally expressive as the pair H, D.
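Purely for illustration, here is one candidate definition of B, our guess and not necessarily the authors’, which reproduces both tables above: B(x,y) holds just when D(x,y) differs from ‘it is hot at both x and y’.

```python
# Positions from the two tables, encoded as (H(l), H(b), D(l,b), D(b,l)).
positions = {
    'Jones': (1, 0, 1, 0),
    'Smith': (0, 1, 0, 1),
    'True':  (1, 1, 1, 0),
}

def B(hl, hb, dlb, dbl):
    # Candidate translation (an assumption): B(x,y) = D(x,y) xor (H(x) and H(y))
    return (dlb ^ (hl & hb), dbl ^ (hb & hl))

# The (B(l,b), B(b,l)) columns of the second table.
expected_B = {'Jones': (1, 0), 'Smith': (0, 1), 'True': (0, 1)}
for name, (hl, hb, dlb, dbl) in positions.items():
    assert B(hl, hb, dlb, dbl) == expected_B[name]
```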
Let the polyadic language \(L_{\vec {R}}\) contain relation symbols \(R_1, \ldots , R_q\) of arities \(r_1, \ldots , r_q\) respectively and constants \(a_1, a_2, \ldots \) (needed later). Let \(L^-_{\vec {R}}\) stand for the language \(L_{\vec {R}}\) without the constants \(a_i\). A state formula of \(L_{\vec {R}}\) for variables \(x_1, \ldots , x_n\) is a formula
$$\begin{aligned} \Theta (x_1,\ldots , x_n) \,=\, \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , n\}^{r_i} \end{array}} R_i ^{ \epsilon (i, j_1, \ldots ,j_{r_i})}(x_{j_1}, \ldots , x_{j_{r_i}}) \end{aligned}$$
of \(L^-_{\vec {R}}\), where the \(\epsilon (i, j_1, \ldots ,j_{r_i})\in \{0,1\}\). If \(\Theta (x_1,\ldots , x_n)\) is a state formula and the (distinct) \(b_1, \ldots , b_n\) are from \(a_1, a_2, \ldots \) then \(\Theta (b_1,\ldots , b_n)\) is called a state description (for \(b_1, \ldots , b_n\)).
Let \(r=\max \{r_1, \ldots ,r_q\}\). An atom of \(L_{\vec {R}}\) is a state formula for r variables. Hence an atom of \(L_{\vec {R}}\) is determined by a map
$$\begin{aligned} \epsilon : \bigcup _{i=1}^q \left( \{i\} \times \{1,2,\ldots , r\}^{r_i}\right) \rightarrow \{0,1\}. \end{aligned}$$
Mimicking the notation of the previous section let \(\Omega _{\vec {R}}\) denote the set of such maps \(\epsilon \) and for \(\epsilon \in \Omega _{\vec {R}}\) let \(\alpha _{\epsilon }\) denote the atom determined by \(\epsilon \). Permutations of atoms are identified with the corresponding permutations of \(\Omega _{\vec {R}}\). We shall similarly use \(\epsilon , \delta , \ldots \) for elements of \(\Omega _{\vec {R}}\).
We say that a q-tuple of quantifier free formulae \(\psi _i (x_1, \ldots , x_{r_i})\) forms a translation of \(L_{\vec {R}}^-\) if the
$$\begin{aligned} \alpha ^{\vec {\psi }}_{{\epsilon }}(x_1,\ldots , x_r) \,=\, \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} \psi _i ^{\epsilon (i, j_1, \ldots ,j_{r_i})}(x_{j_1}, \ldots , x_{j_{r_i}}) \end{aligned}$$
run through all the atoms of \(L_{\vec {R}}\) as the \(\epsilon \) run through \(\Omega _{\vec {R}}\).
We say that a permutation \( \tau \) of atoms (equivalently of \(\Omega _{\vec {R}}\)) is supported by the translation \(\psi _1, \ldots , \psi _q\) if for each \(\epsilon \in \Omega _{\vec {R}}\),
$$\begin{aligned} \alpha _\epsilon (x_1, \ldots ,x_r) \, =\, \alpha ^{\vec {\psi }}_{\tau (\epsilon )} (x_1, \ldots ,x_r) . \end{aligned}$$
As in the unary case, we denote such a permutation \( \tau \) by \(\tau _{\vec {\psi }}\).
Note that this is exactly the situation in which, if we define the new relation symbols \(Q_i(x_1,\ldots ,x_{r_i})\) for \(i=1,2,\ldots , q\) by
$$\begin{aligned} Q_i(x_1, \ldots , x_{r_i}) = \psi _i(x_1, \ldots , x_{r_i}) \end{aligned}$$
then for each \(\epsilon \) the atom
$$\begin{aligned} \alpha _\epsilon (x_1,\ldots , x_r) \,=\, \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} R_i ^{\epsilon (i, j_1, \ldots ,j_{r_i})}(x_{j_1}, \ldots , x_{j_{r_i}}) \end{aligned}$$
is logically equivalent to the atom
$$\begin{aligned} \alpha ^{\vec {Q}}_\delta (x_1,\ldots , x_r) \,=\, \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\}\\ \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} Q_i ^{\delta (i, j_1, \ldots ,j_{r_i})}(x_{j_1}, \ldots , x_{j_{r_i}}) \end{aligned}$$
of the language \(L_{\vec {Q}}\) with relation symbols \(Q_1,Q_2, \ldots , Q_q\) just when \(\delta = \tau _{\vec {\psi }}(\epsilon )\).

Unlike in the purely unary case, however, in the polyadic case it is not in general true that any permutation, or renaming, of atoms is supported by a translation. As we shall prove, this will be the case just if the permutation satisfies a certain property (C) from Ronel and Vencovská (2014), which we will define shortly. Interestingly, as we shall subsequently explain, condition (C) is also equivalent to the permutation of atoms generating an automorphism of a certain structure BL relevant in Pure Inductive Logic.

To formulate (C) we shall need the following notation.
  • Let \(\Theta (x_1,\ldots , x_n)\) be as in (9) and let \( k_1, \ldots ,k_t \) be distinct numbers from \(\{1,\ldots ,n\}\). Then \(\Theta [x_{k_1},\ldots , x_{k_t}]\) denotes the state formula obtained from (9) by restricting it to \(x_{k_1},\ldots , x_{k_t}\), that is, replacing
    $$\begin{aligned} \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , n\}^{r_i} \end{aligned}$$
    by
    $$\begin{aligned} \langle j_1, \ldots , j_{r_i}\rangle \in \{k_1,\ldots , k_t\}^{r_i}. \end{aligned}$$
  • Let \(\Phi (x_{k_1},\ldots , x_{k_t})\) be a state formula, \( m_1, \ldots ,m_s \) distinct numbers and
    $$\begin{aligned} f :\{m_1, \ldots , m_s\} \rightarrow \{k_1, \ldots , k_t\} \end{aligned}$$
    a surjection. Then \((\Phi (x_{k_1}, \ldots ,x_{k_t}))_f\) denotes the state formula \(\Psi (x_{m_1}, \ldots , x_{m_s})\) for which
    $$\begin{aligned} \Psi (x_{f(m_1)}, \ldots , x_{ f(m_s)}) = \Phi (x_{k_1}, \ldots , x_{k_t}). \end{aligned}$$
We can now state condition (C) for a permutation \(\sigma \) of \(\Omega _{\vec {R}}\):
  (C)
    For \(\epsilon , \delta \in \Omega _{\vec {R}}\), \(t\le r\) and distinct \( j_1, \ldots ,j_t \) from \( \{1,\ldots ,r\}\), if \(f :\{1, \ldots , r\} \rightarrow \{j_1, \ldots , j_t\}\) is a surjection then
    $$\begin{aligned} \alpha _{\epsilon }(x_1, \ldots , x_r) = (\alpha _{\delta }[x_{j_1}, \ldots ,x_{j_t}])_f \iff \alpha _{\sigma (\epsilon )}(x_1, \ldots , x_r) = (\alpha _{\sigma (\delta )}[x_{j_1}, \ldots ,x_{j_t}])_f. \end{aligned}$$
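For a language with a single binary relation (so \(r=2\) and an atom is a map on \(\{1,2\}^2\)), condition (C) can be checked exhaustively. The following sketch (our own encoding of atoms as 0/1-patterns, not from the paper) confirms that the permutation induced by transposing the arguments of R satisfies (C), while an arbitrary swap of two atoms does not:

```python
from itertools import product

# One binary relation, so r = 2; an atom is a map eps: {1,2}^2 -> {0,1}.
pairs = [(1, 1), (1, 2), (2, 1), (2, 2)]
atoms = [dict(zip(pairs, bits)) for bits in product((0, 1), repeat=4)]

# The surjections f: {1,2} -> {j_1,...,j_t} occurring in condition (C).
surjections = [{1: 1, 2: 2}, {1: 2, 2: 1}, {1: 1, 2: 1}, {1: 2, 2: 2}]

def matches(eps, delta, f):
    """alpha_eps(x1,x2) = (alpha_delta[x_{j_1},...,x_{j_t}])_f ?"""
    return all(eps[(d1, d2)] == delta[(f[d1], f[d2])] for d1, d2 in pairs)

def satisfies_C(sigma):
    # sigma: permutation of atoms, given as a list of indices into `atoms`.
    return all(
        matches(atoms[e], atoms[d], f)
        == matches(atoms[sigma[e]], atoms[sigma[d]], f)
        for e in range(16) for d in range(16) for f in surjections)

# Transposing the arguments of R is supported: it satisfies (C).
def transpose(eps):
    return {(j1, j2): eps[(j2, j1)] for j1, j2 in pairs}
sigma_t = [atoms.index(transpose(a)) for a in atoms]
assert satisfies_C(sigma_t)

# Merely swapping two atoms generally is not: swap the all-zero atom
# with the atom asserting only R(x1, x1).
sigma_swap = list(range(16))
i0 = atoms.index({p: 0 for p in pairs})
i1 = atoms.index({p: int(p == (1, 1)) for p in pairs})
sigma_swap[i0], sigma_swap[i1] = sigma_swap[i1], sigma_swap[i0]
assert not satisfies_C(sigma_swap)
```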
We shall also need a consequence of (C) which it is apposite to spell out. Assume that \({m_1},\ldots , {m_s} \in \{1,2,\ldots , r\}\) are distinct, \(k_1, \ldots , k_t \in \{1,2,\ldots , r\}\) are distinct,
$$\begin{aligned} f :\{m_1, \ldots , m_s\} \rightarrow \{k_1, \ldots , k_t\} \end{aligned}$$
is a surjection and \(\epsilon , \delta \in \Omega _{\vec {R}}\) are such that
$$\begin{aligned} \alpha _\epsilon [x_{m_1}, \ldots , x_{m_s}] = (\alpha _\delta [x_{k_1}, \ldots , x_{k_t}])_f. \end{aligned}$$
Let
$$\begin{aligned} g:\{1,\ldots ,r\} \rightarrow \{m_1, \ldots , m_s\} \end{aligned}$$
be such that for \(i \in \{m_1, \ldots ,m_s\}\), \( g(i)=i\) and for \(i \notin \{m_1, \ldots ,m_s\}\), \( g(i)=m_1\). Hence
$$\begin{aligned} fg:\{1, \ldots , r\} \rightarrow \{k_1, \ldots , k_t\} \end{aligned}$$
is a surjection and its restriction to \(\{m_1, \ldots ,m_s\}\) is f. Let \(\gamma \in \Omega _{\vec {R}}\) be such that
$$\begin{aligned} (\alpha _\epsilon [x_{m_1}, \ldots , x_{m_s}])_g = \alpha _\gamma (x_1, \ldots , x_r)= (\alpha _\delta [x_{k_1}, \ldots , x_{k_t}])_{fg} . \end{aligned}$$
Then for any \(\sigma \) satisfying (C) we have
$$\begin{aligned} (\alpha _{\sigma (\epsilon )}[x_{m_1}, \ldots , x_{m_s}])_g = \alpha _{\sigma (\gamma )}(x_1, \ldots , x_r)= (\alpha _{\sigma (\delta )}[x_{k_1}, \ldots , x_{k_t}])_{fg}. \end{aligned}$$
Since g is the identity on \(\{m_1,\ldots ,m_s\}\), it follows that
$$\begin{aligned} \alpha _{\sigma (\epsilon )}[x_{m_1}, \ldots , x_{m_s}]= \alpha _{\sigma (\gamma )}[x_{m_1}, \ldots , x_{m_s}]= (\alpha _{\sigma (\delta )}[x_{k_1}, \ldots , x_{k_t}])_f. \end{aligned}$$
Since we can argue conversely in the same way, any \(\sigma \) that satisfies (C) also satisfies the following condition:
  (D)
    For distinct \({m_1},\ldots , {m_s} \in \{1,2,\ldots , r\}\), \(k_1, \ldots , k_t \in \{1,2,\ldots , r\}\), surjection \(f :\{m_1, \ldots , m_s\} \rightarrow \{k_1, \ldots , k_t\}\) and \(\epsilon , \delta \in \Omega _{\vec {R}}\),
    $$\begin{aligned} \alpha _\epsilon [x_{m_1}, \ldots , x_{m_s}] = (\alpha _\delta [x_{k_1}, \ldots , x_{k_t}])_f \Longleftrightarrow \alpha _{\sigma (\epsilon )}[x_{m_1}, \ldots , x_{m_s}] = (\alpha _{\sigma (\delta )}[x_{k_1}, \ldots , x_{k_t}])_f. \end{aligned}$$
In particular, taking f to be the identity, we have
  (E)
    For distinct \(m_1, \ldots , m_s \in \{1,2,\ldots , r\}\) and \(\epsilon , \delta \in \Omega _{\vec {R}}\),
    $$\begin{aligned} \alpha _\epsilon [x_{m_1}, \ldots , x_{m_s}] = \alpha _\delta [x_{m_1}, \ldots , x_{m_s}] ~ \Longleftrightarrow ~ \alpha _{\sigma (\epsilon )}[x_{m_1}, \ldots , x_{m_s}] = \alpha _{\sigma (\delta )}[x_{m_1}, \ldots , x_{m_s}]. \end{aligned}$$

Theorem 2

Given a permutation \(\sigma \) of \(\Omega _{\vec {R}}\) there is a translation supporting \(\sigma \) just if \(\sigma \) satisfies (C).


Proof

First suppose that \(\sigma \) is a permutation of atoms satisfying condition (C). Define
$$\begin{aligned} \psi _i(x_1,\ldots ,x_{r_i}) = \bigvee _{\sigma (\epsilon ) (i, 1, 2, \ldots ,{r_i})=1} \alpha _{{\epsilon }}[x_1,\ldots , x_{r_i}]. \end{aligned}$$
We shall show that this provides the required translation supporting \(\sigma \).
Let \(j_1, \ldots , j_{r_i}\) be from \(\{1,\ldots ,r\}\) (not necessarily distinct). We have
$$\begin{aligned} \psi _i(x_{j_1},\ldots ,x_{j_{r_i}}) = \bigvee _{\sigma (\epsilon )(i, 1, \ldots ,{r_i})=1} ( \alpha _{{\epsilon }}[x_1,\ldots , x_{r_i}] )(x_{j_1}/x_1,\ldots , x_{j_{r_i}}/x_{r_i}). \end{aligned}$$
We shall now prove that this gives
$$\begin{aligned} \psi _i(x_{j_1},\ldots ,x_{j_{r_i}}) = \bigvee _{\sigma (\epsilon ) (i, j_1, \ldots ,j_{r_i})=1} \alpha _{{\epsilon }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] \end{aligned}$$
where \(j_{h_1}, \ldots ,j_{h_t}\) are the distinct numbers from amongst the \(j_1, \ldots , j_{r_i}\).
Let \(A_1, A_2, \ldots , A_t\) form the partition of \(\{1,2, \ldots , r_i\}\) such that for \(x,y \in \{1,2, \ldots , r_i\}\), x and y are in the same \(A_k\) just if \(j_x=j_y\). Let \(h_k= \min \{A_k\}\) for \(k=1,2,\ldots ,t\). A disjunct
$$\begin{aligned} (\alpha _{{\epsilon }}[x_1,\ldots , x_{r_i}] )(x_{j_1}/x_1,\ldots , x_{j_{r_i}}/x_{r_i}) \end{aligned}$$
of (13) is consistent just if for \(1 \le m \le q\), \(1 \le d_1,d_2, \ldots , d_{r_m} \le r_i\), \(\epsilon (m,d_1,d_2, \ldots , d_{r_m})\) depends only on which of the classes \(A_1, \ldots , A_t\) the \(d_n\) are in. Equivalently
$$\begin{aligned} \epsilon (m,d_1,d_2, \ldots , d_{r_m}) = \epsilon (m,f(d_1),f(d_2), \ldots , f(d_{r_m})) \end{aligned}$$
where f maps the members of each \(A_k\) to \(h_k\), the least member of that \(A_k\). Another way of expressing this is that
$$\begin{aligned} \alpha _\epsilon [x_1,\ldots , x_{r_i}] = (\alpha _\epsilon [x_{h_1},\ldots , x_{h_t}] )_f. \end{aligned}$$
Notice that when (15) is consistent then
$$\begin{aligned} (\alpha _{{\epsilon }}[x_1,\ldots , x_{r_i}] )(x_{j_1}/x_1,\ldots , x_{j_{r_i}}/x_{r_i}) = (\alpha _{{\epsilon }}[x_{h_1},\ldots , x_{h_t}] )(x_{j_{h_1}}/x_{h_1},\ldots , x_{j_{h_t}}/x_{h_t}). \end{aligned}$$
Let g be a permutation of \(\{1,2, \ldots , r\}\) which, in particular, maps each \(h_k\) to \(j_{h_k}\). Note that this means that for each \(c \in \{1, \ldots , r_i\}\) we have \(g(f(c)) = j_c\). For each \(\epsilon \in \Omega _{\vec {R}}\) define \(\epsilon ' \in \Omega _{\vec {R}}\) by
$$\begin{aligned} \epsilon '(m,g(u_1),g(u_2), \ldots , g(u_{r_m})) = \epsilon (m,u_1,u_2, \ldots , u_{r_m}), \end{aligned}$$
for \(m=1,2, \ldots , q\) and \(u_1, \ldots , u_{r_m} \in \{1, \ldots , r\}\), equivalently
$$\begin{aligned} \alpha _{ \epsilon '}(x_1, x_2, \ldots , x_r) = \alpha _{ \epsilon }(x_{g(1)}, x_{g(2)}, \ldots , x_{g(r)}), \end{aligned}$$
that is,
$$\begin{aligned} \alpha _{\epsilon }(x_1, x_2, \ldots , x_r) = (\alpha _{ \epsilon '}(x_1,x_2, \ldots , x_r))_{g} . \end{aligned}$$
From (17) a consistent disjunct (15) equals
$$\begin{aligned} (\alpha _{{\epsilon }}[x_{h_1},\ldots , x_{h_t}] )(x_{j_{h_1}}/x_{h_1},\ldots , x_{j_{h_t}}/x_{h_t}), \end{aligned}$$
and since \(j_{h_k}=g (h_k)\), we get by (18) that
$$\begin{aligned} \alpha _{{\epsilon }}[x_{h_1},\ldots , x_{h_t}] (x_{j_{h_1}}/x_{h_1},\ldots , x_{j_{h_t}}/x_{h_t}) = \bigwedge _{\begin{array}{c} m\in \{1,2,\ldots ,q\} \\ \langle u_1, \ldots , u_{r_m}\rangle \in \{h_1,\ldots , h_t\}^{r_m} \end{array}} R_m ^{\epsilon (m, u_1, \ldots ,u_{r_m})}(x_{g(u_1)},\ldots , x_{g(u_{r_m})}) ~=~\alpha _{{\epsilon '}}[x_{j_{h_1}},\ldots , x_{j_{h_t}}]. \end{aligned}$$
By (C) and (19) we have
$$\begin{aligned} \alpha _{\sigma (\epsilon )}(x_1, x_2, \ldots , x_r) = (\alpha _{ \sigma (\epsilon ')}(x_1,x_2, \ldots , x_r))_{g}, \end{aligned}$$
that is,
$$\begin{aligned} \sigma (\epsilon ')(m,g(u_1),g(u_2), \ldots , g(u_{r_m})) = \sigma (\epsilon )(m,u_1,u_2, \ldots , u_{r_m}). \end{aligned}$$
Hence, using the fact that \(j_c=g(f(c) )\) for each \(c \in \{1, \ldots , r_i\}\),
$$\begin{aligned} \sigma (\epsilon ')(i, j_{1}, \ldots , j_{r_i}) = \sigma (\epsilon ')(i, g(f(1)), \ldots , g(f({r_i})))= \sigma (\epsilon )(i, f(1), \ldots ,f(r_i) ). \end{aligned}$$
From (16) and (D) we have
$$\begin{aligned} \alpha _{\sigma (\epsilon )}[x_1, \ldots , x_{r_i}] = (\alpha _{\sigma (\epsilon )}[x_{h_1}, \ldots , x_{h_{t}}])_f \end{aligned}$$
and hence
$$\begin{aligned} \sigma (\epsilon )(i, f(1), \ldots ,f(r_i) )= \sigma (\epsilon )(i, 1, \ldots ,r_i ). \end{aligned}$$
With (22) it now follows that
$$\begin{aligned} \sigma (\epsilon ')(i, j_{1}, \ldots , j_{r_i})=\sigma (\epsilon )(i, 1, \ldots ,r_i). \end{aligned}$$
Hence a consistent disjunct (15), equivalently (20) and (21), with \(\sigma (\epsilon )(i,1,2, \ldots , r_i)=1\) equals
$$\begin{aligned} \alpha _{{\epsilon '}}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] \end{aligned}$$
with \( \sigma (\epsilon ')(i, j_1, \ldots , j_{r_i})=1\). Consequently (13) logically implies
$$\begin{aligned} \bigvee _{\sigma (\epsilon ')(i, j_1, \ldots ,j_{r_i})=1} \alpha _{{\epsilon '}}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] . \end{aligned}$$
Clearly then the disjunction in (13) logically implies the disjunction in (14) with \(\epsilon '\) in place of \(\epsilon \) and hence also the right hand side of (14) without this replacement.
Conversely, let \(\gamma \in \Omega _{\vec {R}}\) be such that \(\sigma (\gamma )(i, j_1, \ldots , j_{r_i})=1\) (so \(\alpha _{{\gamma }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] \) contributes to the right hand side of (14)). With g as above, let \(\xi \in \Omega _{\vec {R}}\) be such that
$$\begin{aligned} \alpha _{\xi }(x_1, x_2, \ldots , x_r) = (\alpha _{\gamma }(x_1,x_2, \ldots , x_r))_{g}, \end{aligned}$$
that is,
$$\begin{aligned} \gamma (m,g(u_1),g(u_2), \ldots , g(u_{r_m})) = \xi (m,u_1,u_2, \ldots , u_{r_m}). \end{aligned}$$
(Using our above notation, \(\gamma = \xi '\).)
Let \(\epsilon \) be such that
$$\begin{aligned} \alpha _\epsilon (x_1, \ldots , x_r) = (\alpha _{\xi } [x_{h_1}, \ldots , x_{h_t}, x_{r_i+1}, \ldots , x_r])_f \end{aligned}$$
where f is an extension of the above defined f, mapping the members of each \(A_k\) to \(h_k\) and \(f(h)=h\) for \(h > r_i\). So
$$\begin{aligned} (\alpha _\epsilon [x_1, \ldots , x_{r_i}])(x_{j_1}/x_1, \ldots , x_{j_{r_i}}/x_{r_i}) \end{aligned}$$
is consistent (which may not have been true of \(\xi \)),
$$\begin{aligned} \alpha _\epsilon [x_{h_1}, \ldots , x_{h_t}] = \alpha _\xi [x_{h_1}, \ldots , x_{h_t}] \end{aligned}$$
and \(\epsilon \) satisfies (16). Let \(\epsilon '\) be associated with \(\epsilon \) as above, that is, via
$$\begin{aligned} \epsilon '(m,g(u_1),g(u_2), \ldots , g(u_{r_m})) = \epsilon (m,u_1,u_2, \ldots , u_{r_m}). \end{aligned}$$
Noting that for \(h \in \{h_1, \ldots h_t\}\) we have \(g(h)=j_{h}\), from (24), (25), (26),
$$\begin{aligned} \alpha _{\epsilon '}[x_{j_{h_1}}, \ldots , x_{j_{h_t}}] = \alpha _\gamma [x_{j_{h_1}}, \ldots , x_{j_{h_t}}]. \end{aligned}$$
Since \(\sigma (\gamma )(i, j_1, \ldots , j_{r_i})=1\), by property (E) we have also \(\sigma (\epsilon ')(i, j_1, \ldots , j_{r_i})=1\) and furthermore by virtue of (23), \(\sigma (\epsilon )(i, 1, \ldots , {r_i})=1\).

Hence from (21), \(\alpha _{{\gamma }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}]\) with \(\sigma (\gamma )(i, j_1, \ldots , j_{r_i})=1\) equals (20) and (15) with \(\sigma (\epsilon ) (i, 1, \ldots , {r_i})=1\), and thus the disjunction from (14) logically implies the disjunction from (13) and the identity (14) is proved.

Note that by virtue of the condition (E), if for some \(\epsilon , \eta \in \Omega _{\vec {R}}\) we have
$$\begin{aligned} \alpha _{{\epsilon }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] = \alpha _{{\eta }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] \end{aligned}$$
then
$$\begin{aligned} \sigma (\epsilon ) (i, j_1, \ldots ,j_{r_i})=\sigma (\eta ) (i, j_1, \ldots ,j_{r_i}) \end{aligned}$$
and if \(\alpha _{{\epsilon }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}],\, \alpha _{{\eta }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] \) are not equal then they are disjoint. Since the disjunction of \(\alpha _{{\epsilon }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] \) over all the \(\epsilon \in \Omega _{\vec {R}}\) is a tautology, we have
$$\begin{aligned} \lnot \psi _i(x_{j_1},\ldots ,x_{j_{r_i}}) = \bigvee _{\sigma (\epsilon ) (i, j_1, \ldots ,j_{r_i})=0} \alpha _{{\epsilon }}[x_{j_{h_1}},\ldots , x_{j_{h_t}}] . \end{aligned}$$
It follows that for \(\epsilon \in \Omega _{\vec {R}}\),
$$\begin{aligned} \sigma (\epsilon ) (i, j_1, \ldots ,j_{r_i})=1 ~\iff ~ \alpha _{{\epsilon }}(x_1,\ldots , x_r) \models \psi _i(x_{j_1},\ldots ,x_{j_{r_i}}). \end{aligned}$$
Hence
$$\begin{aligned} \alpha _{{\epsilon }}(x_1,\ldots , x_r) \models \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} \psi _i ^{\sigma (\epsilon ) (i, j_1, \ldots ,j_{r_i})}(x_{j_1}, \ldots , x_{j_{r_i}}) \,\, = \,\, \alpha _{{\sigma (\epsilon )}}^{\vec {\psi } }(x_1,\ldots , x_r), \end{aligned}$$
and since there are as many formulae \(\alpha _{{\sigma (\epsilon )}}^{\vec {\psi } }(x_1,\ldots , x_r)\) as there are atoms and they are exclusive, we must have
$$\begin{aligned} \alpha _{{\epsilon }}(x_1,\ldots , x_r) = \alpha _{{\sigma (\epsilon )}}^{\vec {\psi } }(x_1,\ldots , x_r), \end{aligned}$$
which concludes this direction of the proof.

For the converse suppose that the translation \(\vec {\psi }\) supports a permutation \(\sigma \) of atoms. We need to show that \(\sigma \) satisfies (C).

Since \(\vec {\psi }\) supports \(\sigma \) we have that for each \(\epsilon \in \Omega _{\vec {R}}\),
$$\begin{aligned} \alpha _\epsilon (x_1, \ldots ,x_r) &= \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} R_i ^{\epsilon (i, j_1, \ldots ,j_{r_i})}(x_{j_1}, \ldots , x_{j_{r_i}}) \\ &= \alpha ^{\vec {\psi }}_{\sigma (\epsilon )} (x_1, \ldots ,x_r) = \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle j_1, \ldots , j_{r_i}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} \psi _i ^{\sigma (\epsilon ) (i, j_1, \ldots ,j_{r_i})}(x_{j_1}, \ldots , x_{j_{r_i}}). \end{aligned}$$
Note that this means that when \(\gamma \in \Omega _{\vec {R}}\), \(t\le r\), the \( k_1, \ldots ,k_t \in \{1,\ldots ,r\}\) are distinct and \(f:\{1, \ldots , r\} \rightarrow \{k_1, \ldots , k_t\}\) is a surjection, then
$$\begin{aligned} (\alpha _\gamma [x_{k_1}, \ldots ,x_{k_t}])_f &= \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle {j_1}, \ldots , {j_{r_i}}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} R_i ^{\gamma (i,{f(j_1)}, \ldots , {f(j_{r_i})})}(x_{ j_1}, \ldots ,x_{j_{r_i}}) \\ &= (\alpha ^{\vec {\psi }}_{\sigma (\gamma )} [x_{k_1}, \ldots ,x_{k_t}])_f \\ &= \bigwedge _{\begin{array}{c} i \in \{1,2,\ldots ,q\} \\ \langle {j_1}, \ldots , {j_{r_i}}\rangle \in \{1,2,\ldots , r\}^{r_i} \end{array}} \psi _i ^{\sigma (\gamma ) (i, {f(j_1)}, \ldots ,{f(j_{r_i})})}(x_{j_1}, \ldots , x_{j_{r_i}}). \end{aligned}$$
Hence \(\alpha _\epsilon (x_1, \ldots ,x_r) = (\alpha _\gamma [x_{k_1}, \ldots ,x_{k_t}])_f\) just when for all i and \(j_1, \ldots , j_{r_i}\in \{1,\ldots ,r\} \) we have
$$\begin{aligned} \epsilon (i, j_1, \ldots ,j_{r_i})= \gamma (i,{f(j_1)}, \ldots , {f(j_{r_i})}) \end{aligned}$$
and just when for all i and \(j_1, \ldots , j_{r_i}\in \{1,\ldots ,r\} \) we have
$$\begin{aligned} \sigma (\epsilon ) (i, j_1, \ldots ,j_{r_i})= \sigma (\gamma ) (i,{f(j_1)}, \ldots , {f(j_{r_i})}) \end{aligned}$$
which is equivalent to \(\alpha _{\sigma (\epsilon )} (x_1, \ldots ,x_r) = (\alpha _{\sigma (\gamma )} [x_{k_1}, \ldots ,x_{k_t}])_f\), as required. \(\square \)
The Example continued For a language L containing one unary predicate \(R_1\) and one binary predicate \(R_2\), let \(\sigma :\Omega _{\vec {R}}\rightarrow \Omega _{\vec {R}}\) be defined by
$$\begin{aligned} \sigma (\epsilon ) = \left\{ \begin{array}{ll} \delta &{}\quad \mathrm{if}\quad \epsilon (2,1,2)\ne \epsilon (2,2,1) ~\mathrm{and}~ \epsilon (1,1)=\epsilon (1,2)=1, \\ \epsilon &{}\quad \mathrm{otherwise}, \end{array}\right. \end{aligned}$$
where \(\delta \) is as \(\epsilon \) except that \(\delta (2,1,2)= \epsilon (2,2,1)\) and \(\delta (2,2,1)= \epsilon (2,1,2)\). Considering the 3 non-identity possibilities for f in condition (C) it is easy to see that the condition is satisfied and hence \(\sigma \) is supported by a translation. From (12) we can see that the translation is \(\psi _1(x)=R_1(x)\) and
$$\begin{aligned} \psi _2(x,y)= (\lnot (R_1(x) \wedge R_1(y))\wedge R_2(x,y) ) \vee (R_1(x) \wedge R_1(y) \wedge R_2(y,x) ). \end{aligned}$$
This shows that Smith and Jones could argue in terms of HB just as well as in terms of HD.
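The permutation \(\sigma \) above can also be checked mechanically. The following sketch (our own illustration, not from the paper) codes each \(\epsilon \in \Omega _{\vec {R}}\) for \(r=2\) variables by its six 0/1 values \(\epsilon (i,j_1,\ldots ,j_{r_i})\) and verifies that \(\sigma \) is an involution of the \(2^6=64\) atoms, moving exactly 8 of them:

```python
from itertools import product

# Code each atom alpha_epsilon over the variables x1, x2 (one unary predicate
# R1, one binary predicate R2) by the values epsilon(i, j1, ..., j_{r_i})
# on the six argument tuples below.
KEYS = [(1, 1), (1, 2), (2, 1, 1), (2, 1, 2), (2, 2, 1), (2, 2, 2)]

def sigma(eps):
    """The permutation of the example: swap epsilon(2,1,2) with epsilon(2,2,1)
    exactly when they differ and epsilon(1,1) = epsilon(1,2) = 1."""
    if eps[(2, 1, 2)] != eps[(2, 2, 1)] and eps[(1, 1)] == eps[(1, 2)] == 1:
        out = dict(eps)
        out[(2, 1, 2)], out[(2, 2, 1)] = eps[(2, 2, 1)], eps[(2, 1, 2)]
        return out
    return eps

all_eps = [dict(zip(KEYS, bits)) for bits in product((0, 1), repeat=len(KEYS))]
moved = sum(1 for e in all_eps if sigma(e) != e)
assert all(sigma(sigma(e)) == e for e in all_eps)  # an involution, hence a bijection
print(len(all_eps), moved)  # 64 atoms in total, 8 of them moved by sigma
```

The 8 moved atoms come in the 4 swapped pairs with \(\epsilon (1,1)=\epsilon (1,2)=1\) and \(\epsilon (2,1,2)\ne \epsilon (2,2,1)\).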
As in the unary case, a translation \(\vec {\psi }\) as above that maps the atom \(\alpha _{\epsilon }(x_1, \ldots ,x_r)\) to the atom
$$\begin{aligned} \alpha ^{\vec {Q}}_{\tau _{\vec {\psi }}(\epsilon )}(x_1, \ldots ,x_r) \end{aligned}$$
(which is logically equivalent to \(\alpha _{\epsilon }(x_1, \ldots ,x_r)\) when the \(Q_j\) are defined by \( Q_j(x_1, \ldots ,x_{r_j}) = \psi _j(x_1, \ldots ,x_{r_j})\)) extends to all formulae of \(L^-_{\vec {R}}\), and hence also to all sentences of \(L_{\vec {R}}\). To see that, first note that any state formula \(\Theta (x_1,x_2, \ldots , x_n)\) is logically equivalent to a conjunction of atoms, see (Ronel and Vencovská 2014) or (Paris and Vencovská 2015, Chapter 41):
$$\begin{aligned} \Theta (x_1,\ldots , x_n) = \bigwedge _{\langle k_1,\ldots ,k_r\rangle \in \{1,\ldots ,n\}^r}\alpha _{\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle }}(x_{k_1},\ldots ,x_{k_r}), \end{aligned}$$
where the \(\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle } \) are elements of \(\Omega _{\vec {R}}\). Hence with the \(Q_j\) defined as the \(\psi _j\), \(\Theta (x_1, \ldots , x_n)\) is logically equivalent to
$$\begin{aligned} \bigwedge _{\langle k_1,\ldots ,k_r\rangle \in \{1,\ldots ,n\}^r}\alpha ^{\vec {Q}}_{\tau _{\vec {\psi }}(\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle })}(x_{k_1},\ldots ,x_{k_r}) \end{aligned}$$
(which we may denote \((\tau _{\vec {\psi }}\Theta )^{\vec {Q}}(x_1,\ldots ,x_n)\)).
The rest follows by the Prenex and Disjunctive Normal Form Theorems because any formula \(\xi (x_1, \ldots ,x_n)\) of \(L^-_{\vec {R}}\) is logically equivalent to a formula of \(L^-_{\vec {R}}\) of the form
$$\begin{aligned} T_1 x_{n+1} T_2 x_{n+2} \ldots T_{m} x_{n+m} \bigvee _{j=1}^t \Theta _{j}(x_1, \ldots ,x_{n+m}) \end{aligned}$$
where each \(T_i\) is one of \(\forall \) or \(\exists \), and hence also to
$$\begin{aligned} T_1 x_{n+1} T_2 x_{n+2} \ldots T_{m} x_{n+m} \bigvee _{j=1}^t (\tau _{\vec {\psi }}\Theta _j)^{\vec {Q}}(x_1, \ldots ,x_{n+m}). \end{aligned}$$
This observation is not surprising in the unary case where atoms for distinct variables are disjoint but seems somewhat remarkable in the polyadic case where atoms for different r-tuples of variables may well be incompatible if the r-tuples have some variables in common.

Unlike in the unary case, for the polyadic case we leave open the question of fully characterising the permutations of constituents (as described in Hintikka (1965)) which are supported by a translation.

3 Verisimilitude?

The original motivation for the research in this note came from Miller’s paper (Miller 1974) regarding a measure of closeness of a theory to the truth, where, as explained in e.g., (Miller 1978), the truth is identified with a constituent of a finite propositional language (a complete consistent theory) and the theory with a set of constituents (possibly just one, as in the Prisoner Example). First, propositional languages were considered (equivalently, unary predicate languages with one constant), then unary predicate languages with no constants, see (Miller 1978).

With predicate languages, however, it appears natural to employ languages with constants. In this case we are led to the notion of the quantifier free truth about \(a_1, a_2, \ldots ,a_n\) being a state description for \(a_1,a_2,\ldots ,a_n\) and a quantifier free theory about \(a_1,a_2, \ldots ,a_n\) being a set (disjunction) of such. State descriptions are conjunctions of instantiated atoms, and in Sect. 1 we saw that for unary languages every permutation of atoms (that is, of \(\Omega _{\vec {P}}\)) is supported by some translation, which disqualifies the Hamming distance between atoms as a measure of verisimilitude. In the polyadic case a permutation of atoms (that is, of \(\Omega _{\vec {R}}\)) is supported by a translation just when it satisfies the condition (C), and the same conclusion has to be reached regarding the suitability of the Hamming distance between atoms for measuring verisimilitude.

Whilst atoms can be ‘renamed’ by translations in various ways [subject to (C)], and the Hamming distance between them (i.e., the Hamming distance between elements of \(\Omega _{\vec {R}}\)) can change, translations do preserve the Hamming distance between state formulae
$$\begin{aligned} \Theta (x_1,\ldots , x_n)= & {} \bigwedge _{\langle k_1,\ldots ,k_r\rangle \in \{1,\ldots ,n\}^r}\alpha _{\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle }}(x_{k_1},\ldots ,x_{k_r}), \\ \Phi (x_1,\ldots , x_n)= & {} \bigwedge _{\langle k_1,\ldots ,k_r\rangle \in \{1,\ldots ,n\}^r}\alpha _{\epsilon _{ \langle \Phi , \langle k_1,\ldots ,k_r\rangle \rangle }}(x_{k_1},\ldots ,x_{k_r}), \end{aligned}$$
as given by
$$\begin{aligned} \sum _{\langle k_1,\ldots ,k_r\rangle \in \{1,\ldots ,n\}^r} \#(\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle }, \epsilon _{ \langle \Phi , \langle k_1,\ldots ,k_r\rangle \rangle }) \end{aligned}$$
where
$$\begin{aligned} \#(\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle }, \epsilon _{ \langle \Phi , \langle k_1,\ldots ,k_r\rangle \rangle })= \left\{ \begin{array}{ll} 0 &{}\quad \text{ if } \quad \epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle } = \epsilon _{ \langle \Phi , \langle k_1,\ldots ,k_r\rangle \rangle }, \\ 1&{}\quad \text{ otherwise. } \end{array} \right. \end{aligned}$$
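This distance is straightforward to compute. A minimal sketch, under an assumed coding of state formulae as maps from \(r\)-tuples of variable indices to (hypothetical, purely illustrative) atom indices:

```python
from itertools import product

def hamming_distance(theta, phi):
    """Hamming distance between two state formulae in x_1,...,x_n, each coded
    as a map from r-tuples <k_1,...,k_r> in {1,...,n}^r to the index of the
    atom instantiated on (x_{k_1},...,x_{k_r})."""
    assert theta.keys() == phi.keys()
    return sum(theta[ks] != phi[ks] for ks in theta)

# Toy case: n = 3 variables, r = 2, atoms indexed by integers.
n, r = 3, 2
tuples = list(product(range(1, n + 1), repeat=r))
Theta = {ks: 0 for ks in tuples}
Phi = {**Theta, (1, 2): 5, (2, 1): 7}   # differ on exactly two tuples
print(hamming_distance(Theta, Phi))     # 2
```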
Translations also preserve the structure of state descriptions/formulae in the sense which we now explain. In Paris and Vencovská (2015, Chapter 40) the following notion of similarity is introduced:
State formulae \(\Theta (x_1, \ldots , x_n)\), \(\Phi (x_1, \ldots , x_n)\) are similar if for distinct \({m_1},\ldots , {m_s} \in \{1,2,\ldots , n\}\), \(k_1, \ldots , k_t \in \{1,2,\ldots , n\}\) and surjection \(f :\{m_1, \ldots , m_s\} \rightarrow \{k_1, \ldots , k_t\}\),
$$\begin{aligned} \Theta [x_{m_1}, \ldots , x_{m_s}] = (\Theta [x_{k_1}, \ldots , x_{k_t}])_f \Longleftrightarrow \Phi [x_{m_1}, \ldots , x_{m_s}] = (\Phi [x_{k_1}, \ldots , x_{k_t}])_f. \end{aligned}$$
Informally, \(\Theta (x_1, \ldots , x_n)\), \(\Phi (x_1, \ldots , x_n)\) are similar if, whenever in \(\Theta \) the behaviour of some variables \(x_{m_1}, \ldots , x_{m_s}\) exactly corresponds to the behaviour of some further variables \(x_{k_1}, \ldots , x_{k_t}\) (with f specifying how the \(x_{k_1}, \ldots , x_{k_t}\) are ‘cloned’ by the \(x_{m_1}, \ldots , x_{m_s}\)), the same happens in \(\Phi \), and conversely.
It follows from results in Ronel and Vencovská (2014), Paris and Vencovská (2015, Chapter 41) and the above that state formulae \(\Theta (x_1, \ldots , x_n)\) and \(\Phi (x_1, \ldots , x_n)\) with \(\Theta \) as in (30) are similar just when there is a permutation of atoms \(\sigma \), supported by a translation, such that \( \Phi (x_1, \ldots ,x_n) \) is equal to
$$\begin{aligned} \sigma (\Theta (x_1, \ldots ,x_n))=\bigwedge _{\langle k_1,\ldots ,k_r\rangle \in \{1,\ldots ,n\}^r}\alpha _{\sigma (\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle })}(x_{k_1},\ldots ,x_{k_r}), \end{aligned}$$
that is, just when \(\Phi \) is a ‘translated’ version of \(\Theta \). In other words there are quantifier free formulae \(\psi _i(x_1,\ldots ,x_{r_i})\) such that if we replace each \(R_i \) in \(\Phi \) by \(\psi _i\) we obtain \(\Theta \).

Hence any translated version of the truth carries some information about the truth, namely the shared structure. From this viewpoint then the minimum Hamming distance between translations of state descriptions/formulae might be used to measure how far a theory is from capturing the structure of the truth.

4 Renaming Invariance and Rationality

Consider the problem Pure Inductive Logic aims to address: How to give a rational assignment of probabilities \(w(\theta )\) to sentences \(\theta \) of an entirely uninterpreted language \(L_{\vec {R}}\)? The current modus operandi here is to propose principles which we may intuitively feel are somehow ‘rational’ for the probability function w to satisfy and to investigate their consequences and interrelationships.

The previous sections suggest a principle of ‘translation invariance’: that w should be unaffected by the sort of translation we have considered above. For suppose that within the remit of Pure Inductive Logic we have chosen a ‘rational’ probability function w on the set of sentences of the language \(L_{\vec {R}}\) (denoted \(SL_{\vec {R}}\)), that is, chosen a probability function satisfying those principles which we judge to demarcate what ‘rational’ means. A caviller now points out that since there is supposed to be no intended interpretation here we could equally well have based our choice on a translation of the original relations, that is, on an equally expressive set of relation symbols of appropriate arities. So, the caviller continues, to be consistent we should still be giving the same probabilities even after making this translation (and regardless of the names chosen for the relation symbols). In other words, at the risk of otherwise seeming inconsistent, we should accept the principle that a rational assignment of probabilities should additionally be invariant under translations.

Translation Invariance Principle, TIP

If \(\sigma \) is a permutation of atoms supported by a translation and \(w_{\sigma }\) is the probability function on \(SL_{\vec {R}}\) determined by
$$\begin{aligned} w_\sigma (\Theta (b_1, \ldots , b_n)) = w(\sigma (\Theta (x_1, \ldots ,x_n))(b_1/x_1, \ldots , b_n/x_n)), \end{aligned}$$
then \(w=w_\sigma .\)

It turns out that such a principle of ‘translation invariance’ already exists in equivalent forms in the literature. We shall now discuss this in more detail.

4.1 The Unary Case

We start with the case of the purely unary language \(L_{\vec {P}}\), adopting again the notation of that first section. Let w be a probability function on the set of sentences \(SL_{\vec {P}}\) of \(L_{\vec {P}}\). Then w is uniquely determined (see for example (Paris and Vencovská 2015, Chapter 7) or [12]) by its values on the state descriptions of \(L_{\vec {P}}\), that is sentences of \(L_{\vec {P}}\) of the form
$$\begin{aligned} \bigwedge _{i=1}^m \alpha _{\epsilon _i}(b_i) \end{aligned}$$
where the \(b_1,b_2, \ldots , b_m\) are distinct constants from \(\{a_1,a_2,a_3, \ldots \}\).

As in Sect. 1 but using \(\vec {P}\) also in place of \(\vec {Q}\), any permutation \(\tau \) of \(\Omega _{\vec {P}}\) determines a permutation/renaming/translation of atoms and in turn of state formulae, that is formulae of \(L^-_{\vec {P}}\) of the form \( \bigwedge _{i=1}^m \alpha _{\epsilon _i}(x_i)\), by sending \( \bigwedge _{i=1}^m \alpha _{\epsilon _i}(x_i)\) to \( \bigwedge _{i=1}^m \alpha _{\tau (\epsilon _i)}(x_i)\).

For a probability function w, each such translation uniquely determines a further probability function \(w_\tau \) on \(SL_{\vec {P}}\) by setting
$$\begin{aligned} w_\tau \left( \bigwedge _{i=1}^m \alpha _{\epsilon _i}(b_i)\right) = w\left( \bigwedge _{i=1}^m \alpha _{\tau (\epsilon _i)}(b_i)\right) . \end{aligned}$$
The requirement that a probability function w is invariant under translation in the sense we have discussed above can be seen to be equivalent to satisfying the well-known property of Atom Exchangeability:

The Principle of Atom Exchangeability, Ax

For \(\tau \) a permutation of atoms and a state description \(\bigwedge _{i=1}^n \alpha _{\epsilon _i}(b_i)\),
$$\begin{aligned} w\left( \bigwedge _{i=1}^n \alpha _{\epsilon _i}(b_i)\right) = w\left( \bigwedge _{i=1}^n \alpha _{\tau (\epsilon _i)}(b_i)\right) . \end{aligned}$$
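As a minimal illustration (our own, with hypothetical numbers and \(q=2\), so four atoms): the product probability function which picks each atom independently with probability 1/4 gives every state description a value depending only on its length, so it satisfies Ax trivially.

```python
from fractions import Fraction
from itertools import permutations

# Four atoms (q = 2 unary predicates), each chosen independently with
# probability 1/4; this makes w depend only on the length of the
# state description, so Atom Exchangeability is immediate.
NUM_ATOMS = 4
c = [Fraction(1, NUM_ATOMS)] * NUM_ATOMS

def w(atom_indices):
    """w of the state description /\_i alpha_{eps_{h_i}}(b_i) for the listed h_i."""
    p = Fraction(1)
    for h in atom_indices:
        p *= c[h]
    return p

state = [0, 0, 1, 3]
# Ax: renaming the atoms by any permutation tau leaves the value unchanged.
assert all(w([tau[h] for h in state]) == w(state)
           for tau in permutations(range(NUM_ATOMS)))
print(w(state))  # (1/4)^4 = 1/256
```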
Atom Exchangeability follows from Johnson’s Sufficientness Postulate (see for example (Paris and Vencovská 2015, Lemma 17.1)) and hence holds for the members of Carnap’s Continuum of Inductive methods. It implies (but is not implied by) two other principles which are central in the field:

The Principle of Predicate Exchangeability, Px

For \(\theta \in SL\) and predicate symbols \(P_i,P_j\) of L, if \(\theta '\) is the result of transposing \(P_i,P_j\) throughout \(\theta \) then \(w(\theta ) = w(\theta ')\).

The Strong Negation Principle, SN

For \(\theta \in SL\), \(w(\theta )=w(\theta ')\) where \(\theta '\) is the result of replacing each occurrence of the predicate symbol \(P_i\) in \(\theta \) by \(\lnot P_i\).

Each of the principles Px and SN has an evident claim to rationality on the grounds of symmetry: in the completely uninterpreted situation envisaged here it would be irrational to assign probability values which broke the existing symmetries in the language. However, the reasons for the rationality of Atom Exchangeability are less easy to appreciate, as indeed the surprise inherent in Miller’s example shows.

For this reason one might require the constraints imposed on one’s choice of probability function to at least include Px+SN. That being the case, one might argue not that we should always have \(w= w_\tau \) for \(\tau \) a permutation of \(\{0,1\}^q\), but simply that \(w_\tau \) is, as far as these constraints are concerned, an equally good choice, i.e., that the \(w_\tau \) should also satisfy at least Px+SN.

Thus we would be advocating here a sort of meta-principle, namely that if one initially proposed that adherence to principles XYZ etc. determined what constituted a rational choice of probability function w, then on secondary consideration w should additionally be such that all the \(w_\tau \) also satisfy XYZ etc.

In the particular case of Px+SN this meta-principle yields a new principle which, for \(q>2\), lies strictly between Px+SN and Ax:

The Unary Principle of Inculcated Px+SN, I(Px+SN)

For every permutation \(\tau \) of \(\Omega _{\vec {P}}\), \(w_\tau \) satisfies Px+SN.

Theorem 3

Let w be a probability function on \(SL_{\vec {P}}\). Then the probability function \(w_\tau \) satisfies Px+SN for each permutation \(\tau \) of \(\Omega _{\vec {P}}\) just if either \(q >2\) and \(w=w_\sigma \) for every even permutation \(\sigma \) of \(\Omega _{\vec {P}}\), or \(q \le 2\) and w satisfies Ax.


We first show this in the forward direction. Let \(\mathcal {S}_{\vec {P}}\) be the group of permutations of \(\Omega _{\vec {P}}\). Let H be the subgroup of \(\mathcal {S}_{\vec {P}}\) of permutations \(\tau \) such that \(v_\tau =v\) for all probability functions v on \(SL_{\vec {P}}\) which satisfy Px+SN. Equivalently, H is generated by the permutations of atoms which just transpose \(P_i,P_j\) and those which just transpose \(P_i, \lnot P_i\). Notice that for \(q>2\) all such permutations are even, whilst for \(q \le 2\), H also contains odd permutations. Note too that a probability function v satisfies Px+SN just when \(v_\tau =v\) for all \(\tau \in H\).
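These parity claims can be verified computationally. A sketch (our own, with atoms coded as 0/1 vectors recording the signs of \(P_1,\ldots ,P_q\)): the generator transposing two predicates swaps two coordinates, and the generator transposing \(P_i,\lnot P_i\) flips coordinate i; we compute the parity of the induced permutation of the \(2^q\) atoms for \(q=2,3\).

```python
from itertools import product

def parity(perm):
    """+1 for an even permutation (given as a sequence of images), -1 for odd,
    computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def induced(q, action):
    """The permutation of the 2^q atoms (0/1 vectors) induced by `action`."""
    atoms = list(product((0, 1), repeat=q))
    index = {a: n for n, a in enumerate(atoms)}
    return [index[action(a)] for a in atoms]

def swap_P1_P2(a):   # transposing P_1 and P_2 swaps the first two coordinates
    return (a[1], a[0]) + a[2:]

def negate_P1(a):    # transposing P_1 and not-P_1 flips the first coordinate
    return (1 - a[0],) + a[1:]

for q in (2, 3):
    print(q, parity(induced(q, swap_P1_P2)), parity(induced(q, negate_P1)))
# For q = 2 the predicate transposition is odd; for q = 3 both generators are even.
```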

Let \(K_w\) be the set of \(\tau \in \mathcal {S}_{\vec {P}}\) such that \(w=w_\tau \). Then \(K_w\) is a subgroup of \(\mathcal {S}_{\vec {P}}\). For \(\rho \in K_w\), and \(\sigma \in \mathcal {S}_{\vec {P}}\),
$$\begin{aligned} w_\sigma \left( \bigwedge _{i=1}^m \alpha _{\epsilon _i}(b_i)\right) &= w\left( \bigwedge _{i=1}^m \alpha _{\sigma (\epsilon _i)}(b_i)\right) = w_\rho \left( \bigwedge _{i=1}^m \alpha _{\sigma (\epsilon _i)}(b_i)\right) \\ &= w \left( \bigwedge _{i=1}^m \alpha _{\rho \sigma (\epsilon _i)}(b_i)\right) = w_\sigma \left( \bigwedge _{i=1}^m \alpha _{\sigma ^{-1}\rho \sigma (\epsilon _i)}(b_i)\right) \end{aligned}$$
so \(\sigma ^{-1} K_w \sigma \subseteq K_{w_\sigma }\) and hence, since \(|\sigma ^{-1} K_w \sigma |=| K_{w_\sigma }|\), \(K_{w_\sigma }= \sigma ^{-1}K_w \sigma .\)
Also since the \(w_\sigma \) all satisfy Px+SN, \(H \subseteq K_{w_\sigma }\) so
$$\begin{aligned} \bigcap _{\sigma \in \mathcal {S}_{\vec {P}}} K_{w_\sigma }= \bigcap _{\sigma \in \mathcal {S}_{\vec {P}}} \sigma ^{-1} K_w \sigma \end{aligned}$$
is a subgroup of \(\mathcal {S}_{\vec {P}}\) containing H (and so is non-trivial). In fact it is a normal subgroup since if
$$\begin{aligned} \gamma \in \bigcap _{\sigma \in \mathcal {S}_{\vec {P}}} \sigma ^{-1} K_w \sigma \end{aligned}$$
then for any \(\tau \in \mathcal {S}_{\vec {P}}\),
$$\begin{aligned} \tau \gamma \tau ^{-1}\in \bigcap _{\sigma \in \mathcal {S}_{\vec {P}}} \tau \sigma ^{-1} K_w \sigma \tau ^{-1}=\bigcap _{\delta \in \mathcal {S}_{\vec {P}}} \delta ^{-1} K_w \delta \end{aligned}$$
since \(\delta = \sigma \tau ^{-1}\) will run through all elements of \(\mathcal {S}_{\vec {P}}\) as \(\sigma \) does.

Now suppose that \(q >2\). Then since the only non-trivial normal subgroups of \(\mathcal {S}_{\vec {P}}\) are itself and the alternating group A of even permutations, \(\bigcap _{\sigma \in \mathcal {S}_{\vec {P}}} K_{w_\sigma }\) must be one of these and hence, by taking \(\sigma \) to be the identity permutation, it follows that \(A \subseteq K_w\). Hence \(w=w_\sigma \) for every \(\sigma \in A\). (As we shall see later, in this case of \(q>2\), H only contains even permutations and we cannot obtain a stronger result here.)

We now turn to the case \(q \le 2\). When \(q =1\), \(\mathcal {S}_{\vec {P}}\) is the only non-trivial normal subgroup of \(\mathcal {S}_{\vec {P}}\) so in this case we must have \(K_w=\mathcal {S}_{\vec {P}}\), in other words w satisfies Ax. When \(q=2\), \(\mathcal {S}_{\vec {P}}\) has two non-trivial proper normal subgroups. However both of these only contain even permutations while H contains some odd permutations so again we must have \(K_w=\mathcal {S}_{\vec {P}}\) and Ax follows.

Turning to the converse direction, this is clear in the case of \(q \le 2\). For \(q>2\), suppose that \(w=w_\sigma \) for all \(\sigma \in A\). Then certainly, since \(H \subseteq A\) for \(q>2\), w is invariant under permutations of predicate symbols and permutations replacing \(P_i\) by \(\lnot P_i\), so w satisfies Px+SN, and so does \(w_\sigma \) for \(\sigma \) an even permutation. Also since \(w_\sigma =w_\tau \) for all even permutations \(\sigma , \tau \), it is easy to see that this must also hold for all odd permutations \(\sigma , \tau \), and hence by the same argument as in the even case, for \(\sigma \) an odd permutation \(w_\sigma \) must satisfy Px+SN too. \(\square \)

At this point one might question whether even for \(q>2\) the new principle I(Px+SN), amounting to \(A \subseteq K_w\), really is strictly between Px+SN and Ax. (As Theorem 3 shows, it is equivalent to Ax for \(q\le 2\).) We now construct examples of probability functions which show that this is the case.

Let \(\epsilon _1,\epsilon _2, \ldots , \epsilon _{2^q}\) list the elements of \(\Omega _{\vec {P}}\). For \(\sigma \in \mathcal {S}_{\vec {P}}\) it will be convenient to also treat \(\sigma \) as a permutation of these subscripts, that is \(\sigma (\epsilon _i)=\epsilon _{\sigma (i)}\).

For \(\vec {c} = \langle c_1,c_2, \ldots , c_{2^q}\rangle \in \mathbb {R}^{2^q}\) such that the \(c_i\) are non-negative with sum 1 define the probability function \(w_{\vec {c}}\) by
$$\begin{aligned} w_{\vec {c}}\left( \bigwedge _{i=1}^m \alpha _{\epsilon _{h_i}}(b_i)\right) = \prod _{i=1}^m c_{h_i}. \end{aligned}$$
Now set
$$\begin{aligned} v= |A|^{-1} \sum _{\sigma \in A} w_{\sigma \vec {c}} \end{aligned}$$
where \(\sigma \vec {c} = \langle c_{\sigma (1)}, c_{\sigma (2)}, \ldots , c_{\sigma (2^q)}\rangle \). Then \(K_v \supseteq A\) and for distinct constants \(b_{i,k}\), \(i=1,2, \ldots , 2^q\), \(k=1,2, \ldots ,i-1\),
$$\begin{aligned} v\left( \bigwedge _{i=1}^{2^q} \bigwedge _{k =1}^{i-1}\alpha _i(b_{i,k})\right) \end{aligned}$$
is \(|A|^{-1}\) times the sum of the positively signed terms from the determinant of the Vandermonde \(2^q \times 2^q\) matrix with ijth entry \(c_i^{j-1}\). Since we can find suitable \(\vec {c}\) yielding a positive Vandermonde determinant (for example, \(\vec {c}\) with strictly increasing coordinates), this means that for any odd permutation \(\tau \)
$$\begin{aligned} v\left( \bigwedge _{i=1}^{2^q} \bigwedge _{k=1}^{i-1}\alpha _i(b_{i,k})\right) > v\left( \bigwedge _{i=1}^{2^q} \bigwedge _{k=1}^{i-1}\alpha _{\tau (i)}(b_{i,k})\right) \end{aligned}$$
so in this case \(K_v \subseteq A\), forcing \(K_v=A\).
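The Vandermonde argument can be illustrated numerically. The sketch below (our own, with \(N=4\) atoms standing in for \(2^q\) and an assumed increasing, unnormalized \(\vec c\), which suffices for the inequality) builds v by averaging \(w_{\sigma \vec c}\) over the even permutations and confirms the strict inequality for an odd \(\tau \):

```python
from fractions import Fraction
from itertools import permutations

def parity(perm):
    """+1 for an even permutation, -1 for odd (by counting inversions)."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

N = 4                                        # stands in for the 2^q atoms
c = [Fraction(k, 10) for k in (1, 2, 3, 4)]  # increasing => Vandermonde det > 0
EVEN = [p for p in permutations(range(N)) if parity(p) == 1]

def v(mult):
    """v on the state description in which atom i occurs mult[i] times:
    the average of w_{sigma c} over the even permutations sigma."""
    total = Fraction(0)
    for p in EVEN:
        term = Fraction(1)
        for i, m in enumerate(mult):
            term *= c[p[i]] ** m
        total += term
    return total / len(EVEN)

base = [0, 1, 2, 3]                 # atom i occurs i-1 times, i = 1..N
tau = (1, 0, 2, 3)                  # an odd permutation of atom indices
permuted = [base[tau[i]] for i in range(N)]
assert v(base) > v(permuted)        # so no odd permutation lies in K_v
print(v(base) - v(permuted))        # det/|A| = (12/10^6)/12 = 1/1000000
```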
To distinguish the ‘new principle’ from Px+SN in the case \(q>2\) suppose without loss of generality that
$$\begin{aligned} \alpha _{\epsilon _1}(x) &= P_1(x) \wedge P_2(x) \wedge \bigwedge _{j=3}^q P_j(x) \\ \alpha _{\epsilon _2}(x) &= P_1(x) \wedge \lnot P_2(x) \wedge \bigwedge _{j=3}^q P_j(x) \\ \alpha _{\epsilon _3}(x) &= \lnot P_1(x) \wedge P_2(x) \wedge \bigwedge _{j=3}^q P_j(x) \end{aligned}$$
and let \(\nu \) be the even permutation \(\epsilon _1 \mapsto \epsilon _2 \mapsto \epsilon _3 \mapsto \epsilon _1\) which leaves all remaining elements of \(\Omega _{\vec {P}}\) fixed. Then \(\nu \notin H\) since it does not preserve the Hamming distance between atoms whilst every permutation in H does (and conversely in fact, see (Hill and Paris 2013)). Since \( \nu \notin H\) the polynomials
$$\begin{aligned} \sum _{\sigma \in H} \prod _{i=1}^{2^q} x_{\sigma (i)}^i, \quad \sum _{\sigma \in H} \prod _{i=1}^{2^q} x_{\nu \sigma (i)}^i, \end{aligned}$$
are not formally the same and hence there must be a vector in the positive orthant, indeed, since these polynomials are homogeneous, a normalized positive vector \(\langle c_1,c_2, \ldots , c_{2^q}\rangle \), on which they give different values. But that means that if
$$\begin{aligned} u= |H|^{-1} \sum _{\sigma \in H} w_{\sigma \vec {c}} \end{aligned}$$
then u satisfies Px+SN whilst \(u \ne u_\nu \) since they give different values on
$$\begin{aligned} \bigwedge _{i=1}^{2^q} \bigwedge _{k =1}^{i}\alpha _i(b_{i,k}). \end{aligned}$$
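That \(\nu \notin H\) while every member of H preserves the Hamming distance can likewise be checked mechanically for \(q=3\). A sketch (our own), generating H as in the proof of Theorem 3:

```python
from itertools import product

Q = 3
ATOMS = [tuple(b) for b in product((0, 1), repeat=Q)]
INDEX = {a: n for n, a in enumerate(ATOMS)}

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def preserves(p):
    """Does the permutation p of atom indices preserve Hamming distance?"""
    return all(hamming(ATOMS[p[m]], ATOMS[p[n]]) == hamming(ATOMS[m], ATOMS[n])
               for m in range(len(ATOMS)) for n in range(len(ATOMS)))

# Generators of H: transposing two predicates swaps two coordinates,
# transposing P_i with its negation flips coordinate i.
def generators():
    for i in range(Q):
        for j in range(i + 1, Q):
            yield tuple(INDEX[a[:i] + (a[j],) + a[i+1:j] + (a[i],) + a[j+1:]]
                        for a in ATOMS)
        yield tuple(INDEX[a[:i] + (1 - a[i],) + a[i+1:]] for a in ATOMS)

# Close the generators under composition to obtain H.
identity = tuple(range(len(ATOMS)))
H, frontier = {identity}, [identity]
gens = list(generators())
while frontier:
    p = frontier.pop()
    for g in gens:
        q = tuple(g[p[k]] for k in range(len(ATOMS)))
        if q not in H:
            H.add(q)
            frontier.append(q)

assert all(preserves(p) for p in H)   # every member of H preserves Hamming distance

# nu: the 3-cycle eps_1 -> eps_2 -> eps_3 -> eps_1 from the text (here q = 3).
e1, e2, e3 = (1, 1, 1), (1, 0, 1), (0, 1, 1)
nu = list(identity)
nu[INDEX[e1]], nu[INDEX[e2]], nu[INDEX[e3]] = INDEX[e2], INDEX[e3], INDEX[e1]
assert not preserves(tuple(nu))       # nu breaks Hamming distance, so nu is not in H
print(len(H))                         # 48 = 2^3 * 3!
```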
A somewhat surprising observation on the proof of Theorem 3 is that we could have dropped SN throughout. Similarly, as far as the result for \(q>2\) is concerned, we could have dropped Px (while retaining SN). Notice too that all the above results (and counter-examples) hold equally well if we add to Px+SN the ubiquitous Principle of Constant Exchangeability, Ex, that is, that for any sentence \(\theta (a_1, a_2, \ldots , a_n)\) and distinct \(j_1,j_2, \ldots , j_n\),
$$\begin{aligned} w(\theta (a_1, a_2, \ldots , a_n)) = w(\theta (a_{j_1}, a_{j_2}, \ldots , a_{j_n})). \end{aligned}$$

4.2 The Polyadic Case

We now turn to the case of the polyadic language \(L_{\vec {R}}\) and w a probability function on \(SL_{\vec {R}}\).

In this case too, equivalent forms of the Translation Invariance Principle already exist in the literature, see (Paris and Vencovská 2015, Chapters 39, 40; Paris and Vencovská 2011; Ronel and Vencovská 2014). Most directly, the principle is equivalent to the Permutation Invariance Principle, PIP. This is a special case of the ‘ultimate’ symmetry principle of Pure Inductive Logic, INV, which we will briefly explain first.

In Paris and Vencovská (2015, Chapters 23, 39) we have argued that assigning probabilities to (classes of logically equivalent) sentences of an entirely uninterpreted language \(L_{\vec {R}}\) could be imagined as a task to be performed by an agent who knows that s/he is in a structure M for \(L_{\vec {R}}\) with universe \(\{a_1, a_2, \ldots \}\), in which each constant symbol \(a_i\) is interpreted as \(a_i\), but having no information as to what sentences of \(SL_{\vec {R}}\) hold in their ambient structure M. Since a rational agent in such a situation would presumably wish to respect symmetry, this picture clearly helps to motivate the symmetry principles which we have mentioned already. Our attempt to capture that which underlies all symmetry principles in Pure Inductive Logic, see the foregoing, was based on the observation that any symmetry of the (classes of logically equivalent) sentences of \(SL_{\vec {R}}\) corresponds to an automorphism of the set of all possible structures as above, along with the set of its definable subsets, in the following sense:

Let \(\mathcal{T}L_{\vec {R}}\) be the set of structures M for \(L_{\vec {R}}\) with universe \(\{a_1,a_2, a_3, \ldots \}\) where each constant symbol \(a_i\) of the language is interpreted in M by the element \(a_i \). Let \(BL_{\vec {R}}\) be the two-sorted structure with universe \(\mathcal{T}L_{\vec {R}}\) together with the sets
$$\begin{aligned} {[}\theta ] = \{ \, M \in \mathcal{T}L_{\vec {R}} \,|\, M \models \theta \,\} ~~ \text{ for } \theta \in SL_{\vec {R}} \end{aligned}$$
and the membership relation between elements of \(\mathcal{T}L_{\vec {R}}\) and these sets.
An automorphism \(\eta \) of \(BL_{\vec {R}}\) is a bijection of \(\mathcal{T}L_{\vec {R}}\) such that for each \(\theta \in SL_{\vec {R}}\) there is some \(\psi \in SL_{\vec {R}}\) such that
$$\begin{aligned} \eta [\theta ] = \{ \, \eta ( M) \,|\,M \in \mathcal{T}L_{\vec {R}}, \, M \models \theta \,\} = [\psi ] \end{aligned}$$
and conversely, for every \(\psi \in SL_{\vec {R}}\) there is a sentence \(\theta \in SL_{\vec {R}}\) satisfying (37). We write \(\eta (\theta )\) for the sentence \(\psi \in SL_{\vec {R}}\) for which \(\eta [\theta ] = [\psi ]\) (up to logical equivalence).

The Invariance Principle, INV

If \(\eta \) is an automorphism of \(BL_{\vec {R}}\) then \(w(\theta )=w(\eta ( \theta ))\) for \(\theta \in SL_{\vec {R}}\).

As discussed in Paris and Vencovská (2015), INV in its full generality may be too strong, possibly denying rationality to almost all probability functions. This has indeed been proved for languages \(L_{\vec {P}}\) with only unary predicates: there is only one (from various points of view not entirely suitable) probability function on \(SL_{\vec {P}}\) satisfying INV, see (Paris and Vencovská 2015, Chapter 23). The situation in the polyadic case remains intriguingly open.

Whilst in the purely unary context INV, after corroborating the intuition for previously known symmetry principles, has been shown to just go too far, in the polyadic context INV has yielded a further interesting symmetry principle, obtained from INV by imposing an additional requirement on the \(\eta \), namely that they map state descriptions to state descriptions. In Paris and Vencovská (2015, Chapter 39) this has been proved to be equivalent to the principle PIP, which we will state precisely after introducing some further definitions from Paris and Vencovská (2015).

We say that a function \(\digamma \) permutes state formulae if for each n and (distinct) variables \(x_{j_1},\ldots , x_{j_n}\), \(\digamma \) permutes the state formulae \(\Phi (x_{j_1},\ldots , x_{j_n})\) in these variables (up to logical equivalence). Properties (A) and (B) are defined as follows:
(A) For each state formula \(\Theta (x_{k_1},\ldots , x_{k_t})\) and surjective mapping \(\tau :\{m_1, \ldots , m_s\} \rightarrow \{k_1, \ldots , k_t\}\),
$$\begin{aligned} (\digamma (\Theta (x_{k_1},\ldots , x_{k_t})))_\tau = \digamma ((\Theta (x_{k_1},\ldots , x_{k_t}))_\tau ). \end{aligned}$$
(B) For each state formula \(\Phi (x_{j_1},\ldots , x_{j_n})\) and (distinct) \(i_1,i_2,\ldots ,i_k \in \{j_1, \ldots , j_n\}\),
$$\begin{aligned} \digamma (\Phi ) [x_{i_1},\ldots ,x_{i_k}]~=~ \digamma (\Phi [x_{i_1},\ldots ,x_{i_k}]). \end{aligned}$$

The Permutation Invariance Principle, PIP

If \(\digamma \) is a permutation of state formulae of \(L_{\vec {R}}\) satisfying (A) and (B) then for a state description \(\Phi (b_1, \ldots , b_n)\),
$$\begin{aligned} w(\Phi (b_1, \ldots , b_n)) = w(\digamma ( \Phi (x_1, \ldots , x_n))(b_1/x_1, \ldots , b_n/x_n)). \end{aligned}$$
The equivalence of PIP and TIP follows by Lemma 4 from Ronel and Vencovská (2014), which shows that a permutation \(\sigma \) of atoms extends to a permutation of state formulae satisfying (A) and (B), by mapping \(\Theta (x_1, \ldots ,x_n)\) to \(\sigma (\Theta (x_1, \ldots ,x_n))\) as in (34) (and analogously for any other n-tuple of distinct variables), just when \(\sigma \) satisfies condition (C). We remark that PIP, and hence TIP, is also equivalent to the principle which states that similar state descriptions get the same probability (Nathanial's Invariance Principle, NIP), cf. (Paris and Vencovská 2015, Chapter 41).

In the purely unary context, PIP is equivalent to Ax. As Ax does in the unary case, in the polyadic case PIP implies Px and SN (the general formulations of these principles are as in the unary case, except that in Px we need to say that we exchange predicate symbols of the same arity). How PIP relates to the Inculcated Px+SN, that is, to the requirement that Px and SN hold not only for w but also for any \(w_\sigma \) where \(\sigma \) is a permutation of \(\Omega _{\vec {R}}\) satisfying (C) and \(w_\sigma \) is defined as in (35), remains an open question. Although much of the reasoning used in the proof of Theorem 3 could be carried out with \(\mathcal {S}_{\vec {P}}\) replaced by the group of all permutations of \(\Omega _{\vec {R}}\) satisfying (C), we lack sufficient insight into the structure of this group to draw interesting conclusions.

5 Conclusion

Inspired by Miller’s Weather Example and its underlying notion of a translation we have considered the extent to which a simple permutation of atoms can be formulated, or explained, in terms of a translation. It turns out that this is always the case for purely unary languages whilst for general polyadic languages it requires the permutation to also satisfy a certain property (C). This also establishes a precise connection between translations and those permutations that can be extended to automorphisms of the overlying structure since (C) is again exactly the additional ingredient needed in that case too.
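The translation underlying the Weather Example can be checked directly. The following sketch (illustrative only; the encoding of guesses as bit-vectors and the helper names `hamming` and `translate` are our own) verifies that the Hamming distances of Jones' and Smith's guesses from the truth are exchanged by passing to the equally expressive vocabulary hot/Minnesotan/Arizonan.

```python
def hamming(u, v):
    """Number of coordinates on which two valuations differ."""
    return sum(a != b for a, b in zip(u, v))

def translate(v):
    """Map (hot, rainy, windy) to (hot, Minnesotan, Arizonan):
    Minnesotan = hot-and-wet or cool-and-dry, i.e. hot == rainy;
    Arizonan   = hot-and-windy or cool-and-still, i.e. hot == windy."""
    hot, rainy, windy = v
    return (hot, int(hot == rainy), int(hot == windy))

jones = (0, 0, 0)   # cool, dry, still
smith = (0, 1, 1)   # cool, rainy, windy
truth = (1, 1, 1)   # hot, rainy, windy

# In the original vocabulary Smith is closer to the truth...
assert hamming(smith, truth) == 1 and hamming(jones, truth) == 3
# ...but after the translation it is Jones who is closer.
assert hamming(translate(jones), translate(truth)) == 1
assert hamming(translate(smith), translate(truth)) == 3
```

The asserts all pass, exhibiting in miniature the failure of translation invariance for the Hamming distance that motivates the paper.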

A salient feature of Miller’s Weather Example is that it reveals an underlying rational commitment to adopting beliefs that are translation-proof. Whilst the most rigid meaning one might give to that expression is that beliefs should be translation invariant, we argue that within the context of Pure Inductive Logic a more catholic interpretation might be that a translation should preserve the rationality of the beliefs rather than the actual quantitative beliefs themselves. Formally this leads to a meta-principle which we have characterised for certain rationality criteria in unary languages, showing it (for languages with at least three predicate symbols) to lie strictly between simple observance of these criteria and full preservation of belief values under translation.


  1.

    Further philosophical discussion of this matter, which is not a concern of this present paper, continued well beyond Miller’s paper (Miller 1974), see for example (Miller 2006, Chapter 11), (Niiniluoto 1998).

  2.

    Miller’s example is set within a propositional language. Working within a predicate language will however allow us to generalise it, in particular to polyadic languages.

  3.

    Note that atoms are not the same thing as what is commonly called atomic formulae of the language. Carnap refers to them as Q-predicates and Hintikka and Niiniluoto as attributive constituents.

  4.

    Up to logical equivalence. Throughout we will usually, for convenience, identify formulae which are logically equivalent rather than actually syntactically identical.

  5.

    Recall our convention of identifying formulae even if they are formally only logically equivalent.

  6.

    We do not introduce a precise notion of distance here, wishing just to convey the intuition. It could be the Hamming distance between atoms, see below, in which case we should also incorporate the prisoners’ opinion about, and the truth of, D(bb), D(ll), B(bb) and B(ll). However since B(xx) and D(xx) coincide, this would make no difference to comparisons of descriptions using HD and HB.

  7.

    Note that this means that the notation, if the need ever arose to write this out, would require us to talk about \(\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle }(i,j_1,\ldots , j_{r_i})\) which are values from \(\{0,1\} \) such that \(\Theta (x_1, \ldots ,x_n) \models R_i(x_{k_{j_1}}, \ldots ,x_{k_{j_{r_i}}})\) just when \(\epsilon _{ \langle \Theta , \langle k_1,\ldots ,k_r\rangle \rangle }(i,j_1,\ldots , j_{r_i})=1\); the \(\langle k_1,\ldots ,k_r\rangle \) are from \( \{1,\ldots ,n\}^r\), i from \(\{1, \ldots , q\}\) and \(\langle j_1,\ldots ,j_{r_i}\rangle \) from \(\{1,\ldots ,r\}^{r_i}\). Note also in particular that tuples \(\langle k_1,\ldots ,k_r\rangle \) and \(\langle j_1,\ldots ,j_{r_i}\rangle \) with repeats are included.

  8.

    By the behaviour of the variables \(x_{m_1}, \ldots , x_{m_s}\) in \(\Theta \) we mean all the information contained in \(\Theta \) that involves just these variables and no others.

  9.

    See for example (Paris and Vencovská 2015, Chapter 1) for further details.

  10.

    For the definition of a probability function in this context and general background see for example (Paris and Vencovská 2015) or [12].

  11.

    Using the condition (E) and (Paris and Vencovská 2015, p 42), or [12], we can see that this definition does yield a probability function.

  12.

    So in particular \(\Omega _{\vec {P}}\) is the set of maps from \(\{1,2, \ldots , q\} \) to \(\{0,1\}\).

  13.

    For more details on Atom Exchangeability see (Paris and Vencovská 2015, Chapter 14).

  14.

    Another such ‘meta-principle’ in this area is (Unary) Language Invariance, see for example (Paris and Vencovská 2015), which again applies not simply to a single probability function w but to a family of related probability functions to which w belongs.

  15.

    See (Paris and Vencovská 2015, Page 51) for more details.

  16.

    The \(w_{\vec {c}}\), and in turn the probability functions v, u to be introduced shortly, immediately satisfy the ubiquitous Constant Exchangeability Principle, Ex, see for example (Paris and Vencovská 2015, Page 33).



  1. Hill, A. J., & Paris, J. B. (2013). An analogy principle in inductive logic. Annals of Pure and Applied Logic, 164, 1293–1321.
  2. Hintikka, J. (1965). Distributive normal forms in first-order logic. Studies in Logic and the Foundations of Mathematics, 40, 48–91.
  3. Miller, D. (1974). Popper’s qualitative theory of verisimilitude. British Journal for the Philosophy of Science, 25(2), 166–177.
  4. Miller, D. (1978). The distance between constituents. Synthese, 38(2), 197–212.
  5. Miller, D. (2006). Out of error. Farnham: Ashgate Publishing Company.
  6. Niiniluoto, I. (1998). Verisimilitude: The third period. British Journal for the Philosophy of Science, 49(1), 1–29.
  7. Paris, J. B., & Vencovská, A. (2015). Pure inductive logic. Association of Symbolic Logic Perspectives in Mathematical Logic Series. Cambridge: Cambridge University Press.
  8. Paris, J. B., & Vencovská, A. (2011). A note on Nathanial’s invariance principle in polyadic inductive logic. In M. Banerjee & A. Seth (Eds.), Logic and its applications, ICLA, Proceedings of the 4th Indian Logic Conference, Delhi, India (pp. 137–146).
  9. Popper, K. R. (1972). Objective knowledge. Oxford: Clarendon Press.
  10. Ronel, T., & Vencovská, A. (2014). Invariance principles in polyadic inductive logic. Logique et Analyse, 228, 541–561.
  11. Tichý, P. (1974). On Popper’s definitions of verisimilitude. British Journal for the Philosophy of Science, 25(2), 155–160.
  12. Vencovská, A. Pure inductive logic—Nesin Maths Village Summer School Course Notes. Available at

Copyright information

© The Author(s) 2019

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. School of Mathematics, The University of Manchester, Manchester, UK
