1 Introduction

Given decision procedures for theories \(\mathcal {T}_1\) and \(\mathcal {T}_2\) with disjoint signatures, is there a decision procedure for \(\mathcal {T}_1 \cup \mathcal {T}_2\)? In general, the answer is “not necessarily”, but a central question in Satisfiability Modulo Theories (SMT) [3] is: what assumptions on \(\mathcal {T}_1\) and \(\mathcal {T}_2\) suffice for theory combination? This line of research began with Nelson and Oppen’s theory combination procedure [15], which applies when \(\mathcal {T}_1\) and \(\mathcal {T}_2\) are stably infinite, roughly meaning that every \(\mathcal {T}_i\)-satisfiable quantifier-free formula is satisfied by an infinite \(\mathcal {T}_i\)-interpretation for \(i \in \{1,2\}\).

The Nelson–Oppen procedure is quite useful, but requires both theories to be stably infinite, which is not always the case (e.g., the theories of bit-vectors and finite datatypes are not stably infinite). Thus, sufficient properties of only one of the theories were identified, such as gentleness [7], shininess [20], and flexibility [9]. The most relevant property for our purposes is strong politeness [4, 8, 18, 19]. It is essential to the functioning of the SMT solver cvc5 [1], which is called billions of times per day in industrial production code. A theory is strongly polite if it is smooth and strongly finitely witnessable, which are model-theoretic properties we will define later. These properties are more involved than stable infiniteness, so proving a theory to be strongly polite is more difficult. But the advantage of strongly polite theories is that they can be combined with any other decidable theory, including theories that are not stably infinite.

Given the abundance of model-theoretic properties relevant to theory combination, some of which interact in subtle ways, it behooves us to understand the logical relations between them. Recent papers [21, 22] have sought to understand the relations between seven model-theoretic properties—including stable infiniteness, smoothness, and strong finite witnessability—by determining which combinations of properties are possible in various signatures. In most cases, a theory with the desired combination of properties was constructed, or it was proved that none exists. The sole exception was theories that are stably infinite and strongly finitely witnessable but not smooth, dubbed unicorn theories and conjectured not to exist. Our main result, Theorem 2, confirms this conjecture.

Besides completing the taxonomy of properties from [21, 22], our result has practical consequences. The nonexistence of unicorns implies that strongly polite theories can be equivalently defined as those that are stably infinite and strongly finitely witnessable. Since it is easier to prove that a theory is stably infinite than to prove that it is smooth, this streamlines the process of proving that a theory is strongly polite. Thus, each time a new theory is introduced, proving that it can be combined with other theories becomes easier.Footnote 1 Similarly, our results give a new characterization of shiny theories, which makes it easier to prove that a theory is amenable to the shiny combination procedure (see Corollary 2).

We also believe that our result is of theoretical interest. Theorem 3, which is the main ingredient in the proof of Theorem 2, can be seen as a variant of the upward Löwenheim–Skolem theorem for many-sorted logic, since proving that a theory is smooth amounts to proving that cardinalities of sorts can be increased arbitrarily, including to uncountable cardinals. This result may be of independent interest to logicians studying the model theory of many-sorted logic, and we hope the proof techniques are useful to them as well.

Speaking of proof techniques, our proof is curious in that it uses Ramsey’s theorem from finite combinatorics. This is not the first time Ramsey’s theorem has been used in logic. Ramsey proved his theorem in the course of solving a special case of the decision problem for first-order logic [17]. Ramsey’s theorem also shows up in the Ehrenfeucht–Mostowski construction in model theory [5]. Our proof actually requires a generalization of Ramsey’s theorem, which we prove using the standard version of Ramsey’s theorem.

A major component of the proof of Theorem 2 amounts to proving a many-sorted version of the Löwenheim–Skolem theorem. In the course of proving this, we realized that a proper understanding of this theorem for many-sorted logic appears to be missing from the literature, despite the fact that the SMT-LIB standard [2] is based on many-sorted logic. To fill this gap, we prove generalizations of the Löwenheim–Skolem theorem for many-sorted logic, and use them to prove a many-sorted Łoś–Vaught test, which is useful for proving that a theory is complete.

The remainder of this paper is structured as follows. Section 2 provides background and definitions on many-sorted logic and SMT. Section 3 proves the main result of this paper, namely the nonexistence of unicorn theories. Section 4 proves new many-sorted variants of the Löwenheim–Skolem theorem. Section 5 concludes and presents directions for future work.Footnote 2

2 Preliminaries

2.1 Many-Sorted First-Order Logic

We work in many-sorted first-order logic [14]. A signature \(\varSigma \) consists of a nonempty set \(\mathcal {S}_{\varSigma }\) of sorts, a set \(\mathcal {F}_{\varSigma }\) of function symbols, and a set \(\mathcal {P}_{\varSigma }\) of predicate symbols containing an equality symbol \(=_{\sigma }\) for every sort \(\sigma \in \mathcal {S}_{\varSigma }\).Footnote 3 Every function symbol has an arity \((\sigma _1, \dots , \sigma _n, \sigma )\) and every predicate symbol an arity \((\sigma _1, \dots , \sigma _n)\), where \(\sigma _1, \ldots , \sigma _n, \sigma \in \mathcal {S}_{\varSigma }\) and \(n\ge 0\). Every equality symbol \(=_{\sigma }\) has arity \((\sigma , \sigma )\). To quantify a variable x of sort \(\sigma \), we write \(\forall \,x : \sigma .\,\) and \(\exists \,x : \sigma .\,\) for the universal and existential quantifiers respectively. Let \(|\varSigma | = |\mathcal {S}_{\varSigma }|+|\mathcal {F}_{\varSigma }|+|\mathcal {P}_{\varSigma }|\). If a signature contains only sorts and equalities, we say it is empty. Two signatures are said to be disjoint if they share at most sorts and equality symbols.

We define \(\varSigma \)-terms and \(\varSigma \)-formulas as usual. The set of free variables of sort \(\sigma \) in \(\varphi \) is denoted \(\textit{vars}_{\sigma }(\varphi )\). For \(S\subseteq \mathcal {S}_{\varSigma }\), let \(\textit{vars}_S(\varphi ) = \bigcup _{\sigma \in S} \textit{vars}_{\sigma }(\varphi )\). We also let \(\textit{vars}(\varphi ) = \textit{vars}_{\mathcal {S}_{\varSigma }}(\varphi )\). A \(\varSigma \)-sentence is a \(\varSigma \)-formula with no free variables.

A \(\varSigma \)-structure \(\mathbb {A}\) interprets each sort \(\sigma \in \mathcal {S}_{\varSigma }\) as a nonempty set \(\sigma ^\mathbb {A}\), each function symbol \(f \in \mathcal {F}_{\varSigma }\) as a function \(f^\mathbb {A}\) with the appropriate domain and codomain, and each predicate symbol \(P \in \mathcal {P}_{\varSigma }\) as a relation \(P^\mathbb {A}\) over the appropriate set, such that \(=_{\sigma }^{\mathbb {A}}\) is the identity on \(\sigma ^\mathbb {A}\). A \(\varSigma \)-interpretation \(\mathcal {A}\) is a pair \((\mathbb {A}, \nu )\), where \(\mathbb {A}\) is a \(\varSigma \)-structure and \(\nu \) is a function, called an assignment, mapping each variable x of sort \(\sigma \) to an element \(\nu (x) \in \sigma ^\mathbb {A}\), denoted \(x^\mathcal {A}\). We write \(t^{\mathcal {A}}\) for the interpretation of the \(\varSigma \)-term t under \(\mathcal {A}\), which is defined in the usual way. The entailment relation, denoted \(\vDash \), is defined as usual.

Two structures are elementarily equivalent if they satisfy the same sentences. We say that \(\mathbb {A}\) is an elementary substructure of \(\mathbb {B}\) if \(\mathbb {A}\) is a substructure of \(\mathbb {B}\) and, for all formulas \(\varphi \) and all assignments \(\nu \) on \(\mathbb {A}\), we have \((\mathbb {A},\nu ) \vDash \varphi \) if and only if \((\mathbb {B}, \nu ) \vDash \varphi \). Note that if \(\mathbb {A}\) is an elementary substructure of \(\mathbb {B}\), then they are elementarily equivalent. \(\mathcal {A}\) is an elementary subinterpretation of \(\mathcal {B}\) if \(\mathbb {A}\) is an elementary substructure of \(\mathbb {B}\) and \(\mathcal {A}\)’s assignment is the same as \(\mathcal {B}\)’s assignment.

Given a \(\varSigma \)-structure \(\mathbb {A}\), let \(\mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}=\{\sigma \in \mathcal {S}_{\varSigma }: |\sigma ^{\mathbb {A}}|\ge \aleph _{0}\}\) and \(\mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}=\mathcal {S}_{\varSigma }\setminus \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\). We similarly define \(\mathcal {S}^{\mathcal {A}}_{\ge \aleph _{0}}\) and \(\mathcal {S}^{\mathcal {A}}_{<\aleph _{0}}\) for a \(\varSigma \)-interpretation \(\mathcal {A}\).

A \(\varSigma \)-theory \(\mathcal {T}\) is a set of \(\varSigma \)-sentences, called the axioms of \(\mathcal {T}\). We write \(\vdash _\mathcal {T}\varphi \) instead of \(\mathcal {T}\vDash \varphi \). Structures satisfying \(\mathcal {T}\) are called \(\mathcal {T}\)-models, and interpretations satisfying \(\mathcal {T}\) are called \(\mathcal {T}\)-interpretations. We say a \(\varSigma \)-formula is \(\mathcal {T}\)-satisfiable if it is satisfied by some \(\mathcal {T}\)-interpretation, and we say two \(\varSigma \)-formulas are \(\mathcal {T}\)-equivalent if every \(\mathcal {T}\)-interpretation satisfies one if and only if it satisfies the other. \(\mathcal {T}\) is complete if for every sentence \(\varphi \), we have \(\vdash _{\mathcal {T}} \varphi \) or \(\vdash _{\mathcal {T}} \lnot \varphi \). \(\mathcal {T}\) is consistent if there is no formula \(\varphi \) such that \(\vdash _{\mathcal {T}} \varphi \) and \(\vdash _{\mathcal {T}}\lnot \varphi \). If \(\varSigma _1\) and \(\varSigma _2\) are disjoint, let \(\varSigma _1 \cup \varSigma _2\) be the signature with the union of their sorts, function symbols, and predicate symbols. Given a \(\varSigma _1\)-theory \(\mathcal {T}_1\) and a \(\varSigma _2\)-theory \(\mathcal {T}_2\), the \((\varSigma _1 \cup \varSigma _2)\)-theory \(\mathcal {T}_1 \cup \mathcal {T}_2\) is the theory whose axioms are the union of the axioms of \(\mathcal {T}_1\) and \(\mathcal {T}_2\).

The following theorem, proved in [14], is a many-sorted variant of the first-order compactness theorem.

Theorem 1

(Compactness Theorem [14]). A set of \(\varSigma \)-formulas \(\varGamma \) is satisfiable if and only if every finite subset of \(\varGamma \) is satisfiable.

We say that a \(\varSigma \)-theory \(\mathcal {T}\) has built-in Skolem functions if for all formulas \(\psi (\overrightarrow{x}, y)\), there is \(f \in \mathcal {F}_\varSigma \) such that \(\vdash _{\mathcal {T}} \forall \,\overrightarrow{x}.\, (\exists \,y.\, (\psi (\overrightarrow{x}, y)) \rightarrow \psi (\overrightarrow{x}, f(\overrightarrow{x})))\).Footnote 4 The following is a many-sorted variant of Lemma 2.3.6 of [12]. The proof is almost identical to that of the single-sorted case from [12].

Lemma 1

If \(\mathcal {T}\) is a \(\varSigma \)-theory for a countable \(\varSigma \), then there is a countable signature \(\varSigma ^* \supseteq \varSigma \) and \(\varSigma ^*\)-theory \(\mathcal {T}^* \supseteq \mathcal {T}\) with built-in Skolem functions.

We state a many-sorted generalization of the Tarski–Vaught test, whose proof is also similar to the single-sorted case [12, Proposition 2.3.5].

Lemma 2

(The Tarski–Vaught Test). Suppose \(\mathbb {A}\) is a substructure of \(\mathbb {B}\). Then, \(\mathbb {A}\) is an elementary substructure of \(\mathbb {B}\) if and only if \((\mathbb {B}, \nu ) \vDash \exists \,v.\, \varphi (\overrightarrow{x}, v)\) implies \((\mathbb {A}, \nu ) \vDash \exists \,v.\, \varphi (\overrightarrow{x}, v)\) for every formula \(\varphi (\overrightarrow{x}, v)\) and assignment \(\nu \) over \(\mathbb {A}\).

2.2 Model-Theoretic Properties

Definition 1

Let \(\varSigma \) be a many-sorted signature, \(S \subseteq \mathcal {S}_{\varSigma }\), and \(\mathcal {T}\) a \(\varSigma \)-theory.

  • \(\mathcal {T}\) is stably infinite with respect to S if for every \(\mathcal {T}\)-satisfiable quantifier-free formula \(\varphi \), there is a \(\mathcal {T}\)-interpretation \(\mathcal {A}\) satisfying \(\varphi \) with \(|\sigma ^{\mathcal {A}}| \ge \aleph _0\) for every \(\sigma \in S\).

  • \(\mathcal {T}\) is stably finite with respect to S if for every quantifier-free \(\varSigma \)-formula \(\varphi \) and \(\mathcal {T}\)-interpretation \(\mathcal {A}\) satisfying \(\varphi \), there is a \(\mathcal {T}\)-interpretation \(\mathcal {B}\) satisfying \(\varphi \) such that \(|\sigma ^\mathcal {B}| \le |\sigma ^\mathcal {A}|\) and \(|\sigma ^\mathcal {B}| < \aleph _0\) for every \(\sigma \in S\).

  • \(\mathcal {T}\) is smooth with respect to S if for every quantifier-free formula \(\varphi \), \(\mathcal {T}\)-interpretation \(\mathcal {A}\) satisfying \(\varphi \), and function \(\kappa \) from S to the class of cardinals such that \(\kappa (\sigma ) \ge |\sigma ^{\mathcal {A}}|\) for every \(\sigma \in S\), there is a \(\mathcal {T}\)-interpretation \(\mathcal {B}\) satisfying \(\varphi \) with \(|\sigma ^{\mathcal {B}}|=\kappa (\sigma )\) for every \(\sigma \in S\).

Next, we define arrangements. Given a set of sorts \(S \subseteq \mathcal {S}_{\varSigma }\), finite sets of variables \(V_{\sigma }\) of sort \(\sigma \) for each \(\sigma \in S\), and equivalence relations \(E_{\sigma }\) on \(V_{\sigma }\), the arrangement \(\delta _V\) on \(V=\bigcup _{\sigma \in S}V_{\sigma }\) induced by \(E=\bigcup _{\sigma \in S}E_{\sigma }\) is

$$ \bigwedge _{\sigma \in S}\left[ \bigwedge _{xE_{\sigma }y}(x=y) \wedge \bigwedge _{x\overline{E_{\sigma }}y}\lnot (x=y)\right] , $$

where \(\overline{E_{\sigma }}\) is the complement of \(E_{\sigma }\).
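As a small illustration of the definition above, the following Python sketch lists the literals of the arrangement induced by a partition of variables of a single sort. The function name `arrangement` and the string rendering of literals are assumptions made for this example only; for brevity the sketch emits one literal per unordered pair of distinct variables rather than every pair in \(E_{\sigma }\) and its complement.

```python
from itertools import combinations


def arrangement(classes):
    """Return the literals of the arrangement induced by a partition.

    `classes` is a list of lists of variable names, all of one sort.
    Two variables are related by E exactly when they share a list.
    """
    block_of = {v: i for i, block in enumerate(classes) for v in block}
    literals = []
    for x, y in combinations(sorted(block_of), 2):
        if block_of[x] == block_of[y]:
            literals.append(f"{x} = {y}")        # x E y
        else:
            literals.append(f"not ({x} = {y})")  # x and y are not E-related
    return literals


# Partition {x, y} | {z}: x and y are identified, z is kept apart.
example = arrangement([["x", "y"], ["z"]])
```

The conjunction of the returned literals plays the role of \(\delta _V\) for \(V = \{x, y, z\}\).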

Definition 2

Let \(\varSigma \) be a many-sorted signature, \(S \subseteq \mathcal {S}_{\varSigma }\) a finite set, and \(\mathcal {T}\) a \(\varSigma \)-theory. Then \(\mathcal {T}\) is strongly finitely witnessable with respect to S if there is a computable function \(\textit{wit}\) from the quantifier-free formulas into themselves such that for every quantifier-free formula \(\varphi \):

  (i) \(\varphi \) and \(\exists \,\overrightarrow{w}.\, \textit{wit}(\varphi )\) are \(\mathcal {T}\)-equivalent, where \(\overrightarrow{w}=\textit{vars}(\textit{wit}(\varphi ))\setminus \textit{vars}(\varphi )\); and

  (ii) given a finite set of variables V and an arrangement \(\delta _{V}\) on V, if \(\textit{wit}(\varphi ) \wedge \delta _{V}\) is \(\mathcal {T}\)-satisfiable, then there is a \(\mathcal {T}\)-interpretation \(\mathcal {A}\) satisfying \(\textit{wit}(\varphi ) \wedge \delta _{V}\) such that \(\sigma ^{\mathcal {A}}=\textit{vars}_{\sigma }(\textit{wit}(\varphi ) \wedge \delta _{V})^{\mathcal {A}}\) for every \(\sigma \in S\).

2.3 Notation

\(\mathbb {N}\) denotes the set of non-negative integers. Given \(m,n \in \mathbb {N}\), let \( [m,n] := \{\ell \in \mathbb {N} : m \le \ell \le n\} \)  and \( [n] := [1,n]. \) Given a set X, let \( P_{n}(X) := \{Y \subseteq X : |Y| = n\}\), \(X^n := \{(x_1, \dots , x_n) : x_i \in X \ \text {for all} \ i \in [n]\}\), and \(X^* := \bigcup _{n \in \mathbb {N}} X^n\). For any x, we denote \((x,\ldots ,x)\) by \((x)^{\oplus n}\). Given a tuple of tuples \((\overrightarrow{x_1}, \dots , \overrightarrow{x_n})\), where \(\overrightarrow{x_i} \in X^*\) for all i, we will often treat it as an element of \(X^*\) by flattening the tuple.

3 The Nonexistence of Unicorns

We now state our main theorem, which implies that unicorn theories do not exist. Note that since we are motivated by applications to SMT, we hereafter assume all signatures are countable.Footnote 5

Theorem 2

Assume that \(\mathcal {T}\) is a \(\varSigma \)-theory, where \(\varSigma \) is countable. If \(\mathcal {T}\) is stably infinite and strongly finitely witnessable, both with respect to \(S \subseteq \mathcal {S}_\varSigma \), then \(\mathcal {T}\) is smooth with respect to S.

For our proof, we define a weaker variant of smoothness that imposes the requirement only for finite cardinals.

Definition 3

A \(\varSigma \)-theory \(\mathcal {T}\) is finitely smooth with respect to \(S \subseteq \mathcal {S}_{\varSigma }\) if for every quantifier-free formula \(\varphi \), \(\mathcal {T}\)-interpretation \(\mathcal {A}\) with \(\mathcal {A}\vDash \varphi \), and function \(\kappa \) from \(\mathcal {S}^{\mathcal {A}}_{<\aleph _{0}}\cap S\) to the class of cardinals with \(|\sigma ^{\mathcal {A}}| \le \kappa (\sigma ) < \aleph _0\) for every \(\sigma \in \mathcal {S}^{\mathcal {A}}_{<\aleph _{0}}\cap S\), there is a \(\mathcal {T}\)-interpretation \(\mathcal {B}\) with \(\mathcal {B}\vDash \varphi \) and \(|\sigma ^{\mathcal {B}}|=\kappa (\sigma )\) for every \(\sigma \in \mathcal {S}^{\mathcal {A}}_{<\aleph _{0}}\cap S\).

We make use of the following two lemmas.

Lemma 3

If \(\mathcal {T}\) is stably infinite and strongly finitely witnessable, both with respect to some set of sorts \(S \subseteq \mathcal {S}_\varSigma \), then \(\mathcal {T}\) is finitely smooth with respect to S.

Lemma 4

([22, Theorem 3]). If \(\mathcal {T}\) is strongly finitely witnessable with respect to some set of sorts \(S \subseteq \mathcal {S}_\varSigma \), then \(\mathcal {T}\) is stably finite with respect to S.

In light of the above two lemmas, the following theorem implies Theorem 2.

Theorem 3

Assume that \(\mathcal {T}\) is a \(\varSigma \)-theory, where \(\varSigma \) is countable. If \(\mathcal {T}\) is stably finite and finitely smooth, both with respect to some set of sorts \(S \subseteq \mathcal {S}_\varSigma \), then \(\mathcal {T}\) is smooth with respect to S.

The remainder of this section is thus dedicated to the proof of Theorem 3.

3.1 Motivating the Proof

In this section, we illustrate the proof technique with a simple example. The goal is to motivate the proof of Theorem 3 before delving into the details.

Suppose \(\mathcal {T}\) is a \(\varSigma \)-theory, where \(\mathcal {S}_{\varSigma } = \{\sigma _1, \sigma _2\}\), \(\mathcal {F}_{\varSigma } = \{f\}\), f has arity \((\sigma _2, \sigma _1)\), and the only predicate symbols are equalities. Suppose that \(\mathcal {T}\) is also stably finite and finitely smooth, both with respect to \(S = \mathcal {S}_{\varSigma }\). Let \(\varphi \) be a \(\mathcal {T}\)-satisfiable quantifier-free formula and \(\mathcal {A}\) a \(\mathcal {T}\)-interpretation satisfying \(\varphi \). Let \(\kappa \) be a function from S to the class of cardinals such that \(\kappa (\sigma ) \ge |\sigma ^\mathcal {A}|\) for both \(\sigma \in S\). For concreteness, suppose \(|\sigma _1^\mathcal {A}| = |\sigma _2^\mathcal {A}| = 10\), \(\kappa (\sigma _1) = \aleph _0\), and \(\kappa (\sigma _2) = \aleph _1\). Our goal is to show that there is a \(\mathcal {T}\)-interpretation \(\mathcal {B}^-\) satisfying \(\varphi \) with \(|\sigma _1^{\mathcal {B}^-}| = \aleph _0\) and \(|\sigma _2^{\mathcal {B}^-}| = \aleph _1\).Footnote 6

A natural thought is to apply some variant of the upward Löwenheim–Skolem theorem, but this doesn’t quite work. As will be seen in Sect. 4, generalizations of the Löwenheim–Skolem theorem to many-sorted logic do not let us control the cardinalities of \(\sigma _1\) and \(\sigma _2\) independently. Nevertheless, let us emulate the standard proof technique for the upward Löwenheim–Skolem theorem.

Here is the most natural way of generalizing the proof of the upward Löwenheim–Skolem theorem to our setting. For simplicity, assume that \(\mathcal {T}\) already has built-in Skolem functions. We introduce \(\aleph _0\) new constants \(\{c_{1,\alpha }\}_{\alpha < \omega }\) and \(\aleph _1\) new constants \(\{c_{2,\alpha }\}_{\alpha < \omega _1}\). We define a set of formulas \(\varGamma = \{\varphi \} \cup \varGamma _1\), where

$$ \varGamma _1 = \{\lnot (c_{i,\alpha } = c_{i,\beta }) : i \in \{1,2\};\, \alpha , \beta < \kappa (\sigma _i);\, \alpha \ne \beta \}. $$

By Theorem 1 and finite smoothness, there is a \(\mathcal {T}\)-interpretation \(\mathcal {B}\) satisfying \(\varGamma \): indeed, were that not true, Theorem 1 would guarantee that some finite subset of \(\varGamma \) is unsatisfiable; yet such a set would only demand the existence of finitely many new elements, which can be achieved by making use of finite smoothness. Since \(\mathcal {B}\vDash \varGamma _1\), we have \(|\sigma _1^\mathcal {B}| \ge \aleph _0\) and \(|\sigma _2^\mathcal {B}| \ge \aleph _1\).

Since \(\mathcal {B}\) may be too large, we construct a subinterpretation \(\mathcal {B}^-\) with

$$\begin{aligned} \sigma _1^{\mathcal {B}^-} &= \{c_{1,\alpha }^\mathcal {B}\}_{\alpha < \omega } \cup \{f^\mathcal {B}(c_{2,\alpha }^\mathcal {B})\}_{\alpha < \omega _1} \\ \sigma _2^{\mathcal {B}^-} &= \{c_{2,\alpha }^\mathcal {B}\}_{\alpha < \omega _1}. \end{aligned}$$

And using the assumption that \(\mathcal {T}\) has built-in Skolem functions, we can prove that \(\mathcal {B}^-\) is an elementary subinterpretation of \(\mathcal {B}\), so \(\mathcal {B}^- \vDash \varGamma \); we can then prove that \(|\sigma _2^{\mathcal {B}^-}| = \aleph _1\), but we unfortunately cannot guarantee that \(|\sigma _1^{\mathcal {B}^-}| = \aleph _0\). This is because \(\mathcal {B}^-\) has not only the \(\aleph _1\) elements \(\{c_{2,\alpha }^\mathcal {B}\}_{\alpha < \omega _1}\) of sort \(\sigma _2\), but also the elements \(\{f^\mathcal {B}(c_{2,\alpha }^\mathcal {B})\}_{\alpha < \omega _1}\) of sort \(\sigma _1\). The function symbol f has created a “spillover” of elements from \(\sigma _2\) to \(\sigma _1\).

To fix this, we need to ensure that \(|\{f^\mathcal {B}(c_{2,\alpha }^\mathcal {B})\}_{\alpha < \omega _1}| \le \aleph _0\). To that end, define \(\varGamma \) to instead be \(\{\varphi \} \cup \varGamma _1 \cup \varGamma _2\), where

$$ \varGamma _2 = \{f(b) = f(d) : b,d \in \{c_{2,\alpha }\}_{\alpha < \omega _1}\}. $$

Then, if there is a model \(\mathcal {B}\) satisfying \(\varGamma \), we have \(|\{f^\mathcal {B}(c_{2,\alpha }^\mathcal {B})\}_{\alpha < \omega _1}| = 1 \le \aleph _0\). To show \(\varGamma \) is \(\mathcal {T}\)-satisfiable, it suffices by the compactness theorem to show that \(\mathcal {T}\cup \varGamma '\) is satisfiable for every finite subset \(\varGamma ' \subseteq \varGamma \). So let \(\varGamma '_1 \subseteq \varGamma _1\) and \(\varGamma '_2 \subseteq \varGamma _2\) be finite subsets. We will construct a \(\mathcal {T}\)-interpretation \(\mathcal {B}'\) such that \(\mathcal {B}' \vDash \{\varphi \} \cup \varGamma '_1 \cup \varGamma '_2\). For concreteness, suppose that \(\{c_{1,0}, c_{1,1}, \dots , c_{1,99}\}\) and \(\{c_{2,0}, c_{2,1}, \dots , c_{2,9}\}\) are the new constants that appear in \(\varGamma '_1 \cup \varGamma '_2\). By finite smoothness, there is a \(\mathcal {T}\)-interpretation \(\mathcal {B}'\) satisfying \(\varphi \) such that \(|\sigma _1^{\mathcal {B}'}| = 100\) and \(|\sigma _2^{\mathcal {B}'}| = 901\). By the pigeonhole principle, there is a subset \(Y \subseteq \sigma _2^{\mathcal {B}'}\) with \(|Y| \ge 10\) such that \(f^{\mathcal {B}'}\) is constant on Y; if 901 pigeons are put in 100 holes, then some hole has at least 10 pigeons (although this is not true for 900 pigeons). Then, \(\mathcal {B}'\) can interpret the constants \(\{c_{1,0}, c_{1,1}, \dots , c_{1,99}\}\) as distinct elements of \(\sigma _1^{\mathcal {B}'}\) and the constants \(\{c_{2,0}, c_{2,1}, \dots , c_{2,9}\}\) as distinct elements of Y. This proves that \(\varGamma \) is \(\mathcal {T}\)-satisfiable.
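The pigeonhole step can be checked on a toy instance. The Python sketch below is only an illustration: the helper `constant_subset` and the particular map `f` into 100 values are assumptions chosen for the example, standing in for \(f^{\mathcal {B}'}\) with \(|\sigma _1^{\mathcal {B}'}| = 100\) and \(|\sigma _2^{\mathcal {B}'}| = 901\).

```python
from collections import defaultdict


def constant_subset(f, domain, size):
    """Return `size` elements of `domain` sharing one f-value, or None."""
    fibers = defaultdict(list)
    for y in domain:
        value = f(y)
        fibers[value].append(y)
        if len(fibers[value]) == size:
            return fibers[value]
    return None


def f(y):
    # Hypothetical stand-in for the interpretation of f: 100 possible values.
    return (7 * y + 3) % 100


# 901 elements of sort sigma_2 mapped into 100 elements of sort sigma_1.
Y = constant_subset(f, range(901), 10)

# With only 900 elements this particular f spreads them evenly, 9 per value.
too_small = constant_subset(f, range(900), 10)
```

Here `Y` has ten elements on which `f` is constant, while `too_small` is `None`, matching the remark that 900 pigeons do not suffice.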

Fig. 1. How we move from interpretation to interpretation

We illustrate the top-level structure of the proof idea in Fig. 1, applied to the working example. The x-axis represents cardinalities of interpretations of \(\sigma _1\), and the y-axis does the same for \(\sigma _2\). Starting from the interpretation \(\mathcal {A}\) with \(|\sigma _{1}^{\mathcal {A}}|=|\sigma _{2}^{\mathcal {A}}|=10\), we construct some interpretation \(\mathcal {B}\) with \(|\sigma _{1}^{\mathcal {B}}|\ge \aleph _{0}\) and \(|\sigma _{2}^{\mathcal {B}}|\ge \aleph _{1}\); it is represented by the array of red dots because the precise cardinalities of its domains are not known. From \(\mathcal {B}\) we hope to construct \(\mathcal {B}^{-}\), which has \(|\sigma _{1}^{\mathcal {B}^{-}}|=\aleph _{0}\) and \(|\sigma _{2}^{\mathcal {B}^{-}}|=\aleph _{1}\): the latter can be achieved using techniques similar to the many-sorted Löwenheim–Skolem theorems (see Sect. 4 below), while the former requires the aforementioned pigeonhole principle arguments.

The above proof sketch illustrates the main ideas behind the proof of Theorem 3. The generalization to more sorts and function symbols requires some extra bookkeeping. More interestingly, the generalization to functions of arity greater than one requires a version of Ramsey’s theorem, which is a generalization of the pigeonhole principle.

3.2 Ramsey’s Theorem and Generalizations

In this section, we state Ramsey’s theorem and a generalization of it.

Ramsey’s theorem is sometimes stated in terms of coloring the edges of hypergraphs, but for our purposes it is more convenient to state it as follows. In the following lemma, the notations \(P_{n}(X)\) and [k] are defined as in Sect. 2.3.

Lemma 5

(Ramsey’s theorem [17, Theorem B]). For any \(k,n,m \in \mathbb {N}\), there is an \(R(k,n,m) \in \mathbb {N}\) such that for any set X with \(|X| \ge R(k,n,m)\) and function \(f : P_{n}(X) \rightarrow [k]\), there is a subset \(Y \subseteq X\) with \(|Y| \ge m\) such that f is constant on \(P_{n}(Y)\).

Note that in Ramsey’s theorem, the set [k] can be replaced by any set of cardinality k.
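The smallest nontrivial instance of Lemma 5 can be verified by brute force. The following Python sketch is a sanity check under stated assumptions, not part of the proof: the function names are invented for the example, and it treats only \(k = 2\), \(n = 2\), \(m = 3\), where it confirms the classical fact that 6 points suffice and 5 do not.

```python
from itertools import combinations, product


def has_monochromatic_subset(coloring, points, m):
    """True if some m-subset of `points` has all of its pairs in one color."""
    return any(
        len({coloring[pair] for pair in combinations(Y, 2)}) == 1
        for Y in combinations(points, m)
    )


def every_coloring_works(size, m=3, k=2):
    """Check every k-coloring of the 2-subsets of a `size`-element set."""
    points = range(size)
    pairs = list(combinations(points, 2))
    return all(
        has_monochromatic_subset(dict(zip(pairs, colors)), points, m)
        for colors in product(range(k), repeat=len(pairs))
    )


six_points_suffice = every_coloring_works(6)   # 2**15 colorings are checked
five_points_suffice = every_coloring_works(5)  # fails on a pentagon-style coloring
```

In the notation of Lemma 5, this shows that \(R(2,2,3)\) can be taken to be 6 and cannot be taken to be 5.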

We want to generalize Ramsey’s theorem to functions \(f : X^n \rightarrow [k]\). The most natural generalization would state that there is a large subset \(Y \subseteq X\) such that f is constant on \(Y^n\). But this generalization is false, as the following example shows.

Example 1

Let \(X = \mathbb {Z}\), and let \(f : X^2 \rightarrow [2]\) be given by

$$ f(m,n) = {\left\{ \begin{array}{ll} 1 & \quad \text {if} \ m < n \\ 2 & \quad \text {otherwise}. \end{array}\right. } $$

Then, \(f(m,n) \ne f(n,m)\) for all \(m,n \in X\) with \(m \ne n\). Thus, there is no subset \(Y \subseteq X\) with \(|Y| \ge 2\) such that f is constant on \(Y^2\).
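The claim of Example 1 can be confirmed on a finite window of \(\mathbb {Z}\). The Python sketch below restricts X to the integers from -5 to 5, which is an assumption made purely so that the check terminates; the variable names are likewise illustrative.

```python
from itertools import combinations, product


def f(m, n):
    """The function from Example 1."""
    return 1 if m < n else 2


X = range(-5, 6)  # finite window of the integers, chosen for the example

# f always distinguishes (m, n) from (n, m) when m != n.
asymmetric = all(f(m, n) != f(n, m) for m, n in product(X, X) if m != n)

# Collect every 2-element subset Y on which f is constant on all of Y^2.
constant_pairs = [
    Y for Y in combinations(X, 2)
    if len({f(a, b) for a, b in product(Y, repeat=2)}) == 1
]
```

The list `constant_pairs` comes out empty, so within this window no set Y with two or more elements makes f constant on \(Y^2\).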

To avoid counterexamples like this, our generalization needs to consider the order of the arguments of f. This motivates the following definition.

Definition 4

Let \((X, <)\) be a totally ordered set, and let \(\overrightarrow{x} = (x_1, \dots , x_n)\) and \(\overrightarrow{y} = (y_1, \dots , y_n)\) be elements of \(X^n\). We write \(\overrightarrow{x} \sim \overrightarrow{y}\) if for every \(1 \le i < j \le n\) we have

$$\begin{aligned} x_i < x_j &\Longleftrightarrow y_i < y_j \quad \text {and} \\ x_i = x_j &\Longleftrightarrow y_i = y_j. \end{aligned}$$

Observe that \(\sim \) is an equivalence relation on \(X^n\) with finitely many equivalence classes.Footnote 7
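To make the relation \(\sim \) concrete, the following Python sketch computes a canonical representative of each equivalence class by replacing every entry of a tuple with its rank among the distinct entries. The names `order_type`, `equivalent`, and `count_classes` are assumptions introduced for this illustration.

```python
from itertools import product


def order_type(xs):
    """Canonical representative of the ~-class of the tuple xs."""
    ranks = {v: r for r, v in enumerate(sorted(set(xs)))}
    return tuple(ranks[x] for x in xs)


def equivalent(xs, ys):
    """Decide xs ~ ys for tuples of the same length."""
    return len(xs) == len(ys) and order_type(xs) == order_type(ys)


def count_classes(n):
    """Count ~-classes of X^n; an n-element X already realizes all of them."""
    return len({order_type(t) for t in product(range(n), repeat=n)})


same = equivalent((3, 1, 3), (10, 2, 10))  # both have pattern (1, 0, 1)
different = equivalent((1, 2), (2, 1))     # increasing vs. decreasing
classes_for_pairs = count_classes(2)       # x1 < x2, x1 = x2, x1 > x2
```

Two tuples have the same pairwise comparisons exactly when their rank patterns agree, which is why comparing `order_type` values decides \(\sim \). For pairs there are three classes, in line with the observation that the number of classes is finite.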

Now we can state our first generalization of Ramsey’s theorem.

Lemma 6

For any \(k,n,m \in \mathbb {N}\), there is an \(R^*(k,n,m) \in \mathbb {N}\) such that for any totally ordered set \((X, <)\) with \(|X| \ge R^*(k,n,m)\) and function \(f : X^n \rightarrow [k]\), there is a subset \(Y \subseteq X\) with \(|Y| \ge m\) such that f is constant on each \(\sim \)-equivalence class of \(Y^n\).

Next, we further generalize Ramsey’s theorem to multiple functions \(f_1, \dots , f_r\).

Lemma 7

For any \(k,m \in \mathbb {N}\) and \(\overrightarrow{n} = (n_1, \dots , n_r) \in \mathbb {N}^r\), there is a number \(R^{**}(k,\overrightarrow{n},m) \in \mathbb {N}\), such that for any totally ordered set \((X, <)\) with \(|X| \ge R^{**}(k,\overrightarrow{n},m)\) and functions \(f_i : X^{n_i} \rightarrow [k]\) for \(i \in [r]\), there is a subset \(Y \subseteq X\) with \(|Y| \ge m\), such that \(f_i\) is constant on each \(\sim \)-equivalence class of \(Y^{n_i}\) for all \(i \in [r]\).
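The conclusion of Lemma 7 can be explored on small inputs with an exhaustive search. The Python sketch below is a toy illustration only: the helper names and the two sample functions `parity` and `mod3` are assumptions, and the search says nothing about the actual size of \(R^{**}(k,\overrightarrow{n},m)\).

```python
from itertools import combinations, product


def order_type(xs):
    """Canonical representative of the ~-class of the tuple xs."""
    ranks = {v: r for r, v in enumerate(sorted(set(xs)))}
    return tuple(ranks[x] for x in xs)


def constant_on_classes(f, n, Y):
    """True if f takes a single value on each ~-class of Y^n."""
    seen = {}
    for t in product(Y, repeat=n):
        value = f(*t)
        if seen.setdefault(order_type(t), value) != value:
            return False
    return True


def find_homogeneous(functions, X, m):
    """Return the first m-subset of X that works for every (f, arity) pair."""
    for Y in combinations(X, m):
        if all(constant_on_classes(f, n, Y) for f, n in functions):
            return Y
    return None


def parity(a, b):
    return (a + b) % 2


def mod3(a):
    return a % 3


functions = [(parity, 2), (mod3, 1)]
Y = find_homogeneous(functions, range(20), 3)
none_found = find_homogeneous(functions, range(12), 3)
```

For these two functions a suitable Y must consist of numbers that agree modulo 6, so the search succeeds in a 20-element set but fails in a 12-element one, illustrating why X must be large relative to m.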

3.3 The Proof of Theorem 3

Fix a \(\varSigma \)-theory \(\mathcal {T}\) and a set of sorts \(S\subseteq \mathcal {S}_{\varSigma }\). Assume that \(\varSigma \) is countable. Suppose that \(\mathcal {T}\) is stably finite and finitely smooth, both with respect to S. Let \(\varphi \) be a \(\mathcal {T}\)-satisfiable quantifier-free formula and \(\mathcal {A}\) a \(\mathcal {T}\)-interpretation satisfying \(\varphi \). Let \(\kappa \) be a function from S to the class of cardinals such that \(\kappa (\sigma ) \ge |\sigma ^\mathcal {A}|\) for every \(\sigma \in S\).

Write \(S = \{\sigma _1, \sigma _2, \dots \}\) and, without loss of generality, assume \(\kappa (\sigma _1) \le \kappa (\sigma _2) \le \cdots \). For notational convenience, we write all \(\varSigma \)-terms in the form \(t(\overrightarrow{x_1}, \overrightarrow{x_2}, \dots )\),Footnote 8 where \(\overrightarrow{x_i}\) is a tuple of variables of sort \(\sigma _i\). If \(\kappa (\sigma _i) < \aleph _0\) for all i, then we are done because \(\mathcal {T}\) is finitely smooth. Otherwise, let \(\ell \) be the largest natural number such that \(\kappa (\sigma _\ell ) < \aleph _0\) if there is such a number, and let \(\ell = 0\) otherwise.

The proof of Theorem 3 proceeds in two steps. First, we construct a set of formulas \(\varGamma \) such that \(\varphi \in \varGamma \) and prove that there is a \(\mathcal {T}\)-interpretation \(\mathcal {B}\) satisfying \(\varGamma \). Second, we prove that \(\mathcal {B}\) has an elementary subinterpretation \(\mathcal {B}^-\) such that \(|\sigma _i^{\mathcal {B}^-}| = \kappa (\sigma _i)\) for all i. Since \(\varphi \in \varGamma \), it will follow that \(\mathcal {T}\) is smooth.

The assumption that \(\mathcal {T}\) is stably finite and finitely smooth is used to construct \(\mathcal {T}\)-interpretations of the following form, which will be useful for a compactness argument.

Lemma 8

There is a \(\mathcal {T}\)-interpretation \(\mathcal {B}\) satisfying \(\varphi \) such that \(|\sigma _i^\mathcal {B}| = \kappa (\sigma _i)\) for all \(i \le \ell \), and \(|\sigma _i^\mathcal {B}|\) is arbitrarily large but finite for all \(i > \ell \).

Proof

First, apply stable finiteness to get a \(\mathcal {T}\)-interpretation \(\mathcal {A}'\) satisfying \(\varphi \) such that \(|\sigma _i^{\mathcal {A}'}| \le |\sigma _i^\mathcal {A}|\) and \(|\sigma _i^{\mathcal {A}'}| < \aleph _0\) for all i. Then, apply finite smoothness to \(\mathcal {A}'\) with \(\kappa '\) given by \(\kappa '(\sigma _i) = \kappa (\sigma _i)\) for all \(i \le \ell \) and \(\kappa '(\sigma _i)\) arbitrarily large but finite for all \(i > \ell \).   \(\square \)

It will be convenient to work with a theory with built-in Skolem functions, so we use Lemma 1 to get a \(\varSigma ^*\)-theory \(\mathcal {T}^* \supseteq \mathcal {T}\), where \(\varSigma ^* \supseteq \varSigma \) and \(\varSigma ^*\) is countable. To construct our set of formulas \(\varGamma \), we introduce \(\kappa (\sigma _i)\) new constants \(\{c_{i,\alpha }\}_{\alpha < \kappa (\sigma _i)}\) of sort \(\sigma _i\) for each i. We consider these constants to be part of an even larger signature \(\varSigma ' \supseteq \varSigma ^*\). In what follows, we construct sentences and interpretations over \(\varSigma '\). Impose an arbitrary total order on each \(\{c_{i,\alpha }\}_{\alpha < \kappa (\sigma _i)}\) to be used for the \(\sim \) relation. For the definition below, recall that given a set X, we define \(X^{*}= \bigcup _{n\in \mathbb {N}}X^{n}\).

Definition 5

We define a set of formulas \(\varGamma = \{\varphi \} \cup \varGamma _1 \cup \varGamma _2 \cup \varGamma _3\), where

$$\begin{aligned} \varGamma _1 = &\{\lnot (c_{i,\alpha } = c_{i,\beta }) : 1 \le i \le |S|;\, \alpha , \beta < \kappa (\sigma _i);\, \alpha \ne \beta \} \\ \varGamma _2 = &\left\{ t\left( \overrightarrow{c_1}, \dots , \overrightarrow{c_i}, \overrightarrow{b_{i+1}}, \overrightarrow{b_{i+2}}, \dots \right) = t\left( \overrightarrow{c_1}, \dots , \overrightarrow{c_i}, \overrightarrow{d_{i+1}}, \overrightarrow{d_{i+2}}, \dots \right) : \right. \\ & t \ \text {is a } \varSigma ^* \text {-term of sort} \ \sigma _i;\, i > \ell ;\, \overrightarrow{c_k}, \overrightarrow{b_k}, \overrightarrow{d_k} \in (\{c_{k, \alpha }\}_{\alpha < \kappa (\sigma _k)})^* \\ &\left. \text {for all} \ k;\, \overrightarrow{b_j} \sim \overrightarrow{d_j} \ \text {for all} \ j > i \right\} \\ \varGamma _3 = &\left\{ \forall \,x : \sigma _i.\, \bigvee _{\alpha < \kappa (\sigma _i)} x = c_{i,\alpha } : i \le \ell \right\} . \end{aligned}$$

Note that the disjunctions in \(\varGamma _3\) are finite given the condition \(i \le \ell \).

Lemma 9

There is a \(\mathcal {T}^*\)-interpretation \(\mathcal {B}\) such that \(\mathcal {B}\vDash \varGamma \).

This lemma forms the core of the argument. By the compactness theorem, it suffices to prove that for any finite subset \(\varGamma ' \subseteq \varGamma \), there is a \(\mathcal {T}^*\)-interpretation \(\mathcal {B}'\) such that \(\mathcal {B}' \vDash \varGamma '\). The tricky part is making \(\mathcal {B}'\) satisfy \(\varGamma ' \cap \varGamma _2\). The strategy is to use Lemma 8 to construct a model \(\mathcal {B}'\) in which \(|\sigma _{i+1}^{\mathcal {B}'}|\) is very large in terms of \(|\sigma _i^{\mathcal {B}'}|\) for each \(i > \ell \). Lemma 7 will ensure that there is some way of interpreting the constants \(\{c_{i,\alpha }\}_{\alpha < \kappa (\sigma _i)}\) so that \(\mathcal {B}' \vDash \varGamma ' \cap \varGamma _2\).

We are now ready to prove Theorem 3.

Proof

(Theorem 3). By Lemma 9, there is a \(\mathcal {T}^*\)-interpretation \(\mathcal {B}\) such that \(\mathcal {B}\vDash \varGamma \). Let

$$ B = \left\{ t^\mathcal {B}\left( (\overrightarrow{c_1})^\mathcal {B}, (\overrightarrow{c_2})^\mathcal {B}, \dots \right) : t \ \text {is a }\varSigma ^* \text {-term};\, \overrightarrow{c_i} \in (\{c_{i, \alpha }\}_{\alpha < \kappa (\sigma _i)})^* \ \text {for all} \ i\right\} . $$

For every \(f \in \mathcal {F}_\varSigma \), the set B is closed under \(f^\mathcal {B}\). Thus, we can define \(\mathcal {B}^-\) to be the subinterpretation of \(\mathcal {B}\) obtained by restricting the sorts, functions, and predicates to B. Since the \(\varSigma ^*\)-theory \(\mathcal {T}^*\) has built-in Skolem functions, \(\mathcal {B}^-\) is an elementary subinterpretation of \(\mathcal {B}\) by Lemma 2. We claim \(|\sigma _i^{\mathcal {B}^-}| = \kappa (\sigma _i)\) for all i.

First, \(\{c_{i,\alpha }^{\mathcal {B}^-}\}_{\alpha < \kappa (\sigma _i)}\) is a set of \(\kappa (\sigma _i)\) distinct elements in \(\sigma _i^{\mathcal {B}^-}\), because \(\mathcal {B}^- \vDash \varGamma _1\). Thus, \(|\sigma _i^{\mathcal {B}^-}| \ge \kappa (\sigma _i)\) for all i.

Second, \(|\sigma _i^{\mathcal {B}^-}| \le |\{c_{i,\alpha }\}_{\alpha < \kappa (\sigma _i)}| = \kappa (\sigma _i)\) for all \(i \in [\ell ]\), as \(\mathcal {B}^- \vDash \varGamma _3\).

Finally, it remains to show that \(|\sigma _i^{\mathcal {B}^-}| \le \kappa (\sigma _i)\) for all \(i > \ell \). Inductively suppose that \(|\sigma _j^{\mathcal {B}^-}| \le \kappa (\sigma _j)\) for all \(j < i\). Now, every element of \(\sigma _i^{\mathcal {B}^-}\) is of the form

$$ t^\mathcal {B}\left( (\overrightarrow{c_1})^\mathcal {B}, \dots , (\overrightarrow{c_i})^\mathcal {B}, (\overrightarrow{c_{i+1}})^\mathcal {B}, (\overrightarrow{c_{i+2}})^\mathcal {B}, \dots \right) , $$

where t is a \(\varSigma ^*\)-term of sort \(\sigma _i\). Since \(\varSigma ^*\) is countable, there are at most \(\aleph _0\) choices for t. We have at most \(\kappa (\sigma _i)\) choices for \( (\overrightarrow{c_1})^\mathcal {B}, \dots , (\overrightarrow{c_i})^\mathcal {B}. \) Finally, we have finitely many choices for \((\overrightarrow{c_{i+1}})^\mathcal {B}, (\overrightarrow{c_{i+2}})^\mathcal {B}, \ldots \) up to \(\sim \)-equivalence. Since \(\mathcal {B}^- \vDash \varGamma _2\), it follows that there are at most \(\kappa (\sigma _i)\) elements of \(\sigma _i^{\mathcal {B}^-}\). Therefore, \(\mathcal {B}^-\) is a \(\mathcal {T}^*\)-interpretation satisfying \(\varphi \) with \(|\sigma _i^{\mathcal {B}^-}| = \kappa (\sigma _i)\) for all i. Taking the reduct of \(\mathcal {B}^-\) to \(\varSigma \) gives the desired \(\mathcal {T}\)-interpretation.   \(\square \)
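The counting in the last step of the proof can be summarized in one line. This is only a sketch: it assumes, as the setup suggests, that \(\kappa (\sigma _i)\) is infinite for \(i > \ell \), and it bounds the finitely many \(\sim \)-classes generously by \(\aleph _0\).

$$ |\sigma _i^{\mathcal {B}^-}| \;\le \; \underbrace{\aleph _0}_{\text {choices of } t} \cdot \underbrace{\kappa (\sigma _i)}_{\text {choices of } \overrightarrow{c_1}, \dots , \overrightarrow{c_i}} \cdot \underbrace{\aleph _0}_{\text {bound on } \sim \text {-classes}} \;=\; \kappa (\sigma _i). $$

The final equality is the usual absorption law for products of cardinals when at least one factor is infinite.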

3.4 Applications to Theory Combination

Since Theorem 2 implies that stably infinite and strongly finitely witnessable theories are strongly polite, we can restate the theorem on strongly polite theory combination with weaker hypotheses. This was already proved in [21] via a different method, but is now obtained as an immediate corollary of Theorem 2.

Corollary 1

Let \(\varSigma _1\) and \(\varSigma _2\) be disjoint countable signatures. Let \(\mathcal {T}_1\) and \(\mathcal {T}_2\) be \(\varSigma _1\)- and \(\varSigma _2\)-theories respectively, and let \(\varphi _1\) and \(\varphi _2\) be quantifier-free \(\varSigma _1\)- and \(\varSigma _2\)-formulas respectively. Suppose \(\mathcal {T}_1\) is stably infinite and strongly finitely witnessable, both with respect to \(\mathcal {S}_{\varSigma _1} \cap \mathcal {S}_{\varSigma _2}\), and let \(V = \textit{vars}_{\mathcal {S}_{\varSigma _1} \cap \mathcal {S}_{\varSigma _2}}(\textit{wit}(\varphi _1))\). Then, \(\varphi _1 \wedge \varphi _2\) is \((\mathcal {T}_1 \cup \mathcal {T}_2)\)-satisfiable if and only if there is an arrangement \(\delta _V\) on V such that \(\textit{wit}(\varphi _1) \wedge \delta _V\) is \(\mathcal {T}_1\)-satisfiable and \(\varphi _2 \wedge \delta _V\) is \(\mathcal {T}_2\)-satisfiable.
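The search over arrangements in Corollary 1 can be sketched in Python. This is a minimal illustration under simplifying assumptions, not a description of any solver: all variables in V are treated as having a single shared sort, and `sat1` and `sat2` are hypothetical oracles standing in for decision procedures that test \(\textit{wit}(\varphi _1) \wedge \delta _V\) and \(\varphi _2 \wedge \delta _V\).

```python
from itertools import combinations


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ...or into a new block of its own.
        yield [[first]] + partition


def arrangement(partition):
    """Return the (equalities, disequalities) induced by a partition."""
    block_of = {v: k for k, block in enumerate(partition) for v in block}
    eqs, neqs = [], []
    for x, y in combinations(sorted(block_of), 2):
        (eqs if block_of[x] == block_of[y] else neqs).append((x, y))
    return eqs, neqs


def combined_sat(variables, sat1, sat2):
    """Return an arrangement accepted by both oracles, or None."""
    for partition in set_partitions(variables):
        delta = arrangement(partition)
        if sat1(delta) and sat2(delta):
            return delta
    return None
```

The number of arrangements is the Bell number of |V|, so the enumeration is exponential in the number of shared variables; the sketch makes no attempt to prune it.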

We can also use our results to give a new characterization of shiny theories, which allows us to restate the shiny combination theorem with weaker hypotheses.

To define shininess, we first need a few other notions. Let \(\varSigma \) be a signature with \(\mathcal {S}_\varSigma \) finite, and let \(S \subseteq \mathcal {S}_\varSigma \). Write \(S = \{\sigma _1, \dots , \sigma _n\}\). Then, the S-size of a \(\varSigma \)-interpretation \(\mathcal {A}\) is given by the tuple \((|\sigma _1^\mathcal {A}|, \dots , |\sigma _n^\mathcal {A}|)\). Such n-tuples are partially ordered by the product order: \((x_1, \dots , x_n) \preceq (y_1, \dots , y_n)\) if and only if \(x_i \le y_i\) for all \(i \in [n]\). Given a quantifier-free formula \(\varphi \), let \(\textrm{minmods}_{\mathcal {T}, S}(\varphi )\) be the set of minimal S-sizes of \(\mathcal {T}\)-interpretations satisfying \(\varphi \). It follows from results in [10] that \(\textrm{minmods}_{\mathcal {T}, S}(\varphi )\) is a finite set of tuples.
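The product order and the notion of minimal S-sizes can be illustrated with a short Python sketch. It assumes that a finite set of candidate S-sizes is already given and that all cardinalities are finite (plain integers); it does not compute \(\textrm{minmods}_{\mathcal {T}, S}\) for any actual theory, and the function names are ours.

```python
def leq(x, y):
    """Product order: x is below y iff every component of x is <= that of y."""
    return len(x) == len(y) and all(a <= b for a, b in zip(x, y))


def minimal_sizes(sizes):
    """Return the tuples in `sizes` that have no strictly smaller tuple in `sizes`."""
    sizes = set(sizes)
    return {x for x in sizes if not any(leq(y, x) and y != x for y in sizes)}
```

For instance, among the sizes (1, 3), (2, 2), (3, 1), (2, 3), and (3, 3), the first three are pairwise incomparable and hence all minimal, while the last two lie above (1, 3).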

Then, we say a \(\varSigma \)-theory \(\mathcal {T}\) is shiny with respect to some subset of sorts \(S \subseteq \mathcal {S}_\varSigma \) if \(\mathcal {S}_\varSigma \) is finite, \(\mathcal {T}\) is stably finite and smooth, both with respect to S, and \(\textrm{minmods}_{\mathcal {T}, S}\) is computable. Theorem 3 implies that we can replace smoothness by finite smoothness, which may make it easier to prove that some theories are shiny. We can therefore improve the shiny theory combination theorem from [4, Theorem 2] as an immediate corollary of Theorem 3.

Corollary 2

Let \(\varSigma _1\) and \(\varSigma _2\) be disjoint countable signatures, where \(\mathcal {S}_{\varSigma _1}\) and \(\mathcal {S}_{\varSigma _2}\) are finite. Let \(\mathcal {T}_1\) and \(\mathcal {T}_2\) be \(\varSigma _1\)- and \(\varSigma _2\)-theories respectively, and assume the satisfiability problems for quantifier-free formulas of both \(\mathcal {T}_{1}\) and \(\mathcal {T}_{2}\) are decidable. Suppose \(\mathcal {T}_1\) is stably finite and finitely smooth, both with respect to \(\mathcal {S}_{\varSigma _1} \cap \mathcal {S}_{\varSigma _2}\), and \(\textrm{minmods}_{\mathcal {T}_1, \mathcal {S}_{\varSigma _1} \cap \mathcal {S}_{\varSigma _2}}\) is computable. Then, the satisfiability problem for quantifier-free formulas of \(\mathcal {T}_{1}\cup \mathcal {T}_{2}\) is decidable.

4 Many-Sorted Löwenheim–Skolem Theorems

In this section, we state many-sorted generalizations of the Löwenheim–Skolem theorem. Our first results, in Sect. 4.2, hold with no assumptions on the signature. Later, in Sect. 4.3, we state stronger results for restricted signatures, which we then use for a many-sorted variant of the Łoś–Vaught test in Sect. 4.4. But first, in Sect. 4.1, we explain the limitations of relying solely on translations to single-sorted first-order logic.

4.1 Lost in Translation

We may transform a many-sorted signature into a single-sorted signature by adding unary predicates signifying the sorts; of course, some restrictions are necessary, such as distinctness of sorts. This procedure [6, 13, 24] is often used to lift results from single-sorted to many-sorted logic. As one example, standard versions of the downward Löwenheim–Skolem theorem for many-sorted logic, found in [14], are proven using this translation; we can, however, strengthen these results while still using only translations:

Theorem 4

(Downward). Let \(\varSigma \) be a many-sorted signature with \(|\mathcal {S}_{\varSigma }|<\aleph _{0}\). Suppose we have a \(\varSigma \)-structure \(\mathbb {A}\) with \(\max \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}_{\varSigma }\}\ge \aleph _{0}\), a cardinal \(\kappa \) satisfying \(\max \{|\varSigma |, \aleph _{0}\}\le \kappa \le \min \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\}\), and sets \(A_{\sigma }\subseteq \sigma ^{\mathbb {A}}\) with \(|A_{\sigma }|\le \kappa \) for each \(\sigma \in \mathcal {S}_{\varSigma }\). Then, there is an elementary substructure \(\mathbb {B}\) of \(\mathbb {A}\) such that \(\sigma ^{\mathbb {B}}=\sigma ^{\mathbb {A}}\) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\), \(\aleph _{0}\le |\sigma ^{\mathbb {B}}|\le \kappa \) for all \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\), \(|\sigma ^{\mathbb {B}}|=\kappa \) for some \(\sigma \in \mathcal {S}_{\varSigma }\), and \(A_{\sigma }\subseteq \sigma ^{\mathbb {B}}\) for all \(\sigma \in \mathcal {S}_{\varSigma }\).

Theorem 5

(Upward). Let \(\varSigma \) be a many-sorted signature with \(|\mathcal {S}_{\varSigma }|<\aleph _{0}\). Suppose we have a \(\varSigma \)-structure \(\mathbb {A}\) with \(\max \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}_{\varSigma }\}\ge \aleph _{0}\) and a cardinal \(\kappa \ge \max \{|\varSigma |, \max \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}_{\varSigma }\}\}\). Then, there is a \(\varSigma \)-structure \(\mathbb {B}\) containing \(\mathbb {A}\) as an elementary substructure such that \(\sigma ^{\mathbb {B}}=\sigma ^{\mathbb {A}}\) for all \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\), \(\aleph _{0}\le |\sigma ^{\mathbb {B}}|\le \kappa \) for all \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\), and \(|\sigma ^{\mathbb {B}}|=\kappa \) for some sort \(\sigma \in \mathcal {S}_{\varSigma }\).

As convenient as translation arguments are, the above Löwenheim–Skolem theorems seem unsatisfactory, as they only allow us to choose a single cardinal, rather than one for each sort.

4.2 Downward, Upward, and Combined Versions

The following are generalizations of the downward and upward Löwenheim–Skolem theorems to many-sorted logic, which are proved by adapting the proofs of the single-sorted case. Notice that we set all infinite domains to the same cardinality, while finite domains preserve their cardinalities.

Theorem 6

(Downward). Fix a first-order many-sorted signature \(\varSigma \). Suppose we have a \(\varSigma \)-structure \(\mathbb {A}\), a cardinal \(\kappa \) such that \(\max \{\aleph _{0}, |\varSigma |\}\le \kappa \le \min \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\}\), and sets \(A_{\sigma } \subseteq \sigma ^\mathbb {A}\) with \(|A_\sigma | \le \kappa \) for each \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\). Then, there is an elementary substructure \(\mathbb {B}\) of \(\mathbb {A}\) that satisfies \(|\sigma ^{\mathbb {B}}|=\kappa \) and \(\sigma ^{\mathbb {B}}\supseteq A_{\sigma }\) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\), and also \(\sigma ^{\mathbb {B}}=\sigma ^{\mathbb {A}}\) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\).

Theorem 7

(Upward). Fix a first-order many-sorted signature \(\varSigma \). Given a \(\varSigma \)-structure \(\mathbb {A}\), pick a cardinal \(\kappa \ge \max \{|\varSigma |, \aleph _{0},\sup \{ |\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\}\}\). Then, there is a \(\varSigma \)-structure \(\mathbb {B}\) containing \(\mathbb {A}\) as an elementary substructure that satisfies \(|\sigma ^{\mathbb {B}}|=\kappa \) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\), and also \(\sigma ^{\mathbb {B}} = \sigma ^{\mathbb {A}}\) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\).

Theorems 6 and 7 can be combined to yield yet another variant of the Löwenheim–Skolem theorem, which may be called the combined version.

Corollary 3

(Combined). Fix a many-sorted signature \(\varSigma \). Given a \(\varSigma \)-structure \(\mathbb {A}\), pick a cardinal \(\kappa \ge \max \{|\varSigma |, \aleph _{0}\}\). Then, there is a \(\varSigma \)-structure \(\mathbb {B}\) elementarily equivalent to \(\mathbb {A}\) with \(|\sigma ^{\mathbb {B}}|=\kappa \) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\), and \(\sigma ^{\mathbb {B}} = \sigma ^{\mathbb {A}}\) for \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\).

Fig. 2. Illustration of Corollary 3.

We illustrate Corollary 3 in Fig. 2. In black, we represent the cardinalities of the resulting structure, and in red, those of the original one. When they coincide, we use marks split between the two colors. The horizontal axis shows a set of sorts, and the heights of the marks represent the cardinalities of the respective domains. A rule separates the finite cardinals from the infinite ones. Assume, without loss of generality, that initially \(\sigma _1, \dots , \sigma _n\) have finite cardinalities and that \(\sigma ^{\prime }_{1}\) has the least and \(\sigma ^{\prime }_{m}\) the greatest infinite cardinality. Corollary 3 allows us to pick an infinite cardinal \(\kappa \) in between the least and greatest infinite cardinalities, and set all infinite cardinalities in the interpretation to \(\kappa \).

The above theorems require that the desired cardinalities of the infinite sorts are all equal. The following example shows that this limitation is necessary.

Example 2

Take the signature \(\varSigma \) with sorts \(S=\{\sigma _{1}, \sigma _{2}\}\), no predicates, and only one function f of arity \((\sigma _{1},\sigma _{2})\). Take the \(\varSigma \)-structure \(\mathbb {A}\) with: \(\sigma _{1}^{\mathbb {A}}\) and \(\sigma _{2}^{\mathbb {A}}\) of cardinality \(\aleph _{1}\), and \(f^{\mathbb {A}}\) a bijection. It is then true that \(\mathbb {A}\vDash \varphi _{ inj } \wedge \varphi _{ sur }\), where \(\varphi _{ inj }=\forall \,x : \sigma _1.\,\forall \,y : \sigma _1.\,\big [[f(x)=f(y)]\rightarrow [x=y]\big ]\) and \(\varphi _{ sur }=\forall \,u : \sigma _2.\,\exists \,x : \sigma _1.\,[f(x)=u]\), codifying that f is injective and surjective respectively. Notice then that, although \(\max \{|\varSigma |, \aleph _{0}\}=\aleph _{0}\), there cannot be an elementary substructure \(\mathbb {B}\) of \(\mathbb {A}\) with \(|\sigma _{1}^{\mathbb {B}}|=\aleph _{0}\) and \(|\sigma _{2}^{\mathbb {B}}|=\aleph _{1}\): for if \(\mathbb {B}\vDash \varphi _{ inj }\wedge \varphi _{ sur }\), \(f^{\mathbb {B}}\) must be a bijection between \(\sigma _{1}^{\mathbb {B}}\) and \(\sigma _{2}^{\mathbb {B}}\). A similar argument shows that the corresponding generalization of the upwards theorem fails as well.

4.3 A Stronger Result for Split Signatures

Example 2 relies on “mixing sorts” by using a function symbol with arities spanning different sorts. We can state stronger versions of the many-sorted Löwenheim–Skolem theorems when such mixing of sorts is restricted.

Definition 6

A signature \(\varSigma \) is said to be split by \(\Lambda \) into a family of signatures \(\{\varSigma _{\lambda } : \lambda \in \Lambda \}\) if \(\Lambda \) is a partition of \(\mathcal {S}_{\varSigma }\), \(\mathcal {S}_{\varSigma _{\lambda }}=\lambda \) for each \(\lambda \in \Lambda \), \(\mathcal {F}_{\varSigma }=\bigcup _{\lambda \in \Lambda }\mathcal {F}_{\varSigma _{\lambda }}\), and \(\mathcal {P}_{\varSigma }=\bigcup _{\lambda \in \Lambda }\mathcal {P}_{\varSigma _{\lambda }}\). If \(\varSigma \) is split by \(\Lambda \) and each \(\lambda \in \Lambda \) is a singleton, then we say that \(\varSigma \) is completely split by \(\Lambda \).
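As a concrete reading of Definition 6, the following Python sketch checks whether a finite signature is split by a given partition of its sorts. The encoding is an assumption made for illustration: each function or predicate symbol is mapped to the tuple of sorts occurring in its arity (including the result sort for functions), and a symbol belongs to some \(\varSigma _{\lambda }\) exactly when all of those sorts lie in the block \(\lambda \).

```python
def is_split(sorts, symbols, partition):
    """Check that `partition` splits the signature given by `sorts` and `symbols`.

    `symbols` maps each symbol name to the tuple of sorts in its arity.
    """
    blocks = [frozenset(block) for block in partition]
    covered = [s for block in blocks for s in block]
    # The blocks must form a partition of the sorts: nonempty, disjoint, covering.
    if any(not block for block in blocks):
        return False
    if len(covered) != len(set(covered)) or set(covered) != set(sorts):
        return False
    # Every symbol must mention sorts from a single block only.
    return all(any(set(arity) <= block for block in blocks)
               for arity in symbols.values())


def is_completely_split(sorts, symbols):
    """Check splitting by the partition of the sorts into singletons."""
    return is_split(sorts, symbols, [{s} for s in sorts])
```

Under this encoding, the signature of Example 2, with a single function f of arity \((\sigma _{1},\sigma _{2})\), is split only by the trivial partition with one block and is not completely split.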

If \(\varSigma \) is split by \(\Lambda \), then the function/predicate symbols of \(\varSigma _{\lambda }\) must be disjoint from \(\varSigma _{\lambda '}\) for \(\lambda \ne \lambda '\). Given a partition \(\Lambda \) of \(\mathcal {S}_{\varSigma }\) and \(\lambda \in \Lambda \), let \(\mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda )=\mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}\cap \lambda \). We state the downward, upward, and combined theorems for split signatures.

Theorem 8

(Downward). Fix a first-order many-sorted signature \(\varSigma \) split by \(\Lambda \). Suppose we have a \(\varSigma \)-structure \(\mathbb {A}\), a cardinal \(\kappa _\lambda \) such that \(\max \{\aleph _{0}, |\varSigma _{\lambda }|\}\le \kappa _{\lambda }\le \min \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda )\}\) for each \(\lambda \in \Lambda \), and sets \(A_{\sigma } \subseteq \sigma ^\mathbb {A}\) with \(|A_\sigma | \le \kappa _\lambda \) for each \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda )\). Then, there is an elementary substructure \(\mathbb {B}\) of \(\mathbb {A}\) that satisfies \(|\sigma ^{\mathbb {B}}|=\kappa _\lambda \) and \(\sigma ^{\mathbb {B}}\supseteq A_{\sigma }\) for \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda )\), and \(\sigma ^{\mathbb {B}}=\sigma ^{\mathbb {A}}\) for \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\).

Theorem 9

(Upward). Suppose \(\varSigma \) is split by \(\Lambda \). Given a \(\varSigma \)-structure \(\mathbb {A}\), pick a cardinal \(\kappa _{\lambda }\ge \max \{|\varSigma _{\lambda }|, \aleph _{0}, \sup \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda )\}\}\) for each \(\lambda \in \Lambda \). Then, there is a \(\varSigma \)-structure \(\mathbb {B}\) containing \(\mathbb {A}\) as an elementary substructure that satisfies \(|\sigma ^{\mathbb {B}}|=\kappa _\lambda \) for \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda )\), and \(\sigma ^{\mathbb {B}} = \sigma ^{\mathbb {A}}\) for \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\).

Corollary 4

(Combined). Suppose \(\varSigma \) is split by \(\Lambda \). Given a \(\varSigma \)-structure \(\mathbb {A}\), pick a cardinal \(\kappa _{\lambda }\ge \max \{|\varSigma _{\lambda }|, \aleph _{0}\}\) for each \(\lambda \in \Lambda \). Then, there is a \(\varSigma \)-structure \(\mathbb {B}\) elementarily equivalent to \(\mathbb {A}\) with \(|\sigma ^{\mathbb {B}}|=\kappa _\lambda \) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda )\), and also \(\sigma ^{\mathbb {B}} = \sigma ^{\mathbb {A}}\) for every \(\sigma \in \mathcal {S}^{\mathbb {A}}_{<\aleph _{0}}\).

Fig. 3. Illustration of Corollary 4.

Corollary 4 is illustrated in Fig. 3. We add sorts \(S^{\prime \prime }=\{\sigma ^{\prime \prime }_{1},\ldots ,\sigma ^{\prime \prime }_{m}\}\), and assume our signature is split into \(\varSigma _{\lambda _{1}}\) and \(\varSigma _{\lambda _{2}}\), where \(\mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda _{1})=\{\sigma ^{\prime }_{1},\ldots ,\sigma ^{\prime }_{m}\}\) and \(\mathcal {S}^{\mathbb {A}}_{\ge \aleph _{0}}(\lambda _{2})=S^{\prime \prime }\) (the sorts with finite cardinalities can belong to either). Then, \(\kappa ^{\prime }\) is the cardinal associated with \(\varSigma _{\lambda _{1}}\), and \(\kappa ^{\prime \prime }\) with \(\varSigma _{\lambda _{2}}\). Thus, we are able to choose a cardinality for each class of sorts.
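The effect of Corollary 4 on domain cardinalities can be mimicked by a small Python sketch. It only tracks sizes and does not construct any structure. The representation is hypothetical: finite cardinalities are integers, infinite ones are `Aleph(n)` for \(\aleph _n\), and each signature \(\varSigma _{\lambda }\) is assumed countable, so that any infinite \(\kappa _{\lambda }\) is admissible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aleph:
    """The infinite cardinal aleph_index (Aleph(0) is countably infinite)."""
    index: int


def combined_sizes(sizes, partition, kappa):
    """Return the domain sizes promised by the combined theorem.

    `sizes` maps sorts to an int or an Aleph, `partition` is a list of
    blocks of sorts, and `kappa` lists the chosen cardinal for each block.
    """
    result = {}
    for block, target in zip(partition, kappa):
        for sort in block:
            size = sizes[sort]
            # Infinite sorts in this block all receive the block's cardinal;
            # finite sorts keep their original size.
            result[sort] = target if isinstance(size, Aleph) else size
    return result
```

Within a block, all infinite sorts receive the same cardinal, in line with the limitation shown by Example 2, whereas different blocks may receive different cardinals.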

4.4 An Application: The Łoś–Vaught Test

We describe an application of our Löwenheim–Skolem theorems to theory completeness: the Łoś–Vaught test. This is particularly relevant to SMT: if a complete theory \(\mathcal {T}\) has a decidable set of axioms, then it is decidable whether \(\vdash _{\mathcal {T}} \varphi \) [12, Lemma 2.2.8]. The single-sorted Łoś–Vaught test is the following.

Definition 7

Let \(\varSigma \) be a signature and \(\kappa \) a function from \(\mathcal {S}_{\varSigma }\) to the class of cardinals. A \(\varSigma \)-theory \(\mathcal {T}\) is \(\kappa \)-categorical if it has exactly one model \(\mathbb {A}\) (up to isomorphism) with the property that \(|\sigma ^{\mathbb {A}}| = \kappa (\sigma )\) for every \(\sigma \in \mathcal {S}_{\varSigma }\). If there is only one sort \(\sigma \in \mathcal {S}_{\varSigma }\), we abuse notation by using \(\kappa \) to denote the cardinal \(\kappa (\sigma )\).

Theorem 10

([11, 23]). Suppose \(\varSigma \) is single-sorted and \(\mathcal {T}\) is a \(\varSigma \)-theory with only infinite models. If \(\mathcal {T}\) is \(\kappa \)-categorical for some \(\kappa \ge |\varSigma |\), then \(\mathcal {T}\) is complete.

The Łoś–Vaught test is quite useful, e.g., for the completeness of dense linear orders without endpoints and algebraically closed fields. We generalize it to many sorts. Translating to one-sorted logic and using Theorem 10 gives us:

Corollary 5

Let \(\varSigma \) be a signature with \(|\mathcal {S}_{\varSigma }|<\aleph _{0}\). Suppose \(\mathcal {T}\) is a \(\varSigma \)-theory, all of whose models \(\mathbb {A}\) satisfy \(\max \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}_{\varSigma }\} \ge \aleph _{0}\). Suppose further that for some cardinal \(\kappa \ge |\varSigma |\), \(\mathcal {T}\) has exactly one model \(\mathbb {A}\) (up to isomorphism) such that \(\max \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}_{\varSigma }\} = \kappa \). Then, \(\mathcal {T}\) is complete.

This is not the result one would hope for, because it excludes some many-sorted \(\kappa \)-categorical theories, as the following example demonstrates.

Example 3

Suppose \(\varSigma \) has \(S = \{\sigma _1, \sigma _2\}\), no predicate symbols, and function symbols 0, 1, \(+\), and \(\times \), of the expected arities. Let \( \mathcal {T}= \mathsf {ACF_0} \cup \big \{\psi ^{\sigma _2}_{\ge n} : n \in \mathbb {N}\big \}, \) where \(\mathsf {ACF_0}\) is the theory of algebraically closed fields of characteristic zero (with respect to \(\sigma _1\)) and \( \psi ^{\sigma }_{\ge n}=\exists \,x_{1} : \sigma .\,\cdots \exists \,x_n : \sigma .\, \bigwedge _{1\le i<j\le n}\lnot (x_{i}=x_{j}) \), which asserts that there are at least n elements of sort \(\sigma \). \(\mathcal {T}\) is \(\kappa \)-categorical, where \(\kappa (\sigma _1) = \aleph _1\) and \(\kappa (\sigma _2) = \aleph _{0}\). But \(\mathcal {T}\) is also \(\kappa '\)-categorical, where \(\kappa '(\sigma _1) = \kappa '(\sigma _2) = \aleph _1\). Thus, \(\mathcal {T}\) has multiple models \(\mathbb {A}\) satisfying \(\max \{|\sigma ^{\mathbb {A}}| : \sigma \in \mathcal {S}_{\varSigma }\} = \aleph _1\). Similar reasoning holds for other infinite cardinals, so Corollary 5 does not apply.

For completely split signatures, we prove a more natural Łoś–Vaught test:

Definition 8

A \(\varSigma \)-structure \(\mathbb {A}\) is strongly infinite if \(|\sigma ^{\mathbb {A}}| \ge \aleph _{0}\) for all \(\sigma \in \mathcal {S}_{\varSigma }\).

Theorem 11

Suppose \(\varSigma \) is completely split into \(\{\varSigma _\sigma : \sigma \in \mathcal {S}_{\varSigma }\}\), \(\mathcal {T}\) is a \(\varSigma \)-theory all of whose models are strongly infinite, and \(\mathcal {T}\) is \(\kappa \)-categorical for some function \(\kappa \) such that \(\kappa (\sigma ) \ge |\varSigma _\sigma |\) for every \(\sigma \in \mathcal {S}_{\varSigma }\). Then, \(\mathcal {T}\) is complete.

The assumption that \(\varSigma \) is completely split is necessary for Theorem 11:

Example 4

Let \(\varSigma \) have sorts \(\sigma _{1}, \sigma _{2}\), and function symbol f of arity \((\sigma _{1},\sigma _{2})\). Let \( \mathcal {T}= \big \{\psi ^{\sigma _1}_{\ge n} : n \in \mathbb {N}\big \} \cup \big \{\psi ^{\sigma _2}_{\ge n} : n \in \mathbb {N}\big \}\cup \big \{\varphi _{ inj } \vee \forall \,x : \sigma _1.\,\forall \,y : \sigma _1.\, [f(x)=f(y)]\big \} \). In \(\mathcal {T}\), \(\sigma _1,\sigma _2\) are infinite, and f is injective or constant. \(\mathcal {T}\) is \(\kappa \)-categorical for \(\kappa (\sigma _1) = \aleph _1,\kappa (\sigma _2) = \aleph _{0}\), but not complete, due to the sentence \(\forall \,x : \sigma _1.\,\forall \,y : \sigma _1.\, [f(x)=f(y)]\), which holds in the models of \(\mathcal {T}\) where f is constant and fails in those where f is injective. This does not contradict Theorem 11, as \(\varSigma \) is not completely split.

5 Conclusion

We closed the problem of the existence of unicorn theories and discussed applications to SMT. This included a result similar to the Löwenheim–Skolem theorem, which inspired us to investigate the adaptation of this theorem to many-sorted logic. We also obtained a many-sorted version of the Łoś–Vaught test.

In future work, we plan to investigate whether Theorem 3 can be extended to uncountable signatures. More broadly, we intend to continue studying the relationships among many-sorted model-theoretic properties related to SMT.