Abstract
We study the fine-grained complexity of NP-complete, infinite-domain constraint satisfaction problems (CSPs) parameterised by a set of first-order definable relations (with equality). Such CSPs are of central importance since they form a subclass of any infinite-domain CSP parameterised by a set of first-order definable relations over a relational structure (possibly containing more than just equality). We prove that under the randomised exponential-time hypothesis it is not possible to find \(c > 1\) such that a CSP over an arbitrary finite equality language is solvable in \(O(c^n)\) time (n is the number of variables). Stronger lower bounds are possible for infinite equality languages where we rule out the existence of \(2^{o(n \log n)}\) time algorithms; a lower bound which also extends to satisfiability modulo theories solving for an arbitrary background theory. Despite these lower bounds we prove that for each \(c > 1\) there exists an NP-hard equality CSP solvable in \(O(c^n)\) time. Lower bounds like these immediately call for closely matching upper bounds, and we prove that a CSP over a finite equality language is always solvable in \(O(c^n)\) time for a fixed c, and manage to extend this algorithm to the much broader class of CSPs where constraints are formed by first-order formulas over a unary structure.
Introduction
In this article we study the fine-grained, rather than classical, coarse, complexity of infinite-domain constraint satisfaction problems. We approach the subject in a systematic manner and obtain powerful lower bounds applicable to all infinite-domain CSPs where constraints consist of first-order definable relations over a fixed relational structure. In the direction of upper bounds we obtain improved, single-exponential time algorithms for equality CSPs, and the broader class of CSPs over reducts of unary structures. Some parts of this article have been presented in preliminary form in a conference publication [20].
Background
Let \(\varGamma \) denote a (finite or infinite) set of finitary relations over a (finite or infinite) set D. The input to the constraint satisfaction problem over \(\varGamma \) (CSP\((\varGamma )\)) is a pair (V, C) where V is a set of variables (with domain D) and C is a set of constraints over \(\varGamma \). A constraint is an expression \(R(x_1,\dots ,x_n)\) where \(x_1,\dots ,x_n\) are variables in V and n equals the arity of the relation R. The problem is to find an assignment \(f:V \rightarrow D\) that satisfies every constraint in C, i.e. \((f(x_1),\dots ,f(x_n)) \in R\) for every constraint \(R(x_1,\dots ,x_n)\) in C. The set \(\varGamma \) is called the constraint language or the template. The CSP is computationally hard in the general case; if the variable domains are finite, then the problem is NP-complete, and otherwise it may be of arbitrarily high complexity or even undecidable [6].
Depending on the constraint language \(\varGamma \), it is possible to formulate many natural problems as CSP\((\varGamma )\) problems. This is especially true if we allow templates over an infinite universe, which increases the expressive power of CSPs and e.g. makes it possible to formulate a wide range of problems from artificial intelligence [7, 15]. The complexity of CSPs has also been the subject of intense theoretical research: for each constraint language \(\varGamma \) over a finite domain, CSP\((\varGamma )\) is always either polynomial-time solvable or NP-complete [13, 41]. Infinite-domain CSPs are in general undecidable, but there exists a wealth of results when additional restrictions are imposed. Early examples include the CSP formulation of Allen’s interval algebra [25], the region connection calculus [31], CSPs over first-order definable relations with equality [8] (equality CSPs), and temporal CSPs [9], i.e. CSPs where the constraint language is first-order definable in the structure \(({\mathbb Q}; <)\) whose domain is the set of rational numbers \({\mathbb Q}\) and where < denotes the usual strict order of the rationals. More generally, it is common to consider first-order reducts of a fixed relational structure \({{\mathcal {A}}}\), i.e., languages that are first-order definable with equality over \({{\mathcal {A}}}\). Equality CSPs then correspond to CSP\((\varGamma )\) when \(\varGamma \) is a first-order reduct of \((A; \emptyset )\) for some universe A (an equality language) while temporal CSPs correspond to CSP\((\varGamma )\) when \(\varGamma \) is a first-order reduct of \(({\mathbb {Q}}; <)\). To make the intended meaning clearer we sometimes treat equality languages as first-order reducts of \((A; =)\), where \(=\) is the equality relation over the universe A, even though this is strictly speaking not needed since the equality relation is always allowed in first-order formulas.
Equality CSPs have previously been intensively studied due to their fundamental importance for understanding more complex CSPs, since any classification of a larger relational structure \({{\mathcal {A}}}\) necessarily also needs to include a classification of equality CSPs (an equality language \(\varGamma \) is a reduct of any countably infinite structure \({{\mathcal {A}}}\)). Let us also remark that CSPs in this setting are very similar to reasoning problems occurring in artificial intelligence, where one fixes a set of “base relations” \({{\mathcal {A}}}\), typically binary, and then considers a satisfiability problem where constraints are taken from e.g. the relation algebra generated by \({{\mathcal {A}}}\), or the set of all disjunctive clauses over \(\mathcal{A}\) [15]. A recent comparison may also be found in satisfiability modulo theories (SMT) where a background theory \({{\mathcal {A}}}\) is fixed, and where one considers the satisfiability problem of first-order formulas (with equality) restricted to interpretations agreeing with \(\mathcal{A}\) [3].
While theoretical CSP research has concentrated on classical complexity, complexity theory itself has partially shifted towards parameterised complexity and fine-grained complexity, which e.g. encompasses constructing improved exponential-time algorithms, and proving lower bounds with stronger assumptions than \(\mathrm {P}\ne \mathrm {NP}\). A popular conjecture for this purpose is the exponential-time hypothesis (ETH). It states that the 3-SAT problem is not solvable in subexponential time, i.e. it is not solvable in \(2^{o(n)}\) time, where n is the number of variables. Another popular conjecture is the strong ETH (SETH) which, roughly, states that the fine-grained complexity of k-SAT tends to \(2^n\) for increasing values of k.
In this article we study the fine-grained complexity of NP-hard infinite-domain CSPs, with a particular focus on equality CSPs using the number of variables, n, as the complexity parameter. As remarked, equality CSPs constitute a natural starting point for questions of fine-grained complexity, since if we cannot even overcome this obstacle there is little hope of understanding fine-grained complexity questions for larger classes of CSPs. Assume, for example, that we prove that there exists an equality language \(\varGamma \) such that CSP\((\varGamma )\) is not solvable in O(f(n)) time, for some function f. Then, regardless of which relational structure \({{\mathcal {A}}}\) we choose, we cannot hope to construct an algorithm with a running time of O(f(n)) which is applicable to CSP\((\varDelta )\) for every first-order reduct \(\varDelta \) of \({{\mathcal {A}}}\). Under this viewpoint it is therefore crucial to prove lower bounds for equality CSPs before moving on to construct faster exponential-time algorithms for broader classes of infinite-domain CSPs.
Thus, among the class of NP-hard equality CSPs, how does the choice of \(\varGamma \) affect the fine-grained complexity of CSP\((\varGamma )\)? For example, it is known that CSP\((\varGamma )\) is solvable in \(O^{*}(2^{n \cdot \log (\frac{0.792n}{\ln (n+1)})})\) time when \(\varGamma \) is an arbitrary equality language [18] (the \(O^{*}\) notation is used to suppress polynomial factors). Concerning lower bounds it is known that no NP-complete equality CSP\((\varGamma )\) problem is solvable in subexponential time without violating the ETH. This follows from Barto & Pinsker [2]: if \(\varGamma \) is an equality language and CSP\((\varGamma )\) is NP-hard, then \(\varGamma \) pp-interprets 3-SAT since \(\varGamma \) is a first-order reduct of the finitely bounded homogeneous structure \(({\mathbb N};=)\). This fact combined with Theorem 3.1 in [22] gives the result. Furthermore, if \(\varGamma \) is the full first-order reduct of \((A; =)\) then there cannot exist an \(O^{*}(c^n)\) time algorithm for CSP\((\varGamma )\) for any constant c without violating the SETH [18]. Despite bounds like these, there are still large gaps in our understanding of fine-grained complexity of infinite-domain CSPs in general, and of equality CSPs in particular. For example, is it possible to find an equality language \(\varGamma \) such that CSP\((\varGamma )\) is NP-complete but solvable in \(O(c^n)\) time for a constant \(c > 1\)? Is it possible to solve CSP\((\varGamma )\) in \(O(c^n)\) time whenever \(\varGamma \) is a finite equality language, and in that case, does c depend on \(\varGamma \) or is it possible to find a uniform value? Furthermore, since no NP-complete equality CSP is solvable in subexponential time without violating the ETH, does there exist a \(c > 1\) such that no NP-complete equality CSP is solvable in \(O(c^n)\) time?
Our Results
After defining the necessary preliminaries (in Sect. 2) we begin in Sect. 3 to answer the aforementioned questions by a careful study of lower bounds. First, we prove that under the randomised ETH for each \(c > 1\) there exists a finite equality language \(\varGamma _c\) such that CSP\((\varGamma _c)\) is not solvable in \(O(c^n)\) time. Second, we showcase a striking difference between finite and infinite languages and prove the existence of an infinite equality language \(\varGamma \) such that CSP\((\varGamma )\) is not solvable in \(2^{o(n \log n)}\) time (under the ETH). In particular this lower bound rules out a uniform \(O(c^n)\) time algorithm, \(c > 1\), applicable to arbitrary equality CSPs (which previously was only known to hold under the much stronger SETH). We also manage to lift this lower bound to SMT, where little is known about the fine-grained complexity, despite being a framework with a wide range of applications due to the availability of efficient SAT solvers. We provide the first known lower bound under the ETH and show that regardless of the background theory it is not possible to solve the resulting SMT in \(2^{o(n \log n)}\) time without violating the ETH. Importantly, this shows that existing algorithms for SMT running in \(2^{O(n \log n)}\) time are close to being optimal (cf. Rodeh & Strichman [32]). It should also be noted that we are able to prove this as a straightforward consequence of our general bounds for equality CSPs, indicating yet another advantage of studying fine-grained complexity in this setting. Third, we prove that for each constant \(c > 1\) there exists an NP-complete equality CSP which is solvable in \(O(c^n)\) time, and thus rule out the existence of an “easiest NP-complete equality CSP”. Such easiest problems are known to exist for finite-domain CSPs [22] so we see a clear dividing line between finite-domain and infinite-domain CSPs.
We also provide an algebraic explanation of the lack of such an “easiest CSP problem”, based on a connection between fine-grained complexity of CSPs and algebraic invariants called partial polymorphisms. Since partial polymorphisms recently have become an important tool for studying fine-grained complexity of CSPs and related problems [21,22,23, 26, 27] it therefore appears important to chart any differences to the finite-domain case. In short, an “easiest NP-complete CSP” would have a maximally large set of partial polymorphisms, and we prove that such a maximal set cannot exist for the set of all equality relations. The proof also generalises to other classes of languages, e.g., temporal languages, and can, interestingly, be proven independently of any complexity-theoretical assumptions.
In light of these lower bounds, what is the best possible exponential-time algorithm for equality CSPs that we could hope for? We tackle this question in Sect. 4 and construct an \(O^{*}(c^n)\) time algorithm for CSP\((\varGamma )\) whenever \(\varGamma \) is a finite equality language, where c is a constant depending only on the arities of relations in \(\varGamma \). Note that while the constant c likely can be improved, we have already established (under the randomised ETH) that it is not possible to find a uniform value. Similarly, it appears difficult to extend the algorithm to nontrivial classes of infinite equality languages since we have already proved that there is an infinite equality language that cannot be solved in \(2^{o(n \log n)}\) time under the ETH. Here, it is also interesting to note that certain classes of infinite-domain CSPs do not admit an \(O(c^n)\) algorithm even if the template is finite. For instance, there is a finite temporal language whose CSP (under the randomised ETH) cannot be solved in \(2^{o(n \log n)}\) time [19]. Could it even be the case that finite equality CSPs are the only reasonable class of infinite-domain CSPs solvable in single-exponential time, and that every nontrivial structure results in CSPs of higher complexity? This, however, is not the case, and we do manage to construct an \(O^{*}(c^n)\) time algorithm applicable to a much richer and broader class of problems, namely CSPs over reducts of unary structures. More precisely, say that \(\varGamma \) is a unary structure (US) language if \(\varGamma \) is a first-order reduct of a structure \((A; U_1, \ldots , U_k)\) where each \(U_i\) is unary. Such CSPs are a subclass of first-order definable structures with atoms and have attracted recent attention from the automata theory community [11, 12, 24]. They have also been studied by the CSP community and the complexity has been fully classified [5, 10].
The algorithm works by partitioning the domains of the unary relations \(U_1, \ldots , U_k\) in such a way that we create a finite “pseudo-universe” \({\mathbf {U}}\) where each element gives an implicit description of a finite equality language. This makes it possible to enumerate the elements of \({\mathbf {U}}\) and in each iteration test the satisfiability of the corresponding equality CSP instance.
These results paint a peculiar picture of the fine-grained complexity of equality CSPs (and all classes of infinite-domain CSPs over first-order reducts of relational structures). On the one hand, equality CSPs are incredibly hard to solve (no uniform \(O(c^n)\) time algorithm for finite languages under the randomised ETH, and no \(2^{o(n \log n)}\) time algorithm for infinite languages), but on the other hand, for any \(c > 1\), say \(c = 1.00001\), one can find an NP-hard equality CSP solvable in \(O(c^n)\) time. These conflicting messages indicate that a complete understanding of fine-grained complexity of equality CSPs is well out of reach, but we have simultaneously unravelled several interesting research directions. We discuss some of these in Sect. 5.
Preliminaries
A relational structure is a tuple \((A; \sigma , I)\) where A is a set typically called a domain, or a universe, \(\sigma \) is a relational signature, and I is a function from \(\sigma \) to the set of all relations over A which assigns each relation symbol a corresponding relation over A. For simplicity, we will typically write a relational structure as \((A; R_1, \ldots , R_k)\) where each \(R_i\) is a relation over A, and will not make a sharp distinction between relations and their corresponding signatures. A set of relations \(\varGamma \) over A is a first-order reduct of a relational structure \(\mathbf{A}=(A; R_1, \ldots , R_k)\) if each \(R \in \varGamma \) is the set of models of a \(\sigma \)-formula (with equality) interpreted in \((A; R_1, \ldots , R_k)\). Alternatively, one may view \(\varGamma \) as a set of relations where each relation has a first-order definition (without parameters) in \(\mathbf{A}\). In symbols, we write \(R(x_1, \ldots , x_n) \equiv \varphi (x_1, \ldots , x_n)\) if R is the set of models of the first-order formula \(\varphi (x_1, \ldots , x_n)\) with respect to the free variables \(x_1, \ldots , x_n\).
The Constraint Satisfaction Problem
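Let \(\varGamma \) be a set of finitary relations over some set A of values, occasionally called a constraint language. The constraint satisfaction problem over \(\varGamma \) (CSP\((\varGamma )\)) is defined as follows.
Instance: A set V of variables and a set C of constraints of the form \(R(v_1, \ldots , v_k)\), where \(R \in \varGamma \), k is the arity of R, and \(v_1, \ldots , v_k \in V\).
Question: Is there a function \(f :V \rightarrow A\) such that \((f(v_1), \ldots , f(v_k)) \in R\) for every constraint \(R(v_1, \ldots , v_k) \in C\)?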
Concerning representation, we take a simple approach and only consider the case when \(\varGamma \) is a first-order reduct of a relational structure, and represent each relation \(R \in \varGamma \) by a first-order formula. However, the exact representation is only important if \(\varGamma \) is infinite, since any reasonable representation can be chosen and precomputed if \(\varGamma \) is finite.
Primitive Positive Definitions and Interpretations
Let \(\varGamma \) be a constraint language over a domain A. A k-ary relation R is said to have a primitive positive definition (pp-definition) over \(\varGamma \) if
\[R(x_1, \ldots , x_k) \equiv \exists y_1, \ldots , y_{k'} \,:\, R_1(\mathbf {x_1}) \wedge \ldots \wedge R_m(\mathbf {x_m})\]
where each \(R_i \in \varGamma \cup \{\mathrm {Eq}_A\}\) and each \(\mathbf {x_i}\) is a tuple of variables over \(x_1,\ldots , x_{k}\), \(y_1, \ldots , y_{k'}\) matching the arity of \(R_i\). Here, and in the sequel, \(\mathrm {Eq}_A\) is the equality relation \(\{(a,a) \mid a \in A\}\) over A. Thus, R is definable by a first-order formula consisting only of existential quantification and conjunction over positive atoms from \(\varGamma \) and equality constraints. If \(\varGamma \) is a constraint language we let \(\langle \varGamma \rangle \) be the smallest set of relations containing \(\varGamma \) closed under pp-definitions. Pp-definitions are typically only useful for comparing similar languages over the same domain, but can be generalised as follows.
Definition 1
Let A and B be two domains and let \(\varGamma \) and \(\varDelta \) be two constraint languages over A and B, respectively. A primitive positive interpretation (pp-interpretation) of \(\varDelta \) over \(\varGamma \) consists of:

1.
a d-ary relation \(F \subseteq A^{d}\),

2.
and a surjective function \(f :F \rightarrow B\)
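such that \(F, f^{-1}(\mathrm {Eq}_B) \in \langle \varGamma \rangle \) and \(f^{-1}(R) \in \langle \varGamma \rangle \) for every k-ary \(R \in \varDelta \), where \(f^{-1}(R)\) denotes the \((k \cdot d)\)-ary relation
\[\{(x_{1,1}, \ldots , x_{1,d}, \ldots , x_{k,1}, \ldots , x_{k,d}) \in A^{k \cdot d} \mid (f(x_{1,1}, \ldots , x_{1,d}), \ldots , f(x_{k,1}, \ldots , x_{k,d})) \in R\}.\]
Hence, pp-interpretations are generalisations of pp-definitions, and can be used to obtain polynomial-time reductions between CSPs.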
Theorem 1
(cf. Theorem 3.1.4 in Bodirsky [4]) If \(\varGamma \) and \(\varDelta \) are finite constraint languages and there exists a pp-interpretation of \(\varDelta \) over \(\varGamma \), then CSP\((\varDelta )\) is polynomial-time reducible to CSP\((\varGamma )\).
We invite the reader to verify that a standard reduction from the 3-coloring problem (formulable as a CSP over the inequality relation over a ternary domain) to 3-SAT (formulable as a Boolean CSP) can be expressed as a pp-interpretation of the 3-coloring relation over 3-SAT.
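To make the notion of a pp-definition from this section concrete, the following sketch evaluates one by brute force. It is only an illustration under simplifying assumptions: it works over a small finite domain (whereas the languages studied here are mostly over infinite domains), and `pp_define`, `neq`, and the variable names are hypothetical helpers rather than notation from the article.

```python
from itertools import product

def pp_define(domain, atoms, free, bound):
    """Evaluate  exists bound . atom_1 and ... and atom_m  over a finite domain.

    atoms: list of (relation, variable_names); a relation is a set of tuples.
    Returns the set of tuples over the free variables that can be extended
    to the bound variables so that every atom holds.
    """
    result = set()
    for values in product(domain, repeat=len(free) + len(bound)):
        env = dict(zip(free + bound, values))
        if all(tuple(env[v] for v in names) in rel for rel, names in atoms):
            result.add(tuple(env[v] for v in free))
    return result

def neq(domain):
    """The inequality relation over a finite domain."""
    return {(a, b) for a in domain for b in domain if a != b}

def via_common_neighbour(domain):
    # R(x, y) := exists z . x != z and z != y
    n = neq(domain)
    return pp_define(domain, [(n, ("x", "z")), (n, ("z", "y"))], ["x", "y"], ["z"])

print(sorted(via_common_neighbour([0, 1])))      # over two values, R collapses to equality
print(len(via_common_neighbour([0, 1, 2])))      # over three values, R is the full binary relation
```

The same formula thus defines different relations depending on the domain, which is one reason pp-definitions are mainly used to compare languages over a common domain.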
Equality Languages
We say that \(\varGamma \) is an equality language if each \(R \in \varGamma \) admits a first-order definition over a relational structure \((A;\emptyset )\), i.e. the empty structure. Recall here that the equality relation is always accessible in first-order logic. Without loss of generality we henceforth assume that \(A = {\mathbb {N}}\), write \(\mathrm {Eq}\) (or \(=\) in infix notation) for the equality relation over \({\mathbb {N}}\), and \(\mathrm {R_{\scriptscriptstyle \ne }}\) or \(\ne \) (in infix notation) for the inequality relation \(\{(x,y) \in {\mathbb {N}}^2 \mid x \ne y\}\) over \({\mathbb {N}}\). Via these conventions an equality language can also be viewed as a set of first-order definable relations over \(({\mathbb {N}}; =)\), and we typically prefer this notation over \((A; \emptyset )\) since the intended meaning is clearer. The computational problem we consider is then CSP\((\varGamma )\) when \(\varGamma \) is an equality language. This problem is easily seen to belong to NP for any finite language, and its classical complexity has been completely classified [8].
Theorem 2
Let \(\varGamma \) be an equality language. Then either

1.
CSP\((\varGamma )\) is polynomial-time solvable or

2.
there exists a finite \(\varDelta \subseteq \varGamma \) such that CSP\((\varDelta )\) is NP-complete since \(\varDelta \) pp-interprets every finite-domain relation.
Example 1
Let \(S = \{(x,x,y), (x,y,y) \mid x, y \in {\mathbb {N}}, x \ne y\}\), and observe that \(S(x,y,z) \equiv (x = y \wedge y \ne z) \vee (x \ne y \wedge y = z)\). Thus, \(\{S\}\) is an equality language, and it is known that \(\{S\}\) pp-interprets a language \(\varDelta \) where CSP\((\varDelta )\) is NP-hard, which implies that CSP\((\{S\})\) is NP-hard, too. For tractability, if we take \(\{\mathrm {Eq}, \mathrm {R_{\scriptscriptstyle \ne }}\}\) then CSP\((\{\mathrm {Eq}, \mathrm {R_{\scriptscriptstyle \ne }}\})\) is well-known to be polynomial-time solvable. This can be proven via Theorem 2; however, CSP\((\{\mathrm {Eq}, \mathrm {R_{\scriptscriptstyle \ne }}\})\) can also be solved by elementary propagation methods.
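One possible reading of the "elementary propagation methods" mentioned in Example 1 is the following union-find sketch: merge variables connected by equality constraints, then check that no disequality constraint relates two variables of the same class. The function name and input format are assumptions made for illustration, and other propagation schemes would work equally well.

```python
def solve_eq_neq(variables, equalities, disequalities):
    """Solve an instance of CSP({Eq, !=}); return an assignment into N or None."""
    parent = {v: v for v in variables}

    def find(v):
        # Path halving keeps the union-find trees shallow.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Propagate equalities by merging classes.
    for x, y in equalities:
        parent[find(x)] = find(y)

    # A disequality inside a single class is a contradiction.
    for x, y in disequalities:
        if find(x) == find(y):
            return None

    # Distinct classes receive distinct natural numbers, which satisfies
    # every disequality constraint at once.
    value = {}
    return {v: value.setdefault(find(v), len(value)) for v in variables}

print(solve_eq_neq(["x", "y", "z"], [("x", "y")], [("y", "z")]))
print(solve_eq_neq(["x", "y", "z"], [("x", "y"), ("y", "z")], [("x", "z")]))
```

Assigning a fresh value to every class is only possible because the domain is infinite; over a fixed finite domain the same instance may become unsatisfiable.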
Fine-Grained Complexity and the Exponential-Time Hypothesis
Assume that CSP\((\varGamma )\) is NP-complete. How fast can we solve CSP\((\varGamma )\), and is it possible to prove stronger lower bounds than an expected superpolynomial running time (under \(\mathrm {P}\ne \mathrm {NP}\))? Such questions, especially when the complexity parameter is the number of variables \(|V|\) or the number of constraints \(|C|\), fall under the umbrella of fine-grained complexity. To prove nontrivial lower bounds for NP-complete problems we typically need stronger assumptions than \(\mathrm {P}\ne \mathrm {NP}\). Say that CSP\((\varGamma )\) is solvable in subexponential time if CSP\((\varGamma )\) is solvable in \(O(2^{\varepsilon |V|})\) time for each \(\varepsilon > 0\). The conjecture that 3-SAT is not solvable in subexponential time is called the exponential-time hypothesis (ETH). There exist several stronger variants of the ETH. First, an algorithm A is said to be a \(2^{c \cdot |V|}\)-randomised algorithm if its running time is bounded by \(2^{c \cdot |V|} \cdot \mathrm {poly}(\Vert I\Vert )\) and its error probability is at most 1/3 (\(\Vert I\Vert \) is the number of bits required to represent a CSP instance I). For \(k,d \ge 1\) we then define
\[c_{d,k} = \inf \{c \mid \text{there exists a } 2^{c \cdot |V|}\text{-randomised algorithm for CSP}(\varGamma _{d,k})\}\]
and
\[c_{k} = \inf \{c \mid \text{there exists a } 2^{c \cdot |V|}\text{-randomised algorithm for } k\text{-SAT}\}\]
where \(\varGamma _{d,k}\) is the set of all relations over the set \(\{0, \ldots , d - 1\}\) of arity at most k. The randomised exponential-time hypothesis (rETH) is then the conjecture that \(c_{2,3} > 0\), i.e., that 3-SAT is not solvable in subexponential time even with randomised algorithms, and the strong exponential-time hypothesis (SETH) is the conjecture that the limit of the sequence \(c_3, c_4, \ldots \) is equal to 1.
Lower Bounds on the Complexity of Equality Constraints
In this section we investigate lower bounds for equality CSPs. As remarked in Sect. 1, such lower bounds are valuable since if it is possible to prove that, for an equality language \(\varGamma \), CSP\((\varGamma )\) is not solvable in \(O(f(|V|))\) time (for some function f) then, for an arbitrary relational structure \(\mathbf{A}\), there exists a \(\varDelta \) such that CSP\((\varDelta )\) is not solvable in \(O(f(|V|))\) time and \(\varDelta \) is a first-order reduct of \(\mathbf{A}\). Let us recapitulate two known lower bounds.
Theorem 3
Let \(\varGamma \) be an equality language.

1.
If CSP\((\varGamma )\) is NP-hard then it is not solvable in subexponential time unless the ETH is false (Theorem 3.1 in [22]), and

2.
If \(\varGamma \) is the full first-order reduct of \(({\mathbb {N}}; =)\) then CSP\((\varGamma )\) is not solvable in \(O(c^{|V|})\) time for any \(c > 1\) unless the SETH is false (Theorem 19 in [18]).
Finite Versus Infinite Equality Languages
We begin by proving that for every \(c > 1\) there exists a finite equality language \(\varGamma _c\) such that CSP\((\varGamma _c)\) is not solvable in \(O(2^{c|V|})\) time without contradicting the rETH. This result is a substantial strengthening of Theorem 3(2). We first require the following result [39, Theorem 1].
Theorem 4
If the rETH holds, then there exists a universal constant \(\alpha > 0\) such that \(\alpha \cdot \log (d) \le c_{d,2}\) for all \(d \ge 3\).
Theorem 5
For every \(c>1\), there exists a finite equality language \(\varGamma _c\) such that CSP\((\varGamma _c)\) cannot be solved in \(O(2^{c \cdot |V|})\) (randomised) time unless the rETH is false.
Proof
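For \(1 \le a,b \le d\) define
\[R_{d,a,b}(c_1,\dots ,c_d,x,y) \equiv \Big (\bigvee _{i=1}^{d} x = c_i\Big ) \wedge \Big (\bigvee _{i=1}^{d} y = c_i\Big ) \wedge (x \ne c_a \vee y \ne c_b).\]
For arbitrary d then define the finite equality language
\[\varTheta _d = \{\ne \} \cup \{R_{d,a,b} \mid 1 \le a,b \le d\}.\]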
We present a polynomial-time reduction from CSP\((\varGamma _{d,2})\) to CSP\((\varTheta _d)\) only introducing a constant number of fresh variables. Let (V, C) be an instance of CSP\((\varGamma _{d,2})\). Introduce d fresh variables \(c_1,\dots ,c_d\) together with constraints \(\{c_i \ne c_j \mid 1 \le i < j \le d\}\). For each \(R(x,y) \in C\), add the constraints \(R_{d,a,b}(c_1,\dots ,c_d,x,y)\) for every \(1 \le a,b \le d\) such that \((a,b) \not \in R\). The resulting instance \((V \cup \{c_1,\dots ,c_d\},C')\) can be constructed in polynomial time, and is clearly satisfiable if and only if (V, C) is satisfiable. Furthermore, d is fixed so only a constant number of variables are introduced. By Theorem 4, CSP\((\varTheta _d)\) cannot be solved in \(2^{(c_{d,2} - \epsilon ) \cdot |V|}\) time for any \(\epsilon > 0\) unless the rETH is false, and the result follows by choosing d such that \(c_{d,2} \ge c\). We know that \(\alpha \cdot \log (d) \le c_{d,2}\) so it is sufficient to choose a d such that \(\alpha \cdot \log (d) \ge c\), e.g. \(d=2^{\lceil \frac{c}{\alpha } \rceil }\). \(\square \)
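The reduction in the proof can be sketched in a few lines. The sketch rests on an assumption about the relations involved: a constraint \(R_{d,a,b}(c_1,\dots ,c_d,x,y)\) is taken to mean that x and y each equal one of \(c_1,\dots ,c_d\) and that not both \(x = c_a\) and \(y = c_b\) hold. The tuple encoding of constraints, the names `c1`, ..., `cd`, and the brute-force checker are hypothetical and only serve to test the construction on tiny instances.

```python
from itertools import combinations, product

def reduce_to_equality_csp(variables, constraints, d):
    """constraints: list of (R, x, y), with R a set of allowed pairs over {1, ..., d}."""
    consts = [f"c{i}" for i in range(1, d + 1)]            # the d fresh variables
    new_constraints = [("neq", u, v) for u, v in combinations(consts, 2)]
    for R, x, y in constraints:
        for a, b in product(range(1, d + 1), repeat=2):
            if (a, b) not in R:                            # forbid x = c_a and y = c_b
                new_constraints.append(("forbid", a, b, x, y))
    return list(variables) + consts, new_constraints, consts

def holds(constraint, s, consts):
    if constraint[0] == "neq":
        _, u, v = constraint
        return s[u] != s[v]
    _, a, b, x, y = constraint
    palette = [s[c] for c in consts]
    in_palette = s[x] in palette and s[y] in palette
    return in_palette and not (s[x] == palette[a - 1] and s[y] == palette[b - 1])

def satisfiable(variables, constraints, consts):
    # With n variables, n distinct values suffice: only the equality pattern matters.
    n = len(variables)
    for values in product(range(n), repeat=n):
        s = dict(zip(variables, values))
        if all(holds(c, s, consts) for c in constraints):
            return True
    return False

# 2-colouring as a CSP over {1, 2}: a single edge is colourable, a triangle is not.
neq2 = {(1, 2), (2, 1)}
edge = reduce_to_equality_csp(["x", "y"], [(neq2, "x", "y")], 2)
triangle = reduce_to_equality_csp(
    ["x", "y", "z"], [(neq2, "x", "y"), (neq2, "y", "z"), (neq2, "x", "z")], 2)
print(satisfiable(*edge), satisfiable(*triangle))
```

Note that the number of variables grows only by the constant d, which is what allows the lower bound on \(c_{d,2}\) to transfer.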
Thus, assuming the rETH, there cannot exist an algorithm solving CSP\((\varGamma )\) in \(O(c^{|V|})\) time for every finite equality language \(\varGamma \). This can be strengthened even further for infinite equality languages, and we will show the existence of \(\varGamma \) such that CSP\((\varGamma )\) is not solvable in \(O(2^{o(|V| \log |V|)})\) time without contradicting the ETH. In contrast, the second statement of Theorem 3 is only valid under the much stronger SETH, and only if \(\varGamma \) consists of all first-order definable relations over \(({\mathbb {N}};=)\). For this lower bound we provide a reduction from the \(k \times k\) Independent Set problem: given a graph G over the vertex set \(\{1,\dots ,k\} \times \{1,\dots ,k\}\) (where k is part of the input), is there an independent set of size k in G with exactly one element from each row? One may view this problem as a variant of the standard Independent Set problem where the vertices are the elements of a \(k \times k\) table and one wants to find an independent set that contains exactly one element from each row. The following lower bound is known under the ETH [29].
Theorem 6
\(k \times k\) Independent Set is not solvable in \(2^{o(k \log k)}\) time unless the ETH is false.
For \(n \ge 1\) define \(R_n(y,x_1,\dots ,x_n) \equiv y=x_1 \vee y=x_2 \vee \dots \vee y=x_n\), and let \(R(x, y, z, w) \equiv x \ne y \vee z \ne w\). Let \(\varGamma _{\mathrm {inf}}\) be the infinite equality language \(\{\ne , R, R_1, R_2, \ldots \}\).
Theorem 7
CSP\((\varGamma _{\mathrm {inf}})\) cannot be solved in \(2^{o(|V| \log |V|)}\) time unless the ETH is false.
Proof
To prove the result, we present a polynomial-time reduction from \(k \times k\) Independent Set to CSP\((\varGamma _{\mathrm {inf}})\) such that the resulting CSP\((\varGamma _{\mathrm {inf}})\) instance only contains 2k variables. Let \(G=(V,E)\) denote an arbitrary graph where \(V=\{1,\dots ,k\} \times \{1,\dots ,k\}\). We then begin by introducing k variables \(a_1,\dots ,a_k\) together with the constraints \(a_i \ne a_j\), \(1 \le i < j \le k\). Second, for each row \(1 \le i \le k\) in G, introduce a variable \(x_i\) and the constraint \(R_k(x_i,a_1,\dots ,a_k)\). This constraint ensures that \(x_i\) equals one of the variables \(a_1,\dots ,a_k\). Third, for each edge \(e=((a,b),(c,d)) \in E\), introduce the constraint \(R(x_a,a_b,x_c,a_d)\). This constraint guarantees that both endpoints of an edge are not put into the independent set simultaneously. The resulting instance is satisfiable if and only if G is a yes-instance, so an algorithm running in \(2^{o(|V| \log |V|)}\) time would solve \(k \times k\) Independent Set in \(2^{o(k \log k)}\) time, contradicting Theorem 6. \(\square \)
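The three steps of this reduction can be sketched as follows. The tuple encoding of constraints and all function names are hypothetical; the small solver, which enumerates one representative per equality pattern of the variables, is included only so that the construction can be checked on toy graphs and is not the algorithm of any cited work.

```python
from itertools import combinations

def reduce_independent_set(k, edges):
    """edges: pairs ((a, b), (c, d)) of vertices (row, column), entries in 1..k."""
    a = [f"a{j}" for j in range(1, k + 1)]
    x = [f"x{i}" for i in range(1, k + 1)]
    cons = [("neq", u, v) for u, v in combinations(a, 2)]       # a_i != a_j
    cons += [("one_of", xi, tuple(a)) for xi in x]              # R_k(x_i, a_1, ..., a_k)
    cons += [("R", x[r1 - 1], a[c1 - 1], x[r2 - 1], a[c2 - 1])  # R(x_a, a_b, x_c, a_d)
             for (r1, c1), (r2, c2) in edges]
    return a + x, cons

def holds(c, s):
    if c[0] == "neq":
        return s[c[1]] != s[c[2]]
    if c[0] == "one_of":
        return any(s[c[1]] == s[v] for v in c[2])
    _, p, q, r, t = c
    return s[p] != s[q] or s[r] != s[t]

def equality_patterns(n):
    """Yield restricted growth strings: one assignment per partition of n variables."""
    def extend(prefix, blocks):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for b in range(blocks + 1):
            prefix.append(b)
            yield from extend(prefix, max(blocks, b + 1))
            prefix.pop()
    yield from extend([], 0)

def satisfiable(variables, cons):
    return any(all(holds(c, dict(zip(variables, p))) for c in cons)
               for p in equality_patterns(len(variables)))

# A 2 x 2 grid in which only the pair (1,2), (2,2) is non-adjacent across rows.
some_edges = [((1, 1), (2, 1)), ((1, 1), (2, 2)), ((1, 2), (2, 1))]
all_edges = some_edges + [((1, 2), (2, 2))]
print(satisfiable(*reduce_independent_set(2, some_edges)))
print(satisfiable(*reduce_independent_set(2, all_edges)))
```

The instance has exactly 2k variables regardless of the number of edges, which is the property the lower bound relies on.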
Hence, we cannot even hope to solve CSP\((\varGamma )\) in \(O(c^{|V|})\) time for any c when \(\varGamma \) is allowed to be infinite. Furthermore, since an equality CSP is always solvable in \(2^{O(|V|\log |V|)}\) time [18], the bound in Theorem 7 is asymptotically tight.
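One way to see why an upper bound of this shape is plausible: a solution to an equality CSP is determined, up to renaming of values, by the partition of the variables into classes of equal value, and the number of partitions of n variables is the Bell number \(B_n \le n^n = 2^{n \log n}\). The sketch below computes Bell numbers and compares them with \((0.792n/\ln (n+1))^n\), the expression appearing in the running time quoted in the introduction. That this expression is meant as a bound on the number of partitions is an assumption of the sketch, not a statement taken from [18].

```python
from math import log

def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}], computed with the Bell triangle."""
    bells, row = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]          # a row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

def quoted_bound(n):
    # (0.792 n / ln(n + 1))^n, i.e. 2^{n log(0.792 n / ln(n + 1))}
    return (0.792 * n / log(n + 1)) ** n

bells = bell_numbers(30)
for n in (5, 10, 20, 30):
    print(n, bells[n], f"{quoted_bound(n):.3e}", f"{float(n) ** n:.3e}")
```

Both quantities grow as \(2^{\Theta (n \log n)}\), so enumerating equality patterns cannot by itself give a single-exponential algorithm.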
Thus, the distinction between finite and infinite languages seems to be rather important in the context of equality CSPs, but if one considers slightly richer structures than \(({\mathbb {N}}; =)\) then significantly stronger bounds can be obtained also for finite languages. Let \(\prec \; \subseteq D^2\) denote a binary relation over a set D and let \(\succ \) denote its converse where \(x \succ y\) holds if and only if \(y \prec x\) holds. We say that \(\prec \) is an acyclic order if there does not exist any finite subset \(\{d_1,\dots ,d_k\} \subseteq D\) such that \(d_1 \prec d_2 \prec \dots \prec d_{k-1} \prec d_k \prec d_1\). Acyclic orders are irreflexive (i.e. they do not contain any element d such that \(d \prec d\)) by definition. We say that \(\prec \) is a strict partial order if it is irreflexive and for arbitrary \(d,d',d'' \in D\): \(d \prec d'\) and \(d' \prec d''\) imply \(d \prec d''\) (transitivity). Note that these two properties also ensure that \(\prec \) is antisymmetric, i.e. if \(d \prec d'\), then \(d' \prec d\) does not hold. We say that \(\prec \) is a strict total order if \(\prec \) is a strict partial order and it is a connex relation, i.e. for arbitrary distinct \(d,d' \in D\), either \(d \prec d'\) or \(d' \prec d\) holds. Finally, we say that \(\prec \) contains unbounded total orders if for every \(k \in {\mathbb N}\), there exists a subset \(L \subseteq D\) such that \(|L| \ge k\) and \(\prec \) is a strict total order on L.
Example 2
The less-than relation < over \({\mathbb {Q}}\) is an acyclic order containing unbounded total orders. For the latter property, simply observe that < is a strict total order on \(\{1, \ldots , k\}\). However, there exists a wealth of examples of acyclic orders containing unbounded total orders in the artificial intelligence literature, especially in combination with qualitative reasoning problems, e.g., temporal and spatial reasoning problems such as Allen’s interval algebra and the region connection calculus. For many additional examples of this kind, see e.g. the survey by Dylla et al. [15].
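The order notions defined before Example 2 can be checked mechanically on finite relations, which also shows that an acyclic order need not be transitive. The following sketch is an illustration only; the function names are hypothetical, and a relation is represented as a set of pairs over an explicitly given finite set.

```python
def is_irreflexive(rel):
    return all(a != b for a, b in rel)

def is_transitive(rel):
    return all((a, d) in rel for a, b in rel for c, d in rel if b == c)

def is_acyclic(elements, rel):
    """True if there is no cycle d_1 < d_2 < ... < d_k < d_1 (Kahn's algorithm)."""
    indegree = {e: 0 for e in elements}
    for _, b in rel:
        indegree[b] += 1
    ready = [e for e in elements if indegree[e] == 0]
    removed = 0
    while ready:
        e = ready.pop()
        removed += 1
        for a, b in rel:
            if a == e:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return removed == len(elements)      # leftover elements lie on or behind a cycle

def is_strict_total_order_on(subset, rel):
    sub = {(a, b) for a, b in rel if a in subset and b in subset}
    connex = all((a, b) in sub or (b, a) in sub
                 for a in subset for b in subset if a != b)
    return is_irreflexive(sub) and is_transitive(sub) and connex

elements = range(4)
less_than = {(a, b) for a in elements for b in elements if a < b}
successor = {(0, 1), (1, 2), (2, 3)}     # acyclic, but not transitive
cycle = {(0, 1), (1, 2), (2, 0)}

print(is_acyclic(elements, less_than), is_strict_total_order_on({0, 1, 2, 3}, less_than))
print(is_acyclic(elements, successor), is_strict_total_order_on({0, 1, 2}, successor))
print(is_acyclic(elements, cycle))
```

In this finite example the successor relation is total only on sets of at most two elements, whereas < over \({\mathbb {Q}}\) is total on arbitrarily large sets.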
The following result is a significant strengthening of Theorem 11 in [19].
Theorem 8
Let \(\prec \, \subseteq D^2\) be an acyclic order that contains unbounded total orders. Then, there exists a constraint language \(\varGamma \) such that

1.
\(\varGamma \) is finite,

2.
\(\varGamma \) is first-order definable in \((D; \prec )\) (even with quantifier-free definitions), and

3.
CSP\((\varGamma )\) is not solvable in \(2^{o(n \log n)}\) time unless the ETH is false.
Proof
Let \(\varGamma =\{\prec ,R,S\}\) where

\(R(x,y) \equiv x \prec y \vee y \prec x\) and

\(S(x,a,b,y,c,d) \equiv x \prec a \vee b \prec x \vee y \prec c \vee d \prec y\).
Clearly, \(\varGamma \) is finite and (quantifier-free) first-order definable in \((D; \prec )\). Assume that CSP\((\varGamma )\) can be solved in \(2^{o(n \log n)}\) time. We show how to polynomial-time reduce \(k \times k\) Independent Set to CSP\((\varGamma )\) in a way such that only O(k) variables are used. Hence, \(k \times k\) Independent Set can be solved in \(2^{o(k \log k)}\) time and this contradicts the ETH via Theorem 6.
Let \(G=(V,E)\) be an arbitrary instance of \(k \times k\) Independent Set. Introduce \(2k+1\) fresh variables \(y_1,\dots ,y_k,t_1,\dots ,t_{k+1}\). The idea behind these variables is that \(y_i\), \(1 \le i \le k\), points out the vertex in row i that is to be included in the independent set. This is done with the aid of variables \(t_1,\dots ,t_{k+1}\). Informally speaking, if \(y_i\) “lies between” \(t_j\) and \(t_{j+1}\), then we will put the j-th vertex on row i into the independent set. Now, let \(V_1=\{y_1,\dots ,y_k,t_1,\dots ,t_{k+1}\}\) and

\(C_1 = \{t_i \prec t_j \mid 1 \le i < j \le k+1\}.\)
Since \(\prec \) is an acyclic order that contains unbounded total orders, we know that \(I_1=(V_1,C_1)\) is satisfiable.
In every solution s to \(I_1\), it holds that \(s(t_i) \prec s(t_j)\) when \(1 \le i < j \le k+1\). Constrain each \(y_i\), \(1 \le i \le k\), as follows:

\(t_1 \prec y_i\),

\(R(y_i,t_j)\) for \(2 \le j \le k\), and

\(y_i \prec t_{k+1}\).
Let \(C_2\) denote the corresponding set of constraints and let \(I_2=(V_1, C_1 \cup C_2)\). It is easy to verify that in every solution s to \(I_2\) and for each \(1 \le j \le k\), the variable \(y_j\) satisfies \(s(t_i) \prec s(y_j) \prec s(t_{i+1})\) for exactly one \(1 \le i \le k\). We will interpret this as “the i-th vertex on row j is chosen for inclusion in the independent set”. Note that a solution always exists to \(I_2\) since \(\prec \) is an acyclic order that contains unbounded total orders.
For each edge \(\{(x_a,x_b),(x_c,x_d)\}\) in E, we introduce the following constraint:

\(S(y_a,t_b,t_{b+1},y_c,t_d,t_{d+1}).\)
With the given interpretations of \(y_a,t_b,y_c,t_d\), this constraint implies that we cannot simultaneously choose vertex b on row a and vertex d on row c for inclusion in the independent set. Let \(C_3\) denote the resulting set of constraints and let \(I_3=(V_1, C_1 \cup C_2 \cup C_3)\). Given the explanations above, it is easy to verify that \(I_3\) is satisfiable if and only if G is a yes-instance. We conclude the proof by noting that \(I_3\) can be computed in polynomial time and contains O(k) variables. \(\square \)
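To make the reduction concrete, the following Python sketch builds the constraint set for the special case where \(\prec \) is < on the integers. It rests on two assumptions that are reconstructed from the surrounding proof text rather than quoted from it: that \(C_1\) orders all \(t_i\) pairwise, and that the constraint for an edge between vertex b on row a and vertex d on row c is \(S(y_a,t_b,t_{b+1},y_c,t_d,t_{d+1})\). The function names and the tuple encoding of variables are hypothetical.

```python
from itertools import product

def prec(a, b):                      # the order <, here on integers
    return a < b

def R(x, y):
    return prec(x, y) or prec(y, x)

def S(x, a, b, y, c, d):
    return prec(x, a) or prec(b, x) or prec(y, c) or prec(d, y)

def reduce_instance(k, edges):
    """Constraints as (relation, scope) pairs over variables ('y', i), ('t', j)."""
    y = lambda i: ("y", i)
    t = lambda j: ("t", j)
    # C_1 (assumed form): t_1, ..., t_{k+1} are pairwise ordered.
    cons = [(prec, (t(i), t(j)))
            for i in range(1, k + 2) for j in range(i + 1, k + 2)]
    # C_2: every y_i lies strictly inside one gap (t_j, t_{j+1}).
    for i in range(1, k + 1):
        cons.append((prec, (t(1), y(i))))
        cons += [(R, (y(i), t(j))) for j in range(2, k + 1)]
        cons.append((prec, (y(i), t(k + 1))))
    # C_3 (assumed form): one S-constraint per edge {(a, b), (c, d)}.
    for (a, b), (c, d) in edges:
        cons.append((S, (y(a), t(b), t(b + 1), y(c), t(d), t(d + 1))))
    return cons

def satisfies(assignment, cons):
    return all(rel(*(assignment[v] for v in scope)) for rel, scope in cons)

def canonical_assignment(k, choice):
    # choice[i] = column chosen on row i; t_j = 2j and y_i sits in the chosen gap.
    a = {("t", j): 2 * j for j in range(1, k + 2)}
    a.update({("y", i): 2 * choice[i] + 1 for i in range(1, k + 1)})
    return a

def is_independent(choice, edges):
    picked = set(choice.items())
    return not any(u in picked and v in picked for u, v in edges)

def all_choices(k):
    return [dict(zip(range(1, k + 1), cols))
            for cols in product(range(1, k + 1), repeat=k)]
```

For a small graph one can compare `satisfies(canonical_assignment(k, choice), cons)` with `is_independent(choice, edges)` over `all_choices(k)`; the instance uses exactly \(2k+1\) variables regardless of the number of edges.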
Satisfiability Modulo Theories
We will now consider a problem which is related to equality CSPs, for which we rather effortlessly can obtain lower bounds by reducing from CSP\((\varGamma _{\mathrm {inf}})\). Satisfiability modulo theories (SMT) is a decision problem for logical formulas with respect to a given background theory. The logical formulas are expressed in classical first-order logic with equality. However, it is quite common to not use the full power of this framework; for instance, a frequent restriction is to require that the formulas are quantifier-free (and we will use this fragment ourselves below). An accessible introduction to SMT can be found in the survey by Barrett et al. [1]. Let SMT\(({{\mathcal {T}}})\) be the problem of determining whether a first-order formula (with respect to a background theory \({{\mathcal {T}}}\)) is satisfiable, and let be the subproblem where universal quantifiers are not allowed. We can then readily prove a matching lower bound valid for any background theory \(\mathcal{T}\).
Theorem 9
cannot be solved in \(2^{o(|V| \log |V|)}\) time unless the ETH is false.
Proof
We present a polynomial-time reduction from CSP\((\varGamma _{\mathrm {inf}})\) which does not introduce any fresh variables. Let (V, C) be an instance of CSP\((\varGamma _{\mathrm {inf}})\), where \(V=\{x_1,\dots ,x_k\}\) and \(C=\{c_1,\dots ,c_p\}\). Define F to be the formula \(\exists x_1 \dots \exists x_k :F_1 \wedge \dots \wedge F_p\) where

\(F_i=(\lnot (x = y)) \; \mathrm{if} \; c_i=x \ne y,\)

\(F_i=(y=x_1 \vee y= x_2 \vee \ldots \vee y=x_n) \; \mathrm{if} \; c_i=R_n(y,x_1,\dots ,x_n), \; \mathrm{and}\)

\(F_i=(\lnot (x=y) \vee \lnot (z=w)) \; \mathrm{if} \; c_i=S(x,y,z,w).\)
It is obvious that F is true if and only if (V, C) has a solution, that F can easily be constructed in polynomial time, and that F contains as many variables as there are variables in V. The result then follows from Theorem 7. \(\square \)
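The translation in the proof is a direct constraint-by-constraint rewriting, which the following Python sketch illustrates. The tuple encodings of constraints and formulas are hypothetical conventions introduced only for this example, and the outermost existential quantifiers are left implicit, so that evaluating the formula under an assignment corresponds to checking a candidate solution.

```python
def translate(constraint):
    """Map one constraint of the three kinds used in the proof to a formula."""
    kind = constraint[0]
    if kind == "neq":                       # c_i = (x != y)
        _, x, y = constraint
        return ("not", ("eq", x, y))
    if kind == "R":                         # c_i = R_n(y, x_1, ..., x_n)
        _, y, xs = constraint
        return ("or",) + tuple(("eq", y, x) for x in xs)
    if kind == "S":                         # c_i = S(x, y, z, w)
        _, x, y, z, w = constraint
        return ("or", ("not", ("eq", x, y)), ("not", ("eq", z, w)))
    raise ValueError(kind)

def formula(constraints):
    """Conjunction F_1 and ... and F_p of the translated constraints."""
    return ("and",) + tuple(translate(c) for c in constraints)

def evaluate(f, assignment):
    op = f[0]
    if op == "eq":
        return assignment[f[1]] == assignment[f[2]]
    if op == "not":
        return not evaluate(f[1], assignment)
    if op == "or":
        return any(evaluate(g, assignment) for g in f[1:])
    if op == "and":
        return all(evaluate(g, assignment) for g in f[1:])
    raise ValueError(op)

def variables(f):
    """Variables occurring in a formula; no fresh ones are introduced."""
    if f[0] == "eq":
        return {f[1], f[2]}
    return set().union(*(variables(g) for g in f[1:]))

example = [("neq", "x", "y"), ("R", "z", ["x", "y"]), ("S", "x", "z", "y", "z")]
F = formula(example)
```

In this example `variables(F)` is exactly the variable set of the CSP instance, which is the property the lower bound transfer relies on.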
is often referred to as equality logic and this problem is important in, for instance, hardware verification [14]. In fact, a slightly more expressive logic known as the logic of equality with uninterpreted functions (EUF) is extensively used in hardware verification. There are several known algorithms that solve EUF in \(O(|V|!) = 2^{O(|V| \log |V|)}\) time; see, for instance, the discussion in Sect. 12 in the article by Rodeh & Strichman [32]. We conclude, with the aid of Theorem 9, that such algorithms are close to optimal.
To present another optimality result in SMT, we consider the well-known unit two variable per inequality (UTVPI) class of constraints, i.e., where \(\mathrm {UTVPI}\) for each integer b and coefficients \(c_1,c_2 \in \{-1,1\}\) contains the relation \(\{(x,y) \in {\mathbb Z}^2 \mid c_1 \cdot x + c_2 \cdot y \ge b\}\). The UTVPI class has many applications in, for instance, abstract interpretation, spatial databases, and theorem proving (cf. Schutt and Stuckey [37] and the references therein). It is known [38] that can be solved in \(2^{O(|V| \log d)}\) time where \(d=2|V|(b_{\max }+1)+1\) and \(b_{\max }\) is the maximum over the absolute values of constant terms in the constraints. Using Theorem 9 we can prove that this algorithm is close to optimal.
Theorem 10
cannot be solved in \(2^{o(|V| \log d)}\) time unless the ETH is false.
Proof
Assume there is an algorithm A that solves in \(2^{o(|V| \log d)}\) time. The formulas constructed in Theorem 9 are formulas (degenerate ones, though, since they do not contain UTVPI constraints). Thus, \(b_{\max }\) for this class X of formulas is 0 and \(d = 2|V|+1\), implying that A can solve restricted to X in \(2^{o(n \log n)}\) time, where \(n = |V|\), contradicting Theorem 9. \(\square \)
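The step from \(b_{\max } = 0\) to the \(2^{o(n \log n)}\) bound can be spelled out as follows, writing \(n = |V|\) and assuming the expression \(d = 2|V|(b_{\max }+1)+1\) for d stated before Theorem 10.

```latex
\[
  b_{\max} = 0
  \;\Longrightarrow\;
  d = 2n(b_{\max}+1)+1 = 2n+1,
  \qquad
  \log d = \log(2n+1) = \Theta(\log n),
\]
\[
  2^{o(n \log d)} \;=\; 2^{o(n \log (2n+1))} \;=\; 2^{o(n \log n)} .
\]
```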
Difference logic is an interesting fragment of where only constraints of the form \(x - y \ge b\) are allowed. Difference logic has found applications in, for example, verification of timed automata [30] and analysis of dynamic fault trees [40]. The lower bound in Theorem 10 naturally holds also in this restricted case.
No Easiest NP-Hard Infinite-Domain CSP
Our lower bounds suggest that equality CSPs are rather different from finite-domain CSPs when viewed under the lens of fine-grained complexity. In this section we prove yet another differentiating factor. For each finite A it is known that there exists a constraint language \(\varGamma _A\) with domain A such that CSP\((\varGamma _A)\) is NP-complete, and if an NP-complete CSP\((\varDelta )\)^{Footnote 1} over A is solvable in \(O(c^{|V|})\) time, then CSP\((\varGamma _A)\) is solvable in \(O(c^{|V|})\) time, too [22]. More generally, if \({{\mathcal {G}}}\) is a set of constraint languages over A, we say that CSP\((\varGamma )\) for some \(\varGamma \in {{\mathcal {G}}}\) is the easiest CSP in \({{\mathcal {G}}}\) if CSP\((\varGamma )\) is solvable in \(O^{*}(c^{|V|})\) time whenever CSP\((\varDelta )\) for \(\varDelta \in {{\mathcal {G}}}\) is solvable in \(O^{*}(c^{|V|})\) time.
Contrary to the finite-domain case we will prove that there does not exist an easiest NP-complete equality CSP, unless the ETH is false. In order to prove this, we show that for every \(c > 1\) there exists an equality language \(\varGamma _c\) such that CSP\((\varGamma _c)\) is NP-complete but solvable in \(O^*(c^{|V|})\) time. First, recall from Example 1 that the ternary relation \(S = \{(x,x,y), (x,y,y) \mid x, y \in {\mathbb {N}}, x \ne y\}\) has an NP-complete CSP. We will show how S can be extended with additional arguments in order to decrease the time complexity of the resulting CSP. If \({\mathbf {v}} = (v_1, \ldots , v_k)\) and \({\mathbf {w}} = (w_1, \ldots , w_k)\) are two k-ary tuples of variables, x is a variable, and R is a binary relation, then we write \(R(x, {\mathbf {v}})\) for \(\bigwedge _{1 \le i \le k} R(x, v_i)\), \(R({\mathbf {v}}, {\mathbf {w}})\) for \(\bigwedge _{1 \le i, j \le k} R(v_i, w_j)\), and \(R({\mathbf {v}})\) for \(\bigwedge _{1 \le i,j \le k, i \ne j} R(v_i, v_j)\).
For arbitrary \(k \ge 1\) now define
where \({\mathbf {v}} = (v_1, \ldots , v_k)\) and \({\mathbf {w}} = (w_1, \ldots , w_k)\) are two distinct k-ary tuples of variables.
The general idea behind the relation \(S^k\) is that we want to take an existing relation S yielding an NP-hard CSP and add a number of variables, depending on the given parameter k, so that these variables depend on the original variables from S but cannot be identified with each other. The latter point is important since it allows us to construct a branching algorithm with a sufficiently good branching factor.
It is straightforward to verify that the problem CSP\((\{S^k\})\) is NP-complete with the aid of Theorem 1 since \(S \in \langle \{S^k\} \rangle \). We will now prove that the fine-grained complexity of CSP\((\{S^k\})\) decreases with increasing k, in the following sense.
Theorem 11
Let \(c > 1\). Then there exists k such that CSP\((\{S^k\})\) is solvable in \(O^{*}(c^{|V|})\) time.
Proof
We will present an algorithm Y for CSP\((\{S^k\})\) which runs in \(O^{*}(2^{\frac{n}{k}})\) time. The claim then follows from choosing a sufficiently large \(k \ge \frac{1}{\log c}\). Thus, choose \(k \ge 1\) and let (V, C) be an instance of CSP\((\{S^k\})\), where \(|V| = n\). Say that a set of inequality constraints L is consistent if L, viewed as an instance of CSP\((\{\mathrm {R_{\scriptscriptstyle \ne }}\})\), is satisfiable, and inconsistent otherwise. The consistency of a set of inequality constraints can be determined in polynomial time since CSP\((\{\mathrm {R_{\scriptscriptstyle \ne }}\})\) is in P (from Example 1). Consider the algorithm Y in Fig. 1. The set L is used to keep track of inequality constraints induced by the constraints in the instance.
For correctness, the algorithm branches on a constraint \(S^k(x, y, z, {\mathbf {v}}, {\mathbf {w}}) \in C\), and either identifies x with y, or y with z; in the process, it identifies variables and introduces inequality constraints according to the definition of \(S^k\). Furthermore, the algorithm answers ‘yes’ if and only if, for each constraint \(S^k(x, y, z, {\mathbf {v}}, {\mathbf {w}}) \in C\), it is possible to identify x with y, or y with z, in a non-contradictory way, and answers ‘no’ otherwise. Concerning time complexity, note first that all variables in \({\mathbf {v}}\) and \({\mathbf {w}}\) are distinct once step (7) is reached. This follows from the tests undertaken in step (4) where we systematically verify that \(\{w^1, \ldots , w^k\}\) and \(\{v^1, \ldots , v^k\}\) are disjoint and that \(|\{w^1, \ldots , w^k\}| = |\{v^1, \ldots , v^k\}| = k\). Furthermore, if (7)(b) or (7)(c) is reached then \(|\{x, y, z\}| = 3\), as otherwise the current instance is unsatisfiable (\(|\{x, y, z\}| = 1\)) or no branching was required (\(|\{x, y, z\}| = 2\)). Thus, in each branch in step (7) we eliminate k variables via variable identification, which implies that the time complexity is bounded by the recurrence \(T(n) = 2T(n - k) + \mathrm {poly}(I)\). Thus, algorithm Y has total running time \(O^{*}(2^{\frac{n}{k}})\), and therefore it solves CSP\((\{S^k\})\) in \(O^{*}(c^n)\) time for a sufficiently large k. \(\square \)
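The relationship between c and k in the proof can be made tangible with a few lines of Python. This is only a sketch of the arithmetic: it takes the logarithm in \(k \ge \frac{1}{\log c}\) to be base 2, ignores the polynomial overhead, and assumes a base case of one leaf once fewer than k variables remain; the function names are hypothetical.

```python
import math

def smallest_k(c):
    """Smallest integer k >= 1 with 2**(1/k) <= c, i.e. k >= 1 / log2(c)."""
    if c <= 1:
        raise ValueError("c must exceed 1")
    return max(1, math.ceil(1 / math.log2(c)))

def leaves(n, k):
    """Leaves of the recursion T(n) = 2 T(n - k), with one leaf when n < k (assumed)."""
    return 1 if n < k else 2 * leaves(n - k, k)

# For c = 1.1 the branching algorithm needs k = 8, and the number of
# leaves on n = 80 variables stays below c**n.
k = smallest_k(1.1)
within_bound = leaves(80, k) <= 1.1 ** 80
```

The recursion tree has \(2^{\lfloor n/k \rfloor }\) leaves, so larger k directly lowers the base of the exponential.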
We immediately obtain the following corollary.
Corollary 1
Let \({{\mathcal {A}}} = (A; R_1, \ldots , R_k)\) be a relational structure over a countably infinite A. Assume that a first-order reduct \(\varGamma \) of \({{\mathcal {A}}}\) is NP-complete if and only if \(\varGamma \) pp-interprets 3-SAT. Let \({{\mathcal {G}}} = \{\varGamma \mid \varGamma \) is a first-order reduct of \({{\mathcal {A}}}\) and CSP\((\varGamma )\) is NP-complete\(\}\). If \({{\mathcal {G}}}\) has an easiest CSP, then the ETH is false.
Proof
For each \(c > 1\) there exists a constraint language \(\varGamma _c \in {{\mathcal {G}}}\) such that CSP\((\varGamma _c)\) is NP-complete and solvable in \(O^*(c^{|V|})\) time (Theorem 11). If \({{\mathcal {G}}}\) has an easiest NP-complete problem CSP\((\varGamma )\) then (1) CSP\((\varGamma )\) pp-interprets 3-SAT, and (2) CSP\((\varGamma )\) is solvable in \(O^{*}(c^{|V|})\) time for each \(c > 1\). Thus, CSP\((\varGamma )\) is solvable in subexponential time, but this violates the ETH by Theorem 3.1 in [22]. \(\square \)
Observe that the class of relational structures considered in Corollary 1 includes the NP-hard cases of the CSP dichotomy conjecture over finitely bounded homogeneous structures [2]. It is worth noting that after the Feder-Vardi conjecture on finite-domain CSPs was settled (independently) by Bulatov [13] and Zhuk [41], a large part of the complexity-oriented CSP work has concentrated on homogeneous infinite-domain CSPs.
Algebra and Fine-Grained Complexity of Equality CSPs
Our lower bounds suggest a large difference in fine-grained complexity between equality CSPs and finite-domain CSPs. In this section we take a different viewpoint and investigate this difference through the lens of universal algebra and partial clone theory, with the aim of achieving an algebraic explanation of the results obtained in the previous section. We will see a correspondence to the non-existence of certain relations known as weak bases. Via the results from Sect. 3.3 we are first able to give a straightforward proof conditional on the ETH (Proposition 1) which we then strengthen to an unconditional proof (Theorem 14) but which requires more elaborate arguments.
Algebraic Background
The basic setting on the functional side is to consider partial functions over a universe A. We view a partial function as a mapping of the form \(f :X \rightarrow A\) for a set \(X \subseteq A^k\) called the domain of f, and denoted by \(\mathrm {domain}(f) = X\). Then a partial function f of arity k is said to be a partial polymorphism of an n-ary relation R over A if \(f(t_1, \ldots , t_k) \in R\) for each sequence of tuples \(t_1, \ldots , t_k \in R\) such that \((t_1[i], \ldots , t_k[i]) \in \mathrm {domain}(f)\) for each \(1 \le i \le n\). If f is total, i.e., f is always defined, then f is simply called a polymorphism. If we let \(\mathrm {pPol}(R)\) be the set of all partial polymorphisms of a relation R, and \(\mathrm {pPol}(\varGamma ) = \bigcap _{R \in \varGamma } \mathrm {pPol}(R)\) be the set of all partial polymorphisms of the set of relations \(\varGamma \), the resulting sets of partial functions are called strong partial clones. Similarly, we write \(\mathrm {Pol}(\varGamma )\) for the set of all polymorphisms of \(\varGamma \), and the resulting sets of functions are known as clones. If F is a set of (total or partial) functions then we write \(\mathrm {Inv}(F)\) to denote the set of relations invariant under each function in F.
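For finite relations the definition of a partial polymorphism can be checked by brute force. The following Python sketch assumes that a partial function is represented as a dictionary whose keys form its domain and that a relation is a finite set of tuples; the names are hypothetical. The example uses the inequality relation on a three-element set.

```python
from itertools import product

def is_partial_polymorphism(f, arity, relation):
    """f: dict from k-tuples to values (its keys are domain(f)); relation: set of n-tuples."""
    for rows in product(relation, repeat=arity):
        # columns[i] = (t_1[i], ..., t_k[i]) for the chosen tuples t_1, ..., t_k
        columns = list(zip(*rows))
        if all(col in f for col in columns):
            if tuple(f[col] for col in columns) not in relation:
                return False
    return True

dom = range(3)
neq = {(a, b) for a in dom for b in dom if a != b}

constant = {(a,): 0 for a in dom}                     # total constant function
restricted = {(0,): 0}                                # its restriction to X = {(0,)}
projection = {(a, b): a for a in dom for b in dom}    # total binary projection
```

The total constant function does not preserve the inequality relation, whereas its restriction to a single point does, since that restriction is never defined on all coordinates of a tuple with two distinct entries.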
Let us also briefly mention some properties of strong partial clones. In this context the term strong means that if \(f \in \mathrm {pPol}(\varGamma )\) then \(f_{\mid X} \in \mathrm {pPol}(\varGamma )\) for each restriction of f on the domain \(X \subseteq \mathrm {domain}(f)\), i.e., \(\mathrm {domain}(f_{\mid X}) = X\) and \(f({\mathbf {x}}) = f_{\mid X}({\mathbf {x}})\) for each \({\mathbf {x}} \in X\). More generally, if \(X \not \subseteq \mathrm {domain}(f)\) then we let \(f_{\mid X}\) be the restriction of f to the set \(\mathrm {domain}(f) \cap X\). Then strong partial clones of the form \(\mathrm {pPol}(\varGamma )\) are precisely the local strong partial clones over A [34], meaning that \(f \in \mathrm {pPol}(\varGamma )\) for a k-ary (partial) function f if \(f_{\mid X} \in \mathrm {pPol}(\varGamma )\) for each finite \(X \subseteq A^k\). On the relational side strong partial clones \(\mathrm {pPol}(\varGamma )\) correspond to sets of relations closed under pp-definitions without existential quantification, i.e., quantifier-free pp-definitions (qfpp-definitions). We write \(\langle \varGamma \rangle _{\not \exists }\) for the smallest set of relations containing \(\varGamma \) which is closed under qfpp-definitions. For \(\omega \)-categorical structures we then have the following useful correspondence between \(\mathrm {Inv}(\cdot )\) and \(\mathrm {pPol}(\cdot )\). The theorem follows almost directly from Romov [33], but for completeness we include a proof sketch where the \(\omega \)-categorical case differs.
Theorem 12
Let \(\varGamma \) and \(\varDelta \) be two \(\omega \)-categorical sets of relations over a domain A. Then (1) \(\mathrm {Inv}(\mathrm {pPol}(\varGamma )) = \langle \varGamma \rangle _{\not \exists }\) and (2) \(\varGamma \subseteq \langle \varDelta \rangle _{\not \exists }\) if and only if \(\mathrm {pPol}(\varDelta ) \subseteq \mathrm {pPol}(\varGamma )\).
Proof
Since \(\varGamma \) is \(\omega \)-categorical, for each \(n \ge 1\) there exist only finitely many first-order definable relations of arity n (see, e.g., Theorem 6.3.1 in Hodges [17]). The first claim then follows from Proposition 2 in Romov [33] since infinite intersections of relations and direct limits of relations can always be expressed via qfpp-definitions over a finite number of relations from \(\varGamma \). The second claim then readily follows by standard arguments. \(\square \)
There is a similar connection between \(\mathrm {Pol}(\cdot )\) and \(\langle \cdot \rangle \) which we omit since it is not directly useful for our purposes (see, e.g., the introductory textbook by Lau [28]). Theorem 12 then implies that partial polymorphisms determine the fine-grained complexity of CSPs in the following sense, as originally proved by Jonsson et al. for Boolean CSPs [21].
Theorem 13
Let \(\varGamma \) and \(\varDelta \) be two finite \(\omega \)-categorical languages. If \(\mathrm {pPol}(\varDelta ) \subseteq \mathrm {pPol}(\varGamma )\) then there exists a polynomial-time many-one reduction f from CSP\((\varGamma )\) to CSP\((\varDelta )\) such that \(f((V,C)) = (V', C')\), \(|V'| \le |V|\), for each instance (V, C) of CSP\((\varGamma )\).
The lattice of strong partial clones is uncountable in the Boolean domain [16] and it is to a large extent unexplored. Quite naturally, even less is known for arbitrary finite domains or infinite domains. However, we can simplify the task of analysing strong partial clones by restricting our attention to strong partial clones \(\mathrm {pPol}(\varGamma )\) where \(\mathrm {Pol}(\varGamma ) = C\) for a fixed clone C. It is then of particular interest to determine whether the set of strong partial clones of this form, sometimes called an interval, has a largest element.
Definition 2
Let C be a clone over a finite or countably infinite domain A. If there exists a set of relations \(\varGamma _w\) over A such that \(\mathrm {Pol}(\varGamma _w) = C\) and \(\mathrm {pPol}(\varGamma _w) = \bigcup \{\mathrm {pPol}(\varDelta ) \mid \mathrm {Pol}(\varDelta ) = C\}\) then we say that \(\varGamma _w\) is a weak base of \(\mathrm {Inv}(C)\).
Thus, \(\mathrm {pPol}(\varGamma _w)\) is the largest element in \(\{\mathrm {pPol}(\varDelta ) \mid \mathrm {Pol}(\varDelta ) = C\}\), which on the relational side means that \(\varGamma _w \subseteq \langle \varDelta \rangle _{\not \exists }\) for every set of relations \(\varDelta \) such that \(\mathrm {Pol}(\varDelta ) = C\). Hence, \(\varGamma _w\) is minimally expressive with respect to qfpp-definitions among the generating sets of \(\mathrm {Inv}(C)\), which explains the name “weak base”. If A is finite and C can be generated by a finite set of functions over A, then it is known that \(\mathrm {Inv}(C)\) has a weak base [36]. For infinite domains the situation differs, and weak bases do not necessarily exist. For both negative and positive examples, see Romov [35].
Might it then be possible that \(\langle \varGamma \rangle \) admits a weak base whenever \(\varGamma \) is an equality language? And which implications would that have if CSP\((\varGamma )\) is NP-complete? Let \({{\mathcal {E}}}\) be the set of all first-order definable relations over \(({\mathbb {N}}; =)\). Now, recall the definition of the relation S from Example 1. It is then known that \(S \in \langle \varGamma \rangle \) (and thus, CSP\((\varGamma )\) is NP-complete) for an equality language \(\varGamma \) if and only if \(\langle \varGamma \rangle = \mathcal{E}\) [8]. Thus, we are interested in determining whether \({{\mathcal {E}}}\) has a weak base, and we may now observe that the existence of a weak base would have far-reaching implications.
Proposition 1
If \({{\mathcal {E}}}\) has a weak base then the ETH is false.
Proof
Assume that \({{\mathcal {E}}}\) has a weak base \(\varGamma _w\). Assume first that \(\varGamma _w\) is infinite. It is then known that there exists a finite set \(\varDelta \subseteq \varGamma _w\) such that \(\langle \varGamma _w \rangle = \langle \varDelta \rangle \), implying also that \(\langle \varGamma _w \rangle _{\not \exists } = \langle \varDelta \rangle _{\not \exists }\) (see, e.g., the second condition of Theorem 7.4.2 in Bodirsky [4]).
Thus, we may assume that \(\varGamma _w\) is finite. But then Theorem 13 together with the relations constructed in Theorem 11 implies that CSP\((\varGamma _w)\) is solvable in \(O(c^{|V|})\) time for every \(c > 1\). However, then Theorem 3 implies that 3-SAT is solvable in subexponential time, thus contradicting the ETH. \(\square \)
The Non-Existence of a Weak Base
Due to Proposition 1 we strongly suspect that \(\mathcal{E}\) does not have a weak base, and we will see that this can in fact be proven unconditionally. More precisely, we will prove a fairly general condition which rules out the existence of a weak base, and which is particularly relevant in connection with NP-hard CSPs. For a universe A, let \(R^A_{\ne } = \{(x,y) \in A^2 \mid x \ne y\}\) be the inequality relation over A.
Lemma 1
Let \(\varGamma \) be a finite, \(\omega \)-categorical set of relations over an infinite domain A. If \(R^A_{\ne } \in \langle \varGamma \rangle \) then \(\langle \varGamma \rangle \) does not admit a weak base.
Proof
Let f be an arbitrary k-ary function over A. Our goal is to show that for every finite \(X \subset \mathrm {domain}(f) = A^k\) there exists a set of relations \(\varGamma '\) such that \(\langle \varGamma ' \rangle = \langle \varGamma \rangle \) and such that \(f_{\mid X}\) preserves \(\varGamma '\). If \(\langle \varGamma \rangle \) admits a weak base \(\varGamma _w\) then, clearly, \(f_{\mid X} \in \mathrm {pPol}(\varGamma _w)\) for every finite \(X \subset A^k\), which implies that \(f \in \mathrm {pPol}(\varGamma _w)\) for every function f (since \(\mathrm {pPol}(\varGamma _w)\) is local). This contradicts the assumption that \(R^A_{\ne } \in \langle \varGamma \rangle \) since \(R^A_{\ne }\), for example, is not preserved by any constant function over A.
Hence, let \(X \subset A^k\) be finite. Let \(N = \{d_1, \ldots , d_k \mid (d_1, \ldots , d_k) \in X\}\) be the set of values occurring in tuples in X, and let \(|N| = n\). Let \(\varGamma = \{R_1, \ldots , R_l\}\) and define the relation R to be the Cartesian product of all relations in \(\varGamma \), i.e., \(R = R_1 \times \ldots \times R_l\). Let m be the arity of the relation R. Clearly, \(\langle \{R\} \rangle = \langle \varGamma \rangle \), since \(\varGamma \) can pp-define R via a conjunction, and R can pp-define each relation in \(\varGamma \) by projecting away every other argument. Define the \((m+n+1)\)-ary relation \(R^n\) such that

\(R^n(x_1, \ldots , x_m, y_1, \ldots , y_{n+1}) \equiv R(x_1, \ldots , x_m) \wedge \bigwedge _{1 \le i < j \le n+1} y_i \ne y_j.\)

This relation is pp-definable by R since we assumed that R can pp-define the inequality relation \(R^A_{\ne }\), and since

\(R(x_1, \ldots , x_m) \equiv \exists y_1, \ldots , y_{n+1} :R^n(x_1, \ldots , x_m, y_1, \ldots , y_{n+1})\)

we also have that \(\langle \{R\} \rangle = \langle \{R^n\} \rangle = \langle \varGamma \rangle \). Next, we claim that \(f_{\mid X}\) preserves \(R^n\). Consider any sequence of tuples \(t_1, \ldots , t_k \in R^n\) and let t be one of these tuples. Due to the definition of \(R^n\) we then have that \(t[m+i] \ne t[m+j]\) for any distinct \(i,j \in \{1, \ldots , n+1\}\). Hence, \(|\{t[i] \mid 1 \le i \le m + n + 1\}| > |N|\), meaning that \(f_{\mid X}(t_1, \ldots , t_k)\) is undefined, and that \(f_{\mid X}\) preserves \(R^n\).
Last, assume there exists \(\varGamma _w\) such that \(\mathrm {pPol}(\varGamma _w) = \bigcup _{\langle \varDelta \rangle = \langle \varGamma \rangle } \mathrm {pPol}(\varDelta )\), i.e., that \(\varGamma _w\) is a weak base of \(\langle \varGamma \rangle \). Then, by the above construction, \(f_{\mid X} \in \mathrm {pPol}(\varGamma _w)\) for every finite X since \(\mathrm {pPol}(\{R^n\}) \subseteq \mathrm {pPol}(\varGamma _w)\), which then implies that \(f \in \mathrm {pPol}(\varGamma _w)\) since \(\mathrm {pPol}(\varGamma _w)\) is local. Hence, the strong partial clone \(\mathrm {pPol}(\varGamma _w)\) would need to contain all total functions over A. But then \(\varGamma _w\) cannot be a weak base of \(\langle \varGamma \rangle \) since the assumption that \(R^A_{\ne } \in \langle \varGamma \rangle = \langle \varGamma _w \rangle \) e.g. implies that \(\varGamma _w\) cannot be preserved by any constant function over A. \(\square \)
This condition is sufficient to establish nonexistence of weak bases in the context of both equality languages and temporal languages.
Theorem 14
\({{\mathcal {E}}}\) does not have a weak base.
Proof
Since \({{\mathcal {E}}}\) is the set of all first-order definable relations it clearly follows that \(R_{\ne } \in {{\mathcal {E}}}\). But since all equality languages are \(\omega \)-categorical, and since \({{\mathcal {E}}} = \langle \{S\} \rangle \), the result then directly follows from Lemma 1. \(\square \)
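Observe that Theorem 14 together with Theorem 12 implies that

\(\bigcap _{\langle \varGamma \rangle = {{\mathcal {E}}}} \langle \varGamma \rangle _{\not \exists } = \langle \mathrm {Eq} \rangle _{\not \exists }.\)
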
To see this, assume otherwise, i.e., that there exists \(R \notin \langle \mathrm {Eq} \rangle _{\not \exists }\) such that \(R \in \langle \varGamma \rangle _{\not \exists }\) for every \(\varGamma \) such that \(\langle \varGamma \rangle = {{\mathcal {E}}}\). This, however, would imply that \(\mathrm {pPol}(\mathrm {Eq}) \supset \mathrm {pPol}(R) \supseteq \bigcup _{\langle \varGamma \rangle = {{\mathcal {E}}}} \mathrm {pPol}(\varGamma )\). This contradicts the proof of Theorem 14 since it is shown there that \(\bigcup _{\langle \varGamma \rangle = {{\mathcal {E}}}} \mathrm {pPol}(\varGamma )\) contains all (total and partial) functions. One interpretation of this result is that equality languages resulting in NP-hard CSPs have rather little in common with regard to qfpp-definability. For example, we may conclude that not all such languages can qfpp-define the inequality relation \(\mathrm {Neq}_{{\mathbb {N}}}\).
Last, we will show that the non-existence of weak bases is not solely a property of equality CSPs, and that an analogous property can be proven also for temporal CSPs, i.e., CSP\((\varGamma )\) where each relation in \(\varGamma \) has a first-order definition in the structure \(({\mathbb {Q}}; <)\).
This class of CSPs is a strict generalisation of equality CSPs and includes many natural problems, e.g., the betweenness problem and the cyclic ordering problem. For many other examples, see Bodirsky & Kára [9]. Let \({{\mathcal {T}}}\) be the set of all first-order definable relations over \(({\mathbb {Q}}; <)\). For \(x_1,\dots ,x_k \in {\mathbb Q}\), we write \(\overrightarrow{x_1 \dots x_k}\) when \(x_1< \dots < x_k\). The following dichotomy holds for temporal CSPs.
Theorem 15
(Bodirsky and Kára [9]) Let \(\varGamma \subseteq {{\mathcal {T}}}\) be a temporal constraint language. If there is a primitive positive definition of \(\mathrm {Betw}\), \(\mathrm {Cycl}\), \(\mathrm {Sep}\), \(T_3\), \(-T_3\), or S in \(\varGamma \), where

1.
\(\mathrm {Betw} = \{(x, y, z) \in {\mathbb {Q}}^3 \mid \overrightarrow{xyz} \vee \overrightarrow{zyx}\}\),

2.
\(\mathrm {Cycl} = \{(x, y, z) \in {\mathbb {Q}}^3 \mid \overrightarrow{xyz} \vee \overrightarrow{yzx} \vee \overrightarrow{zxy}\},\)

3.
\(\begin{array}{ll} \mathrm {Sep} = \{(x_1, y_1, x_2, y_2) \in {\mathbb {Q}}^4 \mid &{} \overrightarrow{x_1 x_2 y_1 y_2} \vee \overrightarrow{x_1 y_2 y_1 x_2} \, \vee \\ &{} \overrightarrow{y_1 x_2 x_1 y_2} \vee \overrightarrow{y_1 y_2 x_1 x_2} \, \vee \\ &{} \overrightarrow{x_2 x_1 y_2 y_1} \vee \overrightarrow{x_2 y_1 y_2 x_1} \, \vee \\ &{} \overrightarrow{y_2 x_1 x_2 y_1} \vee \overrightarrow{y_2 y_1 x_2 x_1} \}, \end{array} \)

4.
\(T_3 = \{(x,y,z) \in {\mathbb {Q}}^3 \mid x = y< z \vee x = z < y\}\), and

5.
\(-T_3 = \{(x,y,z) \mid (-x,-y,-z) \in T_3\}\),
then CSP\((\varGamma )\) is NP-complete. Otherwise, CSP\((\varGamma )\) is tractable.
With the help of this classification we can then prove that \(\langle \varGamma \rangle \) cannot admit a weak base whenever CSP\((\varGamma )\) is NP-complete (assuming P \(\ne \) NP).
Theorem 16
Let \(\varGamma \subseteq {{\mathcal {T}}}\) be a finite temporal language. If \(\varGamma \) pp-defines \(\mathrm {Betw}\), \(\mathrm {Cycl}\), \(\mathrm {Sep}\), \(\mathrm {T_3}\), \(-\mathrm {T_3}\), or S, then \(\langle \varGamma \rangle \) does not admit a weak base.
Proof
We want to apply Lemma 1, and thus need to show that \(\varGamma \) can pp-define the inequality relation \(R^{{\mathbb {Q}}}_{\ne }\) over \({\mathbb {Q}}\). To prove this it is sufficient to show that \(\mathrm {Betw}\), \(\mathrm {Cycl}\), \(\mathrm {Sep}\), \(\mathrm {T_3}\), \(-\mathrm {T_3}\), and S can all pp-define the inequality relation, which can be done with straightforward arguments. For example, \(R^{{\mathbb {Q}}}_{\ne }(x,y) \equiv \exists z :\mathrm {Betw}(x,y,z)\), and \(R^{{\mathbb {Q}}}_{\ne }(x,y) \equiv \exists z :\mathrm {Cycl}(x,y,z)\). The result then directly follows from Lemma 1. \(\square \)
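The two pp-definitions of inequality given as examples can be sanity-checked on a finite sample of rationals. The Python sketch below uses explicit witnesses for the existentially quantified z; the witness functions are our own choice for illustration and are not taken from the proof.

```python
from fractions import Fraction
from itertools import product

def betw(x, y, z):
    return x < y < z or z < y < x

def cycl(x, y, z):
    return x < y < z or y < z < x or z < x < y

def betw_witness(x, y):
    # Reflect x through y, so that y lies strictly between x and z when x != y.
    return 2 * y - x

def cycl_witness(x, y):
    # If x < y take z above y; if y < x take z strictly between y and x.
    return y + 1 if x < y else (x + y) / 2

sample = [Fraction(p, q) for p in range(-3, 4) for q in (1, 2, 3)]

def check(sample):
    for x, y in product(sample, repeat=2):
        if x != y:
            if not (betw(x, y, betw_witness(x, y)) and cycl(x, y, cycl_witness(x, y))):
                return False
        elif any(betw(x, y, z) or cycl(x, y, z) for z in sample):
            return False
    return True
```

When x = y no witness can exist, since every disjunct of \(\mathrm {Betw}\) and \(\mathrm {Cycl}\) forces a strict inequality between its first two arguments; the check over the sample only illustrates this.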
Upper Bounds for Equality CSPs and Reducts of Unary Structures
The lower bounds established in Sect. 3 suggest that we cannot construct an \(O(c^{|V|})\) time algorithm (\(c > 1\)) which is applicable to arbitrary equality languages. However, if we fix a finite equality language \(\varGamma \), this still leaves the possibility of constructing an \(O(c^{|V|})\) time algorithm for a constant c depending on \(\varGamma \). In this section we tackle this problem, and the more general problem of constructing faster exponential-time algorithms for CSP\((\varGamma )\) whenever \(\varGamma \) is a finite unary reduct. We begin in Sect. 4.1 by constructing an improved algorithm for the case when \(\varGamma \) is a finite equality language, and in Sect. 4.2 consider the more involved case of reducts of unary structures.
An Algorithm for Finite Equality Languages
We begin by describing a novel algorithm for CSP\((\varGamma )\), where \(\varGamma \) is a finite equality language with maximum arity \(\alpha \), with a running time of \(O^{*}((\frac{\alpha (\alpha - 1)}{2})^{|V|})\). Thus, the algorithm runs in \(O^{*}(c^{|V|})\) time for a constant c depending on \(\varGamma \), which is a significant improvement over the algorithm proposed by [18] which solves CSP\((\varGamma )\) in \(O^{*}(2^{|V| \cdot \log (\frac{0.792|V|}{\ln (|V|+1)})})\) time.
Theorem 17
The CSP of an arbitrary finite equality language \(\varGamma \) can be solved in \(O^{*} \left( \left( \frac{\alpha (\alpha - 1)}{2} \right) ^{|V|} \right) \) time where \(\alpha =\max \{ar(R) \mid R \in \varGamma \}\).
Proof
Consider the algorithm A for instances of CSP\((\varGamma )\) presented in Fig. 2. We begin by proving correctness by induction over \(|V| = n\). If \(n=1\), then the tests in steps (3) and (4) provide the correct answer. Assume that the algorithm is correct for all instances with at most n variables, where \(n \ge 1\). Let \(I=(V,C)\) be an instance where \(|V| = n+1\). First, assume that I has an injective solution. Then it is readily verified that \(f :V \rightarrow \{1, \ldots , |V|\}\) defined as \(f(x_i) = i\) for each \(x_i \in V = \{x_1, \ldots , x_{|V|}\}\), is a solution to I as well (in technical terms this follows from the well-known fact that the automorphism group of \(\varGamma \) is the full symmetric group [8]). Hence, the algorithm answers ‘yes’ via step (3). Otherwise I does not have an injective solution and at least one constraint \(c=R(x_{i_1},\dots ,x_{i_p}) \in C\) is not satisfied by the function s. This implies that (at least) two variables in \(\{x_{i_1},\dots ,x_{i_p}\}\) must be assigned the same value. This is systematically tested in step (6), and the correctness follows from the inductive hypothesis.
Concerning the time complexity, it is bounded from above by the recurrence \(T(n) = \frac{\alpha (\alpha -1)}{2} \cdot T(n-1) + \mathrm {poly}(|I|)\) since \(p \le \alpha \) for each possible choice of constraint \(R(x_{i_1}, \ldots , x_{i_p})\), so that step (6) branches on at most \(\frac{p(p-1)}{2} \le \frac{\alpha (\alpha -1)}{2}\) pairs of variables. Thus, \(T(n) \in O^{*}((\frac{\alpha (\alpha -1)}{2})^{n})\), and we get the desired bound on the time complexity. \(\square \)
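Since Fig. 2 is not reproduced in this section, the following Python sketch shows one way to realise the branching strategy analysed in the proof. It is a reconstruction under stated assumptions rather than the authors' pseudocode: relations are modelled as hypothetical Python predicates over tuples of values (assumed to depend only on which entries are equal), and the names `solve_equality_csp`, `eq` and `neq` are illustrative.

```python
from itertools import combinations


def eq(t):
    """Binary equality relation (illustrative)."""
    return t[0] == t[1]


def neq(t):
    """Binary disequality relation (illustrative)."""
    return t[0] != t[1]


def solve_equality_csp(variables, constraints):
    """Decide satisfiability; constraints are (predicate, scope) pairs."""
    variables = list(variables)
    # Candidate injective assignment s(x_i) = i.
    s = {x: i for i, x in enumerate(variables, start=1)}
    violated = next(
        (scope for rel, scope in constraints
         if not rel(tuple(s[x] for x in scope))),
        None,
    )
    if violated is None:
        return True          # the injective assignment is a solution
    if len(variables) == 1:
        return False         # nothing left to identify
    # No injective assignment satisfies the violated constraint, so two
    # distinct variables in its scope must be equal: branch on each pair.
    distinct = sorted(set(violated), key=variables.index)
    for x, y in combinations(distinct, 2):
        merged_vars = [v for v in variables if v != y]
        merged_cons = [
            (rel, tuple(x if v == y else v for v in scope))
            for rel, scope in constraints
        ]
        if solve_equality_csp(merged_vars, merged_cons):
            return True
    return False
```

Each recursive call removes one variable and branches on at most \(\frac{\alpha (\alpha -1)}{2}\) pairs, which mirrors the recurrence above. For example, `solve_equality_csp(["x", "y", "z"], [(eq, ("x", "y")), (neq, ("y", "z"))])` accepts, while adding the constraints `(eq, ("y", "z"))` and `(neq, ("x", "z"))` in place of `(neq, ("y", "z"))` makes the instance unsatisfiable.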
An Algorithm for Finite Reducts of Unary Structures
We recall that a structure \(\mathbf{A}=(A;U_1,U_2,\dots ,U_k)\) is unary if \(U_1,U_2,\dots ,U_k\) are unary relations. The classical complexity of the constraint satisfaction problem for finite first-order reducts of unary structures has been thoroughly analysed by Bodirsky & Mottet [10] and Bodirsky & Bodor [5]: they prove that such problems are either polynomial-time solvable or NP-complete. We refer the reader to their articles for more background information about unary structures and their reducts.
Throughout this section we let \(\varTheta = ({\mathbb {N}}; U_1, \ldots , U_k)\), \(k \ge 1\), be an arbitrary unary structure where each \(U_i \subseteq {\mathbb {N}}\), and we let \(\varGamma =\{R_1,\dots ,R_m\}\) be a finite first-order reduct of \(\varTheta \). We can (without loss of generality) focus on structures with a countably infinite domain since every reduct of a unary structure has the same CSP as a reduct of a structure on a countably infinite domain. Since \(\varTheta \) admits quantifier elimination, and since \(\varGamma \) is finite, we may without loss of generality assume that each \(R_i\) is defined via a DNF formula where an atom consists of either a unary relation from \(\varTheta \) or an equality constraint. Let \(\alpha \) denote the maximum arity of \(\varGamma \), i.e. \(\alpha =\max \{ar(R_1),\dots ,ar(R_m)\}\).
Our algorithm for CSP\((\varGamma )\) is based on the following steps. First, we show that for each instance \(I = (V,C)\) of CSP\((\varGamma )\) there exists a particular set of functions F with \(c^{|V|}\) elements (where c is a constant that only depends on \(\varGamma \)). These functions can be viewed as “high-level descriptions” of the solution we are searching for. Second, we prove that for each \(f \in F\), one can construct an instance \(I_f\) of CSP\((\varGamma _{\mathrm{eq}})\) where \(\varGamma _{\mathrm{eq}}\) is a finite equality language that only depends on the choice of \(\varGamma \). The instances \(I_f\) are constructed in such a way that I is satisfiable if and only if \(I_f\) is satisfiable for some \(f \in F\).
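We proceed with a few definitions. For every set \(S \subseteq {\mathbb N}\) we denote the complement \({\mathbb {N}} \setminus S\) of S by \({\bar{S}}\). Define U(S), \(S \subseteq \{1,\dots ,k\}\), such that
$$\begin{aligned} U(S) = \bigcap _{i \in S} U_i \; \cap \bigcap _{j \in \{1,\dots ,k\} \setminus S} \bar{U_j} \end{aligned}$$
and let \(\mathbf{S}=\{U(S) \mid S \subseteq \{1,\dots ,k\}\}\). One may view the set \(\mathbf{S}\) as a “basis” for \(U_1,\dots ,U_k\) in the sense that each \(U_i\) is the union of some elements in \(\mathbf{S}\). Let
$$\begin{aligned} \mathbf{U} = \{X \in \mathbf{S} \mid X \text{ is infinite}\} \cup \{\{x\} \mid x \in X, \; X \in \mathbf{S}, \; X \text{ is finite}\}. \end{aligned}$$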
The set \(\mathbf{U}\) can be viewed as a refinement of \(\mathbf{S}\): it is still a basis for \(U_1,\dots ,U_k\) but it contains more elements. The functions in the set F that we briefly discussed earlier have the set \(\mathbf{U}\) as their codomain.
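As a small illustration (a toy example that is not taken from the source), let \(k=2\), let \(U_1=\{1,2,3\}\) and let \(U_2\) be the set of even numbers, taking \({\mathbb N}=\{1,2,3,\dots \}\) for concreteness. Reading U(S) as the intersection of the sets \(U_i\) with \(i \in S\) and the complements \(\bar{U_j}\) with \(j \notin S\), which is how U(S) is used in the proof of Lemma 2, we get
$$\begin{aligned} U(\{1,2\})&= U_1 \cap U_2 = \{2\}, \\ U(\{1\})&= U_1 \cap \bar{U_2} = \{1,3\}, \\ U(\{2\})&= \bar{U_1} \cap U_2 = \{4,6,8,\dots \}, \\ U(\emptyset )&= \bar{U_1} \cap \bar{U_2} = \{5,7,9,\dots \}. \end{aligned}$$
Here \(\mathbf{S}\) has four elements, and splitting the finite set \(\{1,3\}\) into single-element sets gives \(\mathbf{U}=\{\{1\},\{2\},\{3\},\{4,6,8,\dots \},\{5,7,9,\dots \}\}\), so \(|\mathbf{U}|=5\). Both \(U_1=\{1\} \cup \{2\} \cup \{3\}\) and \(U_2=\{2\} \cup \{4,6,8,\dots \}\) are unions of elements of \(\mathbf{U}\).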
Lemma 2
The following statements are true.

1.
\(\mathbf{U}\) is a partitioning of \({\mathbb N}\),

2.
\(\mathbf{U}\) is finite, and

3.
for every \(U_i\), \(1 \le i \le k\), there exist \(S_{1},\dots ,S_{p} \in \mathbf{U}\) such that \(U_i = \bigcup _{j=1}^p S_j\).
Proof
We first make the following claims concerning the set \(\mathbf{S}\).

1.
\(\mathbf{S}\) is a partitioning of \({\mathbb N}\),

2.
\(\mathbf{S}\) is finite, and

3.
for every \(U_i\), \(1 \le i \le k\), there exist \(S_{1},\dots ,S_{p} \in \mathbf{S}\) such that \(U_i = \bigcup _{j=1}^p S_j\).
We prove each case in turn.

1.
Arbitrarily choose \(p \in {\mathbb N}\). Let \(S=\{i \mid p \in U_i\}\) and note that \(p \in U(S)\). In particular, if \(S=\emptyset \), then \(U(\emptyset )=\bigcap _{i=1}^k \bar{U_i}\) and \(p \in U(\emptyset )\). We conclude that every element in \({\mathbb N}\) appears in at least one of the sets U(S). Assume, with the aim of getting a contradiction, that \(p \in U(S)\), \(p \in U(S')\), and \(S \ne S'\) where \(S' \subseteq \{1,\dots ,k\}\). It is clear that the only sets among \(U_1,\dots ,U_k,\bar{U_1},\dots ,\bar{U_k}\) that contain p are \(U_i\), \(i \in S\), and \(\bar{U_j}\), \(j \in \{1,\dots ,k\} \setminus S\). Thus, there is at least one set in
$$\begin{aligned} X=\{U_i \mid i \in S'\} \cup \{\bar{U_j} \mid j \in \{1,\dots ,k\} \setminus S'\} \end{aligned}$$
that does not contain p. We know that \(U(S')= \bigcap X\) so \(p \not \in U(S')\) and this leads to a contradiction.

2.
\(\mathbf{S}\) contains at most \(2^k\) elements.

3.
Arbitrarily choose \(U_i\), \(1 \le i \le k\). Let \(T = \bigcup \{X \mid X \subseteq U_i, \; X \in \mathbf{S}\}\) and note that \(T \subseteq U_i\). We show that \(U_i \subseteq T\) and conclude that there exists a set of elements in \(\mathbf{S}\) whose union equals \(U_i\). Arbitrarily choose \(e \in U_i\) and assume to the contrary that \(e \not \in T\). There exists exactly one set \(E \in \mathbf{S}\) that contains e since \(\mathbf{S}\) is a partitioning of \({\mathbb N}\). We know that \(E=U(S)\) for some \(S \subseteq \{1,\dots ,k\}\) by the definition of \(\mathbf{S}\). If \(i \in S\), then \(E \subseteq U_i\), so \(E \subseteq T\) by the definition of T and \(\{e\} \subseteq E \subseteq T\), which leads to a contradiction. Hence, \(i \not \in S\) and \(E = E \cap \bar{U_i}\) by the definition of U(S). This implies that \(e \not \in E\) since \(e \in U_i\) and this contradicts the choice of E.
The statements for the set \(\mathbf{U}\) now become straightforward consequences. The family of sets \(\mathbf{U}\) is still a partitioning of \({\mathbb N}\) since we have only “refined” the finite sets of \(\mathbf{S}\) into single-element sets. Since \(\mathbf{S}\) is a finite set, \(\mathbf{U}\) is finite, too. Finally, every \(U_i\), \(1 \le i \le k\), can be expressed as a union of elements in \(\mathbf{U}\) since this is possible in \(\mathbf{S}\). \(\square \)
Let us remark that a partition where every part is either infinite or one-element (such as \(\mathbf{U}\)) is called a stabilised partition in the terminology of Bodirsky & Mottet [10]. We define the algorithm D (see Fig. 3) for instances (V, C) of CSP\((\varGamma )\) and functions \(f :V \rightarrow \mathbf{U}\). Algorithm D checks whether a given instance (V, C) has a solution that respects the function f: we say that a solution \(g :V \rightarrow {\mathbb N}\) to (V, C) respects f if \(g(x) \in f(x)\) for all \(x \in V\). If a conjunct becomes empty, then we view it (as usual) as satisfiable and it can be removed. If a disjunction becomes empty, then it is not satisfiable and the algorithm can immediately report that the instance is not satisfiable. The algorithm A that appears within D is the algorithm for equality languages presented in Sect. 4.1. It will only be applied to constraints that are based on equality relations with arity at most \(\alpha \). We let \(\varGamma _{\mathrm{eq}}\) denote this set of equality relations. The language \(\varGamma _{\mathrm{eq}}\) is finite so algorithm A solves CSP\((\varGamma _{\mathrm{eq}})\) in time \(O^{*}\left( \left( \frac{\alpha (\alpha -1)}{2} \right) ^{|V|}\right) \) by Theorem 17.
Our aim is now to prove that an instance \(I = (V,C)\) of CSP\((\varGamma )\) is satisfiable if and only if there exists a function \(f :V \rightarrow \mathbf{U}\) such that D(I, f) answers ‘yes’. First of all, we verify that the computed instance \((V, C'')\) is an instance of CSP\((\varGamma _{\mathrm{eq}})\), implying that the call to algorithm A in step (4) is valid. For this purpose it is sufficient to show that \(S \subseteq U_i\) or \(S \cap U_i = \emptyset \) for each \(U_i \in \{U_1, \ldots , U_k\}\) and \(S \in \mathbf{U}\), since the filtering in step (1) then guarantees that any constraint involving \(U_i\) is replaced by a constraint over \(\varGamma _{\mathrm{eq}}\).
Lemma 3
Arbitrarily choose \(U_i\), \(1 \le i \le k\), and a set \(S \in \mathbf{U}\). Either \(S \subseteq U_i\) or \(S \cap U_i = \emptyset \).
Proof
There exist \(S_{1},\dots ,S_{p} \in \mathbf{U}\) such that \(U_i = \bigcup _{j=1}^p S_j\) by the third statement of Lemma 2. Since \(\mathbf{U}\) is a partitioning of \({\mathbb N}\) (by the first statement of Lemma 2), this decomposition is unique. If \(S \in \{S_1,\dots ,S_p\}\), then \(S \subseteq U_i\). Otherwise, \(S \cap U_i = \emptyset \).
\(\square \)
We continue the correctness proof by establishing a close connection between (V, C) and \((V,C'')\).
Lemma 4
Let (V, C) be an instance of CSP\((\varGamma )\), let \(f :V \rightarrow \mathbf{U}\), and let \((V,C'')\) be the instance computed in step (3) of the algorithm D((V, C), f). Then (V, C) has a solution \(g :V \rightarrow {\mathbb N}\) that respects f if and only if the instance \((V,C'')\) has such a solution.
Proof
We begin by showing that (V, C) has a solution \(g :V \rightarrow {\mathbb {N}}\) which respects f if and only if the instance \((V,C')\) computed in step 1 of the algorithm has such a solution. Therefore, first assume that (V, C) has a solution g that respects f. If a formula in C contains the atom \(U_i(x)\) (respectively, \(\lnot U_i(x)\)) and \(f(x) \cap U_i=\emptyset \) (respectively, \(f(x) \subseteq U_i\)), then we can safely remove the entire conjunction containing \(U_i(x)\) since it cannot be satisfied by a solution that respects f (such as g). Furthermore, every atom \(U_i(x)\) (respectively, \(\lnot U_i(x)\)) such that \(f(x) \subseteq U_i\) (respectively, \(f(x) \cap U_i = \emptyset \)) is satisfied by any solution that respects f so such atoms can be removed. We conclude that g is a solution to \((V,C')\).
Second, assume that \((V,C')\) has a solution \(g :V \rightarrow {\mathbb N}\) that respects f. First note that the atoms that are removed in step 1(c) and 1(d) are satisfied by every assignment that respects f. Since g respects f, these atoms are satisfied by g, too. Thus, if we take all constraints in \(C'\) and extend them with the conjuncts and atoms that were removed in step 1, then g is a solution to this set of constraints. Note here that adding back the removed conjuncts only makes the instance easier in the sense that it is satisfied by a potentially larger set of variable assignments. Clearly, the new set of constraints equals C and we conclude that g is a solution to (V, C).
Now, assume that \(g :V \rightarrow {\mathbb N}\) is a solution to \((V,C')\) that respects f. The additional constraints \(\{x \ne y \mid f(x) \ne f(y)\}\) are always satisfied when we are only interested in solutions that respect f; this follows from the fact that \(\mathbf{U}\) is a partitioning of \({\mathbb N}\) by the first statement of Lemma 2. The constraints \(\{x=y \mid f(x)=f(y)=S \; \mathrm{and} \; |S|=1\}\) are always satisfied when the domain of a variable consists of a single element. Thus, g is a solution to \((V,C'')\).
Last, assume that \((V,C'')\) has a solution \(g :V \rightarrow {\mathbb N}\) that respects f. Since \(C' \subseteq C''\), \(C'\) can be viewed as a relaxation of \(C''\). Consequently, g is a solution to \((V,C')\) which respects f. \(\square \)
Lemma 4 gives us a straightforward way of proving the correctness of algorithm D.
Lemma 5
Let (V, C) be an instance of CSP\((\varGamma )\), and let \(f :V \rightarrow \mathbf{U}\). Then the algorithm D accepts ((V, C), f) if and only if (V, C) has a solution that respects f.
Proof
For the first direction, assume that D accepts the instance ((V, C), f). This implies that there exists a solution \(g :V \rightarrow {\mathbb N}\) to the instance \((V,C'')\). Let \(D_S=\{g(x) \mid f(x)=S, x \in V\}\) for every \(S \in \mathbf{U}\), i.e. \(D_S\) contains the values that g assigns to the variables satisfying \(f(x)=S\). We make two observations concerning the sets \(D_S\).

1.
\(D_S \cap D_{S'} = \emptyset \) whenever \(S \ne S'\). This is a consequence of the construction of \(C''\): the constraint \(x \ne y\) is in \(C''\) whenever \(f(x) \ne f(y)\).

2.
\(|D_S| \le 1\) if \(|S|=1\). Once again, this is a consequence of the construction of \(C''\): the constraint \(x = y\) is in \(C''\) whenever \(f(x) = f(y) = S\) and \(|S|=1\).
These two observations imply that there exist injective functions \(h_S\) from \(D_S\) to S for all \(S \in \mathbf{U}\) (recall that a set in \(\mathbf{U}\) is either infinite or one-element). The sets in \(\{D_S \mid S \in \mathbf{U}\}\) are pairwise disjoint and so are the sets in \(\mathbf{U}\). Hence, there exists an injective function \(h :{\mathbb N} \rightarrow {\mathbb N}\) such that \(\{h(d) \mid d \in D_S\} \subseteq S\) for all \(S \in \mathbf{U}\). We see that the function \(g' :V \rightarrow {\mathbb N}\) defined by \(g'(x)=h(g(x))\) is a solution to \((V,C'')\) that respects f. By Lemma 4, there is a solution to (V, C) that respects f.
For the other direction, assume that D does not accept the instance ((V, C), f). This implies that there does not exist any solution to the instance \((V,C'')\). By Lemma 4, there is no solution \(g :V \rightarrow {\mathbb N}\) to (V, C) that respects f. \(\square \)
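Fig. 3 is not reproduced in this section, so the following Python sketch reconstructs algorithm D from the way its steps are described in the proofs of Lemmas 4 and 5. It is a sketch under assumptions, not the authors' pseudocode: the atom encoding, the tables `inside` and `singleton`, and all function names are hypothetical, and a brute-force search stands in for algorithm A.

```python
from itertools import product


def brute_force_equality(variables, dnf_constraints):
    """Stand-in for algorithm A: try every assignment V -> {0,...,|V|-1}."""
    variables = list(variables)
    for values in product(range(len(variables)), repeat=len(variables)):
        g = dict(zip(variables, values))
        if all(
            any(all((g[x] == g[y]) == pos for (_, x, y, pos) in conj)
                for conj in dnf)
            for dnf in dnf_constraints
        ):
            return True
    return False


def algorithm_d(variables, constraints, f, inside, singleton,
                solve_eq=brute_force_equality):
    """
    constraints: DNF formulas, each a list of conjunctions (lists of atoms);
                 atoms are ("U", i, x, positive) or ("eq", x, y, positive).
    f:           dict mapping each variable to the label of a part of U.
    inside:      dict mapping a part label to {i : part is a subset of U_i}
                 (by Lemma 3 the part is otherwise disjoint from U_i).
    singleton:   set of labels of the one-element parts.
    """
    reduced = []
    for dnf in constraints:
        kept = []
        for conj in dnf:
            rest, alive = [], True
            for atom in conj:
                if atom[0] == "U":
                    _, i, x, positive = atom
                    if (i in inside[f[x]]) == positive:
                        continue      # true for every assignment respecting f
                    alive = False     # false for every such assignment
                    break
                rest.append(atom)
            if alive:
                kept.append(rest)
        if not kept:
            return False              # empty disjunction: reject
        if any(not conj for conj in kept):
            continue                  # empty conjunction: constraint holds
        reduced.append(kept)
    vs = list(variables)
    for a in range(len(vs)):
        for b in range(a + 1, len(vs)):
            x, y = vs[a], vs[b]
            if f[x] != f[y]:
                reduced.append([[("eq", x, y, False)]])   # different parts
            elif f[x] in singleton:
                reduced.append([[("eq", x, y, True)]])    # same one-element part
    return solve_eq(vs, reduced)
```

For instance, with a single unary relation \(U_1\) containing exactly one element, the parts can be labelled `"zero"` (the one-element set \(U_1\)) and `"rest"` (its complement), with `inside = {"zero": {1}, "rest": set()}` and `singleton = {"zero"}`. The constraint \(U_1(x) \wedge U_1(y) \wedge x \ne y\) is then rejected for every choice of f, as expected.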
We can now state and prove the main result by combining the results presented in this section.
Theorem 18
CSP\((\varGamma )\) can be solved in \(O^{*}((|\mathbf{U}| \cdot \frac{\alpha (\alpha -1)}{2})^{|V|})\) time.
Proof
We begin by proving that algorithm D runs in \(O^{*}((\frac{\alpha (\alpha -1)}{2})^{|V|})\) time. Let \(I=((V,C),f)\) denote an arbitrary input instance. First of all, each test performed in step 1 can be performed in constant time since the constraint language \(\varGamma \) is fixed and \(\mathbf{U}\) is finite: the information needed for verifying if \(f(x) \cap U_i = \emptyset \) and \(f(x) \subseteq U_i\) can be precomputed and stored in a finite table. Furthermore, the operations in step 1 do not increase the arity of the formulas in I, and the formulas added in step 3 all have arity 2. Thus, the algorithm D runs in \(O^{*}(c^{|V|})\) time where \(c = \max \{2,\frac{\alpha (\alpha -1)}{2}\}\). However, if the arity of the formulas in C is at most 2, then the algorithm runs in polynomial time since \(C''\) only contains formulas of arity at most 2, and such a formula is either \(x=y\) or \(x \ne y\).
We continue by proving the main result. Let \(I=(V,C)\) denote an arbitrary instance of CSP\((\varGamma )\). Let F denote the set of functions from V to \(\mathbf{U}\) and note that \(|F|=|\mathbf{U}|^{|V|}\) is finite since \(\mathbf{U}\) is a finite set by the second statement of Lemma 2. If (V, C) has a solution g, then there exists an \(f \in F\) such that g respects f since \(\mathbf{U}\) is a partitioning of \({\mathbb N}\) by the first statement of Lemma 2. We can thus check the satisfiability of I by applying the algorithm D (which is correct by Lemma 5) to the set of input instances \(\{((V,C),f) \mid f \in F\}\). The time complexity is consequently \(O^{*}((|\mathbf{U}| \cdot \frac{\alpha (\alpha -1)}{2})^{|V|})\). \(\square \)
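The outer loop of this argument can be sketched as follows. The sketch is only illustrative: `solve_by_enumeration` and its parameter names are hypothetical, and the callback `accepts` is assumed to play the role of the call D((V, C), f) for a fixed instance (V, C).

```python
from itertools import product


def solve_by_enumeration(variables, parts, accepts):
    """
    Try every function f : V -> U and return the first one that the
    oracle accepts, or None if every candidate is rejected.

    variables: the variables V of the instance.
    parts:     labels for the elements of the finite partition U.
    accepts:   callable taking a dict f and returning True or False.
    """
    variables = list(variables)
    # product(...) enumerates all |U|^|V| functions from V to U.
    for choice in product(parts, repeat=len(variables)):
        f = dict(zip(variables, choice))
        if accepts(f):
            return f
    return None
```

The loop makes at most \(|\mathbf{U}|^{|V|}\) calls to the oracle, and each call costs \(O^{*}((\frac{\alpha (\alpha -1)}{2})^{|V|})\) time when the oracle is algorithm D, which gives the product appearing in the bound.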
Concluding Remarks
We have studied the fine-grained complexity of infinite-domain equality CSPs, and have proven that this class of problems differs from finite-domain CSPs in almost every way conceivable. Despite the disarray of this complexity landscape, it is possible to outline several concrete future research directions. First, since we know that all finite equality languages can be solved in \(O(c^{|V|})\) time and that there exist infinite equality languages not solvable in \(O(c^{|V|})\) time for any \(c > 1\), is it possible to prove a complete dichotomy separating the equality language CSPs that are solvable in \(O(c^{|V|})\) time from those that are not?
More generally, one may ask the following question: which infinite-domain CSPs are solvable in \(O(c^{|V|})\) time? This is naturally a question that is too broad so it needs to be narrowed down. An interesting starting point is the class of temporal CSPs, i.e., CSPs over first-order reducts of \(({\mathbb {Q}}; <)\). Temporal languages are well-behaved from a model-theoretic viewpoint (they are \(\omega \)-categorical), admit a dichotomy between P and NP-complete, and are always solvable in \(O^{*}(2^{|V| \log |V|})\) time, so one would expect similarities between equality CSPs and temporal CSPs when it comes to fine-grained complexity. Thus, which temporal CSPs are solvable in \(O(c^{|V|})\) time? Despite the aforementioned similarities there are still large differences to equality CSPs. For example, there exists a finite first-order reduct \(\varGamma \) of \(({\mathbb {Q}}; <)\) such that CSP\((\varGamma )\) is not solvable in \(2^{o(|V| \log |V|)}\) time without violating the rETH [19].
Last, we have seen that the class of NP-complete equality CSPs does not admit an “easiest problem” unless the ETH is violated, contrary to satisfiability problems [21] and finite-domain CSPs [22]. This discrepancy stems from the constructions in Sect. 3.3 where we proved that one can construct NP-hard equality CSPs with arbitrarily low fine-grained complexity. Furthermore, we gave an algebraic explanation of this difference, namely the nonexistence of a weak base for the set of all equality relations. Here, it is important to stress that whether a set of relations admits a weak base or not is a purely algebraic property and it can be formulated entirely without mentioning either CSPs or complexity theory. Interestingly, we first gave a conditional proof under the ETH (Proposition 1), and later strengthened this to an unconditional proof (Theorem 14). Furthermore, the conditional proof turned out to be simpler and more straightforward than the algebraic proof. To the best of our knowledge, proofs of algebraic properties under the ETH are exceedingly rare, if not nonexistent, and this raises the question of whether this is an isolated incident, or a fragment of a larger phenomenon.
Notes
For technical reasons \(\varDelta \) contains all unary relations over A.
References
[1] Barrett, C.W., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In: Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability, Frontiers in Artificial Intelligence and Applications, vol. 185, pp. 825–885. IOS Press, Amsterdam (2009)
[2] Barto, L., Pinsker, M.: The algebraic dichotomy conjecture for infinite domain constraint satisfaction problems. In: Proceedings of 31st Annual ACM/IEEE Symposium on Logic in Computer Science (LICS-2016) (2016)
[3] Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability, Frontiers in Artificial Intelligence and Applications, vol. 185. IOS Press (2009)
[4] Bodirsky, M.: Complexity of Infinite-Domain Constraint Satisfaction. Cambridge University Press, Cambridge (2021)
[5] Bodirsky, M., Bodor, B.: Canonical polymorphisms of Ramsey structures and the unique interpolation property. In: Proceedings of 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS-2021), pp. 1–13 (2021)
[6] Bodirsky, M., Grohe, M.: Non-dichotomies in constraint satisfaction complexity. In: Proceedings of 35th International Colloquium on Automata, Languages and Programming (ICALP-2008), pp. 184–196 (2008)
[7] Bodirsky, M., Jonsson, P.: A model-theoretic view on qualitative constraint reasoning. J. Artificial Intelligence Res. 58, 339–385 (2017)
[8] Bodirsky, M., Kára, J.: The complexity of equality constraint languages. Theory Comput. Syst. 43(2), 136–158 (2008)
[9] Bodirsky, M., Kára, J.: The complexity of temporal constraint satisfaction problems. J. ACM 57(2), 9:1–9:41 (2010)
[10] Bodirsky, M., Mottet, A.: A dichotomy for first-order reducts of unary structures. Logical Methods in Computer Science 14(2) (2018)
[11] Bojańczyk, M., Klin, B., Lasota, S.: Automata theory in nominal sets. Logical Methods in Computer Science 10(3) (2014)
[12] Bojańczyk, M., Klin, B., Lasota, S., Toruńczyk, S.: Turing machines with atoms. In: Proceedings of 28th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS-2013), pp. 183–192 (2013)
[13] Bulatov, A.: A dichotomy theorem for nonuniform CSPs. In: Proceedings of 58th Annual Symposium on Foundations of Computer Science (FOCS-2017) (2017)
[14] Burch, J.R., Dill, D.L.: Automatic verification of pipelined microprocessor control. In: Proceedings of 6th International Conference on Computer Aided Verification (CAV-1994), pp. 68–80 (1994)
[15] Dylla, F., Lee, J.H., Mossakowski, T., Schneider, T., Delden, A.V., Ven, J.V.D., Wolter, D.: A survey of qualitative spatial and temporal calculi: Algebraic and computational properties. ACM Comput. Surveys 50(1), 7:1–7:39 (2017)
[16] Freivald, R.V.: A completeness criterion for partial functions of logic and many-valued logic algebras. Sov. Phys. Doklady 11, 288 (1966)
[17] Hodges, W.: A Shorter Model Theory. Cambridge University Press, New York (1997)
[18] Jonsson, P., Lagerkvist, V.: An initial study of time complexity in infinite-domain constraint satisfaction. Artificial Intelligence 245, 115–133 (2017)
[19] Jonsson, P., Lagerkvist, V.: Why are CSPs based on partition schemes computationally hard? In: 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS-2018), pp. 43:1–43:15 (2018)
[20] Jonsson, P., Lagerkvist, V.: Lower bounds and faster algorithms for equality constraints. In: Proceedings of 29th International Joint Conference on Artificial Intelligence (IJCAI-2020), pp. 1784–1790 (2020)
[21] Jonsson, P., Lagerkvist, V., Nordh, G., Zanuttini, B.: Strong partial clones and the time complexity of SAT problems. J. Comput. Syst. Sci. 84, 52–78 (2017)
[22] Jonsson, P., Lagerkvist, V., Roy, B.: Fine-grained time complexity of constraint satisfaction problems. ACM Trans. Comput. Theory 13(1), 2:1–2:32 (2021)
[23] Jonsson, P., Lagerkvist, V., Schmidt, J., Uppman, H.: The exponential-time hypothesis and the relative complexity of optimization and logical reasoning problems. Theor. Comput. Sci. 892, 1–24 (2021)
[24] Klin, B., Lasota, S., Ochremiak, J., Toruńczyk, S.: Turing machines with atoms, constraint satisfaction problems, and descriptive complexity. In: Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (CSL-LICS-2014), pp. 58:1–58:10 (2014)
[25] Krokhin, A., Jeavons, P., Jonsson, P.: Reasoning about temporal relations: The tractable subalgebras of Allen’s interval algebra. J. ACM 50(5), 591–640 (2003)
[26] Lagerkvist, V.: Precise upper and lower bounds for the monotone constraint satisfaction problem. In: Proceedings of the Mathematical Foundations of Computer Science (MFCS-2015), pp. 357–368 (2015)
[27] Lagerkvist, V., Wahlström, M.: Sparsification of SAT and CSP problems via tractable extensions. ACM Trans. Comput. Theory 12(2), 13:1–13:29 (2020)
[28] Lau, D.: Function Algebras on Finite Sets: Basic Course on Many-Valued Logic and Clone Theory. Springer, New York (2006)
[29] Lokshtanov, D., Marx, D., Saurabh, S.: Slightly superexponential parameterized problems. SIAM J. Comput. 47(3), 675–702 (2018)
[30] Niebert, P., Mahfoudh, M., Asarin, E., Bozga, M., Maler, O., Jain, N.: Verification of timed automata via satisfiability checking. In: Proceedings of 7th International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems (FTRTFT-2002), pp. 225–244 (2002)
[31] Renz, J., Nebel, B.: On the complexity of qualitative spatial reasoning: A maximal tractable fragment of the region connection calculus. Artificial Intelligence 108(1–2), 69–123 (1999)
[32] Rodeh, Y., Strichman, O.: Building small equality graphs for deciding equality logic with uninterpreted functions. Inf. Comput. 204(1), 26–59 (2006)
[33] Romov, B.: The algebras of partial functions and their invariants. Cybernetics 17(2), 157–167 (1981)
[34] Romov, B.A.: Extendable local partial clones. Discrete Math. 308(17), 3744–3760 (2008)
[35] Romov, B.A.: Endpoints of associated intervals for local clones on an infinite set. Algebra Universalis 79(4), 82 (2018)
[36] Schnoor, H., Schnoor, I.: Partial polymorphisms and constraint satisfaction problems. In: Creignou, N., Kolaitis, P.G., Vollmer, H. (eds.) Complexity of Constraints. Lecture Notes in Computer Science, vol. 5250, pp. 229–254. Springer, Berlin (2008)
[37] Schutt, A., Stuckey, P.J.: Incremental satisfiability and implication for UTVPI constraints. INFORMS J. Comput. 22(4), 514–527 (2010)
[38] Seshia, S.A., Subramani, K., Bryant, R.E.: On solving Boolean combinations of UTVPI constraints. J. Satisfiability Boolean Model. Comput. 3(1–2), 67–90 (2007)
[39] Traxler, P.: The time complexity of constraint satisfaction. In: Proceedings of 3rd International Workshop on Parameterized and Exact Computation (IWPEC-2008), pp. 190–201 (2008)
[40] Volk, M., Junges, S., Katoen, J.: Fast dynamic fault tree analysis by model checking techniques. IEEE Trans. Industrial Inform. 14(1), 370–379 (2018)
[41] Zhuk, D.: A proof of the CSP dichotomy conjecture. J. ACM 67(5), 30:1–30:78 (2020)
Acknowledgements
The authors are partially supported by the Swedish Research Council (VR) under Grants 2017-04112, 2019-03690, and 2021-04371.
Funding
Open access funding provided by Linköping University.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jonsson, P., Lagerkvist, V. General Lower Bounds and Improved Algorithms for Infinite–Domain CSPs. Algorithmica (2022). https://doi.org/10.1007/s00453-022-01017-8
Keywords
 Constraint satisfaction
 Infinite domains
 Equality languages
 Fine-grained complexity
 Lower bounds