Abstract
We find an orientation of a tree with 20 vertices such that the corresponding fixed-template constraint satisfaction problem (CSP) is NP-complete, and prove that for every orientation of a tree with fewer vertices the corresponding CSP can be solved in polynomial time. We also compute the smallest tree that is NL-hard (assuming L≠NL), the smallest tree that cannot be solved by arc consistency, and the smallest tree that cannot be solved by Datalog. Our experimental results also support a conjecture of Bulín concerning a question of Hell, Nešetřil and Zhu, namely that ‘easy trees lack the ability to count’. Most proofs are computer-based and make use of the most recent universal-algebraic theory about the complexity of finite-domain CSPs. However, further ideas are required because of the huge number of orientations of trees. In particular, we use the well-known fact that it suffices to study orientations of trees that are cores and show how to efficiently decide whether a given orientation of a tree is a core using the arc-consistency procedure. Moreover, we present a method to generate orientations of trees that are cores that works well in practice. In this way we found interesting examples for the open research problem to classify finite-domain CSPs in NL.
1 Introduction
Constraint satisfaction problems (CSPs) with a single binary constraint type have an elegant mathematical formalisation using the language of graph homomorphisms. For a fixed directed graph (short: digraph) \(\mathbb {H}\), the constraint satisfaction problem for \(\mathbb {H}\), denoted by \(\text {CSP}(\mathbb {H})\), is the computational problem of deciding whether a given finite digraph \(\mathbb {G}\) admits a homomorphism to \(\mathbb {H}\). Here, the vertices of the graph \(\mathbb {G}\) play the role of the variables, the vertices of \(\mathbb {H}\) play the role of the values for the variables, the edges of \(\mathbb {G}\) are the constraint scopes, and the edges of \(\mathbb {H}\) define the constraint relation that specifies how the constraints can be satisfied. Homomorphisms from \(\mathbb {G}\) to \(\mathbb {H}\) will also be called solutions for the instance \(\mathbb {G}\) of \(\text {CSP}(\mathbb {H})\). The CSP for \(\mathbb {H}\) is also known as the \(\mathbb {H}\)-coloring problem.
If \(\mathbb {H}\) is finite and symmetric, then \(\text {CSP}(\mathbb {H})\) can be solved in polynomial time if \(\mathbb {H}\) is bipartite or contains a loop, and is NP-complete otherwise [41]. The situation if \(\mathbb {H}\) is a finite but not necessarily symmetric digraph is much more complicated. The Feder-Vardi dichotomy conjecture states that \(\text {CSP}(\mathbb {H})\) is in P or NP-complete [35]. In fact, the conjecture was phrased not only for digraphs but for the corresponding computational problem for general finite relational structures. However, it is known that every CSP for a finite relational structure is polynomial-time (and even logspace) equivalent to the CSP for a finite digraph [23, 35]. The Feder-Vardi conjecture was proved in 2017 independently by Bulatov [20] and by Zhuk [71, 72]. Prior to their breakthrough result, the conjecture was open even if \(\mathbb {H}\) is an orientation of a finite tree.
The description of the polynomially solvable cases in the proofs of Bulatov and of Zhuk is based on the so-called algebraic approach and phrased using polymorphisms of \(\mathbb {H}\), i.e., edge-preserving multivariate operations on the vertex set (‘higher-dimensional symmetries’) [12]. The algebraic condition for polynomial-time tractability in the proofs of Bulatov and of Zhuk has numerous equivalent characterizations, e.g. [6, 62, 68]. Siggers was the first to show the (at the time somewhat surprising) fact that the condition can be characterized by the existence of a single, 6-ary polymorphism satisfying certain identities [68], which can readily be tested at least for very small digraphs. This was later improved by Kearnes, Marković, and McKenzie [55] to the existence of a single 4-ary operation, commonly referred to as a Siggers polymorphism, or a pair of 3-ary operations which we will call Kearnes-Marković-McKenzie polymorphisms (for the definition, see Section 6.1). The latter is computationally the most feasible (the search space is the smallest) and thus the most suitable for our purposes. The question whether a given finite digraph \(\mathbb {H}\) satisfies any of the equivalent characterizations of the algebraic tractability condition is decidable, but NP-hard [25].
1.1 Computational complexity
Several other important conjectures about the computational complexity of the constraint satisfaction problem for a fixed finite structure \(\mathbb {H}\) with finite relational signature remain open: most notably the question for which finite structures \(\mathbb {H}\) the problem \(\text {CSP}(\mathbb {H})\) is in the complexity class NL (nondeterministic logspace), and for which finite structures \(\mathbb {H}\) it is in the complexity class L (deterministic logspace). As in the case of P versus NP-hard, it appears that these questions are closely linked to central dividing lines in universal algebra, as illustrated by the following conjectures.
Conjecture 1 (Larose and Tesson [58])
If the polymorphisms of a finite structure \(\mathbb {H}\) with finite relational signature contain a Kearnes-Kiss chain (defined in Section 6.7), then \(\text {CSP}(\mathbb {H})\) is in NL.
It is known that if \(\mathbb {H}\) does not satisfy the condition from Conjecture 1, then \(\mathbb {H}\) is hard for complexity classes that are not believed to be in NL (more details can be found in Section 6.7). Conjecture 1 is wide open and we believe it to be one of the most difficult research problems in the theory of finite-domain constraint satisfaction that remains open.
Conjecture 2 (Egri, Larose, and Tesson [32])
If the polymorphisms of a finite structure \(\mathbb {H}\) with finite relational signature contain a Noname chain (defined in Section 6.9) then \(\text {CSP}(\mathbb {H})\) is in L.
Also here it is known that if \(\mathbb {H}\) does not satisfy the condition from Conjecture 2, then \(\mathbb {H}\) is hard for complexity classes that are not believed to be in L (more details can be found in Section 6.9).
We mention that both conjectures can equivalently be phrased by the inability to primitively positively construct in \(\mathbb {H}\) certain finite structures that are known to be L-hard, Mod_{p}L-hard, or NL-hard (see Section 6). Kazda proved a conditional result that states that resolving the first conjecture would also provide a solution to the second [52].
Again, these conjectures are already open if \(\mathbb {H}\) is a finite digraph, or even if \(\mathbb {H}\) is an orientation of a finite tree. It is also known that answering the question of containment in NL for finite digraphs would also answer the question for general finite structures [23]. For orientations of finite trees, however, the question might be easier to resolve. For brevity, an orientation of a finite tree is simply called a tree in this paper and we adopt the following terminology: a digraph \(\mathbb {H}\) is NP-hard if \(\text {CSP}(\mathbb {H})\) is NP-hard, and tractable if \(\text {CSP}(\mathbb {H})\) is in P. Similarly, we say that \(\mathbb {H}\) is P-hard, NL-hard, in NL, in L, NP-complete, NL-complete, etc. if \(\text {CSP}(\mathbb {H})\) has that property.
Unfortunately, there is no graph-theoretic characterization of which trees are NP-hard. The first NP-hard tree \(\mathbb {T}\) was found by Gutjahr, Welzl, and Woeginger and had 287 vertices [39]. This was later improved by Gutjahr to a smaller NP-hard tree with 81 vertices [38], and then to an NP-hard tree with just 45 vertices by Hell, Nešetřil, and Zhu [40]. The tree \(\mathbb {T}\) constructed there is even a triad, i.e., a tree with exactly one vertex of degree three and all other vertices of degree one or two. An NP-hard triad with 39 vertices was found by Barto, Kozik, Maróti, and Niven [8, 9] using an in-depth analysis of the polymorphisms of triads; they conjectured that their triad is the smallest NP-hard tree (assuming P≠NP). This approach led to a study of certain classes of trees [4, 22]. Fischer [36] used a computer search and found an NP-hard tree with just 30 vertices (refuting the conjecture of Barto et al. mentioned above). Later, independently, Tatarko constructed a 26-vertex NP-hard triad by manual analysis of polymorphisms. See Table 1.
1.2 Descriptive complexity
Besides the computational complexity of CSPs, the descriptive complexity of CSPs has been studied intensively, and leads to a fruitful interplay of finite model theory, graph theory, and universal algebra. Since the results obtained in this context are highly relevant for the open conjectures mentioned above, we provide a brief introduction to the most prominent concepts. A digraph \(\mathbb {H}\) has tree duality if the following holds for all finite digraphs \(\mathbb {G}\): if all trees that map homomorphically to \(\mathbb {G}\) also map to \(\mathbb {H}\), then \(\mathbb {G}\) maps to \(\mathbb {H}\). It is well known that a finite digraph \(\mathbb {H}\) has tree duality if and only if the so-called arc-consistency procedure solves \(\text {CSP}(\mathbb {H})\) [35]. This procedure is of central importance to our work, for many independent reasons that we mention later, and will be introduced in detail in Section 2.1.
For every finite digraph \(\mathbb {H}\), the arc-consistency procedure for \(\text {CSP}(\mathbb {H})\) can be formulated as a Datalog program [35]; Datalog is the fragment of Prolog where function symbols are forbidden. Every Datalog program can be evaluated in polynomial time. Feder and Vardi proved that \(\text {CSP}(\mathbb {H})\) can be solved by Datalog if and only if \(\mathbb {H}\) has so-called bounded treewidth duality; the definition of this concept is similar to the concept of tree duality but we omit it since it is not needed in this article. Bounded treewidth duality can be strengthened to bounded pathwidth duality, which corresponds precisely to solvability by a natural fragment of Datalog, namely linear Datalog [27]. Linear Datalog programs can be evaluated in NL. An even more restricted fragment of Datalog is linear symmetric Datalog; such programs can be evaluated in L [32].
A structure \(\mathbb {H}\) has the ability to count [59] if \(\text {CSP}(\mathbb {H})\) can encode, in some natural sense (namely pp-constructibility [12, 13]), solving systems of linear equations over \(\mathbb Z_{p}\) (for some prime p); thus making \(\text {CSP}(\mathbb {H})\) Mod_{p}L-hard. The ability to count adds substantial complexity to the CSP. Structures that cannot count (i.e., that lack the ability to count) are all tractable and even in Datalog [5, 18] and this result, known as the bounded width theorem, was an important intermediate step towards the resolution of the Feder-Vardi CSP dichotomy conjecture. Based on this theorem, the lack of the ability to count has a number of equivalent characterizations: bounded width, bounded treewidth duality, definability in Datalog, solvability by Singleton Arc Consistency [56].
Several important classes of structures exhibit a dichotomy between NP-hardness and the lack of the ability to count (assuming P≠NP), which we will refer to as “easy structures cannot count”. Examples include undirected graphs [41], smooth digraphs (digraphs without sources and sinks) [10], conservative digraphs (digraphs expanded with all subsets of vertices as unary relations) [45], and binary conservative structures (even 3-conservative) [53]. We note that this phenomenon also occurs for many large classes of infinite structures \(\mathbb {H}\): for example for all first-order expansions of the basic relations of RCC5 [15]; see [16] for a survey on the question of which infinite-domain CSPs can be solved in Datalog. In this paper, however, we only consider finite structures. Additionally, for classes of finite structures, if the easy structures in the class cannot count, then the algebraic tractability condition for that class can be tested in polynomial time [25].
In [22] Bulín conjectured that ‘easy trees cannot count’ (i.e., only NP-hard trees possess the ability to count) and showed that this conjecture holds for a large yet structurally limited subclass of trees.
Conjecture 3
Let \(\mathbb {T}\) be a tree. If \(\mathbb {T}\) has the ability to count, then \(\mathbb {T}\) is NP-hard.
This conjecture (which is a rephrasing of Conjecture 2 in [22]) would answer an open question posed by Hell, Nešetřil, and Zhu [44] (Open Problem 1 at the end of the article): they asked whether there exists a tractable tree which does not have bounded treewidth duality.
1.3 Contributions
In this article, we continue the research program of systematic, computer-based investigation and classification of the complexity of CSPs by encoding conditions ensuring tractability as constraint problems via the so-called indicator construction [48]. This program was initiated in [37]; our approach is more directly influenced by [25] but builds upon recent developments of the algebraic theory. We obtain the following results.

1. We find 36 NP-hard trees with 20 vertices; moreover, we prove that all smaller trees and all other trees with 20 vertices are tractable.
2. We find four NP-hard triads with 22 vertices, and prove that all smaller triads and all other triads with 22 vertices are tractable.
3. We show that all the trees with at most 20 vertices that are not NP-hard can be solved by Datalog, confirming Conjecture 3 for trees with at most 20 vertices.
4. We find a tree with 19 vertices that cannot be solved by arc consistency, and prove that all smaller trees and all other trees with 19 vertices can be solved by arc consistency.
5. We find 8 NL-hard trees with 12 vertices; moreover, we prove that all smaller trees and all other trees with 12 vertices are in L.
Even though we draw from the results of the universal-algebraic approach to the CSP which led to the theorems of Bulatov and of Zhuk, and use state-of-the-art computers for our computations, these tasks remain challenging due to the huge number of trees: for example, even considered up to isomorphism, there are 139,354,922,608 trees with 20 vertices (see Table 3), which is prohibitive even if we could test the algebraic tractability condition within milliseconds. Several further contributions of this article are related to the way we managed to overcome these difficulties.
A well-known key simplification is to only consider trees that are cores; a digraph \(\mathbb {H}\) is a core if every homomorphism from \(\mathbb {H}\) to \(\mathbb {H}\) is injective. Two graphs \(\mathbb {H}_{1}\) and \(\mathbb {H}_{2}\) are called homomorphically equivalent if there is a homomorphism from \(\mathbb {H}_{1}\) to \(\mathbb {H}_{2}\) and vice versa. Clearly, in that case \(\text {CSP}(\mathbb {H}_{1})=\text {CSP}(\mathbb {H}_{2})\) and so \(\mathbb {H}_{1},\mathbb {H}_{2}\) are either both tractable or both NP-hard. It is easy to see that every finite digraph \(\mathbb {H}\) is homomorphically equivalent to a core digraph \(\mathbb {H}^{\prime }\), which is unique up to isomorphism [42]. Moreover, if \(\mathbb {H}\) is a tree, then \(\mathbb {H}^{\prime }\) is a tree as well (and its size is at most the size of \(\mathbb {H}\)). Hence, it suffices to work with trees that are cores. Hell and Nešetřil proved that deciding whether a given digraph is a core is coNP-complete [42]. However, it follows from a result of Chen and Mengel [26] (Lemma 25) that for trees this problem can be decided in polynomial time. Our next result provides a more efficient algorithm for the same task and, remarkably, seems to have gone unnoticed in the literature.

6. A single execution of the arc-consistency procedure can be used to decide whether a given tree is a core.
There are far too many trees with at most 20 vertices to run the core test on each of them. Our next contribution is a method to generate the core trees more directly, rather than generating all trees and then discarding the non-cores (the details can be found in Section 4). Applying our method we were able to construct all trees that are cores up to size 20, and all triads that are cores up to size 22, which was essential to achieve the results 1. to 5. above.

7. We computed the number of trees that are cores for sizes up to 20 (see Table 3). In particular, there are 779268 core trees of size 20.
These are still too many to be tested for the algebraic tractability condition if this is implemented naively. We therefore use results from the universal algebraic approach to first run more efficient tests for certain sufficient conditions, such as the existence of a binary symmetric polymorphism, and only run the full test for Kearnes-Marković-McKenzie polymorphisms if the simpler conditions all fail; this will be explained in Section 6.
Finally, we identify trees that are important ‘test cases’ for the open problems that have been mentioned earlier.

8. We computed the two smallest trees that are not known to be in NL; they have 16 vertices, and they are the smallest trees that do not have a majority polymorphism.
9. We computed 28 smallest trees that are candidates for failing the condition in Conjecture 1 and hence might be P-hard (and that are thus candidates for not being in NL, unless NL = P); they have 18 vertices.
1.4 Outline of the article
Basic notation and terminology about directed and undirected graphs and homomorphisms is introduced in Section 2. This section also presents a brief description of the arc-consistency procedure, which plays an important role in several of our results. In Section 3 we explain how to use the arc-consistency procedure to efficiently test whether a given tree is a core. In Section 4 we present our method to generate all trees that are cores (directly, without having to discard too many non-cores in the process). We then use these trees to make extensive experiments about their computational and descriptive complexity. For this, we need to introduce important polymorphism conditions and related facts from universal algebra (Sections 5 and 6). Finally, the results of our experiments as announced in Section 1.3 can be found in Section 7.
2 Graphs, digraphs, homomorphisms
For the definition of relational structure we refer to any textbook in mathematical logic; note that we allow the signature of structures to be infinite (but the constraint satisfaction problem is only defined for relational structures with finite relational signature). Since we work most of the time with digraphs, we present the basic definitions only for digraphs; most of them generalize to relational structures in a straightforward way. We use standard terminology for graphs and undirected graphs as introduced e.g. in [31]. All graphs we consider are finite. A digraph is a pair \(\mathbb {H} = (H;E)\) where H is a nonempty set and \(E=E(\mathbb {H}) \subseteq H^{2}\) is a set of (directed) edges. A (simple, undirected) graph is a pair \(\mathbb {H} = (H;E)\) where H is a nonempty set and \(E=E(\mathbb {H}) \subseteq {H \choose 2}\) is a set of two-element subsets of H. An orientation of a graph \(\mathbb {G}\) is a digraph \(\mathbb {O}\) such that O = G, \((x,y) \in E(\mathbb {O})\) implies \(\{x,y\} \in E(\mathbb {G})\), and for every \(\{x,y\} \in E(\mathbb {G})\) either \((x,y) \in E(\mathbb {O})\) or \((y,x) \in E(\mathbb {O})\), but not both. If \(\mathbb {H}\) is a digraph, then the reverse of \(\mathbb {H}\) is the digraph \(\mathbb {H}^{R} = (H;E^{R})\) where E^{R} = {(y,x)∣(x,y) ∈ E}. The operation that obtains \(\mathbb {H}^{R}\) from \(\mathbb {H}\) is called edge reversal.
If \(\mathbb {G}\) and \(\mathbb {H}\) are digraphs, then a homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\) is a map h: G → H such that for all \((x,y) \in E(\mathbb {G})\) we have \((h(x),h(y)) \in E(\mathbb {H})\). We write \(\text {CSP}(\mathbb {H})\) (for constraint satisfaction problem) for the class of all finite digraphs \(\mathbb {G}\) which admit a homomorphism to \(\mathbb {H}\). A homomorphism from \(\mathbb {H}\) to \(\mathbb {H}\) is called an endomorphism of \(\mathbb {H}\). A finite digraph \(\mathbb {H}\) is called a core if all endomorphisms of \(\mathbb {H}\) are injective. It is easy to see that an injective endomorphism of \(\mathbb {H}\) must in fact be bijective and an automorphism, i.e., an isomorphism between \(\mathbb {H}\) and \(\mathbb {H}\). It is easy to see that every finite digraph \(\mathbb {G}\) is homomorphically equivalent to a finite core digraph \(\mathbb {H}\), and that this core digraph is unique up to isomorphism [43], hence \(\mathbb {H}\) will be called the core of \(\mathbb {G}\).
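The definitions of homomorphism and core can be illustrated with a minimal brute-force sketch. It enumerates all maps from G to H, so it is only meant for very small digraphs; the function names and the representation of a digraph as a vertex list plus a list of edge pairs are assumptions made for this illustration.

```python
from itertools import product

def homomorphisms(G_vertices, G_edges, H_vertices, H_edges):
    """Yield every homomorphism from G to H as a dict (exhaustive search)."""
    H_edges = set(H_edges)
    G_vertices = list(G_vertices)
    for images in product(list(H_vertices), repeat=len(G_vertices)):
        h = dict(zip(G_vertices, images))
        # h is a homomorphism if every edge of G is mapped to an edge of H.
        if all((h[x], h[y]) in H_edges for (x, y) in G_edges):
            yield h

def is_core(vertices, edges):
    """A finite digraph is a core if all of its endomorphisms are injective."""
    n = len(set(vertices))
    return all(len(set(h.values())) == n
               for h in homomorphisms(vertices, edges, vertices, edges))

# Directed path 0 -> 1 -> 2: its only endomorphism is the identity.
path = ([0, 1, 2], [(0, 1), (1, 2)])
# Out-star 0 -> 1, 0 -> 2: sending 2 to 1 is a non-injective endomorphism.
star = ([0, 1, 2], [(0, 1), (0, 2)])
```

Here `is_core(*path)` holds while `is_core(*star)` does not; the core of the star is a single edge.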
An undirected tree is a connected undirected graph without cycles. If u,v ∈ T and \(\mathbb {T}\) is an undirected tree, then there exists a unique path \(\mathbb {P}\) from u to v in \(\mathbb {T}\); the number of edges of \(\mathbb {P}\) is denoted by dist(u,v). A vertex v ∈ T is called a center of \(\mathbb {T}\) if v lies in the middle of a longest path in \(\mathbb {T}\). An edge \(e\in E(\mathbb {T})\) is called a bicenter of \(\mathbb {T}\) if e is the middle edge of a longest path in \(\mathbb {T}\). We will use the following classical result.
Theorem 1 (Jordan (1869))
An undirected tree \(\mathbb {T}\) has exactly one center or one bicenter.
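Theorem 1 can be made concrete with a short sketch that computes the center or bicenter of an undirected tree. It uses the standard leaf-stripping technique (repeatedly delete all current leaves) rather than searching for a longest path; this choice, the function name, and the convention that vertices are numbered 0..n-1 are assumptions of the example.

```python
def center_or_bicenter(n, edges):
    """Return the one or two central vertices of an undirected tree on 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(range(n))
    leaves = [v for v in remaining if len(adj[v]) <= 1]
    # Strip all current leaves until at most two vertices are left.
    while len(remaining) > 2:
        next_leaves = []
        for leaf in leaves:
            remaining.discard(leaf)
            for nb in adj[leaf]:
                adj[nb].discard(leaf)
                if len(adj[nb]) == 1:
                    next_leaves.append(nb)
            adj[leaf].clear()
        leaves = next_leaves
    return sorted(remaining)
```

A result with one vertex is the center; a result with two vertices gives the endpoints of the bicenter.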
If \(\mathbb {O}\) is an orientation of a tree and u,v ∈ O, then dist(u,v) (center of \(\mathbb {O}\), bicenter of \(\mathbb {O}\)) is meant with respect to the underlying undirected tree. As mentioned in the introduction, in this article an orientation of a finite tree will simply be called a tree. A digraph \(\mathbb {H}\) is balanced if its vertices can be organized into levels, that is, there exists a function \(\text {lvl} \colon H\to \mathbb N\) such that lvl(v) = lvl(u) + 1 for all \((u,v)\in E(\mathbb {H})\) and the smallest level is 0. The height of \(\mathbb {H}\) is the maximum level. Note that trees are balanced and observe that if \(\mathbb {G}\) and \(\mathbb {H}\) are balanced of the same height, and \(\mathbb {G}\) is connected, then any homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\) must preserve levels, that is, lvl(v) = lvl(h(v)) for all v ∈ G. A rooted tree is a tuple \((\mathbb {T},r)\), where \(\mathbb {T}\) is a tree and r ∈ T; r is then called the root of \(\mathbb {T}\). A rooted tree \((\mathbb {T},r)\) is called a rooted core if every endomorphism of \(\mathbb {T}\) that fixes r is injective. The depth of a rooted tree \((\mathbb {T},r)\) is \(\max \limits \{\text {dist}(r,v)\mid v \in T\}\).
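The level function of a balanced digraph can be computed by a breadth-first traversal that adds 1 along an edge and subtracts 1 against it. The following is a minimal sketch under the assumption that the input digraph is connected; the function name and input representation are chosen for this illustration only.

```python
from collections import deque

def levels(vertices, edges):
    """Return lvl with lvl(v) = lvl(u) + 1 for every edge (u, v) and minimum 0,
    or None if the (connected) digraph is not balanced."""
    vertices = list(vertices)
    nbrs = {v: [] for v in vertices}
    for u, v in edges:
        nbrs[u].append((v, +1))   # following an edge forwards raises the level
        nbrs[v].append((u, -1))   # following it backwards lowers the level
    start = vertices[0]
    lvl = {start: 0}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y, step in nbrs[x]:
            if y not in lvl:
                lvl[y] = lvl[x] + step
                queue.append(y)
            elif lvl[y] != lvl[x] + step:
                return None       # two walks disagree: not balanced
    lowest = min(lvl.values())
    return {v: l - lowest for v, l in lvl.items()}
```

The height is then `max(levels(...).values())`. For a tree the consistency check never fails, whereas a directed 3-cycle is reported as unbalanced.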
2.1 The arc-consistency procedure
One of the most efficient algorithms employed by constraint solvers to reduce the search space is the arc-consistency procedure. In the graph homomorphism literature, the algorithm is sometimes called the consistency check algorithm. The arc-consistency procedure is important for us for several reasons:

It plays a crucial role for efficiently deciding whether a given tree is a core (Section 3).

It is well suited for combination with exhaustive search to prune the search space, and this will be relevant in Section 5.

It is an important fragment of Datalog of independent interest from the point of view of the CSP theory (see Section 6.5), and we will later perform experiments to compute the smallest tree that cannot be solved by arc consistency (Section 7.1.4).
We need to give a short description of the procedure.
Let \(\mathbb {G}\) and \(\mathbb {H}\) be finite digraphs. We would like to determine whether there exists a homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\). The idea of the arc-consistency procedure is to maintain for each x ∈ G a set \(L(x) \subseteq H\). Informally, each element of L(x) represents a candidate for an image of x under a homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\). The algorithm initializes each list L(x) with H and successively removes vertices from these lists; it only removes a vertex u ∈ H from L(x) if there is no homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\) that maps x to u. To detect vertices x,u such that u can be removed from L(x), the algorithm uses two rules (in fact, one rule and a symmetric version of the same rule): if \((x,y) \in E(\mathbb {G})\), then remove u from L(x) if there is no v ∈ L(y) with \((u,v) \in E(\mathbb {H})\), and remove v from L(y) if there is no u ∈ L(x) with \((u,v) \in E(\mathbb {H})\).
If eventually we cannot remove any vertex from any list with these rules any more, the digraph \(\mathbb {G}\) together with the lists for each vertex is called arc-consistent. Note that formally we may view L as a function L: G → 2^{H} from the vertices of \(\mathbb {G}\) to sets of vertices of \(\mathbb {H}\).
Note that we may run the algorithm also on digraphs \(\mathbb {G}\) where for some x ∈ G the list L(x) is already set to some subset of H. In this setting, the input consists of \(\mathbb {G}\) and the given lists, and we are looking for a homomorphism h from \(\mathbb {G}\) to \(\mathbb {H}\) such that for every x ∈ G we have h(x) ∈ L(x). The pseudocode of the entire arc-consistency procedure is displayed in Algorithm 1. The standard arc-consistency procedure \(\text {AC}_{\mathbb {H}}(\mathbb {G})\) is then obtained by calling \(\text {AC}_{\mathbb {H}}(\mathbb {G},L)\) with L(x) := H for all x ∈ G.
Clearly, if the algorithm removes all vertices from one of the lists, then there is no homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\). It follows that if \(\text {AC}_{\mathbb {H}}\) rejects \(\mathbb {G}\), then there is no homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\). The converse implication does not hold in general. For instance, let \(\mathbb {H}\) be the loopless digraph with two vertices and two edges, denoted \(\mathbb {K}_{2}\), and let \(\mathbb {G}\) be \(\mathbb {K}_{3} = (\{0,1,2\};\neq )\). In this case, \(\text {AC}_{\mathbb {H}}\) does not remove any vertex from any list, but obviously there is no homomorphism from \(\mathbb {K}_{3}\) to \(\mathbb {K}_{2}\).
The arc-consistency procedure can be implemented so that it runs in \(O(|E(\mathbb {G})| \cdot |H|^{3})\), e.g. by Mackworth’s AC-3 algorithm [61].
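As an illustration of the procedure described in this section, here is a minimal Python sketch of \(\text {AC}_{\mathbb {H}}(\mathbb {G},L)\). It is a naive fixed-point loop over the edges of G, not Mackworth's AC-3 and not the paper's Algorithm 1, so it does not achieve the running time stated above; the function name and the input representation are assumptions of the example.

```python
def arc_consistency(G_vertices, G_edges, H_vertices, H_edges, lists=None):
    """Run the arc-consistency procedure for the instance G of CSP(H).

    Returns the final lists L as a dict, or None if some list becomes empty
    (the procedure rejects). `lists` optionally prescribes initial lists.
    """
    H_edges = set(H_edges)
    L = {x: set(H_vertices) for x in G_vertices}
    if lists:
        for x, allowed in lists.items():
            L[x] = set(allowed)
    changed = True
    while changed:
        changed = False
        for (x, y) in G_edges:
            # Rule: keep u in L(x) only if some v in L(y) has (u, v) in E(H).
            keep_x = {u for u in L[x] if any((u, v) in H_edges for v in L[y])}
            # Symmetric rule: keep v in L(y) only if some u in L(x) has (u, v) in E(H).
            keep_y = {v for v in L[y] if any((u, v) in H_edges for u in keep_x)}
            if keep_x != L[x] or keep_y != L[y]:
                L[x], L[y] = keep_x, keep_y
                changed = True
            if not L[x] or not L[y]:
                return None
    if any(not values for values in L.values()):
        return None
    return L

# The example from the text: AC does not reject K_3 as an instance of CSP(K_2),
# although there is no homomorphism from K_3 to K_2.
K3 = [(a, b) for a in range(3) for b in range(3) if a != b]
K2 = [(0, 1), (1, 0)]
lists_for_K3 = arc_consistency(range(3), K3, [0, 1], K2)
```

In the example no value is ever removed, so every list stays equal to the full vertex set of \(\mathbb {K}_{2}\).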
3 Cores of trees
Recall that the problem of deciding whether a given digraph is a core is coNP-complete [42]. The following theorem implies that whether a given finite tree is a core can be tested in polynomial time.
Theorem 2
Let \(\mathbb {T}\) be a finite tree. Then the following are equivalent.

1. \(\mathbb {T}\) is a core;
2. \(\text {End}(\mathbb {T}) = \{\text {id}_{T}\}\);
3. \(\text {AC}_{\mathbb {T}}(\mathbb {T})\) terminates such that the list for each vertex contains a single element.
Corollary 3
There is a polynomial-time algorithm to decide whether a given finite tree is a core.
We first prove the following two useful lemmata.
Lemma 4
Let \(\mathbb {T}\) be a finite tree and let \(\mathbb {H}\) be a finite digraph such that \(\text {AC}_{\mathbb {H}}(\mathbb {T})\) does not reject. Let t ∈ T, and let a ∈ H be such that a ∈ L(t) after running \(\text {AC}_{\mathbb {H}}(\mathbb {T})\). Then there is a homomorphism \(h\colon \mathbb {T}\to \mathbb {H}\) such that h(t) = a.
Proof
Let S be a maximal subtree of \(\mathbb {T}\) such that t ∈ S and there exists a partial homomorphism \(h\colon S\to \mathbb {H}\) with h(t) = a. If S≠T, then there exists x ∈ S and y ∈ T ∖ S such that either (x,y) or (y,x) is an edge in \(\mathbb {T}\); without loss of generality assume that \((x,y)\in E(\mathbb {T})\). Because the value u = h(x) was not removed from L(x) when running \(\text {AC}_{\mathbb {H}}(\mathbb {T})\), it follows that there exists v ∈ L(y) such that \((u,v)\in E(\mathbb {H})\). But then setting h(y) = v extends h to a partial homomorphism from S ∪{y} to \(\mathbb {H}\) contradicting maximality of the subtree S. □
Lemma 5
Let \((\mathbb {T},r)\) be a rooted tree with an automorphism that is not the identity. Then \((\mathbb {T},r)\) has a noninjective endomorphism.
Proof
Let h be an automorphism of \((\mathbb {T},r)\) that is not the identity. We prove the statement by induction on the number of vertices of \(\mathbb {T}\). Consider the components of the graph obtained from \(\mathbb {T}\) by deleting r. If there is a component C such that h does not map C into itself, then the mapping which agrees with h on C and which fixes all other vertices of \(\mathbb {T}\) is a noninjective endomorphism of \((\mathbb {T},r)\).
If each component C is mapped by h into itself, then each h_{C} is an automorphism of (C,r_{C}), where r_{C} is the unique neighbor of r that lies in C. Since h is not id_{T} there must be some C such that h_{C} is not id_{C} and by the induction hypothesis there exists a noninjective endomorphism \(h^{\prime }\) of (C,r_{C}). Since \(h^{\prime }(r_{C})=r_{C}\) the mapping which extends \(h^{\prime }\) to T by fixing all other vertices of \(\mathbb {T}\) is a noninjective endomorphism of \((\mathbb {T},r)\). □
Proof of Theorem 2
We prove the equivalence of 1. and 2., and then the equivalence of 2. and 3.
Clearly, 2. implies 1. Conversely, suppose that \(\mathbb {T}\) has an endomorphism h which is not the identity map. If h is not injective, then \(\mathbb {T}\) is not a core and we are done. Hence, suppose that h is an automorphism. Note that by Theorem 1, if \(\mathbb {T}\) has a center c ∈ T, then h(c) = c and if \(\mathbb {T}\) has a bicenter \((x,y) \in E(\mathbb {T})\), then h({x,y}) = {x,y}. In the latter case, since \((y,x)\notin E(\mathbb {T})\), we must have h(x) = x and h(y) = y. In both cases h has a fixed point r and h is an automorphism of \((\mathbb {T},r)\). By Lemma 5, \(\mathbb {T}\) has a noninjective endomorphism and is therefore not a core.
To prove that 2. implies 3., we prove the contrapositive. Suppose that \(\text {AC}_{\mathbb {T}}(\mathbb {T})\) terminates with |L(x)| > 1 for some x ∈ T. Then there exists y ∈ L(x) with y≠x. Lemma 4 implies that there is an endomorphism h of \(\mathbb {T}\) such that h(x) = y and thus h≠id_{T}.
To see that 3. implies 2., note that x is never removed from L(x) because of the identity endomorphism id_{T}; since each list contains a single element, L(x) = {x} for every x ∈ T. Therefore, for any endomorphism \(h\colon \mathbb {T}\to \mathbb {T}\) and any x ∈ T we must have h(x) ∈ L(x), and so h(x) = x and h = id_{T}. □
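The core test given by item 3 of Theorem 2 can be sketched as follows. The block contains a compact, naive arc-consistency loop specialized to the instance T of CSP(T) (again not AC-3); the function names are assumptions of this illustration, and the test is only justified for inputs that are orientations of trees.

```python
def ac_lists(vertices, edges):
    """Arc consistency for the instance T of CSP(T); returns the final lists.
    The procedure never rejects here because of the identity map."""
    E = set(edges)
    L = {x: set(vertices) for x in vertices}
    changed = True
    while changed:
        changed = False
        for (x, y) in edges:
            new_x = {u for u in L[x] if any((u, v) in E for v in L[y])}
            new_y = {v for v in L[y] if any((u, v) in E for u in new_x)}
            if (new_x, new_y) != (L[x], L[y]):
                L[x], L[y] = new_x, new_y
                changed = True
    return L

def tree_is_core(vertices, edges):
    """Theorem 2, item 3: a tree is a core iff every final list is a singleton."""
    return all(len(values) == 1 for values in ac_lists(vertices, edges).values())

# Directed path 0 -> 1 -> 2 is a core; the zigzag 0 -> 1 <- 2 -> 3 is not,
# since it maps onto a single edge.
path_is_core = tree_is_core([0, 1, 2], [(0, 1), (1, 2)])
zigzag_is_core = tree_is_core([0, 1, 2, 3], [(0, 1), (2, 1), (2, 3)])
```

For the zigzag, the final list of vertex 0 still contains vertex 2, which by Lemma 4 witnesses an endomorphism other than the identity.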
4 Generating all core trees
In this section we present an algorithm to generate all core trees with n vertices up to isomorphism. To this end, we first present a known algorithm that generates all trees with n vertices up to isomorphism [70]. Later we explain how to modify this algorithm to directly generate core trees.
We also refer to the isomorphism classes of trees as unlabeled trees, as opposed to labeled trees, which are trees with vertex set \(\{1,\dots ,n\}\) for some \(n \in {\mathbb {N}}\). The difference between the enumeration of labeled and unlabeled trees is significant: while the number of labeled trees is Sloane’s integer sequence A097629, given by 2(2n)^{n− 2}, the number of unlabeled trees is Sloane’s integer sequence A000238, which grows asymptotically as cd^{n}/n^{5/2} where d ≈ 5.6465 and c ≈ 0.2257 are constants; the initial terms are shown in Table 3. However, these numbers are still too large to apply the core test to all the unlabeled trees separately. The number of unlabeled trees that are cores is again much smaller. We therefore present a modification of the generation algorithms that allows us to generate unlabeled trees that are cores directly without enumerating all unlabeled trees.
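The two counting formulas can be checked numerically with a small sketch. The decomposition used in `via_cayley` (Cayley's formula n^{n-2} for labeled undirected trees, times 2^{n-1} orientations of the n-1 edges) is a standard fact added here for illustration and is not stated in the text; the function names are likewise assumptions.

```python
def labeled_oriented_trees(n):
    """Number of labeled trees (orientations) on n >= 2 vertices: 2 * (2n)^(n-2)."""
    return 2 * (2 * n) ** (n - 2)

def via_cayley(n):
    """Same count: n^(n-2) labeled undirected trees, each with 2^(n-1) orientations."""
    return n ** (n - 2) * 2 ** (n - 1)

def unlabeled_estimate(n, c=0.2257, d=5.6465):
    """Asymptotic estimate c * d^n / n^(5/2) for the number of unlabeled trees."""
    return c * d ** n / n ** 2.5

# Compare the asymptotic estimate with the exact count for n = 20 quoted in
# Section 1.3 (139,354,922,608 trees up to isomorphism).
ratio_at_20 = unlabeled_estimate(20) / 139_354_922_608
```

For n = 3, both formulas give 12: three labeled undirected trees, each with four orientations. The ratio for n = 20 is close to 1, as expected from an asymptotic formula evaluated with rounded constants.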
Let ≥ be some total order on all rooted trees that linearly extends the order by depth. The idea of the algorithm is to generate all unlabeled rooted trees with at most n − 1 vertices and then use Theorem 1.
It is easy to verify that Algorithm 3 produces all unlabeled rooted trees with n vertices and depth d. Analogously, Algorithm 2 generates all unlabeled trees with n vertices. Remarkably, there are no isomorphism checks necessary and Algorithm 2 runs in linear time in the number of unlabeled trees with n vertices plus the number of unlabeled rooted trees with at most n − 1 vertices.
Let us make some observations. Let \((\mathbb {T}_{1},r_{1})\geq \dots \geq (\mathbb {T}_{m},r_{m})\) be rooted trees, s ∈{0,1}^{m}, \(T := \{r\}\uplus T_{1}\uplus \dots \uplus T_{m}\), and \(E := \{(r_{i},r)\mid s_{i}=1\}\uplus \{(r,r_{i})\mid s_{i}=0\}\uplus E(\mathbb {T}_{1})\uplus \dots \uplus E(\mathbb {T}_{m})\).

A rooted tree \((\mathbb {T},r)\) is a rooted core if and only if \(\text {AC}_{\mathbb {T}}(\mathbb {T},L)\), where L(r) = {r} and L(x) = T for x ∈ T ∖{r}, terminates such that the list for each vertex contains a single element.

By Corollary 3, whether a (rooted) tree is a (rooted) core can be checked in polynomial time using the arc-consistency procedure.

If \((\mathbb {T},r)\) is a rooted core, then \((\mathbb {T}_{i},r_{i})\) is a rooted core for every i.

If \(\mathbb {T}\) is a core and r is its center, then \((\mathbb {T}_{i},r_{i})\) is a rooted core for every i.

If \((T_{1}\uplus T_{2};\{(r_{1},r_{2})\}\uplus E(\mathbb {T}_{1})\uplus E(\mathbb {T}_{2}))\) is a core and (r_{1},r_{2}) is its bicenter, then \((\mathbb {T}_{1},r_{1})\) and \((\mathbb {T}_{2},r_{2})\) are rooted cores.

If two trees that are cores are homomorphically equivalent, then they are isomorphic.
To generate all oriented trees that are cores we slightly modify both algorithms. In both functions we only add trees to the output if they are cores or rooted cores, respectively. By the above observations, these modified algorithms generate each tree with n vertices that is a core exactly once. We do not know whether our algorithm is a polynomial-delay generation procedure for unlabeled core trees. In practice, it is fast enough to generate all core trees with at most 20 vertices within reasonable time (see Section 7).
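The rooted-core test from the observations above can be sketched as follows. This is a naive illustration, not the authors' implementation: digraphs are assumed to be given as sets of edge pairs, the names `arc_consistency` and `is_rooted_core` are ours, and the propagation loop is a simple fixed-point iteration rather than an optimized AC algorithm.

```python
def arc_consistency(g_edges, h_edges, lists):
    """Prune lists[x] until every edge (x, y) of G is supported by an edge of H.
    Returns the pruned lists, or None if some list becomes empty."""
    lists = {x: set(vals) for x, vals in lists.items()}
    changed = True
    while changed:
        changed = False
        for x, y in g_edges:
            # keep a in L(x) only if some b in L(y) satisfies (a, b) in E(H)
            new_x = {a for a in lists[x] if any((a, b) in h_edges for b in lists[y])}
            if new_x != lists[x]:
                lists[x], changed = new_x, True
            # keep b in L(y) only if some a in L(x) satisfies (a, b) in E(H)
            new_y = {b for b in lists[y] if any((a, b) in h_edges for a in lists[x])}
            if new_y != lists[y]:
                lists[y], changed = new_y, True
            if not lists[x] or not lists[y]:
                return None
    return lists


def is_rooted_core(vertices, edges, root):
    """Run AC of the tree against itself with L(root) = {root}, L(x) = T otherwise,
    and check that every list ends up as a singleton."""
    lists = {x: set(vertices) for x in vertices}
    lists[root] = {root}
    result = arc_consistency(edges, edges, lists)
    return result is not None and all(len(vals) == 1 for vals in result.values())
```

For example, the directed path 0 → 1 → 2 rooted at 0 passes the test, whereas a root with two out-neighbours that are leaves does not, since either leaf can be mapped to the other.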
5 Polymorphism conditions
In this section we introduce basic facts from the universal-algebraic approach that are essential for obtaining our results. If \(\mathbb {H}\) is a digraph and k ≥ 1 is an integer, then \(\mathbb {H}^{k}\) denotes the k-th categorical power of \(\mathbb {H}\), i.e., the digraph with vertex set H^{k} and edge set \(\{((u_{1},\dots ,u_{k}),(v_{1},\dots ,v_{k})) \mid (u_{i},v_{i}) \in E(\mathbb {H}) \text { for all } i \in \{1,\dots ,k\}\}\).
A polymorphism of \(\mathbb {H}\) is a homomorphism from \(\mathbb {H}^{k}\) to \(\mathbb {H}\), for some k ≥ 1. Clearly, every projection, i.e., every operation of the form \((x_{1},\dots ,x_{k}) \mapsto x_{i}\), for some fixed i ≤ k, is a polymorphism for every digraph \(\mathbb {H}\). Of particular interest to us will be polymorphisms that satisfy certain sets of identities, introduced in Section 5.2. The connection between such identities and computational complexity is described in Section 5.1.
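To make the definition concrete, here is a small sketch that builds the edge set of the categorical power and checks whether a given operation is a polymorphism. The representation (a digraph as a set of edge pairs, an operation as a Python function) and the names `power_edges` and `is_polymorphism` are assumptions made for illustration only.

```python
from itertools import product


def power_edges(edges, k):
    """Edges of H^k: pairs of k-tuples that form an edge of H in every coordinate."""
    return {
        (tuple(u for u, _ in combo), tuple(v for _, v in combo))
        for combo in product(edges, repeat=k)
    }


def is_polymorphism(f, edges, k):
    """f is a k-ary polymorphism iff it maps every edge of H^k to an edge of H."""
    return all((f(*u), f(*v)) in edges for u, v in power_edges(edges, k))


# Directed path 0 -> 1 -> 2
path = {(0, 1), (1, 2)}
assert is_polymorphism(lambda x, y: x, path, 2)      # a projection
assert is_polymorphism(min, path, 2)                 # also symmetric
assert not is_polymorphism(lambda x, y: 0, path, 2)  # constant map: no loop at 0
```

Enumerating all of H^k is only feasible for small k, which is one reason why low-arity conditions are preferred later on.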
5.1 Primitive positive constructions
Primitive positive definitions are a natural type of gadget construction which can be used to obtain logspace reductions between CSPs (see, e.g., Lemma 1.2.6 in [14]). Central to the algebraic theory of the CSP is the fact that the possibility of such an encoding can be determined from the polymorphisms of the respective templates. Let \(\mathbb {A} = (A;R_{1},\dots ,R_{n})\) be a relational structure. A relation \(S\subseteq A^{n}\) is primitive positive (pp) definable from \(\mathbb {A}\) if it can be defined (without parameters) by a first-order formula which only uses the predicate symbols \(R_{1},\dots ,R_{n}\), the equality predicate, conjunction, and existential quantification.
Primitive positive constructions are a more powerful generalization of primitive positive definitions. A relational structure \(\mathbb {B}=(B;S_{1},\dots ,S_{m})\) is pp-constructible from \(\mathbb {A}\) if there exist k > 0 and a structure \(\mathbb {B}^{\prime }=(B^{\prime };S^{\prime }_{1},\dots ,S^{\prime }_{m})\) which is homomorphically equivalent to \(\mathbb {B}\) such that \(B^{\prime }=A^{k}\) and for every \(i\in \{1,\dots ,m\}\) the relation \(S^{\prime }_{i}\subseteq {B^{\prime }}^{r_{i}}\), when considered as a kr_{i}-ary relation on A, is pp-definable from \(\mathbb {A}\). In this case there is a logspace reduction from \(\text {CSP}(\mathbb {B})\) to \(\text {CSP}(\mathbb {A})\) (see, e.g., [13, 21]).
If \(\mathbb {A}\) and \(\mathbb {B}\) are finite structures, then \(\mathbb {B}\) is pp-constructible from \(\mathbb {A}\) if and only if \(\mathbb {B}\) satisfies every height-one condition satisfied by \(\mathbb {A}\); these concepts will be introduced in the next section. This algebraic characterization of pp-constructibility was shown in [13, Theorem 1.3, Corollary 4.7] (see also [12, Theorem 38, Corollary 20]).
5.2 Linear conditions and height-one conditions
Height-one conditions and linear conditions are particular types of strong Maltsev condition [54] that are essential for the algebraic approach to CSPs. If f is a function symbol of arity k and h is a function symbol of arity ℓ, and \(\sigma \colon \{1,\dots ,k\} \to \{1,\dots ,n\}\) and \(\rho \colon \{1,\dots ,\ell \} \to \{1,\dots ,n\}\) are functions, then an expression of the form \(f(x_{\sigma (1)},\dots ,x_{\sigma (k)}) \approx h(x_{\rho (1)},\dots ,x_{\rho (\ell )})\)
is called a height-one identity. For the purposes of this paper, a finite set of height-one identities will be called a height-one condition. Height-one conditions are important because of the mentioned tight link with pp-constructibility (Section 5.1). More generally, an identity is linear if each side has one or zero occurrences of function symbols, i.e., it is either height-one, or of the form \(f(x_{\sigma (1)},\dots ,x_{\sigma (k)}) \approx x_{j}\), \(x_{i}\approx g(x_{\rho (1)},\dots ,x_{\rho (\ell )})\), or x_{i} ≈ x_{j}. Then a linear condition is a finite set of linear identities.
A set of operations \(\mathcal F\) on some domain D satisfies a linear condition Σ if we can interpret every function symbol appearing in Σ as an operation from \(\mathcal F\) so that for every identity in Σ, the left-hand side of the identity and the right-hand side of the identity evaluate to the same element under all possible substitutions of variables by elements of D. We say that a structure \(\mathbb {A}\) satisfies a linear condition if the set of all polymorphisms of \(\mathbb {A}\) satisfies it.
An example of a height-one condition consisting of a single height-one identity involving a single binary function symbol f is \(f(x_{1},x_{2}) \approx f(x_{2},x_{1})\),
which is satisfied by \(\mathbb {A}\) if and only if \(\mathbb {A}\) has a binary symmetric polymorphism. Since \(x_{1},x_{2},x_{3},\dots \) are just variable names we sometimes use x,y,z, etc. instead.
An operation f : D^{k} → D is called idempotent if it satisfies the identity \(f(x,\dots ,x) \approx x\). Note that this identity is linear but not height-one.
Remark 6
It is well known and easy to see that the polymorphisms of a finite core digraph satisfy a height-one condition if and only if its idempotent polymorphisms satisfy the condition: if f is a polymorphism of a core digraph \(\mathbb {H}\), then \(f(x,\dots ,x)\) must be an automorphism of \(\mathbb {H}\); let i be its inverse. Then \((x_{1},\dots ,x_{n}) \mapsto i(f(x_{1},\dots ,x_{n}))\) is an idempotent polymorphism of \(\mathbb {H}\) that satisfies the same height-one identities as f. The argument for conditions involving more than one height-one identity is analogous.
Linear conditions have been introduced and studied first, in particular motivated by the fact that idempotence of operations can only be expressed in linear conditions, but not in height-one conditions, and that idempotence plays a central role in many classical areas of universal algebra. Also note that in the setting of finite-domain CSPs we may use the idempotence assumption for free because of Remark 6 and the fact that the core of a structure has an equivalent CSP (indeed, the set of YES instances is exactly the same) and satisfies the same height-one identities (which is not true for linear conditions in general).
5.3 The indicator construction
The question whether a given digraph \(\mathbb {H}\) has polymorphisms that satisfy a given height-one or even linear condition can be tested algorithmically; this is well known, see, e.g., [25]. To illustrate the idea, suppose that the given set consists of a single identity, namely f(x,y) ≈ f(y,x). We then compute \(\mathbb {H}^{2}\) and identify every vertex of \(\mathbb {H}^{2}\) of the form (x,y) with the vertex (y,x). The resulting digraph \({\mathbb {H}^{\text {Ind}}}\) will be called the indicator digraph for the height-one condition. We finally search for a homomorphism from \(\mathbb {H}^{\text {Ind}}\) to \(\mathbb {H}\). Note that \(\mathbb {H}^{\text {Ind}}\) may be viewed as an instance of \(\text {CSP}(\mathbb {H})\), and that the homomorphisms from \(\mathbb {H}^{\text {Ind}}\) to \(\mathbb {H}\) are in 1-1 correspondence with the binary symmetric polymorphisms of \(\mathbb {H}\) (see Definition 9).
Analogously we may proceed for any other height-one condition: to compute \(\mathbb {H}^{\text {Ind}}\), we construct for each function symbol the categorical power of \(\mathbb {H}\) of the corresponding arity, take their disjoint union, and then identify vertices as dictated by the identities. Clearly, the size of the indicator digraph grows exponentially with the arity of the function symbols in the condition and linearly with the number of function symbols in the condition, so we generally prefer conditions where the function symbols are of low arity even if the number of function symbols is large.
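For the single identity f(x,y) ≈ f(y,x), the indicator construction can be sketched as below. The sketch is illustrative only: the sorted pair is our choice of representative for the merged vertices (x,y) and (y,x), the helper names are hypothetical, and the homomorphism search is plain brute force, which is only sensible for tiny digraphs.

```python
from itertools import product


def symmetric_indicator(vertices, edges):
    """Indicator digraph for f(x, y) ≈ f(y, x): H^2 with (x, y) and (y, x) merged."""
    def rep(pair):
        return tuple(sorted(pair))  # canonical representative of {(x, y), (y, x)}

    ind_vertices = {rep(p) for p in product(vertices, repeat=2)}
    ind_edges = {
        (rep((u1, u2)), rep((v1, v2)))
        for (u1, v1) in edges
        for (u2, v2) in edges
    }
    return ind_vertices, ind_edges


def find_homomorphism(g_vertices, g_edges, h_vertices, h_edges):
    """Try every map V(G) -> V(H); return the first homomorphism found, else None."""
    g_vertices = sorted(g_vertices)
    for values in product(sorted(h_vertices), repeat=len(g_vertices)):
        h = dict(zip(g_vertices, values))
        if all((h[x], h[y]) in h_edges for x, y in g_edges):
            return h
    return None
```

On the directed path 0 → 1 → 2 a homomorphism exists, so the path has a binary symmetric polymorphism. On the directed 2-cycle the merged vertex {0,1} carries a loop, so no homomorphism exists.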
Linear conditions can be tested in the following way. Note that the left- and right-hand sides of identities can be switched, and that identities of the form x_{i} ≈ x_{j} are only satisfied in one-element structures. Therefore, we may assume that every identity is either height-one or of the form \(f(x_{\sigma (1)},\dots ,x_{\sigma (k)})\approx x_{j}\). First, construct the indicator digraph \(\mathbb {H}^{\text {Ind}}\) using only the height-one identities when identifying vertices. Then, for every identity that is not height-one, find every vertex of \(\mathbb {H}^{\text {Ind}}\) that comes from a tuple of vertices of \(\mathbb {H}\) matching the left-hand side and set its value to the vertex of \(\mathbb {H}\) given by the right-hand side. For example, if the identity is f(x,y,x) ≈ x, we require that (a,b,a) ∈ H^{Ind} must be mapped to a, for every a,b ∈ H. In this way, we obtain an instance of the \(\mathbb {H}\)-precoloring extension problem (see also the discussion in Section 8). It is well known that for cores, this problem is logspace-equivalent to \(\text {CSP}(\mathbb {H})\) [13, 21]. Moreover, it is particularly easy to implement within the arc-consistency procedure, see the next section. For balanced digraphs, another important improvement (based on the decomposition into levels) that can be applied for many height-one conditions is described in Section 5.5 below.
5.4 Arc consistency with exhaustive search
To test for the existence of polymorphisms satisfying a given linear condition, we run the arc-consistency procedure for \(\mathbb {H}\) on the indicator digraph \(\mathbb {H}^{\text {Ind}}\) and then perform an exhaustive search. While this procedure is not (provably) in P, it is very efficient in practice.
We initialize the lists for vertices of \(\mathbb {H}^{\text {Ind}}\) with preset values dictated by non-height-one identities, as explained above. Additionally, for every u ∈ H, we initialize the list for every vertex of \(\mathbb {H}^{\text {Ind}}\) of the form \((u,\dots ,u)\) with {u} (since it suffices to look for idempotent polymorphisms as we have explained in Remark 6 and this reduces the search space). For the remaining vertices of \(\mathbb {H}^{\text {Ind}}\), the lists are initialized to H.
If \(\text {AC}_{\mathbb {H}}\) detects an inconsistency, we can be sure that no polymorphisms satisfying the linear condition exist. Otherwise, we select some vertex x ∈ H^{Ind}, and set L(x) to {u} for some u ∈ L(x). Then we proceed recursively with the resulting lists. If \(\text {AC}_{\mathbb {H}}\) now detects an empty list, we backtrack, but remove u from L(x). Finally, if the algorithm does not detect an empty list at the first level of the recursion, we end up with singleton lists for each vertex x ∈ H^{Ind}, which defines a homomorphism from \(\mathbb {H}^{\text {Ind}}\) to \(\mathbb {H}\). The restriction of this homomorphism to the vertices of \(\mathbb {H}^{\text {Ind}}\) for a specific function symbol can then be interpreted as (the function table of) a polymorphism of \(\mathbb {H}\), and these polymorphisms satisfy the linear condition.
There are numerous heuristics that often help to speed up this backtracking procedure. One of the best known is called Maintaining Arc Consistency (MAC) [66]. This family of algorithms has the arc-consistency procedure at its core and takes advantage of the incremental design of the backtracking procedure by maintaining data structures which help to reduce the number of consistency checks. Another common way to speed up the search procedure is to choose the vertex \(x \in \mathbb {H}^{\text {Ind}}\) that has a list of smallest size.
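The interplay of arc consistency and exhaustive search described in this section can be sketched as follows. This is a simplified illustration under our own assumptions (digraphs as sets of edge pairs, hypothetical names `enforce_ac` and `solve`, sets instead of linked lists); it re-runs a naive arc-consistency pass at every branch instead of maintaining the incremental data structures of a real MAC implementation, and it iterates over the candidate values of the chosen vertex rather than explicitly removing failed values from its list.

```python
def enforce_ac(g_edges, h_edges, lists):
    """Naive arc consistency; returns pruned lists or None on an empty list."""
    lists = {x: set(vals) for x, vals in lists.items()}
    changed = True
    while changed:
        changed = False
        for x, y in g_edges:
            new_x = {a for a in lists[x] if any((a, b) in h_edges for b in lists[y])}
            if new_x != lists[x]:
                lists[x], changed = new_x, True
            new_y = {b for b in lists[y] if any((a, b) in h_edges for a in lists[x])}
            if new_y != lists[y]:
                lists[y], changed = new_y, True
            if not lists[x] or not lists[y]:
                return None
    if any(not vals for vals in lists.values()):
        return None
    return lists


def solve(g_edges, h_edges, lists):
    """Return a homomorphism G -> H that respects the lists, or None."""
    lists = enforce_ac(g_edges, h_edges, lists)
    if lists is None:
        return None
    open_vertices = [x for x in lists if len(lists[x]) > 1]
    if not open_vertices:
        return {x: next(iter(vals)) for x, vals in lists.items()}
    # heuristic from the text: branch on a vertex with a smallest list
    x = min(open_vertices, key=lambda v: len(lists[v]))
    for u in sorted(lists[x]):
        trial = dict(lists)
        trial[x] = {u}
        solution = solve(g_edges, h_edges, trial)
        if solution is not None:
            return solution
    return None
```

When all lists are singletons after arc consistency, every edge of G is supported by the unique remaining values, so the lists indeed define a homomorphism.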
5.5 Levelwise satisfiability
If \(\mathbb {H}\) is a balanced digraph (in particular, a tree), the test from the previous section can sometimes be significantly simplified. We say that a linear condition is levelwise satisfied if we can interpret the function symbols as polymorphisms of \(\mathbb {H}\) in such a way that for every level in \(\mathbb {H}\), the identities are satisfied under all evaluations of variables by vertices from that level.
When testing whether a linear condition is levelwise satisfied, we do not need to construct the full indicator digraph. Instead, for every function symbol (say of arity k) we construct only the subgraph of \(\mathbb {H}^{k}\) consisting of all same-level k-tuples (i.e., tuples in which all vertices are from the same level). Note that, since \(\mathbb {H}\) is balanced, this subgraph is a union of connected components of \(\mathbb {H}^{k}\) and that polymorphisms can be defined to be the first projection on the remaining connected components of \(\mathbb {H}^{k}\).
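A possible way to compute the same-level tuples is sketched below. We assume here that the level function satisfies lvl(v) = lvl(u) + 1 for every edge (u,v) and is normalized so that the smallest level is 0, and that the digraph is connected and balanced (for instance an oriented tree); the function names are hypothetical.

```python
from collections import deque
from itertools import product


def levels(vertices, edges):
    """Level of each vertex of a connected balanced digraph, minimum level 0."""
    nbrs = {v: [] for v in vertices}
    for u, v in edges:
        nbrs[u].append((v, +1))  # following an edge forwards raises the level
        nbrs[v].append((u, -1))  # following it backwards lowers the level
    start = next(iter(vertices))
    lvl = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, step in nbrs[u]:
            if v not in lvl:
                lvl[v] = lvl[u] + step
                queue.append(v)
    low = min(lvl.values())
    return {v: value - low for v, value in lvl.items()}


def same_level_tuples(vertices, edges, k):
    """All k-tuples whose entries lie on one common level."""
    lvl = levels(vertices, edges)
    return [t for t in product(sorted(vertices), repeat=k)
            if len({lvl[x] for x in t}) == 1]
```

For the oriented path with edges 0 → 1, 2 → 1, 2 → 3 there are two levels with two vertices each, so only 8 of the 16 pairs are same-level pairs; the saving grows quickly with the arity k.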
While we do not have a general construction, for many linear conditions relevant to the complexity of the CSP we can show that if a linear condition is levelwise satisfied in \(\mathbb {H}\), then it is satisfied in \(\mathbb {H}\). The idea is to start with polymorphisms satisfying the identities levelwise, and then redefine those polymorphisms for tuples of vertices that are not all on the same level, in such a way as to satisfy the identities. We will introduce several such concrete constructions in the next section. Similar constructions have appeared in [4, 8, 22, 23]. This optimization is particularly useful when testing the condition TS(n) for all n; see Section 7.1.4.
6 Specific polymorphism conditions
In this section we focus on certain concrete linear conditions that are relevant for studying the membership of CSPs in the most prominent complexity classes in the subsequent sections. An overview of the classes and the respective linear conditions is given in Fig. 1. Solid arrows indicate implications, dotted arrows indicate conjectures. Figure 2 shows the relationships between relevant linear polymorphism conditions that are defined throughout the section. The left side shows the general case of finite-domain structures with possibly infinite signatures and the right side shows the case for trees assuming Conjecture 3 (and P≠NP). The implications are either immediate or from the literature [46] (Chapter 9), [3, 11, 55].
6.1 Containment in P
As discussed above, the characterization of the algebraic condition for tractability which is the most suitable for testing with a computer consists of a pair of ternary operations [55].
Definition 7
A pair of ternary operations p,q: D^{3} → D is called Kearnes-Marković-McKenzie if it satisfies the height-one condition \(p(x,y,y) \approx q(y,x,x) \approx q(x,x,y)\), \(p(x,y,x) \approx q(x,y,x)\).
The respective height-one condition is abbreviated by KMM.
Using this characterization, the CSP dichotomy can be stated as follows.
Theorem 8 (20, 55, 72)
A finite digraph \(\mathbb {H}\) has Kearnes-Marković-McKenzie polymorphisms if and only if there is no pp-construction of \(\mathbb {K}_{3}\) from \(\mathbb {H}\). In this case, \(\text {CSP}(\mathbb {H})\) is in P.
This characterization is optimal in the following sense: every height-one condition equivalent to Kearnes-Marković-McKenzie polymorphisms involves either an operation of arity at least 4 or at least two operations of arity 3 [55]. However, there are several height-one conditions that imply the existence of Kearnes-Marković-McKenzie polymorphisms and that are easier to test. In particular, we use the following.
Definition 9
An operation f of arity k, for k ≥ 2, is called a k-ary weak near-unanimity operation (short, k-wnu) if it satisfies the following height-one condition \(f(x,\dots ,x,y) \approx f(x,\dots ,x,y,x) \approx \dots \approx f(y,x,\dots ,x)\).
A binary operation f is called symmetric if it is a 2-wnu, i.e., if it satisfies f(x,y) ≈ f(y,x).
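The weak near-unanimity condition says that the value of f on a tuple with k − 1 copies of x and a single y does not depend on the position of y. For a concrete operation on a small domain this can be checked directly; the helper `is_wnu` below is a hypothetical illustration and only tests the identities, not whether f is a polymorphism of any digraph.

```python
def is_wnu(f, domain, k):
    """Check f(y,x,...,x) = f(x,y,x,...,x) = ... = f(x,...,x,y) for all x, y."""
    for x in domain:
        for y in domain:
            values = {
                f(*([x] * i + [y] + [x] * (k - 1 - i)))  # y placed at position i
                for i in range(k)
            }
            if len(values) > 1:
                return False
    return True


assert is_wnu(min, range(3), 3)                      # minimum of three arguments
assert is_wnu(lambda x, y: (x + y) % 2, range(2), 2) # symmetric, i.e. a 2-wnu
assert not is_wnu(lambda x, y, z: x, range(2), 3)    # a projection is not a wnu
```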
It is known that the existence of a k-wnu implies the existence of Kearnes-Marković-McKenzie polymorphisms [55, 68], and that the existence of Kearnes-Marković-McKenzie polymorphisms implies the existence of a k-wnu for some k ≥ 2 [62]. Hence, in particular, if a finite digraph \(\mathbb {H}\) has a binary symmetric polymorphism then \(\text {CSP}(\mathbb {H})\) can be solved in polynomial time. Our results will show that the converse is false even if \(\mathbb {H}\) is a tree, see Section 7.1.4.
For both Kearnes-Marković-McKenzie and k-wnu, it is enough to test for levelwise satisfiability as discussed in Section 5.5. We prove the following more general claim.
Lemma 10
Let Σ be a height-one condition in two variables such that both the variables appear on each side in every identity from Σ. Then a balanced digraph levelwise satisfies Σ if and only if it satisfies Σ.
Proof
Fix some polymorphisms \(f,\dots \) that levelwise satisfy Σ, and define polymorphisms \(f^{\prime },\dots \) in the following way (say f is k-ary): If \(lvl(x_{1})=lvl(x_{2})=\dots =lvl(x_{k})\), define \(f^{\prime }(x_{1},x_{2},\dots ,x_{k})=f(x_{1},x_{2},\dots ,x_{k})\). Else, let \(\ell =\min \{lvl(x_{i})\mid 1\leq i\leq k\}\) and define \(f^{\prime }(x_{1},x_{2},\dots ,x_{k})=x_{j}\) where \(j\in \{1,2,\dots ,k\}\) is the smallest index such that lvl(x_{j}) = ℓ. To verify that the \(f^{\prime }\)s are polymorphisms, note that if (x_{i},y_{i}) is an edge for \(i\in \{1,2,\dots ,k\}\), then \(f^{\prime }(x_{1},x_{2},\dots ,x_{k})\) and \(f^{\prime }(y_{1},y_{2},\dots ,y_{k})\) fall under the same case of the definition. If it is the second case, x_{j} lies on the smallest level out of {lvl(x_{i})∣1 ≤ i ≤ k} if and only if y_{j} lies on the smallest level out of {lvl(y_{i})∣1 ≤ i ≤ k}. Hence, the selected coordinate j is the same.
Now let x,y be the two variables appearing in Σ. To see that every identity from Σ is satisfied, note that the only interesting case is when lvl(x)≠lvl(y), and \(f^{\prime }\) then chooses the variable on the lower level. The other implication is trivial. □
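The construction in the proof can be sketched in a few lines. The example digraph, its level function, and the use of `min` as a levelwise symmetric operation are our own hypothetical choices for illustration; `extend_levelwise` implements the case distinction from the proof.

```python
from itertools import product


def extend_levelwise(f, lvl):
    """f' from the proof: f on same-level tuples, otherwise the first
    argument that lies on the minimum level."""
    def f_prime(*xs):
        if len({lvl[x] for x in xs}) == 1:
            return f(*xs)
        low = min(lvl[x] for x in xs)
        return next(x for x in xs if lvl[x] == low)
    return f_prime


def is_polymorphism(f, edges, k):
    """Check that f maps coordinatewise edges to edges."""
    return all(
        (f(*[u for u, _ in combo]), f(*[v for _, v in combo])) in edges
        for combo in product(edges, repeat=k)
    )


# Hypothetical example: edges 0 -> 1 -> 2 and 3 -> 2, with levels given explicitly.
edges = {(0, 1), (1, 2), (3, 2)}
lvl = {0: 0, 1: 1, 2: 2, 3: 1}
f_prime = extend_levelwise(min, lvl)

assert is_polymorphism(f_prime, edges, 2)
assert all(f_prime(x, y) == f_prime(y, x) for x in lvl for y in lvl)
```

On mixed-level pairs such as (0,3) the operation returns the vertex on the lower level, which is symmetric because both variables occur on each side of the identity.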
6.2 Containment in Datalog
We have already mentioned in the introduction that containment in Datalog has numerous equivalent characterizations. In this section, we formally state one of these characterizations in terms of pp-constructibility and one in terms of height-one conditions. The structure 3Lin_{p} has the domain \(D = \{0,\dots ,p-1\}\) where p is some prime, the relation {(x,y,z)∣x + y + z ≡ 0 (mod p)}, and the relation {x} for every x ∈ D. It is well known that CSP(3Lin_{p}) is not in Datalog [35].
Definition 11
A 3-4 weak near-unanimity pair (short, 3-4 WNU) is a pair of operations f,g such that f is a 3-wnu, g is a 4-wnu, and they additionally satisfy the identity \(f(y,x,x) \approx g(y,x,x,x)\).
Theorem 12 (7, 57, see [12, Theorem 47])
Let \(\mathbb {H}\) be a finite digraph. Then the following are equivalent.

\(\mathbb {H}\) can be solved by Datalog,

there is no pp-construction of 3Lin_{p} in \(\mathbb {H}\), for any prime p (i.e., \(\mathbb {H}\) lacks the ability to count),

\(\mathbb {H}\) has a 3-4 weak near-unanimity pair of polymorphisms.
Note that by the above theorem, if Conjecture 3 is true (and assuming P ≠ NP), then every tree with Kearnes-Marković-McKenzie polymorphisms has a 3-4 WNU pair of polymorphisms. In particular, it implies that every tree with KMM polymorphisms has a 3-wnu polymorphism: a claim that is weaker but still open. Also note that by Lemma 10, it is enough to test for levelwise 3-4 WNU.
6.3 Containment in NL
In this section we present a strong sufficient condition for the containment of \(\mathbb {H}\) in NL.
Definition 13
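For n ≥ 0, a Jónsson chain of length n over D is a sequence of ternary operations \(j_{1},j_{2},\dots ,j_{2n+1}\) on D that satisfy \(j_{1}(x,x,y) \approx x\), \(j_{2i-1}(x,y,y) \approx j_{2i}(x,y,y)\) and \(j_{2i}(x,x,y) \approx j_{2i+1}(x,x,y)\) for all \(i \in \{1,\dots ,n\}\), \(j_{i}(x,y,x) \approx x\) for all \(i \in \{1,\dots ,2n+1\}\), and \(j_{2n+1}(x,y,y) \approx y\).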
The respective linear condition is abbreviated by J(n).
Note that J(n) implies J(n + 1) for every n ≥ 0. Also note that for n = 0 the operation j_{1} must be a so-called majority operation, which is the ternary case of a near-unanimity (NU) operation, that is, an operation satisfying the identities \(f(x,\dots ,x,y) \approx f(x,\dots ,x,y,x) \approx \dots \approx f(y,x,\dots ,x) \approx x\).
The existence of a near-unanimity polymorphism characterizes bounded strict width [35]. More importantly for us, a near-unanimity polymorphism is sufficient to put \(\mathbb {H}\) in NL, using the following two results. Barto, Kozik, and Willard proved that finite structures with finite relational signature and a near-unanimity polymorphism have bounded pathwidth duality [11]. Dalmau proved that bounded pathwidth duality implies containment in NL [28].
Barto [3] moreover proved that if a finite structure with a finite relational signature has polymorphisms that form a Jónsson chain, then it also has a near-unanimity polymorphism (albeit its arity in the proof is doubly exponential in the size of the domain). In the other direction, it is well known that NU(n) implies J(n − 2), syntactically. Therefore, we do not test for near-unanimities of arities higher than 3; it is more efficient to test for a Jónsson chain.
Theorem 14 (3, 11, 28)
If a finite digraph \(\mathbb {H}\) satisfies J(n) for some n ≥ 1, then \(\text {CSP}(\mathbb {H})\) is in linear Datalog, and hence in NL.
Note that the existence of polymorphisms of \(\mathbb {H}\) that form a Jónsson chain is only a sufficient condition for the containment of \(\text {CSP}(\mathbb {H})\) in NL. An incomparable sufficient condition for the containment of \(\text {CSP}(\mathbb {H})\) in NL was identified in [24]. The condition presented there also has a characterization via height-one identities, but the arities of the operations are prohibitively large so that we did not implement this test for trees.
The conjectured characterization of containment in NL, the Kearnes-Kiss chain from Conjecture 1, is defined in Section 6.7 (Definition 21).
6.4 Containment in L
One of the strongest known sufficient conditions for containment in L is a conditional result of Kazda, which involves the following linear condition.
Definition 15
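For n ≥ 1, a Hagemann-Mitschke chain of length n over D is a sequence of ternary operations \(p_{1},\dots ,p_{n}\) on D that satisfy \(p_{1}(x,y,y) \approx x\), \(p_{i}(x,x,y) \approx p_{i+1}(x,y,y)\) for all \(i \in \{1,\dots ,n-1\}\), and \(p_{n}(x,x,y) \approx y\).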
The respective linear condition is abbreviated by HaMi(n).
Note that HaMi(n) implies HaMi(n + 1) for every n ≥ 1. For n = 1 the operation p_{1} is known as a Maltsev operation. Kazda [52] proved the following conditional result.
Theorem 16 (52)
If a finite digraph \(\mathbb {H}\) can be solved by linear Datalog, and \(\mathbb {H}\) satisfies HaMi(n) for some n ≥ 1, then \(\mathbb {H}\) can also be solved by linear symmetric Datalog (and hence is in L).
The conjectured characterization of containment in L, the Noname chain from Conjecture 2, is defined in Section 6.9 (Definition 24).
6.5 Solvability by Arc Consistency
Solvability by arc consistency (and tree duality) can be characterized in terms of height-one conditions as well.
Definition 17
An operation s_{n}: D^{n} → D is called totally symmetric if for all variables \(x_{1},\dots ,x_{n}\) and \(y_{1},\dots ,y_{n}\) such that \(\{x_{1},\dots ,x_{n}\} = \{y_{1},\dots ,y_{n}\}\) the operation s_{n} satisfies \(s_{n}(x_{1},\dots ,x_{n}) \approx s_{n}(y_{1},\dots ,y_{n})\).
The respective height-one condition is abbreviated by TS(n).
The digraph \(\mathbb {H}\) can be solved by arc consistency if and only if \(\mathbb {H}\) has totally symmetric polymorphisms of all arities [30, 35]. Note that TS(4) implies 3-4 WNU. Also note that a finite digraph \(\mathbb {H}\) satisfies TS(n) for all n > 0 if and only if it satisfies \(\text {TS}(2|E(\mathbb {H})|)\) (see the proof given in [30]). The arity \(2|E(\mathbb {H})|\) is still fairly large; therefore it is particularly useful that the levelwise test is sufficient.
Lemma 18 (see [8, proof of Lemma 4.1])
For any balanced digraph \(\mathbb {H}\) and n > 0, \(\mathbb {H}\) levelwise satisfies TS(n) if and only if \(\mathbb {H}\) satisfies TS(n).
Proof
Let s_{n} be an n-ary polymorphism of \(\mathbb {H}\) that levelwise satisfies the condition TS(n). We can construct an n-ary totally symmetric polymorphism \(s^{\prime }_{n}\) of \(\mathbb {H}\) by applying s_{n} to the set of vertices on the smallest level. That is, for an input tuple \((x_{1},\dots ,x_{n})\) let \(\ell =\min \{\text {lvl}(x_{i})\mid 1\leq i\leq n\}\), \(\{x_{i}\mid \text {lvl}(x_{i})=\ell \}=\{x_{i_{1}},\dots ,x_{i_{k}}\}\), and set \(s^{\prime }_{n}(x_{1},\dots ,x_{n}) := s_{n}(x_{i_{1}},\dots ,x_{i_{k}},x_{i_{k}},\dots ,x_{i_{k}})\), where the last argument is repeated so that s_{n} receives n arguments.
Clearly, the definition of \(s^{\prime }_{n}(x_{1},\dots ,x_{n})\) depends only on the set \(\{x_{1},\dots ,x_{n}\}\). To see that \(s^{\prime }_{n}\) is a polymorphism, note that, as in the proof of Lemma 10, if (x_{i},y_{i}) is an edge for \(i\in \{1,2,\dots ,n\}\), then \(x_{i_{j}}\) lies on the smallest level among the levels of the arguments if and only if \(y_{i_{j}}\) does. The rest follows from the fact that s_{n} is totally symmetric on each level. The other implication is again trivial. □
6.6 P-hardness
The structure Horn3SAT has the domain {0,1}, the ternary relation {0,1}^{3} ∖{(1,1,0)}, and the two unary relations {0} and {1}. It is well known that CSP(Horn3SAT) is P-complete, i.e., complete for the complexity class P under deterministic logspace reductions.
Definition 19
Let D be a set. A Hobby-McKenzie chain of length n over D is a sequence of ternary operations \(d_{0},\dots ,d_{n},p,e_{0},\dots ,e_{n}\) such that
The respective linear condition is abbreviated by HoMcK(n).
Theorem 20 (consequence of Theorem 9.8 in 46)
A finite structure does not satisfy HoMcK(n) for any n ≥ 1 if and only if it can pp-construct Horn3SAT.
The theorem implies that if a finite digraph \(\mathbb {H}\) does not satisfy HoMcK(n) for any n ≥ 1, then \(\mathbb {H}\) is P-hard (see Section 5.1).
6.7 P-hardness or Mod_{p}L-hardness
It is widely believed that NL is a proper subclass of P. Another complexity class which is believed to be a proper subclass of P is the class Mod_{p}L, for some prime p: this is defined to be the class of problems for which there exists a nondeterministic logspace machine M such that an instance is a yes-instance of the problem if and only if the number of accepting paths of M on the instance is divisible by p; see [58]. It is well known that CSP(3Lin_{p}) is Mod_{p}L-complete (see the discussion in Section 1.3 of [58]). If NL contained Mod_{p}L, then this would be a considerable breakthrough in complexity theory.
Definition 21 (from Theorem 9.11 in 46)
Let D be a set. A Kearnes-Kiss chain of length n ≥ 2 over D is a sequence of ternary operations \(d_{0},d_{1},\dots ,d_{n}\) on D such that
The respective linear condition is abbreviated by KK(n).
Note that KK(n) implies KK(n + 1) for every n ≥ 2. Also note that the existence of a Jónsson chain implies the existence of a Kearnes-Kiss chain [46]; namely J(n) trivially implies KK(2n + 4).
Theorem 22 (see 12 and [46])
A finite structure does not satisfy KK(n) for any n ≥ 2 if and only if it can pp-construct Horn3SAT or 3Lin_{p} for some prime p.
We have already mentioned that if a finite digraph \(\mathbb {H}\) pp-constructs Horn3SAT then it is P-hard, and hence \(\text {CSP}(\mathbb {H})\) is not in NL unless NL = P. Similarly, if \(\mathbb {H}\) pp-constructs 3Lin_{p} then it is Mod_{p}L-hard, and in this case it is not in NL unless NL contains Mod_{p}L. If the conjecture that ‘easy trees cannot count’ (Conjecture 3) is true, then the existence of a Kearnes-Kiss chain is equivalent to the existence of a Hobby-McKenzie chain for trees (assuming P ≠ NP).
6.8 NL-hardness
The structure stCon has the domain {0,1}, the binary relation {(0,0),(0,1),(1,1)} and the unary relations t = {0} and s = {1}. Note that an instance of the CSP of stCon is unsatisfiable if and only if there exists a directed path from s to t in the digraph defined by the binary relation. It is well known that CSP(stCon) is complete for the complexity class NL (see, e.g., [47]).
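The remark about directed paths can be turned into a direct satisfiability test. In the sketch below an instance is assumed to be given by its binary constraints (as edge pairs) together with the sets of variables constrained by s and by t; this encoding and the function name are our own choices for illustration.

```python
def st_con_satisfiable(edges, s_vertices, t_vertices):
    """Each edge (u, v) demands value(u) <= value(v); s-vertices must take the
    value 1 and t-vertices the value 0. The instance is satisfiable iff no
    t-vertex is reachable from an s-vertex along directed edges."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    reached, stack = set(s_vertices), list(s_vertices)
    while stack:
        u = stack.pop()
        for v in successors.get(u, []):
            if v not in reached:
                reached.add(v)
                stack.append(v)
    return reached.isdisjoint(t_vertices)


path = [("a", "b"), ("b", "c")]
assert not st_con_satisfiable(path, {"a"}, {"c"})  # directed path from s to t
assert st_con_satisfiable(path, {"c"}, {"a"})      # no such path
```

If no t-vertex is reached, assigning 1 to all reached vertices and 0 to all others is a solution, which is why plain graph reachability decides the problem.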
Theorem 23 (46, 58)
If a finite digraph \(\mathbb {H}\) does not satisfy HaMi(n) for any n ≥ 1, then it can pp-construct the structure stCon, and \(\mathbb {H}\) is NL-hard.
6.9 NL-hardness or Mod_{p}L-hardness
We now present a polymorphism condition that characterizes the finite structures that can pp-construct stCon or 3Lin_{p} for some prime p.
Definition 24 (46, Theorem 9.15)
For n ≥ 0, a Noname chain of length n over D is a sequence of operations \(f_{0},f_{1},\dots ,f_{n}\) of arity four on D such that
The respective linear condition is abbreviated by NN(n).
Note that NN(n) implies NN(n + 1) for every n ≥ 0.
Theorem 25 (12, 46)
A finite structure does not satisfy NN(n) for any n ≥ 1 if and only if it can pp-construct the structure stCon or the structure 3Lin_{p} for some prime p.
It follows that if a finite digraph \(\mathbb {H}\) does not satisfy NN(n) for any n ≥ 1, then \(\text {CSP}(\mathbb {H})\) is NL-hard or Mod_{p}L-hard. Hence, \(\mathbb {H}\) is in this case not in L, unless L = NL or L = Mod_{p}L. Note that Conjecture 3 together with Conjecture 2 implies that NN(n) for some n and HaMi(n) for some n are equivalent for trees (assuming L≠NL).
7 Experimental results
We implemented the AC-3 algorithm for establishing arc-consistency and used its adaptation, known as the MAC-3 algorithm, for maintaining arc-consistency during the backtracking procedure described in Section 5.4. The lists and related operations were implemented by doubly-linked lists. The code is written in Rust and the experiments were run on an Intel(R) Xeon(R) CPU E5-2680 v3 (12 cores) @ 2.50GHz with Linux. We also used another implementation written in Python. All tests for chains of polymorphisms and for totally symmetric polymorphisms were performed using this implementation on an AMD Ryzen 5 4500U (with 8 cores) @ 2.38 GHz with Windows. Some results were verified with both implementations, see Table 2. An efficient implementation was essential to obtain our results. ^{Footnote 1} Both implementations and all trees presented in the figures in this section can be found at https://github.com/WhatDothLife/TheSmallestHardTrees.
Table 3 shows the number of unlabeled trees with n vertices and the number of those that are cores. The table suggests that the fraction of trees that are cores quickly goes to 0. The next columns contain the number of unlabeled rooted cores with n vertices, the number of AC calls, and the mean CPU time per AC call on a tree with n vertices. The final column in the table shows the computation time needed to generate all the unlabeled core trees with n vertices with Algorithm 2.
In this section we present the results of testing the discussed linear conditions on these trees to classify them with respect to their computational complexity. In some cases we manage to compute all the minimal trees in the respective complexity class; the corresponding results are presented in Section 7.1. In Section 7.2 we present trees whose precise complexity status is open. While the numbers of trees given in the text are up to isomorphism, the corresponding figures make a further restriction based on the following fact.
Remark 26
An operation is a polymorphism of \(\mathbb {H}\) if and only if it is a polymorphism of \(\mathbb {H}^{R}\).
The remark justifies that our figures contain exactly one of the trees \(\mathbb {T}\), \(\mathbb {T}^{R}\). It turns out that the trees in our figures that do not satisfy a certain height-one condition have a unique minimal subtree which has no idempotent polymorphisms satisfying the respective condition (see Remark 6). The vertices and edges drawn in gray do not belong to this minimal subgraph of \(\mathbb {T}\).
7.1 The smallest hard trees
In this section we present the smallest trees that are NP-hard and that are NL-hard, under standard assumptions from complexity theory. We also compute the smallest tree that cannot be solved by arc consistency, the smallest trees that cannot be solved by Datalog, and the smallest trees that cannot be solved by linear symmetric Datalog; these results hold without any assumptions from complexity theory.
7.1.1 The smallest NP-hard trees
Our algorithm found that all trees with at most 19 vertices have Kearnes-Marković-McKenzie polymorphisms and hence are tractable. It also found that there exist exactly 36 trees with 20 vertices that have no Kearnes-Marković-McKenzie polymorphisms and hence are NP-hard. For such an NP-hard tree \(\mathbb {T}\) with 20 vertices it takes our algorithm about 0.07 seconds to construct the indicator digraph \(\mathbb {T}^{\text {Ind}}\) for the Kearnes-Marković-McKenzie polymorphisms and about 0.03 seconds to verify that \(\mathbb {T}^{\text {Ind}}\) does not have a homomorphism to \(\mathbb {T}\). When applying the level trick, it takes about 0.01 seconds to construct the indicator digraph and 0.03 seconds to verify that \(\mathbb {T}^{\text {Ind}}\) does not have a homomorphism to \(\mathbb {T}\).
The trees with 20 vertices that have no Kearnes-Marković-McKenzie polymorphisms are displayed in Fig. 3. Each of these trees has a unique smallest subtree without idempotent Kearnes-Marković-McKenzie polymorphisms. These subtrees are clearly not cores and have Kearnes-Marković-McKenzie polymorphisms. Note that these subtrees are the same for the trees A1–A8 and for A10–A18.
Moreover, there are 4 smallest triads with 22 vertices that have no Kearnes-Marković-McKenzie polymorphisms; these are shown in Fig. 4. All smaller triads have a binary symmetric polymorphism.
7.1.2 The smallest NL-hard trees
There are 8 trees with 12 vertices that are NL-hard. Two of them are isomorphic to their reverse, so we only display 5 trees in Fig. 5, called B1, B2, B3, B4, and B5. The proof that they are NL-hard can be found below. All other trees with at most 12 vertices satisfy HaMi(8).
Since the trees B1–B5 have a majority polymorphism, they are in NL. To prove that B1–B5 are NL-hard, we show that they can pp-construct stCon. Hence we need to construct the three relations {0}, {1}, and {(0,0),(0,1),(1,1)}. First note that in a core tree \(\mathbb {T}\) any singleton set is pp-definable from \(\mathbb {T}\), since \(\text {End}(\mathbb {T})=\{\text {id}_{T}\}\) by Theorem 2. The following two graphs represent two pp-formulas \(\phi _{1}(x,y)\) and \(\phi _{2}(x,y)\). The filled vertices stand for existentially quantified variables.
The trees B1, B2, B3 can pp-define a structure that is homomorphically equivalent to stCon using \(\phi _{1}(x,y)\) for E(stCon). The trees B4 and B5 can do the same using \(\phi _{2}(x,y)\) for E(stCon). Since stCon is NL-hard, B1–B5 are NL-hard as well.
So for 12 vertices, 8 out of 226 trees are NL-hard (assuming L ≠ NL). In Fig. 6 we present how this distribution in core trees changes with increasing number of vertices. Every tree with at most 20 vertices falls into one of two cases:
(a) it satisfies HaMi(16) and has a majority polymorphism, hence it is in L, or
(b) it does not satisfy HaMi(30).
We strongly suspect that in the latter case the trees do not satisfy HaMi(n) for any n, can pp-construct stCon, and are NL-hard.
7.1.3 The smallest tree not solved by Datalog
It turns out that every tree with at most 20 vertices which is not NP-hard can be solved by Datalog, thus confirming Conjecture 3 for these trees. In fact, up to 20 vertices all trees that have Kearnes-Marković-McKenzie polymorphisms either have a majority polymorphism or totally symmetric polymorphisms of all arities. The picture is, however, more complex for larger trees: there exists a tree which can be solved by Datalog but does not have a near-unanimity polymorphism (of any arity) and does not have totally symmetric polymorphisms of all arities, see [4, Proposition 5.5] for an example and [22] for its solvability in Datalog.
7.1.4 The smallest tree not solved by arc consistency
The smallest tree \(\mathbb {T}\) that has no binary symmetric polymorphism has 19 vertices and is displayed in Fig. 7. It has a 3-wnu polymorphism, and even a majority polymorphism, i.e., an operation f which satisfies f(x,x,y) = f(x,y,x) = f(y,x,x) = x for all x,y ∈ T. Note that \(\text {CSP}(\mathbb {T})\) cannot be solved by the arc-consistency procedure, since solvability by arc consistency would imply that \(\mathbb {T}\) has a binary symmetric polymorphism [30, 35]. All other trees with at most 19 vertices satisfy TS(n) for all n. For a tree \(\mathbb {T}\) the vertices of the indicator digraph for \(\text {TS}(2 E(\mathbb {T}))\) correspond to the non-empty subsets of T. Hence the indicator structure of a tree \(\mathbb {T}\) with 19 vertices has 2^{19} − 1 = 524287 vertices. Using levelwise satisfiability (see Section 6.5) the number of vertices of the indicator structure is reduced to something between 19 and 513, depending on the number of vertices on each level.
7.2 Open trees
In this section we present trees that are interesting test cases, in particular regarding the conjectured classification of digraphs in NL (Conjecture 1).
7.2.1 A tree not known to be in NL
We found a tree with polymorphisms that form a Kearnes-Kiss chain of length five and a Hobby-McKenzie chain of length 2, but that has no majority polymorphism and no (levelwise) Jónsson chain of length 1000 (see Fig. 8). This tree is not known to be P-hard or Mod_{p}L-hard, nor is it known to be in NL. It is the smallest tree without a majority polymorphism. Note that the existence of a Jónsson chain of some length is decidable because for a given digraph there are only finitely many operations of arity three. Moreover, by the discussion from Section 5.5 we know that we may narrow down the set of operations that have to be considered; the resulting number of operations is 12^{36}. Even if we could show that the tree has no Jónsson chain, we would not know that the tree is not in NL. We believe that the tree is in NL, but new ideas are needed to prove that (e.g., ideas to prove Conjecture 1). We mention that it can pp-construct stCon, so it is NL-hard.
7.2.2 Trees that might be P-hard
There are 28 trees with 18 vertices that satisfy neither HoMcK(1000) nor KK(1000), not even levelwise (see Fig. 9). They satisfy TS(n) for all n, so they are in P and cannot pp-construct 3Lin_{p} for any p. Hence, this is in accordance with Conjecture 3. All other trees with up to 18 vertices satisfy KK(5) and are in NL assuming Conjecture 1. Hence, if this conjecture is true, and if NL≠P, and if indeed these 28 trees do not have HoMcK(n) for any n, then they are the smallest trees that are P-hard.
7.3 Majority polymorphisms
Majority polymorphisms play a central role in the early theory of the constraint satisfaction problem [34, 35, 49], in graph theory [45, 51], and in the algebraic theory of CSPs [19, 20]. Dalmau and Krokhin [29] proved that structures with a majority polymorphism are in NL, even before the mentioned result of Barto, Kozik, and Willard for near-unanimity polymorphisms [11]. We have therefore also computed a smallest tree without a majority polymorphism (see Fig. 8). Interestingly, when solving the indicator problem for the existence of a majority polymorphism of \(\mathbb {H}\) for graphs with at most 15 vertices (which all have a majority polymorphism), no backtracking was needed: pruning with the arc-consistency procedure sufficed to avoid all dead ends in the search. Theoretical results only guarantee this behavior for establishing (2,3)-consistency (since \(\mathbb {H}\) has a majority polymorphism). So one might ask: can every tree with a majority polymorphism be solved by arc consistency? This is not the case; see Lemmata 4.1 and 4.2 in [8]. In our experiments we found the smallest such tree: Fig. 7 shows a tree with a majority polymorphism which does not even have a binary symmetric polymorphism, and hence in particular cannot be solved by arc consistency.
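For very small digraphs the existence of a majority polymorphism can also be decided by plain enumeration, which may help to make the indicator problem concrete. The Python sketch below fixes the values forced by the majority identities and tries all values on the remaining triples. It is a brute-force illustration only (the function name `find_majority` is an assumption of this example) and is not the arc-consistency-based search used in our experiments; it is feasible only for a handful of vertices.

```python
from itertools import product


def find_majority(vertices, edges):
    """Return a majority polymorphism as a dict on triples, or None.

    Brute-force sketch: exponential in the number of triples with three
    pairwise distinct entries, so only usable for tiny digraphs.
    """
    vertices = list(vertices)
    edge_set = set(edges)
    f = {}
    free = []  # triples on which the majority identities force nothing
    for t in product(vertices, repeat=3):
        x, y, z = t
        if x == y or x == z:
            f[t] = x
        elif y == z:
            f[t] = y
        else:
            free.append(t)

    edge_triples = list(product(list(edge_set), repeat=3))
    for values in product(vertices, repeat=len(free)):
        f.update(zip(free, values))
        if all(
            (f[(a[0], b[0], c[0])], f[(a[1], b[1], c[1])]) in edge_set
            for a, b, c in edge_triples
        ):
            return dict(f)
    return None


# Toy examples: a directed path has a majority polymorphism, while the
# complete graph K3 (as a symmetric digraph) has none.
PATH = [(0, 1), (1, 2)]
K3 = [(u, v) for u in range(3) for v in range(3) if u != v]
print(find_majority(range(3), PATH) is not None)
print(find_majority(range(3), K3) is None)
```

The same pattern (fix the values dictated by the identities, search over the rest) underlies the indicator problem, where the search space is explored with constraint propagation instead of exhaustive enumeration.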
7.4 Code availability and reproducibility
All of the code we used can be found at https://github.com/WhatDothLife/TheSmallestHardTrees. There you can also find a list with all core trees with at most 20 vertices and a list with all trees occurring in figures in this section. The trees are represented as lists of edges.
If you implement your own algorithm to generate core trees, you can compare the numbers with Table 3 (comparing the actual trees in the lists will not work as there are too many). If you want to verify the satisfiability of various polymorphism conditions independently of our implementation, we recommend the software package PCSP Tools by Opršal [65].
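As a starting point for working with such edge lists, the following Python sketch checks that a list of pairs describes an orientation of a tree and computes the level of every vertex, where each edge (u,v) goes up by one level and the lowest level is 0. The in-memory format (a list of pairs of hashable vertex names) and the function name `levels` are assumptions of this example; the exact file format in the repository may differ.

```python
from collections import defaultdict, deque


def levels(edges):
    """Return {vertex: level} for an orientation of a tree given by its edges.

    Raises ValueError if the underlying graph is not a tree.
    Sketch under an assumed input format: a list of (tail, head) pairs.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append((v, +1))  # following the edge forwards raises the level
        adj[v].append((u, -1))  # following it backwards lowers the level
    if len(adj) != len(edges) + 1:
        raise ValueError("not a tree: expected |V| = |E| + 1")

    start = next(iter(adj))
    lvl = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, step in adj[u]:
            if v not in lvl:
                lvl[v] = lvl[u] + step
                queue.append(v)
    if len(lvl) != len(adj):
        raise ValueError("not a tree: underlying graph is disconnected")

    low = min(lvl.values())
    return {v: l - low for v, l in lvl.items()}


# Toy example: the oriented path 0 -> 1 <- 2 -> 3.
print(levels([(0, 1), (2, 1), (2, 3)]))
```

Grouping vertices by level in this way is the preprocessing step on which levelwise tests rely.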
8 Open problems and future work
The following conjecture is implied by Conjecture 3, but might be easier to answer.
Conjecture 4
A tree has Kearnes-Marković-McKenzie polymorphisms if and only if it has a 3-wnu polymorphism.
Question 1
Is it true that the probability that a tree drawn uniformly at random from the set of all trees with vertex set \(\{1,\dots ,n\}\) is NP-hard tends to 1 as n tends to infinity? The answer is yes if we ask the question for random labelled digraphs instead of random labelled trees [60].
Figure 6 suggests that the following conjecture is true.
Conjecture 5
The fraction of core trees with n vertices that are NL-hard goes to 1 as n goes to infinity.
Question 2
Determine the smallest trees that are P-hard (assuming that NL≠P). We know from Section 7.3 that they must have at least 16 vertices, since all smaller trees have a majority polymorphism and thus are in NL.
Question 3
Is our algorithm from Section 4 to generate unlabeled core trees a polynomial-delay enumeration algorithm (in the sense of [50])?
Question 4
Characterize linear conditions that can be tested levelwise (in the sense of Section 5.5) for balanced digraphs, and more specifically, for trees.
It would be interesting to perform experiments similar to the experiments presented here for trees that are equipped with a singleton unary relation {a} for each vertex a of the tree; in this case, if \(\mathbb {T}_{c}\) is the resulting expanded tree structure, \(\text {CSP}(\mathbb {T}_{c})\) models the so-called \(\mathbb {T}\)-precoloring extension problem. This setting is particularly nice from the algebraic perspective because then all the polymorphisms of \(\mathbb {T}_{c}\) are idempotent. Note, however, that all these structures \(\mathbb {T}_{c}\) are cores, so there are far more structures to consider, and hardness will dominate more rapidly.
Taking this one step further, it would also be interesting to study the so-called list homomorphism problem for trees \(\mathbb {T}\) from an experimental perspective. Here, the input contains, besides the graph \(\mathbb {G}\), a set (also commonly referred to as a list) of vertices from \(\mathbb {H}\) for every vertex of \(\mathbb {G}\), and we are looking for a homomorphism from \(\mathbb {G}\) to \(\mathbb {H}\) that maps each vertex to an element from its set. This can be seen as a special case of a CSP for a relational structure, which contains besides the edge relation also a unary relation for each subset of the vertices of \(\mathbb {H}\). On the algebraic side, we are therefore interested in polymorphisms that preserve all subsets of \(\mathbb {H}\); such polymorphisms (and consequently the respective CSPs) are also called conservative. The algorithms and complexities for conservative CSPs are better understood than the general case [2, 17, 19, 53], which will help to determine the complexity of the list homomorphism problem for trees. On the other hand, as in the case of the precoloring extension problem, we have much larger numbers of trees to consider since all the structures that we study are already cores.
Notes
Using the constraint modeling language MiniZinc [64] and the solver Gecode [67] we are able to verify the polymorphism conditions for concrete trees. Also, together with the program Nauty [63] used to generate trees up to isomorphism we can verify the number of core trees for reasonably small sizes, but this approach was orders of magnitude slower than needed.
References
Afrati, F. N., & Cosmadakis, S. S. (1989). Expressiveness of restricted recursive queries (extended abstract). In D. S. Johnson (Ed.) Proceedings of the 21st Annual ACM Symposium on Theory of Computing (pp. 113–126). Washington: ACM.
Barto, L. (2011). The dichotomy for conservative constraint satisfaction problems revisited. In Proceedings of the Symposium on Logic in Computer Science (LICS), Toronto, Canada.
Barto, L. (2013). Finitely related algebras in congruence distributive varieties have near unanimity terms. Canadian Journal of Mathematics, 65(1), 3–21.
Barto, L., & Bulín, J. (2013). CSP dichotomy for special polyads. International Journal of Algebra and Computation, 23(5), 1151–1174.
Barto, L., & Kozik, M. (2009). Constraint satisfaction problems of bounded width. In Proceedings of Symposium on Foundations of Computer Science (FOCS), pp. 595–603.
Barto, L., & Kozik, M. (2012). Absorbing subalgebras, cyclic terms and the constraint satisfaction problem. Logical Methods in Computer Science, 8/1 (07), 1–26.
Barto, L., & Kozik, M. (2014). Constraint satisfaction problems solvable by local consistency methods. Journal of the ACM, 61(1), 3:1–3:19.
Barto, L., Kozik, M., Maróti, M., & Niven, T. (2009). CSP dichotomy for special triads. Proceedings of the American Mathematical Society, 137(9), 2921–2934.
Barto, L., Kozik, M., Maróti, M., & Niven, T. (2009). Erratum to: CSP dichotomy for special triads. Available from the website of the first author.
Barto, L., Kozik, M., & Niven, T. (2009). The CSP dichotomy holds for digraphs with no sources and no sinks (a positive answer to a conjecture of Bang-Jensen and Hell). SIAM Journal on Computing, 38(5).
Barto, L., Kozik, M., & Willard, R. (2012). Near unanimity constraints have bounded pathwidth duality. In Proceedings of the 27th ACM/IEEE Symposium on Logic in Computer Science (LICS), pp. 125–134.
Barto, L., Krokhin, A., & Willard, R. (2017). Polymorphisms, and how to use them. In A. Krokhin & S. Živný (Eds.) The Constraint Satisfaction Problem: Complexity and Approximability, volume 7 of Dagstuhl Follow-Ups (pp. 1–44). Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl.
Barto, L., Opršal, J., & Pinsker, M. (2018). The wonderland of reflections. Israel Journal of Mathematics, 223(1), 363–398.
Bodirsky, M. (2021). Complexity of Infinite-Domain constraint satisfaction. In Lecture notes in logic (52). Cambridge: Cambridge University Press.
Bodirsky, M., & Bodor, B. (2021). Canonical polymorphisms of Ramsey structures and the unique interpolation property. In Proceedings of the Symposium on Logic in Computer Science (LICS).
Bodirsky, M., & Jonsson, P. (2017). A model-theoretic view on qualitative constraint reasoning. Journal of Artificial Intelligence Research, 58, 339–385.
Bulatov, A. A. (2003). Tractable conservative constraint satisfaction problems. In Proceedings of the Symposium on Logic in Computer Science (LICS), (pp. 321–330). Ottawa, Canada.
Bulatov, A. A. (2009). Bounded relational width. Manuscript.
Bulatov, A. A. (2016). Conservative constraint satisfaction re-revisited. Journal of Computer and System Sciences, 82(2), 347–356. ArXiv:1408.3690.
Bulatov, A. A. (2017). A dichotomy theorem for nonuniform CSPs. In 58th IEEE, Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15–17, pp. 319–330.
Bulatov, A. A., Krokhin, A. A., & Jeavons, P. G. (2005). Classifying the complexity of constraints using finite algebras. SIAM Journal on Computing, 34, 720–742.
Bulín, J. (2018). On the complexity of H-coloring for special oriented trees. European Journal of Combinatorics, 69, 54–75.
Bulín, J., Delić, D., Jackson, M., & Niven, T. (2015). A finer reduction of constraint problems to digraphs. Logical Methods in Computer Science, 11(4).
Carvalho, C., Dalmau, V., & Krokhin, A. (2010). CSP Duality and trees of bounded pathwidth. Theoretical Computer Science, 411, 3188–3208.
Chen, H., & Larose, B. (2017). Asking the metaquestions in constraint tractability. TOCT, 9(3), 11:1–11:27.
Chen, H., & Mengel, S. (2015). A trichotomy in the complexity of counting answers to conjunctive queries. In 18th International Conference on Database Theory, ICDT 2015, March 23–27, 2015, Brussels, Belgium, pp. 110–126.
Dalmau, V. (2000). Computational complexity of problems over generalized formulas. PhD thesis at the Departament de Llenguatges i Sistemes informátics at the Universitat politécnica de Catalunya.
Dalmau, V. (2005). Linear Datalog and bounded path duality of relational structures. Logical Methods in Computer Science 1(1).
Dalmau, V., & Krokhin, A. A. (2008). Majority constraints have bounded pathwidth duality. European Journal of Combinatorics, 29(4), 821–837.
Dalmau, V., & Pearson, J. (1999). Closure functions and width 1 problems. In Proceedings of the International Conference on Principles and Practice of Constraint Programming (CP), pp. 159–173.
Diestel, R. (2005). Graph theory, 3rd edn. New York: Springer–Verlag.
Egri, L., Larose, B., & Tesson, P. (2007). Symmetric datalog and constraint satisfaction problems in logspace. In Proceedings of the Symposium on Logic in Computer Science (LICS), pp. 193–202.
Egri, L., Larose, B., & Tesson, P. (2008). Directed st-connectivity is not expressible in symmetric datalog. In Proceedings of the 35th International Colloquium on Automata, Languages and Programming, Part II, ICALP ’08 (pp. 172–183). Berlin: Springer-Verlag.
Feder, T. (2001). Classification of homomorphisms to oriented cycles and of k-partite satisfiability. SIAM Journal on Discrete Mathematics, 14(4), 471–480.
Feder, T., & Vardi, M. Y. (1999). The computational structure of monotone monadic SNP, and constraint satisfaction: a study through Datalog and group theory. SIAM Journal on Computing, 28, 57–104.
Fischer, J. (2015). CSPs of orientations of trees. Master thesis, TU Dresden.
Gault, R., & Jeavons, P. (2004). Implementing a test for tractability. Constraints An International Journal, 9(2), 139–160.
Gutjahr, W. (1991). Graph colourings. PhD Thesis, Free University Berlin.
Gutjahr, W., Welzl, E., & Woeginger, G. J. (1992). Polynomial graph-colorings. Discrete Applied Mathematics, 35(1), 29–45.
Hell, P., Nešetřil, J., & Zhu, X. (1996). Complexity of tree homomorphisms. Discrete Applied Mathematics, 70(1), 23–36.
Hell, P., & Nešetřil, J. (1990). On the complexity of H-coloring. Journal of Combinatorial Theory Series B, 48, 92–110.
Hell, P., & Nešetřil, J. (1992). The core of a graph. Discrete Mathematics, 109, 117–126.
Hell, P., & Nešetřil, J. (2004). Graphs and homomorphisms. Oxford: Oxford University Press.
Hell, P., Nešetřil, J., & Zhu, X. (1996). Duality and polynomial testing of tree homomorphisms. Transactions of the American Mathematical Society, 348(4), 1281–1297.
Hell, P., & Rafiey, A. (2011). The dichotomy of list homomorphisms for digraphs. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’11 (pp. 1703–1713). USA: Society for Industrial and Applied Mathematics.
Hobby, D., & McKenzie, R. (1988). The structure of finite algebras, volume 76 of Contemporary Mathematics. American Mathematical Society.
Immerman, N. (1998). Descriptive complexity. Graduate texts in computer science. New York: Springer.
Jeavons, P., Cohen, D., & Gyssens, M. (1996). A test for tractability. In E. C. Freuder (Ed.) Principles and Practice of Constraint Programming — CP96 (pp. 267–281). Berlin: Springer Berlin Heidelberg.
Jeavons, P., Cohen, D., & Gyssens, M. (1997). Closure properties of constraints. Journal of the ACM, 44(4), 527–548.
Johnson, D. S., Yannakakis, M., & Papadimitriou, C. H. (1988). On generating all maximal independent sets. Information Processing Letters, 27(3), 119–123.
Kazda, A. (2011). Maltsev digraphs have a majority polymorphism. European Journal of Combinatorics, 32, 390–397.
Kazda, A. (2018). n-permutability and linear Datalog implies symmetric Datalog. Logical Methods in Computer Science, 14(2).
Kazda, A. (2019). CSP for binary conservative relational structures. Algebra Universalis, 75(1), 75–84.
Kearnes, K. A., & Kiss, E. W. (2013). The Shape of Congruence Lattices, volume 222 (1046) of Memoirs of the American Mathematical Society. American Mathematical Society.
Kearnes, K. A., Marković, P., & McKenzie, R. (2015). Optimal strong Mal’cev conditions for omitting type 1 in locally finite varieties. Algebra Universalis, 72(1), 91–100.
Kozik, M. (2016). Weak consistency notions for all the CSPs of bounded width. In Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS ’16 (pp. 633–641). New York: Association for Computing Machinery.
Kozik, M., Krokhin, A., Valeriote, M., & Willard, R. (2015). Characterizations of several Maltsev conditions. Algebra universalis, 73(3), 205–224.
Larose, B., & Tesson, P. (2009). Universal algebra and hardness results for constraint satisfaction problems. Theoretical Computer Science, 410(18), 1629–1647.
Larose, B., Valeriote, M., & Zádori, L. (2009). Omitting types, bounded width and the ability to count. International Journal of Algebra and Computation 19(5).
Łuczak, T., & Nešetřil, J. (2006). When is a random graph projective? European Journal of Combinatorics, 27(7).
Mackworth, A. K. (1977). Consistency in networks of relations. Artificial Intelligence, 8, 99–118.
Maróti, M., & McKenzie, R. (2008). Existence theorems for weakly symmetric operations. Algebra Universalis, 59(3–4), 463–489.
McKay, B. D., & Piperno, A. (2014). Practical graph isomorphism, II. Journal of Symbolic Computation, 60, 94–112.
Nethercote, N., Stuckey, P. J., Becket, R., Brand, S., Duck, G. J., & Tack, G. (2007). Minizinc: Towards a standard CP modelling language. In C. Bessière (Ed.) Principles and Practice of Constraint Programming – CP 2007 (pp. 529–543). Berlin: Springer Berlin Heidelberg.
Opršal, J. (2022). PCSP Tools. https://github.com/jakuboprsal/pcsptools.
Sabin, D., & Freuder, E. C. (1994). Contradicting conventional wisdom in constraint satisfaction. In ECAI.
Schulte, C., Lagerkvist, M. Z., & Tack, G. (2010). Gecode: generic constraint development environment. http://www.gecode.org/.
Siggers, M. H. (2010). A strong Mal’cev condition for varieties omitting the unary type. Algebra Universalis, 64(1), 15–20.
Tatarko, W. (2019). CSP over oriented trees. Bachelor thesis: Charles University Prague.
Taylor, P. Algorithm for generating all unlabeled trees with n nodes? Computer Science Stack Exchange. https://cs.stackexchange.com/q/103287 (version: 2019-01-23).
Zhuk, D. (2020). A proof of the CSP dichotomy conjecture. Journal of the ACM, 67(5), 30:1–30:78.
Zhuk, D. N. (2017). A proof of CSP dichotomy conjecture. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15–17, pp. 331–342.
Acknowledgements
The authors thank the anonymous referees for their valuable comments. The authors are grateful to the Center for Information Services and High Performance Computing [Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)] at TU Dresden for providing its facilities for high throughput calculations.
Funding
Open Access funding enabled and organized by Projekt DEAL. Manuel Bodirsky has received funding from the European Research Council (Grant Agreement no. 681988, CSP-Infinity). Jakub Bulín was supported by the MŠMT ČR INTER-EXCELLENCE project LTAUSA19070 and the Charles University project UNCE/SCI/004. Florian Starke is supported by DFG Graduiertenkolleg 1763 (QuantLA).
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bodirsky, M., Bulín, J., Starke, F. et al. The smallest hard trees. Constraints 28, 105–137 (2023). https://doi.org/10.1007/s10601023093418
DOI: https://doi.org/10.1007/s10601023093418