Optimal Data Reduction for Graph Coloring Using Low-Degree Polynomials

The theory of kernelization can be used to rigorously analyze data reduction for graph coloring problems. Here, the aim is to reduce a q-Coloring input to an equivalent but smaller input whose size is provably bounded in terms of structural properties, such as the size of a minimum vertex cover. In this paper we settle two open problems about data reduction for q-Coloring. First, we obtain a kernel of bitsize $\mathcal{O}(k^{q-1}\log k)$ for q-Coloring parameterized by Vertex Cover for any $q \ge 3$. This size bound is optimal up to $k^{o(1)}$ factors assuming $\mathsf{NP} \not\subseteq \mathsf{coNP/poly}$, and improves on the previous-best kernel of size $\mathcal{O}(k^q)$.
We generalize this result for deciding q-colorability of a graph G, to deciding the existence of a homomorphism from G to an arbitrary fixed graph H. Furthermore, we can replace the parameter vertex cover by the less restrictive parameter twin-cover. We prove that H-Coloring parameterized by Twin-Cover has a kernel of size $\mathcal{O}(k^{\Delta(H)}\log k)$. Our second result shows that 3-Coloring does not admit non-trivial sparsification: assuming $\mathsf{NP} \not\subseteq \mathsf{coNP/poly}$, the parameterization by the number of vertices n admits no (generalized) kernel of size $\mathcal{O}(n^{2-\varepsilon})$ for any $\varepsilon > 0$.
Previously, such a lower bound was only known for coloring with $q \ge 4$ colors.


Introduction
The q-Coloring problem asks whether the vertices of a graph can be properly colored using q colors. It is one of many colorability problems on graphs that have been widely studied. Since these are often NP-hard, they are good candidates to study from a parameterized perspective [4,7]. Here we use additional parameters, other than the size of the input, to describe the complexity of the problem. In this paper we study preprocessing algorithms (called kernelizations or kernels) that aim to reduce the size of an input graph in polynomial time, without changing its colorability status. The natural choice for a parameter for q-Coloring is the number of colors q. However, since even 3-Coloring is NP-hard, this parameter does not give interesting results. Therefore the problem is studied using different parameters that often try to capture the complexity of the input graph. For example, Fiala et al. [8] compared the parameterized complexity of several coloring problems when parameterized by vertex cover to the complexity when parameterized by treewidth. Jansen and Kratsch [13] studied graph coloring when parameterized by a hierarchy of different parameters.
In this earlier work [13], Jansen and Kratsch provided a kernel for q-Coloring parameterized by Vertex Cover with $O(k^q)$ vertices that can be encoded in $O(k^q)$ bits. Furthermore they showed that for $q \ge 4$, a kernel of bitsize $O(k^{q-1-\varepsilon})$ is unlikely to exist. Unfortunately, these bounds left a gap of a factor k and it remained unclear whether the upper or the lower bound had to be strengthened. As our first main result, we manage to close this gap by improving the kernel.
To obtain this improvement, we can use a recent result by the current authors [14] about the kernelization of constraint satisfaction problems when parameterized by the number of variables. A non-trivial data reduction can be achieved when the constraints are given by equalities of low-degree polynomials on boolean variables. The size of the resulting instance then depends on the maximum degree of the given polynomials. Suppose now we are given a 3-Coloring instance G with vertex cover S and let I = V (G) \ S be the corresponding independent set. One can think of each vertex v ∈ I as a constraint of the form "my neighbors use at most 2 different colors", such that a remaining color can be used to color v. We write these constraints as polynomial equalities and apply our previous result to find out which ones are redundant. Since vertices of the independent set can be colored independently, a vertex that corresponds to a redundant constraint can be removed from G, without changing the 3-colorability of G. We can apply this idea to obtain a kernel for q-Coloring parameterized by Vertex Cover. The key technical step is to build a polynomial of degree q − 1 that captures the desired constraints.
In this paper, we further generalize the problem by studying the H-Coloring problem. The problem asks for a given graph G and fixed graph H, whether there exists a homomorphism $f : V(G) \to V(H)$ such that $\{u, v\} \in E(G) \Rightarrow \{f(u), f(v)\} \in E(H)$. Instead of using the size of a vertex cover as the parameter, we use a smaller parameter called twin-cover [9]. We show in Theorem 21 that H-Coloring parameterized by the size of a twin-cover has a kernel with $O(k^{\Delta(H)})$ vertices and bitsize $O(k^{\Delta(H)} \log k)$. Since q-Coloring is equivalent to $K_q$-Coloring where $K_q$ is the clique on q vertices, this result immediately gives a kernel of bitsize $O(k^{q-1} \log k)$ for q-Coloring parameterized by vertex cover. This closes the gap with the lower bound for q-Coloring up to $k^{o(1)}$ factors.
Often, when describing a kernel for a problem parameterized by a structural parameter like vertex cover, it is assumed that (an approximation of) the minimum vertex cover is given with the input [2,11]. However, an interesting feature of our kernel for H-Coloring is that it can be computed without knowing an (approximation of the) optimal twin-cover of the input graph. The fact that the graph has size-k twin-cover is only used to analyze the size of the resulting kernel.
Our second main result concerns the parameterization by the number of vertices n. The current authors showed in earlier work [15] that for a number of graph problems it is impossible to give a kernel of size $O(n^{2-\varepsilon})$, unless NP ⊆ coNP/poly. This implies that the number of edges cannot efficiently be reduced to a subquadratic amount without changing the answer, a task that is also known as sparsification. For example, q-Coloring was shown to have no non-trivial sparsification for any $q \ge 4$, unless NP ⊆ coNP/poly. The case for q = 3 remained open. One might think that 3-Coloring is so restrictive that a 3-colorable instance is likely to either be sparse, or have a very specific structure. Exploiting this structure could then allow for a non-trivial sparsification. In Theorem 27 we show that this is not the case: 3-Coloring allows no kernel of size $O(n^{2-\varepsilon})$, unless NP ⊆ coNP/poly.

Related work.
Hell and Nešetřil showed that H-Coloring is NP-hard for any non-bipartite graph H that has no self-loops [10]. For a bipartite graph, the problem is equivalent to testing whether the input graph is bipartite, and thus polynomial-time solvable. Chitnis et al. show that the problem of finding a smallest set W ⊆ V (G) such that G − W is H-list-colorable is FPT when H is (C 6 , P 6 )-free and bipartite, when parameterized by the size of H together with the solution size [3].
Ganian introduced Twin-Cover as a new parameter [9] and gives relations to existing parameters. For example, a minimum twin-cover is not larger than a minimum vertex cover, but twin-cover is incomparable to treewidth. The paper also gives an FPT algorithm for Precoloring Extension parameterized by the size of a twin-cover, and studies a number of other problems using this parameter.
Dell and Van Melkebeek showed that for any $d \ge 3$, d-CNF-Satisfiability with n variables has no kernel of size $O(n^{d-\varepsilon})$, unless NP ⊆ coNP/poly [6]. Continuing this line of research, precise kernel lower bounds were shown for a variety of problems. For example, it was shown that Vertex Cover is unlikely to have a kernel of size $O(k^{2-\varepsilon})$ [6], while a kernel with $O(k^2)$ edges and $O(k)$ vertices is known. Furthermore, the Point-Line cover problem, which asks to cover a set of n points in the plane with at most k lines, was proven to have a tight kernel lower bound of size $O(k^{2-\varepsilon})$ [16], assuming NP ⊈ coNP/poly. Dell and Marx [5] proved polynomial kernelization lower bounds for several packing problems. They showed how a table structure can help realize the reduction that is needed for such a lower bound. We will also use this table structure in our lower bound.

Preliminaries
To denote the set of numbers 1 to n, we use the notation $[n] := \{1, \ldots, n\}$. For $x, y \in \mathbb{Z}$ we write $x \equiv_2 y$ to denote that x and y are congruent modulo 2. For a finite set X and non-negative integer k, let $\binom{X}{k}$ be the collection of all subsets of X of size exactly k and let $\binom{X}{\le k}$ be the collection of all subsets of X of size at most k.

Graphs
All graphs considered in this paper are finite, simple, and undirected. In particular, this means that graphs do not have self-loops. A graph G has vertex set V (G) and edge set E(G).
Let ∆(G) denote the maximum degree of any vertex in G and let ω(G) denote the size of a largest clique in G.
A vertex cover of a graph G is a set S ⊆ V (G) such that each edge has at least one endpoint in S (equivalently, G − S is an independent set). We say vertices u and v ∈ V (G) are (true) twins whenever N G [u] = N G [v]. Note that this relation is transitive. We say X ⊆ V (G) is a twin-cover [9] of G, if for every edge {u, v} ∈ E(G), vertex u ∈ X, or v ∈ X, or u and v are twins.
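The two cover notions defined above can be compared on a small example. The following is a minimal sketch, assuming graphs are stored as a dictionary from each vertex to its set of neighbors; the function names are hypothetical and chosen only for illustration.

```python
def closed_neighborhood(adj, v):
    """Return N[v]: the neighbors of v together with v itself."""
    return adj[v] | {v}


def is_vertex_cover(adj, cover):
    """Every edge must have at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u in adj for v in adj[u])


def is_twin_cover(adj, cover):
    """Every edge must have an endpoint in `cover` or join two true twins."""
    for u in adj:
        for v in adj[u]:
            if u in cover or v in cover:
                continue
            if closed_neighborhood(adj, u) != closed_neighborhood(adj, v):
                return False
    return True


# Triangle {a, b, c}, all adjacent to x, plus a pendant vertex y attached to x.
adj = {
    "a": {"b", "c", "x"},
    "b": {"a", "c", "x"},
    "c": {"a", "b", "x"},
    "x": {"a", "b", "c", "y"},
    "y": {"x"},
}

print(is_twin_cover(adj, {"x"}))    # edges inside the triangle join twins
print(is_vertex_cover(adj, {"x"}))  # the triangle edges are not covered
```

In this example {x} is a twin-cover but not a vertex cover, which illustrates why a minimum twin-cover is never larger than a minimum vertex cover.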
A proper q-coloring of G is a function $f : V(G) \to [q]$ such that $f(u) \ne f(v)$ for all $\{u, v\} \in E(G)$. Let G and H be graphs. We say that G is H-colorable if there exists a function $f : V(G) \to V(H)$ such that $\{f(u), f(v)\} \in E(H)$ for all $\{u, v\} \in E(G)$. Such a function is also called a homomorphism from G to H. Note that G has a homomorphism to $K_q$ (a clique on q vertices) if and only if G is q-colorable. In this paper, we will only consider H-Coloring where H has no self-loops and is not bipartite, as otherwise the problem is polynomial-time solvable. We will frequently use the following properties of H-colorings in the remainder of the paper.

Parameterized complexity
A parameterized problem Q is a subset of $\Sigma^* \times \mathbb{N}$, where Σ is a finite alphabet. Let $Q, Q' \subseteq \Sigma^* \times \mathbb{N}$ be parameterized problems and let $h : \mathbb{N} \to \mathbb{N}$ be a computable function. A generalized kernel for Q into Q' of size h(k) is an algorithm that, on input $(x, k) \in \Sigma^* \times \mathbb{N}$, takes time polynomial in |x| + k and outputs an instance (x', k') such that:
1. |x'| and k' are bounded by h(k), and
2. (x', k') ∈ Q' if and only if (x, k) ∈ Q.
Since a polynomial-time reduction to an equivalent sparse instance yields a generalized kernel, a lower bound for the size of a generalized kernel can be used to prove the non-existence of sparsification algorithms.
We use the framework of cross-composition [1] to establish kernelization lower bounds, requiring the definitions of polynomial equivalence relations and or-cross-compositions. We repeat them here for completeness.
Definition 3 (Polynomial equivalence relation, [1, Def. 3.1]). An equivalence relation R on $\Sigma^*$ is called a polynomial equivalence relation if the following conditions hold.
1. There is an algorithm that, given two strings $x, y \in \Sigma^*$, decides whether x and y belong to the same equivalence class in time polynomial in |x| + |y|.
2. For any finite set $S \subseteq \Sigma^*$ the equivalence relation R partitions the elements of S into a number of classes that is polynomially bounded in the size of the largest element of S.
Definition 4 (Cross-composition, [1, Def. 3.3]). Let $L \subseteq \Sigma^*$ be a language, let R be a polynomial equivalence relation on $\Sigma^*$, let $Q \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem, and let $f : \mathbb{N} \to \mathbb{N}$ be a function. An or-cross-composition of L into Q (with respect to R) of cost f(t) is an algorithm that, given t instances $x_1, x_2, \ldots, x_t \in \Sigma^*$ of L belonging to the same equivalence class of R, takes time polynomial in $\sum_{i=1}^{t} |x_i|$ and outputs an instance $(y, k) \in \Sigma^* \times \mathbb{N}$ such that:
1. the parameter k is bounded by $f(t) \cdot (\max_i |x_i|)^c$, where c is some constant independent of t, and
2. instance (y, k) ∈ Q if and only if there is an $i \in [t]$ such that $x_i \in L$.
Theorem 5 ([1, Theorem 6]). Let $L \subseteq \Sigma^*$ be a language, let $Q \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem, and let d, ε be positive reals. If L is NP-hard under Karp reductions, has an or-cross-composition into Q with cost $f(t) = t^{1/d+o(1)}$, where t denotes the number of instances, and Q has a polynomial (generalized) kernelization with size bound $O(k^{d-\varepsilon})$, then NP ⊆ coNP/poly.
We will refer to an or-cross-composition of cost $f(t) = \sqrt{t} \log(t)$ as a degree-2 cross-composition. By Theorem 5, a degree-2 cross-composition can be used to rule out generalized kernels of size $O(k^{2-\varepsilon})$.
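As a quick check that this cost has the form required by Theorem 5 with d = 2, the logarithmic factor can be absorbed into the $t^{o(1)}$ term. This worked step uses only the identity $\log t = t^{\log\log t / \log t}$:

```latex
\[
  f(t) \;=\; \sqrt{t}\,\log t
       \;=\; t^{1/2} \cdot t^{\frac{\log\log t}{\log t}}
       \;=\; t^{1/2 + o(1)},
  \qquad\text{since } \frac{\log\log t}{\log t} \to 0 \text{ as } t \to \infty .
\]
```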

Kernel for H-Coloring parameterized by Twin-Cover
In this section, we give a kernel for H-Coloring parameterized by the size of a twin-cover. We start by showing how to partition the graph into vertex sets that are twins in Section 3.1.
We introduce some of the polynomial equalities that we use and their properties in Section 3.2, and use them in Section 3.3 to define the set of equalities that is constructed for a given input graph. In Section 3.4 we define the three reduction rules our kernel will use and prove that they are safe. Finally, in Section 3.5 we give the kernel.

Twin Decomposition
Computing a minimum Twin-Cover is NP-hard, since Vertex Cover is NP-hard on graphs where no two vertices are twins. We will therefore construct the kernel for H-Coloring without knowing a twin-cover of the input graph. In order to do this, we decompose the graph into vertex sets consisting of twins. Recall that throughout the paper, twins are vertices with the same closed neighborhood.
Definition 6 ((Partial) twin decomposition). A partial twin decomposition of a graph G is a partition Π = {P 1 , . . . , P m } of V (G), such that any two vertices in the same partite set are twins. Partition Π is a twin decomposition if furthermore any two vertices in different partite sets are not twins.
To be able to use the twin decomposition for the kernelization procedure, we show how it can efficiently be computed.

Lemma 7. A twin decomposition can be computed in
Proof. This is for example stated in [17,Exercise 2.17] for the case of finding false twins, which are vertices such that N G (u) = N G (v). Finding (true) twins is similar. An example solution uses the adjacency-list representation, and adds each vertex to its own adjacency list. Then we efficiently sort the vertices based on their adjacency lists and use this to find duplicates.
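The procedure sketched in the proof can be illustrated as follows. This is a minimal sketch, not the authors' implementation: it assumes the graph is a dictionary from vertices to neighbor sets with sortable vertex labels, and it groups vertices by hashing their sorted closed neighborhoods instead of sorting the adjacency lists globally. The name `twin_decomposition` is hypothetical.

```python
from collections import defaultdict


def twin_decomposition(adj):
    """Partition the vertices into classes of true twins.

    Each vertex is added to its own adjacency list, so that true twins get
    identical sorted lists; grouping by that list finds the duplicates.
    """
    classes = defaultdict(list)
    for v, neighbors in adj.items():
        key = tuple(sorted(neighbors | {v}))  # closed neighborhood N[v]
        classes[key].append(v)
    return [sorted(block) for block in classes.values()]


# Triangle {a, b, c}, all adjacent to x, plus a pendant vertex y attached to x.
adj = {
    "a": {"b", "c", "x"},
    "b": {"a", "c", "x"},
    "c": {"a", "b", "x"},
    "x": {"a", "b", "c", "y"},
    "y": {"x"},
}

print(sorted(twin_decomposition(adj)))
```

Here a, b, and c share the closed neighborhood {a, b, c, x} and end up in one partite set, while x and y form singleton sets.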
The next lemma shows how the twin decomposition and a minimal twin-cover may intersect.

Lemma 8.
Let G be a graph with twin decomposition Π and a minimal twin-cover S. Then for any partite set P ∈ Π it holds that either P ⊆ S or P ∩ S = ∅.
Proof. Let P ∈ Π. Suppose $P \cap S \ne \emptyset$ and $P \setminus S \ne \emptyset$. Let $u \in S \cap P$ and $v \in P \setminus S$. We show that $S \setminus \{u\}$ is a twin-cover of G, which contradicts the assumption that S is minimal.
Let {u, w} for $w \ne v$ be any edge in G. Since u and v are twins, it follows that {v, w} ∈ E(G). Since $v \notin S$ and S is a twin-cover, either w ∈ S and thus edge {u, w} is covered by w, or w and v are twins. In the latter case, by transitivity of being twins, u and w are also twins. This proves that $S \setminus \{u\}$ is indeed a twin-cover of G, which is a contradiction.


Modeling constraints as polynomial equalities
As explained in the introduction, the kernelization is based on a connection to constraint satisfaction problems. To find the kernel, we represent the constraints that a vertex set puts on the coloring of its neighborhood, as polynomial equalities. We then use this representation to find redundant vertices and edges in the graph. To use this idea, we need some additional lemmas and definitions. Recall that a monomial of degree d is the product of d variables, with the unique monomial of degree zero being the constant 1. For example, x 1 · x 3 · x 3 is a monomial of degree three. A monomial is multilinear if each variable occurs at most once.
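For vectors $x_1, \ldots, x_m$, we write $\mathrm{span}_2(\{x_1, \ldots, x_m\})$ for the set of all vectors y such that y is a linear combination of $x_1, \ldots, x_m$ when vectors are added component-wise, over the integers modulo 2.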
Let $p(x_1, \ldots, x_n)$ be a multivariate polynomial in (a subset of) the variables $x_1, \ldots, x_n$, evaluated over the integers modulo 2, of degree at most d for some fixed d. Hence p is a weighted sum of monomials of degree at most d over $x_1, \ldots, x_n$. For some fixed ordering of the monomials of degree at most d over $x_1, \ldots, x_n$, let vect(p) denote the vector containing the coefficients of the corresponding monomials in p.
Let P be a set of multivariate polynomials in (subsets of) the variables x 1 , . . . , x n . We use span 2 (P ) to denote span 2 ({vect(p) | p ∈ P }), and we use p ∈ span 2 (P ) to denote that span 2 (P ) contains vect(p).
The following lemma follows from the definition above.
Lemma 11. Let P be a set of polynomials of degree at most d over variable set y, and let q be a polynomial of degree at most d over y. If q ∈ span 2 (P ), then any assignment to y that satisfies p(y) ≡ 2 0 for all p ∈ P , satisfies q(y) ≡ 2 0.
Proof. Choose $\alpha_p \in \{0, 1\}$ for all p ∈ P such that $\mathrm{vect}(q) \equiv_2 \sum_{p \in P} \alpha_p \mathrm{vect}(p)$. Consider an assignment to the variables y with $p(y) \equiv_2 0$ for all p ∈ P. Let y' be the vector containing the evaluation of the monomials of degree at most d over y, for the values assigned to y. List them in the same order in which the coefficients for these monomials are listed in vect(·). Since a polynomial is a weighted sum of monomials, the value of a polynomial p of degree at most d in y for the assigned values equals the inner product of vect(p) and y'. So:

$$q(y) \equiv_2 \mathrm{vect}(q) \cdot y' \equiv_2 \sum_{p \in P} \alpha_p \, \mathrm{vect}(p) \cdot y' \equiv_2 \sum_{p \in P} \alpha_p \, p(y) \equiv_2 0.$$

To utilize polynomials over boolean variables to represent solutions of graph H-coloring problems, we represent the color of a vertex v in a graph G by |V(H)| boolean variables, indicating whether v has the corresponding color. We now define a partial choice assignment, which reflects that any vertex receives at most one color.

Definition 12 (Partial choice assignment). Let $y_{i,k} \in \{0, 1\}$ for $i \in [n]$, $k \in [q]$ be a set of boolean variables and let y be the vector containing all these variables. We say y is given a partial choice assignment if for all $i \in [n]$:

$$\sum_{k \in [q]} y_{i,k} \le 1.$$

Note that a partial choice assignment sets at most n variables to true. By this definition, a partial choice assignment can be seen as a partial coloring in the following way: $y_{i,k} = 1$ means vertex i has color k. Note that the coloring of some vertices may remain undefined.
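Lemma 11 and Definition 12 can be illustrated on a toy instance. The sketch below is an assumption-laden illustration rather than part of the construction: it represents a multilinear polynomial over the integers modulo 2 as a set of monomials, each monomial being a frozenset of variable names, so that adding two polynomials modulo 2 is the symmetric difference of their monomial sets. All names (`evaluate`, `add_mod2`, `is_partial_choice`) are hypothetical.

```python
from itertools import product


def evaluate(poly, assignment):
    """Evaluate a multilinear polynomial modulo 2.

    `poly` is a set of monomials; a monomial is a frozenset of variables
    (the empty frozenset would be the constant 1).
    """
    return sum(all(assignment[v] for v in mono) for mono in poly) % 2


def add_mod2(p, r):
    """Coefficient-wise sum modulo 2: monomials occurring twice cancel."""
    return p ^ r


def is_partial_choice(assignment, n, q):
    """Definition 12: every vertex i has at most one variable y[i, k] set."""
    return all(
        sum(assignment[(i, k)] for k in range(1, q + 1)) <= 1
        for i in range(1, n + 1)
    )


# Variables y[i, k] for two vertices and two colors.
variables = [(i, k) for i in (1, 2) for k in (1, 2)]
p1 = {frozenset({(1, 1), (2, 1)})}                                # y11*y21
p2 = {frozenset({(1, 1), (2, 1)}), frozenset({(1, 2), (2, 2)})}   # y11*y21 + y12*y22
r = add_mod2(p1, p2)                                              # y12*y22

assignments = [dict(zip(variables, bits)) for bits in product((0, 1), repeat=4)]
satisfying = [a for a in assignments
              if evaluate(p1, a) == 0 and evaluate(p2, a) == 0]

# r lies in the span of {p1, p2}, so it vanishes wherever both of them do.
print(all(evaluate(r, a) == 0 for a in satisfying))
```

The final check is exactly the statement of Lemma 11 for this instance: every assignment satisfying both generators also satisfies their sum modulo 2.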
The following lemma gives a polynomial that can be used to express the constraint that out of exactly q neighbors of a given vertex u, there are at least two that have the same color. By combining multiple such constraints, we can ensure that at most q − 1 different colors are used in the neighborhood of vertex u, leaving one color free for u itself in the q-coloring problem. When evaluating the polynomial for y that is given a partial choice assignment, the polynomial has the following two essential properties. (1) It equals 1 modulo 2 when the q vertices all receive a distinct color, and (2) it equals 0 modulo 2 whenever two vertices have the same color, or when two vertices have no color defined.

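Lemma 13. Let q > 0 be an integer and let $y_{i,k}$ for $i, k \in [q]$ be boolean variables, with y the vector containing them.
Then there exists a polynomial p of degree q − 1 over the integers modulo 2, such that whenever the variables in y are given a partial choice assignment, it holds that $p(y) \equiv_2 1$ if and only if there exist no $i, j, k \in [q]$ with $i \ne j$ such that $y_{i,k} = y_{j,k} = 1$, and for all $k \in [q-1]$ there exists $i \in [q]$ such that $y_{i,k} = 1$. Before proving Lemma 13, we give the polynomial p corresponding to q = 3 as an example.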
Proof of Lemma 13. Define the multivariate polynomial p as We prove that p has the desired properties. It is easy to see the degree of p is q − 1. It remains to prove the claim on the values of p(y) for partial choice assignments. So let y be given a partial choice assignment, and for each $i \in [q]$ let $x_i := k$ exactly when $y_{i,k} = 1$. Let We now show that $p(y) \equiv_2 1$ if there exist no $i, j, k \in [q]$ with $i \ne j$ such that $y_{i,k} = y_{j,k} = 1$, and for all $k \in [q-1]$ there exists $i \in [q]$ such that $y_{i,k} = 1$. In terms of the values for $x_i$, this implies that they are all distinct, and that $[q-1] \subseteq \{x_1, \ldots, x_q\}$. Thereby, we only have to consider the following two cases. Either {x 1 , . . . ,

Suppose that we are in one of the two situations above. For $k \in [q-1]$, let $j_k$ be the unique index such that $x_{j_k} = k$, implying that $y_{j_k,k} = 1$. Note that this is well defined, since all values from [q − 1] are used exactly once. Then, such that $x_i = k$. It follows from our earlier assumption that there must with $c \ne k$, let $i_c$ be the unique index such that $x_{i_c} = c$ and thus $y_{i_c,c} = 1$. Then However, $\prod_{c=1}^{q-1} y_{i_c,c} = 0$ for any other choice of $i_1, \ldots, i_{q-1}$. Thereby, $p(y) = 2 \equiv_2 0$.
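The characterization proven above can be checked by brute force for small q. The defining formula of p is not reproduced in this text, so the sketch below assumes a candidate polynomial, $\prod_{k=1}^{q-1} \sum_{i=1}^{q} y_{i,k}$ evaluated modulo 2, which has degree q − 1 and expands into a sum of products $\prod_{c} y_{i_c,c}$ over choices of $i_1, \ldots, i_{q-1}$. It is an assumption and may differ from the authors' definition. A partial choice assignment is encoded as a tuple giving each of the q vertices a color in [q] or `None`.

```python
from itertools import product


def candidate_p(colors, q):
    """Evaluate prod_{k=1}^{q-1} sum_{i=1}^{q} y[i, k] modulo 2, where
    y[i, k] = 1 exactly when colors[i] == k (None means 'no color')."""
    value = 1
    for k in range(1, q):
        value *= sum(1 for c in colors if c == k)
    return value % 2


def lemma13_condition(colors, q):
    """No color is used twice, and every color in [q-1] is used."""
    used = [c for c in colors if c is not None]
    no_repeat = len(used) == len(set(used))
    covers = all(k in used for k in range(1, q))
    return no_repeat and covers


def check(q):
    """Compare both sides on every partial choice assignment of q vertices."""
    options = [None] + list(range(1, q + 1))
    return all(
        candidate_p(colors, q) == int(lemma13_condition(colors, q))
        for colors in product(options, repeat=q)
    )


print([check(q) for q in (2, 3, 4)])
```

For q = 3 the candidate reads $(y_{1,1} + y_{2,1} + y_{3,1})(y_{1,2} + y_{2,2} + y_{3,2})$: it is odd exactly when colors 1 and 2 are each used by exactly one of the three vertices, which forces the third vertex to have color 3 or no color.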

Construction of polynomial equalities
We continue to define the polynomial equalities that will be constructed for a subset P of the vertices of G. These are necessary constraints on the coloring of N G (P ), such that P can be properly H-colored. In the construction, P will be a partite set of the twin decomposition of G, and hence a clique.
Let G be a graph with P ⊆ V (G). We create variables c v,i for each v ∈ V (G) and i ∈ V (H), denoting whether v has color i. Let C contain all constructed variables. Let L(P, G) be the set of polynomial equalities produced by the following procedure, which results in two types of constraints. The first will ensure that the neighborhood of P does not use too many colors, such that there are at least |P | remaining colors to color (the clique) P . The second will ensure that the coloring of the neighborhood of P can be extended to also color P .
For each set $S \subseteq N_G(P)$ with |S| = ∆(H) + 1 and each set $X \subseteq V(H)$ with |X| = |S|, use Lemma 13 to find a polynomial $p_{P,S,X}$ such that $p_{P,S,X}(C) \equiv_2 1$ if and only if the following two statements hold: Add the following constraint to L(P, G):

Proof. Let f be given and the value of any $c_{v,i} \in C$ be defined by $c_{v,i} = 1 \Leftrightarrow f(v) = i$. We show that this assignment satisfies all constraints in $L_\Pi(G)$, by showing that it satisfies both types of constraints in L(P, G) for all P ∈ Π. Consider some P ∈ Π. Since it consists of twins, it is a clique in G. As H has no self-loops, the vertices in P all receive distinct colors by Observation 1, and the colors used on P form a clique in H. The fact that P consists of twins also implies that {u, v} ∈ E(G) for all u ∈ P, $v \in N_G(P)$. Thereby, any color used in P is not used in the coloring of $N_G(P)$.
Consider a constraint $p_{P,S,X}(C) \equiv_2 0 \in L(P, G)$ for $S \subseteq N_G(P)$ of size ∆(H) + 1 and $X \subseteq V(H)$ of the same size. By Observation 2, the vertices in S use at most ∆(H) = |S| − 1 colors. Thereby, some color in X is used twice, or at least two colors in X are unused. It follows from Lemma 13 that $p_{P,S,X}(C) \equiv_2 0$ as required.
Consider a constraint q P,S,X (C) ≡ 2 0 ∈ L(P, G) for S ⊆ N (P ) and X = x 1 , . . . , x |S| ∈ V (H). Suppose this constraint is not satisfied. Then the coloring of S is given by X and

satisfies all constraints in L(P , G). Then f can be extended to properly color G.
Proof. Let f be given and C be defined by  G).
. To extend f to color P , assign each vertex in P a distinct color from K.
It remains to verify that we have given a proper H-coloring. Any edge between two vertices in V (G) \ P remains properly colored. Any edge in P is properly colored, because its endpoints have a different color and K is a clique in H. Any edge between P and V (G) \ P is properly colored, because all vertices in K are a common neighbor of the vertices in X, and K ∩ X = ∅.

Reduction rules
We now present the three reduction rules that will be used to obtain the kernel, and prove that they are safe. The first checks whether the graph is trivially not H-colorable, the second removes sets of edges from the graph, and the third removes sets of vertices from the graph.

Reduction rule 1. Let G be a graph with twin decomposition Π. If there exists P ∈ Π with |P | > ω(H), return a trivial no-instance.
It is easy to see that Reduction rule 1 preserves the answer to the problem, since G cannot have a proper H-coloring by Observation 1.
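Reduction rule 1 can be sketched as follows, assuming the twin decomposition is given as a list of vertex lists and H as a dictionary from vertices to neighbor sets. The clique number is computed by brute force, which is only reasonable because H is a fixed graph of constant size; the function names are hypothetical.

```python
from itertools import combinations


def clique_number(adj_h):
    """Brute-force omega(H): the size of a largest clique in H."""
    vertices = list(adj_h)
    for size in range(len(vertices), 0, -1):
        for candidate in combinations(vertices, size):
            if all(v in adj_h[u] for u, v in combinations(candidate, 2)):
                return size
    return 0


def rule1_rejects(decomposition, adj_h):
    """Return True if some set of twins is larger than omega(H)."""
    omega = clique_number(adj_h)
    return any(len(block) > omega for block in decomposition)


# H is a 5-cycle (clique number 2) or a triangle (clique number 3).
c5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
k3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
decomposition = [["a", "b", "c"], ["x"], ["y"]]

print(rule1_rejects(decomposition, c5))  # three mutual twins cannot map into C5
print(rule1_rejects(decomposition, k3))
```

A set of three twins forms a triangle in G, so it cannot be mapped to the 5-cycle, while the triangle H accommodates it.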

Reduction rule 2.
Let G be a graph with twin decomposition Π. Let $P' \ne P \in \Pi$ such that $E_G(P', P) \ne \emptyset$. If $L(P', G) \subseteq \mathrm{span}_2(L_\Pi(G \setminus E_G(P', P)))$, remove all edges in $E_G(P', P)$ from graph G.
Reduction rule 2 is the key rule for our kernelization. It simplifies the graph by removing all edges between two distinct sets of twins P and P', if the constraints L(P', G) are linear combinations of the constraints generated by the remaining graph $G \setminus E_G(P', P)$. The following lemma proves that the reduction rule is safe.

Lemma 18. If G' is obtained from G by applying Reduction rule 2, then G is H-colorable if and only if G' is H-colorable.
Proof. Since G' is a subgraph of G, any proper H-coloring of G is also a proper H-coloring of G'. In the other direction, let f be a proper H-coloring of G'. It follows from Lemma 16 and the fact that Π is a partial twin decomposition of $G \setminus E_G(P', P)$ that the derived setting of the boolean variables C satisfies the constraints in $L_\Pi(G \setminus E_G(P', P))$. Since $L(P', G) \subseteq \mathrm{span}_2(L_\Pi(G \setminus E_G(P', P)))$ it follows from Lemma 11 that this setting of C also satisfies all constraints in L(P', G). Let f' be defined as f restricted to the vertices in G' − P'. Note that G' − P' equals G − P' by definition. It is easy to see that f' is a proper H-coloring of G − P' since G' − P' is a subgraph of G' and f is a proper H-coloring of G'. Furthermore, f' satisfies the constraints in L(P', G) since it colors the relevant vertices the same as f. It now follows from Lemma 17 that we can extend f' to color all vertices in G. Thereby, G has a proper H-coloring.
The final rule effectively removes isolated cliques from the graph, when H has a sufficiently large clique to allow them to be colored properly.

The set $L_\Pi(G)$ contains at most $m := 2n \cdot n^{\Delta(H)+1} \cdot |V(H)|^{\Delta(H)+1}$ polynomial equalities (the number of ways to pick S, X and P as for the definition of $p_{P,S,X}$ and $q_{P,S,X}$), over $n \cdot |V(H)|$ variables. All polynomials we employ are multilinear. This can be verified directly from their construction, and explained by noting that squaring a number does not change it, when working modulo 2. By Lemma 9, we therefore only have to consider $(n \cdot |V(H)|)^{\Delta(H)} + 1$ coefficients for the polynomials. Constructing the required polynomial equalities can be done in polynomial time, for fixed H. We can test if one vector lies in the span of a set of other vectors by comparing the ranks of matrices of dimensions at most $m \times ((n \cdot |V(H)|)^{\Delta(H)} + 1)$. Thereby, Reduction rule 2 can be applied in polynomial time. Reduction rule 3 can trivially be applied in polynomial time. Since |Π| ≤ |V(G)|, checking for all P ∈ Π whether any of the reduction rules can be applied takes polynomial time.
Each rule can be applied at most $|V(G)|^2$ times, as it always removes at least one edge or vertex. The claim follows.
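The rank comparison used to apply Reduction rule 2 can be sketched as follows. This is one possible way to carry out the test, assuming each coefficient vector vect(p) is packed into a Python integer whose bit i holds the coefficient of the i-th monomial; the helper names are hypothetical.

```python
def rank_gf2(vectors):
    """Rank over the integers modulo 2 of vectors packed into ints."""
    basis = {}  # position of leading bit -> basis vector with that leading bit
    for vec in vectors:
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                basis[lead] = vec
                break
            vec ^= basis[lead]  # eliminate the leading bit
    return len(basis)


def in_span_gf2(target, vectors):
    """target is in the span iff adding it does not increase the rank."""
    vectors = list(vectors)
    return rank_gf2(vectors) == rank_gf2(vectors + [target])


generators = [0b110, 0b011]
print(in_span_gf2(0b101, generators))  # 110 xor 011 = 101
print(in_span_gf2(0b100, generators))
```

To test a condition of the form $L(P', G) \subseteq \mathrm{span}_2(\cdot)$, one would call `in_span_gf2` once for every coefficient vector on the left-hand side.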
Let G' be the result of applying Reduction rules 1, 2, and 3 exhaustively. We use the following claim to prove a bound on the size of G'.

Proof. When Reduction rule 1 has been applied at any point, G' trivially has constant size. Otherwise, since G has a twin-cover of size k, it follows from Lemma 20 that G' has a twin-cover of size at most k. Let Y be a minimum twin-cover of G', such that |Y| ≤ k. Let Π be the twin decomposition of G'. By Lemma 8, every P ∈ Π satisfies P ⊆ Y or P ∩ Y = ∅. Let Π' ⊆ Π be the collection of partite sets that are disjoint from Y, and let $L_{tc} := \bigcup_{P \in \Pi'} L(P, G')$.

Let $L'_{tc} \subseteq L_{tc}$ be a basis of the vectors of $L_{tc}$, working modulo 2. Since all employed polynomials are multilinear, it follows that the vectors in $L_{tc}$ only have nonzero coefficients for positions corresponding to multilinear monomials, of which there are at most α by Lemma 9. As the size of the basis $L'_{tc}$ equals the rank of the matrix containing the (row)vectors $L_{tc}$, which is upper-bounded by the number of columns that contain a nonzero entry, it follows that $|L'_{tc}| \le \alpha$.
We define a set of meta-edges F ⊆ (Π × (Π \ Π )) based on the constraints in L tc . For each constraint Z in L tc , do the following.
Suppose $Z = p_{P',S,X}(C) \equiv_2 0$ for some P' ∈ Π', $S \subseteq N_{G'}(P')$ and $X \subseteq V(H)$. Since P' is a partite set of twins that is disjoint from Y, we have $N_{G'}(P') \subseteq Y$ since Y is a twin-cover. So each v ∈ S belongs to a partite set $P_v$ of twins with $P_v \in \Pi \setminus \Pi'$. For each v ∈ S, add $(P', P_v)$ to F. Otherwise, $Z = q_{P',S,X}(C) \equiv_2 0$ for some P' ∈ Π', $S \subseteq N_{G'}(P')$, and sequence $X = x_1, \ldots, x_k \in V(H)$. Similarly as above, for each v ∈ S take $P_v \in \Pi \setminus \Pi'$ such that $v \in P_v$ and add $(P', P_v)$ to F. The above procedure adds at most ∆(H) + 1 meta-edges for each constraint in $L'_{tc}$. Thereby,

$$|F| \le (\Delta(H) + 1) \cdot |L'_{tc}| \le (\Delta(H) + 1) \cdot \alpha. \tag{1}$$

We now argue that for any $(P', P) \notin F$ with P' ∈ Π' and $P \in \Pi \setminus \Pi'$, the following holds:

$$L'_{tc} \subseteq L_\Pi(G' \setminus E_{G'}(P', P)). \tag{2}$$

To see this, consider a constraint in $L'_{tc}$. It is of one of two possible types, and it was added to $L_{tc} = \bigcup_{P \in \Pi'} L(P, G') \supseteq L'_{tc}$ because it satisfied the criteria described in Section 3.3. Effectively, the constraint was created because some set $P^* \in \Pi'$ contains a certain vertex set S of size at most ∆(H) + 1 in its open neighborhood in G'. But by our choice of meta-edges F, the set $P^*$ still has S in its neighborhood in $G' \setminus E_{G'}(P', P)$, so that all constraints of $L'_{tc}$ are also contained in $L_\Pi(G' \setminus E_{G'}(P', P))$. Using this, we show that for all P' ∈ Π' and $P \in \Pi \setminus \Pi'$:

$$E_{G'}(P', P) \ne \emptyset \Rightarrow (P', P) \in F. \tag{3}$$

Suppose there exist P' ∈ Π', $P \in \Pi \setminus \Pi'$ such that $E_{G'}(P', P) \ne \emptyset$ but $(P', P) \notin F$. It follows from Equation 2 that $L'_{tc} \subseteq L_\Pi(G' \setminus E_{G'}(P', P))$. Thereby,

$$L(P', G') \subseteq L_{tc} \subseteq \mathrm{span}_2(L'_{tc}) \subseteq \mathrm{span}_2(L_\Pi(G' \setminus E_{G'}(P', P))).$$

Thereby, Reduction rule 2 could be applied to G', which is a contradiction. It follows that P' ∈ Π' and $P \in \Pi \setminus \Pi'$ can only be connected in G', if there is a corresponding meta-edge in F. We can now use Equations 1 and 3 to bound the number of vertices and edges in G'.
First of all, for all P ∈ Π there must exist some P ∈ Π \ Π such that (P , P ) ∈ F , otherwise it follows from Equation 3 that N G (P ) = ∅ and P would have been removed by Reduction rule 3. Thereby |Π | ≤ |F |. Since |P | ≤ ω(H) ≤ ∆(H) + 1 for all P ∈ Π by Reduction rule 1, the number of vertices of G can be bounded as follows.
If edge {u, v} ∈ G with u ∈ Y and v / ∈ Y , then there exist (P , P ) ∈ F such that u ∈ P , v ∈ P . Since |P | ≤ ∆(H) + 1 for any P ∈ Π, there are at most |F | · (∆(H) + 1) 2 such edges. Furthermore, there are at most |Y | 2 ≤ k 2 edges between vertices in Y , and at most |F | · (∆(H) + 1) 2 edges between vertices in V (G) \ Y . Thereby, the total number of edges can be bounded by: This concludes the proof of Claim 23.
It follows from the correctness of Reduction rules 1, 2, and 3 that G' is H-colorable if and only if G is H-colorable. It follows from Claims 22 and 23 that we have given a kernel for H-Coloring with O(k^{∆(H)}) vertices and edges for constant-size H that can be computed in polynomial time. Using adjacency lists, the resulting graph can be encoded in O(k^{∆(H)} · log k) bits.
The following corollary shows that Theorem 21 generalizes the result obtained for q-Coloring parameterized by vertex cover in the extended abstract of this work.
Corollary 24. For any constant q ≥ 3, q-Coloring parameterized by the size of a twin-cover has a kernel with O(k^{q−1}) vertices, which can be encoded in O(k^{q−1} log k) bits. Furthermore, the resulting instance is a subgraph of the original input graph.
Proof. Since q-Coloring is equivalent to K_q-Coloring, ∆(K_q) = q − 1, and K_q has q vertices, the result follows directly from Theorem 21.

Sparsification lower bound for 3-Coloring
In this section we provide a sparsification lower bound for 3-Coloring. We show that 3-Coloring does not have a (generalized) kernel of size O(n^{2−ε}), unless NP ⊆ coNP/poly. This also provides a kernel lower bound for 3-Coloring parameterized by the size of a twin-cover that matches the upper bound given in the previous section up to k^{o(1)} factors. For ease of presentation, we prove the lower bound by giving a degree-2 cross-composition from a tailor-made problem to 3-List Coloring. The input to 3-List Coloring is a graph G together with a function L that assigns to each vertex v a list L(v) ⊆ {1, 2, 3}. The problem asks whether there exists a proper coloring of G such that each vertex is assigned a color from its list.

Before presenting the cross-composition, we introduce an important gadget that will be used. It was constructed by Jaffke and Jansen [12]. The gadget, which we will call a blocking-gadget, will be used to forbid one specific coloring of a given vertex set. The following lemma is a rephrased version of Lemma 15 in [12].

The blocking-gadget can be used to forbid one specific coloring, given by the tuple c, of a set of vertices v_1, ..., v_m, by adding a blocking-gadget(c) and connecting π_i to v_i for all i ∈ [m]. If the color of v_i is c_i for all i, then the inserted edges prevent every π_i from receiving the corresponding color c_i, and by Lemma 25 the coloring cannot be extended to the gadget. If however the color of v_i differs from c_i for some i, the gadget can be properly colored.
Having presented the gadget we use in our construction, we define the source problem for the cross-composition. This problem was also used as the starting problem for a cross-composition in our earlier sparsification lower bound for 4-Coloring [15].

2-3-Coloring with Triangle Split Decomposition
Input: A graph G with a partition of its vertex set into U ∪ V such that G[U] is an edgeless graph and G[V] is a disjoint union of triangles.
Question: Is there a proper 3-coloring c : V(G) → {1, 2, 3} of G, such that c(u) ∈ {1, 2} for all u ∈ U?

We will refer to such a coloring as a 2-3-coloring of the graph G, since two colors are used to color U, and three to color V.
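To make the problem definition concrete, the following minimal sketch decides 2-3-Coloring with Triangle Split Decomposition by exhaustive search on tiny inputs. The function name `has_2_3_coloring` and the edge-list representation are assumptions for illustration; the exponential running time is only acceptable for toy instances.

```python
from itertools import product


def has_2_3_coloring(U, V, edges):
    """Brute-force test: is there a proper 3-coloring with c(u) in {1, 2} on U?

    U and V are lists of vertex names, edges is a list of vertex pairs.
    Hypothetical helper, for illustration only.
    """
    vertices = list(U) + list(V)
    # Vertices of U may only use colors 1 and 2; vertices of V may use all three.
    palettes = [(1, 2)] * len(U) + [(1, 2, 3)] * len(V)
    for colors in product(*palettes):
        c = dict(zip(vertices, colors))
        if all(c[a] != c[b] for a, b in edges):
            return True
    return False


triangle = [("v1", "v2"), ("v2", "v3"), ("v1", "v3")]

# u1 is forced to the color of v3, and u2 only has to avoid that color.
yes_edges = triangle + [("u1", "v1"), ("u1", "v2"), ("u2", "v3")]
print(has_2_3_coloring(["u1", "u2"], ["v1", "v2", "v3"], yes_edges))  # True

# Each u is forced to the color of a different triangle vertex, so one of
# them would need color 3.
no_edges = triangle + [("u1", "v1"), ("u1", "v2"),
                       ("u2", "v1"), ("u2", "v3"),
                       ("u3", "v2"), ("u3", "v3")]
print(has_2_3_coloring(["u1", "u2", "u3"], ["v1", "v2", "v3"], no_edges))  # False
```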

Lemma 26 ([14, Lemma 3]). 2-3-Coloring with Triangle Split Decomposition is NP-complete.
To establish a quadratic lower bound on the size of generalized kernels, it suffices to give a degree-2 cross-composition from this special coloring problem into 3-Coloring. Effectively, we have to show that for any t, one can efficiently embed a series of t size-n instances indexed as X_{i,j} for i, j ∈ [√t], into a single 3-Coloring instance with O(√t · n^{O(1)}) vertices that acts as the logical OR of the inputs. To achieve this composition, a common strategy is to construct vertex sets S_i and T_i of size n^{O(1)} for i ∈ [√t], such that the graph induced by S_i ∪ T_j encodes input X_{i,j}. The fact that the inputs can be partitioned into an independent set and a collection of triangles facilitates this embedding; we represent the independent set within sets S_i and the triangles in sets T_i. To embed t inputs into a graph on O(√t · n^{O(1)}) vertices, each vertex will have incident edges corresponding to many different input instances. The main issue when trying to find a cross-composition into 3-Coloring is to ensure that when there is one 2-3-colorable input graph, the entire graph becomes 3-colorable. This is difficult, since the neighbors that a vertex in S_i has among the many different sets T_j should not invalidate the coloring. For vertices in some set T_j, we have a similar issue. Our choice of starting problem ensures that if some combination S_{i*}, T_{j*} corresponding to input X_{i*,j*} has a 2-3-coloring, then the remaining sets T_j can be safely colored 3, since vertices in S_{i*} will use only two of the available colors. The key insight to ensure that vertices in the remaining S_i can also be colored is to split them into multiple copies that each have at most one neighbor in any T_j. There will be at most one vertex in the neighborhood of a copy that is colored using color 1 or 2, so we can always color the copy using the other available color.
Finally, additional gadgets will ensure that in some S i all these copies get equal colors, and in some T j the vertices that correspond to a triangle in the inputs are properly colored as such. With this intuition, we give the construction.

Proof.
To prove this statement, we give a degree-2 cross-composition from 2-3-Coloring with Triangle Split Decomposition to 3-List Coloring and then show how to change this instance into a 3-Coloring instance. We start by defining a polynomial equivalence relation R on instances of 2-3-Coloring with Triangle Split Decomposition. Let two instances be equivalent under R when the sets U have the same size and the sets V consist of the same number of triangles. It is easy to verify that R is a polynomial equivalence relation.
By duplicating one of the inputs several times if needed, we ensure that the number of inputs to the cross-composition is a square. This increases the number of inputs by at most a factor four and does not change the value of the OR. Therefore, assume we are given t instances of 2-3-Coloring with Triangle Split Decomposition such that t' := √t is an integer. Enumerate these instances as X_{i,j} for i, j ∈ [t'] and let instance X_{i,j} have graph G_{i,j}. For input instance X_{i,j}, let U and V be such that U is an independent set with |U| = m and V consists of n vertex-disjoint triangles. Enumerate the vertices in U as u_1, ..., u_m and in V as v_1, ..., v_{3n} such that v_{3k−2}, v_{3k−1}, v_{3k} form a triangle for k ∈ [n]. We now create an instance of the 3-List Coloring problem, consisting of a graph G' together with a list function L that assigns a subset of the color palette {1, 2, 3} to each vertex.
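The padding and two-dimensional indexing of the inputs described above can be sketched as follows. The helper name `pad_to_square_grid` is hypothetical, the grid is 0-indexed rather than 1-indexed, and padding up to the next perfect square is one possible way to stay within the factor-four bound; the text does not prescribe a specific padding scheme.

```python
from math import isqrt


def pad_to_square_grid(instances):
    """Duplicate the first instance until the count is a perfect square,
    then arrange the instances as a side x side grid X[i][j].

    Assumes a non-empty list. Hypothetical helper, for illustration only.
    """
    t = len(instances)
    side = isqrt(t)
    if side * side < t:
        side += 1  # round up to the next perfect square
    padded = list(instances) + [instances[0]] * (side * side - t)
    return [padded[i * side:(i + 1) * side] for i in range(side)]


grid = pad_to_square_grid(["a", "b", "c", "d", "e"])
print(grid)  # [['a', 'b', 'c'], ['d', 'e', 'a'], ['a', 'a', 'a']]
```

Because only copies of an existing input are added, the set of distinct instances is unchanged, so the OR over all inputs keeps its value.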
Refer to Figure 1 for a sketch of G'.

[Figure 1: Sketch of the constructed graph G'. Caption: ... together represent a single vertex of the independent set of an input instance, which is split into copies to ensure that every copy has at most one neighbor in each cell of T (the bottom row in Figure 1a) ... an input graph. They are not connected, so that we can safely color all vertices that do not correspond to a 3-colorable input with color 3.]

7. ... add a blocking-gadget((c_1, c_2, 1)) to G'. Connect s^i_{k,ℓ} to π_1, s^i_{k+1,ℓ} to π_2, and a_i to π_3. This ensures that when a_i has color 1, vertices s^i_{k,ℓ} and s^i_{k',ℓ} have the same color for all k, k' ∈ [3n].
8. For every j ∈ [t'], k ∈ [n], for every c_1, c_2, c_3 ∈ [3] that are not all pairwise distinct, add a blocking-gadget((c_1, c_2, c_3, 1)) to G'. Connect t^j_{3k−2} to π_1, t^j_{3k−1} to π_2, t^j_{3k} to π_3, and b_j to π_4. This construction ensures that if b_j is colored 1, all "triangles" in T_j are properly colored. If b_j is colored 2 however, the gadgets add no additional restrictions to the coloring of vertices in T_j.

Connect vertex s
This concludes the construction of G'; we proceed with the analysis.

... a blocking-gadget((c_1, c_2, c_3)) was added in Step 7 and connected to these three vertices. But by Lemma 25, it follows that any list-coloring of this blocking-gadget must assign color c_i to ... are assigned equal colors and thus at least one of them has a color different from the one given by c_1 and c_2, as these colors are distinct. Thus, the colors that are forbidden on vertices π_i by the connections to the rest of the graph do not correspond to (c_1, c_2, c_3), and c can be extended to color the entire blocking-gadget by Lemma 25.
Similarly, coloring c can be extended to each blocking-gadget(c) added in Step 8, as either π_4 in the gadget is connected to b_j for some j ≠ j* and c(b_j) = 2 ≠ c_4, or the three vertices from T connected to this gadget are colored with three different colors.
The claim above shows that we have given a cross-composition into 3-List Coloring.
To obtain an instance of 3-Coloring, we add a triangle consisting of vertices {C_1, C_2, C_3} to the graph. We connect a vertex v in G' to C_i if i ∉ L(v) for i ∈ [3]. This graph now has a proper 3-coloring if and only if the original graph had a proper 3-list coloring. Thus, by Claim 30, the resulting 3-Coloring instance acts as the logical OR of the inputs.
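The list-removal step above can be sketched in a few lines. The function names, the edge-list representation, and the tuple names `("C", i)` for the palette vertices are assumptions for illustration; the brute-force colorability check is only there to sanity-check the reduction on toy inputs.

```python
from itertools import product


def list_to_plain_3coloring(vertices, edges, lists):
    """Add a triangle C1, C2, C3 and join v to C_i whenever i is not in L(v).

    Assumes no input vertex is already named ("C", i). Hypothetical helper.
    """
    palette = {i: ("C", i) for i in (1, 2, 3)}
    new_vertices = list(vertices) + [palette[i] for i in (1, 2, 3)]
    new_edges = list(edges)
    new_edges += [(palette[1], palette[2]),
                  (palette[2], palette[3]),
                  (palette[1], palette[3])]
    for v in vertices:
        for i in (1, 2, 3):
            if i not in lists[v]:
                # v may not share the color of C_i, which plays the role of color i.
                new_edges.append((v, palette[i]))
    return new_vertices, new_edges


def is_3_colorable(vertices, edges):
    """Exhaustive 3-colorability test, suitable for tiny graphs only."""
    for colors in product((1, 2, 3), repeat=len(vertices)):
        c = dict(zip(vertices, colors))
        if all(c[a] != c[b] for a, b in edges):
            return True
    return False


# Edge x-y where both endpoints may only use color 1: no list coloring exists.
g_no = list_to_plain_3coloring(["x", "y"], [("x", "y")], {"x": {1}, "y": {1}})
print(is_3_colorable(*g_no))   # False

# Allowing color 2 on y makes the instance list-colorable.
g_yes = list_to_plain_3coloring(["x", "y"], [("x", "y")], {"x": {1}, "y": {1, 2}})
print(is_3_colorable(*g_yes))  # True
```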
It remains to bound the number of vertices of G'. In Step 1 we add |S| = m · 3n · t' vertices and in Step 2 we add another |T| = 3n · t' vertices. Then in Step 4 we add |A| + |B| = 2t' additional vertices. The two blocking-gadgets added in Steps 5 and 6 each have size O(t'). The blocking-gadgets added in Step 7 have constant size, and we add six of them for each i ∈ ...

The set of all vertices of a graph is always a valid vertex cover for that graph. Thereby, it follows from Theorem 27 that the lower bound also holds when parameterized by vertex cover. In [13, Theorem 3], it was shown that for any q ≥ 4, q-Coloring parameterized by vertex cover does not have a generalized kernel of size O(k^{q−1−ε}), unless NP ⊆ coNP/poly. Combining these results gives a lower bound for q-Coloring parameterized by vertex cover size.
The above lower bound carries over to q-Coloring parameterized by the size of a twin-cover, since any vertex cover of a graph is also a valid twin-cover. Recall that q-Coloring is equivalent to K_q-Coloring and ∆(K_q) = q − 1. Thereby, the above lower bound matches the kernel size given in Theorem 21 up to k^{o(1)} factors.

Conclusion
We have given a kernel for H-Coloring parameterized by Twin-Cover with O(k^{∆(H)}) vertices and bitsize O(k^{∆(H)} log k). This kernel can be obtained without using information about (an approximation of) the minimum twin-cover of the input graph. It follows from this result that q-Coloring parameterized by Vertex Cover has a kernel of bitsize O(k^{q−1} log k), improving on the previously known kernel by almost a factor k. Furthermore, 3-Coloring parameterized by the number of vertices has no kernel of size O(n^{2−ε}), unless NP ⊆ coNP/poly. It was already known that for q ≥ 4, q-Coloring parameterized by Vertex Cover was unlikely to have a kernel of size O(k^{q−1−ε}). Combining these results allows us to give the same lower bound for q = 3, under the assumption that NP ⊈ coNP/poly.
Thereby we have provided an upper and lower bound on the kernel size of q-Coloring parameterized by Vertex Cover for any q ≥ 3 that match up to k^{o(1)} factors. It is easy to see that the kernel lower bounds also hold for q-List Coloring, where every vertex v in the graph has a list L(v) ⊆ [q] of allowed colors. Furthermore, we can also apply our kernel, by first reducing an instance of q-List Coloring to an instance of q-Coloring using q additional vertices, and adding these q vertices to the twin-cover of the graph. This only changes the size of the obtained kernel by a constant factor. The kernel does not extend to the general List H-Coloring problem, since the gadget to simulate the list constraints only works correctly when H is a clique.
In this paper we gave a first example where finding redundant vertices and edges is done using appropriate polynomial equalities. It would be interesting to see if this technique can be applied to obtain smaller kernels for other graph problems as well. To apply this idea, one needs to first identify which constraints should be modeled. When the constraints are found, they need to be written as equalities of low-degree polynomials over a suitably chosen field. This requires the clever construction of polynomials that have a sufficiently low degree, in order to obtain a good bound on the kernel size.