Exact Algorithms for Intervalizing Coloured Graphs

In the Intervalizing Coloured Graphs problem, one must decide for a given graph G = (V, E) with a proper vertex colouring of G whether G is a subgraph of a properly coloured interval graph. For the case that the number of colors is fixed, we give an exact algorithm that uses $2^{\mathcal{O}(n/\log n)}$ time. We also give an $\mathcal{O}^{\ast}(2^{n})$ algorithm for the case that the number of colors is not fixed.


Introduction
The area of exact algorithms for NP-hard problems is an old area in the field of design and analysis of algorithms, but also one with many important new developments. A recent overview of the area can be found in the book by Fomin and Kratsch [14]. In this paper, we consider exact algorithms for the problem called INTERVALIZING COLOURED GRAPHS. This problem is defined in the following way. Given a graph G = (V, E) together with a proper vertex colouring c : V → {1, ..., k} of G (a colouring c is proper if for all edges {v, w} ∈ E: c(v) ≠ c(w)), one must decide if G is a subgraph of a properly coloured interval graph, i.e., can we add edges, such that each edge is between vertices of different colors and the result is an interval graph? The problem has its original motivation in DNA physical mapping [13].
This problem is NP-complete [13] (see also [18]), even when the number of colors equals four [5,6] and, in addition, the inputs are restricted to caterpillar trees [1]. We denote the version of the problem where the number k of colors is fixed by INTERVALIZING k-COLOURED GRAPHS, and the version with a potentially unbounded number of colors by INTERVALIZING COLOURED GRAPHS.
If the number of colors equals two, the problem is trivially solvable in linear time. For three colors, the problem is solvable in quadratic time with a complicated algorithm [7]; the case for three colors and biconnected graphs is described in [6].
In this paper, we give two algorithms: one for the case that the number of colors is bounded by a constant, and one for the case that the number of colors is unbounded. The former result is relevant if the number of colors is a constant that is at least four, and forms a somewhat curious exception to a general pattern that can be observed in most exact algorithms for graph problems. Namely, the algorithm uses slightly less than exponential time. Most NP-hard problems that have subexponential algorithms deal with planar graphs and generalizations of planar graphs, see e.g., [12,16,23]. Typically, the running time of such algorithms is of the form 2^{O(√n)}. Our result is unlike most results in two ways: first, our inputs are general graphs (but a positive answer implies bounded pathwidth of the input; see Proposition 3 and its discussion), and secondly, the running time is 'just subexponential': it is bounded by 2^{O(n/ log n)}.
Our algorithm for INTERVALIZING k-COLOURED GRAPHS for fixed k can be viewed as a dynamic programming algorithm in Held-Karp style [21], resembling algorithms for some graph layout problems given e.g., in [8], but with one additional improvement: an isomorphism test on certain parts of the graph during the dynamic programming. Important concepts that facilitate the presentation of our results are the notions of path decomposition and nice path decomposition.
Our second result is an algorithm for INTERVALIZING COLOURED GRAPHS (with no bound on the number of colors); this algorithm runs in O^*(2^n) time. It is a rather simple dynamic programming algorithm, also in Held-Karp style, and is the first exact algorithm for the problem.
This paper is organized as follows. In Section 2, preliminary definitions and results are given. In Section 3, the notion of partial path decomposition and some related notions are introduced, and a few structural results on these notions are derived. In Section 4, we give the algorithm for INTERVALIZING k-COLOURED GRAPHS, and analyse its running time. In Section 5, the algorithm for INTERVALIZING COLOURED GRAPHS is given. Some final remarks are made in Section 6.

Preliminaries
In this section, we introduce some standard notations, and give a few preliminary results on path decompositions.
The graphs in this paper are considered to be undirected and simple. We make a distinction between labelled and unlabelled graphs. In a labelled graph, each vertex has a unique label, and two isomorphic graphs with different labels are considered to be different. In contrast, two isomorphic unlabelled graphs are considered to be the same object. We also consider graphs given with a vertex colouring. A vertex colouring of a graph G = (V, E) is a function c : V → C, for some finite set C; |C| is the number of colours. A vertex colouring c is proper, if for all edges {v, w} ∈ E, we have that c(v) ≠ c(w).
Throughout the paper, we assume that the input graph G is connected; if not, we can run the algorithm separately on each connected component of G.
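For a graph G = (V, E) and a set of vertices W ⊆ V, we denote by G[W] the subgraph induced by W: G[W] = (W, {{v, w} ∈ E | v, w ∈ W}). A graph is an interval graph if we can assign to each vertex an interval on the real line such that two vertices are adjacent if and only if their intervals intersect. A graph H = (V, F) is an interval completion of a graph G = (V, E), if G and H have the same vertex set, E ⊆ F, and H is an interval graph. More background can be found in [17]; see also [20].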
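A path decomposition of a graph G = (V, E) is a sequence of subsets of V called bags, (X_1, X_2, ..., X_r), such that:
• ⋃_{1≤i≤r} X_i = V.
• For all {v, w} ∈ E, there is an i, 1 ≤ i ≤ r, with v, w ∈ X_i.
• For all i_0 < i_1 < i_2, X_{i_0} ∩ X_{i_2} ⊆ X_{i_1}; that is, each vertex occurs in a series of consecutive bags.
The width of a path decomposition (X_1, X_2, ..., X_r) is max_{1≤i≤r} |X_i| − 1. The pathwidth of a graph G is the minimum width of a path decomposition of G.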
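The three standard requirements on a path decomposition (every vertex is covered, every edge lies in some bag, and every vertex occupies consecutive bags) can be checked mechanically. The following is a minimal Python sketch, not part of the paper; the function names `is_path_decomposition` and `width`, and the representation of bags as Python sets, are assumptions made for illustration.

```python
def is_path_decomposition(vertices, edges, bags):
    """Check the three standard path decomposition conditions."""
    bags = [set(b) for b in bags]
    vertices = set(vertices)
    # Condition 1: the union of all bags is exactly the vertex set.
    covered = set().union(*bags) if bags else set()
    if covered != vertices:
        return False
    # Condition 2: every edge has both endpoints together in some bag.
    for v, w in edges:
        if not any(v in b and w in b for b in bags):
            return False
    # Condition 3: each vertex occurs in a run of consecutive bags.
    for v in vertices:
        idx = [i for i, b in enumerate(bags) if v in b]
        if idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True


def width(bags):
    """Width of a decomposition: maximum bag size minus one."""
    return max(len(b) for b in bags) - 1
```

For the path a, b, c with edges {a, b} and {b, c}, the sequence ({a, b}, {b, c}) passes all three checks and has width 1.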
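A path decomposition (X_1, X_2, ..., X_r) is nice, if for all i, 1 ≤ i < r, exactly one of the following two cases holds:
• There is a vertex v ∉ X_i with X_{i+1} = X_i ∪ {v}. We call X_{i+1} an introduce node.
• There is a vertex v ∈ X_i with X_{i+1} = X_i − {v}. We call X_{i+1} a forget node.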
If |X 1 | = 1, we also call X 1 an introduce node. The following proposition is well known. We give the proof for later reference.
Proposition 1 (Folklore) Each graph G = (V , E) with pathwidth k has a nice path decomposition of width k with 2n bags, with |X 1 | = 1, and X r = ∅.
Proof Suppose we have a path decomposition (X_1, X_2, ..., X_r). We can turn it into a nice path decomposition as follows. First, remove all bags that are empty. If for some i, 1 ≤ i < r, X_{i+1} is not an introduce or forget node, then we insert new bags between X_i and X_{i+1}: first forget nodes, one for each vertex in X_i − X_{i+1}, and then introduce nodes, one for each vertex in X_{i+1} − X_i. Similarly, we add introduce nodes before X_1 when |X_1| ≠ 1, and add forget nodes at the end of the procedure until X_r = ∅. We have one introduce and one forget node per vertex, so we have 2n bags.
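The construction in this proof can be sketched in a few lines of Python. This is an illustrative sketch only: the name `make_nice` is hypothetical, bags are assumed to be Python sets of mutually comparable vertices (so that `sorted` gives a deterministic order), and the input is assumed to be a valid path decomposition.

```python
def make_nice(bags):
    """Turn a path decomposition into a nice one with 2n bags.

    Between consecutive bags we first forget the vertices that
    disappear, then introduce the new ones, one vertex per bag.
    The sequence starts with a bag of size 1 and ends with the
    empty bag.
    """
    bags = [set(b) for b in bags if b]      # remove empty bags
    nice = []
    current = set()
    for target in bags + [set()]:           # final target: empty bag
        for v in sorted(current - target):  # forget nodes first
            current = current - {v}
            nice.append(set(current))
        for v in sorted(target - current):  # then introduce nodes
            current = current | {v}
            nice.append(set(current))
    return nice
```

Forgetting before introducing keeps every intermediate bag inside either the old or the new bag, so the width does not increase.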

Proposition 2
There are at most (k + 2^k + 1)^{2n−1} unlabelled graphs with n vertices and pathwidth at most k that are pairwise non-isomorphic.
Proof Consider a nice path decomposition of a graph with n vertices, with |X_1| = 1, and with 2n bags. For each of the bags X_i, i > 1, there are at most k + 2^k + 1 possibilities: we can have a forget node, where we have the choice which of the at most k + 1 vertices in X_{i−1} we forget, or we can have an introduce node, where we have the choice to which of the at most k other vertices in X_i the introduced vertex has an edge, i.e., at most 2^k choices for an introduce node. If we have two graphs with two path decompositions that we can construct while always making the same choices, then these graphs are isomorphic.

Proposition 3 Let G = (V, E) be a graph with a proper vertex colouring c. The following statements are equivalent:
(i) G has a properly coloured interval completion, using colouring c.
(ii) G has a path decomposition in which all vertices in each bag have different colors.
(iii) G has a nice path decomposition in which all vertices in each bag have different colors.
This proposition is well known. We sketch some of the proofs, as they provide some intuition for several of our later notions and results. Suppose H = (V , F ) is a properly coloured interval completion of G. Consider the interval model of H . We can assume that all endpoints of intervals are different in this model. Use a scanline that moves in this model from left to right. While moving the scanline, we build a nice path decomposition with the required property. We start with an empty bag. Each time the scanline meets a left or right endpoint, we add a new node to the path decomposition. If we meet a right endpoint of an interval representing a vertex v, we have a forget node, where we forget v. If we meet a left endpoint of an interval representing a vertex v, we have an introduce node, where we introduce the vertex v. In this way, the vertices in a bag are exactly the vertices represented by the intervals that intersect the scanline. One easily verifies that we have obtained a nice path decomposition of H , and, as G is a subgraph of H , this is also a nice path decomposition of G. As H is properly coloured, vertices in the same bag are differently coloured.
For the reverse direction, suppose that we have a (nice) path decomposition (X_1, X_2, ..., X_r) as in Proposition 3 (ii) or (iii). One obtains the corresponding interval graph by making each X_i a clique. The corresponding interval model is obtained by taking for a vertex v the interval [min{i | v ∈ X_i}, max{i | v ∈ X_i}]. As all colors in a bag X_i are different, the width of the path decomposition is bounded by k − 1.
Note that it follows directly from Proposition 3 that a graph with a properly coloured interval completion using k colors has pathwidth at most k − 1.
Proposition 3 motivates the definition of a properly coloured path decomposition: a path decomposition (X_1, X_2, ..., X_r) of a graph G with vertex colouring c is properly coloured, if for each bag X_i, all vertices in X_i have different colors.
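The step from a path decomposition to an interval model can be illustrated as follows. This is a sketch under our own naming (`interval_model`, `interval_graph_edges`), with bags numbered from 1 as in the text and vertices assumed to be mutually comparable.

```python
def interval_model(bags):
    """Map each vertex v to (first bag index, last bag index) containing v."""
    first, last = {}, {}
    for i, bag in enumerate(bags, start=1):
        for v in bag:
            first.setdefault(v, i)  # remember the first occurrence only
            last[v] = i             # overwritten until the last occurrence
    return {v: (first[v], last[v]) for v in first}


def interval_graph_edges(intervals):
    """Edges of the interval graph: pairs whose closed intervals intersect."""
    vs = sorted(intervals)
    return {
        (u, w)
        for a, u in enumerate(vs)
        for w in vs[a + 1:]
        if intervals[u][0] <= intervals[w][1]
        and intervals[w][0] <= intervals[u][1]
    }
```

Two intervals built this way intersect exactly when the two vertices share a bag, so the resulting graph is the one obtained by turning every bag into a clique.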

Partial Path Decompositions
In this section, we introduce a number of notions that will be used for our dynamic programming algorithm in the next section.
A partial path decomposition of a graph G = (V , E) is a sequence of subsets of V (X 1 , X 2 , . . . , X s ) such that: The following proposition follows from well known facts about path and tree decompositions. For completeness we give the proof.
There is a path from v to w with all vertices in W , i.e., it avoids X s . This path contains two neighbouring vertices x ∈ A and y ∈ B. There is a bag X i with {x, y} ⊆ X i . If i ≤ s, then y ∈ X s by the definition of path decomposition; if i > s, then x ∈ X s by definition of path decomposition. In both cases we have a contradiction.
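Consider a partial path decomposition (X_1, X_2, ..., X_r) and a vertex set X. Later, X will typically be the set X_r for some partial path decomposition (X_1, X_2, ..., X_r). A component of X is a vertex set that forms a connected component of the graph G[V − X]. Two components W_1 and W_2 of X are isomorphic, if there is a graph isomorphism f from G[X ∪ W_1] to G[X ∪ W_2] that preserves colors and is the identity when restricted to X, i.e., f : X ∪ W_1 → X ∪ W_2 is a bijective function, such that the following conditions hold:
• For all v, w ∈ X ∪ W_1: {v, w} ∈ E if and only if {f(v), f(w)} ∈ E.
• For all v ∈ X ∪ W_1: c(f(v)) = c(v).
• For all x ∈ X: f(x) = x.
A component W of X_r is a left component of a partial path decomposition (X_1, X_2, ..., X_r), if W ⊆ ⋃_{1≤i<r} X_i, and a right component, if W ∩ ⋃_{1≤i≤r} X_i = ∅. The following proposition follows directly from the definition of partial path decomposition, see also Proposition 4.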

Proposition 5
Let (X 1 , X 2 , . . . , X r ) be a partial path decomposition of G. Each component of X r is either a left or a right component of (X 1 , X 2 , . . . , X r ).
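We say that a partial path decomposition (
We define an equivalence relation on partial path decompositions as follows. We say that the partial path decomposition (X_1, X_2, ..., X_r) is equivalent to the partial path decomposition (Y_1, Y_2, ..., Y_s), if the following two conditions hold:
• X_r = Y_s.
• There is a bijective function g : {1, ..., q} → {1, ..., q}, where W_1, ..., W_q are the components of X_r, such that for each i, 1 ≤ i ≤ q: W_i is a left component of (X_1, X_2, ..., X_r) if and only if W_{g(i)} is a left component of (Y_1, Y_2, ..., Y_s), and W_i and W_{g(i)} are isomorphic components of X_r.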
The main insight behind our dynamic programming algorithm is the following result.

Proposition 6
If (X 1 , X 2 , . . . , X r ) and (Y 1 , Y 2 , . . . , Y s ) are equivalent coloured partial path decompositions, then (X 1 , X 2 , . . . , X r ) has an extension that is a properly coloured path decomposition of G, if and only if (Y 1 , Y 2 , . . . , Y s ) has an extension that is a properly coloured path decomposition of G.
Proof Suppose (X 1 , X 2 , . . . , X r , Z 1 , Z 2 , . . . , Z r ) is a properly coloured path decomposition of G that is an extension of (X 1 , X 2 , . . . , X r ). Suppose that W 1 , . . . , W q are the components of X r . Let g be the bijective function as in the definition of equivalence. Let f i be a color preserving graph isomorphism from G[X r ∪W i ] to G[X r ∪W g(i) ] that is the identity on X r , as implied by the definition of equivalence.
Let f : V → V be the function defined in the following way: f(v) = v if v ∈ X_r, and f(v) = f_i(v) if v ∈ W_i for some i, 1 ≤ i ≤ q.

Claim 7 f is a color preserving automorphism of G.
Proof f is a color preserving bijection: if v ∈ X r , then v does not belong to a as f i is an isomorphism, there is only one such w; and as f i is color preserving, the color of w equals the color of v.
Consider an edge {v, w} ∈ E. There must be an i, with v, w ∈ W_i ∪ X_r. Now, {f(v), f(w)} = {f_i(v), f_i(w)} ∈ E, as f_i is a graph isomorphism. Similarly, a pair of non-adjacent vertices is mapped to a pair of non-adjacent vertices.
First, we show that every edge {v, w} ∈ E is contained in some bag of (Y_1, ..., Y_s, f(Z_1), ..., f(Z_r)), where f(Z_i) = {f(u) | u ∈ Z_i}.
Second, it now directly follows that ⋃_{1≤i≤s} Y_i ∪ ⋃_{1≤i≤r} f(Z_i) = V: as G is connected, each vertex is an endpoint of an edge.
Third, we show that every v ∈ V only occurs in a series of consecutive bags. For a vertex v ∈ Y_s = X_r, we note that there are 1 ≤ α ≤ s, 0 ≤ β ≤ r, such that v belongs to bags Y_α, Y_{α+1}, ..., Y_s, and v belongs to bags Z_1, Z_2, ..., Z_β, and no other bags. As f(v) = v, v also belongs to the images f(Z_1), f(Z_2), ..., f(Z_β), where f(Z_i) = {f(u) | u ∈ Z_i}, and to no later bags. So, for a vertex v ∈ Y_s = X_r, we are done.
Suppose v ∈ W_{g(i)}, where W_{g(i)} is a left component of (Y_1, Y_2, ..., Y_s). Then, f^{−1}(v) ∈ W_i with W_i a left component of (X_1, X_2, ..., X_r). Thus, f^{−1}(v) belongs to one or more consecutive bags in (X_1, X_2, ..., X_{r−1}), and, as f^{−1}(v) does not belong to X_r, f^{−1}(v) does not belong to Z_1, Z_2, ..., Z_r, because otherwise (X_1, X_2, ..., X_r, Z_1, Z_2, ..., Z_r) is not a path decomposition. So, v belongs to one or more consecutive bags in (Y_1, Y_2, ..., Y_{s−1}) and no others. And, if v ∈ W_{g(i)} where W_{g(i)} is a right component of (Y_1, Y_2, ..., Y_s), then the required result follows from a similar analysis.
Finally, by assumption all vertices in a bag Y_i have a different color, and, as f is color preserving and all vertices in a bag Z_i have a different color, also all vertices in its image f(Z_i) have a different color. We thus have shown Claim 8.
So, (Y 1 , Y 2 , . . . , Y s ) has an extension that is a properly coloured path decomposition of G. This shows one direction of implication of the proposition; the proof of the other direction is identical. Thus, Proposition 6 holds.
We assume some ordering on the vertices. The characteristic of a partial path decomposition (X_1, X_2, ..., X_r) is the following pair: (X_r, ⋃_{1≤i<r} X_i − X_r), where we assume that both vertex sets are given as an ordered list of vertices.
Two properly coloured partial path decompositions with the same characteristic are trivially equivalent, using the identity for g. The algorithm in Section 5 basically tabulates all different characteristics of properly coloured partial path decompositions, and thus gives an O^*(2^n) algorithm for INTERVALIZING COLOURED GRAPHS; this is somewhat similar to the Held-Karp algorithm for TSP [21]. In case the number of colors is bounded, we can obtain a faster algorithm by applying an isomorphism check for components; this is the main ingredient of the faster algorithm described in the next section.
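As an illustration, the characteristic (the last bag, together with the set of vertices that occur in earlier bags but not in the last one, as also stated in Section 4) can be computed directly from the bags. The function name `characteristic` and the use of sorted tuples as "ordered lists" are our own choices for this sketch.

```python
def characteristic(bags):
    """Return (last bag, vertices of earlier bags not in the last bag).

    Both components are returned as sorted tuples, which plays the
    role of the fixed vertex ordering assumed in the text.
    """
    last = set(bags[-1])
    earlier = set().union(*bags[:-1]) - last
    return (tuple(sorted(last)), tuple(sorted(earlier)))
```

For example, the partial decompositions ({1}, {1, 2}, {2}) and ({1, 2}, {2}) both have characteristic ((2,), (1,)), which is why a table indexed by characteristics can merge them.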

An Exact Algorithm for Intervalizing k-Coloured Graphs
In this section, we give the algorithm for INTERVALIZING k-COLOURED GRAPHS, building upon the notions and preliminary results of the previous sections.
First, we note that a positive instance has a path decomposition in which each bag has size at most k (all vertices in a bag have a different color and there are k colors). Thus, as a first step we use the linear time algorithm (for fixed k) from [4,10] that tests if the pathwidth of the input graph is at most k − 1. If not, we are done, and can decide negatively. Thus, we can assume that G has pathwidth at most k − 1 in the remainder. We consider k to be a constant.
We introduce some further notions. We define the progress of a partial path decomposition (X_1, X_2, ..., X_r) to be 2 · |⋃_{1≤i≤r} X_i| − |X_r|. Note that when we extend a nice partial path decomposition with one additional introduce or one additional forget node, then the progress always increases by exactly one. As the characteristic of (X_1, ..., X_r) is (X_r, ⋃_{1≤i<r} X_i − X_r), it follows that if a partial path decomposition has characteristic (X, Z), its progress equals 2(|X| + |Z|) − |X| = 2|Z| + |X|.
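The definition of progress, twice the number of vertices seen so far minus the size of the last bag, is easy to evaluate on concrete bags. The sketch below uses a hypothetical function name `progress` and Python sets for bags.

```python
def progress(bags):
    """Progress of a partial path decomposition given as a list of bags."""
    seen = set().union(*bags)            # all vertices in any bag so far
    return 2 * len(seen) - len(bags[-1])
```

On the nice decomposition ({1}, {1, 2}, {2}, {2, 3}, {3}, ∅) of a path on three vertices, the prefixes have progress 1, 2, ..., 6: an introduce node adds 2 − 1 and a forget node adds 0 + 1.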
The canonical characteristic of a properly coloured partial path decomposition is the lexicographically minimal characteristic over all characteristics of equivalent properly coloured partial path decompositions.

Proposition 9 Given a characteristic of a properly coloured partial path decomposition, we can compute in polynomial time its canonical characteristic.
Proof The GRAPH ISOMORPHISM problem is polynomial time solvable on graphs of bounded treewidth, and thus also on graphs of bounded pathwidth [3,15]. It is straightforward to modify the algorithms of [3] or [15] such that it also works on coloured graphs while using the same running time. (More precisely, the very recent result of Fomin et al. [15] shows that GRAPH ISOMORPHISM is fixed parameter tractable, with the treewidth as parameter; this improvement is however suppressed by other factors in the running time of our algorithm.) Given a characteristic (X r , Z), we first compute (with depth first search) the connected components of G[V − X r ], say W 1 , W 2 , . . . , W q . For each pair W i , W j , we can check in polynomial time if they are isomorphic: use the isomorphism algorithm on coloured graphs of bounded pathwidth discussed above, and take a new, different color for each vertex in X r . (Note the definition of isomorphism for components, as given in Section 3).
Thus, we can partition the components in equivalence classes dictated by isomorphism. We can sort each component lexicographically, and then each class lexicographically. Then, for each class, we determine how many components from the class are a subset of Z (i.e., left components). In the canonical characteristic, we take the same number of left components from the class, but now take this number of lexicographically smallest elements. A simple last sorting step gives the desired result.
We can now describe our algorithm.
• Check if the pathwidth of G is at most k − 1. If not, answer no and terminate.
• Otherwise, for α = 1, ..., 2n, compute a table T_α of all canonical characteristics of partial path decompositions of progress α.
• If T_{2n} is empty, then answer no; otherwise, answer yes.
The output of the algorithm clearly is correct as a partial path decomposition is a path decomposition, if and only if, its progress equals 2n.
We now describe how the tables T_i are computed. Computing T_1 is simple: for all v ∈ V, we have an entry in T_1 of the form ({v}, ∅). Given a table T_α, 1 ≤ α < 2n, we compute table T_{α+1} as follows. Initialize T_{α+1} as the empty set. For each entry (X, Z) from T_α, do the following:
• Compute the new characteristics that result when the next node in the partial path decomposition is an introduce node: for each v ∈ V − Z − X such that there is no x ∈ X with c(v) = c(x), compute the canonical characteristic of (X ∪ {v}, Z) and put it in T_{α+1}.
• Compute the new characteristics that result when the next node in the partial path decomposition is a forget node: for each x ∈ X such that there is no v ∈ V − Z − X with {v, x} ∈ E, compute the canonical characteristic of (X − {x}, Z ∪ {x}) and put it in T_{α+1}.
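The table computation can be sketched in Python as follows. This sketch deliberately omits the canonicalization step (the isomorphism test on components) and the initial pathwidth check, so it stores plain characteristics and corresponds to the simpler tabulation, not to the subexponential algorithm. The function name, the `colour` dictionary, and the frozenset representation are assumptions made for illustration; the graph is assumed to be non-empty.

```python
def has_properly_coloured_path_decomposition(vertices, edges, colour):
    """Decide by tabulating characteristics (X, Z), one table per progress value.

    Simplification: no canonical characteristics are computed.
    """
    vertices = frozenset(vertices)
    n = len(vertices)
    nbrs = {v: set() for v in vertices}
    for v, w in edges:
        nbrs[v].add(w)
        nbrs[w].add(v)
    # T_1: a single introduced vertex, nothing forgotten yet.
    table = {(frozenset([v]), frozenset()) for v in vertices}
    for _ in range(1, 2 * n):            # build T_2, ..., T_{2n}
        nxt = set()
        for X, Z in table:
            used = {colour[x] for x in X}
            # Introduce node: a new vertex whose colour is not in the bag.
            for v in vertices - Z - X:
                if colour[v] not in used:
                    nxt.add((X | {v}, Z))
            # Forget node: x has no neighbour that is still unseen.
            for x in X:
                if nbrs[x] <= (X | Z):
                    nxt.add((X - {x}, Z | {x}))
        table = nxt
        if not table:
            return False
    return bool(table)                   # T_{2n} non-empty means "yes"
```

For instance, a cycle on four vertices coloured alternately with two colours is rejected, since bags of size at most two would force pathwidth one, while the same cycle with four distinct colours is accepted.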
Proof Note that the characteristic of a partial path decomposition remains the same when we apply the procedure of Proposition 1. So, we may assume that we compute the canonical characteristics of the properly coloured nice partial path decompositions (X 1 , X 2 , . . . , X r ) with progress α + 1. Of these, we consider two cases: the last node X r can be an introduce node or a forget node. If X r is an introduce node with X r = X r−1 ∪ {v}, then (X 1 , X 2 , . . . , X r−1 ) is a properly coloured partial path decomposition of progress α. If (X 1 , X 2 , . . . , X r−1 ) has characteristic (X r−1 , Z), then (X 1 , X 2 , . . . , X r ) has characteristic (X r−1 ∪ {v}, Z). v must have a color different from the colors of vertices in X r−1 .
If X_r is a forget node with X_r = X_{r−1} − {v}, then again (X_1, X_2, ..., X_{r−1}) is a properly coloured partial path decomposition of progress α. As v is forgotten, it cannot belong to bags right of X_r, and thus all neighbours of v must belong to ⋃_{1≤i≤r} X_i. If (X_1, X_2, ..., X_{r−1}) has characteristic (X_{r−1}, Z), then the characteristic of (X_1, X_2, ..., X_r) is (X_{r−1} − {v}, Z ∪ {v}).
This completes the description of the algorithm. From our discussion, we see that the algorithm indeed correctly decides if G has a properly coloured interval completion.
We now will analyse the running time of the algorithm. We remark that our algorithm uses polynomial time per entry in a table T i . Thus, the running time of the algorithm equals the product of a polynomial in n and the number of canonical characteristics of properly coloured partial path decompositions. So, we need to establish an upper bound on this number of canonical characteristics. First, we obtain an upper bound on the number of non-isomorphic components of a set X.

Proposition 11
Let (X_1, X_2, ..., X_r) be a properly coloured partial path decomposition of G. There are at most (k · 2^{3k})^ℓ equivalence classes of the isomorphism relation on components of G[V − X_r] that contain components with ℓ vertices.
Proof Each equivalence class can be identified by an uncoloured unlabelled graph on ℓ vertices of pathwidth at most k − 1, a colouring with at most k colors of the vertices of the graph, and the incidence relation between the vertices in the graph and the vertices in X_r. This gives at most the following number of equivalence classes: (k − 1 + 2^{k−1} + 1)^{2ℓ−1} · k^ℓ · 2^{kℓ} ≤ (k · 2^{3k})^ℓ, because the first gives at most (k − 1 + 2^{k−1} + 1)^{2ℓ−1} possibilities by Proposition 2, the second at most k^ℓ possibilities, and the last at most 2^{kℓ} possibilities.
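The product of the three counts can be bounded in one short chain. The derivation below is our own worked step, writing ℓ for the number of vertices of the component and using only that k ≤ 2^{k−1} for every integer k ≥ 1.

```latex
\begin{align*}
(k - 1 + 2^{k-1} + 1)^{2\ell - 1} \cdot k^{\ell} \cdot 2^{k\ell}
  &\le (2^{k})^{2\ell} \cdot k^{\ell} \cdot 2^{k\ell}
  && \text{since } k \le 2^{k-1} \text{ gives } k + 2^{k-1} \le 2^{k} \\
  &= k^{\ell} \cdot 2^{3k\ell} \\
  &= (k \cdot 2^{3k})^{\ell}.
\end{align*}
```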
To show that the size of a table
• A component is infrequent-small, if it is not large and it has less than √n isomorphic components of G[V − X_r].
There are less than (k · 2^{3k})^{c_k log n} = 2^{(log k + 3k) · c_k · log n} = 2^{(log n)/2} = √n equivalence classes of the isomorphism relation that contain small components, by Proposition 11. For each class with infrequent-small components, we have less than √n components in the class, and thus at most √n possibilities regarding the number of left components in the class. This gives a total of at most (√n)^{√n} = 2^{(√n log n)/2} = 2^{O(n/ log n)} possibilities for infrequent-small components.
For a fixed set X, the number of characteristics of the form (X, Z) is obtained by multiplying the number of possibilities for large, frequent-small, and infrequent-small components. As each case is bounded by 2^{O(n/ log n)}, we obtain a bound of 2^{O(n/ log n)}. To obtain an upper bound on the total size of a table, we note that X is a subset of at most k vertices, and as (n + 1)^k = 2^{O(n/ log n)}, we have that each table T_i has a size bounded by 2^{O(n/ log n)}. As the running time of the algorithm is bounded by a polynomial times the size of these tables, we obtain our main result.

Theorem 12 For every fixed k ≥ 4, there is an algorithm for INTERVALIZING k-COLOURED GRAPHS that uses 2^{O(n/ log n)} time.
We remark that there are inputs on which the algorithm uses 2^{Ω(n/ log n)} time: suppose G has a vertex v that is a separator such that G[V − {v}] has Θ(n/ log n) non-isomorphic components each of size Θ(log n).

An Algorithm for Intervalizing Coloured Graphs with an Arbitrary Number of Colors
In this section, we consider the case that the number of colors is not fixed. We give a simple Held-Karp style dynamic programming algorithm for this problem.
Suppose we are given a properly coloured graph G = (V, E). For a given set of vertices W ⊆ V, the border of W is the set of vertices in W with at least one neighbour in V − W, i.e., the set {v ∈ W | ∃ w ∈ V − W : {v, w} ∈ E}.
It is easy to see that the amount of work per fine set of vertices is polynomial. Finally, G has a properly coloured interval completion, if and only if F(n) ≠ ∅. Thus, we have
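The border of a vertex set, as defined at the start of this section, can be computed directly from adjacency lists. In this small sketch the name `border` and the `adjacency` dictionary (vertex to set of neighbours) are assumptions for illustration.

```python
def border(W, adjacency):
    """Vertices of W that have at least one neighbour outside W."""
    W = set(W)
    return {v for v in W if any(w not in W for w in adjacency[v])}
```

On the path a, b, c, the border of {a, b} is {b}: only b has a neighbour (namely c) outside the set.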

Theorem 14
The INTERVALIZING COLOURED GRAPHS problem can be solved in O^*(2^n) time.

Conclusions
In this paper, we gave a dynamic programming algorithm for the INTERVALIZING k-COLOURED GRAPHS problem for a fixed number of colors k. It uses subexponential time of a somewhat unusual form, and thus, the result forms a somewhat curious exception to the types of results that are usually obtained in the field. The result is merely of theoretical interest, as values of n for which the algorithm can be run in practice can be expected to be rather small, say below 100. Experiments with a somewhat similar Held-Karp style algorithm for TREEWIDTH [9] suggest that our algorithm can also be practical for small values of n. In particular, it would be interesting to test a variant of the algorithm where the isomorphism test is applied heuristically, i.e., only on components that are very small, and with components, and with a usual graph isomorphism heuristic instead of the (complicated and probably only theoretically interesting) algorithms from [3,15].
The result with an arbitrary number of colors from Section 5 follows more standard arguments; the algorithm is similar to Held-Karp style algorithms for several layout problems, see [8].
A minor generalization of the result can be obtained when we consider a small (o(log n)) number of colors; in such cases the algorithm also uses subexponential time.
A generalization of the INTERVALIZING k-COLOURED GRAPHS problem is the INTERVAL GRAPH SANDWICH problem, in which we are given two graphs G and H with the same vertex set, and ask whether there exists an interval graph G′ that is a subgraph of H and contains G as a subgraph. A well studied variant has the additional condition that G′ has maximum clique size k. See e.g., [19,22]. The ideas of our paper, however, seem not to give results better than an algorithm that uses O^*(2^n) time for this problem, still assuming that k is fixed.
Other related problems are the version where we ask to find a properly coloured proper interval graph, which is polynomial for a fixed number of colors k [2], and the problem to find a properly coloured chordal graph, which is also polynomial for a fixed number of colors [24].
Very recently, Bodlaender and Nederlof [11] obtained a lower bound for the problem: assuming the Exponential Time Hypothesis, any algorithm for INTERVALIZING k-COLOURED GRAPHS uses at least 2^{Ω(n/ log n)} time, for each fixed number of colours at least six.