Cutwidth: obstructions and algorithmic aspects

Cutwidth is one of the classic layout parameters for graphs. It measures how well one can order the vertices of a graph in a linear manner, so that the maximum number of edges between any prefix and its complement suffix is minimized. As graphs of cutwidth at most $k$ are closed under taking immersions, the results of Robertson and Seymour imply that there is a finite list of minimal immersion obstructions for admitting a cut layout of width at most $k$. We prove that every minimal immersion obstruction for cutwidth at most $k$ has size at most $2^{O(k^3\log k)}$. For our proof, we introduce the concept of a lean ordering that can be seen as the analogue of lean decompositions defined by Thomas in [A Menger-like property of tree-width: The finite case, J. Comb. Theory, Ser. B, 48(1):67--76, 1990] for the case of treewidth. As an interesting algorithmic byproduct, we design a new fixed-parameter algorithm for computing the cutwidth of a graph that runs in time $2^{O(k^2\log k)}\cdot n$, where $k$ is the optimum width and $n$ is the number of vertices. While being slower by a $\log k$-factor in the exponent than the fastest known algorithm, given by Thilikos, Bodlaender, and Serna in [Cutwidth I: A linear time fixed parameter algorithm, J. Algorithms, 56(1):1--24, 2005] and [Cutwidth II: Algorithms for partial $w$-trees of bounded degree, J. Algorithms, 56(1):25--49, 2005], our algorithm has the advantage of being simpler and self-contained; arguably, it explains better the combinatorics of optimum-width layouts.


Introduction
The cutwidth of a graph is defined as the minimum possible width of a linear ordering of its vertices, where the width of an ordering σ is the maximum, among all the prefixes of σ, of the number of edges that have exactly one vertex in a prefix. Due to its natural definition, cutwidth has various applications in a range of practical fields of computer science: whenever data is expected to be roughly linearly ordered and dependencies or connections are local, one can expect the cutwidth of the corresponding graph to be small. These applications include circuit design, graph drawing, bioinformatics, and text information retrieval; we refer to the survey of layout parameters of Díaz, Petit, and Serna [6] for a broader discussion.
As finding a layout of optimum width is NP-hard [8], the algorithmic and combinatorial aspects of cutwidth were intensively studied. There is a broad range of polynomial-time algorithms for special graph classes [11,12,23], approximation algorithms [15], and fixed-parameter algorithms [19,20]. In particular, Thilikos, Bodlaender, and Serna [19,20] proposed a fixed-parameter algorithm for computing the cutwidth of a graph that runs¹ in time $2^{O(k^2)} \cdot n$, where k is the optimum width and n is the number of vertices. Their approach is to first compute the pathwidth of the input graph, which is never larger than the cutwidth. Then, the optimum layout can be constructed by an elaborate dynamic programming procedure on the obtained path decomposition. To upper bound the number of relevant states, the authors had to understand how an optimum layout can look in a given path decomposition. For this, they borrow the technique of typical sequences of Bodlaender and Kloks [3], which was introduced for a similar reason, but for pathwidth and treewidth instead of cutwidth.
Since the class of graphs of cutwidth at most k is closed under immersions, and the immersion order is a well-quasi ordering of graphs² [16], it follows that for each k there exists a finite obstruction set $\mathcal{L}_k$ of graphs such that a graph has cutwidth at most k if and only if it does not admit any graph from $\mathcal{L}_k$ as an immersion. However, this existential result does not give any hint on how to generate, or at least estimate the sizes of, the obstructions. The sizes of obstructions are important for efficient treatment of graphs of small cutwidth; this applies also in practice, as indicated by Booth et al. [4] in the context of VLSI design.
The estimation of sizes of minimal obstructions for graph parameters like pathwidth, treewidth, or cutwidth has been studied before. For the minor-closed parameters pathwidth and treewidth, Lagergren [14] showed that any minimal minor obstruction to admitting a path decomposition of width k has size at most single-exponential in $O(k^4)$, whereas for tree decompositions he showed an upper bound double-exponential in $O(k^5)$. Less is known about immersion-closed parameters, like cutwidth. Govindan and Ramachandramurthi [10] showed that the number of minimal immersion obstructions for the class of graphs of cutwidth at most k is at least $3^{k-7} + 1$, and their construction actually exemplifies minimal obstructions for cutwidth at most k with $(3^{k-5} - 1)/2$ vertices. To the best of our knowledge, nothing was known about upper bounds for the cutwidth case.

¹ Thilikos, Bodlaender, and Serna [19,20] do not specify the parametric dependence of the running time of their algorithm. A careful analysis of their algorithm yields the above claimed running time bound.
² All graphs considered in this paper may have parallel edges, but no loops.

Results on obstructions.
Our main result concerns the sizes of obstructions for cutwidth.
Theorem 1. Suppose a graph G has cutwidth larger than k, but every graph with fewer vertices or edges (strongly) immersed in G has cutwidth at most k. Then G has at most $2^{O(k^3 \log k)}$ vertices and edges.
The above result immediately gives the same upper bound on the sizes of graphs from the minimal obstruction sets $\mathcal{L}_k$, as they satisfy the prerequisites of Theorem 1. This somewhat matches the $(3^{k-5} - 1)/2$ lower bound of Govindan and Ramachandramurthi [10]. Our approach for Theorem 1 follows the technique used by Lagergren [14] to prove that minimal minor obstructions for pathwidth at most k have sizes single-exponential in $O(k^4)$. Intuitively, the idea of Lagergren is to take an optimum decomposition for a minimal obstruction, which must have width k + 1, and to assign to each prefix of the decomposition one of finitely many "types", so that two prefixes with the same type "behave" in the same manner. If there were two prefixes, one being shorter than the other, with the same type, then one could replace one with the other, thus obtaining a smaller obstruction. Hence, the upper bound on the number of types, being double-exponential in $O(k^4)$, gives some upper bound on the size of a minimal obstruction. This upper bound can be further improved to single-exponential by observing that types are ordered by a natural domination relation, and the shorter a prefix is, the weaker is its type. An important detail is that one needs to make sure that the replacement can be modeled by minor operations. For this, Lagergren uses the notion of linked path decompositions, also known as lean path decompositions; cf. [21].
To prove Theorem 1, we perform a similar analysis of prefixes of an optimum ordering of a minimal obstruction. We show that prefixes can be categorized into a bounded number of types, each comprising prefixes that have the same "behavior". Provided two prefixes with equally strong type appear one after the other, we can "unpump" the part of the graph in their difference. To make sure that unpumping is modeled by taking an immersion, we introduce lean orderings for cutwidth and prove the analogue of the result of Thomas [21] for treewidth: there is always an optimum-width ordering that is lean (see also [1]).
The proof of the upper bound on the number of types essentially boils down to the following setting. We are given a graph G and a subset X of vertices, such that at most ℓ edges have exactly one endpoint in X. The question is what X can look like in an optimum-width ordering of G. We prove that there is always an ordering where X is split into at most O(kℓ) blocks, where k is the optimum width. This allows us to store the relevant information on the whole X in one of a bounded number of types (called bucket interfaces). The swapping argument used in this proof holds the essence of the typical sequences technique of Bodlaender and Kloks [3], while being, in our opinion, more natural and easier to understand.
As an interesting byproduct, we can also use our understanding to treat the problem of removing edges to get a graph of small cutwidth. More precisely, for parameters w, k, we consider the class of all graphs G such that w edges can be removed from G to obtain a graph of cutwidth at most k. We prove that for every constant k, the minimal (strong) immersion obstructions for this class have sizes bounded linearly in w. Moreover, we give an exponential lower bound on the number of these obstructions. These results are presented in Section 6.

Algorithmic results.
Consider the following "compression" problem: given a graph G and its ordering σ of width ℓ, we would like to construct, if possible, a new ordering of the vertices of G of width at most k, where k < ℓ. Then the types defined above essentially match states that would be associated with prefixes of σ in a dynamic programming algorithm solving this problem. Alternatively, one can think of building an automaton that traverses the ordering σ while constructing an ordering of G of width at most k. Hence, our upper bound on the number of types can be directly used to limit the state space in such a dynamic programming procedure/automaton, yielding an FPT algorithm for the above problem.
With this result in hand, it is not hard to design an exact FPT algorithm for cutwidth. One could introduce vertices one by one to the graph, while maintaining an ordering of optimum width. Each time a new vertex is introduced, we put it anywhere into the ordering, and it can be argued that the new ordering has width at most three times larger than the optimum. Then, the dynamic programming algorithm sketched above can be used to "compress" this approximate ordering to an optimum one in linear FPT time.
The above approach yields a quadratic algorithm. To match the optimum, linear running time, we use a trick similar to the one used by Bodlaender in his linear-time algorithm for computing the treewidth of a graph [2]. Namely, we show that instead of processing vertices one by one, we can proceed recursively by removing a significant fraction of all the edges at each step, so that their reintroduction increases the width at most twice. We then run the compression algorithm on the obtained 2-approximate ordering to get an optimum one. The main point is that, since we remove a large portion of the graph at each step, the recursive equation on the running time solves to a linear function, instead of quadratic. This gives the following.

Theorem 2. There exists an algorithm that, given an n-vertex graph G and an integer k, runs in time $2^{O(k^2 \log k)} \cdot n$ and either correctly concludes that the cutwidth of G is larger than k, or outputs an ordering of G of width at most k.
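The gap between the quadratic and the linear bound can be seen from the two calculations below. This is only a sketch under simplifying assumptions that are introduced here for illustration: $f(k)$ stands for the $2^{O(k^2 \log k)}$ factor of a single compression call, $m$ measures the size of the current graph, and $c \in (0,1]$ is a hypothetical constant fraction of the graph removed in each recursive step.

```latex
% One-by-one insertion: the i-th compression call works on a graph of size i.
\[
  \sum_{i=1}^{n} f(k)\cdot i \;=\; f(k)\cdot\frac{n(n+1)}{2} \;=\; O\bigl(f(k)\cdot n^{2}\bigr).
\]
% Removing a c-fraction per step: the sizes shrink geometrically.
\[
  T(m) \;\le\; T\bigl((1-c)\,m\bigr) + f(k)\cdot m
  \quad\Longrightarrow\quad
  T(m) \;\le\; f(k)\cdot m \sum_{i\ge 0} (1-c)^{i} \;=\; \frac{f(k)}{c}\cdot m .
\]
```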
The algorithm of Theorem 2 has running time slightly larger than that of Thilikos, Bodlaender, and Serna [19,20]. The difference is the log k factor in the exponent, the reason for which is that we use a simpler bucketing approach to bound the number of states, instead of the more entangled, but finer, machinery of typical sequences. We believe the main strength of our approach lies in its explanatory character. Instead of relying on algorithms for computing tree or path decompositions, which are already difficult by themselves, and then designing a dynamic programming algorithm on a path decomposition, we directly approach cutwidth "via cutwidth", and not "via pathwidth". That is, the dynamic programming procedure for computing the optimum cutwidth ordering on an approximate cutwidth ordering is technically far simpler and conceptually more insightful than performing the same on a general path decomposition. We also show that the "reduction-by-a-large-fraction" trick of Bodlaender [2] can be performed also in the cutwidth setting, yielding a self-contained, natural, and understandable algorithm.
Graphs. All graphs considered in this paper are undirected, without loops, and may have multiple edges. The vertex and edge sets of a graph G are denoted by V (G) and E(G), respectively.
Cutwidth. Let G be a graph and σ an ordering of V(G). For u, v ∈ V(G), we write $u <_\sigma v$ if u appears before v in σ. Given two disjoint sequences $\sigma_1 = x_1, \dots, x_{r_1}$ and $\sigma_2 = y_1, \dots, y_{r_2}$ of vertices in V(G), we define their concatenation as $\sigma_1 \bullet \sigma_2 = x_1, \dots, x_{r_1}, y_1, \dots, y_{r_2}$. For X ⊆ V(G), let $\sigma_X$ be the ordering of X induced by σ, i.e., the ordering obtained from σ if we remove the vertices that do not belong to X.
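For a set S ⊆ V(G), let δ(S) denote the number of edges of G with exactly one endpoint in S, and for a vertex v let $V^\sigma_v$ be the prefix of σ consisting of v and all vertices that appear before v. The width of σ is $\mathrm{cw}_\sigma(G) = \max_{v \in V(G)} \delta(V^\sigma_v)$. The cutwidth of G, cw(G), is the minimum of $\mathrm{cw}_\sigma(G)$ over all possible orderings σ of V(G).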
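As a concrete reading of this definition, the following minimal sketch computes the width of a given ordering and, by exhaustive search, the cutwidth of a very small multigraph. The function names and the edge-list representation (parallel edges are repeated pairs) are choices made for this illustration only; the brute force is exponential and is not the algorithm discussed in this paper.

```python
from itertools import permutations

def ordering_width(edges, order):
    """Width of `order`: the largest number of edges crossing a prefix/suffix cut."""
    position = {v: i for i, v in enumerate(order)}
    width = 0
    for cut in range(1, len(order)):  # the prefix is order[:cut]
        crossing = sum(1 for u, v in edges
                       if (position[u] < cut) != (position[v] < cut))
        width = max(width, crossing)
    return width

def cutwidth_bruteforce(vertices, edges):
    """Minimum width over all orderings; exponential, for tiny graphs only."""
    return min(ordering_width(edges, order) for order in permutations(vertices))

path = [("a", "b"), ("b", "c")]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
print(ordering_width(path, ["a", "c", "b"]))   # 2
print(cutwidth_bruteforce("abc", path))        # 1
print(cutwidth_bruteforce("abc", triangle))    # 2
```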
Obstructions. Let ≤ be a partial order on graphs. We write $G' \lneq G$ if $G' \le G$ and G′ is not isomorphic to G. A graph class $\mathcal{G}$ is closed under ≤ if whenever $G' \le G$ and $G \in \mathcal{G}$, we also have that $G' \in \mathcal{G}$. Given a partial order ≤ and a graph class $\mathcal{G}$ closed under ≤, we define the (minimal) obstruction set of $\mathcal{G}$ w.r.t. ≤, denoted by $\mathrm{obs}_{\le}(\mathcal{G})$, as the set containing all graphs G for which the following two conditions hold:
O1: $G \notin \mathcal{G}$, i.e., G is not a member of $\mathcal{G}$, and
O2: for each G′ with $G' \lneq G$, we have that $G' \in \mathcal{G}$.
We say that a set of graphs $\mathcal{H}$ is a ≤-antichain if it does not contain any pair of comparable elements w.r.t. ≤. By definition, for any class $\mathcal{G}$ closed under ≤, the set $\mathrm{obs}_{\le}(\mathcal{G})$ is an antichain.
Immersions. Let H and G be graphs. We say that G contains H as an immersion if there is a pair of functions (φ, ψ), called an H-immersion model of G, such that φ is an injection from V(H) to V(G) and ψ maps every edge uv of H to a path of G between φ(u) and φ(v) so that different edges are mapped to edge-disjoint paths. Every vertex in the image of φ is called a branch vertex. If we additionally demand that no internal vertex of a path in ψ(E(H)) is a branch vertex, then we say that (φ, ψ) is a strong H-immersion model and H is a strong immersion of G. We denote by $H \le_i G$ ($H \le_{si} G$) the fact that H is an immersion (strong immersion) of G; these are partial orders. Clearly, for any two graphs H and G, if $H \le_{si} G$ then $H \le_i G$. This implies the following observation.

Observation 3. If a graph class $\mathcal{G}$ is closed under $\le_i$, then $\mathrm{obs}_{\le_i}(\mathcal{G}) \subseteq \mathrm{obs}_{\le_{si}}(\mathcal{G})$.

Robertson and Seymour proved in [16] that every $\le_i$-antichain is finite and conjectured the same for $\le_{si}$. It is well-known that for every k ∈ N, the class $\mathcal{C}_k$ of graphs of cutwidth at most k is closed under immersions. It follows from the results of [16] that $\mathrm{obs}_{\le_i}(\mathcal{C}_k)$ is finite; the goal of this paper is to provide good estimates on the sizes of graphs in $\mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$. As the cutwidth of a graph is the maximum cutwidth of its connected components, it follows that graphs in $\mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$ are connected. Moreover, every graph in $\mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$ has cutwidth exactly k + 1, because the removal of any of its edges decreases its cutwidth to at most k, while removing a single edge lowers the cutwidth by at most one.
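The definition of an immersion model can be turned into a direct check. The sketch below is an illustration only: the representation (edge lists indexed by position, `phi` as a dictionary, `psi` mapping the index of each edge of H to a list of edge indices of G) and the function name are assumptions of this example, not notation from the paper.

```python
def is_immersion_model(h_edges, g_edges, phi, psi, strong=False):
    """Check that (phi, psi) is an immersion model of H in G.

    h_edges, g_edges: lists of (u, v) pairs; parallel edges are repeated entries.
    phi: dict mapping vertices of H to vertices of G.
    psi: dict mapping the index of each edge of H to a list of indices of
         edges of G, listed in the order in which the path traverses them.
    """
    if len(set(phi.values())) != len(phi):
        return False                      # phi must be injective
    branch = set(phi.values())
    used = set()                          # edges of G used by some path so far
    for idx, (u, v) in enumerate(h_edges):
        current = phi[u]
        visited = [current]
        for e in psi[idx]:
            if e in used:
                return False              # paths must be edge-disjoint
            used.add(e)
            a, b = g_edges[e]
            if current == a:
                current = b
            elif current == b:
                current = a
            else:
                return False              # consecutive edges must share a vertex
            visited.append(current)
        if current != phi[v] or len(set(visited)) != len(visited):
            return False                  # must be a path ending at phi(v)
        if strong and any(x in branch for x in visited[1:-1]):
            return False                  # internal vertices avoid branch vertices
    return True

g_edges = [("a", "b"), ("b", "c")]        # the path a-b-c
h_edges = [("u", "v")]                    # a single edge
print(is_immersion_model(h_edges, g_edges, {"u": "a", "v": "c"}, {0: [0, 1]}, strong=True))
```

If H additionally has an isolated vertex mapped to b, the same ψ still gives an immersion model, but not a strong one, because b is then a branch vertex lying inside the path.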

Bucket interfaces
Let G be a graph and σ be an ordering of V(G). For a set X ⊆ V(G), the X-blocks in σ are the maximal subsequences of consecutive vertices of σ that belong to X. Suppose (A, B) is a cut of G. Then we can write $\sigma = b_1 \bullet \dots \bullet b_p$, where $b_1, \dots, b_p$ are the A- and B-blocks in σ; these will be called jointly (A, B)-blocks. The next lemma is the cornerstone of our approach: we prove that given a graph G and a cut (A, B) of G, there exists an optimum cutwidth ordering of G where the number of blocks depends only on the cutwidth and the size of (A, B).

Lemma 4. Let ℓ be a positive integer and let G be a graph. If (A, B) is a cut of G of size ℓ, then there is an optimum cutwidth ordering σ of V(G) with at most $(2\ell + 1) \cdot (2\,\mathrm{cw}(G) + 3) + 2\ell$ (A, B)-blocks.

Proof. Let σ be an optimum cutwidth ordering such that, subject to the width being minimum, the number of (A, B)-blocks it defines is also minimized. Let $\sigma = b_1 \bullet b_2 \bullet \dots \bullet b_r$, where $b_1, b_2, \dots, b_r$ are the (A, B)-blocks of σ. If σ defines fewer than three blocks, then the claim already follows, so let us assume r ≥ 3.
Consider any ordering σ′ obtained by swapping two consecutive blocks, i.e., $\sigma' = b_1 \bullet \dots \bullet b_{j-1} \bullet b_{j+1} \bullet b_j \bullet b_{j+2} \bullet \dots \bullet b_r$ for some $j \in [r-1]$. Observe that since the blocks $b_1, \dots, b_r$ alternate as A-blocks and B-blocks, the ordering σ′ has a strictly smaller number of blocks; indeed, either j − 1 ≥ 1, in which case $b_{j-1} \bullet b_{j+1}$ defines a single block of σ′, or j = 1 and hence j + 2 ≤ r, in which case $b_j \bullet b_{j+2}$ does. Therefore, by the choice of σ, for each $j \in [r-1]$, swapping $b_j$ and $b_{j+1}$ in σ must yield an ordering with strictly larger cutwidth.
We call a block free if it does not contain any endpoint of the cut edges $E_G(A, B)$. We now prove that any run of consecutive free blocks in σ has at most 2cw(G) + 3 blocks. Since the cut (A, B) has size ℓ, there are at most 2ℓ blocks that are not free, and these separate the free blocks into at most 2ℓ + 1 runs of consecutive free blocks. This implies the claimed bound on the total number of all blocks in σ.
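The count behind this paragraph can be spelled out as follows. The symbols $F$ (number of free blocks) and $D$ (number of blocks that are not free) are introduced here only for this calculation.

```latex
% Each of the \ell cut edges has two endpoints, so at most 2\ell blocks are not free.
% D non-free blocks separate the free blocks into at most D+1 runs,
% and each run consists of at most 2cw(G)+3 blocks.
\[
  D \le 2\ell, \qquad
  F \;\le\; (D+1)\,\bigl(2\,\mathrm{cw}(G)+3\bigr) \;\le\; (2\ell+1)\,\bigl(2\,\mathrm{cw}(G)+3\bigr),
\]
\[
  F + D \;\le\; (2\ell+1)\,\bigl(2\,\mathrm{cw}(G)+3\bigr) + 2\ell .
\]
```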
Suppose, to the contrary, that there exists a run of q > 2cw(G) + 3 consecutive free blocks in σ. Let these blocks be b r , b r+1 , . . . , b s , where s − r + 1 = q. For j ∈ [r, s − 1], we define µ(j) to be the size of the cut between all vertices inside or preceding the vertices of block b j and all vertices inside or following the vertices of block b j+1 in σ; see Figure 1.
Claim 5. For every $j \in [r+1, s-2]$, we have $\mu(j) < \mu(j-1)$ or $\mu(j) < \mu(j+1)$.

Proof. Suppose that for some $j \in [r+1, s-2]$, $\mu(j) \ge \max(\mu(j-1), \mu(j+1))$. We will then show that the ordering σ′ obtained by swapping the blocks $b_j$ and $b_{j+1}$ still has optimum cutwidth, a contradiction to the choice of σ. Notice that for every vertex v preceding all vertices of $b_j$ or succeeding all vertices of $b_{j+1}$, we have $\delta(V^{\sigma'}_v) = \delta(V^{\sigma}_v)$, because the two prefixes contain the same vertices. Thus, it remains to show that for any vertex v belonging to the block $b_j$ or to the block $b_{j+1}$, also $\delta(V^{\sigma'}_v) \le \mathrm{cw}(G)$. Let $p_j$ be the number of edges of G with one endpoint in the block $b_j$ and the other endpoint preceding (in σ) all vertices of $b_j$. Let also $s_j$ be the number of edges of G with one endpoint in $b_j$ and the other endpoint succeeding (in σ) all vertices of $b_j$ (and hence succeeding all vertices of block $b_{j+1}$, since both $b_j$ and $b_{j+1}$ are free). Notice that $\mu(j) = \mu(j-1) - p_j + s_j$ and recall that $\mu(j) \ge \mu(j-1)$. This yields that $s_j \ge p_j$.
Similarly, let p j+1 be the number of edges of G with one endpoint in b j+1 and the other endpoint preceding all vertices of the block b j+1 (and, in particular, all vertices of block b j ). Let also s j+1 be the number of edges of G with one endpoint in b j+1 and the other endpoint succeeding all vertices of block b j+1 . Again, we have µ(j + 1) = µ(j) − p j+1 + s j+1 and µ(j) ≥ µ(j + 1). This yields that p j+1 ≥ s j+1 .
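Let v be a vertex of the block $b_j$. Recall that the blocks $b_j$ and $b_{j+1}$ are free and thus there are no edges between them. Observe then that, compared with $V^{\sigma}_v$, the prefix $V^{\sigma'}_v$ additionally contains exactly the vertices of $b_{j+1}$, so $\delta(V^{\sigma'}_v) = \delta(V^{\sigma}_v) - p_{j+1} + s_{j+1} \le \delta(V^{\sigma}_v) \le \mathrm{cw}(G)$. Symmetrically, if v is a vertex of the block $b_{j+1}$, then $V^{\sigma'}_v$ is obtained from $V^{\sigma}_v$ by removing the vertices of $b_j$, so $\delta(V^{\sigma'}_v) = \delta(V^{\sigma}_v) + p_j - s_j \le \delta(V^{\sigma}_v) \le \mathrm{cw}(G)$. Hence σ′ has optimum cutwidth, which proves the claim.

Claim 5 shows that for all $j \in [r+1, s-2]$, we have $\mu(j-1) > \mu(j)$ or $\mu(j) < \mu(j+1)$. It follows that any non-decreasing pair $\mu(j-1) \le \mu(j)$ must be followed by an increasing pair $\mu(j) < \mu(j+1)$. Hence, if $j_{\min}$ is the minimum index such that $\mu(j_{\min}) \le \mu(j_{\min}+1)$, then the sequence µ(j) has to be strictly decreasing up to $j_{\min}$ and strictly increasing from $j_{\min}+1$ onward. Since $0 \le \mu(j) \le \mathrm{cw}(G)$ for all j, each of these two monotone parts has at most cw(G) + 1 elements, so the sequence $\mu(r), \dots, \mu(s-1)$ has at most 2cw(G) + 2 elements. Therefore the length q of the run of consecutive free blocks cannot be larger than 2cw(G) + 3, concluding the proof.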
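The final counting step can be sanity-checked by exhaustive search for small values. The sketch below (function name chosen for this illustration) enumerates all integer sequences with values in {0, …, c} in which every interior element is strictly smaller than at least one of its neighbours, which is the condition from Claim 5, and reports the maximum possible length. For c = cw(G) this is the maximum number of values µ(j), that is, q − 1.

```python
def longest_valley_sequence(c):
    """Maximum length of a sequence over {0, ..., c} in which every interior
    element is strictly smaller than at least one of its two neighbours."""
    best = 0

    def extend(seq):
        nonlocal best
        best = max(best, len(seq))
        for x in range(c + 1):
            # Appending x turns seq[-1] into an interior element (if len >= 2),
            # so the condition has to be checked for it now.
            if len(seq) >= 2 and not seq[-1] < max(seq[-2], x):
                continue
            extend(seq + [x])

    extend([])
    return best

for c in range(4):
    print(c, longest_valley_sequence(c))  # the maximum length is 2c + 2
```

An extremal sequence is c, c − 1, …, 0, 0, 1, …, c, matching the bound q ≤ 2cw(G) + 3.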
We use the above lemma to bound the number of "types" of prefixes in graph orderings. To describe such a prefix, i.e., one side of a cut in a graph, we use the following definition.

Definition 6. A k-boundaried graph is a pair $\mathbf{G} = (G, \bar{x})$ where G is a graph and $\bar{x} = (x_1, \dots, x_k)$ is a k-tuple of the graph's boundary vertices (ordered, not necessarily distinct). The extension of $\mathbf{G}$ is the graph $G^*$ obtained from G by adding k new vertices $x'_1, \dots, x'_k$ and edges $x_1 x'_1, \dots, x_k x'_k$. The join $\mathbf{A} \oplus \mathbf{B}$ of two k-boundaried graphs $\mathbf{A} = (A, \bar{x})$ and $\mathbf{B} = (B, \bar{y})$ is the graph obtained from the disjoint union of A and B by adding an edge $x_i y_i$ for each $i \in [k]$.
From Lemma 4 we derive that for any given cut (A, B) of size ℓ of a graph G with cw(G) ≤ k, there is an optimum cutwidth ordering in which the vertices of A occur in O(kℓ) blocks. Our next goal is to show that the only information about A that can affect the cutwidth of G is: the placing of the endpoints of each cut edge ($x_i$ and $x'_i$) into blocks, and the cutwidth of each block (as an induced subgraph of A or $A^*$). Recall that for an ordering σ of V(G), σ-cuts are cuts of the form (L, R), where L is a prefix of σ and R is the complementary suffix.

Notice that every σ-cut of G is in cuts(G, σ, T, i) for at least one bucket i ∈ [ℓ]; since $\mathrm{cw}_\sigma(G)$ is the maximum of $|E_G(L, R)|$ over σ-cuts (L, R), we have

$\mathrm{cw}_\sigma(G) = \max_{i \in [\ell]} \mathrm{width}(G, \sigma, T, i)$. (1)
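The relation between bucket widths and the width of an ordering can be illustrated on a small example. The sketch below rests on an assumed reading of the notions involved, which should be checked against the formal definition: an ℓ-bucketing is taken to be a map T from vertices to {1, …, ℓ} that is non-decreasing along σ, cuts(G, σ, T, i) is taken to consist of the σ-cuts that have all buckets before i on the left side and all buckets after i on the right side, and width(G, σ, T, i) is the largest such cut. All function names are illustrative.

```python
def cut_size(edges, left):
    """Number of edges with exactly one endpoint in `left`."""
    return sum(1 for u, v in edges if (u in left) != (v in left))

def ordering_width(edges, order):
    """Maximum size of a prefix/suffix cut of `order`."""
    return max((cut_size(edges, set(order[:p])) for p in range(1, len(order))),
               default=0)

def bucket_width(edges, order, T, i):
    """Width of bucket i under the ASSUMED definition described above:
    the largest cut whose prefix contains every bucket before i and
    no vertex of a bucket after i."""
    lo = sum(1 for v in order if T[v] < i)    # shortest admissible prefix
    hi = sum(1 for v in order if T[v] <= i)   # longest admissible prefix
    return max(cut_size(edges, set(order[:p])) for p in range(lo, hi + 1))

edges = [("a", "b"), ("b", "c"), ("c", "d")]   # the path a-b-c-d
order = ["a", "c", "b", "d"]
T = {"a": 1, "c": 1, "b": 2, "d": 3}           # a 3-bucketing of `order`
widths = [bucket_width(edges, order, T, i) for i in (1, 2, 3)]
print(widths, ordering_width(edges, order))    # [3, 3, 1] 3
```

Under this reading every prefix length falls into the admissible range of at least one bucket, which is why the maximum bucket width equals the width of the ordering.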
For two k-boundaried graphs $\mathbf{A} = (A, \bar{x})$, $\mathbf{B} = (B, \bar{y})$, we slightly abuse notation and understand the edges $x_1 x'_1, \dots, x_k x'_k$ in $A^*$ to be the same as $y'_1 y_1, \dots, y'_k y_k$ in $B^*$ and as $x_1 y_1, \dots, x_k y_k$ in $\mathbf{A} \oplus \mathbf{B}$. Given an ordering σ of the vertices of $\mathbf{A} \oplus \mathbf{B}$, we define $\sigma|_{A^*}$ as an ordering that orders $x'_i$ just as σ orders $y_i$, with the order between $x'_i$ and $x'_j$ chosen arbitrarily when $y_i = y_j$. The following lemma shows that if an ℓ-bucketing respects the sides of a cut, then the width of any bucket can be computed as the sum of contributions of the sides.

Lemma 8. Let k, ℓ be positive integers and
Proof. Consider any cut (L, R) in cuts(G, σ, T, i). Observe that for every edge e of $E_{A \oplus B}(L, R)$ one of the following holds: Since we do not distinguish between the vertices $x_i$ and the vertices $y'_i$, we equivalently obtain that for every edge e ∈ $E_{A \oplus B}(L, R)$, e is either an edge in . Therefore, the total number of edges crossing these cuts is at most width For the converse inequality, observe that since the bucket $T^{-1}(i)$ does not contain any vertices Then, since we assumed that $T^{-1}(i)$ does not contain any vertices of A (and thus, may only contain vertices of B), it follows that ( where we again do not distinguish between the vertices $x_i$ and $y'_i$. Hence Replacing the roles of A and B above, we obtain that if $T^{-1}(i)$ does not contain any vertex of B, then

Intuitively, this implies that the cutwidth of $\mathbf{A} \oplus \mathbf{B}$ depends on $\mathbf{A}$ only in the widths of each block relative to A and $A^*$ (in any bucketing where buckets are either A-blocks or B-blocks). Therefore, replacing $\mathbf{A}$ with another boundaried graph whose extension has an ordering and bucketing with the same widths preserves cutwidth (as long as endpoints of the cut edges are placed in the same buckets too). This is formalized in the next definition.
Definition 9. For k, ℓ ∈ N, a (k, ℓ)-bucket interface consists of functions b, b′ : [k] → [ℓ], identifying the buckets which contain $x_i$ and $x'_i$, respectively, and functions µ, µ* defined on [ℓ], corresponding to the widths of buckets.
We call two k-boundaried graphs $\mathbf{G}_1$, $\mathbf{G}_2$ (k, ℓ)-similar if the sets of (k, ℓ)-bucket interfaces they conform with are equal. The following theorem subsumes the above ideas. The proof follows easily from Lemma 8 and the fact that $\mathrm{cw}_\sigma(G) = \max_{i \in [\ell]} \mathrm{width}(G, \sigma, T, i)$ (Eq. (1)).
Theorem 11. Let k, r be two positive integers. Let also A 1 and A 2 be two k-boundaried graphs that are (k, ℓ)-similar, where ℓ = (2k + 1) · (2r + 4). Then for any By adding an empty block at the front, if necessary, we may assume that the number of blocks is at most ℓ, while odd-indexed blocks are V (A 1 )-blocks and even-indexed blocks are V (B)-blocks. Then, there is an ℓ-bucketing T 1 of σ 1 such that T 1 (v) is odd for v ∈ A 1 and even for v ∈ B. Therefore σ 1 | A * 1 and T 1 | A * 1 certify that the following (k, ℓ)-bucket interface conforms with A 1 : . By (k, ℓ)-similarity there is an ordering σ 2 of A * 2 and its ℓ-bucketing T 2 such that: . Given this, we define an assignment of vertices into buckets Π : Clearly, We claim that and, similarly, Thus, we obtain that Note also that vertices of A 2 are mapped to odd buckets and vertices of B are mapped to even buckets. We use Π to define an ordering π of the vertices of A 2 ⊕ B as follows. Formally, we let $u <_\pi v$ if and only if one of the following conditions holds: Note that this is a linear ordering as it first sorts the vertices according to the bucket they belong to and then according to the ordering induced in this bucket by the orderings σ 1 and σ 2 . Note also that by definition Π is an ℓ-bucketing of π. Recall that, from Eq. (4), Π| A * 2 = T 2 | A2 . This, together with the observation that the vertices of A 2 are mapped to odd buckets of Π, implies that Moreover, recall that Π| B * = T 1 | B * . This, together with the fact that the vertices of B are mapped to even buckets of Π, implies that We now bound the width of each bucket. Let i ∈ [ℓ]. Notice that if i is even then by construction Π −1 (i) contains only vertices from B. Therefore, where the first equality follows from Lemma 8, the second equality holds by Eq. (3), (7), (8), and (5), the third inequality follows from the (k, ℓ)-bucket interface, and the fifth equality follows from Lemma 8.
We similarly argue, using µ* instead of µ, that for odd Similarly to Eq. (10), the first equality follows from Lemma 8, the second equality holds by Eq. (4), (6), (2), and (9), the third inequality follows from the (k, ℓ)-bucket interface, and the fifth equality follows from Lemma 8. Therefore, from Eq. (10) and (11) we obtain that So in particular cw(A 2 ⊕ B) ≤ r. By applying the same reasoning, but with A 1 and A 2 reversed, we obtain also the converse inequality.

Figure 2: An ordering of vertices with the minimum cut (A, B) between A 1 and B 2 of size i highlighted in blue and red. Below, the modified ordering, with cutwidth bounded using submodularity.

Obstruction sizes and lean orderings
In this section we establish the main result on sizes of obstructions for cutwidth. We first introduce lean orderings and prove that there is always an optimum ordering that is lean.
Lemma 13. For every graph G, there is a lean ordering σ of V(G) with $\mathrm{cw}_\sigma(G) = \mathrm{cw}(G)$.

Proof. Without loss of generality, we may assume that the graph is connected. Let σ be an optimum cutwidth ordering of V = V (G). Subject to the optimality of σ, we choose σ so that We prove that σ defined in this manner is in fact lean. Assume the contrary. Then by Menger's theorem, there exist vertices (see Figure 2). Notice that Let σ ′ be the ordering of V obtained by concatenating σ| A1 , σ| A2 , σ| B1 , and σ| B2 . We , contradicting the choice of σ. Therefore, σ is a lean ordering of V with $\mathrm{cw}_\sigma(G) = \mathrm{cw}(G)$.
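A minimal sketch of a leanness test follows. It assumes the following reading of leanness, which is an assumption of this example and should be checked against the formal definition: an ordering $v_1, \dots, v_n$ is lean if for all i ≤ j, the maximum number of edge-disjoint paths between $\{v_1, \dots, v_i\}$ and $\{v_{j+1}, \dots, v_n\}$ equals the minimum of $\delta(V^\sigma_{v_t})$ over i ≤ t ≤ j. Edge-disjoint paths are counted with a plain augmenting-path maximum flow; all names are illustrative.

```python
from collections import defaultdict, deque

def cut_size(edges, left):
    return sum(1 for u, v in edges if (u in left) != (v in left))

def max_edge_disjoint_paths(edges, A, B):
    """Maximum number of edge-disjoint paths between vertex sets A and B,
    computed as a unit-capacity maximum flow after contracting A and B."""
    A, B = set(A), set(B)
    def rep(v):
        return "S" if v in A else "T" if v in B else ("v", v)
    cap = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        ru, rv = rep(u), rep(v)
        if ru != rv:                      # drop edges inside A or inside B
            cap[ru][rv] += 1
            cap[rv][ru] += 1
    flow = 0
    while True:
        parent = {"S": None}              # breadth-first search for an augmenting path
        queue = deque(["S"])
        while queue and "T" not in parent:
            x = queue.popleft()
            for y, c in list(cap[x].items()):
                if c > 0 and y not in parent:
                    parent[y] = x
                    queue.append(y)
        if "T" not in parent:
            return flow
        y = "T"
        while parent[y] is not None:      # push one unit of flow along the path
            x = parent[y]
            cap[x][y] -= 1
            cap[y][x] += 1
            y = x
        flow += 1

def is_lean(edges, order):
    """Leanness test under the ASSUMED definition stated in the lead-in."""
    n = len(order)
    delta = [cut_size(edges, set(order[:t])) for t in range(n + 1)]
    for i in range(1, n):
        for j in range(i, n):
            needed = min(delta[i:j + 1])
            if max_edge_disjoint_paths(edges, order[:i], order[j:]) < needed:
                return False
    return True

# Two double edges a=m2 and m1=b: interleaving them is optimal neither in
# width nor lean, while keeping them apart is lean.
edges = [("a", "m2"), ("a", "m2"), ("m1", "b"), ("m1", "b")]
print(is_lean(edges, ["a", "m1", "m2", "b"]))  # False
print(is_lean(edges, ["a", "m2", "m1", "b"]))  # True
```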
The rest of Section 4 is devoted to the proof of Theorem 1. Before we proceed with this proof, we need a series of auxiliary lemmas.
For every $s, r \in \mathbb{N}^+$, we set $A_{s,r} = [s, s + r - 1]$. We prove the following.

Lemma 14. Let N be a positive integer. For every $s, r \in \mathbb{N}^+$ and every word w over $A_{s,r}$ of length $N^r$ there is a symbol $k \in A_{s,r}$ and a subword u of w such that (a) u contains only numbers not smaller than k, and (b) u contains the number k at least N times.
Proof. We prove the lemma by induction on r. Notice that for r = 1, $A_{s,r} = \{s\}$ and thus the only word w of length N is $s^N$. Thus, the lemma holds with k = s and u = w. We proceed to the inductive step for r > 1. Let $s \in \mathbb{N}^+$ and let w be a word over $A_{s,r}$ of length $N^r$. If s occurs at least N times in w, then again the lemma holds with k = s and u = w. Thus, we may assume that s occurs at most N − 1 times. These occurrences split w into at most N maximal subwords that do not contain s, of total length at least $N^r - (N - 1)$. Hence one of them has length at least $N^{r-1}$; that is, there exists a subword w′ of w of length at least $N^{r-1}$ over $A_{s,r} \setminus \{s\} = A_{s+1,r-1}$, and by shortening it if necessary we may assume that its length is exactly $N^{r-1}$. From the inductive hypothesis, there exists $k \in A_{s+1,r-1} \subseteq A_{s,r}$ and a subword u of w′ such that k occurs at least N times in u and u contains only numbers at least k. Since w′ is a subword of w, u is also a subword of w. This completes the inductive step and the proof of the lemma.
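The induction in this proof is constructive and can be unrolled into a short loop. The sketch below follows it step by step; the function name and the convention of returning the subword as a half-open index range are choices made for this illustration.

```python
def find_rich_subword(word, s, r, N):
    """Return (k, lo, hi) such that word[lo:hi] contains only symbols >= k
    and contains k at least N times. Expects a list of integers from
    {s, ..., s + r - 1} of length at least N**r."""
    lo, hi = 0, len(word)
    for k in range(s, s + r):
        # Invariant: word[lo:hi] only contains symbols >= k.
        if word[lo:hi].count(k) >= N:
            return k, lo, hi
        # Fewer than N occurrences of k: keep the longest piece that avoids k.
        best_lo, best_hi, start = lo, lo, lo
        for idx in range(lo, hi + 1):
            if idx == hi or word[idx] == k:
                if idx - start > best_hi - best_lo:
                    best_lo, best_hi = start, idx
                start = idx + 1
        lo, hi = best_lo, best_hi
    raise ValueError("word is shorter than N**r or uses symbols outside the alphabet")

print(find_rich_subword([2, 1, 2, 2], s=1, r=2, N=2))              # (2, 2, 4)
print(find_rich_subword([2, 3, 1, 3, 2, 3, 3, 3], s=1, r=3, N=2))  # (3, 5, 8)
```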
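We use Lemma 14 only for s = 1, giving the following corollary.

Corollary 15. Let r and N be positive integers and let w be a word of length $N^r$ over the alphabet [r]. Then there is a number k ∈ [r] and a subword u of w such that u contains only numbers not smaller than k, and u contains the number k at least N times.

We also need one additional statement about boundaried graphs and bucket interfaces.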
Proof. First, we extend the immersion model (φ, ψ) to an immersion model (φ . Suppose that ordering σ of V (B * ) and its ℓ-bucketing T certify that B conforms to (b, b ′ , µ, µ * ). We define ordering σ ′ of V (A * ) and its ℓ-bucketing T ′ as follows: It is easy to see that T ′ is an ℓ-bucketing of σ ′ . We now verify that σ ′ and T ′ certify that A conforms to (b, b ′ , µ, µ * ). The first two conditions of conforming follow directly from the definition of σ ′ and T ′ , so we are left with the third and the fourth condition.
Consequently, one of the edges of this path must belong to E B (L, R). Since paths ψ(uv) are pairwise edge-disjoint for different edges uv ∈ E A (L ′ , R ′ ), we infer that This establishes the third condition. The fourth condition follows by the same argument applied to graphs A * and B * , instead of A and B.
The following theorem is the technical counterpart of Theorem 1. Its proof is based on Theorem 11, Lemma 13, Observation 10 and the idea of "unpumping" repeating types, presented in the introduction. The leanness is used to make sure that within the unpumped segment of the ordering, one can find the maximum possible number of edge-disjoint paths between the parts of the graph on the left side and on the right side of the segment. This ensures that the graph obtained from unpumping can be immersed in the original one.
Proof. Take any $G \in \mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$ and assume, towards a contradiction, that $|V(G)| > N^{k+1}$. Let $\sigma = v_1, v_2, \dots, v_{|V(G)|}$ be a lean optimum cutwidth ordering of G, which exists by Lemma 13. We define $c_i = \delta(V^\sigma_{v_i})$, that is, $c_i$ is the size of the cut between the vertices of G up to $v_i$ and the rest of the graph. Notice that since $G \in \mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$, we have that cw(G) = k + 1 and G is connected. This implies that $1 \le c_i \le k + 1$ for every $i \in [|V(G)| - 1]$. Observe that $c_1 c_2 \dots c_{|V(G)|-1}$ is a word of length at least $N^{k+1}$ over the alphabet [k + 1]. From Corollary 15, it follows that there exist $1 \le s \le t < |V(G)|$ and q ∈ [k + 1] such that for every s ≤ i ≤ t we have $c_i \ge q$, and there also exist N distinct indices $s \le i_1 < i_2 < \dots < i_N \le t$ such that $c_{i_j} = q$ for every j ∈ [N]. Without loss of generality we may assume that $i_1 = s$ and $i_N = t$.
For each j ∈ [N ], we can define a q-boundaried graph G j = (G j , (z 1 j , z 2 j , . . . , z q j )) in the following way. First, by leanness, we find edge-disjoint paths P 1 , . . . , P q between V σ vi 1 and V \V σ . Then ψ can be defined by taking ψ(e) = e for each e ∈ E(G j1 ) and mapping each edge x i j1 z i j1 to an appropriate infix of the path P i , extended by the edge x i j2 z i j2 . Consequently, G j1 and G j2 satisfy the prerequisites of Lemma 16. We infer that if by ζ(j) we denote the set of (q, ℓ)-bucket interfaces to which G j conforms, then Observation 10 implies that N is larger by more than 1 than the total number of (q, ℓ)-bucket interfaces. It follows that there exists an index j, 1 ≤ j < N , such that ζ(j) = ζ(j + 1).
In other words, the q-boundaried graphs G j and G j+1 are (q, ℓ)-similar.
Define a q-boundaried graph G ′ = (G ′ , (y 1 j+1 , . . . , y q j+1 )) by taking ]. It can now be seen that G j+1 ⊕G ′ is exactly the graph G with every edge of the cut ) subdivided once. Since subdividing edges does not change the cutwidth of the graph, we have that On the other hand, q-boundaried graphs G j and G j+1 are (q, ℓ)-similar. Since ℓ ≥ (2q+3)·(2q+6), by Theorem 11 we conclude that Examine the graph G j ⊕ G ′ . In the join operation, we added an edge z i j y i j+1 for each i ∈ [q], which means each vertex z i j has exactly two incident edges in G j ⊕ G ′ : one connecting it to x i j and one connecting it to y i j+1 . Let H be the graph obtained from G j ⊕ G ′ by dissolving every vertex z i j , i.e., removing it and replacing edges x i j z i j and z i j y i j+1 with a fresh edge x i j y i j+1 . Subdividing edges does not change the cutwidth of a graph, so we obtain that: Finally, it is easy to see that G admits H as a strong immersion: a strong immersion model of H in G can be constructed by mapping the vertices and edges of G j and G ′ identically, and then mapping each of the remaining edges x i j y i j+1 to a corresponding infix of the path P i . Also, since i j < i j+1 , the graph H has strictly fewer vertices than G. However, from Eq. (12), (13), and (14) we conclude that cw(H) = cw(G) > k. This contradicts the assumption that G ∈ obs ≤si (C k ).
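Proof of Theorem 1. Theorem 17 provides an upper bound on the number of vertices of a graph in $\mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$. Observe that since such a graph has cutwidth k + 1, each of its vertices has degree at most 2(k + 1): in an optimum ordering, the edges joining a vertex to earlier vertices all cross the cut just before it, and the edges joining it to later vertices all cross the cut just after it. It follows that any graph from $\mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$ has $2^{O(k^3 \log k)}$ vertices and edges. Finally, by Observation 3 we have $\mathrm{obs}_{\le_i}(\mathcal{C}_k) \subseteq \mathrm{obs}_{\le_{si}}(\mathcal{C}_k)$, so the same bound holds also for immersions instead of strong immersions. This concludes the proof of Theorem 1.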

An algorithm for computing cutwidth
In this section we present an exact FPT algorithm for computing the cutwidth of a graph. First, we need to give a dynamic programming algorithm that, given an approximate ordering σ of width r, finds, if possible, an ordering of width at most k, where k ≤ r is given.
Our algorithm takes advantage of the given ordering σ and essentially computes, for each subgraph of G induced by a prefix of σ, the (r, ℓ)-bucket interfaces it conforms to. More precisely, in Lemma 18 we show that if G has an optimum ordering of width k, then there is an optimum ordering where each of these induced subgraphs occupies at most ℓ = O(rk) buckets, allowing us to restrict our search to (r, ℓ)-bucket profiles (a variant of bucket interfaces to be defined later, refined so as to consider border vertices more precisely). The proof slightly strengthens that of Lemma 4.
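The width of an ordering, which the rest of this section manipulates, can be computed directly from its definition. The following is a minimal sketch, not part of the algorithm itself; the function name `ordering_width` and the representation of a multigraph as a list of vertex pairs are assumptions made for illustration.

```python
from typing import Hashable, Iterable, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def ordering_width(edges: Iterable[Edge], order: Sequence[Hashable]) -> int:
    """Maximum number of edges crossing any prefix/suffix cut of `order`.

    `edges` may contain repeated pairs (parallel edges); loops are ignored,
    since a loop never crosses a cut.
    """
    position = {v: i for i, v in enumerate(order)}
    # diff[i] is the change in cut size when moving to the gap after position i.
    diff = [0] * (len(order) + 1)
    for u, v in edges:
        a, b = sorted((position[u], position[v]))
        if a == b:
            continue
        diff[a] += 1  # the edge starts crossing at the gap after position a
        diff[b] -= 1  # and stops crossing at the gap after position b
    width = current = 0
    for gap in range(len(order) - 1):
        current += diff[gap]
        width = max(width, current)
    return width
```

For instance, the path a-b-c-d has width 1 in its natural order, while the order a, c, b, d has width 3, because all three edges cross the middle cut.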
Lemma 18. Let G be a graph with an ordering σ of width r. Then there exists also an ordering τ of optimum width, i.e., with cw τ (G) = cw(G), that has the following property: for every prefix X of σ, the number of X-blocks in τ is at most 2r · cw(G) + cw(G) + 4r + 2.
As A-blocks and B-blocks appear alternately, at most half rounded up of the (A, B)-blocks can be A-blocks. Hence, the number of A-blocks in such an optimum-width ordering is at most 2r · cw(G) + cw(G) + 4r + 2; we denote this quantity by λ.
The proof of Lemma 4 in fact shows that for any ordering σ of V (G) and any cut (A, B) of G of size at most r, either σ already has at most 2λ − 1 (A, B)-blocks, or an ordering σ ′ can be obtained from σ by swapping its (A, B)-blocks so that σ ′ has strictly fewer (A, B)-blocks. Therefore, by reordering (A, B)-blocks of σ, we eventually get a new ordering which has at most 2λ − 1 (A, B)-blocks, and hence at most λ A-blocks.
For i = 1, 2, . . . , |V (G)| − 1, let (A i , B i ) be the cut of G, where A i is the prefix of σ of length i, while B i is the suffix of σ of length |V (G)| − i. Let τ 0 be any optimum-width ordering of G. We now inductively construct orderings τ 1 , τ 2 , . . . , τ |V (G)|−1 , as follows: once τ i is constructed, we apply the above reordering procedure to τ i and the cut (A i+1 , B i+1 ). This yields a new ordering τ i+1 of optimum width such that the number of A i+1 -blocks in τ i+1 is at most λ. Furthermore, τ i+1 is obtained from τ i by reordering A i+1 - and B i+1 -blocks in τ i . Hence, whenever X is a subset of A i+1 , any X-block in τ i remains consecutive in τ i+1 , as it is contained in one A i+1 -block in τ i that is moved as a whole in the construction of τ i+1 . Consequently, if for all j ≤ i the number of A j -blocks in τ i is at most λ, then this property is also satisfied in τ i+1 . A straightforward induction now yields the following invariant: for each j ≤ i, the number of A j -blocks in τ i is at most λ. Therefore τ = τ |V (G)|−1 gives an ordering with the claimed properties.
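The property guaranteed by Lemma 18 is easy to test on concrete orderings. The sketch below assumes that an X-block of τ is a maximal run of consecutive vertices of τ that all belong to X; the helper names `count_blocks`, `max_prefix_blocks`, and `lemma18_bound` are hypothetical and chosen only for this illustration.

```python
from typing import Hashable, Iterable, Sequence


def count_blocks(tau: Sequence[Hashable], X: Iterable[Hashable]) -> int:
    """Number of maximal runs of consecutive vertices of `tau` lying in X."""
    members = set(X)
    blocks = 0
    previous_inside = False
    for v in tau:
        inside = v in members
        if inside and not previous_inside:
            blocks += 1  # a new run of X-vertices starts here
        previous_inside = inside
    return blocks


def max_prefix_blocks(sigma: Sequence[Hashable], tau: Sequence[Hashable]) -> int:
    """Largest number of X-blocks in `tau` over all proper prefixes X of `sigma`."""
    return max(
        (count_blocks(tau, sigma[:i]) for i in range(1, len(sigma))),
        default=0,
    )


def lemma18_bound(r: int, cw: int) -> int:
    """The quantity 2r * cw(G) + cw(G) + 4r + 2 from Lemma 18."""
    return 2 * r * cw + cw + 4 * r + 2
```

Given an approximate ordering `sigma` of width `r` and a candidate optimum ordering `tau`, the lemma asserts that some optimum `tau` satisfies `max_prefix_blocks(sigma, tau) <= lemma18_bound(r, cw)`.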
Bucket profiles. We now define a refinement of the widths of the buckets of a bucket interface as well as a refinement of the notion of bucket interfaces. They are used in the dynamic programming algorithm of Lemma 22.

Definition 19. Let (G,x) be a k-boundaried graph and let
. Let now σ be an ordering of V (G * ) and T be an ℓ-bucketing of σ. For every bucket Let also cuts(G, σ, T, i, j) be the family of σ-cuts containing on one side all vertices appearing before v j−1 (or, if j = 0, all vertices of buckets appearing before bucket i) and a prefix (in σ) of T −1 j (i). For an ordering σ of the vertices of a graph G, define the width of j-th segment T −1 j (i) of the bucket i, i ∈ [ℓ], j ∈ [0, p], as the maximum width of any cut in the family cuts(G, σ, T, i, j). Formally, We also need to refine the notion of a (k, ℓ)-bucket interface.
From the fact that the boundary vertices of a k-boundaried graph G split the buckets defined by T into at most 2k segments in total it follows that:
Observation 21. For any pair (k, ℓ) of positive integers, there is a set of at most $2^{2k(\log \ell+\log k)+(\ell+2k)\log(k+1)}$ (k, ℓ)-bucket profiles that a k-boundaried graph G can possibly conform with, and this set can be constructed in time polynomial in its size.
The (k, ℓ)-bucket profiles that Observation 21 refers to will be called valid. By making use of these two notions we ensure that we will be able to update the widths of each bucket every time a new vertex is processed by the dynamic programming algorithm. We are now ready to prove Lemma 22.
Proof. The algorithm attempts to compute an ordering of width k for consecutive values k = 0, 1, 2, . . . , in this order. The first value of k for which the algorithm succeeds is equal to the value of the cutwidth, and then the constructed ordering may be returned. Since there is an ordering of width r, we will always eventually succeed for some k ≤ r, which implies that we will make at most r + 1 iterations. Hence, from now on we may assume that we know the target width k ≤ r for which we try to construct an ordering.
Given a graph G and an ordering σ of its vertices with cw σ (G) ≤ r we denote by G w the graph induced by the vertices of the prefix of σ of length w. Then we naturally define the boundaried graph G w , where we introduce a boundary vertex x i for each edge e i of the cut . Note that this cut has at most r edges.
By Lemma 18, we know that there is an optimum-width ordering τ such that every prefix V (G w ) of σ has at most ℓ blocks in τ . Our dynamic programming algorithm will simply inductively reconstruct all (k, ℓ)-bucket profiles that may correspond to V (G w )-blocks in τ , for each consecutive w in the ordering σ, eventually reconstructing τ , if cw τ (G) ≤ k.
We describe which bucket profiles P ′ expand P by guessing where the new vertex would land in the bucket profile P , assuming that G w conforms to P . After the guess is made, the updated profile P becomes the expanded profile P ′ . Different guesses lead to different profiles P ′ which extend P ; this corresponds to different ways in which the construction of the optimum ordering can continue. As describing the details of this expansion relation is a routine task, we prefer to keep the description rather informal, and leave working out all the formal details to the reader.
Let v w+1 be the (w + 1)-st vertex in the ordering σ, that is, v w+1 ∈ V (G w+1 ) \ V (G w ). We construct (by guessing) a (k, ℓ)-bucket profile P ′ from the (k, ℓ)-bucket profile P in the following way. First, we guess an even bucket of P to place each one of the vertices in V (G * w+1 ) \ V (G * w ): the new vertices of the extension that correspond to new edges of the cut E G (V (G w+1 ), V (G) \ V (G w+1 )) that are incident to v w+1 . Notice that each bucket contains, at any moment, at most r vertices. Therefore, we have at most r + 1 possible choices about where each vertex will land in each bucket (including the placing in the order, as indicated by the function p ′ (·)). Notice also that there are at most r + 1 vertices in V (G * w+1 ) \ V (G * w ). Therefore we have at most (ℓ(r + 1)) r+1 options for this guess.
Next, we choose the place v w+1 is going to be put in. If v w+1 is an endpoint of an edge from the cut E G (V (G w ), V (G) \ V (G w )), then this place is already indicated by functions b ′ (·) and p ′ (·) in the bucket profile P ; if there are multiple edges in the cut E G (V (G w ), V (G) \ V (G w )) that have v w+1 as an endpoint, then all of them must be placed next to each other in the same even bucket (otherwise P has no extension). Otherwise, if v w+1 is not an endpoint of an edge from E G (V (G w ), V (G) \ V (G w )), we guess the placing of v w+1 by guessing an even bucket (one of at most ℓ + 1 options) together with a segment between two consecutive extension vertices in this bucket (one of at most r + 1 options).
The placing of v w+1 may lead to one of three different scenarios; we again guess which one applies. First, v w+1 can establish a new odd bucket and split the even bucket into which it was put into two new even buckets, one on the left and one on the right of the new odd bucket containing v w+1 ; the other extension vertices placed in this bucket are split accordingly. Second, v w+1 can be present at the leftmost or rightmost end of the even bucket it is placed in, so it gets merged into the neighboring odd bucket. Finally, if the even bucket in which v w+1 is placed did not contain any other extension vertices of G * w , then v w+1 can be declared to be the last vertex placed in this bucket, in which case we merge it together with both neighboring odd buckets. In these scenarios, whenever the extended profile turns out to have more than ℓ buckets, we discard this option.
Having guessed how the placing of v w+1 will affect the configuration of buckets, we proceed with updating the sizes of cuts, as indicated by function ν(·). For this, we first examine all the edges of the cut E G (V (G w ), V (G) \ V (G w )) that have v w+1 as an endpoint. These edges did not contribute to the values of ν(·) in the bucket profile P , but should contribute in P ′ . Note that given the placement of v w+1 , for each such edge we exactly see over which segments this edge "flies over", and therefore we can update the values of ν(·) for these segments by incrementing them by one. Finally, when v w+1 got merged to a neighboring odd bucket (or to two of them), we may also need to take into account one more cut in the value of ν(·) for the last/first segment of this bucket: the one between v w+1 and the vertices placed in this bucket. It is easy to see that from the value of ν(·) for the segment in which v w+1 is placed, and the exact placement of the endpoints of all the boundary edges, we can deduce the exact size of this cut. Hence, the relevant value of ν(·) can be efficiently updated by taking the maximum of the old value and the deduced size of the cut. We update the function ν in a similar fashion when v w+1 merges with both neighboring odd buckets. If at any point any of the values of ν(·) exceeds k, we discard this guess.
This concludes the definition of the extension. For every (k, ℓ)-bucket profile P and every (k, ℓ)-bucket profile P ′ that extends it, we add to D an arc from (w, P ) to (w + 1, P ′ ). It is easy to see from the description above that, given P and P ′ , it can be verified in time polynomial in r whether such an arc should be added.
Finally, in the graph D we determine using, say, depth-first search, whether there is a directed path from node (0, P ∅ ) to node (|V (G)|, P full ), where P ∅ is an empty bucket profile and P full is a bucket profile containing just one odd bucket. It is clear from the construction that if we find such a path, then by applying operations recorded along such a path we obtain an ordering of the vertices of G of width at most k. On the other hand, provided k = cw(G), by Lemma 18 we know that there is always an optimum-width ordering τ such that every prefix of σ has at most ℓ blocks in τ . Then the (k, ℓ)-bucket profiles naturally defined by the prefixes of σ in τ define a path from (0, P ∅ ) to (|V (G)|, P full ) in D.
The graph D has $2^{O(r^2\log r)} \cdot |V(G)|$ vertices and arcs, and the depth-first search runs in time linear in its size. It is also trivial to reconstruct the optimum-width ordering of the vertices of G from the obtained path in linear time. This yields the promised running time bounds.
Having the algorithm of Lemma 22, a standard application of the iterative compression technique immediately yields a $2^{O(k^2\log k)} \cdot n^2$ time algorithm for computing cutwidth, as sketched in Section 1. Simply add the vertices of G one by one, and apply the algorithm of Lemma 22 at each step. However, we can make the dependence on n linear by adapting the approach of Bodlaender [2]; more precisely, we make bigger steps. Such a big step consists of finding a graph H that can be immersed in the input graph G, which is smaller by a constant fraction, but whose cutwidth is not much smaller. This is formalised in Lemma 25. For its proof we need the following definition and a known result about obstacles to small cutwidth.
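The overall shape of this search, namely processing the vertices in the order σ, guessing where each new vertex lands, discarding guesses whose width exceeds k, and then looking for a path through the resulting state graph by depth-first search, can be sketched as follows. This is a toy stand-in under loudly stated assumptions: a state stores the whole partial ordering rather than a (k, ℓ)-bucket profile, so the state space is exponential and the running time bound of Lemma 22 does not apply. All names (`find_path`, `partial_width`, `find_ordering`) are hypothetical.

```python
from typing import Callable, Hashable, Iterable, List, Optional, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]
State = Tuple[int, tuple]  # (number of processed vertices, partial ordering)


def find_path(start, is_goal: Callable, successors: Callable) -> Optional[list]:
    """Iterative depth-first search; returns a start-to-goal path or None."""
    parent = {start: None}
    stack = [start]
    while stack:
        node = stack.pop()
        if is_goal(node):
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in successors(node):
            if nxt not in parent:
                parent[nxt] = node
                stack.append(nxt)
    return None


def partial_width(edges: Sequence[Edge], order: Sequence[Hashable]) -> int:
    """Width of `order` in the subgraph induced by the vertices of `order`."""
    position = {v: i for i, v in enumerate(order)}
    width = 0
    for gap in range(len(order) - 1):
        crossing = sum(
            1 for u, v in edges
            if u in position and v in position
            and (position[u] <= gap) != (position[v] <= gap)
        )
        width = max(width, crossing)
    return width


def find_ordering(edges: Sequence[Edge], sigma: Sequence[Hashable],
                  k: int) -> Optional[List[Hashable]]:
    """Return an ordering of width at most k, or None if there is none."""
    n = len(sigma)

    def successors(state: State) -> Iterable[State]:
        w, placed = state
        if w == n:
            return
        v = sigma[w]
        # Guess the position of the new vertex; discard guesses that are too wide.
        for i in range(len(placed) + 1):
            candidate = placed[:i] + (v,) + placed[i:]
            if partial_width(edges, candidate) <= k:
                yield (w + 1, candidate)

    path = find_path((0, ()), lambda s: s[0] == n, successors)
    return None if path is None else list(path[-1][1])
```

Pruning is sound here because restricting an ordering of width at most k to a prefix of σ never increases its width; the bucket profiles of the actual algorithm serve to compress these partial orderings into a bounded number of states.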

Definition 23. A perfect binary tree is a rooted binary tree in which all interior nodes have two children and all leaves have the same distance from its root. The height of a perfect binary tree is the distance between its root and one of its leaves.
Lemma 24 ([10,13,18]). If T is a perfect binary tree of height 2k, then cw(T ) ≥ k.
Proof. Without loss of generality we assume that G is connected, because we can apply the algorithm on the connected components of G separately and then take the disjoint union of the results.
Observe first that we may assume that every vertex in G is incident to at most 2k edges, as otherwise we could immediately conclude that cw(G) > k. This also implies that every vertex in G has at most 2k neighbors; by N (v) we denote the set of neighbors of a vertex v, and N (X) = (⋃ v∈X N (v)) \ X for a vertex subset X. Let G ′ be the graph obtained from G by exhaustively dissolving any vertices of degree 2 whose neighbors are different. That is, having such a vertex v, we delete it from the graph and replace the two edges incident to it with a fresh edge between its neighbors, and we proceed doing this as long as there are such vertices in the graph. Clearly, the eventually obtained graph G ′ can be immersed in G, we have |E(G ′ )| ≤ |E(G)|, the degree of every vertex in G ′ is the same as its degree in G, and cw(G ′ ) ≤ cw(G). However, observe that any ordering of the vertices of G ′ can be turned into an ordering of the vertices of G with the same width by placing each dissolved vertex in any place between its two original neighbors. Thus, cw(G ′ ) = cw(G).
Moreover, G ′ can be constructed in linear time by inspecting, in any order, all the vertices that have degree 2 in the original graph G. It is also easy to see that, given an ordering of vertices of G ′ , one can reconstruct in linear time an ordering of G of at most the same width.
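The dissolving step can be sketched as a single pass over the vertices, in any order, since dissolving a vertex never changes the degree of any other vertex. The code below is only an illustration under assumptions: the multigraph is given as a list of vertex pairs without loops, isolated vertices are not represented, and the name `dissolve_degree_two` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Hashable, List, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def dissolve_degree_two(edges: Sequence[Edge]) -> List[Edge]:
    """Dissolve every vertex of degree 2 whose two neighbors are different.

    Returns the edge list of the resulting multigraph (parallel edges are
    repeated). Assumes the input has no loops.
    """
    adjacency = defaultdict(Counter)  # adjacency[u][v] = multiplicity of uv
    for u, v in edges:
        adjacency[u][v] += 1
        adjacency[v][u] += 1

    for v in list(adjacency):
        neighbors = adjacency[v]
        # Degree 2 (counting multiplicities) and two distinct neighbors.
        if sum(neighbors.values()) == 2 and len(neighbors) == 2:
            a, b = neighbors
            del adjacency[v]
            for x, y in ((a, b), (b, a)):
                del adjacency[x][v]
                adjacency[x][y] += 1  # fresh edge between the two neighbors

    index = {v: i for i, v in enumerate(adjacency)}
    result = []
    for u in adjacency:
        for v, multiplicity in adjacency[u].items():
            if index[u] < index[v]:  # emit each undirected edge once
                result.extend([(u, v)] * multiplicity)
    return result
```

On a triangle the pass dissolves one vertex and leaves a double edge, which matches the remark in the proof that any remaining vertex incident to two edges is incident to an edge of multiplicity 2.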
Altogether, it is now enough to either conclude that cw(G ′ ) > k or find a graph H immersed in G ′ such that and cw(G ′ ) ≤ 2cw(H ′ ). Therefore, from now on we may assume that if the graph G ′ contains a vertex that is incident to two edges then this vertex is incident to an edge of multiplicity 2. Let V 1 be the set of vertices of degree 1 in G ′ . We consider two cases depending on the size of V 1 .
, and recall that every vertex in G ′ is incident to at most 2k edges and therefore has at most 2k neighbors. It follows then that |V 1 | ≤ 2k · |N (V 1 )| and hence $|N(V_1)| \ge |E(G')|/(2k+1)^{4(k+1)+3}$. Let H be the graph obtained from G ′ by removing, for each vertex in N (V 1 ), one of its neighbors in ) and H is immersed in G ′ (as it is an induced subgraph). Hence, H is also immersed in G. Furthermore, let σ be any ordering of the vertices of H. Then, we can obtain an ordering of the vertices of G ′ by placing each deleted vertex next to its original neighbors. Notice that this placement increases the width of σ by at most 1 in total, and thus by a multiplicative factor of at most 2. As we already showed how to obtain an ordering of V (G) from a given ordering of V (G ′ ), the lemma follows for the case where $|V_1| \ge |E(G')|/(2k+1)^{4(k+1)+2}$.
Case 2. $|V_1| < |E(G')|/(2k+1)^{4(k+1)+2}$. For every v ∈ V (G ′ ) and every positive integer s, we define B s (v) to be the ball of radius s around v, that is, the set of vertices at distance at most s from v in G ′ . Recall that every vertex of G ′ has at most 2k neighbors and observe then that $|B_s(v)| \le (2k+1)^s$. We construct a set of vertices v 1 , v 2 , . . . , v ℓ ∈ V (G ′ ) whose pairwise distance is greater than 4(k + 1) in the following greedy way. Having chosen v 1 , .
Proof. Suppose for some i ∈ I, B 2(k+1) (v i ) does not contain a cycle. We will prove that every vertex in G ′ [B 2(k+1) (v i )] has degree at least 3 in G ′ , and that every edge appears with multiplicity 1. Notice first that every edge of the graph G ′ [B 2(k+1) (v i )] has multiplicity 1, as otherwise an edge with multiplicity at least 2 would form a cycle, a contradiction. Notice also that B 2(k+1) (v i ) does not have any vertex that has degree 2 in G. Indeed, recall that by the construction of the graph G ′ any vertex of degree 2 is incident only to one edge of multiplicity 2, which is again a contradiction. Moreover, by the choice of i ∈ [I], we obtain that does not have any vertex that has degree 1 in G. We conclude that every vertex in G ′ [B 2(k+1) (v i )] has degree at least 3 in G, and every edge appears with multiplicity 1. Recall that the subgraph of G ′ induced by B 2(k+1) (v i ) contains the full breadth-first search tree of vertices at distance at most 2(k + 1) from v i . If G ′ [B 2(k+1) (v i )] did not contain any cycle, then it would be equal to this breadth-first search tree, and in this tree all vertices except possibly the last layer would have degrees at least 3. Hence, G ′ would contain as a subgraph a perfect binary tree of height 2(k + 1). From Lemma 24, this tree has cutwidth at least k + 1. The algorithm can thus check (by breadth-first search) for a cycle in the subgraph induced by B 2(k+1) (v i ). If it does not find any such cycle it immediately concludes that cw(G) = cw(G ′ ) > k.
Let us assume that the algorithm has now found a set C of at least $|E(G')|/(2k+1)^{4(k+1)+2}$ edge-disjoint cycles and let H be the subgraph obtained from G ′ by removing one arbitrarily chosen edge e C from each cycle C ∈ C. Then H can be immersed in G ′ and $|E(H)| \le |E(G')| \cdot (1 - 1/(2k+1)^{4(k+1)+2})$. To complete the proof of the lemma we will prove that if σ is any ordering of the vertices of H then σ is also an ordering of the vertices of G ′ such that cw σ (G ′ ) ≤ 2cw σ (H). Notice that by reintroducing an edge e C of G ′ to H we increase the width of the σ-cuts separating its endpoints by exactly 1. Observe also that since e C belongs to the cycle C, the rest of the cycle forms a path P C in H that connects the endpoints of e C . Therefore, each of the σ-cuts separating the endpoints of e C has to contain at least one edge of P C . Since for different edges e C , for C ∈ C, the corresponding paths P C are pairwise edge-disjoint and they are present in H, it follows that the size of each σ-cut in G ′ is at most twice the size of this σ-cut in H. Therefore cw σ (G ′ ) ≤ 2cw σ (H). Thus, H can be returned, concluding the algorithm.
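The doubling argument above can be checked exhaustively on a small example. The sketch below is an illustration only: the graph (two triangles sharing a vertex), the representation of cycles as lists of edge indices, and the names `ordering_width`, `remove_one_edge_per_cycle`, and `doubling_holds` are all assumptions made here.

```python
from itertools import permutations
from typing import Hashable, List, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def ordering_width(edges: Sequence[Edge], order: Sequence[Hashable]) -> int:
    """Maximum number of edges crossing a prefix/suffix cut of `order`."""
    position = {v: i for i, v in enumerate(order)}
    width = 0
    for gap in range(len(order) - 1):
        crossing = sum(
            1 for u, v in edges
            if (position[u] <= gap) != (position[v] <= gap)
        )
        width = max(width, crossing)
    return width


def remove_one_edge_per_cycle(edges: Sequence[Edge],
                              cycles: Sequence[Sequence[int]]) -> List[Edge]:
    """Drop the first edge of each cycle; cycles are lists of edge indices
    and are assumed to be pairwise edge-disjoint."""
    removed = {cycle[0] for cycle in cycles}
    return [e for i, e in enumerate(edges) if i not in removed]


def doubling_holds(vertices: Sequence[Hashable], edges: Sequence[Edge],
                   cycles: Sequence[Sequence[int]]) -> bool:
    """Check cw_sigma(G') <= 2 * cw_sigma(H) for every ordering sigma."""
    h_edges = remove_one_edge_per_cycle(edges, cycles)
    return all(
        ordering_width(edges, sigma) <= 2 * ordering_width(h_edges, sigma)
        for sigma in permutations(vertices)
    )


# Two triangles sharing vertex 2; each triangle is one cycle.
BOWTIE = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
BOWTIE_CYCLES = [[0, 1, 2], [3, 4, 5]]
```

Calling `doubling_holds(range(5), BOWTIE, BOWTIE_CYCLES)` tries all 120 orderings of the five vertices.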
We are now ready to put all the pieces together.
Proof of Theorem 2. Given an n-vertex graph G and an integer k, one can in time 2 O(k 2 log k) · n either conclude that cw(G) > k, or output an ordering of G of width at most k. The proof follows the same recursive Reduction&Compression scheme as the algorithm of Bodlaender [2]. By applying Lemma 25, we obtain a significantly smaller immersion H, and we recurse on H. This recursive call either concludes that cw(H) > k, which implies cw(G) > k, or it produces an ordering of H of optimum width cw(H) ≤ k. This ordering can be lifted, using Lemma 25 again, to an ordering of G of width ≤ 2k. Given this ordering, we apply the dynamic programming procedure of Lemma 22 to construct an optimum ordering of G in time 2 O(k 2 log k) · |V (G)|.
Since at each recursion step the number of edges of the graph drops by a multiplicative factor of at least $1/(2k+1)^{4(k+1)+3}$, we see that the graph G i at level i of the recursion will have at most $(1 - 1/(2k+1)^{4(k+1)+3})^i \cdot |E(G)|$ edges. Hence, the total work used by the algorithm is bounded by the sum of a geometric series:
$\sum_{i \ge 0} 2^{O(k^2\log k)} \cdot (1 - 1/(2k+1)^{4(k+1)+3})^i \cdot |E(G)| \le 2^{O(k^2\log k)} \cdot (2k+1)^{4(k+1)+3} \cdot |E(G)| = 2^{O(k^2\log k)} \cdot n$,
where the last step uses that $(2k+1)^{4(k+1)+3} = 2^{O(k\log k)}$ and that |E(G)| ≤ k · n, as every vertex is incident to at most 2k edges.

Obstructions to edge-removal distance to cutwidth
Throughout this section, by O k (w) we mean a quantity bounded by c k · w + d k , for some constants c k , d k depending on k only. Given a graph G and a k ∈ N, we define the parameter dcw k (G) as the minimum number of edges that can be deleted from G so that the resulting graph has cutwidth at most k. In other words: $\mathrm{dcw}_k(G) = \min\{|F| : F \subseteq E(G),\ \mathrm{cw}(G \setminus F) \le k\}$. In this section, we provide bounds on the sizes of the obstruction sets of the class of graphs G with dcw k (G) ≤ w, for each k, w ∈ N. Our results are the following.
Theorem 27. For every w, k ∈ N, every graph in obs ≤si (C w,k ) has O k (w) vertices.
From Observation 3, the bounds of Theorems 27 and 28 hold for both immersions and strong immersions.
Given a collection H of graphs, we define the parameter aic H (G) as the minimum number of edges whose removal from G creates an H-immersion free graph, that is, a graph that does not admit any graph from H as an immersion. In both subsections that follow, we need the following observation.
Observation 29. For every graph G and every w ∈ N, it holds that dcw w (G) = aic obs(Cw) (G).
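On very small graphs, the edge-removal distance to cutwidth at most k can be computed by brute force straight from its definition, which may help when experimenting with the statements of this section. The sketch below is not the method of the paper: it uses a standard exponential dynamic program over vertex subsets for the cutwidth and tries edge subsets in order of increasing size. The names `cutwidth` and `dcw` are chosen for illustration.

```python
from itertools import combinations
from typing import Hashable, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def cutwidth(vertices: Sequence[Hashable], edges: Sequence[Edge]) -> int:
    """Exact cutwidth by dynamic programming over all vertex subsets."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    pairs = [(index[u], index[v]) for u, v in edges]

    def cut(mask: int) -> int:
        # Number of edges with exactly one endpoint in the subset `mask`.
        return sum(((mask >> a) & 1) != ((mask >> b) & 1) for a, b in pairs)

    # best[mask] = least possible maximum cut over orderings of `mask`
    # used as a prefix, including the cut right after the prefix.
    best = [0] * (1 << n)
    for mask in range(1, 1 << n):
        previous = min(
            best[mask & ~(1 << i)] for i in range(n) if (mask >> i) & 1
        )
        best[mask] = max(cut(mask), previous)
    return best[-1]


def dcw(vertices: Sequence[Hashable], edges: Sequence[Edge], k: int) -> int:
    """Minimum number of edges whose deletion yields cutwidth at most k."""
    for size in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), size):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if cutwidth(vertices, kept) <= k:
                return size
    raise AssertionError("unreachable: the edgeless graph has cutwidth 0")
```

For example, a triangle has cutwidth 2 and becomes a path after deleting one edge, so its distance to cutwidth at most 1 is 1.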
We remark that, within the same set of authors, we have recently studied kernelization algorithms for edge removal problems to immersion-closed classes. The following result has been obtained in a yet unpublished manuscript [9]: Whenever a finite collection of graphs H contains at least one planar subcubic graph, and all graphs from H are connected, then the problem of computing aic H (G), parameterized by the target value, admits a linear kernel. These prerequisites are satisfied for H = obs(C w ), and hence the problem of computing dcw w (G), parameterized by the target value k, admits a linear kernel.
The connections between kernelization procedures and upper bounds on minimal obstruction sizes have already been noticed in the literature; see e.g. [7]. Intuitively, whenever the kernelization rules apply only minor or immersion operations, the kernelization algorithm can be turned into a proof of an upper bound on the sizes of minimal obstacles for the corresponding order. Unfortunately, this is not the case for the results of [9]: the main problem is the lack of lean decompositions for the parameter tree-cut width, which plays the central role. Here, the situation is different, as we know that there are always lean orderings of optimum width. We therefore showcase how to use leanness to obtain a linear upper bound on the sizes of obstructions for C w,k . The arguments are somewhat similar to those in [9]: we use the idea of protrusions, adapted to the setting of edge cuts, and we try to replace protrusions with smaller ones having the same behavior. The main point is that leanness ensures that the replacement results in an immersion of the original graph.

Upper bound
A partial q-boundaried graph is a pair G = (G,x) where G is a graph andx = (x 1 , . . . , x k ) is a k-tuple that consists either of vertices of G or of empty slots (that is, indices that do not correspond to vertices of G). If x i is an empty slot, we denote it by x i = ⋄. The extension of such G is defined just as for q-boundaried graphs, but we put Given a q-boundaried graph F = (F,x) we denote by P(F ) the set containing every partial q-boundaried graph F ′ = (F ′ ,x ′ ) such that F ′ is a subgraph of F and a vertex x i inx ′ is an empty slot iff . Intuitively, a partial q-boundaried graph extends the notion of a boundaried graph by demanding that the vertices in its boundary carry indices from a set whose cardinality might be bigger than the boundary.
Let H be a graph and let (X 1 , X 2 ) be its cut where q = δ(X 1 ). Let E H (X 1 , , and such that x j i ∈ X j for (i, j) ∈ [q] × [2]. For j ∈ [2], we say that the pair (X 1 , X 2 ) generates the q-boundaried graph A j = (A j , x j ) if A i = G[X i ] and x i = (x j 1 , . . . , x j q ). We denote by B q,h the collection containing every q-boundaried graph that can be generated from some cut (X 1 , X 2 ) of some graph H where |V (H)| + |E(H)| ≤ h and q = δ(X 1 ). Moreover, we denote by M q,h = P(B q,h ). In other words, M q,h contains all partial q-boundaried graphs that can be generated by a graph whose number of edges and vertices does not exceed h. We insist that if H = (H, x) ∈ M q,h , then the vertices of H are taken from some fixed repository of h vertices and that an element x i of x is either an empty slot (i.e., x i = ⋄) or the i-th vertex of some predetermined ordering (x 1 , . . . , x q ) of q vertices from this repository. This permits us to assume that |M q,h | is bounded by some function that depends only on q and h.
Let G = (G, x) be a q-boundaried graph and H = (H, y) be a partial q-boundaried graph. Let also G * and H * be the extensions of G and H, respectively. We also assume that, for all i ∈ [q], either y i = x i or y i = ⋄. For an edge subset R ⊆ E(G * ), we say that H is an R-avoiding We now define the R-avoiding (q, h)-folio of G as the set of all partial q-boundaried graphs in M q,h that are R-avoiding strong immersions of G and we denote it by folio q,h,R (G). We finally define Given two q-boundaried graphs G 1 and G 2 we write G 1 ∼ q,h G 2 in order to denote that F q,h (G 1 ) = F q,h (G 2 ). As F q,h maps each q-boundaried graph to a collection of subsets of M q,h we have the following.
Lemma 30. There is some function f 1 : N 2 → N such that for every two non-negative integers q and h, the number of equivalence classes of ∼ q,h is at most f 1 (q, h).
The next lemma is a consequence of the definition of the function F q,h .
Lemma 31. Let H be some set of connected graphs, each of at most h vertices, and let G i = (G i , x i ), i ∈ {1, 2}, be two q-boundaried graphs such that G 1 ∼ q,h G 2 and such that both G 1 , G 2 are H-immersion free. Then, for every q-boundaried graph F = (F, y), it holds that aic H (F ⊕ G 1 ) = aic H (F ⊕ G 2 ).
The proof is omitted as it is very similar to the one in [5] where a similar encoding was defined in order to treat the topological minor relation. To see the main idea, recall that F q,h (G i ) registers all different "partial occurrences" of graphs of ≤ h vertices (and therefore also of graphs of H) in G ′ i , for all possible ways to obtain G ′ i from G after removing at most q edges. This encoding is indeed sufficient to capture the behavior of all possible edge sets whose removal from F ⊕ G i creates an H-free graph. Indeed, as both G 1 , G 2 are H-immersion free, any such set should have at most q edges inside G i as, if not, the q boundary edges between F and G i would do the same job. A similar discussion is also present in [9].
Given a graph G and X ⊆ V (G), we write cw σ (G, X) = δ G (X) + cw σX (G[X]). We require the following extension of the definition of lean orderings.
The proof of the following result is very similar to the one of Lemma 13. We just move X to the end of the ordering, in the order given by σ, and apply exhaustively the same refinement step based on submodularity, but only to the subordering induced by X.
Lemma 33. For every graph G and every subset X of V (G), if there exists an ordering σ of G such that cw σ (G, X) ≤ r, then there exists an X-lean ordering σ ′ of G such that cw σ ′ (G, X) ≤ r.
Let w 1 , w 2 ∈ N, G be a graph, and X ⊆ V (G). We say that X is an ( The next lemma uses an idea similar to the one of Lemma 17. Here ∼ q,h plays the role of (q, ℓ)-similarity.
Lemma 34. There is a computable function f 2 : N 2 → N such that the following holds: Let k be a non-negative integer and let H be a finite set of connected graphs, each having at most h vertices and edges. Let also G be a graph and let X be a (2k, k)-cutwidth-edge-protrusion of G.
If |X| > f 2 (k, h), then G contains as a proper strong immersion a graph G ′ where aic H (G) = aic H (G ′ ).
By X-leanness, there are p edge-disjoint paths P i , for i ∈ [p], from {v 1 , . . . , v i1 } to {v iN+1 , . . . , v |V (G)| }. Observe that for each j ∈ [N + 1], each path P i must cross exactly one edge of the cut δ G ({v 1 , . . . , v ij }); let this edge be z i j w i j , where z i j ∈ {v 1 , . . . , v ij } and w i j ∉ {v 1 , . . . , v ij }. For each j ∈ [N + 1] we define p-boundaried graphs F j = (F j , (z 1 j , . . . , z p j )), where F j = G[{v 1 , . . . , v ij }], and G j = (G j , (w 1 j , . . . , w p j )) where G j = G[{v ij +1 , . . . , v |V (G)| }]. As, from Lemma 30, the equivalence relation ∼ 3k,h has at most N equivalence classes, there are j 1 , j 2 with a ≤ j 1 < j 2 ≤ b such that G j1 ∼ 3k,h G j2 . Let G ′ = F ij 1 ⊕ G ij 2 ; it is
To prove that G ∈ obs ≤i (C w,H ) we have to show that it satisfies O1 and O2. Notice that since H is an ≤ i -antichain, aic H (H) = 1 for every H ∈ H. By Observations 38 and 39, aic H (G) = $\sum_{i=1}^{w+1}$ aic H (G i ) = w + 1 and O1 holds. Therefore, G ∉ C w,H . Let now G ′ be a proper immersion of G. This means that G ′ = w+1 i=1 G ′ i where G ′ i ≤ i G i and at least one of G ′ 1 , . . . , G ′ w+1 is different from the corresponding G i . W.l.o.g. we assume that this graph is G ′ w+1 . As H is a ≤ i -antichain, G ′ w+1 is different from any of the graphs in H. Therefore aic H (G ′ w+1 ) = 0. Then, by Observations 37 and 38, aic aic H (G i ) + 0 = w and O2 holds.
Theorem 41. If k is a non-negative integer and H is a ≤ i -antichain that contains at least q connected graphs, then |obs ≤i (C w,H )| ≥ $\binom{q+w}{w+1}$.
Proof. Let H ′ be some subset of H containing q connected graphs. Using Lemma 40, we observe that every multiset of cardinality w + 1 whose elements belong to H ′ corresponds to a different (i.e. non-isomorphic) obstruction of C w,H . Therefore, |obs ≤i (C w,H )| is at least the number of multisets of cardinality w + 1 the elements of which are taken from a set of cardinality q, which is known to be $\binom{q+w}{w+1}$.
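The counting fact used at the end of this proof, that a set of cardinality q has exactly $\binom{q+w}{w+1}$ multisets of cardinality w + 1, can be verified numerically for small parameters. The following sketch enumerates the multisets explicitly with the standard library; the function name `count_multisets` is an assumption for illustration.

```python
from itertools import combinations_with_replacement
from math import comb


def count_multisets(q: int, size: int) -> int:
    """Number of multisets of the given size drawn from q distinct elements,
    obtained by explicit enumeration."""
    return sum(1 for _ in combinations_with_replacement(range(q), size))


def lower_bound(q: int, w: int) -> int:
    """The quantity binom(q + w, w + 1) from Theorem 41."""
    return comb(q + w, w + 1)
```

For q = 3 and w = 2, both `count_multisets(3, 3)` and `lower_bound(3, 2)` evaluate to 10.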
Proof of Theorem 27. From Observation 29, C w,k = C w,H k , where H k = obs ≤i (C k ). This means that obs ≤i (C w,k ) = obs ≤i (C w,H k ). The result follows from Theorems 36 and 41.

Conclusions
In this paper we have proved that the immersion obstructions for admitting a layout of cutwidth at most k have sizes single-exponential in O(k 3 log k). The core of the proof can be interpreted as bounding the number of different behavior types for a part of the graph that has only a small number of edges connecting it to the rest. This, in turn, gives an upper bound on the number of states for a dynamic programming algorithm that computes the optimum cutwidth ordering on an approximate one. This last result, complemented with an adaptation of the reduction scheme of Bodlaender [2] to the setting of cutwidth, yields a direct and self-contained FPT algorithm for computing the cutwidth of a graph. In fact, we believe that our algorithm can be thought of as "Bodlaender's algorithm for treewidth in a nutshell". It consists of the same two components, namely a recursive reduction scheme and dynamic programming on an approximate decomposition, but the less challenging setting of cutwidth makes both components simpler, thus making the key ideas easier to understand. In our proof of the upper bound on the number of types/states, we used a somewhat new bucketing approach. This approach holds the essence of the typical sequences of Bodlaender and Kloks [3], but we find it more natural and conceptually simpler. The drawback is that we lose a log k factor in the exponent. It is conceivable that we could refine our results by removing this factor provided we applied typical sequences directly, but this is a price that we are willing to pay for the sake of simplicity and being self-contained.
An important ingredient of our approach is the observation that there is always an optimum cutwidth ordering that is lean: the cutsizes along the ordering precisely govern the edge connectivity between prefixes and suffixes. Recently, there is a growing interest in parameters that are tree-like analogues of cutwidth: tree-cut width [22] and carving-width [17]. For tree-cut decompositions and carving decompositions, one could define leanness in a very similar manner. For example for carving-width, the definition would be as follows: Suppose (T , τ ) is a carving decomposition of a graph G, where τ bijectively maps vertices of G to leaves of T . Then (T , τ ) is considered lean if for every two disjoint subtrees S 1 and S 2 of T , respectively rooted at nodes x 1 and x 2 , the maximum number of edge-disjoint paths leading from τ −1 (S 1 ) to τ −1 (S 2 ) is equal to the minimum cutsize among the edges of the path in T from x 1 to x 2 . The definition for tree-cut width is very similar.
We conjecture that both for tree-cut width and carving-width, there is always an optimum decomposition that is lean; i.e., an analogue of Lemma 13 holds. Interestingly, when one tries to mimic the proof of Lemma 13, the refinement operation can be generalized without much effort. The problem lies in finding the right potential function for showing that the refinement cannot be applied indefinitely. If the conjectured result were true for tree-cut width or carving-width, it is conceivable that an approach similar to the one of this paper would give upper bounds on the sizes of minimal immersion obstructions also for these parameters.

Acknowledgements.
The second author thanks Mikołaj Bojańczyk for the common work on understanding and reinterpreting the Bodlaender-Kloks dynamic programming algorithm [3], which influenced the bucketing approach presented in this paper.