Complexity of Token Swapping and Its Variants

In the Token Swapping problem we are given a graph with a token placed on each vertex. Each token has exactly one destination vertex, and we try to move all the tokens to their destinations, using the minimum number of swaps, i.e., operations of exchanging the tokens on two adjacent vertices. As the main result of this paper, we show that Token Swapping is $W[1]$-hard parameterized by the length $k$ of a shortest sequence of swaps. In fact, we prove that, for any computable function $f$, it cannot be solved in time $f(k)n^{o(k / \log k)}$, where $n$ is the number of vertices of the input graph, unless the ETH fails. This lower bound almost matches the trivial $n^{O(k)}$-time algorithm. We also consider two generalizations of Token Swapping, namely Colored Token Swapping (where the tokens have colors and tokens of the same color are indistinguishable), and Subset Token Swapping (where each token has a set of possible destinations). To complement the hardness result, we prove that even the most general variant, Subset Token Swapping, is FPT in nowhere-dense graph classes. Finally, we consider the complexities of all three problems in very restricted classes of graphs: graphs of bounded treewidth and diameter, stars, cliques, and paths, trying to identify the borderlines between polynomial and NP-hard cases.


Introduction
In reconfiguration problems, we are interested in transforming a combinatorial or geometric object from one state to another by performing a sequence of simple operations. An important example is motion planning, where we want to move an object from one configuration to another. Elementary operations are usually translations and rotations. It turns out that motion planning can be reduced to the shortest path problem in some higher-dimensional Euclidean space with obstacles [7].
Finding the shortest flip sequence between any two triangulations of a convex polygon is a major open problem in computational geometry. Interestingly, it is equivalent to a myriad of other reconfiguration problems of so-called Catalan structures [3]. Examples include binary trees, perfect matchings of points in convex position, Dyck words, monotonic lattice paths, and many more. Reconfiguring permutations under various constraints is heavily studied and usually called sorting.
An important class of reconfiguration problems is a big family of problems in graph theory that involves moving tokens, pebbles, cops or robbers along the edges of a given graph, in order to reach some final configuration [1,4,8,10,13,15,21,29,33]. In this paper, we study one of them.
The Token Swapping problem, introduced by Yamanaka et al. [34], fits nicely into this long history of reconfiguration problems and can be regarded as a sorting problem with special constraints. The problem is defined as follows; see also Figure 1. We are given an undirected connected graph with $n$ vertices $v_1, \ldots, v_n$, a set of tokens $T = \{t_1, \ldots, t_n\}$, and two permutations $\pi_{\text{start}}$ and $\pi_{\text{target}}$, called the start permutation and the target permutation. Initially, vertex $v_i$ holds token $t_{\pi_{\text{start}}(i)}$. In one step, we are allowed to swap the tokens on a pair of adjacent vertices: if $v$ and $w$ are adjacent, $v$ holds the token $s$, and $w$ holds the token $t$, then the swap between $v$ and $w$ results in the configuration where $v$ holds $t$, $w$ holds $s$, and all the other tokens stay in place. The Token Swapping problem asks whether the target configuration can be reached in at most $k$ swaps. Thus, a solution for the Token Swapping problem is a sequence of edges on which the swaps take place. A solution is optimal if its length is the shortest possible. To see the correspondence to sorting, note that every placement of tokens can be regarded as a permutation, and the target permutation can be regarded as the sorted state.
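The swap operation defined above can be made concrete with a small sketch. The encoding is an assumption made for illustration only: vertices are numbered from 0, a configuration is a tuple whose entry `v` is the token held by vertex `v`, and the helper names `apply_swap` and `run_swaps` are hypothetical.

```python
def apply_swap(config, edges, v, w):
    """Return the configuration after swapping the tokens on adjacent vertices v and w.

    config[v] is the token held by vertex v (illustrative encoding);
    edges is a set of frozensets {v, w} describing the undirected graph.
    """
    if frozenset((v, w)) not in edges:
        raise ValueError(f"vertices {v} and {w} are not adjacent")
    new_config = list(config)
    new_config[v], new_config[w] = new_config[w], new_config[v]
    return tuple(new_config)


def run_swaps(config, edges, swaps):
    """Apply a sequence of swaps (a candidate solution) to a configuration."""
    for v, w in swaps:
        config = apply_swap(config, edges, v, w)
    return config


# Path 0 - 1 - 2 whose tokens start in reverse order.
path_edges = {frozenset((0, 1)), frozenset((1, 2))}
start = (2, 1, 0)
target = (0, 1, 2)
solution = [(0, 1), (1, 2), (0, 1)]
print(run_swaps(start, path_edges, solution) == target)
```

A solution is thus simply a list of edges, and checking it takes time linear in its length.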
Yamanaka et al. [34] observed that every instance of Token Swapping has a solution, and its length is $O(n^2)$. Moreover, $\Omega(n^2)$ swaps are sometimes necessary. It is interesting to note that although the problem in its full generality was introduced only recently [34], some special cases were studied before in the context of sorting permutations with additional restrictions (see Knuth [22, Section 5.2.2] for paths, Pak [28] for stars, Cayley [5] for cliques, and Heath and Vergara [17] for squares of a path). Recently the problem was also solved for the special case of complete split graphs (see Gaku et al. [36]). It is also worth mentioning that a very closely related concept of sorting permutations using cost-constrained transitions was considered by Farnoud, Chen, and Milenkovic [12], and Farnoud and Milenkovic [11].
The complexity of the Token Swapping problem was investigated by Miltzow et al. [26]. They show that the problem is NP-complete and APX-complete. Moreover, they show that any algorithm solving the Token Swapping problem in time $2^{o(n)}$ would refute the Exponential Time Hypothesis (ETH) [19]. The results of Miltzow et al. [26] also carry over to a generalization of the Token Swapping problem, called Colored Token Swapping, first introduced by Yamanaka et al. [35]. In this problem, vertices and tokens are partitioned into color classes. For each color $c$, the number of tokens colored $c$ equals the number of vertices colored $c$. The goal is to reach, with the minimum number of swaps, a configuration in which each vertex contains a token of its own color. Token Swapping corresponds to the special case where each color class comprises exactly one token and one vertex. NP-hardness of Colored Token Swapping was first shown by Yamanaka et al. [35], even in the case that only 3 colors exist.
We introduce the Subset Token Swapping problem, which is an even further generalization of Token Swapping. Here a function $D : T \to 2^V$ specifies the set $D(t_i)$ of possible destinations for the token $t_i$. Observe that Subset Token Swapping also generalizes Colored Token Swapping. It might happen that this new problem has no satisfying swapping sequence at all. However, this can be checked in polynomial time by deciding whether there is a perfect matching in the bipartite token-destination graph. Thus we shall always assume that we have a satisfiable instance.
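The satisfiability check mentioned above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it uses the standard augmenting-path method for bipartite matching, and the encoding (a dictionary mapping each token to its set $D(t)$ of allowed vertices) and the name `has_perfect_matching` are assumptions.

```python
def has_perfect_matching(destinations, vertices):
    """Decide whether every token can be given a distinct allowed destination.

    destinations: dict mapping each token to the set D(t) of allowed vertices
    vertices:     collection of all vertices of the graph
    (both encodings are illustrative assumptions)
    """
    match = {}  # vertex -> token currently assigned to it

    def try_assign(token, seen):
        # Look for an augmenting path starting at `token`.
        for v in destinations[token]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_assign(match[v], seen):
                match[v] = token
                return True
        return False

    if len(destinations) != len(vertices):
        return False
    return all(try_assign(t, set()) for t in destinations)


feasible = {"t1": {0, 1}, "t2": {0}, "t3": {2}}
infeasible = {"t1": {0}, "t2": {0}, "t3": {1, 2}}
print(has_perfect_matching(feasible, [0, 1, 2]))
print(has_perfect_matching(infeasible, [0, 1, 2]))
```

In the second instance two tokens compete for vertex 0, so no assignment exists and the instance has no swapping sequence regardless of the graph.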
In this paper we continue and extend the work of Miltzow et al. [26]. They presented a very simple algorithm which solves an instance of the Token Swapping problem in $n^{O(k)}$ time and space, where $k$ denotes the number of allowed swaps. In Section 3 we show that this algorithm can be easily generalized to the Colored Token Swapping and Subset Token Swapping problems. One of the main bottlenecks for exponential-time algorithms is not time, but space consumption. Thus we present a slightly slower exact algorithm, using only polynomial space (in fact, only slightly super-linear).
The existence of an XP algorithm for the Token Swapping problem raises the question of whether the problem can be solved in FPT time (i.e., $f(k) \cdot n^{O(1)}$, for some function $f$). There is some evidence indicating that this could be possible. First, observe that if more than $2k$ tokens are misplaced, then one can immediately answer that we deal with a No-instance, as each swap involves exactly two tokens. Further, one can safely remove all vertices from the graph that are at distance more than $k$ from all misplaced tokens. This preprocessing yields an equivalent instance, where every connected component has diameter $O(k^2)$. Thus for bounded maximum degree $\Delta$ each component has size $f(k)$, for some function $f$. The connected components of size $f(k)$ can be solved separately by exhaustively guessing (still in FPT time) the number of swaps to perform in each of them. Moreover, even the generalized Subset Token Swapping problem is FPT in $k + \Delta$ (see Proposition 7). For those reasons, one could have hoped for an FPT algorithm for general graphs. However, as the main result of this paper, we show in Section 4 that this is not possible.
Theorem 1 (Parameterized Hardness). Token Swapping is $W[1]$-hard, parameterized by the number $k$ of allowed swaps. Moreover, assuming the ETH, for any computable function $f$, Token Swapping cannot be solved in time $f(k)(n + m)^{o(k/\log k)}$, where $n$ and $m$ are respectively the number of vertices and edges of the input graph.
Observe that this lower bound shows that the simple $n^{O(k)}$-time algorithm is almost best possible. It is worth mentioning that the parameter for which we show hardness is in fact number of swaps + number of initially misplaced tokens + diameter of the graph, which matches the reasoning presented in the previous paragraph.
To show the lower bound, we introduce handy gadgets called linkers. They are simple and can be used to give a significantly simpler proof of the lower bounds given by Miltzow et al. [26]. One might also use them to establish a simpler and potentially stronger inapproximability result.
Since there is no FPT algorithm for the Token Swapping problem (parameterized by the number $k$ of swaps) unless $\mathrm{FPT} = W[1]$, a natural approach is to restrict the input graph classes, in the hope of obtaining some positive results. Indeed, in Section 5 we show that FPT algorithms exist if we restrict our input to the so-called nowhere-dense graph classes.
Theorem 2 (FPT in nowhere dense graphs). Subset Token Swapping is FPT parameterized by $k$ on nowhere-dense graph classes.
The notion of nowhere-dense graph classes has been introduced as a common generalization of several previously known notions of sparsity in graphs such as planar graphs, graphs with forbidden (topological) minors, graphs with (locally) bounded treewidth or graphs with bounded maximum degree.
Grohe, Kreutzer, and Siebertz [16] proved that every property definable as a first-order formula $\varphi$ is solvable in $O(f(|\varphi|, \varepsilon)\, n^{1+\varepsilon})$ time on nowhere-dense classes of graphs, for every $\varepsilon > 0$. We use this meta-theorem to show the existence of an FPT time algorithm for the Subset Token Swapping problem, restricted to nowhere-dense graph classes. In particular, this implies the following results.
It is often observed that NP-hard graph problems become tractable on classes of graphs with bounded treewidth (or, at least, with bounded tree-depth; see Nešetřil and Ossona de Mendez [27, Chapter 10] for the definition and some background of tree-depth and related parameters). It is not uncommon to see FPT algorithms running in time $f(\mathrm{tw})\, n^{O(1)}$ (or $f(\mathrm{td})\, n^{O(1)}$) or XP algorithms running in time $n^{f(\mathrm{tw})}$ (or $n^{f(\mathrm{td})}$), for some computable function $f$. Especially in light of Corollary 3 (a), we want to know whether there exists an algorithm that runs in polynomial time for constant treewidth. In Section 6 we rule out the existence of such algorithms by showing that Token Swapping remains NP-hard when restricted to graphs with tree-depth 4 (treewidth and pathwidth 2; diameter 6; distance 1 to a forest).
Theorem 4 (Hard on Almost Trees). Token Swapping remains NP-hard even when both the treewidth and the diameter of the input graph are constant, and cannot be solved in time $2^{o(n)}$, unless the ETH fails.
Table 1 shows the current state of our knowledge about the parameterized complexity of the Token Swapping (TS), Colored Token Swapping (CTS), and Subset Token Swapping (STS) problems, for different choices of parameters.
Table 1: The parameterized complexity of Token Swapping, Colored Token Swapping, and Subset Token Swapping.
While we think that our results give a fairly detailed view on the complexity landscape of the Token Swapping problem, we also want to point out that our reductions are significantly simpler than those by Miltzow et al. [26].
Since the investigated problems seem to be immensely intractable, in Section 7 we investigate their complexities in very restricted classes of graphs, namely cliques, stars, and paths. We focus on finding the borderlines between easy (polynomially solvable) and hard (NP-hard) cases. The summary of these results is given in Table 2. Observe that cliques distinguish the complexities of the Token Swapping and the Colored Token Swapping problems, while stars distinguish the complexities of the Colored Token Swapping and the Subset Token Swapping problems.
| | trees | cliques | stars | paths |
|---|---|---|---|---|
| TS | ? | P (see [26]) | P (see [26]) | P (see [26]) |
| CTS | ? | NP-c (Th 15) | P (Th 12) | P (Th 17) |
| STS | NP-c | NP-c | NP-c (Th 13) | ? |
Table 2: The complexity of Token Swapping (TS), Colored Token Swapping (CTS), and Subset Token Swapping (STS) on very restricted classes of graphs.
The paper is concluded with several open problems in Section 8.

Preliminaries
Yamanaka et al. [34] showed that in every instance of the Token Swapping problem, the length of the optimal solution is $O(n^2)$ and this bound is asymptotically tight for paths. Here we show that long induced paths are the only structures forcing solutions of superlinear length.
Proposition 5. The length of the optimal solution for Token Swapping in an $n$-vertex $P_{r+1}$-free graph $G$ is at most $r \cdot n$.
Proof. We can assume that $G$ is connected, since otherwise we can solve the problem on connected components separately. Let $P$ be a longest path in $G$ and let $v$ be its endvertex. Observe that $G - v$ is connected (otherwise $P$ is not longest) and $P_{r+1}$-free. First, we move the token with destination $v$ to this vertex, which requires at most $\mathrm{diam}(G) \le r$ swaps. Then we can recursively continue with the graph $G - v$ (we never touch $v$ again). Such a solution has length at most $r \cdot n$.
Note that this bound is asymptotically tight: to see this, consider a graph whose every connected component is isomorphic to $P_r$ and has the reverse permutation of tokens (if we want our instance to be connected, we can add one additional vertex, adjacent to one of the endvertices of each path, and put a well-placed token on it). Moreover, we observe that the bound from Proposition 5 also holds for the Colored Token Swapping and Subset Token Swapping problems. Indeed, we can fix one destination for each of the tokens (by choosing a perfect matching in the token-destination graph) to obtain an instance of the Token Swapping problem, whose solution is also a solution for the original problem.
For a token $t$, let $\mathrm{dist}(t)$ denote the distance from the position of $t$ to its destination. For an instance $I$ of the Token Swapping problem, we define $L(I) := \sum_t \mathrm{dist}(t)$, i.e., the sum of distances to the destination over all the tokens. Clearly, after performing a single swap, $\mathrm{dist}(t)$ may change by at most 1. We shall also use the following classification of swaps: for $x, y \in \{-1, 0, 1\}$, $x \le y$, by an $(x/y)$-swap we mean a swap in which one token changes its distance by $x$, and the other one by $y$. Intuitively, $(-1/-1)$-swaps are the most "efficient" ones, thus we will call them happy swaps. Since each swap involves two tokens, we get the following lower bound.

Proposition 6 ([26]). The length of the optimal solution for an instance $I$ of Token Swapping is at least $L(I)/2$. Besides, it is exactly $L(I)/2$ iff there is a solution using happy swaps only.
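The quantity $L(I)$ and the lower bound of Proposition 6 are easy to compute with one breadth-first search per token. The sketch below is illustrative only; the adjacency-list encoding and the names `bfs_distances`, `total_distance`, and `swap_lower_bound` are assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Distances from `source` to every reachable vertex of the graph `adj`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def total_distance(adj, position, destination):
    """L(I): sum over all tokens of the distance from position to destination."""
    return sum(bfs_distances(adj, position[t])[destination[t]] for t in position)


def swap_lower_bound(adj, position, destination):
    """Lower bound L(I)/2 on the number of swaps, rounded up to an integer."""
    return (total_distance(adj, position, destination) + 1) // 2


# Path 0 - 1 - 2 with the tokens in reverse order.
adj = {0: [1], 1: [0, 2], 2: [1]}
position = {"a": 0, "b": 1, "c": 2}
destination = {"a": 2, "b": 1, "c": 0}
print(total_distance(adj, position, destination))    # L(I) = 2 + 0 + 2
print(swap_lower_bound(adj, position, destination))
```

On this instance the bound is 2, while reversing three tokens on a path needs three swaps, so no solution consisting of happy swaps only exists here.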
When designing algorithms, especially for computationally hard problems, it is natural to ask about lower bounds. However, the standard complexity assumption used for distinguishing easy and hard problems, i.e., $\mathrm{P} \neq \mathrm{NP}$, is too weak to tell us something meaningful about possible complexities of algorithms. The stronger assumption that is typically used for this purpose is the so-called Exponential Time Hypothesis (usually referred to as the ETH), formulated by Impagliazzo and Paturi [19]. We refer the reader to the survey by Lokshtanov and Marx for more information about the ETH and conditional lower bounds [23]. The version we present below (which is the one most commonly used) is not the original statement of this hypothesis, but its weaker version (see also Impagliazzo, Paturi, and Zane [20]).

Exponential Time Hypothesis (Impagliazzo and Paturi [19]). There is no algorithm solving every instance of 3-Sat with $N$ variables and $M$ clauses in time $2^{o(N+M)}$.

Algorithms
First, we prove that Subset Token Swapping (and therefore also Colored Token Swapping as its restriction) is FPT in $k + \Delta$, where $k$ is the number of allowed swaps, and $\Delta$ is the maximum degree of the input graph. This generalizes the observation of Miltzow et al. [26] for the Token Swapping problem. Furthermore, we show that the simple algorithm for the Token Swapping problem, presented by Miltzow et al. [26], carries over to the generalized problems, i.e., Colored Token Swapping and Subset Token Swapping. Finally, we present an algorithm that has polynomial space complexity.
Proposition 7. The Subset Token Swapping problem is FPT in $k + \Delta$ and admits a kernel of size $2k + 2k^2 \cdot \Delta^k$.
Proof. Let $I$ be an instance of Subset Token Swapping on a graph $G$ with maximum degree $\Delta$, and suppose that $s$ is a solution for $I$ of length at most $k$.
Let $V_m$ be the set of those vertices $v$ of $G$ such that $v$ is not among the possible destinations for the token initially placed on $v$. First, observe that every vertex from $V_m$ has to be involved in some swap in $s$. Thus we can assume that $|V_m| \le 2k$ (otherwise we immediately report a No-instance).
Let $E'$ be the set of edges that appear in $s$ and let $G'$ be the subgraph of $G$ induced by $E'$. Consider a connected component $C$ of $G'$. Suppose first that the vertex set of $C$ does not contain any vertex from $V_m$. Observe that the sequence $s'$ obtained from $s$ by removing all edges from $C$ is also a solution for $I$ of length at most $k$. So, without loss of generality, every connected component $C$ of $G'$ contains a vertex from $V_m$, and has at most $k$ edges. Let $G''$ be the subgraph of $G$ induced by the vertices at distance at most $k$ from $V_m$ (we find it by running a breadth-first search, starting from $V_m$). We observe that every vertex incident to an edge in $E'$ is in $G''$. Thus the instance $I''$ of Subset Token Swapping, restricted to $G''$, is equivalent to $I$. Note that the maximum degree of $G''$ is at most $\Delta$, and the number of vertices in $G''$ is at most $2k + 2k^2 \Delta^k$. Thus $I''$ is a kernel for $I$.
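The kernelization in this proof amounts to a multi-source breadth-first search of depth $k$ from the misplaced vertices. A minimal sketch follows; the encoding (`token_at[v]` is the token on vertex `v`, `allowed[t]` is the destination set of token `t`) and the name `kernel_vertices` are assumptions made for illustration.

```python
from collections import deque


def kernel_vertices(adj, token_at, allowed, k):
    """Vertices kept by the kernel, or None when the instance is a No-instance.

    adj:      dict vertex -> iterable of neighbours
    token_at: dict vertex -> token initially placed on it
    allowed:  dict token  -> set of possible destinations
    """
    # V_m: vertices that are not a possible destination of their own token.
    v_m = {v for v, t in token_at.items() if v not in allowed[t]}
    if len(v_m) > 2 * k:
        return None  # each swap moves only two tokens
    dist = {v: 0 for v in v_m}
    queue = deque(v_m)
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not explore beyond distance k
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


# Path 0 - 1 - ... - 5 where only the tokens on vertices 0 and 1 are exchanged.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
token_at = {0: 1, 1: 0, 2: 2, 3: 3, 4: 4, 5: 5}
allowed = {t: {t} for t in range(6)}
print(kernel_vertices(adj, token_at, allowed, 1))
```

With $k = 1$ only the vertices 0, 1, and 2 survive; the rest of the path cannot take part in any solution of length at most one.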
Miltzow et al. [26] show that an optimal solution for the Token Swapping problem can be found by performing a breadth-first search on the configuration graph, i.e., a graph whose vertices are all possible configurations of tokens on vertices, and in which two configurations are adjacent when we can obtain one from the other with a single swap. We observe that the same approach works for the Colored Token Swapping and the Subset Token Swapping problems; the only difference is that we terminate on any feasible target configuration.
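A depth-bounded breadth-first search over configurations can be sketched as below. This is an illustration under assumed encodings, not the implementation of [26]: a configuration is a tuple indexed by vertex, and the feasibility test is passed in as a predicate `is_target`, which is what lets the same code serve the colored and subset variants.

```python
from collections import deque


def min_swaps(edges, start, is_target, k):
    """Fewest swaps (at most k) reaching a feasible configuration, else None.

    edges:     list of pairs (v, w) of adjacent vertices
    start:     tuple, start[v] is the token (or its colour) on vertex v
    is_target: predicate deciding whether a configuration is feasible
    """
    if is_target(start):
        return 0
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        config, depth = frontier.popleft()
        if depth == k:
            continue  # swap budget exhausted on this branch
        for v, w in edges:
            nxt = list(config)
            nxt[v], nxt[w] = nxt[w], nxt[v]
            nxt = tuple(nxt)
            if nxt in seen:
                continue
            if is_target(nxt):
                return depth + 1
            seen.add(nxt)
            frontier.append((nxt, depth + 1))
    return None


path = [(0, 1), (1, 2)]
# Token Swapping: reverse three tokens on a path.
print(min_swaps(path, (2, 1, 0), lambda c: c == (0, 1, 2), 3))
# Colored Token Swapping: vertex colours are ('r', 'r', 'b').
print(min_swaps(path, ("b", "r", "r"), lambda c: c == ("r", "r", "b"), 3))
```

The `seen` set is what makes the space consumption exponential, which motivates the polynomial-space alternative discussed next.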
Proposition 8. Let $G$ be a graph with $n$ vertices, and let $k$ be the maximum number of allowed swaps. The Colored Token Swapping and the Subset Token Swapping problems on $G$ can be solved in time:
The main drawback of such an approach is its exponential space complexity. Here we show the following complementary result, inspired by the ideas of Savitch [32].
Theorem 9. Let $G$ be a graph with $n$ vertices, and let $k$ be the maximum number of allowed swaps. The Subset Token Swapping problem on $G$ can be solved in time
Proof. Consider the algorithm Reach (see Algorithm 1). It is easy to verify that it returns true if the configuration $\pi_s$ can be reached from the configuration $\pi_0$ with exactly $k$ swaps, and false otherwise. The depth of the recursion is $O(\log k)$. The configurations can be generated with polynomial delay, using only linear (in $n$) memory. Thus the time complexity of the algorithm is $(n!)^{\log k} \cdot n^{O(1)} = 2^{O(n \log n \log k)}$. The space needed to keep track of the recursive stack is $O(n \log n \log k)$. Recall that $k = O(n^2)$; otherwise we immediately report a Yes-instance.
To use the algorithm for the Subset Token Swapping problem, we can enumerate all possible target configurations in $n! \cdot n^{O(1)} = 2^{O(n \log n)}$ time and polynomial space, and then solve the instance of the Token Swapping problem for each of them.
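Algorithm 1 is not reproduced in this text, so the following is only a guess at its shape: a Savitch-style midpoint recursion that decides reachability in exactly $k$ swaps without storing the configuration graph. The names `reach` and `one_swap_apart` and the tuple encoding of configurations are assumptions.

```python
from itertools import permutations


def one_swap_apart(edges, a, b):
    """True iff configuration b arises from a by one swap along an edge."""
    diff = [v for v in range(len(a)) if a[v] != b[v]]
    if len(diff) != 2:
        return False
    v, w = diff
    swapped = a[v] == b[w] and a[w] == b[v]
    return swapped and ((v, w) in edges or (w, v) in edges)


def reach(edges, src, dst, k):
    """Decide whether dst is reachable from src with exactly k swaps."""
    if k == 0:
        return src == dst
    if k == 1:
        return one_swap_apart(edges, src, dst)
    half = k // 2
    # Guess the configuration in the middle of the swap sequence; the
    # generator yields one candidate at a time, so memory stays small.
    for mid in permutations(src):
        if reach(edges, src, mid, half) and reach(edges, mid, dst, k - half):
            return True
    return False


path = {(0, 1), (1, 2)}
print(reach(path, (2, 1, 0), (0, 1, 2), 3))
print(reach(path, (2, 1, 0), (0, 1, 2), 2))
```

Each recursion level stores a single midpoint configuration and the budget halves at every level, which matches the $O(\log k)$ recursion depth used in the analysis above.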

Lower Bounds on parameterized Token Swapping
Let us start by defining an auxiliary problem, called Multicolored Subgraph Isomorphism (also known as Partitioned Subgraph Isomorphism; see Figure 2).
Figure 2: On the left is the pattern graph $P$; on the right, the host graph $H$. We indicate the image of $\varphi$ with white vertices. To keep the example small, we did not make $P$ 3-regular.
In Multicolored Subgraph Isomorphism, one is given a host graph $H$ whose vertex set is partitioned into $k$ color classes $V_1 \uplus V_2 \uplus \ldots \uplus V_k$ and a pattern graph $P$ with $k$ vertices: $V(P) = \{u_1, \ldots, u_k\}$. The goal is to find an injection $\varphi : V(P) \to V(H)$ such that $u_iu_j \in E(P)$ implies that $\varphi(u_i)\varphi(u_j) \in E(H)$, and $\varphi(u_i) \in V_i$ for all $i, j$. Thus we can assume that each $V_i$ forms an independent set. Further, we assume without loss of generality that [...] In other words, we try to find [...]-hardness of the Multicolored Clique. Marx [24] showed that, assuming the ETH, Multicolored Subgraph Isomorphism cannot be solved in time $f(k)(|V(H)| + |E(H)|)^{o(k/\log k)}$, for any computable function $f$, even when the pattern graph $P$ is 3-regular and bipartite (see also Marx and Pilipczuk [25]). In particular, $k$ has to be an even integer since $|E(P)|$ is exactly $3k/2$. We finally assume that for every $i \in [k]$ it holds that $|V_i| = t$, by padding potentially smaller classes with isolated vertices. This can only increase the size of the host graph by a factor of $k$, and neither creates any new solution nor destroys any existing one. Now we are ready to prove the following theorem.
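To make the definition concrete, here is a brute-force sketch that tries one vertex per color class and checks that every pattern edge is mapped to a host edge. It is illustrative only; the function names and the list-based encoding of the color classes are assumptions. Choosing one vertex from each class makes the map injective automatically, because the classes are disjoint.

```python
from itertools import product


def is_valid_embedding(pattern_edges, host_edge_set, phi):
    """Check that every pattern edge u_i u_j is mapped to an edge of the host."""
    return all(frozenset((phi[i], phi[j])) in host_edge_set
               for i, j in pattern_edges)


def find_embedding(pattern_edges, host_edges, color_classes):
    """Try every choice of one host vertex per colour class (t^k candidates)."""
    host_edge_set = {frozenset(e) for e in host_edges}
    for phi in product(*color_classes):
        if is_valid_embedding(pattern_edges, host_edge_set, phi):
            return phi
    return None


# Pattern: a triangle on u_0, u_1, u_2 (0-based indices).
pattern_edges = [(0, 1), (1, 2), (0, 2)]
color_classes = [["a1", "a2"], ["b1", "b2"], ["c1"]]
host_edges = [("a1", "b2"), ("b2", "c1"), ("a1", "c1"), ("a2", "b1")]
print(find_embedding(pattern_edges, host_edges, color_classes))
```

The search space has size $t^k$, which is the kind of running time that the lower bound of Marx [24] shows to be essentially unavoidable.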
Theorem 1 (Parameterized Hardness). Token Swapping is $W[1]$-hard, parameterized by the number $k$ of allowed swaps. Moreover, assuming the ETH, for any computable function $f$, Token Swapping cannot be solved in time $f(k)(n + m)^{o(k/\log k)}$, where $n$ and $m$ are respectively the number of vertices and edges of the input graph.
Proof. To show parameterized hardness of the Token Swapping problem, we introduce a very handy linker gadget. This gadget has a robust and general ability to link decisions.
As such, it allows reductions from a wide range of problems. Its description is short and its soundness is intuitive. Because it yields very light constructions, we can fairly easily rule out unwanted swap sequences. We describe the linker gadget and provide some intuition for why it works (see Figure 3).
Linker gadget. Given two integers $a$ and $b$, the linker gadget $L_{a,b}$ contains a set of $a$ vertices, called the finishing set, and a path on $a$ vertices, which we call the starting path. The tokens initially on vertices of the finishing set are called local tokens; they shall go to the vertices of the starting path in the way depicted in Figure 3. The tokens initially on vertices of the starting path are called global tokens. Global tokens have their destination in some other linker gadget. To be more specific, their destination is in the finishing set of another linker.
We describe and always imagine the finishing set and the starting path to be ordered from left to right. Below the finishing set and to the left of the starting path stand $b$ disjoint induced paths, each with $a$ vertices, arranged in a grid; see Figure 3. We call those paths private paths. The private tokens on private paths are already well-placed. Every vertex in the finishing set is adjacent to all private vertices below it, and the leftmost vertex of the starting path is adjacent to all rightmost vertices of the private paths. At the bottom right, we see the result after swapping all the local tokens to the starting path. In this case, the global tokens go to that private path.
For local tokens to go to the starting path, they must go through a private path. As its name suggests, the linker gadget aims at linking the choice of the private path used for every local token. Intuitively, the only way of benefiting from $a^2$ happy swaps between the $a$ local tokens and the $a$ global tokens is to use a unique private path (note that the destination of the global tokens will make those swaps happy). That results in a kind of configuration as depicted in the bottom right of Figure 3, where each global token is in the same private path. The fate of the global tokens has been linked.
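The structure of a linker can be checked with a short construction sketch. Two points are assumptions, since Figure 3 is not reproduced here: "below it" is read as "in the same column of the grid", and the local token of the $c$-th finishing vertex is taken to have the $c$-th vertex of the starting path as its destination. The vertex names `('fin', c)`, `('start', c)`, and `('priv', r, c)` are hypothetical.

```python
from collections import deque


def build_linker(a, b):
    """Adjacency sets of L_{a,b} with a >= 1 columns and b >= 1 private paths."""
    adj = {}

    def add_edge(x, y):
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

    for c in range(a - 1):                      # starting path
        add_edge(("start", c), ("start", c + 1))
    for r in range(b):
        for c in range(a - 1):                  # r-th private path
            add_edge(("priv", r, c), ("priv", r, c + 1))
        for c in range(a):                      # finishing vertex above column c
            add_edge(("fin", c), ("priv", r, c))
        # rightmost private vertex meets the leftmost vertex of the starting path
        add_edge(("priv", r, a - 1), ("start", 0))
    return adj


def distance(adj, source, target):
    """Breadth-first-search distance between two vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist[target]


linker = build_linker(3, 4)
print(len(linker))
print([distance(linker, ("fin", c), ("start", c)) for c in range(3)])
```

Under these assumptions every local token of $L_{3,t}$ is at distance 4 from its destination and the gadget has $3(t+2)$ vertices, in agreement with the counts used later in the proof.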
Construction. We present a reduction from Multicolored Subgraph Isomorphism with cubic pattern graphs to Token Swapping where the number of allowed swaps is linear in $k$. Let $(H, P)$ be an instance of Multicolored Subgraph Isomorphism. For any color class $V_i = \{v_{i,1}, v_{i,2}, \ldots, v_{i,t}\}$ of $H$, we add a copy of the linker $L_{3,t}$ that we denote by $L_i$. We denote by $j_1 < j_2 < j_3$ the indices of the neighbors of $u_i$ in the pattern graph $P$. The linker $L_i$ will be linked to 3 other gadgets and it has $t$ private paths (or choices). The finishing set of $L_i$ contains, from left to right, the vertices $a(i, j_1)$, $a(i, j_2)$, and $a(i, j_3)$. We denote the tokens initially on the vertices $a(i, j_1)$, $a(i, j_2)$, and $a(i, j_3)$ by $\mathrm{local}(i, j_1)$, $\mathrm{local}(i, j_2)$, $\mathrm{local}(i, j_3)$, respectively.
For each $p \in [3]$, $\mathrm{local}(i, j_p)$ shall go to vertex $b(i, j_p)$, whereas $\mathrm{global}(i, j_p)$ shall go to $a(j_p, i)$ in the gadget $L_{j_p}$. Observe that the former transfer is internal and may remain within the gadget $L_i$, while the latter requires some interplay between the gadgets $L_i$ and $L_{j_p}$. For any $h \in [t]$, by $U(i, h)$ we denote the $h$-th private path. This path represents the vertex $v_{i,h}$. The path $U(i, h)$ consists of, from left to right, the vertices $u(i, h, j_1)$, $u(i, h, j_2)$, $u(i, h, j_3)$. We set $U(i) := \bigcup_{h \in [t]} U(i, h)$. Initially, all the tokens placed on vertices of $U(i)$ are already well placed.
Figure 5: The way linkers (in that case, $L_3$ and $L_7$) are assembled together, with $t = 3$.
We complete the construction by adding every edge of the form $u(i, h, j)u(j, h', i)$ whenever $v_{i,h}v_{j,h'}$ is an edge in $E(V_i, V_j)$ (see Figure 5). Let $G$ be the graph that we built, and let $I$ be the whole instance of Token Swapping (with the initial position of the tokens). We claim that $(H, P)$ is a Yes-instance of Multicolored Subgraph Isomorphism if and only if $I$ has a solution of length at most $\ell := 16.5k = O(k)$. Recall that $k$ is even, so $16.5k$ is an integer.
Correctness. ($\Rightarrow$) We first assume that there is a solution $\{v_{1,h_1}, v_{2,h_2}, \ldots, v_{k,h_k}\}$ to the Multicolored Subgraph Isomorphism instance. We perform the following sequence of swaps. The orderings that we do not specify among those swaps are not important, which means that they can be done in an arbitrary fashion. In each gadget $L_i$, we first bring $\mathrm{local}(i, j_3)$ to $b(i, j_3)$, then $\mathrm{local}(i, j_2)$ to $b(i, j_2)$, and finally $\mathrm{local}(i, j_1)$ to $b(i, j_1)$, each time passing through the same private path $U(i, h_i)$. This corresponds to a total of 12 swaps per gadget and $12k$ swaps in total. Note that $\mathrm{global}(i, j_p)$ is moved to $u(i, h_i, j_p)$. Now, for each edge $v_{i,h_i}v_{j,h_j}$ of the host graph $H$ (i.e., $u_iu_j \in E(P)$), we swap the tokens $\mathrm{global}(i, j)$ and $\mathrm{global}(j, i)$. By construction of $G$, $u(i, h_i, j)u(j, h_j, i)$ is indeed an edge in $E(G)$, so this swap is legal. This adds $3k/2$ swaps. At this point, the token $\mathrm{global}(j, i)$ is on vertex $u(i, h_i, j)$. Therefore, we move each token $\mathrm{global}(j, i)$ to the vertex $a(i, j)$ in one swap. This corresponds to $3k$ additional swaps. Observe that this also has the effect of putting the private tokens back on their original private path. Thus, every token is now well placed. The overall number of swaps in this solution is $12k + 3k/2 + 3k = 16.5k = \ell$.
($\Leftarrow$) We now assume that there is a solution $s$ to Token Swapping of length at most $\ell$. We define $Y := \{(i, j) \mid u_iu_j \in E(P)\}$. Note that $(i, j) \in Y$ implies $(j, i) \in Y$, and $|Y| = 3k$. We compute the sum $L(I)$ of the token-to-destination distances. For any $(i, j) \in Y$, $\mathrm{local}(i, j)$ is at distance 4 from its destination $b(i, j)$ (via any private path). For any $(i, j) \in Y$, $\mathrm{global}(i, j)$ is at distance 5 from its destination $a(j, i)$ (following any private path of $L_i$, then an edge to gadget $L_j$, and a last edge to $a(j, i)$). The rest of the tokens are initially well-placed. Therefore, $L := L(I) = (4 + 5) \cdot 3k = 27k$. By Proposition 6, the length of any solution for $I$ is at least $13.5k$.
Claim 1. In any solution $s$ for $I$, at least $3k$ initially well-placed tokens have to move.

Proof of Claim 1. There are $3k$ local tokens and each has a disjoint neighborhood from all the others. Furthermore, all tokens in their neighborhood are private tokens, which are already well placed.
In solution $s$, let $x$ be the number of swaps between a well-placed token and a misplaced token (in the best case, $(-1/+1)$-swaps), and $y$ the number of swaps between two well-placed tokens ($(+1/+1)$-swaps). Claim 1 implies that $x + 2y \ge 3k$. Those $x + y$ swaps increase the sum of the token-to-destination distances by $2y$; its value reaches $L + 2y$. As $\ell = 16.5k$, there can only be at most $16.5k - (x + y) \le 13.5k + y = \frac{L+2y}{2}$ other swaps. Therefore, all those swaps shall be happy. It also implies that in each $U(i)$ exactly 3 well-placed tokens move in solution $s$. A last consequence is that all the swaps strictly worse than $(-1/+1)$-swaps (that is, $(0/+1)$-swaps and $(+1/+1)$-swaps) have to be swaps between two well-placed tokens.
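The counting step can be spelled out as follows, using only the quantities already fixed in the proof ($\ell = 16.5k$, $L = 27k$, and the inequality $x + 2y \ge 3k$ from Claim 1). This is a worked restatement of the argument, not an additional result.

```latex
\begin{align*}
\ell - (x + y) &= 16.5k - (x + y) \\
               &\le 16.5k - (x + y) + (x + 2y - 3k)
                  && \text{since } x + 2y \ge 3k \\
               &= 13.5k + y
                = \frac{27k + 2y}{2}
                = \frac{L + 2y}{2}.
\end{align*}
```

The remaining swaps must reduce a total distance of at least $L + 2y$, and a single swap reduces it by at most 2. At least $\frac{L+2y}{2}$ further swaps are therefore needed, so every inequality above is tight: each remaining swap is happy and $x + 2y = 3k$.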
Claim 2. In any solution $s$, no token $\mathrm{local}(i, j)$ leaves the gadget $L_i$.
Proof of Claim 2. It should first be noted that the token $\mathrm{local}(i, j)$ can only increase its distance to its destination by leaving $L_i$. Let $j_1 < j_2 < j_3$ be such that $(i, j_l) \in Y$ for every $l \in [3]$. The distance of $\mathrm{local}(i, j)$ to its destination is its distance to $b(i, j_1)$ plus $l - 1$. Besides, $\mathrm{local}(i, j)$ can only leave $L_i$ via a vertex $u(i, h, j')$ with $h \in [t]$ and $(i, j') \in Y$. From this vertex, it can go to $u(j', h', i)$ for some $h' \in [t]$. Now, the distance of $\mathrm{local}(i, j)$ to $b(i, j_l)$ is 2 if $l = 3$, and at least 3 otherwise. In both cases, the swap that puts $\mathrm{local}(i, j)$ cannot be happy. Therefore, by the consequences of Claim 1, it has to be a swap with a well-placed token. That means that this swap is at best a $(0/+1)$-swap. This is only possible if it is a $(+1/+1)$-swap between two well-placed tokens; hence, a contradiction.
Claim 3. For every $i \in [k]$, the 3 tokens of $U(i)$ which moved in solution $s$ are in the same $U(i, h_i)$, for some $h_i \in [t]$.
Proof of Claim 3. Let $j_1 < j_2 < j_3$ be such that $(i, j_1)$, $(i, j_2)$, and $(i, j_3)$ are all in $Y$. Consider the token $\mathrm{local}(i, j_2)$. It first moves to a vertex $u(i, h_i, j_2)$ (for some $h_i \in [t]$). By Claim 2, its only way to its destination $b(i, j_2)$ is via $u(i, h_i, j_3)$. This means that the token initially well-placed on $u(i, h_i, j_3)$ is one of those 3 tokens of $U(i)$ which moved. Now, by considering the token $\mathrm{local}(i, j_1)$, the same argument shows that the three tokens of $U(i)$ which are moved by solution $s$ are those on $u(i, h_i, j_1)$, $u(i, h_i, j_2)$, and $u(i, h_i, j_3)$.
We now claim that $\{v_{1,h_1}, v_{2,h_2}, \ldots, v_{k,h_k}\}$ is a solution to the Multicolored Subgraph Isomorphism instance. Indeed, for any $(i, j) \in Y$, $\mathrm{global}(i, j)$ has to go to $a(j, i)$. By Claim 3, it has to be via vertices of $U(i, h_i)$ and $U(j, h_j)$, and there is an edge between those two sets only if $v_{i,h_i}v_{j,h_j} \in E(H)$.
The graph $G$ has $3(t + 2)k$ vertices and $O(t^2 k^2)$ edges. We recall that $\ell = O(k)$. Therefore, any algorithm solving Token Swapping in time $f(\ell)(|V(G)| + |E(G)|)^{o(\ell/\log \ell)}$, for some computable function $f$, could be used to solve Multicolored Subgraph Isomorphism in time $f(k)(|V(H)| + |E(H)|)^{o(k/\log k)}$, and would contradict the ETH. This completes the proof of Theorem 1.

Token Swapping on nowhere-dense classes of graphs
As we have seen in Section 4, there is little hope for an FPT algorithm for the Token Swapping problem (parameterized by $k$), unless $\mathrm{FPT} = W[1]$. Now let us show that FPT algorithms exist if we restrict our input to nowhere-dense graph classes.
To define nowhere-dense graphs, let us first introduce the notion of a shallow minor. A shallow minor of a graph $G$ at depth $d$ is a subgraph of a graph obtained from $G$ by contracting subgraphs of $G$, each of radius at most $d$, into single vertices, and removing loops and multiple edges. A class $\mathcal{G}$ is nowhere-dense if for every $d$ the class of shallow minors at depth $d$ of graphs in $\mathcal{G}$ has bounded clique number. For more information about this topic, we refer the reader to the comprehensive book of Nešetřil and Ossona de Mendez [27, Chapter 13].
As graphs with bounded degree are nowhere-dense, this result generalizes Proposition 7.
Theorem 2 (FPT in nowhere-dense graphs). Subset Token Swapping is FPT parameterized by $k$ on nowhere-dense graph classes.
Proof. If we are able to express the Subset Token Swapping problem as a first-order formula, then the result follows immediately from the meta-theorem by Grohe, Kreutzer, and Siebertz [16].
Theorem (Grohe, Kreutzer, and Siebertz [16]). For every nowhere-dense class $\mathcal{C}$ and every $\varepsilon > 0$, every property of graphs definable by a first-order formula $\varphi$ can be decided in time $O(f(|\varphi|, \varepsilon) \cdot n^{1+\varepsilon})$ on $\mathcal{C}$, where $f$ is some function depending only on $\varphi$ and $\varepsilon$.
We will define the instance of Subset Token Swapping as a first-order formula $\Phi_{\le k}$ of size $O(k^4)$. Recall that if the length of an optimal solution is $k$, then at most $2k$ tokens are swapped. In our formula, variables will denote vertices of $G$. The relation $\mathrm{edge}(x, y)$ denotes the existence of an edge $xy$. The subsets of possible destinations of tokens will be represented by the relation $\mathrm{target}(x, y)$, which means that the vertex $y$ is a possible destination for the token initially starting on vertex $x$. Moreover, each token will be identified by its initial position.
Let $\Phi_k$ denote the formula encoding the solution of the Subset Token Swapping problem with exactly $k$ swaps. If we are interested in a solution using at most $k$ swaps, it is given by $\Phi_{\le k} = \bigvee_{i=1}^{k} \Phi_i$. We use variables to represent:
1. the "traced" tokens $t_1, t_2, \ldots, t_{2k}$ that are involved in the solution (some of them may stay intact, if the solution uses fewer than $2k$ tokens),
2. the final positions $dest_1, dest_2, \ldots, dest_{2k}$ of the "traced" tokens ($dest_j$ is the final position of token $t_j$),
3. the swaps $s^1_1, s^2_1, \ldots, s^1_k, s^2_k$ (in the $i$-th swap we exchange the tokens on the edge $s^1_i s^2_i$),
4. the tokens that are swapped: by $st^1_i, st^2_i$ we denote the tokens that are swapped in the $i$-th swap, i.e., before the swap the position of $st^p_i$ is $s^p_i$,
5. the positions of the "traced" tokens in each round: $pos_{j,i}$ is the vertex where token $t_j$ is after the $i$-th swap.
Now we are ready to present the formula $\Phi_k$.
… $\exists (pos_{1,k}, pos_{2,k}, \ldots, pos_{2k,k})$ …
In lines 1-9 we define the variables. Line 10 says that the tokens that are not involved in any swaps are already at feasible positions. Line 11 ensures that the traced tokens are pairwise different. Lines 12 and 13 say that the final positions of traced tokens should be feasible, and that we can perform swaps only on edges. In lines 14 and 15 we synchronize the values of variables $pos_{j,0}$ and $pos_{j,k}$ with variables $t_j$ and $dest_j$. In lines 16 and 17 we synchronize the values of variables $sp^1_i, sp^2_i$ and $s^1_i, s^2_i$. In line 18 we make sure that the tokens that are not involved in the current swap stay on their positions. Finally, in lines 19 and 20, we say that the tokens involved in the current swap exchange their positions.
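The conditions described above (swaps only on edges, tokens outside the current swap stay put, final positions are feasible) can be mirrored by a direct simulation. The following Python sketch is not the first-order formula and performs no search: it only checks a given candidate swap sequence. The function name `is_valid_solution` and the representation of `edges`, `target`, and `swaps` are hypothetical choices made for illustration.

```python
def is_valid_solution(vertices, edges, target, swaps, k):
    """Check a candidate swap sequence for Subset Token Swapping.

    vertices: list of vertex names; the token starting on v is identified with v.
    edges:    iterable of pairs (x, y), playing the role of edge(x, y).
    target:   dict mapping a start vertex x to the set of allowed destinations,
              playing the role of target(x, y).
    swaps:    list of pairs (x, y); the i-th swap exchanges the tokens on x and y.
    k:        maximum number of swaps allowed.
    """
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    if len(swaps) > k:
        return False
    pos = {v: v for v in vertices}  # pos[t]: current vertex of token t
    at = {v: v for v in vertices}   # at[v]: token currently on vertex v
    for x, y in swaps:
        # Swaps are allowed only along edges of the graph.
        if frozenset((x, y)) not in edge_set:
            return False
        tx, ty = at[x], at[y]
        # Only the two tokens on the swapped edge change position.
        at[x], at[y] = ty, tx
        pos[tx], pos[ty] = y, x
    # Every token, traced or not, must end on a feasible destination.
    return all(pos[t] in target[t] for t in vertices)
```

Tokens that never appear in a swap keep `pos[t] == t`, so the final check also covers the requirement that untouched tokens are already at feasible positions.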
We derive the following corollary.
To see Corollary 3 (a), recall that bounded-treewidth graphs are nowhere-dense. Thus, by Theorem 2, there exists an algorithm with running time $O(f(k)\, n^{1+\varepsilon})$, for any $\varepsilon > 0$ and treewidth bounded by some constant $c$. Observe that the constant hidden in the big-O notation depends on the constant $c$. In particular, $c$ has no influence on the exponent of $n$.

Token Swapping on almost trees
This section is devoted to the proof of the following theorem.
Theorem 4 (Hard on Almost Trees). Token Swapping remains NP-hard even when both the treewidth and the diameter of the input graph are constant, and cannot be solved in time $2^{o(n)}$, unless the ETH fails.
Proof. In Exact Cover by 3-Sets, one is given a family $\mathcal{S} = \{S_1, S_2, \ldots, S_m\}$ of 3-element subsets of the universe $X = \{x_1, x_2, \ldots, x_n\}$, where 3 divides $n$. The goal is to find $n/3$ subsets in $\mathcal{S}$ that partition (or here, equivalently, cover) $X$. The problem can be seen as a straightforward generalization of the 3-Dimensional Matching problem. This problem is NP-complete and has no $2^{o(n)}$ algorithm, unless the ETH fails, even if each element belongs to exactly 3 triples [2, 14]. Therefore we can reduce from the restriction of the Exact Cover by 3-Sets problem where each element belongs to exactly 3 sets of $\mathcal{S}$, so that $|\mathcal{S}| = |X| = n$.
Construction. For each set $S_j \in \mathcal{S}$, we add a set gadget consisting of a tree on 10 vertices (see Figure 6). In the set gadget, the four gray tokens should cyclically swap as indicated by the dotted arrows: $g^j_i$ shall go where $g^j_{i+1}$ is, for each $i \in [4]$ (addition is computed modulo 4). The three black tokens, as usual, are initially well placed. The three remaining vertices are called element vertices. They represent the three elements of the set. The tokens initially on the element vertices are called element tokens. For each element of $X$, there are 3 element tokens and 3 element vertices.
Figure 6: The set gadget for red, green and blue. We deliberately omit the superscript $j$.
We add a vertex $c$ that is linked to all the element vertices of the set gadgets and to all the vertices $g^j_0$. Each token originally on an element vertex should cyclically go to its next occurrence (see Figure 7). The token initially on $c$ is well placed (it could be drawn as a black token). The constructed graph $G$ has $10n + 1$ vertices. If one removes the vertex $c$, the remaining graph is a forest, which means that the graph has a feedback vertex set of size 1 and, in particular, treewidth at most 2. The diameter of $G$ is bounded by 6, since all the vertices are at distance at most 3 from the vertex $c$. We now show that the instance $\mathcal{S}$ of Exact Cover by 3-Sets admits a solution iff there exists a solution for our instance of Token Swapping of length at most $\ell := 11 \cdot n/3 + 9 \cdot 2n/3 + 2n = 35n/3 = 11n + 2n/3$.
Soundness. The correctness of the construction relies mainly on the fact that there are two competitive ways of placing the gray tokens. The first way is the most direct. It consists of only swapping along the spine of the set gadget. By spine, we mean the 7 vertices initially containing gray or black tokens. From here on, we call that swapping the gray tokens internally.
Proof of Claim 4. In 6 swaps, we can first move $g_3$ to its destination (where $g_0$ is initially). Then, $g_0$, $g_1$, and $g_2$ need one additional swap each to be correctly placed. We observe that, after we do so, the black tokens are back at their respective destinations.
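The count in the proof of Claim 4 can be replayed mechanically. The Python sketch below assumes that the spine is a path on 7 vertices on which gray and black tokens alternate, starting with $g_0$ and ending with $g_3$; this layout is an assumption read off the proof (6 swaps bring $g_3$ to the place of $g_0$), not a reproduction of Figure 6. The token names `b1`, `b2`, `b3` and all identifiers are hypothetical. The sketch also counts inversions, which is the standard lower bound for sorting by adjacent swaps on a path.

```python
# Assumed spine layout, left to right (not taken from Figure 6).
SPINE = ["g0", "b1", "g1", "b2", "g2", "b3", "g3"]
# g_i goes where g_{i+1} was (indices modulo 4); black tokens stay in place.
GOAL = ["g3", "b1", "g0", "b2", "g1", "b3", "g2"]

# Swap index i exchanges the tokens at positions i and i + 1.
# First six swaps: move g3 to the left end. Last three: fix g0, g1, g2.
CLAIM4_SWAPS = [5, 4, 3, 2, 1, 0, 1, 3, 5]


def apply_swaps(tokens, swaps):
    """Apply adjacent swaps to a copy of the token list and return it."""
    tokens = list(tokens)
    for i in swaps:
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return tokens


def count_inversions(dest):
    """Number of pairs of tokens that must cross each other on a path."""
    n = len(dest)
    return sum(1 for i in range(n) for j in range(i + 1, n) if dest[i] > dest[j])


# Destination position of each token, listed by initial position.
DEST = [GOAL.index(tok) for tok in SPINE]
```

Under this assumed layout, `apply_swaps(SPINE, CLAIM4_SWAPS)` equals `GOAL` with 9 swaps, and `count_inversions(DEST)` is also 9, so no shorter sequence restricted to the spine exists.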
We call the second way swapping the gray tokens via $c$. Basically, it is the way one would have to place the gray tokens if the black tokens (except the one in $c$) were removed from the graph. It consists of, first, (a) swapping $g_0$ with the token on $c$, then moving $g_0$ to its destination, then (b) swapping $g_1$ with the current token on $c$ and moving $g_1$ to its destination, (c) swapping $g_2$ with the token on $c$ and moving $g_2$ to its destination, and finally (d) swapping $g_3$ with the token on $c$ and moving it to its destination. Considering that swapping the gray tokens via $c$ takes 2 more swaps than swapping them internally, and leads to the exact same configuration where both the black tokens and the element tokens are back at their initial positions, one may question the point of the second way of swapping the gray tokens. It turns out that, at the end of steps (a), (b), and (c), an element token is on vertex $c$. We will take advantage of that situation to perform two consecutive happy swaps with its two other occurrences. Observe that, by doing so, the first swaps of steps (b), (c), and (d) are also happy and place the last occurrence of the element token at its destination.
We assume that there is a solution $S_{a_1}, \ldots, S_{a_{n/3}}$ to the Exact Cover by 3-Sets instance. In the corresponding $n/3$ set gadgets, swap the gray tokens via $c$ and interleave those swaps with the two happy swaps over element tokens, whenever such a token reaches $c$. By Claim 5, this requires $11 \cdot n/3 + 2n$ swaps. At this point, the tokens that are misplaced are the $4 \cdot 2n/3$ gray tokens in the $2n/3$ remaining set gadgets. Swap those gray tokens internally. This adds $9 \cdot 2n/3$ swaps, by Claim 4. Overall, this solution consists of $29n/3 + 2n = 35n/3 = \ell$ swaps.
Let us now suppose that there is a solution $s$ of length at most $\ell$ to the Token Swapping instance. At this point, we should observe that there are alternative ways (to Claim 4 and Claim 5) of placing the gray tokens at their destinations. For instance, one can move $g_3$ to $g_1$ along the spine, place tokens $g_2$ and $g_3$, then exchange $g_0$ with the token on $c$, move $g_0$ to its destination, swap $g_3$ with the token on $c$, and finally move it to its destination. This also takes 11 swaps but moves only one element token to $c$ (compared to moving all three of them in the strategy of Claim 5). One can check that all those alternative ways take 11 swaps or more. Let $r \in [0, n]$ be such that $s$ does not swap the gray tokens internally in $r$ set gadgets (and swaps them internally in the remaining $n - r$ set gadgets). The length of $s$ is at least $11r + 9(n - r) + 2(n - q) + 4q = 11n + 2(r + q)$, where $q$ is the number of elements of $X$ for which no occurrence of the three element tokens has been moved to $c$ in the process of swapping the gray tokens. Indeed, for each of those $q$ elements, 4 additional swaps will eventually be needed. For each of the remaining $n - q$ elements, only 2 additional happy swaps will place the three corresponding element tokens at their destinations. It holds that $3r \ge n - q$, since the element tokens within the $r$ set gadgets where $s$ does not swap internally represent at most $3r$ distinct elements of $X$. Hence, $3r + q \ge n$. Also, $s$ is of length at most $\ell = 11n + 2n/3$, which implies that $r + q \le n/3$. Thus, $n \le 3r + q \le 3r + 3q \le n$. Therefore, $q = 0$ and $r = n/3$. Let $S_{a_1}, \ldots, S_{a_{n/3}}$ be the $n/3$ sets for which $s$ does not swap the gray tokens internally in the corresponding set gadgets. For each element of $X$, an occurrence of a corresponding element token is moved to $c$ when the gray tokens are swapped in one of those gadgets. So this element belongs to one $S_{a_i}$ and therefore $S_{a_1}, \ldots, S_{a_{n/3}}$ is a solution to the instance of Exact Cover by 3-Sets.
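The counting argument above can be laid out as a short chain of inequalities. Here $|s|$ denotes the length of the solution $s$, a shorthand introduced only for this display; $r$, $q$, $n$, and $\ell$ are as in the text.

```latex
\begin{align*}
|s| &\ge 11r + 9(n-r) + 2(n-q) + 4q = 11n + 2(r+q), \\
|s| &\le \ell = 11n + \tfrac{2n}{3}
      \;\Longrightarrow\; r + q \le \tfrac{n}{3}, \\
3r &\ge n - q
      \;\Longrightarrow\; n \le 3r + q \le 3r + 3q \le n .
\end{align*}
```

Since the first and last terms of the final chain coincide, every inequality in it is an equality: $3r + q = 3r + 3q$ gives $q = 0$, and then $3r = n$ gives $r = n/3$.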
The ETH lower bound follows from the fact that the size of the constructed graph is linear in $n$.

Variants of Token Swapping on stars, cliques, and paths
In this section we investigate the complexities of the variants of Token Swapping on very simple classes of graphs.
Let us start by defining an auxiliary digraph, which will be useful in coping with the Colored Token Swapping problem. For an instance of the Colored Token Swapping problem on a graph $G$, we define the color digraph $G^*$, whose vertices are the colors of tokens on $G$, and whose arcs correspond to the vertices of $G$. The vertex $v$ corresponds to the arc $e(v) = cc'$, such that $c$ is the color of $v$ and $c'$ is the color of the token placed on $v$. Note that both loops and multiple arcs are possible. There is a very close relation between color digraphs and Eulerian digraphs.
Observation 10. The following hold: (i) if $G^*$ is the color digraph of some instance of the Colored Token Swapping problem, then every connected component of $G^*$ is Eulerian; (ii) for every Eulerian digraph $H$ with $n$ arcs, and for any graph $G$ with $n$ vertices, there exists an instance of Colored Token Swapping on $G$ such that its color digraph $G^*$ is isomorphic to $H$.
Proof. To see (i), consider a vertex $c$ of $G^*$. Its out-degree is the number of tokens placed on vertices with color $c$. The in-degree of $c$ is the number of tokens of color $c$. Thus the in-degree is equal to the out-degree, from which (i) follows. Now, to see (ii), consider a vertex $c$ of $H$ and let $d$ be its out-degree (equal to its in-degree, as $H$ is Eulerian). Then in $G$ give the color $c$ to any $d$ uncolored vertices. Moreover, for each arc $cc'$ in $H$ we place a token of color $c'$ on a vertex of color $c$. We repeat this for every vertex $c$ of $H$, obtaining an instance of Colored Token Swapping whose color digraph is exactly $H$.
Now consider a solution $s$ for the instance of the Colored Token Swapping problem on $G$ and fix the destinations of tokens according to $s$. We observe that the cycles in the permutation defined by these destinations correspond to circuits in $G^*$. Thus, when trying to find a solution for an instance of the Colored Token Swapping problem, we will first try to fix appropriate destinations (by analyzing circuits in $G^*$), and then we will solve the resulting instance of the Token Swapping problem.
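The definition of the color digraph and the degree argument of Observation 10 (i) can be sketched in a few lines of Python. The function names and the dictionary representation of colorings are hypothetical; the check below only tests that in-degree equals out-degree at every color, which is the condition used in the proof.

```python
from collections import Counter


def color_digraph(vertex_color, token_color):
    """Return the arcs of the color digraph, one arc per vertex of G.

    vertex_color[v] is the color c of vertex v;
    token_color[v] is the color c' of the token placed on v.
    The arc e(v) goes from c to c'; loops and multiple arcs may occur.
    """
    return [(vertex_color[v], token_color[v]) for v in vertex_color]


def is_balanced(arcs):
    """True iff every color has equal in-degree and out-degree."""
    out_deg = Counter(c for c, _ in arcs)       # vertices of color c
    in_deg = Counter(c2 for _, c2 in arcs)      # tokens of color c
    return out_deg == in_deg
```

For a well-formed instance, where the number of tokens of each color equals the number of vertices of that color, `is_balanced` returns `True`; it returns `False` when some color has more tokens than vertices or vice versa.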
To prove the next theorem we will use the following result by Pak [28]. We state it in the language of tokens and swaps, although the original motivation of Pak was sorting a permutation by transpositions with the first element.
Lemma 11 (Pak [28]). Let $I$ be an instance of Token Swapping on a star with $n$ leaves, with the initial configuration of tokens $\pi$. If the decomposition of $\pi$ into cycles consists of one cycle involving the central vertex, $m$ cycles of length at least 2, and $b$ cycles of length 1, then the length of an optimal solution to $I$ is $n + m - b$.
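The formula of Lemma 11 is easy to evaluate from the cycle decomposition. The Python sketch below assumes that vertex 0 is the center, vertices 1 to $n$ are the leaves, and `dest[v]` is the destination of the token initially on `v`; this encoding and the function name `pak_star_length` are hypothetical choices for illustration.

```python
def pak_star_length(dest):
    """Evaluate n + m - b from Lemma 11 for a star instance.

    dest[v] is the destination of the token initially on vertex v,
    where vertex 0 is the center and vertices 1..n are the leaves.
    """
    n = len(dest) - 1
    seen = [False] * (n + 1)
    m = 0  # cycles of length >= 2 that avoid the center
    b = 0  # cycles of length 1 that avoid the center (well-placed leaf tokens)
    for start in range(n + 1):
        if seen[start]:
            continue
        length, has_center, v = 0, False, start
        # Walk the cycle of the permutation that contains `start`.
        while not seen[v]:
            seen[v] = True
            has_center = has_center or v == 0
            length += 1
            v = dest[v]
        if has_center:
            continue  # the single cycle through the center is not counted
        if length == 1:
            b += 1
        else:
            m += 1
    return n + m - b
```

For example, two leaf tokens that must exchange places while the center token stays put give `pak_star_length([0, 2, 1]) == 3`: the center has to be used as a buffer, which costs one extra swap.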
Theorem 12. Colored Token Swapping can be solved in polynomial time on stars.
Proof. Let $G$ be a star with center $v_0$ and leaves $v_1, v_2, \ldots, v_n$. The color of the vertex $v$ will be denoted by $c(v)$. Also, let $c_0 := c(v_0)$.
It is interesting to point out that if G is a clique, then the presence of many cycles in the permutation of tokens yields a short solution for the Token Swapping problem, while for the case when G is a star, the situation is opposite.
Theorems 13 and 15 can be used to show a slightly more general hardness result. A class $\mathcal{G}$ of graphs is hereditary if for any $G \in \mathcal{G}$ and any induced subgraph $G'$ of $G$ we have $G' \in \mathcal{G}$. We say that a class $\mathcal{G}$ of graphs has unbounded degree if for every $d \in \mathbb{N}$ there exists $G \in \mathcal{G}$ such that $\Delta(G) \ge d$.
Theorem 16. Let $\mathcal{G}$ be a hereditary class of graphs with unbounded degree. The Subset Token Swapping problem is NP-complete on $\mathcal{G}$.
Proof. We shall reduce from Directed Hamiltonian Cycle in digraphs with out-degree at most 2. Let $H$ be such a digraph with $n$ vertices.
First, assume that $K_{1,n} \in \mathcal{G}$. Then we are done by Theorem 13. So assume that $K_{1,n} \notin \mathcal{G}$. Since $\mathcal{G}$ is hereditary, we know that $K_{1,n'} \notin \mathcal{G}$ for any $n' \ge n$. Since decomposing the arc set of an Eulerian digraph with no 2-cycles into directed triangles is NP-complete (see Lemma 14), there exists a polynomial reduction from Directed Hamiltonian Cycle to this problem. Consider the digraph $H^*$ obtained with this reduction. Its arc set can be decomposed into triangles if and only if $H$ has a Hamiltonian cycle. Let $m$ denote the number of arcs in $H^*$ and set $N = \max(m, n)$.
By Ramsey's theorem [31] (see also Erdős and Szekeres [9]), we know that there exists an absolute constant $c$ such that every graph with more than $c \cdot 4^N$ vertices has either a clique or an independent set of size $N$.
Since $\mathcal{G}$ has unbounded degree, there exists a graph $G \in \mathcal{G}$ such that $\Delta(G) \ge c \cdot 4^N$. Let $v$ be a vertex of $G$ with degree at least $c \cdot 4^N$ and let $G'$ be the subgraph of $G$ induced by the neighborhood of $v$. If $G'$ has an independent set $U$ of size $N$, then $G[U \cup \{v\}] \cong K_{1,N}$, so we obtain a contradiction (recall that $\mathcal{G}$ is hereditary and $N \ge n$). Thus $G'$ has a subset $C$ inducing a clique of size $N$. Since $\mathcal{G}$ is hereditary and $N \ge m$, we obtain that $K_m \in \mathcal{G}$. Thus we can use the construction from Theorem 15.
Finally, we turn our attention to paths.
Theorem 17. Colored Token Swapping can be solved in polynomial time on paths.
Proof. Let $c$ be the color of the vertex $v$ at the left end of the path. Let $t$ be the leftmost token with color $c$. It is clear that no optimal solution contains a swap involving two tokens of the same color, so in any optimal solution the token $t$ will end up on $v$. Repeat this argument with the second leftmost vertex, and so on. This way we fix the destinations for all tokens, obtaining an equivalent instance of the Token Swapping problem, which can be solved in polynomial time (see [26]).
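The proof of Theorem 17 translates into a short procedure. The Python sketch below fixes destinations greedily from left to right, as in the proof, and then counts inversions of the resulting permutation. Using the inversion count as the optimum on a path is the standard fact about sorting by adjacent transpositions; it is an assumption of this sketch and is not claimed to be the method of [26]. The function name and the list encoding are hypothetical, and the input is assumed to be a valid instance (as many tokens as vertices of each color).

```python
from collections import defaultdict, deque


def colored_path_swaps(vertex_colors, token_colors):
    """Minimum number of swaps for Colored Token Swapping on a path.

    vertex_colors[i] is the color of the i-th vertex from the left;
    token_colors[i] is the color of the token initially on that vertex.
    """
    # Positions of the tokens of each color, from left to right.
    tokens_by_color = defaultdict(deque)
    for pos, color in enumerate(token_colors):
        tokens_by_color[color].append(pos)

    # The leftmost unassigned token of color c goes to the leftmost
    # unassigned vertex of color c.
    dest = [None] * len(vertex_colors)
    for v, color in enumerate(vertex_colors):
        t = tokens_by_color[color].popleft()
        dest[t] = v

    # Each inverted pair of tokens must cross exactly once on a path.
    n = len(dest)
    return sum(1 for i in range(n) for j in range(i + 1, n) if dest[i] > dest[j])
```

The quadratic inversion count keeps the sketch short; a merge-sort based count would bring the running time down to $O(n \log n)$.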

Conclusion
We conclude the paper with several ideas for further research. First, we believe that it would be interesting to fill the missing entries in Table 2. In particular, we conjecture that the Token Swapping problem remains NP-complete even if the input graph is a tree.
Another interesting problem is the following. By Miltzow et al. [26, Theorem 1] (see also Proposition 8), the Token Swapping problem can be solved in time $2^{O(n \log n)}$, and there is no $2^{o(n)}$ algorithm, unless the ETH fails. We conjecture that the lower bound can be improved to $2^{o(n \log n)}$. It would also be interesting to find single-exponential algorithms for some restricted graph classes, such as graphs with bounded treewidth or planar graphs.

Figure 1: Every token placement can be uniquely described by a permutation.

Corollary 3. Subset Token Swapping is FPT (a) parameterized by $k + \mathrm{tw}(G)$, (b) parameterized by $k$ in planar graphs.

…
if πs can be obtained from π0 with a swap on e then
    …
foreach configuration π′ of tokens on G do
    if Reach(G, π0, π′, k/2) = true and Reach(G, π′, πs, k/2) = true then
        return true
return false

Figure 3: The linker gadget $L_{a,b}$. Black tokens are initially properly placed. Dashed arcs represent where tokens of the finishing set should go in the starting path. At the bottom left, we depict the gadget after all the local tokens are swapped to a single private path. At the bottom right, we see the result after swapping all the local tokens to the starting path. In this case, the global tokens go to that private path.

Figure 7: The overall picture. Each element appears exactly 3 times, so there are 3 red tokens.

Claim 4. Swapping the gray tokens internally requires 9 swaps.