On the Cycle Augmentation Problem: Hardness and Approximation Algorithms

In the k-Connectivity Augmentation Problem we are given a k-edge-connected graph and a set of additional edges called links. Our goal is to find a set of links of minimum size whose addition to the graph makes it (k + 1)-edge-connected. There is an approximation preserving reduction from the mentioned problem to the case k = 1 (a.k.a. the Tree Augmentation Problem or TAP) or k = 2 (a.k.a. the Cactus Augmentation Problem or CacAP). While several better-than-2 approximation algorithms are known for TAP, for CacAP this barrier was breached only recently (hence for k-Connectivity Augmentation in general). As a first step towards better approximation algorithms for CacAP, we consider the special case where the input cactus consists of a single cycle, the Cycle Augmentation Problem (CycAP). This apparently simple special case retains part of the hardness of the general case. In particular, we are able to show that it is APX-hard. In this paper we present a combinatorial (3/2 + ε)-approximation for CycAP, for any constant ε > 0. We also present an LP formulation with a matching integrality gap: this might be useful to address the general case of the problem.


Introduction
The basic goal of Survivable Network Design is to construct low cost networks that provide connectivity guarantees between pre-specified sets of nodes even after the failure of a few edges/nodes (in the following we will focus on the edge failure case). This has many applications, e.g., in transportation and telecommunication networks.
A relevant subclass of these problems is given by Network Augmentation problems. Here the goal is to augment a given graph G = (V , E) by adding extra edges taken from a given set L (links), so as to satisfy given (edge-)connectivity requirements. Several such problems are NP-hard, and in most cases the best known approximation factor is 2 due to Jain [19].
In this paper we focus on the following k-Connectivity Augmentation Problem (k-CAP). Given a k-(edge)-connected undirected graph G = (V, E) and a collection L of extra edges (links), the goal is to find a subset A ⊆ L of minimum size such that G' = (V, E ∪ A) is (k + 1)-connected. (We recall that G = (V, E) is k-connected if for every set of edges F ⊆ E with |F| ≤ k − 1, the graph G' = (V, E \ F) is connected.) Dinitz et al. [10] presented an approximation preserving reduction from this problem to the case k = 1 for odd k, and k = 2 for even k. This motivates a deeper understanding of the latter two special cases.
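The definition of k-connectivity recalled above can be checked directly on small examples. The following is a minimal brute-force sketch, not an efficient algorithm: the function names `is_connected` and `is_k_edge_connected`, and the encoding of nodes as integers 0..n−1 with edges as pairs, are assumptions made only for illustration.

```python
from itertools import combinations


def is_connected(n, edges):
    """Return True if the multigraph on nodes 0..n-1 with these edges is connected."""
    adjacency = {v: [] for v in range(n)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        x = stack.pop()
        for y in adjacency[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == n


def is_k_edge_connected(n, edges, k):
    """Brute force: removing any set F of at most k-1 edges must keep the graph connected."""
    for size in range(min(k, len(edges) + 1)):
        # Edges are removed by position so that parallel edges are handled correctly.
        for removed in combinations(range(len(edges)), size):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if not is_connected(n, kept):
                return False
    return True
```

For instance, a cycle on four nodes is 2-connected but not 3-connected, and adding the two diagonals as links makes it 3-connected, which is exactly the augmentation task for k = 2.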
The case k = 1 is also known as the Tree Augmentation Problem (TAP). The reason for this name is that any 2-edge-connected component of the input graph G can be contracted, hence leading to a tree. For this problem several better-than-2 approximation algorithms are known [1,7,11,12,17,24,28]. The case k = 2 is also known as the Cactus Augmentation Problem (CacAP) since, similarly to the previous case, the input graph can be assumed to be a cactus [10]. Recall that a cactus G is a connected undirected graph in which every edge belongs to exactly one cycle. For technical reasons in this paper we also consider cycles of length 2. However, here the best-known approximation factor was 2 [19] for a long time and only recently this was improved to 1.91 (implying the same for k-CAP in general).
For all the mentioned problems it makes sense to consider the weighted version, where links have non-negative integral weights, and the goal is to find a minimum weight (rather than minimum cardinality) subset of links A with the desired properties. In particular we will speak about Weighted TAP (WTAP) and Weighted CacAP (WCacAP). Here the best-known approximation factor is 2 in both cases [19]. Moreover, improving on that approximation factor for WTAP is considered as a major open problem in the area. We also notice that we can turn a WTAP instance into an equivalent WCacAP instance by replacing each edge with two parallel edges. Hence, approximating WCacAP is not any easier than approximating WTAP (and the same holds for the corresponding unweighted versions).

Our Results
As mentioned before, CacAP contains TAP as a special case when all the cycles in the cactus have length 2 (formed by a pair of parallel edges). Hence, in order to make progress on CacAP, it makes sense to consider the somewhat complementary case where the input cactus consists of a single cycle of n nodes. We call the corresponding subproblem the Cycle Augmentation Problem (CycAP), and its weighted version Weighted CycAP (WCycAP). To the best of our knowledge, these special cases were not studied before. However, as we will see, they still retain part of the difficulties of the general cactus case. In more detail, we achieve the following main results:

Approximation Algorithms
We present better-than-2 approximation algorithms for this problem. In particular, we present a simple 5/3-approximation, and a slightly more complex (3/2 + ε)-approximation for any constant ε > 0. Notice that the latter approximation factor is not far from the best known approximation factor for TAP, which is equal to 1.458 [17]. Our algorithms are purely combinatorial, and they consist of two main phases. In the first phase, we greedily add some links to the solution under construction and contract them. At the end of this phase we obtain an instance of CacAP that can be solved exactly in polynomial time. In particular, for the 5/3-approximation this reduces to computing a spanning tree, while for the (3/2 + ε)-approximation we use an FPT algorithm parameterized by a proper notion of maximum length of a link.

Hardness of Approximation
We are able to show that WCycAP is as hard to approximate as WCacAP. Therefore, improving on a 2-approximation for WCycAP would imply a major breakthrough in the area (in particular, it would imply the same for WTAP). This also justifies a more careful investigation of CycAP. In our opinion it is a priori not so obvious that CycAP is even NP-hard. Indeed, the special case of TAP (and even of WTAP) where the input graph is a path can be solved exactly in polynomial time, and the case of an input cycle might seem closely related to the path case.
Here we show that this intuition is not correct: we prove that CycAP is NP-hard and even APX-hard via a simple but non-trivial adaptation of the proofs in [15,23]. In particular, we need one extra step in the reduction where we turn an intermediate CacAP instance into a CycAP one while maintaining certain properties of the optimal solution.

LP Gaps
The recent literature on TAP approximation [1,12,17] shows that finding strong LP relaxations for the problem can be very helpful to design improved approximation algorithms. In the same spirit, we tried to address the problem of finding LP relaxations for CycAP with small integrality gap. For both TAP and CacAP (hence CycAP) one can define a natural and simple standard cut LP (more details later). While for TAP it was recently shown that the standard cut LP has integrality gap smaller than 2 [29], interestingly for CycAP (hence for CacAP) the standard cut LP has integrality gap 2. Here we present a stronger LP that, for any ε > 0, has integrality gap at most 3/2 + ε (hence matching the approximation ratio of our algorithm). In our opinion this could be useful for future work on CacAP approximation.

Related Work
As mentioned before, the best known result in terms of polynomial time approximation algorithms for k-CAP is a 1.91-approximation proposed by Byrka et al [2]. However, if the set of links is equal to V × V it is possible to solve this problem optimally [33]. More recently, this problem has been studied in the framework of Fixed-Parameter Tractability: Végh and Marx [27] proved that this problem is in FPT when parameterized by the size of the optimal solution, and later the running time of their algorithm was further improved [3].
Tree Augmentation has been extensively studied over the past few decades. It was first shown that WTAP is NP-hard by Frederickson and Jájá [15], then that TAP is NP-hard by Cheriyan et al. [6], and later that TAP is APX-hard by Kortsarz et al. [23]. For WTAP, the best-known approximation guarantee is 2 and was first established by Frederickson and Jájá [15]. Their algorithm was later simplified by Khuller and Thurimella [21]. A 2-approximation can also be achieved by various other techniques developed later on, including a primal-dual approach [16] and iterative rounding [19]. Improvements on the factor 2 have only been obtained for restricted cases, including bounded diameter trees [8] and bounded weights [1,12,17,29].
Regarding TAP, the first algorithm beating the approximation guarantee of 2 is due to Nagamochi [28], achieving an approximation factor of 1.815 + ε. This factor was subsequently improved to 1.8 [11] and to 1.5 [24]. These results are combinatorial in nature, but LP-based results have been achieved as well. As an example, recently Nutov [29] showed that the standard cut LP for TAP has an integrality gap of at most 28/15, while a lower bound of 3/2 was known [7]. An LP-based (5/3 + ε)-approximation was given by Adjiashvili [1] and then refined by Fiorini et al. [12] to obtain a (3/2 + ε)-approximation (see also [4,5,26]). Both results are obtained by adding a proper family of extra constraints to the standard cut LP. Recently, Grandoni et al. [17] achieved a 1.458-approximation for TAP, which is smaller than the integrality gap of the standard cut LP.
The rest of this paper is organized as follows. In Section 2 we give some preliminary definitions and results. The approximation algorithms, LP-gaps and hardness of approximation results are discussed in Sections 3, 4 and 5 respectively.

Preliminaries
For a set X and element y, we use the shortcut X \ y for X \ {y}, and similarly for other set operations.
Given a graph G = (V, E), we let V(G) = V and E(G) = E. Recall that in WCacAP we are given a cactus G = (V, E), a set of links L ⊆ \binom{V}{2} and a non-negative weight function c : L → R_{≥0}. The task is to compute a subset of links A ⊆ L such that the graph (V, E ∪ A) is 3-edge-connected while minimizing c(A) := Σ_{ℓ∈A} c(ℓ). The special case where G is a cycle is called WCycAP, and the unweighted versions of the above problems are called CacAP and CycAP respectively. By n we will denote the number of nodes of the considered instance of the problem.
Notice that, given an instance (G, L) of CacAP, we can check in polynomial time if the graph (V(G), E(G) ∪ L) is 3-edge-connected by exhaustively checking if the removal of any pair of elements from E(G) ∪ L disconnects the graph. Hence we will assume throughout this work that the instance always admits a feasible solution. Note that in the case of CycAP, Observation 1 implies that any feasible solution must be an edge cover, as the 2-edge cuts defined by neighboring edges of the cycle must be satisfied. Given a 2-edge cut S = {e, e'}, let L_S be the subset of links satisfying S. The standard cut LP for CycAP is as follows: minimize Σ_{ℓ∈L} x_ℓ subject to Σ_{ℓ∈L_S} x_ℓ ≥ 1 for every 2-edge cut S, and x_ℓ ≥ 0 for every ℓ ∈ L.
Now we proceed to define a standard building block for our algorithms, the contraction of a link.

Definition 1 Contracting a subset of nodes W consists of the following operations: (i) remove the nodes in W and all edges/links incident to them; (ii) add a new node w and, for each original edge/link of type (y, x), x ∈ W, y ∉ W, add the edge/link (y, w) (of the same weight for the case of links). Note that we do not create loops this way but may introduce parallel links. We say that (y, w) is the image of (y, x) and (y, x) is the preimage of (y, w).
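The contraction of a node subset from Definition 1 can be sketched in a few lines. This is only an illustrative sketch for the unweighted case: the function name `contract`, the list-of-pairs encoding, and the label of the new node are assumptions, and the same routine is simply applied once to the edge list and once to the link list.

```python
def contract(nodes, pairs, W, w):
    """Contract the node subset W into a single new node w (Definition 1).

    `pairs` is a list of edges or links given as (u, v) tuples.
    Pairs with both endpoints in W disappear (no loops are created);
    parallel pairs may appear and are kept.
    """
    W = set(W)
    new_nodes = [x for x in nodes if x not in W] + [w]
    new_pairs = []
    for u, v in pairs:
        if u in W and v in W:
            continue  # both endpoints removed: this pair has no image
        # Replace each endpoint lying in W by the new node w.
        new_pairs.append((w if u in W else u, w if v in W else v))
    return new_nodes, new_pairs
```

On a cycle with six nodes, contracting the endpoints {0, 3} of the link (0, 3) yields a cactus made of two triangles sharing the new node, which is the situation used repeatedly in the proofs of this section.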
We will sometimes slightly abuse notation and use the same label to denote a link and its image: the meaning will be clear from the context. For a link ℓ = (u, v), we define a sequence w_0, ..., w_q of boundary nodes B(ℓ) as follows. Consider a simple path from u to v in the cactus, and let C_1, C_2, ..., C_q be the ordered sequence of cycles visited by this path (possibly q = 1). Note that a path visits a cycle iff it includes an edge from the cycle. We define w_i, i = 1, ..., q − 1, as the unique common node between C_i and C_{i+1}, and set w_0 = u and w_q = v.

Definition 2 Contracting a link ℓ is the operation of contracting its boundary nodes B(ℓ). We denote by G|ℓ the graph obtained by this operation. Contracting a set of links A is the operation of contracting any ℓ ∈ A, and then continuing recursively on G|ℓ and on the image of A \ ℓ until A becomes empty.
Note that contracting a link in a cactus yields again a cactus. We will extensively use the following standard fact.

Lemma 1 Let (G, L) be a CacAP instance and let ℓ ∈ A ⊆ L. Then A is a feasible solution for (G, L) if and only if the image of A \ ℓ is a feasible solution for (G|ℓ, L \ ℓ).
We require some further notation before proving the lemma. The internal projections S(ℓ) of ℓ are the links (w_i, w_{i+1}), i = 0, ..., q − 1. In terms of feasibility, ℓ and S(ℓ) are equivalent, as the following proposition states.

Proposition 1 Let (G, L) be a CacAP instance and ℓ ∈ L. Then ℓ satisfies precisely the same 2-edge cuts as S(ℓ).
Proof Let B(ℓ) = (w_0, ..., w_q) and C_1, ..., C_q be the corresponding sequence of cycles visited by a simple path between the endpoints of ℓ. Notice that the pairs (w_i, w_{i+1}), i = 0, ..., q − 1, subdivide each C_i into two paths, next denoted as C'_i and C''_i. Trivially ℓ satisfies only cuts belonging to the cycles C_1, ..., C_q, and the same holds for S(ℓ). Consider any pair (e_1, e_2) belonging to some C_i. Link ℓ satisfies the corresponding cut if and only if precisely one such edge e_j belongs to C'_i. The same holds for (w_i, w_{i+1}), hence for S(ℓ).
In order to prove Lemma 1, let us first consider the simpler case where G is a cycle.

Lemma 2 Let (G, L) be a CycAP instance and let ℓ = (u, v) ∈ A ⊆ L. Then A is a feasible solution for (G, L) if and only if the image of A \ ℓ is a feasible solution for (G|ℓ, L \ ℓ).
Proof Let C_1 and C_2 be the two cycles in G|ℓ, with common node w.
Suppose first that the image of A \ ℓ is a feasible solution for (G|ℓ, L \ ℓ). Consider a pair of edges {e_1, e_2} belonging to a common cycle C_i, and the corresponding cut (S', S'') in G|ℓ with w ∈ S'. There must be a link ℓ' ∈ A \ ℓ satisfying this cut in G|ℓ. The preimage of ℓ' has one endpoint in S'' and the other in V \ S'' = (S' \ {w}) ∪ {u, v}, hence it satisfies the {e_1, e_2}-cut in G. The remaining pairs of edges {e_1, e_2} of G satisfy e_1 ∈ C_1 and e_2 ∈ C_2, modulo symmetries. Those cuts are satisfied by ℓ in G.
Suppose now that A is feasible for (G, L). Consider a pair of edges {e_1, e_2} belonging to a common cycle C_i. Let (S', S'') be the corresponding cut in G|ℓ with w ∈ S'. Since ℓ does not satisfy that cut in G, this means that there is some other link ℓ' ∈ A \ ℓ satisfying it. The image of ℓ' has one endpoint in S' and the other in S'', hence it satisfies the {e_1, e_2}-cut.
Now we can proceed with the proof of Lemma 1.

Proof of Lemma 1 By Proposition 1, we obtain an equivalent statement of the lemma by replacing A with the set S(A) of the internal projections of links in A and replacing ℓ with its internal projections S(ℓ).
Let B(ℓ) = (w_0, ..., w_q) and C_1, ..., C_q be the corresponding sequence of cycles visited by a simple path between the endpoints of ℓ. Consider any cycle C not in the above list. Then trivially any pair of edges in C is covered by links in S(A) \ S(ℓ). Therefore it is sufficient to consider pairs of edges e_1, e_2 belonging to the same cycle C_i. Let ℓ_i = (w_i, w_{i+1}) be the internal projection of ℓ with both endpoints in C_i, and define similarly S_i(A) w.r.t. S(A). Then it is sufficient to show that S_i(A) is a feasible solution for the CycAP instance induced by C_i if and only if S_i(A) \ ℓ_i is a feasible solution for the instance induced by C_i|ℓ_i, which follows from Lemma 2.

Approximation Algorithms for Cycle Augmentation
In this section we present improved approximation algorithms for CycAP. We start with a simple 5/3-approximation to illustrate the main ideas, and then present a slightly more complex (3/2 + ε)-approximation. The approach we will follow in both cases is as follows: in a first phase we iteratively add a properly chosen subset of a few links to the solution under construction, and then contract them. Notice that, after the first contraction, the cycle structure may be lost and we obtain a CacAP instance instead. These choices are designed so that, at the end of the first phase, the remaining CacAP instance can be solved efficiently, which is done in a second phase with an ad-hoc algorithm.

A 5/3-Approximation
We next describe a simple greedy algorithm that provides a 5/3-approximation for CycAP, which we refer to as the CROSSING-FIRST algorithm. In order to present the algorithm clearly, we need the following definitions.

Definition 3 A link ℓ = (u, v) of a CacAP instance is internal if both its endpoints belong to a common cycle, and external otherwise.

Definition 4 Given a CacAP instance, a pair of internal links ℓ_1 = (u_1, v_1) and ℓ_2 = (u_2, v_2) with all endpoints in a common cycle C is crossing if they are node disjoint and deleting u_2 and v_2 disconnects u_1 from v_1 in C.
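For a single cycle, the crossing condition of Definition 4 reduces to a simple test on cyclic positions. The sketch below assumes (for illustration only) that the nodes of the cycle C are labelled by integers in cyclic order and that links are pairs of such labels; the function name `is_crossing` is hypothetical.

```python
def is_crossing(link1, link2):
    """Check Definition 4 for two links with all endpoints on one cycle.

    Nodes are assumed to be integers labelled in cyclic order. Deleting
    the endpoints of link2 splits the cycle into two arcs; the pair is
    crossing iff the endpoints of link1 fall into different arcs.
    """
    (u1, v1), (u2, v2) = link1, link2
    if len({u1, v1, u2, v2}) < 4:
        return False  # the links share a node, so they are not node disjoint
    low, high = min(u2, v2), max(u2, v2)

    def strictly_inside(x):
        return low < x < high

    return strictly_inside(u1) != strictly_inside(v1)
```

With this labelling, the links (1, 3) and (2, 4) are crossing, while nested or side-by-side links such as (0, 3), (1, 2) or (0, 2), (3, 5) are not.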
The kinds of links that we want to add in the first stage of the algorithm are external links and crossing pairs of links. In more detail, the algorithm has two main stages. The first stage consists of a set of rounds, where in each round we first check if there exists an external link ℓ, in which case we add it to our solution, contract it and proceed to the next round. Otherwise, if there exists a pair of (internal) crossing links ℓ and ℓ', we add them to our solution, contract them and proceed to the next round. If neither of the two cases above applies, we are left with a CacAP instance with neither external links nor crossing pairs of links, which we address in the second stage of the algorithm. As the following lemma states, in the second stage we can efficiently compute the optimal solution.

Lemma 3 Consider an instance (G = (V, E), L) of CacAP. If there are no external links and no crossing pairs of links, then every minimal solution has size exactly |V| − 1 and induces a spanning tree over V.
Proof We prove the first part of the claim by induction on n = |V|. The base case n = 2 is trivial since in this case the instance is just a cycle consisting of two parallel edges and any link must be incident to the two nodes of G (hence defining a feasible solution). For the inductive case, assume the claim is true for instances having up to n − 1 nodes, and consider an instance of the problem defined by a cactus G having n nodes with optimal solution OPT. If G is not a cycle of length n, then it is defined by a set of cycles of length at most n − 1 where every link is internal, so we can apply the inductive hypothesis to each cycle independently. If G is a cycle of n nodes, then let ℓ = (u, v) ∈ OPT. Contracting ℓ leads to a CacAP instance on two cycles C_1 and C_2 sharing a common node w, with |V(C_1)| + |V(C_2)| = n. Let OPT' be the optimal solution for the new instance. By Lemma 1, |OPT| = |OPT'| + 1. Observe that any remaining link ℓ' must have both endpoints in the same C_i (otherwise ℓ and ℓ' would be crossing). Thus by the inductive hypothesis the optimum solution for the problem induced by C_i has size |V(C_i)| − 1. It then follows that |OPT'| = (|V(C_1)| − 1) + (|V(C_2)| − 1) = n − 2, and hence |OPT| = n − 1.
For the second part of the claim, it is sufficient to show that a minimal solution does not induce a cycle. By contradiction, consider a minimal solution containing a simple cycle L', and consider now a solution where we remove precisely one arbitrary link ℓ = (u, v) from L'. Consider any pair of edges e_1, e_2 belonging to the same cycle such that ℓ satisfies the {e_1, e_2}-cut. Since L' \ ℓ induces a simple u-v path, some ℓ' ∈ L' \ ℓ must satisfy the cut. Thus the solution obtained by removing ℓ is still feasible, contradicting minimality.
Now we proceed to prove the approximation guarantee of the algorithm.

Theorem 2
The CROSSING-FIRST algorithm is a 5/3-approximation for CycAP.
Proof Let OPT be the optimal solution and APX the computed solution. Let also n' be the number of nodes remaining at the end of the first stage, and APX' (resp. APX'') be the set of links added to the solution during the first (resp. second) stage.
We complement this result with an asymptotically matching lower bound.

Lemma 4
The approximation ratio of the CROSSING-FIRST algorithm is not better than 5/3.
Proof Consider the following construction: for each k ≥ 2 consider an instance (G_k, L_k) of CycAP defined by a cycle of n = 6k nodes (assume that the cycle is defined by the order of the nodes v_1, v_2, ..., v_{6k}) and the following set of links (see Fig. 1 (Left)):
Notice that the first and second set of links define a feasible solution of size n/2, hence being optimal: if we remove any two edges of the cycle, then we are either satisfying the corresponding cut via (v_1, v_{n/2+1}), or one side of the partition is contained in either {v_2, ..., v_{n/2}} or in {v_{n/2+2}, ..., v_n}, but the links selected form a matching between those sets.
We will now prove that there exists a sequence of choices performed by our algorithm that outputs a solution of size 5n/6 − 1, which implies that the approximation ratio is at least 5/3 − 2/n, and this value approaches 5/3 as k goes to infinity. Notice first that the pair of links {(v_1, v_3), (v_2, v_4)} ⊆ L_k is crossing, and hence the algorithm can include them in the solution in the first round (and finish the round). Furthermore, after these links are contracted no link becomes external, as the new cactus instance consists of a cycle of length n − 3, and also the links with endpoints v_n, v_{n−1} and v_{n−2} are not part of any pair of crossing links (see Fig. 1 (Right)). If we now iteratively pick all the n/6 pairs of crossing links, after n/6 rounds we end up with a cycle of length n/2 without crossing links, and the algorithm must now take the remaining n/2 − 1 links to complete the solution. Thus, the size of the computed solution is 2 · n/6 + n/2 − 1 = 5n/6 − 1, proving the claim.
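The arithmetic behind this bound can be written out in one line. Here APX and OPT denote the computed and the optimal solution, as in the proof of Theorem 2, and n = 6k:

```latex
\[
\frac{|\mathrm{APX}|}{|\mathrm{OPT}|}
  = \frac{2\cdot\frac{n}{6} + \frac{n}{2} - 1}{\frac{n}{2}}
  = \frac{\frac{5n}{6} - 1}{\frac{n}{2}}
  = \frac{5}{3} - \frac{2}{n}
  \;\longrightarrow\; \frac{5}{3}
  \quad \text{as } k \to \infty .
\]
```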

A (3/2 + ε)-Approximation
The family of instances from Lemma 4 suggests that "short" crossing pairs of links, although being locally profitable, may force the algorithm to take expensive decisions in the end. In this section we present a more involved (3/2 + ε)-approximation for CycAP that tries to avoid this kind of situation. As in the previous algorithm, there is a certain kind of links that we want to iteratively add to our solution in a first phase; in this case such links are external links and long links, which are defined as follows.

Definition 5
The length of an internal link (u, v) is the length of the shortest path between u and v in the corresponding cycle. For a given parameter 0 < ε < 1, an internal link is called long if its length is at least 1/ε, and short otherwise.
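Definition 5 translates directly into code for the single-cycle case. In this sketch the nodes of a cycle with n nodes are assumed to be labelled 0..n−1 in cyclic order, and ε is passed as an exact `Fraction` to avoid floating-point rounding in the comparison with 1/ε; the names `link_length` and `is_long` are illustrative only.

```python
from fractions import Fraction


def link_length(link, n):
    """Length of a link on a cycle with nodes 0..n-1: the shorter of the two arcs."""
    u, v = link
    d = (v - u) % n
    return min(d, n - d)


def is_long(link, n, eps):
    """A link is long if its length is at least 1/eps, and short otherwise."""
    return link_length(link, n) >= 1 / Fraction(eps)
```

For example, on a cycle with 10 nodes and ε = 1/4, the link (0, 5) has length 5 and is long, while (0, 7) has length 3 and is short.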
Our algorithm consists of the following two main phases. In the first phase, we iteratively check if there exists a long (internal) link ℓ. Otherwise, we check if there exists an external link ℓ. In both cases, we add ℓ to the solution under construction and contract it. Observe that contracting links does not create new long links, hence we will first select a set L_long of long links, and then a set L_ext of external links. After exhausting the previous choices, we move to the second phase. Here we are left with an instance where all links are short and internal, so we can solve independently the sub-instance induced by each cycle. We refer to this algorithm as LONG-FIRST. This second stage can be solved efficiently, due to the lack of long links, by means of the following lemma.

Lemma 5 Given a CycAP instance, there exists an algorithm that returns the optimal solution in time poly(n) · 2^{O(h_max^2)}, where h_max is the maximum length among the links.
Let L_short be the collection of links obtained in the second stage. The final solution is L_long ∪ L_ext ∪ L_short.
Proof The running time of the algorithm is upper-bounded by poly(n) · 2^{O(1/ε^2)}. Consider next the approximation factor. Note first that |L_long| ≤ εn. Indeed, contracting a long link always increases the number of cycles in the cactus by one without decreasing the number of edges, and all these cycles always have size at least 1/ε, so there are at most εn of them. Similarly to Theorem 2, we have that |OPT| ≥ |L_short| and |OPT| ≥ n/2. If |L_long| + |L_ext| + |L_short| ≤ (3 + 2ε)n/4 then we already have a (3/2 + ε)-approximation as |OPT| ≥ n/2. Otherwise, since the contraction of each external link reduces the number of nodes by at least 2 and the contraction of any other link reduces the number of nodes by at least 1, we have that |L_long| + 2|L_ext| + |L_short| ≤ n. So |L_ext| ≤ n − (3 + 2ε)n/4 = (1 − 2ε)n/4 and hence |L_ext| + |L_long| ≤ (n + 2εn)/4 ≤ (1/2 + ε)|OPT|. Since |OPT| ≥ |L_short|, we have that in this case the size of the solution is also at most (3/2 + ε)|OPT|, concluding the proof.
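The second case of the proof can be spelled out as a short chain of inequalities. The following display only restates the steps above, using the bounds |L_long| ≤ εn, |OPT| ≥ n/2 and |OPT| ≥ |L_short| from the proof:

```latex
\begin{align*}
|L_{\mathrm{ext}}|
  &\le n - \bigl(|L_{\mathrm{long}}| + |L_{\mathrm{ext}}| + |L_{\mathrm{short}}|\bigr)
   < n - \frac{(3+2\varepsilon)n}{4} = \frac{(1-2\varepsilon)n}{4},\\
|L_{\mathrm{ext}}| + |L_{\mathrm{long}}|
  &\le \frac{(1-2\varepsilon)n}{4} + \varepsilon n
   = \frac{(1+2\varepsilon)n}{4}
   \le \Bigl(\tfrac{1}{2}+\varepsilon\Bigr)|\mathrm{OPT}|,\\
|L_{\mathrm{long}}| + |L_{\mathrm{ext}}| + |L_{\mathrm{short}}|
  &\le \Bigl(\tfrac{1}{2}+\varepsilon\Bigr)|\mathrm{OPT}| + |\mathrm{OPT}|
   = \Bigl(\tfrac{3}{2}+\varepsilon\Bigr)|\mathrm{OPT}|.
\end{align*}
```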
Remark 1 By replacing ε with 1/ log n in the above construction, we can obtain a slightly improved approximation factor of 3/2 + o(1) which still runs in polynomial time.
It remains to prove Lemma 5. To do this, we need some more notation. Given a link ℓ = (u, v), we say that the edges of the shortest path between u and v in the cycle are covered by ℓ (in case of multiple shortest paths we choose the one going from u to v in counter-clockwise order along the cycle). Given an edge e of the cycle, we define the cut-neighborhood of e, namely N(e), as the 2h_max − 1 edges that are closest to e, e included. We also define N_L(e) as the set of links in L covering at least one edge from N(e).
Notice that in any feasible solution to a CycAP instance, at most one edge of the cycle is not covered: if this is not the case, then the cut defined by two uncovered edges is not satisfied, as any link satisfying the cut would cover one of these two edges. We can use this observation to characterize the feasibility of a solution in terms of the cut-neighborhoods.

Lemma 6 Consider a CycAP instance and let A be a set of links such that every edge of the cycle is covered by some link in A. Then A is feasible if and only if, for every edge e and every e' ∈ N(e), the {e, e'}-cut is satisfied by A.
Proof If A is feasible then the required properties are clearly satisfied since every cut is satisfied. On the other hand, suppose that every edge is covered by some link in A and the {e, e'}-cuts are satisfied for every edge e and every e' ∈ N(e). Consider a pair of edges {e, e'} such that e' ∉ N(e). By definition of N(e) there is no link in A covering both edges at the same time, and as e is covered by some link, this link satisfies the {e, e'}-cut. This implies that A is feasible as every cut is satisfied.
This lemma is useful as it implies that, given an edge e and a set of links S, we can optimally complete S in order to satisfy every {e, e'}-cut in time 2^{O(h_max^2)} just by guessing the subset of links from N_L(e) that must be added, which are only O(h_max^2) many. Now we proceed to present the proof.
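The notions of covered edges, cut-neighborhood and the local feasibility test of Lemma 6 can be sketched as follows. The encoding is an assumption made for illustration: nodes are 0..n−1 in counter-clockwise order, edge e_i joins nodes i and (i+1) mod n, and a link (u, v) satisfies the {e_i, e_j}-cut iff exactly one of its endpoints lies in {i+1, ..., j}. All function names are hypothetical.

```python
from itertools import combinations


def covered_edges(link, n):
    """Edges on the shortest u-v path; ties are broken by walking from u to v."""
    u, v = link
    d = (v - u) % n
    if d <= n - d:
        return {(u + t) % n for t in range(d)}
    return {(v + t) % n for t in range(n - d)}


def cut_neighborhood(e, h_max, n):
    """N(e): the 2*h_max - 1 edges closest to edge e, with e included."""
    return {(e + t) % n for t in range(-(h_max - 1), h_max)}


def satisfies(link, i, j):
    """True iff the link separates the two arcs left after deleting e_i and e_j."""
    i, j = min(i, j), max(i, j)
    u, v = link
    return (i < u <= j) != (i < v <= j)


def all_edges_covered(A, n):
    return set().union(*(covered_edges(l, n) for l in A)) == set(range(n))


def is_feasible(A, n):
    """Global test: every 2-edge cut is satisfied by some link of A."""
    return all(any(satisfies(l, i, j) for l in A)
               for i, j in combinations(range(n), 2))


def is_locally_feasible(A, n, h_max):
    """Test of Lemma 6: full coverage plus all cuts inside cut-neighborhoods."""
    if not all_edges_covered(A, n):
        return False
    return all(any(satisfies(l, e, f) for l in A)
               for e in range(n)
               for f in cut_neighborhood(e, h_max, n) if f != e)
```

The local test inspects only O(n · h_max) cuts instead of all pairs of edges, which is what makes guessing links from N_L(e) sufficient in the dynamic program.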
Proof of Lemma 5 Let us assume that we deal with instances of CycAP such that there exists an optimal solution where every edge is covered by some link. If this is not the case, as there may be only one uncovered edge, we can guess this edge and contract it; this leads to an equivalent instance of the problem where we can require that the optimum solution covers all the edges. We say that an edge e is satisfied by a set of links A if it is covered by some link in A and furthermore every {e, e'}-cut is satisfied by A. In particular A is a feasible solution for the problem iff it satisfies all the edges.
We next design a dynamic programming algorithm to compute a minimum cardinality feasible solution. Let us name the nodes v_1, v_2, ..., v_n in counter-clockwise order starting from some arbitrary node v_1, and let the edges be e_i = (v_i, v_{i+1}) for each i = 1, ..., n (assuming v_{n+1} = v_1).
The dynamic program runs in time poly(n) · 2^{O(h_max^2)}, plus an extra factor n from the initial guessing of an uncovered edge (that is contracted).
We complement Theorem 3 with an asymptotically matching lower bound.

Lemma 7
The approximation ratio of the LONG-FIRST algorithm is at least 3/2.
Proof Consider the following construction: for each k > 1/(2ε) consider an instance (G_k, L_k) of CycAP defined by a cycle of n = 4k nodes (assume that the cycle is defined by the order of the nodes v_1, v_2, ..., v_{4k}) and the following set of links (see Fig. 3 (Left)):
As argued in Lemma 4, the first and second set of links define an optimal solution of size n/2. We will now prove that there exists a sequence of choices performed by our algorithm that outputs a solution of size 3n/4 − 1, which implies that the approximation ratio is at least 3/2 − 2/n, and this value approaches 3/2 as k goes to infinity. Notice first that the link (v_{n/4+1}, v_{3n/4+1}) ∈ L_k has length 2k > 1/ε and hence it is long, so the first stage of the algorithm can include it in the solution. After doing that, the second and third set of links become external and thus the algorithm will include them in the solution. Once all these links are included and contracted, we get a cactus consisting of two cycles of n/4 nodes each and without crossing links (see Fig. 3 (Right)). Hence, the algorithm must pick all the remaining links to complete the solution. The size of this solution is then n/4 + 1 + 2(n/4 − 1) = 3n/4 − 1.

LP Relaxations for CycAP
We start by lower-bounding the integrality gap of the standard cut LP for CycAP.

Lemma 8 The standard cut LP for CycAP has integrality gap at least 2.
Proof Consider a cycle of size k and, for each edge, a parallel link. The optimum integral solution has size k − 1, while setting each variable to 1/2 gives a feasible fractional solution of cost k/2. The ratio (k − 1)/(k/2) = 2 − 2/k tends to 2 as k grows.
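The instance from this proof is small enough to verify by brute force. The sketch below makes the same encoding assumptions as the rest of these examples (nodes 0..k−1, edge e_i joining i and (i+1) mod k, a link satisfying the {e_i, e_j}-cut iff exactly one endpoint lies in {i+1, ..., j}); the function names are illustrative, and the exhaustive search is only meant for tiny k.

```python
from fractions import Fraction
from itertools import combinations


def satisfies(link, i, j):
    """True iff link (u, v) satisfies the 2-edge cut {e_i, e_j} with i < j."""
    u, v = link
    return (i < u <= j) != (i < v <= j)


def min_solution_size(links, n):
    """Smallest number of links satisfying every 2-edge cut (exhaustive search)."""
    cuts = list(combinations(range(n), 2))
    for size in range(len(links) + 1):
        for chosen in combinations(links, size):
            if all(any(satisfies(l, i, j) for l in chosen) for i, j in cuts):
                return size
    return None  # the instance admits no feasible solution


def lemma8_instance(k):
    """Cycle with k nodes and one link parallel to each cycle edge."""
    return [(i, (i + 1) % k) for i in range(k)]


def half_solution_is_feasible(links, n):
    """Check that x = 1/2 on every link satisfies all standard cut constraints."""
    half = Fraction(1, 2)
    return all(sum(half for l in links if satisfies(l, i, j)) >= 1
               for i, j in combinations(range(n), 2))


def integrality_ratio(k):
    """Integral optimum divided by the cost k/2 of the all-1/2 solution."""
    links = lemma8_instance(k)
    return Fraction(min_solution_size(links, k)) / (Fraction(1, 2) * len(links))
```

Each cut {e_i, e_j} is satisfied by exactly the two links parallel to e_i and e_j, which is why the all-1/2 vector is feasible while any integral solution may omit at most one link.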
This shows that the standard cut LP is not strong enough even for instances with neither crossing pairs nor long links, cases that we can handle optimally via combinatorial algorithms. We next present a stronger LP that exploits a more general set of constraints.
Let (G = (V, E), L) be a CycAP instance and S ⊆ E. We define the S-reduced instance (G_S, L_S) as follows: we contract the edges of E \ S, obtaining a cycle with |S| edges which defines G_S, and the set of links L_S corresponds to the images of L. Notice that there is a one-to-one relation between L_S and the links in L which satisfy some cut defined by a pair of edges from S. We denote by OPT_S the optimal solution for the instance (G_S, L_S). The following lemma characterizes the feasibility of a solution.

Lemma 9 Given an instance (G, L) of CycAP, a solution A ⊆ L is feasible if and only if |A ∩ L_S| ≥ |OPT_S| for every S ⊆ E.
Proof Suppose that there exists S ⊆ E such that |A ∩ L_S| < |OPT_S|. This means that A ∩ L_S is not a feasible solution for (G_S, L_S) and hence there exist two edges e_i, e_j ∈ S such that no link in A ∩ L_S satisfies the {e_i, e_j}-cut. As the remaining links in A \ L_S also do not satisfy the cut by definition, this cut remains unsatisfied in the original instance, implying that A is not feasible.
On the other hand, suppose that A satisfies the claimed property for every set S. If we consider just sets S consisting of two edges, this is exactly the characterization of feasibility shown in Observation 1, implying that A is feasible.
This implies that we can add the constraint Σ_{ℓ∈L_S} x_ℓ ≥ |OPT_S| for every S ⊆ E. Unfortunately there is an exponential number of such constraints and most of them require computing |OPT_S| for large instances. However, if we restrict ourselves to sets of edges having constant size, we get an LP formulation with polynomially many constraints that can be written in polynomial time. We call this LP the k-edge-cut LP for a given constant k ∈ N, which is similar in spirit to the bundle-LP for TAP introduced by Adjiashvili [1].
Notice that for k = 2 this is exactly the standard cut LP. Now we will prove some properties of this relaxation and bound its integrality gap.
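To make the k-edge-cut constraints concrete, the following sketch enumerates them for a small CycAP instance by building each S-reduced instance and computing |OPT_S| exhaustively. It assumes a feasible instance and the encoding used for illustration throughout (nodes 0..n−1, edge e_i joining i and (i+1) mod n); the function names and the brute-force solver are assumptions, not the paper's procedure.

```python
from itertools import combinations


def satisfies(link, i, j):
    """True iff link (u, v) satisfies the 2-edge cut {e_i, e_j} with i < j."""
    u, v = link
    return (i < u <= j) != (i < v <= j)


def min_solution_size(links, n):
    """Optimum of a tiny CycAP instance by exhaustive search (None if infeasible)."""
    cuts = list(combinations(range(n), 2))
    for size in range(len(links) + 1):
        for chosen in combinations(links, size):
            if all(any(satisfies(l, i, j) for l in chosen) for i, j in cuts):
                return size
    return None


def k_edge_cut_constraints(n, links, k):
    """Return triples (S, support, rhs) meaning: sum of x over `support` >= rhs.

    S ranges over edge subsets with 2 <= |S| <= k, `support` lists the
    positions in `links` of the links in L_S, and rhs = |OPT_S|.
    """
    constraints = []
    for m in range(2, k + 1):
        for S in combinations(range(n), m):
            def group(x, S=S, m=m):
                # Node of the reduced cycle that x is contracted into.
                return sum(1 for s in S if s < x) % m

            # Links whose endpoints are merged satisfy no cut inside S.
            support = [idx for idx, (u, v) in enumerate(links)
                       if group(u) != group(v)]
            images = [(group(links[idx][0]), group(links[idx][1]))
                      for idx in support]
            constraints.append((S, support, min_solution_size(images, m)))
    return constraints
```

On the instance of Lemma 8 with four nodes, the constraints for |S| = 2 have right-hand side 1 (the standard cut LP), while those for |S| = 3 have right-hand side 2 over three links, so they are violated by the all-1/2 vector.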

Lemma 10
Given ε > 0, for k = 1 ε 2 the k-edge-cut LP restricted to instances with links of length at most 1 ε has integrality gap at most (1 + 2ε).
Proof We will assume w.l.o.g. that the set of links $L$ contains every possible link of length 1. If this is not the case, let us include them, obtaining a new set of links $L' \supseteq L$. The optimal LP value can only decrease while the size of the optimal solution cannot decrease, implying that the integrality gap can only increase due to this operation. To see this last fact, assume by contradiction that there exists a solution $OPT'$ for the new instance having strictly smaller size than $OPT$. Consider now a solution $S$ consisting of $OPT' \cap L$ plus a minimal set of links from $L$ that makes $S$ feasible (this is possible since the instance admits a feasible solution). If we iteratively contract, in parallel, the common links in $S$ and $OPT'$, we arrive at the same CacAP instance, but now the remaining links from $OPT'$ have length 1 and the contraction of each of them reduces the number of nodes in the instance by exactly one, while the contraction of each remaining link in $S$ reduces the number of nodes by at least one. Thus $|S| \le |OPT'| < |OPT|$, which is not possible since $S \subseteq L$.
Let $X = (x_\ell)_{\ell \in L}$ be an optimal solution for the $k$-edge-cut LP. We will construct an integral feasible solution of size at most $(1 + 2\varepsilon) \sum_{\ell \in L} x_\ell$. To do so, we partition the cycle into disjoint intervals as follows: we first define an interval of size $k$ (which we call a long interval) and then an interval of size $1/\varepsilon$ (which we call a short interval), and we continue with this procedure until it is not possible to continue. If in the end there are at most $1/\varepsilon$ edges left, we define a last short interval consisting of these remaining edges; otherwise we define a short interval consisting of the last $1/\varepsilon$ edges and a long interval consisting of the remaining edges (which will have size at most $k$). The number of short intervals is upper bounded by $1 + \frac{n}{1/\varepsilon^2 + 1/\varepsilon} \le 1 + \frac{\varepsilon^2 n}{1 + \varepsilon} \le \varepsilon^2 n$, assuming w.l.o.g. that $n$ is lower bounded by a large enough constant.
Notice that $\sum_{\ell \in L} x_\ell \ge n/2$ by a simple averaging argument over the $n$ constraints corresponding to all the pairs of consecutive edges: every link appears in exactly two such constraints and the right-hand side of each constraint is 1. Since the total number of links of length 1 having both endpoints in a short interval is at most $\varepsilon^2 n \cdot \frac{1}{\varepsilon} = \varepsilon n \le 2\varepsilon \sum_{\ell \in L} x_\ell$, we can add them to our solution at a negligible cost. Consider now the set of long intervals $S_1, S_2, \dots, S_T$. Notice that no link has endpoints in different long intervals, and hence the LP constraints associated to such intervals do not share common variables. This implies that $\sum_{i=1}^{T} |OPT_{S_i}| \le \sum_{\ell \in L} x_\ell$.
Our feasible solution consists of all the links of length 1 with both endpoints in a short interval plus the optimal solutions $OPT_{S_i}$ for each long interval $S_i$. As argued before, the size of this solution is at most $(1 + 2\varepsilon) \sum_{\ell \in L} x_\ell$, and the feasibility of the solution follows since every $\{e, e'\}$-cut where $e$ is in a short interval is satisfied by a link of length 1, while the remaining cuts are satisfied by the links computed optimally.
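The interval partition used in this proof can be sketched as follows. The function name, the half-open index convention, and the parameter `s` standing for the short-interval size $1/\varepsilon$ are assumptions made for illustration; edges are numbered 0, ..., n-1 along the cycle.

```python
def partition_cycle(n, k, s):
    """Split edges 0..n-1 into alternating long (size k) and short (size s) intervals.

    Each interval is returned as (kind, start, end) with edges start..end-1.
    """
    intervals, pos = [], 0
    # Alternate long and short intervals while a full pair still fits.
    while n - pos >= k + s:
        intervals.append(("long", pos, pos + k))
        intervals.append(("short", pos + k, pos + k + s))
        pos += k + s
    rest = n - pos
    if 0 < rest <= s:
        # Few edges left: they form one last short interval.
        intervals.append(("short", pos, n))
    elif rest > s:
        # Otherwise: a long interval of size below k, then a final short interval.
        intervals.append(("long", pos, n - s))
        intervals.append(("short", n - s, n))
    return intervals
```

For example, with `k = 9` and `s = 3` (that is, $\varepsilon = 1/3$), a cycle with 20 edges is split into a long interval of 9 edges, a short one of 3, a long one of 5 and a final short one of 3.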

Lemma 11
Given $\varepsilon > 0$, for $k = 1/\varepsilon^2$ the $k$-edge-cut formulation has integrality gap at most $(1 + 4\varepsilon)$ restricted to instances without crossing pairs of links.
Proof Let $X = (x_\ell)_{\ell \in L}$ be an optimal solution for the $k$-edge-cut LP. Suppose that the instance does not contain links of length at least $1/\varepsilon$; then the claim follows from Lemma 10. Otherwise, we pick any link of length at least $1/\varepsilon$ and contract it, obtaining a CacAP instance consisting of two cycles without external links (as there are no crossing links), both of size at least $1/\varepsilon$. If any cycle still contains some long link, we iterate this procedure. Let $L_{long}$ be the set of long links we picked during this procedure and $C_1, C_2, \dots, C_T$ be the set of cycles at the end. By the same argument as in Theorem 3, we have that $|L_{long}| \le \varepsilon n \le 2\varepsilon \sum_{\ell \in L} x_\ell$. Applying Lemma 10 to each cycle, we obtain a feasible solution of size at most $|L_{long}| + (1 + 2\varepsilon) \sum_{i=1}^{T} OPT_{LP_i}$, where $LP_i$ is the $k$-edge-cut LP defined by each cycle $C_i$ and its internal links, and $OPT_{LP_i}$ is its optimal value. As there are no external links, the sum of these LP optima is the optimal value of the LP obtained by putting together the programs $LP_1, \dots, LP_T$. The set of constraints of this LP is a subset of the constraints of the original LP, as links in $L_{long}$ do not appear in these constraints, and the set of variables is a subset of the original one. Thus we have $\sum_{i=1}^{T} OPT_{LP_i} \le \sum_{\ell \in L} x_\ell$, and we can conclude that the constructed solution has size at most $2\varepsilon \sum_{\ell \in L} x_\ell + (1 + 2\varepsilon) \sum_{\ell \in L} x_\ell = (1 + 4\varepsilon) \sum_{\ell \in L} x_\ell$.
Following the proof of Theorem 3 plus the previous results we can get the following bound on the integrality gap for general instances of CycAP.
Proof Let $X = (x_\ell)_{\ell \in L}$ be an optimal solution for the $k$-edge-cut LP and consider the output of the $\left(\frac{3}{2} + \varepsilon\right)$-approximation from Section 3.2 decomposed into $L_{long}$, $L_{ext}$ and $L_{short}$ as in the proof of Theorem 3. As argued before, we know that $\sum_{\ell \in L} x_\ell \ge \frac{n}{2}$, and analogously to the proof of Lemma 11 we have that $|L_{short}| \le (1 + 2\varepsilon) \sum_{\ell \in L} x_\ell$. Hence essentially the same analysis as in Theorem 3 provides the same bound of $3/2 + O(\varepsilon)$ up to an extra $(1 + \varepsilon)$ factor.

Hardness of Approximation
In the following two sections we discuss the hardness of approximation for WCycAP and CycAP, respectively.

Hardness of Approximation for WCycAP
We now provide an approximation preserving reduction from WCacAP to WCycAP. Note that finding a better-than-2-approximation for WCacAP is at least as hard as finding such an approximation for WTAP, a big open problem in the area. Therefore our reduction shows that achieving a similar result for WCycAP is a very hard task as well.

Theorem 4 Given an instance A of WCacAP, it is possible to construct in polynomial time an instance B of WCycAP (whose only possibly new weight value is 0) such that any feasible solution to A can be mapped in polynomial time into a feasible solution to B of the same cost and vice versa.
To prove Theorem 4 we make use of the "inverse" of the contraction of a link, which we call an expansion. Consider a WCacAP instance with a node $v$ of degree greater than 2. An expansion of $v$ consists of taking two cycles containing $v$ and replacing them by the Eulerian tour that traverses them starting from $v$. Every node appears exactly once in this tour except for $v$, which appears twice and for which we create two copies: $v_1$, the starting node, and $v_2$, the intermediate one. The links originally incident to $v$ are replaced by links of the same cost incident to $v_1$, and we also add a link of cost zero between $v_1$ and $v_2$ (see Fig. 4 for an example). The two main properties of this procedure are: (1) the contraction of the link created by an expansion brings the graph back to its original state, and (2) $v$ is replaced by $v_1$ and $v_2$, which have degree $\deg(v) - 2$ and 2, respectively.
Proof of Theorem 4 At a high level, our proof works as follows. We will build in polynomial time a chain of WCacAP instances $(G_1, L_1), \dots, (G_k, L_k)$ with the following properties: (i) $(G_1, L_1)$ is the input instance and $G_k$ is a cycle; (ii) $(G_{i+1}, L_{i+1})$, $i = 1, \dots, k - 1$, is obtained from $(G_i, L_i)$ via precisely one expansion (so $G_{i+1}$ contains precisely one cycle less than $G_i$, and precisely one new link $\ell_{i+1}$ of cost zero); (iii) a feasible solution to $(G_i, L_i)$, $i = 1, \dots, k - 1$, can be turned in polynomial time into a feasible solution to $(G_{i+1}, L_{i+1})$ of the same cost and vice versa. The above properties together trivially imply the claim.
Given $(G_i, L_i)$, we proceed as follows. Consider any node $v \in G_i$ of degree at least 4, and let $C_1$ and $C_2$ be any two cycles incident to $v$ (that must exist). We apply an expansion to node $v$ w.r.t. $C_1$ and $C_2$, hence creating a new link $\ell_{i+1}$ of cost zero. Properties (i) and (ii) follow immediately by construction. Observe that $(G_i, L_i)$ can be obtained from $(G_{i+1}, L_{i+1})$ by contracting $\ell_{i+1}$. Hence property (iii) follows directly from Lemma 1. In more detail, given a feasible solution $A_{i+1}$ to $(G_{i+1}, L_{i+1})$, we first add $\ell_{i+1}$ to $A_{i+1}$ (that keeps the solution feasible, and does not change its cost). By Lemma 1, $A_i := A_{i+1} \setminus \{\ell_{i+1}\}$ is a feasible solution to $(G_i, L_i)$ of the same cost.
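A single expansion step can be illustrated with a small sketch. It only handles the two cycles being merged and the links of the instance; any further cycles through $v$ would stay attached to the first copy. The representation (cycles as node lists starting at $v$, links as a dictionary from node pairs to costs) and the copy labels `(v, 1)` and `(v, 2)` are assumptions made for this example.

```python
def expand(v, cycle1, cycle2, links):
    """Merge two cycles sharing node v into one cycle via an Eulerian tour.

    cycle1, cycle2: node lists, both starting at v.
    links: dict mapping a pair of nodes (a, b) to the cost of that link.
    Returns the merged cycle and the updated links, including the new
    zero-cost link between the two copies of v.
    """
    v1, v2 = (v, 1), (v, 2)  # hypothetical labels for the two copies of v
    # Tour: start at v1, traverse cycle1, pass through v2, traverse cycle2.
    merged = [v1] + cycle1[1:] + [v2] + cycle2[1:]

    def rename(a):
        # Links formerly incident to v are now incident to v1.
        return v1 if a == v else a

    new_links = {(rename(a), rename(b)): cost for (a, b), cost in links.items()}
    new_links[(v1, v2)] = 0
    return merged, new_links
```

Contracting the link `(v1, v2)` in the result identifies the two copies again and recovers the two original cycles, which is property (1) of the expansion.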

Hardness of Approximation for CycAP
In this section we prove that CycAP is APX-hard via a reduction from a restricted case of 3-Dimensional Matching (3DM). In the general version of 3DM we are given three disjoint sets W, X and Y having equal cardinality p and a set of m hyperedges H ⊆ W × X × Y . A (3D) matching is a subset M ⊆ H such that each element of W ∪ X ∪ Y belongs to at most one hyperedge in M, and this matching is perfect if |M| = p. Notice that in a perfect matching M each element of W ∪ X ∪ Y belongs to precisely one hyperedge. The goal is to determine whether a perfect matching exists. We will consider the special case 3DM-K, K ∈ N, where we add the constraint that each element from W ∪ X ∪ Y appears in at most K hyperedges. The following result will help us to conclude our final claim.
The proof of the following theorem is similar in spirit to the proof of NP-hardness for WTAP due to Frederickson and JáJá [15] and the extension presented by Kortsarz et al. [23]. In the first reduction the authors start from an instance $A$ of 3DM with $3p$ nodes and $m$ hyperedges, and build a WTAP instance $B$ such that $A$ has a feasible solution (with $p$ hyperedges) iff $B$ has a feasible solution with $p + m$ links. By duplicating the edges in $B$, one obtains a CacAP instance $C$ with exactly the same property over some cactus $G$. Our main idea is to turn $C$ into an instance $D$ of CycAP by constructing an Euler tour $G'$ out of $G$ and shortcutting some nodes. However, we need to carefully choose the ordering in the Euler tour in order to preserve a mapping between the feasible solutions of $C$ and $D$. By following the refined approach from the second reduction, we will show that it is hard to distinguish solutions with a gap depending on the maximum degree in the instance, and then use Theorem 5 to conclude the following result.

Theorem 6
For some fixed $\varepsilon > 0$, it is NP-hard to approximate CycAP within a factor $1 + \varepsilon$.

Construction of the Instance
- For each node $x_i \in X$ we define a node $x_i$;
- For each node $y_i \in Y$ we define a node $y_i$;
- Let $H(w_i)$ denote the hyperedges in $H$ containing $w_i \in W$. For each hyperedge $h \in H(w_i)$ we define two nodes, namely $h_X$ and $h_Y$ (hyperedge nodes). These nodes are added to the cycle in the following order. For each $i \in \{1, \dots, p\}$, we first add the nodes $h_X$ corresponding to hyperedges in $H(w_i)$ (in some arbitrary order) and then the corresponding nodes $h_Y$, respecting the same order used before. We denote the first set of nodes by $H_X(w_i)$, and the second set by $H_Y(w_i)$.
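The ordering of the hyperedge nodes along the cycle can be sketched as below. The encoding is an assumption for illustration: a hyperedge $(w_i, x_j, y_k)$ is given as the 1-based index triple `(i, j, k)`, and the two copies of a hyperedge are labelled `("hX", h)` and `("hY", h)`. The sketch covers only the hyperedge nodes, not the placement of the nodes for $X$ and $Y$.

```python
def hyperedge_node_order(p, hyperedges):
    """Return the cyclic order of the hyperedge nodes.

    hyperedges: list of (i, j, k) triples standing for (w_i, x_j, y_k).
    For each w_i the block H_X(w_i) comes first, followed by H_Y(w_i)
    listed in the same internal order.
    """
    order = []
    for i in range(1, p + 1):
        block = [h for h in hyperedges if h[0] == i]  # H(w_i), in input order
        order += [("hX", h) for h in block]           # H_X(w_i)
        order += [("hY", h) for h in block]           # H_Y(w_i), same order
    return order
```

Keeping the same internal order in both blocks is what places the two copies of every hyperedge of $H(w_i)$ at the same distance from each other along the cycle.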
Let us show that $A$ is a feasible solution. By Observation 1, it is sufficient to consider any pair of edges $\{e_1, e_2\}$ and show that there exists some link $\ell \in A$ satisfying the corresponding $\{e_1, e_2\}$-cut. Let us denote by $S'$ and $S''$ the sets of nodes induced by the cut. Let $H_X$ (resp., $H_Y$) be the collection of nodes of type $h_X$ (resp., $h_Y$). We make the following case distinction.
Suppose first that $e_1$ is incident to two nodes in $X$ or $e_1 = (x_p, y_1)$ (the case of $e_1$ being incident to two nodes in $Y$ is symmetric). We distinguish the following 3 subcases depending on $e_2$:
1. Suppose $e_2$ is incident to at least one node in $X \cup Y$. Then one of the sets in the cut, say $S'$, contains all the hyperedge nodes while $S''$ contains at least one node $z \in X \cup Y$. By construction each node in $X$ (resp., $Y$) is adjacent to some node in $H_X$ (resp., in $H_Y$). Thus this cut is satisfied.
2. Suppose $e_2$ is not incident to any node in $H_Y(w_p)$. Then one of the sets in the cut, say $S'$, contains $Y$ completely, while $S''$ contains $H_Y(w_p)$. By construction, for $h = (w_p, x, y) \in M$, $\ell = (h_Y, y) \in A$, hence this cut is satisfied.
3. Suppose $e_2$ is incident to some node in $H_Y(w_p)$. Then one of the sets in the cut, say $S'$, contains $H_X$ while the other set contains at least one node $x$ from $X$. Again by construction, for $h = (w, x, y) \in M$, $\ell = (h_X, x) \in A$. Hence this cut is satisfied.
Suppose on the other hand that $e_1$ and $e_2$ are incident to at least one hyperedge node. Notice that one of the sets in the cut, say $S'$, contains $X \cup Y$. We distinguish the following 2 subcases:
1. If $S''$ contains entirely $H_X(w)$ or $H_Y(w)$ for some $w \in W$, then for $h = (w, x, y) \in M$, $(h_X, x)$ or $(h_Y, y)$ is contained in $A$ and the cut is satisfied.
2. In the remaining case we prove that the following claim holds: there exists a hyperedge $h$ such that $h_X \in S'$ and $h_Y \in S''$.
Suppose by contradiction that for every hyperedge $h$ both $h_X$ and $h_Y$ belong to the same side of the considered cut. Let $w_i$ be such that either $H_X(w_i)$ or $H_Y(w_i)$ has non-empty intersection with both sides of the cut. Note that such a $w_i$ must exist, since otherwise there would exist $w_j$ such that $S''$ contains either $H_X(w_j)$ or $H_Y(w_j)$ completely, which was already covered by the previous case. Assume w.l.o.g. that $H_X(w_i) = \{h^1_X, \dots, h^q_X\}$ is the considered set, with elements sorted in counterclockwise direction. Since $h^1_X$ and $h^1_Y$ are on the same side of the partition and $H_X(w_i)$ is not fully contained in any side of the partition, it must hold that one set of the partition is properly contained in $H_X(w_i)$. Then any node inside that set has its copy on the other side of the partition. This contradicts the assumption.
Let $h$ be a hyperedge as in the previous claim. We are either adding to the solution the link that joins both copies of $h$ (i.e., the case when $h \notin M$), and the proof is finished, or we are adding the two links joining the two copies of $h$ to elements in $X$ and $Y$ (i.e., the case when $h \in M$). Since $X \cup Y$ is contained in $S'$ and both copies of $h$ are on different sides of the partition, one of the links satisfies the cut.
For any $z \in W \cup X \cup Y$ in a 3DM instance, let $\deg(z)$ be the number of hyperedges in $H$ containing $z$. Let also $\Delta$ denote the maximum degree of the instance, i.e., $\Delta = \max_{z \in W \cup X \cup Y} \deg(z)$. By following an approach analogous to the one from Kortsarz et al. [23], we can prove that even instances with a gap can be mapped.
Proof Let $A$ be a feasible solution to $(G, L)$ with $|A| \le (1 + \varepsilon)(p + m)$. Note that $G$ contains $2(p + m)$ nodes and the links must form an edge cover (otherwise the resulting graph would not be 3-edge-connected). Call a node permissible if it is adjacent to exactly one link in $A$ and impermissible otherwise. Let $V_{perm}$ and $V_{imperm}$ be the sets of permissible and impermissible nodes, respectively. We will first prove that the number of impermissible nodes is upper bounded by $2\varepsilon(p + m)$. In fact, if $\deg_A(v)$ denotes the number of links in $A$ incident to $v$, we have that $2|A| = \sum_{v \in V} \deg_A(v) = \sum_{v \in V_{perm}} \deg_A(v) + \sum_{v \in V_{imperm}} \deg_A(v) \ge |V_{perm}| + 2|V_{imperm}|$, where the last inequality comes from the fact that impermissible nodes are adjacent to at least two links. Since $|A| \le (1 + \varepsilon)(p + m)$ and $|V_{perm}| + |V_{imperm}| = 2(p + m)$, we can conclude the claim.
We will now compute a set $M'$ which is almost a matching. We initialize $M' = \emptyset$ and then, iteratively for $j = 1, \dots, p$, we try to add a hyperedge to $M'$ as follows: if $x_j$ is permissible, then it is adjacent to one node $h^{(j)}_X \in H_X$ (let us assume $h^{(j)}_X \in H_X(w_i)$); if both $h^{(j)}_X$ and its copy $h^{(j)}_Y \in H_Y(w_i)$ are permissible, then $h^{(j)}_Y$ is adjacent to one node $y_k$. If $y_k$ is permissible, then we add $(w_i, x_j, y_k)$ to $M'$. Notice that hyperedges added by this procedure are indeed in $H$ by construction. Our claim is that $|M'| \ge p - 2\Delta(p + m)\varepsilon$. Indeed, if $x_j$, $h^{(j)}_X$ or $h^{(j)}_Y$ is impermissible, then only one iteration fails (the one indexed by $j$). If $y_k$ is impermissible, then it can cause at most $\Delta$ iterations to fail, since it can be connected to at most $\Delta$ nodes in $H_Y$. If we denote by $n_y$ the number of impermissible nodes $y_k$ involved in the procedure, then the number of iterations that fail is at most $(2\varepsilon(p + m) - n_y) + n_y \Delta$. Since $n_y \le 2\varepsilon(p + m)$ (the total number of impermissible nodes), the number of iterations that fail is at most $2\Delta(p + m)\varepsilon$, proving the claim.
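The iterative procedure just described can be sketched in code. The node labels `("x", j)`, `("y", k)`, `("hX", h)` and `("hY", h)`, with `h` an index triple `(i, j, k)` standing for $(w_i, x_j, y_k)$, are an assumed encoding; the explicit checks on node types are a defensive addition of this sketch, since in the construction they hold automatically.

```python
from collections import defaultdict

def almost_matching(p, A):
    """Extract the almost-matching from a link set A, following the text.

    A: iterable of links, each a pair of node labels.
    Returns a list of triples (i, j, k) standing for hyperedges (w_i, x_j, y_k).
    """
    nbrs = defaultdict(list)
    for a, b in A:
        nbrs[a].append(b)
        nbrs[b].append(a)

    def permissible(v):
        # A node is permissible if exactly one link of A touches it.
        return len(nbrs[v]) == 1

    result = []
    for j in range(1, p + 1):
        x = ("x", j)
        if not permissible(x):
            continue
        hx = nbrs[x][0]
        if hx[0] != "hX":
            continue
        hy = ("hY", hx[1])  # the copy of the same hyperedge
        if not (permissible(hx) and permissible(hy)):
            continue
        y = nbrs[hy][0]
        if y[0] == "y" and permissible(y):
            i = hx[1][0]  # index of w_i in the hyperedge behind hx
            result.append((i, j, y[1]))
    return result
```

Each index `j` contributes at most one triple, so the returned triples use pairwise different elements of $X$; elements of $W$ may still repeat, which is why a clean-up step is needed afterwards.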
By construction, hyperedges in $M'$ have different elements from $X$ and $Y$, but elements from $W$ might be repeated. Thus, for every $w_i$ belonging to more than one hyperedge in $M'$, we remove from $M'$ all but one of such hyperedges, obtaining $M''$, which is now a matching. Let $\mu = p - |M''|$ be the number of vertices $w_i$ not appearing in any hyperedge of $M''$ (equivalently, of $M'$). Since $|M'| - |M''| \le p - |M''| = \mu$, we can find a lower bound on the size of $M''$ by bounding $\mu$ from above. We indeed claim that $\mu \le (2 + 8\Delta)(p + m)\varepsilon$.
Let $L'$ be the links in $L$ of the form (x_j, h
Consider on the other hand the $\mu$ nodes $w_i$ which are not intersected by hyperedges in $M'$. Since $A$ is a feasible solution, for each such $w_i$ there must be a link in $A$ connecting $H_X(w_i) \cup H_Y(w_i)$ and $X \cup Y$, because otherwise we could disconnect $H_X(w_i) \cup H_Y(w_i)$ from the rest of the graph by removing the two edges in the boundary of $H_X(w_i) \cup H_Y(w_i)$, contradicting the feasibility of $A$. Notice that these $\mu$ links are part of $A \setminus L'$. Furthermore, since $A$ is an edge cover, the remaining $2m - 2p - \mu$ nodes in $H_X \cup H_Y$ untouched by $L'$ plus the $\mu$ aforementioned links must be incident to some link in $A$, implying that
Combining both inequalities we get that $\mu \le (2 + 8\Delta)(p + m)\varepsilon$, and hence we conclude that the size of $M''$ is at least $|M'| - \mu \ge p - 2\Delta(p + m)\varepsilon - (2 + 8\Delta)(p + m)\varepsilon = p - (2 + 10\Delta)(p + m)\varepsilon$, completing the proof.
We can now use Lemmas 12 and 13 together with Theorem 5 to conclude the proof of Theorem 6. Notice that in 3DM-5, since Δ = 5, we have that m = |H | ≤ 5|W | = 5p.
Proof of Theorem 6 We will show that the reduction presented above is gap-preserving. Specifically, we will show that if $H$ is an instance of 3DM-5 and $(G, L)$ is the corresponding CycAP instance, then:
1. If $H$ admits a matching of size $p$, then $(G, L)$ admits a feasible solution of size $p + m$;
2. If $H$ does not admit a matching of size at least $p(1 - \varepsilon_0)$, then $(G, L)$ does not admit a feasible solution of size at most $(p + m)\left(1 + \frac{\varepsilon_0}{312}\right)$.
The first statement follows directly from Lemma 12, while the second is the contrapositive of Lemma 13 when setting $\varepsilon = \frac{\varepsilon_0}{312}$, as in this case we have that $p - (2 + 10\Delta)(5p + p)\varepsilon = p(1 - 312\varepsilon) = p(1 - \varepsilon_0)$.
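The constant 312 in this proof comes from plugging $\Delta = 5$ and $m \le 5p$ into the bound $p - (2 + 10\Delta)(p + m)\varepsilon$. The short check below reproduces that arithmetic with exact rationals; the function name and the sample values of $p$ and $\varepsilon_0$ are chosen only for illustration.

```python
from fractions import Fraction

DELTA = 5  # maximum degree in 3DM-5

def matching_lower_bound(p, m, eps, delta=DELTA):
    """Bound p - (2 + 10*delta) * (p + m) * eps on the matching size."""
    return p - (2 + 10 * delta) * (p + m) * eps

# The coefficient of p*eps when m = 5p: (2 + 10*5) * (1 + 5).
coefficient = (2 + 10 * DELTA) * 6

p = 7                      # sample instance size (any positive value works)
eps0 = Fraction(1, 10)     # sample gap parameter
eps = eps0 / coefficient   # the choice eps = eps0 / 312 made in the proof
bound = matching_lower_bound(p, 5 * p, eps)
```

Here `coefficient` evaluates to 312, and `bound` equals `p * (1 - eps0)`, matching the last identity in the proof.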