1 Introduction

Survivable network design deals with the design of low-cost networks that are resilient to the failure of some of their elements (such as nodes or links), and finds application in various settings, e.g., telecommunications and transportation. A fundamental problem in this area is Connectivity Augmentation, which asks to cheaply increase the connectivity of a given network from a value k to \(k+1\). The most classical setting considers edge-connectivity: here the input consists of a k-edge-connected graph G and a set L of extra links, and the goal is to select the smallest subset \(L' \subseteq L\) of links such that if we add \(L'\) to the graph G, then its connectivity increases by 1, i.e., it becomes \((k+1)\)-edge-connected. This Edge-Connectivity Augmentation problem has a long history. It was observed long ago (see Dinitz et al. [13], as well as Cheriyan et al. [9] and Khuller and Thurimella [21]) that there is an approximation-preserving reduction from the Edge-Connectivity Augmentation problem for an arbitrary k to the case where \(k = 1\), if k is odd, and to the case where \(k = 2\), if k is even. When \(k = 1\), the problem is known as the Tree Augmentation problem (TAP), since the input graph in that case can be assumed to be a tree without loss of generality, while the case of \(k = 2\) is called the Cactus Augmentation problem (CacAP), since the input graph in that case can be assumed to be a cactus graph. It is easy to see that TAP is a special case of CacAP. This, combined with the aforementioned reduction, implies that an \(\alpha \)-approximation algorithm for CacAP yields an \(\alpha \)-approximation algorithm for the general Edge-Connectivity Augmentation problem. Both TAP and CacAP admit simple 2-approximation algorithms [18, 20, 21]. Approximation algorithms with approximation ratio better than 2 have been discovered for TAP in a long line of research spanning three decades [1, 6–8, 10, 12, 14, 16, 17, 19, 21, 23–26], and more recently for CacAP [4, 6, 29].

The above problems naturally extend to their node-connected variants. However, approximation results on Node-Connectivity Augmentation are more scarce, even for the most basic generalization, known as the Block-Tree Augmentation problem (Block-TAP), which is the direct extension of TAP to the node-connected case. We now define it formally.

Definition 1

(Block-TAP) Let \(T=(V, E)\) be a tree, and \(L \subseteq \binom{V}{2}\). The goal is to compute a minimum cardinality set \(L'\subseteq L\) such that \(G = (V, E \cup L')\) is a 2-node-connected graph.

Similar to CacAP, a 2-approximation algorithm for Block-TAP has been known for quite some time [17, 21], and until recently, it was an open question to design an approximation algorithm with ratio better than 2. Very recently, Nutov [27] observed that Block-TAP can be reduced to special instances of the (unweighted) Node Steiner Tree problem, extending the techniques of Basavaraju et al. [2] and Byrka et al. [4] used for CacAP. We recall that the (unweighted) Node Steiner Tree problem takes as input a graph G and a subset of nodes (called terminals), and asks to find a tree spanning the terminals which minimizes the number of non-terminal nodes included in the tree. These special Node Steiner Tree instances exhibit some crucial properties, similar to the Steiner Tree instances constructed by [4] for CacAP, and hence the 1.91-approximation of [4] for CacAP also yields a 1.91-approximation for Block-TAP. This is the first result breaking the barrier of 2 on its approximability, and the best bound known so far.

Our results and techniques. In this work, we consider the problem of augmenting the node-connectivity of a graph from 1 to 2, which is slightly more general than Block-TAP.

Definition 2

(1-Node-CAP) Let \(G=(V, E)\) be a 1-node-connected graph, and \(L \subseteq \binom{V}{2}\). The goal is to compute a minimum cardinality set \(L'\subseteq L\) such that \(G' = (V, E \cup L')\) is a 2-node-connected graph.

Our main result is the following.

Theorem 1

There exists a 1.892-approximation algorithm for 1-Node-CAP.

Note that Block-TAP is the particular case of 1-Node-CAP where the input graph G is a tree. Therefore, as a corollary of our result, we improve upon the best known approximation bound for Block-TAP.

Corollary 1

There exists a 1.892-approximation algorithm for Block-TAP.

Moreover, as another corollary of our techniques and prior results, we also get the following result about CacAP.

Theorem 2

There exists a 1.892-approximation algorithm for CacAP.

The starting point of our work is the reduction of Block-TAP and CacAP to the previously mentioned Node Steiner Tree instances of [2, 4, 27], which from now on we call CA-Node-Steiner-Tree instances (formally defined in Definition 3). We first observe that instances of the more general 1-Node-CAP can be reduced to CA-Node-Steiner-Tree instances. We clarify that these instances, obtained by the above reduction, can also be treated as Edge Steiner Tree instances (as done by [4]) but we will view them as Node Steiner Tree instances, since this allows for a more direct correspondence between “links to add” for 1-Node-CAP/Block-TAP/CacAP, and “Steiner nodes to select” for Steiner Tree. This view helps us to give a cleaner analysis of the iterative randomized rounding technique. Thus, the main task for proving Theorems 1 and 2 is to design a 1.892-approximation for CA-Node-Steiner-Tree.

Besides giving a 1.892-approximation for 1-Node-CAP, which improves upon the state-of-the-art approximation of 1.91 by Nutov [27] for Block-TAP, one key point of our work is that the analysis of our approximation bound is quite simple compared to the existing results in the literature for CacAP that achieve a better than 2 approximation. More precisely, only the algorithm of [6] achieves a better approximation factor than ours (in fact, a much better one), but its analysis is more involved. Furthermore, our work gives some new insights on the iterative randomized rounding method introduced by Byrka et al. [5] that might be of independent interest. We give a few more details of this iterative rounding next.

The iterative randomized rounding technique, applied to CA-Node-Steiner-Tree, at each iteration uses an (approximate) LP-relaxation to sample a set of Steiner nodes connecting part of the terminals, contract them, and iterate until all terminals are connected. Roughly speaking, the heart of the analysis lies in bounding the expected number of iterations of the algorithm until a Steiner node of a given initial optimal solution is not needed anymore in the current (contracted) instance. This is achieved via a suitably chosen spanning tree on the set of terminals (called the witness tree). In the original Steiner Tree work of [5], as well as in the work of [4], the witness tree is chosen (mostly) randomly. Then, an edge-deletion process over this tree is mapped to an edge-deletion process over the edges of an optimal solution. In contrast, we give a purely deterministic way to construct the witness tree, and then map an edge-deletion process over this tree to a node-deletion process over the Steiner nodes of an optimal solution. The deterministic method of constructing the witness tree that we introduce here relies on computing minimum-weight paths from the Steiner nodes of an optimal solution to terminals, according to node-weights that take into account two factors: the number of nodes in the path, and the degree of each internal node in the path.

Moreover, our techniques can be refined to give a \((1.8{\bar{3}} + \varepsilon )\)-approximation algorithm for what we call leaf-adjacent Block-TAP instances; these are Block-TAP instances where at least one endpoint of each link is a leaf. We note here that Nutov [29] recently gave a \(1.6{\bar{6}}\)-approximation for leaf-to-leaf Block-TAP instances; these are instances where both endpoints of each link are leaves. Thus, our \((1.8{\bar{3}} + \varepsilon )\)-approximation algorithm deals with a strictly larger set of instances compared to [29], albeit with a worse approximation factor.

Theorem 3

For any fixed \(\varepsilon >0\), there exists a \((1.8{\bar{3}} + \varepsilon )\)-approximation algorithm for leaf-adjacent Block-TAP.

Interestingly, we can also provide some concrete limits on how much our techniques can be further pushed. We show that there exist leaf-to-leaf Block-TAP instances for which our choice of the witness tree is the best possible, and yields a tight bound of \(1.8{\bar{3}}\). This shows that any approximation bound strictly better than \(1.8{\bar{3}}\) needs substantially different arguments.

Related work. We clarify here that the 2-approximation algorithms mentioned for TAP, CacAP, and Block-TAP work even on weighted instances, where each link comes with a non-negative cost and the goal is to minimize the overall cost of the selected links. In fact, the 2-approximation algorithms for Block-TAP [17, 21] work more generally for (weighted) 1-Node-CAP. In contrast, all other aforementioned algorithms only apply to the unweighted version of TAP, CacAP, and Block-TAP (some of them extend to the weighted version with bounded integer weights). Traub and Zenklusen [31, 32] very recently presented a 1.7-approximation algorithm and a \((1.5 + \varepsilon )\)-approximation algorithm for the general weighted version of TAP, thus breaking the long-standing barrier of 2 for weighted TAP. For more details about the history of these problems, we refer the reader to [6, 19] and the references contained in them.

Regarding CacAP, last year Byrka et al. [4] managed to break the 2-approximation barrier and obtained a 1.91-approximation, thus giving the first algorithm with an approximation factor strictly smaller than 2 for the general Edge-Connectivity Augmentation problem. As mentioned, to get this result, they exploited a reduction of Basavaraju et al. [2] to the (unweighted) Edge Steiner Tree problem and utilized the machinery developed for that problem by Byrka et al. [5]. We recall here that the (unweighted) Edge Steiner Tree problem takes as input a graph G and a subset of nodes (called terminals), and asks for a minimum size subtree of G spanning the terminals. Specifically, Byrka et al. [4] tailored the analysis of the iterative randomized rounding algorithm for Steiner Tree in [5] to the specific Steiner Tree instances arising from the reduction. Nutov [29] then showed how one can use any algorithm for the Edge Steiner Tree problem in a black-box fashion in order to obtain approximation algorithms for CacAP, at the expense of a slightly worse approximation bound; in particular, he obtained a 1.942-approximation by applying the algorithm of [5] as a black box. Soon after, Cecchetto et al. [6] (relying on a completely different approach, more in line with the techniques used for TAP) gave a very nice unified algorithm that achieves a state-of-the-art 1.393-approximation for CacAP, and thus also for TAP and the general Edge-Connectivity Augmentation problem.

Regarding Node-Connectivity Augmentation for \(k > 1\), we refer the reader to [11, 28] and the references contained in them.

Organization of material. The rest of this work is organized as follows. In Sect. 2, we start by formally defining CA-Node-Steiner-Tree instances, and state the connection to CacAP, Block-TAP, and 1-Node-CAP. We also explicitly (re)prove that CA-Node-Steiner-Tree instances admit a hypergraph LP-relaxation which is based on so-called k-restricted components. We then describe the iterative randomized rounding procedure for these node-based instances in Sect. 3. Some missing proofs from Sects. 2 and 3 can be found in Appendix A. Section 4 contains our simpler witness tree analysis. Finally, Sect. 5 reports the improved approximation for leaf-adjacent Block-TAP instances, while Sect. 6 contains lower bound constructions that demonstrate the limits of our techniques.

2 From connectivity augmentation to Node Steiner Tree

2.1 Reduction to CA-Node-Steiner-Tree instances

As a reminder, in the Node Steiner Tree problem, we are given a graph \(G = (V, E)\) and a subset \(R\subseteq V\) of nodes, called terminals, and the goal is to compute a tree T of G that contains all the terminals and minimizes the number of non-terminal nodes (so-called Steiner nodes) contained in T. We will refer to the number of Steiner nodes contained in a tree T as the cost of the solution, and denote it by cost(T).

In general, the Node Steiner Tree problem is as hard to approximate as the Set Cover problem, and it admits an \(O(\log \vert R\vert )\)-approximation algorithm (which holds even in the more general weighted version [22]). However, the instances that arise from the reductions of [2, 4, 27] have special properties that allow for a constant factor approximation. We now define the properties of these instances. We use the notation \(N_G(u)\) to denote the set of nodes that are adjacent to a node u in a graph G (u is not included in \(N_G(u)\)).

Definition 3

(CA-Node-Steiner-Tree) Let \(G=(V,E)\) be an instance of Node Steiner Tree, with \(R\subseteq V\) being the set of terminals. The instance G is a CA-Node-Steiner-Tree instance if the following hold:

  1. For each terminal \(v \in R\), we have \(N_G(v) \cap R = \emptyset \).

  2. For each Steiner node \(\ell \in V \setminus R\), we have \(\vert N_G(\ell ) \cap R \vert \le 2\).

  3. For each terminal \(v \in R\), the set \(N_G(v)\) forms a clique in G.
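To make Definition 3 concrete, the following minimal sketch (in Python, with hypothetical names `adj` for an adjacency-set map and `R` for the terminal set) checks whether a given instance satisfies the three properties; it is only meant as an illustration of the conditions, not as part of the algorithms discussed below.

```python
from itertools import combinations

def is_ca_instance(adj, R):
    """Check the three properties of Definition 3.

    adj: dict mapping each node to the set of its neighbors N_G(u).
    R:   set of terminals; all other keys of adj are Steiner nodes.
    """
    for v in R:
        # Property 1: no terminal is adjacent to another terminal.
        if adj[v] & R:
            return False
        # Property 3: the neighborhood of a terminal forms a clique.
        if any(b not in adj[a] for a, b in combinations(adj[v], 2)):
            return False
    # Property 2: every Steiner node has at most two terminal neighbors.
    return all(len(adj[s] & R) <= 2 for s in set(adj) - R)
```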

As already mentioned, the starting point of this work is the reduction from CacAP and Block-TAP to CA-Node-Steiner-Tree. An overview of the reduction from CacAP to CA-Node-Steiner-Tree by Byrka et al. [4] is found in Appendix A.1. In Appendix A.2 we extend the reduction from Block-TAP to CA-Node-Steiner-Tree by Nutov [27], to deal with 1-Node-CAP instances. The following two theorems make these connections formal.

Theorem 4

([4]) The existence of an \(\alpha \)-approximation algorithm for CA-Node-Steiner-Tree implies the existence of an \(\alpha \)-approximation algorithm for CacAP.

Theorem 5

(Extending [27]) The existence of an \(\alpha \)-approximation algorithm for CA-Node-Steiner-Tree implies the existence of an \(\alpha \)-approximation algorithm for 1-Node-CAP (and hence Block-TAP).

Given the above two theorems, from now on we can concentrate on the CA-Node-Steiner-Tree problem.

2.2 An approximate relaxation for CA-Node-Steiner-Tree

A crucial property that we will use is that, similar to the Edge Steiner Tree problem, one can show that the k-restricted version of the problem provides a \((1+\varepsilon )\)-approximate solution to the original problem, for an arbitrarily small \(\varepsilon >0\).

To explain this in more detail, we first recall that any Steiner tree can be seen as the union of components, where a component is a subtree whose leaves are all terminals, and whose internal nodes are all Steiner nodes. We note that two components of such a union are allowed to share nodes and edges. A component is called k-restricted if it has at most k terminals. A k-restricted Steiner tree is a collection of k-restricted components which induces a connected hypergraph when taking R as vertex set, and adding for each component the set of its terminals as a hyperedge. The key theorem of this section is the following.

Theorem 6

Consider a CA-Node-Steiner-Tree instance and let \(\texttt {OPT}\) be its optimal value. For any integer \(m \ge 1\), there exists a k-restricted Steiner tree Q(k), for \(k = 2^m\), whose cost satisfies \(cost(Q(k)) \le \left( 1 + \frac{4}{\log k}\right) \texttt {OPT}\).

We stress that cost(Q(k)) is equal to the number of Steiner nodes in Q(k) counted with multiplicities; an example demonstrating this is given in Fig. 1.

Fig. 1: (a) An example of a 2-restricted CA-Node-Steiner tree, of cost 4. (b) An example of a 3-restricted CA-Node-Steiner tree, of cost 3. In both figures, the square nodes represent the terminals and the components are the nodes grouped together by the dashed lines

The proof of the above theorem mimics the result of Borchers and Du [3] regarding the Edge Steiner Tree problem, and can be found in Appendix A.3. In proving the above theorem, we crucially use the properties stated in Definition 3, as the statement is not true for general Node Steiner Tree instances.

The above theorem shows that CA-Node-Steiner-Tree can be approximated using k-restricted Steiner trees, with a small loss in the objective value. Based on this, we now present a linear programming relaxation for the k-restricted Node Steiner Tree problem, which we call the Directed Components Relaxation (or, in short, DCR), as it mimics the DCR relaxation of Byrka et al. [5] for the Edge Steiner Tree problem.

Let \(G = (V, E)\) be a Node Steiner Tree instance, where \(R \subseteq V\) is the set of terminals. Let \(k > 0\) be an integer parameter. For each subset \(R' \subseteq R\) of terminals of size at most k, and for every \(c' \in R'\), we define a directed component \(C'\) as the minimum Node Steiner tree on terminals \(R'\) with edges directed towards \(c'\). More precisely, let \(S(R') \subseteq V \setminus R\) be a minimum-cardinality set of Steiner nodes such that \(G[R' \cup S(R')]\) is connected. We define \(C'\) by taking a spanning tree in \(G[R' \cup S(R')]\) and directing the edges towards \(c'\), and we let \(cost(C') = \vert S(R')\vert \). We call \(c'\) the sink of the component and all other terminals in \(R' \setminus \{c'\}\) the sources of the component. Let \({\mathbf {C}}\) be the set of all such directed components \(C'\), with all possible sinks \(c'\in C'\). Note that \({\mathbf {C}}\) has an element for each subset of terminals \(R' \subseteq R\), \(\vert R'\vert \le k\), and each terminal \(c' \in R'\). We introduce one variable \(x(C')\) for each \(C' \in {\mathbf {C}}\). Finally, we choose an arbitrary terminal \(r \in R\) as the root. The k-DCR relaxation is the following.

$$\begin{aligned}&\min \; \sum _{C' \in {\mathbf {C}}} cost(C')\, x(C')\\&\text {s.t.} \;\; \sum _{C' \in \Delta (U)} x(C') \ge 1 \quad \forall \, \emptyset \ne U \subseteq R \setminus \{r\},\\&\qquad x(C') \ge 0 \quad \forall \, C' \in {\mathbf {C}}, \end{aligned}$$

where \(\Delta (U) \subseteq {\mathbf {C}}\) denotes the set of directed components with at least one source in U and with sink outside U.
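For fixed k, the component set \({\mathbf {C}}\) can be enumerated explicitly; a brute-force sketch in Python (where `steiner_tree_oracle` is a hypothetical placeholder computing a minimum set \(S(R')\) together with a spanning tree of \(G[R' \cup S(R')]\)):

```python
from itertools import combinations

def directed_components(R, k, steiner_tree_oracle):
    """Enumerate C: one directed component for every subset R' of at
    most k terminals and every choice of sink c' in R'."""
    comps = []
    for size in range(2, k + 1):
        for Rp in combinations(sorted(R), size):
            S, tree_edges = steiner_tree_oracle(set(Rp))
            for sink in Rp:
                # The cost of the component is |S(R')|; its sink is c'.
                comps.append({"terminals": set(Rp), "sink": sink,
                              "steiner": S, "edges": tree_edges,
                              "cost": len(S)})
    return comps
```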

Let \(\texttt {OPT}_{\text {LP}}(k)\) be the optimal value of the above relaxation for \(k \in {\mathbb {N}}_{>0}\), and \(\texttt {OPT}\) be the value of an optimal (integral) solution to our original (unrestricted) CA-Node-Steiner-Tree instance. The next theorem states that the k-DCR LP can be solved in polynomial time for any fixed k and its optimal solution yields a \((1+\varepsilon )\)-approximation of \(\texttt {OPT}\). Its proof can be found in Appendix A.4.

Theorem 7

For any fixed \(\varepsilon > 0\), there exists a \(k = k(\varepsilon ) > 0\) such that \(\texttt {OPT}_{\text {LP}}(k) \le (1+\varepsilon ) \texttt {OPT}\), and moreover, an optimal solution to the k-DCR LP can be computed in polynomial time.

Given the above theorems, all that remains to show is the following: given an optimal fractional solution to the k-DCR LP, design a rounding procedure that returns an integral CA-Node-Steiner-Tree solution whose cost is at most \(\gamma \) times larger than the cost of an optimal k-restricted Steiner tree, for as small \(\gamma \ge 1\) as possible. Following the chain of reductions discussed in this section, such a rounding scheme would immediately imply a \((\gamma +\varepsilon )\)-approximation algorithm for 1-Node-CAP and CacAP, given that \(\gamma \) is constant. In the following sections, we present such a scheme that gives \(\gamma \le 1.8917\).

3 The iterative randomized rounding algorithm

Our rounding scheme for the k-DCR relaxation consists in applying the iterative randomized rounding technique of Byrka et al. [5], first applied to the Edge Steiner Tree problem, and is described in Algorithm 1.

Algorithm 1 (Iterative randomized rounding) In each iteration i, solve the k-DCR LP for the current instance to obtain a fractional solution \(x^i\); sample one component \(C^i\), where each \(C' \in {\mathbf {C}}^i\) is chosen with probability proportional to \(x^i_{C'}\); contract \(C^i\); repeat until all terminals are merged into one, and output the sampled components.
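A compact Python sketch of this loop is given below; `solve_k_dcr` and `contract` are hypothetical placeholders for the LP oracle of Theorem 7 and for the contraction of a sampled component, and the sketch is only meant to fix ideas, not to reproduce Algorithm 1 verbatim.

```python
import random

def iterative_randomized_rounding(instance, k):
    """Sample and contract components until all terminals are merged."""
    picked = []                            # components sampled so far
    while instance.num_terminals() > 1:
        x = solve_k_dcr(instance, k)       # maps components C' to x(C')
        M = sum(x.values())
        comps = list(x)
        # Sample one component with probability x(C')/M.
        C = random.choices(comps, weights=[x[c] for c in comps], k=1)[0]
        picked.append(C)
        instance = contract(instance, C)   # merge the terminals of C
    return picked
```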

We will now analyze the cost of the output solution. Throughout this section, we follow the analysis of [5] and slightly modify it wherever needed. Let \(G = (V, E)\) be a given CA-Node-Steiner-Tree instance, where \(V = R \cup S\); R is the set of terminals and S is the set of Steiner nodes. Let \(\varepsilon > 0\) be a fixed small constant and \(k = k(\varepsilon )\) as in the proof of Theorem 7. For the sake of the analysis, we can assume that \(\sum _{C'\in {\mathbf {C}}^i}x^i_{C'}\) is the same for all iterations i, as in [5]. So let \(M =\sum _{C'\in {\mathbf {C}}^i}x^i_{C'}\).

Let \(T = (R \cup S^*, E^*)\), \(S^* \subseteq S\) and \(E^* \subseteq E\), be an optimal solution to the given CA-Node-Steiner-Tree instance G of cost \(cost(T) = \vert S^*\vert \). We will analyze the cost of the output of Algorithm 1 with respect to T. For that, we define a sequence of subgraphs of T, one for each iteration of the algorithm, in the following way: if, during the \(i^\text {th}\) iteration, we sample \(C^i\), then we delete a subset of nodes of T and what remains is the subgraph \(T^i\) defined for that iteration. Which nodes of T are deleted at each iteration will be explained shortly. Let \(T=T^0 \sqsupseteq T^1 \sqsupseteq T^2 \sqsupseteq ...\) be the sequence of subgraphs remaining after each iteration, where the notation \(T^{i+1} \sqsubseteq T^i\) means that \(T^{i+1}\) is a (not necessarily strict) subgraph of \(T^i\). We show that there exists some universal constant \(\gamma \in {\mathbb {R}}_{\ge 1}\) and a choice of optimal solution T such that the following two properties are satisfied:

  (a) \(T^i\) plus the components sampled until iteration i form a connected subgraph spanning the terminals.

  (b) On average, a Steiner node in T is deleted after \(M \gamma \) iterations.

Similar to [5], we show that conditions (a) and (b) yield that the iterative randomized rounding algorithm is in fact a \((\gamma + \varepsilon )\)-approximation algorithm, for any fixed \(\varepsilon > 0\). This is achieved by relying on the construction of a witness tree W, which is a particular kind of spanning tree on the set of terminals. However, the main differences with respect to [5] are that (i) we delete nodes of T instead of edges, and (ii) we have a purely deterministic way to construct the witness tree, and thus we need an explicit averaging argument for (b).

We now discuss the details of our deletion process. As mentioned, given T we construct a witness tree \(W = (R, E_W)\) that spans the set of terminals. For each component C, let \({\mathcal {B}}_W(C)\) be the family of maximal edge sets \(B \subseteq E_W\) such that \((W\setminus B) \cup C\) forms a connected subgraph spanning the terminals. In each iteration i, we “mark” a subset of edges of W that corresponds to a randomly chosen set in \({\mathcal {B}}_W(C^i)\). For a positive integer t, let \(H(t) :=\sum _{j=1}^t \frac{1}{j}\) be the \(t^\text {th}\) harmonic number. The following lemma is proved in [5] (more precisely, see Lemmas 19 and 20 in [5]).

Lemma 1

([5]) For each component C, there exists a probability distribution over \({\mathcal {B}}_W(C)\) such that the following holds: for any \({\widetilde{W}} \subseteq E_W\), the expected number of iterations until all edges of \({\widetilde{W}}\) are marked is bounded by \(H(\vert {\widetilde{W}} \vert ) \cdot M\), where the expectation is over the random choices of the algorithm and the distributions over \(\{{\mathcal {B}}_W(C)\}_C\).

We will delete Steiner nodes from T using the marked edges in W, as follows. For each Steiner node \(v \in S^*\), we define

$$\begin{aligned} W(v) :=\{(p,q) \in E_W: v \text { is an internal node of the } p\text {-}q \text { path in } T\}, \end{aligned}$$

and \(w(v) :=\vert W(v) \vert \); less formally, w(v) is equal to the number of edges e of W such that the path between the endpoints of e in T contains v. From now on, we will say that the vector \(w: S^* \rightarrow {\mathbb {N}}_{\ge 0}\) is the vector imposed on \(S^*\) by W. A Steiner node \(v \in S^*\) is deleted in the first iteration where all the edges in W(v) become marked. Thus, each subgraph \(T^i\), \(i > 0\), defined in the beginning of this section, is the subgraph obtained from \(T^{i-1}\) after the \(i^\text {th}\) iteration according to this deletion process.
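For intuition, the w-values and the resulting average H(w)-value are straightforward to compute; a small Python sketch (with a hypothetical helper `tree_path(p, q)` returning the node sequence of the unique p-q path in T):

```python
from fractions import Fraction

def H(t):
    """The t-th harmonic number."""
    return sum(Fraction(1, j) for j in range(1, t + 1))

def imposed_vector(witness_edges, steiner_nodes, tree_path):
    """w(v): the number of witness edges (p, q) whose p-q path in T
    contains v as an internal node."""
    w = {v: 0 for v in steiner_nodes}
    for p, q in witness_edges:
        for v in tree_path(p, q)[1:-1]:    # internal nodes only
            if v in w:
                w[v] += 1
    return w

def average_H(w):
    return sum(H(val) for val in w.values()) / len(w)
```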

Lemma 2

For every \(i > 0\), \(T^i \cup C^1 \cup \dots \cup C^i\) spans the terminals.

Proof

Let \(E_W^i \subseteq E_W\) be the set of edges of W that have not been marked at iteration i. By construction, \(E_W^i \cup C^1 \cup \dots \cup C^i\) spans the terminals. For every \((p,q) \in E_W^i\), the unique path between p and q in T is still present in \(T^i\), as none of the nodes in this path have been deleted. Hence p and q are connected in \(T^i\). The result follows.\(\square \)

We now address the universal constant \(\gamma \) that was mentioned above. For a CA-Node-Steiner-Tree instance \(G=(V, E)\), where \(R \subseteq V\) is the set of terminals and \(S = V \setminus R\) is the set of Steiner nodes, we define

$$\begin{aligned} \gamma _G :=\min _{\begin{array}{c} T = (R \cup S^*, E^*):\; T \text { is}\\ \text {optimal Steiner tree} \end{array}} \;\;\; \min _{\begin{array}{c} W:\; W\text { is }\\ \text {witness tree} \end{array}} \frac{\sum _{v \in S^*} H(w(v))}{cost(T)}, \end{aligned}$$

where w is the vector imposed on \(S^*\) by each witness tree W considered. The constant \(\gamma \) that will be used to pinpoint the approximation ratio of the algorithm is defined as

$$\begin{aligned} \gamma :=\sup \{\gamma _G: G\text { is an instance of CA-Node-Steiner-Tree}\}. \end{aligned}$$

In Sect. 4 we will prove the following theorem.

Theorem 8

\(\gamma \le 1.8917\).

We are ready to prove the main theorem of this section; throughout its proof, we use the notation introduced in this section.

Theorem 9

For any fixed \(\varepsilon > 0\), the iterative randomized rounding algorithm (see Algorithm 1) yields a \((\gamma + \varepsilon )\)-approximation.

Proof

Let \(G = (R \cup S, E)\) be a CA-Node-Steiner-Tree instance. We set \(\varepsilon ' :=\frac{\varepsilon }{2}\) and run Algorithm 1 with \(k = 2^{\lceil 4/\varepsilon ' \rceil }\), as in the proof of Theorem 7. It is easy to see that the algorithm runs in polynomial time.

Let \(T = (R \cup S^*, E^*)\) and W be the optimal Steiner tree and corresponding witness tree, respectively, such that \(\gamma _G = \frac{\sum _{v \in S^*} H(w(v))}{cost(T)}\). Clearly, \(\gamma _G \le \gamma \). For each Steiner node \(v \in S^*\), let \(D(v) :=\max \{i: v\in T^i\}\). By Lemma 1, we have \({{\textbf {E}}}[D(v)] \le H(w(v)) \cdot M\). We have

$$\begin{aligned} \sum _{i}{{\textbf {E}}}[cost(C^i)]&=\sum _{i}{{\textbf {E}}}\Big [\sum _{C} \frac{x^i_C}{M} cost(C)\Big ] \le \frac{1+\varepsilon '}{M} \sum _{i} {{\textbf {E}}}[cost(T^i)]\\&= \frac{1+\varepsilon '}{M}\sum _{v \in S^*} {{\textbf {E}}}[D(v)] \le (1+\varepsilon ') \sum _{v \in S^*} H(w(v))\\&= \gamma _G \left( 1 +\frac{\varepsilon }{2} \right) cost(T) \le (\gamma +\varepsilon ) cost(T). \end{aligned}$$

The first inequality holds since (i) Lemma 2 implies that an optimal solution to the CA-Node-Steiner-Tree instance defined in iteration i has cost at most \(cost(T^i)\), and (ii) Theorem 7 shows that the k-DCR LP provides a \((1+\varepsilon ')\)-approximate solution to it. The first inequality of the second line follows since \(\sum _{v \in S^*} {{\textbf {E}}}[D(v)] \le M \cdot \sum _{v \in S^*} H(w(v))\), while the last inequality follows by the definition of \(\gamma \) and Theorem 8.\(\square \)

Putting everything together, we can now easily prove our main two theorems, Theorems 1 and 2.

Proof of Theorems 1 and 2

We first design a 1.892-approximation algorithm for the CA-Node-Steiner-Tree problem. For that, we pick \(\varepsilon \) to be sufficiently small (e.g., \(\varepsilon = 0.0003\) suffices) and by Theorems 8 and 9, we get that the iterative randomized rounding algorithm (Algorithm 1) is a 1.892-approximation algorithm for CA-Node-Steiner-Tree. By Theorems 4 and 5, we also get a 1.892-approximation algorithm for 1-Node-CAP and CacAP. This concludes the proof.\(\square \)

4 A deterministic construction of the witness tree

In this section, for any given CA-Node-Steiner-Tree instance \(G=(V, E)\), we consider a specific optimal Steiner tree and describe a deterministic construction of a witness tree W that shows that \(\gamma _G < 1.8917\). By the definition of \(\gamma \), this immediately implies Theorem 8.

Throughout this section, we use the following notation: given is a CA-Node-Steiner-Tree instance \(G=(R \cup S, E)\) and an optimal solution \(T = (R \cup S^*, E^*)\). The goal is to construct a witness tree \(W = (R, E_W)\) spanning the terminals, such that \(\frac{1}{\vert S^* \vert } \sum _{v \in S^*} H(w(v))\) is minimized, where \(w: S^* \rightarrow {\mathbb {N}}_{\ge 0}\) is the vector imposed by W. From now on, we will refer to the quantity \(\frac{1}{\vert S^* \vert } \sum _{v \in S^*} H(w(v))\) as the average H(w)-value of the Steiner nodes. Similarly, we will refer to w(v) as the w-value of a node v. Since the proof of Theorem 8 consists of several steps, we will present them separately, and at the end of the section we will put everything together and formally prove Theorem 8.

We start by observing that every CA-Node-Steiner-Tree instance has an optimal solution in which the terminals are leaves. This follows by Definition 3, since (i) there are no edges between terminals and (ii) adjacent Steiner nodes of a terminal form a clique. Thus, from now on we assume that T is an optimal solution in which the terminals are leaves. Given this, the first step is to show that we can reduce the problem of constructing a witness tree spanning the terminals to the problem of constructing a witness tree spanning the Steiner nodes that are adjacent to terminals. This will allow us to remove the terminals from T, and only focus on \(T[S^*]\), which is a (connected) tree.

4.1 Removing terminals from T

Let \(F \subseteq S^*\) be the set of Steiner nodes contained in T with at least one adjacent terminal. We call F the set of final Steiner nodes. In order to simplify notation, we will remove the terminals from T, and compute a tree W spanning the final nodes of \(T[S^*]\) (note that the set of final nodes contains all the leaves of \(T[S^*]\)). Any such tree W directly corresponds to a terminal spanning tree as follows: for each final Steiner node \(f \in F\), we arbitrarily pick one of the (at most) two terminals adjacent to f, which we denote as \(\mathtt {rep}(f)\). Given a spanning tree W of the final Steiner nodes, we can now generate a terminal spanning tree \(W'\) as follows (see Fig. 2a):

  • we replace every edge \((f, f')\) of W with the edge \((\mathtt {rep}(f), \mathtt {rep}(f'))\),

  • in case a final Steiner node f has a second terminal t adjacent to it besides \(\mathtt {rep}(f)\), we also add the edge \((\mathtt {rep}(f), t)\) to the tree.

It is easy to see that the vectors w and \(w'\) imposed by W and \(W'\), respectively, on the Steiner nodes are the same for all nodes except for the final Steiner nodes, where the w-values differ by 1 in case a final Steiner node has two adjacent terminals. Thus, from now on we focus on computing a tree W spanning the final Steiner nodes, and we simply increase the corresponding imposed w-value of each final Steiner node by 1. More formally, if \(W = (S^*, E_W)\) is a tree spanning the final nodes of \(T[S^*]\), we have for every \(v \in S^*\):

$$\begin{aligned} w(v) :=\vert \{(p,q) \in E_W: v \text { lies on the } p\text {-}q \text { path in } T[S^*]\}\vert + \mathbb {1}_{v \in F}, \end{aligned}$$

where \(\mathbb {1}_{v \in F}\) denotes the indicator of the event “\(v \in F\)”. This expression corresponds to the (worst-case) assumption that every final Steiner node is incident to two terminals; this is without loss of generality since, if the average H(w)-value computed in this way is less than a given value \(\gamma ' > 0\), then clearly the average \(H(w')\)-value imposed by the terminal spanning tree \(W'\) is also bounded by \(\gamma '\), as the harmonic function H is an increasing function.

The next step is to root \(T[S^*]\) and then decompose it into a collection of rooted subtrees of \(T[S^*]\) such that every final Steiner node appears as either a leaf or a root (or both); we call these subtrees final-components. This decomposition is nearly identical to the one used in Sect. 2.2 where we decomposed a tree into components where terminals appeared as leaves (here we use final Steiner nodes instead of terminals), with the added element that now these components are also rooted.

Fig. 2: (a) Given is a Steiner tree, marked with black, where the terminals are marked as square nodes. The set of final nodes is \(\{v_1,v_2,v_3,v_4,v_5\}\), and a tree spanning them is denoted with the green edges. In order to obtain a terminal spanning tree, each green edge \((v_i,v_j)\) is replaced with the edge \((t_i,t_j)\), marked with blue, where \(t_i= \mathtt {rep}(v_i)\) for every \(i\in \{1,2,3,4,5\}\). Finally, every terminal with a “sibling” terminal has a corresponding edge added, marked again with blue. (b) An example of the Steiner tree from (a) decomposed into its final-components as defined in Sect. 4.2, where the components are circled in green, red, and blue dashed lines (colour figure online)

4.2 Decomposing T into final-components

To simplify notation, let \(T = (V, E)\) be a tree where \(F \subseteq V\) is the set of final Steiner nodes, again noting that F contains all the leaves of T. We start by rooting the tree T at an arbitrary leaf r; note that \(r \in F\). We decompose T into a set of final-components \(T_1, \dots , T_{\tau }\), where a final-component is a rooted maximal subtree of T whose root and leaves are the only nodes in F. We clarify here that for a rooted tree, we call a vertex a leaf if it has no children (thus, although the root may have degree 1, it is not referred to as a leaf).

More precisely, we start with T and consider the maximal subtree \(T_1\) of T that contains the root r, such that all leaves of \(T_1\) are final Steiner nodes and, together with the root, they are the only final Steiner nodes of \(T_1\). For each leaf \(f \in F \cap T_1\), we do the following. If f has children \(\{x_1, \ldots , x_p\}\) in T, for some integer \(p \ge 1\), we first create p copies of f; this means that, along with the original node f, we now have \(p+1\) copies of f, and one copy of it remains in \(T_1\). We then remove \(T_1\) from T, and for each tree \(T(x_i)\), \(i \in [p]\) where \([p] :=\{1,2,\dots ,p\}\), rooted at \(x_i\), that is part of the forest \(T \setminus T_1\), we connect \(x_i\) with one distinct copy of f, and set this copy to be the root of the modified tree \(T(x_i)\). Doing this for every leaf of \(T_1\), we end up with a forest. We repeat this procedure recursively for each tree of this forest, until all trees of the forest are final-components. A simple example of this decomposition is shown in Fig. 2b, and a sketch of the procedure is given below. Let \(T_1, \ldots , T_\tau \) be the resulting final-components. Without loss of generality, we assume that for every \(i \in [\tau ]\), \(\bigcup _{j\le i} T_j\) is connected. We clarify that in this union of trees, duplicate copies of the same node are “merged” back into one node.
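The following minimal Python sketch makes the decomposition explicit, assuming (hypothetically) that the rooted tree is encoded by a `children` map and that `F` is the set of final Steiner nodes; node copies are left implicit, with a final node simply re-used as the root of each component spawned below it.

```python
def final_components(root, children, F):
    """Decompose a rooted tree into final-components: rooted subtrees
    whose root and leaves are the only final Steiner nodes. Each
    component is returned as a list of (parent, child) edges."""
    components = []
    # Each stack entry (f, x) is a new component rooted at (a copy of)
    # the final node f, growing into the child branch starting at x.
    stack = [(root, x) for x in children.get(root, [])]
    while stack:
        f, x = stack.pop()
        edges, frontier = [(f, x)], [x]
        while frontier:
            u = frontier.pop()
            if u in F:
                # u is a leaf of this component; every child branch of u
                # starts a fresh component rooted at a copy of u.
                stack.extend((u, c) for c in children.get(u, []))
                continue
            for c in children.get(u, []):
                edges.append((u, c))
                frontier.append(c)
        components.append(edges)
    return components
```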

Given the final-components \(T_1, \ldots , T_\tau \) of T, we will now show how to compute a tree \(W_i\) spanning the final Steiner nodes of each \(T_i\), and then show how to combine all of them to generate a tree W that spans the final Steiner nodes of T, while bounding the average H(w)-value imposed on \(S^*\) by W.

4.3 Computing a tree W spanning the final Steiner nodes of T

We start by describing how to compute a tree \(W_i\) that spans the final Steiner nodes of each final-component \(T_i\), as well as how to join these trees \(\{W_i\}_{i = 1}^\tau \) to obtain a tree W spanning the whole set of final Steiner nodes of T. We construct W iteratively as follows. We start with \(W =\emptyset \), process the final-components \(T_1, \ldots , T_\tau \) one by one according to the index-order, and for each i, compute a tree \(W_i\) for \(T_i\) and merge it with the current W; again, in this merging, multiple copies of the same node are merged back into one node.

To analyze this procedure, we fix a final-component \(T_i\) rooted at \(r_i\), and let \((r_i, v)\) be the unique edge incident to \(r_i\) inside \(T_i\). Let \(T' = \bigcup _{j < i} T_j\), and \(W'\) be the constructed witness tree for \(T'\). Let \(w'\) be the vector imposed on the Steiner nodes of \(T'\) by \(W'\). Note that for \(i=1\) we let \(T'=\emptyset \), \(W'=\emptyset \), and \(w'= 0\).

We do some case analysis based on whether \(v \in F\) or not; throughout this analysis, we use the notation \(\vert T\vert \) to denote the number of nodes of a tree T.

Case 1: \(v \in F\). In this case, the edge \((r_i,v)\) is the only edge of \(T_i\), so \(W_i\) simply consists of the nodes \(\{r_i, v\}\) and the edge \((r_i,v)\). The following lemma shows that merging \(W_i\) with \(W'\) (by simply merging the two copies of \(r_i\) and taking the union of the edges of \(W'\) and \(W_i\), in the case where \(W' \ne \emptyset \)) would keep the average H(w)-value below our threshold. Let \(w''\) be the vector imposed by \(W'' = W' \cup W_i\).

Lemma 3

Let \(\gamma ' \ge H(3)\), and suppose that \(\sum _{u \in T'} \frac{H(w'(u))}{\vert T' \vert } < \gamma '\). Then, \(\sum _{u \in T'\cup T_i} \frac{H(w''(u))}{\vert T' \cup T_i \vert } < \gamma '\).

Proof

We do some case analysis. If \(i = 1\), then \(W' = \emptyset \) and \(r_i = r\). In this case we have \(w''(r_i) = w''(v) = 2\), which gives \(\sum _{u \in T'\cup T_i} \frac{H(w''(u))}{\vert T' \cup T_i \vert } = H(2) < \gamma '\).

So, suppose now that \(i > 1\), which implies that \(W' \ne \emptyset \). Note that \(w''(v) = 2\), \(w''(r_i) = w'(r_i) + 1\), and for each \(u \in T'\setminus \{r_i\}\), \(w''(u)= w'(u)\). Moreover, we have \(w'(r_i) \ge 2\); to see this, note that \(r_i\) is a final Steiner node of \(T'\) and has degree at least 1 in \(W'\). Thus, \(H(w''(r_i)) \le H(w'(r_i)) + \frac{1}{3}\). Therefore,

$$\begin{aligned} \sum _{u \in T'\cup T_i} \frac{H(w''(u))}{\vert T' \cup T_i \vert }&\le \frac{H(w''(v)) + 1/3 + \sum _{u \in T'} H(w'(u))}{\vert T' \cup T_i \vert }\\&= \frac{H(3) + \sum _{u \in T'} H(w'(u))}{\vert T' \vert + 1} < \gamma '. \end{aligned}$$

We conclude that the lemma holds.\(\square \)

Case 2: \(v \notin F\). In this case, the witness tree \(W_i\) of \(T_i\) is computed deterministically by a procedure described in Algorithm 2; we provide an example of the output of Algorithm 2 in Fig. 3.

Algorithm 2 (Construction of the witness tree \(W_i\) of a final-component \(T_i\); see Fig. 3 for an illustration of its two key steps.) In step (3), every interior node u marks the first edge of a minimum-weight u-to-leaf path, where the weight of a path is the sum of the inverses of the degrees of its nodes; in step (4), the marked edges are contracted, every contracted node is labeled with the (unique) marked leaf it contains, and \(W_i\) is read off the resulting tree on these labels.
Fig. 3: (a) Given is a graph \(G=(V,E)\), with final Steiner nodes depicted as red squares. At each interior node u we mark in red the u-to-leaf-path P(u) with the least weight (sum of the inverse of the node degrees on the path); this corresponds to step (3) of Algorithm 2. (b) At step (4) of Algorithm 2, we contract the red edges and label the new contracted nodes with the name of the original (marked) leaf node. (c) A pictorial description of the witness tree obtained for the original graph
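The minimum-weight paths of step (3) can be computed bottom-up in linear time; a sketch over a rooted final-component encoded by a `children` map (a hypothetical representation), with `deg` the degree \(d_u\) from the text and leaves weighted by 1/3, matching the convention \(d_u = 3\) for leaves used below:

```python
from fractions import Fraction

def min_weight_paths(root, children, deg):
    """For every node u, compute a least-weight u-to-leaf path, where
    the weight of a path is the sum of 1/d_v over its nodes (with
    d_v = 3 for leaves). Returns weight[u] and the path P[u]."""
    weight, P = {}, {}

    def visit(u):
        kids = children.get(u, [])
        if not kids:                  # leaf: the path is just {u}
            weight[u], P[u] = Fraction(1, 3), [u]
            return
        for c in kids:
            visit(c)
        best = min(kids, key=weight.get)      # the marked child of u
        weight[u] = Fraction(1, deg[u]) + weight[best]
        P[u] = [u] + P[best]

    visit(root)
    return weight, P
```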

In order to bound the average H(w)-value that we get after adding \(W_i\) to the current \(W'\), we first introduce some notation. For any node \(u \ne r_i\) of \(T_i\), let \(\mathtt {parent}(u)\) denote the parent of u in \(T_i\). Let \(Q_u\) denote the subtree of \(T_i\) rooted at u. For any non-final Steiner node u, P(u) denotes the minimum-weight path from u to a leaf, as computed at step (3) of Algorithm 2. In the case where u is a final Steiner node of \(T_i\), we define \(P(u) :=\{u\}\); in particular \(P(r_i) :=\{r_i\}\). For every \(u \in T_i\), let \(q_u\) be the number of nodes of P(u) (including the endpoints) and l(u) denote the (unique) endpoint of this path that is a final Steiner node. For every non-final Steiner node u, exactly one child of u is in this path, which we call the marked child of u. For every \(u \ne r_i\), we let a(u) denote the node closest to u on the path from u to \(r_i\) such that \(l(u) \ne l(a(u))\). Examples of these definitions are given in Fig. 4.

Fig. 4
figure 4

Given is an example of a node u; the minimum-weight path from u to the leaf \(\ell (u)\), \(P(u) = \{u,u_1,\ell (u)\}\); the parent of u, \(\mathtt {parent}(u)\); the first node on the u to \(r_i\) path, a(u), such that \(\ell (u) \ne \ell (a(u))\); and finally, the edge \(e_u = \{\ell (u), \ell (a(u))\}\)

To begin the analysis, we first observe that for every node \(u \ne r_i\) of \(T_i\), \(W_i\) contains an edge with endpoints l(u) and l(a(u)); we call this edge \(e_u\). Note that we can have \(e_u = e_v\) for \(u \ne v\). For every \(u \ne r_i\), we consider the subtree \(W^u\) of \(W_i\) associated with \(Q_u\), defined as a set of edges by

$$\begin{aligned} W^u = \{e \in W_i : \text {both endpoints of } e \text { are in } Q_u \} \cup \{e_u\}. \end{aligned}$$

We also define \(W^{r_i} = W_i\). Observe that for every \(u \in T_i\), \(W^u\) indeed corresponds to a connected subtree of \(W_i\). An example of \(W^u\) is provided in Fig. 5, where we also clarify several properties of Lemma 4, proved below. We let \(w^u\) be the vector imposed by \(W^u\), where for every \(\ell \in Q_u\), we have

$$\begin{aligned} w^u(\ell ) :=\vert \{e \in W^u: \ell \text { lies on the path in } T_i \text { between the endpoints of } e\}\vert + \mathbb {1}_{\ell \in F}. \end{aligned}$$

Let \(h(Q_u) :=\sum _{\ell \in Q_u} H(w^u(\ell ))\). We note here that \(W^u\) might involve one final Steiner node, namely l(a(u)), that is not contained in \(Q_u\). Finally, let \(d_u\) be equal to 3 if u is a leaf, and equal to the degree of u in \(T_i\) otherwise.

Fig. 5: Given is the tree \(Q_u\) of \(T_i\). \(W^u\) contains all of the red edges. Note that each \(W^{u_i}\) is equal to the red edge with both endpoints below \(u_i\) plus \(e_{u_i}\), for \(i = 1,2,3,4\). We can also see an example of the first three parts of Lemma 4: (a) \(w^u(u) = 4\); (b) for \(j = 2,3,4\), and every \(\ell \in Q_{u_j}\), \(w^u(\ell ) = w^{u_j}(\ell )\), and; (c) for each \(\ell \in Q_{u_1}\setminus P(u_1), w^u(\ell ) = w^{u_1}(\ell )\)

Lemma 4

Let \(u \in T_i\) be a non-final Steiner node and \(u_1, \dots , u_p\) be the children of u, with \(u_1\) being the marked child of u. Then:

  (a) \(w^u(u) = p\).

  (b) For every \(j \in \{2, \ldots , p\}\) and every Steiner node \(\ell \in Q_{u_j}\), \(w^u(\ell ) = w^{u_j}(\ell )\).

  (c) For every \(\ell \in Q_{u_1} \setminus P(u_1)\), \(w^u(\ell ) = w^{u_1}(\ell )\).

  (d) For every \(\ell \in P(u_1)\), \(H(w^u(\ell )) - H(w^{u_1}(\ell )) \le \sum _{j=2}^{p} \frac{1}{d_{\ell } +j-2}\).

Proof

We first observe that \(W^u\) is given by the disjoint union \(W^{u_1} \cup W^{u_2} \cup \ldots \cup W^{u_p}\). In fact, for \(j=2, \ldots , p\), the edge \(e_{u_j} \in W^{u_j}\) is precisely the edge with endpoints \(l(u_1)\) and \(l(u_j)\), and the edge \(e_{u_1} \in W^{u_1}\) corresponds to \(e_u\). Given these observations, \( w^u(u) = p\), as the edges of \(W^u\) which contribute to \(w^u(u)\) are precisely \(e_{u_1}, \dots , e_{u_p}\). Hence (a) holds.

Statements (b) and (c) follow by observing that the only nodes whose \(w^u\)-value differs from the corresponding \(w^{u_j}\)-value are in \(Q_{u_1}\), and are precisely the nodes of the path \(P(u_1)\). It is not hard to see that for every \(\ell \in P(u_1)\), we have \(w^u(\ell ) = w^{u_1}(\ell ) + p-1\). Thus, we get that \(H(w^u(\ell )) - H(w^{u_1}(\ell )) = \sum _{j= 2}^{p} \frac{1}{w^{u_1}(\ell ) + j - 1}\). We now compute a lower bound for \(w^{u_1}(\ell )\). If \(\ell \ne l(u_1)\), then we can apply statement (a), which we just proved, to \(Q_{\ell }\), and get that \(d_{\ell } -1 = w^{\ell }(\ell ) \le w^{u_1}(\ell )\). Hence (d) holds in this case. We now consider the case \(\ell = l(u_1)\), i.e., \(\ell \) is a leaf of \(Q_{u_1}\). In this case, we have \(d_{\ell } =3\) and \(w^{\ell }(\ell ) =2\), where the latter equality holds since there is a unique edge \(e_{\ell } \in W^{\ell }\) and \(\ell \) is a leaf, which means that its w-value is increased by 1. Thus, statement (d) holds in this case as well. This concludes the proof.\(\square \)

The following lemma is the main technical part of our analysis.

Lemma 5

Let \(\delta = 7/120\). For each node \(u \in T_i \setminus \{r_i\}\), we have

$$\begin{aligned} h(Q_u) + \sum _{\ell \in P(u)} \frac{1}{d_{\ell }} + \delta < 1.8917 \cdot \vert Q_u\vert . \end{aligned}$$

Proof

The proof is by induction on \(\vert Q_u\vert \). The base case is \(\vert Q_u\vert =1\), in which case u is a leaf of \(T_i\). In this case, we have \(d_u =3\), \(w^u(u)=2\), and \(P(u)=\{u\}\), giving

$$\begin{aligned} h(Q_u) + \frac{1}{3} + \delta = H(2) + \frac{1}{3} + \delta = 1.891{\bar{6}} < 1.8917. \end{aligned}$$

For the induction step, let u be a non-final Steiner node of \(T_i\), let \(u_1, \dots , u_p\) be its children, and let \(u_1\) be its marked child. By Lemma 4 we have that

$$\begin{aligned} h(Q_u) = \sum _{j=1}^p h(Q_{u_j}) + \sum _{\ell \in P(u_1)} \left( H(w^{u}(\ell )) - H(w^{u_1}(\ell ))\right) + H(p). \end{aligned}$$

By the induction hypothesis, for each \(j \in [p]\) we have \(h(Q_{u_j}) + \sum _{\ell \in P(u_j)} \frac{1}{d_{\ell }} + \delta < 1.8917 \cdot \vert Q_{u_j}\vert \). Combining these we get

$$\begin{aligned} h(Q_u)&< 1.8917 \sum _{j=1}^p \vert Q_{u_j} \vert - \delta p - \sum _{j=1}^p \sum _{\ell \in P(u_j)} \frac{1}{d_{\ell }} + \sum _{\ell \in P(u_1)} (H(w^{u}(\ell ))\\&\quad - H(w^{u_1}(\ell )))+ H(p)\le 1.8917 \sum _{j=1}^p \vert Q_{u_j} \vert - \delta p - \sum _{j=1}^p \sum _{\ell \in P(u_j)} \frac{1}{d_{\ell }} \\&\quad + \sum _{\ell \in P(u_1)} \sum _{j = 2}^p \frac{1}{d_\ell + j - 2} + H(p), \end{aligned}$$

where the last inequality follows by statement (d) of Lemma 4. We now observe that due to the greedy choice of the path P(u) at step (3) of Algorithm 2, for each \(j \in [p]\) we have \(\sum _{\ell \in P(u_j) \setminus F} \frac{1}{d_{\ell }} + 1 \ge \sum _{\ell \in P(u_1) \setminus F} \frac{1}{d_{\ell }} + 1\), since there is exactly one final Steiner node in each path \(P(u_j)\), \(j \in [p]\). Thus, we get \(\sum _{\ell \in P(u_j)} \frac{1}{d_{\ell }} \ge \sum _{\ell \in P(u_1)} \frac{1}{d_{\ell }}\) for every \(j \in [p]\), and so \(\sum _{j=1}^p \sum _{\ell \in P(u_j)} \frac{1}{d_{\ell }} \ge p \sum _{\ell \in P(u_1)} \frac{1}{d_{\ell }}\). Moreover, for every \(\ell \in P(u_1)\) we have \(\sum _{j = 2}^p \frac{1}{d_\ell + j - 2} \le \frac{p-1}{d_\ell }\). Combining these we get

$$\begin{aligned} h(Q_u)&< 1.8917 \sum _{j=1}^p \vert Q_{u_j} \vert - \delta p - \sum _{\ell \in P(u_1)} \frac{p}{d_{\ell }} + \sum _{\ell \in P(u_1) \setminus \{l(u)\}} \frac{p-1}{d_\ell } \\&\quad + \sum _{j = 2}^p \frac{1}{d_{l(u)} + j - 2} + H(p)\le 1.8917 \sum _{j=1}^p \vert Q_{u_j} \vert - \delta p - \sum _{\ell \in P(u_1)} \frac{1}{d_{\ell }}\\&\quad - \frac{p-1}{d_{l(u)}} + \sum _{j = 2}^p \frac{1}{d_{l(u)} + j - 2} + H(p). \end{aligned}$$

Substituting \(d_{l(u)}=3\) into this inequality, we get

$$\begin{aligned} h(Q_u) < 1.8917 \sum _{j=1}^p \vert Q_{u_j}\vert - \delta p - \sum _{\ell \in P(u_1)} \frac{1}{d_{\ell }} - \frac{p-1}{3} + \sum _{j=2}^{p} \frac{1}{j+1} + H(p). \end{aligned}$$

Rearranging, and using \(\sum _{\ell \in P(u)} \frac{1}{d_{\ell }} = \sum _{\ell \in P(u_1)} \frac{1}{d_{\ell }} + \frac{1}{d_{u}}\) and \(d_u = p+1\), we get

$$\begin{aligned} h(Q_u) + \sum _{\ell \in P(u)} \frac{1}{d_{\ell }} + \delta&< 1.8917 \sum _{j=1}^p \vert Q_{u_j}\vert - \delta (p-1) - \frac{p-1}{3} + \sum _{j=2}^{p} \frac{1}{j+1} + H(p) + \frac{1}{d_u}\\&= 1.8917 \sum _{j=1}^p \vert Q_{u_j}\vert - \delta (p-1) - \frac{p-1}{3} + \sum _{j=2}^{p} \frac{1}{j+1} + H(p+1). \end{aligned}$$

We will now show that \(-\delta (p-1) - \frac{p-1}{3} + \sum _{j=2}^{p} \frac{1}{j+1} + H(p+1) \le 1.8917\). For that, we define \(\psi (p) :=2H(p+1) - (p-1)\left( \frac{1}{3} + \delta \right) - H(2)\). Since \(\sum _{j=2}^{p} \frac{1}{j+1} = H(p+1) - H(2)\), the left-hand side of the inequality above is equal to \(\psi (p)\). We now show that \(\psi (p) \le \psi (4)\) for every \(p \in {\mathbb {N}}_{>0}\). For that, we have

$$\begin{aligned} \psi (p+1) - \psi (p)&= 2H(p+2) {-} p\left( \frac{1}{3} {+} \delta \right) - H(2) - \left( 2H(p+1) - (p-1)\left( \frac{1}{3} + \delta \right) - H(2) \right) \\&= \frac{2}{p+2} - \frac{1}{3} - \delta . \end{aligned}$$

We now observe that \(\psi (p+1) - \psi (p) \ge 0\) for \(p \in \{1,2,3\}\) and \(\psi (p+1) - \psi (p) < 0\) for \(p \ge 4\). Thus, \(\psi (p)\) is increasing for \(p \le 4\) and decreasing for \(p \ge 4\). This means that \(\max _{p \in {\mathbb {N}}_{>0}} \psi (p) = \psi (4)\). It is easy to verify that \(\psi (4) = 1.891{\bar{6}} < 1.8917\). Putting everything together, we conclude \(h(Q_u) + \sum _{\ell \in P(u)} \frac{1}{d_{\ell }} + \delta < 1.8917 \cdot \vert Q_u \vert \).\(\square \)
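The arithmetic behind the choice \(\delta = 7/120\) can be double-checked exactly, e.g., with Python's `fractions` module:

```python
from fractions import Fraction

def H(t):
    return sum(Fraction(1, j) for j in range(1, t + 1))

delta = Fraction(7, 120)

def psi(p):
    return 2 * H(p + 1) - (p - 1) * (Fraction(1, 3) + delta) - H(2)

assert all(psi(p + 1) - psi(p) >= 0 for p in (1, 2, 3))      # increasing
assert all(psi(p + 1) - psi(p) < 0 for p in range(4, 100))   # decreasing
assert psi(4) == Fraction(227, 120)   # = 1.8916..., strictly below 1.8917
```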

Finally, we observe that \(W_i = W^{r_i}\). With the above lemma at hand, it is not difficult to show that merging \(W_i\) with the current W keeps the average H(w)-value below our targeted threshold. Once again, let \(T'\) be the union of \(T_1, \dots , T_{i-1}\), let \(w'\) be the vector imposed by the current \(W'\) before adding \(W_i\), and \(w''\) be the vector imposed by \(W' \cup W_i\).

Lemma 6

If \(\sum _{u \in T'} \frac{H(w'(u))}{\vert T'\vert } < 1.8917\), then \(\sum _{u \in T'\cup T_i} \frac{H(w''(u))}{\vert T' \cup T_i\vert } < 1.8917\).

Proof

We do some case analysis. We first consider the case of \(i = 1\), which means that \(r_i = r\) and \(W' = \emptyset \). In this case, since \(T_i \setminus \{r_i\} = Q_v\), we can apply Lemma 5 and get that \(\sum _{\ell \in T_i \setminus \{r_i\}} H(w''(\ell )) < 1.8917(\vert T_i\vert -1) - \sum _{\ell \in P(v)} \frac{1}{d_{\ell }} - \delta \). Moreover, we have \(w''(r_i) = 2\). Thus, we get

$$\begin{aligned} \sum _{u \in T'\cup T_i} \frac{H(w''(u))}{\vert T' \cup T_i\vert }&= \frac{\sum _{\ell \in T_i \setminus \{r_i\}} H(w''(\ell )) + H(2)}{\vert T' \cup T_i\vert }\\&< \frac{1.8917 (\vert T_i\vert -1) + H(2)}{\vert T' \cup T_i\vert } < 1.8917. \end{aligned}$$

We now turn to the case of \(i > 1\), which means that \(W' \ne \emptyset \). In this case, we note the following:

  1. \(w''(\ell ) = w'(\ell )\) for all \(\ell \in T_i \setminus \{r_i\}\),

  2. \(w''(r_i) = w'(r_i) + 1\), and

  3. \(w'(u)=w''(u)\) for all \(u \in T'\setminus \{r_i\}\).

Since \(r_i\) is a leaf of \(T'\), we have \(w'(r_i) \ge 2\), which implies that \(H(w''(r_i)) \le H(w'(r_i)) + \frac{1}{3}\). Moreover, since \(T_i \setminus \{r_i\} = Q_v\), we can apply Lemma 5 to get \( \sum _{\ell \in T_i \setminus \{r_i\}} H(w''(\ell )) < 1.8917(\vert T_i\vert -1) - \sum _{\ell \in P(v)} \frac{1}{d_{\ell }} - \delta \). Furthermore, \( \sum _{\ell \in P(v)} \frac{1}{d_{\ell }} + \delta \ge \frac{1}{3}\), as \(d_{l(v)} = 3\) and \(l(v) \in P(v)\). Hence we get that

$$\begin{aligned} \sum _{u \in T'\cup T_i} \frac{H(w''(u))}{\vert T' \cup T_i\vert }&\le \frac{\sum _{\ell \in T_i \setminus \{r_i\}} H(w''(\ell )) + 1/3 + \sum _{u \in T'} H(w'(u))}{\vert T' \cup T_i\vert }\\&< \frac{1.8917 (\vert T_i\vert -1) + 1.8917\vert T'\vert }{\vert T' \cup T_i\vert } = 1.8917. \end{aligned}$$

\(\square \)

4.4 The proof of Theorem 8

We are now ready to put everything together and prove Theorem 8.

Proof of Theorem 8

Let \(G = (R \cup S, E)\) be a CA-Node-Steiner-Tree instance, where R is the set of terminals. Let \(T = (R \cup S^*, E^*)\) be an optimal solution in which the terminals are leaves. By the discussion of Sect. 4.1, it suffices to construct a tree \(W = (F, E_W)\) spanning the set of final Steiner nodes of T, whose imposed w-value on \(v \in S^*\) is defined as

$$\begin{aligned} w(v) :=\vert \{(p,q) \in E_W: v \text { lies on the } p\text {-}q \text { path in } T[S^*]\}\vert + \mathbb {1}_{v \in F}. \end{aligned}$$

Following the discussion of Sect. 4.2, we root T at an arbitrary leaf r and decompose it into a set of final-components \(T_1, \ldots , T_\tau \), with the property that \(r \in T_1\) and for every \(i \in [\tau ]\), \(\bigcup _{j \le i} T_j\) is connected. We process the trees in increasing index order and do the following. We maintain a tree \(W'\) spanning the final Steiner nodes of \(T' = \bigcup _{j < i} T_j\), and during iteration i, we compute a tree \(W_i\) spanning the final Steiner nodes of \(T_i\) and then merge it with \(W'\), as discussed in Sect. 4.3. The resulting tree spans the final Steiner nodes of \(\bigcup _{j \le i} T_j\). Moreover, the analysis of Sect. 4.3 shows that after each iteration i, this tree satisfies the desired bound of 1.8917: by setting \(\gamma ' = 1.8917\) in Lemma 3 and by using Lemma 6, we get that after each iteration, the resulting tree has an average H(w)-value strictly smaller than 1.8917. Thus, a straightforward induction shows that the final tree W spans the final Steiner nodes of T and has an average H(w)-value strictly smaller than 1.8917. This means that \(\gamma _G < 1.8917\), which implies that \(\gamma \le 1.8917\).\(\square \)

5 Improved approximation for leaf-adjacent Block-TAP

In this section, we consider Block-TAP instances where at least one endpoint of every link is a leaf; we call such instances leaf-adjacent Block-TAP instances. We note that leaf-adjacent Block-TAP is a generalization of the case where both endpoints are leaves, often called leaf-to-leaf Block-TAP. For leaf-to-leaf Block-TAP, Nutov gave a \(1.6{\bar{6}}\)-approximation [29]. Here, we give a \((1.8{\bar{3}} + \varepsilon )\)-approximation for the more general setting of leaf-adjacent Block-TAP.

Throughout this section, we assume that we are given a Block-TAP instance, where \(T=(V_T, E_T)\) is the input tree, \(R_T \subseteq V_T\) is its set of leaves and \(L \subseteq \binom{V_T}{2}\) is the set of links, such that \(\ell \cap R_T \ne \emptyset \) for every \(\ell \in L\). Using the reduction to CA-Node-Steiner-Tree (see Theorem 5), we get a CA-Node-Steiner-Tree instance \(G= (R\cup S,E)\) that satisfies the following property: every Steiner node in G is adjacent to at least one terminal. We call such instances leaf-adjacent CA-Node-Steiner-Tree instances.

We start by observing that any leaf-adjacent CA-Node-Steiner-Tree instance \(G = (R \cup S, E)\) has an optimal solution \(T^* = (R \cup S^*, E^*)\) such that every Steiner node \(s \in S^*\) is adjacent to at least one terminal in \(T^*\). To see this, suppose that there is a Steiner node \(s \in S^*\) that is not adjacent to any terminal in \(T^*\). Since s is adjacent to some terminal \(r \in R\) in G, we add the edge (s, r) to \(T^*\). This creates a cycle that goes through s. By removing the edge of the cycle that is incident to s and is not equal to (s, r), we end up with a different optimal solution where s is adjacent to a terminal. This shows that we can transform any optimal solution into a solution that satisfies the desired property, namely, that every Steiner node is adjacent to a terminal. Thus, from now on we assume that \(T^*\) satisfies this property.

Our analysis consists of demonstrating that the procedure laid out in Sect. 4.3 finds a witness tree W for \(T^*\) such that if \(w: S^* \rightarrow {\mathbb {N}}\) is the vector imposed on \(S^*\) by W, then we have \(\frac{1}{\vert S^* \vert } \sum _{v \in S^*} H(w(v)) \le H(3)\). To see this, we first transform \(T^*\) to a forest as follows. In case there is a terminal \(r \in R\) of \(T^*\) that is not a leaf, we split the tree \(T^*\) at r by first removing r, and then adding back a copy of r as a leaf to each of the d(r) trees of the forest \(T^* \setminus \{r\}\), where d(r) is the degree of r in \(T^*\). By performing this operation for all terminals whose degree is larger than 1, we end up with a forest, where each terminal \(r \in R\) appears in d(r) trees in total, such that for each tree of the forest, every final Steiner node is adjacent to a terminal and moreover, all terminals are leaves (a sketch of this splitting is given below). Since each Steiner node belongs to exactly one tree of this forest, it is easy to see that if we compute a terminal spanning tree for each tree of the resulting forest, and take the union of these trees, then the w-value of each Steiner node imposed by the final terminal spanning tree is the same as the w-value imposed by the terminal spanning tree computed for the particular tree of the forest to which the Steiner node belongs. Thus, without loss of generality, we assume from now on that \(T^*\) satisfies both desired properties, namely that every Steiner node is adjacent to a terminal and every terminal is a leaf.
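This splitting operation is again easy to make explicit; a minimal Python sketch over an adjacency-set representation (hypothetical names as before), where the i-th copy of a terminal r is represented by a fresh pair (r, i):

```python
def split_at_terminals(adj, R):
    """Split the tree at every non-leaf terminal r: remove r and attach
    a fresh leaf copy of r to each former neighbor, so that in the
    resulting forest every terminal (copy) is a leaf."""
    forest = {u: set(nbrs) for u, nbrs in adj.items()}
    for r in list(R):
        if len(forest[r]) <= 1:        # r is already a leaf
            continue
        for i, u in enumerate(forest.pop(r)):
            copy = (r, i)              # i-th leaf copy of r
            forest[copy] = {u}
            forest[u].discard(r)
            forest[u].add(copy)
    return forest
```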

We now remove the terminals from \(T^*\) as in Sect. 4.1 and decompose the remaining tree \(T^*[S^*]\) into final-components as in Sect. 4.2, which we denote \(T^*_1,\dots , T^*_\tau \). It then remains to see that when we construct the witness trees \(\{W_i\}_{i=1}^\tau \) and merge them together to create the witness tree W, every tree in \(\{T^*_i\}_{i=1}^\tau \) falls under Case 1 of the analysis. This implies that we only apply Lemma 3 when merging witness trees. Since the guarantee of Lemma 3 holds for any \(\gamma ' \ge H(3)\), we use it with \(\gamma ' = H(3)\) and get the desired bound.

More precisely, observe that each node of \(T^*[S^*]\) is a final node by the construction of \(T^*\), so the final-components \(T^*_1,\dots , T^*_\tau \) are in fact the edges of \(T^*[S^*]\). When the analysis of Sect. 4 considers a fixed final-component \(T^*_i\) rooted at \(r_i\) with \((r_i,v)\) being the unique edge incident to \(r_i\) inside \(T^*_i\), we know that v is a final node by construction, so we only consider Case 1 when analyzing the change of the average H(w)-value. Thus, we obtain the following theorem.

Theorem 10

Let \(G = (R \cup S, E)\) be a leaf-adjacent CA-Node-Steiner-Tree instance, and let \(T^* = (R \cup S^*, E^*)\) be an optimal Steiner tree such that each Steiner node \(s \in S^*\) is adjacent to at least one terminal in \(T^*\). Then, there exists a witness tree W such that \(\frac{1}{\vert S^* \vert } \sum _{v \in S^*} H(w(v)) \le H(3)\), where w is the vector imposed on \(S^*\) by W.

The witness tree that achieves the stated bound on \(\frac{1}{\vert S^* \vert } \sum _{v \in S^*} H(w(v))\) is in fact a very natural one. We first perform the operation described above to transform \(T^*\) into a forest in which the terminals are exactly the leaves. Then, for every component and for every edge (u, v) of that component joining two Steiner nodes, the witness tree contains an edge between one of the terminals adjacent to u and one of the terminals adjacent to v. Furthermore, if two terminals are adjacent to the same Steiner node, then the witness tree contains an edge between those two terminals as well; a sketch of this construction follows.
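Concretely, the following Python sketch builds this witness tree for one tree of the forest, given as an adjacency dict together with its set of terminal leaves; the representative map `rep` and the tie-breaking are our own choices, and any adjacent terminal works as a representative.

```python
def natural_witness_tree(comp_adj, terminals):
    """Witness tree for one tree of the forest, following the description
    above: pick a representative terminal rep[u] for each Steiner node u,
    add a witness edge (rep[u], rep[v]) for each edge (u, v) between
    Steiner nodes, and attach every further terminal hanging off u to
    rep[u]. A sketch; this yields |terminals| - 1 edges, i.e. a spanning
    tree of the component's terminals."""
    steiner = [u for u in comp_adj if u not in terminals]
    rep, witness_edges = {}, []
    for u in steiner:
        adj_terms = [t for t in comp_adj[u] if t in terminals]
        rep[u] = adj_terms[0]                  # exists by leaf-adjacency
        # terminals sharing the Steiner node u get connected via rep[u]
        witness_edges += [(rep[u], t) for t in adj_terms[1:]]
    seen = set()
    for u in steiner:
        for v in comp_adj[u]:
            if v in rep and (v, u) not in seen:   # Steiner-Steiner edge, once
                seen.add((u, v))
                witness_edges.append((rep[u], rep[v]))
    return witness_edges
```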

The above theorem, together with Theorems 5 and 9, immediately implies Theorem 3.

6 Limitations of the witness tree analysis

In this section, we show that we cannot get an approximation factor better than \(1.8{\bar{3}}\) by using (deterministic/randomized) witness trees in the analysis of Algorithm 1. Moreover, we show that Algorithm 2 sometimes finds a witness tree that gives an average H(w)-value greater than 1.8504. This shows that our analysis of the approximation factor of Algorithm 1 with the witness tree generated by Algorithm 2 is off by at most 0.0416.

6.1 Lower bound for \(\gamma \)

In this section, we show that \(\gamma \ge H(3)\). To prove this, we construct a family of leaf-adjacent CA-Node-Steiner-Tree instances that are themselves trees, i.e., the optimal Steiner tree is the input graph itself, along with a specific witness tree for each instance that we prove to be optimal, such that the average H(w)-value exceeds \(H(3) - \varepsilon \), for any fixed \(\varepsilon > 0\). More formally, we prove the following theorem.

Theorem 11

For any \(\varepsilon > 0\), there exists a leaf-adjacent CA-Node-Steiner-Tree instance \(G_\varepsilon \) such that \(\gamma _{G_\varepsilon } > H(3) - \varepsilon \).

Proof

We give an explicit construction that corresponds to a leaf-to-leaf Block-TAP instance, i.e., every Steiner node is adjacent to exactly two terminals. Recall that leaf-to-leaf instances are a specific case of leaf-adjacent instances. Consider the following graph. Let \(t = \lceil \frac{2}{3\varepsilon } \rceil + 1\) and let \(S = \{s_1, \ldots , s_t\}\). Let \(R = \bigcup _{i = 1}^t\{r_i^{(1)}, r_i^{(2)}\}\). For every \(i \in [t-1]\), the node \(s_i\) is adjacent to \(s_{i+1}\). In other words, the nodes \(s_1, \ldots , s_t\) form a path whose endpoints are \(s_1\) and \(s_t\). Finally, every Steiner node \(s_i\) is adjacent to terminals \(r_i^{(1)}\) and \(r_i^{(2)}\). Let \(G_\varepsilon = (R \cup S, E)\) be the resulting CA-Node-Steiner-Tree instance. Observe that there is a unique optimal Steiner tree, which is the graph \(G_\varepsilon \) itself. This implies that

$$\begin{aligned} \gamma _{G_\varepsilon } = \frac{1}{t} \min _{\begin{array}{c} W:\; W\text { is }\\ \text {witness tree} \end{array}} \;\left( \sum _{i = 1}^t H(w(s_i)) \right) . \end{aligned}$$
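For concreteness, here is a small Python sketch that builds \(G_\varepsilon \) for a given t; the node labels and the function name are our own illustration, not part of the formal construction.

```python
def build_caterpillar(t):
    """G_eps: Steiner path s_1 ... s_t with two pendant terminals
    r_i^(1), r_i^(2) attached to each s_i (labels are our own)."""
    adj = {}
    def add(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for i in range(1, t + 1):
        add(("s", i), ("r", i, 1))
        add(("s", i), ("r", i, 2))
        if i < t:
            add(("s", i), ("s", i + 1))
    terminals = {("r", i, x) for i in range(1, t + 1) for x in (1, 2)}
    return adj, terminals

# Sanity check: 3t nodes and 3t - 1 edges, so the instance is a tree.
adj, R = build_caterpillar(4)
assert len(adj) == 12 and sum(len(nb) for nb in adj.values()) // 2 == 11
```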

We now consider the following witness tree W, that simply “follows” the path \(s_1, \ldots , s_t\). More precisely, we define \(W = (R, E_W)\) as follows:

  • There is an edge between \(r_i^{(1)}\) and \(r_i^{(2)}\) for every \(i \in [t]\).

  • There is an edge between \(r_i^{(1)}\) and \(r_{i+1}^{(1)}\) for every \(i \in [t-1]\).

It is easy to see that we have \(w(s_1) = w(s_t) = 2\), and \(w(s_i) = 3\) for every \(i \in \{2, \ldots , t-1\}\), and so we get that

$$\begin{aligned} \frac{1}{t} \sum _{i = 1}^t H(w(s_i)) = \frac{2H(2) + (t-2) H(3)}{t} = H(3) - \frac{2}{3t} > H(3) - \varepsilon . \end{aligned}$$
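Assuming, consistently with the values just computed, that the vector imposed by a witness tree assigns to each Steiner node the number of witness edges whose tree paths pass through it, the following self-contained Python sketch recomputes the w-values for W and verifies the identity above exactly; the node labels are ours, as in the previous sketch.

```python
from collections import deque
from fractions import Fraction

def H(n):
    # n-th harmonic number, computed exactly
    return sum(Fraction(1, k) for k in range(1, n + 1))

def impose_w(tree_adj, witness_edges, steiner):
    """w(v) = number of witness edges whose tree path passes through v
    (our reading of the imposed vector, matching the values above)."""
    def tree_path(src, dst):
        parent, q = {src: None}, deque([src])
        while q:
            u = q.popleft()
            for nb in tree_adj[u]:
                if nb not in parent:
                    parent[nb] = u
                    q.append(nb)
        out, u = [], dst
        while u is not None:
            out.append(u)
            u = parent[u]
        return out
    w = {v: 0 for v in steiner}
    for a, b in witness_edges:
        for v in tree_path(a, b):
            if v in w:
                w[v] += 1
    return w

t = 50
adj = {}   # rebuild G_eps as in the previous sketch
def add(u, v):
    adj.setdefault(u, set()).add(v); adj.setdefault(v, set()).add(u)
for i in range(1, t + 1):
    add(("s", i), ("r", i, 1)); add(("s", i), ("r", i, 2))
    if i < t:
        add(("s", i), ("s", i + 1))
# The witness tree W described above: pair edges plus the r^(1)-path.
W = [(("r", i, 1), ("r", i, 2)) for i in range(1, t + 1)]
W += [(("r", i, 1), ("r", i + 1, 1)) for i in range(1, t)]
w = impose_w(adj, W, {("s", i) for i in range(1, t + 1)})
assert w[("s", 1)] == w[("s", t)] == 2
assert all(w[("s", i)] == 3 for i in range(2, t))
avg = sum(H(wv) for wv in w.values()) / t
assert avg == H(3) - Fraction(2, 3 * t)   # exactly H(3) - 2/(3t)
```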

Thus, the only thing remaining to show is that W is an optimal witness tree with respect to minimizing \(\sum _{i = 1}^t H(w(s_i))\). For that, let \(W^*\) be an optimal witness tree with respect to this objective, and let \(w^*\) be the vector imposed on S by \(W^*\). We first argue that we may assume \((r_i^{(1)}, r_i^{(2)}) \in W^*\) for every \(i \in [t]\). Suppose that there is an \(i \in [t]\) with \((r_i^{(1)}, r_i^{(2)}) \notin W^*\). Adding the edge \((r_i^{(1)}, r_i^{(2)})\) to \(W^*\) creates a cycle, and clearly there exists an edge \(e' \in W^*\) incident to \(r_i^{(2)}\) such that \(W' = (W^* \setminus \{e'\}) \cup \{(r_i^{(1)}, r_i^{(2)})\}\) is a terminal spanning tree. Since the tree path corresponding to the new edge passes only through \(s_i\), which also lies on the tree path corresponding to \(e'\), we get \(w'(s_j) \le w^*(s_j)\) for every \(j \in [t]\), where \(w'\) is the vector imposed on S by \(W'\). We conclude that \(\sum _{i = 1}^t H(w'(s_i)) \le \sum _{i = 1}^t H(w^*(s_i))\), and so from now on we assume without loss of generality that \((r_i^{(1)}, r_i^{(2)}) \in W^*\) for every \(i \in [t]\).

We now impose some more structure on \(W^*\). In particular, we process the terminals in increasing order of index, and for each \(i \in [t]\), we replace every edge of the form \((r_i^{(2)}, r_j^{(x)})\), with \(j \ne i\) and \(x \in \{1,2\}\), by the edge \((r_i^{(1)}, r_j^{(1)})\). No w-value changes, since the tree path between any terminal of the i-th pair and any terminal of the j-th pair passes through exactly the Steiner nodes \(s_{\min \{i,j\}}, \ldots , s_{\max \{i,j\}}\). Hence, from now on we assume without loss of generality that for every \(i \in [t]\), the terminal \(r_i^{(2)}\) is a leaf of \(W^*\) that is connected to \(r_i^{(1)}\).

Finally, we turn to the edges \((r_i^{(1)}, r_{i+1}^{(1)})\) that are in W for every \(i \in [t-1]\). If \(W^* \ne W\), then there exist \(1 \le i < j \le t\) with \(j - i > 1\) such that \(e = (r_i^{(1)}, r_j^{(1)}) \in W^*\). We remove e from \(W^*\) and obtain two subtrees \(W_1^*\) and \(W_2^*\), where \(r_i^{(1)} \in W_1^*\) and \(r_j^{(1)} \in W_2^*\). Consider any \(k\in \{i+1,\dots ,j-1\}\), and assume first that \(r_k^{(1)}\in W^*_1\). We add the edge \(e' = (r_{k}^{(1)}, r_j^{(1)})\) and obtain a new witness tree \(W' = W_1^*\cup W_2^*\cup \{e'\}\). Note that the removal of e causes \(w(s_l)\) to decrease by 1 for every Steiner node \(s_l\) with \(i \le l \le j\), while the addition of \(e'\) increases \(w(s_l)\) by 1 only for the Steiner nodes \(s_l\) with \(k \le l \le j\). Hence \(w'(s_l) \le w^*(s_l)\) for every \(l \in [t]\), where \(w'\) is the vector imposed on S by \(W'\), and consequently \(\sum _{l = 1}^t H(w'(s_l)) \le \sum _{l = 1}^t H(w^*(s_l))\). The case \(r_k^{(1)}\in W^*_2\) is symmetric, adding the edge \((r_i^{(1)},r_k^{(1)})\) instead of \((r_{k}^{(1)}, r_j^{(1)})\). Repeating this exchange until no edge \((r_i^{(1)}, r_j^{(1)})\) with \(j - i > 1\) remains (each exchange strictly decreases the total span of the witness edges, so the process terminates), we conclude that W is an optimal witness tree with respect to minimizing \(\sum _{i = 1}^t H(w(s_i))\), and so \(\gamma _{G_\varepsilon } > H(3) - \varepsilon \).\(\square \)

An immediate corollary of the above theorem is the following.

Corollary 2

\(\gamma \ge H(3) = 1.8{\bar{3}}\).

The above lower bound shows a limitation of our techniques, as it demonstrates that one cannot hope to obtain a better than \(1.8{\bar{3}}\)-approximation by selecting a different (deterministic/randomized) witness tree.

6.2 Lower bound for the witness tree generated by Algorithm 2

Finally, besides the \(1.8{\bar{3}}\) lower bound, we also give an explicit example of a CA-Node-Steiner-Tree instance for which Algorithm 2 finds a witness tree that gives an average H(w)-value greater than 1.8504. This shows that our analysis of the approximation factor of Algorithm 1 with the witness tree generated by Algorithm 2 is off by at most 0.0416.

We now describe the CA-Node-Steiner-Tree instance \(T = (R\cup S^*, E^*)\), which is a tree; thus, the optimal solution is T itself. We will show that Algorithm 2 finds a witness tree spanning the terminals of T that gives an average H(w)-value greater than 1.8504. The tree T consists of five layers:

  • the \(1^{\text {st}}\) layer contains a single node r.

  • the \(2^{\text {nd}}\) layer consists of 9 nodes \(\{x_1, \ldots , x_9\}\). For each \(i \in [9]\), there is an edge \((r, x_i)\).

  • the \(3^{\text {rd}}\) layer is the set \(\bigcup _{i = 1}^9 \{y_{i1}, \ldots , y_{i5}\}\), which consists of 9 groups of 5 nodes. For each \(i \in [9]\) and \(j \in [5]\), there is an edge \((x_i, y_{ij})\).

  • the \(4^{\text {th}}\) layer is the set \(\bigcup _{i = 1}^9 \bigcup _{j = 1}^5 \{z_{ij1}, \ldots , z_{ij4}\}\), which consists of \(9 \times 5\) groups of 4 nodes. For each \(i \in [9]\), \(j \in [5]\) and \(k \in [4]\), there is an edge \((y_{ij}, z_{ijk})\).

  • the \(5^{\text {th}}\) layer is the set \(R = \bigcup _{i = 1}^9 \bigcup _{j = 1}^5 \bigcup _{k = 1}^4 \{q_{ijk}^{(1)},q_{ijk}^{(2)}\}\), which consists of \(9 \times 5 \times 4 \times 2\) terminals. For each \(i \in [9]\), \(j \in [5]\) and \(k \in [4]\), there is an edge \((z_{ijk}, q_{ijk}^{(1)})\) and an edge \((z_{ijk}, q_{ijk}^{(2)})\).

There are no other edges contained in T. As in Sect. 4.1, the set of final Steiner nodes is the set of vertices in the \(4^{\text {th}}\) layer, and since there are exactly two distinct terminals adjacent to each final Steiner node, we remove all the terminals and simply increase the w-value of each final Steiner node by 1. A pictorial representation of \(T \setminus R\) is provided in Fig. 6.

Fig. 6

The CA-Node-Steiner-Tree instance T with its first four layers of Steiner nodes; the set of terminals is omitted, final Steiner nodes are depicted with squares, and many edges are drawn with dotted lines to simplify the figure. Since T is a tree, the solution to the CA-Node-Steiner-Tree problem is trivially T itself
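Before analyzing the w-values, the following Python sketch (with our own node labels) builds T and confirms the node counts used below: \(\vert S^*\vert = 1 + 9 + 45 + 180 = 235\) Steiner nodes and 360 terminals.

```python
def build_layered_instance():
    """The five-layer tree T described above (labels are our own)."""
    adj = {}
    def add(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    terminals = set()
    for i in range(1, 10):                      # 2nd layer: x_1 ... x_9
        add("r", ("x", i))
        for j in range(1, 6):                   # 3rd layer: 5 children each
            add(("x", i), ("y", i, j))
            for k in range(1, 5):               # 4th layer: 4 children each
                add(("y", i, j), ("z", i, j, k))
                for x in (1, 2):                # 5th layer: 2 terminals each
                    q = ("q", i, j, k, x)
                    add(("z", i, j, k), q)
                    terminals.add(q)
    return adj, terminals

adj, R = build_layered_instance()
S = set(adj) - R
assert len(S) == 235 and len(R) == 360   # 1 + 9 + 45 + 180 Steiner nodes
```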

From now on, we denote the set of vertices of the subtree rooted at \(x_i\) as \(Q_{x_i}\). To use Algorithm 2, we root T at a final Steiner node, and without loss of generality, by the symmetry of T, we can root T at \(z_{222}\). Similarly, without loss of generality we can let the ordering of the leaves of T selected at step (1) of Algorithm 2 coincide with the lexicographic ordering of the leaf indices. With this ordering, for each non-final Steiner node \(u\in T\), we find the minimum-weight path P(u) described at step (3) of Algorithm 2, which we highlight in Fig. 7, along with the corresponding w-values.

Fig. 7

(a) The w-values for the nodes of \(Q_{x_i}\), depicted pictorially; the term \(b_i\) is equal to 8 if \(i=1\) and equal to 1 otherwise, and the red edges indicate the paths found by Algorithm 2. (b) A more detailed view of the w-values for every node in \(Q_{x_2}\), as well as the shape of \(Q_{x_2}\) when T is rooted at the node \(z_{222}\)

For each \(Q_{x_i}\), \(i\in \{3,\dots ,9\}\), we have

$$\begin{aligned} \sum _{v\in Q_{x_i}} H(w(v)) = 15H(2)+4H(4)+5H(5)+H(8)+H(9), \end{aligned}$$

and for \(Q_{x_1}\) we have

$$\begin{aligned} \sum _{v\in Q_{x_1}} H(w(v)) = 15H(2)+4H(4)+4H(5)+H(12)+H(15)+H(16). \end{aligned}$$

We now turn to the nodes in the set \(T\setminus \bigcup _{i \in [9] \setminus \{2\}}Q_{x_i}\), i.e., the root r together with \(Q_{x_2}\). It is not hard to see that \(w(r) = 8\), and

$$\begin{aligned} \sum _{v\in T\setminus \cup _{i \in [9] \setminus \{2\}} Q_{x_i}} H(w(v)) = 15H(2)+4H(4)+5H(5)+2H(8)+H(9). \end{aligned}$$

Putting everything together, we get that the average H(w)-value of the Steiner nodes of T is

$$\begin{aligned} \frac{\sum _{v\in S^*} H(w(v))}{\vert S^*\vert }&= \frac{\sum _{v\in T\setminus \cup _{i \in [9] \setminus \{2\}} Q_{x_i}} H(w(v)) + \sum _{i=3}^9 \sum _{v\in Q_{x_i}} H(w(v)) + \sum _{v\in Q_{x_1}} H(w(v)) }{235} \\&= \frac{135H(2) + 36H(4) + 44H(5) + 9H(8) + 8H(9) + H(12) + H(15) + H(16)}{235} \\&> 1.8504. \end{aligned}$$

We conclude that for this particular instance, the analysis of Algorithm 1 with the witness tree generated by Algorithm 2 necessarily gives an approximation factor strictly larger than 1.8504.
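As a sanity check on the aggregation above, the following Python snippet recomputes the harmonic-number totals exactly (the variable names are ours): it confirms that the per-component sums add up to the stated numerator and that the average indeed exceeds 1.8504.

```python
from fractions import Fraction

def H(n):
    # n-th harmonic number, computed exactly
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Per-component totals as stated above: 26 Steiner nodes in each Q_{x_i},
# and 27 in the remaining part (the root r together with Q_{x_2}).
generic = 15*H(2) + 4*H(4) + 5*H(5) + H(8) + H(9)              # Q_{x_i}, i in 3..9
special = 15*H(2) + 4*H(4) + 4*H(5) + H(12) + H(15) + H(16)    # Q_{x_1}
rest    = 15*H(2) + 4*H(4) + 5*H(5) + 2*H(8) + H(9)            # r and Q_{x_2}

total = rest + 7 * generic + special
assert total == (135*H(2) + 36*H(4) + 44*H(5) + 9*H(8) + 8*H(9)
                 + H(12) + H(15) + H(16))
print(float(total / 235))                 # 1.85047..., indeed > 1.8504
assert total / 235 > Fraction(18504, 10000)
```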