1 Introduction

Given an undirected connected graph \(G=(V,E)\), edge costs \(c: E \rightarrow {\mathbb {Q}}_{> 0}\) and a set \(T \subseteq V\) of terminals, the Steiner tree problem in graphs (SPG) is to find a tree \(S\subseteq G\) with \(T \subseteq V(S)\) such that \(c(E(S))\) is minimized. The SPG is a classic \(\mathscr {NP}\)-hard problem [23], and one of the most studied problems in combinatorial optimization. Part of its theoretical appeal might be attributed to the fact that the SPG generalizes two other classic combinatorial optimization problems: shortest paths and minimum spanning trees. On the practical side, many applications can be modeled as SPG or closely related problems, see e.g. [6, 27].

The SPG has seen numerous theoretical advances in the last 10 years, bringing forth significant improvements in complexity and approximability. See e.g. [5, 15] for approximation, and [24, 29, 47] for complexity results. However, when it comes to (practical) exact algorithms, the picture is significantly more bleak. After flourishing in the 1990s and early 2000s, algorithmic advances came to a staggering halt with the (joint) PhD theses of Polzin and Vahdati Daneshmand almost 20 years ago [31, 46]. They introduced a wealth of new results and algorithms for SPG, and combined them in an exact solver that drastically outperformed all previous results from the literature. Their work is also published in a series of articles [32,33,34,35,36]. However, their solver is not publicly available.

The 11th DIMACS Challenge in 2014, dedicated to Steiner tree problems, brought renewed interest to the field of exact algorithms. In the wake of the challenge, several new exact SPG solvers were introduced in the literature [13, 14, 17, 30]. Overall, the 11th DIMACS Challenge brought notable progress on the solution of notoriously hard SPG instances that had been designed to defy known solution techniques, see [26, 41]. Several of these instances could be solved for the first time to optimality. However, on the vast majority of instances from the literature, Polzin and Vahdati Daneshmand [31, 46] (whose solver did not compete at the DIMACS Challenge) stayed out of reach: For many benchmark instances, their solver is even two orders of magnitude or more faster, and it can furthermore solve substantially more instances to optimality—including those introduced at the DIMACS Challenge [37]. In 2018, the 3rd PACE Challenge [4] took place, dedicated to fixed-parameter tractable algorithms for SPG. Thus, the PACE Challenge considered mostly instances with a small number of terminals, or with small tree-width. Solvers that successfully participated in the PACE Challenge are for example described in [18, 21]. Still, even for these special problem types, the solver by [31, 46] remained largely unchallenged, see e.g. [21].

This article aims to advance the state of the art in exact SPG solution once again.

1.1 Contribution

This article is based on a combination of three concepts: Implications, conflicts, and reductions. As a result, various new SPG techniques are conceived. The main contributions are as follows.

  • By using a new implication concept, a distance function is conceived that provably dominates the well-known bottleneck Steiner distance. As a result, several reduction techniques that are stronger than results from the literature can be designed.

  • We show how to derive conflict information between edges from the above methods. Further, we introduce a new reduction operation whose main purpose is to introduce additional conflicts. Such conflicts can for example be used to generate cuts for the integer programming (IP) formulation.

  • We introduce a more general version of the powerful so-called extended reduction techniques. We furthermore enhance this framework by using both the previously introduced new distance concept, and the conflict information.

  • Finally, we integrate the components into a branch-and-cut algorithm. Besides preprocessing, domain propagation, and cuts, primal heuristics can also be improved (by using the new implication concept). The practical implementation is realized as an extension of the branch-and-cut solver SCIP-Jack [14].

The resulting exact SPG solver outperforms the current state-of-the-art solver from [31, 46] on many well-established benchmark sets from the literature. Furthermore, it can solve several instances for the first time to optimality.

1.2 Preliminaries and notation

We write \(G = (V,E)\) for an undirected graph, with vertices V and edges E. We set \(n := |V|\) and \(m := |E|\). We denote the vertices and edges of any subgraph \(S \subseteq G\) by V(S) and E(S), respectively. For a walk W we likewise denote the set of vertices and the set of edges it contains by V(W) and E(W). For any \(U \subseteq V\) we define the cut \(\delta (U):=\{ \{u,v\} \in E \mid u\in U, v\in V{\setminus }U\}\). We write \(\delta _G(U)\) to emphasize that the cut is defined with respect to graph G. For \(v \in V\) we write \(\delta (v) := \delta (\{v\})\). For any \(v \in V\) we define its neighborhood as \(N(v) := \{ w \in V \mid \{v,w\} \in \delta (v) \}\). Note that \(v \notin N(v)\).

Given edge costs \(c: E \mapsto {\mathbb {Q}}_{\ge 0}\), the triplet \((V,E,c)\) is referred to as a network. By \(d(v,w)\) we denote the cost of a shortest path (with respect to c) between vertices \(v, w \in V\). For any (distance) function \({\tilde{d}}: \binom{V}{2} \mapsto {\mathbb {Q}}_{\ge 0}\), and any \(U \subseteq V\) we define the \({\tilde{d}}\)-distance network on U as the network

$$\begin{aligned} D_G(U, {\tilde{d}}) := ( U, \left( {\begin{array}{c}U\\ 2\end{array}}\right) , {\tilde{c}}), \end{aligned}$$
(1)

with \({\tilde{c}}( \{v,w\} ) := {\tilde{d}}(v,w)\) for all \(v,w \in U\). If \( {\tilde{d}}\) is the standard distance (i.e. \({\tilde{d}}=d\)), we write \(D_G(U)\) instead of \(D_G(U, {d})\). Note that we usually write \({\tilde{d}}(v,w)\) instead of \({\tilde{d}}(\{v,w\})\). For an SPG instance on a graph \(G = (V,E)\) with terminal set \(T \subseteq V\) and edge costs c we write \((G,T,c)\) or \((V,E,T,c)\).

2 From implications to reductions

Reduction techniques have been a key ingredient in exact SPG solvers, see e.g. [9, 25, 33, 45]. Among these techniques, the bottleneck Steiner distance introduced in [11] is arguably the most important one, being the backbone of several powerful reduction methods. This section introduces a (provably) stronger distance concept, and discusses several applications for improved reduction methods.

2.1 The bottleneck Steiner distance

Let P be a simple path with at least one edge. The bottleneck length [11] of P is

$$\begin{aligned} bl(P) := \max _{e \in E(P)} c(e). \end{aligned}$$

Let \(v,w \in V\). Let \({\mathscr {P}}(v,w)\) be the set of all simple paths between v and w. The bottleneck distance [11] between v and w is defined as

$$\begin{aligned} b(v,w) := \inf \{ bl(P) \mid P \in {\mathscr {P}}(v,w) \}, \end{aligned}$$

with the common convention that \(\inf \emptyset = \infty \). It holds that \(b(v,w) = \infty \) if and only if v and w are not connected. Note that \(b(v,w)\) is equal to the bottleneck length of the path between v and w on any minimum spanning tree (MST) of \((G,c)\), as observed in [8].
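
The MST characterization suggests a simple way to evaluate a single bottleneck distance. The following Python sketch is an illustration only, not the implementation used for this article: it adds edges in order of non-decreasing cost, as in Kruskal's algorithm, and stops as soon as the two query vertices are joined. The function name `bottleneck_distance`, the integer vertex labels, and the convention of returning 0 for identical vertices are assumptions made for this example.

```python
def bottleneck_distance(n, edges, v, w):
    """Return b(v, w) on an undirected graph with vertices 0..n-1.

    edges is an iterable of (u, x, cost) triples. Edges are inserted in
    order of non-decreasing cost; the cost of the edge that first joins
    v and w equals the bottleneck distance.
    """
    if v == w:
        return 0  # convention chosen for this sketch only
    parent = list(range(n))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for u, x, cost in sorted(edges, key=lambda e: e[2]):
        parent[find(u)] = find(x)
        if find(v) == find(w):
            return cost
    return float("inf")  # v and w are not connected


edges = [(0, 1, 4), (1, 2, 1), (0, 2, 2), (2, 3, 5)]
print(bottleneck_distance(4, edges, 0, 1))
```

In the small example, the direct edge between vertices 0 and 1 costs 4, but the path via vertex 2 only uses edges of cost at most 2.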

Now consider the distance network \(D := D_G(T \cup \{v,w\})\). Let \(b_D\) be the bottleneck distance in D. Define the bottleneck Steiner distance or special distance [11] between v and w as

$$\begin{aligned} s(v,w) := b_D(v,w). \end{aligned}$$

The arguably best known bottleneck Steiner distance reduction method is based on the following criterion, which allows for edge deletion [11].

Theorem 1

Let \(e = \{v,w\} \in E\). If \(s(v,w) < c(e)\), then no minimum Steiner tree contains e.

Note the analogy between bottleneck distance applied to the MST problem, and bottleneck Steiner distance applied to the SPG: Any edge \(e = \{v,w\}\) that satisfies \(b(v,w) < c(e)\) cannot be part of an MST. Otherwise, e could be replaced by an edge of cost at most \(b(v,w)\) to obtain a spanning tree of smaller cost. Any edge \(e = \{v,w\}\) that satisfies \(s(v,w) < c(e)\) cannot be part of a minimum Steiner tree. Otherwise, e could be replaced by a path in G corresponding to an edge in \(D = D_G(T \cup \{v,w\})\) with cost at most \(b_D(v,w)\). In this case, one would obtain a Steiner tree of smaller cost. We also point out that bottleneck Steiner distances can be computed in polynomial time, but in practice (heuristic) approximations are used. See [33] for a state-of-the-art algorithm.
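
To make the definition of \(s(v,w)\) and the test of Theorem 1 concrete, the following Python sketch computes the bottleneck Steiner distance exactly on a small instance. It is a direct, unoptimized transcription of the definitions (shortest paths from every vertex of \(T \cup \{v,w\}\), followed by a minimax search in the distance network) and is not the algorithm from [33]. The adjacency format and all function names are assumptions made for this example.

```python
import heapq


def shortest_path_costs(adj, source):
    """Dijkstra: cost of a shortest path from source to each reachable vertex."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # outdated heap entry
        for x, cost in adj[u].items():
            if d + cost < dist.get(x, float("inf")):
                dist[x] = d + cost
                heapq.heappush(heap, (d + cost, x))
    return dist


def bottleneck_steiner_distance(adj, terminals, v, w):
    """s(v, w): bottleneck distance in the distance network on T + {v, w}."""
    nodes = set(terminals) | {v, w}
    dist = {u: shortest_path_costs(adj, u) for u in nodes}
    best = {u: float("inf") for u in nodes}  # best known minimax value from v
    best[v] = 0
    open_nodes = set(nodes)
    while open_nodes:
        u = min(open_nodes, key=lambda x: best[x])
        if u == w:
            break
        open_nodes.remove(u)
        for x in open_nodes:
            hop = dist[u].get(x, float("inf"))
            best[x] = min(best[x], max(best[u], hop))
    return best[w]


def deletable_by_theorem_1(adj, terminals, v, w):
    """Edge {v, w} can be deleted if s(v, w) < c({v, w})."""
    return bottleneck_steiner_distance(adj, terminals, v, w) < adj[v][w]


adj = {
    "v": {"w": 3, "t1": 2},
    "w": {"v": 3, "t1": 2, "t2": 1},
    "t1": {"v": 2, "w": 2},
    "t2": {"w": 1},
}
print(deletable_by_theorem_1(adj, {"t1", "t2"}, "v", "w"))
```

In this instance the edge between v and w has cost 3, whereas the path via terminal t1 consists of two distance-network edges of cost 2 each, so the criterion applies.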

2.2 A stronger bottleneck concept

In the following, we describe a generalization of the bottleneck Steiner distance. Initially, for an edge \(e = \{v,w\}\) define the restricted bottleneck distance \({\overline{b}}(e)\) [33] as the bottleneck distance between v and w on \((V,E{\setminus }\{e\},c)\).

The basis of the new bottleneck Steiner concept is formed by a node-weight function that we introduce in the following. For any \(v \in V{\setminus }T\) and \(F \subseteq \delta (v)\) define

$$\begin{aligned} p^+(v, F) := \max \left\{ 0, \sup \{ {\overline{b}}(e) - c(e) \mid e \in F, e \cap T \ne \emptyset \} \right\} . \end{aligned}$$
(2)

We call \(p^+(v, F)\) the F-implied profit of v. The following observation motivates the subsequent usage of the implied profit. Assume that \(p^+(v, \{ e \})>0\) for an edge \(e \in \delta (v)\). If a Steiner tree S contains v, but not e, then there is a Steiner tree \(S'\) with \(e \in E(S')\) such that \(c(E(S')) + p^+(v, \{ e \}) \le c(E(S))\).
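
As an illustration of definition (2), the following Python sketch evaluates the restricted bottleneck distance and the implied profit on a toy instance. It is a minimal sketch under stated assumptions: edges are stored as a dictionary from `frozenset` pairs to costs, and the function names are hypothetical.

```python
def restricted_bottleneck(edges, e):
    """Bottleneck distance between the endpoints of e in the graph without e.

    edges maps frozenset({u, x}) to the edge cost.
    """
    v, w = tuple(e)
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Kruskal-style scan over all edges except e.
    for f, cost in sorted(edges.items(), key=lambda item: item[1]):
        if f == e:
            continue
        a, b = tuple(f)
        parent[find(a)] = find(b)
        if find(v) == find(w):
            return cost
    return float("inf")


def implied_profit(edges, terminals, v, F):
    """p^+(v, F) for a non-terminal v and a subset F of its incident edges."""
    assert v not in terminals and all(v in e for e in F)
    gains = [
        restricted_bottleneck(edges, e) - edges[e]
        for e in F
        if e & terminals  # only edges with a terminal endpoint count
    ]
    return max([0] + gains)


vt, vu, tu = frozenset("vt"), frozenset("vu"), frozenset("tu")
edges = {vt: 1, vu: 2, tu: 4}
print(implied_profit(edges, {"t"}, "v", [vt]))
```

Here the only alternative connection between v and the terminal t has bottleneck length 4, while the edge between them costs 1, which gives an implied profit of 3.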

Let \(v, w \in V\). Consider a finite walk \(W = (v_{1},e_{1}, v_{2},e_{2},\ldots ,e_{r-1}, v_{r})\) with \(v_{1} = v\) and \(v_{r} = w\). We say that W is a \((v,w)\)-walk. For any \(k,l \in {\mathbb {N}}\) with \(1 \le k \le l \le r\) define the subwalk \(W({k}, {l}) := (v_{k}, e_{k},v_{{k+1}}, e_{{k+1}},\ldots ,e_{l-1}, v_{l})\). W will be called a Steiner walk if \(V(W) \cap T \subseteq \{v,w\}\) and v and w are each contained exactly once in W (the latter condition could be omitted, but has been added for ease of presentation). The set of all Steiner walks from v to w will be denoted by \({\mathscr {W}}_T(v,w)\). With a slight abuse of notation we define \(\delta _W(u) := \delta (u) \cap E(W)\) for any walk W and any \(u \in V\). First, for a Steiner walk \(W \in {\mathscr {W}}_T(v,w)\) define

$$\begin{aligned} P^+_W := \{ u \in V(W) \mid p^+\left( u, \delta (u){\setminus }\delta _W(u) \right) > 0 \} \cup \{v,w\}. \end{aligned}$$

Define the implied Steiner cost of W as

$$\begin{aligned} c_p^+(W) := \sum _{e \in E(W)} c(e) - \sum _{u \in P^+_W{\setminus }\{v,w\}} p^+\left( u, \delta (u) {\setminus } \delta _W(u) \right) . \end{aligned}$$

Define the implied Steiner length of W as

$$\begin{aligned} l_{p}^+(W) := \max \{c_p^+(W(v_{k},v_{\ell })) \mid 1 \le k \le \ell \le r,~ v_{k},v_{\ell } \in P^+_W\}. \end{aligned}$$
(3)
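
The following Python sketch evaluates the implied Steiner cost and the implied Steiner length of a given walk. It is meant as an illustration of the definitions under several assumptions: the walk is passed as a list of vertices, the values \({\overline{b}}(e) - c(e)\) of all edges with a terminal endpoint are supplied precomputed in the dictionary `gain`, and the cost of a subwalk is evaluated with respect to the edges of that subwalk. All names are hypothetical.

```python
def profit(u, walk_edges, gain):
    """p^+(u, delta(u) without the walk edges) from precomputed gains."""
    candidates = [g for e, g in gain.items() if u in e and e not in walk_edges]
    return max([0] + candidates)


def implied_steiner_cost(walk, cost, gain):
    """c_p^+ of a walk given as a list of vertices."""
    walk_edges = {frozenset(pair) for pair in zip(walk, walk[1:])}
    interior = set(walk) - {walk[0], walk[-1]}
    total = sum(cost[e] for e in walk_edges)  # each edge is counted once
    return total - sum(profit(u, walk_edges, gain) for u in interior)


def implied_steiner_length(walk, cost, gain):
    """l_p^+ of a walk: maximum implied cost of a subwalk between P^+ vertices."""
    walk_edges = {frozenset(pair) for pair in zip(walk, walk[1:])}
    ends = {walk[0], walk[-1]}
    in_p = [u in ends or profit(u, walk_edges, gain) > 0 for u in walk]
    r = len(walk)
    return max(
        implied_steiner_cost(walk[k:l + 1], cost, gain)
        for k in range(r)
        for l in range(k, r)
        if in_p[k] and in_p[l]
    )


# A path with unit costs whose two interior vertices each have profit 1.
walk = ["v0", "v1", "v2", "v3"]
cost = {frozenset(pair): 1 for pair in zip(walk, walk[1:])}
gain = {frozenset({"v1", "t1"}): 1, frozenset({"v2", "t2"}): 1}
print(implied_steiner_length(walk, cost, gain))
```

On this toy path every subwalk between two vertices of \(P^+_W\) has implied cost at most 1, so the implied Steiner length is 1, compared with 3 if no profits are available.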

To understand the usage of the implied Steiner length, consider the SPG instance segment shown in Fig. 1. Assume that edge \(\{v_1,v_4\}\) is part of Steiner tree S. Removing this edge from S results in two trees \(S'\) and \(S''\) with \(v_1 \in V(S')\), \(v_4 \in V(S'')\). Consider the Steiner walk \(W := (v_1, \{v_1,v_2\}, v_2, \{v_2,v_3\}, v_3, \{v_2,v_3\}, v_2, \{v_2,v_4\}, v_4)\). Note that \(p^+\left( v_3, \delta (v_3) {\setminus } \delta _W(v_3)\right) = 3\), and thus \(l_{p}^+(W) = 4\). We claim that \(S'\) and \(S''\) can be reconnected to a Steiner tree \({\tilde{S}}\) that is of smaller weight than S by using only edges from W. First, assume \(v_3\) is contained in either \(S'\) or \(S''\). In this case, we can use the edges \(\{v_1, v_2\},\{v_2, v_3\}\) (if \(v_3 \in V(S'')\)) or the edges \(\{v_2, v_3\},\{v_2, v_4\}\) (if \(v_3 \in V(S')\)) to reconnect \(S'\) and \(S''\). Second, assume that \(v_3\) is neither contained in \(S'\) nor in \(S''\). In this case, also the edge \(\{v_3, t_1\}\) cannot be contained in \(S'\) or \(S''\): Because S is a Steiner tree, we have \(t_1 \in V(S'')\). Indeed, also \(\{t_1, v_4\} \in E(S'')\) holds. Reconnect \(S'\) and \(S''\) by adding all edges of W that are neither in \(S'\) nor in \(S''\). This procedure results in a Steiner tree \({\tilde{S}}\). Next, add edge \(\{v_3, t_1\}\) and remove edge \(\{t_1, v_4\}\) from \({\tilde{S}}\). This exchange reduces the weight of \({\tilde{S}}\) by \(p^+\left( v_3, \delta (v_3) {\setminus } \delta _W(v_3)\right) \). Thus, the final Steiner tree \({\tilde{S}}\) satisfies \(c(E({\tilde{S}})) \le c(E({S})) - 1\).

Fig. 1. Segment of an SPG instance. Terminals are drawn as squares.

With the above discussion in mind, define the implied Steiner distance between v and w as

$$\begin{aligned} d_{p}^+(v,w) := \min \{ l_{p}^+(W) \mid W \in {\mathscr {W}}_{T}(v,w)\}. \end{aligned}$$

Note that \(d_{p}^+(v,w) = d_{p}^+(w,v)\). At last, consider the distance network \(D^+ := D_G(T \cup \{v,w\}, d_{p}^+)\). Let \(b_{D^+}\) be the bottleneck distance in \(D^+\). Define the implied bottleneck Steiner distance between v and w as

$$\begin{aligned} s_p(v,w) := b_{D^+}(v,w). \end{aligned}$$

Fig. 2. SPG instance with \(\tfrac{s(v_0,v_n)}{s_p(v_0,v_n)} = n\). Terminals are drawn as squares.

Note that \(s_p(v,w) \le s(v,w)\) and that the inequality can be strict. Indeed, \(\tfrac{s(v,w)}{s_p(v,w)}\) can become arbitrarily large, as Fig. 2 shows: It holds that \(s(v_0,v_n) = n\), but \(s_p(v_0,v_n) = 1\). To see the latter, consider the Steiner walk \(W = (v_0,\{v_0,v_1\},v_1,\ldots ,\{v_{n-1}, v_n\}, v_n)\). Each vertex \(v_1,\ldots ,v_{n-1}\) has an implied profit of 1 in W. Thus, \(l_{p}^+(W) = 1\). Because 1 is the minimum edge cost on any \((v_0,v_n)\)-walk and \(s_p(v_0,v_n) \le l_{p}^+(W)\) by definition, we also have \(s_p(v_0,v_n) = 1\).

The above discussion implies that the following result provides a strictly stronger reduction criterion than Theorem 1.

Theorem 2

Let \(e = \{v,w\} \in E\). If \(s_p(v,w) < c(e)\), then no minimum Steiner tree contains e.

Proof

Assume \(s_p(v,w) < c(e)\) and let S be a Steiner tree with \(e \in E(S)\). We will show the existence of a Steiner tree \({S'}\) with \(e \notin E({S'})\) such that \(c(E({S'})) < c(E(S))\), which concludes the proof. First, remove e from S to obtain a new subgraph \({\tilde{S}}\), which consists of exactly two connected components. Assume that each connected component contains at least one terminal (otherwise the proof is already finished). In the following, we will use a Steiner walk to reconnect \({\tilde{S}}\). First, we show the existence of such a reconnecting Steiner walk that has an implied Steiner length (3) smaller than c(e). Second, we add the edges of this walk to \({\tilde{S}}\), obtaining a Steiner tree. Third, we follow the same underlying idea as in the discussion for Fig. 1 and apply edge-exchange operations for each vertex of positive implied profit on the Steiner walk. In this way, the weight of \({\tilde{S}}\) is reduced.

Consider a \((v,w)\)-path P in \(D^+\) such that \(bl_{D^+}(P) = b_{D^+}(v,w)\). Let \(\{t,u\}\) be an edge on P such that t and u are in different connected components of \({\tilde{S}}\) (where t and u are considered in the original SPG). Let \({\tilde{S}}^t\) and \({\tilde{S}}^u\) be the connected components of \({\tilde{S}}\) such that \(t \in V({\tilde{S}}^t)\) and \(u \in V({\tilde{S}}^u)\). By the definition of the bottleneck length it holds that

$$\begin{aligned} d_{p}^+(t,u) \le s_p(v,w). \end{aligned}$$
(4)

Let \(W \in {\mathscr {W}}_T(t,u)\) such that

$$\begin{aligned} l_{p}^+(W) = d_{p}^+(t,u). \end{aligned}$$
(5)

Assume that W is given as \(W = (v_{1},e_{1},\ldots ,e_{r-1}, v_{r})\). Define \(b := \min \{k \in \{1,\ldots ,r\} \mid v_{k} \in V({\tilde{S}}^u) \}\) and \(a := \max \{k \in \{1,\ldots ,b\} \mid v_{{k}} \in V({\tilde{S}}^t)\}\). Further, define \(x := \max \{k \in \{1,\ldots ,a\} \mid v_{{k}} \in P^+_W \}\) and \(y := \min \{k \in \{b,\ldots ,r\} \mid v_{k} \in P^+_W \}\). By definition, \(x \le a < b \le y\) and furthermore:

$$\begin{aligned} \sum _{e \in E(W(a,b))} c(e) - \sum _{v \in V(W(a,b)) {\setminus } \{v_x,v_y\}} p^+\left( v, \delta (v) {\setminus } \delta _{W(x,y)} \right) \le c_p^+\left( W({x}, {y})\right) . \end{aligned}$$
(6)

Reconnect \({\tilde{S}}^t\) and \({\tilde{S}}^u\) by \(W(a,b)\), which yields a connected subgraph \(S'_0\) with \(T \subseteq V(S'_0)\). Assume that \(S'_0\) is a tree (otherwise remove any redundant edges). It holds that

$$\begin{aligned} \sum _{e \in E(S'_0)} c(e) \le \sum _{e \in E(S)} c(e) + \sum _{e \in E(W(a,b))} c(e) - c(\{v,w\}). \end{aligned}$$
(7)

Let \(v_1^+, v_2^+, \ldots , v_z^+\) be the vertices in \(P^+_{W({a}, {b})} {\setminus } \{v_a, v_b\}\), i.e., all vertices with positive implied profit in the interior of the walk \(W(a,b)\). Choose for each \(i = 1,\ldots ,z\) an edge \(e_i^+ \in \delta (v_i^+) {\setminus } \delta _{W(x,y)}(v_i^+)\) such that \(e_i^+ \cap T \ne \emptyset \) and

$$\begin{aligned} {\overline{b}}(e_i^+) - c(e_i^+) = p^+(v_i^+, \delta (v_i^+) {\setminus } \delta _{W({x}, {y})}). \end{aligned}$$
(8)

Note that all \(e_i^+\) are pairwise disjoint (just as the \(v_i^+\)).

We will construct Steiner trees \(S'_i\) for \(i \in \{1,\ldots ,z\}\) that satisfy

$$\begin{aligned} \sum _{e \in E(S'_i)} c(e) \le \sum _{e \in E(S'_0)} c(e)- \sum _{k = 1}^{i} p^+(v_k^+, \delta (v_k^+) {\setminus } \delta _{W({x}, {y})}), \end{aligned}$$
(9)

as well as

$$\begin{aligned} \bigcup _{k=i+1}^{z} \{e_k^+\} \cap E(S'_i) = \emptyset , \end{aligned}$$
(10)

and

$$\begin{aligned} V(S'_i) = V(S'_0). \end{aligned}$$
(11)

One readily verifies that \(S'_0\) satisfies (9)–(11). Let \(i \in \{1,\ldots ,z\}\) and assume that (9)–(11) hold for \(S'_{i-1}\). Thus, \(e_i^+ \notin E(S'_{i-1})\). Let \(P_{i}\) be the (unique) path in \(S'_{i-1}\) between \(v_i^+\) and the terminal \(t_i\) with \(\{t_i\} = e_i^+ \cap T\). Choose any \({\tilde{e}}_i \in E(P_i)\) with \(c({\tilde{e}}_i) = bl(P_i)\). Define the tree \(S'_{i}\) by \(V(S'_{i}) := V(S'_{i-1})\) and \(E(S'_{i}) := \left( E(S'_{i-1}) {\setminus } \{ {\tilde{e}}_i \}\right) \cup \{e_i^+\}\). We claim that \(S'_{i}\) satisfies (9)–(11). Condition (10) follows from the fact that all \(e_i^+\) are pairwise disjoint, and (11) follows from the construction of \(S'_{i}\). For (9), observe that by definition of the restricted bottleneck distance it holds that \(c({\tilde{e}}_i) \ge {\overline{b}}(e_i^+)\) and therefore

$$\begin{aligned} {\overline{b}}(e_i^+) - c(e_i^+) \le c({\tilde{e}}_i) - c(e_i^+). \end{aligned}$$

Thus, Eq. (8) implies that \(S'_{i}\) satisfies (9).

Finally, set \(S':= S'_{z}\). Because of (11) it holds that \(T \subseteq V(S')\). Furthermore, one obtains:

$$\begin{aligned} \sum _{e \in E(S')} c(e)&{\mathop {\le }\limits ^{(9)}} \sum _{e \in E(S'_0)} c(e) - \sum _{k = 1}^{z} p^+(v_k^+, \delta (v_k^+) {\setminus } \delta _{W({x}, {y})}) \end{aligned}$$
(12)
$$\begin{aligned}&{\mathop {\le }\limits ^{(7)}} \sum _{e \in E(S)} c(e) + \sum _{e \in E(W(a,b))} c(e) - c(\{v,w\}) \nonumber \\&- \sum _{k = 1}^{z} p^+(v_k^+, \delta (v_k^+) {\setminus } \delta _{W({x}, {y})}) \end{aligned}$$
(13)
$$\begin{aligned}&{\mathop {\le }\limits ^{(6)}} \sum _{e \in E(S)} c(e) - c(\{v,w\}) + c_p^+(W({x}, {y})) \end{aligned}$$
(14)
$$\begin{aligned}&{\mathop {\le }\limits ^{(5)}} \sum _{e \in E(S)} c(e) - c(\{v,w\}) + l_p^+(W) \end{aligned}$$
(15)
$$\begin{aligned}&{\mathop {\le }\limits ^{(4)}} \sum _{e \in E(S)} c(e) - c(\{v,w\}) + s_p(v,w) \end{aligned}$$
(16)
$$\begin{aligned}&~{<}~ \sum _{e \in E(S)} c(e), \end{aligned}$$
(17)

where the last inequality follows from the initial assumptions. \(\square \)

Furthermore, we define the restricted implied bottleneck Steiner distance \({\overline{s}}_p(v,w)\) between any \(v, w \in V\) as the implied bottleneck Steiner distance between v and w in the SPG \((V, E {\setminus } \left\{ \{v,w\}\right\} , T, c)\). One obtains the following corollary.

Corollary 1

Let \(e = \{v,w\} \in E\). If \({\overline{s}}_p(v,w) \le c(e)\), then at least one minimum Steiner tree does not contain e.

Fig. 3. Segment of a Steiner tree instance. Terminals are drawn as squares. The dashed edge can be deleted by employing Theorem 2.

Figure 3 shows a segment of an SPG instance for which Theorem 2 allows for the deletion of an edge, but Theorem 1 does not. The implied bottleneck Steiner distance between the endpoints of the dashed edge is 1—corresponding to a walk along the four non-terminal vertices. The edge can thus be deleted. In contrast, the (standard) bottleneck Steiner distance between the endpoints is 1.5 (corresponding to the edge itself).

Unfortunately, already computing the implied Steiner distance is hard, as the following proposition shows.

Proposition 1

Computing the implied Steiner distance is \(\mathscr {NP}\)-hard.

The proposition can for example be proved by a reduction from the Hamiltonian path problem, similar to a reduction for the prize-collecting Steiner distance concept in [44]. We note that it would also be possible to use the implied Steiner distance concept introduced in this article to generalize the Steiner distance concept used for the prize-collecting Steiner tree problem; see [39] for a definition that dominates the original one from [44]. However, formulating and proving this generalization is quite technical, and the computational benefit seems limited.

Finally, despite this \(\mathscr {NP}\)-hardness, one can devise heuristics that provide useful upper bounds on \(s_p\). We will discuss one such heuristic in the next section.

2.3 Approximating the implied bottleneck Steiner distance

This section describes one of the heuristics we use to delete edges by using an approximation of \(s_p\). Starting from a vertex \(v_0\), the heuristic tries to delete several edges of \(\delta (v_0)\) at once. Initially, define a distance array \({\tilde{d}}\) and a predecessor array pred as follows. For all \(u \in V {\setminus } \left( \{v_0\} \cup N(v_0) \right) \): \({\tilde{d}}[u] := \infty \) and \(pred[u] := null\). For all \(u \in N(v_0)\): \({\tilde{d}}[u] := c(\{v_0,u\})\) and \(pred[u] := v_0\). Moreover, set \({\tilde{d}}[v_0] := 0\) and \(pred[v_0] := v_0\). Finally, set \(Q := N(v_0)\).

While \(Q \ne \emptyset \), let \(v := \operatorname{arg\,min}_{u \in Q} {\tilde{d}}[u]\) and remove v from Q. For all \(\{v,w\} \in \delta (v)\) proceed as follows. First, set \(p_{vw} := \max \left\{ {p}^+(v,\{e\}) \mid e \in \delta (v) : w, pred[v] \notin e \right\} \). If

$$\begin{aligned} {\tilde{d}}[v] + c(\{v,w\}) - \min \left\{ c(\{v,w\}), p_{vw}, {\tilde{d}}[v] \right\} < {\tilde{d}}[w], \end{aligned}$$
(18)

then set \({\tilde{d}}[w]\) to the left-hand side of (18) and add w to Q. Further, set \(pred[w] := v\). If (18) holds and \(w \in N(v_0)\), then we can delete edge \(\{v_0,w\}\).

Note that on the left-hand side of (18) a possibly smaller value than \(p_{vw}\) is subtracted to prevent the algorithm from cycling. Furthermore, note that a terminal might be used more than once for a profit calculation \(p_{vw}\) on one walk. However, since we subtract only a bounded part of the profit from the distance value in (18), the algorithm still works correctly. Note that one can extend the algorithm to cover the case of equality for edge deletion. In this case, one also needs to check whether (18) is satisfied with equality if \(w \in N(v_0)\). In practice, one should bound the maximum number of visited edges (in the implementation for this article we simply use a fixed bound). Additionally, one can abort the algorithm if \(\min _{u \in Q} {\tilde{d}}[u] > \max _{e \in \delta (v_0)} c(e)\).
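
The steps of this heuristic can be sketched in Python as follows. This is a simplified illustration and not the implementation used for this article. The sketch assumes that the single-edge profits \(p^+(v,\{e\})\) are supplied precomputed in the dictionary `edge_profit` (missing entries count as 0), it uses a binary heap with lazy deletion for Q, and it reports the neighbors w of \(v_0\) for which the test succeeds, reading the edge to delete as the one between \(v_0\) and w. The function name and the default bound `max_edges` are hypothetical.

```python
import heapq


def deletable_edges_at(adj, v0, edge_profit, max_edges=1000):
    """Return the neighbors w of v0 whose edge to v0 passes test (18).

    adj maps each vertex to a dict {neighbor: cost}; edge_profit maps
    (v, x) to the precomputed profit p^+(v, {{v, x}}).
    """
    inf = float("inf")
    d = {u: inf for u in adj}
    pred = {u: None for u in adj}
    d[v0], pred[v0] = 0, v0
    for u, c in adj[v0].items():
        d[u], pred[u] = c, v0
    heap = [(d[u], u) for u in adj[v0]]
    heapq.heapify(heap)
    max_cost = max(adj[v0].values())
    deletable, visited = set(), 0
    while heap:
        dv, v = heapq.heappop(heap)
        if dv > d[v]:
            continue  # outdated heap entry
        if dv > max_cost:
            break  # no edge at v0 can be deleted any more
        for w, c in adj[v].items():
            visited += 1
            if visited > max_edges:
                return deletable
            # Best profit at v over edges avoiding both w and pred[v].
            p_vw = max(
                (edge_profit.get((v, x), 0) for x in adj[v]
                 if x != w and x != pred[v]),
                default=0,
            )
            candidate = dv + c - min(c, p_vw, dv)
            if candidate < d[w]:
                d[w], pred[w] = candidate, v
                heapq.heappush(heap, (candidate, w))
                if w in adj[v0]:
                    deletable.add(w)
    return deletable


adj = {
    "v0": {"a": 2, "b": 3},
    "a": {"v0": 2, "b": 2, "t": 1},
    "b": {"v0": 3, "a": 2},
    "t": {"a": 1},
}
print(deletable_edges_at(adj, "v0", {("a", "t"): 2}))
```

In the example, the walk from v0 via a to b has cost 4, but the profit of 2 at vertex a lowers its label to 2, which is below the cost 3 of the direct edge. Without the profit entry, no edge is reported.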

The above algorithm is also useful for finding a simple path between endpoints of an edge that is not longer than the edge itself. Other authors, e.g. [19, 33], suggest to run a shortest path algorithm from both endpoints of each edge of the given SPG for this purpose. However, running the above algorithm from each vertex is usually considerably faster in practice.

2.4 Bottleneck Steiner reductions beyond edge deletion

This section discusses applications of the implied bottleneck Steiner distance that allow for additional reduction operations: Edge contraction and node replacement. We start with the former. For an edge e and vertices \(v, w\) define \({b}_e(v,w)\) as the bottleneck distance between v and w on \((V,E {\setminus } \{e\},c)\). With this definition, we define a generalization of the classic NSV reduction test from [12].

Proposition 2

Let \(\{v,w\} \in E\) and \(t_i,t_j \in T, t_i \ne t_j\). If

$$\begin{aligned} s_p(v,t_i) + c(\{v,w\}) + s_p(w,t_j) \le b_{\{v,w\}}(t_i,t_j), \end{aligned}$$
(19)

then there is a minimum Steiner tree S with \(\{v,w\} \in E(S)\).

Proof sketch

Unfortunately, the use of the implied bottleneck Steiner distance makes the proof of the proposition far more difficult than that of the original result from [12]. To avoid an abundance of technicalities, we therefore only provide a proof sketch. For a detailed proof see the technical report [38].

Assume there is an optimal solution S such that \(\{v,w\} \notin E(S)\). Remove from E(S) an edge on the (unique) path between \(t_i\) and \(t_j\) in S of maximum cost. This operation results in two disjoint trees: \(S_i\) with \(t_i \in S_i\) and \(S_j\) with \(t_j \in S_j\). By definition of \(b_{\{v,w\}}(t_i,t_j)\) it holds that

$$\begin{aligned} c(E(S_i)) + c(E(S_j)) + b_{\{v,w\}}(t_i,t_j) \le c(E(S)). \end{aligned}$$
(20)

Now the sketchy part starts: Similar to the proof of Theorem 2, condition (19) allows us to connect \(S_i\) to v such that the resulting tree \({\tilde{S}}_i\) satisfies

$$\begin{aligned} c(E({\tilde{S}}_i)) \le c(E(S_i)) + s_p(v,t_i). \end{aligned}$$
(21)

Analogously, we can connect \(S_j\) to w with the result satisfying

$$\begin{aligned} c(E({\tilde{S}}_j)) \le c(E(S_j)) + s_p(w,t_j). \end{aligned}$$
(22)

However, the above is only true because the two Steiner walks that correspond to \(s_p(v,t_i)\) and \(s_p(w,t_j)\) in (21) and (22), respectively, have no vertex in common. If they had a vertex in common, one could build a new Steiner walk \(W_0\) with \(l_{p}^+(W_0) \le s_p(v,t_i) + s_p(w,t_j)\) out of the two above Steiner walks, such that \(W_0\) connects \(S_i\) and \(S_j\). This walk \(W_0\) could then be used to reconnect \(S_i\) and \(S_j\) to a Steiner tree of weight smaller than \( b_{\{v,w\}}(t_i,t_j)\).

Finally, we define \({\tilde{S}}\) as the union of \({\tilde{S}}_i\), \({\tilde{S}}_j\), and \(\{v,w\}\). This connected subgraph is not necessarily a tree, but can be made one without increasing \(c(E({\tilde{S}}))\) by deleting an edge from each cycle. From (20),  (21), and (22) it follows that

$$\begin{aligned} c(E({\tilde{S}})) \le c(E(S)), \end{aligned}$$

which concludes the proof. \(\square \)

If criterion (19) is satisfied, one can contract edge \(\{v,w\}\) and make the resulting vertex a terminal. The original criterion from [12] uses the standard distance in (19) instead of the implied bottleneck Steiner distance. We note that using the (standard) bottleneck Steiner distance in (19) does not improve the original test. However, using the implied bottleneck Steiner distance leads to a strictly stronger criterion, as the example in Fig. 4 shows. Note that \(b_{\{ t_1,v_1 \} }(t_1, t_3) = 2\) and \(s_p(v_1,t_3) = 1\). Thus, (19) is satisfied for edge \(\{ t_1,v_1 \}\) and terminals \(t_1,t_3\).

Fig. 4. Segment of a Steiner tree instance. Terminals are drawn as squares. The dashed edge can be contracted by employing Proposition 2.

The following proposition allows one to identify edges that are candidates for edge contraction. Afterwards, the bottleneck distances can be computed for all these edges in \(O(m + n \log n)\) amortized time [9].

Proposition 3

Let \(\{v,w\} \in E\) and \(t_i,t_j \in T, t_i \ne t_j\). If (19) holds, then there is a minimum spanning tree \(S_{MST}\) on \((V,E,c)\) such that \(\{v,w\} \in E(S_{MST})\).

Proof

Assume there is a minimum spanning tree S such that \(\{v,w\} \notin E(S)\). Remove from E(S) an edge on the (unique) path between \(t_i\) and \(t_j\) in S of maximum cost. By definition of \(b_{\{v,w\}}(t_i,t_j)\) it holds that

$$\begin{aligned} c(E(S_i)) + c(E(S_j)) + b_{\{v,w\}}(t_i,t_j) \le c(E(S)). \end{aligned}$$
(23)

This operation results in two disjoint trees: \(S_i\) with \(t_i \in S_i\) and \(S_j\) with \(t_j \in S_j\). If v and w are in different trees, one can add \(\{v,w\}\) to connect \(S_i\) and \(S_j\) and obtain a spanning tree of no higher cost than S. Otherwise, assume that \(v,w \in V(S_j)\). Let \(W_i\) be a Steiner walk from v to \(t_i\) with \(l_{p}^+(W_i) = s_p(v,t_i)\). There is at least one edge \(\{p,q\} \in E(W_i)\) such that \(p \in V(S_i)\) and \(q \in V(S_j)\). By definition it holds that \(c(\{p,q\}) \le l_{p}^+(W_i)\). Thus, one can add both \(\{p,q\}\) and \(\{v,w\}\) to \(S_i\), \(S_j\) to obtain a connected spanning subgraph \({S'}\). Because of condition (19) and (23) it holds that

$$\begin{aligned} c(E(S')) \le c(E(S)). \end{aligned}$$

Delete any edge other than \(\{v,w\}\) on the cycle in \(E(S')\) that includes \(\{v,w\}\). In this way one obtains a spanning tree \(S''\) of no higher cost than S. \(\square \)

This section closes with a reduction criterion based on the standard bottleneck Steiner distance. Besides being a new technique, this result also serves to highlight the complications that arise if one attempts to formulate similar conditions based on the implied bottleneck Steiner distance.

Proposition 4

Let \(D:= D_G(T, d)\). Let Y be a minimum spanning tree in D. Write its edges \(\{e^Y_1,e^Y_2,\ldots , e^Y_{|T|-1}\} := E(Y)\) in non-ascending order with respect to their weight in D.

Let \(v \in V {\setminus } T\). If for all \(\varDelta \subseteq \delta (v)\) with \(|\varDelta | \ge 3\) it holds that:

$$\begin{aligned} \sum _{i = 1}^{|\varDelta | - 1} d(e^Y_i) \le \sum _{e \in \varDelta } c(e), \end{aligned}$$
(24)

then there is at least one minimum Steiner tree S such that \(|\delta _S(v)| \le 2\).

The proposition follows from Corollary 3, which we will introduce in Sect. 4.2. If the conditions (24) are satisfied for a vertex \(v \in V {\setminus } T\), one can pseudo-eliminate [12] or replace [31] vertex v, i.e., delete v and connect any two vertices \(u, w \in N(v)\) by a new edge \(\{u,w\}\) of weight \(c(\{v, u\}) + c(\{v, w\})\).
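
The replacement operation itself is easy to state in code. The following Python sketch performs the graph transformation only and does not check conditions (24); it assumes an adjacency-dictionary representation and, as an additional assumption not taken from the text, keeps the cheaper edge whenever a newly created edge is parallel to an existing one. The function name is hypothetical.

```python
from itertools import combinations


def replace_vertex(adj, v):
    """Delete v and connect each pair of its neighbors u, w by an edge of
    cost c({v, u}) + c({v, w}). Modifies adj in place and returns it."""
    neighbors = adj.pop(v)
    for u in neighbors:
        del adj[u][v]
    for u, w in combinations(neighbors, 2):
        new_cost = neighbors[u] + neighbors[w]
        # Assumption of this sketch: of two parallel edges keep the cheaper.
        if new_cost < adj[u].get(w, float("inf")):
            adj[u][w] = new_cost
            adj[w][u] = new_cost
    return adj


adj = {
    "v": {"a": 1, "b": 2, "c": 3},
    "a": {"v": 1, "b": 5},
    "b": {"v": 2, "a": 5},
    "c": {"v": 3},
}
replace_vertex(adj, "v")
print(adj)
```

In the example, the existing edge between a and b of cost 5 is superseded by the new edge of cost 3, and the edges between a and c (cost 4) and between b and c (cost 5) are created.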

The SPG depicted in Fig. 5 exemplifies why Proposition 4 cannot be formulated by using the implied Steiner distance. The weight of the minimum spanning tree Y for \(D_G(T, d)\) is 4, but the weight of a minimum spanning tree with respect to the implied bottleneck Steiner distance is 2. Similarly, the \(BD_m\) reduction technique from [12] cannot be directly formulated by using the implied bottleneck distance either. Still, it is possible to formulate a similar criterion that makes use of the implied bottleneck distance. Unfortunately, both the result and the corresponding proof are more involved than those of their edge elimination counterparts (see Theorem 2). Thus, we omit the details here. The important point is to make sure that the selected Steiner walks do not overlap at vertices with a positive implied profit. However, these techniques have not been implemented yet.

Fig. 5. SPG instance. Terminals are drawn as squares.

3 From reductions to conflicts

This section shows an additional advantage of the just introduced node replacement reduction: The creation of conflicts between the newly inserted edges. Furthermore, a new replacement operation is introduced. We say that a set \(E' \subseteq E\) with \(|E'|\ge 2\) is in conflict if no minimum Steiner tree contains more than one edge of \(E'\).

3.1 Node replacement

Recall that we have seen three types of reductions so far: Edge deletion, edge contraction, and node replacement. For simplicity, we assume in the following that a reduction is only performed if it retains all optimal solutions. For example, we only delete an edge if we can show that there is no minimum Steiner tree that contains this edge. We say that such a reduction is valid. We start with an SPG instance \(I = (G,T,c)\), and consider a series of subsequent, valid reductions (of one of the three above types) that are applied to I. In each reduction step \(i \ge 0\), the current instance \(I^{(i)} = (G^{(i)},T^{(i)},c^{(i)})\) is transformed to instance \(I^{(i+1)} = (G^{(i+1)},T^{(i+1)},c^{(i+1)})\). We set \(I^{(0)} := I\). We define ancestor information for each \(i = 0,1,\ldots ,k\) by \(\varPi ^{(i)}: E^{(i)} \rightarrow {\mathscr {P}}(E)\) and \(\varPi ^{(i)}_{FIX} \subseteq E\). Initially, we set

  • \(\varPi ^{(0)}(e) := \{e\}\) for all \(e \in E\),

  • \(\varPi ^{(0)}_{FIX} = \emptyset \).

Consider a reduced instance \(I^{(i)}\). If we contract an edge \(e \in E^{(i)}\), we set \(\varPi ^{(i + 1)}_{FIX} := \varPi ^{(i)}_{FIX} \cup \varPi ^{(i)}(e)\). For any other operation we set \(\varPi ^{(i + 1)}_{FIX} := \varPi ^{(i)}_{FIX}\). If we replace a vertex \(v \in V^{(i)}\), then

  • for each newly inserted edge \(\{u,w\} \subset N(v)\) we set: \(\varPi ^{(i+1)}(\{u,w\}) := \varPi ^{(i)}(\{v,u\}) \cup \varPi ^{(i)}(\{v,w\})\),

  • for all other remaining edges e we set: \(\varPi ^{(i+1)}(e) := \varPi ^{(i)}(e)\).

Overall, one observes the following.

Observation 1

Let I be an SPG and let \(I^{(k)}\) be the SPG obtained from performing a series of k valid reductions on I. For any Steiner tree \(S^{(k)}\) for \(I^{(k)}\), the tree S with

$$\begin{aligned} E(S) = \bigcup _{e \in E^{(k)}(S^{(k)})} \varPi ^{(k)}(e) \cup \varPi ^{(k)}_{FIX} \end{aligned}$$

is a Steiner tree for I, and it holds that

$$\begin{aligned} c\left( E(S)\right) = c^{(k)}\left( E^{(k)}(S^{(k)})\right) + c\left( \varPi ^{(k)}_{FIX}\right) . \end{aligned}$$

Furthermore, if \(S^{(k)}\) is optimal for \(I^{(k)}\), then S is optimal for I.
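The ancestor bookkeeping behind Observation 1 can be illustrated with a small Python sketch. It is a simplified model and not the SCIP-Jack implementation: the class `ReducedInstance` and its methods are hypothetical names, node replacement inserts an edge for every pair of neighbors (no edge is discarded), and of two parallel edges only the cheaper one is kept.

```python
from itertools import combinations


def E(u, v):
    """Undirected edge as a hashable pair."""
    return frozenset((u, v))


class ReducedInstance:
    """Bookkeeping for Pi (edge ancestors) and Pi_FIX (fixed ancestors)."""

    def __init__(self, cost):
        self.cost = dict(cost)               # current weights c^(i)
        self.pi = {e: {e} for e in cost}     # Pi^(0)(e) = {e}
        self.pi_fix = set()                  # Pi^(0)_FIX is empty

    def _incident(self, v):
        return [e for e in self.cost if v in e]

    def _add(self, e, weight, ancestors):
        # Assumption: of two parallel edges only the cheaper one is kept.
        if e not in self.cost or weight < self.cost[e]:
            self.cost[e], self.pi[e] = weight, ancestors

    def contract(self, u, v):
        """Contract edge {u, v} into u; its ancestors become fixed."""
        e = E(u, v)
        self.pi_fix |= self.pi.pop(e)
        del self.cost[e]
        for f in self._incident(v):
            (x,) = f - {v}
            self._add(E(u, x), self.cost.pop(f), self.pi.pop(f))

    def replace(self, v):
        """Replace vertex v by edges between all pairs of its neighbors."""
        inc = {next(iter(f - {v})): f for f in self._incident(v)}
        new = [(E(a, b),
                self.cost[inc[a]] + self.cost[inc[b]],
                self.pi[inc[a]] | self.pi[inc[b]])
               for a, b in combinations(sorted(inc), 2)]
        for f in inc.values():
            del self.cost[f], self.pi[f]
        for e, weight, ancestors in new:
            self._add(e, weight, ancestors)

    def lift(self, reduced_edges):
        """Original edge set of Observation 1 for a reduced solution."""
        lifted = set(self.pi_fix)
        for e in reduced_edges:
            lifted |= self.pi[e]
        return lifted


# Path t1 - a - b - t2 with weights 1, 2, 3 (terminals t1 and t2).
original = {E("t1", "a"): 1, E("a", "b"): 2, E("b", "t2"): 3}
inst = ReducedInstance(original)
inst.replace("a")            # inserts {t1, b} with weight 1 + 2
inst.contract("t2", "b")     # fixes {b, t2}; {t1, b} becomes {t1, t2}
solution = [E("t1", "t2")]
lifted = inst.lift(solution)
reduced_cost = sum(inst.cost[e] for e in solution)
fixed_cost = sum(original[e] for e in inst.pi_fix)
```

In this toy instance the lifted edge set is the whole original path, and its cost equals the cost of the reduced solution plus the cost of the fixed edges, as stated in Observation 1.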

[34] observed that two edges that originate from a common edge by a series of replacements cannot both be contained in a minimum Steiner tree. Using the above notation, we can formulate the condition as follows: If \(e_1,e_2 \in E^{(k)}\) satisfy \(\varPi ^{(k)}(e_1) \cap \varPi ^{(k)}(e_2) \ne \emptyset \), then there is no minimum Steiner tree that contains both \(e_1\) and \(e_2\). As we will see in Sect. 4, such conflict information can be used for further reductions.

In the following, we will introduce an edge conflict criterion that is strictly stronger than the one from [34]. Initially, we define additional ancestor information for each \(i = 0,1,\ldots ,k\). Namely, sets of replacement ancestors \(\varLambda ^{(i)}: E^{(i)} \rightarrow {\mathscr {P}}({\mathbb {N}})\), and \(\varLambda ^{(i)}_{FIX} \in {\mathscr {P}}({\mathbb {N}})\). We set \(\varLambda ^{(0)}(e) := \emptyset \) for all \(e \in E\), and \(\varLambda ^{(0)}_{FIX} := \emptyset \). Further, we define \(\lambda ^{(0)} := 0\). Consider a reduced instance \(I^{(i)}\). If we contract an edge \(e \in E^{(i)}\), we set \(\varLambda ^{(i + 1)}_{FIX} := \varLambda ^{(i)}_{FIX} \cup \varLambda ^{(i)}(e)\). If we replace a vertex \(v \in V^{(i)}\), we set \(\lambda ^{(i + 1)} := \lambda ^{(i)} + 1\). Further, we define the replacement ancestors for each newly inserted edge \(\{u,w\} \subset N(v)\), as follows:

$$\begin{aligned} \varLambda ^{(i+1)}(\{u,w\}) := \varLambda ^{(i)}(\{v,u\}) \cup \varLambda ^{(i)}(\{v,w\}) \cup \{\lambda ^{(i)}\}. \end{aligned}$$

If no node replacement is performed, we set \(\lambda ^{(i + 1)} := \lambda ^{(i)}\).

Proposition 5

Let I be an SPG and let \(I^{(k)}\) be the SPG obtained from performing a series of k valid reductions on I. Further, let \(e_1,e_2 \in E^{(k)}\). If \(\varLambda ^{(k)}(e_1) \cap \varLambda ^{(k)}(e_2) \ne \emptyset \), then no minimum Steiner tree \(S^{(k)}\) for \(I^{(k)}\) contains both \(e_1\) and \(e_2\).

Proof

Suppose that there is a minimum Steiner tree \(S^{(k)}\) with \(e_1,e_2 \in E^{(k)}(S^{(k)})\). Let \(x \in \varLambda ^{(k)}(e_1) \cap \varLambda ^{(k)}(e_2)\). Let i be the first reduction iteration with \(\lambda ^{(i)} = x\). We may assume that \(i=1\). Otherwise, we can define additional ancestor information \({\overline{\varPi }}\) and \({\overline{\varLambda }}\) starting from \(I^{(i-1)}\), and perform the reductions from iteration i to iteration k. Let v be the vertex that is replaced in iteration \(i=1\). Note that \(x=\lambda ^{(1)}=1\). From Observation 1 we know that the tree S defined by \(E(S) = \bigcup _{e \in E^{(k)}(S^{(k)})} \varPi ^{(k)}(e) \cup \varPi ^{(k)}_{FIX}\) is a minimum Steiner tree for I. Because of \(\lambda ^{(1)} \in \varLambda ^{(k)}(e_1) \cap \varLambda ^{(k)}(e_2)\), however, we have that \(\left| \left( \varPi ^{(k)}(e_1) \cup \varPi ^{(k)}(e_2)\right) \cap \delta _S(v)\right| \ge 3\). This implies that replacing v is not valid, a contradiction.\(\square \)

Corollary 2

Let I, \(I^{(k)}\) as in Proposition 5, and let \(e \in E^{(k)}\). If \(\varLambda ^{(k)}(e) \cap \varLambda ^{(k)}_{FIX} \ne \emptyset \), then no minimum Steiner tree \(S^{(k)}\) for \(I^{(k)}\) contains e.

Note that any edge e as in Corollary 2 can be deleted.
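The replacement ancestors and the two resulting tests (Proposition 5 and Corollary 2) can be sketched in Python as follows. The sketch only models the ancestor bookkeeping: `ReplacementAncestors` is a hypothetical class, contraction does not merge endpoints, and it is assumed that no newly inserted edge is parallel to an existing one.

```python
from itertools import combinations


def E(u, v):
    """Undirected edge as a hashable pair."""
    return frozenset((u, v))


class ReplacementAncestors:
    """Tracks Lambda per edge, Lambda_FIX, and the counter lambda."""

    def __init__(self, edges):
        self.lam = {e: frozenset() for e in edges}   # Lambda^(0)(e) is empty
        self.lam_fix = frozenset()                   # Lambda^(0)_FIX is empty
        self.counter = 0                             # lambda^(0) = 0

    def replace_vertex(self, v):
        """Node replacement: all new edges share the current counter value."""
        inc = {next(iter(e - {v})): e for e in self.lam if v in e}
        tag = frozenset({self.counter})
        self.counter += 1
        new = {E(a, b): self.lam[inc[a]] | self.lam[inc[b]] | tag
               for a, b in combinations(sorted(inc), 2)}
        for e in inc.values():
            del self.lam[e]
        # Assumption: no new edge is parallel to an existing one.
        self.lam.update(new)

    def contract(self, e):
        """Edge contraction: only the ancestor bookkeeping is modelled."""
        self.lam_fix |= self.lam.pop(e)

    def in_conflict(self, e1, e2):
        """Condition of Proposition 5."""
        return bool(self.lam[e1] & self.lam[e2])

    def deletable(self, e):
        """Condition of Corollary 2."""
        return bool(self.lam[e] & self.lam_fix)


# Star with centre v and neighbours a, b, c, plus an unrelated edge {c, d}.
anc = ReplacementAncestors([E("v", "a"), E("v", "b"), E("v", "c"), E("c", "d")])
anc.replace_vertex("v")
new_edges = [E("a", "b"), E("a", "c"), E("b", "c")]
pairwise = [anc.in_conflict(e1, e2) for e1, e2 in combinations(new_edges, 2)]
unrelated = anc.in_conflict(E("a", "b"), E("c", "d"))
anc.contract(E("a", "b"))
```

After the replacement of `v`, the three new edges are pairwise in conflict, while the untouched edge `{c, d}` is in conflict with none of them. Once `{a, b}` is contracted, the other two new edges satisfy the condition of Corollary 2.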

3.2 Edge replacement

This subsection introduces a new replacement operation, whose primary benefit lies in the conflicts it creates. We start with a condition that allows us to perform this operation.

Proposition 6

Let \(e = \{v,w\} \in E\) with \(e \cap T = \emptyset \). Define

$$\begin{aligned} {\mathscr {D}} := \left\{ \varDelta \subseteq \left( \delta (v) \cup \delta (w)\right) {\setminus } \{e\} \mid \varDelta \cap \delta (v) \ne \emptyset , \varDelta \cap \delta (w) \ne \emptyset \right\} . \end{aligned}$$

For any \(\varDelta \in {\mathscr {D}}\) let

$$\begin{aligned} U_{\varDelta } := \left\{ u \in V \mid \{u,v\} \in \varDelta \vee \{u,w\} \in \varDelta \right\} . \end{aligned}$$

If for all \(\varDelta \in {\mathscr {D}}\) with \(|\varDelta | \ge 3\) the weight of a minimum spanning tree on \(D_G(U_{\varDelta }, s)\) is smaller than \(c(\varDelta )\), then each minimum Steiner tree S satisfies \(|\delta _S(v)| \le 2\) and \(|\delta _S(w)| \le 2\).

The proposition can be proven by using Corollary 3, which will be introduced in Sect. 4. If the condition of Proposition 6 is satisfied, we can perform what we will call a path replacement of e: We delete e and add for each pair \(p,q \in V\) with \(p \in N(v) {\setminus } \{w\}\), \(q \in N(w) {\setminus } \{v\}\), \(p \ne q\) an edge \(\{p,q\}\) with weight \(c(\{p,v\}) + c(\{v,w\}) + c(\{q,w\})\). At first glance, the apparent increase in the number of edges by this operation seems highly disadvantageous. However, due to the increased weight, the new edges can often be deleted by using the criterion from Theorem 2. Furthermore, an edge does not need to be inserted if any two of the three edges it originates from have a common replacement ancestor. Indeed, we only perform a path replacement if at most one of the new edges needs to be inserted. The case that all new edges can be deleted is in principle also covered by the extended reduction technique introduced in the next section (albeit being potentially far more expensive). If exactly one new edge remains, we create new replacement ancestors as follows: Let \({\hat{e}} = \{p,q\}\) be the newly inserted edge. Initially, set \(\lambda ^{(i+1)} := \lambda ^{(i)}\) and \(\varLambda ^{(i+1)}({\hat{e}}) := \varLambda ^{(i)}(\{p,v\}) \cup \varLambda ^{(i)}(\{v,w\}) \cup \varLambda ^{(i)}(\{w,q\})\). Next, for each \(e' \in \left( \delta (v) \cup \delta (w)\right) {\setminus } \{e\}\) increment \(\lambda ^{(i+1)}\), and add \(\lambda ^{(i+1)}\) to \(\varLambda ^{(i+1)}({\hat{e}})\) and \(\varLambda ^{(i+1)}(e')\). One can show that Proposition 5 remains valid if path replacement is added to the list of valid reduction operations.

Figure 6 illustrates an application of Proposition 6. In this example, all but one of the replacement edges can be deleted by using a simple alternative path argument. While the number of edges remains unchanged, six new conflicts are created.
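The enumeration of candidate edges for a path replacement can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name and the data layout (`cost` and `lam` as dictionaries keyed by edges) are hypothetical, the condition of Proposition 6 is assumed to have been verified beforehand, and the subsequent deletion test via Theorem 2 is not modelled.

```python
from itertools import combinations


def E(u, v):
    """Undirected edge as a hashable pair."""
    return frozenset((u, v))


def path_replacement_candidates(cost, lam, v, w):
    """Candidate edges (p, q, weight) for a path replacement of {v, w}.

    cost: {edge: weight}; lam: {edge: set of replacement ancestors}.
    """
    e = E(v, w)
    n_v = sorted(next(iter(f - {v})) for f in cost if v in f and f != e)
    n_w = sorted(next(iter(f - {w})) for f in cost if w in f and f != e)
    candidates = []
    for p in n_v:
        for q in n_w:
            if p == q:
                continue
            sources = (E(p, v), e, E(q, w))
            # Skip the edge if two of its three source edges share an ancestor.
            if any(lam[a] & lam[b] for a, b in combinations(sources, 2)):
                continue
            candidates.append((p, q, sum(cost[f] for f in sources)))
    return candidates


# Hypothetical instance: v has neighbours p1, p2; w has neighbours q1, q2.
cost = {E("p1", "v"): 1, E("p2", "v"): 2, E("v", "w"): 1,
        E("w", "q1"): 1, E("w", "q2"): 3}
lam = {f: set() for f in cost}
lam[E("p1", "v")] = {7}      # {p1, v} and {w, q1} share replacement ancestor 7
lam[E("w", "q1")] = {7}
candidates = path_replacement_candidates(cost, lam, "v", "w")
```

Here the pair (p1, q1) is skipped because of the shared replacement ancestor, so three candidates remain. In the setting described above, the path replacement would only be carried out if at most one of them survives the deletion tests.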

Fig. 6: Segment of a Steiner tree instance (showing only non-terminals). All edges except for the dashed ones have unit weight. The dashed edge in (a) has been replaced in (b). All edges that are in conflict with the replacement edge in (b) are drawn in bold.

4 From Steiner distances and conflicts to extended reduction techniques

At the end of the last section we have seen a reduction method that inspects a number of trees (of depth 3) that extend an edge considered for replacement. This section continues along this path, based on the reduction concepts introduced so far.

Given a tree Y (e.g. a single edge), extended reduction techniques use an enumeration of trees that contain Y to show that there is an optimal Steiner tree that does not contain Y. The trees are built by iteratively enlarging or extending Y. During this process, reduction, conflict, and implication techniques are employed to rule out these extensions of Y. In this way, extended reduction techniques are loosely related to the concepts of probing and conflict (graph) analysis for mixed-integer programming (MIP), see e.g. [1, 42].

The idea of extension was first introduced in [48] for the rectilinear Steiner tree problem. Later the idea was adopted by [45] for the SPG. The next advancement came in [10], where backtracking was used, together with a number of new reduction criteria for the enumerated trees. Finally, [34] introduced the up-to-now strongest extended reduction techniques, which improved and complemented the previous results. The authors showed that their sophisticated algorithm could drastically reduce the size of many benchmark SPG instances, and even allowed for the solution of previously intractable instances.

In the following, we introduce new extended reduction algorithms that (provably) dominate those by [34].

4.1 The framework

For a tree Y in G, let \(L(Y) \subseteq V(Y)\) be the set of its leaves. We start with several definitions from [34]. Let \(Y'\) be a tree with \(Y' \subseteq Y\). The linking set between Y and \(Y'\) is the set of all vertices \(v \in V(Y')\) such that there is a path \(Q \subseteq Y\) from v to a leaf of Y with \(V(Q) \cap V(Y') = \{v\}\). Note that Q can consist of a single vertex. \(Y'\) is peripherally contained in Y if the linking set between Y and \(Y'\) is \(L(Y')\). Figure 7 exemplifies this concept. To motivate these definitions, consider a path Q without inner terminals between vertices v and w. For Q to not be peripherally contained in a minimum Steiner tree, it is sufficient that \(s(v,w)\) is smaller than the weight of Q. This condition is not sufficient, however, to show that Q is not contained in a minimum Steiner tree. Still, if Q is indeed contained in a minimum Steiner tree, at least one of its inner vertices needs to have degree greater than 2 in this tree. Thus, we can exploit this observation to enumerate extensions of Q from those inner vertices and attempt to rule these extensions out. Deductions of this kind are used in extended reduction techniques.

Fig. 7: Illustration of peripheral inclusion. The bold subtree is peripherally contained in the entire tree in (a), but not in (b).

For any \(P \subseteq V(Y)\) with \(|P| > 1\) let \(Y_P\) be the union of the (unique) paths between any \(v,w \in P\) in Y. Note that \(Y_P\) is a tree, and that \(Y_P \subseteq Y\) holds. P is called a pruning set if it contains the linking set between \(Y_P\) and Y. Additionally, we will use the following new definition: P is called a strict pruning set if it is equal to the linking set between \(Y_P\) and Y. Figure 8 provides an example of pruning and strict pruning sets. One readily verifies the following property of pruning sets.

Observation 2

Let Y be a tree, and let \(Y' \subseteq Y\) be a tree that is peripherally contained in Y. Further, let \(P \subseteq V(Y')\). If P is a pruning set for \(Y'\), then P is also a pruning set for Y. If P is a strict pruning set for \(Y'\), then P is also a strict pruning set for Y.

Fig. 8: Illustration of pruning and strict pruning sets. The filled vertices in (a) form a (non-strict) pruning set, whereas the filled vertices in (b) constitute a strict pruning set.
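The definitions of linking sets, peripheral containment, and (strict) pruning sets can be made concrete with a short Python sketch. All function names are hypothetical, trees are given as adjacency dictionaries, and the example tree is made up for illustration (it is not the one from Fig. 7 or Fig. 8).

```python
def leaves(adj):
    """Leaves of a tree given as {vertex: set of neighbours}."""
    return {v for v, nb in adj.items() if len(nb) == 1}


def linking_set(adj_y, sub_vertices):
    """Linking set between Y and the subtree Y' with vertex set sub_vertices."""
    y_leaves = leaves(adj_y)
    linking = set()
    for v in sub_vertices:
        # Search in Y from v without entering any other vertex of Y'.
        seen, stack = {v}, [v]
        while stack:
            u = stack.pop()
            if u in y_leaves:          # reached a leaf of Y (possibly v itself)
                linking.add(v)
                break
            for x in adj_y[u]:
                if x not in seen and x not in sub_vertices:
                    seen.add(x)
                    stack.append(x)
    return linking


def subtree_vertices(adj_y, p):
    """V(Y_P): repeatedly strip leaves of Y that are not in P."""
    alive = set(adj_y)
    degree = {v: len(nb) for v, nb in adj_y.items()}
    stack = [v for v in alive if degree[v] == 1 and v not in p]
    while stack:
        v = stack.pop()
        alive.discard(v)
        for x in adj_y[v]:
            if x in alive:
                degree[x] -= 1
                if degree[x] == 1 and x not in p:
                    stack.append(x)
    return alive


def is_peripherally_contained(adj_y, adj_sub):
    return linking_set(adj_y, set(adj_sub)) == leaves(adj_sub)


def is_pruning_set(adj_y, p):
    return linking_set(adj_y, subtree_vertices(adj_y, p)) <= set(p)


def is_strict_pruning_set(adj_y, p):
    return linking_set(adj_y, subtree_vertices(adj_y, p)) == set(p)


# Y: path a - b - c - d - e with an extra leaf f attached to c.
Y = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d", "f"},
     "d": {"c", "e"}, "e": {"d"}, "f": {"c"}}
PATH = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
SUB = {"b": {"c"}, "c": {"b", "d"}, "d": {"c"}}      # subtree b - c - d
```

In this example, the subtree b - c - d is peripherally contained in the plain path, but not in `Y`, because the inner vertex c reaches the leaf f without passing through the subtree. Likewise, {a, e} is not a pruning set for `Y`, {a, b, c, e} is a non-strict pruning set, and both {a, c, e} and the leaf set {a, e, f} are strict pruning sets.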

Additionally, we define a stronger, and new, inclusion concept. Consider a tree \(Y \subseteq G\), and a subtree \(Y'\). Let P be a pruning set for \(Y'\). We say that \(Y'\) is P-peripherally contained in Y if P is a pruning set for Y. Now let P be a strict pruning set for \(Y'\). We say that \(Y'\) is strictly P-peripherally contained in Y if P is a strict pruning set for Y. From Observation 2 one obtains the following important property.

Observation 3

Let \(Y \subseteq G\) be a tree, let \(Y' \subseteq Y\) be a subtree, and let P be a pruning set for \(Y'\). If \(Y'\) is peripherally contained in Y, then \(Y'\) is also P-peripherally contained in Y.

In fact, we will use the contrapositive of the observation: If \(Y'\) is not P-peripherally contained in Y, then \(Y'\) is not peripherally contained in Y. Note that an analogous property holds for strict pruning sets.

Given a tree Y and a set \(E' \subseteq E\), we write with a slight abuse of notation \(Y + E'\) for the subgraph with the edge set \(E(Y) \cup E'\). Algorithm 1 shows a high level description of the extended reduction framework used in this article. The framework is similar to the one introduced in [34], but more general (see footnote 3). Note that the algorithm is recursive.

A possible input for Algorithm 1 is an SPG instance together with a single edge. If the algorithm returns true, the edge can be deleted. Besides ExtensionSets, which is described in Algorithm 2, the extended reduction framework contains the following subroutines:

  • RuledOut(I, Y, P) is given an SPG \(I =(G,T,c)\), a tree \(Y \subseteq G\), and a pruning set P for Y such that \(V(Y_P) \cap T \subseteq L(Y_P)\). The routine returns true if Y is shown to not be P-peripherally contained in any minimum Steiner tree. Otherwise, the routine returns false.

  • RuledOutStrict(I, Y, P) is given an SPG \(I =(G,T,c)\), a tree \(Y \subseteq G\), and a strict pruning set P for Y such that \(V(Y_P) \cap T \subseteq L(Y_P)\). The routine returns true if Y is shown to not be strictly P-peripherally contained in any minimum Steiner tree. Otherwise, the routine returns false.

  • StrictPruningSets(I, Y) is given an SPG \(I =(G,T,c)\) and a tree \(Y \subseteq G\). It returns a subset of all strict pruning sets for Y. A typical strict pruning set is L(Y).

  • Truncate(I, Y) is given an SPG \(I =(G,T,c)\) and a tree \(Y \subseteq G\). The routine returns true if no further extensions of Y should be performed; otherwise the routine returns false.

  • Promising(I, Y, v) is given an SPG \(I =(G,T,c)\), a tree \(Y \subseteq G\), and a vertex \(v \in L(Y)\). The routine returns true if further extensions of Y from v should be performed; otherwise the routine returns false.

The usage of P-peripheral inclusion in RuledOut might appear somewhat awkward, but it is necessary for ruling out not only trees (as in line 2 of Algorithm 1), but also all possible extensions via a single edge (as in line 4 of Algorithm 2). We explain the extended reduction framework via an example at the end of Sect. 4.2.

figure a
figure b

In Lines 1–3 of Algorithm 1, we try to peripherally rule out tree Y. If that is not possible, we try to recursively extend Y in Lines 5–14. Since (given positive edge weights) no minimum Steiner tree has a non-terminal leaf, we can extend from any of the non-terminal leaves of Y. Note that ruling out all extensions along one single leaf is sufficient to rule out Y. The correctness of Extended-RuledOut can be proven by induction (under the assumption that the subroutines are correct). We also remark that under certain conditions it is possible to replace the condition "not peripherally contained in any minimum Steiner tree" by the condition "not peripherally contained in at least one minimum Steiner tree". See also the discussion following Theorem 3.
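The recursive control flow described above can be sketched in Python. This is a reconstruction from the prose only, not a transcription of Algorithm 1 or of SCIP-Jack code: the subroutines are passed in as callbacks, the distinction between RuledOut and RuledOutStrict is folded into a single `ruled_out` callback, and the toy graph and the toy extension routine are hypothetical.

```python
from itertools import combinations


def E(u, v):
    """Undirected edge as a hashable pair."""
    return frozenset((u, v))


def tree_leaves(tree):
    """Leaves of a tree given as a frozenset of edges."""
    degree = {}
    for e in tree:
        for v in e:
            degree[v] = degree.get(v, 0) + 1
    return sorted(v for v, d in degree.items() if d == 1)


def extended_ruled_out(tree, terminals, ruled_out, truncate, promising,
                       extension_sets):
    """True if `tree` is shown not to be peripherally contained in any
    minimum Steiner tree (assuming the callbacks are correct)."""
    if ruled_out(tree):                 # direct tests on the current tree
        return True
    if truncate(tree):                  # give up on further enumeration
        return False
    for v in tree_leaves(tree):
        if v in terminals or not promising(tree, v):
            continue
        # One leaf suffices: every extension along v must be ruled out.
        if all(extended_ruled_out(tree | ext, terminals, ruled_out, truncate,
                                  promising, extension_sets)
               for ext in extension_sets(tree, v)):
            return True
    return False


GRAPH = {E("a", "b"), E("b", "c"), E("b", "d")}      # hypothetical toy graph


def toy_extension_sets(tree, v):
    """All non-empty sets of edges at v that leave the tree (no cycles)."""
    inside = set().union(*tree)
    cand = sorted((e for e in GRAPH - tree
                   if v in e and not (e - {v}) <= inside), key=sorted)
    return [frozenset(c) for r in range(1, len(cand) + 1)
            for c in combinations(cand, r)]


def never(*_):
    return False


def always(*_):
    return True


start = frozenset({E("a", "b")})
# With the single terminal a, every extension ends in a non-terminal dead end.
single_terminal = extended_ruled_out(start, {"a"}, never, never, always,
                                     toy_extension_sets)
# With terminals a, c, d the extensions end in terminal leaves and survive.
three_terminals = extended_ruled_out(start, {"a", "c", "d"}, never, never,
                                     always, toy_extension_sets)
```

Even with a `ruled_out` callback that never succeeds, the first call returns true: each extension ends in a non-terminal leaf that cannot be extended further, and no minimum Steiner tree has a non-terminal leaf. The second call returns false, since all leaves of the extended trees are terminals.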

Although the extended reduction framework shown in Algorithm 1 looks simple, an efficient realization is highly intricate, not least because the interaction of many different algorithmic components needs to be taken into account. Also, the re-use of intermediate results obtained during the tree extension (such as bottleneck Steiner distances) is non-trivial.

We just note here that we have only implemented extensions in a depth-first-search manner: We extend only from leaves that are farthest away from the initial tree Y. A stronger, but potentially more expensive, alternative is to employ full backtracking, as partially done in [34]. In the following, we concentrate on mathematical descriptions of the subroutines for ruling-out enumerated trees.

4.2 Reduction criteria

In this section we introduce several elimination criteria used within RuledOut and RuledOutStrict. In fact, both of these routines consist of several subalgorithms that check different criteria for eliminating the given tree. Note that any criterion that is valid for RuledOut is also valid for RuledOutStrict. We also note that several of the criteria in this section are similar to results from [31, 34], but are all stronger. Throughout this section we consider a graph \(G =(V,E)\) and an SPG instance \(I =(G,T,c)\).

Consider a tree \(Y \subseteq G\), and a pruning set P for Y such that \(V(Y_P) \cap T \subseteq L(Y_P)\). For each \(p \in P\) let \({\overline{Y}}_p \subset Y\) such that \(V({\overline{Y}}_p)\) is exactly the set of vertices \(v \in V(Y)\) that satisfy the following: For any \(q \in P {\setminus } \{p\}\) the (unique) path in Y from v to q contains p. Note that when removing \(E(Y_P)\) from Y, each non-trivial connected component equals one \({\overline{Y}}_p\). Further, note that \(p \in V({\overline{Y}}_p)\) for all \(p \in P\). Let \(G_{Y,P} = (V_{Y,P}, E_{Y,P})\) be the graph obtained from \(G = (V,E)\) by contracting for each \(p \in P\) the subtree \({\overline{Y}}_p\) into p. For any parallel edges, we keep only one of minimum weight. We identify the contracted vertices \(V({\overline{Y}}_p)\) with the original vertex p. Overall, we thus have \(V_{Y,P} \subseteq V\). Let \(c_{Y,P}\) be the edge weights on \(G_{Y,P}\) derived from c. Let

$$\begin{aligned} T_{Y,P} := \big (T \cap V_{Y,P}\big ) \cup \{ p \in P \mid T \cap V({\overline{Y}}_p) \ne \emptyset \}. \end{aligned}$$

Finally, let \(s_{Y,P}\) be the bottleneck Steiner distance on \((G_{Y,P}, T_{Y,P}, c_{Y,P})\). With these definitions at hand, we are able to formulate a reduction criterion that generalizes a number of results from the literature. See [19, 31] for similar, but weaker, conditions.

Theorem 3

Let \(Y \subseteq G\) be a tree, and let P be a pruning set for Y such that \(V(Y_P) \cap T \subseteq L(Y_P)\). Let \(I_{Y,P}\) be the SPG on the distance network \(D_{G_{Y,P}}\big (V_{Y,P}, s_{Y,P}\big )\) with terminal set P. If the weight of a minimum Steiner tree for \(I_{Y,P}\) is smaller than \(c(E(Y_P))\), then Y is not P-peripherally contained in any minimum Steiner tree for I.

Proof

Let S be a (not necessarily minimum) Steiner tree for I such that Y is P-peripherally contained in S. Let \(S_{Y,P}\) be a minimum Steiner tree for \(I_{Y,P}\). The underlying idea of the proof is as follows: First, we remove \(Y_P\) from S. Next, we interconnect all vertices in P. Because of the assumptions of the theorem, this procedure also reconnects S. To obtain a tree that is of smaller weight than S, we use only edges for the reconnection that correspond to edges of \(S_{Y,P}\).

Let \({\tilde{S}} \subset G\) be the forest defined as follows:

$$\begin{aligned} V({\tilde{S}})&:= (V(S) {\setminus } V(Y_P)) \cup V(S_{Y,P}), \end{aligned} \tag{25}$$
$$\begin{aligned} E({\tilde{S}})&:= E(S) {\setminus } E(Y_P). \end{aligned} \tag{26}$$

Let \(\tilde{{\mathscr {C}}}\) be the set of connected components of \({\tilde{S}}\). Further, let \(f: V \rightarrow \tilde{{\mathscr {C}}} \cup \{ \emptyset \}\) such that \(f(v) = {\tilde{C}}\) if \(v \in V({\tilde{C}})\) for a \({\tilde{C}} \in \tilde{{\mathscr {C}}}\), and \(f(v) = \emptyset \) otherwise. Note that each \({\tilde{C}} \in \tilde{{\mathscr {C}}}\) contains at least one vertex of P, and thus also at least one vertex of \(S_{Y,P}\). Also, \(f(v) \ne \emptyset \) for all \(v \in V(S_{Y,P})\). Further, note that for each of the contracted subtrees \({\overline{Y}}_p\) there is a \({\tilde{C}} \in \tilde{{\mathscr {C}}}\) with \({\overline{Y}}_p \subseteq {\tilde{C}}\). In the following, we will iteratively connect all the components in \(\tilde{{\mathscr {C}}}\).

While \(|\tilde{{\mathscr {C}}}| > 1\) proceed as follows. Choose a \((v,w) \in E(S_{Y,P})\) with \(f(v) \ne f(w)\) such that \(s_{Y,P}(v,w)\) is minimized. Let W be a \((v,w)\)-walk in \(G_{Y,P}\) corresponding to \(s_{Y,P}(v,w)\). Because of \(f(v) \ne f(w)\), there is at least one subwalk \(Q = W(q,r)\) of W such that \(f(q),f(r) \ne \emptyset \), \(f(q) \ne f(r)\), and \(f(u) = \emptyset \) for all \(u \in V(Q) {\setminus } \{q,r\}\). Note that \(c(E(Q)) \le s_{Y,P}(v,w)\), because \(f(t) \ne \emptyset \) for all \(t \in T\). As long as such a path Q exists, proceed as follows. Add Q to \({\tilde{S}}\), and remove from \(E(S_{Y,P})\) an (arbitrary) edge of the path between f(q) and f(r) in \(S_{Y,P}\). Also, update \(\tilde{{\mathscr {C}}}\) and f. Note that the weight of the removed edge (with respect to \(s_{Y,P}\)) is at most \(s_{Y,P}(q,r)\).

Once \(|\tilde{{\mathscr {C}}}| = 1\), one notes that the summed up weight of all newly inserted paths (with respect to c) does not exceed the weight of \(S_{Y,P}\) (with respect to \(s_{Y,P}\)). Because the weight of \(S_{Y,P}\) is smaller than \(c(E(Y_P))\), we obtain from the construction of \({\tilde{S}}\) that

$$\begin{aligned} c(E({\tilde{S}})) < c(E(S)), \end{aligned}$$

which concludes the proof. \(\square \)

In practice, one does not need to explicitly form \(G_{Y,P}\). Instead, one can use the (original) bottleneck Steiner distances between the connected components of the graph induced by \(E(Y) {\setminus } E(Y_P)\). Note that one can also extend Theorem 3 to the case of equality if at least one vertex of \(Y_P\) is not contained in any of the paths corresponding to the s values used for edges of \(S_{Y,P}\). However, in the context of extended reduction techniques one needs to be careful to not discard all of several equivalent extensions. We omit the quite technical details, but merely note that allowing for equality (and adding suitable checks) can have a significant impact for some instances.

In practice, computing a minimum Steiner tree (or even an approximation) on \(D_{G_{Y,P}}\big (V_{Y,P}, s_{Y,P}\big )\) is often too expensive. In such cases, the following corollary provides a strong alternative.

Corollary 3

Let Y, P as in Theorem 3. Let \((P',P'')\) be a partition of P. Let \(F'\) be an MST on \(D_{G_{Y,P}}\big (P', s_{Y,P}\big )\), and let \(z'\) be the weight of \(F'\). Let \(F''\) be an MST on \(D_{G_{Y,P}}\big (T_{Y,P}, s_{Y,P}\big )\). Write \(\{e^{F''}_1,e^{F''}_2,\ldots , e^{F''}_{|T_{Y,P}|-1}\} := E_{Y,P}(F'')\) such that \(s_{Y,P}(e^{F''}_i) \ge s_{Y,P}(e^{F''}_j)\) for \(i < j\). Define

$$\begin{aligned} z'' := \sum _{i = 1}^{|P''|} s_{Y,P}(e^{F''}_i). \end{aligned}$$

If \(z' + z'' < c(E(Y_P))\), then Y is not P-peripherally contained in any minimum Steiner tree for I.

Proof

First, note that if \(P''= \emptyset \), then the corollary follows directly from Theorem 3, because \(z'\) is a lower bound on the weight of a minimum Steiner tree in \(I_{Y,P}\). Thus, we assume \(P'' \ne \emptyset \) in the following.

Suppose there is a minimum Steiner tree S for I such that Y is P-peripherally contained in S. Define \({\tilde{S}}\) as in the proof of Theorem 3. Further, proceed as in the proof of Theorem 3 to reconnect all connected components of \({\tilde{S}}\) that contain a vertex from \(P'\). As a result, \({\tilde{S}}\) has at most \(|P''| + 1\) connected components. Because S is assumed to be optimal, each connected component of \({\tilde{S}}\) contains at least one terminal. Thus, we can reconnect the remaining connected components similarly to Theorem 3, by using paths corresponding to edges of \(F''\). We need to add at most \(|P''|\) such paths. Overall, we have increased the weight of \({\tilde{S}}\) by at most \(z' + z''\). From \(z' + z'' < c(E(Y_P))\) we obtain that

$$\begin{aligned} c(E({\tilde{S}})) < c(E(S)), \end{aligned}$$

which contradicts the optimality of S. \(\square \)

As for Theorem 3, the contractions in Corollary 3 should only be performed implicitly in practice. Furthermore, one requires a careful implementation to avoid a recomputation from scratch of the two minimum spanning trees in Corollary 3 for each enumerated tree in Algorithm 1.
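To make the bound \(z' + z''\) of Corollary 3 concrete, the following Python sketch computes it from a given distance function. It is a plain illustration and not the incremental implementation alluded to above: the function names are hypothetical, the distances \(s_{Y,P}\) are assumed to be available through a callback `dist`, and all numbers in the example are made up.

```python
def mst_weights(nodes, dist):
    """Edge weights of an MST on the complete graph over `nodes` (Prim)."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return []
    best = {v: dist(nodes[0], v) for v in nodes[1:]}
    weights = []
    while best:
        v = min(best, key=best.get)        # closest vertex outside the tree
        weights.append(best.pop(v))
        for u in best:
            best[u] = min(best[u], dist(v, u))
    return weights


def corollary3_bound(p1, p2, t_yp, dist):
    """z' + z'' for the partition (p1, p2) of P and the terminal set t_yp."""
    heavy = sorted(mst_weights(t_yp, dist), reverse=True)
    if len(p2) > len(heavy):
        raise ValueError("the MST on T_{Y,P} needs at least |P''| edges")
    z1 = sum(mst_weights(p1, dist))        # z': MST weight on P'
    z2 = sum(heavy[:len(p2)])              # z'': the |P''| heaviest MST edges
    return z1 + z2


# Hypothetical bottleneck Steiner distances s_{Y,P}.
S = {frozenset(("p1", "p2")): 2.0,
     frozenset(("t1", "t2")): 1.0,
     frozenset(("t2", "t3")): 3.0,
     frozenset(("t1", "t3")): 4.0}


def dist(u, v):
    return S[frozenset((u, v))]


bound = corollary3_bound(["p1", "p2"], ["p3"], ["t1", "t2", "t3"], dist)
ruled_out = bound < 5.5                    # 5.5 plays the role of c(E(Y_P))
```

In this example the MST on \(P'\) has weight 2, the MST on the terminal set has edge weights 1 and 3, and the single heaviest edge contributes 3, so the bound is 5 and the tree would be ruled out for \(c(E(Y_P)) = 5.5\).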

Next, let \(Y \subseteq G\) be a tree with pruning set P, let \(v,w \in V(Y)\), and let Q be the path between v and w in Y. We define a pruned tree bottleneck between v and w as a subpath \(Q(a,b)\) of Q that satisfies \(|\delta _Y(u)| = 2\) and \(u \notin P\) for all \(u \in V(Q(a,b)) {\setminus } \{a,b\}\), \(V(Q(a,b)) \cap T \subseteq \{a,b\}\), and maximizes \(c(E(Q(a,b)))\). The weight \(c(E(Q(a,b)))\) of such a pruned tree bottleneck is denoted by \(b_{Y,P}(v,w)\). Using this definition and the implied bottleneck Steiner distance, we obtain the following result.

Proposition 7

Let Y be a tree, let P be a pruning set for Y, and let \(v,w \in V(Y)\). If \(s_p(v,w)< b_{Y,P}(v,w)\), then Y is not P-peripherally contained in any minimum Steiner tree.

The proposition can be proven in a similar way as Theorem 2 (and is indeed a generalization of the latter).
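A pruned tree bottleneck can be computed by a single pass along the tree path, as the following Python sketch shows. The function names are hypothetical, and the example tree only mimics the shape of a star with one extension; its edge weights are made up and not taken from Fig. 9.

```python
def tree_path(adj, v, w):
    """Vertices of the unique path from v to w in a tree."""
    parent = {v: None}
    stack = [v]
    while stack:
        u = stack.pop()
        for x in adj[u]:
            if x not in parent:
                parent[x] = u
                stack.append(x)
    path = [w]
    while path[-1] != v:
        path.append(parent[path[-1]])
    return path[::-1]


def pruned_tree_bottleneck(adj, cost, pruning_set, terminals, v, w):
    """Weight b_{Y,P}(v, w) of a pruned tree bottleneck between v and w."""
    path = tree_path(adj, v, w)
    best = current = 0.0
    for a, b in zip(path, path[1:]):
        current += cost[frozenset((a, b))]
        best = max(best, current)
        # b may only be an inner vertex of a bottleneck if it has degree 2
        # in Y and is neither in P nor a terminal; otherwise start anew.
        if len(adj[b]) != 2 or b in pruning_set or b in terminals:
            current = 0.0
    return best


# Star around v3 with leaves t1, v2, v5, extended by the edge {t2, v5}.
adj = {"t1": {"v3"}, "v2": {"v3"}, "v3": {"t1", "v2", "v5"},
       "v5": {"v3", "t2"}, "t2": {"v5"}}
cost = {frozenset(("t1", "v3")): 1.0, frozenset(("v2", "v3")): 1.0,
        frozenset(("v3", "v5")): 1.0, frozenset(("v5", "t2")): 1.5}
P = {"t1", "t2", "v2"}
T = {"t1", "t2"}
b_t1_t2 = pruned_tree_bottleneck(adj, cost, P, T, "t1", "t2")
b_t1_v2 = pruned_tree_bottleneck(adj, cost, P, T, "t1", "v2")
implied_distance = 2.0                    # assumed value of s_p(t1, t2)
ruled_out = implied_distance < b_t1_t2    # condition of Proposition 7
```

Between t1 and t2 the bottleneck consists of the two edges after the branching vertex v3 and has weight 2.5; between t1 and v2 every subpath is cut at v3, so the bottleneck is a single edge of weight 1. With the assumed implied distance of 2, the condition of Proposition 7 holds for the pair (t1, t2).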

Based on the SPG instance in Fig. 9, we demonstrate the usage of the extended reduction framework and the above reduction criteria in the following. We aim to replace (or pseudo-eliminate) vertex \(v_3\). To show that this operation is valid, we prove that the tree Y with \(V(Y) = \{v_3\} \cup N(v_3)\), \(E(Y) = \delta (v_3)\) is not peripherally contained in any minimum Steiner tree. We call Algorithm 1 with Y as defined above. We are neither able to rule out Y in Line 2, nor do we truncate the search in Line 4. In Line 5, we consider vertex \(v_5\) and mark it as promising. The extension sets obtained from Algorithm 2 are: \( \left\{ \{ t_2, v_5 \} \right\} , \left\{ \{ v_4, v_5 \} \right\} \), and \(\left\{ \{ t_2, v_5 \}, \{ v_4, v_5 \} \right\} \). We (recursively) call Algorithm 1 for each of these three extensions in Line 8.

First, we consider the extension via the edge \(\{ t_2, v_5 \}\). The tree \(Y' := Y + \left\{ \{ t_2, v_5 \} \right\} \) with pruning set \(P = \{t_1, t_2, v_2 \}\) can be shown to not be P-peripherally contained in a minimum Steiner tree by using Proposition 7: It holds \(s_p(t_1,t_2) = 2 < 2.5 = b_{Y',P}(t_1,t_2)\), where the pruned tree bottleneck corresponds to the edges \(\{v_3,v_5\}\) and \(\{t_2,v_5\}\).

Next, we consider the extension via the edge \(\{ v_4, v_5 \}\). We are not able to rule out this extension, and thus extend the tree \(Y' := Y + \left\{ \{ v_4, v_5 \} \right\} \) from vertex \(v_4\). The extension set obtained from Algorithm 2 is just \(\left\{ \{ v_4, v_6 \} \right\} \), because any extension of \(Y'\) via the edge \(\{ v_2, v_4 \}\) would result in a cycle and can thus be discarded. However, the tree \(Y'' := Y' + \left\{ \{ v_4, v_6 \} \right\} \) with pruning set \(P = \{t_1, v_2, v_6\}\) can be ruled out by using Proposition 7: It holds that \(s_p(t_1,v_6) = 2 < 3 = b_{Y'',P}(t_1,v_6)\), where the pruned tree bottleneck corresponds to the edges \(\{v_3,v_5\}\), \(\{v_4,v_5\}\), and \(\{v_4,v_6\}\).

Finally, we consider the extension via the edge set \(\left\{ \{ t_2, v_5 \}, \{ v_4, v_5 \} \right\} \). We are not able to rule out this extension, and thus extend the tree \(Y' := Y + \left\{ \{ t_2, v_5 \}, \{ v_4, v_5 \} \right\} \) from vertex \(v_4\). As before, the extension set obtained from Algorithm 2 is \(\left\{ \{ v_4, v_6 \} \right\} \). The tree \(Y'' := Y' + \left\{ \{ v_4, v_6 \} \right\} \) with pruning set \(P = \{t_1, t_2, v_2, v_6\}\) can again be ruled out by using Corollary 3: It holds that \(c(E(Y''_P)) = 6.5\), but the weight of an MST on \(D_{G_{Y'',P}}\big (P, s_{Y'',P}\big )\) is 6; the edges of the MST on \(D_{G_{Y'',P}}\big (P, s_{Y'',P}\big )\) are \(\{t_1,t_2\}\), \(\{t_1,v_2\}\), and \(\{t_2,v_6\}\).

In summary, all extensions of the initial tree Y along vertex \(v_5\) are ruled out in the first call of Algorithm 1. Thus, the algorithm returns true, which implies that vertex \(v_3\) can be replaced.

Fig. 9: Segment of a Steiner tree instance. Terminals are drawn as squares. By using the extended reduction framework, one can show that vertex \(v_3\) can be replaced.

Another criterion can be devised by using the reduced costs of the well-known bidirected cut formulation [50] for SPG. This formulation is based on the observation that any optimal Steiner arborescence for the bidirected equivalent of a given SPG instance with arbitrary root \(r \in T\) corresponds to an optimal Steiner tree for the original SPG. Let \(D = (V,A)\) be the bidirected equivalent of G, and let \(r \in T\). Consider a dual solution for the bidirected cut formulation, with reduced costs \({\tilde{c}}\), and with objective value \({\tilde{L}}\). Further, for any \(v,w \in V\), let \({\tilde{d}}(v,w)\) be the length of a shortest directed path from v to w in D with respect to the reduced costs. From the observation that an optimal Steiner arborescence cannot contain any cycles, we obtain the following result with standard linear programming arguments:

Proposition 8

Let Y be a tree. Let \(P = \{p_1,\ldots ,p_k\}\) be a strict pruning set for Y such that there is a \(k' \le k\) with \(p_i \in T\) if and only if \(i > k'\). Further, assume that \(V(Y_P) \cap T \subseteq L(Y_P)\), and \(|P| < |T|\). The weight of any Steiner tree that strictly P-peripherally contains Y is at least

$$\begin{aligned} {\tilde{L}} + \min _{i \in \{1,\ldots ,k\}} \max _{\{t_1,\ldots t_{i-1}, t_{i+1},\ldots ,t_{k'} \} \subseteq T {\setminus } V(Y_P)} \left\{ {\tilde{d}}(r,p_i) + \sum _{j\le k', j \ne i} {\tilde{d}}(p_j,t_j)\right\} . \end{aligned} \tag{27}$$

Given an upper bound on the cost of a minimum Steiner tree, this proposition can be used in the RuledOutStrict routine. In practice, we only use a lower bound on the \(\max \) subterm in (27).

Finally, another important reduction criterion is constituted by edge conflicts. This result follows directly from Proposition 5.

Corollary 4

Let \(I^{(k)}\) be an SPG obtained from performing a series of k valid reductions on an SPG I. Let \(Y \subseteq G^{(k)}\) be a tree, and let P be a pruning set for Y. If there are distinct edges \(e_1, e_2 \in E^{(k)}(Y)\) such that \(\varLambda ^{(k)}(e_1) \cap \varLambda ^{(k)}(e_2) \ne \emptyset \), then Y is not P-peripherally contained in any minimum Steiner tree.

5 Exact solution

This section describes how to use the techniques introduced so far for the exact solution of SPG. The new methods have been implemented as an extension of the branch-and-cut solver SCIP-Jack [14].

5.1 Branch-and-cut

As shown in [33], reduction techniques are the most important ingredient in a state-of-the-art SPG solver. While [33] uses linear programming and branch-and-bound mostly to trigger further reductions, we employ a proper branch-and-cut approach, based on [14]. On a high level, the solution process of SCIP-Jack can be naturally divided into three phases.

First, the presolving phase. Here, reduction techniques (combined with primal and dual heuristics) are employed to decrease the problem size. As can be seen in Sect. 5.2 and in the detailed results in the appendix, many instances are already drastically reduced in this phase.

Second, the linear programming (LP) based separation phase at the branch-and-bound root node. SCIP-Jack employs a specialized separation algorithm, see e.g. [14], to compute lower bounds based on the well-known bidirected cut formulation [50]. Additionally, several specialized methods such as primal heuristics and domain propagation are employed. For domain propagation, we also employ a modified version of the extended reduction techniques that makes use of the reduced costs from the LP-relaxation.

Third and finally, a branch-and-bound search is initiated, with the branching being done on the vertices of the graph. In this phase, again primal heuristics and domain propagation are employed. However, SCIP-Jack aims to avoid the branch-and-bound search, and puts much effort into the root node. Indeed, fewer than five percent of the instances used in this article require branching.

We enhance several vital components of this branch-and-cut framework. The most natural application of reduction methods is within presolving. However, one can also use them within domain propagation, translating the deletion of edges into variable fixings in the integer programming model. In our implementation, the reduction methods are employed far less aggressively in domain propagation than in presolving, so the time spent in domain propagation is usually less than 10 percent of the time spent in presolving. The edge conflicts described in this article are used for generating clique cuts, which are well-known for general MIPs [2]. We note, however, that the impact of these cuts on the overall solution time is small; even for instances with many edge conflicts the obtained speed-up is usually only a few percent. Finally, primal heuristics are also improved. First, the stronger reduction methods enhance primal heuristics that involve the solution of auxiliary SPG instances, such as those arising from the combination of several Steiner trees. Second, the implication concept introduced in this article can be used to directly improve a classic SPG heuristic, as shown in the following.

5.1.1 Implications and the shortest path heuristic

The simple 2-approximation for SPG introduced by [43] has been widely used in the literature and is perhaps the best known primal heuristic for SPG. The algorithm starts with a tree S consisting of a single vertex and iteratively connects S by a shortest path to a terminal closest to S. As a simple postprocessing step, one can compute a minimum spanning tree on (V(S), E[S]) and iteratively remove non-terminal leaves. An efficient implementation is given in [3]. This section shows how to use the implication concept introduced in Sect. 2.2 to (empirically) improve the algorithm.

Let \(v_0 \in V\), and initially set \(S := \{v_0\}\). Define a distance array \({\tilde{d}}\) and a predecessor array pred by \({\tilde{d}}[u] := \infty \), \(pred[u] := null\) for all \(u \in V {\setminus } \{v_0\}\), and \({\tilde{d}}[v_0] := 0\), \(pred[v_0] := v_0\). Define for all \(v \in V {\setminus } T\):

$$\begin{aligned} {\tilde{p}}(v) := \max \left\{ 0, \sup \left\{ {\overline{b}}(e) - c(e) \mid e = \{v,w\} \in \delta (v), w \in T {\setminus } V(S) \right\} \right\} . \end{aligned}$$
(28)

For all \(v \in T\) set \({\tilde{p}}(v) := 0\). Essentially, (28) is a weaker version of the implied profit from Sect. 2.2. Finally, set \(Q := \{v_0\}\).

While \(Q \ne \emptyset \) let \(v := {{\,\mathrm{\hbox {arg\,min}}\,}}_{u \in Q} {\tilde{d}}[u] \). If \(v \in T\), add the path P from v to S, marked by the predecessor array, to S, add V(P) to Q, and set \({\tilde{d}}[u] := 0\) for all \(u \in V(P)\). Furthermore, update (28). For all \(\{v,w\} \in \delta (v)\) proceed as follows. If

$$\begin{aligned} {\tilde{d}}[v] + c(\{v,w\}) - \min \left\{ c(\{v,w\}), {\tilde{p}}(v), {\tilde{d}}[v] \right\} < {\tilde{d}}[w], \end{aligned}$$
(29)

then set \({\tilde{d}}[w]\) to the left hand side of (29), and add w to Q. Further, set \(pred[w] := v\).

Note that (29) provides a bias for paths computed by the heuristic to include vertices of implied profit. In this way, the distance associated with a path also reflects the cost needed to connect additional terminals later on. Note that the minimum spanning tree computed during postprocessing will always contain the edge associated with each vertex of positive implied profit contained in S. We use the value \(\min _{e' \in \delta (w) {\setminus } \{e\}} c(e')\) instead of \({\overline{b}}(e)\) for \(e = \{v, w\}, w \in T\) in (28) for two reasons: First, the value better represents the weight that can be saved when connecting w via v (because the bottleneck edge corresponding to \({\overline{b}}(e)\) might already be part of the tree computed by the heuristic so far). Second, this value is much faster to compute (and the primal heuristic is executed often as a subroutine within our implementation).
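The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions that are not taken from the SCIP-Jack implementation: the graph is given as a symmetric adjacency dictionary `adj[v][w] = c({v,w})` of a simple connected graph, the extracted vertex is removed from Q (realized here by a binary heap with lazy deletion), the loop stops once all terminals are connected, the value (28) is recomputed on demand instead of being updated incrementally, and the postprocessing (minimum spanning tree and leaf pruning) is omitted. The function names are hypothetical. In line with the remark above, the cheapest other edge at w is used in place of \({\overline{b}}(e)\).

```python
import heapq
from math import inf


def implied_profit(adj, terminals, in_tree, v):
    """Weakened implied profit of v as in (28); the cheapest other edge
    incident to the terminal w stands in for the implied bottleneck value."""
    if v in terminals:
        return 0.0
    best = 0.0
    for w, c_vw in adj[v].items():
        if w in terminals and w not in in_tree:
            costs = sorted(adj[w].values())
            second = costs[1] if len(costs) > 1 else inf
            # Minimum cost over the edges at w other than {v, w}.
            other = second if c_vw == costs[0] else costs[0]
            best = max(best, other - c_vw)
    return best


def biased_shortest_path_heuristic(adj, terminals, root):
    """Return (tree_edges, cost) of a Steiner tree grown from `root`.
    Vertex labels must be mutually comparable (heap tie-breaking)."""
    terminals = set(terminals)
    dist = {v: inf for v in adj}
    pred = {v: None for v in adj}
    dist[root], pred[root] = 0.0, root
    in_tree = {root}
    tree_edges = set()
    heap = [(0.0, root)]
    while heap and not terminals <= in_tree:
        d_v, v = heapq.heappop(heap)
        if d_v > dist[v]:
            continue  # outdated heap entry
        if v in terminals and v not in in_tree:
            # Add the predecessor path from v to the tree and reset distances.
            u = v
            while u not in in_tree:
                tree_edges.add(frozenset((u, pred[u])))
                in_tree.add(u)
                dist[u] = 0.0
                heapq.heappush(heap, (0.0, u))
                u = pred[u]
            continue  # v is scanned again with distance 0
        p_v = implied_profit(adj, terminals, in_tree, v)
        for w, c_vw in adj[v].items():
            # Left-hand side of (29).
            cand = dist[v] + c_vw - min(c_vw, p_v, dist[v])
            if cand < dist[w]:
                dist[w] = cand
                pred[w] = v
                heapq.heappush(heap, (cand, w))
    cost = sum(adj[a][b] for a, b in map(tuple, tree_edges))
    return tree_edges, cost
```

For example, on a graph with terminals 0, 3, 4 and non-terminals 1, 2, calling `biased_shortest_path_heuristic(adj, {0, 3, 4}, 0)` with `adj = {0: {1: 2, 2: 2}, 1: {0: 2, 3: 2, 4: 5}, 2: {0: 2, 3: 2, 4: 1}, 3: {1: 2, 2: 2}, 4: {1: 5, 2: 1}}` routes the path to terminal 3 through vertex 2, which has positive implied profit because of its cheap edge to terminal 4.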

Computational experiments on the benchmark instances from the next section have shown that the above modifications improve the solution quality of the shortest path heuristic in a surprisingly consistent manner: When run 100 times from different starting points after SPG presolving (as is the default in SCIP-Jack), the solution quality of the heuristic is improved for more than 85 % of the instances. We also note that the shortest path heuristic is used as a subroutine in several more involved heuristics applied by SCIP-Jack, see [14].

5.2 Computational results

This section provides computational results for the new solver. In particular, we compare its performance with the updated results of the solver by [31, 46] published in [37]. The computational experiments were performed on Intel Xeon CPUs E3-1245 with 3.40 GHz and 32 GB RAM. According to the DIMACS benchmark software [7], this computer is 1.59 times faster than the machine used in [37]. While the authors of the current article do not have access to the machine used in [37], preliminary experiments on different machines have shown that the DIMACS score is a good estimate for the performance of the new solver. Thus, we have scaled the run-times reported in the following accordingly, by multiplying the run-times of SCIP-Jack by 1.59. We use the same LP solver as [37]: CPLEX 12.6 [20]. All results were obtained single-threaded.

For the comparison with the solver by [31, 46], we are restricted to the instances used in [37]. Still, the experiments in [37] include a large number of test-sets (both the SteinLib and the 11th DIMACS Challenge collection). Thus, we only use test-sets with at least one instance that takes more than 10 s to be solved by [37] or our solver. There is one notable exception: We do not consider the test-sets I320 and I640 from the SteinLib, for the following reason. [37] use specialized, non-default settings for several test-sets, including I320 and I640, where they use only “(...) fast calculation of bounds (...)” during branch-and-bound. As we aim to give an unbiased picture of the performance of our solver, we only use our default settings for all instance sets. While we can achieve significant speed-ups on all test-sets when using specialized settings, the impact is by far strongest on the I instances: more than an order of magnitude for the harder instances. We note, however, that we can match the results from [37] on I320 and I640 if we use dual-ascent bounds during branch-and-bound, instead of LP-based ones.

An overview of the test-sets is given in Table 1. The second column gives the number of instances per test-set. The third and fourth columns give the range of nodes and edges per test-set. The fifth column states whether for all instances of the test-set optimal solutions are known.

Table 1 Details on SPG benchmark sets

5.2.1 Impact of implied profit reductions

In the following, the impact of the \(s_p\) based reduction methods on the preprocessing strength is reported. For the reduced cost based reductions we use the dual-ascent heuristic from [50]. We use seven benchmark sets from the literature: three from the DIMACS Challenge, three from the SteinLib, and one from [22]. Table 2 shows in the first column the name of the test-set, followed by its number of instances. The next columns show the average percentage of nodes and edges of the instances that remain after the preprocessing without (columns three and four), and with (columns five and six) the \(s_p\) based methods. The last two columns report the relative change in percent between the previous results.

It can be seen that the \(s_p\) methods allow for a significant additional reduction of the problem size. This behavior is rather remarkable, given the variety of other powerful reduction methods included in SCIP-Jack. Even if the percentage of remaining edges and nodes is already small on average for the base preprocessing (such as for VLSI), there are for each of the seven test-sets at least a few instances that are still of large size. These instances can often be significantly reduced by the \(s_p\) techniques. While no run times are reported in the table, we note that on each of the seven test-sets the overall run time of the preprocessing decreases (often significantly) when the \(s_p\) based methods are used. Furthermore, even for other test-sets where the \(s_p\) methods are less (or not at all) successful, the run time of the preprocessing does not increase by more than 10 percent.

Table 2 Average remaining nodes and edges after preprocessing
Table 3 Computational comparison of the solver developed for this article (S.-J.) and the solver described in [31, 46] (P.&V.)

5.2.2 Comparison with the state of the art

Next, we compare the solver by Polzin and Vahdati Daneshmand [31, 46] and the new solver SCIP-Jack with respect to the mean time, the maximum time, and the number of solved instances. For the mean time we use the shifted geometric mean with a shift of 10. We note that the use of an arithmetic mean would bias strongly in favor of SCIP-Jack, which is especially faster on harder instances.
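For reference, the shifted geometric mean of run-times \(t_1, \ldots , t_n\) with shift s is commonly defined as \(\left( \prod _{i=1}^{n} (t_i + s)\right) ^{1/n} - s\). The following minimal sketch assumes this standard definition; the function name is hypothetical and not part of SCIP-Jack.

```python
import math


def shifted_geometric_mean(times, shift=10.0):
    """Return (prod_i (t_i + shift))**(1/n) - shift, computed via logarithms
    to avoid overflow for long lists of run-times."""
    if not times:
        raise ValueError("need at least one run-time")
    log_sum = sum(math.log(t + shift) for t in times)
    return math.exp(log_sum / len(times)) - shift
```

With a shift of 10, the run-times 0 s and 30 s yield a mean of \(\sqrt{10 \cdot 40} - 10 = 10\) s, whereas the arithmetic mean is 15 s. The shift dampens the influence of very small run-times, and the geometric averaging dampens that of very large ones.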

Table 3 provides the results for a time-limit of 24 h (divided by 1.59 in the case of SCIP-Jack), which is the same time-limit as used in the updated report [37]. The second column shows the number of instances in the test-set. Column three gives the number of instances solved by [37], column four the number of instances solved by SCIP-Jack. Column five shows the mean time taken by [37], column six shows the mean time of SCIP-Jack. The next column gives the relative speedup of SCIP-Jack. The last three columns provide the same information for the maximum run-time.
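As a worked check of this scaling (arithmetic only, using the factor 1.59 obtained from the DIMACS benchmark software), the 24 h limit corresponds to the following unscaled wall-clock limit for SCIP-Jack:

$$\begin{aligned} \frac{86{,}400~\text {s}}{1.59} \approx 54{,}340~\text {s} \approx 15.1~\text {h}. \end{aligned}$$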

It can be seen that SCIP-Jack consistently outperforms [37]—both with respect to mean and maximum time. Also, SCIP-Jack solves on each test-set at least as many instances as [37]. The only test-set where [37] prevail is VLSI. On this test-set the results of the extended reductions reported in [31] are also stronger, which might be attributed to the use of full-backtracking, which has not yet been implemented in SCIP-Jack.

On the other test-sets, the difference in the run-time is especially apparent for the maximum run time. This behavior can be explained by the fact that most test-sets contain many instances that can be solved very fast by both solvers—which brings the mean times closer together. Prominent examples are the SP and Copenhag14 test-sets, for which all instances can be solved by SCIP-Jack within roughly 1 h, whereas [37] leave several instances unsolved even after 24 h.

As already mentioned, most test-sets in Table 3 contain a large number of instances that can be solved by both [37] and our solver in well below 1 s. To mitigate the impact of very easy instances on the average times, we group the instances according to their hardness in the following experiment. We use instance groups \([10^k, 86{,}400]\) for \(k=-\infty , 0,1,2,3\). Any group \([10^k, 86{,}400]\) contains each instance from Table 3 such that [37] or SCIP-Jack solves this instance in not less than \(10^k\), and at most 86,400 s. If an instance can be solved by only one solver within the time-limit, we consider the run-time of the other solver on this instance as 86,400 s. Such groupings are commonly used in computational mathematical optimization (also with the time lower bounds being powers of 10), see e.g. [28, 49]. In addition to the shifted geometric mean, Table 4 also provides the arithmetic mean of the run-time for each group. As before, we give the results for both [37] and SCIP-Jack, and report the respective speed-up of SCIP-Jack.
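The grouping can be sketched as follows. This is an illustration under one particular reading of the definition above: an instance belongs to the group for k if at least one of the two solvers solves it within the time-limit and needs at least \(10^k\) s to do so. The data layout (dictionaries mapping instance names to run-times in seconds, with `None` for unsolved instances) and the function name are assumptions made for this sketch.

```python
import math

TIME_LIMIT = 86_400.0  # 24 h in seconds


def hardness_groups(times_a, times_b, exponents=(-math.inf, 0, 1, 2, 3)):
    """Map each exponent k to {instance: (time_a, time_b)} for the group
    [10^k, TIME_LIMIT]; unsolved runs are counted with TIME_LIMIT."""
    groups = {}
    for k in exponents:
        lower = 0.0 if k == -math.inf else 10.0 ** k
        members = {}
        for name in times_a:
            pair = (times_a[name], times_b[name])
            solved = [t for t in pair if t is not None and t <= TIME_LIMIT]
            # Keep the instance if some solver finishes in [lower, TIME_LIMIT].
            if any(t >= lower for t in solved):
                members[name] = tuple(
                    TIME_LIMIT if (t is None or t > TIME_LIMIT) else t
                    for t in pair
                )
        groups[k] = members
    return groups
```

Instances that neither solver finishes within the time-limit do not appear in any group, and each group is a subset of the preceding one.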

Table 4 Computational comparison of the solver developed for this article (S.-J.) and the solver described in [31, 46] (P.&V.), with instance groups ordered by hardness

Unsurprisingly, the ratio of the arithmetic mean stays largely unchanged with increasing hardness of the groups. SCIP-Jack is more than a factor of 5 faster than the solver from [37] on all groups. On the other hand, the performance difference with respect to the shifted geometric mean significantly increases with the hardness of the instances. For instances that take more than 1000 s to be solved by [37] or SCIP-Jack, the latter is faster by a factor of more than 7.

5.2.3 Further results

Finally, we provide results for several large-scale Euclidean Steiner tree problems. For solving such problems, the bottleneck is usually the full Steiner tree concatenation [22]. However, this concatenation can also be solved as an SPG [35]. In Table 5 we give results for Euclidean instances from [22] with 25 thousand (EST-25k), 50 thousand (EST-50k), and 100 thousand (EST-100k) points in the plane. For EST-25k the mean and maximum times are between one and two orders of magnitude faster than those of the well-known geometric Steiner tree solver GeoSteiner 5.1 [22]. Moreover, 7 of the 15 instances from EST-50k are solved for the first time to optimality, in at most 197 s. On the other hand, GeoSteiner cannot solve these instances even after seven days of computation. For EST-100k, GeoSteiner even leaves 12 of the 15 instances unsolved after one week of computation. In contrast, we solve all these instances to optimality in less than 13 minutes. Overall, we solve 19 instances for the first time to optimality.

Unfortunately, [37] does not report results for these instances. However, the solver by [30], which won the heuristic SPG category at the 11th DIMACS Challenge, does not reach the upper bounds from GeoSteiner on any of the EST-25k, EST-50k, and EST-100k instances.

Table 5 Results of SCIP-Jack for Euclidean Steiner tree instances

6 Conclusion and outlook

This article has described the combination of implication, conflict, and reduction concepts for the SPG, with the aim of improving the state of the art in exact SPG solution. This combination has spawned several new techniques that (provably) dominate well-known results from the literature, such as the bottleneck Steiner distance. The integration of the new methods into the branch-and-cut solver SCIP-Jack has shown a large impact on exact SPG solution. The new SCIP-Jack could even outperform the long-reigning state-of-the-art solver by [31, 46].

Still, there are several promising routes for further improvement. First, one could improve the newly introduced methods, for example by using full-backtracking in the extended reduction methods, by improving the approximation of the implied bottleneck Steiner distance, or by adapting the latter for replacement techniques. Second, several powerful methods described in [31, 46] could be added to the new solver, e.g. a stronger IP formulation realized via price-and-cut, or additional reduction techniques via partitioning.

Unlike the solver by [31, 46], the new SCIP-Jack will be made freely available for academic use—as part of the SCIP Optimization Suite 8.