1 Introduction

In this paper, a new strategy for the creation of anti-loop formulations in optimization problems associated with graphs, based on binary labelling of nodes, is presented. In the new formulations presented in this work, the formation of cycles with an odd number of nodes is not allowed. This strategy can lead to an improvement in the iterative methods that allow the formation of cycles, in addition to allowing the existing anti-loop strategies to be reformulated.

In arborescence formulations based on level or sequence relationships, avoiding the formation of cycles is one of the constraints needed to obtain a tree from a graph. The same holds when seeking a circuit that visits all the nodes of the graph (Hamiltonian circuit), as in the traveling salesman problem. In the case of trees, the solution must be a connected subgraph in which each node connects to a higher-level node or parent node, except the node defined as the root node of the tree. In the traveling salesman case, the solution must be a connected subgraph with as many edges as nodes, in which each node is connected to two nodes (one preceding node and one succeeding node in the sequence).

The most relevant optimization problems where it is necessary to obtain a tree of a graph are the MST problem, the VRP problem and the Steiner problem, and all those derived from them.

In general, the outstanding formulations that have been used on graphs to create trees or for the traveling salesman problem have been the following:

  • Obtain all the cycles [1]: Compute all the cycles and prevent all the nodes of any cycle from being part of the tree (DFJ).

  • Cuts: Derived from the previous one, it requires at least one edge between each pair of complementary node sets.

  • Flow sending: Flow sent from a source node to the rest of the nodes of the graph imposing the flow conservation restrictions. Different versions: Single commodity flow [2], two-commodity flow [3] and multiple-commodity flow [4].

  • Based on decisions of precedence. Proposed by Sarin et al. [5] for the ATSP problem. Formulation also applied in scheduling problems.

  • Time Staged Formulations: Arcs are added to the tree or path at each stage or period [6].

  • Setting depth levels at nodes or sequential formulation. Proposed by Miller et al. [7] (MTZ from now on), and later lifted by Desrochers and Laporte [8] (DL from now on). Other versions such as Sherali and Driscoll [9] and Sawik [10] also stand out.

In addition, there are some versions that consider mixed variants of the previous ones, as appears in Sherali et al. [11] and Gouveia and Pires [12], where the identification of some cycles is incorporated into sequence formulations. There are also formulations like Martin [13], in this case for obtaining trees, which uses decisions about which part of the path each node is in with respect to each selected arc. It is inefficient because it makes use of \(O(n^{3})\) variables and constraints.

The Dantzig and Cuts strategies turn out to be formulations with little practical scope. The exponential growth in the number of constraints, that is, in the number of cycles, makes these models difficult to analyse experimentally.

The strategies that have received the most practical attention are the flow formulation and the sequential formulation, the latter being more efficient when solving larger problems. In both, each edge (i,j) ∈ E becomes two directed arcs, (i,j) and (j,i), which express a direction in the case of flow formulations and a successor/predecessor relationship in the case of sequential formulations.

Formulation MTZ is based on introducing as a variable the position of each node in the sequence (or depth level of each node in the tree):

\(\forall i:u_{i} =\) Position in the sequence (or depth level in the tree) of node i

$$ 1 \le u_{i} \le n - 1 $$

Position 0 is assumed for node 1 (root node). For the remaining nodes, a constraint imposes that if the arc (i,j) is selected, the position of j in the sequence is greater than the position of i by at least one unit. We define the boolean variable \(x_{ij}\) to indicate the selection of arc (i,j) ∈ A.

Mathematically this is expressed by modelling the following logical proposition:

$$ \forall i,j = 2...n,i \ne j:IF\;x_{ij} = 1\;THEN\;u_{j} \ge u_{i} + 1 $$

Modelling this proposition results in:

$$ \forall i,j = 2...n,i \ne j:u_{i} - u_{j} + (n - 1)x_{ij} \le n - 2 $$
(1)
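As a small illustration of how constraint (1) excludes subtours that do not contain node 1, the following Python sketch enumerates integer positions by brute force on a tiny instance. The function name `mtz_feasible` and the arc-list input are assumptions made for this example only, and the enumeration is practical only for very small n.

```python
from itertools import product

def mtz_feasible(n, arcs):
    """Return True if positions u_2..u_n in [1, n-1] satisfying (1) exist.

    `arcs` lists the selected arcs (i, j), i.e. those with x_ij = 1.
    Arcs touching node 1 are ignored because (1) is stated for i, j = 2..n.
    """
    nodes = list(range(2, n + 1))
    x = {(i, j): 0 for i in nodes for j in nodes if i != j}
    for i, j in arcs:
        if i != 1 and j != 1:
            x[(i, j)] = 1
    # Brute-force search over all integer position vectors (illustrative only).
    for values in product(range(1, n), repeat=n - 1):
        u = dict(zip(nodes, values))
        if all(u[i] - u[j] + (n - 1) * x[(i, j)] <= n - 2 for (i, j) in x):
            return True
    return False
```

For n = 5, the tour 1→2→3→4→5→1 admits a position vector, whereas the pair of subtours 1→2→1 and 3→4→5→3 does not, since the second cycle would require \(u_{3} < u_{4} < u_{5} < u_{3}\).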

The DL modification is based on modelling the following logical statement: \(\forall i,j = 2...n,i \ne j:IF\;x_{ij} + x_{ji} = 1\;THEN\;u_{i} + x_{ij} = u_{j} + x_{ji}\),

which is developed in García [14] and generates the constraints:

$$ \forall i,j = 2...n,i \ne j:\;u_{i} - u_{j} + (n - 1)x_{ij} + (n - 3)x_{ji} \le (n - 2) $$
(2)

DL improves the results of the MTZ formulation because the DL formulation integrates into the anti-loop constraint that \(x_{ij} + x_{ji}\) is a binary expression.

On the other hand, to solve graph problems to integer optimality, strategies have also been used that solve the previous models without the anti-loop constraints. These are iterative processes: each solution obtained is checked for subtours, the specific subtour elimination constraints that forbid those cycles are added to the model, and the model is solved again in the next iteration. Solving the models in this way is quite fast. For some applications of these strategies see Aguayo et al. [15], Miliotis [16], Pferschy and Stanek [17], Crowder et al. [18] or Bosch [19]. With the technique introduced in this work, these iterative methods will always avoid the formation of odd subtours, without the need to introduce new integer variables into the model, as will be explained in Sect. 2. These methods are not pursued in this work; instead, we focus on analysing the variations in the MTZ and DL strategies that result from introducing our odd anti-loop constraints.
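The subtour check performed at each iteration of such methods can be sketched as follows. This is a minimal illustration, assuming the solution is available as a successor map (node i maps to j when \(x_{ij} = 1\)) that satisfies the assignment constraints; the names `find_subtours` and `odd_subtours` are hypothetical and are not taken from the cited works.

```python
def find_subtours(successor):
    """Split a successor map {i: j} (meaning x_ij = 1) into its cycles.

    Assumes every node has exactly one successor and one predecessor,
    so the map is a permutation of the node set.
    """
    unvisited = set(successor)
    cycles = []
    while unvisited:
        node = min(unvisited)          # deterministic starting node
        cycle = []
        while node in unvisited:
            unvisited.remove(node)
            cycle.append(node)
            node = successor[node]
        cycles.append(cycle)
    return cycles

def odd_subtours(successor, root=1):
    """Cycles with an odd number of nodes that do not contain the root."""
    return [c for c in find_subtours(successor)
            if len(c) % 2 == 1 and root not in c]
```

In the iterative scheme described above, only the cycles returned by `find_subtours` that are not in `odd_subtours` would still require explicit elimination constraints once the odd anti-loop constraints are part of the model.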

2 Binary labelling of nodes

Binary labelling of nodes consists of incorporating a boolean variable \(\delta_{i}\) for each node i = 2…n, and imposing constraints (3) and (4), whose logical meaning is:

“IF \(x_{ij}\) = 1 THEN \(\delta_{i}\) + \(\delta_{j}\) = 1”

$$ \forall (i,j):\delta_{i} + \delta_{j} \le 2 - x_{ij} $$
(3)
$$ \forall (i,j):\delta_{i} + \delta_{j} \ge x_{ij} $$
(4)

Therefore, for each arc (i,j) ∈ A:

\(x_{ij} = 1 \Rightarrow \delta_{i} + \delta_{j} = 1\). The labels of the two nodes connected by the arc must be different.

\(x_{ij} = 0 \Rightarrow 0 \le \delta_{i} + \delta_{j} \le 2\). No constraint is imposed on \(\delta_{i}\) and \(\delta_{j}\).

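The consequence of this labelling is that the model cannot allow solutions with odd loops, as shown in Fig. 1: the labels must alternate along the arcs of a loop, which is impossible when the loop has an odd number of nodes.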

Fig. 1 Nodes loop
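The effect of the labelling on loops can be verified by exhaustive search on small cycles. The sketch below is an illustration only; the function name `cycle_admits_labelling` is an assumption of this example. It tests whether a directed cycle on k nodes admits binary labels satisfying (3) and (4) on all of its arcs.

```python
from itertools import product

def cycle_admits_labelling(k):
    """Check whether a directed cycle on k nodes has labels satisfying (3)-(4)."""
    arcs = [(i, (i + 1) % k) for i in range(k)]
    for delta in product((0, 1), repeat=k):
        # With x_ij = 1, constraints (3) and (4) reduce to delta_i + delta_j = 1.
        if all(delta[i] + delta[j] == 1 for i, j in arcs):
            return True
    return False
```

Under these assumptions, a feasible labelling exists exactly for the even values of k.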

The labelling of the root node or initial node (node 1) can be assigned freely. This has the consequence that if the constraints of the problem ensure a connected graph, we can consider the labelling variables as continuous, although in that case they would not form part of the decision process in the branch&bound resolution. Both configurations will be analysed experimentally in the computational results section.
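The reason the labels can be treated as continuous is that, once one label is fixed, the equality \(\delta_{i} + \delta_{j} = 1\) on every selected edge determines all remaining labels of a connected solution. The following sketch propagates the labels from the root; the function name `propagate_labels` is hypothetical and, as a simplifying assumption of this example, the equality is applied on every selected edge, including those incident to the root.

```python
from collections import deque

def propagate_labels(n, selected_edges, root=1, root_label=1):
    """Propagate delta_j = 1 - delta_i along selected edges, starting at the root.

    Returns the label of every node reachable from the root, or None if two
    propagation paths disagree (which happens when an odd cycle is selected).
    """
    adjacency = {i: [] for i in range(1, n + 1)}
    for i, j in selected_edges:
        adjacency[i].append(j)
        adjacency[j].append(i)
    labels = {root: root_label}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j in adjacency[i]:
            forced = 1 - labels[i]      # value imposed by delta_i + delta_j = 1
            if j not in labels:
                labels[j] = forced
                queue.append(j)
            elif labels[j] != forced:
                return None
    return labels
```

Every propagated value is 0 or 1 even though no integrality is imposed, which is the behaviour exploited by the relaxed variant. Nodes not connected to the root are left unlabelled in this sketch.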

Previous experiments indicate that it is more efficient to group the constraints associated with \(x_{ij}\) and \(x_{ji}\) in a single formulation:

\(\forall (i,j) \in E:\) IF \(x_{ij} + x_{ji} = 1\) THEN \(\delta_{i} + \delta_{j} = 1\).

Generating:

$$ \delta_{i} + \delta_{j} \le 2 - x_{ij} - x_{ji} $$
(5)
$$ \delta_{i} + \delta_{j} \ge x_{ij} + x_{ji} $$
(6)

For the TSP with an odd number of nodes, the Hamiltonian circuit is itself an odd cycle. Since the labelling constraints do not apply to node 1, the model still allows the solution that forms the Hamiltonian circuit.

3 Application to classical formulations

To apply binary labelling, we modify the TSP formulation proposed by Dantzig, Fulkerson, and Johnson, the MTZ formulation of the TSP and, for tree generation, an arborescence formulation of the Steiner problem.

3.1 TSP-Dantzig, Fulkerson, and Johnson

With binary labelling, the classic DFJ formulation only needs to consider even cycles, since odd cycles cannot occur.

For a directed graph G(N,A) obtained from the undirected graph G(N,E), and given the set of decision variables

\(x_{ij} = 1\) if the directed arc (i,j) is selected; 0 otherwise, for (i,j) ∈ A,

The TSP would be as follows:

$$ Min\;\sum\limits_{(i,j) \in A} {c_{ij} x_{ij} } $$
(7)
$$ s.t.\;\forall j \in N:\sum\limits_{i/(i,j) \in A} {x_{ij} } = 1 $$
(8)
$$ \forall i \in N:\sum\limits_{j/(i,j) \in A} {x_{ij} } = 1 $$
(9)
$$ \forall S/S \subset N,\left| S \right| \ge 2,\left| S \right|even:\sum\limits_{i,j \in S\& (i,j) \in A} {x_{ij} } \le \left| S \right| - 1 $$
(10)
$$ \forall (i,j) \in E,i \ne 1:\delta_{i} + \delta_{j} \le 2 - x_{ij} - x_{ji} $$
(11)
$$ \forall (i,j) \in E,i \ne 1:\delta_{i} + \delta_{j} \ge x_{ij} + x_{ji} $$
(12)
$$ \delta_{1} = 1 $$
(13)
$$ \forall (i,j) \in A:x_{ij} \in \left\{ {0,1} \right\} $$
$$ \forall i \in N,i > 1:\delta_{i} \le 1 $$

Constraint (10) forbids partial even cycles. Constraints (11) and (12) express the binary labelling. Constraint (13) fixes the label of the first node, which allows the labelling variables to be relaxed; that is, binary variables are not needed to prevent odd cycles.
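To give an idea of how many subsets remain in (10) when only even cardinalities are kept, the following sketch counts them for a given n. It assumes that S ranges over proper subsets with \(2 \le |S| \le n - 1\); the function name `dfj_constraint_counts` is introduced here for illustration only.

```python
from math import comb

def dfj_constraint_counts(n):
    """Count node subsets S with 2 <= |S| <= n - 1, overall and of even size."""
    sizes = range(2, n)
    all_subsets = sum(comb(n, k) for k in sizes)
    even_subsets = sum(comb(n, k) for k in sizes if k % 2 == 0)
    return all_subsets, even_subsets
```

For n = 10 this gives 1012 subsets in total, of which 510 have even cardinality, so roughly half of the constraints remain and their number still grows exponentially with n.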

3.2 TSP-Miller, Tucker, and Zemlin

When we introduce binary labelling in the MTZ formulation, the number of positions can be halved. We transform the MTZ proposition into the following:

$$ \forall i,j = 2...n,i \ne j:IF\;x_{ij} = 1\;THEN\;u_{j} \ge u_{i} + \delta_{j} $$
(14)
$$ \begin{gathered} \forall i = 2...n:\;u_{i} \ge 0 \hfill \\ \forall i = 2...n:\;u_{i} \le \left\lceil \frac{n}{2} \right\rceil - 1 \hfill \\ \end{gathered} $$

Hence, the positions are sequenced as shown in Fig. 2.

Fig. 2 Sequence of positions with binary labelling

Therefore, the upper bound of the variables \(u_{i}\) becomes \(\left\lceil \frac{n}{2} \right\rceil - 1\), and (14) is modelled as:

$$ \forall i,j \in N,i > 1,j > 1,i \ne j:u_{i} - u_{j} + (\left\lceil \frac{n}{2} \right\rceil - 1)x_{ij} \le \left\lceil \frac{n}{2} \right\rceil - 1 - \delta_{j} $$

The complete resulting model would be:

$$ \begin{array}{*{20}l} {\begin{array}{*{20}c} {Min} & {\sum\limits_{(i,j) \in A} {c_{ij} x_{ij} } } \\ \end{array} } \hfill \\ {s.t.\;\forall j \in N:\sum\limits_{i/(i,j) \in A} {x_{ij} } = 1} \hfill \\ {\forall i \in N:\sum\limits_{j/(i,j) \in A} {x_{ij} } = 1} \hfill \\ {\forall i,j \in N,i > 1,j > 1,i \ne j:u_{i} - u_{j} + (\left\lceil \frac{n}{2} \right\rceil - 1)x_{ij} \le \left\lceil \frac{n}{2} \right\rceil - 1 - \delta_{j} } \hfill \\ \end{array} $$
(15)
$$ \begin{gathered} \forall i,j \in N,i > 1,j > 1,i \ne j:\delta_{i} + \delta_{j} \le 2 - x_{ij} - x_{ji} \hfill \\ \forall i,j \in N,i > 1,j > 1,i \ne j:\delta_{i} + \delta_{j} \ge x_{ij} + x_{ji} \hfill \\ \end{gathered} $$
(16)
$$ \delta_{1} = 1 $$
(17)
$$ \forall i \in N,i > 1:0 \le u_{i} \le \left\lceil \frac{n}{2} \right\rceil - 1 $$
(18)
$$ \begin{gathered} \forall \;i,j \in N,i \ne j:x_{ij} \in \left\{ {0,1} \right\} \hfill \\ \forall \;i \in N,i > 1:\delta_{i} \le 1 \hfill \\ \end{gathered} $$
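As a sanity check of the halved position range, the sketch below builds one labelling and one position vector for a given tour and verifies them against the sequencing constraint in (15), the labelling constraints (16) and the bounds (18). The particular assignment (labels alternating after the root, positions increasing every second node) and the function names are assumptions of this example; other feasible assignments may exist.

```python
def labelled_positions(tour):
    """Assign labels and positions along a tour that starts at the root (node 1)."""
    delta = {tour[0]: 1}                     # (17): the root carries label 1
    u = {}
    for k, node in enumerate(tour[1:], start=1):
        delta[node] = 0 if k % 2 == 1 else 1  # labels alternate after the root
        u[node] = k // 2                      # position grows when the label is 1
    return delta, u

def satisfies_labelled_mtz(tour):
    """Check the sequencing constraint in (15), (16) and (18) for the tour."""
    n = len(tour)
    bound = (n + 1) // 2 - 1                  # ceil(n / 2) - 1
    delta, u = labelled_positions(tour)
    x = {(tour[k], tour[(k + 1) % n]) for k in range(n)}
    inner = tour[1:]
    for i in inner:
        if not 0 <= u[i] <= bound:
            return False
        for j in inner:
            if i == j:
                continue
            xij = 1 if (i, j) in x else 0
            xji = 1 if (j, i) in x else 0
            if u[i] - u[j] + bound * xij > bound - delta[j]:
                return False
            if not (xij + xji <= delta[i] + delta[j] <= 2 - xij - xji):
                return False
    return True
```

With this assignment the last node of the tour receives position \(\left\lceil \frac{n}{2} \right\rceil - 1\), so the reduced upper bound is attained but not exceeded.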

3.3 Steiner problem in graphs

In an equivalent way, we are going to analyse a tree formulation of Steiner's problem, collected in Khoury, Pardalos, and Du [20]. The model includes the MTZ constraints.

Given a directed graph G(N,A), the following data are defined:

\(\forall j \in N:T_{j} = 1\) if j is a terminal node; 0 if j is a Steiner node.

\(\forall (i,j) \in A:c_{ij} =\) Cost of selecting arc (i,j).

And the following variables:

\(\forall (i,j) \in A:x_{ij} = 1\) if arc (i,j) is selected; 0 otherwise.

The common formulation without introducing anti-loop constraints is as follows:

$$ Min\;\sum\limits_{(i,j) \in A} {c_{ij} x_{ij} } $$
$$ s.t.\;\forall i/T_{i} = 1\& i \ne R:\;\sum\limits_{j/(i,j) \in A} {x_{ij} } = 1 $$
(19)
$$ \forall i/T_{i} = 0,j/\left( {j,i} \right) \in A:x_{ji} \le \sum\limits_{k/(i,k) \in A}^{{}} {x_{ik} } $$
(20)
  • (19): Every terminal node, except the root node R, has a parent node.

  • (20): Connectivity of the Steiner nodes used. If a Steiner node acts as the parent of a node, then it must itself be connected to another node.
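The two constraints can be checked on a candidate arc selection with a few lines of code. The sketch below is illustrative only: the function name `satisfies_19_20` is hypothetical, and a selected arc (i, j) is read as "j is the parent of i", which is the interpretation suggested by the description of (19).

```python
def satisfies_19_20(selected, terminals, root):
    """Check constraints (19) and (20) for a set of selected arcs (i, j)."""
    out_degree = {}
    for i, _ in selected:
        out_degree[i] = out_degree.get(i, 0) + 1
    # (19): every terminal other than the root selects exactly one outgoing arc.
    for i in terminals:
        if i != root and out_degree.get(i, 0) != 1:
            return False
    # (20): a Steiner node i receiving an arc (j, i) must select an outgoing arc.
    for j, i in selected:
        if i not in terminals and out_degree.get(i, 0) < 1:
            return False
    return True
```

With terminals {1, 2, 3}, root 1 and Steiner node 4, the selection {(2,4), (3,4), (4,1)} passes the check, and so does the selection {(2,3), (3,2)}, which is a subtour disconnected from the root. This is why anti-loop constraints must be added to (19) and (20).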

This formulation only lacks constraints that prohibit the formation of subtours. The original model incorporates the MTZ constraints. In our case, we incorporate the same constraints used for the TSP: (15), (16), (17) and (18).

4 Computational results

To analyse the formulations, we ran the following models on a battery of ATSP instances collected in TSLIB, as well as on a set of Steiner problems available in the same library:

  • MTZ: Incorporates anti-loop constraints (1)

  • MTZBL: Binary labelling. Incorporates constraints (15) to (18).

  • MTZRBL: Relaxed binary labelling. Incorporates constraints (15) to (18), relaxing the variables \(\delta_{i}\).

Experiments were performed on an Intel(R) Core(TM) i7-10700K CPU @ 3.80 GHz with 16 GB of RAM. The optimization library used was CPLEX v22.1.0, widely used in the computational analysis of mathematical models [21, 22].

4.1 Results for ATSP

Table 1 shows a comparison of the solutions of the relaxed linear problem of each model. It is shown that MTZ presents the worst values. MTZBL presents an average lower bound improvement of 5.9%.

Table 1 Comparison of the LP relaxation bound for the ATSP

Table 2 presents the results of running the models with a time limit of 300 s. CPU time is the time until the branch&bound resolution completes (LB = UB); when it does not complete within 300 s, the best solution found (ZIP) is reported. The MTZBL strategy takes slightly longer than the MTZ strategy. In general, all formulations have increasing difficulty (longer computational time) as the problem size grows. This is more pronounced in the binary labelling strategies, since they enlarge the problem with respect to the MTZ strategy by n variables and 2n constraints, where n is the number of nodes.

Table 2 CPU time for branch & bound (aborted after 300 s run) for the ATSP

Each of the three strategies fails to complete the branch&bound on 2 of the 14 problems. MTZRBL was the only strategy that found all the optimal solutions.

Regarding the comparison between MTZBL and MTZRBL, the results are very similar: MTZBL was better on 7 test cases and MTZRBL on 6.

4.2 Results for the Steiner problem

In the case of the Steiner problem, the LP relaxation follows a trend similar to that of the ATSP. MTZ gets a slightly lower value. The results obtained are presented in Tables 3 and 4. Table 3 shows the SteinC problems (problems with 500 nodes), whereas Table 4 shows the results for the SteinD problems (1000 nodes). Each problem is run with a time limit of 300 s.

Table 3 CPU time for branch & bound (until 300 s run) for the Steiner problem (SteinC problems)
Table 4 CPU time for branch & bound (until 300 s run) for the Steiner problem (SteinD problems)

When the 300 s limit is reached without finishing the branch&bound, ZIP shows the best solution found.

In the case of the Steiner problem, the results of the continuous labelling are always better than those of the integer label. MTZRBL was also better than the MTZ strategy on seven SteinC problems and five SteinD problems. In general, the binary labelling strategy shows better behaviour when the percentage of terminal nodes is low.

5 Conclusions

We have proposed in this paper a new strategy to formulate anti-loop constraints that avoids odd loops. The experiments show that binary labelling performs best in its relaxed form, which preserves the integrality of the solution. Although the MTZ strategy achieves better overall results, the relaxed binary labelling converges well and improves on the MTZ results in some problems. For the Steiner problem, the results are better than those of the MTZ formulation in 12 problems out of 35.

On the other hand, this strategy can provide new ideas for the design of algorithms and models for graph problems when iterative strategies based on generating subtour elimination constraints are proposed.