A greedy heuristic for the capacitated minimum spanning tree problem

Kritikos, M.; Ioannou, G.

doi:10.1057/s41274-016-0146-7

A greedy heuristic for the capacitated minimum spanning tree problem

Open access
Published: 21 December 2016

Volume 68, pages 1223–1235, (2017)
Cite this article

Download PDF

You have full access to this open access article

Journal of the Operational Research Society

A greedy heuristic for the capacitated minimum spanning tree problem

Download PDF

M. Kritikos¹ &
G. Ioannou¹

3121 Accesses
8 Citations
1 Altmetric
Explore all metrics

Abstract

This paper develops a greedy heuristic for the capacitated minimum spanning tree problem (CMSTP), based on the two widely known methods of Prim and of Esau–Williams. The proposed algorithm intertwines two-stages: an enhanced combination of the Prim and Esau–Williams approaches via augmented and synthetic node selection criteria, and an increase of the feasible solution space by perturbing the input data using the law of cosines. Computational tests on benchmark problems show that the new heuristic provides extremely good performance results for the CMSTP, justifying its effectiveness and robustness. Furthermore, excluding the feasible space expansion, we show that we can still obtain good quality solutions in very short computational times.

Exact algorithms for finding constrained minimum spanning trees

Article 05 May 2020

A Heuristic for the Degree-Constrained Minimum Spanning Tree Problem

Novel Heuristics for the Euclidean Leaf-Constrained Minimum Spanning Tree Problem

Article 30 March 2020

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

The Capacitated Minimum Spanning Tree problem (CMSTP) can be defined as the design of a minimum cost tree which spans over all vertices of an undirected graph G, so that the sum of demands of every main subtree does not exceed a given capacity Q. The CMSTP plays an important role in the design of backbone telecommunications networks, as well as in distribution, transportation, and logistics. Gavish (1991), formulated telecommunication network design problems as CMSTPs. In addition a CMSTP solution provides a lower bound on the capacitated vehicle routing problem (CVRP) defined on G (Toth and Vigo, 1995).

The CMSTP can be divided into two categories based on whether the weights of vertices are identical or not. The first is the homogeneous CMSTP where all vertices have the same weight. The second is the heterogeneous CMSTP where different vertices have different weights. When the weights of all vertices are equal to unity the problem reduces to finding a minimum cost rooted spanning tree in which each subtree contains at most Q (the capacity) vertices; this unit demand case is usually referred to as the CMSTP in the literature (Oncan and Altinel, 2009). The general homogeneous problem can be transformed into a unity problem by dividing the weights of the vertices and the capacity Q by the common vertex weight. In our work, we propose an algorithm for the CMSTP when the weights of all vertices are equal to unity.

The CMSTP is a difficult combinatorial optimization problem. It has been shown to be NP-hard even in the case of unit demand (Papadimitriou, 1978); thus, the solution of the CMSTP with exact methods is very time consuming and even impossible even for moderate size instances (Ruiz et al, 2015), and as a result heuristics are widely used in practice. Due to the importance of the problem, there is a vast literature that addresses modelling and solutions aspects of the CMSTP. Several mathematical formulations and exact algorithms have been proposed for the CMSTP. Exact algorithms are based on branch and bound and dynamic programming methods. The existing exact algorithms only solve small scale CMSTPs or find lower bounds on the optimal solution—see e.g., Chandy and Russell (1972), Chandy and Lo (1973), Gavish (1982), Gouveia and Paixao (1991), Malik and Yu (1993), Hall (1996), Han et al (2002), Gouveia and Martins (2005), and Uchoa et al (2008).

Early heuristics based on the greedy paradigm are those of Esau and Williams (1966)—the most widely known and the one used as a benchmark in computational tests—and the unified algorithm of Kershenbaum and Chou (1974). More sophisticated heuristics have been developed by Amberg et al (1996), Sharaiha et al (1997), Patterson et al (1999), Ahuja et al (2001, 2003) and Souza et al (2003), that employ techniques such as local search. The problem with these approaches is that in each iteration the new solution does not always improve the objective function, thus they are quite slow in converging to high quality trees. More recent metaheuristics include: (a) the hybrid ant colony algorithm of Raimann and Laumanns (2006) that solves the CVRP and applies an implementation of Prim’s algorithm to obtain a feasible CMST solution; (b) the work of Martins (2007) that proposes an enhanced version of the second order (SO) algorithm originally described by Karnaugh (1976), one of the first metaheuristics applied for the CMSTP; (c) the tabu search heuristic of Rego et al (2010) introducing dual and primal–dual RAMP algorithms, (d) the filter and fan algorithm by Rego and Mathew (2011); and (e) the biased random—key genetic algorithm (BRKGA) of Ruiz et al (2015).

The enhancements of construction algorithms, play a great role in producing very good solutions to complex classical combinatorial problems such as the VRP and the CMSTP. Altinel and Oncan (2005) reported that to improve the accuracy of the Clarke and Wright (1964) heuristic for the VRP problem without harming its speed and simplicity is an interesting question and proposed a new enhancement of the savings criterion; the new method was both fast and very accurate. Bruno and Laporte (2002) suggested a simple enhancement of the Esau–Williams heuristic for the CMSTP removing the longest edge on the path linking vertex j to the root against removing the first edge in the path linking j to the root of the tree according to the Esau–Williams heuristic. Oncan and Altinel (2009) proposed three parametric enhancements of the Esau–Williams heuristic for the CMSTP: In the first enhancement they parametrized the classical saving criterion of Esau–Williams; in the second enhancement, they added a term expressing the asymmetry between two vertices with respect to the central vertex. In the third enhancement the authors took into account the fact that CMSTP is a combination of the minimum spanning tree and of the bin packing problem; so they added a third term that included demand information over the joining process.

Battara et al (2012) justified the popularity of the Esau–Williams heuristic in practice and the motivation behind its enhancements, recognizing the problem that the best metaheuristic implementations outperform classical heuristics but they require long computational times and many are not very easy to implement. Additionally, they claimed that the parameters involved in the Esau–Williams enhancements improve their competition with the best metaheuristics and proposed a genetic algorithm procedure to tune efficiently a three-parameter enhancement of the latter algorithm. The proposed evolutionary approach produced high quality results without affecting its simplicity in a limited amount of computing time.

In our work we proposed a two-stage algorithm for the CMSTP: First, motivated by the results of Oncan and Altinel (2009), we develop a new greedy function that measures the cost of linking vertex j to the partial -under construction- capacitated minimum spanning tree. This composite greedy function combines the effects of several metrics and the heuristic follows Prim’s optimization framework. Then we increase the space of feasible solutions by perturbing the input data using the law of cosines, in order to explore multiple and possibly not easily reachable solutions by applying the previous framework. Computational tests on benchmark problems from the literature show that the new heuristic provides extremely good performance results for the CMSTP. Furthermore, excluding the feasible space expansion, we show that we can still obtain good quality solutions in very short computational times.

The remainder of the paper is organized as follows: Section 2 provides an overview of the steps embedded within the proposed heuristic, the selection criteria, and the cosine-based engine for the expansion of the feasible solution space. Section 3 offers the computational results while the conclusions are summarized in Section 4.

2 The heuristic and the feasible solution space expansion mechanism

Let G = (V, E) be an undirected graph, where V = {0, 1, 2, …, n} is the vertex set, with 0 as the root vertex, and E = {(i, j): i, j ϵ V, i ≠ j} is the edge set. A nonnegative weight or demand q _i is associated with each vertex i∈V-{0}, and a length or cost c _ij is associated with each edge {i, j}. Given a spanning tree, any subtree linked to the root by a single edge is called a main subtree. Given a vertex i∈V-{0} the main subtree containing j is called the subtree of j. We refer to the first vertex in the subtree of j as the gate vertex g(j) of the subtree of j.

Our work is based on the pioneering algorithms of Prim (1957) and Esau–Williams (1966). Prim’s algorithm starts only with the root vertex in the spanning tree and at each iteration, the vertex whose distance or cost to any vertex already in the tree is minimal is brought into the tree. Esau–Williams’s algorithm, on the other hand, starts with each vertex and the root vertex in separate components; then they define a tradeoff function C _ij (different from c _ij) as the minimum cost of connecting the component containing vertex i to the root vertex minus the cost of connecting vertices i and j. Thus, at each stage, the algorithm finds C _i*j* = max(C _ij) and brings in line (i*,j*), forming a new component without exceeding the capacity constraint.

The algorithm we propose requires the definition of the “shortest point” for a vertex i, i.e., the root vertex or a vertex in the tree that has the minimum direct distance with i and the linking is feasible (does not violate the capacity constraint); we denote the shortest point for vertex i by s(i). Furthermore, let C ⁱ_s(i) be the value of the composite selection criterion that is associated with the selection of vertex i; C ⁱ_s(i) will be properly defined later in this section.

Given all the previous definitions and notation, the steps of the new heuristic are as follows:

PEW Heuristic

Step 0:

Initialization. Read n, c _ij, Q ∀ i, j = 0, …, n

Step 1:

Select the vertex nearest to the root to start the spanning tree

Step 2:

Find the feasible vertex j that minimizes the composite criterion C ^j_s(j) :

Step 2a:: Select the shortest point s(j) and calculate C ^j_s(j)
Step 2b:: Repeat step 2a for all vertices not linked to the spanning tree
Step 2c:: Select vertex j with the minimum C ^j_s(j)

Step 3:

Link the selected vertex j to its shortest point on the current spanning tree and update the spanning tree; set vertex j as a tree vertex

Step 4:

If there are vertices remaining to be linked to the spanning tree, return to Step 2; otherwise proceed to Step 5

Step 5:

Terminate; get sequence of vertices in each subtree and total distance

The overall loop is performed until all vertices have been assigned to the spanning tree. The solution procedure is straightforward adopting a very simple execution mechanism. However, it is based on new criteria for vertex selection and linking, which are motivated by the minimization function of the new enhancement heuristics for the CMSTP.

2.1 Selection criteria

To form our new composite selection function, we combine six metrics through a weighted linear relationship.

The first metric is C ¹_s(j)j defined as the direct distance of vertex j to the subtree s(j) (the shortest point of vertex j), namely c _s(j)j.

The second metric is C ²_s(j)j defined as the direct distance of vertex j to the gate node (or vertex) of s(j), that is c _jg(s(j)).

The third metric is C ³_ij i.e., the distance of the shortest point s(j) to the gate vertex of s(j), named c _g(s(j))s(i).

The fourth metric is C ⁴_ij defined as the inverse of the following parameterized saving expression (Oncan and Altinel, 2009), adjusted for the component case with one vertex inside: C ⁴_s(j)j = (c _dj − α×c _s(j)j)⁻¹, where d is the root vertex and α is the positive tree shape parameter. This formula is the inverse of the parameterized extension of the savings formula of the Esau–Williams heuristic because our composite vertex selection criterion ensures that a vertex j selected for subtree connection will minimize the selection criterion.

The fifth metric is motivated by Paessens (1988) who introduced a new term to the savings expression of the Vehicle Routing Problem solution algorithm, that was the asymmetry between customers i and j with respect to their distances to the depot. Oncan and Altinel (2009) used the same term to extent the first enhancement of the Esau–Williams heuristic for the CMSTP. The inverse of the asymmetry is included in our selection criterion by the metric: C ⁵_ij = |c _ds(j) − c _dj|⁻¹ where d is again the root vertex.

The sixth metric involves the notion of a “moving vertex” m, i.e., a vertex that is second, third, fourth, etc. in distance from the root vertex; this is to capture additional information about the spatial distribution of vertices in a space expanding mechanism logic embedded within the selection criteria. In the first iteration of the metric calculations, the moving vertex is the second in distance vertex nearest to the root; in the next iteration, m is the third in distance and so on so forth. Consequently, the calculation of this new metric involves an iterative scheme based on the moving vertex m. The criterion is defined as C ⁶_s(j)j = |c _j,f − β×c _j,m|, where β is a tuning parameter taking values between 0 and 1, f is the nearest to the root vertex, and the final value of C ⁶_s(j)j is the minimal one along all possible m’s examined.

To summarize, we can state the following:

Criteria C¹, C² and C³ are simple distance-based functions we define to link vertex j to other points of the network.
Criterion C⁴ is the inverse of the parametrized extension of the saving formula of Esau–Williams proposed by Oncan and Altinel (2009).
Criterion C⁵ is the inverse of the third term within the second enhancement of the saving formula of Esau–Williams proposed by Oncan and Altinel (2009).
Criterion C⁶ is a new composite criterion we propose for the first time.

Now we can define the overall vertex selection criterion which accounts for all previously defined metrics. We use a simple linear relationship to merge the effects of the six metrics. The greedy function that measures the cost of connecting vertex j to its shortest point s(j) in the minimum spanning tree under construction is denoted as C ^j_s(j) . This composite greedy function is defined as follows:

$$ C_{s(j)}^{j} = b_{1} C_{s(j)j}^{1} + b_{2} C_{s(j)j}^{2} + b_{3} C_{s(j)j}^{3} + b_{4} C_{s(j)j}^{4} + b_{5} C_{s(j)j}^{5} + b_{6} C_{s(j)j}^{6} $$

(1)

In (1) b ₁, b ₂, b ₃, b ₄, b ₅, b ₆ ≥ 0. Note that the weights b ₁, b ₂, b ₃, b ₄, b ₅, b ₆ define the relative importance of the associated metric in the selection of vertex j. It is important to note that a determinant factor for the effective deployment of the proposed greedy heuristic is the selection of appropriate weights and the metrics’ parameters embedded in the composite greedy function. The tuning of these parameters requires statistical experimentation. We discuss the determination of the intervals of weights and parameters of the metrics in the computational results section.

The composite greedy function allows for the exploration of a large solution space, and thus, we expect the criterion by itself to lead an unsophisticated method to solutions of high quality; such expectations are proven by the application of the proposed heuristic to literature benchmarks. Furthermore, the diversity of the individual selection metrics can capture the specific and the unique characteristics of each problem instance, thus leading a simple greedy approach to evolve in a manner similar to that of meta-heuristics.

2.2 Mechanism for feasible solution space expansion

Hart and Shogan (1987) suggested the perturbation of the problem’s data and the reapplication of the relevant algorithm as one way to improve the performance of a heuristic. Thus, instead of applying a heuristic only to the original data, they claimed that improvements can be achieved when several minor perturbations of the data are used as starting points for the algorithm’s execution. The best of the solutions obtained can then be implemented using the original data.

Based on this observation, to improve the performance of any proposed greedy heuristic for the CMSTP, we can proceed to data perturbation by altering the distance between the root vertex and the second in distance vertex nearest to the root. We opt to move this particular vertex because it is included in the selection criterion of our algorithm (C ⁶_s(j)j ); furthermore, according to Kershenbaum and Chow (1974), the second nearest feasible neighbor leads to significant benefits for the overall solution.

To perturb the CMSTP data, we move the second in distance nearest to the root vertex B cyclically by an angle θ, while also changing its distance from vertex O (which is the root vertex 0) as shown in Figure 1. Thus, first B moves to B′ and then B′ moves to D, and using the law of cosines we can calculate the distance DE (where E is any random vertex) as follows:

$$ EB^{2} = OE^{2} + OB^{2} {-} \, 2 \, \times OE \times OB \times \, \cos \angle BOE => $$

$$ \cos \angle BOE \, = \, \left( {OE^{2} + OB^{2} - EB^{2} } \right) / \left( {2 \, \times OE \times OB} \right) $$

so the angle BOE is known. Because the angle BOD is known by the cyclic move of vertex B, we conclude that ∠DOE = ∠BOD − ∠BOE. The law of cosines also states:

$$ DE^{2} = OD^{2} + OE^{2} {-} \, 2 \, \times OD \times OE \times \, \cos \angle DOE $$

As a result, the distance between new location of B (which is D now) and the random vertex E is known.

Our mechanism of feasible solution space expanding calculates all the new distances between the vertices of the graph and the new location of the second nearest to the root vertex. Our proposed greedy algorithm implemented to new distance matrix producing a new spanning tree under the capacity restriction. Then, we can recalculate the cost of the solution using the real distance matrix. The final results should lead to improvements in the objective function value for data sets, as per the claim of Hart and Shogan (1987).

The sequential steps of this expanded procedure, denoted as PEW-PLC heuristic, are shown in Figure 2. Note that PLC refers to the recalculation of distances after the perturbation via the angle rotation.

3 Computational results

The proposed heuristic in both its versions (PEW and PEW-PLC) has been implemented in FORTRAN 90, using the Fortran PowerStation 4.0 compiler. The computational experiments have been performed on a PC with an Intel Core i5 processor. The heuristics were tested on the classical unit demand data sets from the OR-Library (http://people.brunel.ac.uk/~mastjjb/jeb/info.html). The tc instances in this library have the central vertex in a central position with respect to the other ones. The te instances have the central vertex in a corner with respect to the other ones. The problems include ten instances with fully connected graphs of 40-vertices with arc capacities 3, 5 and 10, and ten instances with fully connected graphs of 80-vertices with arc capacities 5, 10 and 20. Thus, a total of 60 problem instances are examined solved.

One requirement in our approach is the determination of the intervals of weights b ₁, b ₂, b ₃, b ₄, b ₅, and b ₆ in the selection formula, as well as the setting of parameters α and β in the metrics of the cost function. In our experiments, the weights b ₁, b ₂, b ₃, b ₄, b ₅, and b ₆ are chosen within the interval [0.0, 1.0] in an incremental manner with increment set to 0.5. Parameters α and β are set in the same manner.

The experimental results of the proposed greedy heuristic PEW are reported in Tables 1, 2, 3 and 4. The test problems reference in OR Library and its capacity are shown in columns one and two of each table. The optimal known (literature) solutions for each instance or the relevant lower bound if the optimal is not reached are listed in the third column, while the best solutions produced by PEW are reported in the fourth column (distance of the capacitated minimum spanning tree and moving vertex in parenthesis). In the fifth column, we provide the number of capacitated spanning trees produced up to point when the best solution is obtained. The sixth column of each table offers the CPU time in seconds that PEW requires to find the best solution.

Table 1 Comparison of the results for benchmark instances te40 with n = 40 vertices

A greedy heuristic for the capacitated minimum spanning tree problem

Abstract

Similar content being viewed by others

Exact algorithms for finding constrained minimum spanning trees

A Heuristic for the Degree-Constrained Minimum Spanning Tree Problem

Novel Heuristics for the Euclidean Leaf-Constrained Minimum Spanning Tree Problem

1 Introduction

2 The heuristic and the feasible solution space expansion mechanism

2.1 Selection criteria

2.2 Mechanism for feasible solution space expansion

3 Computational results

4 Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation