After having studied the four basic principles of metaheuristics (modeling, decomposition, construction, and improvement), this chapter introduces the fifth: learning mechanisms. The algorithms seen in the previous chapter rely solely on chance to try to obtain better solutions than those provided by greedy constructive methods or local searches. This is not very satisfactory from an intellectual point of view. Rather than relying on chance alone, this chapter studies how to implement learning techniques to build new solutions. Learning processes require three ingredients:

  • Repeating experiments and analyzing successes and failures: we only learn by making mistakes!

  • Memorizing what has been done.

  • Forgetting the details: this gives the ability to generalize when facing a similar but different situation.

1 Artificial Ants

The artificial ant technique provides simple mechanisms to implement these learning ingredients in the context of constructing new solutions.

The social behavior of some animals has always been fascinating, especially when a population achieves feats completely out of reach of an isolated individual. This is the case with bees, termites, and ants: although each individual follows extremely simple behavior, a colony is able to build complex nests or efficiently supply its population with food.

1.1 Real Ant Behavior

Following the work of Deneubourg et al. [2] who described the almost algorithmic behavior of ants, researchers had the idea of simulating this behavior to solve difficult problems.

The typical behavior of an ant is illustrated in Fig. 8.1 with an experiment performed on a real colony that has been isolated. The colony can only look for food by going out through a single orifice, which is connected to a tube that splits into two branches joining again further on. The left branch is shorter than the right one. As the ants initially have no information on this fact, they distribute themselves equally between both branches (Fig. 8.1a).

Fig. 8.1

Behavior of an ant colony separated from a food source by a path that divides. Initially, ants are evenly distributed in both branches (a). The ants having selected the shortest path arrive earlier at the food source and therefore lay additional pheromones on the way back sooner. The quantity of pheromones deposited on the shortest path thus grows faster. After a while, virtually all ants use the shortest branch (b)

While exploring, each ant drops a chemical substance that it can detect with its antennas and that assists it when returning to the anthill. Such an information-carrying chemical substance is called a pheromone. On the way back, an ant deposits a quantity of pheromone depending on the quality of the food source. Naturally, an ant that has discovered a short path is able to return earlier than one that took the longer branch.

Therefore, the quantity of pheromones deposited on the shortest path grows faster. Consequently, a newly arriving ant has information on the way to take and biases its choice in favor of the shortest branch. After a while, virtually all ants are observed to use the shortest branch (Fig. 8.1b). Thus, the colony collectively determines an optimal path, even though each individual sees no further than the tip of its antennas.

1.2 Transcription of Ant Behavior to Optimization

If an ant colony manages to optimize the length of a path, even in a dynamic context, we should be able to transcribe the behavior of each individual into a simple process for optimizing intractable problems. This transcription may be obtained as follows:

  • An ant represents a process performing a procedure that constructs a solution with a random component. Many of these processes may run in parallel.

  • Pheromone trails are values \(\tau_e\) associated with each element e constituting a solution.

  • Trails play the role of a collective memory. After a solution has been constructed, the values of the elements constituting it are increased by a quantity depending on the solution quality.

  • The oblivion phenomenon is simulated by the evaporation of pheromone trails over time.

It remains to clarify how these components can be put in place. The construction process can use a randomized construction technique, quite similar to the GRASP method. However, the random component must be biased not only by the incremental cost function c(s, e), which represents the a priori interest of including element e in the partial solution s, but also by the value \(\tau_e\), which is the a posteriori interest of this element. The latter is only known after a multitude of solutions have been constructed.

The marriage of these two forms of interest is achieved by selecting the next element e to include in the partial solution s with a probability proportional to \(\tau _e^\alpha \cdot c(s,e)^\beta \), where α > 0 and β < 0 are two parameters balancing the respective importance accorded to memory and to incremental cost. The update of the artificial pheromones is performed in two steps, each requiring a parameter. First, the evaporation of the pheromones is simulated by multiplying all the values by 1 − ρ, where \(0 \leqslant \rho \leqslant 1\) represents the evaporation rate. Then, each element e constituting a newly constructed solution has its \(\tau_e\) value increased by a quantity 1/f(s), where f(s) is the solution cost, which is assumed to be minimized and greater than zero.
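To fix ideas, a minimal sketch of this biased selection and of the two-step trail update could look as follows in Python. The trail container (a dictionary indexed by element), the default parameter values, and the function names are assumptions made for this illustration; c(s, e) is assumed to be strictly positive.

```python
import random

def select_element(candidates, tau, c, s, alpha=1.0, beta=-2.0):
    # Weight of candidate e: tau[e]^alpha * c(s, e)^beta.
    # With beta < 0, cheaper elements receive larger weights;
    # c(s, e) must therefore be strictly positive.
    weights = [tau[e] ** alpha * c(s, e) ** beta for e in candidates]
    return random.choices(candidates, weights=weights)[0]

def update_trails(tau, solution, f_s, rho=0.1):
    for e in tau:              # evaporation: forget a fraction rho
        tau[e] *= 1.0 - rho
    for e in solution:         # reinforcement by the new solution
        tau[e] += 1.0 / f_s    # f_s = f(solution) > 0, to be minimized
```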

1.3 MAX-MIN Ant System

The first artificial ant colony applications contained only the components described above. The trail update is a positive feedback process. There is a bifurcation point between a completely random process (learning-free) and an almost deterministic one, repeatedly constructing the same solution (too fast learning). Therefore, it is difficult to tune a progressive learning process with the three parameters α, β and ρ.

To remedy this, Stützle and Hoos [5] suggested limiting the trails to values between \(\tau_{\min}\) and \(\tau_{\max}\). Hence, the probability of selecting an element is bounded between a minimum and a maximum. This prevents some elements from acquiring such a high trail value that all constructed solutions would contain them. This leads to the MAX-MIN ant system, which proved much more effective than many other previously proposed frameworks. It is given in Algorithm 8.1.

Algorithm 8.1: MAX-MIN ant system framework

This framework comprises an improvement method. Indeed, implementations of “pure” artificial ant colonies, based solely on building solutions, have proven inefficient and difficult to tune. There may be exceptions, especially for the treatment of highly dynamic problems, where a situation that is optimal at a given time is no longer optimal at another.

Algorithm 8.1 has a theoretical advantage: it can be proved that if the number of iterations \(I_{max} \rightarrow \infty\) and if \(\tau_{\min} > 0\), then it finds a globally optimal solution with probability tending to one. The proof is based on the fact that \(\tau_{\min} > 0\) implies that the probability of building a globally optimal solution is not zero. In practice, however, this theoretical result is not tremendously useful.
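To illustrate the framework, a minimal sketch in Python might be organized as follows. The functions construct, improve, and f are assumed to be supplied by the user, solutions are assumed to be iterables of elements indexing the trail dictionary, and reinforcing the best solution found so far is only one of the possible variants.

```python
def max_min_ant_system(trails, construct, improve, f, n_ants=20,
                       i_max=1000, rho=0.02, tau_min=0.01, tau_max=10.0):
    best, f_best = None, float("inf")
    for _ in range(i_max):
        for _ in range(n_ants):
            s = improve(construct(trails))  # each ant builds, then improves
            if f(s) < f_best:
                best, f_best = s, f(s)
        for e in trails:                    # evaporation, clamped from below
            trails[e] = max(tau_min, trails[e] * (1.0 - rho))
        for e in best:                      # reinforce the best solution,
            trails[e] = min(tau_max, trails[e] + 1.0 / f_best)  # clamped above
    return best, f_best
```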

1.4 Fast Ant System

One of the disadvantages of numerous frameworks based on artificial ants is their large number of parameters and the difficulty of tuning them. This is the reason why we have not presented Ant Systems (AS [1]) or Ant Colony Systems (ACS [3]) in detail. In addition, it can be challenging to design an incremental cost function providing pertinent results. An example is the quadratic assignment problem: since every pair of elements contributes to the objective function, the last element to be included can contribute significantly to the quality of the solution, whereas the first item placed incurs no cost at all. This is why a simplified framework called FANT (for Fast Ant System) has been proposed.

In addition to the number of iterations \(I_{max}\), the user of this framework must specify only one other parameter, \(\tau_b\). It corresponds to the reinforcement of the artificial pheromone trails. This reinforcement is systematically applied, at each iteration, to the elements of the best solution found so far. The reinforcement of the trails associated with the elements of the solution constructed at the current iteration, \(\tau_c\), is a self-adaptive parameter. Initially, this parameter is set to 1. When over-learning is detected (the best solution is generated again), \(\tau_c\) is incremented, and all trails are reset to \(\tau_c\). This implements the oblivion process and increases the diversity of the solutions generated.

If the best solution has been improved, then \(\tau_c\) is reset to 1 to give more weight to the elements constituting this improved solution. Finally, FANT incorporates a local search method: as mentioned above, it has indeed been noticed that the construction mechanism alone often produces solutions of bad quality. Algorithm 8.2 provides the FANT framework.

Algorithm 8.2: FANT framework. Most of the lines of code are about automatically adjusting the weight \(\tau_c\) assigned to the newly built solution against the weight \(\tau_b\) of the best solution achieved so far. If the latter is improved or if over-learning is detected, the trails are reset

Figure 8.2 illustrates the FANT behavior on a TSP instance with 225 cities. In this experiment, the value of \(\tau_b\) was fixed to 50. The figure provides the number of edges differing from the best solution found so far, before and after calling the improvement procedure.

Fig. 8.2

FANT behavior on a TSP instance with 225 cities. For each iteration, the diagram provides the number of edges differing from the best solution found by the algorithm, before and after calling the ejection chain local search. Vertical lines indicate improvements in the best solution found. In this experiment, the last of these improvements corresponds to the optimal solution

A natural implementation of the trails for the TSP is to use a matrix τ rather than a vector. Indeed, an element e of a solution is an edge [i, j], defined by its two incident vertices. Therefore, the value \(\tau_{ij}\) is the a posteriori interest of having the edge [i, j] in a solution. The initialization of this trail matrix and its update may therefore be implemented with the procedures described by Code 8.2.

The core of an ant heuristic is the construction of a new solution exploiting the artificial pheromones. Code 8.1 provides a procedure that does not exploit the a priori interest (an incremental cost function) of the elements constituting a solution. In this implementation, the departure city is the first one of a random permutation p. At iteration i, the first i cities are definitively fixed. The next city is then selected among the remaining ones with a probability proportional to their trail values.

Code 8.1 generate_solution_trail.py Implementation of the generation of a permutation exploiting only the information contained in the pheromone trails
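A plausible sketch of such a procedure, assuming the trails are stored in an n × n matrix (a list of lists) of positive values, is the following; it transcribes the description above rather than the original listing.

```python
import random

def generate_solution_trail(n, trail):
    # Start from a random permutation; city p[0] is the departure city.
    p = list(range(n))
    random.shuffle(p)
    for i in range(1, n - 1):
        # The first i cities are fixed; choose the next one among the
        # remaining cities with probability proportional to the trail
        # between the last fixed city p[i-1] and each candidate.
        weights = [trail[p[i - 1]][p[j]] for j in range(i, n)]
        j = random.choices(range(i, n), weights=weights)[0]
        p[i], p[j] = p[j], p[i]
    return p
```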

Once the three procedures given by Codes 8.1 and 8.2 as well as an improvement procedure are available, the implementation of FANT is very simple. Such an implementation, using an ejection chain local search, is given by Code 8.3.

Code 8.2 init_update_trail.py Implementation of the trail matrix initialization and update for the FANT method applied to a permutation problem. If the solution just generated is the best one found previously, the trails are reset. Otherwise, the trails are reinforced both with the current solution and with the best one
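A minimal sketch of these two procedures, assuming solutions are permutations of n cities and the trails form a symmetric integer matrix, could be:

```python
def init_trail(n, tau_c):
    # Reset the whole trail matrix to the uniform value tau_c.
    return [[tau_c] * n for _ in range(n)]

def update_trail(trail, p, best, tau_c, tau_b):
    n = len(p)
    if p == best:
        # Over-learning: the newly built solution is the best one again.
        tau_c += 1                           # give more weight to diversity
        return init_trail(n, tau_c), tau_c   # forget: reset all trails
    # Otherwise, reinforce the edges of the current and best solutions.
    for i in range(n):
        a, b = p[i], p[(i + 1) % n]
        trail[a][b] += tau_c
        trail[b][a] += tau_c
        a, b = best[i], best[(i + 1) % n]
        trail[a][b] += tau_b
        trail[b][a] += tau_b
    return trail, tau_c
```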

Code 8.3 tsp_FANT.py FANT for the TSP. The improvement procedure is given by Code 12.3
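Reusing the procedures sketched above, a FANT main loop might look as follows. The parameter improve stands in for the ejection chain local search of Code 12.3; its exact signature, as well as the helper tour_length, are assumptions made for this sketch.

```python
def tour_length(p, d):
    # Length of the tour visiting the cities in the order given by p.
    return sum(d[p[i]][p[(i + 1) % len(p)]] for i in range(len(p)))

def tsp_fant(d, i_max, tau_b=50, improve=lambda p, d: p):
    n = len(d)
    tau_c = 1
    trail = init_trail(n, tau_c)
    best, f_best = None, float("inf")
    for _ in range(i_max):
        p = improve(generate_solution_trail(n, trail), d)
        length = tour_length(p, d)
        if length < f_best:
            best, f_best = p[:], length
            tau_c = 1                     # new best: reset the weight
            trail = init_trail(n, tau_c)  # and restart the learning
        else:
            trail, tau_c = update_trail(trail, p, best, tau_c, tau_b)
    return best, f_best
```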

2 Vocabulary Building

Vocabulary building is a more global learning method than artificial ant colonies. The idea is to memorize fragments of solutions, which are called words, and to construct new solutions from these fragments. Put differently, one has a dictionary used to build a sentence attempt in a randomized way. A repair/improvement procedure makes this solution attempt feasible and increases its quality. Finally, this new solution sentence is fragmented into new words that enrich the dictionary.

This method has been proposed in [4] and is not yet widely used in practice, although it has proved efficient for a number of problems. For instance, the method can be naturally adapted to the vehicle routing problem. Indeed, it is relatively easy to construct solutions with tours similar to those of the most efficient solution known. This is illustrated in Fig. 8.3.

Fig. 8.3

(a) The optimal solution to a VRP instance. (b) A few tours quickly obtained with a taboo search. We notice great similarities between these tours and those of the optimal solution

By building numerous solutions using randomized methods, the first dictionary of solution fragments can be acquired. This is illustrated in Fig. 8.4.

Fig. 8.4

Fragments of solutions (vehicle routing tours) constituting the dictionary. A partial solution is built by randomly selecting a few of these fragments (indicated in color)

Once an initial dictionary has been constructed, solution attempts are built, for instance, by selecting a subset of tours that have no customers in common. Such a solution attempt is not necessarily feasible: during the construction process, the dictionary might not include tours containing only customers not yet covered. It is therefore necessary to repair the solution attempt, for instance by means of a method similar to that used to produce the first dictionary, but starting with the solution attempt. This phase of the method is illustrated in Fig. 8.5. The improved solution is likely to contain tours that are not yet in the dictionary. These are included to enrich it for subsequent iterations.
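A sketch of this selection step, assuming the dictionary is a list of tours and each tour is a collection of customer indices, could be:

```python
import random

def build_attempt(dictionary, n_customers):
    # Randomly scan the dictionary and keep every tour whose customers
    # do not overlap with those already covered.
    attempt, covered = [], set()
    for tour in random.sample(dictionary, len(dictionary)):
        if covered.isdisjoint(tour):
            attempt.append(tour)
            covered.update(tour)
    # The customers left over must be inserted by the repair procedure.
    missing = set(range(n_customers)) - covered
    return attempt, missing
```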

Fig. 8.5

(a) A sentence attempt is constructed by randomly selecting a few words from the dictionary. (b) This attempt is completed and improved

The technique can be adapted to other problems, like the TSP. In this case, the dictionary words can be the edges appearing in a tour. Figure 8.6 shows all the edges present in more than two-thirds of 100 tours obtained by applying a local search starting from random solutions. The optimal solution to this problem is known. Hence, it is possible to highlight the few frequently obtained edges that are not part of the optimal solution. Interestingly, nearly 80% of the edges of the optimal solution are identified by initializing the dictionary with a basic improvement method.
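A sketch of this dictionary initialization, assuming local_search is a user-supplied function returning a tour (a permutation of the n city indices) obtained from a random starting solution, could be:

```python
from collections import Counter

def frequent_edges(n, local_search, runs=100, threshold=2 / 3):
    # Count how often each undirected edge appears in the local optima.
    counts = Counter()
    for _ in range(runs):
        p = local_search()                  # a tour from a random start
        for i in range(n):
            a, b = p[i], p[(i + 1) % n]
            counts[min(a, b), max(a, b)] += 1
    # Keep the edges present in more than `threshold` of the tours.
    return {e for e, c in counts.items() if c > threshold * runs}
```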

Fig. 8.6

An optimal solution (light color) and fragments of tours constituting an initial dictionary for the TSP instance pr2392. The fragments are obtained by repeating 100 local searches starting with random solutions and only retaining the edges appearing in more than 2/3 of the local optima. Interestingly, almost all these edges belong to an optimal solution. The few edges that are not part of it are highlighted (darkest color)

Problems

8.1

Artificial Ants for Steiner Tree

For the Steiner tree problem, how can the trails of an artificial ant colony be defined? Describe how these trails are exploited.

8.2

Tuning the FANT Parameter

Determine good values for the parameter \(\tau_b\) of the tsp_FANT method provided by Code 8.3 when the latter performs 300 iterations. Consider the TSPLIB instance tsp225.

8.3

Vocabulary Building for Graph Coloring

Describe how vocabulary building can be adapted to the problem of coloring the vertices of a graph.