A matheuristic approach for the family traveling salesman problem

In the family traveling salesman problem (FTSP), there is a set of cities which are divided into a number of clusters called families. The salesman has to find a shortest possible tour visiting a specific number of cities from each of the families without any restriction of visiting one family before starting the visit of another one. In this work, the general concept of the Partial OPtimization Metaheuristic Under Special Intensification Conditions is linked with the exact optimization by a classical solver using a mathematical programming formulation for the FTSP to develop a matheuristic. Moreover, a genetic and a simulated annealing algorithm are used as metaheuristics embedded in the approach. The method is examined on a set of benchmark instances and its performance is favorably compared with a state-of-the-art approach from literature. Moreover, a careful analysis of the specific components of the approach is undertaken to provide insights into the impact of their interplay.


Introduction
The traveling salesman problem (TSP) is one of the most investigated problems in operations research.The classical TSP consists of N cities and a salesman tries to find the shortest possible tour, which includes all the nodes and each one exactly once.Matai et al. (2010) present many real-world applications for the TSP; vehicle routing and communication networks are only two examples of them.The TSP is known to be NP-hard based on the seminal exposition of Karp (1972) and, therefore, this also holds for its generalizations.
The TSP has many variants and generalizations each representing a range of realworld problems.In this paper, we focus on a challenging and emerging TSP variant, which is the family TSP (FTSP).In this variant, we have some predetermined clusters of cities called families.In the FTSP, it is sought to visit a predefined number of nodes in each family and there is no obligation to visit the cities of each cluster contiguously.A main application of the FTSP arises in the order picking problem in warehouses where the goods of the same product type exist in different warehouses or in different storage places within the same warehouse (Morán-Mirabal et al. 2014).In this case, it is assumed that each product is a family and the number of members required to be visited from the family is the demand of that product.
Considering the advantages as well as drawbacks of mathematical programming and metaheuristics, matheuristics are developed to cover the strengths, take advantage of both approaches and boost the optimization ability (Maniezzo et al. 2021).A special characteristic of matheuristics is that the exploitation is done by benefiting from the features of mathematical models; accordingly the name "model-based heuristics" is sometimes used for such algorithms (Maniezzo et al. 2010).In this research, we aim at developing a matheuristic for solving a specific generalization of the TSP.Our method uses a mathematical programming approach for local optimization processes in different parts of the solutions together with a genetic algorithm (GA) as its metaheuristic component.For details on the GA concept, interested readers can be referred to Haupt and Haupt (2003).For local exact optimization, the mathematical model of the problem is programmed in PYTHON and solved by the GUROBI solver (Gurobi Optimization, LLC 2021).For more details on the use and the implementation of metaheuristics for the TSP, we refer to Taillard (2023).
To be more specific, our matheuristic works by using a GA to provide some appropriate candidate solutions, and then, neighborhood searches in a selection of solutions based on simulated annealing (SA) (Kirkpatrick et al. 1983) are performed.Subsequently, the solutions are improved by a procedure devised according to the concept known as Partial OPtimization Metaheuristic Under Special Intensification Conditions (POPMUSIC) (Taillard and Voss 2002).This improvement is made by working each time on a part of the solution through applying the mathematical programming model and using a standard solver like the Gurobi optimizer (Gurobi Optimization, LLC 2021).Beyond finding an appropriate solution approach, our goal is also to investigate the specific components within the solution approach to be able to provide component-based insights into matheuristics and metaheuristics development as it is demanded by, e.g., de Armas et al. (2022), Swan et al. (2022).
The outline of the remainder of this paper is as follows.Related works are introduced in Sect. 2. Section 3 is dedicated to explaining the corresponding mathematical model.The solution methodology is presented in Sect. 4.Then, Sect. 5 discusses the experiments and their results.Finally, the conclusions of this research are drawn in Sect.6.

Literature review
In this section, we review different methods adopted in the literature in order to propose a suitable approach for the FTSP.Firstly, we start with adopted solution approaches to deal with the FTSP.Then, we focus on matheuristics and particularly POPMUSIC.Finally, we add a few pointers regarding the parametrization of GAs.

The family traveling salesman problem
The FTSP is a variant of the TSP which was introduced in Morán- Mirabal et al. (2014).In that paper, the authors propose an integer programming formulation for the problem, and it is shown that the CPLEX solver can only solve small instances of the problem in reasonable time.The authors develop two randomized heuristics for obtaining solutions to this problem.Bernardino and Paias (2018) introduce several compact and non-compact models for the FTSP.One of the non-compact models is able to solve instances with 127 nodes in less than 70 s and one of the instances with 280 nodes in 3615 s.Moreover, Bernardino and Paias (2021) present two metaheuristics and a hybrid algorithm, which combines a branch-and-cut algorithm with a local search scheme.This method can already be classified as a matheuristic concept.The results show that their proposed methods improve the previously best-known upper bounds.Reyes Vega (2014) reviews the FTSP with capacitated agents, including the different methods adopted for it.
A special case of the FTSP is the equality generalized TSP (GTSP), where the number of cities which must be visited in each family is one.The equality GTSP is investigated, e.g., in Cacchiani et al. (2010).Bernardino and Paias (2022) also address the FTSP with incompatibility constraints, which is a special case of the problem where it is not allowed to visit cities from incompatible families on the same route.
In Table 1, we summarize the works done in the FTSP field.Among others, we can conclude that the GA is a propitious algorithm that shows success in solving the FTSP (e.g. when combined with a local search; furthermore, branching techniques are also successfully applied by Bernardino and Paias 2021).The combination of the two is then an approach worth studying which has not yet been investigated for the FTSP (Bernardino and Paias 2022;Bernardino et al. 2022).

Matheuristics
Matheuristics have been used for more than a decade as efficient alternatives for pure exact and metaheuristic approaches to solve complex optimization problems.In fact, they have found applications in many real-life problems, including the inventory routing problem (Vadseth et al. 2021;Su et al. 2020), various types of (vehicle) routing problems (Bosco et al. 2014;Leggieri and Haouari 2018), the multi-depot ring star problem (Hill and Voß 2016), the redundancy allocation problem (Caserta and Voß 2014), the airline crew rostering problem (Doi et al. 2018), the dynamic berth allocation problem (Nishi et al. 2017), and generalized versions of the knapsack problem (Galli et al. 2023), to name a few.For a comprehensive and detailed overview of the advances in matheuristics, the interested reader is referred to Maniezzo et al. (2010), Maniezzo et al. (2021).
In particular, the POPMUSIC algorithm from Taillard and Voss (2002) has shown an outstanding ability in combinatorial optimization problems, including the berth allocation problem (Lalla-Ruiz and Voß 2014;Lalla-Ruiz et al. 2015), the point feature label placement problem (Alvim and Taillard 2009), the capacitated vehicle routing problem (Queiroga et al. 2021) and the multi-depot cumulative capacitated vehicle routing problem (Lalla-Ruiz and Voß 2019).In particular for the TSP, as an extension to Taillard (2017), Taillard and Helsgaun (2019) aim to build a list of good candidate edges with a complexity lower than quadratic in the context of TSP instances given by a general function.Their numerical results show that solutions of excellent quality can be obtained with an empirical complexity lower than quadratic.Moreover, Taillard (2022) proposes an adaptation of the approach.The method is tested on instances with billions of cities.We can then conclude that POPMUSIC is an appropriate approach to solve very large instances of the TSP.However, to our knowledge, the approach has not yet been applied to the FTSP despite its practical importance.Additionally, POPMUSIC has been successful in solving similar issues involving problem decomposition.In Table 2, we summarize some main related works which use the POPMUSIC approach.More specifically, in that table, we focus on proposing a new hybridization of a branching strategy or a population-based method to be applied to a TSP variant, and papers applying it to a new domain.
As mentioned above, POPMUSIC is all about decomposing a problem and combining a metaheuristic and a mathematical programming approach.Their combination is done in different ways.In particular, the combination of the GA and branching approaches demonstrates potential in tackling diverse combinatorial optimization problems (Nagar et al. 1996;Poojari and Beasley 2009), including the TSP (Choi et al. 2003).However, their combination is not done in a systematic way as in POP-MUSIC, which we propose in the following sections.

Parametrization of genetic algorithms
Most metaheuristics are using one or more parameters.In many cases they can be crucial for the success of a method.One distinction refers to whether a method is parameterized in general or whether the parameters need to be tuned for every instance.The latter way for parametrization, known as per-instance or instance-aware parametrization, is often more effective (Hutter and Hamadi 2005).As an example, consider the case of a GA, where small problem instances might be sufficiently solved by using a small population while large instances need a larger population (see, e.g., Almeida et al. 2013).In this paper, we adopt the response surface methodology (RSM; see, e.g., Box and Draper 2007), which is a widely used per-instance parametrization approach (as indicated in, e.g., Hutter et al. 2014).
The RSM is a statistical and mathematical technique used to optimize and analyze complex systems, processes, or experiments.It is particularly helpful when the relationship between input variables and the response of interest (output) is not easily discernible or is too costly or time-consuming to evaluate through direct experimentation.
RSM involves fitting a mathematical model to experimental data, typically using regression analysis, to approximate the behavior of the system.This model, known as the response surface, allows researchers to explore and identify the optimal input settings that lead to the desired response.By systematically varying the input variables and observing the corresponding responses, RSM helps to efficiently locate the optimal conditions that yield the best outcomes.
The method is widely applied in various fields such as engineering, manufacturing, chemistry, and process optimization, where it aids in understanding the underlying relationships between variables, reducing experimental efforts, and maximizing performance or quality (e.g., de Oliveira et al. 2019).

Mathematical models
The following notations and formulations express a general mathematical model for the FTSP, which is based on the model of Morán-Mirabal et al. (2014) but includes some slight modifications: Input Explanation

O
The fixed start and end city of the salesman V The set of vertices or cities except O N The set of all cities The number of families nv l The number of members from family l which must be visited; l ∈ {1, 2, . . ., m} K N The total number of cities (members) which must be visited; The set of roads between any two cities (edges) c i j The cost of traversing edge (i, j) Auxiliary variable which keeps the number of visited cities until city i in the tour The objective function (1) calculates the total cost of visiting the selected family members; it has to be minimized.Constraints (2)-(3) set the start and the end of the tour to be at city O. Constraints (4)-( 5) ensure that each city is entered and left at most once.Equation ( 6) ensures that the required family members are visited.Con-straints ( 7)-( 8) imply entering and exiting the required number of members from each family.The equal in-degree and out-degree of each node is guaranteed by constraints (9), and constraints ( 10)-( 11) are responsible for avoiding subtours.The method of preventing subtours in this model is different from the one used by the base model presented by Morán-Mirabal et al. (2014).This is based on the Miller-Tucker-Zemlin formulation (Miller et al. 1960) and imposes lower complexity compared to the one of Morán-Mirabal et al. (2014) because with that alternative formulation the number of constraints sharply increases by the number of cities.For more details about subtour elimination constraints for the FTSP see Bernardino and Paias (2018).
The presented mathematical model is based on an undirected graph, reflecting the bidirectional nature of roads between any two cities, allowing traversal in both directions.However, it is essential to consider that this model might be applied to an asymmetric instance where the costs of traveling an edge in both directions are not necessarily equal.In such cases, the edges of the graph would have distinct associated weights or costs depending on the direction of traversal.Hence, for asymmetric instances, the graph can be considered as a directed one and each direction of an edge can be regarded as a distinct arc.This adjustment accommodates scenarios in which the transportation costs, travel times, or any other relevant factors may differ depending on the direction in which the road is traveled, providing a more accurate representation of the real-world conditions.The flexibility of this mathematical model allows it to be applied to a wide range of transportation systems, both symmetric and asymmetric, optimizing routes and efficiently addressing diverse scenarios.

Methodology
We use the concept of the POPMUSIC algorithm proposed in Taillard and Voss (2002) to design an efficient matheuristic for the FTSP.
In our matheuristic, a simple GA is employed, which works iteratively on a population of candidate solutions to improve them by genetic operators such as selection, mutation and crossover.The initial population of this GA is generated randomly.The solutions created by the genetic operators are merged with the original population of the iteration, and then, a proportion of the new population is selected according to the objective value by the roulette wheel method (see Blickle and Thiele 1996) to be improved by SA.The SA optimization method explores neighboring solutions around an incumbent solution to find an improved solution.It evaluates a specified number of neighboring solutions based on their objective value, denoted as N BS, and selects the neighboring solution with a certain probability determined by the formula , where f (s) and f (s ) represent the objective values of the current and neighboring solutions, respectively.The parameter T represents the current temperature of the system, which controls the likelihood of accepting solutions that may not be better than the current one.By iteratively adjusting the temperature and exploring the solution space, the SA method can escape local optima and ultimately converge towards an optimal solution.In other words, we jump to the neighboring solution if it has a better objective value; otherwise this action is done according to a probability which is directly proportional to the quality of the new solution and inversely proportional to the temperature.This SA stops whenever it cannot improve the objective value after a maximum number of consecutive iterations called MCU I S A (an abbreviation for Maximum Consecutive Unsuccessful Iterations of SA).The SA solutions are similarly added to the previous population.
Subsequently, a number of the best solutions (P M B) regarding the objective value and a number of other solutions (P M O) are also improved according to the POP-MUSIC framework in each GA iteration.The members of P M O are selected again probabilistically using the roulette wheel method.
In the POPMUSIC framework, the solutions are divided into K parts; then, each time, one of these parts is chosen at random and an area including it and the D nearest parts to it called altogether area A is processed by solving the mathematical model of the problem.All variables except for those existing in A are fixed at their values in a given incumbent solution.Consequently, the Gurobi solver is called to optimize the active (non-fixed) variables, which are those existing within A. This procedure continues until no further improvement is found in the solution.It means that if we have consecutively tested all parts of a solution and the exact solving attempts cannot make any improvement, we terminate the POPMUSIC procedure on that solution.
The collaboration of GA, SA and POPMUSIC continues until the number of consecutive unsuccessful GA iterations, i.e., iterations without any improvement in the objective value, exceeds a limit.The pseudocode of this matheuristic algorithm is presented in Algorithm 1.
The parameters of the used GA are the population size (n Pop), the crossover rate (Cross_rate), the mutation rate (Mutat_rate), those of SA are the initial temperature (T 0 ) and cooling rate (CT ), the number of neighborhood searches (N BS) and the maximum number of consecutive stagnant iterations MCU I S A. The rate of best solutions and other solutions for being improved by mathematical programming (P M B_rate and P M O_rate, respectively), the number of solution parts (K ), and the number of nearest parts (D) are the parameters belonging to POPMUSIC.Finally, the number of unsuccessful iterations (MCU I , an abbreviation for Maximum Consecutive Unsuccessful Iterations) is an important parameter for the whole matheuristic.
The classical solution encoding scheme for the TSP is used here for the FTSP.A similar approach is also used in Bernardino and Paias (2021).This is a string containing a permutation of the numbers from one to |V | as shown in the first row of Fig. 2. So each cell of this structure contains the label of one city.In order to translate the structure (chromosome) to an FTSP solution, it is assumed that the salesman visits the cities by their order in the string but whenever it comes to a label belonging to a cluster, whose required number of cities have already been visited, we ignore (eliminate) this city and jump to the next one until it comes to a city that belongs to a family which is still required to be visited.
As a simple example, if it is aimed at converting the first string shown in Fig. 2 to a feasible solution for an FTSP with three families as {1,2,5}, {3,4,9} and {6,7,8,10}, and the assumption that the salesman must visit one city from the first, two cities from the second and two cities from the third family, we obtain the tour 3-5-9-10-7.The reason is that "3", which is the first in the string, comes as the first city of the tour and is from the second family.Then, there is "5" in the string, which belongs to the first family.Now the salesman is finished with the first family because only one city must be visited from it.Therefore, the next node in the string, "2", is not visited or ignored regarding that it belongs to the first family."9" is the next and is visited because two cities from the second family must be visited and we still need one.Finally, we need to visit two cities from the third family, "10" and then "7", that come sooner in this order and are chosen as the last cities of the tour in this string.In this way, the salesman has visited the required number of cities from each family.This decoded solution is shown for some example positions of the ten cities in Fig. 1, where the three families are shown in different colors.
One-point crossovers are performed and the offspring chromosomes are repaired to make their elements unique.For this sake the repeated cities are replaced by the cities which do not exist in the string.The order of adding these required cities is according to the order of the repeated cities, which are removed.In other words, the city with the largest label from the former group comes to the gene containing the largest repeated city label.The second-largest label of the lacking cities is put in the place of the secondlargest repeated city label and so on.Figure 2  crossover, and then, the required repairs to make the offspring results feasible.The grey genes are the crossover points, and after exchanging the corresponding parts, which include the genes from the beginning to the crossover point, "1" and "7" in the first child and "3" and "5" in the second child are repeated.Therefore, their repetitions are replaced by "3" and "5" and "1" and "7" in the first and second child, respectively.The choice to utilize the classical crossover method is grounded in its proven track record of yielding favorable results.On the other hand, some permutation-specific crossover methods may appear more straightforward since they obviate the need for child repair, but its outcomes are noticeably of inferior quality.

depicts a simple example for doing the
For mutation, completely new chromosomes are generated.The reason is that we aim at providing enough diversity in the population.Another method of mutation is to swap the elements in the chromosome.However, this operator is considered in our SA process.
In the SA process, the neighborhood of a chromosome is defined as comprising all chromosomes that differ in the position of either two nodes or edges.In other words, a neighboring chromosome is build by either swapping the contents of two genes or two pairs each including adjacent genes.The probability of doing each of these cases is equal.For the first case the two genes are selected at random, while in the second case two genes are randomly selected from 1 to |N |-1 (|N | is the number of nodes or the chromosome length), and then, they and their immediate right genes make our two gene pairs.The first and second neighborhood are shown in Figs. 3 and 4, respectively.The grey genes are chosen randomly.
Each cell (gene) of this chromosome is considered as a solution part to apply POPMUSIC.So K is equal to the number of genes.It means that each time a gene is Fig. 3 An example for a neighboring chromosome by swapping two nodes Fig. 4 An example for a neighboring chromosome by swapping two pairs of adjacent genes Fig. 5 Applying POPMUSIC to a chromosome selected as s seed together with the D nearest genes to it, which are located on both of its sides.They are selected one from the right, and then, one from the left, until D genes are picked, and in case that we reach the beginning or end of the chromosome in one side, the rest of the parts are all selected from the other side.These genes comprise the area A and the variables corresponding to pairs of consecutive genes within this area in the mathematical model are reset and optimized by the Gurobi solver, while other variables are fixed at the values that the decoding of the chromosome shows.Figure 5 depicts a simple example for the application of POPMUSIC on a chromosome.s seed is shown in dark grey, D is equal to 5, so the last gene is chosen from the right.The Dnearest genes to s seed are shown in grey.So x 2,9 , x 9,10 , x 10,1 , x 1,4 , x 4,7 are optimized by Gurobi through solving the corresponding sub-model.

Computational experiments
This section is divided into three parts of explaining the problem instances, the parameter setting, and the results.All our computational experiments are run on a computer with a Core(TM) i7 processor, 3.10 GHz CPU and 16 GB of RAM.The used programming language is PYTHON.

Instances
A set including numerous typical benchmark instances of the FTSP is used.This set consists of 37 specific instances, which are also considered in Morán-Mirabal et al. (2014), Bernardino and Paias (2018) and Bernardino and Paias (2021).These instances are available at http://familytsp.rd.ciencias.ulisboa.pt/.Note that except the last instance (rbg443_2), all others are symmetric in a metric space, where the triangle inequality is respected.The characteristics of the investigated instances including their number of cities, families, and known upper bounds (or optimal objective values) are listed in Table 3.

Parameter setting
The parameter setting of our solution methodology is done separately for each instance, and the required time is considered in the reported execution time of the algorithm.
The RSM (Box and Draper 2007) is applied for the parametrization, in which an empirical wide interval for each parameter is the input to find a good value within it.The parametrization with the RSM is implemented in R language.The considered intervals are shown in Table 4.
Regarding our previous experiments on this FTSP and other problems, tuning the parameters once and applying the same parameter values to all instances significantly deteriorates the algorithm's performance.Therefore, it is essential to investigate and fine-tune the parameters specifically for each instance.
Regarding the necessity of doing numerous experiments, we limit the execution time of each RSM experiment to 1 5 of a normal run time for each instance to avoid long total execution times for the parametrization phase.For this sake, the instance is run once with average values of parameters and the corresponding execution time is measured and regarded as the normal run time.
K is considered to be equal to the number of cities or |N |.The reason for this choice is that there are |V | genes or cells in the solution encoding structure and it is an immediate idea to consider each as a solution part.
A parameter setting step is executed here before solving each benchmark instance.The chosen values are shown in Table 5.

Results
As the comparative approach from literature, the hybrid method presented in Bernardino and Paias (2021) is chosen as it supersedes earlier approaches like the

Table 5
The parameter values set by the RSM and the time of the parameter setting phase for the benchmark instances Parameter  2018).1That is, the hybrid method has been shown to be the best solution methodology so far.Each instance is solved five times, and the best value (Best), the average of all runs (Average), the average execution time in seconds (Time (s)) are reported in Table 6 for our matheuristic and the hybrid method.
In addition, for our matheuristic, the improvement percentages of best, and then, the average value compared to the hybrid method, and finally, the improvement of the best value over the previous best-known value (separated by a comma) are shown in the column "Improvement (%)".Bold values indicate instance-based best values within the comparison.
It is observed, that our matheuristic is able to provide solutions with better objective values concerning best and average.In other words, our matheuristic has improved the previously found best solutions in many cases.Moreover, it has quite short execution times, although they are a little longer than those of the hybrid method.The superiority of our matheuristic becomes more evident as we face larger problem instances.The two last rows are dedicated to the average values and the p-values resulting from the execution of the Wilcoxon signed-rank test (see, e.g., Conover 1999) to investigate if there are statistically significant differences between the averages of the results of the two approaches.
The p-value represents the probability of obtaining results as extreme as the observed data, assuming that there is no real difference between the two groups (null hypothesis).A low p-value indicates that the observed difference is unlikely to be due to chance alone, and this leads to rejecting the null hypothesis in favor of the alternative hypothesis, which suggests that there is a significant difference between the two groups.
Typically, a significance level (often denoted as α) is predetermined, commonly set to 0.05.If the calculated p-value is less than the significance level, the results are considered statistically significant, and researchers can conclude that there is evidence of a meaningful difference between the groups.On the other hand, if the p-value is greater than the significance level, there is insufficient evidence to reject the null hypothesis, and the observed difference is deemed not statistically significant.In our case, the low p-values, which lead to the rejection of the hypotheses of the equality at the significance level of 0.05, show that there are statistically significant differences in terms of the solution quality and time.
Overall, it can be deduced that our matheuristic has the capability of providing better quality solutions for the FTSP when compared to the outcomes of the state-of-the-art algorithm.In addition, it is also quite fast.

Proportion of execution time related to components
In this section, we show how much of the total execution time of the algorithm is dedicated to each part of it.The proportions are shown in Table 7 for all instances separately and the averages are given in the last row.The considered parts are the    parameter setting phase, the GA including the evaluation and sorting, SA and POP-MUSIC.
It is observed that most of the time (on average 78.09%) is used to set the parameters of the algorithm.This proportion of time is understandable because this initial phase comprises many runs, although their times are limited.Nevertheless, regarding the considerable effect of the parameter values in the performance of the algorithm, investment in the parametrization seems really necessary and beneficial.The required time for the GA is on average 10.64% of the total time.As the GA constitutes the main framework of this algorithm and also contains the objective function calculation of the solutions as well as sorting them accordingly, this share is expected.POPMUSIC consumes on average only 5.80% of the time.This indicates the speed of this approach; the reason is that we only deal with tractable parts of the problem, which are solvable in short times.Finally, the shortest average time (5.44%) belongs to the SA since its aim is to do some fast neighborhood searches around a number of solutions.
By separate analysis of the instances, it can be deduced that more time is needed for parameter tuning of larger or more complex instances and the GA time also slightly increases for them.In terms of POPMUSIC, a trend which shows the increase in time in terms of harder instances is not observed.The reason can lie in the concept of decomposition inherent in POPMUSIC.

POPMUSIC effects
In the last part, we aim at analyzing the contribution of POPMUSIC and the metaheuristic components in our matheuristic (M).For this sake, we implement it once without the POPMUSIC part (MWP) and once without the genetic operators, i.e. crossover and mutation, and the SA part (MWG).The overall results of all runs of all instances are normalized in the interval [0, 1] based on the best and worst objective value found in the iterations of the matheuristic.So we can show all objective values in the same scale as box plots for the entire matheuristic, MWP, MWG and also the hybrid method of Bernardino and Paias (2021) denoted by H in Fig. 6.It can be observed that eliminating either POPMUSIC or the metaheuristic operators considerably decreases the ability of our approach, even when MWP and MWG are compared to H.It means that the POPMUSIC and the metaheuristic together provide a very good integration and complement each other in searching within the solution space.In other words, the combination of the genetic operators, the SA neighborhood searches, and solving the partial mathematical models in a well-structured framework shows great promise.
Next, we conduct again some pairwise statistical tests to examine the existence of any significant differences between the solution qualities of the analyzed methods.The chosen test is the non-parametric Friedman with Bergman-Hommel post-hoc procedure (Bergmann and Hommel 1988).This is assumed to be one of the best statistical tests when the aim is to compare more than two methods in a pairwise manner (Derrac et al. 2011).The test is implemented in R language.Regarding the low p-values, we reject the null hypotheses, which claim the equality of the average objective values, for all the pairs at the significance level of 0.05.

Conclusions
This paper has described a new efficient approach to acquire good-quality solutions for a recent generalization of the traveling salesman problem, which has a lot of applications in the real world, namely the family TSP.A matheuristic framework is used in which the searches based on the SA and POPMUSIC concept with mathematical programming are embedded in the iterations of a GA.That is, the advantages of integrating both metaheuristics and an exact optimization procedure are used in our approach, which has enhanced the ability of the search.The proposed method is tested on a large number of benchmark instances and its efficiency is verified by comparing its results with those of the currently best-known state-of-the-art method.
The matheuristic has the ability to find optimal solutions for many FTSP instances in short execution times.It owes its favorable performance to the effective interactions between the metaheuristic and the mathematical programming part throughout the progress of the algorithm.We have integrated genetic operators, SA neighborhood searches and mathematical programming of tractable parts of the solutions.They act jointly better than the previously best algorithm from the literature (the hybrid method of Bernardino and Paias 2021), which constructs the solution by providing an optimal solution for only a core part of the problem and then inserting the missing parts.Providing even slightly improved results can be regarded as a considerable success in the research direction.Finally, we proved that the ability of our approach drops sharply if either the POPMUSIC parts or the genetic and SA operators are excluded, providing insights into their interoperability.
In the future, other hybridization schemes for connecting metaheuristics and mathematical programming techniques can be investigated.This may even incorporate specialized solvers rather than a standard solver (like VRP solver for vehicle routing problems; Pessoa et al. 2020).Moreover, matheuristics based on other concepts like the fixed set search algorithm (Jovanovic et al. 2019) can be tried.Since the proper parametrization can positively influence the performance of optimization algorithms, it can be worthwhile to invest much in this part of the developed algorithm.

Algorithm 1 : 16 forI
The proposed matheuristic Data: Problem inputs Result: A feasible solution expected to be of good quality 1 Set the GA parameters: n Pop, Cross_rate, Mutat_rate and MCU I , the SA parameters: T 0 , CT , N BS and MCU I S A as well as the POPMUSIC parameters: P M B_rate, P M O_rate, K and D. 2 Generate n Pop random candidate solutions.3 Evaluate the solutions of the population based on their objective value.4 I t = 1. 5 Initialize a variable called B O, which keeps the best objective value found so far.6 while I t ≤ MCU I do 7Choose the candidates for crossovers by roulette wheel selection.and mutation results based on their objective value, and then merge them with the original population.11Choose a set P S A of candidates from the population by the roulette wheel selection.12Improve candidates of P S A by neighborhood searches based on SA.13Evaluate the SA results based on the objective value, and then, merge them with the previous population.14 Sort the total population based on the objective values.15 Choose P M B from the best solutions regarding the objective value and P M O based on the roulette wheel selection from other solutions.Each solution spm ∈ (P M B ∪ P M O) do 17 Decompose spm in K parts, i.e., the set of parts is H = {part 1 , ..., part K } 18 Set P = ∅ 19 while P = {part 1 , ..., part K } do 20 Select a seed part s seed ∈ H and s seed = P at random 21 Build a sub-problem R composed of s seed and its nearest D parts (corresponding to area A) 22 Optimize R through solving the mathematical formulation by the Gurobi solver 23 if The objective value of spm has been improved (decreased) results with the population.32 Sort the population based on the objective value and choose the first n Pop best of them as the new population.33 Set B O = Objective value of the first solution of the population.34if B O is improved in comparison to its previous value then 35

Fig. 1 Fig. 2
Fig. 1 Decoding of the first chromosome shown in Fig. 2 to a feasible solution

Fig. 6
Fig. 6 The box plots of normalized objective values based on all runs of all instances

Table 1
Studies on the FTSP

Table 2
Studies using the POPMUSIC approach

Table 3
The characteristics of the instances of Set 1, * optimal value

Table 6
Comparison between our matheuristic and the best state-of-the-art method

Table 7
The percentage (%) of time spent on different parts of our matheuristic

Table 8
P-values of the pairwise statistical test based on all results