A Self-Adaptive Variant of CMSA: Application to the Minimum Positive Influence Dominating Set Problem

Construct, merge, solve and adapt (CMSA) is a recently developed, generic algorithm for combinatorial optimisation. Even though the usefulness of the algorithm has been demonstrated by applications to a range of combinatorial optimisation problems, in some applications it was observed that the algorithm can be sensitive to parameter settings. In this work, we propose a self-adaptive variant of CMSA, called Adapt-CMSA, with the aim of reducing the parameter sensitivity of the original version of CMSA. The advantages of this new CMSA variant are demonstrated in the context of the application to the so-called minimum positive influence dominating set problem. It is shown that, in contrast to CMSA, Adapt-CMSA does not require a computation-time-intensive parameter tuning process for subsets of the considered set of problem instances. In fact, after tuning Adapt-CMSA only once for the whole set of benchmark instances, the algorithm already obtains state-of-the-art results. Nevertheless, note that the main objective of this paper is not the tackled problem but the improvement of CMSA.


Introduction
Algorithms for solving combinatorial optimisation (CO) problems [1] generally fall into two different categories: (1) exact techniques provide optimal solutions to the tackled problems in bounded time, and (2) approximate techniques are designed to provide good-enough solutions to the tackled problems within rather low computation times. The first category of approaches includes algorithms such as dynamic programming and mathematical programming techniques, as, for example, branch and bound or branch and cut. Most cutting-edge mathematical programming techniques are implemented in commercial solvers such as CPLEX and Gurobi. These solvers exhibit a great performance, for example, for combinatorial optimisation problems that can be expressed in terms of integer linear programming (ILP) models. However, with growing problem instance size and/or difficulty, these solvers start to fail. For some problems, this happens already for rather small problem instances (consider, as an example, the well-known quadratic assignment problem (QAP)), while for other problems these solvers are able to solve surprisingly large instances to optimality in short computation times (consider, as an example, the minimum dominating set (MDS) problem). In those cases in which exact solvers fail, approximate techniques are applied instead. This category of algorithms includes simple greedy heuristics, but also more sophisticated metaheuristics [2,3]. Examples of the latter are tabu search, iterated local search, evolutionary algorithms, and ant colony optimisation. These algorithms often perform very well for instances of medium and even large size. However, in the context of large to very large instances, metaheuristics might get lost in the huge search spaces defined by these instances. For this reason, a popular trend in recent years concerns the hybridisation of exact techniques and metaheuristics [4][5][6]. The resulting hybrid algorithms often benefit from synergies between the exact and the approximate algorithm components of which they are composed. This has turned out to be beneficial, especially in the context of huge search spaces.

Background
One of the most popular hybrid techniques is large neighbourhood search (LNS) [7], respectively large-scale neighbourhood search [8]. LNS is a method based on local search. In other words, the algorithm generally works with one incumbent solution per iteration and tries to identify a better solution in a predefined neighbourhood of the incumbent solution. The difference to a standard local search technique is that the neighbourhoods considered in LNS are much larger. In fact, in some cases, it might even be an NP-hard problem itself to find an improving neighbour in such a large neighbourhood. The main difference between existing LNS approaches is the way in which the large neighbourhoods are generated. However, many LNS approaches are based on the principle of ruin-and-recreate [9], also sometimes called destroy-and-recreate or destroy-and-rebuild. In this type of LNS, the following is done at each iteration. First, the incumbent solution is partially destroyed, resulting in a partial solution. Second, a heuristic or an exact technique is used to search for an improving solution in the space of all feasible solutions that contain this partial solution. Examples of applications can be found in [10][11][12], just to name a few. Alternative ways of defining large neighbourhoods include local branching [13], the corridor method [14], and POPMUSIC [15].
A recent alternative to LNS is construct, merge, solve and adapt (CMSA) [16]. In principle, the idea of CMSA is similar to the one of LNS: at each iteration, the search space of a substantially reduced sub-instance of the tackled problem instance is searched for a better solution than the best solution found so far. The way, however, in which these reduced sub-instances are produced is conceptually very different. At each iteration, the algorithm probabilistically generates a set of solutions to the tackled problem instance. These solutions are then merged with an initially empty sub-instance, and an exact solver is applied to possibly find a best solution to the current sub-instance. Finally, the sub-instance is adapted based on this solution, and the algorithm proceeds with the subsequent iteration. CMSA has been successfully applied to a number of combinatorial optimisation problems. Some of the latest applications include the one to the maximum happy vertices problem [17], to route planning for cooperative air-ground robots [18], to refuelling and maintenance planning of nuclear power plants [19], and to the prioritised pairwise test data generation problem in software product lines [20].

Contribution
An overly high sensitivity to changes in parameter values is a recognised problem in research on metaheuristics [21]. A metaheuristic is generally said to be parameter sensitive if (1) the algorithm performance for specific instances or instance groups strongly depends on the parameter values and if (2) the required parameter values for different instances or instance groups differ considerably from each other. When algorithms are too sensitive to parameter settings, this is seen as a rather negative aspect in the research community. Unfortunately, such a high sensitivity to parameter values was noticed in some applications of CMSA in the literature. One of these examples concerns the preliminary application of CMSA to an NP-hard CO problem known as the minimum positive influence dominating set (MPIDS) problem [22]. Therefore, in this paper, we propose a self-adaptive variant of CMSA, called Adapt-CMSA, with the aim of obtaining an algorithm less sensitive to parameter values. As a test case, we use the above-mentioned MPIDS problem. The obtained results show that Adapt-CMSA has several advantages over standard CMSA in the context of the MPIDS problem. First, Adapt-CMSA does indeed not require specific parameter tuning for subsets of the considered benchmark set. After applying parameter tuning once, Adapt-CMSA works very well for the whole benchmark set containing instances of very different sizes. Second, Adapt-CMSA clearly outperforms standard CMSA in the context of large networks, for which even a specialised tuning does not enable CMSA to compete with Adapt-CMSA. We would expect a similar advantage of Adapt-CMSA over standard CMSA in most applications in which standard CMSA shows a high parameter sensitivity.

Paper Outline
The remainder of this paper is organised as follows. In Sect. 2, an introduction to standard CMSA is given first, before the new self-adaptive CMSA variant is presented. Subsequently, in Sect. 3, the application of both standard CMSA and Adapt-CMSA to the MPIDS problem is outlined. Finally, a comprehensive experimental evaluation is provided in Sect. 4, while conclusions and an outline of future work can be found in Sect. 5.


The CMSA Algorithm
In this section, we first describe the standard version of CMSA in the context of CO problems that can be modelled in terms of binary ILPs. Subsequently, we introduce the self-adaptive variant of CMSA, henceforth labelled Adapt-CMSA. For the description of both CMSA variants, we assume to be tackling a CO problem that can be modelled in terms of an ILP of the form: minimise f(x) subject to the problem-specific constraints, with x_i ∈ {0, 1} for i = 1, …, n. Hereby, f(·) is the objective function to be minimised, and x_1, …, x_n are the binary decision variables used to model the objective function and the constraints of the problem. Note that many NP-hard CO problems fall into this category of problems. Examples include the well-known travelling salesman problem (TSP) and the quadratic assignment problem (QAP), just to name two emblematic problems.
In the general case as described above, we introduce, for each variable x_i, a solution component c_i^0 corresponding to x_i = 0 and a solution component c_i^1 corresponding to x_i = 1. The set C := {c_1^0, c_1^1, …, c_n^0, c_n^1} is the complete set of 2n solution components. Any candidate solution s is a subset of C with |s| = n. In addition, it is required that s contains exactly one of the two components c_i^0 and c_i^1 for each i = 1, …, n. Finally, a candidate solution s is a valid solution if it fulfils all the constraints of the tackled problem.
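The following small snippet illustrates this representation. Writing a component c_i^v as the pair (i, v) is our own notation for illustration purposes, not taken from the paper; it is also the representation assumed in the CMSA sketch given later.

```python
def to_components(x):
    """Represent a binary solution x = (x_1, ..., x_n) as a set of solution
    components (i, value), i.e. c_i^0 becomes (i, 0) and c_i^1 becomes (i, 1)."""
    return {(i, v) for i, v in enumerate(x, start=1)}

print(to_components((1, 0, 1)))   # {(1, 1), (2, 0), (3, 1)} -- exactly n components
```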

Standard CMSA
Algorithm 1 provides the pseudo-code of standard CMSA for binary optimisation problems. Note that all functions in the pseudo-code are indicated with a special font as, for example, in GenerateGreedySolution(C). This function is used to initialise the best-so-far solution s_bsf with the solution generated by a greedy algorithm, as outlined in detail below. This is done at the start of the algorithm. Moreover, the sub-instance C′, which is solved by an ILP solver at each iteration, is initialised to s_bsf. Note that, alternatively, s_bsf might be initialised to NULL and C′ to the empty set. Each solution component c ∈ C maintains a so-called age value age[c]. These age values are all initialised to zero. Note that the purpose of the age value of a solution component c is to count the number of consecutive CMSA iterations for which c forms part of C′ without being included in the ILP solution to the reduced problem instance generated on the basis of C′. At each iteration, CMSA iterates through four algorithmic steps. In the construct step, n_a valid solutions to the tackled problem instance are probabilistically constructed in function ProbabilisticSolutionGeneration(C). In the merge step, those solution components that (1) are found in at least one of the constructed solutions from the construct step and (2) do currently not form part of C′ are added to C′, and their age value is set to zero. Next, the solve step first generates a reduced problem instance on the basis of C′. This is done by adding, for all i = 1, …, n, the following constraints to the original ILP model of the tackled problem: x_i = 0 in case c_i^0 ∈ C′ and c_i^1 ∉ C′, respectively x_i = 1 in case c_i^1 ∈ C′ and c_i^0 ∉ C′. Note that the more of these constraints are added to the original ILP, the smaller is the search space of the resulting sub-instance. Afterwards, the extended ILP is solved in function SolveSubinstance(C′, t_ILP), for example, by the application of an ILP solver with a CPU time limit of t_ILP seconds. Note that a variable x_i is only free in the extended ILP if both solution components c_i^0 and c_i^1 form part of C′. Note also that the output s′_opt of function SolveSubinstance(C′, t_ILP) is, due to the computation time limit, not necessarily an optimal solution to the extended ILP. In those cases in which f(s′_opt) < f(s_bsf), s′_opt is stored as the new best-so-far solution s_bsf. Finally, in the adapt step, sub-instance C′ is adapted in function Adapt(C′, s′_opt, age_max) depending both on s′_opt and on the age values of the solution components. This is done by increasing the age values of all components in C′ \ s′_opt by one and by re-initialising the age values of all components in s′_opt to zero. The final action in the adapt step consists in removing all those components from C′ whose age value has reached the maximum allowed age age_max. This is done in order to prevent components that never appear in s′_opt from slowing down the ILP solver in subsequent iterations.
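For readers who prefer code over prose, the following Python sketch condenses the four steps just described. It is only an illustration under the assumptions stated in the comments: solutions are represented as sets of (i, value) component pairs, and generate_greedy_solution, construct_solution and solve_subinstance are placeholders for problem-specific routines, not the paper's own implementation.

```python
import time

def cmsa(components, generate_greedy_solution, construct_solution,
         solve_subinstance, f, n_a, age_max, t_ilp, t_total):
    """Illustrative sketch of standard CMSA for binary problems.

    A solution is a set of components (i, v), meaning that variable x_i takes
    value v. solve_subinstance(c_prime, t_ilp) is assumed to run an ILP solver
    on the sub-instance induced by c_prime with a time limit of t_ilp seconds.
    """
    s_bsf = generate_greedy_solution(components)   # best-so-far solution
    c_prime = set(s_bsf)                           # sub-instance C'
    age = {c: 0 for c in components}               # age values, all zero

    start = time.time()
    while time.time() - start < t_total:
        # construct + merge: add the components of n_a probabilistic solutions to C'
        for _ in range(n_a):
            for c in construct_solution(components):
                if c not in c_prime:
                    c_prime.add(c)
                    age[c] = 0
        # solve: apply the ILP solver to the reduced instance
        s_opt = solve_subinstance(c_prime, t_ilp)
        if f(s_opt) < f(s_bsf):
            s_bsf = s_opt
        # adapt: age the unused components and drop those that reached age_max
        for c in list(c_prime):
            if c in s_opt:
                age[c] = 0
            else:
                age[c] += 1
                if age[c] >= age_max:
                    c_prime.remove(c)
    return s_bsf
```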

Self-Adaptive CMSA
The pseudo-code of self-adaptive CMSA (Adapt-CMSA) is provided in Algorithm 2. The first noticeable difference to standard CMSA is the absence of the age values. This is because Adapt-CMSA works with a fixed maximum age of one, that is, after each iteration all solution components apart from those that form part of the best-so-far solution s_bsf are removed from the sub-instance C′ (see line 23). Another difference can be seen in function ProbabilisticSolutionConstruction(C, s_bsf, α_bsf) for the probabilistic generation of solutions at each algorithm iteration (see line 8). Note that this latter function receives, apart from the set of all possible solution components (C), the current best-so-far solution s_bsf and a parameter α_bsf (where 0 ≤ α_bsf < 1) as input. This parameter biases the construction of new solutions towards the best-so-far solution s_bsf. More specifically, the higher the value of α_bsf, the higher will be the similarity of the solutions constructed in ProbabilisticSolutionConstruction(C, s_bsf, α_bsf) to s_bsf. The dynamic change of the value of α_bsf is one of the aspects that is handled in a self-adaptive way in Adapt-CMSA. First of all, Adapt-CMSA requires a lower bound α_LB and an upper bound α_UB for the value of α_bsf as input. Moreover, the step size α_red for the reduction of α_bsf must also be given as input. Adapt-CMSA starts by setting α_bsf to the highest possible value α_UB; see line 5. Remember that this means that solutions constructed in this way will be more similar to s_bsf than with lower values of α_bsf. In case the resulting ILP can be solved in a computation time t_solve that is below a proportion t_prop of the maximally possible computation time t_ILP, the value of α_bsf is reduced by α_red; see line 12. The rationale behind this step is the following one. In case the resulting ILP can be solved easily, the search space of the ILP is too small due to a rather low number of free variables. In order to have more free variables in the ILP, the solutions constructed in ProbabilisticSolutionConstruction(C, s_bsf, α_bsf) should be more different to s_bsf, which can be achieved by reducing the value of α_bsf.
The second aspect that is handled in a self-adaptive way in Adapt-CMSA is the number of solution constructions per iteration (n_a); see lines 13-22. The algorithm starts with a value of n_a = 1; see line 5. Moreover, in case the solution of the reduced ILP (s′_opt) improves over the best-so-far solution s_bsf, n_a is set back to one; see line 15. If, however, the solution of the reduced ILP (s′_opt) is strictly worse than the best-so-far solution s_bsf, the corresponding sub-instance was clearly too large and/or complex to be solved by the ILP solver within t_ILP seconds. In this case, if n_a = 1, the value of α_bsf is slightly increased (by α_red/10); otherwise, n_a is set back to one. In the remaining case (f(s′_opt) = f(s_bsf)), n_a is incremented by one; see line 20. This is done because the sub-instance did not contain a better solution than s_bsf. At the same time, the sub-instance was solved within the allowed computation time of t_ILP seconds, which means that the size of the sub-instance should be increased.
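To make the two self-adaptive mechanisms easier to follow, the snippet below condenses the update rules for α_bsf and n_a described in the two preceding paragraphs. It is a sketch rather than the paper's pseudo-code; in particular, clamping α_bsf to the interval [α_LB, α_UB] is an assumption made for illustration.

```python
def update_self_adaptive_parameters(f_opt, f_bsf, t_solve, t_ilp, t_prop,
                                    n_a, a_bsf, a_lb, a_ub, a_red):
    """Sketch of the Adapt-CMSA parameter update (a_bsf stands for alpha_bsf).

    Returns the updated pair (n_a, a_bsf).
    """
    # Sub-instance solved very quickly -> its search space was too small:
    # reduce the bias towards s_bsf so that more variables become free.
    if t_solve < t_prop * t_ilp:
        a_bsf = max(a_lb, a_bsf - a_red)

    if f_opt < f_bsf:
        n_a = 1                              # improvement: reset n_a
    elif f_opt > f_bsf:
        # sub-instance too large/complex for the ILP solver within t_ilp
        if n_a == 1:
            a_bsf = min(a_ub, a_bsf + a_red / 10.0)
        else:
            n_a = 1
    else:
        n_a += 1                             # equal quality: enlarge the sub-instance
    return n_a, a_bsf
```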
Finally, note that function SolveSubinstance(C′, t_ILP) is exactly the same in both versions of CMSA (standard CMSA and Adapt-CMSA).

Application to the MPIDS Problem
The only problem-dependent part of both standard CMSA and Adapt-CMSA is the construction of feasible solutions. For the purpose of describing the solution construction procedures, we first need to introduce the tackled problem. As mentioned in the introduction, both standard CMSA and Adapt-CMSA are applied to an NP-hard combinatorial optimisation problem known as the minimum positive influence dominating set (MPIDS) problem. This problem is known for its applications in the context of social networks. Imagine that the nodes and edges in such a social network represent individuals (persons) and relationships/interactions between those individuals, respectively. In general, information propagated in social networks has the potential to have a significant impact, which might be either positive or negative, on (parts of) society. As social norms theory shows that the behaviour of individuals can be affected by the perception of others' thoughts and behaviours [23], relationships among people in social networks may be exploited in order to obtain economic and/or societal benefits. In this sense, the aim of the MPIDS problem is to identify a small subset of influential individuals (or key individuals) for speeding up the spread of positive influence in a social network [24,25]. Alternative applications of the MPIDS problem can be found in e-learning software [26], in online business [27], and in the study of drinking, smoking, and other drug-related problems [28].
In the following, the MPIDS problem is described in a technical way. Let G = (V, E) be an undirected graph without loops and without parallel edges. Any subset S ⊆ V that fulfils the following condition is a valid solution to the problem: for each vertex v ∈ V, at least half of its neighbours (rounded up, that is, at least ⌈deg(v)/2⌉ of them) must form part of S. Note that, if G is connected, any valid solution S is also a dominating set of G. The MPIDS problem aims at finding a valid solution S* ⊆ V of minimum size. In other words, given a valid solution S ⊆ V, the objective function value of S is f(S) := |S|. Note that S := V is a trivial solution to the problem. The MPIDS problem is NP-hard.
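As a concrete illustration of this validity condition, the following minimal Python function checks whether a given vertex set S is a positive influence dominating set; the function and variable names are ours, chosen for illustration only.

```python
from math import ceil

def is_valid_pids(adjacency, S):
    """Return True iff S is a positive influence dominating set.

    `adjacency` maps every vertex to the set of its neighbours; every vertex
    of the graph (also those inside S) must have at least half of its
    neighbours, rounded up, in S.
    """
    return all(len(adjacency[v] & S) >= ceil(len(adjacency[v]) / 2)
               for v in adjacency)

# Small example: a path a-b-c. Both a and c need their only neighbour b in S,
# and b needs at least one of {a, c} in S, so an optimal solution has size 2.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(is_valid_pids(path, {"b"}))         # False
print(is_valid_pids(path, {"a", "b"}))    # True
```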
From an algorithmic point of view, the efforts of the research community initially focussed on the development of well-working greedy heuristics [29][30][31][32][33][34]. In fact, until 2021, the best available approach was our own greedy method from [34]. The development of successful metaheuristic approaches seemed much harder. This is shown by the results of the first two metaheuristics, an ILP-based memetic algorithm [35] and a swarm intelligence based algorithm [36], whose results are inferior to the greedy approach from [34]. The first metaheuristic that was able to improve over [34] is the iterated carousel approach from [37]. Finally, the currently best metaheuristics are our own approaches: a negative learning ant colony optimisation approach from [38] and the preliminary standard CMSA approach from [22]. Both approaches perform on a comparable level.
Note that, for the application of CMSA and Adapt-CMSA, we make use of the following ILP model, which is well known from the related literature. The model is based on a binary variable x_i for each vertex v_i ∈ V:

minimise   Σ_{v_i ∈ V} x_i                                          (1)
subject to Σ_{v_j ∈ N(v_i)} x_j ≥ ⌈deg(v_i)/2⌉   for all v_i ∈ V     (2)
           x_i ∈ {0, 1}                          for all v_i ∈ V

Hereby, N(v_i) denotes the neighbourhood of v_i in G. The objective function (1) minimises the number of selected vertices, while constraints (2) force any feasible solution to contain at least half of the neighbours of each vertex v_i ∈ V.
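The paper solves this model with CPLEX; as a self-contained illustration of the same model, the following sketch builds it with the open-source PuLP modeller and its default CBC solver. The function and variable names are ours, not taken from the paper.

```python
import math
import pulp

def solve_mpids_ilp(adjacency, time_limit=None):
    """Build and solve the MPIDS ILP (1)-(2) for a graph given as an
    adjacency mapping from vertices to neighbour sets (illustrative sketch)."""
    prob = pulp.LpProblem("MPIDS", pulp.LpMinimize)
    x = {v: pulp.LpVariable(f"x_{v}", cat="Binary") for v in adjacency}
    # Objective (1): minimise the number of selected vertices.
    prob += pulp.lpSum(x.values())
    # Constraints (2): at least half of the neighbours of every vertex.
    for v, neighbours in adjacency.items():
        prob += pulp.lpSum(x[u] for u in neighbours) >= math.ceil(len(neighbours) / 2)
    prob.solve(pulp.PULP_CBC_CMD(msg=False, timeLimit=time_limit))
    return {v for v, var in x.items() if var.value() > 0.5}

# Reusing the small path example from above:
print(solve_mpids_ilp({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}))  # e.g. {'a', 'b'}
```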

Solution Construction in CMSA and Adapt-CMSA
In the following, we outline the remaining aspect of CMSA and Adapt-CMSA: the construction of solutions in function ProbabilisticSolutionConstruction(C, l_size, d_rate) in the case of CMSA, respectively in function ProbabilisticSolutionConstruction(C, s_bsf, α_bsf) in the case of Adapt-CMSA. Both functions make use of the solution construction mechanism of the greedy procedure from [34]. They only differ in the way in which this procedure is made probabilistic. For the following discussion, remember that a vertex v ∈ V is called covered with respect to a (partial) solution s if and only if at least half of its neighbours form part of s. In the opposite case, v is labelled uncovered.
The solution construction mechanism utilised by both functions is shown in Algorithm 3. First, each solution s to be constructed is initialised with a set s_par ⊂ V of nodes that must form part of an optimal solution; see line 3. Note that s_par is obtained by the application of a pre-processing procedure described in [34]. Then, at each step of the solution construction mechanism, the following is done. First, the set of all uncovered vertices with respect to (partial) solution s is determined; see line 5. Then, an uncovered vertex v is selected from this set, and one vertex from N(v) \ s is chosen in function ChooseFrom(N(v) \ s) and added to s; see lines 7-10. Hereby, exactly one vertex is added to s at each entry of the while loop. This process is repeated until no uncovered vertices remain.
In standard CMSA, function ChooseFrom(N(v) \ s) is implemented as follows. At first, a candidate list L is created. This list includes all vertices v′ ∈ N(v) \ s. Each vertex v′ in L is characterised by its cover degree cov_deg(v′), which is the number of uncovered vertices adjacent to v′. The vertices in L are sorted according to non-increasing cover degree values. Then, a uniform random number r is generated from the interval [0, 1]. If r ≤ d_rate (where d_rate is the so-called determinism rate), the vertex with the highest cover degree is selected and added to s. Otherwise, a vertex is selected uniformly at random from the restricted candidate list, which contains the first l_size vertices of L. Hereby, l_size is the size of the restricted candidate list, and all vertices in the restricted candidate list have an equal probability of being selected. In Adapt-CMSA, function ChooseFrom(N(v) \ s) additionally makes use of the best-so-far solution s_bsf: the choice is biased towards vertices that form part of s_bsf, controlled by parameter α_bsf. In other words, the higher the value of parameter α_bsf ∈ [0, 1], the stronger is the bias towards the best-so-far solution s_bsf. This bias does not exist in standard CMSA.
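The following sketch illustrates the candidate-list mechanism of ChooseFrom as used by standard CMSA, with the determinism rate d_rate and the candidate list size l_size described above. It is only an illustration; the cover degrees are assumed to be supplied by the caller.

```python
import random

def choose_from(candidates, cover_degree, d_rate, l_size):
    """Pick one vertex from N(v) \\ s as in standard CMSA (sketch).

    `candidates` are the vertices of N(v) not yet in s, and cover_degree[u]
    is the number of uncovered neighbours of u.
    """
    # Candidate list L, sorted by non-increasing cover degree.
    L = sorted(candidates, key=lambda u: cover_degree[u], reverse=True)
    if random.random() <= d_rate:
        return L[0]                    # deterministic step: best cover degree
    # Otherwise choose uniformly from the restricted candidate list.
    return random.choice(L[:l_size])
```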

Experimental Evaluation
All experiments reported in the following were performed on a cluster of machines with Intel Xeon 5670 CPUs with 12 cores of 2.933 GHz and a minimum of 32 GB RAM. Note that CPLEX version 20.1 was used in one-threaded mode, both in a standalone manner and within CMSA and Adapt-CMSA for solving the respective sub-instances. Two sets of experiments were performed. A comprehensive experimentation in the context of a new set of 800 scale-free networks is described in Sect. 4.1, while the application to social networks from the literature is described in Sect. 4.2.

Experiments Regarding Scale-Free Networks
In order to be able to compare CMSA and Adapt-CMSA on a controlled set of benchmark instances with different features, we generated the following set of 800 scale-free networks using the igraph software package [39]. An undirected network is said to be scale-free, or, equivalently, to follow a power-law degree distribution, if the statistical distribution of the degrees of its nodes is of the form P(k) ∝ k^(−γ), where P(k) is the probability that a given node has exactly k neighbours. Specifically, we used the random power-law graph generator function called igraph_static_power_law_game to generate networks differing in the following parameters:

• Number of nodes: |V| ∈ {1.000, 10.000, 50.000, 100.000, 250.000, 500.000, 750.000, 1.000.000}.
• Edge factor l ∈ {5, 10, 20, 30}, where l ⋅ |V| is the number of edges.
• Exponent γ of the power-law degree distribution: γ ∈ {2, 2.25, 2.5, 2.75, 3}. Note that parameter γ establishes the pace at which the probability of having highly connected nodes decreases.
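For illustration, the following sketch generates one such network with the python-igraph binding of the generator named above (Graph.Static_Power_Law wraps igraph_static_power_law_game). Apart from the options explicitly stated in the text, the settings shown are assumptions made for the example.

```python
import igraph as ig

def generate_scale_free_network(n_nodes, l, gamma):
    """Generate one undirected scale-free network with n_nodes vertices,
    l * n_nodes edges and power-law exponent gamma (illustrative sketch)."""
    return ig.Graph.Static_Power_Law(
        n=n_nodes,
        m=l * n_nodes,                # number of edges
        exponent_out=gamma,           # degree-distribution exponent
        loops=False,                  # no self-loops
        multiple=False,               # no parallel edges
        finite_size_correction=True,  # correction mechanism mentioned in the text
    )

# One of the smallest configurations of the benchmark set:
g = generate_scale_free_network(1000, 5, 2.5)
print(g.vcount(), g.ecount())         # 1000 5000
```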
Note that five networks were generated for each combination of the number of nodes, the number of edges, and the power-law exponent. This makes a total of 800 networks. None of the generated networks has self-loops or multiple edges between a pair of nodes. Note that, following the suggestion in the igraph package, the finite_size_correction mechanism [40] was used for the generation of the networks. Our reason for choosing power-law, scale-free networks for the comparison between CMSA and Adapt-CMSA is that they are generally accepted models for social networks [41,42].

For the purpose of parameter tuning, we separately generated one graph for each combination of |V| ∈ {50.000, 100.000, 500.000, 1.000.000}, l ∈ {5, 30} and γ ∈ {2, 3}. That is, 16 graphs were used for parameter tuning purposes. In particular, we used the scientific tuning software irace [43] for fine-tuning the parameters of CMSA and Adapt-CMSA. The parameters of CMSA, together with the domains allowed for tuning, were provided to irace. Note that, in the case of numerical parameters, the precision of irace was fixed to two positions behind the comma. Both for the tuning of CMSA and of Adapt-CMSA, irace was applied with a budget of 3.000 algorithm applications. The time limit for each problem instance was set to |V|/100 CPU seconds. The outcome of the tuning runs can be summarised as follows:

• CMSA parameter values: n_a = 1, age_max = 1, t_ILP = 38, d_rate = 0.53, l_size = 2.

Note that both parameter settings indicate that we are dealing with very large graphs. In the case of CMSA, for example, n_a = 1 and age_max = 1 are set in this restrictive way because, otherwise, the size of the sub-instances would be too large to be solved by CPLEX. Similarly, the parameters of Adapt-CMSA are characterised by a strong bias towards the best-so-far solution in order to keep the sub-instances as small as possible, while still being able to find improving solutions.
With the final parameter settings as provided above, both CMSA and Adapt-CMSA were applied exactly once to each of the 800 problem instances. The computation time limit was the same as the one chosen for tuning, that is, |V|/100 CPU seconds. The results are shown in a summarised way in Fig. 1. The graphic is composed of 8 × 4 = 32 sub-graphics, one for each combination of |V| (rows) and l (columns). Each sub-graphic shows, for all five values of γ, the average improvement of Adapt-CMSA over CMSA (in percent). Note that those cases in which Adapt-CMSA improves over CMSA are additionally marked by bars in blue colour, while bars in red colour indicate the cases in which CMSA is better than Adapt-CMSA.
The following observations can be made. First, for smaller graphs (up to 50.000 nodes), not much difference between the two algorithms can be observed. However, starting from 100.000 nodes, Adapt-CMSA clearly outperforms CMSA. This holds especially with a growing number of nodes and a growing number of edges. Interestingly, for the smallest values of γ, that is, for γ ∈ {2, 2.25} in the case of l = 5, respectively for γ = 2 in the case of the remaining values of l, CMSA often seems to have a slight advantage over Adapt-CMSA. In other words, the advantage of Adapt-CMSA over CMSA is higher for graphs with fewer high-degree nodes. Nevertheless, these results provide a strong indication of the general superiority of Adapt-CMSA over CMSA.

Experiments Regarding Instances from the Literature
In our second set of experiments, we compare CMSA and Adapt-CMSA to the standalone application of CPLEX and to the best metaheuristic from the literature, ICG [37]. This is done in the context of 17 social networks that are partially used in the related literature on the MPIDS problem. These networks are of small and medium size, containing between 34 and 36.692 nodes and between 788 and 198.050 edges. In addition, CPLEX and our CMSA variants were applied to 10 larger social networks from the SNAP library (https://snap.stanford.edu/data/) that contain between 37.700 and 1.134.890 nodes and between 2.289.003 and 3.387.388 edges. While CPLEX was applied exactly once to each of these 27 problem instances, both CMSA and Adapt-CMSA were applied 10 times to each instance. A computation time limit of 2 h was given to each CPLEX run. In contrast, much less time was given to the CMSA variants. In the case of the 17 small/medium size problem instances, we allowed a computation time of |V|/10 CPU seconds for each run, while a relatively shorter computation time of |V|/100 CPU seconds was allowed for the application to the large instances from the SNAP library. The main reason for this difference is that |V|/100 s would have been a very short computation time for most of the small and medium size instances.
In a first experiment, we applied both CMSA and Adapt-CMSA with the parameter values from the previous section to all 27 instances. The obtained results are shown in numerical form in Table 1 (small/medium size instances) and Table 2 (large instances). These tables have the following structure. The first column contains the instance name, and the second column provides information about the quality of the best solutions known to date. Columns with heading 'q' report the quality of the best solutions found by the four approaches, and columns with heading 'avg' provide the respective average solution quality. Furthermore, columns with heading 't(s)' indicate the average computation times of CMSA and Adapt-CMSA for finding the best solutions of each run. Note that information about average computation times was not provided in [37] for ICG. The authors, however, state that they chose a computation time limit of |V| ⋅ 30/1000 seconds, because this assured convergence of their algorithm in the case of all considered problem instances. In other words, ICG would not profit from a higher computation time limit. Finally, the gap (in percent) between the solution obtained by CPLEX and the best lower bound is indicated in the column with heading 'gap(%)'. Note that when the gap is zero, CPLEX was able to prove optimality. The best result for each instance is shown in bold font. Furthermore, in case the best solution known so far was improved, the respective result is underlined. Finally, in those cases in which none of the algorithms was able to reach the currently best-known solution, we provide at the bottom of the table an indication of the algorithm that obtained the respective best-known solution; in Table 1, these best-known results were obtained by [22] (CA-AstroPh) and by [38] (socfb-Brandeis99 and socfb-Mich67).
The following observations can be made. First, CPLEX performs strongly for small and medium size instances. Apart from instance CA-AstroPh, CPLEX obtains all best-known solutions. Only in six out of 17 cases is CPLEX not able to prove the optimality of these results. The performance of Adapt-CMSA is very similar to the one of CPLEX. In one case (instance CA-AstroPh), Adapt-CMSA outperforms CPLEX both in terms of best performance and in terms of average performance. On the downside, in four other cases (instances CA-CondMat, actors-data, socfb-Brandeis99 and socfb-Mich67), the results of Adapt-CMSA fall slightly short of those of CPLEX. Moreover, Adapt-CMSA clearly outperforms CMSA, which (with the parameter setting for scale-free networks) only matches the results of Adapt-CMSA for six out of 17 problem instances. Finally, note that both CMSA and Adapt-CMSA clearly outperform the most recent metaheuristic from the related literature (ICG).

Concerning the large instances from the SNAP library (see Table 2), we can state that the standalone application of CPLEX clearly starts to fail with growing problem instance size. In fact, in five out of 10 cases (the ones with an optimality gap of more than 95%), CPLEX is only able to provide the trivial solution that simply contains all network nodes. In addition, it can also be observed that the standard CMSA approach fails for instances Amazon0312, Amazon0505 and Amazon0601. In these three cases, standard CMSA is not able to improve over the initial solutions provided by the greedy approach. Adapt-CMSA, on the other hand, also works very well for these large-size SNAP networks. In fact, Adapt-CMSA is able to obtain new best-known solutions in five out of 10 cases. Moreover, in those five cases in which CPLEX still works fine, the results of Adapt-CMSA are only slightly worse than those of CPLEX. Therefore, a first conclusion of this work is that Adapt-CMSA is a CMSA variant that does not need to be specifically tuned for subsets of the considered benchmark set. It shows a high performance over the whole range of benchmark instances with one single parameter value set.
In a second experiment, we aimed at studying the change in performance of standard CMSA when specifically tuned for small and medium size problem instances on the one hand, and for large SNAP networks on the other hand. Again, we used irace for the purpose of parameter tuning. For small and medium size instances, the budget given to irace consisted of 1000 algorithm applications, and instances CA-AstroPh and socfb-Brandeis99 were used for tuning. The same budget was used for the tuning run concerning large SNAP networks. In this case, instances Amazon0505 and Amazon0601 were used for tuning. The outcome of these two tuning experiments was the following one:

• Small/medium size instances: n_a = 1, age_max = 3, t_ILP = 16, d_rate = 0.09, l_size = 8.
• Large SNAP networks: n_a = 1, age_max = 1, t_ILP = 39, d_rate = 0.95, l_size = 10.
Clearly, the parameter settings for small and medium size instances result in much larger sub-instance sizes than those for large SNAP networks. With these new parameter settings we repeated the experiments of CMSA. The results, in comparison to the CMSA results obtained with the previous parameter values, are shown in Tables 3 and 4.
The new CMSA results improve substantially in the case of small and medium size problem instances (Table 3). In fact, CMSA is now able to generate the best-known solutions for 14 out of 17 problem instances. In one case (instance actors-data), CMSA is even able to generate a new best-known solution of value 3091. With the specialised parameter setting, CMSA is now even able to perform slightly better, on average, than Adapt-CMSA (compare with Table 1). The results for large SNAP instances, however, show that specialised tuning does not help in this case. Even though the results of CMSA improve over the original ones from Table 2 in the case of those problem instances that were used for tuning (Amazon0505 and Amazon0601), they become worse for seven out of 10 problem instances. Moreover, even in those cases in which CMSA is able to improve with a specialised parameter setting, the results are still clearly inferior to those of Adapt-CMSA.

Conclusions and Outlook to Future Work
Construct, merge, solve and adapt (CMSA) is a recent matheuristic for combinatorial optimisation problems. At each iteration, the algorithm solves a suitably defined sub-instance of the original problem instance by means of an exact solver such as, for example, an integer linear programming solver. One of the occasional disadvantages of CMSA is the need for repeated parameter tuning for subsets of the considered benchmark set. For dealing with this problem, we proposed in this work a self-adaptive variant of CMSA, called Adapt-CMSA, that adjusts its parameters on the fly in order to be able to solve problem instances of very different sizes without the need for re-tuning. Experiments were performed in the context of the minimum positive influence dominating set (MPIDS) problem.
Based on the obtained results, we can say that Adapt-CMSA has several advantages over standard CMSA in the context of the MPIDS problem. First, Adapt-CMSA does not need to be specifically tuned for subsets of the considered benchmark set. After one single tuning run, Adapt-CMSA works very well for the whole benchmark set, which contains instances of very different sizes. Second, Adapt-CMSA clearly outperforms standard CMSA in the context of large networks for which even a specialised tuning does not enable CMSA to compete with Adapt-CMSA.
In future work, we aim at confirming the findings of this paper in the context of other hard combinatorial optimisation problems. We believe that making CMSA self-adaptive is a significant step towards improving the wide applicability of this approach.