1 Introduction

Ant colony optimization (\(\textsc {Aco}\)) is a metaheuristic optimization technique [1] that was inspired by the foraging behaviour of ant colonies in nature. In technical terms, \(\textsc {Aco}\) algorithms are based on the repeated, step-by-step construction of solutions to the tackled optimization problem. For this purpose, \(\textsc {Aco}\) algorithms make use of a greedy heuristic in a probabilistic way. At each construction step, the probabilities of all options for extending the current partial solution are calculated based on the corresponding greedy function values and on so-called pheromone information. Hereby, pheromone information is implemented by means of a pheromone model that consists of a set of pheromone trail parameters together with their values. In most \(\textsc {Aco}\) algorithms there is a one-to-one correspondence between so-called solution components and pheromone trail parameters. In the case of the traveling salesman problem, for example, each edge of the input graph is a solution component and there is exactly one pheromone trail parameter assigned to each solution component. At each iteration, \(\textsc {Aco}\) algorithms perform two main actions. First, a certain number of solutions to the tackled optimization problem are generated based both on greedy information and on pheromone information. Then, high-quality solutions from the current iteration and/or from earlier iterations are used for updating the values of the pheromone trail parameters (henceforth simply called pheromone values). This is done to reward solution components that form part of high-quality solutions, with the aim of producing—over time—better and better solutions. Hence, it can be said that \(\textsc {Aco}\) incorporates positive learning. Note that the solution construction mechanism, together with the pheromone values, defines a probability distribution over the search space which is sampled at each iteration of the algorithm. The pheromone value update then changes this probability distribution, presumably towards an area of the search space that contains even better solutions than the ones encountered so far.

Since its first introduction in 1991 [2], \(\textsc {Aco}\) has evolved into several improved variants [3,4,5,6,7], mainly through a better exploitation of the positive learning mechanism. Nevertheless, nature provides numerous examples which show that negative learning mechanisms are also an integral part of the communication and coordination systems of many social insect colonies [8,9,10,11,12]. Moreover, examples of negative learning in species and inter-species evolution, animal swarm behaviour, and human evolution have already been adopted in metaheuristic techniques such as evolutionary algorithms [13, 14], extremal optimization [15,16,17,18,19], particle swarm optimization [20,21,22,23], and opposition-based learning algorithms [24, 25]. The \(\textsc {Aco}\) research community has also identified this potential and produced some relevant works. Most of them, however, achieved only limited success.

Our first contribution [26] to the development of negative learning \(\textsc {Aco}\) variants was characterized by adding two features to the baseline \(\textsc {Aco}\) algorithm: (1) the use of an additional algorithmic component that provides negative feedback to the main \(\textsc {Aco}\) algorithm, and (2) the construction of a sub-instance for the additional algorithmic component to work on. We first applied negative learning \(\textsc {Aco}\) variants to the Capacitated Minimum Dominating Set (CapMDS) problem and used IBM ILOG \(\textsc {Cplex}\) as the first additional algorithmic component. The empirical results showed that our negative learning \(\textsc {Aco}\) variants significantly outperformed the baseline \(\textsc {Aco}\). Moreover, they proved to be competitive with the respective state-of-the-art [27]. We then showed that these approaches are also effective for other relevant combinatorial optimization problems [28,29,30]. Moreover, in [28], we developed negative learning \(\textsc {Aco}\) variants that incorporate a \(\mathcal{MAX}\)-\(\mathcal{MIN}\) Ant System (\(\mathcal{MM}AS\)) [6] as an additional algorithmic component. These variants outperformed the baseline \(\textsc {Aco}\), proving that our negative learning proposal is effective for algorithmic components other than \(\textsc {Cplex}\).

In this paper, we apply the negative learning approach to the Maximum Satisfiability problem (MaxSAT) [31, 32], which differs substantially from the problems considered to date. Given a multiset of clauses, where each clause is a disjunction of Boolean literals, the goal of MaxSAT is to find a truth assignment that maximizes the number of satisfied clauses or, equivalently, that minimizes the number of unsatisfied clauses.

There exist only a few \(\textsc {Aco}\) approaches (e.g. [33,34,35,36]) for solving MaxSAT, despite its practical relevance. The community working on satisfiability testing, for example, has solved challenging optimization problems by first encoding them as MaxSAT instances and then solving the resulting encodings with a MaxSAT solver. Nowadays, MaxSAT offers a competitive generic problem-solving formalism for combinatorial optimization. For example, MaxSAT has been applied to solve optimization problems in domains as diverse as bioinformatics [37, 38], combinatorial testing [39], community detection in complex networks [40], diagnosis [41], planning [42], scheduling [43] and team formation [44]. Moreover, the MaxSAT community has held an annual MaxSAT Evaluation (MSE) since 2006 [45, 46]. This event has promoted the implementation of highly optimized MaxSAT solvers and the creation of a wide collection of MaxSAT instances from different domains. Thus, MaxSAT is a suitable test problem to validate the general applicability of the negative learning \(\textsc {Aco}\) approach in an extremely competitive scenario.

1.1 Related Work

Attempts to incorporate negative learning into \(\textsc {Aco}\) started almost a decade after \(\textsc {Aco}\)’s creation in 1991. Maniezzo [47] in 1999 and Cordon et al. [48] in 2000 pioneered the incorporation of negative learning in \(\textsc {Aco}\) by reducing the pheromone values of those components that form part of low-quality solutions. Then, Montgomery and Randall [49] proposed three variants of negative learning \(\textsc {Aco}\), each one adopting different strategies for the identification, storage, and use of the negative feedback information. The first variant identifies bad solutions by simply choosing the worst solution found in each \(\textsc {Aco}\) iteration. The second variant has a function that shifts its preference from searching for good solutions at the beginning of an iteration to searching for bad solutions at the end of the iteration. The negative feedback information found by this variant is stored in an additional pheromone matrix. The third variant has a dedicated function to search for bad solutions, in addition to the standard function that searches for good solutions. The negative feedback information, however, is stored in the same pheromone matrix that stores the positive feedback information. The techniques described in some of these variants were also partially used in the negative learning \(\textsc {Aco}\) applications developed by other researchers such as Simon and Smith [50], Ye et al. [51], Masukane and Mizuno [52], and Rojas-Morales et al. [53]. In [28], we re-implemented negative learning \(\textsc {Aco}\) variants that incorporate the techniques proposed by Montgomery and Randall as well as the one proposed by Ramos et al. [54]. We compared these variants with our own negative learning approach on a large number of Minimum Dominating Set (MDS) and Multi-Dimensional Knapsack Problem (MDKP) instances. The empirical results showed that our negative learning variants significantly outperform the above-mentioned variants.

Negative learning \(\textsc {Aco}\) has not been used so far to solve MaxSAT. Despite being a state-of-the-art metaheuristic, \(\textsc {Aco}\) itself has been applied just a few times to MaxSAT. The first \(\textsc {Aco}\) application was due to Drias and Ibri [34], who used a variant of \(\textsc {Aco}\), known as Ant Colony System (ACS) [5], to solve weighted MaxSAT. This algorithm works by generating an initial solution to which a number of successive variable flips are applied. Drias and Ibri also parallelized their sequential ACS technique using synchronous and asynchronous methods, but the empirical results showed that their algorithm did not outperform existing approaches.

Another \(\textsc {Aco}\) implementation for MaxSAT was due to Pinto et al. [35]. They used an ACS variant to solve two unweighted and three weighted types of static and dynamic MaxSAT instances. Their implementation works by constructing the solutions in two phases: (1) variable selection, which is done randomly, and (2) value selection, which is based on a heuristic and on pheromone values. This \(\textsc {Aco}\) variant outperforms the baseline local search algorithm [55, 56]. The authors, however, admitted that \(Walk\textsc {SAT}\) [57, 58] and other native MaxSAT solvers still posed a substantial challenge for their proposal.

Villagra and Barán [36] developed Max-Min-SAT, a version of \(\textsc {Aco}\) specifically designed to solve MaxSAT. This algorithm borrowed the adaptive fitness function [59] from genetic algorithms and comes in three variants: (1) \(\textsc {Aco}_{\textsc {Saw}}\), which uses the step-wise adaptation of weights, (2) \(\textsc {Aco}_{\textsc {Rf}}\), which implements refining functions, and (3) \(\textsc {Aco}_{\textsc {Rfsaw}}\), which employs both the step-wise adaptation of weights and refining functions. An empirical comparison on the basis of 50 random Max-3SAT instances showed that Villagra and Barán's approach did not outperform the \(Walk\textsc {SAT}\) MaxSAT solver.

1.2 Contribution and General Idea

This paper aims to prove the general applicability of our negative learning \(\textsc {Aco}\) framework by (1) applying the approach to an optimization problem whose characteristics differ from those of the problems considered to date, and (2) exploring other options for the additional algorithmic component that provides negative feedback to the main \(\textsc {Aco}\) algorithm. As mentioned above, there are only a few implementations of \(\textsc {Aco}\) for MaxSAT. Therefore, this work makes significant contributions not only to our negative learning \(\textsc {Aco}\) framework but also to the use of \(\textsc {Aco}\) for MaxSAT solving.

As mentioned before, in earlier work we showed that our negative learning \(\textsc {Aco}\) approach works very well with \(\textsc {Cplex}\) and \(\mathcal{MM}AS\) as additional algorithmic components that provide negative feedback to \(\textsc {Aco}\) [26, 28,29,30]. In this paper, we go a step further and consider two local search MaxSAT solvers, \({\textsc {SAT}Like}\)-c [60] and \(\textsc {SlsMcs}\) [61], as additional algorithmic components. The empirical results show that our approach also performs very well with these new algorithmic components. Moreover, the experimental investigation shows that all our negative learning \(\textsc {Aco}\) variants significantly outperform the baseline \(\textsc {Aco}\), \(\textsc {Cplex}\), and the MaxSAT solvers. Therefore, the obtained results provide stronger evidence of the general applicability and the effectiveness of our negative learning \(\textsc {Aco}\) approach. In particular, the MaxSAT community might find it interesting that our approach can be seen as a general framework for improving existing MaxSAT solvers. Moreover, considering our findings, we believe that this algorithmic framework may also prove useful for other combinatorial optimization approaches.

2 The Maximum Satisfiability Problem

MaxSAT is an NP-hard optimization problem which can be stated as follows. We are given a set of n Boolean variables \(X=\{x_1, x_2, \ldots , x_n\}\). A clause is a disjunction of literals, where each literal is either a variable \(x_i\) (a positive literal) or its negation \(\bar{x_i}\) (a negative literal). Each variable \(x_i\) can take the truth value 0 for \(\textsc {false}\) or 1 for \(\textsc {true}\). A Conjunctive Normal Form (CNF) formula \(\phi\) is a conjunction of a set of m clauses \(C=\{c_1, c_2, \ldots , c_m\}\). A valid solution S to a MaxSAT problem is a complete truth assignment to all variables in X. The optimization objective of MaxSAT is to satisfy as many clauses of \(\phi\) as possible.
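To make the objective concrete, the following minimal Python sketch (our own illustration, not taken from any solver cited in this paper) counts the clauses satisfied by a complete truth assignment; literals are encoded as signed integers in the usual DIMACS style:

```python
def num_satisfied(clauses, assignment):
    """Count satisfied clauses. `clauses` is a list of clauses, each a list of
    signed integers in DIMACS style: literal 3 means x_3, -3 means its negation.
    `assignment` maps a variable index to its truth value (0 or 1)."""
    satisfied = 0
    for clause in clauses:
        for lit in clause:
            value = assignment[abs(lit)]
            if (lit > 0 and value == 1) or (lit < 0 and value == 0):
                satisfied += 1
                break  # one satisfied literal suffices for the clause
    return satisfied

# Example: (x1 v -x2) ^ (-x1 v x2) ^ (-x1 v -x2)
clauses = [[1, -2], [-1, 2], [-1, -2]]
print(num_satisfied(clauses, {1: 0, 2: 0}))  # prints 3: all clauses satisfied
```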

Weighted MaxSAT is a variant of MaxSAT in which each clause has an associated positive weight and its optimization objective is to maximize the sum of weights of the satisfied clauses.

A standard integer linear programming model for Weighted MaxSAT can be stated as follows [32]:

$$\begin{aligned} \text{max} \;\; \sum _{z_j \in Z} w_j \cdot z_j \end{aligned}$$
(1)

subject to the constraints:

$$\begin{aligned} \sum _{i \in I_{j}^+} x_i + \sum _{i \in I_{j}^-} (1 - x_i) \ge z_j&\quad \forall z_j \in Z \end{aligned}$$
(2)
$$\begin{aligned} z_j \in \{0, 1\}&\quad \forall z_j \in Z \end{aligned}$$
(3)
$$\begin{aligned} x_i \in \{0, 1\}&\quad \forall x_i \in X. \end{aligned}$$
(4)

The model consists of a set Z of binary variables \(z_1, z_2, \ldots , z_m\), one for each clause in C. Variable \(z_j\) takes value 1 if clause \(c_j\) is satisfied; otherwise it takes value 0. The sets \(I_{j}^+\) and \(I_{j}^-\) contain the indices of the variables that appear as positive, respectively negative, literals in clause \(c_j\). Parameter \(w_j\) represents the weight of clause \(c_j\). We implemented all the approaches in this paper for unweighted MaxSAT; therefore, all clauses have weight 1. The objective function (1) counts the (weighted) number of satisfied clauses, and constraint (2) ensures that a clause can only be counted as satisfied if at least one of its literals is satisfied.
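For illustration only, the model (1)–(4) can be stated directly with an off-the-shelf ILP modelling layer. The following sketch uses the open-source PuLP package with its default CBC backend; this choice is our own assumption for demonstration purposes, as the experiments in this paper rely on \(\textsc {Cplex}\):

```python
import pulp  # assumes `pip install pulp`; the paper itself uses CPLEX

# Clauses in DIMACS style, unit weights (unweighted MaxSAT).
clauses = [[1, -2], [-1, 2], [-1, -2]]
variables = sorted({abs(l) for c in clauses for l in c})

model = pulp.LpProblem("maxsat_ilp", pulp.LpMaximize)
x = {i: pulp.LpVariable(f"x_{i}", cat="Binary") for i in variables}
z = {j: pulp.LpVariable(f"z_{j}", cat="Binary") for j in range(len(clauses))}

# Objective (1): maximize the weighted number of satisfied clauses (w_j = 1 here).
model += pulp.lpSum(z[j] for j in z)

# Constraint (2): z_j may be 1 only if some literal of clause c_j is satisfied.
for j, clause in enumerate(clauses):
    model += pulp.lpSum(x[l] if l > 0 else 1 - x[-l] for l in clause) >= z[j]

model.solve(pulp.PULP_CBC_CMD(msg=False))
print(int(pulp.value(model.objective)))  # prints 3 for this instance
```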

The satisfiability testing community has been very active in the development of MaxSAT solvers. As a result, their performance has improved dramatically in recent years, as witnessed by the results of the different editions of the MaxSAT Evaluation. The efforts have mainly focused on developing local search and exact MaxSAT solvers.

Local search MaxSAT solvers start from an initial complete assignment and, at each step, flip the truth value of a heuristically selected variable in search of a better solution. The most critical issue with such solvers is that they can get trapped in local optima, and so they must incorporate suitable strategies for escaping from them. Among the best performing solvers, we find Dist [62], CCEHC [63], SATLike [64] and SATLike 3.0 [65].
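The following minimal sketch (our own, reusing num_satisfied from the sketch in the problem definition above) illustrates the basic flip loop that such solvers share; the clause weighting, scoring and escape strategies that make solvers such as Dist or SATLike competitive are deliberately omitted:

```python
import random

def greedy_flip_search(clauses, variables, max_flips, seed=0):
    """Skeleton of a local search MaxSAT solver: start from a random complete
    assignment and repeatedly flip one variable, keeping non-worsening flips."""
    rng = random.Random(seed)
    assignment = {i: rng.randint(0, 1) for i in variables}
    best = num_satisfied(clauses, assignment)
    for _ in range(max_flips):
        i = rng.choice(variables)   # real solvers pick the variable with the
        assignment[i] ^= 1          # best score change instead of a random one
        score = num_satisfied(clauses, assignment)
        if score >= best:
            best = score            # accept improving or sideways flips
        else:
            assignment[i] ^= 1      # undo a worsening flip (greedy descent)
    return best, assignment
```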

There are two main groups of exact MaxSAT solvers: branch-and-bound (BnB) and SAT-based solvers. BnB MaxSAT solvers implement the branch-and-bound scheme and are competitive on random and some types of crafted instances. At each node of the search tree, they apply some inference rules and compute a lower bound by detecting disjoint inconsistent subsets of soft clauses with unit propagation [66, 67]. Representative BnB solvers are MaxSatz [68, 69] and Ahmaxsat [70]. BnB MaxSAT solvers can become competitive on industrial instances by incorporating the clause learning mechanism recently introduced in [71].

SAT-based MaxSAT solvers proceed by reformulating the MaxSAT optimization problem into a sequence of SAT decision problems [31] and are particularly competitive on industrial instances. These solvers can be further divided into three subgroups: model-guided, core-guided and Minimum Hitting Sets (MHS-)guided solvers. Model-guided approaches reduce the problem of deciding whether there exists an assignment for the MaxSAT instance with a cost less than or equal to a given k to SAT, and successively decrease k until an unsatisfiable SAT instance is found. Examples of such solvers are SAT4J-Maxsat [72] and Pacose [73]. Core-guided and MHS-guided approaches consider a MaxSAT instance as a SAT instance and call a CDCL SAT solver to identify an unsatisfiable subset of soft clauses, called a core. Then, they relax this core and solve the relaxed instance with a CDCL SAT solver to identify another core, repeating this process until a satisfiable instance is derived. The difference between them is that core-guided solvers relax a core using cardinality constraints, while MHS-guided solvers remove one clause from each detected core so that the number of different clauses removed from the cores is minimized by solving a minimum hitting set instance with an integer programming solver. The solvers Open-WBO [74], WPM3 [75] and RC2 [76] are representative core-guided solvers, and the solvers MHS [77] and MaxHS [78] are representative MHS-guided solvers.
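As a rough illustration of the model-guided scheme, the following sketch performs a linear search on the cost bound k. The helpers encode_cost_at_most, sat_solve and num_falsified are hypothetical placeholders and do not correspond to the API of any solver cited here:

```python
def model_guided_maxsat(soft_clauses):
    """Model-guided linear search (sketch). `encode_cost_at_most(clauses, k)` is
    assumed to return a SAT formula asserting that at most k soft clauses are
    falsified; `sat_solve(formula)` returns a model or None; `num_falsified`
    counts the soft clauses a model violates. All three are hypothetical."""
    k = len(soft_clauses)                # trivial upper bound on the cost
    best_model = None
    while k > 0:
        model = sat_solve(encode_cost_at_most(soft_clauses, k - 1))
        if model is None:                # no assignment of cost <= k - 1 exists,
            break                        # so k is the optimal cost
        best_model = model
        k = num_falsified(model, soft_clauses)   # may be much smaller than k - 1
    return k, best_model
```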

3 Negative Learning ACO for MaxSAT

Our negative learning \(\textsc {Aco}\) for MaxSAT uses as baseline algorithm a \(\mathcal{MM}AS\) variant implemented in the hypercube framework [7]. Depending on the type of additional algorithmic component that is used for providing negative feedback to the main \(\textsc {Aco}\) algorithm, we constructed three variants: (1) \(\textsc {AcoSat}_{\mathrm {neg}}^+\), which uses the local search MaxSAT solver \({\textsc {SAT}Like}\)-c [60]; (2) \(\textsc {AcoSls}_{\mathrm {neg}}^+\), which uses the local search MaxSAT solver \(\textsc {SlsMcs}\) [61]; and (3) \(\textsc {Aco}_{\mathrm {neg}}^+\), which applies the integer linear programming (ILP) solver \(\textsc {Cplex}\). Note that \(\textsc {AcoSat}_{\mathrm {neg}}^+\), \(\textsc {AcoSls}_{\mathrm {neg}}^+\) and \(\textsc {Aco}_{\mathrm {neg}}^+\) benefit from both the positive and the negative feedback information obtained by the solvers, whereas a fourth variant called \(\textsc {Aco}_{\mathrm {neg}}\) uses the solver \(\textsc {Cplex}\) exclusively as negative feedback provider. Algorithm 1 displays the pseudo-code of the general algorithmic framework of all these variants.

3.1 General Description of the Algorithmic Framework

Our baseline \(\textsc {Aco}\) algorithm is a \(\mathcal{MAX}\)-\(\mathcal{MIN}\) Ant System (\(\mathcal{MM}AS\)) implemented in the so-called hypercube framework [7]. This algorithm variant is nowadays one of the most used versions of \(\textsc {Aco}\). It is characterized by the following three features:

  1.

    All pheromone values are naturally limited to [0, 1], due to the specific pheromone value update that is employed. When every pheromone value is either zero or one, the algorithm has fully converged. To prevent this, \(\mathcal{MM}AS\) algorithms further limit the range of the pheromone values to \([\tau _{\min }, \tau _{\max }]\), where \(\tau _{\min }\) is a small constant close to zero, and \(\tau _{\max }\) is a constant close to one. As in most \(\mathcal{MM}AS\) algorithms in the literature, we use the fixed values \(\tau _{\min } = 0.001\) and \(\tau _{\max } = 0.999\).

  2.

    At each iteration, the state of convergence of the algorithm is measured by calculating the so-called convergence factor \(\text{ cf }\), where \(0 \le \text{ cf }\le 1\). Once convergence is detected—which is the case when \(\text{cf} = 1\)—the algorithm is restarted by re-initializing the pheromone values to their initial values.

  3.

    \(\mathcal{MM}AS\) algorithms keep three solutions at any time: (1) the best solution constructed at the current iteration (\(S^{{\mathrm{ib}}}\)), (2) the best solution found since the last restart of the algorithm (\(S^{{\mathrm{rb}}}\)), and (3) the best overall solution (\(S^{{\mathrm{bsf}}}\)). These three solutions are used in a weighted form for updating the pheromone values at each iteration. The weights used for this purpose depend on (1) the value of the convergence factor \(\text{ cf }\), and on (2) the value of a Boolean control variable \(\mathsf bs\_update\). The role of \(\mathsf bs\_update\) is the following. After pheromone (re-)initialization, only solutions \(S^{{\mathrm{ib}}}\) and \(S^{{\mathrm{rb}}}\) are used for the pheromone update. In this phase, \(\mathsf bs\_update\) has a value of \(\textsc {false}\). When convergence is detected in this phase, the value of \(\mathsf bs\_update\) changes to \(\textsc {true}\) and the pheromone update is exclusively performed using solution \(S^{{\mathrm{bsf}}}\). When algorithm convergence is detected again in this phase, the algorithm is restarted as described above.

The pseudo-code of our baseline \(\mathcal{MM}AS\) algorithm can be found in Algorithm 1. At the start of the algorithm both \(S^{{\mathrm{rb}}}\) and \(S^{{\mathrm{bsf}}}\) are initialized as empty sets (line 3). Moreover, \(\text{ cf }\) and \(\mathsf bs\_update\) are initialized to 0 and \(\textsc {false}\), respectively.

For the application to MaxSAT, the algorithm uses a standard pheromone model \(\mathcal{T}\) that consists of a pheromone value \(\tau _{( \langle x_i, j \rangle )} \ge 0\) for each possible value \(j \in \{0,1\}\) that can be assigned to each Boolean variable \(x_i\). In addition, the algorithm also employs a negative pheromone model \({\mathcal{T}}^{\mathrm {neg}}\) that consists of a negative pheromone value \(\tau _{( \langle x_i, j \rangle )}^\mathrm {neg}\) for each Boolean value j that can be assigned to each Boolean variable \(x_i\). At the start of the algorithm, function InitializePheromoneValues\((\mathcal{T}, {\mathcal{T}}^{\mathrm {neg}})\) (line 4) initializes the pheromone values in \(\mathcal{T}\) to 0.5 and those in \({\mathcal{T}}^{\mathrm {neg}}\) to 0.001. Then, at each iteration, \(n_{\mathrm {a}}\) solutions are generated based on greedy and pheromone information by function Construct_Solution\((\mathcal{T}, {\mathcal{T}}^{\mathrm {neg}}, d_{\mathrm {rate}})\) (lines 6 – 10). Further explanations on how this function works are given after this general description.

Algorithm 1: Negative learning \(\textsc {Aco}\) framework for MaxSAT (pseudo-code)

All solutions \(S^k\) found in the current iteration are added to a set \({\mathcal{S}}^{\mathrm {iter}}\). Subsequently, function SolveSubinstance\(({\mathcal{S}}^{\mathrm {iter}}, t_{{\mathrm{solver}}})\) builds a sub-instance \(I^{\mathrm {sub}}\) in the form of a MaxSAT partial solution. The pre-assigned variables in this partial solution are stored in set \(X' \subseteq X\), which contains the variables that have been assigned the same truth value in each \(S^k \in {\mathcal{S}}^{\mathrm {iter}}\). Depending on the specific variant of the negative learning \(\textsc {Aco}\) to be applied, the function then chooses either \(\textsc {Cplex}\) or one of the two MaxSAT solvers for solving this sub-instance (line 11). Note that this function is the only place where these solvers are used within our negative learning \(\textsc {Aco}\) algorithm. After trying to solve the sub-instance for a maximum time of \(t_{{\mathrm{solver}}}\) CPU seconds, the function returns a solution \(S^{\mathrm {sub}}\). Next, \(S^{\mathrm {sub}}\) is compared with the solutions in \({\mathcal{S}}^{\mathrm {iter}}\), and the solution with the best objective function value becomes the iteration-best solution \(S^{{\mathrm{ib}}}\) (line 12). This means that, in addition to being used for the update of the negative pheromone values (as outlined below), \(S^{\mathrm {sub}}\) also boosts the positive learning mechanism of the algorithm: it is added to set \({\mathcal{S}}^{\mathrm {iter}}\), which is used to update solutions \(S^{{\mathrm{ib}}}\), \(S^{{\mathrm{rb}}}\) and \(S^{{\mathrm{bsf}}}\). There is only one exception: in the case of algorithm variant \(\textsc {Aco}_{\mathrm {neg}}\), \(S^{\mathrm {sub}}\) is not added to \({\mathcal{S}}^{\mathrm {iter}}\) and is thus used exclusively for updating the negative pheromone values. Afterwards, the restart-best solution \(S^{{\mathrm{rb}}}\) and the best-so-far solution \(S^{{\mathrm{bsf}}}\) are updated with \(S^{{\mathrm{ib}}}\) (lines 13–14). Finally, the pheromone update and the calculation of the convergence factor are implemented by functions ApplyPheromoneUpdate(\(\mathcal{T}\), \({\mathcal{T}}^{\mathrm {neg}}\), \(\text{ cf }\), \(\mathsf bs\_update\), \(S^{{\mathrm{ib}}}\), \(S^{{\mathrm{rb}}}\), \(S^{{\mathrm{bsf}}}\), \(S^{\mathrm {sub}}\), \(\rho\), \(\rho ^{\mathrm {neg}}\)) and ComputeConvergenceFactor(\(\mathcal{T}\)) (lines 15–16), respectively. If \(\text{cf} > 0.999\) and \(\mathsf bs\_update = \textsc {true}\), the algorithm is restarted (lines 17–24). The algorithm terminates once its termination conditions are met. In most of our experiments in this paper, the termination condition is a given CPU time limit; in some experiments, however, we used a maximum number of solution constructions instead. This is clearly stated in the section on the experimental results. In the following, the functions of the algorithm are described in more detail.
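Since the individual functions are only described textually above, the following Python-style skeleton restates the main loop as we describe it; all helper names are our own shorthand for the functions of Algorithm 1, which are detailed in Sections 3.2–3.5 (best_of is assumed to ignore None entries):

```python
def negative_learning_aco(n_a, d_rate, rho, rho_neg, t_solver):
    """Skeleton of the negative learning ACO framework (hypothetical helpers)."""
    S_rb, S_bsf = None, None                       # restart-best / best-so-far
    cf, bs_update = 0.0, False
    T, T_neg = initialize_pheromone_values()       # 0.5 resp. 0.001 everywhere
    while not termination_conditions_met():
        S_iter = [construct_solution(T, T_neg, d_rate) for _ in range(n_a)]
        S_sub = solve_subinstance(S_iter, t_solver)  # CPLEX / SATLike-c / SlsMcs
        S_iter.append(S_sub)                       # skipped in variant Aco_neg
        S_ib = best_of(S_iter)
        S_rb = best_of([S_ib, S_rb])
        S_bsf = best_of([S_ib, S_bsf])
        apply_pheromone_update(T, T_neg, cf, bs_update,
                               S_ib, S_rb, S_bsf, S_sub, rho, rho_neg)
        cf = compute_convergence_factor(T)
        if cf > 0.999:
            if bs_update:                          # restart the algorithm
                S_rb, bs_update = None, False
                T, T_neg = initialize_pheromone_values()
            else:
                bs_update = True
    return S_bsf
```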

3.2 Solution Construction

Function Construct_Solution\((\mathcal{T}, {\mathcal{T}}^{\mathrm {neg}}, d_{\mathrm {rate}})\) generates each solution \(S^k\) in two phases: (1) variable selection and (2) value selection. In the first phase, a variable \(x_{i}\) is taken from the set \(\hat{X} \subseteq X\) that contains the variables that have not been assigned a value in solution \(S^k\). The probability \(\mathbf {p}_1(x_{i})\) of selecting variable \(x_i\) is calculated according to the following equation.

$$\begin{aligned} {\mathbf {p}_1(x_i)} := \frac{\eta _{i} }{\sum _{x_j \in \hat{X}} \eta _{j}} \end{aligned}$$
(5)

where \(\eta _{i}\) is the greedy information for variable selection. More specifically, \(\eta _i\) is the number of occurrences of variable \(x_i\) in the input instance. Afterwards, a random number \(r \in [0,1]\) is generated. If \(r \le d_{\mathrm {rate}}\), the variable with the highest value of \(\mathbf {p}_1(x_{i})\) in Eq. 5 is selected deterministically. Otherwise, the variable is selected randomly using roulette wheel selection [79]. Hereby, \(d_{\mathrm {rate}}\) is the so-called determinism rate.
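A minimal sketch of this first construction phase, with the list unassigned playing the role of \(\hat{X}\); all helper names are our own:

```python
def select_variable(unassigned, eta, d_rate, rng):
    """Phase 1 of Construct_Solution: choose the next variable via Eq. 5.
    `eta[i]` is the number of occurrences of x_i in the input instance; since
    the denominator of Eq. 5 is shared, ranking by eta equals ranking by p_1."""
    if rng.random() <= d_rate:                     # deterministic step
        return max(unassigned, key=lambda i: eta[i])
    total = sum(eta[i] for i in unassigned)        # roulette wheel selection
    r, acc = rng.random() * total, 0.0
    for i in unassigned:
        acc += eta[i]
        if r <= acc:
            return i
    return unassigned[-1]                          # guard against rounding
```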

In the second phase of the solution construction, a truth value is assigned to the selected variable \(x_i\), in a way similar to the one of the first phase. The probability of assigning truth value j to variable \(x_i\) is calculated with the following equation:

$$\begin{aligned} {\mathbf {p}_2(j)} := \frac{\vartheta _{( \langle x_i, j \rangle )} \cdot \tau _{( \langle x_i, j \rangle )} \cdot (1 - \tau ^{\mathrm {neg}}_{( \langle x_i, j \rangle )})}{\sum _{k \in \{0,1\}} \vartheta _{( \langle x_i, k \rangle )} \cdot \tau _{( \langle x_i, k \rangle )} \cdot (1 - \tau ^{\mathrm {neg}}_{( \langle x_i, k \rangle )}) }, \end{aligned}$$
(6)

where

$$\begin{aligned} \vartheta _{( \langle x_i, j \rangle )} := \frac{1}{1+\text{cost}(S^k , \{ \langle x_i, j \rangle \}) - \text{cost}(S^k)}. \end{aligned}$$
(7)

Hereby, the greedy information \(\vartheta _{( \langle x_i, j \rangle )}\) for the truth value selection in Eq. 7 is inversely proportional to the number of new constraint violations (that is, newly unsatisfied clauses) in the partial solution \(S^k\). In this equation, \(\text{cost}()\) is a function that counts the number of constraint violations in a partial solution. With this definition, \(\text{cost}(S^k)\) refers to the number of constraint violations in the current partial solution \(S^k\), while \(\text{cost}(S^k , \{ \langle x_i, j \rangle \})\) refers to the cost of the partial solution obtained by extending \(S^k\) with the assignment \(x_i = j\).

These two phases of the solution construction are repeated until all \(x_i \in X\) are assigned a truth value.
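The second phase can be sketched analogously, assuming a cost helper with the semantics described after Eq. 7 (a deterministic step governed by \(d_{\mathrm {rate}}\), analogous to the first phase, could be added in the same way):

```python
def select_value(x_i, S_k, tau, tau_neg, cost, rng):
    """Phase 2 of Construct_Solution: choose a truth value for x_i (Eqs. 6-7).
    `cost(S_k, extra)` counts clause violations of the partial assignment S_k
    extended by `extra`; `tau` and `tau_neg` map (variable, value) pairs to the
    standard and negative pheromone values. All names are our own."""
    base = cost(S_k, {})
    weight = {}
    for j in (0, 1):
        greedy = 1.0 / (1.0 + cost(S_k, {x_i: j}) - base)           # Eq. 7
        weight[j] = greedy * tau[(x_i, j)] * (1.0 - tau_neg[(x_i, j)])
    # Eq. 6: sample value 0 with probability weight[0] / (weight[0] + weight[1])
    return 0 if rng.random() <= weight[0] / (weight[0] + weight[1]) else 1
```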

3.3 Solving Sub-instances

Fig. 1: Illustrative example of the negative learning \(\textsc {Aco}\) approach for the MaxSAT problem

Figure 1 shows an illustrative example of how negative learning is added to the baseline \(\textsc {Aco}\) in the context of the MaxSAT problem. The example shows five solutions generated in the current \(\textsc {Aco}\) iteration for a MaxSAT problem on seven binary variables. These five solutions are added to set \({\mathcal{S}}^{\mathrm {iter}}\). Subsequently, function SolveSubinstance\(({\mathcal{S}}^{\mathrm {iter}}, t_{{\mathrm{solver}}})\) (line 11 of Algorithm 1) builds a sub-instance \(I^{\mathrm {sub}}\) in the form of a MaxSAT partial solution whose pre-assigned variables are stored in set \(X' \subseteq X\), as described above. In the example in Fig. 1, variables \(x_2, x_5,\) and \(x_6\) are assigned values 1, 0 and 1, respectively, in each of the five solutions of \({\mathcal{S}}^{\mathrm {iter}}\). Consequently, in sub-instance \(I^{\mathrm {sub}}\), variables \(x_2, x_5,\) and \(x_6\) are pre-assigned the fixed values 1, 0 and 1, respectively. In this way, the additional algorithmic component can only work on the remaining variables, whose values are still unassigned.
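The computation of the pre-assigned variable set \(X'\) reduces to a simple agreement test over the solutions of the current iteration, as the following sketch (with our own helper names) shows:

```python
def build_subinstance(solutions):
    """Determine the pre-assigned variable set X' for sub-instance I_sub: keep
    every variable that receives the same truth value in all solutions of the
    current iteration (each solution is a dict mapping variable -> 0/1)."""
    first = solutions[0]
    fixed = {i: v for i, v in first.items()
             if all(s[i] == v for s in solutions[1:])}
    return fixed  # the solver then optimizes only over the variables in X \ X'

# For the example of Fig. 1, x2, x5 and x6 would end up in `fixed` with
# values 1, 0 and 1, and the solver works on the remaining four variables.
```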

3.4 Pheromone Update

Function ApplyPheromoneUpdate(\(\mathcal{T}\), \({\mathcal{T}}^{\mathrm {neg}}\), \(\text{ cf }\), \(\mathsf bs\_update\), \(S^{{\mathrm{ib}}}\), \(S^{{\mathrm{rb}}}\), \(S^{{\mathrm{bsf}}}\), \(S^{\mathrm {sub}}\), \(\rho\), \(\rho ^{\mathrm {neg}}\)) updates the standard pheromone model \(\mathcal{T}\) and the negative pheromone model \({\mathcal{T}}^{\mathrm {neg}}\) at each iteration. The standard pheromone model \(\mathcal{T}\) is updated in the same way as in all \(\mathcal{MM}AS\) algorithms implemented in the hypercube framework. The value of each standard pheromone \(\tau _{( \langle x_i, j \rangle )}\) is updated with the following equation:

$$\begin{aligned} \tau _{( \langle x_i, j \rangle )} := \tau _{( \langle x_i, j \rangle )} + \rho \cdot (\xi _{( \langle x_i, j \rangle )} - \tau _{( \langle x_i, j \rangle )}) \end{aligned}$$
(8)

where:

$$\begin{aligned} \xi _{( \langle x_i, j \rangle )} := \kappa _{\mathrm{ib}} \cdot \Delta (S^{{\mathrm{ib}}},x_i,j) + \kappa _{\mathrm{rb}} \cdot \Delta (S^{{\mathrm{rb}}},x_i,j) + \kappa _{\mathrm{bsf}} \cdot \Delta (S^{{\mathrm{bsf}}},x_i,j). \end{aligned}$$
(9)

The term \(\xi _{( \langle x_i, j \rangle )}\) in Eq. 9 stores the accumulated update received by each possible value \(j \in \{0, 1\}\) that can be assigned to each Boolean variable \(x_i\). The weights \(\kappa _{\mathrm{ib}}\), \(\kappa _{\mathrm{rb}}\), and \(\kappa _{\mathrm{bsf}}\) in the same equation represent the influences of solutions \(S^{{\mathrm{ib}}}\), \(S^{{\mathrm{rb}}}\), and \(S^{{\mathrm{bsf}}}\), respectively, on the amount of pheromone deposited, and \(\rho\) is the learning rate. The values of these weights are determined based on the states of \(\text{ cf }\) and \(\mathsf bs\_update\) as shown in Table 1. Note that in each state, the sum of \(\kappa _{\mathrm{ib}}\), \(\kappa _{\mathrm{rb}}\), and \(\kappa _{\mathrm{bsf}}\) is equal to 1. Furthermore, \(\Delta (S,x_i,j)\) evaluates to 1 if, and only if, truth value j is assigned to variable \(x_i\) in the corresponding solution S; otherwise, it evaluates to 0. To prevent the algorithm from reaching complete convergence, the pheromone values are limited to the range from \(\tau _{\mathrm {min}}= 0.001\) to \(\tau _{\mathrm {max}}= 0.999\): any pheromone value that falls below \(\tau _{\mathrm {min}}\) is set to \(\tau _{\mathrm {min}}\), and any value that exceeds \(\tau _{\mathrm {max}}\) is set to \(\tau _{\mathrm {max}}\).

Table 1 Values of \(\kappa _{\mathrm{ib}}\), \(\kappa _{\mathrm{rb}}\), and \(\kappa _{\mathrm{bsf}}\) based on the convergence factor \(\text{ cf }\) and the Boolean control variable \(\mathsf bs\_update\)
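The following sketch implements Eqs. 8 and 9 of this standard update, including the clamping to \([\tau _{\mathrm {min}}, \tau _{\mathrm {max}}]\); the weights from Table 1 are passed in as a tuple, since their concrete values depend on \(\text{ cf }\) and \(\mathsf bs\_update\):

```python
TAU_MIN, TAU_MAX = 0.001, 0.999

def update_standard_pheromones(tau, S_ib, S_rb, S_bsf, kappas, rho):
    """Eqs. 8-9: shift every pheromone value towards the weighted target xi.
    `kappas = (k_ib, k_rb, k_bsf)` holds the weights from Table 1; each solution
    S maps a variable to its truth value, so (S[i] == j) realizes Delta(S, x_i, j)."""
    k_ib, k_rb, k_bsf = kappas
    for (i, j) in tau:
        xi = (k_ib * (S_ib[i] == j)
              + k_rb * (S_rb[i] == j)
              + k_bsf * (S_bsf[i] == j))
        tau[(i, j)] += rho * (xi - tau[(i, j)])
        tau[(i, j)] = min(max(tau[(i, j)], TAU_MIN), TAU_MAX)  # clamp to bounds
```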

Function ApplyPheromoneUpdate(\(\mathcal{T}\), \({\mathcal{T}}^{\mathrm {neg}}\), \(\text{ cf }\), \(\mathsf bs\_update\), \(S^{{\mathrm{ib}}}\), \(S^{{\mathrm{rb}}}\), \(S^{{\mathrm{bsf}}}\), \(S^{\mathrm {sub}}\), \(\rho\), \(\rho ^{\mathrm {neg}}\)) also updates the negative pheromone values with a mechanism similar to the one used for the standard pheromone update. In the case of the negative pheromone values, however, Eq. 10 is only applied to variables from \(X \setminus X'\), that is, to variables that were not pre-assigned a value in sub-instance \(I^{\mathrm {sub}}\). The update formula for the negative pheromone values is as follows:

$$\begin{aligned} \tau _{( \langle x_i, j \rangle )}^{\mathrm {neg}} := \tau _{( \langle x_i, j \rangle )}^{\mathrm {neg}} + \rho ^{\mathrm {neg}}\cdot (\xi _{( \langle x_i, j \rangle )}^{\mathrm {neg}} - \tau _{( \langle x_i, j \rangle )}^{\mathrm {neg}}). \end{aligned}$$
(10)

Hereby, \(\rho ^{\mathrm {neg}}\) is the negative learning rate. Furthermore, for all \(x_i \in X \setminus X'\), \(\xi _{( \langle x_i, 0 \rangle )}^{\mathrm {neg}}\) is set to 1 if \(x_i\) has value 1 in solution \(S^{\mathrm {sub}}\), and to 0 otherwise. Analogously, \(\xi _{( \langle x_i, 1 \rangle )}^{\mathrm {neg}}\) is set to 1 if \(x_i\) has value 0 in solution \(S^{\mathrm {sub}}\), and to 0 otherwise. In the illustrative example in Fig. 1, the negative pheromone update is only applied to the truth values of variables \(x_1, x_3, x_4,\) and \(x_7\), since their values are not pre-assigned in sub-instance \(I^{\mathrm {sub}}\). In this example, the truth values 1, 0, 1,  and 1 are assigned to variables \(x_1, x_3, x_4,\) and \(x_7\), respectively. As a consequence, a negative pheromone increase is given to the complementary truth values 0, 1, 0,  and 0, which are not assigned to variables \(x_1, x_3, x_4,\) and \(x_7\), respectively. In other words, our algorithm penalizes, in the form of a negative pheromone increase, each Boolean value that is not assigned in \(S^{\mathrm {sub}}\) to a variable \(x_i \in X \setminus X'\). Note that the negative pheromone values are also limited to the range from \(\tau _{\mathrm {min}}= 0.001\) to \(\tau _{\mathrm {max}}= 0.999\): any value that falls below \(\tau _{\mathrm {min}}\) is set to \(\tau _{\mathrm {min}}\), and any value that exceeds \(\tau _{\mathrm {max}}\) is set to \(\tau _{\mathrm {max}}\).
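A sketch of the negative update of Eq. 10, where fixed plays the role of the pre-assigned set \(X'\) and, as before, the helper names are our own:

```python
def update_negative_pheromones(tau_neg, S_sub, fixed, rho_neg):
    """Eq. 10: for every variable not pre-assigned in I_sub (`fixed` is X'),
    increase the negative pheromone of the truth value that S_sub did NOT
    choose for it, and decay the negative pheromone of the chosen value."""
    for (i, j) in tau_neg:
        if i in fixed:                 # x_i in X': no negative update
            continue
        xi_neg = 1.0 if S_sub[i] != j else 0.0
        tau_neg[(i, j)] += rho_neg * (xi_neg - tau_neg[(i, j)])
        tau_neg[(i, j)] = min(max(tau_neg[(i, j)], TAU_MIN), TAU_MAX)
```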

3.5 Calculation of the Convergence Factor

Function ComputeConvergenceFactor(\(\mathcal{T}\)) calculates the value of \(\text{ cf }\) needed to regulate the update of the standard pheromone model \(\mathcal{T}\) using Eq. 11:

$$\begin{aligned} \text{ cf } := 2 \left( \left( \frac{\sum \limits _{\tau \in \mathcal{T}}\max \{\tau _{\max }-\tau , \tau -\tau _{\min }\}}{\vert \mathcal{T}\vert \cdot (\tau _{\max }-\tau _{\min })}\right) - 0.5\right) . \end{aligned}$$
(11)

With this equation, the value of \(\text{ cf }\) equals zero when all pheromone values are at their initial value of 0.5, and it equals one when every pheromone value is either \(\tau _{\mathrm {min}}\) or \(\tau _{\mathrm {max}}\). In all other cases, the value of \(\text{ cf }\) lies between 0 and 1.
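In code, Eq. 11 amounts to a single pass over the standard pheromone values (reusing the bounds constants \(\tau _{\mathrm {min}} = 0.001\) and \(\tau _{\mathrm {max}} = 0.999\) from the update sketch above):

```python
def compute_convergence_factor(tau):
    """Eq. 11: cf is 0 when all pheromone values are 0.5 and 1 when every
    value sits at tau_min or tau_max."""
    numer = sum(max(TAU_MAX - t, t - TAU_MIN) for t in tau.values())
    return 2.0 * (numer / (len(tau) * (TAU_MAX - TAU_MIN)) - 0.5)
```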

4 Experimental Evaluation

We performed the experimental evaluation of our negative learning \(\textsc {Aco}\) variants, the baseline \(\textsc {Aco}\) algorithm without negative learning, the ILP solver \(\textsc {Cplex}\), the two chosen local search MaxSAT solvers and the exact MaxSAT solver MaxHS [78] on a cluster of machines with two Intel® Xeon® Silver 4210 CPUs with 10 cores of 2.20 GHz and 92 GB of RAM. The version of \(\textsc {Cplex}\) used by \(\textsc {Aco}\) variants \(\textsc {Aco}_{\mathrm {neg}}^+\) and \(\textsc {Aco}_{\mathrm {neg}}\), as well as in standalone mode, was 12.10, running in single-threaded mode. The local search MaxSAT solvers \({\textsc {SAT}Like}\)-c [60] and \(\textsc {SlsMcs}\) [61] used by variants \(\textsc {AcoSat}_{\mathrm {neg}}^+\) and \(\textsc {AcoSls}_{\mathrm {neg}}^+\), as well as the exact MaxSAT solver MaxHS [78], were taken from [80]. The solvers \({\textsc {SAT}Like}\)-c and MaxHS were the winners of the incomplete and complete unweighted tracks of the MaxSAT Evaluation 2020 (MSE 2020), respectively.

4.1 Problem Instances

First, we decided to compare our negative learning \(\textsc {Aco}\) approaches with the \(\textsc {Aco}\) approaches for MaxSAT by Pinto et al. [35] and Villagra and Barán [36]. Second, we also wanted to compare our approaches with the state-of-the-art MaxSAT solvers \({\textsc {SAT}Like}\)-c, \(\textsc {SlsMcs}\) and MaxHS from MSE 2020. For this purpose, we tested our negative learning \(\textsc {Aco}\) variants on the problem instances from [35, 36] as well as on a wide range of problem instances from MSE 2016 [81] and MSE 2020 [80].

The specifications of these MaxSAT instances are given in Table 2 (Pinto et al.), Table 3 (Villagra and Barán), and Tables 4, 5, 6, 7 (MSE 2016 and MSE 2020), where \(n_l\), \(n_x\), and \(n_c\) denote the number of literals, variables and clauses, respectively. From the work of Pinto et al., we took two unweighted instances used to test their \(\textsc {Aco}\) implementation for the static MaxSAT problem. Each of these instances has three literals per clause, 250 variables, and 1065 clauses. The MaxSAT instances from Villagra and Barán consist of phase-transition instances (instances 1–25 in Table 3) and over-constrained instances (instances 26–50 in Table 3). Each of these 50 instances has three literals per clause and 50 variables. Each phase-transition instance has 215 clauses, while each over-constrained instance has 323 clauses. The chosen instance set from MSE 2020 and MSE 2016 consists of four groups: (1) maxcut, (2) highirth, (3) ramsey, and (4) set-covering. In Tables 4, 5, 6 and 7, we sort these 113 instances according to the number of literals, the number of variables, and the number of clauses. Overall, these instances vary considerably in terms of size and structure.

Table 2 Average results for Pinto’s instances
Table 3 Average results for Villagra and Barán's instances
Table 4 Best results of all algorithms tested on MSE 2020 and 2016 instances
Table 5 Continuation of Table 4: best results of all algorithms tested on MSE 2020 and 2016 instances
Table 6 Continuation of Table 4: best results of all algorithms tested on MSE 2020 and 2016 instances
Table 7 Continuation of Table 4: best results of all algorithms tested on MSE 2020 and 2016 instances

4.2 Algorithm Tuning and Test Settings

The baseline \(\textsc {Aco}\) as well as the negative learning variants require well-performing configurations of their parameter values. We used the tuning software irace [82] for this purpose. In particular, we carried out separate tuning runs for each of the considered MaxSAT instance groups. Concerning the instances by Pinto et al., we chose instance number 1 from Table 2 for parameter tuning. The parameter values obtained for this instance group are presented in Table 8. Pinto et al. employed an \(\textsc {Aco}\) approach with a single ant that was evaluated over 100 runs of 100 iterations each. Consequently, we limited our algorithms to match the number of solution constructions of their \(\textsc {Aco}\) algorithm. In particular, we limited the execution of \(\textsc {Aco}_{\mathrm {neg}}^+\), \(\textsc {Aco}_{\mathrm {neg}}\), and \(\textsc {Aco}\) to 6, 14, and 50 iterations, respectively.

Table 8 Parameter values obtained for all \(\textsc {Aco}\) algorithms concerning the Pinto et al. [35] instances

Concerning the instances of Villagra and Barán, we chose the first five instances from each of the two instance types (phase-transition and over-constrained) for tuning. The parameter values obtained for this instance group are presented in Table 9. Villagra and Barán employed 10 ants in their \(\textsc {Aco}\) variants and limited the executions to 10,000 iterations for each of the 10 test runs per instance. To match their test setting, we limited the execution of \(\textsc {Aco}_{\mathrm {neg}}^+\), \(\textsc {Aco}_{\mathrm {neg}}\), and \(\textsc {Aco}\) to 20,000, 16,666, and 8333 iterations, respectively.

Table 9 Parameter values obtained for all \(\textsc {Aco}\) algorithms concerning the Villagra and Barán [36] instances

As shown in Tables 4, 5, 6 and 7, the instance set selected from the MSE 2016 and MSE 2020 Evaluations is very diverse in its specifications. For tuning purposes, we divided these instances into 8 sub-groups based on their type and size: (1) \(maxcut_1\), (2) \(maxcut_2\), (3) highirth, (4) \(ramsey_1\), (5) \(ramsey_2\), (6) \(setcov_1\), (7) \(setcov_2\), and (8) \(setcov_3\). We took the first two instances from each of these sub-groups for the tuning process. As an exception, we took the first two instances from each configuration of \(n_l\), \(n_x\), and \(n_c\) for the sub-group highirth. Therefore, for this sub-group we used a total of eight instances for tuning. The obtained parameter values are presented in Table 10. We limited the execution time of all the algorithms tested on this instance group to 300 seconds, corresponding to one of the time limits used in MSE 2020 [80].

Table 10 Parameter values obtained for all \(\textsc {Aco}\) algorithms concerning MSE 2016 and 2020 instances

4.3 Results

The empirical results of all algorithms applied to the instances of Pinto et al. are presented in Table 2. Note that the results are provided in terms of the average number of satisfied clauses obtained over 100 runs; hence, a higher value represents a better result. Moreover, the results under the header \(\textsc {Aco}_{{{\mathrm{Pinto}}}}\) are those of the \(\textsc {Aco}\) version from Pinto et al. [35]. Additionally, results marked in bold correspond to the best result of each table row. In summary, the results show that \(\textsc {Aco}_{\mathrm {neg}}^+\) is the best algorithm for these instances. They also show that even though \(\textsc {Aco}_{\mathrm {neg}}\) is outperformed by \(\textsc {Cplex}\), it still performs significantly better than \(\textsc {Aco}_{{{\mathrm{Pinto}}}}\). Compared to the baseline \(\textsc {Aco}\), each of our negative learning approaches produces a remarkable improvement.

Table 3 shows the empirical results of all the algorithms applied to the instances of Villagra and Barán in a summarized way. In particular, results are averaged over the 25 instances of each of the two instance sub-groups. In addition, the number of instances solved to optimality for each sub-group is given in brackets after the corresponding average results. In the context of these instances, our negative learning \(\textsc {Aco}\) variants are compared to the MaxSAT solver \(Walk\textsc {SAT}\) as well as to the \(\textsc {Aco}\) variants from Villagra and Barán: \(\textsc {Aco}_{\textsc {Saw}}\), \(\textsc {Aco}_{\textsc {Rf}}\), and \(\textsc {Aco}_{\textsc {Rfsaw}}\). Each result in the table indicates the average number of satisfied clauses obtained over 10 algorithm runs. Additionally, we made use of the R package scmamp [83] to facilitate the interpretation of the results in Table 3. This statistical tool works as follows. First, the results of all algorithms are compared simultaneously using the Friedman test in order to reject the hypothesis that all algorithms perform equally. Next, a set of pairwise comparisons is performed using the Nemenyi post-hoc test [84], and the output of this statistical analysis is presented as a critical difference (CD) plot in Fig. 2. The horizontal axis of the CD plot represents the range of algorithm ranks, while each vertical line represents the average rank of the corresponding algorithm. Bold horizontal lines connecting algorithm markers indicate that the corresponding algorithms perform statistically equivalently at a significance level of 0.05, i.e., the difference between their average ranks does not exceed the critical difference. Figure 2 shows that all of our negative learning approaches, as well as the baseline \(\textsc {Aco}\), perform statistically better than each of the \(\textsc {Aco}\) versions from Villagra and Barán. Furthermore, all our \(\textsc {Aco}\) versions perform statistically equivalently to the MaxSAT solver \(Walk\textsc {SAT}\) and the ILP solver \(\textsc {Cplex}\) for this instance group.

Fig. 2: Critical difference plot concerning the results of the test on Villagra and Barán's instances

Tables 4, 5, 6 and 7 present the results of all the algorithms applied to the selected MSE 2016 and MSE 2020 instances. Note that these tables provide the best result of each (stochastic) algorithm; the average results are, due to space limitations, provided as supplementary material [85]. Also note that the results in Tables 4, 5, 6 and 7 are given in terms of the number of violated clauses; thus, a lower value represents a better result. To facilitate the interpretation of the results obtained for this instance group, we additionally present the data from Tables 4, 5, 6 and 7 in a summarized way in Table 11. In addition, we applied the same statistical analysis with scmamp (as explained above) to the data from Tables 4, 5, 6 and 7 and present the result as a CD plot in Fig. 3.

Fig. 3: Critical difference plot concerning the results of the tests for the MSE 2020 and 2016 instances

Table 11 Comparative performance of our negative learning \(\textsc {Aco}\) variants with their individual algorithmic components

In particular, Table 11 shows the number of instances for which each negative learning \(\textsc {Aco}\) variant performs better, worse, or equally in comparison to its individual algorithmic component. These summarized results indicate that, in general, each of our negative learning \(\textsc {Aco}\) variants improves both over the baseline \(\textsc {Aco}\) and over the solver that it uses internally for solving sub-instances. Among all the negative learning \(\textsc {Aco}\) variants, \(\textsc {AcoSat}_{\mathrm {neg}}^+\) achieved the highest number of improvements over the baseline \(\textsc {Aco}\): it improves in 108 of the 113 problem instances. Compared with the internally used MaxSAT solver \({\textsc {SAT}Like}\)-c, however, it improves in only \(11.5\%\) of the problem instances. Nevertheless, \(\textsc {AcoSat}_{\mathrm {neg}}^+\) can be called the best algorithm for this instance group according to the CD plot in Fig. 3, even though no statistical difference can be detected with respect to \({\textsc {SAT}Like}\)-c and \(\textsc {Aco}_{\mathrm {neg}}^+\). Furthermore, all remaining negative learning \(\textsc {Aco}\) variants also significantly improve over both the baseline \(\textsc {Aco}\) approach and their internally used solvers. Even \(\textsc {Aco}_{\mathrm {neg}}\), the variant that does not take advantage of the internally derived \(\textsc {Cplex}\) result for updating its own best result, improves over both the baseline \(\textsc {Aco}\) approach and its constituent solver \(\textsc {Cplex}\) on most of the problem instances. Furthermore, the statistical analysis graphically presented in Fig. 3 shows that \(\textsc {Aco}_{\mathrm {neg}}\) outperforms the MaxSAT solver \(\textsc {SlsMcs}\). This demonstrates the effectiveness of our negative learning strategy. Moreover, these results indicate an interesting aspect: our negative learning \(\textsc {Aco}\) framework can potentially be used to improve the results of MaxSAT solvers that are already very successful in standalone mode.

5 Discussion and Conclusions

Ant colony optimization (\(\textsc {Aco}\)) has been subject to several major improvements and extensions over its history. Most of these extensions, however, deal exclusively with the improvement of the positive learning mechanisms. Observing that negative learning works in synergy with positive learning in nature, researchers have presented several works over the past decades that integrate negative learning into \(\textsc {Aco}\). Most of these works, however, achieved only limited success. In previous work, we introduced a novel strategy for the implementation and use of negative learning in \(\textsc {Aco}\). In contrast to other negative learning proposals, we made use of an additional algorithmic component to provide negative feedback to the main \(\textsc {Aco}\) algorithm. Furthermore, we implemented an effective cooperation mechanism between the main \(\textsc {Aco}\) approach and the additional algorithmic component through the use of a sub-instance that is not only reduced in size but also contains high-quality solutions. Our strategy proved useful for improving the performance of the baseline \(\textsc {Aco}\) algorithm in the context of a range of subset selection problems. In our opinion, one of the main reasons for the success of negative learning in \(\textsc {Aco}\) is that, in addition to guiding the algorithm to promising areas of the search space due to positive learning, it also steers the algorithm away from solution components that initially seem promising but that turn out during the search process to lead to rather low-quality solutions.

In this work, we applied the negative learning \(\textsc {Aco}\) strategy to the MaxSAT problem, an optimization problem that is substantially different from the problems considered to date. Moreover, MaxSAT is an extremely well-studied optimization problem for which a wide range of high-performance solvers are available for comparison. Also, \(\textsc {Aco}\) approaches have rarely been implemented for this optimization problem, and most of the existing implementations are far from being able to compete with state-of-the-art approaches. Hence, testing our negative learning proposal on MaxSAT provides a good opportunity to demonstrate the general applicability as well as the effectiveness of our approach. In addition to the ILP solver \(\textsc {Cplex}\), which we already employed in previous work, we made use of two high-performance MaxSAT solvers, \({\textsc {SAT}Like}\)-c and \(\textsc {SlsMcs}\), as new options for the additional algorithmic component to be used internally by negative learning \(\textsc {Aco}\). In this study, we evaluated the resulting negative learning \(\textsc {Aco}\) variants on three instance groups. In the context of the first two instance groups, the results show that our negative learning \(\textsc {Aco}\) variants perform significantly better than the baseline \(\textsc {Aco}\) as well as existing \(\textsc {Aco}\) variants from the literature. In the context of the third instance group, consisting of instances used in recent MaxSAT Evaluations, the obtained results show that all our negative learning \(\textsc {Aco}\) variants were able to improve over the baseline \(\textsc {Aco}\) approach and over each of the internally used solvers. This is a very interesting result, as it shows that high-performance MaxSAT solvers can be further improved by using them for solving sub-instances within our framework. In our opinion, this happens because the \(\textsc {Aco}\) algorithm reduces the original problem instances, and the solver is then applied only to limited areas of the search space in which presumably good solutions can be found.

A natural extension of this work is to adapt our negative learning \(\textsc {Aco}\) approach to weighted MaxSAT as well as to partial MaxSAT, the variant of MaxSAT that declares some clauses as hard and imposes that hard clauses must be satisfied by any valid solution. Since industrial instances are generally encoded using weighted and partial MaxSAT, it might be interesting to use a SAT-based MaxSAT solver or a branch-and-bound MaxSAT solver with clause learning as the additional algorithmic component. These solvers are particularly competitive on industrial instances, and our negative learning \(\textsc {Aco}\) might help improve their performance. Finally, another extension of this work is to incorporate a decimation approach [65] into the generation of solutions.