Certified Core-Guided MaxSAT Solving

In the last couple of decades, developments in SAT-based optimization have led to highly efficient maximum satisfiability (MaxSAT) solvers, but in contrast to the SAT solvers on which MaxSAT solving rests, there has been little parallel development of techniques to prove the correctness of MaxSAT results. We show how pseudo-Boolean proof logging can be used to certify state-of-the-art core-guided MaxSAT solving, including advanced techniques like structure sharing, weight-aware core extraction and hardening. Our experimental evaluation demonstrates that this approach is viable in practice. We are hopeful that this is the first step towards general proof logging techniques for MaxSAT solvers.


Introduction
Combinatorial optimization is one of the most impressive, and most intriguing, success stories in computer science. This area deals with computationally very challenging problems, which are widely believed to require exponential time in the worst case [21,49]. In spite of this, during the last couple of decades astonishing progress has been made on so-called combinatorial solvers for a number of different algorithmic paradigms such as Boolean satisfiability (SAT) solving and optimization [15], constraint programming (CP) [72], and mixed integer programming (MIP) [1,16]. Today, such solvers are routinely used to solve real-world problems with hundreds of thousands or even millions of variables.
While the performance of modern combinatorial solvers is truly impressive, one negative aspect is that they are highly complex pieces of software, and it is well documented that even mature state-of-the-art solvers sometimes give wrong results [2,18,25,37]. This can be fatal for applications where correctness is a non-negotiable demand. Perhaps the most successful approach for addressing this problem so far is the requirement in the SAT solving community that solvers should be certifying [3,62], meaning that when given a formula a solver should output not only a verdict whether the formula is satisfiable or unsatisfiable, but also an efficiently machine-verifiable proof log establishing that this verdict is guaranteed to be correct. One can then feed the input formula, the verdict, and the proof log to a special, dedicated proof checker, and accept the result if the proof checker agrees that the proof log shows that the solver computation is valid. Over the years, different proof formats such as RUP [43], TraceCheck [14], DRAT [44,45], GRIT [27], and LRAT [26] have been developed, and for almost a decade DRAT proof logging has been compulsory in the (main track of the) SAT competition. However, there has been very limited progress in designing analogous proof logging techniques for more powerful algorithmic paradigms.
Our focus in this work is on the optimization paradigm that is arguably closest to SAT solving, namely maximum satisfiability or MaxSAT solving [8,56], and the challenge of developing proof logging techniques for MaxSAT solvers.

Previous Work
Since essentially all modern MaxSAT solvers are based on repeated invocations of SAT solvers, a first question is why SAT proof logging techniques are not sufficient. While DRAT is a very powerful proof system, it seems that the overhead of generating proofs of correctness for the rewriting steps in between SAT solver calls in MaxSAT solvers is too large to be tolerable for practical purposes. Another, related, problem is that for optimization problems one needs to reason about the objective function, which DRAT struggles to do since its language is limited to disjunctive clauses. But perhaps the biggest challenge is that while modern SAT solving is completely dominated by the conflict-driven clause learning (CDCL) method [11,59,66], for MaxSAT there is a rich variety of approaches including linear SAT-UNSAT (or model-improving search) [31,54,68], core-guided search [4,7,35,67], implicit hitting set (IHS) search [28,29], and some recent work on branch-and-bound methods [57] (where we stress that the lists of references are far from exhaustive).
One tempting solution to circumvent this heterogeneity of solving approaches is to treat the MaxSAT solver as a black box and use a single call to a certifying SAT solver to prove optimality of the final solution found. However, there are several problems with this proposal. Firstly, we would still need proof logging to ensure that the input to the SAT solver is a correct encoding of a claim of optimality for the correct problem instance. Secondly, such a SAT call could be extremely expensive, running counter to the goal of proof logging with low (and predictable) overhead. Finally, even if the SAT-call approach could be made to work efficiently, this would just certify the final result, and would not help validate the correctness of the reasoning of the solver. For these reasons, our goal is to provide proof logging for the actual computations of the MaxSAT algorithm.
While some proof systems and tools have been developed specifically for MaxSAT [19,34,48,53,64,65,69,70,71], none of them comes close to providing general-purpose proof logging, because they apply only to very specific algorithm implementations and/or fail to capture the full range of reasoning used in an algorithmic approach. A recent work [75] by two co-authors of the current paper instead leverages the pseudo-Boolean proof logging system VeriPB [76] to certify correctness of the unweighted linear SAT-UNSAT solver QMaxSAT. VeriPB is similar in spirit to DRAT, but operates with more general 0-1 linear inequalities rather than just clauses. This simplifies reasoning about optimization problems, and also makes it possible to capture the powerful MaxSAT solver inferences in a more concise way. VeriPB has previously been used for proof logging of enhanced SAT solving techniques [17,42] and pseudo-Boolean solving [38], as well as for providing proof-of-concept tools for a nontrivial range of techniques in constraint programming [33,41] and subgraph solving [39,40].

Our Contributions
In this work, we use VeriPB to provide, to the best of our knowledge for the first time, efficient proof logging for the full range of techniques in a cutting-edge MaxSAT solver. We consider the state-of-the-art core-guided solver CGSS [47], based on RC2 [46], and show how to enhance CGSS to output proofs of correctness of its reasoning, including sophisticated techniques such as stratification [6,58], intrinsic-at-most-one constraints [46], hardening [6], weight-aware core extraction [13], and structure sharing [47]. We find that the overhead for such proof logging is perfectly manageable, and although there is certainly room to improve the proof verification time, our experiments demonstrate that already a first proof-of-concept implementation of this approach is practically feasible.
It has been shown previously [32,39,52] that proof logging can also serve as a powerful debugging tool. This is because faulty reasoning is likely to lead to unsound proofs, which can be detected even if the solver produces correct output for all test cases. We exhibit yet another example of this: some proofs for which we struggled to make the verification work turned out to reveal two well-hidden bugs in RC2 and CGSS that earlier extensive testing had failed to uncover.
Although it still remains to provide proof logging for other MaxSAT approaches such as (general, weighted) linear SAT-UNSAT and implicit hitting set (IHS) search, we are optimistic that our work could serve as an important step towards general adoption of proof logging techniques for MaxSAT solvers.

Outline of This Paper
After reviewing preliminaries for pseudo-Boolean reasoning and core-guided MaxSAT solving in Sects. 2 and 3, we explain how core-guided MaxSAT solvers can be equipped with proof logging methods in Sect. 4. In Sect. 5 we present our experimental evaluation, after which some concluding remarks and directions for future research are given in Sect. 6.

Preliminaries
We start with a review of some standard material which can be found, e.g., in [20,38,42]. A literal $\ell$ over a Boolean variable $x$ (taking values in $\{0, 1\}$, which we identify with false and true, respectively) is $x$ itself or its negation $\overline{x}$, where $\overline{x} = 1 - x$. A pseudo-Boolean (PB) constraint is a 0-1 integer linear inequality $C \doteq \sum_i a_i \ell_i \ge A$ (where $\doteq$ denotes syntactic equality). When convenient, we can assume without loss of generality that PB constraints are in normalized form [10]; i.e., all literals $\ell_i$ are over distinct variables and the coefficients $a_i$ and the degree (of falsity) $A$ are non-negative integers. The set of literals in $C$ is denoted $\mathrm{lits}(C)$. The negation of $C$ is $\neg C \doteq \sum_i a_i \ell_i \le A - 1$ (rewritten in normalized form when needed). A pseudo-Boolean formula is a conjunction $F \doteq \bigwedge_j C_j$ of PB constraints. Note that a disjunctive clause can be viewed as a PB constraint with all coefficients and the degree equal to 1, and so formulas in conjunctive normal form (CNF) are special cases of PB formulas.
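A (partial) assignment $\rho$ is a (partial) function from variables to $\{0, 1\}$, which we extend to literals by respecting the meaning of negation. Applying $\rho$ to a constraint $C$ yields $C{\upharpoonright}_\rho$ by substituting the variables assigned in $\rho$ by their values, and for a formula $F$ we apply $\rho$ to every constraint, i.e., $F{\upharpoonright}_\rho \doteq \bigwedge_j C_j{\upharpoonright}_\rho$.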
The foundation of the pseudo-Boolean proof logging in this paper is the cutting planes proof system [24], which is a method to iteratively derive new constraints implied by a pseudo-Boolean formula $F$. If $C$ and $D$ have been derived before or are axiom constraints in $F$, then any positive linear combination of these constraints can be derived. Literal axioms $\ell \ge 0$ can also be added to any previously derived constraints. For a constraint $\sum_i a_i \ell_i \ge A$ in normalized form, division by a positive integer $d$ derives $\sum_i \lceil a_i/d \rceil \ell_i \ge \lceil A/d \rceil$, and we also add a saturation rule that derives $\sum_i \min\{a_i, A\} \cdot \ell_i \ge A$ (where the soundness of these rules crucially depends on the normalized form). It is well known that any PB constraint implied by $F$ can be derived using these rules.
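The rules above can be made concrete in a few lines of Python. This is a minimal sketch and not VeriPB's implementation: the representation (a dict from signed integer literals to coefficients, paired with a degree) and the function names `add`, `multiply`, `divide`, and `saturate` are assumptions chosen for illustration.

```python
# A normalized PB constraint is a pair (coeffs, degree), where coeffs maps a
# signed integer literal (v for x_v, -v for its negation) to a positive coefficient.

def add(c1, c2):
    """Add two normalized constraints, using x + ~x = 1 to cancel opposite literals."""
    coeffs = dict(c1[0])
    degree = c1[1] + c2[1]
    for lit, a in c2[0].items():
        if -lit in coeffs:
            b = coeffs.pop(-lit)
            degree -= min(a, b)          # a*x + b*~x contributes the constant min(a, b)
            if a > b:
                coeffs[lit] = a - b
            elif b > a:
                coeffs[-lit] = b - a
        else:
            coeffs[lit] = coeffs.get(lit, 0) + a
    return coeffs, max(degree, 0)

def multiply(c, k):
    """Multiply a constraint by a positive integer k."""
    return {lit: a * k for lit, a in c[0].items()}, c[1] * k

def divide(c, d):
    """Divide by a positive integer d, rounding coefficients and degree up."""
    return {lit: -(-a // d) for lit, a in c[0].items()}, -(-c[1] // d)

def saturate(c):
    """Cap every coefficient at the degree."""
    coeffs = {lit: min(a, c[1]) for lit, a in c[0].items()}
    return {lit: a for lit, a in coeffs.items() if a > 0}, c[1]
```

For example, adding the three clauses $x_1 + x_2 \ge 1$, $x_1 + x_3 \ge 1$, and $x_2 + x_3 \ge 1$ and dividing the sum by 2 yields $x_1 + x_2 + x_3 \ge 2$, a constraint that no single resolution step on the clauses could produce.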
A constraint $C$ is said to unit propagate the literal $\ell$ to true under an assignment $\rho$ if $C{\upharpoonright}_\rho$ cannot be satisfied unless $\ell$ is true. During unit propagation on $F$ under $\rho$, we extend $\rho$ iteratively by any propagated literals until an assignment $\rho'$ is reached under which no constraint $C \in F$ is propagating or some constraint $C$ wants to propagate a literal that has already been assigned to the opposite value. The latter case is called a conflict, since $C$ is violated by $\rho'$. We say that $F$ implies $C$ by reverse unit propagation (RUP), and that $C$ is a RUP constraint with respect to $F$, if $F \wedge \neg C$ unit propagates to conflict under the empty assignment. It is not hard to see that $F \models C$ holds if $C$ is a RUP constraint, and as a convenient shorthand we will add a RUP rule for deriving new constraints.
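A RUP check for normalized PB constraints can be sketched as follows. This is an illustrative, unoptimized version under the assumption that constraints are stored as a dict from signed integer literals to coefficients plus a degree; the names `negate`, `propagates_to_conflict`, and `is_rup` are hypothetical and do not refer to any function in VeriPB.

```python
def negate(c):
    """Negation of sum(a_i * l_i) >= A in normalized form:
    sum(a_i * ~l_i) >= sum(a_i) - A + 1."""
    coeffs, degree = c
    return {-lit: a for lit, a in coeffs.items()}, sum(coeffs.values()) - degree + 1

def propagates_to_conflict(constraints):
    """Unit propagate from the empty assignment; return True on conflict."""
    true_lits = set()
    changed = True
    while changed:
        changed = False
        for coeffs, degree in constraints:
            # Slack: how far the non-falsified literals exceed the degree.
            slack = sum(a for lit, a in coeffs.items() if -lit not in true_lits) - degree
            if slack < 0:
                return True              # constraint is violated: conflict
            for lit, a in coeffs.items():
                unassigned = lit not in true_lits and -lit not in true_lits
                if unassigned and a > slack:
                    true_lits.add(lit)   # constraint cannot hold unless lit is true
                    changed = True
    return False

def is_rup(formula, c):
    """True if formula together with the negation of c propagates to conflict."""
    return propagates_to_conflict(list(formula) + [negate(c)])
```

With the clauses $x_1 + x_3 \ge 1$ and $\overline{x}_3 + x_2 \ge 1$, the clause $x_1 + x_2 \ge 1$ is RUP: its negation forces $x_1 = x_2 = 0$, the first clause then propagates $x_3 = 1$, and the second clause is falsified.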
In addition to deriving constraints that are implied by a formula $F$, we also allow deriving so-called redundant constraints $C$ that are not implied by $F$ as long as some optimal solution is guaranteed to be preserved. This is done by extending the proof system with a redundance-based strengthening rule [17,42]. We will only need the special case of this rule saying that for a fresh variable $z$ and for any constraint $D \doteq \sum_i a_i \ell_i \ge A$ we can introduce the reified constraints

$$A \cdot \overline{z} + \sum_i a_i \ell_i \ge A \qquad \text{(1a)}$$

$$\Bigl(\sum_i a_i - A + 1\Bigr) \cdot z + \sum_i a_i \overline{\ell}_i \ge \sum_i a_i - A + 1 \qquad \text{(1b)}$$

encoding the implications $z \Rightarrow D$ and $z \Leftarrow D$, respectively. We refer to $z$ as the reification variable, and when $D$ is clear from context, we will sometimes write just $C^{\Rightarrow}_{\mathrm{reif}}(z)$ for (1a) and $C^{\Leftarrow}_{\mathrm{reif}}(z)$ for (1b).

The maximum satisfiability (MaxSAT) problem can be described conveniently as a special case of pseudo-Boolean optimization. A discussion on the equivalence of the following and the more classical, clause-centric definition can be found in, for instance, [8,55]. An instance $(F, O)$ of the (weighted partial) MaxSAT problem consists of a CNF formula $F$ and an objective function $O$ written as a non-negative affine combination of literals. The goal is to find a solution $\alpha$ that satisfies $F$ and minimizes $O(\alpha)$. We say that such a solution $\alpha$ is optimal for the instance and that the optimal cost of the instance $(F, O)$ is $O(\alpha)$.
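The two reified constraints for a fresh variable $z$ and a normalized constraint $D$ can be built mechanically. The sketch below is an illustration only; the function names `reify` and `holds` and the dict-based representation are assumptions, and the brute-force check in the usage note simply confirms that both constraints together force $z$ to be true exactly when $D$ holds.

```python
def reify(z, coeffs, degree):
    """Return the pair (z => D, z <= D) for D = sum(coeffs[l] * l) >= degree.
    Literals are signed integers; z must be a fresh variable index."""
    total = sum(coeffs.values())
    # z => D:  degree * ~z + sum(a_i * l_i) >= degree
    forward = ({-z: degree, **coeffs}, degree)
    # z <= D:  k * z + sum(a_i * ~l_i) >= k,  with k = sum(a_i) - degree + 1
    k = total - degree + 1
    backward = ({z: k, **{-lit: a for lit, a in coeffs.items()}}, k)
    return forward, backward

def holds(constraint, assignment):
    """Evaluate a constraint under a total assignment {variable: bool}."""
    coeffs, degree = constraint
    value = sum(a for lit, a in coeffs.items() if assignment[abs(lit)] == (lit > 0))
    return value >= degree
```

As a usage example, `reify(4, {1: 1, 2: 1, 3: 1}, 2)` returns $2\overline{x}_4 + x_1 + x_2 + x_3 \ge 2$ and $2x_4 + \overline{x}_1 + \overline{x}_2 + \overline{x}_3 \ge 2$, which is the pattern used for counting variables later in the paper.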

The OLL Algorithm for Core-Guided MaxSAT Solving
We now proceed to discuss the core-guided MaxSAT solving in CGSS, which is based on the OLL algorithm [5,63], and describe the main heuristics used in efficient implementations of this algorithm. Given a MaxSAT instance $(F_{\mathrm{orig}}, O_{\mathrm{orig}})$, OLL takes an optimistic view and attempts to find an assignment satisfying $F_{\mathrm{orig}}$ in which $O_{\mathrm{orig}}$ equals its constant term (i.e., all literals in $\mathrm{lits}(O_{\mathrm{orig}})$ are false). If such a solution exists, it is clearly optimal. Otherwise, the solver will extract a core $K$, which is a clause such that (i) $K$ only contains objective literals, i.e., $\mathrm{lits}(K) \subseteq \mathrm{lits}(O_{\mathrm{orig}})$, and (ii) $F_{\mathrm{orig}}$ implies $K$, which means that any solution to $F_{\mathrm{orig}}$ has to set at least one literal in $\mathrm{lits}(K)$ to true. The cost $w(K, O_{\mathrm{orig}})$ of a core $K$ is the smallest coefficient in $O_{\mathrm{orig}}$ of any literal in $K$. The core is then used to reformulate the instance so that the reformulated objective provides a lower bound on the optimal cost of the instance, and the reformulation is done in such a way that the lower bound increases (exactly) with the cost of the core $K$ as defined above.
In more detail, the algorithm maintains a reformulated objective $O_{\mathrm{ref}}$ and a reformulated formula $F_{\mathrm{ref}}$ (initialized to $O_{\mathrm{orig}}$ and $F_{\mathrm{orig}}$, respectively) such that the (non-normalized) pseudo-Boolean constraint

$$O_{\mathrm{orig}} \ge O_{\mathrm{ref}} \qquad \text{(2)}$$

is satisfied by all solutions of $F_{\mathrm{ref}}$.
Note that the constraint (2), which we refer to as an objective reformulation constraint, implies that the constant term $LB$ of $O_{\mathrm{ref}}$ is a lower bound on the optimal cost. In each iteration, a SAT solver is queried for a solution $\alpha$ to $F_{\mathrm{ref}}$ with $O_{\mathrm{ref}}(\alpha) = LB$. If such an $\alpha$ exists, the constraint (2) yields that $O_{\mathrm{orig}}(\alpha) = LB$, and so $\alpha$ is a minimal-cost solution to $(F_{\mathrm{orig}}, O_{\mathrm{orig}})$. Otherwise, the solver returns a new core $K$ that requires at least one literal in $\mathrm{lits}(O_{\mathrm{ref}})$ to be set to 1. This implies that the optimal cost is strictly larger than $LB$, and the core $K$ is used for a new reformulation step.
The objective reformulation step adds new clauses to $F_{\mathrm{ref}}$ that define counting variables $y_{K,j}$ for $j = 2, \ldots, |K|$ such that $y_{K,j}$ is true if and only if at least $j$ literals in $K$ are true. Together with the core $K$, this yields the equality

$$\sum_{b \in \mathrm{lits}(K)} b = 1 + \sum_{j=2}^{|K|} y_{K,j},$$

where the additive 1 comes from the fact that at least one literal in $K$ has to be true, and the reformulation step is just applying this equality multiplied by $w(K, O_{\mathrm{ref}})$ to the objective $O_{\mathrm{ref}}$.

Notice that the variables added during objective reformulation can later be discovered in other cores. In practice, all implementations of OLL we are aware of encode the semantics of counting variables incrementally [60]. This means that initially only the variable $y_{K,2}$ is defined, and the variable $y_{K,i+1}$ is introduced only after $y_{K,i}$ is found in a core.
Implementations of OLL for MaxSAT (including the CGSS solver that we enhance with proof logging in this work) extend the algorithm with a number of heuristics such as stratification [6,58], hardening [6], the intrinsic-at-most-ones technique [46], weight-aware core extraction [13], and structure sharing [47].
Stratification extracts cores not over all literals in $O_{\mathrm{ref}}$ but only over those whose coefficient is above some bound $w_{\mathrm{strat}}$. This steers search toward cores containing literals with high coefficients, resulting in larger increases of $LB$. Once no more cores over such variables can be found, the algorithm lowers $w_{\mathrm{strat}}$, terminating only after no more cores can be found with $w_{\mathrm{strat}} = 1$. The fact that no more cores containing only variables with coefficients above $w_{\mathrm{strat}}$ exist is detected by the SAT solver returning a (possibly non-optimal) solution $\alpha$. The minimal cost $O_{\mathrm{orig}}(\alpha)$ of all such solutions gives an upper bound $UB$ on the optimal cost of the instance, allowing OLL to terminate as soon as $LB = UB$.
Hardening fixes an objective literal $\ell$ to 0 once $\mathrm{coeff}(O_{\mathrm{ref}}, \ell) + LB \ge UB$, since no solution that sets $\ell$ to true can then improve on the upper bound $UB$.

Weight-aware core extraction (WCE) delays objective reformulation, and the accompanying increase in new variables and clauses, for as long as possible. When a new core $K$ is extracted by a solver that uses WCE, initially only the coefficient of each $b \in \mathrm{lits}(K)$ is lowered and the lower bound $LB$ is increased by $w(K, O_{\mathrm{ref}})$. Then the SAT solver is invoked again with the literals that still have coefficients above $w_{\mathrm{strat}}$ in $O_{\mathrm{ref}}$ set to 0. When the SAT solver finds a satisfying assignment extending the assumptions, all objective reformulation steps are then performed at once. This is correct since the final effect is the same as if the cores had been discovered one by one, each immediately followed by objective reformulation. Notice that this core extraction loop is guaranteed to terminate since the coefficient of at least one variable is decreased to 0 for each new core.

Structure sharing is a recent extension to weight-aware core extraction that makes use of the potential overlap in cores detected in order to achieve more compact encodings of counting variable semantics.

Proof Logging for the OLL Algorithm for MaxSAT
We have now reached a point where we can describe the contribution of this work, namely how to add proof logging to an OLL-based core-guided MaxSAT solver, including all the state-of-the-art techniques described in Sect. 3.
In our proof logging routines we maintain the invariants described next. The reformulated objective $O_{\mathrm{ref}}$ is already implicitly tracked by the solver and at all times it is possible to derive that $O_{\mathrm{orig}} \ge O_{\mathrm{ref}}$ as in (2). We also keep track of the current upper bound $UB$ on $O_{\mathrm{orig}}$ and the best solution $\alpha_{\mathrm{best}}$ found so far. All cores that have been found and processed are in the set $\mathcal{K}$.
SAT Solver Calls. The CDCL SAT solvers used in core-guided MaxSAT algorithms can support DRAT proof logging, and since the proof format used by VeriPB is a strict extension of DRAT (modulo small and purely syntactical modifications) it is straightforward to provide proof logging for the part of the reasoning done in SAT solver calls, and to add all learned clauses to the proof checker database.
Each invocation of the SAT solver returns either a new solution $\alpha$ or a new core $K$. When a solution $\alpha$ with $O_{\mathrm{orig}}(\alpha) < UB$ is obtained, it is logged in the proof, which updates $UB$ to $O_{\mathrm{orig}}(\alpha)$ and adds the objective-improving constraint

$$O_{\mathrm{orig}} \le UB - 1 \qquad \text{(3a)}$$

(the normalized form of which we refer to as (3b)). A technical side remark is that later solutions with cost at least $UB$ cannot successfully be logged, since they violate the constraint (3a) added to the proof checker database, and so the proof logging routines make sure to only log solutions that improve the current upper bound.
If the SAT solver instead returns a new core $K$, this clause is guaranteed to be a reverse unit propagation (RUP) clause with respect to the set of clauses currently in the solver database, and so we can use the RUP rule to add $K$ to the proof checker database (which contains a superset of the clauses known by the solver). For our book-keeping, we also add $K$ to the set $\mathcal{K}$. A special case is that $K$ could be the contradictory empty clause, corresponding to the pseudo-Boolean constraint $0 \ge 1$. This means that there are no solutions to the formula.
To optimize the efficiency of proof verification, constraints should be deleted from the proof when they are no longer needed. Since SAT solver proofs are only used to prove unsatisfiability this does not cause any issues, but when certifying optimality we have to be careful in order not to create better-than-optimal solutions (which could happen if, e.g., constraints in the input formula are removed). The checked deletion rule [17] ensuring this in VeriPB does not have any analogue in DRAT, so some care is needed here when translating SAT solver proofs into the VeriPB format.
Incremental Totalizer with Structure Sharing. Different implementations of OLL for MaxSAT differ in which encoding is used for the counting variables introduced during objective reformulation [9,50,51]. The two solvers we consider use totalizers [9], so we start by explaining this encoding and then show how to provide proof logging for the clauses added to the proof checker database.
The totalizer encoding for a set $I = \{\ell_1, \ldots, \ell_n\}$ of literals is a CNF formula $T$ that defines counting variables $y_{I,j}$ for $j = 1, \ldots, n$ such that for any assignment that satisfies $T$ the variable $y_{I,j}$ is true if and only if $\sum_{i=1}^{n} \ell_i \ge j$. The structure of $T$ can be viewed as a binary tree, with literals in $I$ at the leaves and with each internal node $\eta$ associated with variables counting the true leaf literals in the subtree rooted at $\eta$. The variables $y_{I,j}$ are associated with the root of the tree.
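More formally, given a set of literals $I$, we construct a binary tree with leaves labelled by the literals in $I$. For every node $\eta$ of $T$, let $\mathrm{lits}(\eta)$ denote the leaves in the subtree rooted at $\eta$; where it is convenient, we will overload $I$ to also refer to the root node. For each internal node $\eta$, the totalizer encoding introduces the counting variables $S_\eta = \{y_{\eta,1}, \ldots, y_{\eta,|\mathrm{lits}(\eta)|}\}$, the meaning of which can be encoded recursively in terms of the variables $S_{\eta_1}$ and $S_{\eta_2}$ for the children $\eta_1$ and $\eta_2$ of $\eta$ by the (pseudo-Boolean form of the) clauses

$$C^{\Leftarrow}_{\eta}(\alpha, \beta, \sigma) \doteq \overline{y}_{\eta_1,\alpha} + \overline{y}_{\eta_2,\beta} + y_{\eta,\sigma} \ge 1 \qquad \text{(4a)}$$

$$C^{\Rightarrow}_{\eta}(\alpha, \beta, \sigma) \doteq y_{\eta_1,\alpha+1} + y_{\eta_2,\beta+1} + \overline{y}_{\eta,\sigma+1} \ge 1 \qquad \text{(4b)}$$

for all integers $\alpha, \beta, \sigma$ such that $\alpha + \beta = \sigma$ and $0 \le \alpha \le |\mathrm{lits}(\eta_1)|$, $0 \le \beta \le |\mathrm{lits}(\eta_2)|$, and $0 \le \sigma \le |\mathrm{lits}(\eta)|$. We use the notational conventions in (4a)-(4b) that $y_{\ell,1} = \ell$ for all leaves $\ell$, and that $y_{\eta,0} = 1$ and $y_{\eta,|\mathrm{lits}(\eta)|+1} = 0$ for all nodes $\eta$ (so that clauses containing $y_{\eta,0}$ or $y_{\eta,|\mathrm{lits}(\eta)|+1}$ can be simplified to binary clauses or be omitted when they are satisfied). The clauses $C^{\Rightarrow}_{\eta}(\alpha, \beta, \sigma)$ in (4b) are not necessarily added to the clause database of the MaxSAT solver, but are sometimes included for improved propagation.

We now turn to the question of how to derive the clauses (4a)-(4b) encoding the meaning of the counting variables $y_{I,j}$ in the proof. This is a two-step process. First, reified pseudo-Boolean (and, in general, non-clausal) constraints $C^{\Rightarrow}_{\mathrm{reif}}(y_{\eta,j})$ and $C^{\Leftarrow}_{\mathrm{reif}}(y_{\eta,j})$ as in (1a)-(1b), encoding that $y_{\eta,j}$ holds if and only if $\sum_{\ell \in \mathrm{lits}(\eta)} \ell \ge j$, are derived by redundance-based strengthening. Then the clauses added to the MaxSAT solver are derived from these pseudo-Boolean constraints. Although we omit the details due to space constraints, it is not hard to show that for any internal node $\eta$ with children $\eta_1$ and $\eta_2$, the clauses (4a)-(4b) can be derived from the reified constraints for the counting variables at $\eta$, $\eta_1$, and $\eta_2$ (for instance, $C^{\Leftarrow}_{\eta}(\alpha, \beta, \sigma)$ from $C^{\Leftarrow}_{\mathrm{reif}}(y_{\eta,\sigma})$, $C^{\Rightarrow}_{\mathrm{reif}}(y_{\eta_1,\alpha})$, and $C^{\Rightarrow}_{\mathrm{reif}}(y_{\eta_2,\beta})$) by standard cutting planes derivations as in [75]. In particular, the certification of these totalizers can be done incrementally: clauses in the encoding can be derived as the corresponding counter variables are lazily introduced in the OLL algorithm.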
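The recursive clause structure of the totalizer can be illustrated with a short Python generator. This is a sketch only: it builds the whole tree eagerly with a balanced split, whereas the solvers discussed here introduce counting variables lazily and may share subtrees; the function name `totalizer` and the variable numbering scheme are assumptions made for this example.

```python
def totalizer(lits, next_var):
    """Return (outputs, clauses, next_var) for a balanced totalizer over lits.
    outputs[j - 1] is the literal that is true iff at least j of lits are true.
    Literals are signed integers; fresh variables are numbered from next_var."""
    if len(lits) == 1:
        return [lits[0]], [], next_var       # a leaf counts itself
    mid = len(lits) // 2
    left, clauses_l, next_var = totalizer(lits[:mid], next_var)
    right, clauses_r, next_var = totalizer(lits[mid:], next_var)
    out = list(range(next_var, next_var + len(lits)))
    next_var += len(lits)
    clauses = clauses_l + clauses_r

    def y(node, j):
        """Counter j of a node, with the conventions y_0 = true, y_{n+1} = false."""
        if j == 0:
            return True
        if j > len(node):
            return False
        return node[j - 1]

    def neg(v):
        return (not v) if isinstance(v, bool) else -v

    def emit(*vals):
        if any(v is True for v in vals):
            return                           # clause already satisfied: omit it
        clauses.append([v for v in vals if v is not False])

    for a in range(len(left) + 1):
        for b in range(len(right) + 1):
            # y_{left,a} and y_{right,b} imply y_{node,a+b}
            emit(neg(y(left, a)), neg(y(right, b)), y(out, a + b))
            # y_{node,a+b+1} implies y_{left,a+1} or y_{right,b+1}
            emit(y(left, a + 1), y(right, b + 1), neg(y(out, a + b + 1)))
    return out, clauses, next_var
```

Calling `totalizer([1, 2, 3], 4)` allocates variables 4 and 5 for the inner node over literals 2 and 3 and variables 6, 7, 8 for the root, so that variable 7, for instance, is forced to be true exactly when at least two of the three inputs are true.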
This approach is also compatible with structure sharing, where subtrees of a previously constructed totalizer tree can be reused (to avoid doing the same work twice). The only constraints from a subtree rooted at $\eta^*$ that are needed when generating another totalizer encoding at a higher level are the constraints $C^{\Rightarrow}_{\mathrm{reif}}(y_{\eta^*,\sigma})$ and $C^{\Leftarrow}_{\mathrm{reif}}(y_{\eta^*,\sigma})$ defining the counter variables in the subtree root $\eta^*$. To decrease the memory usage of the proof checker, it can be useful to delete reification constraints from the proof once we know that they will no longer be needed. Without structure sharing, for an internal node $\eta$, once all clauses that mention $y_{\eta,j}$ are created, the constraints $C^{\Leftarrow}_{\mathrm{reif}}(y_{\eta,j})$ and $C^{\Rightarrow}_{\mathrm{reif}}(y_{\eta,j})$ will not be used anymore and can thus be deleted. On the other hand, structure sharing reuses as many counting variables as possible, even over multiple iterations of weight-aware core extraction. This means that $C^{\Leftarrow}_{\mathrm{reif}}(y_{\eta,j})$ and $C^{\Rightarrow}_{\mathrm{reif}}(y_{\eta,j})$ need to be retained, even after all clauses in the totalizer encoding for all parents of node $\eta$ have been created.
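Objective Reformulation. If counting variables $y_{K,i}$ for $i = 2, \ldots, s_K$ have been introduced for the core $K$, then the objective reformulation with respect to $K$ is derived with the help of the constraint

$$\sum_{b \in \mathrm{lits}(K)} b \ge 1 + \sum_{i=2}^{s_K} y_{K,i} \qquad \text{(5a)}$$

(or

$$\sum_{b \in \mathrm{lits}(K)} b + \sum_{i=2}^{s_K} \overline{y}_{K,i} \ge s_K \qquad \text{(5b)}$$

in normalized form). The constraint (5b) can in turn be obtained from the core clause $K$ and the reified constraints $C^{\Rightarrow}_{\mathrm{reif}}(y_{K,j})$. It is clear that this should be possible, since the latter constraints define the variables $y_{K,j}$ precisely so that (5b) should hold, and we refer to Algorithm 5 in [38] for the details. Also, each time a new counting variable $y_{K,j}$ is introduced for a core $K$, we add it to (5b) to maintain this constraint as an invariant.

To illustrate how this update works, suppose we have a core $K$ for which

$$\sum_{b \in \mathrm{lits}(K)} b + \sum_{i=2}^{s_K - 1} \overline{y}_{K,i} \ge s_K - 1$$

has already been derived. The next counting variable $y_{K,s_K}$ is introduced by the reification

$$s_K \cdot \overline{y}_{K,s_K} + \sum_{b \in \mathrm{lits}(K)} b \ge s_K.$$

The previous constraint is multiplied by $s_K - 1$ and added to the new reified constraint, yielding

$$s_K \sum_{b \in \mathrm{lits}(K)} b + (s_K - 1) \sum_{i=2}^{s_K - 1} \overline{y}_{K,i} + s_K \cdot \overline{y}_{K,s_K} \ge (s_K - 1)^2 + s_K,$$

and dividing by $s_K$ (rounding up) yields (5b), since $\lceil ((s_K - 1)^2 + s_K)/s_K \rceil = s_K$.

Proving Optimality. When the solver has found an optimal solution and established a matching lower bound, optimality is certified in the proof log using a proof by contradiction from the objective reformulation constraint $O_{\mathrm{orig}} \ge O_{\mathrm{ref}}$ in (2) and the (normalized form of the) objective-improving constraint $O_{\mathrm{orig}} \le UB - 1$ in (3b). If we add these two constraints and cancel like terms, we get

$$UB - 1 \ge O_{\mathrm{ref}}. \qquad \text{(6)}$$

Since we have $UB = LB$ when the optimal solution has been found, and since $O_{\mathrm{ref}}$ is the sum of its constant term $LB$ and literals with non-negative coefficients (so that $O_{\mathrm{ref}} \ge LB$), the constraint (6) can be simplified to the contradiction $0 \ge 1$.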
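As a worked instance of the optimality argument, take the values reached at the end of the worked-out example later in this section, namely $O_{\mathrm{ref}} = 5y_{K_1,2} + y_{K_2,2} + 6$ and $UB = LB = 6$. The derivation then reads as follows (the layout is ours; the numbers are those stated in the example):

```latex
\begin{align*}
O_{\mathrm{orig}} &\ge 5\,y_{K_1,2} + y_{K_2,2} + 6
  && \text{objective reformulation constraint, } LB = 6 \\
O_{\mathrm{orig}} &\le 5
  && \text{objective-improving constraint, } UB = 6 \\
5 &\ge 5\,y_{K_1,2} + y_{K_2,2} + 6
  && \text{sum of the two, } O_{\mathrm{orig}} \text{ cancels} \\
0 &\ge 5\,y_{K_1,2} + y_{K_2,2} + 1 \;\ge\; 1
  && \text{contradiction, since all literals are} \ge 0
\end{align*}
```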
Intrinsic At-Most-One Constraints. Certifying intrinsic at-most-one constraints for a set $S \subseteq \mathrm{lits}(O_{\mathrm{ref}})$ of literals requires deriving (i) the at-most-one constraint stating that at most one $b \in S$ is assigned to 0 by any solution and (ii) constraints defining the variable $\ell_S$. Such sets $S$ are detected by unit propagation that implicitly derives implications $\overline{b}_i \Rightarrow b_j$ in the form of binary clauses $b_i + b_j \ge 1$ for every pair of variables in $S$. In the proof log, all these binary clauses can be obtained by RUP steps, after which the at-most-one constraint $\sum_{b \in S} \overline{b} \le 1$ (which is $\sum_{b \in S} b \ge |S| - 1$ in normalized form) is derived by a standard cutting planes derivation (see, e.g., [24]).

Upper Bound Estimation. A final technical proof logging detail is that some implementations of the OLL algorithm for MaxSAT (including the Python-based version of CGSS) do not use the actual cost of the solution found by the SAT solver as the upper bound $UB$ when hardening. In order to avoid the overhead in Python of extracting the solution from the SAT solver, an upper bound estimate $UB_{\mathrm{est}}$ is computed instead based on the initial assignment passed to the SAT solver in the call. Since any valid estimate is at least the cost of the solution found (i.e., $UB_{\mathrm{est}} \ge UB$), hardening steps based on $UB_{\mathrm{est}}$ can be justified by first deriving $O_{\mathrm{orig}} \le UB_{\mathrm{est}} - 1$, which follows from the latest objective-improving constraint (3a). However, in order to handle solutions correctly in the proof, the proof logging routines need to extract the solution found by the solver and compute the actual cost, which means that a Python-based solver will not be able to avoid this overhead when running with proof logging.
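The standard cutting planes derivation of the normalized at-most-one constraint from the pairwise binary clauses can be sketched as follows. This is one textbook way to organize the derivation (extend the constraint by one variable at a time using multiplication, addition, and division), not necessarily the exact sequence of steps emitted by CGSS; the function name `derive_at_most_one_zero` is hypothetical.

```python
def derive_at_most_one_zero(variables):
    """From the clauses b_i + b_j >= 1 for all pairs, derive sum(b) >= |S| - 1.
    Returns (coeffs, degree); requires at least two variables."""
    # Start from the single clause b_1 + b_2 >= 1.
    coeffs, degree = {variables[0]: 1, variables[1]: 1}, 1
    for k in range(2, len(variables)):
        new = variables[k]
        # Multiply the constraint over the first k variables by k - 1 ...
        coeffs = {v: a * (k - 1) for v, a in coeffs.items()}
        degree *= k - 1
        # ... add the k clauses b_i + b_new >= 1 ...
        for v in variables[:k]:
            coeffs[v] += 1
            coeffs[new] = coeffs.get(new, 0) + 1
            degree += 1
        # ... and divide by k, rounding up: ceil((k*k - k + 1) / k) = k.
        coeffs = {v: -(-a // k) for v, a in coeffs.items()}
        degree = -(-degree // k)
    return coeffs, degree
```

For three variables this is the familiar step of summing the three clauses to $2b_1 + 2b_2 + 2b_3 \ge 3$ and dividing by 2 to obtain $b_1 + b_2 + b_3 \ge 2$.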
Worked-Out Example. We end this section with a complete, worked-out example of OLL solving and proof logging for a toy MaxSAT instance $(F, O)$, where the formula $F$ consists of three clauses over the variables $b_1, b_2, b_3, b_4, x$ and the objective is $O = 5b_1 + 5b_2 + b_3 + b_4$. After initialization, the internal SAT solver of the OLL algorithm is loaded with the clauses of $F$ and the proof consists of constraints (1)-(3) in Table 1. The OLL search begins by invoking the SAT solver on the clauses in $F$ in order to check the existence of any solutions. Assume the SAT solver returns the solution $\alpha_1$ assigning $b_1 = b_3 = b_4 = 1$ and $b_2 = x = 0$. This solution has objective value $O(\alpha_1) = O_{\mathrm{orig}}(\alpha_1) = 7$, so the algorithm updates $UB = 7$ and logs the objective-improving constraint (4) in Table 1, equivalent to $O_{\mathrm{orig}} \le 6$.
Assume the stratification bound $w_{\mathrm{strat}}$ is initialised to 2. Then the solver is invoked with $b_1 = b_2 = 0$ and returns the core $K_1 \doteq b_1 + b_2 \ge 1$, which is added to the proof as constraint (5). As already mentioned, core clauses are guaranteed to be RUP with respect to the set of clauses in the SAT solver database, which are also added to the proof.
For simplicity, we ignore WCE and structure sharing in this example, meaning that the solver next reformulates the objective based on $K_1$ by introducing clauses enforcing $y_{K_1,2} \Leftarrow (b_1 + b_2 \ge 2)$ for the new counting variable $y_{K_1,2}$. This is done by (i) introducing the pseudo-Boolean constraints (6) and (7) in Table 1 by reification, and (ii) deriving the clauses corresponding to these constraints. While the MaxSAT solver only uses the implication (6), the proof also requires (7). After this step the reformulated objective is $O_{\mathrm{ref}} = 5y_{K_1,2} + b_3 + b_4 + 5$ with $LB = 5$.

Since it now holds that $\mathrm{coeff}(O_{\mathrm{ref}}, y_{K_1,2}) + LB = 5 + 5 \ge 7 = UB$, the literal $y_{K_1,2}$ is hardened to 0. In order to certify this hardening step, i.e., derive $\overline{y}_{K_1,2} \ge 1$, the proof logger first derives the objective reformulation constraint $O_{\mathrm{orig}} \ge O_{\mathrm{ref}}$ as in (2). The objective-improving and objective reformulation constraints are then added together to get constraint (9), after which $\overline{y}_{K_1,2} \ge 1$ is obtained by a RUP step.
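The arithmetic behind this hardening step can be replayed in a few lines of Python. The sketch below rests on assumptions: the objective $5b_1 + 5b_2 + b_3 + b_4$ and the reformulated objective $5y_{K_1,2} + b_3 + b_4 + 5$ are inferred from the numbers quoted in the example, constraints are written in the non-normalized form "sum of coefficient times variable is at least a constant", and the helper names `can_harden`, `add_constraints`, and `forced_to_zero` are hypothetical.

```python
def can_harden(coeff, lb, ub):
    """Hardening condition used in the example: coeff(O_ref, l) + LB >= UB."""
    return coeff + lb >= ub

def add_constraints(c1, c2):
    """Add two constraints of the form sum(coeffs[v] * v) >= rhs over 0-1 variables."""
    coeffs = dict(c1[0])
    for v, a in c2[0].items():
        coeffs[v] = coeffs.get(v, 0) + a
    return {v: a for v, a in coeffs.items() if a != 0}, c1[1] + c2[1]

def forced_to_zero(constraint, var):
    """True if the constraint cannot be satisfied once var is set to 1."""
    coeffs, rhs = constraint
    # Best case for the left-hand side with var = 1 and all other variables free.
    best = coeffs.get(var, 0) + sum(max(a, 0) for v, a in coeffs.items() if v != var)
    return best < rhs

# Objective-improving constraint O_orig <= 6, written as -O_orig >= -6.
improving = ({"b1": -5, "b2": -5, "b3": -1, "b4": -1}, -6)
# Objective reformulation O_orig >= 5*y + b3 + b4 + 5, with like terms cancelled.
reformulation = ({"b1": 5, "b2": 5, "y": -5}, 5)
# Their sum: -5*y - b3 - b4 >= -1.
combined = add_constraints(improving, reformulation)
```

Here `forced_to_zero(combined, "y")` holds because setting $y_{K_1,2} = 1$ makes the left-hand side at most $-5$, which is the propagation that the RUP step in the proof relies on.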
The next SAT solver call with $b_3 = b_4 = 0$ returns as core the input clause $b_3 + b_4 \ge 1$, and reformulation (lines (11)-(13)) yields $O_{\mathrm{ref}} = 5y_{K_1,2} + y_{K_2,2} + 6$ with $LB = 6$. Now suppose the SAT solver finds the solution $\alpha_2$ with $b_2 = b_3 = x = 1$ and all other variables set to 0, resulting in the objective-improving constraint (15). Since $O_{\mathrm{orig}}(\alpha_2) = 6 = LB$, the solver terminates and reports $\alpha_2$ to be optimal. To certify that this is correct, another objective reformulation constraint (16) is derived, after which the contradictory constraint (17) is obtained by adding (15) and (16). This proves that solutions with cost less than 6 do not exist.

Experimental Evaluation
To evaluate the proof logging techniques developed in this paper, we have implemented them in the state-of-the-art MaxSAT solver CGSS [22,47], which uses the OLL algorithm and structure-sharing totalizers. We employed VeriPB [76], extended to parse MaxSAT instances in the standard WCNF format, to verify the certificates of correctness emitted by the certifying solver.
Our experiments were conducted on machines with an 11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60 GHz CPU and 16 GB of memory. Each benchmark ran exclusively on a single machine with a memory limit of 14 GB and a time limit of 3 600 s for solving with CGSS and 36 000 s for checking the certificates with VeriPB. As benchmarks we used all 594 weighted and 607 unweighted instances from the complete track of the MaxSAT Evaluation 2022 [61], where an instance $(F, O)$ is unweighted if all coefficients $\mathrm{coeff}(O, \ell)$ are equal. The data from our experiments can be found in [12].
Overhead of Proof Logging. To evaluate the overhead in solver running time, we compared the standard CGSS solver [23] without proof logging (but with the bug fixes discussed below) to CGSS with proof logging as described in this paper. With proof logging 803 instances are solved within the resource limits, which is 3 instances fewer than without proof logging (see Fig. 1). Adding proof logging slowed down CGSS by about 8.8% in the median over all solved instances. For 95% of the instances CGSS with proof logging was at most 36.2% slower. Thus, the proof logging overhead seems perfectly manageable and should present no serious obstacles to using proof logging in core-guided MaxSAT solvers.
Overhead of Proof Checking. To assess the efficiency of proof checking, we compared the running time of CGSS with proof logging to the time taken by VeriPB for checking the generated proofs. The instances that were not solved by CGSS within the resource limits were filtered out, since the running time for checking an incomplete proof is inconclusive.
VeriPB successfully checked the proofs for 747 out of the 803 instances solved by CGSS (see Fig. 2); 42 instances failed due to the memory limit and 14 instances failed due to the time limit.Checking the proof took about 3 times the solving time in the median for successfully checked instances.About 87% of the successfully checked instances were checked within 10 times the solving time.
Proof checking time compared to solver running time varies widely, but our experiments indicate that the performance of VeriPB is sufficient in most cases, and verification time scales linearly with the size of the proof for a majority of the instances.However, there is room to improve VeriPB, where focus so far has been on proof logging strength rather than performance.For the instances where checking is 100 times slower than solving, the main bottleneck is the proof generated by the SAT solver, which could be addressed by standard techniques for checking DRAT proofs, and checking logged solutions (when objective improving constraints (3a) are added) could also be implemented more efficiently.
Bugs Discovered by Proof Logging. Our work on implementing proof logging in CGSS led to the discovery of two bugs, which were also present in the solver RC2 on which CGSS is based, but have now been fixed in CGSS in commit 5526d04 and in RC2 in commit d0447c3. The bugs are due to a slightly different implementation of OLL compared to the description in Sect. 3.
First, when a counting variable $y_{K_{\mathrm{old}},i}$ for a core $K_{\mathrm{old}}$ appears for the first time in a later core $K_{\mathrm{new}}$, the next counting variable $y_{K_{\mathrm{old}},i+1}$ is added to the reformulated objective with coefficient $w(K_{\mathrm{new}}, O_{\mathrm{new}})$ rather than $w(K_{\mathrm{old}}, O_{\mathrm{old}})$. The coefficient of $y_{K_{\mathrm{old}},i+1}$ is then further increased when $y_{K_{\mathrm{old}},i}$ is found in future cores. Second, rather than computing the upper bound $\mathrm{UB}$ from an actual solution, CGSS uses a weaker estimate $\mathrm{UB}_{\mathrm{est}}$ obtained by summing the current lower bound and the coefficients of all literals $b$ where $\mathrm{coeff}(O_{\mathrm{ref}}, b) < w_{\mathrm{strat}}$ (meaning that these literals were not set to 0 in the SAT solver call, and so could potentially be true in the solution).
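The estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary representation of the reformulated objective, and the numbers in the usage lines are hypothetical and are not taken from CGSS or from Table 2.

```python
def estimate_upper_bound(lower_bound, ref_coeffs, w_strat):
    """Hypothetical helper: LB plus the coefficients of all literals
    whose coefficient in the reformulated objective is below w_strat."""
    return lower_bound + sum(c for c in ref_coeffs.values() if c < w_strat)


# Hypothetical reformulated objective (literal name -> coefficient).
example_coeffs = {"x": 1, "y": 1, "z": 7}
# Only x and y fall below the stratification bound 2, so the estimate is 10 + 1 + 1.
example_estimate = estimate_upper_bound(10, example_coeffs, w_strat=2)
```

Such an estimate is only reliable if the coefficients are correct and every literal at or above the bound really was assumed false in the solver call, which are the two conditions the bugs violated.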
The bugs we detected could lead to the solver producing an overly optimistic estimate $\mathrm{UB}_{\mathrm{est}} < \mathrm{UB}$. The first way this can happen is when the contributions of counting variables $y_{K,k}$ in the reformulated objective are underestimated due to too small coefficients. The second is when the coefficient of $y_{K_{\mathrm{old}},i+1}$ is first lowered below $w_{\mathrm{strat}}$ and then raised above this threshold again when $y_{K_{\mathrm{old}},i}$ is found in a core. Then CGSS fails to assume $y_{K_{\mathrm{old}},i+1} = 0$ in future solver calls. These bugs can result in erroneous hardening, as detailed in Example 1.

Table 2 displays a possible CGSS run for the instance of Example 1, except that for simplicity we assume one core extraction per iteration and no use of any other heuristics. The upper half of the table lists the variables set to 0 in solver calls, the extracted core, and the lower bound derived from it. The lower half of the table provides the reformulated objective. Even though the coefficient of $y_{K_1,3}$ is increased to 8 after the fourth core, this variable is not set to 0 in subsequent iterations, which allows the solver to finish the stratification level after extracting 6 cores with a solution that sets to true the variables $b_1, b_2, b_3, b_5, e_4, o_1, o_2, y_{K_2,2}$ and $y_{K_1,i}$ for $i = 1, \ldots, 4$, and all other variables to false. The cost of this solution is 45. Now CGSS would incorrectly estimate $\mathrm{UB}_{\mathrm{est}} = \mathrm{LB} + 4 = 28$, since $y_{K_1,3}$ and $y_{K_2,2}$ (abbreviated as $y_{1,3}$ and $y_{2,2}$ in the table) both have coefficient 1 in the current reformulated objective. This is lower than the cost 45 of the solution found (and even than the optimum 36), and erroneously allows hardening (which considers $y_{K_1,3}$ with the correct coefficient 8) to fix $y_{K_1,3} = 0$, even though $b_1$, $b_2$ and $b_3$ (and hence also $y_{K_1,3}$) are true in every minimal-cost solution.
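The costs quoted for this instance can be checked by brute force. The sketch below enumerates all assignments to the twelve original variables of Example 1. It assumes that the first clause of $F$ is the disjunction of $b_1, \ldots, b_5$ (the big operator is lost in the extracted text) and that every $b_i$ has weight 10; all function and variable names are illustrative.

```python
from itertools import product

B = [f"b{i}" for i in range(1, 6)]
E = [f"e{i}" for i in range(1, 6)]
VARS = B + E + ["o1", "o2"]

# Objective weights of Example 1 (weight 10 for every b_i is an assumption).
WEIGHTS = {**{b: 10 for b in B},
           "e1": 11, "e2": 14, "e3": 11, "e4": 3, "e5": 2, "o1": 1, "o2": 1}


def satisfies(a):
    """Hard clauses: (b1 v ... v b5), (o1 v o2), and (b_i v e_i) for each i."""
    return (any(a[b] for b in B)
            and (a["o1"] or a["o2"])
            and all(a[b] or a[e] for b, e in zip(B, E)))


def cost(a):
    """Sum of the weights of all variables set to true."""
    return sum(w for v, w in WEIGHTS.items() if a[v])


def optimal_solutions():
    """Return the minimum cost and all satisfying assignments attaining it."""
    best, solutions = None, []
    for bits in product([False, True], repeat=len(VARS)):
        a = dict(zip(VARS, bits))
        if not satisfies(a):
            continue
        c = cost(a)
        if best is None or c < best:
            best, solutions = c, [a]
        elif c == best:
            solutions.append(a)
    return best, solutions


# Solution found in the illustrated run, restricted to the original variables.
run_solution = {v: v in {"b1", "b2", "b3", "b5", "e4", "o1", "o2"} for v in VARS}
optimum, optima = optimal_solutions()
```

Under these assumptions the enumeration gives an optimum of 36, a cost of 45 for the solution from the run, and $b_1$, $b_2$, $b_3$ true in every optimal solution, in line with the figures stated above.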
In our computational experiments there were cases of faulty hardening, but all incorrectly fixed values happened to agree with some optimal solution and so we never observed incorrect results. Proof logging detected the problem, however, since the derivations of the buggy hardening steps failed during proof checking. Interestingly, what proof logging did not turn up was any example of a mistaken claim $O_{\mathrm{orig}} \le \mathrm{UB}_{\mathrm{est}} - 1$ when the cost of a found solution was estimated. The issue with mistaken estimates due to faulty stratification was instead discovered while analyzing and fixing the hardening bug. The moral of this is that even if all results are certified as correct, this does not certify that the code is free from bugs that have not yet manifested themselves. However, proof logging still guarantees that even if the solver has undiscovered bugs, we can always trust computed results for which the accompanying proofs pass verification.

Concluding Remarks
In this work, we develop pseudo-Boolean proof logging techniques for core-guided MaxSAT solving and implement them in the solver CGSS [47] with support for the full range of sophisticated reasoning techniques it uses. To the best of our knowledge, this is the first time a state-of-the-art MaxSAT solver has been enhanced to output machine-verifiable proofs of correctness. We have made a thorough evaluation on benchmarks from the MaxSAT Evaluation 2022 using the VeriPB proof checker [17,42], and find that proof logging overhead is perfectly manageable and that proof verification time, while leaving room for improvement, is definitely practically feasible. Our work also showcases the benefit of proof logging as a debugging tool: erroneous proofs produced by CGSS revealed two subtle bugs in the solver that previous extensive testing had failed to uncover.
Regarding proof verification time, further investigation is needed into the rare cases where verification is much slower (say, more than a factor 10) than solving. There are reasons to believe, though, that this is not a problem of MaxSAT proof logging per se, but rather is explained by features not yet added to VeriPB, which is a tool currently undergoing very active development. So far, the proof checker has been optimized for other types of reasoning than the clausal reverse unit propagation (RUP) steps that dominate SAT proofs. Also, VeriPB lacks the ability to trim proofs during checking as in [44]. Finally, introducing a binary proof format in addition to plain-text proofs would be another way to boost performance of proof checking. But these are matters of engineering rather than research, and can be taken care of once the proof logging technology as such has been developed and has proven its worth.
The focus of this work is on core-guided MaxSAT solving, but we would like to extend our techniques to solvers using linear SAT-UNSAT (LSU) solving (such as Pacose [68]) and implicit hitting set (IHS) search (such as MaxHS [28,29]). Although there are certainly nontrivial technical challenges that will need to be overcome, we are optimistic that our work paves the way towards a unified proof logging system for the full range of modern MaxSAT solving approaches. Going beyond MaxSAT, it would also be interesting to extend VeriPB proof logging to pseudo-Boolean solvers using core-guided search [30] or IHS [73,74], and perhaps even to similar techniques in constraint programming [36] and answer set programming [5].
Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Hardening fixes literals in $O_{\mathrm{ref}}$ to 0 based on information provided by the current upper and lower bounds $\mathrm{UB}$ and $\mathrm{LB}$. If for any $b \in \mathrm{lits}(O_{\mathrm{ref}})$ it holds that $\mathrm{coeff}(O_{\mathrm{ref}}, b) + \mathrm{LB} > \mathrm{UB}$, then any solution $\alpha$ with $b = 1$ would have higher cost than the current best solution known, and would thus not be optimal.

The intrinsic-at-most-one technique identifies subsets $S \subseteq \mathrm{lits}(O_{\mathrm{ref}})$ of objective literals such that $\sum_{b \in S} \bar{b} \le 1$ is implied, i.e., any solution can assign at most one literal in $S$ to 0. This is used both to increase the lower bound and to reformulate the objective. If we let $w_{\min} = \min\{\mathrm{coeff}(O_{\mathrm{ref}}, b) : b \in S\}$, then $S$ implies a lower bound increase of $\mathrm{LB}_S = (|S| - 1) \cdot w_{\min}$. Additionally, we define a new variable $\ell_S$ by the clause $\ell_S + \sum_{b \in S} \bar{b} \ge 1$ to indicate if in fact all literals in $S$ are true, and introduce it in the reformulated objective with coefficient $w_{\min}$. This means that we remove the already known lower bound $\mathrm{LB}_S$ from $O_{\mathrm{ref}}$ and transfer the possible additional cost $w_{\min}$ from $S$ to the variable $\ell_S$.
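The at-most-one reformulation can be illustrated with a small Python sketch. It assumes the reformulated objective is stored as a dictionary from literal names to coefficients, and that the new variable (called `l_S` here) is true exactly when all literals in the set are true; these names and the sample coefficients are hypothetical. The check compares the objective value before and after reformulation on every assignment that sets at most one literal of the set to 0.

```python
from itertools import product


def apply_at_most_one(ref_coeffs, lower_bound, subset, new_var):
    """Subtract w_min from each literal in subset, add (|S| - 1) * w_min
    to the lower bound, and add new_var with coefficient w_min."""
    w_min = min(ref_coeffs[b] for b in subset)
    new_coeffs = dict(ref_coeffs)
    for b in subset:
        new_coeffs[b] -= w_min
        if new_coeffs[b] == 0:
            del new_coeffs[b]
    new_coeffs[new_var] = w_min
    return new_coeffs, lower_bound + (len(subset) - 1) * w_min


def evaluate(coeffs, constant, assignment):
    """Objective value: constant plus coefficients of the true literals."""
    return constant + sum(c for lit, c in coeffs.items() if assignment[lit])


def values_agree(ref_coeffs, lower_bound, subset, new_var="l_S"):
    """Check equality of old and new objective on all assignments with at
    most one false literal in subset (ref_coeffs may mention only subset)."""
    new_coeffs, new_lb = apply_at_most_one(ref_coeffs, lower_bound, subset, new_var)
    for bits in product([False, True], repeat=len(subset)):
        if bits.count(False) > 1:
            continue  # excluded by the at-most-one constraint
        assignment = dict(zip(subset, bits))
        assignment[new_var] = all(bits)  # assumed meaning of the new variable
        old_value = evaluate(ref_coeffs, lower_bound, assignment)
        if old_value != evaluate(new_coeffs, new_lb, assignment):
            return False
    return True


# Hypothetical objective 3*b1 + 5*b2 + 4*b3 with S = {b1, b2, b3}.
demo_coeffs, demo_lb = apply_at_most_one({"b1": 3, "b2": 5, "b3": 4}, 0,
                                         ["b1", "b2", "b3"], "l_S")
```

In this hypothetical case $w_{\min} = 3$, so the lower bound increases by $2 \cdot 3 = 6$ and the remaining objective is $2 b_2 + b_3 + 3 \ell_S$.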
The reified constraints $\ell_S \Leftarrow \sum_{b \in S} b \ge |S|$ and $\ell_S \Rightarrow \sum_{b \in S} b \ge |S|$ defining the variable $\ell_S$ (which are $\ell_S + \sum_{b \in S} \bar{b} \ge 1$ and $\bar{\ell}_S + \sum_{b \in S} b \ge |S|$, respectively, in normalized form) are derived by redundance-based strengthening. Note that the latter constraint does not exist in the MaxSAT solver, but we need it in the proof in order to derive the objective reformulation for the at-most-one constraint.

Hardening. Formally, hardening corresponds to deriving $\bar{b} \ge 1$ in the proof for some literal $b \in \mathrm{lits}(O_{\mathrm{ref}})$ for which $\mathrm{UB} < \mathrm{LB} + \mathrm{coeff}(O_{\mathrm{ref}}, b)$ holds. Such an inequality $\bar{b} \ge 1$ is implied by RUP if we first derive the constraint (6), since assigning $b = 1$ makes (6) contradictory.
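The hardening condition itself is a one-line test, sketched below in Python. The function name and the dictionary representation are illustrative assumptions. The usage lines reuse the numbers quoted in the bug discussion (lower bound 24, correct coefficient 8 for the counting variable written `y_K1_3` here, faulty estimate 28, actual solution cost 45) to show how an overly optimistic upper bound triggers hardening that the true bound does not justify.

```python
def hardenable(ref_coeffs, lower_bound, upper_bound):
    """Literals b with coeff(O_ref, b) + LB > UB; each may be fixed to 0."""
    return sorted(b for b, c in ref_coeffs.items()
                  if c + lower_bound > upper_bound)


# Coefficients as seen by hardening in the bug example (names are illustrative).
coeffs = {"y_K1_3": 8, "y_K2_2": 1}

# With the faulty estimate 28, 8 + 24 = 32 > 28, so y_K1_3 is hardened.
with_faulty_estimate = hardenable(coeffs, lower_bound=24, upper_bound=28)

# With the cost 45 of the solution actually found, nothing can be hardened.
with_solution_cost = hardenable(coeffs, lower_bound=24, upper_bound=45)
```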
Conveniently, in this toy example $y_{K_1,2} \Leftarrow (b_1 + b_2 \ge 2)$ is already the clause $\bar{b}_1 + \bar{b}_2 + y_{K_1,2} \ge 1$, so step (ii) is not needed. For the general case, we derive totalizer clauses as explained in Sect. 4. Conceptually, we now replace $5 b_1 + 5 b_2$ by $5 y_{K_1,2} + 5$ to obtain the reformulated objective $O_{\mathrm{ref}} = b_3 + b_3 + 5 y_{K_1,2} + 5$ with lower bound $\mathrm{LB} = 5$. The core $K_1$ says that at least one of $b_1$ and $b_2$ must be true, thus incurring a cost of 5, and $y_{K_1,2}$ is added to the objective to indicate if both of them incur cost.
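The replacement of $5 b_1 + 5 b_2$ by $5 y_{K_1,2} + 5$ can be verified by a short case analysis. The derivation below is a sketch that assumes the core constraint $b_1 + b_2 \ge 1$ holds and that $y_{K_1,2}$ is true exactly when $b_1 + b_2 \ge 2$.

```latex
% Assumptions: b_1 + b_2 \ge 1 (core K_1), and y_{K_1,2} = 1 iff b_1 + b_2 \ge 2.
\begin{align*}
(b_1, b_2) = (1, 0) \text{ or } (0, 1):\quad & b_1 + b_2 = 1 = 1 + 0 = 1 + y_{K_1,2}, \\
(b_1, b_2) = (1, 1):\quad & b_1 + b_2 = 2 = 1 + 1 = 1 + y_{K_1,2}, \\
\text{hence in all allowed cases:}\quad & 5 b_1 + 5 b_2 = 5\,(1 + y_{K_1,2}) = 5 y_{K_1,2} + 5 .
\end{align*}
```

The case $(b_1, b_2) = (0, 0)$ is excluded by the core, which is why the constant 5 can be moved into the lower bound.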

Fig. 1. Running time of CGSS with and without proof logging.

Fig. 2. CGSS running time compared to time required for proof checking.

Example 1. Given a MaxSAT instance $(F, O)$ with $F = \{(\bigvee_{i=1}^{5} b_i), (o_1 \lor o_2)\} \cup \{b_i \lor e_i \mid i = 1, \ldots, 5\}$ and $O = \sum_{i=1}^{5} 10 \cdot b_i + 11 \cdot e_1 + 14 \cdot e_2 + 11 \cdot e_3 + 3 \cdot e_4 + 2 \cdot e_5 + o_1 + o_2$, assume the stratification bound is $w_{\mathrm{strat}} = 2$.

… and $\rho$ satisfies $F$ if it satisfies all $C \in F$, in which case $F$ is satisfiable. A formula lacking satisfying assignments is unsatisfiable. We say that $F$ implies $C$, denoted $F \models C$, if any assignment satisfying $F$ also satisfies $C$. An objective $O \doteq \sum_i w_i \ell_i + M$ is an affine function over literals $\ell_i$ to be minimized by (total) assignments $\alpha$ satisfying $F$. The value (or cost) of an objective $O$ under such an $\alpha$, which we refer to as a solution, is $O$ …

… $|K|$. The new variables $y_{K,k}$ are added to $O_{\mathrm{ref}}$ with coefficient $w(K, O_{\mathrm{ref}})$ equalling the cost of $K$, and the coefficient in $O_{\mathrm{ref}}$ of each literal in $K$ is decreased by the same amount. Finally, the lower bound $\mathrm{LB}$ (the constant term of $O_{\mathrm{ref}}$) is also increased by $w(K, O_{\mathrm{ref}})$. Since $y_{K,k}$ encodes that at least $k$ literals in $K$ are true, we have the equality …

… which is the desired updated constraint. For a set of extracted cores $\mathcal{K}$, we can derive the objective reformulation constraint $O_{\mathrm{orig}} \ge O_{\mathrm{ref}}$ by multiplying (5b) for each $K \in \mathcal{K}$ by the cost $w(K, O_{\mathrm{ref}})$ of $K$ and summing up all these multiplied constraints. The fact that we have an inequality $O_{\mathrm{orig}} \ge O_{\mathrm{ref}}$ rather than an equality is due to the incremental use of totalizers. More specifically, if $s_K = |\mathrm{lits}(K)|$ held for every $K \in \mathcal{K}$, it would be possible to derive $O_{\mathrm{orig}} = O_{\mathrm{ref}}$ instead. Here we would like to stress one subtlety for developing proof logging for OLL: as the algorithm progresses and more output variables of totalizers are introduced (i.e., the counters $s_K$ increase), the reformulated objective potentially also increases. Because of the counting variables added when $s_K$ increases, we have the inequality $O_{\mathrm{orig}} \ge O^{\mathrm{new}}_{\mathrm{ref}} \ge O^{\mathrm{old}}_{\mathrm{ref}}$. For this reason, the old constraint $O_{\mathrm{orig}} \ge O^{\mathrm{old}}_{\mathrm{ref}}$ cannot be used to derive $O_{\mathrm{orig}} \ge O^{\mathrm{new}}_{\mathrm{ref}}$ after objective reformulation. Instead, we have to derive $O_{\mathrm{orig}} \ge O_{\mathrm{ref}}$ from scratch each time the solver argues with the reformulated objective. For doing this we need to have access to the entire set $\mathcal{K}$ of cores.

Table 1. Example proof produced by a certified OLL solver.

Table 2. Illustration of discovered bug (where $y_{i,k}$ should be read as $y_{K_i,k}$).