
1 Introduction

Building heavily on the success of Boolean satisfiability (SAT) solving [13], maximum satisfiability (MaxSAT) as the optimization extension of SAT constitutes a viable approach to solving real-world NP-hard optimization problems [6, 35]. In the context of SAT, the study of fundamental aspects of clause redundancy [20, 21, 23, 28, 29, 31, 32] has proven central for developing novel types of preprocessing and inprocessing-style solving techniques [24, 29] as well as for enabling efficient proof checkers [7, 15, 16, 18, 19, 41, 42] via succinct representation of most practical SAT solving techniques. Furthermore, clause redundancy notions have been shown to give rise to very powerful proof systems, going far beyond resolution [22, 23, 30]. In contrast to viewing clause redundancy through the lens of logical entailment, the redundancy criteria developed in this line of work are based on a semantic implication relationship between formulas, making them desirably efficient to decide while at the same time guaranteeing merely the preservation of satisfiability rather than logical equivalence.

The focus of this work is the study of clause redundancy in the context of MaxSAT through lifting recently-proposed variants of the notion of propagation redundancy [23], based on a semantic implication relationship between formulas, from the realm of SAT. The study of such liftings is motivated from several perspectives. Firstly, it has been shown earlier that SRAT [10], a natural MaxSAT-lifting of the notion of resolution asymmetric tautologies (RAT) [29], allows for establishing the general correctness of MaxSAT-liftings of typical preprocessing techniques in SAT solving [14], alleviating the need for correctness proofs for individual preprocessing techniques [8]. However, the need for preserving the optimal cost in MaxSAT (as a natural counterpart for preserving satisfiability in SAT) allows for developing MaxSAT-centric preprocessing and solving techniques which cannot be expressed through SRAT [2, 11]. Capturing such cost-aware techniques more generally requires developing more expressive notions of clause redundancy. Secondly, due to the fundamental connections between solutions and so-called minimal correction sets (MCSes) of MaxSAT instances [8, 25], analyzing the effect that clauses which are redundant in terms of expressive redundancy notions have on the MCSes of MaxSAT instances can provide further understanding of the relationship between the different notions and their fundamental impact on the solutions of MaxSAT instances. Furthermore, in analogy with SAT, more expressive redundancy notions may prove fruitful for developing further practical preprocessing and solving techniques for MaxSAT.

Our main contributions are the following. We propose natural liftings of the three recently-proposed variants PR, LPR and SPR of propagation redundancy in the context of SAT to MaxSAT. We provide a complete characterization of the relative expressiveness of the lifted notions CPR, CLPR and CSPR (C standing for cost) and of their impact on the set of MCSes of MaxSAT instances. In particular, while removing or adding clauses that are redundant in terms of CSPR and CLPR (the latter shown to be equivalent to SRAT) does not influence the set of MCSes underlying a MaxSAT instance, CPR can in fact have an influence on MCSes. In terms of solutions, this result implies that CSPR or CLPR clauses cannot remove minimal (in terms of sum-of-weights of falsified soft clauses) solutions of MaxSAT instances, while CPR clauses can.

The theoretically greater effect that CPR clauses can have on the solutions of MaxSAT instances is key for succinctly expressing further MaxSAT reasoning techniques via CPR and allows for obtaining correctness proofs for a very general range of MaxSAT reasoning techniques in a uniform way; we give concrete examples of how CPR captures techniques not in the reach of SRAT. Bridging to practical preprocessing in MaxSAT, we also provide a new MaxSAT preprocessor extended with such techniques. Finally, we provide large-scale empirical evidence on the positive impact of the preprocessor on the runtimes of various modern MaxSAT solvers, covering both complete and incomplete approaches, suggesting that integrating extensive preprocessing going beyond the scope of SRAT is beneficial for speeding up modern MaxSAT solvers.

An extended version of this paper, with formal proofs missing from this version, is available via the authors’ homepages.

2 Preliminaries

SAT. For a Boolean variable x there are two literals, the positive x and the negative \(\lnot x\), with \(\lnot \lnot l = l\) for a literal l. A clause C is a set (disjunction) of literals and a CNF formula \(F\) a set (conjunction) of clauses. We assume that all clauses are non-tautological, i.e., do not contain both a literal and its negation. The set \(\mathtt {var}(C) = \{ x \mid x \in C \text { or } \lnot x \in C\}\) consists of the variables of the literals in C. The set of variables and literals, respectively, of a formula are \(\mathtt {var}(F) = \bigcup _{C \in F} \mathtt {var}(C)\) and \(\mathtt {lit}(F) = \bigcup _{C \in F} C\), respectively. For a set L of literals, the set \(\lnot L = \{\lnot l \mid l \in L\}\) consists of the negations of the literals in L.

A (truth) assignment \(\tau \) is a set of literals for which \(x \notin \tau \) or \(\lnot x \notin \tau \) for any variable x. For a literal l we denote \(l \in \tau \) by \(\tau (l) = 1\) and \(\lnot l \in \tau \) by \(\tau (l) = 0\) or \(\tau (\lnot l)=1\) as convenient, and say that \(\tau \) assigns l the value 1 and 0, respectively. The set \(\mathtt {var}(\tau ) = \{ x \mid x \in \tau \text { or } \lnot x \in \tau \}\) is the range of \(\tau \), i.e., it consists of the variables \(\tau \) assigns a value for. For a set L of literals and an assignment \(\tau \), the assignment \(\tau _L = (\tau \setminus \lnot L ) \cup L\) is obtained from \(\tau \) by setting \(\tau _L(l) = 1\) for all \(l \in L\) and \(\tau _L(l)=\tau (l)\) for all \(l \notin L\) assigned by \(\tau \). For a literal l, \(\tau _l\) stands for \(\tau _{\{l\}}\). An assignment \(\tau \) satisfies a clause C (\(\tau (C) = 1\)) if \(\tau \cap C \ne \emptyset \) or equivalently if \(\tau (l) = 1\) for some \(l \in C\), and a CNF formula \(F\) (\(\tau (F) = 1\)) if it satisfies each clause \(C \in F\). A CNF formula is satisfiable if there is an assignment that satisfies it, and otherwise unsatisfiable. The empty formula \(\top \) is satisfied by any truth assignment and the empty clause \(\bot \) is unsatisfiable. The Boolean satisfiability problem (SAT) asks to decide whether a given CNF formula \(F\) is satisfiable.

Given two CNF formulas \(F_1\) and \(F_2\), \(F_1\) entails \(F_2\) (\(F_1 \models F_2\)) if any assignment \(\tau \) that satisfies \(F_1\) and only assigns variables of \(F_1\) (i.e. for which \(\mathtt {var}(\tau ) \subset \mathtt {var}(F_1)\)) can be extended into an assignment \(\tau ^2 \supset \tau \) that satisfies \(F_2\). The formulas are equisatisfiable if \(F_1\) is satisfiable iff \(F_2\) is. An assignment \(\tau \) is complete for a CNF formula \(F\) if \(\mathtt {var}(F) \subset \mathtt {var}(\tau )\), and otherwise partial for \(F\). The restriction \({ F \big |_{\tau } }\) of \(F\) wrt a partial assignment \(\tau \) is a CNF formula obtained by (i) removing from \(F\) all clauses that are satisfied by \(\tau \) and (ii) removing from the remaining clauses of \(F\) literals l for which \(\tau (l) = 0\). Applying unit propagation on \(F\) refers to iteratively restricting \(F\) by \(\tau = \{l\}\) for a unit clause (clause with a single literal) \((l) \in F\) until the resulting (unique) formula, denoted by \(\mathtt {UP}(F)\), contains no unit clauses or some clause in \(F\) becomes empty. We say that unit propagation on \(F\) derives a conflict if \(\mathtt {UP}(F)\) contains the empty clause. The formula \(F_1\) implies \(F_2\) under unit propagation (\(F_1 \vdash _1F_2\)) if, for each \(C \in F_2\), unit propagation derives a conflict in \(F_1 \wedge \{ (\lnot l) \mid l \in C\}\). Note that \(F_1 \vdash _1F_2\) implies \(F_1 \models F_2\), but not vice versa in general.
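To make the restriction and unit-propagation definitions above concrete, the following minimal Python sketch implements \({ F \big |_{\tau } }\), \(\mathtt {UP}(F)\) and the relation \(\vdash _1\). Clauses are represented as frozensets of non-zero integers with negation by sign; all function names are our own illustrative choices and not part of any existing tool.

```python
# Minimal sketch (our own naming): clauses are frozensets of non-zero ints, -v negates v.

def restrict(formula, tau):
    """Return F|_tau: drop clauses satisfied by tau, remove literals falsified by tau."""
    result = set()
    for clause in formula:
        if clause & set(tau):                      # some literal of the clause is set to 1
            continue
        result.add(frozenset(l for l in clause if -l not in tau))
    return result

def unit_propagate(formula):
    """Return UP(F): restrict by unit clauses until none remain or the empty clause appears."""
    formula = set(formula)
    while True:
        if frozenset() in formula:                 # empty clause: conflict derived
            return formula
        units = [next(iter(c)) for c in formula if len(c) == 1]
        if not units:
            return formula
        formula = restrict(formula, {units[0]})

def implies_up(f1, f2):
    """F1 |-1 F2: for each C in F2, UP derives a conflict on F1 extended with the negations of C."""
    return all(frozenset() in unit_propagate(set(f1) | {frozenset([-l]) for l in c})
               for c in f2)
```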

Maximum Satisfiability. An instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {F}_S, w)\) of (weighted partial) maximum satisfiability (MaxSAT for short) consists of two CNF formulas, the hard clauses \(\mathcal {F}_H\) and the soft clauses \(\mathcal {F}_S\), and a weight function \(w :\mathcal {F}_S\rightarrow \mathbb {N}\) that assigns a positive weight to each soft clause.

Without loss of generality, we assume that every soft clause \(C \in \mathcal {F}_S\) is unit. The set of blocking literals \(\mathcal {B}(\mathcal {F}) = \{l \mid (\lnot l) \in \mathcal {F}_S\}\) consists of the literals l the negation of which occurs in \(\mathcal {F}_S\). The weight function w is extended to blocking literals by \(w(l) = w( (\lnot l) )\). Without loss of generality, we also assume that \(l \in \mathtt {lit}(\mathcal {F}_H)\) for all \(l \in \mathcal {B}(\mathcal {F})\). Instead of using the definition of MaxSAT in terms of hard and soft clauses, we will from now on view a MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) as a set \(\mathcal {F}_H\) of hard clauses, a set \(\mathcal {B}(\mathcal {F})\) of blocking literals and a weight function \(w :\mathcal {B}(\mathcal {F}) \rightarrow \mathbb {N}\).

Any complete assignment \(\tau \) over \(\mathtt {var}(\mathcal {F}_H)\) that satisfies \(\mathcal {F}_H\) is a solution to \(\mathcal {F}\). The cost \(\mathtt {COST}(\mathcal {F}, \tau ) = \sum _{l \in \mathcal {B}(\mathcal {F})} \tau (l) w(l)\) of a solution \(\tau \) is the sum of weights of the blocking literals it assigns to 1. The cost of a complete assignment \(\tau \) that does not satisfy \(\mathcal {F}_H\) is defined as \(\infty \). The cost of a partial assignment \(\tau \) over \(\mathtt {var}(\mathcal {F}_H)\) is defined as the smallest cost over all complete assignments that extend \(\tau \). A solution \(\tau ^o\) is optimal if \(\mathtt {COST}(\mathcal {F}, \tau ^o) \le \mathtt {COST}(\mathcal {F}, \tau )\) holds for all solutions \(\tau \) of \(\mathcal {F}\). The cost of the optimal solutions of a MaxSAT instance is denoted by \(\mathtt {COST}(\mathcal {F})\), with \(\mathtt {COST}(\mathcal {F}) = \infty \) iff \(\mathcal {F}_H\) is unsatisfiable. In MaxSAT the task is to find an optimal solution to a given MaxSAT instance.

Example 1

Let \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) be a MaxSAT instance with \(\mathcal {F}_H= \{ (x \vee b_1), (\lnot x \vee b_2), (y \vee b_3 \vee b_4), (z \vee \lnot y \vee b_4), (\lnot z)\}\), \(\mathcal {B}(\mathcal {F}) = \{b_1, b_2, b_3, b_4\}\) having \(w(b_1) = w(b_4) = 1\), \(w(b_2) = 2\) and \(w(b_3) = 8\). The assignment \(\tau = \{b_1, b_4, \lnot b_2, \lnot b_3, \lnot x, \lnot z, y\}\) is an example of an optimal solution of \(\mathcal {F}\) and has \(\mathtt {COST}(\mathcal {F}, \tau ) = \mathtt {COST}(\mathcal {F}) = 2\).
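As a small sanity check of the cost definition, the following sketch (using the clause representation of the sketch above, with our own variable numbering) encodes the instance of this example and evaluates \(\mathtt {COST}(\mathcal {F}, \tau )\) for the given optimal solution.

```python
# Instance of Example 1; variables x,y,z,b1..b4 are numbered 1..7 (our own encoding).
x, y, z, b1, b2, b3, b4 = 1, 2, 3, 4, 5, 6, 7
hard = [frozenset(c) for c in ([x, b1], [-x, b2], [y, b3, b4], [z, -y, b4], [-z])]
weights = {b1: 1, b2: 2, b3: 8, b4: 1}            # weights of the blocking literals

def cost(hard, weights, tau):
    """COST(F, tau) for a complete assignment tau (infinite if some hard clause is falsified)."""
    if any(not (c & tau) for c in hard):
        return float("inf")
    return sum(w for l, w in weights.items() if l in tau)

tau = {b1, b4, -b2, -b3, -x, -z, y}
assert cost(hard, weights, tau) == 2              # COST(F, tau) = COST(F) = 2
```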

With a slight abuse of notation, we denote by \(\mathcal {F}\wedge C = (\mathcal {F}_H\cup \{C\}, \mathcal {B}(\mathcal {F}\wedge C), w)\) the MaxSAT instance obtained by adding a clause C to an instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\). Adding clauses may introduce new blocking literals but not change the weights of already existing ones, i.e., \(\mathcal {B}(\mathcal {F}) \subset \mathcal {B}(\mathcal {F}\wedge C)\) and \(w^\mathcal {F}(l) = w^{\mathcal {F}\wedge C}(l)\) for all \(l \in \mathcal {B}(\mathcal {F})\).

Correction Sets. For a MaxSAT instance \(\mathcal {F}\), a subset \(\text {cs}\subset \mathcal {B}(\mathcal {F})\) is a minimal correction set (MCS) of \(\mathcal {F}\) if (i) \(\mathcal {F}_H\wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}} (\lnot l)\) is satisfiable and (ii) \(\mathcal {F}_H\wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}_s} (\lnot l)\) is unsatisfiable for every \(\text {cs}_s \subsetneq \text {cs}\). In words, \(\text {cs}\) is an MCS if it is a subset-minimal set of blocking literals that is included in some solution \(\tau \) of \(\mathcal {F}\). We denote the set of MCSes of \(\mathcal {F}\) by \(\mathtt {mcs}(\mathcal {F})\).

There is a tight connection between the MCSes and solutions of MaxSAT instances. Given an optimal solution \(\tau ^o\) of a MaxSAT instance \(\mathcal {F}\), the set \(\tau ^o \cap \mathcal {B}(\mathcal {F})\) is an MCS of \(\mathcal {F}\). In the other direction, for any \(\text {cs}\in \mathtt {mcs}(\mathcal {F})\), there is a (not necessarily optimal) solution \(\tau ^\text {cs}\) such that \(\text {cs}= \mathcal {B}(\mathcal {F}) \cap \tau ^\text {cs}\) and \(\mathtt {COST}(\mathcal {F}, \tau ^{\text {cs}}) = \sum _{l \in \text {cs}} w(l)\).

Example 2

Consider the instance \(\mathcal {F}\) from Example 1. The set \(\{b_1, b_4\} \in \mathtt {mcs}(\mathcal {F})\) is an MCS of \(\mathcal {F}\) that corresponds to the optimal solution \(\tau \) described in Example 1. The set \(\{b_2, b_3\} \in \mathtt {mcs}(\mathcal {F})\) is another example of an MCS that instead corresponds to the solution \(\tau _2 = \{ b_2, b_3, \lnot b_1, \lnot b_4, x, \lnot z, \lnot y\}\) for which \(\mathtt {COST}(\mathcal {F}, \tau _2) = 10\).
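The MCSes of this example can be recovered by brute force directly from the definition: a candidate set cs is accepted if forcing all other blocking literals to 0 keeps the hard clauses satisfiable and no smaller accepted set is contained in it. The exhaustive satisfiability test below is only meant for toy instances, and the helper names are our own; the instance encoding of Example 1 is restated for self-containment.

```python
from itertools import combinations, product

x, y, z, b1, b2, b3, b4 = 1, 2, 3, 4, 5, 6, 7     # encoding of Example 1, as above
hard = [frozenset(c) for c in ([x, b1], [-x, b2], [y, b3, b4], [z, -y, b4], [-z])]

def satisfiable(clauses, variables):
    """Naive satisfiability check by enumerating all assignments (toy instances only)."""
    for bits in product([False, True], repeat=len(variables)):
        tau = {v if b else -v for v, b in zip(variables, bits)}
        if all(c & tau for c in clauses):
            return True
    return False

def mcses(hard, blocking, variables):
    """Enumerate MCSes: subset-minimal cs such that forcing the blocking literals outside cs to 0 keeps F_H satisfiable."""
    found = []
    for k in range(len(blocking) + 1):
        for cs in combinations(sorted(blocking), k):
            if any(m <= set(cs) for m in found):  # a smaller correction set already suffices
                continue
            forced = {frozenset([-l]) for l in blocking - set(cs)}
            if satisfiable(set(hard) | forced, variables):
                found.append(set(cs))
    return found

# Both {b1, b4} and {b2, b3} appear among the computed MCSes of Example 1.
print(mcses(hard, {b1, b2, b3, b4}, [x, y, z, b1, b2, b3, b4]))
```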

3 Propagation Redundancy in MaxSAT

We extend recent work [23] on characterizing redundant clauses using semantic implication in the context of SAT to MaxSAT. In particular, we provide natural counterparts for several recently-proposed strong notions of redundancy in SAT to the context of MaxSAT and analyze the relationships between them.

In the context of SAT, the most general notion of clause redundancy is seemingly simple: a clause C is redundant for a formula \(F\) if it does not affect its satisfiability, i.e., clause C is redundant wrt a CNF formula \(F\) if \(F\) and \(F\wedge \{C\}\) are equisatisfiable [20, 29]. This allows for the set of satisfying assignments to change, and does not require preserving logical equivalence; we are only interested in satisfiability.

A natural counterpart for this general view in MaxSAT is that the cost of optimal solutions (rather than the set of optimal solutions) should be preserved.

Definition 1

A clause C is redundant wrt a MaxSAT instance \(\mathcal {F}\) if \(\mathtt {COST}(\mathcal {F}) = \mathtt {COST}(\mathcal {F}\wedge C)\).

This coincides with the counterpart in SAT whenever \(\mathcal {B}(\mathcal {F}) = \emptyset \), since then the cost of a MaxSAT instance \(\mathcal {F}\) is either 0 (if \(\mathcal {F}_H\) is satisfiable) or \(\infty \) (if \(\mathcal {F}_H\) is unsatisfiable). Unless explicitly specified, we will use the term “redundant” to refer to Definition 1.

Following [23], we say that a clause C blocks the assignment \(\lnot C\) (and all assignments \(\tau \) for which \(\lnot C \subset \tau \)). As shown in the context of SAT [23], a clause C is redundant (in the equisatisfiability sense) for a CNF formula \(F\) if C does not block all of its satisfying assignments. The counterpart that arises in the context of MaxSAT from Definition 1 is that the cost of at least one of the solutions not blocked by C is no greater than the cost of \(\lnot C\).

Proposition 1

A clause C is redundant wrt a MaxSAT instance \(\mathcal {F}\) if and only if there is an assignment \(\tau \) for which \(\mathtt {COST}(\mathcal {F}\wedge C, \tau ) = \mathtt {COST}(\mathcal {F}, \tau ) \le \mathtt {COST}(\mathcal {F}, \lnot C)\).

The equality \(\mathtt {COST}(\mathcal {F}\wedge C, \tau ) = \mathtt {COST}(\mathcal {F}, \tau )\) of Proposition 1 is necessary, as witnessed by the following example.

Example 3

Consider the MaxSAT instance \(\mathcal {F}\) detailed in Example 1, the clause \(C = (b_5)\) with \(b_5 \in \mathcal {B}(\mathcal {F}\wedge C)\) and the assignment \(\tau = \{b_5\}\). Then \(2 = \mathtt {COST}(\mathcal {F}, \tau ) \le \mathtt {COST}(\mathcal {F}, \lnot C) = 2\) but C is not redundant since \(\mathtt {COST}(\mathcal {F}\wedge C) = 2 + w^{\mathcal {F}\wedge C}(b_5) > 2 = \mathtt {COST}(\mathcal {F})\).

Proposition 1 characterizes the redundancy of a clause C in terms of the existence of a suitable assignment \(\tau \). Further requirements on the assignment \(\tau \) can be imposed without loss of generality.

Theorem 1

A non-empty clause C is redundant wrt a MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) if and only if there is an assignment \(\tau \) such that

(i) \(\tau (C) = 1\),       (ii) \({ \mathcal {F}_H \big |_{\lnot C} } \models { \mathcal {F}_H \big |_{\tau } }\) and

(iii) \(\mathtt {COST}(\mathcal {F}\wedge C, \tau ) = \mathtt {COST}(\mathcal {F}, \tau ) \le \mathtt {COST}(\mathcal {F}, \lnot C)\).

As we will see later, a reason for including two additional conditions in Theorem 1 is to allow defining different restrictions of redundancy notions, some of which allow for efficiently identifying redundant clauses.

Example 4

Consider the instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) detailed in Example 1, a clause \(C = (\lnot x \vee b_5)\) for a \(b_5 \in \mathcal {B}(\mathcal {F}\wedge C)\) and an assignment \(\tau = \{ \lnot x , b_1 \}\). Then: \(\tau (C) = 1\), \( \{ (b_2), (y \vee b_3 \vee b_4), (z \vee \lnot y \vee b_4), (\lnot z)\}= { \mathcal {F}_H \big |_{\lnot C} } \models { \mathcal {F}_H \big |_{\tau } } = \{ (y \vee b_3 \vee b_4), (z \vee \lnot y \vee b_4), (\lnot z)\} \), and \(2 = \mathtt {COST}(\mathcal {F}\wedge C, \tau ) = \mathtt {COST}(\mathcal {F}, \tau ) \le \mathtt {COST}(\mathcal {F}, \lnot C) = 3\). We conclude that C is redundant.

In the context of SAT, imposing restrictions on the entailment operator and the set of assignments has been shown to give rise to several interesting redundancy notions which hold promise of practical applicability. These include three variants (LPR, SPR, and PR) of so-called (literal/set) propagation redundancy [23]. For completeness we restate the definitions of these three notions. A clause C is LPR wrt a CNF formula \(F\) if there is a literal \(l \in C\) for which \({ F \big |_{\lnot C} } \vdash _1{ F \big |_{(\lnot C)_l} }\), SPR if the same holds for a subset \(L \subset C\), and PR if there exists an assignment \(\tau \) that satisfies C and for which \({ F \big |_{\lnot C} } \vdash _1{ F \big |_{\tau } }\). With the help of Theorem 1, we obtain counterparts for these notions in the context of MaxSAT.

Definition 2

With respect to an instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\), a clause C is

  • cost literal propagation redundant (CLPR) (on l) if there is a literal \(l \in C\) for which either (i) \(\bot \in \mathtt {UP}({ \mathcal {F}_H \big |_{\lnot C} })\) or (ii) \(l \notin \mathcal {B}(\mathcal {F}\wedge C)\) and \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{(\lnot C)_l} }\);

  • cost set propagation redundant (CSPR) (on L) if there is a set \(L \subset C \setminus \mathcal {B}(\mathcal {F}\wedge C)\) of literals for which \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{(\lnot C)_L} }\); and

  • cost propagation redundant (CPR) if there is an assignment \(\tau \) such that    (i) \(\tau (C)=1\),       (ii) \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{\tau } }\) and (iii) \(\mathtt {COST}(\mathcal {F}\wedge C, \tau ) = \mathtt {COST}(\mathcal {F}, \tau ) \le \mathtt {COST}(\mathcal {F}, \lnot C).\)

Example 5

Consider again \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) from Example 1. The clause \(D = (b_1 \vee b_2)\) is CLPR wrt \(\mathcal {F}\) since \(\bot \in \mathtt {UP}({ \mathcal {F}_H \big |_{\lnot D} })\) as \(\{ (x), (\lnot x)\} \subset { \mathcal {F}_H \big |_{\lnot D} }\). As for the redundant clause C and assignment \(\tau \) detailed in Example 4, we have that C is CPR, since \({ \mathcal {F}_H \big |_{\tau } } \subset { \mathcal {F}_H \big |_{\lnot C} }\) which implies \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{\tau } }\).
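Checking CLPR and CSPR thus reduces to unit-propagation tests on restrictions of the hard clauses. A minimal sketch is given below; it assumes the restrict, unit_propagate and implies_up helpers from the sketch in Sect. 2, and the function names are again our own.

```python
def flip(assignment, literals):
    """(tau)_L: set every literal in L to 1 and keep the rest of tau."""
    return (set(assignment) - {-l for l in literals}) | set(literals)

def is_cspr(hard, blocking, clause, subset):
    """CSPR on L: L is a subset of C without blocking literals and F_H|_{~C} |-1 F_H|_{(~C)_L}.
    Here `blocking` should be the blocking literals of the instance extended with C."""
    if not (set(subset) <= set(clause)) or set(subset) & set(blocking):
        return False
    not_c = {-l for l in clause}
    return implies_up(restrict(hard, not_c), restrict(hard, flip(not_c, subset)))

def is_clpr(hard, blocking, clause):
    """CLPR (Definition 2): UP derives a conflict on F_H|_{~C}, or C is CSPR on a single non-blocking literal."""
    not_c = {-l for l in clause}
    if frozenset() in unit_propagate(restrict(hard, not_c)):     # case (i)
        return True
    return any(is_cspr(hard, blocking, clause, [l]) for l in clause if l not in blocking)

# E.g. the clause D = (b1 v b2) of this example is CLPR by case (i):
# F_H|_{~D} contains the conflicting unit clauses (x) and (~x).
```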

We begin the analysis of the relationship between these redundancy notions by showing that CSPR (and by extension CLPR) clauses also satisfy the MaxSAT-centric condition (iii) of Theorem 1. Assume that C is CSPR wrt an instance \(\mathcal {F}=(\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) on the set L.

Lemma 1

Let \(\tau \supset \lnot C\) be a solution of \(\mathcal {F}\). Then, \(\mathtt {COST}(\mathcal {F}, \tau ) \ge \mathtt {COST}(\mathcal {F}, \tau _L)\).

The following corollary of Lemma 1 establishes that CSPR and CLPR clauses are redundant according to Definition 1.

Corollary 1

\(\mathtt {COST}(\mathcal {F}\wedge C, (\lnot C)_L) = \mathtt {COST}(\mathcal {F}, (\lnot C)_L) \le \mathtt {COST}(\mathcal {F}, \lnot C)\).

The fact that CPR clauses are redundant follows trivially from the fact that \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{\tau } }\) implies \({ \mathcal {F}_H \big |_{\lnot C} } \models { \mathcal {F}_H \big |_{\tau } }\). However, given a solution \(\omega \) that does not satisfy a CPR clause C, the next example demonstrates that the assignment \(\omega _{\tau }\) need not have a cost that is at most the cost of \(\omega \). Stated another way, the example demonstrates that an observation similar to Lemma 1 does not hold for CPR clauses in general.

Example 6

Consider a MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) having \(\mathcal {F}_H= \{ (x \vee b_1), (\lnot x \vee b_2) \}\), \(\mathcal {B}(\mathcal {F}) = \{b_1, b_2\}\) and \(w(b_1) = w(b_2) = 1\). The clause \(C = (x)\) is CPR wrt \(\mathcal {F}\): the assignment \(\tau = \{x, b_2\}\) satisfies the three conditions of Definition 2. Now \(\delta = \{\lnot x, b_1\}\) is a solution of \(\mathcal {F}\) that does not satisfy C and for which \(\delta _{\tau } = \{x, b_1, b_2\}\) and \(1 = \mathtt {COST}(\mathcal {F}, \delta ) < 2 = \mathtt {COST}(\mathcal {F}, \delta _{\tau })\).

As in the context of SAT, verifying that a clause is CSPR (and by extension CLPR) can be done efficiently. However, in contrast to SAT, we conjecture that verifying that a clause is CPR cannot in general be done efficiently, even if the assignment \(\tau \) is given. While we will not go into detail on the complexity of identifying CPR clauses, the following proposition gives some support for our conjecture.

Proposition 2

Let \(\mathcal {F}\) be an instance and \(k \in \mathbb {N}\). There is another instance \(\mathcal {F}^M\), a clause C, and an assignment \(\tau \) such that C is CPR wrt \(\mathcal {F}^M\) if and only if \(\mathtt {COST}(\mathcal {F}) \ge k\).

As deciding if \(\mathtt {COST}(\mathcal {F}) \ge k\) is NP-complete in the general case, Proposition 2 suggests that it may not be possible to decide in polynomial time if an assignment \(\tau \) satisfies the three conditions of Definition 2 unless P=NP. This is in contrast to SAT, where verifying propagation redundancy can be done in polynomial time if the assignment \(\tau \) is given, but is NP-complete if not [24].

The following observations establish a more precise relationship between the redundancy notions. For the following, let \(\textsc {RED}(\mathcal {F})\) denote the set of clauses that are redundant wrt a MaxSAT instance \(\mathcal {F}\) according to Definition 1. Analogously, the sets \(\textsc {CPR}(\mathcal {F})\), \(\textsc {CSPR}(\mathcal {F})\) and \(\textsc {CLPR}(\mathcal {F})\) consist of the clauses that are CPR, CSPR and CLPR wrt \(\mathcal {F}\), respectively.

Observation 1

\(\textsc {CLPR}(\mathcal {F}) \subset \textsc {CSPR}(\mathcal {F}) \subset \textsc {CPR}(\mathcal {F}) \subset \textsc {RED}(\mathcal {F})\) holds for any MaxSAT instance \(\mathcal {F}\).

Observation 2

There are MaxSAT instances \(\mathcal {F}_1, \mathcal {F}_2\) and \(\mathcal {F}_3\) for which

\(\textsc {CLPR}(\mathcal {F}_1) \subsetneq \textsc {CSPR}(\mathcal {F}_1)\), \(\textsc {CSPR}(\mathcal {F}_2) \subsetneq \textsc {CPR}(\mathcal {F}_2)\) and \(\textsc {CPR}(\mathcal {F}_3) \subsetneq \textsc {RED}(\mathcal {F}_3)\).

The proofs of Observations 1 and 2 follow directly from known results in the context of SAT [23] by noting that any CNF formula can be viewed as an instance of MaxSAT without blocking literals.

For a MaxSAT-centric observation on the relationship between the redundancy notions, we note that the concept of redundancy and CPR coincide for any MaxSAT instance that has solutions.

Observation 3

\(\textsc {CPR}(\mathcal {F}) = \textsc {RED}(\mathcal {F})\) holds for any MaxSAT instance \(\mathcal {F}\) with \(\mathtt {COST}(\mathcal {F}) < \infty \).

We note that a result similar to Observation 3 could be formulated in the context of SAT. The SAT-counterpart would state that the concept of redundancy (in the equisatisfiability sense) coincides with the concept of propagation redundancy for SAT solving (defined e.g. in [23]) for satisfiable CNF formulas. However, assuming that a CNF formula is satisfiable is very restrictive in the context of SAT. In contrast, it is natural to assume that a MaxSAT instance admits solutions.

We end this section with a simple observation: adding a redundant clause C to a MaxSAT instance \(\mathcal {F}\) not only preserves the optimal cost, but additionally every optimal solution of \(\mathcal {F}\wedge C\) is also an optimal solution of \(\mathcal {F}\). However, the converse need not hold; an instance \(\mathcal {F}\) might have optimal solutions that do not satisfy C.

Example 7

Consider an instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) with \(\mathcal {F}_H= \{(b_1 \vee b_2)\}\), \(\mathcal {B}(\mathcal {F}) = \{b_1, b_2\}\) and \(w(b_1) = w(b_2) = 1\). The clause \(C = (\lnot b_1)\) is CPR wrt \(\mathcal {F}\). In order to see this, let \(\tau = \{\lnot b_1, b_2\}\). Then \(\tau \) satisfies C (condition (i) of Definition 2). Furthermore, \(\tau \) satisfies \(\mathcal {F}_H\), implying \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{\tau } }\) (condition (ii)). Finally, we have that \(1 = \mathtt {COST}(\mathcal {F}, \tau ) = \mathtt {COST}(\mathcal {F}\wedge C, \tau ) \le \mathtt {COST}(\mathcal {F}, \lnot C) = 1\) (condition (iii)). The assignment \(\delta = \{b_1, \lnot b_2\}\) is an example of an optimal solution of \(\mathcal {F}\) that is not a solution of \(\mathcal {F}\wedge C\).

4 Propagation Redundancy and MCSes

In this section, we analyze the effect of adding redundant clauses on the MCSes of MaxSAT instances. As the main result, we show that adding CSPR (and by extension CLPR) clauses to a MaxSAT instance \(\mathcal {F}\) preserves all MCSes, while adding CPR clauses does not in general. Stated in terms of solutions, this means that adding CSPR clauses to \(\mathcal {F}\) preserves not only all optimal solutions, but all solutions \(\tau \) for which \((\tau \cap \mathcal {B}(\mathcal {F})) \in \mathtt {mcs}(\mathcal {F})\), while adding CPR clauses is only guaranteed to preserve at least one optimal solution.

Effect of CLPR Clauses on MCSes. MaxSAT-liftings of four specific SAT solving techniques (including bounded variable elimination and self-subsuming resolution) were earlier proposed in [8]. Notably, the correctness of the liftings was shown individually for each of the techniques by arguing that applying the lifting does not change the set of MCSes of any MaxSAT instance. Towards a more generic understanding of optimal-cost-preserving MaxSAT preprocessing, the notion of solution resolution asymmetric tautologies (SRAT) was proposed in [10] as a MaxSAT-lifting of the concept of resolution asymmetric tautologies (RAT). In short, a clause C is a SRAT clause for a MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) if there is a literal \(l \in C \setminus \mathcal {B}(\mathcal {F}\wedge C)\) such that \(\mathcal {F}_H\vdash _1\left( (C \vee D) \setminus \{\lnot l\}\right) \) for every \(D \in \mathcal {F}_H\) for which \(\lnot l \in D\).

In analogy with RAT [29], SRAT was shown in [10] to allow for a general proof of correctness for natural MaxSAT-liftings of a wide range of SAT preprocessing techniques, covering among others the four techniques for which individual correctness proofs were provided in [8]. The generality follows essentially from the fact that the addition and removal of SRAT clauses preserves MCSes. The same observations apply to CLPR, as CLPR and SRAT are equivalent.

Proposition 3

A clause C is CLPR wrt \(\mathcal {F}\) iff it is SRAT wrt \(\mathcal {F}\).

The proof of Proposition 3 follows directly from corresponding results in the context of SAT [23]. Informally speaking, a clause C is SRAT on a literal l iff it is RAT [29] on l and \(l \notin \mathcal {B}(\mathcal {F})\). Similarly, a clause C is CLPR on a literal l iff it is LPR as defined in [23] on l and \(l \notin \mathcal {B}(\mathcal {F})\). Proposition 3 together with previous results from [10] implies that the MCSes of MaxSAT instances are preserved under removing and adding CLPR clauses.

Corollary 2

If C is CLPR wrt \(\mathcal {F}\), then \(\mathtt {mcs}(\mathcal {F}) = \mathtt {mcs}(\mathcal {F}\wedge C)\).

Effect of CPR Clauses on MCSes. We turn our attention to the effect of CPR clauses on the MCSes of MaxSAT instances. Our analysis makes use of the previously-proposed MaxSAT-centric preprocessing rule known as subsumed label elimination (SLE) [11, 33].

Definition 3

(Subsumed Label Elimination [11, 33]) Consider a MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) and a blocking literal \(l \in \mathcal {B}(\mathcal {F})\) for which \(\lnot l \notin \mathtt {lit}(\mathcal {F}_H)\). Assume that there is another blocking literal \(l_s \in \mathcal {B}(\mathcal {F})\) for which (1) \( \lnot l_s \notin \mathtt {lit}(\mathcal {F}_H)\), (2) \(\{ C \in \mathcal {F}_H\mid l \in C\} \subset \{ C \in \mathcal {F}_H\mid l_s \in C\}\) and (3) \(w(l) \ge w(l_s)\). The subsumed label elimination (SLE) rule allows adding \((\lnot l)\) to \(\mathcal {F}_H\).

A specific proof of correctness of SLE was given in [11]. The following proposition provides an alternative proof based on CPR.

Proposition 4 (Proof of correctness for SLE)

Let \(\mathcal {F}\) be a MaxSAT instance and assume that the blocking literals \(l, l_s \in \mathcal {B}(\mathcal {F})\) satisfy the three conditions of Definition 3. Then, the clause \(C = (\lnot l)\) is CPR wrt \(\mathcal {F}\).

Proof

We show that \(\tau = \{ \lnot l, l_s\}\) satisfies the three conditions of Definition 2. First \(\tau \) satisfies C (condition (i)). Conditions (1) and (2) of Definition 3 imply \({ \mathcal {F}_H \big |_{\tau } } \subset { \mathcal {F}_H \big |_{\lnot C} }\) which in turn implies \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{\tau } }\) (condition (ii)).

As for condition (iii), the requirement \(\mathtt {COST}(\mathcal {F}\wedge C, \tau ) = \mathtt {COST}(\mathcal {F}, \tau )\) follows from \(\mathcal {B}(\mathcal {F}\wedge C) = \mathcal {B}(\mathcal {F})\). Let \(\delta \supset \lnot C\) be a complete assignment of \(\mathcal {F}_H\) for which \(\mathtt {COST}(\mathcal {F}, \delta ) = \mathtt {COST}(\mathcal {F}, \lnot C)\). If \(\mathtt {COST}(\mathcal {F},\delta ) = \infty \) then \(\mathtt {COST}(\mathcal {F}, \tau ) \le \mathtt {COST}(\mathcal {F}, \lnot C)\) follows trivially. Otherwise \(\delta \setminus \lnot C\) satisfies \({ \mathcal {F}_H \big |_{\lnot C} }\) so by \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{\tau } }\) it satisfies \({ \mathcal {F}_H \big |_{\tau } }\) as well. Thus \(\delta ^R = ((\delta \setminus \lnot C) \setminus \{\lnot l \mid l \in \tau \}) \cup \tau = (\delta \setminus \{l, \lnot l, \lnot l_s\}) \cup \{\lnot l, l_s\}\) is an extension of \(\tau \) that satisfies \(\mathcal {F}_H\) and for which \(\mathtt {COST}(\mathcal {F}, \tau ) \le \mathtt {COST}(\mathcal {F}, \delta ^R) \le \mathtt {COST}(\mathcal {F},\delta )\) by condition (3) of Definition 3. Thereby \(\tau \) satisfies the conditions of Definition 2 so C is CPR wrt \(\mathcal {F}\).    \(\square \)

Example 8

The blocking literals \(b_3, b_4 \in \mathcal {B}(\mathcal {F})\) of the instance \(\mathcal {F}\) detailed in Example 1 satisfy the conditions of Definition 3. By Proposition 4 the clause \((\lnot b_3)\) is CPR wrt \(\mathcal {F}\).
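Since the conditions of Definition 3 are purely syntactic, testing whether SLE applies is cheap. A minimal sketch, reusing the instance encoding hard and weights from the sketch after Example 1 (the function name is ours):

```python
def sle_applicable(hard, weights, l, l_s):
    """Check the preconditions of Definition 3 for blocking literals l and l_s."""
    lits = {lit for c in hard for lit in c}
    occ = lambda b: {c for c in hard if b in c}
    return (-l not in lits and -l_s not in lits          # neither negation occurs in F_H
            and occ(l) <= occ(l_s)                        # (2) every clause with l also has l_s
            and weights[l] >= weights[l_s])               # (3) weight condition

# On Example 1: sle_applicable(hard, weights, b3, b4) holds, so SLE (and, by
# Proposition 4, CPR) licenses adding the unit clause (~b3), as in this example.
```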

In [11] it was shown that SLE does not preserve MCSes in general. By Corollary 2, this implies that SLE cannot be viewed as the addition of CLPR clauses. Furthermore, by Proposition 4 we obtain the following.

Corollary 3

There is a MaxSAT instance \(\mathcal {F}\) and a clause C that is CPR wrt \(\mathcal {F}\) for which \(\mathtt {mcs}(\mathcal {F}) \ne \mathtt {mcs}(\mathcal {F}\wedge C)\).

Effect of CSPR Clauses on MCSes. Having established that CLPR clauses preserve MCSes while CPR clauses do not, we complete the analysis by demonstrating that CSPR clauses preserve MCSes.

Theorem 2

Let \(\mathcal {F}\) be a MaxSAT instance and C a CSPR clause of \(\mathcal {F}\). Then \(\mathtt {mcs}(\mathcal {F}) = \mathtt {mcs}(\mathcal {F}\wedge C)\).

Theorem 2 follows from the following lemmas and propositions. In the following, let C be a clause that is CSPR wrt a MaxSAT instance \(\mathcal {F}\) on a set \(L \subset C \setminus \mathcal {B}(\mathcal {F}\wedge C)\).

Lemma 2

Let \(\text {cs}\subset \mathcal {B}(\mathcal {F})\). If \(\mathcal {F}_H\wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}} (\lnot l)\) is satisfiable, then \((\mathcal {F}_H\wedge C) \wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}\wedge C) \setminus \text {cs}} (\lnot l)\) is satisfiable.

Lemma 2 helps in establishing one direction of Theorem 2.

Proposition 5

\(\mathtt {mcs}(\mathcal {F}) \subset \mathtt {mcs}(\mathcal {F}\wedge C).\)

Proof

Let \(\text {cs}\in \mathtt {mcs}(\mathcal {F})\). Then \(\mathcal {F}_H\wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}} (\lnot l)\) is satisfiable, which by Lemma 2 implies that \((\mathcal {F}_H\wedge C) \wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}\wedge C) \setminus \text {cs}} (\lnot l)\) is satisfiable.

To show that \((\mathcal {F}_H\wedge C) \wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}\wedge C) \setminus \text {cs}_s} (\lnot l)\) is unsatisfiable for any \(\text {cs}_s \subsetneq \text {cs}\subset \mathcal {B}(\mathcal {F})\), we note that any assignment satisfying \((\mathcal {F}_H\wedge C) \wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}\wedge C) \setminus \text {cs}_s} (\lnot l)\) would also satisfy \(\mathcal {F}_H\wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}_s} (\lnot l)\), contradicting \(\text {cs}\in \mathtt {mcs}(\mathcal {F})\).    \(\square \)

The following lemma is useful for showing inclusion in the other direction.

Lemma 3

Let \(\text {cs}\in \mathtt {mcs}(\mathcal {F}\wedge C)\). Then \(\text {cs}\subset \mathcal {B}(\mathcal {F})\).

Lemma 3 allows for completing the proof of Theorem 2.

Proposition 6

\(\mathtt {mcs}(\mathcal {F}\wedge C) \subset \mathtt {mcs}(\mathcal {F}).\)

Proof

Let \(\text {cs}\in \mathtt {mcs}(\mathcal {F}\wedge C)\), which by Lemma 3 implies \(\text {cs}\subset \mathcal {B}(\mathcal {F})\). Let \(\tau \) be a solution that satisfies \((\mathcal {F}_H\wedge C) \wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}\wedge C) \setminus \text {cs}} (\lnot l)\). Then \(\tau \) satisfies \(\mathcal {F}_H\wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}} (\lnot l)\). For contradiction, assume that \(\mathcal {F}_H\wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}_s} (\lnot l)\) is satisfiable for some \(\text {cs}_s \subsetneq \text {cs}\). Then by Lemma 2, \((\mathcal {F}_H\wedge C) \wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}\wedge C) \setminus \text {cs}_s} (\lnot l)\) is satisfiable as well, contradicting \(\text {cs}\in \mathtt {mcs}(\mathcal {F}\wedge C)\). Thereby \(\text {cs}\in \mathtt {mcs}(\mathcal {F})\).    \(\square \)

Theorem 2 implies that SLE cannot be viewed as the addition of CSPR clauses. In light of this, an interesting remark is that, in contrast to CPR clauses in general (recall Example 6), the assignment \(\tau \) used in the proof of Proposition 4 can be used to convert any assignment that does not satisfy the CPR clause detailed in Definition 3 into one that does, without increasing its cost.

Observation 4

Let \(\mathcal {F}\) be a MaxSAT instance and assume that the blocking literals \(l, l_s \in \mathcal {B}(\mathcal {F})\) satisfy the three conditions of Definition 3. Let \(\tau = \{ \lnot l, l_s\}\) and consider any solution \(\delta \supset \lnot C\) of \(\mathcal {F}\) that does not satisfy the CPR clause \(C = (\lnot l)\). Then \(\delta _{\tau }\) is a solution of \(\mathcal {F}\wedge C\) for which \(\mathtt {COST}(\mathcal {F},\delta _{\tau }) \le \mathtt {COST}(\mathcal {F}, \delta )\).

5 CPR-Based Preprocessing for MaxSAT

Mapping the theoretical observations into practical preprocessing, in this section we discuss through examples how CPR clauses can be used as a unified theoretical basis for capturing a wide variety of known MaxSAT reasoning rules, and how they could potentially help in the development of novel MaxSAT reasoning techniques.

Our first example is the so-called hardening rule [2, 8, 17, 26]. In terms of our notation, given a solution \(\tau \) to a MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) and a blocking literal \(l \in \mathcal {B}(\mathcal {F})\) for which \(w(l) > \mathtt {COST}(\mathcal {F}, \tau )\), the hardening rule allows adding the clause \(C = (\lnot l)\) to \(\mathcal {F}_H\).

The correctness of the hardening rule can be established with CPR clauses. More specifically, as \(\mathtt {COST}(\mathcal {F}, \tau ) < w(l)\) it follows that \(\tau ( C ) = 1\) (condition (i) of Definition 2). Since \(\tau \) satisfies \(\mathcal {F}_H\), we have that \({ \mathcal {F}_H \big |_{\tau } } = \top \) so \({ \mathcal {F}_H \big |_{\lnot C} } \vdash _1{ \mathcal {F}_H \big |_{\tau } }\) (condition (ii)). Finally, as \(\mathtt {COST}(\mathcal {F}, \delta ) \ge w(l) > \mathtt {COST}(\mathcal {F}, \tau )\) holds for all \(\delta \supset \lnot C\) it follows that \(\mathtt {COST}(\mathcal {F}, \lnot C) > \mathtt {COST}(\mathcal {F}, \tau ) = \mathtt {COST}(\mathcal {F}\wedge C, \tau )\). As such, \((\lnot l)\) is a CPR clause wrt \(\mathcal {F}\). In fact, instead of assuming \(w(l) > \mathtt {COST}(\mathcal {F}, \tau )\) it suffices to assume \(w(l) \ge \mathtt {COST}(\mathcal {F}, \tau )\) and \(\tau (l) = 0\).
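Read this way, hardening only needs some solution and a comparison of its cost against the individual weights. A minimal sketch under the representation used in the earlier sketches (it reuses the cost helper from the sketch after Example 1; the function name is ours):

```python
def harden(hard, blocking, weights, tau):
    """Hardening: given a solution tau, add the unit (~l) for every blocking literal l with
    w(l) > COST(F, tau), or with w(l) >= COST(F, tau) and tau(l) = 0 (the refinement above)."""
    c = cost(hard, weights, tau)
    new_units = {frozenset([-l]) for l in blocking
                 if weights[l] > c or (weights[l] >= c and -l in tau)}
    return set(hard) | new_units

# Applied to Example 1 with the solution of Example 9 below, only (~b3) is added.
```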

The hardening rule cannot be viewed as the addition of CSPR or CLPR clauses because it does not in general preserve MCSes.

Example 9

Consider the MaxSAT instance \(\mathcal {F}\) from Example 1 and a solution \(\tau = \{ b_1, b_2, b_4, \lnot b_3, \lnot z, x, y\}\). Since \(\mathtt {COST}(\mathcal {F}, \tau ) = 4 < 8 = w(b_3)\), the clause \(C = (\lnot b_3)\) is CPR. However, \(\mathtt {mcs}(\mathcal {F}) \ne \mathtt {mcs}(\mathcal {F}\wedge C)\) since the set \(\text {cs}= \{b_2, b_3\} \in \mathtt {mcs}(\mathcal {F})\) is not an MCS of \(\mathcal {F}\wedge C\) as \((\mathcal {F}_H\wedge C) \wedge \bigwedge _{l \in \mathcal {B}(\mathcal {F}) \setminus \text {cs}} (\lnot l) = (\mathcal {F}_H\wedge (\lnot b_3)) \wedge (\lnot b_1) \wedge (\lnot b_4)\) is not satisfiable.

Viewing the hardening rule through the lens of CPR clauses demonstrates novel aspects of the MaxSAT-liftings of propagation redundancy. In particular, instantiated in the context of SAT, an argument similar to the one we made for hardening shows that given a CNF formula \(F\), an assignment \(\tau \) satisfying \(F\), and a literal l for which \(\tau (l) = 0\), the clause \((\lnot l)\) is redundant (wrt equisatisfiability). While formally correct, such a rule is not very useful for SAT solving. In contrast, in the context of MaxSAT the hardening rule is employed in various modern MaxSAT solvers and leads to non-trivial performance improvements [4, 5].

As another example of capturing MaxSAT-centric reasoning with CPR, consider the so-called TrimMaxSAT rule [39]. Given a MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) and a literal \(l \in \mathcal {B}(\mathcal {F})\) for which \(\tau (l) = 1\) for all solutions \(\tau \) of \(\mathcal {F}\), the TrimMaxSAT rule allows adding the clause \(C = (l)\) to \(\mathcal {F}_H\). In this case the assumptions imply that all solutions of \(\mathcal {F}\) also satisfy C, i.e., that \({ \mathcal {F}_H \big |_{\lnot C} }\) is unsatisfiable. As such, any assignment \(\tau \) that satisfies C and \(\mathcal {F}_H\) will also satisfy the three conditions of Definition 2, which demonstrates that C is CPR. It is, however, not CSPR since the only literal in C is blocking.
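In the same representation, the TrimMaxSAT condition is one (un)satisfiability check per blocking literal. The following sketch uses the naive satisfiable helper from the MCS sketch in Sect. 2 purely for illustration; a real implementation would call a SAT solver instead.

```python
def trim_candidates(hard, blocking, variables):
    """TrimMaxSAT: return the blocking literals l that are 1 in every solution,
    i.e. those for which F_H together with the unit (~l) is unsatisfiable;
    for each such l the unit clause (l) may be added to the hard clauses."""
    return {l for l in blocking
            if not satisfiable(set(hard) | {frozenset([-l])}, variables)}
```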

As a third example of capturing (new) reasoning techniques with CPR, consider an extension of the central variable elimination rule that allows (to some extent) for eliminating blocking literals.

Definition 4

Consider a MaxSAT instance \(\mathcal {F}\) and a blocking literal \(l \in \mathcal {B}(\mathcal {F})\). Let \(\textsc {BBVE}(\mathcal {F})\) be the instance obtained by (i) adding the clause \(C \vee D\) to \(\mathcal {F}\) for every pair \((C \vee l), (D \vee \lnot l) \in \mathcal {F}_H\) and (ii) removing all clauses \((D \vee \lnot l) \in \mathcal {F}_H\). Then \(\mathtt {COST}(\mathcal {F}) = \mathtt {COST}(\textsc {BBVE}(\mathcal {F}))\) and \(\mathtt {mcs}(\mathcal {F}) = \mathtt {mcs}(\textsc {BBVE}(\mathcal {F}))\).
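A minimal sketch of the BBVE transformation of Definition 4 follows (our own function name; tautological resolvents, which the preliminaries exclude as clauses, are dropped).

```python
def bbve(hard, l):
    """BBVE on a blocking literal l: add the resolvent C v D for every pair (C v l), (D v ~l)
    and remove every clause containing ~l, keeping the clauses that contain l."""
    pos = [c - {l} for c in hard if l in c]
    neg = [c - {-l} for c in hard if -l in c]
    resolvents = set()
    for c in pos:
        for d in neg:
            r = frozenset(c | d)
            if not any(-lit in r for lit in r):   # skip tautological resolvents
                resolvents.add(r)
    return {c for c in hard if -l not in c} | resolvents
```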

On the Limitations of CPR. Finally, we note that while CPR clauses significantly generalize existing theory on reasoning and preprocessing rules for MaxSAT, there are known reasoning techniques that cannot (at least not straightforwardly) be viewed through the lens of propagation redundancy. For a concrete example, consider the so-called intrinsic atmost1 technique [26].

Definition 5

Consider a MaxSAT instance \(\mathcal {F}\) and a set \(L \subset \mathcal {B}(\mathcal {F})\) of blocking literals. Assume that (i) \(|\tau \cap \{ \lnot l \mid l \in L\}| \le 1\) holds for any solution \(\tau \) of \(\mathcal {F}\) and (ii) \(w(l) = 1\) for each \(l \in L\). Now form the instance \(\textsc {AT-MOST-ONE}(\mathcal {F}, L)\) by (i) removing each literal \(l \in L\) from \(\mathcal {B}(\mathcal {F})\), and (ii) adding the clause \(\{ (\lnot l) \mid l \in L\} \cup \{l_L\}\) to \(\mathcal {F}\), where \(l_L\) is a fresh blocking literal with \(w(l_L) = 1\).

It has been established that any optimal solution of \(\textsc {AT-MOST-ONE}(\mathcal {F}, L)\) is an optimal solution of \(\mathcal {F}\) [26]. However, as the next example demonstrates, the preservation of optimal solutions is in general not due to the clauses added being redundant, as applying the technique can affect optimal cost.

Example 10

Consider the MaxSAT instance \(\mathcal {F}= (\mathcal {F}_H, \mathcal {B}(\mathcal {F}), w)\) with \(\mathcal {F}_H= \{ (l_i) \mid i=1\ldots n \}\), \(\mathcal {B}(\mathcal {F}) = \{l_1 \ldots l_n\}\) and \(w(l) = 1\) for all \(l \in \mathcal {B}(\mathcal {F})\). Then \(|\tau \cap \lnot \mathcal {B}(\mathcal {F})| = 0 \le 1\) holds for all solutions \(\tau \) of \(\mathcal {F}\) so the intrinsic-at-most-one technique can be used to obtain the instance \(\mathcal {F}^2 = \textsc {AT-MOST-ONE}(\mathcal {F}, \mathcal {B}(\mathcal {F})) = (\mathcal {F}_H^2, \mathcal {B}(\mathcal {F}^2), w^2)\) with \(\mathcal {F}_H^2 = \mathcal {F}_H\cup \{ (\lnot l_1 \vee \ldots \vee \lnot l_n \vee l_L) \} \), \(\mathcal {B}(\mathcal {F}^2) = \{ l_L\}\) and \(w^2(l_L) = 1\). Now \(\delta = \{ l \mid l \in \mathcal {B}(\mathcal {F}) \} \cup \{l_L\}\) is an optimal solution to both \(\mathcal {F}^2\) and \(\mathcal {F}\) for which \(1 = \mathtt {COST}(\mathcal {F}^2, \delta ) < \mathtt {COST}(\mathcal {F}, \delta ) = n\).

Example 10 implies that the intrinsic atmost1 technique cannot be viewed as the addition or removal of redundant clauses. Generalizing CPR to cover weight changes could lead to further insights, especially due to potential connections with core-guided MaxSAT solving [1, 36, 37, 38].

6 MaxPre 2: More General Preprocessing in Practice

Connecting to practice, we extended the MaxSAT preprocessor MaxPre [33] version 1 with support for techniques captured by propagation redundancy. The resulting MaxPre version 2, as outlined in the following, hence includes techniques which have previously only been implemented in specific solver implementations rather than in general-purpose MaxSAT preprocessors.

First, let us mention that the earlier MaxPre [33] version 1 assumes that any blocking literal only appears in a single polarity among the hard clauses. Removing this assumption (supported by the theory developed in Sects. 3 and 4) decreases the number of auxiliary variables that need to be introduced when a MaxSAT instance is rewritten to only include unit soft clauses. For example, consider a MaxSAT instance \(\mathcal {F}\) with \(\mathcal {F}_H= \{ (\lnot x \vee y), (\lnot y \vee x) \}\) and \(\mathcal {F}_S= \{ (x), (\lnot y) \}\). For preprocessing the instance, MaxPre 1 extends both soft clauses with a new, auxiliary variable and runs preprocessing on the instance \(\mathcal {F}= \{ (\lnot x \vee y), (\lnot y \vee x), (x \vee b_1), (\lnot y \vee b_2)\}\) with \(\mathcal {B}(\mathcal {F}) = \{b_1, b_2\}\). In contrast, MaxPre 2 detects that the clauses in \(\mathcal {F}_S\) are unit and reuses them as blocking literals, invoking preprocessing on \(\mathcal {F}= \{ (\lnot x \vee y), (\lnot y \vee x) \}\) with \(\mathcal {B}(\mathcal {F}) = \{\lnot x, y\}\).

In addition to the techniques already implemented in MaxPre 1, MaxPre 2 includes the following additional techniques: hardening [2], a variant of TrimMaxSAT [39] that works on all literals of a MaxSAT instance, the intrinsic atmost1 technique [26] and a MaxSAT-lifting of failed literal elimination [12]. In short, failed literal elimination adds the clause \((\lnot l)\) to the hard clauses \(\mathcal {F}_H\) of an instance in case unit propagation derives a conflict in \(\mathcal {F}_H\wedge \{(l)\}\). Additionally, the implementation of failed literal elimination attempts to identify implied equivalences between literals that can lead to further simplification.

For computing the solutions required by TrimMaxSAT and detecting the cardinality constraints required by intrinsic-at-most-one constraints, MaxPre 2 uses the Glucose 3.0 SAT-solver [3]. For computing solutions required by hardening, MaxPre 2 additionally uses the SatLike incomplete MaxSAT solver [34] within preprocessing. MaxPre 2 is available in open source at https://bitbucket.org/coreo-group/maxpre2/.

We emphasize that, while the additional techniques implemented by MaxPre 2 have been previously implemented as heuristics in specific solver implementations, MaxPre 2 is—to the best of our understanding—the first stand-alone implementation supporting techniques whose correctness cannot be established with previously-proposed MaxSAT redundancy notions (i.e., SRAT). The goal of our empirical evaluation presented in the next section is to demonstrate the potential of viewing expressive reasoning techniques not only as solver heuristics, but as a separate step in the MaxSAT solving process whose correctness can be established via propagation redundancy.

7 Empirical Evaluation

We report on results from an experimental evaluation of the potential of incorporating more general reasoning in MaxSAT preprocessing. In particular, we evaluated both complete solvers (geared towards finding provably optimal solutions) and incomplete solvers (geared towards finding relatively good solutions fast) on standard heterogeneous benchmarks from recent MaxSAT Evaluations. All experiments were run on 2.60-GHz Intel Xeon E5-2670 8-core machines with 64 GB memory and CentOS 7. All reported runtimes include the time used in preprocessing (when applicable).

7.1 Impact of Preprocessing on Complete Solvers

We start by considering recent representative complete solvers covering three central MaxSAT solving paradigms: the core-guided solver CGSS [27] (as a recent improvement to the successful RC2 solver [26]), and the MaxSAT Evaluation 2021 versions of the implicit hitting set based solver MaxHS [17] and the solution-improving solver Pacose [40]. For each solver S we consider the following variants.

  • S: S in its default configuration.

  • S no preprocess: S with the solver’s own internal preprocessing turned off (when applicable).

  • S+maxpre1: S after applying MaxPre 1 using its default configuration.

  • S+maxpre2/none: S after applying MaxPre 2 using the default configuration of MaxPre 1.

  • \(S+\texttt {maxpre2/{<}TECH{>}}\): S after applying MaxPre 2 using the default configuration of MaxPre 1 and additional techniques integrated into MaxPre 2 (as detailed in Sect. 6) as specified by \(\texttt {{<}TECH{>}}\).

More precisely, \(\texttt {{<}TECH{>}}\) specifies which of the techniques HTVGR are applied: H for hardening, T and V for TrimMaxSAT on blocking and non-blocking literals, respectively, G for intrinsic-at-most-one-constraints and R for failed literal elimination. It should be noted that an exhaustive evaluation of all subsets and application orders of these techniques is infeasible in practice. Based on preliminary experiments, we observed that the following choices were promising: HRT for CGSS and MaxHS, and HTVGR for Pacose; we report results using these individual configurations.

As benchmarks, we used the combined set of weighted instances from the complete tracks of MaxSAT Evaluation 2020 and 2021. After removing duplicates, this gave a total of 1117 instances. We enforced a per-instance time limit of 60 minutes and memory limit of 32 GB. Furthermore, we enforced a per-instance 120-second time limit on preprocessing.

Fig. 1. Impact of preprocessing on complete solvers. For each solver, the number of instances solved within a 60-min per-instance time limit is shown in parentheses.

An overview of the results is shown in Fig. 1, illustrating for each solver the number of instances solved (x-axis) under different per-instance time limits (y-axis). We observe that for both CGSS and MaxHS, S+maxpre1 and S+maxpre2/none lead to fewer instances solved compared to S. In contrast, S+maxpre2/HRT, i.e., incorporating the stronger reasoning techniques of MaxPre 2, performs best of all preprocessing variants and, for MaxHS, also improves on S in terms of the number of instances solved. For Pacose, we observe that both Pacose+maxpre1 and Pacose+maxpre2/none (without the stronger reasoning techniques) already improve the performance of Pacose, leading to more instances solved. Incorporating the stronger reasoning rules further improves performance significantly, with Pacose+maxpre2/HTVGR performing best among all of the Pacose variants.

7.2 Impact of Preprocessing on Incomplete MaxSAT Solving

As a representative incomplete MaxSAT solver we consider the MaxSAT Evaluation 2021 version of Loandra [9], as the best-performing solver in the incomplete track of MaxSAT Evaluation under a 300s per-instance time limit on weighted instances. Loandra combines core-guided and solution-improving search towards finding good solutions fast. We consider the following variants of Loandra.

  • \(\texttt {base}\) \((\texttt {maxpre1})\): Loandra in its default configuration which makes use of MaxPre 1.

  • \(\texttt {no-prepro}\): Loandra with its internal preprocessing turned off.

  • \(\texttt {maxpre2/none}\): \(\texttt {base}\) with its internal preprocessor changed from MaxPre 1 to MaxPre 2 using the default configuration of MaxPre 1.

  • \(\texttt {maxpre2/VG}\): \(\texttt {maxpre2/none}\) additionally incorporating the intrinsic-at-most-one constraints technique and the extension of TrimMaxSAT to non-blocking literals (cf. Sect. 6), found promising in preliminary experimentation.

As benchmarks, we used the combined set of weighted instances from the incomplete tracks of MaxSAT Evaluation 2020 and 2021. After removing duplicates, this gave a total of 451 instances. When reporting results, we consider for each instance and solver the cost of the best solution found by the solver within 300 s (including time spent preprocessing and solution reconstruction).

We compare the relative performance of the solver variants using two metrics: #wins and the average incomplete score. Assume that \(\tau _x\) and \(\tau _y\) are the lowest-cost solutions computed by two solvers X and Y on a MaxSAT instance \(\mathcal {F}\) and that \(\texttt {best-cost}(\mathcal {F})\) is the lowest cost of a solution of \(\mathcal {F}\) found either in our evaluation or in the MaxSAT Evaluations. Then X wins over Y if \(\mathtt {COST}(\mathcal {F}, \tau _x) < \mathtt {COST}(\mathcal {F}, \tau _y)\). The incomplete score, \(\texttt {score}(\mathcal {F}, X)\), obtained by solver X on \(\mathcal {F}\) is the ratio between \(\texttt {best-cost}(\mathcal {F})+1\) and \(\mathtt {COST}(\mathcal {F}, \tau _x)+1\), i.e., \(\texttt {score}(\mathcal {F},X) = (\texttt {best-cost}(\mathcal {F})+ 1) / (\mathtt {COST}(\mathcal {F}, \tau _x) + 1)\). The score of X on \(\mathcal {F}\) is 0 if X is unable to find any solutions within 300 s.
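As an illustration with made-up numbers: if \(\texttt {best-cost}(\mathcal {F}) = 10\) and the best solution found by X has cost 14, then \(\texttt {score}(\mathcal {F},X) = (10+1)/(14+1) \approx 0.73\); a solver matching the best known cost obtains score 1.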

Table 1. Impact of preprocessing on the incomplete solver Loandra. The wins are organized column-wise, the cell on row X column Y contains the total number of instances that the solver on column Y wins over the solver on row X.

An overview of the results is shown in Table 1. The upper part of the table shows a pairwise comparison on the number of wins over all benchmarks. The wins are organized column-wise, i.e., the cell on row X column Y contains the total number of instances that the solver on column Y wins over the solver on row X. The last row contains the average score obtained by each solver over all instances. We observe that any form of preprocessing improves the performance of Loandra, as witnessed by the fact that \(\texttt {no-prepro}\) is clearly the worst-performing variant. The variants that make use of MaxPre 2 outperform the baseline under both metrics; both \(\texttt {maxpre2/none}\) and \(\texttt {maxpre2/VG}\) obtain a higher average score and win on more instances over \(\texttt {base}\). The comparison between \(\texttt {maxpre2/none}\) and \(\texttt {maxpre2/VG}\) is not as clear. On one hand, the score obtained by \(\texttt {maxpre2/VG}\) is higher. On the other hand, \(\texttt {maxpre2/none}\) wins on 80 instances over \(\texttt {maxpre2/VG}\) and loses on 77. This suggests that the quality of solutions computed by \(\texttt {maxpre2/VG}\) is on average higher, and that on the instances on which \(\texttt {maxpre2/none}\) wins the difference in cost is smaller.

7.3 Impact of Preprocessing on Instance Sizes

In addition to improved solver runtimes, we note that MaxPre 2 has a positive effect on the size of instances (both in terms of the number of variables and clauses remaining) when compared to preprocessing with MaxPre 1; see Fig. 2 for a comparison, with maxpre2/HRT compared to maxpre1 (left) and to original instance sizes (right).

Fig. 2. Impact of preprocessing on instance size.

8 Conclusions

We studied liftings of variants of propagation redundancy from SAT in the context of maximum satisfiability where, more fine-grained than in SAT, the reasoning techniques of interest are those that preserve optimal cost. We showed that CPR, the strongest MaxSAT-lifting, allows for changing minimal correction sets in MaxSAT in a controlled way, thereby succinctly expressing MaxSAT reasoning techniques very generally. We also provided a practical MaxSAT preprocessor extended with techniques captured by CPR and showed empirically that extended preprocessing has a positive overall impact on a range of MaxSAT solvers. Interesting future work includes the development of new CPR-based preprocessing rules for MaxSAT capable of significantly affecting the MaxSAT solving pipeline both in theory and practice, as well as developing an understanding of the relationship between redundancy notions and the transformations performed by MaxSAT solving algorithms.